Loanwords in the World's Languages: A Comparative Handbook 9783110218442, 9783110218435

This book is the first work to address the question of what kinds of words get borrowed in a systematic and comparative


English Pages 1102 [1103] Year 2009

Table of contents :
Frontmatter
Table of contents
General chapters
I. The Loanword Typology project and the World Loanword Database
II. Lexical borrowing: Concepts and issues
III. Loanwords in the world’s languages: Findings and results
The languages
1. Loanwords in Swahili
2. Loanwords in Iraqw, a Cushitic language of Tanzania
3. Loanwords in Gawwada, a Cushitic language of Ethiopia
4. Loanwords in Hausa, a Chadic language in West Africa
5. Loanwords in Kanuri, a Saharan language
6. Loanwords in Tarifiyt, a Berber language of Morocco
7. Loanwords in Seychelles Creole
8. Loanwords in Romanian
9. Loanwords in Selice Romani, an Indo-Aryan language of Slovakia
10. Loanwords in Lower Sorbian, a Slavic language of Germany
11. Loanwords in Old High German
12. Loanwords in Dutch
13. Loanwords in British English
14. Loanwords in Kildin Saami, a Uralic language of northern Europe
15. Loanwords in Bezhta, a Nakh-Daghestanian language of the North Caucasus
16. Loanwords in Archi, a Nakh-Daghestanian language of the North Caucasus
17. Loanwords in Manange, a Tibeto-Burman Language of Nepal
18. Loanwords in Ket
19. Loanwords in Sakha (Yakut), a Turkic language of Siberia
20. Loanwords in Oroqen, a Tungusic language of China
21. Loanwords in Japanese
22. Loanwords in Mandarin Chinese
23. Loanwords in Thai
24. Loanwords in Vietnamese
25. Loanwords in White Hmong
26. Loanwords in Ceq Wong, an Austroasiatic language of Peninsular Malaysia
27. Loanwords in Indonesian
28. Loanwords in Malagasy
29. Loanwords in Takia, an Oceanic language of Papua New Guinea
30. Loanwords in Hawaiian
31. Loanwords in Gurindji, a Pama-Nyungan language of Australia
32. Loanwords in Yaqui, a Uto-Aztecan language of Mexico
33. Loanwords in Zinacantán Tzotzil, a Mayan language of Mexico
34. Loanwords in Q’eqchi’, a Mayan language of Guatemala
35. Loanwords in Otomi, an Otomanguean language of Mexico
36. Loanwords in Saramaccan, an English-based creole of Suriname
37. Loanwords in Imbabura Quechua
38. Loanwords in Kali’na, a Cariban language of French Guiana
39. Loanwords in Hup, a Nadahup language of Amazonia
40. Loanwords in Wichí, a Mataco-Mataguayan language of Argentina
41. Loanwords in Mapudungun, a language of Chile and Argentina
Backmatter

Editors
Martin Haspelmath (editor)
Uri Tadmor (editor)


Loanwords in the World’s Languages
A Comparative Handbook

Edited by Martin Haspelmath and Uri Tadmor

De Gruyter Mouton

De Gruyter Mouton (formerly Mouton, The Hague) is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.

Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.

Library of Congress Cataloging-in-Publication Data

Loanwords in the world’s languages : a comparative handbook / edited by Martin Haspelmath, Uri Tadmor.
p. cm.
Includes bibliographical references and index.
ISBN 978-3-11-021843-5 (cloth : alk. paper)
1. Language and languages – Foreign words and phrases. I. Haspelmath, Martin, 1963– II. Tadmor, Uri, 1960–
P324.L63 2009
412–dc22 2009045067

ISBN 978-3-11-021843-5

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.d-nb.de.

© Copyright 2009 by Walter de Gruyter GmbH & Co. KG, 10785 Berlin, Germany.
All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.
Cover design: Martin Zech, Bremen.
Printed in Germany.

Table of contents

Notational conventions
Acknowledgments
List of authors

GENERAL CHAPTERS

I. The Loanword Typology project and the World Loanword Database
   Martin Haspelmath and Uri Tadmor
II. Lexical borrowing: Concepts and issues
   Martin Haspelmath
III. Loanwords in the world’s languages: Findings and results
   Uri Tadmor

THE LANGUAGES

1. Loanwords in Swahili
   Thilo C. Schadeberg
2. Loanwords in Iraqw, a Cushitic language of Tanzania
   Maarten Mous and Martha Qorro
3. Loanwords in Gawwada, a Cushitic language of Ethiopia
   Mauro Tosco
4. Loanwords in Hausa, a Chadic language in West Africa
   Ari Awagana and H. Ekkehard Wolff, with Doris Löhr
5. Loanwords in Kanuri, a Saharan language
   Doris Löhr and H. Ekkehard Wolff, with Ari Awagana
6. Loanwords in Tarifiyt, a Berber language of Morocco
   Maarten Kossmann
7. Loanwords in Seychelles Creole
   Susanne Michaelis with Marcel Rosalie
8. Loanwords in Romanian
   Kim Schulte
9. Loanwords in Selice Romani, an Indo-Aryan language of Slovakia
   Viktor Elšík
10. Loanwords in Lower Sorbian, a Slavic language of Germany
   Hauke Bartels
11. Loanwords in Old High German
   Roland Schuhmann
12. Loanwords in Dutch
   Nicoline van der Sijs
13. Loanwords in British English
   Anthony Grant
14. Loanwords in Kildin Saami, a Uralic language of northern Europe
   Michael Rießler
15. Loanwords in Bezhta, a Nakh-Daghestanian language of the North Caucasus
   Bernard Comrie and Madzhid Khalilov
16. Loanwords in Archi, a Nakh-Daghestanian language of the North Caucasus
   Marina Chumakina
17. Loanwords in Manange, a Tibeto-Burman language of Nepal
   Kristine A. Hildebrandt
18. Loanwords in Ket, a Yeniseian language of Siberia
   Edward Vajda
19. Loanwords in Sakha (Yakut), a Turkic language of Siberia
   Brigitte Pakendorf and Innokentij N. Novgorodov
20. Loanwords in Oroqen, a Tungusic language of China
   Fengxiang Li and Lindsay J. Whaley
21. Loanwords in Japanese
   Christopher K. Schmidt
22. Loanwords in Mandarin Chinese
   Thekla Wiebusch and Uri Tadmor
23. Loanwords in Thai
   Titima Suthiwan and Uri Tadmor
24. Loanwords in Vietnamese
   Mark J. Alves
25. Loanwords in White Hmong
   Martha Ratliff
26. Loanwords in Ceq Wong, an Austroasiatic language of Peninsular Malaysia
   Nicole Kruspe
27. Loanwords in Indonesian
   Uri Tadmor
28. Loanwords in Malagasy
   Alexander Adelaar
29. Loanwords in Takia, an Oceanic language of Papua New Guinea
   Malcolm Ross
30. Loanwords in Hawaiian
   ʻŌiwi Parker Jones
31. Loanwords in Gurindji, a Pama-Nyungan language of Australia
   Patrick McConvell
32. Loanwords in Yaqui, a Uto-Aztecan language of Mexico
   Zarina Estrada Fernández
33. Loanwords in Zinacantán Tzotzil, a Mayan language of Mexico
   Cecil H. Brown
34. Loanwords in Q’eqchi’, a Mayan language of Guatemala
   Søren Wichmann and Kerry Hull
35. Loanwords in Otomi, an Otomanguean language of Mexico
   Ewald Hekking and Dik Bakker
36. Loanwords in Saramaccan, an English-based creole of Suriname
   Jeff Good
37. Loanwords in Imbabura Quechua
   Jorge Gómez Rendón and Willem Adelaar
38. Loanwords in Kali’na, a Cariban language of French Guiana
   Odile Renault-Lescure
39. Loanwords in Hup, a Nadahup language of Amazonia
   Patience Epps
40. Loanwords in Wichí, a Mataco-Mataguayan language of Argentina
   Alejandra Vidal and Verónica Nercesian
41. Loanwords in Mapudungun, a language of Chile and Argentina
   Lucía A. Golluscio

Index of Languages

Notational conventions

List of abbreviations

1       first person
2       second person
3       third person
A       agent-like argument of canonical transitive verb
ABL     ablative
ABS     absolutive
ACC     accusative
ACT     active
ADJ     adjective
ADV     adverb(ial)
AGR     agreement
AGT     agent, agentive
ALL     allative
ANTIP   antipassive
APPL    applicative
ART     article
AUX     auxiliary
BEN     benefactive
CAUS    causative
CIRC    circumfix
CLF     classifier
COLL    collective
COM     comitative
COMP    complementizer
COMPL   completive
COND    conditional
COP     copula
CVB     converb
DAT     dative
DECL    declarative
DEF     definite
DEM     demonstrative
DENOM   denominal
DET     determiner
DIMIN   diminutive
DIST    distal
DISTR   distributive
DU      dual
DUR     durative
ERG     ergative
EXCL    exclusive
F       feminine
FEM     feminine
FOC     focus
FREQ    frequentative
FUT     future
GEN     genitive
HON     honorific
IMP     imperative
INCL    inclusive
IND     indicative
INDF    indefinite
INF     infinitive
INS     instrumental
INTR    intransitive
IPFV    imperfective
IRR     irrealis
LOC     locative
M       masculine
MASC    masculine
MID     middle
N-      non- (e.g. NSG nonsingular, NPST nonpast)
NEG     negation, negative
NMLZ    nominalizer/nominalization
NOM     nominative
OBJ     object
OBL     oblique
P       patient-like argument of canonical transitive verb
PASS    passive
PFV     perfective
PL      plural
POSS    possessive
PRED    predicative
PRF     perfect
PRS     present
PROG    progressive
PROH    prohibitive
PROX    proximal/proximate
PST     past
PTCP    participle
PURP    purposive
Q       question particle/marker
QUOT    quotative
RECP    reciprocal
REFL    reflexive
REL     relative
RES     resultative
S       single argument of canonical intransitive verb
SBJ     subject
SBJV    subjunctive
SEM     semelfactive
SG      singular
STAT    stative
TOP     topic
TR      transitive
VN      verbal noun
VOC     vocative

Notational conventions for the maps

Language: Yoruba
Main language: Hausa
Country: NIGERIA
City: Katsina
Province etc.: Borno
Geographical regions

Acknowledgments

The Loanword Typology project, whose results are reported in this book, was made possible by generous funding from the Department of Linguistics of the Max Planck Institute for Evolutionary Anthropology, Leipzig (Bernard Comrie, director). Most of the authors were able to attend one or more of the ten meetings between 2003 and 2007, organized competently by Max Planck staff members; we thank in particular Julia Cissewski, Claudia Büchel, Peter Fröhlich and Claudia Schmidt. We also had great help from a number of highly motivated and reliable student assistants, not only in checking and correcting the databases, but also in editing and even typesetting this volume. Thanks are due especially to Yan Luo, Eva-Maria Schmortte, Birgit Jänen, Luise Dorenbusch, Jenny Seeg, Alex Jahraus, and Tyko Dirksmeyer. For the maps, we had invaluable help from Sandra Michaelis from Max Planck’s multimedia department. But the most important person over the years has been our indefatigable database manager, Bradley Taylor, without whom this project would have had to remain much more modest in its goals and achievements. For the creation of the online version of the World Loanword Database, we are grateful to the Max Planck Digital Library, especially Robert Forkel.

Leipzig/Jakarta, 16 September 2009
Martin Haspelmath and Uri Tadmor

List of authors

Alexander Adelaar
Asia Institute
The University of Melbourne
Victoria 3010
Australia
Homepage: http://www.asiainstitute.unimelb.edu.au/people/staff/adelaar.html

Willem Adelaar
Leiden University Centre for Linguistics
P.O. Box 9515
2300 RA Leiden
The Netherlands
Homepage: http://www.hum.leiden.edu/lucl/organisation/members/adelaarwa.html

Mark J. Alves
Department of Reading, ESL, World Languages and Philosophy
Montgomery College
51 Mannakee St.
Rockville, MD 20850
U.S.A.

Ari Awagana
Institut für Afrikanistik
Universität Leipzig
Postfach 100920
04009 Leipzig
Germany
Homepage: http://www.uni-leipzig.de/~afrika

Dik Bakker
Department of Linguistics and English Language
Lancaster University
Lancaster LA1 4YT
United Kingdom
Homepage: http://home.medewerker.uva.nl/d.bakker/

Hauke Bartels
Sorbisches Institut
Abteilung für niedersorbische Forschungen
August-Bebel-Straße 82
03046 Cottbus
Germany
Homepage: http://www.serbski-institut.de/cms/de/116

Cecil H. Brown
Department of Anthropology
Northern Illinois University
Stevens Building 102
DeKalb, IL 60115
U.S.A.
Homepage: http://www3.niu.edu/anthro/people/faculty/brown.htm

Marina Chumakina
Surrey Morphology Group
University of Surrey
Guildford, GU2 7XH
United Kingdom
Homepage: http://www.surrey.ac.uk/cmc/staff-profiles/marina-chumakina.htm

Bernard Comrie
University of California Santa Barbara
&
Max Planck Institute for Evolutionary Anthropology
Deutscher Platz 6
04103 Leipzig
Germany
Homepage: http://email.eva.mpg.de/~comrie/

Viktor Elšík
Ústav lingvistiky a ugrofinistiky (Institute of Linguistics and Finno-Ugric Studies)
Univerzita Karlova (Charles University)
Nám. J. Palacha 2
Praha 1, 110 00
Czech Republic
Homepage: http://ulug.ff.cuni.cz/osobni/elsik/index.php

Patience Epps
Department of Linguistics
University of Texas at Austin
1 University Station B5100
Austin, TX 78712–0198
U.S.A.

Zarina Estrada Fernández
Departamento de Letras y Lingüística
División de Humanidades y Bellas Artes
Universidad de Sonora
Edificio 3-A
Rosales y Blvd. Luis Encinas s/n
Col. Centro
83000 Hermosillo, Sonora
Mexico

Lucía A. Golluscio
Instituto de Lingüística
Facultad de Filosofía y Letras
Universidad de Buenos Aires
25 de mayo 217 – 1er piso
1002 Buenos Aires
Argentina

Jorge A. Gómez Rendón
Department Theoretical Linguistics
Faculty of Humanities
University of Amsterdam
Spuistraat 210
1012 VT Amsterdam
The Netherlands
Homepage: http://home.medewerker.uva.nl/j.a.gomezrendon/

Jeff Good
Department of Linguistics
University at Buffalo
609 Baldy Hall
Buffalo, NY 14260
U.S.A.
Homepage: http://buffalo.edu/~jcgood

Anthony Grant
Department of English and History
Edge Hill University
St Helens Road
Ormskirk, Lancashire L39 4QP
United Kingdom
Homepage: http://www.edgehill.ac.uk/english/EnglishLanguage/Staff/AnthonyGrant.htm

Martin Haspelmath
Max-Planck-Institut für evolutionäre Anthropologie
Deutscher Platz 6
04103 Leipzig
Germany
Homepage: http://www.eva.mpg.de/lingua/staff/haspelmath.php

Ewald Hekking
Departamento de Investigaciones Antropológicas
Facultad de Filosofía
Universidad Autónoma de Querétaro
Querétaro
Mexico

Kristine A. Hildebrandt
Department of English
Southern Illinois University Edwardsville
Edwardsville IL 62026
U.S.A.
Homepage: http://www.siue.edu/~khildeb

Kerry Hull
College of Foreign Studies
Reitaku University
2–1–1 Hikarigaoka
Tobu Jutaku 44
Kashiwa, Chiba 277–0065
Japan

Madzhid Khalilov
G. Tsadasa Institute for Language, Literature, and Art of the Daghestan Scientific Center of the Russian Academy of Sciences
&
Max Planck Institute for Evolutionary Anthropology
Deutscher Platz 6
04103 Leipzig
Germany

Maarten Kossmann
Department of African Languages and Cultures
Leiden University
P.O. Box 9515
2300 RA Leiden
The Netherlands

Nicole Kruspe
School of Languages and Linguistics, Arts Centre 515B
The University of Melbourne
Parkville, Victoria, 3010
Australia
Homepage: http://www.linguistics.unimelb.edu.au/about/staff/profiles/kruspe/index.html

Li Fengxiang
Department of English
California State University, Chico
Taylor Hall 209
Chico, CA 95929-0830
U.S.A.

Doris Löhr
Asien-Afrika-Institut
Abteilung für Afrikanistik & Äthiopistik
Universität Hamburg
Edmund-Siemers-Allee 1
20146 Hamburg
Germany
Homepage: http://www.aai.uni-hamburg.de/afrika/Personen.html

Patrick McConvell
School of Language Studies
Australian National University
Canberra ACT 0200
Australia

Susanne Michaelis
Max-Planck-Institut für evolutionäre Anthropologie
Deutscher Platz 6
04103 Leipzig
Germany
Homepage: http://email.eva.mpg.de/~michaels/
Residence: Berghofweg 8, 35041 Marburg, Germany

Maarten Mous
Department of African Languages and Cultures
Leiden University Centre for Linguistics
P.O. Box 9515
2300 RA Leiden
The Netherlands
Homepage: http://www.hum.leiden.edu/lucl/organisation/mousmpgm.jsp

Verónica Nercesian
Universidad Nacional de Formosa
Facultad de Humanidades e Instituto de Investigaciones Lingüísticas
Av. Gutnisky 3200
Formosa
Argentina

Innokentij N. Novgorodov
Yakut State Engineering Technical Institute
Ulica Stroitelej 8
677009 Yakutsk
Yakutia, Russia
Homepage: http://www.yseti.ru

Martha Qorro
Dean of students, University of Dar es Salaam
Department of Foreign Languages and Linguistics; Communication skills
P.O. Box 35040
Dar es Salaam
Tanzania

Brigitte Pakendorf
Max Planck Institute for Evolutionary Anthropology
Deutscher Platz 6
04103 Leipzig
Germany
Homepage: http://www.eva.mpg.de/cpl/staff/pakendorf/index.html

ʻŌiwi Parker Jones
Wolfson College
University of Oxford
Linton Road
Oxford OX2 6UD
United Kingdom
Homepage: http://www.phon.ox.ac.uk/~oiwi/

Martha Ratliff
Department of English
Wayne State University
5057 Woodward Ave., Suite 9408
Detroit, MI 48202
U.S.A.
Homepage: http://www.clas.wayne.edu/faculty/ratliff

Odile Renault-Lescure
Institut de Recherche pour le Développement (IRD)
Centre d’Etudes des Langues Indigènes d’Amérique (CELIA)
7, rue Guy Môquet
94801 Villejuif
France
Homepage: http://celia.cnrs.fr/Fr/Labo/Renault-Lescure.htm

Michael Rießler
Institut für Vergleichende Germanische Philologie und Skandinavistik
Albert-Ludwigs-Universität Freiburg i. Br.
79085 Freiburg
Germany
Homepage: http://www.skandinavistik.uni-freiburg.de/institut/mitarbeiter/riessler/

Marcel Rosalie
Sanssouci
Victoria, Mahé
Seychelles

Malcolm Ross
Department of Linguistics
Research School of Pacific and Asian Studies
The Australian National University
Canberra ACT 0200
Australia
Homepage: http://rspas.anu.edu.au/linguistics/projects/biomdr.html

Thilo C. Schadeberg
Department of African Languages and Cultures
Leiden University
P.O. Box 9515
2300 RA Leiden
The Netherlands

Christopher K. Schmidt
Department of Linguistics, MS–23
Rice University
6100 Main Street
Houston, TX 77005–1827
U.S.A.

Roland Schuhmann
Lehrstuhl für Indogermanistik
Friedrich-Schiller-Universität Jena
Zwätzengasse 12
07743 Jena
Germany
Homepage: http://www.indogermanistik.uni-jena.de/

Kim Schulte
Department of Modern Languages
The University of Exeter
Queen’s Building, The Queen’s Drive
Exeter, EX4 4QH
United Kingdom
Homepage: http://www.sall.ex.ac.uk/languages/content/view/344/3/

Titima Suthiwan
Centre for Language Studies
National University of Singapore
9 Arts Link AS4/02–05
Singapore 117570
Homepage: http://profile.nus.edu.sg/fass/clsts/

Uri Tadmor
Max Planck Institute for Evolutionary Anthropology
Deutscher Platz 6
04103 Leipzig
Germany
Homepage: http://www.eva.mpg.de/lingua/staff/tadmor.php
Jakarta Field Station
PKBB, Unika Atma Jaya
Jl. Sudirman 51
Jakarta 12930
Indonesia

Mauro Tosco
Department of Oriental Studies
University of Turin
via Giulia di Barolo, 3 A
10124 Torino
Italy

Edward Vajda
Modern and Classical Languages
Western Washington University
Bellingham, WA 98225
U.S.A.
Homepage: http://pandora.cii.wwu.edu/vajda/

Nicoline van der Sijs
Schaepmanplein 20
2314 EH Leiden
The Netherlands
Homepage: http://nl.wikipedia.org/wiki/Nicoline_van_der_Sijs

Alejandra Vidal
Universidad Nacional de Formosa
Facultad de Humanidades e Instituto de Investigaciones Lingüísticas
Av. Gutnisky 3200
Formosa
Argentina

Lindsay J. Whaley
Linguistics and Cognitive Science
Dartmouth College
6086 Reed Hall
Hanover, NH 03755
U.S.A.
Homepage: http://www.dartmouth.edu/~linguist/faculty/whaley.html

Søren Wichmann
Max Planck Institute for Evolutionary Anthropology
Deutscher Platz 6
04103 Leipzig
Germany
Homepage: http://email.eva.mpg.de/~wichmann/

Thekla Wiebusch
Centre de Recherches Linguistiques sur l’Asie Orientale (CRLAO), CNRS
54 Bd. Raspail
75006 Paris
France

H. Ekkehard Wolff
Institut für Afrikanistik, Universität Leipzig
Mailing address: Schleiblick 5
Weseby
24354 Kosel
Germany
Homepage: http://www.uni-leipzig.de/~afrika

Chapter I

The Loanword Typology project and the World Loanword Database

Martin Haspelmath and Uri Tadmor

1. Goals and collaborative design

How likely is it that a word with a given lexical meaning would be borrowed from one language into another? Before we set out on the Loanword Typology project, this question could be answered mostly on the basis of impressionistic observations, such as “body part terms are unlikely to be borrowed”, or “terms for new artifacts are often borrowed”. In contrast, the research reported in this book is an empirical study of borrowability – the relative likelihood that words with particular meanings would be borrowed. In the study, we used the classical methods of linguistic typology:

(i) establishing a worldwide sample of languages (41 languages, see §2)
(ii) surveying the types of loanwords found in these languages, on the basis of a fixed list of lexical meanings (1,460 lexical meanings, see §3)
(iii) attempting generalizations across the languages of the sample (see chapter III on findings and results).

There are a variety of reasons why one would like to know, for each lexical meaning, what its chances of being borrowed are:

(a) In assessing genealogical relatedness between languages, it is important to separate inherited material from borrowed material. Loanwords point to historical contact between two languages (the presence of people with at least some knowledge of both languages at some stage), but not to genealogical relatedness (i.e. descent from a common ancestral language). But which words are the most likely to be inherited? Linguists often assume that there is a set of words that are highly stable, unlikely to be replaced by borrowings, meaning shift, or new formations (or become obsolete without replacement), but this notion needs to be made more precise to enhance the validity of conclusions based on it.

(b) The likelihood of lexical borrowing depends on the type of contact situation.
A language of a population under the political control of another group may be likely to borrow administrative terms from the dominant group’s language, and seafaring populations may contribute marine words to languages spoken inland. An invading population may borrow terms for local flora and fauna even if it is technologically and economically superior to the indigenous population. Once such generalizations have been securely established (on the basis of attested examples), it will be possible to draw inferences about the history of a population from the loanword patterns of the language it uses.

(c) Borrowing patterns may be influenced by non-social, strictly linguistic factors. Most often cited are genealogical or typological distances among the languages involved. Other potential factors include semantic complexity, abstractness, and taxonomic level, as well as syntactic factors such as word-class affiliation (e.g. noun vs. verb vs. adjective; content word vs. function word). Generalizations in this area are interesting for linguistics-internal reasons because they may provide insights about the nature of language structure.

To make progress on these questions, the editors set up a collaborative project, called the Loanword Typology (LWT) project, which began in 2004 and is being completed with the publication of the volume and the accompanying online database. In practical terms, prospects for the success of the project seemed to be greatest if each language was under the responsibility of an author who is a specialist in the language and its history. Each of the authors (or author teams) provided counterparts (translational equivalents) for items on a fixed list of 1,460 items, called the LWT meaning list (§3). They were also requested to supply information on what is known about the historical circumstances under which these words were borrowed. The authors were allowed to include additional loanwords (beyond the 1,460 items on the list), but the fixed list was important so that quantitative generalizations across the sample languages could be made.

The resulting combined database, which we call the World Loanword Database¹ (WOLD), is a lexical database comprising 41 individual language subdatabases. Each subdatabase contains about 1,000–2,000 words or counterparts of the meanings on the Loanword Typology Meaning List.
The number of words in each subdatabase varies, because sometimes a language has no counterpart for a particular meaning, while in other cases it has several counterparts. Moreover, contributors were able to add meanings (and their counterparts) to their individual subdatabases, although these were not considered for statistical purposes (§4). For each word, the database gives the (orthographic) form, information about the analyzability of this form, a morpheme-by-morpheme gloss (for analyzable words), information about loanword status, information about the age of the word, and optional further information of various kinds (§5). For each loanword, the database gives the donor language and the source word (with its meaning), as well as some information about the borrowing circumstances (§6). In this chapter, we describe the World Loanword Database in greater detail, explaining the data fields and the criteria for providing the information that the contributors were asked to follow during their work on the LWT project. Each of the case studies in this volume describes the lexical borrowing patterns in one of

1 The database is available online at http://wold.livingsources.org/

I. The Loanword Typology project and the World Loanword Database


the project languages and presents some of the results of the database in the form of uniform tables (see §8).

2. The language sample

In selecting languages for inclusion in the project, an effort was made to represent the world’s genealogical, geographical, typological, and sociolinguistic diversity. However, the overriding factors were practical. Languages could only be included if a specialist in the language volunteered to invest the considerable amount of time and effort needed to complete the database and to write a book chapter based on the findings. Indeed, no serious and timely offer to contribute a database and book chapter was turned down. The project languages are listed in Table 1, with their genealogical affiliations and their rough geographical locations (more detail is provided in the individual chapters).

Admittedly, our language sample is not ideal. Some regions or language families are over- or under-represented, as are some typological and sociolinguistic types. Moreover, the inclusion of a number of closely related languages led to some skewing of the statistics. For example, if languages A, B, and C are descended from language D, and all three were included in the project, a word borrowed into language D and inherited by A, B, and C would have been counted as three loanwords, whereas in fact it represents a single borrowing event. In any event, few closely related languages were included, and the number of loanwords which constitute shared retentions among project languages is negligible. Since few languages from North America, not a single Papuan language, and quite generally too many languages with large numbers of speakers were included, we cannot say that the sample is fully representative of the world’s linguistic diversity. But it is much better than anything that existed before our project. At any rate, our world-wide sample is preferable to using just one or two languages or to relying on intuition, as in many previous studies of borrowability.

Our sample includes languages indigenous to all continents and belonging to many language families. 
There are also two creoles (Saramaccan and Seychelles Creole) and an isolate (Mapudungun – or two isolates, if one counts Ket, now an isolate due to the extinction of its last relative). Some of the project languages are spoken by hundreds of millions (English, Mandarin), while others are spoken only by a few thousand (Hup) or even a few hundred (Ceq Wong). Some have a written history going back millennia (Chinese, [Malay-]Indonesian), while others are not normally written to this day (Gawwada, Tarifiyt Berber). Some are official languages of nation-states (Dutch, Romanian), while others are spoken by ethnolinguistic minorities (Yaqui, Kildin Saami). Typologically, the sample includes highly isolating languages (Vietnamese, White Hmong) as well as synthetic languages, both more fusional ones (Hausa, Lower Sorbian) and more agglutinative ones (Imbabura Quechua, Indonesian).
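The double-counting problem described above (a word borrowed into an ancestor D and inherited by A, B, and C) can be made concrete with a small Python sketch. The records, the word identifiers `w1` and `w2`, and the idea of tagging each loanword with the stage at which it was borrowed are purely illustrative assumptions; they are not fields of the World Loanword Database.

```python
# Hypothetical records: (project language, etymon id, stage at which it was borrowed).
# "w1" was borrowed once into ancestor D and inherited by A, B, and C;
# "w2" was borrowed directly into A.
records = [
    ("A", "w1", "D"),
    ("B", "w1", "D"),
    ("C", "w1", "D"),
    ("A", "w2", "A"),
]

# Counting per language treats every record as a separate loanword.
loanwords_counted = len(records)

# Counting borrowing events collapses records that share etymon and borrowing stage.
borrowing_events = len({(etymon, stage) for _, etymon, stage in records})

print(loanwords_counted)  # 4
print(borrowing_events)   # 2
```

The gap between the two figures is the skew that closely related sample languages can introduce into per-language statistics.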


Martin Haspelmath and Uri Tadmor

Table 1: The LWT project languages and the contributors

Language            Affiliation                     Main location(s)                      Contributor(s)
Archi               Lezgic, Nakh-Daghestanian       Daghestan, Russian Federation         Marina Chumakina
Bezhta              Tsezic, Nakh-Daghestanian       Daghestan, Russian Federation         Bernard Comrie and Madzhid Khalilov
Ceq Wong            Aslian, Austro-Asiatic          West Malaysia                         Nicole Kruspe
Dutch               Germanic, Indo-European         Netherlands                           Nicoline van der Sijs
English             Germanic, Indo-European         Britain, USA, Canada, Australia       Anthony Grant
Gawwada             Cushitic, Afro-Asiatic          Ethiopia                              Mauro Tosco
Gurindji            Pama-Nyungan                    Australia                             Patrick McConvell
Hausa               Chadic, Afro-Asiatic            Nigeria, Niger                        Ari Awagana and H. Ekkehard Wolff, with Doris Löhr
Hawaiian            Polynesian, Austronesian        Hawaiʻi                               ʻŌiwi Parker Jones
Hup                 Nadahup                         Brazil, Colombia                      Patience Epps
Imbabura Quechua    Quechuan                        Ecuador                               Jorge A. Gómez Rendón (and Willem Adelaar)
Indonesian          Malayic, Austronesian           Indonesia                             Uri Tadmor
Iraqw               Cushitic, Afro-Asiatic          Tanzania                              Maarten Mous and Martha Qorro
Japanese            Japanese-Ryukyuan               Japan                                 Christopher K. Schmidt
Kali’na             Cariban                         Venezuela                             Odile Renault-Lescure
Kanuri              Saharan                         Nigeria, Niger                        Doris Löhr and H. Ekkehard Wolff, with Ari Awagana
Ket                 Yeniseian                       Russia                                Edward Vajda and Andrey Nefedov
Kildin Saami        Uralic                          Russia                                Michael Rießler
Lower Sorbian       Slavic, Indo-European           Germany                               Hauke Bartels
Malagasy            Southeast Barito, Austronesian  Madagascar                            Alexander Adelaar
Manange             Bodish, Sino-Tibetan            Nepal                                 Kristine Hildebrandt
Mandarin Chinese    Sinitic, Sino-Tibetan           China                                 Thekla Wiebusch (and Uri Tadmor)
Mapudungun          (isolate)                       Chile, Argentina                      Lucía A. Golluscio, Fresia Mellico, and Adriana Fraguas
Old High German     Germanic, Indo-European         Northern Germany                      Roland Schuhmann
Oroqen              Tungusic                        China                                 Fengxiang Li and Lindsay J. Whaley
Otomi               Otomanguean                     Mexico                                Dik Bakker and Ewald Hekking
Q’eqchi’            Mayan                           Guatemala, El Salvador, Belize        Søren Wichmann and Kerry Hull
Romanian            Romance, Indo-European          Romania                               Kim Schulte
Sakha               Turkic                          Siberia                               Brigitte Pakendorf and Innokentij N. Novgorodov
Saramaccan          English-based creole            Surinam                               Jeff Good
Selice Romani       Indo-Iranian, Indo-European     Slovakia                              Viktor Elšík
Seychelles Creole   French-based creole             Seychelles                            Susanne Michaelis, with Marcel Rosalie and Katrin Muhme
Swahili             Bantu, Niger-Congo              Tanzania, Kenya, Uganda, D. R. Congo  Thilo C. Schadeberg
Takia               Oceanic, Austronesian           Papua New Guinea                      Malcolm Ross
Thai                Tai-Kadai                       Thailand                              Titima Suthiwan (and Uri Tadmor)
Tarifiyt Berber     Afro-Asiatic                    Morocco                               Maarten Kossmann
Vietnamese          Viet-Muong, Austro-Asiatic      Vietnam                               Mark J. Alves
White Hmong         Hmong-Mien                      Laos                                  Martha Ratliff
Yaqui               Uto-Aztecan                     Mexico                                Zarina Estrada Fernández
Wichí               Mataco-Mataguayan               Argentina, Bolivia                    Alejandra Vidal and Verónica Nercesian
Zinacantán Tzotzil  Mayan                           Mexico                                Cecil H. Brown

3. The Loanword Typology meaning list

3.1. General features of the list

The LWT meaning list consists of 1,460 lexical meanings, which are listed in the appendix to this chapter. Most can be expected to have word counterparts in any language, at least in principle, e.g. ‘head’, ‘to eat’, ‘dead’, and ‘strong’. However, some other meanings are not expected to occur in many languages. These include numerous meanings inherited from the lists on which our list was based (discussed later in this section) such as ‘mead’, ‘awl’, ‘nit’, ‘battle-ax’, ‘netbag’, ‘men’s house’, ‘mother-in-law of a man’, and ‘father-in-law of a woman’. A few region-specific meanings were also added by the LWT project based on suggestions by contributors, for example ‘larch’, ‘manioc bread’, ‘tumpline’, and ‘snowshoe’. By asking the contributors to provide the counterparts of these meanings, we aimed to obtain comparable lexical samples from all project languages.

Note that the list is a “meaning list”, not a “word list”. The items on the list are meanings that could be relevant in any language, not words of a particular language (in particular, they are not words of our working language English; see §3.3). Of course, the comparability of the lexicons of different languages is necessarily limited: biogeographical and cultural variation entails that different kinds of lexical meanings occur in different languages. Amazonian languages do not have words for ‘snowshoe’ or ‘mosque’, and Siberian languages do not have words for


‘toucan’ or ‘manioc’. None of the 41 individual language databases has counterparts for all 1,460 meanings, but only a handful of the subdatabases have fewer than 1,000 counterparts. In addition, even when talking about similar things, different languages divide the world in different ways. For rodents of the genus Mus, Indonesian uses the word tikus, but this word does not mean simply ‘mouse’, because it can equally be used for the genus Rattus (‘rat’). Similarly, English normally makes no distinction between the genera Ciconia and Mycteria, calling both of them stork. Such lack of cross-linguistic semantic congruence is pervasive and will be discussed further below (§4). The LWT meaning list is based on the meaning list of the Intercontinental Dictionary Series (IDS), a project founded by Mary Ritchie Key (1924–2003) and now headed by Bernard Comrie. Key modeled the IDS list, which consists of 1,310 meanings, after Carl Darling Buck’s Dictionary of Selected Synonyms in the Principal Indo-European Languages (1949, University of Chicago Press), which contains over 1,200 meanings. The lexical meanings of Buck’s list are naturally biased towards earlier periods and the geographical region of Europe and southwestern Asia. Key added many meanings to the list that are appropriate in other biogeographical and cultural settings, especially (native) South America. The LWT meaning list includes all 1,310 meanings on the IDS list as well as 150 additional meanings which fall principally into three categories: (a) concepts important to geographical regions beyond the geographical and cultural biases of the IDS list; (b) meanings that appear on the Swadesh 207 list but not on the IDS list; and (c) other meanings deemed diagnostically useful, especially common meanings pertaining to modern life, as such terms are almost entirely missing from the IDS list (‘radio’, ‘bus’, ‘hospital’, and similar items). 
As mentioned above, the contributors were allowed to add further meanings to their individual language subdatabases beyond the meanings on the LWT list, but these were not counted for statistical purposes.

3.2. Semantic fields and semantic word classes

The 1,460 meanings are divided into 24 fields. Of these, 22 were semantic fields retained from Buck’s (1949) list and Key’s IDS list (slightly renamed in some cases), and two fields were added (23 and 24). These additional two fields are not, strictly speaking, semantic fields, but they were deemed important for our study. The 24 fields are given in Table 2. In some cases, the grouping of words is fairly obvious (e.g. animal names in field 3, body parts in field 4), but in many other cases the grouping of the words is somewhat arbitrary, and alternative groupings are possible and might be preferred by other scholars. Thus, ‘wheel’ is in field 10 (Motion), but could also be put into field 9 (Basic actions and technology), and ‘clever’ is in field 16 (Emotions and values), but would equally fit into field 17 (Cognition). Nevertheless, these semantic fields are very useful for a first orientation, and they are used in the tables of all the language chapters. The field numbers are also used in the LWT meaning codes,


following Buck’s and Key’s usage. For example, the world has the code 1.1, the cat has the code 3.62, and clever has code 16.84, indicating they belong to fields 1, 3, and 16, respectively. In addition, we assigned each meaning to a semantic word class, as shown in Table 3. These are meant as very approximate categories, corresponding roughly to ‘things and entities’, ‘actions and processes’, ‘properties’, ‘manner and location’, and ‘grammatical meanings’. The labels correspond to traditional part-of-speech labels, but this is purely for convenience. There is no expectation that these ontological categories would necessarily match the parts of speech in a particular language (although this is often the case). As already explained, the list is a meaning list, not a list of words with syntactic properties, so some meanings may well have counterparts in different languages that belong to different parts of speech (see §5.1). Note that the semantic word class “Function word” is broader than the field Miscellaneous function words, as the latter only contains grammatical meanings not already included in one of the other chapters. Table 2:

Semantic fields of the LWT meaning list

No.  Semantic field label              Number of meanings
1    The physical world                75
2    Kinship                           85
3    Animals                           116
4    The body                          159
5    Food and drink                    81
6    Clothing and grooming             59
7    The house                         47
8    Agriculture and vegetation        74
9    Basic actions and technology      78
10   Motion                            82
11   Possession                        46
12   Spatial relations                 75
13   Quantity                          38
14   Time                              57
15   Sense perception                  49
16   Emotions and values               48
17   Cognition                         51
18   Speech and language               41
19   Social and political relations    36
20   Warfare and hunting               40
21   Law                               26
22   Religion and belief               26
23   Modern world                      57
24   Miscellaneous function words      14
     Total                             1,460
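The link between LWT meaning codes and the semantic fields of Table 2 can be illustrated with a short Python sketch. It assumes, in line with the examples given in the text (1.1, 3.62, 16.84), that the part of the code before the dot is the field number. The names `FIELDS`, `field_of`, and `field_label` are illustrative and are not part of the database itself.

```python
# Field number -> (label, number of meanings), copied from Table 2.
FIELDS = {
    1: ("The physical world", 75),
    2: ("Kinship", 85),
    3: ("Animals", 116),
    4: ("The body", 159),
    5: ("Food and drink", 81),
    6: ("Clothing and grooming", 59),
    7: ("The house", 47),
    8: ("Agriculture and vegetation", 74),
    9: ("Basic actions and technology", 78),
    10: ("Motion", 82),
    11: ("Possession", 46),
    12: ("Spatial relations", 75),
    13: ("Quantity", 38),
    14: ("Time", 57),
    15: ("Sense perception", 49),
    16: ("Emotions and values", 48),
    17: ("Cognition", 51),
    18: ("Speech and language", 41),
    19: ("Social and political relations", 36),
    20: ("Warfare and hunting", 40),
    21: ("Law", 26),
    22: ("Religion and belief", 26),
    23: ("Modern world", 57),
    24: ("Miscellaneous function words", 14),
}


def field_of(lwt_code: str) -> int:
    """Return the field number, i.e. the part of the code before the dot."""
    field_part, _, _ = lwt_code.partition(".")
    return int(field_part)


def field_label(lwt_code: str) -> str:
    """Return the label of the semantic field a meaning code belongs to."""
    return FIELDS[field_of(lwt_code)][0]


total = sum(count for _, count in FIELDS.values())

print(field_label("3.62"))   # Animals
print(field_label("16.84"))  # Emotions and values
print(total)                 # 1460
```

Summing the per-field counts reproduces the total of 1,460 meanings, which is a convenient consistency check on the table.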


Table 3: Semantic word classes of the LWT meaning list

Semantic word class label   Number of meanings
“Nouns”                     905
“Verbs”                     334
“Adjectives”                120
“Adverbs”                   4
“Content words”             1,363
“Function words”            97
Grand total                 1,460

3.3. Identifying the meanings

It is important to emphasize that, as mentioned earlier, the Loanword Typology meaning list should be thought of as a list of meanings that is designed to elicit words from the project languages. It is not a list of English words, though the meanings are most readily identified by human users through their English labels (for the computer, the LWT code is the primary identifier). Some LWT meanings are narrower than those of the English label; for instance, LWT 9.61, labeled to forge, is intended to refer to the action of making something from a piece of metal, not to the action of illegally copying something. And some LWT meanings are broader than those of corresponding English words; e.g. LWT 1.36, labeled the river or stream, is intended to refer to a flowing body of water of any size, a meaning for which English does not have a non-circumlocutory expression. Contributors were made aware of differences between LWT meanings and the meanings of the English labels by means of typical context sentences and/or meaning descriptions when deemed necessary.

The LWT meaning list thus consists of three pieces of information for each meaning (in addition to the semantic field and semantic word class that we just saw): the label, a meaning description, and a typical context.

The LWT label consists of an English word or phrase and serves to summarize the intended meaning. In many cases, this summary is sufficient to give a clear idea of the meaning (e.g. the apple, the coffee, long). Labels of nouns contain the article the in order to make clear immediately that it is not the verb that is intended, and similarly verb labels contain the infinitival marker to (e.g. the fly, to fly, the bark, to smell, the light, to light). Where different meanings have the same English label, numbers in parentheses are used to differentiate the labels, so that all the labels are unique (e.g. to ask (1), i.e. ‘inquire’, to ask (2), i.e. ‘request’). 
Where the major varieties of English have different words, both are used in the label, separated by a slash (e.g. sick/ill, the maize/corn, the autumn/fall). The meaning description field contains additional information about the meaning provided by the editors in order to clarify or disambiguate certain items. The


typical context is a phrase or sentence that likewise serves to pick out a specific use if the English word of the label has a range of different uses.

4. Meaning-word relationships

Word lists traditionally used for cross-linguistic comparison, such as the Swadesh lists, usually contain just two types of information: (1) meanings in a metalanguage and (2) counterparts of those meanings in the language under investigation. In the overwhelming majority of cases, just one counterpart is entered for each meaning. The norm of choosing just one counterpart per meaning is understandable – multiple words complicate the comparison and make statistics more difficult. However, this can lead to an unintended distortion of the data. For example, given a choice between several counterparts, a historical linguist trying to show that two languages are related may prefer a word that has a cognate in the other language to words that do not. On the other hand, a dialectologist trying to establish an isogloss between two dialects would naturally prefer a word without a cognate in the other dialect. To take an example more relevant to the present study: given a choice of two synonyms, a genealogical linguist might prefer a native counterpart while a contact linguist would be predisposed to choosing a loanword.

The World Loanword Database addresses this problem by allowing several (indeed, an unlimited number of) counterparts per meaning. Likewise, the database also allows several meanings to be linked to one word, so that a polysemous word is not counted as several homophonous words. The database essentially consists of three tables: a Meanings table as the basis of the comparison (with information relating to the 1,460 meanings, as described in §3.2–3.3), a Words table (with word-related information on each language), and a Meaning-Word Pairs table (with information concerning the relationship between meanings and words). The three tables are linked through a unique meaning identifier (the LWT code) and a unique word identifier (the word record number). 
A challenge in comparing lexical data among numerous languages is that complete identity of meaning rarely occurs within a single language, let alone across languages. Thus, in our project there was often less than complete identity between LWT meanings (labeled in English) and their counterparts in the various project languages. The semantic scope of the counterparts could be broader or narrower than that of the LWT meaning, or a more complex semantic relationship between them could obtain. Contributors were given the possibility to indicate, if they so wished, one of four semantic relationships between LWT meanings and their counterparts: exact counterpart, sub-counterpart, super-counterpart, and para-counterpart. Exact counterpart was used if the LWT meaning roughly coincided with that of its counterpart in the relevant project language. Sub-counterpart was used if the meaning of the counterpart was narrower. For example, the Indonesian counterpart of LWT 2.96 (they) is meréka, which has a narrower scope than ‘they’, since it does not refer to inanimate objects. LWT 7.27 (the wall) is somewhat


different, in that two Indonesian sub-counterparts together make up its meaning: dinding ‘internal wall’ and témbok ‘external wall’. Super-counterpart was used for counterparts with a broader meaning than the LWT meaning. Thus LWT 1.212 (the soil) only has a super-counterpart in Indonesian, tanah, which means ‘land’ and ‘ground’ in addition to ‘soil’. Finally, para-counterpart was used in cases where the LWT meaning and the counterpart shared part of their meaning in a more complex way. Thus Indonesian dapur can be said to have a broader meaning than LWT 5.25 (the oven), because it can also mean ‘kitchen’; but it can also be said to have a narrower meaning, since it can only refer to a traditional oven, not to the modern kitchen appliance, which is called open.

When no counterpart is provided for a given meaning from a fixed list, traditional lists normally show a blank space or a dash. The user is left unsure of the reason why no word was filled in. In the World Loanword Database, contributors cite a reason for each meaning that lacks a counterpart. We distinguish three kinds of reasons:

(i) “Insufficient information” is chosen when the contributor could not identify a counterpart.

(ii) “Meaning irrelevant to speakers” applies to cultural and environmental items that speakers virtually never have occasion to talk about, for example ‘snow’ in the languages of many peoples inhabiting tropical regions. If most or all speakers are bilingual and speak a dominant language that has words for such items, speakers are often happy to use nonce borrowings, but the contributors were asked not to include words that are not well-established in the language.

(iii) “No counterpart”: Some meanings do not have a direct counterpart in the project language, although they would not be irrelevant in the culture or environment. For example, there is no counterpart in Indonesian for 3.16 (the pasture), even though livestock in Indonesia uses pastures as in other countries. Speakers know full well what pastures are, but they do not have a specific word for them, using a circumlocutory expression such as padang rumput ‘grass field’ instead. A meaning might also be taboo, or expressed only grammatically rather than lexically (as is often the case with meanings such as ‘without’ and ‘if’).
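The three-table design and the categories described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the class and attribute names, the use of dataclasses and enums, and the word record numbers 1 to 4 are hypothetical, while the LWT codes and the Indonesian examples are those discussed in the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Relation(Enum):
    """Optional semantic relationship between an LWT meaning and a counterpart."""
    EXACT = "exact counterpart"
    SUB = "sub-counterpart"
    SUPER = "super-counterpart"
    PARA = "para-counterpart"


class MissingReason(Enum):
    """Reason cited when a meaning has no counterpart."""
    INSUFFICIENT_INFORMATION = "Insufficient information"
    IRRELEVANT = "Meaning irrelevant to speakers"
    NO_COUNTERPART = "No counterpart"


@dataclass(frozen=True)
class Meaning:
    lwt_code: str  # unique meaning identifier
    label: str


@dataclass(frozen=True)
class Word:
    record_no: int  # unique word identifier (numbers here are made up)
    form: str


@dataclass(frozen=True)
class MeaningWordPair:
    lwt_code: str
    record_no: int
    relation: Optional[Relation] = None  # contributors could leave this open


meanings = {m.lwt_code: m for m in [
    Meaning("7.27", "the wall"), Meaning("1.212", "the soil"),
    Meaning("5.25", "the oven"), Meaning("3.16", "the pasture"),
]}
words = {w.record_no: w for w in [
    Word(1, "dinding"), Word(2, "témbok"), Word(3, "tanah"), Word(4, "dapur"),
]}
pairs = [
    MeaningWordPair("7.27", 1, Relation.SUB),
    MeaningWordPair("7.27", 2, Relation.SUB),
    MeaningWordPair("1.212", 3, Relation.SUPER),
    MeaningWordPair("5.25", 4, Relation.PARA),
]
missing = {"3.16": MissingReason.NO_COUNTERPART}


def counterparts(lwt_code: str) -> list:
    """Return all word forms linked to a meaning through the pairs table."""
    return [words[p.record_no].form for p in pairs if p.lwt_code == lwt_code]


def is_complete() -> bool:
    """Check that every meaning has a counterpart or a stated reason."""
    return all(counterparts(code) or code in missing for code in meanings)


print(counterparts("7.27"))  # ['dinding', 'témbok']
print(is_complete())         # True
```

Because the link between meanings and words lives in its own table, one meaning can have several counterparts, and one word could equally be paired with several meanings without being entered twice.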

5. Information on all words in the subdatabases

In addition to the structural improvements over traditional word lists discussed in the preceding section, the World Loanword Database has the advantage of providing much more information. Besides the meaning’s counterpart(s), contributors were also asked to give various other details regarding the word form, its structure, and its history. The kinds of information provided for each entry are discussed in this section. Some of the types of information (or “database fields” from the technical point of view) are so important that they were considered obligatory by the


editors, while others were optional. The kinds of information in §5.1–5.4 were obligatory in principle, though “no information” was always an option, even for the obligatory fields.

5.1. Word form in the project language

Contributors were asked to provide the counterparts in the relevant language in their standard citation forms, even if the citation form contained a grammatical morpheme such as a case affix, an article, or an infinitive marker. For nouns, this meant singular forms in almost all cases. However, if a noun occurred only in a different number than the English label, this was not a problem. Thus, a LWT meaning expressed by an English plurale tantum such as oats could have a counterpart that was a singular mass noun, and a singular could be rendered by a plurale tantum; e.g. German only has the plural form Geschwister ‘siblings’, but this would be a suitable counterpart of LWT 2.456 (the sibling). The counterpart words in the project languages were provided in the spelling or transcription/transliteration that is most commonly used by linguists for the language. When the language uses a non-Latin script, the form in the indigenous orthography could optionally be provided in a separate field. When there were two slightly different forms of the word, they could both be included in this field and separated by a comma, e.g. Indonesian mas, emas, a counterpart of LWT 9.64 (the gold). But when a meaning has two quite different counterparts, they counted as two distinct words and had to be entered in separate records. Homonyms were distinguished by indices in parentheses (e.g. Indonesian pasang (1) ‘high tide’, pasang (2) ‘pair’). The counterpart did not have to match the word class of the English word used to convey the LWT meaning (although it often did). As already mentioned, if the project language had a close semantic match that belonged to a different word class, it could constitute the counterpart. For example, LWT 5.14 (to be hungry) has an adjective counterpart in English (hungry), a verb counterpart in Gawwada (puffí ‘be hungry’), and a noun counterpart in Swahili (njaa ‘hunger’). The counterpart could be a single word, a compound, or a phrasal expression. 
Although the counterparts could be larger than words, we still refer to them as “words” for the sake of simplicity (a more precise term would be lexeme, in the sense of ‘lexical entry’). Phrasal expressions were only to be given if they were fixed and conventionalized. Contributors were specifically asked not to provide descriptions or explanations of the meaning as counterparts. For example, for LWT 4.393 (the feather) a language may have had ‘hair of bird’ as the best equivalent, but if this was not a fixed expression, it could not be used as the language’s counterpart, and the entry should have been left unfilled. By contrast, to make love is a fixed expression in English, so it can be used as counterpart of LWT 4.67 (to have sex). Similarly, contributors were asked not to enter compound phrases of two or more


hyponyms (“A and/or B”). For example, in the Indonesian list, LWT 2.94 (we) is left unfilled, because there is no single word that corresponds to this meaning, but rather two sub-counterparts that are already included in the LWT list: kita ‘we [inclusive]’ (corresponding to LWT 2.941) and kami ‘we [exclusive]’ (corresponding to LWT 2.942). Only established loanwords that were felt by the contributor to be part of the language’s lexicon were to be provided, not nonce borrowings (instances of single-word code switching). This distinction was often hard to make, especially when the language had no monolingual speakers; but contributors were asked to try to make it as best they could.

5.2. Analyzability and gloss

In assessing the possible loanword status of a word, the first question was whether the word was analyzable (i.e. morphosyntactically complex) within the language. If this was the case, it was almost certain that it was created by speakers of the language rather than borrowed from some other language. Such words were not considered loanwords, even when they contained borrowed elements. Contributors were asked to indicate whether the word was (1) unanalyzable (if the form could not be analyzed into two or more constituents); (2) semi-analyzable (if a constituent structure could be identified, but not all constituents had meanings, such as a “cranberry morph”; or if the word was analyzable to linguists but not to lay speakers); (3) analyzable derived; (4) analyzable compound; (5) analyzable phrasal. For analyzable items, contributors were asked to give a morpheme-by-morpheme gloss, i.e. a hyphenation and a gloss in square brackets. For example, for Kanuri shàdàmá ‘witness’, the field contains the following: “shàdà-má [testimony-owner.of]”. For abbreviations of grammatical categories, contributors were referred to the Leipzig Glossing Rules.

5.3. Borrowed

Most importantly, of course, contributors were asked to indicate whether, to the best of their knowledge, the word was a loanword, i.e. had been borrowed from another language at some point in the language’s history. Protolanguages were also considered stages of the same language, so that a word borrowed into Proto-Uralic, for example, would count as a loanword in Saami. Five degrees of certainty were distinguished:


0. No evidence for borrowing
1. Very little evidence for borrowing
2. Perhaps borrowed
3. Probably borrowed
4. Clearly borrowed

A value such as “Clearly not borrowed” or “Clearly inherited” was not used, because any word could have been borrowed at some prehistoric time, so we can never be sure that a word is not an old loanword. And even loanwords can be inherited, e.g. a word borrowed into Proto-Uralic can be inherited by Saami.

We define a loanword as a lexeme that has been transferred from one lect into another and is used as a word (rather than as an affix, for example) in the recipient language. Words from a substrate language, too, were considered to be loanwords for the purposes of the LWT project, so we include both adopted and imposed words (see chapter II, §2, §7.4). Lexemes transferred from one regional dialect to another and between an acrolect and a basilect were also treated as loanwords in principle, although in practice they play a minor role. Excluded from the class of loanwords are neologisms (= productively created lexemes) which consist partly or entirely of foreign material, because they are created in the recipient language, and not transferred from a donor language (cf. §5.5.4; but see “calqued” in §5.5.3).

5.4. Age

For each word, contributors gave the earliest time at which it was attested or could be reconstructed in the language. For loanwords, this meant the time when the word was borrowed. For nonloanwords, it meant the time of earliest attestation or reconstruction. Dates could be indicated by years (or centuries) or by period name, e.g. “Middle High German” or “Tang dynasty”, in which case contributors were asked to provide approximate dates for the periods, e.g. 1050–1350 for “Middle High German”, 618–907 for “Tang dynasty”, and 5000–3000 BCE for “Proto-Indo-European”.

Knowing the age of a word is important in this context for several reasons. For nonloanwords which have only existed in the language for a relatively short period, it is not possible to draw conclusions regarding borrowability: they may be replaced by a loanword given sufficient time. On the other hand, a word that has been present in a language for a thousand years without being replaced by a loanword provides good evidence that its meaning is less borrowable. For older loanwords, tracing their origin can be more rewarding since we are less likely to know the history of their borrowing situation compared to newer loanwords. Studying old loanwords can thus help fill important gaps in historical and archeological knowledge. Finally, in a diverse sample such as ours, the history of some languages is relatively well documented for thousands of years, while others have only been recorded for less than a century. Naturally, it is much easier to identify and trace loanwords in languages with well-documented histories. Knowing the age of loanwords thus enables us to make much more nuanced cross-linguistic comparisons.

5.5. Optional information

Various kinds of information were deemed important enough to be included in the project, but not so crucial as to be made obligatory.

5.5.1. Original script

As mentioned above, if the language used a Latin script, the word was given in the usual spelling. If the language used a non-Latin script, a conventional transcription or transliteration was provided if one existed. If the language had no conventional orthography, the contributor’s own transcription was used. Where the language is written in a non-Latin script, the word in the language’s usual writing system could be provided in the separate field “original script”.

5.5.2. The meaning of the word form, grammatical information, and comments

The meaning of the word in the project language was provided when it differed noticeably from the corresponding LWT meaning. (For the possible semantic relationships between the LWT meaning and the counterpart(s), see §4.) There were also optional fields for grammatical information (e.g. syntactic category or subclass) and for general comments about the word.

5.5.3. Calquing

Semantic borrowing (the transfer of meaning without the transfer of words) was not the focus of our project, which dealt strictly with lexical borrowing (which must involve the transfer of forms together with meanings). However, following requests from several contributors, an optional field was added where contributors could note if an analyzable word appeared to be a calque. A calque is a complex form that was created on the model of a complex form in a donor language and whose constituents correspond semantically to the donor language constituents, e.g. Archi !:"elmin dogi ‘snail’, literally ‘rain’s donkey’, calqued from Avar c’adal ħama ‘snail’, which also literally means ‘rain’s donkey’. Simple semantic loans (words in which the contact influence concerns exclusively the polysemy patterns) were expressly excluded from the project, although contributors were free to note them in the comments.

I. The Loanword Typology project and the World Loanword Database

5.5.4. Borrowed elements not constituting whole words

If an analyzable word was derived from a loanword, this could be noted in a special field (“contains a borrowed base”). However, even if the borrowed element constituted the main root, the word was not treated as a loanword, since it was created as a neologism in the recipient language and not in the donor language. An exception was made if the added morphemes were part of the borrowing process or part of the word’s normal citation form; such words were treated as loanwords. For example, Ket pasatbet ‘to rescue’ contains the suffixal element -bet, but this element was required to integrate the loan verb (borrowed from Russian spasat’) into the language, so this was not counted as a neologism.

5.5.5. Frequency

Frequency counts are important for the analysis of lexical borrowing, since it is generally assumed that lexical stability increases (and therefore borrowability decreases) with frequency. However, frequency information was not made obligatory in the LWT project, because many languages do not have significant representative corpora on which frequency counts can be computed. Contributors who had access to frequency information could enter the occurrence per million words in an optional field. Another field was provided for entering more impressionistic categories (“very common”, “fairly common”, or “not common”) for the benefit of contributors who had strong intuitions about the relative frequency of words but did not have access to numeric frequency counts. For various reasons, very few contributors made use of the frequency fields, so this is one of the aspects of lexical borrowing that will be left for future research.

5.5.6. Register

Register is also of potential interest for lexical borrowing, and an optional field was provided for entering information about the word’s register: “formal”, “colloquial”, or “general”. For example, when giving the English counterparts for item LWT 1.343 (the cape), promontory would be marked as ‘formal’ while peninsula would be marked ‘general’.

6. Additional information for all loanwords

If a word was considered a (certain or probable) loanword, there were several further (obligatory or optional) data fields that the contributors filled in.

Martin Haspelmath and Uri Tadmor

6.1. Source word and donor language

The first question we asked about borrowed items was their source, ideally their immediate source. For example, LWT 23.31 (the president) has the Indonesian counterpart presidén, which is ultimately derived from Latin praesidens. But Indonesian borrowed the word directly from Dutch president, so this was given as the source word. However, sometimes the immediate source was not known, in which case an earlier known source word was given in a separate field. For example, the Bezhta counterpart of LWT 5.821 (the chili pepper) is isti$o!. It appears to ultimately derive from Azerbaijani istiot ‘capsicum’, but its immediate source word is unknown (the etymon does not seem to occur in Avar, the expected intermediate donor language). In addition, it was sometimes clear from a word’s form that it was a loanword, but no source word (either immediate or earlier) could be identified. The source word was given in the spelling or transcription/transliteration that was most commonly used by linguists for the language.

The donor language, if known, was also stated. If it was questionable, it could be marked as “uncertain”. Sometimes there was more than one possible donor language, and the candidates could be narrowed down to a small set of languages, e.g. a family of closely related languages, or a set of two or three languages. Therefore family names like “Mongolic” were also acceptable as donor languages, and it was also possible to give several donor language names, each marked “uncertain”. The meaning of the source word was always provided, even if it was identical to the meaning of the loanword in the recipient language.

6.2. Loanwords vis-à-vis the recipient language

When a word is borrowed, it has an effect on the lexicon of the recipient language. It may replace an earlier word of roughly the same meaning, or simply be added to the lexicon where no earlier word with that meaning existed, or it may coexist with an earlier word of roughly the same meaning. Such information, as far as it was known, was provided by the contributors (with the choices “replacement”, “insertion”, and “coexistence”) for all loanwords.

Borrowing a word often entails a certain modification of the source word, required for the integration of the word into the recipient language. Contributors were not required to measure the degree of phonological and morphological integration of loanwords in a precise way. Instead, they were asked to rank the loanword impressionistically as “highly integrated”, “intermediate”, or “unintegrated”. As a rough guideline, unintegrated loanwords were those that kept significant phonological and/or morphological peculiarities of the donor language and were therefore recognizable as loanwords even to speakers with no training in linguistics. By contrast, highly integrated loanwords were those that had no structural properties that betrayed their foreign origin. Loanwords that had some synchronic properties of the foreign language were marked “intermediate”.

The environmental salience of borrowed meanings was also noted. By this we mean the degree to which a word’s meaning is relevant to the speakers in their environment. Three values could be chosen: “present in pre-contact environment” (for example, there were mountains in England even before the word mountain was borrowed from French); “present only since contact” (speakers of many South American languages borrowed the word for ‘horse’ from the Spaniards who introduced it to their environment); or “not present” (snow did not exist in Thailand either before or after the introduction of the Sanskrit loanword himá ‘snow’, but the word itself is known and understood by speakers of Thai). By “contact”, we mean the first contact between speakers of the recipient language and the donor language. This contact could have been with speakers of the donor language, but it could also have been with texts in the donor language.

6.3. Contact situation

One of the goals of the LWT project was to make inferences about the possible linguistic outcomes of different contact situations. Contributors were asked to provide a name for each contact situation that has led to lexical borrowing in the language on which they were working. There was not always a one-to-one relationship between the number of donor languages and the number of contact situations. One donor language could be involved in more than one contact situation, and conversely more than one language could be involved in the same contact situation. For instance, English dish was borrowed from Latin discus in pre-Old English times, whereas English discus was borrowed from the same Latin word in the 17th century. In this case, we need to distinguish between two contact situations: “Latin to West Germanic” and “Latin to English”. On the other hand, for the borrowing of boomerang and kangaroo, we can assume basically the same contact situation (“Australian Aboriginal contact”), even if the two terms are from two different donor languages. A similar case of several donor languages sharing one contact situation is that of Javanese and Sundanese loanwords in Indonesian. Words were borrowed from both languages into Jakarta Indonesian (including when it was still called “Malay”) when speakers of both languages poured into the city at the same time. This constituted a single contact situation. The various contact situations are explained in some detail in the individual language chapters.
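The many-to-many relationship between donor languages and contact situations can be made concrete with a small Python sketch. It is an illustration only, not the project's actual data model: the class `Loanword`, its field names, and the helper `group` are hypothetical. The contact situation labels are taken from the examples above, while the donor languages given for boomerang and kangaroo follow the standard etymologies and are not stated in the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Loanword:
    """Hypothetical minimal record: one loanword with its donor and contact situation."""
    form: str
    donor_language: str
    contact_situation: str


LOANWORDS = [
    Loanword("dish", "Latin", "Latin to West Germanic"),
    Loanword("discus", "Latin", "Latin to English"),
    # Donor languages below are assumptions based on standard etymologies.
    Loanword("boomerang", "Dharug", "Australian Aboriginal contact"),
    Loanword("kangaroo", "Guugu Yimidhirr", "Australian Aboriginal contact"),
]


def group(loanwords, key, value):
    """Map each `key` attribute to the set of `value` attributes seen with it."""
    index = defaultdict(set)
    for word in loanwords:
        index[getattr(word, key)].add(getattr(word, value))
    return dict(index)


# One donor language, two contact situations (Latin).
situations_by_donor = group(LOANWORDS, "donor_language", "contact_situation")
# One contact situation, two donor languages (Australian Aboriginal contact).
donors_by_situation = group(LOANWORDS, "contact_situation", "donor_language")
```

Looking up `situations_by_donor["Latin"]` returns both Latin contact situations, and `donors_by_situation["Australian Aboriginal contact"]` returns both donor languages, which is why the two notions had to be recorded separately.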

7. The database template for data entry by the contributors

For the purposes of data entry, a database template was designed by Bradley Taylor using FileMaker Pro, a cross-platform relational database application. This section briefly describes the template to give the reader a concrete idea of the contributors' task. The look and feel of the presentation of the data in the online World Loanword Database is different, but the basic database design is of course identical.

Each main record in the database represented a pairing of a meaning and a counterpart word form in the relevant language, with some additional information. For meanings without a counterpart, a “missing word form” section was completed (see §4 above and Figure 5 below). A completed database therefore contains at least 1,460 meaning-word pair records, one for each meaning on the LWT list. (As already mentioned, contributors were free to create additional meaning records, but the added meanings are not readily comparable across languages and were not included in our various statistics.) The meaning fields were not changeable by the contributors, who were supposed to fill in only the word-related fields.

The database contained several different tables. The MEANINGS table contained information regarding LWT meanings. The information was supplied (and was not modifiable) for the 1,460 LWT meanings, but when contributors added meanings, they could of course add the information themselves. This information includes the LWT code, the meaning label (in English), the semantic field, and semantic word class. In many cases, a disambiguating meaning description and/or an example of the intended meaning used in a prototypical context were also provided. Figure 1 shows the way the information from this table was displayed in the database template that the contributors worked with.

Figure 1: Meaning fields in the LWT database template
(Fields shown for the example record: LWT code: 1.21; Semantic Field: The physical world; LWT label (M2); Meaning Description (M3): ‘the hard surface of the earth, when compared to the area covered by sea’; Typical Context (M4): The captain sighted land in the distance; Semantic category (M5): Noun.)

The WORDS table contained information about the counterpart in the project language, divided into obligatory fields (with a white background) and optional fields (gray). The various fields were explained in §5. Figure 2 shows the fields that were to be filled in for all words in the database template.

Figure 2: Fields to be filled in for all words (native and borrowed)
(Fields shown: Record #; Project Language (W0); Word Form (W2); Original Script (W3); Grammatical Info (W4); Comments On Word Form (W5B); Meaning (W6); Analyzability (W7); Morpheme By Morpheme Gloss (W8); Borrowed (W9); Comments On Borrowed (W10); Calqued (W11); Created On Loan Basis (W12); Age (W13), with start year and end year of period; Frequency [numeric] (W13A), in occurrences per million words; Frequency [relative] (W13B); Register (W23); Record status.)

In addition, the WORDS table contained information that was filled in only for loanwords (and optionally for calques), as discussed in §6. This information was also divided into obligatory (white) and optional (gray) fields (see Figure 3).

Figure 3: Fields pertaining to loanwords (and calques) only
(Fields shown, under the form heading “Only for loanwords, calques, and created on loan basis”: No source word (neither W14 nor W14A) is identifiable; Immediate source word: Word form (W14), Donor Language (W15), Meaning (W16); Earliest known source word: Word form (W14A), Donor Language (W15A), Meaning (W16A); Comments on Intermediate Source Words (W17); Effect (W18); Integration (W19); Environmental Salience (W20); Reference (W21); Other Comments (W22); Contact situation (W27).)

A special feature of the database was the possibility of providing information not only on the word form in the relevant language, but also on the relationship of the word form with the LWT meaning (exact counterpart, sub-counterpart, etc.; see §4 above and Figure 4 below).

Figure 4: Fields for meaning-word relationship
(Fields shown: Word to meaning relationship (W25); Comments on meaning-word relationship (W5A).)

As already mentioned in §4, when no word form was supplied, the contributor was required to tick a special box (see Figure 5) and specify the reason why the word form was missing. The meaning could be irrelevant to the speakers, or the contributor did not have sufficient information (i.e. did not know what the counterpart was), or there was simply no suitable counterpart word (even though the meaning itself was relevant to the speakers).

Figure 5: Fields for meanings for which no counterpart is provided
(Fields shown: Missing word form; Reason for missing word form; Comments (on missing word form).)

In addition to the MEANINGS, WORDS, and MISSING WORD FORMS tables, there were several other related tables in the database structure. These were the AGES table, used for periodization of the word (§5.4); the SEMANTIC FIELDS table, containing all the semantic fields used in the project by their reference number (see Table 1); LANGUAGES, containing all project languages and information about them; and DONOR LANGUAGES, listing all the languages from which counterpart loanwords were borrowed.

Several special features were added to the database to help contributors in their work. Raw statistics breaking down words by semantic field and by semantic word class, along with percentage figures, could be generated with the help of special tools. Another tool generated lists of all probable and clear loanwords in each project language, which were used for preparing the loanword appendices accompanying each case study chapter. Finally, ten custom fields were provided which contributors could name and use as they saw fit. Most were used for private organizational and editorial purposes and were not included in the published World Loanword Database. Others, however, contained important information such as reconstructions and bibliographical references, and were therefore retained.
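To give a rough idea of how the tables named in this section relate to one another, the following Python sketch builds a comparable structure with the standard `sqlite3` module. It is an assumption-laden illustration, not the FileMaker Pro template itself: the table names mirror those mentioned in the text, but every column name, key, and data type is hypothetical.

```python
import sqlite3

# Hypothetical schema: table names follow the text, columns are illustrative only.
SCHEMA = """
CREATE TABLE semantic_fields (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE meanings (
    lwt_code TEXT PRIMARY KEY,
    label TEXT NOT NULL,
    semantic_field_id INTEGER NOT NULL REFERENCES semantic_fields(id),
    semantic_category TEXT NOT NULL
);
CREATE TABLE languages (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE donor_languages (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE ages (
    id INTEGER PRIMARY KEY,
    label TEXT NOT NULL,
    start_year INTEGER,
    end_year INTEGER
);
CREATE TABLE words (
    id INTEGER PRIMARY KEY,
    lwt_code TEXT NOT NULL REFERENCES meanings(lwt_code),
    language_id INTEGER NOT NULL REFERENCES languages(id),
    word_form TEXT NOT NULL,
    borrowed_status TEXT,
    age_id INTEGER REFERENCES ages(id),
    donor_language_id INTEGER REFERENCES donor_languages(id)
);
CREATE TABLE missing_word_forms (
    lwt_code TEXT NOT NULL REFERENCES meanings(lwt_code),
    language_id INTEGER NOT NULL REFERENCES languages(id),
    reason TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# The example meaning record shown in Figure 1.
conn.execute("INSERT INTO semantic_fields VALUES (1, 'The physical world')")
conn.execute("INSERT INTO meanings VALUES ('1.21', 'land', 1, 'Noun')")

meaning_rows = conn.execute(
    "SELECT m.lwt_code, m.label, f.name, m.semantic_category "
    "FROM meanings AS m JOIN semantic_fields AS f ON f.id = m.semantic_field_id"
).fetchall()

table_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

Keeping the meaning information in its own table reflects the point made above: contributors filled in only the word-related records, while the meaning records were supplied and shared across all project languages.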

8. Tables showing numbers of loanwords

Each of the language chapters contains at least two tables with the most important quantitative information about the loanwords in the language. These two tables only take into account level-4 loanwords (“Clearly borrowed”) and level-3 loanwords (“Probably borrowed”) (see §5.3 for these levels). One table gives the breakdown of loanwords by donor language and semantic word class (semantic nouns, verbs, adjectives, adverbs, plus function words), and another table gives it by donor language and semantic field (the 24 fields of Table 2). When there was a large number of donor languages (with some donor languages contributing only very few loanwords), these were grouped into donor language groups, shown in the columns instead of donor languages.

The standard tables in the chapters show percentages rather than absolute numbers, in order to make the figures more comparable. It has to be borne in mind, however, that the absolute number of loanwords is quite different across semantic fields: 20% of loanwords translates to many more words in the field The body (159 meanings), for instance, than in the field Religion and belief (26 meanings).

Giving breakdowns of loanwords by donor language and semantic fields/word classes is not straightforward, because loanwords are not always uniquely associated with these kinds of information. While 94.9% of all loanwords are associated with a single donor language, a minority of 3.5% are associated with two or more donor languages (the remainder have no known donor language). When a loanword is associated with two donor languages, it is counted half (0.5) for each of the two languages for the purposes of the table, and when it is associated with three languages, it is counted one third (0.33) for each. Likewise, loanwords are not uniquely associated with semantic fields and word classes – these are properties of LWT meanings, not of words in the database. Thus, a word may correspond to two meanings that are in different semantic fields, like Japanese niku, which means ‘meat’ (Food and drink) or ‘flesh’ (The body). Or a word may correspond to two meanings that are in different semantic word classes, like Zinacantán Tzotzil buro, which means ‘donkey’ (Noun) or ‘stupid’ (Adjective). Again, in such cases the word counts only half in each category (and one third when it is in three different categories, and so on). Non-unique association with semantic fields is not common, with only 3.9% of all words belonging to two or more fields, and non-unique association with semantic word classes is even rarer (only 1.0% of all words have this property).

This method of counting makes the figures a little more abstract, but in this way we do not give undue weight to words that can be assigned to multiple semantic categories, and to words that cannot be uniquely associated with a single donor language. In assigning loanwords to donor languages for the percentage tables, we did not distinguish between certain donor languages and uncertain donor languages.
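The fractional counting described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the tool actually used for the chapter tables: the function names and the three placeholder words are hypothetical, and exact fractions are used here in place of the rounded values (0.5, 0.33) mentioned in the text. Only the niku example comes from the chapter.

```python
from collections import defaultdict
from fractions import Fraction


def fractional_counts(loanwords):
    """Give each loanword a total weight of 1, split evenly over its categories."""
    totals = defaultdict(Fraction)
    for categories in loanwords.values():
        unique = sorted(set(categories))
        if not unique:
            continue  # e.g. a loanword with no known donor language
        share = Fraction(1, len(unique))
        for category in unique:
            totals[category] += share
    return dict(totals)


def as_percentages(totals):
    """Convert weighted counts to percentages of the overall total."""
    grand_total = sum(totals.values())
    return {name: float(100 * value / grand_total) for name, value in totals.items()}


# 'niku' is the example from the text; the other three words are placeholders.
fields_by_word = {
    "niku": ["Food and drink", "The body"],
    "placeholder_1": ["The body"],
    "placeholder_2": ["Food and drink"],
    "placeholder_3": ["Food and drink"],
}

totals = fractional_counts(fields_by_word)
percentages = as_percentages(totals)
```

With this input, Food and drink receives 2.5 words and The body 1.5 words, so the weights still sum to the four words entered and no word is counted twice. The same function applies unchanged to donor languages or semantic word classes.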

9. History and future of the LWT project

The Loanword Typology project was conceived in 2003 and officially launched in 2004. Contributors on various languages were added until 2006. Between 2003 and 2007, several workshops took place at which the issues arising from this project were discussed and the contributors presented progress reports and preliminary results. In 2008–2009 the database and case studies were edited for publication.

Over these years, the original design of the project was changed and broadened considerably. Initially the amount of information requested from contributors was limited, and they were allowed to submit it in any format, including simple text files. As the project progressed, and especially following workshops, more kinds of information were added to the list, and a decision was taken to design a custom template in a commonly used database application. The template itself underwent several major revisions and many minor ones, resulting in the final format, which was rather complex but also user-friendly (§7). To the editors’ delight, contributors took all these changes (which often entailed much additional work on their part) in stride, and the vast majority of those who volunteered to contribute a database and a book chapter did complete them.

The publication of this book, while constituting an important milestone for the Loanword Typology project, is by no means its end. Scholars are encouraged to use the case studies in the book as a basis for general and comparative research on lexical borrowing. Moreover, with the launching of the online World Loanword Database (URL: http://wold.livingsources.org/), the project has been given a new life. The plan is to provide contributors with the possibility of updating their databases at regular intervals. It is also envisaged that databases on additional languages will be added to the World Loanword Database, which will gradually fill in unintended gaps in the language sample and give the database a stronger statistical foundation.

Appendix: The Loanword Typology meaning list

1. The physical world

1.1 world
1.21 land
1.212 soil
1.213 dust
1.214 mud
1.215 sand
1.22 mountain or hill
1.222 cliff or precipice
1.23 plain
1.24 valley
1.25 island
1.26 mainland
1.27 shore
1.28 cave
1.31 water
1.32 sea
1.322 calm
1.323 rough (2)
1.324 foam
1.329 ocean
1.33 lake
1.34 bay
1.341 lagoon
1.342 reef
1.343 cape
1.35 wave
1.352 tide
1.353 low tide
1.354 high tide
1.36 river or stream
1.362 whirlpool
1.37 spring or well
1.38 swamp
1.39 waterfall
1.41 woods or forest
1.411 savanna
1.43 wood
1.44 stone or rock
1.45 earthquake
1.51 sky
1.52 sun
1.53 moon
1.54 star
1.55 lightning
1.56 thunder
1.57 bolt of lightning
1.58 storm
1.59 rainbow
1.61 light
1.62 darkness
1.63 shade or shadow
1.64 dew
1.71 air
1.72 wind
1.73 cloud
1.74 fog
1.75 rain
1.76 snow
1.77 ice
1.771 arctic lights
1.775 to freeze
1.78 weather
1.81 fire
1.82 flame
1.83 smoke
1.831 steam
1.84 ash
1.841 embers
1.851 to burn (1)
1.852 to burn (2)
1.86 to light
1.861 to extinguish
1.87 match
1.88 firewood
1.89 charcoal

2. Kinship

2.1 person
2.21 man
2.22 woman
2.23 male (1)
2.24 female (1)
2.25 boy
2.251 young man
2.26 girl
2.261 young woman
2.27 child (1)
2.28 baby
2.31 husband
2.32 wife
2.33 to marry
2.34 wedding
2.341 divorce
2.35 father
2.36 mother
2.37 parents
2.38 married man
2.39 married woman
2.41 son
2.42 daughter
2.43 child (2)
2.44 brother
2.444 older brother
2.445 younger brother
2.45 sister
2.454 older sister
2.455 younger sister
2.456 sibling
2.4561 older sibling
2.4562 younger sibling
2.458 twins
2.46 grandfather
2.461 old man
2.47 grandmother

2.471 old woman
2.4711 grandparents
2.48 grandson
2.49 granddaughter
2.50 grandchild
2.51 uncle
2.511 mother’s brother
2.512 father’s brother
2.52 aunt
2.521 mother’s sister
2.522 father’s sister
2.53 nephew
2.54 niece
2.541 sibling’s child
2.55 cousin
2.56 ancestors
2.57 descendants
2.61 father-in-law (of a man)
2.611 father-in-law (of a woman)
2.62 mother-in-law (of a man)
2.621 mother-in-law (of a woman)
2.622 parents-in-law
2.63 son-in-law (of a man)
2.631 son-in-law (of a woman)
2.64 daughter-in-law (of a man)
2.641 daughter-in-law (of a woman)
2.6411 child-in-law
2.6412 sibling-in-law
2.71 stepfather
2.72 stepmother
2.73 stepson
2.74 stepdaughter
2.75 orphan
2.76 widow
2.77 widower
2.81 relatives
2.82 family
2.91 I
2.92 you (singular)
2.93 he/she/it
2.931 he
2.932 she
2.933 it
2.94 we
2.941 we (inclusive)
2.942 we (exclusive)
2.95 you (plural)
2.96 they

3. Animals

3.11 animal
3.12 male (2)
3.13 female (2)
3.15 livestock
3.16 pasture
3.18 herdsman
3.19 stable or stall
3.2 cattle
3.21 bull
3.22 ox
3.23 cow
3.24 calf
3.25 sheep
3.26 ram
3.28 ewe
3.29 lamb
3.32 boar
3.34 sow
3.35 pig
3.36 goat
3.37 he-goat
3.38 kid
3.41 horse
3.42 stallion
3.44 mare
3.45 foal or colt
3.46 donkey
3.47 mule
3.5 fowl
3.52 cock/rooster
3.54 hen
3.55 chicken
3.56 goose
3.57 duck
3.58 nest
3.581 bird
3.582 seagull
3.583 heron
3.584 eagle
3.585 hawk
3.586 vulture
3.591 bat
3.592 parrot
3.593 crow
3.594 dove
3.596 owl
3.597 cormorant
3.598 toucan
3.61 dog
3.614 rabbit
3.62 cat
3.622 opossum
3.63 mouse or rat
3.65 fish
3.652 fin
3.653 scale
3.654 gill
3.655 shell
3.661 shark
3.662 porpoise or dolphin
3.663 whale
3.664 stingray
3.665 freshwater eel
3.71 wolf
3.72 lion
3.73 bear

3.74 fox
3.75 deer
3.76 monkey
3.77 elephant
3.78 camel
3.81 insect
3.811 head louse
3.8112 body louse
3.812 nit
3.813 flea
3.814 centipede
3.815 scorpion
3.816 cockroach
3.817 ant
3.818 spider
3.819 spider web
3.82 bee
3.821 beeswax
3.822 beehive
3.823 wasp
3.83 fly
3.831 sandfly or midge or gnat
3.832 mosquito
3.833 prawns or shrimp
3.834 termites
3.835 tick
3.84 worm
3.85 snake
3.862 coyote
3.863 hare
3.865 quail
3.866 raccoon
3.869 squirrel
3.871 reindeer/caribou
3.872 elk/moose
3.873 beaver
3.88 kangaroo
3.89 anteater
3.90 jaguar
3.91 firefly
3.913 chameleon
3.917 buffalo
3.92 butterfly
3.93 grasshopper
3.94 snail
3.95 frog
3.96 lizard
3.97 crocodile or alligator
3.98 turtle
3.99 tapir

4. The body

4.11 body
4.12 skin or hide
4.13 flesh
4.14 hair
4.142 beard
4.144 body hair
4.145 pubic hair
4.146 dandruff
4.15 blood
4.151 vein or artery
4.16 bone
4.162 rib
4.17 horn
4.18 tail
4.19 back
4.191 spine
4.2 head
4.201 temples
4.202 skull
4.203 brain
4.204 face
4.205 forehead
4.207 jaw
4.208 cheek
4.209 chin
4.21 eye
4.212 eyebrow
4.213 eyelid
4.214 eyelash
4.215 to blink
4.22 ear
4.221 earlobe
4.222 earwax
4.23 nose
4.231 nostril
4.232 nasal mucus
4.24 mouth
4.241 beak
4.25 lip
4.26 tongue
4.27 tooth
4.271 gums
4.272 molar tooth
4.28 neck
4.281 nape of neck
4.29 throat
4.3 shoulder
4.301 shoulderblade
4.302 collarbone
4.31 arm
4.312 armpit
4.32 elbow
4.321 wrist
4.33 hand
4.331 palm of hand
4.34 finger
4.342 thumb
4.344 fingernail
4.345 claw
4.35 leg
4.351 thigh
4.352 calf of leg
4.36 knee
4.37 foot
4.371 ankle
4.372 heel
4.374 footprint
4.38 toe
4.392 wing

4.393 feather
4.4 chest
4.41 breast
4.412 nipple or teat
4.42 udder
4.43 navel
4.431 belly
4.44 heart
4.441 lung
4.45 liver
4.451 kidney
4.452 spleen
4.46 stomach
4.461 intestines or guts
4.462 waist
4.463 hip
4.464 buttocks
4.465 sinew or tendon
4.47 womb
4.49 testicles
4.492 penis
4.493 vagina
4.494 vulva
4.51 to breathe
4.52 to yawn
4.521 to hiccough
4.53 to cough
4.54 to sneeze
4.55 to perspire
4.56 to spit
4.57 to vomit
4.58 to bite
4.59 to lick
4.591 to dribble
4.61 to sleep
4.612 to snore
4.62 to dream
4.63 to wake up
4.64 to fart
4.65 to piss
4.66 to shit
4.67 to have sex
4.68 to shiver
4.69 to bathe
4.71 to beget
4.72 to be born
4.73 pregnant
4.732 to conceive
4.74 to be alive
4.741 life
4.75 to die
4.7501 dead
4.751 to drown
4.76 to kill
4.77 corpse
4.771 carcass
4.78 to bury
4.79 grave
4.81 strong
4.82 weak
4.83 healthy
4.84 sick/ill
4.841 fever
4.842 goitre/goiter
4.843 cold
4.844 disease
4.85 wound or sore
4.852 bruise
4.853 swelling
4.854 itch
4.8541 to scratch
4.855 blister
4.856 boil
4.857 pus
4.858 scar
4.86 to cure
4.87 physician
4.88 medicine
4.89 poison
4.91 tired
4.912 to rest
4.92 lazy
4.93 bald
4.94 lame
4.95 deaf
4.96 mute
4.97 blind
4.98 drunk
4.99 naked

5. Food and drink

5.11 to eat
5.12 food
5.121 cooked
5.122 raw
5.123 ripe
5.124 unripe
5.125 rotten
5.13 to drink
5.14 to be hungry
5.141 famine
5.15 to be thirsty
5.16 to suck
5.18 to chew
5.181 to swallow
5.19 to choke
5.21 to cook
5.22 to boil
5.23 to roast or fry
5.24 to bake
5.25 oven
5.26 pot
5.27 kettle
5.28 pan
5.31 dish
5.32 plate
5.33 bowl
5.34 jug/pitcher
5.35 cup
5.36 saucer
5.37 spoon

26

Martin Haspelmath and Uri Tadmor

5.38

knife (1)

5.87

to milk

6.48

trousers

5.39

fork

5.88

cheese

6.49

sock or stocking

5.391

tongs

5.89

butter

6.51

shoe

5.41

meal

5.9

drink

6.52

boot

5.42

breakfast

5.91

mead

6.54

shoemaker

5.43

lunch

5.92

wine

6.55

hat or cap

5.44

dinner

5.93

beer

6.57

belt

5.45

supper

5.94

fermented drink

6.58

glove

5.46

to peel

5.97

egg

6.59

veil

5.47

to sieve or to strain

5.971

yolk

6.61

pocket

5.48

to scrape

5.983

manioc bread

6.62

button

5.49

to stir or to mix

6.63

pin

5.51

bread

6.71

5.53

dough

6.11

to put on

ornament or adornment

5.54

to knead

6.12

clothing or clothes

6.72

jewel

5.55

flour

6.13

tailor

6.73

ring

5.56

to crush or to grind

6.21

cloth

6.74

bracelet

5.57

mill

6.22

wool

6.75

necklace

5.58

mortar (1)

6.23

linen

6.76

bead

pestle

6.24

cotton

6.77

earring

meat

6.25

silk

6.78

5.63

sausage

6.27

felt

headband or headdress

5.64

soup

6.28

fur

6.79

tattoo

5.65

vegetables

6.29

leather

6.81

handkerchief or rag

5.66

bean

6.31

to spin

6.82

towel

5.7

potato

6.32

spindle

6.91

comb

5.71

fruit

6.33

to weave

6.92

brush

5.712

bunch

6.34

loom

6.921

plait/braid

5.75

fig

6.35

to sew

6.93

razor

5.76

grape

6.36

needle (1)

6.94

ointment

5.77

nut

6.37

awl

6.95

soap

5.78

olive

6.38

thread

6.96

mirror

5.79

oil

6.39

to dye

6.98

snowshoe

5.791

grease or fat

6.41

cloak

5.81

salt

6.411

poncho

5.82

pepper

6.42

(woman’s) dress

5.821

chili pepper

6.43

coat

5.84

honey

6.44

shirt

5.85

sugar

6.45

collar

5.86

milk

6.46

skirt

6.461

grass-skirt

5.59 5.61

6. Clothing and grooming

7. The house

7.11

to live

7.12

house

7.13

hut

7.131

garden-house

7.14

tent

7.15

yard or court

7.16

men’s house

7.17

cookhouse

7.18

meeting house

7.67

to tan

8. Agriculture and vegetation

farmer

8.56

leaf

8.57

flower

8.6

tree

7.21

room

8.11

8.61

oak

7.22

door or gate

8.12

field

8.62

beech

7.221

doorpost

8.121

paddy

8.63

birch

7.23

lock

8.13

garden

8.64

pine

7.231

latch or door-bolt

8.15

to cultivate

8.65

fir

7.232

padlock

8.16

fence

8.66

acorn

7.24

key

8.17

ditch

8.67

vine

7.25

window

8.21

to plough/plow

8.68

tobacco

7.26

floor

8.212

furrow

8.69

to smoke

7.27

wall

8.22

to dig

8.691

pipe

7.31

fireplace

8.23

spade

8.72

tree stump

7.32

stove

8.24

shovel

8.73

tree trunk

7.33

chimney

8.25

hoe

8.74

forked branch

7.37

ladder

8.26

fork (2)/pitchfork

8.75

bark

7.42

bed

8.27

rake

8.76

sap

7.421

pillow

8.28

8.81

palm tree

7.422

blanket

digging stick (=yamstick) lasso

coconut

chair

8.29

8.82

7.43

table

to sow

citrus fruit

7.44

8.31

8.83

seed

banana

lamp or torch

8.311

8.84

7.45

candle

to mow

banyan

7.46

8.32

8.85

sickle or scythe

sweet potato

shelf

8.33

8.91

7.47

trough

to thresh

millet or sorghum

7.48

8.34

8.911

roof

threshing-floor

yam

7.51

8.35

8.912

harvest

cassava/manioc

thatch

8.41

8.92

7.52

ridgepole

grain

gourd

7.53

8.42

8.93

wheat

pumpkin or squash

rafter

8.43

8.931

7.54

beam

barley

bamboo

7.55

8.44

8.94

rye

sugar cane

post or pole

8.45

8.941

7.56

board

oats

fish poison

7.57

8.46

8.96

maize/corn

nettle

arch

8.47

8.97

7.58

mason

rice

mushroom

7.61

8.48

8.98

brick

grass

larch

7.62

8.51

8.991

hay

needle (2)

mortar (2)

8.52

8.993

7.63

adobe

plant

cone

7.64

8.53

8.996

7.65

camp

8.531

to plant

9. Basic actions and technology

7.66

hammock

8.54

root

9.11

to do

8.55

branch

9.111

to make

9.12

work

9.48

saw

10.14

to wrap

9.14

to bend

9.49

hammer

10.15

to roll

9.15

to fold

9.5

nail

10.16

to drop

9.16

to tie

9.56

glue

10.17

to twist

9.161

to untie

9.6

blacksmith

10.21

to rise

9.18

chain

9.61

to forge

10.22

to raise or lift

9.19

rope

9.62

anvil

10.23

to fall

9.192

knot

9.63

to cast

10.24

to drip

9.21

to strike or hit or beat

9.64

gold

10.25

to throw

9.211

to pound

9.65

silver

10.252

to catch

9.22

to cut

9.66

copper

10.26

to shake

9.221

to cut down

9.67

iron

10.32

to flow

9.222

to chop

9.68

lead

10.33

to sink

9.223

to stab

9.69

tin or tinplate

10.34

to float

9.23

knife (2)

9.71

potter

10.35

to swim

9.24

scissors or shears

9.72

to mould/mold

10.351

to dive

9.25

axe/ax

9.73

clay

10.352

to splash

9.251

adze

9.74

glass

10.36

to sail

9.26

to break

9.75

to weave or plait/braid

10.37

to fly

9.261

broken

9.76

basket

10.38

to blow

9.27

to split

9.77

mat

10.41

to crawl

9.28

to tear

9.771

rug

10.412

to kneel

9.29

to skin

9.78

netbag

10.413

to crouch

9.31

to rub

9.79

fan

10.42

to slide or slip

9.311

to wipe

9.791

to fan

10.43

to jump

9.32

to stretch

9.81

to carve

10.431

to kick

9.33

to pull

9.82

sculptor

10.44

to dance

9.34

to spread out

9.83

statue

10.45

to walk

9.341

to hang up

9.84

chisel

10.451

to limp

9.342

to press

9.87

boomerang

10.46

to run

9.343

to squeeze

9.88

paint

10.47

to go

9.35

to pour

9.89

to paint

10.471

to go up

9.36

to wash

9.90

to draw water

10.472

to climb

9.37

to sweep

9.91

peg

10.473

to go down

9.38

broom

9.92

tumpline

10.474

to go out

9.422

tool

9.93

whetstone

10.48

to come

9.43

carpenter

10.481

to come back

9.44

to build

10.49

to leave

9.46

to bore

10.11

to move

10.491

to disappear

to hollow out

10.12

to turn

10.51

to flee

10.13

to turn around

9.461

10. Motion

10.52

to follow

10.53

to pursue

10.55

to arrive

10.56

to approach

10.57

to enter

10.58

to go or return home

10.61

to carry

10.612

to carry in hand

10.613

to carry on shoulder

10.614

to carry on head

10.615

to carry under arm

10.62

to bring

10.63

to send

10.64

to lead

10.65

to drive

10.66

to ride

10.67

to push

10.71

road

10.72

path

10.74

bridge

10.75

cart or wagon

10.76

wheel

10.77

axle

10.78

yoke

10.79

sledge/sled

10.81

ship

10.83

boat

10.831

canoe

10.832

outrigger

10.84

raft

10.85

oar

10.851

paddle

10.852

to row

10.86

rudder

10.87

mast

10.88

sail

10.89

anchor

10.91

port

10.92

to land

11. Possession

11.11

to have

11.12

to own

11.13

to take

11.14

to grasp

11.15

to hold

11.16

to get

11.17

to keep

11.18

thing

11.21

to give

11.22

to give back

11.24

to preserve

11.25

to rescue

11.27

to destroy

11.28

to injure

11.29

to damage

11.31

to look for

11.32

to find

11.33

to lose

11.34

to let go

11.43

money

11.44

coin

11.51

rich

11.52

poor

11.53

beggar

11.54

stingy

11.61

to lend

11.62

to borrow

11.63

to owe

11.64

debt

11.65

to pay

11.66

bill

11.69

tax

11.77

to hire

11.78

wages

11.79

to earn

11.81

to buy

11.82

to sell

11.83

to trade or barter

11.84

merchant

11.85

market

11.86

shop/store

11.87

price

11.88

expensive

11.89

cheap

11.91

to share

11.92

to weigh

12. Spatial relations

12.01

after

12.011

behind

12.012

in

12.013

at

12.02

beside

12.03

down

12.04

before

12.041

in front of

12.05

inside

12.06

outside

12.07

under

12.08

up

12.081

above

12.11

place

12.12

to put

12.13

to sit

12.14

to lie down

12.15

to stand

12.16

to remain

12.17

remains

12.21

to gather

12.212

to pick up

12.213

to pile up

12.22

to join

12.23

to separate

12.232

to divide

12.24

to open

12.25

to shut

12.26

to cover

12.27

to hide

12.31

high

12.32

low

12.33

top

12.85

hole

13.38

twice/two times

12.34

bottom

12.92

similar

13.42

third

12.35

end (1)

12.93

to change

13.44

three times

12.352

pointed

12.353

edge

12.36

13. Quantity

14. Time

side

13

zero

14.11

time

12.37

middle

13.01

one

14.12

age

12.41

right (1)

13.02

two

14.13

new

12.42

left

13.03

three

14.14

young

12.43

near

13.04

four

14.15

old

far

13.05

five

14.16

early

12.45

east

13.06

six

14.17

late

12.46

west

13.07

seven

14.18

now

12.47

north

13.08

eight

14.19

immediately

12.48

south

13.09

nine

14.21

fast

12.53

to grow

13.1

ten

14.22

slow

12.54

to measure

13.101

eleven

14.23

to hurry

12.541

fathom

13.102

twelve

14.24

to be late

12.55

big

13.103

fifteen

14.25

to begin

small

13.104

twenty

14.251

beginning

12.57

long

13.105

a hundred

14.252

to last

12.58

tall

13.106

a thousand

14.26

end (2)

12.59

short

13.107

to count

14.27

to finish

12.61

wide

13.14

all

14.28

to cease

12.62

narrow

13.15

many

14.29

ready

12.63

thick

13.16

more

14.31

always

12.65

thin

13.17

few

14.32

often

12.67

deep

13.18

enough

14.33

sometimes

12.68

shallow

13.181

some

14.331

soon

12.71

flat

13.19

crowd

14.332

for a long time

straight

13.21

full

14.34

never

12.74

crooked

13.22

empty

14.35

again

12.75

hook

13.23

part

14.41

day (1)

12.76

corner

13.231

piece

14.411

day (2)

12.77

cross

13.24

half

14.42

night

12.78

square

13.33

only

14.43

dawn

12.81

round

13.331

alone

14.44

morning

12.82

circle

13.34

first

14.45

midday

12.83

ball

13.35

last

14.451

afternoon

line

13.36

second

14.46

evening

13.37

pair

14.47

today

12.44

12.56

12.73

12.84

14.48

tomorrow

15.55

to show

16.26

to play

14.481

day after tomorrow

15.56

to shine

16.27

to love

14.49

yesterday

15.57

bright

16.29

to kiss

14.491

day before yesterday

15.61

colour/color

16.3

to embrace

14.51

hour

15.62

light (2)

16.31

pain

14.53

clock

15.63

dark

16.32

grief

14.61

week

15.64

white

16.33

anxiety

14.62

Sunday

15.65

black

16.34

to regret or be sorry

14.63

Monday

15.66

red

16.35

pity

14.64

Tuesday

15.67

blue

16.37

to cry

14.65

Wednesday

15.68

green

16.38

tear

14.66

Thursday

15.69

yellow

16.39

to groan

14.67

Friday

15.71

to touch

16.41

to hate

14.68

Saturday

15.712

to pinch

16.42

anger

14.71

month

15.72

to feel

16.44

envy or jealousy

14.73

year

15.74

hard

16.45

shame

14.74

winter

15.75

soft

16.48

proud

14.75

spring (2)

15.76

rough (1)

16.51

to dare

14.76

summer

15.77

smooth

16.52

brave

14.77

autumn/fall

15.78

sharp

16.53

fear

14.78

season

15.79

blunt

16.54

danger

15.81

heavy

16.62

to want

15.82

light (1)

16.622

to choose

15. Sense perception

15.21

to smell (1)

15.83

wet

16.63

to hope

15.212

to sniff

15.84

dry

16.65

faithful

15.22

to smell (2)

15.85

hot

16.66

true

15.25

fragrant

15.851

warm

16.67

to lie (2)

15.26

stinking

15.86

cold

16.68

deceit

15.31

to taste

15.87

clean

16.69

to forgive

15.35

sweet

15.88

dirty

16.71

good

15.36

salty

15.89

wrinkled

16.72

bad

15.37

bitter

right (2)

15.38

sour

16. Emotions and values

16.73 16.74

wrong

15.39

brackish

16.11

soul or spirit

16.76

fault

15.41

to hear

16.15

16.77

mistake

15.42

to listen

surprised or astonished good luck

blame

sound or noise

16.18

16.78

15.44

loud

bad luck

praise

15.45

16.19

16.79

happy

beautiful

quiet

16.23

16.81

15.46

to see

to laugh

ugly

15.51

16.25

16.82

to smile

greedy

to look

16.251

16.83

15.52

16.84

clever

17. Cognition

17.53

if

18.41

to call (1)

17.54

or

18.42

to call (2)

17.55

yes

18.43

to announce

17.11

mind

17.56

no

18.44

to threaten

17.13

to think (1)

17.61

how?

18.45

to boast

17.14

to think (2)

17.62

how many?

18.51

to write

17.15

to believe

17.63

how much?

18.52

to read

17.16

to understand

17.64

what?

18.56

paper

17.17

to know

17.65

when?

18.57

pen

17.171

to guess

17.66

where?

18.61

book

17.172

to imitate

17.67

which?

18.67

poet

17.18

to seem

17.68

who?

18.71

flute

17.19

idea

17.69

why?

18.72

drum

17.21

wise

horn or trumpet

17.22

stupid

18. Speech and language

18.73 18.74

rattle

17.23

mad

18.11

voice

17.24

to learn

18.12

to sing

17.242

to study

18.13

to shout

19. Social and political relations

17.25

to teach

18.15

to whisper

19.11

country

17.26

pupil

18.16

to mumble

19.12

native country

17.27

teacher

18.17

to whistle

19.15

town

17.28

school

18.18

to shriek

19.16

village

17.31

to remember

18.19

to howl

19.17

boundary

17.32

to forget

18.21

to speak or talk

19.21

people

17.34

clear

18.211

to stutter or stammer

19.23

clan chieftain

17.35

obscure

18.22

to say

19.24

17.36

secret

18.221

to tell

19.25

walking stick to rule or govern

17.37

certain

18.222

speech

19.31

17.38

to explain

18.23

to be silent

19.32

king queen

17.41

intention

18.24

language

19.33

17.42

cause

18.26

word

19.36

noble

17.43

doubt

18.28

name

19.37

citizen

17.44

to suspect

18.31

to ask (1)

19.41

master

17.441

to betray

18.32

to answer

19.42

slave

17.45

need or necessity

18.33

to admit

19.43

servant

17.46

easy

18.34

to deny

19.44

freeman to liberate

17.47

difficult

18.35

to ask (2)

19.445

17.48

to try

18.36

to promise

19.45

to command or order to obey

17.49

manner

18.37

to refuse

19.46

17.51

and

18.38

to forbid

19.47

to permit

17.52

because

18.39

to scold

19.51

friend

19.52

enemy

20.471

guard

21.51

to steal

19.54

neighbour

20.48

booty

21.52

thief

19.55

stranger

20.49

ambush

19.56

guest

20.51

fisherman

19.565

to invite

20.52

fishhook

22.11

religion

19.57

host

20.53

fishing line

22.12

god temple

22. Religion and belief

19.58

to help

20.54

fishnet

22.13

19.59

to prevent

20.55

fish trap

22.131

church mosque

19.61

custom

20.56

bait

22.132

19.62

quarrel

20.61

to hunt

22.14

altar

to shoot

22.15

sacrifice to worship

19.63

plot

20.62

19.65

to meet

20.63

to miss

22.16

19.72

prostitute

20.64

trap

22.17

to pray

to trap

22.18

priest

22.19

holy

22.22

to preach

20. Warfare and hunting

20.65

20.11

to fight

21. Law

20.13

war or battle

21.11

law

22.23

to bless

20.14

peace

21.15

court

22.24

to curse

20.15

army

21.16

to adjudicate

22.26

to fast

20.17

soldier

21.17

judgment

22.31

heaven

20.21

weapons

21.18

judge

22.32

hell

20.22

club

21.21

plaintiff

22.35

demon

20.222

battle-axe

21.22

defendant

22.37

idol

20.23

sling

21.23

witness

22.42

magic

20.24

bow

21.24

to swear

22.43

sorcerer or witch

20.25

arrow

21.25

oath

22.44

fairy or elf

20.26

spear

21.31

to accuse

22.45

ghost

20.27

sword

21.32

to condemn

22.47

omen

20.28

gun

21.33

to convict

22.50

circumcision

20.31

armour

21.34

to acquit

22.51

initiation ceremony

20.33

helmet

21.35

guilty

20.34

shield

21.36

innocent

20.35

fortress

21.37

23.1

radio

20.36

tower

penalty or punishment

23.11

television

21.38

fine

23.12

telephone bicycle

20.41

victory

23. The Modern world

20.42

defeat

21.39

prison

23.13

20.43

attack

21.42

murder

23.135

motorcycle

20.44

to defend

21.43

adultery

23.14

car

20.45

to retreat

21.44

rape

23.15

bus

20.46

to surrender

21.46

arson

23.155

train

20.47

captive or prisoner

21.47

perjury

23.16

airplane

23.17

electricity

23.395

street

23.63

music

23.175

battery

23.4

post/mail

23.64

song

23.18

to brake

23.41

postage stamp

23.9

tea

23.185

motor

23.42

letter

23.91

coffee

23.19

machine

23.43

postcard

23.195

petroleum

23.44

bank

23.2

hospital

23.5

tap/faucet

23.21

nurse

23.51

sink

24.01

to be

23.22

pill or tablet

23.52

toilet

24.02

to become without

24. Miscellaneous function words

23.23

injection

23.53

mattress

24.03

23.24

spectacles/glasses

23.54

tin/can

24.04

with through

23.3

government

23.55

screw

24.05

23.31

president

23.555

screwdriver

24.06

not this

23.32

minister

23.56

bottle

24.07

23.33

police

23.565

candy/sweets

24.08

that

plastic

24.09

here there

23.34

driver’s license

23.57

23.35

license plate

23.575

bomb

24.10

23.36

birth certificate

23.58

workshop

24.11

other next

23.37

crime

23.59

cigarette

24.12

23.38

election

23.6

newspaper

24.13

same

24.14

nothing

23.385

address

23.61

calendar

23.39

number

23.62

film/movie

Chapter II

Lexical borrowing: Concepts and issues

Martin Haspelmath

1. Lexical borrowing as a topic for general linguistics

There is a large amount of previous research on loanwords in individual languages, but the Loanword Typology project is the first research project that attempts to shed light on lexical borrowing in general by adopting a typological approach.¹ This chapter defines and discusses some of the basic notions required for such an endeavor, and raises some of the most important issues.

A broadly comparative (and ideally world-wide) perspective is essential if we want to go beyond the descriptive goal of identifying particular loanwords and their histories, towards the goal of explaining (at least partially) why certain words but not other words have been borrowed from one language into another language. To be sure, there are many simple cases of culturally motivated borrowing where a cultural importation is accompanied by a lexical importation in a straightforward way, e.g. Quechua borrowing plata ‘money’ from Spanish, or English borrowing kosher from Yiddish. But even in such seemingly unproblematic cases, there is always the question why a borrowing had to take place at all, because all languages have the means to create novel expressions out of their own resources. Instead of borrowing a word, they could simply make up a new word. And of course there are many other cases where it is not at all clear why a language borrowed a word from another language, because a fully equivalent word existed beforehand. Thus, French had no need to borrow blanc ‘white’ from Franconian (because Latin had albus ‘white’), and English had no need to borrow window from Old Norse (because Old English had an equivalent word eagþyrel). Thus, explaining observed loanwords and assessing the likelihood of borrowing particular words is not straightforward. Two main types of factors have been held responsible:

• social and attitudinal factors (prestige of the donor language, puristic attitudes)
• grammatical factors (e.g. the claim that verbs are more difficult to borrow than nouns because they need more grammatical adaptation than nouns).

When we set out on this project, there were many suggestions in the literature (some of which are reviewed in subsequent sections of this introduction), but very little systematic evidence for them. The best-known generalization about lexical borrowing is the constraint that “core vocabulary” is very rarely (or never) borrowed. This has found its way into many textbooks (e.g. Hock & Joseph 1996: 257, Thomason 2001: 71–72), but a definition of what constitutes this hard-to-borrow “core” or “basic” vocabulary is rarely given. In practical terms, linguists often work with Swadesh’s (1955) list of non-cultural vocabulary, which was intended by its author as his best guess as to which words are resistant to borrowing. But this list was drawn up by Swadesh on the basis of his personal anecdotal knowledge and intuition, not on the basis of systematic cross-linguistic research. The Loanword Typology project represents some of the research that would have been a prerequisite for Swadesh’s word-list-based historical-comparative linguistics.

More generally, better knowledge of lexical borrowability will be important for further progress in historical-comparative linguistics (cf. Haspelmath 2008). Especially in less well-researched languages and language families, and at older stages of history, it is often unclear whether a word is a loanword or a native word that is cognate with its putative source. Often two languages or families show striking lexical similarities that unambiguously prove a historical relationship, but whether these lexical similarities are due to common inheritance or to borrowing is a matter of dispute. In such disputes, more systematic knowledge of the general patterns of loanword distribution will hopefully be helpful in the future, and the results presented in Tadmor’s chapter constitute a beginning.

¹ Earlier general work such as Møller (1933) and Deroy (1956) is limited to European languages, and almost all theoretically oriented corpus-based work (e.g. Poplack & Sankoff 1984, Poplack et al. 1988, van Hout & Muysken 1994) is limited to a single language.

2. Defining “loanword”

Loanword (or lexical borrowing) is here defined as a word that at some point in the history of a language entered its lexicon as a result of borrowing (or transfer, or copying). Fortunately, this definition is uncontroversial, but there are a number of things to note.

First, the term borrowing has been used in two different senses: (i) As a general term for all kinds of transfer or copying processes, whether they are due to native speakers adopting elements from other languages into the recipient language, or whether they result from non-native speakers imposing properties of their native language onto a recipient language. This general sense seems to be by far the most prevalent use of the term borrowing. But borrowing has also been used in a more restricted sense, (ii) “to refer to the incorporation of foreign elements into the speakers’ native language” (Thomason & Kaufman 1988: 21), i.e. as a synonym of adoption (Thomason & Kaufman use the term substratum interference for ‘imposition’, and interference as a cover term for ‘borrowing/adoption’ and ‘substratum interference/imposition’). In this work, we use borrowing in the more common, broad sense, and the two types of borrowing, depending on whether the borrowers are native speakers or non-native speakers, are called adoption and imposition (or, equivalently, retention) (following Van Coetsem 1988, Winford 2006). Apart from the fact that this terminology is more in conformity with traditional usage, it is symmetrical, and it gives us additional verbs (adopt, impose, retain) that can be used in a precise, technical sense.

Of course, the term borrowing is based on a strange metaphor (after all, the donor language does not expect to receive its words back), so a term like transfer or transference (e.g. Clyne 2004) would be preferable. Even better is Johanson’s (2002) term copying, because the transfer metaphor still suggests that the donor language loses the element in question. However, since borrowing is so well-established in linguistics, going back at least to the 18th century, and since the metaphor does not lead to any misunderstandings, we will continue to use it here (alongside its synonyms transfer(ence) and copying).

The language from which a loanword has been borrowed will be called the donor language, and the language into which it has been borrowed is the recipient language. (Alternative term pairs that one sometimes finds in the literature are source language/borrowing language, and model language/replica language.) The word that served as a model for the loanword will be called the source word.²

Loanwords are always words (i.e. lexemes) in the narrow sense, not lexical phrases, and they are normally unanalyzable units in the recipient language. The corresponding source word in the donor language, by contrast, may be complex or even phrasal, but this internal structure is lost when the word enters the recipient language.³ For example, Russian has the loanword buterbrod ‘sandwich’, borrowed from German Butter-brot [butter-bread]. This is a transparent compound in German, but since Russian has no other words with the elements buter or brod, the Russian word is monomorphemic and not analyzable by native speakers. However, when a language borrows multiple complex words from another language, the elements may recur with a similar meaning, so that the morphological structure may be reconstituted. This is the case with the numerous Japanese loans based on Chinese compounds. For example, Japanese borrowed kokumin 国民 ‘citizen’ from Chinese guó-mín [country-people] 国民 (cf. Schmidt, Japanese subdatabase), but it also borrowed other words with the element kok(u) ‘country’ (e.g. kok-ka 国家 ‘nation’, koku-ō 国王 ‘king’) and other words with the element min ‘people’ (e.g. minshū 民衆 ‘population’, jūmin 住民 ‘inhabitant’). As a result of these multiple borrowings, many of the original Chinese compounds are again transparent in Japanese, and can be regarded as analyzable. Similarly, in English neoclassical compounds (like ethnography, ethnocracy, ethnology, gerontology, gerontocracy, crystallography) are often transparent, and the pattern is productive even among speakers who do not know Greek and Latin.

Loanwords are opposed to native words, i.e. words “which we can take back to the earliest known stages of a language” (Lehmann 1962: 212). But given our definition of loanword above, we can never exclude that a word is a loanword, i.e. that it has been borrowed at some stage in the history of the language. Thus, the status of native words is always relative to what we know about the history of a language. English dish goes back to Old English and has cognates in other Germanic languages (e.g. German Tisch ‘table’), so in this sense it could be regarded as a native word (contrasting with disk, which was borrowed from Latin discus in the 17th century). But we know more about the history of English than the attested forms in Old English: Proto-West Germanic *disk has itself clearly been borrowed from Latin discus, so that English dish must count as a loanword after all. Even for words that have been reconstructed for a very ancient proto-language, such as English mother (from Proto-Indo-European *mātēr) or ten (from *deḱm̥), we cannot be sure that they were not borrowed from another language at some earlier stage. Thus, we can identify loanwords, but we cannot identify “non-loanwords” in an absolute sense. A “non-loanword” is simply a word for which we have no knowledge that it was borrowed.⁴

Note, finally, that the term borrowing refers to a completed language change, a diachronic process that once started as an individual innovation but has been propagated throughout the speech community (the innovation/propagation contrast will be discussed further in §4). The nominalization borrowing can also metonymically refer to a borrowed element (a borrowing, or a loan ‘a borrowed element’).⁵

² Notice that the verb to borrow can take either the source word or the loanword as its object: We can say “Portuguese borrowed the Chinese word chai ‘tea’ (as chá)”, or we can say “Portuguese borrowed chá ‘tea’ from Chinese (chai)”. The context will make clear which is intended. (Likewise, expressions such as Portuguese loanword are ambiguous, referring either to loanwords borrowed from Portuguese, or to Portuguese words which are loanwords.)

³ Conversely, when a word is analyzable within the recipient language, it can normally not be a loanword, because it was created within the recipient language (even if its members are loanwords: the English compound train station is not a loanword, although it consists of two borrowed roots). The Japanese compounds mentioned in the main text are exceptions to this generalization.

⁴ Technically, native word is equivalent to “non-loanword”, but there is a tendency among historical linguists to restrict the term to words which have cognates in related languages and which can be reconstructed to some proto-language. A word such as English bad, which did not exist in Old English and which has no known cognates in other Germanic languages, would not normally be called a “native word”. In the world-wide perspective of our work, where we deal with many languages about whose prehistory little is known, the term native word is not very useful.

⁵ The English terminology would be more systematic if we said borrowing-word instead of loan-word, or to loan instead of to borrow. Apparently loanword was calqued from German Lehnwort, while to borrow was used much earlier.

3. Loanwords in a taxonomy of borrowings

Although in this work we are primarily concerned with loanwords, it will be useful to consider briefly a range of other borrowing phenomena that are more or less closely related to loanwords. A basic distinction that must be made is that between material borrowing and structural borrowing (or matter borrowing and pattern borrowing, Matras & Sakel 2007). Material borrowing refers to borrowing of sound-meaning pairs (generally lexemes, or more precisely lexeme stems, but sometimes just affixes, and occasionally perhaps entire phrases), while structural borrowing refers to the copying of syntactic, morphological or semantic patterns (e.g. word order patterns, case-marking patterns, semantic patterns such as kinship term systems). Loanwords are the most important type of material borrowing, and loan translations (or calques) are an important type of structural borrowing.

A calque (or loan translation) is a complex lexical unit (either a single word or a fixed phrasal expression) that was created by an item-by-item translation of the (complex) source unit. The most frequently cited examples of calques are compounds, such as German herunter-laden (calqued from English down-load), French presqu’île (calqued from Latin paen-insula ‘almost-island’), or English loan-word (calqued from German Lehn-wort). But calques may also be morphological derivatives, such as Czech diva-dlo ‘theatre’ (calqued from Greek thea-tron (look-PLACE)), or Italian marcat-ezza (calqued from English marked-ness). And calques may be fixed phrasal expressions, such as English marriage of convenience (calqued from French mariage de convenance).

Another important type of structural borrowing is loan meaning extension, an extremely common (and often unnoticed) process whereby a polysemy pattern of a donor language word is copied into the recipient language. For example, the English word head is used in a technical sense to refer to the main word in a syntactic phrase, and following this usage, the German word Kopf ‘head’ is now also used in this grammatical sense.⁶ Since such cases reproduce a semantic pattern, they also fall under structural borrowing. (Loan translations and loan meaning extensions are sometimes grouped together as loanshifts, i.e. lexical innovations created by purely structural borrowing; Haugen 1950: 219.)

Loanblends are hybrid borrowings which consist of partly borrowed material and partly native material (the structural properties are also borrowed). An example given by Haugen (1950: 219) is Pennsylvania German bockabuch ‘pocketbook’, where bocka- is a material borrowing (from English pocket) that is restricted to this word, and -buch is a native German element rendering -book. Loanblends are not widely attested. Most hybrid-looking or foreign-looking expressions are in fact not borrowings at all, but loan-based creations, i.e. words created in a language with material that was previously borrowed (e.g. English desk lamp, a compound made up of two words that were ultimately borrowed from Greek). Such words are related to loanwords etymologically, but they cannot count as loanwords.

Finally, some authors also include loan creations among borrowings, i.e. formations that were inspired by a foreign concept but whose structure is not patterned on its expression in any way. For example, the German word Umwelt (Um-welt [around-world]) was coined to render French milieu (mi-lieu [mid-place]) ‘environment’. According to Haugen (1950: 220), such words “may ultimately be due to contact with a second culture and its language, but…are not strictly loans at all” (see also Höfler 1981). However, if the meaning of the loan creation is an exact copy of the meaning of the model word, then we are dealing with clear cases of pure semantic borrowing here.

⁶ Haugen (1950: 214) uses the term semantic loan for a rather different concept: He cites the example of Portuguese humoroso ‘capricious’, which acquired the meaning ‘humorous’ in American Portuguese under the influence of English. A similar example is Old English dwellan ‘lead astray’, which changed its meaning to ‘dwell’ under the influence of Old Norse dvelja ‘abide’ (Lehmann 1962: 213). In these examples, the two words are recognizably cognate, and this must have facilitated the semantic change. Thus, they are really closer to material borrowing than to structural borrowing.

4. Loanwords and code-switching

Bilingual speakers often alternate between the two languages in the same discourse, sometimes even within the same sentence or the same word. This phenomenon is called code-switching. Although there are some grammatical restrictions on code-switching (Myers-Scotton 1993, Muysken 2000), the alternation between the two languages is not conventionalized in code-switching. Code-switching does not mean that there is a mixed code, but speakers produce mixed utterances including elements from both codes. Thus, code-switching is not a kind of contact-induced language change, but rather a kind of contact-induced speech behavior. In this way, code-switching differs sharply from borrowing. However, when an utterance consists of just a single word from one language and all other words are from the other language, it may be difficult to decide whether this word is a loanword or a single-word switch. Consider the example in (1).

(1) Moroccan Arabic (with Dutch) (Boumans & Caubet 2000: 116)
    ye-&'i-w   n-nas       l-uitkering  dyal-h(m
    3-give-PL  DEF-people  DEF-benefit  of-3PL
    ‘They’ll give the people their (social security) benefit.’

(2) Australian German (with English) (Myers-Scotton 1993)
    Wir müssen sie report-en zur Polizei.
    ‘We must report them to the police.’

Are uitkering in (1) and reporten in (2) single-word switches or loanwords? At an abstract level, the answer is clear: If reporten is part of the mental lexicon of the Australian German of the speaker, it is a loanword, otherwise it is a single-word switch. But since we are unable to look directly into the speaker’s mental lexicon, other criteria have to be used in practice. From the point of view of an entire language (not that of a single speaker), a loanword is a word that can conventionally be used as part of the language. In particular, it can be used in situations where no code-switching occurs, e.g. in the speech of monolinguals. This is the simplest and most reliable criterion for distinguishing loanwords from single-word switches. But it is often the case that the whole speech community is bilingual, so that code-switching may always occur. In such circumstances, the frequency criterion is useful: If particular concepts are very frequently or regularly expressed by a word originating in another language, while other concepts show a lot of variability, then the first group can be considered loanwords, while the second group are switches (cf. Myers-Scotton 1993: 191–204).⁷

In addition, loanwords typically show various kinds of phonological and morphological adaptation (cf. §5), whereas code-switching by definition does not show any kind of adaptation. Some authors have regarded this as the most important distinguishing feature of borrowings, but it is clear that it does not coincide perfectly with the criterion of conventionalization. In particular, non-conventionalized words taken from another language may be morphologically integrated, and code-switches are often pronounced with a foreign accent, if the speaker speaks one of the two languages non-natively. Such code-switches can hardly be distinguished from phonologically integrated loanwords.

For such phonologically and syntactically adapted non-conventional words, the term nonce borrowing is often used, contrasting with established borrowing, i.e. a regular, conventionalized loanword (e.g. Sankoff et al. 1990).⁸ However, this terminology is confusing: Above (in §2) we defined borrowing as a completed process of language change, and a loanword/lexical borrowing as a particular type of such a change. On this definition of borrowing, borrowings are “established” by definition. Code-switching, by contrast, is defined as the use of an element from another language in speech “for the nonce”, so “nonce-borrowings” should be called code-switches.⁹ Of course, all loanwords start out as innovations in speech, like other cases of language change, and the process of propagation of the novel word through the speech community is gradual (cf. Croft 2000 on the distinction between innovation and propagation).
It is also conceivable and indeed likely that the process of a word entering the mental lexicon of a speaker is gradual. Thus, there are bound to be intermediate cases between loanwords and single-word code-switches. These could be called “incipient loanwords”, “regular switches”, or similar, but they should not be called “nonce borrowings”, because this term is contradictory.¹⁰ According to Myers-Scotton (1993: ch. 6), many loanwords start out as singly occurring switches that gradually get conventionalized. This is an intriguing suggestion, but so far there is not much evidence for it. In any event, the occurrence of code-switching is by no means universal in bilingual situations, and lexical borrowing is not in any way dependent on code-switching.

⁷ In the chapters of this volume, the authors were given the following instruction for distinguishing between loanwords and code-switching: “Only established, conventionalized loanwords that are felt to be part of the language should be given, not nonce borrowings. This distinction is often hard to make (especially when there are no monolingual speakers), but authors should try as best they can.”

⁸ Grosjean (1983: 308ff.) makes a similar distinction, using the terms speech borrowing and language borrowing.

⁹ See Myers-Scotton (1993: 181–182) for further arguments against the notion of “nonce borrowing”.

¹⁰ Of course, one could decide to define the term borrowing more broadly, to encompass also interference in speech. For instance, Haugen (1950: 212) adopts a broad definition: “the attempted reproduction in one language of patterns previously found in another”. However, given this definition of borrowing, all instances of code-switching fall under “nonce-borrowing”. Nobody would nowadays propose such a definition.

5. Adaptation and integration of loanwords

The source words of loanwords often have phonological, orthographic, morphological and syntactic properties in the donor language that do not fit into the system of the recipient language. For example, Russian lacks a front rounded vowel, so that French words like résumé [rezyme] ‘summary’ are problematic; and French words are either masculine or feminine, so that English inanimate genderless nouns are problematic. In such situations of lack of fit (which are the rule rather than the exception), loanwords often undergo changes to make them fit better into the recipient language. These changes are generally called loanword adaptation (or loanword integration; but see below for a possible distinction between adaptation and integration).¹¹ For example, French [y] becomes [u] (with palatalization of the preceding consonant) in Russian, i.e. résumé > Russian rezjume; and the English word weekend is assigned the default masculine gender in French (le weekend).

Loanword adaptation is sometimes indispensable for the word to be usable in the recipient language. In particular, languages with gender and inflection classes need to assign each word to a gender and inflection class, so that it can occur in syntactic patterns which require gender agreement or certain inflected forms. Similarly, loanwords from Arabic have to be adapted orthographically in English because otherwise they would not be readable. However, in many cases the degree of adaptation varies, depending on the age of a loanword, knowledge of the donor language by recipient language speakers, and their attitude toward the donor language. If the donor language is well-known and/or the loanword is recent, recipient-language speakers may choose not to adapt the word in pronunciation, and they may borrow certain inflected forms from the donor language.
In this way, English borrowed plural forms of words from Greek and Latin (phenomenon/phenomena, fungus/fungi, crisis/crises), and German even borrowed a few case forms (e.g. the genitive in das Leben Jesu ‘the life of Jesus’). And orthographic adaptation is not necessary to the extent that readers are familiar with the donor language’s writing system (thus, in Japanese and Russian, English words are not always orthographically adapted, because readers can be expected to be familiar with the Latin script). Complete adaptation of non-fitting loanwords may take a very long time, and frequently at least a linguist who is familiar with the language’s usual phonotactic patterns will recognize a word as a loanword simply by its unusual shape (see also §6).

¹¹ Other equivalent terms are accommodation, assimilation and nativization.

Loanwords that are not adapted to the recipient language’s system are typically recognizable as loanwords, and they are sometimes called foreignisms (German traditionally makes a distinction between Fremdwörter ‘foreignisms’ and Lehnwörter ‘adapted/integrated/established loanwords’; von Polenz 1967, Krier 1980). However, recognition of a word as a borrowing by speakers is a complex matter that depends on many different factors, and adaptation is only one of them. Another is mere novelty: If a word entered the language just recently, many older speakers will remember an earlier stage of the language and will thus be aware of the word’s young age. Innovating speakers may face criticism by older speakers for using a loanword, and this contributes to the general awareness of the degree to which a word is an accepted and established part of the language. The dimension along which Fremdwörter and Lehnwörter differ is thus not identical to the degree of adaptation, and we may choose the term degree of integration for it, to keep the two dimensions separate. (However, in practice linguists do not distinguish adaptation and integration systematically along these lines, and the authors of this book generally use integration for ‘adaptation’.) The notion of foreignism is evidently close to that of a single-word switch discussed in the previous section. We might say that single-word switches are even less integrated than foreignisms, to the point of not being (clear) members of the language’s lexicon. Integration would thus be the degree to which a word is felt to be a full member of the recipient language system.

If a large number of loanwords come from a single donor language, then there is less need for adaptation, and instead the donor language patterns will be imported along with the words. Thus, Japanese borrowed many Chinese words that ended up with long vowels and diphthongs, so that now these phonological patterns are integral parts of the Japanese sound system.
However, Sino-Japanese words still form a separate stratum in contemporary Japanese, with grammatical behavior that differs from native Japanese words, and speakers are aware of the distinction (cf. Schmidt, this volume). Similarly, German borrowed the plural suffix -s along with words from Low German and English, and now this suffix has become an integral part of the language which is also extended to non-loanwords. The precise ways in which the adaptation process happens are often complex and a matter of ongoing debate. In phonological adaptation, the respective roles of phonetic constraints and phonological patterns are contentious (e.g. Peperkamp 2005, Yip 2006). In gender assignment to loanwords, a multitude of factors seem to play a role (e.g. Stolz 2009). The role of morphological adaptation in verb borrowing is explored by Wohlgemuth (2009: ch. 5–7). In this volume, loanword adaptation is not the focus of the authors’ interests, but most of the language chapters contain a section on adaptation (generally called “Integration of loanwords”).

6. Recognizing loanwords

Linguists identify words as loanwords if they have a shape and meaning that is very similar to the shape and meaning of a word from another language from which it could have been taken (because a plausible language contact scenario exists), and if the similarities have no plausible alternative explanation. Most importantly, of course, we need to exclude the possibility of descent from a common ancestor, which is a very common reason for word similarities across languages. The Hebrew word for ‘head’ (roš) and the Arabic word for ‘head’ (ra’s) are similar, but not because either language borrowed its word for ‘head’ from the other, but because both inherited it from a common ancestor (Proto-Semitic). Thus, if two languages that cannot be shown to go back to a common ancestor share a word, it is plausible to assume that it is a loanword.¹²

In general, a word can only be recognized with certainty as a loanword if both a plausible source word and a donor language can be identified. In the World Loanword Database, the vast majority of loanwords are associated with a source word (sometimes with several possible source words, because there are a number of languages with similar words that could have been the source). However, in some cases, we can be fairly confident that a word is a loanword even though we have not found a source word. This is the case, in particular, if the word is phonologically aberrant in a way that would be explicable by a borrowing history of the word. For example, Thurgood (1999: 11) notes that many loanwords from Mon-Khmer languages into Chamic languages (of the Austronesian family) can be recognized by their loan phonemes, sounds which occur only in borrowed words (e.g. implosives; thus, Chamic *ia+ ‘little’ seems to have a Mon-Khmer origin, Thurgood 1999: 313). If a word simply has no etymology within its family, this is a less good reason for assuming a borrowing history, but often such inferences have been made.
Thus, Vennemann (1984) observes that about a third of the Germanic words have no Indo-European cognates, and he assumes (following many others) that they were borrowed from another (unknown¹³) language. For any individual word, one might object that it could be an inherited word that happened to be lost in other branches of the family, but for the many dozens of Germanic words with no cognates, this is implausible, so the reasoning seems sound.

However, once we have found a pair of similar words in two languages that are not genealogically related and we are certain that borrowing must be involved, it is often still unclear what the borrowing direction was. For example, Sanskrit has a word kīlāla- (referring to some kind of cheese), which has no Indo-European etymology and to judge by its phonological shape seems to be a loanword (see Burrow 1946: 2–3). It could be from Burushaski kīlāy, but since Burushaski has borrowed heavily from Indic languages, the borrowing direction may well have been the opposite. In this case, we simply do not know whether the Burushaski word or the Sanskrit word was the source of the borrowing.

¹² Minor alternative reasons for similarities are onomatopoeia and chance. Thus, if two languages have words such as titi or tili for ‘twitter’, this is not strong evidence for either common ancestry or borrowing, because the words could easily have been created independently. And if two languages have a question particle a, this is not strong evidence, because many particles consist of a single vowel, and a is a very frequent vowel, so this similarity could be due to chance.

¹³ But see Vennemann (2000) for further speculation on what kind of language may have been the donor language.

However, there are a number of criteria available that often give us a clear indication of the borrowing direction. First, if the word is morphologically analyzable in one language but unanalyzable in another one, then it must come from the first language. For instance, German Grenze ‘border’ must have been borrowed from Polish granica ‘border’ rather than the other way round, because -ica is a well-recognized suffix in Polish, and the stem gran- occurs elsewhere, whereas German Grenze is not analyzable in this way. Similarly, Sanskrit mātaṅga- ‘elephant’ must come from a Munda language, because the element -to- means ‘hand’ within Munda, but has no meaning in Sanskrit (Burrow 1946: 5). Second, phonological criteria are often available: If a word shows signs of phonological integration in language A but not in language B, it must come from language B. Third, if the word is attested in a sister language of language B that cannot have been under the influence of language A, it must come from language B. Thus, Sanskrit jemati ‘eat’ must come from Munda (e.g. Kurku jome ‘eat’), because the root is also attested in Mon-Khmer languages which were not under Indic influence to the same extent as Munda languages (Burrow 1946: 5). Fourth, the meaning often helps: Sanskrit nakra- ‘crocodile’ is likely to be a loanword from Dravidian (e.g. Kannada negar), because Indo-Aryan speakers coming from northern India would not have brought a word for crocodile with them (Burrow 1946: 9). However, these criteria do not always give clear results, especially if the words are very old, and if they appear in languages from a number of different families in a particular area. Such words are sometimes called Wanderwörter, and Awagana & Wolff and Löhr & Wolff (in this volume, in their chapters on Hausa and Kanuri) call the phenomenon “areal roots”.
Even when a loanword is not very old, there may be several different possible donor languages, and it may not be decidable which language the word was borrowed from. This happens, in particular, when several related languages are donor candidates, as in the case of Romance influence on Germanic. The Dutch word pijp ‘pipe’ must have been borrowed from a Romance language, but whether it was French (pipe) or Italian (pipa) is unclear (van der Sijs, Dutch subdatabase). Thus, in the World Loanword Database, quite a few donor languages are in fact “donor families”.¹⁴ In other cases, several different donor languages are given as alternatives, so the relationship between words and donor languages is occasionally a one-to-many relationship. Again, sometimes subtle phonological criteria are available for distinguishing between different donor languages. Thus, Samoan tapa’a ‘tobacco’ was not borrowed directly from English, but via Tongan tapaka (because Samoan ’ regularly corresponds to Tongan k; Mosel 2004: 219).

¹⁴ A cover term for languages and families is languoid, so we sometimes talk about “donor languoids” (= donor languages or “donor families”).

7. Why do languages borrow words?

Explaining why languages change is generally very difficult, and explaining why languages borrow words is no exception. In fact, it is probably more difficult to explain lexical borrowing than most people think. This section will thus limit itself to raising and discussing a number of issues, rather than propose or endorse specific explanations. A simple dichotomy divides loanwords into cultural borrowings, which designate a new concept coming from outside, and core borrowings, which duplicate meanings for which a native word already exists (Myers-Scotton 2002: 41, Myers-Scotton 2006: §8.3). For example, Imbabura Quechua borrowed arrusa ‘rice’, riluju ‘clock’, and simana ‘week’ from Spanish (Gómez Rendón, subdatabase of the World Loanword Database), all referring to cultural items that did not exist in the Americas before the European invasions. On the other hand, the Austroasiatic language Ceq Wong borrowed baya- ‘shadow’, batok ‘to cough’, and dalam ‘deep’ from Malay (Kruspe, subdatabase of the World Loanword Database), all referring to concepts that must have existed before the Ceq Wong came into contact with Malays.¹⁵

7.1. Cultural borrowings

At first glance, explaining cultural loans is straightforward, and such loans have also been called “loanwords by necessity”. However, there is nothing necessary about a borrowing process. All languages have sufficient creative resources to make up new words for new concepts. As Brown (1999) documented in great detail, many North American languages do not use loanwords for introduced concepts like ‘rice’, ‘clock’, and ‘week’, but instead make use of their own resources. If a new concept becomes very frequent and the newly created expression becomes too cumbersome, there are always ways of shortening the expression. For example, Witkowski & Brown (1983: 571) report that the word for ‘sheep’ in Tenejapa Tzeltal (in Chiapas, Mexico) was originally tunim čih [cotton deer], but that as sheep became more important to the people in highland Chiapas, the modifier tunim was simply omitted, so that čih now means simply ‘sheep’ (to designate a deer, the modifier teʔtikil ‘wild’ has to be added). This process is quite similar to simple semantic change or extension, another frequently used mechanism for creating words for new concepts. For example, the words volume, mouse, menu, memory, and bookmark have taken on rather new meanings in recent computer technology, and English has no need for any borrowing here. Of course, there is no potential donor language, but similar mechanisms could be used by languages that have donors available.

¹⁵ Tadmor (2007) proposes the following explanation for the borrowing of basic words in this and similar cases: Speakers tried to assimilate to the strongly dominant Malay people, but had very little access to the Malay language, so they borrowed what they could, the basic vocabulary that they knew. Thus we get the unusual result that more basic than non-basic vocabulary is borrowed in some languages.

Thus, in order to explain the widespread use of loanwords for new concepts, one probably needs to appeal to the convenience of using the loanword in situations of reasonably widespread bilingualism. As soon as many people in the Andes had become Quechua/Spanish bilinguals, using Spanish words for new concepts became very convenient, and using native Quechua neologisms or meaning extensions lost out: When many people know a concept by a certain word but not by another word, even if the better-known word belongs to another language, it becomes more efficient to use the better-known word. This efficiency consideration can be overridden if there is a strong cultural convention in the community to use one’s language as a marker of ethnic identity. For example, Aikhenvald (2002) describes the contact situation between the Arawakan language Tariana and the dominant East Tucanoan languages in the Vaupés region of Amazonia. All Tariana speakers are bilingual, and Tariana grammatical patterns have been strongly influenced by East Tucanoan patterns, but due to cultural pressure to preserve the Tariana language, almost no East Tucanoan loanwords have entered Tariana. Neologisms are instead calqued, e.g. di-tape-dapana [3SG-medicine-house] ‘hospital’, calqued on Tucano /hko-wi’i [medicine-house] (Aikhenvald 2002: 229). Similarly, while the educated elites of French-speaking countries tend to be bilingual in English, there is a certain cultural pressure to avoid English loanwords (e.g. in the domain of computer technology), and neologisms based on French words are promoted by language-planning bodies and have a good chance of being accepted (e.g. courriel for ‘e-mail’). In this, French contrasts interestingly with a number of neighboring European languages (Italian, German, Dutch), where the educated elites are more receptive to English loanwords.
Cultural resistance to loanwords is called purism. The problem with such an explanation based on cultural attitudes is that there is a certain danger of circularity, i.e. of inferring puristic attitudes from the avoidance of loanwords. While the amount of loan vocabulary can be readily observed and measured, the speakers’ attitudes cannot be easily observed in an objective way. Speakers are not likely to be aware of their attitudes to borrowing, because they rarely have extensive knowledge about other sociolinguistic situations and other possible attitudes. Thus, questions like “Is it OK to borrow words, in your opinion?” will not be very meaningful to most speakers. However, for languages with a written tradition and a powerful status, purism among the educated elites is often manifested in published recommendations, or even through the existence of language authorities (e.g. national academies) whose recommendations are likely to be followed by teachers, journalists, etc. In such cases, even purification may be successful, i.e. large-scale replacement of loanwords by native formations. This phenomenon is best-known from various central and eastern European languages, from the 18th century through the first half of the 20th century, but another notable example is the purification of Korean of Japanese elements after the liberation in 1945 (Song 2005: 84).


Thus, unless there are significant purist attitudes among the (influential) speakers, new concepts adopted from another culture are the more likely to be expressed by loanwords, the more widely the donor language is known. If only very few people speak the donor language, native neologisms and meaning shifts are more likely to be used for the new concepts. In a very thorough comparative study, Brown (1999) shows that the North American languages whose primary European contact language was English borrowed far fewer words than languages whose primary contact language was Spanish. He attributes this to the fact that the indigenous populations had more access to Spanish (e.g. through missionary schools) than to English during the initial period of European contact.

7.2. Core borrowings

Explaining core borrowings (loanwords that duplicate or replace existing native words)¹⁶ is more difficult. Why should speakers use a word from another language if they have a perfectly good word for the same concept in their own language? Here it seems that all we can say is that speakers adopt such new words in order to be associated with the prestige of the donor language. Like “puristic attitude”, “prestige” is a factor that is very difficult to measure independently, and a danger of circularity exists. However, it seems to me undeniable that prestige is a factor with paramount importance for language change, going far beyond our current topic of loanwords. The way we talk (or write) is not only determined by the ideas we want to get across, but also by the impression we want to make on others, and by the kind of social identity that we want to be associated with. Other terms such as “cultural pressure” (Thomason & Kaufman 1988: 77) or “loss of vitality (of the recipient language)” (Myers-Scotton 2006: 215) are often found, but these are even more vague and intangible than “prestige”.

It is perhaps easiest to understand the adoption of words for already existing concepts in a situation of widespread bilingualism, as is the case in Selice Romani (speakers are bilingual in Hungarian, Elšík this volume) or Tarifiyt Berber (speakers are bilingual in Moroccan Arabic, Kossmann this volume). When (almost) everyone also understands the other language, it does not really matter which words one uses – one will be understood anyway. More surprising is the borrowing of basic words like ‘star’ and ‘turn around’ by Ceq Wong (from Malay, see Kruspe this volume), even though bilingualism has not been common until quite recently. See note 15 for a possible explanation of this case.

While the distinction between cultural and core borrowings is useful, it is by no means always clear how to classify a loanword.
If all languages had the same lexical meanings that have to be expressed by words, this would be straightforward, but of course lexical meanings do not have to fit into predefined slots. For example, one might think that the Sakha word for ‘roof’, k/r/:sa (from Russian kryša) must be a core borrowing, because the Sakha had roofs before the Russians arrived in Yakutia. However, as Pakendorf & Novgorodov note in the Sakha subdatabase: “The traditional Sakha winter-house had a covering of earth and cow-dung like the walls, not a separate roof like the modern Russian-style houses.” So although the Russians would have called the Sakha-style roof kryša, the Sakha may well have decided that the Russian-style roof was a different kind of thing, deserving a special word (thus a cultural borrowing). Another example is the word mews0m ‘weather’ in Manange, borrowed from Nepali (in Hildebrandt’s subdatabase). Of course Manange speakers talked about the weather before Nepali contact, but they seem to have had no general word for weather. The ‘weather’ word is new to the language, but we can hardly say that the Manange learned a new cultural concept from the Nepali – this word is thus not easily classifiable as a core or cultural borrowing.¹⁷

¹⁶ This term is potentially misleading because it suggests that core borrowings concern core vocabulary only. It is retained here for lack of a better alternative, and because it was used prominently by Myers-Scotton (2002, 2006, and elsewhere).

In the World Loanword Database, we categorized the effect of a loanword on the lexical stock of the recipient language as follows: insertion (the word is inserted into the vocabulary as a completely new item), replacement (the word may replace an earlier word with the same meaning that falls out of use, or changes its meaning), or coexistence (the word may coexist with a native word with the same meaning). For each loanword, we asked the contributors to specify the effect in these terms. Obviously, insertion refers to cultural borrowings, while replacement and coexistence refer to core borrowings. Our contributors were often unsure how to fill in these database fields, because the cultural/core distinction is somewhat problematic, as we just saw. Nevertheless, the information from these fields may prove useful.
The distribution of these three effect types in our database is as follows:

effect            number of (clear) loanwords
insertion         4823
replacement       1667
coexistence       2542
no information    3443
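The counts are easier to compare as proportions. The following is a minimal sketch for illustration only: the variable and function names are hypothetical and are not part of the World Loanword Database. It expresses each count as a percentage, first of all clear loanwords and then of only those loanwords for which contributors specified an effect.

```python
# Counts of (clear) loanwords by effect, copied from the distribution above.
effect_counts = {
    "insertion": 4823,
    "replacement": 1667,
    "coexistence": 2542,
    "no information": 3443,
}


def shares(counts):
    """Return each count as a percentage of the sum of all counts."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


# Shares of all clear loanwords, including those without information.
all_shares = shares(effect_counts)

# Shares among loanwords whose effect was actually specified.
specified = {k: v for k, v in effect_counts.items() if k != "no information"}
specified_shares = shares(specified)

for label, pct in specified_shares.items():
    print(f"{label:<12} {pct:5.1f}%")
```

On these figures, insertion accounts for just over half of the loanwords with a specified effect, with replacement and coexistence together making up the remainder. Leaving out the “no information” rows is a choice made for this illustration; it assumes that the unspecified cases are not skewed toward one effect type.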

¹⁷ The lack of clarity about what a new concept is also means that information about this is not easy to get. Nevertheless, the World Loanword Database has a field (“Environmental salience”) that indicates for loanwords whether the phenomenon was present before the contact or not. The overall result is (for clearly borrowed words):

phenomenon present only since contact            4471
phenomenon present in pre-contact environment    5524
phenomenon not present                            240
no information/not applicable                    2140

These figures seem to show that a very large (perhaps surprisingly large) part of the loanwords are core borrowings.

7.3. Therapeutic borrowing

Borrowing of new words along with new concepts (cultural borrowing) and borrowing for reasons of prestige (core borrowing) are the two most important reasons for borrowing, but borrowing has also been said to occur for therapeutic reasons, when the original word became unavailable. Two subcases of this are:

(i) Borrowing due to word taboo: In some cultures, there are strict word taboo rules, e.g. rules that prohibit a certain word that occurs in a deceased person’s name, or a word that occurs in the name of a taboo relative (e.g. in Australian languages, Dixon 2002: 27, 43). In such cases, a language may acquire large parts of another language’s basic lexicon, so that its genealogical position is recognizable only from its grammatical morphemes (Comrie 2000).

(ii) Borrowing for reasons of homonymy avoidance (cf. Rédei 1970: 11): If a word becomes too similar to another word due to sound change, the homonymy clash might be avoided by borrowing. Thus, it has been suggested that the homonymy of earlier English bread (from Old English bræde) ‘roast meat’ and bread (from Old English bread) ‘morsel, bread’ led to the replacement of the first by a French loan (roast, from Old French rost) (cf. Burnley 1992: 493). However, English borrowed many other words from French, so whether the homonymy was a major reason for the borrowing here, and whether it is ever an important reason, is questionable (cf. also Weinreich’s 1953: 58 cautionary remarks).

7.4. Adoption vs. imposition

Finally, we should consider the distinction between adoption and imposition that was briefly mentioned in §2 (Van Coetsem 1988, Guy 1990, Winford 2005). For borrowed structural patterns, this distinction is very important: Some borrowed phonological and syntactic patterns are due to native speakers borrowing (= adopting) features from another (dominant) language into their own language, and others are due to non-native speakers unintentionally retaining (= imposing) features of their native language on a language to which they are shifting (thus, imposition is called “interference through shift” by Thomason & Kaufman 1988). Imposed patterns survive only if a large number of speakers acquire a new language and shift to it. Thus, features of Indian languages survive in Indian English, but not in British English, where the number of speakers from India is not large enough to have an impact on the general language. Borrowing by imposition has also been called substrate or superstrate influence.

It is well-known that in imposition (or substrate/superstrate) situations, the borrowing primarily concerns the phonology and the syntax, whereas in adoption (or adstrate) situations, the borrowing affects the lexicon first, before it extends to other domains of language structure. This is understandable, because second-language speakers cannot avoid phonological and syntactic interference from their native language, but it is quite easy to avoid using words from one’s native language.

But if substrate influence equals imposition (= non-native speakers’ agentivity), just as adstrate influence equals adoption (= native speakers’ agentivity), we may ask why lexical substrate influence should occur at all. Why are there some Gaulish words in French, some Coptic words in Egyptian Arabic, and some Kikongo words in Saramaccan (cf. Good in this volume)? Why are there Dravidian words in Indo-Aryan, Sumerian words in Akkadian, and Yiddish words in New York English? Is it possible that substrate speakers unintentionally impose or retain words from their original language, just as they unintentionally transfer its phonological and syntactic patterns?¹⁸

The answer seems to be: No, words are not unintentionally retained, but in a substrate situation, there are other mechanisms for borrowing. First of all, the words may have been borrowed (adopted) before the borrowing language became dominant and before the donor language speakers began to shift. Thus, Akkadian and Sumerian were in contact long before the Akkadians took over, and the invasion of Dravidian territory by Indo-Aryans was presumably a long, gradual process. (This contrasts with the Romans in Gaul and the Arabs in Egypt, where contact basically began with the military invasion.) Second, the dominant group may borrow words for concepts that do not exist in their previous experience, especially animal and plant names and other words for natural phenomena. (Minimally, an invading group is likely to retain place names, as in the case of American English, which adopted many indigenous place names, but little else.) Third, substrate language words may occasionally be retained by substrate speakers as markers of their (somewhat) separate identity.
Language shift generally takes place when a group of speakers decides that it wants to merge with a more powerful group in principle, but this is not incompatible with retaining a few emblematic words from the original language. Ross (1991) discusses this case (citing the example of a dialect of the Sissano language of Papua New Guinea) and notes that such emblematic borrowing is really a special case of adoption, rather than imposition. The use of a few Yiddish words in New York English, especially when they mark Jewish identity, may also fall in this category. And finally, words from the language of the shifting speakers may survive if these are a dominant group, as in the case of Franconian words in French, and (Anglo-Norman) French words in English. The latter case is traditionally called superstrate (as opposed to substrate, i.e. shift by a non-dominant group). Significantly, French has many more words from its Franconian superstrate than from its Gaulish substrate, and English has many more words from its French superstrate than from its Celtic substrate. However, these superstrate words are cases of (prestige-based) adoption by recipient-language speakers before the shift, not of unintentional imposition by the donor-language speakers.

¹⁸ Uri Tadmor (p.c.) claims that ethnically Javanese speakers of Indonesian commonly use Javanese words when speaking Indonesian to each other, and also to non-Javanese Indonesians, and they do so unintentionally. This would be a counterexample to the above claim (see also Stewart 2004).

References

Aikhenvald, Alexandra. 2002. Language Contact in Amazonia. Oxford: Oxford University Press.
Boumans, Louis & Caubet, Dominique. 2000. Modelling intrasentential codeswitching: A comparative study of Algerian/French in Algeria and Moroccan/Dutch in the Netherlands. In Owens, Jonathan (ed.), Arabic as a minority language, 113–180. Berlin: Mouton de Gruyter.
Brown, Cecil H. 1999. Lexical acculturation in Native American languages. New York: Oxford University Press.
Buck, Carl Darling. 1949. A Dictionary of Selected Synonyms in the Principal Indo-European Languages. Chicago: The University of Chicago Press.
Burnley, David. 1992. Lexis and semantics. In Blake, Norman (ed.), The Cambridge history of the English language, Vol. 2: 1066–1476, 409–499. Cambridge: Cambridge University Press.
Burrow, Thomas. 1964. Loanwords in Sanskrit. Transactions of the Philological Society 1964:1–30.
Clyne, Michael. 2004. Dynamics of language contact. Cambridge: Cambridge University Press.
Comrie, Bernard. 2000. Language contact, lexical borrowing, and semantic fields. In Gilbers, Dicky & Nerbonne, John & Schaeken, Jos (eds.), Languages in Contact (Studies in Slavic and General Linguistics 28), 73–86. Amsterdam: Rodopi.
Croft, William. 2000. Explaining language change: An evolutionary approach. London: Longman.
Deroy, L. 1956. L'emprunt linguistique. (Bibliothèque de la Faculté de Philosophie et Lettres de l'Université de Liège 141). Paris.
Dixon, R. M. W. 2002. Australian languages. Cambridge: Cambridge University Press.
Grosjean, François. 1982. Life with two languages: An introduction to bilingualism. Cambridge, MA: Harvard University Press.
Guy, Gregory. 1990. The sociolinguistic types of language change. Diachronica 7:47–67.
Haspelmath, Martin. 2008. Loanword typology: Steps toward a systematic cross-linguistic study of lexical borrowability. In Stolz, Thomas & Bakker, Dik & Salas Palomo, Rosa (eds.), Aspects of language contact: New theoretical, methodological and empirical findings with special focus on Romancisation processes, 43–62. Berlin: Mouton de Gruyter.
Haugen, Einar. 1950. The analysis of linguistic borrowing. Language 26:210–231.
Hock, Hans Henrich & Joseph, Brian D. 1996. Language history, language change, and language relationship. Berlin: Mouton de Gruyter.
Höfler, Manfred. 1981. Für eine Ausgliederung der Kategorie 'Lehnschöpfung' aus dem Bereich sprachlicher Entlehnung. In Pöckl, Wolfgang (ed.), Europäische Mehrsprachigkeit: Festschrift zum 70. Geburtstag von Mario Wandruszka, 149–153. Tübingen: Niemeyer.

Johanson, Lars. 2002. Structural factors in Turkic language contacts. London: Curzon.
Krier, Fernande. 1980. Lehnwort und Fremdwort im Maltesischen. Folia Linguistica 14:179–184.
Lehmann, Winfred P. 1962. Historical linguistics: An introduction. New York: Holt, Rinehart & Winston.
Matras, Yaron & Sakel, Jeanette. 2007. Grammatical Borrowing in Cross-Linguistic Perspective. Berlin: Mouton de Gruyter.
Mosel, Ulrike. 2004. Borrowing in Samoan. In Tent, Jan & Geraghty, Paul (eds.), Borrowing: A Pacific perspective, 215–232. Canberra: Pacific Linguistics, ANU.
Møller, Christen. 1933. Zur Methodik der Fremdwortkunde. København.
Muysken, Pieter. 2000. Bilingual speech. Cambridge: Cambridge University Press.
Myers-Scotton, Carol. 1993. Duelling languages: Grammatical structure in codeswitching. Oxford: Clarendon.
Myers-Scotton, Carol. 2002. Contact linguistics: Bilingual encounters and grammatical outcomes. Oxford: Oxford University Press.
Myers-Scotton, Carol. 2006. Multiple voices: An introduction to bilingualism. Malden, MA: Blackwell.
Peperkamp, Sharon. 2005. A psycholinguistic theory of loanword adaptations. In Ettlinger, M. & Fleischer, N. & Park-Doob, M. (eds.), Proceedings of the 30th Annual Meeting of the Berkeley Linguistics Society, 341–352. Berkeley, CA: The Society.
Poplack, Shana & Sankoff, David. 1984. Borrowing: The synchrony of integration. Linguistics 22:99–136.
Poplack, Shana & Sankoff, David & Miller, Christopher. 1988. The social correlates and linguistic processes of lexical borrowing and assimilation. Linguistics 26:47–104.
Rédei, Károly. 1970. Die syrjänischen Lehnwörter im Wogulischen. The Hague: Mouton.
Ross, Malcolm. 1991. Refining Guy's sociolinguistic types of language change. Diachronica 8(1):119–129.
Sankoff, David & Poplack, Shana & Vanniarajan, Swathi. 1990. The case of the nonce loan in Tamil. Language Variation and Change 2:71–101.
Song, Jae Jung. 2005. The Korean language: structure, use and context. London: Routledge.
Stewart, Thomas W. Jr. 2004. Lexical imposition: Old Norse vocabulary in Scottish Gaelic. Diachronica 21(2):393–420.
Stolz, Christel. 2009. A different kind of gender problem: Maltese loan-word gender from a typological perspective. In Comrie, Bernard & Fabri, Ray & Hume, Elizabeth & Mifsud, Manwel & Stolz, Thomas & Vanhove, Martine (eds.), Introducing Maltese Linguistics, 321–353. Amsterdam: Benjamins.
Swadesh, Morris. 1955. Towards greater accuracy in lexicostatistic dating. Linguistics 21:121–137.

Tadmor, Uri. 2007. Is borrowability borrowable? Paper presented at the Symposium Language Contact and the Dynamics of Language, Max Planck Institute for Evolutionary Anthropology, Leipzig, May 2007.
Thomason, Sarah G. 2001. Language contact. Washington, D.C.: Georgetown University Press.
Thomason, Sarah Grey & Kaufman, Terrence. 1988. Language contact, creolization, and genetic linguistics. Berkeley: University of California Press.
Thurgood, Graham. 1999. From ancient Cham to modern dialects: two thousand years of language contact and change. Honolulu: University of Hawai’i Press.
Van Coetsem, Frans. 1988. Loan phonology and the two transfer types in language contact. Dordrecht: Foris.
van Hout, Roeland & Muysken, Pieter. 1994. Modeling lexical borrowability. Language Variation and Change 6:39–62.
Vennemann, Theo. 1984. Bemerkung zum frühgermanischen Wortschatz. In Eroms, Hans-Werner & Gajek, Bernhard & Kolb, Herbert (eds.), Studia Linguistica et Philologica: Festschrift für Klaus Matzel zum sechzigsten Geburtstag, 105–119. Heidelberg: Carl Winter.
Vennemann, Theo. 2000. Zur Entstehung des Germanischen. Sprachwissenschaft 25(3):233–269.
von Polenz, Peter. 1967. Fremdwort und Lehnwort sprachwissenschaftlich betrachtet. Muttersprache 77:65–80.
Winford, Donald. 2005. Contact-induced changes: Classification and processes. Diachronica 22(2):373–427.
Wohlgemuth, Jan. 2009. A typology of verbal borrowings. Berlin: Mouton de Gruyter.
Yip, Moira. 2006. The symbiosis between perception and grammar in loanword phonology. Lingua 116(7):950–975.

Chapter III

Loanwords in the world’s languages: Findings and results

Uri Tadmor

The Loanword Typology (LWT) project has had several tangible results, including the case studies in this volume (chapters 1–41) and the online World Loanword Database (WOLD, at http://wold.livingsources.org). Together, they have made it possible to conduct comparative investigations into various aspects of lexical borrowing. This chapter presents some of the findings of the LWT project. Another result of the project, the Leipzig-Jakarta List of basic vocabulary, is also presented in this chapter (§5).

1. Lexical borrowing across languages

Lexical borrowing rates vary greatly among languages, but it is important to bear in mind that the rates also reflect varying degrees of knowledge about each language. Some languages have a long written history and have been thoroughly studied for centuries; others were only recently documented, and very little is known about their histories. A low borrowing rate may thus indicate that a language has adopted few loanwords during its history, but it could also mean that linguists have not yet identified some of its loanwords. Moreover, not all languages in the sample are of the same age. Most are contemporaries, but Old High German records a much earlier stage of development, while the two creole languages (Saramaccan and Seychelles Creole) are only a few centuries old and have therefore not had much time to borrow words.

Despite the difficulties that such discrepancies present, it is still possible to draw some basic conclusions. The first point is obvious but nevertheless important to make: lexical borrowing is universal. No language in the sample – and probably no language in the world – is entirely devoid of loanwords. The average borrowing rate, at 24.2%, is substantial and higher than expected. Admittedly, there is a bias in the sample towards languages with many loanwords, because specialists on languages with few loanwords were less interested in joining the project. But it is clear that lexical borrowing is a very pervasive phenomenon.

What makes a language particularly amenable to lexical borrowing? Figures for the total numbers of words and (certain or probable) loanwords in the LWT project languages are presented in Table 1. Looking at the ten languages with the highest borrowing rate, it is clear that there is no one answer, as these languages exhibit very different typological and sociolinguistic types. The same can be said for the ten languages with the lowest borrowing rates. Whatever generalizations are formulated, counter-examples can probably be found not only among the world’s thousands of languages, but even among the 41 languages in the sample. For example, one may postulate that Seychelles Creole has a low borrowing rate (10.7%) because it is a new language that has only come into being in recent history, and therefore has not had time to borrow many words. But the other creole in the sample, Saramaccan, has the sixth highest borrowing rate (38.3%). So the most useful explanations appear to be language-specific rather than general. (Moreover, things like typological classification and sociolinguistic circumstances are not constant – they may, and often do, change during a language’s history.)

With regard to the discrepancy between the borrowing rates of the two creoles, the explanation is found in the fact that Saramaccan has undergone partial relexification by Portuguese words (Good, this volume). Words of Portuguese origin are considered to be loanwords, and account for the high borrowing rate. Seychelles Creole did not undergo relexification (Michaelis, this volume), so the explanation that it has not had time to borrow many words still holds. With time, its borrowing rate will surely rise to resemble that of non-creole languages.

Table 1: Lexical borrowing rates in LWT project languages

Borrowing type        Language               Total words   Loanwords   Loanwords as % of total
Very high borrowers   Selice Romani          1,431         898         62.7%
                      Tarifiyt Berber        1,526         789         51.7%
High borrowers        Gurindji               842           384         45.6%
                      Romanian               2,137         894         41.8%
                      English                1,504         617         41.0%
                      Saramaccan             1,089         417         38.3%
                      Ceq Wong               862           319         37.0%
                      Japanese               1,975         689         34.9%
                      Indonesian             1,942         660         34.0%
                      Bezhta                 1,344         427         31.8%
                      Kildin Saami           1,336         408         30.5%
                      Imbabura Quechua       1,158         350         30.2%
                      Archi                  1,112         328         29.5%
                      Sakha                  1,411         409         29.0%
                      Vietnamese             1,477         415         28.1%
                      Swahili                1,610         447         27.8%
                      Yaqui                  1,379         366         26.5%
                      Thai                   2,063         539         26.1%
                      Takia                  1,123         291         25.9%
Average borrowers     Lower Sorbian          1,671         374         22.4%
                      Hausa                  1,452         323         22.2%
                      Mapudungun             1,236         274         22.2%
                      White Hmong            1,290         273         21.2%
                      Kanuri                 1,427         283         19.8%
                      Dutch                  1,513         289         19.1%
                      Malagasy               1,526         267         17.5%
                      Zinacantán Tzotzil     1,217         195         16.0%
                      Wichí                  1,187         188         15.8%
                      Q’eqchi’               1,774         266         15.0%
                      Iraqw                  1,117         162         14.5%
                      Kali’na                1,110         156         14.0%
                      Hawaiian               1,245         169         13.6%
                      Oroqen                 1,138         137         12.0%
                      Hup                    993           114         11.5%
                      Gawwada                982           111         11.3%
                      Seychelles Creole      1,879         201         10.7%
                      Otomi                  2,158         231         10.7%
Low borrowers         Ket                    1,030         100         9.7%
                      Manange                1,009         84          8.3%
                      Old High German        1,203         70          5.8%
                      Mandarin Chinese       2,042         25          1.2%

The languages in Table 1 can be roughly divided into four categories: Very high borrowers, with a borrowing rate of over 50%; high borrowers, with a borrowing rate of 25–50%; average borrowers, with a borrowing rate of 10–25%; and low borrowers, with a borrowing rate of under 10%. The threshold for these categories was set at the lower end of what would appear warranted by the figures, to compensate for the bias in the sample towards relatively high borrowers. This makes the categories applicable for classifying other languages as well. A category “very low borrowers” is not proposed because, as already mentioned, a low borrowing rate may reflect lack of knowledge as well as paucity of actual borrowing.

It would be impossible to discuss the circumstances behind each language’s borrowing rate within this chapter; the readers are referred to the case studies (Chapters 1–41). The discussion will therefore be confined to just two languages: Selice Romani, the highest borrower, and Mandarin Chinese, the lowest borrower. The sociolinguistic and other determinants which have brought about their extreme rates of borrowing will also be applicable, to some extent, to other languages.

Selice Romani is a dialect spoken by about 1,350 people in a village in southwestern Slovakia (Elšík, this volume). Mandarin Chinese is spoken by almost a billion people in China and beyond (Wiebusch & Tadmor, this volume). Romani has always been spoken as a minority language, in linguistic situations dominated by other languages, since its ancestors left India around the 8th or 9th century CE. Chinese, on the other hand, has been the dominant language of its region for thousands of years. This has put Chinese in a position where it had no particular need or reason to borrow words, while Romani, as the language of marginalized minority communities, was under constant linguistic pressure from more dominant languages. Moreover, all speakers of Selice Romani of school age and over are fluent in Hungarian, and most also speak Slovak (and to a smaller extent, Czech). Bilingualism with Hungarian in particular has been widespread for generations. There was therefore plenty of opportunity for Selice Romani speakers to borrow from Hungarian. A permissive attitude towards lexical borrowing among the Roma of Selice also encouraged borrowing, as did the fact that there was never a standard form of Romani to which speakers were expected to conform. On the other hand, few native speakers of Mandarin speak any other language; even if they were in a sociolinguistic situation that fostered borrowing, they would not have the necessary input to do so. A standard form of Mandarin has been used in writing for many centuries, another inhibitor to lexical change; Romani, including the dialect of Selice, has no standard and is seldom written.

Although the sociolinguistic circumstances favor mass borrowing into Selice Romani and disfavor borrowing into Mandarin, these are not the only reasons for their exceptional borrowing rates. While socio-politically marginalized, Romani has been the subject of much academic attention, particularly in the context of language contact. Language contact in Mandarin, on the other hand, has not been thoroughly investigated. Very little is known about the languages which preceded Mandarin in the vast areas where it is now spoken, and surprisingly little is known about borrowing into Mandarin from other Chinese languages either.
In contrast, the history of Romani is much shorter and better understood, and the languages it has come in contact with are well known. Therefore, the large difference between the borrowing rates of Selice Romani and Mandarin Chinese may also reflect a gap in knowledge, not only a gap in actual borrowing. The principal determinants of the respective very high and very low borrowing rates of the two languages are summarized in Table 2.

Table 2: Sociolinguistic circumstances underlying lexical borrowing rates in Selice Romani and Mandarin Chinese

Selice Romani                           Mandarin Chinese
Universal multilingualism               Almost no bilingualism
Minority language                       Majority language
Socio-politically marginalized          Socio-politically dominant
Relatively short history                Relatively long history
Long absence from ancestral homeland    Long presence in ancestral homeland
Permissiveness towards borrowing        Purism
No standard                             Highly standardized
Language contact well-studied           Language contact poorly studied
Donor languages well known              Some donor languages poorly known
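The four borrowing categories proposed for Table 1 amount to a simple threshold rule on the borrowing rate, which can be sketched as follows. The function names are invented for this illustration, and the handling of rates that fall exactly on a boundary (25% or 10%) is an assumption, since the chapter gives the ranges only as "over 50%", "25–50%", "10–25%" and "under 10%". The word counts are taken from Table 1.

```python
def borrowing_rate(loanwords: int, total_words: int) -> float:
    """Loanwords as a percentage of all words in a language's sample."""
    return 100 * loanwords / total_words


def borrower_type(rate: float) -> str:
    """Map a borrowing rate (in %) to one of the four categories.

    Boundary handling is an assumption made for this sketch:
    exactly 25% counts as "high", exactly 10% as "average".
    """
    if rate > 50:
        return "very high"
    if rate >= 25:
        return "high"
    if rate >= 10:
        return "average"
    return "low"


# (loanwords, total words) for a few languages, copied from Table 1.
sample = {
    "Selice Romani": (898, 1431),
    "Takia": (291, 1123),
    "Lower Sorbian": (374, 1671),
    "Otomi": (231, 2158),
    "Ket": (100, 1030),
    "Mandarin Chinese": (25, 2042),
}

classified = {
    name: borrower_type(borrowing_rate(loans, total))
    for name, (loans, total) in sample.items()
}

for name, category in classified.items():
    print(f"{name}: {category}")
```

Because the thresholds are deliberately set low to offset the sampling bias, the same function could in principle be applied to languages outside the sample, with the caveat stated above that a low rate may reflect limited knowledge rather than limited borrowing.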

2. Loanwords and semantic word classes

To examine the relationship between borrowing and word class membership, items on the LWT list were classified into one of the following categories: “nouns”, “verbs”, “adjectives”, “adverbs” and “function words”. It is important to remember, however, that the list consists of lexical meanings rather than of words. (For practical reasons the meanings are expressed in English, but in principle they could be expressed in any other language.) Since meanings have no syntactic properties, these designations should not be construed to refer to syntactic categories. Thus “nouns” is used as convenient shorthand for “meanings of words denoting things or entities”, “verbs” as shorthand for “meanings of words denoting events or actions”, and so forth. Since the word class of the LWT meaning and that of the counterpart word in the project language did not always coincide, findings regarding word classes must be interpreted with some caution.

2.1. Borrowed content words vs. function words

A generalization often made in the literature, for which there is now strong empirical evidence, is that content words are more borrowable than function words. Not only do the total figures indicate this (Table 3.1), but individually too most languages in the sample have a higher proportion – often much higher – of borrowed content words as compared to function words (Table 3.2). Three languages, however, buck this trend. In White Hmong, 22.4% of the function words are loanwords compared to 21.1% of the content words, a slightly lower proportion. The results for Hup are much more robust: only 11.1% of content words are loanwords compared to 16.6% of all function words. Wichí exhibits similar proportions: 15.5% of content words are loanwords compared to 21.5% of function words. It is interesting to contrast the situation in Wichí with that of Imbabura Quechua, where 32.5% of content words are loanwords compared to only 2.3% of function words. Both languages borrowed predominantly from Spanish under broadly similar sociolinguistic circumstances, and it is not clear what brought about this great difference in borrowing behavior.

Table 3.1: Borrowed content words and function words: total figures

Category             All words   Loanwords   Loanwords as % of total
Content words        53,446      13,446      25.2%
Function words       4,071       492         12.1%
Total (all words)    57,517      13,938      24.2%

Table 3.2: Borrowed content words and function words by project language

Language              Loanwords as % of     Loanwords as % of      Loan content words to
                      all content words     all function words     loan function words ratio
Imbabura Quechua      32.5%                 2.3%                   14.4
Iraqw                 15.6%                 1.2%                   12.7
Seychelles Creole     11.3%                 0.9%                   12.0
Oroqen                12.9%                 1.2%                   10.6
Dutch                 20.3%                 2.1%                   9.9
Romanian              44.0%                 5.9%                   7.5
Q’eqchi’              15.9%                 2.5%                   6.2
Hawaiian              14.3%                 2.5%                   5.8
English               43.1%                 8.9%                   4.8
Lower Sorbian         23.5%                 4.9%                   4.8
Manange               8.9%                  2.3%                   3.9
Mapudungun            23.5%                 6.3%                   3.8
Sakha                 30.4%                 8.6%                   3.5
Gurindji              48.1%                 15.4%                  3.1
Bezhta                33.4%                 11.9%                  2.8
Malagasy              18.3%                 7.0%                   2.6
Kanuri                20.7%                 8.9%                   2.3
Archi                 30.7%                 13.8%                  2.2
Kali’na               14.6%                 7.0%                   2.1
Kildin Saami          31.7%                 15.1%                  2.1
Selice Romani         65.6%                 30.9%                  2.1
Gawwada               11.8%                 5.9%                   2.0
Hausa                 22.9%                 12.3%                  1.9
Swahili               28.6%                 14.9%                  1.9
Indonesian            35.0%                 19.8%                  1.8
Otomi                 11.1%                 6.2%                   1.8
Ket                   10.1%                 6.1%                   1.7
Zinacantán Tzotzil    16.5%                 9.5%                   1.7
Vietnamese            28.8%                 17.6%                  1.6
Yaqui                 27.3%                 17.4%                  1.6
Japanese              35.7%                 24.8%                  1.4
Saramaccan            39.2%                 27.3%                  1.4
Takia                 26.5%                 18.8%                  1.4
Berber                52.5%                 39.5%                  1.3
Thai                  26.6%                 20.8%                  1.3
Ceq Wong              37.4%                 32.9%                  1.1
White Hmong           21.1%                 22.4%                  0.9
Hup                   11.1%                 16.6%                  0.7
Wichí                 15.5%                 21.5%                  0.7
Mandarin Chinese      1.3%                  0.0%                   –
Old High German       6.3%                  0.0%                   –
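The ratio in the last column of Table 3.2 is the content-word borrowing rate divided by the function-word borrowing rate, and a ratio below 1 marks a language that runs against the general trend. A minimal sketch of that calculation follows, using a handful of rows copied from the table. The function and variable names are invented here, and because the inputs are the rounded percentages printed in the table, a recomputed ratio can differ slightly from the printed one (for Imbabura Quechua, 32.5 / 2.3 gives about 14.1 rather than 14.4).

```python
from typing import Optional

# language: (loanwords as % of content words, loanwords as % of function words)
rates = {
    "Imbabura Quechua": (32.5, 2.3),
    "Selice Romani": (65.6, 30.9),
    "White Hmong": (21.1, 22.4),
    "Hup": (11.1, 16.6),
    "Wichí": (15.5, 21.5),
    "Mandarin Chinese": (1.3, 0.0),
}


def content_function_ratio(content: float, function: float) -> Optional[float]:
    """Content-to-function ratio; undefined (None) when no function
    words are borrowed, shown as a dash in the table."""
    if function == 0:
        return None
    return content / function


ratios = {name: content_function_ratio(c, f) for name, (c, f) in rates.items()}

# Languages whose function words are borrowed at a higher rate than
# their content words, i.e. the exceptions discussed in the text.
against_trend = sorted(
    name for name, r in ratios.items() if r is not None and r < 1
)

print(against_trend)
```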

2.2. Loan nouns and loan verbs

An oft-made assertion is that nouns are more borrowable than verbs because verbs constitute complex and rigid systems that inhibit borrowing. Moravcsik (1975) claimed that verbs cannot be borrowed as such, but are rather borrowed as nouns which then undergo verbalization in the recipient language. Obviously, when making this statement Moravcsik was not familiar with highly isolating languages, where verbs can be (and often are) borrowed as such without any morphosyntactic modification. In fact, as Wohlgemuth (2009) has shown, even synthetic languages can borrow verbs by what he terms direct insertion (i.e. without any morphosyntactic modification). Nevertheless, as will be seen below, even highly isolating languages borrow proportionally more nouns than verbs, so the higher borrowing rate of nouns cannot be due solely to grammatical factors. Part of the explanation may be the simple fact that things and concepts are easily adopted across cultures (along with the words for them).

Over 31% of all nouns in the cross-linguistic database are loanwords, as compared to only 14% of the verbs (Table 4). This is a very significant disparity that cannot be due to chance. Even if many more languages are added to the sample, the roughly two-to-one ratio between the borrowing rates of nouns and verbs will probably not change significantly.

Table 4: Borrowing by semantic word class, total figures

Semantic word class       All words   Loanwords   Loanwords as % of total
Nouns                     34,355      10,712      31.2%
Adjectives and adverbs    5,284       803         15.2%
Verbs                     13,808      1,932       14.0%
All content words         53,446      13,446      25.2%
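One way to make the claim about chance concrete is a two-proportion z-test on the noun and verb counts in Table 4. This is an illustration added here, not the procedure used in the chapter, and it rests on a strong simplifying assumption: it treats every word in the database as an independent observation, which is unlikely to hold, since words cluster by language and by meaning. The function name is invented for the sketch.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """z statistic for the difference between two proportions,
    using the pooled estimate for the standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Counts from Table 4: loanwords and all words, for nouns and for verbs.
noun_loans, noun_total = 10712, 34355
verb_loans, verb_total = 1932, 13808

z = two_proportion_z(noun_loans, noun_total, verb_loans, verb_total)

# Two-sided p-value from the normal distribution. For a z this large
# the value underflows to 0.0 in double precision.
p_value = math.erfc(abs(z) / math.sqrt(2))

# Ratio of the two borrowing rates (the "roughly two-to-one" figure).
rate_ratio = (noun_loans / noun_total) / (verb_loans / verb_total)

print(f"z = {z:.1f}, p = {p_value:.3g}, noun/verb rate ratio = {rate_ratio:.1f}")
```

Under these assumptions the z statistic is very large, which is consistent with the statement that the disparity is not a chance effect, although a more careful analysis would need to account for the non-independence of the data.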

Among the 41 languages in the sample, only two have a smaller proportion of loan nouns than loan verbs. About 48.8% of the Gurindji nouns in the sample have been identified as loanwords, as compared to 49.7% of verbs. This difference is not statistically significant, and the relative proportion may change if more words are added to the sample (which, with 842 words, is the smallest vocabulary in the database). However, for Saramaccan the difference is more robust: 44% of verbs are borrowed, compared to only 37.1% of nouns. This difference emerged following the partial relexification of Saramaccan by Portuguese (Good, this volume). Many more verbs were relexified (30% of all Saramaccan verbs in the sample are of Portuguese origin) compared to nouns (only 12%). It is not clear what brought about this unusual pattern of lexical replacement. The inner workings of relexification are not well understood yet, and this is one more indication that it constitutes a process that is quite different from ordinary lexical borrowing. Other than Gurindji and Saramaccan, all languages in the sample exhibit a higher borrowing rate for nouns than for verbs, although the specific proportions vary significantly (Table 5).

Table 5: Loan nouns and loan verbs by project language

Language              Loan nouns            Loan verbs            Loan noun to
                      (as % of all nouns)   (as % of all verbs)   loan verb ratio
Zinacantán Tzotzil    24.1%                 0.6%                  37.5
Takia                 37.7%                 3.2%                  11.8
Iraqw                 23.6%                 2.1%                  11.3
Wichí                 23.1%                 2.7%                  8.4
Otomi                 17.0%                 2.2%                  7.6
Bezhta                44.4%                 6.0%                  7.5
Oroqen                18.6%                 2.8%                  6.7
Kali’na               21.1%                 3.6%                  5.8
Old High German       9.0%                  1.7%                  5.4
Q’eqchi’              23.0%                 4.8%                  4.8
Hausa                 31.2%                 7.0%                  4.4
Hawaiian              19.3%                 5.1%                  3.8
Manange               12.3%                 3.3%                  3.7
Yaqui                 37.3%                 10.1%                 3.7
Gawwada               16.9%                 4.6%                  3.6
Archi                 40.6%                 11.7%                 3.5
Dutch                 26.3%                 7.5%                  3.5
Seychelles Creole     14.6%                 4.1%                  3.5
Ket                   13.6%                 4.0%                  3.4
Lower Sorbian         30.7%                 9.0%                  3.4
Malagasy              23.9%                 7.0%                  3.4
Mapudungun            31.3%                 10.1%                 3.1
Sakha                 40.0%                 12.8%                 3.1
Kanuri                26.7%                 8.7%                  3.0
Imbabura Quechua      43.1%                 15.5%                 2.8
Indonesian            43.7%                 17.2%                 2.5
Japanese              43.2%                 19.9%                 2.2
Swahili               34.3%                 16.0%                 2.1
Kildin Saami          38.0%                 19.1%                 2.0
Thai                  32.3%                 16.3%                 2.0
Hup                   13.8%                 8.3%                  1.7
Selice Romani         75.6%                 45.1%                 1.7
Romanian              50.2%                 32.1%                 1.6
English               48.0%                 34.1%                 1.4
Tarifiyt Berber       56.1%                 44.1%                 1.3
Ceq Wong              41.6%                 32.8%                 1.3
Vietnamese            31.3%                 25.0%                 1.3
White Hmong           21.5%                 18.8%                 1.1
Gurindji              48.8%                 49.7%                 1.0
Saramaccan            37.1%                 44.0%                 0.8
Mandarin Chinese      1.9%                  0.0%                  –
Total                 31.2%                 14.0%                 2.2

The borrowing of verbs as opposed to nouns is one area where structural constraints may play a significant role. As discussed above, the more isolating the recipient language, the less morphosyntactic adaptation is necessary for borrowing verbs as such; conversely, the more synthetic the language, the more adaptation is required. It is therefore much easier to borrow verbs into isolating languages than it is into synthetic languages. For example, Thai (Suthiwan & Tadmor, this volume) has borrowed many English verbs without any morphosyntactic modification, e.g. care (as khææ) and cheer (as chia). The fact that these loanwords function (and have always functioned) as verbs in Thai is demonstrated, among other things, by their taking of the verbal negator mây.

Compared to Thai, Hebrew is highly synthetic, especially in its verbal paradigms, where one verbal root can take hundreds of forms. Moreover, in Modern Hebrew nouns do not have to belong to a particular noun class, while verbs can only be conjugated within one of seven verb classes. In other words, a noun can be borrowed without any morphosyntactic modification, but not a verb. In order to borrow a verb, a consonantal root must first be derived, and then it has to fit into one of the existing verbal classes. It is then used in conjunction with a large number of complex discontinuous morphemes. To take a recent example, the English verb to chat has been borrowed into Hebrew with the restricted yet commonly used meaning ‘to chat online’. First, a three-consonant root had to be derived; since chat only has two consonants, the second consonant was reduplicated, resulting in the root ch-t-t. Second, a suitable verb class had to be chosen (in this case the class known as pi'el, though this is by no means an obvious choice). Roots of Hebrew verbs cannot occur independently, so ch-t-t must be used in conjunction with various discontinuous morphemes, resulting in forms like lechotét ‘to chat’, chotátnu ‘we chatted’, and techotetí ‘you (FEM.SG) will chat’.

Nouns, on the other hand, can be borrowed without any morphosyntactic modification, as they only occur in two forms – singular and plural (the dual is rarely used with loanwords). The plural is expressed by two simple suffixes: -ot (for loanwords ending in -a, treated as feminine) and -im (for all other loanwords, treated as masculine). Thus, even a long noun such as encyclopedia is easily borrowed as entsiklopédya, with the equally easily derived plural entsiklopédyot, while endocrinologist is borrowed as endokrinológ with the plural endokrinológim. On the other hand, a verb like reanalyze would be very difficult to borrow as such into Hebrew, because it is unclear how to derive a consonantal root from it. It would either be borrowed as a noun (reanalíza ‘reanalysis’) and the verb would be derived periphrastically (laasót reanalíza, lit. ‘to do a reanalysis’), or it would be calqued (lenatéax mexadásh, lit. ‘to dissect anew’).

These examples suffice to demonstrate why it is relatively difficult to borrow verbs as such into synthetic languages, but quite easy to do so into isolating languages. However, whether speakers of a particular language actually do borrow verbs depends on social rather than linguistic factors. None of the Mandarin verbs in the sample are borrowed, even though Mandarin is a highly isolating language (see Wiebusch & Tadmor, this volume). On the other hand, Berber (Kossmann, this volume) has borrowed a large number of verbs despite being highly synthetic, because it has been under heavy pressure from Arabic for a long time.
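The contrast between the two Hebrew adaptation routes described above can be illustrated with a toy sketch. It operates on romanized strings only, the function names are invented for the illustration, and it covers just the two steps spelled out in the text (the plural suffix rule for loan nouns and the reduplication that pads a two-consonant root); it is in no way a model of Hebrew morphology. The singular for encyclopedia is spelled here with ts, to match the plural form entsiklopédyot.

```python
def pluralize_loan_noun(noun: str) -> str:
    """Plural of a romanized loan noun, following the rule in the text:
    -ot replaces a final -a (feminine), otherwise -im is added (masculine)."""
    if noun.endswith("a"):
        return noun[:-1] + "ot"
    return noun + "im"


def triconsonantal_root(consonants: list[str]) -> list[str]:
    """Pad a two-consonant sequence to three by reduplicating the second
    consonant, as described for chat; longer inputs are returned as-is.
    How roots are derived in other cases is not covered by this sketch."""
    root = list(consonants)
    if len(root) == 2:
        root.append(root[-1])
    return root


print(pluralize_loan_noun("entsiklopédya"))
print(pluralize_loan_noun("endokrinológ"))
print("-".join(triconsonantal_root(["ch", "t"])))
```

The noun rule is a single suffix choice, whereas the verb route stops at the root: producing an actual verb form would still require choosing a verb class and applying its discontinuous morphemes, which is exactly the extra work the text identifies.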


Uri Tadmor

3. Loanwords and semantic fields

Words belonging to different semantic fields display widely varying borrowing rates. However, different languages display a remarkable degree of consistency with regard to which fields are more or less affected by borrowing. While there are certainly cross-linguistic differences, most languages tend to borrow more words into similar fields, and the same fields turn up again and again as the ones most resistant to borrowing. A list of all semantic fields in the LWT meaning list, along with the borrowing rate for each one, can be found in Table 6.

Table 6: Borrowing by semantic field

Semantic field                     Loanwords as % of total
Religion and belief                41.2%
Clothing and grooming              38.6%
The house                          37.2%
Law                                34.3%
Social and political relations     31.0%
Agriculture and vegetation         30.0%
Food and drink                     29.3%
Warfare and hunting                27.9%
Possession                         27.1%
Animals                            25.5%
Cognition                          24.2%
Basic actions and technology       23.8%
Time                               23.2%
Speech and language                22.3%
Quantity                           20.5%
Emotions and values                19.9%
The physical world                 19.8%
Motion                             17.3%
Kinship                            15.0%
The body                           14.2%
Spatial relations                  14.0%
Sense perception                   11.0%
All words                          24.2%

The semantic fields most affected by borrowing are Religion and belief, Clothing and grooming, and The house. These semantic fields correspond to domains which have typically been most affected by intercultural influences. Examining the distribution and history of the world’s major religions reveals why religious terminology constitutes the most borrowable part of the lexicon.¹ The world’s largest religions by far are Christianity and Islam. Both came into being in historical times and started out in very limited geographical locations, but later spread around the world and were adopted by speakers of hundreds of different languages. It was only natural that as people adopted these religions they also adopted the terminologies that came along with them. Other world religions such as Buddhism – and to a lesser extent Hinduism – also spread from relatively small areas to encompass many ethnolinguistic groups.

The fields of Clothing and grooming and The house are similar. Colonialism and globalization have contributed to the spread and adoption of garments which were once worn only in Europe. In a television report from the streets of London, Beijing, or Nairobi, the clothes people wear – shoes, trousers, dresses, and shirts – look remarkably similar. Compare this with photographs of street scenes taken in these cities less than a century ago, and the differences in dressing styles become striking. The same goes for houses (and especially apartment blocks), which increasingly tend to resemble each other in cities all across the globe. Inside the houses, too, people across the globe now use similar furniture and utensils. As garments and household items spread around the world, so did the words denoting them.

The semantic fields at the other extreme, comprising those least amenable to borrowing, are no less interesting. They consist of concepts that are universal and shared by most human societies. Practically every language can be expected to have indigenous words for such concepts, and therefore has no need to borrow them. These fields are Sense perception, Spatial relations, The body, and Kinship, which have borrowing rates of just 11%–15%.

¹ Technical vocabulary would probably show an even higher borrowing rate but for various reasons was not included in the project.

4. Measuring borrowability

Studying borrowability is particularly interesting in view of the importance attached to resistance to borrowing as a defining feature of basic vocabulary. Thus, Greenberg (1957: 39) stated that basic (“fundamental”) vocabulary is much less susceptible to borrowing than non-basic (cultural) vocabulary. He conceded that any lexical item might be borrowed, but asserted that “fundamental vocabulary is proof against mass borrowing”. Campbell (2004: 347) referred to the same topic, albeit with a different emphasis: “basic vocabulary can also be borrowed – though less frequently – so that its role as a safeguard against borrowing is not foolproof”. The cross-linguistic data from the World Loanword Database can be used to assess these claims, but as we will see, measuring the borrowability of lexical meanings is not entirely straightforward.

4.1. An unweighted list

As discussed in Chapter I, contributors were asked to assign a “borrowed status” to each word of their databases, in terms of five different degrees of borrowing likelihood. Each degree was then assigned a numerical score by the editors,² which enabled us to compute the average “unborrowed score” of each meaning. All the items on the LWT list were then ranked by the unborrowed score in descending order. The 100 most borrowing-resistant items, as determined by this method, are listed in Table 7. Only five meanings have an unborrowed score of 1, meaning they have no counterpart in any language that is probably or clearly a loanword.

The least-borrowed items on this list contain surprisingly few of the meanings traditionally associated with the notion of “basic vocabulary”, such as body parts and important natural phenomena. Far more numerous, especially in the highest rankings, are function words and deictics, especially ones related to spatial organization: in, at, behind, above, under, outside, in front of, this, that, here, there, up, down. There are also several time deictics (today, yesterday, the day before yesterday, the day after tomorrow – but interestingly not tomorrow), as well as various person deictics (pronouns): I, you (both singular and plural), he/she/it, we (both inclusive and exclusive), and others. Interestingly, all the interrogatives among the 1,460 items on the LWT meaning list are ranked in the top 100 least-borrowed items: what, who, which, when, where, how, why, and how much.

However, this unweighted list is problematic as a list of borrowing-resistant meanings, for a number of reasons.

(i) Some lexical meanings are not represented by fixed lexical expressions in many or most languages (and have to be expressed by descriptive phrases). Quite a few languages do not have counterparts for meanings such as ‘day after tomorrow’, ‘younger sister’, or ‘married woman’. Such items do not constitute good evidence for low borrowability because of their poor “representation” in the combined database; the data are insufficient to determine whether they are borrowable or not.
(ii) Quite a few meanings are represented in many languages, but not by simple monomorphemic words. Rather, they are expressed by analyzable expressions such as complex words, compounds, and phrasal expressions. Such analyzable expressions are almost by definition created in the recipient language and hence could not normally count as loanwords. Therefore they are not relevant for studying borrowability.

(iii) Another important factor that does not figure in Table 7 is age. The longer a word exists in a language, the greater the opportunity it has to be replaced by a loanword. If a word has existed in a language for thousands of years without being replaced by a loanword, this clearly indicates high resistance to borrowing. On the other hand, if a word has only existed for a few years, it is not possible to tell whether it is borrowing-resistant: given sufficient time, it might be replaced by a loanword. Therefore, old words constitute much more reliable evidence for resistance to borrowing than new words. Contributors were asked to note, for each word, the earliest date to which it could be attested or reconstructed.

² The assigned scores were as follows: No evidence for borrowing, 1.00; Very little evidence for borrowing, 0.75; Perhaps borrowed, 0.50; Probably borrowed, 0.25; and Clearly borrowed, 0.00.
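The scoring and ranking procedure of this section can be sketched in a few lines of Python. The status-to-score mapping is the one given in the footnote on assigned scores; everything else (the function names, the assumption that the average is a simple mean over all counterpart words of a meaning, and the sample data) is illustrative rather than a description of the project's actual software. Tied meanings share a rank and the following rank is skipped, which is the pattern seen at the top of Table 7 (1, 1, 1, 1, 1, 6).

```python
from statistics import mean

# Scores assigned by the editors to the five degrees of borrowing likelihood.
STATUS_SCORES = {
    "No evidence for borrowing": 1.00,
    "Very little evidence for borrowing": 0.75,
    "Perhaps borrowed": 0.50,
    "Probably borrowed": 0.25,
    "Clearly borrowed": 0.00,
}


def unborrowed_score(statuses):
    """Average the scores of all counterpart words of one meaning.

    Assumption: a simple unweighted mean over words from all languages.
    """
    return mean(STATUS_SCORES[status] for status in statuses)


def rank_descending(scores):
    """Rank meanings by score, highest first; ties share the same rank."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranked = []
    for position, (meaning, score) in enumerate(ordered, start=1):
        if ranked and score == ranked[-1][2]:
            rank = ranked[-1][0]  # same score as previous item: same rank
        else:
            rank = position
        ranked.append((rank, meaning, score))
    return ranked


# Hypothetical counterpart words for one meaning in four languages.
example = ["No evidence for borrowing"] * 3 + ["Clearly borrowed"]
print(unborrowed_score(example))
print(rank_descending({"this": 1.0, "up": 1.0, "where?": 0.997}))
```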

Table 7: 100 most borrowing-resistant items on the LWT meaning list

Rank  Label                     Unborrowed score
1     he/she/it                 1.000
1     we (inclusive)            1.000
1     we (exclusive)            1.000
1     up                        1.000
1     this                      1.000
6     where?                    0.997
7     why?                      0.995
8     which?                    0.994
9     we                        0.991
10    married woman             0.990
11    younger sister            0.989
11    to rise                   0.989
13    day after tomorrow        0.987
13    to spin                   0.987
15    stinking                  0.982
15    to bring                  0.982
17    day before yesterday      0.981
17    there                     0.981
17    to lie down               0.981
17    to stand                  0.981
17    here                      0.981
22    how?                      0.980
23    to run                    0.976
24    behind                    0.975
24    bitter                    0.975
26    nose                      0.973
26    thatch                    0.973
28    to go out                 0.972
28    to say                    0.972
28    to draw water             0.972
28    that                      0.972
28    itch                      0.972
33    to go/return home         0.971
34    what?                     0.971
35    to grasp                  0.970
35    I                         0.970
35    to be hungry              0.970
38    younger brother           0.969
38    yolk                      0.969
38    above                     0.969
41    to come                   0.968
41    who?                      0.968
43    next                      0.967
44    to listen                 0.967
45    it                        0.966
46    under                     0.966
47    to fart                   0.965
47    fire                      0.965
47    not                       0.965
50    to bite                   0.964
50    child-in-law              0.964
50    right (side)              0.964
53    to have                   0.963
53    to go                     0.963
53    to lose                   0.963
56    to blow                   0.962
56    to howl                   0.962
56    you (plural)              0.962
59    to grow                   0.961
60    to throw                  0.960
60    to drop                   0.960
62    you (singular)            0.958
62    to flow                   0.958
64    yesterday                 0.958
65    to hollow out             0.957
65    to play                   0.957
67    eyelid                    0.956
67    long                      0.956
69    to hit/beat               0.955
69    wide                      0.955
71    udder                     0.954
71    to climb                  0.954
73    married man               0.953
73    to hear                   0.953
73    loud                      0.953
73    when?                     0.953
73    bright                    0.953
78    today                     0.952
78    down                      0.952
78    nit                       0.952
81    black                     0.951
81    firewood                  0.951
82    to burn (intransitive)    0.951
84    thick                     0.950
84    louse                     0.950
84    to chop                   0.950
84    to float                  0.950
88    finger                    0.949
88    outside                   0.949
90    fly                       0.948
90    in                        0.948
92    at                        0.947
92    she                       0.947
92    breast                    0.947
92    to do/make                0.947
96    to fall                   0.946
96    how much?                 0.946
97    raw                       0.945
97    older sister              0.945
97    in front of               0.945

4.2. Three additional factors

In order to take into account these factors of representation in the database, analyzability/simplicity, and age, scores were computed for each of the factors, with values between 1 and 0 (as for the unborrowed score).


The representation score is simply based on the number of languages in the LWT sample that had at least one counterpart for the meaning. The score ranges from zero, for meanings not represented in any language, to 1.00, for meanings represented in all 41 LWT project languages.

The analyzability/simplicity score is based on the information on analyzability provided by the contributors. Words which are analyzable, regardless of whether they are complex, compound, or phrasal expressions, were assigned a zero value. The only exception was when an affix had to be used with a citation form, for example a case ending or an infinitive marker, or when an affix was added as part of the borrowing process. Such words – unless containing more than one base – were treated as unanalyzable. Semi-analyzable words, for example words whose etymology is transparent only to linguists, or words containing “cranberry morphs”, were given a value of 0.50. Finally, unanalyzable words were given the value 1.00. The values of all words corresponding to each meaning were averaged, to produce the overall analyzability/simplicity score.

Words were also assigned numerical scores in proportion to their age,³ and the values for all counterparts of each meaning were again averaged, producing an overall age score for each lexical meaning. The result of including these additional factors is discussed in the next section.
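The three additional scores just described can be sketched as follows. The sample size of 41 languages and the analyzability values come from the text, and the age bands are those listed in the footnote on age values; the function names, the category labels used as dictionary keys, and the use of a plain integer year (CE) as input are assumptions made for illustration.

```python
from statistics import mean

N_LANGUAGES = 41  # number of LWT project languages, as stated in the text


def representation_score(languages_with_counterpart, total=N_LANGUAGES):
    """Share of sample languages with at least one counterpart."""
    return languages_with_counterpart / total


# Values per analyzability category; the labels are hypothetical keys.
ANALYZABILITY_VALUES = {
    "analyzable": 0.0,
    "semi-analyzable": 0.5,
    "unanalyzable": 1.0,
}


def simplicity_score(categories):
    """Average analyzability value over all words for one meaning."""
    return mean(ANALYZABILITY_VALUES[category] for category in categories)


# (cutoff year, value): a word first attested earlier than the cutoff
# receives the value; later words fall through to 0.50.
AGE_BANDS = [(1000, 1.00), (1500, 0.90), (1800, 0.80), (1900, 0.70), (1950, 0.60)]


def age_value(earliest_year):
    """Value for one word, given its earliest attestation or reconstruction."""
    for cutoff, value in AGE_BANDS:
        if earliest_year < cutoff:
            return value
    return 0.50


def age_score(earliest_years):
    """Average age value over all words for one meaning."""
    return mean(age_value(year) for year in earliest_years)


print(round(representation_score(40), 3))
print(simplicity_score(["unanalyzable", "semi-analyzable", "analyzable", "unanalyzable"]))
print(age_value(1899))
```

Under these assumptions, a meaning attested in 40 of the 41 languages receives a representation score of about 0.976, and one attested in 36 languages about 0.878; both values occur in Table 8.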

5. The Leipzig-Jakarta list of basic vocabulary

5.1. A weighted list

To arrive at a list that takes into account the additional factors, the unborrowed score, representation score, simplicity score, and age score were multiplied by each other, to produce a composite score. The items on the LWT meaning list were then ranked in descending order by this composite score (see Table 8). This is no longer just a (low) borrowability ranking, because it gives equal weight to three other factors (analyzability, representation, and age). In fact, it is a full-fledged basic vocabulary ranking. It comprises the notions normally associated with this concept: stability (our age score), universality (our representation score), and simplicity (our analyzability score), as well as resistance to borrowing (our unborrowed score).

A raw ranking was generated from the consolidated LWT database based on the composite score described above. It was then slightly edited. Several pairs of meanings which are not represented by two separate words in most languages were combined, and their scores were averaged:

meat + flesh               >  meat/flesh
arm + hand                 >  arm/hand
leg + foot                 >  leg/foot
head louse + body louse    >  louse
to do + to make            >  to do/make

³ The assigned values were as follows: Words first reconstructed or attested earlier than 1000 CE, 1.00; earlier than 1500, 0.90; earlier than 1800, 0.80; earlier than 1900, 0.70; earlier than 1950, 0.60; and 2007 or earlier, 0.50.

A few labels were also slightly edited for brevity, clarity, and consistency. Finally, the top 100 items were taken to produce the new basic vocabulary list. It is named the Leipzig-Jakarta list, after the locations where it was conceived and created.

Table 8: The Leipzig-Jakarta list of basic vocabulary

Rank  Word meaning        Unborrowed  Age     Simplicity  Representation  Composite
                          score       score   score       score           score
1     fire                0.965       0.939   0.995       1.000           0.901
2     nose                0.973       0.906   0.980       1.000           0.864
3     to go               0.963       0.887   0.974       1.000           0.832
4     water               0.909       0.926   0.987       1.000           0.831
5     mouth               0.920       0.904   0.982       1.000           0.817
6     tongue              0.934       0.908   0.954       1.000           0.808
7     blood               0.904       0.890   1.000       1.000           0.805
7     bone                0.918       0.904   0.971       1.000           0.805
9     2SG pronoun         0.958       0.893   0.933       1.000           0.798
9     root                0.944       0.869   0.973       1.000           0.798
11    to come             0.968       0.876   0.940       1.000           0.796
12    breast              0.947       0.856   0.967       1.000           0.783
13    rain                0.916       0.898   0.950       1.000           0.782
14    1SG pronoun         0.970       0.875   0.936       0.976           0.776
15    name                0.915       0.886   0.955       1.000           0.774
15    louse               0.950       0.861   0.946       1.000           0.774
17    wing                0.884       0.904   0.968       1.000           0.773
18    flesh/meat          0.877       0.892   0.986       1.000           0.771
19    arm/hand            0.881       0.903   0.966       1.000           0.768
20    fly                 0.948       0.858   0.942       1.000           0.766
20    night               0.931       0.880   0.934       1.000           0.766
22    ear                 0.896       0.888   0.961       1.000           0.764
23    neck                0.895       0.881   0.964       1.000           0.760
23    far                 0.944       0.850   0.948       1.000           0.760
25    to do/make          0.947       0.877   0.914       1.000           0.759
26    house               0.893       0.876   0.969       1.000           0.758
27    stone/rock          0.895       0.882   0.958       1.000           0.756
28    bitter              0.975       0.872   0.889       1.000           0.755
28    to say              0.972       0.837   0.928       1.000           0.755
28    tooth               0.882       0.877   0.975       1.000           0.755
31    hair                0.944       0.871   0.917       1.000           0.754
32    big                 0.889       0.864   0.980       1.000           0.753
32    one                 0.870       0.893   0.969       1.000           0.753
34    who?                0.968       0.838   0.924       1.000           0.749
34    3SG pronoun         1.000       0.893   0.955       0.878           0.749
36    to hit/beat         0.955       0.827   0.947       1.000           0.748
37    leg/foot            0.856       0.897   0.972       1.000           0.747
38    horn                0.840       0.898   0.987       1.000           0.745
38    this                1.000       0.851   0.897       0.976           0.745
38    fish                0.855       0.885   0.984       1.000           0.745
41    yesterday           0.958       0.843   0.922       1.000           0.744
42    to drink            0.904       0.877   0.934       1.000           0.741
42    black               0.951       0.866   0.899       1.000           0.741
42    navel               0.878       0.860   0.982       1.000           0.741
45    to stand            0.981       0.847   0.889       1.000           0.738
46    to bite             0.964       0.861   0.887       1.000           0.736
46    back                0.918       0.868   0.924       1.000           0.736
48    wind                0.828       0.900   0.987       1.000           0.736
49    smoke               0.916       0.863   0.929       1.000           0.734
50    what?               0.971       0.804   0.939       1.000           0.732
51    child (kin term)    0.929       0.866   0.930       0.976           0.730
52    egg                 0.910       0.846   0.945       1.000           0.728
53    to give             0.913       0.878   0.907       1.000           0.727
53    new                 0.920       0.860   0.920       1.000           0.727
53    to burn (intr.)     0.951       0.860   0.889       1.000           0.727
56    not                 0.965       0.880   0.974       0.878           0.726
56    good                0.893       0.860   0.945       1.000           0.726
58    to know             0.933       0.856   0.908       1.000           0.725
59    knee                0.911       0.862   0.922       1.000           0.724
59    sand                0.901       0.866   0.928       1.000           0.724
61    to laugh            0.942       0.844   0.910       1.000           0.723
61    to hear             0.953       0.848   0.895       1.000           0.723
63    soil                0.900       0.883   0.954       0.951           0.722
64    leaf                0.897       0.823   0.977       1.000           0.721
64    red                 0.926       0.864   0.900       1.000           0.721
66    liver               0.869       0.857   0.967       1.000           0.720
67    to hide             0.928       0.847   0.913       1.000           0.718
67    skin/hide           0.889       0.875   0.924       1.000           0.718
67    to suck             0.940       0.860   0.888       1.000           0.718
70    to carry            0.919       0.838   0.953       0.976           0.717
71    ant                 0.865       0.850   0.975       1.000           0.716
71    heavy               0.911       0.874   0.901       1.000           0.716
71    to take             0.900       0.898   0.887       1.000           0.716
74    old                 0.896       0.867   0.920       1.000           0.715
75    to eat              0.920       0.840   0.925       1.000           0.714
76    thigh               0.906       0.856   0.918       1.000           0.712
76    thick               0.950       0.827   0.906       1.000           0.712
78    long                0.956       0.824   0.898       1.000           0.707
79    to blow             0.962       0.857   0.878       0.976           0.706
80    wood                0.860       0.871   0.940       1.000           0.705
81    to run              0.976       0.833   0.867       1.000           0.704
81    to fall             0.946       0.825   0.903       1.000           0.704
83    eye                 0.904       0.847   0.918       1.000           0.703
84    ash                 0.853       0.891   0.921       1.000           0.699
84    tail                0.883       0.813   0.973       1.000           0.699
84    dog                 0.838       0.869   0.960       1.000           0.699
87    to cry/weep         0.871       0.871   0.921       1.000           0.698
88    to tie              0.879       0.836   0.948       1.000           0.697
89    to see              0.918       0.842   0.900       1.000           0.695
89    sweet               0.914       0.857   0.887       1.000           0.695
91    rope                0.848       0.824   0.993       1.000           0.694
91    shade/shadow        0.887       0.840   0.931       1.000           0.694
91    bird                0.842       0.857   0.962       1.000           0.694
91    salt                0.848       0.838   0.976       1.000           0.694
91    small               0.909       0.790   0.966       1.000           0.694
96    wide                0.955       0.819   0.885       1.000           0.692
97    star                0.830       0.859   0.970       1.000           0.691
97    in                  0.948       0.856   0.943       0.902           0.691
99    hard                0.918       0.833   0.903       1.000           0.690
100   to crush/grind      0.919       0.845   0.886       1.000           0.688
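The composite scores in Table 8 can be checked against the rule stated in Section 5.1, namely that the four component scores are multiplied by each other. The sketch below does this for four rows copied from the table; the function names and the tolerance are assumptions. Because the published component scores are rounded to three decimals, the recomputed product can differ from the published composite in the third decimal place, so the comparison uses a small tolerance rather than exact equality.

```python
from math import prod

# (unborrowed, age, simplicity, representation), published composite score
TABLE_8_SAMPLE = {
    "fire": ((0.965, 0.939, 0.995, 1.000), 0.901),
    "nose": ((0.973, 0.906, 0.980, 1.000), 0.864),
    "3SG pronoun": ((1.000, 0.893, 0.955, 0.878), 0.749),
    "in": ((0.948, 0.856, 0.943, 0.902), 0.691),
}


def composite_score(unborrowed, age, simplicity, representation):
    """Product of the four component scores, giving each equal weight."""
    return prod((unborrowed, age, simplicity, representation))


def max_deviation(sample):
    """Largest gap between recomputed and published composite scores."""
    return max(
        abs(composite_score(*factors) - published)
        for factors, published in sample.values()
    )


for meaning, (factors, published) in TABLE_8_SAMPLE.items():
    print(meaning, round(composite_score(*factors), 4), published)
```

A multiplicative composite means that a low value on any single factor pulls the whole score down: 3SG pronoun has a perfect unborrowed score of 1.000, yet ranks only 34th because its representation score is 0.878.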

The most important categories of meanings on the Leipzig-Jakarta list are described below.

Body parts constitute the most prominent group. Items from the semantic field The body make up only about a tenth of all the items on the 1,460-item LWT meaning list, but fully a quarter of the items on the Leipzig-Jakarta list of basic vocabulary. Most items represent external organs expected to be known to any normal speaker in any society: mouth, ear, nose, eye, arm/hand, leg/foot, and many others.

Universally present natural phenomena which are of importance to humans are very heavily over-represented on the list in comparison to their overall distribution. They include water, fire, stone/rock, rain, night, star, wind, and others.

The generic animal terms fish and bird are on the list, as well as a few more specific terms for creatures found wherever there are humans: louse, ant, fly, and man’s best friend – dog. Unlike animals, there are no specific plant names on the list, probably because no plants are found everywhere on the planet and relevant to all human societies. However, there are a few plant-related terms: root, leaf, and wood.

Generic actions include movement verbs like to go and to come as well as basic activities such as to eat, to drink, and to laugh. The sense perception verbs to see and to hear are on the list, but not to smell, to touch, or to taste, which are of somewhat less cardinal importance to humans.

Basic properties such as big, small, old, and new are on the list, as well as the color terms black and red, but interestingly not white.

The singular pronouns I, you (singular), and he/she/it are on the list, but not any plural pronouns, probably because these are sometimes polymorphemic.

The interrogatives what? and who? are on the list, but not other interrogatives, which are less basic or tend to be polymorphemic.

Because of the importance given by Swadesh and others to “culture-free” terms, it is interesting to note that a few items on the list are directly related to human culture. These include house, name, rope, and to tie which, although not culture-free concepts, probably exist in all human societies. The high ranking of rope and to tie is interesting, as it suggests that the rope is the most basic of human tools and tying is the most basic technology.

Basic vocabulary as defined by stability, universality, simplicity, and resistance to borrowing is a useful concept for general cross-linguistic comparison, particularly for historical linguistics, where it is an invaluable tool for determining whether and how languages are related to each other. However, basic vocabulary is also relevant for synchronic language description.
When setting out to study a previously undocumented language, eliciting the basic vocabulary is among the very first tasks. Indeed, no description of a language can be considered complete without at least some discussion of its basic vocabulary. It is hoped that the Leipzig-Jakarta list will prove useful to linguists in these endeavors.

5.2. The Leipzig-Jakarta list and the Swadesh list

The best-known list of basic vocabulary is the Swadesh list, created by American linguist Morris Swadesh in the 1950s. Swadesh first proposed a 200-item list, which he then edited and narrowed down to a 100-item list (for a discussion, see Swadesh 1971). The Swadesh list has been used countless times and has proven to be of tremendous value for linguists around the world. Yet it is important to remember that it was created based mostly on intuition. Indeed, it was the intuition of a brilliant, knowledgeable, and accomplished scholar, but nonetheless an intuition. The Leipzig-Jakarta list makes use of the powers of computational linguistics and internet-enabled academic collaboration to produce an alternative, empirically-based basic vocabulary list.

It is interesting to see how the Leipzig-Jakarta list compares with the Swadesh list. In fact, there is a fair degree of correlation: 62 items on the lists overlap (Table 9.1). This means that a total of 38 items on the Leipzig-Jakarta list do not appear on the Swadesh list (Table 9.2) and vice versa (Table 9.3). Swadesh’s intuitions thus appear to have been not far off the mark, although a 38% difference is substantial and can lead to rather different lexicostatistical and other results. Moreover, our findings indicate that quite a few items on the Swadesh list are not very basic. For example, round is ranked 371 on our list, and person is ranked 526. It is probably not a coincidence that both these terms are represented by loanwords in English. At any rate, the major advantage of the Leipzig-Jakarta list is that it has a strong empirical foundation and is thus a more reliable tool for scientific purposes.

Table 9.1: Items shared by the Swadesh list and the Leipzig-Jakarta list

Leipzig-Jakarta list    Swadesh list
arm/hand                hand
ash                     ash
big                     big
bird                    bird
to bite                 bite
black                   black
blood                   blood
bone                    bone
breast                  breasts
to burn (intr.)         burn
to come                 come
dog                     dog
to drink                drink
ear                     ear
to eat                  eat
egg                     egg
eye                     eye
fire                    fire
fish                    fish
flesh/meat              flesh
to give                 give
good                    good
hair                    hair
louse                   louse
to hear                 hear
horn                    horn
1SG pron.               I
knee                    knee
to know                 know
leaf                    leaf
leg/foot                foot
liver                   liver
long                    long
mouth                   mouth
name                    name
neck                    neck
new                     new
night                   night
nose                    nose
not                     not
one                     one
rain                    rain
red                     red
root                    root
sand                    sand
to say                  say
to see                  see
skin/hide               skin
small                   small
smoke                   smoke
soil                    earth
to stand                stand
star                    star
stone/rock              stone
tail                    tail
this                    this
tongue                  tongue
tooth                   tooth
water                   water
what?                   what?
who?                    who?
2SG pron.               thou

Table 9.2: Items on the Leipzig-Jakarta list but not on the Swadesh list

ant                 to do/make     to hide      salt            to tie
back                to fall        house        shade/shadow    wide
bitter              far            in           to hit/beat     wind
to blow             fly            to laugh     to suck         wing
to carry            to go          navel        sweet           wood
child (kin term)    hard           old          to take         yesterday
to crush/grind      3SG pronoun    rope         thick
to cry/weep         heavy          to run       thigh

Table 9.3: Items on the Swadesh list but not on the Leipzig-Jakarta list

all        fingernail    kill        round    two
bark       fly           lie         seed     walk
belly      full          man         sit      we
cloud      grease        many        sleep    white
cold       green         moon        sun      woman
die        head          mountain    swim     yellow
dry        heart         path        that
feather    hot           person      tree
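The comparison summarized in Tables 9.1 to 9.3 amounts to a set intersection and two set differences. The sketch below shows the operation on small illustrative subsets drawn from those tables; the function and variable names are hypothetical. It assumes that labels have already been normalized to a common form, since the two lists label some shared items differently (for example arm/hand versus hand, or soil versus earth), and a naive string comparison would miss those matches.

```python
def compare_lists(list_a, list_b):
    """Count shared and unshared items between two vocabulary lists."""
    set_a, set_b = set(list_a), set(list_b)
    return {
        "shared": len(set_a & set_b),
        "only_in_a": len(set_a - set_b),
        "only_in_b": len(set_b - set_a),
    }


# Illustrative subsets only, with labels already normalized.
leipzig_jakarta_sample = ["fire", "nose", "water", "ant", "rope"]
swadesh_sample = ["fire", "nose", "water", "moon", "sun"]

print(compare_lists(leipzig_jakarta_sample, swadesh_sample))

# For two 100-item lists sharing 62 items, each list has 38 unshared items.
LIST_SIZE, SHARED = 100, 62
print(LIST_SIZE - SHARED)
```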

6. Conclusion

This chapter presented some of the findings and results of the LWT project. One of its aims has been to provide an empirical foundation for long-held beliefs, such as that content words are more borrowable than function words and that nouns are more borrowable than verbs. Less expected – but not less important – has been the realization that borrowability in itself has limited (though interesting) applications. It becomes much more meaningful when used in conjunction with other variables, such as universality, stability, and simplicity. Thus used, it can be a useful aid for diachronic as well as synchronic description and analysis of languages.

An interesting issue which could not be explored due to insufficient data input was the correlation between frequency and borrowability. It seems logical that frequently used words would also be highly resistant to borrowing, because more time and effort would be needed for the borrowing to become established. Similarly, it is possible that small speech communities are more amenable to borrowing (and to language change in general) than large speech communities, because innovations could spread among the entire community more readily. Evidence from the LWT project, however, is inconclusive. These topics are left for future research.

Finally, it should be emphasized that the publication of this book is by no means the end of the Loanword Typology project. The World Loanword Database (WOLD) will be online for years to come, and will hopefully serve as a resource tool for future studies. The database will be updated periodically, and additional language databases may also be added in the future.

References

Campbell, Lyle. 2004. Historical linguistics: An introduction. 2nd edn. Cambridge, MA: MIT Press.
Greenberg, Joseph. 1957. Essays in linguistics. Chicago: University of Chicago Press.
Moravcsik, Edith. 1975. Borrowed verbs. Wiener Linguistische Gazette 8:3–30.
Swadesh, Morris. 1971. The origin and diversification of language. Sherzer, Joel (ed.). Chicago: Aldine.
Wohlgemuth, Jan. 2009. A typology of verbal borrowings. Berlin: Mouton de Gruyter.

Chapter 1

Loanwords in Swahili*

Thilo C. Schadeberg

1. The language and its speakers

Swahili is spoken by approximately 75 million people in eastern and central Africa. The great majority of speakers live in Tanzania, Kenya and Uganda, where a standardized variety of Swahili has the status of national language. Coastal dialects and lingua franca varieties used outside Tanzania may deviate considerably from East African Standard Swahili, which is the variety documented in the LWT database.¹

Standard Swahili is based on Kiunguja, the dialect of Zanzibar City. “However, whereas Kiunguja has retained its distinctiveness as a dialect, standard Swahili has continued to expand and to market itself as a radically modernized version of Kiunguja” (Mkude 2005: 2). There may be only two or three million speakers of Kiunguja and other coastal varieties of Swahili (such as Kimvita from Mombasa), but the number of people for whom Standard Swahili is the first language (in a chronological sense) or the primary language (in terms of competence, importance, and usage frequency) is many times larger and rapidly growing in the urban centers.

Standard Swahili fills a wide range of functions. It is spoken at home, in the market and in shops, at work, in school, at religious and political meetings and in parliament, and it is the normal daily language of radio and television broadcasts. Swahili newspapers, journals and books of all kinds (fiction and non-fiction) have been published for many decades. Swahili is the language of traditional but still popular poetry and music, as well as of all modern genres of pop and rap.

In Tanzania, where Swahili has the status of official language on a par with English, the progress of standardization is monitored by the National Swahili Council known as BAKITA (Baraza la Kiswahili la Taifa). BAKITA cooperates with similar institutions in Kenya and Uganda. The East African Community and the African Union have recognized Swahili as one of their working languages.
Swahili is scientifically documented and analyzed at academic institutes in Dar es Salaam and elsewhere in East Africa. It is also studied and taught worldwide at numerous universities in Europe, North America and East Asia. Several of these countries provide Swahili broadcasts for listeners in East Africa.

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as follows: Schadeberg, Thilo C. 2009. Swahili vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1625 entries.

¹ The external and internal genealogical classification of Swahili is dealt with at the start of §3.

2. Sources of data

Swahili is a well-documented language, particularly with regard to its lexicon, and loanwords have received much attention in the literature. The Swahili database is based on the perusal of dictionaries and loanword studies.² The documentation of Swahili lexicon (and grammar) started in the 19th century with the works of Ludwig Krapf and Edward Steere. The most complete dictionary representing pre-standard Swahili is Sacleux (1939). The same year also saw the publication of the first “Standard Swahili” dictionary by Johnson (1939a, 1939b). All these dictionaries attempted to mark loanwords, with varying degrees of precision in identifying the donor language and the particular source word.

Two important book-length studies of Swahili loanwords are Krumm (1940) and Lodhi (2000). Zawawi (1979) has a strong bias towards postulating loanwords from Arabic, which leads her to claim some fanciful etymologies for Swahili words that have undisputed Bantu origins. Geider (1995) provides a useful introduction to the study of loanwords (and neologisms) with annotated bibliographical references. The monumental monograph by Nurse & Hinnebusch (1993) on the linguistic history of Swahili is probably unique for a language that has only been written for a relatively short time.

3. History and contact situations

Swahili is a Bantu language. Its closest relatives are the Sabaki languages, i.e., Elwana and Pokomo spoken along the Tana River in Kenya, the Mijikenda varieties (e.g. Giryama) spoken in the immediate hinterland of Swahili towns and settlements along the Kenyan coast, and Comorian spoken on the Comoro Islands. A partial genealogical tree of Swahili is given in Figure 1 (based on Nurse & Hinnebusch (1993), chapter 1; see also their Map 1, p. 40).

² Dictionaries not mentioned in the main text: (Swahili:) Höftmann with Mhando (1963), Höftmann with Herms (1979), Legère (1990), Sacleux (1949), TUKI (1981, 1996, 2001), Velten (1910, 1933); (Arabic:) Dozy (1881), Grosset-Grange (1993), Kazimirski (1846–1860), Kirkeby (2000), Lane (1863–1893); (Hindi:) Platts (1977), Wagenaar (1993); (Malay:) Wilkinson (1901, 1932); (Persian:) Steingass (1892); (other languages:) Kisbey (1906), Worms (1898); loanword studies and other specialized studies not mentioned in the main text: Baldi (1988), Batibo (1996), Broomfield (1931), Chuwa (1988), Eastman (1991), Gower (1952), Gromova (2000), Holes (2001), Knappert (1972–1973, 1983, 1989), Krumm (1932), Lafon (1983), Legère (1987), Lodhi (1994), Maganga & Schadeberg (1992), Małecka (1959), McCall (1969), Nurse (1988), Pasch & Strauch (1998), Růžička (1953), Teubner (1974), Tucker (1946–1947), Whiteley (1967), Zawawi (1979).

Figure 1: From Proto-Bantu to Swahili

Proto-Bantu
  Northeast Coast
    Pare: Pare ~ Asu, Tuveta
    Ruvu: Gogo, Kaguru, Kami, …, Luguru, Zaramo
    Seuta: Bondei, Sambaa, Ngulu, Zigula
    Sabaki: Elwana, Pokomo, Mijikenda, Swahili, Comorian

Nurse & Hinnebusch (1993: 23) provide the following time frame: “An approximate date around or slightly later than 1 A.D. would seem reasonable for PNEC [proto-Northeast Coast], perhaps five hundred years later for PSA [proto-Sabaki], shortly after that for PSW [proto-Swahili]”. The split of Sabaki into distinct societies, and the subsequent spreading out of Swahili and Comorian along the coastline and to the offshore islands of East Africa, from Somalia in the north to Mozambique in the south, appears to have been a rapid development completed by about 800 CE.

The Swahili people never formed a single geopolitical unit but stayed organized around their cities. These cities formed a network, competing with each other and sometimes even waging war against each other. The linguistic diversity was considerable, but contact was never lost and, as economic and political power shifted from one city to another, so did the currents of linguistic influence. Such intra-Swahili borrowings have not been identified in the present study. Standard Swahili is based on the dialect of Zanzibar town, which gained prominence in the 19th century under Omani rule. We assume that Kiunguja was formed in this period, and for the purposes of the present study I consider all contributing Swahili dialects as its ancestors, concentrating on lexical borrowings from other languages.

3.1. Contact situations

Throughout its history, Swahili has been a contact language par excellence, and this common history of external contacts is important for the identity of Swahili-speaking peoples. Table 1 is an attempt to identify the contact situations in which the adoption of numerous loanwords occurred. Each situation is identified by a number followed by an estimated time period, a label characterizing the contact situation, and a list of the relevant donor languages. Not all donor languages are documented within the database. (Languages named as donors in the literature but which have probably exerted their influence through some other language are put in brackets.)

1. Loanwords in Swahili

Table 1: Contact situations

| No. | Period | Label | Donor languages |
|---|---|---|---|
| 1 | before 800 CE | Pre-Swahili | South Cushitic |
| 2 | 800 – 2000 | Hinterland neighborhood | NEC Bantu: Sambaa, Zaramo, Zigua, etc. |
| 3 | 800 – 1920 | Indian Ocean trading network | (Indian Ocean) Arabic, “Hindi” (Indian), [Persian], [Chinese], Malagasy, Malay |
| 4 | 1000 – 2000 | Arabic-dominated Islamic culture | Arabic |
| 5 | | Foreign political dominance: | |
| 5a | 1500 – 1700 | • Portuguese | Portuguese |
| 5b | 1600 – 1920 | • Omani | Arabic |
| 5c | 1800 – 1960 | • Late colonialism | English, German, French, Italian |
| 6 | 1800 – 1900 | Caravan trade | Nyamwezi |
| 7 | 1960 – 2000 | Standardization & modernization | English, Arabic, Neo-Latin |

3.1.1. Contact situation 1: Pre-Swahili (before 800)

In the first century CE, the Periplus Maris Erythraei (Casson 1989) recorded that even in pre-Swahili times, the coast of north-eastern Africa was already a southern extension of the maritime trading network that reached from the Red Sea to the Bay of Bengal, indirectly linking the Roman empire in the west with China in the East. However, we do not know what kind of language was spoken by the autochthonous population of the ancient market towns Kanbalu and Raphta in today’s Tanzania, and no loanword evidence from this period has come down to Swahili.

The Swahili database contains just one probable loanword predating the Swahili threshold, which we put at 800 CE. This is the word maziwa (class 6)3 ‘milk’, which probably entered proto-Sabaki through contact with a cattle-keeping people speaking a South Cushitic language.4 It replaced the Proto-Bantu word *ma-béede (class 6), which had the primary sense ‘breasts’ (with a singular form in class 5). Following this pattern of polysemy, the loanword maziwa subsequently acquired the sense ‘breasts’ (sing. ziwa), thus providing an example of how a basic body part can have the appearance of a loanword. (In later times, Swahili contacts with speakers of Cushitic languages, e.g. Oromo, have often been hostile and did not result in the adoption of many loanwords.)

3 Swahili nouns fall into “classes” for which Bantu linguistics has devised a conventional numbering system. Most noun classes are overtly marked on the noun by prefixes; agreement markers appear on dependent adnominals and as pronominal elements on verbs marking subjects and objects. Singular and plural forms of a noun belong to different classes, as do diminutives, augmentatives, and locatives.

4 Nurse & Hinnebusch (1993: 631) reconstruct *i-ziwa ‘milk’ as a loanword from Southern Cushitic for proto-Sabaki on the evidence that it appears in most varieties of Swahili and Sabaki, although the Bantu word mawele ‘milk’ has been preserved (or reintroduced?) in the dialect of Pemba (Sacleux 1939). The semantic extension from ‘milk’ to ‘breasts’ is a purely Swahili innovation (Nurse & Hinnebusch 1993: 296).

3.1.2. Contact situation 2: Hinterland neighborhood (800–2000)

The coastal settlements of the Swahili were not isolated from their immediate hinterland. The neighboring farming communities were mostly small and not politically organized on a large scale. Their relations with Swahili towns and villages appear on the whole to have been peaceful and mutually profitable. We may assume that immigrants from the hinterland were constantly assimilated into Swahili communities.

Identifying loans from neighboring Bantu languages is not easy because they are, more often than not, indistinguishable from cognates. It is also difficult to identify the exact donor language because of their resemblance to each other. Sacleux (1939) is the major source indicating loans from such languages, often naming several languages as possible donors, e.g., mgono ‘fishtrap’ is marked as a loan from Zaramo, Zigua, Bondei, or Nyika. I have tried to identify at least one possible source form, which was often Sambaa because of its superior lexical documentation (LangHeinrich 1921; Besha 1993).

3.1.3. Contact situation 3: Indian Ocean trading network (800–1920)

Swahili came into being in the context of a large Indian Ocean trading network, which appears to have been dominated – linguistically speaking – by seafarers speaking Arabic. Other participants in this network were speakers of Hindi-related languages from the Indian subcontinent, possibly Persian, and – appearing more sporadically on the East African coast – Malagasy, Malay, and even Chinese. Shipping on the Indian Ocean depended on the monsoon winds, the kusi blowing from the south from April to September, and the northerly kaskazi from November to March. A ship could sail from the mouth of the Indus River to the northern Swahili coast, do its trading and return home when the wind had turned around, all within one year; it would be more difficult to do this from or to one of the more southerly Swahili harbors. We therefore assume that there was intensive trading between the Swahili towns along the coastline and on the islands, and that the northern towns functioned also as entrepôts. This is how Swahili varieties (and Comorian), in spite of considerable linguistic differences, kept a certain unity and often adopted the same loanwords. Arabic was the dominant language of the Indian Ocean trade, and Arabs were no doubt frequent visitors as well as permanent residents in East Africa. We can only guess at the kind of Arabic that was spoken on the dhows sailing to and from East Africa, but we have given the label “Indian Ocean Arabic” to words having a particular affinity with items that we could trace to varieties of Arabic spoken on the southern coast of the Arabian Peninsula and on the shores of the Gulf.


Map 1: The geographical setting of Swahili

Arabic has been the most important donor language in the Indian Ocean trading network, not just for Swahili but probably also for the languages of other participating cultures. Conversely, the variety of Arabic that served as a lingua franca for the Indian Ocean trading network has itself been enriched by borrowings from those languages. As a consequence, many etymons can be found in more than one of the languages involved, and it is not always possible to identify the immediate donor language of a particular loanword in Swahili. For example, tufani ‘storm, hurricane’ could be derived from Arabic ṭūfān or from Hindi tūfān or from Persian tofān, all with more or less the same meaning (and all ultimately derived from Cantonese tai fung ‘great wind’).

Indian merchants and immigrants, too, have made lasting contributions to Swahili, although the precise donor language of such loanwords often remains unknown. Indian languages are the second most important source of Swahili loanwords in this contact situation. There is no doubt as to “Hindi” being the direct source of about a dozen words contained in the database (as well as of many more that are not). In the database, the label “Hindi” is used in its Swahili sense, applying to any language from the Indian subcontinent.5 I have used Hindi and Urdu dictionaries to identify possible loanwords, but these two languages are unlikely to be direct donor languages. Lodhi (2000) finds that most Indian loans come from Cutchi/Sindhi and Gujarati.6

The role of other languages as direct donors of Swahili loanwords is less prominent or even doubtful. Shirazi descent has enjoyed a high prestige in traditional Swahili society, and there are indeed quite a number of loanwords originating from Persian. However, since there is no convincing evidence of any actual Persian presence in East Africa, and since Persian has been a prestigious donor language for Hindi and the varieties of Arabic spoken on the eastern and southern coast of the Arabian peninsula, we assume that all or most such words entered Swahili via one of these other languages.7

Swahili influence on Malagasy is well attested, but the reverse influence is less obvious. One might think that most contacts between Swahili and Malagasy were mediated by Comorian, but only one of the (at least) three probable loans from Malagasy can easily be identified in Comorian (Shingazidja) dictionaries: divai ‘wine’ (originally < French du vin ‘some wine’; cf. Knappert 1970: 85).8

The database has (at least) one probable loan from Malay: kiazi ‘potato’ (< kəladi ‘tuber, cassava’). The item is of particular interest because, if the etymology is correct, then the borrowing must predate not just the (partial) loss of l but also the older pre-Swahili sound shift *di > zi.

A curious case is presented by the word chenza ‘tangerine’, which may well be derived from Chinese chenzə ‘k.o. orange’ (Brauner 1986). However, it seems unlikely that the borrowing is a direct result of incidental Chinese visits to East Africa.

3.1.4. Contact situation 4: Arab-dominated Islamic culture (1000–2000)

Early Swahili society is likely to have had Muslim visitors and immigrants. Since the beginning of the second millennium CE Islam has been increasingly important, and in the 12th or 13th century Swahili society adopted Islam as its dominant religion. With it, it opened itself to Arab poetry, music, learning, and way of life, adapting and integrating these and other foreign elements with local values and thus creating a unique Swahili culture. Of course, it is not possible to fully separate Arab Islamic culture from the Indian Ocean trading network, but in the context of the present loanword study it seems expedient to point out that the two are not the same, not just because large parts of the Indian subcontinent never converted to Islam, but also because the influence of Arabic and Islam continues until today, long after the Omani rulers have left Zanzibar and the dhow trade has ceased to be of any impact.

5 All “Hindi” etymologies in the database refer to Hindi or related Indo-Aryan languages, although a couple of them may find their ultimate origin in Dravidian.

6 Abdulaziz Lodhi is one of the few Swahili scholars who is well acquainted with Swahili as well as with Indian languages, having conducted field work on both sides of the Indian Ocean and having roots in both cultures.

7 Note however six words in the database having a probable Persian etymology without any indication of an alternative donor language: pamba ‘cotton’, gurudumu ‘wheel’, bwana ‘Sir’, malaya ‘prostitute’, boma ‘fortress’, balungi ‘grapefruit’.

8 The other two items are wali ‘cooked rice’ (< Malagasy vàry) and -ji-vuna ‘be proud’ (< Malagasy àvona ‘pride’), reanalysed in Swahili as a reflexive verb, thus forming a small cluster with two other reflexive verbs with almost the same meaning, but without any semantic link with the verb -vuna ‘harvest’.

3.1.5. Contact situation 5a: Foreign political dominance: Portuguese (1500–1700)

Vasco da Gama reached East Africa in 1498, and during the next two centuries Portugal colonized and terrorized the Swahili towns until the Portuguese were driven out by the Omani Arabs, retaining only the more southerly parts of the coast in today’s Mozambique. Portuguese intentions were directed less at gaining land than at monopolizing trade and extracting as much as possible from local economies. This led of course to fierce competition and repeated hostilities with Swahili cities, in which the Portuguese often proved to be on the winning side. The Portuguese community living in the Swahili cities of present-day Kenya and Tanzania was probably not large, consisting mainly of military personnel, administrators and priests. Some soldiers and sailors deserted their service and decided to stay permanently. Swahili-Portuguese bilingualism was presumably not widespread.

3.1.6. Contact situation 5b: Foreign political dominance: Omani (1700–1920)

The Omani drove the Portuguese out of Mombasa in 1698; they became a major power in the Indian Ocean and added large parts of the East African coast as well as Zanzibar to their sultanate. The Swahili coast became more and more prosperous, and in 1832 Sultan Sayyid Said moved his capital from Muscat to Zanzibar. During the 19th century, the Swahili town of Zanzibar, with an Omani ruler presiding over a truly international court, bodyguards from the Comoros and from Baluchistan, bankers from India, and a plantation workforce from all over East Africa, became the centre of East African trade where caravans were planned and financed and goods from the interior found their way onto the global market. In this period, Kiunguja, the language of Zanzibar town, was formed from southern varieties of Swahili as they were spoken on the coast of present-day Tanzania (Mrima, Mgao) but also incorporating many elements from the more northern dialect of Mombasa (Mvita). It is also the period which saw the greatest influx of foreign lexis, i.e. loans from Omani Arabic.

3.1.7. Contact situation 5c: Foreign political dominance: Late colonialism (1800–1960)

During the 19th century, Britain and other European powers gradually strengthened their presence in the Indian Ocean and eventually colonized all of East Africa. By the end of the 19th century, the British ruled Zanzibar, Kenya, and Uganda, the Germans Tanganyika (i.e., Tanzania without Zanzibar), Rwanda, and Burundi, the French the Comoros and Madagascar, the Italians southern Somalia, and the Portuguese were left with Mozambique. After World War I, the League of Nations gave mandatory rule over Tanganyika to Britain, and over Rwanda and Burundi to Belgium. Colonial rule ended in the years between 1960 and 1975.

Only English has had a lasting impact on Swahili in the form of loanwords; German colonial administration preferred to use Swahili as the official language. The one probable loan from German in the database is shule ‘school’ (< Schule).

The literature often refers to loans originating from Turkish. Almost all of them consist of military terms, e.g., singe ‘bayonet’, bimbashi ‘sergeant’. It appears that these words were used by Arabic-speaking troops of the Ottoman empire, and were then adopted and further spread by the Omani Arabs and the British. There has never been any direct military presence of Turkish troops in East Africa.

3.1.8. Contact situation 6: Caravan trade (1800–1900)

Zanzibar became the first Swahili town to organize and supervise regular long-distance trade with the interior of the continent. The Nyamwezi played an important part in this caravan trade – the mother of the famous trader Tippu Tip was a lady from Tabora – and thereby also in the early spread of Swahili-Kiunguja throughout East Africa and far into the Congo. In the same period, Zanzibar not only supplied its own plantations with forced labor from the interior, it also became a thriving centre of the international slave market.

The number of words adopted into Kiunguja during this period may be small – the only item in the database is kangara ‘k.o. beer’ (< Nyamwezi %haangál&) – but the 19th century was an important phase in the history of Swahili. It is the era in which Swahili developed into a major lingua franca and as such spread over large parts of eastern and central Africa. Swahili varieties spoken in the Congo have assimilated more lexical items and features of phonology and grammar that entered Swahili in the times of the caravan trade than has modern East African Standard Swahili, which is documented here.

3.1.9. Contact situation 7: Standardization and modernization (1960–2000)

Official standardization of Swahili began under British auspices in the 1920s. The Zanzibar city dialect Kiunguja was chosen, rather than Kimvita from Mombasa or Kiamu from Lamu which enjoyed higher literary prestige, because it was the variety that had been spread by the 19th-century caravan trade, by early 20th-century anticolonial movements (Majimaji), and by German and British colonial administration and missionary activities. For the purposes of the present study, we subsume the pre-independence period of standardization under the contact situation of British dominance (“Late colonialism”).

After independence, top-down standardization was most actively pursued in Tanzania through BAKITA, which saw one of its main tasks in the creation of new lexical items to serve the needs of spreading the use of Swahili to all facets of a modern state and its institutions. In addition to institutions like BAKITA, translators and academics have been constantly creating new terms for their publications. In all these situations where the creation of a loanword precedes its use (if indeed it ever gains acceptance), no contact with any living speaker of the donor language is needed; a dictionary is sufficient (cf. mbetula ‘birch’ < Latin betula).

Although BAKITA established word-creation principles disfavoring foreign loans, especially those from English, many new loanwords have entered Swahili in recent decades, the majority from English. The lexical growth of Swahili is much too vigorous to be effectively controlled by BAKITA or any other regulating institutions.

3.2. The identification of donor languages, contact situations, and associated periods

The historical contact situations as described above typically involve more than one language in addition to Swahili. In the context of the Indian Ocean trading network, it has often been impossible to trace a loanword to a particular donor language. The lexical dataset lists almost all possible combinations of participating languages as donors:

• Arabic or Hindi
• Arabic or Persian
• Arabic or Persian or Hindi
• Arabic or Persian or Hindi or Malagasy
• Arabic or Portuguese
• Hindi or Malay or Malagasy
• Hindi or Portuguese
• Indian Ocean Arabic or Hindi
• Indian Ocean Arabic or Persian
• Indian Ocean Arabic or Persian or Hindi
• Indian Ocean Arabic or Persian or Hindi or Portuguese
• Persian or Malagasy
• Persian or Hindi

Table 2 provides a simplified list of donor languages occurring in the dataset, each with the absolute number of loanwords and the corresponding percentage (based on 1610 words). Where ranges are given, the first (lower) figure refers to loans that have unambiguously been traced to that particular donor language, the second (higher) figure includes loans that could also have come from one of the other languages.

Table 2: Donor languages for loanwords in Swahili

| Donor language | Loanwords | Percentage |
|---|---|---|
| Arabic (incl. Indian Ocean Arabic) | 262–321 | 16.3–20.0 % |
| English | 74 | 4.6 % |
| Portuguese | 14–16 | 0.9–1.0 % |
| Hindi | 11–62 | 0.7–3.9 % |
| Persian | 6–54 | 0.4–3.4 % |
| Malagasy | 3–6 | 0.2–0.4 % |
| Sambaa | 2 | 0.1 % |
| Malay | 1–2 | 0.1 % |
| Chinese | 1 | 0.1 % |
| German | 1 | 0.1 % |
| Italian | 1 | 0.1 % |
| Neo-Latin | 1 | 0.1 % |
| Nyamwezi | 1 | 0.1 % |
| South Cushitic | 1 | 0.1 % |
| Unidentified | 1 | 0.1 % |
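The percentages in Table 2 relate each count to the 1610 words of the dataset. As a rough illustration of that arithmetic, the following Python sketch recomputes the shares for the donors with the largest counts. The names `TOTAL_WORDS`, `COUNTS` and `share` are illustrative and not part of the database; the counts are copied from the table.

```python
TOTAL_WORDS = 1610  # size of the Swahili dataset as stated in the text

# (lower, upper) loanword counts copied from Table 2.
# The lower figure counts unambiguous attributions only; the upper figure
# also includes loans that may have come from another donor language.
COUNTS = {
    "Arabic": (262, 321),
    "English": (74, 74),
    "Portuguese": (14, 16),
    "Hindi": (11, 62),
    "Persian": (6, 54),
    "Malagasy": (3, 6),
}


def share(count, total=TOTAL_WORDS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


for donor, (low, high) in COUNTS.items():
    if low == high:
        print(f"{donor}: {share(low)} %")
    else:
        print(f"{donor}: {share(low)}-{share(high)} %")
```

With one-decimal rounding this reproduces most of the printed figures; the upper bound for Arabic comes out at 19.9 rather than the printed 20.0, a small difference that the text does not explain.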

The two major donor languages Arabic and English both occur in more than one historical context. Arabic has been identified as a donor language in as many as four different contact situations. Unfortunately, we have no formal ways of determining the age of a particular loanword. Nurse & Hinnebusch venture the opinion that “the largest identifiable set of borrowed lexis almost certainly stems from Omani Arabic in the last three centuries or so” (1993: 321). Since the Omani period is contained within the period of the Indian Ocean trading network, we subsume all loans from Arabic under the label “Indian Ocean” in the database, except for rare cases such as raisi ‘president’ – the word makes its debut in dictionaries only after independence and is therefore marked as a product of “modernization”. English, too, figures as a donor language in two different contact situations: (5c) “late colonialism” and (7) “standardization and modernization”. In this case, the absence of a word from all pre-1960 dictionaries is taken as an indication that the word stems from the “modernization” period. In the database, the following abbreviated labels are used to classify words according to their origin and age:

Table 3: Labels referring to age and origin of words

| Start | End | Age | Contact situation(s) |
|---|---|---|---|
| –3000 | 1 | Earlier Bantu | – |
| 1 | 500 | Proto-North-East-Coast | – |
| 500 | 700 | Proto-Sabaki | Pre-Swahili (1) |
| 700 | 800 | Proto-Swahili | Indian Ocean (3) |
| 800 | 1940 | Pre-modern Swahili | Neighborhood (2), Indian Ocean (3/4/5b) |
| 1500 | 1700 | 1500–1700 | Portuguese (5a) |
| 1800 | 1940 | 1800–1940 | Indian Ocean (3/4/5b), Caravans (6), British (5c) |
| 1940 | 2000 | 1940–2000 | Modernization (7) |
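The dictionary-based dating rule for English loans stated before Table 3 (a word absent from all pre-1960 dictionaries is assigned to the modernization period) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the input format (publication years of the dictionaries in which a word is attested) and the returned label strings are hypothetical and do not reflect how the database itself is organized.

```python
from typing import Iterable

# Cutoff taken from the text: absence from all pre-1960 dictionaries
# is read as a sign that the word belongs to the modernization period.
CUTOFF_YEAR = 1960


def classify_english_loan(attestation_years: Iterable[int]) -> str:
    """Assign an English loan to a contact situation by earliest attestation.

    attestation_years: publication years of dictionaries listing the word
    (a hypothetical input format chosen for this sketch).
    """
    years = list(attestation_years)
    if not years:
        return "undetermined"  # no dictionary evidence at all
    if min(years) < CUTOFF_YEAR:
        return "5c Late colonialism"
    return "7 Standardization and modernization"


# Publication years of two dictionaries cited in the chapter.
print(classify_english_loan([1939, 1981]))
print(classify_english_loan([1981, 2001]))
```

The rule is deliberately one-sided: an early attestation is positive evidence for the colonial period, whereas a late first attestation is only indirect evidence, which is why the text speaks of an “indication”.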

4. Numbers and kinds of loanwords

The Loanword Typology meaning list includes many concepts which are not universally lexicalized, if only because they refer to things and concepts that are not present in all climates and cultures, e.g. ‘arctic lights’, ‘boomerang’, ‘birth certificate’, ‘citrus fruit’. I have tried to find Swahili translational equivalents for as many meanings as I could, even if such terms may not be known to or actively used by every speaker of Swahili. In the database, I have avoided using the label “Meaning irrelevant to speakers” as a pre-defined reason for not giving a translation. Swahili speakers have access to all areas of modern knowledge, through the internet and all the other modern media, and they can and do talk in Swahili about all they see, hear and read. This approach may have boosted the number of loanwords in Swahili compared to languages of lesser functionality.

The total percentage of loanwords in the Swahili database is somewhere between 25 and 30%. Table 4 shows how loanwords are unevenly distributed across semantic fields. The figures in Table 4 show that English influence is concentrated on the semantic field Modern world, including (modern) clothing and the (modern) legal system. Arabic influence, on the other hand, evenly affects all other fields which were subject to intensive borrowing in the course of Swahili history. The specific contributions of other donor languages are not apparent from the given Loanword Typology semantic fields. For example, Portuguese has left clear marks on the terminology related to shipbuilding (and, less importantly, also card playing). Hindi is an important donor language, inter alia, in the semantic field of “food culture” (cf. Lodhi 2000: 83ff.) – yet this is not apparent from the figures of the present study.

Semantic word classes (“parts of speech”) differ in how easily they are borrowed. Table 5 shows how loanwords are distributed over the given Loanword Typology word classes.

Table 4: Loanwords in Swahili by donor language and semantic field (percentages)

| No. | Semantic field | Arabic | Total loanwords | Nonloanwords |
|---|---|---|---|---|
| 1 | The physical world | 12.4 | 18.3 | 81.7 |
| 2 | Kinship | 13.4 | 15.3 | 84.7 |
| 3 | Animals | 6.2 | 9.8 | 90.2 |
| 4 | The body | 9.0 | 9.9 | 90.1 |
| 5 | Food and drink | 13.9 | 29.2 | 70.8 |
| 6 | Clothing and grooming | 11.1 | 43.4 | 56.6 |
| 7 | The house | 19.3 | 37.5 | 62.5 |
| 8 | Agriculture and vegetation | 5.8 | 23.0 | 77.0 |
| 9 | Basic actions and technology | 12.8 | 23.1 | 76.9 |
| 10 | Motion | 6.7 | 15.4 | 84.6 |
| 11 | Possession | 41.4 | 48.1 | 51.9 |
| 12 | Spatial relations | 17.6 | 20.2 | 79.8 |
| 13 | Quantity | 33.9 | 33.9 | 66.1 |
| 14 | Time | 23.1 | 26.3 | 73.7 |
| 15 | Sense perception | 12.6 | 17.4 | 82.6 |
| 16 | Emotions and values | 39.0 | 46.8 | 53.2 |
| 17 | Cognition | 40.6 | 46.0 | 54.0 |
| 18 | Speech and language | 17.2 | 21.2 | 78.8 |
| 19 | Social and political relations | 37.9 | 47.5 | 52.5 |
| 20 | Warfare and hunting | 11.9 | 17.8 | 82.2 |
| 21 | Law | 41.1 | 54.7 | 45.3 |
| 22 | Religion and belief | 47.5 | 55.7 | 44.3 |
| 23 | Modern world | 15.1 | 73.6 | 26.4 |
| 24 | Miscellaneous function words | 7.1 | 7.1 | 92.9 |
| | all words | 17.8 | 27.8 | 72.2 |

Remaining donor columns (non-empty cells only, from top to bottom):

English: 1.2 1.9 0.9 0.6 6.1 18.8 6.6 5.0 4.2 2.8 1.9 1.3 1.5 1.9 1.6 1.5 2.2 9.4 43.7 4.6
Hindi: 2.6 0.9 0.3 1.1 4.6 6.1 1.4 4.6 3.0 2.5 1.3 2.4 1.9 1.2 2.9 0.7 1.0 1.4 7.4 2.0
Persian: 2.0 0.9 1.1 4.6 2.8 2.7 1.0 2.3 1.5 1.3 0.5 0.5 2.7 1.2 6.7 3.0 1.4 3.4 1.6
Portuguese: 3.1 4.3 2.8 2.5 0.9 2.0 3.1 2.7 4.0 0.9
Other: 0.9 3.8 5.5 0.5 0.6 1.6 1.5 2.0 2.7 0.9

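Table 5: Loanwords in Swahili by donor language and semantic word class (percentages)

| Word class | Arabic | English | Hindi | Persian | Portuguese | Other | Total loanwords | Nonloanwords |
|---|---|---|---|---|---|---|---|---|
| Nouns | 19.2 | 7.1 | 3.0 | 2.3 | 1.6 | 1.3 | 34.3 | 65.7 |
| Verbs | 14.5 | 1.0 | 0.1 | 0.1 | | 0.3 | 16.0 | 84.0 |
| Adjectives | 19.5 | 0.7 | 1.9 | 1.5 | | 0.7 | 24.3 | 75.7 |
| Adverbs | 14.3 | | | | | | 14.3 | 85.7 |
| Function words | 14.9 | | | | | | 14.9 | 85.1 |
| all words | 17.8 | 4.6 | 2.0 | 1.6 | 0.9 | 0.9 | 27.8 | 72.2 |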

In absolute numbers, Swahili has borrowed five times more nouns than verbs and ten times more nouns than adjectives. Looking at the percentages, verbs and adjectives switch their ranking. There is an interesting correlate of this. When we look at nouns and verbs (as defined in the Loanword Typology meaning list), most Swahili translations fall into the equivalent categories (as defined in Swahili grammar). By contrast, only about one third of “semantic” adjectives (about 50) are also “grammatical” adjectives in Swahili. Within the class of “grammatical” adjectives, about one third of the items are probable loans. Thus, Swahili has borrowed “adjectival concepts” with relative ease, somewhere in between nominal and verbal concepts, but it has been reluctant to accommodate borrowed adjectives within the salient but rather small class of grammatical adjectives. (See next section for the difficult integration of borrowed adjectives.)
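The ranking by percentage can be read off the “Total loanwords” column of Table 5. The following Python sketch hard-codes those published percentages and sorts the word classes by loanword share; it also checks that loanword and non-loanword shares add up to 100 in every row. The variable and function names are illustrative only, and the absolute counts behind the percentages are not reproduced here because the chapter does not list them.

```python
# (total loanwords %, non-loanwords %) per word class, copied from Table 5.
TABLE5 = {
    "Nouns": (34.3, 65.7),
    "Verbs": (16.0, 84.0),
    "Adjectives": (24.3, 75.7),
    "Adverbs": (14.3, 85.7),
    "Function words": (14.9, 85.1),
}


def rank_by_loan_share(table):
    """Return the word classes ordered from highest to lowest loanword share."""
    return sorted(table, key=lambda word_class: table[word_class][0], reverse=True)


def shares_are_complementary(table, tolerance=0.05):
    """Check that loanword and non-loanword percentages sum to 100 per row."""
    return all(abs(loan + native - 100.0) <= tolerance
               for loan, native in table.values())


ranking = rank_by_loan_share(TABLE5)
for word_class in ranking:
    print(f"{word_class}: {TABLE5[word_class][0]} %")
print("rows sum to 100:", shares_are_complementary(TABLE5))
```

In this ordering adjectives come directly after nouns and before verbs, which is the switch in ranking that the paragraph describes.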

5. Integration of loanwords

Most loanwords need to be adapted to function as Swahili words. We distinguish phonological and morphological adaptation.

5.1. Phonological integration

Phonological integration concerns the adaptation of foreign sounds, foreign sound sequences, and foreign stress patterns. Arabic has a much larger consonant inventory than Proto-Swahili. Standard Swahili has adopted some new phonemes which occur only in loanwords: the dental fricatives th and dh and the voiced velar fricative gh. Not all inland speakers of Swahili adhere to the standard pronunciation of these consonants – common substitutes are s, z, and g. Conversely, some coastal speakers have a larger inventory of faithful reproductions of Arabic consonants, the most common “extra” sound (and grapheme) being the voiceless velar fricative kh.

The consonant h occurs natively only in grammatical morphemes (i.e., proximal and referential demonstratives, and the habitual and the negative pre-initial markers hu- and ha-); its appearance in lexical items is a sign of borrowing, e.g. huru ‘free’ (< Arabic ḥurr), muhanga ‘aardvark’ (< Sambaa mhanga).

The contrast between l and r very likely has its origin in interdialectal borrowings (not identified in the present study). An example is the minimal pair mlima ‘mountain’ versus Mrima, the name of the coastal area in Tanzania north of the river Rufiji. The phonemic contrast has been reinforced by the introduction of loanwords, e.g., kalamu ‘pen’ (< Arabic qalam) versus karamu ‘feast’ (< Arabic karam ‘generosity’). Elsewhere in East African Bantu, including some varieties of Swahili, l and r are often allophones, both reflecting Bantu *d.

Other foreign sounds have been merged with their closest equivalents; this concerns inter alia emphatic (pharyngealized) and geminate consonants from Arabic and dental versus retroflex stops from Indian languages. Swahili has also eliminated foreign distinctions of vowel qualities including length and tone.

Inherited Swahili words consist of two or more syllables of the shape ((N)C)V. (A syllable may also consist of just a nasal.) Borrowed words are often adapted to the Swahili norm, usually by inserting vowels into consonant clusters and after word-final coda consonants. The quality of the inserted vowel depends largely on the phonological environment; e.g., barafu ‘ice’ (< Persian/Hindi barf); karatasi ‘paper’ (< Arabic qarṭas). In other borrowed words, certain consonant sequences are admitted, some in onset position (e.g., bluu ‘blue’ < English), some medially (with uncertain syllabification, e.g. bu.sta.ni or bus.ta.ni ‘garden’ < Arabic/Persian bustān). There is much lexical, individual and stylistic variation regarding the degree of adaptation to Swahili syllable structure, which can be tested by using such words in metrically fixed poetry. Thus, we find buluu next to bluu, and mahakama next to mahkama ‘law court’ (< Arabic maḥkama(t)).

Swahili has a very regular penultimate stress; e.g., kúla ‘to eat’, kulála ‘to sleep’, kuamúa ‘to decide’. Some borrowed words may or must have stress on the antepenultimate syllable; e.g., lázima, less common lazíma ‘necessity, must’ (< Arabic lāzim(an)); barábara ‘proper’ (< Hindi/Persian bar-ā-bar) contrasting with regular barabára ‘big road’.

5.2. Morphological integration

Shifting to morphological adaptation, we observe that each of the major categories noun, adjective, and verb has its own characteristic paradigm to which loanwords have to be adapted, or else the paradigm has to be amended.

Each borrowed noun has to be assigned to one of the classes or, in the case of a count noun, to one of the pairs of classes indicating singular and plural. The choice of class is based on a mix of semantic and phonological criteria. Several strategies exist, listed here in decreasing order of frequency:

• Assignment to classes 9/10: This strategy does not require any change in the form of the borrowed word since (i) there is no synchronically segmentable nominal prefix, and (ii) both classes are identical in shape (they only differ in their agreement morphemes); e.g., samaki n 9/10 ‘fish’ (sg/pl) (< Arabic samak).
• Assignment to classes 5/6: This strategy is almost as undemanding as the previous one, since class 5 has a zero prefix in most cases (except when the stem is monosyllabic and also with some disyllabic vowel-initial stems). The corresponding plural class 6 (prefix ma-) is used as a default strategy for the formation of plurals. Example: kaburi / makaburi n 5/6 ‘grave’ (< Arabic qabr).
• Reanalysis of initial C(V)-syllable as a noun class prefix: Examples: mw-alimu / w-alimu n 1/2 ‘teacher’ (< Arabic muʿallim); m-sumari / mi-sumari n 3/4 ‘nail’ (< Arabic mismār); ma-radhi n 6 ‘disease’ (< Arabic maraḍ); ki-mau / vi-mau n 7/8 ‘tunic’ (< Portuguese quimão); u-mri n 11 ‘age’ (< Arabic ʿumr).
• Addition of a noun class prefix: Examples: m-shumaa / mi-shumaa n 3/4 ‘candle’ (< Arabic šamaʿa(t)); n.gamia n 9/10 ‘camel’ (< Arabic ǧamal).

Borrowed adjectives and verbs are morphologically less integrated than nouns. Native adjectives take a nominal prefix in agreement with the class of the head noun; most borrowed adjectives are invariable stems. Compare: kitambaa kichafu / vitambaa vichafu ‘a dirty cloth / dirty cloths’ (native) versus kitambaa safi / vitambaa safi ‘a clean cloth / clean cloths’ (< Arabic ṣāf(in)).

Native underived verb stems consist of a root and a suffix (often called “Final Vowel”) which is -i in the General Negative, -e in the Optative and some forms of the Imperative, and -a elsewhere. Most borrowed verb stems end in i, u or e and cannot be morphologically segmented. Compare native ku-som-a ‘to read’ : u-som-e ‘you should read’ versus borrowed ku-jibu ‘to answer’ : u-jibu ‘you should answer’ (< Arabic ǧāba (II)). However, a few borrowed verbs end in -a and are inflected just like native verbs; e.g., ku-tawal-a ‘to rule’ : u-tawal-e ‘you should rule’ (< Arabic tawallā (V)).

Borrowed words can freely be used as bases for native derivational processes. For example, the causative suffix -ish- and the verbalizing final vowel suffix -a can be added to borrowed nouns, adjectives and verbs: haraka n. ‘hurry’ (< Arabic ḥaraka(t)) > -harakisha v. ‘to hurry’, safi a. ‘clean’ (< Arabic ṣāf(in)) > -safisha v. ‘to clean’, -dumu v. ‘to last’ (< Arabic dāma (u)) > -dumisha v. ‘to make last’.

I am not aware of empirical studies about speakers’ attitudes towards loanwords. My intuition is that speakers have no marked attitude towards those loanwords that are part of their active vocabulary. However, rare or new loanwords from Arabic or English may encounter positive or negative feelings depending on a person’s general feeling towards the culture associated with the source language.
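The integration patterns of this section (class-pair prefixes for borrowed nouns, the final vowel in the Optative, and causative derivation with -ish-) can be restated as a toy model. The following Python sketch is only an illustration under simplifying assumptions: the function names and the prefix table are hypothetical, only the class pairs and prefix shapes exemplified above are covered, and the class 1/2 prefixes are given in the shape they have before a vowel-initial stem. It is not a general morphological analyzer for Swahili.

```python
# Toy model of the integration patterns described in section 5.2.
# All names here are hypothetical; the table covers only the examples in the text.

# (singular prefix, plural prefix) per class pair
CLASS_PREFIXES = {
    "9/10": ("", ""),     # no segmentable prefix; singular and plural identical in shape
    "5/6": ("", "ma"),    # zero prefix in class 5 (the common case), ma- in class 6
    "1/2": ("mw", "w"),   # shape before a vowel-initial stem, as in mw-alimu / w-alimu
    "3/4": ("m", "mi"),   # as in m-sumari / mi-sumari
    "7/8": ("ki", "vi"),  # as in ki-mau / vi-mau
}


def plural(stem, class_pair):
    """Return (singular, plural) for a stem assigned to the given class pair."""
    sg, pl = CLASS_PREFIXES[class_pair]
    return sg + stem, pl + stem


def optative(stem, subject_prefix="u"):
    """Stems ending in -a replace it with -e; stems in i, u or e stay invariable."""
    if stem.endswith("a"):
        stem = stem[:-1] + "e"
    return subject_prefix + stem


def causative(base):
    """Replace the final vowel of the base with -ish-a (only the -ish- pattern of the text)."""
    return "-" + base.lstrip("-")[:-1] + "isha"


print(plural("kaburi", "5/6"))   # class 5/6 noun from Arabic qabr
print(optative("soma"), optative("jibu"), optative("tawala"))
print(causative("haraka"), causative("safi"), causative("-dumu"))
```

The verb function makes the asymmetry visible: whether a stem alternates depends only on its final vowel, which is why the few borrowed verbs in -a behave like native verbs while stems such as -jibu remain invariable.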

6. Grammatical borrowing

Arabic and, more recently and to a lesser extent, English have had some influence on Swahili morphology, syntax, and style. Morphological additions are (i) the class of verb stems which cannot be segmented into root and “final vowel” suffix and (ii) a class of invariable adjectives without a nominal class prefix. Neither kind of word was completely new: (i) there is one native verb stem not ending in -a, i.e., -keti ‘sit’ (historically derived from the Mombasa form -kaa nti ‘sit ground’), and (ii) non-agreeing adnominal modifiers existed at least among numerals, e.g., kenda ‘nine’ (archaic).

Arabic loanwords have boosted the originally small number of words serving as the equivalents of conjunctions and prepositions, for example lakini ‘but’ (< Arabic lakin), bila ‘without’ (< Arabic bilā), kabla ‘before’ (< Arabic qabla) and baada ‘after’ (< Arabic baʿda). The word lakini is used like a true conjunction, but most other items show their nominal character by forming heads of class 9 connexive constructions, e.g., kabla ya vita / baada ya vita ‘before/after the war’, baada ya kurudi ‘after returning’ (rarely also without connexive: baada kuambiwa ‘after having been told’). Wald (2001) concludes that this kind of construction has given more prominence to nominal clauses common in Arabic and in European languages at the expense of older verbal strategies, but without having led to their complete loss. Thus, the following two constructions are largely equivalent:

kabla     y-a     ku-ingia  …  a-li-tu-ita
9.before  9-CONN  15-enter     SM1-PAST-OM1PL-call

kabla     ha-ja-ingia            …  a-li-tu-ita
9.before  NEG.SM1-not.yet-enter     SM1-PAST-OM1PL-call

‘before entering … he called us’

On the whole, grammatical borrowing appears to be rather modest. Nurse (1997) investigates the question whether Swahili ever went through a phase of pidginization or creolization, a reasonable question given the history of Swahili as a contact language and as a lingua franca. He finds no evidence for such claims. Even the fact that Swahili has lost tone (reconstructed for earlier Bantu and present in most other Bantu languages) cannot convincingly be attributed to sociolinguistic factors.

English presently exerts a more subtle stylistic influence on certain text genres. The following journalistic passage closely follows English lexicon and syntax, especially regarding the passive verb -daiwa ‘to be claimed’ (here not referring to any debt) and the negative relative neuter verb form wasiofahamika ‘(those) who are not known’:

Inadaiwa kuwa, Aprili 6 watu wasiofahamika walifika nyumbani kwa mwinjilisti …
‘It is claimed that on April 6th, people unknown entered the house of the evangelist …’

7. Conclusion

Swahili has for a long time been regarded as an extreme case of lexical and grammatical admixture. The first sentence of the article Kiswahili in the Swahili Wikipedia is symptomatic:

Kiswahili ni lugha ya kibantu yenye misamiati mingi ya kiarabu inayozungumzwa katika eneo kubwa la Afrika ya Mashariki.
Swahili is a Bantu language with many Arabic terms that is spoken over a large area in East Africa. [my translation]

The sometimes disproportionate focus on loanwords stems from the 19th century, when trade routes through eastern and central Africa all connected to Zanzibar, which was the gate for European (and American) travelers who had just begun to discover the interior of the “dark continent”. These visitors were deeply impressed by the picturesque Omani Arab component of Swahili society and its language.

Emphasis on loanwords can be an issue loaded with conflicting emotions in Swahili studies. Some object to it on the grounds that giving undue weight to foreign influences detracts from and belittles the Africanness of Swahili language and culture. Others appear to feel proud about the presence of many loanwords, which makes Swahili a peer of such prestigious languages and cultures as Persian, Arabic and Hindi. The present study does not support common claims about an extremely mixed nature of Swahili lexis.

Lexical expansion is not merely a fact of Swahili history; it is also the way ahead as seen by many translators, educators, language planners and others who seek to promote Swahili in their particular fields. D. Mkude, talking about “the challenge of globalization” (2005: 8–11), cites Ohly as saying that “Swahili will conform to Tanzanian social reality only when 82.3% of its lexicon consists of new terms, i.e. more than 30,000 terms have to be coined”. Mkude warns against all too easily dismissing such surprising claims and the attitudes behind them, pointing out that “language needs careful nurturing and cultivation in order to function efficiently as a vehicle of modern communication”. It seems that speakers of Swahili have followed exactly that path throughout the last twelve hundred years of their history.

Acknowledgments

I wish to thank Maarten Kossmann, with whom I have enjoyed many stimulating conversations during our work on the Loanword Typology Project.

References

Baldi, Sergio. 1988. A first ethnolinguistic comparison of Arabic loanwords common to Hausa and Swahili. (Suppl. 57, Annali 48). Naples: Istituto Universitario Orientale.
Batibo, Herman M. 1996. Loanword cluster nativization rules in Tswana and Swahili: A comparative study. South African Journal of African Languages 16(2):33–41.
Besha, Ruth M. 1993. A classified vocabulary of the Shambala language with outline grammar. (Bantu Vocabulary Series 10). Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa = ILCAA.
Brauner, Siegmund. 1986. Zum Verhältnis von Kultur- und Sprachgeschichte: Chinesische Lehnwörter im Swahili [On the relationship of culture and language history: Chinese loanwords in Swahili]. Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung 39(5):595–601.
Broomfield, G. W. 1931. Rebantuization of the Swahili language. Africa 4:77–85.
Casson, Lionel. 1989. The Periplus maris Erythraei: Text, translation, and commentary. Princeton: Princeton University Press.
Chuwa, A. R. 1988. Foreign loan words in Kiswahili. Kiswahili 55:163–172.
de Biberstein Kazimirski, A. 1846–1860. Dictionnaire arabe-français [Arabic-French dictionary]. 2 vols. Paris.
Dozy, R. 1881. Supplément aux dictionnaires arabes. 2 vols. Leiden.
Eastman, Carol M. 1991. Loanwords and Swahili nominal inflection. In Blommaert, J. (ed.), Swahili studies: Essays in honour of Marcel Van Spaandonck, 57–77. Ghent: Academia Press.
Geider, Thomas. 1995. Lehnwort- und Neologismenforschung [Loanword and neologism research]. In Miehe, G. & Möhlig, W. J. G. (eds.), Swahili-Handbuch, 323–337. Cologne: Rüdiger Köppe.
Gower, R. H. 1952. Swahili borrowings from English. Africa 22:154–157.
Gromova, Nelli V. 2000. Borrowings from local Bantu languages in Swahili. In Kahigi, K. & Kihore, Y. & Mous, M. (eds.), Lugha za Tanzania / Languages of Tanzania: Studies devoted to the memory of Prof. Clement Maganga, 43–50. Leiden: Research School for Asian, African and Amerindian Studies = CNWS.
Grosset-Grange, Henri. 1993. Glossaire nautique arabe ancien et moderne de l’Océan Indien (1975). Texte établi par Alain Rouaud. Paris: C.T.H.S.
Höftmann, Hildegard with Mhando, Stephen. 1963. Suaheli-deutsches Wörterbuch [Swahili-German dictionary]. Leipzig: Verlag Enzyklopädie.
Höftmann, Hildegard & Herms, Irmtraud. 1979. Wörterbuch Swahili-Deutsch [Swahili-German dictionary]. Leipzig: Verlag Enzyklopädie.
Holes, Clive. 2001. Dialect, culture, and society in Eastern Arabia. Vol. 1: Glossary. Leiden: Brill.
Johnson, Frederick. 1939a. A standard Swahili-English dictionary. Oxford University Press.

Johnson, Frederick. 1939b. A standard English-Swahili dictionary. Oxford University Press.
Kirkeby, Willy A. 2000. English Swahili dictionary. Skedsmokorset: Kirkeby Forlag. Also published in 2001. Dar es Salaam: Kakapela Publishing.
Kisbey, Walter H. 1906. Zigua-English dictionary. London: Society for Promoting Christian Knowledge.
Knappert, Jan. 1970. Contribution from the study of loanwords to the cultural history of Africa. In Dalby, D. (ed.), Language and history in Africa, 78–88. London: Frank Cass.
Knappert, Jan. 1972/1973. The study of loan words in African languages. Afrika und Übersee 56:283–308.
Knappert, Jan. 1983. Persian and Turkish loanwords in Swahili. Sprache und Geschichte in Afrika = SUGIA 5:111–143.
Knappert, Jan. 1989. Les mots swahilis empruntés au grec, aux langues romanes et américaines. In Rombi, M.-F. (ed.), Le swahili et ses limites: Ambiguïté des notions reçues, 41–57. Paris: CNRS.
Krapf, L. 1964 [1882]. A dictionary of the Suahili language. Ridgewood, NJ: Gregg.
Krumm, Bernhard. 1932. Wörter und Wortformen orientalischen Ursprungs im Suaheli [Words and word forms of Oriental origin in Swahili]. Hamburg.
Krumm, Bernhard. 1940. Words of oriental origin in Swahili. London: Sheldon Press.
Lafon, M. 1983. Les emprunts arabes en swahili: Notes de lecture sur le livre de Sharifa M. Zawawi, Loan words and their effect on the classification of Swahili nominals. Afrique et langage 20:47–65.
Lane, Edward William. 1955–1956 [1863–1893]. An Arabic-English lexicon. 8 vols. Reprint New York, 1955–56. London: Williams and Norgate.
LangHeinrich, F. 1921. Schambala-Wörterbuch [Shambala dictionary]. Hamburg: L. Friederichsen.
Legère, Karsten. 1987. Portugiesische Lehnwörter im Swahili [Portuguese loanwords in Swahili]. In Perl, M. (ed.), Beiträge zur Afrolusitanistik und Kreolistik (Linguistische Studien, Reihe A: Arbeitsberichte 172), 100–112. Berlin: Akademie der Wissenschaften der DDR.
Legère, Karsten. 1990. Wörterbuch Deutsch-Swahili [German-Swahili dictionary]. Leipzig: Verlag Enzyklopädie.
Lodhi, Abdulaziz Y. 1994. Arabic grammatical loans in the languages of Eastern Africa. University of Trondheim Working Papers in Linguistics 22:60–74.
Lodhi, Abdulaziz Y. 2000. Oriental influences in Swahili: A study in language and culture contacts. (Acta Universitatis Gothoburgensis, Orientalia et Africana Gothoburgensia 15). Göteborgs Universitet.
Maganga, Clement & Schadeberg, Thilo C. 1992. Kinyamwezi: Grammar, texts, vocabulary. Köln: Rüdiger Köppe.

Malecka, A. 1959. Quelques emprunts arabes dans la langue souaheli [Some Arabic loans in Swahili]. Folia Orientalia 1:141–143.
McCall, Daniel F. 1969. Swahili loanwords: Whence and when. In McCall, D. F. & Bennett, N. & Butler, J. (eds.), Boston University Papers on Africa, Vol. 3: Eastern African history, 28–73. New York: Frederick A. Praeger.
Mkude, Daniel. 2005. The passive construction in Swahili. (African Language Study Series 4). Tokyo: Research Institute for Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies.
Nurse, Derek. 1988. The borrowing of inflectional morphology: Tense and aspect in Unguja. Afrikanistische Arbeitspapiere = AAP 15:107–119.
Nurse, Derek. 1997. Prior pidginization or creolization in Swahili. In Thomason, S. G. (ed.), Contact languages: A wider perspective, 271–294. Amsterdam: John Benjamins.
Nurse, Derek & Hinnebusch, Thomas J. 1993. Swahili and Sabaki: A linguistic history. Berkeley/Los Angeles: University of California Press.
Nurse, Derek & Spear, Thomas. 1985. The Swahili: Reconstructing the history and language of an African society, 800–1500. Philadelphia: University of Pennsylvania Press.
Pasch, Helma & Strauch, Christiane. 1998. Ist das Klassenpaar 5/6 des Swahili ein Zwischenlager für Lehnwörter? Swahili Forum 5 (= AAP 55):145–154.
Platts, John T. 1977 [1884]. A dictionary of Urdu, classical Hindi and English. New Delhi.
Růžička, Karel F. 1953. Lehnwörter im Swahili [Loanwords in Swahili]. Archiv Orientální 21:582–603.
Sacleux, Ch. 1939. Dictionnaire swahili-français [Swahili-French dictionary]. (Travaux et Mémoires de l’Institut de l’Ethnologie 36–37). Paris: Musée de l’Homme.
Sacleux, Ch. 1949. Dictionnaire français-swahili [French-Swahili dictionary]. (Travaux et Mémoires de l’Institut de l’Ethnologie 54). Paris: Musée de l’Homme.
Steere, Edward. 1885. A handbook of the Swahili language, as spoken at Zanzibar. 3rd edn. revised and enlarged by A. C. Madan. London: Society for the Promotion of Christian Knowledge.
Steingass, Franz. 1892. A comprehensive Persian-English dictionary including the Arabic words and phrases to be met with in Persian literature, being Johnson and Richardson’s Persian, Arabic and English Dictionary revised, enlarged, and entirely reconstructed. London: Routledge & Kegan Paul.
Teubner, Johann Karl. 1974. Altaisches, fernöstliches und malaiisches Wortgut im Suaheli. Zeitschrift der Deutschen Morgenländischen Gesellschaft, Suppl. 2:629–636.
Tucker, A. N. 1946–1947. Foreign sounds in Swahili. Bulletin of the School of Oriental and African Studies 11(4):587 and 12(1):230.

TUKI (Taasisi ya Uchunguzi wa Kiswahili). 1981. Kamusi ya Kiswahili Sanifu. Dar es Salaam: Oxford University Press.
TUKI (Taasisi ya Uchunguzi wa Kiswahili). 1996. English-Swahili dictionary / Kamusi ya Kiingereza-Kiswahili. Dar es Salaam: Institute of Kiswahili Research.
TUKI (Taasisi ya Uchunguzi wa Kiswahili). 2001. Kamusi ya Kiswahili-Kiingereza / Swahili-English dictionary. Dar es Salaam: Institute of Kiswahili Research.
Velten, Carl. 1910. Suaheli-Wörterbuch [Swahili dictionary]. Vol. 2: Suaheli-Deutsch [Swahili-German]. Berlin.
Velten, Carl. 1933. Suaheli-Wörterbuch [Swahili dictionary]. Vol. 1: Deutsch-Suaheli [German-Swahili]. Leipzig: Otto Harrassowitz.
Wagenaar, H. W. 1993. Transliterated Hindi-Hindi-English dictionary. New Delhi: Allied Chambers.
Wald, Benji. 2001. Substratal and superstratal influences on the evolution of Swahili syntax: Central East Coast Bantu and Arabic. Sprache und Geschichte in Afrika = SUGIA 16/17:455–522.
Werner, Alice. 1935. Review of B. Krumm, Wörter und Wortformen orientalischen Ursprungs im Suaheli. Africa 8:120–121.
Whiteley, W. H. 1958–1959. Maisha ya Hamed bin Muhammed el Murjebi yaani Tippu Tip kwa maneno yake mwenyewe. (Supplement to the East African Swahili Committee Journals 28(2) and 29(1)). Arusha: Beauchamp Printing.
Whiteley, W. H. 1967. La classification nominal dans les langues négro-africaines. 157–174. Paris: CNRS.
Wilkinson, R. J. 1901. A Malay-English dictionary. Singapore: Kelly & Walsh.
Wilkinson, R. J. 1932. A Malay-English dictionary (romanised). 2 vols. Mytilene: Salavopoulos and Kinderlis.
Worms, A. 1898. Wörterverzeichniss der Sprache von Uzaramo [Word index of the language of Uzaramo]. Zeitschrift für afrikanische und ozeanische Sprachen 4:339–365.
Zawawi, Sharifa M. 1979. Loan words and their effect on the classification of Swahili nominals. Leiden: E. J. Brill.

Loanword Appendix

Arabic

dunia  n  world
rasi  n  cape
radi  n  bolt of lightning
maradhi  n  disease
dhahabu  n  gold
usaha  n  pus
fedha  n  silver; money
tiba  n  medicine
shaba  n  copper
sumu  n  poison
risasi  n  lead
sufuria  n  metal pot, pan
zulia  n  rug
-nakshi, -nakishi  v  carve
birika  n  kettle
dhoruba, dharuba  n  storm
nuru  n  light
bakuli  n  dish
-rudi  v  return
theluji  n  snow
sahani  n  plate
-rejea  v  return
arusi, harusi  n  wedding
gudulia  n  jug, pitcher
daraja  n  bridge, stairs
tini  n  fig
nira  n  yoke
talaka  n  divorce
zabibu  n  grape
merikebu  n  ship (oceangoing)
ami, amu  n  father’s brother
zeituni  n  olive
asali  n  honey
usukani  n  rudder
binamu  n  father’s brother’s son
sukari  n  sugar
-miliki  v  own
-hifadhi  v  keep, preserve
-haribu  v  destroy
-jeruhi  v  injure
-hasiri  v  damage
-dhuru  v  damage
-tafuta  v  look for
sarafu  n  coin
dhuria  n  descendants
yatima  n  orphan
jamaa  n  relatives, family
jibini  n  cheese
siagi  n  butter
sufu  n  wool
hariri  n  silk
maharazi  n  awl
fahali  n  bull
kofia  n  cap, hat
farasi  n  horse
baghala  n  mule
sitara, stara  n  concealing sth. that is private
samaki  n  fish
tajiri  n  rich person
dubu  n  bear
-ishi  v  live
maskini  n  poor person
ngamia  n  camel
hema  n  tent
bahili  n  stingy person
uyabisi  n  dryness, hardness; dandruff
sakafu  n  floor
dohani  n  chimney
-dai  v  claim a debt (passive: owe)
mshumaa  n  candle
rafu  n  shelf
deni  n  debt
ushuru  n  tax
-ajiri  v  engage, employ
damu  n  blood
dhakari  n  penis
kambi  n  camp
-jamii  v  have sex
mfereji  n  ditch
hai  a  alive
nafaka  n  grain
maisha  n  life
shayiri  n  barley
mshahara  n  wages
maiti  n  corpse
biashara  n  trade
kaburi  n  grave
mkasi, makasi  n  scissors, shears
ala  n  tool
soko  n  market
dhaifu  a  weak
zana  n  tool
bei  n  price
-subu  v  cast, mould
ghali  a  expensive
-kalibu  v  mould
rahisi  a  cheap, easy
afya  n  health
homa  n  fever
uzani  n  weight
Ijumaa  n  Friday
-fikiri  v  think
baada  n  after
majira  n  season
-dhani  v  think
kabla  n  before
harufu  n  smell
-amini  v  believe
mahali  n  place
ladha  n  taste
-fahamu  v  understand
-baki  v  remain
-tamu  a  sweet
-kisi  v  guess
karibu  adv  near
sauti  n  voice
dhana  n  idea
mashariki  n  east
laini  a  smooth
hekima  n  wisdom
magharibi  n  west
baridi  a  cold
busara  n  good judgement, skill
kusini  n  south
safi  a  clean
kulabu  n  hook
roho  n  soul, spirit
msalaba  n  cross
-furahi  v  be happy
mwalimu  n  teacher
mraba  n  square
-tabasamu  v  smile
-sahau  v  forget
duara  n  circle
-busu  v  kiss
dhahiri  a  clear
mstari  n  line
huzuni  n  grief
siri  n  secret
sifuri  n  zero
wasiwasi  n  worry, uncertainty
hakika  n  certainty
sita  a  six
nia  n  intention, will
saba  a  seven
tisa  a  nine
huruma  n  pity
kusudi  n  intention
ishirini  n  twenty
dhamira  n  intention, purpose
mia  n  hundred
sababu  n  cause
elfu  n  thousand
shaka  n  doubt
-hesabu  v  count
-shuku  v  suspect
zaidi  n  more
-kasirika  v  be angry
hasira  n  anger
husuda, hasada, uhasidi  n  envy, jealousy
aibu  n  shame
-tuhumu  v  suspect
umati  n  crowd
fahari  n  pride
-haini  v  betray
sehemu  n  part
-thubutu  v  dare
-saliti  v  betray
nusu  n  half
jasiri, jasiri  a, v  brave
lazima  n  need, necessity
jozi  n  pair
muda  n  a period of time
shujaa  n  hero
-jaribu  v  try
hofu  n  fear
kama  conj  if
umri  n  age (of person)
hatari  n  danger
au  conj  or
haraka  n  hurry
-tumaini, -tumai  v  hope
hotuba  n  speech, address
-dumu  v  last
-samehe  v  forgive
lugha  n  language
abadani  adv  never
sahihi  a  right, correct
-jibu  v  answer
alfajiri  n  dawn
-kiri  v  admit, acknowledge
asubuhi  n  morning
sawa  a  same, equal, right
adhuhuri  n  midday
alasiri  n  afternoon
lawama  n  blame
-ahidi  v  promise
sifa  n  praise
karatasi  n  paper
akili  n  mind, intelligence
kalamu  n  pen
kitabu  n  book
saa  n  hour; clock
juma  n  week
Alhamisi  n  Thursday
shehe, shekh, sheik

n

sheikh, wise old man

-tawala

v

rule, govern

malkia

n

queen

raia

n

citizen

huru

a

free

Mola

n

god

hekalu

n

temple

kanisa

n

church

madhabahu n

kafara

n

Arabic or Hindi hewa

n

air

kiberiti

n

matchbox

dawa

n

medicine

lozi

n

almond

sacrifice

sabuni

n

soap

n

padlock

altar; slaughterhouse

-amuru

v

command, order

-abudu

v

worship

kufuli

-sali

v

pray

hori

n

amri

v

command, order

kasisi

n

priest (Christian)

creek; trough; k.o. canoe

-tii

v

obey

-hubiri

v

preach

sonobari

n

-ruhusu

v

permit

-hutubu

v

preach

cone (of conifer)

rafiki

n

friend

statue

bless

n

v

sanamu

-bariki

adui

n

enemy

hell

time

n

n

jahanamu

wakati

jirani

n

neighbor

jini

n

demon

sultani

n

sultan

-saidia

v

help

hakimu

n

judge

shetani

n

devil

mila

n

custom

kisirani

n

omen

dini

n

religion

salama

n

peace

n

circumcision

n

tohara

msala

jeshi

n

army

toilet; prayer mat

miwani

n

spectacles

kahawa

n

coffee

askari

n

soldier

rais, raisi

n

president

silaha

n

weapon, arms

jinai

n

crime

Arabic or Persian

tower

n

address

n

n

anwani

bara

mnara

n

law

n

cigarette

mainland, continent

sheria

sigara bila

n, without prep

bata

n

duck

mdudu

n

insect

n

poem

tasa

n

metal bowl

n

linen

mahakama n

court

-hukumu

v

adjudicate

hukumu

n

judgment

kadhi

n

judge

shahidi

n

witness

yamini

n

oath

-shtaki, -shitaki

v

accuse

-laani

v

shairi -himili

v

support, sustain

kitani joho

n

k.o. cloak

-badili

v

change, substitute

barakoa

n

k.o. veil

tofali

n

brick

-salimu

v

greet

bustani

n

garden

mara

n

a time, a turn; suddenly

kilele

n

top, summit

stadi

a

clever

n

mosque

curse, condemn

hatia

n

guilt, fault

Hindi

n

India

msikiti

adhabu

n

penalty, punishment

-halifu

v

commit a crime

Arabic or Hindi or Persian

mali

n

property

bahari

n

sea, ocean

adultery

hali

n

state, condition

tufani

n

hurricane

pilipili

n

chili pepper

each, every

suruali

n

trousers

asherati, uasherati zinaa

n n

adultery

kila

a

dusumali

n

colourful headscarf

simu

n

stovu

n

stove

tumbaku

n

tobacco

ndimu

n

lime (fruit)

Indian Ocean Arabic, Hindi or Persian

blangeti, blanketi

n

blanket

msumari

n

nail

desturi

mota

n

mortar

plau

v

plough/plow

sherisi

n

carpenter’s glue

sepetu, sepeto

n

spade, shovel

jahazi

n

ship, dhow, vessel

reki

n

rake

muoki

n

oak

gamu

n

glue, gum

gluu

n

glue

feni

n

fan

bumarengi

n

boomerang

English

dansi

v

dance

savana

n

savanna

ekseli

n

axle

n

family

sleji

n

sledge, sled

kangaroo

bili

n

bill

kona

n

corner (of road, in soccer)

wiki

n

week

n

telephone

custom

Indian Ocean Arabic, Hindi, Persian or Portuguese meza

n

table

bandari

n

port

duka

n

shop

tayari

a

ready

bahati (njema)

n

(good) luck

barabara (2)

a

right, correct

wazi

a

clear

familia

kahaba

n

prostitute

kangaruu

bunduki

n

gun

daktari

n

physician

serikali

n

government

oveni

n

oven

n

jug, pitcher

Chinese chenza

n

n

mandarin (fruit)

waziri

n

minister

jagi

karakana

n

workshop

soseji

n

sausage

chai

n

tea

supu

n

soup

chizi

n

cheese

bia

n

beer

gauni

n

(woman’s) dress

Arabic, Hindi, Persian or Malagasy popoo

n

areca nut

Arabic or Portuguese bomba

n

tap, faucet

Indian Ocean Arabic shwari, shuwari

n

calm sea

ghuba

n

bay

barua

n

letter

buluu, bluu a

blue

-kisi

v

kiss

skuli

n

school

helmeti

n

helmet

jaji

n

judge

faini

n

fine

jela

n

prison

redio

n

radio

televisheni

n

television

baisikeli

n

bicycle

motokaa

n

car

koti

n

coat

shati

n

shirt

kola

n

collar

sketi

n

skirt

soksi

n

sock, stocking

buti

n

boot

basi

n

bus

treni

n

train

glavu, glovu

n

glove

eropleni

n

airplane

pini

n

pin

betri

n

battery

hereni, herini

n

earring

breki

v

brake

injini

n

motor

Indian Ocean Arabic or Persian

taulo

n

towel

mota

n

motor

tufe

burashi, brashi

n

brush

mashine

n

machine

skii

n

hospitali

n

hospital

Indian Ocean Arabic or Hindi mashua

n

n

boat

spherical object

snowshoe

polisi

n

police

godoro

n

mattress

namba, nambari

n

number

bisibisi

n

screwdriver

stempu

n

benki

sinki

n

n

postage stamp

Hindi or Malay or Malagasy ngalawa

n

bank (financial institution)

dug-out with outriggers and sail

Persian pamba

n

cotton

balungi

n

grapefruit

gurudumu

n

wheel

bwana

n

master

malaya

n

prostitute

boma

n

fortress

(kitchen) sink

Hindi or Persian

skrubu, n sukurubu

screw

barafu

n

ice

dirisha

n

window

pipi

n

sweet, peppermint

rangi

n

colour, paint

Portuguese

peremende

n

sweet, peppermint

gari

n

cart, car

nanga

n

-kodi

v

hodari namna cheti

plastiki

n

plastic

bomu

n

bomb

gazeti

n

newspaper

kalenda

n

calendar

filamu

n

film, movie

muziki

n

music

leseni

n

license

kadi

n

card

shule

n

school

Hindi beberu

n

he-goat

shela

n

k.o. veil

bangili, bangiri

n

bracelet

taa

n

lamp, torch

gundi

n

glue

bati

n

tin, tin plate

patasi

n

chisel

meli

n

ship (big, oceangoing)

manjano

n

carpenter

mbatata

n

potato

anchor

korosho

n

cashewnut

hire

mvinyo

n

wine

a

clever

kimau

n

n

manner

k.o. shirt (tunic)

n

certificate

chepeo

n

hat

leso

n

handkerchief

Hindi or Portuguese

pau, pao

n

rafter

pesa

limau

n

lemon

muhogo

n

cassava

tarumbeta

n

horn, trumpet

gereza

n

prison

padri

n

priest (Christian)

n

turmeric, yellow

n

money

Italian altare

German

Persian or Malagasy

n

seremala

altar

Malagasy divai

n

wine

wali

n

rice

posta

n

post, mail

-jivuna

v

be proud

kopo

n

tin can

muhanga

n

anteater

-jigamba

v

boast

Sambaa

Malay kiazi

n

potato, sweet potato

South Cushitic

Neo-Latin mbetula

n

birch

n

milk

Unknown Origin

Nyamwezi kangara

maziwa

n

mead

shepe

n

shovel

Buki, Bukini

n

Madagascar

Chapter 2

Loanwords in Iraqw, a Cushitic language of Tanzania*

Maarten Mous and Martha Qorro

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Mous, Maarten. 2009. Iraqw vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1117 entries.

1. The language and its speakers

Iraqw is spoken in northern Tanzania, on the high plateau between Lake Manyara and Lake Eyasi, by roughly half a million speakers. It is the largest Southern Cushitic language and its closest relatives are Gorwaa, Alagwa, and Burunge, which by contrast each have about fifteen thousand speakers or fewer. These four languages form the southernmost group of Cushitic, a language family that extends north to the Sudanese-Egyptian border. Most languages in the family are spoken in Ethiopia. The geographical variation within Iraqw is negligible.

Swahili, the official and national language of Tanzania, is a second language for the vast majority of Iraqw speakers. Swahili is used in dealings with the administration, in school, and in writing. Iraqw is used in all other domains, and occasionally in formal domains as well. Protestant churches use Iraqw more than Catholic churches; very few Iraqw are Muslim. Iraqw is hardly used in written communication. The written material available in Iraqw is religious in nature (the Bible); in addition some stories, riddles, and other specimens of the rich Iraqw verbal art are available in writing.

The language has been expanding rapidly in recent history, not only because of the relatively high fertility rate, but also because the Iraqw can be characterized as an immigrant society in the sense that outsiders are welcomed but expected to become Iraqw. In particular, a considerable number of Datooga became Iraqw when they opted for a more sedentary farming lifestyle and gave up their Southern Nilotic language in the process. Iraqw is a dominant regional language.

Iraqw is surrounded by speakers of languages of different language phyla (see map): Hadza (an isolate click language), Mbugwe (Bantu, Niger-Congo), and Datooga (Southern Nilotic, Nilo-Saharan). This helps the identification of loanwords. To the north, there are the Maasai (Eastern Nilotic), with whom there is little contact. However, the fear of Maasai aggression is still recounted in Iraqw oral history. To the east are the Mbugwe, a Bantu-speaking people who were once rich in cattle. There is trade with the Mbugwe, who are famous for their basketry and pots. A visit from Iraqw to Mbugwe land involves a very steep descent into the Rift Valley. To the south are the Gorwaa, who are the closest relatives of the Iraqw. Many Gorwaa also speak Iraqw, which may explain the Iraqw perception of the closeness of the two languages. To the west are the Bantu Nyiramba and Isanzu, and the Hadza.

Map 1: Geographical setting of Iraqw

There is very little contact with the hunter-gathering Hadza, who live close to Lake Eyasi. In fact there is an area between the Hadza and the Iraqw which is claimed by the Datooga as their grazing grounds. The Datooga (Southern Nilotic) are cattle-raising nomads who are divided into several moieties. Of these, the Iraqw have intensive contact with the Gisamjeega, the Barabaiga, and the Buradik, and this has been the case for centuries. The Iraqw discuss the relationships with all their neighbors in their traditional prayer, the slufay, in which they declare their desire for peace and prosperity and above all for Iraqw expansion.

Iraqw are experienced farmers and grow a variety of grains. Most people have cattle, whose economic value lies largely in the dung, which is used as fertilizer. Cattle products are culturally salient (see Rekdal 1996). One of the key concepts in Iraqw culture is purity. A number of situations can cause (ritual) uncleanness, which requires avoidance of contact with others. Borders are crucial in the Iraqw view of their environment, and traditional border posts in the form of ritually drawn limits exist (see Thornton 1980). There is a father-in-law naming taboo in Datooga and to a limited extent among the Iraqw who are in contact with Datooga. As is common in East Africa, naming dangerous animals at night or naming dangerous diseases is avoided. Otherwise there are no taboo practices that have major linguistic consequences.

2. Sources of data

The source for the data in the subdatabase was initially Mous et al. (2002). Earlier dictionaries such as Maghway (1995) and Wada (1973) were also consulted. The latter is, despite its title, a semantically organized list of Swahili terms with their Iraqw renderings. In our dictionary we were on the conservative side with regard to the inclusion of loanwords. This dictionary has as its core the materials (elicitation, all sorts of texts) collected by Mous during his fieldwork for a grammar of Iraqw (Mous 1993). Specific work on the lemmata of the dictionary was conducted with several Iraqw speakers from different areas. In the final stages the dictionary was edited by Mous & Qorro, after which additional words, some archaic, were added from the Berger collection of the 1930s, edited by Kießling (Berger & Kießling 1998). For the purpose of the loanword subdatabase additional entries were provided by Martha Qorro, an Iraqw speaker and a linguist.

One of the difficult issues we discussed when preparing the Iraqw subdatabase was which entries should be considered as having no Iraqw equivalent. When we considered that the concept would be unfamiliar to the average Iraqw speaker we left the entry blank, even if some educated Iraqw might use a Swahili equivalent for such words. It is not uncommon in Iraqw to have loanwords and native words side by side. We have entered the native word if that is the most common form and we have entered both if the loanword is fairly common.

For the history of inherited words we made use of the lexical reconstruction of Tanzanian Cushitic in Kießling & Mous (2003) and its morphological reconstruction in Kießling (2002). Aspects of borrowing in Iraqw are discussed in Legère (1988), Kießling (1998), and Rottland & Mous (2001).


3. Contact situations

At present the language exerting most influence on Iraqw is Swahili. Swahili is the dominant national language in formal situations and the source of loanwords for new concepts. Most Iraqw people also speak Swahili, and proficiency in Swahili is rising. Iraqw-Swahili code switching does occur, but probably primarily in the urban environment outside the Iraqw homeland.

In order to better understand language choice patterns, we observed a recorded documentary on the use of development aid money and its problems among the Iraqw (Aarden et al. 2003). In this documentary, some of the interviews were conducted in Swahili, and some in Iraqw. The first 20 minutes consisted of seven short interviews conducted in Iraqw, in which there was no code switching with Swahili. In one interview a young man threw in the Swahili phrase si vizuri sana ‘it is not very good’. In two interviews a Swahili sentence introducer was used: kwanza ‘first’ and halafu ‘thus, then’. All other Swahili items in these interviews must be characterized as (nonce) borrowings. A number of interviews with Iraqw people in this documentary were conducted in Swahili. Not a single Iraqw word was introduced in these Swahili conversations, although occasionally an English word was used. Evidently, those people who felt comfortable enough in Swahili to opt for that language felt no need to use Iraqw vocabulary.

Proficiency in Swahili has not always been that high; it is a relatively recent phenomenon in the Iraqw area. Swahili knowledge in the pre-colonial and early colonial periods was limited. Iraqw oral tradition in the form of the Song of Kepa recounts that there was only one person who was able to translate into Swahili at the time of the first encounters with the German colonial administration. The Swahili that was used in colonial times was far from standard; see Kießling (1995) for a description of this variety of up-country Swahili.
There is a long history of contact between the Iraqw and the Datooga in which the balance of power changed several times (for a detailed analysis, see Kießling 1998). The central Mbulu highlands used to be a Datooga area, as is evident from many Datooga place names such as Endabash, Bashanet, and Imboru (=Mbulu). Presently there is some Iraqw-Datooga bilingualism among both Iraqw and Datooga in the areas where the two meet. More people shift from Datooga to Iraqw than the other way around. The influence of Datooga on Iraqw is from an earlier period when the Datooga were militarily and culturally dominant. Before the Iraqw settled in the area where they are now, they were already in contact with Datooga. In earlier times Datooga was the prestige language for Iraqw speakers. Datooga words are still appreciated in certain literary genres such as the girayda poetry. From the clan histories it is clear that several Iraqw clans are of Datooga origin, and thus a fair number of Datooga must have shifted to Iraqw in the past because all these clans speak Iraqw now. There are no other languages with which Iraqw is in contact in any substantial way. Even though English is taught at school and is the official language of instruction in secondary education, it has had little influence on Iraqw directly and


loanwords from English enter Iraqw through Swahili. In actual fact Swahili is used more than English in secondary schools and Tanzanians in general are more at ease in Swahili than in English, even those with advanced schooling. When Iraqw speakers practice code switching with a second language, this language is Swahili, not English.

The clan histories reveal that Iraqw society incorporated Nyiramba, Alagwa, and Sandawe speakers, among others. But these languages do not seem to have left traces in the Iraqw language. If we go further back in history, we face the problem of determining whether the borrowing language was Iraqw or a predecessor language of Iraqw. Some words can be recognized as early borrowings in a forerunner of Iraqw. For example, the word miringamoo ‘beehive, trough’ is a Bantu borrowing (not from Swahili, because Swahili has mzinga) dating from a time when the four West Rift Cushitic languages, Alagwa-Burunge-Gorwaa-Iraqw, were still one language. Given the problems that we face when comparing proto West Rift Cushitic lexicon to the rest of Cushitic, these languages must have undergone a lexical revolution. Due to lack of data, insufficient Cushitic reconstruction, and lack of early documents, we do not know the ultimate origin of many proto West Rift Cushitic lexical items. The evidence that is available suggests influence from pre-Sandawe and Southern Nilotic (pre-Datooga) into proto-West Rift; see Kießling & Mous (2003: 32).

4. Number of loanwords

There are 156 loanwords in the Iraqw subdatabase. This is a rather conservative count: we have taken only loans that are certain, and we have ignored possible loans that have a more common Iraqw equivalent. There are a few words for which we are not certain that they are borrowed. Some words look borrowed from their shape only; some have equivalent word forms in neighboring languages but the direction is unclear. An intriguing case is the word dasi ‘girl, daughter’, which is an Iraqw lexical innovation and could be a loan from Hindi (see Banti 1997), but it is unlikely that the contact between Indian shopkeepers (very few in number) and Iraqw customers sufficed for the introduction of such a common term into the Iraqw language.

This number of 156 (14%) is low compared to other languages in this survey. Iraqw is a strong language in a relatively conservative, traditional cultural context. Iraqw flourishes primarily in the rural area of Mbulu district and neighboring districts. In this context Iraqw culture is strong and the Iraqw language is highly valued by its speakers. The attitude of speakers is that for many modern concepts Iraqw words are used, for example ‘paper’ has the Iraqw equivalent


slambaré/,¹ whose original meaning is ‘piece of smooth leather’, or pampoo for the arithmetic character ‘zero’, which comes from the answer ‘nothing, zero’ in the game of guessing how many stones one has in the fist. The vast majority of loans are from Swahili (86%); Datooga, the second most important donor language, accounts for only 9%. Not unexpectedly, the vast majority of loans are nouns (93%).
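The figures quoted above can be cross-checked with a little arithmetic. The following is only a sketch: it assumes that the 14% is the share of the 156 loanwords in the 1117 entries of the subdatabase, and it converts the rounded percentages back into approximate word counts, which are not figures given in the chapter. The function names are illustrative.

```python
# Figures quoted in the chapter; treating 1117 as the denominator is an assumption.
TOTAL_ENTRIES = 1117   # entries in the Iraqw subdatabase
LOANWORDS = 156        # loanwords counted by the authors


def percentage(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def approx_count(share_percent: float, total: int) -> int:
    """Turn a rounded percentage back into an approximate word count."""
    return round(total * share_percent / 100)


loan_share = percentage(LOANWORDS, TOTAL_ENTRIES)

# Approximations only: the percentages in the text are themselves rounded.
approx_by_group = {
    "Swahili loans": approx_count(86, LOANWORDS),
    "Datooga loans": approx_count(9, LOANWORDS),
    "borrowed nouns": approx_count(93, LOANWORDS),
}

print(loan_share)
print(approx_by_group)
```

Under these assumptions the share rounds to 14%, in line with the figure in the text.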

Table 1: Loanwords in Iraqw by donor language and semantic word class (percentages)

                 Swahili  Datooga  Pre-Rangi  Pre-Sandawe  Other  Total loanwords  Non-loanwords
Nouns            21.0     1.5      0.5        0.2          0.4    23.6             76.4
Verbs            2.1      -        -          -            -      2.1              97.9
Adjectives       -        1.1      -          -            1.1    2.1              97.9
Adverbs          -        -        -          -            -      0.0              100.0
Function words   -        1.2      -          -            -      1.2              98.8
All words        12.7     1.1      0.3        0.1          0.3    14.5             85.5

Some loans are counterparts of different meanings on the Loanword Typology (LWT) meaning list. These are pilipili ‘pepper; chili pepper’ (not a common crop or food item in the area), kasíis ‘potato/sweet potato’ (especially European potatoes are common in the area and can be distinguished from sweet potatoes by using a calque from Swahili, kasíir Ulaya ‘potato-of Europe’, but this is rarely done), salami ‘plate; saucer’ (not a common item in the area), miringamoo ‘trough; beehive’ (basically half of a hollow log; ‘trough’ and ‘beehive’ are commonly referred to with the same word in East Africa). The difference in the structure of lexical meaning that is behind these doublets is not between the donor (Swahili) and the recipient language (Iraqw) but with the meta language of the list, i.e. English/European. The great advantage of this in-depth comparative study of borrowing is obviously that the languages can be compared because the same list has been used for all languages. We have to ask ourselves, though, to what extent the results from using the standard list give a representative picture of lexical borrowing in Iraqw. In order to address this issue, we examined the use of borrowings in conversation and narrative stories, as well as the number of borrowings in our dictionary that are not included in the list. There are no frequency studies on Iraqw vocabulary; what is presented below is a very limited first attempt, for loans only.

¹ Iraqw orthography uses / for the voiced pharyngeal obstruent, hh for its voiceless counterpart, ' for glottal stop, sl for the voiceless lateral fricative, y for the glide, ch for the voiceless palatal affricate, q for the voiceless uvular stop, x for the voiceless velar stop, ng for the nasal velar, ts for the voiceless ejective alveolar affricate, and tl for the voiceless ejective palatal affricate with lateral release. Long vowels are indicated by doubling the vowel symbol. High tone is indicated by an acute accent; low tone is left unmarked.


In order to get some idea about the use of loanwords in conversation, we counted the Swahili loans that were used in the Iraqw conversations in the documentary mentioned above. This documentary contains interviews with different kinds of people in the Iraqw-speaking area. Loans that occurred in these conversations were serikali ‘government’, tatizo, matatizo ‘problems’, shida ‘problem’, msaada ‘(development) aid’, kiongozi ‘leader’, viongozi ‘leaders’ (used in a generalized sense), tanki ‘water tank’, ofisi ‘office’, deebe ‘metal container’, dawa ‘medicine’. These words cannot be replaced by Iraqw words. For some of them there are Iraqw equivalents, but these would not be used in the context of modern politics. For example, dawa is used for (Western) medicine; its Iraqw equivalent maasáy is not normally used for Western medicine. Nevertheless, when filling in the word list, maasáy was felt to be the closest equivalent of ‘medicine’.

Only twice was a Swahili word used for which there would be an Iraqw alternative. The first was ugonjwa ‘illness’, for which the normal Iraqw terms are tiqti or genet (a loan from Datooga); this item does not occur in the LWT meaning list. This unexpected use of a Swahili word comes immediately after the intrusion of the Swahili phrase si vizuri sana ‘it is not very good’ and is probably triggered by the preceding switch to Swahili. The second was the use, by the interviewer, of the Swahili word hakika ‘certainty, insight’. This common and useful Swahili word to express a state of mind cannot be rendered by one word in Iraqw (or in English), and obviously does not figure in the LWT list either.

Very few Swahili words were used in conversation. The fact that most of the loans were not used in these conversations is obviously due to the topic of conversation. Within the domains of politics, government, and development aid, the loans that are used do not figure in the LWT list: leader, office, problems, aid.
Those that do occur in the LWT list in this semantic field (Modern world), such as ‘government’, ‘president’, ‘minister’, ‘police’, ‘election’ (all Swahili loans) happen not to occur in these conversations. The domain of politics does favor the use of loans in Iraqw as evidenced by the actual conversations on this topic in the documentary. This conclusion can also be drawn from the fact that various loans in the domain of politics occur in the LWT list. However, the actual loans in question do not overlap between the observed use in conversation and the elicited data in the subdatabase. Note that both tatizo ‘problem’ and its plural matatizo, kiongozi ‘leader’ and its plural viongozi are used. These words are equally common in singular and plural use and the plural forms are used more often due to the tendency to speak about ‘problems’ and ‘political leaders’ in a general sense. It cannot be considered as evidence that Swahili morphology is active in these conversations that are conducted in Iraqw. Other electronic texts that we have at our disposal are primarily specimens of verbal art. Among these, stories come fairly close to natural informal speech. We checked the loans in the story of Lách. This is one of the longer stories; the recording is about 40 minutes long, and the text contains about 4,000 words. The story is narrated by an experienced story teller, Hhawu Tarmo of HhayLoto, in a very lively way, with frequent interruptions by the audience. Very few loans were


used in this story: the name of the protagonist, Lách, is a loan from Datooga and ultimately from Sandawe, where lai means ‘hare’. The Swahili words shuulee ‘school’, shida ‘problem’, sumu ‘poison’, nafasi ‘opening, opportunity’ occur in the story. Shida and nafasi are very general and common words; shuulee occurs in the account of Lách’s childhood. These Swahili loans are completely integrated into Iraqw morphology, e.g. shidá-r doo-hung /problem:of-F house-your.PL/; thus shida is assigned feminine gender and can be used in any Iraqw nominal construction. For sumu ‘poison’, the Iraqw equivalent tsaganoo is used later in the same sentence as a kind of repair mechanism and thus sumu can be considered to be a nonce borrowing. One Swahili phrase is used in the story: maneno mabaya ‘bad words’. This cursory study of the use of loanwords in everyday Iraqw supports the impression from the subdatabase, namely that Iraqw uses relatively few borrowings.

5. Kinds of loanwords

5.1. General overview

Most loans are, not surprisingly, additive (insertions) for modern concepts and mostly from Swahili. In all semantic fields, Swahili is the number one donor language, except for the domain of domestic animals, which has more loans from Datooga. There are occasional loans from Mbugwe, Rangi, English, and Latin. We discuss the contributions from the source languages separately below. An overview of the number of loans for modern concepts and their source languages across semantic fields is presented in Table 2. For this table we have used our own dictionary files, rather than the LWT meaning list.

Table 2: Sources of loanwords for modern concepts by semantic field

Meaning area                   Source
Modern Medicine                4 Swahili, 1 Datooga, 1 English
Modern Dress                   8 Swahili, 3 Datooga, 1 Mbugwe
Modern food and utensils       16 Swahili, 1 Mbugwe
Modern Housing                 15 Swahili, 2 Datooga
Domestic animals               3 Swahili, 11 Datooga
Modern Agriculture             19 Swahili, 1 Rangi, 1?
Modern Society                 3 Swahili, 1 Datooga
Modern Transport               3 Swahili
Modern Instruments             16 Swahili, 1 Datooga
Reading, Writing, Schooling    14 Swahili
Modern Economy                 9 Swahili, 1 Latin
Modern Government              6 Swahili
Newly known Peoples            6 Swahili
New Religion                   8 Swahili
Army                           3 Swahili, 1 Datooga


The distribution in the semantic fields of the LWT meaning list is represented in Table 3. A second important category of loans is general-purpose words such as ‘time’, hyperonyms and sentence introducers/attitude markers. In these areas, Datooga and Swahili are equally important. Examples of borrowed general purpose words are iidígw ‘news’ from Datooga iidiga ‘news’, iimi ‘time’ probably from Datooga, gajéet ‘work, task’ probably from Datooga, gídabá ‘because’ from Datooga aba gid!!ba ‘because’.

Pre-Rangi

PreSandawe

Other

Total loanwords

Nonloanwords

The physical world Kinship Animals The body Food and drink Clothing and grooming The house Agriculture and vegetation Basic actions and technology Motion Possession Spatial relations Quantity Time Sense perception Emotions and values Cognition Speech and language Social and political relations Warfare and hunting Law Religion and belief Modern world Miscellaneous function words

Datooga

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

Swahili

Table 3: Loanwords in Iraqw by donor language and semantic field (in percentages)

10.3 2.0 8.2 1.4 17.5 37.3 35.2 18.9 13.1 4.9 12.7 1.6 3.7 2.5 14.3 8.2 24.0 17.9 94.3 12.7

1.4 1.7 1.6 5.3 2.1 1.7 0.8 2.1 2.5 4.1 1.1

0.3 3.2 0.8 0.8 0.3

1.8 0.1

2.0 3.2 2.1 0.3

10.3 4.0 10.0 3.1 25.5 42.6 36.0 23.2 14.8 5.7 12.7 2.4 0.0 5.5 2.1 0.0 5.0 14.3 0.0 12.3 24.0 17.9 94.3 0.0 14.5

89.7 96.0 90.0 96.9 74.5 57.4 64.0 76.8 85.2 94.3 87.3 97.6 100.0 94.5 97.9 100.0 95.0 85.7 100.0 87.7 76.0 82.1 5.7 100.0 85.5

The hyperonym (and the only word with this meaning) siiyóo ‘fish’ is possibly from Datooga (traditionally the Iraqw do not eat fish). There are no words for specific species of fish. Other West Rift Southern Cushitic languages use a more recent loan from Swahili. Sentence introducers are often borrowed because they occur at a


position where code-switching easily occurs and they serve the communicative purpose of an early and easily recognizable indication of the attitude of the speakers towards the information to come. General purpose words and hyperonyms are felt to be “gaps” in the native lexicon once the speakers are aware of the usefulness of such words due to their experience in other languages. Yet another category of loans is names. Several Iraqw personal names, geographical names and cow names are from Datooga; examples are personal names such as Barhe, Malé, the geographical name Mbulu from Imboru, the cow names siléet, digéet, lagéet, etc.

5.2. Swahili loans

Speakers do not and cannot avoid borrowing from Swahili. There is some personal variation in the extent to which Swahili is used for words for which there is an Iraqw equivalent. It is not well-regarded to use Swahili words in Iraqw conversation despite the fact that Swahili has high prestige.

5.3. Datooga loans

In an earlier stage of the language there was borrowing from Datooga (or Barabaig), a neighboring Nilotic language. Presently, the influence of Iraqw on Datooga is strong but the balance of power was different in earlier times. See Kießling (1995) for a socio-historical discussion of Iraqw-Datooga contact; for a discussion of mutual influence in the area of agriculture and animal husbandry see Rottland & Mous (2001). The relations between Iraqw and Datooga are determined by differences in their traditional economies. Datooga were migrant pastoralists, while Iraqw are sedentary mixed farmers.

A number of the Datooga loans are assumed to have entered at an earlier stage of the language, because they are shared by Iraqw’s closest relative Gorwaa (Kießling 1998: 219f.). The reconstructed (starred) forms are at the level of proto-Iraqw-Gorwaa. These occur mostly in the following domains:

• Warfare: *'oohayooda ‘a cry to gather people to fight’
• Leather work: *shaarooda ‘leather bag for meat or honey’
• Metal work: *saaqaanda ‘pair of metal spiral earrings’
• Cattle colors: *muur ‘brownish’
• Cattle disabilities: *sooni ‘barren cow’, *sooraari ‘cow without a womb’
• Flora: *yuudeeda ‘acacia sp.’, *bariyoomoodi ‘Acacia nilotica’, *geetálongo ‘tree sp.’
• Fauna: *gigiríg ~ giriríg ‘tape worm’, *haraariyooda ‘mythical giant snake’, *sakweeli ‘ostrich’
• Body parts: *daamooda ~ *daamooga ‘beard’, *gwalay ‘vagina’, *seellanée(da) ‘mane of lion’

Words from Datooga are used in the traditional poetry of girayda. These words are not normally used in Iraqw. The whole purpose of girayda is to speak in a veiled manner. This is evidence for the (former) high prestige of Datooga among Iraqw speakers. There is no such association for Swahili. Datooga is a common source for cattle names and cattle colors, e.g. areer ‘red (as cow color)’, from Datooga areera ‘red’, nawéet ‘name for cow born on the road’, from Datooga naweeda ‘road’. Some personal names are also of Datooga origin, e.g. Margwée(t), Barandí, Darabée(t); the final t is optionally pronounced in Iraqw.

5.4. Non-Swahili Bantu loans

Bantu languages other than Swahili had some influence on Iraqw. Specifically, neighboring Mbugwe is the source of the words katanti (F) ‘small basket’ and sirwi (F) ‘earthen water pot with a narrow neck (produced by Mbugwe people only)’. These words reflect items that are typically acquired from the Mbugwe. Mbugwe is also the source of the personal names (male and female) Umblá and Kimblá, both based on a Mbugwe word for ‘sheep’. (Male) names of Nyiramba origin are Banga and Bayo. There are some indications of earlier non-Swahili Bantu influence on Iraqw. Some of these words are distributed over a larger area which we discuss in the next section.

5.5. Area words

A number of loans have connections in the wider area of Northern Tanzania and beyond. The source and direction of transfer of these Wanderwörter is difficult to establish. They support the hypothesis of a Tanzanian Rift Valley linguistic area (see § 7). Some shared lexical items in the Tanzania Rift Valley area are listed below.

• ‘bull, large male animal’: Iraqw yaqamba, Alagwa (Rift S. Cushitic) yaqamba, Burunge (Rift S. Cushitic) yaqamba, Nyilamba (Bantu) nzagamba, Nyaturu (Bantu) njaghamba, nzagaamba, but also Sukuma (Bantu): yagambá, nzagaamba; also widespread in West Tanzania and in Central Kenya Bantu languages.
• ‘ram’: Iraqw gwanda, Burunge (Rift S. Cushitic) gondi, Alagwa (Rift S. Cushitic) gwandu, Datooga (S. Nilotic) lagweenda, Mbugwe (Bantu) "oondi; but also Sukuma (Bantu) goondi, Mbugu (Bantu) igonji ‘sheep’, Nara (Bantu) "#ndi ‘sheep’, etc.
• ‘boys’: Iraqw masomba, Alagwa (Rift S. Cushitic) masomba, Asax (S. Cushitic) msumbe, Nyaturu (Bantu) nsuumba, Nyilamba (Bantu) msumba, Mbugwe (Bantu) lemusomba ‘slave’, but also Sukuma (Bantu) sumba.
• ‘milk’: Nyilamba masu(n)su, Rangi (Bantu) masu(n)su, Iraqw maso'o ‘first milk after a cow has calved’.
• ‘beehive’: proto-West-Rift S. Cushitic *mariinga, Rangi (Bantu) muri"ga, Nyilamba (Bantu) mlinga, Bianjida-Datooga (S. Nilotic) mèrèe"jáandà; but also Yaaku (East Cushitic) merengo, Mogogodo-Maasai (East Nilotic) m!rán.

6. Integration of loanwords

Some borrowings from Swahili are not so recent and are integrated to the extent that they are no longer felt to be loans. For example, kasiis ‘(sweet) potato’ is ultimately from Swahili kiazi ‘potato’ or from another Bantu source but it does not feel like a loan anymore to the speakers. Integration was reversed in a number of words. For example, Swahili chupa ‘bottle’ was first borrowed as tupa, but is nowadays pronounced as in Swahili. Adaptation to Iraqw structure includes phonological adaptation and morphological adaptation. Nouns have to belong to a gender category as well as a number system. In particular, adaptation to the Iraqw system of nominal number offers some surprises, which we discuss after dealing with the phonological and gender adaptation.

Phonological integration of loans does not always take place. In particular, Swahili loans are not always fully adapted to Iraqw pronunciation, because speakers also have control of Swahili phonology, which they use when speaking Swahili. Penultimate stress in Swahili is interpreted as a long vowel in Iraqw. In words that end in disyllabic ia in Swahili this is rendered as yáa in Iraqw when the preceding consonant allows for a Cy cluster, and iiya elsewhere, as in gunyáa (F) ‘bag’ (from Swahili gunia) and kofyáa (F) ‘hat, cap’ (from Swahili kofia) but sufuriiya (F) ‘pot’ (from Swahili sufuria). Occasionally word final ku is rendered kw and the preceding vowel (stressed in Swahili) is not long, e.g. sandukw (M) ‘boxes (PL)’ (from Swahili sanduku). Iraqw does not have a voice opposition in fricatives, and Swahili z is pronounced s in loans, e.g. /aansuus ‘start’ from Swahili -anza, gaseeti from Swahili gazeti ‘newspaper’. There are no examples of Iraqw borrowings of Swahili words which contain v. The same adaptation has taken place in the local variety of Swahili in the first half of the 19th century (Kießling 1995: 121).
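Three of the tendencies just described (z rendered as s, final ia rendered as yáa or iiya, and the stressed penultimate vowel rendered as long) can be written out as a small rule cascade. This is a rough sketch, not a model of Iraqw phonology: the function name is invented, the set of consonants that allow a Cy cluster is an assumption read off the three examples in the text (n and f allow it, r does not), and the rules are tendencies with exceptions, as the text itself notes.

```python
VOWELS = "aeiou"

# Assumption: inferred only from gunia, kofia (Cy allowed) and sufuria (not allowed).
CY_CLUSTER_OK = {"n", "f"}


def adapt_swahili_loan(word: str) -> str:
    """Apply three adaptation tendencies to a Swahili word (illustrative only)."""
    # 1. Iraqw has no voicing contrast in fricatives: Swahili z becomes s.
    w = word.replace("z", "s")

    # 2. Final disyllabic -ia after a consonant becomes -yáa or -iiya.
    if w.endswith("ia") and len(w) >= 3 and w[-3] not in VOWELS:
        stem = w[:-2]
        return stem + ("yáa" if stem[-1] in CY_CLUSTER_OK else "iiya")

    # 3. Otherwise the stressed penultimate vowel is rendered as a long vowel.
    vowel_positions = [i for i, ch in enumerate(w) if ch in VOWELS]
    if len(vowel_positions) >= 2:
        i = vowel_positions[-2]
        w = w[:i] + w[i] + w[i:]
    return w


for source in ["gazeti", "gunia", "kofia", "sufuria", "shule"]:
    print(source, "->", adapt_swahili_loan(source))
```

The sketch reproduces the forms gaseeti, gunyáa, kofyáa, sufuriiya and shuule cited in this chapter; it does not cover cases such as sandukw, where final ku becomes kw without vowel lengthening.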
Another problem for Iraqw pronunciation is posed by the word-initial nasal-stop combinations of Swahili. Nasal-stop clusters are allowed in Iraqw, but not in word-initial position. In Swahili they occur in this position. In initial position Swahili nouns may contain a syllabic m prefix and they may start in a prenasalized stop; syllabic nasals do not occur in Iraqw, nor can the nasal-stop combinations be interpreted as prenasalized stops since these too are ruled out in initial position in Iraqw. Some loans take the Swahili plural form as base and in that way avoid the problem


of the syllabic nasal. This is the case with mikaate (F) ‘bread (SG)’ from Swahili plural mikate ‘loaves of bread’; singular mkate. Syllabic m is rendered by m plus a vowel, either mu, mo or ma. The first, mu, can be interpreted as a straightforward adaptation of the syllabic m. This happens specifically before voiceless fricatives, as in musmari (F) ‘nails (PL)’ from Swahili msumari ‘nail (SG)’, muhind-moo (M) ‘Indian person’ from Swahili mhindi, muhogo (M) ‘cassava’ from Swahili mhogo. This pronunciation is not very different from non-standard Swahili pronunciation for words that start in h. The vowel o appears in mochele (F) ‘rice’ from Swahili mchele; the vowel o cannot be explained. The adaptation ma for m is probably due to vowel assimilation. Examples are mas.laba (F) ‘cross’ from Swahili msalaba and makasi (F) ‘scissors (SG)’ from Swahili mkasi ‘scissors (SG)’. The vowel choice for a vowel a is influenced by the Swahili plural noun class prefix ma- in the adapted loan makaate (F) ‘loaves of breads (PL)’ from Swahili mkate (3) ‘bread (SG)’. When loans start in prenasalized stops, an initial vowel is added which is identical to the first stem vowel, i.e., a prothetic echo vowel: angamiiya (F) ‘camel’ from Swahili ngamia, angaano (M) ‘wheat’ from Swahili ngano, angaasi (F) ‘ladder’ from Swahili ngazi, ondo'o (F) ‘bucket’ from Swahili ndoo, impiira (F) ‘(foot)balls (PL)’ from Swahili mpira ‘(foot)ball (SG)’. Alternatively the prenasalization is deleted, as in guruuwe (F) ‘domestic pig’ from Swahili nguruwe. These strategies are also occasionally used when the nasal is syllabic in Swahili. The syllabic nasal is deleted in nada (F) ‘monthly cattle market’ from Swahili mnada, and limaw (M) ‘lemons (PL)’ from Swahili mlimao ‘lemon’ (another variant of this loan is malmaw). An initial vowel is added to Swahili mpishi ‘cook (SG)’ where the nasal is syllabic, giving Iraqw imbishi (F) ‘cooks (PL)’. Nasal-stop clusters are always voiced in Iraqw. 
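The prothetic echo vowel described above can be illustrated with a short sketch. It covers only the prothesis step, so its output still lacks the other adaptations (vowel lengthening, z to s, and so on), and it does not predict when speakers instead delete the nasal, as in guruuwe. The function name and the list of onsets are assumptions taken from the examples in the text.

```python
VOWELS = "aeiou"

# Assumption: only the word-initial nasal-stop onsets attested in the examples above.
NASAL_STOP_ONSETS = ("ng", "nd", "mp")


def add_echo_vowel(swahili_word: str) -> str:
    """Prefix a copy of the first stem vowel to a word-initial nasal-stop cluster."""
    # ng' writes a velar nasal in Swahili orthography, not a prenasalized stop.
    if swahili_word.startswith("ng'"):
        return swahili_word
    if swahili_word.startswith(NASAL_STOP_ONSETS):
        for ch in swahili_word:
            if ch in VOWELS:
                return ch + swahili_word
    return swahili_word


for source in ["ngamia", "ngano", "ngazi", "ndoo", "mpira"]:
    print(source, "->", add_echo_vowel(source))
```

The intermediate forms angamia, angano, angazi, ondoo and impira correspond to the attested angamiiya, angaano, angaasi, ondo'o and impiira once the remaining adaptations are applied.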
A word-initial velar nasal is rendered by an oral velar stop, e.g. gambo ‘other side’ for Swahili ng'ambo. Palatals are sometimes adapted in earlier loans, but less so in recent ones. The palatal stops ch, j and ny do occur in native Iraqw words and are reconstructed for proto West Rift Southern Cushitic (Kießling & Mous 2003: 5–7) but they are rare. Older Swahili borrowings sometimes render ch as t, for example tupa (F) ‘bottles (PL)’ for Swahili chupa (but now again chupa); j may be rendered as y, as in yerman ‘German(s)’ from Swahili mjerumani (or possibly directly borrowed from English German); ny is rendered as n or y, for example nuundu (M) or yuundu (M) ‘hammer’ for Swahili nyundo. The palatal fricative sh occurs only in loans and as a free variant of s before i, e.g. Legère (1988: 642–643) has shiptaali ‘hospital’, now more often pronounced as siptaali. Loans with sh followed by other vowels are sometimes pronounced with s, but nowadays more often as sh, e.g. suule or shuule (F) ‘school’ from Swahili shule. Occasionally Iraqw has r where Swahili has l although both languages have both sounds, e.g. burungeeti ‘blanket’ from Swahili blanketi, filiimbi (F) ‘flute’ from Swahili firimbi. An r is added where pre-Swahili had an l in kutunguuru ‘onions (PL)’ from Swahili kitinguu (SG). A number of these seemingly unmotivated adaptations can be explained by considering as source the local variety of Swahili at the time of borrowing. Kießling (1995: 131–132) mentions insertion of r and r for l as Rangi or Mbugwe lexical influence on Rift Valley Swahili in the 1930s.


Iraqw allows consonant clusters where Swahili does not. Iraqw has a rule of vowel deletion in the middle of three consecutive syllables with short vowels and this applies to loans too. Thus we have matfaali (F) ‘bricks (PL)’ from Swahili matofali, bikra (F) ‘nun’ from Swahili bikira, musmaari (F) ‘nails (PL)’ from Swahili msumari (SG), malmaw ‘lemons (PL)’ from Swahili mlimao, malmu (M) ‘teacher’ from Swahili mwalimu. Other adaptations to a preferred Iraqw syllabic pattern are siptaali (F) ‘hospital’ from Swahili hospitali, harsaasi (F) ‘bullet’ from Swahili risasi. The motivation for these latter adaptations is that one of the Iraqw preferred syllable patterns for noun roots is to have a short vowel i, a or u in the first syllable and a long vowel aa (or ee or oo) in the second syllable. Long vowels alternate with V'V in Iraqw, also in loans, e.g. da'ari (F) ‘ceiling’, plural daaradu from Swahili trisyllabic daari, ondo'o (F) ‘bucket’ from Swahili disyllabic ndoo; and in the plurals of taa (F) ‘lamp’, pl: ta'adu from Swahili taa, saa (F) ‘hour’, pl: sa'adu from Swahili saa.

Borrowed words need to be integrated in Iraqw morphology. Integration of nouns requires gender and number assignment. This is the topic of Legère (1988), in which he discusses gender allocation of loans. The state of knowledge about Iraqw grammar at that time prevented him from recognizing the nature and structure of the number system, which is crucial in recognizing number suffixes and understanding gender assignment. Gender is largely determined by form, and even more strongly so in the borrowed part of the vocabulary. In Iraqw gender is not predictable on the basis of meaning. Iraqw nouns which contain a number suffix have the gender that is determined by the number suffix. Gender is not completely predictable on the basis of form but there are clear tendencies of correlation between the final vowel and gender.
There is a strong tendency for underived nouns ending in round vowels to be masculine, and for those ending in front vowels to be feminine. Loanwords follow this pattern. However, Swahili loans that end in o are not all masculine, e.g. kijiko (F) ‘spoon’, koleeyo (F) ‘pincers’, koopo (F) ‘cup’ are feminine; ondo'o (F) ‘bucket’ is feminine because the ending o'o is homophonous to a feminine singulative suffix -o'o. Loans ending in the vowel a are feminine; inherited words ending in a are more or less equally divided between masculine and feminine. Listed below are some Swahili loans and their gender allocation according to quality of final vowel.

• kalaamu (M) ‘pen’
• kitaabu (M) ‘book’
• angaano (M) ‘wheat (corn or plant)’
• roobo (M) ‘quarter’
• boofolo (M) ‘loaf of bread’
• muhoogo (M) ‘cassava’
• magoogo (M) ‘Gogo’

• shuule (F) ‘school’
• mikaate (F) ‘bread’
• baati (F) ‘corrugated iron sheet’
• gaseeti (F) ‘newspaper’
• chumba (F) ‘room’
• chupa (F) ‘bottle’
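The final-vowel tendency described above can be illustrated with a small sketch. This is only a rough heuristic under stated assumptions: the function name `guess_loan_gender` and the two vowel sets are invented here for illustration, the rule ignores number suffixes and semantics, and attested exceptions such as kijiko (F) and patri (M) show that it is a tendency, not a rule of the grammar.

```python
# Hypothetical sketch of the final-vowel tendency for Swahili loans in Iraqw.
# The names below are illustrative assumptions, not part of the authors' analysis.

MASCULINE_FINALS = {"o", "u"}       # round vowels tend to be masculine
FEMININE_FINALS = {"i", "e", "a"}   # front vowels, and final a in loans, tend to be feminine


def guess_loan_gender(noun: str) -> str:
    """Return 'M', 'F', or '?' based only on the final vowel of the loan."""
    final = noun[-1].lower()
    if final in MASCULINE_FINALS:
        return "M"
    if final in FEMININE_FINALS:
        return "F"
    return "?"  # consonant-final forms are not covered by this heuristic


# Attested genders as given in the text, including two exceptions.
ATTESTED = {
    "kalaamu": "M",
    "kitaabu": "M",
    "shuule": "F",
    "chupa": "F",
    "kijiko": "F",   # ends in o but is feminine
    "patri": "M",    # ends in i but is masculine (semantic override)
}

if __name__ == "__main__":
    for noun, gender in ATTESTED.items():
        guess = guess_loan_gender(noun)
        status = "matches" if guess == gender else "exception"
        print(f"{noun}: predicted {guess}, attested {gender} ({status})")
```

Running the script marks kijiko and patri as exceptions, which mirrors the caveat in the text that form-based gender assignment is a strong tendency and not an exceptionless rule.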

The loan patri (M) ‘priests (PL)’ is masculine in gender due to semantic considerations which overrule the formal characteristic of a final front vowel, despite the fact that meaning is not decisive for gender assignment in Iraqw. There is another counter-example to the gender assignment rule based on the final vowel, mochele (M) ‘rice’ from Swahili mchele, where semantics cannot account for the exceptional masculine gender assignment.

Number is a derivational category in Iraqw; for some lexical items singular is derived, for some plural is derived, for others both are derived. Loans are either based on the singular or the plural form of the source language. Most loans are singular based on the singular in the source language and receive, for their plural, one of the more common Iraqw plural derivations. However, the plural formation is not predictable, neither in the borrowed part of the vocabulary nor in the rest. Thus there are 32 such loans that have a plural with -du and 17 such words that have a plural with -ay; while 5 loans based on a singular have still other plural derivations. The word for ‘chair’, borrowed from Swahili kiti, has the singular form kiti-angw (derived by means of the suffix -angw) and the plural form kiti-eeri (derived with the suffix -eeri that is associated with singulars in -angw). In the word muhindmoo (M) ‘Indian’ (from Swahili mhindi), the singular is derived with the Iraqw singular suffix -moo. However, its plural form wahindi (F) is not based on this singular, but rather borrowed directly from Swahili wahindi, with no Iraqw number morphology added. A number of loans are based on plural forms in the source language, with the singular forms derived by an Iraqw singular derivation, e.g. matofaali ‘bricks’ (from Swahili ma-tofali) and its derived singular matfal-moo (M); mabaati (F) ‘corrugated iron sheets’ from Swahili ma-bati (plural noun class 6), miiti (F) ‘sticks’ from Swahili mi-ti (plural noun class 4) ‘trees, sticks’. In several cases the source is a number-invariant Swahili word, e.g. askáari (F) ‘soldiers’ from Swahili askari SG/PL. The singular is derived, askaarmoo (M).
There are 13 such loans, and for most of them the meaning of the word makes it understandable that a plural form is taken as basic, e.g. money, paper, jiggers, groundnuts, barley, wheat, cassava, sugarcane, tomatoes, lemons, pawpaw, mangos, peppers, tiles, stamps, soldiers, but not for all, e.g. balls, boxes. The Iraqw loan chupa ‘bottle’ is borrowed as a plural (derived singular: chup-ito'o) although it is a singular in Swahili, but in various other Bantu languages in Tanzania this lexeme is in noun class pair 9/10 with no distinction between singular and plural. What is intriguing is that some words that commonly occur in the plural have plural forms based on Swahili singular forms, for example kasiis ‘potatoes’ from Swahili kiazi (noun class 7), musmaari ‘nails’ from Swahili msumari (noun class 3), kitunguuru ‘onions’ from Swahili kitunguu (noun class 7). The reverse also occurs, that is, a plural Swahili form borrowed as a singular, e.g. ma’uuwa (F) ‘sunflower’, pl. ma’uwa-adu from Swahili ma-uwa (plural noun class 6) ‘flowers’, majini (F) ‘precious stone’, PL: majinay (M) from Swahili ma-jini (plural noun class 6). Words that have human referents are often borrowed with a plural as base while the singular is derived: wachaga (F) ‘Chaga people’, wa’araabu (M) ‘Arabs’, nyiraamba (F) ‘Nyiramba people’, magogóo ‘Gogo people’, irangi (F) ‘Rangi people’, yerman ‘Germans’, angreesi (F) ‘English people’, is.laamu (M) ‘Muslims’, prosansi (F) ‘protestants’, katoliki (F) ‘Catholics’, imbishi (F) ‘cooks’, patri (F) ‘priests’, jeela (F)
‘prison, prisoners’, askoofu (M) ‘bishops’. The singular forms of these words are derived by the singulative suffix -moo (M) as is common for words with human referents. Despite this fact, many of these words are not manifestly plural, and a separate plural can be formed to be explicit about plurality, e.g. askofo-moo (M) ‘a bishop (SG)’, PL: askofa-ma' ‘bishops (PL)’. Some words for people are borrowed as singular: daktaari (F) ‘physician’, bikra (F) ‘nun’, yaaya (F) ‘nurse’. A number of the borrowed nouns have only one number form that can be used for both singular and plural reference but number is not salient for them. Such words are difáay (F) ‘wine’, uji (F) ‘porridge’, muchele (M) SG/PL ‘rice’, miwa (F) ‘sugar cane’, sukari (F) ‘sugar’, musiki (F) ‘music’. Verbs that are borrowed need a verbalizing suffix. These verbalizers contain a causative derivational suffix -s or a durative derivational suffix -m: shitak-uus ‘accuse’ from Swahili ku-shtaki, soom-uus ‘read’ from Swahili ku-soma, nyouus ‘shave’ from Swahili ku-nyoa, sifuus ‘praise’ from Swahili ku-sifa, panguus from Swahili ku-panga, saaliim ‘pray’ from Swahili ku-sala, paasiim ‘1. pass an exam, 2. iron (clothes)’ from Swahili ku-piga pasi ‘iron’ and ku-pas ‘pass (English loan)’. The loan verbs ignore the noun class prefix of Swahili infinitives and are based on the Swahili verbal stem, used as a free form in Swahili only in imperatives.

7. Grammatical borrowing

There is no grammatical borrowing from Swahili. The intrusion of Swahili into Iraqw remains superficial in the Iraqw-speaking heartland because Iraqw is a strong language with a self-confident speaker community, and the influence of Swahili is not reinforced by bilingualism of Iraqw speakers in any other Bantu language. The latter is the case for the related Cushitic languages Alagwa and Burunge. Speakers of these two languages also speak Rangi, a regionally dominant Bantu language, in addition to Swahili. There is evidence for grammatical influence from Bantu on these languages, for example in the increase of verb-object word order and in the gradual shift from formal gender agreement in the subject marking on the verb to semantically based agreement. There is no evidence for the borrowing of grammatical markers (form and function) in any of these languages. However, in Kießling et al. (2008) we propose four phonological and fifteen structural and semantic shared features that define a Sprachbund of Alagwa, Burunge, Gorwaa, Iraqw (all West Rift Southern Cushitic), Datooga (Southern Nilotic), Mbugwe, Rangi, Nyaturu (all Bantu zone F30), Sandawe (probably Central Khoisan), and Hadza (isolate). Examples include the emergence of Proto West Rift Southern Cushitic ventive ni from a Southern Nilotic source; Bantu substrate influence in the development of past tenses at several earlier stages of Iraqw; and the conceptual transfer from Datooga onto pre-Iraqw in the area of spatial concepts, such as the development of a locative noun meaning ‘back’ indicating ‘top’ rather than ‘behind’, presumably under the influence of a Datooga cattle-centered conceptualization, see Kießling (2002: 422ff.) and also Heine & Reh (1984), Carlin & Mous (1995), and Reh (1999). In a similar vein, the use of the word ‘belly’ for expressing emotional concepts can be considered as Datooga semantic influence (Kießling 2002: 428f.).

8. Retention and innovation in the lexicon

Since we have a fairly extensive reconstruction of the West Rift Southern Cushitic lexicon, it is possible to determine which words have been stable for the past 500 years or so. Obviously, there is also innovation in the lexicon that is not due to borrowing. Table 4 presents the number of retentions across the semantic areas of the LWT meaning list. Field 23 (Modern world) is excluded because this domain was added to the LWT list to obtain more loans, and very few retentions occur in it; field 24 (Miscellaneous function words) is excluded because it is not semantic in nature. We present the number of retentions both for the LWT list and for an extended list that is supplemented by lexemes in our dictionary files that did not appear in the LWT list. The reason for adding these latter numbers is that they give a fuller picture and show where the LWT list deviates in semantic coverage from the semantic structure of the Iraqw lexicon. In total, 722 stable words were identified at the level of Proto Iraqw-Gorwaa-Alagwa and earlier (leaving out some 124 that could not easily be fitted into the LWT list and its semantic categorization). Particularly conservative semantic fields are The physical world, The body, Spatial relations, and Sense perception. That the semantic domains of body parts and the physical world are rather conservative is in line with the prevailing impressions of linguists, as these domains are well represented in lists of basic words such as the Swadesh list. The domains of space and sense are possibly more surprising. In the semantic domain of space are words that refer to basic concepts such as left and right, top, bottom, edge, side, height, and basic adjectives such as wide, straight, small, big, long. These concepts are not much influenced by cultural changes. The same is valid for the stable items in the domain of sense, which include: cool, smell/stink, sweet, sour, hear, noise, silence, see, look at, observe, show, shine, glitter, color, colors, smooth, sharp, heavy, light, fresh, wet, dry, hot, cold, dirty.

Table 4: Retentions in the lexicon according to semantic field

                                     Number of           Total items
                                     retentions in       (retentions       Percentage
                                     core LWT list       + non-            of
No.  LWT semantic field              + added items       retentions)       retentions
1    The physical world              71                  94                76
2    Kinship                         47                  89                53
3    Animals                         121                 174               70
4    The body                        145                 191               76
5    Food and drink                  67                  106               63
6    Clothing and grooming           27                  77                35
7    The house                       26                  56                46
8    Agriculture and vegetation      65                  103               63
9    Basic actions and technology    46                  80                58
10   Motion                          51                  91                56
11   Possession                      32                  51                63
12   Spatial relations               63                  87                72
13   Quantity                        19                  39                49
14   Time                            25                  61                41
15   Sense perception                46                  56                82
16   Emotions and values             35                  59                59
17   Mind                            21                  58                36
18   Speech and language             29                  49                59
19   Social and political relations  27                  47                57
20   Warfare and hunting             29                  55                53
21   Law                             7                   27                26
22   Religion and belief             13                  28                46
     Total                           1012                1678              60
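The percentage column of Table 4 follows from the other two columns as retentions divided by total items, rounded to the nearest whole percent. The short sketch below recomputes it from the figures in the table; the names `TABLE_4` and `retention_percentage` are invented for this illustration, and rounding half up is an assumption about how the published values were rounded.

```python
# Recompute the percentage column of Table 4 from the two count columns.
# Rows: (LWT semantic field, retentions, total items, published percentage).
TABLE_4 = [
    ("The physical world", 71, 94, 76),
    ("Kinship", 47, 89, 53),
    ("Animals", 121, 174, 70),
    ("The body", 145, 191, 76),
    ("Food and drink", 67, 106, 63),
    ("Clothing and grooming", 27, 77, 35),
    ("The house", 26, 56, 46),
    ("Agriculture and vegetation", 65, 103, 63),
    ("Basic actions and technology", 46, 80, 58),
    ("Motion", 51, 91, 56),
    ("Possession", 32, 51, 63),
    ("Spatial relations", 63, 87, 72),
    ("Quantity", 19, 39, 49),
    ("Time", 25, 61, 41),
    ("Sense perception", 46, 56, 82),
    ("Emotions and values", 35, 59, 59),
    ("Mind", 21, 58, 36),
    ("Speech and language", 29, 49, 59),
    ("Social and political relations", 27, 47, 57),
    ("Warfare and hunting", 29, 55, 53),
    ("Law", 7, 27, 26),
    ("Religion and belief", 13, 28, 46),
]


def retention_percentage(retentions: int, total: int) -> int:
    """Percentage of retentions, rounded half up using integer arithmetic."""
    return (200 * retentions + total) // (2 * total)


if __name__ == "__main__":
    for field, retentions, total, published in TABLE_4:
        computed = retention_percentage(retentions, total)
        print(f"{field}: {retentions}/{total} = {computed}% (published {published}%)")

    total_retentions = sum(row[1] for row in TABLE_4)
    total_items = sum(row[2] for row in TABLE_4)
    overall = retention_percentage(total_retentions, total_items)
    print(f"Total: {total_retentions}/{total_items} = {overall}%")
```

The same function reproduces the bottom row of the table from the column sums.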

Within the semantic domain of (domestic and wild) animals the only loanwords are ‘horse’, ‘camel’, and ‘duck’, which are recent additions from Swahili and which are all fairly irrelevant to the area. ‘Beehive’ is an early borrowing from a Bantu language (pre-Rangi) into Tanzanian Cushitic. If the beehive represented a technological innovation, it has to be noted that the Southern Cushitic Burunge are and used to be the beekeeping specialists of the region. There are numerous Datooga borrowings in the area of cattle colors, cattle names, and cattle diseases, which do not appear in the LWT list. On the other hand, various animals have to be left blank because they are irrelevant in the area. Consequently, only 41 of the 96 items in this field (43%) can be reconstructed for Tanzanian Cushitic, which could give the false impression that this field is highly innovative. This percentage would be much higher (70%) if we included additional items that would fall under the heading ‘animals’. In this domain the LWT list is not very representative for borrowing nor is it for stability, for Iraqw. Iraqw has a lot of specific cattle terminology: barren cow, hair from cow’s tail, poetic terms for cow, a set of commands for cows, words for different stages of dung/manure, different systems of
cattle loans, cattle colors and cattle names. The lexical structure for domestic animals usually contains a generic term which is also female, a specific male term, terms for kids that differentiate between general and specific female kids but no distinction sheep/goats for kids. The most relevant innovation in the area of domestic animals stays unnoticed in the LWT list and is mainly due to new cattle keeping knowledge from the Nilotic (pre-)Datooga. For the rest the semantic domain of domestic animals is fairly stable; this is a core semantic area for Iraqw culture. In the area of fauna the following can be reconstructed: 58 mammals, 63 birds, 51 insects, and 25 reptiles. There are some “nicknames” (alternative, humorous names) among birds and insects: some less important insects and birds have formal properties of names or are derived as names. This is a factor of lexical innovation that does not necessarily involve transfer from another language. The semantic domain of fauna is stable for the major (larger and important) animals but unstable for the smaller and unimportant animals.

References

Aarden, Ton et al. 2003. Clouds but no rain. Film. Ton Aarden videoproductions.
Banti, Giorgio. 1997. Review of Mous (1993). Journal of African Languages and Linguistics 18:95–106.
Berger, Paul & Kießling, Roland. 1998. Iraqw texts. Cologne: Rüdiger Köppe.
Carlin, Eithne & Mous, Maarten. 1995. The back in Iraqw: Extensions of meaning in space. Dutch Studies-NELL 2:121–133.
Heine, Bernd & Reh, Mechthild. 1984. Grammaticalisation and Reanalysis in African Languages. Hamburg: Helmut Buske.
Kießling, Roland. 1995. Mainland Kiswahili used as a Lingua Franca in the Rift Valley area of Tanzania in 1935. Afrikanistische Arbeitspapiere 43:119–135.
Kießling, Roland. 1998. Reconstructing the Sociohistorical Background of the Iraqw Language. Afrika und Übersee 81:167–225.
Kießling, Roland. 2002. Die Rekonstruktion der südkuschitischen Sprachen (West-Rift): Von den systemlinguistischen Manifestationen zum gesellschaftlichen Rahmen des Sprachwandels [The reconstruction of the Southern Cushitic languages (West Rift): From the system linguistic manifestations to the social frame of language change]. Cologne: Rüdiger Köppe.
Kießling, Roland & Mous, Maarten. 2003. The lexical reconstruction of West Rift (Southern Cushitic). Cologne: Rüdiger Köppe.
Kießling, Roland & Mous, Maarten & Nurse, Derek. 2008. The Rift valley area of Central Tanzania as a linguistic contact zone. In Heine, Bernd & Nurse, Derek (eds.), A Linguistic Geography of Africa, 186–227. Cambridge: Cambridge University Press.
Legère, Karsten. 1988. Bantu and Southern Cushitic: The Impact of Swahili on Iraqw. Zeitschrift für Sprachwissenschaften und Kommunikationsforschung 41(5):640–647.
Maghway, Josephat. 1995. Annotated Iraqw lexicon. (African language study series 2). Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa (ILCAA), Tokyo University of Foreign Studies.
Mous, Maarten. 1993. A grammar of Iraqw. Hamburg: Helmut Buske.
Mous, Maarten & Qorro, Martha & Kießling, Roland. 2002. An Iraqw-English Dictionary. (Cushitic Language Studies 15). Cologne: Rüdiger Köppe.
Reh, Mechthild. 1999. “Body”, “Back” and “Belly” or On the antonyms of “inside” and their conceptual sources. Frankfurter Afrikanistische Blätter 11:101–123.
Rekdal, Ole-Bjørn. 1996. Money, Milk and Sorghum Beer: Change and Continuity among the Iraqw of Tanzania. Africa: Journal of the International African Institute 66(3):367–385.
Rottland, Franz & Mous, Maarten. 2001. Datooga and Iraqw: A comparison of subsistence vocabulary. In Ibriszimov, D. & Leger, R. (eds.), Von Ägypten zum Tschadsee: Eine linguistische Reise durch Afrika (Abhandlungen zur Kunde des Morgenlandes 53.3), 377–400. Mainz: Deutsche Morgenländische Gesellschaft (Ergon Verlag).
Thornton, Robert J. 1980. Space, time, and culture among the Iraqw of Tanzania. New York: Academic Press.
Wada, Shohei. 1973. Iraqw basic vocabulary with Swahili equivalents. Tokyo: ILCAA.

Loanword Appendix

Swahili

kisiwa              island
bara                mainland
bahari              sea
barafu              ice
hali ya hewa        weather
kibriiti            the match
faras               horse
bata                duck
angamia             camel
mdudu               insect
kimulimuli          firefly
daktari             physician
sufuriya            pan
sahaani             plate, saucer
koopo               cup, drinking vessel
uma                 fork
makaate             bread
pilipili            pepper, chili pepper
sukaari             sugar
usi                 thread
gawni               woman’s dress
koti                coat
shaati              shirt
kola                collar
surwaali            trousers
soksi               stocking, sock
kafyá               hat, cap
kifungo             button
pete                ring (for finger)
tawlo               towel
sabuuni             soap
hema                tent
chumba              room
funguo              key
dirisha             window
sakafu, matl/angw   floor
burungeti           blanket
kitaangw            chair
meesa               table
taa'a               lamp, torch
mushumaa            candle
matfali             brick
fereegi             ditch
angaano             wheat
shayiiri            barley
muchele             rice
naasi               coconut
malmaw              citrus fruit
kasíis              sweet potato
muhoogo             tapioca, manioc, cassava
miwa                sugar cane
makaasi             scissors, shears
fundi               carpenter
nuundu              hammer
musmaari            nail
gilasi              glass
rangi               the paint
kupiga rangi        to paint
barbara             road
gari                carriage, wagon, cart
meli                ship
tlar/a; nootay      money
peesa               coin
koodi               tax, tribute
nada                market (place)
duka                store, shop
impiirmo            sphere, ball
shuulee             school
soomuus             read
karatasi            paper
kalaamu             pen
kitaabu             book
filliimbi           flute
askaari             soldier
punduki             gun, cannon
shitakuus           accuse
faini               fine
jeela               prison, jail
dini                religion
kaniisa             temple, church
patri               priest
reediyo             radio
simu                telephone
paskeeli            bicycle
gaari               car
basi                bus
gari la moshi       train
indege              airplane
umeme               electricity
injin               motor, engine
mashiini            machine
petroli, oili       oil, petroleum
siptaali            hospital
yaaya               nurse
dawa                pill, tablet
sindamo             injection
miwaani             spectacles
sirkaali            government
rais, akóo aya      president
wasiri              minister
poliisi             police
leseni              driver’s license
kosa                crime
uchagusi            election
anwaani             address
namba               number
mtaa                street
posta               mail
stempu              postage stamp
barwa               letter
kadi                postcard
benki               bank (financial institution)
choo                toilet
gadoro              mattress
deebe               can, tin
msumari             screw
bisisbisi           screwdriver
chupa               bottle
plastiki            plastic
bomu                bomb, explosive
kiwanda             workshop, factory
sigara              cigarette
gaseeti             newspaper
kalenda             calendar
film                film, movie

Datooga

injoloot            sickle
maynyaar            blue
gídabá              because
gamboot             shield

Other languages

dasi                girl, daughter
miringamo           beehive, trough
kasiis              potato
tumatí              tobacco
masomba             young man (adolescent)
musa                pestle
garangaar           roast, fry

Chapter 3

Loanwords in Gawwada, a Cushitic language of Ethiopia*

Mauro Tosco

1. The language and its speakers

Gawwada ([kawwa!a]) is a member of the so-called Dullay dialect cluster and is spoken in southwestern Ethiopia. Administratively, it is part of the Southern Peoples, Nations, and Nationalities Region. The region was known until 1991 as Gamu Gofa, a name often still encountered. The area consists of undulating hills and mountains and lies at about 1,600–1,700 meters above sea level. The Gawwada people are mostly engaged in farming and cattle breeding on a household basis. According to current classification, Dullay is a direct offspring of East Cushitic, although Hayward (1978) has substantiated a proposal originally made by Ehret (1974, 1976), according to which within East Cushitic Dullay forms a genealogical subgroup with the isolated (and now possibly extinct) Yaaku language of the Mount Kenya region. In Tosco (2000) I generally accepted Hayward’s arguments and proposed that the group made up by Dullay and Yaaku be called Transversal Southern Lowland East Cushitic. Figure 1 shows the classification of Gawwada.

Within Dullay one may distinguish a western and an eastern group of dialects. The western group is basically made up of Ts’amakko and Gawwada, and, geographically, spans the two banks of the Weyt’o river. The eastern dialects occupy the highlands to the east and north of Gawwada. Harso, Dobaze, and the other dialects studied in Amborn et al. (1980) are representative of the eastern group. Mutual intelligibility between the eastern and western groups is high, and Dullay may probably be regarded as a dialect chain. Gawwada speakers have no trouble understanding speakers of Ts’amakko, while they claim to have some problems understanding the eastern varieties. Knowledge of the Dullay peoples and languages dates back essentially to the 1970s. Early European travelers, such as Vittorio Bottego in 1895/1897, did not even pass through the area while reaching west towards the Omo river.

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Tosco, Mauro. 2009. Gawwada vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 982 entries.

Figure 1: The classification of Gawwada

[Tree diagram. Node labels: Afroasiatic; Berber; Chadic; Egyptian †; Cushitic; Omotic; Semitic; Northern Cushitic (= Beja); Central Cushitic (= Agaw); Eastern Cushitic; Southern Cushitic; Highland East Cushitic (HEC); Yaaku-Dullay; Lowland East Cushitic; Yaaku †; Dullay; Western; Eastern; Gawwada; Ts'amakko; Harso; Dobase; …]

The Dullay speakers have no overall name for themselves, nor do they seem to recognize themselves as an ethnic or linguistic unit. At least three labels have been proposed for their language in the scientific literature:

1. The name Werizoid was proposed by Bender (1971: 187), and used by Black (1976). According to Amborn et al. (1980: 14, fn. 3), the term is derived from the official (Amharic) administrative name for the area at that time, itself resulting from a misunderstanding of the chief lineage of Harso (one of the Dullay-speaking groups) as a name for the area.
2. The name Qawko was proposed by Hayward (1978), from the term for ‘man’ (qawho in Gawwada) in all the varieties.
3. The name Dullay was proposed by Amborn et al. (1980), from the name of the river known in Amharic as Weyt’o, and which is perhaps the most salient geographic feature of the area (actually, the river divides the westernmost group, the Ts’amakko, from all the other Dullay-speaking groups; in Gawwada it is called tullayho).

The name Dullay has gained wider acceptance in the linguistic literature and will be retained here, although it must be stressed that none of these names have any meaning as ethnic and linguistic labels to the speakers themselves.

In this article, the name Gawwada is used for the dialect spoken in the town of Gawwada (approximately at 5°25’ N, 37°14’ E) and in the neighboring villages. The town lies approximately 40 km (one hour’s drive) west of the town of Konso, and 12 km north of the main road leading from Konso to Jinka and the Omo valley.

Map 1: Gawwada and the neighboring languages

In Ethiopia, Gawwada is now officially used as a cover term for all the Dullay-speaking groups except the Ts’amakko, who live on the western bank of the Weyt’o river. Although linguistically unwarranted, the division of the Dullay speakers between Gawwada and Ts’amakko reflects well the cultural and economic cleavage between the inhabitants of the highlands, whose economy is centered around small-scale agriculture,¹ and the Ts’amakko, pastoralists and political allies of the Omotic Hamar and Banna, by whom they have apparently been heavily influenced culturally. The practice of labeling all the Dullay speakers except the Ts’amakko as Gawwada is reflected, for example, in the 1994 Ethiopian Census (Federal Democratic Republic of Ethiopia 1998), according to which there were 32,636 Gawwada, almost all of them living in the Konso Special District (14,498, i.e., 44% of the total), and in the Dirashe Special District (17,752, i.e., 54%). In both the Konso Special District and the Dirashe Special District the Gawwada are the second largest ethnic group, well behind the Konso (137,120) and the Gidole (52,536), respectively. Recently, a Gawwada Special District has been set up, although not actually implemented yet. The 1994 Ethiopian Census listed 8,621 speakers of Ts’amakko, bringing the total number of the Dullay speakers to approximately 42,000. The Dullay varieties are not endangered. Bilingualism and multilingualism involve Konso and other Konsoid varieties, Amharic, and Oromo. The Dullay varieties are not written, although evangelical missions have been preparing an orthography (Horsch 2006).

¹ Cf. Minker (1986) for a description of the economy of the highland peoples.

2. Sources of data

For Gawwada itself, the author’s own database is used. Fieldwork on Gawwada started in 2000. Tosco (2007a) is a preliminary sketch of Gawwada. Specific aspects of Gawwada morphology and syntax are dealt with in Tosco (2005, 2007b, 2008). The database has been supplemented by the dictionary contained in Amborn et al. (1980), as well as the data in Kebede (2003). For Konso, Black & Otto’s (1973) unpublished dictionary was used, and for D’iraasha (or Gidole), Black’s (1973b) unpublished dictionary; other data were drawn from Black’s partially unpublished articles (Black 1973a, 1973c), as well as Sim’s (1977) sketch. For Amharic, Kane’s (1991) and Leslau’s (1976) dictionaries were used (references in the subdatabase are to the latter); for Oromo, Gragg (1982) and the lexical appendix to Griefenow-Mewis & Tamene’s (1994) grammar, as well as, marginally, Hinsene (1998) and Les et al. (1992).

3. Contact situations

3.1. Gawwada-Amharic

Present-day contact involves primarily Amharic. Contact with the Amhara and other Christian populations from the North is probably fairly old, as the presence in the Arba Minch area (northeast of Gawwada) of ruins of Orthodox churches and monasteries seems to suggest. Direct contact with the Amharic language can be dated back at least to the years 1897–98, when the whole country was stormed and conquered by Emperor Menelik II’s troops. The ensuing destruction and harsh exploitation resulted in a dramatic drop in population (Amborn et al. 1980: 17). This may explain why many local populations of the South actually welcomed the Italian invasion of 1935–36 as a means of eliminating the Amhara domination. During the first half of the 20th century the central Ethiopian government exercised little direct control, so there were probably few linguistic effects. After the Second World War a modicum of modern administration was established, and it is from these times that intense Gawwada-Amharic contact can be dated. The intense social upheavals of the 1970s, with the abolition of the monarchy, the establishment of a Marxist dictatorship, and the ensuing war (which ended in 1991) have probably accelerated language contact and the knowledge of Amharic.

Although nowadays Amharic is no longer the official language of Ethiopia and the present constitution accords full rights to each community and language (cf. Savà & Tosco 2008), Amharic remains the most widely used means of inter-ethnic communication and is the working language of the federal institutions and of the highly multiethnic Southern regional government. In the Gawwada area teaching, administration, and all other official functions are carried out exclusively in Amharic. Amharic is the main, if not the only, medium of access to modern vocabulary. Terms pertaining to politics, Christianity, modern technology, and also many agricultural items and implements, are overwhelmingly expressed by Amharic loans. Amharic is likewise the medium through which loans from European languages entered and still enter Gawwada. Although there are no precise data, Amharic is certainly the only language in which bilingualism is widespread (although far from universal) among the Gawwada, and especially among the younger generations. Many colloquial Amharic words have therefore found their way into Gawwada, although not all of them are listed in Amharic dictionaries (which often fail to report loans and words of the spoken registers).² It seems also reasonable to assume that any word which came ultimately from a European language (mainly English or Italian, in a few cases French) entered Gawwada through the intermediacy of Amharic, even if the Amharic intermediate source words are not attested in dictionaries nor perhaps used anymore. This applies in particular to Italian loans (maybe dating from the times of the Italian occupation in the 1930s) which seem to be obsolete in Amharic itself.

3.2. Gawwada-Konso and the southwestern Ethiopian language area

Before the establishment of the Gawwada Special District, the area where Dullay varieties are spoken was divided between the Konso Special District and the Diraasha Special District. In both districts Dullay speakers were the second largest ethnolinguistic group, but they respectively accounted for just 9.2% and 19.7% of the total population. The Konso and the Gidole (the autonym of the D’iraasha, Gidole being an Amharic appellation) respectively accounted for 87% and 58.4% of the population of these districts. The Konso and Gidole or D’iraasha languages are the main representatives of Konsoid, which together with Oromo forms the Oromoid branch of East Cushitic.³ Dullay speakers are therefore clearly outnumbered by Konsoid speakers. They also seem to be in a weak position in terms of economic and political power.

² In these cases, in the subdatabase the source “Colloquial Amharic” has been given on the basis of personal knowledge or through elicitation from native speakers.
³ Cf. Hallpike (1972) for an ethnographic description of the Konso.

Relations are often tense, with open warfare breaking out occasionally. It seems certain that bilingualism in Konso is not widespread among the Gawwada: According to Wedekind (2002) (on the basis of a sociolinguistic survey conducted in September 1994), even though Konso is generally considered to serve as a trade language in the area, only a small minority of Gawwada understand it and still fewer can speak it. In the town of Gawwada itself, these groups would constitute 10% and 5%, respectively. These numbers were obtained by asking the leaders of the Gawwada to judge the competence in Konso of their people. Although they probably underestimated bilingualism, it can be safely assumed that knowledge of the Konso language is far from universal.

What is nevertheless certain is that Dullay and Konsoid languages share a very large part of their vocabularies in all lexical domains, core lexicon included (e.g. some body part terms). Amborn et al. (1980: 60) comment: “The Dullay dialects share a strikingly large amount of their lexicon with the Konso-Gidole languages. Approximately 30% of the basic vocabulary is almost identical, including a number of specific innovations in such inner-core domains of the basic vocabulary as body parts and kinship terminology” [my translation]. It is evident therefore that intense contact has been going on possibly for centuries, or, as again Amborn et al. (1980: 60) put it: “Dullay and Konso-Gidole form a linguistic area which arose out of a long period of mutual influence” [my translation].

Beyond the direct close link between Dullay and Konsoid, these languages are part of a southwestern Ethiopian language area together with Omotic Zayse and Koyra, and Highland East Cushitic Burji (Sasse 1986). Among the phonological and morphological traits shared by most languages in the area the following are perhaps the most relevant:

• The absence of voicing opposition among the plain stops, whereby /p/, /t/, and /k/ are realized as voiced in intervocalic position and also word-initially, and as long voiceless when geminate. (The status of voiced velar [g] is doubtful and could be a separate phoneme in certain varieties).
• Inceptive verbal forms in -uy; Gawwada: !awn-e ‘night’ → !awn-uy ‘to become night’; warħ-e ‘local beer’ → warħ-uy ‘to brew’;
• A suffix -a(a)mp for permanent quality; Gawwada: sor ‘to run’ → sor-amp-akko ‘a good runner’;
• A noun-forming suffix *-ayt (Gawwada: Masculine *-ayt-ko → -akko; Feminine *-ayt-te → -atte); Gawwada: cuppul-akko/cuppul-atte ‘a bad, vicious man/woman’;
• A semelfactive verbal extension formed with reduplication of the final consonant of the stem: Gawwada qox-a ‘to milk’ → qox~xi ‘to milk once, a little bit’;
• The use of -n- (fossilized stative of an existential verb) in inflection; in Gawwada it is a future marker: !an=ʕaf-i [1=spread-PF.1S] ‘I spread’ (past) → !an=ʕaf-n-i [1=spread-FUT-PF.1S] ‘I’ll spread.’


Mauro Tosco

At least two different views have been put forward about the origin of the Dullay-Konsoid contact situation and the linguistic and ethnic prehistory of the area. According to Black (1975), the Dullay speakers are the autochthonous population of the area, and the forefathers of the Konsoid peoples were later immigrants. Linguistically, Dullay would be the main substrate of Konsoid. As Amborn et al. (1980: 61) point out, this reconstruction is supported neither by the linguistic evidence nor by ethnographic evidence (such as local traditions and myths of origin). They propose a more complex scenario, in which we are not dealing “with mere substratal influence, but with a continual series of convergence phenomena operating over a very long stretch of time. In other words, not (only) has Konso-Gidole spread at the expense of Dullay, but speakers of both Konso-Gidole and Dullay must have been living for many centuries in a situation of very close linguistic contact, with changing prestige situations and different intensity patterns. This has led to a high degree of mutual intelligibility, but never to the demise of one language or the other” [my translation].

According to this hypothesis, which is retained here, contact is therefore multilateral, and both sides have given and received a good deal. Amborn et al. specifically mention a couple of Gawwada words (*sataʕte ‘heart’, *ħitte ‘root; vein’) in which the presence of /t/ would be due to the influence of Konso (hittina, sataata) versus the Dullay forms of other varieties (e.g., Harso-Dobase sasaʕko, ħisse). In this and possibly other cases, “the native words were not replaced by loanwords but remodeled after the pattern of another language” [my translation]. Nevertheless, in our data Gawwada has exactly the expected Dullay forms (saʕako and hisse), with no /t/, and the same applies, further to the west, for Ts’amakko (zaʕko and ħizze; Savà 2005: 256, 263). Direct evidence for the direction of borrowing is therefore hard to come by. The data show that a good part of the common Gawwada-Konso lexicon contains phonemes whose disappearance in Konso is easier to account for than their presence in Gawwada. In particular, Konso (and Oromoid languages in general, most particularly Oromo) lack the pharyngeal fricatives /ħ/ and /ʕ/. /ʕ/ is fully preserved in Gawwada, while [ħ] is common as an allophone of /h/. Therefore, words containing pharyngeals in Gawwada could be seen as instances of Dullay loans in Konso, and thereby be excluded from our counts. They have nonetheless been kept in the subdatabase as possible loans.

3.3. Gawwada-Oromo

Oromo is the second largest language of Ethiopia (maybe even the largest in terms of first language speakers). It is the dominant language over much of western, eastern, and southern Ethiopia, and has certainly played a role in the lexical history of all the languages of the area. Nevertheless, a surprisingly small number of items in the subdatabase could be traced back with certainty to Oromo. It must be noted, however, that the Konsoid varieties and Oromo together form the so-called Oromoid branch of East Cushitic and are in general very close to each other. Not surprisingly, it is often difficult or even impossible to determine whether Konso or Oromo was the source of a loan in Gawwada. In such cases Konso was chosen as the most likely candidate.

4. Number of loanwords

Of the 1460 meanings of the Loanword Typology meaning list, 314 have no counterparts in Gawwada. Of the 982 words in the Gawwada subdatabase,
• 21 words are classified as showing “very little evidence of borrowing” (level 1),
• 71 as “perhaps borrowed” (level 2),
• 15 as “probably borrowed” (level 3),
• 96 as “clearly borrowed” (level 4).

This means 203 items for all levels of borrowing in total, and 111 items considering levels 3 and 4 only; in percentages, this amounts to 11% of the subdatabase and 55% of all possible loans. The breakdown of these 111 items per source language reveals that 91 of all loans in levels 3 and 4 are of Amharic origin. This includes 88 established Amharic loans and 3 which were ascribed to “colloquial Amharic” and for which evidence was found through elicitation with native or Gawwada speakers; Amharic loans therefore represent 78% of all level 3/4 loans and 9% of the subdatabase. The remaining 25 items are almost evenly divided between Oromo (11 words, i.e., 9% of all level 3/4 loans and 1% of the subdatabase), and Konso (8 words, i.e. 7% of level 3/4 loans and 0.8% of the subdatabase), with English, Italian, “areal words” and two loans of unknown origin accounting for the six remaining items.

In light of the amount of Dullay-Konsoid prehistoric and historical contact (detailed in §3.2 above), these figures may appear low. Nevertheless, one must not forget that to decide on the direction of borrowing between Dullay and Konsoid is difficult or even impossible; as a consequence, most of the shared Gawwada-Konso vocabulary was listed in categories 1 (very little evidence of borrowing) and 2 (perhaps borrowed): out of the 88 words in the database classified in these categories, 78 are from Konso, but only 6 from Oromo, and 4 from Amharic. On the other hand, 88 of the 95 Amharic loans (excluding here those from colloquial Amharic) are classified as clearly borrowed. They represent the overwhelming majority of the 94 clearly borrowed words. Table 1 shows the distribution of loans per borrowing category of the three major source languages.
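The two headline shares quoted in this section follow directly from the four level counts. A minimal Python sketch of that arithmetic is given here; the variable names are illustrative assumptions, and only the level counts and the subdatabase size are taken from the text.

```python
# Level counts as given in the text (borrowing levels 1-4).
level_counts = {1: 21, 2: 71, 3: 15, 4: 96}
subdatabase_size = 982  # words in the Gawwada subdatabase

# All possible loans (levels 1-4) and the probable/clear subset (levels 3-4).
all_levels = sum(level_counts.values())
probable_or_clear = level_counts[3] + level_counts[4]

# Shares expressed as percentages.
share_of_subdatabase = 100 * probable_or_clear / subdatabase_size
share_of_possible_loans = 100 * probable_or_clear / all_levels

print(all_levels, probable_or_clear)
print(round(share_of_subdatabase), round(share_of_possible_loans))
```

The same pattern can be reused for any other count in the section by swapping in the relevant numerator and denominator.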


Table 1: Distribution of loans from the three major source languages by borrowing level

                                        Konso  Oromo  Amharic  unknown  total
1. very little evidence for borrowing      15      1        2        3     21
2. perhaps borrowed                        63      5        2        1     71
3. probably borrowed                        7      5        3        -     15
4. clearly borrowed                         1      6       87        2     96
total                                      86     17       94        6    203
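The margins of Table 1 can be checked against its cells with a short Python sketch. The dictionary layout is an assumption made for illustration, and the empty cell for unknown-origin loans at level 3 is taken to be zero.

```python
# Table 1 cells: source language -> counts for borrowing levels 1-4.
# The empty "unknown" cell at level 3 is assumed to be 0.
table1 = {
    "Konso":   [15, 63, 7, 1],
    "Oromo":   [1, 5, 5, 6],
    "Amharic": [2, 2, 3, 87],
    "unknown": [3, 1, 0, 2],
}

# Column totals (per source language) and row totals (per borrowing level).
column_totals = {lang: sum(cells) for lang, cells in table1.items()}
row_totals = [sum(cells[i] for cells in table1.values()) for i in range(4)]
grand_total = sum(column_totals.values())

print(column_totals)
print(row_totals, grand_total)
```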

The distribution of the loanword figures suggests that prior to the arrival of the Amhara and the forceful integration of the area within the Ethiopian Empire, the Gawwada lived in relative isolation, and whatever linguistic interactions were going on must have been mostly local in character.

5. Kinds of loanwords

As is often the case, most loans in Gawwada are nouns: 79% of the 208 loans in all categories and 83% of the 116 level 3/4 loanwords are nouns. Roughly speaking, one can say that four out of any five loans belong to this category. Almost one third of all nouns are possible loanwords. The second largest category, lagging far behind, is represented by verbs: They represent about 15% of all possible loans, and about 28% of level 3/4 loans, as well as 12% of all verbs in the subdatabase. There are no adverbs among the loanwords and almost no adjectives (only a single level 4 loan adjective); finally, there are only 10 loan function words (all borrowing levels considered), 6 of them in levels 3 and 4. The breakdown of (level 3/4) loans by semantic word class and donor language gives the percentages shown in Table 2.

Table 2: Loanwords in Gawwada by donor language and semantic word class (percentages)

                 Amharic  Oromo  Konso  English  Italian  Unidentified  Total loanwords  Nonloanwords
Nouns               13.3    1.6    1.3      0.2      0.2           0.4             16.9          83.1
Verbs                4.2    0.4      -        -        -             -              4.6          95.4
Adjectives           1.1      -      -        -        -             -              1.1          98.9
Adverbs                -      -      -        -        -             -              0.0         100.0
Function words       2.0      -    1.3        -        -           2.6              5.9          94.1
all words            8.9    1.0    0.8      0.1      0.1           0.4             11.3          88.7

3. Loanwords in Gawwada


As shown in Table 2, almost all loan verbs come from Amharic, the only exception being the verb karkar ‘to help’, from Oromo gargaara. Again, the apparent absence of verb loans from Konso may be due to the difficulty of deciding the direction of borrowing in the case of the Dullay-Konsoid interaction: 18 verbs are actually ascribed to Konso in the subdatabase, but all of them belong to the borrowing levels 1 and 2.

Table 3: Noun and verb loans (categories 3 and 4) by three major donor languages

          nouns  verbs
Konso         7      0
Oromo         9      1
Amharic      76     13

The breakdown of loanwords according to semantic fields is shown in Table 4. Not unexpectedly, Amharic loanwords in the semantic field Modern World make up the bulk of the loans. All other semantic fields lag well behind, the second position being taken by Social and political relations. Due again to the absence of doubtful loans from Konso, a few semantic fields have no or very few loanwords. This is in particular the case of the field Kinship as well as of the Miscellaneous function words.


Table 4: Loanwords in Gawwada by semantic field (percentages)

                                   Amharic  Oromo  Konso  English  Italian  Unidentified  Total loanwords  Nonloanwords
 1 The physical world                  3.7      -      -        -        -             -              3.7          96.3
 2 Kinship                               -      -      -        -        -             -              0.0         100.0
 3 Animals                             1.3      -      -        -        -             -              1.3          98.7
 4 The body                            2.7      -      -        -        -             -              2.7          97.3
 5 Food and drink                     13.3      -    1.7        -        -           1.7             16.6          83.4
 6 Clothing and grooming              13.8      -    2.5        -        -             -             16.2          83.8
 7 The house                          13.9      -      -        -        -             -             13.9          86.1
 8 Agriculture and vegetation         13.3    2.7      -        -        -             -             15.9          84.1
 9 Basic actions and technology        6.5      -    5.6        -        -             -             12.0          88.0
10 Motion                              3.9      -      -        -        -             -              3.9          96.1
11 Possession                         13.8      -      -        -        -             -             13.8          86.2
12 Spatial relations                   2.0      -      -        -        -             -              2.0          98.0
13 Quantity                            2.9      -    5.8        -        -             -              8.7          91.3
14 Time                               11.0    2.2      -        -        -             -             13.3          86.7
15 Sense perception                    2.8      -      -        -        -             -              2.8          97.2
16 Emotions and values                 6.1      -      -        -        -             -              6.1          93.9
17 Cognition                          24.4      -      -        -        -           5.1             29.5          70.5
18 Speech and language                 9.4    6.3      -        -        -             -             15.7          84.3
19 Social and political relations     26.7   10.7    5.3        -        -             -             42.7          57.3
20 Warfare and hunting                 3.1   12.2      -        -        -             -             15.3          84.7
21 Law                                10.0   10.0      -        -        -             -             20.0          80.0
22 Religion and belief                21.1      -      -        -        -             -             21.1          78.9
23 Modern world                       54.0    3.3      -      3.3      3.3           3.3             67.0          33.0
24 Miscellaneous function words          -      -      -        -        -             -              0.0         100.0
   all fields                          8.9    1.0    0.8      0.1      0.1           0.4             11.3          88.7

In general, we can say that non-borrowed vocabulary and ancient, completely assimilated loans from Konsoid together make up the semantic domains of basic vocabulary. These are the semantic domains covering body parts, kinship, flora, and fauna. On the other hand, Gawwada, like many other languages of Ethiopia, has recourse to Amharic as a source of new words accompanying new concepts. This includes the domain of agriculture, where many plants (from the potato to various beans to the banana) and techniques (e.g. plowing) are recent innovations in the area. Again, it is through Amharic that a number of ultimately Western words have entered Gawwada from languages such as English, French, and Italian. The couple of items (one each for English and Italian) where no Amharic equivalent has been found can hardly suggest a process of direct borrowing from these languages. Knowledge of foreign languages is to all practical purposes nonexistent among Gawwada speakers, as is often the case for speakers of other languages of Ethiopia.


Still other concepts are rendered with calques which mirror similar words in Amharic. Finally, a sizable portion of the Loanword Typology meanings has no established equivalents in Gawwada, a fact which once again underlines the great isolation of the Gawwada and of the whole of southwestern Ethiopia until recent times.

6. Integration of loanwords

6.1. Phonological integration

6.1.1. Phonological integration of non-Amharic loans

Most loanwords are fully integrated into the phonology and morphology of Gawwada. Little can be said about the integration of the Konso loans: phonologically, the most striking difference between the two language groups (Dullay and Konsoid) involves the retention in Dullay (and therefore in Gawwada) of the inherited Afro-Asiatic (and Cushitic) pharyngeals, compared to their loss in Konsoid. (The same applies to Oromo loans; Oromo and Konsoid together make up the Oromoid branch of East Cushitic.) It is frequently the case, therefore, that a Gawwada word containing a pharyngeal has a Konso cognate with /h/ or a glide, for example tahakko (=[taħakko]) vs. Konso tahayta ‘sand’ and leʕo vs. Konso leya ‘moon, month’. All these words are assigned a low level of borrowing probability (categories 1 and 2) and are excluded from our lists and counts.

6.1.2. Phonological integration of Amharic loanwords

Phonological integration of Amharic words involves the mapping of the Amharic seven-vowel system onto the five-vowel system of Gawwada. Amharic /ä/ (phonetically [$]) is generally rendered by /a/ in Gawwada: k’alame (< Amharic qäläm) ‘paint’, maskoote (< Amharic mäskot) ‘window’, sansalate (< Amharic sänsälät) ‘chain’, zayte (< Amharic zäyt) ‘oil’. Amharic /ɨ/ is generally rendered by /i/ in Gawwada: ʔitile (< Amharic ɨddɨl) ‘luck’, kipraate (< Amharic kɨbrit) ‘matches’, pirre (< Amharic bɨrr) ‘money’, tiste (< Amharic dɨst) ‘earthenware pot, kettle’.

Since in Gawwada there is no voicing opposition in plosives, voiced and voiceless plain stops are phonologically undifferentiated in Amharic loans. They are phonologically transcribed with the symbols for the voiceless stops /p/, /t/, /k/, their phonetic realization being determined by the phonological environment. They are usually voiced in word-initial position, always voiced between vowels, and always voiceless when geminate: kize (< Amharic gize) ‘time’, parpaare (< Amharic barbärre) ‘chili; pepper’, tiltilte (< Amharic dɨldɨy) ‘bridge’.

Consonant substitution operates in the case of phonemes not found in Gawwada. The bilabial ejective stop /p’/, which is very rare in Amharic, is not attested among the loans in the subdatabase. As for the Amharic plain affricate /&/ ([']), it can lose its stop component and shift to /"/ ([(]): tinni"a (< Amharic dɨnnɨ&&) ‘potato’. Its voiced counterpart /j/ ([)]) is retained in a few cases, as in t’a''e (< Amharic tɨjj) ‘mead’ (as an unassimilated alternative to t’ayye). Note that in both cases /i/ would be expected instead of /a/ in the first syllable of the loan.

There are nevertheless a few cases in which a foreign loan phoneme is retained in Gawwada. For example, the affricate /)/ in the word ma'ammariya ‘beginning; first’ from Amharic mäjämmäriya is retained. (Another example is the alternative form t’a''e ‘mead’ discussed above.) This word and the Amharic verb jämmärä ‘to begin’ have been found in a number of Ethiopian languages. Another case of loan phoneme is the palatal nasal /(/ in moo((e ‘stupid’, from the Amharic adjective mo((. A third example is zayte ‘oil’ from Amharic zäyt: in nonborrowed vocabulary, /z/ only appears in the numeral ʔízzah ‘three’.

The Amharic orthography often retains the Semitic- (and Afroasiatic-) inherited pharyngeals ([ʕ] and [ħ]), which have been historically reduced to Ø and /h/ respectively. Both pharyngeals are preserved in the phonological inventory of Gawwada. In Amharic loans, since the spoken rather than the written register is the source, the historical pharyngeals are represented by Ø and /h/ respectively, as in ʔalame (< Amharic al)m, orthographically ‘aläm) ‘world’.

6.2. Morphological integration

6.2.1. Morphological integration of non-Amharic loans

In both Konso and Dullay nouns always end in a vowel. Consonant-final Amharic nouns get a final vowel, generally -e, upon borrowing (in native vocabulary -e marks feminine and plural nouns), as in kinine (< Amharic kinin < English or French quinine) ‘quinine’, mankiste (< Amharic mängɨst) ‘government’, and pompe (< Amharic bomb < English bomb) ‘bomb’. In a few cases Amharic loans lexicalized with a final -e receive a collective meaning: ʔakime ‘(medical) doctors’ (< Amharic (h)akim ‘(medical) doctor’) > ʔakimitto ‘a medical doctor’, tamare ‘students’ (< Amharic tämari ‘student’) > tamaritto ‘a student’. The same applies to loans from Oromo (where -a is the most typical ending for nouns), a few of which may in turn be loans from Amharic. In the following case, while final -a points to Oromo as the source, the presence of /t’/ rather than /*/ in Gawwada may point to Amharic as the source: mat’afa (< Oromo macaafa [ma'’a*fa] < Amharic mä+haf (=[m$+’af])) ‘book’.

6.2.2. Morphological integration of Amharic loans

Amharic nouns ending in -a (often themselves of loan origin) retain it upon borrowing. As Gawwada nouns generally end in -o or -e, final -a is a sure sign of loan origin: paampa (< Amharic b,amb,a < Italian pompa ‘pump’) ‘faucet’, paqeela (< Amharic baqela) ‘a variety of beans’, waaka (< Amharic waga) ‘price’.

As already mentioned in §5, almost all loan verbs are borrowed from Amharic. It is not exactly clear which form of the Amharic stem is taken as the basis for borrowing, but apparently the simplest form, as found in the Perfect, is used. In Amharic (as generally in Semitic) the form of the 3rd person masculine of the perfect positive is the morphologically simplest form. Thus, the Gawwada verb fak’k’at ‘to permit’ (quoted as usual in Cushitic in the morphologically simplest form, the singular imperative positive) is easily derivable from Amharic fäqqädä ‘he permitted’. Any other form of the stem would yield a different vowel pattern in the Gawwada loan; e.g. the imperative singular masculine is fɨqäd. Other examples of loan verbs borrowed from Amharic are ʔamman ‘to believe’ < Amharic ammänä; ʔasaap ‘to think’ < Amharic assäbä; tamar ‘to study’ < Amharic tämarä; ʔassas ‘to order’ < Amharic azzäzä. Loan verbs are further given the full inflection and derivation of Gawwada verbs; e.g. in derivation, the semelfactive of nipaap ‘to read’ is nippaappi, and the causative of tamar ‘to study’ is tamarsis.

Data are too limited to draw conclusions on the integration of the very few loans belonging to other categories.
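The regular part of the noun integration described in §6.1.2 and §6.2.1 (vowel mapping, loss of the voicing contrast in plain stops, and a final -e on consonant-final nouns) can be sketched in Python. This is only an illustration under simplifying assumptions: the function name `nativize_noun` is hypothetical, ɨ is used here for the Amharic high central vowel, and vowel lengthening, ejectives, and affricate substitution are deliberately left out because the text gives no general rule for them.

```python
# Illustrative sketch only; not a full model of Gawwada loan phonology.
VOWEL_MAP = {"ä": "a", "ɨ": "i"}            # Amharic seven vowels -> Gawwada five
STOP_MAP = {"b": "p", "d": "t", "g": "k"}   # no voicing contrast in plain stops
VOWELS = set("aeiou")


def nativize_noun(amharic: str) -> str:
    """Apply the regular substitutions, then add -e to consonant-final nouns."""
    segments = []
    for ch in amharic:
        ch = VOWEL_MAP.get(ch, ch)
        ch = STOP_MAP.get(ch, ch)
        segments.append(ch)
    form = "".join(segments)
    if form[-1] not in VOWELS:
        form += "e"  # -e is the usual final vowel added to consonant-final nouns
    return form


# Amharic source forms taken from the examples in the text.
for source in ["zäyt", "sänsälät", "bomb", "kinin", "gize", "bɨrr", "dɨst", "mängɨst"]:
    print(source, "->", nativize_noun(source))
```

Forms such as maskoote (< mäskot) or waaka (< waga) show vowel lengthening that this sketch does not attempt to reproduce.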

7. Grammatical borrowing and calques

Grammatical borrowing is apparently nonexistent in Gawwada. Loan nouns may enter into compounds, as in minne tamaratte ‘school’ (= minne tamar-atte “house students=ASSOC.P”), calqued on Amharic tämari bet “student’s house” (alongside tɨmhɨrt bet “study house”). In some calqued compounds and phrases both elements are from native vocabulary: minne "aappete ‘prison’ (= minne "aapp-ete “house tie=ASSOC.F”), calqued on Amharic ɨsɨr bet “tying house”. The expression ye huli ‘I understood’ (= ye hul-i “me enter-PF.3m”, i.e. “it entered me”) is calqued on Amharic gabbä(( (gabbä-(( “it entered-me”).

8. Conclusions

Language contact and language replacement are probably ubiquitous in human history. Nation states and the imposition of national languages are on the contrary very recent phenomena, whose role in bringing about much of the present-day reduction in language diversity has not so far been fully appreciated (cf. Tosco 2004). While Ethiopia is a very old cultural entity, its transformation into a modern nation state occurred very late. For example, Leyew (2003) has shown that Kemant, a Central Cushitic language of the northern Ethiopian highlands, was able to coexist with Amharic for centuries; it was only the direct impact of modern administration (and conversion to Orthodox Christianity) after the Second World War that brought about disastrous effects for language retention (Kemant is presently highly endangered).

Like many languages spoken along the borders of modern Ethiopia, Gawwada entered rather late into contact with Amharic. Intensive contact is even more recent, although it has already brought about a large number of Amharic loans. The rate of effective bilingualism (and subsequent borrowing) is probably increasing, although Gawwada, like most languages of the area, is far from being endangered. This is also indicated by the absence of grammatical borrowing. It is also possible that local contact situations (as detailed in §3.2) are receding under the pressure of the pervasive, all-embracing Amharic influence. Recent developments in language policy and the increasing use of local languages in administration and education have not affected Gawwada so far. The future role of these developments in changing language attitudes and language borrowing patterns remains at present an open question.


References

Amborn, Hermann & Minker, Gunter & Sasse, Hans-Jürgen. 1980. Das Dullay: Materialien zu einer ostkuschitischen Sprachgruppe [The Dullay: Materials on an Eastern Cushitic language group]. Berlin: Dietrich Reimer.
Bender, M. Lionel. 1971. The Languages of Ethiopia: A New Lexicostatistic Classification and Some Problems of Diffusion. Anthropological Linguistics 13:165–288.
Black, Paul. 1973a. Draft sketch of Konso Phonology, Morphology, and Syntax. Unpublished manuscript.
Black, Paul. 1973b. Preliminary draft of a Gidole dictionary. Unpublished manuscript.
Black, Paul. 1973c. Konsoid: An example of extreme dialectal differentiation. Paper submitted to the Conference on African Linguistics, Queens College, April 1973.
Black, Paul. 1975. Linguistic Evidence on the Origins of the Konsoid Peoples. In Marcus, Harold C. (ed.), Proceedings of the First United States Conference on Ethiopian Studies, 1973, 291–302. East Lansing, MI: African Studies Center, Michigan State University.
Black, Paul. 1976. Werizoid. In Bender, M. Lionel (ed.), The Non-Cushitic Languages of Ethiopia, 222–231. East Lansing, MI: African Studies Center, Michigan State University.
Black, Paul & Otto, Shako. 1973. Konso dictionary. Unpublished manuscript.
Ehret, Christopher. 1974. Ethiopians and East Africans: The Problem of Contacts. Nairobi: East Africa Publishing House.
Ehret, Christopher. 1976. Cushitic Prehistory. In Bender, M. Lionel (ed.), The Non-Cushitic Languages of Ethiopia, 85–96. East Lansing, MI: African Studies Center, Michigan State University.
Federal Democratic Republic of Ethiopia. 1998. The 1994 Population and Housing Census of Ethiopia: Summary Reports at Country and Regional Levels. Addis Ababa: Office of Population and Housing Census Commission, Central Statistical Authority.
Gragg, Gene B. 1982. Oromo Dictionary. East Lansing, MI: African Studies Center, Michigan State University.
Griefenow-Mewis, Catherine & Bitima, Tamene. 1994. Lehrbuch des Oromo [Oromo textbook]. Köln: Rüdiger Köppe.
Hallpike, C. R. 1972. The Konso of Ethiopia: A study of the values of a Cushitic people. Oxford: Clarendon Press.
Hayward, Richard J. 1978. The Qawko Dialects and Yaaku. Abbay 9:59–70.
Hinsene, Mekuria. 1998. Galmee Jechoota Afaan Oromoo-Amaaraa-Inglizii [Oromo-Amharic-English Dictionary]. Finfinnee (Addis Ababa).
Horsch, Roland. 2006. Suggesting an Orthography for the Dullay Language Cluster. Unpublished manuscript, SIL.
Kane, Thomas L. 1991. Amharic-English Dictionary. Wiesbaden: Harrassowitz.
Kebede, Haregewoin. 2003. Aspects of Gawada Phonology. Vol. 3. 42–56. Zena Lissan: Ethiopian Languages Research Center, Addis Ababa University.
Les, Ton & Van de Loo, Joseph & Cotter, George. 1992. An Oromo-English Vocabulary. Debre Zeit, Ethiopia.
Leslau, Wolf. 1976. Concise Amharic Dictionary. Wiesbaden: Harrassowitz.
Leyew, Zelealem. 2003. The Kemantney Language: A Sociolinguistic and Grammatical Study of Language Replacement. Köln: Rüdiger Köppe.
Minker, Gunter. 1986. Burji - Konso-Gidole - Dullay: Materialien zur Demographie, Landwirtschaft und Siedlungsstruktur eines südäthiopischen Kulturareals [Burji - Konso-Gidole - Dullay: Materials on demography, agriculture and settlement structure of a southern Ethiopian cultural area]. Bremen: Übersee-Museum.
Sasse, Hans-Jürgen. 1986. A Southwest Ethiopian Language Area and Its Cultural Background. In Fishman, Joshua A. & Tabouret-Keller, Andrée & Clyne, Michael & Krishnamurti, Bh. & Abdulaziz, Mohamed (eds.), The Fergusonian Impact, Vol. 1, 327–342. Berlin: Mouton de Gruyter.
Savà, Graziano. 2005. A Grammar of Ts’amakko. Köln: Rüdiger Köppe.
Savà, Graziano & Tosco, Mauro. 2008. Ex Uno Plura: The uneasy road of Ethiopian languages towards standardization. International Journal of the Sociology of Language 119: 111–139.
Sim, Ronald James. 1977. A linguistic sketch: Phonology and morphology of the word in Konso. M.A. thesis. University of Nairobi.
Tosco, Mauro. 2000. Cushitic Overview. Journal of Ethiopian Studies 33(2):87–121.
Tosco, Mauro. 2004. The Case for a Laissez-Faire Language Policy. Language and Communication 24(2):165–181.
Tosco, Mauro. 2005. La naissance d’une catégorie morphologique: Les clitiques sujet entre couchitique et langues romanes [The birth of a morphological category: the subject clitics between Cushitic and Romance languages]. Faits de langues 26:203–215.
Tosco, Mauro. 2007a. Gawwada Morphology. In Kaye, Alan S. (ed.), Morphologies of Asia and Africa, Vol. 1, 505–528. Winona Lake, Indiana: Eisenbrauns.
Tosco, Mauro. 2007b. Feature-geometry and diachrony: The development of the subject clitics in Cushitic and Romance. Diachronica 24(1):119–153.
Tosco, Mauro. 2008. Between subordination and coordination in Gawwada. In Frajzyngier, Zygmunt & Shay, Erin (eds.), Interaction of morphology and syntax: Case studies in Afroasiatic, 207–226. Amsterdam: John Benjamins.
Wedekind, Klaus (ed.). 2002. Sociolinguistic Survey Report of the Languages of the Gawwada (Dullay), Diraasha (Gidole), Muusiye (Bussa) Areas. SIL International.


Loanword Appendix Amharic !alame kipraate ciwciwitto qurcumcumitte nassa!akime kinine tiste paqeela tinni"a zayte parpaare t’ayye piira marfe suufa k’ope kiise samuna k’ulfe maskoote patri kapare kaspo poqollo kayye muuse sansalate pillaawe k’ork’orro k’alame (1) tiltilte k’anpara pirre saantipe k’aratto waaka mulo kize sa!ate sampata k’ummasa !itile !itile makitte

world match cock/rooster ankle to breathe physician medicine kettle bean potato oil chili pepper mead beer needle (1) (woman’s) dress hat or cap pocket soap key window lamp, torch farmer barley maize/corn tobacco banana chain knife (2) tin, tinplate paint, color bridge yoke money coin tax price all time hour Sunday Friday good luck bad luck

!afrite !asaap !amman ye huli moo((e tamar tamaritto !astamaritto minne tamaratte mekinnate t’aratt’t’ar t’af nipaap wark’ate koosa !assas fak’k’at t’alaata paalake kaat k’it’ate

minne "aappete !amaante k’esitto sappak toom retiyo silke !akimitte paampa mankiste polisitto taptaappe t’armuse karamela pompe filime kaarne mootore makinatte nafse ma'ammariya

shame to think (1) to believe understand stupid to study pupil teacher school

minne kiristaanay

cause to suspect to write to read paper the clan to command, to order to permit enemy prostitute to betray penalty, punishment prison religion priest to preach to fast radio telephone nurse tap/faucet government police letter bottle candy/sweets bomb television birth certificate motorcycle machine life beginning

saykile

church

Colloquial Amharic piife kalse motopilakko

lunch sock, stocking car

English bicycle

Italian kaacca

airplane

Konso "aahe xoope sipilitte sipile !orre to!on kanta qoota

sugar shoe nail iron potter one neighbor piece

Oromo mayt’a torpa mala k’alame (2) mat’afa mantara karkar nakayho k’awe nak’aa"e lonce

palm tree week manner pen book village to help peace gun witness bus

Areal words haay !a!a

no no

Unknown Origin roo..ile !alquqa

airplane bean

Chapter 4

Loanwords in Hausa, a Chadic language in West Africa*

Ari Awagana and H. Ekkehard Wolff, with Doris Löhr

1. The language and its speakers

Hausa is one of the major lingua francas in parts of West and Central Africa. It is spoken by up to 50 million speakers, about half of them mother-tongue (L1) speakers, along the southern fringe of the Sahara desert (“Sahel”) and in many urban agglomerations between Dakar (Senegal) and Khartoum (Sudan), Tunis (Tunisia) and Duala (Cameroon), particularly along the pilgrims’ route to and from the Holy Cities of Islam in Arabia. Presently, the Hausa heartland of predominantly mother-tongue speakers of the language lies on both sides of the international border between Nigeria and Niger, i.e. north of the Nigerian Middle Belt region and south of the semi-arid and arid zones of the Sahara in Niger. Individuals and small groups of speakers can further be found in major cities outside the Central African region and outside Africa due to more recent migration and globalization. In the Republic of Niger, L1 speakers of Hausa constitute the largest ethno-linguistic group with more than 50% of the population (about 7 million); the national range of Hausa as main lingua franca in most of the territory of Niger is estimated at about 80%. In Nigeria, Hausa is one of the three biggest lingua francas, together with Yoruba and Igbo; its range within Nigeria may be estimated to cover up to one third of the total population of about 120 million.

Hausa belongs to the Western branch of the Chadic language family (sub-branch A), one of the macro-families within the Afroasiatic phylum. The classification of Chadic as Afroasiatic and Hausa as a West Chadic language is undisputed. On the basis of Newman (1990: 2–4), the sub-classification of Chadic is given in Figure 1 in terms of branches, sub-branches, and language groups. The “Hausa group” itself consists of Hausa and its only sister language Gwandara, now found to the south of Hausa territory as part of the Middle Belt linguistic fragmentation zone and typologically quite distinct due to heavy interference from surrounding Benue-Congo languages.

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Wolff, Ekkehard & Awagana, Ari & Löhr, Doris. 2009. Hausa vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1452 entries.

4. Loanwords in Hausa

Chadic
  Western
    A: Hausa, Bole, Angas, Ron
    B: Bade, Warji, Saya
  Central
    A: Tera, Bura, Higi, Mandara, Matakam, Sukur, Daba, Bata
    B: Kotoko, Musgu
    C: Gidar
  Masa
    Masa
  Eastern
    A: Somrai, Lele, Kera
    B: Dangla, Mukulu, Sokoro

Figure 1: The Chadic language family

Hausa was one of the first Chadic languages, besides Kotoko and Wandala (Mandara), to be known to philologists and linguists outside Africa. The earliest lexicographic and descriptive sources going beyond short vocabularies (cf. Afnu and Kaschne in Adelung & Vater’s Mithridates of 1812) date back to the middle of the 19th century, beginning with the works of James Frederick Schön (1843 and later) and Heinrich Barth (1862). Various dictionaries have been published since then by Schön (1862), Robinson (1899–1900), Mischlich (1906), Bargery (1934), Abraham & Kano (1949), Skinner (1959, 1965), Newman & Newman (1977), Herms (1987), R. M. Newman (1990), McIntyre & Meyer-Bahlburg (1991), Mijinguini (1987), Awde (1996), Caron & Amfani (1997), and P. Newman (2007). In terms of reference works on Hausa grammar, the language prides itself on three quite recent reference grammars (Wolff 1993; P. Newman 2000; Jaggar 2001). Standardization of Hausa based on the Roman alphabet began in 1911 (Vischer 1912); there is also a tradition of writing Hausa in ’ajami, i.e. an adaptation of the Arabic script. Note that the standard orthography (as also used in this chapter unless a phonemic transcription is required) indicates neither distinctive vowel length and tone nor the phonemic distinction between retroflex-flapped /r/ [ɽ] and coronal tap or rolled /r̃/.

So-called “Standard Hausa” is largely based on the koine spoken in the urban agglomeration of Kano in Nigeria. Systematic Hausa dialectology is poorly developed; a notable exception is the monographic description of the non-standard Hausa variety of Ader spoken in Niger (Caron 1991); sporadic reference to and evidence from Hausa dialects are found, for instance, in the reference grammars of Wolff (1993) and Newman (2000). There is a clear division line in terms of bundles of isoglosses that separates a major dialect group in the northwest situated largely in Niger from another group in the southeast and situated largely in Nigeria, including the city of Kano on which Standard Hausa is based. There is full intelligibility among speakers of Hausa dialects despite considerable differences in phonology, grammar and lexicon. The major dialects are referred to by names of settlements or geographical areas: Ader (around Tahoua), Kurfey (Filingué), Arewa (Dogondoutchi), Tibiri (Maradi), Damagaram (Zinder) in Niger, Sakkwato (Sokoto), Katsina, Daura, Kano, Zazzau (Zaria) and Guddiri (Bauchi) in Nigeria.

Comparative Chadic linguistics began some 70 years ago with the early works of Johannes Lukas under the assumption of a relevant dichotomy of “Chadohamitic” vs. “Chadic” languages based on partly doubtful typological evidence as much as on partly robust comparative evidence. Sound lexical and phonological comparison targeted at the reconstruction of Proto-Chadic lexicon and based on Greenberg’s (1963) hypothesis of the genealogical unity of the “Chad” family within Afroasiatic, including both Lukas’ “Chadohamitic” and “Chadic”, began with Newman & Ma (1966, revised and enlarged by P. Newman 1977) and Jungraithmayr & Ibriszimow (1994), to be complemented (with a focus on Hausa and rather impressionistic at times) by Skinner (1996).

Hausa is commonly viewed as a language of highly mobile commercial traders who created city-states (the “legitimate seven” Hausa kingdoms, called Hausa bakwai vs. the “illegitimate seven”, banza bakwai) mainly in what is now northern Nigeria, with subsequent migration and formation of diaspora communities over large parts of West and Central Africa, at the cross-roads of the major pilgrimage route to Mecca and trans-Saharan trade routes.
In the rural areas, however, the L1 Hausa speakers are largely farmers (staple crops: sorghum and millet) with small animal husbandry, and in more urban settlements they work as small scale traders, traditional craftsmen (blacksmiths, tanners, leatherworkers) or butchers. The mythological origin of the Hausa is reported to have been in Daura, following the immigration of the founding hero Bayajida, a prince from Bornu (i.e. the Kanem Empire) or even Baghdad, to the east. Hausa historiography is supported, if not irritated, by rich chronicle materials, the most informative being the so-called Kano Chronicle. Orthodox and quite popular accounts held until more recently like to view the origin of Hausa rather in the north, preferably the area around Agades in the southern Sahara. This view cannot be maintained in the light of modern linguistic geography and historical linguistic research, and is likely a survival of the notorious and long since discredited “Hamitic” element in Hausa historiography. Hausa is further credited to have been the dominant language of the Sokoto Sultanate, including its ruling class of Fulɓe origin, spreading dynamically during its jihadic territorial expansion in the early 19th century. Indeed, the vast geographical spread of Hausa both as L1 and as lingua franca requires some explanation. It can be considered an established fact that pre-Hausa was one of several “local” West Chadic languages spoken, at one point in time estimated at about a millennium ago, somewhere near the Jos Plateau in the Nigerian Middle Belt region, perceptibly along and below granite inselbergs and hill escarpments with their commonly

4. Loanwords in Hausa

fertile bases and where most of its most closely related sister languages of the Angas and Ron group are still spoken (cf. Ballard 1971, Sutton 1979), i.e. along the southern and/or south-eastern borders of today’s Hausa expansion zone (cf. Map 1). Sutton (1979: 184f.) speculates on a first expansion “with the nuclear region in or against the Bornu marches, that of the Daura-Hadejia-Kano very roughly, a thousand years or so ago, the next development would have been the incorporation of northern Zazzau and Katsina in the early centuries of the present millennium... The further expansion through this wooded zone into the plains of Zamfara and Kebbi beyond (modern Sokoto, that is) need not have been earlier than the fourteenth-fifteenth centuries; and with the rise of Kebbi to power and the retreat of that of Songhai in the sixteenth century Hausa influence would have radiated rapidly in these western areas... The fifteenth century is seen as a ‘watershed’.”

“It is then – significantly too the time of the rise of Songhai and of Oyo, of Bornu and of Agades – that the emergence of substantial city-states in eastern Hausaland – Kano, Katsina and Zazzau – first becomes obvious and that Hausaland clearly connects itself with the cultural and commercial network of the Sudan and the Sahara and its Islamizing features... Therefore the celebrated passage in the Kano Chronicle indicating a mid-fifteenth century efflorescence, marked by new cultural, commercial and political developments and external contacts in all four directions, though doubtlessly impressionistic, is not mere invention.”

The most plausible scenario based on what we know about linguistic geography would, therefore, involve a basically westward and north-western expansion of Hausa from an area somewhere between the Jos and the Bauchi Plateaus, its eastward and north-eastern expansion probably blocked by the installation of Kanuri of the Kanem-Bornu Empire (cf. the chapter on Kanuri in this volume). Hausa would have met with and, quite likely, have assimilated several smaller languages of (East) Benue-Congo genealogical affiliation, for instance the formerly so-called Plateau languages, and connected, finally, with speakers of Songhay, Mande and Berber (Tamajeq). These contacts have left traces in Hausa vocabulary, for which the subdatabase provides examples – in addition to Arabic, English, Kanuri, French and other donor languages of more recent times and reflecting rather different contact types. Islam probably arrived in Hausaland from both the west and northwest (with Songhay and Mande languages acting as potential linguistic vehicles) and from the east via Kanuri speakers. It appears to have been established permanently in the area by the 12th/13th century, and since then has made Arabic an important direct source for loans.
Kanuri must be singled out as the major indigenous donor or intermediary language for loans into Hausa, including loans ultimately from Arabic, since from the sixteenth century onward Hausaland came under the impact of the Kanuri-speaking Borno state, which “exerted considerable cultural influence on Hausaland, on eastern Hausaland especially and in the fields of Islam, learning and administration” (Sutton 1979: 194). The trade of kola nuts from the coastal areas, on the other hand, would have connected Hausa economy with the south and could

explain the role of languages like Yoruba (and possibly other Niger-Congo languages along the trade routes) as donors or intermediaries for loanwords of both African and European origin. With the arrival of British and French colonialism towards the end of the 19th century and subsequent independence of Nigeria (formerly under British rule) and Niger (formerly a French colony), Hausaland came under the still persisting localized linguistic influence of these two (ex-) colonial languages: English in Nigeria, French in Niger. However, full active command of these languages, which serve as official languages, is rather minimal and largely restricted to urban agglomerations and the schooled elites. Hausa remains a widely spoken L1 and a highly dynamic and still spreading lingua franca both in Nigeria and in Niger, and in other neighboring countries of the larger region.

Map 1: Geographical setting of Hausa (map based on Sutton 1979: 183)

2. Sources of data

Loanwords in Hausa have already attracted the attention of quite a few linguists in the past, such as Greenberg (1947, 1960), Brauner (1964), Gouffé (1974), Hoffmann (1970), Wexler (1980), Baldi (1992, 1995, 1999), Skinner (1981, 1996) and Kossmann (2005). The lexical data for the Hausa subdatabase were taken, first of all, from the linguistic competence of one of the project members, Dr. Ari Awagana, who grew up in the eastern part of Niger with L1 Kanuri and acquired L2 Hausa, practically in a

simultaneous fashion (Damagaranci and other western varieties of Hausa) in Niger. The data have been cross-checked with available dictionaries, mainly Bargery (1934), Newman & Newman (1977), R. M. Newman (1990), Awde (1996) and Broß & Baba (1996). Where Niger and Nigeria Hausa varieties differed with regard to loans from the respective (ex-) colonial language, i.e. French or English, the loans from both donor languages were included to reflect the parallel localized input during colonial and post-colonial times. In order to identify loanwords in Hausa, the available publications on both (potential) loans and lexical reconstructions (for Chadic and Afroasiatic, but also for Saharan and Nilo-Saharan as well as for Niger-Congo and some of its families) have been consulted, including, in particular, Skinner (1996) for Hausa, and the Sahelia database for languages of the wider Sahel zone, which was made available to us by our colleague Robert Nicolaï (Université de Nice), whose support is gratefully acknowledged.

3. Contact situations and types

Arabic stands out as the most frequent donor language to Hausa, followed by English, and then by Kanuri, followed in turn by French. Note that Kanuri often serves as intermediary for Arabic loans into Hausa, as was noticed quite early by Greenberg (1947). This is easily explained by reference to the long time period of contact and its nature, as is reflected in both the number of loans and the particular semantic domains. Arabic as a donor language is, first of all, closely linked to the advent and spread of Islam in West and Central Africa, both in terms of religion and dominant culture. Islamization began in the 11th century. Quite likely, Islam reached the area between the river Niger in the west and Lake Chad in the east via several gateways and roughly at the same time: The earliest contacts may have been through either the straight north-south trans-Saharan route from Fezzan to Lake Chad or from the east, i.e. following what would later become the main pilgrimage route along the southern fringes of the Sahara desert that connected, for instance, the Islamic centers in Mali, such as Timbuktu, with Khartoum in Sudan and the Holy Cities across the Red Sea. There was also the north-west gateway via Morocco and the western trans-Saharan trade routes. The central and eastern routes would first have affected the Kanuri, who would then have passed the culture on to the Hausa states that had developed to the west of them. The north-western route would make Mande-speaking intermediaries a likely hypothesis, as Islam reached the territories of the empires first of Mali, and later of Songhay, before it reached the Hausa states and, subsequently again, Kanem-Borno, where all three gateways finally meet. Note also that the early 19th century jihad of Usman ɗan Fodio spread from Sokoto (one of the Hausa states and center of the Sokoto Sultanate).
Occasionally, Berber-speaking Tuareg of the Southern Sahara may have played a role in the process. Independent of and prior to the advent of Islam and in the course of expanding from their Proto- or Early-West Chadic homeland somewhere between the Jos and

Bauchi Plateaus in the Nigerian Middle Belt, Hausa-speaking groups formed local kingdoms, later to be known as the “seven legitimate” and “seven illegitimate” Hausa city-states (Hausa bakwai, banza bakwai), which later became early centers of cultural and commercial contacts, situated on the cross-roads of the west-east pilgrimage route and the north-south trans-Saharan trade routes. This western and north-western expansion must have brought contacts with speakers of autochthonous non-Chadic languages (presumably Benue-Congo or “Plateau” languages), many of whom were, quite likely, assimilated linguistically and culturally. Note that “being Hausa”, at least today, simply means to be a speaker of the Hausa language. Following the expansion from the Middle Belt region, Hausa came into contact not only with Kanuri, but also with the languages of the Songhay Empire, Mande languages, and Berber as spoken by the Tuareg (whose varieties shall be jointly referred to here as “Tamajeq”). Commercial contacts with the coastal regions in the south imply contact with Yoruba and other Benue-Congo and Kwa languages. Traces of all such contacts can be found in the subdatabase. With the advent of missionaries and British and French colonialism, English and French became major sources of loans, particularly in semantic domains related to colonial administration, Christian mission, and 20th century Western civilization. The observation that English plays a larger role as donor in the database than French may reflect our primary concern with Standard Hausa, which is based on the dialect (or rather: koine) of the city of Kano in Nigeria in the former British sphere of influence.
Hausa soon became the language of the “West African Frontier Force”, which was recruited mostly from non-Hausa minority groups from the Nigerian Middle Belt, and was standardized as early as 1911 to serve for (adult) literacy campaigns which began in Northern Nigeria in 1912; the first western-type school opened in 1919. Other African languages may have acted as intermediaries for loans from English and French, particularly languages of the West African coastal regions from where much of the colonial conquest set out, so that coastal languages like Yoruba are brought into the picture as likely carriers of English loans to the interior. The most common intermediary language, however, was apparently Kanuri, a lingua franca in its own right at the time of the Kanem-Borno Empire, which passed on many words, particularly words ultimately from Arabic, that traveled across the Western and Central Sudan. It is not unlikely that some loans of ultimately Arabic origin came into Hausa via the north-western gateway, i.e. Moroccan influence particularly on the Songhay Empire that bordered on the north-west of the Hausa expansion zone before and during the sixteenth century. In view of the assumption that Chadic languages may have had a rather long history of geographical neighborhood with, first of all, Benue-Congo languages to the south and west, but also Saharan languages to the east and north, it is not surprising to find lexical items which are geographically very widely spread across established genealogical affiliations of languages and whose ultimate origin remains obscure to the extent that the few available reconstructions of proto-languages may reflect such items across different language phyla. For such lexical items that have

become inherited vocabulary in more than one established phylum (i.e. Nilosaharan, Afroasiatic, Niger-Congo), we have introduced the term “areal root”. For such areal roots, we assume borrowing at such time-depths that the ultimate origin and directions of borrowing are no longer detectable. Such examples may also include lexemes of universal onomatopoeic origin, such as hùuhúu ‘the lung’ and nóonòo ‘the milk’.

(1) ‘lung’:
    Proto-Afro-Asiatic *fuf- ‘lung, breast’ (Orel & Stolbova)
    Proto-Nilo-Saharan *pʰùh (Ehret)
    Kanuri fùfú

(2) ‘milk’:
    Proto-Afro-Asiatic */amam- (Orel & Stolbova)
    Chadic non-, nan-, nuw!n ‘mother’ (Bole-Tangle group), num ‘to milk’ (Buduma, Logone)
    Cushitic *nunu’ ‘suck breast’ (Proto-South Cushitic), *an"na ‘breast’ (Proto-Highland East Cushitic), *n"g- ‘suck’ (Proto-East Cushitic), *n-gw-, *Ngw- ‘nipple, breast, suck(le)’ (Proto-Cushitic), na#$! ‘to milk’ (Bedauye)
    but also Nilo-Saharan n%nu ‘téter’ (Zarma)
    Niger-Congo *nono (Mande, Bambara, Proto-Mandekan), *canon- ‘breast’ (Orel & Stolbova)

Possibly also ƙáaráa ‘the sound or noise’ belongs here, cf. the root ƙaaraa in Chadic languages and in Afro-Asiatic, but also in Proto-Nilo-Saharan: k&ìl, Saharan kl or gr. Other “areal roots” may relate to “baby talk” reduplicative kinship terms denoting close relatives, such as kàakáa ‘the grandfather’, cf. Kanuri kaga, Tubu kaka, Songhai k%ga, Bagirmi kaka, Common Bantu kaka. Currently we consider the following to be such areal roots (or rather old Wanderwörter):

(3) Areal roots in Hausa
    gàdúu ‘the boar’, cf. Kanuri gòdú, Tubu gadu, Common Bantu *-gudu
    kàrée ‘the dog’, cf. Kanuri k!%rì, Proto-Nilo-Saharan *kor (2) (Bender)
    ƙàdángárèe ‘the lizard’, cf. Dera gandal, Fali Kiria (w)njaxala, Cushitic dangalai, Tubu kadunkuli
    karyèe ‘to break’, cf. Proto-Chadic k!'!, Lele kar ‘enlever les feuilles vertes des tiges’, Bilin kar, Janjero ka’ra, Songhay keyri &, Teda k!r, g!r
    ƙéeràa ‘to forge’, cf. Plateau Chadic kw-l- ‘forge, falsehood’&, Siri & Diri "wa', Mafa g!(a, Bacama k!la ‘blacksmith’, Fali Jilbu kura, but also Kanuri kág)%l, kàrò and Mande kula, kola ‘forger, enclume’; *"ura*- ‘strike’ (Orel & Stolbova)

Further, the Hausa word góomà ‘ten’ belongs here: gwm/gw-m is a likely old Niger-Congo loan (cf. Gbaya go+#ma ‘100’, Proto-Bantu *-kumi ‘10’), which occurs quite frequently in Chadic and may be here reconstructed as *gʷam-. For a wider distribution of possibly related forms in Afroasiatic cf. gim ‘1,000’ (Tamajeq), *kw-m- ‘large number, heap, 1,000’ (Proto-Cushitic), *gum’a ‘all’ (Proto-Highland East Cushitic), k"ma ‘1000’ (Janjero). For the time being, however, it is necessary to admit that in the context of African languages, contact scenarios, routes and intermediaries even for clearly borrowed words are hard to establish and almost impossible to prove beyond doubt. The reasons are scarcity of data on potential donor or intermediary languages, lack of historical documentation and of methodologically sound reconstructions, and lack of robust dialectological evidence even for the target language. Judgment, therefore, often remains intelligent guesswork.

Contact with Arabic as the language and symbol of Arabo-Islamic culture can be assumed to have begun with Islamization during the 12th/13th century CE and has continued until this day. The leading figures in Hausa society maintain contact with the other parts of the Muslim world, for instance, through Qur’anic education, pilgrimage (hajj), commercial migration, and higher education. This does not necessarily imply Hausa-Arabic bilingualism. Arabic is present, most of all, as a written language of the Holy Qur’an. Arabo-Islamic culture can be considered to have been and still be dominant, yet without being connected with direct political or military dominance in Hausaland.

Contact with Kanuri can be assumed to be characterized by two different scenarios. A first contact period must have begun after the westward migration of Kanuri speakers into Borno and the establishment of their new capital in Birni Gazargamo in the 15th century (cf. the chapter on Kanuri in this volume). By that time, Hausa may already have played some role as a commercial lingua franca in the so-called Hausa states that were geographically adjacent in the west of Borno. Intermarriage might have occurred to no little extent.
Both societies shared the common impact of Arabo-Islamic culture and urban medieval civilization. A certain amount of bilingualism cannot be excluded; bilingualism, in any case, marks the more recent contact situation, which can be said to begin during the colonial period when Hausa became the dominant lingua franca of Northern Nigeria, and of the armed forces. After independence of both Nigeria and Niger, Hausa gained considerable dominance in the spheres of commercial activities, politics and administration, education, and the media, while the role of Kanuri as a lingua franca has been continuously diminishing since the beginning of the 20th century.

English and French represent colonial conquest since roughly 1880, and postcolonial political and cultural dominance even after independence of both Nigeria and Niger. Hausa-English bilingualism in Nigeria and Hausa-French bilingualism in Niger are common only among the schooled elites. English and French are ubiquitous media of communication in the urban centers, less so in the rural areas, both orally and as written languages of documents and in the print media. Both languages are also widely heard over radio and television.

In the database, 324 words have been identified as certain or probable loanwords. In descending order, 112 loans from Arabic (35%) are followed by 88 loans from English (27%), 44 from Kanuri (14%), and 28 from French (9%). Distinctly fewer loans are identified for Yoruba (9 loanwords), Fulfulde (5), Songhay (5), Berber (3), Mande (1), and unidentified Benue-Congo source(s) (3 loanwords).

Loans from Berber into Hausa reflect both lexemes that are widely spread in Berber and those that appear to be clearly attributable to the language of the Tuareg, i.e. Tamajeq and its varieties, which have long been in contact with Hausa. General Berber words (some of ultimately Arabic origin) in Hausa are:

(4)

Hausa íyàakáa ràaƙúmíi túnkìyáa (

àlfíjìr. ‘dawn’, al-’ibra > àllúur.àa ‘needle’, al-xamīs > Àlhàmîs ‘Thursday’, al-qalam > àlƙálàmíi ‘pen’, al-qāḍī > àlƙáalíi ‘judge’. The uvular stop /q/ is quite regularly rendered by the

ejective velar /ƙ/ in Hausa: al-qalam > àlƙálàmíi ‘pen’, al-qāḍī > àlƙáalíi ‘judge’. Passing through Kanuri, Arabic loans may lose their characteristic ‘article’ on the way by reducing it to initial /l/ of the loanword: al-xayma > Kanuri l-imà > Hausa láimàa ‘tent’, al-waqt > Kanuri lóktù > Hausa lóokàcíi ‘time’, al-ʕafia > Kanuri k!1láfíà (with Kanuri prefix k!1-) > Hausa láafíyàa (identifying the word ending with Hausa overt feminine gender marking) ‘health’.

Stressed word-medial syllables in English are often rendered by long vowels in Hausa; vowels unstressed in English and epenthetic (final) vowels in Hausa tend to be short: button > bóotìn, towel > táawùl, court > kóotù, wool > úulù. As a rule, inherently non-definite nouns in Hausa have a long final vowel; thus long final vowels in Hausa that originate from consonant-final words in English could indicate the ‘age’ of loans: yard > yáadìi (particularly as a measure for textile fabrics), bread > búr.óodìi, which may be among the oldest loans into Hausa from English (and if via a coastal language).

Several phonological processes in Hausa apply with regard to the integration of loanwords from and through Kanuri. This may be an indication of their relative age and is based on differences in the phonological inventory of the two languages. Whereas Kanuri has no vowel length distinction, Hausa needs to identify vowels as either short or long, not least in order to make them fit structural patterns of both nouns and verbs, cf. ráwànyí > r.áafàníi ‘uncle’, tágà > táagàa ‘window’ (< Arabic), lóktù > lóokàcíi ‘time’ (< Arabic), rùwùt!2 > r.úbùutáa ‘write’ (< Arabic).
Further, since Hausa no longer possesses a central vowel schwa, Kanuri /ə/ is rendered as any of the three short vowels that are available for word-medial position in Hausa: /a/, /i/, /u/; the choice of quality would appear to depend on the immediate phonological environment: ng!1rmà > íngár.màa ‘stallion’, ng(%rmù > ùngùlúu ‘vulture’, k!2móló > kùmállóo ‘fasting’, k!2t!2fó > kùtùfóo ‘fist’, k!2l!)s!1 > kìlíishíi ‘rug’, f!1làif!1lái > fílàafílíi ‘paddle’, b!2rnyí > bír.níi ‘town’. Note that tone melodies sometimes match but sometimes do not, and no readily apparent conditioning has been found.

Hausa distinguishes retroflex flapped /r/ [ɽ] and coronal tap or rolled /r̃/ (transcribed r. in the examples); Kanuri loans are usually rendered with the latter: ráwànyí > r.áafàníi ‘uncle’, búwúr > búkúr.úu ‘bowl’, gàrú > gàar.úu ‘wall’, k!2ràt!2 > kár.àntáa ‘read’ (< Arabic), rùwùt!2 > r.úbùutáa ‘write’ (< Arabic). Some of these examples clearly show the relative age of borrowing, namely where Hausa has consonantal features that have been lost due to “weakening” in more recent periods of Kanuri linguistic history, cf. Kanuri /w/ : Hausa /f/, /w/ : /k/, and /w/ : /b/ as in ráwànyí : r.áafàníi ‘uncle’, búwúr : búkúr.úu ‘bowl’, rùwùt!2 : r.úbùutáa ‘write’; in these examples Hausa reflects older Kanuri forms such as postulated *ráfànyí (with possessive suffix -nyí), *búkúr, and *rùbùt!) (with verbal noun suffix -t!)). The same would be true for palatalization: Hausa reflects unpalatalized stages of earlier Kanuri periods, such as in ráwànyí > r.áafàníi ‘uncle’, àwá cíntà > ùbáa kíntàa ‘stepfather’, kájì > kàazáa ‘fowl’. On the other hand, Hausa appears to have syllabified the prenasal coarticulation of /ng/ in Kanuri in the following two examples: ng!1rmà > íngár.màa ‘stallion’, ng!1rmù > ùngùlúu ‘vulture’. Less clear is why Hausa would have spirantized initial /d/ in Kanuri, unless this is an indication that the word has not come into Hausa directly from Kanuri (both words are ultimately from Arabic): dínàr > zínáar.-ìyáa ‘gold’ (with overt feminine gender ending), dártò > zár.tòo ‘saw’.

Hausa tends to maintain the so-called moveable k which is prefixed to noun stems in Kanuri (cf. the analogy of maintaining the prefixed ‘article’ from Arabic): k!)móló > kùmállóo ‘fasting’, kùrángá > kwàr..ángáa ‘ladder’, k!)t!)fó > kùtùfóo ‘fist’; the following examples are also ultimately from Arabic and carry the Kanuri kV-prefix: kàsúgù [kàsúù] > kàasúwáa ‘market’, kárùa > káar.ùwà ‘prostitute’. With one noun at least, however, Hausa does not copy the kV-prefix, again ultimately a loan from Arabic: k!)láfíà > láafíyàa ‘health’.
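The vowel-length adaptation described above (Kanuri without a length contrast, Hausa assigning short or long vowels) can be illustrated with a minimal sketch. It assumes the transcription used in this chapter, in which long vowels are written as doubled vowel letters and tone is marked by accents; the function names are hypothetical and are not part of the project's database tooling.

```python
import unicodedata

VOWELS = "aeiou"

def strip_tones(form: str) -> str:
    """Remove combining tone marks (acute, grave, circumflex) from a transcribed form."""
    decomposed = unicodedata.normalize("NFD", form)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

def count_long_vowels(form: str) -> int:
    """Count long vowels, i.e. doubled vowel letters, ignoring tone marks."""
    plain = strip_tones(form)
    count = 0
    i = 0
    while i < len(plain) - 1:
        if plain[i] in VOWELS and plain[i] == plain[i + 1]:
            count += 1
            i += 2  # skip the second half of the long vowel
        else:
            i += 1
    return count

# Kanuri form > Hausa loan, as cited in the text above
PAIRS = [("tágà", "táagàa"), ("lóktù", "lóokàcíi"), ("gàrú", "gàar.úu")]

for kanuri, hausa in PAIRS:
    print(f"{kanuri} ({count_long_vowels(kanuri)} long) > "
          f"{hausa} ({count_long_vowels(hausa)} long)")
```

For the three cited pairs, the Kanuri forms contain no long vowels while each Hausa form contains two, which is the pattern the paragraph on vowel length describes. The sketch only counts letters; it makes no claim about which vowels are lengthened or why.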

6. Grammatical borrowing

Little is known about grammatical borrowing into Hausa. Its major typological grammatical features such as gender distinction, the aspecto-temporal make-up of its verbal inflection system, negation, derivative strategies for nouns and verbs (the latter referred to as the highly idiosyncratic ‘verbal grade system’) etc. are seen as either retentions or internally conditioned modifications from previous stages of Hausa’s linguistic history (Proto-Chadic, even Proto-Afroasiatic) or fairly recent grammaticalizations from within the system. To the best of the authors’ knowledge, systematic investigation into possible grammatical borrowing has not begun. The database contains two examples of lexical borrowing from Kanuri, however, where the Kanuri strategy of simply juxtaposing two nouns in ‘genitive’ association is copied into Hausa. If this were a noun+noun construction in Hausa, the insertion of a linking morpheme would have been required. This morpheme is missing in the following two expressions, which are loan translations with regard to N1, with N2 clearly borrowed from Kanuri (albeit in a non-palatalized earlier shape): àwá cíntà > ùbáa kíntàa ‘stepfather’, yâ cíntà > úwáa kíntàa ‘stepmother’.

7. Conclusion

In sociolinguistic perspective, Hausa has developed from a small local West Chadic language of the Nigerian Middle Belt to become the biggest and presently most dynamic lingua franca of a vast territory in West and Central Africa. It has assimilated speakers of various languages of different genealogical affiliations (Niger-Congo, Nilosaharan, Berber), probably over extended periods of bilingualism of individuals or larger groups of speakers. The assumed at least one thousand years of territorial expansion have entailed contact with other linguae francae of their times, such as Songhay and Kanuri; commercial trade networks have also linked Hausa-speaking communities with the West African coast and across the Sahara, involving contact, for instance, with Yoruba and Berber (Tamajeq). During colonial times, Hausa became an integrating factor for linguistic minorities (in the armed forces/police) in most parts of northern Nigeria as much as in the African diaspora. All this would explain substratum effects of varied origins. On the other hand,

Hausa has widely borrowed from Arabic, the language of Arabo-Islamic culture since at least the 12th/13th century, mediated to no little extent by Kanuri, the neighboring Nilosaharan language of the Kanem-Borno Empire for at least the last 500 years. Finally, English (in Nigeria) and French (in Niger) have left lasting impressions as dominant languages of colonial rule and as official languages since independence.
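The donor-language shares reported in Section 3 can be re-derived with a short calculation. The sketch below takes the counts exactly as given in the text (324 certain or probable loanwords in total); the variable and function names are illustrative assumptions, and the chapter does not itemize the loans left over once the listed donors are summed.

```python
# Counts as reported in Section 3 of this chapter.
TOTAL_LOANS = 324
LOANS_BY_DONOR = {
    "Arabic": 112,
    "English": 88,
    "Kanuri": 44,
    "French": 28,
    "Yoruba": 9,
    "Fulfulde": 5,
    "Songhay": 5,
    "Berber": 3,
    "Benue-Congo (unidentified)": 3,
    "Mande": 1,
}

def share(count: int, total: int = TOTAL_LOANS) -> int:
    """Percentage of `count` in `total`, rounded to a whole percent."""
    return round(100 * count / total)

def unitemized(counts: dict = LOANS_BY_DONOR, total: int = TOTAL_LOANS) -> int:
    """Loans in the total that are not attributed to any donor listed in the text."""
    return total - sum(counts.values())

for donor, n in sorted(LOANS_BY_DONOR.items(), key=lambda item: -item[1]):
    print(f"{donor:28s} {n:4d} {share(n):3d}%")
print(f"{'not itemized in the text':28s} {unitemized():4d}")
```

Rounding to whole percentages reproduces the figures given for the four main donors (35%, 27%, 14%, 9%). The listed donors sum to 298, so 26 of the 324 loans are not broken down by donor in the passage.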

Acknowledgments

Ari Awagana and H. Ekkehard Wolff herewith gratefully acknowledge intensive and continuous cooperation in the LWT project with their colleague at the Institut für Afrikanistik, University of Leipzig, Dr. Doris Löhr, as a linguistic expert on both Kanuri and Chadic languages west of Lake Chad. The authors further acknowledge the input of students and student assistants at the Institut für Afrikanistik, University of Leipzig, who took an active part in research seminars that were at least partly devoted to the study of loan words in Hausa and Kanuri during the academic years 2004/5–2006/7, some of it in the highly conducive environment of the university’s facilities in Zingst at the southern shore of the Baltic Sea.

References

Abraham, R. C. & Kano, Mai. 1949. Dictionary of the Hausa Language. London: University of London Press.
Adelung, Johann C. & Vater, Johann S. 1806–1817. Mithridates, oder allgemeine Sprachenkunde. 6 vols. Berlin: Vossische Buchhandlung.
Awde, Nicholas. 1996. Hausa-English, English-Hausa Dictionary. New York: Hippocrene Books.
Baldi, Sergio. 1992. Arabic loanwords in Hausa via Kanuri and Fulfulde. In Ebermann, Erwin & Sommerauer, Erich R. & Thomanek, Karl E. (eds.), Komparative Afrikanistik (Festschrift Mukarovsky), 9–14. Wien: Veröffentlichungen der Institute für Afrikanistik und Ägyptologie der Universität Wien.
Baldi, Sergio. 1995. On Arabic Loans in Hausa and Kanuri. In Ibriszimow, Dymitr & Leger, Rudolf (eds.), Studia Chadica et Hamito-Semitica: Akten des Internationalen Symposions zur Tschadsprachenforschung, Johann-Wolfgang-Goethe-Universität, Frankfurt am Main, 6.–8. Mai 1991, 252–278. Köln: Rüdiger Köppe.
Baldi, Sergio. 1999. Ancient and New Arabic Loans in Chadic. (University of Leipzig Papers on Africa). Leipzig: Institut für Afrikanistik.
Ballard, John A. 1971. Historical inferences from the linguistic geography of the Nigerian middle belt. Africa 11(1):294–305.
Bargery, George P. 1934. A Hausa-English Dictionary and English-Hausa Vocabulary. London: Oxford University Press.
Barth, Heinrich. 1862. Sammlung und Bearbeitung Central-Afrikanischer Vokabularien. Vol. 3. Gotha.
Brauner, Siegmund. 1964. Bemerkungen zum entlehnten Wortschatz des Hausa (Yorubalehnwörter im Hausa) [Remarks on the borrowed vocabulary of Hausa (Yoruba loanwords in Hausa)]. Mitteilungen des Instituts für Orientforschung 10:103–107.
Broß, Michael & Baba, A. T. 1996. Dictionary of Hausa Crafts / Kamus na Sana’o’in Hausa: A Dialectal Documentation / Bincike Kan Karin Harshen Hausa. Köln: Rüdiger Köppe.
Caron, Bernard. 1991. Le haoussa de l'Ader. Berlin: Reimer.
Caron, Bernard & Amfani, A. H. 1997. Dictionnaire français-haoussa: Suivi d'un index haoussa-français [French-Hausa dictionary: Followed by a Hausa-French index]. Paris/Ibadan: Karthala, IFRA-Ibadan.
Gouffé, Claude. 1974. Contacts de vocabulaire entre le haoussa et le touareg. In Actes du premier Congrès International de Linguistique Sémitique et Chamito-Sémitique. La Haye/Paris: Mouton.
Greenberg, Joseph H. 1947. Word 3:85–97.
Greenberg, Joseph H. 1960. Evidence for the Influence of the Kanuri on the Hausa. Journal of African History 1(2):205–212.
Greenberg, Joseph H. 1963. The Languages of Africa. Bloomington, IN: Indiana University Press.
Herms, Irmtraud. 1987. Wörterbuch Hausa-Deutsch [Hausa-German dictionary]. Leipzig: Verlag Enzyklopädie.
Hoffmann, Carl. 1970. Ancient Benue-Congo loans in Chadic. Africana Marburgensia 3(2):3–23.
Jaggar, Philip J. 2001. Hausa. Amsterdam/New York: John Benjamins Publishing Comp.
Jungraithmayr, Herrmann. 1988. Étymologie tchadique: Vocabulaire fondamental et anciens emprunts. In Barreteau, D. & Tourneux, H. (eds.), Le milieu et les hommes: Recherches comparatives et historiques dans le bassin du lac Tchad, 241–251. Paris: Orstom.
Jungraithmayr, Herrmann & Ibriszimow, Dymitr. 1994. Chadic Lexical Roots. Berlin: Dietrich Reimer.
Kossmann, Maarten. 2005. Berber loans in Hausa. Köln: Rüdiger Köppe.
McIntyre, Joseph & Meyer-Bahlburg, Hilke. 1991. Hausa in the Media: A Lexical Guide. Hausa-English-German / English-Hausa / German-Hausa. Hamburg: Buske.
Mijinguini, Abdou. 1987. Karamin Kamus na Hausa zuwa Faransanci (haoussa-français). Niamey: CELHTO.
Mischlich, Adam. 1906. Wörterbuch der Hausasprache [Dictionary of the Hausa language]. Vol. 1: Hausa-Deutsch. Berlin: Reimer.
Newman, Paul. 1977. Chadic classification and reconstructions. Afroasiatic Linguistics 5(1):1–42.
Newman, Paul. 1990. Nominal and Verbal Plurality in Chadic. Dordrecht: Foris.
Newman, Paul. 2000. The Hausa Language: An Encyclopedic Reference Grammar. New Haven/London: Yale University Press.
Newman, Paul. 2007. A Hausa-English Dictionary. New Haven/London: Yale University Press.
Newman, Paul & Newman, Roxana Ma. 1966. Comparative Chadic: Phonology and lexicon. Journal of African Languages 5:218–251.
Newman, Paul & Newman, Roxana Ma. 1977. Modern Hausa-English Dictionary. Ibadan: Oxford University Press.
Newman, Roxana Ma. 1990. An English-Hausa Dictionary. New Haven/London: Yale University Press.
Orel, V. E. & Stolbova, O. V. 1995. Hamito-Semitic Etymological Dictionary: Materials for a Reconstruction. Leiden: Brill.
Robinson, Charles H. 1899–1900. Dictionary of the Hausa Language. Vol. 1: Hausa-English, Vol. 2: English-Hausa. Assisted by Brooks, W. H. Cambridge: At The University Press.
Sahelia data base. n.d. http://sahelia.unice.fr/.
Schön, James F. 1843. A Dictionary of the Hausa language. London: Church Missionary House.
Schön, James F. 1862. Grammar of the Hausa Language. London: Church Missionary House.
Skinner, Neil. 1959. Hausa-English Pocket Dictionary. Zaria: Longmans, Green and Co.
Skinner, Neil. 1965. Kamus na Turanci da Hausa. Babban Jagora ga Turanci [English-Hausa Dictionary]. Zaria: The Northern Nigerian Publishing Company Ltd.
Skinner, Neil. 1981. Loans in Hausa and Pre-Hausa: Some Etymologies. In Jungraithmayr, H. (ed.), Berliner Afrikanistische Vorträge: XXI. Deutscher Orientalistentag, Berlin 24.–29.3.1980, 167–202. Berlin: Reimer.
Skinner, Neil. 1996. Hausa Comparative Dictionary. Köln: Köppe.
Stolbova, Ol'ga V. 1996. Studies in Chadic Comparative Phonology. Moscow: Diaphragma Publishers.
Sutton, J. E. G. 1979. Towards a less orthodox history of Hausaland. Journal of African History 20:179–201.
Vischer, Hans. 1912. Report on education. London: Edward Arnold.
Wehr, Hans. 1976. Arabic-English Dictionary. Ithaca: Spoken Language Services.
Wexler, P. 1980. Problems in monitoring the diffusion of Arabic into West and Central African languages. Zeitschrift der Deutschen Morgenländischen Gesellschaft 130:522–556.
Wolff, H. Ekkehard. 1993. Referenzgrammatik des Hausa [Reference grammar of Hausa]. Münster/Hamburg: Lit Verlag.

4. Loanwords in Hausa


Loanword Appendix

Arabic
áadàlíi ‘noble’
àbàdá ‘never’
àdíinìi ‘religion’
àkálàa ‘trough’
àl’áadàa ‘custom’
àlbâashíi ‘wages’
àlbár.kàtáa ‘to bless’
àléewàa ‘candy/sweets’
àlfádàr.íi ‘mule’
àlfíjìr. ‘dawn’
Àlhàmîs ‘Thursday’
àlhàr.íníi ‘silk’
àljánáa ‘fairy, elf’
àljíhúu ‘pocket’
àlƙáalíi ‘judge’
àlƙálàmíi ‘pen’
àlkámàa ‘wheat’
àlkyábbàa ‘cloak’
Allàh ‘god’
àllúur.àa ‘needle, injection’
àlmákàshíi ‘scissors, shears’
àmíinìi ‘friend’
ámìncée ‘to admit’
àráa ‘to lend’
àr.áadùu ‘bolt of lightning’
àr.àháa ‘cheap’
ár.zìkíi ‘rich’
Àsábàr. ‘Saturday’
àshìr.ín ‘twenty’
àsíir.íi ‘secret’
áudùgáa ‘cotton’
àzzákàr.íi ‘penis’
báa dà jàwáabìi ‘to give an answer’
báhàa; báhàr. ‘sea’
bàláa’ìn (yúnwàa) ‘famine (of hunger)’
cí àmáanàa ‘to betray’ (“eat betrayal”)
cí r.íibàa ‘to earn’
dáa’ìr.áa ‘circle’
ɗáalìbíi ‘pupil’
dàbáar.àa ‘idea’
dàlíilìi ‘cause’
dànkálìi ‘sweet potato’
díiwáanìi ‘bill’
dúuníyàa ‘world’
fàhímtàa ‘to understand’
fàkíir.ìi ‘poor’
fár.jìi ‘vagina’
fásàlíi ‘season’
fàtálwáa ‘ghost’
gàháawàa ‘coffee’
gàníimàa ‘booty’
gàríi ‘town’
gíyàa ‘wine (alcoholic drink)’
hádàríi ‘storm’
háɗàr.íi ‘danger’
hánkàlíi ‘intelligence’
hàr.áajìi ‘tax’
hàsúumíyàa ‘tower’
hìjáabìi ‘veil’
hùkúmtàa ‘to condemn’
húkúncìi ‘penalty, punishment’
ín ‘if’
ínàbíi ‘grape’
ìsháar.àa ‘omen’
ìyáalìi ‘family’
jàmá’àa ‘people’
jàr.íidàa ‘newspaper’
jár.r.àbáa ‘to try’
jàwáabìi ‘speech’
jéemàa ‘to tan’
Júmmá’àa ‘Friday’
kábàr.íi ‘grave’
kàbíilàa ‘clan’
kálmàa ‘word’
kàtíifàa ‘mattress’
kúllúm ‘always’
là’ánàa ‘crime’
Làar.àbá ‘Wednesday’
Láhádì ‘Sunday’
lâifíi ‘fault’
láunìi ‘color’
lèemóo ‘citrus fruit’
Lìtìnîn ‘Monday’
líttáafìi ‘book’
màllákàa ‘to own, to rule, to govern’
ná’àm ‘yes’
násár.àa ‘victory’
r.á’àyíi ‘opinion’
ríijìyáa ‘spring, well’
sáa’àa ‘time, good luck’
sáafée ‘morning’
sàatáa ‘to steal’
sádáukâr.wáa ‘sacrifice’
sàkáanìi ‘kettle’
sámàa ‘sky, heaven’
sháa’ìr.íi ‘poet’
sháayìi ‘tea’
shâidáa ‘witness’
shàiɗán ‘demon’
shà’îr. ‘barley’
shákkàa ‘doubt’
shàr.i’àa ‘law’
shâusháawàa ‘tattoo’
sífìr.íi ‘zero’
súkàr.íi ‘sugar’
táabàa ‘tobacco’
táajìr.íi ‘rich’
tàaráa ‘fine’
táasàa ‘dish’
tábbàt ‘certain’
Tàláatà ‘Tuesday’
tàmbáyàa ‘to ask (1)’
túfáafìi ‘clothing, clothes’
túhùmtáa ‘to accuse, to suspect’
túubá ‘to regret, to be sorry’
ùmúr.tàa ‘to command, to order’
wàndóo ‘trousers’
wàsíiƙàa ‘letter’
wútán jàhánnàmàa ‘hell’
yáafèe ‘to forgive’
yàr.dá ‘to admit’
yâu ‘today’
yí àddú’àa ‘to pray’ (“to do prayer”)
yí àláamàa ‘to seem’
yí àlkáwár.ìi ‘to promise’ (“to do promise”)
yí átìsháa ‘to sneeze’
yí hámmàa ‘to yawn’
yí hánzáríi ‘to hurry’
yí ìbáadàa ‘to worship’ (“to do worship”)
yí jìmmáa’ìi ‘to have sex’
yí názàr.íi ‘to study’
yí sállàa ‘to pray’
yí tsàmmáaníi ‘to think (2)’
yí wá’àzíi ‘to preach’
zàitúunìi ‘olive’
zìnáa ‘adultery’
zúur.íyàa ‘descendant’

English
àdìr.éeshìi ‘address’
ángàa ‘anchor’
ásìbítì ‘hospital’
áwàa ‘hour’
báasúkùr. ‘bicycle’
báatìr. ‘battery’
bánkìi ‘bank (financial institution)’
bàntée ‘grass-skirt’ (“panty”)
bâs ‘bus’
bêl ‘belt’
bìr.kíi ‘brick’
bôm ‘bomb’
bóotìn ‘button’
bùlôo ‘brick’ (“block”)
búr.óodìi ‘bread’
bùr.óoshìi ‘brush’
bût ‘boot’
cánjàa ‘to change’
cízìl ‘chisel’
cóocìi ‘church’
émtì ‘empty’
fàadáa ‘priest’ (“father”)
féntì ‘paint’
fîm ‘film/movie’
fîn ‘pin’
fôk ‘fork’
fúr.súnàa ‘captive, prisoner’
gáadìi ‘guard’
gádàa ‘bridge’ (“girder”)
gâm ‘glue’ (“gum”)
gátàa ‘ditch’ (“gutter”)
gìláashìi ‘glass’
gwámnátì ‘government’
hámàa ‘hammer’
hánkícì ‘handkerchief, rag’
háyàa ‘to hire’
ínjìi ‘machine, engine’
jáa bír.kìi ‘to brake’ (“pull brake”)
jóojìi ‘judge’
káafíntàa ‘carpenter’
káatìn gáisúwáa ‘postcard’ (“card of greeting”)
kálàa ‘color’
kàléndà ‘calendar’
kân fámfòo ‘tap/faucet’ (“head of pump”)
kántàa ‘shelf’ (“counter”)
kàntíi ‘shop’ (“canteen”)
kéndìr. ‘candle’
kóofìi ‘coffee’
kóotù ‘court’
kûm ‘comb’
kwâf ‘cup’
kwálàa ‘collar’
kwálbátì ‘ditch’ (“culvert”)
kwánàa ‘corner’
kwât ‘coat’
láasìn ‘driver’s license’
láayìi ‘line’
lámbàa ‘number’
làmbàtûu ‘ditch’ (“number two”)
làntár.kìi ‘electricity’
lílìn ‘linen’
mân féetùr. ‘petroleum’ (“oil of petrol”)
mínístà ‘minister’
móotàa ‘car’ (“motor”)
óobìn ‘oven’
r.éedíyòo ‘radio’
r.éezàa ‘razor’
r.óobàa ‘plastic’ (“rubber”)
sáatíi ‘week, Saturday’
shât ‘shirt’
shéebùr. ‘shovel’
shûuméekàa ‘shoemaker’
sìkêt ‘skirt’
sìlíkìi ‘silk’
sóocì ‘sock, stocking’
sóojàa ‘soldier’
sùkùndìr.éebàa ‘screwdriver’
táawùl ‘towel’
táayàa (tyre) ‘wheel’
tálàbíjìn ‘television’
tántìi ‘tent’
tàr.hôo ‘telephone’
téebùr. ‘table’
téelà ‘tailor’
úulù ‘wool’
wáyàa (wire) ‘telephone’
yáadìi (yard) ‘cloth’
yí óodàa ‘to command, to order’

Kanuri
ábbàaníi ‘father’s brother’
àkú ‘parrot’
báabà ‘father’s sister’
bàabáaníi ‘father’s brother’
bàzáwàr.áa ‘widow’
bíndígàa ‘gun’
bír.níi ‘town’
búkúr.úu ‘bowl’
ɓàráawòo ‘thief’
dábbàa ‘animal’
dár.màa ‘lead’
fàatàaríi ‘skirt’
fílàafílíi ‘oar, paddle’
gàar.úu ‘wall’
íngár.màa ‘stallion’
ínnàa ‘aunt’
írìi ‘seed, color’
ìsá ‘to arrive’
íyàa ‘mother’s sister’
jáar.ùmíi ‘brave’
káar.ùwà ‘prostitute’
kàasúwáa ‘market’
kàazáa ‘fowl’
kàcíyàa ‘circumcision’
kár.àntáa ‘to read, to study’
kárìn kùmállóo ‘breakfast’ (“breaking of nausea”)
kìlíishíi ‘rug’
kújèeráa ‘chair’
kúrmíi ‘forest’
kùtùfóo ‘to pound’
kwàr.ángáa ‘ladder’
láafíyàa ‘healthy’
láimàa ‘tent’
lóokàcíi ‘time’
máakòo ‘week’
máalàmíi ‘teacher’
másàr.áa ‘maize/corn’
mátàlàucíi ‘poor’
r.áafàníi ‘uncle’
rínàa ‘to dye’
r.úbùutáa ‘to write’
sàabúlìi ‘soap’
súbdìi ‘Saturday’
súlkée ‘armor’
táagàa ‘window’
tàutáu ‘spider’
ùbáa kíntàa ‘stepfather’ (“father step”)
úwáa kíntàa ‘stepmother’ (“mother step”)
ùngùlúu ‘vulture’
zár.tòo ‘saw’
záurèe ‘meeting house’
zínáar.ìyáa ‘gold’

French
àdìr.éeshìi ‘address’
bàmbô ‘candy/sweets’
bîs ‘screw’
bóotìi ‘boot’
bùtôo ‘button’
fàr.mîi ‘driver’s license’
fàr.zìdân ‘president’
fêl ‘shovel’
fénnì ‘comb’
fîm ‘film/movie’
gwálálóo ‘ditch’
hùr.shéetìi ‘fork’
húur.ùu ‘oven’
jîp ‘skirt’
júujù ‘judge’
kàfê ‘coffee’
kân fámfòo ‘tap’ (“head of pump”)
kásòo ‘prison’
kùr.shêt ‘fork’
kwâl ‘collar’
lùuléetì ‘spectacles’
màasôn ‘mason’
màntôo ‘coat’
màr.t:o ‘hammer’
mòotôo ‘motorcycle’
múshúwàr.íi ‘handkerchief, rag’
r.àatôo ‘rake’
r.ôb ‘(woman’s) dress’
sàr.béetì ‘towel’
tàr.hôo ‘telephone’
tìsî ‘cloth’
tùr.nàbîs ‘screwdriver’

Chapter 5

Loanwords in Kanuri, a Saharan language*

Doris Löhr and H. Ekkehard Wolff, with Ari Awagana

1. The language and its speakers

Kanuri is spoken by some 3 to 4 million speakers, mostly in the western vicinity of Lake Chad, i.e. in parts of eastern Niger but mainly in northeastern Nigeria; smaller groups of speakers are also found in adjacent regions in Libya and Northern Cameroon. Kanuri can be assumed to form a dialect continuum with its close relative Kanembu: while Kanuri is mainly spoken in Borno and Yobe states of Nigeria and in adjacent regions across the international borders into eastern Niger and northern Cameroon to the west and south of Lake Chad, its sister language Kanembu is mainly spoken east of the Lake in the Kanem region in Western Chad. Since the vast majority of Kanuri speakers embrace Islam, diaspora speakers of Kanuri can also be found in settlements mainly along the old pilgrimage routes from West Africa along the southern fringe of the Sahara desert, particularly in Sudan (Khartoum) and the Holy Cities of Islam in Saudi Arabia. Individuals and small groups of speakers can further be found in major cities outside the Central African region and outside Africa due to more recent migration and globalization. In the most general sense, however, the term Kanuri refers to the language and to the Kanuri-speaking population of the historical region of Borno.

Kanuri belongs to the Western subgroup of the Saharan language family, which Greenberg (1963) included in his Nilosaharan phylum. The other members of the Western subgroup are Tedaga and Dazaga, spoken in northeastern Niger and northern Chad, i.e. the languages of the Teda and Daza peoples, which are jointly referred to as Tubu. The Eastern subgroup of Saharan is constituted by Beria (also called Zaghawa) and Berti (said to be extinct). Heinrich Barth (1862) and Gustav Nachtigal (1881–89) did the classificatory groundwork in the 19th century, which was continued by Johannes Lukas (1951/52); the internal classification of the Saharan family has never been challenged (cf. Cyffer 2000: 160).

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Doris Löhr and H. Ekkehard Wolff, with Ari Awagana. 2009. Kanuri vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1427 entries.


Saharan
  Western Branch
    Kanuri-Kanembu
    Tubu: Teda, Daza
  Eastern Branch
    Beria
    Berti †

Figure 1: The Saharan language family

Comparative Saharan linguistics and etymology are weakly developed, for at least three major reasons. First of all, with the exception of Kanuri, linguistic documentation is poor and, where it exists, largely outdated (the major sources for Tedaga and Dazaga are Lukas 1953, LeCoeur & LeCoeur 1956 and Jourdan 1935). Only recently has Beria become the object of a systematic description (Jakobi & Crass 2004), and studies on Tedaga have started again (cf. Ortman 2001). Also, with a maximum of six languages in the family, of which four must still be said to be poorly documented (Kanembu, Tedaga, Dazaga, Berti), the comparative method soon reaches its limits (cf. Petráček 1979). Finally, the extremely rich phonetic and phonological variation within Kanuri(-Kanembu) itself poses methodological problems, since the systematic dialectology of Kanuri is also still in its infancy, with little agreement among even the very few experts worldwide on this language. Thus, the Ethnologue (Gordon 2005) treats Kanuri as a language cluster comprising four individual languages: Kanuri Central, Kanuri Manga, Kanuri Tumari, and Kanembu. Kanuri experts, however, tend to consider the Kanuri language to be a dialect cluster (cf. Bulakarima 1997) or a dialect continuum. Consider the two competing sub-classifications in Figures 2a and 2b.

[Tree diagram: Kanuri divides into Western Kanuri and Eastern Kanuri; varieties shown: Dagəra, Manga, Bilma, Mobar, Suwurti, Kubari, Tumari]

Figure 2a: The classification of Kanuri (Jarrett 1988)


[Tree diagram: Kanuri-Kanembu, with branches labeled Kanuri (West, East) and Kanembu (West, East); varieties shown: Mowar, Yerwa, Bilma, Fachi, Dager, Manga, Tumari, Kuburi, Suwurti, Bol]

Figure 2b: Kanuri-Kanembu dialects (based on Hutchison 2000)

The classification of what are labeled Kanuri “dialects” and “sub-dialects” is based, at times, more on political considerations and ethnic lines of argument than on purely linguistic factors. One is usually confronted with a confusing mix of references to both ethnic groups or clans and linguistic varieties. Non-experts often fail to recognize that several Kanuri-speaking people identify themselves as Kanuri despite their obvious and admitted non-Kanuri origin (cf. Löhr 2007). How encompassing this “Kanurization” (Cyffer et al. 1996) has been, and which non-Kanuri languages and ethnic groups were affected, partly or completely, still awaits detailed and systematic study. There is plenty of indirect evidence, however, that Kanuri has linguistically and culturally assimilated several different languages and ethnic groups over the last 900 years or more.

Among the dialects of Kanuri, two can be singled out for their sociolinguistic prominence: Yerwa in Nigeria and Manga in Niger. The Yerwa dialect, named after a nearby village, is the variety spoken in and around Maiduguri, Nigeria. It was chosen as the basis for the standard dialect, for which a standard orthography was developed in the 1970s. Maiduguri/Yerwa can be considered the political and cultural centre of Kanuri; it has been the seat of the traditional ruler, i.e. the Shehu of Borno and his court, since colonial times. The Shehu of Borno represents more than 1000 years of continuous feudal rulership of what historians refer to as the Kanem-Borno Empire. Accordingly, the Kanuri language also has a long tradition both as a symbol of Kanuri political and cultural dominance in the region and, as we may assume, as a lingua franca for speakers of a fair (yet unknown) number of autochthonous languages of presumably mostly Chadic linguistic affiliation, but also Shuwa Arabic and possibly others.
“Before Arabic and, especially, Hausa began to play a more important role in the central Sahelian area, Kanuri served widely in the area as a lingua franca. For example, the traveller Gerhard Rohlfs noted that in Fezzan in the 1860s it was easier to communicate in Kanuri than in Arabic (Rohlfs 1984). These facts may also play a role in the evaluation of linguistic contact features in the language.” (Cyffer 2007: 1090)


Since colonial times, Kanuri has steadily lost importance as a regional lingua franca in favor of Hausa, which has become the most dynamic lingua franca in the formerly Kanuri-speaking parts of Nigeria and Niger. The recent spread of Hausa, therefore, provides an important contact scenario for Kanuri. In addition, however, we can assume several centuries of Chadic (including Hausa?) and Kanuri contact that predate the colonial and post-independence periods: this much earlier contact was linked to the migration of (Pre-Modern) Kanuri-Kanembu speakers into areas west of Lake Chad over at least the last 700 years; these areas can be assumed to have largely been inhabited by speakers of Chadic languages (cf. Jungraithmayr et al. 2004; Löhr 2009). Since the colonial occupation, English (in Nigeria) and French (in Niger and Chad) have entered the picture. Christian missionary activities played almost no role among the largely Muslim speakers of Kanuri. The eminent role of Islam, however, makes Arabic a major contact language, at least as the language of Islamic teaching and as a representative symbol of Arabo-Islamic cultural cum religious impact.

There is little if any post-literacy activity in Kanuri; it has largely remained a language for oral communication, including modern media (radio, television). There are, however, texts available coded both in Ajami, i.e. writing systems based on the Arabic script with adaptations to African language needs, and in the Roman alphabet. The “Kanuri Standard Orthography” is based on the Yerwa dialect and the Roman alphabet. The first grammatical description of Kanuri appeared as early as 1854; it was compiled by S. W. Koelle in Freetown, Sierra Leone, based on his work with freed slaves. Johannes Lukas’ Kanuri grammar (1937) remains very influential. More recent reference works explicitly start out from Lukas’ work and include Hutchison (1981) and Cyffer (1998).
In Nigeria, Kanuri has about 3 million speakers and has the status of a major national language, which means that it is regionally used in primary education and literacy campaigns. In Niger, about 70% of the roughly 500,000 Kanuri speakers are considered to belong to the Manga-Dagəra cluster of sub-dialects. Kanuri is one of the 10 legally recognized national languages of Niger; it is used in literacy campaigns and has been a medium of instruction in a few so-called experimental schools since the 1960s.

The variety used for the subdatabase and the analysis is primarily the so-called Standard Kanuri spoken in Maiduguri/Yerwa, to which most of the available sources refer. Occasionally, reflecting the slightly different contact scenarios in the former French colony from which arose the modern République du Niger, reference will also be made to the Manga variety.

The literature on Borno history (e.g. Barkindo 1985) agrees that the Kanuri came from Kanem, east of Lake Chad, where they lived in Njimi. After the fall of their new capital Birni Ngazargamo west of Lake Chad in 1808, they scattered over the whole of the Borno region. A majority now live in and around Yerwa/Maiduguri, the residence of the Shehu of Borno for most of the 20th century; some of them live in Dikwa, the previous residence of the Shehu. Many, however, live in the rural areas in Nigeria and Niger, in the oases of Bilma and Fachi in north-eastern Niger, and across the international borders in Libya and Northern Cameroon.

Map 1: Geographical setting of Kanuri

Prior to the westward migration of the Kanuri, a kingdom east of Lake Chad was located in, and referred to by Arab geographers as, Kanem as early as the 9th century CE. By the 11th century, the Empire of Kanem with its capital Njimi was well established; its pre-Islamic population consisted of Tubu and Kanembu people and came under the impact of nomadic Beria (Zaghawa), with ensuing dynastic battles and transformations affecting the first dynasty of the Sefuwa. Arab sources (Ibn Said, Ibn Khaldun, al-Maqrizi, Ibn Battuta, Ibn Furtu) relate historical events between the 12th and 15th century that, among other things, testify to political (and, by inference, linguistic?) contact with neighboring Chadic speakers (Bade, Buduma, Kotoko, †Kuri) along the shores of Lake Chad and its tributaries to the north (Bahr el Ghazal), west (Komadugu Yobe), and south (Logone and Shari). By the 13th century, Borno to the west of Lake Chad had become an integral part of what could now be referred to as the Kanem-Borno Empire, even though Kanem and Borno were at least temporarily ruled each by its own king or sultan in the 13th and 14th century. By and large, this empire should be seen less as a centralized monolithic political structure than as a hegemonic or even confederate structure of more or less dependent and tribute-paying kingdoms or sultanates (such as Fika, Mandara, Bagirmi) under the leadership of the Sefuwa kings. Military conflicts continued at the peripheries with the Tubu to the north and the Bulala to the south of Kanem during the 14th and 15th century. This was reason enough for the Sefuwa kings to relocate their capital towards the western provinces of the empire, i.e. Borno. The new capital was finally established in Birni Ngazargamo by King Ali Ibn Dunama (Ali Gaji, 1455–1487), cf. Map 1. By this time the so-called Hausa states to the west of Borno had already been under the political control of the empire for some time. In 1808/09, however, Borno suffered from Usman ɗan Fodio’s jihad as the Sokoto Caliphate expanded its political control towards the east. El-Kanemi, the military leader of the Borno Empire, successfully resisted the conquest and finally established himself as founder of a new dynasty that was to rule the empire until 1893 and later again under British colonial rule. During the colonial period, the territory of the Kanem-Borno Empire was eventually divided among the colonial powers Great Britain (part of post-independence Nigeria), France (parts of post-independence Chad and Cameroon), and imperial Germany (parts of the German colony “Kamerun”, which was finally divided between Nigeria, Cameroon, and Chad following World War I and independence).

Some time during the 12th/13th century, Islam was more or less permanently established in the area. Neighboring non-Muslim societies were subjected to continuous slave-raids and other forms of oppression until the turn of the 20th century.

2. Sources of data

The lexical data for the project were taken, first of all, from a variety of published linguistic sources on both Standard Kanuri (based on the Yerwa dialect in Nigeria) and the Manga variety of Niger in order to reflect, among other things, the different colonial histories of Nigeria (formerly under British colonial rule) and Niger (formerly under French colonial rule). Our main sources were the grammars by Koelle (1854), Lukas (1937), Hutchison (1981), and Cyffer (1998), and in particular the two available dictionaries (Cyffer & Hutchison 1991, Cyffer 1994). When published sources were insufficient, and for cross-checking, the data were complemented by direct elicitation from one of the contributors. Dr. Elhaji Ari Awagana is a native speaker of a Manga variety in Niger and a trained linguistic expert on Chadic languages in general and Hausa in particular.

In the absence of published systematic comparative work on the Saharan language family, sources and data from other Saharan languages were consulted: for Beria (also known as Zaghawa), Jakobi & Crass (2004) and Zakaria Fadul (1996, 2002, 2005); for Teda-Daza, Lukas (1953) and LeCoeur & LeCoeur (1956); for Berti, Petráček (1987). We also consulted Bender (1996) and Ehret (2001) for highly tentative and mostly rather speculative reconstructions of and within Proto-Nilosaharan.


A very valuable tool was the Sahelia database located in Nice and run by Robert Nicolaï, to whom we are indebted for access and support in making available to us a rich data collection from various languages of the Sahel Zone. In order to identify loans from a variety of Chadic languages, not only Hausa, the following published sources were consulted in particular: Skinner (1996) and Wexler (1980). For Arabic, where other compilations and sources did not already identify Arabic loans in Kanuri, the dictionary of Wehr (1976) was occasionally consulted.

Loanwords in Kanuri have been the subject of only very few previous studies, notably Lang (1923–1924) and Baldi (1992, 1995). Unfortunately, these studies include quite a few loans that do not form part of the Loanword Typology meaning list, on which the Kanuri subdatabase rests. There are a few studies available on Kanuri loans into other languages, which have also been duly consulted, e.g. Greenberg (1960) or Schuh (2003).

3. Contact situations and types

Arabic stands out as the most frequent donor language to Kanuri, followed by Hausa. Note that Hausa often serves as an intermediary for Arabic (and later English) loans into Kanuri and vice versa, i.e. Kanuri also quite often served as an intermediary for Arabic loans into Hausa, as was noticed quite early by Greenberg (1947). This is easily explained by reference to the long time period of contact and its nature, both in terms of number of loans and semantic domains.

Arabic is, first of all, closely linked to the advent and spread of Islam in West and Central Africa, both in terms of religion and as a dominant political force and culture. There have also been nomadic speakers of Shuwa Arabic (and speakers of Chad Arabic?) who entered in greater numbers from the eastern region from 1700 CE on, after having infiltrated southern Kanem and the areas southeast of Lake Chad (cf. Braukämper 2004), some of whom underwent complete linguistic and cultural assimilation (“Kanuri(ci)zation”). Islamization began in the 11th century. Quite likely, Islam reached the area between the River Niger in the west and Lake Chad in the east via several gateways and at roughly the same time. The earliest contacts may have been through either the straight north-south trans-Saharan route from Fezzan to Lake Chad or from the east, i.e. following what would later become the main pilgrimage route along the southern fringes of the Sahara desert that connected, for instance, the Islamic centres in Mali, such as Timbuktu, with Khartoum in Sudan and the Holy Cities across the Red Sea. There was also the north-west gateway via Morocco and the western trans-Saharan trade routes. The central and eastern routes would first have affected the Kanuri, who would then have passed things on to the Hausa states that had developed to the west of them. The north-western route would make Mande-speaking intermediaries a likely hypothesis: Islam would have reached the territories of the Empires first of Mali, and later Songhay, before reaching the Hausa states and, subsequently again, Kanem-Borno, where all three gateways finally meet. Note also that the early 19th-century jihad of Usman ɗan Fodio spread from Sokoto (one of the Hausa states). Occasionally, berberophone Tuareg of the southern Sahara may have interacted in the process.

Independent of the advent of Islam and in the course of expanding their local kingdom, centuries of slave-raids and tribute-raising attacks by the Kanuri on neighboring polities must have brought contacts with countless speakers of non-Saharan indigenous languages at the peripheries who had come under the dominance of the Kanem-Borno Empire. These would have been West Chadic and Central Chadic languages, and possibly Adamawa-Ubangi and Central Sudanic languages. Traces of such contacts can be found in the subdatabase.

With the advent of European missionaries and colonialism (British, French, and German), English and French became major sources of loans, particularly in semantic domains related to colonial administration, Christian mission, and 20th-century Western civilization. The observation that English plays a larger role as donor than French in the subdatabase may reflect our primary concern with Standard Kanuri (based on a dialect in the former British sphere of influence) on the one hand, and differences in the colonial politics of the British and the French on the other. (Vernacular education and functional multilingualism were largely supported under the British “indirect rule” approach, whereas French assimilation politics fostered tendencies of strong “language purism”.) Note that other African languages may have acted as intermediaries for loans from English and French, particularly languages of the West African coastal regions, from where much of the colonial conquest set out. This would bring coastal languages like Yoruba (for transporting English loans to the interior) into the picture.
The most common intermediary language, however, was apparently Hausa, the major lingua franca of the last few centuries, which passed on many words that travelled across the Western and Central Sudan region in west > east and south > north directions.

Saharan languages, in particular Kanuri-Kanembu, must have a rather long history of geographical neighborhood not only with other Saharan languages (like Tedaga-Dazaga to the north, and Beria further to the east) but also with immediately adjacent Chadic languages in the vicinity and south, west, and east of Lake Chad, plus Adamawa-Ubangi and Central Sudanic languages (like Bagirmi) even further to the south. It is therefore not surprising to find lexical items which are geographically very widely spread across established genealogical affiliations of languages and whose ultimate origin remains obscure, to the extent that the few available reconstructions of proto-languages may reflect such items across different language phyla. For such lexical items that have become inherited vocabulary in more than one established phylum (i.e. Nilosaharan, Afroasiatic, Niger-Congo) we have introduced the term “areal root”. For such areal roots we assume borrowing at such time-depth that the ultimate origin and direction of borrowing are no longer detectable.

For the time being, however, it is necessary to admit that in the context of African languages, for several reasons, contact scenarios, routes and intermediaries even for clearly borrowed words are hard to establish and almost impossible to prove beyond doubt. The reasons are scarcity of data for target languages as much as for potential donor or intermediary languages, lack of historical documentation and of methodologically sound reconstructions, and lack of dialectological evidence even for the target language. Table 1 attempts to arrange the major contact situations referred to in the subdatabase by number of loans.

Table 1: Schematic representation of contact types of the major donor languages

Parameter                                        Arabic > Kanuri     old Hausa > Kanuri  English > Kanuri    French > Kanuri
Political dominance                              0                   0–1                 3                   3
Cultural dominance                               2                   1–2                 1                   1
Numerical asymmetry                              0                   1                   0                   0
Variety of social contact situations             0–1                 3                   1–2                 1–2
Number of recipient bilinguals                   1–2                 2                   -1                  -1
Social status of recipient bilinguals            2                   -3                  3                   3
Duration of contact                              3                   3                   1–2                 1–2
Written language used by recipient speakers      2                   1                   1                   1
Language of religion of most recipient speakers  3                   1                   0                   0

Contact with Arabic as the language and symbol of Arabo-Islamic culture can be assumed to have begun with Islamization during the 12th/13th century CE, if not before, and has continued until this day, since almost all Kanuri mother-tongue speakers are Muslims. The leading figures in Kanuri society maintain contact with the other parts of the Muslim world, for instance, through qur’anic education, pilgrimage (hajj), commercial migration, and higher education. This does not necessarily, not even as a rule, imply Kanuri-Arabic bilingualism. Arabic is present, most of all, as a written language (of the Holy Qur’an), but also in vernacular use by minority and, until recently, largely nomadic/agro-pastoralist speakers of Shuwa (and Chad) Arabic. Native speakers of Arabic have been assimilated into Kanuri society for at least the last 250 years. Arabo-Islamic culture can be considered dominant, yet without political dominance.

Contact with Hausa can be assumed to be characterized by two different scenarios. A first contact period must have begun after the Kanuri westward migration into Borno and the establishment of their new capital in Birni Ngazargamo in the 15th century CE. By that time, Hausa may already have played some role as a commercial lingua franca in the so-called Hausa states that were geographically adjacent in the west of Borno. Intermarriage may have occurred to a considerable extent. Both societies shared the common impact of Arabo-Islamic culture and urban medieval civilization. A certain amount of bilingualism cannot be excluded for this period; in any case it becomes more widespread in the recent contact situation, which can be said to begin during the colonial period, when Hausa became the dominant lingua franca of northern Nigeria and of the armed forces. After independence of both Nigeria and Niger, Hausa gained considerable dominance in the spheres of commercial activities, politics and administration, education, and the media. Today, about 3 million Kanuri speakers face about 50 million Hausa speakers (half of them L2-speakers, among them many if not most native Kanuri speakers). Hausa also serves as gateway and intermediary for loans from other languages, such as Yoruba (and Nupe, possibly Songhay, and Portuguese). Loans from Berber (Tamasheq) would also find their way into Kanuri via Hausa, even though direct contact is possible, since the Kanuri-speaking oasis dwellers of Fachi and Bilma trade their salt directly with the Tuareg, with whom, however, they prefer to speak Hausa (Kanuri-Tamasheq bilingualism would appear to be the exception).

English and French represent the colonial conquest since roughly 1880 and post-colonial cultural dominance and political influence even after independence of both Nigeria and Niger. Bi-/trilingualism Kanuri-Hausa-English in Nigeria and Kanuri-Hausa-French in Niger is quite common. English and French are ubiquitous in the urban centres, less so in the rural areas, both orally and as written languages of documents and print media. Both languages are also widely heard on the radio and television.

Kanuri can be assumed to have assimilated a fair number of speakers of West and Central Chadic languages who have shifted completely or at least partly to Kanuri, like the Malgwa (and possibly parts of the Bade and Ngizim speaking populations). As a result, one would expect considerable Chadic substratum effects both in lexicon and in grammar. As in the case of Shuwa Arabic, Chadic speakers would have come under the political and cultural dominance of the Kanem-Borno Empire. Periods of Chadic-Kanuri bilingualism preceding language shift to Kanuri can be assumed to have been a pattern throughout much of the territory dominated by the Kanuri Empire (there are practically no written records available).
On the other hand, the geographical neighborhood of Kanuri/West-Saharan languages and both Berber (Tamasheq) and several Chadic languages spoken in the vicinity of Lake Chad must have prevailed for many centuries, if not millennia. Linguistic contact, therefore, may have occurred already at such time depth that the ultimate source of a word that is now widespread in the Sahel zone south of the Sahara desert remains unidentifiable (here we speak of ancient “areal” roots).

Some 310 loanwords (category 3 and 4) and 76 potential loanwords (category 1 and 2) have been identified in the subdatabase. First of all, we cannot exclude family-internal borrowing from another Saharan language, e.g. the Saharan word for ‘cat’ ngâm. We have identified 5 ancient “areal” loans that relate Kanuri words to Afroasiatic in general and that we assume to represent the oldest layer of contact, possibly due to shared settlement areas north of Lake Chad. These are cídí ‘soil/land’, kábbì ‘arch’, tə́làm ‘language/tongue’, kàléà ‘servant’, cîr~kîr (F) ‘slave’. Chadic substrata are assumed to be responsible for two further loans in Kanuri. Chronologically, these might already be related to the westward migration into Borno of the Kanuri, which began in the 11th century CE: Proto (West?) Chadic: kárám ‘crocodile’, tígə̀ ‘body’.


The following 5 loans are (directly?) from Berber (Tamasheq), but are distributed in the whole area: dímì ‘ewe’ (doubtful), kàlímò ‘camel’, bòrkó ‘blanket’, dìfúnò ‘date palm’ and rùwùtə́ ‘to write’. Another possible loanword might ultimately come from a language of the Ubangi sub-family of Niger-Congo (possibly Banda): cúkú ‘island’. For the many loans ultimately from Arabic (or Latin via Arabic), Hausa, English, and French, see the appendix.

With regard to chronology, we cautiously suggest four major contact periods:

A. Ancient areal contacts between (Pre-)Kanuri/Saharan languages and Afroasiatic (Chadic/Berber/Semitic) languages, possibly also other Nilosaharan and Niger-Congo languages, which precede the advent of Islam/Arabic in the Western and Central Sudan, roughly speaking before 1300 CE.
B. The early Islamic period roughly from about 1300 CE to 1500 CE with strong impact of early Arabo-Islamic culture in the Western and Central Sudan region either directly or via intermediary languages.
C. The medieval contact period between 1500 and about 1880 in which Hausa becomes a strong source of interference, and possibly other autochthonous Chadic languages in Borno.
D. The modern colonial and postcolonial period beginning with the advent of colonialism (British in Nigeria, some German in former German Cameroon, and French in modern Niger and Chad) and lasting until the present day. Besides the (ex)colonial languages English and French, Hausa as the most important lingua franca in the area continues to have an impact on Kanuri, and so does Arabic, whether via Hausa or not, in terms of reference to the “modern world”.

Two loans of ultimately Latin origin entered Kanuri via Arabic, i.e. sáwûl < sapo ‘soap’ and kàkkádə̀ < chartas ‘paper’. Loans that came via Hausa into Kanuri are e.g. àyàwà ‘banana’ < Nupe word for ‘banana’, bàrwùnó < Songhay ‘chili pepper’ and tàwâ < ‘tobacco’ (ultimately Portuguese?).

4. Number and kinds of loanwords

Out of a total of 1619 entries (1728 minus 109 “missing words”) in the subdatabase, 386 have been considered to be at least potential candidates for loans into Kanuri, i.e. 23.84%. The identification of loanwords in Kanuri is usually possible with a high degree of confidence for most of the semantic fields of the Loanword Typology list. This is shown by the high scores for category 4 evaluations (285 are “clearly borrowed”), i.e. 17.6% as opposed to categories 1–3 which indicate lower levels of confidence, i.e. 101 = 6.23%. (A total of 67 items, i.e. 4.13%, were considered highly doubtful cases with “very little evidence for borrowing”.) In Table 2 we give our qualitative evaluation in the form of confidence judgments for each semantic field, in addition to the total number of items and the percentages of loans (all categories). The table also shows that Kanuri has more or


less clearly borrowed words from all 24 semantic fields of the LWT list, with the possible exception of Miscellaneous function words, where we find just one highly doubtful item. Table 3 lists the potential loans in terms of semantic word class and levels of confidence.

Table 2: Loanwords by semantic fields (absolute numbers and percentages), ranked by percentages of borrowed words and indicating levels of confidence of judgment

No.  Semantic field                   total  loans      %  lev 4
23   Modern world                        57     44  77.19     41
22   Religion and belief                 22     15  68.18     13
7    The house                           58     37  63.79     33
20   Warfare and hunting                 26     13  50.00     12
19   Social and political relations      34     16  47.05     14
14   Time                                55     24  43.63     18
8    Agriculture and vegetation          43     17  39.53     14
18   Speech and language                 50     19  38.00     13
11   Possession                          45     16  35.55     11
21   Law                                 40     14  35.00     12
15   Sense perception                    34     11  32.35      5
9    Basic action and technology         67     21  31.34     16
5    Food and drink                      71     21  29.57     16
13   Quantity                            37      8  21.62      2
17   Cognition                           48     10  20.83      7
6    Clothing and grooming               80     16  20.00     15
1    The Physical world                  72     14  19.44      9
3    Animals                             96     17  17.71      7
16   Emotions and values                 48      8  16.66      3
4    The body                           149     22  14.76     12
2    Kinship                             74     10  13.51      4
10   Motion                              80      6   7.50      3
24   Function words                      14      1   7.14      -
12   Spatial relations                   72      4   5.55      2

Columns lev 3, lev 2 and lev 1 (values in table order from top to bottom; empty cells are not preserved, so the values cannot be assigned to individual rows):
lev 3: 1 1 3 1 4 2 1 1 1 3 3 1 1 1 -
lev 2: 1 1 3 2 1 1 1 1 1 1 -
lev 1: 1 1 3 1 2 2 3 2 4 3 4 5 1 5 7 1 8 4 3 1 2
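The percentages quoted in Section 4 can be re-derived from the counts given in the text. The following is a minimal sketch in Python; the names `TOTAL_ENTRIES`, `COUNTS` and `share` are illustrative and not part of the Loanword Typology database. Conventional rounding yields 6.24% and 4.14% where the text prints 6.23% and 4.13%, which suggests (as an assumption) that the printed figures were truncated rather than rounded.

```python
# Counts as quoted in the text of Section 4; names are illustrative only.
TOTAL_ENTRIES = 1728 - 109  # LWT entries minus "missing words"

COUNTS = {
    "all potential loans (categories 1-4)": 386,
    "clearly borrowed (category 4)": 285,
    "lower confidence (categories 1-3)": 101,
    "very little evidence (category 1)": 67,
}


def share(count, total=TOTAL_ENTRIES):
    """Return count as a percentage of total."""
    return 100 * count / total


for label, count in COUNTS.items():
    print(f"{label}: {count} of {TOTAL_ENTRIES} = {share(count):.2f}%")
```

The same helper can be applied to any row of Table 2 by passing the field total as the second argument, e.g. `share(44, 57)` for Modern world.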

In terms of identifiable donor languages, the clearly borrowed items (cat. 4) come, most of all, from Arabic (9.4%), followed by Hausa (5.4%), English (2.2%) and French (0.7%). Recall that Hausa serves two functions with regard to loans in Kanuri: Hausa may not only be the ultimate source of a loan, but also the intermediary through which other source languages pass words into Kanuri. It is not always easy to decide which of these roles Hausa plays in a given case. Table 4 relates donor languages and semantic word classes.

Table 3: Potential loans in terms of semantic word class and categories of confidence

Categories: 4. Clearly borrowed; 3. Probably borrowed; 2. Perhaps borrowed; 1. Very little evidence for borrowing; 0. No evidence

Word class        counts by category (in printed order; empty cells not preserved)    total
Nouns             233  16  12  47                                                       1072
Verbs             25  6  1  10                                                           390
Adjectives        4  7  1  2                                                             151
Adverbs           -                                                                        5
Function words    4  3  3                                                                110
total             285 (cat. 4)  25 (cat. 3)  9 (cat. 2)  67 (cat. 1)  1233 (cat. 0)     1619

Table 4: Percentages of loanwords by semantic word class and donor languages

                  Arabic  Hausa  English  French  Other  Total loanwords  Nonloanwords
Nouns               12.1    8.5      3.9     1.3    0.9             26.7          73.3
Verbs                7.5    1.0        -       -    0.3              8.7          91.3
Adjectives           6.3    2.4        -       -    0.8              9.5          90.5
Adverbs                -      -        -       -      -              0.0         100.0
Function words       6.9    2.0        -       -      -              8.9          91.1
all words           10.2    5.8      2.4     0.8    0.7             19.8          80.2
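The internal consistency of Table 4 can be checked by adding up the donor-language percentages of each word class and comparing the sum with the Total loanwords column. The sketch below is illustrative only: the cell assignment is an assumption (donor values are placed so that each row adds up to its printed total, and empty cells are read as 0), and the names `TABLE4` and `row_mismatches` are hypothetical.

```python
# Table 4 under one assumed cell assignment; empty cells are read as 0.0.
# Each row: ([Arabic, Hausa, English, French, Other], total loanwords, nonloanwords)
TABLE4 = {
    "Nouns": ([12.1, 8.5, 3.9, 1.3, 0.9], 26.7, 73.3),
    "Verbs": ([7.5, 1.0, 0.0, 0.0, 0.3], 8.7, 91.3),
    "Adjectives": ([6.3, 2.4, 0.0, 0.0, 0.8], 9.5, 90.5),
    "Adverbs": ([0.0, 0.0, 0.0, 0.0, 0.0], 0.0, 100.0),
    "Function words": ([6.9, 2.0, 0.0, 0.0, 0.0], 8.9, 91.1),
    "all words": ([10.2, 5.8, 2.4, 0.8, 0.7], 19.8, 80.2),
}


def row_mismatches(table, tol=0.05):
    """Return the rows whose donor percentages do not sum to the printed total."""
    return [
        name
        for name, (donors, total, _nonloans) in table.items()
        if abs(sum(donors) - total) > tol
    ]


for name, (donors, total, nonloans) in TABLE4.items():
    print(f"{name}: donors sum to {sum(donors):.1f}, printed total {total}, "
          f"total + nonloanwords = {total + nonloans:.1f}")
print("rows off by more than 0.05:", row_mismatches(TABLE4))
```

Under this reading only the Verbs row and the all-words row deviate, each by 0.1 percentage points, which is compatible with rounding of the individual cells.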

Arabic as the main donor language has quite likely contributed some 189 loanwords (10.2%) to the subdatabase from across almost all semantic fields. In descending order of number of loanwords, Arabic contributed most of all to the semantic fields of Time (17 = 1.1%), Cognition (12+2 = 0.8%), Speech and language (13 = 0.8%), Clothing and grooming (10+1 = 0.8%), followed by Law (8+1 = 0.6%), Religion and belief, Emotions and values, Possession (each field with 8 = 0.5%), then by The physical world, Warfare and hunting (each with 7 = 0.4%), and further The body, The house, Social and political relations (each with 6 = 0.4%) and also the Modern world (5+1 = 0.3%). For the remaining semantic fields, Arabic contributes 2–5 items, none, however, to the fields of Motion, Spatial relations, and Miscellaneous function words. These figures underline the considerable impact that Arabo-Islamic culture had on Kanuri society in intellectual matters such as religion, law and science, politics and administration (including military matters), and medieval urban civilization and ways of life.

Hausa is the runner-up donor language with a contribution of 107 loans to the subdatabase. While Arabic loans cover 21 out of the 24 semantic fields, Hausa still covers 19 fields. In descending order: the Modern world (17 = 1.1%), Clothing and grooming (12 = 0.8%), followed by Agriculture and vegetation, Basic actions and


technology (each with 8 = 0.5%), and further Food and drink (7 = 0.4%), The house (6 = 0.4%), Religion and belief (5 = 0.3%), The body, Possession (each with 4 = 0.3%). For the remaining fields, Hausa contributes 1–3 items, none, however, to the fields of Kinship, Speech and language, Social and political relations, Warfare and hunting, and Miscellaneous function words. These figures underline a cultural impact different from that of Arabo-Islamic culture. Although there is some overlap (Clothing and grooming, The house, Religion and belief, The body, Possession, Modern world), the impact was more on the basics of life (understandably so in the light of the westward migration of the Kanuri into a new habitat that was now quite close to where the Hausa were living). We cannot exclude new intermarriage patterns between speakers of Hausa and Kanuri emerging from the Kanuri westward expansion. Also, we observe mutual reinforcement of shared Arabo-Islamic culture. More recently and due to the spread of Hausa as the new lingua franca and medium of instruction in primary education, a fairly large number of loanwords relating to the modern world have entered Kanuri from or via Hausa.

Not surprisingly, English (contributing 36 loanwords = 2.2% to the subdatabase), as the language of colonization and official language for use in the domains of education, politics and administration in post-independence Nigeria, where the vast majority of Kanuri speakers live, has left its major impact in the semantic field of the Modern world with half of all its loans into Kanuri (18 = 1.1%). Clothing and grooming (6 = 0.4%) and Warfare and hunting (5 = 0.3%) follow. The other fields are Food and drink (3), Basic actions and technology (2), and Spatial relations (1).
French (contributing only 11 loanwords = 0.7% to the subdatabase) as the colonializing and official language in Niger (as well as Chad) meets stereotypical expectations insofar as half of its loanwords relate to the semantic field of Clothing and Grooming (5 = 0.3%). Less significant are the fields of Agriculture and vegetation, Basic actions and technology (2 each), The house, Law (1 each). Table 5 gives the percentage figures by semantic field and principal donor languages.


Table 5: Loanwords in Kanuri by donor language and semantic field (percentages)

No.  Semantic field                    Arabic  Hausa  English  French  Other  Total loanwords  Nonloanwords
1    The physical world                   8.6    1.4        -       -      -             10.0          90.0
2    Kinship                              8.5    1.7        -       -      -             10.3          89.7
3    Animals                              5.4    1.1        -       -    3.3              9.8          90.2
4    The body                             3.8    1.8        -       -      -              5.6          94.4
5    Food and drink                       5.1    6.8      2.5       -    1.3             15.6          84.4
6    Clothing and grooming               14.8   16.9      8.5     7.0      -             47.2          52.8
7    The house                           12.3   12.3      2.1     2.1    2.1             30.8          69.2
8    Agriculture and vegetation           5.9   13.2      1.5     2.9    4.4             27.9          72.1
9    Basic actions and technology         3.7    7.4      2.5     2.5      -             16.0          84.0
10   Motion                                 -    4.5        -       -      -              4.5          95.5
11   Possession                          16.9    8.5        -       -      -             25.4          74.6
12   Spatial relations                      -    0.7      1.3       -      -              2.0          98.0
13   Quantity                             4.5    2.3        -       -      -              6.8          93.2
14   Time                                31.4    3.5        -       -      -             34.9          65.1
15   Sense perception                     5.8    3.9        -       -      -              9.7          90.3
16   Emotions and values                 12.2    4.3        -       -    2.1             18.6          81.4
17   Cognition                           24.9    7.4        -       -      -             32.3          67.7
18   Speech and language                 28.0      -        -       -    2.4             30.5          69.5
19   Social and political relations      15.2      -        -       -      -             15.2          84.8
20   Warfare and hunting                 15.3      -      9.4       -      -             24.7          75.3
21   Law                                 30.5    3.8        -     3.8      -             38.1          61.9
22   Religion and belief                 31.3   18.8        -       -      -             50.0          50.0
23   Modern world                         9.5   25.2     29.2       -      -             63.9          36.1
24   Miscellaneous function words           -      -        -       -      -              0.0         100.0
     all words                           10.2    5.8      2.4     0.8    0.7             19.8          80.2

5. Integration of loanwords

5.1. General observations

Kanuri does not possess all the consonants that the principal donor languages Arabic, Hausa, English and French use as phonemes or allophones. This is particularly the case of laryngealized/glottalized (“emphatic”) consonants of neighboring Afroasiatic languages (Chadic, Semitic, Berber) which are rendered by Kanuri consonants of nearest phonetic affinity (like Arabic !a"n > sàkkân ‘kettle’, with epenthetic [a] insertion), or simply omission (like Arabic /&/ and /'/, e.g. Arabic #àku > àkú ‘parrot’, Arabic al-laa$u%ur > lúsùr ‘lazy’). Syllable and word structure constraints require the elimination of consonant clusters that occur in the donor


language, either through consonant deletion or vocalic epenthesis. Some consonant-final loanwords insert a word-final vowel, some do not, and in some cases final vowels even get deleted in Kanuri; the reasons for such seemingly erratic behavior are not (yet) clear. The characteristic Kanuri tendency of obstruent weakening, particularly in intervocalic position, also widely applies in the integration of loanwords; “weakening” processes include (and combine) degemination, voicing, spirantization, sonorization, and reduction to zero (deletion). Long vowels are shortened since Kanuri does not have phonemic vowel length. Short vowels may be rendered by Kanuri schwa (symbolized by <ə> in Kanuri orthography). Compensatory consonant lengthening has been observed in at least one case. Vowel and consonant assimilations are quite frequent; haplology and reduction of the number of syllables occur occasionally.

In terms of grammatical integration of nouns, we note different treatments of the Arabic article in Kanuri. In some examples the article is simply dropped. Quite often, however, the consonant /l/ of the article is retained to become the first consonant of the word in Kanuri, i.e. the initial vowel /a/ of the article is deleted. Sometimes the full VC-structure of the article is retained and the largely functionless “moveable k” prefix of Kanuri nominal morphology is added. Verbs are, as a rule, integrated by adding a verbal nominalizer, depending on the verb class, mostly /-tə/. The frequently applied phonological and grammatical integration strategies allow a fairly sound judgment in terms of degrees of loanword integration, cf. Table 6, which is based on the subdatabase.

Table 6: Integration of loanwords into Kanuri phonology and grammar

Highly integrated    229
Intermediate         108
Unintegrated          17
Total                354 (of 386)

5.2. Integration of loanwords from Arabic

The strategies for integrating nouns from Arabic can be illustrated with the following selected examples. Arabic alqutan > kàlwúsə̀n ‘cotton’ maintains the Arabic article in its full shape and adds the Kanuri noun prefix (“moveable k”, cf. Greenberg 1981). Arabic /q/ corresponds to a velar stop which is regularly weakened to /w/ (unless in initial position as in Arabic qamis > gə̀májè ‘shirt’; for weakening of velar stops cf. also Arabic sukkar > súwùr ‘sugar’). /t/ undergoes spirantization to /s/, and the short vowel /a/ is rendered by /ə/ in Kanuri. Nouns that maintain the consonantal part of the Arabic article /l/ are, for instance, Arabic al%ar( > lárdə̀ ‘land’, Arabic alba)ari > làwátə́rà ‘mule’, Arabic al#ibra > líwùlà ‘needle’, Arabic al-#irs > lòrúsà ‘bride’. Further to these examples: Arabic


/&/ and /'/ are foreign to Kanuri phonology and are simply omitted in lárdə̀ ‘land’, lúsùr ‘lazy’, líwùlà ‘needle’, lòrúsà ‘bride’. Regular consonant weakening occurs in the cases of /b/ > /w/ in làwátə́rà ‘mule’ and líwùlà ‘needle’. A long vowel is shortened and subsequently assimilated, and /)/ is depalatalized in Arabic al-laa$u%ur > lúsùr ‘lazy’. Vowel epenthesis and final vowel addition (indicated by [ ] in the following examples) occur in líw[ù]là ‘needle’ (with /r/ > /l/, cf. also Arabic "ar+r > harîl ‘silk’), l[ò]r[ú]s[à] ‘bride’ (with the rounding of epenthetic schwa to [o] and [u] reflecting the original presence of /&/), lárd[ə̀] ‘land’, cf. also Arabic ba"r > báh[à]r ‘ocean’. làwátə́rà ‘mule’ again shows short /a/ > /ə/ and a still unexplained final vowel substitution /i/ > /a/ that could point towards the existence of an intermediary language (yet to be identified).

Nouns that have been borrowed from Arabic without the article are, for instance, Arabic ar-ra#d > rádə̀ ‘thunder’, Arabic far, / alfár,i > fárgì ‘vagina’ (with unusual consonant “hardening” /j/ > /g/!). Kanuri may delete or add final vowels, e.g. Arabic nuuru > nûr ‘day’, and Arabic #inab > yínàbí ‘grape’. Final consonants may be deleted, e.g. Arabic ruu" > rô ‘soul’. Arabic tfill > tíwàl ‘baby’ presupposes intervocalic voicing of the original /f/ > */b/ with subsequent regular weakening to /w/; the actual vowel sequence could reflect metathesis after epenthetic vowel insertion: /tfill/ > **t[a]bil > /tiwal/.

5.3. Integration of loanwords from Hausa

Many of the phonological integration processes observed with loans from Arabic also apply to loans from Hausa. Note that the Arabic article may be deleted in Kanuri even if it is present in Hausa, like (àlmàkáshì > mágàsù ‘scissors’ (< Arabic). Further, Hausa-internally motivated initial /h/ (in order to avoid vowel-initial words) may be treated differently, as the following loan (ultimately from Arabic) shows: hánkàlìi > hángal ~ ángàl-là, the suffix -là being the associative marker -Ca with reduplication of the final consonant. This example also shows voicing assimilation following a nasal, cf. also mákáráantáa > màkàràndí (with tone pattern shift and final vowel replacement presently not accounted for). Hausa glottalized consonants are rendered by their non-glottalized counterparts in Kanuri, and short and long vowels are treated alike, e.g. Ha. ƙàrámì > kə̀rámì ‘younger brother’ (also with short /a/ > /ə/), Ha. ɗàn tòofíi > dántòbí ‘skirt’. Vowel shortening in all positions is regular since Kanuri has no vowel length contrast. Consonant weakening affects labial and velar stops, most of all: Hausa téebùr > téwùr ‘table’ (< Engl.), Hausa ríibàa > ríwà ‘profit’ (< Arabic), Hausa bárkònóo > bárwùnó ‘pepper’; weakened intervocalic velars may regularly be further weakened to zero, e.g. líkítàa > lìítà ‘doctor’ (< Yoruba), sáaʼàa > ságà ~ sáà ‘time’ (< Arabic). There are also examples of both trilled and flapped /r/ > /l/, like dármàa > dálmà ‘hoe’, fùrée > fəlé ‘flower’. Occasionally, consonants are assimilated (e.g. sóocì > sósì ‘socks’ (< Engl.)) and short consonants are rendered … əlà ‘lamp’ (< Arabic).
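Two of the regular processes named above, vowel shortening and the weakening of intervocalic /b/ to /w/, can be stated as ordered rewrite rules. The following Python sketch is only an illustration under simplifying assumptions: forms are written without tone marks, long vowels are written as double letters, and the function names are hypothetical. It does not model the other processes described in this section (velar weakening, /r/ > /l/, vowel quality changes), so a form such as bárkònóo > bárwùnó is not derived correctly.

```python
import re


def shorten_long_vowels(form):
    """Kanuri has no vowel length contrast: a double vowel letter becomes single."""
    return re.sub(r"([aeiou])\1", r"\1", form)


def weaken_intervocalic_b(form):
    """Labial stop weakening: /b/ > /w/ between two vowels."""
    return re.sub(r"(?<=[aeiou])b(?=[aeiou])", "w", form)


def integrate(form):
    """Apply both rules to a toneless Hausa form (simplified transcription)."""
    return weaken_intervocalic_b(shorten_long_vowels(form))


# Toneless versions of two Hausa > Kanuri examples from the text.
for hausa in ("teebur", "riibaa"):
    print(hausa, ">", integrate(hausa))
```

For the two forms shown, the output matches the segmental shape of the attested Kanuri words téwùr ‘table’ and ríwà ‘profit’; tone is deliberately left out of the sketch.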


Various processes (haplology, apocope, consonant weakening to zero, etc.) are involved in the frequent reduction of the number of syllables in Kanuri, cf. lílìn > lîn ‘linen’ (< English), láayìi > lâi ‘line, street’ (< English), gíigínyàa > jíínà ‘deleb palm’, dúuníyàa > dúnyâ ‘world’ (< Arabic). For ʼyár cíkìi ‘(a certain kind of) shirt’, which in Hausa is a compound involving a feminine noun ʼyáa ‘daughter’ also used as diminutive (lit. daughter/small+fem.of belly), Kanuri has re-created the corresponding masculine form which does not exist in Hausa (*ɗán cíkìi) and has phonologically adopted this back-formation into Kanuri as dánkíkì.

5.4. Integration of loanwords from English

Loans from English (and French in Niger) appear to come into Kanuri mostly through Hausa (note that it is almost impossible to tell whether Modern world vocabulary has come through Hausa, the major lingua franca, or straight from English through the media) and thereby may already contain features that are characteristic for their integration into Hausa phonology. This is particularly true for the reduction of consonant clusters through either deletion or epenthetic vowel insertion and the presence/absence of final vowels. (Note, therefore, the integration strategies for Hausa loans into Kanuri above.) Cf. English screwdriver > sə̀kùldə̀réwà, postcard > póskàt, president > fə̀rèsìdân, bomb > bôm, bank > bánkì, engine > ínjìn, coat > kôt, tongs > tóngù.

6. Grammatical borrowing

Little if any detailed systematic study has so far been devoted to contact-induced change and grammatical borrowing in Kanuri. This is quite surprising given the (for an African language) unusually long history of research (since 1854) and the popular sweeping statements regarding contact-induced grammatical change in Kanuri, such as the following.

“The coherence of the Saharan languages is characterized by close similarity alongside considerable diversity. While the basic lexicon shows a relative distance, especially between Kanuri and Beria, the structural similarities are far greater, for example, the build-up of verb classes and general word and sentence formation. This discrepancy between structural and lexical coherence can be explained by contact with other languages… Kanuri has undergone more linguistic changes than the other Saharan languages.” (Cyffer 2007: 1090)

Quite recently, detailed analysis of contact-induced grammatical changes in the tense-aspect-mood system of Kanuri under the substratum impact of certain Chadic languages has begun (Wolff & Löhr 2006, 2008) which indicates the likelihood of


Kanuri’s borrowing from Chadic the non-canonical inflexional category of predication focus encoded in verbal morphology. Under this diachronic theory, Kanuri has been induced to rearrange its tense-aspect-mood system and to innovate oppositions between in-focus and out-of-focus forms in the perfective aspect domain marked by verb suffixes. This sets Kanuri apart from other Saharan languages and resembles tense-aspect-mood systems that are found in Chadic languages adjacent to Kanuri settlement areas.

7. Conclusion

Kanuri has been the lingua franca of the Kanem-Borno Empire and, therefore, second language (even though dramatically receding in this function) for speakers of several non-Saharan languages over a few centuries. It is most likely responsible for extensive bilingualism and language shift in the past and has assimilated (“Kanurizised”) considerable numbers of individuals and even groups of speakers of different languages. To that extent, Kanuri could be expected to have borrowed and to show lexical interference from these substrata, which link up in particular with a westward migration from Kanem, east of Lake Chad, into Borno to the west of Lake Chad, i.e. into Chadic speaking territories, over the last 700 years, at least. On the other hand, Kanuri has increasingly come under the impact of Hausa which, quite parallel, developed into the most dynamic new lingua franca in the area, and with which Kanuri has been in contact for at least 500 years anyway. Both languages, Kanuri and Hausa, shared and still share the dominant impact of Arabo-Islamic culture since Islamization began some 800 or 900 years ago. It is not surprising, therefore, to find that Kanuri has about 16.5% clearly borrowed words and at least 4–5% likely borrowed words reflected in the subdatabase, mostly from Arabic and Hausa. In more recent periods since the advent of colonialism around 1880 CE, English and French became official languages in later independent Nigeria and Niger, instigating elitist bilingualism and diglossia which also left superstratum traces in Kanuri vocabulary, but to a much lesser extent than Arabic and Hausa. The semantic fields in which the loans from these four major donor languages occur reflect the different socio-cultural and historical scenarios which each of the respective donor languages represents.

Only a few isolated loanwords can, somewhat erratically, be related to other languages, some of them apparently reflecting ancient “areal” roots that have become an integral part of genealogically unrelated language families and phyla.

Acknowledgments

Doris Löhr and H. Ekkehard Wolff herewith gratefully acknowledge intensive and continuous cooperation in the Loanword Typology project with their colleague at the Institut für Afrikanistik, University of Leipzig, Dr. Ari Awagana, as both a linguistic expert on Hausa and Chadic languages in general, and as a native speaker


of Kanuri. The authors further acknowledge the input of students and student assistants at the Institut für Afrikanistik, University of Leipzig, who took an active part in research seminars that were at least partly devoted to the study of loanwords in Hausa and Kanuri during the academic years 2004/5–2006/7, some of it in the highly conducive environment of the university’s facilities in Zingst on the southern shore of the Baltic Sea. We also gratefully acknowledge a travel grant received from the Verein der Freunde und Förderer der Universität Leipzig for Dr. Ari Awagana to go and consult the sahelia database located in Nice under the directorship of our colleague Robert Nicolaï.

References

Baldi, Sergio. 1992. Arabic loanwords in Hausa via Kanuri and Fulfulde. In Ebermann, Erwin & Sommerauer, Erich R. & Thomanek, Karl E. (eds.), Komparative Afrikanistik (Festschrift Mukarovsky), 9–14. Wien: Veröffentlichungen der Institute für Afrikanistik und Ägyptologie der Universität Wien.
Baldi, Sergio. 1995. On Arabic Loans in Hausa and Kanuri. In Ibriszimow, Dymitr & Leger, Rudolf (eds.), Studia Chadica et Hamito-Semitica: Akten des Internationalen Symposions zur Tschadsprachenforschung, 252–278. Johann-Wolfgang-Goethe-Universität, Frankfurt am Main, 6.–8. Mai 1991. Köln: Rüdiger Köppe.
Barkindo, Bawuro M. 1985. The early states of the Central Sudan: Kanem, Borno and some of their neighbours to c. 1500 A.D. In Ajayi, Jacob F. & Crowder, Michael (eds.), History of West Africa, 3rd edn. Vol. 1, 225–254. Harlow.
Barth, Heinrich. 1862. Sammlung und Bearbeitung Central-Afrikanischer Vokabularien [Collection and edition of Central African vocabularies]. Vol. 3. Gotha.
Bender, M. Lionel. 1996. The Nilo-Saharan Languages: A comparative essay. München: LINCOM Europa.
Braukämper, Ulrich. 2004. Towards a Chronology of Arabic Settlement in the Chad Basin. In Krings, Matthias & Platte, Editha (eds.), Living with the Lake: Perspectives on History, Culture and Economy of Lake Chad, 148–170. Köln: Rüdiger Köppe.
Bulakarima, Shettima U. 1997. Survey of Kanuri Dialects. In Cyffer, Norbert & Geider, Thomas (eds.), Advances in Kanuri scholarship, 67–76. Köln: Rüdiger Köppe.
Cyffer, Norbert. 1994. English-Kanuri Dictionary. Köln: Rüdiger Köppe.
Cyffer, Norbert. 1998. A Sketch of Kanuri. Köln: Rüdiger Köppe.
Cyffer, Norbert. 2000. Linguistic properties of the Saharan languages. In Zima, Petr (ed.), Areal and genetic Factors in Language Classification and Description: Africa South of the Sahara, 30–59. München: LINCOM Europa.
Cyffer, Norbert. 2007. Kanuri Morphology. In Kaye, Alan S. (ed.), Morphologies of Asia and Africa, 1089–1126. Winona Lake, IN: Eisenbrauns.
Cyffer, Norbert & Hutchison, John P. 1991. Dictionary of the Kanuri language. Dordrecht: Foris.


Cyffer, Norbert & Löhr, Doris & Platte, Editha & Tijani, Abba I. 1996. Adaptation and delimitation: Some thoughts about the Kanurization of the Gamergu. In SFB 268 (ed.), Berichte des Sonderforschungsbereiches 268, Vorträge Internationales Symposium, Frankfurt/Main 13.12.–16.12.1995, 49–66. Frankfurt/Main.
Ehret, Christopher. 2001. A Historical-Comparative Reconstruction of Nilo-Saharan. (SUGIA-Beihefte 12). Köln: Rüdiger Köppe.
Gordon, Jr., Raymond G. (ed.). 2005. Ethnologue: Languages of the World. 15th edn. Dallas, TX: SIL International.
Greenberg, Joseph H. 1947. Arabic Loan-Words in Hausa. Word 3:85–97.
Greenberg, Joseph H. 1960. Linguistic Evidence for the Influence of the Kanuri on the Hausa. Journal of African History 1(2):205–212.
Greenberg, Joseph H. 1963. The Languages of Africa. (International Journal of American Linguistics 29.1). Bloomington: Indiana University Press.
Greenberg, Joseph H. 1981. Nilo-Saharan movable k- as a stage III article (with a Penutian typological parallel). Journal of African Languages and Literature 3:105–112.
Hutchison, John P. 1981. The Kanuri language: A reference grammar. Madison, WI: African Studies Program, University of Wisconsin.
Hutchison, John P. 2000. Predicate Focusing Constructions in African and Diaspora Languages. In Wolff, H. Ekkehard & Gensler, Orin D. (eds.), Proceedings of the 2nd World Congress of African Linguistics, Leipzig 1997, 577–591. Köln: Rüdiger Köppe.
Jakobi, Angelika & Crass, Joachim. 2004. Grammaire descriptive du beria (langue saharienne) [Descriptive grammar of Beria (Saharan language)]. (Nilo-Saharan: Linguistic Analyses and Documentation 18). Köln: Rüdiger Köppe.
Jarrett, Kevin. 1988. Dialectes et alphabétisation dans les écoles: Une étude explorative de l’intercompréhension des différents dialectes kanuri du Niger [Dialects and alphabetization: An explorative study on the mutual intelligibility of the different Kanuri dialects of Niger]. Journal of West African Languages 18(2):105–124.
Jourdan, Paul. 1935. Notes grammaticales et vocabulaire de la langue Daza [Grammatical notes and vocabulary of the Daza language]. London: Kegan Paul, Trench, Trubner and Co.
Jungraithmayr, Herrmann & Leger, Rudolf & Löhr, Doris. 2004. "Westwärts zieht der Wind": Migrationen im südlichen Tschadseegebiet ["Westwards moves the wind": Migration in the southern Lake Chad region]. In Albert, Klaus-Dieter & Löhr, Doris & Neumann, Katharina (eds.), Mensch und Natur in Westafrika: Ergebnisse aus dem Sonderforschungsbereich "Kulturentwicklung und Sprachgeschichte im Naturraum Westafrikanische Savanne", 169–195. Weinheim: Wiley-VCH.
Koelle, Sigismund W. 1854. Grammar of the Bornu or Kanuri language. London: CMS House.
Lang, Karl. 1923/1924. Arabische Lehnwörter im Kanuri [Arabic loanwords in Kanuri]. Anthropos 18/19:1063–1074.


LeCoeur, Charles & LeCoeur, Marguerite. 1956. Grammaire et textes Teda-Daza. Dakar: Mem. IFAN.
Löhr, Doris. 2007. Nigerian Kanuri (Sub-)Dialects Reconsidered: A Corpus-based Approach. In Payne, Doris & Reh, Mechthild (eds.), Advances in Nilo-Saharan Linguistics, 165–182. Proceedings of the 8th Nilo-Saharan Linguistics Colloquium, University of Hamburg, August 22-25, 2001. Köln: Rüdiger Köppe.
Löhr, Doris. 2009. Lake Chad and the migratory routes to Borno: A linguistic trail. In Tourneux, Henry (ed.), XIIIth Mega-Chad Conference “Migrations and spatial mobility in the Lake Chad Basin”, Maroua, 31.10.–3.11.2005 (Colloques et seminaires), 665–681. Montpellier/Marseille: Editions de l'IRD.
Lukas, Johannes. 1937. A study of the Kanuri language, grammar and vocabulary. London/New York: Oxford University Press for the International Institute of African Languages and Cultures.
Lukas, Johannes. 1951/52. Umrisse einer ostsaharanischen Sprachgruppe [Outlines of an Eastern Saharan language group]. Afrika und Übersee 36:3–7.
Lukas, Johannes. 1953. Die Sprache der Tubu in der zentralen Sahara [The language of the Tubu in the central Sahara]. Berlin: Deutsche Akademie der Wissenschaften, Institut für Orientforschung.
Nachtigal, Gustav. 1967 [1881–1889]. Saharâ und Sûdân: Ergebnisse sechsjähriger Reisen in Afrika [Sahara and Sudan: Results of six years of travel in Africa]. Vols. 1–3. Original publication: Vol. 1–2, Berlin 1879–1881; Vol. 3, Leipzig 1889.
Ortman, Mark. 2001. The role of the reflexivity morpheme in Teda verb structure. Hamburg: Paper read at the 8th Nilo-Saharan Linguistics Colloquium, University of Hamburg, 22nd–25th August, 2001.
Petráček, Karel. 1979. Zur inneren Rekonstruktion des zentralsaharanischen Verbalsystems. Asian and African Linguistic Studies 9:93–127.
Petráček, Karel. 1987. Berti or Sagato-a (Saharan) vocabulary. Afrika und Übersee 70(2):163–193.
Rohlfs, Gerhard. 1984. Quer durch Afrika: Die Erstdurchquerung der Sahara vom Mittelmeer zum Golf von Guinea 1865–1867 [Across Africa: The first crossing of the Sahara from the Mediterranean Sea to the Gulf of Guinea 1865–1867]. Reprint. Lenningen: Edition Erdmann.
Schuh, Russell G. 2003. The Linguistic influence of Kanuri on Bade and Ngizim. Maiduguri Journal of Linguistic and Literary Studies 5:55–89.
Skinner, Neil. 1996. Hausa Comparative Dictionary. Köln: Rüdiger Köppe.
Wehr, Hans. 1976. Arabic-English Dictionary. Cowan, J. Milton (ed.). Ithaca: Spoken Language Services.
Wexler, Paul. 1980. Problems in monitoring the diffusion of Arabic into West and Central African languages. Zeitschrift der Deutschen Morgenländischen Gesellschaft 130:522–556.


Doris Löhr and H. Ekkehard Wolff, with Ari Awagana

Wolff, H. Ekkehard & Löhr, Doris. 2005. Convergence in Saharan and Chadic TAM Systems. Afrika und Übersee. Special Vol. 88: Johannes Lukas (1901–1980), 265–299. Hamburg.
Wolff, H. Ekkehard & Löhr, Doris. 2006. Encoding Focus in Verbal Morphology: Predication Focus and the “Kanuri Focus Shift”. In Fiedler, Ines & Schwarz, Anne (eds.), Papers on Information Structure in African Languages (ZASPIL 46), 185–209. Berlin: ZAS.
Zakaria Fadoul, Khidir. 1996. Quelques caracteristiques des verbes du beria [Some characteristics of Beria verbs]. Afrikanistische Arbeitspapiere 47:77–81.
Zakaria Fadoul, Khidir. 2002. Lexique des animaux chez les Beri du Tchad. (University of Leipzig Papers on Africa, Languages and Literatures 17). Leipzig: Institut für Afrikanistik.
Zakaria Fadoul, Khidir. 2005. Bases et radicaux verbaux. Déverbatifs et déverbaux du beria (langue saharienne). (Nilo-Saharan: Linguistic Analyses and Documentation 20). Köln: Rüdiger Köppe.

5. Loanwords in Kanuri


Loanword Appendix Arabic dúnyâ lárd!' báhàr sàmî rád!' nûr máfì tíwàl nyìyâ lòrúsà yâl dábbà l!'mân

f!.r làwát!*rà àkú hámmà líwà lúsùr zákàr fárgì rô kawîn *d!'mb!*r sàkkân yínàbí zàitûn súwùr kàlwús!'n harîl líwùlà màsíllá bàrmús g!'májè kûp hìjâp líwù alwúta *mùsùwár sàwûl/shàwûl làíma tákkà/táà kùrîs fàt!*là mósùwà

world, weather land, country sea/ocean sky, heaven bolt of lightning light snow, ice baby wedding wedding family animal animal, livestock horse mule parrot ? to yawn corpse lazy penis vagina life corpse buttocks kettle grape olive sugar cotton silk needle (1) awl cloak shirt boot veil pocket handkerchief pin soap tent window chair lamp/torch lamp/torch

shàmê *wás!*là làámà sháìr lèmûn sàgàd!' jìnjîr dínàr lìwúlà hàlált!* *bàyîl àrájì àlwúsùr mágàsù ákkì sáw!'rt!* kàsúwù támàn bás míyà àzàlt!* sâ lóktù fájàr àbàdá s!*wà lád!' lìtìlîn tálàg!* láráwà làmís!' júmmà s!*bd!' zàmân *yîm *kàjílí ásàr lánnù *lìwùllá sâ t!*w!*rít!* tájírwà gàf!'rt!* àfùt!* áiwù sháwà ángàlla

candle ridgepole wheat barley citrus fruit pumpkin chain gold silver to own stingy tax wages chisel tax to trade, to barter market price only a hundred to hurry hour, time, season time dawn never morning Sunday Monday Tuesday Wednesday Thursday Friday Saturday season day afternoon afternoon color blue luck to regret danger to forgive to forgive fault beautiful clever

*n!'mm!'náfùk hángàl dàwárì sówórì màkàràndí záhir àsír bàyènt!* dàlíl nà’ám k!'ràt!* *táwàt gùlt!* ájàp kálìmá jàáwù àrdìt!* àng!'rt!* wàd!' got!* rùwùt!* kìtáwù jàmâ wàlàdí ádà hêr ásk!'r búndúg!' súlwé k!'násàr gànímà *l!'wálà shàrâ shàràrám hówúm shàdàmá àlàptà hówúmt!* wówom háá àzáwù àlà sàlìt!* àlàbè àshêm got!* shètán jíndì

deceit mind idea idea school clear secret to explain cause yes to study, to read certain to say, to tell astonished word to answer, speech to admit to deny to promise to write book people servant custom peace army, soldier gun armor victory booty quarrel law court judgment witness to swear to condemn, to convict penalty fine penalty god to pray holy to fast demon ghost


mashídi àshêm z!'mt!* sháyì *kàtífà *gàsásà âù lâ

mosque to fast tea mattress bottle or or

Hausa álád!' tìmbí káùrì dàngálì nasara tásà búródì mòsóró bàrwùnó yádì lîn s!'lékì dànkìkì sósì àníní áská téwùr làmbàtú tàwá dàngálì górà mágásù dálmà góngóng fìnjâl gúnkì ámálánké

pig stomach, belly grave potato plate, bowl bread pepper chili pepper cloth linen silk shirt sock button razor table ditch tobacco sweet potato bamboo scissors lead tin/tinplate glass statue, idol cart/wagon

s!*f!'rí kúllúm àgógó mál!'m shég!* àlgálàm l!'gálì àdîn cócì fádà gúmnátì jèrídà róbà s!'kùld!'réwà kanandîr àdréshì

zero always clock teacher doubt pen judge religion temple priest government newspaper plastic screwdriver petroleum address

English fôk ínjìn tóngù kôt sìkêt bêl táwùl bùrôs g!'r!/s b!'rkí g!'lâs féntì bôl ódà dío sójì klóp hélmèt

fork mill (engine) tongs coat skirt belt towel brush grease brick glass paint ball to command soldier, army club helmet

f!*rs!'nà gàdìmá fómfòm tî tíbî báskùr mátò bâs b!'rkì bakta dóktà f!'rèsìdân mìnístà pòlîs póskàt bánkì sìgárì kàlándá bátèr míntì bôm

prisoner guard tap/faucet tea television bicycle car bus to brake nurse president minister police postcard bank cigarette calendar battery candy/sweets bomb

French màntô kwâl jîp bùtô féngèl másô fêl ràtô mét!'rà kôl kósò

coat collar skirt button pin mason shovel rake hammer glue prison

Chapter 6

Loanwords in Tarifiyt, a Berber language of Morocco*

Maarten Kossmann

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Kossmann, Maarten. 2009. Tarifiyt vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1533 entries.

1. The language and its speakers

Tarifiyt Berber (also called Tarifit, Riffian or Rif Berber, in Tarifiyt !mazix! or !arif!"!)¹ is the name of a large group of dialects (cf. Lafkioui 2007) spoken in the northeastern part of Morocco. Its current number of speakers is unknown, as there are no published census data for native language use in Morocco, but population statistics of the provinces which are mainly Tarifiyt-speaking, Alhoceima and Nador, suggest it has between one million and a million and a half speakers.² Taqer’iyt (or Guelaya, !aq#$%#"!) is an eastern variety of Tarifiyt, spoken in the vicinity of the Moroccan town of Nador (Nna&’ua in Tarifiyt) and the Spanish enclave Melilla (M$i'). The language treated here is Taqer’iyt Tarifiyt Berber, as spoken by Mr. Khalid Mourigh, a student in his twenties originating from Segangane, a village which is now part of the Nador agglomeration. Although he has spent all his life in the Netherlands, Mr. Mourigh is a confident and reliable speaker of Tarifiyt. His data can be considered representative for the urban variety of Tarifiyt as spoken by the younger generation in the agglomeration of Nador.

¹ For an explanation of the transcription conventions, the reader is referred to the accompanying online database (cf. also Lafkioui 1997).

² The Arabic Wikipedia, quoting 2004 census figures, gives 728,634 inhabitants of Nador province and 395,644 for Alhoceima province. Of course, not all inhabitants of these provinces speak Tarifiyt, as there exists considerable internal migration in Morocco. On the other hand, Tarifiyt is also spoken by large communities outside these two provinces, because of substantial migration to other parts of Morocco and to Europe. There is no reason to follow McClelland (2004) in calling Tarifiyt an endangered language.

Berber is a separate branch of the Afroasiatic language phylum. It consists of a number of languages, spoken in Northern Africa (the Maghreb) and the Sahara. The time-depth of Proto-Berber is relatively shallow, and is probably similar to that of Germanic or Romance (Louali & Philippson 2004: 106, Kossmann 1999: 15). For this reason, some scholars consider Berber one single language with a considerable amount of dialectal diversity (e.g. Chaker 1995: 9). Subclassification of Berber languages and varieties is extremely problematic (Kossmann 1999: 30ff.), and no reliable classification has been proposed so far; therefore no subclassification will be presented here. Some scholars consider Tarifiyt part of a larger unit called Northern Berber, which comprises all Berber varieties of Morocco and (non-Tuareg) Algeria. This should probably be understood as a typological rather than as a genealogical classification.

Tarifiyt is part of a dialect continuum stretching towards the south into the eastern Middle Atlas (Ayt Warayn) and to the east into Beni Iznasen Berber, which is often considered to be part of Tarifiyt. Towards the west there are substantial differences between Tarifiyt and the Berber varieties spoken by the so-called Senhaja de Sraïr, which one might consider a separate language. To the west and to the south of the Tarifiyt speaking region there are large groups of speakers of Moroccan Arabic.

There exists no standard variety of Tarifiyt, and speakers have more positive attitudes to their own dialect than to that of others. Still, some influence emanating from the main centers can be discerned, and it is not unusual to hear typical Nador Tarifiyt forms such as amm=u ‘so’ and the past tense marker ttu(a in regions where the traditional dialect has other forms (such as amy=a and ))a).

Tarifiyt is mainly used inside the region where it belongs in spoken communication. Speakers of Tarifiyt use Moroccan Arabic when speaking to people from outside the region, or with people in the region who do not know Tarifiyt. Writing and the more formal genres in mass media use Standard Arabic; informal genres in the mass media (e.g. talk shows, interviews with football players, and washing powder advertisements) may use Moroccan Arabic. In spite of much effort undertaken by Riffian writers and activists, the use of Tarifiyt in writing or in the mass media is still marginal. In the popular music of the region, on the other hand, Tarifiyt competes seriously with Moroccan Arabic and foreign languages.
For many years, efforts have been made to create a standard Berber variety, which would be valid for all Moroccan and Algerian Berber languages. This creation combines words and constructions from different Berber varieties and aims in particular to oust all Arabic loan influence. Standard Berber is sometimes used in writing, normally with many explanatory notes in French, Arabic, or another Berber language (cf. the newspaper articles on the front page of the Nador-based journal Tawiza). In normal communication, only a few iconic neologisms are regularly used, such as azul ‘hello’, and !i$!lli ‘freedom’. While many speakers would recognize these words, their use is restricted to persons who are involved in the Berber issue.

In 2003, the Moroccan government started a program to introduce Berber in experimental education. In the first year of primary education, one of three Berber languages is taught, depending on the region: Tashelhiyt, Central Moroccan Berber, or Tarifiyt. The language of the schoolbooks is designed to contain no Arabic loanwords; where there are alternatives inside Tarifiyt, the Berber alternative is chosen, and where only the Arabic loan exists in Tarifiyt, a term is borrowed from another Berber language or a new term is coined. As a result, the schoolbook language is not mutually intelligible with spoken Tarifiyt, and must be considered a different language. From the second school year onward, a common Moroccan Berber standard language is taught; this variety is of course still further from spoken Berber.

2. Sources of data and scholarly history

The data source for this article and the subdatabase is the linguistic knowledge of one speaker, Mr. Khalid Mourigh. The data in the database were entered in a cooperative effort by Mr. Mourigh and the present author. In a few cases, they were supplemented by other speakers, whom Mr. Mourigh consulted.

Unless mentioned otherwise, Moroccan Arabic data come from Harrell & Sobelman (1966) and Iraqui Sinaceur (1993).³ The term Maghribine Arabic, which is in fact a cover term for all Arabic varieties spoken in Morocco, Algeria and Tunisia, is used only when the word in question was not found in any dictionary of Moroccan Arabic, but appears in a dictionary of a Maghribine variety from outside Morocco. The tacit assumption is that the absence of these words in the Moroccan dictionaries is due to insufficient documentation.

³ Other dictionaries that have been used are Beaussier (1931), Prémare (1993-), Sabia et al. (2000), Vycichl (1983), and Wehr (1976).

Tarifiyt has been studied from the late nineteenth century onward. Important pre-colonial and colonial works are Biarnay (1911), a description of the Tarifiyt variety of the ancient immigrant community of Vieil Arzeu in Algeria; Biarnay (1917), which presents a large-scale dialect overview of Tarifiyt phonetics, as well as a vocabulary and texts; and Renisio (1932), which is a relatively unsophisticated grammatical sketch, supplemented by high quality transcriptions of texts and a vocabulary. Ibáñez (1944) is a large, but often problematic, Spanish-Tarifiyt vocabulary. Some post-independence studies are Chami (1979), a phonological and morphological overview of Nador Tarifiyt; Cadi (1987), a morphosyntactic study of the Nador Tarifiyt verbal system; Kossmann (2000), a grammatical sketch of Eastern Riffian Berber (Beni Iznasen); and the large, but still unpublished Tarifiyt dictionary by Mohammed Serhoual (2002). Lafkioui (2007) is a very detailed dialect atlas of Tarifiyt and neighboring Berber varieties. McClelland (2004) is a Tarifiyt-English dictionary, which is riddled with errors and virtually useless as a data source.

Contact influence on Berber is a relatively neglected subject. There are a number of studies on Punic and Latin influence (e.g. Vycichl 2005, Brugnatelli 1999). The Arabic influence, which is much stronger, has never been investigated systematically, and many questions remain unanswered. Thus, as far as I know, nobody has ever studied the question of why some Arabic CVC (“hollow”) verbs are taken over in their Imperfective form, while others are taken over in their Perfective form. Similarly, there exists no study that even poses the question of why some Arabic nouns receive full Berber morphology, while others retain most of their original morphology.


Map 1: Geographical setting of Tarifiyt

3. Contact situations and contact history

Proto-Berber probably does not pre-date 500 BCE; any possible loan influences from before this date are impossible to trace, and will not be treated here. The earliest loanwords which can be traced are a few Wanderwörter of different origins, such as !iyni ‘date’, which ultimately comes from Ancient Egyptian (Kossmann 2002). Another example, no longer used in Tarifiyt, is the Berber word az’r!f ‘silver’, which may have an Iberian source (Boutkan & Kossmann 2001).

The first identifiable group of loanwords is due to the Phoenician and Carthaginian influence on Northern Africa (cf. Vycichl 2005: 2–16 for an overview) and has been borrowed from Phoenician or Punic. Many of the proposed Punic etymologies are disputable, though, and most of the unproblematic Punic elements have been lost in Tarifiyt. The main exception to this is the Tarifiyt toponym a*&ia, which is well attested as a noun elsewhere in Berber (e.g. Tashelhiyt agadir ‘fortified place’) and goes back to Punic g-d-r ‘fence’ (Vycichl 2005: 3). An additional problem in identifying Punic vocabulary is the possibility of later borrowing from the sister language of Punic, Hebrew. This is the case of Tarifiyt $m!& ‘to learn’, which is probably derived from the Semitic (but not Arabic) verb l-m-d ‘to learn’. This verb could very well go back to Punic (Vycichl 2005: 3–4), but a derivation from Hebrew (or Aramaic) is at least as likely. Before the advent of Islam, an important part of the Berber population adhered to Judaism, so influence from Hebrew in the realm of learning is not unexpected.

Much stronger influence was exercised by Latin (see, among others, Vycichl 2005: 16–32, Brugnatelli 1999). Quite a number of terms have been borrowed from this language, many of them related to agriculture. Some examples are a!mun ‘plough-beam’ < Latin tēmō(nem) (Laoust 1920: 286); asnus ‘donkey foal’ < Latin asinus ‘donkey’. It is sometimes possible to differentiate between earlier and later loans from Latin. The earliest Latin loans are taken over in the nominative singular (as in the case of asnus < asinus, NOM:SG), while later loans are based on accusative forms (as in the case of a!mun < tēmōnem, ACC:SG). For the later period, one may prefer to call the donor language North-African Romance rather than Latin. It is uncertain how long Romance persisted in Northern Africa after the Arabic conquest, but the presence of Romance influence on Maghribine Arabic shows that it must have been a significant linguistic factor in the early period of Islamic rule, especially in the northwestern part of Morocco, a region adjacent to where Tarifiyt is spoken (Colin 1926: 65–68).

Northern Africa was subdued by Islamic troops in the course of the seventh century CE. At first, this conquest probably did not have much impact on the linguistic practices of Berber speaking populations. Arabic settlements were mainly found in the cities, which at that time were probably mainly Romance speaking (Levy 1998). In order to bring the Islamic faith to the Berbers, special religious vocabulary was designed (van den Boogert & Kossmann 1997), using a blend of heavily adapted loanwords – e.g. z’a)) ‘to pray’ < Arabic s’all-; z’um ‘to fast’ < s’-ma – and neologisms, such as the names of the daily prayers, not preserved in Tarifiyt. As these terms are found all over Berber, and both with groups adhering to Sunnite and to Kharijite Islam, this vocabulary must have been introduced at a time when the Islamic schism was not yet a major issue in Berber country. This suggests a time before or during the Kharijite predominance in Northern Africa, i.e. in the eighth century CE or earlier. These terms must have been spread by missionaries using Berber as their language of religious teaching.

Classical Arabic (and its offshoot Standard Arabic) was the only language in international politics, religion, and learning from the advent of Islam until the colonial period (which, for the Rif, started in 1912). As elsewhere in the Arab world, it is mainly a language of written and recited texts; only in contexts involving formal education and international communication is it sometimes used in conversation. The immigration of many Arabic-speaking people from the east, as well as language shift by large groups of Romance and Berber speaking autochthonous people, led to the establishment of dialectal Maghribine Arabic as a major language in the area. Nowadays, over half of the population of Morocco has dialectal Arabic as their mother tongue, and it is used everywhere as a lingua franca.


The great bulk of loanwords from Arabic in Tarifiyt have been taken from Maghribine dialectal Arabic; most of them have close correspondents in present-day Eastern Moroccan Arabic. In Northern Africa, Arabic has been the dominant language for a long period. This does not mean that the social circumstances under which borrowing took place are easy to reconstruct. In pre-colonial times, there never was a policy aiming at the introduction of Arabic in other than religious and literary contexts, and dialectal Arabic, the main donor of Arabic lexicon in Tarifiyt, never had any special status. During most of its history, this part of Morocco recognized the religious authority of the Moroccan sultan, but did not submit to his secular power. Thus, political dominance of speakers of Arabic was only rarely an issue in the Tarifiyt speaking country before Morocco regained its independence in 1956.

This leaves us with a thorny question: if socio-political pressure from Arabic was relatively weak (except for domains such as law and religion), why are there so many loanwords from dialectal Arabic in Tarifiyt? Trade may have played a paramount role. In the Moroccan countryside trade is organized through weekly markets, every village on its own weekday, and traders make a tour of these markets. It is quite possible that these traders used Arabic as their language of communication; some of them because it was their native language, others because some of the markets they would trade in were in Arabic-speaking villages. Thus, Arabic would have become the dominant language of the markets, and many important items of vocabulary could enter Berber in this way. This scenario is suggested by the fact that some areas of vocabulary which are highly affected by borrowing consist of words which are frequent in a market context, such as numerals and names of fruits and vegetables.

Since the colonial period, the Tarifiyt-speaking country has become increasingly integrated into the fabric of Moroccan society. As Moroccan Arabic is the main language of communication outside the village, an important number of Moroccan Arabic loanwords may have entered the language during the twentieth century. It should be stressed, however, that linguistic studies which predate the colonial occupation of Morocco in 1912 clearly show that the strong lexical influence of dialectal Arabic on Tarifiyt was already there before colonial times.

Spanish has been a language in the region since Spanish troops occupied Melilla in 1497, a foreign presence that continues until the present day. There is not much evidence, though, for substantial Spanish influence on Tarifiyt before the start of the occupation of Northern Morocco in 1912. In fact, one suspects that a large percentage of Spanish loanwords was borrowed after Morocco regained its independence (1956), and is due to the intensive trading (and smuggling) relations between Nador and the Spanish enclave Melilla, as well as to easy access to Spanish radio and television. This is suggested, among other things, by the fact that Spanish influence is much more prominent in Taqer’iyt Tarifiyt, spoken near Nador and Melilla, than in neighboring Tarifiyt dialects, such as Ayt Sa’id, which are spoken in regions somewhat further away from Melilla. In many cases Taqer’iyt Tarifiyt has borrowed Spanish terms where other Tarifiyt dialects use dialectal Arabic or French loanwords.


In contrast to most parts of Morocco, the other colonial language, French, was never a major factor in the region, as it belonged to the Spanish part of the protectorate. French loanwords mainly entered Tarifiyt through Moroccan Arabic.

A special category is constituted by recent Standard Arabic loans, which can be recognized by specific phonological and morphological features. These are mainly the consequence of formal teaching, Standard Arabic being the main language of education in Morocco.

As is probably the case in all languages with heavy lexical borrowing, bilingualism must have played a major role in the introduction of loanwords. Nowadays, most speakers of Taqer’iyt Tarifiyt are fluent in at least two languages, Tarifiyt and Moroccan Arabic. Many will have a reasonable knowledge of Spanish too. In addition to these three languages, those who received formal education (a rapidly increasing percentage) know Standard Arabic and French as well. Due to the large-scale migration of Tarifiyt speakers to Europe, especially the Netherlands, Belgium and Germany, many people know Dutch or German. One cannot, of course, project this picture on the pre-colonial period. However, it is highly probable that bilingualism in Moroccan Arabic and Tarifiyt has long been widespread, especially among men.

4. Numbers and kinds of loanwords

The subdatabase for Tarifiyt has 51.7% borrowings. Among the semantic groups assigned to words in the database, none has a percentage of loanwords below 20%, while only three fields (Miscellaneous function words, The body, and Kinship) have a percentage below one third. Table 1 summarizes the results of the subdatabase as regards the incidence of borrowing in the semantic groups into which the lexical data have been arranged.

Borrowings are found in most parts of speech; they are very common in the open lexical classes: nouns, verbs, adjectives and adverbs. As far as function words are concerned, loans are absent in the personal pronoun system and rare among prepositions; on the other hand they are relatively common among coordinating and subordinating particles, and all numerals but ‘one’ are loanwords. Table 2 provides the percentages according to semantic word class.


Table 1: Loanwords in Tarifiyt Berber by donor language and semantic field (percentages)

      Semantic field                    Dialectal Arabic   Total loanwords   Non-loanwords
  1   The physical world                      38.1              41.8              58.2
  2   Kinship                                 28.0              30.5              69.5
  3   Animals                                 27.2              39.5              60.5
  4   The body                                28.9              29.5              70.5
  5   Food and drink                          40.4              49.0              51.0
  6   Clothing and grooming                   60.5              74.8              25.2
  7   The house                               51.3              70.9              29.1
  8   Agriculture and vegetation              38.7              51.3              48.7
  9   Basic actions and technology            42.6              49.7              50.3
 10   Motion                                  37.8              46.3              53.7
 11   Possession                              55.0              63.0              37.0
 12   Spatial relations                       29.7              34.8              65.2
 13   Quantity                                55.0              66.4              33.6
 14   Time                                    62.0              65.7              34.3
 15   Sense perception                        36.7              40.9              59.1
 16   Emotions and values                     55.0              60.6              39.4
 17   Cognition                               51.8              68.5              31.5
 18   Speech and language                     52.0              61.7              38.3
 19   Social and political relations          59.1              64.3              35.7
 20   Warfare and hunting                     56.4              71.8              28.2
 21   Law                                     48.2              62.4              37.6
 22   Religion and belief                     66.2              96.1               3.9
 23   Modern world                            40.6              93.1               6.9
 24   Miscellaneous function words            21.7              21.7              78.3
      all words                               41.7              51.7              48.3

The remaining donor columns contain empty cells, so their values cannot be matched to rows; they are listed per column, top to bottom:
Spanish/French: 1.2 7.0 0.5 7.5 12.5 15.3 6.3 4.7 6.1 6.0 1.3 6.9 3.7 4.2 3.7 8.1 4.8 10.3 4.7 3.9 41.4 6.3
Standard Arabic: 3.7 1.2 3.5 1.1 2.0 1.3 1.8 6.5 4.8 5.2 5.1 9.4 11.7 8.0 2.1
Pre-Islamic: 0.9 1.8 2.2 4.7 2.4 2.4 2.3 1.6 0.8
Classical Arabic: 2.3 0.5 10.4 0.3
Unidentified: 0.9 2.2 1.6 2.5 3.9 3.2 0.5

Table 2: Loanwords in Tarifiyt Berber by donor language and semantic word class (percentages)

      Word class        Dialectal Arabic   Total loanwords   Non-loanwords
      Nouns                   41.9              56.1              43.9
      Verbs                   40.9              44.1              55.9
      Adjectives              48.5              52.7              47.3
      Adverbs                 40.0              40.0              60.0
      Function words          35.4              39.5              60.5
      all words               41.7              51.7              48.3

The remaining donor columns contain empty cells, so their values cannot be matched to rows; they are listed per column, top to bottom:
Spanish/French: 8.8 2.4 2.5 3.1 6.3
Standard Arabic: 3.4 0.8 2.1
Pre-Islamic: 1.2 0.3 0.8
Classical Arabic: 0.1 0.5 1.0 0.3
Unidentified: 0.8 0.8 0.5
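The statements made about Table 1 in the running text can be cross-checked mechanically. The following is a minimal sketch in Python, assuming that the Total loanwords and Non-loanwords figures are assigned to the 24 semantic fields in the order in which they are printed; the names TABLE_1, inconsistent_rows and fields_below are illustrative and not part of the World Loanword Database.

```python
# (total loanwords %, non-loanwords %) per semantic field, as read from Table 1.
# Assumption: the values are listed in the same order as the field names.
TABLE_1 = {
    "The physical world": (41.8, 58.2),
    "Kinship": (30.5, 69.5),
    "Animals": (39.5, 60.5),
    "The body": (29.5, 70.5),
    "Food and drink": (49.0, 51.0),
    "Clothing and grooming": (74.8, 25.2),
    "The house": (70.9, 29.1),
    "Agriculture and vegetation": (51.3, 48.7),
    "Basic actions and technology": (49.7, 50.3),
    "Motion": (46.3, 53.7),
    "Possession": (63.0, 37.0),
    "Spatial relations": (34.8, 65.2),
    "Quantity": (66.4, 33.6),
    "Time": (65.7, 34.3),
    "Sense perception": (40.9, 59.1),
    "Emotions and values": (60.6, 39.4),
    "Cognition": (68.5, 31.5),
    "Speech and language": (61.7, 38.3),
    "Social and political relations": (64.3, 35.7),
    "Warfare and hunting": (71.8, 28.2),
    "Law": (62.4, 37.6),
    "Religion and belief": (96.1, 3.9),
    "Modern world": (93.1, 6.9),
    "Miscellaneous function words": (21.7, 78.3),
}


def inconsistent_rows(table, tolerance=0.05):
    """Return the fields whose two percentages do not add up to 100."""
    return [
        field
        for field, (loan, non_loan) in table.items()
        if abs(loan + non_loan - 100.0) > tolerance
    ]


def fields_below(table, threshold):
    """Return the fields with a loanword share below threshold, lowest first."""
    hits = [(loan, field) for field, (loan, _) in table.items() if loan < threshold]
    return [field for _, field in sorted(hits)]


if __name__ == "__main__":
    print(inconsistent_rows(TABLE_1))      # rows that fail the 100% check
    print(fields_below(TABLE_1, 20.0))     # fields below 20%
    print(fields_below(TABLE_1, 100 / 3))  # fields below one third
```

Under this reading every row sums to 100%, no field falls below 20%, and the only fields below one third are the three named in the text.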


5. Integration of loanwords

Loanwords in Northern Berber show different degrees of integration, both as regards phonology and, with nouns, morphological structure.

5.1. Phonological integration

As far as phonological integration is concerned, one may distinguish three major groups of loanwords. The first group has full integration into (early) Berber phonology, and replaces any foreign sound by Berber phonemes. This category consists of loanwords from Punic and Latin, as well as the religious vocabulary belonging to the earliest stratum of Arabic loanwords. For example, in the early Arabic loan z’um ‘to fast’, from Arabic s’-ma ‘to fast’, the pharyngealized (“emphatic”) s’ of the original has been replaced by the Berber phoneme z’.

In the second group, which is by far the largest group of borrowings, phonological adaptation is only partial. A large number of consonantal phonemes were borrowed from Arabic and Spanish, such as s’, q, %, ., p, but other phonemes undergo changes in the course of borrowing. Thus in loanwords of this group, l is changed to $, ll to )), while r is changed to a in many contexts. Non-geminate stops are replaced by fricatives which are pronounced more to the front than the original (e.g. alveolar plosive t > interdental fricative !). Moreover, Arabic t’ is mostly taken over as &’. The co-occurrence of newly introduced foreign phonemes with significant adaptation is illustrated by a word like .$u ‘to be sweet’ < .lu, which shows replacement of l by $, but retention of the foreign sound ..

In the third group, only minor adjustments to the Tarifiyt sound system take place, and some of the most conspicuous adaptations, such as the replacement of l by $, are absent. This is found with some dialectal Arabic loans (e.g. mli. ‘good’), but most of these words are borrowings from Standard Arabic or Spanish.

Of course, many loanwords defy categorization, as the original source words did not contain any sounds foreign to the Berber system. Thus, there is no way to determine the category to which am#n ‘to believe’ (< Moroccan Arabic am#n) would belong, as neither a, nor m, nor n are expected to undergo phonetic changes in the process of borrowing.

The difference between full integration and partial adaptation is clearly chronological in nature, full integration being limited to loanwords of the early Islamic period and before. The classical interpretation of the difference between partial integration and marginal adaptation would also be chronological. Many substitutions found in the partially integrated loanwords reflect sound changes that have taken place in Tarifiyt. Thus, the substitution of l by $ reflects the sound change l > $, which is also typical for Berber etyma, e.g. $um ‘straw’ < *lum (as found in the direct neighbor to the east, Beni Iznasen Berber). One might surmise that loans belonging to the partially integrated type (i.e. which have $ instead of l) were in fact borrowed before the sound change took place, while those which retain l were borrowed after the sound change.

This chronological interpretation turns out to be quite problematic. The l > $ sound change – to remain with this one example – is quite old. It is also found in the now probably extinct language of the Riffian emigrant community in the city of Vieil Arzeu in Algeria, which settled there around 1750 (Biarnay 1911: 21). This shows that it had already taken place 250 years ago, which would imply that the great bulk of Moroccan Arabic loanwords was already there in the eighteenth century, and that only a small number of loans entered the language afterwards. A much better explanation would be that (bilingual) speakers remained conscious of the sound correspondences found between Tarifiyt loanwords from Arabic and the real Arabic, and that they could therefore replicate, so to say, the sound change while adapting the loanword to the native phonological system.

Even today, the conscious manipulation of sound change can be witnessed. The vulgar term aq#))a$ ‘testicle’, which is (reluctantly) used by elderly people, is now paralleled in youth speech by aq#llal, a form which suggests a loan from Standard Arabic. As there is no Moroccan Arabic or Standard Arabic source for this word,⁴ the undoing of the sound change by the younger generation must be a conscious effort to create an educated effect, thereby giving this impolite word a learned connotation: a remarkable blend of euphemism and irony.

5.2. Morphological integration: verbs

As far as morphological integration is concerned, there is a major difference between verbs on the one hand and nouns and adjectives on the other. Loan verbs are always inserted into Berber morphological patterns. As Moroccan Arabic verb stem structure is quite similar to Berber patterns, a rare heritage from Proto-Afroasiatic, this integration is relatively simple. Once introduced into a Berber formal verb class, the Arabic verb undergoes stem alternations according to Berber patterns. As an example, in Table 3 the tense-aspect-mood (TAM) morphology of two Arabic loan verbs is compared to that of two inherited Berber verbs, belonging to the same formal classes (TAM terminology as in Kossmann 2009+):

Table 3: Examples of TAM morphology in native and loan verbs

                          Aorist   Perfective   Negative Perfective   Imperfective   Negative Imperfective
‘go in’ (inherited)       a&#f     u&#f         u&if                  tta&#f         tti&#f
‘believe’ (< Arabic)      am#n     um#n         umin                  ttam#n         ttim#n
‘scratch’ (inherited)     "m#z     "m#z         "miz                  "#mm#z         "#mm#z
‘bite’ (< Arabic)         z%#f     z%#f         z%if                  z#%%#f         z#%%#f

⁴ In fact, aq#))a$ is probably an expressive formation based on am#))a$ ‘testicle’, which itself is derived from the Berber word for ‘egg’.
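The parallel between loan verbs and inherited verbs in Table 3 can be made explicit by reducing each stem to a consonant-vowel template. The sketch below is only an illustration in Python: the forms are copied exactly as printed in Table 3, and it is an assumption of this sketch that a, i, u and the symbol # (read here as schwa) are the only vowel symbols, every other symbol counting as a consonant. The names template, PARADIGMS and same_pattern are hypothetical.

```python
# Assumption: these four symbols are vowels; all other symbols are consonants.
VOWEL_SYMBOLS = set("aiu#")


def template(form):
    """Replace every consonant symbol by C and keep the vowel symbols."""
    return "".join(ch if ch in VOWEL_SYMBOLS else "C" for ch in form)


# Stems in the order Aorist, Perfective, Negative Perfective,
# Imperfective, Negative Imperfective, as printed in Table 3.
PARADIGMS = {
    "go in (inherited)": ["a&#f", "u&#f", "u&if", "tta&#f", "tti&#f"],
    "believe (< Arabic)": ["am#n", "um#n", "umin", "ttam#n", "ttim#n"],
    "scratch (inherited)": ['"m#z', '"m#z', '"miz', '"#mm#z', '"#mm#z'],
    "bite (< Arabic)": ["z%#f", "z%#f", "z%if", "z#%%#f", "z#%%#f"],
}


def same_pattern(verb_a, verb_b):
    """True if both verbs show identical templates in all five TAM stems."""
    templates_a = [template(form) for form in PARADIGMS[verb_a]]
    templates_b = [template(form) for form in PARADIGMS[verb_b]]
    return templates_a == templates_b


if __name__ == "__main__":
    for verb, forms in PARADIGMS.items():
        print(verb, [template(form) for form in forms])
    print(same_pattern("go in (inherited)", "believe (< Arabic)"))
    print(same_pattern("scratch (inherited)", "bite (< Arabic)"))
```

On this reading each loan verb yields the same five templates as the inherited verb of its class, while verbs from different classes do not match.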


Arabic loan verbs can undergo derivation according to Berber patterns, and are inflected in the same way as Berber verbs.

5.3. Morphological integration: nouns

The situation with nouns is much more complicated. In fact, with nouns no fewer than four categories of morphological integration can be distinguished. The first category consists of words which are fully integrated into Berber nominal morphology. Such nouns have the inherited Berber nominal affixes, which indicate gender, number, and case (called “state” in the tradition of Berber studies), as shown in Table 4.

Table 4: The morphology of a native noun (‘bovine’) and a fully integrated loan (‘child’)

                             M:SG       M:PL          F:SG          F:PL
‘bovine’   Free State        a-funas    i-funas-!n    !a-funas-!    !i-funas-in
           Annexed State     u-funas    i-funasen     !-funas-!     !-funas-in
‘child’    Free State        a-.ram     i-.ram-!n     !a-.ran-t     !i-.ram-in
           Annexed State     w-!.ram    y-!.ram-!n    !-!.ran-t     !-!.ram-in

The second group of loanwords has quasi-dialectal Arabic morphology. Instead of the Berber prefix, these loanwords have an obligatory nominal prefix $-, which is derived from the Arabic definite article l-. As in Moroccan Arabic, this prefix is assimilated to a following alveolar consonant. Feminine forms replace the Arabic ending -a with -#! or -!, depending on the syllable structure of the noun stem. The interesting thing about this suffix is that, although it strongly resembles the inherited Berber suffix -! (F:SG), it is not identical to it. The Berber feminine singular suffix -! does not take a preceding schwa, while the ending used in Arabic loans normally does. One can see this when comparing a Berber noun such as !aw!#nt (

Seychelles Creole semen-d-fer, French pâturage > Seychelles Creole patiraz. In French loanwords starting with a vowel, the definite article is “agglutinated”, i.e. becomes part of the root, e.g. French l’auto > Seychelles Creole loto, French l’obscurité > Seychelles Creole lobskirite. In the plural, only the final consonant of the article is adopted into the root, e.g. les insectes ‘insects’ gives rise to Seychelles Creole zenzek. In some lexemes, we see the loss or simplification of the final obstruent cluster of the French etymon, e.g. French ministre > Seychelles Creole minis (cf. also les insectes > zenzek). These same phonological processes also took place during creolization in the 18th century. This makes it difficult, if possible at all, to distinguish French loanwords which entered the Creole in the 19th and 20th centuries from those French words which formed the earliest Creole in the 18th century.

7. Grammatical borrowing

Compared to the low percentage of loanwords in the Seychelles Creole lexicon, we find substantial grammatical borrowing in Seychelles Creole, e.g. from eastern Bantu languages. Ditransitive constructions are an important case of substrate influence in Seychelles Creole (e.g. Michaelis & Haspelmath 2003). In contrast to the pattern of the indirect object construction in French, Pierre donne le livre à Marie ‘Pierre gives Marie the book’ (recipient marked by à), Seychelles Creole allows only the double object construction: Pierre donn Marie liv. In this construction neither recipient nor theme is marked; both are simply juxtaposed after the verb. This pattern is clearly inherited from the eastern Bantu substrate languages. PATH constructions in which ‘motion-to’ and ‘motion-from’ are not marked differently (lit. ‘I go/come in the market’) and noun coordination expressed by comitative ‘with’ ek (‘A with B’ meaning ‘A and B’) are two other salient construction types which clearly have their source in eastern Bantu languages. Note that only patterns are borrowed, not the grammatical function words themselves, e.g. na ‘with’ from Bantu. For a more detailed discussion, see Michaelis (2008).


Susanne Michaelis with Marcel Rosalie

8. Conclusion

Seychelles Creole, a language which evolved in a high contact situation in the 18th century, shows only slightly more than 10% loanwords within the analyzed set of LWT meanings, where loanwords are defined as all lexemes which have not been inherited from 18th-century French. This finding may be interesting in that one would have expected a much higher percentage of words in the resulting creole language deriving from the relevant substrate languages, eastern Bantu and Malagasy. Apparently, in the domain of the lexicon the creole-creating speakers found words of the colonial French varieties more vital in their daily social communicative contexts. The detailed and rich lexicon of Seychelles Creole based on colonial French mirrors this fact. Only 2.6% of the overall analyzed set of lexemes have eastern Bantu and/or Malagasy etyma. The overwhelming majority of loanwords (in our definition) come from the languages of the former colonial powers, English and French. So far we have not mentioned lexical calquing. The reason for not treating this phenomenon is that it seems to be rare in Seychelles Creole, even though one should admit that it is extremely difficult to detect. One would need to know both Seychelles Creole and the possible source languages of calques very well in order to identify the parallel structures in both. There is one often-cited example of lexical calquing in Seychelles Creole, the word for the praying mantis, kasbol, apparently calqued from Swahili kivunjajungu ‘break pot’ (cf. Baker 1993: 131). The Swahili expression refers to the superstition that the person who kills a mantis will break the next thing he touches, e.g. a bowl (Baker 1982: 119). As the creole expression kasbol can be segmented into two parts, kas ‘to break’ and bol ‘bowl’, the Seychelles Creole item is clearly calqued from Swahili. Whereas lexical calquing seems to be rare in Seychelles Creole, grammatical calquing has a much more important role, as was briefly alluded to in §7 above.

References

Allen, Richard B. 2001. Licentious and unbridled proceedings: The illegal slave trade to Mauritius and the Seychelles during the early nineteenth century. Journal of African History 42:91–116.
Baker, Philip. 1982. The contribution of non-Francophone immigrants to the lexicon of Mauritian Creole. 2 vols. Ph.D. dissertation. University of London (SOAS).
Baker, Philip. 1993. African contribution to French-based creoles. In Mufwene, Salikoko S. (ed.), Africanisms in Afro-American Language Varieties, 123–155. Athens: University of Georgia Press.
Baker, Philip & Corne, Chris. 1986. Universals, Substrata and the Indian Ocean Creoles. In Muysken, Pieter & Smith, Norval (eds.), Substrata versus Universals in Creole Genesis, 163–183. Amsterdam: Benjamins.
Baker, Philip & Hookoomsing, Vinesh. 1987. Diksoner kreol morisyen. Paris: L'Harmattan.
Bickerton, Derek. 1981. Roots of Language. Ann Arbor: Karoma.
Bollée, Annegret. 1977. Le créole français des Seychelles: Esquisse d'une grammaire – textes – vocabulaire. Tübingen: Niemeyer.
Bollée, Annegret. 1993–2007. Dictionnaire étymologique des créoles français de l'Océan Indien [Etymological dictionary of the French creoles of the Indian Ocean]. 4 vols. Hamburg: Buske.
Brousseau, Anne-Marie & Lefebvre, Claire. 2002. A grammar of Fongbe. Berlin: Mouton.
Chaudenson, Robert. 1974. Le lexique du parler créole de la Réunion. 2 vols. Paris: Champion.
Chaudenson, Robert. 1979. Créoles français de l'océan Indien et langues africaines [French creoles of the Indian Ocean and African languages]. In Hancock, Ian F. (ed.), Readings in creole studies, 217–237. Ghent: E. Story-Scientia.
Chaudenson, Robert. 1992. Des îles, des hommes, des langues: Langues créoles – cultures créoles [Islands, people, languages: Creole languages – creole cultures]. Paris: L'Harmattan.
Corne, Chris. 1977. Seychelles Creole Grammar: Elements for Indian Ocean Proto-Creole Reconstruction. Tübingen: Narr.
de St Jorre, Danielle & Lionnet, Guy. 1991. Diskyonner kreol-franse. Dictionnaire créole seychellois-français. 2nd edition. Bamberg/Mahé.
DeGraff, Michel. 1999. Creolization, language change, and language acquisition: An epilogue. In DeGraff, Michel (ed.), Language creation and language change: Creolization, diachrony and development. Cambridge, MA: MIT Press.
Lefebvre, Claire. 1998. Creole genesis and the acquisition of grammar: The case of Haitian creole. Cambridge: Cambridge University Press.
Lionnet, Guy. 1972. The Seychelles. Newton Abbot.
McWhorter, John H. 1998. Identifying the creole prototype: Vindicating a typological class. Language 74:788–818.
McWhorter, John H. 2005. Defining creole. Oxford: Oxford University Press.
Michaelis, Susanne. 2008. Valency patterns in Seychelles Creole: Where do they come from. In Michaelis, Susanne (ed.), Roots of creole structures: Weighing the contribution of substrates and superstrates, 225–251. Amsterdam: Benjamins.
Michaelis, Susanne & Haspelmath, Martin. 2003. Ditransitive constructions: Creoles in a cross-linguistic perspective. Creolica April.
Mufwene, Salikoko S. 2001. The ecology of language evolution. Cambridge: Cambridge University Press.
Nwulia, Moses. 1981. The History of Slavery in Mauritius and the Seychelles: 1810–1875. Rutherford: Fairleigh Dickinson University Press.
Young, Rodolphine. 1983. Fables de La Fontaine traduites en créole seychellois. Introduction, notes, remarques sur la langue et glossaire par Bollée, Annegret et Lionnet, Guy. Hamburg: Buske.


Loanword Appendix Arabic (earlier donor language) otmil benn larak arack (alcoadz holic drink) flay dray Bemba koyn belenga ugly poketmonnen stor Chinese loke pot, Chinese wok cooking vessel top rouk Eastern Bantu layn daw little sailing boat draf blenk katyolo canoe fifti-fifti English louksi tyeke countryside plain egzibit waterfall waterfall dyanm ays ice tityer grani grandmother speech mice mouse bann dyagwar jaguar bolpenn boufalo buffalo boy fit healthy, tobrouk beautiful sik sick/ill snake to eat kouker oven, stove jug jug/pitcher brekfas breakfast dof dough aro kornflor flour brengen mortar (i.e. bowl elmet mortar for crushing) fort gines dark beer bodigard stocking (long stoking platoun sock worn by sentri men) plentif pin pin sarze headband or erbann bayk headdress batri tatou tattoo matronn flo floor ners stov stove tyermenn blennket blanket laysens shelf shelf feloni bim beam ermel septitenk ditch sped spade sink

oats flakes to bend, to fold adze to flee to drive coin pocket money shop/store to shut top hook line crowd empty half to look to look to show hard teacher speech to forbid pen servant auxiliary soldier from the Seychelles in WW2 (toponym of a fort in North Africa) arrow gun helmet fortress guard guard guard plaintiff to accuse bicycle battery nurse nurse president driver’s license crime letter (by airmail) sink

dablysi: toilet
kann (2): tin/can
tin: tin/can
kingsayz: cigarette

English (Indian)
gengan: buttocks, testicles
tifin: light meal

English or French
kongoulou, kangourou: kangaroo
poncho: poncho
boumerang: boomerang
masin: machine
plastik: plastic
sigaret: cigarette

Fon
koutou: ghost

French
lobskirite: darkness
sangliye: boar
testikil: testicles
penis: penis
vazen: vagina
vilv: vulva
nouritir: food
nilon: fishing line
kondannen: to condemn, to convict
sirkonsizyon: circumcision
radyo: radio
televizyon: television
telefonn: telephone
bisiklet: bicycle
moto: motorcycle
loto: car
bis: bus
semen-d-fer: train
avyon: airplane
kouran: electricity
lelektrisite: electricity
moter: motor
pilil: pill, tablet
pikir: injection

linet

spectacles/ glasses minister

pestle kind of bean minis kind of green bean French (Mauritian) sort of veil worn lanba by Malagasy konstab policeman women three stones touk French (West African) used as firebangala penis place tapinak, tapnak roof Hindi-Urdu kanbar yam bay strong soubik kind of basket pot with a round kalay anfangok crooked bottom takon many gadyak breakfast malang stinking, dirty fermented drink baka quiet, to be zin kapra cloak silent kamli blanket mandeng to tell lies pesa money lafrang, frang big fishhook bazar market vouv kind of fish trap malgol, golmal ugly misouk to steal boubak stupid topi colonial Portuguese helmet kanmaron prawns, shrimp cannabis bangi karapat tick cigarette panel bowl manteg ghee Kongo makeket, maket a kind of ant (Odontomachus haematoda) Makua kourpa moukapa kalipa

snail big land turtle strong, clever

Malagasy mantoun bib tanpann latet kelkel tanbav

kapkap

insect a kind of spider dandruff armpit children’s disease that involves diarrhea to eat quickly, voraciously

kalou bwenm zantak

Portuguese (Indian) baba karya kalen

baby termites tin (plate), coin

Sena maloumbo Swahili

big testicles

toto

child (young person, offspring) parrot monkey in creole fairy tales tooth belly testicles big testicles, hernia

kasoukou soungoula meno toumero bilenga poum

kalele tyakoula tembo madora konnose kapatya, pakatya kikapo katyakatya konan


penis (coll.) light meal fermented drink ornament or adornment to pound a basket made of plaited coconut leaves a kind of basket made of pandanus fiber a kind of rattle sorcerer, ghost

Tamil poukay dalon

drunk friend

Unknown origin kinday nanmkoyo

kololo tonkibo zoubrit sousouna potao foutaleza tonkonny

head louse Loggerhead Sea Turtle (Caretta caretta) penis (coll.) penis (coll.) penis (coll.) vagina (coll.) jug made of metal hut of Malagasy origin tree stump

Wolof tyaptyap

to eat quickly, gulp down one’s food

Yao kapor senga longanis dondosya

strong, wellbuilt person fermented drink sorcerer ghost, zombie

Chapter 8

Loanwords in Romanian*

Kim Schulte

1. The language and its speakers

Romanian, also known as Rumanian (sometimes also spelt Roumanian, especially until the 1940s), belongs to the Romance languages, which form a branch of the Indo-European language family. Among the Romance languages, Romanian belongs to the Daco-Romance sub-branch of the Eastern Romance branch. There are four distinct Daco-Romance languages, all of which are frequently referred to as different “dialects” of Romanian: Aromanian (c. 300,000 speakers in the Republic of Macedonia, Albania, northern Greece, Serbia and Bulgaria), Megleno-Romanian (c. 5,000 speakers in northern Greece and the Republic of Macedonia), Istro-Romanian (c. 1,000 speakers in the Istrian Peninsula in Croatia), and Daco-Romanian (c. 25 million speakers in Romania and Moldova). The subdatabase for Romanian is restricted to the lexicon of Daco-Romanian, the language generally referred to as Romanian in everyday usage. Throughout the remainder of this chapter, Romanian will be used as a synonym for Daco-Romanian.

Romanian, used in all domains from the most informal to the most official, is the official language of Romania and the adjoining Republic of Moldova. Both are located in south-eastern Europe, northeast of the Balkan Peninsula, in an area including the inner and outer arch of the southern Carpathian Mountains, from the lower Danube in the southwest and south of the territory to the river Dniester in the northeast. This Romanian-speaking area is surrounded by speakers of non-Romance languages, namely Hungarian and several Slavic languages (Ukrainian, Bulgarian, Serbian). Beyond the territories of Romania and the Republic of Moldova, Romanian has co-official status in the Vojvodina Province in northern Serbia, and speakers of Romanian also live in areas of Ukraine close to the Romanian and Moldovan borders. There is a large Romanian diaspora, estimated at around eight million people, with concentrations in North America, Australia and Israel; due to recent emigration, there are also Romanian communities of considerable size in Italy and Spain (about one million in each country).

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Schulte, Kim. 2009. Romanian vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 2137 entries.


Within Romania, several historically established minority languages are spoken by the corresponding ethnic groups, the most significant of which are the Hungarians in western and central Transylvania, as well as the Romani minority, the latter constituting approximately ten percent of the overall population. Smaller ethnic groups include Albanians, Turks (mainly along the Danube in south-eastern Romania), Tatars (mainly in the Dobrogea region), Russian Lipovens (in the Danube Delta) and speakers of other Slavic languages, mainly near the borders with the respective countries. Whilst Romanian is the second language for some speakers of these minority languages, the majority can be considered to be partly or fully bilingual. In the Republic of Moldova, the Turkic language Gagauz is spoken by approximately 150,000 inhabitants of the Province of Gagauzia, in the south of the country. In Transnistria, a region east of the river Dniester, approximately one third of the population are ethnic Russians and another third are ethnic Ukrainians. Within the remaining territory, there is a clear urban-rural divide, with a comparatively large proportion of ethnic Russians in the cities, especially in the capital Chișinău, due to migration during the period under Soviet rule; many native Russian speakers only have limited linguistic competence in Romanian.

Map 1: Geographical setting of Romanian

Romanian can be subdivided into two major dialect groups, the Muntenian-based dialects spoken in the south, and the Moldavian-based ones spoken in the north of


Romania and the Republic of Moldova. The official name of the national language of the Republic of Moldova is Moldovan or Moldavian, but linguistically speaking it is very similar to the neighboring dialects of north-eastern Romania. In general, Romanian has comparatively little dialectal variation, but regional differences can nevertheless be observed and are the basis for a distinction between dialects such as Moldavian, Transylvanian, or that of the Banat region. An important distinguishing feature between regional varieties is their lexicon, particularly lexical loans; unsurprisingly, those donor languages spoken in the immediate vicinity tend to be the source of a comparatively larger proportion of loanwords in the respective regional varieties. In order to provide loanword data for Romanian as a whole, the lexicon used for the subdatabase is not based on any specific regional dialect, but on what is considered to be part of the language according to the Romanian Academy’s dictionary (Coteanu et al. 1998). As a result, a number of the loanwords included are most commonly used in particular regions; in some cases this results in the incorporation of several synonyms borrowed from different source languages. The historic foundations for the emergence of Romanian were laid when the Dacians, inhabitants of an area broadly coinciding with modern-day Romania, were defeated by the Romans under Emperor Trajan between 101 and 106 CE, leading to the foundation of the Roman province of Dacia. This was followed by a period of intense colonization and Romanization, during which a regional variety of Popular Latin established itself as the local language. The contact with the rest of the Roman Empire was relatively short-lived, as the invading Goths forced Rome to pull out of Dacia after less than 170 years, around 271 CE. 
Despite the comparatively short duration of direct contact with the rest of the Roman Empire, language shift from the Thraco-Dacian substrate to Latin must have been sufficiently extensive for a Latin-based language that we might call proto-Romanian to completely replace the substrate language(s), though this may have been a gradual and prolonged process (see §3.1 below). Subsequently, various peoples invaded the area, generally moving in from the northeast and east. Whilst some invading tribes, e.g. the Huns, left few cultural and linguistic traces, other populations settled amongst the early Romanian speakers, notably Magyars (from the 9th century) and Slavs in several waves of migration (6th–11th century), providing ideal conditions for long-term linguistic contact. There is an ongoing debate as to whether ethnic Romanians have been living in areas north of the Danube, particularly in Transylvania, continuously since Roman times, or whether they were pushed back by a large Hungarian population, eventually returning to those areas at a later stage. As this debate is primarily politically motivated and linked to territorial claims, it will not be entered into here; in any case, the linguistic evidence suggests a considerable degree of cultural contact, typical of a situation of cohabitation over an extended time period. Other linguistically relevant historical events include the arrival of German settlers in Transylvania in the 12th and 13th centuries, encouraged by the Hungarian rulers, and the imposition of Ottoman suzerainty from the 16th century, bringing


the population into increased cultural, administrative and trade-based contact with other areas of the Ottoman empire, particularly modern-day Turkey, Bulgaria, and Greece.

2. Sources of data

The source of the lexical data, i.e. the Romanian words corresponding to the Loanword Typology meanings, was either the author’s personal knowledge or standard bilingual dictionaries (Isbășescu 1995; Savin et al. 1997; Levițchi & Bantaș 1992), complemented by the Romanian Academy’s monolingual dictionary (Coteanu et al. 1998) and a dictionary of synonyms (Seche & Seche 1997); the latter were used to identify any existing synonyms and to determine the degree of semantic overlap between near and partial synonyms. The two main sources of the etymological information that appears in the database are (a) the Romanian Academy’s Dicționarul explicativ al limbii române (Coteanu et al. 1998), which provides the source language and etymon, where known, for each entry, but does not supply any additional etymological explanation or discussion, and (b) Ciorănescu’s (1966) etymological dictionary of Romanian, which contains very detailed etymologies but has a limited number of entries. Where neither of these default sources provided a fully satisfactory etymology, it was either complemented with suggestions by time-honored Romanian philologists (Pușcariu 1943 [1997]; Philippide 1894; Hașdeu 1877, 1879, 1883), or more specialized studies of the etymology of loanwords from individual source languages were consulted. Among these, Wendt (1960) examines loans from Turkish, Miklosich (1860, 1862–65) investigates the incorporation of Slavic elements into Romanian, Conev (1921) looks at contact between Bulgarian and Romanian, Murnu (1894) and Diculescu (1924–26) investigate Greek elements in Romanian, whilst Cihac (1879) and McClure (1976) examine the loans from various source languages. Information regarding the exact word form of the source word was frequently taken from dictionaries of the respective languages, e.g. Newmark (1998) and Fiedler & Klosi (1997) for Albanian, Grujić (1998) for Serbian, and Steuerwald (1972) for Turkish. 
Information regarding the earliest known source word was obtained from etymological dictionaries of various languages, e.g. Corominas (1961) for loanwords shared with Spanish and Grebe (1963) for loanwords with German cognates.

3. Contact situations

For the present analysis of loanwords in Romanian, the focus lies on words borrowed into the language after Latin began to be used in the area where Romanian is spoken today. Whilst it is neither possible nor sensible to define an exact point in time at which Latin became Romanian, it can be ruled out that a Latin-based


Romanian language existed before Latin began to be used in the territory. Thus, loanwords in Latin such as camisia ‘shirt, alb’ from Germanic hemidi ‘mantle, shirt’, which entered Latin via Celtic and was passed on, like any native item, to its Romance daughter languages (Spanish camisa ‘shirt’, French chemise ‘shirt’, Romanian cămașă ‘shirt’), are certainly borrowed, but into Latin, not into Romanian. As the aim of this chapter is to examine the impact of borrowing on the lexical structure of Romanian and to compare and contrast these developments with other languages (including Romance sister languages of Romanian), the following survey examining contact situations that have left their traces in the Romanian lexicon will begin around the time when a specifically Dacian or Romanian regional variety of Popular Latin began to develop, namely in the second century CE, after the Roman conquest of the area.

3.1. Contact with Thraco-Dacian substrate languages and/or Albanian

Little is known about the Thraco-Dacian substrate spoken in the area before the shift to Latin, but it is generally assumed that it was an Indo-European language closely related to Albanian, perhaps even the direct ancestor of modern Albanian (du Nay 1996: 72). Whilst this Thraco-Dacian substrate disappeared with the adoption of Latin, it must be assumed that contact with closely related languages, perhaps varieties of Albanian, continued for several centuries. In addition to peasants in remote areas, who were not immediately affected by the Roman occupation and probably took longer to shift to Latin, a certain degree of population movement and mixture between Latinized and non-Latinized areas must be assumed, particularly due to the fact that semi-nomadic herdsmen roamed large areas of the Balkan Peninsula, thereby acting as a continuous source of contact with the Thraco-Dacian/Albanian language(s). In most cases, our lack of precise knowledge of these languages makes it impossible to determine whether Romanian words with cognate counterparts in Albanian were borrowed directly from the substrate languages, or from Albanian at a later stage (Rosetti 1938–1941 [1978: 223]). The nature of the contact between Latin and the local substrate language can be assumed to have proceeded along similar lines as in many other areas that were incorporated into the Roman Empire. After conquering and occupying the territory militarily, Roman administrative structures were implemented and former Roman soldiers from across the Empire were given land on which to settle. For the existing inhabitants of the area, who received the status of Roman citizens, it was useful or even necessary to learn and speak Latin to participate and be successful in this new society; there was little resistance to adopting Latin, as it had more prestige than the substrate language and was associated with wealth and progress. 
This resulted in a rapid language shift to Latin, despite the relatively brief period of Roman rule (106 CE to 271 CE); the number of lexical and morpho-syntactic elements retained from the substrate (i.e. borrowed into the regional variety of Latin) is comparatively


small,¹ despite some ongoing contact with languages closely related to the original substrate, such as Albanian. The relationship between the Latin superstrate and the local substrate languages was initially one defined by the political and cultural dominance of the Romans. After the Romans’ withdrawal from the province, however, it can be assumed that the contact situation gradually changed to one of cohabitation, in which speakers of early Romanian and speakers of Thraco-Dacian/Albanian lived in close vicinity of each other and communicated on a regular basis about everyday matters regarding their pastoral activity and the natural environment.

3.2. Contact with Slavic languages

3.2.1. The first contact with Slavic under the Avars

Between the 6th and the 8th century, the Avars occupied the area north of the Danube and ruled an area roughly coinciding with modern-day Transylvania. While their leaders were aristocrats of Turkic origin, the Avars were, in fact, a multiethnic group; the majority of the population that moved into Romanian-speaking territory were of Slavic extraction. As these Slavs did not belong to the ruling class under the Avars, their language is unlikely to have had superstrate status. The contact situation can be assumed to have been one of cohabitation and regular interaction between Romanians and Slavs, without a great degree of cultural dominance of either of the two.

3.2.2. Contact with South Slavic

There must have been intense contact with South Slavic well before the 9th century, during the “common Romanian period”, i.e. before Daco-Romanian, Aromanian and Megleno-Romanian separated. The evidence for this early contact with South Slavic is the presence of the same loanwords in all three languages and the fact that certain sound changes that had taken place in South Slavic by the end of the 9th century are not reflected in the corresponding Romanian loanwords (du Nay 1996: 100). After the influx of Slavs into the Balkan Peninsula, the contact situation was initially one of population mix of Romanian and South Slavic speakers, probably in approximately equal proportions. The large number of lexical items and morpho-syntactic structures shared by Romanian and Bulgarian/Macedonian to the present day indicates that there was a high degree of bilingualism in this mixed population in the entire area. Complementing this adstrate situation, the standardization and exclusive use of Old Church Slavonic for religious purposes from the 9th to the 17th century gave South Slavic the status of a cultural superstrate language, particularly in semantic fields related to religious beliefs and practices.

¹ A number of morpho-syntactic elements are shared by several languages belonging to the Balkan Sprachbund, some of which may be rooted in structures that were present in the common substrate language(s) (see §6).

3.2.3. Later contact with Slavic languages

In addition to the comparatively early South Slavic influence, there was more localized contact between Romanian and individual Slavic languages at later dates. There is evidence of contact between Romanian and Ukrainian that must have taken place after the 12th century (as shown by the fact that borrowing took place after a Ukrainian [h]>[g] change, dated around the 12th century, cf. Mihăilă 1973: 46) in the north, of contact with Serbian since the 15th century in the west, as well as continuing contact with Bulgarian in the south. These regionally limited contact situations were characterized by interaction, in most domains of everyday life, between the Slavic and the Romanian populations in the respective areas.

3.3. Contact with Greek

Contact between Romanian and (Byzantine) Greek was both direct and indirect. From before the Byzantine period until approximately the 10th century, Balkan Romance, as well as South Slavic, were spoken in an area that bordered on northern Greece; in the south of this area, the presence of a considerable Greek population led to a trilingual contact situation. Cohabitation and everyday interaction between all three population groups were common. Even after Daco-Romanian was physically separated from the Greek-speaking area, contact between Romanian and Greek continued, especially in the areas of trade and commerce. To what extent this contact was mediated by the Slavic population that separated the two cannot be precisely determined; the presence of numerous Greek loans in Romanian, Bulgarian and other Balkan languages shows that these words were widely borrowed throughout the region. From the 15th century onwards, contact between Greek and Romanian speakers, primarily through commercial activity, continued within the expanding Ottoman Empire.

3.4. Contact with Hungarian

From the late 9th century, the Magyars began to settle in areas north of the Carpathian Mountains; to the present day, a large number of ethnic Hungarians live in certain areas of Transylvania. In contrast to the contact situations described above, Romanians and Hungarians did not mix to the same extent, maintaining separate ethnic and linguistic identities. An important factor in this process was the fact that they tended to live in separate villages; as a result, contact was largely limited to trade and other occasional encounters. Even when Hungarian increasingly turned into a superstrate language after Transylvania became incorporated into the Austro-Hungarian Empire from the 18th century onwards, the Romanian population was largely excluded from official matters; contact with Hungarian thus remained comparatively limited.

3.5. Contact with German

Having been granted special privileges, German settlers founded towns and villages in Transylvania in the 12th and 13th centuries. In the 17th and 18th centuries, more German settlements were founded in the Banat area in the west of Romania. Contact with the Romanian population was similar to that between Romanians and Hungarians, as the German settlers preserved their separate cultural and linguistic identity. Contact was generally limited to commercial interaction.

A separate contact situation, beginning in the second half of the 19th century, arose due to an increasing orientation towards Western European culture and lifestyle. Though the primary cultural model was France, Germany also served as a model. The upper classes travelled to Germany and visited German universities, German literature was read in intellectual circles, and certain novel products and concepts came to Romania from or via Germany.

3.6. Contact with Turkish

3.6.1. Trade

Turkish traders settled along the Black Sea coast from the late 15th century onwards. As they formed a relatively small group within the population, the overall impact of this contact was largely limited to commercial transactions.

3.6.2. The Ottoman Empire

The Ottoman rule in the Balkans brought about a contact situation in which Turkish acquired considerable importance as a language used in the military and administrative domains. Whilst Romanian-speaking areas retained a certain degree of independence under Ottoman suzerainty between the 16th and 18th centuries, many Turkish products and cultural practices found their way into Romania. Furthermore, Romanians were recruited to fight in the Ottoman armies, which led to a high degree of contact in the military domain.

3.7. Contact with languages serving as a cultural model

As in many European cultures, Latin, and by extension Italian, was viewed as an educated linguistic model. In Romanian, this trend surfaces as early as the 17th century, exemplified by the borrowing of Italian popolo > Romanian popor ‘people, population’, as an alternative to neam, borrowed from Hungarian.

The cultural importance of Latin, Italian, to some extent German, but most of all French, especially during the second half of the 19th and the first half of the 20th century, led to a contact situation between Romanian and predominantly written French, Italian, Classical Latin etc. Whilst wealthier sections of the Romanian population did travel to France, Italy and Germany for the purpose of business, holiday and education, others merely had indirect contact, through literature and education. It is worth pointing out that the vast majority of the population had very little formal education and did not speak French; nevertheless, large numbers of French, Italian and “learned Latin” loanwords entered the language and permeated all sections of society.

4. Number and types of loanwords

4.1. Semantic fields and word classes

Of the words contained in the subdatabase for Romanian, about 42% are loanwords. Whilst this implies that the majority of the Romanian lexicon is inherited from Latin, it also shows that Romanian has incorporated an exceptionally large amount of lexical material from other languages. Table 1 shows the distribution of loanwords by donor language and semantic field.

Perhaps unsurprisingly, the largest proportion of loanwords is found in the semantic field Modern world, where 70.5% of the words are borrowed. About half of these loanwords come from French, either fully or partially. (For a discussion of the concept of “partial borrowing” from French, see §4.2 below.) Given the strong cultural orientation towards France in the 19th and 20th centuries (see §3.7), this intense borrowing from French can be linked, at least in part, to the fact that many of the new inventions and concepts belonging to this semantic field were introduced to Romanian speakers through France and the French language. This is also the background for the only borrowed item in the category Miscellaneous function words: a deveni ‘to become’, an alternative to the synonymous periphrastic a se face (literally ‘to make oneself’), was borrowed from French devenir as a term linked primarily to modern philosophy, which Romanians came into contact with through French.

On the other hand, loans from the Slavic languages are hardly relevant in the category Modern world, despite accounting for a significant proportion of the borrowed lexical stock in general. Items in this category, it may be assumed, were not normally introduced to Romanian speakers via the populations of neighboring countries during the period in which most of the Modern world meanings emerged and gained significance. In this semantic field, the largest number of loans from a Slavic language comes from Russian, largely due to Russia serving as a model in the area of administration. Thus poșta ‘post, mail’ and poliție ‘police’ are borrowed from Russian.

Table 1: Loanwords in Romanian by donor language and semantic field (percentages)

Semantic field                        Total loanwords   Non-loanwords
1 The physical world                       47.2              52.8
2 Kinship                                  23.1              76.9
3 Animals                                  42.9              57.1
4 The body                                 39.2              60.8
5 Food and drink                           30.9              69.1
6 Clothing and grooming                    62.6              37.4
7 The house                                56.4              43.6
8 Agriculture and vegetation               51.9              48.1
9 Basic actions and technology             44.6              55.4
10 Motion                                  35.1              64.9
11 Possession                              44.7              55.3
12 Spatial relations                       24.3              75.7
13 Quantity                                17.7              82.3
14 Time                                    25.3              74.7
15 Sense perception                        13.9              86.1
16 Emotions and values                     47.4              52.6
17 Cognition                               43.9              56.1
18 Speech and language                     45.1              54.9
19 Social and political relations          66.1              33.9
20 Warfare and hunting                     49.6              50.4
21 Law                                     44.3              55.7
22 Religion and belief                     59.1              40.9
23 Modern world                            70.5              29.5
24 Miscellaneous function words             6.5              93.5
all words                                  41.8              58.2

Donor languages, all words: French 13.7, Slavic 8.4, Bulgarian 3.9, Latin 3.2, Italian 1.9, Turkish 1.9, Greek 1.7, Hungarian 1.6, German 1.6, Serbian 1.5, Albanian 1.0, Ukrainian/Russian 1.1, Others 0.3.

[Donor-language percentages for the individual semantic fields are not reproduced here.]

In the category Religion and belief, there is also a large proportion of loanwords, almost 60%. Half of these are borrowed from early South Slavic, Bulgarian or Greek, i.e. the languages through which Romania was in touch with the Orthodox Church. The use of Old Church Slavonic as the language of religion for many centuries explains the large number of Slavic loans; it is likely that words belonging to other semantic fields also entered the language via this religious use of Slavic, for instance dragoste ‘love’ and prieten ‘friend’.

Another semantic field with an exceptionally large proportion of loans, 66.1%, is Social and political relations. This is not entirely unexpected, as new social and political structures are often influenced by, or imported from, populations with a different socio-political system when a contact situation arises. Even subtle differences between the old concepts and the newly imported or adapted ones are likely to be reflected lexically, as the affected speakers are acutely aware of the differences affecting their daily lives.

Around a third of the loanwords in the category Social and political relations come from Slavic languages, which had the most profound impact on Romanian society over the centuries. These words include rather fundamental concepts such as a porunci ‘to order, to command’ as well as stăpân ‘master’ and rob ‘slave’, which suggests that contact with the Slavs brought with it changes to this aspect of society; the influence of the use of Old Church Slavonic in religious contexts is also likely to have contributed to the adoption of these words. Another case showing how differences that came with a new administrative and political system are reflected by lexical replacement is graniță ‘border, frontier’, also borrowed from Slavic, which may be assumed to have come into use with a new type of frontier established under the influence of a Slavic administrative system.
An example of a distinction between similar meanings being made by means of adopting a loanword is the near-synonymous word pair for ‘village’, in which cătun, cognate with Albanian katund, is probably a loan from the pre-Latin substrate and refers to a hamlet without any formal administrative structure of its own, a type of settlement that may be assumed to have existed before Roman occupation. The inherited Romance word for ‘village’, sat, derived from Latin fossatum, on the other hand, refers to a somewhat larger village which, according to the meaning of fossatum, was originally typically fortified in some way. Furthermore, the much later borrowing of oraș ‘town, city’ from Hungarian város suggests that urbanization in the present-day sense came to Romanian primarily through contact with Hungarian speakers.

Apart from Miscellaneous function words, the semantic fields with the lowest percentage of loanwords are Sense perception (13.9%) and Quantity (17.7%). In both of these categories, the majority of borrowed items come from Slavic; the proportion of loans from Slavic in almost all semantic fields will be further discussed in §4.2 below.

All in all, it is significant that borrowing into Romanian has occurred across the entire lexicon; the average of close to 42% across all semantic fields is not distorted by exceptionally large numbers of loans in particular semantic areas. This is confirmed by the fact that the median percentage of loanwords across all categories is approximately 45%, indicating that a high proportion of borrowed items is found across most of the semantic fields distinguished in the loanword database.

Sorted by semantic word class, Romanian loanwords conform to the common pattern that nouns appear to be most easily borrowed, as shown by Table 2. Just over 50% of the nouns in the subdatabase are loans, whilst verbs and adjectives have an almost equal loanword quota of 32%. As adjectives and adverbs are generally not morphologically distinguished in Romanian, the proportion of borrowed adverbs must be assumed to be approximately equivalent to that of borrowed adjectives.
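The comparison of mean and median made above for the semantic fields can be re-checked with a few lines of Python. This is only a sketch: the list is read off the "Total loanwords" column of Table 1 (fields 1 to 24, in that order), and the variable names are illustrative rather than part of the loanword database.

```python
from statistics import mean, median

# Total loanword percentages for semantic fields 1-24, as read from Table 1
# (an assumed transcription; the order follows the field numbering).
total_loanwords = [
    47.2, 23.1, 42.9, 39.2, 30.9, 62.6, 56.4, 51.9,
    44.6, 35.1, 44.7, 24.3, 17.7, 25.3, 13.9, 47.4,
    43.9, 45.1, 66.1, 49.6, 44.3, 59.1, 70.5, 6.5,
]

field_mean = mean(total_loanwords)      # unweighted mean over the 24 fields
field_median = median(total_loanwords)  # midpoint of the sorted values

print(f"mean:   {field_mean:.1f}%")
print(f"median: {field_median:.2f}%")
```

With these values the unweighted mean comes out at roughly 41% and the median at roughly 44 to 45%. The unweighted mean need not coincide exactly with the database-wide figure of 41.8%, since the fields contain different numbers of words.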

Table 2: Loanwords in Romanian by donor language and semantic word class (percentages)

Word class          Total loanwords   Non-loanwords
Nouns                    50.2              49.8
Verbs                    32.1              67.9
Adjectives               32.0              68.0
Adverbs                  20.0              80.0
Function words            5.9              94.1
all words                41.8              58.2

Donor languages, all words: French 13.7, Slavic 8.4, Bulgarian 3.9, Latin 3.2, Italian 1.9, Turkish 1.9, Greek 1.7, Hungarian 1.6, German 1.6, Serbian 1.5, Albanian 1.0, Ukrainian/Russian 1.1, Others 0.3.

[Donor-language percentages for the individual word classes are not reproduced here.]

4.2. “Educated” loanwords

The largest number of loanwords comes from French. About 12% of the Romanian words in the database are unambiguously borrowed from French, and this number rises to about 16% if all loanwords that have French as a partial or possible source are added. A word can be considered to be “partially borrowed” from French if its form in Romanian does not allow us to determine unambiguously whether it is borrowed from French or from one of the other languages that served as cultural models at the same time. An example of a word borrowed from French and Latin is Romanian conspira!ie ‘plot, conspiracy’, which is morphologically integrated in such a way that it is not evident whether it is borrowed from Latin conspiratio or French conspiration; both of these would change to conspira!ie according to the normal rules of loanword integration. In such cases, it is quite possible that the Romanians who had knowledge of both the possible source languages began to use the word in Romanian without consciously deciding from which of the two languages they had borrowed the word. Similarly, Romanian banc" ‘bank’ might be borrowed from Italian banca and French banque, and Romanian ren ‘reindeer’ might be borrowed from German Ren and French renne. If we further add “learned Latin” and Italian loanwords to the French ones, then the total proportion of items borrowed from languages serving as cultural models is

242

Kim Schulte

about 20%, i.e. one in five.2 Such a high proportion of “educated loans” requires some explanation. Whilst it is normal for a speech community to borrow terms for newly introduced objects or concepts, Romanian has gone far beyond this level, borrowing heavily from French to create synonyms of words already present in the language. Thus Romanian surs" ‘spring’ from French source is synonymous with izvor, borrowed from Slavic izvor$, and with fântân" (now archaic in this meaning), the word inherited from Latin. Similarly, Romanian litoral ‘coast’ has been borrowed from French littoral despite the existence of the synonyms coast" and !ârm, inherited from Latin costa and termen, as well as Romanian mal from Albanian/substrate malj. In such cases, it is relevant to determine whether the newly borrowed item is merely a marginal, rarely used or stylistically restricted alternative, or whether it is frequently used and in real competition with the synonyms that it is predated by. However, the reality is not always quite as black and white. In the case of ‘coast’, for example, mal is used 50% more than litoral, but litoral still occurs twice as frequently as coast"; !ârm is only found in just over 1% of all cases. This means that the most recently added synonym, the loanword from French, has established itself as a serious competitor. In some cases, two cognate words have been borrowed into Romanian from more than one of these languages. For example, jaluzie ‘jealousy’, borrowed from French jalousie, has been borrowed in addition to the more commonly used gelozie, from Italian gelosia; there is also a synonym borrowed from Greek, zulie. All of these are winning the competition with the inherited terms temere or temut, which covers a somewhat wider emotional area including ‘fear’ as well as ‘jealousy’. 
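The frequency relations reported above for the ‘coast’ synonyms can be turned into approximate usage shares. The sketch below assumes that the four words exhaust the cases counted and that țărm accounts for exactly 1% (the text says “just over 1%”); both are simplifying assumptions, and the function name is hypothetical.

```python
def coast_shares(tarm_share=0.01):
    """Approximate usage shares implied by the stated ratios.

    Assumptions (not from the source): the four synonyms cover all cases,
    and tarm_share is the fraction taken by țărm.
    Ratios from the text: litoral = 2 x coastă, mal = 1.5 x litoral.
    """
    remaining = 1.0 - tarm_share
    # coastă : litoral : mal = 1 : 2 : 3, i.e. six equal parts in total
    unit = remaining / 6
    return {
        "coastă": unit,
        "litoral": 2 * unit,
        "mal": 3 * unit,
        "țărm": tarm_share,
    }

shares = coast_shares()
for word, share in shares.items():
    print(f"{word}: {share:.1%}")
```

On these assumptions litoral accounts for roughly a third of all cases and mal for roughly half, which is consistent with the description of the French loan as a serious competitor.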
Loans from French, Latin, Italian, and to some extent from German are found in all semantic fields covered by the subdatabase; however, there are considerable differences in the proportion of words from this source. Whilst both the mean and the median of the percentage of loans in the different semantic fields are just above 20%, indicating that there is no significant imbalance caused by individual semantic categories, the percentage of loanwords from these languages ranges from a mere 3.5% (Sense perception) to 58% (Modern world). For the latter category, it has already been mentioned above that most of the meanings it contains did not exist prior to their introduction via the cultures corresponding to the respective donor languages. A similar explanation can be given for the category Animals with 22% of (primarily French) educated loanwords, which contains a considerable number of animals that are not indigenous to the Romanian-speaking territory and for which the terms were therefore borrowed when Romanian speakers first became aware of their existence via French; examples are cămilă ‘camel’, elefant ‘elephant’ or cangur ‘kangaroo’.

With a total of 27% of loanwords from these source languages, the semantic field Clothing and grooming is also strongly affected. In this category, Romanian has borrowed from virtually all contact languages over the centuries, incorporating corresponding words for new types of clothing items introduced by or through the respective population or culture. A considerable number of words from French and Italian have been added as (near) synonyms or hyponyms to existing ones, reflecting an orientation towards emulating western European fashion, including the use of the corresponding vocabulary. For instance, beretă and bască have been borrowed from French to denote specific types of ‘cap’, complementing the existing șapcă (borrowed from Bulgarian šapka) and căciulă (cognate with Albanian kësulë), both of which also refer to caps. Similarly, gheată ‘boot’ was borrowed from Italian ghetta, joining the default word for ‘boot’, cismă (borrowed from Hungarian csizma) and ciubotă (borrowed from Ukrainian čoboty).

Similarly, in the areas Cognition and Emotions and values, with 26% and 29% of words borrowed from languages serving as cultural models, many of the loans are synonyms of existing words, incorporated into Romanian due to the fashion and prestige associated with the use of “educated loans”. Terms from these languages associated with the areas Law and Warfare and hunting (35% and 27% respectively), on the other hand, were borrowed together with the new objects and concepts they denoted, due to fundamental changes to the legal system in post-Ottoman Romania and significant advances in military technology and strategy, respectively.

At the other end of the scale, the semantic fields Sense perception (3%), Quantity (6%), The house (7%), Food and drink (7%), and Kinship (8%) have incorporated far fewer loans from languages representing cultural models during the past two centuries. Whilst Sense perception and Quantity appear to be generally more resistant to borrowing pressure, with only 13.9% and 17.7% of words in the respective categories identifiable as loans from any external source, the number of items borrowed from other source languages in the fields Kinship and Food and drink indicates that these semantic categories were specifically more resistant to borrowing from western European languages serving as cultural models.

² Some loanwords from German could also be included in this class, but it is not always clear whether a word was borrowed from German as a language representing a cultural model or through contact with the German minority in Romania.

4.3. Loanwords from Slavic

The second most significant source of loanwords in Romanian are the Slavic languages, due to prolonged, close contact between Romanians and the Slavic peoples that moved into the area. Within the subdatabase, 8.4% of all words are borrowed from (South) Slavic with no particular regional provenance, 5.4% from Bulgarian and/or Serbian, 0.7% from Ukrainian, and a few isolated items from Russian and Polish. The total percentage of loanwords from Slavic sources is 14.6%, approximately one seventh of the words in the database. It has already been mentioned above that a considerable proportion of Slavic loans entered Romanian through the use of Old Church Slavonic as the language of religion, particularly in the semantic field Religion and belief (25%), but also in the category Social and political relations. It is not always easy to decide whether a loanword was borrowed from Slavic primarily via the religious domain or due to the general prolonged and close contact between the two languages due to cohabitation

244

Kim Schulte

during several centuries. In the category Emotions and values, for instance, 17% are borrowed from South Slavic including Bulgarian; it is likely that the borrowing of words such as mil" ‘pity’ from Slavic mil$ is due to its use in everyday conversation as well as specifically religious contexts. Other semantic fields that show a significant number of loans from Slavic are Speech and language (17%), Basic actions and technology (14%), Time (12%), The physical world (12%), Possession (12%), Motion (11%), The house (10%), and Warfare and hunting (10%). Loanwords from Slavic have, in many cases, replaced inherited words even where their meanings have been continually present since Roman times; for instance, nisip ‘sand’ is borrowed from Bulgarian nasip, z"pad" and om"t (both ‘snow’) from Slavic zapad$ and omet$, all but replacing the respective inherited synonyms arin" and nea, which are nowadays restricted to regional and poetic use. In a similar way, izvor ‘spring’ from Slavic izvor$ has ousted the inherited fântân" in this meaning; under the influence of Italian, French and German, fântân" has nowadays shifted its meaning to ‘fountain’. In other cases, loans from Slavic have filled lexical gaps; the most visible example is da ‘yes’, a notion that could not be rendered by any single word in Latin. In numerous instances, however, Slavic loanwords co-exist with synonymous inherited words; in many cases, there is little or no discernible difference in meaning or in usage frequency. Borrowing of an exact synonym from Slavic can eventually lead to semantic differentiation; inherited timp and borrowed vreme (from Slavic vr'men), synonyms referring to both ‘time’ and ‘weather’, for instance, show an incipient semantic split, with vreme increasingly becoming the more common choice for ‘weather’. On the other hand, a number of Slavic loanwords have fallen victim to a strong th re-latinisation process since the 19 century. 
Thus, the Slavic loanword cern ‘black’ has disappeared from modern usage, ousted by the inherited synonym negru. th A different development can be observed with the pair german (a 19 century “learned” loan from Latin) and neam! from Slavic n'mici (both ‘German’), which are used virtually synonymously in everyday conversation, though some speakers feel it is inappropriate to call a German neam! to his face, even though it is not generally perceived as a disrespectful term. This example shows that synonym pairs created by borrowing can come to contain complex and unpredictable semantic and sociopragmatic nuances that go beyond their lexical meaning. In antonym pairs with one element borrowed from Slavic, there is an intriguing tendency for the Slavic word to be the one with more positive connotations. Examples are ‘to love’ vs. ‘to hate’ (a iubi from Slavic ljubiti vs. inherited a urî), ‘friend’ vs. ‘enemy’ (prieten borrowed from Slavic prijatel vs. du#man borrowed from Turkish dü#man), and ‘yes’ vs. ‘no’ (da borrowed from Slavic vs. inherited nu).

4.4. Loanwords from Turkish

About 2% of the words in the subdatabase for Romanian are borrowed from Turkish. Whilst these loanwords account for a far smaller proportion of Romanian vocabulary than those discussed in the previous sections, they still make up a considerable part of the lexical stock. It is perhaps noteworthy that not a single Romanian verb in the database is borrowed from Turkish.

By far the largest impact can be observed in the lexical field The house, with 10% of the vocabulary in this category borrowed from Turkish. The majority of these loanwords can be attributed to innovations and improvements in construction, furniture, tools etc., as well as fashions, introduced from the Ottoman Empire. Some examples are chirpici ‘adobe’ from Turkish kerpiç, chioșc ‘garden house, kiosk’ from Turkish köşk, geam ‘window’ from Turkish cam, sobă ‘stove’ from Turkish soba, hogeag ‘chimney’ from Turkish ocak. Many of these words were widely borrowed throughout the Balkan areas of the Ottoman Empire and have cognate loanforms in neighboring languages; Turkish cam, for example, has also been borrowed into Greek as τζάμι (dzámi), into Albanian as xham [dʒam] and into Bulgarian as džam.

Other semantic fields with a relatively large proportion of loans from Turkish are Agriculture and vegetation (4.7%), Food and drink (4.5%), and Clothing and grooming (4.5%). In a way similar to the previous examples of words borrowed from Turkish, these loans are typically linked to the introduction of the corresponding objects as a result of Turkish influence through the close political and commercial links with the Ottoman Empire.

In some cases, Turkish loanwords are synonymous with inherited items, with no discernible semantic difference; one of the more visible examples, due to its central place in Romanian rural culture, is cioban ‘shepherd’ from Turkish çoban, a synonym of the inherited word păstor. There are also some intriguing antonym pairs in which the Turkish loanword tends to be the element with more negative connotations. Examples are ‘clean’ vs. ‘dirty’ (inherited curat vs. murdar borrowed from Turkish) and ‘friend’ vs. ‘enemy’ (prieten borrowed from Slavic prijatel vs. dușman borrowed from Turkish düşman).

4.5. Loanwords from other source languages

Of the words in the database, 1.7% are borrowed from Greek, with the highest proportion in the semantic fields Clothing and grooming (6.1%), Modern world (5.8%), The house (5.7%), and Religion and belief (4.5%). Many of the loans from Greek entered the language in a way similar to the Turkish loans, due to contact and trade between Greeks and Romanians within the Ottoman area of influence; religious terminology can be attributed to the shared Orthodox Christian background.

A similar percentage, 1.6%, is borrowed from Hungarian. The largest number of loans, in the category Social and political relations (6.5%), can be attributed to the fact that Transylvania, a large section of the Romanian-speaking territory, was under Hungarian influence or rule between the 11th and the 20th century, and as a result, social relations were influenced by Hungarian concepts. Even relatively fundamental social concepts such as gazdă ‘host’ from Hungarian gazda and a se întîlni ‘to meet’ from Hungarian találni were incorporated into the common Romanian vocabulary and are not limited to the territories that were under Hungarian influence. Other lexical fields with a relatively high proportion of Hungarian loanwords are Clothing and grooming (4.5%), Speech and language (4.5%), and The house (4.3%). In all these semantic categories, new words were introduced together with the corresponding culturally specific objects and concepts from Hungarian. Hungarian loanwords also fill genuine lexical gaps; in Romanian, there is normally no distinction made between ‘leg’ and ‘foot’, both of them rendered by picior. To make specific reference to ‘foot’ without the leg, the word labă, borrowed from Hungarian láb, is used.

Loanwords from Albanian or closely related pre-Latin substrate languages account for only 1% of the vocabulary in the database. Only a limited number of semantic fields contain loans from this source: The physical world (4.8%), Kinship (3.2%), Agriculture and vegetation (2.8%), Animals (2.7%), Social and political relations (1.6%), The house (1.4%), The body (1.2%), and Clothing and grooming (1.1%). Virtually all loanwords from Albanian/substrate fall into the areas of family, farming, and basic living. The fact that these words survived the process of language shift to Latin indicates that terms and concepts belonging to these areas of life were so deeply rooted in the culture that they continued to be used during and even after the shift to Latin.

5. Integration of loanwords

5.1. Phonological integration

Generally, the majority of loanwords are not subject to a great deal of phonological change, largely due to the relative large phonological inventory and tolerant phonotactics of Romanian. It is likely that this phonological tolerance is itself, at least in part, due to the continual influx of borrowed words from various source languages. A number of phonological features of Romanian may have emerged as a result of contact and large-scale borrowing; the central vowel /!/, for instance, may have entered Romanian from Slavic (Hall 1974: 73), though this claim is disputed by Petrucci (1999: 60–69). Certain voiced word-initial consonant clusters such as /zdr-/ are also likely to have developed due to borrowing from Slavic, as they regularly occur in loanwords from Slavic beginning with /s*dr-/.

8. Loanwords in Romanian

Romanian a zdrobi zdrav"n zdrean!"

‘to annihilate’ ‘strong, healthy’ ‘rag’

247

Slavic s$drobiti s$dravin$ s$dran$

The phonological development leading to the creation of this cluster is not limited to Romanian; cognate forms such as Serbian zdravo ‘healthy’ show that this is a more widespread, regional process. However, the occurrence of the same initial consonant cluster in Romanian words with unknown origin may be an extension of this phonological sequence beyond the originally borrowed items, indicating that the cluster has become a fully integrated part of the language. a zdruncina ‘to shake’ a zdr"ng"ni ‘to tinkle, jingle’

(source unknown) (source unknown)

Some vowel distinctions in the source languages that do not exist in Romanian have led to a change in vowel quality (and quantity where applicable) in the borrowed Romanian word. For instance, in oraș ‘town’ from Hungarian város [vaːroʃ], the quantity distinction is not preserved,³ and in dușman ‘enemy’ from Turkish düşman, the Turkish close front rounded vowel /y/ is replaced by a close back rounded vowel /u/. Similarly, the close-mid front rounded vowel [ø] has been replaced by the close-mid back rounded vowel [o] in chioșc ‘garden house’ from Turkish köşk, but a trace of the original vowel is preserved in the fact that the preceding /k/ is followed by a palatal glide, which is a regular development affecting the sequence /k/ + close-mid front vowel. On the other hand, the more recently borrowed foen [fœn] ‘hairdryer’ (from German Föhn) preserves the original vowel almost unchanged, thereby effectively adding a new vowel phoneme to the Romanian inventory.

Some phonological patterns are perceived to be typical of loanwords from a particular donor language. An example of this is the class of nouns ending in -ea when indefinite and in -eaua when definite, which are perceived by most Romanians to be of Turkish origin. In many cases, this is accurate.

perdea, perdeaua         ‘curtain’               < Turkish perde
merdenea, merdeneaua     ‘filled puff pastry’    < Turkish merdan

However, this pattern is, in fact, indigenous, as shown by the inherited word for ‘star’.

stea, steaua             ‘star’                  < Latin stella

³ Whether the metathesis of the two vowels is linked to the original difference in quantity is unclear.
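The two vowel substitutions described above for Turkish loans can be written out as a toy rewrite rule. This is a minimal sketch, not a model of Romanian loanword phonology: the simplified phonemic strings and the function name are assumptions made for illustration, and only the two developments mentioned in the text are covered.

```python
def adapt_vowels(ipa: str) -> str:
    """Toy model of the two vowel substitutions described in the text.

    Input and output are simplified phonemic strings (an assumption of
    this sketch, not transcriptions taken from the source).
    """
    # /k/ + close-mid front rounded vowel: palatal glide plus back vowel
    ipa = ipa.replace("kø", "kjo")
    # remaining front rounded vowels become their back counterparts
    ipa = ipa.replace("ø", "o").replace("y", "u")
    return ipa

print(adapt_vowels("dyʃman"))  # Turkish düşman, cf. Romanian dușman
print(adapt_vowels("køʃk"))    # Turkish köşk, cf. Romanian chioșc
```

The rule for /k/ is applied first so that the palatal glide, the only trace of the original front vowel, is inserted before the vowel itself is backed.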

5.2. Morphological integration

Almost all loanwords are morphologically fully integrated; most verbs borrowed from Slavic are incorporated into the conjugation of infinitives ending in /-i/, due to the similarity to the original ending in /-iti/. Verbs from other Romance languages and learned Latin borrowings are generally integrated into the conjugations corresponding to those the respective verbs belong to in their source languages. However, a set of borrowed verbs ending in /-rî/ in Romanian, most of them of Slavic origin, have developed a modified paradigm that can be analyzed as a newly created conjugation (Schulte 2005).

Nouns and adjectives are also generally fully integrated, following one of the various declension patterns available. The original gender of nouns is generally retained, and inanimate objects that do not have a gender in their source language usually receive ‘neuter’ gender and morphology, which implies masculine agreement in the singular and feminine agreement in the plural. Examples are gref ‘grapefruit’ from English grapefruit and pix ‘ballpoint pen’, probably borrowed via English from the brand name Bic, with the respective plural forms grefuri/grefe and pixuri requiring feminine agreement. The reanalysis of borrowed plurals ending in /-s/ as singular forms and the subsequent addition of the plural morpheme /-uri/ is commonly found, especially with loans from English, as in cips (SG) from English (potato) chips, with the plural cipsuri.

As will be briefly discussed in §6, Romanian has borrowed a large number of South Slavic affixes, many of which appear on loanwords from South Slavic but have become productive affixes in Romanian. As a result, the presence of originally Slavic derivational morphology on loanwords does not set them apart from other Romanian words.

6. Grammatical borrowing

As a member of the well-known Balkan convergence area, Romanian has developed a considerable number of morphological and syntactic structures in parallel with the other member languages of the Balkan Sprachbund. For some of the features shared among these languages, it is impossible to determine in which direction they were borrowed, or whether they are simply the result of joint development. These include postposition/suffixation of the definite article, a case system in which the genitive and the dative have merged, an analytic comparative, the formation of the numerals eleven to nineteen with a preposition meaning ‘on’ or ‘over’, following the pattern ‘one-over-ten, two-over-ten’ etc., formation of an analytic future tense with a verb originally meaning ‘to want’, and the use of an “empty imperative verb” (Greek áide, Turkish haydi, Bulgarian/Romanian haide), used to encourage the addressee to go ahead with an unspecified but pragmatically obvious action; in Bulgarian and Romanian, haide can receive morphological person inflection (Romanian haidem (1PL) and haideți (2PL)).


For other features shared by the Balkan languages, a specific source language can be pinpointed. For instance, the widespread tendency to use finite subordinate clauses instead of infinitival clauses even in cases of subject coreference is generally attributed to the merger of the infinitive and subjunctive in Greek, subsequently spreading across the Balkan Peninsula.

Other features of Romanian morpho-syntax can be attributed directly to specific source languages. The use of an obligatory attribute agreement marker is a feature of Albanian, and presumably of pre-Latin substrate languages in Romania, that has been borrowed into Romanian, albeit with greater restrictions on its usage. Among the morphological features clearly borrowed from South Slavic are the use of a vocative in /-o/ for feminine nouns ending in /-a/ in their nominative form, the loss of the final syllable of the infinitive, and a large set of derivational morphemes directly borrowed from South Slavic (du Nay 1996: 102–108).

Finally, in more recent times, an increased use of the infinitive since the second half of the 19th century may, at least in part, be attributable to the strong influence of French (Close 1974: 227). However, this is not a complete innovation; the French model is likely to have acted as a reinforcement of an incipient resurgence of the infinitive in certain constructions, especially in prepositional adverbial clauses (Schulte 2007: 308–316).

7. Conclusion

Having borrowed from a considerable number of languages over the centuries, Romanian can serve as an example of a language with a high degree of lexical permeability. Borrowing has taken place in a number of very distinct types of contact situations, ranging from cohabitation and population mix at one end of the scale (e.g. South Slavic) to predominantly indirect contact at the other (e.g. French). Despite the fundamentally different nature of these contact situations, both have provided a large amount of morphologically and phonologically fully integrated lexical material. Whilst the differences between contact situation types affect the number of loanwords from particular languages in certain semantic categories, borrowed items are found in all areas of the lexicon.

Loanwords are not only used for objects and concepts for which there was no indigenous word, but have also been introduced as synonyms for existing words; in some cases this has led to the creation of multiple synonyms from a number of different donor languages. Borrowed synonyms often coexist with little or no discernible meaning difference, though their availability is sometimes exploited to make subtle semantic or pragmatic distinctions. Some loanwords are typically associated with a particular register, but generally even words from high-status source languages are used in colloquial speech, which shows that they have been fully integrated into the language.

The continuous addition of lexical material from various source languages over the centuries means that a large proportion of the vocabulary of present-day Romanian is not inherited from Latin; in some semantic areas, loanwords far outnumber inherited ones. Even relatively basic words denoting continually present meanings, such as features of the natural environment, are frequently borrowed. Whilst it might therefore be argued that Romanian is a language with a hybrid vocabulary, the large number of words borrowed from other Romance languages over the last two centuries nevertheless gives its lexicon a distinctly Romance appearance.

References

Cihac, Alexandru. 1879. Dictionnaire d’étymologie daco-romane. Vol. 2: Éléments slaves, magyars, turcs, grecs-moderne et albanais. Frankfurt (Main): St. Goar.

Ciorănescu, Alexandru [Alejandro]. 1966. Diccionario etimológico rumano. La Laguna: Biblioteca Filológica & Madrid: Gredos.

Close, Elizabeth. 1974. The development of modern Rumanian. Oxford: Oxford University Press.

Conev, Benju. 1921. Ezikovni vzaimnosti meždu bălgari i rumăni. Sofia: Sofia University Press.

Corominas, Joan. 1961. Breve diccionario etimológico de la lengua castellana. Madrid: Gredos.

Coteanu, Ion et al. 1998. Dicționarul explicativ al limbii române. 2nd edn. Bucharest: Univers Enciclopedic/Academia Română – Institutul de Lingvistică ‘Iorgu Iordan’.

Diculescu, Constantin. 1924–26. Elementele vechi grecești din limba romînă. Dacoromania IV:394–516.

du Nay, André. 1996. The Origins of the Rumanians: The early history of the Rumanian language. Toronto/Buffalo: Matthias Corvinus Publishing.

Fiedler, Wilfried & Klosi, Ardian. 1997. Wörterbuch Deutsch-Albanisch. Berlin/München: Langenscheidt.

Grebe, Paul et al. 1963. Duden Etymologie: Herkunftswörterbuch der deutschen Sprache [Duden Etymology: Etymological dictionary of the German language]. Mannheim: Bibliographisches Institut (Dudenverlag).

Grujić, Branislav. 1998. Džepni rečnik nemačko-srpski. Belgrade: Janus.

Hall Jr., Robert A. 1974. External History of the Romance Languages. New York: Elsevier.

Hașdeu, Bogdan Petriceu. 1877. Columna lui Traian VIII (1877).

Hașdeu, Bogdan Petriceu. 1879. Cuvinte den bătrîni. Vol. 1. Bucharest.

Hașdeu, Bogdan Petriceu. 1883. Columna lui Traian XIV (1883).

Isbășescu, Mihai. 1995. Dicționar german-român: 60.000 de cuvinte. Bucharest: Teora.

Levițchi, Leon & Bantaș, Andrei. 1992. Dicționar englez-român. 70.000 de cuvinte. Bucharest: Teora.

McClure, Erica F. 1976. Ethnoanatomy in a multilingual community: An analysis of semantic change. American Ethnologist 3(3):525–542.

Mihăilă, Gheorghe. 1973. Studii de lexicologie și istorie a lingvisticii românești. Bucharest: Editura Didactică și Pedagogică.

Miklosich, Franz. 1860. Die slavischen Elemente im Rumunischen [The Slavic elements in Romanian]. (Denkschriften der Kaiserlichen Akademie der Wissenschaften XII). Vienna: Kaiserliche Akademie der Wissenschaften.

Miklosich, Franz. 1862–65. Lexicon palaeoslovenico-greco-latinum. Vienna: Braumüller.

Murnu, George. 1894. Studiu asupra elementului grec ante-fanariot în limba română. Bucharest: Tipografia Curții Regale, F. Göbl Fii.

Newmark, Leonard. 1998. Albanian-English Dictionary. Oxford: Oxford University Press.

Petrucci, Peter R. 1999. Slavic Features in the History of Rumanian. Munich: LINCOM Europa.

Philippide, Alexandru. 1894. Istoria limbii romîne. Vol. 1: Principii de istoria limbii. Iași: Tipografia Națională.

Pușcariu, Sextil. 1997 [1943]. Die rumänische Sprache [The Romanian language]. Bucharest: Grai și Suflet, Cultura Națională.

Rosetti, Alexandru. 1978 [1938–41]. Istoria Limbii Române. Bucharest: Editura Științifică și Enciclopedică.

Savin, Emilia & Lăzărescu, Ioan & Țânțu, Katharina. 1997. Dicționar german–român [German-Romanian dictionary]. Bucharest: Editura Științifică.

Schulte, Kim. 2005. Vowel Centralization in Romanian Verbs of Slavic Origin: Deliberate exploitation of an indigenous sound change. In Jacobs, Haike & Geerts, Twan (eds.), Romance Languages and Linguistic Theory 2003, 378–94. Amsterdam: Benjamins.

Schulte, Kim. 2007. Prepositional infinitives in Romance: A usage-based approach to syntactic change. Bern/Oxford: Peter Lang.

Seche, Luiza & Seche, Mircea. 1997. Dicționar de sinonime al limbii Române. Bucharest: Univers Enciclopedic.

Steuerwald, Karl. 1972. Türkisch-deutsches Lexikon [Turkish-German dictionary]. Wiesbaden: Harrassowitz.

Wendt, Heinz Friedrich. 1960. Die türkischen Elemente im Rumänischen [The Turkish elements in Romanian]. (Berliner byzantinistische Arbeiten 12). Berlin: Akademie-Verlag.

Loanword Appendix

Thraco-Dacian or Albanian

măgură

mountain, hill

mal

shore

pârâu

river, stream

mlac"

swamp

abur

fog, steam

scrum

ash

copil

child

mo#

grandfather, old man

moa#"

grandmother, old woman

!ap

om"t

snow

brici

razor

vreme

weather, time

colib"

hut

jar

embers

ograd"

yard, court

j"ratic

embers

stâlp

fl"c"u

young man

doorpost, post, pole

nevast"

wife

z"vor

maic"

mother

latch, doorbolt

bab"

grandmother, old woman

grind"

beam

bârn"

beam camp

odrasle

descendants

tab"r"

izlaz

pasture

hârle!

spade

coco#

cock, rooster

lopat"

shovel

he-goat

scoic"

shell

a s"di

to sow

mânz

foal, colt

c"mil"

camel

a cosi

to mow

c"pu#"

tick

prepeli!"

quail

coas"

sickle, scythe

#opârl"

lizard

veveri!"

squirrel

ov"z

oats

cioc

beak

bivol

buffalo

orez

rice

ceaf"

nape of neck

trup

body

a munci

work

groap"

grave

obraz

face, cheek

a încovoia

to bend

c"ciul"

hat, cap

a clipi

to blink

a lovi

vatr"

fireplace

glezn"

ankle

to strike, hit, beat, to kick

gard

fence

pizd"

vagina, vulva

a (d)oborî

to cut down

copac

tree

a se trezi

to wake up

topor

axe, ax

brad

fir

a omorî

to kill

tesl"

adze

c"tun

village

stârv

carcass

a târî

to pull

ran"

wound, sore

a tescui

to squeeze

leac

medicine

a cl"di

to build

otrav"

poison

a zidi

to build

clei

glue

covaci

blacksmith

nicoval"

anvil

cositor

tin, tinplate

sticl"

glass, bottle

plas"

netbag

dalt"

chisel

a (r")suci

to turn, to twist

a t"v"li

to roll

a cl"ti(na)

to shake

a stropi

to splash

(South) Slavic

praf

dust

pr"pastie

cliff, precipice

gol

naked, empty

ostrov

island

a n"bu#i

to choke

a pr"ji

to roast, fry

val

wave

izvor

spring, well

cle#te

tongs

smârc

swamp

m"slin"

olive

mla#tin"

swamp

ulei

oil

stânc"

stone, rock

pâsl"

felt

vijelie

storm

mantie

cloak

v"zduh

air

podoab"

pâcl"

fog

ornament, adornment

z"pad"

snow

perie

brush


a se târî

to crawl

milostenie

pity

pra#tie

sling

a se zgârci

to crouch

mândru

proud

suli!"

spear

a se chirci

to crouch

a (în)dr"zni

to dare

straj"

guard

a coborî

to go down

viteaz

brave

n"vod

fishnet

a pogorî

to go down

dârz

brave

vinovat

guilty

osie

axle

primejdie

danger

temni!"

prison

cârm"

rudder

a voi

to want

sfânt

holy

corabie

sail

vin"

fault

a propov"dui

to preach

a primi

to get

lacom

greedy

a blagoslovi

to bless

a r"ni

to injure

a ghici

to guess

a posti

to fast

a g"si

to find

tâmp(it)

stupid

iad

hell

bogat

rich

prost

stupid

gheen"

hell

a pl"ti

to pay

a preda

to teach

demon

demon

a tocmi

to hire

ucenic

pupil

idol

idol

târg

market

nevoie

need, necessity

vraj"

magic

scump

expensive

glas

voice

duh

ghost

vârf

top

a #op(o)ti

to whisper

pisc

top

a !ipa

to shriek

rând

line

a r"cni

to shriek

noroi

mud

to admit

mocirl"

mud, swamp sand

sut"

a hundred

a m"rturisi

Bulgarian

ceat"

crowd

a opri

to forbid

nisip

r"zle!

alone

a dojeni

to scold

deal

mountain, hill cave

pribeag

alone

a oc"rî

to scold

pe#ter"

vârst"

age

a se f"li

to boast

gârl"

river, stream whirlpool

iute

fast

a citi

to read

vârtej

a porni

to begin

trâmbi!"

horn, trumpet

jeg

embers descendants

a sfâr#i

to finish

toiag

walking stick

ml"de!e

a ispr"vi

to finish

st"pân

master

ma#teh

stepfather stepmother

a (se) opri

to cease

rob

slave

ma#teh"

gata

ready

slug"

servant

rude

relatives

to command, order

jivin"

animal

grajd

stable, stall

zori

dawn

a porunci

sâmb"t"

Saturday

a mirosi

to smell

prieten

friend

gâsc"

goose

a privi

to look

vr"jma#

enemy

bâtlan

heron

a pip"i

to feel

a pofti

to invite

liliac

bat

noroc

good luck

a poticni

to prevent

p"ianjen

spider

a iubi

to love

datin"

custom

pleoap"

eyelid

jale

grief

sfad"

quarrel

clon!

beak

jelanie

grief

curv"

prostitute

a n"du#i

to perspire

pity

r"zboi

war, battle

a z"misli

to beget

club

a tr"i

to be alive

mil"

bât"


bolnav

sick, ill

nad"

bait

#coal"

school

a se odihni

to rest

mecet

mosque

da

yes

ple#uv

bald

cutie

tin, can

hârtie

paper

blid

dish, bowl

grani!"

boundary

castron

dish, bowl

can"

cup

s"rman

orphan, poor

Serbian

cea#c"

cup

dobitoace

livestock

în"bu#i

to extinguish

smochin(")

fig

voinic

strong, healthy

suhat

pasture

piper

pepper

slab

weak

stup

beehive

haine

clothing, clothes

hran"

food

boal"

disease

ibric

kettle

trândav

lazy

blan"

fur

tigaie

pan

bumbac

cotton

a vopsi

to dye, to paint

mied

mead

pern"

pillow

r"zboi (de !esut)

loom

bolt"

arch

stejar

oak

duhan

tobacco

ciot

tree stump

zgomot

sound, noise

vâr#"

fish trap

Bulgarian or Serbian

#apc"

hat, cap

cârp"

handkerchief, rag

rochie

(woman’s) dress

cosi!"

plait, braid

hain"

coat

odaie

room

zid

wall

prag

door, gate

co#

c"min

fireplace

chimney, basket lamp, torch

raft

shelf

f"clie

grebl"

rake

copaie

trough

creang"

branch

gospodar

farmer

crac"

branch

ogor

field

mâzg"

sap

gr"din"

garden

ciuperc"

mushroom

brazd"

furrow

lan!

chain

a r"s"di

to plant

a jupui

to skin

coaj"

mat

tigv"

sanie plut"

Ukrainian

nămol

mud

buhai

bull

pajur"

eagle

manta

cloak

ciubot"

boot

iv"r

latch, doorbolt

bark

horn

chimney

gourd

lan

field

sledge, sled

ciocan

hammer

a bort(el)i

to bore

raft

a ciopli

to carve

covali

blacksmith

oar

drum

path

covor

rug

a p"stra

to keep

pod

bridge

!"ru#

peg

stânjen

fathom

s"rac

poor

bort"

hole

a (se) gr"bi

to hurry

leaf"

wages

hâd

ugly

a zâmbi

to smile

ieftin

cheap

slut

ugly

grij"

anxiety

col!

corner

obicei

custom

ceas

hour, clock

Polish

gâlceav"

quarrel

a ciupi

to pinch

pav"z"

sword

dasc"l

teacher

rogojin"

vâsl"

sabie

shield

Russian

ceainic

kettle

brag"

fermented drink

scop

intention

a se întâlni

to meet

martor

witness

pu#c"

gun

biseric"

church

sudalm"

oath

farmec

magic

hoit

corpse, carcass

mattress

batat

sweet potato

saltea

goarn"

horn, trumpet

cofeturi

candy, sweets

armie

army

zaharikale

candy, sweets

poli!ie

police

calendar

calendar

po#t"

post, mail

ceai

tea

Greek (all periods)


Tatar

arcan

lasso

Turkish

Hungarian

talaz

wave

ima#

pasture

chibrit

match

uliu

hawk

mangal

charcoal young woman

furtun"

storm

chip

face

duduie

!a!"

aunt

lab"

foot

cioban

herdsman

papagal

parrot

beteag

sick, ill

catâr

mule

cucuvea

owl

bete#ug

disease

calcan

stingray

carid"

prawns, shrimp

a t"m"dui

to cure

le#

corpse, carcass

guler

collar

chel

bald

back

cizm"

boot

farfurie

plate sausage

spate spat"

shoulderblade

bumb

button

mezel

pl"mân

lung

#irag

necklace

ciorb"

soup vegetables

splin"

spleen

a locui

to live

zarzavaturi

mitr"

womb

sob"(2)

room

ca#caval

cheese sock, stocking

strachin"

dish, bowl

lac"t

lock, padlock

ciorap

fasole

bean

hold"

field

giuvaier

jewel

pipe

colan

necklace

basma

handkerchief, rag

chio#c

garden-house

geam

window

du#umea

floor

sob"(1)

stove

hogeag

chimney

covata

trough

chirpici

adobe

cazma

spade

arman

threshingfloor

tutun

tobacco

lulea

pipe

zah"r

sugar

pip"

fust"

skirt

bard"

axe, ax

buzunar

pocket

fer"str"u

saw

scul"

jewel, tool

firiz

saw

prosop

towel

il"u

anvil

alifie

ointment

a se ciuc(ul)i

to crouch

s"pun

soap

mereu

always

cort

tent

!el

intention

crivat

bed

a b"nui

to suspect

c"r"mid"

brick

fel

manner

a arg"si

to tan

t"g"dui

to deny

a sosi

to arrive

f"g"dui

to promise

catarg

mast

dob"

drum

moned"

coin

ora#

town

zulie

envy, jealousy

neam

people

fric"

fear

gazd"

host


dovleak

pumpkin, squash

dulgher

carpenter

boia

paint

liman

port

buluc

crowd

murdar

dirty

du#man

enemy

musafir

guest

baltag

battle-axe

capcan"

trap

geamie

mosque

German

biber

beaver

doctor

physician

medicin"

medicine

bere

beer

stof"

cloth

sacou

coat

pantof

shoe

lamp"

lamp, torch

#an!

ditch

palm

palm tree

bambus

bamboo

#treang

rope

defect

broken

padel"

paddle

obiect

thing

vest

west

cvadrat

square

motiv

cause

turn

tower

spital

hospital

veceu

toilet

#urub

screw

plastic

plastic

!igar"

cigarette

!igaret"

cigarette

French and/or German

savan"

savanna

seism

earthquake

obscuritate

darkness

ren

reindeer, caribou

corp

body

flam"

flame

corpse, carcass

a incendia

to light

orient

east

persoan"

person

nord

north

masculin

male

linie

line

feminin

female

real

true

bebe

baby

idee

idea

divor!

divorce

religie

religion

tanti

aunt

paradis

heaven

descenden!i

descendants

fee

fairy, elf

animal

animal

radio

radio

animale

livestock

telefon

telephone

bovine

cattle

automobil

car

cormoran

cormorant

ma#in"

car, machine

oposum

opossum

tablet"

pill, tablet

branhie

gill

adres"

address

cochilie

shell

closet

toilet

rechin

shark

bomb"

bomb

balen"

whale

film

film, movie

elefant

elephant

insect"

insect

cadavru

French

miriapod

centipede

sol

soil

scorpion

scorpion

colin"

mountain, hill

crevet"

falez"

cliff, precipice

prawns, shrimp

abis

cliff, precipice

termit"

termites

continent

mainland

coiot

coyote

litoral

shore

elan

elk, moose

calm

calm

cangur

kangaroo

ocean

ocean

jaguar

jaguar

golf

bay

cameleon

chameleon

recif

reef

crocodil

crocodile

cap (2)

cape

aligator

crocodile

maree

tide

tapir

tapir

turbion

whirlpool

arter"

artery

surs"

spring, well

ven"

vein

cascad"

waterfall

spine

cataract"

waterfall

coloan" vertebral" figur"

face


mandibul"

jaw

putrid

rotten

ignam"

yam

maxilar

jaw

bol

bowl

manioc

lob

earlobe

sucup"

saucer

cassava, manioc

cerumen

earwax

dineu

supper

con

cone to bend

molar

molar tooth

sup"

soup

a curba

omoplat

shoulderblade

buchet

bunch

a boxa

to pound to rub

axil"

armpit

hidromel

mead

a fric!iona

mamel"

nipple, teat

fetru

felt

a presa

to press to build

ombilic

navel

pelerin"

cloak

a construi

stomac

stomach

poncho

poncho

a forja

to forge

skirt

argil"

clay rug

intestin

intestines, guts

jup"

viscere

intestines, guts

pantalon(i)

trousers

carpet"

talie

waist

#oset"

sock, stocking

mochet"

rug fan

ligament

sinew, tendon

beret"

hat, cap

evantai

tendon

sinew, tendon

basc"

hat, cap

a sculpta

to carve to carve

testicul

testicles

centura

belt

a grava

penis

penis

voal

veil

bumerang

boomerang to move

vagin

vagina

bijuterie

jewel

a deplasa

a respira

to breathe

colier

necklace

a plonja

to dive

diadem"

headband, headdress

derapa

to slide, slip

a dansa

to dance

to dribble

tatuaj

tattoo

#osea

road

a defeca

to shit

batist"

ax(")

axle

gravid"

pregnant

handkerchief, rag

canoe

canoe

pomad"

ointment

port

port

#o#on

snowshoe

a debarca

to land

pavilion

garden-house

a poseda

to own

cupol"

arch

a leza

to injure

hamac

hammock

avar

stingy

rigol"

ditch

a angaja

to hire

lasou

lasso

magazin

shop, store

recolt"

harvest

resturi

remains

cereal"

grain

a recolta

to gather

plant"

plant

a deta#a

to separate

a planta

to plant

a repartiza

to divide

foaie

leaf

bordur"

edge

sev"

sap

est

east

suc

sap

sud

south

palmier

palm tree

plat

flat

banan"

banana

glob

ball

banian

banyan

similar

similar

a transpira

to perspire

a voma

to vomit

a saliva

a deceda

to die

a sucomba

to die

a expira

to die

a înhuma

to bury

sepultur"

grave

viguros

strong

temperatur"

fever

maladie

disease

afec!iune

disease

leziune

wound, sore

echimoz"

bruise

contuziune

bruise

tumefac!ie

swelling

medicament

medicine

a se repauza

to rest

indolent

lazy


zero

zero

a ordona

to command, order

debut

beginning

torid

hot

amfitrion

host

#ans"

good luck

tradi!ie

custom

French and/or Latin

ghinion

bad luck

complot

plot

vapori

steam

a regreta

to regret, be sorry

armur"

armor

familie

family

casc"

helmet

castor

beaver

compasiune

pity

fort"rea!"

fortress

abdomen

belly

jaluzie

envy, jealousy

atac

attack

vulv"

vulva

curajos

brave

prizonier

to cultivate

faithful

captive, prisoner

a cultiva

fidel

sculptor

sculptor

veridic

true

gardian

guard

statuie

statue

fraud"

deceit

ambuscad"

ambush

a agita

to shake

corect

right (2)

a rata

to miss

a ruina

to destroy

just

right (2)

reclamant

plaintiff

impozit

tax

repro#

blame

a condamna

to condemn

salariu

wages

hidos

ugly

a achita

to acquit

convenabil

cheap

avid

greedy

amend"

fine

a separa

to separate

stupid

stupid

asasinat

murder

centru

middle

idiot

stupid

crim"

murder, crime

occident

west

to study

viol

rape

plan

flat

elev

pupil

moschee

mosque

sfer"

ball

profesor

teacher

magie

magic

imediat

immediately

secret

secret

fantom"

ghost

a dura

to last

a suspecta

to suspect

televiziune

television

a termina

to finish

facil

easy

biciclet"

bicycle

auror"

dawn

dificil

difficult

motociclet"

motorcycle

culoare

color

manier"

manner

autovehicul

car

furie

anger

discurs

speech

autobuz

bus

temerar

brave

a refuza

to refuse

tren

train

a imita

to imitate

stilou

pen

avion

airplane

no!iune

idea

poet

poet

baterie

battery

dement

mad

trompet"

horn, trumpet

a frâna

to brake

obscur

obscure

clan

clan

petrol

petroleum

clandestin

secret

#ef

chieftain

infirmier"

nurse

a explica

to explain

a guverna

to rule, govern

ministru

minister

inten!ie

intention

aristocrat

noble

timbru

postage stamp

cauz"

cause

servitor

servant

robinet

tap, faucet

a murmura

to mumble

to command, order

chiuvet"

sink

a prohibi

to forbid

toalet"

toilet

a anun!a

to announce

bomboane

candy, sweets

patrie

native country

a se instrui

a comanda

atelier

workshop

a deveni

to become

a invita

to invite

p"l"rie

hat, cap

conspira!ie

plot

nasture

button

conjura!ie

plot

camer"

armatur"

armor

staniu


ornament

ornament, adornment

room

ramificare

forked branch

tin, tinplate

a disloca

to move to lead, drive

a capitula

to surrender

strad"

road, street

a conduce

captiv

captive, prisoner

furchet

outrigger

a (e)libera

to let go

ancor"

anchor

profund

deep

captur"

booty

a restitui

to give back

a modifica

to change

tribunal

court

a salva

to rescue

suficient

enough

sentin!"

judgment

a distruge

to destroy

ultim

last

a acuza

to accuse

merchant

or"

hour

a inculpa

to accuse

pia!"

market

invidie

envy, jealousy

inocent

innocent

fine

end (2)

culp"

fault

adulter

adultery

gelozie

envy, jealousy

clar

clear

motor

motor

pericol

danger

tr"da

to betray

infrac!iune

crime

a spera

to hope

necesitate

need, necessity

elogiu

praise

voce

voice

a studia

to study

a nega

to deny

mod

manner

a promite

to promise

flaut

flute

sclav

slave

armat"

army

a elibera

to liberate

spear

a permite

to permit

spad"

sword

inimic

enemy

sperjur

perjury

victorie

victory

templu

temple

Italian and/or Latin

sacrificiu

sacrifice

fals

wrong

a adora

to worship

a respinge

to refuse

a predica

to preach

popor

people

infern

hell

amic

friend

circumciziune

circumcision

ochelari

spectacles, glasses

delict

crime

French and/or Italian

grotă

cave

a modela

to mould, mold

a naviga

to sail

nav"

ship

vapor

ship

doliu

grief

banc"

bank (financial institution)

muzic"

music

Italian

lagună

lagoon

mascul

male

acvil"

eagle

tucan

toucan

delfin

porpoise, dolphin

police febr"

comerciant

lance

Latin

asin

donkey

leu

lion

craniu

skull

thumb

scapul"

shoulderblade

fever

clavicul"

collarbone

cicatrice

scar

policar

thumb

medic

physician

uter

womb

nud

naked

a vomita

to vomit

gheat"

boot

a sufoca

to choke

English

pix

pen

Unknown origin

bordei

hut

ra!"

duck

Chapter 9

Loanwords in Selice Romani, an Indo-Aryan language of Slovakia*

Viktor Elšík

1. The language and its speakers

Romani is an Indo-Aryan (Indo-Iranian, Indo-European) language whose numerous and rather divergent dialects are spoken by several million “Gypsies” – Roma, Sinti, Mānuš, Kāle and other related groups – throughout Europe and elsewhere. The variety under description, Selice Romani, is a dialect of Romani spoken by ca. 1,350 Romani inhabitants of the multiethnic village of Selice (Hungarian Sókszelőce, Romani Šóka) in southwestern Slovakia. Selice Romani is part of a linguistic continuum of closely related Romani dialects spoken in southwestern and south-central Slovakia and in north-central Hungary, which together form the Northern subgroup of the South Central group of Romani dialects (cf. Boretzky 1999; Elšík et al. 1999).¹ The Northern South Central dialects are often referred to as Rumungro in Romani linguistics (e.g. Matras 2002) and I will also adopt this term here for its brevity. Although all Rumungro varieties have been influenced by Hungarian, most Rumungro speakers presently live in ethnically Slovak parts of Slovakia and are Slovak bilinguals, whereas an overwhelming majority of Rumungro communities in Hungary and in the Hungarian parts of Slovakia have undergone language shift to Hungarian (cf. Elšík 2003). Selice Romani is one of the few extant Rumungro varieties whose speakers are Hungarian bilinguals.

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Elšík, Viktor. 2009. Selice Romani vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1536 entries.

¹ Varieties of the other, Southern (or Vendic), subgroup of the South Central dialects of Romani are spoken in western Hungary, the Austrian Burgenland, and the Slovenian Prekmurje.

The genealogical affiliation of Selice Romani is shown in Figure 1.² While I will discuss loanwords into all ancestor varieties of present-day Selice Romani, commencing with Proto-Indo-European, the term Romani will only be applied, as is usual, to the part of the variety’s genealogical lineage that starts at “the point at which the language became sufficiently distinct from other related Indo-Aryan idioms to be classified as an entity in its own right” (Matras 2002: 18; emphasis mine). Early Romani is the undocumented, but partly reconstructed, common ancestor of all present-day Romani dialects, which was spoken prior to the dispersion of Romani-speaking groups throughout Europe and the consequent split into dialects (cf. Elšík & Matras 2006: 68–84). Proto-Romani (or *ḍommānī, cf. Tálos 1999) then covers the pre-Early Romani stages of Romani (but cf. Matras (2002: 18) for a slightly different use of the term). Pre-split loanwords are those that can be reconstructed to have been present in Early Romani, while post-split loanwords are dialect-specific within Romani. Pre-Selice Romani refers to the post-Early Romani ancestor varieties of present-day Selice Romani.

Indo-European
  Indo-Iranian
    Indo-Aryan
      Central Indo-Aryan
        Romani
          South Central Romani
            Rumungro (= Northern South Central Romani)
              Selice Romani

Figure 1: Genealogical affiliation of Selice Romani

Three ethnic groups are represented in the village of Selice:³ Hungarians, and two distinct Romani groups, viz. the “Hungarian” Roms, most of whom are native speakers of the dialect under description, and the much less numerous “Vlax” Roms, who speak a different Romani dialect natively (see §3.7). Both Romani groups use the plain ethnonym Rom for their own group and both are called cigányok ‘Gypsies’ by Hungarians, although the Hungarian villagers clearly differentiate between magyar cigányok ‘Hungarian Gypsies’ and oláh cigányok ‘Romanian Gypsies’, i.e. the Vlax Roms. The former are referred to as Rumungri by the latter, who are in turn called Pojáki by the former. Until the 1970s, the Hungarian Roms of Selice inhabited a separate, densely populated neighborhood of one-room adobe houses on the southeastern outskirts of the village. Presently, however, they live in regular houses, interspersed among the Hungarian population. The Vlax Roms have been based in Selice for more than a century, though they were semi-itinerant until 1958, when the Czechoslovak authorities forced them to settle. Their small colony is still located on the northwestern outskirts of Selice. If counted together, the two Romani groups slightly outnumber the Hungarian population of the village.⁴ Until recently, however, the Hungarians were in a demographic majority and they remain the economically and politically dominant group in the village.

Map 1: Geographical setting of Selice Romani

² Note, however, that the character of Romani dialect groups is a controversial issue: although they may have resulted from separate migrations of Romani speakers out of Asia Minor or the southern Balkans, and so conform well to the family tree model (Boretzky 1999; Boretzky & Igla 2004), they may also have developed in situ due to feature diffusion within Romani, and so represent a convenient reference grid rather than genealogical units (Matras 2002, 2005). While I tend to see more evidence for the separate migration scenario in the case of the South Central Romani group (Elšík 2006), the issue certainly requires further research.

³ A score of ethnic Slovaks and Czechs and a couple of Ruthenians and Poles have married into Hungarian or Romani families. The once numerous Hungarian-speaking Jewish community of Selice was completely annihilated during the Holocaust; the single living survivor does not live in the village anymore.

⁴ Roms are taken here to be the people who identify themselves as Roms in most informal social contexts and/or who are identified as Roms/Gypsies by other locals. (Most, though not all, Roms thus defined speak Romani natively.) However, only 3% and 4% of the villagers declared Romani ethnicity in the 1991 and 2001 censuses respectively, which amounts to ca. 7% of the Romani population; two thirds of Selice Roms declared Slovak ethnicity and a fifth declared Hungarian ethnicity.

9. Loanwords in Selice Romani

263

Selice Romani is prevalently an oral language. Some Hungarian Roms of Selice are able to write letters or text messages in Romani but the language is not used for regular written communication. Nor is it used in mass media or in formal education. Although Romani in general is an officially recognized language in Slovakia, there is no recognition of the Rumungro dialect specifically and, so far, there have been no attempts at its standardization. While all Hungarian Roms of Selice born before 1975 or so are native speakers of Selice Romani, in some families children are presently spoken to only in Hungarian and/or Slovak, and left to acquire some competence in Romani in adolescent and adult peer groups, if at all. Thus, Selice Romani is not a safe language, though it is not seriously endangered yet. Interestingly, many Hungarian villagers understand Selice Romani well, although only a few have some active competence in it and I know of no fluent speakers. (See §3.7 for more details on the current contact situation.)

2. Sources of data

All the Selice Romani data in the subdatabase stem from my linguistic fieldwork, which has been carried out during short but numerous fieldtrips to Selice since 1997. I have worked especially with one middle-aged female speaker and with people, of both genders and different generations, from within her extended family. Thus, the variety of Selice Romani described here represents a familiolect rather than the local dialect of the Hungarian Roms in general. This is important to stress, as it seems that the Selice Romani lexicon shows significant variation across different groups of speakers, especially with regard to the number of loanwords from Hungarian.⁵ In addition to her native language, my main consultant speaks Hungarian, Slovak and Czech fluently, and she has some basic competence in Russian. While a great many of the words in the subdatabase have been acquired through analysis of spontaneous narratives and conversations, all of these have been re-checked with my consultants. A significant part of the subdatabase entries, a third or so, stem from direct lexical elicitation. Many Early Romani etymologies, including those of pre-split loanwords, have been discussed at least in some of the previous lexical and/or etymological studies on Romani (e.g. Pott 1844–1845, Ascoli 1865, Miklosich 1872–1881, Sampson 1926, Wolf 1960, Valtonen 1972, Vekerdi 1983 [2000], Soravia 1988, Boretzky & Igla 1994, Mānuš 1994, Mānušs et al. 1997, Tálos 1999). Several publications on individual layers of lexical borrowings into Romani are mentioned in §3. I have drawn especially on two sound sources, Boretzky & Igla (1994) (cf. Kostov 1996, Matras 1996) and Mānušs et al. 1997 (cf. Bakker 1999), in etymologizing pre-split loanwords in Selice Romani, while most etymologies of post-split loanwords, including all etymologies of loanwords from Hungarian, Slovak and Czech, are my

⁵ On the other hand, Selice Romani exhibits a great degree of homogeneity as far as its morphosyntax and phonology are concerned.


Viktor Elšík

own. Finally, I have consulted several publications (Beníšek 2006; Buck 1949; Burrow & Emeneau 1960, 1984; Kuiper 1948, 1991; Lubotsky 2001; Mayrhofer 1986–2001; Turner 1962–1966; Witzel 1999a, 1999b, 1999c) in order to identify loanwords into the Old Indo-Aryan and earlier stages of Selice Romani, which, for obvious reasons, have hardly ever been considered in etymological studies on Romani.

3. Contact situations

Selice Romani and its ancestor varieties have come into contact with a number of different languages in a variety of contact situations, including in all likelihood language shift (see §3.2). This section is structured chronologically into periods characterized by contact with a certain language or, more often, with a cluster of languages that may be conveniently discussed together. Although we lack any direct evidence, it is clear that at least after the out-migration of Romani speakers from the Indian subcontinent, the speakers of the immediate contact languages of Romani were overwhelmingly dominant numerically and politically with regard to the Roms. Extrapolating from the similar current demographic and political conditions of Romani in Europe, we may reasonably assume widespread bilingualism among the Roms during their migrations (§3.4–6). As the current contact situation (§3.7) clearly indicates, we must always allow for plurilingualism of the speakers rather than mere bilingualism and for periods of overlap of contact with different languages.

3.1. Contact with non-Indo-European Central Asian languages

Being an Indo-Iranian language, Selice Romani inherits some of the loanwords into Proto-Indo-Iranian that had been acquired before the Aryans arrived in the Indian subcontinent. The source languages of these loanwords remain unidentified, although some authors hypothesize that they mostly represent the non-Indo-European element of ancient Central Asia, specifically the language (or languages) of the Bactria-Margiana Archaeological Complex in the Amu Darya region (e.g. Witzel 1999a: 54; 2003: 52; Lubotsky 2001). While the source forms of the suggested loanwords are unattested, criteria such as irregularity with regard to the Indo-European phonological, phonotactic and morphological patterns, together with the restricted distribution of the etyma within Indo-European, are used in establishing their loanword status (cf. Lubotsky 2001: 301–305). Reviewing all Proto-Indo-Iranian words that are unattested elsewhere in Indo-European, Lubotsky (2001) argues that many of them are likely to have been borrowed in Central Asia. Of these probable loanwords, Proto-Indo-Iranian *matsi̯a- ‘fish’, *r̥ši- ‘seer’, *sūčī- ‘needle’, and *u̯r̥tka- ‘kidney’ have survived into Selice Romani (see Appendix; note Proto-Indo-Iranian ‘seer’, ‘kidney’ > Selice Romani


‘priest’, ‘liver’). In addition, the borrowed Proto-Indo-Iranian *u̯arāj́ʰa- ‘wild boar’ might be reflected in Selice Romani bálo ‘pig’, if Mānušs et al. (1997: 28) are correct in deriving the Romani word from Old Indo-Aryan varāhá- ‘wild boar’ (cf. Turner 1962–1966: 520 and Boretzky & Igla 1994: 19 for a different view). The Selice Romani verb khand- ‘to smell’ is based on a lost noun (reconstructable for Early Romani) that continued the borrowed Proto-Indo-Iranian noun *gandʰ/t- ‘smell’. A few more of Lubotsky’s loanwords have been lost in Selice Romani but are continued in other Romani dialects (‘donkey’, ‘tree’, and perhaps also ‘well, source’). Of a different origin – perhaps Burushaski, perhaps Semitic, perhaps Anatolian (cf. Mayrhofer 1986–2001: I, 499; Witzel 1999a: 29, 55) – might be the Proto-Indo-Iranian etymon for ‘wheat’, whose Old Indo-Aryan reflex godhū́ma- has developed into Early Romani *giv (e.g. Turner 1962–1966: 230). The Selice Romani equivalent šužo jiv ‘wheat’, which can be literally translated as ‘clean snow’, must have developed through confusion of an older *ďiv ‘wheat’ (still attested in closely related Rumungro dialects, cf. Vekerdi 2000: 56) and the near-homonymous noun jiv ‘snow’ (which reflects Proto-Indo-European *ǵʰim- ‘cold etc.’, e.g. Mayrhofer 1986–2001: II, 815). Finally, Proto-Indo-European *medʰu- ‘sweet drink, honey’ is, according to Witzel (1999a: 55–56), a loanword from an unknown paleo-Eurasian language of eastern Europe or northern Central Asia. If Boretzky & Igla (1994: 183) are correct in deriving Romani mol ‘wine’ from Old Indo-Aryan mádhu- ‘honey, mead’, then this etymon may be the oldest quotable loanword in Selice Romani. However, a much later borrowing into Romani of Persian mol ‘wine’ (e.g. Turner 1962–1966: 562; Mānušs et al. 1997: 87), itself of the same origin, appears to be a more convincing hypothesis on both formal and semantic grounds.

3.2. Contact with non-Indo-Aryan Indian languages

As an Indo-Aryan language, Selice Romani inherits traces of linguistic contacts of its Old and Middle Indo-Aryan ancestor varieties with non-Indo-Aryan languages of India. Kuiper (1991) has shown that already the Rgveda, the pre-Iron Age Old Indo-Aryan text of the Greater Panjab, contains several hundred clearly non-Indo-Aryan words. While the presence of Dravidian loanwords in Old Indo-Aryan has long been recognized (e.g. Burrow 1945, 1946, 1947–8; Burrow & Emeneau 1960/1984; Southworth 2005a, 2005b), Witzel (1999a, 1999b) argues that they started to enter the language only in the middle and late Rgvedic periods. The earliest Rgvedic period, on the other hand, is characterized by loanwords from undocumented Greater Panjab substrates. Following Kuiper’s (e.g. 1948, 1991) work on Proto-Munda loanwords in Old Indo-Aryan, Witzel (1999a) refers to the major Rgvedic substrate as Para-Mundic and considers it to be a western variety of Austroasiatic. The number of both Dravidian and (Para/Proto-)Munda loanwords in Indo-Aryan increases in post-Vedic times (Burrow 1973: 386, Witzel 1999a: 34). In addition, a number of unidentified substrate languages, such as Masica’s (1979)


Gangetic Language X, have been suggested to have contributed loanwords to regional varieties of Indo-Aryan. Selice Romani retains over a dozen non-Indo-Aryan Indian loanwords into Indo-Aryan, which are, with a few exceptions (e.g. ‘sack’ or ‘straw’), represented in the Loanword Typology (LWT) meaning list. The bulk of the loanwords are attested in, or have been reconstructed for, Old Indo-Aryan, though a few may be of a later or local origin. For example, Romani purum ‘onion’, a possible loanword from Dravidian (cf. Tamil pūṇḍu ‘onion, garlic’, Mānuš 1994: 34; Mānušs et al. 1997: 106), appears to be isolated within Indo-Aryan.⁶ Some of the Indian loanwords in Romani have a more or less established Dravidian etymology (Burrow & Emeneau 1960/1984; Turner 1962–1966), while others continue probable or possible loanwords from Proto-Munda (Kuiper 1948). It is possible that the Romani word murš ‘man, male’ continues a loanword of Proto-Burushaski *mruža/mruša ‘Burusho’ into Old Indo-Aryan.⁷ Certainly the most telling Indian loanword in Romani is the ethnic autonym of Roms, cf. Early Romani *řom *‘Rom; Romani married man; Romani husband’.⁸ Its ancestor form, Old Indo-Aryan ḍōmba-, which also survives as the name of other Indian-origin ethnic groups in the Middle East and of various low castes in northern India (cf. Briggs 1953), is clearly of Munda provenance (Kuiper 1948: 87; Turner 1962–1966: 313; Beníšek 2006). This indicates (though does not prove) that the Ḍōmba were originally a Munda-speaking group who shifted to an Indo-Aryan language (Vekerdi 1981; Beníšek 2006). On account of the late attestation of the term ḍōmba- in Indo-Aryan, viz. in the sixth century CE, Beníšek (2006: 23–24) suggests that the shift did not take place before the beginning of the Common Era.

3.3. Contact with other Indo-Aryan languages

It is likely that, in addition to borrowing from the non-Indo-Aryan Indian languages, there was also lexical borrowing from other Indo-Aryan varieties into the

⁶ It certainly does not continue Old Indo-Aryan palāṇḍu- ‘onion’, of unclear etymology (Mayrhofer 1996: II, 102) and probably also a borrowing, on account of the “suspicious” cluster /ṇḍ/ (cf. Witzel 1999a: 11, 43).

⁷ Traditionally, the Romani word has been explained as a contamination of Old Indo-Aryan manuṣyà- ‘human being’, which itself results in Romani manuš, with Old Indo-Aryan puruṣa- ‘man’ (e.g. Turner 1962–1966: 564). The latter has been suggested to be based on the Proto-Burushaski form (Witzel 1999c) but given the presence of m-initial forms such as Multani and Parya muṛs, Sindhi mursu etc. in the Indian North West, we may perhaps derive the Romani word directly from an unattested m-initial Old Indo-Aryan form.

⁸ While some groups of Romani speakers have replaced this original ethnonym by various innovative autonyms (e.g. Matras 1999, 2002), all Romani dialects retain the word’s secondary meaning ‘(Rom) husband’, whose development has been elucidated by Beníšek (2006: 14–17). In some dialects, the word can only be used to refer to husbands of the Romani ethnicity in its secondary meaning, while in others, including Selice Romani, it has acquired an ethnically neutral meaning ‘husband’.


Indo-Aryan ancestor varieties of Romani. First, there may have been loanwords into Proto-Romani from literary Indo-Aryan languages, though – assuming that Proto-Romani did not have any literate speakers – they would have had to be acquired through mediation of other vernaculars. For example, Turner (1926: 151) suggests that Romani truš ‘thirst’ and rašaj ‘priest’, both retained in Selice Romani, may reflect early loanwords from Sanskrit. In a later publication he only derives the latter from an unattested North Western Prakrit form (Turner 1962–1966: 118), which brings us to a second, geographical, point: Turner (1926) argues convincingly that Proto-Romani originated as a Central Indo-Aryan variety and, somewhat less convincingly (cf. Woolner 1928; Beníšek 2006: 23–24), that it must have severed its connection with the Central group before the third century BCE. He also claims that Proto-Romani speakers then migrated to the Indian northwest, which was actually long (e.g. still in Turner 1924: 41) believed to be the original home of Proto-Romani; there they spent several centuries, borrowing words, including several that can be identified specifically as Northwestern Indo-Aryan or even “Dardic.” The ones Turner (1926: 156, 174) explicitly mentions are reflected in Selice Romani as štár ‘four’, šó ‘six’ and murš ‘man, male’. However, as Matras (2002: 47) points out, the lexical evidence for the Northwestern contact of Proto-Romani is “marginal and largely inconclusive.” Indeed, Turner (1962–1966: 742–743) himself appears to have later revised his Dardic hypothesis regarding the origin of the Romani numeral ‘six’, deriving it instead from a separate Old Indo-Aryan form, and he no longer mentions the possible Dardic origin of the other Romani forms.

3.4. Contact with Middle-Eastern languages

While hypotheses about the time of the out-migration of Proto-Romani speakers from India vary tremendously, ranging between the fourth century BCE and the eleventh century CE, Matras’ (2002: 18) suggestion that the ancestors of the Roms left the subcontinent some time in the eighth or ninth century CE cannot be wildly off the mark. Between this period and the arrival of the Roms in the Byzantine Empire (see §3.5), Proto-Romani was in contact with several Middle Eastern languages, as evidenced by loanwords attested in various Romani dialects and hence reconstructable for Early Romani: First, there are a relatively high number of Iranian loanwords in Romani. Boretzky & Igla (1994: 329–331) list 67 possible Iranianisms, of which over three dozen are quite certain, while Hancock (1995) includes as many as 119 potential loanwords from Iranian, though many of these are obviously recent, dialect-specific, borrowings into Romani dialects of the Balkans via Turkish and other Balkan languages (cf. Matras 2002: 23). Additional lexical Iranianisms not identified or classified as such in either of the above lists are identified especially in Mānušs et al. (1997). The overwhelming majority of Iranian loanwords in Romani can be derived from (late) Middle Persian, although many allow for, and some appear to require, a


different source. Kurdish and Ossetic are widely held to have contributed a few loanwords each, e.g. Early Romani *kirivó ‘godfather’ < Kurdish kirîv (Mānušs et al. 1997: 72) and Early Romani *vr̥dón ‘cart, wagon’ < Ossetic wærdon (e.g. Boretzky & Igla 1994: 301, 331; but cf. also Middle Persian wardyūn). Selice Romani retains two dozen Iranian loanwords from the larger Early Romani pool, including zijand ‘damage, pity’ from Persian ziyān ‘damage [etc.]’ (my etymology).⁹ Most of the Iranian loanwords in Selice Romani are represented in the LWT meaning list, with the exception of a possibility particle and nouns meaning ‘strength, force, power’, ‘whip’, and ‘co-father-in-law’. Second, the Romani lexicon contains loanwords from Armenian (many of which are themselves loanwords from Iranian, and sometimes difficult to distinguish from immediate Iranianisms). Their number is somewhat lower than that of Iranian loanwords, though still relatively important: recent overviews list 34 (Hancock 1987), 41 (Boretzky & Igla 1994: 331–332), or 51 (Boretzky 1995) possible items, of which around two dozen are quite certain (cf. Matras 2002: 23). Selice Romani retains only nine certain or probable loanwords from Armenian, one of which is not represented in the LWT meaning list: pativ-ake ‘in vain, for free’, an adverbialized dative of the noun *pativ ‘honor’ < Armenian patiw, which has been lost in the variety. Finally, four Romani nouns have been suggested to be loanwords from Georgian: ‘plum’, ‘suet, tallow’ (e.g. Pobożniak 1964: 79), ‘eyelash’ (Friedman 1988), and ‘sand’ (Grant 2003: 27). None of these etyma have survived into Selice Romani: they have been replaced either by more recent loanwords or through a dialect-specific semantic shift of an indigenous word (viz. ‘sand’ < ‘dust, powder’).
Since “[a] thorough investigation of the Iranian element in Romani from an Iranist’s point of view is still missing” (Matras 2002: 23), we cannot exclude that Proto-Romani was also in contact with other Iranian languages than those mentioned above. If, however, the lack of loanwords from East Iranian languages (with the exception of Ossetic, spoken in the Caucasus) and Balochi turns out to be genuine, we may hypothesize a relatively rapid migration of the ancestors of the Roms out of the Indian subcontinent to Khorasan, a more likely place, it appears, for their acquisition of Persian loanwords than Fars. The further migration route is likewise far from certain: Boretzky (1995) considers the possibility that the few Georgian words in Romani were borrowed via Armenian. Matras (2002: 25), in a similar vein, suggests that both the Georgian and the Ossetic loanwords may have been transmitted via other sources. Also, most if not all of the suggested Ossetic loanwords allow for alternative, Iranian or Armenian, etymologies. Considering all this plus the well-known fact that Armenian was also spoken in eastern Anatolia, it is quite possible that Proto-Romani speakers never actually inhabited the southern

⁹ The form of the noun ziján-i ‘damage’ in some Romani dialects of the Balkans (e.g. in the South Vlax dialect of Ajia Varvara, Athens; cf. Messing 1988: 140, Friedman 1989) clearly indicates that it is a relatively recent Turkism (of Persian origin). On the other hand, the form of the Selice Romani word makes it clear that it continues a Proto-Romani loanword from Persian.


Caucasus. Indeed, Matras (1996, 2002: 25) suggests that the contact of Romani with Armenian and Western Iranian could have taken place simultaneously with its contact with Byzantine Greek. This is compatible with, though not implied by, Toropov’s (2004: 15) convincing argument that Romani contact with Armenian must have occurred by the ninth century CE. Important for the reconstruction of Romani migrations is the lack of any unambiguous pre-split loanwords from Turkic, whether immediate or mediated by other languages. Ultimate Arabisms are very rare and most likely mediated by other Middle Eastern languages (Matras 2002: 25). Selice Romani retains a single Arabism, viz. humer ‘boiled or baked dough; pastry; noodles’, which has been borrowed into Romani via Persian and/or Armenian. Interestingly, Berger (1959) suggests a number of Burushaski etymologies for Romani, which however are mostly rejected as unconvincing by Matras (2002: 24). One of Berger’s Burushaskisms, reflected in Selice Romani as cid- ‘to pull; draw; suck’, is deemed possible by Matras but it receives a more convincing Indo-Aryan etymology in Tálos (1999: 257), and so we may actually dispense with the assumption of the presence of Romani speakers in the Karakoram Mountains on their way out of India.

3.5. Contact with Greek

While the first historical records of the presence of Gypsies in the Byzantine Empire originate from the late eleventh century CE (e.g. Soulis 1961), Tzitzilis (2001: 327–8) argues on linguistic grounds that Romani contact with Greek must have occurred by the tenth century. He also suggests that the oldest layer of Hellenisms in Romani consists of loanwords from Pontic and Cappadocian dialects of Medieval Greek, which of course also makes sense geographically. Differing degrees of morphological integration of Greek loanwords may reflect different layers of contact (see §5.2). For example, Greek ðróm-os ‘way’ is fully integrated as drom in Romani, and is likely to be an earlier loanword than Greek fór-os ‘square; market’, which retains its Greek nominative inflections in Romani. The fact that Greek is the source of numerous inflectional and derivational affixes in Romani (e.g. Boretzky & Igla 1991, Bakker 1997) and the model of radical morphosyntactic Balkanization-cum-Hellenization of the language (e.g. Friedman 1986, 2000; Matras 1994, 1995) suggests that contact with Greek involved fluent bilingualism of adult Romani speakers. Since most of the Greek-origin grammatical component is shared by all present-day Romani dialects, we may safely assume a relatively homogeneous speech community at the time of (early) Greek contact and locate Early Romani, the common ancestor of all modern Romani dialects, in the Byzantine period. Selice Romani retains three dozen Greek loanwords, a third of which are not represented in the LWT-based subdatabase, including nouns meaning ‘cabbage’, ‘carrot’, ‘fairy tale’, ‘lap’, ‘jelly’, and several function words. This number contrasts, for example, with twice as high a number of Hellenisms in a familiolect of Welsh


Romani (Sampson 1926, counted in Grant 2003: 29).¹⁰ Both numbers certainly represent a mere fraction of all Greek loanwords that were in use in Romani during its Byzantine period, as indicated by the sum of Hellenisms that have been retained at least in some modern dialects of Romani outside of the Greek-speaking area. For example, Boretzky & Igla’s (1994) dictionary contains a list of 238 loanwords from Greek; Grant (2003) lists over 300 items, of which 260 he considers to be assured or likely; and there are several additional Greek items in Vekerdi (1983 [2000]) and Tzitzilis (2001) not discussed in either of the above. Two loanwords retained in Selice Romani have not been previously identified as Hellenisms, viz. the ethnonyms ungro ‘Hungarian’ and servo ‘Slovak’ < Greek úngros and sérvos ‘Serb’, respectively.

3.6. Contact with South Slavic languages

The first historical records of the presence of Gypsies in the South Slavic area originate from the second half of the fourteenth century CE (e.g. Fraser 1992), just before the Ottoman conquest of Bulgaria and Serbia, though the first contacts of Romani speakers with South Slavic are likely to have occurred somewhat earlier. Since early historical records do not discriminate between different Romani groups, we are not in position to date with any precision the beginning of the South Slavic bilingualism of the ancestors of Selice Romani speakers on historical grounds. The South Slavic languages contribute almost three dozen loanwords to the subdatabase, which amount to two thirds of all South Slavic loanwords attested in Selice Romani. Those that are not represented in the sample include an ethnic noun referring to non-Roms, which has the source meaning ‘(the) coarse (one)’; the comparative adjective ‘worse’, whose suppletive positive-degree counterpart is also a South Slavic loanword; and more. The number of South Slavic loanwords was certainly much higher during the time of South Slavic bilingualism of preSelice Romani speakers. In fact, closely related Rumungro varieties retain a number of Slavicisms that have been replaced by Hungarian loanwords in Selice Romani, e.g. ‘world’, ‘foreign’, ‘to write’, and more. A few South Slavic loanwords have a relatively wide distribution within Romani and may be assumed to have been borrowed into the language before the outmigration of different Romani groups from the southern Balkans and their geographical dispersal throughout Europe (cf. Boretzky & Igla 2004: 9; Boretzky n.d.). One example of such a word is Selice Romani vodro ‘bed’ (cf. Old Church Slavonic odr5 ‘bed’), which is also attested, for example, in Welsh and Finnish Romani. Its meaning, too, shows that it must be a relatively old borrowing: the word has undergone various semantic specializations in modern South Slavic languages, e.g. 10

Grant (2003: 29) also counts Greek loanwords in other Romani dialects such as Lovari (Vekerdi 1983), but these represent dialect clusters rather than individual local varieties, and so these counts are, strictly speaking, not comparable to the number of loanwords in Selice Romani.


Bulgarian odǎr ‘plank bed’, Serbo-Croatian odar ‘hearse, catafalque’, or Slovene oder ‘platform, plank stand’. Nevertheless, the majority of South Slavic loanwords in Selice Romani are dialect-specific loanwords, most of which are restricted within Romani to the South Central dialect group. Several South Slavic loanwords in Selice Romani could have originated in any South Slavic idiom, e.g. zelen-o ‘green’ < zelen. Mostly, however, the distribution of the source word is restricted within the South Slavic area, and it is often possible to identify the source language quite specifically, due to form and/or meaning peculiarities of the Selice Romani loanword. For example, Selice Romani erďavo ‘bad, evil, wrong’ clearly derives from Serbo-Croatian rđav ‘rusty; bad, evil’, since the other South Slavic languages exhibit very different forms and have not developed the relevant secondary meaning ‘bad, evil’ (cf. Bulgarian rǎždiv, Macedonian ’rǵosan, Slovene rjast ‘rusty’). A few Selice Romani words, both within and without the sample, can be identified even more specifically as loanwords from an Ikavian dialect of Serbo-Croatian (Elšík et al. 1999), e.g. cilo ‘whole; all’ < cio ~ cil-, ninco ‘German’ < nimac ~ nimc-. While quite a few South Slavic loanwords in Selice Romani must originate in Serbo-Croatian, almost all of them can, and so it may well be that Selice Romani acquired almost all of its South Slavic loanwords from a single source. Although there is no historical documentation of the out-migration of the ancestors of Selice Romani speakers out of the South Slavic linguistic area, it is quite likely that it was part of wider population movements triggered by the Ottoman expansion in the Balkans and towards Hungary and Hapsburg Austria.
It is tempting to connect the current presence of the South Central Romani speakers in the western part of historical Hungary to the large-scale re-settlement of Croats to Burgenland (Gradišće) and the neighboring parts of Hungary, including the southwest of present-day Slovakia, which took place especially during the sixteenth century.¹¹ However, a small piece of linguistic evidence appears to indicate a somewhat later out-migration. The only Turkism among the South Slavic loanwords in pre-Selice Romani, viz. duhano ‘tobacco’ < Serbo-Croatian duhan (< Turkish duhan ‘smoke’ < Arabic duhān; cf. Buck 1949: 534), denotes a New World plant that was introduced into the Balkans by the Ottomans at the very beginning of the seventeenth century (e.g. Mijatović 2006). This requires that there still was contact between pre-Selice Romani and (the Turkish-influenced varieties of) Serbo-Croatian at this time.¹²

¹¹ For example, the village of Hrvatski Grob, located several dozen kilometers to the northwest of Selice, was founded in 1552 by settlers from the Moslavian region in Croatia. The local Croatian dialect, still spoken by some elders, contains an Ikavian element.

¹² The etymon is also found in Hungarian as dohány ‘tobacco’ and it cannot be excluded that the immediate source of the Selice Romani word is an unattested dialectal Hungarian form *duhan.


3.7. The current contact situation

All school-age or older native speakers of Selice Romani are plurilingual, speaking two or more languages fluently, in addition to Romani. First of all, they are all fluent and highly competent in Hungarian, which they use especially in their everyday communication with the Hungarian villagers but also with those Hungarian Roms of the village and the region who are less competent in Romani or who do not speak or understand Romani at all. Some young children may be monolingual in Romani, although early acquisition of Hungarian appears to be the prevailing pattern nowadays. We do not know when the contact with Hungarian started, neither is it clear when the ancestors of the Hungarian Roms of Selice settled in the village. They retain no memory of their previous homes or migrations and the locals claim that the recently abandoned settlement of the Hungarian Roms (see §1), by far the largest Romani settlement in the region, had been there “from times immemorial.” The bilingualism of Selice Romani speakers in Hungarian has certainly lasted for many generations, and quite likely for several centuries. An overwhelming majority of Hungarian Roms of Selice are also fluent in Slovak, which they use especially at schools and outside of the village.¹³ Although few ethnic Slovaks live in Selice, Slovak-speaking villages are located nearby, and so it is likely that the first contacts of Selice Romani with Slovak predate the creation of Czechoslovakia in 1918, whereafter Slovak became the official and dominant language of Slovakia. The contact with vernacular Slovak of the region is confirmed by dialectal features in the Slovak of elder Roms and by the form of some established Slovak loanwords in Selice Romani, e.g. škráteko ‘elf’ from Slovak dialectal škrátek (cf. standard škriatok).¹⁴ Nevertheless, it has been the recent influence of Slovak mass media and schooling that contributed to the general Slovak bilingualism among the Hungarian Roms of Selice.
Most Hungarian Roms of Selice have also acquired at least passive competence in Czech through their exposure to Czech mass media and especially during their employment-related stays in the Czech part of the former Czechoslovakia, where most families spent between ten and thirty years in the 1960s–1980s. Many Selice Romani speakers, including my main consultant, attended Czech primary schools. Active competence in other languages is individual and usually acquired during job-related stays in foreign countries. My primary consultant and several members of her family spent a year in Kazakhstan in the early 1990s, where they spoke Russian with the locals. I am aware of a single word of Russian origin in Selice Romani, viz. ďengi ‘money’ < d’en’g’i, which is a rarely used slang alternative to an indigenous Romani word.

¹³ In contrast, some local Hungarians are still monolingual in Hungarian and hardly understand Slovak.

¹⁴ An early contact with Slovak is, incidentally, also suggested by a peculiar semantic shift in the loanword of the Greek ethnonym sérvos ‘Serb’: the fact that Selice Romani servo means ‘Slovak’ appears to indicate that the ancestors of the Hungarian Roms still spoke, or at least understood, South Slavic when they first encountered the Slavic-speaking Slovaks.


Finally, a few words about the social and linguistic relations between the Hungarian Roms and the Vlax Roms of Selice are in order. Both groups consider their own group to be superior.¹⁵ There is no intermarriage between members of the two groups, and social contact is mostly restricted to economic exchange. The native language of the Vlax Roms is a Lovari-type North Vlax dialect of Romani (cf. Boretzky 2003), which is quite different from Selice Romani. In fact, the Hungarian Roms claim that they do not understand much of the dialect of the Vlaxs, and my field observations appear to confirm this. Yet, many Hungarian Roms are aware of certain salient lexical differences between the dialects and take some pride or amusement in citing “typical Vlax words,” e.g. khanči ‘nothing’ (cf. Selice Romani ništa). All adult Vlax Roms, on the other hand, regularly use Selice Romani, or rather a distinct ethnolect of it, in communication with the Hungarian Roms. Given the mutual disdain, this asymmetrical pattern clearly reflects the demographic asymmetry between the two Romani groups in Selice. The lack of any significant competence of the Hungarian Roms of Selice in the Vlax dialect makes it unsurprising that there are very few Vlax loanwords in Selice Romani. One of them is krísa, a loanword of Vlax krísi ‘judgement, trial, tribunal, court’, itself a loanword from Greek, which is used to refer to a community-internal judicial institution among the Vlaxs (no such institution exists among the Hungarian Roms). The Greek loanword is likely to have been present in Early Romani, then lost in the ancestor variety of Selice Romani, and then – as its meaning and form clearly show – borrowed “again” as a cultural insertion from Vlax.

4. Numbers and kinds of loanwords

4.1. A note on what counts as a loanword

There are 1536 lexemes in the Selice Romani subdatabase, of which 62.6% I classify as loanwords. In the overwhelming majority of instances, the lexemes considered to be loanwords here have been borrowed without any doubt, while a tiny minority of them are merely probable loanwords. In addition, a couple of dozen further words have been suggested to be loanwords (and indeed may be ones), but are not counted as such in this paper, because I do not consider their borrowing etymologies to be fully convincing. In addition to loanwords proper, there are ca. 6% of lexemes in the sample that are merely “created on loan basis” and not counted as loanwords: these are either lexicalized collocations or compounds containing a clear or probable loanword, or (synchronic or merely etymological) derivations from a clear or probable loanword.¹⁶ Semicalques, which involve borrowing of matter but not borrowing of the whole form of the lexeme, e.g. Selice Romani vala-kana vs. Hungarian vala-mikor [some-when] ‘sometimes’, are not considered to be loanwords either. This rather restrictive approach to what counts as a loanword means that the number of words that consist exclusively of indigenous morphemes is significantly smaller than the number of words that are classified as non-loanwords.

¹⁵ To wit: the Hungarian Roms consider themselves to be more civilized and progressive, resenting the wildness and backwardness of the Vlaxs, while the Vlax Roms consider themselves to be the only real and pure Roms, disdaining the Hungarian Roms as assimilated half-Hungarians (hence also the ethnic exonym Rumungro, originally *Rom-Ungro ‘Gypsy-Hungarian’).

Viktor Elšík

4.2. Loanwords by source language

It is often difficult to identify the immediate source language of a loanword precisely, especially due to genealogical relatedness or contact between source languages. For example, Selice Romani kopaj ‘stick; club’ can be a loanword from Pontic Greek, but also from Armenian or Kurdish, which borrowed the Greek word (cf. Tzitzilis 2001: 332). Given this, I find it useful to simplify the quantitative presentation of the data by lumping, in the following cases, several source languages into “contact clusters:” the Indian cluster consists of loanwords into Old and Middle Indo-Aryan from (Para/Proto-)Munda and/or Dravidian (see §3.2); the South Slavic cluster subsumes any South Slavic source (see §3.6); and, finally, the Slovak/Czech cluster consists of loanwords from both Slovak and Czech. In addition, I took a few arbitrary decisions, including the following: loanwords that can originate in Hungarian are counted as Hungarian, even if they can also originate in Slovak/Czech and/or South Slavic; and loanwords that can originate in South Slavic and Slovak/Czech are counted as South Slavic. Table 1 shows the breakdown of sample loanwords by donor language or donor language group:

¹⁶ The Selice Romani noun žuto ‘yolk’, for example, has developed through onomasiological conversion of the adjective žuto ‘yellow’, which is a clear loanword of Serbo-Croatian žut ‘yellow’. The conversion may have occurred due to pattern borrowing from Hungarian, cf. sárga ‘yellow’ and (tojás-)sárgá-ja [(egg-)yellow-3SG.POSS] ‘yolk’. Although the (base) form of the Selice Romani noun is identical to that of the borrowed adjective and although the noun’s development through conversion may have been contact-induced, the noun is not considered to be a loanword, since there is no noun of the relevant form and meaning in the source language (cf. Serbo-Croatian žumance, žumanjak, žutanjak, žutac etc. ‘yolk’).

Table 1: Loanwords in Selice Romani by source language

Source language          #   % of words   % of loanwords
pre-Indian             3.0          0.2              0.3
Indian                12.0          0.8              1.3
Persian               18.0          1.3              2.0
Kurdish                1.0          0.1              0.1
Ossetic                2.0          0.1              0.2
Armenian               9.0          0.6              1.0
Greek                 25.0          1.7              2.8
South Slavic          32.0          2.2              3.6
Hungarian            753.5         52.7             84.2
Slovak/Czech          38.0          2.7              4.2
Vlax Romani            2.0          0.1              0.2
Total loanwords      895.5         62.6            100.0
Total words         1430.0        100.0                –
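The two percentage columns of Table 1 follow from the counts by simple division. The following is a minimal sketch of that arithmetic, not a tool used in the study: the names COUNTS, TOTAL_WORDS and shares are illustrative assumptions, and only the numbers are taken from Table 1. Because the printed counts are themselves rounded, a recomputed value may differ from the printed one in the last decimal (the Hungarian share of loanwords comes out at 84.1 here, against the printed 84.2).

```python
# Counts per source language, copied from the "#" column of Table 1.
# Names and structure are illustrative; they are not part of the LWT database.
COUNTS = {
    "pre-Indian": 3.0, "Indian": 12.0, "Persian": 18.0, "Kurdish": 1.0,
    "Ossetic": 2.0, "Armenian": 9.0, "Greek": 25.0, "South Slavic": 32.0,
    "Hungarian": 753.5, "Slovak/Czech": 38.0, "Vlax Romani": 2.0,
}
TOTAL_WORDS = 1430.0  # "Total words" row of Table 1


def shares(counts, total_words):
    """Return the loanword total and (% of words, % of loanwords) per source."""
    total_loans = sum(counts.values())
    table = {
        source: (round(100 * n / total_words, 1), round(100 * n / total_loans, 1))
        for source, n in counts.items()
    }
    return total_loans, table


total_loans, table = shares(COUNTS, TOTAL_WORDS)
print(f"Total loanwords: {total_loans} ({100 * total_loans / TOTAL_WORDS:.1f}% of words)")
for source, (pct_words, pct_loans) in table.items():
    print(f"{source:<13} {pct_words:5.1f} {pct_loans:6.1f}")
```

Keeping the counts as floats allows for the fractional count in the Hungarian row (753.5).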

Hungarian, the primary current contact language of Selice Romani, is far and away the most important source of loanwords, contributing the bulk of all loanwords and over half of all words in the sample. This statement remains true even if items that may but need not be immediate loanwords from Hungarian are discounted. In addition, there are hundreds of established loanwords from Hungarian that are regularly used in Selice Romani but whose meanings are not represented in the sample. Unsurprisingly, Hungarian is also a frequent source of nonce loanwords in Selice Romani discourse. In contrast, the other contact languages or clusters, including all past contact languages, each contribute less than a twentieth of all loanwords. Although nonce loanwords from Slovak and Czech often occur in the speech of many Selice Romani speakers, the number of established Slovak or Czech loanwords cannot be much higher than the one indicated by the sample. Considering the fact that Selice Romani speakers are fluent active bilinguals in Slovak, and many of them in Czech as well, the great quantitative disproportion between the Hungarian and the Slovak(/Czech) lexical components in Selice Romani is striking. Assuming that the length of contact is hardly the only factor, the disproportion is in need of a detailed sociolinguistic explanation. Since there is no space here to discuss in any detail the ultimate and intermediate sources of Selice Romani loanwords, I will restrict myself to a few remarks: The current contact languages Hungarian, Slovak and Czech have mediated a number of loanwords from German, Latin, French, Italian, and other languages. Hungarian is also the immediate source of a number of Slavicisms (including recent Slovakisms in the local Hungarian dialect) and Turkisms (mostly of Oghuric affiliation). In addition to direct loanwords from Greek there are also several ultimate Hellenisms in Selice Romani that entered the language via Hungarian, Slovak/Czech or Vlax Romani. 
On the other hand, immediate contact with Greek also introduced a couple of Latin and ultimately Germanic (via Italian: ‘soap’) and Turkic (via Slavic: ‘Hungarian’) words. Direct loanwords from Iranian languages contrast with Iranianisms acquired via Armenian, Hungarian (e.g. ‘thousand’) or via Turkish and South Slavic (‘cotton’). Names of several plants and products originating in South Asia have been re-introduced via European languages (e.g. ‘black pepper’, ‘rice’, or ‘sugar’). Lexical borrowing has resulted in several etymological doublets in Selice Romani.

4.3. Loanwords by word class

The standard breakdown of sample loanwords by semantic word class is shown in Table 2.¹⁷

Table 2: Loanwords in Selice Romani by semantic word class (percentages)

Word class      Hungarian  Czech/Slovak  South Slavic  Greek  Iranian  Indian  Armenian  Vlax Romani  Unidentified  Total loanwords  Non-loanwords
Nouns                59.9           7.2           2.5    2.0      1.9     0.7       1.0          0.1           0.4             75.6           24.4
Verbs                41.2             -           1.2    0.9      1.5     0.3         -            -             -             45.1           54.9
Adjectives           42.1             -           4.0      -      1.6     3.2         -          0.8             -             51.7           48.3
Adverbs              50.0             -             -      -        -       -         -            -             -             50.0           50.0
Function words       21.0           0.6           4.2    4.2      0.8       -         -            -             -             30.9           69.1
all words            50.8           4.3           2.5    1.7      1.7     0.8       0.6          0.1           0.2             62.7           37.3

Of all word classes, nouns exhibit the highest proportion of loanwords: over three quarters. The other content word classes lag behind nouns and are roughly similar to one another with regard to loanword proportions: loanwords represent half of all adverbs, just over half of all adjectives, and somewhat less than half of all verbs. However, adverbs only amount to 4 items in the LWT meaning list, and so the proportion of loan-adverbs cannot be regarded as statistically significant. In fact, all Selice Romani manner adverbs that semantically correspond to Hungarian-origin adjectives are themselves lexical borrowings from Hungarian, rather than internal derivations from the borrowed adjectives, and so the proportion of loan-adverbs could be very different in an extended meaning sample. Finally, function words show the lowest proportion of loanwords: just below a third.

¹⁷ The Selice Romani morphosyntactic word classes Verb, Noun, and Adjective closely match the semantic word classes. Almost any individual LWT meaning of a certain semantic word class (as indicated in the database template) can be, provided it is lexicalized at all in Selice Romani, rendered by an expression of the corresponding language-specific morphosyntactic word class. There are only very few exceptions: for example, there is no adjective meaning ‘stinking’, only a verb meaning ‘to stink’ in Selice Romani. Consequently, the breakdown of loanwords by Selice Romani word classes would show numbers almost identical to those of Table 2.

Table 3 displays the proportions of selected diachronic layers of loanwords to all loanwords by word class (the word classes are arranged by decreasing loanword proportions), plus arithmetical differences from the total proportion of this kind. The diachronic layers considered are: loanwords from Hungarian; loanwords from all current contact languages, i.e. Hungarian, Slovak, Czech, and Vlax Romani; and loanwords acquired since the contact with Greek, including those from the current contact languages, i.e. roughly during the last millennium.

Table 3: Loanwords in Selice Romani by semantic word class and diachronic layer (percentages)

Word class        Loans  Hungarian  (diff.)  Current L2s  (diff.)  Last 1000 years  (diff.)
Nouns              75.6       83.3     -0.9         89.3     +0.6             94.9     -0.1
Adjectives         51.7       81.4     -2.8         82.9     -5.8             90.7     -4.3
Adverbs            50.0      100.0    +15.8        100.0    +11.3            100.0     +5.0
Verbs              45.1       91.9     +7.7         91.9     +3.2             96.6     +1.6
Function words     30.9       71.8    -12.4         71.8    -16.9             97.1     +2.1
Total              62.7       84.2      0.0         88.7      0.0             95.0      0.0
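The “arithmetical differences” in Table 3 are each word class’s layer proportion minus the corresponding proportion in the Total row. A minimal sketch of that subtraction, using the figures of Table 3, is given here; the names LAYERS, TABLE3, TOTAL and differences are illustrative assumptions rather than anything defined in the chapter.

```python
# Share of each diachronic layer among the loanwords of a word class (Table 3).
# Variable and function names are illustrative only.
LAYERS = ("Hungarian", "Current L2s", "Last 1000 years")
TABLE3 = {
    "Nouns": (83.3, 89.3, 94.9),
    "Adjectives": (81.4, 82.9, 90.7),
    "Adverbs": (100.0, 100.0, 100.0),
    "Verbs": (91.9, 91.9, 96.6),
    "Function words": (71.8, 71.8, 97.1),
}
TOTAL = (84.2, 88.7, 95.0)  # "Total" row of Table 3


def differences(row, total):
    """Difference of each layer share from the total share, to one decimal."""
    return tuple(round(value - ref, 1) for value, ref in zip(row, total))


DIFFS = {word_class: differences(row, TOTAL) for word_class, row in TABLE3.items()}
for word_class, diffs in DIFFS.items():
    cells = ", ".join(f"{layer} {d:+.1f}" for layer, d in zip(LAYERS, diffs))
    print(f"{word_class}: {cells}")
```

A positive difference marks a word class in which the layer is over-represented relative to all loanwords, a negative one a class in which it is under-represented.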

Hungarian loanwords (and the current loanwords in general) represent over four fifths of all loanwords in any content word class; the proportion is somewhat lower in function words. At least 90% of loanwords of any word class have been borrowed within the last millennium of the history of Selice Romani. The following may also be read off Tables 2 and 3: Hungarian is unique among the source languages in contributing a higher proportion of loan-verbs than that of loan-nouns (with regard to all loanwords of the respective word class). Slovak and Czech only contribute nouns, not other word classes. The LWT meaning list appears to be representative in this respect: although there is an established mechanism for morphological integration of Slovak and Czech verbs (see §5.4), they appear to be overwhelmingly, if not exclusively, nonce loanwords; and there are no established mechanisms for morphological integration of Slovak and Czech adjectives.

4.4. Loanwords by semantic field

The standard breakdown of loanwords by semantic fields is shown in Table 4.

Table 4: Loanwords in Selice Romani by semantic field (percentages)

No.  Semantic field                  Hungarian  Armenian  Vlax Romani  Unidentified  Total loanwords  Non-loanwords
  1  The physical world                   66.4         -            -             -             74.3           25.7
  2  Kinship                              25.9         -            -             -             32.1           67.9
  3  Animals                              61.1       1.2            -             -             77.8           22.2
  4  The body                             45.9       1.2            -           0.6             57.2           42.8
  5  Food and drink                       44.8       0.6            -             -             58.8           41.2
  6  Clothing and grooming                67.0       1.7            -           1.7             86.7           13.3
  7  The house                            66.9         -            -             -             92.7            7.3
  8  Agriculture and vegetation           67.6       1.7            -             -             89.8           10.2
  9  Basic actions and technology         52.5         -            -             -             60.5           39.5
 10  Motion                               48.8         -            -             -             57.6           42.4
 11  Possession                           39.7         -          2.1             -             51.2           48.8
 12  Spatial relations                    39.3         -            -             -             47.7           52.3
 13  Quantity                             19.1       2.6            -             -             37.0           63.0
 14  Time                                 51.3         -            -             -             59.2           40.8
 15  Sense perception                     46.5         -            -             -             55.4           44.6
 16  Emotions and values                  40.3         -            -             -             51.7           48.3
 17  Cognition                            50.7         -            -             -             50.7           49.3
 18  Speech and language                  56.4         -            -             -             62.2           37.8
 19  Social and political relations       72.8       0.4            -             -             76.8           23.2
 20  Warfare and hunting                  67.1       0.4            -             -             81.4           18.6
 21  Law                                  52.6         -          4.4             -             61.3           38.7
 22  Religion and belief                  36.5       7.7            -           3.8             63.5           36.5
 23  Modern world                         76.7         -            -             -             92.3            7.7
 24  Miscellaneous function words            -         -            -             -             12.8           87.2
     all words                            50.8       0.6          0.1           0.2             62.7           37.3

The remaining source-language columns are sparsely filled; their non-empty cells are listed below in field order, with the value for all words last:

Czech/Slovak: 0.8 1.2 11.7 2.0 4.1 6.4 15.7 10.7 3.3 4.3 2.6 1.8 0.8 10.9 4.4 9.6 14.9 4.3
South Slavic: 2.4 1.2 1.2 1.4 2.2 1.2 6.5 7.5 2.0 2.5 2.6 5.1 0.8 4.4 4.0 5.0 2.3 5.8 0.8 8.5 2.5
Greek: 1.2 1.2 1.4 1.3 2.2 3.5 2.3 1.3 2.5 2.1 10.2 5.4 4.0 3.1 0.4 1.7
Iranian: 3.2 1.2 3.7 3.2 5.0 1.3 3.8 2.9 1.3 3.4 0.4 0.4 4.3 1.7
Indian: 1.6 2.5 0.9 2.5 1.7 2.0 4.4 0.8

Table 5, analogous to Table 3 in §4.3, displays the proportions of selected diachronic layers of loanwords to all loanwords by semantic field.

Table 5: Loanwords in Selice Romani by semantic field and diachronic layer (percentages)

No.  Semantic field                   Loans  Hungarian  (diff.)  Current L2s  (diff.)  Last 1000 years  (diff.)
  7  The house                         92.7       79.3     -4.9         90.6     +1.9            100.0     +5.0
 23  The modern world                  92.3       95.9    +11.7         99.2    +10.5            100.0     +5.0
  8  Agriculture and vegetation        89.8       82.1     -2.1         89.8     +1.1             98.1     +3.1
  6  Clothing and grooming             86.7       82.2     -2.0         86.0     -2.7             88.6     -6.4
 20  Warfare and hunting               81.4       84.3     +0.1         95.7     +7.0            100.0     +5.0
  3  Animals                           77.8       84.1     -0.1         95.1     +6.4             97.9     +2.9
 19  Social and political relations    76.8       94.9    +10.7         94.9     +6.2            100.0     +5.0
  1  The physical world                74.3       92.4     +8.2         92.4     +3.7             95.7     +0.7
 22  Religion and belief               63.5       60.6    -23.6         72.8    -15.9             81.9    -13.1
 18  Speech and language               62.2       92.0     +7.8         92.0     +3.3            100.0     +5.0
 21  Law                               61.3       85.8     +1.6        100.0    +11.3            100.0     +5.0
  9  Basic actions and technology      60.5       87.9     +3.7         92.4     +3.7             97.9     +2.9
  5  Food and drink                    58.8       78.7     -5.5         80.9     -7.8             87.2     -7.8
 14  Time                              59.2       89.7     +5.5         89.7     +1.0            100.0     +5.0
  4  The body                          57.2       80.4     -3.8         82.6     -6.1             88.6     -6.4
 10  Motion                            57.6       86.5     +2.3         86.5     -2.2             95.4     +0.4
 15  Sense perception                  55.4       83.9     -0.3         83.9     -4.8             91.9     -3.1
 16  Emotions and values               51.7       77.9     -6.3         77.9    -10.8             93.4     -1.6
 11  Possession                        51.2       77.5     -6.7         90.2     +1.5             94.5     -0.5
 17  Cognition                         50.7      100.0    +15.8        100.0    +11.3            100.0     +5.0
 12  Spatial relations                 47.7       82.4     -1.8         87.8     -0.9             93.3     -1.7
 13  Quantity                          37.0       51.9    -32.3         51.9    -36.8             93.2     -1.8
  2  Kinship                           32.1       80.7     -3.5         84.4     -4.3             91.9     -3.1
 24  Misc. function words              12.8        0.0    -84.2          0.0    -88.7             49.4    -45.6
     Total                             62.7       84.2      0.0         88.7      0.0             95.0      0.0

Disregarding the field Miscellaneous function words for the moment, we may observe the following: All fields contain from just below a third to over 90% loanwords. The overwhelming majority of fields contain more loanwords than non-loanwords (with the exception of Kinship, Quantity, and Spatial relations), and around a third of fields contain more than three quarters of loanwords. The proportion of Hungarian loanwords to all loanwords ranges between a half and all in different semantic fields, with the bulk of fields showing more than three quarters of Hungarian loanwords. The proportions of loanwords from all current contact languages do not present a significantly different picture. At least four fifths of loanwords in any semantic field, and often all of them, have been borrowed within the last millennium. The fields that contain fewer loanwords in general also tend to contain, with some exceptions, a smaller proportion of the more recent, Greek and post-Greek, loanwords to all loanwords (since, however, the statistical significance of the proportions of different loanword layers will differ greatly for different fields, this latter observation should not be given too much weight).


There is certainly no single principle behind the ordering of the LWT semantic fields with regard to the proportion of loanwords they contain. Nevertheless, it may be observed that several fields consisting, to a considerable extent, of abstract concepts (e.g. Quantity, Spatial relations, Cognition, Possession, or Emotions and values) possess relatively low proportions of loanwords, whereas numerous fields that mostly contain very concrete meanings (e.g. The house, Modern world, Agriculture and vegetation, Clothing and grooming, or Animals) possess relatively high proportions of loanwords. Some of those semantic fields that stand out in Table 5 in various respects are discussed below:

The field The house shows the highest proportion of loanwords. There are only three LWT meanings that must be expressed by an indigenous word: ‘house’, ‘door’, and ‘to live, dwell’ (< ‘to sit’).¹⁸ It is likely that some loanwords in this field have been cultural insertions accompanying the speakers’ sedentarization and other changes in their dwelling patterns and conditions (e.g. ‘room’), although other loanwords have demonstrably replaced indigenous words (e.g. ‘board’) or presedentarization loanwords (e.g. ‘stove’). It thus remains unclear to what extent extralinguistic factors can be made responsible for the extremely high proportion of loanwords in this semantic domain. The fact that this field consists almost exclusively of nouns, which are the most borrowable word class in Selice Romani (see §4.3), may also be significant.

The second highest proportion of loanwords in the field Modern world is not surprising. Unlike The house, this field contains, expectedly, an above-average proportion of Hungarian and current loanwords. In fact, the only pre-Hungarian loanword in this field, caklo ‘glass [material]; bottle’ from South Slavic, has acquired its latter, modern-world, meaning through calquing the polysemy of the Hungarian noun üveg. In addition, there are a few relatively recent internal derivations in this field, and an indigenous noun meaning ‘song’, which is an ancient rather than modern concept in Romani culture.

The field Religion and Belief stands out in showing the highest proportion of old, pre-Greek, loanwords. However, given that there are only three of them, viz. ‘priest’, ‘witch’ and ‘sorcerer, wizard’ (the field contains relatively few words in Selice Romani), their outstanding proportion is probably not statistically significant.

The field Quantity has a relatively low proportion of loanwords and, especially, the lowest proportion of loanwords from Hungarian and from Selice Romani’s current contact languages in general. Almost a half of quantity loanwords were borrowed from Selice Romani’s previous European contact languages, viz. Greek and South Slavic, which otherwise contribute much smaller proportions of lexicon.

The lowest proportion of loanwords is found in the field Kinship, although they still amount to almost a third of all kinship words. Moreover, numerous expressions in this field are collocations containing a loanword or derivations from a loanword, and so the proportion of indigenous words is much lower. Indigenous kin terms that are used by all Selice Romani speakers are restricted to ‘brother’, ‘sister’, ‘father’, and ‘mother’ (the latter, however, may be a loanword). Only the older generations of Selice Romani speakers also use indigenous words for ‘father-in-law’, ‘mother-in-law’, and ‘daughter-in-law’. Further indigenous words in this field include ‘human being’, ‘man; male’ (which may be an old loanword), ‘woman; female’, and ‘wedding’.

The semantic field that has by far the lowest proportion of loanwords, and which has been disregarded in the above discussion, is Miscellaneous function words. There are only two loanwords here, one from Iranian (‘without etc.’) and one from Serbo-Croatian (‘nothing’), i.e. none from Hungarian or any other current contact language. As a result, the various loanword proportions in this field are very different from those in all other fields. Note that this field only contains certain kinds of function words, including some of the less borrowable ones (e.g. demonstratives, basic adpositions, auxiliary verbs), and should not be considered representative of function words in general: the semantic word class of function words has a proportion of loanwords more than twice as high (30.9% as against 12.8%; see §4.3).

¹⁸ In addition, there is an indigenous noun meaning ‘space under one’s head in bed’ (whereas ‘pillow’ is a loanword), and two polysemous indigenous nouns that can be used to refer to ‘floor’ (primarily ‘earth; land’) and ‘bed’ (primarily ‘place’), for both of which there are borrowed synonyms in the relevant specific meanings.

5. Integration of loanwords

5.1. Phonological integration of loanwords

The phoneme inventory of Selice Romani is almost identical to that of the local dialect of Hungarian, partly because Selice Romani has both acquired and lost a number of phonemic distinctions due to contact with this language (cf. Elšík 2007). The only Hungarian phonemes to get phonologically adapted in Selice Romani loanwords are the front rounded vowels: the mid /ö/ [ø] and /ő/ [øː] and the high /ü/ [y] and /ű/ [yː]. They are mostly replaced with their front unrounded counterparts, the mid /e/ [e ~ æ] and /é/ [æː] and the high /i/ [i] and /í/ [iː], respectively, e.g. Hungarian csütörtök ‘Thursday’ > Selice Romani čitertek-o and Hungarian kőműves ‘bricklayer’ > Selice Romani kémíveš-i. One systematic exception occurs in Selice Romani loanwords of polysyllabic Hungarian nouns whose base form ends in the long front rounded vowels. Here, Hungarian /ő/ and /ű/ are replaced with the back rounded vowels /ó/ [oː] and /ú/ [uː], respectively, e.g. Hungarian kereskedő ‘merchant’ > Selice Romani kereškedó and Hungarian kesztyű ‘glove’ > Selice Romani kesťú(-va). However, when these nouns are parts of compounds in Hungarian, the regular unrounding applies, e.g. Hungarian tüdő > Selice Romani tidó ‘lung’ but Hungarian tüdő+baj [lung+trouble] > Selice Romani tidébaj-a ‘pulmonary tuberculosis’. Also regular is the phonological adaptation in loanwords of Hungarian adjectivals and monosyllabic nouns ending in the long front rounded vowels, e.g. Hungarian első ‘first’ > Selice Romani éšé-n-o, Hungarian könnyű ‘light; easy’ > Selice Romani ke […] k,] in Slovak východ ‘east’ > Selice Romani víkhod-o, and obligatory [r- > .] in Czech pepř > Selice Romani pepš-o ‘black pepper’. Many apparent instances of phonological adaptation in current Selice Romani loanwords in fact reflect dialectal source forms, e.g. Selice Romani čekíl-n-o ‘shallow’ < Hungarian dialectal csekíl, cf. standard sekély; or adoption of the source language’s non-base stem variants, e.g. Selice Romani samar-a ‘donkey’ < Hungarian szamar-, cf. the base stem szamár. In addition to these factors, post-contact phonological changes must also be taken into account when one tries to identify adaptation processes in older loanwords. For example, the Serbo-Croatian word volja ‘will; mood’ was probably borrowed without any phonological adaptation before it changed to present-day Selice Romani vója ‘good mood’, due to regular Hungarian-induced phonological developments. One of the few clear instances of pre-Hungarian phonological adaptation is the change [y > u] in Early Romani kurko ‘Sunday; week’, a loanword of Medieval Greek kyrikó(n) ‘Lord’s (day); Sunday’ (Tzitzilis 2001: 327).
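The vowel substitutions described in this section can be sketched as a small rule set. The sketch below is an illustration under stated assumptions, not a description of how the author analysed the data: the function name adapt_vowels and its is_noun flag are hypothetical, the input is Hungarian orthography, and only the vowels are adapted. The respelling of consonants (e.g. cs as č, s as š), the Selice Romani inflectional endings (-o, -i, -a) and the lexical and sociolinguistic variation in unrounding are deliberately left out.

```python
# Regular unrounding of Hungarian front rounded vowels (orthographic symbols).
UNROUND = {"ö": "e", "ő": "é", "ü": "i", "ű": "í"}
# Exception: word-final long front rounded vowel of a polysyllabic noun.
BACK_FINAL = {"ő": "ó", "ű": "ú"}
HUNGARIAN_VOWELS = set("aáeéiíoóöőuúüű")


def adapt_vowels(word: str, is_noun: bool) -> str:
    """Apply the vowel substitutions to a Hungarian base form (vowels only)."""
    chars = list(word.lower())
    syllables = sum(ch in HUNGARIAN_VOWELS for ch in chars)
    last = len(chars) - 1
    for idx, ch in enumerate(chars):
        if idx == last and is_noun and syllables > 1 and ch in BACK_FINAL:
            chars[idx] = BACK_FINAL[ch]   # kereskedő > kereskedó
        elif ch in UNROUND:
            chars[idx] = UNROUND[ch]      # csütörtök > csitertek
    return "".join(chars)


for hungarian in ("csütörtök", "kőműves", "kereskedő", "kesztyű", "tüdő", "tüdőbaj"):
    print(hungarian, ">", adapt_vowels(hungarian, is_noun=True))
```

Because the exception is tied to the final segment of the base form, a compound such as tüdőbaj falls under the regular unrounding without any special handling, which matches the contrast between tidó and tidébaj-a cited above.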

5.2. Morphological integration of loanwords

Loanwords that are assigned the status of an inflected Selice Romani word class (noun, verb, or adjective) are, as a rule, morphologically integrated into Selice Romani inflectional patterns. However, there is a general division in Romani between two major diachronic layers of loanwords with regard to their degree of integration: loanwords from pre-Greek contact languages are fully integrated and indistinguishable from indigenous words on morphological grounds, whereas loanwords from post-Greek contact languages are, or can be reconstructed to have been in Early Romani, overtly marked by various morphological means as loanwords. Loanwords from Greek, which is the source of most loanword markers (e.g. Bakker 1997), are split between these two layers: some Hellenisms, presumably the early ones, are fully integrated, while others, presumably the later ones, are overtly marked as loanwords. This diachronic division is synchronically reflected as a morphologically encoded etymological compartmentalization of the lexicon: older loanwords, together with indigenous words, have what I term oikoclitic morphology, while more recent loanwords have xenoclitic morphology. The distinction between oikoclisis and xenoclisis, which can be reconstructed for Early Romani, has undergone a variety of analogical developments in individual Romani dialects, affecting not only individual lexemes, but also whole inflectional and derivational classes (see Elšík & Matras 2006: 324–333 for an overview).

The distinction between the full integration (oikoclisis) of earlier loanwords and marked integration (xenoclisis) of later loanwords is well retained in Selice Romani noun inflection. Xenoclitic loanwords are characterized by borrowed nominative suffixes, mostly of Greek origin, and by analogically reshaped oblique stem suffixes (see Elšík 2000, Matras 2002: 80–85 for details). For example, oikoclitic masculine loan-nouns in -o (e.g. čár-o ‘bowl, dish’ from Dravidian, ťirm-o ‘worm’ from Persian, and kurk-o ‘Sunday; week’ from Greek) take the indigenous nominative plural suffix -e and the indigenous oblique singular suffix -es-, whereas xenoclitic masculine loan-nouns in -o < Early Romani *-os (e.g. fór-o ‘town’ from Greek, prah-o ‘dust, powder’ from South Slavic, világ-o ‘world’ from Hungarian, and pepš-o ‘black pepper’ from Czech) take the borrowed nominative plural suffix -i and the reshaped oblique singular suffix -os-. Other inflectional classes show different markers, but the principle remains the same.

Similarly, pre-Greek and early Greek loan-verbs show full morphological integration and are structurally indistinguishable from indigenous verbs. Post-Greek loan-verbs, on the other hand, are marked out by an overt (and dedicated) adaptation marker, the Greek-origin suffix -in-, which is added to an inflectional stem of the source verb (e.g. vič-in- ‘to shout’ from Serbo-Croatian vič-, dógoz-in- ‘to work’ from Hungarian dolgoz-), and followed by regular indigenous inflections. The suffix, which is a pre-inflectional though non-derivational morpheme, was extracted from lexical borrowings of Greek verbs with the present stem in -in-. Though none of these have been retained in Selice Romani, the suffix has been extended to those Greek loan-verbs that originally contained a different suffix, e.g. rum-in- ‘to destroy, break, damage, spoil’ from Greek rim-az- ‘to ravage’. Dialect comparison suggests that the suffix -in- was originally specialized for non-perfective adaptation of some transitive loan-verbs in Romani (Matras 2002: 130). In Selice Romani, however, it has developed into a general, aspect- and valency-neutral, verb-adaptation marker.²⁰ Nonce loan-verbs from Slovak or Czech show a distinct pattern of morphological adaptation: their infinitive stems get adapted by the Hungarian-origin adaptation suffix -ál-,²¹ in addition to the regular adaptation suffix -in-, e.g. sledov-ál-in- ‘to observe, follow’ from Slovak/Czech sled-ov-a-.

In adjectives, the distinction between xenoclitic and oikoclitic inflection, which is attested in most Romani dialects and reconstructable for Early Romani (e.g. Boretzky & Igla 2004: 112–113), has been lost due to internal analogical developments in all South Central dialects of Romani, including Selice Romani (cf. Elšík et al. 1999: 334, Elšík & Matras 2006: 329). All borrowed adjectives – i.e. not only those borrowed from Selice Romani’s pre-Greek contact languages – now inflect exactly like indigenous adjectives and employ the former oikoclitic inflectional suffixes. In loanwords from pre-Hungarian contact languages, these inflections are suffixed directly to the inflectional stem of their source adjective, e.g. Selice Romani žut-o ‘yellow’ from Serbo-Croatian žut. In loanwords from Hungarian, on the other hand, the suffixation of the indigenous inflections to the source adjective’s inflectional stem is mediated by overt and dedicated adaptation suffixes of South Slavic origin, e.g. Selice Romani kík-n-o ‘blue’, ke […]

¹⁹ The regular unrounding of the front rounded vowels is also a characteristic ethnolectal feature of some Selice Romani speakers’ Hungarian. Some Selice Romani–Hungarian bilinguals thus lack the front rounded vowels in both of their primary languages, while for others unrounding is an L1-internal adaptation process. In addition, there is some interesting lexical and sociolinguistic variation with regard to unrounding in the latter group of speakers: certain loanwords tend to retain the front rounded vowels, and some speakers tend to retain them in more loanwords than others. It seems that the lack of phonological adaptation in Selice Romani functions as a sociolinguistic marker of a kind of prestige associated with success in the non-Romani society.

²⁰ The Greek-origin suffix *-(V)s-, which appears to have been the marker of perfective adaptation of all loan-verbs and of non-perfective adaptation of intransitive loan-verbs (Matras 2002: 130), has acquired novel functions in Selice Romani (cf. Elšík 2007).

²¹ Although Kenesei et al. (1998: 357–358) describe the Hungarian suffix -ál- as a de-nominal verb-deriving marker, their examples show that it is in fact a verb-adapting suffix, which is synchronically distinct from the de-nominal verb-deriving suffix -(V)l.

[…] [l/w]. The above-mentioned change could also have been an internal one after borrowing.

¹² In some cases there probably also existed a German dialectal form with -a- (e.g. laben). Then a borrowing without the mentioned sound adaptation is possible, too.

10. Loanwords in Lower Sorbian


Elektriz-ität ‘electricity’) and -tion (konfirma-cija < NHG Konfirma-tion ‘confirmation (as initiation ceremony)’); note that the suffixes -ita and -cija were themselves borrowed from Upper Sorbian in order to replace non-integrated forms like konfirmazion or elektrizität. With respect to the morphological system, verbs are generally integrated by replacing the German suffix -en with Lower Sorbian -owaś or some other variant (-aś, -iś, -nuś), e.g. rejt-owaś < NHG reit-en ‘to ride’, trjef-iś < NHG treff-en ‘to meet’. German verbs with the ending -ieren are generally taken over with the suffix -(ěr)owaś:¹³ reg-ěrowaś < NHG reg-ieren ‘to rule or govern’. For adjectives, one of the two possible suffixes -ny or -ski is used (of course, with all possible endings with respect to gender, number and case): gropny < NHG grob ‘rough’, identiski < NHG identisch ‘identical’ (not in the subdatabase). German nouns of different genders ending in a consonant were usually assigned feminine gender with the ending -a in Lower Sorbian: bucht-a < NHG Bucht (F) ‘bay’, fryštuk-a < NHG Frühstück (N) ‘breakfast’, štrump-a < MLG strump (M) ‘sock or stocking’. The same suffix is also added to nouns with the endings -e or, less frequently, -en: lamp-a < NHG Lamp-e (F) ‘lamp’, krag-a < NHG Krag-en (M) ‘collar’. In some cases this process was combined with the addition of the diminutive suffix -k: e.g. muter-k-a < NHG Mutt-er ‘mother’, ner-k-a < MLG ner-e ‘kidney’. In other borrowings, instead of the simple feminine ending -a the suffix -wa was used (without additional meaning): rat-wa < NHG Ratt-e ‘rat’, łat-wa < NHG Latt-e ‘lath’, tint-wa < NHG Tint-e ‘ink’ (not in the subdatabase). In today’s Lower Sorbian most of these and some other processes are largely conventionalized, mainly with respect to internationalisms, e.g. -ija < -ie (garant-ija < Garant-ie ‘guarantee’), -ura < -ur (cenz-ura < Zens-ur ‘censorship; grade’), -izm < -ismus (central-izm < Zentral-ismus ‘centralism’)¹⁴ (cf. Starosta 1991–1992: 33–4).

5.2. Loanwords from Upper Sorbian

Some of the main adaptation processes of Upper Sorbian loanwords in Lower Sorbian are described in Pohontsch (2002: 291–6). In this paper we will concentrate on the changes that are relevant with respect to the loanwords included in the subdatabase.

5.2.1. Sound adaptations

Because of the more similar sound systems in Lower and Upper Sorbian, there is often no sound adaptation at all (e.g. LSo/USo policija ‘police’, LSo/USo tysac ‘thousand’). Nevertheless, there are several important differences between the two sound systems, which in other cases make some adaptation processes necessary.

¹³ Today almost only -ěrowaś is used, while mainly in the second half of the 20th century there were also parallel forms with -owaś, e.g. eksistowaś vs. eksistěrowaś (Starosta 1991: 56).

¹⁴ In the Lower Sorbian dialects, we mostly find the borrowing of -ismus without any morphological adaptation.

Hauke Bartels

5.2.1.1. Substitution of vowels

In the loanwords in the subdatabase, the most frequent adaptation is the replacement of the front vowel [i] by the central [ɨ] in combination with the replacement of the palatalized consonant [t2] (cf. footnote 15) from Upper Sorbian by the non-palatal [ts] in Lower Sorbian (cf. §5.2.1.2): cysło ‘the number’ < USo čisło, cytaś ‘to read’ < USo čitać, pśicyna ‘the cause’ < USo přičina.

5.2.1.2. Substitution of consonants

The above-mentioned replacement of Upper Sorbian č by Lower Sorbian c is necessary because of the historical sound change č [6] > c [7] only in Lower Sorbian (Schaarschmidt 1998: 113–116); cf. the examples in §5.2.1.1. Generally, historical sound changes in (one of) the Sorbian languages have led to the need for adaptation. In Upper Sorbian [g] changed to [h] before the end of the 14th century (Schaarschmidt 1998: 94–7). An example of the corresponding adaptation process is góźina ‘hour’ < USo hodźina. The Proto-Slavonic palatalized consonant *t’ changed to [6] in Sorbian (cf. footnote 15), but further to [2] only in Lower Sorbian (Schaarschmidt 1998: 118–22; Mucke 1891: 197–198). In płaśizna ‘price’ < USo płaćizna, we see the adaptation that was necessary afterwards. Whereas [;] has phonemic status in Upper Sorbian, it is only an allomorphic variant of [5] in Lower Sorbian (after sibilants). So in most cases Upper Sorbian dź has to be replaced by Lower Sorbian ź: wóźidło ‘rudder’ < USo wodźidło, góźina ‘hour’ < USo hodźina.

6. Grammatical borrowing

Due to the long-standing and intensive language contact, Lower Sorbian has also been influenced by German in its grammar (Faßke 1997: 1795). Here we can give only some typical examples. The most remarkable contact-induced phenomena in Lower Sorbian are (1) the regular use of indefinite and definite articles (Bayer 2006: 123–134); (2) the borrowing of a word formation model with verbal particles, e.g. sobuźěliś ‘to inform, to communicate’ < NHG mit-teilen [with-divide] (Bayer 2006: 171–245); (3) the borrowing of the dynamic canonical passive construction with the auxiliary wordowaś ‘to become’ < dialectal German warden (NHG werden) and of the indirect passive with krydnuś ‘to get’ < Early New High German krigen (Bayer 2006: 272–9, Bartels 2008). Word order and the system of verbal aspects are also influenced by German (Starosta 1991–1992: 90, Faßke 1997: 1795).

15 This describes the historical state. The two further existing sounds [6] and [t&] merged into one sound in Upper Sorbian later, which unlike Lower Sorbian [6] is described as only palatalized: [t&j].

10. Loanwords in Lower Sorbian

Acknowledgments

My greatest thanks go to Manfred Starosta, who in countless discussions was a great help not only in finding but also in judging possible counterparts of the LWT meanings. I also thank Gunter Spieß and Fabian Kaulfürst for helpful comments on an earlier version of this text, and Franziska Schulze, a former student of Sorbian languages, who assisted in preparing a first collection of the Lower Sorbian counterparts during an internship at the Sorbian Institute in Cottbus/Chóśebuz, Germany.

Special Abbreviations

LSo   Lower Sorbian
MHG   Middle High German
MLG   Middle Low German
NHG   New High German
OHG   Old High German
USo   Upper Sorbian


References

Bartels, Hauke. 2008. Konkurrierende Passivkonstruktionen in der niedersorbischen Schriftsprache: Ein Beispiel für Sprachwandel durch Purismus [Competing passive constructions in the Lower Sorbian literary language: An example for language change through purism]. In Kempgen, Sebastian & Gutschmidt, Karl & Jekutsch, Ulrike & Udolph, Ludger (eds.), Deutsche Beiträge zum 14. Internationalen Slavistenkongress Ohrid 2008, 27–38. München: Sagner.

Bayer, Markus. 2006. Sprachkontakt deutsch-slawisch: Eine kontrastive Interferenzstudie am Beispiel des Ober- und Niedersorbischen, Kärntnerslovenischen und Burgenlandkroatischen [German-Slavic language contact: A contrastive study using the examples of Upper and Lower Sorbian, Carinthian Slovene and Burgenland Croatian]. Frankfurt am Main: Lang.

Bielfeldt, Hans Holm. 1933. Die deutschen Lehnwörter im Obersorbischen [German loans in Upper Sorbian]. Leipzig: Harrassowitz.

Bielfeldt, Hans Holm. 1975. Die Entlehnungen des Sorbischen aus dem Deutschen im 16. Jahrhundert [The loans of Sorbian from German in the 16th century]. Zeitschrift für Slawistik 20(3):303–363.

Bielfeldt, Hans Holm. 1977. Die ältesten nicht mehr gemeinslawischen Entlehnungen des Nordwestslawischen aus dem Deutschen. Zeitschrift für Slawistik 22(4):431–454.

Bielfeldt, Hans Holm. 1978. Sorbische Entlehnungen aus dem Deutschen in der Zeit des Fränkisch-Karolingischen Reiches [Sorbian loans from German in the time of the Frankish-Carolingian Empire]. Slavia Orientalis 27(2):219–228.

Bielfeldt, Hans Holm. 1982. Sorbisch-deutsche Lehnwortforschung 50 Jahre später [Sorbian-German loanword research 50 years later]. Zeitschrift für Slawistik 27(1):13–19.

Brankačk, Jan & Mětšk, Frido (eds.). 1977. Geschichte der Sorben [History of the Sorbs]. Vol. 1: Von den Anfängen bis 1789. Bautzen: Domowina-Verlag.

Faßke [Faska], Helmut. 1998. Serbšćina. Opole: Uniwersytet Opolski, Instytut Filologii Polskiej.

Faßke, Helmut. 1990. Sorbischer Sprachatlas [Sorbian language atlas]. Vol. 13: Synchronische Phonologie. Bautzen: Domowina-Verlag.

Faßke, Helmut. 1997. Deutsch-Sorbisch [German-Sorbian]. In Goebl, Hans & Nelde, Peter H. & Star

Common Yeniseian
    Proto-Ket-Yugh
        Ket (… 200 speakers today)
        Yugh (extinct 1970s)
    Proto-Kott-Assan
        Kott (extinct 1850)
        Assan (extinct 1800)
    Proto-Arin-Pumpokol
        Arin (extinct 1730s)
        Pumpokol (extinct early 1800s)

Figure 1: Documented members of the Yeniseian language family

The Ket-Yugh subgroup (Northern Yeniseian) is obvious from ample lexical and grammatical homologies, as is the close connection between Kott and Assan. The position of Pumpokol is more difficult to assess. This language probably forms an early branch with Arin, as presented above; however, it may be that Arin and Pumpokol form separate primary nodes, a possibility that cannot be excluded given the scanty documentation of both languages. Because some Yugh material was misidentified as Pumpokol in the early attestations, identifying genuine Pumpokol forms can sometimes be difficult. The fullest and most accessible account of data known from the extinct members of Yeniseian can be found in Werner (2005).

Today the Ket as an ethnic group number around 1200, but fewer than 200 can be regarded as fluent speakers. Exhaustive sociolinguistic surveys conducted by the ethnographer V. P. Krivonogov during the past two decades (Krivonogov 1998, 2003) attest to the rapid and apparently irrevocable language shift to Russian among the ethnic Ket, as well as to a rise in inter-ethnic marriages and the beginnings of a sort of Ket diaspora, in which over 200 ethnic Ket have now left their native Turukhansk District to reside in other parts of the Russian Federation. Most fluent speakers of Ket are older than 50. As shown in Map 1, the location of villages where concentrations of Ket speakers reside today is generally farther north than the forests the Ket and other Yeniseian tribes inhabited during the 1600s, when Russians first made contact with them.

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Vajda, Edward J. 2009. Ket vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1030 entries.

Edward J. Vajda

Map 1: Ket in its geographical context

18. Loanwords in Ket


The labels -ses, -!et, and so forth in Map 2 provide a rough approximation of areas located outside of the documented area inhabited by Yeniseian speakers that nevertheless contain river names based on cognates of the Ket word for river, ses, or water ul. These vast areas presumably represent places of former habitation of linguistic relatives of the Ket prior to the Russians’ arrival in Siberia after 1582. In many cases, the substrate river names appear to be closely related to one of the known Yeniseian languages: Ket (-ses, -sis), Yugh (-"es), Kott (-!et), Assan (-"et), Arin (-sat), Pumpokol (-dat, -tat). The widespread hydronymic formants -tys or -ty!, represented in the river name Irtysh and in the names of many smaller rivers in Western Siberia, may attest to a distinct branch of Yeniseian that otherwise disappeared without a trace. Because the hydronyms north of Mongolia and west of Lake Baikal are dialectally the most diverse, this general area likely represents the geographic origin of the Yeniseian-speaking tribes. The ethnonym Ket was adopted only in the 1930s, based on the word k#'t ‘person, human being’. Prior to this time, the Russians called the Ket “Yenisei Ostyak”, hardly distinguishing them from their linguistically unrelated neighbors to the west, the Selkup (formerly the “Ostyak-Samoyed”) and the Ugric-speaking Khanty (formerly known simply as “Ostyak”). In tsarist times, the Russians generally referred to all of the West Siberian forest people as “Ostyaks” of some sort, a term whose origin remains unclear; cf. Georg (2007: 11–15) for the most authoritative discussion of Yeniseian ethnonyms. Most Ket people today live in small villages on the middle reaches of the Yenisei River or its tributaries. The largest concentration – about 250 – is to be found in Kellog Village on the Yelogui River, though only a minority of these are fluent speakers. 
This village, like most locations inhabited by the Ket, is accessible to the outside world only by boat (in summer) or helicopter (year round). The Ket, as well as their documented linguistic relatives, were the last hunter-gatherers of North Asia outside the Pacific Rim. Having no domesticated animals besides the dog, the Yeniseian tribes had been pushed northward out of south Siberia by pastoral peoples such as the Yenisei Kirghiz. Even before the coming of the Russians the Ket had experienced centuries of encroachment from the reindeer-breeding Enets to the north and the Evenki to the east, as attested in Ket folklore. The Southern Ket, however, had formed a sort of social alliance with the Selkup-speaking reindeer-breeders to the west.

1 Ket-related hydronyms of Siberia include additional minor variations (sis ~ ses ~ sas, set ~ sat, det ~ dat, etc.) not shown in Map 2 that are difficult to connect with specific Yeniseian languages or dialects, since they appear to reflect nothing more than pronunciation adjustments on the part of the peoples who took over the given territory from Yeniseian speakers. South Siberian Turkic speakers, for example, probably harmonized vowel quality (e ~ a) to match the articulation of the preceding vowel in many cases. Also not shown are areas with river names ending in -tym, -tom, -sym, etc., which are of unknown origin but tend to be prevalent in areas known to have been inhabited by Yeniseian tribes in the 1600s. Also not shown are toponyms in -tes, also conceivably Yeniseian, though no documented Yeniseian language shows this pronunciation of the word for river. Cf. Werner (2006: 148–156) for more detail on the distribution of early Yeniseian peoples and their cultures.


Map 2: Location of contemporary speakers of Ket (shown in black) and of Yeniseian groups in 1600 as well as Yeniseian substrate river names (marked by labels such as -ses)

All three dialects of Ket are rapidly disappearing today. Northern Ket was reported to have only a single speaker in 2006, though a second fluent speaker has since been identified. Attempts to write Ket using a Latin script based on Central Ket in the 1930s or a Cyrillic script oriented toward the Southern Ket dialect in the 1990s did little to reverse this trend, though basic lessons in Ket language continue to be given in the first few grades of primary school in Kellog and a few other villages even today. While most ethnic Ket spoke their language fluently and used Russian, at most, as a second language even as late as the 1920s, the events of the Soviet period irrevocably placed Ket on the path toward oblivion. During the 1930s the Ket were collectivized and forced to live alongside Russians and other Native Siberian minorities in the riverside villages where they currently reside, leading to a general adoption of Russian for interethnic communication. During the 1960s the Ket were forced to give up their children to boarding-school education where a Russian-only rule was vigorously enforced. This led to general language shift by the younger generations. By the time a new policy of ethnic education was adopted in the 1980s, leading to the creation of elementary language textbooks in the 1990s, most Ket children entered primary school speaking little or no Ket. As a rule, neither their parents nor even their schoolteachers were sufficiently fluent in Ket to pass it on as a native tongue. A few hours a week of elementary-school lessons of


Ket as a second language could not reverse the overwhelming trend toward language replacement by Russian. Today, it is generally only older adults, especially those born before the early 1960s, who retain strong fluency in their ancestral tongue. Even among this group there are no monolingual Ket speakers. For a concise overview of the history of Ket people and of the scholars who have studied them, see Vajda (2001).

2. Sources of data

The first substantial publication of Yeniseian vocabulary came in 1858, with the posthumous appearance of Finnish linguist Mathias Castrén’s “Yenisei-Ostyak” grammar (Castrén 1858), which contained lists of words with their German translations. The “Ostyak” materials in this work primarily represent Yugh rather than Ket. This first Yeniseian grammar also contains the only extensive collection of Kott vocabulary and grammatical forms, as Castrén was the last scholar to work with native speakers of Kott. Earlier recordings of Yeniseian vocabulary – brief word lists of Arin, Pumpokol, Assan, Kott, Yugh and Ket taken down by explorers during the 18th century – long remained accessible only through visits to the archives in Moscow, Leningrad or other places in the Soviet Union where they were housed (Vajda 2001: 341–351). Tomsk linguist Andreas Dulson gathered the data from these disparate sources and published them together for the first time, though in a regional periodical difficult to obtain outside of Russia (Dul’zon 1961). Fortunately, Heinrich Werner, a linguist from Tomsk who is now based in Bonn, Germany, has recently published a full compilation of all 18th century Yeniseian language documentation (Werner 2005). Werner’s monograph includes not only the materials published earlier by Dul’zon (1961), but also two vocabulary lists (one Arin, the other Pumpokol) newly discovered in the 1980s by Moscow linguist Eugen Helimski (Xelimskij 1986). Werner has also republished Castrén’s 19th century Kott vocabulary, together with Kott words recorded in the 18th century, in a Kott-Russian glossary appended to his Kott grammar (Verner 1990: 284–394). Unfortunately, this work remains largely inaccessible, as it was printed in only 250 copies by a regional university (Rostov University). No comprehensive dictionary of either Yugh or Kott has yet been published.
Fortunately, all of the extant words from the extinct Yeniseian languages, including Castrén’s Kott dictionary materials, have recently been published together with all of the Ket and Yugh vocabulary gathered during the 20th century. This magnum opus is Heinrich Werner’s three-volume Comparative Dictionary of the Yeniseian Languages, a work written in German with English and Russian glossaries at the end of the third volume (Werner 2003). At present, this dictionary can be regarded as the authoritative publication of all recorded Yeniseian vocabulary. Not only did Werner gather together all material on the extinct Yeniseian languages, he also greatly expanded the rather scant earlier publications of Ket vocabulary. At the close of the 20th century substantial compilations of Ket words were limited to three publications (cf. footnote 2). The first was a glossary of Central Ket published in German in a book about Ket ethnography (Donner 1955: 15–111). The second was a short dictionary and morpheme list of Southern Ket that appeared in a volume largely devoted to Ket texts and folklore (Krejnovič 1969: 22–90). The third was a Ket-Russian/Russian-Ket elementary-school pedagogical dictionary with 4,000 Ket lexemes, based on Southern Ket (Verner 1993). Amazingly, no comprehensive dictionary of Ket appeared during the 20th century. The present study employs Werner (2003) as its basic source, supplementing it with new fieldwork among the remaining native speakers of Ket. In some cases, new words were discovered; in many other cases, it was confirmed that Ket lacks any word for a given item. This was particularly common for words denoting features of the natural world not present in central Siberia (‘palm tree’, ‘mainland’, ‘elephant’, and the like), as well as many items of modern culture and society with which Ket speakers had never come into contact (‘judge’, ‘oath’, ‘to convict’, etc.). Judgments about recent Russian loanwords into Ket listed in Werner (2003) were also elicited from these native speakers, and in a number of cases brought to light interesting facts about the sociolinguistic status of these items. These new findings will eventually contribute to the publication of two major new works on Yeniseian lexicon, both of which are currently in preparation under the sponsorship of the Linguistics Department of the Max Planck Institute for Evolutionary Anthropology, Leipzig. The first is a comprehensive Ket-Russian-English-German dictionary of words gathered from all three Ket dialects re-elicited in idiomatic context from the remaining native speakers (Kotorova ed. 2009+). The second is an etymological dictionary of Yeniseian aimed at explaining, whenever possible, the origins of all known Yeniseian vocabulary, including loanwords (Vajda & Werner 2009+).
The present study has both informed and been informed by both of these projects.

2 Vajda (2001: 374) provides an exhaustive list of publications of Ket and Yugh vocabulary since the second half of the 19th century. These include several shorter lists, as well as a few pamphlets written to introduce Ket vocabulary in elementary school classes.

3. Contact situations

3.1. Introduction

As isolated bands of hunter-gatherer-fishers, the Ket evolved a vocabulary uniquely suited to their taiga and riverine environment. Up until the 20th century the Ket had little intensive contact with other linguistic groups, since they lived as small mobile bands in a vast northern forest. Most Ket words show no sign of borrowing and quite a number of them are semantically rather unique. There are nouns

conveying special attributes of northern ecology: at"#tli$ %ks ‘a lone tree of one species in a pure stand of another species’, h&lis ‘small raised mound in the tundra’, ta'o ‘swampy, treeless area in the taiga’, s(lgup ‘point of land jutting out into a small river’, etc. Many words express details of forest life: )ráq ‘spring camp’, itá$ ‘distance traveled between two encampments’ (< * ‘day’ + tà$ ‘drag’), imt#t ‘to harvest pine nuts’, t)+t ‘swarms of bloodsucking insects (a major feature of forest life in the brief summer)’, lilgej ‘the crunch of snow under moving sled runners’, q)'j ‘large piece of birchbark used to cover the summer tent’, etc. Characteristic words and phrases express key aspects of Ket spiritual culture: s#ni$ ‘shaman’, h,s ‘shaman’s drum’, all"-l ‘female guardian spirit image’, ulvéj ‘the primary soul from among the seven spirits associated with each person’. Fire was conceived as a feminine-class animate being: b('k d.+p ‘fire burns’ (literally, ‘fire, she-eats’). The Ket used specialized, taboo-related vocabulary during their Bear Ceremony, an ancient tradition featuring the ritualized slaughter and consumption of a bear thought to be the reincarnation of a human relative; for example, hukt#$ are ‘bear eyes’, while d#stá$ are eyes of other animals or people. A rich inventory of spatial adverbs expresses specific types of orientation with regard to rivers or lakes and forested land: igda ‘from the forest to the riverbank’, &tá ‘from water to shore’, a.á ‘from shore to forest’, #tá ‘movement on foot upriver along the ice’, etc. These adverbs can be incorporated into motion verbs. 
Some adjectives build classificatory distinctions involving animacy: suk$ ‘thick (said of a tree)’, b%l ‘fat (person or animal)’, and b&sl ‘fat, thick (object)’; ka't ‘old, elderly (animals, people)’, qà or qa' ‘old, big, grown up (said of children, young adults)’, and s*n ‘old (object or person; also said of large trees)’; kitéj ‘young (animals, people)’ and ki' ‘new (object or plant)’. Some verbs have suppletive stems for animate- and inanimate-class subjects: d*n ‘he (person or animal) stands’ [du-k-a-in 3MASC.SBJ-erect-PRES-stand], du.ata ‘it (a masculine-class tree) stands’ [du-h-a-ta 3MASC.SBJ-area-PRES-extend], ujb('ut ‘it (a movable, inanimate-class object) stands’ [uj-b-a-qut at.rest-3INAN.SBJ-STATE-occupy.position]. Certain nouns describing natural phenomena are more elaborately classificatory than is typical of most Eurasian languages: b#'s ‘falling snow’, t*k ‘layer of fallen snow on the ground’, t(qpul ‘layer of fallen snow on branches’; also, huut ‘animal tail’, hi's ‘bird tail’, h(ráp ‘fish tail’. But certain kinship terms are surprisingly generic with regard to gender (bis"-p ‘brother, sister’, q*p ‘uncle, aunt’, qàl ‘grandchild, niece, nephew’), especially given the fact that Ket marriages were traditionally patrilocal and arranged on the basis of two exogamous phratries, called h(.(-tpul (< h% ‘same’ + a't ‘bone’ + h)l ‘accumulation’). As far as can be ascertained, none of this specialized vocabulary

3 The phonemic prosody in the Ket examples is transcribed as follows: a macron denotes high-even tone (%ks ‘tree’); an apostrophe denotes abrupt tone ending in glottal constriction (b('k ‘fire’); a grave accent denotes falling tone (ùs ‘birch tree’); an acute accent denotes rising pitch on a second syllable (h(ráp ‘fish tail’); the lack of any tone mark on disyllabic or polysyllabic words indicates an initial syllable pitch peak (s#ni$ ‘shaman’); finally, a double vowel denotes rising-falling tone on a geminate vowel (huut ‘animal tail’). The forms given are from the Southern Ket dialect unless otherwise noted.
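The conventions in this note amount to a small decision procedure, which can be sketched in Python. This is an illustration only, not part of the transcription system: the function names, the vowel inventory, and the assumption that tone marks are encoded as Unicode combining diacritics are all hypothetical choices made for the example.

```python
import unicodedata

# Combining marks assumed to encode the tone diacritics (an assumption).
MACRON = "\u0304"
GRAVE = "\u0300"
ACUTE = "\u0301"
# Hypothetical vowel inventory for this sketch; extend as needed.
VOWELS = set("aeiouɨəʌɛɔ")


def base_letters(form: str) -> list:
    """Return the letters of a form with all combining marks stripped."""
    decomposed = unicodedata.normalize("NFD", form.lower())
    return [c for c in decomposed if not unicodedata.combining(c)]


def syllable_count(form: str) -> int:
    """Count maximal runs of vowel letters as a rough proxy for syllables."""
    count, in_run = 0, False
    for ch in base_letters(form):
        is_vowel = ch in VOWELS
        if is_vowel and not in_run:
            count += 1
        in_run = is_vowel
    return count


def tone_label(form: str) -> str:
    """Map a transcribed form to the tone named in the note above."""
    decomposed = unicodedata.normalize("NFD", form.lower())
    if MACRON in decomposed:
        return "high-even"
    if "'" in decomposed or "\u2019" in decomposed:
        return "abrupt"
    if GRAVE in decomposed:
        return "falling"
    if ACUTE in decomposed:
        return "rising on second syllable"
    letters = base_letters(form)
    # A geminate (doubled) vowel marks rising-falling tone.
    if any(a == b and a in VOWELS for a, b in zip(letters, letters[1:])):
        return "rising-falling"
    if syllable_count(form) >= 2:
        return "initial pitch peak"
    return "unspecified"  # unmarked monosyllables are not covered by the note


for word in ["ùs", "huut", "teslá", "ho'p", "igda"]:
    print(word, "->", tone_label(word))
```

The checks are ordered so that an explicit diacritic always takes precedence over the vowel-doubling and syllable-count tests; unmarked monosyllables are reported as unspecified because the note does not cover them.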


is borrowed, though some of it could involve areal metaphoric diffusion. For example, Mongolian groups also refer to kinship lineages using the word jas ‘bone’. Yeniseian vocabulary bears no clear genealogical affinity with other North Asian families. Contact with other peoples of Eurasia, however, either directly or through the mediation of neighboring tribes, has produced several layers of loanwords. By far the largest layer results from recent Russian contact. A much smaller set of loanwords derives from contact with the Samoyedic-speaking Selkup reindeer breeders, who were the western neighbors of the Ket, or from diffusion north from the Turco-Mongol world of the steppes much farther south. The few attested loanwords that originated from the steppes, e.g., tal)-n ‘flour’ (cf. Halh Mongolian talxan), may have diffused into Ket via other languages of the taiga. It is possible that some words inherited from Common Yeniseian by Ket were borrowed from Turkic or Uralic at some very great time depth, but these are difficult to trace, and the direction of borrowing, if it occurred, could have been from rather than into Yeniseian. Yeniseian words for ‘birchbark’, ‘birch tree’, ‘reindeer’, and ‘falling snow’ bear some resemblance to words in Uralic, while Yeniseian ‘stone’ resembles Turkic words for stone.

3.2. Contact with Russian

Cossacks and other Russian-speaking adventurers began to infiltrate the middle reaches of the Yenisei watershed less than two decades after Yermak’s successful invasion across the Urals in 1582. The Ket and other Yeniseian-speaking peoples were soon incorporated into the fur-tax (yasak) system. Yasak entailed regular payment of sable and other pelts by the natives to a local representative of the Tsarist government, the voyevoda, who, as a rule, established a base camp in the form of a fort (ostrog) on some convenient riverway. Since the Ket were nomadic hunters, contact with Russians in this early period was limited to a few brief encounters every year, when yasak was delivered. In general, Ket groups tried to avoid the Russians for fear their kinsmen would be kidnapped as a means of coercing regular yasak payments. The southern Yeniseian peoples were more immediately affected by the Russian presence, since they found themselves torn between fur-tax obligations to the Russian newcomers and to the Turco-Mongol polities of the forest-steppe fringe. In the taxation tug of war that developed, such peoples as the Arin and Pumpokol were devastated by reprisals taken against them by the Tatars for submitting to the Russian fur tax system. By 1735 the Arin as a distinct ethnic community had all but disintegrated. By 1800 the Assan and Pumpokol likewise melded with the local Russian or Turkic populations and their languages disappeared. The Kott lasted until at least the 1840s, when Mathias Castrén worked with the last five known native speakers. Social, geographic and linguistic data on the extinct Yeniseian peoples can be found in Dolgix (1960) and Werner (2005). Another factor that decimated all of the tribes of the Yenisei watershed to some significant degree was the introduction of European diseases (Alekseenko 1967: 26).


Recurrent smallpox epidemics during the course of the 17th century (notably in 1627–28 and again during the period 1654–1682) all but wiped out the fisherfolk along the middle Yenisei, with the riverine Yugh especially hard hit. Although Yugh continued to be spoken by a few elderly people up to the early 1970s (Heinrich Werner, p.c.), already by the mid-19th century the tribe had decreased to several dozen individuals from an original population of probably ten times that number. Some of the Ket hunting groups, though affected by the same epidemics, fared somewhat better, as their mobile upland lifestyle took them away from close contact with the Russians and others living in the riverside zones hardest hit. The Ket were likewise fortunate in living far enough northward on the Yenisei so as to be out of range of reprisals by steppe peoples bent on keeping their subjects from submitting to the Russians. In fact, after the coming of the Russians, the Ket gradually relocated considerably farther upstream along the Yenisei. For most Ket groups, contact with the Russians continued to be limited to times when separate family hunting parties emerged from the forest onto the riverbank during the spring to fish and pay their fur tax. The sporadic nature of Ket contact with the Russians remained little changed until the 1930s, and relatively few words from Russian were taken into the language in this initial period. Early loanwords include trade items such as teslá ‘adze’ (< Russian teslo ‘adze’), kurúk ‘hook’ (< Russian kr/uk ‘hook’), and postóp ‘glass bottle’ (< Russian stopka ‘shot glass’). There are also a few terms relating to Christianity, e.g., ho'p ‘priest’ (< Russian pop ‘parish priest’), though the Ket did not adopt the new religion but instead retained their traditional spiritual culture into the 20th century. Direct linguistic borrowing, however, was the exception rather than the rule, even for new realia.
Rather, the Ket showed a more marked tendency to coin native terms for new objects, concepts, or social categories. A typical example of these neologisms is bogdóm ‘gun’ (< Ket bo'k ‘fire’ + q,m ‘arrow’). Ket interaction with Russians underwent a drastic revolution as a result of Stalin’s collectivization campaign of the 1930s, which forced the Ket and other Native Siberians to settle in Russian-style villages where they came increasingly under pressure to deal with spoken Russian on a regular basis. During the 1960s the Soviet government intensified its policy of forcing Ket families to give up their children to Russian-language boarding schools. This seems to have triggered the crucial breaking point in transmission of the language, as Ket children born after the 1960s rarely learned fluent Ket. Older native speakers, however, continued to use Ket with relatively little influence from Russian, preferring instead to coin neologisms based on native morphological material, such as 0. suul ‘iron sled’ for ‘automobile, truck’. Nevertheless, the majority of Russian loans seem to date after the period of collectivization.


3.3. Contact with other Siberian peoples

The Yeniseian languages spoken to the south of Yugh and Ket, all of which became extinct before massive Russian influence could affect them, show loans from South Siberian Turkic, especially in the realms of stockbreeding, farming, or metallurgy: Kott bal ‘cattle’, bagar ‘copper’, !ero ‘beer’; Kott/Assan tabat ‘camel’, kulun ‘foal’, araka ‘wine’; Assan talkan ‘flour’, alton ‘gold’; Arin ogus ‘bull’, bugdai ‘wheat’, kajakok ‘butter’, etc. A few Turkic loans even name natural phenomena, e.g., Kott/Assan boru ‘wolf’, attesting to the pervasive Turkic influence on later stages of these languages; cf. Ket q1t ‘wolf’ and Yugh X1t ‘wolf’, terms presumably inherited from Proto-Yeniseian. The contact situation for Ket and Yugh, the northern Yeniseian languages, is quite different, since these tribes were not in direct association with stockbreeding peoples of the steppes. Rather, the Ket in their taiga home lived in desultory proximity to reindeer-breeding tribes on all sides. The Nenets and Enets groups to the north, as well as the Evenki tribes pushing into the Yenisei watershed from eastern Siberia, tended to be adversarial toward the Ket. Contact was sporadic and generally hostile, with few or no identifiable loanwords into the Ket dialects from Nenets, Enets, or Evenki. A rare exception is so.uj ‘sokui’, an Evenki word in Northern and Central Ket for a type of pullover jacket without a hood (cf. Alekseenko 1967: 138). The situation with the Selkup was different, since the Ket developed friendly relations with this tribe and even exchanged marriage partners after the traditional inter-Ket exogamous phratry system collapsed in the wake of smallpox epidemics. Selkup loans in Ket are somewhat more common and include the ethnonym la'q ‘Selkup’, a word that means ‘friend’ in Selkup, symbolizing the close relations between Ket and Selkup peoples. 
There are also loans relating to domesticated reindeer (qobd ‘castrated reindeer’, ollas ‘reindeer calf’, ka.li ‘reindeer sled’), with some Ket in the Yelogui River area (near present-day Kellog Village) even adopting reindeer breeding by the early 20th century. Other words shared between Ket and Selkup were more likely borrowed in the other direction, notably Selkup aqlalta ‘guardian spirit image’. This word is only found in the Selkup dialect spoken adjacent to Ket and likely derives from an earlier pronunciation of Ket all"-l ~ allalt ‘guardian spirit doll’ (the disappearance of the final -ta, which appears to have been a native Ket nominalizing suffix, gave rise to the final stress in the first variant). Xelimskij (1982: 238–239), conversely, interprets this word as a Selkup loan into Ket which derives from a nominalization of the Selkup verb ‘to amaze’, an etymology unlikely on semantic grounds. A few loanwords in Ket were likely borrowed through Selkup or Turkic and ultimately derive from more distant sources. One is kan"á ‘(smoking) pipe’, a word of Chinese origin found in many Native Siberian languages. Another is Ket/Yugh na'n ‘bread’, which might represent a Wanderwort of Iranian origin, though it might just as likely be a nursery word.

18. Loanwords in Ket


4. Numbers and kinds of loanwords in Ket

4.1. Introduction

The subdatabase for Ket contains 1018 words, alongside 443 gaps, most involving concepts irrelevant or unknown to Ket speakers and therefore lacking any dedicated lexical designation. Most lexical gaps involve items alien to the traditional world of taiga hunter-gatherers. These include exotic realia such as ‘palm tree’, ‘elephant’, ‘beech tree’, ‘kangaroo’, etc., as well as technological concepts or social categories typical of stratified sedentary society: ‘battery’, ‘axle’, ‘judge’, ‘jury’, ‘birth certificate’, and so forth. Other gaps involve cases where Ket lacks a superordinate term that would correspond to a general category typically designated by a lexeme in other languages, such as ‘weapon’, ‘tool’, ‘age’, ‘plant’. A number of the completed entries represent super-counterparts – single lexical items used to express two or more basic meanings. One example is ba'$, the Ket noun used to refer to the concepts ‘earth’, ‘land’, ‘soil’, as well as ‘time’. Another is bis"#p, a generic word for ‘sibling’ that can be used to mean either ‘brother’ or ‘sister’. Finally, a number of lexical gaps unfortunately result from insufficient information about Ket vocabulary. Among the coded forms in the subdatabase, only 78 show clear evidence of having been borrowed. In most of the remaining 940 cases, there is little or no evidence for borrowing, and the word must be considered as belonging to native Ket vocabulary. While in a majority of these cases the words in question were recorded by linguists only during the mid 20th century, a comparison of core Ket vocabulary with that of the documented extinct Yeniseian languages (most notably Kott and Yugh) suggests that virtually all basic Ket words are of native provenance. In the case of the clearly borrowed items, the age of most of them can be surmised based on what is known historically about episodes of language contact. 
The overwhelming majority of clearly attested loanwords (72 out of 78) derive from Russian, with most of these acquired by Ket during the 20th century. Early Russian loans are defined as words incorporated into Ket before the 1930s, when the Ket were forced to settle down in Russian-style villages and began to communicate in Russian on a regular basis. These early loans can be identified on the basis of their more complete phonological adaptation (about which cf. §5 below), or their meanings (i.e., they refer to Tsarist era categories such as ‘priest’). In addition, there are a few cases where early Russian loans into Ket were actually attested during the 19th century. It should be noted that some modern Ket words may ultimately derive from early Turkic or Uralic loans into ancient Yeniseian, though such a possibility is difficult to verify. One such word is Ket q)+nt ‘ant’, possibly associated historically with Proto-Finno-Ugric *ku23e ‘ant’ (Xelimskij 1982: 244). Another is Ket bo'q ‘bag net’, apparently connected with Selkup *pok ‘bag net’ (Alekseenko 1967: 62). In such cases, however, it is not possible to determine with certainty whether we are dealing with a chance resemblance or, in the case of a genuine loanword, to determine the direction or time of borrowing. Conversely, some pan-Yeniseian terms of basic vocabulary, such as ‘stone’ (Ket t)'s, Yugh ")'s, Kott !i!, Arin kes), are more


Edward J. Vajda

likely to be the source of early loans into Common Turkic (cf. Proto-Turkic *ta! ‘stone’). The dialectal differentiation of the Yeniseian words vis-à-vis the Turkic form suggests that, if the resemblance is more than simply chance, then it was Turkic that borrowed the word from Yeniseian, presumably from a Yeniseian language with initial *t. In summary, there are no incontrovertible examples of basic Ket content words (body parts, kinship terms, words for basic actions and the like) originating as direct loans from another language. Nor do borrowed nouns, adjectives, or verbs from Russian belong to the core vocabulary.

4.2. Loanwords by semantic word class

Table 1 shows the breakdown of loanwords from the four attested source languages into Ket by semantic word class. The decimal values indicate instances where a native synonym exists for a given loanword.

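Table 1: Loanwords in Ket by donor language and semantic field (percentages)

                    Russian   Total loanwords   Nonloanwords
Nouns                 12.3         13.6             86.4
Verbs                  4.0          4.0             96.0
Function words         6.1          6.1             93.9
Adjectives             3.5          3.5             96.5
Adverbs                  -          0.0            100.0
all words              8.9          9.7             90.3

The remaining donor columns (Selkup, Evenki, Mongolian, Chinese) have entries only in the rows Nouns and all words. The four value pairs are 0.7 and 0.4, 0.3 and 0.2, 0.2 and 0.1, and 0.2 and 0.1; their assignment to the individual languages cannot be recovered from the layout.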

The vast majority of loanwords are nouns; loan nouns make up about 14% of all nouns in the subdatabase. Loan verbs are much rarer and are limited to the borrowing of Russian infinitives or nouns incorporated into the Ket verb complex in the morpheme position normally reserved for nominal forms: da-deldu.abet ‘she shares it’ (< Russian delit’ ‘to share’), da-kerasin-ata.it ‘she rubs him with kerosene’ (< Russian kerosin ‘kerosene’). Therefore, in a sense, even these verb-related loans are nominal in nature.

4.3. Loanwords by semantic field

A breakdown of percentages of loanwords in the 24 semantic fields represented in the subdatabase likewise reflects the predominance of Russian loans in comparison to loans attested from other families.

Table 2: Loanwords in Ket by donor language and semantic field (percentages)

     Semantic field                      Russian   Total loanwords   Nonloanwords
 1   The physical world                     4.9          4.9             95.1
 2   Kinship                                1.5          1.5             98.5
 3   Animals                                6.7          8.3             91.7
 4   The body                               3.3          3.3             96.7
 5   Food and drink                        14.6         16.5             83.5
 6   Clothing and grooming                 13.0         13.0             87.0
 7   The house                             23.1         23.1             76.9
 8   Agriculture and vegetation            12.8         19.2             80.8
 9   Basic actions and technology          10.0         11.7             88.3
10   Motion                                 6.0          6.0             94.0
11   Possession                            24.1         24.1             75.9
12   Spatial relations                      6.3          6.3             93.7
13   Quantity                               2.9          2.9             97.1
14   Time                                   9.9          9.9             90.1
15   Sense perception                         -          0.0            100.0
16   Emotions and values                    5.6          5.6             94.4
17   Cognition                             13.8         13.8             86.2
18   Speech and language                      -          0.0            100.0
19   Social and political relations         9.7         19.4             80.6
20   Warfare and hunting                    3.5          3.5             96.5
21   Law                                   30.8         30.8             69.2
22   Religion and belief                   15.4         15.4             84.6
23   Modern world                          54.8         59.5             40.5
24   Miscellaneous function words             -          0.0            100.0
     all words                              8.9          9.7             90.3

The remaining donor columns (Selkup, Evenki, Mongolian, Chinese) are sparsely filled. Their surviving values, column by column, are 1.8, 9.7, 4.8, 0.4; 1.7, 1.7, 0.2; 3.2, 0.1; and 3.2, 0.1 (the last value in each column being the figure for all words); their assignment to the individual languages cannot be recovered from the layout.

As can be seen from Table 2, loanwords are scattered widely across the semantic spectrum. A relatively larger number of loanwords belong to the categories Food and drink (a total of 8 loans), The house (6), Possession (6), and Animals (5). Unsurprisingly, these are all semantic fields involving realia with which the Ket came into regular daily contact only after the sedentarization campaign of the 1930s. Even in these categories, it must be noted, the majority of new items encountered by the Ket after their adoption of a Russian village lifestyle received names based on native Ket neologisms rather than borrowing or even calquing based on Russian, if they received any dedicated nominalization at all. For example, alongside Ket sa'j ‘tea’, a loanword deriving earlier from either Russian "aj or Mongol tsai, other drinks received native Ket nominalizations. Vodka came to be referred to as b(.ul (< b('k ‘fire’ + 4l ‘water’), and coffee was called q&li$ 4l ( > > > > >

‘electricity’ ‘to sow’ ‘to thresh’ ‘to grind grain’ ‘the gun’ ‘to read’

‘afternoon’

lunch.GEN

mas ar#"–ta              rastitel’noe maslo       ‘(vegetable) oil’
tree butter–POSS.3SG     plant butter

eder kihi                molodoj čelovek          ‘young man’
young man                young man

Copies from Siberian languages

One item was probably copied into Sakha from Ket (t#" ‘boat’), and one item was probably copied from Selkup (mel'i ‘often, always’); however, it is not at all clear exactly how, when or where these items entered the Sakha language. Fourteen items (1%) were probably or clearly copied from Northern Tungusic languages. Seven of these are clearly copied only from Evenki, because the Even word is different; five could have been copied from either Evenki or Even, because the model word is identical in the two languages; and two more words have similar cognates in Even and Evenki, but the Sakha form is closer to the Evenki form: Sakha lap&a"n ‘the fin’, Evenki la"p&a", Even (ap&a ‘fin, tail’; Sakha so$o", Evenki so$o, Even ho$, so$ ‘to cry’. We classified all of these items as having been copied from Evenki for three reasons: (i) More copies (including less certain copies not discussed here) can be traced only to Evenki and not to Even. (ii) Contact with Evenks was more widespread than with Evens, who are settled to the northeast of the Sakha. (iii) Influence on Sakha grammatical structure can be traced to Evenki rather than Even (cf. §6). The copies from Evenki predominantly come from the domain of natural phenomena, e.g. %o&o ‘valley’, tura"% ‘crow’, but include a few cultural items such as mam#kta ‘lasso’ or untu" ‘fur boots’. Only one verb was copied from Evenki (so$o" ‘to cry, weep’), and so was one body part term: t#$a ‘the lung’. Furthermore, the

19. Loanwords in Sakha (Yakut)


word mas ‘wood, tree, firewood, tree trunk’ is probably copied; however, the model language is uncertain. It has been suggested as being a copy from Mongolic modu(n) ‘tree’ as well as from Evenki mo" ‘tree’ (Kałużyński 1995 [1962]: 205; Stachowski 1995).

4.3. Copies from Mongolic

A very large number of probable or clear copies (148, i.e. approximately 11%) are from Mongolic languages; amongst these, one can be traced only to Khalkha, three have only Buryat cognates, and one is clearly comparable only to a Kalmyk model (cf. Appendix). However, this number does not accurately reflect the degree of copying that speakers of Sakha undertook from Mongolic, since a number of items in the Loanword Typology meaning list are derived in Sakha from items copied from Mongolic and so are not included in this count. Thus, 29 items are verbs or nouns derived from lexemes probably or clearly copied from Mongolic, or are compounds containing Mongolic copies. Following the guidelines of the Loanword Typology project, these were not classified as copies, since they were derived in Sakha and so represent Sakha innovations. Furthermore, seven items were classified by us as “perhaps copied” not because there are any doubts about the ultimately copied origin of the word, but because they might have either been copied directly from Mongolic (in which case they would have been classified as ‘clearly copied’), or they might have been derived from a copied word (in which case they would have been classified as showing ‘no evidence of copying’). These items are oyu"r ‘the woods or forest’, sime% ‘the ornament or adornment’, soruk ‘the intention’, seherge" ‘to tell’, ma!an ‘white’, üges ‘the custom’, and anda!ay ‘to swear’. If we add these 36 items to the count of Mongolic copies in Sakha, we find a total of 184 (13%) Mongolic copies in Sakha. The Mongolic copies come from a wide variety of semantic domains: natural phenomena, kinship terms, body part terms, terms for tools and household items, as well as words describing more abstract concepts such as form and size, time, and thoughts and ideas. However, in the semantic field Food and drink only four copies are from Mongolic, three of them verbs; and no words dealing with clothing were copied from Mongolic. 
It is furthermore notable that 53 verbs are copied from Mongolic languages (38 of these were classified as “probably” or “clearly copied”, while 15 were the basis of derivation of a nominal or different verb included in the Loanword Typology meaning list), i.e. more than one fourth of the Mongolic copies are verbs. This unexpectedly high number might be explained by the similar agglutinative structure of Mongolic and Sakha, and especially by the fact that in both languages the bare stem of the verb functions as a categorical imperative, making verb stems easy to recognize and easy to integrate into the recipient language.
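The tallies given for the Mongolic copies can be checked with a few lines of arithmetic. The following Python sketch uses only the counts quoted in this section; the variable names are illustrative, and because the text does not state which total underlies "more than one fourth", both plausible denominators are computed.

```python
# Counts quoted in the text for Mongolic copies in the Sakha subdatabase.
clear_or_probable = 148    # "probable or clear copies"
derived_or_compound = 29   # items derived in Sakha from Mongolic copies
perhaps_copied = 7         # items classified as "perhaps copied"

added_items = derived_or_compound + perhaps_copied
total_mongolic = clear_or_probable + added_items

# Verbs: 38 classified as copied, 15 serving as the base of a derivation.
verbs_classified = 38
verbs_as_base = 15
verbs_total = verbs_classified + verbs_as_base

# Assumption: two readings of "more than one fourth of the Mongolic copies".
share_of_all = verbs_total / total_mongolic          # 53 of 184
share_of_clear = verbs_classified / clear_or_probable  # 38 of 148

print(added_items, total_mongolic, verbs_total)
print(f"{share_of_all:.1%}", f"{share_of_clear:.1%}")
```

On either reading the proportion of verbs lies above one quarter, so the statement holds regardless of which total is meant.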


Brigitte Pakendorf and Innokentij N. Novgorodov

4.4. Copies from Russian

The largest number of copies in Sakha are from Russian: 240 items (17%) are clearly copied, one further word might be copied (%ar&# ‘money’, which formally appears to be copied from dialectal Russian xarči ‘food’, although the semantic shift is hard to explain), and three Russian copies appear in compounds. Furthermore, 11 Sakha verbs are derived from copied Russian nouns, and 14 expressions may be calques from Russian. Thus, if one includes all the copies from Russian found in the subdatabase (excluding, however, the calques), there are 255 copies (approximately 18%) from Russian. Not surprisingly, most of these are connected with items that were introduced through Russian contact, such as bi"lke ‘the fork’, ki"ne ‘the film/movie’, or mass#"na ‘the car’. However, there are some interesting cases concerning words for human relationships, such as the loanblend ma&a%a i!e ‘stepmother’, a compound of the Russian word mačexa ‘stepmother’ and the Sakha word i!e ‘mother’. This is nowadays restricted to stepmothers; previously, however, the compound also occurred with the Sakha word for ‘father’, a!a, and then denoted a stepfather. The Russian word for ‘widow’, vdova, was copied into Sakha as ogdo"bo with a meaning of both ‘widow’ and ‘widower’; probably this copy was made because the Sakha word tula"ya% ‘orphan’ used to have a meaning of both ‘orphan’ and ‘widow(er)’, so that the copied word served to make a distinction between the kinds of bereft family members. The Russian word brat ‘brother’ was initially copied with a general meaning of ‘brother, sister’ without the specification of relative age; nowadays, however, it has acquired a meaning specifically of ‘younger brother’ and is replacing the indigenous Sakha terms ini ‘younger brother of a boy’ and surus ‘younger brother of a girl’.

4.5. Synonyms

A number of copied words are synonymous with other words. In most cases (over 70) they are synonymous with what appear to be inherited words, while in nearly 40 cases synonymous items were copied from different languages, or even from the same language. Copies from Russian often denote the specifically western-style item that was introduced through contact, in contrast with Sakha words that denote the traditional Sakha items, e.g. sele)ppe ‘(Russian-style) hat’, which coexists with bergehe ‘(Sakha-style fur) hat’, or ustu)l ‘chair (with back)’ vs. oloppos ‘stool, Sakha-style chair’. Interestingly, however, the Russian copy kömülüök is used to designate the traditional open fireplace of the Sakha, while the inherited Turkic word oho% designates the Russian-style brick or iron stoves and even modern electric stoves used for cooking. Although in some cases there are slight differences in meaning between the Turkic word and the Mongolic copy (e.g. copied oyu)r ‘forest near the village’ vs. inherited t#a ‘forest further away’, or copied öbügeler ‘ancestors’ vs. törütter ‘roots, ancestors’), a number of copies from Mongolic appear to be direct synonyms, such


as copied so!uo vs. inherited ur ‘goitre’, copied &ö!ö&ök vs. inherited tö$ürges ‘tree stump’, copied tökürüy vs. inherited ie% ‘to bend’, or copied kelgiy vs. inherited ba"y ‘to tie’. These items may have been copied due to the higher social status the Mongols had during the Mongol Empire, when an apparent knowledge of Mongolic may have conferred some prestige on Sakha speakers. A number of synonyms were copied first from Mongolic and then again from Russian, or from Russian in pre-Soviet times and then later in Soviet times. For example, the following synonymous pairs were copied first from Mongolic and later from Russian: kieli, matka ‘womb’; sülühün, 'a"t ‘poison’, tiergen, telgehe (both from Mongolic), olbuor ‘yard’, kem, birieme ‘time’, na"r, kuru"k (as well as mel'i copied from Selkup) ‘often, always’, kere, k#rah#abay ‘beautiful’, 'e$ke, &uolkay ‘clear’, soruk, s#al ‘intention’, and %oruy, eppiet ‘the answer’ (included in the subdatabase in the derived verbs %oruyda", eppiette" ‘to answer’). Items copied from Russian during pre-Soviet times and later from Russian in Soviet times include %ortuopuy, %ortuoska ‘potatoes’, #sta"n, bürü"kke ‘trousers’, %oruobuya, k#r#"sa ‘roof’, la"pp#, ma!ah#"n ‘the shop’, and deriebine, böhüölek ‘the village’. Interestingly, three different words for ‘the temples’ (&e&egey, &an&#k, and &ab#r!ay) were copied from Mongolic, as well as two different words each for ‘to damage’ (al'an, ültürüy) and ‘to rescue’ (örühüy, ab#ra"), for ‘the witness’ (tuohu, kerehit) and ‘the magic’ (ap, %omuhun). Whether this might be an indication that the ancestors of the Sakha were in contact with speakers of different Mongolic dialects is unclear; models for all of these words are found in Written Mongolian.

5. Integration of lexical copies

5.1. Phonological integration

Copies from Mongolic and older copies from Russian are adapted to Sakha phonology, while modern copies from Russian retain their Russian form in Sakha speech, at least among speakers who are fluent in Russian; e.g. Russian avtobus, Sakha avtobus ‘the bus’. Some older Sakha speakers, however, who are still monolingual in Sakha, may adapt these items phonologically, e.g. as otuobus ‘the bus’. Vowels are changed to follow Sakha vowel harmony rules, e.g.

(4) a. Mongolic                                   Sakha
       &o!&ay ‘to rise, to loom; to squat’        &o%&oy ‘to crouch’
       do!ula$                                    do!olo$ ‘lame’
    b. Russian                                    Sakha
       kartoška                                   %ortuoska ‘potatoes’
       tarelka                                    terielke ‘plate’


Consonant clusters in Russian are broken up through prothetic or epenthetic vowels (5), and stressed syllables in Russian are copied with a long vowel or diphthong in Sakha (6). The Russian labial fricatives f and v, which do not exist in native Sakha words, are replaced by the stops p and b (7). Word-initial p and g are missing in Sakha, so that in copies of Russian words these consonants are changed to their counterparts b and k or %, respectively (8).

(5)–(8)  Russian       Sakha

         spička        ispi"ske      ‘matches’

         svad’ba       s#ba"yba      ‘wedding’
         vrač          b#ra"s        ‘doctor’

         kraska        k#ra"ska      ‘the paint’

         xleb          kiliep        ‘bread’
         vremja        birieme       ‘time’

         konfeta       kempiet       ‘chocolate, sweet’

         velosiped     belasiped     ‘bicycle’
         vybor         b#"bar        ‘election’³

         pyl’          b#"l          ‘dust’

         posëlok       böhüölek      ‘settlement, village’
         gorod         kuorat        ‘town’
         grabli        k#ra"b#l      ‘rake’
         gazeta        %ah#at        ‘newspaper’

In copies from Mongolic that contain sequences of vowel-voiced velar-vowel, the velar consonant (g or !) is lost and the vowels undergo diphthongization or lengthening (Kałużyński [1962] 1995: 55–63), e.g.:

(9)  Mongolic                              Sakha
     kögemey                               küömey ‘throat’
     kegeli                                kieli ‘womb’
     xada!asun ‘nail, peg, spike’          %ata"h#n ‘latch, door-bolt’
     sana!a                                sana" ‘thought’

³ Actually, the Russian word vybor means ‘choice’; ‘election’ is the plural form of this, vybory.
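The consonant replacements described under (7) and (8) are regular enough to be sketched as a small function. The Python sketch below is a deliberate simplification: the function name adapt_consonants is hypothetical, only the substitutions f > p, v > b, initial p > b, and initial g > k are modelled, and vowel harmony, vowel length, epenthesis, and the alternative outcome of initial g are ignored. Its output is therefore an intermediate string and not an attested Sakha form; only the consonants it changes should be compared with the examples above.

```python
def adapt_consonants(russian: str) -> str:
    """Apply only the consonant replacements described in (7) and (8)."""
    # (7) The labial fricatives f and v are replaced by the stops p and b.
    word = russian.replace("f", "p").replace("v", "b")
    # (8) Word-initial p and g of the Russian model become b and k.
    #     The check uses the original word, so an initial f is not
    #     changed twice. The second outcome of initial g is not modelled.
    if russian.startswith("p"):
        word = "b" + word[1:]
    elif russian.startswith("g"):
        word = "k" + word[1:]
    return word


for model in ["velosiped", "konfeta", "gorod", "posëlok"]:
    print(model, "->", adapt_consonants(model))
```

For velosiped the sketch yields belosiped, which differs from the attested belasiped only in a vowel; for gorod and posëlok only the initial consonants k and b agree with the attested kuorat and böhüölek, since all vowel adaptations are left out.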

5.2. Morphological integration

Lexical copies in Sakha are morphologically well integrated in that they can take the same derivational and inflectional suffixes as native words. Thus, a large number of Sakha adjectives and verbs are derived from nouns copied from Mongolic or Russian; for example kergennen ‘to marry (from a man’s perspective)’ is derived from the noun kergen ‘spouse’, which was copied from Mongolic, while &eyde" ‘to drink tea’ is derived from &ey ‘the tea’, which was copied from Russian. Copies of Mongolic verbs ending in a vowel are occasionally adopted without any further changes (10a), but more often they take a final glide (for the intransitive form), e.g. (10b). Russian verbs, on the other hand, are mostly copied from the imperative form and integrated with the help of the most frequent Sakha verbalizing suffix -la" (and its variants), e.g. (10c). The Russian word prostit’ ‘to forgive’, however, is integrated not with the verbalizing suffix, but with the help of the auxiliary g#n ‘to do’, i.e. Sakha b#rast#" g#n.

(10) a. Mongolic                                      Sakha
        abura ‘save, rescue; help, protect’           ab#ra" ‘to rescue’
        tölü ‘compensate, pay off’                    tölö" ‘to pay’
     b. daba ‘climb, ascend’                          dabay ‘to go up (a mountain)’
        ergi ‘to turn’                                ergiy ‘to turn’
     c. Russian                                       Sakha
        mešat’, mešaj ‘disturb.IMP’                   mehey-de" ‘to disturb, bother’

Some copies of plural forms in Russian are treated as non-plural forms in Sakha, such as Russian cvetki ‘flowers’, Sakha sibekki ‘flower(s)’, which can take the plural suffix -ler, or Russian grabli ‘the rake’ (which is morphologically plural), which in Sakha appears in the non-plural form k#ra"b#l.

6. Grammatical copying

Although Russian is the model language for most of the lexical copies found in the subdatabase, the impact of Russian on the structure of Sakha appears to have been negligible. This may appear surprising given the dominance of Russian as the language of education, political administration, and the widely received mass media, but it can be accounted for by the fact that a number of Sakha speakers are still monolingual. Thus, Russian names for items adopted through contact with Russians have made their way into the language at the same time as the items made it into modern Sakha culture, but the grammar of rural speakers has not yet been influenced.


The structural influence of Mongolic on Sakha has been stronger than the influence of Russian. Thus, the (somewhat archaic) use of ikki ‘two’ as a coordinating device was copied from Mongolic (Kałużyński [1962] 1995: 154):

(11) Sakha:
     a!a-m             i!e-m              ikki
     father-POSS.1SG   mother-POSS.1SG    two
     ‘my father and mother’ (Uwarowskij’s Erinnerungen, sentence 13)

     Mongolic:
     Ba"tar   Dor*   xoyer
     Baatar   Dorj   two
     ‘Baatar and Dorj’ (Kullmann & Tserenpil 2001: 299)

Likewise, the extension of the Sakha Dative case to include locative as well as allative functions can be traced to Mongolic influence. Furthermore, initial Mongolic influence might have played a role in the retention of distinct Comitative and Instrumental cases in Sakha, and also in the development of the Turkic Locative case to a Partitive case (Pakendorf 2007: 120–201). Surprisingly, in contrast to the relative paucity of lexical copies from Evenki, Evenki structural influence on Sakha has been quite strong, especially in the nominal case system of Sakha. Thus, the loss of the Turkic Genitive case, the development of an indefinite accusative meaning of the Partitive case, and possibly the retention of the Comitative-Instrumental distinction can all be traced to Evenki influence. Furthermore, the development of a Future Imperative in Sakha as well as extended uses of the possessive suffixes can be explained by Evenki contact influence (Pakendorf 2007: 95–270).

7. The results of genetic studies

Molecular anthropological analyses of the Sakha and neighboring populations (Pakendorf et al. 2006) confirm the hypothesis that the Sakha immigrated to their current territory from the south. Furthermore, the genetic homogeneity of the Sakha population is in good accordance with their relatively recent spread over the widespread area they inhabit nowadays, originating from a fairly small area on the middle Lena. The Y-chromosomal data show signs of a very strong and recent bottleneck of the paternal side of the population followed by an expansion; this can be dated to approximately 900 ± 440 years before present. A less dramatic founder event visible in the mitochondrial DNA (which is inherited solely in the maternal line) can be dated to 1,300 ± 800 years ago (Pakendorf et al. 2006). Thus, if these expansions were caused by the same event (e.g. the migration of the Sakha ancestors to the north), this would have taken place between 700 and 1500 CE, in reasonably good agreement with the archaeological data that point to a migration north in the 13th or 14th century.
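The window of 700 to 1500 CE follows from intersecting the two dating intervals. The Python sketch below reproduces that step; the reference year of 2006 for "before present" is an assumption (the publication year of the genetic study), as the text does not state which reference year was used, and the names are illustrative.

```python
REFERENCE_YEAR = 2006  # assumption: publication year of Pakendorf et al.


def interval(mean: int, error: int) -> tuple[int, int]:
    """Return the (youngest, oldest) age in years before present."""
    return (mean - error, mean + error)


y_chromosome = interval(900, 440)   # paternal bottleneck
mitochondrial = interval(1300, 800)  # maternal founder event

# If both signals reflect one event, it must lie in the overlap.
overlap = (
    max(y_chromosome[0], mitochondrial[0]),
    min(y_chromosome[1], mitochondrial[1]),
)

earliest_ce = REFERENCE_YEAR - overlap[1]
latest_ce = REFERENCE_YEAR - overlap[0]

print(y_chromosome, mitochondrial, overlap)
print(earliest_ce, latest_ce)
```

Under this assumption the overlap of 500 to 1,340 years ago corresponds to roughly 666 to 1506 CE, which rounds to the range of 700 to 1500 CE given above.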


There is no evidence of admixture in the paternal line between the Sakha and the indigenous inhabitants of Yakutia (Yukaghirs, Evenks, or Evens), although there is clear evidence that Sakha men married into a subgroup of western Evens. Although admixture from Evenks or Evens cannot be entirely excluded in the maternal line (due to widespread sharing of mitochondrial DNA sequence types between South Siberian Turkic-speaking groups, Sakha, Evenks, and Evens), there is no conclusive evidence for such admixture, and it cannot have been substantial. Thus, there is no genetic evidence for a language shift of entire groups of Evenks (i.e. both men and women) to Sakha. Furthermore, there does not appear to have been large-scale intermarriage in the maternal line. This makes the finding of structural influence from the Evenki language quite intriguing, since it cannot be accounted for by substratum influence or large numbers of linguistically mixed households. An explanation might be that the group of Sakha ancestors that initially migrated to Yakutia was very small, as shown by the severely reduced genetic diversity on the Y-chromosome; this small group of immigrants may have been dependent on their indigenous neighbors in the initial period after their migration to an area with a much harsher climate than that in their South Siberian homeland (Pakendorf 2007: 317–323).

8. Conclusions

It is clear from the above discussion that the Sakha have been open to contact with speakers of other languages throughout a large part of their history. With a total of nearly 29% copied lexemes found in the subdatabase, Sakha can be classified as a heavily copying language (cf. Bakker & Mous (1994: 5), who suggest that “extreme borrowing never exceeds 45%”). This large number of lexical copies is due in part to the inclusion in the Loanword Typology meaning list of lexical items pertaining to the modern world, which not unexpectedly were copied from Russian. However, Sakha has not copied only words for new items: a large number of verbs were copied from Mongolic languages, as were terms for body parts and kin. Furthermore, a very large domain of items known to have been copied from Mongolic, namely descriptive verbs, was not included in the subdatabase, so that the number of total Mongolic copies in Sakha was probably somewhat underestimated in this study; thus, Popov (1986: 8) and Rassadin (1980: 65) count between 2,000 and 2,500 words of Mongolic origin in the Sakha language. Taking the number of 6,200 lexical roots contained in Pekarskij’s dictionary ([1907–1930] 1958–1959) as the basis for calculations, one would arrive at a proportion of 30–40% of Mongolic copies in Sakha. Intriguingly, Sakha has copied only a relatively small number of items from Evenki, while it appears to have undergone noticeable structural influence from this language. This kind of structural influence is indicative of bilingualism of the Sakha ancestors in Sakha and Evenki (Winford 2005: 376f). This is a somewhat surprising finding, given that there is no conclusive evidence for large-scale Evenk genetic


admixture in the Sakha, and given that today the Sakha language is the dominant language in Yakutia, where Evenks and Evens are at least bilingual in Sakha, if they have not shifted to this language entirely. A last interesting finding of this study is the number of verbs copied from Mongolic: more than one quarter of the Mongolic copies detected in this study are verbs. This very high number can be explained by the fact that the Mongolic languages and Sakha are typologically similar, facilitating the transfer of verbs from one to another. Furthermore, it can be explained by the fact that the Sakha were the socially and politically subordinate group, while the Mongolic tribes were the politically dominant group during the Mongol Empire. Thus, Haugen (1950: 224) finds that Swedish and Norwegian immigrants in the USA, who constitute clear minority groups in a socially and politically dominant culture, have copied between 18% and 23% of verbs from American English.
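The two headline proportions in these conclusions can be retraced from figures quoted in the chapter. The Python sketch below is only an arithmetic check: it uses the rounded percentages given for the clearly or probably copied items from Northern Tungusic (1%), Mongolic (11%), and Russian (17%), and the dictionary-based counts cited above; the variable names are illustrative.

```python
# Rounded shares of the subdatabase quoted in section 4 (percent).
rounded_shares = {
    "Northern Tungusic": 1,
    "Mongolic": 11,
    "Russian": 17,
}
total_share = sum(rounded_shares.values())

# Dictionary-based estimate: 2,000 to 2,500 words of Mongolic origin
# against 6,200 lexical roots in Pekarskij's dictionary.
dictionary_roots = 6200
mongolic_low, mongolic_high = 2000, 2500
share_low = mongolic_low / dictionary_roots
share_high = mongolic_high / dictionary_roots

print(total_share)
print(f"{share_low:.1%} to {share_high:.1%}")
```

The rounded shares add up to 29%, and the dictionary-based estimate comes out at roughly 32% to 40%, in line with the range of 30–40% stated above.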

Acknowledgments

We gratefully acknowledge financial assistance by the Max Planck Society, as well as RFBR grant No. 03–06–96033 ! 2003 "!#$%#"_" to Novgorodov. Furthermore, Pakendorf thanks the inhabitants of Tabalaax for their hospitality and support, and Elizaveta Migalkina for her time and patience.


References Afanas’ev, P. S. & Voronkin, M. S. & Alekseev, M. P. 1976. Dialektologi&eskij slovar' jakutskogo jazyka [Dialectological dictionary of the Sakha language]. Moscow: Izdatel’stvo ‘Nauka’. Afanas’ev, P. S. & Xaritonov, L. N. (eds.). 1968. Russko-jakutskij slovar' [Russian-Yakut dictionary]. Moscow: Izdatel’stvo ‘Sovetskaja $nciklopedija’. Alekseev, Anatolij Nikolaevi". 1996. Drevnjaja Jakutija: Neolit i epoxa bronzy [Ancient Yakutia: The Iron Age and the Medieval Epoch]. Novosibirsk: Izdatel'stvo Instituta Arxeologii i Etnografii SO RAN. Anikin, Aleksandr Evgen’evi". 2003. +timologi&eskij slovar’ russkix zaimstvovanijax v jazykax Sibiri [Etymological dictionary of the Russian borrowings in the languages of Siberia]. Novosibirsk: ‘Nauka’. Antonov, N.K. 1971. Materialy po istori&eskoj leksike jakutskogo jazyka [Materials on the historical lexicon of Yakut]. Yakutsk: Jakutskoe kni#noe izdatel’stvo. Bakker, Peter & Mous, Maarten. 1994. Introduction. In Bakker, Peter & Mous, Maarten (eds.), Mixed Languages: 15 Case Studies in Language Intertwining. Amsterdam: IFOTT. Böhtlingk, Otto (ed.). 1964 [1851]. Uwarowskij’s Erinnerungen. In Über die Sprache der Jakuten: Grammatik, Text und Wörterbuch [On the language of the Jakuts: Grammar, text and dictionary] (Indiana University Publications, Uralic and Altaic Series 35). The Hague: Mouton & Co. Cincius, Vera Ivanovna (ed.). 1975. Sravnitel’nyj slovar’ tunguso-man’&*urskix jazykov: Materialy k ,timologi&eskomu slovarju [Comparative dictionary of the Tungusic languages: Materials for an etymological dictionary]. Vol. 1: A–*. Leningrad: Izdatel’stvo ‘Nauka’, Leningradskoe otdelenie. Cincius, Vera Ivanovna (ed.). 1977. Sravnitel’nyj slovar’ tunguso-man’&*urskix jazykov: Materialy k ,timologi&eskomu slovarju [Comparative dictionary of the Tungusic languages: Materials for an etymological dictionary]. Vol. 2: O–$. Leningrad: Izdatel’stvo ‘Nauka’, Leningradskoe otdelenie. Dolgix, Boris Osipovi". 1960. 
Rodovoj i plemennoj sostav narodov Sibiri v XVII veke [The th tribal composition of the peoples of Siberia in the 17 century]. Moscow: Izdatel'stvo Akademii Nauk SSSR. ESTJ. +timologi&eskij slovar’ tjurkskix jazykov [Etymological dictionary of the Turkic languages]. ESTJ 1974. Sevortjan, $rvand Vladimirovi". Ob)&etjurkskie i me*tjurkskie osnovy na glasnye [Common Turkic and Middle Turkic roots beginning in vowels]. Moscow: Izdatel’stvo ‘Nauka’. ESTJ 1978. Sevortjan, $rvand Vladimirovi". Ob)&etjurkskie i me*tjurkskie osnovy na bukvu “B” [Common Turkic and Middle Turkic roots beginning with the letter “B”]. Moscow: Izdatel’stvo ‘Nauka’.

518

Brigitte Pakendorf and Innokentij N. Novgorodov

19. Loanwords in Sakha (Yakut)

Appendix of lexical copies

Proto-Indo-European o!us

bull

war/battle, army

larch

Persian (ultimately) bolot

sword

Selkup mel'i

külgeri

lizard

country

&e&egey

temples

&an&#k

temples

&ab#r!ay

temples

sirey

face

mi"le

gums

küömey

throat

%omur!an

collarbone

berbe"key

ankle

kuorsun

feather

kieli

womb

üösket

to beget

so!uo

goitre/goiter poison

saps#n

always, often

'ar!a

boat

Evenki

pain

Mongolic b#r#"

mud

bay!al

sea

'ebere, 'abara

swamp

&a!#l!an

lightning, bolt of lightning

sülühün s#la"yb#t

tired

salg#n

air

tara!ay

bald

tölön

flame

do!olo$

lame

&o%

embers, charcoal

'üley

deaf

umuruor

to extinguish

balay

blind

kergen

husband, wife

&a&ay

to choke

o!on/or

husband, old man

orguy

to boil

meliy

to crush, to grind

Ket t#"

to fan

Khalkha

Finno-Ugric ti"t

scar

doydu Kalmyk

Sanskrit seri"

&er

%o&o

valley

'ü"kte

spring, well

kuta

swamp

&#"&a"%

bird

tura"%

crow

eme"%sin

wife, old woman

lap&a"n

fin

e'i"y

older sister, aunt

ar#g#

wine

t#$a

lung

toyon

yard, court

boot

telgehe

yard, court

ü"te"n

hut

father-in-law, chieftain, master

tiergen

untu"

twins

meeting house

lasso

igireler

sugula"n

mam#kta mutuk&a, mu&ukta

(coniferous) needle

öbügeler

ancestors

%ata"h#n

lock, latch / door-bolt

ayma%tar

relatives

sandal#

table

kumala"n

rug

'on

to cry

dolbu"r, dalb#"r

shelf

so$o"

family, crowd, people

sibien

ghost

süöhü

livestock

ba!ana

post, pole

%oton

stable, stall

%aptah#n

board

/irey

calf

kürüo

fence

%a%%ay

lion

%orut

to plough, plow

tebien

camel

%otu"r

sickle, scythe

üön

insect

%amsa, ga$sa pipe

o"!uy, a"!#y

spider

&ö!ö&ök

mo!oy

snake

Evenki or Mongolic mas

wood, firewood, tree, tree trunk

Buryat mek&irge

owl

tree stump


a&a"%

forked branch

söp

enough

kü"hüle"

rape

sime

sap

kem

time

ap

magic

üle

work

eder

young

magic

tokurut

to bend

%oyut

late

%omuhun, %obuhun

kelgiy

to tie

tietey

to hurry

süge

axe/ax

belem

ready

ültürüt

to break

na"r

always, often

tenit

to stretch

'#l

year, season

telget

to spread out

a$#l#y

to smell

%arba"

to sweep

de%si

smooth

sep

tool

sohuybut

&ü"&&ü

chisel

surprised, astonished

ergiy

to turn around

'ol

eriy

to twist

sana"

sa%s#y

to shake

&o%&oy

to crouch

%altar#y

to slide, to slip

dabay

to go up

bur!al'#

yoke

s#ar!a

sledge/sled

o$o&o

boat

mal

thing

örühüy

to rescue

ab#ra"

to rescue

al'an

to damage

'ada$#

poor

ke&&egey

stingy, greedy

tölö"

to pay

&ep&eki

cheap, light, easy

tobo%

remains

%omuy

to gather

%olbo"

to join

oroy

top

bödö$

big

b#&#ka"n

small

%apta!ay

flat

kö%ö

hook

tögürük

round

ular#y

to change

elbe%

many

Russian b#"l

dust

muora

sea

ak#ya"n

ocean

%oluo'as

spring, well

bu"r!a

storm

pa"r

steam

good luck

ispi"ske

match

grief, idea

s#ba"yba

wedding

kemsin

to regret, be sorry

b#ra"t

uor

anger

younger brother, nephew, cousin

eren

to hope

ma"&a%a i!e

stepmother

buruy

fault, crime

ogdo"bo

widow(er)

kere

beautiful

bostu"k

herdsman

büre

ugly

sibin/e

pig

ite!ey

to believe

&o"sku

pig

ta"y

to guess

bötü"k

cock/rooster

s#lta%

cause

ku"russa

hen

sa"rba%

doubt

&opu"ska

chicken

ta$nar

to betray

%olu"p

dove

sata"

to try

xoruoluk

rabbit

/#ma

manner

kuoska

cat

sibiginey

to whisper

b#la%#

flea

Mel'es

to deny

taraka"n

cockroach

Suruy

to write

kuma"r

mosquito

Omuk

people

&ierbe

worm

Salay

to rule/govern

buobura

beaver

do!or

friend

matka

womb

eye

peace

temperatura

fever

&uguy

to retreat

b#ra"s

physician

%arab#l

guard

'a"t

poison

küögü

fishhook

#ha"r#la"

to roast, to fry

tuohu

witness

&ugu"n

pot

kerehit

witness

köstörü"le

pot

anda!ar

oath

&a"n/#k

kettle


%obordo"%

pan

tü"ppüle

shoe

k#ra"b#l

rake

bülü"de

dish

ba&#"nka

shoe

sieme

seed

terielke

plate

sapp#k#

boot

ku"mna

threshing-floor

mi"ske

bowl

sele"ppe

hat, cap

seliehiney

wheat

ta"s

bowl

siep

pocket

barley

kupsu"n

jug/pitcher

bula"pka

pin

&a"sk#

cup

külü"ske

ring

'ehimien, ne&imien (archaic)

bülü"he

saucer

%oruo$ka

necklace, bead

oruos

rye

luoska

spoon

b#la"t

oats

biri"ppe

knife, razor

kukuruza

maize/corn

bi"lke

fork

iri"s

rice

s#ps#"

tongs

headband / headdress, handkerchief, rag

ebies

ebiet

lunch

kiliep

bread

tieste

dough

mehiy

to knead

mieli$se

mill

%albah#

sausage

sasiska

sausage

%ortuoska

potato

+ortuopuy

potato

erie%e

nut

bieres

pepper

müöt

honey

sa"%ar

sugar

s#"r

cheese

pi"be

beer

b#ra"ga

fermented drink

si"des

cotton

solko

silk

bo"l'o%

felt

b#la"&&#ya

(woman’s) dress

#rba"%#

(woman’s) dress, shirt

boltuo

coat

saro&ka

shirt

'u"ppa

skirt

#sta"n

trousers

bürü"ke

trousers

&ulku, &ukku

sock, stocking

nask#

sock, stocking

sibekki

flower

suokka

brush

lu"k

oak

ma"s

ointment

taba%

tobacco

m#"la

soap

s#ap

chain

sierkele

mirror

mi"n/ik

broom

bala"kka

tent

sibin/e"s

lead

olbuor

yard, court

östüöküle

glass

kulu"p

meeting house

korzina

basket

%oluoda

doorpost

k#ra"ska

paint

külü"s

lock, padlock

buru"s

whetstone

muosta

floor, bridge

teliege

cart, wagon

istiene

wall

kölüöhe

wheel

kömülüök

fireplace

uos

axle

turba

chimney

%ara"b#l

ship

kirilies

ladder

borokuot

ship

ustu"l

chair

uru"l

rudder

ostuol

table

ma&ta

mast

la"mpa

lamp, torch

ba"r#s

sail

bana"r

lamp, torch

'a"k#r

anchor

&üme&i

candle

port

port

%oru"da

trough

man#at

coin

k#r#"sa

roof

suot

bill

%oruobuya

roof

noluok

tax

ostuolba

post, pole

#r#"nak

market

kirpi"&&e

brick

ma!ah#"n

shop/store

ispieske

mortar

la"pp#

shop/store

ba"h#nay

farmer

s#ana

price

ba"h#na

field

kiries

cross

o!uruot

garden

/u"l

zero

%ana"ba

ditch

t#h#"n&a

a thousand


ire"t

part

kirieppes

fortress

ministr

minister

&a"s (1)

part

ata"ka

attack

mili"ssiye

police

pa"ra

pair

bilienney

captive, prisoner

b#"bar

election

birieme

time

%apka"n

trap

a"d#r#s

address

kuru"k

always, often

&a"rka"n

trap

nüömer

&a"s (2)

hour

sokuon

law

number, license plate

&ah#

clock

su"t

court

u"lussa

street

nediele

week

su'uya

judge

po&ta

post/mail postage stamp

benidien/ik

Monday

#stara"p

fine

ma"rka

optuorun/uk

Tuesday

siertibe

sacrifice

atkr#tka

postcard

bank

bank (financial institution)

k#ra"n

tap/faucet

rakovina

sink

tualet

toilet

matara"s

mattress

kensierbe ba"nkata

tin/can

bi"nte

screw

etibierke

screwdriver

b#t#"lka

bottle

kempiet

candy/sweets

plasma"s

plastic

buomba

bomb

böppürüöske

cigarette

%ah#at

newspaper

%alanda"r

calendar

ki"ne

film/movie

muz#ka

music

&ey, &ay

tea

kofe

coffee

serede

Wednesday

#ray

heaven

&eppier

Thursday

a"t

hell

be"tinse

Friday

ara"'#ya

radio

subuota

Saturday

telebi"zer

television

b#rast#" g#n

to forgive

tölüpüön

telephone

k#rah#abay

beautiful

belasiped

bicycle

mu"daray

wise

matass#k#l

motorcycle

u&u"tal

teacher

mass#"na

car, machine

oskuola

school

avtobus

bus

&uolkay

clear

buoyas

train

s#al

intention

sömölüöt

airplane

kuma"!#

paper

'arap#la"n

airplane

uru"&ka

pen

batareya

battery

kinige

book

motuor

motor

baraba"n

drum

neft

petroleum

kuorat

town

bal#"ha

hospital

böhüölek

village

siestere

nurse

deriebine

village

ukuol

injection

k#ran#"ssa

boundary

a&#k#

spectacles/glasses

meheyde"

to prevent

pravitelstva

government

salla"t

soldier

prezident

president

Chapter 20

Loanwords in Oroqen, a Tungusic language of China*

Fengxiang Li and Lindsay J. Whaley

1. The language and its speakers

1.1. Introduction

Oroqen is a northwestern Tungusic language spoken by one of the officially recognized nationalities in the People’s Republic of China. The classification of the Tungusic languages is shown in Figure 1.

Tungusic
    Northern
        Northeastern: Even, Arman
        Northwestern: Evenki, Solon, Negidal, Orok, Oroqen
    Central
        Central-Eastern: Oroch, Udege, Ulch
        Central-Western: Kili, Nanai
    Southern: Manchu, Jurchen

Figure 1: Classification of Tungusic Languages (based on Doerfer 1978)

This ethnic group (referred to as Elunchun in Mandarin Chinese and Oroqen/Orochen/Orochon in the literature) has roughly 7000 members according to the Chinese government census information from 1990. There are only a few dozen fully fluent speakers left, all of them over the age of 65, and perhaps as many as 1000 partially fluent speakers. The Oroqen were primarily hunters and gatherers until the early 1950s, when the Chinese government started to settle them. Their hunting grounds were vast, covering most of the areas between 45 and 55 degrees latitude and between 115 and 135 degrees longitude. The Oroqen language has never had a writing system, so the historical record of the language is limited. The academic community outside of China has tended to treat the Oroqen as a nondescript subdivision in an Evenki complex which spans most of Siberia and northeastern China. It is often assumed, without explicit justification, that claims made about the Evenki in Siberia also hold true of Oroqen, particularly in the realm of language.

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://loanwords.info. It is a separate electronic publication that should be cited as: Li, Fengxiang. 2009. Oroqen vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1138 entries.

1.2. The Current Distribution of the Oroqen

Currently, most of the Oroqen live in nine locations in the Oroqen Autonomous Banner in Inner Mongolia and six locations in four different counties in Heilongjiang Province. Two of the nine locations in Inner Mongolia are county towns, Alihe and Dayangshu, while the other seven are villages, namely, Mukui, Xierteqi, Wulubutie, Chaoyang, Guli, Nantun and Nanmu. The six locations in Heilongjiang Province are all villages: Shibazhan, Baiyina, Xinsheng, Xinxing, Xin’e and Shengli.

Map 1: Geographical setting of Oroqen

Special note should be made of three Oroqen localities. Alihe is the seat of the Oroqen Autonomous Banner. More ethnic Oroqen live in this city than in any other location, but they are scattered about the town and live among the Han majority and other minorities. This makes Alihe, as well as Dayangshu, somewhat different from other Oroqen towns, where a section of the town is or was designated specifically for Oroqen homes. Reliable census statistics are unavailable for the actual number of Oroqen in these two cities. Nuomin (i.e. Xiao’ergou), which lies directly to the south of Alihe, no longer has any fluent Oroqen speakers, and for the most part the Oroqen have fully assimilated into the larger community (cf. Whaley & Li 2000).

In 1953, the communist government of China conducted its first national census, which identified a total of 2256 Oroqen. This statistic reflected a 45% population decline compared with the 4111 Oroqen identified in a 1917 census. Heavy casualties in wars, constant conscription by warring parties, destitute living conditions in many areas during the upheavals of changing powers, devastation resulting from irresponsible gold mining near several of the Oroqen locations in Heilongjiang Province (e.g., large quantities of mercury were used to float non-gold-containing debris in the Wulaga region), and epidemics obviously all contributed to such a sharp decrease in population in those chaotic years.

1.3. Oroqen history

There is basic consensus that the Oroqen originate from various clans that moved south of the Amur River and further into the Greater and Lesser Hinggan Mountains in the middle of the 17th century. Shirokogorov (1929) and Janhunen (1996) both propose that this migration did not occur en masse, but in successive waves of Oroqen clans, and they agree that the migration was facilitated by the fact that when the Oroqen entered the Hinggan Mountains region, no other ethnic groups were using the region for hunting. Furthermore, they agree that the Birarchen, that is, the clan of Oroqen who now have descendants in Xunke and Jiayin counties, were the last Oroqen to enter into China and that some of the distinctiveness of the dialect of this group comes from contact that they had with the Nanai north of the Amur before migrating (see also Doerfer 1983).

The claims of this scenario match well with the current dialect divisions. It seems likely that most or all of the dialects represent a wave of migration by a group of clans. The delineation of the hunting grounds of the loosely associated clans was roughly determined by the rivers they followed. Thus, the Oroqen not only came across the Amur at different times but also made their southward movement by means of several different tributaries.

One complication to this sketchy scenario is that some evidence points to migrations into the Hinggan Mountains region earlier than the mid-17th century. Such migrations may have included Oroqen clans and/or other groups such as the Solon Ewenki and the Hezhen. This is a particularly important possibility when considering the Oroqen who speak the Western dialect. This group is unique in several ways. First, the Western dialect, unlike other dialects, is heavily influenced by the Mongolic language Dagur. Second, the information we have gathered about the history of the group contains recollections of migration that involved the herding of cows from the Bystraya and Gen rivers region.
The rearing of cattle is decidedly not typical for Tungus people generally, but did traditionally occur among Mongolic peoples of northern China.

Until the formation of the People’s Republic of China in 1949, most Oroqen clans were nomadic, though one Oroqen settlement had already formed: Nantun (in Xiao’ergou). After the establishment of the PRC, the government worked to settle the remaining Oroqen clans with promises of health care and food provisions, and by 1958 Oroqen nomadism had ended.

The earliest settlement in Inner Mongolia is Nanmu Township under the jurisdiction of current-day Zalantun Municipality. The Oroqen in this location had hunted along Jiqin River and Amuniu River before settlement in 1949. The Oroqen that hunted in the Kuile River reaches, including its small tributaries such as Kuduqi, the lesser Kuile River, the Wole River, and Chabaqi, settled in Wulubutie (formerly called Gankui). Those whose hunting ground was along the Duobukur River and its tributaries (i.e. Chuifeng, Dayangqi, Xiaoyangqi) settled in Chaoyang. The clans that hunted along the Gan River and its tributaries (such as the Keyi River, the Jiwen River, the Yinga River, Molige and Qigelani) settled first in Nerkeqi, only to later relocate to Wulubutie in 1965 due to railroad construction. The clans that hunted in the Guli River reaches settled in a spot west of the Duobukur River, which is the current-day Guli Township. It should be noted that most of the Oroqen in this location originally migrated from the Aihui area in Heilongjiang Province in or after 1947. Some came because their relatives were already in Guli, and others relocated to this region due to their involvement in the Japanese campaigns in Heilongjiang Province against the Soviet Red Army and the Chinese Communist Party’s Eighth Route Army. At that time, the Oroqen Autonomous region was still relatively isolated, and there were no other ethnic groups in that area. The vast space and being surrounded by their own people undoubtedly gave them a sense of security.

The westernmost Oroqen villages have the most complex evolution. The four groups of Oroqen (about 300 members in all) hunting along the Gen River, the Hailar River, the Kudur River, and the Yituli River migrated to the Nuomin River area en masse in 1956. Those groups that hunted in the Nuomin River basin settled in Longtou in 1958. Later, due to frequent flooding, they moved to Simuke (i.e. s!m!lk!) in 1959 and relocated a third time in 1961 to their present location, Xierteqi, a village in the current-day Tuozhamin Township (i.e. Tuohe). The remaining Oroqen, who were hunting along the Nemen River and the Tuo River, settled in Mukui.
The local government in Heilongjiang Province started their rigorous efforts to settle the Oroqen in that region in 1951. The approach adopted by the Heilongjiang Provincial government was the same as that employed in settling the Oroqen in Inner Mongolia. Government cadres carried out a lot of discussions, and hosted lectures for the Oroqen to educate them about the advantages of a non-nomadic lifestyle. Once the Oroqen agreed to settle, a number of factors were taken into consideration when picking locations for the settlements: ease of access, proximity to hunting grounds, good quality water resources, fertile soil for farming purposes, and so on. A meeting was held in Heihe city in 1952 to decide on the locations and to approve the settlement plan. In 1953, 313 houses were built in ten locations picked for the settlement. The ten locations were: Shibazhan, Baiyina, Xinlitun and Xiayuliangzi of Huma County, Xinsheng and Hartong of Aihui County, Xin’e, Xinxing and Laoxidiyingzi villages of Xunke County, and Shenglitun of Jiayin County. Some of these groups were later relocated to be combined with other groups for the local government’s convenience in managing the communities.

1.4. Oroqen Dialects

S. M. Shirokogorov (1929) was the first to suggest dialects among the Oroqen. He identifies four clusters of Oroqen clans: the Birarchen, the Kumarchen, the Khingan Tungus, and the Mergen Tungus. His four-way division is based more on ethnographic observations than linguistic features, and the word list from his book provides more counter-evidence than support for the notion that his categorization reflects the actual dialect situation. Since Shirokogorov, the question of Oroqen dialect divisions has been confounded by Tungusologists, particularly those working outside of China, in their insistence that Oroqen is so closely related to Evenki that it need not be examined as an independent language, or even as a significant dialect of Evenki. While the differences between Evenki and Oroqen are not great, Whaley et al. (1999) demonstrate that there is as much reason to take them to be separate languages as there is to make a distinction between any of the Northwestern Tungusic tongues. Rather than repeat the arguments presented there in full, we offer here just one piece of their justification for this claim. A comparison of 206 basic vocabulary items (Table 1) gives the degrees of similarity among the Northwestern Tungusic languages.

Table 1: Percentage of cognate words in Northwest Tungusic drawn from a basic 200 word list

            Evenki           Oroqen           Solon
Oroqen      83% (171/206)
Solon       88% (161/182)    87% (155/178)
Negidal     95% (184/194)    85% (163/192)    91% (158/174)
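The percentages in Table 1 can be checked directly from the raw counts given in parentheses. The following is only an illustrative sketch, not part of the original study: the counts are copied from Table 1, while the names `COGNATE_COUNTS`, `cognate_percentages` and `least_similar_pair` are hypothetical helpers introduced here, and rounding to the nearest whole percent is an assumption about how the published figures were obtained.

```python
# Cognate counts copied from Table 1: (shared cognates, items compared).
# The dictionary and function names are illustrative, not from the source.
COGNATE_COUNTS = {
    ("Evenki", "Oroqen"): (171, 206),
    ("Evenki", "Solon"): (161, 182),
    ("Oroqen", "Solon"): (155, 178),
    ("Evenki", "Negidal"): (184, 194),
    ("Oroqen", "Negidal"): (163, 192),
    ("Solon", "Negidal"): (158, 174),
}


def cognate_percentages(counts):
    """Map each language pair to its cognate share, rounded to a whole percent."""
    return {
        pair: round(100 * shared / total)
        for pair, (shared, total) in counts.items()
    }


def least_similar_pair(counts):
    """Return the language pair with the lowest unrounded cognate share."""
    return min(counts, key=lambda pair: counts[pair][0] / counts[pair][1])


percentages = cognate_percentages(COGNATE_COUNTS)
lowest = least_similar_pair(COGNATE_COUNTS)

for (lang_a, lang_b), pct in percentages.items():
    print(f"{lang_a}-{lang_b}: {pct}%")
print("Least similar pair:", "-".join(lowest))
```

Under these assumptions the recomputed values agree with the table, and the pair with the lowest share is Evenki and Oroqen.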

The deviations between percentages are relatively small, just as one would expect in a cluster of closely related languages. Notably, however, the difference between Oroqen and Evenki is greater than the difference between any other two languages, suggesting a degree of variation which is language-like rather than dialect-like.

The only published study of the dialectal variations in Oroqen can be found in Whaley & Li (2000), where they propose that Oroqen be divided into four main dialects, namely, Western, Central, Northeastern, and Southeastern, which correspond roughly with the geographic position of the villages. The Western dialect is spoken in Mukui, Xierteqi, and Nanmu. The Central dialect is spoken in Alihe, Chaoyang, Wulubutie, and Dayangshu. The Southeastern dialect is spoken in Shengli, Xinxing, and Xin’e. The Northeastern dialect is spoken in Baiyina and Shibazhan. The dialects are all mutually intelligible and linguistically quite similar, though it should be noted that many Oroqen in Inner Mongolia admit to having some difficulties in understanding the Heilongjiang Oroqen, and within Inner Mongolia, some Central dialect speakers have problems understanding the Western dialect. In these instances, the barriers to intelligibility are the result of the combination of relatively minor prosodic, phonological, and lexical differences. For example, the Western dialect maintains consonant clusters of -ld- and -nd- where other dialects do not, e.g. mandi vs. mani ‘very’. The Western dialect, having been more heavily influenced in its lexicon by Solon, and by the Mongolic language, Dagur, has words not used in other Oroqen dialects. The Northeastern dialect is unique in having intervocalic [h] where other dialects have [!], e.g. [ahi] vs. [a!"] ‘woman’. These and many other phenomena serve as the diagnostics for separating the various Oroqen communities into different dialects.

It may be necessary to establish a fifth dialect to depict the Oroqen of Guli and Xinsheng. Janhunen et al. (1989) believe that a unique dialect, which they call Selpechen, exists in Xinsheng (most of the Oroqen in Guli came from Xinsheng fairly recently). However, the few bits of evidence they provide are far from compelling, and our own data from these regions are not complete enough to make any definitive statement at this point. Nonetheless, Selpechen appears to us to be too similar to the Northeastern dialect to warrant distinct treatment.

2. Sources of data

We did fieldwork on Oroqen for ten summers from 1996 through 2006, skipping only the summer of 2003 due to the SARS outbreak. Extensive cross-checking among different speakers from different dialects was carried out to maximize accuracy. Wherever possible, the lexical items were also checked against word lists available in the literature, including Starostin et al.’s Etymological Dictionary of Altaic Languages (Starostin et al. 2003), Zengyi Hu’s Oroqen Grammatical Sketch (Hu 1986) and his follow-up volume, A Study on Oroqen (Hu 2001), A Basic Vocabulary of Three Evenki Languages (Chaoke 1995), A Basic Vocabulary of the Tungusic Languages in China (Chaoke 1997), Elunchun Yu Han Yu Duizhao Duben [An Oroqen-Chinese Reader] (Han & Meng 1993), and Jianming Hanyu Elunchun yu duizhao duben [A concise Chinese-Oroqen reader] (Saxirong 1981).

3. Contact situations

Oroqen has had contact for an extended period of time with several genetically related and unrelated languages, such as Chinese (Sinitic), Dagur (Mongolic), Khalkha Mongolian (Mongolic), Evenki (Tungusic), Russian (Slavic) and to some extent Manchu (Tungusic). It is clear that for all speakers (except a few elderly speakers), Oroqen is currently at best a second language which is being rapidly replaced by Mandarin Chinese. The contact situation of Oroqen is quite complex because all the fluent Oroqen speakers are multilingual, at least in Oroqen and Mandarin Chinese, often also in Dagur, and occasionally in some other Tungusic variety, such as Solon or Evenki.

3.1. Contact with Dagur

As indicated above, the Oroqen are believed to originate from the region north of the Amur River, that is, in present-day Russia. Janhunen (1997) suggests that migrations of small Dagur populations occurred in tandem with the Oroqen (and Solon Ewenki) migrations. Regardless of whether there was co-migration, it is widely accepted that all these groups have co-existed in Inner Mongolia and the northeast of China for several centuries. Trading among them was ubiquitous, usually involving the exchange of fur, game products and animal hide handicrafts from the Oroqen and animal husbandry or agricultural products from the Dagur. The frequency of the trading led to multilingualism, and some mixing of the populations through inter-marriages. The commercial relationship, while mutually beneficial, established Dagur as the dominant language, and it became the norm for Oroqen speakers to learn to speak Dagur, resulting in extensive lexical and grammatical borrowings from Dagur.

3.2. Contact with Chinese

Since the founding of the Oroqen Autonomous Banner in 1951, the Oroqen have gone from a degree of isolation from other ethnic groups to being a tiny minority population in a Han Chinese dominated context due to massive migration of Han Chinese to the region. In 1951, the total population of the Oroqen Autonomous Banner was 778, with only 4 non-Oroqen people in it (one Ewenki and three Dagur). That means that the area was more than 99% Oroqen at that time. However, in 1988, the total population was 291,372 with an Oroqen population of around 1800, which leaves the Oroqen at less than 1% of the population in this region. Although the period of time that Mandarin Chinese has been in intense contact with Oroqen is relatively short, Oroqen has borrowed quite freely from Chinese during the last few decades, especially in the area of lexicon.

3.3. Contact with Russian

Needless to say, lexical borrowing does not require extensive and persistent contact, although the amount and range of lexical borrowing seem to correlate with the intensity and length of time that the languages are in contact. In the case of Oroqen, this is borne out by the fact that lexical borrowing from Russian arises most commonly in the Northeastern dialect. In that region, the Oroqen and the Russian communities are separated by the Amur River. During winter months when the river is frozen solid, the natural barrier is removed, making contact between the peoples relatively easy. When the relationship between China and Russia is good, frequent trade activities occur, leading to all kinds of interactions between the two sides. Though such contact is cut off completely whenever the two nations become hostile to each other, the interactions have been stable enough to leave their mark on Oroqen in the form of lexical borrowings. It should be noted that some of the Russian loans might have entered Oroqen before the migration of the Oroqen across the Amur River into China. Interestingly enough, no borrowing of grammatical structures from Russian has been uncovered in Oroqen so far.

3.4. Contact with Mongolian

No systematic patterns can be established for loanwords from Mongolian. However, only Mongolian loans contain body part expressions, a possible indication that contact between Mongolian and Oroqen goes far back into their histories, before the proposed Oroqen migration into China. A few of the Mongolic loans are clearly very old. It is possible that the Western Oroqen were in the Borzya area in Russia prior to an early southward migration, which brought them into the Greater Hinggan Mountain region via the Argun River (i.e. upper Amur). In support of this scenario, documents from the Ming Dynasty (1368–1644) indicate that reindeer herding ethnic groups were in the Hinggan Mountains as early as the 15th century. For instance, for 1409 (i.e. during Yongle’s reign [1403–1424] of the Ming Dynasty), the Ming Dynasty Annals state that “the barbarians from the northern mountain forests come and go riding deer”, referring to the ethnic groups living in the upper and lower reaches of the Amur (Zheng 1991). Such vague and general references to the ‘barbarians’ in the northern mountain forests can be found in various documents from the Yuan (1271–1368), Ming (1368–1644) and Qing (1644–1911) dynasties. Unfortunately, these documents are not discriminating when it comes to identifying their referents, and so it is impossible to say with any certainty that references to deer-riding peoples are in fact describing a Tungusic group (though this seems likely), let alone which Tungusic group. Various documents from the Qing Dynasty (especially during Kangxi’s reign, 1662–1722) do use terms such as Eluochun, Elechun, and Elunchun (Oroqen). However poor this indirect evidence is, it is consistent with some Oroqen clans having crossed the Amur prior to the generally assumed time period of around the middle of the 17th century. If so, this means that the Mongolic contact has a relatively lengthy history, though with less intensity and persistence.

3.5. Contact with Manchu

It is intriguing that few lexical borrowings from Manchu are found in Oroqen despite its contact with Manchu in the southeastern and northeastern regions. During our fieldwork trips, we were told more than once that there used to be elderly Oroqen in those areas who could speak fluent Manchu. Though there is no way to verify such anecdotal information, it is clear that the Oroqen had contact with the Manchu in that region, especially through serving in the Qing Dynasty military. According to Wang (2005: 105), Manchu, Oroqen, and Dagur co-existed along the Amur River in the Aihui County region in Heilongjiang Province for many years with close contact. Presumably, the contact was not persistent enough and not much bilingualism in Manchu and Oroqen existed among the Oroqen communities. Only a few lexical items that are possible borrowings from Manchu have been identified in this project.

4. Number and types of loanwords in Oroqen

Out of the total number of 1138 word entries in the subdatabase for Oroqen, 137 (12%) are clear or probable loanwords from Chinese, Russian, Dagur, or other Mongolic languages. This number includes loanwords that are used in one or more of the Oroqen dialects. In some cases, the borrowed forms co-exist in the dialect with a native lexical item, whereas in other cases the borrowed word is preferred. The largest number of loanwords comes from Chinese (61 items), followed by Dagur with 43 items. A total of 22 borrowed forms are from (Khalkha) Mongolian, compared to 15 loans from Russian. Despite its contact with Manchu, only a few items were found to be possible borrowings from Manchu.

4.1. Types of loanwords

The majority of the loans are correlated with the change from a hunter-gatherer lifestyle to a sedentary way of life influenced by Mongolic peoples such as the Dagur and the Mongolians and, in more recent times, the Chinese. What is notable is the lexical range of the Chinese loans given the short time span of contact between the Chinese and the Oroqen, which is only about five decades. The Dagur influence, on the other hand, is of much longer standing and just as intense. Therefore, the wide range that the loans from Dagur cover should be expected. For the same reason, we would expect grammatical borrowings from Dagur as well, which is exactly the case, as is discussed in §6 on grammatical borrowings. The kind of impact Dagur has had on Oroqen took several hundred years to develop, in contrast to almost the same level of influence on Oroqen from Chinese in just a few decades. Another major source of the loanwords is related to trade goods.

It is noteworthy that the loans display a clear pattern in terms of the contact languages and dialect regions. Specifically, the Dagur and Mongolian borrowings are almost exclusively found in the central and western dialects, whereas the dozen or so Russian borrowings are mostly confined to the northeastern dialect. The Chinese loans are found in all four dialects, although only a small number of them are used in the northeastern dialect. This is expected since the northeastern dialect was impacted by Chinese the latest in terms of intensity and time depth.

4.2. Loanwords and semantic fields

Table 2 shows the distribution of the loanwords over semantic fields.

Table 2: Loanwords in Oroqen by donor language and semantic field (percentages)

No.  Semantic field                   Total loanwords  Non-loanwords
1    The physical world               9.6              90.4
2    Kinship                          4.8              95.2
3    Animals                          20.7             79.3
4    The body                         5.4              94.6
5    Food and drink                   22.2             77.8
6    Clothing and grooming            20.9             79.1
7    The house                        15.6             84.4
8    Agriculture and vegetation       28.4             71.6
9    Basic actions and technology     17.7             82.3
10   Motion                           9.2              90.8
11   Possession                       12.9             87.1
12   Spatial relations                1.3              98.7
13   Quantity                         0.0              100.0
14   Time                             4.9              95.1
15   Sense perception                 2.3              97.7
16   Emotions and values              3.2              96.8
17   Cognition                        2.8              97.2
18   Speech and language              6.7              93.3
19   Social and political relations   12.0             88.0
20   Warfare and hunting              12.5             87.5
21   Law                              28.6             71.4
22   Religion and belief              0.0              100.0
23   Modern world                     69.4             30.6
24   Miscellaneous function words     0.0              100.0
     all words                        12.0             88.0

Donor-language columns (non-empty cells only, in row order; the last value in each column is the figure for all words):
Chinese: 1.6 4.8 1.4 12.7 8.7 10.4 18.2 4.3 1.4 12.9 1.3 2.4 3.2 2.8 3.4 4.0 4.2 28.6 40.8 5.3
Dagur: 13.4 2.0 7.9 3.5 5.2 6.1 9.2 6.4 2.3 8.0 8.2 3.7
Russian: 1.6 8.7 1.4 1.4 2.4 3.4 20.4 1.3
Mongolian: 6.4 3.7 0.7 1.6 4.2 0.9
Khalkha: 0.7 4.1 2.8 0.4
Proto-Mongolic: 1.2 0.7 0.2
Unidentified: 2.4 4.2 0.3

4.2.1. Dagur loans

The nature of contact between the Dagur and the Oroqen, as discussed earlier, dictates the semantic fields of the loans. Most of the Dagur loans have to do with animal husbandry, such as domesticated animal names and names of related production tools and products. It is fair to say that the social fabric and the cohesion of the Oroqen social organization were both mostly intact during the long and relatively intense contact with the Dagur, because we find very few Dagur borrowings in the realm of socio-political relations. In fact, only one item is found, namely, #d$#n ‘master’. The Dagur word for the same meaning is #d$in ‘master’. Since the Mongolian word #d$i ‘master’ is almost identical to the Dagur word, it is hard to determine whether #d$#n in Oroqen is actually a Mongolic loan or not (cf. §4.2.5 below). It should also be pointed out that no kinship terms were borrowed from Dagur. Another item that has an unclear donor language is altan ‘gold’. Both Mongolian and Dagur have exactly the same form, alt ‘gold’, for this meaning. We are not certain at this point whether Oroqen borrowed the word from Mongolian via Dagur or directly from Mongolian. The former is probably the more likely scenario.

4.2.2. Chinese loans

The borrowings from Chinese are fairly extensive in terms of the range of semantic fields, covering 13 of the 24 semantic fields in the database excluding the miscellaneous category. The rapid acculturation of the Oroqen into Chinese customs has largely refashioned Oroqen society in the last fifty years. Not surprisingly, many Chinese loans depict aspects of social organization (e.g., kinship terms, terms of political organization, law, possession, and so on).

Some of the Chinese loans are found in only a subset of the dialects. For instance, the kinship terms, such as y!%y! [yéye 爷爷] ‘paternal grandfather’, d$üy [zhír 侄儿] ‘nephew’, d$üy &na%d$" [zhírnü 侄女] ‘niece’, etc., are not used in the northeastern dialect. Since Oroqen has its native form adamay (vocative) and adama for both ‘grandma’ and ‘grandpa’, y!%y! [yéye 爷爷] ‘paternal grandfather’ must be a replacement. All dialects use the borrowing for ‘nephew’, though there is some reluctance on the part of the speakers in the northeastern dialect region in using this term. In fact, the loan for ‘niece’ is not used in the northeastern dialect, though it is readily acceptable in the other dialect areas. It is actually a compound word made up of the Chinese borrowing d$üy [zhír 侄儿] ‘nephew’ and the native Oroqen word &na%d$" ‘(unmarried) girl’ for the kinship meaning ‘niece’.

It is quite likely that a few of the Chinese loans entered Oroqen through Dagur or Mongolian. A case in point is t!üd!' qǔdēngr 取灯儿 ‘match’. Dagur has t!yyd#' ‘match’, which is clearly a borrowing from Chinese. So the question is whether the two languages borrowed the term independently or whether Oroqen borrowed it from Chinese via Dagur as the intermediate donor. This borrowed form is used in the central and southeastern regions. The Chinese compound word is only used in certain northern Mandarin dialects. In fact, it is highly likely that Manchu, which has t)yd#r/t)yd#l ‘match’, borrowed the term from Chinese. In the northeastern Oroqen dialect, pit!ix# спичка [spit!ka] is used, which is a borrowing from Russian. Another instance involving a possible intermediate donor language is t!i'd$u ‘chili pepper’ (< qínjiāo 秦椒 in Chinese), for which Mongolian has exactly the same form t!ind$u% ‘chili pepper’. In other words, it is possible that Mongolian is the intermediate donor for this item. Interestingly, the northeastern dialect uses the term lad$au (< làjiāo 辣椒) ‘chili pepper’, which is obviously a more recent borrowing from Chinese.

Occasionally, Chinese serves as the intermediate donor. For example, the Chinese loan puto ‘grape’ (< pútao 葡萄) was borrowed into Chinese before the 3rd century AD, and Oroqen must have borrowed it from Chinese fairly recently. Sometimes when Oroqen borrows an item from Chinese, which in turn borrowed it from another language, some semantic shift has taken place. This is true of the term bula%d$i ‘skirt’ (< bùlājī 布拉吉). There is no doubt that Chinese borrowed the item from the Russian form платье bulatye. In Chinese it means ‘one piece dress’, but in Oroqen it denotes ‘skirt’. In the central dialect, another borrowing is used, namely, d$ün!i ‘skirt’ (< qúnzi 裙子), which is a borrowing from Chinese.

It is quite obvious that some of the slightly older Chinese loans have taken on more meanings. For instance, m&d$an ‘carpenter’ (< mùjiang 木匠) means either ‘carpenter’ or ‘blacksmith’ in Oroqen, though it only means ‘carpenter’ in Chinese. In fact, in Oroqen, this borrowed item can also be used to refer to a ‘handyman’ or a ‘skilled person’. Most of the Chinese loans entered Oroqen during the 20th century, though some may have been borrowed a bit earlier than that.

4.2.3. Mongolian loans

No systematic patterns can be established for loanwords from Mongolian. However, only Mongolian loans contain body part expressions, a possible indication that contact between Mongolian and Oroqen goes far back in their histories before the Oroqen’s migration into China.

4.2.4. Russian loans

The number of loanwords from Russian is limited, only about a dozen. They are mostly found in the northeastern dialect region. They probably entered Oroqen in the latter half of the 20th century. Several instances of Russian borrowing point to dialectal variation reflecting divisions in contact-induced preferences. For instance, the loanword d$,%l,t& (< золото zoloto) ‘gold’ is only used in the northeastern region. In the other areas, altan ‘gold’ is used, which has a Mongolic source. Another example is t!ulki (< чулки čulki) ‘sock’. Speakers in the northeastern region prefer this term, though they do accept the Chinese borrowing waha ‘sock’ (< wàzi 袜子), which is wa!a in the central region. In fact, most of the Russian loans are not used in the central and western dialects.

The Russian loan pit!arga ‘glove’ (< перчатка perčatka) co-exists with a few native terms referring to different types of gloves. The Russian loan refers to gloves with five fingers. The native Oroqen terms are k#pt!#k. ‘mitten (with thumb (covering up to wrist))’ and k,x,l, ‘mitten (covering up to elbow and with an opening from wrist forward half way in the direction of palm for hunting purposes)’. This suggests that gloves with five fingers were not part of the Oroqen traditional lifestyle.

4.2.5. Manchu loans

Surprisingly, hardly any loanwords were found to have come from Manchu with any degree of certainty despite Oroqen’s contact with Manchu in the southeastern region. Only a few items were found, such as k#%'tir# ‘rib’ and bata b#y# ‘enemy’. According to Starostin et al. (2003: 658), Evenki has ke'tere. The Proto-Tungusic form is *ke'tire ‘chest, breast, or side of body’, and it is a borrowing from Mongolian into Manchu. So Oroqen may have borrowed this item from Manchu a very long time ago or from an intermediate donor language, namely, Mongolian. The form for ‘enemy’ has a different story. Since Dagur borrowed it from Manchu (Zhong 1982: 21), it is highly likely that Oroqen borrowed it from Dagur. Notice that the word in Oroqen is a compound noun consisting of the borrowed form bata ‘enemy’ and the native word b#y# ‘person’.

There are several other possible forms that may have come from Manchu. For instance, #d$#n ‘master’ is probably a borrowing from Dagur. However, Manchu has an almost identical form #d/#n for the same meaning. It is unclear which language borrowed from which language in this case. Another interesting instance is d#'d$#%n ‘lamp’, which could be a borrowing from Chinese via Manchu, making Manchu the intermediate donor, though there is a possibility that Oroqen borrowed it directly from Chinese. The Manchu form is d#'d/an ‘lamp’. It is quite clear that the form ya%ba (b#y#) ‘mute (person)’ is a borrowing from Chinese. But Manchu has almost the same form yaba for the same meaning. At this point, we do not know whether Manchu and Oroqen borrowed the form from Chinese independently or not. The single most likely case of a Manchu loan in Oroqen is u'ku ‘scarf/towel’. Manchu has fu'ku ‘handkerchief’. This means that, strictly speaking, no lexical borrowings from Manchu in Oroqen have been uncovered in this project.

4.3. Loanwords and semantic class

Almost all of the loanwords are nouns, and only a few instances are verbs (cf. Table 3). The same can be said about adjectives. The few instances of adjectives are actually nouns in the donor languages. For example, atirga/g#k ‘male/female’ are loans from Mongolian a$0ir1a ‘male horse’, gegü ‘female horse’ (Hu 2001: 204). However, it should be noted that quite a few of the loans participate in derivational morphological processes that can turn them into members of other word classes. For example, the Chinese loan d$,wuli (< zhàoli 笊篱) ‘strainer’ is a noun which can be turned into a verb by the productive verbalizer -la to yield d$,wula- ‘to strain’.

Table 3: Loanwords in Oroqen by donor language and semantic word class (percentages)

Word class      Chinese  Dagur  Russian  Mongolian  Khalkha  Proto-Mongolic  Unidentified  Total loanwords  Non-loanwords
Nouns           8.0      5.8    2.3      1.1        0.8      0.3             0.5           18.6             81.4
Verbs           1.8      1.1    -        -          -        -               -             2.8              97.2
Function words  1.2      -      -        -          -        -               -             1.2              98.8
Adjectives      0.9      0.9    -        2.8        -        -               -             4.7              95.3
Adverbs         -        -      -        -          -        -               -             0.0              100.0
all words       5.3      3.7    1.3      0.9        0.4      0.2             0.3           12.0             88.0
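The figures in Table 3 can be cross-checked with a little arithmetic. The following is only a minimal sketch in Python: the variable and function names are illustrative assumptions, and the donor-language totals are read off the "all words" row on the assumption that the columns follow the order Chinese, Dagur, Russian, Mongolian, Khalkha, Proto-Mongolic, Unidentified. It checks that loanword and non-loanword shares add up to 100% per word class, and that 137 loanwords out of 1138 entries correspond to the 12.0% reported.

```python
# Shares (percent) of loanwords and non-loanwords per word class, from Table 3.
rows = {
    "Nouns": (18.6, 81.4),
    "Verbs": (2.8, 97.2),
    "Function words": (1.2, 98.8),
    "Adjectives": (4.7, 95.3),
    "Adverbs": (0.0, 100.0),
    "all words": (12.0, 88.0),
}

# "all words" percentages per donor language (column order is an assumption).
donor_totals = [5.3, 3.7, 1.3, 0.9, 0.4, 0.2, 0.3]


def inconsistent_rows(table, tol=0.05):
    """Return the word classes whose two shares do not add up to 100%."""
    return [name for name, (loan, native) in table.items()
            if abs(loan + native - 100.0) > tol]


# Overall loanword share from the raw counts given in Section 4.
TOTAL_ENTRIES = 1138
LOANWORDS = 137
overall_share = round(100 * LOANWORDS / TOTAL_ENTRIES, 1)

# Sum of the per-donor percentages, rounded to one decimal place.
donor_sum = round(sum(donor_totals), 1)

print(inconsistent_rows(rows))
print(overall_share)
print(donor_sum)
```

The per-donor percentages add up to slightly more than the overall figure, which is what one would expect if each cell was rounded separately.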

5. Integration of loanwords

Although some of the more recently borrowed items are not integrated at all, many of the older loanwords have gone through quite a bit of integration both phonologically and morphologically. Part of the phonological integration is attributable to phonotactic constraints in Oroqen. For instance, the sp- cluster is not permitted in syllable onset positions. Therefore, it is simplified to p- when an item that contains it is borrowed. The Russian loan pit!ix# ‘match’ illustrates this well. The Russian word spička (спичка) contains an sp- syllable-initial consonant cluster, which is simplified to just p- in the Oroqen word pit!ix#. The rest is identical to the Russian word, since [x] is phonemically /k/ in intervocalic positions in Oroqen. Another example is the loanword sux# ‘ax’, which is a borrowing from the Mongolian form sux ‘ax’. The voiceless velar fricative [x] never occurs in coda positions in Oroqen, so the insertion of the schwa makes it conform to this Oroqen phonotactic constraint.

Some of the sound adjustments are predictable. In all of the Chinese loans that contain the nominalizer zi 子 (the initial consonant of which is the unaspirated dental-alveolar affricate [ts]), the consonant has been nativized as either [!] or [x] in Oroqen. For instance, i!# (< yízi 胰子) ‘soap’, may!a (< màizi 麦子) ‘wheat’, !u!# (< zu2zi 竹子) ‘bamboo’, balixa (< bālízi 笆篱子) ‘prison’ all contain instances of [ts] that correspond to either [!] or [x] in Oroqen. This is an instance of sound substitution, since Oroqen does not have the unaspirated dental-alveolar affricate [ts] in its sound inventory.

Loanwords can freely undergo morphological processes both for derivational purposes and for inflectional marking.
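The two phonotactic repairs described above (simplification of an initial sp- cluster and schwa insertion after a final [x]) can be pictured as ordered rewrite rules. The following Python sketch is only an illustration under simplifying assumptions: the function name `nativize` is invented here, forms are treated as plain romanized strings, the coda constraint is only applied word-finally, and the input `spika` is a hypothetical form, not an attested word.

```python
def nativize(form: str) -> str:
    """Apply two toy repair rules to a romanized donor form."""
    # Rule 1: an initial sp- cluster is simplified to p-.
    if form.startswith("sp"):
        form = form[1:]
    # Rule 2: [x] may not close the word, so a schwa is added after it.
    if form.endswith("x"):
        form += "ə"
    return form


print(nativize("sux"))    # Mongolian sux 'ax'
print(nativize("spika"))  # hypothetical input, for illustration only
```

Real loanword adaptation involves further substitutions (for example, the treatment of [ts] discussed above), which this sketch deliberately leaves out.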

6. Grammatical borrowing

Oroqen has grammatical borrowings from both Dagur and Chinese. Grammatical borrowings from Dagur seem to be more extensive than those from Chinese. Due to the limitation of space, only a brief discussion with a few illustrative examples is given below.

6.1. Grammatical borrowing from Dagur

The contact between Dagur and Oroqen resulted in not only lexical borrowings from Dagur to Oroqen but grammatical borrowings as well. A case in point is the emphatic reduplication strategy to mark intensity on some color terms (Whaley & Li 2000): bagdar"n ‘white’ vs. bag-bagdar"n ‘very white’. Dagur has a formally identical reduplication strategy which copies the first syllable and inserts [b] or [m] in the coda position of the prefix (Zhong 1982, Wu 1996), as is illustrated in the examples in (1).

(1) xula%n ‘red’       xub xula%n ‘thoroughly red’
    t!i1a%n ‘white’    t!im t!i1a%n ‘very white’
    dasu' ‘sweet’      dab dasu' ‘really sweet’
    s!ru%' ‘cool’      s!b s!ru%' ‘really cool’
    xordu' ‘fast’      xob xordu' ‘very fast’
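The Dagur pattern in (1) can be sketched as a small string operation: copy the word up to and including its first vowel, close that copy with [b] (or [m]), and place it before the full word. The Python below is a hedged illustration only. The function name `emphatic` and the vowel set are assumptions, the forms are simplified spellings with the special symbols of the source transcription dropped, and the choice between [b] and [m] is left as a parameter because the text does not state what conditions it.

```python
VOWELS = set("aeiouəɛɔ")


def emphatic(adjective: str, coda: str = "b") -> str:
    """Prefix a copy of the onset and first vowel, closed by `coda`."""
    for i, ch in enumerate(adjective):
        if ch in VOWELS:
            # Onset + first vowel + inserted coda consonant.
            prefix = adjective[: i + 1] + coda
            return f"{prefix} {adjective}"
    raise ValueError("no vowel found in input")


print(emphatic("xulan"))   # 'red'
print(emphatic("dasun"))   # 'sweet'
print(emphatic("xordun"))  # 'fast'
```

For a form that takes [m], as ‘white’ does in (1), the caller would pass `coda="m"` explicitly.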

Like Oroqen, Dagur employs reduplication to indicate intensity. However, the process in Dagur is fully productive and operates on adjectives denoting different sorts of properties, not just colors. The borrowing of reduplication can be seen as part of a more general Dagur influence on Oroqen grammatical structure, particularly in the area of derivational morphology. For instance, Dagur has a plural marker -nur used for kinship terms and human nouns, illustrated in (2).

(2) #k## ‘sister’      #k##nur ‘sisters’
    gut! ‘comrade’     gut!nur ‘comrades’

(from Zhong 1982: 33)

Oroqen uses the phonologically similar suffix -nVr, but it is only used for kinship terms. Among Tungusic languages, the suffix is only found in Oroqen and some dialects of Solon, which has also long been in contact with Dagur. This fact indicates that the suffix is a borrowing. Some examples from Oroqen are given in (3).

(3) naat!& ‘uncle (on mother’s side)’     naat!&n,r ‘uncles (on mother’s side)’
    amaakaa ‘uncle (on father’s side)’    amaakaanar ‘uncles (on father’s side)’
    j!!j! ‘grandpa’                       j!!j!n!r ‘grandpas and those of their generation’

(Hu 1986: 56)

In Dagur, the marker -nur goes on kinship terms and human nouns to signal plurality. The borrowed marker -nVr in Oroqen, however, is more restricted. It is used solely with kinship terms, and it has taken on other connotations beyond plurality. On the one hand, it can indicate an exhaustive set. The word naat!&n,r in (3) thus connotes all the uncles on the mother’s side together. The suffix can also indicate age association, as in the final form in (3) j##j#n#r, which designates grandpas and everybody else belonging to their generation.

The instance of the -nur borrowing is reminiscent of the facts surrounding the borrowing of reduplication. In both cases, a morphological strategy is borrowed, but in a more restricted way such that in Oroqen it can only be applied to a subset of those forms to which it can be applied in Dagur. Furthermore, perhaps as part of its limited distribution in Oroqen, it takes on connotations that it did not have in Dagur. This pattern, which we have only discussed with respect to two morphological borrowings, appears to hold true for other cases of grammatical borrowings identified to date.

6.2. Grammatical borrowing from Chinese

Although the period of time that Mandarin Chinese has been in intense contact with Oroqen is relatively short, it has had a strong impact on its grammatical structure. We found that for most speakers of Oroqen (with the exception of a speaker from the northeastern dialect), the plural marker is no longer required, which could be seen as the consequence of Chinese influence (cf. Grenoble & Whaley 2003), since Oroqen is the only Northwestern Tungusic language in which unmarked plurals are more common than suffixation. Hu’s (1986) description of Oroqen implies the productive use of -l and -sal as plural markers. If this is accurate, then it is noteworthy that Oroqen has moved down the path to losing the plural marker in a remarkably short time span, only a few decades. We have yet to elicit any naturally occurring examples of Oroqen in which the plural markers are employed. The informants we have worked with from southeastern, western and central Oroqen dialect regions occasionally, and then only reluctantly, accept the plural marker -l; they accept -sal with a restricted number of lexical items, most of which denote animate beings with a high frequency of occurrence. The only informant who readily accepted forms with the plural markers -l and -sal was from the northeastern Oroqen dialect region, specifically from Baiyina. Some of the examples are: kumaxa-l ‘deer’, ut!-l ‘sons’, ilga-l ‘flowers’, b!y!-x!l ‘persons’, ahi-x!l ‘women’, ut!-x!l ‘sons’. Even for her, the preferred form is the analytic construction of baran kumaxa ‘many deer’. It is highly likely that this loss is contact induced.

Even in the northeastern dialect region, where Oroqen shows relatively less Chinese influence, there are still clear instances of grammatical borrowing, such as in the formation of A-not-A questions shown in (4).

(4) a. !i     t"mana     '#n#-ni       y#-ni
       you    tomorrow   go-2SG.PRS    INTERROG-2SG.PRS
       ‘Are you going tomorrow or not?’

    b. yabu-t!a    ha!i        y#-t!a
       walk-PST    still.be    INTERROG-PST
       ‘Went or not?’

In (4), we have two examples of the A-not-A question formation in Oroqen. The example in (4a) represents the typical Oroqen pattern in which an inflected verb particle (y!) is placed post-verbally. However, as shown in (4b), some speakers will include Chinese hái shì 还是 ‘still be; or not’ as well. This kind of phenomenon indicates that when speakers reach a certain level of bilingual proficiency, grammatical borrowing between the languages is much easier than is generally assumed in the literature. That is to say, it does not take a very long time for a language to shift to a completely different typological pattern in its grammatical structures. Central to the rate of such structural shifts are sociolinguistic factors (see Li 2005 for details).

7. Conclusion

The majority of the loanwords in Oroqen are instances of insertion. Only a small number of them are replacement items. The borrowing is mostly driven by the need to reflect social and lifestyle changes that made new phenomena and new entities an integral part of the lives of the Oroqen. Such borrowings took place much more readily and freely than one would expect. Grammatical borrowings are less common, but they do exist. These are due not only to extensive and persistent contact but also to a high level of bilingualism among a considerable number of Oroqen speakers.

Acknowledgment

This work was supported in part by the National Science Foundation (grant # 0220354).

References

Chaoke, D. O. 1995. A Basic Vocabulary of Three Evenki Languages. Otaru: Center for Language Studies.
Chaoke, D. O. 1997. A Basic Vocabulary of the Tungusic Languages in China. Otaru: Center for Language Studies.
Doerfer, Gerhard. 1978. Classification problems of Tungus. In Doerfer, G. & Weiers, M. (eds.), Tungusica, Vol. 1. Wiesbaden: Otto Harrassowitz.
Doerfer, Gerhard. 1983. Das Birare. Journal de la Société Finno-Ougrienne 78:7–25.
Grenoble, Lenore A. & Lindsay J. Whaley. 1999. Language policy and the loss of Tungusic languages. Language and Communication 19:373–386.
Grenoble, Lenore A. & Lindsay J. Whaley. 2003. The Case for Dialect Continua in Tungusic: Plural Morphology. In Holisky, D. A. & Tuite, K. (eds.), Current Trends in Caucasian, East European and Inner Asian Linguistics: Papers in Honor of Howard Aronson, 97–120. Amsterdam: John Benjamins.
Han, Y. F. & Meng, S. 1993. Elunchun Yu Han Yu Duizhao Duben [An Oroqen-Chinese Reader]. Beijing: Zhongyang Minzu Xueyuan Chubanshe [Central Institute of Nationalities Press].
Hu, Zengyi. 1986. Elunchun Yu Jianzhi [A Grammatical Sketch of Oroqen]. Beijing: Minzu Chubanshe [Nationalities Press].
Hu, Zengyi. 2001. Elunchun Yu Yanju [A Study of Oroqen]. Beijing: Minzu Chubanshe [Nationalities Press].
Janhunen, Juha. 1996. Manchuria: An Ethnic History. Helsinki: The Finno-Ugrian Society.
Janhunen, Juha. 1997. The languages of Manchuria in Today's China. Northern Minority Languages: Problems of Survival. Senri Ethnographical Studies 44:123–146.
Janhunen, Juha & Xu, Jingxue & Hou, Yucheng. 1989. The Orochen in Xinsheng. Journal de la Société Finno-Ougrienne 82:145–169. Helsinki.
Li, Fengxiang. 2005. Contact, attrition, and structural shift: Evidence from Oroqen. In Bradley, David (ed.), Special Issue of the International Journal of the Sociology of Language. Issue number 173: Language Endangerment in the Sinosphere.
Office of Local Annals Editing and Compilation of Huma County. 1990. Huma Xian Zhi [Annals of Huma County]. Nanjing: Chinese Literature and History Press.
Saxirong. 1981. Jianming Hanyu Elunchun yu duizhao duben [A concise Chinese-Oroqen reader]. Beijing: Chinese Academy of Social Sciences.
Shirokogorov, S. M. n.d. Social Organization of the Northern Tungus. Shanghai: Commercial Press.
Starostin, Sergei & Dybo, Anna & Mudrak, Oleg. 2003. Etymological Dictionary of the Altaic Languages. Boston: Brill.
Wang, Qingfeng. 2005. Manyu Yanjiu [A Study of Manchu]. Beijing: Minzu Chubanshe [Nationalities Press].
Whaley, Lindsay J. & Grenoble, Lenore. 2003. Evaluating the impact of literacy: The Case of Evenki. In Joseph, Brian & Destefano, Johanna & Jacobs, Neil & Lehiste, Ilse (eds.), Languages in Conflict. Columbus, OH: OSU Press.
Whaley, Lindsay J. & Grenoble, Lenore & Li, Fengxiang. 1999. Revisiting Tungusic classification from the bottom up: A comparison of Evenki and Oroqen. Language 75(2):286–321.
Whaley, Lindsay J. & Li, Fengxiang. 1998. The suffix -kan in Oroqen. Studies in Language 22(2):181–205.
Whaley, Lindsay J. & Li, Fengxiang. 2000. Oroqen Dialects. Central Asiatic Journal 44(1):1–26.
Wu, Hegejilitu. 1996. A common method by which all Altaic languages indicate emphasis of adjectives. Minzu Yuwen p.50–60.
Zheng, Dongri. 1991. Dong bei tong gu si zhu min zu qi yuan ji she hui zhuang kuang [The Origins and Social Conditions of the Tungus Peoples of the Northeast]. Yanji: Yanbian Daxue Chubanshe [Yanbian University Press].
Zhong, Suchun. 1982. Dagur Yu Jianzhi [A Grammatical Sketch of Dagur]. Beijing: Minzu Chubanshe [Nationalities Press].

Loanword Appendix

Dagur

adut!in

herdsman

ukur

ox

g#k ukur

cow

k&nin

sheep

ima%n

goat

atirga ima%n

he-goat

ima%kan

kid

ka3ila.

turtle

#%m

medicine, drug, pill, tablet

d#rbu

pillow

d$üy

nephew

kad"w&n/ kad"'ki

sickle, scythe

d$üy &na%d$"

niece

mud$u

sow

nar.%m&%

millet

g#'d$i

cock, rooster

altan

gold

mud$i

hen

mo'won

silver

dai5u

gu

glass

physician, doctor

budur

paint

ya%ba (b#y#)

mute (person)

t#rg#wu

path

m#'gun

t#rg#n

cart, wagon

pot (for cooking)

kurdu

wheel

t!aku

kettle

t#'g#l

axle

pan

dish

!#yl#-/d$,wula- to sieve, to strain

tired

aral

yoke

t!ad$u'ku

bowl

#d$#n

master

gulin

flour

Chinese

n&'a%/nuwa%

vegetables

t!üd#n

match

tubi1#

fruit

y#%y# (in Heihe)

grandfather

tayti (in Heihe)

grandmother

t!a'gal-

kata1an/kata%n salt k&ni'i i4akt#n

wool

but!uda-

to dye

tu%d#w

potato

puto

grape

d#' d$#n

oil lamp

hud$au

pepper

lad$au/t!i'd$u

chili pepper

!atan

sugar

b#%!u

cloth


bula%d$i

skirt

atirga

male

skirt

t!ul ball, sphere (central dial.)

d$ün!i by.d$#n

g#k

female

pin

be%wya'

praise

livestock

i!#

soap

school

dahwita'

meeting house

suytan (Central D)

aduhun (NE), ad&!un (Central)

yak!i-rga-'ki

key

gambi

pen

t#m#kun

camel

t!,'k,

window

d$a'gin

chieftain, chief

k#%'tir#

ya'tan

blanket

peace

chest, breast, side of body

i%!#

chair

taypin (C), tay3in (NE)

ma'gila

lamp, torch

court

side of head, temple

d#' d$#%n

!am&n (Central)

ma'.la

forehead

ya'la

candle

balixa

prison, jail

taraxa

bald

t!uto

hoe

malta'ki

rake

scythe

t!i%t!#% (Central)

car

!and&r

ur#

seed

dat!a'n#n

thresh

5#ytin

airplane

nab&t!"

leaf

t!a'yuan

threshingfloor

de%n

electricity

sux#

ax

bu%d$a'

minister

may!a

wheat

gawl"

copper, bronze

g.%

street

kandu

rice

d$a!i1an/d$a!.n letter

Russian

dayri

pipe

inxa'

bank

pit!ix#

match

digua

sweet potato

gu't!a'

skirt

pumpkin, squash

workshop, factory

bula%d$i

w,gw,

t!ulki

stocking, sock

de%ny#'

film/movie

!u!#

bamboo

pit!arga

glove

t!ay

tea

mo%go

mushroom

palat!.nt!a

towel

pa%ra

sledge, sled

m&d$an

carpenter, blacksmith

d$,%l,t&

gold

gawle%n

sorghum

p,r,k,!

ship

kwa'

basket

Mongolian

k.r#nd.r

pen

to'z#r

coin

t,%rag

dust

ma!in

machine

mayman

market (place), store, shop

dalay

sea

buty.rk#

bottle

bular

spring

kanpy.kta

candy, sweets

k&d&r

well

gi'na-

weigh

d#'!#l#%r#n

weigh

Chapter 21

Loanwords in Japanese* Christopher K. Schmidt 1. The language and its speakers As the official language of Japan, Japanese is spoken by almost the entire population (121 million as of 1986, Shibatani 1990: 89) and a small number of overseas communities. Japanese can be divided into a number of distinct dialects, with prestigious Standard Japanese based on the Tokyo dialect (cf. Shibatani 1990: 185– 214). Depending on the perspective, Ryukyuan is either seen as the most divergent variety from the standard language or as the language most closely related to Japanese. Other than this, the question of the genealogical affiliation of Japanese is notoriously controversial. In the words of Shibatani (1990: 94), “Japanese is the only major world language whose genetic affiliation […] has not been conclusively proven”. Leaving aside really far-fetched theories such as those linking Japanese to Indo-European, Basque or Sumerian, the majority of the scholars working on the question seem to prefer a relation to either Altaic or Austronesian (cf. also Kamei 1961; Shibatani 1990; Vovin 2003). In the case of Altaic theories, some scholars restrict themselves to positing a closer relationship between Japanese and Korean (Martin 1966; Whitman 1985), while others then relate Japanese and Korean (and usually Ainu as well) to the Altaic family as a whole (Miller 1971, 1996; Vovin 1994). For the Austronesian theories, usually an Altaic-Austronesian superstrate-substrate mix is proposed (Polivanov 1918), although Benedict (1990) has proposed a genealogical connection to Austronesian, Hmong-Mien and Tai-Kadai, resulting in a superfamily called “Japanese/Austro-Tai”, which is argued against by Vovin (1994) on a number of methodological grounds. Ohno (1957), on the other hand, has also argued for the existence of an Austronesian substratum, and in later work (Ohno 1980) also included a connection to Dravidian languages. 
Even though the Altaic/Korean hypothesis appears to be the most widely accepted one, it should nevertheless be noted that even here the number of cognates is only about 300 or so, leading some scholars such as Lewin (1976) to be skeptical about the possibility of successfully establishing a link between Japanese and any language family. Therefore it is also not uncommon to designate Japanese as a language isolate (with the aforementioned exception of Ryukyuan).

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Schmidt, Christopher K. 2009. Japanese vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1975 entries.

2. Sources of data

For the creation of the Japanese database for the project I have mainly relied on the Nippon Kokugo Daijiten 日本国語大辞典 (Shōgakkan 2005), which is the most comprehensive dictionary of the Japanese language available. It contains over 600,000 words in 14 volumes. Its second edition in particular has been extensively revised, providing an abundance of information about when a given word was first attested, in what meaning, and in what kind of text. I have also relied on other Japanese dictionaries, such as the Old Japanese dictionary by Omodaka (1967), the historically important Daigenkai 大言海 by Ōtsuki (1932), and a current compact dictionary, the Daijirin 大辞林 by Sanseidō (2005c). For Sino-Japanese words whose status was unclear, I have relied on the Sino-Japanese dictionary by Morohashi (1955). I have also used the Japanese-English dictionaries by Sanseidō (2005a, 2005b) to make sure no basic Japanese words of importance would be missed in the subdatabase.

There is an extensive body of literature available on loanwords in Japanese. Miller (1967) and Shibatani (1990) treat this subject at some length in their general surveys of the Japanese language. Satō (1982) and Tanaka (1978) also give a more general overview of the structure of the Japanese lexicon, which of course includes loanwords. There are also a number of monographs specifically devoted to the subject of loanwords in Japanese. However, in Japanese usage, the word gairaigo 外来語, literally ‘words that came from without’, usually refers to words from languages other than Chinese, mostly European languages. Thus Maeda (1922) and Yazaki (1963) almost exclusively discuss loanwords from languages other than Chinese, and Umegaki (1963), while discussing some aspects of the Chinese loanwords, puts his main focus on European languages.
Likewise, Loveday (1996), in his account of the history of Japanese language contact, does not dwell much on the role of Chinese, for which he is criticized by Miller (1998). The methodology used in this project, on the other hand, is intended to give a representative view of loanwords in Japanese from all sources. It is also important to remark at this point that the definition of loanword used in this project excludes any coinages made by Japanese speakers, even if they involve foreign lexical material, be it partially or wholly. This also gives a different perspective on loanwords in Japanese, since many traditional accounts have treated words borrowed from foreign sources and words coined using foreign material as the same.


3. Contact situations

Let us now consider the language contact situations relevant for Japanese, beginning with the neighboring languages other than Chinese, whose extensive period of language contact with Japanese merits its own section, and then turning to the various European languages.

Map 1: Japanese in its geographical setting

3.1. Neighboring languages

Austronesian substrate: As mentioned above, the origin of the Japanese language remains one of the unsolved problems in historical linguistics, and some of the theories put forward have involved an Austronesian element, either as a substrate or as part of a somehow mixed language. While Shinmura (1908) is probably the earliest proponent of an Austronesian substrate theory, Polivanov (1918) offers the first systematic proposal. He notes a number of phonological and morphological characteristics that set Japanese apart from Korean and other languages usually said to be genealogically related to Japanese, which he ascribes to an Austronesian influence on Japanese, as for instance the presence of some prefixes and the fact that open syllables are typical. These authors found a supporter in Izui (1953) who proposed a list of sound correspondences extending to about 55 Proto-Malayo-Polynesian


words, based on the reconstruction in Dempwolff (1934–1938). A more recent account of potential loans from Old Javanese in Kumar & Rose (2000) does not employ Proto-Malayo-Polynesian forms but Old Javanese, which would reflect a time depth of only roughly 2000 years. The biggest problem remains that there is no archeological proof for an Austronesian presence in Japan (Peter Bellwood, p.c.).

Korean: As a neighboring country of both China and Japan, Korea has played an immensely important role in transmitting Chinese cultural influence to Japan. In the 7th and 8th centuries CE, many highly skilled and Chinese-educated refugees from some defeated Korean states came to Japan, where they played an important role at the court and were able to maintain a high status due to their skills, quickly intermarrying with the Japanese aristocracy. It is assumed that due to their quick integration, they underwent rapid language shift from Korean to Japanese, but as most of the written records were in Chinese, the exact circumstances are unclear. Only borrowings from these periods offer some evidence for the Korean influence of that time. Often cited terms include the name of the former imperial capital Nara 奈良, which is said to come from the Korean nara 나라 ‘land’, and tera 寺 ‘temple’, from the Korean jeol 절 (Umegaki 1963: 41f.; Miller 1967: 238f.; Loveday 1996: 44f.; Tanaka 2002: 134–144).

Ryukyuan: As mentioned above, Ryukyuan is the only language variety that is indisputably related to Japanese, although some scholars claim it to be a dialect of Japanese. While Ryukyuan has been massively influenced by the language of its powerful neighbor, the influence of Ryukyuan on Japanese has been limited to concepts associated with the islands, such as gajumaru ガジュマル ‘banyan tree’.

Ainu: The relationship between Ainu and Japanese is controversial; some scholars link Ainu to the Korean-Altaic hypothesis (Patrie 1982).
It is clear, however, that the regions inhabited by the Ainu in northern Honshu and Hokkaido constituted the frontier for traditional Japanese settlement. Over the course of Japanese advancement towards the north starting from at least the 8th century CE, the Japanese had contact with the Ainu, who were continuously pushed back until the permanent Japanese settlement of the northern island of Hokkaido starting around 1868, after which the Japanese government followed a policy aimed at assimilating the Ainu. As in the case of Ryukyuan, the influence of Ainu on Japanese has been limited to concepts associated with the Ainu, such as s(h)ake 鮭 ‘salmon’ from Ainu cukipe. Cf. Umegaki (1963: 41f.), Miller (1967: 238f.), Loveday (1996: 44), Tanaka (2002: 134–144f.).

Other languages: Japanese has also borrowed a number of terms from other Asian languages, mostly from Southeast Asia (cf. Umegaki 1963: 58f., 171f.). In a period of heightened Japanese maritime activity between the 12th and 16th centuries, Japanese merchants were part of an extensive trade network, resulting in some loans. These connections were severed in the following period of self-isolation, but resumed in the Meiji era in the 19th century, culminating in the colonization and occupation of much of Southeast Asia by Japan in the Second World War. The number of resulting loans, however, is very low, with the most notable being kiseru キセル ‘pipe’, from Khmer khsier.

3.2. Chinese

The language with which Japanese has had the longest and most intensive documented contact is Chinese. Over a period of more than 1,600 years, Chinese has exerted an immense influence on Japanese, leading to an influx of a large number of vocabulary items into Japanese (cf. Yamada 1940). Because of the extent of the language contact, I will discuss it by linguistic period of Japanese.

Archaic Japanese: This refers to the language period immediately predating written records, corresponding to the historical Yamato Period (ca. 250 CE – 710 CE). Most scholars believe some type of irregular contact between China and Japan already occurred in the Yayoi Period (ca. 900 BCE – 250 CE), but as the number of Chinese-trained immigrants to Japan grows, so does the influence of Chinese culture. The Chinese script is brought to Japan in the 4th century CE, in the 6th century Buddhism arrives in Japan, and extensive Chinese-style reforms are instituted in 646. Thus, it is not unreasonable to assume that this huge Chinese influence resulted in some loans even in the time period prior to the first extant written records in Japan, from around the 8th century CE. This is the theory put forward by Karlgren (1926), who argued for a number of early Chinese loans into Japanese. Cf. also Kamei (1954). It has also been shown that a number of Sanskrit terms were borrowed through Chinese into Japanese, cf. Ueda (1922), Loveday (1996: 45) and Tanaka (2002: 162–164).

Old Japanese and Late Old Japanese: The former is usually equated with the historical Nara Period (710–794), while the latter is equated with the historical Heian Period (794–1185). The Nara period is the one with the earliest written records of Japanese, even though some ritual texts that were recorded in that period may originally have been created a century or so prior to their first recording. As mentioned above, in the preceding Yamato period, the Sinicization of Japanese society had come to full bloom.
By the time of the Nara period, most of the Japanese elite had become bilingual in Japanese and Chinese and their children were also sent to China for study. The first extant written records in Japan date from the 8th century and are almost exclusively written in Chinese, with the exception of poems and ritual texts. The Sino-Japanese pronunciation of the characters, called onyomi 音読み, reflects the Chinese standard of these times: the first type of onyomi, called go’on […] ‘(traditional) drum’, annai 案内 ‘to guide’ and yōsha 容赦³ ‘to forgive’ and might have been coined by the Japanese themselves (cf. also Vance 1987: 167–169).

Middle Japanese: This spans the three historical periods that are usually referred to as the Japanese Middle Ages: Kamakura, Muromachi and Azuchi-Momoyama (1185–1603). After Sinicization had reached its height during the Nara Period (710–794), in the following Heian period (794–1185) contact with China was severely curtailed following the fall of the Tang dynasty. During this time, a flourishing Japanese literature emerged, even though Chinese remained the language of administration at all times until the Edo Period (1604–1868). Bilingualism among the nobility quickly waned, being replaced by a diglossic-like situation with Classical Chinese as the language of administration and other formal matters, but with Japanese used in all other situations. Indeed, this situation became even more complicated, with the written Japanese of the Heian period emerging as the language of high literature by later periods as opposed to the spoken vernacular. As for the Chinese written language used in Japan, a bewildering number of different styles emerged (cf. Habein 1984), some even read in a Japanese word order, by way of diacritical marks. Thus, even after the decline of bilingualism among the elite, the role of Chinese in Japanese society remained important, resulting in a steady stream of loanwords incorporated into the Japanese language.
The words were transmitted in writing, which means that the pronunciation of the Chinese loans mainly followed the Sino-Japanese pronunciation from the Heian period. As the Japanese kept using Chinese as the written language of administration even after the decline of bilingualism, the gap between the Chinese language used at the Chinese court and at the Japanese court widened. The meaning of some words diverged considerably, as for instance ryōri 料理 ‘to cook’, which in Chinese means ‘arrange’. Furthermore, a small number of words appear to have been coined in Japan as well (the evidence for this is largely negative, i.e. they are accounted for in Japanese sources but not in Chinese ones). Some direct contact with China ensued, leading to some degree of direct exchange throughout the Kamakura and Edo periods.

The following language period, called Early Modern Japanese, is usually equated with the Edo Period (1604–1867). It is famous as the period of isolation (called sakoku 鎖国), when contact with the outside world was severely limited. Beginning in 1639, only a few nations were permitted to continue trading with Japan, namely the Ainu in the north, the Koreans in Tsushima, and the Chinese and the Dutch in Nagasaki. Thus, there was some actual contact with Chinese merchants

3 Originally 用捨 in the meaning of “using what is necessary and discarding what is not”. Present meaning attested in 1699.


during this period, which led to some words being borrowed from Chinese. Unlike the Chinese borrowings of earlier periods, these were usually borrowed from the spoken language, which is why these loans are sometimes called “phonetic loans”. While European proselytization efforts in China brought about linguistic change there, the Japanese government had prohibited proselytization activities, allowing only the Dutch to trade with Japan. In 1720, however, when some of the restrictions were lifted, some Chinese translations of European scientific texts found their way to Japan and galvanized the Japanese rangaku 蘭学 or ‘Dutch studies’ scholars. After the anatomy book Kaitaishinsho was published in 1774, a publication frenzy ensued, leading to the publication of many works in medicine, philology, astronomy, geography, botany, chemistry, physics and other fields (cf. Zhu 2003).

Modern Japanese: This starts with the Meiji Period (1868–1912), and is followed by the Taishō (1912–1926), Shōwa (1926–1989) and Heisei (1989–) periods. When Japan was forcibly opened up by Western powers in the 1850s, the old traditional structures of the shogunate gave way to the new rule of the Emperor, who proclaimed the Meiji era in 1868. Under his rule, Japan sent out students to various Western countries and invited scholars from Western countries in order to learn from their advanced technologies and scientific knowledge. Japan was able to modernize within quite a short period of time, much earlier than China. During the early Meiji period, a lot of direct borrowing took place, and this gave way to a more distinct bilingual setting later as the group of bilinguals, even though influential, always remained small. Official attempts at standardization favored the use of the traditional Sino-Japanese material to express the variety of new concepts and objects that were brought into Japanese culture. This has been extensively researched in Japanese philology, cf.
Suzuki (1981), Shen (1994) and Zhu (2003), and also Tanaka (2002: 156–162). Some old terms from classical texts were reused to express new concepts, for instance keizai 経済 ‘economy’ from jīngshì-jìmín 經世濟民 ‘to govern and benefit the people’. But many new terms were coined using the Sino-Japanese material available to the educated Japanese elite, such as sūgaku 数学 ‘mathematics’ from sū 数 ‘number’ and gaku 学 ‘study’. At the beginning of the Meiji era the technical terminology was highly fluid: at some points, especially at the beginning of the modernization, direct borrowings from European languages dominated, co-occurring with some short-lived coinages of new terminology using Sino-Japanese material as well. Systematic coinage of terminology using Sino-Japanese material began in the late 1880s, a process that lasted until the end of the Meiji era in 1912, by which time the newly established terminology was firmly in place. Between the 1880s and 1910s, Japanese translations of Western works and thus also the newly coined vocabulary were widely used in China, leading to a phenomenon of “backborrowing” of Japanese words coined using Sino-Japanese material. For instance Pan et al. (1993: 389–391) remark that out of 1500 borrowings, 359 are from Japanese, with 92 terms “purely” Japanese, i.e. originally Japanese, 67 from Japanese


coinages reusing old terms from Classical texts, and 200 from Japanese coinages involving Chinese lexical material (also cf. Wang & Zou 1998: 262f.). More recently, after the Meiji era, there have been some loans from Chinese, even though they are few in number. These include words like rāmen ラーメン ‘type of Chinese noodle’, which were also borrowed from spoken Chinese and can thus also be counted among the phonetic loans from Chinese. However, in contrast to the words borrowed into Middle Japanese and Early Modern Japanese as discussed above, the words borrowed in more modern times are clearly perceived as foreign.

3.3. European languages

Since the 16th century, Japan has been in contact with various European powers as well as the United States, leading to a variety of contact situations involving Japanese and several European languages. It will be expedient to divide the discussion into three periods (see also Loveday (1996: 50–76) and Umegaki (1963: 46–95) for more detailed overviews of the history of language contact with various European languages).

First European contact and Japanese self-isolation: The first Europeans to come to Japan were from Portugal and Spain beginning in the 1540s, followed by Dutch and English merchants. In this early phase, the Portuguese had by far the longest and most extensive trade links, from as early as 1542. Christianity was also transmitted through missionaries travelling on merchant ships, and thus a number of mostly Portuguese words from the context of Catholicism and trade have been recorded, even though by modern times most of them have fallen out of use, with the exception of words like misa ミサ ‘Catholic mass’ and kappa 合羽 ‘raincoat’. While the Dutch came to Japan only in 1609, they were to outlast all other Western powers, as in 1639 the shogunate chose to implement the sakoku 鎖国 policy, isolating Japan from the outside world, with the exception of Chinese and Dutch merchants. Thus starting from 1639, official contact was extremely limited, with a handful of local interpreter families being the only group of Japanese officially allowed to learn Dutch. However, in 1720, the shogunate partly softened its stance by relaxing the ban on the import of European books, which led to a number of Western texts finding their way into the country. This resulted in the establishment of an academic discipline called rangaku 蘭学 ‘Dutch studies’, which coincided with an influx of a large number of vocabulary items from Dutch, not just from trade but also from said studies, which were mostly in medicine and natural sciences, but also some social sciences (cf.
Umegaki (1963: 59–70), Saito (1967) and Vos (1963) for a list of words from Dutch). Many of these words have also fallen out of use, and have been supplanted by other words. As far as other countries are concerned, contact was limited to chance encounters due to the sakoku policy. At the beginning of the 19th century the shogunate imported some weapons from France, which was the leading military power of the


time. Umegaki (1963: 70f) mentions some French advisers that were brought in, resulting only in a small number of loans.

Modernization and Westernization: The period of sakoku ended when Japan was forced to open up to the outside world by the United States in 1854, leading to the downfall of the shogunate and to a new government under the Emperor, who proclaimed the Meiji period in 1868. Trade relations with various Western countries intensified, and at the same time, the Japanese government sent students to the West to study modern science and technology and invited Western experts to Japan to instruct their students and officials. Thus, during the early Meiji era, the modernization efforts were in the hands of a small group of bilingual speakers, leading to an influx of loans from various European languages. This later changed as the government increased its standardization efforts, establishing many Sino-Japanese terms. However, the influence of European languages never waned, and even though the vast majority of the population never became bilingual, new borrowings spread. The influence of English and German was greatest in this period, with other languages such as French, Russian and Italian being largely confined to specific domains. The dominance of English then continued to grow at the expense of the other languages, also aided by the popularization of American mass culture throughout the Taishō period until the beginning of the Shōwa period. It only came to a halt in the 1930s, when nationalist and militarist tendencies led to a purification effort to rid the language of Western elements, leading to the coinage of Sino-Japanese terms such as yakyū 野球, from ya 野 ‘field’ and kyū 球 ‘ball’, alongside the existing loan from English, bēsubōru ベースボール ‘baseball’.
Globalization: The purification effort did not last long, and was effectively ended by the Japanese defeat in the Second World War, followed by the first occupation of the country in its history, by the United States. At the same time, English became the most important language throughout the world. Thus, the influence of English, which was previously already on the rise, only grew, leading to an enormous influx of lexical material.⁴ English also served as a conduit for words from other languages. Often in such cases it remains ambiguous whether the word in question was borrowed directly from the source language or came through English. The increased influence of English has led to some Japanese coinages involving English lexical material, such as the famous sarariiman サラリーマン ‘salaried person’. Umegaki (1963) also includes terms such as kanningu カンニング ‘cheating (on a test)’ from English cunning, which seem not to be Japanese coinages but rather cases of semantic change. Cf. the notion of ‘lexical interference’ in Weinreich (1974) and Rohde et al. (1999, 2000) for a model of semantic change that takes place when words are borrowed.

4 Ueno (1980) has shown that the percentage of English loanwords has increased substantially after World War II.


4. Integration of loanwords

As we have seen in § 3.2, Chinese lexical material has historically been integrated into the language to such an extent that it has become available for the coinage of new terms by Japanese speakers themselves. For the last hundred years or so, this process has also been accompanied by an increased influence of European languages, chiefly English. With the exception of the nationalist backlash in the 1930s and 1940s, no major acceptance issues have occurred. However, it seems that while the Chinese lexical material is no longer perceived as foreign, English lexical material is sometimes perceived as opaque jargon (Yamada 2005), and is used in a highly specialized way in certain domains such as advertising (Loveday 1996). These different layers of lexical material, also called strata, follow different phonological, morphosyntactic and graphematic rules, which I will discuss below, in comparison with the native material.

4.1. Lexical strata in Japanese

When discussing the lexicon of Japanese, usually three different lexical strata and a category of mixed-stratum words are distinguished.⁵ These strata reflect the perception of native speakers rather than necessarily the words’ true etymology, an issue that I will come back to later.

Native Japanese vocabulary: Words that are perceived as not having been borrowed from any other language. In Japanese they are called either wago 和語 ‘Japanese words’ or Yamato kotoba 大和言葉, after the first Japanese kingdom established in the 4th century CE. There are a number of words that are of foreign origin but are perceived as Native words. This includes the words from the language contact situations discussed above in 3.1 and 3.2. Since the borrowing would have occurred in prehistoric times, many of these suggestions remain problematic. However, the number of proposed loans is small, not exceeding 5% of the Native words in the subdatabase (cf. also Takeuchi 1982). Occasionally, words that were borrowed in much more recent times have also come to be seen as Native words. For instance, the word for ‘pumpkin’, kabocha かぼちゃ, was originally borrowed from the Portuguese word for the country Cambodia, but is now regarded as a Native word.

Sino-Japanese vocabulary: Called kango 漢語 (‘Hàn words’) in Japanese, this type refers to words either perceived to be borrowings from various varieties of Chinese or to be Japanese coinages using Sino-Japanese lexical material. It was very important for the project to differentiate between the two, for two reasons: the project followed a definition of loanwords which would exclude any Japanese

5 Mimetic words (onomatopoetic and sound-symbolic words) in Japanese are usually regarded as a subset of the Native vocabulary, but due to their phonological characteristics, they are sometimes considered as a stratum in their own right (cf. Hamano 1998).


neologisms, even if they involved foreign material, thus creating the need to draw this distinction clearly. But this was also an important finding for the study of Japanese loanwords in general, since in most studies this distinction is not made. One of the biggest challenges was to ascertain which kango words were borrowed from Chinese and which words were coined by Japanese speakers. In the subdatabase, at least 11% of words from the Sino-Japanese stratum can be considered Japanese coinages rather than loans, a significant number (cf. also Matsui 1982).

Non-Chinese Foreign vocabulary: Words that are perceived to be borrowings from languages other than Chinese since the exposure to the West in the 16th century, which are called gairaigo 外来語 (‘words that came from outside’) in Japanese. Since the majority of words from this stratum entered the language quite recently, almost all words that have been classified as Non-Chinese Foreign in this subdatabase could be verified as borrowed, the only exception being gādoman ガードマン ‘guard’, which is likely to be a Japanese coinage (cf. also Matsui 1982).

Hybrid vocabulary: Known as konshugo 混種語 ‘mixed words’ in Japanese, these are complex lexemes that consist of morphemes from different strata (cf. also Yamada 1940: 483–495).

The three main strata can be said to reflect the linguistic history of the Japanese language. The noted exceptions notwithstanding, by and large the speakers’ perceptions and the linguistic histories of the words match: in general, the Native stratum does comprise the inherited vocabulary, and the Sino-Japanese and Foreign strata do contain borrowings from different time periods and donor languages. The three strata show clear differences in phonology, morphosyntax and script, which will be discussed one by one in the following subsections. The strata are also used in synchronic phonological generalizations (e.g.
Ono 2002; Ota 2004), and some psycholinguistic research such as Moreton & Amano (1999) seems to suggest that strata indeed play a role in speech perception.

The National Institute of the Japanese Language, known in Japanese as Kokuritsu Kokugo Kenkyūsho 国立国語研究所 and henceforth abbreviated as Kokken 国研, has conducted some statistical research on the different strata (Kokken 1962, 1971, 2005 and Miyajima 1997). In Kokken (1962), the contents of 90 contemporary magazines were counted to determine the token and type frequencies of the most frequent words. 530,000 words were counted, with roughly 40,000 different types. Table 1 contrasts those results with the type frequency as found in the subdatabase. We can already see a major difference between the two studies: in the present database, which uses a word list limited to basic vocabulary, the number of Native Japanese items is far higher than in the Kokken study, and the other three strata have lower percentages than in the Kokken study. In the Kokken study, loanwords constitute roughly 60% of the lexicon, whereas in the database non-loanwords account for more than 60%. This is an important and meaningful finding: since numbers based on basic vocabulary are usually not cited in the literature, these figures constitute an important contrast to the Kokken figures.
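The comparison can be made concrete with a little arithmetic. The following minimal sketch in Python takes the percentages reported in Table 1 and adds up the share of all strata other than Native Japanese; the variable and function names are illustrative assumptions and are not part of the project's tooling. Two caveats apply: the two columns are not strictly comparable, since one reports token frequencies and the other type frequencies, and stratum membership is only a proxy for loanword status.

```python
# Percentages as reported in Table 1 (one decimal place each).
KOKKEN_1962_TOKENS = {
    "Native Japanese": 53.9,
    "Sino-Japanese": 41.3,
    "Non-Chinese Foreign": 2.9,
    "Hybrid": 1.9,
}
SUBDATABASE_TYPES = {
    "Native Japanese": 61.2,
    "Sino-Japanese": 30.2,
    "Non-Chinese Foreign": 6.3,
    "Hybrid": 2.3,
}


def non_native_share(strata):
    """Sum the percentages of every stratum except Native Japanese."""
    total = sum(pct for name, pct in strata.items() if name != "Native Japanese")
    return round(total, 1)


def difference_by_stratum(first, second):
    """Percentage-point difference (second minus first) for each stratum."""
    return {name: round(second[name] - first[name], 1) for name in first}


if __name__ == "__main__":
    print(non_native_share(KOKKEN_1962_TOKENS))
    print(non_native_share(SUBDATABASE_TYPES))
    print(difference_by_stratum(KOKKEN_1962_TOKENS, SUBDATABASE_TYPES))
```

On these figures, the three non-Native strata together account for 38.8% of the types in the subdatabase, which is the complement of the 61.2% Native share.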


Table 1: Token and type frequencies of the four strata

Stratum               Token frequency in Kokken (1962)   Type frequency in subdatabase
Native Japanese       53.9%                              61.2%
Sino-Japanese         41.3%                              30.2%
Non-Chinese Foreign   2.9%                               6.3%
Hybrid                1.9%                               2.3%

4.2. Phonological integration

Japanese is known to have fairly restricted phonotactics, with no tautosyllabic consonant clusters allowed and only a very limited set of closed syllables. This is reflected in the phonological adaptation that all loanwords undergo, but again we can see differences by stratum.

Native Japanese stratum: There are some phonological processes that are largely limited to this stratum: a no longer productive process called “morpheme final vowel alternation”, as described by Vance (1987: 149–155), and another process called rendaku 連濁 or “sequential voicing”, which is only frequent in Native Japanese (Vance 1987: 133–148; Shibatani 1990: 173), although it is known to affect a few well-established words from the Sino-Japanese and even the Non-Chinese Foreign stratum (Martin 1952: 48; Nakagawa 1966: 308; cf. also the discussion in Takayama 2005 and Irwin 2005).

Sino-Japanese stratum: The different dialects of Middle Chinese, which were most likely the source for both go’on and kan’on discussed above, did not have any consonant clusters, but they did have glides and some syllable-final consonants. On the other hand, Old Japanese has been reconstructed to have a fairly simple CV syllable structure with no long vowels, no syllable-final consonants, and also no voiced obstruents word-initially (Umegaki 1963: 96f.). The language contact between Middle Chinese and Old Japanese seems to have led to the following:

• While palatalization⁶ in the Native stratum is confined to consonants before /i/, in the Sino-Japanese stratum it can also occur before /a/, /o/ and /u/ (cf. also Vance 1987: 28).
• Syllable-final nasals: /m/ and /n/ were borrowed as such, but by the 12th century these seem to have merged into one phoneme /N/, which is the only consonant permissible in word-final position. /ŋ/, on the other hand, was usually replaced by a high vowel, creating a diphthong which has undergone further sound change towards a long vowel in Modern Japanese (Vance 1987: 52–59, Okumura 1972).

6 Strictly speaking, palatalization encompasses two distinct phenomena: while alveolar obstruents and /h/ are replaced by their palato-alveolar and palatal counterparts, the other consonants are merely palatalized.


• Syllable-final plosives: /t/ and /k/ in the final position of the first morpheme in two-morpheme Sino-Japanese words form a long consonant with the following segment, in the case of the former when followed by any voiceless obstruent, and in the case of the latter when the following morpheme begins with /k/. In all other positions, morpheme-final /t/ and /k/ take on an epenthetic high vowel⁷ and become /tsu/, /tʃi/ or /ku/, /ki/ respectively.⁸ For instance, the morpheme ‘difference’ is /beQ/ (Q standing in for whatever voiceless obstruent is following) or /betsu/ 別, as is exemplified in words like /beQ.ke/ 別家 ‘different family’ vs. /betsu.mei/ 別名 ‘alias’. The development of /p/ is less predictable: in the majority of cases, it has become /fu/, which in conjunction with other sound changes has resulted in a long vowel after the loss of intervocalic /f/. In some cases, it has become /t/, where it follows the same rules as original /t/ (cf. Vance 1987: 155–164).

There is also a relatively rare phonological process exclusive to Sino-Japanese words called renjō 連声 ‘liaison’ (Vance 1987: 164–167).⁹ There is some debate as to whether the above innovations were all exclusive to Chinese borrowings, since the Native Japanese stratum itself has undergone a considerable amount of sound change from Archaic Japanese to Modern Japanese, in the course of which the phoneme inventory has widened.

Non-Chinese Foreign stratum: Even though language contact with European languages has been going on since the 16th century, it is English that has had the most profound impact on Japanese for the last hundred years or so. This has led to a phonological system for loanwords that is “in flux”. Many accounts distinguish between two varieties, a “conservative” one and an “innovative” one (Bloch 1950, Vance 1987), and Loveday (1996) goes beyond that by positing a “scale of assimilation”.
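Returning briefly to the Sino-Japanese stratum, the rule for morpheme-final /t/ and /k/ stated above lends itself to a small worked illustration. The Python sketch below is only a schematic rendering of that rule under simplifying assumptions: morphemes are given in a Hepburn-style romanization, the first morpheme is supplied as a stem plus its final plosive and epenthetic vowel, voiceless obstruents are recognized by the first letter of the following morpheme, and the long consonant is written with Q as in the examples above. The function name and data layout are hypothetical, and the examples with gaku 学 are added here for illustration rather than taken from the discussion above.

```python
# First letters of Hepburn-style onsets treated as voiceless obstruents in this
# sketch (an assumption): k, s/sh, t/ts, ch, h, f, p.
VOICELESS_OBSTRUENT_INITIALS = set("kstchfp")

# Romanized free forms of morpheme-final /t/ and /k/ plus epenthetic high vowel.
EPENTHETIC_FORMS = {
    ("t", "u"): "tsu",
    ("t", "i"): "chi",
    ("k", "u"): "ku",
    ("k", "i"): "ki",
}


def combine(stem, final, vowel, second):
    """Join a first morpheme ending in /t/ or /k/ with a second morpheme.

    stem   -- first morpheme without its final plosive, e.g. "be" for betsu
    final  -- the morpheme-final plosive, "t" or "k"
    vowel  -- the epenthetic high vowel used elsewhere, "u" or "i"
    second -- the following morpheme in romanization, e.g. "ke"
    """
    if (final, vowel) not in EPENTHETIC_FORMS:
        raise ValueError("expected final 't' or 'k' and vowel 'u' or 'i'")
    onset = second[0]
    # /t/ forms a long consonant before any voiceless obstruent,
    # /k/ only before another /k/.
    long_consonant = (
        (final == "t" and onset in VOICELESS_OBSTRUENT_INITIALS)
        or (final == "k" and onset == "k")
    )
    if long_consonant:
        return f"{stem}Q.{second}"
    return f"{stem}{EPENTHETIC_FORMS[(final, vowel)]}.{second}"


if __name__ == "__main__":
    print(combine("be", "t", "u", "ke"))   # first morpheme of 別家
    print(combine("be", "t", "u", "mei"))  # first morpheme of 別名
    print(combine("ga", "k", "u", "kō"))   # illustrative: 学 + 校
    print(combine("ga", "k", "u", "sei"))  # illustrative: 学 + 生
```

The sketch deliberately stops at the abstract Q notation and does not model any further adjustments to the following consonant, nor the less predictable development of /p/.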
In the conservative variety, Foreign borrowings are adapted to a greater extent than in the innovative variety. The difference between the two lies primarily in the availability of some additional consonant onsets in the innovative variety.

• Consonant onsets: In the innovative variety, palatalization can appear before any vowel, i.e. unlike in the Sino-Japanese stratum, also before /e/: chero チェロ vs. sero セロ ‘cello’. While /f/ and /ts/ can only appear before /u/ in the conservative variety, they can appear before other vowels in the innovative one: firumu フィルム ‘film’, which would be fuirumu フイルム in the conservative variety. Also, while in the conservative variety /t/ cannot occur before /i/, the innovative variety allows for it: pātii パーティー ‘party’. Finally, while the conservative variety has /w/ only before /a/ and no /v/ at all, replacing it by /b/, there is some disagreement as to whether the innovative variety indeed allows for /w/ before other vowels, and for /v/ at all (Bloch 1950; Vance 1987; Umegaki 1963: 124–135).

• Final consonants: If a syllable ends in an alveolar or velar nasal in the source language, it can be rendered as a final /N/ in Japanese, but all other final consonants must be followed by an epenthetic vowel. Regarding /Q/, the innovative variety is said to allow for voiced obstruents in that position as well, in words such as /beQdo/ ベッド ‘bed’, though the matter remains somewhat controversial (Vance 1983: 42f.).

One type of phonological adaptation that is common to both varieties involves tautosyllabic consonant clusters: syllables beginning with a consonant cluster are usually made to conform to a /CVCV/ pattern by inserting /u/ as an epenthetic vowel, or /o/ after /t/: street becomes sutoriito ストリート (Vance 1987: 48–53). It should be noted that earlier loans behave somewhat differently, especially in the matter of epenthetic vowels. For instance, the English word strike has been borrowed into Japanese twice, first in the form sutoraiki ストライキ, and then as sutoraiku ストライク. The former refers to strikes as in labor disputes, and the latter to strikes in sports such as baseball. Another phenomenon often found with Foreign borrowings is clipping, as in baito バイト from arubaito アルバイト ‘part time job’, from German Arbeit ‘work’ (cf. Loveday 1996: 151).

⁷ In certain environments, these vowels usually become devoiced or are omitted altogether. Cf. Vance (1983: 48–55), Bloch (1950).
⁸ It seems that morpheme-final /t/ was maintained at least in formal speech as late as the 17th century (Toyama 1972).
⁹ One of the most famous examples of this process is tennō 天皇 ‘Japanese emperor’, from ten 天 ‘heaven’ and ō 皇 ‘emperor, king’.

4.3. Morphosyntactic integration

Japanese is an agglutinative language, but only verbs, adjectives and some auxiliaries, including the copula and other suffixal verbs, are inflected (Shibatani 1990: 221f.). Verbs are usually divided into the consonantal and vocalic conjugation classes, with the two verbs kuru 来る ‘come’ and suru する ‘do’ being irregular. Nouns are not inflected, but they do appear with a variety of postpositional case and focus markers.

It has been observed that almost all inflected verbs and adjectives are from the Native stratum, e.g. verbs such as tabe-ru 食べる ‘eat’ or nom-u 飲む ‘drink’, and adjectives such as haya-i 速い ‘fast’ or hiro-i 広い ‘wide’. Native words like these can function as predicates on their own with the appropriate inflectional endings, whereas words from both the Sino-Japanese stratum and the non-Chinese Foreign stratum usually cannot be inflected in the same manner, but must instead rely on auxiliaries and particles. For instance, the Sino-Japanese word benkyō 勉強 ‘study’ can be used as a predicate only in conjunction with the auxiliary verb suru ‘do’, and the Foreign word romanchikku ロマンチック ‘romantic’ must combine with a copula such as da in predicative use; when used as an attribute, it needs the particle na, as in romanchikku na hito ロマンチックな人 ‘a romantic person’. This is why the former are sometimes called suru-verbs and the latter na-adjectives. There is disagreement in the literature as to whether words used in this manner belong to a different word class, or even whether they should be regarded as verbs and adjectives at all. In Japanese philology, suru-verbs and na-adjectives have traditionally been assigned their own word classes (Shibatani 1990: 215–217; Backhouse 1984).

Two qualifications are in order. First, these patterns are also available to Native roots, but their number remains low compared to the Native roots that inflect. Second, there are some Sino-Japanese and Foreign words that can be inflected directly. For one, while most Sino-Japanese roots are bimorphemic, there are a number of monomorphemic Sino-Japanese roots that have developed a higher degree of fusion with the auxiliary verb suru. We can distinguish two groups here: roots ending in /N/ and roots that do not. For instance, the root shin 信 ‘trust’ followed by suru became shinzuru 信ずる, a case of sequential voicing very rare in Sino-Japanese loanwords and clearly indicative of fusion. At the same time, in Modern Japanese, speakers have developed a variant shinjiru 信じる, which is conjugated not like suru but like verbs from the vocalic conjugation class. Roots not ending in /N/, on the other hand, do not show such voicing assimilation; for instance, ai 愛 ‘love’ becomes aisuru 愛する. However, this group too has developed a variant, aisu 愛す, which in turn follows the consonantal conjugation. While the shinjiru type variants have largely replaced the shinzuru type in popular usage, the aisu type and aisuru type variants are used alongside each other: aisu tends to occur in a few forms such as the potential and the negative, while aisuru occurs elsewhere.

As far as Foreign words go, only a minuscule number of verbal borrowings are inflected. All take the ending -ru, are conjugated in the consonantal verb class and have a highly colloquial flavor, such as saboru サボる ‘play truant’ from French sabotage (Loveday 1996: 117f.; cf. also Backhouse 1984 on adjectives). This shows that Foreign words have not reached as high a degree of morphosyntactic integration as the Sino-Japanese ones.

4.4. Integration into the script

Japanese employs a mixture of three scripts: two syllabaries called hiragana ひらがな and katakana カタカナ, and a logographic script called kanji 漢字. The rule of thumb is that all Native words are written either in kanji or hiragana, all Sino-Japanese words in kanji, and all Foreign words in katakana. The details are more complex, which is why I will discuss each script by itself (cf. also Hayashi 1982 and Gottlieb 1995: ch. 1).

4.4.1. Kanji

Kanji is the logographic script borrowed from Chinese. It is used for most lexical words: nouns, personal pronouns and the stems of verbs and adjectives, of both the Native Japanese and Sino-Japanese strata. With Sino-Japanese words, each character reflects a monosyllabic Chinese pronunciation (called on’yomi ‘sound reading’ in Japanese), and characters are combined to form larger lexical units as in Chinese, for instance kazan 火山 ‘volcano’ from ka 火 ‘fire’ and san 山 ‘mountain’. Native words, on the other hand, usually correspond to a single Chinese character chosen on the basis of its meaning, and seldom to multi-character combinations. This way of pronouncing kanji like Native words is called kun’yomi ‘meaning reading’. For instance, the Native words for ‘fire’ and ‘mountain’ are hi 火 and yama 山 respectively. As for non-Chinese Foreign words, these used to be written in kanji as well, but are now almost exclusively written in the katakana script, as described below.

There is also the phenomenon of ateji 当て字 (‘directed characters’, cf. Tajima 1998: 452–461), where Chinese characters are applied to non-Chinese words not according to their meaning but according to their sound. For instance, kega 怪我 ‘wound’ is a Native word, yet the Chinese characters, meaning ‘suspicious’ and ‘self’ respectively, do not reflect the meaning of the word, but were chosen only for their pronunciation. Earlier, ateji were also used for words of Foreign origin, though very few still persist, such as kan 缶 ‘can’.

Another script-related phenomenon is the reanalysis of originally Native terms as Sino-Japanese terms. For instance, the word for ‘Japanese radish’ was originally a Native word pronounced ōne, which consisted of two morphemes, ō- ‘big’ and ne ‘root’. This word was accordingly written with the corresponding Chinese characters meaning ‘big’ and ‘root’, namely 大 and 根. Over the course of time, 大根 was reanalyzed as a Sino-Japanese word and took on the pronunciation according to the Sino-Japanese on’yomi of the two characters, 大 dai and 根 kon, resulting in the word still used today, daikon. Cf. also Yamada (1940: 468–483).

4.4.2. Hiragana

The hiragana script was originally developed from the cursive writing style of Chinese calligraphy and was first used in diaries and novels that were written in the Japanese vernacular, as opposed to the more formal writing, which was done in Chinese. The script is currently used for most function words and for verb and adjective endings. Hiragana is also used in cases where a word originally written in kanji has grammaticalized into a function word; for instance, the verb miru ‘to see’ is usually written 見る, but in its function as a conative auxiliary, it is usually written as みる, as in tabete miru 食べてみる ‘to try to eat’. Furthermore, hiragana can also be used instead of any Chinese character deemed too obscure or difficult, which is usually the case with characters outside of the set of 1,945 jōyō kanji 常用漢字 (‘frequently used characters’) prescribed by the Ministry of Education for compulsory school education.

4.4.3. Katakana

The katakana script was originally developed from shorthand renderings of Chinese characters and used in conjunction with Chinese characters as a reading aid. The script is now used to write non-Chinese Foreign loanwords and any foreign names that are not part of the Sinosphere. As mentioned above, some Foreign words used to be written in kanji as well, but this is no longer the case. For instance, buriki ‘tinplate’ used to have two kanji spellings: 錻力, which is a case of ateji, and 鉄葉, which is a case of semantically-based kanji assignment with the characters for ‘steel’ and ‘leaf’; both have been discarded today in favor of the katakana spelling ブリキ. Katakana are also often used for scientific terms such as animal and plant names as well as mineral names. For instance, hito ‘human being, person’, usually written with the kanji 人, is written ヒト in katakana when referring to humans as a species.
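The rule of thumb stated at the beginning of this section (Native words in kanji or hiragana, Sino-Japanese words in kanji, Foreign words in katakana) can be checked mechanically against a word's spelling. The sketch below assigns each character to a script by its Unicode block; the function names, the `EXPECTED` mapping and the coarse block boundaries are assumptions made for illustration, and the check deliberately ignores the exceptions discussed above.

```python
def script_of(ch):
    """Classify one character by Unicode block (coarse ranges)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:  # CJK Unified Ideographs, basic block only
        return "kanji"
    return "other"


def scripts_in(word):
    """Scripts used in a word, in order of first appearance."""
    return list(dict.fromkeys(script_of(ch) for ch in word))


# Rule of thumb from the text, encoded as a hypothetical lookup table.
EXPECTED = {
    "Native": {"kanji", "hiragana"},
    "Sino-Japanese": {"kanji"},
    "Foreign": {"katakana"},
}


def follows_rule_of_thumb(word, stratum):
    """True if every script in the spelling is expected for the stratum."""
    return set(scripts_in(word)) <= EXPECTED[stratum]


print(scripts_in("食べる"))                       # Native verb: stem plus ending
print(follows_rule_of_thumb("火山", "Sino-Japanese"))
print(follows_rule_of_thumb("ヒト", "Native"))     # species name in katakana
```

The last call returns False, which is the expected outcome for a Native word written in katakana as a species name: the rule of thumb is a default, not an exceptionless regularity.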

5. Grammatical borrowing

I will now briefly turn to grammatical borrowing in Japanese. Owing to its long history of language contact, Japanese has also been influenced in terms of grammar. As described in section 3.2, Japanese has been heavily influenced by Chinese in many ways, which led to text styles with varying degrees of Sinicization. This has had an influence on the grammar of Japanese as well. Yamada (1940: 494f.) lists some of the phenomena believed to be borrowings from Chinese, such as the conjunctions narabi ni 並びに and oyobi 及び, which are not so much borrowings from Chinese as calques based on Chinese characters. Other grammatical influences cited include the use of the attributor no as a way of marking nominalized clauses and the so-called inversion, or tōchihō 倒置法 (cf. Hinds 1986: 51). The intonation pattern clearly shows that the postposed element is an afterthought, so the claim in Yamada (1940) that this is due to Chinese influence remains unconvincing. Cf. also Shibatani (1990: 259) and Wenck (1974: 774–749).

Similarly, with the onset of large-scale translation work from the major European languages following the Meiji Restoration, there are claims that there have been grammatical influences from European languages as well, mainly from English, but also some from Dutch due to the length of language contact between the two languages during the Edo period (Miura 1979; Loveday 1996: 55f.). The most notable influences are the use of the old demonstrative pronoun kare 彼 as a third person singular pronoun, and the use of the passive, which usually has an adversative connotation, without such a connotation, similar to what has been discussed as a European influence on Chinese (Yang 2006).

6. Numbers and kinds of loanwords

As mentioned in §2, I have used the meaning list provided by the Loanword Typology project, consisting of 1,460 entries grouped into 24 semantic fields. The entries were additionally coded by the project for “semantic word class”. I will first discuss the total percentages of loanwords, and then turn to the results for all loans grouped by semantic word class and by semantic field.

Christopher K. Schmidt

The 1,460 entries on the meaning list correspond to 1,975 unique words as entered into the database. The numbers differ not only because one concept may correspond to several words, but also because one word may be linked to several concepts. Overall this has resulted in roughly 1.5 words per concept for this database, but a more thorough analysis of this and all possible lexical relationships is definitely warranted. As shown in Table 2, loans from Chinese take up the biggest share of loanwords by far (around 28%). The figures in the tables refer to loanwords as defined by the project, which means that neologisms coined by the Japanese using Sino-Japanese lexical material are counted with the non-loanwords and not included in the figures for Chinese. As mentioned in section 4.1, Japanese coinages make up around 11% of all Sino-Japanese items in the subdatabase, which accounts for the different percentages for Sino-Japanese words in Table 1 and for loans from Chinese in Table 2. After loanwords from Chinese, the next biggest group are those from English, with around 6%. All other languages, i.e. various European languages other than English as well as the neighboring languages other than Chinese, together account for only a little more than 1% of the words.

Semantic word class: If we look at Table 2, we can see that nouns (words referring to things or entities) were borrowed much more heavily than the other classes, with a little less than half of the nouns in the database being loans. Leaving aside adverbs, which were not borrowed at all in the database, roughly one quarter of function words and adjectives were borrowed, followed by about one fifth of verbs.

Comparison of loans from Chinese and English reveals an interesting contrast: while Chinese loans are present in all semantic word classes except for adverbs, English loans are predominantly nouns, and the minuscule number of loans from other languages are exclusively nouns. This fact is interesting in itself, since we have seen in the previous section that the means for morphosyntactic integration of Chinese and non-Chinese lexical material are identical. The same trend also obtains for a larger sample of words: Ozawa (1976), in his study of 7,045 English loans taken from a dictionary, concludes that 85% of them were borrowed as nouns.

Table 2: Loanwords in Japanese by semantic word class (percentages)

                Chinese  English  Dutch  Portuguese  French  Korean  Ainu  Spanish  Ryukyuan  German  Total loanwords  Non-loanwords
Nouns              32.3      9.3    0.5         0.3     0.2     0.2   0.1      0.1       0.1     0.1             43.2           56.8
Verbs              19.0      0.9      -           -       -       -     -        -         -       -             19.9           80.1
Function words     24.1      0.7      -           -       -       -     -        -         -       -             24.8           75.2
Adjectives         24.8      0.6      -           -       -       -     -        -         -       -             25.4           74.6
Adverbs               -        -      -           -       -       -     -        -         -       -              0.0          100.0
all words          27.9      6.0    0.3         0.2     0.2     0.1   0.1      0.1       0.1     0.1             34.9           65.1
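The figures in Table 2 can be cross-checked with a short script. The sketch below, whose names are chosen for illustration only, takes the Chinese, English, total-loanword and non-loanword percentages per semantic word class, verifies that loanwords and non-loanwords add up to 100%, and derives the share left for all other donor languages as a residual. Treating the empty adverb cells as 0.0 is an assumption.

```python
# (Chinese, English, total loanwords, non-loanwords), in percent, from Table 2
TABLE_2 = {
    "Nouns": (32.3, 9.3, 43.2, 56.8),
    "Verbs": (19.0, 0.9, 19.9, 80.1),
    "Function words": (24.1, 0.7, 24.8, 75.2),
    "Adjectives": (24.8, 0.6, 25.4, 74.6),
    "Adverbs": (0.0, 0.0, 0.0, 100.0),  # empty cells read as 0.0 (assumption)
    "all words": (27.9, 6.0, 34.9, 65.1),
}


def other_donors(row):
    """Share of loans from donors other than Chinese and English."""
    chinese, english, total, _ = row
    # adding 0.0 turns a possible -0.0 from rounding into 0.0
    return round(total - chinese - english, 1) + 0.0


def sums_to_hundred(row, tolerance=0.05):
    """Loanwords and non-loanwords should add up to 100%."""
    return abs(row[2] + row[3] - 100.0) <= tolerance


for word_class, row in TABLE_2.items():
    assert sums_to_hundred(row), word_class
    print(f"{word_class}: other donors {other_donors(row):.1f}%")
```

Because the published figures are rounded, the residual for all words need not match the sum of the individual minor-donor percentages exactly; it is in line with the statement that these languages together account for only a little more than 1%.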

Semantic field: It can be seen from Table 3 that concepts associated with highly complex cultural aspects, such as the fields Religion and belief, Modern world, Social and political relations, and Law, show heavy borrowing (around 60%–70%), while fields associated with material culture in the broad sense, such as Food and drink, Clothing and grooming and The house, show lower percentages (roughly between 35% and 45%). Even lower are the percentages of those fields associated with nature in general, such as The physical world and The body, and those associated with Basic actions and technology, Cognition and Miscellaneous function words. As exceptions to these generalizations, we can cite the fields of Quantity and Time, which exhibit a loanword percentage of 50% and 38%, respectively.

When we now compare Chinese and English as the most important donor languages, the differences lie in four fields mostly associated with material culture: Clothing and grooming, Modern world, Food and drink, and The house, where English loans account for a significant share of the words, between 16% and 26%. In all other fields the percentage of English pales in comparison to the loans from Chinese. Chinese loans, on the other hand, are found throughout the data set, mirroring the generalizations laid out above except for the four fields just mentioned. From this we can again see that Chinese loans have permeated all domains in Japanese, while English loans are still much more restricted in terms of their semantic domains.

While Chinese is indeed more prominent in areas associated with Chinese culture, such as Religion and belief, Law and Social and political relations, Chinese loans play a significant role in every domain. The unusually high loan percentages in the Quantity and Time domains could also be an effect of the fact that the words for numbers and days of the week take up quite a large share of the concepts on the list.

English is prominent in the four domains of material culture, which are associated with artifacts and other concepts taken in from the West. This has had an impact on the percentage of Chinese loans, especially in the Modern world field, where Chinese accounts for only about 39% of the words. In the group of semantic fields relating to nature, basic actions and function words, by contrast, where the number of English loans is negligibly low, the percentage of Chinese loans is correspondingly higher. Shibatani (1990: 153) has also noted that the number of doublets involving Native and Sino-Japanese words is decreasing in favor of doublets with Native and non-Chinese Foreign words. While the numbers in the subdatabase are too low to specifically corroborate or refute this assumption, English loans have been found to replace an earlier word in about 10% of the cases, half of which involve a Sino-Japanese term being replaced. This differs markedly from Chinese loans, which replaced a previously existing word 25% of the time, and again speaks for a higher permeation of Chinese lexical items in the lexicon.

Table 3: Loanwords in Japanese by semantic field (percentages)

                                   Chinese  English  Dutch  Portuguese  French  Korean  Ainu  Spanish  Ryukyuan  German  Total loanwords  Non-loanwords
 1 The physical world                 25.6      3.1      -           -       -       -     -        -         -       -             28.7           71.3
 2 Kinship                            35.2        -      -           -       -       -     -        -         -       -             35.2           64.8
 3 Animals                            17.4      6.5      -           -       -       -   0.8        -         -       -             24.7           75.3
 4 The body                           28.2      0.9      -           -       -       -     -        -         -       -             29.2           70.8
 5 Food and drink                     25.2     19.0    1.0         1.0       -       -     -        -         -       -             46.1           53.9
 6 Clothing and grooming              11.2     26.1      -         1.2     3.7       -     -      1.2         -       -             43.5           56.5
 7 The house                          19.5     16.3    1.6           -       -       -     -        -         -       -             37.4           62.6
 8 Agriculture and vegetation         20.4      7.6      -         1.7       -       -     -        -       1.2       -             30.9           69.1
 9 Basic actions and technology       21.9      7.0    2.8           -       -       -     -        -         -       -             31.6           68.4
10 Motion                             12.6      6.8      -           -       -       -     -        -         -       -             19.4           80.6
11 Possession                         27.8      1.5      -           -       -       -     -        -         -       -             29.4           70.6
12 Spatial relations                  12.7      3.5      -           -       -       -     -        -         -       -             16.2           83.8
13 Quantity                           49.3      1.4      -           -       -       -     -        -         -       -             50.7           49.3
14 Time                               38.0        -      -           -       -       -     -        -         -       -             38.0           62.0
15 Sense perception                    7.4        -      -           -       -       -     -        -         -       -              7.4           92.6
16 Emotions and values                36.2      3.1      -           -       -       -     -        -         -       -             39.3           60.7
17 Cognition                          35.8      1.2      -           -       -       -     -        -         -       -             37.0           63.0
18 Speech and language                22.0      7.9      -           -       -       -     -        -         -     1.6             31.4           68.6
19 Social and political relations     58.2      1.9      -           -       -       -     -        -         -       -             60.1           39.9
20 Warfare and hunting                39.7      1.6      -           -       -       -     -        -         -       -             41.3           58.7
21 Law                                55.9      2.9      -           -       -       -     -        -         -       -             58.8           41.2
22 Religion and belief                57.9      5.3      -           -       -     2.6     -        -         -       -             65.8           34.2
23 Modern world                       38.9     19.4    1.4         0.7       -     1.4     -        -         -       -             61.8           38.2
24 Miscellaneous function words        9.8        -      -           -       -       -     -        -         -       -              9.8           90.2
all words                             27.9      6.0    0.3         0.2     0.2     0.1   0.1      0.1       0.1     0.1             34.9           65.1
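The generalizations drawn from Table 3 can be reproduced directly from its Chinese, English and total-loanword columns. The following sketch uses a hypothetical `FIELDS` dictionary keyed by semantic field and treats empty English cells as 0.0 (an assumption); it then lists the fields with the highest overall loanword share and those in which English reaches at least 16%.

```python
# (Chinese, English, total loanwords), in percent, from Table 3
FIELDS = {
    "The physical world": (25.6, 3.1, 28.7),
    "Kinship": (35.2, 0.0, 35.2),
    "Animals": (17.4, 6.5, 24.7),
    "The body": (28.2, 0.9, 29.2),
    "Food and drink": (25.2, 19.0, 46.1),
    "Clothing and grooming": (11.2, 26.1, 43.5),
    "The house": (19.5, 16.3, 37.4),
    "Agriculture and vegetation": (20.4, 7.6, 30.9),
    "Basic actions and technology": (21.9, 7.0, 31.6),
    "Motion": (12.6, 6.8, 19.4),
    "Possession": (27.8, 1.5, 29.4),
    "Spatial relations": (12.7, 3.5, 16.2),
    "Quantity": (49.3, 1.4, 50.7),
    "Time": (38.0, 0.0, 38.0),
    "Sense perception": (7.4, 0.0, 7.4),
    "Emotions and values": (36.2, 3.1, 39.3),
    "Cognition": (35.8, 1.2, 37.0),
    "Speech and language": (22.0, 7.9, 31.4),
    "Social and political relations": (58.2, 1.9, 60.1),
    "Warfare and hunting": (39.7, 1.6, 41.3),
    "Law": (55.9, 2.9, 58.8),
    "Religion and belief": (57.9, 5.3, 65.8),
    "Modern world": (38.9, 19.4, 61.8),
    "Miscellaneous function words": (9.8, 0.0, 9.8),
}


def top_fields(n=4):
    """The n fields with the highest total loanword percentage."""
    return sorted(FIELDS, key=lambda field: FIELDS[field][2], reverse=True)[:n]


def english_heavy(threshold=16.0):
    """Fields in which English loans reach the given percentage."""
    return [field for field, (_, english, _) in FIELDS.items()
            if english >= threshold]


print(top_fields())
print(english_heavy())
```

The first list contains the four culturally complex fields named in the text, and the second the four fields of material culture in which English loans account for between 16% and 26% of the words.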

7. Summary

Japanese is a language whose ancestry remains controversial, but whose extensive exposure to culturally dominant languages is beyond question. I have reviewed the long history of language contact of Japanese, in which Chinese and English play an immensely important role. The prolonged period of contact with Chinese has led to a thorough permeation of the Japanese lexicon with Chinese and Sino-Japanese lexical material in virtually all semantic fields and semantic word classes. English loans, by contrast, are much more restricted. I have noted the important role that lexical strata play in Japanese, as they follow certain phonological, morphosyntactic and graphematic regularities, irrespective of whether the words that belong to these strata are inherited, borrowed or newly coined.

This study yielded two important findings. The first pertains to the composition of the Japanese lexicon. By using a word list of basic vocabulary items for the database, it was possible to shed light on the composition of the basic vocabulary in Japanese, rather than analyzing a much wider range of the lexicon as earlier studies did. While the widely cited Kokken studies posited a share of 60% for borrowed items in Japanese, our study, focusing on the basic vocabulary, yields the reverse picture of 60% native words, indicating that the basic vocabulary is indeed less permeable to foreign borrowings. The second finding relates to the distinction between Chinese loans and Japanese neologisms involving Sino-Japanese lexical material, a distinction that is not often made in analyses of the Japanese lexicon. Neologisms have been found to account for 11% of the Sino-Japanese material in the database. In the case of a field like Modern world, this percentage rises to 30%, which, combined with the high percentage of loans from European languages, results in only 20% of the words being from the Native stratum. This offers avenues for further research investigating the differences between Chinese loans and Sino-Japanese coinages in more detail.

Acknowledgments

I would like to thank Suzanne Kemmer, Laura Robinson and the editors for comments regarding this paper, and also the Dolores Mitchell Fund and the Max Planck Institute for Evolutionary Anthropology for financial support.

References

Backhouse, Anthony E. 1984. Have all the adjectives gone? Lingua 62:162–189.
Benedict, Paul K. 1990. Japanese/Austro-Tai. Ann Arbor: Karoma.
Bentley, John R. 2001. A descriptive grammar of early Old Japanese prose. Leiden: Brill.
Bloch, Bernard. 1950. Studies in Colloquial Japanese 3: Phonemics. Language 26:86–125.
Dempwolff, Otto. 1934–1938. Vergleichende Lautlehre des Austronesischen Wortschatzes [Comparative phonology of the Austronesian vocabulary]. (Beihefte zur Zeitschrift für Eingeborenen-Sprachen 15, 17, 19 1–3). Berlin: Reimer.
Gottlieb, Nanette. 1995. Kanji politics. London: Kegan Paul.
Habein, Yaeko S. 1984. The history of the Japanese written language. University of Tokyo Press.
Hamano, Shoko. 1998. The sound-symbolic system of Japanese. Tokyo: Kurosio.
Hayashi, Ōki. 1982. Nihongo no goi no hyōki [The writing of the Japanese lexicon]. 179–200.
Hinds, John. 1986. Japanese. London: Routledge.
Irwin, Mark. 2005. Rendaku-based lexical hierarchies in Japanese: The behaviour of Sino-Japanese mononoms in hybrid noun compounds. Journal of East Asian Linguistics 14:121–153.
Izui, Hisanosuke. 1953. Nihongo to nantō shogo [Japanese and the Languages of the Southern islands]. Minzokugaku Kenkyu 17(2).
Kamei, Takashi. 1954. Chinese borrowings in prehistoric Japanese. Tokyo.
Kamei, Takashi. 1961. The relationship of Japanese to the other languages of East Asia.
Karlgren, Bernhard. 1926. Philology and Ancient China. Oslo.
Kokken. see Kokuritsu Kokugo Kenkyūsho.
Kokuritsu Kokugo Kenkyūsho. 1962. Gendai zasshi kyūjusshu no yōgo yōji, 1: Sōki oyobi goihyō [Vocabulary and characters used in 90 contemporary magazines, 1: General description and wordlist]. Tokyo: Kokuritsu Kokugo Kenkyūsho.
Kokuritsu Kokugo Kenkyūsho. 1971. Denshiki keisanki ni yoru shinbun no goi chōsa, II [Computational lexical analysis of newspapers]. Tokyo: Kokuritsu Kokugo Kenkyūsho.
Kokuritsu Kokugo Kenkyūsho. 2005. Gendaizasshi no goi chōsa – 1994nen hakkō 70shi [Lexical analysis of contemporary magazines: 70 magazines from 1994]. Tokyo: Kokuritsu Kokugo Kenkyūsho.
Kumar, Ann & Rose, Phil. 2000. Lexical evidence for early contact between Indonesian languages and Japanese. Oceanic Linguistics 39(2):219–255.
Lewin, Bruno. 1976. Japanese and Korean: The problems and history of a linguistic comparison. Journal of Japanese Studies 2(2):389–412.
Loveday, Leo J. 1996. Language contact in Japan: A socio-linguistic history. Oxford.
Maeda, Tarō. 1922. Gairaigo no kenkyū [Research of loanwords]. Tokyo: Iwanami.

Martin, Samuel E. 1952. Morphophonemics of standard colloquial Japanese. Yale University Ph.D. thesis.
Martin, Samuel E. 1966. Lexical evidence relating Korean to Japanese. Language 42:185–251.
Martin, Samuel E. 1988. A reference grammar of Japanese. Vermont: Tuttle.
Matsui, Toshihiko. 1982. Kango, Gairaigo no seikaku to tokushoku [Characteristics of Sino-Japanese and Foreign words]. 149–177.
McCawley, James D. 1968. The phonological component of a grammar of Japanese. The Hague: Mouton.
Miller, Roy Andrew. 1967. The Japanese language. The University of Chicago Press.
Miller, Roy Andrew. 1971. Japanese and the other Altaic languages. University of Chicago Press.
Miller, Roy Andrew. 1996. Languages and history: Japanese, Korean and Altaic. Bangkok: White Orchid Press.
Miller, Roy Andrew. 1998. Review of Language contact in Japan: A sociolinguistic history, by Leo J. Loveday (1996). Journal of Japanese studies 24(1):208–212.
Miura, Akira. 1979. The influence of English on Japanese grammar. The Journal of the Association of Teachers of Japanese 14(1):3–30.
Miyajima, Tatsuo. 1997. Zasshi kyūjisshu hyōkihō no tōkei [Statistics of the notation of 90 magazines]. Nihongo kagaku 1:92–104.
Moreton, Elliott & Amano, Shigeaki. 1999. Phonotactics in the perception of Japanese vowel length: Evidence for long-distance dependencies. In Ohala, John J. & Hasegawa, Yoko et al. (eds.), Proceedings of the 14th Congress of Phonetic Sciences, 2215–2217. San Francisco.
Morohashi, Tetsuji. 1955. Daikanwa Jiten. Tokyo: Taishukan.
Nakagawa, Yoshio. 1966. Rendaku, rensei (kashō) no keifu [The origin of rendaku and rensei]. Kokugo Kokubun 35(6):302–314.
Ohno, Susumu. 1957. Nihongo no kigen [The origin of Japanese]. Tokyo: Iwanami.
Ohno, Susumu. 1980. Nihongo no seiritsu [The formation of Japanese]. Tokyo: Chūōkōronsha.
Okumura, Mitsuo. 1972. Kodai no on’in [The phonemes of Archaic Japanese]. In Nakata, Norio (ed.), Kōza Nihongoshi 2: On’inshi, Mojishi, 63–171. Tokyo: Taishukan.
Omodaka, Hisataka. 1967. Jidaibetsu kokugo daijiten – Jōdaihen. Tokyo: Sanseidō.
Ono, Hajime. 2002. Sino-Japanese and a way to shape up the stem. In Proceedings of the Southwest Workshop in Optimality Theory/Texas Linguistics Society Conference.
Ota, Mitsuhiko. 2004. The learnability of the stratified phonological lexicon. Journal of Japanese Linguistics 20:19–40.
Ōtsuki, Fumihiko. 1932. Daigenkai.

Ozawa, Katsuyoshi. 1976. An investigation of the influence of the English language on the Japanese language through lexical adaption from 1955–1972. Ohio: Ohio University Ph.D. dissertation.
Pan, Wenguo & Po-Ching, Yip & Saxena, Han Yang. 1993. Hanyu de goucifa yanjiu [Studies of Chinese word-formation]. Taipei: Student Book Co.
Patrie, James. 1982. The genetic relationship of the Ainu language. University of Hawai’i Press.
Philippi, Donald L. 1959. Norito: A translation of the ancient Japanese ritual prayers. Princeton University Press.
Polivanov, E. D. 1918. One of the Japanese-Malayan parallels. Reprinted in Selected works, compiled by Leont’ev, A. A. 1974.
Rohde, Ada & Stefanowitsch, Anatol & Kemmer, Suzanne. 1999. Loanwords in a usage-based model. In Billings, Sabrina J. & Boyle, John P. & Griffith, Aaron M. (eds.), Papers from the Thirty-Fifth Annual Regional Meeting of the Chicago Linguistic Society, Part 1: Papers from the Main Session, 265–275. Chicago, IL: Chicago Linguistic Society. Also published in 2000. (LAUD Series B 296). Duisburg: LAUD.
Saitō, Shizuka. 1967. Nihongo ni oyoboshita Oranda-go no eikyō [Dutch influences on Japanese]. Tokyo: Shinozaki.
Sanseidō. 2005a. Ekushiido eiwa jiten. Tokyo: Sanseidō.
Sanseidō. 2005b. Ekushiido waei jiten. Tokyo: Sanseidō.
Sanseidō. 2005c. Sanseidō sūpā daijirin. Tokyo: Sanseidō.
Satō, Kiyoji (ed.). 1982. Nihongo no goi no tokushoku [Characteristics of the Japanese lexicon]. Tokyo: Meiji.
Shen, Guówēi. 1994. Kindai Nitchū goi kōryūshi [The history of language contact between Japanese and Chinese in modern times]. Tokyo: Kasama.
Shibatani, Masayoshi. 1990. The languages of Japan. Cambridge: Cambridge University Press.
Shinmura, Izuru. 1908. Kokugo keitō no mondai [The question of the genealogy of the Japanese language].
Shōgakkan. 2005. Nihon kokugo daijiten. 2nd edn. Tokyo: Shōgakkan.
Suzuki, Shūji. 1981. Nihon kango to Chūgoku [Sino-Japanese and China]. Tokyo: Chūkō Shinsho.
Tajima, Masaru. 1998. Kindai kanji hyōkigo no kenkyū [Study of notation of Chinese characters in Modern Japanese]. Osaka: Izumi.
Takayama, Tomoaki. 2005. A survey of Rendaku in loanwords. In van de Weijer, Jeroen & Najo, Kensuke & Nishihara, Tetsuo (eds.), Voicing in Japanese. Berlin: Mouton de Gruyter.
Takeuchi, Michiko. 1982. Wago no seikaku to tokushoku [Characteristics of Wago]. In Satō (ed.), Nihongo no goi no tokushoku, 127–147. Tokyo: Meiji.
Tanaka, Akio. 1978. Kokugo goiron [The Japanese lexicon]. Tokyo: Meiji.

Tanaka, Takehiko. 2002. Gairaigo to wa nani ka [What loanwords are]. Tokyo: Choeisha.
Toyama, Eiji. 1972. Kindai no on’in [Phonemes in Modern Japanese]. In Nakata, Norio (ed.), Kōza Nihongoshi 2: On’inshi, Mojishi, 173–268. Tokyo: Taishukan.
Ueda, Kyōsuke. 1922. Kokugochū no Bongo no kenkyū [Study of Sanskrit words in the national language]. Tokyo: Daidokan.
Ueno, Kagetomi. 1980. Eigo goi no kenkyū [Study of English words]. Tokyo: Kenkyusha.
Umegaki, Minoru. 1963. Nippon gairaigo no kenkyū [Study of Japanese loanwords].
Vance, Timothy J. 1987. An introduction to Japanese phonology. Albany: State University of New York Press.
Vos, F. 1963. Dutch influences in Japanese. Lingua 12:341–388.
Vovin, Alexander. 1994. Is Japanese related to Austronesian? Oceanic Linguistics 33(2):369–390.
Vovin, Alexander. 2003. Nihongo keitōron no genzai: kore kara doko e [The genetic relationship of Japanese: Where do we go from here?]. In Vovin, Alexander & Osada, Toshiki (eds.), Perspectives on the origins of the Japanese language, 15–40. Tokyo: International Research Center for Japanese Studies.
Wang, Ning & Zou, Xiaoli. 1998. Cihui [Lexicon]. Hong Kong: Haifeng.
Weinreich, Uriel. 1974 [1953]. Languages in contact: Findings and problems. The Hague: Mouton.
Wenck, Günther. 1974. Systematische Syntax des Japanischen [Systematic syntax of Japanese]. Wiesbaden: Steiner.
Whitman, John B. 1985. The phonological basis for comparison of Japanese and Korean. Harvard University Ph.D. dissertation.
Yamada, Yūichirō. 2005. Gairaigo no shakaigaku. Tokyo: Shumpūsha.
Yang, Hui-Ling. 2006. Grammaticalization of the Chinese morpheme bei: Using diachronic/synchronic corpora. Arizona State University M.A. thesis.
Yazaki, Genkurō. 1963. Nihon no gairaigo. Tokyo: Iwanami.
Zhu, Jingwei. 2003. Kindai Nitchū Shingo no seishutsu to kōryū. Tokyo: Hakuteisha.

Loanword Appendix

Ainu

tonakai ‘reindeer/caribou’

Chinese

sekai ‘world’; riku ‘land’; rikuchi ‘land’; tairiku ‘land, mainland’; heichi ‘plain’; naichi ‘mainland’; kagan ‘shore’; kaigan ‘shore’; dōkutsu ‘cave’; kaiyō ‘ocean’; taiyō (2) ‘ocean’; wan ‘bay’; anshō ‘reef’; shio ‘tide, salt’; shitchi ‘swamp’; mokuzai ‘wood’; jishin ‘earthquake’; taiyō (1) ‘sun’; rakurai ‘bolt of lightning’; kūki ‘air’; kyokkō ‘arctic lights’; tenki ‘weather’; tenkō ‘weather’; suijōki ‘steam’; mokutan ‘charcoal’; ningen ‘person’; dansei ‘man, male (1)’; josei ‘woman, female (1)’; danshi ‘boy’; shōnen ‘boy’; seinen ‘young man’; joshi ‘girl’; shōjo ‘girl’; seijo ‘young woman’; danna ‘husband’; shujin ‘husband, master, host’; teishu ‘husband’; kanai ‘wife’; nyōbō ‘wife’; kekkon ‘to marry’;

rikon ‘divorce’; ryōshin ‘parents’; keitei ‘brother’; shimai ‘sister’; kyōdai ‘sibling’; kyōdaishimai ‘sibling’; keishi ‘older sibling’; teimai ‘younger sibling’; sofu ‘grandfather’; rōjin ‘old man’; sobo ‘grandmother’; rōba ‘old woman’; rōfu ‘old woman’; sofubo ‘grandparents’; senzo ‘ancestors’; shison ‘descendants’; gifubo ‘parents-in-law’; gifu ‘stepfather’; gibo ‘stepmother’; koji ‘orphan’; kafu ‘widow’; shinseki ‘relatives’; kazoku ‘family’; boku ‘I’; dōbutsu ‘animal’; kachiku ‘livestock’; bokujō ‘pasture’; bokufu ‘herdsman’; kyoseigyū ‘ox’; uma ‘horse’; roba ‘donkey’; raba ‘mule’; kakin ‘fowl’; gachō ‘goose’; ōmu ‘parrot’; shishi ‘lion’; kuma ‘bear’; zō ‘elephant, statue’; rakuda ‘camel’; konchū ‘insect’; mitsurō ‘beeswax’; hōka (2) ‘beehive’; risu ‘squirrel’; suigyū ‘buffalo’; chō (1) ‘butterfly’; baku ‘tapir’; hifu ‘skin, hide’;

niku ‘flesh, meat’; taimō ‘body hair’; chimō ‘pubic hair’; inmō ‘pubic hair’; kekkan ‘vein, artery’; rokkotsu ‘rib’; sekichū ‘spine’; zugaikotsu ‘skull’; nō ‘brain’; jida ‘ear, earlobe’; bikō ‘nostril’; shigin ‘gums’; shiniku ‘gums’; kyūshi ‘molar tooth’; kenkōkotsu ‘shoulderblade’; sakotsu ‘collarbone’; ekika ‘armpit’; shinzō ‘heart’; haizō ‘lung’; kanzō ‘liver’; jinzō ‘kidney’; hizō ‘spleen’; i ‘stomach’; chō (2) ‘intestines, guts’; naizō ‘intestines, guts’; yōbu ‘waist’; denbu ‘hip, buttocks’; ken (1) ‘sinew, tendon’; shikyū ‘womb’; kōgan ‘testicles’; dankon ‘penis’; chitsu ‘vagina’; inmon ‘vulva’; kokyū ‘to breathe’; hakkan ‘to perspire’; ōto ‘to vomit’; hōhi ‘to fart’; nyō ‘to piss’; shōben ‘to piss’; daiben ‘to shit’; ninshin ‘to conceive’; jinsei ‘life’; seikatsu ‘life’; seimei ‘life’; shibō (2) ‘to die’; dekishi ‘to drown’; shitai ‘corpse, carcass’; haka ‘grave’; genki ‘healthy’;

kenkō ‘healthy’; byōki ‘sick/ill, disease’; netsu ‘fever’; kōjōsenshu ‘goitre/goiter’; doku ‘poison’; kyūyō ‘to rest’; taida ‘lazy’; tokutō ‘bald’; fuzui ‘lame’; mōmoku ‘blind’; chōri ‘cooked, to cook’; juku ‘ripe’; mijuku ‘unripe’; kūfuku ‘to be hungry’; kikin ‘famine’; chissoku ‘to choke’; ryōri ‘to cook’; futtō ‘to boil’; yakan ‘kettle’; sara ‘dish, plate’; chawan ‘bowl, cup’; hachi (2) ‘bowl’; gohan ‘meal, rice’; shokuji ‘meal’; bansan ‘supper’; nyūbō ‘pestle’; yasai ‘vegetables’; ichijiku ‘fig’; budō ‘grape’; shibō (1) ‘grease, fat’; koshō ‘pepper’; mitsu ‘honey’; satō ‘sugar’; gyūnyū ‘milk’; budōshu ‘wine’; hakkōinryō ‘fermented drink’; fuku (1) ‘clothing, clothes’; yōmō ‘wool’; momen ‘cotton’; kinu ‘silk’; hikaku ‘leather’; bōsui ‘spindle’; bōshi ‘hat, cap’; hōseki ‘jewel’; nankō ‘ointment’; ie ‘house’; suijijō ‘cookhouse’; kaidō ‘meeting house’;

jō ‘lock’; danro ‘fireplace’; shindai ‘bed’; isu ‘chair’; taku ‘table’; dentō ‘lamp, torch’; rōsoku ‘candle’; sekkō ‘mason’; yaei ‘camp’; hyakushō ‘farmer’; nōmin ‘farmer’; ta ‘paddy’; saku (1) ‘fence’; shushi ‘seed’; kama ‘sickle, scythe’; shūkaku ‘harvest’; kokumotsu ‘grain’; mugi ‘wheat, barley, rye, oats’; enbaku ‘oats’; shokubutsu ‘plant’; kitsuen ‘to smoke’; juhi ‘bark’; jueki ‘sap’; kankitsurui ‘citrus fruit’; hyōtan ‘gourd’; shin’yō (1) ‘needle (2)’; setsudan ‘to cut’; hōchō ‘knife (2)’; bunretsu ‘to split’; sentaku (1) ‘to wash’; sōji ‘to sweep’; dōgu ‘tool’; kōgu ‘tool’; daiku ‘carpenter’; kenchiku ‘to build’; chūzō ‘to cast’; kin ‘gold’; ōgon ‘gold’; gin ‘silver’; dō (1) ‘copper’; tetsu ‘iron’; tōkō ‘potter’; nendo ‘clay’; jūtan ‘rug’; sensu ‘fan’; chōkoku ‘to carve’; chōkokuka ‘sculptor’; chōkokutō ‘chisel’; sen (3) ‘peg’; kōkō ‘to sail’;

hakō ‘to limp’; shuppatsu ‘to leave’; kōzoku ‘to follow’; sekkin ‘to approach’; unten ‘to drive’; jōba ‘to ride’; dōro ‘road’; sharin ‘wheel’; jiku ‘axle’; fune ‘ship, boat’; ro ‘oar’; chakuriku ‘to land’; shoyū ‘to own’; hoyū ‘to keep’; hozon ‘to preserve’; kyūjo ‘to rescue’; hakai ‘to destroy’; gai ‘to injure, to damage’; hakken ‘to find’; kane ‘money’; kōka ‘coin’; binbō ‘poor’; kojiki ‘beggar’; fusai ‘debt’; kanjō ‘bill’; zeikin ‘tax’; chingin ‘wages’; kōkan ‘to trade, to barter’; shōnin (1) ‘merchant’; kakaku ‘price’; shōmen ‘in front of’; chōten ‘top’; teppen ‘top’; sokumen ‘side’; chūō ‘middle’; jūji ‘cross’; seihōkei ‘square’; en ‘circle’; kyū (1) ‘ball’; sen (2) ‘line’; henka ‘to change’; rei ‘zero’; ichi (2) ‘one’; ni (1) ‘two’; san ‘three’; shi ‘four’; go (1) ‘five’; roku ‘six’; shichi ‘seven’;

hachi (3) ‘eight’; kyū (2) ‘nine’; jū (2) ‘ten’; jūichi ‘eleven’; jūni ‘twelve’; jūgo ‘fifteen’; nijū ‘twenty’; hyaku ‘a hundred’; sen (1) ‘a thousand’; zenbu ‘all’; jūbun ‘enough’; gunshū ‘crowd’; ippai ‘full’; bubun ‘part’; ichibu ‘part’; ko ‘piece’; hanbun ‘half’; yuiitsu ‘only’; tandoku ‘alone’; daiichi ‘first’; saigo ‘last’; daini ‘second’; ittsui ‘pair’; nido ‘twice, two times’; nikai ‘twice, two times’; daisan ‘third’; sando ‘three times’; sankai ‘three times’; jikan ‘time, hour’; nenrei ‘age’; kaishi ‘beginning’; shūryō ‘to finish’; chūshi ‘to cease’; junbi ‘ready’; shitaku ‘ready’; yōi (1) ‘ready’; hinpan ‘often’; nitchū ‘day (1)’; ichinichi ‘day (2)’; gozenchū ‘morning’; shōgo ‘midday’; gogo ‘afternoon’; ban ‘evening’; sakujitsu ‘yesterday’; issakujitsu ‘the day before yesterday’; tokei ‘clock’; shū ‘week’; shūkan (2) ‘week’;

nichiyō ‘Sunday’; getsuyō ‘Monday’; kayō ‘Tuesday’; suiyō ‘Wednesday’; mokuyō ‘Thursday’; kin’yō ‘Friday’; doyō ‘Saturday’; gatsu ‘month’; nen ‘year’; nenkan ‘year’; natsu ‘summer’; kisetsu ‘season’; kanjiru ‘to feel’; somatsu ‘rough (1)’; eiri ‘sharp’; kirei ‘clean, beautiful’; seiketsu ‘clean’; reikon ‘soul, spirit’; seishin ‘soul, spirit, mind’; kichi ‘good luck’; kō’un ‘good luck’; akuun ‘bad luck’; fukō ‘bad luck’; kyō (2) ‘bad luck’; bishō ‘to smile’; ai ‘to love’; kutsū ‘pain’; hitan ‘grief’; fu’an ‘anxiety’; kōkai ‘to regret, to be sorry’; dōjō ‘pity’; gekido ‘anger’; shitto ‘envy, jealousy’; shūchi ‘shame’; kōman ‘proud’; yūkan ‘brave’; kyōfu ‘fear’; kiken ‘danger’; sentaku (2) ‘to choose’; kibō ‘to hope’; chūjitsu ‘faithful’; seijitsu ‘faithful’; shin ‘true’; seikaku ‘right (2)’; sei ‘fault’; sekinin ‘fault, blame’; hinan ‘blame’; shōsan ‘praise’; shūaku ‘ugly’;

don’yoku ‘greedy’; rikō ‘clever’; sōmei ‘clever, wise’; shin’yō (2) ‘to believe’; shinjiru ‘to believe’; rikai ‘to understand’; suisoku ‘to guess’; mohō ‘to imitate’; yō ‘to seem’; gainen ‘idea’; kenmei ‘wise’; baka ‘stupid’; benkyō ‘to learn, to study’; seito ‘pupil’; kyōshi ‘teacher’; sensei (1) ‘teacher’; gakkō ‘school’; kioku ‘to remember’; meikaku ‘clear’; aimai ‘obscure’; fumeiryō ‘obscure’; himitsu ‘secret’; setsumei ‘to explain’; ito (2) ‘intention’; ki (2) ‘intention’; gen’in ‘cause’; gimon ‘doubt’; hitsuyō ‘need, necessity’; kantan ‘easy’; yōi (2) ‘easy’; konnan ‘difficult’; hōhō ‘manner’; enzetsu ‘speech’; gengo ‘language’; go (2) ‘word’; shitsumon ‘to ask (1)’; shōnin (3) ‘to admit’; hitei ‘to deny’; yakusoku ‘to promise’; kinjiru ‘to forbid’; kinshi ‘to forbid’; happyō ‘to announce’; kyōhaku ‘to threaten’; hon ‘book’; shijin ‘poet’; rappa ‘horn, trumpet’; kuni ‘country’; sokoku ‘native country’; toshi (2) ‘town’; kokumin ‘people, citizen’;

ichizoku ‘clan’; shizoku ‘clan’; shūchō ‘chieftain’; shihai ‘to rule, to govern’; ō ‘king’; jo’ō ‘queen’; kizoku ‘noble’; shimin ‘citizen’; dorei ‘slave’; kerai ‘servant’; jiyūmin ‘freeman’; kaihō ‘to liberate’; meijiru ‘to command, to order’; meirei ‘to command, to order’; fukujū ‘to obey’; kyoka ‘to permit’; teki ‘enemy’; ijin ‘stranger’; kyaku ‘guest’; shōtai ‘to invite’; enjo ‘to help’; soshi ‘to prevent’; shūkan (1) ‘custom’; kenka ‘quarrel’; inbō ‘plot’; baishunfu ‘prostitute’; sensō ‘war, battle’; sentō ‘war, battle’; heiwa ‘peace’; guntai ‘army’; gunjin ‘soldier’; buki ‘weapons’; heiki ‘weapons’; konbō ‘club’; senpu ‘battle-axe’; tōsekiki ‘sling’; ken (2) ‘sword’; jū (1) ‘gun’; yōsai ‘fortress’; tō (2) ‘tower’; shōri ‘victory’; haiboku ‘defeat’; kōgeki ‘attack’; bōei ‘to defend’; taikyo ‘to retreat’; kōfuku ‘to surrender’; shūjin ‘captive, prisoner’;

senrihin ‘booty’; gyofu ‘fisherman’; ryōshi ‘fisherman’; gyomō ‘fishnet’; hōritsu ‘law’; hanketsu ‘to adjudicate, judgment’; saiban ‘judgment’; genkoku ‘plaintiff’; hikoku ‘defendant’; mokugekisha ‘witness’; shōnin (2) ‘witness’; sensei (2) ‘oath’; kokuso ‘to accuse’; sho ‘to condemn’; yūzai ‘guilty’; muzai ‘innocent’; batsu ‘penalty, punishment’; bakkin ‘fine’; satsujin ‘murder’; furin ‘adultery’; gōkan ‘rape’; hōka (1) ‘arson’; gishō ‘perjury’; shūkyō ‘religion’; ji’in ‘temple’; shinden ‘temple’; kyōkai ‘church’; saidan ‘altar’; sūhai ‘to worship’; shinsei ‘holy’; sekkyō ‘to preach’; shukufuku ‘to bless’; danjiki ‘to fast’; zesshoku ‘to fast’; gokuraku ‘heaven’; tengoku ‘heaven’; jigoku ‘hell’; akuma ‘demon’; gūzō ‘idol’; mahō ‘magic’; yōsei ‘fairy, elf’; yūrei ‘ghost’; zenchō ‘omen’; katsurei ‘circumcision’; tsūkagirei ‘initiation ceremony’; musen ‘radio’; denwa ‘telephone’; densha ‘train’;

kisha ‘train’; denki ‘electricity’; denchi ‘battery’; kikai ‘machine’; sekiyu ‘petroleum’; byōin ‘hospital’; chūsha ‘injection’; seifu ‘government’; daijin ‘minister’; keisatsu ‘police’; menkyo ‘driver’s license’; kosekishōhon ‘birth certificate’; hanzai ‘crime’; senkyo ‘election’; jūsho ‘address’; sūji ‘number’; ginkō ‘bank’; keshōshitsu ‘toilet’; futon ‘mattress’; bin ‘bottle’; kashi (1) ‘candy/sweets’; shinbun ‘newspaper’; ongaku ‘music’; kyoku ‘song’; cha ‘tea’; betsu ‘other’; dōyō ‘same’

Dutch

biiru ‘beer’; rampu ‘lamp, torch’; buriki ‘tin, tinplate’; garasu ‘glass’; penki ‘paint’; kōhii ‘coffee’

English

saban’na ‘savanna’; ōrora ‘arctic lights’; matchi ‘match’; opossamu ‘opossum’; raion ‘lion’; koyōte ‘coyote’; biibā ‘beaver’; kangarū ‘kangaroo’; jagā ‘jaguar’; kamereon ‘chameleon’; baffarō ‘buffalo’; penisu ‘penis’; sekkusu ‘to have sex’; ōbun ‘oven’; renji ‘oven’; furaipan ‘pan’; bōru (1) ‘bowl’; kappu ‘cup’; supūn ‘spoon’; naifu ‘knife (1), knife (2)’; fōku ‘fork’; tongu ‘tongs’; sōsēji ‘sausage’; sūpu ‘soup’; furūtsu ‘fruit’; nattsu ‘nut’; oriibu ‘olive’; oiru ‘oil’; miruku ‘milk’; chiizu ‘cheese’; batā ‘butter’; miido ‘mead’; wain ‘wine’; ūru ‘wool’; shiruku ‘silk’; feruto ‘felt’; kōto ‘coat’; shatsu ‘shirt’; sukāto ‘skirt’; surakkusu ‘trousers’; sokkusu ‘sock, stocking’; sutokkingu ‘sock, stocking’; būtsu ‘boot’; beruto ‘belt’; bēru ‘veil’; poketto ‘pocket’; pin ‘pin’; akusesarii ‘ornament, adornment’; nekkuresu ‘necklace’; biizu ‘bead’; iyaringu ‘earring’; hankachi ‘handkerchief, rag’; taoru ‘towel’; burashi ‘brush’; tento ‘tent’; doa ‘door, gate’; rokku ‘lock’; sutōbu ‘stove’; beddo ‘bed’; tēburu ‘table’; āchi ‘arch’; morutaru ‘mortar (2)’; kyanpu ‘camp’; hanmokku ‘hammock’; shaberu ‘shovel’; raisu ‘rice’; ōku ‘oak’; paipu ‘pipe’; kokonattsu ‘coconut’; banana ‘banana’; kyassaba ‘manioc bread, cassava/manioc’; rōpu ‘rope’; hanmā ‘hammer’; matto ‘mat’; kāpetto ‘rug’; nettobaggu ‘netbag’; būmeran ‘boomerang’; tanpurain ‘tumpline’; daibingu ‘to dive’; bōto ‘boat’; kanū ‘canoe’; autoriggā ‘outrigger’; ōru ‘oar’; padoru ‘paddle’; masuto ‘mast’; koin ‘coin’; fazomu ‘fathom’; fukku ‘hook’; bōru (2) ‘ball’; zero ‘zero’; kisu ‘to kiss’; puraido ‘proud’; misu ‘mistake’; aidea ‘idea’; supiichi ‘speech’; pen ‘pen’; furūto ‘flute’; doramu ‘drum’; toranpetto ‘horn, trumpet’; sutekki ‘walking stick’; herumetto ‘helmet’; reipu ‘rape’; mosuku ‘mosque’; inishiēshon ‘initiation ceremony’; rajio ‘radio’; terebi ‘television’; ōtobai ‘motorcycle’; basu ‘bus’; burēki ‘to brake’; enjin ‘motor’; gasorin ‘petroleum’; nanbāpurēto ‘license plate’; toire ‘toilet’; mattoresu ‘mattress’; kan ‘tin/can’; kyandē ‘candy/sweets’; purasuchikku ‘plastic’; karendā ‘calendar’

French

rinneru ‘linen’; manto ‘cloak’; zubon ‘trousers’

German

horun ‘horn, trumpet’

Korean

tera ‘temple’; charinko ‘bicycle’

Portuguese

pan ‘bread’; botan ‘button’; tabako ‘tobacco, cigarette’; kabocha ‘pumpkin, squash’

Ryukyuan

gajumaru ‘banyan’

Spanish

poncho ‘poncho’

Chapter 22

Loanwords in Mandarin Chinese*

Thekla Wiebusch and Uri Tadmor

1. The language and its speakers

Mandarin Chinese is a language spoken natively in central and northern China as well as in some outlying areas. It is a member of the Sinitic branch of Sino-Tibetan. Other Sinitic languages (Chinese languages) include Wu, Yue (including Cantonese), Hakka, and the numerous and diverse Min languages of southeastern China. Mandarin is further subdivided into Northern, Eastern and Southwestern Mandarin. The Loanword Typology (LWT) subdatabase gives data from the standard variety (Putonghua), which is based on Northern Mandarin. With almost 900 million native speakers, Mandarin is the most widely spoken native language in the world. Standardized varieties of Mandarin serve as the official languages of (mainland) China and of Taiwan, and a similar variety is one of Singapore’s four official languages. Because of Mandarin’s official status, it is also widely spoken as a second language by speakers of other Chinese languages as well as by members of non-Chinese minorities in China and in Taiwan. It is also studied as a second language in many countries.

2. Sources of data

The information in the subdatabase comes from native speakers of Mandarin Chinese, from dictionaries and from the specialized literature. In a first step, Mandarin counterparts for the LWT meaning list were compiled with the help of several German-Chinese/Chinese-German and English-Chinese dictionaries, including pictorial dictionaries. They were verified and supplemented in three steps by native speakers of contemporary Mandarin (age ca. 30 years, doctoral students and postdoctoral researchers in the humanities living in Germany, with good fluency in both German and English).

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Wiebusch, Thekla. 2009. Mandarin Chinese vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 2049 entries.

Map 1: Geographical setting of Mandarin Chinese

In a second step, the age of the words was verified using the Ciyuan and the Hanyu da cidian, both of which contain some etymological information, and Chinese historical text corpora (Scripta Sinica, Hanquan database of Ancient Chinese texts). In a third step, the loanword status was checked and related information obtained using (Chinese) dictionaries of Chinese loanwords (Cen 1990; Gao et al. 1984; Hu 1998) and literature about Chinese loanwords (e.g. Gao & Liu 1958; Masini 1993; Shi 2000; Norman & Mei 1976; Zhao 2006), the history of the Chinese language and reconstruction of Old Chinese (e.g. Norman 1988; Sagart 1999; Wang 1995), language contact (e.g. LaPolla 2001; Chappell 2001; Pulleyblank 1983) and etymology (e.g. Schuessler 2003, 2007). Both age and loanword status were checked systematically only for unanalyzable and semi-analyzable words (together ca. 30%). Compounds or derivations were usually formed after the (potential) borrowing of the components, and thus are not eligible as loanwords according to LWT standards. (Indeed, Wang & Wang 2004 came to the conclusion that excluding compound words gave the best lexicostatistical results for sub-grouping Sinitic languages.) Information on loan types other than phonetic loans was entered on a case-by-case basis. No systematic attempt was made to explore the potential borrowing of words before the Han Dynasty (206 BCE – 220 CE), although information about such (potential) loanwords was entered into the database if substantial evidence was described in the literature consulted. This decision was taken because of the unsettled issues in the reconstruction of Old Chinese and of the possible donor languages, and with regard to the ultimate relationship between language families in East and Southeast Asia (see Sagart et al. 2005; Wang 1995; Sanchez-Mazas et al. (eds.) 2008 for discussions), which make the distinction between cognates and loans a matter of speculation in many cases.

3. Contact situations

Virtually nothing is known about the circumstances of ancient language contact between speakers of Chinese and speakers of other languages. Evidence for these contacts takes the form of a handful of loanwords, far too few to support sound hypotheses. Moreover, speculating on the nature of the contact situations on the basis of these few loanwords, without corroborating historical evidence, would run the risk of circularity. Considerable progress is being made both in the reconstruction of Old Chinese and its possible donor languages and in the population history of East and Southeast Asia, so that more conclusive results may be within reach. The Loanword Typology (LWT) subdatabase for Mandarin includes one or two words each of Altaic¹, Kra-Dai, Austroasiatic, Austronesian, Hmong-Mien, Tibeto-Burman, Indo-Iranian, and Tocharian origin. There are also some much newer loanwords from European languages. The major contact situations are listed below, with brief explanations of whatever is known about their circumstances.

3.1. Contact with Austroasiatic

Looking at the linguistic and ethnographic situation in China today, nothing points to the important role that Austroasiatic people played in the formative period of the Chinese empire, the Shang (16th–11th c. BCE) and Zhou Dynasty (11th–5th c. BCE). Archeological findings as well as recent reconstruction work (Schuessler 2003, 2007, Norman & Mei 1976) seem to show that speakers of Austroasiatic languages settled in major parts of southern China as well as on the eastern coast in present-day Shandong. Austroasiatic languages have been associated with the Ancient Yi people in Shandong and with the Eastern Longshan culture by Schuessler (2003). Their speakers have been equated by Norman & Mei (1976) with the Old Chinese designation “Yue” for a people obviously forming part of the population of the two States of Yue (before 6th c. – 334 BCE) and of Chu (11th c. – 221 BCE). Both states, which once covered the territory of what is now South China up to the Yangzi river, were ultimately absorbed during the unification of China by the northwestern state of Qin in 221 BCE. The southward expansion of the Chinese Han Empire, with massive population movement in the 1st and 2nd c. CE, finally marginalized the Austroasiatic languages. But, along with Hmong-Mien languages, they entered the language of the southern Chinese dynasties as an important substratum.

Schuessler (2003, 2007) shows an important substratum of Austroasiatic in Chinese, stemming from the time of the earliest written records in the 14th c. BCE (or before) to the Warring States period (475–221 BCE). Numerous high-frequency words have convincingly been argued to be of Austroasiatic origin by Norman & Mei (1976), Norman (1988) and Schuessler (2007): examples are shǐ 豕 ‘pig’ (Old Chinese *lheʔ or *hlaiʔ, cf. proto-Monic *cliik), gǒu 狗 ‘dog’ (Old Chinese *kloʔ, cf. proto-Mon-Khmer *klu2B), hǔ 虎 ‘tiger’ (**k’la(g) < Proto-Austroasiatic *kalaʔ), nǔ 弩 ‘crossbow’ (**na < proto-Austroasiatic *s-na), jiāng 江 ‘Yangzi river’ (**krong) from a word for ‘river’ (cf. Old Mon krung) and zhōu 舟 ‘boat’ (Old Chinese *tu, cf. Proto-Viet-Muong *do:k). The fact that some are written with basic characters points to a very early layer.

¹ This grouping is based on practical considerations. The authors take no stand vis-à-vis the validity of the Altaic language family proposed by some scholars.

3.2. Contact with Hmong-Mien

Hmong-Mien languages are today spoken in small communities in Laos and Thailand, as well as in south-western regions of China. Blench (2005: 112–13, quoting Ratliff) assumes that their homeland lay south of the Yangzi River, with subsequent dispersal due to the incoming Chinese from the 2nd millennium BCE on. The six potential loanwords in the database mostly stem from the Warring States period (475–221 BCE). At that time, the state of Chu (see above), south of the Yangzi River, was possibly pre-Mien-speaking with a sinicized bureaucracy (Blench 2005, quoting Sagart), or at least had absorbed pre-Mien languages during its expansion.

3.3. Contact with Tocharian

The northwest of China and the Tarim Basin were probably the first place of contact between Chinese and early Indo-European people. Since the discovery of prehistoric Caucasoid mummies in that region, various hypotheses have been put forward as to their linguistic affiliation, among them that they were predecessors of the Tocharians. Pulleyblank (1996) is not alone in suggesting that the appearance of the horse-drawn chariot in the Shang Dynasty may be related to Indo-European contact, and that the Shang borrowed words for horses, chariots and related items (compare Bauer 1994). The exact nature of their relationship still remains to be determined.

3.4. Contact with Indo-Iranian, Elamite, and Sanskrit

The Silk Road is known to have connected China with Central Asia and the Near East since Emperor Han Wudi (r. 141–87 BCE) gained control over it and sent regular embassies to Parthia, Bactria, Syria, and northern India, among others. The Silk Road brought cultural items and technologies to China, as well as religious beliefs. The most important among them for China became Buddhism, which entailed the need to render Sanskrit terms into Chinese. The time until the 10th c. CE was the flourishing period of Buddhism, but its influence continued until modern times. Contact with Indo-Iranian languages (e.g. Sogdian, Parthian, Bactrian) was mainly indirect and restricted to a small minority of Chinese. The same is true for any knowledge of Sanskrit: Buddhist teaching was received primarily through translations. This is reflected in the borrowed terms, which mainly belong to the domains of then exotic food (e.g. píngguǒ 苹果 ‘apple’ < Sanskrit bimbara) and animals, cultural items, e.g. musical instruments (pípa 琵琶 ‘Chinese lute’ < Sanskrit vivañki ‘a string instrument’ (alternatively from Xiongnu)), and precious materials, e.g. liúli 琉璃 < Prakrit veluriya ‘fayence’, bōli 玻璃 ‘glass’ (ultimately from Sanskrit sphatika ‘quartz crystal’). Religious terms such as fó 佛 ‘Buddha’, mó 魔 ‘devil’ < māra ‘demon, Lord of Evil’, nígū 尼姑 ‘nun’ < bikchuni, púsà 菩萨 ‘Bodhisattva’ < bodhisattva, chán 禅 ‘meditation’ (> Japanese zen), shèlì 舍利 ‘relic’ < sárira dominate borrowings from Sanskrit (see also Cen 1990). Semantic renditions (loan translations) of philosophical terms have remained indispensable until today, e.g. shìjiè 世界 ‘cosmos, time and space’ for lokodhatu, xiànzài 现在 ‘now/present’ for atita, guòqù 过去 ‘past’ for anagatakala, jiānglái […]

[…] Hàn-bǎo st), g to 1 (Stuttgart > Sī-tú-jiā-tè 斯图加特). (For a detailed account of the adaptation of English loanwords into Mandarin, see Novotná 1967.)

Most renditions result in mono-morphemic polysyllabic words (see also Table 5), a very unusual structure for Mandarin. Sanskrit word structure posed problems for borrowers similar to those later posed by European languages. Thus we already find many of the later strategies, such as splitting consonant clusters among several syllables and reducing long phonetic loans to the better adapted mono- or bisyllabic words (fútú 浮图 / fútú 浮屠 / fótú 佛图 > fó ‘Buddha’, sūdǔbō 窣堵波 > tǎ 塔 ‘Stupa/tower’). In general, loanwords entering the language before the Han Dynasty (206 BCE – 220 CE) are by now so highly integrated that even expert linguists are still debating their status. From the Han period onwards, some remained bisyllabic monomorphemic (unanalyzable) words (e.g. pútao 葡萄 ‘grape’, bōli 玻璃 ‘glass’), which is unusual for Mandarin. This, together with a better knowledge of the possible donor languages, makes it easier to identify them.

5.2. Morphosyntactic integration

Morphosyntactic integration of loanwords is facilitated by the lack of morphology in modern Mandarin. Mandarin usually imports the bare noun or verb of the donor language. It does not need to assign any specific plural form, gender, or verb morphology to these borrowed items. For nouns, an appropriate numeral classifier has to be used, which requires knowledge of the meaning of the word, but it is not a fixed feature (unlike gender). The default classifier ge can be used in case of uncertainty. Novotná (1967: 115–116) mentions the verb morphology of European donor languages as an inhibitive factor causing the near absence of loan verbs. As the verb morphology of English is not more complex than its nominal morphology, this argument is hardly plausible. Old loanwords have undergone the same trend towards compounding as the rest of the lexicon, so that many only exist as part of compounds. Yá 牙 was probably borrowed with the meaning ‘tusk/ivory’ from (proto)-Austroasiatic in the early 1st millennium BCE, and was only later compounded with chǐ 齿, the original term for ‘tooth’, to form modern yáchǐ 牙齿 ‘tooth’ (see Norman & Mei 1976: 288–292). As a side effect, such old loanwords are not counted in the subdatabase.

5.3. Orthographic / semantic integration

The Chinese writing system poses two problems for writing phonetic loans. These problems differ from those encountered in languages with alphabetic, or even merely syllabic, writing systems, and they have to be separated from the potential influence of the characteristics of the Chinese spoken language on the borrowing process. Chinese characters (zì 字) render entire syllables, which are often, especially in Ancient Chinese, equivalent to entire morphemes or words. Usually, only existing characters (for already existing syllables or morphemes) can be used to render foreign words. Exceptionally, new characters can be created using already existing graphic components. In most cases, this means that even if a word had been borrowed into the spoken language in a form closely related to its foreign model (possibly not respecting the Chinese syllable structure), in writing only existing morphemes could have been used as an approximation of the phonetic value of the source word. Even filling phonetic gaps would be difficult. This makes it difficult for unusual phonetic forms to enter the written lexicon, but it also puts serious constraints on identifying loanwords as such. A Chinese character usually has a specific semantic value. In a standard dictionary of ca. 10,000 Chinese characters, up to 50 can have the same pronunciation (including the tone), but the written forms have distinct, mostly unrelated meanings. Characters used to render source words inevitably carry semantic weight. This can make the rendition of loanwords problematic, as interference between the intended meanings and the original meanings of the characters has to be avoided. Moreover, Mandarin readers are used to characters providing semantic information at least through the semantic elements, also called “radicals”, which assign them to certain semantic domains. Readers can find strings of characters without semantic content disturbing.
Phonetic loans, which are a priori semantically opaque for the Mandarin speaker or reader, can be semantically integrated on several levels:
• assignment of a specific numeral classifier, hinting either at the shape or at the nature/function of the referent (otherwise: usage of the neutral classifier).
• formation of a hybrid compound with a productive Chinese super-ordinate term (e.g. qíyì-guǒ 奇异果 ‘kiwi-fruit’, píng-guǒ 苹果 ‘apple-fruit’, see also §8.2 on loan blends). It is often hard to find out whether the formative super-ordinate element was already added during the borrowing process or only later. Sometimes, both variants coexist.
• creation of characters with a fitting graphemic classifier (“radical”), or selection of already existing such characters, for the phonetic rendition of the word. This has been most systematically applied to the chemical elements and plant names (see Table 3).

Table 3: Some characters for loanwords using appropriate graphemic classifiers

No.  New character(s)  Phonetic form  Meaning  Graphemic classifier  Semantic domain
1    佛    fó     Buddha            亻  ‘human being’
2    僧    sēng   Buddhist monk     亻  ‘human being’
3    魔    mó     devil, demon      鬼  ‘ghost’
4    葡萄  pútao  grape             艹  ‘herb/plant’
5    玻璃  bōli   glass (crystal)   玉  ‘jade/precious stone’
6    琵琶  pípa   Chinese lute      珏  ‘string instrument’
7    磅    bàng   pound (weight)    石  ‘stone’
8    镑    bàng   pound (currency)  金  ‘metal’
9    矽    xī     silicon           石  ‘stone’
10   钯    bǎ     palladium         金  ‘metal’
11   氦    hài    helium            气  ‘gas’
12   氨    ān     ammoniac          气  ‘gas’
13   苯    běn    benzol            艹  ‘herb/plant’
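The way a graphemic classifier points a reader to a semantic domain can be sketched as a simple lookup. The following is only an illustration built from a few rows of Table 3: the dictionary and function names are hypothetical, and the character glyphs are assumed to be the standard simplified forms for the listed phonetic forms and meanings.

```python
# Illustrative sketch only: names are hypothetical, data follow rows of Table 3.
# Glyphs are assumed to be the standard simplified characters.

# graphemic classifier -> semantic domain
DOMAIN_BY_CLASSIFIER = {
    "石": "stone",
    "金": "metal",
    "气": "gas",
    "艹": "herb/plant",
}

# loan character -> (phonetic form, meaning, graphemic classifier)
LOAN_CHARACTERS = {
    "磅": ("bàng", "pound (weight)", "石"),
    "镑": ("bàng", "pound (currency)", "金"),
    "氦": ("hài", "helium", "气"),
    "氨": ("ān", "ammoniac", "气"),
    "苯": ("běn", "benzol", "艹"),
}


def semantic_hint(character):
    """Return the semantic domain signalled by the character's classifier."""
    _, _, classifier = LOAN_CHARACTERS[character]
    return DOMAIN_BY_CLASSIFIER[classifier]


def homophones(phonetic_form):
    """Return the listed characters that share a pronunciation, tone included."""
    return sorted(
        char
        for char, (form, _, _) in LOAN_CHARACTERS.items()
        if form == phonetic_form
    )
```

In this sketch the two characters pronounced bàng are told apart in writing only by their classifiers, which assign the weight to ‘stone’ and the currency to ‘metal’.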

A frequent strategy is to form new characters with the semantic element 口 ‘mouth’, to point to the phonetic value (cf. kāfēi 咖啡 ‘coffee’). Graphemic classifiers were even more widely employed in earlier stages of the writing system. In combination with syllable reduction or hybrid formation (often somewhat after the initial borrowing), they have made many loans visually unrecognizable. During the earliest phase of the writing system, even the creation of new basic ideograms was possible. Xiàng 象 ‘elephant’, shǐ 豕 ‘pig’, mǎ 马 ‘horse’, chē 车 ‘chariot’, yá 牙 ‘tusk, ivory’ and zhōu 舟 ‘boat’, all strongly suspected of being borrowed, are written with such basic characters, some of which occur as graphemic classifiers in a series of compound characters. This makes the suggestion of borrowing highly counterintuitive for Chinese native speakers.

6. Structural borrowing

6.1. Phonological borrowing

In contemporary Mandarin, phonological borrowing is extremely rare, at least as far as the standard and written variety is concerned. This is quite different from the case of Cantonese according to Bauer (2006), who lists 50 syllables (excluding tones) that are only used in loanwords, all of them filling phonological gaps in the Cantonese syllabary of potentially 1140 syllables, with only 779 occurring. T’sou (2001) mentions some loanwords that seem to have expanded the existing Mandarin syllabary. The syllables ‘ga’, ‘ka’, and ‘ha’ are frequently used in phonetic loans. New characters with these pronunciations have even been created for certain loanwords (e.g. 咖 kā in kāfēi ‘coffee’, 胩 kǎ ‘carbylamine’). In native words, these syllables are mostly restricted to onomatopoetic words and dialect words. T’sou (2001) remarks that in native compounds, characters such as 卡 are read jiá, even though they are pronounced kǎ in the loanwords ‘card’ (kǎpiàn), ‘cartoon’ (kǎtōng) or ‘carbine’ (kǎbīnqiāng). This “velar fronting” was already the rule in Mandarin when these words were borrowed, but has obviously not applied to the loans. The syllables ka, ga and ha were much needed to transcribe Inner-Asian place names. At the same time, southern dialects might have played a role in (re-)introducing this feature to Mandarin. Cantonese, for example, still retains the traditional pronunciation. Loanwords from Sanskrit also enriched the inventory of Chinese syllables: fó 佛 ‘Buddha’ is the only character nowadays pronounced fo (the Ciyuan also has fo 岪, a part of a mountain mentioned in the Chuci, a work from the southern State of Chu), and sēng 僧 ‘monk’ was also unique at the time of borrowing.

6.2. Morphosyntactic borrowing

Claims about grammatical borrowing have been made principally with regard to two sources. Most of the relevant literature discusses purported influence from Western languages. Peyraube (2000) examined such claims in detail, and has convincingly demonstrated that the grammatical features in question could be found in Chinese writings long before any contact with Western languages. Moreover, most of these purportedly Western features were shown by Peyraube to be on the wane. A more promising avenue of investigation regarding grammatical borrowing in Chinese involves Altaic languages, which, from a sociolinguistic perspective, would have been much better placed to influence the structure of Chinese. According to Norman (1988: 20), “[i]t may be that Altaic has left more of a mark on the grammar of Chinese than it has on the lexicon”. As an example, Norman mentions the development of SOV sentence patterns, which are rare in Sino-Tibetan but prevalent in Altaic. Another grammatical feature of possible Altaic origin is the exclusive/inclusive distinction in the first person plural pronouns of the Beijing dialect and of several other northern dialects (see also LaPolla (2001: 230–231) and Chappell (2001: 336) for a summary of Hashimoto’s and others’ arguments). Further discoveries must await systematic historical-contrastive studies of Chinese languages and the Altaic languages with which they are known to have been in contact. A prominent feature of the contemporary Mandarin lexicon, especially in the literary, technical and bureaucratic registers, is the omnipresence of derivational affixes and formative elements, e.g. -xìng ‘-tion’/‘-ity’ (< ‘nature’), -huà ‘-ize’/‘-ization’ (< ‘change’), -zhǔyì ‘-ism’ (< ‘doctrine’), kě- ‘-able’ (< ‘can’/‘possible to’).
The systematic use of these morphemes to form derivational compounds can be traced to the attempt to render the equivalent Western structures. It was most likely borrowed together with a large number of graphic loans from Japanese, where words of these types were first coined, a strategy that may have been favored by the agglutinative character of that language. Subsequently, Chinese made independent use of these affixes or very productive components. Table 4 shows that central terms in modern Chinese life have been formed in this way.

22. Loanwords in Mandarin Chinese

Table 4: Some compounds formed with modern affixes in Mandarin

| No. | Graphic form | Phonetic form | Meaning | Meanings of each syllable |
|---|---|---|---|---|
| 1 | 现代化 | xiàndàihuà | modernize/-ization | xiàn-dài ‘modern’ + huà ‘to change’ |
| 2 | 创造性 | chuàngzào-xìng | creativity | chuàngzào ‘create’ + xìng ‘character/nature’ |
| 3 | 马克思主义 | mǎkèsīzhǔyì | Marxism | mǎkèsī ‘Marx’ + zhǔyì ‘-ism’ |
| 4 | 可靠 | kěkào | reliable | kě ‘can’ + kào ‘rely on’ |
| 5 | 可怕 | kěpà | terrible | kě ‘can’ + pà ‘fear’ |

7. Lexical borrowing

7.1. Outright lexical borrowing

When words are borrowed into Mandarin, the syllables chosen to represent the foreign sounds usually already have a meaning; otherwise there would be no Chinese characters to represent them, and it would be impossible to write them down in Chinese. In simple loanwords, the meaning of the existing Mandarin morphemes is ignored (but of course still present to the reader). Table 5 presents a few examples of relatively recent loanwords from European languages.

Table 5: Some English loanwords in Mandarin

| No. | Graphic form | Phonetic form | Meaning | Meanings of each syllable |
|---|---|---|---|---|
| 1 | 沙发 | shāfā | sofa | shā ‘sand’ + fā ‘send out, bring forth’ |
| 2 | 马达 | mǎdá | motor | mǎ ‘horse’ + dá ‘reach’ |
| 3 | 巧克力 | qiǎokèlì | chocolate | qiǎo ‘skillful’ + kè ‘overcome’ + lì ‘force’ |
| 4 | 逻辑 | luóji | logic(al) | luó ‘patrol’ + ji ‘compile/edit’ |
| 5 | 泵 | bèng | pump | Special character (‘stone’ on top of ‘water’) |
| 6 | 阿司匹林 | āsīpǐlín | aspirin | ā ‘PREF’ + sī ‘administrate / a surname’ + pǐ ‘NUM.CL for horse’ + lín ‘forest’ |

As can be seen, the existing meanings of the Mandarin syllables making up simple loanwords bear no relation to the meaning of the source word. Perhaps for this reason, outright lexical borrowing has played a relatively minor role in the development of the Chinese lexicon. Several other types of borrowing have been much more prominent; they are discussed in §8.

7.2. Intra-Sinitic borrowing

Intra-Sinitic borrowing, and especially borrowing into Mandarin from other Chinese languages, has not been thoroughly explored. Mandarin would have been in contact with these languages ever since Proto-Sinitic broke up into separate languages. It seems logical that over many centuries of close contact, words would be borrowed into Mandarin from other Chinese languages (while a greater number


Thekla Wiebusch and Uri Tadmor

of words have of course flowed in the opposite direction). This is all the more probable as Mandarin has been used widely as a second language in China ever since Beijing first became its capital in the 15th century. Non-Mandarin Chinese words could easily have been introduced by non-native speakers from their first language, and then incorporated into the language as a whole. However, since native speakers view Chinese as one language (rather than as a family of languages), the study of loanwords has naturally focused on words that are perceived to be of foreign origin. In other words, for many Mandarin speakers (and even for scholars of Mandarin), words borrowed into Mandarin from other Chinese languages do not constitute loanwords and therefore do not merit serious study. Norman, for example, devotes an entire section in his overview book Chinese to “Chinese in contact with other languages” (Norman 1988: 16–22), but does not discuss contact between Mandarin and other Chinese languages. The potential of such work can be seen in Wang & Wang’s (2004) discussion, on the basis of irregular tone correspondences, of a Shandong dialect layer in contemporary Beijing Mandarin, stemming from massive population influx during the 16th to 19th centuries. Chen (1999: 100) does discuss (albeit briefly) contact between Mandarin and other Chinese languages. He notes that “Modern Written Chinese has borrowed many words from other dialects”. Yet he only cites six examples, all from Wu. T’sou (2001) mentions several English items that have recently entered Mandarin through Hong Kong Cantonese as graphic loans, e.g. bèng 泵 ‘pump’ (< Cantonese bam), dìshì 的士 ‘taxi’ (< Cantonese diksi). Gǎng ‘harbor’ is borrowed from a southern dialect according to Schuessler (2007). The original word for ‘river’ had already changed to jiāng in the north, but the southern form gǎng with the derived meaning ‘harbor’ entered Mandarin at a later time.

Other factors that have contributed to the low loanword figure for Mandarin are discussed in the conclusion to this chapter (§9).

8. Other types of borrowing affecting the lexicon

8.1. Semantic borrowing

8.1.1. Semantic borrowing proper

Although semantic borrowing is one of the most common outcomes of language contact, it has received only scant attention in the literature, apart from calques, which have been frequently discussed. In fact, the process of calquing is part of the more general process of semantic borrowing, which involves assigning native words the meaning of equivalents in another language. Since simple words are much more numerous in the basic lexicon than compounds, it stands to reason that more simple words than compounds are affected by semantic borrowing. It seems that the reason calques have received so much more attention is that they are far easier to identify than simple words which have undergone semantic borrowing, as the majority of calques are neologisms.

Semantic borrowing in simple words entails an induced meaning shift or polysemy. Such a shift can be observed in several Chinese words, e.g. diàn ‘electricity’ (previously ‘lightning’) or quán 权 ‘right’ (previously ‘power’), as well as in bound morphemes, e.g. chē 车, now typically ‘car’, previously typically ‘chariot’; jī 机, now typically ‘machine’, previously mainly referring to the loom. Semantic borrowing in existing compounds is also a current strategy in lexical importation (T’sou 2001). See items 1–2 in Table 7.

Calquing is a specific semantic borrowing of compounds: each constituent in the donor language compound is translated into an equivalent in the recipient language, and the resulting compound is assigned the compositional semantics of the compound in the donor language. Calquing from European languages is common in modern Mandarin. Some examples are provided in Table 6. A special form of calquing is the imitation of the derivational structure of the source word (see Table 4).

Table 6:

Mandarin compounds calqued on European expressions

| No. | Graphic form | Phonetic form | Meaning of constituents | Meaning of compound |
|---|---|---|---|---|
| 1 | 蜜月 | mìyuè | (honey + moon) | honeymoon |
| 2 | 热狗 | règǒu | (hot + dog) | hotdog |
| 3 | 篮球 | lánqiú | (basket + ball) | basketball |
| 4 | 铁路 | tiélù | (iron + way) | railway (< Eisenbahn/chemin de fer) |
| 5 | 锁骨 | suǒgǔ | (to lock + bone) | collarbone (< German Schlüsselbein) |
| 6 | 做爱 | zuò’ài | (make + love) | to make love |

8.1.2. Graphic loans

The lexicon of Japanese contains numerous loanwords from Classical Chinese (see Schmidt’s chapter on Japanese, this volume). Some of these words were reborrowed into modern Mandarin with their Japanese meanings. The reborrowing was done in writing, which was facilitated by the fact that Chinese loanwords in Japanese are written with Chinese-based characters (known in Japanese as kanji). The reborrowed kanji were realized in their Mandarin phonetic forms, unaffected by their realization in Japanese. Such words have been called “graphic loans”. As with semantic borrowing, where compounds have been singled out for discussion in the literature because of their ease of identification, graphic loans in Mandarin have been discussed almost exclusively in the context of compounds.

Masini (1993: 128–129) distinguishes two types of graphic loans, original loans and return loans. In the latter type, Japanese maintained existing Chinese compounds that were lost in Chinese itself, together with their meanings, and used them for new Western concepts. The terms were later re-borrowed through writing into Chinese, and each constituent was given its modern Mandarin pronunciation. The compounds can either be traditional Chinese or have been coined, e.g. in a Chinese dictionary to render a Western concept, without being “successful” in China. An example is given as item 9 in Table 7.

The former type involves three subtypes: (1) New compounds that were created in Japanese based on constituents originally borrowed from Classical Chinese. These were then borrowed into Mandarin in their graphic (kanji) form and realized with the modern Mandarin phonetic forms associated with these characters (items 3–6 in Table 7). (2) Japanese words written with kanji, borrowed into Mandarin with Mandarin phonetic forms regardless of their Japanese origin. This includes graphic loans of Japanese phonetic loans from European languages (items 7 & 8 in Table 7). (3) Zhao’s (2006: 312) “Japanese redefined vocabulary”, where the Japanese used compounds borrowed from Chinese for Western concepts different from the Chinese meaning, which may still have been in use in China (items 1 & 2 in Table 7, a case of semantic loan for compounds).

Graphic loans form a major part of the contemporary Mandarin terminology in the fields of law, economy, politics, sociology, sciences and humanities, as kanji compounding or the recycling of kanji compounds was Japan’s major strategy for rendering Western terminology into the 1920s, a time when Japan was a major source of modern knowledge for China. (For in-depth analyses of the development of the scientific terminology of Modern Mandarin, see Lackner et al. 2001.) These graphic loans, which are estimated to account for 30% of the modern Mandarin lexicon (Zhao 2006), do not show up in the LWT statistics, as they (1) are treated as calques, and (2) mostly refer to technical or modern terminology that is not part of the LWT vocabulary. Setting aside the meaning of the individual characters, the situation is not unlike that of Modern Italian or Greek making use of internationally coined scientific and technical terms consisting of Latin and Ancient Greek building blocks. Such terms would not strike the ordinary speaker as “foreign words”.

Table 7: Graphic loans in Mandarin

| No. | Graphic form | Meaning of constituents in Classical Chinese | Meaning of compound in Classical Chinese | Japanese phonetic form | Mandarin phonetic form | Meaning in Japanese and Mandarin |
|---|---|---|---|---|---|---|
| 1 | 革命 | ‘leather/transform’ + ‘heavenly mandate (of emperor)’ | ‘change of predetermination’ | kakumei | géming | ‘revolution’ |
| 2 | 社会 | ‘altar of earth god’ + ‘gathering’ | ceremonial gathering | shakai | shèhui | ‘society’ |
| 3 | 卫生 | (‘protect’ + ‘life’) | – | eisei | wèishēng | ‘hygiene’ |
| 4 | 建筑 | (‘build/construct’ + ‘build/construct’) | – | kenchiku | jiànzhù | ‘build’/‘building’ |
| 5 | 解放 | ‘solve/loosen’ + ‘set free’ | – | kaihō | jiěfàng | ‘liberate, liberation’ |
| 6 | 科学 | ‘division’ + ‘learn(ing)’ | – | kagaku | kēxué | ‘science’ < ‘education system’ |
| 7 | 瓦斯 | ‘tile’ + ‘Dem.’ | – | gasu | wǎsī | ‘gas’ (Dutch) |
| 8 | 马铃薯 | ‘horse’ + ‘small bell’ + ‘yam’ | – | bareisho | mǎlíngshǔ | ‘potato’ |
| 9 | 会计 | ‘pay’? + ‘calculate’ | accountant | kaikei | kuàijì | ‘accountant’ |

8.2. Loan blends

It is relatively common for a Mandarin morpheme to be added to a borrowed morpheme as part of the borrowing process. The result is a loan blend: a compound that contains a borrowed base and a native base. This process should be distinguished from the compounding of existing loanwords, where the resulting compound is not itself part of the lexical borrowing process. Examples of loan blends are provided in Table 8. In examples 1 to 4, the borrowed element is the first constituent, while in examples 5 and 6 it is the last constituent. The immediate donor language of the borrowed elements in these examples is English, though the English word itself might be a loanword.

Table 8: Loan blends in Mandarin

| No. | Graphic form | Phonetic form | Nonborrowed constituent | Borrowed constituent | Meaning of compound |
|---|---|---|---|---|---|
| 1 | 啤酒 | píjiǔ | jiǔ ‘liquor’ | pí ‘beer’ | ‘beer’ |
| 2 | 卡片 | kǎpiàn | piàn ‘slice’ | kǎ ‘card’ | ‘card’ |
| 3 | 芒果 | mángguǒ | guǒ ‘fruit’ | máng(guǒ) ‘mango’ | ‘mango’ |
| 4 | 探戈舞 | tàngēwǔ | wǔ ‘dance’ | tàngē ‘tango’ | ‘tango’ |
| 5 | 车胎 | chētāi | chē ‘car’ | tāi ‘tire’ | ‘tire’ |
| 6 | 酒吧 | jiǔbā | jiǔ ‘liquor’ | bā ‘bar’ | ‘bar’ |

8.3. Sound translations

As already noted above, the syllables that make up every loanword written with Chinese characters must, by definition, have a meaning. If this meaning bears no relation to the meaning of the source word, the result can be confusing. Therefore, when borrowing words, speakers of Chinese have historically preferred mechanisms other than outright lexical borrowing. One common strategy is choosing characters for the selected syllables that bear some relation to the source word. A looser phonetic resemblance to the source word can be acceptable if there is a gain on the semantic side. Some examples are provided in Table 9. This strategy is also very popular in translating foreign names, especially brand names.

Table 9: Sound translations in Mandarin

| No. | Graphic form | Phonetic form | Meaning | Literal meaning |
|---|---|---|---|---|
| 1 | 德意志 (联邦共和国) | Dé-yì-zhì (liánbāng gònghéguó); short: Déguó | Germany (FRG) | ‘Virtue - meaning - will’ (federal republic); short: ‘Land of virtue’ |
| 2 | 可口可乐 | Kě-kǒu kě-lè | Coca Cola | ‘Can - mouth - can - enjoy’, ‘palatable and enjoyable’ |
| 3 | 西门子 | Xīménzi | Siemens | ‘Son of the Western gate’ |
| 4 | 爱滋 | àizī | Aids | ‘generated by love’ |
| 5 | 维他命 | wéitāmìng | vitamin | ‘protect his life’ |
| 6 | 盖世太保; 格杀打扑; 盖斯塔波 | gàishìtàibǎo; geshādǎpū; gàisītǎbō | Gestapo | ‘cover/surpass - world - utmost safe’; ‘norm - kill - beat - aggress’; ‘cover - this - pagoda - wave’ |

Example 6 demonstrates how Chinese characters contribute not only to the objective understanding of loanwords, but also to their emotional value. It also gives an idea of the number of alternatives available for rendering even a bisyllabic word phonetically in a writing system where many syllables correspond to 20 or more characters, even if tone is distinguished. To avoid this problem, an official list of characters has been proposed for the transliteration of foreign names and terms. It contains relatively simple characters, which are mostly rare or function words unlikely to be found in compounds. A string of such characters thus raises the suspicion that the meaning of the characters should not be activated. Yet, especially for brands, “telling names” are still preferred.

9. Conclusion

One of the challenges in analyzing loanwords in Mandarin is explaining why there are so few loanwords to begin with. Of all languages in this book, Mandarin has by far the lowest loanword rate (1.2% overall). How can we account for this low figure?


One factor that can be discounted from the start is morphological type. Chinese is a highly isolating language with little by way of affixation or more complex synthetic morphological processes. As such, borrowing into Mandarin is technically easy, since loanwords do not need to undergo complex processes of morphosyntactic integration. Yet the loanword rate for Mandarin is very low. The crux of the explanation must therefore lie in extralinguistic factors. As already explained, the fact that there are relatively few loanwords in Mandarin does not mean that its lexicon has not been affected by borrowing. But for various reasons, speakers of Mandarin have preferred strategies other than direct lexical borrowing. These include semantic borrowing (including calquing), sound translations, and graphic loans (see §8). Overall, then, the lexicon of Chinese, and specifically of Mandarin, has by no means been impervious to borrowing. T’sou (2001: 41) may be right in suggesting sociolinguistic factors such as accessibility, agreeability and familiarity, as components of “cultural compatibility” as determining the preferred strategies of “lexical importation”. Indeed, Hong Kong Cantonese, also a Sinitic language using the Chinese writing system, shows a much higher inclination towards phonetic loans from English. This can be easily explained both by higher and more direct exposure to English in Hong Kong and a different attitude towards items and concepts introduced by this language. A systematic comparative study, also taking into account recent borrowing of phonetic loans by Mandarin through Cantonese would be desirable. Another important point that must be stressed is that a low figure does not necessarily mean a low loanword rate. Obviously, only known loanwords can be counted, and undetected loanwords are not included in the statistics. 
Norman (1988: 16) pointed out that “prior to the middle of the second millennium BC, China’s cultural superiority was almost certainly not as overwhelming as it was to become later on, and we should not rule out the possibility that in prehistoric times Chinese absorbed foreign elements, perhaps even on a relatively large scale. China’s later hegemony in East Asia has been confused with a kind of cultural and linguistic immunity which exempted Chinese from any but the most trivial of outside influences. Widespread acceptance of such a view has no doubt impeded a serious search for foreign influence in Chinese.”

In other words, there may be many non-Chinese loanwords in Mandarin, but they have not been identified. Norman even goes on to say that “[t]he fact that only a relatively few Chinese words have been shown to be Sino-Tibetan may indicate that a considerable proportion of the Chinese lexicon is of foreign origin” (Norman 1988: 17). Since it is probable that some of these pre-Chinese languages have since disappeared, chances are that loanwords from them will remain unidentified. Yet it is equally probable that serious comparative lexical investigations will uncover loanwords from other Sino-Tibetan languages as well as from the other known language families of the region, namely Austroasiatic, Hmong-Mien (Miao-Yao), Kra-Dai (Tai-Kadai), and Austronesian.


A second source of unidentified loanwords in Mandarin is intra-Sinitic borrowing (§7). Here the prospects of discovery are even higher than in the case of ancient loanwords, since the borrowing was more recent and the donor languages are still by and large extant. More scholarship in this area could no doubt result in future increases in the loanword figures for Mandarin.

The defining criteria of the LWT project also leave some well-known ancient borrowings uncounted. The project counts loanwords, not borrowed roots. Ancient loanwords that have since become part of complex words by processes of compounding or affixation (about two thirds of the Mandarin counterparts of the LWT list are analyzable) and are no longer available as free morphemes in contemporary Mandarin therefore do not enter the statistics.

Finally, the make-up of the LWT meaning list has also contributed to the very low borrowing figure for Mandarin. The LWT list is based on the International Dictionary Series list, which itself is based on Buck’s list. These lists tend by their nature and purpose to focus on relatively basic vocabulary. Yet loanwords in Mandarin, especially those borrowed in the modern period, have tended to consist mostly of technical vocabulary (see §4.3).

It is indeed ironic that so little is known about loanwords in Mandarin, which is not only the language with the largest number of native speakers in the world, but is also the language whose history is documented for the longest period. It is hoped that future studies will rectify this unfortunate incongruity.

References

Bauer, Robert. 2006. The Stratification of English Loanwords in Cantonese. Journal of Chinese Linguistics 34(2):172–191.

Bauer, Robert S. 1994. Sino-Tibetan *kolo “wheel”. (Sino-Platonic Papers 47).

Blench, Roger. 2005. Stratification in the Peopling of China. In Sagart, Laurent & Blench, Roger & Sanchez-Mazas, Alicia (eds.), The peopling of East Asia: Putting together archaeology, linguistics and genetics, 105–132. London: RoutledgeCurzon.

Cen, Qixiang. 1990. Hanyu wailaiyu cidian [Dictionary of foreign loans in Chinese]. Beijing: Shangwu yinshuguan.

Chappell, Hilary. 2001. Language Contact and Areal Diffusion in Sinitic Languages. In Aikhenvald, Alexandra & Dixon, R. M. W. (eds.), Areal diffusion and genetic inheritance: Problems in comparative linguistics, 328–357. Oxford: Oxford University Press.

Chen, Ping. 1983. Deutsch-Chinesisches Wörterbuch. Taipei: Lanbridge Press.

Chen, Ping. 1985. Das neue Chinesisch-Deutsche Wörterbuch. Beijing: Shangwu yinshuguan.

Chen, Ping. 1999. Modern Chinese. Cambridge: Cambridge University Press.

Feng, Zhiwei. 2004. The Semantic Loanwords and Phonemic Loanwords in the Chinese Language. 11th International Symposium of NIJLA, Tokyo, March 2004.

Gao, Mingkai & Liu, Zhengtan. 1958. Xiandai Hanyu wailaici yanjiu [Study of foreign loans in Modern Chinese]. Beijing: Wenzi gaige chubanshe.

Gao, Mingkai & Liu, Zhengtan & Mai, Yongqian & Shi, Youwei. 1984. Hanyu wailaici cidian [Dictionary of Chinese foreign loans]. Shanghai: Shanghai cishu chubanshe.

Gugong “Hanquan” gudian wenxian quanwen jiansuo ziliaoku (Hanquan database of Ancient Chinese texts).

Hall-Lew, Lauren Asia. 2002. English Loanwords in Mandarin Chinese. Honors Thesis. University of Arizona.

Heřmanová-Novotná, Zdenka. 1975. Morphemic Reproduction of Foreign Lexical Models in Modern Chinese. Archiv Orientální 43:146–171.

Hu, Xiaoqing. Wailaiyu [Foreign words]. Beijing: Xinhua chubanshe.

Lackner, Michael & Amelung, Iwo & Kurtz, Joachim. 2001. New Terms for New Ideas: Western Knowledge and Lexical Change in Late Imperial China. (Sinica Leidensia 52). Leiden: Brill.

LaPolla, Randy. 2001. The Role of Migration and Language Contact in the Development of the Sino-Tibetan Language Family. In Aikhenvald, Alexandra Y. & Dixon, R. M. W. (eds.), Areal diffusion and genetic inheritance: Problems in comparative linguistics, 223–254. Oxford: Oxford University Press.

Masini, Federico. 1993. The Formation of the Modern Chinese Lexicon: The Period from 1840-1898. (Journal of Chinese Linguistics, Monograph Series 6). Berkeley.

Norman, Jerry. 1988. Chinese. (Cambridge Language Surveys). Cambridge: Cambridge University Press.

Norman, Jerry & Mei, Tsulin. 1976. The Austroasiatics in Ancient South China: Some Lexical Evidence. Monumenta Serica 32:274–301.

Novotná, Zdenka. 1967. Linguistic Factors of the Low Adaptability of Loan-Words to the Lexical System of Modern Chinese. Monumenta Serica 26:103–118.

Peyraube, Alain. 2000. Westernization of Chinese grammar in the 20th century: Myth or reality. Journal of Chinese Linguistics 28(1):1–25.

Pulleyblank, Edwin. 1983. The Chinese and their neighbours in prehistoric and early historic times. In Keightley, David N. (ed.), The Origins of Chinese Civilization, 411–466. Berkeley.

Pulleyblank, Edwin. 1996. Early Contact Between Indo-Europeans and Chinese. International Review of Chinese Linguistics 1(1):1–24.

Sagart, Laurent. 1999. The roots of Old Chinese. Amsterdam: Benjamins.

Sagart, Laurent & Blench, Roger & Sanchez-Mazas, Alicia (eds.). 2005. The peopling of East Asia: Putting together archaeology, linguistics and genetics. London: RoutledgeCurzon.

Sanchez-Mazas, Alicia & Blench, Roger & Ross, Malcolm D. & Peiros, Ilia & Lin, Marie (eds.). 2008. Past Human Migrations in East Asia: Matching Archaeology, Linguistics and Genetics. London: Routledge.

Schuessler, Axel. 2003. Multiple Origins of the Chinese Lexicon. Journal of Chinese Linguistics 31(1):1–35.

Schuessler, Axel. 2007. ABC Etymological Dictionary of Old Chinese. Honolulu: University of Hawai‘i Press.

Scripta Sinica. Hanji dianzi wenjian. Zhongyang yanjiuyuan. Academia Sinica, Nankang.

Shi, Youwei. 2000. Hanyu wailaici [Chinese loanwords]. Beijing: Shangwu yinshuguan.

Starostin, Sergei A. 2008. Altaic loans in Old Chinese. In Sanchez-Mazas, Alicia & Blench, Roger & Ross, Malcolm D. & Peiros, Ilia & Lin, Marie (eds.), Past Human Migrations in East Asia: Matching Archaeology, Linguistics and Genetics, 254–262. London: Routledge.

T’sou, Benjamin K. 2001. Language contact and lexical innovation. In Lackner, Michael & Amelung, Iwo & Kurtz, Joachim (eds.), New Terms for New Ideas: Western Knowledge and Lexical Change in Late Imperial China (Sinica Leidensia 52), 35–66. Leiden: Brill.

Wang, William S.-Y. (ed.). 1995. The ancestry of the Chinese language. (Journal of Chinese Linguistics Monograph Series 8). Berkeley.

Wang, William S.-Y. & Wang, Feng. 2004. Basic Words and Language Evolution. Language and Linguistics 5(3):643–662.

Zhao, Jian. 2006. Japanese Loanwords in Modern Chinese. Journal of Chinese Linguistics 34(2):306–325.

Loanword Appendix

Austro-Asiatic

gēn: root, cause, foot, base

Austronesian

bīnglang: betel palm

Elamite

pútao: grape

English

níngméng: lemon

bāshi: bus

l%ba sh*zi (1)

motor coffee

Mongolian mógu

mushroom

s,ng

(Buddhist) monk

Proto-Austroasiatic

Southern Chinese

zh&u

g%ng

liàng píng

boat (traditional) bright bottle

harbor

Xiongnu luòtuo

camel

Proto-Hmong-Mien

Unknown origin

g#u miào

m% j*

European Colonial Languages m%dá k'f,i

trumpet lion

dog temple (ancestral temple, Buddhist temple, etc.)

Sanskrit b&li t%

glass pagoda, tower

g%nl%n x* (2) héshang

horse general term for chicken and hen olive tin, tinplate Buddhist monk/priest

Chapter 23

Loanwords in Thai*

Titima Suthiwan and Uri Tadmor

1. The language and its speakers

Thai is the national language of Thailand. The variety discussed in this chapter is Standard Thai, which has two rather different variants that form two ends of a continuum. At the acrolectal end is literary Thai, a written language that evolved in Ayutthaya during the 14th-18th centuries CE, when that city was the center of the Thai monarchy. At the basilectal end is spoken Standard Thai, a koine based on Central Thai dialects (but different from Bangkok’s original dialect). (Central) Thai is sometimes referred to as “Siamese”, to avoid confusion between Thai on the one hand and Tai languages (see below) on the other. Smalley (1994: 367) estimated that about 19.5% of the population of Thailand spoke Standard Thai as a first language, and that a further 27% spoke a different variety of Central Thai.¹

Thai is a member of the Tai subgroup of the Kra-Dai family, formerly known as Tai-Kadai. Other major Tai languages spoken in Thailand include Kam Meuang (spoken in the north); Lao (or Isan, as the local Lao dialect group is known, spoken in the northeast); and Paktay (spoken in the south). Lao is also spoken in Laos, although it has many more speakers in Thailand; there are also some Paktay-speaking communities in Malaysia. Other notable Tai languages include Shan, spoken in Shan State, Burma, and Zhuang, spoken mostly in Guangxi province, China. Smaller Tai minority languages are spoken as far east as Vietnam and as far west as India. Among the important indigenous non-Tai languages spoken in Thailand are Patani Malay, spoken mainly in the south, and Northern Khmer, spoken mainly in Surin and in a few other Thai provinces bordering on Cambodia. In addition, a number of Chinese languages are used by a rapidly dwindling yet still significant number of descendants of Chinese immigrants.

Most Thais who do not speak Standard Thai as a first language speak it as a second language. This is because, with few exceptions, it is used as the medium of instruction in schools, as well as in the mass media and in all government communication. The vast majority of Thailand’s inhabitants, who number about 63 million, speak at least some Thai as a first or second language.

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Suthiwan, Titima. 2009. Thai vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 2073 entries.

¹ In his statistics, Smalley strangely ignored the possibility, quite common in Thailand, of speaking more than one language natively.

Map 1: Thai and neighboring languages

2. Sources of data

The major source of data used for this study was the personal knowledge of the first author, a native speaker of Standard Thai who specializes in Thai language and literature. Various dictionaries were also consulted, the most important of which was the online version of the Royal Institute Thai Dictionary (Photjananukrom 2542/1999). Thai lexicographical resources were also accessed via the website of the Center for Research in Computational Linguistics in Bangkok.² For Sanskrit and Tamil, the principal sources were the Cologne Digital Sanskrit Lexicon and the Cologne Tamil Online lexicon.³ The point of reference for Khmer loanwords in Thai is Varasarin (1984), and for Chinese loanwords Manomaivibool (1975) and Gyarunsut (1983).

² , created and managed by Doug Cooper.

3. Contact situations

Thailand is situated where the “Sinosphere”, the area in Asia that has come under heavy Chinese influence, overlaps with the “Indosphere”, the area that has come under heavy Indian influence. Located in the geographical center of Southeast Asia, it is also surrounded by major local cultures and languages such as Khmer, Burmese, and Malay. Since the 17th century, Thailand (or, as it was known then, Siam) has also had diplomatic and trade relations with several European powers. This long history of intensive contacts with speakers (and literatures) of many different languages has led to the adoption of a large number of loanwords in Thai. The major contact situations are briefly described below.

3.1. Chinese influence before southwestward expansion

The Tai homeland was probably located in modern-day southern China, not far from the Vietnamese border. From there, Tai speakers started migrating southwestwards along the region’s river basins about one thousand years ago. When still in their original homeland, the Tais⁴ must have been under strong influence from the nearby Chinese, who were numerically, politically, and technologically superior. These early borrowings show the greatest resemblance to Middle Chinese reconstructions⁵ (6th-10th centuries), and include most numerals (including lower numerals, see §4.2) and many other common vocabulary items such as fùn ‘dust’ (< *pjuən ‘dust’), sǐaŋ ‘sound, voice’ (< *sjang ‘sound’), and tʰùa ‘bean’ (< *deu ‘bean’).

3.2. Chinese immigrants and traders

After the Tai established polities in Southeast Asia, they had strong trade relations with southern China. Many of the Chinese traders eventually settled in the Tai states and were to have strong influence on various aspects of Tai life and culture, such as commerce, cuisine, and music. Loanwords from Chinese vernaculars include kâwʔîː ‘chair’ (< Hokkien kau-í ‘chair’), tó ‘table’ (< Hokkien toh ‘table’), and ŋən ‘silver, money’ (< Chaochow ŋiŋ ‘silver’).

³ The URL for both lexicons is .

⁴ As already alluded to above, by a confusing convention a distinction is made in writing (though not in pronunciation) between Thai, the language of modern central Thailand and its speakers, and Tai, the language family to which Thai belongs, which also includes other languages. The ancestors of modern Thais are therefore referred to as Tais.

⁵ Tones are not marked here for Middle Chinese reconstructions because there is no conventional way of doing so. Apparently there were eight tones, but studies such as Manomaivibool (1975) do not mark them regularly.

3.3. Contact with Mons

Before the arrival of the Tai, the area known today as central Thailand was inhabited mostly by Mons, who spoke an Austroasiatic language unrelated to Tai. There is no record of any major conflict between the Mon and the Tai. While the Tai assimilated to the Mon culturally, adopting Theravada Buddhism and other aspects of culture from the Mon, the Mon gradually assimilated to the Tai politically and linguistically. During the transition period, Mon-Tai bilingualism must have been widespread, and this brought about some lexical borrowing. Mon loanwords in Thai include k&'$ ‘island’ (< k&$ ‘island’), (má)phrá%w ‘coconut’ (< Old Mon bra%w ‘coconut’), and kwà%t ‘sweep’ (< kwàt ‘to scratch, rake’).

3.4. Indian religious and literary influence

As already mentioned, after migrating to the area of present-day Thailand, the Tai encountered the Mon (and the Khmer), who had by then undergone considerable Indianization. Indian influence still permeates many aspects of Mon and Khmer cultures, from clothing to architecture to literature. The Tai quickly assimilated into this Indianized culture, adopting Buddhism as their religion along with an Indian-derived script and Indian literary traditions. Sanskrit and Pali, the liturgical and literary languages of ancient India, have had an enormous influence on the Thai vocabulary (Gedney 1947). Most Buddhist religious terms are of Indic origin, as well as words for many everyday items, especially from the field of time, e.g. we%la% ‘time’ (< Sanskrit v(l) ‘time’), na%líka% ‘clock, watch’ (< Sanskrit n)*ika ‘measure of time’), $a%tʰít ‘week’ (< Sanskrit )ditya ‘sun’), kʰànà ‘moment’ (nh)

to cure

Chinese th3 gi@i

world

l%u v2c

valley

/+o

island

dì

aunt

/)i l-c

mainland

m

mother’s sister

/Cng

rough (2)

thím

father’s sister

/)i d%(ng

ocean

tB tiên

ancestors

h=

lake

hDu du>

descendants

trinh l>nh

th0y

to see

ho'c

or

to command, to order

xanh

blue

không

no, not

b)n

friend

mó

to touch, to feel

ngôn ng"

language

khách

guest

l)nh

cold

ch"

word

phong t-c

custom

linh h=n

soul, spirit

t1

word

âm m%u

plot

yêu

to love

th1a nhDn

to admit

chi3n /0u

to fight

ti3c

to regret, to be sorry

ph4 nhDn

to deny

chi3n tranh

war, battle

hKa

to promise

hoà bình

peace

th

tear

kiêu ng)o

proud

c0m

can /+m

brave

kêu

to call (1)

cung

bow

s2 nguy hi?m danger

báo (1)

to announce

tên (2)

arrow

hy v5ng

to hope

khoe

to boast

g%(m

sword

trung thành

faithful

vi3t

to write

pháo /ài

fortress

true

/5c

to read

tháp

tower

gi0y

paper

chi3n thGng

victory

thDt

s2 khi?n trách blame

24. Loanwords in Vietnamese


th0t b)i

defeat

phi c(

airplane

phó mát

cheese

t0n công

attack

/i>n

electricity

b(

butter

b+o v>

to defend

c( gi@i

machine

vang

wine

/*u hàng

to surrender

b>nh vi>n

hospital

(v+i) lanh

linen

tù ph)m

captive, prisoner

viên

pill, tablet

aó s( mi

shirt

chi3n l,i phFm

booty

kính (2)

spectacles/glasses xà phòng

soap

chính ph4

government

bulô

birch

ph-c kích

ambush

tBng th!ng

president

(xe) ô tô

car

pháp luDt

law

bC tr%Ang

minister

(xe) búyt

bus

pháp vi>n

court

c+nh sát

police

pin

battery

phán quy3t

judgment

tCi ác

crime

mô tô

motor

nguyên cáo

plaintiff

cuCc tuy?n c: election

tem

postage stamp

b$ cáo

defendant

/$a ch6

address

b-ng

bank

t! cáo

to accuse

s!

number

tua vít

screwdriver

k3t án

to condemn, to convict

b%u ki>n

post/mail

bom

bomb

ph)m tCi

guilty

th%

letter

phim

film/movie

vô tCi

innocent

ngân hàng

bank

cà phê

coffee

hình ph)t

punishment

b=n

sink

hCp

tin/can

báo (2)

newspaper

tuc-ng

toucan

l$ch

calendar

ôpôt

opossum

âm nh)c

music

ôliu

olive

chè

tea

aó ponsô

poncho

qua

through

gi!ng

same

s2 c%Jng dâm rape tôn giáo

religion

chúa (2)

god

th*n

god

dn tho)i

telephone

French

ho+ xa

train

xúp

Indo-European

Proto-Tai /2c

male (2)

bò /2c

bull, ox

raccoon

v$t

duck

canguru

kangaroo

b= câu

dove

bum(rang

boomerang

m%(ng

ditch

ra/iô

radio

nong

ti vi

television

basket (to dry things)

chèo (1)

oar

chèo (2)

to row

soup

Chapter 25

Loanwords in White Hmong*

Martha Ratliff

1. The language and its speakers

The thirty-odd languages of the Hmong-Mien (or Miao-Yao) family are spoken in southern China and in northern Southeast Asia.1 White Hmong (Hmong Daw, Hmoob Dawb)2 is a dialect of the Hmong language that is spoken primarily in Sichuan, Guizhou, and Yunnan provinces in China, as well as in northern Vietnam, Laos, and Thailand. There are minor differences between dialects called “White Hmong” in different countries: the variety represented in the subdatabase and described in this chapter is the White Hmong of Laos.

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Ratliff, Martha. 2009. White Hmong vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1292 entries.

1 Hmong speakers in China belong to the Miao “nationality”, which is an officially-recognized minority ethnic group. The referents of “Hmong” and “Miao” are not equivalent, however. Miao people may speak Hmongic languages other than Hmong, such as Hmu, A-Hmao, Pa-Hng, and Qo Xiong (Wurm et al. 1988). The Miao of Hainan Island speak Mun, a Mienic (“Yao”) language that belongs to a separate branch of the Hmong-Mien family, and many other Miao only speak a local dialect of Chinese. This follows from the fact that the concept of “nationality” is not a purely linguistic classification but also takes into account cultural practices, politics, and self-identification (Sun 1992). To avoid conflating ethnic and linguistic categories, many Western linguists have adopted the name “Hmong-Mien” to refer to this language family. Western anthropologists, however, use the traditional name “Miao-Yao” to refer to the language family, insofar as it is the generally accepted name for these communities of speakers in China and does not arbitrarily elevate the names of two representative languages to the name of the group as a whole.

2 The White Hmong orthography used in this chapter and in the database is the Romanized Popular Alphabet designed by William A. Smalley and G. Linwood Barney in the 1950s. It is more widely used than any other orthography by White Hmong people in the diaspora. For the most part, the values of the symbols are equal to their IPA values (so c is in fact a voiceless palatal stop and q is in fact a voiceless uvular stop), with the following exceptions: (1) since there is only one possible final consonant in a Hmong word – [ŋ] – consonant symbols in word-final position have been used to indicate tones: high level -b, high falling -j, mid rising -v, low level -s, mid level -ø, falling breathy -g, and low creaky -m; (2) the final [ŋ] is indicated by a doubling of the vowel: -oo- is thus [ɔŋ]; (3) certain symbols have special values: x [s], tx [ts], s ["], ts [t"], xy [#], r [$], g [ŋ], w [%]; and (4) in prenasalized clusters, the nasal is always written n, even though it assimilates to the following consonant (npua ‘pig’ is thus pronounced [mpua]).
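The tone-letter convention described in the orthography note can be illustrated with a short sketch. It assumes only the seven tone spellings listed in the note; the function names `split_tone` and `has_final_velar_nasal` are hypothetical and are not part of the World Loanword Database or of any existing library.

```python
# Word-final consonant letters of the Romanized Popular Alphabet (RPA)
# mark tone; a syllable with no final consonant letter is mid level.
# Only the seven tones listed in the orthography note are covered.
RPA_TONES = {
    "b": "high level",
    "j": "high falling",
    "v": "mid rising",
    "s": "low level",
    "g": "falling breathy",
    "m": "low creaky",
}

RPA_VOWEL_LETTERS = "aeiouw"  # w is a vowel letter in the RPA


def split_tone(syllable: str) -> tuple[str, str]:
    """Split one RPA syllable into (segments, tone name)."""
    if syllable and syllable[-1] in RPA_TONES:
        return syllable[:-1], RPA_TONES[syllable[-1]]
    return syllable, "mid level"


def has_final_velar_nasal(segments: str) -> bool:
    """A doubled vowel letter spells a final velar nasal."""
    return (
        len(segments) >= 2
        and segments[-1] == segments[-2]
        and segments[-1] in RPA_VOWEL_LETTERS
    )


if __name__ == "__main__":
    for word in ["npua", "kub", "xoom", "vis"]:
        segments, tone = split_tone(word)
        print(word, segments, tone, has_final_velar_nasal(segments))
```

Under these assumptions npua ‘pig’ comes out as mid level with no final nasal, while xoom ‘zero’ is read as the segments xoo (with a final velar nasal) plus the low creaky tone.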


It is hard to say how many speakers of this language there are, given the difficulty of determining which related tongues count as varieties of “the Hmong language”, the problems with census counts, and population changes. However, this is not an endangered language: after a careful consideration of all these factors, Lemoine (2005) conservatively estimates that there may be between 4 and 5 million speakers of the Hmong language altogether (3.5 million in China; 1.25 million in Southeast Asia; 320,000 in the diaspora).3 Within the Hmong-Mien family of languages, two main subfamilies have been identified: the Hmongic subfamily and the Mienic subfamily. A provisional family tree structure is presented in Figure 1.

[Tree diagram. Hmong-Mien divides into Hmongic and Mienic. Labels under Hmongic: West Hmongic, North Hmongic, East Hmongic, Hmong, A-Hmao, Bunu, Qo Xiong, Hmu, Jiongnai, Pa-Hng. Labels under Mienic: Mien-Mun, Mien, Mun, Biao Min, Zao Min.]

Figure 1: Genealogy of the Hmong-Mien languages

3 Some poorly-understood tongues in China may also belong to the Hmong language, but are not included in this particular count (Lemoine 2005: 7).

Although the highest-level two-branch structure is not in doubt given the obvious lexical, phonological and grammatical differences between languages belonging to the two subfamilies, more work needs to be done to refine the internal structure of each subfamily. This is the smallest language family in Southeast Asia: the neighboring languages from which Hmong has borrowed words belong to the vastly larger Tibeto-Burman, Mon-Khmer, and Tai-Kadai families, and, most importantly, to the Sinitic subgroup of Sino-Tibetan.

Traditionally, Chinese scholars have claimed that Hmong-Mien belongs to the Sino-Tibetan language family, along with Chinese, Tibeto-Burman (which includes Tibetan, Burmese, Karen, and many smaller languages of southern and western Asia), and Tai-Kadai (which includes Thai, Lao, Shan, Zhuang, and many smaller languages of Southeast Asia). Most recent Chinese scholarship continues to support this view (Wang 1986; Pan 2006). However, the belief that the Sino-Tibetan family includes Hmong-Mien and Tai-Kadai is not widely shared by linguists outside China. Despite massive numbers of Chinese loanwords in Hmong-Mien languages, differences in basic vocabulary raise serious doubt that Chinese and Hmong-Mien are related. Furthermore, most linguists reject typological similarities as evidence of genetic relationship; the similarities in grammar, word structure, and phonological systems between Chinese and Hmong-Mien languages can be explained by the dominance of Chinese in the area and widespread bilingualism.

On the basis of similarities in basic vocabulary, other family connections have been proposed. Forrest (1973 [1948]: 93–103), Downer (1963), Haudricourt (1966), and Peiros (1998: 155–160) have favored the possibility of a family relationship with Mon-Khmer, while Benedict linked Hmong-Mien to Austronesian and Tai-Kadai as part of “Austro-Tai” (Benedict 1975). Neither of these proposals has gained general acceptance among scholars. Until a careful separation of layers of Chinese borrowings from native Hmong-Mien vocabulary has been completed – a task which this contribution only begins to address – and the remaining core has been systematically compared to these other families, the question of wider relationship cannot be resolved. The most prudent position to take in the meantime is that Hmong-Mien constitutes an independent family of languages.

In villages in Laos, White Hmong is used for all daily functions of home, work, animist religious practices, and social interaction. Those without formal education are monolingual, although there may be some knowledge of the languages of other hill tribes (Mon-Khmer languages such as Khmu, or Tibeto-Burman languages such as Akha and Lahu).
Additionally, bi-dialectism with Green Mong is widespread. White Hmong and Green Mong are mutually intelligible, but nonetheless have significant differences in vocabulary and phonology. Speakers of the two dialects may marry, and villages may thus include speakers of both dialects. The question of inter-dialect borrowing is a matter too subtle for discussion here, but it surely needs to be addressed in a fuller study of loanwords in either dialect, ideally one conducted by a Hmong linguist.

The language of school and government is Lao (or Laotian), the national language of Laos, a Tai-Kadai language closely related to Thai. All Hmong who have received some years of formal education in Laos know Lao, as do all members of the community who are involved in government or business dealings with those outside the village.

In Laos, the language of written literature and the media is predominantly Lao. Although an excellent orthography for White Hmong exists (the Romanized Popular Alphabet or “RPA”, see footnote 2), no substantial literature in this or in any other Hmong script has been produced in Laos. However, many books, periodicals, websites, and literacy materials have been produced in the RPA by Hmong people who emigrated to Australia, the United States, and other countries. Twenty years ago one of the linguists who developed the RPA in the 1950s, William Smalley, expressed the opinion that the use of this or any other written form of Hmong would not last: “Functional viability seems precarious for these writing systems not supported by a literate culture” (Smalley 1988: 20). However, the RPA appears to be holding its ground today; it is therefore used in this chapter and in the accompanying subdatabase.

Map 1: Geographical setting of White Hmong

2. Sources of data

The principal sources of data were published dictionaries of White Hmong. In order of importance, these were Xiong (2006), Heimbach (1979), and Bertrais (1979 [1964]). The Xiong dictionary was an especially good source: it is fairly thorough, documents the variety of White Hmong spoken in Laos, and indicates when a word is a loanword from Lao. The dialect represented also corresponds closely to the dialect of my consultant, Na Yang. The Heimbach dictionary provides nuanced glosses for the Hmong entries; however, this dictionary is based on the variety of White Hmong spoken in Thailand, and there are some minor lexical differences between this variety and that spoken in Laos. The main strength of the Bertrais dictionary, which documents the White Hmong of Laos, is that it provides numerous examples of each word in context.


All entries initially made on the basis of these published sources were checked, and in many cases changed, upon consultation with native speaker Na Yang. Ms. Yang is an elementary school teacher in the Detroit public schools, and holds a B.A. in Education from Wayne State University. She was born in Saamthong, a village near Long Cheng in Xiangkhouang Province, Laos in 1967. When she was a young girl she went to live with her father and stepmother in the capital city of Vientiane. She attended school in the city for a number of years, and thus knows Lao well. After the end of the Vietnam War, in December 1979, when she was 12 years old, Ms. Yang and her family fled to the Ban Vinai Refugee Camp in Thailand. The family emigrated to Canada the following year. After seven years in Canada, marriage, and eight years in California, she moved with her family to the Detroit area. Ms. Yang has taken two courses in linguistics, and has developed ingenious teaching materials for her Hmong students; in her work on the Loanword Typology project she has also shown a great sensitivity to language. Not only was she able to supply words that did not appear in the dictionaries, she was also able to confirm those cases where a gap in the dictionaries truly represented the absence of a word for a particular meaning. She would often lead me to substitute one word for another, either a more precise word for the meaning given, or one more commonly used.

A fair amount of scholarship on loanwords in languages of this family exists. Works that focus on lexical and syntactic borrowing in White Hmong include Downing & Fuller (1985) (on English loanwords in White Hmong spoken in the United States), Fuller (1986) (on the Hmong perfective particle lawm as a loanword from Chinese 了, Mandarin le), Mortensen (2000) (on Sinitic loanwords in White and Green Hmong), and Ratliff (forthcoming) (on the borrowing of the classifier construction).
Other work has dealt with older strata of loanwords that appear across the family: Ying 1972, Benedict 1987, Haudricourt & Strecker 1991, Sagart 1995, Ratliff 2001, Mortensen 2002. For Chinese loanwords in Mien, the best-described language from the other side of the family, the most important reference (and a model for the type of study that should be conducted for Hmong) is Downer 1973.

3. Contact situations

The oldest relationships involving Hmong may not be contact relationships at all: similarities between ancient words in Hmong and words in the Austronesian, Mon-Khmer, or Tai-Kadai families may be due to common inheritance. Accordingly, pending further research, very few of these intriguing words are identified as loanwords in the subdatabase; they merit separate study in a project on Asian prehistory. Information about a few such words is included in the subdatabase, however, so that this issue could be raised: for example, ‘bird’, ‘die’, ‘kill’ (Austronesian), ‘blood’, ‘to cry’ (Mon-Khmer), and ‘fish’ (Tai-Kadai). These words differ from the Wanderwörter in the subdatabase in that they are basic words; the widely shared areal words like ‘hawk’ and ‘crossbow’, by contrast, are less basic and more widely shared. These areal words are simply contact words for which we cannot identify a source language; it could even be that Hmong-Mien is the source for one or more of them.

The relationship with Chinese, on the other hand, is a contact relationship; Chinese is by far the most important contact language for Hmong. My working assumption is that almost all words shared by Chinese and Hmong are loans from Chinese to Hmong, since Chinese is the dominant language in the homeland of the Hmong-Mien-speaking peoples. It may be, however, that some shared words were borrowed from Hmong by the Chinese; this is noted in a few entries. It may also be the case that a few ancient shared words (such as the numeral ‘one’) reflect common inheritance. The designation “ancient”, “early” or “early modern” in the subdatabase reflects which form of the Chinese word most closely resembles White Hmong in present or reconstructed form: Old Chinese (c. 1000 BCE), Middle Chinese (c. 500 CE), or modern Chinese (c. 1500 CE to the present).

What types of relationships did the Chinese and Hmong have at these different periods? This is very hard to determine. First, there are no written records in Hmong before the twentieth century, so it is not possible to use textual evidence to help us answer this question. Second, it is not clear how we can relate the phonological evidence linking Hmong to a particular stage in the development of Chinese (phonological strata) to different contact situations. We can only attempt to glimpse aspects of the contact situation through the types of words borrowed.

For the oldest phonological stratum, we go back to when the ancestor language of Hmong was spoken in the middle Yangzi River basin, an area where we can place the ancient Hmong-Mien peoples 2500 years ago on the basis of the antiquity of native words for flora and fauna associated with this region (Ratliff 2004).
The Chinese loanwords that entered the language in this stratum most closely resemble Old Chinese, and include the manufacturing terms ‘gold’, ‘silver’, ‘iron’, and ‘to carve/chisel’; the measure and commerce words ‘half/middle’, ‘lend/borrow’ and ‘price’; and the adjectives ‘wide’, ‘narrow’, ‘sweet’, ‘yellow/light’, ‘low’, and ‘dry’. Pulleyblank (1983) and Sagart (1999: 8) suppose that the Hmong-Mien-speaking people belonged to the ancient southern kingdom of Chu (楚), which was established at this time in an area that corresponds to modern-day Hunan and Hubei provinces. If not Chinese themselves, the rulers of this kingdom were fluent and literate in Chinese.

Loanwords in a second, larger, group show a closer resemblance to Middle Chinese. It is difficult to say what the nature of the relationship between Chinese and the ancestors of the Hmong was at this period, beyond the fact that the two were close (see the quotation from Benedict below) and Chinese was dominant. The loanwords in this group suggest that the Chinese shared farming practices with the Hmong only in this period, but not earlier (‘cow’, ‘water buffalo’, ‘sheep’, ‘chicken’, ‘goose’, ‘duck’, ‘sickle’, ‘to harvest’, ‘to plow’, ‘to plant’), while contact through manufacturing (‘saw’, ‘hammer’, ‘copper’) and commerce (‘to buy’, ‘to sell’, ‘to weigh’, ‘hundred’, ‘thousand’, ‘to count’) continued.

During a period from the mid-17th through the mid-19th centuries, the ancestors of the Hmong moved southward into the southernmost provinces of China, and finally from China into Vietnam, Laos, and Thailand under pressure from an expanding Han population, intermittent warfare, increased taxation, and a desire to maintain their own distinctive way of life (Culas & Michaud 2004: 64). The contact language at this time was Southwest Mandarin. Within this stratum, loanwords from Chinese may have continued to come into the language even in Laos, since the Hmong had regular contact with traveling Chinese merchants in Southeast Asia (Mortensen 2000: 11; Lyman 1974: 40–41).

Turning away from Chinese now, there are a number of loanwords from an unknown Tibeto-Burman source language in Hmong; some of them correspond to words reconstructed for Proto-Hmong-Mien, and are thus very old. Despite their age, however, it is more likely that this is a contact relationship than a genetic relationship because the most important Tibeto-Burman loanwords fall into sets, and were presumably borrowed as sets: the numerals ‘four’ through ‘nine’ (and perhaps ‘ten’); ‘sun’ and ‘moon’; and ‘son-in-law’ and ‘daughter-in-law’. Although some of these resemblances had been noted earlier, in 1987 Benedict published a brief but important study that brought discussion of these loanwords together. Several more loanwords from Tibeto-Burman have been discovered since then (Mortensen 2002; Ratliff 2001), yet Benedict’s belief that this was a contact relationship is still well supported by the fact that these words are “… sparse, and rigidly confined to specific categories.
The early MY [Miao-Yao] speakers made good use of the higher numerals of the TB-speakers on their west and even shared in their heavenly body (sun, moon) cults, perhaps also entered into certain marital alliances with them, but they kept their distance: with their Chinese neighbors, on the other hand, they shared a community existence of sorts as a ‘substratumized’ population, the two groups sharing cultural items of various kinds. To put it somewhat differently, they had the DMY [Donor-Miao-Yao]-speakers as neighbors; they lived with the Chinese.” (Benedict 1987: 20).

The most important recent contact language for the White Hmong of Laos is Lao, the national language of Laos, a Tai-Kadai language closely related to Thai. This is naturally the language from which a minority group would have taken many words for urban life, commerce, education, culture, government, and modern technology over the last hundred years. Of course, there are many English loanwords in the White Hmong spoken in the United States, Australia, and Canada today (and French loanwords in the White Hmong spoken in France, etc.); however, this subdatabase attempts to capture White Hmong as it was spoken before contact with the west, so they are not included here.


4. Numbers and kinds of loanwords

As is to be expected of a minority language, Hmong has a relatively high percentage of loanwords. Of the 1292 word entries in the subdatabase, about a fifth are “probably” or “clearly” borrowed. However, if we add those words that are “perhaps” borrowed and those created on a loan basis, the number rises to over a third. The language thus clearly has a strong imprint of outside languages – especially Chinese. It is undoubtedly the case that the number of loanwords in the subdatabase will rise upon further research; eventually I hope to be able to add those words whose meaning or form (or both) suggest that they are non-native, but for which no outside source has yet been identified.

Table 1: Loanwords in White Hmong by donor language and semantic word class (percentages)

                 Old      Tibeto-  Middle   Modern                              Total      Non-
                 Chinese  Burman   Chinese  Chinese  Lao      Thai     English  loanwords  loanwords
Nouns            2.0      1.5      4.0      6.9      6.7      0.1      0.3      21.5       78.5
Verbs            2.7      0.3      7.5      7.3      0.9      -        -        18.8       81.2
Adjectives       5.8      -        5.2      10.1     3.5      -        -        24.6       75.4
Adverbs          -        -        -        25.0     -        -        -        25.0       75.0
Function words   4.0      8.0      2.3      6.9      1.1      -        -        22.4       77.6
All words        2.6      1.5      4.9      7.4      4.6      0.1      0.2      21.2       78.8
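As a rough consistency check on Table 1, the donor-language percentages in each row should add up to the "Total loanwords" figure, allowing for rounding, and the total and non-loanword figures should add up to 100. The following is a minimal sketch under the assumption that empty cells are zero and that the cell values are read as laid out in the table; the names `TABLE_1` and `row_discrepancies` are illustrative only.

```python
# Columns: Old Chinese, Tibeto-Burman, Middle Chinese, Modern Chinese,
# Lao, Thai, English. Empty table cells are entered as 0.0 (assumption).
TABLE_1 = {
    # word class: (donor percentages, total loanwords, non-loanwords)
    "Nouns": ([2.0, 1.5, 4.0, 6.9, 6.7, 0.1, 0.3], 21.5, 78.5),
    "Verbs": ([2.7, 0.3, 7.5, 7.3, 0.9, 0.0, 0.0], 18.8, 81.2),
    "Adjectives": ([5.8, 0.0, 5.2, 10.1, 3.5, 0.0, 0.0], 24.6, 75.4),
    "Adverbs": ([0.0, 0.0, 0.0, 25.0, 0.0, 0.0, 0.0], 25.0, 75.0),
    "Function words": ([4.0, 8.0, 2.3, 6.9, 1.1, 0.0, 0.0], 22.4, 77.6),
    "All words": ([2.6, 1.5, 4.9, 7.4, 4.6, 0.1, 0.2], 21.2, 78.8),
}


def row_discrepancies(table, tolerance=0.15):
    """Return the rows whose donor sum or grand total is off by more
    than the rounding tolerance."""
    problems = []
    for word_class, (donors, total, non_loans) in table.items():
        donor_sum = round(sum(donors), 1)
        if abs(donor_sum - total) > tolerance:
            problems.append((word_class, "donor sum", donor_sum, total))
        if abs(round(total + non_loans, 1) - 100.0) > tolerance:
            problems.append((word_class, "grand total", total + non_loans, 100.0))
    return problems


if __name__ == "__main__":
    print(row_discrepancies(TABLE_1))
```

The tolerance of 0.15 percentage points is a choice made here to absorb rounding to one decimal place; it is not a figure from the chapter.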

In interpreting the data in Table 1, it is important to note that there were cases where the closest word in Hmong to a meaning on the Loanword Typology (LWT) meaning list did not belong to the word class associated with that meaning in the database. For example, there is no noun ‘fog’, only an adjective meaning ‘foggy’; there is no noun ‘harvest’, only a verb meaning ‘to harvest’; there is no noun ‘bunch’, only a nominal classifier for bunches; and there is no verb for ‘to hurry’, only a modifier meaning ‘quickly’. The number counts in the table above should therefore only be taken as approximations; a more precise count can be made by consulting the comments in the subdatabase for a particular category of interest.

In the set of borrowed adjectives, there is a weak pattern that nonetheless may be of cross-linguistic interest. In five cases where there are antonyms in which one term is positive and the other negative, the positive term is native and the negative term is borrowed: native ‘full’ vs. Chinese ‘empty’, native ‘wet’ vs. Chinese ‘dry/arid’, native ‘clean’ vs. Chinese ‘dirty’, native ‘smooth’ vs. Chinese ‘wrinkled’. To this group we may add ‘good’ vs. ‘bad’. Although ‘good’ may ultimately be a borrowing from Old Chinese, it is ancient in Hmong-Mien, but ‘bad’ is clearly a relatively modern Chinese loanword, and means more narrowly ‘evil’: for ‘bad’, people more often say ‘not good’. There are antonyms where both terms are borrowed (‘cheap’ vs. ‘expensive’, ‘straight’ vs. ‘crooked’, ‘right’ vs. ‘wrong’, ‘easy’ vs. ‘difficult’) and antonyms where both terms are native (‘sharp’ vs. ‘blunt’, ‘beautiful’ vs. ‘ugly’, ‘clever’/‘wise’ vs. ‘stupid’), but I have found no antonyms where the positive term is borrowed and the negative term is native. Given the small number of positive/negative antonyms in the subdatabase, this may be accidental – but while working with Na Yang, I was struck by the difficulty she had in coming up with negative counterparts to some positive terms. This is not recorded in the database; if a negative term turned up after some thought, it was entered.

When categorized by semantic field, the five fields that contain the highest percentage of loanwords are, in order, (1) Time, (2) Quantity, (3) Modern world, (4) Social and political relations, and (5) Possession. The five fields that contain the lowest percentage of loanwords are, in order, (1) Kinship, (2) The body, (3) Religion and belief, (4) Law, and (5) Speech and language (see Table 2). Some of the fields that fall on the high or low end of a scale determined by percentage of loanwords are not surprising: we expect many words that have to do with the marketplace (possession), the modern world, and social/political relations to be loanwords, since they involve interaction with others; similarly we expect to find few loanwords among more private words referring to kinship relations, beliefs, speech, and the body. Some are a bit surprising, however, and require comment.

Time words in the LWT meaning list refer to divisions of time that are not important to an agrarian people that organize their daily activities by the rotation of the sun, and organize their yearly activities by the rotation of the seasons. Therefore, ‘hour’, ‘clock’, ‘week’, and the specific days of the week, are unnecessary concepts.
Even ‘day’, ‘month’ and ‘year’ are borrowed: ‘day’ and ‘month’ (from ‘sun’ and ‘moon’) as a set from some Tibeto-Burman source, and ‘year’ from Chinese. Borrowed quantity words in the subdatabase refer to concepts that were not of crucial importance to the ancient Hmong-Mien people: for numerals, ‘two’, ‘three’ (< ‘group’), and ‘many’ appeared to suffice. ‘One’ appears to be the same word as Chinese ‘one’. ‘Four’ through ‘nine’ (and perhaps ‘ten’) are Tibeto-Burman in origin. ‘Zero’ and all higher numerals are borrowed, and the ordinals are built on a Lao base. Other quantity words are also borrowed: not only ‘to count’ (which makes sense in the absence of numerals to count) but also ‘more’, ‘only’, a second word for ‘many’, and ‘half’. Finally, the fact that the semantic field of the law shows a low percentage of loanwords should not lead readers to the conclusion that the Hmong have a native legal tradition that is directly reflected in the lexicon. The modern legal concepts in this group are primarily dealt with by paraphrase in Hmong, and several concepts have set phrasal equivalents (‘court’ = ‘house discuss case’, ‘judge’ = ‘one decide guilt’, ‘plaintiff’ = ‘one accuse’, ‘defendant’ = ‘one suffer accuse’). Even if a set phrase includes a loanword like ‘guilt’, the phrase itself will not be categorized as a loanword.

Table 2: Loanwords in White Hmong by donor language and semantic field (percentages)

                                    Old      Tibeto-  Middle   Modern                              Total      Non-
                                    Chinese  Burman   Chinese  Chinese  Lao      Thai     English  loanwords  loanwords
 1 The physical world               0.5      3.0      5.9      3.0      3.0      -        -        15.3       84.7
 2 Kinship                          -        3.4      -        1.7      -        -        -        5.2        94.8
 3 Animals                          2.2      3.3      10.0     4.5      3.3      -        -        23.4       76.6
 4 The body                         1.0      0.6      2.6      1.3      0.6      -        -        6.1        93.9
 5 Food and drink                   1.4      -        6.8      7.0      4.2      -        -        19.3       80.7
 6 Clothing and grooming            3.1      -        -        10.4     4.2      -        -        17.7       82.3
 7 The house                        1.2      -        1.6      9.5      7.2      -        -        19.5       80.5
 8 Agriculture and vegetation       5.5      -        11.0     9.1      1.8      -        -        27.4       72.6
 9 Basic actions and technology     8.6      -        8.6      7.1      2.1      -        -        26.4       73.6
10 Motion                           0.7      -        7.4      3.0      3.0      -        -        14.0       86.0
11 Possession                       8.8      -        11.5     11.5     2.3      -        -        34.0       66.0
12 Spatial relations                6.7      -        6.5      8.6      1.4      -        -        23.2       76.8
13 Quantity                         4.6      16.1     6.9      6.9      4.6      -        -        39.1       60.9
14 Time                             1.7      3.4      3.4      13.5     18.6     -        -        40.5       59.5
15 Sense perception                 8.1      1.1      4.4      6.6      1.1      -        -        21.2       78.8
16 Emotions and values              2.2      1.1      -        10.1     4.3      -        -        17.7       82.3
17 Cognition                        1.1      -        -        18.0     8.5      -        -        27.5       72.5
18 Speech and language              -        -        5.6      8.3      -        -        -        13.9       86.1
19 Social and political relations   -        -        9.4      21.9     6.3      -        -        37.5       62.5
20 Warfare and hunting              -        2.9      -        14.4     5.7      -        -        23.0       77.0
21 Law                              -        -        4.2      3.5      4.2      -        -        11.9       88.1
22 Religion and belief              -        -        -        11.4     -        -        -        11.4       88.6
23 Modern world                     -        -        1.9      4.7      26.5     1.9      3.8      38.8       61.2
24 Miscellaneous function words     10.4     -        -        10.4     -        -        -        20.9       79.1
   All words                        2.6      1.5      4.9      7.4      4.6      0.1      0.2      21.2       78.8
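The ranking of semantic fields discussed in the text can be recomputed from the "Total loanwords" column of Table 2. The sketch below simply sorts the 24 field totals; the variable and function names are illustrative assumptions, and the figures are the column values as given in the table.

```python
# Semantic fields in table order, with their "Total loanwords" percentages.
FIELD_TOTALS = [
    ("The physical world", 15.3), ("Kinship", 5.2), ("Animals", 23.4),
    ("The body", 6.1), ("Food and drink", 19.3),
    ("Clothing and grooming", 17.7), ("The house", 19.5),
    ("Agriculture and vegetation", 27.4),
    ("Basic actions and technology", 26.4), ("Motion", 14.0),
    ("Possession", 34.0), ("Spatial relations", 23.2), ("Quantity", 39.1),
    ("Time", 40.5), ("Sense perception", 21.2),
    ("Emotions and values", 17.7), ("Cognition", 27.5),
    ("Speech and language", 13.9), ("Social and political relations", 37.5),
    ("Warfare and hunting", 23.0), ("Law", 11.9),
    ("Religion and belief", 11.4), ("Modern world", 38.8),
    ("Miscellaneous function words", 20.9),
]


def rank_fields(field_totals, n=5):
    """Return the n fields with the highest and the n fields with the
    lowest loanword percentage, each listed from most extreme inward."""
    ranked = sorted(field_totals, key=lambda pair: pair[1], reverse=True)
    highest = [name for name, _ in ranked[:n]]
    lowest = [name for name, _ in reversed(ranked[-n:])]
    return highest, lowest


if __name__ == "__main__":
    highest, lowest = rank_fields(FIELD_TOTALS)
    print("Highest:", highest)
    print("Lowest:", lowest)
```

Sorting on the total column alone is a deliberate simplification: it ignores how many meanings each field contains, so small fields can move sharply with a single added or removed loanword.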

5. Integration of loanwords

The canonical shape of the White Hmong morpheme4 is a single open syllable: the only closed syllable is one with a final [-ŋ], and the [-ŋ] can only appear after [&] and [']. Each morpheme also carries a lexical tone that is needed to distinguish it from many others with the same initial onset and rime. Recent loanwords from Lao, as is to be expected, must conform to this template. The most striking adaptation is the elimination of final stop consonants. Thus [màak-phâao] ‘coconut’ becomes maj phaub [mâ-pháo] in Hmong (note Lao vowel length is also ignored in Hmong), and [thit] ‘week’ becomes thiv [th(] in Hmong. Final [-m] and [-n] in Lao loanwords are either dropped or changed to [-ŋ]: Lao [s)un] ‘zero’ thus becomes either Hmong xoom [s'*ŋ] or xum [s+] (note that in the first form the vowel shifts to one of the two vowels licensed to pattern with [-ŋ]). Lao [-aŋ] may be borrowed as [-aŋ] in violation of native patterns, thus Lao [jàaŋ-gàao] ‘glue’ is yaas kaus [jàŋ kàu]. This violation is easily accommodated, since the [-aŋ] rime exists in the closely related Green Mong dialect, with which White Hmong speakers are quite familiar. Alien initial consonants are rendered by their closest match in the native inventory. Thus since Hmong does not have voiced stops other than /d/, Lao /b/ appears as either a prenasalized stop or a voiceless unaspirated stop in Hmong, for example, Lao [bàan] ‘ball’ becomes Hmong npas [mpà] and Lao [bài] ‘card’ becomes Hmong paib [pái]. And since Hmong does not have a /w/, Lao /w/ is rendered as /v/: for example, /lám-wóŋ/ ‘Lao circle dance’ becomes las voos [là vóŋ] in Hmong.

4 The Hmong morpheme, rather than the Hmong word, is a single syllable. In working on the subdatabase, Na Yang would more often provide disyllabic compounds or phrases than monosyllabic words, explaining that the extra word was necessary to avoid confusion with homophones. Hmong is like Chinese in now having more disyllabic than monosyllabic words.

Hmong and its relatives have borrowed so many words from Chinese over the centuries that new elements have been introduced into the phonemic inventories of Hmong-Mien languages. Consonant features that appear frequently in Chinese loanwords include aspiration, frication, and palatalization. When words have these features they suggest to the linguist that the word might be borrowed, although a number of native words (or to be more precise, words for which a Chinese source has not been identified) show these features as well.
Given their healthy numbers, however, words that have these onsets are not at all perceived to be “foreignsounding” to the native speaker. The only onset that has a very low functional load and is noticeably foreign-sounding to speakers is the velar nasal, which shows up in just a few Chinese loanwords (gus [!ù] ‘goose’) and Lao loanwords (gaib [!ái] ‘easy’). Since lexical tone in Hmong developed in the first instance from the loss of laryngeal contrasts at the end of the syllable (CV, CV,, CVh, CVC), and the number of tones was thereafter doubled by the loss of a voicing contrast in initial consonants, we can assign each tone in Hmong to a historical “tone category” on the basis of the laryngeal properties of the syllable type from which tone arose. If the historical tone category (not the phonetic value of the tone itself, such as high level, mid rising, etc.) of a White Hmong word corresponds to the tone category of cognate loanwords across the family, the word is a very old loanword, borrowed before either the donor language (Chinese) or the recipient language (Hmong) developed tones. If, on the other hand, the tone categories of cognate loanwords across the family do not correspond, this is evidence that the word was borrowed into several Hmong-Mien languages independently on the basis of the closest match between the Chinese tone and one of the phonemic tones in the inventory of the recipient language. In terms of the linguistic analysis, those words with tones that correspond

25. Loanwords in White Hmong

across the family (such as tuav ‘to pound’, phua ‘to split’, and kub ‘gold’) are more deeply integrated loanwords than those that do not so correspond (such as txos ‘stove’, txwv ‘master’, and zaum ‘to sit’); the native speaker, however, will not be aware of this distinction.

A more superficial matter having to do with loanword tonology is the assignment of “loan tones” to syllables taken from atonal languages. Since borrowed syllables must be assigned some tone from the inventory, they generally are assigned a default tone. Such words are probably easily recognized as loanwords by native speakers. For example, the two English loanwords in the subdatabase, this vis [thì vì] ‘television’ and maus taus xais [màu tàu sài] ‘motorcycle’, use the default low level tone on each syllable. Loanwords from Lao, on the other hand, do not use the “loan tone” strategy, since Lao, like Hmong, is a tone language. Any one of the Hmong tones may be used to approximate the pitch and contour of the tone in the source language. Cross-linguistic support for this categorization of borrowing routines with respect to tone appears in Ratliff (2005).

Modern-day speaker attitudes toward loanwords differ according to the age of the loanword. Most native speakers are surprised to learn how many thoroughly native-feeling Hmong words have been borrowed from Chinese: words such as haus ‘to drink’, nyuj ‘cow’, and dav ‘wide’, for example.⁵ With a few exceptions these words have been completely assimilated into the native heart of the language. However, according to Na Yang, people hold mixed feelings about more recent loans from Lao: on the one hand, those who know Lao take pride in that knowledge, since it is a sign of the higher status associated with education. At the same time, however, Hmong people take pride in using native words rather than loanwords whenever possible, thereby signaling that the speaker is thoroughly Hmong.
In a similar way, Hmong people express mixed feelings about the RPA writing system since it was developed by outsiders. Many Hmong prefer the Pahawh Hmong writing system because it is a native system (Eira 2000). Others who concede the greater practicality of the RPA have made changes to the RPA to reflect native sensibilities: for example, what linguists perceive and represent as consonant clusters are felt to be single units by speakers, and may be represented with one symbol; in Thoj (2000), for example, [nt"] is written instead of RPA .
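The segmental adaptations of Lao loanwords and the tone-category test for old Chinese loans discussed in this section can be read as simple rules. The Python sketch below is only an illustration under stated assumptions: the function names, the plain-string representation of onset, nucleus and coda, and the tone-category labels are hypothetical and are not taken from the chapter or the subdatabase. Tone assignment and the vowel shift before [-ŋ] are deliberately left out.

```python
# Illustrative sketch only; all names and representations are hypothetical.

FINAL_STOPS = {"p", "t", "k"}
FINAL_NASALS = {"m", "n"}
# Lao onsets without a native Hmong match, with the substitutes named in the text.
ONSET_SUBSTITUTES = {"b": ["mp", "p"], "w": ["v"]}


def shorten(nucleus):
    """Ignore Lao vowel length by collapsing doubled vowel letters (aa -> a)."""
    out = []
    for ch in nucleus:
        if not out or out[-1] != ch:
            out.append(ch)
    return "".join(out)


def adapt_lao_syllable(onset, nucleus, coda=""):
    """Return candidate (onset, rime) pairs for a Lao syllable borrowed into Hmong."""
    onsets = ONSET_SUBSTITUTES.get(onset, [onset])
    vowel = shorten(nucleus)
    if coda in FINAL_STOPS:
        rimes = [vowel]                # final stops are eliminated
    elif coda in FINAL_NASALS:
        rimes = [vowel, vowel + "ŋ"]   # final -m/-n is dropped or changed to -ŋ
    else:
        rimes = [vowel + coda]         # open syllable, or final -ŋ kept as is
    return [(o, r) for o in onsets for r in rimes]


def classify_by_tone_category(categories):
    """categories maps a language name to the historical tone category of its form."""
    if len(set(categories.values())) == 1:
        return "categories correspond: early loan"
    return "categories differ: independent later borrowings"


if __name__ == "__main__":
    print(adapt_lao_syllable("th", "i", "t"))    # cf. Lao [thit] 'week'
    print(adapt_lao_syllable("b", "aa", "n"))    # cf. Lao [bàan] 'ball'
    print(adapt_lao_syllable("w", "o", "ŋ"))     # cf. the second syllable of 'circle dance'
    # Invented placeholder data, not attested correspondences.
    print(classify_by_tone_category({"lang_1": "A1", "lang_2": "A1"}))
```

The adaptation function returns every outcome the rules allow rather than a single form, since the text reports variation (for instance, two Hmong renderings of Lao /b/); which candidate a speaker actually uses is not predicted by this sketch.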

⁵ One of my consultants was aware that his childhood name Xab [sá] was taken from Chinese sān ‘three’, although he said it was clear his mother did not know Chinese very well because he was the fourth-born in his family.

6. Grammatical borrowing

In addition to the high number of Chinese loanwords, the grammatical cast of Hmong word and sentence structure strongly resembles that of Chinese. James A. Matisoff (1990) has named the part of the world under the linguistic domination of Chinese the “Sinosphere”, a term that has since gained wide currency; he includes in this convergence area, in addition to Hmong-Mien, the Tai-Kadai languages and Vietnamese.

In terms of morphology, White Hmong and Chinese are both isolating languages. There are no suffixes in White Hmong, and the most common prefixes are weakly classifying noun prefixes, such as pob- ‘ball’ for round things (such as the heel of the foot) and qhov- ‘hole’ for hollow things (such as the nose). Tones are used to differentiate words; they are unlike tones in African languages in that they only rarely serve a grammatical function (Ratliff 1992). Number and case are not marked on the noun or pronoun, nor are tense, mood, or aspect marked on the verb. There is no agreement marking of any type between words in the clause. New words are formed by compounding, as in modern Chinese. Reduplication, usually with an intensifying effect, is also common.

In the absence of grammatical information within the word, syntax and discourse are correspondingly more important. White Hmong is characterized by a number of syntactic and discourse structures common to languages of the area: (1) word order is SVO; (2) numeral classifiers are obligatory in phrases with overt numerals and in phrases where the noun is otherwise fixed in reference by the presence of a possessive pronoun or demonstrative; (3) parataxis of clauses, noun phrases and verb phrases is more common than hypotaxis (Jarkey 1991) – subordinating words are often either borrowed (see 6.1) or transparently “new”, such as the complementizer tias from a verb of saying used to introduce reported speech (Jaisser 1984); and (4) for yes-no questions, either a preverbal question marker (puas) or the A-not-A construction is used; wh-question words appear in situ. For more information about the grammatical characteristics of White Hmong, see Mottin (1978) and Niederer (2001–2002). Clark (1989) is a helpful overview of the structural characteristics of Hmong in the context of the languages of the area.
A thorough discussion of the effect of contact with Chinese on the grammatical structure of White Hmong goes beyond both the limits of this author’s knowledge and the goals of this project. But I would like to mention here three areas of grammatical borrowing that are related to lexical borrowing: (1) the borrowing of function words; (2) the borrowing of Chinese word order in the noun phrase as a consequence of lexical borrowing; and (3) the borrowing of the Chinese classifier construction in conjunction with the borrowing of particular classifiers from Chinese.

6.1. Borrowing of function words

A number of words in the subdatabase are loanwords that play an important role in Hmong grammar. They reflect a long period of asymmetrical influence and pervasive bilingualism. They include: (1) the verb tau ‘to get’ – which is also used as a preverbal completive marker and a postverbal modal auxiliary indicating ability – from Chinese (Mandarin dé) ‘to obtain, get’; (2) the subordinating conjunction vim(chij) ‘because’ from Chinese (Mandarin wèicǐ); (3) the higher numerals and the quantity words ntxiv ‘more’, xwb ‘only’, coob ‘many’, nrab ‘half’ (see §4 above); (4) the time words ib txwm ‘always’, sij ‘often’; (5) the spatial words dhau ‘through’, nqis ‘down’; and (6) yam ‘kind, sort’ in expressions meaning ‘same’ (= one kind, one sort), and lwm ‘other’.

In addition to these words in the subdatabase, other borrowed function words that are important to Hmong grammar include: (1) classifiers for people and some animals, short things, long things, flat things (see 6.3 below); (2) the reciprocal prefix sib from Chinese (Mandarin sī), see Sagart (1999: 70); (3) the conjunction tabsis ‘but’ from Chinese (Mandarin dànshì); (4) the adversative passive raug from Chinese (Mandarin zhuó ‘to place, put, apply’), in Hmong, ‘to be put upon’; and (5) the perfective marker lawm from Chinese (Mandarin le), see Fuller (1986).

6.2. Borrowing of A + N order

Two common adjectives have been borrowed in their Chinese prenominal position. All native adjectives and most other borrowed adjectives appear after the nouns they modify. These prenominal adjectives are qub ‘old, former’ from Chinese (Mandarin gù ‘old’) and tuam ‘great, large’ from Chinese (Mandarin dà ‘large, major, great’). A third, less common, adjective loanword that appears before the noun is swm ‘familiar’ (in expressions for acquaintances or close friends) from Chinese (Mandarin zhī ‘to know; known’). Two adjectives for ‘small’, me and nyuam (or nyuag, a somewhat derogatory term), may also appear before the noun. Chinese sources have not been identified for these two words. They may appear before the noun because as opposites of ‘large’ they have come to pattern the same way, or because they are themselves borrowed.

6.3. Borrowing of the classifier construction

Hmong possesses two systems for noun classification, both well-represented in languages of Southeast Asia: numeral classifiers and classifying prefixes. It is not unusual to find cases where the two types of classifier occur side-by-side, each classifying the noun in the same way, as in ib lub pob-zeb, ‘one CLF-bulky things clump-stone’ (= ‘one stone’) and ib tus ko-tw ‘one CLF-short lengths handle-tail’ (= ‘one tail’). There is good evidence that the system of classifying prefixes, now moribund in the modern languages, was native, while the classifier construction, now a hallmark feature of all modern Hmong-Mien languages, was borrowed from Chinese (for discussion and support, see Ratliff, forthcoming, Chapter 6). Part of the evidence for this claim is relevant to the present study: a number of the most common Hmong classifiers have themselves been borrowed from Chinese, suggesting that this particular construction has been transferred to Hmong-Mien on the back of the words that characterize it.

Old classifier loanwords include: (1) rab, the classifier for tools, from Chinese (Middle Chinese trjang ‘spread’ > Mandarin zhāng ‘CLF-flat things’), as a classifier first used for bows; (2) phob, a classifier for quilts, from Chinese (Mandarin piàn ‘one-sided’); (3) tus, a classifier for humans, animals, and stubby objects, probably from Chinese (Middle Chinese duw ‘head’ > Mandarin tóu ‘CLF-animals’). More recent classifier loanwords include (4) txoj, a classifier for long things, from Chinese (Mandarin tiáo ‘CLF-long things’); (5) yam, a classifier for kinds and sorts, from Chinese (Mandarin yàng ‘appearance, pattern’); and (6) hom, another classifier for kinds and sorts, from Chinese (Mandarin hào ‘name, mark; order, size, number’).

7. Conclusion

Hmong, as represented by the White Hmong dialect of Laos, is a language that has been heavily influenced by Chinese over a 2500-year period. Given the lack of historical records that refer unambiguously to speakers of Hmong-Mien languages, what we know about the nature of the contact situations between the Chinese and the ancestors of the Hmong over this long time period must come primarily from an analysis of the loanwords themselves. Hmong also shows traces of an important contact relationship with an old Tibeto-Burman donor, and a few basic words shared with members of the Mon-Khmer, Austronesian, and Tai-Kadai families suggest even older contact relationships, if not a deep family relationship in some configuration. White Hmong also shows a recent contact relationship with Lao.

There are two important research tasks involving loanwords in this language that need to be undertaken. First, only three very rough strata of Chinese loanwords are presented here, based on the resemblance between a particular Hmong form or reconstruction and either modern Chinese, Middle Chinese, or Old Chinese. We need experts in Chinese history and dialectology to separate out strata of Chinese loanwords in Hmong-Mien in greater detail by identifying those southern varieties of Chinese that were most likely to have been the immediate donors of loanwords to different languages of the family at different periods of time. For example, languages from the Mienic branch of the Hmong-Mien family have Chinese loanwords that more closely resemble the Mǐn varieties of Chinese spoken in southeast China than do their Hmongic counterparts; this suggests that historically, as in the present day, speakers of Mienic languages lived to the east of speakers of Hmongic languages. This work should not only help place Hmong-Mien speakers on a historical linguistic map, it should also shed more light on the nature of the interactions of the Hmong and different groups of Chinese at different periods of time and in different places.

Second, for words shared by languages from two or more of the five language families of southern China and northern Southeast Asia, more work needs to be done to try to distinguish the loanwords from the true cognates. This work will contribute to a better understanding of Asian prehistory.

References

Benedict, Paul K. 1975. Austro-Thai language and culture with a glossary of roots. New Haven: Human Relations Area Files Press.
Benedict, Paul K. 1987. Early MY/TB loan relationships. Linguistics of the Tibeto-Burman Area 10.2:12–21.
Bertrais, Yves. 1979 [1964]. Dictionnaire Hmong-Français [Hmong-French dictionary]. Bangkok: Sangwan Surasarang.
Clark, Marybeth. 1989. Hmong and areal South-east Asia. In Bradley, David (ed.), Papers in South-East Asian linguistics, Vol. 11: South-East Asian syntax (Pacific Linguistics Series A-77), 175–230.
Culas, Christian & Michaud, Jean. 2004. A contribution to the study of Hmong (Miao) migrations and history. In Tapp, Nicholas & Michaud, Jean & Culas, Christian & Lee, Gary Yia (eds.), Hmong/Miao in Asia, 61–96. Chiang Mai, Thailand: Silkworm Press (distributed by University of Washington Press).
Downer, Gordon B. 1963. Chinese, Thai, and Miao-Yao. In Shorto, H. L. (ed.), Linguistic Comparison in South East Asia and the Pacific, 133–139. London: School of Oriental and African Studies, University of London.
Downer, Gordon B. 1973. Strata of Chinese loanwords in the Mien dialect of Yao. Asia Major 18.1:1–33.
Downing, Bruce T. & Fuller, Judith W. 1985. Cultural contact and the expansion of the Hmong lexicon. Unpublished manuscript.
Eira, Christina. 2000. Discourses of standardization: Case study – the Hmong in the West. Ph.D. dissertation. Melbourne: University of Melbourne.
Forrest, R. A. D. 1973 [1948]. The Chinese language. 3rd edn. London: Faber & Faber.
Fuller, Judith W. 1986. Chinese le and Hmong lawm. Columbus, OH: Paper presented at the 19th International Conference on Sino-Tibetan Languages and Linguistics.
Haudricourt, André G. 1966. The limits and connections of Austroasiatic in the northeast. In Zide, Norman H. (ed.), Studies in comparative Austroasiatic linguistics. The Hague: Mouton.
Haudricourt, André G. & Strecker, David. 1991. Hmong-Mien (Miao-Yao) loans in Chinese. T’oung Pao 77(4–5):335–341.
Heimbach, Ernest E. 1979. White Hmong-English dictionary. Revised edn. (Linguistics Series 4, Data Paper 75). Ithaca: Cornell University Southeast Asia Program, Department of Asian Studies.
Jaisser, Annie Christine. 1984. Complementation in Hmong. M.A. thesis. San Diego, CA: San Diego State University.
Jarkey, Nerida. 1991. Serial Verbs in White Hmong: A functional approach. Ph.D. dissertation. Sydney: University of Sydney.
Lemoine, Jacques. 2005. What is the actual number of the (H)mong in the world. Hmong Studies Journal 6.
Lyman, Thomas Amis. 1974. Dictionary of Mong Njua. The Hague: Mouton.
Matisoff, James A. 1990. On megalocomparison. Language 66(1):106–120.
Mortensen, David R. 2000. Sinitic loanwords in two Hmong dialects of Southeast Asia. B.A. honors thesis. Logan, UT: Utah State University.
Mortensen, David R. 2002. A preliminary survey of Tibeto-Burman loanwords in Hmong-Mien languages. Tempe: Paper presented at the 35th Annual Conference on Sino-Tibetan Languages and Linguistics, Arizona State University.
Mottin, Jean. 1978. Elements de Grammaire Hmong Blanc [Elements of the grammar of White Hmong]. Bangkok: Don Bosco Press.
Niederer, Barbara. 2001–2002. La langue Hmong [The Hmong language]. Amerindia: Revue d’ethnolinguistique amérindienne Vols. 26–27: Langues de Guyane. Paris: Centre National de la Recherche Scientifique.
Pan, Wuyun. 2006. On the genetic relationship between the Miao-Yao languages and the Sino-Tibetan languages. Uppsala: Paper presented at the Workshop on Language and Genes in East Asia/Pacific, 12th–13th December.
Peiros, Ilia. 1998. Comparative linguistics in Southeast Asia. Canberra: Pacific Linguistics.
Pulleyblank, Edwin. 1983. The Chinese and their neighbors in prehistoric and early historic times. In Keightley, David N. (ed.), Origins of Chinese Civilization, 416–423. Berkeley: University of California Press.
Ratliff, Martha. 1992. Meaningful tone: A study of tonal morphology in compounds, form classes, and expressive phrases in White Hmong. DeKalb, IL: Northern Illinois University Center for Southeast Asian Studies.
Ratliff, Martha. 2001. Voiceless sonorant initials in Hmong-Mien: Sino-Tibetan correspondences. In Thurgood, Graham (ed.), Papers from the Ninth Annual Meeting of the Southeast Asian Linguistics Society (1999), 361–375. Tempe: Arizona State University Program for Southeast Asian Studies.
Ratliff, Martha. 2004. Vocabulary of environment and subsistence in the Hmong-Mien protolanguage. In Tapp, Nicholas & Michaud, Jean & Culas, Christian & Lee, Gary Yia (eds.), Hmong/Miao in Asia, 147–165. Chiang Mai, Thailand: Silkworm Press (distributed by University of Washington Press).
Ratliff, Martha. 2005. Timing tonogenesis: Evidence from borrowing. In Chew, Patrick (ed.), Special Session on Tibeto-Burman and Southeast Asian Linguistics (Proceedings of the Annual Meeting of the Berkeley Linguistics Society 28), 29–41. Berkeley, CA: Berkeley Linguistics Society.
Ratliff, Martha. Forthcoming. Hmong-Mien language history. Canberra: Pacific Linguistics.
Sagart, Laurent. 1995. Chinese ‘buy’ and ‘sell’ and the direction of borrowings between Chinese and Hmong-Mien: A response to Haudricourt and Strecker. T’oung Pao 81(4–5):328–342.
Sagart, Laurent. 1999. The roots of Old Chinese. Amsterdam/Philadelphia: John Benjamins.
Smalley, William A. 1988. Pahawh Hmong and other Hmong writing systems. Paper read at the Colloquium on Language Use and Language Policy in Laos, Cambodia and Vietnam: Modern developments, Indochina Studies Program, University of Hawai’i, 27th–28th June.
Sun, Hongkai. 1992. Language recognition and nationality. International Journal of the Sociology of Language 97:9–22.
Thoj, Xeeb Xaivkaub. 2000. Dictionary English-Hmong (Tsevlu Aakiv-Hmoob). San Diego, CA: Windsor Associates.
Wang, Fushi. 1986. A preliminary investigation of the genetic affiliation of the Miao-Yao languages. Santa Barbara, CA: Paper presented at the International Symposium on the Minority Nationalities of China, 27th–29th January.
Wurm, Stephen A. et al. (ed.). 1988. Language atlas of China. Hong Kong: Longman.
Xiong, Yuepheng L. 2006. English-Hmong/Hmong-English dictionary. St. Paul, MN: Hmongland Publishing Company.
Ying, Lin. 1972. Chinese loanwords in Miao. In Purnell, Herbert C., Jr. (ed.), Miao and Yao linguistic studies: Selected articles in Chinese, translated by Chang Yu-hung & Chu Kwo-ray (Linguistics Series 5, Data Paper 88). Ithaca, NY: Cornell University Southeast Asia Program, Department of Asian Studies.

Loanword Appendix

Old Chinese

kaj

light, bright; the light

nquab

dove, pigeon

kooj

grasshopper, locust

hlwb

brain matter, marrow

nplaig

tongue

npau

to boil

rau

to put, to put on (shoes), to wear (shoes)

ntxiv

to mend/repair, to make up what is lacking, more

thee

charcoal

nyuj

cow

yaj

sheep

qaib

chicken

gus

goose

nrab

half, center, middle

ntxov

early

os

duck

xyoo

year

puav

bat

qab

sweet

uab

crow

daj

yellow

w

quail

qhuav

dry

twm

water buffalo

to cry, to wail, to make a loud noise

rau

claw, hoof, nail

ntsws

lung

tshee

to shiver, to shudder

los

to bury

haus

to drink, to smoke

ci

to roast, to toast, to bake

quaj

khau

shoe

vaj

garden, enclosed yard

liaj

paddy

hlau

iron (> hoe)

nceb

mushroom

Tibeto-Burman (precise donor unknown)

tuav

to pound (rice)

hnub

sun, day

qhov-txos

fireplace, oven

phua

to split, to cut open

hli

moon, month

kuam

to scrape

laim

lightning

ntxuav

to wash

taum

bean

vauv

son-in-law

kub

gold

cawv

liquor

nyab

daughter-in-law

nyiaj

silver, money

laij

to plow

nees

horse

txaug

to carve, to chisel, chisel

liag

sickle

kub

horn

sau

to gather

hneev

footprint

cog

to plant

plaub

four

txiv tsawb

banana

tsib

five

xyoob

bamboo

rau

six

txiab

xya

seven

scissors, to cut with scissors

nqis, nqes

down, to go down

tau

to get (> completive marker, ability marker)

dhau

through, to cross, to pass

yim

eight

pua

to spread out

hole, place, thing

cuaj

nine

kaw

saw, to saw

qiv, qev

to lend, to borrow

kaum

ten

rauj

hammer

hnia

to sniff, to kiss

tooj

copper

nqi, nqe

price

ntxias

qis, qes

low

to weave, to braid

dav

wide

nqaim

narrow

qhov

Middle Chinese

lwg

dew

poob

to fall

pa

air, breath

xa

to send

hlawv

to burn (TR)

caij

to ride

thawb

to push

ciaj

tongs, pliers

pob

ball

choj

bridge

zuaj

to look for

to knead, to massage

coob

nrhiav

many (esp. of people)

txais

to borrow on long-term basis

hwj txob

pepper

khoob

empty, hollow

piam thaj

sugar

xwb

only

zas

to dye

sij hawm

time, occasion

muas

to buy

thom khwm

sock, stocking

tamsim

immediately

muag

to sell

khawm

button

maj mam

slow

luj

to weigh

xauv

silver neck ring

chiv

to begin

chaw

place

phuam

cloth, rag

chiv keeb

beginning

nyias

thin

chav

room

tsum

to cease

ncaj

straight

yaum sij

key

sij

pua

a hundred

teeb

lamp, lantern

continually, repeatedly

txhiab

a thousand

tswg

season

to count

post (in house construction)

caij

suav

saj

yau

young, small

laj kab

fence

to taste, to try flavor of

qub

of old, former

kwj

xiav

blue, purple

sov

warm

ditch, valley, gully

txab

dirty

wheat

ua si

to play

orange (the fruit)

vam

to hope

tseeb

true

kab tsib

sugar cane

phem

bad, evil

hauj lwm

work

txhaum

wrong, guilty

khi

to tie

xav

piam

broken, ruined

to think, to be of the opinion

khuam

to hang up

tswv yim

idea, intention

tshau

to pierce, to make a hole

xyaum

tsav

to drive

to learn, to practice, to imitate

phuaj

raft

cawm

to rescue

xib hwb, xib fwb

npib

coin

teacher, Protestant minister

tax

meej

clear

piav

to explain

yooj yim

easy

sim

to try

vim

because

hu

to call, to name, to cry out

hem

to threaten by making noise

txoom

wrinkled

mog

cem

to scold

kab-ntxwv

twm

to read

tswv

master

qhua

guest

pab

to help

txhav

to rob

tshuab

motor, machine

Modern Chinese

roob

mountain

hiav-txwv

sea

xeeb-ntxwv

grandchild, descendant

luj-txwv

mule

se

xiab

gill

kim

expensive

muas lwj, mos lwj

Sambar deer

pheej yig

cheap

zaum

to sit

vaub kib, vuab/huab kib

turtle, tortoise

tum

to pile up

koom

to join forces, to share

hwj-txwv

beard

faib

to divide

lim

tired

phaj

dish, plate

phab

side (flat vertical surface)

nyeem

to read

ciam, ciaj ciam

boundary

xeem

clan

thawj

chief, leader

huab tais, fuab tais

king

pej xeem

populace

phooj ywg

friend

yeeb ncuab

enemy

qws

club, stick, pestle

phom

gun

yeej

to win

swb

to lose

cuab

to trap

txim

crime, offense, punishment, guilt

thaj

altar, spirit shelf

hauj sam

Buddhist monk

tsheb

vehicle

hwj, fwj

bottle

lwm

other, next

Lao

kob

island

nab kuab

ice

kab taij

rabbit

us

camel

npias

thoom thaub

beer (from French via Lao)

hnub vas xuv

Friday

hnub vas xaum

Saturday

sock, stocking

thwj

right, correct

phiv

wrong

nais khu

teacher

xoom xaim

doubt

gaib

easy

nyuaj, nyuab

difficult

muj

friend

meb cab

prostitute

nyoo

to surrender

xus npus, xum npum

soap (from French via Lao)

hoob

room

kas ces

key

thees khaim

candle

txiv maj phaub, txiv maj phob

coconut

yaas kaus

glue

npev

fishhook

xim

color, paint

foob

to accuse, to sue

las voos

dance (Lao style)

vis thab nyub

radio

luv thij

bicycle

laub

cart, wagon

luv

vehicle

taj laj

market, shop, store

luv npav

bus

npas

ball

fai fab

electricity

xoom, xum

zero

naas mom

nurse

khub

pair

xav

to inject

as nyub

age

leb

number

xuab moos, xuab moo

hour

xab tias

postage stamp tin can

moos

clock, watch

kos poom, kas poom

as thiv

week

(khob) noom

candy, sweets

hnub vas thiv

Sunday

yaas

plastic

hnub vas cas

Monday

phee

song

Tuesday

kas fes

coffee (from French via Lao)

kheb

crocodile

thaj mom

physician

hnub vas as qhas

ntim

small porcelain bowl

hnub vas phuv

Wednesday

khob cij

bread

hnub vas phab hav

Thursday

Chapter 26

Loanwords in Ceq Wong, an Austroasiatic language of Peninsular Malaysia*

Nicole Kruspe

1. The language and its speakers

Ceq Wong is an Aslian language of Peninsular Malaysia. Aslian languages belong to the Mon-Khmer branch of Austroasiatic and are spoken in both Malaysia and southern Thailand. Aslian has three main internal divisions: Northern, Central, and Southern, with Ceq Wong affiliated with the Northern division (Diffloth 1975).¹ The Ceq Wong inhabit the mountainous tropical rainforest on the southern foothills of Gunung Benum in the Krau Game Reserve in the central region of the Peninsula along the Teris and Lompat Rivers and their tributaries (Map 1). The Ceq Wong have a unique position within Northern Aslian, being somewhat isolated from the remainder of the group in the centre of the Peninsula; all other speakers are located in the north. In all probability the Ceq Wong represent a relict Northern Aslian population, and it was from here that the languages spread north (see Burenhult 2009+; Burenhult et al. 2009+).² Neighboring languages are Jah Hut, a probable isolate in Aslian (G. Diffloth, p.c.; Dunn et al. 2009+) located to the east, and Temuan, a Malay dialect spoken by Aboriginal Malays³ to the south and west (Map 1).

The Ceq Wong were first encountered by a European when colonial British game warden Charles S. Ogilvie made contact with them in 1938 in the then-named King George V National Park, Pahang. He published several articles including word lists (Ogilvie 1940, 1948, 1949), and so, although describing himself as neither “ethnologist or philologist” (Ogilvie 1949: 11), produced the first ethnographic and linguistic documentation of the Ceq Wong.

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Kruspe, Nicole. 2009. Ceq Wong vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 871 entries.

¹ Recent research suggests a fourth branch containing Jah Hut, formerly a member of the Central group (Diffloth & Zide 1992; Dunn et al. 2009+).

² Benjamin (1976) nominated the area around Gunung Benum as the location of the secondary dispersal of Aslian throughout the Peninsula. Bulbeck (2004) includes this area in his Aslian homeland.

³ Aboriginal Malays are most likely descendants of Orang Asli who shifted into the Malay cultural-linguistic sphere, but did not adopt Islam.

Figure 1: The Aslian languages (adapted from Dunn et al. 2009+). Tree diagram with the branches Northern Aslian, Central Aslian, and Southern Aslian; the languages shown include Tea-de, Kensiw, Kintaq, Menriq, Jahai, Batek, Ceq Wong, Jah Hut, Temiar, Lanoh, Semai, Semaq Beri, Semelai, Temoq, and Mah Meri.

Che Wông (rendered here as Ceq Wong) is an exonym erroneously bestowed by Ogilvie (1940: 23, Howell 1989: 10–14). The people themselves do not identify with it, although it has been adopted to name themselves to outsiders.⁴ The Ceq Wong have no endonym, but simply refer to themselves as bi! h"! [people 1PL.INCL] ‘our people’.

The Ceq Wong are divided into two groups: the southeastern Ceq Wong and the northwestern Ceq Wong, pace Howell (1989: 15). The current study is based on research with the southeastern Ceq Wong, whose population numbered 164 in 2002. The northwestern Ceq Wong remain undocumented, save the list in Benjamin’s Comparative Aslian vocabulary (1976: 102–23), and a short account in Howell (1989: 15–17). According to the Government census, the total population was 401 in 1999, although based on my own reckoning this figure appears inflated. Ogilvie (1949: 13) attributed the low population figure of the Ceq Wong, 54 in his 1947 census, to the influenza pandemic of 1918 when two-thirds of the population reportedly perished. The ethnohistorical account attributes the low population to the slave raids carried out by Malays on behalf of the Dutch (Kruspe fieldnotes 2002). Slave raiding was carried out in Pahang until the early 20th century (Endicott 1983). Despite these conflicting accounts, it should be made clear that the small population figure of the Ceq Wong is not necessarily indicative of a sudden decline in numbers, or a recent exodus of speakers from a dying language community, but rather is typical of the microsocieties found amongst the Northern Aslian groups.

⁴ It has also been rendered as Siwang (Needham 1956), Cheq Wong (Diffloth 1975), Che’ Wong (Benjamin 1976) and Chewong (Howell 1982, Ethnologue 2005). In the earliest known report they were called the Maroi (Evans 1927: 42).

Map 1: Ceq Wong and other languages of Peninsular Malaysia

In addition to the tripartite linguistic classification, the Orang Asli (the indigenous peoples of the Malay Peninsula) are also grouped into a tripartite classification in terms of physical type and ecological adaptation (Semang: mobile foraging; Senoi: shifting agriculture; Proto-Malay: collecting for trade). Whilst Northern Aslian speakers typically belong to the Semang complex, defined by their Negrito phenotype and mobile forager adaptation, the Ceq Wong do not exhibit these traits, having a physical appearance that resembles the Senoi, and practicing a mixed adaptation based on foraging and a simple form of agriculture. Although this classification has seen the Ceq Wong characterized as Senoi, and hence agriculturalists (see Benjamin’s Comparative Orang Asli taxonomy 1976: 100; 1985: 251), this is at variance with their own perception of themselves. They see themselves primarily as foragers or “digging people” (Howell 1989: 13, 1996: 131–32), in reference to their dependence on non-cultivated tubers, and only as marginal and not very successful cultivators. This is in contrast to the neighboring Jah Hut whose settled agricultural lifestyle the Ceq Wong consider to be Malay-like (Howell 1989: 22), and whose house-building skills and neat villages they admire (Kruspe fieldnotes 2002).

We know nothing of the Ceq Wong prior to Ogilvie’s encounter. When Ogilvie initially met them in October 1938 only one man, the “leader” of the group, was conversant in Malay (Ogilvie 1940), suggesting that at that time they had minimal contact with Malay communities. According to ethnohistorical tradition, the Ceq Wong once dwelt in the lowland area of the Pahang River, but retreated with the establishment of immigrant Malay communities (Needham 1956: 53). This was confirmed by an elderly speaker (Kruspe fieldnotes 2002), and correlates with Benjamin’s observations (see §3.1). The local Malay villagers at Kampung Bolok are descendants of Bugis immigrants (Zainal Abidin Lela, p.c.).

Following Ogilvie’s contact, the Ceq Wong remained relatively isolated until the last decades of the 20th century. Anthropologist Signe Howell, a student of Needham, commenced research into Ceq Wong social, religious and cultural practices in the late 1970s. Details of Ceq Wong society can be found in a body of work spanning the last three decades, including a collection of myths and legends (1982), an ethnographic account (1984/1989), and publications on other aspects of Ceq Wong society (1985, 1996, 2002, and references therein).

Given the unique societal traditions and ecological adaptation of the Ceq Wong, I will briefly describe their society in order to place in context the later discussion of the kinds of loanwords in their language. Ceq Wong society is egalitarian. It is marked by an absence of hierarchical structure and therefore lacks socio-political organization in terms of leadership of the community. The primary social unit is the nuclear family, which functions as an autonomous entity. Residence was traditionally in a camp or swidden in the forest with perhaps one or more families. These days some people choose to live in a village at Kuala Gandah, but residence is fluid in the absence of permanent group membership. Individual behavior is determined by adherence to proscriptions and prescriptions which ensure the cooperative wellbeing of the individual and the community. Transgression results in repercussions involving the elements or supernatural beings, and never human intervention. The Ceq Wong are animists.

Kinship is consanguineal and bilateral, marriage is endogamous, and residence exhibits an uxorilocal bias in the period immediately following marriage. Both of these features are typical of the Senoi, rather than the Northern Aslian-speaking Semang. Interestingly, perhaps in acknowledgment of an earlier state, Ceq Wong insist that marriage should be exogamous otherwise it is incestuous (Howell 1989: 28, Kruspe fieldnotes 2002), but this is not practiced, nor was it documented by Ogilvie or Needham.

In this egalitarian society there are no gender specific roles or tasks, no knowledge which is held exclusively by any member or section of society, and no status associated with one’s ability to perform a task. Non-violence and noncompetitiveness are the norm. Further, there is no classification of activities into ritual and mundane (see Howell 1989, 1996: 128, and Endicott & Endicott 2007).

Ceq Wong ecological adaptation is based on low-level swidden cultivation, and apart from rice and bananas, it is mainly more recently introduced crops like manioc, sweet potatoes, maize, tobacco and chili that are grown. Rice is only planted by some people, and yields are small. In the past there were no domestic animals other than dogs. These days the Ceq Wong keep a few chickens which some sell to neighboring Malay villagers, but which are never exploited by the community as a source of meat or eggs. There are also a few domestic cats. Agriculture is heavily supplemented by extensive hunting and foraging. Small arboreal game is hunted with a blowpipe. Both men and women are adept in all aspects of the blowpipe from construction to use. Larger game were once caught with spring and snare traps, but these are no longer used. Long periods are spent engaged in foraging when people move into the forest away from either the settlement at Kuala Gandah, or their swidden in the forest. The Ceq Wong move about the forest on foot, carrying their belongings in a woven backbasket, seeking foodstuffs for personal consumption and items for trade. Although the use of motorbikes is increasing, they are not always suited to the terrain, especially in the wet season. The rivers are shallow and either forded on foot, or if deeper crossed on a felled tree trunk. Agriculture and foraging for personal consumption are complemented by foraging-for-trade: rattan, agila wood, wild fruits, resin, and small game are sold to traders for cash (Howell 2002: 258–60).
A few people also cultivate low-maintenance crops like bananas and chili for sale to traders.

Houses are constructed with materials collected from the forest: small gauge trees for wooden posts, bark walling, bamboo flooring and palm leaves for thatch. According to Howell (1989: 25) these simple houses, raised constructions with bark walling and an open doorway, are a relatively recent innovation. Originally they lived exclusively in lean-tos or windbreaks consisting of a low bamboo floor with a rough thatch of palm leaves.

Their material culture is poor. In the past people wore loincloths of beaten bark, and used the same fabric to make blankets and slings for carrying babies and small children. They still weave backbaskets from rattan, and sleeping mats and tobacco pouches from pandanus. When Ogilvie encountered the Ceq Wong, bamboo tubes were used for cooking and fetching and storing water, and food was eaten from vessels made from folded leaves. Metal pots, crockery and plastic containers have since taken their place. While resin was once used for lighting, this has been replaced by kerosene. Some households now have electricity.

Older members of the community, those over the age of thirty, are non-literate and speak a heavily accented Malay. Most have only a passive knowledge of the language. This concurs with Howell’s report that Malay was not widely spoken
when she began fieldwork in the late 1970s. While those under twenty speak a more fluent Malay, it is still very much a “contact” variety. The children who live in the village now receive a state education, although attendance is generally irregular. Absent is the multilingualism that pervades other Northern Aslian communities. Needham noted that the Ceq Wong were conversant in Jah Hut and Temuan (Needham 1956: 55). However, this was not confirmed by my own observations.

2. Sources of data

This study is based wholly on documentary data which I collected on several field trips from 2002–2003 and 2005–2006 (Kruspe 2009++a). Some additional direct elicitation for the purpose of this study was also undertaken. Limited use was made of Ogilvie’s wordlists. The first, published in 1948, and a revised and expanded one published in 1949 are phonetically unreliable; nonetheless, they were useful for comparative purposes, providing a snapshot of Ceq Wong vocabulary collected some 60–70 years ago.

Naturally, there is no previous work specifically on Ceq Wong loanwords. However, Benjamin’s (1976) lexicostatistical study does give some calculations for Malay and intra-Aslian loans. Given the relative infancy of modern research on Aslian languages, there is no etymological database of Aslian, nor phonetically reliable dictionaries of any languages (see Kruspe 2009+), and therefore it is difficult to present a discussion on intra-Aslian loans.

Data from other Aslian languages were drawn primarily from my own extensive research into Semelai and Mah Meri, and wordlists I have collected of Semaq Beri, Jah Hut and Batek Deq. In addition I consulted comparative Aslian materials (Benjamin 1976, Dunn et al. 2009+), and Diffloth’s (1976a) grammatical sketch of Jah Hut. For Northern Aslian materials I was largely reliant on discussions with Niclas Burenhult, and Burenhult (2005). There is no published material available on Temuan, apart from a phonetically unreliable dictionary of a variety spoken in the neighboring state of Selangor (Baer 1999).

Ceq Wong is an unwritten language. The orthography used here is phonemic and displays the following deviations from the IPA: j replaces /!/, y replaces /j/ and s replaces /"/. The letter a represents the Ceq Wong phoneme /æ/.

3. Contact situations

3.1. Preliminary remarks

Attempting to determine the contact situations of the Ceq Wong was another challenge to this study as there are no records of the Ceq Wong prior to their “discovery” in 1938 (see §1 above). Prior to the twentieth century, the Malay Peninsula was made up of predominantly coastal oriented kingdoms of immigrant rulers, with minor riverine populations along some of the major rivers. The interior was almost exclusively occupied by the Orang Asli. This general picture prevailed in Pahang until the time the British took power in the 1880s. However, in Pahang there were also extensive trade routes which exploited the navigable rivers and allowed trade to continue unhindered between the east and west coasts when sea passages were impassable during the monsoon. This network included rivers like the Semantan, a tributary of the Pahang River at the edge of the current Ceq Wong range, which formed a route to the neighboring state of Selangor to the west, prior to a road being built in the 1920s (Cant 1972: 114). Benjamin (1997: 104) suggests that when Aslian speakers initially spread across the Main Range to populate the east they would have absorbed any pre-existing Austronesian speakers, who may have already been there. This may also have been the scenario with other inhabitants (see §3.4 below). Later, when Malay populations began to migrate to the peninsula from Borneo around 1,500 years ago, Benjamin proposes that it would have been in a different social context and that social ranking would have given rise to ethnic boundaries between the indigenous people and the immigrants (Benjamin 1997: 105). Maritime traders arriving in the Malay Peninsula sought amongst other things the gold and forest products for which Pahang was known abroad (Cant 1972). The extraction of forest products was the domain of the Orang Asli and they traded them for items like metal tools, salt and cloth. 
Malays are believed to have dominated this trade from early on, and the lingua franca of the region was Malay.

From the 8th century Pahang was under the suzerainty of Srivijaya, a Malay thalassocracy whose capital was located in Sumatra. Pahang reputedly fell to the King of Siam around the 12th century, although Benjamin (1997: 106) suggests there may have been a Mon or Khmer presence there. Malay influence in Pahang was only re-established in the 15th century when it was annexed by the Malacca Sultanate. In the mid-17th century Pahang became part of the Johor-Riau Empire, which itself came under the influence of the Bugis in the 18th century. The British established residence in 1888, remaining until Malaysian independence in 1957.

Foraging for trade is the main subsistence mode currently exploited by the Ceq Wong. It is unclear to what extent trade led to direct involvement with the Malays in the past. Nor is it clear to what extent they would have been involved in this in
historical and prehistorical times. Three trade scenarios have been documented in the Peninsula:

i. Direct trade of the type which takes place today whereby middlemen approach with an “order” for a particular type of forest produce which is then purchased from the Ceq Wong, availability permitting; or whereby the Ceq Wong deliver the materials to middlemen. Ogilvie mentions that “at long intervals” cane was taken by the men down to Lanchang, the nearest service town, where it was traded in return for “salt, saltfish, white rice, bush knives, and fish hooks” (Ogilvie 1949: 13). This is the only evidence we have of Ceq Wong trade in the past, and it remained that way until a road to the settlement was built in the early 1990s. Howell (2002: 266) noted that the traders were all Chinese, but currently Malay traders predominate.
ii. Indirect trade where exchange took place with an intermediary group, e.g. Temuan, or another Aslian group, as described in Noone (1954: 13–15) and Benjamin (1987: 144).
iii. Indirect or “silent” trade, where goods were left at a pick-up point, the traders would remove the goods, and then leave items which the Orang Asli would retrieve once the traders had gone (Begbie 1834).

Trade remains the major channel of interaction between the Ceq Wong and their Malay neighbors, the only difference being that a cash economy has replaced a system of barter. In the last few years the Ceq Wong have increasingly found short-term employment as guides and consultants for researchers (for details of this trade, see Howell 2002). Other than this, interaction is limited to contact with kindergarten teachers, and occasionally visits from the government department which oversees aboriginal affairs (Jabatan Hal Ehwal Orang Asli). It is noteworthy that their culture and practice exhibit little influence from Malay. At present the language is still robust. Nonetheless, increased contact with mainstream society augurs dim prospects for such a small society.

3.2. Intra-Aslian contacts

Given the poor documentation of Aslian languages, work on intra-Aslian contact remains in its infancy, and it is therefore difficult to distinguish intra-Aslian loans from commonly inherited forms with any certainty. Neighboring indigenous groups of the Ceq Wong, both Aslian and Austronesian, lived by similar means, shared a similar material culture, and existed within social systems similar to the Ceq Wong’s, so that it is unlikely that any significant relations of intergroup dominance or superiority were involved in language contact. Some Aslian groups may have had more contact with Malays, and passed
their borrowings on to more interior groups. On the whole the greatest influence in terms of new concepts and technologies has come ultimately from Malay. As mentioned earlier, the Ceq Wong have limited contact with their neighbors, and are unaware of Aslian groups other than their immediate neighbors, the Jah Hut and Temuan. Some older speakers know of the Batek Nong of Ulu Ceka to the north of Gunung Benum, but no one has ever met them, nor knows anything of their language.

3.2.1. Ceq Wong – Jah Hut

The Jah Hut live to the east along the Krau and the Lompat Rivers downstream from the Ceq Wong (Map 1). They traditionally combine swidden cultivation with forager-trade. The genealogical affiliation of Jah Hut within Aslian is unclear and it may be either an isolate, or affiliated with Southern Aslian (Diffloth, p.c.). A recent study places Jah Hut as a separate branch between Central and Northern Aslian (Dunn et al. 2009+). In terms of the sociolinguistic context, present-day interaction with the Jah Hut is negligible. Although the Ceq Wong have a generally positive regard for the Jah Hut, there are no formal relations with their neighbors to the east, and present day Ceq Wong marriage is almost always endogamous. Ogilvie did not document any intermarriage with the neighboring Jah Hut other than one case. My consultant Tal#y was married to Tapah (dec.), a Jah Hut from Ulu Krau (Ogilvie 1947: 28; Tal#y Jareng p.c.). Needham (1956: 62, 65) records the mother of Jareng and T$%k as also being Jah Hut, but Tal#y, the daughter of Jareng did not recall this, and neither did T$%k’s sons. These unions appear to have been random events as they are the only known cases, against a persistent pattern of cousin marriage (Howell 2002: 261). Recently several young men have married Jah Hut women they met while working at a plywood factory near Jerantut, Pahang. All couples used Malay as a lingua franca, until the women had learnt enough Ceq Wong to be conversant with other community members, although all of the women spoke Jah Hut to their children. Most of these marriages have since broken down with the women and their children returning home. There is little evidence of outright borrowing between Ceq Wong and Jah Hut, probably due to the limitations of the 600 item vocabulary of Jah Hut. One of the few instances is ko# ‘woman, female’. Diffloth (1975: 3) notes that Ceq Wong has been greatly influenced by Jah Hut, although he did not elaborate. 
Evidence that these two groups may have had a close relationship is seen in unique mappings of form and meaning that appear to be exclusive to Ceq Wong and Jah Hut, e.g. talon is the generic term for ‘snake’ in these two languages. In all other Aslian languages the reflex is usually a python, as in Jahai talon ‘Reticulated python’, or Semelai tl$n ‘avoidance name for the Reticulated python’. Shorto reconstructs *t1lan ‘python’ for Proto Mon-Khmer (2006: 332).

Further evidence is found in shared lexicon for introduced foods which occur in these two languages, but not in other adjacent or related languages, e.g. pigo! ‘chili’ for which no etymology has been identified and taləs ‘taro’, ultimately a non-Malay western Austronesian loan.⁵ The Ceq Wong learnt to plant rice from their Jah Hut neighbors, and when they wanted to plant rice would procure seed from them (Howell 2002: 268), so this would be a possible conduit for other agricultural introductions.

Finally, there are a number of words in Ceq Wong which are present in both Jah Hut and Southern Aslian, e.g. Ceq Wong mn%m ‘mountain’ (see Jah Hut bn&m or Semelai bn'm), or Ceq Wong l#"(! ‘neck’ (see Semelai and Semaq Beri l#"(!). In the database these items are listed as “borrowed from Jah Hut or Southern Aslian”. Jah Hut may have acted as the intermediary for their spread into Ceq Wong. For an alternative scenario see §3.2.2 below.

3.2.2. Ceq Wong – Southern Aslian

Ceq Wong has words that appear to be of Southern Aslian origin which are not found in Jah Hut. The forms of the Ceq Wong numerals three and four suggest that a Southern Aslian language was the donor for Ceq Wong p"t ‘three’ (see Semaq Beri hmp"t ‘three’, Diffloth 1976b) and p)n ‘four’ (see Semelai hmpon ‘four’). It is unclear as to whether Ceq Wong had relations with Southern Aslian independent of Jah Hut, or Jah Hut was the donor, but has since replaced the indigenous numerals with Malay loans (see Jah Hut tiga! ‘three’ < Malay tiga and !mpat ‘four’ < Malay empat).

The presence of loans from Southern Aslian would perhaps confirm the earlier mentioned ethnohistorical account (§1) that the Ceq Wong once dwelt in the lowland area of the Pahang River. The area around the Pahang River was, and in some areas still is, occupied by speakers of Southern Aslian languages. This may account for cognates with Southern Aslian, and also for Northern Aslian loans in Semaq Beri, which have been attributed to Ceq Wong, although the two languages are currently spoken in noncontiguous areas (Endicott 1975: 7, Kruspe fieldnotes 2001, 2008).

Benjamin wrote that “Ceq Wong has high loan rates with all Southern Aslian languages except Mah Meri, which suggests its ancestors came into contact with Proto-Southern Aslian speakers (Semaq Beri – Semelai – Temoq) before the latter split apart, but after Mah Meri had split away” (Benjamin 1976: 78). Note that Benjamin’s study did not reveal any borrowing between Ceq Wong and Central Aslian.

⁵ Adelaar reconstructs schwa for Proto Malayic, in which case this would be evidence of early contact between the Orang Asli and Malays. However, the dates would not work for these items, which are obviously more recently introduced. Instead it suggests a non-Malay Austronesian contact situation.

3.3. Malay contacts

As has been mentioned above, due to the paucity of information, it is difficult at present to ascertain what contact situations obtained directly between Malays and the Ceq Wong. What is clear is that the rich borrowings from Malay reflect the long and complex history of that language in the Peninsula, evident in the numerous archaic forms of Malay words and also highly integrated forms.⁶ The combination of a high number of loanwords with little grammatical or cultural influence from Malay perhaps supports the claim that Malay loans have come through intermediary languages. For example, terms for introduced food items like cassava, sweet potato, and yam are also found in Jah Hut, so they may have entered Ceq Wong via an Aslian language, rather than directly from a variety of Malay.

It is difficult to assign sources or differentiate historical strata for Malay loans, other than to the most recent loans expressing concepts relating to the modern world, e.g. kitəʔ ‘car’, dyoʔ ‘radio’. These can be attributed directly to Malay, usually to the local dialect spoken in the nearby Malay villages, but also increasingly to the national language through other means like the media and education.

There are some potential indicators in the integration of loanwords that confirm complex patterns of contact:

i. loans from Malay that preserve a final voiceless velar stop where Malay now has a glottal stop, e.g. tarek ‘to pull’, as opposed to loans that reflect the final glottal stop, e.g. ʔadiʔ ‘younger sibling’. This clearly demonstrates that these items have entered Ceq Wong in different historical periods.
ii. loans that have a final glottal stop where Malay has an open final syllable, e.g. bsiʔ ‘iron’ (< Malay besi), kitəʔ ‘car’ (< Malay kereta) and dyoʔ ‘radio’ (< Malay radio) compared to loans that lack this, e.g. roti ‘bread’ and gulə ‘sugar’. The accretion of a glottal stop meets wellformedness conditions.
It is common amongst Aslian languages (see Kruspe 2004: 56; Burenhult 2005: 41) and is treated as an Aslian innovation.

Temuan is a Malay dialect spoken by Aboriginal Malays who live adjacent to the Ceq Wong. There are no descriptions available of Temuan,⁷ other than the abovementioned lexicon, so based on available data it is not possible to identify loans as Temuan. Impressionistically Temuan is lexically very similar to Malay, the most striking differences being the fixed final syllable word stress, and the presence of preploded nasals in word final position (Kruspe fieldnotes 2002), both features typical of Aslian, but not of Malay.

There is limited bilingualism between Ceq Wong and Temuan. This appears to result exclusively from mixed marriages where the male was Ceq Wong and the female Temuan. In most cases the men and their offspring have become part of the Temuan socio-cultural complex, with the exception of two cases. These unions between Ceq Wong and Temuan appeared to be the result of a severe shortage of potential female spouses amongst the Ceq Wong, who in general have a low opinion of the Temuan language and society, and avoid contact with them (Kruspe fieldnotes 2002, Howell 2002: 267–68). Again, Ogilvie did not note any intermarriage with the Temuan. According to him (Ogilvie 1940: 23) the Temuan, his Bih Nyeg, lived away in the Klau Valley. It would appear that the two groups have not always lived in close proximity as they do today.

In conclusion, where words are listed as Malay or non-standard Malay loans, it would be prudent to keep in mind that they may well be loans from Temuan rather than a Malay dialect spoken by ethnic Malays.

⁶ Note that Ceq Wong exhibits a prevalence of Malay vocabulary in ritualistic language, not found in everyday language, e.g. in shamanic songs (see Howell 1989: 97–102). This is also common in other Aslian languages like Semelai and Semaq Beri.

⁷ Temuan, spoken over a wide area of western Pahang, Selangor, Negeri Sembilan, Melaka and Johor, exhibits internal diversity. The variety spoken from Rawang north of Kuala Lumpur southward through to Negeri Sembilan and east toward the southeastern Pahang border exhibits some lexical similarity with Southern Aslian (Kruspe fieldnotes 2000, 2002). There was no evidence of this in the variety spoken by the Temuan neighboring the Ceq Wong in western Pahang.

3.4. Western Austronesian donor languages

The label “Western Austronesian unspecified” denotes a language (or languages) which is not identifiable with any one particular western Austronesian language other than Malay. Some of these forms could possibly predate Malay and present evidence of the long-speculated pre-Malay Austronesian languages in the Peninsula (Blagden 1903, Benjamin 1987: 130–31), e.g. klantãr ‘lightning’, which resembles the Malay word halilintar ‘thunderbolt’, but the initial /k-/ suggests a different Austronesian source (Uri Tadmor, p.c.). This category also includes items which are not found in modern varieties of Malay, e.g. taləs ‘taro’, which on the basis of the schwa could be from Javanese (see footnote 5). The latter would possibly represent a more recent loan than the ‘lightning’ example.

3.5. Unidentified donor language

Two Ceq Wong words have neither Austroasiatic nor Austronesian etymologies. They are identified by Adelaar (1995: 89–90) as being found in both Aslian and some Land Dayak and Sumatran languages, but otherwise have no reconstruction in Austronesian. These are the Ceq Wong terms k$b&s ‘to die, be dead’, and mam*h ‘to bathe’. It is interesting to note here that while the term for ‘bathe’ is restricted to Jah Hut and Central Aslian, the term for ‘die’ is attested in all three branches of Aslian, perhaps evidence of a substratum predating both the Austroasiatic and Austronesian presence.

3.6. English

During the Malayan Communist Emergency following the Second World War, some of the Ceq Wong were temporarily interned by the British from 1953 until around 1956 (Howell 1989: 17–18). The camp at Bukit Rumput, Bolok, where some of them resided, was administered by British troops and officials. It is unclear how much contact took place there, and whether any of the loanwords which are ultimately English were the direct result of this contact, or whether they have entered the language via Malay. Few members of the community are old enough to have recollections of the 1950s. A possible contender as a direct borrowing from English is makaw si#gret [tobacco cigarette] ‘cigarette’; the usual Malay term is rokok.

4. Number of loanwords

Of the 1460 Loanword Typology (LWT) meanings, not more than 861 had equivalents in Ceq Wong. The low rate of corresponding forms in the database is due to the large number of LWT meanings which are irrelevant for the Ceq Wong. This is attributed to two main factors. Firstly, the nature of Ceq Wong subsistence and society as outlined in §1 above is not accommodated by the Eurocentric bias of the LWT list, and even though items from other cultures have been added, this is not sufficient to correct the imbalance. Secondly, many of the terms in the list simply have no counterpart in Ceq Wong. This is particularly prevalent in areas such as generic terms for animals, where only lower order terms exist. For example, there is no term for ‘insect’, nor is there a generic term equivalent to ‘monkey’. Amongst foodstuffs, Ceq Wong also displays an absence of generic or basic starter labels, other than plo! ‘fruit’, be! ‘rice plant’ and tes ‘mushroom’; for instance, there is no generic term for the many types of wild tubers. Similarly, there are gaps in body part terminology where Ceq Wong for instance has no single equivalent term for ‘leg’, only for the components, e.g. foot, lower leg, thigh. These are features common to many Aslian languages (see Burenhult 2006 for Jahai).

The number of loanwords and potential loanwords calculated in this study is 36.9% (318 items) with the highest number (34.5%, or 297 items) originating from Malay. The remaining 2.6% (21 items) were from Jah Hut (0.4%), Jah Hut or Southern Aslian (0.5%), Southern Aslian (0.4%), western Austronesian (1.1%), and 0.2% had an unidentified donor language. While the status of Malay as the ultimate source of loanwords (but not necessarily as the immediate donor language) can be firmly established, the status of intra-Aslian loans is tentative, and they are graded accordingly in the database.
Ultimately loanwords are attributed to their original source, Malay, even when the donor language may well have been either Temuan or another Aslian language into which they had been borrowed. For instance, Jah Hut exhibits a higher number of Malay loans (18%) than Ceq Wong (12%) (Benjamin 1976: 73; Kruspe fieldnotes
2002), and it may be that the Malay loans in Ceq Wong were borrowed from Jah Hut (see §3.3). Loanwords which have themselves been borrowed into Malay are attributed to Malay as contact between the ultimate source language and the Ceq Wong is unlikely, except in the possible case of English, as mentioned above in §3.6. It is difficult to say anything substantial about the possibility of intra-Aslian borrowing given the absence of necessary data, and there is undoubtedly much more than the rudimentary figures here indicate. Benjamin (1976) suggests a high cognacy rate between Ceq Wong and the geographically distant Northern Aslian language Kensiw (Map 1), suggesting post-split contact, a claim repeated in Bauer (1991). However, no data is given in relation to this claim. Not only are we without the necessary linguistic data here, but there is also negligible historical information, and no comprehensive ethnographic accounts of neighboring Orang Asli groups that might provide insight into their past relationships. Finally, it is worth commenting on the high number of loans suggested by this study. My own loanword calculations based on Ogilvie’s (1949) 1,000 item wordlist resulted in a figure of 9.4% for loanwords from Malay. The list includes items for traditional technologies, cultural items and plants and animals relevant to the area. Only three items in the list have been replaced by Malay loans in the intervening years. The high loanword figure does not conform to the impression gained when one works on the language. My conclusion is that it arises from the manner in which the meaning list has been compiled, the type of “alien” information sought in the study, and the fact that the database is not representative of the Ceq Wong lexicon, as alluded to above. This should be kept in consideration in the following discussion.
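
The headline shares quoted in §4 are simple proportions over the 861 LWT meanings that have Ceq Wong equivalents. The following minimal Python sketch recomputes them from the item counts given in the text. The constant and function names are illustrative only and are not part of the LWT database, and the calculation assumes plain unweighted counts, which may not be how the database figures were derived.

```python
# Item counts as quoted in section 4 of the text.
MEANINGS_WITH_EQUIVALENTS = 861  # LWT meanings with a Ceq Wong counterpart
LOANWORDS_TOTAL = 318            # loanwords and potential loanwords
LOANWORDS_MALAY = 297            # items ultimately from Malay
LOANWORDS_OTHER = 21             # items from all other donor categories


def share(count: int, total: int = MEANINGS_WITH_EQUIVALENTS) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    rows = [
        ("all loanwords", LOANWORDS_TOTAL),
        ("Malay", LOANWORDS_MALAY),
        ("other donors", LOANWORDS_OTHER),
    ]
    for label, count in rows:
        print(f"{label}: {count} items = {share(count)}%")
```

On this unweighted basis the non-Malay remainder comes to about 2.4%, slightly under the 2.6% obtained by summing the individual donor shares listed in the text, a gap that would be consistent with rounding of the per-donor figures.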

5. Kinds of loanwords

5.1. Loanwords and semantic word class

Ceq Wong has distinct word classes, with the major classes being Verb and Noun (Kruspe 2009++a). There are no “Adjectives” as such, these being a subclass of the verbal class. Similarly, adverbs are in fact largely members of the Verb class, with the exception of a handful of true adverbs. Therefore generalizations about the borrowing patterns of verbs apply also to adjectives and in part adverbs. The resultant figures are not surprising, with Nouns at 41.6% the most likely to be borrowed. Adverbs at 20% recorded the lowest rate of loanwords. Verbs and Adjectives recorded almost identical rates of borrowing. The categories Noun and Verb were drawn from the greatest number of donor languages, whereas adverbs were drawn only from Malay.

Table 1: Loanwords in Ceq Wong by semantic word class (percentages)

                Malay  West          Jah Hut  Southern  Unidentified  Total      Non-
                       Austronesian           Aslian    source        loanwords  loanwords
Nouns           39.0   1.2           1.1      0.4       -             41.6       58.4
Verbs           31.2   0.8           -        0.4       0.6           32.9       67.1
Adjectives      28.8   2.1           1.1      -         0.5           32.5       67.5
Adverbs         20.0   -             -        -         -             20.0       80.0
Function words  29.3   -             0.6      3.0       -             32.9       67.1
all words       34.5   1.0           0.7      0.6       0.2           37.0       63.0

The high number of loans for function words comprises mostly prepositions and connectives. The prepositions are mainly the result of insertion and coexist with Ceq Wong prepositions to provide greater semantic specificity in some cases, e.g. k)! ‘LOC’ is a general locative meaning ‘in, on, at’, whereas Malay dalam has been borrowed as lam and specifically expresses containment. Other borrowings simply exist as partial equivalents, e.g. #"(n ‘with’ from Malay dengan exists alongside Ceq Wong bi! ‘with’. Aslian languages generally lack connectives, and the Malay connectives are strategically borrowed to fill a void; see Kruspe (2004: 380) for a similar situation in Semelai. Examples of this type clearly indicate that borrowing is being used to enrich the indigenous lexical system.

5.2. Loanwords and semantic field

In terms of semantic fields, one category, Law, scores 100%, but the actual figure is one loanword. The next highest categories are Religion and belief (75%), Modern world (73.3%), Warfare and hunting (70.3%), The house (70%), Social and political relations (64.9%) and Clothing and grooming (62.8%). Fields that score between 50–65% are Possession, Quantity and number, and Cognition. The high number of loans in these categories is largely predictable in light of the discussion of Ceq Wong culture in §1 above: absence of political organization, religion, poor material culture and so forth. Most of the items included in these categories are absent from Ceq Wong society. Even so, two categories that unexpectedly score relatively high loan rates of above 30% are The physical world and Animals. The semantic fields which are most resistant to loans, scoring 19% and 19.2% respectively, are Miscellaneous function words and Sense perception. Other low scoring categories (25% and below) are Spatial relations, The body, and Kinship.

In all categories Malay dominates as the major (ultimate) source language, reflecting the fact that cultural innovations have largely flowed from the spread of Malay influence in the Peninsula. The explanation for the patterns of borrowings is
evident when one considers Ceq Wong ecological adaptation and societal traditions outlined earlier in §1. For example, Ogilvie reported how the Ceq Wong wore loincloths made from barkcloth and wore adornments made from natural materials derived from the forest, and cooked either by roasting in the fire or in bamboo tubes. The intrusion of the modern world explains the large-scale borrowing of items in the domains of The house (70%), Clothing and grooming (62.8%), and Food and drink (37.5%).

Table 2: Loanwords in Ceq Wong by semantic field (percentages)

No.  Semantic field                  Malay  West Austronesian  Jah Hut  Southern Aslian  Unidentified  Total loanwords  Non-loanwords
  1  The physical world               25.0                6.8      1.1              1.1             -             34.1           65.9
  2  Kinship                          16.2                  -      4.1                -             -             20.3           79.7
  3  Animals                          29.9                1.9      1.9              0.9             -             34.6           65.4
  4  The body                         18.2                0.8      1.1              0.4           1.5             21.9           78.1
  5  Food and drink                   37.5                2.0      2.0                -             -             41.4           58.6
  6  Clothing and grooming            62.8                  -        -                -             -             62.8           37.2
  7  The house                        70.0                  -        -                -             -             70.0           30.0
  8  Agriculture and vegetation       30.8                3.1        -              3.1             -             36.9           63.1
  9  Basic actions and technology     38.4                2.0        -                -             -             40.4           59.6
 10  Motion                           38.3                  -        -                -             -             38.3           61.7
 11  Possession                       64.3                  -        -                -             -             64.3           35.7
 12  Spatial relations                20.1                1.7        -                -             -             21.7           78.3
 13  Quantity                         50.0                  -      1.5              7.4             -             58.8           41.2
 14  Time                             26.0                  -        -                -             -             36.0           74.0
 15  Sense perception                 19.2                  -        -                -             -             19.2           80.8
 16  Emotions and values              32.0                  -        -                -             -             32.0           68.0
 17  Cognition                        52.5                  -        -                -             -             52.5           47.5
 18  Speech and language              34.5                  -        -                -             -             34.5           65.5
 19  Social and political relations   64.9                  -        -                -             -             64.9           35.1
 20  Warfare and hunting              70.3                  -        -                -             -             70.3           29.7
 21  Law                             100.0                  -        -                -             -            100.0              -
 22  Religion and belief              75.0                  -        -                -             -             75.0           25.0
 23  Modern world                     73.3                  -        -                -             -             73.3           26.7
 24  Miscellaneous function words     19.0                  -        -                -             -             19.0           81.0
     all words                        34.5                1.0      0.7              0.6           0.2             37.0           63.0
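As a rough cross-check on Table 2, the following Python sketch re-enters the Total loanwords and Non-loanwords columns as printed and tests whether each row sums to 100. The dictionary name, the function names and the 0.15 tolerance are illustrative assumptions, and the blank Non-loanwords cell for Law is read as 0. As printed, only the Time row (36.0 + 74.0) fails the check; the Malay figure for that row (26.0) suggests that the total may be a misprint, but that is an inference rather than something the table states.

```python
# (total loanwords %, non-loanwords %) per semantic field, as printed in Table 2.
# Assumption: the blank non-loanword cell for Law is read as 0.0.
TABLE_2 = {
    "The physical world": (34.1, 65.9),
    "Kinship": (20.3, 79.7),
    "Animals": (34.6, 65.4),
    "The body": (21.9, 78.1),
    "Food and drink": (41.4, 58.6),
    "Clothing and grooming": (62.8, 37.2),
    "The house": (70.0, 30.0),
    "Agriculture and vegetation": (36.9, 63.1),
    "Basic actions and technology": (40.4, 59.6),
    "Motion": (38.3, 61.7),
    "Possession": (64.3, 35.7),
    "Spatial relations": (21.7, 78.3),
    "Quantity": (58.8, 41.2),
    "Time": (36.0, 74.0),
    "Sense perception": (19.2, 80.8),
    "Emotions and values": (32.0, 68.0),
    "Cognition": (52.5, 47.5),
    "Speech and language": (34.5, 65.5),
    "Social and political relations": (64.9, 35.1),
    "Warfare and hunting": (70.3, 29.7),
    "Law": (100.0, 0.0),
    "Religion and belief": (75.0, 25.0),
    "Modern world": (73.3, 26.7),
    "Miscellaneous function words": (19.0, 81.0),
    "all words": (37.0, 63.0),
}


def inconsistent_rows(table, tolerance=0.15):
    """Return the fields whose two percentages do not sum to 100 within `tolerance`."""
    return [
        field
        for field, (loans, non_loans) in table.items()
        if abs(loans + non_loans - 100.0) > tolerance
    ]


def fields_at_or_above(table, threshold):
    """Return the fields (excluding the summary row) with a loan rate >= threshold, highest first."""
    ranked = sorted(table.items(), key=lambda item: item[1][0], reverse=True)
    return [
        field
        for field, (loans, _) in ranked
        if loans >= threshold and field != "all words"
    ]


if __name__ == "__main__":
    print(inconsistent_rows(TABLE_2))
    print(fields_at_or_above(TABLE_2, 60.0))
```

The second helper simply ranks fields by loan rate, which makes it easy to compare the prose summary of the highest-scoring fields with the figures in the table.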

Other than where loanwords have arisen due to the introduction of new items or technologies, or generic terms have been borrowed where there was previously none, at this stage it is difficult to postulate well motivated reasons for the borrowings. Amongst a number of more inexplicable borrowings are those of familiar concepts in the categories of The physical world, e.g. binta# ‘star’, and verbs like those of category Motion, e.g. kisar ‘turn around’. While it could be argued that the Ceq Wong would most likely have known basic Malay vocabulary, and so this is what they borrowed, it does not explain why some basic items were borrowed and not others.

6. Integration of loanwords

6.1. General observations

Many loanwords are well integrated into the language, so much so that speakers do not identify them as loans. This is particularly true of words which diverge from Bahasa Melayu (the national language of Malaysia) or the local dialect, that is, those which are archaic, or where a semantic shift or a phonological change has obscured the relationship. Ceq Wong speakers, and most native Malay speakers for that matter, would not recognize these words as Malay, and the Ceq Wong are of the opinion that these words are indigenous. This may also be due to the fact that these older loans refer to items which are salient in their culture, or to activities which are not related to non-indigenous concepts or practices, e.g. batuk ‘to cough’ from Malay batuk, and sila! ‘sweet potato’ from the archaic, although not very old, ubi setela, now commonly known as keledek in modern Malay. There is no native Ceq Wong equivalent for ‘cough’, suggesting that the original term has either been replaced or undergone semantic shift, or that there was no generic verb previously. In other cases a Malay loan coexists with a Ceq Wong term. In most cases of coexistence the loans are verbs.

A second set of loanwords is readily identified by speakers as being the “same” as Malay words. In general these are words which have clearly been introduced and in most cases relate to the Modern world, and include temporal concepts like age and time, domesticated animals and introduced crops like corn, cassava and rubber, and innovations of the 20th century: cars, chainsaws, aeroplanes, identity cards. For instance, when telling the time or in stating someone’s age in calendar years, the Ceq Wong will preface this with klu# gob (say Malay) ‘As the Malays say…’. Many of the Modern world loans are considered incipient: they are not phonologically integrated, and in some instances they violate the phonotactic constraints of Ceq Wong. For example, gul$ ‘sugar’ ends in an open final syllable, in violation of the phonotactic preference for a closed final syllable.

The Ceq Wong do not exhibit any reluctance to borrow words, nor do they display any prejudice toward blending languages, other than in the construction of complex numerals where indigenous and borrowed terms cannot be mixed. Words from all classes become well integrated into the language. Both verbs and nouns undergo morphological derivations typical of the indigenous lexical items (see §6.3 below). Hybrid compounds of an indigenous and borrowed term also occur freely. For example, the following terms were coined for introduced items: kapal koh ‘a type of small aeroplane (resembling the shape of a tortoise)’ (lit. ‘aeroplane tortoise’) and kueh som ruh ‘a wafer-type biscuit (with a distinctive diagonal pattern that resembles that of the nest of a type of wasp)’ (lit. ‘biscuit nest hornet’; kueh ‘biscuit’ is from Malay kuih ‘cake’). There are also examples of compounds of words with two different foreign origins, e.g. makaw si#gret ‘cigarette’, where the first word is from Malay tembakau ‘tobacco’ and the second from English cigarette. Calques on Malay terms which blend the two languages include the days of the week, consisting of the indigenous term kt+! ‘day’ and the loaned Malay name of the day of the week, e.g. kt+! ham"(s ‘Thursday’ based on Malay hari khamis, and t)m batu! [water stone] ‘ice’ based on Malay air batu [water stone].

Where there is only a loanword to express a particular meaning, speakers have no negative opinions. However, where there are alternatives, they express a preference for the Ceq Wong term. When younger speakers choose a Malay alternative over the indigenous terms, elders are quick to opine that the youth don’t know how to speak their own language.

6.2. Phonological adaptation

Aslian languages typically contain a larger phonemic inventory than Malay, of which a subset includes all standard Malay phonemes apart from the Malay oral alveolopalatal affricates c /t"/ and j /d&/, which are realized as the indigenous palatal stops /c/ and /!/. The Malay alveolar s /s/ is realized as the Ceq Wong alveolo-palatal fricative /"/. Similarly, most Aslian languages contain comparable sets of basic phonemes.

Phonological adaptation is mainly evident in terms of the manipulation of syllabic structure to satisfy the phonotactics of the target language. Ceq Wong has a preference for mono- or disyllabic words which end in a closed syllable. Some of the adaptations observed are:

i. Reduction of trisyllabic Malay forms to disyllabic, e.g. elision of the onset of the penultimate syllable where it is a /l/ or /r/: selimut ‘blanket’ → sim&(t; berenang ‘to swim’ → bn"(# and seratus ‘one hundred’ → satus. In all other cases, the initial syllable is elided, e.g. binatang ‘animal’ → nata# and sekolah ‘school’ → kolah.
ii. Ceq Wong words typically have a glottal stop in final position where the Malay equivalent would end in a vowel: paya ‘swamp’ → paya!. However, there are also a number of words in Ceq Wong where there is an open syllable, e.g. cit$r$ ‘to tell a story’ (← Malay citera/ceritera ‘tale, narrative’).
iii. In disyllabic forms there is a preference for an open penultimate syllable. The reduction of medial clusters of a nasal and homorganic voiced stop to a simple nasal is a common phenomenon in Malay dialects, although it was not observed in the local one. It would appear to represent the dialect of Malay from which the form was borrowed, rather than an innovation, as this cluster is admissible in Ceq Wong, e.g. pa#a# [pa'a '] ‘to roast meat’ (← Malay panggang ‘broil, roast before a fire’), and kmar [k#mar] (← Malay kembar ‘twin’).
iv. The reduction of nonhomorganic medial clusters, inadmissible in Ceq Wong, is also attested. The coda of the penultimate syllable is elided, e.g. satu! ‘Saturday’ (← Malay sabtu).

The sporadic reduction of disyllabic words to monosyllabic words is also observed, e.g. masak ‘ripe’ → sak; susun ‘arrange in layers’ → sun ‘to put away’; angkit ‘pick up’ → kit ‘to take’, and embun ‘dew’ → mon. While these could appear to be indigenous innovations, they are more likely to be very old loans of forms that historically were monosyllabic, e.g. Malay masak is in fact a prefixed form ma-sak derived from the root sak, which is found in the Bornean language Land Dayak (Tadmor p.c.).

Numerous phonological variations from standard Malay are evident. However, it is difficult to find systematic correspondences, particularly with respect to Malay loans, or to be able to identify these as Ceq Wong innovations. The following are listed as observations.

i. The raising of the vowel height of /a/ → /$/ following a nasal vowel is an indication of the local Malay dialect rather than a Ceq Wong integration strategy or innovation, e.g. mh"(l ‘expensive’ compared to standard Malay mahal. An exception is lumã# ‘manioc bread’.

The following features may reflect older borrowings.

ii. Some Ceq Wong words preserve final /k/, a feature of an earlier form of Malay, which in modern Malay is realized as a glottal stop /(/, e.g. tarek ‘to pull’ (← Malay tarek [tare(]), sak ‘ripe’ (← Malay masak [masa(]). Alternatively, it suggests contact with non-Peninsular Malays, as this feature is found in many dialects of Borneo, Sumatra and Java (Uri Tadmor, p.c.). This feature is also found in Malay loans in numerous other Aslian languages (see Burenhult 2006 for Jahai, Kruspe 2004 for Semelai, and Kruspe 2009+ for Mah Meri).
iii. Curiously, in what appears to be a Ceq Wong hypercorrection, forms where an initial /h/ would be expected instead have a palatal fricative, e.g. sudang ‘prawn’ (← Malay udang) and sga! ‘price’ (← Malay harga). Compare initial /h/ in hayam ‘chicken’ (← Malay ayam).
iv. Nasalisation/denasalisation of the initial bilabial segment is also attested. Bilabial nasals preceding a nasal consonant denasalize, and voiced bilabial oral stops nasalize when the following consonant is a nasal: ba#kok ‘bowl’ (← Malay mangkuk), bi#u! ‘week’ (← Malay minggu), mn$ ‘thing’ (← Malay benda) and the possible Aslian loan mn%m ! bn'm ‘mountain’.
v. Word final nasals preceded by an oral vowel are realized as prestopped nasals, e.g. jam ‘the clock’ is realized as [!aᵇm].
vi. Finally, in a handful of loans /a/ becomes /#/ in the final syllable:

678

Nicole Kruspe!

tb$l ‘thick’        Malay tebal
jr$m ‘waterfall’    Malay jeram ‘rapid’
sn$r ‘to snore’     Malay sendar ‘to snore lightly’
tal$s ‘taro’        Malay talas (obsolete, keladi in modern Malay)
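Two of the more regular syllable-structure adaptations described earlier in this section can be sketched in Python as follows. This is only an illustration working on standard Malay spelling: the function names and the vowel-based syllable count are assumptions, vowel changes and final glottal stops are not modelled, and the outputs are therefore orthographic approximations rather than Ceq Wong phonemic forms.

```python
import re

VOWELS = "aeiou"


def syllable_count(word):
    """Approximate the syllable count as the number of vowel groups in the spelling."""
    return len(re.findall(f"[{VOWELS}]+", word))


def drop_initial_syllable(word):
    """Elide the initial syllable of a trisyllabic Malay form.

    Returns None when the penultimate onset is l or r, because the text
    describes a different strategy (onset elision) for those forms.
    Words that are not trisyllabic are returned unchanged.
    """
    if syllable_count(word) != 3:
        return word
    rest = re.sub(f"^[^{VOWELS}]*[{VOWELS}]+", "", word, count=1)
    if rest[0] in "lr":
        return None
    return rest


def reduce_nasal_stop_cluster(word):
    """Reduce a nasal + homorganic voiced stop cluster to the plain nasal (spelling only)."""
    for cluster, nasal in (("ngg", "ng"), ("mb", "m"), ("nd", "n")):
        word = word.replace(cluster, nasal)
    return word


if __name__ == "__main__":
    for malay in ("binatang", "sekolah", "selimut"):
        print(malay, "->", drop_initial_syllable(malay))
    for malay in ("panggang", "kembar"):
        print(malay, "->", reduce_nasal_stop_cluster(malay))
```

The guard for l and r onsets keeps the sketch from producing forms that the description above assigns to the other strategy, as in selimut and seratus.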

This “phonological change” could in fact be evidence of very early Malay forms if one follows Adelaar's reconstruction for Proto-Malay. Alternatively they could be loans from a non-Malay western Austronesian language (Tadmor p.c.). Other Aslian languages also exhibit sporadic examples of this kind, e.g. Mah Meri p$g$!k ‘to hold’ (← Malay pegang) (Kruspe 2009+). There is of course the possibility that it may also be an indigenous innovation.

Recent loans are not phonologically adapted. It is impossible to determine to which of the following factors this can be attributed: (1) borrowing directly from colloquial Malay, e.g. gul$ ‘sugar’ (← Malay gula [gul#]), roti ‘bread’ (← Malay roti), whereas phonologically adapted words may have entered the language via Temuan, or another Aslian language; (2) greater sophistication on the part of Ceq Wong speakers who are increasingly fluent in Malay; or (3) the words are simply not yet integrated into the target language. An example of this final point is the case of the loanword nata# ‘animal’, which has only been adopted in the last couple of decades (Howell 1989: 215).

6.3. Morphological integration

All loanwords in Ceq Wong feed morphological processes regardless of their origin, provided that their syllabic structure meets the requirements and the derivations are semantically sound. In the first example, a verb is derived in the progressive by means of infixed reduplication: batuk ‘to cough’ " btuk [b#ktuk] [cough] ‘be coughing’ (! Malay batuk). In the second example a noun, thun ‘year’ derives a unit nominalized form: thun [t#nhun] [year] ‘a year (unit)’ (! Malay tahun).
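The two bracketed forms above, printed as [b#ktuk] and [t#nhun], look like the result of copying the final consonant of the word and infixing it after the initial consonant. A minimal Python sketch of that reading follows. It rests on two assumptions that the text does not state outright: that the garbled symbol “#” stands for a schwa, and that the process is a copy of the final consonant. The function name is hypothetical.

```python
VOWELS = "aeiouə"


def coda_copy_infix(word):
    """Build initial consonant + schwa + copy of the final consonant + final syllable.

    Assumes the word ends in a consonant and that its final syllable has an onset.
    """
    if word[-1] in VOWELS:
        raise ValueError("the sketch only handles consonant-final words")
    # Index of the last vowel; the final syllable starts one segment before it.
    last_vowel = max(i for i, ch in enumerate(word) if ch in VOWELS)
    if last_vowel < 1:
        raise ValueError("the final syllable needs an onset consonant")
    final_syllable = word[last_vowel - 1:]
    return word[0] + "ə" + word[-1] + final_syllable


if __name__ == "__main__":
    print(coda_copy_infix("batuk"))
    print(coda_copy_infix("thun"))
```

Raising an error for vowel-final input mirrors the proviso that the syllabic structure of a loanword must meet the requirements of the derivation.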

7. Grammatical borrowing

The Ceq Wong language exhibits a low level of grammatical borrowing and this suggests that the Ceq Wong may have had little direct, sustained contact with the Malay language. Minimal morphology is used by some speakers who have some confidence in Malay and adopt ‘Malay-isms’ like the use of connectives in discourse, Malay verbs over Ceq Wong ones, and the sporadic use of the prefixes b$- and t$- (from Malay ber- and ter- respectively). These features are idiosyncratic, and more prevalent amongst a couple of confident, older female speakers. There are, however, two well attested, but rather unusual derivations which may be from Malay: the Iterative s$-RDP-, and nasal mutation.

i. The iterative is formed by prefixing s$- ‘ITER’ to a reduplicant of the root (“s#-RDP-root”), e.g. d$! ‘to flee’ → s$-d$!-d$! [ITER-RDP-flee] ‘to keep on fleeing’ (Kruspe 2009++a). Formally it is probably based on a Malay derivation of the type “se-RDP-root” (Mees 1969). However, in terms of semantics it is probably an example of an indigenous innovation.
ii. Nasal mutation of the initial consonant is an unusual process in Aslian⁸, but is a typical Austronesian strategy. In Ceq Wong this is a nominalizing process for disyllabic verbs which have a phonological vowel in the penultimate syllable. Rather than infixing, nasal mutation is selected, where the onset mutates to a homorganic nasal, e.g. patu! ‘to try’ → m:atu! ‘trying’, katet ‘to stick’ → #:atet ‘sticking’, or to the alveolar nasal n: in the case of a glottal onset, e.g. har$n ‘to know’ → n:ar$n ‘knowing’.

These two derivations are problematic. One would expect the borrowing of complex abstract morphemes to arise only in an environment of sustained, intensive contact⁹. Again further research is required to establish the exact origin of these potentially borrowed processes.
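The nasal mutation pattern can be written down as a small lookup, sketched here in Python. The mapping covers only the three onsets attested in the examples above; the name of the table and of the function are hypothetical, and the velar nasal “ŋ” is an assumed reading of the symbol printed as “#”. Other characters are passed through exactly as printed.

```python
# Onset -> nasal, limited to the three onsets attested in the examples.
# Assumption: the symbol printed as "#" in "#:atet" stands for the velar nasal.
HOMORGANIC_NASAL = {
    "p": "m",  # bilabial stop -> bilabial nasal
    "k": "ŋ",  # velar stop -> velar nasal (assumed reading)
    "h": "n",  # glottal onset -> alveolar nasal, as described in the text
}


def nasal_mutation(verb):
    """Replace the initial consonant with its nasal counterpart, written with ':' as in the source."""
    onset = verb[0]
    if onset not in HOMORGANIC_NASAL:
        raise ValueError(f"no attested mutation for onset {onset!r}")
    return HOMORGANIC_NASAL[onset] + ":" + verb[1:]


if __name__ == "__main__":
    for verb in ("patu!", "katet", "har$n"):
        print(verb, "->", nasal_mutation(verb))
```

Restricting the table to attested onsets is deliberate: the text gives no examples for other initial consonants, so the sketch raises an error rather than guessing.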

8. Conclusion

In light of the current paucity of available data on the Aslian languages and Malay dialects like Temuan, it is difficult to present a complete account of loanwords in Ceq Wong. The outcome of this study is that Ceq Wong exhibits a high level of lexical borrowing, particularly from Malay, and predominantly of nouns. While the Ceq Wong exhibit a catholic attitude toward lexical borrowing, the results are surprising given the context: documented evidence of low bilingualism, general isolation from and outward avoidance of Malay society, and negligible Malay cultural influence, e.g. in social organization and in the absence of references to the Malay world in traditional narratives. Furthermore, the high level of lexical borrowing from Malay is in sharp contrast to minimal grammatical borrowing. One possible explanation is that the words were acquired through an intermediary source. Finally, apart from the issue of borrowing, this study evinces the importance Aslian languages may play in providing information that would assist in reconstructing the linguistic history of the Austronesian languages in the Malay Peninsula.

⁸ Nasal mutation of the same form as Ceq Wong is also attested in Mah Meri (Kruspe 2009++b). Clearly indigenous replacive patterns are attested in Aslian; compare, for instance, Mah Meri nominal demonstrative n+h+(! ‘this’ and locative demonstrative t+h+! ‘here’ (Kruspe 2009++b) and Jahai demonstratives t$(h ‘this’ and !$(h ‘here’ (Burenhult 2005: 85–6).

⁹ Curiously, these same two derivations also occur in Mah Meri, a language which notwithstanding extensive contact with Malay is also largely bereft of Malay loans in the domain of derivational affixes.

Acknowledgments

Foremost I would like to thank the Ceq Wong community for sharing their language with me, the Economic Planning Unit, Putrajaya, and the Department of Aboriginal Affairs for granting permission to undertake research in Malaysia, my sponsor Dato’ Prof. Dr. Shamsul Amri Baharuddin, Institut Kajian Etnik, Universiti Kebangsaan Malaysia, and the funding bodies named above for supporting my research. I am grateful to Sander Adelaar, Niclas Burenhult, Signe Howell and Uri Tadmor for their comments, and Amelia Goss for preparing the map. Research into Ceq Wong was supported by post-doctoral fellowships from the Research Centre for Linguistic Typology, La Trobe University (2001–2004), and the Hans Rausing Endangered Languages Programme, School of Oriental and African Studies, The University of London (2005–2007).

References Adelaar, K. Alexander. 1995. Borneo as a cross-roads for Comparative Austronesian Linguistics. In Bellwood, Peter & Fox, James J. & Tryon, Darrell (eds.), The Austronesians: Historical and Comparative Perspectives, 75–95. Canberra: Pacific Linguistics. Research School of Pacific and Asian Studies, Australian National University. Baer, Adela. 1968–1999. A Temuan-English-Malay Lexicon. Adela S. Baer papers. Series 2: Manuscripts, 1968–1999, 2001. Digital File. Keene, NH: Keene State College, Orang Asli Archive. . Bauer, Christian. 1991. Kensiw: A Northern Aslian language of Southern Thailand. In Surin Pookajorn & staff (eds.), Preliminary report of excavations at Moh-Khiew Cave, Krabi province, Sakai Cave, Trang Province and ethnoarchaeological research of a huntergatherer group, so-called “Sakai” or “Semang” at Trang Province, 310–335. Bangkok: Faculty of Archaeology, Silpakorn University. Begbie, Captain Peter James. 1834. The Malayan Peninsula: Embracing its history, manners and customs of the inhabitants, politics, natural history, etc. from its earliest record. Vepery Mission Press. Benjamin, Geoffrey. 1976. Austroasiatic Subgroupings and Prehistory in the Malay Peninsula. In Jenner, Philip N. & Thompson, Laurence C. & Starosta, Stanley (eds.), Austroasiatic Studies (Oceanic Linguistics Special Pulications 13), 37–128. Honolulu: University of Hawaii Press. Benjamin, Geoffrey. 1985. In the long term: Three themes in Malayan cultural ecology. In Hutterer, Karl L. & Rambo, A. Terry & Lovelace, George (eds.), Cultural values and human ecology in South East Asia, 219–278. University of Michigan: Center for South and Southeast Asian Studies.

Benjamin, Geoffrey. 1987. Ethnohistorical perspectives on Kelantan’s prehistory. In Shuhaimi, Nik Hassan bin Nik Abdul Rahman (ed.), Kelantan zaman awal: Kajian arkeologi dan sejarah di Malaysia, 108–153. Kota Bharu, Kelantan: Perpaduan Muzium Negeri Kelantan. Benjamin, Geoffrey. 1997. Issues in the ethnohistory of Pahang. In Shuhaimi, Nik Hassan bin Nik Abdul Rahman & Abu Bakar, Mohamed Mokhtar & Khairuddin, Ahmad Hakimi & Baharuddin, Jazamuddin (eds.), Pembangunan arkeologi pelancongan negeri Pahang, 82–121. Pekan, Malaysia: Muzium Pahang. Blagden, C. O. 1903. The comparative philology of the Sakai and Semang dialects of the Malay Peninsula: A review. Journal of the Straits Branch, Royal Asiatic Society 39:47–63. Bulbeck, David. 2004. An integrated perspective on Orang Asli ethnogenesis. In Paz, Victor (ed.), Southeast Asian archaeology: Wilhelm G. Solheim II festschrift, 366–99. Quezon City: The University of the Philippines Press. Burenhult, Niclas. 2005. A grammar of Jahai. Canberra: Pacific Linguistics. Burenhult, Niclas. 2006. Body part terms in Jahai. Language Sciences 28:162–180. Burenhult, Niclas. 2008. Foraging and the history of languages in the Malay Peninsula. Unpublished manuscript. Burenhult, Niclas & Kruspe, Nicole & Dunn, Michael. n.d. Complexities and dynamics of linguistic history in the Malay Peninsula: The case of Nothern Aslian. Unpublished manuscript. Cant, R. G. A. 1972. A historical geography of Pahang. (Monograph 4). Singapore: Malaysian Branch, Royal Asiatic Society. Diffloth, Gerard. 1975. Les langues mon-khmer de Malasie: Classification historique et innovations [The Mon-Khmer languages of Malaysia: Historical classification and innovations]. Asie du sud-est et monde insulinde 6(4):1–19. Diffloth, Gérard. 1976a. Jah-Hut, an Austroasiatic language of Malaysia. In Liem, N. D. (ed.), Southeast Asian Linguistic Studies, Vol. 2, 73–118. Canberra: Pacific Linguistics. Diffloth, Gérard. 1976b. Mon-Khmer numerals in Aslian languages. 
Linguistics 174:31–8. (Special issue: Austroasiatic number systems). Diffloth, Gérard & Zide, Norman. 1992. Austro-Asiatic languages. In Bright, William (ed.), International Encyclopedia of Linguistics, Vol. 1, 137–42. New York/Oxford: Oxford University Press. Dunn, Michael & Burenhult, Niclas & Kruspe, Nicole & Becker, Neele & Tufvesson, Sylvia. 2009+. A quantitative analysis of Aslian linguistic prehistory. Endicott, Kirk M. 1975. A brief report on the Semoq Beri of Pahang. Federated Museums Journal (New Series) 20:1–23. Endicott, Kirk M. 1983. The effects of slave raiding on the aborigines of the Malay Peninsula. In Reid, Anthony & Brewster, J. (eds.), Slavery, bondage and dependency in Southeast Asia, 216–245. Brisbane, Australia: University of Queensland Press.

Endicott, Kirk M. & Endicott, Karen L. 2007. The headman was a woman. Long Grove, IL: Waveland Press Inc. Evans, Ivor. H. N. 1927. Papers on the ethnology and archaeology of the Malay Peninsula. Cambridge: Cambridge University Press. Howell, Signe. 1982. Chewong myths and legends. (Monograph 11). Kuala Lumpur: Malaysian Branch, Royal Asiatic Society. Howell, Signe. 1985. Equality and hierarchy: Chewong classification. In Barnes, R. H. & De Coppet, D. (eds.), Contexts and levels (Journal of the Anthropological Society of Oxford Monograph 4). Oxford: Oxford University Press. Howell, Signe. 1989 [1984]. Society and Cosmos: Chewong of Peninsular Malaysia. Original edition 1984. Oxford: Oxford University Press. Chicago/London: University of Chicago Press. Howell, Signe. 1996. Nature in culture or culture in nature? Chewong ideas of ‘humans’ and other species. In Descola, Phillip & Passoni, Gisli (eds.), Nature and society: Anthropological perspectives, 127–44. London: Routledge. Howell, Signe. 2002. "We people belong in the forest": Chewong re-creations of uniqueness and separateness. In Benjamin, Geoffrey & Chou, Cynthia (eds.), Tribal communities in the Malay world: Historical, social and cultural perspectives, 254–272. Leiden/Singapore: IIAS & ISEAS. Kruspe, Nicole D. 2004. A grammar of Semelai. (Cambridge Grammatical Descriptions). Cambridge: Cambridge University Press. Kruspe, Nicole D. 2009+. A dictionary of Mah Meri, Bukit Bangkong. (Oceanic Linguistics Special Publication). Hawaii: University of Hawai‘i Press. Kruspe, Nicole D. 2009++a. A grammar of Ceq Wong. Unpublished manuscript. Kruspe, Nicole D. 2009++b. A grammar of Mah Meri. Unpublished manuscript. Mees, C. A. 1969. Tatabahasa dan tatakalimat. Kuala Lumpur: University of Malaya Press. Needham, Rodney. 1956. Ethnographic notes on the Siwang of central Malaya. Journal of the Malaysian Branch of the Royal Asiatic Society 29:46–69. Noone, Richard O. D. 1954. 
Notes on the trade in blowpipes and blowpipe bamboo in north Malaya. Federated Museums Journal Vols. 1–2:1–18. Ogilvie, Charles S. 1940. The “Che Wông”: A little known primitive people. Malayan Nature Journal 1(1):23–25. Ogilvie, Charles S. 1948. More of the Che Wông. Malayan Nature Journal 3(1):15–29. Ogilvie, Charles S. 1949. Che Wông Word List and Notes. In Collings, H. D. (ed.), Bulletin of the Raffles Museum Series B 4:11–39. Shorto, Harry L. 2006. In Paul, Sidwell & Cooper, Doug & Bauer, Christian (eds.), A Mon-Khmer comparative dictionary. Canberra: Pacific Linguistics. Research School of Pacific and Asian Studies, Australian National University. Skeat, William Walter & Blagden, Charles Otto. 1906. Pagan races of the Malay Peninsula. Vols. 1–2. London: Macmillan.

Loanword Appendix Jah Hut ko# bu! pigo!

woman, female breast udder chili pepper

Jah Hut/Southern Aslian mn%m camb&# l#"(! d&l

mountain, hill spider neck some

Southern Aslian pt$m p"t p)n Malay pasir ta,o# lawot paya! la#%t binta# kil)t baya# habu! siut buja# budak !adi! kmar !an"(k kam%n nata# kami# kud$ hayam kuce# layar sisik s$!a# gajah suda#

to plant three four sand valley sea, ocean swamp sky star lightning shade, shadow ash to burn (1) young man child (1) younger sibling twins sibling’s child family animal goat horse fowl cat fin scale gill elephant prawns, shrimp

cace# tupay bala# siput katak badak !urat

tanok kilep paroh ta#k+k baho! batok bsin ba#kay kbur kwat hanal sihat hawar p,ãkit parut bla! !ubat jreh rehat malas pkak pnel butã! kir$ sbus pa#a# priyok kuni! cr"k pi#an ba#kok kuleh sudu!

worm squirrel grasshopper snail frog tapir vein, artery, sinew, tendon horn to blink beak nape of neck shoulder to cough to sneeze corpse carcass grave strong strong healthy cold disease scar to cure medicine tired to rest lazy deaf mute blind to cook to boil to roast, to fry pot kettle kettle dish, plate, saucer bowl cup spoon fork

tapis roti tpu# kaca# mi,"(k garam gul$ susu! lmã# baju! kayen jayit clup sw)l t+ken kasut cincan man"(k !ante# sikat sabun bilik pintu! kunci! lantay dene# ra#kal sim&(t m"j$ #)l)# bnul tia# togoh pap)n pag)r mali! jago# makaw tu#ul gtah sila!

to sieve, to strain bread flour bean oil salt sugar milk manioc bread clothing, clothes cloth to sew to dye trousers sock, stocking shoe ring bracelet, bead, necklace earring comb soap room door, gate key floor wall ladder blanket table ridgepole, beam beam post, pole post, pole board fence digging stick (=yamstick) maize/corn tobacco tree stump sap sweet potato

gal$ labu! jn*! buat kja! lu#kor lipat rantay tali! tmok kap)k biyo# bl)h tarek sa#kut tkan hapit gur"k gaji gam m"(s bsi! !aleh pale# pusi# guli# gunca# ha,*t bn"(# hlam ra#kak te#"! lompat sipak balek halaw tiba! gnam ta#+# tolak lboh titi! kapal (1)

cassava/ manioc gourd, squash, pumpkin fish poison to do, to make work to bend to fold chain rope to pound axe/ax adze to split to pull to hang up to press to squeeze to bore saw glue gold iron to move to turn to turn around to roll to shake to float to swim to dive to crawl to crouch to jump to kick to come back to pursue to arrive to carry in hand to carry on shoulder to push road bridge ship

kit (1) mn$ dwet hut)# bayar bli! jwal tukar pasar kday sga! m"h"(l mur)h lpas lam boh kotep tamon samo# cray buk)! tutop tudu# ti#gi! dalam limã! nãm tujoh lapan smilan spuloh sblas duw$ blas limã! blas duw$ puloh satus sibu bila# lbeh cukup pasa# cpat lambat bnti! siap

raft to take thing money debt to pay to buy to sell to trade, to barter market shop/store price expensive cheap after inside, in to put to pick up to pile up to join to separate to open to shut to cover high, tall deep five six seven eight nine ten eleven twelve fifteen twenty a hundred a thousand to count more enough pair fast to be late to cease ready

slalo! kada# kada# nijam jam bi#u! thun ha,%r wanã! puteh hijaw kun%# tmaga! kasar lbat kri# bayek nasip !untu# s,+(m pdih bnci! bran%! pilih panãy peker caya! kn"(l !ojok !akal blajar cigu! kolah lupa! sn"(# sus)h pay)h sbap kalaw !antaw bapa! gagap cit$r$ ta,"(!

always sometimes hour clock week year, season brackish colour/color white green yellow yellow rough (1) heavy dry clean, good, beautiful good luck good luck to smile pain to hate brave to choose clever to think (1) to believe to know to imitate idea to learn, to teach teacher school to forget easy difficult difficult because if or how many?, how much? to stutter, to stammer to tell to ask (1)

tam)h jawap tulis ktas pensel buku! sun"(! pkan kampo# tu#kat raja! kawan pakat tolo# !adat lawan pra# !as$kar ,ata! pan"(h kres bdil tali! ta#si!

to answer to answer to write paper pen book flute town village walking stick king friend to invite to help custom to fight war or battle army, soldier weapons arrow sword gun fishing line

jare# sare# twar !umpan temba! male# tuhan gr"j$ dyo! tibi m+t+ kita! bas kapal (2) !ubat poles surat payet tandas tilam t"# c+klat

fishnet fish trap fish trap bait to shoot to steal god church radio television motorcycle car bus airplane battery police birth certificate tap/faucet toilet mattress tin/can candy/sweets

bom t"h jadi! #"(n

bomb tea to become with

Western Austronesian jr$m klantãr mon bahaya! sak sn$r tal$s bik)h tb$l

waterfall bolt of lightning dew crocodile, alligator ripe to snore yam to break thick

Unknown origin mam*h kb&s

to bathe to die, dead

Chapter 27

Loanwords in Indonesian*

Uri Tadmor

1. The language and its speakers

1.1. Name and classification

Indonesian (or Bahasa Indonesia ‘the Indonesian language’) is a form of Malay that serves as the national language of Indonesia. It is a member of the Malayic subgroup of Western Malayo-Polynesian, a branch of the Austronesian language family. Other Malayic languages include Minangkabau (spoken on Sumatra), Iban (northern Borneo), and Banjar (southern Borneo). No satisfactory internal classification of Malayic languages has been proposed so far, in part because no linguistic criteria have been established to distinguish between Malay dialects and Malayic languages.

It would not be possible to discuss all varieties of Indonesian here, not only because of space limitations, but also because many of them are poorly documented. The discussion therefore focuses on standard Indonesian, the most widely used variety. Whenever the term “Malay-Indonesian” is used, it will refer to the language as a whole (especially in historical perspective, when it is not possible to make a distinction between Malay and Indonesian). “Indonesian” will refer specifically to contemporary standard Indonesian, while “Malaysian” will refer to standard Malay as used in Malaysia. In citing Malay-Indonesian words the standard orthography is used, with one exception: the mid front vowel /e/ is written é, to distinguish it from the mid central vowel /ə/, written e. In the standard orthography, both are written e.

1.2. Sociolinguistic position

Once used only by ethnic Malays, the Malay language assumed a role as a regional lingua franca at an early date. Currently ethnic Malays constitute only a small part of the total number of speakers of Malay-Indonesian, although they may still constitute a majority among native speakers. There are two different standardized varieties, one used in Malaysia (with very similar varieties also used in Brunei and Singapore) and the other in Indonesia.

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Tadmor, Uri. 2009. Indonesian vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1947 entries.

Map 1: Geographic situation of Indonesian The total number of speakers of Malay-Indonesian is estimated at about 250 million, making it by far the most widely spoken language in Southeast Asia, as well as the most widely spoken Austronesian language. Most Indonesians know at least some Indonesian and use it on a regular basis. However, the standard language is not acquired as a first language. Where Indonesian is used as a home language, it is in the form of a local colloquial variety. In this sense, all speakers of standard 1 Indonesian are at least bidialectal . Children acquire the standard language early on from its use on television and in school. In recent years the Jakarta dialect has been making inroads into areas that have previously been the sole domain of standard Indonesian, such as advertisements and television interviews. It is also widely used in youth magazines, on the Internet, and in text messaging. However, in more formal situations the use of standard Indonesian is still the overwhelming choice.

¹ In a diglossic situation where speakers use Standard Indonesian in more formal situations and a colloquial variety of Malay-Indonesian as a home language, the two can be said to form the two ends of a continuum.

Uri Tadmor

As the sole official language, Indonesian is used in all government communication, both oral and written. Practically anything published in Indonesia (books, newspapers, magazines) is in Indonesian, as are almost all product markings and public signs. Spoken Indonesian is also used as a lingua franca among people who belong to different ethnolinguistic groups.

2. Sources of data

In addition to the author’s personal knowledge, a large number of dictionaries were consulted. The major dictionaries of Indonesian used were Kamus Besar Bahasa Indonesia (2002), Echols & Shadily (1975), Echols & Shadily (1998), and Stevens & Schmidgall-Tellings (2004). Among dictionaries of Malaysian, the following deserve special mention: Wilkinson (1959), Kamus Dewan (1991), and Kamus Inggris-Melayu Dewan (1992). In addition, many dictionaries of the various donor languages were also consulted. Of the few methodological studies of loanwords in Malay-Indonesian, Jones (1984) provides a good overview, and Gonda (1952) remains the classic work on words of Sanskrit origin in the languages of Indonesia. Particularly useful were the three publications of the Indonesian Etymological Project, which consist of lists of Indonesian loanwords that ultimately originate from Arabic and Persian (Jones 1978), European languages (Grijns et al. 1983), and Sanskrit (de Casparis 1997).²

3. Contact situations

3.1. Languages of India

The earliest foreigners known to have had significant influence on the Malay-Indonesian archipelago were Indians. However, practically all evidence of this early contact – dating back to the first millennium CE – is secondary, in the form of Indian cultural, religious, and linguistic traits. Therefore, we do not know exactly who introduced Indian civilization to the area, although it is assumed that it was introduced mostly by visiting Indian traders, scholars, and missionaries, rather than through large-scale immigration or political domination. The oldest Malay inscriptions (7th century CE) contain parts in Sanskrit, and even the Malay sections of these inscriptions include many Sanskrit loanwords. Sanskrit continued to be used in the Malay-speaking world for centuries as a liturgical language (for both Hinduism and Buddhism) as well as a literary language. This has made Sanskrit a major donor language for Malay-Indonesian and the source of many common words such as kepala ‘head’ (< kapāla ‘cup, skull’), cahaya ‘light’ (< chāya ‘reflection, light’), nama ‘name’ (< nāma(n) ‘name’), kerja ‘work’ (< kārya ‘duty, work’), semua ‘all’ (< samūha ‘multitude’), and karena ‘because’ (< kāraṇa ‘cause’).

Later Indic languages such as Hindi-Urdu also contributed some loanwords to Indonesian, but it is often difficult to point to the precise source words, which may have been early or dialectal forms. Some of these loanwords are roti ‘bread’ (cf. Hindi-Urdu roṭī ‘bread’), celana ‘trousers’ (cf. Hindi-Urdu charnā ‘half-trousers’), topi ‘hat’ (cf. Hindi-Urdu ṭopī ‘hat’), and kunci ‘key’ (cf. Hindi-Urdu kuñjī ‘key’).

Dravidian languages of southern India – mostly Tamil – were also in contact with Malay-Indonesian, and have left traces in the form of numerous loanwords. Tamil loanwords in Indonesian include kapal ‘ship’ (< kappal ‘ship’), teman ‘friend’ (< tamaṉ ‘male relative or friend’), nelayan ‘fisherman’ (< nulaiyan ‘seashore dweller, fisherman’), and kedai ‘shop’ (< kaṭai ‘shop, market’; this loanword is more commonly used in Malaysia). Some of the source words, such as nulaiyan, are obsolete in modern Tamil (E. Annamalai, p.c.), testifying to the antiquity of these loanwords in Malay-Indonesian. Many words of Indic (Indo-European) origin also show evidence of having been borrowed via speakers of Dravidian languages (see §4.1.1).

² The results of the Indonesian Etymological Project, which investigated lexical borrowing into Malay-Indonesian from languages outside the Malay-Indonesian Archipelago, were summarized in Jones (2007). This book, consisting of a glossary of words borrowed into Malay-Indonesian from ten foreign languages, was published after the data collection for this chapter had been practically completed.

3.2. Chinese languages

Chinese pilgrims and traders have been visiting Indonesia for well over a thousand years, in the past often on their way to India. Chinese communities have existed throughout the archipelago for many centuries. Various Chinese languages were (and still are) spoken in Indonesia, and have influenced colloquial varieties of Indonesian, although in the standard language their influence has been limited and purely lexical. Most Chinese immigrants to the Malay-Indonesian archipelago were speakers of Southern Min varieties, and by far the most important Chinese donor language was Hokkien (also called Amoy). There are also some loanwords from Teochew (another Southern Min language) and a handful from Mandarin and Cantonese. Examples of Chinese loanwords in Indonesian include cat ‘paint’ (< Amoy chhat ‘paint’), toko ‘store’ (< Amoy thó’ khò’ ‘store, warehouse’), giwang ‘earring’ (< Foochow ngi⁷ hwang⁵ ‘earring’), and téh ‘tea’ (< Amoy tê ‘tea’).

3.3. Near East languages

Travelers from the Near East first arrived in Indonesia during the second half of the first millennium CE. Eventually the Arabic and Persian languages were to have a strong impact on Malay-Indonesian. However, this did not take place until centuries later, when local inhabitants began converting to Islam. The lexical influence of Arabic has been especially strong. Many words of Arabic origin did not enter Malay-Indonesian from spoken Arabic, but rather through Arabic literature or through Persian literature (where Arabic loanwords abound); see §4.1.1. Loanwords of Arabic origin, which are very numerous, include dunia ‘world’ (< dunyā ‘world’), badan ‘body’ (< badan ‘body’), kuat ‘strong’ (< qūwat- ‘strength, power’), kursi ‘chair’ (< kursī ‘chair’), waktu ‘time’ (< waqt ‘time, period’), pikir ‘think’ (< fikr ‘thinking, cognition’), and jawab ‘answer’ (< jawāb ‘answer’). Words of ultimate Persian (non-Arabic) origin are far fewer, and include kawin ‘marry’ (< kāwīn ‘dowry’), domba ‘sheep’ (< dumba ‘a kind of sheep with a thick tail’), anggur ‘grape, wine’ (< angūr ‘grape, raisin’), and gandum ‘wheat’ (< gandum ‘wheat’).

3.4. Portuguese

The earliest Europeans with a significant presence in Indonesia were the Portuguese, who first arrived in the early 16th century. There are numerous loanwords of Portuguese origin in Indonesian, but most were not borrowed directly but rather via the Portuguese-based creole that once served as a lingua franca in Batavia (capital of the Dutch East Indies, now Jakarta). This creole was spread to Batavia from Malacca (in the Malay Peninsula) by slaves captured from the Portuguese after Malacca fell to the Dutch in 1641. Batavian Portuguese Creole died out in the 20th century, but Malaccan Portuguese Creole (known as Kristang) is still used by small groups in Malacca itself and in Singapore. The Indonesian words for many everyday objects are of ultimately Portuguese origin, such as garpu ‘fork’ (< garfo), keméja ‘shirt’ (< camisa), sepatu ‘shoes’ (< sapato), méja ‘table’ (< mesa), roda ‘wheel’ (< roda), bola ‘ball’ (< bola), and jendéla ‘window’ (< janela). A hitherto overlooked Portuguese loanword is kaléng ‘tin, can’, recorded as calaim (with various other spellings) in Asian Portuguese sources by Yule and Burnell (1903:145–6). The ultimate source of this loanword is Turkish kalay ‘tin’.

Some Indonesian words previously considered as Dutch loanwords are analyzed in the present study as having been borrowed via Portuguese Creole. These include lampu ‘lamp’ (< Portuguese Creole lampu < Dutch lamp; cf. Kristang lampu) and buku ‘book’ (< Portuguese Creole buku < Dutch boek; cf. Kristang buku). In these words, indirect borrowing explains the presence of the unexpected final -u, a common phonological strategy in Portuguese Creole but not in Malay-Indonesian.³ Other words also betray indirect borrowing by their unusual phonological or semantic correspondence pattern. Indonesian pompa ‘pump’ is ultimately from Portuguese bomba ‘pump’ (cf. Kristang bomba); however, its phonology has been influenced by Dutch pomp ‘pump’. (The Malaysian counterpart of this word did not undergo the change and has remained bomba.) Indonesian pipa ‘(water) pipe’ is ultimately from Portuguese pipa ‘barrel’, influenced by the semantics of Dutch pijp ‘pipe’.⁴ The unusual syncopation exhibited by Indonesian taflak ‘tablecloth’ (ultimately from Dutch tafellaken) is also explained by borrowing via Portuguese Creole (cf. Kristang taflak). An interesting doublet is represented by the two Indonesian words for ‘bullet’, peluru and pélor. Peluru was borrowed from Kristang piloru while pélor would have been borrowed from Batavian Portuguese Creole *pilor.⁵

³ Still used until relatively recently – cf. waistu ‘waist’ in Singapore Kristang, from English waist.

3.5. Dutch

After the Portuguese, the next Europeans to send expeditions to Indonesia were the Dutch, who first came towards the end of the 16th century. Eventually the Dutch came to control all of present-day Indonesia until the mid-20th century. The use of Dutch in Indonesia was limited, however, and only a small fraction of the indigenous population ever gained fluency in the language. Nevertheless, since the ruling class spoke Dutch (and the few Indonesians who spoke Dutch belonged to the influential elite), Dutch had a strong impact on the Indonesian lexicon, and some impact on its grammar as well. Interestingly, many of the Dutch source words were themselves loanwords, mostly from French. Dutch loanwords in Indonesian are very numerous, and include kamar ‘room’ (< kamer ‘room’), kopi ‘coffee’ (< koffie ‘coffee’), duit ‘money’ (< duit, the name of an old Dutch coin), mobil ‘car’ (< automobiel ‘car’), setir ‘driving wheel, drive’ (< stuur ‘driving wheel’), lat / telat ‘late’ (< laat ‘late’), and koran ‘newspaper’ (< courant, the older form of krant ‘newspaper’).

3.6. English

Following full independence in late 1949, English quickly became the most widely taught foreign language in Indonesia. Members of the educated elite generally have a good knowledge of English and frequently code-switch between English and Indonesian. English is also heard daily on television and in movie theaters, so most Indonesians have had at least some exposure to it. English loanwords in Indonesian include flu ‘cold, flu’, koin ‘coin’, gaun ‘dress’ (< gown), bolpoin ‘pen’ (< ballpoint), mall [mol] ‘shopping center’ (< mall), and bil ‘check, bill’.

3.7. Other languages of the Malay-Indonesian archipelago

In addition to coming in contact with languages from outside the region, Malay-Indonesian has been in contact with many local languages, principally via its role as a lingua franca throughout the archipelago. The most influential of these local languages overall has been Javanese, in contact with Malay-Indonesian for well over a millennium. Today, native speakers of Javanese form the largest group among speakers of Indonesian. Javanese loanwords entered Indonesian through at least two distinct contact situations. There is evidence for borrowing from Old Javanese into a very early form of Malay. Such loanwords are typically characterized by their presence in Classical Malay manuscripts of the 16th and early 17th centuries (see §4.1.1). The second contact situation was between modern Javanese and Indonesian (or its precursor) since the mid-17th century. Javanese has had a strong impact on Java Malay and on Betawi (the variety used in Jakarta), and through them on the standard language as well.

The two other local languages which have strongly influenced the lexicon of modern Indonesian are Balinese and Sundanese. Balinese people (mostly slaves) once constituted the largest ethnic group in Batavia, and it is there (rather than in Bali) that most lexical transfer from Balinese into Indonesian took place. Sundanese was the language of Batavia’s rural hinterland (and indeed is still used in the areas surrounding Jakarta). After natives of Java were permitted to settle in the city, there was a massive inflow of migrants from the Sundanese-speaking hinterland, who brought their language with them. Indonesian words for which possible source words exist in all three languages (Javanese, Balinese, and Sundanese) include bébék ‘duck’, mulus ‘smooth’, sabuk ‘belt’, pusar ‘navel’, kocok ‘shake, mix’, keponakan ‘niece/nephew’, tuding ‘to accuse’, ajak ‘to invite’, sepi ‘quiet’, and mirip ‘similar’.

Finally, another local language that has had some lexical influence on Indonesian is Minangkabau, a Malayic language of western Sumatra. Many Indonesian authors and educators, especially those active in the early formative years of modern standard Indonesian, were native speakers of Minangkabau. Their writings contain numerous Minangkabau words, some of which have become part of the general vocabulary of Indonesian. Because Minangkabau is a Malayic language very closely related to Malay-Indonesian, it is difficult to distinguish between shared retentions and loanwords, let alone to determine the direction of borrowing. The following words appear to have been borrowed from Minangkabau into Malay-Indonesian: kalian ‘you (pl.)’, gadis ‘girl’, dangkal ‘shallow’, datar ‘flat’, bersua ‘meet’, bertikai ‘fight’, and pidato ‘speech’. Some of these words display final consonants whose realization has changed in modern Minangkabau, changes not reflected in the Arabic-based writing system formerly used for writing Minangkabau.

⁴ For an example of Malay pipa used with the sense ‘barrel’ (in a Malay letter of 1797) see Mu’jizah 2009: 25.

⁵ In Batavian (but not Malaccan) Portuguese Creole, final -u is deleted after r.

3.8. Other neighboring languages

In addition to numerous and mostly recent examples of borrowing from other languages of the Malay-Indonesian archipelago, Malay-Indonesian also appears to share numerous vocabulary items with other languages of the region, especially Austroasiatic and Tai languages. It is often difficult to pinpoint the immediate donor language and the direction of borrowing, because the same etymon may be represented in several languages in each family. Some of the loanwords in question are very old and have undergone subsequent sound changes in the recipient languages, further obscuring their origin.

Words of definite Tai origin are mostly used in Malaysian rather than in Indonesian, e.g. bomoh ‘shaman’ (< Old Thai bɔː ‘father (used as an epithet)’ + hmɔː ‘shaman’), natang ‘kind of window’ (< Thai nâːtaːng ‘window’), and wau ‘kite’ (< Thai wâːw ‘kite’).⁶ Words of Austroasiatic origin are more numerous and include words used in Indonesian (as well as Malaysian), such as ketam ‘kind of crab’ (cf. Proto Mon-Khmer *kt₁aam) and sekam ‘husk’ (cf. Proto Mon-Khmer *skaamʔ ‘husk’).⁷ Later loanwords came in from the languages of the major Austroasiatic civilizations of Southeast Asia, Mon and Khmer.

A large group of etyma occurs in Mon-Khmer and Tai as well as in Malay-Indonesian. Most are due to borrowing from a common source (principally Sanskrit/Pali), while some others are clearly the result of borrowing from Malay-Indonesian rather than into Malay-Indonesian. But there remains a large number of words which are not of Indic origin and where the direction of borrowing seems to be into Malay-Indonesian; some examples are provided in Table 1. Without getting into the possibility of ancient genealogical connections between Austronesian, Austroasiatic, and Tai-Kadai, I believe that further research would show that most items in Table 1 are the result of borrowing from Khmer into Malay-Indonesian, either directly or via Tai. Indeed, quite a few of the etyma have reflexes in Old Khmer and/or have reconstructed proto-forms in Proto Mon-Khmer.

4. Numbers and kinds of loanwords

4.1. Loanwords by donor language

4.1.1. Challenges in identifying immediate donor languages

Identifying the immediate donor language of many loanwords in Indonesian proved difficult or even impossible. Some particularly challenging groups of loanwords are discussed below.

Javanese, Balinese, and Sundanese are closely related to each other and share a significant part of their lexicon. This is due not only to shared retentions but also to the fact that Balinese and Sundanese have both borrowed from Javanese. It is therefore often difficult to tell whether a particular word was borrowed into Indonesian from (modern) Javanese, Balinese, or Sundanese. This is one of the reasons why these three languages are grouped together as “Languages of the Java Area” in Tables 3, 4, and 5. Loanwords from Old Javanese are easier to identify because they are attested in Classical Malay, which was used before Balinese and Sundanese had any significant influence on Malay. Some are also used in Malay dialects of Borneo and the Malay Peninsula, where the direct influence of Balinese and Sundanese has been minimal.

⁶ By a strange coincidence, the Dutch word for kite (the bird, not the toy) is wouw, which led Wilkinson (1959:1282) to wrongly cite it as the source for Malay wau.

⁷ Proto Mon-Khmer forms cited in this chapter are from Shorto 2006.

Table 1: Some shared etyma in Khmer, Thai, and Malay-Indonesian

Khmer                                                  Thai                                 Malay-Indonesian
bɔɲciː ‘a register’                                    banchiː ‘account, list, register’    banci ‘census’
cam ‘remember’                                         cam ‘remember, recall’               cam ‘recognize, be able to recall’
tiən ‘candle’                                          thian ‘candle’                       dian ‘candle, oil lamp’⁸
tɔən ‘manage to, be in time for’                       than ‘manage to, be in time for’     dan ‘manage to, be in time for’
pùːək ‘group’                                          phûak ‘group’                        puak ‘group’
krɛː ‘bed’                                             khræ̂ː ‘litter, light bed/seat’       gerai ‘platform, stall’
Proto Mon-Khmer *dgaːm ‘molar tooth’                   kraːm ‘molar tooth’                  geraham ‘molar tooth’
khtɤ̀ːy ‘hermaphrodite’                                 kàthəːy ‘hermaphrodite’              kedi ‘hermaphrodite’
Proto Mon-Khmer *[t]ruŋ ‘cage’, *kruŋ ‘to confine’     krong ‘cage’                         kurung ‘cage, to confine’
krəbɤy ‘water buffalo’                                 kràbɯː ‘water buffalo’               kerbau ‘water buffalo’
lŋɔ̀ː ‘sesame’                                          ngaa ‘sesame’                        lenga ‘sesame’
sbay ‘muslin’                                          sàbay ‘shawl’                        sebai ‘shawl’
srəmaoc ‘ant’                                          mót ‘ant’                            semut ‘ant’
srənok ‘pleasant’                                      sànùk ‘enjoyable’                    seronok ‘pleasant, enjoyable’
traː ‘seal, stamp’                                     traː ‘seal, stamp’                   tera ‘seal, stamp’
thùən ‘endure’                                         thon ‘endure’                        tahan ‘endure’

Loanwords of Sanskrit origin also presented some problems. Although most such words show a close enough phonetic and semantic resemblance to Sanskrit and seem to have originated from Sanskrit literature, it is also possible that some have come into Malay not directly from Sanskrit, but via a Prakrit or a later vernacular. An interesting feature of some loanwords of Sanskrit origin is intervocalic voicing, which can be seen in words such as kuda ‘horse’ (cf. Sanskrit ghoṭa), gergaji ‘saw’ (cf. Sanskrit krakaca; the initial consonant may have undergone voicing by assimilation), ajar ‘teach/learn’ (cf. Sanskrit ācārya), bijaksana ‘wise’ (cf. Sanskrit vicakṣaṇa), curiga ‘suspect’ (cf. Sanskrit churikā), and segala ‘all’ (cf. Sanskrit sakala). Intervocalic voicing was never a feature of Malay-Indonesian phonology, but it is a hallmark of Tamil and other Dravidian languages. This indicates that these words (and probably others which happen not to have intervocalic voiceless stops) may have been borrowed via a Dravidian language or were learned from native speakers of a Dravidian language. A loanword where this development is clear is Indonesian tiga ‘three’, ultimately from Sanskrit trika ‘triple’; indeed, Telugu exhibits an identical form, tiga ‘three’.⁹ The fact that Malay-Indonesian borrowed a lower numeral from or via Dravidian speakers is strong testimony to the latter group’s strong influence in ancient Indonesia. This analysis matches historical evidence that early Indian influence on Indonesia emanated from southern (Dravidian-speaking) India. To overcome the problem of identifying the immediate donor language, all words ultimately originating in languages of India are grouped together for statistical purposes.

⁸ This etymon appears to have entered Old Khmer from Chinese, where it meant ‘oil lamp’.

⁹ This was pointed out to me by Waruno Mahdi.

Regarding words of Arabic origin, it seems probable that many came into Malay-Indonesian via intermediate languages. Campbell (1996) investigated loanwords of ultimate Arabic origin in Malay which in Classical Arabic contained the feminine suffix -at- (followed by a case ending, e.g. nominative indefinite -un). This suffix is represented in words of Arabic origin in Malay-Indonesian by -ah or -at, e.g. fitnah ‘slander’ and adat ‘custom’ (both words have the same ending in Arabic).¹⁰ Campbell compared well-attested Arabic etyma that occur both in Persian and in Malay whose source words in Classical Arabic have the feminine suffix -at-. His findings indicate that more than half the words that end in -at in Persian also end in -at in Indonesian, while only a fifth end in -ah. Even more striking, nearly three quarters of the words ending in -e in Persian end in -ah in Malay, while only about 10% end in -at. Campbell’s results are summarized in Table 2.

Table 2: Correspondence rate between Persian -e and -at and Malay -ah and -at (after Campbell 1996: 38–39)

               Malay -at    Malay -ah    Problematic¹¹
Persian -at    51.58%       20.00%       28.42%
Persian -e     10.47%       73.26%       16.28%
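One rough way to probe whether the two rows of Table 2 differ by more than chance is a chi-square test of independence. The sketch below is not Campbell's procedure. The source reports percentages only, so the raw counts used here are an assumption: they are back-calculated as the smallest row sizes (95 Persian -at words and 86 Persian -e words) that reproduce every published figure to two decimals. All identifiers are illustrative.

```python
# Hypothetical counts back-calculated from the percentages in Table 2.
# They are an assumption: the source reports percentages only.
COUNTS = {
    "Persian -at": {"Malay -at": 49, "Malay -ah": 19, "Problematic": 27},
    "Persian -e": {"Malay -at": 9, "Malay -ah": 63, "Problematic": 14},
}


def row_percentages(row):
    """Return each cell as a percentage of its row total, rounded to 2 places."""
    total = sum(row.values())
    return {label: round(100 * n / total, 2) for label, n in row.items()}


def chi_square(table):
    """Pearson chi-square statistic for a table given as {row: {col: count}}."""
    rows = list(table)
    cols = list(table[rows[0]])
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    grand_total = sum(row_totals.values())
    statistic = 0.0
    for r in rows:
        for c in cols:
            # Expected count under independence of Persian and Malay endings.
            expected = row_totals[r] * col_totals[c] / grand_total
            statistic += (table[r][c] - expected) ** 2 / expected
    return statistic


# Standard critical value of chi-square for 2 degrees of freedom at p = 0.001.
CRITICAL_DF2_P001 = 13.816

if __name__ == "__main__":
    for name, row in COUNTS.items():
        print(name, row_percentages(row))
    stat = chi_square(COUNTS)
    print(f"chi-square = {stat:.2f}; above critical value: {stat > CRITICAL_DF2_P001}")
```

Under these assumed counts the statistic lies far above the critical value, which is consistent with the claim that the pattern is not accidental. The test says nothing about the direction or route of borrowing, and different counts (for instance from other lexicographical sources) would give a different result.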

These rates of correspondence in Table 2 cannot be the product of chance, and seem to indicate that the words in question were borrowed into Malay via Persian. Since there is no reason to assume that only words with these endings were borrowed into Malay, it may be inferred that most Arabic loanwords in Malay-Indonesian in general were borrowed principally via Persian. The possibility remains, of course, that the direct donor language was a Persianized language of India (such as Urdu). It also remains to be explained why Persian -é would be borrowed as Malay -ah; perhaps this reflects a spelling pronunciation or an earlier Persian pronunciation. Moreover, a recent study by van Dam (2009) has shown that if older and more extensive lexicographical sources are used, different statistical results are obtained, possibly indicating a much stronger place for direct borrowing from Arabic. Whichever the case might be, it is clear that words were borrowed with -at only at earlier stages of Malay-Indonesian; modern borrowings exhibit exclusively -ah, e.g. kuliah ‘to attend university’, nasabah ‘bank customer’, and majalah ‘magazine’.

¹⁰ Superficially, it would appear as though words in -ah reflect the Classical Arabic pausal form, while words with -at reflect the construct state form. However, there is no good explanation why these marked forms would be the ones borrowed rather than the absolute forms. Moreover, as will be explained below, it is highly probable that most of these words did not enter Malay-Indonesian directly from Arabic.

¹¹ “Problematic” words are those that according to Campbell “either showed disagreement among the authorities, or had alternatives cited by at least one authority” (Campbell 1996: 39).

It is possible that this study might help settle an old debate among historians of Islam in Southeast Asia. By the last quarter of the 19th century, European historians had realized that Islam had spread to Southeast Asia principally via India, rather than directly from Arabia. At first Gujarat in northwestern India was thought to have been the more specific locus whence Islam was introduced to the Malay-Indonesian archipelago, although later scholars also theorized that southern India was a more probable locus. The debate raged for a century (for a summary see Meuleman 2005: 24–25).

This study did not find any Indic loanwords in Malay-Indonesian that can definitely be shown to have originated from Gujarati. Moreover, none of the ultimately Arabic loanwords show signs of having been borrowed via Gujarati. On the other hand, there are numerous loanwords of obvious Dravidian origin in Malay-Indonesian (including ultimately Indic loanwords borrowed via Dravidian or Dravidian speakers). Even more importantly, certain Arabic loanwords in Malay-Indonesian show signs of having been borrowed via a Dravidian language (or from native speakers of a Dravidian language). Specifically, these are words that in standard Arabic end in a cluster. Some cluster-final words exhibit a final -u which eliminates the unphonotactic (in Malay-Indonesian) cluster, e.g. Sabtu ‘Saturday’ (< Arabic sabt), waktu ‘time’ (< Arabic waqt), salju ‘snow’ (< Arabic ṯalj), perlu ‘need’ (< Arabic farḍ). The final -u appears to reflect the Classical Arabic nominative ending. However, such forms occur in Arabic itself only in the construct state (roughly, when the noun is the head of a genitive construction) or following a definite article, and it would be difficult to explain why such highly marked forms would be borrowed (without the genitive noun or definite article) rather than the simple unmarked forms. Moreover, as noted in §5.2 below, Arabic words with final clusters are normally integrated into Malay-Indonesian with an echo vowel inserted between the final two consonants: subuh ‘dawn’ < Arabic ṣubḥ, jisim ‘body’ < Arabic jism, rajam ‘stoning’ < Arabic rajm. So the final -u in these words seems to originate neither in Arabic itself nor in the Malay-Indonesian integration pattern. An alternative explanation would be that these forms were learned from native speakers of a Dravidian language such as Tamil or Telugu, where -u is regularly appended to consonant-final loanwords. The linguistic evidence therefore supports the theory of the introduction of Islam to the Malay-Indonesian archipelago from southern India.

Finally, a word of caution is also in order regarding Dutch and English loanwords. It is often impossible to tell simply by looking at a word whether it was borrowed directly from Dutch or from English. This is especially true for words ultimately derived from Greek and Latin, many of which would have the same shape in Indonesian regardless of whether they were borrowed from Dutch or from English. For example, Dutch connectie and English connection would both be borrowed as Indonesian konéksi, and Dutch kwantiteit as well as English quantity would both be borrowed as Indonesian kuantitas (see §5 and §6.2). This has the potential of making newer English loanwords appear to be older Dutch ones. However, after carefully examining all potential cases in the database, I concluded that the number of such words in the Indonesian subdatabase is very small.

4.1.2. Grouping donor languages

The number of donor languages (including dialects and language groups) that have contributed loanwords to Indonesian is very large. For ease of presentation and discussion, they have been grouped into eight groups of donor languages (plus “Miscellaneous” and “Unidentified source”), as in Tables 3, 4, and 5. The grouping was based on socio-historical as well as practical grounds. Languages participating in broadly the same contact situation were generally grouped together (e.g. languages of India, Arabic/Persian). In grouping together the languages of the Java area, a practical consideration was used: as already mentioned, it was often impossible to tell if a particular word was from Javanese, Sundanese, or Balinese. Fortunately, this practical approach did not conflict with sociolinguistic and historical considerations. The percentages of words in the database originating from the various donor languages and donor language groups are presented in Table 3.

In previous studies of loanwords in Indonesian, loanwords from local languages have been excluded from the discussion.¹² As the present study shows, this approach is wholly unwarranted, as such loanwords form an important part of the lexicon. Words from languages of the greater Java area (Java, Bali, and Madura) constitute the most numerous group of loanwords, with 8.9% of the total number of words in the database. Loanwords from these languages are characterized by their common, everyday nature, e.g. samping ‘side, next to’ (< Balinese), keriput ‘wrinkled’ (< Sundanese), and ketombé ‘dandruff’ (< Javanese). These words first entered Indonesian via second-language speakers who were transferring words from their first languages into Indonesian. This process (technically known as “imposition”) is rather different from the transfer of words into one’s first language from another language (“adoption”), which accounts for most Indonesian loanwords from other sources. Different types of borrowing processes are discussed in §8.

Although Sanskrit is the earliest recorded donor language for Malay-Indonesian and has not been used by speakers of Malay-Indonesian for many centuries, loanwords from languages of India (consisting mostly of words of Sanskrit origin) still constitute the second largest category, with 8.4% of all words in the subdatabase. This is especially remarkable considering that many Sanskrit loanwords that appear in early writings (and doubtlessly many unrecorded ones) have long become obsolete and are therefore not included in the count. These numerous and enduring Sanskrit loanwords testify to the tremendous impact that Indian cultures, religions, and languages have had on the Malay-Indonesian speaking world.

¹² Including the recent Loan-Words in Indonesian and Malay (Jones 2007).

Table 3: Loanwords in Indonesian by donor languages/language groups (percentages)

Donor language                    Proportion of all words in Indonesian database
Languages of the Java area        8.9
Languages of India                8.4
Dutch                             6.4
Arabic/Persian                    5.7
Portuguese (including Creole)     1.4
English                           1.2
Chinese languages                 0.7
Languages of Sumatra              0.4
Unidentified source               0.5
Miscellaneous languages           0.4
Total loanwords                   34.0
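The figures in Table 3 can be cross-checked mechanically. The following is a minimal sketch, with variable and function names chosen purely for illustration, that confirms the ten rows add up to the stated total and lists the donor groups by share.

```python
# Percentages copied from Table 3 (share of all words in the Indonesian database).
TABLE_3 = {
    "Languages of the Java area": 8.9,
    "Languages of India": 8.4,
    "Dutch": 6.4,
    "Arabic/Persian": 5.7,
    "Portuguese (including Creole)": 1.4,
    "English": 1.2,
    "Chinese languages": 0.7,
    "Languages of Sumatra": 0.4,
    "Unidentified source": 0.5,
    "Miscellaneous languages": 0.4,
}
STATED_TOTAL = 34.0  # the "Total loanwords" row of Table 3


def total_share(shares):
    """Sum the shares, rounding to one decimal to absorb floating-point noise."""
    return round(sum(shares.values()), 1)


def ranked(shares):
    """Return (group, share) pairs sorted from largest to smallest share."""
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print("sum of rows:", total_share(TABLE_3), "stated total:", STATED_TOTAL)
    for group, share in ranked(TABLE_3):
        print(f"{share:4.1f}  {group}")
```

Because the rows are rounded to one decimal, agreement with the stated total is a consistency check on the table only; it does not validate the underlying word counts.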

Words of Indian origin are particularly well represented in the domain of religion, and constitute about a quarter of all words in this category (as represented in the database). Only a small fraction of Indonesia’s population still adheres to religions that originated from India. However, many religious terms of Sanskrit origin persist, and are now applied to Muslim and Christian concepts, e.g. agama ‘religion’, surga ‘heaven’, neraka ‘hell’, pahala ‘(religious) merit’, puasa ‘fasting’, and pendéta ‘priest, Protestant minister’. It is also remarkable that over 11% of the Indonesian function words in the database are of Indian origin. It should be noted, however, that many of these were not function words in the donor language (see §4.2). As expected, the semantic field least affected by Sanskrit was the Modern World. Although words of Sanskrit origin are still used for coining neologisms, these were not counted as loanwords, as they were created in Indonesian.

Dutch words were borrowed into Indonesian throughout the colonial period, which started in 1619 with the occupation and destruction of Jayakarta (modern Jakarta). The era of Dutch colonialism ended with the transfer of sovereignty in 1949. For all intents and purposes, the borrowing of Dutch words into Indonesian also ended then, because Dutch never became an auxiliary language in Indonesia, as other European languages have become in many former colonies. Since then, numerous Dutch loanwords have become obsolete, but many others are still well established in the lexicon. With 6.4% of the total number of words in the database (Table 3), the number of Dutch loanwords even exceeds the number of loanwords of Arabic and Persian origin combined. The difference between Dutch and English, the major conduit of Western linguistic influence in Indonesia since independence, is also striking. English loanwords constitute only 1.2% of the total number of words in the database. While the proportion is certain to rise given current sociolinguistic trends, it will be a long time before English loanwords can eclipse the large number of Dutch loanwords adopted throughout centuries of colonialism.

27. Loanwords in Indonesian


Finally, Portuguese (including Batavian Portuguese Creole) and Chinese languages (mostly Hokkien) also played a significant role as donor languages, although the number of loanwords they contributed to standard Indonesian is relatively small (1.5% and 0.9%, respectively, of the total number of words in the Indonesian database).

4.2. Semantic word classes

The breakdown of loanwords according to semantic word classes is summarized in Table 4. The figures for Indonesian conform to the general trend of borrowing proportionally more nouns than verbs (see Chapter III). Indeed, fully 43.7% of all nouns in the Indonesian database are loanwords. Verbs were borrowed far less frequently, although loanwords still constitute a significant proportion (17.2%) of the verbs in the database. If Indonesian borrowed more verbs than other languages, this was probably because verb borrowing is easy in Indonesian, which is agglutinative and has almost no inflectional morphology. Compare this to Semitic languages like Arabic or Hebrew, which have complex inflectional verbal morphology and therefore present substantial challenges to the borrowing of verbs (see discussion in Chapter III). The only morphological condition on the borrowing of verbs into Indonesian is that the citation form must contain the prefix meng- (for transitive verbs and a subcategory of stative verbs) or ber- (for intransitive verbs).¹³

Table 4: Loanwords in Indonesian by donor language group and semantic word class (percentages)

Donor language group         Nouns   Verbs   Adjectives   Adverbs   Function words   All words
Languages of the Java area   10.0    6.8     10.9         -         4.0              8.9
Languages of India           10.1    3.8     6.8          15.4      11.3             8.4
Dutch                        9.6     1.9     0.6          -         2.4              6.4
Arabic/Persian               7.4     3.2     5.3          -         1.2              5.7
Portuguese (inc. Creole)     2.1     0.4     -            -         -                1.4
English                      2.0     0.2     -            -         -                1.2
Chinese languages            1.0     0.4     -            -         -                0.7
Languages of Sumatra         0.3     0.2     1.2          -         0.8              0.4
Unidentified source          0.8     0.2     -            -         -                0.5
Miscellaneous languages      0.6     -       -            -         -                0.4
Total loanwords              43.7    17.2    24.8         15.4      19.8             34.0
Non-loanwords                56.3    82.8    75.2         84.6      80.2             66.0
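The arithmetic behind Table 4 can be checked with a short script. The sketch below is illustrative only: the figures are transcribed from the table, the placement of values in sparsely filled rows is an assumption inferred from the column totals, and the names `DONORS` and `column_sums` are hypothetical.

```python
# Percentages transcribed from Table 4; None marks an empty cell.
# Assumption: in rows with few values, cells were assigned to columns so that
# the column sums agree with the printed "Total loanwords" row.
COLUMNS = ["Nouns", "Verbs", "Adjectives", "Adverbs", "Function words", "All words"]

DONORS = {
    "Languages of the Java area": [10.0, 6.8, 10.9, None, 4.0, 8.9],
    "Languages of India":         [10.1, 3.8, 6.8, 15.4, 11.3, 8.4],
    "Dutch":                      [9.6, 1.9, 0.6, None, 2.4, 6.4],
    "Arabic/Persian":             [7.4, 3.2, 5.3, None, 1.2, 5.7],
    "Portuguese (inc. Creole)":   [2.1, 0.4, None, None, None, 1.4],
    "English":                    [2.0, 0.2, None, None, None, 1.2],
    "Chinese languages":          [1.0, 0.4, None, None, None, 0.7],
    "Languages of Sumatra":       [0.3, 0.2, 1.2, None, 0.8, 0.4],
    "Unidentified source":        [0.8, 0.2, None, None, None, 0.5],
    "Miscellaneous languages":    [0.6, None, None, None, None, 0.4],
}

TOTAL_LOANWORDS = [43.7, 17.2, 24.8, 15.4, 19.8, 34.0]
NON_LOANWORDS = [56.3, 82.8, 75.2, 84.6, 80.2, 66.0]


def column_sums(rows):
    """Add up each column over all donor groups, skipping empty cells."""
    sums = [0.0] * len(COLUMNS)
    for values in rows.values():
        for i, value in enumerate(values):
            if value is not None:
                sums[i] += value
    return [round(s, 1) for s in sums]


if __name__ == "__main__":
    for name, total, printed in zip(COLUMNS, column_sums(DONORS), TOTAL_LOANWORDS):
        print(f"{name:15s} sum of donors = {total:5.1f}   printed total = {printed:5.1f}")
```

The recomputed sums match the printed totals to within 0.2 percentage points, which is consistent with rounding of the individual cells, and each pair of loanword and non-loanword percentages adds up to 100.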

Two findings that stand out require some explanation. The first is that languages of India contributed a high proportion (15.4%) of adverbs, while no other language contributed any adverbs at all. This is easily explained by the fact that the LWT meaning list only contains 7 meanings classified as ‘adverbs’. While just one of the words corresponding to an adverbial meaning is of probable Sanskrit origin (laju ‘fast’ from Sanskrit laghu ‘light, swift, quick’), it is enough to appear as significant.

The second striking finding is the high proportion of borrowed function words. Nearly one in five function words in the Indonesian database is a loanword, a high proportion for a category into which borrowing is considered difficult. However, a detailed examination of the data reveals that most of the source words did not constitute function words in the donor languages. They were probably borrowed as content words and underwent grammaticalization later. Thus saya ‘I’ derives from the Sanskrit noun sahaya ‘companion’. It was first borrowed into Malay in the sense of ‘royal companion’, and was also used for self-reference by some court officials when addressing a monarch. Later, use of the word expanded to general polite self-reference, and today saya functions as a first person pronoun, its original meaning as a noun having been lost. In cases where there is no evidence that the word was ever used as a content word in Indonesian, it is also possible that grammaticalization was part of the borrowing process. Examples include bahwa ‘complementizer’ (< Sanskrit bhāva ‘being, state’), bila ‘when, if’ (< Sanskrit velā ‘time’), and karena ‘because’ (< Sanskrit kāraṇa ‘cause’), which are only attested in Malay-Indonesian as function words, never with the nominal meanings of the source words.

¹³ With inherited bases, these prefixes behave much less regularly.

4.3. Semantic fields

Indonesian exhibits considerable variation in the rates of borrowing into different semantic fields, as can be seen in Table 5. Four semantic fields consist mostly of loanwords: Religion and belief (70.0% borrowed vocabulary), the Modern world (66.4%), Clothing and grooming (55.6%), and Law (51.4%). It is fairly obvious why modern world terms would be borrowed, especially in a civilization prone to cultural borrowing. However, why the sphere of religion and belief should be even more susceptible to borrowing requires further explanation. Well over 90% of Indonesia’s population adhere to an introduced monotheistic faith (Islam or Christianity), and many of the remaining population follow other non-indigenous faiths (Buddhism or Hinduism). Only a tiny fraction of the population still adheres to an indigenous religion, officially at least. With the disappearance of indigenous religions, much of the vocabulary associated with them also disappeared, or was replaced by the vocabulary of the introduced religions.

A similar process affected the indigenous, pre-contact clothing of Indonesia, which is now mostly reserved for ceremonial purposes, and in few areas at that. In their daily lives, most Indonesians use clothes patterned after those of India and the West, which were borrowed along with their names: celana ‘trousers’ and topi ‘hat’ from Hindi; keméja ‘(button-down) shirt’ and sepatu ‘shoes’ from Portuguese; rok ‘skirt’ and kaos ‘socks’ from Dutch; daster ‘house dress’ and bot ‘boots’ from English.

Finally, the Indonesian legal system is also based on systems introduced from abroad. In ancient times the sources were India and the Middle East, while the modern Indonesian legal system is based on the Dutch one. Hence the large number of borrowed legal terms.

Table 5: Loanwords in Indonesian by donor languages and semantic fields (percentages)

Donor language groups (columns): Languages of the Java area, Languages of India, Dutch, Arabic/Persian, Portuguese (inc. Creole), English, Chinese languages, Languages of Sumatra, Unidentified source, Miscellaneous languages.

No.   Semantic field                     Total loanwords   Total non-loanwords
1     The physical world                 33.1              66.9
2     Kinship                            42.4              57.6
3     Animals                            39.3              60.7
4     The body                           24.2              75.8
5     Food and drink                     37.7              62.3
6     Clothing and grooming              55.6              44.4
7     The house                          42.1              57.9
8     Agriculture and vegetation         36.8              63.2
9     Basic actions and technology       25.1              74.9
10    Motion                             19.0              81.0
11    Possession                         34.4              65.6
12    Spatial relations                  15.2              84.8
13    Quantity                           18.4              81.6
14    Time                               38.5              61.5
15    Sense perception                   16.7              83.3
16    Emotions and values                31.6              68.4
17    Cognition                          41.1              58.9
18    Speech and language                32.5              67.5
19    Social and political relations     30.5              69.5
20    Warfare and hunting                33.8              66.2
21    Law                                51.4              48.6
22    Religion and belief                70.0              30.0
23    Modern world                       66.4              33.6
24    Miscellaneous function words       16.3              83.7
      All words                          34.0              66.0

All words, by donor group: Languages of the Java area 8.9; Languages of India 8.4; Dutch 6.4; Arabic/Persian 5.7; Portuguese (inc. Creole) 1.4; English 1.2; Chinese languages 0.7; Languages of Sumatra 0.4; Unidentified source 0.5; Miscellaneous languages 0.4.

[Per-field percentages for the individual donor groups are not legible in this copy.]

Equally interesting are the semantic fields least affected by borrowing. In the case of Indonesian, these are Spatial relations (15.2% borrowed), Sense perception (16.7%), Quantity (18.4%), and Motion (19.0%).¹⁴ Apparently Indonesian speakers and their predecessors felt no objective necessity to borrow extensively into these fields, because these fields already contained most of the words necessary to convey the required concepts. Nor did they have motivation for copious cultural borrowing here, because many of the concepts in these fields are relatively culture-free.
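The selection of the most and least affected semantic fields can be reproduced from the Total loanwords figures of Table 5. The following is a minimal sketch, not part of the original study: the dictionary is transcribed from the table, and the function names and the 50% threshold parameter are illustrative assumptions. Miscellaneous function words are excluded from the low end, as in the discussion above.

```python
# Total loanword percentages per semantic field, transcribed from Table 5.
TOTAL_LOANWORDS = {
    "The physical world": 33.1, "Kinship": 42.4, "Animals": 39.3,
    "The body": 24.2, "Food and drink": 37.7, "Clothing and grooming": 55.6,
    "The house": 42.1, "Agriculture and vegetation": 36.8,
    "Basic actions and technology": 25.1, "Motion": 19.0, "Possession": 34.4,
    "Spatial relations": 15.2, "Quantity": 18.4, "Time": 38.5,
    "Sense perception": 16.7, "Emotions and values": 31.6, "Cognition": 41.1,
    "Speech and language": 32.5, "Social and political relations": 30.5,
    "Warfare and hunting": 33.8, "Law": 51.4, "Religion and belief": 70.0,
    "Modern world": 66.4, "Miscellaneous function words": 16.3,
}


def majority_borrowed(rates, threshold=50.0):
    """Fields in which more than `threshold` percent of words are loanwords."""
    hits = [(field, rate) for field, rate in rates.items() if rate > threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


def least_borrowed(rates, n=4, exclude=("Miscellaneous function words",)):
    """The n fields with the lowest loanword rate, skipping excluded fields."""
    kept = [(field, rate) for field, rate in rates.items() if field not in exclude]
    return sorted(kept, key=lambda item: item[1])[:n]


if __name__ == "__main__":
    print("Mostly loanwords:", majority_borrowed(TOTAL_LOANWORDS))
    print("Least affected:  ", least_borrowed(TOTAL_LOANWORDS))
```

Run on these figures, the first function returns Religion and belief, Modern world, Clothing and grooming, and Law, and the second returns Spatial relations, Sense perception, Quantity, and Motion, matching the fields singled out in the text.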

5. Integration of loanwords

5.1. Morphological integration

Most words borrowed into Malay-Indonesian did not undergo any morphological integration, because none was necessary. The language has little inflection, and most roots may occur as words without any modification. As already mentioned (§4.2), loan verbs constitute a notable exception, in that their citation forms must contain a verbal prefix. For example, the English noun access was borrowed into Indonesian as akses without any morphological modification. However, all citation forms of verbs deriving from the English verb (to) access contain a prefix, e.g. mengakses ‘to access’, mengakseskan ‘to access on someone else’s behalf’, berakses ‘to have access’, and terakses ‘to be accessed / accessible’. Certain types of complex words had a fixed pattern of integration into Indonesian. In contemporary Indonesian this process principally involves two types of borrowed English nouns. The first consists of abstract nouns whose source forms in English end in -ation or -ization. The integration pattern of these words is based on an earlier pattern of borrowing similar Dutch words ending in -atie [asi] and -isatie [isasi]. For example Dutch proclamatie ‘proclamation’ was borrowed as Indonesian proklamasi, and Dutch modernisatie ‘modernization’ was borrowed as Indonesian modernisasi. The same integration pattern is now applied to English loanwords which end in -(iz)ation. Thus English stagflation was borrowed as stagflasi and English globalization was borrowed as globalisasi. The second type is also based on an earlier pattern of borrowing abstract nouns from Dutch, in this case those ending in -iteit, but with an added twist. Such Dutch words were initially integrated into Indonesian with the ending -itet or -iteit, e.g. Indonesian kualitet or kualiteit ‘quality’ from Dutch kwaliteit. After independence, the ending -ite(i)t was viewed as too Dutch-sounding by Indonesia’s language planners, who replaced it with the Latin ending from which it ultimately derived, -itas. 
Thus in contemporary Indonesian, the word for ‘quality’ is kualitas. This pattern is now used to integrate English words ending with -ity, e.g. integritas from integrity. The elements -(is)asi and -itas are not phonological adaptations of English -(iz)ation and -ity, but result from the mechanical application of an established integration pattern. The elements -isasi and -itas have been so well integrated into Indonesian that they are used productively to derive new words; see §6.2 below.

¹⁴ Function words are not included in this discussion because the semantic field Miscellaneous function words only contains items not already included in one of the other semantic fields. A more precise count is based on semantic word class, see §4.2.

5.2. Phonological integration

At its earlier stages, Malay-Indonesian had a relatively small inventory of phonemes as well as a restrictive syllable structure, so borrowing words from other languages often necessitated considerable phonological integration (see Tadmor 2007: 304–308). Initially, loanwords from any language were assimilated to the existing phonological structure. For example, Sanskrit śīghra [çiːgʰra] ‘quick’ was borrowed as Malay segera [səgəra] ‘immediate’. Sanskrit ś was represented by the closest Malay phoneme, s; gh was likewise replaced by the closest Malay phoneme, g; vowel length was disregarded, since it is not distinctive in Malay; since Malay did not allow clusters, a schwa was inserted between the g and r; this resulted in a trisyllabic word, so the initial vowel i was reduced to schwa as required by early Malay phonology (antepenult reduction). Other processes which affected loanwords (some of which are still productive) include the following:

• Devoicing of final voiced consonants (e.g. masjid (/masɟit/) ‘mosque’ < Arabic masjid, sebab (/səbap/) ‘reason’ < Arabic sabab);
• Reduction of complex vocalic nuclei of closed syllables by monophthongization (especially in polysyllabic words), e.g. héran ‘surprised’ < Arabic ḥairān, tobat ‘repentance’ < Arabic taubat, or by turning the diphthong into two syllables with hiatus (especially in monosyllables), e.g. kaos ‘socks’ < Dutch kous /kaws/, wain ‘wine’ < Dutch wijn or English wine /wajn/;
• Assimilation of the place of articulation of nasals to that of following stops (e.g. mungkin ‘maybe’ < Arabic mumkin, amplop ‘envelope’ < Dutch enveloppe);
• Reduction of some final clusters by schwa epenthesis (in loanwords from European languages, when both consonants are sonorants; e.g. filem ‘film’ < Dutch film, modéren ‘modern’ < Dutch modern), or by deleting the second consonant (in loanwords from European languages, when the two consonants are not both sonorants, e.g. arsiték ‘architect’ < Dutch architect, ban ‘tire’ < Dutch band), or echo-vowel epenthesis (in Arabic loanwords, e.g. subuh ‘dawn’ < Arabic ṣubḥ, jisim ‘body’ < Arabic jism, rajam ‘stoning’ < Arabic rajm).

Due to extensive borrowing from languages with different phonemic inventories and phonotactics, the phonology of Malay-Indonesian underwent changes (see §6.1) which now allow borrowed morphemes to be adopted into Indonesian with far fewer modifications than before.
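Several of the integration patterns described in §5.1 and §5.2 are regular enough to be written as string rewrites. The sketch below is a rough illustration under simplifying assumptions, not a model of Indonesian phonology: it operates on plain lower-case spellings without diacritics, it does not handle spelling adaptation (e.g. qu > ku), and all function and table names are hypothetical.

```python
import re

# Morphological integration (section 5.1): English endings are replaced by the
# Dutch-based Indonesian endings. Longer endings are tried first.
SUFFIX_PATTERNS = [("ization", "isasi"), ("ation", "asi"), ("ity", "itas")]


def swap_suffix(english_word):
    """Replace -ization/-ation/-ity with -isasi/-asi/-itas."""
    for english, indonesian in SUFFIX_PATTERNS:
        if english_word.endswith(english):
            return english_word[: -len(english)] + indonesian
    return english_word


# Phonological integration (section 5.2).
FINAL_DEVOICING = {"b": "p", "d": "t", "g": "k"}
# 'g' is left out of the following-stop set because "ng" is a digraph in the
# spelling (assumption made to keep the sketch simple).
NASAL_FOR_STOP = {"p": "m", "b": "m", "t": "n", "d": "n", "k": "ng"}
VOWELS = set("aeiou")
SONORANTS = set("mnlr")


def devoice_final(word):
    """Devoice a word-final voiced stop (pronunciation, not spelling)."""
    if word and word[-1] in FINAL_DEVOICING:
        return word[:-1] + FINAL_DEVOICING[word[-1]]
    return word


def assimilate_nasal(word):
    """Give a nasal the place of articulation of the stop that follows it."""
    return re.sub(
        r"(ng|m|n)([pbtdk])",
        lambda m: NASAL_FOR_STOP[m.group(2)] + m.group(2),
        word,
    )


def reduce_final_cluster(word, source="european"):
    """Break up or simplify a word-final two-consonant cluster."""
    if len(word) < 3 or word[-1] in VOWELS or word[-2] in VOWELS:
        return word  # no final cluster
    first, second = word[-2], word[-1]
    if source == "arabic":
        # Echo-vowel epenthesis: copy the nearest preceding vowel.
        echo = next((ch for ch in reversed(word[:-2]) if ch in VOWELS), "")
        return word[:-1] + echo + second
    if first in SONORANTS and second in SONORANTS:
        return word[:-1] + "e" + second  # schwa epenthesis, spelled e
    return word[:-1]  # delete the second consonant


if __name__ == "__main__":
    print(swap_suffix("globalization"), swap_suffix("integrity"))
    print(devoice_final("sabab"), assimilate_nasal("mumkin"))
    print(reduce_final_cluster("film"), reduce_final_cluster("band"))
    print(reduce_final_cluster("jism", source="arabic"))
```

Each rule is kept as a separate function because the text treats them as independent processes, some of which are no longer productive; real loanwords typically undergo several of them together with spelling changes that this sketch leaves out.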


Uri Tadmor

6. Structural borrowing

This section contains a brief overview of structural borrowing in Indonesian. For a more detailed discussion, see Tadmor (2007).

6.1. Phonological borrowing

When borrowing from a particular language was extensive (as was the case with words from Sanskrit, Arabic, and Dutch), this eventually led to changes in phonotactics, and even to the introduction of new phonemes. For example, in older stages of Malay, the semivowels w and y were allophones of the vowels u and i. Under the influence of loanwords, they have fully phonemicized. In addition, Indonesian has several loan phonemes which were borrowed outright, such as f and ç (spelled sy). Even more than the inventory of phonemes, the syllable structure of Indonesian has been profoundly affected by borrowing. In Proto Malayic, the syllable shape was (C)V(C). Due to massive lexical borrowing, Indonesian now allows up to three consonants in the onset and coda, so that the syllable shape is (C)(C)(C)V(C)(C)(C).

6.2. Morphosyntactic borrowing

6.2.1. Borrowed bound morphemes

It is difficult to establish clear-cut criteria in Indonesian for distinguishing between affixes and clitics, and it is equally difficult to distinguish between inflection and derivation. Clitics and affixes are therefore treated together here under the cover term bound morphemes. Nevertheless, this does not mean that all are equally borrowable, and some generalizations can be made regarding the borrowability of different types of bound morphemes. Generally speaking, bound morphemes which are more affix-like and whose function is more grammatical are less borrowable, while bound morphemes which are more clitic-like and whose function is more semantic are more borrowable. Thus meng- (which forms active verbs), -i (which derives transitive verbs), and -an (which derives nouns) are definitely not borrowed.

At the other end of the spectrum are “pseudo-affixes”, bound morphemes whose function is semantic, which do not affect the base’s syntactic properties, and which are as borrowable in Indonesian as free content morphemes (words). These are usually (though not always) written as separate words, unlike the true affixes mentioned above, which are never written as separate words, even by uneducated speakers. This may reflect a perception of true affixes as part of the same word and of pseudo-affixes as separate words (even though they do not occur in isolation). Examples of borrowed pseudo-affixes in Indonesian are non- (< Dutch / English, e.g. non pemerintah ‘nongovernmental’), ekstra- (< Dutch / English, e.g. ekstra ketat ‘extremely strict’), super- (< Dutch / English, e.g. super murah ‘super-cheap’), maha- (< Sanskrit, e.g. maha penting ‘extremely important’), pra- (< Sanskrit, e.g. prabayar ‘prepaid’), and pasca- (< Sanskrit, e.g. pasca perang ‘post-war’). In all of these examples, the “pseudo-affix” is borrowed but the base is not.

A few class-changing morphemes were borrowed into Indonesian. Two of them have already been discussed (§5.1): the Latinate affixes -(is)asi and -itas, which derive abstract nouns and became productive in Indonesian after extensive lexical borrowing from Dutch. These affixes are used to derive new words in Indonesian, although still mostly from loanword bases (interestingly including loanwords of non-European origin). The suffix -(is)asi was used, for example, to derive the word swastanisasi ‘privatization’ from the Sanskrit-derived base swasta ‘private’ (the -n- is epenthetic); the suffix -itas was used to derive koneksitas ‘connectivity’ from the base koneksi ‘connection’ (the word was coined in Indonesian and does not occur in Dutch); the suffix -isme was used to derive the word koncoisme ‘cronyism’ (the base konco is from Javanese).

In addition to these Dutch-derived, ultimately Latin affixes, Indonesian also borrowed a few affixes from other languages. The suffix -awi, of Arabic origin, derives adjectives from nouns, such as manusiawi ‘humane’ (from the base manusia ‘human being’, of Sanskrit origin) and geréjawi ‘ecclesiastical’ (from the base geréja ‘church’, of Portuguese origin). The suffix -wan, of Sanskrit origin, derives agent nouns, e.g. ilmuwan ‘scientist’ (from the base ilmu ‘knowledge’, of Arabic origin) and jomblowan ‘bachelor’ (from the base jomblo ‘unmarried’, of Austronesian origin).

6.2.2. Syntactic borrowing

The syntax of Malay-Indonesian has undergone many changes during its intermittently documented history. However, it is difficult to point to borrowing as the direct cause of specific changes, and it is more probable that a mixture of internal and external factors have been at work. Among features whose origin may be due at least in part to borrowing are: the broadening of the function of pronominal enclitics from purely genitive to accusative (as well as genitive); the development of copulative-like constructions; and the emergence of locative relative clauses. For more details, see Tadmor (2007).

7. Lexical adoption vs. imposition and borrowing through writing vs. borrowing through speech

Some degree of bilingualism is a precondition for linguistic interference, which may result in contact-induced language change, including lexical borrowing. However, in the case of Malay-Indonesian, the principal agents of change were not members of bilingual communities whose languages underwent linguistic interference. Sanskrit has profoundly influenced the lexicon of Malay-Indonesian, yet there was never a bilingual community that used Sanskrit and Malay. We have no reason to assume that more than a small minority of ancient speakers of Malay ever had any knowledge of Sanskrit, and it would not be correct to say that they “spoke” it, because it was used as a literary and liturgical language. However, this minority of scholars and clergy was very influential and constituted the cultural and religious (if not political) elite. Once this elite adopted Sanskrit elements, the rest of society imitated their prestigious speech, thus incorporating Sanskrit words into their language without actually knowing, much less speaking, any Sanskrit.

Similarly, the great majority of Arabic loanwords in Indonesian, whether borrowed directly or via Persian or an Indian language, were adopted from writing, and were initially used by a small elite minority before spreading to the language of the general population.

Dutch words were borrowed principally from speech and not from writing. As already mentioned, only a small minority of Indonesians were ever fluent in Dutch, although many Dutch residents of Indonesia did become fluent in some variety of Malay-Indonesian. Thus the conduits of lexical borrowing from Dutch to Indonesian were members of the Dutch-educated indigenous elite and Dutch speakers of Malay-Indonesian. Both these groups were very small numerically but very influential sociolinguistically.

The cases of Sanskrit and Arabic on the one hand, and of Dutch on the other hand, contrast with a third scenario: indigenous Indonesians borrowing lexical items from various local languages into Malay-Indonesian. The agents of this lexical borrowing were imposing words from their indigenous vocabulary on their second language rather than adopting words from another language into their native language. This view is espoused by van Coetsem (1988) as well as by Thomason & Kaufman (1988), although they used different terminologies to describe it.
In this study all transfer of words from one language to another, regardless of the sociolinguistic circumstances, is subsumed under the cover term “borrowing”. No significant differences were found in Indonesian between the results of lexical imposition (shift-induced lexical change) and lexical adoption (borrowing in a maintenance situation). Both types of lexical transfer have been common in the language’s history, and both have affected various semantic categories and fields in similar ways.

However, two important differences were observed between borrowing through speech and borrowing through writing. On the whole, sound correspondence between source words and loanwords tended to be more regular when the borrowing was through writing. Moreover, loanwords borrowed through speech tended to be of a more colloquial and everyday nature, while loanwords borrowed through writing tended to be more literary or formal. Of course, if enough time has elapsed since the borrowing, such distinctions can get somewhat blurred; colloquial words may work their way up into more literary genres, while literary words may percolate down to more colloquial styles. However, as the findings in Table 6 show, even in the case of the oldest documented borrowing situation (Indianization) the overall distinction is maintained. Each word in the Indonesian database was tagged as “colloquial”, “formal”, or “general”. Table 6 compares loanwords of ultimate Sanskrit or Arabic origin (which were borrowed almost exclusively through writing¹⁵) with loanwords from languages of Java and its environs (specifically Javanese, Balinese, Sundanese, and Madurese) and from Dutch, which were borrowed through speech.

Table 6: Effects of borrowing through writing and through speech (percentages of loans by style)

Type of borrowing   Donor language   Colloquial   General   Formal
Through writing     Arabic           -            79        21
Through writing     Sanskrit         -            64        36
Through speech      Java area        20           71        8
Through speech      Dutch            12           83        4

The difference in the stylistic distribution of loanwords from written and spoken sources is quite striking. The database contains no examples at all of exclusively colloquial loanwords ultimately borrowed from Arabic or Sanskrit, despite their large number and long presence in the language. On the other hand, loanwords originating in local languages of the greater Java area and in Dutch include many purely colloquial words and relatively few formal ones.

¹⁵ A handful of Arabic loanwords, representing a tiny proportion of the total, appear to have been borrowed through speech rather than through writing. An example is the word raib ‘disappear’ (< Arabic ġaib), where the Arabic ġ is represented by Malay-Indonesian r, whereas normally it is represented by g. Another example is the word menara ‘tower’; if borrowed through writing, it would end with -h (or -t).

8. Speakers’ attitudes

Are speakers aware of the origin of the words that they use? Generally speaking, a child acquiring the vocabulary of his first language regards all words as equally “native”. However, as children grow older and acquire more formal education, they may become aware of the fact that some words in their language were borrowed from other languages, and this knowledge may affect their linguistic behavior. Because speakers are more conscious of their vocabulary than they are of their grammar, they have a higher degree of control over their vocabulary than they do over their grammar use (see Tadmor 1995: 37ff, Tadmor 2000). This can have various effects on the use of loanwords. In the case of Indonesian, speakers who wish to emphasize their pious Islamic background may increase their use of Arabic loanwords, while speakers who wish to show their sophisticated worldliness may exaggerate the use of English loanwords. The opposite can also be true: purists or nationalists may consciously avoid using loanwords when speaking Indonesian.

Many loanwords have nothing about their shape to indicate their foreign origin. Others, however, have loan phonemes or other features which mark them as having a foreign origin. Thus, educated speakers of Indonesian know that words containing the consonants /f/ or /z/ are loanwords. They also know that nouns ending in -ah / -at are of Arabic origin, and that adjectives ending in -if are of European (Dutch or English) origin.

As mentioned above, Indonesian speakers productively use several borrowed derivational affixes, such as the abstract noun forming -asi, -isasi, and -itas, and the adjective forming -awi (see §6.2.1). Interestingly, these affixes are used to create new words almost exclusively from loanword bases (regardless of their origin). This is an indication that loanwords constitute a distinct category in the speakers’ minds.

A well-known morphophonological process of Indonesian is the sandhi rule that assimilates the active prefix meng- (and the agentive prefix peng-) to the initial consonant of the base. Interestingly, despite protestations from Pusat Bahasa (Indonesia’s language planning body), speakers often disregard this rule when they are aware that the base is a loanword. This is another indication that loanwords enjoy a special (or different) status in the speakers’ minds. Table 7 presents a few examples of forms produced by speakers, compared to normative forms.¹⁶

Table 7: Standard and nonstandard forms of words with the prefix meng- and a borrowed base

Nonstandard form      Standardized form   Gloss          Base        Origin
mensubsidi            menyubsidi          to subsidize   subsidi     Dutch
mentaati              menaati             to obey        taat        Arabic
memprotés             memrotés            to protest     protés      Dutch / English
mengcopy / mengkopi   mengopi             to copy        copy/kopi   English
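The contrast in Table 7 can be stated as two small rewrite rules. The sketch below covers only the four base-initial consonants illustrated in the table (p, t, k, s); the standardized forms drop the initial consonant after the assimilated nasal, while the nonstandard forms keep it. The function names, the optional `suffix` argument, and the decision to raise an error for other initials are assumptions made for illustration, and unassimilated spellings such as mengcopy are not modeled.

```python
# Nasal that replaces the base-initial consonant in the standardized form.
STANDARD_NASAL = {"p": "m", "t": "n", "k": "ng", "s": "ny"}
# Nasal that precedes the retained consonant in the nonstandard form.
NONSTANDARD_NASAL = {"p": "m", "t": "n", "k": "ng", "s": "n"}


def _check(base):
    """Reject bases outside the scope of this sketch."""
    if not base or base[0] not in STANDARD_NASAL:
        raise ValueError("only bases beginning with p, t, k or s are covered")


def standard_form(base, suffix=""):
    """Normative form: the nasal replaces the initial consonant."""
    _check(base)
    return "me" + STANDARD_NASAL[base[0]] + base[1:] + suffix


def nonstandard_form(base, suffix=""):
    """Form that treats the base as a loanword: the consonant is kept."""
    _check(base)
    return "me" + NONSTANDARD_NASAL[base[0]] + base + suffix


if __name__ == "__main__":
    for base, suffix in [("subsidi", ""), ("taat", "i"), ("protés", ""), ("kopi", "")]:
        print(nonstandard_form(base, suffix), standard_form(base, suffix))
```

For the bases in Table 7 this yields the pairs mensubsidi / menyubsidi, mentaati / menaati, memprotés / memrotés, and mengkopi / mengopi.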

Orthography is another area where knowledge of a word’s origin can come into play. In Indonesian, voiced consonants do not occur in final position. Loanwords which end in a voiced consonant are invariably assimilated into Indonesian by undergoing final devoicing. However, their spelling often reflects the final voiced consonant of the source word, for example in tertib [tərtip] ‘orderly’ (< Javanese), masjid [masɟit] ‘mosque’ (< Arabic), uleg [ulək] ‘beat in a mortar’ (< Javanese), and iméj [imec] ‘image’ (< English).

¹⁶ It should be noted that older, well-assimilated loanwords are no longer perceived as borrowed and therefore do undergo the expected sandhi rule. Moreover, words from local languages in which similar rules operate also undergo them after being borrowed into Indonesian.

9. Conclusion

About one third of the words in the Indonesian database were identified as probable or certain loanwords. This relatively high proportion is explained by the long history of contacts between speakers of Malay-Indonesian and other cultures, as well as by the long-standing role of Malay-Indonesian as a regional lingua franca. Specific historical processes, such as Indianization, Islamization, colonization, and globalization, have all resulted in considerable lexical borrowing into Malay-Indonesian.

Previous studies have pointed out the importance of distinguishing between imposition and adoption. The present study did not find significant differences between the impacts of these two different processes on the lexicon of Indonesian. However, some important differences between the results of borrowing through speech and borrowing through writing were observed. An area that has not been satisfactorily investigated yet, and about which some ideas are presented here, is speakers’ attitudes and role in contact-induced lexical change. Whether and how speakers perceive loanwords can significantly influence borrowing processes and their linguistic outcomes.

A specific contribution of this study was providing the first systematic discussion of words borrowed into Indonesian from other languages of Southeast Asia, including languages indigenous to the Malay-Indonesian archipelago as well as neighboring languages. As a preliminary effort, it probably contains numerous errors (of omission as well as of commission), and leaves much room for future research. With regard to languages of Europe, the Near East, South Asia, and China, the present study has built upon extensive previous work, and its efforts have been focused on identifying previously unrecognized loanwords, correcting errors in the identification of donor languages and source words, and linking historical events to their linguistic outcomes.

Acknowledgment

The author wishes to express his thanks to Waruno Mahdi, who read an earlier draft of this chapter and provided many helpful comments.

References

Alwi, Hasan (chief ed.). 2002. Kamus Besar Bahasa Indonesia (KBBI) [Unabridged Indonesian Dictionary]. 3rd edn. Jakarta: Pusat Bahasa.
Campbell, Stuart. 1996. The distribution of -at and -ah endings in Malay loanwords from Arabic. Bijdragen tot de Taal-, Land- en Volkenkunde 152:23–44.
de Casparis, J. G. 1997. Sanskrit Loan-words in Indonesian. Nusa 41 (monograph). Jakarta: Atma Jaya University.
Echols, John M. & Shadily, Hassan. 1975. Kamus Inggris-Indonesia: An English-Indonesian Dictionary. Ithaca: Cornell University Press / Jakarta: Gramedia.
Echols, John M. & Shadily, Hassan. 1998. Kamus Indonesia-Inggris: An Indonesian-English Dictionary. 3rd edn. Ithaca: Cornell University Press / Jakarta: Gramedia.
Gonda, Jan. 1952. Sanskrit in Indonesia. Den Haag: Oriental Bookshop.
Grijns, D. J. & de Vries, J. W. & Santa Maria, L. 1983. European Loan-words in Indonesian. Indonesian Etymological Project 5. Leiden: KITLV.
Johns, A. H. & Prentice, D. J. (eds.-in-chief). 1992. Kamus Inggris-Melayu Dewan: An English-Malay Dictionary. Kuala Lumpur: Dewan Bahasa dan Pustaka.
Jones, Russell. 1978. Arabic Loan-words in Indonesian. Indonesian Etymological Dictionary 3. London: School of Oriental and African Studies.
Jones, Russell. 1984. Loan-Words in Contemporary Indonesian. Nusa 19:1–38.
Jones, Russell. 2007. Loan-words in Indonesian and Malay. Leiden: KITLV Press.
Kamus Besar Bahasa Indonesia. 2002: see Alwi (2002).
Kamus Dewan. 1991: see Othman bin Sheikh Salim (1991).
Kamus Inggris-Melayu Dewan. 1992: see Johns & Prentice (1992).
Meuleman, Johan H. 2005. History of Islam in Southeast Asia: Some questions and debates. In Nathan, K. S. & Kamali, Muhammad Hashiun (eds.), Islam in Southeast Asia: Political, Social and Strategic Challenges for the 21st Century, 2nd edn. 22–44. Singapore: Institute of Southeast Asian Studies.
Mu’jizah. 2009. Iluminasi dalam Surat-Surat Melayu Abad ke-18 dan ke-19 [Illuminations in Malay letters from the 18th and 19th centuries]. Jakarta: Kepustakaan Populer Gramedia, École française d’Extrême-Orient, Pusat Bahasa – Departemen Pendidikan Nasional / KITLV-Jakarta.
Othman bin Sheikh Salim, Sheik (chief ed.). 1991. Kamus Dewan: Edisi Baru [Dewan dictionary: New edition].
Stevens, Alan M. & Schmidgall-Tellings, A. Ed. 2004. Kamus Lengkap Indonesia-Inggris: A Comprehensive Indonesian-English Dictionary. Athens: Ohio University Press / Bandung: Mizan.
Tadmor, Uri. 1995. Language Contact and Systemic Restructuring: The Malay Dialect of Nonthaburi, Central Thailand. Ph.D. dissertation. University of Hawaii.
Tadmor, Uri. 2000. Can speakers control contact-induced language change? Paper presented at Sociolinguistics Symposium 2000, University of the West of England, Bristol, 27th–29th April, 2000.
Tadmor, Uri. 2007. Grammatical borrowing in Indonesian. In Matras, Yaron & Sakel, Jeanette (eds.), Grammatical Borrowing in Cross-Linguistic Perspective, 301–328. Berlin: Mouton.
van Coetsem, Frans. 1988. Loan phonology and the two transfer types in language contact. Dordrecht: Foris.
van Dam, Nikolaos. 2009. Arabic loan-words in Indonesian revisited. Unpublished manuscript. Jakarta.
Wilkinson, R. J. 1959. A Malay-English Dictionary (Romanised). 2 vols. London: MacMillan & Co.
Yule, Henry & Burnell, A. C. 1903. Hobson-Jobson: A Glossary of Colloquial Anglo-Indian Words and Phrases, and of Kindred Terms, Etymological, Historical, Geographical and Discursive. Crooke, William (ed.). London: Routledge & Kegan Paul.

Loanword Appendix

Sanskrit

jagat: world (lit.)
buana: world (lit.)
bumi: land
gua: cave
samudra: sea, ocean (lit.)
gempa: earthquake
angkasa: sky (lit.)
surya: sun (lit.)
candra: moon (lit.)
cahaya: light
udara: air, weather
bayu: wind (lit.)
méga: cloud (lit.)
cuaca: weather
manusia: human being
pria: man, (human) male (formal)
putra: boy, son (formal)
teruna: young man (lit.)
putri: girl, daughter (formal)
suami: husband
istri: wife
janda: widow
saudara: relatives
saya: I
gembala: herdsman
kuda: horse
angsa: goose
rajawali: eagle
(burung) merpati: dove
serigala: wolf, jackal
singa: lion
gajah: elephant
kepala: head
muka: front, face
bahu: shoulder
selesma, selésma: the cold (lit.)
kendi: jug, pitcher
cabé: chili pepper
madu: honey
gula: sugar
busana: clothing, clothes (lit.)
kapas: cotton
sutera: silk
manik-manik: beads
lépa: mortar (substance)
menenggala [meng-tenggala]: to plough/plow (lit.)
biji: seed, grain
belia: youth (lit.)
cemara: conifer, esp. the casuarina
labu: gourd, pumpkin, squash
kerja: work
mencuci [meng-cuci]: to wash
kencana: gold (lit.)
kaca: glass (material)
mengendarai [meng-kendara-i]: to drive (lit.)
marga (1): road (lit.)
benda: thing
memelihara [meng-pelihara]: to preserve
papa (1): poor (lit.)
harga: price
membagi [meng-bagi]: to share, to divide
sisa: remains
utara: north
daksina: south (lit.)
semua: all
segala: all (lit.)
pertama: first
masa: time, period
usia: age (formal)
dini: early (lit.)
segera: immediately, soon
laju: fast
mula: beginning
sedia: ready (lit.)
sentiasa, senantiasa: always (lit.)
berasa [ber-rasa]: to taste (intr.)
suara: sound, noise, voice
sunyi: quiet
jiwa: soul, spirit
celaka: bad luck
bahagia: happy, content
gembira: glad, happy
cinta: to love
murka: (monarch’s) anger (lit.)
bahaya: danger
setia: faithful
berdusta [ber-dusta]: to lie, tell a lie (lit.)
puji: praise
loba: greedy (lit.)
pandai: clever
percaya: to believe
menerka [meng-terka]: to guess (lit.)
bijaksana: wise
guru: teacher
rahasia: secret
menyangka [meng-sangka]: to suspect
cara: manner
karena: because
atau: or
bila: when? (lit.)
berbicara [ber-bicara]: to speak, talk

bahasa: language
kata: word
nama: name
membaca [meng-baca]: to read
negara: country
kota: town
raja: king
setru: enemy (lit.)
berkelahi [ber-kelahi]: to fight
tentara: army, soldier
senjata: weapons
jala: fishnet
saksi: witness
pidana: punishment (legal)
denda: fine
penjara: prison
agama: religion
déwa: god
pendéta: priest
suci: holy
berpuasa [ber-puasa]: to fast
surga: heaven
neraka: hell
berhala: idol
bidadari: fairy
menteri: minister
angka: number, digit
sama: same

Hindi-Urdu

unta, onta: camel
roti: bread
celana: trousers
topi: hat, cap
kunci: lock
tembakau: tobacco
tembaga: copper
jam: hour, clock
mencuri [meng-curi]: to steal

Tamil

badai: storm
keledai: donkey
mengacu [meng-acu]: to cast (metal) (lit.)
kapal: ship
pasar: market
kedai: shop, store
cuma: only
teman: friend
handai: friend (lit.)
taulan: friend (lit.)
bedil: gun
perisai: shield (lit.)
nelayan: fisherman
kuil: temple

Telugu (?)

tiga: three

Dutch

és: ice
papa (2), papi: father
mama, mami: mother
opa: grandfather
oma: grandmother
om, oom: uncle
tante: aunt
famili: relatives
kelinci: rabbit, hare
kangguru: kangaroo
léver: (human) liver (col.)
maag: stomach (col.)
pénis: penis (lit.)
vagina: vagina (lit.)
dokter: physician
open: oven
gelas: cup, glass
tang: tongs
sosis: sausage
sop, sup: soup
bir: beer
wol: wool
linen: linen
katun: cotton
mantel: coat
blus: blouse
kaos: T-shirt
kerah: collar
rok: skirt
kaos (kaki): sock, stocking
péci: hat, cap
(kain) lap: handkerchief, rag
handuk: towel
salep: ointment
kamar: room
selot: lock, latch, door-bolt
kompor: stove
rak: shelf
balok: beam
semén: cement, mortar
kamp: camp
got: ditch
sekop: shovel
havermut: oats
palem: palm tree
mengelap [meng-lap]: to wipe
mengebor [meng-bor]: to bore
lém: glue
karpét: carpet, rug
bumerang: boomerang
menyupir [meng-supir]: to drive
menyetir [meng-setir]: to drive
as: axle
kano: canoe
duit: money
rékening: bill, invoice
bon: bill, check
gaji: wages
puing: remains
huk: (street) corner
nol: zero
massa: crowd
lat: late, to be late
telat: late, to be late
arloji: clock
jelék: bad, ugly
idé: idea
pulpén: pen
trompét: horn, trumpet
komplot: plot
ketapél: sling
senapan, senapang: gun
hélem: helmet
memvonis [meng-vonis]: to condemn, sentence
setrap: punishment (in school)

penalti: penalty (in soccer)
vonis: penalty, (legal) punishment
bui: prison
altar: altar (in a church)
pastor, pastur: priest (Roman Catholic)
radio: radio
télepon, télpon, telpon, télefon, telfon: telephone
mobil: car
bis: bus
listrik: electricity
baterai: battery
mengerém [meng-rém]: to brake
mesin: motor, machine
motor: motor
suster: nurse, nun
pil, pél: pill, tablet
inyéksi: injection
présidén: president
polisi: police
plat (mobil): license plate
nomor: number
pos: post, mail
prangko, perangko: postage stamp
bank: bank (financial institution)
kran, keran: tap, faucet
wastafel: sink
toilét, toalét: toilet
klosét: toilet
WC: toilet
sekrup, sekerup: screw
permén: candy, sweets
plastik: plastic
bom: bomb
béngkél: workshop
rokok: cigarette
koran: newspaper
kalénder: calendar
filem: film, movie
musik: music
kopi: coffee
nihil: nothing (lit.)
silét: razor

English

kormoran: cormorant
(burung) tukan: toucan
oposum: opossum
jaguar: jaguar
tapir: tapir
flu: cold
picer: jug, pitcher
olive: olive
wine: wine
gaun: (woman’s) dress (long)
tunik: (woman’s) dress (short)
daster: (woman’s) dress, robe
(sepatu) bot: boot
tato: tattoo
laso: lasso
koin: coin
membarter [meng-barter]: to barter
bolpoin: ballpoint pen
konspirasi: plot
télevisi: television
TV: television
injéksi: injection

English or Dutch

laguna: lagoon
botol: bottle

Portuguese (incl. Creole)

garpu: fork, pitchfork
kéju: cheese
mentéga: butter
keméja: shirt
sepatu: shoe
saku: pocket
peniti: pin
tuala: towel (lit.)
ténda: tent
jendéla: window
méja: table
terigu: wheat
berdansa [ber-dansa]: to dance
keréta: cart, wagon
roda: wheel
bola: ball
témpo: time
minggu: week
(hari) Minggu: Sunday
sekolah: school
péna: pen
serdadu: soldier (lit.)
geréja: church
lampu: lamp
menyéka [meng-séka]: to wipe
buku: book
kaléng: tin, can
pipa: pipe

Spanish

sabana: savanna

French

sepéda: bicycle

Latin

pinus: pine

Arabic

dunia: world
alam: world
médan: plain, field (lit.)
salju: snow
hawa: weather, air (lit.)
bagal: mule
(burung) nuri: parrot
arnab: rabbit (lit.)
badan: body
wajah: face
rahim: womb
bernafas [ber-nafas]: to breathe
lahir: to be born
hamil: pregnant
mayat: corpse
jasad: corpse (lit.)
jenazah: corpse
kubur, kuburan: grave
makam: grave
kuat: strong
séhat: healthy

beristirahat [ber-istirahat]: to rest
(buah) zaitun: olive
jubah: cloak
jilbab: headband, headdress
sabun: soap
pondok: hut
kémah: tent
kursi: chair
alat: tool
sejedah, sajadah: (prayer) rug
raib: to disappear (mysteriously)
menyelamatkan [meng-selamat-kan]: to rescue, save
miskin: poor
waktu: time, when (conj.)
umur: age
awal: early, beginning
akhir: end (temporal)
tamat: end (temporal, lit.)
fajar: dawn (lit.)
subuh: dawn
(hari) Ahad: Sunday (lit.)
Senin: Monday
Selasa: Tuesday
Rabu: Wednesday
Kamis: Thursday
Jum‘at: Friday
Sabtu: Saturday
musim: season
arwah: soul, spirit
roh: soul, spirit
héran: surprised, astonished
amarah: anger (lit.)
serakah: greedy
tamak: greedy
akal: mind
berpikir [ber-pikir]: to think
murid: pupil
yakin: certain, convinced
maksud: intention, meaning
niat: intention
sebab: cause, because
mengkhianati [meng-khianat-i]: to betray
menjawab [meng-jawab]: to answer
kertas: paper
kalam: pen (lit.)
rakyat: people
jiran: neighbor (lit.)
adat: custom
menara: tower
hukum: law
mahkamah: court
hakim: judge
mendakwa [meng-dakwa]: to accuse (in court)
menghukum [meng-hukum]: to punish
zina, zinah: adultery
masjid, mesjid: mosque
korban: sacrifice, victim
beribadah [ber-ibadah]: to worship
berdoa [ber-doa]: to pray (for something)
sholat: to pray (formulaic, Islamic)
imam: clergyman (chiefly Islamic)
kudus: holy (lit.)
berkhotbah [ber-khotbah]: to preach
sétan: demon
sihir: magic
alamat: address

Persian

domba: sheep
rubah: fox
piring: dish, plate
pinggan: dish
anggur: grape, wine
seluar: trousers
cadar: veil
serban, sorban: headband, headdress
destar: man’s headdress
bandar: ditch, port
gandum: wheat
saudagar: merchant (lit.)
laskar: soldier (lit.)
peri: fairy

Old Javanese

bapa, bapak: father
ibu: mother
meréka: they
jawawut: millet
mengantar [meng-antar]: to bring, take
merusak [meng-rusak]: to damage
warna: color
mengajar [meng-ajar]: to teach
pasti: certain
ratu: queen

Javanese

kali: river, stream, canal
rembulan: moon (lit.)
menyulut [meng-sulut]: to light
bapak mertua: father-in-law
ibu mertua: mother-in-law
wanita: woman
bocah: child (young human)
leluhur: ancestors
caplak: flea
jénggot: beard
bréwok: beard
ketombé: dandruff
kerongkongan: throat
kéték: armpit
payudara: breast (formal)
usus: intestines, guts
cérét: kettle
blangkon: man’s headdress
jagung: maize, corn
singkong: cassava, manioc
jamur: mushroom
paron: anvil
timbal: lead
terpelését [ter-pelését]: to slide, slip
merintih [meng-rintih]: to groan
mengerti: to understand
gagasan: idea
konyol: stupid
enggak, nggak, gak, ga: no (col.)
kapan: when?
warga(negara): citizen
lonté: prostitute

Sundanese

situ: man-made lake
monyét: monkey
kelapa: coconut

Javanese or Sundanese

pesisir: shore
bayi: baby
mertua: parents-in-law
sapi: cattle
banténg: ox
(ayam) jago: cock, rooster
kalong: bat
(burung) bétét: parrot
tawon: bee, wasp
rayap: termites
kéong: snail
kodok: frog
kadal: lizard
jambut: pubic hair
buntut: tail
kontol: penis
pilek: cold
capék, capai: tired
paceklik: famine
panci: pot, pan
séndok: spoon
sarapan: breakfast
adonan: dough
soto: meat soup
kentang: potato
gubuk: hut
gapura: gate (lit.)
gréndél: latch, doorbolt
obor: torch
pacul: spade
membabat [meng-babat]: to mow
panén: harvest
jeruk: citrus fruit
menggebrak [meng-gebrak]: to pound with fist
memencét [meng-pencét]: to press
menétés [meng-tétés]: to drip
menyemprot [meng-semprot]: to splash, spray
mengguyur [meng-guyur]: to splash, spray, drench
berjogét [ber-jogét]: to dance
menggéndong [meng-géndong]: to carry
pojok: corner
separo: half
kecut: unpleasantly sour
énténg: light (of weight)
kagét: surprised, astonished
merangkul [meng-rangkul]: to embrace
pintar: clever
goblok: stupid
bégo: stupid
gampang: easy
gara-gara: because
ngobrol [ng-obrol]: to speak, talk (col.)
mencegah [meng-cegah]: to prevent
prajurit: soldier
bénténg: fortress
romo: (Catholic) priest
kembang gula: candy

Balinese

bianglala: rainbow (lit.)
lindung: freshwater eel
menénténg [meng-ténténg]: to carry in hand
samping: side

Balinese/Javanese/Sundanese

rawa: swamp
paman: uncle
bibi: aunt
keponakan: sibling’s child
duda: widower
céléng: boar
bébék: duck
bunglon: chameleon
iga: rib (of animals)
jempol: thumb
pusar: navel
sabuk: belt
témbok: wall
arit: sickle
mengecor [meng-cor]: to cast
kembang: flower
mengocok [meng-kocok]: to shake
mengungsi [meng-ungsi]: to flee
pajak: tax
garis: line
mirip: similar
soré: afternoon
wangi: fragrant
sepi: quiet
empuk: soft
ngomong [ng-omong]: to speak, talk (col.)
mengoméli [meng-oméli]: to scold
désa: village
taméng: shield
menuding [meng-tuding]: to accuse
tanpa: without

Madurese (modern or earlier)

celurit, clurit: sickle

Minangkabau (modern or earlier)

gadis: girl, young woman
dangkal: shallow
pidato: speech
kalian: you (plural)
kumuh: dirty

Toba Batak

marga: clan

Hokkien Chinese

kecoa: cockroach
téko: kettle, teapot
giwang: earring
képang: plait, braid
cat: paint
uang: money
toko: shop, store
hoki: good luck
sué: bad luck
cabo: prostitute
téh: tea

Mandarin Chinese

cawan: cup (lit.)

Khmer (modern or earlier)

mas, emas: gold
semut: ant
kuk: yoke
dian: (oil) lamp

Unknown origin

binatang: animal
pusut: awl
remaja: youth
permadani: rug (lit.)
nokén: net bag
mengendus [meng-endus]: to sniff
pujangga: poet (lit.)
pesakitan [pe-sakit-an]: captive, prisoner
obéng: screwdriver

Chapter 28

Loanwords in Malagasy*

Alexander Adelaar

1. The language and its speakers

Malagasy is the national language of Madagascar and is spoken by the vast majority of its 14 million inhabitants. It has many dialects, most of which are mutually intelligible. The Malagasy variety spoken by the Merina people in the central highlands (where the capital Tananarivo is situated) is the main dialect and became the basis of official and standard Malagasy. Since the French colonization of Madagascar, Malagasy has always been in competition with French, which remained an official language after Madagascar had become independent. Malagasy was meant to replace French after independence, but as an official language it actually appears to be losing ground to it. This is partly due to the fact that French was the language of higher administration and was more established in all public domains of communication. Another reason is that standard Malagasy is associated too closely with the dialect of the Merina people. The latter had forced other Malagasy regional groups into a reluctant unification prior to French colonization. When the French took control of the island, they continued to use Merina Malagasy at the expense of other dialects. After independence, there was regional resistance against standard Malagasy, which was perceived as a symbol of continued Merina hegemony. In contrast to standard Malagasy, the regional Malagasy varieties are generally in a healthy state of vitality.

Malagasy is a South East Barito language. Its historical homeland is in South East Borneo, and it is most closely related to other South East Barito languages such as Maanyan, Dusun Witu, Dusun Malang and Samihim (Dahl 1951, 1977). The South East Barito group belongs to the (West) Malayo-Polynesian branch of Austronesian languages (see Figure 1).

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Adelaar, Alexander. 2009. Malagasy vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1526 entries.


Proto Austronesian
    Formosan (a)
    Malayo Polynesian (MP)
        (West MP) (b)
            East Barito
                South East Barito
                    Maanyan
                    Samihim
                    Dusun Witu
                    Dusun Malang
                    Malagasy
        Central-East MP
            Central MP (c)
            East MP (d)

(a) I.e., the Austronesian languages of Taiwan, constituting various primary branches of Austronesian (Blust 1999).
(b) West Malayo Polynesian languages are generally spoken in Madagascar as well as in a contiguous area including (but not limited to) the Philippines, Malaysia, Brunei, and West and Central Indonesia. However, their genetic subgroup status remains to be proven, and more research is needed to establish whether they represent one or several branches of Malayo Polynesian.
(c) Central Malayo Polynesian languages are spoken in East Indonesia, in Central Indonesia (on the eastern part of Sumbawa Island and on other islands further east) and East Timor.
(d) East Malayo Polynesian languages are spoken in East Indonesia (in parts of West New Guinea and South Halmaheira), in East Timor, and in Oceania (the so-called Melanesian, Micronesian and Polynesian languages).

Figure 1: The South East Barito languages within the Austronesian language family

2. Sources of data

The Malagasy data in this chapter are based on my own knowledge as well as the information compiled by Webber (1853, 1855), Abinal and Malzac (1970, 1973) and Rajaonarimanana (1995).

The identification of loanwords in Malagasy does not automatically entail the identification of their exact source language. As will be discussed in the next section, some loanwords are adopted from an Austronesian language, but there is no way of telling whether this language is Malay or Javanese; others are clearly Bantu but ambivalent for Swahili or Comorian (some Bantu loanwords may even be neither Swahili nor Comorian); for some European loanwords, the exact source (French or English) cannot be established; Arabic loanwords were usually adopted via Swahili, but there were also other intermediate languages. Moreover, loanwords from a certain language may differ significantly in their phonological structure depending on whether they were adopted before or after the Bantu contact stage.

Information about (Sumatran and Banjar) Malay and Javanese loanwords is based on Adelaar (1989, 1995a, 1995b¹ and 2009). These publications are the first systematic approach to such loanwords. Information about South Sulawesi loanwords is based on Adelaar (1995a) and Adelaar (2009).

There are various studies dealing with Sanskrit loanwords in Malagasy, which are discussed in Dahl (1951: 97). The present inventory is based on Dahl (1951), Bernard-Thierry (1959) and (for some Sanskrit loanwords) my own analysis. It also makes use of Gonda (1973) and Adelaar (1994a) to demonstrate the pathway of these loanwords, which must have entered Malagasy via Malay and (in a few cases) Javanese.

My main source for Bantu loanwords is Dahl (1988), which is basically an updated and English version of Dahl (1954). It is a critical evaluation and inventory of earlier works on the topic. Additional sources that I used are Nurse & Hinnebusch (1993) for Proto Bantu and for Sabaki² vocabulary in general, Sacleux (1939) for Swahili, Ahmed-Chamanga (1992, 1997) for Shingazija, and Lafon (1991) for Shindzuani.³

For Arabic and European loanwords I made extensive use of Dez (1964, 1965, 1967), although in some cases the identification of these loanwords is based on my own analysis or on information found in Webber (1853), Abinal & Malzac (1970) or Rajaonarimanana (1995). Most Arabic and European loanwords have a phonotactic structure which is distinctly post-Bantu contact stage, making them relatively easy to recognize. A recent study of English loanwords is Ratsimandresy (2003).

¹ This is a Malay publication, but the lexical data in it are also accessible in French via the etymological information in Beaujard’s Tanala Malagasy dictionary (1998b).
² Sabaki is the name of a lower-order Bantu subgroup of six closely related languages from the East African littoral, to wit Swahili, Comorian, Pokomo, Mijikenda, Mwani and Elwana (Nurse & Hinnebusch 1993: 4–18).
³ Another inventory of Bantu loanwords is Bergenholz (1991). However, this dictionary tends to overrate the number of Bantu loanwords, and its historical information is often impressionistic.

3. Contact situations

The South East Barito ancestors of the Malagasy people must have begun to leave Borneo in the 7th century CE. Nothing historical is known about their migrations, but multidisciplinary evidence suggests that the link between South East Asia and East Africa was originally established by Malays. The latter must have transported South East Barito speakers to East Africa, using them as subordinates. Presumably, the settlement of Madagascar (from the 8th century CE onwards) was not directly from Southeast Asia but via the African mainland, where the South East Barito speakers stayed and mixed with Africans before they finally settled in Madagascar.

Although Malagasy is clearly a South East Barito language, the Asian ancestors of the present-day Malagasy may conceivably have been of a multi-ethnic constitution. There is a clear Malay lexical superstratum in Malagasy, and there are also Javanese and South Sulawesi loanwords. Initially, there may have been a group of South East Barito speakers which was under Malay administration and was reinforced later on by other Indonesians who had to adapt to the language of the nuclear group.

Map 1: Geographical setting of Malagasy

The main sources for lexical borrowing into Malagasy are Sumatra Malay, Banjar Malay, Javanese, Sanskrit, South Sulawesi, Swahili, other Bantu languages, Arabic, English and French. Based on the phonological developments they exhibit, these sources roughly belong to two periods: before and during the Bantu contact stage, and after the Bantu contact stage. The Bantu contact stage is a relative time reference for lexical influence. It is an adaptation of the “Bantu substratum” that Dahl (1954, 1988) inferred in Malagasy and Comorian linguistic history. It differs from Dahl’s Bantu substratum in that it is not limited to Bantu influence but applies to all linguistic influences and does not assume the presence of a “substratum” (see further §3.5).

Lexical influence preceding the Bantu contact stage (that is, roughly before the 8th century CE) came from the following sources:

• Sumatra Malay (mostly pre-Bantu contact stage, but some loanwords of later date)
• Banjar Malay
• Javanese
• South Sulawesi

The Bantu contact stage must have taken place in the 7th–8th century CE and brought in early Bantu influence.

Lexical influence after the Bantu contact stage (that is, from some time after the 8th century CE onwards) has the following main sources:

• Swahili, Comorian languages (8th [?] to 19th centuries; ongoing in regional areas?)
• Sumatra Malay (only a few loanwords, between 8th and 16th centuries [?])
• Arabic (12th [?] to 19th centuries; ongoing in regional areas?)
• English (mostly in the 19th century but probably ongoing)
• French (since the late 19th century)

The Bantu contact stage is a major chronological dividing line in the history of Malagasy. Borrowing of Austronesian loanwords basically took place before the Bantu contact stage, and it is not possible to make any further chronological ordering of Austronesian donor languages according to the period of their influence on Malagasy. On the other hand, there are a handful of Malay loanwords that were probably borrowed after the Bantu contact stage, showing that Malay influence must have been sustained over a long period and continued until after the waning of Javanese and South Sulawesi influence.

Sanskrit loanwords are also pre-Bantu contact stage. They were borrowed into Malagasy via Malay and Javanese and hence cannot be older than Malay and Javanese loanwords in general. They are treated as a sub-category of the latter.

In this section, I distinguish the following broad contact situations: pre-Bantu contact stage (§3.1), Bantu influence (§3.2), Arabic influence (§3.3) and European influence (§3.4).

3.1. Pre-Bantu contact stage

This period was basically pre-historical. There are no historical records, and much of what can be inferred about the setting in which borrowing took place is based on the lexical data themselves.

3.1.1. Malay loanwords

Among the identifiable loanwords from this period, the ones from Malay are by far the most numerous. Two semantic fields are particularly well represented: the maritime domain and the domain of body-part terms. Words belonging to the first domain include:

• shipping terms: sambo⁴ ‘boat, vessel’ < Old Malay sāmvaw ‘vessel’; North Malagasy tampika ‘outrigger’, cf. Brunei Malay sa-tampik ‘on one side’;
• winds: rivotra ‘wind’ < (aŋin) ribut ‘storm wind’; varatraza (Betsimisaraka) ‘south wind’ < barat daya ‘Southwest’; tsimilotru (Betsimisaraka) ‘north wind’ < timur laut ‘Northeast’;
• cardinal directions: a/varatra ‘North’ < barat ‘West’⁵; sagary ‘a northeast wind’ < Malay səgara ‘sea’ < Sanskrit sāgara- ‘the ocean’;
• the sea, the beach and the river: tanjona ‘cape, promontory’ < tanjuŋ id.; hoala (North) ‘bay, inlet’ < kuala ‘river mouth’; nosy ‘island’ < Malay (or Javanese) nusa id.; ranto ‘1. go trading to faraway places or countries; 2. product of such trading’ < rantaw ‘1. reach of a river; 2. go abroad for trading’; harana ‘coral-reef, coral-rock’ < karaŋ id.; hara ‘mother-of-pearl’ < karah ‘patchy in coloring (tortoise-shell)’; vatoharanana, vatokaranana ‘quartz’ < batu karaŋ ‘coral rock’⁶; fasika, fasina ‘sand’ < pasir ‘sand; beach’;
• names of fish and other sea animals: horita ‘octopus’ < gurita id.; fano ‘turtle’ < Banjarese Malay panyu id.; trozona ‘whale’ < duyuŋ ‘sea cow’; tona ‘large nocturnal snake; enormous eel’ < tuna ‘a mud-snake or eel with yellowish body’; vontana (North) ‘k.o. fish’ < (ikan) buntal ‘box-fish, globe-fish or sea-porcupine’; North Malagasy vidy ‘k.o. small fish’ < Malay (ikan) bilis ‘anchovy, Macassar redfish; small fish, esp. Stolephorus spp.’.

Examples of words belonging to the domain of body-parts include: voavitsi ‘calf of leg’ < buah bətis id.; molotra ‘upper lip’ < mulut ‘mouth’; tsofina ‘outer ear’ < cupiŋ ‘lobe (e.g., of ear)’; valahana ‘loins’ < bəlakaŋ ‘back; space behind’; Bara Malagasy haranka ‘skeleton’ < kəraŋka id.; tratra ‘chest’ < dada id.; tanana ‘hand’ < taŋan id.; hihi ‘gum’ (dialectally ‘teeth’) < gigi ‘teeth’.

The Malagasy words in the above word pairs must be borrowed from Malay because of the irregular sound changes they exhibit (such as Malagasy ts, tr and r corresponding to c, d and r respectively in Malay), and because Proto Austronesian, Proto Malayo-Polynesian and Proto South East Barito have other well-attested etyma denoting the body-parts in question.

Many other loanwords do not occupy such well-defined semantic fields, although they do give a clear picture of the great historical impact of Malay on early Malagasy civilization.

⁴ Note that in Standard Malagasy, ‘o’ stands for a [u] sound, and ‘y’ for an [i] in word-final position.
⁵ These terms are related through their historical meaning ‘(direction of the) wet monsoon’: the wet monsoon is a western wind in Sumatra, but a northern wind in Madagascar (Dahl 1951: 326).
⁶ The Malagasy forms must be derived from *batu *karang + (a locative nominal suffix) *-an.

3.1.2. Banjar Malay

Most Malay loanwords are representative of Sumatra Malay, where the oldest known Malay kingdom, Srivijaya, was established in at least the 7th century. However, before the Bantu contact period, when the Asian ancestors of the Malagasy were still in Borneo, the Malays had apparently already settled in South Borneo long enough to develop their own dialect, as some Malay loanwords appear to reflect Banjar Malay. The diagnostic device to identify Banjar Malay loanwords is their Malagasy reflex of an earlier Malayic penultimate schwa, which remained ə in Standard Malay but became a in Banjar Malay. For instance, (Sumatran) Malay has the following words: səmbah ‘gesture of worship or honor’; ləmah ‘weak’; cəcak ‘k.o. lizard’; pəñu ‘turtle’; and kəmbar ‘twins’. Banjar Malay, which has no schwa and uses a instead, has the corresponding forms sambah, lamah, cacak, pañu and kambar (all with the same meanings, cf. Abdul Jebar 2006). These words were borrowed into Malagasy as samba-samba ‘an expression of gratitude to God’, lama ‘weak’, tsatsaka ‘k.o. lizard’, fano ‘turtle’ and (Sakalava) hamba ‘twins’. In these cases, the Malagasy form agrees with Banjar Malay in showing a; it does not follow Sumatran Malay, which has a corresponding original ə (Adelaar 1989). In contrast, Malagasy words that are borrowed from Sumatra Malay reflect its ə as e, e.g. (Sumatra) Malay rətak ‘to crack’ > Malagasy retaka ‘to collapse’, and Malay bəsar ‘big’ > Malagasy vesatra ‘heavy’.

It should be noted that in general only Malay loanwords containing a reflex of *ə (as either a or ə) can be checked for possible Banjarese dialect affiliation. Other Malay loanwords usually do not have such a device and therefore cannot be “diagnosed” for source dialect. However, this does not necessarily mean that they cannot be of Banjar Malay origin.

Malay loanwords that are post-Bantu contact stage are discussed in §3.3 below.

3.1.3. Javanese loanwords

The Javanese played a major role in the history of insular South East Asia, although probably not in the same sustained way as the Malays. Their linguistic influence is less obvious except on languages in the direct vicinity of Java. Malay and Javanese mutual influence has a very long history (demonstrably going back to the 9th century CE) and has clearly affected the lexicons and phonologies of both languages. Interestingly, whereas Javanese written sources are not older than the 9th century CE, there is Javanese influence in Malagasy, suggesting that the Javanese were already an influential force before the migrations to East Africa and the Bantu contact stage, that is, before the 9th century.

Javanese loanwords in Malagasy are few, and they are sometimes hard to distinguish from Malay loans because of the long history of mutual influence between Javanese and Malay. Some loanwords in Malagasy are clearly of direct Javanese provenance, such as Malagasy tomotra ‘close, imminent’, which reflects Javanese tumut ‘to follow, accompany, participate’, and Malagasy volana ‘word, speech’ reflecting Javanese wulaŋ ‘lesson, advice; admonition’. Neither of these pairs has a corresponding form in Malay. Moreover, tomotra can be traced to a morphologically complex Old Javanese form, tūt ‘to follow’ with an active voice infix that does not exist in Malay.

Other loanwords have Javanese as an ultimate source but may have been borrowed via Malay. Words like these include rotsaka ‘fall downwards, slip on a slope’, or (in Tandroy Malagasy) ‘collapse’. The corresponding form in both Malay and Javanese is rusak meaning ‘spoiled, ruined’. Rusak originally derives from an Old Javanese compound (rūg ‘fall in, collapse, be smashed’ + sāk ‘fallen apart, loosened, dispersed’), testifying to its unambiguous Javanese credentials. However, rotsaka could just as well have been borrowed via Malay, as rusak has become part and parcel of the core vocabulary of this language.

Finally, there are also many loanwords which could be of either Malay or Javanese origin, because their source cannot be established due to the lack of comparative-historical information, or because the phonotactics of Javanese and Malay are too similar. This is the case with Malagasy sodina ‘flute’, which could be from either Malay suliŋ or Javanese suliŋ ‘flute’.

In spite of the actual or perceived low number of Javanese loanwords, it is nevertheless noticeable that Javanese influence was important enough to leave a mark on Malagasy morphology, in the form of an honorific prefix ra- (see §6).

3.1.4. Sanskrit loanwords

In pre-Muslim Malay and Javanese society, Sanskrit was used as an intellectual reference language and was the main source of religious, administrative and scholarly terminology, similar to the way Latin was used in European societies in the past. In Malagasy, there are some 35 Sanskrit words, 19 of which are represented in the subdatabase as indirect loanwords. As is the case with Sanskrit vocabulary in almost all Austronesian languages, these words must have been borrowed indirectly via Malay or Javanese. Virtually all of them also occurred in Malay or Javanese at some stage, and they generally reflect the phonological and semantic changes that Sanskrit words underwent in the process of being borrowed into Malay or Javanese. This is demonstrated in the example of Sanskrit koṭi ‘ten million’: Malay and Malagasy have respectively kəti ‘100,000’ and hetsi ‘100,000’, both showing the same semantic change from ‘ten million’ to ‘hundred thousand’, and the same vocalic change from o to ə (Malagasy e being the regular reflex of a historical ə). This indirect borrowing perhaps accounts for the fact that it is not quite possible to assign particular semantic domains to Sanskrit loanwords (Bernard-Thierry 1959: 340). Interestingly, however, some Sanskrit loanwords in Malagasy have retained original Sanskrit phonological oppositions that have been lost in their Malay (and modern Javanese) counterparts. For instance, although Malagasy has no aspirated consonant phonemes, it still indicates the original distinction between aspirated and non-aspirated velars in Sanskrit loanwords, with *k and *g > h, as against *kh, *gh > k. For example, the k in Sanskrit koṭi (above) became h in hetsi, but the kh in the Sanskrit form vaiśākha ‘April-May’ became k in Malagasy saka-masay ‘April’ and saka-ve ‘May’.⁷ Similar retentions of Sanskrit phonological oppositions are also found in Tagalog, another language that obtained its Sanskrit vocabulary via Malay.
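The velar correspondence just described can be stated compactly as a lookup. The following Python sketch is an illustration only and not part of the analysis reported here: the names `VELAR_REFLEXES` and `velar_reflexes` are hypothetical, and the code encodes nothing beyond the rule stated above (*k, *g > h as against *kh, *gh > k). It assumes a romanized Sanskrit form in which aspirates are written as the digraphs kh and gh, and it ignores all other sound changes.

```python
# Expected Malagasy reflexes of Sanskrit velar stops, as stated in the text:
# plain *k and *g > h, aspirated *kh and *gh > k.
# (Hypothetical helper names; romanized input with digraphs kh/gh is assumed.)
VELAR_REFLEXES = {"kh": "k", "gh": "k", "k": "h", "g": "h"}


def velar_reflexes(sanskrit_form):
    """Return (Sanskrit velar, expected Malagasy reflex) pairs, left to right."""
    pairs = []
    i = 0
    while i < len(sanskrit_form):
        digraph = sanskrit_form[i:i + 2]
        if digraph in VELAR_REFLEXES:
            # Aspirated velar: consume both letters so that 'kh' is not
            # misread as plain 'k' followed by 'h'.
            pairs.append((digraph, VELAR_REFLEXES[digraph]))
            i += 2
        elif sanskrit_form[i] in VELAR_REFLEXES:
            # Plain (unaspirated) velar.
            pairs.append((sanskrit_form[i], VELAR_REFLEXES[sanskrit_form[i]]))
            i += 1
        else:
            i += 1
    return pairs


for form in ("koṭi", "vaiśākha"):
    print(form, velar_reflexes(form))
```

Applied to koṭi, the sketch pairs k with h (as in hetsi); applied to vaiśākha, it pairs kh with k (as in saka-masay and saka-ve).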
What the evidence of Malagasy and Tagalog tells us is that when Malay borrowed Sanskrit lexicon, it must initially have kept the phonological oppositions in question, and apparently still had them at the time it passed on Sanskrit vocabulary to Malagasy and Tagalog. Only afterwards did Malay lose these distinctions, which are not part of modern Malay phonology (Adelaar 1994a).

Malay and Javanese played a pivotal role in the transfer of Sanskrit loanwords to Malagasy, but it is hard to assess which of these languages was more instrumental in this respect. The earliest written records of both languages are replete with Sanskrit loanwords. However, there is much more historical documentation of Javanese (with a rich corpus of Old Javanese texts and inscriptions) than there is of Malay (there are only a few Old Malay inscriptions). This gives a lopsided impression of the importance of Old Javanese as a source. For instance, Malagasy lapa ‘palace; courtroom’ reflects Javanese paṇḍåpå ‘large open structure in front of a Javanese house; open veranda, pavilion’, which in turn is borrowed from Sanskrit maṇḍapa ‘open hall or temporary shed, pavilion’. Although Malay has no form corresponding to this Sanskrit loanword, it may be that it had one in the past and that this now lost form was borrowed into Malagasy. Other than Indian loanwords borrowed via Malay and Javanese, there is no evidence for early (linguistic or cultural) Indian influence in Madagascar (Adelaar 1996), which in turn suggests that there were no early historical links between Madagascar and the Indian subcontinent.

7

masay ‘small’; ve ‘big’. Note that saka/masay and saka/ve still had their initial syllable in 17th-century Betsimisaraka Malagasy, cf. vysackavey ‘April’, vysackamassey ‘May’ (de Houtman 1603, cf. Dahl 1951: 99).

3.1.5. South Sulawesi influence

Malagasy clearly has a number of loanwords from South Sulawesi languages, but it is hard to identify any South Sulawesi language or languages in particular that had an influence on the South East Barito lexicon. The linguistic evidence in some cases suggests that the lending language was Buginese. But then again, while in the last 400 years the Buginese have undoubtedly become the most influential and cosmopolitan South Sulawesi ethnic entity, there is no solid evidence that they had already obtained this status further back in history (cf. Pelras 1996, chapters 2–3). Furthermore, given the time-depth involved, it is not even clear whether or not, at the time of the migrations to East Africa, South Sulawesi had already diverged into the various languages that make up the group today. For the time being it is therefore safe to label the loanwords in question as generic “South Sulawesi”, and not to be any more specific about their origin. That there are South Sulawesi loanwords in Malagasy should not come as a surprise, given that several South Sulawesi communities have a strong orientation towards the sea and have developed impressive navigational skills. The Buginese have travelled extensively and have had longstanding contacts with the coasts of Borneo. Historical linguistic research shows that the Tamanic communities of the Upper Kapuas area in West Kalimantan (Indonesian Borneo) speak some closely related dialects which are very closely related to the South Sulawesi languages and in fact form a branch of the South Sulawesi language subgroup (Adelaar 1994b). The Tamanic case suggests that contacts between South Sulawesi and Borneo may be older and more complex than those documented in written records. In any case, the handful of South Sulawesi loanwords in present-day Malagasy indicates that


there were inter-insular contacts between South Sulawesi and Borneo long before the beginning of written history in that area. In light of these contacts, the suggestion that the Buginese or any other South Sulawesi people ever made independent voyages to Madagascar would definitely be speculative to an unnecessary degree. There are at least eight South Sulawesi loanwords. These do not belong to a particular semantic domain but include a specific cultural notion such as solo ‘substitute’, which should have lost its s- if it had not been borrowed (< Macassarese, Buginese pas-soloʔ, Duri soloʔ ‘present (money, goods) given at celebrations’, South Toraja pas-suluʔ ‘money borrowed short term and without requiring interest’ (< Proto South Sulawesi *sulu(r) ‘exchange, pay’)). The set also includes the geographically specific term tanety ‘high and flat terrain, slope, hill; mainland, terra firma’. This term is not inherited from Proto Austronesian, but it does have corresponding forms in South Sulawesi languages: Macassarese tanete ‘rolling (hills), hilly terrain’; Buginese tanete ‘elevated terrain, high country’, South Toraja tanete ‘hill, low mountain’. Finally, South Sulawesi influence includes the prefix ta-, which forms words referring to ethnic, geographic, or professional categories of people (§6). Various other cases have been made for borrowing from other Austronesian languages, but the evidence is weak and of little structural relevance (Adelaar 2009), apart from possible loanwords from Ngaju Dayak, a language spoken in an area bordering that of the South East Barito languages (Dahl 1951).

3.2. Bantu influence

Bantu loanwords stem from the Bantu contact stage or later. Dahl pointed out that they are particularly well represented in the domains of animal names and terms for domesticated plants, and that the terminology for domestic animals is essentially Bantu, including ondry ‘sheep’, ampondra ‘donkey’, amboa ‘dog’, omby ‘cow’, akoho ‘chicken’, osy ‘goat’, etc. Bantu terms for wild animals include akanga ‘guineafowl’, mamba ‘crocodile’, pili ‘a kind of venomous snake’, ampaha ‘wild cat’ and papango ‘kite’; those for domesticated plants include tongolo ‘onion’, ampemby ‘sorghum’ and akondro ‘banana (and its tree)’. In one case, Malagasy borrowed a Bantu body-part term: maso ‘eye’ reflects Sabaki *mat)o ‘eyes’. The reason for this unusual borrowing must have been an avoidance of the “homonymie fâcheuse” that would have resulted from the merger of Proto South East Barito *matä [mat!] ‘eye’ and *matey ‘dead’ in Malagasy maty (currently meaning ‘dead’) through regular phonological change (Dahl 1954). The Bantu loanwords are generally of Sabaki origin. However, whereas some of them can be traced to Swahili and some others to Comorian languages, many others are difficult to “read” as far as their exact provenance is concerned. They may not have corresponding forms in either Swahili or Comorian languages, or they have corresponding forms in both Swahili and in Comorian but these forms are so


similar that the loanwords in question could be traced to either, with no clue as to their exact source. Dahl’s (1954, 1988) investigation of Bantu influence in Malagasy was shaped by his perspective on the settlement history of Madagascar. He believed that South East Barito speakers had migrated from Borneo directly to Madagascar, and that, when they arrived, the island already had a Bantu population speaking a form of “Old Comorian”. The latter exercised a substantial influence on the language of the newcomers before eventually adopting it as their mother tongue. Their influence on, and shift to, (early) Malagasy left a clear mark on the language. It became a crucial factor in the creation of modern Malagasy and constitutes what Dahl called a “Bantu substratum”. Dahl assumed that the many words that could not immediately be traced to Swahili or Comorian languages belong to the Bantu substratum and represent “Old Comorian”. He also found Bantu loanwords that were closer in form and meaning to cognates in Bantu languages spoken in the African interior. For instance, words like ondry ‘sheep’, kondry ‘sweet banana’, and mamba (with the specific meaning of ‘crocodile’) match better with their corresponding forms in Bantu languages spoken around Lake Victoria than with Sabaki. Dahl concluded that these were also Old Comorian words, the reflexes of which were subsequently lost in modern Comorian languages. However, it is now considered more likely that South East Barito speakers stayed on the African mainland before they finally settled in Madagascar, and that the island was unoccupied (fn. 8), or had a non-Bantu population, upon their arrival. This interpretation of the early migration route and settlement of Madagascar has obvious consequences for the assessment of Bantu influence. It follows that there never was an “Old Comorian” language in Madagascar.
The notion of a Bantu substratum is in itself cumbersome enough, but even if there is a Bantu substratum, the substratum language must now be sought in mainland Africa, and we are even more at a loss as to what its nature may have been than in the case of “Old Comorian”. In order to avoid unwarranted speculation about the origin of Bantu loanwords in the subdatabase, their donor language is indicated only if it can be established beyond reasonable doubt. In other cases, no source language is given. Obviously, the analysis of Bantu influence in Malagasy needs a major overhaul and requires the input of historical linguists specializing in Bantu languages. As mentioned above, the similarity between Swahili and Comorian languages sometimes makes it impossible to identify the exact source language of loanwords from these languages. In the case of unequivocally Comorian loanwords, it is even more difficult to identify the individual Comorian source language. Furthermore, as far as Swahili loanwords are concerned, even if they are unequivocally Swahili, they

8

A theory first proposed in 1960 by Deschamps and now getting increasing multidisciplinary support (cf. Adelaar 2009 for a discussion).


may still have been borrowed in the Comoros, where Swahili used to be an official court language in the 19th century (Nurse & Hinnebusch 1993: 18). Swahili was a medium language for the spread of Islam in Madagascar, and it was an important vehicular language for the introduction of Arabic lexicon into Malagasy (see below §3.3.2). Some instances of Bantu loanwords that are most probably borrowed from Swahili are ndzia ‘path’ (Swahili n-d*ia); mosavy ‘witchcraft’ (Swahili m)awi ‘witch’); kiso ‘knife’ (Swahili ki-su); kilema ‘deformity, defect’ (Swahili ki-lema ‘deformity, defect; person with such defect’) (Dahl 1988). Instances of Bantu loanwords that must have been borrowed from Comorian languages are akanga ‘guineafowl’ (Comorian nkanga id.), ampondra ‘donkey’ (Comorian mpundra id.), akoho ‘chicken’ (Comorian nkuhu), ampaha ‘wild cat’ (Shinzwani mpaha), papango ‘vulture’ (Maore papangu), kiraru ‘shoe, sandal’ (Shinzwani shilarú), kianja ‘open place’ (Shingazija shandza).

Finally, there have been speakers of Bantu languages in Madagascar in the recent past. A southern dialect of Swahili was still spoken in 1993 in the village of Marodoka on the island of Nosy Be, off the northwest coast of Madagascar, and in one pocket further down Madagascar’s west coast (Nurse & Hinnebusch 1993: 14). Especially in the mid-19th century, many slaves were imported from Mozambique, who used Makhua as a first language or as a lingua franca, but this Bantu language apparently never became a major source of lexical borrowing to the Malagasy speech community (fn. 9).

3.3. Post-Bantu stage

3.3.1. Malay loanwords

Malay loanwords can also be stratified according to the period in which they were borrowed. While most have undergone the changes that are typical for the Bantu contact stage, some have not, and must therefore be more recent. In one case, the meaning of the loanword in question testifies to borrowing after the Bantu contact stage. These loanwords confirm the few references to Austronesian voyages to East Africa found in Arabic and Portuguese sources (cf. Dahl 1951). Examples of Malay loanwords affected by the Bantu contact stage are voron+ ‘bird’ (Malay buruŋ id.); trano ‘house’ (Malay daŋaw ‘temporary shelter in the field’); haran+ ‘coral-reef, coral-rock’ (Malay karaŋ id.); fara-fara ‘bed-frame’ (Malay parapara ‘(under-)frame’). These loanwords reflect the shift from occlusives to fricatives and affricates that is characteristic of the Bantu contact stage (*b > v; *d > tr; *k > h; *p > f). They contrast with Malay loanwords that were not affected by these changes and must therefore have been borrowed after the Bantu contact stage, such as bodo ‘infantile’ < Malay bodo(h) ‘uneducated, stupid’; poki (Comorian dialects) ‘(a curse)’, mi-poki ‘swear, curse’ (Gueunier 1986) < Malay puki ‘vagina’ (a term used in insulting phrases); landaizan+ ‘anvil’ < Malay landasan, Minangkabau and dialectal Malay landehan id.

Socio-cultural meaning can also be indicative of the time of borrowing. Most Malay loanwords were probably borrowed before the migrations to East Africa some 1300 years ago (Adelaar 1989) at a time when Islam in all likelihood had not yet had a serious impact in South East Asia. However, at least one Islamic term in the Taimoro dialect must have been borrowed via Malay. This is sombily [sumbili] ‘to slaughter according to Muslim practice’ (fn. 10). It is used in the derivation mpanombily referring to a ‘(ritual) slaughterer of meat’, which is an office strictly reserved for Taimoro aristocrats, and it must have been borrowed via Malay, which has səmbəleh [semb"l!h]. The latter originally meant the same as sombily but nowadays refers to slaughtering in general. It ultimately derives from Arabic bi’smi’llāhi ‘in the Name of God’, a frequently pronounced religious formula, the use of which is obligatory at the ritual killing of an animal (Adelaar 1995a: 328). As it happens, the phonological structure of Malagasy sombily does not give helpful clues as to whether or not it had passed through the Bantu contact stage, but its meaning clearly shows that it must have been borrowed after Islam had been introduced in insular South East Asia (when that happened is a matter of debate, but it is very unlikely to have been as early as the migrations of South East Barito speakers to East Africa in the 7th century CE).

9

Prof. Noel Gueunier, University of Strasbourg (email communication).

3.3.2. Arabic influence

Dez (1967: 13–14) points out that borrowing from Arabic into standard Malagasy resulted in a number of loanwords of cultural importance including the names of lunar months, names of days of the week, divinatory terms, some terms for kinship and “social” relations, and some commercial, legal and navigational terms. Many more Arabic loanwords are found in regional dialects, particularly those of the northwest. Although Muslims only constitute a minority in Madagascar, archaeological excavations suggest that Islam was introduced in Madagascar as early as the 12th century CE (Dewar & Wright 1993). Madagascar has various historically unrelated Muslim communities, evidencing the introduction of Islam on several independent occasions (Gueunier 1994: 35–72). Two areas in particular had an impact on the lexical make-up of modern Malagasy, namely the southeast of Madagascar, and the northwest and west of the island. In the southeast, aristocrats among the Taimoro practise a hybrid form of Islam and claim an Arab ancestry going back to the 12th–13th centuries. The origin of their form of Islam is contested: while Dahl (1983) believed that it came from Oman, the likelihood of an Indonesian origin for the Taimoro cultural-religious term mpanombily (§3.1) along with the Taimoro Malagasy adaptations to the Arabic script provides evidence for introduction via Indonesia (Adelaar 1995a, 2009). These adaptations are more akin to the

10

sombidy in standard Malagasy.


adaptations made by Malay and Javanese users of this script than to those made by Omani Arabic or Swahili-speaking users. Members of the Taimoro aristocracy are also the guardians of the so-called “Arabico-Malagasy” manuscripts, a corpus of secret texts containing fragments in Taimoro Malagasy, Arabic, and a mixed Arabic-Malagasy “pidgin” called Kalamo Tetsitetsy (Beaujard 1998a: 5). Given the secrecy and exclusivity surrounding Islam among the Taimoro, the influence of this literary tradition on the Taimoro dialect in general was probably limited. In the northwest and west, Islam was introduced from the African mainland and via the Comoros. This area is influenced by the Islamic Swahili cultural belt, which runs along the East African coast and the Comoros. Among its adherents are the Antalaotsy, a Malagasy ethnic group with strong maritime traditions and an orientation towards trade, who used to have strong ties with the Comoros and the African coast. Other adherents are mostly of Comorian or other foreign origin, although their number also includes many ethnic Malagasy. According to Dez (1967: 9–13), Arabic loanwords in Malagasy generally consist of both esoteric and profane vocabulary. Esoteric words entered Taimoro Malagasy very early and from there spread to other Malagasy dialects. They include terms for lunar days and months, zodiac signs, divinatory terms, and, possibly, names of days of the week. It remains uncertain where the Taimoro had obtained this vocabulary from, which they also kept in scriptural form. The words were not passed on via Swahili. They may have been borrowed directly from Arabic, but this remains unclear. Profane words include economic, financial and administrative terms, terms for weights and measurements, some terms indicating social relations and kinship, navigational terms, and, possibly, names of days of the week. 
These words were generally borrowed into northwestern Malagasy dialects when Northwest Madagascar still had close economic, cultural and religious relations with the Comoros and the East African mainland in the centuries preceding the unification of Madagascar by the Merina. Many of these loanwords may have been passed on via Swahili. However, Dez is right in pointing out that this does not always have to be the case. The formal similarities between Arabic loanwords in Malagasy and in Swahili may also be partly caused by the phonotactic limitations that the languages have in common, such as the intolerance of final consonants (requiring the addition of a final vowel) and of heterorganic consonant clusters (requiring these clusters to be broken up by an epenthetic vowel). These adaptations are clearly illustrated in the names of week days: alahàdy ‘Sunday’ < Arabic (yawm) al-aḥad; alatsinàiny ‘Monday’ < Arabic al-iṯnayn; talàta ‘Tuesday’ < Arabic aṯ-ṯalāṯā’; alarobìa ‘Wednesday’ < Arabic al-arbaʿā’; alakamìsy ‘Thursday’ < Arabic al-xamīs (Swahili Alhamisi); zomà ‘Friday; (Friday) market’ < Arabic al-jumʿa (Swahili Ijumaa ‘Friday’); sabòtsy ‘Saturday’ < Arabic as-sabt. Dez was uncertain whether these names were borrowed via the Taimoro dialect or northwestern Malagasy, and one can appreciate his doubts. It would be facile to point out that modern Swahili and Comorian languages do not use Arabic loanwords for names of weekdays (except for Swahili Alhamisi and Ijumaa) given that they may have used them in the past, or may use them on special occasions. Swahili Alhamisi has maintained a (real or orthographic?) consonant


cluster -lh-, unlike its Malagasy counterpart, which points to borrowing via a different pathway. On the other hand, Swahili Ijumaa and Malagasy zoma (< *yuma) agree in reflecting Arabic al-jumʿa without al-, whereas all other names for weekdays have integrated this article.

The problem of identifying borrowing pathways presents itself in many individual cases. In a case like bakoly ‘bowl’ < Swahili bakuli < Arabic būqāl ‘vessel without handles, mug’, it is safe to assume that Swahili was the vehicular language on account of the vowel metathesis reflected in bakuli and bakoly. Likewise, kilema ‘deformity, defect (body or character)’ < Swahili kilema ‘1. deformity; 2. someone with a deformity’ < Arabic kilām, plural of kalm ‘wound, cut, slash’, must have been borrowed via Swahili on account of the semantic and phonological developments in kilema. On the other hand, rafiky ‘friend’ < Arabic rafīq ‘friend, companion’, zamany ‘ancient’ < Arabic zamān ‘time’, zamānī ‘temporal’, and bahary (dialectal) ‘sea’ < Arabic baḥr id., may have been borrowed via Swahili (which has respectively rafiki ‘friend’, zamani ‘ancient’ and bahari ‘sea’), but it did not necessarily happen that way. Finally, in a case like mizàna ‘scales; judges, the magistrate’ < Arabic mīzān ‘scales, balance’ (cf. Swahili mizani ‘scales’), the occurrence of a different added vowel in Swahili would suggest that this was not the vehicular language. The same goes for gidro ‘monkey’ < Arabic qird ‘monkey’, which is unlikely to have been borrowed via Swahili. The latter has ngedere (same meaning), the vowels of which do not match.

3.3.3. English and French influence

Currently, the bulk of loanwords in Malagasy are borrowed from European languages. Madagascar has been in contact with Europe ever since the Portuguese first visited the island on their way to Asia in the 16th century. In the 17th and 18th centuries, the Dutch, French and British established strongholds in Madagascar, which enabled them to participate in the trade of slaves and commodities. Throughout the 19th century, the British and French competed for influence on the island, with the final aim of colonizing it. The British initially prevailed in this contest, which explains the large number of well-established English loanwords in Malagasy. However, most European loanwords were borrowed during French colonization (1895–1960), and borrowing from French has continued until the present. Since independence, French has remained an official language along with Malagasy, which was declared the national language in 1992. The British in Madagascar were often missionaries and members of the military. They were most prominent in the 19th century, and most English loanwords date from that period. These are often terms pertaining to the army, school, music, medicine, the Protestant church, the building industry, and animals and plants. The French, who became the official colonial administrators of Madagascar in the first half of the 20th century, were often development workers and missionaries. French lexical influence is most notable in the domains of chemistry, pharmacy,


cooking and food, clothes and furniture, the Roman Catholic church, agriculture and various technologies (Dez 1964: 45). Almost all European loanwords are nouns. A few are adjectives or can be used as predicates, e.g. grèk+ ‘Greek’, katolìk+ ‘catholic’, bànky ‘bankrupt’, dahòlo ‘all’ (probably < English the whole). One loanword in Dez (1965) is labeled as a verb: gorampà (< French grand pas) ‘to baste (tack with long, loose stitches in preparation for sewing)’. The establishment of clear-cut sound correspondences between Malagasy loanwords and their English and French source words is problematic because of interfering factors such as (1) differences in adaptation between standard Malagasy and regional dialects; (2) differences in adaptation between written and spoken sources (e.g. the rendering of ü in divày ‘wine’ (< French du vin) and in repoblik+ (< French république)); (3) differences in the extent of adaptation between individual loanwords; (4) differences in adaptation between cognate forms borrowed from French and English (especially in the domain of Christian religion; see below); and (5) the place of stress (which, among other things, affects the syllable structure of loanwords and needs further investigation). Dez (1964, 1965) tried to establish the rules of adaptation of French and English loanwords, but the rules he identified only apply to some loanwords and leave a good many others unexplained. English and French loanwords sometimes compete in denoting the same concept, as in the following word pairs: medàly ‘medal’ vs. medày id. (French médaille); diksionàry ‘dictionary’ vs. diksionèra id. (French dictionnaire); gilàsy ‘glass’ vs. vèra (French verre); and pènin+ ‘pen’ vs. pilìmo (French plume [plym]; Dez 1964, 1965). In religious terminology, the English loanword in such pairs is often associated with Protestantism, whereas the French one has a Roman Catholic flavor. Compare Krismasy ‘Christmas’ vs. Noèly id. 
(French Noël); Batisa ‘Protestant baptism’ vs. batèmy ‘Catholic baptism’ (French baptême [batem]); Baibòly ‘bible’ vs. biblìa id. (French bible; Dez 1964, 1965); alitara ‘altar’ vs. ôtely id. (autel [otel]; Ratsimandresy 2003: 318).

4. Number and kinds of loanwords

4.1. Loanwords by their source languages and semantic domains

The Malagasy subdatabase contains 1526 lexical items, of which 267 items (or 17.5%) are loanwords. Among the latter, French loanwords form the largest category (92 items or 6% of the entire subdatabase). Most of these pertain to the fields of the Modern world (28), Clothing and grooming (15), Food and drink (14.5), Animals (9), The house (6.5), and Agriculture and vegetation (4).

Table 1: Malagasy loanwords by donor language and semantic field

Donor-language columns: French, Malay, English, Bantu, Arabic, Comorian, South Sulawesi, Javanese, Banjarese, Dutch, Portuguese, Unidentified, Non-loanwords.

      Semantic field                    Non-loanwords
 1    The physical world                81.8
 2    Kinship                           96.4
 3    Animals                           68.5
 4    The body                          87.8
 5    Food and drink                    69.7
 6    Clothing and grooming             69.1
 7    The house                         83.9
 8    Agriculture and vegetation        82.9
 9    Basic actions and technology      86.4
10    Motion                            85.1
11    Possession                        86.5
12    Spatial relations                 87.7
13    Quantity                          87.5
14    Time                              88.4
15    Sense perception                  87.7
16    Emotions and values               96.0
17    Cognition                         90.1
18    Speech and language               80.0
19    Social and political relations    77.8
20    Warfare and hunting               89.5
21    Law                               91.9
22    Religion and belief               76.9
23    Modern world                      50.4
24    Miscellaneous function words      88.6
      all words                         82.5

all words, by donor language: French 6.0; Malay 6.0; English 1.8; Bantu 1.1; Arabic 0.7; Comorian 0.4; South Sulawesi 0.4; Javanese 0.3; Banjarese 0.3; Dutch 0.1; Portuguese 0.1; Unidentified 0.4; Non-loanwords 82.5.

[Per-field values for the individual donor languages are not legible in this copy of the table.]

Malay loanwords are almost as many (91 items or 6%) and would be an even larger category if Banjar Malay loanwords were included, which account for 4 loanwords or 0.3%. The main semantic domains for Malay are The body (14), Motion (9.7), Animals (8), The physical world (7.5), Spatial relations (6), Social and political relations (5.5), Food and drink (5), Basic actions and technology (4.5), Sense perception (4.5), and Speech and language (4). There are 27 English loanwords (1.8%), which pertain, among other things, to the fields of the Modern world (5), Religion and belief (4), and Animals (4).
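The percentages quoted so far follow directly from the item counts. The short Python sketch below recomputes them as an arithmetic check; the variable and function names are hypothetical, and rounding to one decimal place is an assumption about how the published figures were derived.

```python
# Counts reported in the text for the Malagasy subdatabase.
TOTAL_ITEMS = 1526
COUNTS = {
    "all loanwords": 267,
    "French": 92,
    "Malay": 91,
    "English": 27,
    "Banjar Malay": 4,
}


def share(count: float, total: int = TOTAL_ITEMS) -> float:
    """Percentage of the subdatabase, rounded to one decimal (assumed)."""
    return round(100 * count / total, 1)


for label, count in COUNTS.items():
    print(f"{label}: {count} items = {share(count)}%")
```

Under these assumptions the results agree with the figures in the text (17.5% for all loanwords, about 6% each for French and Malay, 1.8% for English and 0.3% for Banjar Malay).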


Bantu loanwords number 17 (or 1.1%), almost half of which belong to the field of Animals (8). There are an additional six Comorian loanwords (0.4%), half of which also belong to the field of Animals (3). There are 11 Arabic loanwords (or 0.7%), many of which refer to Time (7). Other loanwords were borrowed from South Sulawesi (6 words or 0.4%), Javanese (5 words or 0.3%), Dutch and Portuguese (each one word or 0.1%), and an unknown source (6 words or 0.4%) (see Table 1). Note also that there are 19 words of Sanskrit provenance, which were all borrowed via Malay and Javanese. They are therefore not shown as a separate category in Table 1. Other languages from which Malagasy borrowed only indirectly (and which were identified in the subdatabase) are Greek, Javanese, Persian, Arabic and vernacular Arabic. Each of these languages is represented by one word only.

4.2. Loanwords by their semantic word class

Table 2 shows the numbers of loanwords according to their source language and semantic word class. In agreement with what seems to be a universal tendency, the vast majority of loanwords are nouns: the 267 loanwords in the subdatabase consist of 216 nouns, 26 verbs, 8 function words, 17 adjectives, and 0 adverbs. Interestingly, however, Malay loanwords seem to defy this pattern to some extent: of these 91 loanwords, 53 are nouns, 22.5 verbs, 4.5 function words, and 11 adjectives. This is in maximal contrast with the 92 French loanwords, which consist of 91 nouns and one function word (fn. 11). When these figures are compared with one another and with figures from other languages, there seems to be a correlation between the high proportion of nouns and the recentness of their source: whereas European languages and Comorian (the most recent sources) provided almost exclusively nouns, Malay and other Indonesian languages (the oldest sources) show a more even distribution between nouns and non-nouns. The figures for Bantu and Arabic loanwords, which are chronologically in between Indonesian and European loanwords, would seem to affirm this trend: while the 11 Arabic loanwords consist of 10 nouns and one adjective, the 17 Bantu loanwords include 15 nouns, one verb and one adjective. While the trend is clear, it is harder to determine the cause. Time as a factor may play a role, in the sense that older and more adapted loanwords are probably more likely to end up in verbal derivations and to change word class. However, other factors such as language affinity, similar word structure and cultural (semantic) compatibility may also have played a role in the more frequent borrowing of words other than nouns from Indonesian languages.
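The trend described here can be made concrete by computing the proportion of nouns among the loanwords from each donor language. The Python sketch below uses only the counts quoted in this paragraph; the names `WORD_CLASSES` and `noun_share` are hypothetical, and the one-decimal rounding is an assumption.

```python
# Word-class counts per donor language, as quoted in the text.
WORD_CLASSES = {
    "Malay": {"nouns": 53, "verbs": 22.5, "function words": 4.5, "adjectives": 11},
    "Bantu": {"nouns": 15, "verbs": 1, "adjectives": 1},
    "Arabic": {"nouns": 10, "adjectives": 1},
    "French": {"nouns": 91, "function words": 1},
}


def noun_share(classes: dict) -> float:
    """Nouns as a percentage of all loanwords from one donor language."""
    total = sum(classes.values())
    return round(100 * classes["nouns"] / total, 1)


# Listed from the oldest to the most recent source, following the text.
for donor, classes in WORD_CLASSES.items():
    print(f"{donor}: {noun_share(classes)}% nouns")
```

On these figures the noun share is lowest for Malay (the oldest source), close to nine in ten for Bantu and Arabic, and highest for French, which is the ordering the paragraph describes.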

11

Dez’s example of a verb gorampa (in §3.7) does not feature in the subdatabase.

Table 2: Loanwords in Malagasy by semantic word class (percentages)

                  Nouns   Verbs   Function words   Adjectives   Adverbs   All words
French            10.1    -       0.9              -            -         6.0
Malay             5.9     6.0     3.9              8.6          -         6.0
English           2.9     -       0.9              -            -         1.8
Bantu             1.7     0.3     -                0.8          -         1.1
Arabic            1.1     -       -                0.8          -         0.7
Comorian          0.7     -       -                -            -         0.4
South Sulawesi    0.3     0.3     0.9              0.8          -         0.4
Javanese          0.3     0.3     -                0.8          -         0.3
Banjarese         0.2     -       -                1.6          -         0.3
Dutch             0.1     -       -                -            -         0.1
Portuguese        0.1     -       -                -            -         0.1
Unidentified      0.6     0.1     0.4              -            -         0.4
Non-loanwords     76.1    93.0    93.0             86.6         100.0     82.5

5. Integration of loanwords

The phonological changes that took place before the Bantu contact stage were fundamentally different from those that took place afterwards. There are two reasons for this: (1) the Malagasy sound system underwent a rather thorough change during the Bantu contact stage; and (2) the Austronesian, Sanskrit and Bantu source words were more adaptable to the Malagasy word structure throughout its history than those from Arabic, French and English. A striking development in Malagasy phonological history is the large-scale fricativization of Proto Austronesian consonants. This process can largely be ascribed to the Bantu contact stage, although Proto Austronesian *b had already become *w in Proto South East Barito before it ended up as v. Furthermore, the change from *y (a loan phoneme in South East Barito) to z is very recent: it never affected the regional Betsimisaraka dialect (compare also bazanetra < French baïonnette ‘bayonet’ (Dez 1965), showing that the change was still operative in the 19th century). Table 3 shows which consonants underwent fricativization:

Table 3: Fricativization

Proto Malayo Polynesian   Proto South East Barito   post-Bantu contact stage   Merina Malagasy (and standard)
*b                        *w                        v                          v
non-final *p              *p                        f                          f
*d                        r                         r                          r
–                         (borrowed) ḍ              tr (retroflex [#$])        tr (retroflex [#$])
non-final *g, *k          k                         h                          h
–                         (borrowed) *y             y                          z
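The consonant shifts attributed to the Bantu contact stage can be sketched as a simple rewrite rule. The Python sketch below is a deliberate simplification and not a full model of Malagasy historical phonology: the names `SHIFTS` and `fricativize` are hypothetical, the input is a plain ASCII spelling, only the shifts listed for Malay loanwords in §3.3.1 (*b > v, *d > tr, *k > h, *p > f, with *g > h added from Table 3) are applied, and word-final consonants and obstruents following a nasal are left untouched by assumption.

```python
# Illustrative rewrite of the Bantu-contact-stage consonant shifts.
# Simplified: ASCII input, no vowel or word-final changes.
SHIFTS = {"b": "v", "d": "tr", "k": "h", "g": "h", "p": "f"}
NASALS = set("mn")


def fricativize(word: str) -> str:
    """Apply the shifts to non-final obstruents not preceded by a nasal."""
    out = []
    last = len(word) - 1
    for i, ch in enumerate(word):
        prenasalized = i > 0 and word[i - 1] in NASALS
        if ch in SHIFTS and i != last and not prenasalized:
            out.append(SHIFTS[ch])
        else:
            out.append(ch)
    return "".join(out)


print(fricativize("parapara"))  # Malay parapara; compare Malagasy fara-fara
print(fricativize("gigi"))      # Malay gigi; compare Malagasy hihy
print(fricativize("sambo"))     # prenasalized b is left unchanged
```

The outputs are schematic forms rather than Malagasy orthography; vowel adjustments and the word-final developments are outside the scope of this sketch.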


Prenasalized obstruents did not undergo this fricativization. Moreover, *ŋg and *ŋk were reduced to k, and so were the aspirated velars *gʰ, *kʰ, which are typical for Sanskrit and Swahili loanwords. Typical changes in word-final position are the reduction of diphthongs, the merger of final consonants and the development of final devoiced vowels, which affected the Merina dialect in the following way:

Table 4: Word-final changes

Proto Malayo Polynesian     Proto South East Barito   post-Bantu contact stage   Merina Malagasy (and standard)
*-ay                        *-ey                      *-ey                       -y [i], -ez-
*-aw                        *-aw                      *-aw                       -o [u], -ov- [uv]
*-p                         *-t / *-k                 -tr, -k                    -tra [#$%], -k+ [k%]
*-t, *-d, (borrowed) *-r    *-t                       -tr [#$]                   -tra [#$%]
*-k                         *-k                       -k                         -ka [k%]
*-n, *-ŋ, *-m               *-n, *-ŋ, *-m             *-n, *-ŋ, *-m              -na [n%]
–                           (borrowed *-l) > *-n      *-n                        -n+

Proto Malayo Polynesian *y and final *l were lost; *s and *q became Proto South East Barito *h, which was subsequently lost in Merina Malagasy. Present-day Malagasy h reflects *k or *g, and s is a loan phoneme.

Examples: Malay bəsar ‘big’ > vesatrḁ ‘swollen’; Malay perak ‘silver’ > firaka ‘tin’; Malay duyuŋ ‘sea cow’ > trozonḁ ‘large sea animal (whale?)’; Malay barat ‘wet monsoon; west’ > varatrḁ ‘thunder’; a-varatrḁ ‘wet monsoon; north’ (fn.4); (Sanskrit sama ‘sameness’ >) Malay sama ‘together; same’ (Bazaar Malay sama ‘with’) > early Malagasy *(h)ame > amy ‘with’; Malay gigi ‘tooth’ > hihy ‘gums’; Malay kayu ‘tree; wood’ > hazo id.; Sanskrit megha ‘cloud’ > Old Malay megha > Tanory Malagasy mika id.; Sanskrit sakhāy-, (accusative) sakhāyam ‘companion, friend’ > Malay sakai ‘servant, follower’ > sakaiza ‘companion’; Malay saŋkal ‘to reject, deny’ > sakanḁ ‘something placed crosswise, barrier’; Sanskrit çṛŋgavera ‘ginger’ (> Malay ? lost) > sakaviro; Javanese murah ‘cheap’ > mora ‘easy; cheap’; Malay salay, saley ‘to roast, heat, grill’ > sali ‘to smoke over a fire’; Old Malay sāmvaw [sāmbaw] ‘vessel’ > sambo id.

In contrast to loanwords that went through the Bantu contact stage, Arabic, Swahili, English and French loanwords have as a rule maintained *b, *w, *p, *d, *s, *k, *ŋk, *ŋg, and (at least in some cases) *h, *-t, *-r, *-ey and *-aw. On the other hand, while they retained their original phonemes more faithfully, their word structures were more thoroughly affected. This is probably because these word structures are more complicated and less agreeable to Malagasy than those of Austronesian and Bantu loanwords or of Sanskrit loanwords which were borrowed via Malay and Javanese. There are various ways in which post-Bantu contact loanwords adapted to the structure of Malagasy, and there are many doublets. It seems that regularity is partly


obscured by different degrees of adaptation and by competing modes in written and spoken language. Some of the more frequent adaptations are:

(1) the insertion of a vowel to break up consonant clusters, e.g. Arabic al-khamīs ‘fifth; Thursday’ > alakamìsy ‘Thursday’; Arabic as-sabt ‘Saturday’ > sabòtsy; English altar > alitàra; French administrateur ‘administrator’ > adimisitraterḁ; English print > pirìnty; English blister > bilistrḁ;

(2) the addition of final devoiced vowels (which, in contrast to older loanwords, are not limited to ḁ), e.g. English ‘the whole’ > dahòlo ‘all’ and the examples in (1);

(3) the addition of an extra final syllable followed by a devoiced vowel to words with original final syllable stress, e.g. French (?English) plan > pilàninḁ; French caisse ‘box’ > kèsikḁ; English duck > dòkotrḁ; English mill > mìlinḁ ‘machine, engine’; French dentelle ‘lace’ > dantèlinḁ, dantèly; French addition ‘bill, tab’ > adisaònḁ, asidaòninḁ;

(4) the loss of various vowel distinctions, such as nasalization and rounding of front vowels in French loanwords. Frequent loss of nasalization in word-final vowels: banc [bã] ‘bench’ > ba; jardin [ʒardɛ̃] ‘garden’ > zaradày; bouchon [buʃõ] ‘plug’ > bosòa [busò]; however, cf. also satan [satã] ‘Satan’ > satànḁ, where a final nasal emerges. In non-final vowels, a nasal is heard before the following consonant (i.e. the historical nasal which is left unpronounced in French), e.g. ceinture [sɛ̃tür] ‘belt’ > santìry; bon point [bõpwɛ̃] ‘a good mark’ > bompòa; banque [bãk] ‘bank, financial institution’ > bànky; concert [kõser] ‘concert’ > konsèrtrḁ. Loss of rounding of front vowels: monsieur [məsyö] > mosè ‘sir’; du vin [düvɛ̃] > divày; ruban [rübã] ‘ribbon’ > ribà; however, cf. also république [repüblik] > Repoblikḁ ‘republic’.

Note that the vowels that are used after final consonants and to break up consonant clusters are much shorter than stressed vowels, to the point of being almost inaudible.
Other frequently observed changes include the tendency to reduce diphthongs, as in monsieur > mosè, and in quick march > koikimàtso, kikimàtso; French la loi [lalwà] ‘the law’ > laloànḁ, lalànḁ; la cuisine [lakɥizìn] ‘the kitchen’ > lakòzy; addition [adisyõ] ‘bill, tab’ > adisaònḁ, adisaòninḁ. Compare also the loss of word-initial vowels: French élastique ‘rubber band’ > lasitìkḁ; aiguille ‘needle’ > gìly; accordéon ‘concertina’ > korodào; épingle ‘hairpin’ > pàingotrḁ, pàingitrḁ; unstable correspondences of i and o [u] when they co-occur in the same word: French chinois ‘Chinese’ > sinoa, sonoa; English pussy > piso; French velours ‘velvet’ > volory; French corniche ‘cornice’ > koronòsy (incidentally, lack of stability of co-occurring i and u is also observed in inherited vocabulary and in the adaptation of early loanwords; see fn. 12); the adaptation of final *l to -ly, -linḁ or -inḁ/(-onḁ), in English devil > devòly; French avril ‘April’ > avrìly; English ball > bàolinḁ; English pill > pìlinḁ; English rifle > rèfonḁ; French bugle ‘bugle (horn to send military signals)’ > bingonḁ; English level > lèvinḁ; English treble > torèbinḁ (cf. also mìlinḁ; dantèlinḁ, dantèly; gìly, above). However, note also: French (la) table ‘table’ > latabatrḁ.

Footnote 12: For instance, Proto Austronesian *qijuSuŋ ‘nose’ > Malagasy oronḁ id.; Malay tumit ‘heel’ > (dialectal) Malagasy tomotrḁ, tombotrḁ id.

Finally, -d > -dr [ɖʐ] + a vowel, e.g. English diamond > diamòndra; French acide ‘acid’ > asìdra; English brigade > borìgedry; -r, -t > -trḁ, e.g. French concert ‘concert’ > konsèritrḁ, konsèrta; French minute ‘minute’ > minìtrḁ; French sabre ‘saber’ > sàbatrḁ; French vinaigre ‘vinegar’ > vinàingitrḁ; French baromètre ‘barometer’ > barometatrḁ; but also -ter > -trḁ, in French blotter > bilaotrḁ; English blister > bilistrḁ. In one instance, the English dental spirant is adapted to f: bismuth > bizimofo.

Other formal adaptations are based on re-analysis. These are (1) cases in which the original article became lexicalized; (2) cases in which an original French possessive pronoun became lexicalized; and (3) cases showing back-formation.

(1) Arabic and French loanwords are often adopted along with their article. In Arabic, the definite article is al-, but if the following word begins with a coronal, it is aC- (where C is identical to the following coronal). In French loanwords, the definite article is always represented as la (which is only the feminine article in French) and the partitive article (occurring with mass nouns) is represented as di. Examples are Arabic al-khamīs ‘Thursday’ > alakamìsy; as-sunbulah ‘Virgo (in the zodiac)’ > asombòla; al-asad ‘Leo (id.)’ > alahasàty; French la bière ‘beer’ > labièra; (l’)adresse ‘address’ > ladirèsy; (la) douane ‘customs’ > ladoàny; (la) pêle ‘spade’ > lapèly; du thé ‘tea’ > ditè; de l’huile ‘oil’ > diloìlo; du poivre ‘pepper’ > dipoàvatrḁ; du sel ‘salt’ > disèly; du vin ‘wine’ > divày. Note that la is used as a default article also appearing with originally masculine nouns, e.g.
(le) sac ‘bag’ > lasàkḁ; (le) pic ‘pick’ > lapikḁ ‘pickaxe’; (le) levain ‘leaven, sourdough’ > lalivày; (le) four ‘oven’ > lafàoro. This preference for la has sometimes led to erroneous back-formations, as in l’essence ‘petrol’ > lasàntsy; French liberté ‘freedom’ > labaritə. Mass nouns with di are originally predominantly masculine, which makes one wonder what happened to original feminine mass nouns, and whether mass nouns like labièra and lasàntsy are in fact also originally partitive forms and reductions of de la bière and de l’essence respectively.

An original French indefinite plural article des became lexicalized and was reduced to initial z, as in des agrafes ‘hooks, clips, staples’ > zaigràfy; des Indiennes ‘Indian women’ > zandiànḁ ‘Indian woman/women’; (faire) des histoires ‘to cause trouble, make things complicated’ > zisitoara ‘trouble, complications’. It is also shown in des haricots [de-arikò] ‘beans’ > (dialectal) Zarikào ‘bean(s)’, where the inserted z is due to false analogy.

(2) Original possessive pronouns became lexicalized in terms of address used for Roman Catholic clergy: Mompèra ‘priest’ < French mon père; Mamèra ‘religious person’ < French ma mère; Masèra ‘nun’ < French ma soeur. Note that this lexicalization of articles or possessive pronouns does not add definiteness or possession to the resulting noun in Malagasy: the latter still takes an article or possessive pronoun like other Malagasy nouns, cf. ny ditè ‘tea’, ny zandiànḁ ‘the Indian woman’, ny mompèra ‘the priest’.


(3) Back-formations are seen in the following cases. The locative prefix aN- is wrongly identified in the following forms: French enveloppe ‘envelope’ > valòpy; (English?) American > (*am-merikan(a) >) merikanḁ. As mentioned above, French la is (erroneously) identified in liberté > labaritə; Arabic ar-riyāl ‘Spanish real (monetary unit)’ became ariàry ‘5 franc coin’ and is identified as a reduplicated root; English rabbit > ra-bìtro, bìtro (see below); French cheval ‘horse’ is reanalyzed as a compound soavaly, sovaly deriving from soa ‘good, beautiful’; Arabic qubūr ‘graves, tombs’ > kibòry ‘grave, tomb’ and French cochon ‘pig’ > kisòa, kosòa ‘pig’ are reanalyzed as nouns formed with the (originally Bantu) noun prefix ki (§6).

Finally, there is the sociolinguistic adaptation of adding the honorific marker ra-. Some nouns with human reference obtain this prefix (which was ultimately borrowed from Javanese, see §6), e.g. rànonḁ ‘what’s-his-name’ < ra- + ànonḁ ‘thing, whatsit’; ra-mosè ‘sir’ < ra- + French monsieur ‘sir’; ra-fòtsy ‘elderly woman (with white hair)’ < ra- + fòtsy ‘white’. This prefix is sometimes attached to names of European animals, e.g. French mulet ‘mule’ > mole, ra-mole; English ‘geese’ > gisy, ra-gisy ‘goose’; pussy > piso, ra-piso ‘cat, pussy’ (with vowel metathesis). However, note a tendency for the contrary to happen in ra-bìtro, bìtro ‘rabbit’, where bìtro is apparently the result of back-formation.

6. Grammatical borrowing

Many donor languages also left a mark on the morphology of Malagasy. However, only Bantu languages have been able to affect the structure of Malagasy beyond the level of lending the occasional derivational affix. What follows is an account of the various affixes and grammatical features that have been adopted in Malagasy grammar over time.

Tafa- is a prefix expressing an accomplished act; the resulting verb can be nonvolitional, but this is not necessarily so. Tafa- reflects an original Banjar Malay tapa-, which is the combination of ta- (a nonvolitional prefix) and pa- (a transitive prefix).

The honorific prefix ra- occurs in kinship terms and in terms of address for people deserving extra respect, e.g. Comoro Malagasy ravinanto ‘child-in-law’ (< ra- + vinantu < Proto Austronesian *b-in-antu ‘child-in-law’); rafotsy ‘term of address for an old lady’ (< ra- + fotsy ‘white’). It also occurs with animal names of European provenance, e.g. ramulè ‘mule’ (< French mulet). It must have been borrowed from Javanese, where ra- occurs in many kinship terms and terms of address for high-ranking people, cf. Old Javanese rānak (< ra- + anak) ‘child’, rāma ‘father’ (< ra- + ama ‘father’; modern Javanese råmå ‘Roman Catholic priest’), etc.

ta- (or taN-) refers to an ethnic or geographical group, or to someone who belongs to an ethnic group, resides in a certain place, or belongs to a certain profession, as seen in the above instance ta-laotra denoting an ethnic group with many contacts across the Mozambique Channel. Having no reflexes in other East


Barito languages, it must derive from South Sulawesi ta- (same meaning), which is a cliticized form of Proto Austronesian *taw ‘human being, person’. The originally Bantu prefix ki- (or tsi-) forms nouns (which sometimes have a diminutive aspect): lalao ‘to play’ vs. ki-lalao ‘toy’; fafa ‘to sweep’ vs. ki-fafa ‘broom’; trano ‘house’ vs. ki-trano-trano, tsi-trano-trano ‘little shed’. In active verbs with a prefix beginning with m- (such as man- or mi-), this m- is replaced by h- in future tense, and by n- in past tense. Future h- is historically a reduced form of ho, a future marker with non-agent-oriented verbs (or any other verbs without initial prefix beginning with m-). This ho also indicates direction and forms infinitive verbs. It became grammaticalized further to h- in agent-oriented verbs in alignment with the tense markers m- and n-; cf. mangalatra Paoly ‘Paul steals’ vs. nangalatra Paoly ‘Paul stole’ vs. hangalatra Paoly ‘Paul will steal’; lasa aho ‘I leave/left’ vs. ho lasa aho ‘I will leave’ (lasa ‘to leave’; aho ‘I’). Ho (/h-) is probably a borrowing from Bantu. Moreover, the change from an original Austronesian aspect system contrasting perfect and imperfect to a tense system contrasting future, present and past, is also likely to be a result of Bantu influence on the verb structure of Malagasy (Dahl 1954, 1988). A further Bantu influence is the possibility for causatives to have reciprocality in their scope, and for reciprocals to have causativity in their scope (Adelaar 2007). This is demonstrated by ome ‘to give’, from which a causative reciprocal as well as a reciprocal causative can be derived; compare m-amp-if-an-ome ‘to make people give to each other’ to m-if-amp-an-ome ‘to cause one another to give’ (Rajaonarimanana 2001: 52). However, note that amp- (+causative) and if- (+ reciprocal) are historically South East Barito affixes (Dahl 1951: 171–175). 
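The tense alternation described above (m- in the present, n- in the past, h- in the future, with ho before verbs that lack an m- prefix) can be made concrete with a small sketch. The Python below is an illustration only and not part of the source: the names TENSE_PREFIX and mark_tense are hypothetical, and the test "the verb begins with m" stands in for real morphological analysis, so a root that merely happens to begin with m would be handled wrongly.

```python
# Tense is signalled by the first consonant of an m-initial verbal prefix.
TENSE_PREFIX = {"present": "m", "past": "n", "future": "h"}


def mark_tense(verb: str, tense: str) -> str:
    """Return the verb form for the given tense.

    Assumption: any verb beginning with "m" carries an m-initial prefix
    (man-, mi-, ...). Other verbs only distinguish the future, with "ho".
    """
    if tense not in TENSE_PREFIX:
        raise ValueError(f"unknown tense: {tense!r}")
    if verb.startswith("m"):
        return TENSE_PREFIX[tense] + verb[1:]
    return f"ho {verb}" if tense == "future" else verb


# The chapter's own examples: mangalatra 'to steal', lasa 'to leave'.
for tense in ("present", "past", "future"):
    print(tense, "|", mark_tense("mangalatra", tense), "Paoly", "|", mark_tense("lasa", tense), "aho")
```

Keeping the prefix table separate from the function makes the parallel between m-, n- and h- visible at a glance, which is the alignment the text attributes to the grammaticalization of ho.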
A possible Bantu influence is the development of the circumstantial voice, which can raise all sorts of non-core arguments to subject position. A semantically similar construction raising non-core arguments to object position by means of an applicative exists in Bantu languages (Adelaar 2007). Morphological influence of English can be seen in the addition of certain suffixes to proper names, such as -son (< English son), as in Rakotoson (literally ‘son of Rakoto’), -oeli- or -oelina (< English well), as in Raoelina, and -fera (< English fair), as in Radilifera. Note, incidentally, the unusual lack of a final vowel in -son (Dez 1964: 36).

7. Concluding remarks

Loanwords in Malagasy are an indispensable tool for those searching for the origins of the people of Madagascar. In the absence of historical records, they hold vital clues to the history of Malagasy culture. The relative isolation of the island and the availability of the Bantu contact stage as a chronological device contribute significantly to this circumstance. Among others, loanwords give us critical information about the Malagasy homeland, the period and setting of the early migrations to East Africa, the adoption of cattle breeding, and the multiple introduction of Islam to Madagascar, with its various sources and traditions. They also illustrate some better-known historical phenomena such as the origins of Protestantism and Catholicism and other aspects of westernization and globalization.

References

Abinal, A. & Malzac, V. 1970 [1888]. Dictionnaire malgache-français [Malagasy-French dictionary]. Paris: Éditions Maritimes et d’Outre-Mer. 1st edn. Tananarive.
Abinal, A. & Malzac, V. 1973. Dictionnaire français-malgache [French-Malagasy dictionary]. Paris: Éditions Maritimes et d’Outre-Mer.
Adelaar, A. 1989. Malay influence on Malagasy: linguistic and culture-historical inferences. Oceanic Linguistics 28(1):1–46.
Adelaar, A. 1994a. Malay and Javanese loanwords in Malagasy, Tagalog and Siraya (Formosa). Bijdragen tot de Taal-, Land- en Volkenkunde 150(1):49–64.
Adelaar, A. 1994b. The classification of the Tamanic languages (West Kalimantan). In Dutton, T. & Tryon, D. (eds.), Contact-induced language change in the Austronesian-speaking area, 1–41. Berlin: Mouton de Gruyter.
Adelaar, A. 1995a. Asian roots of the Malagasy: A linguistic perspective. Bijdragen tot de Taal-, Land- en Volkenkunde 151(3):325–357.
Adelaar, A. 1995b. Bentuk pinjaman bahasa Melayu dan Jawa di Malagasi. In Hussain, Ismail & Deraman, A. Aziz & Al Ahmadi, Abd. Rahman (eds.), Tamadun Melayu: Jilid Pertama, 21–40. Kuala Lumpur: Dewan Bahasa dan Pustaka.
Adelaar, A. 1996. Malay culture-history: Some linguistic evidence. In Reade, Julian E. (ed.), The Indian Ocean in Antiquity, 487–500. London: Kegan Paul.
Adelaar, A. In press. The amalgamation of Malagasy. In Bowden, J. & Himmelmann, N. P. (eds.), Festschrift for Andrew Pawley. Canberra: Pacific Linguistics.
Adelaar, Alexander. 2007. Language contact in the Austronesian Far West. London: School of Oriental and African Studies. Paper presented at the 3rd Conference on Austronesian Languages and Linguistics, 21st–22nd September.
Adelaar, Alexander. 2009. Towards an integrated theory about the Indonesian migrations to Madagascar. In Peiros, I. & Peregrine, P. & Feldman, M. (eds.), Ancient Human Migrations: Integrative approaches to complex processes, 149–172. Salt Lake City: Utah University Press.
Ahmed-Chamanga, Mohamed. 1992. Lexique comorien (shindzuani) français. Paris: L’Harmattan.
Ahmed-Chamanga, Mohamed. 1997. Dictionnaire français-comorien (dialecte shindzuani). Paris: L’Harmattan.
Beaujard, Philippe. 1998a. Le parler secret arabico-malgache du sud-est de Madagascar: Recherches étymologiques. Paris: L’Harmattan.
Beaujard, Philippe. 1998b. Dictionnaire malgache-français: Dialecte tanala, sud-est de Madagascar. Avec recherches étymologiques. Paris/Montréal: L’Harmattan.
Bergenholtz, H. 1991. Rakibolana Malagasy-Alema: Madagassisch-Deutsches Wörterbuch. Moers: Aragon.
Bernard-Thierry, Solange. 1959. À propos des emprunts sanskrits en malgache [About the Sanskrit loans in Malagasy]. Journal Asiatique 247(3):311–348.
Blust, Robert A. 1999. Subgrouping, circularity and extinction: some issues in Austronesian comparative linguistics. In Zeitoun, Elizabeth & Li, Paul Jen-kuei (eds.), Selected Papers from the Eighth International Conference on Austronesian Linguistics (Symposium Series of the Institute of Linguistics 1), 31–94. Taipei: Institute of Linguistics, Academia Sinica.
Dahl, Otto Christian. 1951. Malgache et maanyan: Une comparaison linguistique. (Avhandlinger utgitt av Instituttet 3). Oslo: Egede Instituttet.
Dahl, Otto Christian. 1954. Le substrat bantou en malgache [The Bantu substrate in Malagasy]. Norsk Tidsskrift for Sprogvidenskap 17:325–362.
Dahl, Otto Christian. 1977. La subdivision de la famille barito et la place du malgache. Acta Orientalia 38:77–134. Copenhagen.
Dahl, Otto Christian. 1983. Sorabe révélant l’évolution du dialecte antemoro. Antananarivo: Trano Printy Loterana.
Dahl, Otto Christian. 1988. Bantu substratum in Malagasy. Études Océan Indien 9:91–132. Paris: Institut National des Langues et Cultures Orientales.
de Houtman, Frederick. 1603. Spraeck ende woordboeck, in de Maleysche ende Madagaskarsche talen. Amsterdam.
Deschamps, H. 1960. Histoire de Madagascar [The history of Madagascar]. Paris: Editions Berger-Levrault.
Dewar, Robert E. & Wright, Henry T. 1993. The culture history of Madagascar. Journal of World Prehistory 7:417–466.
Dez, Jacques. 1964. La malgachisation des emprunts aux langues européennes. In Annales de l’Université de Madagascar (Série Lettres et Sciences Humaines 3), 19–46.
Dez, Jacques. 1965. Lexique des mots européens malgachisés. In Annales de l’Université de Madagascar (Série Lettres et Sciences Humaines 4), 63–86.
Dez, Jacques. 1967. De l’influence arabe à Madagascar à l’aide de faits de linguistique. Taloha & Revue de Madagascar 34–37:1–20.
Faublée, Jacques. 1983. Mémoire spécial du Centre d’études sur le monde arabe et du Centre d’études sur l’océan Indien occidental. 21–30. Paris: INALCO & Conseil international de la langue française.
Gericke, J. F. C. & Roorda, T. 1901. Javaansch-Nederlandsch handwoordenboek [Javanese-Dutch concise dictionary]. Revised and expanded by A. C. Vreede. Amsterdam: Muller.
Gonda, Jan. 1973. Sanskrit in Indonesia. New Delhi: International Academy of Indian Culture.
Gueunier, Noel J. 1986. Lexique du dialecte malgache de Mayotte (Comores). Études Océan Indien 7 (numéro spécial). Paris: INALCO.
Gueunier, Noel J. 1994. Les chemins de l’Islam à Madagascar. Paris: L’Harmattan.
Hapip, Abdul Jebar. 2006. Kamus Banjar-Indonesia. Banjarmasin: PT. Grafika Wangi Kalimantan.
Lafon, Michel. 1991. Lexique français-shingazija. Paris: L’Harmattan.
Nurse, Derek & Hinnebusch, Thomas J. 1993. Swahili and Sabaki: A linguistic history. (Publications in Linguistics 121). Edited by Hinnebusch, Th. J. with a special addendum by Philippson, Gérard. Berkeley: University of California Press.
Pelras, Christian. 1996. The Bugis. (The Peoples of South-East Asia and the Pacific Series). Oxford: Blackwell.
Rajaonarimanana, Narivelo. 1995. Dictionnaire du malgache contemporain [Dictionary of Contemporary Malagasy]. Paris: Karthala.
Rajaonarimanana, Narivelo. 2001. Grammaire moderne de la langue malgache [Modern grammar of the Malagasy language]. (Langues INALCO). Paris: Langues et Mondes L’Asiathèque.
Ratsimandresy, Lucette. 2003. L’emprunt du malgache à l’anglais: Cas des emprunts “intégrés” dans le malgache contemporain. Études Océan Indien 35–36:309–330.
Sacleux, Ch. 1939. Dictionnaire swahili-français [Swahili-French dictionary]. (Travaux et Mémoires de l’Institut d’Ethnologie 36–37). Paris: Musée de l’Homme.
Webber, J. 1853. Dictionnaire malgache-français [Malagasy-French dictionary]. Île Bourbon: Établissement Malgache de Notre-Dame de la Ressource.
Webber, J. 1855. Dictionnaire français-malgache [French-Malagasy dictionary]. Île Bourbon: Établissement Malgache de Notre-Dame de la Ressource.
Wehr, Hans. 1994. A Dictionary of Modern Written Arabic (Arabic-English). 4th edn. Cowan, J. M. (ed.). Ithaca, NY: Spoken Language Services, Inc.
Wilkinson, R. J. 1959. A Malay-English dictionary. London: Macmillan.
Zoetmulder, P. J. with Robson, S. O. 1982. Old-Javanese-English dictionary. ’s-Gravenhage: Martinus Nijhoff.


Loanword Appendix

Arabic
salàma: healthy
karàma: wages
alahàdy: Sunday
alatsinàiny: Monday
talàta: Tuesday
alarobìa: Wednesday
alakamìsy: Thursday
zomà: Friday
asabòtsy: Saturday
kabàry: speech, public discourse
taratàsy: paper, letter

Banjarese
kàmbana: twins
làmbo: boar
mànta: raw, unripe
ma-làma: smooth, slippery

Bantu
òmby: ox, cow, bull, bovine
òndry: sheep
ampòndra: donkey
papàngo: a large bird of prey, kind of kite (milan doré, Milvus Aegyptius)
ambùa: dog
kòngona: tick; louse
lòlo: butterfly
màmba: crocodile
màso: eye
kiràro: shoe
akòndro: banana
ampèmba: millet or sorghum
làsa: to leave, to disappear
adàla: mad, insane
vahìny: stranger, guest, traveler
vazàha: stranger, foreigner (especially a European)
mosikirìny: mosque

Comorian
akòho: chicken
jàko: monkey
rajàko: monkey
nòfo: flesh
mòfo: bread, cake
akànjo: clothing, clothes

Dutch
bàsy: gun, rifle

English
gìsa: goose
dòkotra: duck
bìtro: rabbit
rabìtro: rabbit
pòizina: poison
kòpy: cup, bowl
hàolo: awl
harèza: razor
karipèta: rug
adisànina: bill
sokèra: square
bàolina: ball
aotra: zero
òra: hour
sekòly: school
pènina: pen
bòky: book
tròmpetra: trumpet
tempòly: temple (Protestant liturgical term)
àlitara: altar
hèlo: hell
demòny: demon
mìlina: machine
pìlina: pill
naòmba: number
saikòro: screw
sokodrèvo: screwdriver

French
kontinènta: mainland, continent
samònta: high tide
saribào: charcoal
kisòa: pig
soavàly: horse
ramolè: mule
ganagàna: duck
kanakàna: duck
lìona: lion
orsa: bear
serfa: deer
elefànta: elephant
dokotèra: physician
lafàoro: oven, stove
kafitèra: kettle, coffee pot
lapoàly: pan
tàsy: saucer
foròsety: fork
lafarìnina: flour
saosìsy: sausage
lasòpy: soup
olìva: olive
diloìlo: oil
dipoàvatra: pepper
foromàzy: cheese
dibèra: butter
divày: wine
labièra: beer
lasòa: silk
kapàoty: cloak, coat
ròbo: (woman’s) dress
zipò: skirt
patalòha: trousers
ba, bankiràro: sock, stocking
bàoty: boot
pàosy: pocket
pàingotra: pin
mosàra: handkerchief
serivièta: towel
boròsy: brush
lakomàdina: ointment
savòny: soap
lakozìa: cookhouse

kadanà: padlock
sèza: chair
latàbatra: table
laboz`y: candle
birìky: brick
zaridaina: garden
lapèlina: shovel
hàzo kèsika: pine
pìpa: pipe
eriminèty: adze
marotò: hammer
lakàoly, lakòly: glue
sarèty: cart/wagon
jìoga: yoke
zèrô: zero
makorèlina: prostitute
sàbatra: sword
lalàna: law
eglìzy: church
aotèly: altar
radìô: radio
telefaònina: telephone
bisikilèty: bicycle
mobilèty: motorcycle
aotobìsy: bus
elektrisitè: electricity
baterìa: battery
motèra: motor
hopitàly: hospital
governemànta: government
minìsitra: minister
pôlìsy: police
adirèsy: address
paòsitra: post/mail
kàratra paostàly: postcard
bànky: bank (financial institution)
raobinè: tap
lavabò: sink
boàty: tin/can
bômbô: candy/sweets
plastìka: plastic
bàomba: bomb
sigàra: cigarette
gazèty: newspaper
alimanàka: calendar
mosìka: music
dite: tea
kafè: coffee

Javanese
nòsy: island
tàolana: bone
mamàtotra: to bind, tie
mòra: cheap; easy; slow
vazantàny: cardinal points, borders of the country

Malay
fàsika: sand
farìhy: lake
tànjona: cape; aim, objective
òny: river
rìana: waterfall
hàzo: wood, tree
vàratra: thunder
rìvotra: air, wind
tarànaka: descendants
vòrona: bird
vàno: heron
tròzona: whale
hàla: spider
fanènitra: wasp
valàla: grasshopper
sìfotra: snail
fàno: turtle
tàndroka: horn
lamòsina: back
sòfina: ear
mòlotra: lip; edge
hìhy: gums
sàndry: arm
tànana: hand, sleeve
rànjo: leg
tràtra: chest
tsìngy: vulva
fàsana: grave
bòntsina: swelling
manasìtrana: to cure
sòla: bald
màsaka: cooked, ripe
fìnga: bowl
sòtro: spoon
sìra: salt
tòaka: alcohol, spirits in general
tràno: house
ràntsana: branch
àsa: work
mitàrika: to pull, to lead
landàizana: anvil
fìraka: lead
hàrona: basket
mamàdika: to turn
misàmbotra: to catch, to borrow
mitsìoka: to blow
milèfa: to flee
manàraka: to follow, next
mitòndra: to carry, to bring, to drive
milànja: to carry on shoulder
sàmbo: ship, boat
hamòry: rudder
salàzana: mast; grid for grilling or drying; fork for grilling meat
mahìhitra: stingy
mitròsa: to owe
tròsa: debt
sìsa: remains
mampisàraka: to separate
mandràkotra: to cover
avàratra: North; direction of the wet monsoon (blowing from the North in Madagascar)
mitòmbo: to grow
zòro: corner
arìvo: a thousand
sahàza: enough, sufficient, appropriate
fèno: full
màsina (1): salty, holy
matsàtso: brackish, tasteless
fòtsy: white
maìtso: green
mavèsatra: heavy

manìry: to want
mamèla: to forgive, to permit
tsàra: good
fisalasàlana: doubt
mambàdika: to betray
mòra: easy
satrìa: because
tsy: no, not
mibitsibìtsika: to whisper
mitantàra: to tell
manòratra: to write
sòdina: flute
tanàna: town, village
fàritra: boundary, border
mpanjàka: king, queen
sakàiza: friend
misàkana: to prevent
tàfika: army
mihàza: to hunt
mitsàra: to adjudicate
manjàry: to become

Portuguese
ampingahàratra: gun

South Sulawesi
vàdy: husband, wife, pair
mandèha: to walk, to go
ivòho: behind
atsìmo: South; dry monsoon (blowing from the South in Madagascar)
malòto: dirty
andrìana: noble

Unknown origin
kitày: firewood
bolòky: parrot
mosàry: famine
paràky: tobacco
mìsy: to be, with

Chapter 29

Loanwords in Takia, an Oceanic language of Papua New Guinea*

Malcolm Ross

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as follows: Ross, Malcolm. 2009. Takia vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1125 entries.

1. The language and its speakers

Takia is an Oceanic Austronesian language spoken by about 25,000 people in the Madang Province of Papua New Guinea. Most speakers live on two volcanic islands, Karkar and Bagabag, twenty kilometers off the north coast of the New Guinea mainland, and in one village, Megiar, on the mainland coast. Takia speakers occupy the southern half of Karkar; the northern half is inhabited by the Waskia, who speak a Papuan language of the Trans New Guinea family. A number of speakers also live in Madang, the nearest mainland town, and scattered in towns and mining settlements across the country.

Takia is one of perhaps six hundred vernaculars spoken in Papua New Guinea. Today all its speakers are bilingual in Tok Pisin (New Guinea Pidgin), the national lingua franca. Some speakers also speak English, but most of these live away from the Takia-speaking communities in the Madang Province.

Takia speakers on the islands and at Megiar are subsistence farmers, living in villages which vary in size from hamlets of perhaps 20–30 people to villages with over 1,000 inhabitants. Until very recently, houses continued to be constructed from bush materials, but in the last decade or so men who have worked on the mainland have started building larger, Western-style houses. Traditional livelihood depended on the subsistence farming of taro, yam, sweet potatoes and some minor crops in the island’s fertile volcanic soil, and keeping a few pigs and chickens which often roam free through the village and surrounding rain forest. Pigs are slaughtered only for celebrations. This pattern has not changed greatly, but some villagers grow cash crops in the form of copra and cocoa, which are pre-processed locally, then sent to the mainland for sale.

Transport to the mainland continues to be a problem. Individuals on the western side of Karkar can travel to the nearest point on the mainland by speedboat before dawn, when the sea is calm, if the weather allows. Otherwise, transport is by small copra boats, which take about three hours to reach the mainland town of Madang from a wharf on the southern coast of


Karkar. The problem of transport largely prevents Karkar Islanders from augmenting their incomes as many Papua New Guineans do, namely by selling the excess from their subsistence gardens in the nearest town market.

Takia belongs to the Bel family, a small group of Oceanic languages whose interrelationships are shown in Figure 1. Takia is the westernmost of these languages. Megiar (a Takia dialect) and Matugar are spoken on the nearby mainland coast. Gedaged is spoken on tiny islands around and to the north of Madang, Bilibil is spoken in a village on the edge of Madang township (it was once spoken on a nearby offshore island), Dami is spoken inland from Madang, and the other languages are spoken on the Rai Coast southeast of Madang. All these languages are in regular contact with speakers of Papuan languages of the Madang subgroup of the large Trans New Guinea family, the languages of which are spoken across much of mainland New Guinea.

According to oral tradition, the ancestors of Bel speakers lived on Yomba Island, which disappeared beneath the waves sometime in the late sixteenth or early seventeenth century in one of the seismic events which regularly visit the area (Mennis 2006). From there they fled to roughly their present locations on the mainland and, I assume, diversification into the present-day Bel languages has occurred since that time. A Takia oral history reports that 10 generations before the 1960s a large tsunami killed most of Karkar Island’s inhabitants, and the island was then occupied by the ancestors of the Takia (McSwain 1977: 24). This appears to postdate the disappearance of Yomba by a generation or two (Ross 2008).

Figure 1: A genealogy of the Bel languages [tree diagram not reproduced; node labels: Proto-Bel; Western Bel, Eastern Bel; Megiar-Takia, Bing-Wab; Megiar, Takia, Bilibil, Gedaged, Matugar, Mindiri, Dami, Bing, Wab]

The Bel maintained their traditional ties and kin relationships at least until European contact in the late nineteenth century. A major maintenance mechanism was the Bilibil trade network (Mennis 2006). Bilibil Island is near Madang, and fine pots were made there. The Bilibil people constructed large sailing canoes and sailed seasonally to all the Bel villages to trade, as well as to many Papuan-speaking villages to the east, where they interacted with the famous Vitiaz Strait trading network (Harding 1967). The Takia apparently built only smaller canoes for fishing and to cross the strait to Megiar village on the mainland. However, crossing the

29. Loanwords in Takia

strait was banned by the Australian administration sometime in the 1920s or 1930s, with the result that Karkar and Bagabag Islands became surprisingly isolated from the mainland, a situation which persists for many Takia speakers. The Takia recognize a division of their language into two main dialect groups, coastal and inland. The coastal dialects, spoken in villages on and near the coast on Karkar Island and by all speakers on Bagabag Island and at Megiar, have a phonemic distinction between /l/ and /r/, whereas the dialects spoken in inland Karkar villages have merged these two phonemes as /l/. There are also numerous other minor dialect distinctions which speakers are less aware of. This dialect division is emblematic of a division of the island’s inhabitants into two lifestyles, coastal and inland. The difference between the two must have been greater in the past, when the inhabitants of coastal villages regularly put to sea and caught fish and turtles. Today there is strikingly little fishing activity in the coastal villages that I am familiar with. Inland villages lie several kilometers inland on the lower slopes of the volcanic cone. Karkar erupts from time to time, but the inner, live crater is within a much larger caldera, and the islanders are thereby protected from the worst effects of most eruptions. They benefit, however, from the fertility of its volcanic soil. The data used in this study are from the Rigen village, two kilometers inland from the southernmost point of Karkar Island. The dialect is coastal, and Rigen has close kin ties with coastal villages, suggesting that it represents a fairly recent inland settlement. The lifestyle, however, is more like that of an inland village. Until a water tank was installed, the women of the village walked down to the coast daily to collect water, as there is no stream nearby. The history of Takia since European contact is known to us, at least in outline. 
German Lutheran missionaries were active on Karkar from 1888 to 1895, but made little impact. A second group arrived in 1912, bringing with them Samoan evangelists, and were more successful. Until recently, all Karkar Islanders, both Takia and Waskia, were fiercely loyal Lutherans, with the exception of one Catholic Takia-speaking village. The German missionaries were forced out by Germany’s wartime defeat in 1918 (Wagner & Hermann 1986: 109–110, 127–129). However, after the transfer of New Guinea (the northern half of modern mainland Papua New Guinea) to Australian sovereignty, German and Australian Lutheran missionaries continued their activities, and in 1935 Gedaged (or Graged), another Bel language (described in a manuscript grammar by Dempwolff (n.d.)), was adopted as a lingua franca by the mission and was used extensively in education and in church activities. The Lutheran mission station at Narer, in the Waskia-speaking area of the island, was a major centre for mission education, and numerous young men learned Gedaged and acquired reading and writing skills. Because Gedaged is closely related to Takia, it was easy for Takia speakers to learn it. It is not entirely clear how much influence Gedaged has had on Takia by way of loanwords because lexical items in the two languages quite often have near-identical forms. Gedaged education continued until 1962, when the mission largely replaced Gedaged with the national lingua franca Tok Pisin (Freyberg 1977; Ross 1996b). Tok Pisin had doubtless already made inroads on Karkar before the Second World War, however, and was already spoken

Malcolm Ross!

by most of the population in 1962. Today all Takia speakers are bilingual in Tok Pisin. Recent years have seen a reduction in the influence of the Lutheran Church on Karkar, as more and more islanders have spent time on the mainland for education or to earn a living and have brought a new-found faith home with them. As a result a number of smaller denominations have become established on the island, but it is too soon to say what impact, if any, they will have on Takia. Only one borrowing on the list (‘to fast’) is attributable to this last period.

Map 1: Geographical setting of Takia

The prehistoric period effectively lasted until 1888, when the first missionary set foot on the island. A fair amount is known about Takia’s long prehistory, thanks to the detailed work that has been carried out in reconstructing the history of Oceanic Austronesian languages and thanks also to the fact that correlating linguistic and archaeological reconstructions of prehistory has proven to be fruitful. Austronesian speakers reached the Bismarck Archipelago of Melanesia around 1400 BCE, where they were associated with the Lapita neolithic cultural complex. The language of the Lapita culture was Proto Oceanic, ancestor of all the languages of the large Oceanic subgroup, which includes all the Austronesian languages of mainland New Guinea except those in its extreme west, of northwest Island Melanesia (the

Bismarck Archipelago, Bougainville and the Solomon Islands), and of Remote Oceania (Vanuatu, New Caledonia, Fiji, Micronesia and Polynesia). The Proto-Oceanic-speaking community broke up sometime around 1200 BCE with the eastward migrations into the Pacific and the settlement of the Admiralty Islands, leaving behind the Western Oceanic dialect network, probably limited to New Britain and New Ireland and their offshore islands. There were further migrations both eastward and into southeast and central Papua. Somewhat later, sometime between 300 and 500 CE, came the re-settlement of the islands in the Vitiaz Strait, between New Britain and New Guinea, followed by the gradual establishment of toeholds along the north coast of New Guinea and its offshore islands. Bilibil-style pottery first appears around 1000 CE, along with the establishment of trade networks mentioned earlier. These networks were more or less coterminous with the extent of the Ngero-Vitiaz network: Proto Ngero-Vitiaz is the language ancestral to Proto Bel and the Oceanic languages of the Vitiaz Strait.

2. Sources of data

Much of the Takia wordlist was collected during two periods of fieldwork at Rigen in 1987 and 1988. The Loanword Typology (LWT) meaning list is based on the Intercontinental Dictionary Series list, which was used as the basis of the lists in Tryon (1995), and the original Takia list was published in that work. That list was provided largely by Mait Kilil, born around 1956. He was, and remains, a wonderful informant, as he has a considerable knowledge of Takia traditional culture, a good understanding of recent history, and is fluent in Takia, Tok Pisin and English. Without his help this work would probably have been impossible, and I am immensely grateful to him. The original recording of the list was accompanied by extensive discussion of issues relating to the appropriateness of the translations, and has since been carefully checked in the light of my own work and particularly of the Shoebox file containing the dictionary compiled by Bruce Waters of the Summer Institute of Linguistics in the 1990s.

In 2006, Mait Kilil re-checked the original list and added the new items required by the Loanword Typology meaning list. I sent him a printout of the list, together with the newly provided meaning descriptions and typical contexts and a set of instructions which included a request to avoid nonce borrowings. He returned the completed list, together with accompanying notes, especially on cultural and historical matters. Although face-to-face discussion was not possible this time round, I am confident that the information provided with each item has minimized errors; the more so as many of the additions are borrowings from Tok Pisin, and their correctness is transparent.

The greatest difficulty in compiling the Takia subdatabase comes from the inadequacy of lexical sources in most other Bel languages and in neighboring Papuan languages of the Trans New Guinea family.
The best Bel source is Mager’s (1952) dictionary of Gedaged, which also contains cognates from other Bel languages and

allows one to reconstruct Proto Western Bel and Proto Bel etyma. Otherwise I had only rather short wordlists (300 words) that I collected in the late 1970s. However, thanks to the extensive work that has been done on the lexical reconstruction of Proto Oceanic and later Oceanic interstage languages (Ross et al. 1998, 2003), it is often possible to determine the forebears of Takia lexical items at a considerable time depth. This in some measure compensates for the absence of more detailed Bel sources. Among neighboring Papuan languages, I had a vocabulary that I had collected in the 1970s for Waskia (Ross with Paol 1978) and a small dictionary produced by the Summer Institute of Linguistics (Barker & Lee 1985). There is also a small dictionary for Bargam (Hepner 2007). For other Papuan languages on the nearby mainland only the 370-word lists collected by Z’graggen (1980) are available.

3. Contact situations

Contact situations coincide roughly with periods in the history of Takia, and are discussed below in chronological order. The periodization is given in (1) for convenience of reference.

(1)  Pre-Oceanic                          before 1400 BCE
     Early Oceanic                        roughly 1400 to 1200 BCE
     Western Oceanic dialect network      roughly 1200 to 900 BCE
     New Guinea Oceanic dialect network   roughly 900 to 200 BCE
     North New Guinea dialect network     roughly 200 BCE to between 300 and 500 CE
     Ngero-Vitiaz dialect network         from between 300 and 500 CE to around 1000 CE
     Bel                                  roughly 1000 to 1680 CE
     Early Takia period                   from about 1680 to 1912 CE
     Early missionary period              1912–1918
     Gedaged Schools period               1935–1962
     Modern period                        since 1935

3.1. Pre-Oceanic

No loanwords can be identified as dating from the pre-Oceanic period. However, apparently at the time around 1400 BCE when Austronesian speakers of the language ancestral to Proto Oceanic were making their way eastward along the north coast of New Guinea to the future Proto Oceanic homeland in the Bismarck Archipelago, a number of words were borrowed from Austronesian into Papuan languages ancestral to members of the present-day Madang family (Ross 1988: 21). Evidence for this borrowing source is that some of the borrowed items display final consonants which were retained in Proto Oceanic but later lost from the Ngero-Vitiaz languages, including Takia and the other Bel languages. Curiously some of these items have been borrowed back into Takia. Their forms still witness to their Austronesian origin, but are sufficiently irregular to show that they are not directly inherited. (Reconstructions are for Proto Malayo-Polynesian (PMP) and Proto Eastern Malayo-Polynesian (PEMP), Austronesian interstages of a higher order than Proto Oceanic. “Expected Takia forms” are the forms that the words would be expected to take if they had been directly inherited from Proto Oceanic.)

(2)  Takia   Gloss         Waskia   Reconstructed form   (Expected Takia form)
     madar   ‘bandicoot’   madar    PMP *mansar          *mad
     goub    ‘rat’         kaup     PEMP *kazupay        *ayup
     peik    ‘skink’       ..       PMP *bayawak         *feo

The items borrowed by Papuan languages from pre-Oceanic Austronesian include several animals. Among them were *boRok ‘pig’ and *kasuari ‘cassowary’ (Ross 1988: 21), as well as the three above. Austronesian speakers may well have brought the pig and the house rat with them, but not the bandicoot and the cassowary.

3.2. Early Oceanic contact with Papuan speakers

On mainland New Guinea, Proto Oceanic speakers encountered taro cultivators, speakers of Papuan languages of the Trans New Guinea (TNG) family. It is thus none too surprising that the one borrowing into Proto Oceanic for which I have evidence and which is reflected in Takia is mao ‘taro, Colocasia esculenta’. This reflects Proto Oceanic *mʷapo(q), which was borrowed from an unknown mainland New Guinea language. The evidence for this is that (1) the term is not reflected in any non-Oceanic Austronesian language and (2) terms of similar form are found in Papuan languages scattered across the New Guinea mainland (Hays 2005: 642–643). Clearly, if one Proto Oceanic word was borrowed, then others almost certainly were too, but they have yet to be identified.

3.3. From Proto Western Oceanic to Proto Bel

There is a lengthy period of time, from the break-up of Proto Oceanic around 1200 BCE to the break-up of Proto Bel, probably in the early 17th century, for which I have no evidence of borrowings. Sixty-eight items in the subdatabase reflect reconstructed Proto Bel etyma. There are far fewer etyma for each of the earlier interstages from Proto Western Oceanic to Proto Ngero-Vitiaz, and with regard to borrowing, all that these reconstructions tell us is that the item has not been borrowed in the period since the relevant interstage. It is possible, for example, that some of the Proto Bel etyma were borrowed from neighboring Papuan languages. At present, insufficient comparative data on the Papuan languages of the region (and no Papuan reconstructions) are available to allow this possibility to be assessed. On the other hand, if in the future a cognate of a reconstructed item is found outside the subgroup of languages on which a reconstruction is based, then the item is reconstructable at an earlier interstage, and the date since which the item has been in the language is correspondingly pushed back in time.

3.4. Early Takia borrowings from neighboring Papuan languages

Two languages have been identified as Papuan loan sources: Bargam, spoken in mainland villages just inland of Megiar, and Waskia, spoken throughout the northern half of Karkar Island. These loans apparently occurred during what I label the Early Takia period, from the break-up of Proto Bel to the arrival of the missionaries. In the earliest part of this period, the ancestors of the Takia had not yet occupied southern Karkar, and their language was effectively still Proto Western Bel, i.e. it had not yet diversified into Takia, Megiar, Bilibil and Gedaged. There is very thin evidence of borrowings like Proto Western Bel *wail ‘thunder, lightning’ at this time. Most (and all Waskia) borrowings, however, are absent from Mager’s Gedaged dictionary, implying that they postdate the separation of Takia and Gedaged. It would be useful to have a corresponding wordlist in Bilibil, but these data are not available. I have tended to assume in earlier publications that the main source for the Papuanization of Takia grammar was its current neighbor, Waskia, although I have pointed out that this cannot be proven, since Papuanization consists in the borrowing of grammatical patterns, not of forms. It is surprising that Takia has more loans from Bargam than from Waskia. This may to some degree reflect the fact that the Bargam dictionary is larger than the Waskia, and also the fact that the Waskia dictionary was produced for indigenous consumption, among other things to help Waskia speakers learn English. As a result, it lacks words for a good many meanings that occur in the Takia subdatabase and the dictionary. In spite of this difference between sources, however, I am confident that modern Takia reflects more loans from Bargam than from Waskia. This is intriguing, given that the Takia evidently settled on Karkar quite early in the Early Takia period. 
Either there was strong bilingualism in Takia and Bargam before settlement on Karkar, or strong ties across the strait continued after settlement. It is also possible that the Takia and Waskia had much less to do with each other during much of the Early Takia period than they do today. Populations must have been much smaller, with Waskia in the north of the island and Takia in the south and perhaps scant contact between them until population expansion brought some Takia and Waskia villages into close proximity. Takia speakers are generally bilingual in neither Bargam nor Waskia today: they communicate in Tok Pisin.

A number of authors talk about the system of ‘trade partners’ (Harding 1967, McSwain 1977, Mennis 2006) whereby a man in, say, a Takia village would have a special trading relationship with a man in each of a number of other villages, and these trade partnerships provided the network through which trade occurred. Although Takia–Waskia trade partnerships were strong in the 20th century, it is possible that Takia–Bargam partnerships were strong before the cessation of frequent crossings to the mainland. It is significant, I suspect, that borrowings from Bargam represent replacements of existing words for creatures and things that were already present in the environment of Takia speakers (‘fowl’, ‘giant fruit bat’, ‘sandfly’, ‘sky’, ‘moon’, ‘earthquake’, ‘tide’, ‘foam’ and possibly ‘water’), body-parts (‘heart’, ‘wing’, ‘finger/toe’, ‘fish scale’) and two common properties (‘hard’, ‘dry’). This implies that Takia speakers were thoroughly bilingual in Bargam. Borrowings from Waskia are not of this type. They include a synonym for ‘dew’; the introduced house rat, which Takia speakers may have encountered for the first time on Karkar; and words for types of netbag and basket and a frog species, which may also have been new to them.

3.5. Early missionary period

This is the brief period of effective German presence from 1912 to 1918. Because of the brevity of the period and the replacement of German by other “foreign” languages since, very few German loans remain in Takia. Indeed, it is not entirely clear what language the missionaries used. I suspect that it was an early form of Tok Pisin, and that German words were co-opted for items for which there was then no Tok Pisin term. Hence soken ‘socks’ from German Socken and ades ‘hell’ from German Hades.

3.6. Gedaged schools period

From 1935 to 1962 there was widespread missionary instruction in the Gedaged language through a network of mission schools. Gedaged, a closely related Bel language, became the lingua franca of the Madang Lutheran church. It was the language used in church schools (of which there was one on Karkar in the late 1930s) and often for preaching (indeed, Gedaged sermons could still be occasionally heard in the 1980s). However, the lexicons of Gedaged and Takia are so similar that borrowings from Gedaged are very hard to detect, and I have relied on informant opinions in this regard.

3.7. Tok Pisin and English

Tok Pisin is one of a family of English-based pidgins (others are Solomons Pijin, Vanuatu Bislama and Torres Strait Broken) which evolved from Pacific Pidgin. The latter probably became stabilized on plantations established by the colonial powers in Samoa, Queensland, Fiji and New Caledonia during the period from about 1865 to about 1890 (Clark 1979; Mühlhäusler 1979; Keesing 1988; Goulden 1990), but its origins can be traced back to early nineteenth-century Sydney, whence it was carried across the Pacific on trading vessels (Tryon & Charpentier 2004). Sydney Pidgin, in turn, had grown out of the (probably unstable) nautical jargon which already existed in the Atlantic, with a history stretching back ultimately to the fifteenth-century Mediterranean (Todd 1974: 32–39). This convoluted history explains the origin of odd Takia words like patou ‘duck’ and klabus ‘prison’, ultimately from Portuguese pato ‘duck’ and calabouço ‘prison cell’ respectively, and dating back to the Mediterranean period.

Tok Pisin’s “New Guinea period” began with the return of Pacific Pidgin-speaking laborers to the Rabaul area around 1890, and this accounts for a number of Tok Pisin words that originate in the languages of that area (Ross 1992): Takia matmat ‘cemetery’ ultimately from Ramoaaina (Duke of York Islands), Takia muli ‘citrus fruit’ and baira ‘hoe’ from Tolai, and Takia balus ‘aircraft’ from southern New Ireland balus ‘dove’. But the vast majority of Tok Pisin words borrowed into Takia have English as their earliest known source language (most of those in the wordlist; this is obviously a shortcut, in the sense that the English words themselves have a history). It is often difficult to know at what stage these words entered Tok Pisin or its forebears: the Atlantic period (18th century), the Sydney period (early 19th century), the Pacific Pidgin English period (late 19th century) or the New Guinea period (since 1890).
Happily, this is of no particular concern for the history of borrowings in Takia. Whilst it is difficult to measure the growth of Tok Pisin use on Karkar, it is a reasonable inference that there were already speakers before the Second World War, and that there had been a significant expansion by 1962, when the mission gave up using Gedaged. Today every Takia speaker is bilingual in Tok Pisin, and this makes it difficult to distinguish between stable and nonce borrowings. However, I asked my informant to make this distinction, and it is my impression that he has been successful. A large majority of the Tok Pisin borrowings in the wordlist represent insertions into the lexicon – words for items that were not present or known before contact with the western world. They include items associated with the environment (sno ‘snow’, eis ‘ice’), animals (elefan ‘elephant’, hanibi ‘honey bee’), the tools of food preparation (stop ‘oven’, praipan ‘frying pan’), foods and beverages (bret ‘bread’, bia ‘beer’), and meals (brakpas ‘breakfast’), western clothing (dres ‘dress’, kola ‘collar’), western housing (windua ‘window’, bet ‘bed’) and the materials used in it (brik ‘brick’, aen ‘iron’, bol ‘lead’), and terms associated with nationhood and government (gavman ‘government’, kwin ‘queen’, eleksin ‘election’). In the late 1970s law reforms led to a system of village courts. Meeting on Tuesdays and Thursdays, they have

inserted a new terminology into Takia: compleina ‘plaintiff’, inosen ‘innocent’, kotfain ‘fine’. Mixed in with these items are a number which I have assumed to be direct borrowings from English. These include numbers above ten, as until recently these were generated analytically (tenpela tu ‘12’). They also include items denoting objects that have never become a part of the Takia environment but which are known from Bible teaching. These include animals (kamel, laion), cultivars (fig, oliv, baliwit ‘barley’) and objects associated with Judaism (tempel, altar). A recent addition is fastiŋ, introduced by Pentecostal churches in the last decade. Whether my assessment that these are directly from English is correct is difficult to check. Indeed, it can be argued that the boundary between the lexicons of Tok Pisin and Papua New Guinea English is becoming increasingly fuzzy, so that they together form a lexical stock from which vernaculars may draw words for new items without concern for their precise source.

4. Numbers and kinds of loanwords

Materials for evaluating borrowings are of two kinds. First, there are materials which allow us to reconstruct direct inheritance back to a particular stage in linguistic prehistory. As far as borrowing is concerned, this is at best negative evidence: we know that the word has not been borrowed since the stage at which it is reconstructable, but we cannot know whether it had been borrowed earlier. Second, there are materials for potential source languages which enable us to identify borrowing sources.

There are plentiful lexical reconstructions for the stages of Austronesian that predate Proto Oceanic, and particularly for Proto Oceanic itself. These include the volumes published by the Oceanic Lexicon Project at the Australian National University (Pawley & Ross 1994; Ross et al. 1998, 2003) and ongoing work on three further volumes, together with reconstructions for interstages between Proto Oceanic and modern Takia, based mainly on the collated field notes that underpin Ross (1988). These materials allow the identification of words that were already present at a particular interstage, but I have made no attempt to distinguish the ages of words older than Proto Oceanic (around 1300 BCE), and in all cases except one (§3.2), I have classified them as showing “No evidence for borrowing”. Words which are not reconstructable at any Oceanic interstage but for which I can find no borrowing source are classified under “Very little evidence for borrowing”. The only evidence that they are borrowed is that no cognate has been found in another Oceanic language. This does not mean that a cognate will not be found in the future, nor, on the other hand, that more data for neighboring languages will not reveal a borrowing source.

I have identified German and Tok Pisin borrowings on the basis of my own knowledge of the two languages, Bargam borrowings through Hepner (2007) and Waskia borrowings through Ross with Paol (1978) and Barker & Lee (1985).

Unfortunately, these sources are not as extensive as I would like. This is particularly so for Bargam, and I suspect that quite a number of the items marked “Very little evidence for borrowing” may indeed have been borrowed from Bargam, as they cannot be sourced from elsewhere. Identifying Gedaged borrowings is very difficult and largely based on informants’ assertions (which may not be accurate): Takia and Gedaged are so closely related that the availability of a good Gedaged dictionary (Mager 1952) has been no great help.

Table 1: Loanwords in Takia by donor language and semantic word class (percentages)

                 Tok Pisin  Bargam  English  Gedaged  Waskia  German  Unidentified  Total loanwords  Nonloanwords
Nouns               31.0      2.5     1.3      0.9     0.9     0.5       0.6             37.7            62.3
Verbs                2.0       -      0.4      0.8      -       -         -               3.2            96.8
Adjectives           4.3      2.2      -        -       -       -         -               6.5            93.5
Adverbs               -        -       -        -       -       -         -               0.0           100.0
Function words      13.8       -      5.0       -       -       -         -              18.8            81.2
all words           21.0      1.7     1.2      0.7     0.6     0.3       0.4             25.9            74.1
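As a quick arithmetic check on the “all words” figures of Table 1, the following minimal sketch recomputes each donor's share of all loanwords. The assignment of values to donor columns follows the column order recoverable from the table headings and should be treated as an assumption; the names ALL_WORDS and share_of_loanwords are illustrative only.

```python
# "All words" row of Table 1: percentage of subdatabase word forms per donor.
# Assumption: the donor-to-value mapping follows the heading order of the table.
ALL_WORDS = {
    "Tok Pisin": 21.0,
    "Bargam": 1.7,
    "English": 1.2,
    "Gedaged": 0.7,
    "Waskia": 0.6,
    "German": 0.3,
    "Unidentified": 0.4,
}


def share_of_loanwords(donor, row=ALL_WORDS):
    """Return the donor's share (in percent) of all loanwords in the row."""
    total = sum(row.values())
    return 100.0 * row[donor] / total


# Total loanword percentage, rounded to one decimal as in the table.
total_loans = round(sum(ALL_WORDS.values()), 1)

# Share of loanwords that come from Tok Pisin, and from all other sources.
tok_pisin_share = share_of_loanwords("Tok Pisin")
other_share = 100.0 - tok_pisin_share

print(f"total loanwords: {total_loans}%")
print(f"Tok Pisin share of loanwords: {tok_pisin_share:.1f}%")
print(f"other sources' share of loanwords: {other_share:.1f}%")
```

Under these assumptions the donor columns sum to the “Total loanwords” figure, and Tok Pisin accounts for a little over four fifths of the loanwords, with all other sources together making up a little under one fifth.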

Table 1 presents percentages of loanwords by donor language and semantic word class. It must be read, however, in the light of the caveats in the paragraphs above. About 20% of the word forms in the Takia subdatabase, or over 80% of all identified loanwords, are borrowed from Tok Pisin, and only about 5% of all word forms, or 20% of identified loanwords, are borrowed from other sources. The figure may well be skewed quite heavily by unidentified loans. Although it is unlikely that I have missed loans from German or English, it is very likely that I have missed loans from Waskia, Bargam or other Papuan sources, and from Gedaged. In the case of Gedaged, this is because of the close relatedness of Gedaged and Takia. In the cases of Waskia, Bargam or other Papuan sources, it is because available dictionaries are small. There are many words for which I find no source, either of inheritance or borrowing, but on the basis of canonical form I suspect that a good proportion of these represent borrowings from Papuan sources. A majority of Proto Oceanic reconstructions have the form *C1V1C2V(C), which (in the absence of inalienable possessor suffixes) becomes Takia C1V1C2. Many unetymologized roots, however, have the form C(V)CVC or C(V)C(V)CVC which is more characteristic of loans from Bargam and Waskia. Hence research into Takia borrowings awaits more detailed study of coastal Papuan languages in the region where the Bel languages are spoken. Table 1 also provides loan figures for “semantic word classes”, i.e. for meanings that are most likely to be encoded as nouns, verbs and so on. If one reads across the totals for loanwords and non-loanwords, one sees that the semantic word class with the largest proportion of loanwords is nouns. Loanwords account for a staggering

37% of Takia nouns. The number of borrowed verbs and adjectives is small, and, apart from o ‘or’, borrowed “function words” are all numerals. Even the small number of “verbs” listed in Table 1 is misleading, as only one of these, -koy ‘row (a boat)’, is a verb in Takia.

Reasons for the predominance of nouns among borrowings are probably mainly circumstantial: nouns prototypically encode things, and most borrowed nouns encode things that were new to speakers at the time of borrowing. The exceptions are the Bargam nouns mentioned in §3.4, which were replacement innovations due to strong bilingualism. One would also expect to find borrowed verbs encoding event types that were new to speakers at the time of borrowing. There are very few of these, however, perhaps because few new event types have entered the lives of Takia speakers over time. The borrowed verbs that we do find concern money (dina -pani, lit. ‘give debt’, i.e. ‘owe’, from Tok Pisin dinau ‘debt’, itself from Fijian), measurement (skel -gane, lit. ‘do measure’, i.e. ‘weigh’, from Tok Pisin skel ‘scales’, itself from English scale), formal illocutionary acts like ‘promise’ (promis) and ‘acquit’ (dismis) and the religious act of fasting (fastiŋ). Clearly these are new event types. Often, however, an existing verb extends its meaning to include new acts. For example, the Christian activity of praying is encoded in Tok Pisin by beten ‘pray’ (from German), but this has not been borrowed into Takia. Instead -gdani is used: its pre-Christian meanings probably had to do with addressing ancestor and bush spirits.

It is tempting to argue that nouns are borrowed more easily than verbs because their morphological integration into Takia is simpler: a verb always has a subject cross-referencing prefix, and the verb stem is not transparently separable from the prefix. However, this argument does not hold up.
As the examples above show, Takia speakers do borrow ontological verbs, but they integrate them into Takia by combining them with an existing verb, often -pani ‘give’ or -gane ‘do’ (§5). Table 2 presents percentages of loanwords by donor language and semantic field. It reveals substantial variation in numbers of loanwords across semantic fields, summarized in (3). (3)

nil         Emotions and values, Miscellaneous function words
0.1–10%     Kinship, The body, Spatial relations, Cognition
10.1–20%    The physical world, Motion, Possession, Sense perception, Speech and language, Warfare and hunting
20.1–30%    Agriculture and vegetation, Time, Social and political relations, Religion and belief
30.1–40%    Animals, Basic actions and technology
40.1–60%    Food and drink, The house, Quantity
60.1–80%    Clothing and grooming, Law
80.1–100%   Modern world
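The grouping in (3) can be reproduced mechanically from the “Total loanwords” percentages of Table 2. The sketch below is illustrative only: the dictionary and function names are hypothetical, the band boundaries are taken from (3), and each upper bound is treated as inclusive.

```python
# "Total loanwords" percentages per semantic field, as given in Table 2.
TOTAL_LOAN_PCT = {
    "The physical world": 13.0, "Kinship": 2.3, "Animals": 35.1,
    "The body": 2.5, "Food and drink": 40.7, "Clothing and grooming": 62.4,
    "The house": 48.6, "Agriculture and vegetation": 27.2,
    "Basic actions and technology": 39.7, "Motion": 13.8, "Possession": 19.9,
    "Spatial relations": 5.2, "Quantity": 41.4, "Time": 23.7,
    "Sense perception": 13.4, "Emotions and values": 0.0, "Cognition": 8.0,
    "Speech and language": 11.2, "Social and political relations": 27.9,
    "Warfare and hunting": 13.3, "Law": 78.5, "Religion and belief": 30.0,
    "Modern world": 98.2, "Miscellaneous function words": 0.0,
}

# (inclusive upper bound, label) for each band used in (3).
BANDS = [
    (0.0, "nil"), (10.0, "0.1-10%"), (20.0, "10.1-20%"), (30.0, "20.1-30%"),
    (40.0, "30.1-40%"), (60.0, "40.1-60%"), (80.0, "60.1-80%"),
    (100.0, "80.1-100%"),
]


def band(pct):
    """Return the label of the first band whose upper bound covers pct."""
    for upper, label in BANDS:
        if pct <= upper:
            return label
    raise ValueError(f"percentage out of range: {pct}")


def group_by_band(table):
    """Group semantic fields by band, keeping the table's field order."""
    groups = {label: [] for _, label in BANDS}
    for field, pct in table.items():
        groups[band(pct)].append(field)
    return groups


groups = group_by_band(TOTAL_LOAN_PCT)
for label, fields in groups.items():
    print(f"{label:10s} {', '.join(fields)}")
```

Fields within a band are listed in Table 2 order, so the output matches (3) only up to the ordering of fields inside each band.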

Table 2: Loanwords in Takia by donor language and semantic field (percentages)

      Semantic field                   Tok Pisin  Total loanwords  Nonloanwords
 1    The physical world                   4.5         13.0            87.0
 2    Kinship                              2.3          2.3            97.7
 3    Animals                             14.8         35.1            64.9
 4    The body                             1.8          2.5            97.5
 5    Food and drink                      35.7         40.7            59.3
 6    Clothing and grooming               56.2         62.4            37.6
 7    The house                           40.3         48.6            51.4
 8    Agriculture and vegetation          22.0         27.2            72.8
 9    Basic actions and technology        29.8         39.7            60.3
10    Motion                              10.7         13.8            86.2
11    Possession                          19.9         19.9            80.1
12    Spatial relations                    5.2          5.2            94.8
13    Quantity                            30.3         41.4            58.6
14    Time                                21.3         23.7            76.3
15    Sense perception                     8.1         13.4            86.6
16    Emotions and values                   -           0.0           100.0
17    Cognition                            5.3          8.0            92.0
18    Speech and language                 11.2         11.2            88.8
19    Social and political relations      27.9         27.9            72.1
20    Warfare and hunting                 13.3         13.3            86.7
21    Law                                 72.9         78.5            21.5
22    Religion and belief                  5.0         30.0            70.0
23    Modern world                        98.2         98.2             1.8
24    Miscellaneous function words          -           0.0           100.0
      all words                           21.0         25.9            74.1

[The remaining donor columns survive only as unaligned value sequences, read top to bottom with the final value being the “all words” figure. Bargam: 6.5, 8.1, 0.7, 1.7, 5.6, 3.1, 5.4, 5.0, 1.7. English: 4.0, 3.0, 2.1, 11.0, 5.6, 15.0, 1.2. Gedaged: 1.3, 2.5, 2.8, 1.5, 3.1, 2.4, 2.7, 0.7. Waskia: 2.0, 4.0, 0.2, 3.1, 0.6. German: 3.7, 2.1, 5.0, 0.3. Unidentified: 2.7, 1.0, 2.3, 0.4.]

There are no surprises here. Takia has borrowed no function words except perhaps o ‘or’ (counted under Cognition in the LWT Meaning list). Linguists conventionally label as “basic vocabulary” semantic fields containing concepts that are fairly constant across human societies and are less likely to change when speakers’ culture changes. These include Emotions and values, Kinship, The body, Spatial relations and Cognition, which show 0–10% loanwords in Takia. At the opposite extreme, Food and drink, The house, Quantity, Clothing and grooming, Law and the Modern world manifest the most loanwords, ranging from food and drink with about 40% to the Modern world with almost 100%. These are all fields which have been affected by modernization. Food and drink is, understandably, at the lower margin of this range because it includes a number of traditional foods as well as a number of introduced items.


Table 2 shows that a substantial majority of the loanwords in most categories are from Tok Pisin. This is unsurprising, since 80% of all loanwords in the list are from Tok Pisin. More interesting are those categories where this is not true. In just one field, The physical world, Tok Pisin loans are outnumbered by loans from another source, namely Bargam:

(4)   Takia          Gloss             Bargam
      you            ‘water’           yu (also Waskia yu)
      sblbal         ‘foam, froth’     sabalbalim
      (you) brruk    ‘tide’            bururuk ‘river, rapids’
      knawrig        ‘earthquake’      kanawrigrig
      keit           ‘sky’             kait
      kalam          ‘moon’            kalam (possibly borrowed by Bargam from Takia)

By no stretch of the imagination were these words borrowed because they encoded concepts previously unknown to the Takia. They must be attributed to a period when many or all speakers were bilingual in Bargam. Bargam is also the source of a number of terms from the animal world. In this case it is difficult to know whether these items were new to Takia speakers or were simply borrowed under bilingual pressure.

(5)   Takia     Gloss                      Bargam
      krek      ‘fowl, chicken’            kurek
      yok       ‘bat’                      yok
      mluk      ‘dove’                     muluk
      bramat    ‘scorpion, pig’s tusk’     kanawrigrig ‘pig’s tusk’
      ksalom    ‘spider’                   kasilomlom
      mrkis     ‘sandfly’                  kurmis

Bargam animal kingdom loans are outnumbered by those from Tok Pisin, the latter representing foreign animals introduced to Takia speakers as a result of globalization (§3.7). Other Bargam loans again reflect bilingualism: tor ‘soup’ (Bargam tawor), ttawit ‘kind of fan’ (Bargam tatwit), kram ‘kind of larger basket’ (Bargam kuram), gub ‘ceiling joist, beam’ (Bargam gub). Two imply shared religious practices: barag ‘men’s cult’ (Bargam barag) and weit ‘demon’ (Bargam wait).

Just seven calques are recorded in the subdatabase. Two of these, =ta=g and =ta=p, enclitic sequences meaning ‘because’ and ‘if’ respectively, go back to the period when early Bel grammar was being remodeled on the basis of one or more Papuan languages (§6). The others are calques on the basis either of Waskia terms or of terms in another Papuan language with compounding patterns similar to Waskia. We find, for example, bor-goun (lit. ‘pig-dog’) ‘domestic animals’, imitating Waskia buruk-kasik; tamol-pein (lit. ‘man-woman’) ‘people’, imitating Waskia kadi-imet; malan-malan (lit. ‘his.eye-his.eye’) ‘(do) first of all’, imitating Waskia motam-motam. There are also numerous phrasal calques in Takia, apparently on a Waskia model (Ross 2003, 2007), but few of these appear in the subdatabase. One that does is bani-n ate-n (lit. ‘his hand’s liver’) ‘palm of the hand’, modeled on Waskia a-gitiŋ gomaŋ (my-hand its.liver) ‘palm of the hand’.

5. Integration of loanwords

Most modern words have been borrowed from Tok Pisin, the consonant inventory of which varies according to the speaker’s vernacular. Takia speakers use the version of the Tok Pisin consonant inventory which makes the maximum number of distinctions, and this corresponds more or less with the Takia inventory. Modern loans differ radically from pre-modern words in phonotactics, however. The Takia primary stress placement rule is given in (6):

(6)   Primary stress occurs
      (a) on the last or only a in the word (e.g. madár ‘bandicoot’, nánu-n ‘his/her child’, gfgáf ‘dust’, ŋá-sol ‘I fled’);
      (b) or, if there is no a, on the last or only e or o (e.g. krŋé-n ‘his/her finger/toe’, u-sól ‘you (SG) fled’, péin ‘woman’);
      (c) or, if there is no e or o, on the final syllable (e.g. i-fn-í ‘s/he hit him/her’, tbún ‘big’).
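The rule in (6) is a simple ordered cascade, which the following minimal Python sketch makes explicit. It works on the written forms used here, in which inserted vowels are not spelled, and it assumes that the “final syllable” of clause (c) can be identified with the last written i or u. The function names stress_index and mark_stress are illustrative only.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent, used to mark primary stress


def stress_index(word):
    """Return the position of the stressed vowel in an unstressed written form.

    Clauses (a)-(c) of rule (6) are tried in order: the last a, else the
    last e or o, else the last i or u (assumed to be the final syllable).
    Returns -1 if the form contains no written vowel.
    """
    for vowels in ("a", "eo", "iu"):
        idx = max(word.rfind(v) for v in vowels)
        if idx != -1:
            return idx
    return -1


def mark_stress(word):
    """Return the form with an acute accent on the vowel chosen by rule (6)."""
    idx = stress_index(word)
    if idx == -1:
        return word
    marked = word[:idx + 1] + ACUTE + word[idx + 1:]
    return unicodedata.normalize("NFC", marked)


for form in ("madar", "nanu-n", "u-sol", "pein", "i-fn-i", "tbun"):
    print(form, "->", mark_stress(form))
```

Applied to the Tok Pisin loans discussed in this section, the sketch yields plástik and eléksin, which match the attested stress, but it places stress on the final a of kalenda and skrudraiva, where the attested forms are kálenda and skrúdraiva.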

Apart from certain inflectional prefixes, vowels other than -a- which occur earlier than the stressed vowel are inserted by rule: either -i- or -u- (for more detail, see Ross 2002). Inserted vowels are not written in the orthography used here, resulting in the written initial consonant sequences in some of the examples above. Takia phonotactics are unusual and not part of its Oceanic inheritance. They are perhaps the outcome of earlier bilingualism in Bargam, although the present-day Bargam system differs from Takia in significant respects (Hepner & Hepner 1989).

The Takia stress placement rule does not apply to modern loans. Most Tok Pisin words are of English origin, and retain English stress placement: e.g. plástik ‘plastic’, eléksin ‘election’, kálenda ‘calendar’, skrúdraiva ‘screwdriver’. Of these, the first two happen to conform to Takia norms in their stress placement, but the latter two don’t. When pre-stress vowels occur, they are mostly lexically specified. However, many Tok Pisin idiolects allow optional vowel insertion in consonant sequences, and I assume that forms like [piˈlastik] and [sukuˈrudraiva] also occur in the speech of some Takia speakers. I have not investigated the extent to which vowel insertion operates in Tok Pisin borrowings in fluent speech, as my recorded texts contain little modern lexicon.

I have recorded the modern forms in the spelling in which they were given to me, on the assumption that this provides a reasonably good reflection of current Takia usage. I have not removed vowels, even on the rare occasions that I suspect vowel insertion has occurred.


There is quite a sharp division between items which behave according to the canons of Takia phonology and those which do not. Bargam and Waskia borrowings are phonologically fully integrated into Takia. German, Tok Pisin and English loans are not. They are adapted to the Takia phoneme system, but not to its systems of stress and syllabification. There is no sign of this adaptation occurring, and I doubt that it ever will occur, as most of these items are in regular use in Tok Pisin, creating a pressure against adaptation. Indeed, it seems more likely that (if Takia survives) its phonological system will perhaps adapt to accommodate the mass of new loans in somewhat the way that the Tagalog vowel system has changed to accommodate Spanish and English borrowings.

Of the 291 sourced loans in the wordlist, 34 are from Bargam, Waskia or Gedaged, 30 of which are Takia nouns. Nouns require no morphological integration into Takia unless they are treated as inalienably possessed, and just 4 of the 34, Bargam loans replacing earlier terms (I assume), are inalienably possessed. The corresponding Bargam term in each case ends in -n, and this has been reanalyzed as Takia -n ‘3SG possessor’, replaced by the other Takia possessor suffixes as appropriate (inalienably possessed nouns in Bargam take a readily separable possessor prefix):

(7)   Takia          Gloss             Bargam
      babu, bbe-n    ‘heart’           -babuwan
      sbari-n        ‘wing’            -siwalin
      krŋe-n         ‘finger, toe’     -kurgan
      tbla-n         ‘fish scale’      -tiblalan

Grammatical functors have in general not been borrowed into Takia (Ross 1996a). The one possible exception is o ‘or’, perhaps borrowed from Tok Pisin.

In sum, the overwhelming majority of borrowings are nouns, and most of the remainder are also items that require no morphological integration. Three of the nine items listed as ontological verbs are phrasal: dina -pani ‘owe’ (‘give debt’), skel -gane ‘weigh’ (‘do measure’), maŋaw -loŋ ‘learn’ (‘knowledge hear’). Several are ambiguous with regard to their word class, and I have not been able to check them in the field, but I don’t believe that I have ever encountered a single-morpheme morphologically integrated loan verb, other than -koy ‘row (a boat)’, borrowed from Gedaged, which has prefixal verbal morphology almost identical to Takia’s. One of the items listed, for example, is promis ‘promise’. I assume that if this is used verbally, it is in the phrase promis -pani; Waters (n.d.) lists a traditional idiom, ŋie-bal ‘speak (one’s) leg’, ŋie- -pan ‘give (one’s) leg’, for ‘promise’.

The strategy for creating new verbs by borrowing, then, resembles Wichmann & Wohlgemuth’s (2008) “light verb” strategy. It differs from it, however, in that the borrowed item is a noun, not a verb, in the instances listed here. Furthermore, this is not a strategy which is in any sense reserved for borrowings. Such phrases are an integral part of the lexical strategy of Takia. A total of 149 items in the wordlist are marked “Analyzable compound” or “Analyzable phrasal”, and 87 of these are verbal items. Some are compound verbs (nuclear serializations in the terminology of Crowley 2002), but 54 are phrasal. Some are serial verb constructions, and others are noun + verb sequences like pen -pani ‘paint’ (‘give paint’), bbe- -pani ‘love’ (‘give one’s heart’), yaes -bal ‘breathe’ (‘throw breath’), ilo- -sou ‘remember’ (‘shoot one’s inside’) and kankan -gane ‘guess’ (‘do thought’). It seems, then, that Takia borrowing of ontological verbs simply exploits the existing lexical structure of the language.

This is not the place for a disquisition on Takia lexical structure, but it is clear that the lexical structures of languages vary considerably, and that this is an area to which little typological attention has been given (but see Pawley 2006). It is also clear that lexical typology overlaps significantly with loanword typology.

6. Grammatical borrowing

Proto Oceanic probably had verb-initial clause order, but Early Western Oceanic and all its daughters except the Bel languages were SVO, perhaps as the result of contact with Papuan (non-Austronesian) languages in the Bismarck Archipelago. The Bel languages are SOV, and reflect a massive contact-induced shift towards the syntactic patterns of Papuan languages of the Trans New Guinea family (Ross 1987, 1996a, 2001, 2003, 2007): verb-final, postpositional, clause-chaining. Proto Bel was already SOV and had postpositions. The Western Bel languages – Takia, Megiar, Bilibil and Gedaged – or their immediate ancestor imitated clause-chaining constructions, i.e. coordinate-dependent (“cosubordinate”) clause constructions, from neighboring Trans New Guinea languages. The constructions are described by Ross (1994), the imitative process by Ross (1987), and the internal history of the Bel family and the development of clause-chaining constructions by Ross (2008).

7. Conclusion

“Conclusion” is perhaps a misnomer as section title, since there is much in this study of Takia loans that is inconclusive. There has been a huge input of Tok Pisin words into Takia, all of them since 1930 and most since the Second World War. This is to be expected: Tok Pisin has been the expanding lingua franca that has brought and continues to bring globalization.

There is a large number of words, however, for which I have no etymological information. The wordlist contains over 1000 unanalyzable roots, of which about 300 have been identified as loans. Of the remaining 700 or so, another 300 reflect forms that can be reconstructed for various Austronesian interstages including Proto Oceanic but predating Proto Bel, i.e. they have an Austronesian pedigree that stretches back more than a millennium. This leaves about 400 roots with no etymology. I suspect that a good many of these are unidentified loans from one or more Papuan languages, including Bargam and Waskia.

This study has established that there was probably a time when many or all Takia speakers were bilingual in Bargam. This is of interest for the student of the history of the Bel languages, but its relevance to loanword typology is limited to the inference that replacive loans are diagnostic of intense bilingualism.
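The etymological bookkeeping in this section can be restated as a short calculation. The sketch below is illustrative only: it treats the rounded figures in the text (“over 1000”, “about 300”) as exact, and the function name etymological_breakdown is hypothetical.

```python
def etymological_breakdown(total_roots, identified_loans, inherited):
    """Split the unanalyzable roots into the three groups named in the text."""
    no_etymology = total_roots - identified_loans - inherited
    counts = {
        "identified loans": identified_loans,
        "inherited Austronesian": inherited,
        "no etymology": no_etymology,
    }
    # Share of each group in the wordlist, as a percentage.
    shares = {group: 100.0 * n / total_roots for group, n in counts.items()}
    return counts, shares


# Rounded figures from the text, treated as exact for illustration.
counts, shares = etymological_breakdown(1000, 300, 300)
print(counts)
print(shares)
```

On these rounded figures, the roots without an etymology form the largest of the three groups.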

References

Barker, Fay & Lee, Janet. 1985. Waskia diksenari [Waskia dictionary]. Ukarumpa: Summer Institute of Linguistics.
Clark, Ross. 1979. In search of Beach-la-mar: Towards a history of Pacific Pidgin English. Te Reo 22:3–64.
Crowley, Terry. 2002. Serial verbs in Oceanic: A descriptive typology. Oxford: Oxford University Press.
Dempwolff, Otto. n.d. Grammar of the Graged language. Unpublished mimeograph. Lutheran Mission, Narer, Karkar Island.
Freyberg, Paul G. 1977. Missionary lingue franche: Bel (Gedaged). In Wurm, S. A. (ed.), New Guinea area languages and language study, Vol. 3, 855–864. Canberra: Pacific Linguistics.
Goulden, Rick J. 1990. The Melanesian content in Tok Pisin. Canberra: Pacific Linguistics.
Harding, Thomas G. 1967. Voyagers of the Vitiaz Straits: A study of a New Guinea trade system. Seattle: University of Washington Press.
Hays, Terence E. 2005. Vernacular names for tubers in Irian Jaya: Implications for agricultural prehistory. In Pawley, Andrew & Attenborough, Robert & Golson, Jack & Hide, Robin (eds.), Papuan pasts: Cultural, linguistic and biological histories of Papuan speaking peoples (Pacific Linguistics), 625–670. Canberra: Pacific Linguistics.
Hepner, Mark. 2007. Bargam dictionary. Unpublished manuscript. Summer Institute of Linguistics, Papua New Guinea Branch.
Hepner, Mark & Hepner, Carol. 1989. Bargam phonology essentials. Unpublished manuscript. Summer Institute of Linguistics, Ukarumpa.
Keesing, Roger. 1988. Melanesian Pidgin and the Oceanic substrate. Stanford, CA: Stanford University Press.
Mager, John F. 1952. Gedaged-English dictionary. Columbus, Ohio: American Lutheran Church, Board of Foreign Missions.
McSwain, Romola. 1977. The past and future people. Melbourne: Oxford University Press.
Mennis, Mary R. 2006. A potted history of Madang: Traditional culture and change on the north coast of Papua New Guinea. Aspley, Queensland: Lalong Enterprises.
Mühlhäusler, Peter. 1979. Growth and structure of the lexicon of New Guinea Pidgin. Canberra: Pacific Linguistics.
Pawley, Andrew. 2006. Where have all the verbs gone? Remarks on the organisation of languages with small, closed verb classes. Research School of Pacific and Asian Studies. Paper presented to the 11th Biennial Rice University Linguistics Symposium, 16th–18th March.
Pawley, Andrew & Ross, Malcolm (eds.). 1994. Austronesian terminologies: Continuity and change. Canberra: Pacific Linguistics.
Ross, Malcolm. 1987. A contact-induced morphosyntactic change in the Bel languages of Papua New Guinea. In Laycock, Donald C. & Winter, Werner (eds.), A world of language: Papers presented to Professor S. A. Wurm on his 65th birthday, 583–601. Canberra: Pacific Linguistics.
Ross, Malcolm. 1988. Proto Oceanic and the Austronesian languages of western Melanesia. Canberra: Pacific Linguistics.
Ross, Malcolm. 1992. The sources of Austronesian lexical items in Tok Pisin. In Dutton, Tom & Ross, Malcolm & Tryon, Darrell (eds.), The language game: Papers in memory of D. C. Laycock, 361–384. Canberra: Pacific Linguistics.
Ross, Malcolm. 1994. Describing inter-clausal relations in Takia. In Reesink, Ger P. (ed.), Topics in descriptive Austronesian linguistics, 40–85. Vakgroep Talen en Culturen van Zuidoost-Azië en Oceanië. Leiden: Rijksuniversiteit te Leiden.
Ross, Malcolm. 1996a. Contact-induced change and the comparative method: Cases from Papua New Guinea. In Durie, Mark & Ross, Malcolm (eds.), The comparative method reviewed: Regularity and irregularity in language change, 180–217. New York: Oxford University Press.
Ross, Malcolm. 1996b. Mission and church languages in Papua New Guinea. In Wurm, S. A. & Mühlhäusler, Peter & Tryon, Darrell (eds.), Atlas of languages of intercultural communication in the Pacific, Asia, and the Americas, Vol. 2.1, Map 60 & 595–617. Berlin: Mouton de Gruyter.
Ross, Malcolm. 2001. Contact-induced change in Oceanic languages in north-west Melanesia. In Dixon, R. M. W. & Aikhenvald, Alexandra Y. (eds.), Areal diffusion and genetic inheritance: Problems in comparative linguistics, 134–166. Oxford: Oxford University Press.
Ross, Malcolm. 2002. Takia. In Lynch, John & Ross, Malcolm & Crowley, Terry (eds.), The Oceanic languages, 216–248. Richmond: Curzon Press.
Ross, Malcolm. 2003. Diagnosing prehistoric language contact. In Hickey, Raymond (ed.), Motives for language change, 174–198. Cambridge: Cambridge University Press.
Ross, Malcolm. 2007. Calquing and metatypy. Journal of Language Contact Thema 1:116–143.
Ross, Malcolm. 2008. A history of metatypy in the Bel languages. Journal of Language Contact Thema 2.
Ross, Malcolm with Paol, John Natu. 1978. A Waskia grammar sketch and vocabulary. Canberra: Australian National University.
Ross, Malcolm & Pawley, Andrew & Osmond, Meredith (eds.). 1998. The lexicon of Proto Oceanic: The culture and environment of ancestral Oceanic society. Vol. 1: Material culture (Pacific Linguistics C-152). Canberra: Pacific Linguistics.
Ross, Malcolm & Pawley, Andrew & Osmond, Meredith (eds.). 2003. The lexicon of Proto Oceanic: The culture and environment of ancestral Oceanic society. Vol. 2: The physical world (Pacific Linguistics 545). Canberra: Pacific Linguistics.
Todd, Loreto. 1974. Pidgins and creoles. London: Routledge & Kegan Paul.
Tryon, Darrell (ed.). 1995. Comparative Austronesian Dictionary. Berlin: Mouton de Gruyter.
Tryon, Darrell & Charpentier, Jean-Michel. 2004. Pacific Pidgins and Creoles: Origins, growth and development. Berlin: Mouton de Gruyter.
Wagner, Herwig & Hermann, Reiner. 1986. The Lutheran Church in Papua New Guinea: The first hundred years 1886–1986. Adelaide: Lutheran Publishing House.
Waters, Bruce. n.d. Takia dictionary. Ukarumpa: Summer Institute of Linguistics. Unpublished Shoebox file.
Wichmann, Søren & Wohlgemuth, Jan. 2008. Loan verbs in a typological perspective. In Stolz, Thomas & Bakker, Dik & Palomo, Rosa (eds.), Aspects of language contact, 89–121. Berlin: Mouton de Gruyter.
Z’graggen, John A. 1980. A comparative word list of the Northern Adelbert Range languages, Madang Province, Papua New Guinea. Canberra: Pacific Linguistics.


Loanword Appendix

Bargam

dis

dish

plet

plate

boul

bowl

jag

jug/pitcher

kap

cup

sosa

saucer

spun

spoon

fok

fork

toŋs

tongs

brakpas

breakfast

lans

lunch

sock, stocking

bret

bread

spade

flawa

flour

hell

sosis

sausage

Gedaged

sblbal

foam

bllek

sheep

you brruk

tide

yamel

cloth

knaorig

earthquake

galuŋ

room

keit

sky

bala

paint

krek

fowl

koy

oar

yok (1)

bat

-koy

to row

tbila-

scale

ubou

week

bramat

scorpion

maŋaw -loŋ

to learn

mrkis

sandfly, midge, gnat

German

ksalom

spider

soken

babu, bbe-

heart

tor

soup

barag ab

men’s house

gub

beam

kram

basket

ttawit

fan

sakar

hard

gos

dry

weit

demon

Waskia

spaten ades

sup

soup

German or Tok Pisin

bin

bean

su (2)

patete

potato

sol

salt

shoe

Tok Pisin

hani

honey

sno

snow

suga

sugar

eis

ice

bata

butter

masis

match

wein

wine

famili

family

bia

beer

bug

dew

makao

cattle

tret

thread

goub

mouse, rat

bigbel

ox

dres

(woman’s) dress

porpoise, dolphin

wos

horse

cot

coat

patou

goose

siot

shirt

madar

bandicoot

rebit

rabbit

kola

collar

grgroy

frog

weldok

wolf

sket

skirt

sareg

basket

moŋi

monkey

traosis

trousers

ktaok

netbag

elefan

elephant

subut

boot

hanibi

bee

hat

hat, cap

keŋgaru

kangaroo

glab

glove

bafolo

buffalo

poket

pocket

Unidentified New Guinea mainland Papuan languages

matmat

grave

baten

button

doktai

physician

pin

pin

mao

taro

marasin

medicine

riŋ

ring

kam

vine

stop

oven

yariŋ

earring

peik

lizard

praipan

pan

hedben

headband, -dress

lous

Bargam or Waskia

you

water

mak

tattoo

silwa

silver

mandei

Monday

haŋgisip

handkerchief, rag

baras (2)

copper

tudei

Tuesday

aen

iron

tridei

Wednesday

taol

towel

bol

lead

fodei

Thursday

bros

brush

kapa

tin, tinplate

fraidei

Friday

resa

razor

basket

basket

sarrei

Saturday

sop

soap

mat

mat

kala

colour/color

galas

mirror

kapet

rug

raf

rough (1)

lok

lock

sisel

chisel

smut

smooth

pedbol

latch, door-bolt

pen (2)

paint

sumatin

pupil

ki

key

bris

bridge

skul

school

windua

window

wil

wheel

promis

to promise

flou

floor

sap

axle

pepa

paper

bet

bed

yok (2)

yoke

pen (1)

pen

pelou

pillow

sip

ship

buk

book

plaŋgit

blanket

bot

boat

kantri

country

sia

chair

aŋgai

anchor

taon

town

tebol

table

moni

money

kiŋ

king

lam

lamp, torch

takis

tax

kwin

queen

kendel

candle

haia

to hire

masta

master

selp

shelf

saina

merchant

sleb

slave

brik

brick

maket

market

wokboy

servant

fama

farmer

stua

shop/store

fri

freeman

sabol

shovel

skwe

square

ami

army

baira

hoe

bal

ball

soldia

soldier

rek

rake

laen

line

gan

gun

wit

wheat

not

zero

helmet

helmet

kon

maize/corn

faif

five

kot

court

reis

rice

siks

six

jas

the judge

garas

grass

sabaen

seven

compleina

plaintiff

paep

pipe

eit

eight

difenda

defendant

muli

citrus fruit

naen

nine

holim Baibel

oath

sen

chain

ten

ten

saspek

to accuse

naip

knife (2)

wanandet

a hundred

dismis

to acquit

sisis

scissors, shears

wantousen

a thousand

gilti

guilty

tuls

tool

seken

second

inosen

innocent

kapenta

carpenter

ted

third

kotfain

fine

sou

saw

grismas

age

klabus

prison

ama

hammer

awa

hour

meda

murder

anwil

anvil

klok

clock

rep

rape

gol

gold


katim-skin

circumcision

kriminol

crime

filim

film/movie

redio

radio

eleksin

election

musik

music

tiwi

television

adres

address

soŋ

song

telefon

telephone

namba

number

ti

tea

wilwil

bicycle

strit

street

kofi

coffee

motobaik

motorcycle

pos

post/mail

kar

car

stem

postage stamp

bas

bus

pas

letter

doŋki

donkey

English

tren

train

piksakad

postcard

laion

lion

balus

airplane

beng

bank (financial institution)

kamel

camel

fig

fig

tap/faucet

oliv

olive

pawa

electricity

batri

battery

kok

brek

to brake

besin

sink

baliwit

barley

moto

motor

toilet

toilet

elabaen

eleven

masin

machine

matres

mattress

twelf

twelve

wel

petroleum

tin

tin/can

fiftin

fifteen

aosik

hospital

skru

screw

twenti

twenty

sista

nurse

skrudraiva

screwdriver

adaltri

adultery

sut

injection

botol

bottle

tempel

temple

aiglas

spectacles/glasses

loli

candy/sweets

altar

altar

gavman

government

plastik

plastic

fastiŋ

to fast

presiden

president

bom

bomb

minista

minister

woksap

workshop

polis

police

sigaret

cigarette

mis

cat

draiva laisens

driver’s license

niuspepa

newspaper

plagis, plaŋis

axe/ax

nambaplet

license plate

kalenda

calendar

Unknown origin

Chapter 30

Loanwords in Hawaiian*

ʻŌiwi Parker Jones

1. The language and its speakers

The Hawaiian language (i.e., ka ʻōlelo Hawaiʻi) is indigenous to the islands of Hawaiʻi. Until Western contact in 1778, Hawaiian was the only language spoken throughout the archipelago. Hawaiian is an Austronesian language that belongs to the Eastern Polynesian language family and is closely related to Māori, Marquesan, and Tahitian (see Figure 1).

Proto-Polynesian
    Proto-Tongic: Tongan, …
    Proto-Nuclear Polynesian
        Proto-Samoic Outlier: Samoan, …
        Proto-Eastern Polynesian
            Proto-Easter Island: Rapa Nui
            Proto-Central Polynesian: Hawaiian, Māori, Marquesan, Tahitian, …

Figure 1: The Polynesian family tree (adapted from Pawley 1966, Clark 1979: 258, and Schütz 1994: 335)

The archeological evidence suggests that Polynesians may have first settled Hawaiʻi as early as 200 CE (see, e.g., Kirch 1998: 161). Yet to arrive in Hawaiʻi from the Marquesas or Tahiti, the early settlers had to cross over 2,000 miles of open ocean without the benefit of modern navigational instruments, such as a compass or a clock. According to Hawaiian oral history, this remarkable achievement was in fact repeated many times. Moreover, the modern revival of stellar navigation provides strong evidence that such long-distance commutes were practicable (Piʻināiʻa 1998).

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Parker Jones, ʻŌiwi. 2009. Hawaiian vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1306 entries.

Although there is no accurate census or survey, Kapono (1998: 199) estimates that there were about 5,000 Hawaiian speakers in 1995. Of these, 1,000 were mānaleo (i.e., “heritage speakers” or “native-speaking elders”), 400 of whom had connections to the island of Niʻihau (see §3 below). We may identify another 1,000 of the estimate to be young native Hawaiian speakers, who acquired the language naturally, during the critical period, and without formal instruction (on the critical period, see Penfield & Roberts (1959), Lenneberg (1967), and Pinker (1994)). The remaining 3,000 speakers in Kapono’s estimate were second language speakers, who learned the language (often fluently) through formal instruction. It should be noted, however, that the population of Hawaiʻi is approximately 1.14 million. So, less than 1% of the population actually speaks Hawaiian. Furthermore, over 100 languages were reportedly spoken by residents of Hawaiʻi in the 1990 census (cited in Schütz 1998: 199). Other than English and Hawaiian, these languages include Japanese, Tagalog, Ilocano, and Hawaiʻi Creole English (Schütz 1998: 198–200). As Romaine (2006: 227) observes, Hawaiʻi Creole English “is the first language of the majority of locally born children and the first language of somewhat less than half the state of Hawaiʻi’s population of just over a million.”

The Hawaiian language is only just recovering from near extinction, after two hundred years of foreign contact, much of which has been oppressively colonialist. Many of the reasons for Hawaiian’s decline can be traced back to events in the 19th century. For example, the sovereign Kingdom of Hawaiʻi existed in constant threat, because of foreign interest in Hawaiʻi’s abundant natural resources, deep harbors, and strategic geopolitical position.
This situation resulted in the mass dispossession of Hawaiian people, through the privatization of land known as the Māhele (‘dividing’). Sugar plantations then took root, importing indentured laborers from Japan, China, Portugal, and the Philippines, and attracting heavy foreign investment, especially from the United States. In 1887, US-allied businessmen coerced the monarch (King Kalākaua) to sign away his authority, while allowing him to remain as a figurehead. The resulting legislation became known as the ‘Bayonet Constitution’. When Kalākaua’s successor (Queen Liliʻuokalani) reasserted the monarchy’s popular authority, a small well-armed militia (supported by U.S. marines) forcibly dethroned her by coup d’état (Langlas 1998: 177).

As a corollary, Hawaiian went from being a language of prestige in the Hawaiian Kingdom to being something much less in the resulting Republic (1893), U.S. Territory (1898), and U.S. State (1959). To illustrate this, in the Kingdom of Hawaiʻi, “more than 90 percent of the Hawaiian population could speak, read, and write in their native tongue” (Kapono 1998: 199). After the overthrow, however, education in Hawaiian was banned, and, in some schools, children caught speaking Hawaiian were punished.

But on the heels of the Civil Rights Movement in America, Hawaiian language and culture experienced what is known locally as the ‘Hawaiian Renaissance’. During this renaissance, a grass-roots language revitalization movement emerged based on the immersion school model of New Zealand’s Kōhanga Reo (‘language nest’) (for details, see Wilson & Kamanā 2001). These Hawaiian immersion schools, including Pūnana Leo (also ‘language nests’) and Kula Kaiapuni (‘immersion schools’), have become the locus for the revival and perpetuation of Hawaiian. Another important domain of Hawaiian use is the University of Hawaiʻi, where a Hawaiian language B.A. has been available since the 1970s, and where an M.A. and Ph.D. have been available since 2002 and 2006, respectively. Hawaiian is used in the homes of some of the immersion school families, and, as an extension of these educational domains, Hawaiian is spoken in workplaces run by the ʻAha Pūnana Leo (‘language nest gathering’), which, for example, develops materials for the schools.

As Hawaiian remains endangered, much work remains to be done. Yet the revitalization movement in Hawaiʻi already has a lot to be proud of. “Of all languages indigenous to what is now the United States, Hawaiian represents the flagship of language recovery, and serves as a model and a symbol of hope to other endangered languages” (Hinton 2001: 131).

2. Sources of data

In general, the data in the Hawaiian loanword database were drawn from two Hawaiian dictionaries: the Hawaiian Dictionary (Pukui & Elbert 1986) and Māmaka Kaiao (Kōmike Huaʻōlelo 2003). These are not the only Hawaiian dictionaries (see, e.g., Andrews 1865), but they are representative of the modern standard. The two dictionaries stand in complementary distribution to one another. When the Kōmike Huaʻōlelo repeats words from Pukui & Elbert’s dictionary, it is to update them (e.g., to extend the meaning of an existing word). As both dictionaries are available electronically, it was possible to mine them computationally for information.

Map 1: Geographical setting of Hawaiian


Where other works have been consulted in the database, they are explicitly referenced. (A full list of the works consulted in the database is available in the Documentation File.) Other information in the database was supplied by the author, who grew up speaking Hawaiian as a part of the Pūnana Leo revitalization movement. The author was born and raised in the town of Hilo, on the island of Hawaiʻi. The variety of Hawaiian described here is standard, as spoken across the Hawaiian Islands (except on Niʻihau and on parts of Kauaʻi).

3. Contact situations

The contact situations are divided into nine categories, which are discussed in the subsections below (cf. Reinecke 1969). These categories are intended to help situate each loanword within the context it is believed to have been loaned in. Many of the categories’ names should have obvious meanings. For example, the category English to Hawaiian contains English words that were loaned into Hawaiian. However, a few of the category names are less transparent. For example, the category Niʻihau also contains Hawaiian words loaned from English. What distinguishes this contact situation from English to Hawaiian is that English words loaned into the dialect of Niʻihau retain elements of that dialect’s phonology. But the mapping from languages to categories is not just one-to-many. Two categories, Lexicon Committee and Missionary Bible translation, include words that were loaned into Hawaiian from English as well as a variety of other languages (like Czech and Classical Greek). So the mapping from languages to categories is many-to-many. In the subsections that follow, each of the contact situations will be sketched.

3.1. Chinese to Hawaiian

In 1852, plantation owners in Hawaiʻi began recruiting contract laborers from China. Just over 2,000 Chinese immigrants were recorded in Hawaiʻi in 1875, the year in which the Reciprocity Treaty was signed between the Kingdom of Hawaiʻi and the United States of America, allowing free trade between these two countries. New labor was sought to meet the growing US demand for Hawaiian sugar. As a result, a further 37,000 indentured laborers were imported into Hawaiʻi from China. Chinese professionals and merchants arrived later, after Hawaiʻi’s annexation to the USA in 1898. Contract labor was abolished by law in 1900, although foreign laborers continued to arrive on the plantations.

All of the Chinese plantation workers came from the province of Guangdong (formerly Kwangtung) and spoke Hakka and Cantonese. In practice, the Hawaiian Dictionary (Pukui & Elbert 1986) makes no distinction between Hakka and Cantonese loanwords. In a single case, a loanword has been identified as “Informal Cantonese” (the word is Hawaiian Pākē ‘Chinese’ < Informal Cantonese baak3 ‘father’s older brother’). The four other Chinese to Hawaiian loanwords have not been identified more specifically, so “Chinese” is left as their donor language.

3.2. English to Hawaiian

Contact between speakers of English and speakers of Hawaiian dates from 1778, when Captain James Cook sailed into Hawaiian waters. Thereafter, the Islands became a frequent stop between America and the Asian or Australian coasts. New England missionaries arrived in the 1820s, massively increasing the number of English loanwords in Hawaiian. Hawaiʻi’s rich agricultural resources also enticed English-speaking entrepreneurs (typically the sons and grandsons of missionaries), who set up Hawaiian sugar plantations in the 19th century. These sugar barons increasingly influenced local politics in the Kingdom of Hawaiʻi until a US conspiracy overthrew the monarchy in 1893. The sugar barons, who directly benefited from this coup, then set up a “Republic” (technically an oligarchy), which lasted until the US annexed the Islands in 1898. One enticement for American annexation was Hawaiʻi’s strategic position in the Pacific. US military bases increased the English-speaking population in Hawaiʻi, although contact between soldiers and civilians remained limited. The Territory of Hawaiʻi was yoked into the Union in 1959, almost ensuring the English language’s dominance in Hawaiʻi. Hawaiʻi now receives American products, media, and tourists, and exports very little in the other direction. As mentioned in §1, the lingua franca of Hawaiʻi is Hawaiʻi Creole English, which is an English dialect that is known in Hawaiʻi as ‘Pidgin’. Unfortunately, no distinction is made in the Hawaiian database between words that were borrowed from Pidgin and words that were borrowed from another variety of English. The majority of loanwords in the database were borrowed from some variety or other of English.

3.3. Japanese to Hawaiian

Although the Japanese contract laborers arrived on the plantations after the laborers from China, they also left a substantial impression on the local culture. From 1897 to 1907, the majority of laborers who sought work on the sugar plantations were Japanese. Like other plantation workers, many Japanese laborers settled in Hawaiian cities after completing their 3–5 year contracts. Connections between Hawaiʻi and Japan remain strong today, as Japanese visitors constitute a major tourist presence in the Islands. The dictionaries identify four Hawaiian words as Japanese borrowings: mōchī ‘sticky rice cake’ (< mochi); musubī ‘rice ball’ (< musubi); koiū ‘soy sauce’ (< shō-yu); and, according to Pukui & Elbert (1986), ʻekaʻeka ‘Japanese taro’ (< adado). Of these, the last one seems phonologically suspect (why would adado not be integrated into Hawaiian as something like ʻakako?).


ʻŌiwi Parker Jones

3.4. Lexicon Committee

Hawaiian language immersion schools emerged in the 1980s as part of the cultural revival inspired in part by the Civil Rights movement in America. These schools required new pedagogic materials for primary and secondary curricula. However, a lexicographic void was left after the final edition of Pukui & Elbert’s Hawaiian Dictionary in 1986. In response, the Kōmike Huaʻōlelo ‘Lexicon Committee’ assembled in 1987. Although the Kōmike Huaʻōlelo’s membership changes, it typically includes teachers, scholars, and other Hawaiian language experts from across the Islands. In terms of content, the Kōmike Huaʻōlelo’s dictionary stands in complementary distribution with Pukui and Elbert’s dictionary, adding new terminology like Pūnaewele Puni Honua ‘internet’ (a calque on World Wide Web). The Kōmike Huaʻōlelo favors loanwords from Polynesian or other endangered languages over loanwords from English. This is a conscious response to the overwhelming dominance of English in Hawaiʻi, as students are strongly exposed to English outside of school. Since the vocabulary in Kōmike Huaʻōlelo (2003) has been influenced by an educated and formally assembled committee, it is important to distinguish words that were coined by the Kōmike Huaʻōlelo from words borrowed in other contact situations, even if they are ultimately modeled after the same source language (e.g., French). Kōmike Huaʻōlelo (2003) records words from: Assyrian, Czech, English, French, Japanese, Māori, Rarotongan, Tahitian, and Ute. But since it does not always cite source words, there are some guesses in the database which I should like to flag. These are indicated with question marks (indicating uncertainty), as in the Rarotongan source word for maʻaka ‘uppercase’ (< Rarotongan maʻaka?).

3.5. Missionary Bible translation

Missionaries arrived in Hawaiʻi in 1820, bringing with them their New England ideals of education, literacy, and religion. They eventually devised a rudimentary Hawaiian orthography and began producing reading material in the Hawaiian language. Foremost amongst their products was a translation of the Bible. This required developing a number of new Hawaiian words for Biblical characters, animals, theological terms, and so forth. Although in many cases they borrowed words from English into Hawaiian, such as hīmeni ‘song’ (< English hymn), they also borrowed words from the classical languages. Thus the modern Hawaiian vocabulary contains words like nahesa ‘snake’ (< Hebrew naḥaš), ʻaeko ‘eagle’ (< Church Latin aetos), and ʻalopeka ‘fox’ (< Greek alopeks). Why did the missionaries borrow vocabulary from these classical languages? In some cases, the missionaries’ puritan ideals seem to have been involved. For example, meli ‘honey’ may have been borrowed from Greek rather than from English, because, as Schütz (1976: 79) suggests: “Honi ‘kiss’ and hani ‘act flirtatious’ would have given an undesired risqué meaning to such phrases as ‘land overflowing with milk and honey’ (especially with ‘milk’ translated by a phrase that means ‘breast liquid’) or ‘lips of a strange woman drop honey’.” In building the database, it was not always easy to determine when an English loanword should be categorized as Missionary or not, so some missionary loans from English may have been omitted from the database unintentionally. On the other hand, many of the loanwords from the classical languages are identifiable from their spellings (which retain the foreign consonants of their source words). So, such words are reliably classified as Missionary borrowings.

3.6. Niʻihau

The westernmost of the Hawaiian Islands, Niʻihau, has had a separate history from the rest of the Islands. Since Niʻihau is privately owned, the comings and goings of visitors are very tightly controlled. The current Hawaiian population of approximately 200 has been sheltered from the modern world, though it continues to be strongly influenced by the 19th century (via the missionary bible). Only one word is marked in the database as coming specifically through the island of Niʻihau. This word is tuko ‘glue’, which Pukui & Elbert (1986) claim to be a loan from Duco, an automotive lacquer developed by DuPont in the 1920s.

3.7. French loanwords

Seven words have been included in the database whose contact situations remain opaque. These all happen to be loanwords from French, which might be attributed to the historical influence of francophone Catholics in Hawaiʻi.

3.8. Portuguese to Hawaiian

Several hundred Portuguese immigrants entered Hawaiʻi before 1876, while the majority came later, after the effect of the Reciprocity Treaty. About 10,000 Portuguese immigrants came in the first wave, between 1878 and 1887. Between 1906 and 1913, the second wave brought another 5,000 Portuguese immigrants to Hawaiʻi, many of whom did not settle. In addition to their strong presence on the plantations, the Portuguese have remained a visible group in Hawaiʻi. For example, Puerto Rican and Spanish immigrants to Hawaiʻi have often been classified as Portuguese. The database records three loanwords from Portuguese: pakaliao ‘codfish’ (< bacalhau); Pakoa ‘Easter’ (< Páscoa); and pīpīnola ‘type of squash’ (< pepineiro).

3.9. Spanish to Hawaiian

Cattle arrived in Hawaiʻi in the late 18th century, as a gift to King Kamehameha from the British explorer Captain George Vancouver. Mexican vaqueros were subsequently invited to Hawaiʻi (from what is now California) to help on the Hawaiian cattle ranches. These Mexicans introduced paniolo ‘cowboy’ into the Hawaiian vocabulary, which comes from the Spanish word español, meaning ‘Spanish’.

4. Numbers and kinds of loanwords

Hawaiian is not usually described as having adjectives or adverbs; instead, the language is described as having another set of syntactic categories (e.g., stative verbs; see Elbert & Pukui 1979: 43–44, 49–51). Moreover, many Hawaiian bases function as both nouns and verbs (see, e.g., the category of noun-verb in Elbert & Pukui 1979: 43). However, in order to compare the languages in this project, the word classes reported here have been standardized, using a set of semantic categories. This standardized set consists of nouns, verbs, function words, adjectives, and adverbs, as in Table 1 (cf. the “semantic word class” field in the database). Of these standardized parts of speech in the Hawaiian database, nouns were the most commonly borrowed, followed by verbs, function words, and finally adjectives. No borrowed adverbs are recorded in the database. Why did Hawaiian borrow so many nouns? One contributing cause may be the large number of new things that have been introduced to Hawaiʻi since contact, along with their names.

Table 1: Loanwords in Hawaiian by donor language and semantic word class (percentages)

                English  Classical Greek  Māori  Hebrew  Czech  French  Total loanwords  Non-loanwords
Nouns              17.7              0.9    0.3     0.1    0.1     0.1             19.2           80.8
Verbs               4.7                -      -       -      -       -              4.7           95.3
Adjectives          1.8                -      -       -      -       -              1.8           98.2
Adverbs               -                -      -       -      -       -              0.0          100.0
Function words      2.5                -      -       -      -       -              2.5           97.5
All words          12.5              0.6    0.2     0.1    0.1     0.1             13.4           86.6
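The way the last two columns of Table 1 relate to the donor columns can be checked with a little arithmetic. The following is only a sketch: the function name is hypothetical, and the figures are the noun row as printed (English followed by the five minor donor columns). Because the published figures are rounded to one decimal place, other rows, such as the row for all words, need not add up exactly.

```python
# Noun row of Table 1 as printed: English first, then the five minor donor columns.
noun_shares = [17.7, 0.9, 0.3, 0.1, 0.1, 0.1]


def total_and_remainder(donor_shares):
    """Return (total loanword share, non-loanword share) in percent.

    Hypothetical helper: sums the donor columns and treats the
    remainder up to 100% as non-loanwords.
    """
    total = round(sum(donor_shares), 1)
    return total, round(100.0 - total, 1)


print(total_and_remainder(noun_shares))
```

For the noun row this gives 19.2 and 80.8, the values printed in the table.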

Interestingly, some of the borrowed verbs in the database may have entered Hawaiian as nouns. For instance, the verb pūlumi ‘to sweep’ came into Hawaiian from the English noun broom. Thus, one might literally ‘broom’ in Hawaiian, rather than ‘sweep’, as in English. Another possible example is the verb kupa (1) ‘to boil’ from the English noun soup. Both pūlumi and kupa (1) exist in Hawaiian as nouns, too. As in English, the nouns mean ‘broom’ and ‘soup’. As mentioned above, many Hawaiian bases function as both nouns and verbs. For example, a native Hawaiian word like ʻōlelo may be employed as a verb (meaning ‘to speak’) or as a noun (meaning ‘language’). So, once in the language as a noun, a loanword might also be used as a verb. Thus, a number of borrowed verbs in Hawaiian may also have entered the language as nouns.

Of the source languages in Table 1, English is by far the most strongly represented. This accords well with the historical record, as discussed above (in §1 and §3). The influence of English can also be seen when we look at the Hawaiian data from the perspective of the semantic fields in Table 2. Some of the borrowed concepts in these tables were not present in pre-contact Hawaiʻi. For instance, one thinks of the missionaries’ 19th century biblical vocabulary, or of our modern terminology for radios, television sets, and computers. But some foreign concepts were arguably present in pre-contact Hawaiian, even though Hawaiian borrowed new vocabulary for them. Examples include the concepts of ‘to cook’ and ‘insect’, which were borrowed from English as kuke and ʻiniseka (< cook and insect, respectively); rough pre-contact alternatives include hoʻomoʻa for ‘to cook’ and mea kolo for ‘insect’.

Many Hawaiian concepts were also revised in response to foreign contact. For example, borrowed kinship terms like ʻanakala ‘uncle’ (< English uncle) and ʻanakē ‘aunty’ (< English aunty) tweaked the pre-contact understanding of makua kāne makua ‘parent’s older brother’ and makua kāne ʻōpio ‘parent’s younger brother’. Notice that the traditional system nicely paralleled the distinction of age between other Hawaiian kinship terms, like kaikuaʻana ‘older sibling (of the same gender)’ and kaikaina ‘younger sibling (of the same gender)’. Thus, while the older brother of one’s parent would be one’s makua kāne makua ‘parent’s older brother’, the older brother of a boy would be the boy’s kaikuaʻana ‘older sibling (of the same gender)’. Although the traditional distinction between a parent’s older and younger sibling has fallen out of use, the Hawaiian concepts of ʻanakala and ʻanakē are still not exactly isomorphic to English uncle and aunty, since one’s parent’s friend may also be one’s ʻanakala or ʻanakē. A less strictly familial relationship is entailed by the Hawaiian borrowings.

Finally, while nearly every domain of Hawaiian has been affected by English, the influence of the other donor languages has been severely limited. For instance, borrowings from the classical languages are typically Bible related. Furthermore, borrowings from other, more ‘exotic’ donor languages (like Czech and Ute) are extremely rare; indeed, such borrowings have typically originated with the Kōmike Huaʻōlelo as a conscious attempt to offset the overwhelming dominance of English.


Table 2: Loanwords in Hawaiian by donor language and semantic field (percentages)

No  Semantic field                English  Classical Greek  Māori  Hebrew  Czech  French  Total loanwords  Non-loanwords
 1  The physical world                3.0                -      -       -      -       -              3.0           97.0
 2  Kinship                           8.1                -      -       -      -       -              8.1           91.9
 3  Animals                          26.4              3.6    0.9     0.9      -       -             31.9           68.1
 4  The body                          2.1                -      -       -      -       -              2.1           97.9
 5  Food and drink                   26.7              1.5      -       -      -       -             28.1           71.9
 6  Clothing and grooming            33.0                -      -       -      -       -             33.0           67.0
 7  The house                        12.7                -      -       -      -       -             12.7           87.3
 8  Agriculture and vegetation       29.4                -      -       -    1.6       -             31.0           69.0
 9  Basic actions and technology     16.0                -      -       -      -       -             16.0           84.0
10  Motion                            6.3                -      -       -      -       -              6.3           93.7
11  Possession                       13.1                -      -       -      -       -             13.1           86.9
12  Spatial relations                 3.1                -      -       -      -       -              3.1           96.9
13  Quantity                          9.0                -      -       -      -       -              9.0           91.0
14  Time                              3.8                -      -       -      -       -              3.8           96.2
15  Sense perception                  2.3                -      -       -      -       -              2.3           97.7
16  Emotions and values                 -                -      -       -      -       -              0.0          100.0
17  Cognition                         4.8                -      -       -      -       -              4.8           95.2
18  Speech and language               8.4                -      -       -      -     2.8             11.2           88.8
19  Social and political relations    6.4                -      -       -      -       -              6.4           93.6
20  Warfare and hunting                 -                -      -       -      -       -              0.0          100.0
21  Law                               4.7                -      -       -      -       -              4.7           95.3
22  Religion and belief               8.9              8.9      -       -      -       -             17.8           82.2
23  Modern world                     42.6                -    1.9       -      -       -             44.5           55.5
24  Miscellaneous function words        -                -      -       -      -       -              0.0          100.0
    All words                        12.5              0.6    0.2     0.1    0.1     0.1             13.4           86.6

5. Integration of loanwords

There is a lot to say about the integration of Hawaiian loanwords because of the differences between Hawaiian and donor language phonologies, because of the modality by which the words were borrowed (i.e. aural or visual), and because of top-down influence from institutions. Borrowings, whatever their origin, generally conform to Hawaiian phonotactics. For instance, Hawaiian syllables are never closed, even in loanwords. Foreign codas are reanalyzed to fit the Hawaiian model, by consonant deletion, vowel insertion, or both. For example, the [n] in Classical Greek amomon [amomon] ‘amomum’ is deleted in Hawaiian ʻamomo [ʔəmomo]; the [ɫ] in English bill [bɪɫ] is word-final, but in Hawaiian pila [pilə] the corresponding [l] precedes a paragogic [ə]; and the complex, word-final coda [nd] in English island [ʔaɪlənd] is both simplified in Hawaiian ʻailana [ʔɐilanə] to [n] and followed by a paragogic [ə]. (The ‘island’ example is from Schütz n.d.) Note that, in case of vowel insertion, an English coda is reanalyzed as a Hawaiian onset (e.g., /pi.la/ and /ʔai.la.na/). Also note that both phonemically and orthographically vowel-initial words in English have phonetically glottal onsets in isolation. Hawaiian speakers, who contrast glottal stops phonemically, would seem to hear these and interpret them as phonemic in Hawaiian borrowings (as in the ‘island’ example).

In general, foreign consonant clusters are similarly reanalyzed. For example, the Hawaiian loanword kalepa [kəlepə] (< English scraper [skɹeɪpɚ]) exhibits both consonant deletion and vowel insertion, as the English [sk] cluster is simplified in Hawaiian to [k], while the English [kɹ] cluster is broken up in Hawaiian by vowel epenthesis as [kəl]. (The ‘scraper’ example is also from Schütz n.d.) However, a few Hawaiian loanwords do contain what appear to be complex onsets, like [kɹ] and [st], which do not occur in native Hawaiian words. Examples of both occur in kristo [kɹisto] (< Classical Greek Christos [kʰristos]). Therefore, it is important to partition native and foreign vocabulary into separate lexical strata. Another reason for having lexical strata in Hawaiian is the number of so-called loan phonemes, which are phones that only function to contrast foreign words. For example, [s] occurs in loanwords like savana [səvanə] (< English savanna [səvænə]), but nowhere in the native Hawaiian lexicon. Also the affricate [tʃ] only occurs in loanwords like mōchī [moːtʃiː] ‘sticky rice cake’ (< Japanese mochi [moːtʃiː]). Incidentally, foreign-sounding words can always be nativized by replacing the foreign phones with native ones, as in the variant pronunciations of savana [kəvanə] and mōchī [moːkiː].

Many linguists have been interested in the question of loanword adaptation in Hawaiian, since a donor language like English has many more consonants and vowels than Hawaiian (e.g., Carr 1951; Pukui & Elbert 1957; Schütz 1994). So, how are English words adapted to fit the Hawaiian model? Table 3 summarizes some of the correspondences between foreign and Hawaiian phones. The asymmetry in the table is worth noting; more English phones map to fewer Hawaiian phones. Hawaiian [k] and [a] are the most common targets represented here. It is perhaps also worth noting that the mappings are both many-to-one and one-to-many. For example, both English [n] and [ŋ] map to Hawaiian [n]; English [s] maps to both Hawaiian [k] and [h].

Vowels are sometimes lengthened in Hawaiian borrowings, presumably to represent the stress patterns of their foreign sources, as phonemically long vowels are never unstressed in Hawaiian (Schütz 1994, n.d.). Compare the patterns of stress in an English word like rabbit [ˈɹæbɪt] and its Hawaiian borrowing lāpaki [ˌlaːˈpaki]. Without the long vowel in this word, it would only have received one stress on the penultimate syllable [laˈpaki], which is suggested to be a poorer approximation of the source.


Table 3: Some correspondences between English and Hawaiian sounds in borrowings, adapted from Carr (1951), Pukui & Elbert (1957: xvii), and Schütz (1994: 192)

Consonants                                       Vowels
English                              Hawaiian    English          Hawaiian
n, ŋ                                 n           i, ɪ             i
p, b, f                              p           e, ɛ             e
t, d, θ, ð, s, z, ʒ, tʃ, dʒ, k, g    k           æ, a, ɚ, ə, ʌ    a
s, h, ʃ                              h           ɔ, o             o
ʍ                                    hu          u, ʊ             u
l, ɹ                                 l
v, w                                 w
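The correspondences in Table 3, together with the coda repair described above, can be sketched as a small substitution routine. This is only an illustration under stated assumptions, not a model of Hawaiian loanword adaptation: the function name and the dictionaries are hypothetical, [s] is always sent to [k] although the table also allows [h], segments not listed in the table (such as [m]) are passed through unchanged, consonant clusters are not repaired, and the paragogic vowel has to be supplied by the caller because its quality is not predictable from the table.

```python
# Assumed reading of Table 3: English segment -> Hawaiian segment.
CONSONANTS = {
    "n": "n", "ŋ": "n",
    "p": "p", "b": "p", "f": "p",
    "t": "k", "d": "k", "θ": "k", "ð": "k", "s": "k", "z": "k",
    "ʒ": "k", "tʃ": "k", "dʒ": "k", "k": "k", "g": "k",
    "h": "h", "ʃ": "h",
    "l": "l", "ɹ": "l",
    "v": "w", "w": "w",
}
VOWELS = {
    "i": "i", "ɪ": "i", "e": "e", "ɛ": "e",
    "æ": "a", "a": "a", "ə": "a", "ʌ": "a",
    "ɔ": "o", "o": "o", "u": "u", "ʊ": "u",
}
HAWAIIAN_VOWELS = set(VOWELS.values())


def nativize(segments, paragogic="a"):
    """Map a list of English segments to a Hawaiian-style spelling.

    Each segment is replaced by its Table 3 counterpart; if the result
    still ends in a consonant, a paragogic vowel is appended so that the
    final syllable is open.
    """
    out = []
    for seg in segments:
        if seg in VOWELS:
            out.append(VOWELS[seg])
        else:
            # Unlisted segments (e.g. "m") are kept as they are.
            out.append(CONSONANTS.get(seg, seg))
    if out and out[-1] not in HAWAIIAN_VOWELS:
        out.append(paragogic)
    return "".join(out)


print(nativize(["b", "ɪ", "l"]))                 # bill
print(nativize(["s", "u", "p"]))                 # soup
print(nativize(["k", "ʊ", "k"], paragogic="e"))  # cook
```

With these inputs the sketch yields pila, kupa and kuke, the forms cited in this chapter. Borrowings that involve cluster simplification, epenthesis inside the word, or vowel lengthening (such as kalepa or lāpaki) are outside what this simple routine can derive.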

With this in mind, there are a number of apparent counterexamples, which less perfectly match stress patterns between the source word and borrowing. For example, consider the proper names in Table 4, which are listed in Pukui & Elbert (1992) and analyzed by Schütz (1994: 195). All of these source words are stressed on their initial syllables, while the borrowings are stressed on their second syllables. Notice that the initial glottal stops are not represented in the Hawaiian borrowings either. Neither of these facts is surprising, given that glottal stops and long vowels were not regularly represented in the Hawaiian orthography until recently. So, while these Hawaiian borrowings might once have better approximated their source words (e.g., English Alex [ˈʔælɛks] might have been pronounced as Hawaiian [ˌʔaːˈlikə]), the names now appear fossilized in their present forms both because of the deficiency of the old orthography and because of their relatively frequent occurrence in print.

Table 4: Borrowed names with divergent stress patterns

Hawaiian borrowing

w (usually a signature of borrowing) ngantipa is due to a switch of nasal coda operating on a prior form *nganimpa (McConvell 1988b/2001). This presumably happened after p-lenition stopped operating (see discussion of ages and sound change in §7).

4.2.2. Verbs

There are 30 inflecting simple verbs in Gurindji, and many of these are also used to form complex verb phrases with coverbs (see §4.4.1). All but one of these can be traced back to proto-Ngumpin-Yapa, and many to higher up the Pama-Nyungan tree, and have therefore not been borrowed, at least in the last couple of millennia (see Nash 2008 for discussion of Ngumpin-Yapa verb roots and their history). A number of them show the effects of lenition, e.g. *paja -> paya ‘bite, drink’. The exception to being an Ngumpin-Yapa inheritance is only partial: jayi- ‘give’ has its first component borrowed from the Wardaman coverb joy ‘give’ combined with the proto-Ngumpin-Yapa root yu- ‘give’, which is no longer found in Gurindji. Rather than ending up as a synchronically analyzable complex verb phrase, this combination coalesced to become a single verb. One verb in Gurindji, yunpa- ‘sing’, is probably a loanword from the northern branch of Warluwarric at the proto-Ngumpin-Yapa stage (McConvell & Laughren 2004). Another verb, papa- ‘press down on’, contains an unlenited intervocalic p, so it might be presumed to be a Reverse Lenition internal loanword from Western Ngumpin (cf. Walmajarri papa- ‘be covered in dust/ashes’). In fact, however, the etymology of this verb is more likely to be a complex verb pat-pa- ‘press down on’, lit. ‘press-hit’ (cf. Walmajarri pat-pu- ‘press down on; press-hit’). The former consonant cluster tp has protected p from lenition.

4.3. Sometimes borrowed

Basic nominal vocabulary, such as body-part terms, is borrowed to a moderate extent. Body-part vocabulary in this list is 36% borrowed. Over half of those loans (19%), as with other semantic fields, are borrowed from Jaminjungan (Western Mirndi). While the notion of elevation of numbers of loans due to taboo on words similar to the names of deceased people does not have the profound effect sometimes attributed to it (Dixon 1980: 28), it has had an impact on basic vocabulary. The current word for foot/leg, jamana, was borrowed to replace the earlier yuparn about 70 years ago


Patrick McConvell

according to Gurindji people’s recollections when a man nicknamed Yuparnjawung ‘bad leg’ died. Cultural vocabulary is borrowed particularly where the thing, concept or practice has been imported from the donor language group. This does not show up much in the database as local varieties of cultural items are not in focus. A number of artifact terms are relatively recent loanwords from Western Ngumpin to judge by the absence of lenition, e.g. mirta ‘shield’; kurrupartu ‘boomerang’. The borrowing cannot be taken to mean that the class of artifacts is new but probably that the word arrived with a new style of item: mirta is a long narrow shield in contrast to the kurtiji used by desert people to the south, for instance. Another example in the database is tartij ‘headband’ borrowed from Jaminjung. While we cannot infer that Eastern Ngumpin speakers did not have headbands before contact with northern languages, it may be that this loan signals the adoption of a new style or function for headbands, perhaps use of European cloth, rather than hairstring (archeology cannot usually assist with such plant or hair items as they are not preserved for long; rock art could provide some clues in exceptionally favorable cases). Another loan (not in the database) which fairly clearly arrived relatively recently with its referent, is the pearl shell used as ornament which was traded from the Western Kimberley, a trade which only started in the late nineteenth century according to Akerman & Stanton (1994). Predictably given the recency of the borrowing, the Gurindji term jakurli does not exhibit lenition of the k to w (see §9 for discussion of the diagnosis of loan status and age from sound change).

4.4. Frequently borrowed

4.4.1. Coverbs

Coverbs are elements which are generally interpreted together with a light verb to yield together what might be in other languages, including most Australian languages, a single verb. As mentioned above these are frequently borrowed, especially from Non-Pama-Nyungan languages to the north. Languages in many parts of the world which have “light verb constructions” also frequently have high levels of borrowing of the other element with the greater semantic content, which are known as “coverbs” here (Wichmann & Wohlgemuth 2008; Bowern 2006b). Stratigraphy based on lenition (see §8) indicates that there are at least two strata of loanwords and this also applies to borrowed coverbs:

(i) early, undergoing lenition:
    paraj ‘find’ < Wardaman pirtij ‘find’

(ii) late, not leniting:
    partaj ‘climb’ < Miriwung pertij ‘climb’
    jipu ‘extinguish’ < Jaminjung jipu ‘extinguish’

In the first two examples there are changes in vowels which may give clues as to stratigraphic placement but this is not well worked out yet.

4.4.2. Environment

Environmental vocabulary is the most common recipient of loans: 55% of Animals and 66% of The physical world, with most of these being from northern Non-Pama-Nyungan sources. While a few European introduced animals are there, this is not a significant component. The vocabulary which is specific to the environment of the Victoria River Basin as compared to the southern semi-desert is overwhelmingly borrowed from the Non-Pama-Nyungan languages to the north of Gurindji. Not much of this vocabulary is captured by the list used here, but some examples of fauna which are distinctively associated with the riverine region are: kuljara ‘dove (spp)’, kuyarru ‘owl (spp)’, jaalij ‘crayfish’, ngalja ‘frog’ (only sand-frogs are known in the desert). Other animal or animal-related terms whose referents are also found in the semi-desert are borrowed into Gurindji too from northern Non-Pama-Nyungan languages, e.g. kalpun ‘hawk (generic)’; juru ‘nest’, muntarla ‘scorpion’, jungkuwurru ‘echidna’, kirrawa ‘goanna’, and ngarin ‘meat, animal’ (for the last, see McConvell 1997b). These include some common generic terms which would have been current in a scenario of multilingualism between early Gurindji-Ngarinyman and Non-Pama-Nyungan neighbors, and may be items easily adopted from the substrate in a scenario of language shift of resident Non-Pama-Nyungan speakers to Gurindji-Ngarinyman 1500–500 years ago perhaps. A number of fauna terms are inherited within Ngumpin-Yapa and lenition testifies to their age, e.g. jalwa ‘heron’ < proto-Ngumpin-Yapa *jalka. The borrowed environmental vocabulary can also be divided into strata based on whether lenition has operated or not. The word for ‘fish’, for instance, yawu, descends from *yaku ‘fish’, a Western Mirndi root, which is found in Jaminjung today as yak due to another sound change in Western Mirndi languages dropping final vowels after certain consonants. The term survives in Ngardi and some Jaru dialects south-west of Gurindji as yaku.
This indicates that the term was borrowed into a group of Ngumpin-Yapa languages including eastern members of Western Ngumpin, and lenition subsequently operated on the Eastern Ngumpin languages. Northern Warlpiri also uses yawu but that is a recent loan from Eastern Ngumpin. This word contrasts with the environmental words discussed above to which lenition has not applied and which therefore can be classed as more recent. Their distribution is also largely confined to Eastern Ngumpin languages, unlike yaku/yawu, which spills over into Western Ngumpin.
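The stratigraphic reasoning used here can be made explicit with a small sketch. It rests on several assumptions that go beyond the text: only two lenition correspondences are modeled (k > w and j > y, the ones illustrated by *yaku > yawu, *jalka > jalwa and *paja > paya), the leniting environment is taken to be "after a vowel or l and before a vowel", orthographic digraphs are not treated specially, and the function names are hypothetical. In the last call, the donor form of jakurli is not given in the text and is simply assumed to be identical to the Gurindji form.

```python
VOWELS = set("aiu")
# Assumed subset of the lenition correspondences discussed in the text.
LENITION = {"k": "w", "j": "y"}


def lenite(form):
    """Apply the assumed lenition rules to an unlenited form."""
    out = []
    for i, ch in enumerate(form):
        after_ok = i > 0 and (form[i - 1] in VOWELS or form[i - 1] == "l")
        before_ok = i + 1 < len(form) and form[i + 1] in VOWELS
        if ch in LENITION and after_ok and before_ok:
            out.append(LENITION[ch])
        else:
            out.append(ch)
    return "".join(out)


def stratum(source, gurindji):
    """Classify a word as 'early', 'late' or 'undetermined'.

    'early': the Gurindji form shows lenition relative to the source.
    'late':  the source had a leniting environment but Gurindji kept the stop.
    """
    lenited = lenite(source)
    if lenited == source:
        return "undetermined"  # no leniting environment, so no evidence
    if gurindji == lenited:
        return "early"
    if gurindji == source:
        return "late"
    return "undetermined"


print(lenite("yaku"), lenite("jalka"), lenite("paja"))
print(stratum("yaku", "yawu"))
print(stratum("jakurli", "jakurli"))  # donor form assumed, see above
```

On this sketch yawu counts as early and jakurli as late, in line with the discussion above; forms with additional vowel changes, such as paraj < pirtij, would come out as undetermined and need separate treatment.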


4.4.3. Wanderwörter

Some of the vocabulary can be classed as Wanderwörter – items that have diffused widely and whose ultimate source is sometimes hard to discover. These are sometimes cultural items, like warlmayi ‘spear thrower’ but also include flora and fauna names like warrija ‘crocodile’, marnuwiji ‘conkerberry (Carissa lanceolata)’, jurlak ‘bird’, wirntiku ‘stone curlew’, jipilyuku ‘duck’. While ‘crocodile’ is an animal with exceptional properties which may lead to it being a common topic in interethnic conversation, the same reasons cannot be adduced for the other items above or many other Wanderwörter in this region, at least not given the cultural configuration of the recent past. The form warrija specifically can be traced to Mudburra/Karranga, borrowed from Western Mirndi warrij with the -a augmentation typical of Mudburra, and is clearly a late loan into Gurindji as there are place names containing the older form warrij, e.g. Warrijkuny ‘belonging to crocodile; crocodile dreaming place’ (McConvell 2004, 2009b). Marnuwiji has a cognate in Warlpiri marnikiji and many forms in other languages with k, but in Walmajarri unexpectedly marnuwiji is found, possibly indicating borrowing of the lenited form from east to west in a counter-flow to the Reverse Lenition, discussed in §9, which is exemplified in the words wirntiku and jipilyuku without lenition. Jurlak ‘bird (generic)’ appears to be related to the Jarragan root jirek- ‘bird’. If so, it underwent Ngumpin-Yapa lateralization at the proto-language level perhaps 2000–3000 years ago and spread widely from there. Geographically this raises questions: the Ngumpin-Yapa homeland was probably in the semi-desert around the present location of Warlmanpa or further east. If Jarragan was in contact with proto-Ngumpin-Yapa that early, it implies a more easterly territory for Jarragan in those times.

5. Source languages

The total proportion of loanwords from other indigenous Australian languages is approximately 40%. The four major sources for loanwords in Gurindji are the following languages (% of total vocabulary rounded to nearest %):

Non-Pama-Nyungan languages:
    Jaminjung-Ngaliwurru (Mirndi family, Western Branch)    19%
    Miriwung (Jarragan)                                       7%
    Wardaman (Yangmanic)                                      3%
Pama-Nyungan languages:
    Jaru and other Western Ngumpin                            7%

There is a bias in the semantic composition of the Non-Pama-Nyungan loans: there are higher numbers of loanwords in the physical world and animals categories, which could relate to the hypothesis that the Gurindji language moved into the riverine area with its different environment in the last couple of millennia and adopted words from resident languages for the environment (see above §4.4.2, “Environment”). Words for motion and other activity are also commonly borrowed from Non-Pama-Nyungan, contrary to the hypothesis. However, the explanation in this case lies in the fact that these concepts are expressed largely through coverbs, and these are commonly borrowed, perhaps irrespective of their semantics (see §4.2.2, §4.4.1 on complex verbs/coverbs). The bulk of the loanwords from Jaru and other Western Ngumpin are those words classed as Reverse Lenition (see §9 for discussion). Here too, body parts are well represented but environmental words less so, as one would predict from the northern movement hypothesis. Borrowing of coverbs from Western Ngumpin is rare: coverbs with related forms in Western and Eastern Ngumpin are generally inheritances with lenition evident where there are intervocalic consonants, e.g. Gurindji jaart ‘eat’ < *japart, cf. Walmajarri japart ‘chew’.

6. Integration of loanwords

Some of the Non-Pama-Nyungan donor languages to the north have different vowel systems from the 3-vowel system of Gurindji: Miriwung and other Jarragan languages have an additional central vowel e; Wardaman has a five vowel system with additional e (mid-front vowel) and o (mid-back), which generally appear in Gurindji as i or a and u or a, respectively. The appearance of the mid vowels provides clear evidence that Wardaman is the source and Gurindji the recipient.

(1) Wardaman e > Gurindji i
    Wardaman jegban ‘bustard’        Gurindji jikpan

(2) Wardaman e > Gurindji a
    Wardaman jelin ‘crayfish’        Gurindji jaalij

The last example has a long a replacing e and the final consonant difference is not a regular correspondence; how this occurred cannot be established at this point.

(3) Wardaman o > Gurindji u
    Wardaman ngone ‘spear’           Gurindji nguni ‘short jabbing spear’

The meaning difference here is typical of semantic narrowing in loanwords: a term which is a general word in a donor language is adopted in a specific sense which is relevant to the cultural contact between the groups. In this case the kind of spear which was least marked among the Wardaman is the most exotic for the Gurindji


Patrick McConvell

and Ngarinyman and the word is adopted to mean the typical Wardaman type of spear. Later the northern origin of the item is forgotten.

(4) Wardaman o > Gurindji a
    Wardaman joy- ‘give’    Gurindji jaying- ‘give’
    (coverb with ancillary verb wo- ‘give’)

The Gurindji verb would originally have been a complex verb *jayi-yung- where the ancillary verb is the inherited Pama-Nyungan root found in other Ngumpin-Yapa languages *yung- ‘give’. Wardaman has the phonotactic possibility of V+glide coda, but Gurindji lacks this and such syllables are converted to two syllables: V+glide+V (i following y; u following w).

    Wardaman gornbun ‘hawk’    Gurindji kalpun

The last example also illustrates another regular change between Wardaman and Gurindji in loanwords (or at least loanwords of a certain chronological stratum): replacement of rn coda by l coda. This also occurs in the following loanword:

    Wardaman mejern ‘stomach’    Gurindji majul

The second e here changes to u rather than a: I have no explanation for this at the moment. This example also illustrates another source of evidence for direction of borrowing: morphological complexity in the source disappearing in the recipient. Wardaman mejern is made up of the ‘vegetable’ class prefix ma- and the stem jern (Merlan 1994: 597). Gurindji does not have any noun class prefixes or grammatical genders. The current dialects of Kriol also have five vowels, which may be related to significant substratal influence from the indigenous languages around Ngukurr (on the Roper River) according to the scenario proposed by Munro (2000, 2004); these languages generally have five vowels. However, many of the older Gurindji speakers of Pidgin/Kriol in the 1970s tended to use a three vowel system in conformity with their first language, and this has affected the loanwords from Pidgin/Kriol into Gurindji. There is also a difference in the consonant inventory of Gurindji and Wardaman on the one hand, and the Jarragan languages and Jaminjung on the other: the latter have the lamino-dental phonemes th and nh as well as the lamino-palatals j and ny. This is generally resolved by converting th to j and nh to ny in loanwords. Ngaliwurru, the southern dialect of Jaminjung, and the eastern Mirndi languages only have lamino-palatals. The phonotactics of Ngumpin-Yapa languages generally are different from Eastern Ngumpin, and Eastern Ngumpin resembles the Non-Pama-Nyungan languages in a number of respects, such as in having a wider range of coda consonants including p, k, m, ng. This is unlike other Ngumpin-Yapa languages, which tend to add vowels so that these segments do not occur at the end of words, e.g. Warlpiri


kartaku, Gurindji kartak ‘cup etc.’ (with the meanings ‘snail, shell’ in the ultimate loan source). This loosening of phonological constraints in Victoria River Eastern Ngumpin especially may be due to the part language shift played in the history (Thomason & Kaufman (1988) point to phonology as one of the major areas in which substrate influence is found). In far eastern Eastern Ngumpin (Mudburra and Karranga), however, there was still resistance to consonant final words until recent times and this is responsible for an adjustment of a-augmentation, such as the change *warrij > warrija ‘crocodile’ mentioned above. In this case and a number of others, the Mudburra/Karranga -a augmented form was borrowed into Gurindji relatively recently.
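The segment substitutions described in this section for Wardaman loans can be sketched as a small candidate generator. This is an illustrative sketch, not part of the database workflow: the function name `gurindji_candidates` is hypothetical, the respelling of Wardaman g and b as Gurindji k and p is inferred from the examples rather than stated in the text, and vowel lengthening, irregular consonant changes and the th/nh adjustments for Jarragan and Jaminjung loans are left out. Because e and o each have two possible outcomes, the sketch returns a set of candidate forms rather than a single prediction.

```python
import re
from itertools import product

# Two possible outcomes for each Wardaman mid vowel (examples 1-4).
VOWEL_OUTCOMES = {"e": ("i", "a"), "o": ("u", "a")}


def gurindji_candidates(wardaman: str) -> set[str]:
    """Return candidate Gurindji shapes for a Wardaman form."""
    # Assumed respelling of stops; the lookbehind keeps the digraph ng intact.
    form = re.sub(r"(?<!n)g", "k", wardaman)
    form = form.replace("b", "p")
    # rn coda > l coda (rn not followed by a vowel).
    form = re.sub(r"rn(?![aeiou])", "l", form)
    # Expand every mid vowel into its two possible outcomes.
    slots = [VOWEL_OUTCOMES.get(ch, (ch,)) for ch in form]
    candidates = {"".join(choice) for choice in product(*slots)}
    # V+glide codas become V+glide+V (i after y, u after w).
    resolved = set()
    for cand in candidates:
        cand = re.sub(r"(?<=[aiu])y(?![aiu])", "yi", cand)
        cand = re.sub(r"(?<=[aiu])w(?![aiu])", "wu", cand)
        resolved.add(cand)
    return resolved
```

The attested jikpan, nguni and kalpun all fall within the candidate sets for jegban, ngone and gornbun, and jayi (the first part of *jayi-yung-) within that for joy-. The attested majul is not among the candidates for mejern, which matches the observation above that its second vowel is unexplained.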

7. Loanword strata and sound changes

One of the main methods for detecting loanwords is the fact that they have not undergone processes which have affected the inherited vocabulary, such as regular sound changes. Importation of distinctive roots and morphology from other languages also often reveals a clear picture of sequence and direction of loans.

7.1. Lateralization

Lateralization involved the change from the retroflex glide /r/ in proto-Pama-Nyungan to the retroflex lateral /rl/ in proto-Ngumpin-Yapa. The word for ‘hand’ for instance in proto-Pama-Nyungan is *mara and it retains this form in many languages but in Ngumpin-Yapa it becomes *marla; see further §8 below.

7.2. Lenition

There are several forms of intervocalic lenition in Gurindji and other Eastern Ngumpin languages:

(5) *p, k > w    between vowels and following liquids
    *j > y       between vowels
    *rt > r      between vowels

There is a synchronic lenition alternation in suffixes and enclitics:

(6) a. -piti ~ -witi        ‘place to do x’
       kataj-piti           ‘cutting place’
       ngurra-witi          ‘camping place’
    b. -ku ~ -wu            dative
       kataj-ku             ‘for cutting’
       ngurra-wu            ‘for camp’
    c. -jawung ~ -yawung    ‘having’
       ngumpit-jawung       ‘having a person’
       ngurra-yawung        ‘having a camp’
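The rules in (5) and the alternations in (6) can be sketched as regular-expression substitutions over the practical orthography. This is a minimal sketch under stated assumptions: the function names `lenite` and `attach` are hypothetical, “following liquids” is read as “after l, rl, rr or r and before a vowel”, and the relative timing of the individual changes is ignored.

```python
import re


def lenite(form: str) -> str:
    """Apply the three lenition rules in (5) to a whole word."""
    # *p, k > w between vowels and after a liquid (assumed reading of (5)).
    form = re.sub(r"(?<=[aiulr])[pk](?=[aiu])", "w", form)
    # *j > y between vowels.
    form = re.sub(r"(?<=[aiu])j(?=[aiu])", "y", form)
    # *rt > r between vowels.
    form = re.sub(r"(?<=[aiu])rt(?=[aiu])", "r", form)
    return form


def attach(stem: str, suffix: str) -> str:
    """Join stem and suffix, leniting only the suffix-initial consonant."""
    # Probe: last stem segment + suffix-initial consonant + following vowel.
    probe = lenite(stem[-1] + suffix[:2])
    return stem + probe[1:] + suffix[2:]
```

Under these assumptions `attach("ngurra", "ku")` gives ngurrawu and `attach("kataj", "ku")` gives katajku, as in (6b). A non-leniting suffix would simply bypass `attach`.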

There are also non-leniting suffixes, which are no doubt borrowed in the Reverse Lenition pattern from Western Ngumpin, e.g.

(7) -kari ‘other’, cf. Jaru -kariny ~ -wariny etc.

As the branches to which they are assigned imply, Lateralization precedes Lenition in time, perhaps by 1000 years or more. If the temporal sequence had been opposite, lenition rt > r would have fed r > rl but this does not occur. There are indications that the lenition of different consonants did not all occur at the same time, as some words have for instance lenited p > w but unlenited j, e.g. jawiji ‘mother’s father’ < *japiji. The issues are complex and are not further explored here.

7.3. Vowel apocope

The change -ngu > ng is found for instance in the Gurindji word marang ‘muller’ mentioned above (< *mara-ngu, discussed in §8 on linguistic stratigraphy). Similarly -u seems to be lost following k in some cases.
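The ordering argument in §7.2 can be made concrete by applying the changes in both orders. This is an illustrative sketch: the rule functions are simplified to intervocalic or word-final position, and the input kartu is a made-up form used only to show the interaction, not a reconstruction.

```python
import re


def lateralize(form: str) -> str:
    # *r > rl: a single r between vowels (rr, rt and rl are left alone).
    return re.sub(r"(?<=[aiu])r(?=[aiu])", "rl", form)


def lenite_rt(form: str) -> str:
    # *rt > r between vowels.
    return re.sub(r"(?<=[aiu])rt(?=[aiu])", "r", form)


def apocope(form: str) -> str:
    # Word-final -ngu > -ng.
    return re.sub(r"ngu$", "ng", form)


def derive(form: str, *changes) -> str:
    """Apply sound changes one after another, in the order given."""
    for change in changes:
        form = change(form)
    return form


# Hypothetical input with intervocalic rt.
proposed_order = derive("kartu", lateralize, lenite_rt)
reversed_order = derive("kartu", lenite_rt, lateralize)
```

With Lateralization first, the r created by lenition is never lateralized (karu); with the orders reversed, lenition would feed Lateralization (karlu), which is the outcome said above not to occur.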

8. Linguistic stratigraphy and culture history Absolute dating of loans is much more difficult in Australia than in some other regions due to absence of documentary records before the nineteenth century and often scanty and unreliable treatment of indigenous matters since then. Absolute dating available from archeology can provide calibration as when a new artifact has arrived in an area and its horizon can be dated, and additionally a new loan term for it and/or associated practices has arrived presumably at the same time, which has distinctive patterns, e.g. absence of a sound change. This happy conjunction of circumstances is rare but then again little research has yet been done using such methods. One study which provides something approaching this kind of case is that of the term for ‘muller’ (top grindstone) in Ngumpin-Yapa – marang(u) (McConvell & Smith 2003). This consists of the proto-Pama-Nyungan root *mara ‘hand’ with an old suffix *-ngu. In Eastern Ngumpin languages, the term for ‘hand’ has undergone a regular Lateralization sound change converting it to marla. The absence of this change in marangu indicates that this is a loanword which arrived after the lateralization change ceased to operate. Lateralization is one of the shared innovations that define the Ngumpin-Yapa subgroup (McConvell & Laughren 2004), so this loanword was borrowed after Ngumpin-Yapa unity. It was borrowed before the sound change of loss of final u after ng which affected Gurindji and other Eastern Ngumpin languages, however, yielding marang in Gurindji for ‘muller’.


Now the major efflorescence of seed-grinding as a subsistence strategy occurred in Central arid parts of Australia around 3500 years ago according to archeologists, and mullers indicating this change are found in the area close to Eastern Ngumpin with earliest dates around 2500 years ago. This leads to the hypothesis that the word marangu arrived in the area about this time and that this date lies between the Lateralization change (proto-Ngumpin-Yapa) and u-loss, which may be characteristic of the group of languages Gurindji, Birlinarra and Ngarinyman (which I will here assume to be a sub-group and call Victoria River languages). u-loss is likely to be a result of adstratal contact with Non-Pama-Nyungan languages or a Non-Pama-Nyungan substratum associated with movement of Eastern Ngumpin languages into the Victoria River Basin. Table 3 sets out the Ages used in the database together with the sound changes, loanword types and in some cases inferences about cultural patterns and language location. The dates are only extremely rough estimates with little evidence so far to back them up: they are not suitable for citation as meaningful in further research. The lenition sound change has also been used to detect borrowing and stratify Gurindji kinship vocabulary to investigate changes in kinship systems as reflected in borrowing (e.g. borrowing new terms to fill out a more complex system), as well as to plot the diffusion of the subsection system and estimate its chronology (McConvell (1997a) and references there). The arrival of subsections in Victoria River Eastern Ngumpin precedes lenition for most terms and has been placed at about 2000 years ago, earlier than my previous estimates, although new evidence and arguments (e.g. in Harvey 2008) are yet to be assessed.
One interesting finding has been that all the ‘father’s father’ terms in Ngumpin-Yapa languages have been borrowed from neighboring languages – but all from different languages which are close neighbors of the specific receiving languages. This may lead to the conclusion that the proto-Ngumpin-Yapa kinship system lacked such a term and recruited neighboring terms as the system changed to the type known as “Aranda”, perhaps together with a change in marriage rules, relatively recently. The Gurindji term kaku ‘father’s father’ is closely similar to the Jaminjung term kakung, but the source of the latter is probably a Jarragan language, due to the Jarragan -ng non-feminine suffix: it is a late post-lenition loan. In previous work (McConvell 1997a) the ‘mother’s father’ term in Gurindji was identified as a loan from Jaminjung (jawijing) but the situation is more complex. The form japiji is found in Western Ngumpin and jam(p)irti in Yapa, suggesting a reconstruction of *jampi + suffix for proto-Ngumpin-Yapa, and *japi + suffix in proto-Ngumpin. In this light jawijing was no doubt a loan from Ngumpin into Non-Pama-Nyungan, not vice versa. The fact that p lenites but not j in the erstwhile suffix indicates that p-lenition precedes j-lenition in time, and further stratification of vocabulary could be carried out on that basis (for kinship suffixes in Pama-Nyungan, see McConvell 2008c). Of course there is no way to arrive at absolute time horizons for borrowing of kinship terms based on social organization vocabulary alone, but if the method used for dateable artifacts could be applied more extensively, some inferences could be drawn about linguistic stratigraphy of other vocabulary.
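The dating logic used for marangu can be sketched as a search for the points in the sequence of sound changes at which a form could have entered the language and still surface in its attested shape. This is a simplified sketch: `entry_points` is a hypothetical helper, the three changes are reduced to single regular expressions, and the staggered timing of the individual lenitions is ignored.

```python
import re


def lateralize(form: str) -> str:
    # *r > rl: a single r between vowels.
    return re.sub(r"(?<=[aiu])r(?=[aiu])", "rl", form)


def lenite(form: str) -> str:
    # *p, k > w; *j > y; *rt > r (simplified environments).
    form = re.sub(r"(?<=[aiulr])[pk](?=[aiu])", "w", form)
    form = re.sub(r"(?<=[aiu])j(?=[aiu])", "y", form)
    return re.sub(r"(?<=[aiu])rt(?=[aiu])", "r", form)


def apocope(form: str) -> str:
    # Word-final -ngu > -ng.
    return re.sub(r"ngu$", "ng", form)


# Relative chronology argued for in section 7.
CHANGES = [lateralize, lenite, apocope]


def entry_points(source: str, attested: str) -> list[int]:
    """Return every k such that a word entering after the first k changes
    (and undergoing all later ones) ends up as the attested form."""
    points = []
    for k in range(len(CHANGES) + 1):
        form = source
        for change in CHANGES[k:]:
            form = change(form)
        if form == attested:
            points.append(k)
    return points
```

For marangu > marang the sketch returns [1, 2], that is, entry after Lateralization but before apocope, as argued above. Inherited *mara > marla returns [0], and an unlenited form such as laja ‘shoulder’ (see §9) returns [2, 3], that is, entry after lenition.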


Table 3: Ages

Start | End | Period | Sound change | Loanword | Culture/location
-1500 | -1000 | Pre-Proto-Ngumpin-Yapa | | *ngaru ‘honey’, *jirak ‘bird’ from Jarragan | seed-grinding intensification in Central Australia
-1000 | -500 | Proto-Ngumpin-Yapa | Lateralization *r > rl: *mara > marla ‘hand’, *ngaru > ngarlu, *jirak > jurlak | *yinpa- ‘sing’ from N, Warluwarric |
-500 | -1 | Proto-Ngumpin | | mara-ngu ‘muller’; early borrowing from W. Mirndi | seed-grinding intensifies north of desert
1 | 500 | Proto-Eastern Ngumpin | | yaku ‘fish’, subsections | Eastern Ngumpin begin to occupy VR basin
 | | | lenition *p, k > w: *yaku > yawu | borrowing from W. Mirndi | Eastern Ngumpin dominate S. VR Basin
500 | 1000 | Proto-Victoria River E. Ngumpin | lenition *rt > r, *j > y; apocope *ngu# > ng#: Gurindji marang ‘muller’ | borrowing from W. Mirndi |
1000 | 1500 | Early Gurindji | | late borrowing from N. Jarragan, Wardaman; Reverse lenition borrowing from west |
1500 | 1850 | Pre-colonial Gurindji | | post Mudburra -a augmentation |
1850 | 1880 | Indirect European contact | | possible start of long-distance loans from indigenous languages for European introductions |
1880 | 1910 | Early direct contact | | start of Pidgin / English loans |
1910 | 1950 | Cattle stations / Pidgin 2nd language | | |
1950 | 2007 | Creolization / mixed language | | |

9. Reverse lenition and regular sound change

There are quite a large number of words in Gurindji which appear to be inheritances within Ngumpin-Yapa but have unlenited intervocalic consonants. One explanation for this could be that the lenition sound change only affected part of the vocabulary – for instance that the sound change spread gradually through the


lexicon and what we see now is a freezing of the change in a state of partial completion. The alternative, which I have preferred here, is that the lenition sound change was categorical in Gurindji at a particular period converting all intervocalic consonants of the leniting type in the correct environments to glides; subsequently, after lenition had stopped operating, quite a large number of new items were borrowed in from Western Ngumpin (mostly Jaru) including many items which had not undergone lenition. This is a type of process known as “reversed change” (Phillips 2006) and I call the effect of this wave of late borrowing from the west, internally within Ngumpin-Yapa, Reverse Lenition. On this basis I have assigned all unlenited forms to the loanword category. One of the reasons why I prefer the Reverse Lenition explanation is the existence of cases like laja ‘shoulder’ in Gurindji. If lenition had operated without exception, this word would have been converted into laya. The form found in Ngaliwurru to the north is laya; this is clearly a loan form as it is a Ngumpin-Yapa root, it is only found in Ngaliwurru, not in the more northerly dialect Jaminjung, and because there is no lenition in Ngaliwurru, and there is other evidence that Gurindji-Birlinarra-Ngarinyman vocabulary was borrowed to a greater extent into Ngaliwurru than into Jaminjung. What occurred is that *laja was inherited into Gurindji-Ngarinyman, underwent lenition and passed on the lenited form to neighboring Ngaliwurru. Subsequently the conservative form laja was borrowed from Western Ngumpin into Gurindji, “reversing” the lenition change in Gurindji but not affecting the loan which had further diffused to Ngaliwurru. 
It is possible that both these processes – gradual diffusion of lenition through the Gurindji lexicon and Reverse Lenition borrowing – have taken place, and it will be a difficult job to establish which of these happened for each lexical item except in favorable circumstances like the case of laja. A further set of evidence favoring the Reverse Lenition approach is the existence of lenited and unlenited doublets in Gurindji, e.g.

    ngapuju, ngawuju, ngawuyu    ‘father’s mother’
    kapuku, kawurlu              ‘(elder) sister’
    jukuputu, juutu              ‘elbow’
    Pakarrji, Paarrji            (place name)

It is more likely that two different languages or dialects each had a different form, lenited and unlenited, and that the unlenited form is subsequently borrowed by the leniting dialects without in this case losing the lenited form. At this stage, it is not possible to discern any pattern in the distribution of the Gurindji unlenited (presumed Ngumpin-internal loan) versus lenited forms. The Reverse Lenition borrowed forms include very common words such as jurtu ‘dust’ and kiki ‘star’. Possibly this group of words are words commonly used in intergroup communication and this is one of the drivers of the spread of conservative forms. Factors adduced, for instance, in Europe for “reversed change” borrowing, such as the tendency for prestige or standard dialects to spread forms into areas of local


variation, do not really apply in Indigenous Australia, where traditionally there were no elites, dominant classes or urban centers.
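The working criterion adopted in this section, that forms with unlenited intervocalic consonants are assigned to the loanword category, can be sketched as a simple screen. This is an illustrative sketch only: `unlenited_sites` is a hypothetical helper, the environments are a simplified reading of (5) in §7.2, and a hit marks a candidate for Reverse Lenition borrowing, not a proven loan.

```python
import re

# Consonants that lenition would have turned into glides in these environments.
UNLENITED = re.compile(
    r"(?<=[aiulr])[pk](?=[aiu])"       # p, k after a vowel or liquid, before a vowel
    r"|(?<=[aiu])(?:j|rt)(?=[aiu])"    # j, rt between vowels
)


def unlenited_sites(form: str) -> list[str]:
    """Return the unlenited consonants found in leniting environments."""
    return UNLENITED.findall(form.lower())
```

Applied to the doublets above, ngapuju yields two sites (p and j), ngawuju one (j) and ngawuyu none; jurtu, kiki and laja are each flagged once, while ngawa ‘water’ is not flagged.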

10. Attitudes to borrowing Speakers of Gurindji frequently use words and longer insertions from other languages, often with a particular social meaning and pragmatic force (McConvell 1988a). This kind of code-switching is tolerated and even admired as a creative way of utilizing a multilingual repertoire. It is not permanent borrowing, however, and when it seems that children are borrowing a word from another language, there can be an adverse reaction among adults and stigmatization of use of the loanword. This occurred in recent years (as reported to me by Felicity Meakins): When children began to use the form ngapa (from Warlpiri or Jaru) for ‘water’, instead of Gurindji ngawa, it provoked such a reaction and they eventually stopped using it (McConvell 2008c). As far as old borrowings are concerned, people generally have no knowledge of the source of words or special attitudes towards them. If there are synonyms in use, sometimes people will assert that one of them is “not really Gurindji”.

11. Grammatical borrowing

Gurindji and the other Eastern Ngumpin languages have retained a grammatical profile quite distinct from the Non-Pama-Nyungan languages to their north. The Non-Pama-Nyungan languages have complex verb morphology including pronominal prefixes marking subjects and objects, many of which are portmanteau forms. Some of them – the Jarragan languages, the Nungali language of Western Mirndi, and Wardaman – have gender/noun class marking on nouns and adjectives. None of the Ngumpin-Yapa languages including Gurindji have these features. They have pronominal enclitics, cross-referencing subjects, objects and obliques, which are placed in various positions in the clause varying for different languages, mainly in second position and on a small closed set of “catalysts” (also known as auxiliaries) (McConvell 1996). Related enclitic forms are widespread in western Pama-Nyungan and are inherited from some western Pama-Nyungan proto-language, not borrowed. They are unrelated to the Non-Pama-Nyungan bound pronouns and cannot be seen as products of structural diffusion from Non-Pama-Nyungan either. Among the Non-Pama-Nyungan languages, some like Kija have no ergative or accusative marker on nominals; in others like Eastern Mirndi ergative is optional and it is arguable that the ergative suffix -ni in Mirndi is a relatively recent extension of the locative which may itself have pronominal origins (McConvell 2003). In contrast all the Ngumpin-Yapa languages have case suffix systems, including the obligatory ergative (-ngku ~ -lu etc.), which go back to proto-Pama-Nyungan.


The major structural effect of Non-Pama-Nyungan language contact on Eastern Ngumpin languages including Gurindji is the phenomenon of complex verbs with loose nexus between the coverb (the main lexical meaning bearing element) and the light or ancillary verb which usually accompanies it. Ngumpin-Yapa languages other than Eastern Ngumpin and other Pama-Nyungan languages also have two-part verbs but these are generally compounds – single words consisting of a preverb element and an inflecting verbal element. In contrast complex verbs in Eastern Ngumpin (including Gurindji) have the following characteristics:

(i) the coverb and light verb are separate words, phonologically
(ii) the coverb may precede or follow the light verb or may be separated from it by other words
(iii) the coverb may have suffixes (adverbial, information structure etc.) attached to it
(iv) the coverb may appear without a light verb in certain constructions, e.g. nonfinite subordinate clauses, imperatives, narratives.

These characteristics are all found in all the neighboring Non-Pama-Nyungan languages to the north but not in the neighboring Pama-Nyungan languages. It appears likely that these characteristics were adopted by Eastern Ngumpin languages from Non-Pama-Nyungan neighbors, possibly as a substratal feature, as Eastern Ngumpin moved into the linguistic area of complex verbs with loose nexus. Further evidence is to be found in the clear history of replacement of Ngumpin-Yapa monomorphemic verb roots by complex verbs in the north-east, usually involving borrowing of Non-Pama-Nyungan coverbs. For instance, the stem for ‘cut’ in Jaru is the monomorphemic inflecting verb stem kuma- and this is also found in Wanyjirra, but in Gurindji the form kataj pa- ‘cut-hit’ is found, with kataj being a loan from Western Mirndi. Kataj would originally have been bimorphemic, made up of a coverb root kat (not from English cut – a “false friend”) and a suffix -Vj found in Western Mirndi. This type of morphological complexity is quite common in the sources of coverbs and helps to confirm the path and direction of borrowing, since the morphology becomes opaque when the item is borrowed into Gurindji. As remarked above, no inflecting verbs have been borrowed from Non-Pama-Nyungan into Gurindji except for one example of merger of a Non-Pama-Nyungan coverb with a Pama-Nyungan verb, but the list of simple verbs including light verbs is quite similar in terms of their semantics and may indicate that some kind of convergence has been at work. The selection of light verbs by coverbs is also quite similar but there are a sizable number where there are different selections.


However, large numbers of coverbs have been borrowed between Non-Pama-Nyungan languages and from Non-Pama-Nyungan into the Pama-Nyungan Eastern Ngumpin languages. In this we see mutual positive feedback between diffusion of grammatical patterns and actual forms. The loose nexus complex verb arrangement allows insertion of invariant verbal forms into a frame with a light verb without the complications which arise when these elements are morphologically bound together. Additionally, the more frequent and less marked this arrangement becomes due to the volume of coverb loans, the more it becomes established as the standard pattern and reduces the numbers of monomorphemic verbs in the languages joining the area. Another case of an areal grammatical feature is the locative-allative alternation: in Gurindji, a locative phrase which is a secondary predication on an object or oblique NP can optionally be marked allative although no motion is involved. Similar constructions are found in a wide area east of Gurindji in both Pama-Nyungan and Non-Pama-Nyungan languages. This is in fact probably an inherited feature in some sub-group of Pama-Nyungan, but early borrowing of it into Ngumpin-Yapa cannot be ruled out (McConvell & Simpson 2008).

12. Areal semantics

As well as phonotactic patterns, Gurindji and the other Victoria River Eastern Ngumpin languages share some aspects of semantic organization with their Non-Pama-Nyungan neighbors rather than with Western Ngumpin and Yapa within Ngumpin-Yapa. These include areal polysemies like the equation of ‘hill’ and ‘head’ in Victoria River Eastern Ngumpin (ngarlaka or walu in Gurindji), contrasting with the equation of ‘hill’ and ‘stone’ in other Ngumpin-Yapa languages. Such areal features apply whether or not the words in question are loanwords (walu is an inherited Ngumpin-Yapa term, for instance). The word for ‘throw at, spear’ (luwa-) is polysemous with ‘grind’ in other Ngumpin-Yapa languages, but in Gurindji ‘sharpen’ and ‘grind’ are expressed by one verb jama-, and luwa- does not have any ‘grind’ meaning (for reasons in terms of different technologies in the desert and the Victoria River region, see McConvell & Smith 2003: 195–197). ‘Inside’ is polysemous with ‘under’ in the rest of Ngumpin-Yapa as elsewhere in the Pama-Nyungan languages of the west, but in Gurindji kaniny- only means ‘under’, and the ‘inside’ meaning is taken over by walyak, a Jaminjungan loan of ancient vintage as its adoption precedes j > y lenition, reflecting the fact that the ‘inside’/‘under’ polysemy is not found in the Non-Pama-Nyungan languages of the region.

13. Linguistic diffusion in Australia

There has been great debate over linguistic diffusion in Australia (McConvell 2009a and references there). Apart from Dixon’s (2001, 2002) challenge to the applicability of the comparative method, supposedly entailed in the long-term diffusion on the continent, he has also pursued a notion of “equilibrium” sharing of vocabulary close to 50% which he claims is found between languages which have been adjacent to each other for a long time, irrespective of their linguo-genetic affiliation. This has been in turn challenged (Alpher & Nash 1999; Black 2006; Sutton & Koch 2008) with claims that shared vocabulary rarely rises above 25% in such cases. Gurindji and other Eastern Ngumpin languages are an exception on the high side, moving towards the 50% predicted by Dixon. In this they contrast with the reported much lower levels of borrowing between the Non-Pama-Nyungan languages in the “Top End” north of the Victoria River District (Harvey 2009+) and between Pama-Nyungan and Non-Pama-Nyungan languages in the Western Kimberleys (Bowern 2006a). Interestingly, the other well-known exception is the Yolngu language Ritharrngu and its Non-Pama-Nyungan neighbor Ngandi: Heath (1978: 29; 1981) finds 50% shared vocabulary in a 1300 word list of nouns and verbs. This would be accounted for entirely by borrowing in one direction or the other (or both from a third source) since the languages are linguo-genetically so distant as to be treated as unrelated for practical purposes. In the “core” domain of body-parts, where loans are less expected, levels of sharing are still fairly high: 18 out of 70 or 26%. Harvey (1997: 184) claims that there were special circumstances in the Ritharrngu-Ngandi case which explain the high numbers: that one language (Ritharrngu) has and had a much higher population than Ngandi had, and that all (rather than some) segments of the Ngandi population had close contact with Ritharrngu.
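The percentages quoted here are simple proportions of shared items in a word list. A minimal sketch of the arithmetic, with a hypothetical helper name `shared_percentage`:

```python
def shared_percentage(shared: int, total: int) -> float:
    """Percentage of items in a word list that two languages share."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * shared / total


# Body-part figures cited above: 18 shared items out of 70.
body_parts = shared_percentage(18, 70)
```

Eighteen of 70 body-part terms comes to about 25.7%, which rounds to the 26% cited.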
Heath (1978: 123–124) also claims that there is no significant use of language as emblematic of social identity among the Non-Pama-Nyungan languages in Eastern Arnhem Land and this produces tolerance towards borrowing, but there is such a function in the Pama-Nyungan languages (e.g. Yolngu), where “clan-language” identity is significant. How this plays out in a situation of contact between the two types of language and language ideology is not explored. Situations where grammatical borrowing and convergence may occur but a separate lexicon is maintained for social identity reasons have been claimed for Australia (e.g. Rigsby 1997) and elsewhere. In Gurindji there do not appear to be heavy social constraints restricting borrowing. One factor that perhaps should be taken into account is a geographical/demographic one, as discussed by Harvey: at the time of the major borrowing from Non-Pama-Nyungan languages into Gurindji-Birlinarra-Ngarinyman (Victoria River Eastern Ngumpin) this language was surrounded on three sides by Non-Pama-Nyungan languages and there was intimate contact between all these languages. It is difficult to assess if Harvey’s hypothesis about population disparity applies in this case, especially if the number of Non-Pama-Nyungan speakers was eroded over time by language shift to Victoria River Eastern Ngumpin. Linked with the last point is the possibility that the high numbers of loans in Gurindji are associated with past language shift by Non-Pama-Nyungan speakers to Pama-Nyungan: part of the borrowed vocabulary is in fact substratal rather than


adstratal, and having both kinds of non-inherited items elevates an aggregated loanword percentage. Bio-genetic research could help with determining levels of language shift, but data is not available for these areas of Australia. We should also investigate in more detail how we can distinguish between substratal and adstratal contact phenomena on linguistic grounds following up Thomason & Kaufman’s (1988) suggestions (e.g. the phonological and semantic adaptations of Gurindji to Non-Pama-Nyungan may suggest a historical background involving relatively rapid language shift by some Non-Pama-Nyungan speakers to Victoria River Eastern Ngumpin).

14. Conclusion

Gurindji, a Pama-Nyungan language in the northernmost “bulge” of Pama-Nyungan in the central Northern Territory of Australia, is surrounded by Non-Pama-Nyungan languages of three families, in the north and north west. Although Gurindji is today physically separated from the Non-Pama-Nyungan languages by some distance, the intervening languages Birlinarra and Ngarinyman are very closely related in the Victoria River Eastern Ngumpin subgroup and would have been a single language in the relatively recent past. There has been large-scale lexical borrowing from all the Non-Pama-Nyungan languages into Gurindji-Birlinarra-Ngarinyman, accounting for at least 28% of the vocabulary looked at here. Major components of this include words related to the riverine environment: this is argued to reflect a move of the Eastern Ngumpin languages from the semi-desert to the south-east into the riverine region to the north, perhaps 2000 years ago or so. Another major loan component from Non-Pama-Nyungan languages is coverbs, the main semantic element of complex verbs. As in other regions of the world, here the existence of independent uninflecting verbal components like coverbs is both stimulated by and drives a high-borrowing regime in a linguistic area, and in this case causes a convergence of the complex verb syntax to the Non-Pama-Nyungan type, which is more analytical (loose nexus). However, there is little other grammatical borrowing or convergence, and the Pama-Nyungan languages like Gurindji remain very different grammatically from their Non-Pama-Nyungan neighbors. Apart from the Non-Pama-Nyungan languages, Gurindji has significantly borrowed from the closely related Western Ngumpin languages, in a process resembling ‘dialect mixing’ but more structured, as most of the flow has gone from west to east rather than randomly in all directions.
This analysis may be controversial as one might argue alternatively that what has happened here, at least partially, is gradual lexical diffusion of a sound change through the lexicon. If the latter proposition is true it may reduce the Western Ngumpin loan contribution somewhat but the overall level of borrowing, not counting the small number of English and Pidgin/Kriol loans, is not going to be less than 35%, and may be as much as 40%.


In terms of the debates about levels of borrowing in Australia, Gurindji is a language where lexical borrowing levels approach Dixon’s “equilibrium” levels of 40–50%. This does not show that the “equilibrium” model is correct for all of Australia, but it does call for an explanation of cases like this where the borrowing levels are relatively high. In this case, the model of language spread from the south could well have a language shift component – Gurindji-Birlinarra-Ngarinyman being adopted by erstwhile speakers of Non-Pama-Nyungan languages. If this is so, Non-Pama-Nyungan would have entered via the substratum, particularly in the case of Jaminjung-Ngaliwurru, as well as being augmented by adstratal loans at that time and later, bringing the overall indigenous language loanword percentage to a relatively high level.

Acknowledgments

Part of the work on this paper was funded by NSF grant BCS-0902114 “Dynamics of Hunter-Gatherer Language Change”. Thanks also to the Australian Institute of Aboriginal and Torres Strait Islander Studies and the Max Planck Institute for Evolutionary Anthropology for assistance in attending meetings at Leipzig and support for work on the project.

References Akerman, Kim & Stanton, John. 1994. Riji and Jakuli: Kimberley Pearl Shell in Aboriginal Australia. Darwin: Northern Territory Museum and Art Gallery. Alpher, Barry. 2004. Pama-Nyungan: Phonological reconstruction and status as a phylogenetic group. In Bowern, Claire & Koch, Harold (eds.), Australian Languages: Classification and the Comparative Method, 93–126. Amsterdam: Benjamins. Alpher, Barry & Nash, David. 1999. Lexical replacement and cognate equilibrium in Australia. Australian Journal of Linguistics 19:5–56. Bavin, Edith. 1989. Some lexical and morphological changes in Warlpiri. In Dorian, Nancy (ed.), Investigating obsolescence: Studies in language contraction and death, 267–286. Cambridge: Cambridge University Press. Black, Paul. 2006. Equilibrium theory applied to Top End languages. In Allan, Keith (ed.), The 2005 Conference of the Australian Linguistic Society. . Bowern, Claire. 2006a. Another look at Australia as a linguistic area. In Matras, Yaron & McMahon, April & Vincent, Nigel (eds.), Linguistic Areas: Convergence in Historical and Typological Perspective, 244–265. Palgrave Macmillan. Bowern, Claire. 2006b. Australian complex predicates. In Proceedings of the 32nd Berkeley Linguistics Society, Vol. 2. Bowern, Claire & Koch, Harold. 2004. Australian Languages: Classification and the Comparative Method. Amsterdam: Benjamins.

Patrick McConvell

Dalton, L. & Edwards, S. & Farquharson, R. & Oscar, S. & McConvell, P. 1995. Gurindji Children’s language and language maintenance. International Journal of the Sociology of Language 113:83–96.
Dixon, R. M. W. 1980. The Languages of Australia. Cambridge: Cambridge University Press.
Dixon, R. M. W. 2001. The Australian linguistic area. In Dixon, R. M. W. & Aikhenvald, A. (eds.), Areal diffusion and genetic inheritance: Problems in comparative linguistics, 64–104. Oxford: Oxford University Press.
Dixon, R. M. W. 2002. Australian Languages, their Nature and Development. Cambridge: Cambridge University Press.
Evans, Nicholas (ed.). 2003. The non-Pama-Nyungan languages of northern Australia: Comparative studies of the continent’s most linguistically complex region. (Pacific Linguistics). Canberra: Australian National University.
Harris, John. 1986. Northern Territory Pidgins and the Origin of Kriol. (Pacific Linguistics Series C-89). Canberra: Pacific Linguistics.
Harvey, Mark. 1997. The temporal interpretation of linguistic diversity in the Top End. In McConvell, Patrick & Evans, Nicholas (eds.), Archaeology and Linguistics: Aboriginal Australia in Global Perspective, 179–186. Melbourne: Oxford University Press.
Harvey, Mark. 2008. Proto Mirndi: A discontinuous language family in Northern Australia. Canberra: Pacific Linguistics.
Harvey, Mark. 2009+. Hunter-gatherer languages in global historical perspective. In Güldemann, Tom & McConvell, Patrick & Rhodes, Richard (eds.), Proto Mirndi: A discontinuous language family in Northern Australia. Cambridge: Cambridge University Press.
Heath, Jeffrey. 1978. Linguistic diffusion in Arnhem Land. Canberra: AIAS.
Heath, Jeffrey. 1981. Language 57:335–367.
McConvell, Patrick. 1988a. Mix-im-up: Aboriginal code switching, old and new. In Heller, M. (ed.), Codeswitching: Anthropological and Sociolinguistic Perspectives, 97–149. Berlin: Mouton de Gruyter.
McConvell, Patrick. 1988b. Nasal Cluster Dissimilation and constraints on phonological variables in Gurindji and related languages. Aboriginal Linguistics 1:135–165. Reprinted in Kreidler, C. (ed.). 2001. Phonology: critical concepts, 300–344. London: Routledge.
McConvell, Patrick. 1996. The Functions of Split-Wackernagel Clitic systems: Pronominal clitics in the Ngumpin languages. In Halpern, A. & Zwicky, A. (eds.), Language in Australia, 299–332. Stanford: CSLI.
McConvell, Patrick. 1997a. Long lost relations: Pama-Nyungan and Northern kinship. In McConvell, Patrick & Evans, Nicholas (eds.), Archaeology and Linguistics: Aboriginal Australia in Global Perspective, 206–236. Melbourne: Oxford University Press.
McConvell, Patrick. 1997b. The Semantic Shift between “Fish” and “Meat” and the prehistory of Pama-Nyungan. In Walsh, M. & Tryon, D. (eds.), Boundary Rider: Essays in Honour of G. N. O’Grady, 302–325. Canberra: Pacific Linguistics.

31. Loanwords in Gurindji

McConvell, Patrick. 2002. Linguistic Stratigraphy And Native Title: The Case Of Ethnonyms. In Henderson, J. & Nash, D. (eds.), Linguistics and Native Title, 261–292. Canberra: Aboriginal Studies Press.
McConvell, Patrick. 2003. Headward migration: A Kimberley counter-example. In Evans, N. (ed.), The Non-Pama-Nyungan languages of northern Australia: comparative studies of the continent’s most linguistically complex region (Pacific Linguistics 552), 75–92. Canberra.
McConvell, Patrick. 2004. A Short Ride On A Time Machine: Linguistics, Culture History And Native Title. In Toussaint, S. (ed.), Crossing Boundaries: Cultural, legal, historical and practice issues in native title, 34–49. Melbourne: Melbourne University Press.
McConvell, Patrick. 2008a. Language mixing and language shift in indigenous Australia. In Wigglesworth, G. & Simpson, J. (eds.), Children and language in multilingual communities, 205–225. London: Continuum.
McConvell, Patrick. 2008b. ‘Reversed Change’ Dialect Borrowing Versus Gradual Lexical Diffusion Shaped By Frequency As Explanation Of Variability. Paper to Methods in Dialectology conference, Leeds, UK.
McConvell, Patrick. 2008c. Granddaddy morphs: The importance of suffixes in reconstructing Pama-Nyungan kinship. In Bowern, Claire & Evans, Bethwyn & Miceli, Luisa (eds.), Morphology and language history: In honour of Harold Koch (Amsterdam Studies in the Theory and History of Linguistic Science, Series IV. Current Issues in Linguistic Theory 298), 313–327. Amsterdam/Philadelphia: John Benjamins.
McConvell, Patrick. 2009a. Contact and indigenous languages in Australia. In Hickey, R. (ed.), Handbook of Language Contact, 298. Oxford: Blackwell.
McConvell, Patrick. 2009b. Where the spear sticks up: The variety of locatives in place names in the Victoria River District. In Hercus, L. & Koch, H. (eds.), Place names old and new. Canberra: Pacific Linguistics.
McConvell, Patrick & Laughren, Mary. 2004. Ngumpin-Yapa Languages. In Bowern, Claire & Koch, Harold (eds.), Australian Languages: Classification and the Comparative Method, 151–178. Amsterdam: Benjamins.
McConvell, Patrick & Meakins, Felicity. 2005. Gurindji Kriol: A mixed language emerges from code-switching. Australian Journal of Linguistics 25(1):9–30.
McConvell, Patrick & Simpson, Jane. 2008. Moving Along The Grammaticalisation Path: Locative And Allative Marking Of Non-Finite Clauses And Secondary Predications In Australian Languages. Nijmegen, Netherlands: Paper to PIONIER Workshop on Locative Case, 25th–26th August, 2008.
McConvell, Patrick & Smith, Michael. 2003. Millers and mullers: The archaeolinguistic stratigraphy of technological change in Holocene Australia. In Andersen, H. (ed.), Language Contacts in Prehistory: Studies in stratigraphy, 177–200. Amsterdam: John Benjamins.

McConvell, Patrick & Thieberger, Nicholas. 2005. Languages past and present. In Arthur, Bill & Morphy, Frances (eds.), Macquarie Atlas of Indigenous Australia, 78–87. North Ryde: Macquarie Library.
McConvell, Patrick & Thieberger, Nicholas. 2006. Keeping track of language endangerment in Australia. In Cunningham, D. & Ingram, D. & Sumbuk, K. (eds.), Language Diversity in the Pacific: Endangerment and Survival, 78–87. Clevedon: Multilingual Matters.
Meakins, Felicity. 2007. Case-marking in Contact: The development and Function of Case Morphology in Gurindji Kriol, an Australian mixed language. Ph.D. thesis. University of Melbourne.
Mühlhäusler, Peter. 1991. Overview of the pidgin and creole languages of Australia. In Romaine, Suzanne (ed.), Language in Australia, 159–173. Cambridge: Cambridge University Press.
Mühlhäusler, Peter. 1996. Post-contact languages in mainland Australia after 1788. In Wurm, Stephen & Mühlhäusler, Peter & Tryon, Darryl (eds.), Atlas of languages of intercultural communication in the Pacific, Asia and the Americas, Vol. 2, 11–16. Berlin: Mouton de Gruyter.
Munro, Jennifer. 2000. Kriol on the move: A case of language shift and language spread in Northern Australia. In Siegel, Jeff (ed.), Processes of Language Contact: Studies from Australia and the South Pacific (Fides), 245–270. University of Montreal Press.
Munro, Jennifer. 2004. Substrate language influence in Kriol: The application of transfer constraints to language contact in Northern Australia. Ph.D. thesis. University of New England, NSW, Australia.
Nash, David. 2008. Warlpiri verb roots in comparative perspective. In Bowern, Claire & Evans, Bethwyn & Miceli, Luisa (eds.), Morphology and Language History: In honour of Harold Koch, 221–234. Amsterdam: John Benjamins.
O’Grady, Geoffrey N. & Hale, Kenneth. 2004. The coherence and distinctiveness of the Pama-Nyungan language family within the Australian linguistic phylum. In Bowern, C. & Koch, H. (eds.), Australian Languages: Classification and the Comparative Method, 69–92. Amsterdam: Benjamins.
O’Grady, Geoffrey N. & Voegelin, Carl F. & Voegelin, Florence M. 1966. Languages of the world: Indo-Pacific fascicle 6. Anthropological Linguistics 8:1–199.
O’Shannessy, C. 2005. Light Warlpiri: A new language. Australian Journal of Linguistics 25(1):31–57.
O’Shannessy, Carmel. 2006. Language contact and child bilingual acquisition: Learning a mixed language and Warlpiri in Northern Australia. Ph.D. thesis. Max-Planck Institute of Psycholinguistics, Nijmegen/University of Sydney.
Phillips, Betty. 2006. Word frequency and lexical diffusion. Palgrave MacMillan.
Rigsby, Bruce. 1997. Structural parallelism and convergence in the Princess Charlotte Bay languages. In McConvell, Patrick & Evans, Nicholas (eds.), Archaeology and Linguistics: Aboriginal Australia in Global Perspective, 169–178. Melbourne: Oxford University Press.

Rumsey, Alan. 1993. Language and Territoriality in Aboriginal Australia. In Walsh, M. & Yallop, C. (eds.), Language and Culture in Aboriginal Australia, 191–206. Canberra: Aboriginal Studies Press.
Siegel, Jeff (ed.). 2000. Processes of language contact: Studies from Australia and the South Pacific. Quebec: Fides.
Siegel, Jeff. 2008. The Emergence of Pidgin and Creole Languages. Oxford: Oxford University Press.
Sutton, Peter & Koch, Harold. 2008. Australian Languages: A singular vision. Journal of Linguistics 44:471–504.
Thomason, Sarah & Kaufman, Terrence. 1988. Language Contact, Creolization and Genetic Linguistics. Berkeley: University of California Press.
Wichmann, Søren & Wohlgemuth, Jan. 2008. Loan verbs in a typological perspective. In Stolz, Thomas & Bakker, Dik & Salas Palomo, Rosa (eds.), Aspects of Language Contact: New Theoretical, Methodological and Empirical Findings with Special Focus on Romancisation Processes, 89–121. Berlin/New York: Mouton de Gruyter.

Paper dictionaries and wordlists

Chadwick, Neil. 1975. A descriptive study of the Djingili language. Canberra: Australian Aboriginal Studies.
Kofod, Frances. n.d. Miriwung wordlist. Unpublished manuscript.
Merlan, Francesca. 1994. A Grammar of Wardaman: A Language of the Northern Territory. Berlin: Mouton de Gruyter.
Nordlinger, Rachel. 1998. A Grammar of Wambaya: A language of the Northern Territory of Australia. (Pacific Linguistics C.140).
Richards, Eirlys & Hudson, Joyce. 1990. Walmajarri-English Dictionary. Darwin: SIL-AAB.
Wrigley, Matthew. 1992. Jaru Dictionary. Halls Creek: Kimberley Language Resource Centre.

Digital dictionaries and wordlists

ASEDA = Aboriginal Studies Electronic Data Archive. n.d. Canberra: AIATSIS.
Breen, Gavan. n.d. Warluwarra wordlist.
Cataldi, Lee. n.d. Ngardi vocabulary and notes. ASEDA 0737.
Green, Rebecca & others. n.d. Mudburra vocabulary. ASEDA 0699.
McConvell, Patrick. 2006. Gurindji dictionary.
McNair, Norman & McNair, Helen. 1984. Gurindji vocabulary. ASEDA 0207.
Schultze-Berndt, Eva. n.d. Ngarinyman lexicon.
Schultze-Berndt, Eva. n.d. Jaminjung and Ngaliwurru vocabulary. ASEDA 0740.
Simpson, Jane. 2004. Warumungu vocabulary.
Warlpiri Dictionary Project. 1997. Warlpiri Dictionary. ASEDA 50.

Loanword Appendix English witing

patik takman puliki kaapikaapi pikipiki kilpukut mirrijin

pilayit tina japa pipa pang panana paki-paki

warukku wumara tuwa wajintayi mamam pakapman

betrothed bride, wedding (rare) paddock, fence stockman, cowboy cattle calf pig goat traditional or modern medicine plate lunch supper pepper bed banana adze made from motorcar spring money for work shop, store Monday refuse government

maarn ngapurung

purtuj kampajipu(n) pa-

yarrulan wamala

ngapa karlaj kaku

kartpiralang juru jurlak kalpun ngarnalang

Jaminjung kurlwa warranginy palangarri

talukurru jarriny wangkuwala pinka kulumarra wurlngarn turlurlup

karrngan wumaj

wet mud cliff, precipice black soil plain, instant coffee hole, valley cave salt water river, creek sky sun thunder, be thundering (cover) dew wind, hunting on windy day

wangkuriny kuljara murrunkirn yawu parnngirri wutu tarla

jaalij jungkuwurru kartpi ngurrurn yimitjimitji nyimij

cloud(s) steam, fragrance from cooking burn put out fire, switch off light, appliance, staunch flow of blood (with ma-verb) young man girl, young woman without children elder brother younger sibling father’s father, father’s father’s sibling long-haired, sheep nest bird (generic) hawk (generic) Sulphurcrested cockatoo; group of these or little corellas crow dove water-rat fish bark, shell, peel head-louse wax, primarily bees, also spinifex freshwater crayfish echidna hair, fur pubic hair eyebrow blink, wink

kajurta kurtpu jiminkirt ngalyak payatirrirn ngayatarlarlap karriparaj puparaj wani-

tampang tampang nyiny ngatampang pajipij yuwa-

paturu mangarri

ngirljik ngapurrngpurrngkarra karrilurlupkarraaji ngamayak

forearm calf (of leg) kidney lick fart shiver, shake find, beget, give birth to find (be found), be born dead dead drown kill bury (dead person or anything) scar, cicatrice food in general, vegetable food swallow boil

spoon wet dough, diarrhea pikurta bush potato type, yam sp. kulpap karribe in a bunch mirlarrang spear, beer wirriji hair string belt tartij headband nyumurli soap palkin blanket, cover lurluwaji chair, anything for sitting kurrijkarra pu- dig parnnga the bark nurlunurlu fold yuwatipit matie kataj pacut pakirr pabreak lak pasplit at end yurr marub

yirr manut papapirrka majarriny pirrka mamurlukurn yingin yuwa-

wirrminy karrinurlu yuwajak majap mayingin palilaj yatijkurlp patiwu yamingip yajalurlk karritipart waniwarl wanyjawalyak yajuluj katurt mawarra kajal yuwawartuj majalarr yuwajarrmip karriwalyak kutij karrijurtat makamurra jampukarra tarlukurru palki -marraj yurrk mapurrp purrp ma-

pull, pull out press down on make, fix make a hollow bottle, glass make something move (parts of itself) turn around wrap drop catch, grab move, shake, disturb swim splash fly crawl crouch jump flee, run away from fight go inside carry under arm hold take care of, look after save, rescue lose join, fix on, hire share together inside to stand pile up middle left-handed deep hole flat like track, tell, count finished, completely, all finish

parunga

-mirntij ngapuk karringapuk mangapukarriny

kurru nyayilying mirlinyinyip karrimirlinyinyipkaji puny mawilngwilng majalungurru nyirn malimpal ngurrku pawayarrap mapampaya yangki pakamaliwang kuturu pujarl ma-

wuyurrurnkarraaji wuyurrurn nguru yangkarrp yatiwuwaji tirrkkaji

hot season (Sept-Dec), year season smell sniff, smell sweetsmelling, soap hear, listen noise shine

murrupa-k karrimalyju

shining, bright kiss want, desire pretty, handsome lose, forget private, secret suspect, blame shout, call out speech, harangue ask the stranger club, large hula-hula give up (anything, e.g. fight), rest fisher

papa

fishing line tobacco, bait hunting (kangaroo) airplane policeman

Jaru jurtu kiki yarti murrupa

dust, dustcloud star, ornament shade, shadow hail, also used for ice and snow not naturally found in region

ngaji kajirri

kurturtu

kapuku mukurl kurriji

yawarta-wu marru yawarta kuljany ngirlkirri laja jamana luju ngapurlu puju kurlpak yuwamakin karripuka jurrulungku marru yuka wirlka kurrupartu

lajap kajapiyapi yapawurru kujarra murrkun

freeze, turn to ice boy, male (human or animal) father, father’s brother, god old woman, married woman woman’s child, man’s sister’s child brother, elder brother sister, older sister father’s sister husband’s mother, son’s wife horse stable horse snake (generic) neck, throat, voice shoulder(blade) shoe, foot, leg heel breast, milk vulva vomit lie, sleep rotten, stinking, pus trousers, thigh house, station homestead grass axe (steel) boomerang (generic), ‘killer’ boomerang carry on shoulder end, top small two three, a few

kujarra-kujarra murrkunpurru kujarrawurt kaputa tapu tupurrng ngarrka mawaku mirlimirli jaliji mirta tarruku lajapkaji yartiwaji -kari Jingulu pakara kajupari wajija wajija yajirtart karringuwajkarra Karranga nalija Kija kirrim kampa-

kirrim-kaji jarrakap maKriol wirriman yuwamintimkaji

mintim maniil waruk payim ma-

four a few twice night blunt warm, hot recognize, understand or, also exclamation ‘well’ paper, book friend, age-mate shield secret/sacred bicycle picture, movie other

jalim jayiriitim majangayi

sell read slingshot

parr wani-

(bird, airplane) land wirnarn jayitrade, exchange gifts Miriwung lurlu karrisit jakiliny moon, month, warnparlk ma- open snail jipurt mashut, stop ngarin animal, meat, wurlaj yuwahide cattle narra hook warlaku dog kurrunyung sphere nyili scale murr karricease (talking, yartupkarrawaji kneeling creafighting, ture, camel raining) pirlpirlji grasshopper tirrip overnight, day (generic) and (when a species counting) ngalja frog (generic) karrikurnyellow kirrawa goanna karrikurn (generic) kutukutu sharp warting collar-bone pantij wet outside, open kuni madream yayip malaugh place taruk wanibathe wukarra fear, afraid near yalpuyalpu feel strong and yijarni true quickly, fast, karrihealthy kulanypung wrong, immediately rumpa swelling, mistaken go quickly swell up ngalking greedy hate, dislike, yikarrp mascratch kanung easy, handy be angry with purrku tired kiyakiyap ma- whisper jealous kurrku earth-oven wuruny mawhistle parlak yuwamix jatajatajkarra write nampula bush fig tree yuwatea and fruit tuturrpkaji pen karrikurn yellow ochre, waringarri raiding-party, yellow, yolk soldiers, light a fire, wapaapa clothes army, war strike a warampurr vine type larrmij bundle of match tinung sap, blood weapons the match wood sap (spears/ speak, talk turrp pupoke, pierce, boomerangs) stab pululuj waniattack jirrip parip, tear purruly miss put clothes on wuyi mastretch jawurra masteal palkin yuwaspread out jawurraaji thief person who (covering) wilmurr wire, wire sews and tartajkarraaji blacksmith spear, makes warrkap dance telephone clothes wanyjakurririj car sew partaj yaclimb, go up turrp piercing, needle warrij yadepart, go away injection work jalak yuwasend pay, buy

Mudburra lamparra warrija malumpa kamamurru kurranyku japirri ngunyju tuwa yamumpung yintij mayawaliwali

father-in-law, son-in-law crocodile liver blind thirsty knife, ‘sorry cut’ (European) tobacco arrive, meet black pinch talking/acting peacefully, exclamation ‘stop fighting!’

karnti mum luku nungkiying kuyarru yingkuwarl muntarla ngurrakin laparn wartarn jupak yuwajawul walywalyarra karripurrngip mangarturr pujarl kartak warrpa

Ngardi kirturlk makirturlkpari kirlka

bend crooked, bent clean

Ngarinyman ngawirlangpirlangmulung ngawirlangpirlangjawung

without waves having waves

Unknown Pama-Nyungan source marang

mangurlu

tarl pajuly majulurl wuyatarap waniwapaja karrijurlkap yuwawarrkuj mawumara mawumara jangkarni timpak ngampukuju

top grindstone

mirlirri parnkarrang

baby, little fever with shivering cold, school

munpa

Walmajarri

Ngaliwurru yanturri

talwirr yuwa- hang up wumpalp karri- float kirr kathink, remember

waterfall, also (?primarily) specific place tree, stick, wood dark, be dark, get dark married relatives, family tawny frogmouth nits scorpion dingo shoulder blade hand, arm, finger spit dribble

snore pregnant, heavy lazy, tired billycan bat’s wing, dress, flyingfox

yapakaru makurrmakurr makurru

wuma

Wardaman langkana wumara janginyina kaparru yipu

wangu karliyinta warrpawurru murlukurr

walngirn nungkuru lipi majul

billabong stone, rock, money lightning fog, smokehaze heavy rain, wet season (rain also = water) widow widower flying fox ‘devil-lion’, a legendary creature, can also be used of lion fly elbow fingernail, toenail, claw stomach (external and internal)

cereal seed, including wild sorghums pound squeeze, wring pour, spill sink, dive disappear push get, pick up get money big money full boss, right person, culprit walking stick murderer, murder murdering sorcery, sorcerer string of clouds signaling death

Warlpiri karli kujarra minija

twins cat

Warluwarric junpa

song (mainly traditional)

Warumungu jaju

murrkartu

mother’s mother, mother’s mother’s sibling hat

Yanyuwa yunpa-

sing

Unknown origin ngarlaka purrkiji

head, hill Black cormorant

jika

jitji jarriny wartarn majul jamana ngamayi lupu kura ngaya-

beard; Brushtailed ratkangaroo and decorations made from its tail tip nostril palm of hand the toe

nakurr kartaj majikurnu kurlartarti

jarnpirta guts shit

grave, earthoven choke lower grindstone bush “orange”, sometimes used for orange mushroom, a kind of ink-cap

jikala

takujkarra yawumara yapawurru kijijiyawung

spinifex wax (used for joining things) limp little money [wumara is Wardaman] having a point

Chapter 32

Loanwords in Yaqui, a Uto-Aztecan language of Mexico*

Zarina Estrada Fernández

1. The language

Yaqui (ISO 639-3: yaq) belongs to the Tara-Cahitan branch of the Uto-Aztecan family (called Uto-Nahuatl in the linguistic literature in Spanish). This branch comprises two subgroups: Tarahumara (which includes Tarahumara and Guarijio) and Cahita (which includes Yaqui-Mayo and the extinct Tehueco). The Uto-Aztecan family now extends north to the state of Idaho in the United States and south to the Mexican states of Tlaxcala, Veracruz, Guerrero, Morelos, Mexico, Michoacan, and Hidalgo. Until the 1970s a Uto-Aztecan language, Pipil, was also spoken in El Salvador (see Campbell 1985). Studies of the genealogical relationships among the Uto-Aztecan languages were first conducted by Sapir (1913, 1915) and Whorf (1935, 1937) and later by Voegelin et al. (1962), Miller (1967), Campbell & Langacker (1978), Kaufman (1981), Manaster-Ramer (1992, 1993), and Dakin (1996). The most recent classification of the Uto-Aztecan languages was provided by Dakin (2004); see Figure 1.

Yaqui is spoken in eight traditional towns along the Yaqui river in Sonora, Mexico, as well as in Tucson, Arizona (United States). The presence of Yaqui speakers in the U.S. is the result of migration during the late 19th century and early 20th century. The language is spoken by about 15,000 people in the western central part of Sonora and by about 2,000 in Arizona. The data in the Yaqui subdatabase are from the Sonoran dialect, spoken at Vicam pueblo, one of the eight traditional Yaqui towns.

Yaqui is probably the most studied (and best studied) language of Sonora. Most of the studies, however, have focused on Arizona Yaqui, e.g. Dozier (1956), Lindenfeld (1971, 1982), Escalante (1988, 1990), Jelinek (1998), Jelinek & Escalante (1988, 2000), Demers et al. (1999), and Molina et al. (1999). Studies on Sonoran Yaqui include Johnson (1943, 1962), Dedrick (1977), Silva Encinas et al. (1998), Dedrick & Casad (1999), Estrada Fernández et al. (2004), and Guerrero (2004).

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Estrada Fernández, Zarina. 2009. Yaqui vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1384 entries.


Uto-Aztecan (Uto-Nahua)
  Northern Uto-Aztecan
    Numic
      Western: Mono, Northern Paiute
      Central: Tümpisa Shoshoni (Panamint), Gosyute, Shoshone, Comanche
      Southern: Kawaiisu, Ute (Chemehuevi, Southern Paiute)
    Tübatulabal
    Hopi
    Takic
      Cupan: Cahuilla, Cupeño, Luiseño
      Serrano: Serrano, Gabrielino-Fernandeño
  Southern Uto-Aztecan
    Tepiman
      Piman: Pima-Tohono O'odham, Pima Bajo or Nevome †, Pima Bajo or Mountain Pima
      Tepehuan: Northern Tepehuan, Southern Tepehuan, Tepecano †
    Opata-Eudeve: Opata †, Eudeve †
    Tubar (extinct)
    Tara-Cahitan
      Tarahumaran: Tarahumara, Guarijio
      Cahitan: Yaqui-Mayo, Tehueco †
    Corachol-Nahuatl
      Corachol: Cora, Huichol
      Nahuatl

Figure 1: The classification of Uto-Aztecan languages (Estrada, taking into consideration, in the order given: Dakin 2004; Campbell 1997, 1985; and Miller 1983)

The first contact of the Yaqui with Spaniards goes back to 1523, when Diego de Guzmán tried to conquer the Yaqui, but it was only after 1617, when the Spaniards founded the first six of the eight Yaqui towns, that contacts became intensive. The eight traditional towns now form one of the most important features of Yaqui identity and give the Yaqui a sense of community. Throughout four centuries of Spanish occupation, the Yaqui were known as a proud group who fiercely defended their sovereignty. They staged more revolts against the Spanish or Mexican government than any other indigenous group. Between 1608 and 1929, there were at least five known revolts. As a consequence of these revolts, during the government of Porfirio Diaz, a group of almost 6,500 members of this ethnic group were exiled to Yucatan to serve as laborers in agave plantations; many of them later returned to their homeland.

The Yaqui are one of several indigenous groups of Mexico that even today do not allow audio or video recordings or photographs to be made during traditional religious ceremonies, especially those that take place during Holy Week. The Yaqui owe the Spaniards, in particular the Jesuits, their strong sense of identity. Historical documents mention that the Yaqui were always considered an independent nation; during the colonization of Baja California, they played an important role helping the Spaniards in military actions. Traditional Yaqui social and religious organization resembles a confraternity, a system that makes them see themselves as guardianes de ese compromiso (‘keepers of a commitment’; see Juramento Yaqui in Estrada Fernández et al. 2004).

2. Sources of data

The initial source of the data in the subdatabase was the Yaqui-Spanish dictionary (Estrada Fernández et al. 2004). An early book about Yaqui (Johnson 1962) and a dictionary of Arizona Yaqui (Molina et al. 1999) were also used. Entries in the Yaqui subdatabase were provided largely by Crescencio Buitimea Valenzuela, born around 1970, who is fluent in Yaqui, has a Bachelor’s degree in Linguistics, and has considerable knowledge of Yaqui traditional culture. Additional entries were provided by Melquiades Bejipone Cruz, born in 1981, who is also fluent in Yaqui. Other data in this study are the result of more than ten years of my own documentation of the language, mainly for the purposes of language maintenance, in collaboration with speakers of the Yaqui community in Mexico. Various bibliographic references were used to verify phonological variation in loanwords and to check which loanwords are historically attested: Buelna (1989), Dedrick (1946, 1977, 1985), Dedrick & Casad (1999), Dozier (1956), Escalante (1988), Johnson (1943), Kurath & Spicer (1947), Lindenfeld (1971, 1973, 1982), and Spicer (1943).

Acculturation patterns among the Yaqui, and in particular borrowing, have been extensively discussed in the linguistic literature. On the one hand, some writers consider the Yaqui and their language as “profoundly affected” by Spanish (Johnson 1943: 434), as a group with a “permissive” attitude towards external language materials (Dozier 1956), and as a “loosely knit community” (Aikhenvald 2006: 39). On the other hand, authors like Dedrick (1977: 144) argue that “the sampling of the data [in Johnson’s study] was inadequate and not representative of the language”, a comment that draws attention to the methodology involved in this kind of study. The present study provides another view of this language. For example, the word maala ‘mother’, previously classified by Johnson (1943: 431) as a loanword from Spanish, has been identified by Miller (1967: 25) as a reflex of Proto-Uto-Aztecan *mal, *ma, with cognates in other Uto-Aztecan languages, such as Northern Tepehuan mára ‘offspring’, Tarahumara mará ‘daughter (of a man)’, and Hopi má!na ‘girl’, so this etymon is no longer considered a borrowing in the present study.

Map 1: Geographical setting for Yaqui

3. Contact situations

Yaqui is one of over 60 vernacular languages spoken in Mexico. According to Orozco y Berra (1864) and Pimentel (1874), during the first half of the 19th century at least seven distinct languages were spoken in Sonora: Opata (Tegüima), Eudeve (Heve), Nevome, Nebome (Pima), Jova, Varogio, Yaqui, Mayo, and Seri, the last being the only non-Uto-Aztecan language. Orozco y Berra in particular also remarks that in those days almost any small town or municipality was inhabited by speakers of distinct languages, in some cases up to five different languages. Such a historical description implies a complex sociolinguistic situation, but no records of such complexity were made. It is unclear whether speakers of Yaqui were bilingual in other languages in the pre-colonial and colonial periods. The absence of good descriptive and comparative studies of all the Uto-Aztecan languages spoken in Sonora prevents us from fully understanding borrowing patterns among these distinct languages. The topic must await future studies.

Nowadays, for all vernacular languages spoken in Mexico (including Yaqui) the contact situation is asymmetrical; Spanish is the majority language as a result of the Spanish colonization which took place after 1521. At present, all Yaqui speakers are bilingual in Spanish (in Mexico) or English (in the U.S.). The number of bilinguals probably increased in the second half of the 20th century. The domains of use of the Yaqui language in Sonora are mainly home, religion, and traditional government. Outside their own towns, the Yaqui usually use their own language only with each other and Spanish for all other purposes.

Literacy is a recent phenomenon in the Yaqui community, although there are some written texts, such as Las cartas en yaqui de Juan Banderas (cf. Dedrick 1985), written in the second half of the 19th century. Since 1973 the Yaqui have had a bilingual education program in order to teach their language at least in all the elementary schools within their communities. Yaqui teachers participate actively in teaching the language at schools, preparing writing materials, publishing a newspaper, documenting old words, and creating neologisms. The language is fighting for its survival, but faces an uphill battle against globalization and against the impact of the media: television, radio, cinema, and more recently the internet.

In addition to Spanish loanwords, Sonora Yaqui also has a small number of loanwords from other languages, including Nahuatl. Historical records mention that during the colonial period Nahuatl was widely used as a lingua franca (see Aboites 1993). Most other non-Spanish loanwords entered Yaqui via Spanish. These include the Taino, Quechua, and English words documented in the Yaqui subdatabase. However, it appears probable that Yaqui borrowed directly from Nahuatl, since the Spaniards brought Nahuatl speakers with them when they first entered the Yaqui lands.

4. Numbers of loanwords

Determining which words are loanwords in a language like Yaqui is not an easy task. Loanwords have been recognized as lexical items which are adopted or borrowed from another language as a result of direct or indirect language contact (Campbell 1999: 63). In order to identify a lexical item as a loanword, it is useful to have some historical documentation of the language under study, although it is definitely not necessary. In the study of loanwords in a language with no written tradition such as Yaqui, it is often difficult to determine whether we are dealing with borrowing or with codeswitching. That is, linguistic material originally from one language which occurs while speaking another language may reflect an ad hoc communicative need, or it may be a well-established, fully accepted and adapted loanword.¹ For this reason, for each entry marked as a loanword in the Yaqui subdatabase, a comment is provided concerning the actual use of the lexical item within a specific social context. Such comments were considered important since the subdatabase is based on a fixed list of meanings. Some of these meanings are irrelevant for the Yaqui, but many are at least partially relevant because the Yaqui live in two overlapping communities: traditional Yaqui society and modern Mexican society. Words which were judged to be ad hoc, nonce borrowings are disregarded in this chapter.

4.1. Absolute numbers, percentages by semantic fields

The Yaqui subdatabase contains 1683 entries. For 67 meanings of the Loanword Typology (LWT) meaning list there is no counterpart in Yaqui, either because they are irrelevant for the speakers (e.g. ‘the grass-skirt’, ‘the men’s house’, ‘the banyan’, ‘the boomerang’) or because there simply happens to be no counterpart (for example ‘to be’ and the pronoun ‘it’). The total number of words which are considered clearly borrowed is 387. Two words are considered perhaps borrowed from Nahuatl: te’ochia ‘to beget’ and takea ‘to hide’, and one word, miisi ‘cat’, is marked as showing “very little evidence for borrowing” since its original source is unclear: some dictionaries claim it to be from Nahuatl miztli ‘cougar, mountain lion’, but the word might have an onomatopoetic origin. As shown in Table 1, Spanish loanwords are distributed across all semantic fields except Miscellaneous function words. In fact, no loanword is recorded for the LWT meanings belonging to this field. Semantic fields with a relatively high proportion of loanwords are Animals, Food and drink, Clothing and grooming, The house, Agriculture and vegetation, Basic actions and technology, Motion, and Modern world. Semantic fields with a relatively low proportion of loanwords are Sense perception, Kinship, and Emotions and values. Of the entries in the Yaqui subdatabase, 1217 are marked as showing “no evidence for borrowing”, 387 as “clearly borrowed”, and the remainder as showing “very little evidence for borrowing”, e.g. miisi ‘cat’ from Nahuatl, rumui ‘blunt’, mamato ‘imitate’ (cf. Spanish imitar ‘to imitate’ or mimo ‘person who mimics’); as “perhaps borrowed”, e.g. te’ochia ‘to beget’, takea ‘to hire’; or as “probably borrowed”, e.g. leepe ‘the orphan’ (Spanish lépero). The entries are organized into eight distinct ages, two of which are relevant for loanwords: Colonial (1617–1899) and Modern (1900–2007). 
The Colonial age includes only loanwords, whereas the Modern age includes some loanwords as well as compound words which have probably been created recently, for example: bawe mayoa ‘the shore’, bwe’u bawe ‘the sea’, tetmajau ‘the reef’. The other six ages apply to entries considered to show no evidence for borrowing. This classification is to some extent arbitrary. The numbers of meanings associated with each age are given in Table 2.

¹ “Codeswitching is perhaps most frequently found in the informal speech of those members of cohesive minority groups in modern urbanizing regions who speak the native tongue at home, while using the majority language at work and when dealing with members of groups other than their own” (Gumperz 1982: 64).

32. Loanwords in Yaqui

Table 1: Loanwords in Yaqui by semantic field (percentages)

     Semantic field                   Loanwords from Spanish   Non-loanwords
  1  The physical world                        11.1                 88.9
  2  Kinship                                    2.7                 97.3
  3  Animals                                   26.5                 72.5
  4  The body                                   5.9                 94.1
  5  Food and drink                            34.6                 65.4
  6  Clothing and grooming                     51.1                 48.9
  7  The house                                 69.8                 30.2
  8  Agriculture and vegetation                37.6                 62.4
  9  Basic actions and technology              36.3                 63.7
 10  Motion                                    21.5                 78.5
 11  Possession                                22.3                 77.7
 12  Spatial relations                         14.7                 85.3
 13  Quantity                                  33.2                 66.8
 14  Time                                      19.2                 80.8
 15  Sense perception                             -                100.0
 16  Emotions and values                        7.1                 92.9
 17  Cognition                                 19.0                 81.0
 18  Speech and language                       27.3                 72.7
 19  Social and political relations            36.4                 63.6
 20  Warfare and hunting                       41.6                 58.4
 21  Law                                       52.6                 47.4
 22  Religion and belief                       33.6                 66.4
 23  Modern world                              83.8                 16.2
 24  Miscellaneous function words                 -                100.0
     all words                                 26.5                 73.5

Table 2: Meanings associated with ages

 Age²                                      Number of words
 Proto-Uto-Aztecan (4000 to 3031 bp)             326
 Proto-Sonoran (3030 to 2018 bp)                  81
 Proto-Tara-Cahitan (2017 to 614 bp)              68
 Proto-Cahitan (613 bp to 1616)                  309
 Colonial (1617 to 1899)                         239
 Modern (1900 to 2007)                           592

² Glottochronological dates were calculated by Søren Wichmann (personal communication, July 2008). Bp stands for “before present”.
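The figures in Table 1 lend themselves to a quick mechanical check. The following is a minimal sketch in Python, not part of the original study: the names `TABLE_1`, `loan_share`, `ranked_fields`, and `inconsistent_rows` are illustrative, and the assignment of values to rows assumes that the two fields without a loanword figure are the two whose non-loanword share is printed as 100.0. It ranks the semantic fields by their share of Spanish loanwords and lists rows whose two percentages do not add up to roughly 100.

```python
# Percentages as printed in Table 1: (loanwords from Spanish, non-loanwords).
# None stands for a cell that is empty in the printed table.
# The row alignment is an assumption of this sketch (see the lead-in).
TABLE_1 = {
    "The physical world": (11.1, 88.9),
    "Kinship": (2.7, 97.3),
    "Animals": (26.5, 72.5),
    "The body": (5.9, 94.1),
    "Food and drink": (34.6, 65.4),
    "Clothing and grooming": (51.1, 48.9),
    "The house": (69.8, 30.2),
    "Agriculture and vegetation": (37.6, 62.4),
    "Basic actions and technology": (36.3, 63.7),
    "Motion": (21.5, 78.5),
    "Possession": (22.3, 77.7),
    "Spatial relations": (14.7, 85.3),
    "Quantity": (33.2, 66.8),
    "Time": (19.2, 80.8),
    "Sense perception": (None, 100.0),
    "Emotions and values": (7.1, 92.9),
    "Cognition": (19.0, 81.0),
    "Speech and language": (27.3, 72.7),
    "Social and political relations": (36.4, 63.6),
    "Warfare and hunting": (41.6, 58.4),
    "Law": (52.6, 47.4),
    "Religion and belief": (33.6, 66.4),
    "Modern world": (83.8, 16.2),
    "Miscellaneous function words": (None, 100.0),
}


def loan_share(row):
    """Return the loanword percentage of a row, treating an empty cell as 0."""
    loans, _non_loans = row
    return 0.0 if loans is None else loans


def ranked_fields(table):
    """Return field names ordered from highest to lowest loanword share."""
    return sorted(table, key=lambda field: loan_share(table[field]), reverse=True)


def inconsistent_rows(table, tolerance=0.15):
    """Return fields whose two percentages differ from 100 by more than tolerance."""
    return [
        field
        for field, row in table.items()
        if abs(loan_share(row) + row[1] - 100.0) > tolerance
    ]


if __name__ == "__main__":
    for field in ranked_fields(TABLE_1)[:5]:
        print(field, loan_share(TABLE_1[field]))
    print("Rows not summing to 100:", inconsistent_rows(TABLE_1))
```

On the printed values, only the Animals row (26.5 and 72.5) fails the sum check, which may point to a typesetting slip in one of the two figures; the sketch cannot tell which one.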


4.2. Multiple counterparts

The Yaqui subdatabase contains a total of 186 cases where two, three, or four words are assigned to one meaning. Among these, 86 involve loanwords which co-exist with native Yaqui words. One may wonder whether these cases should be counted in the total percentages of loanwords, above all because the bilingual situation that pervades Yaqui daily life pressures speakers to use loanwords on the basis of economy of communication. That is, if a Yaqui speaker in a particular act of communication needs to choose between a short word originally from Spanish, e.g. kinse ‘fifteen’, and a phrasal expression from Yaqui, e.g. wojmamni ama mamni, a principle of economy of communication will lead him to use the shorter lexical item, whereas in formal contexts, i.e. ritual or political events, the Yaqui will use the long terms. Another interesting aspect of multiple words for a meaning concerns four cases where an old Spanish loanword seems to be alternating with, and is probably being replaced by, a modern Spanish loanword; these are illustrated in (1):

(1)  Old loans    Modern loans    Spanish
     lionoka      resaroa         rezar       ‘pray’
     maejto       propesor        profesor    ‘teacher’
     pojporo      seriom          cerillo     ‘match’
     saweam       pantaloonim     pantalón    ‘pants’

4.3. Recurring loanwords

Another interesting case is where a word is related to more than one meaning. There are 181 words polysemically related to other words in the subdatabase; 19 of them are Spanish loanwords. Two are created on a loanword basis: bawero’aktim ‘wave, tide’ from Yaqui bawe ‘sea’ and Spanish roa ‘turn’ (Spanish rodar), and kuchichi’iti nooka ‘mumble, whisper’ derived from Spanish cuchichear and Yaqui nooka ‘talk’. Examples of polysemous loanwords are provided in the following list:

(2)  Yaqui        Spanish            gloss
     tiempo       tiempo             ‘weather, time’
     wakasim      vacas              ‘livestock, cattle’
     waakas       vacas              ‘cow, meat, flesh’
     joona        horno              ‘oven, stove, forge’
     pu’ato       plato              ‘dish, plate’
     asuka        azúcar             ‘sugar, sugar cane’
     puepplo      pueblo             ‘village, people’
     tomi         tomín              ‘money, coin, booty’
     lisensia     licencia           ‘driver’s license, license plate’
     konrenaroa   condenado          ‘condemn, convict’
     rajtiom      rastrillo          ‘razor, rake’
     ramaa        enramada           ‘hut, thatch’
     kabaa        cabra              ‘goat, he-goat’
     aseite       aceite de comer    ‘oil, petroleum’
     kulpa        culpa              ‘fault, blame’

The semantics of all words in (2) largely reflect the basic meanings of the Spanish source words. For example, Yaqui puepplo means ‘village, people’, and the Spanish source word pueblo has the same meanings. Similarly, rajtiom ‘razor, rake’ has the same meaning as the Spanish source word rastrillo. The loanword tomi, however, used to mean a ‘type of silver coin’ which was used in American countries under Spanish colonization; the term was borrowed into different indigenous languages of Mexico with the meanings ‘coin’, ‘money’, ‘bill’, as well as ‘booty’. An exception is waakas ‘cow, meat, flesh’ (< Spanish vacas ‘cows’), which extended its meaning and replaced, at least in common Yaqui usage, the original Yaqui word tekwa ‘flesh’.

5. Kinds of loanwords

Most loanwords in Yaqui are nouns (originating from Spanish): of 389 loanwords 334 are nouns. Verbs occupy a second position, followed by function words and adjectives. None of the adverbs on the LWT meaning list have loanword counterparts. An interesting issue concerning loan verbs is the fact that all but five (see §6 for these exceptions) are adapted in Yaqui by means of the suffix -oa, itself a borrowed Nahuatl verbalizer. This fact points to the sociolinguistic situation prevailing during colonization when the Spaniards used Nahuatl speakers to consolidate their social, religious, economic, and political hegemony. There are relatively few borrowed function words and adjectives: 18 function words and seven adjectives. The function words consist of all numerals in the subdatabase, which all have alternative Yaqui counterparts, and the disjunctive conjunction o ‘or’. Among the adjectives there are muuro ‘mute’, poloobe ‘poor’, neesio ‘stupid’, sekreeto ‘secret’, lobolai ‘round’, and two adjectives derived from other loanwords: tomekame ‘rich’ and kulpakame ‘guilty’. Table 3 shows loanwords in Yaqui by semantic word class.

Table 3: Loanwords in Yaqui by semantic word class (percentages)

 Word class        Loanwords from Spanish   Non-loanwords
 Nouns                     37.2                 62.7
 Verbs                     10.1                 89.9
 Function words            17.4                 82.6
 Adjectives                 3.7                 96.3
 Adverbs                      -                100.0
 all words                 26.5                 73.5


6. Integration of loanwords

The phonemic inventory of Yaqui includes five short and five long vowels, thirteen consonants and two glides, provided in (3) in the orthography used by the Yaqui of Sonora:

(3)  Short vowels:   a, e, i, o, u
     Long vowels:    aa, ee, ii, oo, uu
     Consonants:³    p, t, ch, k, ’, b, bw, s, j, l, r, m, n
     Glides:         w, y

In a few cases, Spanish loanwords in Yaqui reflect Old Spanish pronunciation. These include waakas ‘cow’, where the /w/ reflects the Old Spanish bilabial approximant /β/, which used to be written as v or u. Another example is saabum ‘soap’, where the /s/ reflects the old voiced postalveolar fricative /ʒ/, which merged with its voiceless counterpart /ʃ/ and later evolved into the modern voiceless velar fricative /x/ by the 17th century. This latter correspondence is also reflected in the loanwords⁴ na’aso ‘orange’ and aasos ‘garlic’, which are not in the Yaqui subdatabase. Old colonial Spanish loanwords exhibit phonological processes as the result of adaptation into Yaqui. I first provide a list of loanwords and then discuss the processes of adaptation, since most are still productive in modern loanwords.

(4)  Old colonial Spanish loanwords
     Yaqui         Spanish
     chiba’ato     chivato     ‘goat’
     kaba’i        caballo     ‘horse’
     kucha’a       cuchara     ‘spoon’
     kuchi’im      cuchillo    ‘knife’
     moina         molino      ‘mill’
     pua’ato       plato       ‘dish’
     sene’eka      ciénega     ‘pond’
     tena’asam     tenaza      ‘tongs’

Four phonological adaptation processes affect Spanish vowels. Not all of these processes are regular, thus the adaptation of a loanword is more or less unpredictable (Estrada 2004; Estrada & Alvarez 2004). Those processes are:

i. the modification of the place of articulation of an initial or final vowel; in (5), the adaptation of final /a/, /o/ and /e/ vowels is illustrated:

(5)               Yaqui       Spanish
     /a/ > /o/    na’aso      naranja     ‘orange’
     /a/ > /i/    kampaani    campana     ‘bell’
     /o/ > /a/    moina       molino      ‘mill’
     /o/ > /u/    keesum      queso       ‘cheese’
     /e/ > /i/    ijpuelam    espuelas    ‘spurs’

³ The phonemic description of the Yaqui letters is the following: p represents a labial voiceless stop /p/, t a dental voiceless stop /t/, k a velar voiceless stop /k/, ch a palatal voiceless affricate /tʃ/, ’ a glottal stop /ʔ/, b a bilabial voiced stop, bw a bilabiovelar voiced fricative /bʷ/, s a dental fricative /s/, j an aspirated fricative /h/, l a lateral /l/, r a flap /r/, m a bilabial nasal /m/, n a dental nasal /n/, w a bilabial glide /w/ and y a palatal glide /j/. The orthography of Arizona Yaqui uses an for the fricative aspirated phoneme and a for the fricative labiovelar consonant.

⁴ Cf. Miller (1990) as well as Sicoli (1999) for the tracing of old Spanish loanwords.

ii. the deletion of an initial vowel, illustrated in (6), where in some cases the process may also affect a contiguous consonant, i.e., enramada > ramaa:

(6)  Yaqui        Spanish
     maka         hamaca        ‘hammock’
     ramaa        enramada      ‘thatch’
     naranjao     anaranjado    ‘orange’
     banteareo    abanderado    ‘who carries the flag’

iii. the rearticulation of a vowel with a glottal stop /’/ separating the two vowels, applied to gain weight in a word once some other phonological processes have applied:

(7)  Yaqui         Spanish
     chiba’ato     chivato     ‘goat’
     pua’ato       plato       ‘dish’
     tena’asam     tenazas     ‘tongs’
     mache’etam    machete     ‘machete’, ‘knife’
     rebo’osam     rebozo      ‘shawl’, ‘scarf’

iv. vowel lengthening to reflect Spanish stressed vowels, a process also identified by Casad (1988: 80–81) for Cora, another Uto-Aztecan language:

(8)  Yaqui       Spanish
     animaal     animal     ‘animal’
     buuru       burro      ‘donkey’
     karakool    caracol    ‘snail’
     pooso       pozo       ‘spring’ or ‘well’

Concerning consonants, the Yaqui phonological system lacks six consonants which are present in Spanish: the voiceless labiodental fricative /f/, the voiced dental stop /d/, the voiced velar stop /g/, the alveolar trill, the lateral palatal /ʎ/, and the nasal palatal /ɲ/, written ñ. The adaptation of loanwords with these consonants is as follows:

i. /f/ is replaced by /p/:

(9)  Yaqui       Spanish
     elepante    elefante    ‘elephant’
     pojporo     fósforo     ‘match’
     piltro      fieltro     ‘felt’


ii. /d/ is replaced by /r/, /l/ or zero. The deletion of intervocalic /d/ may have already been present in the source word since this phonological process is common in several dialects of Spanish.

(10) Yaqui      Spanish
     arorno     adorno      ‘adornment’, ‘ornament’
     relpin     delfín      ‘dolphin’
     manara     manada      ‘the herd’
     lios       Dios        ‘God’
     sondao     soldado     ‘soldier’

iii. /g/ is replaced by /k/ or /w/:

(11) Yaqui      Spanish
     lominko    domingo     ‘Sunday’
     biika      viga        ‘beam’
     tiikom     trigo       ‘wheat’
     wolpo      golfo       ‘gulf’

iv. /s/ is replaced by /h/ (written as j) when it is before a stop consonant. This is a case where the loanword may be influenced by the Spanish dialect of Sonora:

(12) Yaqui        Spanish
     ejpeeko      espejo      ‘mirror’
     ejkalea      escalera    ‘ladder’
     ejtaatua     estatua     ‘statue’
     impuejtom    impuesto    ‘tax’, ‘tribute’

v. the lateral palatal /ʎ/ is replaced by a simple lateral /l/, a glide /y/, or zero, a phonological process also present in the Sonoran Spanish dialect:

(13) Yaqui     Spanish
     tokila    toquilla     ‘headband’, ‘headdress’
     toayam    toalla       ‘towel’
     orkiam    horquilla    ‘pitchfork’
     orniam    hornilla     ‘fireplace’

vi. the nasal palatal /ɲ/ is adapted as a glide /y/:

(14) Yaqui     Spanish
     paayum    paño         ‘handkerchief’ or ‘rag’
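The consonant substitutions in i to iv can be read as a small set of rewrite rules, some of them with more than one possible outcome. The following Python sketch is only an illustration under simplifying assumptions: it works on Spanish spelling rather than on phonemes, it covers only /f/, /d/, /g/, and /s/ before a stop, it ignores all vowel processes and Yaqui suffixes, and the names `ONE_TO_MANY` and `candidate_forms` are hypothetical. Because the text stresses that adaptation is not fully predictable, the function returns a set of candidate forms rather than a single prediction.

```python
import itertools
import re
import unicodedata

# Consonants with several attested outcomes in the lists above:
# /d/ > /r/, /l/ or zero; /g/ > /k/ or /w/.
ONE_TO_MANY = {
    "d": ("r", "l", ""),
    "g": ("k", "w"),
}


def candidate_forms(spanish_word):
    """Return the set of consonant-adapted candidate forms for a Spanish word.

    Simplifications of this sketch: spelling stands in for pronunciation,
    vowels are left untouched, and no Yaqui suffix is added.
    """
    # Lower-case and drop the acute accent only (ñ keeps its tilde).
    word = unicodedata.normalize("NFD", spanish_word.lower()).replace("\u0301", "")
    word = unicodedata.normalize("NFC", word)
    # /f/ > /p/
    word = word.replace("f", "p")
    # /s/ > /h/ (written j) before a stop, approximated by the letters p, t, k, c
    word = re.sub(r"s(?=[ptkc])", "j", word)
    # Expand every consonant that has more than one outcome.
    options = [ONE_TO_MANY.get(letter, (letter,)) for letter in word]
    return {"".join(choice) for choice in itertools.product(*options)}


if __name__ == "__main__":
    for source in ["elefante", "fósforo", "domingo", "golfo", "Dios"]:
        print(source, sorted(candidate_forms(source)))
```

For the items checked here, the attested forms elepante, pojporo, lominko, wolpo, and lios all appear among the candidates. Forms that also involve vowel changes or cluster simplification, such as ejpeeko or sondao, are outside what this sketch can produce.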

One last phonological process which applies in the adaptation of Spanish loanwords is the introduction of an epenthetic vowel to break up consonant clusters which are not permitted by Yaqui phonotactics. Evidence of the early application of this process can be seen in some Spanish loanwords of the extinct Tehueco documented by Buelna (1989 [1890]):

(15) Buelna     Yaqui     Spanish
     cabara     kabaa     cabra     ‘goat’
     curute     kuus      cruz      ‘cross’
     purato     pu’ato    plato     ‘dish’
     tirihon    tiikom    trigo     ‘wheat’

Finally, one phonological process of adaptation of loanwords deserves a special mention: the Spanish consonant written ll is replaced by a glottal stop. This phenomenon of adaptation is observed mainly in older colonial loanwords.

(16) Yaqui     Spanish
     kaba’i    caballo    ‘horse’

Phonological integration of loanwords, especially of old loanwords, makes them unrecognizable as such to most native speakers, in particular to those illiterate in their own language. The opposite occurs with speakers literate in Yaqui. In some cases, the speaker will intentionally avoid the use of loanwords. However, it is not completely possible to avoid old cultural loanwords such as kaba’i ‘horse’ < Spanish caballo, wakas ‘cow’ < Spanish vaca, lios ‘God’ < Spanish Dios. It is also possible to find a reluctant speaker who tries to assign a Yaqui origin to some older loanword, e.g. kobanaroa ‘governor’ < Spanish gobernar, analyzed as derived from koba ‘head’ and naawa ‘root’.

7. Morphological integration

Four suffixes are involved in the morphological adaptation of loanwords. Two come from Spanish, one from Nahuatl, and only one is of Yaqui origin. The first is the Spanish plural suffix -s, which is semantically opaque and is not productive. The suffix appears in a handful of loanwords: boes ‘ox’ < Spanish buey or aasos ‘garlic’ < Spanish ajo, the latter not included in the list of meanings studied in this project. The presence of this suffix on those loanwords might have its origin in the Spanish plural forms, or else in a general tendency of Yaqui to adapt many of the Spanish loanwords as plurals. A list of Spanish loanwords with the Yaqui plural suffix -im is provided in (17):

(17) Yaqui        Spanish
     keesum       queso       ‘cheese’
     kuchi’im     cuchillo    ‘knife’
     kolmeenam    colmena     ‘beehive’
     paleetam     paleta      ‘shoulder blade’
     paanim       pan         ‘bread’
     seriom       cerillo     ‘match’


Words borrowed from Spanish with the plural suffix -s are treated as singular forms in Yaqui and may be suffixed with the Yaqui plural marker -im, as in wakasim ‘cow, flesh, meat’ and puentesim ‘bridge’. Another Spanish suffix which appears in loanwords is the agentivizer suffix -eo. In Spanish, this suffix derives nouns denoting a person who regularly performs a particular action. Examples are provided in (18):

(18) Yaqui       Spanish
     alpareo     alfarero      ‘potter’
     ereo        herrero       ‘blacksmith’
     kapinteo    carpintero    ‘carpenter’
     rancheo     ranchero      ‘farmer’
     sapateo     zapatero      ‘shoemaker’

Although the suffix -eo is not productive in modern Yaqui, there are two examples of words which were apparently derived in Yaqui from a Spanish loanword plus the suffix -eo: bochareo ‘shoemaker’ < Spanish bolsa, which is adapted in Yaqui as boocham ‘shoes’, and wakareo ‘butcher’, which derives from the Spanish loanword wakas ‘cow’ < Spanish vacas, a word which in Yaqui extended its meaning to include ‘meat’. Wakareo ‘butcher’ was evidently derived after the semantic extension had taken place. One last suffix, perhaps the most interesting one, is -oa, a Nahuatl verbalizer used in the morphological adaptation of verbal loanwords. All verbal Spanish loanwords, except for a handful, are adapted into Yaqui with the infinitival Spanish suffixes -ar, -ir, or -er. For a Spanish verbal loanword to function as a verb in Yaqui, it must be modified by adding the originally Nahuatl suffix -oa. The use of this suffix to derive loan verbs in Nahuatl was described by Karttunen & Lockhart (1976: 32). According to the authors, by 1700 the suffix -oa had become the “standard convention” in the adaptation of verbs. In the Yaqui subdatabase, 37 loan verbs borrowed from Spanish bear this originally Nahuatl suffix. Some examples are provided in (19):

(19) Yaqui         Spanish
     abanikaroa    abanicar    ‘fan’
     kisaroa       guisar      ‘cook’
     kolaroa       colar       ‘sieve’
     mobeiaroa     mover       ‘to move’
     orniaroa      hornear     ‘bake’
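The pattern behind the regular cases in (19), Spanish infinitive plus -oa, can be sketched as a short function. This is a hedged illustration in Python rather than a description of how speakers actually form such verbs: the function name `verbalize` and the four spelling rules are assumptions of the sketch, chosen only to respell Spanish c, qu, gu, and z with letters of the Yaqui orthography before the suffix is attached.

```python
import re


def verbalize(spanish_infinitive):
    """Attach the verbalizer -oa to a respelled Spanish infinitive.

    Assumed spelling rules (illustrative only):
      gu before e/i -> k, qu before e/i -> k, c before a/o/u -> k, z -> s.
    """
    stem = spanish_infinitive.lower()
    if not stem.endswith(("ar", "er", "ir")):
        raise ValueError("expected a Spanish infinitive in -ar, -er or -ir")
    stem = re.sub(r"gu(?=[ei])", "k", stem)
    stem = re.sub(r"qu(?=[ei])", "k", stem)
    stem = re.sub(r"c(?=[aou])", "k", stem)
    stem = stem.replace("z", "s")
    # The infinitival ending is kept and -oa is added after it.
    return stem + "oa"


if __name__ == "__main__":
    for infinitive in ["abanicar", "guisar", "colar", "rezar"]:
        print(infinitive, "->", verbalize(infinitive))
```

The sketch yields abanikaroa, kisaroa, kolaroa, and resaroa, all of which are attested in this chapter. It does not reproduce mobeiaroa (from mover) or orniaroa (from hornear), whose stems show further changes that these rules do not capture.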

However, there are several verbs not derived by the Nahuatl verbalizer -oa, but by means of other morphological processes. The verb kuchi’isoa ‘stab’ was not derived from the Spanish infinitival form acuchillar ‘to stab’, but from the loanword noun kuchi’isim ‘knife’ and the verbalizer suffix -oa. Another five verbs show original Yaqui strategies for deriving new verbs: the transitivizing suffix -te, the causative -tua, the directional prefix nau-, incorporation, and phrasal construction. All those examples are provided in (20):

(20) Yaqui               gloss                Spanish
     kuchi’is-oa         knife-VBLZ           acuchillar    ‘stab’
     kaba’i-te           horse-TR             cabalgar      ‘to ride’
     lio-noka            God.talk             rezar         ‘pray’
     mani-tua            brake-CAUS           manear        ‘brake’
     nau-monto           DIR-pile             amontonar     ‘to pile up’
     kulpa-ta a u’ura    guilt-ACC 3SG.ACC    quitar        ‘to remove’

The morphological processes for the adaptation of the Spanish loanwords in (20) characterize such loanwords as old (colonial age). They are also useful in determining that the use of the suffix -oa was a strategy to adapt loanwords which probably became generalized by the 19th or 20th century.

8. Grammatical borrowing

Grammatical borrowing in Yaqui is mainly observable in the use of discourse particles and word order.

8.1. Discourse particles

Discourse particles are, according to Stolz & Stolz (1996) and Matras (1998), the linguistic material most commonly adopted by another language, since such particles are fundamental for the organization of the message. Whether a given discourse particle counts as a grammatical borrowing cannot, at least for now, be decided in terms of frequency of use, above all because Yaqui speakers are inconsistent as to which discourse particle they use. In a Yaqui database of nearly 50 distinct texts (more than 150 pages) collected from five distinct speakers, the following Spanish particles occurred: the disjunctive conjunction o ‘or’ (< Spanish o); the prepositions asta ‘till’ (< Spanish hasta) and komo ‘as’ (< Spanish como); the subordinator pos ‘well, then, so’, from the Spanish hesitative⁵ pues; the conditional si ‘if’; and ke ‘what’ (< Spanish que), which, contrary to what has been documented by Johnson (1943) and Lindenfeld (1982), was only attested in the speech of one speaker. Examples are provided below.

(21) ume     tomt-I      kate-me       achai-m-ta-ka      o   mala-m-ta-ka...
     DEM-PL  born-ADJVZ  walk.PL-NMLZ  father-PL-ACC-SUB  or  mother-PL-ACC-SUB
     ‘those youngsters of today, their fathers or mothers…’

(22) jeka-po   chaasime        asta  junum  ian    “Ten   Jaweepo”.
     wind-LOC  hang-go.SG.PRS  till  there  today  mouth  open.LOC
     ‘It went rolling in the air up to what is today the “open mouth” mountain.’

(23) binwatu   komo  setentai  ocho   wasukte...
     time.ago  like  seventy   eight  years
     ‘(I was born) about seventy-eight years ago…’

(24) Poj...   inepo    jewi  im   naa  weye...
     well...  1SG.NOM  yes   LOC  DIR  go.SG
     ‘Well..., I live here…’

(25) si  nee      a=mabett-ne-’u...
     if  1SG.ACC  3SG.ACC=accept-FUT-NMLZ
     ‘If I were accepted...’

(26) Jiokot    te       a=pasa-roa-k                 ama  ke   te       trajte-ta
     not-good  1PL.NOM  3SG.ACC=spend_time-VBLZ-PFV  LOC  SUB  1PL.NOM  dish-ACC
     wo’ota-tua-wa-k       munim  jitasa  joona-po   jo’o-wa-me...
     throw-CAUS-IMPRS-PFV  beans  that    stove-LOC  make-IMPRS-REL
     ‘We had such trouble there that we were forced to throw away the plate of beans that were cooked on the stove...’

⁵ A hesitative, according to Mushin (2001), is a discourse particle which communicates an act of hesitation, where the speaker gives place to a suspension of opinion or action, an act of doubt.

8.2. Word order

Two changes in the word order of Yaqui as the result of language contact with Spanish were discussed by Johnson (1943) and Lindenfeld (1971). The first author points out that the postposition betči’ibo ‘for’ has changed into a preposition while maintaining its postpositional status, i.e. the original postposition may now occur either before or after the NP. He also makes the following statements about all postpositions: “The resulting set of alternate constructions [that is, postnominal and prenominal] are applied to the entire set of Yaqui postpositions” and “bilinguals are obviously prone to change the Yaqui order, making the postposition a preposition” (Johnson 1943: 432). Examples from Johnson are provided in (27):

(27) a. ʔin-áčai betčiʔibo
        ‘my-father for’
     b. betčiʔibo ʔin-áčai

However, after twelve years of doing fieldwork on Sonora Yaqui, I have never documented any examples like those described by Johnson. On the contrary, the different occurrences of betčiʔibo, which in the orthography used by the Yaqui is written betchi’ibo, confirm its postpositional status:

(28) a. Joan  Peo-ta     betchi’ibo  Wa’imam-meu  siika.
        John  Peter-ACC  for         Guaymas-DIR  go.PFV
        ‘John went to Guaymas for Peter.’

     b. samim  ya’a      betchi’ibo  techoa  into  sankoa  nau-kuu-kuuta-wa.
        adobe  make.PRS  for         mud     and   straw   together-RDP-mix-PASS
        ‘For making adobes, mud and straw are mixed.’

     c. u        jamut  baka-ta           etaj-ta     warim      ya’a      betchi’ibo.
        DET.NOM  woman  straw’s_stem-ACC  cut-TR.PRS  basket.PL  make.PRS  for
        ‘The woman cuts the straw’s stem for making baskets.’

Lindenfeld (1971) demonstrates that the comparative construction in Yaqui has been influenced by Spanish comparative constructions. The main argument is based on the change of order of the adjective within the comparative clause. According to Lindenfeld, the Yaqui adjective bwe’ú ‘big’ changed from sentence-final position as in (29a) to a postnominal order as in (29b):

(29) a. hu o’óu húme haamúčim benásya bwe’ú.
        ‘This man is as big as these women.’
     b. hu o’óu bwe’ú ke húme haamúčim benásya.

In contrast to the type of comparative constructions documented by Lindenfeld (1971: 8), what I have attested is the change in word order of the comparative benasi ‘like, as’ from sentence-final position, illustrated in (30a-b), to initial position, like the Spanish adverb como ‘how, about, as, like’. Example (30c) illustrates benasi in the position of the adverb como.

(30) a. Ket-kea    ili  ito   aet      womta-la-ta      benasi...
        also-only  DIM  RFLX  3SG.DIR  scare-ADJVZ-ACC  COMP
        ‘Like if we were scared a little bit (towards) ourselves…’

     b. Tua=te        wepul  mampusiam-po   benasi...
        true=1PL.NOM  one    finger-PL-LOC  COMP
        ‘In fact there was like about only one small amount…’ (lit. like about one finger)

     c. Benasi=t=a            ta’a-pea.
        COMP=1PL.NOM=3SG.ACC  know-DES
        ‘Like when we want to know it (the sun).’


8.3. Expletives and exclamations

Spanish expressions of emotion and (strong) opinion occur in Yaqui texts. They are not frequent and their occurrence depends on the degree of involvement of the speaker in relation to the topic. Based on their unpredictability, I assume that those expressions are instances of codeswitching. An example is given in (31):

(31) Si   nesio     ju’u  yoi        jodido!
     INT  necio     DET   yori       jodido
     INT  stubborn  DET   white-man  fucker
     ‘¡Qué necio el yori jodido!’
     ‘So stubborn, the fucking white man!’

Special Abbreviations

ADJVZ    adjectivizer
COMP     comparative
DES      desiderative
DIM      diminutive
DIR      directional
IMPRS    impersonal
INT      intensive
RDP      reduplication
REL      relativizer
SUB      subordinator
VBLZ     verbalizer

References

Aboites Aguilar, Luis. 1993. Norte precario: Poblamiento y colonización en el Norte de México (1760–1940). Doctoral Thesis. México: El Colegio de México.
Aikhenvald, Alexandra Y. 2006. Grammars in contact: A cross-linguistic perspective. In Aikhenvald, Alexandra Y. & Dixon, R. M. W. (eds.), Grammars in contact: A cross-linguistic typology, 1–66. Oxford: Oxford University Press.
Buelna, Eustaquio. 1989 [1890]. Arte de la lengua cahita por un padre de la Compañía de Jesús. México: Siglo XXI Editores.
Campbell, Lyle. 1985. The Pipil Language of El Salvador. Berlin: Mouton Publishers.
Campbell, Lyle. 1997. American Indian languages: The historical linguistics of Native America. Oxford: Oxford University Press.
Campbell, Lyle. 1999. Historical Linguistics: An Introduction. Cambridge, MA: MIT Press.
Campbell, Lyle & Langacker, Ronald. 1978. Proto-Aztecan vowels. Parts 1, 2, 3. International Journal of American Linguistics 44:85–102, 197–210, 262–279.
Casad, Eugene H. 1988. Post-Conquest influences on Cora (Uto-Aztecan). In Shipley, William (ed.), In Honor of Mary Haas: From the Haas Festival Conference on Native American Linguistics, 77–136. Berlin: Mouton de Gruyter.
Dakin, Karen. 1996. Long vowels and morpheme boundaries in Nahuatl and Uto-Aztecan: Comments on historical developments. Amerindia 21:55–76.
Dakin, Karen. 2004. Prólogo. In Estrada et al. (ed.), Diccionario yaqui-español: Obra de preservación lingüística, 13–20. México: Editorial Plaza y Valdés/Universidad de Sonora.
de Molina, Fray Alonso. 1977 [1955]. Vocabulario en lengua castellana y mexicana y mexicana y castellana. México: Editorial Porrúa, S.A.
Dedrick, John M. 1977. Spanish influence on Yaqui Grammar. International Journal of American Linguistics 43(2):144–149.
Dedrick, John M. 1985. Las cartas en yaqui de Juan “Banderas”. Tlalocan. Revista de Fuentes para el Conocimiento de las Culturas Indígenas de México X:119–187.
Dedrick, John M. & Casad, Eugene H. 1999. Sonora Yaqui Language Structures. Tucson: The University of Arizona Press.
Dedrick, John M. 1946. How Jobe’eso Ro’i got his name. Tlalocan. A journal of source materials on the native cultures of México 2(2):163–166. Azcapotzalco: La casa de Tlaloc.
Demers, Richard & Escalante, Fernando & Jelinek, Eloise. 1999. Prominence in Yaqui Words. International Journal of American Linguistics 65(1):40–55.
Dozier, Edward P. 1956. Two examples of linguistic acculturation: The Yaqui of Sonora and Arizona and the Tewa of New Mexico. Language 32(1):146–157.
Escalante, Fernando. 1988. Spanish loanwords in Yaqui. Phoenix, AZ: Work presented in CAIL-AAA.
Escalante, Fernando. 1990a. Voice and Argument Structure in Yaqui. Ph.D. dissertation. University of Arizona.
Estrada Fernández, Zarina. 2004. Spanish loanwords in Yaqui (Uto-Aztecan language from Northwest Mexico). Leipzig: Talk given at the Max Planck Institute for Evolutionary Anthropology, 24th June.
Estrada Fernández, Zarina & Álvarez González, Albert. 2004. Préstamos del español en yaqui. México, D.F.: VII Coloquio de Lingüística en la Escuela Nacional de Antropología e Historia, 28th–30th April.
Estrada Fernández, Zarina & Buitimea Valenzuela, Crescencio & Gurrola Camacho, Adriana E. & Castillo Celaya, María Elena & Carlón Flores, Anabela. 2004. Diccionario yaqui-español: Obra de preservación lingüística. México: Editorial Plaza y Valdés/Universidad de Sonora.
Gumperz, John J. 1982. Discourse strategies. Cambridge: Cambridge University Press.
Jelinek, Eloise. 1998. Voice and Transitivity as Functional Projections in Yaqui. In Butt, Miriam & Geuder, Wilhelm (eds.), Projections from the Lexicon. Stanford: CSLI.
Jelinek, Eloise & Escalante, Fernando. 1988. Verbless possessive Sentences in Yaqui. In Shipley, William (ed.), Festschrift for Mary Haas: From the Haas Festival Conference on Native American Linguistics. Berlin/New York: Mouton de Gruyter.
Jelinek, Eloise & Escalante, Fernando. 2000. Unergative and Unaccusative Verbs in Yaqui. In Casad, Eugene & Willett, Thomas L. (eds.), Uto-Aztecan Structural, Temporal and Geographic Perspectives: Papers in Honor of Wick R. Miller by the friends of Uto-Aztecan, 171–182. Hermosillo, Sonora: Universidad de Sonora.
Johnson, Jean B. 1962. El idioma yaqui. México: Instituto Nacional de Antropología e Historia. Departamento de Antropología e Historia, Publ. 10.
Johnson, Jean Bassett. 1943. A clear case of linguistic acculturation. American Anthropologist. New Series 45(3):Part 1: 427–434.
Karttunen, Frances & Lockhart, James. 1976. Nahuatl in the middle years: Language contact phenomena in texts of the colonial period. (University of California Publications in Linguistics 85). Berkeley: University of California Publications.
Kaufman, Terrence. 1981. Comparative Uto-Aztecan Phonology. Unpublished manuscript.
Kurath, William & Spicer, Edward H. 1947. A brief introduction to Yaqui: A native language of Sonora. University of Arizona Bulletin 18(1). (Social Science Bulletin 15).
Lindenfeld, Jacqueline. 1971. Semantic categorization as a deterrent to grammatical borrowing: A Yaqui example. International Journal of American Linguistics 37(1):6–14.
Lindenfeld, Jacqueline. 1973. Yaqui Syntax. Berkeley: University of California Press.
Lindenfeld, Jacqueline. 1982. Langues en contact: Le yaqui face à l’espagnol. La Linguistique 18(1):111–127.
Lionnet, André. 1977. Los elementos de la lengua cahita (yaqui-mayo). México: Universidad Nacional Autónoma de México.
Manaster-Ramer, Alexis. 1992. A Northern Uto-Aztecan sound law: *-c- > *-y-. International Journal of American Linguistics 58(3):251–268.
Manaster-Ramer, Alexis. 1993. On lenition in some Northern Uto-Aztecan languages. International Journal of American Linguistics 59(3):334–341.
Matras, Yaron. 1998. Utterance modifiers and universals of grammatical borrowing. Linguistics 36(2):281–331.
Miller, Wick R. 1967. Uto-Aztecan cognate sets. Berkeley/Los Angeles: University of California Press.
Miller, Wick R. 1983. Uto-Aztecan. In Ortiz, Alfonso (ed.), Southwest: Handbook of North American Indians, 113–124. Washington: Smithsonian Institution.
Miller, Wick R. 1990. Early Spanish and Aztec loan words in the indigenous languages of northwest Mexico. In Garza Cuarón, Beatriz & Levy, Paulette (eds.), Homenaje a Jorge A. Suárez: Lingüística Indoamericana e hispánica, 351–365. México: El Colegio de México.
Moctezuma, José Luis & López, Gerardo. 1990. Variación Dialectal Yaqui-Mayo. Noroeste de México 9:94–106. Centro Regional Sonora del Instituto Nacional de Antropología e Historia.
Molina, Felipe S. & Valenzuela, Herminia & Shaul, David Leedom. 1999. Hippocrene Standard Dictionary Yoeme-English/English-Yoeme: With a Comprehensive Grammar of Yoeme Language. New York: Hippocrene Books.
Mushin, Ilana. 2001. Evidentiality and epistemological stance: Narrative retelling. Amsterdam: John Benjamins.
Orozco y Berra, Manuel. 1864. Geografía de las lenguas y carta etnográfica de México: precedidas de un ensayo de clasificación de las mismas lenguas y de apuntes para las inmigraciones de las tribus. México: J. M. Andrade y F. Escalante.
Pimentel, Francisco & de Heras, Conde. 1874. Cuadro descriptivo y comparativo de las lenguas indígenas de México: o Tratado de filología mexicana. México: Isidro Epstein.
Santamaría, Francisco J. 2000. Diccionario de mexicanismos. México: Editorial Porrúa.
Sapir, Edward. 1913. Southern Paiute and Nahuatl: A Study in Uto-Aztecan. Part 1. Journal de la Société des Américanistes de Paris 10:379–425.
Sapir, Edward. 1915. Southern Paiute and Nahuatl, A Study in Uto-Aztecan, Part 2. American Anthropologist 17:98–120, 306–328. Also published 1914–19, Journal de la Société des Américanistes de Paris, nouvelle série 11:443–488.
Sicoli, Mark. 1999. Loanwords and contact-induced phonological change in Lachixio Zapotec. In Proceedings of the 25th annual meeting of the Berkeley Linguistic Society. General session and parasession on loanword phenomena, 395–406. Berkeley, CA.
Silva Encinas, Manuel Carlos & Álvarez Romero, Pablo & Buitimea Valenzuela, Crescencio. 1998. Jiák nokpo etéjoim (pláticas en lengua yaqui). Hermosillo: Universidad de Sonora.
Spicer, Edward H. 1943. Linguistic aspects of Yaqui acculturation. American Anthropologist. New Series 45(3):Part 3: 410–426.
Stolz, Christel & Stolz, Thomas. 1996. Funktionswortentlehnung in Mesoamerika: Spanisch-amerindischer Sprachkontakt (Hispanoindiana II). Sprachtypologie und Universalienforschung 49(1):86–123.
Voegelin, C. F. & Voegelin, F. M. & Hale, Kenneth L. 1962. Typological and comparative grammar of Uto-Aztecan: 1 (Phonology). Supplement to International Journal of American Linguistics 28(1). Memoir 17. Indiana University Publications in Anthropology and Linguistics.
Whorf, Benjamin L. 1935. The comparative linguistics of Uto-Aztecan. American Anthropologist 37:600–608.
Whorf, Benjamin L. 1937. The origin of Aztec tl. American Anthropologist 39:265–274.


Zarina Estrada Fernández

Loanword Appendix

Spanish

kontinente

mainland, continent

wolpo

bay, gulf

punta sene’eka pooso tiempo

insekto

insect

uuba

grape

kuka

cockroach

nues

nut

kolmeenam

beehive

aseituuna

olive

kama

alligator, crocodile

pimientam

pepper

mielim

honey

manara

herd

keesum

cheese

piiko

beak

mantekia

butter

omooplato

shoulderblade, omoplate

biino

wine

shoulderblade, omoplate

serbeesa

beer

tajkaim

tortilla

teela

cloth

liino

flax, linen

seeda

silk

piltro

felt

joronwom

poncho

saakom

coat

kueyo

collar

saaweam

trousers

pantaroonim

trousers

meeriam

sock, stocking

kalsetiinim

sock, stocking shoe

headland, point spring, well weather, time

seriom

match

pojporo

match

paleetam

pale ~ paale

son

munyeeka

wrist

animal ~ animaal

animal

baaso

spleen

wakasim

livestock, cattle

robonim

testicles

purunkulo

boil (noun)

poteo

pasture

lotor ~ lootor

physician

kapyeo

herdsman

muuro

mute

rancho

stable, stall

kisaroa

cook

too ~ tooro

bull

kisaaroa

fry, roast

boes

ox

orniaroa

bake

waakas

cow, meat, flesh

joona

oven, stove (n), to forge (vb)

kaaso

pan

boocham

pu’ato

dish, plate

bootam

boot

pitcher, jug

sapateo

shoemaker

taasa

drinking vessel, cup

bochareo

shoemaker

wantem

glove

ass, donkey

kucha’ara ~ kucha’a

spoon

beelom

veil

muura

mule

bolsiom

pocket

kuchi’im

knife

totoi

chicken

boosam

pocket

teneror

fork

kanso

goose

booton

button

tena’asam

tongs

paato

duck

alpiler

pin

kolaroa

sieve, strain

takwaachi

opossum

arorno

paanim

bread

adornment, ornament

relpin

dolphin, porpoise

ainam

flour, meal

jooyam

jewel

moina

mill

aniilio

salchiicha

sausage

ring (for finger)

soopam

soup, broth

pulseera

bracelet

begetaalim

vegetables

tokila

paapa

potato

headband, headdress

beseo

calf

kabaa

goat

chiba’ato

he-goat

kaba’i

horse

yeewa

mare

buuru

raaya

stingray, raie

ooso ~ jooso

bear

elepante

elephant

kameo

camel

picheel


paayum

handkerchief, rag

suuko

furrow

ejkulpiaroa

carve

paalam

shovel

ejkultoor

sculptor

toayam

towel

asaroonim

hoe

ejtaatua

statue

sepio

brush

orkeetam

fork, pitchfork

mobeiaroa

move

rajtiom

razor, rake

orkiam

fork, pitchfork

kareteera

road

saabum

soap

oosam

sickle, scythe

puentes

bridge

ejpeeko

mirror

tiikom

wheat

kareeta

ramaa

hut, thatch

sebaara

barley

carriage, wagon, cart

raanja

garden-house

abeena

oats

rueram

wheel

kancha

court, yard

arosim

rice

eeje

axle

kosina

cookhouse

pine

barko

ship

auritoorio

meeting house

piipa

pipe

boote

boat

koato

room

kooko

coconut

kanoa

canoe

pueta

door, gate

sittriko

fruit (citrus)

batanka

outrigger

marko

doorpost, jamb

reemom

oar

lock

laatano ~ plaatano

banana tree

kanraom

remaroa

row (vb)

kandao

padlock

kamoote

potato (sweet)

tiimon

rudder

yaabem

key

sorwo

majtil

mast

bentaana

window

millet, sorghum

beelam

sail (noun)

orniam

fireplace

asuka

cane, sugar

amklam

anchor

chimenea

chimney

tekipanoa

work

pueto

harbor, port

ejkalea

ladder

karenam

chain

resembarkaroa

land (vb)

kaamam

bed

kuchi’isoa

stab

salbaroa

rescue, save

chair, bank (financial institution)

eramienta

tool

tomi

kapinteo

carpenter

money, coin, booty, spoils

seruuchom

saw

poloobe

poor

meesa

table

martiom

hammer

poobe

beggar

lampaa

lamp, torch

laabos

nail

kueenta

kanteelam

candle

ereo

blacksmith

account, reckoning

repiisa

shelf

pundiroa

cast (metals)

impuejtom

tribute, tax

batea

trough

ooro

gold

merkao

market (place)

biika

rafter, beam

silver

tienra

shop, store

pojte

pole, post

koppre

copper, bronze

pesaroa

weigh

tappla

board

ronse

copper, bronze

sentro

center, middle

arko

arch

ploomo

lead (noun)

ejte

east

albanyil

mason

ejtanyo

tin, tinplate

oejte

west

lakrio

brick

potter

north

maaka

hammock

alpareo ~ alparero

noote

suur

south

rancheo

farmer

limeete

glass

meriaroa

measure

kora

fence

abaniiko

fan (noun)

kancho

hook

sanja

ditch

abanikaroa

fan (vb)

ejkiina

corner

banko

piino

laata ~ plaata


kuus ~ kurus

cross

doubt

square

ruraroa, dudaroa

payaroa

miss (target)

koarao

pelo’otam

ball, sphere

traisionaroa

betray

lei

law

tribunaal

sero

zero, nothing

nesesira

court

need, necessity

uno

one

juskaroa

judge (vb)

probaroa

try

juisio

los ~ dos

judgment

two

o

or

joes

judge (noun)

tres

three

konpesaroa

admit, confess

tejtiiko

witness

kuatro

four

nekaroa

deny

juraroa

swear

sinko

five

prometiaroa

promise

juraroawame

oath

seis

six

proibiaroa

forbid

konrenaroa

siete

seven

anunsiaroa

announce

condemn, convict (vb)

ocho

eight

menasaroa

threaten

kajtiiko

nuebe

nine

le’iaroa

read

punishment, penalty

dies

ten

pluuma

pen

peena

onse

eleven

liprom

book

punishment, penalty

dose

twelve

puepplo

village, people

multa

fine

kinse

fifteen

triibu

tribe, clan

kasel

prison, jail

beinte

twenty

baara

walking stick

biolasion

rape

sien

hundred

kobanaroa

rule, govern

lakron

robber, thief

miil

thousand

kobanaao

ruler, king

relijion

religion

alba

dawn

liipre

freeman

lios

god

bijpaam

evening

enemiko

enemy

te’opo

temple, church

oora

hour

ejtanjeo

stranger

altaar

altar

relok

clock, timepiece

bisitaroame

guest

resaaroa

pray

prebiniaroa

prevent, hinder

paare

priest

semaana

week

ejtorbaroa

prevent, hinder

inpierno

hell

lominko

Sunday

custom

rario

radio

luunes

Monday

kojtumrem ~ kojtumbrem

teele

television

maates

Tuesday

geer

battle, war

teleepono

telephone

miekoles

Wednesday

paas

peace

bisikleeta

bicycle

juebes

Thursday

troopa

army

mooto

motorcycle

bienes

Friday

sontao

soldier

kaaro

car

sabala

Saturday

lansam

spear

kamion

bus

poloobele

pity

ejpaam

sword

treen

train

selaaroa

envy, jealousy

kanyon

gun, cannon

abion

airplane

kulpa

fault, blame

atakaroa

attack

luus

electricity

neesio

stupid, foolish

waaria

sentinel, guard

bateriiam

battery

maejto

teacher

ansueelom

fishhook

manitua

brake

propesor

teacher

taampa

motor/engine

ejkuela

school

trap (fish), trap (noun)

motor

maakina

machine

sekreeto

secret

pusilaroa

shoot

aseite

oil, petroleum


ojpital

hospital

numero

number

bombam

bomb/explosive

empermea

nurse

koreo

post, mail

paprika

pajtiam

pill, tablet

ejtampia

postage stamp

work shop/factory

inyeksionim

injection

kaata

letter (epistula)

perioriko

newspaper

taajeta

postcard

almanaakem

calendar

labareo

faucet

peliikulam

film, movie tea

lentem

spectacles, glasses

presirente

president

komon

toilet

te

polisiia

police

kolchoonim

mattress

kapee

coffee

lisensia

driver’s license, license plate

ampora

can, tin

leepe

orphan

nasimiento akta

birth certificate

tornio

screw

leriito

crime

eleksion

election

resarmaroorim

screw driver

botea

bottle

uuli

plastic

Unknown origin

miisi

cat

Chapter 33

Loanwords in Zinacantán Tzotzil, a Mayan language of Mexico*

Cecil H. Brown

1. The language and its speakers

Tzotzil belongs to the Tzeltalan group of the Greater Tzeltalan branch of the Mayan language family (see Figure 1). It is spoken in the central highlands of Chiapas, the southernmost state of Mexico (see Map 1). There are around 265,000 speakers of the language in 24 communities, 19 having municipal status. This study focuses exclusively on the variety of Tzotzil spoken in the municipality of Zinacantán. The political and ceremonial center of Zinacantán is located at an altitude of 2,558 m at 16°45′0″N, 92°42′0″W, about 10 km west of the city of San Cristóbal de las Casas. The municipality has a population of more than 30,000 people, of which at least 94 percent speak Tzotzil. Traditionally, like other indigenous peoples of Chiapas, speakers of Zinacantán Tzotzil are swidden maize farmers.

Until around 700 CE, the highlands of Chiapas were sparsely inhabited (Vogt 1969: 139). Ancestors to speakers of Tzotzil (and of Tzeltal as well, a closely related sister language spoken in the region) may have begun to migrate into the area around that time, a possibility attested by archaeological evidence showing rapid increase in number of settlements and types of pottery. It is not known for certain where these people came from, but the linguistic connection of Tzotzil (and Tzeltal) with Cholan languages (see Figure 1) suggests a migration to the Chiapas highlands from the Maya lowlands to the north and east, where Cholan languages are spoken today and where Classic Maya Civilization developed and flourished¹ (from 250–900 CE).

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Brown, Cecil H. 2009. Zinacantán Tzotzil vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.), World Loanword Database. Munich: Max Planck Digital Library, 1217 entries.

1 In a recent paper, I (Brown n.d.) assemble linguistic and ethnobotanical evidence indicating that Proto-Tzeltalan, mother language to Tzotzil and Tzeltal, was spoken somewhere in the Maya lowlands.

[Figure 1 is a tree diagram; its node and language labels are listed here in place of the image.]

Proto-Mayan
  Huastecan: Huastec, Chicomuceltec
  Yucatecan: Yucatec, Lacandon, Mopan, Itzaj
  Greater Tzeltalan
    Ch’olan: Ch’ol, Chontal, Ch’orti’, Ch’olti’
    Tzeltalan: Tzotzil, Tzeltal
  Greater Q’anjob’alan
    Chujean: Tojolab’al, Chuj
    Q’anjob’alan: Q’anjob’al, Akateko, Jakalteko
    Mocho’
  Eastern Mayan
    Mamean: Mam, Teko, Ixhil, Awakateko
    K’iche’an: Q’eqchi’, Uspanteko, Poqomchi’, Poqomam, K’iche’, Sakapulteko, Sipakapense, Kaqchikel, Tz’utujiil

Figure 1: The classification of Mayan languages (based on Wichmann & Brown (2003: Fig. 1), adapted from Campbell & Kaufman (1985: Fig. 1)).


Map 1: Geographical setting of Tzotzil

There is very little evidence of influence from major Mesoamerican centers in the central highlands of Chiapas from 700 CE until the arrival of Europeans in 1519 CE (Vogt 1969: 141). For example, ceramics and architecture of this period in highland Chiapas are distinct from those associated with Mexicans of the Central Valley of Mexico to the north. In contrast, Mexican influence was clearly manifest just to the south in the Guatemalan highlands and nearby in the valley of the Grijalva River. Thus, the highland region of Chiapas seems to have been something of a cultural backwater from around 700–1519 CE. This may have begun to change at the end of this period. It is reported that shortly before the arrival of the Spanish, an Aztec garrison was established in Zinacantán (Remesal 1932: 378; Calnek 1962: 24; Gerhard 1979: 149).

The first penetration of Spaniards into the area occurred in 1523 (Vogt 1969: 143). The Spanish conquest of highland Chiapas was completed in 1528. In addition to the Spanish conquerors, large numbers of Aztec and Tlaxcalan soldiers are known to have been present in Ciudad Real (San Cristóbal de las Casas) in early post-conquest times (Laughlin 1988:I: 10). These were speakers of Nahuatl, a language that served as a lingua franca for Mesoamerica (southern Mexico and northern Central America) for decades both before and after the Spanish intrusion. In 1545, the Dominicans, the dominant Catholic order of the region, arrived from Spain (Laughlin 1988:I: 1). These clerics remained in highland Chiapas until the order was expelled from Mexico in the nineteenth century.

2. Sources of data

The main source of lexical data for Zinacantán Tzotzil is The Great Tzotzil Dictionary of San Lorenzo Zinacantán compiled by Robert M. Laughlin (1975). This work is one of the most comprehensive bilingual dictionaries (Tzotzil/English) ever assembled for a Native American language, with approximately 30,000 entries in Tzotzil and 15,000 in English. The dictionary documents the language as it was spoken in the mid-twentieth century.

A second lexical source consulted is a colonial dictionary of Zinacantán Tzotzil compiled by an unknown lexicographer, probably at the end of the sixteenth century. This dictionary has been published by Robert M. Laughlin (1988) under the title The Great Tzotzil Dictionary of Santo Domingo Zinacantán. Laughlin includes a facsimile of the dictionary, which is a Spanish-to-Tzotzil listing, a rendering of the latter into modern orthography, and English-Tzotzil and Tzotzil-English presentations. The facsimile is not of the original dictionary, but rather of a handwritten copy made around 1906. The original manuscript has been lost. The copy includes no indication of where, when, or by whom the dictionary was compiled. Laughlin (1988:I) in a historical commentary assembles convincing evidence that the dictionary treats the Tzotzil spoken in Zinacantán, where it probably was prepared. He narrows the time of compilation down to either the second half of the sixteenth century or the first quarter of the seventeenth. Laughlin (1988:I: 33) opts for the close of the sixteenth century, but points out that there is no positive proof. While the lexicographer cannot be definitively named, he clearly was a Dominican friar. Lexical information from the colonial dictionary is used here only as supplementary material, mainly for understanding how Spanish loanwords have figured into lexical change in Zinacantán Tzotzil over a period of about 350 years.

Comparative data from other Mayan languages used for investigating the history of individual Tzotzil words comes from Wichmann & Brown (2000), which is described in §2 of Wichmann & Hull’s chapter on Q’eqchi’ found in this volume.


3. Number of loanwords

Only one language, Spanish, has significantly contributed words to Zinacantán Tzotzil. Of the 966 Loanword Typology (LWT) meanings for which Tzotzil words have been found, 180 are rendered by Spanish loanwords (not including compound terms with a Spanish loan component). In contrast, Tojolab’al, a neighboring Mayan language, which is probably second to Spanish in donating loans to Tzotzil, at most has contributed loanwords for only six of the LWT meanings. I am tentative about the latter statistic because direction of borrowing has not been definitively determined, that is, the six words may just as well be loans into Tojolab’al from Tzotzil as the other way around. In fact, other than Spanish words and one loan from Nahuatl (Uto-Aztecan), no terms for LWT meanings from any other language are definitively determined to be loanwords into Zinacantán Tzotzil.²

The following is a statistical account of possible loans into Tzotzil that could have been donated by languages other than Spanish or Nahuatl: Tojolab’al (Mayan): 6 possible loans; Tzeltal (Mayan): 4; K’iche’ (Mayan): 3; Chuj (Mayan): 2; Ch’ol (Mayan), Chontal (Mayan), Huastec (Mayan), Itzaj (Mayan), Q’eqchi’ (Mayan), Mocho (Mayan), Proto-Greater Q’anjob’alan (Mayan), Proto-Yucatecan (Mayan), and Proto-Mixe-Zoquean (Mixe-Zoquean): 1 possible loan each.³ In addition, Tzotzil may have three calques and a single semantic loan based on words pertaining to the vocabulary of its very close sister language, Tzeltal, but it is just as likely that these involve Tzeltal doing the calquing and acquiring the semantic loan. The one certain loan from a non-Spanish language into Tzotzil is xamit ‘adobe’ (< Nahuatl xamitl ‘adobe’).

Table 1 presents numbers of loanwords in Zinacantán Tzotzil organized by semantic fields and presented as percentages. Loanword percentages from languages other than Spanish are very low and do not contribute much to any generalizations concerning the association of loanwords in Tzotzil with semantic fields. From the statistics of Table 1 it emerges that 16% of Zinacantán Tzotzil’s vocabulary may have been borrowed from other languages. (This compares with 15% found for Q’eqchi’, the other Mayan language treated in this volume.) The four semantic fields showing the highest loan percentages are as follows (with percentages indicated): Modern world (66%), The house (31%), Law (31%), Food and drink (27%). The latter three fields have substantially large numbers of things and concepts associated with them that were unknown to Native Americans before their introduction to the New World by Europeans. The field Modern world is similar in that most of its associated items are innovations, things unknown to Native Americans until they were introduced to them by the dominant culture. Thus, it appears that most lexical borrowing of Zinacantán Tzotzil from Spanish has involved words for introduced items.

2 For the total corpus of approximately 30,000 words in his dictionary, Laughlin (1975: 24) gives the following breakdown of language origins for loans into Zinacantán Tzotzil: Chiapanec 3, Proto-Mixe-Zoquean 8, Nahuatl 12, Spanish 836, Tzeltal 2.

3 Most of these “possible” loans are highly problematic and should not be taken very seriously.

Table 1: Loanwords in Zinacantán Tzotzil by donor language and semantic field (percentages)

No  Semantic field                    Spanish    Other    Total  Non-loanwords
 1  The physical world                    5.8      1.6      7.4           92.6
 2  Kinship                               6.0      0.5      6.5           93.5
 3  Animals                              23.7      0.9     24.6           75.4
 4  The body                              5.9        –      5.9           94.1
 5  Food and drink                       26.3      0.4     26.7           73.3
 6  Clothing and grooming                17.8        –     17.8           82.2
 7  The house                            28.9      2.4     31.3           68.7
 8  Agriculture and vegetation           23.1      1.8     24.9           75.1
 9  Basic actions and technology         11.2      1.5     12.7           87.3
10  Motion                                3.7        –      3.7           96.3
11  Possession                           15.5        –     15.5           84.5
12  Spatial relations                     8.7        –      8.7           91.3
13  Quantity                             14.7        –     14.7           85.3
14  Time                                 26.0      0.7     26.6           73.4
15  Sense perception                      4.1        –      4.1           95.9
16  Emotions and values                  10.1        –     10.1           89.9
17  Cognition                            11.4        –     11.4           88.6
18  Speech and language                   9.7        –      9.7           90.3
19  Social and political relations       24.8        –     24.8           75.2
20  Warfare and hunting                  15.2        –     15.2           84.8
21  Law                                  30.8        –     30.8           69.2
22  Religion and belief                  23.3        –     23.3           76.7
23  Modern world                         62.6      3.2     65.8           34.2
24  Miscellaneous function words          7.7        –      7.7           92.3
    All words                            15.4              16.0           84.0

The printed table has separate columns for the non-Spanish donor languages (Tojolabal, Nahuatl, Q’eqchi’, Chontal, Chuj, Proto-Mixe-Zoquean). Their assignment to individual languages is not recoverable here, so their values are gathered under “Other”, each placed in the row where it accounts for the difference between the Spanish and Total columns. In the “All words” row these six columns read 0.2, 0.1, 0.1, 0.1, 0.1, and 0.1.

There are 180 LWT meanings rendered by Spanish loans in Zinacantán Tzotzil. I have evaluated each of these meanings with regard to whether or not it entails an object or concept that may have been encountered by Tzotzil speakers either in the modern era as an innovation or in earlier post-contact times as a European introduction. Of the 180 meanings, 117 (65%) entail objects or concepts judged to be introduced. Thus, around two-thirds of lexical borrowing into Tzotzil has involved Spanish words for introduced items.


4. Kinds of loanwords

Table 2 reports on loanwords in Zinacantán Tzotzil by semantic word class, with loanword frequencies presented as percentages. It is clear from this table that the vast majority of LWT meanings labeled by loans are nouns. This conforms with the above observation that most loans into Tzotzil name introduced objects and concepts.

Table 2: Loanwords in Zinacantán Tzotzil by semantic word class (percentages)

Word class        Spanish  Total loanwords  Non-loanwords
Nouns                23.2             24.1           75.9
Verbs                 0.6              0.6           99.4
Adjectives           11.7             12.4           87.6
Adverbs                 –              0.0          100.0
Function words        9.5              9.5           90.5
All words            15.4             16.0           84.0

The printed table also has columns for Tojolabal, Nahuatl, Q’eqchi’, Chontal, Chuj, and Proto-Mixe-Zoquean. Their values survive only as unaligned pairs (0.3 0.2; 0.1 0.1; 0.1 0.1; 0.1 0.1; 0.1 0.1; 0.6 0.1), and their row and column positions are not recoverable here.

5. Contact situations

Speakers of Zinacantán Tzotzil, like speakers of native languages all over the New World at the time of European intrusion, encountered introduced things never before seen or experienced and, of course, never before named in their languages. Naming these new things, which I call “items of acculturation” (Brown 1999), involved two basic strategies: (1) borrowing a European language term for the item, or (2) creating a native label for the item using the lexical resources of a mother tongue. Details of both of these strategies are discussed in Brown (1999). I focus here on New World language patterns in the naming of items of acculturation through lexical borrowing from European languages, and particularly how Tzotzil fits into them.

For evaluating lexical borrowing in Native American languages, I developed a list of 77 items of acculturation, consisting of things such as ‘book’, ‘cheese’, ‘horse’, ‘Saturday’ and so on (Brown 1999). Words for these items were assembled from 292 Native American languages distributed from the Arctic Circle to Tierra del Fuego. For each language, a loanword percentage is calculated: the number of items on the list that are named by European loans in the language, divided by the number of items for which the language has a label, multiplied by 100. The loanword percentage for Zinacantán Tzotzil is 77% (49/64, rounded). This number becomes informative in the context of loanword percentages calculated for other New World languages.
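The loanword percentage defined above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and its argument names are assumptions made for this example and do not come from Brown (1999); the only figures used are the 49 loans and 64 labeled items reported in the text.

```python
def loanword_percentage(borrowed: int, labeled: int) -> float:
    """Percentage of labeled items of acculturation that are named by European loans.

    borrowed: number of list items named by a European loan (hypothetical name)
    labeled:  number of list items for which the language has any label
    """
    if labeled <= 0:
        raise ValueError("the language must have a label for at least one item")
    if not 0 <= borrowed <= labeled:
        raise ValueError("borrowed must lie between 0 and labeled")
    return 100.0 * borrowed / labeled


# Figures reported in the text for modern Zinacantán Tzotzil.
tzotzil = loanword_percentage(49, 64)
print(f"{tzotzil:.1f}%")  # 76.6%, which the text rounds to 77%
```

Note that the denominator is the number of items the language actually labels (64 here), not the full list of 77 items, so languages with incomplete documentation are not penalized for missing entries.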


Table 3, extracted from Brown (1999: 80), presents in rank order (from highest to lowest) average loanword percentages for languages of different geographic areas of the New World. The highest average percentage, 63%, is found for languages of California, and the lowest, 5%, for languages of the North American Plains. Notable is that average percentages for geographic areas with Spanish/Portuguese-influenced languages (California, North American Great Basin, North American Southwest, and all areas south of the Rio Grande River) are greater than average percentages for geographic areas with languages influenced by other European tongues, especially by English (all areas north of the Rio Grande save California, North American Great Basin, and North American Southwest). Means for averages of loanword percentages for languages of Spanish/Portuguese-influenced areas and non-Spanish/Portuguese-influenced areas are respectively 47% and 20%. Clearly, peoples of non-Spanish/Portuguese-influenced areas, such as many native groups of English-influenced North America, have been more resistant to lexical borrowing from European languages than have people of Spanish/Portuguese-influenced regions. This difference almost certainly relates to differential degrees of bilingualism in native and European languages among native speakers of New World languages.

Table 3: Average loanword percentages by New World geographic area (adapted from Brown 1999: 80)

Average (%)  Number of languages  Geographic area
         63                   15  California
         61                    1  North Central Andean Region
         58                   11  North Andean Region
         58                   17  South Central Andean Region
         56                    4  South Andean Region
         52                   63  Middle America
         52                    5  North America Great Basin
         44                   28  North-Central South America Tropical Forest
         41                    3  Central Brazil
         41                    4  Northeastern South America Tropical Forest
         39                   23  North America Southwest
         32                    3  South-Central South America Tropical Forest
         31                   11  Arctic
         27                    8  North America Plateau
         26                    4  Central South America Tropical Forest
         24                    2  Extreme South of South America
         23                   19  Sub-arctic
         21                    7  North America Northwest Coast
         18                   11  South America Gran Chaco
         14                   26  North America Northeast
          9                   11  North America Southeast
          5                   16  North America Plains


While it is extremely difficult, if not impossible, to accurately assess degree of bilingualism of post-contact Native Americans who lived centuries ago, enough is known about experiences of these peoples to indicate that in general speakers of English-influenced languages were significantly less bilingual than speakers of Spanish-influenced languages. The histories of native peoples of North America not influenced by Spanish are characterized by exploitation, forced migrations, removal to reservations, and lack of any long-term government and/or religious programs to integrate them culturally, socially, and linguistically into the dominant culture (see Berkhofer 1978). As a result, very few North American Indians are strongly bicultural today. Instead, they have either retained traditional cultural patterns or totally replaced them with those of the dominant culture. This has almost certainly resulted in a general tendency not to develop bilingualism. While it is true that many contemporary Native Americans in Anglo-influenced areas of North America speak English, many now lack knowledge of their native tongue. In most instances they have probably learned English without bilingualism as a significant intervening developmental stage. Such developments in part explain why many North American Indian languages are now extinct or close to extinction (cf. Diamond 1993: 82). In contrast, while Spanish conquerors certainly exploited Native Americans, Indian groups that were under the direct influence of Spanish-American culture were systematically brought into colonial society by both religious and secular authorities. The manner in which this was done had the effect of creating “creolized” cultures (Gillin 1947; Voegelin & Hymes 1953). “Creolization” entailed the retention of native languages by those Indians who became familiar with the language of their conquerors and, consequently, promoted bilingualism for people of many Latin American Indian groups. 
It is highly likely that the experiences of speakers of languages of North America not influenced by Spanish have been uniformly different from those of speakers of Spanish-influenced languages of Latin America (Gillin 1947). Clearly, the fact that speakers of the former languages have strongly resisted lexical borrowing, while speakers of the latter, including Tzotzil, have freely adopted European loans is linked to these different historical conditions (Brown 1999: 81–83).

The geographic area in which Tzotzil is spoken is Middle America which, with 52%, has a high average loanword percentage among New World geographic areas (see Table 3). With a percentage of 77%, Zinacantán Tzotzil (the modern version, 1950–1975) is among languages of Middle America having the highest loanword percentages. Table 4, extracted from Brown (1999: 75), gives in rank order (from highest to lowest) loanword percentages for 63 Middle American languages, for each of which is also given genealogical-group affiliation, e.g. Otomanguean, Mayan, Mixe-Zoquean. Among these 63 languages, Zinacantán Tzotzil (1950–1975) is tied with Zapotec (Mitla) for the eighth highest percentage. (Two other languages treated in this volume, Q’eqchi’ and Otomi, are listed in Table 4 with respective loanword percentages of 69% and 26%.)

Table 4: Loanword percentages for languages of Middle America (with names of languages mentioned in the text given in bold)

Language                             Genealogical group   %
Mocho                                Mayan               91
Totonac (Xicotepec de Juárez)        Totonacan           83
Cora                                 Uto-Aztecan         80
Pipil                                Uto-Aztecan         79
Nahuatl (Tetelcingo)                 Uto-Aztecan         78
Ixil                                 Mayan               78
Poqomam                              Mayan               78
Tzotzil (Zinacantán, 1975)           Mayan               77
Zapotec (Mitla)                      Otomanguean         77
Nahuatl (Xalitla)                    Uto-Aztecan         75
Tojolab’al                           Mayan               75
Tzotzil (San Bartolome)              Mayan               74
Tequistlatec                         Isolate             72
Jakaltek                             Mayan               71
Huave (San Mateo)                    Isolate             70
Popoluca (Sayula)                    Mixe-Zoquean        70
Q’eqchi’                             Mayan               69
North Mam                            Mayan               68
Awakatek                             Mayan               68
Mopan                                Mayan               68
Tzeltal (Bachajón)                   Mayan               67
Popoloca (San Vicente Coyoctepec)    Otomanguean         66
Cuicatec                             Otomanguean         62
Tzotzil (San Andrés)                 Mayan               61
Trique (Copala)                      Otomanguean         61
Zapotec (Juárez)                     Otomanguean         61
Kaqchikel (1956–1981)                Mayan               60
Totonac (La Sierra dialect)          Totonacan           58
Popoluca (Oluta)                     Mixe-Zoquean        58
Tzeltal (Tenejapa)                   Mayan               58
Ch’ol (Tila)                         Mayan               57
Otomi (Santiago Mexquititlán)        Otomanguean         55
Nahuatl (1611)                       Uto-Aztecan         53
Huichol                              Uto-Aztecan         52
Mazahua                              Otomanguean         49
Mixe (Totontepec)                    Mixe-Zoquean        48
Tzeltal (Oxchuc)                     Mayan               47
Otomi (1826–1841)                    Otomanguean         47
Nahuatl (Sierra de Zacapoaxtla)      Uto-Aztecan         46
Ixcatec                              Otomanguean         46
Zoque (Copainalá)                    Mixe-Zoquean        44
Kaqchikel (c. 1650)                  Mayan               44
Kaqchikel (Central)                  Mayan               43
Nahuatl (Huazalinguillo dialect)     Uto-Aztecan         42
Trique (Chicahuaxtla)                Otomanguean         41
Tzotzil (Zinacantán, 16th century)   Mayan               39
Huastec                              Mayan               38
K’iche’                              Mayan               37
Nahuatl (1571)                       Uto-Aztecan         36
Tarascan (1559)                      Isolate             35
Chatino (Tataltepec)                 Otomanguean         35
Mixtec (Chayuco)                     Otomanguean         30
Mixtec (San Juan Colorado)           Otomanguean         29
Otomi (Mezquital dialect)            Otomanguean         26
Otomi (c. 1770)                      Otomanguean         26
Tzeltal (1888)                       Mayan               22
Zapotec (1578)                       Otomanguean         21
Zoque (1672)                         Mixe-Zoquean        16
Zapotec (16th century)               Otomanguean         12
Amuzgo                               Otomanguean          9
Yucatec (1850–1883)                  Mayan                9
Chinantec (Ojitlán)                  Otomanguean          5
Mixtec (1593)                        Otomanguean          4
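The statement that modern Zinacantán Tzotzil is tied with Zapotec (Mitla) for the eighth highest percentage can be checked with a small ranking sketch. The code below uses only the first ten rows of Table 4; the function names and the choice of "competition" ranking (rank = 1 + number of strictly higher values, so ties share a rank) are assumptions made for this illustration, not something stated in Brown (1999).

```python
# First ten rows of Table 4 as (language, loanword percentage).
top_rows = [
    ("Mocho", 91),
    ("Totonac (Xicotepec de Juárez)", 83),
    ("Cora", 80),
    ("Pipil", 79),
    ("Nahuatl (Tetelcingo)", 78),
    ("Ixil", 78),
    ("Poqomam", 78),
    ("Tzotzil (Zinacantán, 1975)", 77),
    ("Zapotec (Mitla)", 77),
    ("Nahuatl (Xalitla)", 75),
]


def competition_rank(rows, language):
    """Rank of a language: 1 + number of languages with a strictly higher percentage."""
    scores = dict(rows)
    target = scores[language]
    return 1 + sum(1 for value in scores.values() if value > target)


def tied_with(rows, language):
    """Other languages that share exactly the same percentage."""
    scores = dict(rows)
    return [name for name, value in rows
            if value == scores[language] and name != language]


print(competition_rank(top_rows, "Tzotzil (Zinacantán, 1975)"))  # 8
print(tied_with(top_rows, "Tzotzil (Zinacantán, 1975)"))         # ['Zapotec (Mitla)']
```

Because the rank depends only on how many languages score strictly higher, the rows below 75% can be left out without changing the result for any language in this excerpt.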

For its earlier manifestation (based on the colonial dictionary; see Tzotzil (Zinacantán, 16th century) in Table 4), Tzotzil shows a loanword percentage of only 39%. Clearly, during the period of roughly 350 years from the time of the colonial dictionary to around the mid-twentieth century, Tzotzil prolifically added Spanish loans to its vocabulary, as is also borne out by data assembled in the immediately following section.

6. Developments from the late sixteenth century to the modern period

Only a handful of Native American languages are documented for time states separated by 300 years or more, and Zinacantán Tzotzil is one of these (see §2). From documents available for the language, it is possible to chart changes involving Spanish loanwords that occurred from the late sixteenth century to the modern era.

Among the 966 LWT meanings for which a word is found in modern Zinacantán Tzotzil, there are 801 for which a label is also found in the colonial dictionary. These include 670 instances in which a meaning with a native term in the earlier era also shows a native term in the modern era (typically the same one). These also include 131 instances in which one or both of the two time states show labels that are Spanish loans or labels that are compounds incorporating Spanish loans. Table 5 shows the changes observed among these 131 items (presented in order of frequency):

Table 5: Changes between sixteenth century and modern Tzotzil

Each entry gives the change type, the quantity, and examples.

Native term replaced by loan: 61 instances, 46.6% (61/131). Example: tzajal choy ‘shrimp’ (lit. ‘red fish’) → kamaron ‘shrimp’ (< camarón ‘shrimp’).

Loan persists: 30 instances, 22.9% (30/131). Examples: [higos]⁴ ‘figs’ → ʔik’ux ‘fig’ (< higos ‘figs’); [tijeras] ‘scissors’ → texerex ‘scissors’ (< tijeras ‘scissors’).

Native term replaced by native term with Spanish loan component: 14 instances, 10.7% (14/131). Example: jk’anojel ‘beggar’ → hk’an-limuxno ‘beggar’ (lit. ‘AGENTIVE-want/ask_for-alms’⁵) (< limosna ‘alms’).

Loan replaced by native term: 10 instances, 7.6% (10/131). Examples: [camisa] ‘shirt’ → moketeil ‘shirt’; [cuenta] ‘bead’ → satil ‘bead’; [puerta] ‘door’ → mak na ‘door’ (lit. ‘close house’).

Loan replaced by loan: 8 instances, 6.1% (8/131). Examples: [mula] ‘mule’ (< mula ‘mule’) → kaʔ ‘horse, mule’ (< caballo ‘horse’); [navaja] ‘razor’ (< navaja ‘razor’) → xilete ‘razor’ (< gillette ‘razor’ [from the brand name in English]).

Native term with Spanish loan component replaced by loan: 5 instances, 3.8% (5/131). Examples: castillan ʔixim ‘wheat’ (< Castilla ‘Castile’) (lit. ‘Spanish maize’) → triko ‘wheat’ (< trigo ‘wheat’); xinch’okil vakax ‘bull’ (< vacas ‘cows’) (lit. ‘husband cow’) → toro ‘bull’ (< toro ‘bull’).

Loan replaced by native term with Spanish loan component: 1 instance, 0.8% (1/131). Example: [lampara] ‘lamp’ → ʔav kas ‘lamp’ (< gas ‘kerosene’) (lit. ‘place of gas’).

Native term with Spanish loan component replaced by native term: 1 instance, 0.8% (1/131). Example: castillan tuluk’ ‘chicken’ (< Castilla ‘Castile’) (lit. ‘Spanish turkey’) → ʔalak ‘chicken’⁶.

Native term with Spanish loan component replaced by native term with Spanish loan component: 1 instance, 0.8% (1/131). Example: ʔut dios ‘to curse’ (< dios ‘God’) (lit. ‘scold God’) → ʔak’ ta maltisyon ‘to curse’ (< maldición ‘malediction’) (lit. ‘give curse’).

Native term with Spanish loan component replaced by native term with same Spanish loan component: 1 instance, 0.8% (1/131). Example: mulav caballo ‘stallion’ (< caballo ‘horse’) (lit. ‘sinful horse’) → batz’i kaʔ ‘stallion’ (< caballo ‘horse’) (lit. ‘true horse’).

4 Brackets encompass Spanish loans whose actual phonological shapes are not reported in the colonial dictionary, see §9.

5 This literal translation is supplied by John Haviland (personal communication).

6 According to Haviland (personal communication), ʔalak’ means domesticated bird of almost any kind, and its default reading varies from town to town.


7. Integration of loanwords

There are three major ways in which Spanish words have found their way into Tzotzil as loans: (i) words have been borrowed into the language as free (unbound) terms with their original meanings, (ii) words have been borrowed into the language as free words with changed meanings, and (iii) words have found their way into compound constructions of the language, juxtaposed with non-loan, native elements. A total of 193 Spanish words have been borrowed as counterparts or parts of compounds for LWT meanings. Of these, 137 or 71% have been borrowed with the original meaning intact, 45 or 23% have been borrowed with a meaning change, and 34 or 18% have been borrowed in a compound with a native constituent. (Some Spanish words have been borrowed in more than one of these three manners.)

Some examples of borrowing with semantic change are: mermeho ‘honey bee’ < bermejo ‘bright reddish’, kamaro ‘mortar’ < cámara ‘storage room’, karos ‘wheel’ < carro ‘cart’, sentavo ‘money’ < centavo ‘cent’, lastima ‘wound’ < lástima ‘pity, pitiful object’, mis ‘vagina’ < miz ‘cat’.

Some examples of borrowing in partially native compounds are: lok’es bala ‘shoot’ (lit. ‘dispatch shot’) < bala ‘shot’, batz’i kaʔ ‘stallion’ (lit. ‘true horse’) < caballo ‘horse’, kachimpa pom ‘honey bee’ (lit. ‘pipe incense’) < cachimba ‘smoking pipe’, sakil tranhero ʔis-ak ‘potato’ (lit. ‘white stranger sweet potato’) < extranjero ‘stranger’, pas hamparoal ‘greedy’ (lit. ‘be harmed abundance of food’) < jambado ‘abundance of food’, pala teʔ ‘oar’ (lit. ‘shovel wood’) < pala ‘shovel’. Such compounds by convention are not counted here as loanwords. However, their Spanish loan constituents are included among Spanish loanwords for Zinacantán Tzotzil listed in the appendix.

Phonological/orthographical correspondences are observed relating to the adaptation of Spanish words borrowed into Tzotzil. Some of the more interesting of these are presented in (1)–(11).

(1) Tzotzil k ~ Spanish g: kovyerno ~ gobierno ‘government’, ʔakuxa ~ aguja ‘needle’, rominko ~ domingo ‘Sunday’, ʔeklixya ~ iglesia ‘church’, ʔik’ux ~ higos ‘figs’, rextiko ~ testigo ‘witness’

(2) Tzotzil p ~ Spanish f: posporo ~ fósforo ‘match’, ʔimpyerno ~ infierno ‘hell’, palta ~ falta ‘fault’

(3) Tzotzil l ~ Spanish r (word-finally): mankornal ~ mancornar ‘to twist neck…’, pertonal ~ perdonar ‘to forgive’, tampol ~ tambor ‘drum’

(4) Tzotzil x ~ Spanish j: ʔakuxa ~ aguja ‘needle’, koxo ~ cojo ‘lame’, xavon ~ jabón ‘soap’, texerex ~ tijeras ‘scissors’

This correspondence was no longer pertinent after the end of the sixteenth century. At that time Spanish x /ʃ/ shifted to j /x/ (/x/ = velar fricative) (Spaulding 1942: 162, Harris 1969).

(5) Tzotzil x ~ Spanish s: riox ~ dios ‘God’, texerex ~ tijeras ‘scissors’, mexa ~ mesa ‘table’, kexu ~ queso ‘cheese’, vakax ~ vacas ‘cows’, byernex ~ viernes ‘Friday’

In the sixteenth century, Spanish apico-alveolar /s/ (orthographically s and heard by Tzotzil speakers as /ʃ/) shifted to plain /s/ (M. Stanley Whitley, personal communication; Harris 1969). This change, which probably occurred around the middle of the sixteenth century (see the discussion below), happened before Spanish x /ʃ/ shifted to j /x/; see (4) above. The correspondence was no longer pertinent after the shift.

(6) Tzotzil s ~ Spanish s: ʔespara ~ espada ‘sword’, savaro ~ sábado ‘Saturday’, sankre ~ sangre ‘blood’

This correspondence did not develop until Spanish apico-alveolar /s/ (orthographically s) shifted to plain /s/ (orthographically s) around the middle of the sixteenth century; see (5) above.

(7) Tzotzil s ~ Spanish c (before i and e), z: krus ~ cruz ‘cross’, servesa ~ cerveza ‘beer’, sin ~ cine ‘film’, maltisyon ~ maldición ‘(a) curse’, lurse ~ dulce ‘candy’

The orthographic segments c (before i and e) and z represent /s/ in Andalusian and in Latin American Spanish and /θ/ in the Spanish of Castile. These are from earlier /ts/ (Whitley, personal communication; Harris 1969).

(8) Tzotzil t ~ Spanish d (following a consonant and before a vowel): kavilto ~ cabildo ‘court’, kantela ~ candela ‘candle’, soltaro ~ soldado ‘soldier’

(9) Tzotzil r ~ Spanish d (between vowels): ʔespara ~ espada ‘sword’, savaro ~ sábado ‘Saturday’, soltaro ~ soldado ‘soldier’

(10) Tzotzil r ~ Spanish d (word-initially): rominko ~ domingo ‘Sunday’, riox ~ dios ‘God’

This correspondence developed before that of (11) below.

(11) Tzotzil l ~ Spanish d (word-initially): lóktor ~ doctor ‘physician’, lurse ~ dulce ‘candy’

This correspondence developed after that of (10) above.

Among its inventory of phonemes, Tzotzil shows five glottalized consonants (ejectives) corresponding to five plain consonants: ts’/ts, ch’/ch, k’/k, p’/p, and t’/t. Only one of the Spanish loans of this survey shows a consonant adapted to this feature: ʔik’ux (< higos ‘figs’).
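Correspondences (2), (3), (8) and (9) above can be read as ordered rewrite rules over Spanish spellings. The following is only a minimal sketch under that assumption: the regular expressions, the function name `adapt` and the restriction to lower-case, unaccented spellings are illustrative choices, not the author's formalism, and the remaining correspondences (for example g ~ k, or the two treatments of word-initial d) are deliberately left out.

```python
import re

# Ordered rewrite rules paraphrasing correspondences (2), (3), (8) and (9).
# The patterns are an illustrative assumption, not part of the source.
RULES = [
    (r"f", "p"),                          # (2) Spanish f -> Tzotzil p
    (r"(?<=[^aeiou])d(?=[aeiou])", "t"),  # (8) d after a consonant, before a vowel -> t
    (r"(?<=[aeiou])d(?=[aeiou])", "r"),   # (9) d between vowels -> r
    (r"r$", "l"),                         # (3) word-final r -> l
]


def adapt(spanish_word: str) -> str:
    """Apply the rules in order to a lower-case, unaccented Spanish spelling."""
    form = spanish_word
    for pattern, replacement in RULES:
        form = re.sub(pattern, replacement, form)
    return form


# Attested pairs taken from examples (2), (3), (8) and (9).
ATTESTED = {"falta": "palta", "perdonar": "pertonal", "soldado": "soltaro"}

for spanish, tzotzil in ATTESTED.items():
    print(spanish, "->", adapt(spanish), "| attested:", tzotzil)
```

The three attested forms were chosen because the four rules alone account for them; most other loans in the list also involve correspondences that the sketch does not model.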

8. Grammatical borrowing

There is no detailed information on grammatical borrowings in Zinacantán Tzotzil forthcoming at this time. An introductory paragraph from John B. Haviland’s (Laughlin 1988:I: 79) grammatical sketch of colonial Tzotzil sums up what is generally known about grammatical borrowing from Spanish:

“Tzotzil is a morphologically rich language that follows a mildly ergative patterning of verbal cross-indexing. In the past 400 years of constant and intimate contact with Spanish language and Mexican society, Tzotzil speakers have incorporated many Spanish words into their language, but have maintained the syntactic integrity of Tzotzil grammar with surprisingly few changes. Apart from ongoing variation in the phonological and morphological details of the language, modern Tzotzil differs most obviously from its colonial ancestor in its use of Spanish conjunctions and discourse devices to make explicit the logical links between clauses.”

9. Chronological stratification of Spanish loanwords

The late sixteenth century dictionary allows identification of some of the Spanish loans present in the language at that time. However, it provides very little evidence concerning the exact phonological characteristics of these loans. For example, the dictionary indicates that the language had a Spanish loan for ‘Saturday’ by presenting this entry: sabado idem. The actual Tzotzil term is not given. This is the case for nearly all Spanish loans recognized in the source.

The colonial dictionary does not report some Spanish loans that may have actually been present in the language. For example, the Spanish words lunes ‘Monday’ and miercoles ‘Wednesday’ do not occur in the work (and, of course, neither do Tzotzil loanwords from Spanish for these concepts). In the modern language, Tzotzil words for these are respectively lunex and melkulex. The phonology of these words indicates that they were borrowed from sixteenth century Spanish; see (5) above. Thus, these words must have been part of the sixteenth century language, but were omitted by the lexicographer.

In the case of some loans, we know of their occurrence in sixteenth century Tzotzil from the colonial dictionary, but, as explained above, we know little or nothing from that source concerning their phonology. However, in many instances dictionary-reported sixteenth century loans show up in the modern vocabulary. For example, this is so of the loan for ‘scissors’, which is texerex (< tijeras) in modern Tzotzil. We know from this word that it was borrowed before Spanish x shifted to j (around the end of the sixteenth century) and before Spanish apico-alveolar /s/ shifted to plain /s/; see (4) and (5) above.

The older dictionary reports a loan for ‘sword’ based on Spanish espada. Like the loan for ‘scissors’, this word also shows up in modern Tzotzil, i.e. as ʔespara. In this case, phonology indicates that the term was borrowed after Spanish apico-alveolar /s/ shifted to plain /s/; see (6) above. Thus, the latter shift must have taken place before the older dictionary was compiled (around the late sixteenth century). This means that ʔespara is a later loan than texerex (< tijeras), because the former realizes orthographic s of Spanish as s while the latter realizes it as x. There are other Spanish loans in the modern language, also reported in the colonial dictionary, whose phonologies are similarly indicative of chronological stratification. These suggest that the Spanish shift of apico-alveolar /s/ to plain /s/ occurred sometime around the middle of the sixteenth century.

Using such phonological information and reports of Spanish loans in the older dictionary, plus the semantic content of the loanwords themselves, Spanish loans in modern Tzotzil can be chronologically stratified with respect to when they entered the language. The list of Spanish loans in Zinacantán Tzotzil given in the appendix is organized in terms of this chronological stratification.

One strong pattern emerges in the chronological stratification of Spanish loanwords. Loans that are definitively attributable to the sixteenth century are substantially more often associated with the domain of religion than are those not definitively attributable to the sixteenth century. In this instance, a loan is judged religious in nature even if such a connection is not direct but only suggestive. An example of the latter is sankre (< sangre ‘blood’), whose borrowing into Tzotzil is plausibly linked to the high salience of blood in Catholicism as the blood of Christ. 30.0 percent of sixteenth century loans (12/40) have a religious connection, while only 4.7 percent of non-sixteenth century loans (4/85) show such an association. Given the prominence of Catholic clerics in early contact between Spaniards and natives, this is not surprising.
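The phonological reasoning behind the stratification described in this section can be sketched as a small decision procedure. This is a simplified illustration only: the function name `stratum`, the encoding of a loan as a set of (Spanish letter, Tzotzil letter) pairs, and the period labels (loosely modelled on the appendix headings) are assumptions made for the example, and the semantic evidence that the chapter also uses is not modelled.

```python
def stratum(correspondences, in_colonial_dictionary):
    """Return a coarse borrowing period for one Spanish loan.

    correspondences: set of (spanish_letter, tzotzil_letter) pairs seen in the loan.
    in_colonial_dictionary: True if the late sixteenth century dictionary reports it.
    Both the encoding and the labels are hypothetical simplifications.
    """
    if ("s", "x") in correspondences:
        # Spanish apico-alveolar s rendered as x: borrowed before the
        # mid-century shift to plain s, see (5).
        return "earlier sixteenth century"
    if ("j", "x") in correspondences:
        # Borrowed before Spanish x shifted to j at the end of the century, see (4).
        if ("s", "s") in correspondences:
            # Plain s as well: after the mid-century shift, see (6).
            return "later sixteenth century"
        return "sixteenth century"
    if ("s", "s") in correspondences:
        # Plain s only: after the mid-century shift; attestation in the
        # colonial dictionary caps the date at the late sixteenth century.
        if in_colonial_dictionary:
            return "later sixteenth century"
        return "mid-sixteenth century to modern"
    return "undetermined"


# texerex < tijeras: j ~ x and s ~ x, reported in the colonial dictionary
print(stratum({("j", "x"), ("s", "x")}, True))
# savaro < sabado: s ~ s, reported in the colonial dictionary
print(stratum({("s", "s")}, True))
# xavon < jabon: j ~ x, not reported in the colonial dictionary
print(stratum({("j", "x")}, False))
```

For these three loans the sketch returns the earlier sixteenth century, the later sixteenth century and the sixteenth century respectively, which is where the appendix lists texerex, savaro and xavon.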

10. Tzotzil and the Mesoamerican post-contact linguistic area

The chronological stratification of Spanish loanwords (see the appendix) is based on different lines of evidence, some of which permit more exact timing of borrowing than others. For example, as discussed in §9 above, some Spanish loans in modern Tzotzil are known from the colonial dictionary to have entered the language in the sixteenth century, while others are only presumed to be sixteenth century based solely on phonological characteristics. Examples of the latter are xavon (< jabón ‘soap’) and yevax (< yeguas ‘mares’). While these terms are not found in the older dictionary, they are assumed to have been borrowed during the sixteenth century before Spanish x /ʃ/ shifted to j /x/ (see [4] above) and before apico-alveolar /s/ shifted to plain /s/ (see [5] above), and, therefore, to have been part of the lexical inventory of sixteenth century Tzotzil.

However, some evidence suggests this not to be the case. A native Tzotzil word for ‘soap’ is found in the sixteenth century dictionary, i.e. ch’upak’, as is a native word for ‘mare’, i.e. antzil caballo (< caballo ‘horse’) (lit. ‘female horse’). Thus, unlike the concepts ‘Monday’ and ‘Wednesday’ noted above, the concepts ‘soap’ and ‘mare’ are not dictionary omissions. This suggests that speakers of Tzotzil in the late sixteenth century were not familiar with Spanish words for ‘soap’ and ‘mare’ or, at the very least, had not yet incorporated these terms into their language. Now if they borrowed the words directly from Spanish after the end of the sixteenth century, that is, after the colonial dictionary had been compiled, then they would have been borrowed respectively as *havon (< jabón) and *yevas (< yeguas) (rather than as the observed xavon and yevax) because by that time Spanish x /ʃ/ had everywhere changed to j /x/ and Spanish apico-alveolar /s/ had everywhere changed to plain /s/. Under this interpretation, the only logical sources for Tzotzil xavon and yevax would have been other Native American languages that had borrowed the terms directly from Spanish before the end of the sixteenth century, that is, before x → j and apico-alveolar /s/ → plain /s/ had fully run their courses.

That Tzotzil xavon ‘soap’ and yevax ‘mare’ may not have been borrowed directly from Spanish, but rather indirectly from some other Native American language or languages, is not especially surprising. Indirect diffusion of European loans for introduced items to Native American languages is a well-documented development (Brown 1999).
In some instances in the Americas, such diffusion has been so extensive that it has produced post-contact linguistic areas. A linguistic area is apparent when geographically contiguous languages, including both genealogically related and unrelated ones, share many linguistic features because of areal diffusion (i.e. borrowing). In the New World, when shared features are mostly names for European introduced objects and concepts, linguistic areas are clearly post-contact developments. I have documented five such linguistic areas for the Western Hemisphere (Brown 1999: 144–157): Southeastern United States, Pacific Northwest, Andean Region, Mesoamerica, and Tropical Forest South America. Associated with each of these areas is a major lingua franca that probably played an important role in the wide diffusion of post-contact linguistic features: respectively for each of the five areas, Mobilian Jargon, Chinook Jargon, Quechua, Nahuatl, and Tupi/Guaraní.

Tzotzil was a participant in the post-contact linguistic area of Mesoamerica. Mesoamerica, a culture area, roughly encompasses southern Mexico, including the Yucatan Peninsula, and northern Central America. Table 6, which is an adaptation and expansion of Table 11.4 in Brown (1999: 152–154), gives the distributions of nine unequivocal post-contact linguistic features across 66 languages and dialects of Mesoamerica and adjacent areas. These nine features (A–I) are a mixture of native terms, native terms that are semantically extended European loans, and loan shifts for introduced items. Also given as expanded information are the distributions of five Spanish loans for five European introduced items (a–e), ‘cow’, ‘soap’, ‘needle’, ‘scissors’, and ‘table’, all of which are found in Zinacantán Tzotzil: respectively, vakax (< vacas ‘cows’), xavon (< jabón ‘soap’), ʔakuxa (< aguja ‘needle’), texerex (< tijeras ‘scissors’), and mexa (< mesa ‘table’).

Table 6:

Distribution of post-contact language features across languages of Mesoamerica and adjacent areas (adapted from Brown 1999: 152–154). Columns: A B C D E F G H I a b c d e. Rows:

A. Mesoamerican languages
Uto-Aztecan: Nahuatl, Pipil
Mayan: Huastec, K’iche’, Kaqchikel, Tz’utujiil, Poqomam, Poqomchi’, Q’eqchi’, Mam, Teko, Ixil, Jakaltek, Chuj, Tojolab’al, Ch’ol, Chontal, Ch’olti’, Ch’orti’, Tzeltal, Tzotzil, Yucatec, Itzaj, Mopan
Mixe-Zoquean: Totontepec Mixe, Sayula Popoluca, Oluta Popoluca, Old Zoque, Copainalá Zoque, Francisco León Zoque, Rayón Zoque, Sierra Popoluca, Texistepec
Totonacan: Totonac
Otomanguean: Otomi, Mazahua, Matlazinca, Chinantec, Tlapanec, Chiapanec, Mixtec, Cuicatec, Amuzgo, Mazatec, Ixcatec, Chocho, Popoloca, Zapotec, Chatino
Xincan: Xinca
Others: Huave, Cuitlatec, Tarascan, Tequistlatec

B. Adjacent-area languages
Uto-Aztecan: Pima, Eudeve, Tarahumara, Cora, Huichol
Yuman: Cocopa
Misumalpan: Sumu, Miskito
Chibchan: Rama
Others: Seri, Jicaque, Lenca

A Native term for CAT derived from Classical Nahuatl mizton (e.g. Totontepec Mixe miistu; Cora místun; Seri miist).

B Native term for GOAT (sometimes SHEEP) derived from Classical Nahuatl tentzone (e.g. Ixil tentzun; Mazatec tentsun; Huave teants; see Brown 1999: ch. 8).

C Loan shift for SHEEP derived from a word for COTTON (realized through overt marking, e.g. “cotton deer” = SHEEP, or as polysemy; see Brown 1999: ch. 3).

D Loan shift for BREAD derived from a combination of Classical Nahuatl caxtillan plus a word for TORTILLA (e.g. Classical Nahuatl caxtillan tlaxcalli; Yucatec castran uah; Huave peats castil; see Brown 1999: ch. 3, 8).

E Loan shift for CHICKEN/HEN (occasionally ROOSTER) typically derived from a combination of Classical Nahuatl caxtillan plus a term for TURKEY or BIRD (e.g. Classical Nahuatl caxtillantotolin ‘chicken’; Lenca cashlanmúni ‘hen’), sometimes truncated to a derived form of caxtillan (e.g. Tlapanec #ti la ‘chicken, hen’; Mopan cax ‘hen’; see Brown 1999: ch. 3).

F Loan shift for WHEAT (or, rarely, some other imported grain) derived from Classical Nahuatl caxtillan plus a term for MAIZE (e.g. Classical Nahuatl caxtillan tlaulli; Chiapanec nama katila; Cakchiquel kaxlan ixim).

G Native term for MONEY, based on an archaic Spanish term, tomín, which denoted a specific currency denomination – one-eighth of a peso (e.g. Pame tumin; Huave tomian; Seri tom; see Brown 1999: ch. 8).

H Loan shift for HORSE derived from a term for DEER, typically realized in marking reversals (e.g. Huastec “Huastec horse” = DEER) and as polysemy (see chapter 8 and also discussion of Nahuatl in Brown 1999: ch. 7).

I Native term for CHICKEN/HEN/ROOSTER derived from Classical Nahuatl totolin (e.g. San Bartolome Tzotzil totórin ‘rooster’; Pame tolôn ‘chicken’; Tarahumara totorí ‘hen’).

a Loanword for COW, based on the sixteenth century realization of Spanish vacas ‘cows’, which was phonologically /vakaʃ/ (e.g. Xalitla Nahuatl w$kax; Huastec p$cax; Tequistlatec galwagax; Cuitlatec wakáši).

b Loanword for SOAP, based on the sixteenth century realization of Spanish jabón ‘soap’, which was phonologically /ʃabon/ (e.g. Xalitla Nahuatl xapon; Mazahua šabo; Rayón Zoque šabut).

c Loanword for NEEDLE, based on the sixteenth century realization of Spanish aguja ‘needle’, which was phonologically /aguʃa/ (e.g. Xalitla Nahuatl ac%xa; Mocho ʔaku:šah; Mixtepec Zapotec guzh).

d Loanword for SCISSORS, based on the sixteenth century realization of Spanish tijeras ‘scissors’, which was phonologically /tiʃeraS/ (e.g. Mocho te!e:re!; Chuj texlex). Sixteenth century Spanish /S/ was the apico-alveolar fricative.

e Loanword for TABLE, based on the sixteenth century realization of Spanish mesa ‘table’, which was phonologically /meSa/ (e.g. Huastec m&xa; Mocho me:šah; Mixtepec Zapotec mèzh). Sixteenth century Spanish /S/ was the apico-alveolar fricative.


In order for a Spanish loan to be judged present for a language in Table 6, it must match the corresponding Tzotzil form exactly with respect to certain phonological features: (1) the Spanish loan for ‘cow’ must show a word-final /ʃ/ (thus, for example, Huastec paacax qualifies, but Jicaque waca does not), (2) the loan for ‘soap’ must show a word-initial /ʃ/ or /s/ (thus, Texistepec shapuun and Chichimec sábos qualify, but Yucatec jáabon does not), (3) the loan for ‘needle’ must show a non-first consonant that is /ʃ/ (thus, Tequistlatec -guxa qualifies, but Tojolab’al aguja does not), (4) the loan for ‘scissors’ must show second and final consonants that are both /ʃ/ (thus, Cuitlatec tišeráši qualifies, but Q’eqchi’ tišeer and Yucatec tijeeras do not), and (5) the loan for ‘table’ must show a word-medial or stem-final 7 1 /ʃ/ (thus, Mocho me:šah qualifies, but Ixcatec me sa does not).

All 14 features of Table 6, A–I and a–e, are frequently found in languages of Mesoamerica and show broad distributions. The major difference between features A–I and features a–e is that none of the former could have been borrowed directly from Spanish, while all of the latter could have been borrowed directly from Spanish. In other words, given the nature of features A–I, their wide distribution is almost certainly the result of borrowing among Native American languages, while given the nature of features a–e, their broad distribution could be the result of diffusion, but also the result of independent, direct borrowing from Spanish by individual languages. Distributions of both types of features are very similar. Consequently, it seems unlikely that the a–e distributions are explained totally by independent, direct borrowing from Spanish. Undoubtedly, some of these five features were sometimes independently and directly borrowed from Spanish, but probably more than just a few were not.
It would be surprising to discover that the sociolinguistic conditions that promoted the wide diffusion of features A–I, which are associated with European introduced objects and concepts, did not also promote diffusion of Spanish loanwords for introduced items. If words for ‘cow’, ‘soap’, ‘needle’, ‘scissors’, and ‘table’ diffused indirectly from Spanish across languages of Mesoamerica, it is likely that Spanish loans for many other items of acculturation diffused indirectly as well. There is, then, much more work to undertake in fleshing out dimensions of the Mesoamerican post-contact linguistic area.

Only one language of Table 6 possesses all 14 post-contact linguistic features, and, as it happens, this is the target language of this study, Tzotzil. (This is Tzotzil as a conglomerate of its dialects; Zinacantán Tzotzil on its own does not show all 14 features.) The language showing the next highest number is Nahuatl (Aztec) with 12 features (also a conglomerate of its dialects, but Classical Nahuatl on its own possessed features A–I). The large number of features shown by Nahuatl is consistent with the fact that six of the features, A, B, D, E, F, and I, unquestionably originated in the language. In addition, as the major native lingua franca of Mesoamerica at the time of the conquest, Nahuatl was in a sociolinguistically privileged position to strongly influence wide diffusion of post-contact features.

The large number of features shown by Tzotzil suggests it was a very early and robust participant in the formation of the Mesoamerican post-contact linguistic area. This may have involved significant interaction with speakers of Nahuatl in the very early post-conquest era. As noted in §1, a large number of speakers of Nahuatl, Aztec and Tlaxcalan soldiers, were posted in early post-contact times in Ciudad Real, just 10 km from Zinacantán.

11. Conclusion

Zinacantán Tzotzil has been a heavy borrower of words from Spanish, as has the other Mayan language treated in this volume, Q’eqchi’. However, unlike in Q’eqchi’, interaction with sister languages has not led to a significant incorporation of loans from other Mayan languages. The sixteenth-century dictionary for Zinacantán Tzotzil and phonological evidence document substantial Spanish loanword acquisition by the language from the very early post-contact era into modern times. Some of the loans into Tzotzil, especially older ones, were probably not acquired directly from Spanish, but rather indirectly through other Native American languages to which they had diffused earlier. Such borrowing was part of an extensive diffusion of post-contact linguistic features across languages of Mesoamerica that resulted in the formation of a post-contact linguistic area.

Acknowledgments

Gene Anderson, Pamela Brown, Lyle Campbell, Martin Haspelmath, John Haviland, Robert M. Laughlin, M. Stanley Whitley, and Søren Wichmann either read and commented on earlier drafts of this paper or contributed to it in other important ways. I am grateful to them for their help.

References

Berkhofer, Robert F., Jr. 1978. The white man's Indian: Images of the American Indian from Columbus to the present. New York: Knopf.
Brown, Cecil H. 1999. Lexical acculturation in Native American Languages. New York: Oxford University Press.
Brown, Cecil H. 2009+. Development of agriculture in prehistoric Mesoamerica: The linguistic evidence. In Staller, John E. & Carrasco, Michael (eds.), Pre-Columbian foodways: Interdisciplinary approaches to food, culture, and markets in Mesoamerica. Springer Publishers.
Calnek, Edward. 1962. Highland Chiapas before the Spanish conquest. Ph.D. thesis. University of Chicago.
Campbell, Lyle R. & Kaufman, Terrence. 1985. Mayan linguistics: Where are we now. Annual Review of Anthropology 14:187–98.
Diamond, Jared. 1993. Speaking with a single tongue. Discover February:78–85.
England, Nora C. 1994. Ukuta'miil, ramaq'iil, ttzijob'aal: Ri maya' amaaq'. Autonomía de los idiomas mayas: Historia e identidad. Guatemala City: Editorial CHOLSAMAJ.
Gerhard, Peter. 1979. The southeast frontier of New Spain. Princeton, NJ: Princeton University Press.
Gillin, John. 1947. Modern Latin American culture. Social Forces 25:243–248.
Harris, James W. 1969. Sound change in Spanish and the theory of markedness. Language 45:538–552.
Haviland, John B. 1988. It’s my own invention: A comparative grammatical sketch of Colonial Tzotzil. In Laughlin, Robert M. (ed.), The great Tzotzil dictionary of San Lorenzo Zinacantán, 79–121. Washington, D.C.: Smithsonian Institution Press.
Laughlin, Robert M. 1975. The Great Tzotzil Dictionary of San Lorenzo Zinacantán. Washington, D.C.: Smithsonian Institution Press.
Laughlin, Robert M. 1988. The great Tzotzil dictionary of Santo Domingo Zinacantán with grammatical analysis and historical commentary. Vol. I: Tzotzil-English, Vol. II: English-Tzotzil, Vol. III: Spanish-Tzotzil. Washington, D.C.: Smithsonian Institution Press.
Remesal, Antonio. 1932. Historia general de las Indias Occidentales, y particular de la governación de Chiapa y Guatemala. Tomo I. Guatemala: Biblioteca “Goathemala” de la Sociedad de Geografía e Historia.
Spaulding, Robert K. 1942. How Spanish grew. Berkeley: University of California Press.
Voegelin, Charles F. & Hymes, Dell H. 1953. A sample of North American Indian dictionaries with reference to acculturation. Proceedings of the American Philosophical Society 97:634–644.
Vogt, Evon Z. 1969. Chiapas Highlands. In Vogt, Evon Z. (ed.), Handbook of Middle American Indians, Vol. 7: Ethnology, Part One, 133–151. Austin: University of Texas Press.
Wichmann, Søren & Brown, Cecil H. 2000. Panchronic Mayan dictionary. Electronic manuscript.
Wichmann, Søren & Brown, Cecil H. 2003. Contact among some Mayan languages: Inferences from loanwords. Anthropological Linguistics 45(1):57–93.

Loanword Appendix

Spanish (including Mexican Spanish)

byernex

Friday

martex

Tuesday

hk’an-limuxna

beggar

mayxtro

teacher

kaxlan

chicken

melkulex

Wednesday

Earlier sixteenth century

kexu

cheese

mexa

table

ʔeklixya

church

krixchano

person

mixail

wedding

ʔik’ux

fig

lavux

nail

rextiko

witness

wasp

lunex

Monday

riox

God

ʔovixpo

rominko

Sunday

ʔora

time, hour

pas hamparoal

greedy

xemana

week

ʔoro

gold

pas preva

to taste

xevu

grease

beneno

poison

pas rason

wise

xila

chair

buro

donkey

pelota

ball

texerex

scissors

choris

sausage

pimenta

pepper

vakax

cow, bull

harina

flour

plato

plate

yevax

mare

horno

oven

pletu

kalpintero

carpenter

quarrel, war battle,

kamaro

mortar

pulatu

bowl

kamaron

shrimp

pwersa

necessity

field

riata

lasso rich

Later sixteenth century

ʔespara

sword

sankre

blood

savaro

Saturday

kantil

lamp, torch

riko

Sixteenth

century

karina

chain

rinyon

kidney

ʔakuxa

needle

karos

wheel

rosa

pot

ʔaltal

altar

kastiko

punishment

sapato

shoe

bino

wine

kavilto

court

sempre

always

kaʔ

horse

kaya

road, street

sentavo

money

kantela

candle

komonta

to share

syen

hundred

koxo

lame

korneta

trumpet

tampol

drum

krus

cross

kova

spade

toro

bull

lech

milk

kovyerno

government

traste

lonkanisyo

sausage

krontail

enemy

dish, frying pan

martio

hammer

kuchilu

knife

triko

wheat

wey

ox

yave

key

kampo

pale

priest

lagarto

crocodile

reloho

clock

lagrio

brick

rey

king

lansa

spear

xavon

soap

laso

rope

tyenta

shop, store

lastima

wound

leon

lion

lok’es bala

to shoot

loro

parrot

maltisyon

crime

mas

more

mil

thousand

moso

servant

mulino

mill

multa

ʔaktavus

bus

ʔavyon

airplane

ʔeleksyon

election

ʔinteksyon

injection

banko

bank

bateria

battery

xilete

razor

karo

bus

lus

light, electricity

(a) fine

sin

movie

pala

shovel

timpre

stamp

and

pan

bread

turismu

car

hell

pas baik ta preva

to fight

Seventeenth century to modern

ʔak bentisyon

to bless

ʔak bwelta

to turn

ʔalkuʔ

arch

ʔanima

soul

ʔasaluna

hoe

ʔav kas

lamp

ʔera

threshing floor

ʔi ʔimpyerno

Modern


Mid-sixteenth century to modern

koko

coconut

konxa

shell

ʔaros

rice

kontento

happy

ʔasuka

sugar

kuchara

spoon

patix

lizard

ʔosov

bear

kwenta

debt

sik’alal

cigarette

baka

cow

livro

book

hweves

Thursday

lok’esob manya

punishment

karsa

heron

lukar

place

listo

ready

lumero

number

physician

luna

moon

lurse

candies/sweets

mankornal

yoke

mis

cat

marchante

merchant

Q’eqchi

posporo

match

meko

white

tzu

preserente

president

mermeho

bee

sentro

circle

napach

raccoon

K’iche

senya

scar

pareho

similar

chiʔilil

seraʔaha-pom

beeswax

periko

parrot

serio

match

pino

good

servesa

beer

plóho

lazy

tz’et

soltaro

soldier

porke

because

Chontal

sorto

deaf

potrero

pasture

surto

left

potro

stallion

lóktor

pas preva Chuj

Mocho vok

foam

Proto-Greater Q’anjob’alan moch

basket

gourd

sibling, relative

Ch’ol

p’ok

primero

first

Huastec

Sixteenth century to modern

púro

only

yan

ʔak pertonal

forgive

potato

ʔak punto

lead

sakil tranhero ʔis-ak

animal

taro

bamboo

ʔantivo

ancestors

tyempo

time, season

ʔilera

thread

ventana

window

ʔok’es tiro

shoot

Tojolab’al

botin

boot

boton

button

chamaroil

blanket

chivo

goat

hk’an-parte

plaintiff

kachimpa pom

bee

kahve

coffee

Tzeltal

karnero

ram

ch’un

to obey

katuʔ

cat

hten-tak’in

blacksmith

kinya

banana

kaxlan vah

bread

ʔanimal

to try

left

knot

more

Itzaj tzemen

tapir

Proto-Yucatecan tzotz

hair

k’anal

star

Proto-Mixe-Zoquean

satil

face, eye

ʔunen

ti

that

tzapo

short

vet

fox

baby

Nahuatl xamit

adobe

Unknown origin k’uk’umal

feather

Chapter 34

Loanwords in Q’eqchi’, a Mayan language of Guatemala* Søren Wichmann and Kerry Hull 1. The language and its speakers Q’eqchi’ belongs to the K’iche’an branch of Mayan languages (see Figure 1). It boasts somewhere between 360,000 and 400,000 speakers in 23 different municipalities in Guatemala (see Map 1). The majority of Q’eqchi’ speakers reside in Alta Verapaz in San Pedro Carchá, Cobán, and Chisec, with other large concentrations of speakers in Fray Bartolomé de las Casas, Santa Catalina La Tinta, Panzós, and Senahú. Outside of the Alta Verapaz region, Q’eqchi’ communities exist in Baja Verapaz and El Quiché. Additional Q’eqchi’ speakers are found in the Petén, primarily in San Luis and Sayaxche, and in Livingston, Izabal. Multiple groups of Q’eqchi’ have also migrated to Belize. Due to the institution of the mandamiento in 1877 by the Guatemalan government, a law which essentially authorized employers to pay Q’eqchi’ workers little or nothing at all, land disputes in Alta Verapaz, and suffering under the hands of the German-run coffee plantath tions in the late 19 century, some Q’eqchi’ migrated to the southern Toledo District of Belize. Most of these early immigrants to Belize came from San Pedro Carchá near Cobán (Thompson 1930: 36). Today, the Q’eqchi’ of Belize have established more than 30 villages, such as San Pedro, Santa Elena, San Pedro Columbia, Otoxa, Machaca, Dolores, and Mabilha. While they have successfully escaped the difficult circumstances in Guatemala, life is still a struggle for the Q’eqchi’ in Belize, as they are poorest of all ethnic groups in that country (Wilk & Chapin 1990: 18).

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Wichmann, Søren & Hull, Kerry. 2009. Q’eqchi’ vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1774 entries.


Søren Wichmann and Kerry Hull

[Tree diagram: Proto-Mayan branches into Huastecan (Huastec, Chicomuceltec); Yucatecan (Yucatec, Lacandon, Mopan, Itzaj); Greater Tzeltalan, comprising Ch’olan (Chontal, Ch’ol, Ch’orti’, Ch’olti’) and Tzeltalan (Tzotzil, Tzeltal); Greater Q’anjob’alan, comprising Chujean (Tojolab’al, Chuj) and Q’anjob’alan (Q’anjob’al, Akateko, Jakalteko, Mocho’); and Eastern Mayan, comprising Mamean (Mam, Teko, Ixhil, Awakateko) and K’iche’an (Q’eqchi’, Uspanteko, Poqomchi’, Poqomam, Sakapulteko, Sipakapense, K’iche’, Kaqchikel, Tz’utujiil).]

Figure 1: The classification of Mayan languages (from Wichmann & Brown (2003: Fig. 1), which is adapted from Campbell & Kaufman (1985: Fig. 1))

34. Loanwords in Q’eqchi’


Map 1: Q’eqchi’ in its geographical setting

The number of Q’eqchi’ in Belize continues to rise at a slow but steady rate even today. According to Gordon (2005), there were about 9,000 Q’eqchi’ speakers in Belize in 1995, though a detailed census of 30 communities by Wilk in 1984 showed a population of just 4,388 (Wilk & Chapin 1990: 18). Most recently, civil war in Guatemala sent various waves of refugees into Belize seeking safety and reprieve, further increasing the Belizean Q’eqchi’ population. Many of the Q’eqchi’ in Belize today speak English in addition to Q’eqchi’, and a considerable portion of the population can also speak the language of the Mopan Maya, who are the neighbors of the Belizean Q’eqchi’. Bilingualism in both Mayan languages, together with their close geographic proximity, has created an environment where lexical borrowing between the two Mayan languages is not uncommon (Hull n.d.; Wichmann & Brown 2003: 85).

Most Q’eqchi’ speakers live in the Verapaz region of Guatemala. Today, there are over 850,000 indigenous people in Alta Verapaz, most of whom are ethnic Q’eqchi’. Historically, the homeland of the Q’eqchi’ of Guatemala has always been in the Alta Verapaz, stretching back possibly 1,700 years (Campbell 1977).


Archaeologically, the entire area encompassing El Quiché, Alta Verapaz, and Baja Verapaz has been termed the Hilly Middle Country. The archaeology of this region from the time when we can begin to reckon with the earliest form of Q’eqchi’ has been summarized as follows by King (1974: 13–14), based on Borhegyi (1965). The period 300–700 CE witnessed the emergence of ceremonial sites with temple structures, ball courts, and large elite tombs. Agricultural terraces and possibly irrigation begin to appear. Roughly around 400 CE influence from Central Mexico is seen in the urban centers, apparently causing changes in the nature of religious cults. Possibly there was an actual movement of people from the Teotihuacan sphere who were sent out to gain control over the cotton and cacao production. This early period is, however, not well known. During 700–1000 CE, a strong influence from the lowland Maya area can be detected in the presence of the black-and-white-on-red pottery. This, as we shall see, harmonizes well with the linguistic evidence for contact. Possibly because of threats from outsiders, urban centers situated in valleys are abandoned and people retreat to defensible hilltop sites. During 1000–1524 CE, major urban centers are again found in valleys, and this period sees “an accentuation of the isolated or independent quality of this area intensified again” (King 1974: 14). King stresses that the key to understanding the nature of this region is its isolated character which, at times, cuts it off from “main trends of Middle American culture history.” In colonial times, unlike most other Maya groups in Guatemala, the Q’eqchi’ were successful in repelling the military advances of the Spanish. As one early colonial chronicler put it, they were “feroz y bárbara e imposible de domar y sujetar” [“ferocious and barbarous and impossible to pacify and subjugate”] (Remesal 1966[1619]: 311). 
Thus, after repeated attempts to conquer them, the first of which took place in 1529 (King 1974: 17), a compromise agreement was reached that allowed the Q’eqchi’ to remain independent but allowed Bartolomé de Las Casas and other Dominicans to proselytize in their territory (Wilk & Chapin 1990: 18). Lay Spaniards, however, were prohibited from entering the Verapaz region, contributing to its previously mentioned isolation (Schackt 2000: 15). Las Casas, who started his missionary campaigns in 1538 (King 1974: 18), was markedly more successful in bringing the Q’eqchi’ under some form of Spanish control through pacification and peaceful means than were the numerous military incursions into the region (Weeks 1997: 62; Kockelman 2003: 468). Following the missionary efforts, the Spaniards were able to apply their usual strategy of reducción, i.e. the gathering of natives in larger settlements. This led to the founding of Cobán, San Pedro Carchá, and San Juan Chamelco, to this day important Q’eqchi’ towns, as well as some towns mostly populated by Poqomchi’ Maya. The region could now receive its name Verapaz, ‘True Peace’, contrasting with its previous name Tezulutlán, ‘Land of War’.

During the 17th century there was intense contact between the Q’eqchi’ and the “Manche Chol”, who were speakers of Ch’olti’. The Manche Chol produced cacao, annatto, and vanilla in the region around the upper Rio Cancuen and would trade these goods in Q’eqchi’ towns of Alta Verapaz (Caso Barrera & Aliphat 2006). No


doubt this kind of interaction between the two ethnic groups had persisted since pre-Columbian times. The apparently peaceful coexistence changed, however. Chroniclers of the early colonial period mention that the Manche Chol would raid towns established by the Spaniards in Alta Verapaz and that Q’eqchi’ speakers would serve in military expeditions to the Manche area (King 1974: 23). Dominican control of the region continued for three centuries, with only a minimum of non-clerical Spanish or other foreign settlers in the area during this period. By the time of the recognized independence of the Mexican Republic in 1821, however, the Alta Verapaz had become open to settlers from the outside. During the nineteenth century it was mainly Europeans – especially Germans – who migrated to the area, attracted by the possibility of establishing coffee plantations (King 1974: 20–27). Through a new Guatemalan law of 1877 (already mentioned above) the procuring of labor and the expropriation of land that had previously belonged to the natives were made possible. Germans continued to dominate the area financially until the Second World War. Perhaps not accidentally, German scholars were also among the first to produce scientific Q’eqchi’ linguistic and ethnographic studies, e.g. Stoll (1896) and Sapper (1890, 1891, 1902). The ensuing situation for the Q’eqchi’ in the post-war 20th century was sketched above. Ironically, the survival and growth of the Q’eqchi’ language seems to have been furthered by the willingness of the Q’eqchi’ to at least partly accept Christianity and the Spanish ways. Their neighbors to the north and east, speakers of the Ch’olan language Ch’olti’ and the Yucatecan languages Mopan, Itzaj, and Lacandon, continued to be hostile towards the Dominicans and were not conquered militarily until the end of the 17th century (King 1974: 23–25). 
After having been suppressed, the Manche Chol were transferred to the highlands near Rabinal, and Ch’olti’ is now an extinct language which is only sparsely documented through a single colonial source (Moran 1695). As for the aforementioned Yucatecan languages, these are either seriously endangered or on the verge of extinction. Earlier, however, Mopan, Itzaj, Lacandon, and Ch’olti’ were more widespread, and loanwords, especially from Ch’olti’, attest to the nature of interaction of their speakers with the Q’eqchi’. Q’eqchi’ holds a unique distinction among all Mayan languages in Middle America as having the highest percentage of monolingual speakers (see Kaufman 1974). Whereas bilingualism in Spanish (or simply monolingualism in Spanish) has become commonplace among many Mayan groups, the majority of indigenous Q’eqchi’ of Alta Verapaz are monolingual in Q’eqchi’. Part of the retention of Q’eqchi’ as their primary language of discourse likely stems from their isolation in Alta Verapaz (Stewart 1980: xx), which traces back to the times of Fray Bartolomé de las Casas. In addition, even Ladinos (non-indigenous inhabitants) living in many Q’eqchi’ areas are commonly bilingual in Spanish and Q’eqchi’ (Campbell 1974: 277). The high status of the Q’eqchi’ language is also reflected in the fact that more Q’eqchi’ are literate in Q’eqchi’ than in Spanish, seeing the former as the most advantageous to merit their efforts (Stewart 1980: xx–xxi). With such a large linguistic community spread over distinct geographic areas, the Q’eqchi’ language has naturally developed dialectal variation, especially in the


area of phonology. For instance, in the Cahabón dialect, the proto-Mayan vowel length distinction is retained in monosyllabic forms, whereas other dialects have lost the distinction in all environments; the Cobán dialect exhibits an innovation whereby vowels are lengthened before a sequence of a sonorant and another consonant; and in Cobán, Carchá, and Chamelco the loss of h of CVhC and CVhVC forms has been compensated for by vowel lengthening (Campbell 1977: 25–26). An example of dialectal variation in the realm of morphology is found in the past tense system of the Belize variant, which has undergone several changes such as the reinterpretation of what corresponds to non-future forms ending in -k in the Cobán dialect as habituatives (DeChicchis 1990: 1495). However, it is highly noteworthy that despite its vast area of use and number of speakers, the dialectal variations in Q’eqchi’ are relatively minor. This unity of speech, according to Stewart (1980: xiii), can be attributed to the fact that until recently the Q’eqchi’ inhabited a much smaller geographic area than they do today. Additionally, the area of Baja Verapaz, which has a high concentration of Q’eqchi’ speakers today, was not a Q’eqchi’-speaking area in the past. Thus, a limited geographic space and near-total isolation for centuries after the arrival of the Spanish have helped to limit the scale of dialectal proliferation in the Verapaz region. In general, spoken Q’eqchi’ can be divided into two major dialects, a western and an eastern form. The Eastern Dialect consists primarily of the areas near Lanquin, Cahabón, and Senahú, whereas the Western Dialect (also sometimes referred to as the “Cobán dialect”) centers around Cobán, San Pedro Carchá, and Chamelco (Campbell 1977: 24). Perhaps the earliest notice of these dialect boundaries was made by Juarros (1936: 72) in 1800 (Freeze 1975: 16). 
Today, there is a prevailing consensus among Q’eqchi’ speakers that the Cobán (Western) Dialect is the prestige form of Q’eqchi’ (Campbell 1977: 24; Stewart 1980: xvii).

2. Sources of data

Lexical data for this research project were taken from a variety of published linguistic sources supplemented by direct elicitation when published sources were insufficient. The default source is constituted by the Diccionario Q’eqchi’ by Sam Juárez et al. (1997) of the Proyecto Lingüístico Francisco Marroquín. 1575 records in the subdatabase derive from this source. Another 365 records are due to fieldwork carried out by Kerry Hull in Cobán, Guatemala in 2006. Hull’s informants were all from Cobán or nearby communities and consisted of a 70-year-old male from Cobán, a 35-year-old male also from Cobán, a 32-year-old bilingual K’iche’ and Q’eqchi’-speaking female, and a 72-year-old male religious expert from an outlying community in the Cobán area. Additional sources include the Vocabulario Q’eqchi’ - Xtusulal Aatin Sa’ Q’eqchi’ published anonymously in 2004 by the Academia de Lenguas Mayas de Guatemala (28 records) and the 1955 Nuevo diccionario de las lenguas k’ekchi’ y española by the Protestant missionary linguist William Sedat (20 records). Finally, Kaufman & Justeson’s A Preliminary Mayan Etymological


Dictionary (2003) provided data from Kaufman’s own fieldwork in different Q’eqchi’-speaking areas as well as compiled resources from other publications on Q’eqchi’ words (7 records), and a single record derives from Freeze’s (1975) study A Fragment of an Early K’ekchi’ Vocabulary with Comments on the Cultural Content. The general (i.e., not always strictly observed) order of preference of sources was as follows: Sam Juárez et al. (1997) > Anonymous (2004) > Sedat (1955) > elicitation > other. The various sources were also used to supplement one another with information about phonological variants or alternative lexical items fitting a given meaning label; such alternative items not deriving from a more preferred source were, however, not etymologized and do not constitute separate records. Additional sources for such commentary on different records include Kockelman (2007), who used almost exclusively Q’eqchi’ data in his analysis, and Ponce E Hijos’ (1830) publication of a large number of Q’eqchi’ phrases and vocabulary in Vocabulario Quecchi-Español. For the purposes of the study at hand, the vast majority of the data we used reflect the Western Dialect. Data from Belizean Q’eqchi’, which show considerable innovation and variation (see DeChicchis 1989), and data from varieties outside the general Cobán-Carchá-Chamelco area of Alta Verapaz, were not used to any significant extent in order to maintain general internal consistency across the data. Comparative data from other Mayan languages used for studying the history of individual Q’eqchi’ words come from a vast repository of lexical sources feeding into Wichmann & Brown (2000–), a comparative dictionary database which includes 6710 cognate sets in addition to more than 40,000 entries for which etymologies have not currently been found. About half of the cognate sets include data from Kaufman & Justeson (2003), and much information also derives from Dienhart (1989). 
Loanwords in Q’eqchi’ from other Mayan languages have been the subject of previous studies. Justeson et al. (1985) list 24 Q’eqchi’ lexical borrowings from the Lowland Mayan languages, i.e. Yucatecan and Ch’olan, and they note that the diffusion of such vocabulary from the Ch’olan languages into others, in particular Q’eqchi’ and Yucatecan “has suggested to Mayan linguists over the past century that Cholan speakers were prominent in the formation of ancient Lowland Maya civilization” (Justeson et al. 1985: 9). That is, the authors associate lexical borrowings into Q’eqchi’ with the cultural dominance of Ch’olan, and it is assumed that Ch’olan languages are, indeed, responsible for at least most of the borrowings, even if not all of the words in question are attested or uniquely attested in Ch’olan. A more recent study, by Wichmann & Brown (2003: 65–69) (similar to, but superseding, Wichmann & Brown 2002), identified 134 borrowings or possible borrowings into Q’eqchi’ from other Mayan languages. Since Mayan languages have rather similar phonological inventories, it is rarely evident from the phonological shape of a word that it has been borrowed. Nine of the 134 words discussed by Wichmann & Brown (2003) were identified on phonological grounds as loans since they fail to undergo sound changes that characterize inherited Q’eqchi’ vocabulary; the rest were identified on distributional grounds. It was argued that if a word is


found in Q’eqchi’ but not in any other language of the large Eastern Mayan subgroup to which Q’eqchi’ belongs, then there is a good possibility that the word has been borrowed. The authors looked at cases where the words in question, apart from their attestation in Q’eqchi’, were only attested in a single subgroup of Mayan languages, and it turned out that this subgroup was Ch’olan in about 70% of all cases, with Yucatecan, Q’anjob’alan, and Chujean being represented as unique possible donors in about 10% of all cases each. These figures were assumed to be representative of the relative shares of lexical influence that languages of these different subgroups have had on Q’eqchi’. Given that the Lowland language groups Ch’olan and Yucatecan are responsible for some 80% of the borrowings that have only one subgroup as candidate for being the donor, it was assumed that Lowland languages were actually responsible for donating the item also in cases where the words were attested in subgroups other than the Lowland ones. The same strategies for identifying borrowings and assumptions concerning their origins are followed in the present paper. Wichmann & Brown (2003: 68–69) make a number of inferences regarding the nature of interaction between Q’eqchi’ speakers and speakers of other Mayan languages from the meaning of loanwords. We turn to these issues below. Through the present study some additional borrowings from other Mayan languages have been identified: 1 clear borrowing, 15 probable ones, and 14 for which the evidence is slim. The “clear” and “probable” borrowings are shown in Table 1, the rest may be sought out in the electronic subdatabase. Of the 134 loans identified in Wichmann & Brown (2003), 44 appear in this database. Table 1:

Clear or probable borrowings from other Mayan languages not previously identified in Wichmann & Brown (2003)

Q’eqchi’ form      Meaning label             Language or subgroup of origin
aj-                to wake up                Yucatecan
ch’ajom            the boy; the young man    Ch’olan or Yucatecan
hoy-               to pour                   Yucatecan
job’nil            the stomach               Ch’olan or Yucatecan
k’ams              the termites              Ch’olan or Yucatecan
lak’am ~ lakam     the shield                Ch’olan or Yucatecan
mi’                the vagina                Q’anjob’alan
mukuy              the dove                  Ch’olan or Yucatecan
pak’-              to mold                   Ch’olan or Yucatecan
pech’-             to thresh                 Ch’olan
pik-               to dig                    Ch’olan
po’lem             the hut                   Ch’olti’
tz’ul-             to plait/braid            Ch’olti’
xuxb’-             to whistle                Ch’olan or Yucatecan
xaal che’          the forked branch         Ch’olan or Yucatecan
yajel              the disease               Ch’olan or Yucatecan
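The distributional strategy summarized before Table 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually implemented by Wichmann & Brown (2003): the function names, the toy data, and the treatment of Eastern Mayan as the union of K’iche’an and Mamean (following Figure 1) are all introduced here for the example, and the word keys are placeholders rather than real forms.

```python
from collections import Counter

# Assumption for this sketch: Eastern Mayan = K'iche'an + Mamean (cf. Figure 1).
EASTERN_MAYAN = {"K'iche'an", "Mamean"}


def candidate_donors(attested_in):
    """Classify one Q'eqchi' word by where its cognates are attested.

    attested_in: set of subgroup names in which languages other than
    Q'eqchi' itself show the word.
    Returns None when another Eastern Mayan language has the word (it then
    looks inherited); otherwise the set of subgroups that could be donors.
    """
    if attested_in & EASTERN_MAYAN:
        return None
    return set(attested_in)


def tally_unique_donors(words):
    """Count, per subgroup, the words for which it is the only candidate donor."""
    counts = Counter()
    for attested_in in words.values():
        donors = candidate_donors(attested_in)
        if donors is not None and len(donors) == 1:
            counts[next(iter(donors))] += 1
    return counts


# Hypothetical attestation data; the keys are placeholders, not real words.
toy_words = {
    "word_a": {"Ch'olan"},
    "word_b": {"Ch'olan", "Yucatecan"},   # two candidate donors: not tallied
    "word_c": {"K'iche'an", "Ch'olan"},   # attested in Eastern Mayan: inherited-looking
    "word_d": {"Yucatecan"},
    "word_e": {"Ch'olan"},
}

shares = tally_unique_donors(toy_words)
print(shares)
```

Words with exactly one candidate subgroup feed the tally; the relative sizes of the resulting counts correspond to the kind of shares (about 70% Ch’olan, about 10% each for the others) that the chapter then extends to words with several candidate donors.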


Regarding Spanish borrowings, there is a paper by Campbell (1976) on native perceptions of the origins of lexical borrowings. Consultants from Cobán were asked their opinions about the origins of different words, consisting mostly of loanwords from Spanish, in addition to a few Mayan words that have been borrowed into the regional variety of Spanish. Speakers’ judgments were based on a mixture of phonological, semantic, and culture historical criteria and were analyzed by Campbell to determine possible insights into the psychological reality of different phonological phenomena and folk taxonomies that they provide. Brown (1994, 1999) investigated words for a sample of 77 objects and concepts that are prone to get acculturated in Native American languages, and his findings show that Middle American languages on average borrow 54% of the items in the list (Brown 1999: 89). The average for Mayan languages is 59%. The median is 67%, and Q’eqchi’ is close to this, having 69% (Brown 1999: 86). Thus, Q’eqchi’ may be considered representative of the languages of its region when it comes to Spanish lexical influence. It stands out, however, as the Mayan language which appears to have borrowed the most from its linguistic relatives, and it therefore represents a relatively rare case where the data is sufficient to gain good insights into contact among indigenous Mesoamerican languages, which is why we have selected Q’eqchi’ for the present contribution.

3. Number of loanwords

Although other chapters in this book describe contact situations before they describe loanwords, we have chosen to do it the other way around since the contact situations must in a large measure be inferred from the loanwords themselves. Apart from Spanish, the main donor languages belong to the Ch’olan and Yucatecan subgroups of Mayan languages, and in many cases it is not possible to establish which individual language donated a given form. For the purposes of the statistics in Tables 2 and 3 we operate with “Ch’olan” and “Yucatecan” as donors (plus Western Mayan), not with individual members of these language groups. Since Ch’olan lexical influence is far greater than Yucatecan influence, it may be hypothesized that words which could theoretically have been donated by either come from Ch’olan, but it should be cautioned that this hypothesis will likely be wrong in some 10% of the cases. Therefore we do not merge the donor categories “Ch’olan” and “Ch’olan or Yucatecan.” Loanwords which theoretically could have come from either Ch’olan or other Western Mayan languages, however, are simply considered to be Ch’olan in origin for the purpose of the statistics. The subdatabase may be consulted for more precise information. In addition to the loanwords, there are two clear calques from Spanish, as well as two probable calques from Ch’olan and one from Q’anjob’alan. What emerges from the statistics is that 15% of the Q’eqchi’ vocabulary has been borrowed from other languages. Since the Spaniards introduced a host of new material items and concepts, it is not surprising that the majority of these loanwords are from Spanish. It is more surprising, perhaps, that the Lowland languages are responsible for over 4% of the Q’eqchi’ vocabulary. In the following sections we will try to characterize this interesting impact further. Q’anjob’alan languages have donated a single vocabulary item. The only other language for which there is clear evidence of donorship is Nahuatl, with two words donated.

4. Kinds of loanwords

We shall first provide an overview of the kinds of loanwords found from the perspective of the Loanword Typology Project, using our database sample and referring to the semantic categories according to which it is organized. We then broaden the outlook, also taking into account previous work on loanwords from other Mayan languages into Q’eqchi’. In Table 2 we provide statistics on the percentages of loanwords in different semantic word classes supplied by different donor languages.¹ These statistics should be taken with a grain of salt, remembering that we are dealing with semantic word classes, not grammatical ones. Nevertheless, the statistics do reflect the fact that other Mayan languages have been more prolific than Spanish in supplying words that grammatically function as verbs and adjectives in both donor and recipient languages to the Q’eqchi’ lexicon. In absolute figures, Lowland Mayan languages have supplied 21 such verbs and 8 adjectives or participles, where Spanish has not supplied a single verb and just one adjective.

Table 2: Loanwords in Q’eqchi’ by semantic word class (percentages)

                Spanish  Ch’olan  Ch’olan or Yucatecan  Yucatecan  Western Mayan  Nahuatl  Q’anjob’alan  Unknown  Total loanwords  Non-loanwords
Nouns           18.7     1.5      1.9                   0.1        0.4            0.2      0.1           0.2      23.0             77.0
Verbs           0.2      2.4      1.5                   0.4        -              -        -             -        4.8              95.2
Adjectives      1.0      4.0      2.0                   -          -              -        -             -        7.0              93.0
Adverbs         -        -        -                     -          -              -        -             -        0.0              100.0
Function words  2.5      -        -                     -          -              -        -             -        2.5              97.5
all words       10.8     1.9      1.7                   0.2        0.2            0.1      0.1           0.1      15.0             85.0

As can be seen in Table 3, Spanish loanwords dominate in all semantic fields except The physical world, Kinship, The body, Spatial relations, Sense perception, Emotions and values, and Speech and language. In each of these fields, the collective influence from Lowland languages is greater than or equal to Spanish influence. In some fields, Spanish influence accounts for more than one fifth of the Q’eqchi’ vocabulary: Modern world, Clothing and grooming, The house, Religion and belief, and Time.

1 In a few cases one and the same word belongs to different semantic parts of speech (e.g., saqen, which means both ‘light’ as an adjective and as a noun). For the purposes of the statistics in Table 3 such words are counted twice or more (in the example given, twice).

Table 3: Loanwords in Q’eqchi’ by semantic field (percentages)

                                   Spanish  Ch’olan  Ch’olan or Yucatecan  Yucatecan  Western Mayan  Nahuatl  Q’anjob’alan  Unknown  Total loanwords  Non-loanwords
1 The physical world               4.1      1.4      4.1                   -          -              -        -             -        9.7              90.3
2 Kinship                          1.1      4.2      2.3                   -          -              -        -             -        7.6              92.4
3 Animals                          19.1     2.2      2.9                   0.7        -              0.7      -             -        25.7             74.3
4 The body                         0.5      0.8      2.0                   0.5        0.5            -        0.5           -        4.8              95.2
5 Food and drink                   19.8     1.0      0.3                   -          2.1            -        -             -        23.3             76.7
6 Clothing and grooming            27.9     3.8      1.9                   -          -              -        -             -        33.7             66.3
7 The house                        24.0     5.1      -                     -          -              -        -             -        29.1             70.9
8 Agriculture and vegetation       17.0     3.6      1.2                   -          1.2            -        -             -        23.1             76.9
9 Basic actions and technology     8.8      3.7      1.8                   0.9        -              -        -             -        15.2             84.8
10 Motion                          5.9      3.0      0.8                   -          -              -        -             -        9.7              90.3
11 Possession                      3.7      -        1.9                   -          1.9            -        -             -        7.4              92.6
12 Spatial relations               1.7      3.4      2.6                   -          -              -        -             -        7.7              92.3
13 Quantity                        3.3      -        -                     -          -              -        -             -        3.3              96.7
14 Time                            20.9     0.5      1.4                   -          -              -        -             -        22.8             77.2
15 Sense perception                1.4      1.4      1.9                   -          -              -        -             -        4.8              95.2
16 Emotions and values             -        -        1.6                   -          -              -        -             -        1.6              98.4
17 Cognition                       3.6      1.8      -                     -          -              -        -             -        5.4              94.6
18 Speech and language             7.3      -        7.3                   -          -              -        -             -        14.6             85.4
19 Social and political relations  8.0      2.7      -                     2.7        -              -        -             -        13.3             86.7
20 Warfare and hunting             18.8     -        2.1                   -          -              -        -             -        20.9             79.1
21 Law                             10.6     -        1.2                   -          -              -        -             -        11.8             88.2
22 Religion and belief             22.7     3.2      -                     -          -              -        -             -        25.9             74.1
23 Modern world                    51.6     -        -                     -          -              -        -             -        51.6             48.4
24 Miscellaneous function words    -        -        -                     -          -              -        -             -        0.0              100.0
all words                          10.8     1.9      1.7                   0.2        0.2            0.1      0.1           0.1      15.0             85.0

The overall impression is that Ch’olan influence provided the Q’eqchi’ with new ways to dominate nature, whereas Spanish influence is similar, but is seen mostly in human-made objects. Selections of loanwords that support this general impression² are listed in (1)–(3).

(1)

Spanish loanwords in the subdatabase relating to material and culinary culture (excluding very recent ones relating to 20th-century technology): poos ‘match’, oorn ‘oven’, xartin ‘pan’, xaar ‘jug, pitcher’, taas ‘cup’, platiiy ‘saucer’, ariin ‘flour’, salchiich ~ salchicha ‘sausage’, kaalt ‘soup’, aseet ‘oil’, asuukr ‘sugar’, leech ‘milk’, kees ‘cheese’, manteek ‘butter’, laan ‘wool’, seda ‘silk’, akuux ~ kuus ‘needle’, koton ‘coat’, kamiis ‘shirt’, b’otonx, alpiler ‘pin’, pulseer ‘bracelet’, tuwaay ‘towel’, sepiiy ‘brush’, nawaaj ‘knife’, xab’on ‘soap’, kwaart ‘room’, pwerta ‘door, gate’, champa ‘doorpost’, kantaaw ‘lock’, laaw ‘key’, estuuf ‘stove’, eskaleer ‘ladder’, siiy ‘chair’, meex ‘table’, kandeel ~ kanteel ‘candle’, kareen ‘chain’, tixeer ‘scissors’, xeer ‘saw’, klaawx ‘nail’, plaat ‘silver’, almul ‘basket’, tumin ‘money’, liib’r ‘book’, tamb’or ‘drum’, trompeet ‘trumpet’, ont ‘sling’, makaan ‘sword’, yooy ‘fishnet’, maak ‘machine’, anyooj ‘spectacles, glasses’, laat ‘tin, can’, meet ‘bottle’, te ‘tea’, kape ‘coffee’

(2)

Spanish loanwords in the subdatabase relating to domesticated animals and agriculture: potreer ~ potrero ‘pasture’, wakax ‘cattle’, b’ooyx ‘ox’, b’aak ‘cow’, karneer ‘sheep, ewe’, b’orego / kordero / ob’eja ‘lamb’, chib’aat ‘he-goat’, kawaay ‘horse’, granyon ‘stallion’, b’uro ~ b’uur ‘donkey’, muul ‘mule’, kaxlan ‘chicken’, seer ‘bee’, paaps ‘potato’, suurk ‘furrow’, paal ‘shovel’, asaron ‘hoe’, orkeet ‘fork, pitchfork’, rastiyo ‘rake’, aros ‘rice’, kachimp ‘pipe’, lamunx ‘citrus fruit’, yunta ‘yoke’

(3)

Ch’olan loanwords relating to man’s domination of nature: k’ak’naab’ ‘spring, well’, b’ook ‘steam’, pak’- ‘to mold’, waal ‘the fan’, poyte’ ‘the raft’, xuk ‘corner’, tz’ak ‘wall’, eeb’ ‘ladder’, pik- ‘to dig’, pech’- ‘to thresh’, aq ‘hay’, b’it- ‘to carry on head’, juy- ‘to row’, nub’aal ~ rub’ajl ‘boundary’, po’lem ‘hut’

Many of the loanwords coming from Spanish and Ch’olan are similar in the sense that they represent cultural innovations. The fact that both languages donated a word for ‘ladder’ (Ch’olan eeb’, Spanish eskaleer) is characteristic of this larger trend. If we also consider Ch’olan and Yucatecan borrowings identified in the much larger list of Wichmann & Brown (2003: 86), it becomes even clearer that the nature of interaction between Q’eqchi’ and the Lowland Mayan languages is in some sense a pre-Columbian parallel to that of Q’eqchi’-Spanish interaction. This list includes several items of material and culinary culture as well as many pertaining to domesticated animals and agriculture. These items, displayed in (4) and (5), may be compared to the Spanish loanwords in (1) and (2).

2 The orthography used for rendering Q’eqchi’ is the one which is recommended by the Academia de Lenguas Mayas de Guatemala and is employed by most Mayanists these days. While most symbols roughly have the same phonetic values as similarly shaped IPA symbols, the following deviate: “tz” = ts, “ch” = tʃ, “x” = ʃ, “j” = x. Following a vowel, an apostrophe represents a glottal stop; following a consonant symbol it indicates glottalization or, in the case of “b’”, implosion. Vowel doubling (“ii”, “ee”, etc.) is a way of indicating vowel length. We use angle brackets to indicate the rendition of an original orthographic form deviating from the conventions otherwise used.

(4)

Ch’olan and Yucatecan loanwords from Wichmann & Brown (2003: 68) relating to material and culinary culture: aq ‘plant or straw used for roof construction’, q’a ‘bridge’, saklun ‘clay’, pan ‘big spoon’, k’aan ‘wedge’, kookom ‘twiner used for tying’, peteet ‘spindlewhorl’, t’upuy ‘braid of red wool’, mukuk ‘peel from a certain fruit used for condiment in drinks made from cornstarch’, ob’en ‘tamale (a kind of corn bread wrapped in leaves)’, xep ‘bean tamale’

(5)

Ch’olan and Yucatecan loanwords from Wichmann & Brown (2003: 68) relating to edible animals and production or provision of food: chi’ ‘nance (an edible fruit)’, isk’i’ij ‘spearmint’, lol ‘a kind of bean’, ox ‘breadnut’, pata ‘guava’, tz’uumuuy ‘sweet apple (anona)’, pu’ ‘wild turkey’, tzak ‘to hunt’, /x/k’anjel ‘work’, k’al ‘cleared field, cornfield’, kololte’ ‘cage’
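The orthographic conventions described in footnote 2 lend themselves to a small conversion routine, which may help readers who want to compare the cited forms with IPA transcriptions. The sketch below is an assumption-laden illustration, not a tool used by the authors: the function name `to_ipa` is invented here, only the deviations named in the footnote are handled (all other letters are passed through unchanged), glottalization is rendered with the ejective mark, and lowercase input is assumed.

```python
# IPA symbols written as escapes so the code points are unambiguous.
GLOTTAL_STOP = "\u0294"  # glottal stop
EJECTIVE = "\u02bc"      # modifier apostrophe marking glottalization
IMPLOSIVE_B = "\u0253"   # implosive b
LENGTH = "\u02d0"        # vowel length mark
ESH = "\u0283"           # voiceless postalveolar fricative

DIGRAPHS = {"tz": "ts", "ch": "t" + ESH}
SINGLES = {"x": ESH, "j": "x"}
VOWELS = set("aeiou")
APOSTROPHES = {"'", "\u2019"}  # straight and typographic apostrophe


def to_ipa(word):
    """Convert a lowercase form in the practical orthography to rough IPA."""
    out = []
    prev = None  # "V" after a vowel, "C" after a consonant, None otherwise
    i = 0
    while i < len(word):
        ch = word[i]
        pair = word[i:i + 2]
        if ch in APOSTROPHES:
            if prev == "V":
                out.append(GLOTTAL_STOP)       # apostrophe after a vowel
            elif prev == "C" and out[-1] == "b":
                out[-1] = IMPLOSIVE_B          # b + apostrophe is implosive
            elif prev == "C":
                out.append(EJECTIVE)           # glottalized consonant
            else:
                out.append(ch)                 # stray apostrophe: keep as is
            prev = None
            i += 1
        elif pair in DIGRAPHS:
            out.append(DIGRAPHS[pair])
            prev = "C"
            i += 2
        elif ch in VOWELS:
            if len(pair) == 2 and pair[1] == ch:
                out.append(ch + LENGTH)        # doubled vowel marks length
                i += 2
            else:
                out.append(ch)
                i += 1
            prev = "V"
        else:
            out.append(SINGLES.get(ch, ch))
            prev = "C" if ch.isalpha() else None
            i += 1
    return "".join(out)


for form in ["ch'ool", "mi'", "k'ak'naab'", "xuk", "tz'ak"]:
    print(form, "->", to_ipa(form))
```

Digraphs are checked before single letters so that “ch” and “tz” are never split, and the apostrophe is interpreted from what precedes it, mirroring the wording of the footnote.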

Moreover, pertaining to the area of religion there are both Lowland borrowings (k’ab’a’ ‘name’, musiq’ ‘spirit’, ch’ool ‘heart’) and Spanish ones (krus ~ kurus ‘cross’, yos ~ tyos ‘god’, ermiit and kapiiy ‘temple’, iglees ‘church’, artal ‘altar’, payr ‘Catholic father’). While the nature of interaction between speakers of Q’eqchi’ and Spanish is well known, we know much less about how and when the Q’eqchi’ interacted with Lowland Mayas. A few leads are found in the archaeological record, but much must be inferred from loanwords. In the next section we turn to this issue.

5. Contact situations

The major contact languages include Spanish and the Ch’olan and Yucatecan languages. Nahuatl was a lingua franca for around two centuries preceding and two centuries following Spanish conquest, and has donated a few loanwords without, however, apparently having had much direct influence on Q’eqchi’. It is possible that several Spanish borrowings passed through Nahuatl, a borrowing scenario which is hypothesized by Brown (1994, 1999) to apply to many languages of Mesoamerica (see Brown, this volume, for a summary of the arguments). While the hypothesis may well also apply to the case of Q’eqchi’, there is little direct evidence by way of actual Nahuatl borrowings to sustain it. Quite possibly, Spanish loanwords may also have travelled via other Mayan languages, but again we are left with speculations and have no concrete evidence. Possibly Q’eqchi’ has donated a few words to Mopan Mayan (Wichmann & Brown 2003: 69), but in its contact with other languages Q’eqchi’ has the role of the recipient whenever directions of


borrowing can be detected. A contact language not hitherto mentioned is Poqomchi’, spoken to the immediate south of the Q’eqchi’ area. Like Q’eqchi’, Poqomchi’ belongs to the K’iche’an branch of the Mayan family. In Wichmann & Brown (2000–) there are a number of cognate words which are found only in Q’eqchi’ and the sister languages Poqomchi’ and Poqomam. Thus we do possess some evidence of Q’eqchi’-Poqomchi’ contact, even if it is not possible to establish the direction of the loanwords in question. This contact, however, would seem to have a more local and perhaps also more recent character than the interaction with Ch’olan and Yucatecan. It is the latter contact situation which is the most interesting, because it permits us to make some inferences about pre-Columbian culture history. We therefore devote the remainder of this section to this topic. According to the estimate cited earlier, Q’eqchi’ branches off from its proto-K’iche’an mother around 300 CE. This coincides with the beginning of the Classic Period in Lowland Maya civilization, a period characterized by large urban centers, complex architecture, monumental inscriptions, social stratification, intensive agriculture, and interregional competition within an area stretching from the peninsula of Yucatan to regions to the south of this area. The Q’eqchi’ area borders the eastern part of the Lowlands, and, as mentioned earlier, some of the cultural innovations, such as temple structures, tombs, ballcourts, and agricultural terraces are also found in the general area within which Q’eqchi’ is currently spoken. The ceramic record also provides evidence of cultural diffusion, particularly after 700 CE. The archaeological and the linguistic evidence complement each other nicely. 
The archaeological evidence permits us to infer that much of the interaction accounting for the borrowings from Lowland languages is quite early, probably dating to the Classic period of 300–900 CE, and the linguistic evidence helps us to flesh out the nature of the interaction. As already noted by Wichmann & Brown (2003: 68–9), the Ch’olans and Yucatecans were evidently culturally dominant, since they influenced their Q’eqchi’ neighbors in such diverse and important areas as religion, architecture, economy (cf. loanwords such as maatan, which means ‘gift’ and earlier would have referred to a kind of tax, or ch’uy ‘eight thousand’), food provision, and technical implements. Hieroglyphic evidence shows that Ch’olan, which was minimally differentiated at the beginning of the Classic period, began to split up a few hundred years into this period, and became differentiated into its four descendants towards its end (see Wichmann 2006a for an overview as well as discussion of the somewhat different scenario of Houston et al. 2000). Thus, it is not surprising that we are often not able to identify Ch’olan borrowings as coming from one particular Ch’olan language. Nevertheless, one of these languages does stand out as having had more importance than the rest. This was observed by Wichmann & Brown (2003: 69) in a passage worth citing in extenso (we change the spelling to , since the latter is more current): “Among all the possible Ch’olan donors, the Eastern Ch’olan language Ch’olti’, now extinct and known only from seventeenth-century documents, seems to have contributed a disproportionally large number of loans to Q’eqchi’. This large proportion is especially remarkable in light of the fact that we possess only very limited lexical data for the language. Ch’olti’ forms appear fifty-nine times in the list of candidates for the origins of the 134 possible Mayan-language loans into Q’eqchi’. Speakers of Q’eqchi’ and of Ch’olti’ would have been linguistic neighbours before the latter language became extinct, so the apparently great contribution of Ch’olti’ is not surprising.” (Wichmann & Brown 2003: 69)

While this observation in itself was unsurprising, it was surprising that it was possible to make it, given the nature of the data. Since Ch’olti’ would not have become crystallized as an individual language until the end of the Classic Period, we may assume that much of the interaction between Ch’olti’ and Q’eqchi’ speakers took place in the Postclassic, but prior to the arrival of the Spaniards. Thus the 59 loanwords in question, some of which would have been donated directly from Ch’olti’, testify to the continued importance of Ch’olti’ in the Postclassic, even after the so-called “collapse” of Classic Maya civilization (Wichmann 2006b). Regarding the contact with speakers of Spanish, enough has been said throughout this chapter already, and the loanwords speak for themselves.

6. Integration of loanwords

Since Q’eqchi’ has all the phonemes of Ch’olan and Yucatecan (and in addition q and q’, which the Lowland Languages do not have), and since the phonotactic patterns are also similar, there is little phonological adaptation to be observed in loanwords from the Lowland Languages. If aq ‘hay’ is borrowed from Ch’olan or Yucatecan ak, then this would imply a shift in pronunciation, and the same goes for choq ‘cloud’ from Ch’olan tokal and q’a ‘bridge’ from Ch’olan k’ah-te’. It seems unlikely that all three of these forms should have been misidentified as borrowings, so this occasional shift from a velar to a uvular place of articulation of the stop sounds in question is well supported. The Ch’olan velar stops could well have had a somewhat wider range of allophones than the Q’eqchi’ velar stops, coming close to a Q’eqchi’ uvular stop, so no phonetic adaptation need be involved.

Since the works of Houston et al. (1998) and Lacadena & Wichmann (2004), many students of Maya writing agree that Ch’olan languages retained vowel length throughout most of the Classic period and that the orthographic conventions of the logosyllabic inscriptions included a means to indicate this distinction, even if current Ch’olan languages have lost the distinction and only indirectly (within the Western branch) show a former a : aa contrast through schwa vs. full a reflexes. Unfortunately, this paper cannot provide further support for the hypothesis of the late retention of vowel length in Ch’olan, since there are no loanwords uniquely identifiable as Ch’olan which happen to have long vowels in Q’eqchi’ and which can be shown by comparative or hieroglyphic evidence to also have had long vowels in Ch’olan. In instances where such comparative evidence is available, it comes from Yucatecan; and if a form is also attested in Yucatecan, this is where it could theoretically have originated, and it would then no longer be uniquely identifiable as Ch’olan. Some long vowels in loanwords are the result of either phonetic adaptation from a Vh sequence or a later phonological development from such a sequence; examples are: ch’ool ‘heart’ (Ch’ol ch’ujlel ‘spirit, pulse’, Chontal chu’ul ‘sacred’, Ch’olti’ ‘idol’, Ch’orti’ ch’u’r ‘god, saint’); kaalam e ‘chin, jaw’ (Ch’olti’ ‘beard, chin’, Ch’orti’ kajram); teelom ‘man, male’ (Ch’olti’ , Ch’orti’ tejrom). We have not been able to identify other phonological adaptations or possible adaptations in borrowings from other Mayan languages.

When we turn to Spanish, however, we can observe many such adaptations. In (6)–(8) we provide examples of adaptation phenomena relating to phoneme inventories and phonotactics.

(6) The replacement of foreign phonemes with related native ones (not necessarily in a one-to-one fashion): poos < fósforo ‘match’, pereera < fe de edad ‘birth certificate’, kape < café ‘coffee’, kaalt < caldo ‘soup’, ont < honda ‘sling’, almul < almud ‘basket’, kareen < cadena ‘chain’, payr < padre ‘Catholic father’, ray < radio ‘radio’, yos ~ tyox < dios ‘god’, tumink ~ romink < domingo ‘Sunday’

(7) The dropping of final vowels to satisfy the preference for closed syllables: b’aak < vaca ‘cow’, muul < mula ‘mule’, seer < cera ‘beeswax’, taas < taza ‘cup’, suurk < zurco ‘furrow’, room < romo ‘blunt’, etc.

(8) Reduction of a polysyllabic word to the preferred monosyllabicity of Q’eqchi’ morphemes: poos < fósforo ‘match’, saaw < sábado ‘Saturday’, maak < máquina ‘machine’, meet < limeta ‘bottle’
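
The pattern in (7) can be made concrete with a small sketch. It is an illustration only and not a procedure proposed in this chapter: the function names and the tiny spelling map are hypothetical, and the doubling of the remaining vowel is read off the six examples in (7) rather than stated as a rule in the text. The sketch is not expected to generalize beyond those six items.

```python
VOWELS = "aeiou"
GLOTTAL = "\u2019"  # the apostrophe used in the Q'eqchi' orthography (as in b’)


def respell(spanish: str) -> str:
    """Respell a Spanish form with Q'eqchi' letters (hypothetical, minimal map)."""
    out = []
    for i, ch in enumerate(spanish):
        nxt = spanish[i + 1] if i + 1 < len(spanish) else ""
        if ch == "v":
            out.append("b" + GLOTTAL)      # v -> b’ (vaca -> b’aak)
        elif ch == "z":
            out.append("s")                # z -> s (taza -> taas)
        elif ch == "c":
            # c before e/i is a sibilant, otherwise a velar stop
            out.append("s" if nxt in ("e", "i") else "k")
        else:
            out.append(ch)
    return "".join(out)


def close_final_syllable(form: str) -> str:
    """Drop a word-final vowel and double the last remaining vowel."""
    if not form or form[-1] not in VOWELS:
        return form
    stem = form[:-1]
    for i in range(len(stem) - 1, -1, -1):
        if stem[i] in VOWELS:
            return stem[:i] + stem[i] * 2 + stem[i + 1:]
    return stem


def adapt(spanish: str) -> str:
    """Assumed two-step adaptation: respelling, then closing the final syllable."""
    return close_final_syllable(respell(spanish))


for source in ["vaca", "mula", "cera", "taza", "zurco", "romo"]:
    print(source, "->", adapt(source))
```

Run on the six Spanish sources in (7), the sketch returns the Q’eqchi’ forms listed there. Items such as poos < fósforo in (8) involve further reduction and are deliberately left outside its scope.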

These are the most frequent phonological adaptation phenomena observed. Some more sporadic ones include the addition of a final consonant to satisfy the preference for closed syllables (chib’aat < chivo ‘goat’), the addition of a vowel to break up a consonant cluster (kurus < cruz ‘cross’), the replacement of a diphthong by a long monophthong (pleet < pleito ‘fight’), and the replacement of the colonial Spanish palatal lateral (written ll) by l (laaw < llave ‘key’, contrasting with the later borrowing yaaw < llave ‘tap, faucet’).

We now look at ways in which some Q’eqchi’ borrowings reflect features of colonial Spanish. The Spanish s sound was perceived as having a more palatal pronunciation and is reflected as such in early loanwords. But only a minority of Spanish forms in s behave in this way. In (9)–(10) we provide examples where s is either reflected as x (i.e., a palatal sibilant) or as s. The referents of these words support the interpretation that the loanwords reflecting a palatal pronunciation entered Q’eqchi’ earlier than the others. This phenomenon is then to be regarded as the reflection of an earlier Spanish pronunciation rather than as adaptation. Following Canfield’s (1952) chronology of Spanish sibilants, we assign such borrowings to the period preceding 1600.

(9) Early treatment of Spanish s (add to these the plural forms in (12) below): xartin < sartén ‘pan’, meex < mesa ‘table’, b’axton < bastón ‘walking stick’, preex < preso ‘captive’, tyox < dios ‘god’ (alternant: yos)

(10) Later treatment of Spanish s: estuuf < estufa ‘stove’, eskaleer < escalera ‘ladder’, iglees < iglesia ‘church’, kamiis < camisa ‘shirt’, kees < queso ‘cheese’, laas < lazo ‘lasso’, moos < mozo ‘servant’, oos < oso ‘bear’, poos < fósforo ‘match’, pulseer < pulsera ‘bracelet’, seer < cera ‘beeswax’, sepiiy < cepillo ‘brush’, serwees < cerveza ‘beer’, siiy < silla ‘chair’, suurk < zurco ‘furrow’, taas < taza ‘cup’, yos < dios ‘god’ (alternant: tyox)

The sound corresponding to the Spanish “j” (jota) was pronounced as a palatal sibilant during the first half of the sixteenth century and then began to change to a velar fricative. This change had spread throughout the Spanish of the Americas by the end of the 16th century (Canfield 1952). Loanwords reflecting the early pronunciation are given in (11).

(11) Loanwords reflecting the colonial Spanish palatal sibilant: xaar < jarro ‘jug, pitcher’, akuux < aguja ‘needle’, xab’on < jabón ‘soap’.

Q’eqchi’ speakers regularly borrow nouns whose referents are often encountered as conglomerates in the plural of the Spanish form, cf. (12).

(12) Reflection of non-functional Spanish plural -s: wakax < vaca ‘cattle’, b’ooyx < buey ‘ox’, patux < pato ‘duck’, b’otonx < botón ‘button’, klaawx < clavo ‘nail’

We have not detected any special means by which verbs from other Mayan languages are integrated. That is, they seem to be “plugged” directly into Q’eqchi’ morphology without further ado, using a strategy called “direct insertion” in the typological study of loanverbs of Wichmann & Wohlgemuth (2008). The Spanish borrowings in the sample are all imported nouns, except for two function words (algo < algo ‘some’, si’ < si ‘if’) and two adjectives (room < romo ‘blunt’, look < loco ‘mad’). There is a single form which functions as a verbal form in Q’eqchi’: ayuunink rix ‘to fast’. Data recorded by Hull in Belize contain an additional such form, peleetik ‘fighting’, and a text published by Ac & Pinkerton (1976: 104) exhibits the verbal form trab’axik ‘working’. But all three of these are nominal in origin, coming respectively from ayuna ‘fast’, pelea ‘fight’, and trabajo ‘work’, and thus do not offer insights into the way that verbs get borrowed.

The literature does not provide good information on Q’eqchi’ attitudes towards Spanish borrowings. Campbell’s (1976) study gives a little insight, however, showing that many Spanish borrowings are actually regarded in some sense as being native to Q’eqchi’. Speakers tend to view borrowings that are adapted phonologically or which refer to native or nativized cultural items as Q’eqchi’ words. Among the linguistically trained activists, who are responsible for several of the sources which we have used for this study, there is a heavily puristic attitude. This is generally shared among activists belonging to the Academia de Lenguas Mayas de Guatemala and other organizations struggling to promote Mayan languages.

7. Grammatical borrowing

This study has focused on loanwords, and we are not yet prepared to provide much information on grammatical borrowing. There is no doubt that a comparative study of K’iche’an languages would reveal many cases where Q’eqchi’ behaves in a deviant way which could be accounted for through influence from other Mayan languages, but this must remain a topic for future studies, as must the general impact of Spanish grammar. It is likely that Ch’olan has influenced Q’eqchi’ both in morphology and syntax, whereas Spanish grammatical influence is largely restricted to syntax.³

³ The earliest comparative linguistic study involving Q’eqchi’ includes a mention of possible Western Mayan (more specifically Tzeltalan) influence in the pronominal system (Stoll 1896: 19). Furthermore, Pinkerton (1976: 56), citing personal communication from Marlys Bacon, observes that “Mayan languages can be roughly divided into three groups whose geographical location correlates with the prefixing or suffixing of the absolute pronouns to the verb. The lowland languages suffix the absolutive pronouns to the verb and the highland languages prefix them to the verb. There are also a group of ‘buffer zone’ languages which both prefix and suffix the absolutive pronouns to the verb. K’ekchi seems to fall into the ‘buffer zone’ group.”

8. Conclusions

In this chapter we have presented a case of a Mesoamerican language which, in spite of being spoken in a relatively remote and isolated region, shows evidence of interaction with Spanish typical of many other Mesoamerican languages. The impact on this language from related Mayan languages, however, is extraordinary. From its birth as an emerging dialect, the language appears to have received a significant number of loanwords from neighboring speakers of Lowland Mayan languages who were instrumental in developing Classic Maya civilization, famed for its impressive architectural remains and elaborate writing system. The “civilizing” impact of the lowlanders is clearly felt in borrowed vocabulary. No doubt, much of the interaction responsible for these loanwords was at least initially of an unfriendly nature, but would eventually result in sociopolitical ties under which Q’eqchi’ speakers could continue to thrive. Aspects of this situation would repeat themselves when Spaniards arrived and failed to conquer the Q’eqchi’ militarily but succeeded in implanting their designs for religious and social organization. Thus, the story of the Q’eqchi’, as revealed through archaeological, historical, and linguistic sources, is one of development through negotiation with and adaptation to powerful foreign influences.

Acknowledgments

We would like to thank Héctor Rolando Xol Choc, a Q’eqchi’ speaker from Cobán, Guatemala, for his considerable help with data used in this study. Andrea Ruf supplied us with some important bibliographical information, and Cecil H. Brown, John Robertson, and the editors provided helpful comments on an earlier version of the paper.

References

Ac, Flora & Pinkerton, Sandra. 1976. Li k’alek, li aw:k [sic!], ut li q’olok. In Pinkerton, Sandra (ed.), Studies in K’ekchi (Texas Linguistic Forum 3), 102–110. Austin: Department of Linguistics, The University of Texas at Austin.
Anonymous. 2004. Xtusulal Aatin Sa’ Q’eqchi: Vocabulario Q’eqchi’. Guatemala City: Academia de Lenguas Mayas de Guatemala.
Brown, Cecil H. 1994. Lexical acculturation in Native American languages. Current Anthropology 35:95–117.
Brown, Cecil H. 1999. Lexical Acculturation in Native American Languages. New York/Oxford: Oxford University Press.
Campbell, Lyle R. 1974. Theoretical implications of Kekchi phonology. International Journal of American Linguistics 40:269–278.
Campbell, Lyle R. 1976. Kekchi linguistic acculturation: A cognitive approach. In McClaran, Marly (ed.), Mayan Linguistics, Vol. 1, 90–97. Los Angeles: American Indian Studies Center.
Campbell, Lyle R. 1977. Quichean Linguistic Prehistory. (University of California Publications in Linguistics 81). Berkeley: University of California Press.
Campbell, Lyle R. & Kaufman, Terrence. 1985. Mayan linguistics: Where are we now. Annual Review of Anthropology 14:187–198.
Canfield, D. Lincoln. 1952. Spanish American data for the chronology of sibilant changes. Hispania 35(1):25–30.
Caso Barrera, Laura & Aliphat, Mario M. 2006. Relaciones entre q’eqchi’ de la Verapaz y chol del Manche, siglo XVII. Paper presented at the XX Simposio de Investigaciones Arqueológicas en Guatemala, Museo Nacional de Arqueología y Etnología, Guatemala City, July 24th–28th, 2006.
Cu Cab, Carlos Humberto. 1998. Q’eqchi’ – Kaxlan aatin ut Kaxlan aatin – Q’eqchi’. Guatemala City: Instituto de Lingüística de la Universidad Rafael Landívar.
de Borhegyi, Stephan F. 1965. Archaeological Synthesis of the Guatemalan Highlands. In Willey, Gordon R. (ed.), Handbook of Middle American Indians, Volume 2: Archaeology of Southern Mesoamerica, Part 1, 3–58. Austin: University of Texas Press.
de Moran, Fray Francisco. 1695. Vocabulario de la lengua Cholti que quiere decir la Lengua de Milperos. (MS, Collection 497.4/M79). American Philosophical Society Library, Philadelphia.
de Remesal, Antonio. 1966 [1619]. Historia general de las Indias Occidentales y particular de la gobernación de Chiapa y Guatemala. Guatemala City: Editorial José de Pineda Ibarra.
DeChicchis, Joseph E. 1989. Q’eqchi’ (Kekchi Mayan) Variation in Guatemala and Belize. Ph.D. dissertation. University of Pennsylvania.
DeChicchis, Joseph E. 1990. The genesis of Kekchi dialect differences. In Bahner, Verner & Schildt, Joachim & Viehweger, Dieter (eds.), Proceedings of the Fourteenth International Congress on Linguistics. Berlin/GDR, August 10–15, 1987, 2nd edn. 1492–1495. Berlin: Akademie-Verlag.
Dienhart, John M. 1989. The Mayan Languages: A Comparative Vocabulary. Vol. 1–3. Odense: Odense University Press.
England, Nora C. 1994. Ukuta’miil, ramaq’iil, ttzijob’aal: ri maya’ amaaq’: Autonomía de los idiomas mayas: historia e identidad. Guatemala City: Editorial CHOLSAMAJ.
Freeze, Ray A. 1975. A Fragment of an Early K’ekchi’ Vocabulary with Comments on the Cultural Content. (University of Missouri Monographs in Anthropology 2, Studies in Mayan Linguistics 1). Department of Anthropology, University of Missouri-Columbia.
Gordon, Jr., Raymond G. (ed.). 2005. Ethnologue: Languages of the World. 15th edn. Dallas, TX: SIL International.
Houston, Stephen D. & Robertson, John & Stuart, David. 2000. The language of Classic Maya inscriptions. Current Anthropology 41(3):321–356.
Houston, Stephen D. & Stuart, David & Robertson, John. 1998. Disharmony in Maya hieroglyphic writing: Linguistic change and continuity in Classic society. In Ciudad Ruiz, Andrés et al. (eds.), Anatomía de una civilización: Aproximaciones interdisciplinarias a la cultura maya (Publicaciones de la S.E.E.M. 4), 275–296. Madrid: Sociedad Española de Estudios Mayas.
Hull, Kerry. n.d. Data from fieldnotes collected through fieldwork done in Belize in 2002 and 2005.
Jones, H. Lee. 2003. Birds of Belize. Austin: University of Texas Press.
Juarros, Domingo. 1936. Compendio de la historia de la ciudad de Guatemala. Guatemala City.
Kaufman, Terrence. 1974. Idiomas de Mesoamérica. (Seminario de Integración Social, Publicación 33). Guatemala City: Editorial José de Pineda Ibarra, Ministerio de Educación.
Kaufman, Terrence with Justeson, John. 2003. A Preliminary Mayan Etymological Dictionary. Foundation for the Advancement of Mesoamerican Studies.
King, Arden R. 1974. Coban and the Verapaz. (Middle American Research Institute, Publication 37). New Orleans: Middle American Research Institute, Tulane University.
Kockelman, Paul. 2003. The meanings of interjections in Q’eqchi’ Maya: From emotive reaction to social and discursive action. Current Anthropology 44(4):467–490.
Kockelman, Paul. 2007. Inalienable possession and personhood. Language in Society 36(3).
Lacadena, Alfonso & Wichmann, Søren. 2004. On the representation of the glottal stop in Maya writing. In Wichmann, Søren (ed.), The Linguistics of Maya Writing, 100–164. Salt Lake City: University of Utah Press.
Pinkerton, Sandra. 1976. Ergativity and word order. In Studies in K’ekchi (Texas Linguistic Forum 3), 48–66. Austin: Department of Linguistics, The University of Texas at Austin.
Ponce E Hijos, Rosales. 1930. Vocabulario Quecchi-Español [Q’eqchi’-Spanish vocabulary]. 2nd edn. Reprinted in El Norte. Coban, A.V.
Sam Juárez, Miguel & Cao, Ernesto Chen & Tec, Crisanto Xal & Chen, Domingo Cuc & Pop, Pedro Tiul. 1997. Diccionario del idioma q’eqchi’. La Antigua, Guatemala: Proyecto Lingüístico Francisco Marroquín.
Sapper, Karl. 1890. Die Kekchi Indianer [The Q’eqchi’ Indians]. Das Ausland 63(43, 45):841–844, 892–895. Stuttgart.
Sapper, Karl. 1891. Die soziale Stellung der Indianer in der Alta Verapaz, Guatemala [The social situation of the Indians in Alta Verapaz, Guatemala]. Petermanns Mitteilungen 37:44–66. Gotha.
Sapper, Karl. 1902. Die Alta Verapaz. (Reprinted from Mitteilungen der Geographischen Gesellschaft in Hamburg 17). Hamburg.
Sapper, Karl. 1936. Die Verapaz im 16. und 17. Jahrhundert: Ein Beitrag zur historischen Geographie und Ethnographie des nordöstlichen Guatemala. München: Verlag der Bayerischen Akademie der Wissenschaften.
Schackt, Jon. 2000. La cultura q’eqchi’ y el asunto de la identidad entre indígenas y ladinos en Alta Verapaz. Revista Estudios Interétnicos 8(13):14–20. Instituto de Estudios Interétnicos, Universidad de San Carlos de Guatemala.
Sedat, William. 1955. Nuevo diccionario de las lenguas k’ekchi’ y española. Chamelco, Alta Verapaz, Guatemala: Instituto Lingüístico de Verano.
Stewart, Stephen. 1980. Gramática kekchí [Q’eqchi’ grammar]. Guatemala City: Editorial Académica Centroamericana.
Stoll, Otto. 1896. Die Maya-Sprachen der Pokom-Gruppe. Zweiter Teil: Die Sprachen der K’e’kchi-Indianer. Nebst einem Anhang: Die Uspanteca. Leipzig: K. F. Köhler’s Antiquarium.
Thompson, J. Eric. 1930. Ethnology of the Mayas of Southern and Central British Honduras. (Publication 274, Anthropological Series XVII.2). Chicago: Field Museum Press.
Weeks, John M. 1997. Subregional organization of the Sixteenth-Century Q’eqchi’ Maya, Alta Verapaz, Guatemala. Revista Española de Antropología Americana 27:59–93. Madrid: Servicio Publicaciones UCM.
Wichmann, Søren. 2006a. Mayan historical linguistics and epigraphy: A new synthesis. Annual Review of Anthropology 35:279–294.
Wichmann, Søren. 2006b. A new look at linguistic interaction in the lowlands as a background for the study of Maya codices. In Valencia Rivera, Rogelio & Le Fort, Geneviève (eds.), Sacred Books, Sacred Languages: Two Thousand Years of Ritual and Religious Maya Literature. (Acta Mesoamericana 18). Proceedings of the 8th European Maya Conference, Madrid, 25th–30th November, 2003. Markt Schwaben: Verlag Anton Saurwein.
Wichmann, Søren & Brown, Cecil H. 2000–. Panchronic Mayan Dictionary. Electronic manuscript.
Wichmann, Søren & Brown, Cecil H. 2002. Contacto lingüístico dentro del área maya: Los casos de ixhil, q’eqchii’ y chikomuselteko. Pueblos y Fronteras 4:133–167. México, D.F.: Universidad Nacional Autónoma de México.
Wichmann, Søren & Brown, Cecil H. 2003. Contact among some Mayan languages: Inferences from loanwords. Anthropological Linguistics 45(1):57–93.
Wichmann, Søren & Wohlgemuth, Jan. 2008. Loan verbs in a typological perspective. In Stolz, Thomas & Palomo, Rosa & Bakker, Dik (eds.), Aspects of Language Contact: New Theoretical, Methodological and Empirical Findings with Special Focus on Romanisation Processes, 89–121. Berlin/New York: Mouton de Gruyter.
Wilk, Richard & Chapin, Mac. 1990. Ethnic Minorities in Belize: Mopan, Kekchi and Garifuna. (SPEAReports 1). Mexico: Cuboloa Production.

Loanword Appendix

Ch’olan or either Ch’olan or Yucatecan
k’ak’naab’ ‘spring, well’
aak’ab’ ‘darkness’
b’ook ‘steam’
ch’ajom ‘boy’
lut ‘twins’
so’sol ‘vulture’
mukuy ‘dove’
k’ams ‘termites’
kuluk ‘worm’
ko ‘cheek’
job’nil ‘stomach’
mus-iq’-ak ‘to breathe’
yajel ‘disease’
kalajenaq ‘drunk’
sek’ ‘plate’
b’oox ‘pocket’
xaal che’ ‘forked branch’
xeek’ ‘palm tree’
pak’ok ‘to mould/mold’
waal ‘fan’
poyte’ ‘raft’
jek’ok ‘to share’
jay ‘thin’
b’ech’ ‘crooked’
xuk ‘corner’
jochok ‘to pinch’
xeb’ok ‘to pinch’
mek’onk ‘to embrace’
xuxb’ak ‘to whistle’
k’ab’a’ej ‘name’
xolb’ ‘flute’
lak’am ~ lakam ‘shield’
ch’anaak ‘calm’
teelom ‘man’
alal ‘son’
ko’ ‘daughter’
tiix ‘old woman’
suk ‘nest’
motzo’ ‘worm’
xukub’ ‘horn’
yax ‘claw’
tz’ak ‘wall’
eeb’ ‘ladder’
pikok ‘to dig’
pech’ok ‘to thresh’
aq ‘hay’
b’asok ‘to bend’
tenok ‘to strike, to hit, to beat’
xeq’ok ‘to stab’
kelonk ‘to pull’
k’onok ‘to twist’
sub’e’k ‘to sink’
b’itonk ‘to carry on head’
juyuk ‘to row’
sa’ tz’e ‘left’
tz’e ‘left’
tach’to ‘flat’
k’onk’o ‘crooked’
k’urux ‘rough (1)’
nub’aal, rub’ajl ‘boundary’
hob’ok ‘to curse’
t’upuy ‘headband, headdress’
tz’uluk ‘plait/braid’
po’lem ‘hut’
wax ru ‘mad’
paachach ‘cockroach’

Yucatecan
yuk ‘goat’
ajk ‘to wake up’
hoyok ‘to pour’

Nahuatl
tolokok ‘lizard’
tenamit ‘village’

Q’anjob’alan
mi’ ‘vagina’

Spanish
kontineent ‘mainland’
sabana ‘savanna’
poos ‘match’
komon ‘relatives’
potreer ~ potrero ‘pasture’
wakax ‘cattle’
b’ooyx ‘ox’
b’aak ‘cow’
karneer ‘sheep, ewe’
b’orego ‘lamb’
kordero ‘lamb’
ob’eja ‘lamb’
chib’aat ‘he-goat’
kawaay ‘horse’
granyon ‘stallion’
b’uro ~ b’uur ‘donkey’
muul ‘mule’
kaxlan ‘chicken’
ganso ‘goose’
patux ‘duck’
loor ‘parrot’
tiburon ‘shark’
b’ayeen ‘whale’
lob’o ‘wolf’
oos ‘bear’
elefaant ‘elephant’
kameey ‘camel’
insekto ‘insect’
seer ‘bee’
kanguru ‘kangaroo’
loktor ‘physician’
oorn ‘oven’
xartin ‘pan’
xaar ‘jug/pitcher’
taas ‘cup’
platiiy ‘saucer’
ariin ‘flour’
salchiich ~ salchicha ‘sausage’
kaalt ‘soup’
paaps ‘potato’
igo ‘fig’
nwes ‘nut’
asetuna ‘olive’
aseet ‘oil’
asuukr ‘sugar’
leech ‘milk’
kees ‘cheese’
manteek ‘butter’
b’iin ‘wine’
serwees ‘beer’
laan ‘wool’
seda ‘silk’
akuux ~ kuus ‘needle (1)’
koton ‘coat’
kamiis ‘shirt’
b’oot (1) ‘boot’
b’elo, b’eel ‘veil’
b’otonx ‘button’
alpiler ‘pin’
pulseer ‘bracelet’
tuwaay ‘towel’
sepiiy ‘brush’
nawaaj ‘razor’
xab’on ‘soap’
kwaart ‘room’
pwerta ‘door, gate’
champa ‘doorpost’
kantaaw ‘lock’
laaw ‘key’
b’entaan ‘window’
estuuf ‘stove’
eskaleer ‘ladder’
siiy ‘chair’
meex ‘table’
lampr ‘lamp, torch’
kandeel, kanteel ‘candle’
arko ‘arch’
kampameent ‘camp’
koral ‘fence’
suurk ‘furrow’
paal ‘shovel’
asaron ‘hoe’
orkeet ‘fork/pitchfork’
rastriyo ‘fork/pitchfork, rake’
laas ‘lasso’
triiw (1) ‘wheat’
seb’aad ‘barley’
aros ‘rice’
kachimp ‘pipe’
kook ‘coconut’
lamunx ‘citrus fruit’
panchola ‘nettle’
kareen ‘chain’
tixeer ‘scissors, shears’
xeer ‘saw’
klaawx ‘nail’
plaat ‘silver’
almul ‘basket’
alfombra ‘rug’
formon ‘chisel’
kapoteer ‘peg’
kareton ‘cart, wagon’
rueda ‘wheel’
yunta ‘yoke’
b’arko ‘ship’
kanaleta ‘oar’
paleta ‘paddle’
b’ela ‘sail’
tumin ‘money’
kwenta ‘bill’
krus, kurus ‘cross’
linea ‘line’
sero ‘zero’
algo ‘some’
priim ‘dawn’
hoonal ‘hour’
oor ‘hour’
reloj ‘clock’
xamaan ‘week’
domiin, tumink, romink ‘Sunday’
luuns ‘Monday’
marts ‘Tuesday’
miercools, myers ‘Wednesday’
jwees, jweb’es ‘Thursday’
b’yers, b’iernes ‘Friday’
saaw ‘Saturday’
primab’era ‘spring’
otonyo ‘autumn/fall’
estasion ‘season’
room ‘blunt’
look ‘mad’
si’ ‘if’
liib’r ‘book’
tamb’or ‘drum’
trompeet ‘horn, trumpet’
triiw (2) ‘clan’
b’axton ‘walking stick’
moos ‘servant’
pleet ‘war, battle’
soldaa ~ sola ‘soldier’
ont ‘sling’
makaan ‘sword’
tore ‘tower’
preex ‘captive, prisoner’
kordel ‘fishing line’
yooy ‘fishnet’
traamp ‘fish trap’
ley ‘law’
testiig ‘witness’
muult ‘fine’
yos, tyox ‘god’
ermiit ‘temple’
kapiiy ‘temple’
iglees ‘church’
artal ‘altar’
payr ‘priest’
ayuunink rix ‘to fast’
ray ‘radio’
teleb’ision ‘television’
teleef ‘telephone’
b’isikleet ‘bicycle’
moot ‘motorcycle’
kamyon ‘bus’
tren ‘train’
ab’yon ‘airplane’
elektrisidad ‘electricity’
b’ateriiy ‘battery’
motor ‘motor’
maak ‘machine’
petrool ‘petroleum’
pastiiy ‘pill or tablet’
b’akuun ‘injection’
inyeksion ‘injection’
anyooj ‘spectacles/glasses’
lisens ‘driver’s license’
papelseya ‘birth certificate’
pereera ‘birth certificate’
b’oot (2) ‘election’
tiimbr ‘postage stamp’
tarjeet postal ‘postcard’
yaaw ‘tap/faucet’
laat ‘tin/can’
torniiy ‘screw’
meet ‘bottle’
plastiik ~ plaas ‘plastic’
b’oom ‘bomb’
periood ‘newspaper’
son ‘music’
te ‘tea’
kape ‘coffee’

Chapter 35

Loanwords in Otomi, an Otomanguean language of Mexico*

Ewald Hekking and Dik Bakker

1. The language and its speakers

Otomi belongs to the Otomanguean family, according to Ruhlen (1991: 37) a subgroup of the Central Amerindian stock, together with Uto-Aztecan and Tanoan. With Mazahua, Ocuilteca, Matlatzinca, Pame and Chichimeca it constitutes the Otopamean group. Figure 1 gives an overview of the Otomanguean family according to Gordon (2005).

Figure 1: Otomanguean family according to Ethnologue (2005)

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Bakker, Dik & Hekking, Ewald. 2009. Otomi vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 2158 entries.

Otomi is currently spoken by around 330,000 generally poor, traditional and mostly bilingual peasants in the highlands around Mexico City in the states of Mexico, Hidalgo, Querétaro, Puebla, Guanajuato, Tlaxcala, Veracruz and Michoacán. It is the fifth largest indigenous language of Mexico, only surpassed by Nahuatl, Maya, Zapoteco and Mixteco. According to the information given by the Comisión Nacional para el Desarrollo de los Pueblos Indígenas (National Commission for the Development of the Indigenous Peoples; CDI), during the census of the Mexican indigenous population in 2000, a total population of 646,875 Otomis was registered, of whom only 50.6% were active speakers of their mother language. This means that about 49% of the Otomi have lost their language.

In the state of Querétaro, where we have collected most of the data for this contribution, the indigenous population was estimated by the Instituto Nacional Indigenista (INI) and the Consejo Nacional de Población (SEDESU 2005) at 47,420 inhabitants in 2002. Querétaro’s indigenous inhabitants are predominantly Otomis. They are distributed mainly over the municipalities of Amealco, Tolimán and Cadereyta, which account for 82.4% of the Otomi population of Querétaro. The Otomi spoken in the state of Querétaro forms part of the north-western dialect of Otomi, which is the dialect with the most literature written in it. Map 1 gives an overview of where Otomi is spoken.
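
As a quick consistency check, the national census figures quoted above (646,875 registered Otomis, 50.6% active speakers) can be related to the speaker estimate of around 330,000 and to the share of about 49% who no longer speak the language. The following is only a sketch of that arithmetic; the variable names are ours and not part of the census report.

```python
# Figures quoted in the text (CDI census of 2000).
registered_otomis = 646_875
active_speaker_share = 50.6  # percent

# Implied number of active speakers, rounded to whole persons.
active_speakers = round(registered_otomis * active_speaker_share / 100)

# Share of registered Otomis who are not active speakers.
non_speaker_share = round(100 - active_speaker_share, 1)

print(active_speakers)    # 327319
print(non_speaker_share)  # 49.4
```

The implied figure of roughly 327,000 speakers agrees with the rounded estimate of around 330,000 given at the beginning of the section.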

Map 1: Geographical location of Otomi and some other languages

2. Sources of data

The words for the Otomi subdatabase stem first of all from the database collected during fieldwork in the communities of Santiago Mexquititlán (Amealco) and Tolimán between 1981 and 1994. At that time, we were preparing a bilingual dictionary of the Otomi of Santiago Mexquititlán (Hekking & Andrés de Jesús 1989) and conducting a study about language shift and borrowing among 95 native speakers of Otomi from both communities (Hekking 1995). Our data has also been taken from a series of stories recorded and written down in the reading and writing workshops organized for the speakers of the same communities between 1992 and 2002 (Hekking & Andrés de Jesús 2002) and from texts originally written in modern Spanish and translated by native speakers of Otomi into their native language. We have also consulted two other dictionaries of the north-western dialect of Otomi: firstly the Diccionario de Urbano (Old Otomi of the Tula region of the beginning of the 17th century) (Urbano 1990 [1605]) and secondly the Diccionario de Hernández Cruz, Torquemada y Sinclair (Modern Otomi of the Valle del Mezquital) (Hernández Cruz et al. 2004).

Earlier studies about loanwords in Otomi have been written by Ecker (1966), Zimmermann (1992), Lastra (1994), Hekking & Bakker (1998a, 1998b) and Bartholomew (2000). All these scholars have discussed Spanish loanwords in Otomi, while only Bartholomew and Ecker have dealt with the question of Nahuatl words in the language. Let us first have a brief look at the Nahuatl borrowings. Ecker suggests that Otomi and Nahuatl must have influenced each other mutually, but Bartholomew (2000) claimed that Otomi has adopted only calques from Nahuatl and no authentic loanwords. Bartholomew’s stand was mainly based on an analysis of Urbano’s (1990 [1605]) trilingual dictionary. Indeed, we found only a few Nahuatl borrowings in our list.

The situation is altogether different for Spanish. To the best of our knowledge, Bartholomew (1954) wrote the first study about Spanish loanwords in Otomi. Lastra (1994) is another important study on this topic. Arguably, Zimmermann (1992) was the first publication on Spanish grammatical loanwords in Otomi, triggering off a whole series of studies about that category.
Zimmermann (1992: 295–298) shows that both in the variant of Otomi spoken in the Valle del Mezquital (State of Hidalgo) and in the catechisms written in Otomi between 1785 and 1826, many Spanish prepositions and conjunctions were used, such as hasta ‘until’; con ‘with’; para ‘for’; o ‘or’ and que ‘that’. He suggests that the use of these grammatical borrowings is pragmatically motivated. In connection with Zimmermann’s (1992) study, Hekking & Muysken (1995) compare the Spanish function words used in the Otomi spoken in Santiago Mexquititlán and the Spanish function words used in the Quechua spoken in Potosí, Bolivia. They show that far more Spanish prepositions and conjunctions, in particular subordinators, are used in Santiago Mexquititlán Otomi than in Quechua. They conclude that the difference between the two languages is motivated by the difference in structure. Hekking (1995) gives a detailed description of language shift in Santiago Mexquititlán and of the Spanish loanwords used by 31 Otomi speakers from that community. Hekking focuses in particular on grammatical loanwords and the consequences the adoption of these loanwords has for the Otomi grammar. Apart from that, he reports that there is a large number of Spanish numerals and interjections but only a small number of borrowed adjectives. All these points have been elaborated and discussed in more depth in later studies by Hekking & Bakker (1998a, 1998b, 2007), Bakker & Hekking (1999), Hekking (2001, 2002), and Bakker et al. (2008).

3. Contact situations

Otomi was the native language of the original inhabitants of the Valley of Mexico and the surrounding valleys. Throughout history Otomi speakers had to confront the Aztecs, the Spaniards and the Mestizos, speakers of Nahuatl and Spanish, i.e. languages that belong to other language families, Uto-Aztecan and Indo-European respectively. Since the arrival of the Nahuas on the Mexican high plateau in the 10th century there must have been close contact between the Otomi and Nahuatl languages. Only a few studies have been written about the mutual influence between Otomi and Nahuatl, notably Ecker (1966) and Bartholomew (2000). Not much is known about this topic, and more research needs to be done. We suppose that in the beginning Nahuatl was influenced by the Otomis, whose culture was dominant then. However, from the 13th century onwards, the tables were turned, and the Otomis were subjected to the Nahuas’ dominance. From then on, we may assume that Otomi has been influenced by Nahuatl. It was during pre-Columbian times that the Nahuas developed a very negative image of the Otomis. Later, this was passed on to the Spanish chroniclers from the colonial time, such as Sahagún (1989 [1557]), whose Nahuatl speaking informants considered the Otomis toscos e inhábiles (‘coarse and unskillful’). The very fact that the word Otomi is probably a derivation from the Nahuatl word totomitl ‘birdhunter’ is an example of the negative image imposed by the Nahuas (Jiménez Moreno 1939).

The Otomis themselves prefer to call their language Hñöñhö or Hñähñu, and themselves Ñöñhö or Ñähñu. The word Ñöñho probably means “he who speaks well”. The morphemes ñö and ñä mean ‘speak’. The meaning of -ñho is controversial. Some scholars relate it to hñö or hñä ‘breathing’ or ‘respiration’. We suppose that it is a derivation of the morpheme hño ‘well’. The morpheme h- marks the impersonal or passive voice (Hekking 1995).
The Otomis have also been in contact with the Mazahuas, with whom they had a relationship of equality, and the Chichimecs, in comparison with whom the Otomis probably felt superior. It is interesting to mention that the Otomis from Tolimán claim that their ancestors originally spoke Chichimeca. This could mean that in the Otomi spoken in that community some Chichimeca substrate might be found. From around the year 1500 onwards Otomi has been in contact with Spanish. As a result it has undergone pervasive influence from that language. This contact caused the adoption of Spanish lexical and grammatical material in Otomi with the concomitant changes in the indigenous grammar. It also introduced a process of language shift for well over the last two centuries.

Since the Otomis were the second most numerous group after the Nahuas on the highlands, the Spaniards were highly interested in their conversion to Catholicism. Although their language was considered to be very difficult, mainly because of its high number of vowels and consonants, a spelling system for Otomi was developed as well as some grammars and vocabularies. As a result, a large number of catechisms and legal documents were written in Otomi. Several missionary friars of the Franciscan order studied the Otomi language, such as Fray Pedro Cárceres, who wrote his Arte de la lengua Otomi around 1580 in Querétaro. Some decades later the Franciscan priest Fray Alonso Urbano wrote the linguistically important Arte breve de la lengua Otomi y vocabulario trilingüe, Español-Otomi-Náhuatl, a trilingual dictionary, in which we already find several Spanish loanwords. Sociolinguistically interesting is the Códice Martín del Toro from the second half of the 17th century. This story has a hero who belongs to a lineage of Otomi nobles who had formed a military alliance with the Spaniards. In this text, there is constant code switching between Otomi and Spanish (Guerrero 2002).

After the independence of Mexico in 1813 the indigenous groups lost their status as more or less independent communities. As a result of this loss in status, many Otomis could no longer afford their education. Another consequence was that Otomi stopped being used by the civil authorities, apart from a handful of scholars. It was in the nineteenth century that a process of language shift started. The Mexican Revolution (1911–1917) did not lead to social change for the Otomi population, nor did it foster recognition of their language or stop language shift. On the contrary, after a long history in which the Otomis had been degraded socially, they belong today to the lowest social levels of Mexican society.
They live in the most remote and least fertile places in the highlands and engage in subsistence agriculture, a reason for many of them to emigrate to large cities such as Mexico City, Guadalajara, and Monterrey.

In the 20th century several attempts were made to integrate the indigenous population into the national community by means of the officially called Educación Bilingüe. This program is taught mainly by indigenous teachers with a very negative attitude towards their own roots and a complete lack of knowledge about bilingual education. In actual practice, therefore, Otomi is not taught at school at all, and most Otomis remain illiterate in their first language and very often have insufficient command of the standard variety of Mexican Spanish as well. Otomi is only spoken in informal domains such as the family, among good friends, and to one’s godfather, godmother and godchildren. In church as well as in the media, Spanish is virtually the only language employed. There exists hardly any written modern literature in Otomi.

During the last 20 years, because of the construction of roads and schools, the growing influence of the media, the growth in trading and emigration, contact between the relegated Otomis and the Spanish speaking world of the Mestizos has increased considerably. As a result, a rapid increase in contact phenomena from Spanish may be observed (Bakker & Hekking 1999; Hekking 1995, 2001, 2002; Hekking & Bakker 1998a, 1998b, 2007). Because theirs is a stigmatized language, only spoken by poor and traditional people, many Otomis do not want to transmit the indigenous language to their children any more. Without a dramatic change in the sociolinguistic situation the language will probably disappear within a few generations, despite the considerable number of native speakers today.

4. Numbers and kinds of loanwords

Of the 1457 Loanword Typology meanings, 48 (3.4%) have no counterpart in Otomi, neither native nor borrowed. Of these, 31 are considered to be irrelevant for the speakers (e.g. reef, arctic lights, stingray, raccoon). The other 17 simply have no counterpart for unknown reasons (e.g. swamp, calm, rough, fin, alone). The missing counterparts stem mainly from the semantic fields Animals (8 meanings), Agriculture and vegetation (8), The physical world (6), and Motion (6). Of the remaining 1411 meanings, 231 (16.3% of the overall lexicon) have one or more loanwords as counterparts. For 97 of these (6.9%), a loanword is the only form available. Since for 20 meanings there are in fact two loanwords, the total of the borrowings amounts to 251.

The major source of borrowing is Spanish: of the 231 meanings which have loanword counterparts, 229 are expressed by a lexeme stemming from this language. We were able to relate only two words to any other language with some certainty, in both cases Nahuatl: kwate ‘twins’ and the calque xifik’eña ‘centipede’. It is very possible that a more thorough search for cognates in our lexicon with respect to Otomi, other Otomanguean languages, and Nahuatl will reveal more potential loans from the latter language, but they would be very old and fully integrated phonologically. Of the total number of Spanish borrowings that we found, we have some hesitation about only two: mixi ‘cat’ and lobo ‘ball’ (probably from Spanish globo ‘sphere’). In the next sections we will discuss the distribution of the Spanish borrowings among the respective word classes and semantic fields. Given the low number of loans from languages other than Spanish, we will restrict ourselves to the latter language.

4.1. Loanwords and semantic word class

The meanings in the database are labeled for their semantic word class, i.e. the part of speech that is prototypically used for the expression of a LWT meaning. The distribution of the 1413 relevant meanings in our database is provided in Table 1. The column ‘Only Spanish loanword’ gives the percentage of meanings in each word class for which a loanword is the only counterpart. The column ‘Spanish loanword and non-loanword’ gives the meanings which have both a native word and a loanword as counterparts. The column ‘Non-loanword only’ gives the percentage of meanings whose counterparts are not loanwords.

Table 1: Loanwords in Otomi by semantic word class (percentages)

                  Only Spanish   Spanish loanword    Non-loanword
                  loanword       and non-loanword    only
Nouns             11.1           12.5                 76.4
Verbs              0.3            4.1                 95.6
Adjectives         0.7            1.7                 97.6
Adverbs            -              -                  100.0
Function words     -             10.9                 89.1
all words          6.9            9.5                 83.7
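The overall figures quoted at the start of this section can be recomputed from the raw counts. The following is a minimal sketch, not part of the original study: the function and variable names are hypothetical, and it assumes that the 1413 relevant meanings reported for the table are the right denominator.

```python
# Counts quoted in Section 4. Using 1413 relevant meanings as the
# denominator is an assumption; the names below are illustrative only.
RELEVANT_MEANINGS = 1413
WITH_LOANWORD = 231   # meanings with at least one loanword counterpart
LOANWORD_ONLY = 97    # meanings for which a loanword is the only form
TWO_LOANWORDS = 20    # meanings that have two loanword counterparts


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


def overall_row() -> dict:
    """Recompute the 'all words' figures from the raw counts."""
    both = WITH_LOANWORD - LOANWORD_ONLY            # loanword and native word
    native_only = RELEVANT_MEANINGS - WITH_LOANWORD  # no loanword at all
    return {
        "only_loanword": pct(LOANWORD_ONLY, RELEVANT_MEANINGS),
        "loanword_and_native": pct(both, RELEVANT_MEANINGS),
        "native_only": pct(native_only, RELEVANT_MEANINGS),
        "any_loanword": pct(WITH_LOANWORD, RELEVANT_MEANINGS),
        "loanword_forms": WITH_LOANWORD + TWO_LOANWORDS,
    }


if __name__ == "__main__":
    for name, value in overall_row().items():
        print(f"{name}: {value}")
```

Under that assumption the recomputed values agree with the ‘all words’ percentages and with the total of 251 borrowed forms; the per-class rows cannot all be checked this way, because the text does not give every underlying count.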

As most studies in language contact show, nominal borrowings are typically the most frequent ones encountered in a wide variety of contact situations (cf. Thomason 2001: 70f). The distribution of the percentages over the two loanword columns seems to confirm this tendency.¹ Although this does not necessarily imply that the corresponding loanword is a noun in Spanish, this is the case for all relevant meanings. In total there are 204 out of 864 meanings of a nominal character for which a Spanish word is borrowed. For 96 (47.1%) of these, a loanword is the only alternative. For another 108 there is an Otomi alternative. In all, about a quarter of the nominal meanings have a borrowed form as a possible or only counterpart.

Verbal meanings are much less open to borrowing: just 13 out of 338 verbal meanings have a Spanish loanword as counterpart, of which only one is without an Otomi alternative. This is pesa ‘to weigh’ (< pesar). Interestingly, of the other forms, only five correspond to a pure verb: kre (< creer), which corresponds to the meanings ‘to think (2)’ and ‘to believe’; lei ‘to read’ (< leer); sige ‘to follow’ (< sigue); and mfende ‘to defend’ (< defender). The other seven Spanish forms are either plain nominal stems, or nominal or verbal stems turned into verbal predicates by adding an Otomi verb such as ot’e and ja (pi) ‘to do, to make’. Examples are given in (1).

(1) ‘to pile up’   mundo (< montón ‘heap, lot’)
    ‘to injure’    ot’e dañu (< daño ‘damage’)
    ‘to admit’     japi ar kaso (< caso ‘case’)
    ‘to rescue’    ja salba (< salvar ‘to save’)

Even more restricted is borrowing for adjectival meanings. Of the total of 115 relevant meanings in this category, only two have a loanword as a counterpart. One of these is a noun in Latin: animä (< ánima ‘the soul’), which is used to express the adjectival meaning ‘dead’. The second is a word that may be both an adjective and a noun in its original usage: dondo (< tonto ‘mad, stupid’) occurs frequently as the head of a noun phrase in Spanish. Adverbs are completely absent as borrowings, and this category is represented only four times in the database. Finally, there is quite a number of borrowed function words: 10 out of 92 meanings in this category have a Spanish-derived counterpart. These will be discussed in more detail in § 6 below. The low number of adverbial meanings in the subdatabase (four) does not warrant any conclusion about the borrowing of this category.

¹ Obviously, this figure is biased by the selection made in the LWT database. Our impression is that in this selection, nouns are underrepresented in terms of proportion of types in relation to what one might find in a dictionary.

All these facts fall in step with what we have observed elsewhere in relation to borrowing in Otomi (Bakker et al. 2008; Hekking et al. 2009+). The figures in Table 2 below are based on three corpora which we collected over the last decade: a 110,541 token corpus of spoken Otomi, a 79,718 token corpus of spoken Quechua and a 57,828 token corpus of spoken Guarani. For Otomi, we found a total of 15,571 borrowings from Spanish, i.e. 14.1% of the overall number.² The percentages for Quechua and Guarani were somewhat higher than this: 18.9% and 17.4%, respectively. Table 2 provides a breakdown for the four major lexical categories for these three Amerindian languages. We have taken into consideration only the lexical as opposed to the grammatical categories, since the former may be argued to be in direct ‘competition’ with each other in the lexical-semantic space in which borrowing takes place.

Table 2: Borrowing from Spanish in three languages

Part of Speech   Otomi    Quechua   Guarani
Noun             78.4%    64.6%     57.8%
Verb              9.2%    21.1%     28.1%
Adverb            8.7%     4.1%      3.5%
Adjective         3.6%    10.2%     11.3%
Total            100%     100%      100%

These figures support the observation, made on the basis of Table 1, that Otomi prefers to borrow nominal material rather than verbal or adjectival material from Spanish. Otomi even stands out against both Quechua and Guarani in its preference for nouns. Verbs and especially adjectives are dispreferred. Our explanation for this is that Otomi lacks the category adjective altogether, and prefers nominal to verbal strategies in its grammar. Thus, the typological differences between these languages may well be reflected in their borrowing behavior.³

³ The number of types amounts to 21.3%, and is higher than the 16.3% we found for the LWT list. This may be expected, given the fact that the LWT list comprises what may be considered to be the core lexicon, which should be less prone to borrowing. On the other hand, our corpus counts contain quite a number of phonological variants of the same lemma, and are therefore an overestimation of the real situation.

4.2. Loanwords and semantic fields

Now let us have a look at the semantic fields where Spanish borrowings are to be found. The overall distribution is found in Table 4 towards the end of this section. Table 3 highlights the most frequent categories (more than 20% borrowed) and the least frequent ones (less than 10% borrowed). Per field, we give the total number of relevant meanings in brackets in the first column. The second column gives the percentages of the meanings for which there is a borrowed word. The third column gives the percentage of the meanings in column one for which there is only a borrowing and no Otomi counterpart.

Table 3: Spanish loans by semantic field

Semantic field                     Borrowed   Unique
Modern world (57)                  53%        18%
Animals (105)                      35%        29%
Agriculture and vegetation (66)    29%        17%
Warfare and hunting (40)           22%         5%
The house (45)                     22%         4%
Clothing and grooming (57)         21%        11%
Religion and belief (24)           21%         8%
…
Spatial relations (74)              9%         4%
Emotions and values (48)            8%         6%
Law (26)                            8%         4%
The body (156)                      6%         2%
The physical world (68)             3%         0%
Sense perception (48)               2%         0%
Total (1413)                       16.3%       6.9%

As may be expected, by far the most borrowings are found in the domain Modern world. More than half of the meanings had loanword counterparts. It is interesting, however, that only about a third of these are unique (have only loanword counterparts). For the rest there is also an Otomi alternative. Some examples of unique borrowings are given in example (2).

(2) ‘the television’   telebisyon (< televisión)
    ‘the pill’         pastiya (< pastilla)
    ‘the police’       polisiya (< policía)

Some examples of “doubles” are found in (3). The Otomi counterparts are often compounds, and some seem to be ‘constructed’, and slightly artificial.

(3) ‘the radio’       radio (< radio)           thuhu (= music, to call)
    ‘the hospital’    ospital (< hospital)      nguu nt’othe (= house where to cure)
    ‘the newspaper’   peryodiko (< periódico)   he’mi mpa’bu (= paper every day)

Examples of LWT meanings from this category for which there are no loanword counterparts, although a word is readily available in Spanish (and is frequently borrowed into other languages), are to be found in (4).

(4) ‘the telephone’   nts’ohñä (= object for calling words)
    ‘the bus’         dätä boja (= big iron)
    ‘the cigarette’   ‘yui

The second most frequent category, Animals, has an almost strict division between Otomi names for indigenous animals (e.g. boar, rabbit, goat) and borrowed names for introduced ones (e.g. bull, cow, calf). A rare double is for ‘pig’: berko (< puerco) and ts’udi. On the other side of the scale we find the categories The body, The physical world and Sense perception, for which few items are borrowed. The first category contains only three loanwords without Otomi counterparts: animä (< Latin ánima ‘spirit’) ‘the corpse’ and also ‘dead’; pulmon ‘the lung’ (< pulmón); and riñu ‘the kidney’ (< riñón). An interesting side effect of cultural contact for the lexicon is the Otomi word fani. Its pre-Columbian meaning is ‘deer’. After the Spaniards introduced horses, the word came to be used also for this new animal, which became dominant in the streets. Over time, ‘horse’ has become the only meaning, with derivations such as tafni ‘stallion’ (= father horse); tsufni ‘mare’ (= female horse); t’olo fani ‘foal’ (= little horse); and even fanibojä ‘bicycle’ (= horse of iron). Deer themselves were demoted to the compound fani fantho (literally ‘horse of mountain’), or, interestingly, hogufani (‘the real horse’).

Table 4: Loanwords in Otomi by semantic field (percentages)

                                        Spanish loanwords   Non-loanwords
 1   The physical world                   3.6                96.4
 2   Kinship                              3.8                96.2
 3   Animals                             21.8                78.2
 4   The body                             3.2                96.8
 5   Food and drink                      14.0                86.0
 6   Clothing and grooming               15.2                84.8
 7   The house                           17.0                83.0
 8   Agriculture and vegetation          17.7                82.3
 9   Basic actions and technology         6.2                93.8
10   Motion                               6.3                93.7
11   Possession                          13.5                86.5
12   Spatial relations                    7.6                92.4
13   Quantity                             9.7                90.3
14   Time                                13.3                86.7
15   Sense perception                     1.2                98.8
16   Emotions and values                  5.4                94.6
17   Cognition                            9.0                91.0
18   Speech and language                  6.9                93.1
19   Social and political relations       7.3                92.7
20   Warfare and hunting                 17.5                82.5
21   Law                                  6.2                93.8
22   Religion and belief                 11.2                88.8
23   Modern world                        31.3                68.7
24   Miscellaneous function words        11.1                88.9
                                         10.7                89.3

5. Integration of loanwords

As discussed above, in the LWT Otomi database there are 251 Spanish loanwords, which adjust in different degrees to the phonological patterns of Otomi. Most of the adaptations that we discuss elsewhere (Hekking 1995; Hekking et al. 2009+) are also found in our LWT database. We mention the following.

(a) The nasalization of the Spanish open central vowel a (or the half open back vowel o) in syllables that start with the nasal consonants m, n or ñ: kadenä /kadenã/ (< cadena ‘the chain’);
(b) The nasalization of the sibilants and voiced and unvoiced plosives at the beginning of a syllable: nsinke /nsinke/ (< sin que ‘without’);
(c) The loss of vowels that form the nuclei of unstressed syllables: biskleta /biskleta/ (< bicicleta ‘bicycle’) or the replacement of these vowels by the central vowel u /ɨ/: bispura /bispɨra/ (< víspera ‘evening’);
(d) The insertion of the epenthetical vowel u /ɨ/ in loanwords with consonant clusters atypical for Otomi: dokutor /dokɨtor/ (< doctor ‘doctor’) […] televisión ‘television’).

Although these criteria may help to classify loanwords as old or new, we are conscious of the fact that the shortage of historical material written in Otomi remains a serious problem for determining more exactly the time of entry into the language. With the available means, we established that 111 (44.2%) of the loanwords in our list are pre-modern (i.e. before 1950), while 136 (54.2%) would be modern (after 1950). In the case of four words (1.6%) we could not establish a date with any certainty. Given a time depth of around 450 years for the older words, and only 50 years for the newer ones, these figures suggest a dramatic increase in the amount of borrowing over recent decades. We will give a few examples of both categories. First some loanwords we assume to be pre-modern.

anxe       < ángel      angel
animä      < ánima      dead, corpse, carcass
ata        < altar      altar
badu       < pato       duck
baga       < vaca       cow
benä       < avena      oats
bisinu     < vecino     neighbor
boi        < buey       ox
boto       < botón      button
denda      < tienda     shop
doro       < toro       bull
mäzo       < macho      mule
mexa       < mesa       table
mundo      < montón     heap, pile, crowd, multitude
nhwebe     < jueves     Thursday
nonxi      < lunes      Monday
nsinke     < sin que    without that
‘ñalba     < alba       dawn
pale       < padre      old man, grandfather
serbeza    < cerveza    beer
sumänä     < semana     week
sundado    < soldado    soldier
swida      < ciudad     city
tambo      < tambor     drum
xebo       < cebo       bait

An interesting loanword is the word animä which is originally a Latin word first adopted in Spanish and later in Otomi. That the older loanwords are more integrated than the more recent ones is supported by the fact that there is an Otomi counterpart for only 37.6% of them. For the modern loanwords there is an Otomi alternative for the majority of the relevant meanings: 70.7%. The following old loanwords have an Otomi alternative. All forms were found in the dialect of Santiago Mexquititlán.

Loanword   Spanish source          Otomi alternative   Gloss
kolmenä    < colmena ‘beehive’     gäne                bee
señä       < señal ‘sign’          ntheni              scar
fruta      < fruta                 ixi                 fruit
kre        < creer                 ñeme                believe
dondo      < tonto                 xongo, ‘bemfeni     stupid

The following are examples of modern loanwords with counterparts in Otomi, again all occurring in the data from Santiago. Admittedly, some of the Otomi versions are more or less clearly constructed, and probably created by purist speakers of the indigenous language after the borrowed word had entered daily conversation.

pelota      < pelota      nuhni                                          ball
lisensya    < licencia    nthegi, seki                                   license
kafe        < café        ‘bothe                                         coffee
tiyo        < tío         ‘we                                            uncle
rweda       < rueda       tsant’i ‘round’                                wheel
abyon       < avión       nsani bojä ‘fly car’                           airplane
banko       < banco       nguu njwatubojä ‘house where to keep money’    bank
dokutor     < doctor      ‘yothe ‘someone who cures’                     doctor
peryodiko   < periódico   he’mi mpa’bu ‘paper daily’                     newspaper
tren        < tren        mabojä ‘long car’                              train

Examples of modern loanwords without counterparts in Otomi are:

bufalo      < búfalo      buffalo
ladriyo     < ladrillo    brick
pelikula    < película    film
sorgo       < sorgo       sorghum
tiburon     < tiburón     shark
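The dating figures given earlier in this section (111 pre-modern, 136 modern and four undated loanwords) lend themselves to a quick arithmetic check of the claim that borrowing has accelerated. The sketch below is illustrative only: the names are hypothetical, and the time depths of roughly 450 and 50 years are the text's own estimates, taken here as plain assumptions.

```python
# Dating figures quoted in Section 5. The time depths (in years) are the
# text's rough estimates; all names here are illustrative assumptions.
COUNTS = {"pre-modern": 111, "modern": 136, "undated": 4}
YEARS = {"pre-modern": 450, "modern": 50}


def shares(counts: dict) -> dict:
    """Share of each dating class in the total, as a percentage."""
    total = sum(counts.values())
    return {k: round(100 * v / total, 1) for k, v in counts.items()}


def per_decade(counts: dict, years: dict) -> dict:
    """Average number of surviving loanwords that entered per decade."""
    return {k: round(10 * counts[k] / years[k], 1) for k in years}


if __name__ == "__main__":
    print(shares(COUNTS))
    print(per_decade(COUNTS, YEARS))
```

On these assumptions the modern rate comes out roughly eleven times the pre-modern one. This is only a rough average over the loanwords attested in the list, so it should be read as an indication of scale rather than a measured rate.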

6. Grammatical borrowing

In §4.1 we mentioned that 10 of the 92 meanings in the function word category have Spanish loanword counterparts. These are quantifiers (more, a thousand, all, twice); prepositions (with, without); time adverbials (never, immediately); and a conjunction (or). However, there is always an Otomi alternative available. With just over 10% of the functional meanings borrowed, and none of them uniquely, this appears to be one of the less central areas for borrowing. In our corpus of spoken Otomi, however, we found a remarkable number of Spanish function words. No less than 48.1% of the borrowed tokens belong to the grammatical area of the lexicon. Table 5 gives the most important categories.

Table 5: Grammatical parts of speech borrowed

Part of Speech      Otomi percentages borrowed
Preposition         21.2%
Coordinator          7.5%
Discourse Marker     6.5%
Subordinator         6.1%
Other                6.8%
TOTAL               48.1%
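The percentages above are shares of all borrowed tokens. To see how the grammatical borrowings divide among themselves, they can be rescaled to their own total; the following minimal sketch does that, with hypothetical names and no data beyond the five figures in the table.

```python
# Shares of all borrowed tokens, copied from the table above.
# The dictionary and function names are illustrative assumptions.
GRAMMATICAL = {
    "Preposition": 21.2,
    "Coordinator": 7.5,
    "Discourse Marker": 6.5,
    "Subordinator": 6.1,
    "Other": 6.8,
}


def within_grammatical(table: dict) -> dict:
    """Rescale each category to a percentage of the grammatical total."""
    total = sum(table.values())
    return {k: round(100 * v / total, 1) for k, v in table.items()}


if __name__ == "__main__":
    print(round(sum(GRAMMATICAL.values()), 1))
    print(within_grammatical(GRAMMATICAL))
```

Rescaled in this way, prepositions account for a little under half of the grammatical borrowings, with the remaining categories each contributing between roughly an eighth and a sixth.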

Thus, almost half of the borrowed function words are prepositions. We found 54 different types. Interestingly, neither Quechua nor Guarani seems to borrow more than the occasional preposition from Spanish. We assume that typological differences play a role here. While Otomi has adverbial elements which are typically found in a prenominal syntactic position, Quechua and Guarani are both clearly postpositional. There is also a remarkable number of coordinators and subordinators, with many different functions at several levels of syntax. Classical Otomi is very much asyndetic in the sense that many relations between constituents remain implicit. The frequent use in the spoken language of Spanish prepositions (for all kinds of relations between nominal and verbal constituents), coordinators (for addition, contrast, and disjunction) and subordinators (for time, place, manner, purpose, cause, condition, and concession, among others) clearly affects the asyndetic character of the language.

We noticed several other contact-induced changes in the grammar. In §4.1 we mentioned sige (< sigue ‘follow’) as one of the few verbs in the loanword list. When we look at the use of this verb in the corpus, we find that it is typically used as an auxiliary, as in example (5).

(5) Him-bi=patu         nu-r         hñäñho,   syempre   sige       ñätho.
    NEG-PAST.3=change   DEM-DEF.SG   Otomi     always    continue   speak
    ‘Otomi hasn’t changed, they always keep speaking it’

Other Spanish verbal borrowings in the corpus, not represented in the LWT database, are tyeneke (< tiene que ‘have to’); debe (< debe ‘must’); pwede (< puede ‘can’); and nesesita (< necesita ‘need’). Their introduction affected the Otomi modal system, which is mainly based on verbal affixes and nominal expressions. Furthermore, there appears to be a tendency in nonverbal predication to insert the Spanish copula ta (< está ‘be’) where classical Otomi would just have the nominal with an optional proclitic for tense.

Finally, we mention the frequent use of two Spanish elements in paragraph initial position: pos (< pues ‘well’) and este (< este ‘this’). The latter is the masculine singular of the Spanish proximate demonstrative, which is clearly used by Otomi speakers as turn holder and hesitation marker. It is very characteristic of Mexican Spanish in this function as well. We think that the main function of these elements is to give a Spanish flavor to Otomi utterances, and possibly also to accord high prestige to the speaker. Following Matras (1998) we might assume that, in fact, the discourse structure of Otomi is converging towards that of Spanish. This is illustrated in example (6) below.

(6) Temu   gi=mä-nge?
    what   PRES.2=say-EMPH.2

    Pwes   nuga   di=mä-nga           gatho        ar       za
    well   I      PRES.1=say-EMPH.1   everything   DEF.SG   good
    ‘What do you think? Well, I think everything is okay.’

Potentially in line with this, the constituent order of Otomi has been influenced by Spanish as well. There is a tendency in Otomi to replace the classical VOS and VS main clause orders by the Spanish orders SVO and SV, which may be indicative of a restructuring in discourse pragmatics. While (7a) is the only acceptable order in the classical language, today we find utterances such as (7b) as well.

(7) a. Xi=nkuhi          ar       ngo
       PRF.3=delicious   DEF.SG   meat
       ‘The meat is delicious.’
    b. Ar       ngo    xi=nkuhi
       DEF.SG   meat   PRF.3=delicious
       ‘The meat is delicious.’

7. Conclusion

We have sketched the language contact situation of the Otomanguean language Otomi, from Mexico. This language shows a clear impact on its lexicon from Spanish. We found a Spanish loanword counterpart for 16.3% of the meanings on the LWT list; 6.9% of the meanings are expressed by a Spanish loanword only. Given our criteria for the integration of loanwords, around 45% of the Spanish borrowings were classified as pre-modern, i.e. in use (long) before 1950. For the majority of these, there is no Otomi counterpart, which is indicative of their integration. For the borrowings of a more recent date we do find an Otomi alternative in over 70% of the cases. A fine-tuned analysis of the use in context of these might reveal in what sense words of Spanish origin are replacing indigenous ones. Most of the corresponding Otomi forms are established, monomorphemic lexemes of the language. However, we did find a number of apparent new formations, typically compounds. To what extent these are successful may only be found out in the long run. We think, however, that the prestige attached to the Spanish loanwords, which very often are themselves borrowings from English or international jargon, will make them win out in the end.

The figures for borrowing in the LWT lexicon are supported by the findings from our extensive corpus of spoken Otomi. They show that, apart from a sizeable number of lexical borrowings, a large number of grammatical elements from Spanish have been copied, mainly prepositions and discourse markers. The fact that the majority of the borrowings entered the language only during the last 50 years is indicative of the growing influence of Spanish. This is obviously due to the recent changes in the overall structure of Mexican society, the growing influence of the mass media, education, migration, and globalization in general. Even if Otomi is currently still spoken by well over 300,000 speakers, it is far from clear whether there would be many left around the year 2100.

Apart from two clear cases, we did not find any borrowings from languages other than Spanish. We think that it is unlikely, however, that more than a hundred years of domination by the speakers of Nahuatl, and ongoing contact with surrounding indigenous languages also after the Spanish invasion, would have left no traces in the Otomi lexicon. A more thorough comparison with the lexicons of the relevant languages than we have so far been able to carry out is called for.


35. Loanwords in Otomi


Ewald Hekking and Dik Bakker

Loanword Appendix

Spanish

ternera ‘calf’
bota ‘boot’
t’olo baga ‘calf’
gorro ‘hat, cap’
berko ‘pig’
unwento ‘ointment’
t’u fani ‘foal, colt’
mänta ‘poncho’
ndobru ‘donkey’
boto ‘button’
tsubru ‘donkey’
stufa ‘stove’
tsumuzo ‘mule’
ladriyo ‘brick’
mula ‘mule’
meskla ‘mortar (2)’
lobo (1) ‘wolf’
nthokukomida ‘cookhouse’
sorra ‘fox’
bentanä ‘window’
letxusa ‘owl’
siya ‘chair’
t’olo fani ‘foal, colt’
poste ‘doorpost’
animä ‘dead’
arko ‘arch’
pulmon ‘lung’
pala ‘spade’
riñu ‘kidney’
arros ‘rice’
señä ‘scar’
paxa ‘hay’
dokutor ‘physician’
tabako ‘tobacco’
txitxi ‘nipple, teat’
koko ‘coconut’
tasa ‘cup’
simiya ‘seed’
sopa ‘soup’
sorgo ‘millet, sorghum’
fruta ‘fruit’
igo ‘fig’
hmunts’upasto ‘rake’
uba ‘bunch’
subada ‘barley’
gexu ‘cheese’
benä ‘oats’
mäntekiya ‘butter’
pala ‘spade’
binu ‘wine’
era ‘threshing floor’
serbesa ‘beer’
ornu ‘oven’
planta ‘plant’
pinu ‘pine’
bambu ‘bamboo’
piñä ‘cone’
kobre ‘copper’
plomu ‘lead’
klabo ‘nail’
kadenä ‘chain’
märtiyo ‘hammer’
pega ‘glue’
bronse ‘copper’
bidryo ‘glass’
siriyo ‘match’
kweba ‘cave’
seriyo ‘match’
sobrinu ‘nephew’
sobrinä ‘niece’
pale ‘grandfather’
tiyo ‘uncle’
burru ‘donkey’
ganso ‘goose’
badu ‘duck’
liyon ‘lion’
sorro ‘fox’
elefante ‘elephant’
kameyo ‘camel’
kolmenä ‘bee’
kanguro ‘kangaroo’
kokodrilo ‘crocodile’
gabyota ‘seagull’
garsa ‘heron’
loro ‘parrot’
palomä ‘dove’
tiburon ‘shark’
delfin ‘dolphin’
bayenä ‘whale’
sera kolmenä ‘beeswax’
kamaro ‘prawns, shrimp’
kodornis ‘quail’
nwes ‘nut’
alse ‘elk, moose’
asete ‘oil’
bufalo ‘buffalo’
ndega ‘butter’
jawar ‘jaguar’
pinsa ‘tongs’
pasto ‘pasture’
oro ‘jewel’
doro ‘bull’
xabo ‘soap’
boi ‘ox’
lino ‘linen’
baga ‘cow’
seda ‘silk’
t’olo doro ‘calf’
fyeltro ‘felt’
fani ‘horse’
ilo ‘thread’
t’u boi ‘calf’
medya ‘sock, stocking’
sinsel ‘chisel’
bispura ‘evening’
paraiso ‘heaven’
erramyenta ‘tool’
ndomingo ‘Sunday’
‘moe animä ‘sacrifice’
yugu ‘yoke’
kolor ‘colour, color’
plaka ‘license plate’
timu ‘rudder’
anxe ‘soul, spirit’
kotxi ‘car’
sige ‘to follow’
hoguswerte ‘good luck’
mäkinä ‘machine’
kartera ‘road’
ts’oswerte ‘bad luck’
biskleta ‘bicycle’
karreta ‘cart, wagon’
penä ‘grief’
lisensya ‘driver’s license’
rweda ‘wheel’
komongu ‘manner’
prisidente ‘president’
ehe (2) ‘axle’
kre ‘to think (2)’
motosikleta ‘motorcycle’
barko ‘ship’
dondo ‘stupid’
pitrolyo ‘petroleum’
denda ‘shop, store’
skwela ‘school’
bomba ‘bomb’
pesa ‘to weigh’
‘yoskwela ‘pupil’
polisiya ‘police’
kosa ‘thing’
mästro ‘teacher’
pelikula ‘film, movie’
ja salba ‘to rescue’
komu ‘manner’
pastiya ‘pill, tablet’
ot’e dañu ‘to injure’
o (2) ‘or’
torniyo ‘screw’
kwenta ‘bill’
lei ‘to read’
plastiko ‘plastic’
renta ‘to hire’
japi ar kaso ‘to admit’
kafe ‘coffee’
merkado ‘market’
lapisero ‘pen’
tren ‘train’
pelota ‘ball’
trompeta ‘horn, trumpet’
bateriya ‘battery’
raya ‘line’
bisinu ‘neighbour’
radyo ‘radio’
norte ‘north’
swida ‘town’
ospital ‘hospital’
sur ‘south’
mända ‘to command, to order’
peryodiko ‘newspaper’
kwadro ‘square’
oroplano ‘airplane’
lugar ‘place’
enemigo ‘enemy’
abyon ‘airplane’
pilota ‘ball’
ilo‘bejwä ‘fishing line’
mfrenä ‘to brake’
lobo (2) ‘ball’
xebo ‘bait’
mpoyeta ‘injection’
mundo ‘to pile up’
ejersito ‘army’
numero ‘number’
pare ‘pair’
sundado ‘soldier’
seyo ‘postage stamp’
mil ‘a thousand’
kasko ‘helmet’
banko ‘bank (financial institution)’
mäs ‘more’
tropa ‘army’
kadu ‘all’
soldado ‘soldier’
kalendaryo ‘calendar’
yoho ya bes ‘twice, two times’
lansa ‘spear’
kaye ‘street’
kañon ‘gun’
motor ‘motor’
ora ‘hour’
mfende ‘to defend’
telebisyon ‘television’
sumänä ‘week’
ngupeskado ‘fisherman’
ko ‘with’
nhwebe ‘Thursday’
ilo nthuts’i ‘fishing line’
nsi ‘without’
nsabdo ‘Saturday’
ley ‘law’
nsinke ‘without’
nunka ‘never’
tribunäl ‘court’
is ar ora’ä ‘immediately’
hoguanxe ‘fairy, elf’
ni ‘na bes ‘never’
mprebeni ‘omen’
kwate ‘twins’
‘ñalba ‘dawn’
ata ‘altar’
xifik’eñä ‘centipede’

Nahuatl

Chapter 36

Loanwords in Saramaccan, an English-based creole of Suriname*

Jeff Good

1. The language and its speakers

1.1. Sociohistorical background

Saramaccan is an Atlantic creole spoken primarily in Suriname, though there are also speakers in French Guiana as well as a substantial diaspora population in the Netherlands. The fifteenth edition of the Ethnologue estimates that there are 26,000 speakers of the language. It is a maroon creole – that is, a creole spoken by descendants of slaves who escaped from plantations (see Price (1976) for an overview of the history of the maroons of Suriname). Accordingly, most Saramaccan villages lie in the Surinamese rain forest away from the coast, which was the center of the colonial plantation economy. These villages are situated along two rivers, the Suriname River and the Saramacca River. (The populations found along the Saramacca River, speaking the Matawai dialect, are sometimes classified as a distinct group from the Saramaccans.) All of the data discussed here, and included in the subdatabase, comes from dialects spoken along the Suriname River, of which two are generally distinguished: a Lower River dialect, spoken closer to the coast, and an Upper River dialect, spoken further in the interior (Bakker et al. 1995: 165).

In addition to Saramaccan, there are two other creoles spoken in Suriname, Sranan and Ndyuka, that are generally believed to be genealogically related to Saramaccan (see, for example, Smith 1987b: 150–169, 2002: 135–136; McWhorter 2000: 101–105; and Migge 1998: 45). There is also good evidence that the Surinamese creoles, in turn, are part of a larger genealogical unit comprising all Atlantic English-based creoles (Smith 1987b: 103–112; McWhorter 2000: 41–98). Ndyuka, like Saramaccan, is a maroon creole. Sranan, the urban and coastal creole of Suriname, represents a continuation of Surinamese plantation creole varieties and serves as a lingua franca for the country. Map 1 shows the Saramaccan-speaking area as well as the locations of other Surinamese language communities. The official language of Suriname, Dutch, as well as Sranan, are not specifically located on Map 1, as their use is widespread and their core geographic areas overlap significantly.

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Good, Jeff. 2009. Saramaccan vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1089 entries.

Map 1: Linguistic setting of Saramaccan in Suriname and African donor languages represented in the subdatabase (language locations based on the Ethnologue and Nurse & Tucker (2001))


Permanent European settlement in Suriname began in 1651 when an English colony was established along the Suriname River. English control of the area was relatively short-lived, and Suriname came under the control of the Dutch in 1667. Despite this short period of English control, the lexicons of the Surinamese creoles show heavy English influence and are generally considered English-lexifier creoles, though the Saramaccan case is quite complex since the language shows a significant Portuguese element in its basic vocabulary (see Smith 1987b: 116–125) – this issue will be discussed in more detail below in this section and in §3.

Of the Surinamese maroon societies, that of the Saramaccans is the oldest, with 1690 generally being given as the year of a first mass escape of slaves who would form the group’s founding core. Price (1976: 30) gives 1712 as the date of the last significant influx of escaped slaves into the group. By 1770, the oldest maroon societies in Suriname had signed treaties with the Dutch, which made them officially – and probably largely – closed to new recruits (Price 1976: 29–31, Bakker et al. 1995: 168–169). Early stages of the Saramaccan language are comparatively well documented, with records going as far back as 1762 (Arends 2002b: 201–205).

Lexical evidence indicates that substrates drawn from two clusters of languages, Bantu languages spoken around the former Kingdom of Loango (which would not necessarily have formed a genealogical unit; see §2) and Gbe languages, were especially influential in Saramaccan’s development (see, for example, Daeleman 1972 for Bantu and Smith 1987a for Gbe). This evidence is consistent with known demographic facts of the Surinamese slave trade, which show that most slaves who were transported to Suriname were taken from parts of Africa where languages from those two groups are spoken (see Arends 1995: 268, based largely on Postma 1990).

However, it should be noted that the recent work of Price (2008: 287–308) has suggested that the linguistic background of imported slaves may have been more heterogeneous than has recently been believed. This is because, even though much of the demographic input to the colony in its early history came from ships departing from only a few coastal areas of Africa, there is evidence that the “catchment” areas for those slaves were fairly large, encompassing not only linguistic groups in close proximity to the relevant ports but also some that were relatively distant from them.

A further point regarding the demographics of the Surinamese slave population is that, as described by Arends (1995: 268), “[t]he rate of nativization among Suriname’s black population was very slow: more than one hundred years after colonization still more than 70% of the black population was African born.” This demographic skewing is connected to the role Suriname had as a sugar plantation colony, since sugar production not only required a large labor force but, at least in the Suriname case, was also associated with an inordinately high mortality rate, meaning that new imports of slaves were necessary not only for the expansion of plantations but also for their maintenance (Arends 2002a: 115–116; Price 1976: 9). Therefore, at any given point in time in early Surinamese history, African-born slaves would have predominated in the slave population. “Indeed, during the sixty years following the Dutch takeover of 1667, the number of Africans imported in each ten-year period amounted to between 110 percent and 220 percent of the total slave population at the beginning of the decade ...” (Price 1976: 9).

These demographic patterns are probably largely responsible for the fact that Saramaccan shows a comparatively high degree of African influence in its lexicon. In fact, as will be discussed in §6, Kramer (2002: 622) goes so far as to state that modern Saramaccan much more closely resembles Fon Gbe than the eighteenth-century variety of the language did, indicating that the African element in Saramaccan is not only the result of “creolization” but is also due to later contact-induced change. The Saramaccans to this day belong to a society clearly distinct from that of coastal blacks and non-blacks of Suriname, but one that has had continuous contacts with those communities over the centuries (Price 1983: 12).

1.2. The development of the Saramaccan lexicon assumed here

Determining what lexical elements in creoles constitute “loanwords” is necessarily problematic, since their origins as contact languages do not obviously point to a single genealogical parent and, thereby, a single ancestral lexicon. What is crucial is to devise a system of vocabulary classification which is concrete enough to reliably capture interesting patterns but, at the same time, is not so inextricably tied to a particular theoretical conception of creoles that it will cease to be of value if theoretical fashions change. Indeed, in the ideal case, the Saramaccan subdatabase could be used to shed light on the various theoretical controversies regarding creole formation, which requires taking as minimal a theoretical approach as possible.

Nevertheless, one theoretically-oriented assumption has been made about the Saramaccan lexicon for the purposes of this project. This is that it represents a continuation of the lexicon of English which branched off from that of standard English varieties at some point before the formation of Saramaccan itself. The theoretical position most closely associated with this assumption is one that can be called the “superstratist” perspective, which treats creoles as varieties of their superstrate lexifiers (see McWhorter 1998: 788–790 for overview discussion). This basic position is also adopted by the other creole language subdatabase that is part of this project, Seychelles Creole (Michaelis & Muhme).

While the idea that creoles represent a continuation of their lexifiers is controversial (see, again, McWhorter 1998 for critical discussion), it is a useful one in the present context for several reasons.¹ The first is purely practical in nature: it is simply easier to determine whether a word is of superstrate origin than of substrate origin, making the superstrate lexicon a good “baseline” which loanwords are conceived of as adding to. Related to this is the fact that, since all of the superstrates have associated standard varieties, they serve as worthwhile reference points for diverse researchers. An additional practical advantage in adopting such a model is that, for a given creole, the superstrates tend to be few – most typically there is only one superstrate – while the substrates may be many, making the superstrates a simpler choice as representing the inherited lexicon.

¹ It would seem to be worth noting here that, in general, I do not personally believe that superstratist approaches provide the best models for the development of Saramaccan, even though I adopted this conception for the subdatabase.

These things being said, the forms in the subdatabase have nevertheless been coded in ways which should mitigate any problems that adopting a superstratist model may cause for those favoring other models, and, for practical purposes, it results in only one crucial effect: the loanword information fields of the subdatabase are left empty for words assumed to represent a continuation of English vocabulary. However, since these words are all explicitly marked as belonging to an Early English Stratum of vocabulary, they can still be readily identified. Furthermore, I have included comments for these words in the subdatabase indicating their English source to assist those not familiar with the sound changes that have affected English lexical items in Saramaccan. Therefore, if one wanted, for example, to treat all early English words as loanwords, contrary to what is assumed here, it would be relatively straightforward to isolate and recode them.

However, in the case of Saramaccan, there is an additional complication in adopting the superstratist position: it is a rare instance of a so-called mixed lexifier creole, showing prominent early contributions from two superstrates, English and Portuguese (Bakker et al. 1995: 165). So, it is necessary to comment on the choice of the English lexicon as being privileged as “ancestral” over the Portuguese one. There are two reasons for this, one analytical and one practical.
On the practical side, in order to ensure that the subdatabase was coded for the maximum number of possibly interesting distinctions, it seemed advisable to choose only one language as contributing the parent lexicon, as opposed to two. Since the English and Portuguese elements have been coded quite distinctly – one set as inherited, the other as loans – they are easily identified on their own, allowing the subdatabase to be used to test a range of models for the development of Saramaccan with relatively straightforward modifications.

On the analytical side, as discussed in §1.1, Saramaccan is generally believed to form a genetic unit with two other Surinamese creoles, Sranan and Ndyuka, both of which are uncontroversially English-based creoles. It, therefore, seems reasonable to assume that the English vocabulary represents inherited items while the Portuguese element represents a later intrusion. Thus, given that it made practical sense to choose only one superstrate as contributing the original lexicon, English specifically was chosen over Portuguese.²

² However, it should be noted here that it has been suggested that the Portuguese lexical element in Saramaccan is significant enough to classify it as a Portuguese-based creole (see, e.g., Perl 1995: 244), though this is clearly a minority view (Smith 2002: 146).

There is one crucial respect in which the assumption that the Saramaccan lexicon is a continuation of the English lexicon has not been followed to its logical conclusions here, however. Words that are taken to be part of Saramaccan’s early English stratum are never treated as loanwords, even if they are uncontroversially considered loanwords in Standard English. For example, the Saramaccan word famíi ‘relatives’ from English family is not classified as a loanword in the subdatabase, even though it was borrowed into English from Latin familia. Strictly speaking, if the Saramaccan lexicon is considered simply to be the lexicon of one of many varieties of English, then famíi should be considered a loanword. There are at least two reasons for not classifying such words as loans in the present context, however. The first and foremost is that what is clearly of interest in a study of Saramaccan loanwords are the loan patterns particular to the development of the creole itself, not the loan patterns of English in England before the colonial period. There is also the matter of expertise. My own background is not in the study of the history of English but, rather, in synchronic and diachronic aspects of the Saramaccan lexicon (see, e.g., Good 2004, 2009). Furthermore, given that a separate study of English loanwords was also undertaken for this project (Grant), it seemed ill advised for me to also conduct research into the source of early English elements in Saramaccan.

Given this background, in Table 1, I give the three-stage historical model for the Saramaccan lexicon assumed in the coding of the subdatabase. Any element believed to have entered the Saramaccan lexicon from stage 1 onwards was considered a loanword.

Table 1: Development of the Saramaccan lexicon

Stage     Description
Stage 1   Early English-based Atlantic creole lexicon splits from English lexicon
Stage 2   Surinamese creole lexicon splits from Early English-based Atlantic creole lexicon
Stage 3   Saramaccan lexicon splits off from Surinamese creole lexicon

1.3. Notes on transcription

The Saramaccan transcription system used in the subdatabase follows that found in Rountree et al. (2000), with the segmental transcription summarized in Table 2. (However, see Smith & Haabo 2007 for the possibility of a contrast between plain voiced and implosive stops in Saramaccan not represented in this system.) The conventions are largely straightforward, though a few clarifications are in order. With respect to the consonants, a j on its own represents a palatal glide, the digraphs tj and dj represent alveopalatal affricates, the digraph nj represents a palatal nasal, sequences of a nasal followed by a voiced stop (e.g., mb, nd, ndj, ng) represent prenasalized stops (where ng corresponds to IPA [ŋg]), and kp and gb represent labiovelar stops which, in some dialects, can be realized as kw and gw respectively. With respect to vowels, ë and ö represent lower mid vowels (i.e., IPA [ɛ] and [ɔ]).


Table 2: Saramaccan segment inventory

Consonants
p    t    tj   k    kp
b    d    dj   g    gb
mb   nd   ndj  ng
m    n    nj
f    s              h
v    z
w    l    j

Vowels
i    u
e    o
ë    ö
   a

Saramaccan also has contrastive vowel length, transcribed as two vowels of the same quality (as in, for example, jaa [jaː] ‘to sling’), and contrastive vowel nasalization, transcribed as a “coda” nasal consonant (as in, for example, hön [hɔŋ] ‘uproot’). In addition, though the details are complicated, Saramaccan also employs contrastive tone and/or pitch accent (see Good 2004). Surface high-tone vowels are transcribed with an acute accent, and surface low-tone vowels are left unmarked. An effort has been made to give surface transcriptions of tones throughout the subdatabase, which, in some cases, has led to a degree of normalization from other sources. Another instance of normalization involves compounds, which are generally transcribed without a space separating the constituent words of the compound, while existing sources on Saramaccan show variation in this regard (much as in, for example, English orthography). Finally, there is dialectal variation in Saramaccan, which means that, for certain speakers, some of the forms would have pronunciations different from those transcribed.
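The transcription conventions described in this section can be illustrated with a small sketch. The Python function below is not part of the subdatabase or of Rountree et al. (2000); the names `segment` and `MULTIGRAPHS` are assumptions introduced here for illustration. It splits a transcribed form into the consonant and vowel symbols of Table 2 by greedy longest match, so that, for instance, tj is read as one affricate rather than as t followed by j. It does not try to decide whether a nasal letter before a stop is a vowel-nasalization “coda” or part of a prenasalized stop (it always prefers the longer consonant symbol), and it does not merge doubled vowels into long vowels.

```python
import unicodedata

# Multi-letter consonant symbols from Table 2, longest first, so that
# "ndj" is tried before "nd" and "dj". (Hypothetical helper list.)
MULTIGRAPHS = ["ndj", "tj", "dj", "nj", "mb", "nd", "ng", "kp", "gb"]


def segment(form: str) -> list[str]:
    """Split a transcribed form into segment symbols by greedy longest match.

    Diacritics (acute accent for surface high tone, diaeresis on the
    lower mid vowels) stay attached to the vowel letter they modify.
    """
    # Decompose accented letters into base letter + combining marks.
    text = unicodedata.normalize("NFD", form.lower())
    segments = []
    i = 0
    while i < len(text):
        for symbol in MULTIGRAPHS:
            if text.startswith(symbol, i):
                segments.append(symbol)
                i += len(symbol)
                break
        else:
            # Single base letter plus any combining marks that follow it.
            j = i + 1
            while j < len(text) and unicodedata.combining(text[j]):
                j += 1
            # Recompose, e.g. "e" + combining acute -> "é".
            segments.append(unicodedata.normalize("NFC", text[i:j]))
            i = j
    return segments
```

Applied to the clan name Matjáu mentioned later in this chapter, `segment("Matjáu")` yields m, a, tj, á, u, with the high-tone mark kept on its vowel.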

2. Sources of data

Fortunately for a project such as this one, Saramaccan’s status as the most “African” English-based Atlantic creole has led to fairly extensive investigation into African elements in its lexicon, and its classification by some as a mixed lexifier creole has led to detailed investigation of its Portuguese element. In addition, there are a number of good dictionaries and word lists available for the language. Furthermore, easily available Sranan and Ndyuka dictionaries are also useful in determining the status of many Saramaccan lexical items, as is Smith’s (1987b) comparative study of the Suriname creoles. I discuss each of these classes of sources in more detail in turn.

Two works in particular, Daeleman (1972) and Smith (1987a), are the primary sources of African etyma indicated in the subdatabase. Daeleman (1972) is presented by its author as a study of “Kongo” elements in Saramaccan. However, I believe it would be more accurate, at least in some cases, not to classify these elements as being specifically of Kikongo origin, but rather as being of more general Bantu origin. This is because a number of the elements Daeleman identifies as Kikongo match the Saramaccan forms only imperfectly, in a way that suggests that, while they are clearly of Bantu origin, they may not be direct borrowings of the specific forms given by Daeleman. Therefore, for those Saramaccan words which Daeleman treats as having Kikongo etymologies and which appear in the Loanword Typology list, I have given their source not as Kikongo but rather as Loango Bantu, a label I use for Bantu varieties (not necessarily forming a genealogical unit) spoken in and around Loango, a kingdom located around coastal areas of the present-day Republic of the Congo (Brazzaville), whose primary language was a Kikongo dialect and which was the center of the slave trade of the area (see Martin (1972) for a history of the kingdom). Undoubtedly, some of the Loango Bantu elements are of Kikongo origin, but, since it also seems likely that others are not, a more general label seemed appropriate.

Smith (1987a) is a valuable (unpublished) comparative Gbe–Saramaccan word list, which is the source of all elements given as Gbe loanwords in the subdatabase. The word list itself contains words from various Gbe languages, including Fongbe and Ewe. (The Gbe language cluster is currently given as comprising twenty or so languages in the fifteenth edition of the Ethnologue.) Fongbe appears to have had an especially strong influence on Saramaccan, though Gbe elements may have come from other varieties as well, which is why such elements are simply classified as having a Gbe origin here without further specification. (In the subdatabase itself, a Gbe form not associated with a particular language is from Fongbe, and Ewe forms are indicated as such in their gloss.)

With respect to the identification of Portuguese loanwords, I have largely relied on Smith & Cardoso (2004), the most up-to-date existing survey of the Portuguese element in Saramaccan.
In addition to works with a specific focus on loanwords in Saramaccan, a number of good dictionaries and word lists were used both to determine how best to fill out each entry and to detect and identify further loanwords, in particular loanwords from Dutch and Sranan, which have not been the subject of extensive work (largely because they have relatively little to add to the study of the origins of Saramaccan). For basic reference, I relied most heavily on Rountree et al. (2000) and Glock & Rountree (2003) because they were available in electronic form, which facilitated searching.³ I also consulted de Groot (1977), an extensive Dutch–Saramaccan wordlist, and de Groot (1981), a Saramaccan–Dutch wordlist. For Sranan, my primary source was Wilner (2003), again due to its availability in electronic form. At various points, I additionally consulted Shanks (2000), a dictionary of Ndyuka, which contained etymological notes on Ndyuka that, in some cases, were also relevant for Saramaccan. Finally, the comparative study of the Surinamese creoles in Smith (1987b) was valuable for understanding the histories of many of the older elements of the language.

³ Glock & Rountree (2003) was particularly valuable insofar as it contained many items not found in other sources. However, there was a technical problem in the dictionary (which is only available online) wherein the Dutch portion was missing all words beginning with n. Some Saramaccan words whose Dutch equivalent begins with n are, therefore, probably missing from the subdatabase for this reason.

Various other minor sources also contributed to the formation of the subdatabase, including van Panhuys (1904), Taylor (1964) (a review of Donicie & Voorhoeve (1963), containing a range of etymological comments), Price (1970), Aceto (1999), Holm (2000), McWhorter (2000), and Bruyn (2002) (all of which are singled out here since they are referenced in the database itself).

3. Contact situations

As a contact language, Saramaccan owes its very existence to a complex set of historical contact situations, some of which, in all likelihood, stretch back to Africa itself. Of these, two contact situations stand out as having had an especially profound influence on the development of the language’s lexicon: contact with Portuguese or a Portuguese-based creole and contact with Sranan. Less striking but still noteworthy in this regard was contact with Gbe and Bantu languages and Dutch. These situations are discussed in more detail below, followed by a brief discussion of other, less consequential (from a lexical perspective), types of contact coded in the subdatabase.

3.1. Contact with a Portuguese variety

In comparison with its two relatives in Suriname, Sranan and Ndyuka, one of the most striking features of Saramaccan is the extent of its Portuguese-derived vocabulary. For example, Smith (1987a: 119–120) examined the etymology of Saramaccan and Sranan words in a 200-word Swadesh list and found that around thirty-five percent of the Saramaccan entries were of Portuguese origin compared to around four percent of Sranan entries. Particularly noteworthy are the presence of a number of Portuguese-derived function words in Saramaccan, including, for example, akí ‘here’, alá ‘there’, and ku ‘with’, and the large number of Portuguese-derived verbs, including basic concepts like bebé ‘drink’, ke! ‘want’, and kulé ‘run’. Uncontroversially, the presence of this extensive Portuguese element in Saramaccan is due to a distinct aspect of its early history as compared to the histories of Sranan and Ndyuka. Beyond this there is little consensus. Two crucial issues are: (i) What was the nature of the speech variety from which the Portuguese elements entered Saramaccan? and (ii) How did this speech variety get to Suriname? With respect to the first question, what is not clear is whether or not the relevant speech variety was some type of Portuguese or was, instead, a Portuguese-based creole. With respect to the second issue, the central concern is whether or not the Portuguese element can, in some way, be traced to Brazil. There is not sufficient space here to discuss these debates in detail, and I refer the reader to Arends (1999), Ladhams (1999), and Smith (1999) for full discussion.

36. Loanwords in Saramaccan


Relatively less controversial is the belief that the presence of Portuguese or a Portuguese-based creole in Suriname was connected, in some way, to Portuguese-speaking Jews who established plantations in the early Suriname colony (Smith 2002: 146). Especially noteworthy in this regard is the fact that, as discussed in Price (1983: 51–52), the origins of the senior Saramaccan clan, the Matjáu, can be traced to a mass slave escape from a Portuguese Jewish plantation. It is not clear precisely what those slaves spoke, but it would clearly have been influenced, directly or indirectly, by Portuguese. A good candidate is the variety that early sources refer to as Djutongo (i.e., Jew Tongue) and describe as a mixture between “Negro” English and Portuguese (Smith 2002: 140). If this is the case, then the Portuguese borrowings would have actually entered Saramaccan on the plantations before marronage, and the Saramaccans would represent the last community speaking what was once a plantation speech variety spoken alongside the variety that would become Sranan.

Because of the controversial nature of the source of the Portuguese elements in Saramaccan, in the subdatabase I have indicated their source as Suriname Portuguese, as opposed to Portuguese, to indicate that their source may not have been Portuguese, per se, but rather a distinct Portuguese-influenced speech variety spoken at some point in Suriname.

3.2. Contact with Bantu and Gbe languages

As discussed above, of the possible African substrates for Saramaccan, Loango Bantu languages and the Gbe languages have been singled out as being especially important, and languages from both groups have contributed a noteworthy number of lexical items to the language. In the subdatabase, the contact situations leading to the introduction of these loanwords into the Saramaccan lexicon (as conceived here – see §1.2) have been labeled Loango Bantu contact situation and Gbe contact situation.⁴ However, the use of these labels masks the fact that, under conventional views of the development of creoles, we may, in fact, want to distinguish between two types of contact involving substrate languages. The first would be what is typically discussed under the rubric of “creolization” – i.e., the process through which a full-fledged language develops from a contact variety. The second would be contact between speakers of an early variety of Saramaccan and Africans newly arrived to Suriname, many of whom would natively speak Loango Bantu or Gbe languages (see §1.1). This second type of contact would be contact of the “usual” sort insofar as it would not produce a new contact language but, rather, result in borrowing from one language into another. Practically speaking, I am not aware of any generally accepted criteria for distinguishing between words which may have entered Saramaccan as the result of one or the other type of contact. So, no attempt was made to code them as two distinct classes in the subdatabase.

⁴ In the interests of not unnecessarily proliferating contact situations, the two Twi elements in the database have been classified as resulting from the Gbe contact situation. This choice was made since (i) the Twi elements are specific to the Surinamese creoles, and, therefore, are unlikely candidates for being associated with the Early AEC contact situation and (ii) Twi is closer geographically to the Gbe languages than the Loango Bantu languages.

Jeff Good

3.3. Contact with Sranan and Dutch

As the coastal and urban creole of Suriname, Sranan serves as a lingua franca for the country and it is, thus, an unsurprising source of loanwords in Saramaccan. Similarly, Dutch served as the colonial language of Suriname for several centuries and still serves as the official language, and it has thus, equally unsurprisingly, served as a source of loanwords (see de Kleine 2002 for a discussion of the status of Dutch in Suriname). Important in this context is the fact that it has been quite typical for around a century for Saramaccan men to spend significant portions of their working lives in coastal areas (see Price 1975: 65–74). While this has not generally caused them to lose their Saramaccan identity in any significant way, it does mean that they would have had extensive contact with Sranan speakers. Since Sranan is a close relative of Saramaccan, it can be difficult to detect Sranan loanwords in Saramaccan since both languages share many elements due to common inheritance. Therefore, it is likely that some Sranan loans into Saramaccan have gone undetected in the subdatabase, being inappropriately treated as common Surinamese creole elements when, in fact, they represent transfers into Saramaccan after it broke off from Sranan. More problematic, however, in this regard is the difficulty in determining whether a word of ultimate Dutch origin entered Saramaccan directly from Dutch or through the intermediation of Sranan. For example, the Saramaccan word wɛ́ti ‘law, regulation’ is clearly ultimately from Dutch wet ‘law’, but I am unaware of any evidence that would bear on whether it was borrowed into Saramaccan directly from Dutch or via the Sranan word wèt ‘law’. Given the sociolinguistic situation wherein Saramaccans generally have had closer contact with Sranan than Dutch, in such cases, I treated the relevant word as being borrowed from Sranan in the subdatabase, sometimes noting that a Dutch borrowing would also seem possible.
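The coding convention for words of ultimate Dutch origin, as described here and in the discussion that follows, amounts to a simple decision rule. The Python sketch below is only an illustration of that rule: the function name, the `closer_to_dutch` flag, and the small lookup table are assumptions made for this example and are not part of the subdatabase.

```python
from typing import Optional

# Toy lookup from Dutch etymon to Sranan form. The three pairs are cited in
# this section; the dictionary itself is a hypothetical stand-in for a
# Sranan source such as Wilner (2000).
SRANAN_BY_DUTCH_ETYMON = {"wet": "wèt", "horloge": "oloisi", "zee": "se"}


def code_donor(dutch_etymon: str, closer_to_dutch: bool = False) -> str:
    """Return the donor label for a Saramaccan word of ultimate Dutch origin."""
    if closer_to_dutch:
        # Formal evidence points to direct borrowing (e.g. zé from zee).
        return "Dutch"
    sranan_form: Optional[str] = SRANAN_BY_DUTCH_ETYMON.get(dutch_etymon)
    if sranan_form is not None:
        # Default: contact with Sranan has generally been closer.
        return "Sranan"
    # No Sranan counterpart found: treated as entering directly from Dutch.
    return "Dutch"


print(code_donor("wet"))                        # Sranan
print(code_donor("zee", closer_to_dutch=True))  # Dutch
```

The last branch is the weak point of the rule: a gap in the lookup table is indistinguishable from a true gap in the Sranan vocabulary.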
A related problem in this regard is that, if I could not find a Sranan word corresponding to a given Dutch etymon in Wilner (2000), my primary source on Sranan vocabulary, I treated the Saramaccan element as entering directly from Dutch, even though it seems likely that in some cases the absence of the relevant etymon in Wilner (2000) represented an accidental omission rather than a true gap in the Sranan vocabulary. Because of these issues, the subdatabase in its present form cannot be considered to be a reliable source as to how Dutch loans into Saramaccan may pattern differently from Sranan loans. For this to be done, it would be necessary to recode the database to clearly distinguish between unambiguous Sranan loans, unambiguous Dutch loans, and loans which could plausibly have entered Saramaccan from either language.

Fortunately, in many cases, there are good reasons for believing a Saramaccan word of ultimate Dutch provenance entered the language via Sranan or directly from Dutch. For example, the Saramaccan word olóísi ‘watch, clock’ ultimately appears traceable to Dutch horloge ‘watch’, but shows a much closer formal and semantic correspondence to Sranan oloisi ‘watch, clock’, strongly indicating Sranan was, in fact, the source for this word. Similarly, the Saramaccan word zé ‘ocean’ seems likely to have been borrowed directly from Dutch zee ‘sea’ since the relevant Sranan form is se ‘sea, ocean’, giving the Saramaccan form a closer formal match to the Dutch one.

3.4. Minor contact situations

In addition to the contact situations described above, there are also a number of other contact situations coded in the database, classified as “minor” here since they are associated with relatively few loanwords. One such contact situation was that between Saramaccan and speakers of various Amerindian languages (see Carlin & Boven (2002) for overview discussion of the historical and contemporary Amerindian populations of Suriname). There is evidence of fairly intimate contact among the Saramaccans and Amerindian groups from early stages of the development of Saramaccan society, including the taking of Indian women into the Saramaccan community (Price 1983: 80), as well as extensive trade relations (Carlin & Boven 2002: 26). The subdatabase probably underestimates the social impact of this contact situation since there are a number of words of Amerindian provenance found in Saramaccan (see, e.g., Taylor 1964: 437) with meanings that are used for local species of animals not present in the Loanword Typology list. For example, there is a word for a specific deer species in Saramaccan, kusaí, that appears to be a loan from an Arawakan language (Taylor 1964: 437), but the list only contains the general meaning ‘deer’ whose Saramaccan equivalent is not of Amerindian provenance. The semantic specificity of many of the Amerindian loans, of course, makes them inherently unlikely to appear in a word list of general meanings like the one employed for this project, even though such terms may be quite salient to a specific culture. In addition, it is worth noting here that I have little familiarity with the relevant Amerindian languages, and the Amerindian element in the Saramaccan lexicon has not to my knowledge been discussed in detail in any published work, making it more difficult to find Amerindian loanwords than those of other languages. 
My primary source for Amerindian loans was Courtz (1997), a Carib-Dutch dictionary (though one additional Carib loan was found in Courtz 2008). (A more cursory inspection of the Arawak wordlist in Pet 1987 did not reveal the source of loans for any words currently specified as having an unknown origin in the subdatabase.) I did not systematically check whether any of the Carib loans may have represented words found generally among Amerindian languages of the area, and my reliance on Courtz (1997) reflected ease of access rather than any previous suggestion that Carib may have been a particularly prominent source of loans in Saramaccan. Therefore, the extent of the vocabulary marked as being specifically of Carib origin in the database should not be construed to mean that contact with Carib was more or less extensive than contact with other Amerindian languages. Rather, it is an artifact of the process of data collection and coding. Properly assessing the relative prominence of the contributions of individual Amerindian languages to the Saramaccan lexicon will have to await further research.

Another minor contact situation found in the subdatabase is labeled Early AEC contact situation, where AEC refers to Atlantic English-based creoles. This refers to a putative stage in the development of the Atlantic English-based creoles before they branched off from a common contact variety (Smith 1987b: 103–112; McWhorter 2000: 41–98). There is actually some controversy regarding whether or not such a variety ever existed. Nevertheless, some of the evidence for it is a set of African lexical items widely distributed among Atlantic English-based creoles whose presence is difficult to attribute to chance – and some of these lexical items are found in the subdatabase. For the purposes of this project, this contact situation is not of particular importance, yielding only a handful of loanwords, but due to its importance in the literature on the origins of the Atlantic English-based creoles, it seemed worthwhile to code these loanwords as resulting from a contact situation distinct from the others.

The other minor contact situations coded in the subdatabase are contact with either French or French-based creoles (possibly resulting from the fact that many Saramaccan men have found work in French Guiana over the last century (Price 1976: 65–66)) and contact with modern products, a category used for words whose specific source language was difficult to locate, for whatever reason, but which clearly represent recent borrowings (e.g., gási ‘stove that uses bottled gas’).

4. Numbers and kinds of loanwords

The subdatabase contains around 1100 distinct words associated with about 1300 meanings, with approximately 300 meanings not associated with a word, most typically because a meaning was deemed not to have a specific counterpart in Saramaccan. Somewhat more than 400 words were treated as borrowed. Around 300 words were treated as belonging to an Early English stratum of vocabulary – that is, they are taken to represent the continuation of the English lexicon into Saramaccan (see §1.2). Around 250 words were analyzable (e.g., compounds, phrasal idioms) and not inherited from English; they thus represent innovations to the Saramaccan lexicon after it split off from the English lexicon (though not necessarily after the point at which we would consider Saramaccan to have become a language distinct from the other Surinamese or English-based Atlantic creoles). The remaining (100 or so) words are of unknown origin. If they are loanwords, they are most likely to represent unidentified African or Amerindian elements, since European elements, on the whole, are easier to identify. However, some may represent true Saramaccan monomorphemic innovations.

Of the approximately 400 loanwords coded in the subdatabase, around half are from Suriname Portuguese and around a third are from Sranan. The bulk of the remaining words are from Gbe languages, Loango Bantu, Dutch, and Carib, with very minor contributions from English, French, Igbo, Mende, Twi, and Wolof.
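As a rough arithmetic check, the approximate counts given in this section can be turned into proportions. The sketch below uses only the rounded figures quoted above (about 400 loanwords, 300 Early English words, 250 analyzable words, and 100 words of unknown origin), so its results are indicative only; the variable names are illustrative.

```python
# Rounded word counts quoted in this section (approximations, not exact data).
counts = {
    "loanwords": 400,
    "early_english": 300,
    "analyzable": 250,
    "unknown": 100,
}

# The rounded categories add up to 1050, close to the "around 1100" words.
total = sum(counts.values())

# Share of each category in percent, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label:>14}: {pct:5.1f}%")
```

The loanword share of roughly 38 percent obtained this way is in line with the total of 39.1 percent reported for all words in Table 3.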

5. Loanwords and semantic word class

Table 3 summarizes the distribution of loanwords across donor language and semantic word class.

Table 3: Loanwords in Saramaccan by semantic word class (percentages)

Donor language          Nouns   Verbs   Function words   Adjectives   Adverbs   All words
Suriname Portuguese      12.7    30.9             11.8         19.6      33.3        17.6
Sranan                   12.6     7.1             10.4         11.1      16.7        11.1
Dutch                     4.8     1.2              1.2          2.1         -         3.4
Loango                    3.0     2.0              0.4          5.3         -         2.7
Gbe                       1.7     2.3              2.3          2.1         -         1.9
Carib                     2.0     0.4                -            -         -         1.3
Mende                     0.3       -                -            -         -         0.2
Twi                       0.3       -                -            -         -         0.2
French                    0.3       -                -            -         -         0.2
Igbo                        -       -              1.2            -         -         0.1
Wolof                       -     0.4                -            -         -         0.1
English                   0.2       -                -            -         -         0.1
Unidentified              0.5       -                -            -         -         0.3
Total loanwords          38.3    44.2             27.3         40.3      50.0        39.1
Non-loanwords            61.7    55.8             72.7         59.7      50.0        60.9
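To compare donor languages within the borrowed vocabulary only, the percentages in Table 3 can be divided by the total loanword percentage of each word class. The sketch below does this for nouns and verbs, using the Suriname Portuguese, Sranan, and total figures from the table; the dictionary layout and function name are illustrative assumptions.

```python
# Percentages of all words in each class, as given in Table 3.
table3 = {
    "nouns": {"Suriname Portuguese": 12.7, "Sranan": 12.6, "total": 38.3},
    "verbs": {"Suriname Portuguese": 30.9, "Sranan": 7.1, "total": 44.2},
}


def share_of_loans(word_class: str, donor: str) -> float:
    """Donor's share (in percent) of the loanwords in one word class."""
    row = table3[word_class]
    return round(100 * row[donor] / row["total"], 1)


for word_class in table3:
    for donor in ("Suriname Portuguese", "Sranan"):
        print(word_class, donor, share_of_loans(word_class, donor))
```

On these figures, Suriname Portuguese supplies about a third of the borrowed nouns but roughly 70 percent of the borrowed verbs.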

The figures in Table 3 underscore the remarkable fact about Saramaccan, already known from earlier work, that its lexicon has been exceptionally strongly influenced by Suriname Portuguese. Not only is a large portion of the language’s vocabulary of Portuguese origin, but this element is also significant in all word classes, and even more pronounced for verbal meanings than nominal ones. In fact, the true influence of Suriname Portuguese is slightly underestimated here since it is known that, in a few cases, Portuguese elements were actively used in Saramaccan in the late eighteenth century that are no longer in use today. For example, Schumann’s (1778) wordlist gives the word flamma ‘flame’ as a Saramaccan word, representing a borrowing of the Portuguese word flama. However, this word has fallen out of use in contemporary Saramaccan and, therefore, does not figure in the subdatabase.

The only language with a comparable lexical impact to that of Suriname Portuguese in Saramaccan is Sranan. However, we must recall that Saramaccan has been in contact with Sranan continuously since it came to exist as a separate language, while it has had no such continuous contact with Portuguese or a Portuguese-based creole. Whatever the contact event was between Saramaccan and Suriname Portuguese, it was certainly intense and remarkable in nature. The figures from the subdatabase clearly show the logic behind Saramaccan’s classification as a mixed lexifier creole by some sources.

The other loanword figures in Table 3 seem to be more or less in accord with the relevant contact situations. There is a fairly wide distribution of Sranan elements throughout the lexicon, which is not surprising given the continuous contact between Sranan and Saramaccan over the centuries and given the fact that words for new concepts often enter Saramaccan via Sranan. While not particularly numerous, the Loango Bantu and Gbe contributions are spread over various word classes, which likely reflects the fact that they would have been brought into Saramaccan by native speakers of the relevant languages. As discussed in §3.3, there are problems in clearly determining whether words of ultimate Dutch origin entered Saramaccan directly via Dutch or through the intermediation of Sranan. It therefore seems inadvisable to come to strong conclusions based on the distribution of Dutch elements in Table 3. While Table 3 indicates the Carib contribution included at least one verb, in fact, the relevant verb maaní ‘screen, sieve’ appears to have been borrowed as a noun and its use has been extended to a verbal sense in Saramaccan. Thus, the Amerindian element appears to be confined largely to nouns, which is consistent with a scenario wherein the primary pathway through which Amerindian elements entered Saramaccan was as names for flora, fauna, and other objects present in the local environment that were unfamiliar to newly arrived Africans.

6. Loanwords and semantic word field

Table 4 summarizes the distribution of loanwords across source language and semantic field. Given Saramaccan’s contact situations and the patterns already seen in §4.1, no particularly surprising patterns emerge from the examination of loanwords across semantic field. Suriname Portuguese words are spread over a wide range of categories, and, where they are poorly represented, this is not particularly surprising given the timing and nature of the contact. For example, the lack of any Suriname Portuguese elements in the field of religion and belief reflects, among other things, the fact that Christian missionaries came to Saramaccan communities after Suriname Portuguese contact and were not themselves Portuguese-speaking. Similarly, those cases where the Sranan component is greater than the Suriname Portuguese component in a given field can be largely explained by the fact that Sranan is the most typical donor language for “modern” concepts. This is obviously the case for the Modern world field but is also the case for fields like The house where many of the concepts, if not specifically modern, represent relatively recent imports like ‘window’, ‘bed’, and ‘pillow’. The patterning of loanwords across semantic field for the other languages also appears more or less as expected given the timing and nature of the relevant contact situations: the Sranan element is widely distributed, the African element is found in more traditional spheres, the Dutch element is found in more modern spheres, and the Amerindian element is found in spheres relating to the South American environment.

All semantic fields show a relatively high degree of borrowing, with even those with fewer loanwords showing around a quarter of the vocabulary as borrowed. Not surprisingly, the field Modern world shows the highest proportion of loanwords, reflecting the fact that Saramaccans have generally been recipients of modern culture rather than producers of it. All told, the figures suggest that there do not seem to be any noteworthy cultural prohibitions against borrowing in any semantic domain.

Table 4: Loanwords in Saramaccan by semantic word field (percentages; per-field totals)

No.   Semantic field                    Total loanwords   Non-loanwords
1     The physical world                           40.1            59.9
2     Kinship                                      28.3            71.7
3     Animals                                      32.6            67.4
4     The body                                     35.2            64.8
5     Food and drink                               44.1            55.9
6     Clothing and grooming                        54.7            45.3
7     The house                                    31.2            68.8
8     Agriculture and vegetation                   43.0            57.0
9     Basic actions and technology                 38.3            61.7
10    Motion                                       36.2            63.8
11    Possession                                   49.9            50.1
12    Spatial relations                            25.1            74.9
13    Quantity                                     34.0            66.0
14    Time                                         32.4            67.6
15    Sense perception                             34.5            65.5
16    Emotions and values                          37.1            62.9
17    Cognition                                    44.2            55.8
18    Speech and language                          24.7            75.3
19    Social and political relations               48.3            51.7
20    Warfare and hunting                          29.5            70.5
21    Law                                          44.4            55.6
22    Religion and belief                          25.6            74.4
23    Modern world                                 76.6            23.4
24    Miscellaneous function words                 43.3            56.7
      all words                                    39.1            60.9

7. Integration of loanwords

The main processes through which loanwords have been integrated into Saramaccan have centered around adapting loanwords to the language’s phonotactics, most prominently by inserting epenthetic vowels within consonant clusters found in words from a donor language, creating CV syllables. In addition, as would be expected, segments in loanwords not found in Saramaccan are replaced with phonetically similar segments that are found in the language. The forms in Table 5 give illustrative examples, all involving apparent loans from Sranan (which are ultimately of Dutch origin).

Table 5: Examples of phonological adaptation in Sranan words borrowed into Saramaccan

Saramaccan                     Sranan
sikópu     ‘narrow shovel’     skopu    ‘spade’
wɔ́lúku     ‘cloud’             wolku    ‘cloud’
suwálufu   ‘match’             swarfu   ‘match’
wɛ́ti       ‘law’               wèt      ‘law’
baáu       ‘blue’              blaw     ‘blue’
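The effect of these adaptations on syllable structure can be checked mechanically. The sketch below is an illustration only: the helper names are hypothetical, glides such as w are counted as consonants, and orthographic letters are taken as a rough proxy for segments. It strips tone and accent marks and then looks for consonant clusters and word-final consonants, using forms cited in this section.

```python
import re
import unicodedata
from typing import List

VOWELS = "aeiouɛɔ"


def skeleton(word: str) -> str:
    """Map a word to a C/V string, ignoring tone and accent diacritics."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    letters = [ch for ch in decomposed if not unicodedata.combining(ch)]
    return "".join("V" if ch in VOWELS else "C" for ch in letters)


def violations(word: str) -> List[str]:
    """List consonant clusters and a single final consonant in the skeleton."""
    cv = skeleton(word)
    found = re.findall(r"C{2,}", cv)
    if cv.endswith("C") and not cv.endswith("CC"):
        found.append("final C")
    return found


# (Sranan source, Saramaccan outcome) pairs cited in this section.
pairs = [("skopu", "sikópu"), ("swarfu", "suwálufu"),
         ("blaw", "baáu"), ("strati", "sitááti")]

for sranan, saramaccan in pairs:
    print(sranan, violations(sranan), "->", saramaccan, violations(saramaccan))
```

Under these simplifying assumptions, each Sranan form contains at least one cluster or final consonant, while the corresponding Saramaccan form contains none.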

The factors governing the choice of epenthetic vowel are somewhat complex and apparently not always completely regular. The qualities of the neighboring vowels and consonants can play a role, as can the location of the epenthetic vowel within the word. Though not specifically focused on loanwords, Smith’s (1987b: 338–399) discussion of the development of liquid clusters in the Surinamese creoles addresses a number of issues relevant to understanding the nature of the vowel epenthesis processes affecting Saramaccan words (see also Smith 2003: 47).

It is possible in some cases to identify different historical layers of vocabulary – and, thereby, corroborate an element’s status as a loanword – on the basis of what phonological adjustments words have been subject to. For example, initial sibilant-stop clusters in the vocabulary inherited from English tend to lose their initial s, as in, for example, píki ‘answer’, from English speak, while the s is retained and followed by an epenthetic vowel in such clusters in words borrowed from Sranan as in, for example, sitááti ‘street’ from Sranan strati.

The last word in Table 5 illustrates the effects of a sound change that began to affect Saramaccan sometime before the end of the nineteenth century (Smith 2003: 32–3) wherein intervocalic liquids were lost in many words. Thus, the Saramaccan word baáu ‘blue’ can be understood to have gone through an intermediate stage along the lines of baláu with the loss of the intervocalic l producing the long a seen today. This sound change often obscures the relationship between a loanword in Saramaccan and its original source. Not all words undergo this sound change, and the conditioning factors are complex, governed largely by the quality of the two vowels adjacent to the historical liquid (see Smith 1987b: 323–325). An open question is whether or not the application or non-application of this sound change can be used to determine a word’s probable entry date into the language. Liquid deletion does not consistently apply in even the relatively old Suriname Portuguese stratum of the lexicon (see, for example, the words listed in Smith (1987b: 321–322)), making it difficult, without further study, to determine the significance of the presence or absence of intervocalic liquids in more recent borrowings. Furthermore, because of the variability of liquid deletion even in older words, it is not clear whether or not it may be part of a loanword integration strategy in some cases. One finds, for example, Sranan borrowings in Saramaccan both with and without a liquid where Sranan has one, such as olóísi ‘clock, watch’ from the Sranan form oloisi with the same meaning against a word like baáu in Table 5. The extent to which variable application of liquid deletion can be used to shed light on the history of the Saramaccan lexicon will have to await future research.

For discussion of the historical interpretation of the presence of tone in Saramaccan with respect to different lexical strata, see Good (2009). Roughly speaking, high tones in Saramaccan loanwords of European-language or Sranan origin correspond with accent in the donor languages, representing a fairly minimal phonotactic adaptation. With respect to the integration of tonal African words into Saramaccan, there is evidence for a special African-derived stratum of the language’s vocabulary with different prosodic behavior from the rest of the lexicon. This suggests that, in at least some cases, such words were not closely integrated into the language’s existing prosodic system but, rather, their tones were left intact.

Two final notes about loanword adaptation in Saramaccan should be made regarding the sequence kw and the status of nasals. As discussed in §1.3, kw can alternate with kp in some dialects. This variation has resulted in kw sequences in words of European origin appearing as kp in some of the forms in the database. Thus, for example, the word sakpí ‘shake out’ in the database (which is also found as sakwí) is derived from Suriname Portuguese sacudir ‘shake’, making this word of European origin look superficially “African”. With respect to nasals, in some cases, etymological plain nasals appear as prenasalized stops in Saramaccan, in particular before the vowels i and e. This is true both of English-derived elements, such as ndéti ‘night’ from night, and borrowings, such as gumbitá ‘vomit’ from Portuguese vomitar, again giving some European words an “African” appearance.
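The liquid-deletion change discussed in this section can be sketched as a single rewrite rule. This is a toy illustration and not a model of the actual conditioning: the function name is hypothetical, the rule deletes every intervocalic l or r, and, as noted above, the real change is variable and depends largely on the adjacent vowels.

```python
import re
import unicodedata

# A vowel letter optionally followed by combining tone or accent marks.
VOWEL = r"[aeiouɛɔ][\u0300-\u036f]*"


def delete_intervocalic_liquids(word: str) -> str:
    """Delete l or r between two vowels (toy, exceptionless version)."""
    decomposed = unicodedata.normalize("NFD", word)
    # Keep the preceding vowel (group 1), drop the liquid, and require a
    # following vowel without consuming it.
    changed = re.sub(rf"({VOWEL})[lr](?=[aeiouɛɔ])", r"\1", decomposed)
    return unicodedata.normalize("NFC", changed)


# The posited intermediate form baláu yields the attested baáu.
print(delete_intervocalic_liquids("baláu"))
# The liquid in the Sranan source form blaw is not intervocalic.
print(delete_intervocalic_liquids("blaw"))
```

Applied blindly, the same rule would also remove the l of olóísi, which in fact retains its liquid; this is one way of seeing why the change cannot be treated as exceptionless.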

8. Grammatical borrowing

While there has not been extensive work on the topic of grammatical borrowing per se in Saramaccan, there has been work done under the rubric of substrate influence, which, depending on one’s theoretical viewpoint, could be construed as a kind of grammatical borrowing. (See McWhorter (2000: 119–123) for a summary discussion of substrate features identified in the Atlantic English-based creoles.) While assessing the various controversies regarding creole genesis to determine what would or would not constitute grammatical borrowing during that process would clearly take us far afield from the issues of primary concern here, there has been work that has claimed that Saramaccan proper has been influenced by grammatical borrowing, and such research is clearly of interest here. The most extensive work on this topic is Kramer’s (2002) study of “substrate transfer” in Saramaccan from Fon Gbe. A striking fact about Saramaccan syntax is that the earliest recorded varieties of the language are grammatically less similar to Fon Gbe than later varieties (Kramer 2002: 622). This is almost certainly connected to the low rate of nativization of Suriname’s black population discussed in §1.1. Saramaccan’s early population undoubtedly included many native speakers of Gbe languages who would have acquired Saramaccan as a second language and who must have also transferred features of their native languages into the creole. Good (2009) adopts a view similar to Kramer’s with respect to the presence of a special tonal stratum in Saramaccan’s lexicon, thus arguing that some of the tonal features of Saramaccan are also the result of grammatical borrowing. Other work along these lines includes Kramer (2006, 2007).

9. Conclusion

Despite its relatively short history, the Saramaccan lexicon has been greatly affected by borrowing. Not only does the lexicon show a surprising Portuguese element, but it also shows extensive borrowings from a related creole, Sranan, a good number of borrowings from African languages and a colonial language, Dutch, as well as having a salient Amerindian element. In many cases, the borrowings are the result of the same basic sociohistorical factors responsible for the creation of the creole itself: (i) contact of Africans with Europeans, (ii) contact among diverse groups of Africans, and (iii) contact among communities from the western Atlantic and eastern Atlantic areas. In addition, Saramaccan’s status as a maroon creole resulted in a kind of contact not found among creoles generally: contact between two different creole-speaking communities.

References

Aceto, Michael. 1999. The Gold Coast lexical contribution to the Atlantic English Creoles. In Huber, Magnus & Parkvall, Mikael (eds.), Spreading the word: The issue of diffusion among the Atlantic Creoles, 69–80. London: University of Westminster.
Arends, Jacques. 1995. Demographic factors in the formation of Sranan. In Arends, J. (ed.), The early stages of creolization, 233–277. Amsterdam: Benjamins.
Arends, Jacques. 1999. The origin of the Portuguese element in the Surinam Creoles. In Huber, M. & Parkvall, M. (eds.), Spreading the word: The issue of diffusion among the Atlantic Creoles, 195–208. London: University of Westminster.
Arends, Jacques. 2002a. The history of the Surinamese creoles 1: A sociohistorical survey. In Carlin, E. B. & Arends, J. (eds.), Atlas of the languages of Suriname, 115–130. Leiden: KITLV Press.
Arends, Jacques. 2002b. Young languages, old texts: Early documents in the Surinamese creoles. In Carlin, E. B. & Arends, J. (eds.), Atlas of the languages of Suriname, 183–205. Leiden: KITLV Press.
Bakker, Peter & Smith, Norval & Veenstra, Tonjes. 1995. Saramaccan. In Arends, J. & Muysken, P. & Smith, N. (eds.), Pidgins and creoles: An introduction, 165–178. Amsterdam: Benjamins.
Bruyn, Adrienne. 2002. The structure of the Surinamese creoles. In Carlin, Eithne B. & Arends, Jacques (eds.), Atlas of the languages of Suriname, 153–182. Leiden: KITLV Press.
Carlin, Eithne B. & Boven, Karin M. 2002. The native population. In Carlin, E. B. & Arends, J. (eds.), Atlas of the languages of Suriname, 11–45. Leiden: KITLV Press.
Courtz, Henk (comp.). 1997. Karaïbs-Nederlands Woordenboek [Carib-Dutch dictionary]. Paramaribo: Instituut voor Taalwetenschap.
Courtz, Henk. 2008. A Carib grammar and dictionary. Toronto: Magoria.
Daeleman, Jan. 1972. Kongo elements in Saramacca Tongo. Journal of African Languages 11:1–44.
de Groot, Adrianus H. P. 1977. Woordregister Nederlands-Saramakaans [Word index Dutch-Saramaccan]. Paramaribo.
de Groot, Adrianus H. P. 1981. Woordregister Saramakaans-Nederlands [Word index Saramaccan-Dutch]. Paramaribo.
de Kleine, Christa. 2002. Surinamese Dutch. In Carlin, E. B. & Arends, J. (eds.), Atlas of the languages of Suriname, 209–230. Leiden: KITLV Press.
Donicie, Antoon & Voorhoeve, Jan. 1963. De Saramakaanse woordenschat [The Saramaccan vocabulary]. Amsterdam: Bureau voor Taalonderzoek in Suriname.
Glock, Naomi & Rountree, S. Catherine. 2003. Nederlands-Saramaccaans-English woordenboek [Dutch-Saramaccan-English dictionary].
Good, Jeff. 2004. Tone and accent in Saramaccan: Charting a deep split in the phonology of a language. Lingua 114:575–619.
Good, Jeff. 2009. A twice-mixed creole? Tracing the history of a prosodic split in the Saramaccan lexicon. Studies in Language 33:459–498.
Holm, John A. 2000. An introduction to pidgins and creoles. Cambridge: Cambridge University Press.
Kramer, Marvin. 2002. Substrate transfer in Saramaccan Creole. Ph.D. dissertation. Berkeley: University of California.
Kramer, Marvin. 2006. The late transfer of serial verb constructions as stylistic variants in Saramaccan creole. In Deumert, Ana & Durrleman, Stephanie (eds.), Structure and variation in language contact, 337–372. Amsterdam: Benjamins.
Kramer, Marvin. 2007. Tone on quantifiers in Saramaccan as a transferred feature from Kikongo. In Huber, Magnus & Velupillai, Viveka (eds.), Synchronic and diachronic perspectives on contact languages, 43–66. Amsterdam: Benjamins.
Ladhams, John. 1999. The Pernambuco connection? An examination of the nature and origin of the Portuguese elements in the Surinam Creoles. In Huber, Magnus & Parkvall, Mikael (eds.), Spreading the word: The issue of diffusion among the Atlantic Creoles, 209–240. London: University of Westminster.
Martin, Phyllis. 1972. The external trade of the Loango coast 1576–1860: The effects of changing commercial relations on the Vili kingdom of Loango. Oxford: Clarendon.
McWhorter, John H. 1998. Identifying the creole prototype: Vindicating a typological class. Language 74:788–818.
McWhorter, John H. 2000. The missing Spanish creoles. Berkeley: University of California.
Migge, Bettina M. 1998. Substrate influence in the formation of the Surinamese Plantation Creole: A consideration of sociohistorical and linguistic data from Ndyuka and Gbe. Ph.D. dissertation. Columbus, OH: Ohio State University.
Nurse, Derek & Tucker, Irene. n.d. A Survey Report for the Bantu Languages. (SIL Electronic Survey Reports. SILESR 2002–016).
Perl, Matthias. 1995. Part 2: Saramaccan. In Arends, J. & Perl, M. (eds.), Early Suriname creole texts, 243–250. Frankfurt: Ibero-American.
Pet, Willem Jan Agricola. 1987. Lokono Dian: The Arawak language of Suriname: A sketch of its grammatical structure and lexicon. Ph.D. dissertation. Ithaca, NY: Cornell University.
Postma, Johannes. 1990. The Dutch in the Atlantic slave trade, 1600–1815. Cambridge: Cambridge University.
Price, Richard. 1970. Saramaka Woodcarving: The Development of an Afroamerican Art. Man (New Series) 5(3):363–378.
Price, Richard. 1975. Saramaka social structure: Analysis of a maroon society in Surinam. Río Piedras, Puerto Rico: Institute of Caribbean Studies, University of Puerto Rico.
Price, Richard. 1976. The Guiana Maroons. Baltimore: Johns Hopkins.
Price, Richard. 1983. The first time: The historical vision of an Afro-American people. Baltimore: Johns Hopkins.
Price, Richard. 2008. Travels with Tooy: History, memory, and the African American imagination. Chicago: University of Chicago.
Rountree, S. Catherine & Asodanoe, Jajo & Glock, Naomi. 2000. Saramaccan word list (with idioms). Paramaribo: Instituut voor Taalwetenschap (SIL).

Schumann, C. L. 1778. Sarmaccanisch Deutsches Wörter-Buch [Saramaccan-German dictionary]. In Schuchardt, Hugo (ed.), Die Sprache der Saramakkaneger in Surinam (1914), 44–116. Amsterdam: Johannes Müller. Shanks, Louis (ed.). 2000. A buku fu Okanisi Anga Ingiisi wowtu: Aukan-English dictionary and English-Aukan index. Paramaribo: Instituut voor Taalwetenschap (SIL). Smith, Norval. 1987a. Comparative word list of Gbe and Saramaccan. Unpublished manuscript. University of Amsterdam.

36. Loanwords in Saramaccan

939

Smith, Norval. 1987b. The genesis of the creole languages of Surinam. Ph.D. dissertation. Amsterdam: University of Amsterdam. Smith, Norval. 1999. Pernambuco to Surinam 1654–1665? The Jewish slave controversy. In Huber, Magnus & Parkvall, Mikael (eds.), Spreading the word: The issue of diffusion among the Atlantic Creoles, 251–298. London: University of Westminster. Smith, Norval. 2002. The history of the Surinamese creoles 2: Origin and differentiation. In Carlin, E. B. & Arends, J. (eds.), Atlas of the languages of Suriname, 131–151. Leiden: KITLV Press. Smith, Norval. 2003. Evidence for recursive syllable structure in Aluku and Sranan. In Adone, Dany (ed.), Recent development in creole studies, 31–52. Tübingen: Max Niemeyer. Smith, Norval & Cardoso, Hugo. 2004. A new look at the Portuguese element in Saramaccan. Journal of Portuguese Linguistics 3:115–147. Smith, Norval & Haabo, Vinije. 2007. The Saramaccan implosives: Tools for linguistic archaeology. Journal of Pidgin and Creole Linguistics 22:101–122. Taylor, Douglas Rae. 1964. Review of De Saramakaanse woordenschat by Antoon Donicie and Jan Voorhoeve. International Journal of American Linguistics 30:434–439. van Panhuys, L. C. 1904. Indian words in use in the Dutch language and in use at DutchGuiana. Bijdragen tot de Taal-, Land- en Volkenkunde van Nederlandsch-Indië 56:611– 614. Wilner, John (ed.). 2003. Wortubuku ini Sranan Tongo [Sranan Tongo-English dictionary]. 4th edn. Paramaribo: Instituut voor Taalwetenschap (SIL). .

940

Jeff Good

Loanword Appendix

Carib
píngo ‘boar’
kujaké ‘toucan’
sipaí ‘stingray’
maisi ‘freshwater eel’
máku ‘mosquito’
saasáa ‘prawns, shrimp’
tookóo ‘quail’
walilí ‘anteater’
kaluwá ‘lizard’
akalé ‘crocodile, alligator’
maaní ‘to sieve, to strain’
maáun ‘cotton, thread’
piiwá ‘arrow’
joóka ‘ghost’

Carib (earlier donor language)
káima ‘crocodile, alligator’

Dutch
zé ‘sea, ocean, lake’
ë́ísi ‘ice’
sikáfu ‘sheep’
tánda ‘tooth’
baláki ‘to vomit’
baási ‘blister’
gási ‘stove’
déígi ‘blanket’
hálíki ‘rake’
simítima ‘blacksmith’
mánda ‘basket’
bë́të ‘chisel’
dóki ‘to dive’
jáka ‘to pursue’
zéi ‘sail’
njö́nku ‘young’
dáka ‘day (2)’
jáa ‘year’
kö́tö ‘cold’
më́sítë ‘teacher’
sodáti ‘soldier’
tö́lu ‘tower’
páíti ‘priest’
ládio ‘radio’
televísi ‘television’
telefö́n ‘telephone’
bölö́m ‘motorcycle’
talán ‘train’
sö́sútu ‘nurse’
beéi (2) ‘spectacles/glasses’
minísíti ‘minister’
leibewéísi ‘driver’s license’
bánku ‘bank’
kalán ‘tap/faucet’
wéésée ‘toilet’
beénki ‘tin/can’
sigalë́ti ‘cigarette’
dóu (2) ‘through’

English
fékísi ‘ointment’

French or French-based creole
pomtë́ ‘potato’
lakwá ‘cross’
nasíön ‘people’

Gbe
zonká ‘embers’
agama ‘chameleon’
bése ‘frog’
logoso ‘turtle’
gogó ‘buttocks, end (1)’
tatí ‘pestle’
këkë́ ‘spindle’
më́ ‘to thresh’
azö ‘nettle’
agó ‘knot’
fë́n ‘to tear’
zín ‘to press, to squeeze’
ba ‘to draw water’
agbágba ‘to carry on head’
logoo ‘round’
bë ‘red’
lëgëdë ‘to lie (2)’
andí ‘what?’
ambë́ ‘who?’
lö́ ‘clan’
aviti ‘trap’

Igbo
un ‘you (plural)’

Loango Bantu
pötöpötö ‘mud’
pululú ‘foam’
mutjáma ‘rainbow’
bundji ‘fog’
taatá ‘father, father’s brother’
tatá ‘old man’
böngö ‘descendants’
pukusu ‘bat’
zaun ‘elephant’
ahalala ‘centipede’
bubú ‘jaguar’
kóla ‘snail’
tutú (2) ‘horn, throat, horn, trumpet’
töönsö́n ‘brain’
tekútekú ‘to hiccough’
djukú ‘to vomit’
lë́kíti ‘weak’
malë́ngë ‘lazy’
pë́në́pënë ‘naked’
muungá ‘bracelet’
bandja ‘wall, beside, side’
ndekú ‘fish poison’
pindi ‘statue, idol’
tónto ‘to limp’
tolá ‘to damage’
fulufulu ‘soft’
laú ‘mad’
tjaká ‘rattle’
bakisi ‘fish trap’
sibá ‘to curse’

Mende
fuköfukö ‘lung’
njámísi ‘yam’

Sranan
wö́lúku ‘cloud’
suwálufu ‘match’
kijóo ‘young man’
sikápu ‘sheep’
gánsi ‘goose’
sitéífi ‘strong’
suwáki ‘sick/ill’
kaábu ‘to scratch’
deési ‘medicine’
báka (2) ‘to roast, to fry, to bake’
sikö́tííki ‘cup’
fö́lúku ‘fork’
boón ‘dough, flour’
gulúntu ‘vegetables’
nóto ‘nut’
më́íki ‘milk’
kási ‘cheese’
wín ‘wine’
nái ‘to sew’
féífi (1) ‘to dye, paint, to paint’
jápo ‘(woman’s) dress’
hë́mpi ‘shirt’
kóto ‘skirt’
buúku ‘trousers’
köúsu ‘sock, stocking’
bánti ‘belt’
lelibúba ‘belt’
kö́nö́pu ‘button’
djamátísítónu ‘jewel’
kámba ‘room’
söö́tö ‘lock’
fë́nsë ‘window’
kúnsu ‘pillow’
báki ‘trough’
báíki ‘beam’
mésema ‘mason’
sikópu (2) ‘spade, shovel’
apeesína ‘citrus fruit’
pampú ‘pumpkin, squash’
kë́ti ‘chain’
féki ‘to wipe’
kándi ‘to pour, to lie down’
láki ‘glue’
goútu ‘gold’
kö́pö ‘copper’
lóto ‘lead’
póbíki ‘statue, idol’
sikópu (1) ‘to kick’
sáka ‘to go down’
lë́i ‘to drive’
boóki ‘bridge’
pená (2) ‘poor’
lantimö́ni ‘tax’
wojowójo ‘market’
wë́nkë ‘shop/store’
búnkópu ‘cheap’
wégi ‘to weigh’
paáta ‘flat’
fö́kánti ‘square’
lín ‘line’
djéi ‘similar, to seem’
séíbi ‘seven’
në́ígi ‘nine’
ë́lúfu ‘eleven’
tuwálufu ‘twelve’
dúsu ‘a thousand’
híi ‘all’
fuúku ‘early’
gáu ‘fast’
nö́íti ‘never’
júu ‘hour’
olóísi ‘clock’
léi ‘to show, to learn, to study, to teach’
dúngu ‘dark, obscure’
baáu ‘blue’
guúun ‘green’
kölö́ku ‘good luck’
piizíi ‘happy’
fö́útu ‘mistake’
fusután ‘to understand’
djeési ‘to imitate’
sikö́ö ‘school’
sóífi ‘certain’
kö́nku ‘to betray’
nö́útu ‘need, necessity’
wö́útu ‘word’
sikífi ‘to write’
lési (2) ‘to read’
pampía ‘paper’
pë́ni ‘pen’
foloíti ‘flute’
kumadéi ‘to command, to order’
kómpe ‘friend’
feántima ‘enemy’
guwénti ‘custom’
séépi ‘fishnet’
wë́ti ‘law’
kotóígima ‘witness’
kaági ‘to accuse’
sitááfu ‘penalty, punishment’
bútu ‘fine’
kéíki ‘religion’
baisígi ‘bicycle’
otó ‘car’
bési ‘bus’
opaláni ‘airplane’
masíni ‘motor, machine’
óli ‘petroleum’
péíki ‘pill or tablet’
sipóíti ‘injection’
lánti ‘government’
sikö́útu ‘police’
sitááti ‘street’
biífi ‘letter, post/mail’
mataási ‘mattress’
sukúfu ‘screw’
kúku ‘candy/sweets’
kolánti ‘newspaper’
kínö ‘film/movie’
té (2) ‘tea’
kofí ‘coffee’
sö́ndö ‘without’

Suriname Portuguese
téla ‘land’
súndju ‘dust’
paatí ‘island, to share, to separate, to divide’
sukúma ‘foam’
lío ‘river or stream’
mátu ‘woods or forest’
páu ‘wood, tree’
líba ‘moon, above, month’
teéja ‘star’
sugúu ‘darkness’
sómba ‘shade or shadow’
vë́ntu ‘air, wind’
tjúba ‘rain’
síndja ‘ash’
tjumá ‘to burn’
tapá ‘to extinguish, to shut, to cover, to forbid, to prevent’
wómi ‘man, male’
mujë́ë ‘woman, female’
mií ‘child’
avó ‘grandparents’
tío ‘mother’s brother’
pái ‘father-in-law, son-in-law’
mái ‘mother-in-law, daughter-in-law’
kaabíta ‘goat’
bulíki ‘donkey’
ganían ‘chicken’
patupátu ‘duck’
gabián ‘hawk’
pómba ‘dove’
kwátíwójo ‘opossum’
gééja ‘gill’
lë́un ‘lion’
makáku ‘monkey’
kaapátu ‘tick’
bítju ‘worm’
kákísa ‘skin, hide, leather, bark’
puúma ‘body hair, feather’
lábu ‘tail’
wójo ‘eye’
katáu ‘nasal mucus, cold’
búka ‘mouth, beak, edge’
gangáa ‘neck’
máun ‘arm, hand, branch’
húnjan ‘fingernail, claw’
pantéja ‘calf of leg’
hánza ‘wing’
bíngo ‘navel’
básu (1) ‘spleen’
tiípa ‘intestines, guts’
kú ‘vagina, vulva’
suwá ‘to perspire’
gumbitá ‘to vomit’
lëmbë́ ‘to lick’
babá ‘to dribble’
duumí ‘to sleep’
lonká ‘to snore’
sunján ‘to dream’
miindjá ‘to piss’
diiná ‘to shit’
tumá ‘to have sex’
tëëmë́ ‘to shiver’
paí ‘to beget’
nasí ‘to be born, to grow’
vívo ‘to be alive’
búnu ‘healthy, good’
fë́bë ‘fever’
kulá ‘to cure’
opión ‘poison’
poosián ‘poison’
mö́i ‘cooked, soft’
kúa ‘unripe’
póndi ‘rotten’
bebé ‘to drink’
tjupá ‘to suck’
gulí ‘to swallow’
fëëbë́ ‘to cook, to boil’
fiidjí ‘to roast, fry’
jasá ‘to roast, to fry, to bake’
kujë́ë ‘spoon’
fáka ‘knife’
buuká ‘to peel’
limbá ‘to peel’
lalá ‘to crush, to grind’
fuúta ‘fruit’
súki ‘sugar’
óbo ‘egg’
bisí ‘to put on, ornament, adornment’
agúja ‘needle (1)’
saapátu ‘shoe’
kaapúsa ‘hat, cap’
andélu ‘ring’
kónda ‘necklace’
pénti ‘comb’
sipéi ‘mirror’
sikáda ‘ladder’
djaaí ‘garden’
kiijá ‘to cultivate’
sakpí ‘to thresh’
foló ‘flower’
tabáku ‘tobacco’
pípa ‘pipe’
batáta ‘sweet potato’
dobá ‘to fold’
matjáu ‘axe/ax’
latjá ‘to split’
feegá ‘to rub’
tëndë́ ‘to stretch’
paajá ‘to spread out’
peetá ‘to squeeze’
baí ‘to sweep’
basö́ö ‘broom’
félu ‘tool, iron’
peégu ‘nail’
káma ‘mat’
bulí ‘to move’
biá ‘to turn around’
lolá ‘to roll’
toosá ‘to twist’
kaí ‘to fall’
kulé ‘to flow, to run’
buwá ‘to fly’
koogá ‘to slide, slip’
bajá ‘to dance’
subí ‘to go up, to climb’
baziá ‘to go down’
tooná ‘to come back’
disá ‘to leave, to let go’
fusí ‘to disappear, to flee’
dendá ‘to enter’
mandá ‘to send, to command, to order’
panján ‘to grasp’
panjá ‘to hold’
dá ‘to give’
tjubí ‘to preserve, to hide’
paká ‘to pay, wages’
básu (2) ‘down, under, bottom’
déndu ‘inside’
kamían ‘place’
butá ‘to put’
fiká ‘to remain’
jabí ‘to open’
töö́tö ‘left, crooked’
zúntu ‘near’
lóngi ‘far’
maaká ‘to measure’
gaán ‘big’
pikí ‘small’
fitjá ‘narrow’
fínu ‘thin’
mángu ‘thin’
fúndu ‘deep’
baáku ‘hole’
tooká ‘to change’
kondá ‘to count, to tell’
túu ‘all’
tjiká ‘enough’
kabá ‘last, end (2), to finish, to cease, ready’
gaándi ‘old’
awáa ‘now’
biingá ‘to hurry’
didía ‘day (1)’
amanján ‘tomorrow’
sabá ‘week’
pikísaba ‘Wednesday’
gaánsaba ‘Thursday’
sëndë́ ‘to shine, bright’
límbo ‘light (2), clean, clear’
línzo ‘smooth’
munján ‘wet’
kéndi ‘hot, warm’
baasá ‘to embrace’
pená (1) ‘to regret, to be sorry’
djëmë́ ‘to groan’
giitá ‘to groan’
buusé ‘to hate’
fédja ‘envy, jealousy’
kë́ ‘to want’
ganjá ‘deceit’
gafá ‘praise’
sábi ‘to know’
poobá ‘to try’
ku ‘and, with’
kandá ‘to sing, song’
ngáku ‘to stutter, to stammer’
fiá ‘to deny’
pidí ‘to ask (2)’
niingá ‘to refuse’
búja ‘quarrel’
lánza ‘spear’
djulá ‘to swear’
saí ‘to be’
akí ‘here’
alá ‘there’
óto (1) ‘other’

Twi
djönkú ‘hip’
akáa ‘soul, spirit’

Unspecified Bantu language
bakúba ‘banana’

Wolof
nján (1) ‘to bite, to eat’

Chapter 37

Loanwords in Imbabura Quechua*

Jorge Gómez Rendón and Willem Adelaar

1. The language and its speakers

Imbabura Quechua is spoken in the northern Andes of Ecuador by some 150,000 speakers. Although the majority of them rely on agriculture, an increasing number also live on the trade of handicrafts in and around the town of Otavalo. Others sell their labor to different factories in the province or migrate temporarily to work in the nearby cities of Ibarra and Quito. The socioeconomic status of Imbabura Quechua speakers is considered one of relative prosperity in comparison to that of other ethnic groups in Ecuador.

Imbabura Quechua is part of the Quechua IIB dialect group (Torero 1964). This branch covers an extensive area that includes “the dialects of the Ecuadorian highlands and Oriente (the eastern lowlands); the Colombian Quichua dialect usually called Inga or Ingano (Caquetá, Nariño, Putumayo); the dialects spoken in the Peruvian department of Loreto in the Amazonian lowlands (which are, in fact, extensions of the varieties spoken in the Amazonian region of Ecuador); the Lamista dialect spoken in the area of Lamas (Department of San Martín, Peru); and that of Chachapoyas and Luya (Department of Amazonas, Peru)” (Adelaar & Muysken 2004: 186f). Figure 1 gives an idea of the place of Imbabura Quechua within the Quechua language family. Map 1 charts highland and lowland varieties of Ecuadorian Quechua.

Differences between northern Quechua (Ecuador), locally known as Quichua, and southern Quechua (Peru and Bolivia) occur at all levels of linguistic structure but are particularly noticeable in morphology. Like other Ecuadorian dialects, Imbabura Quechua has undergone a gradual process of morphological simplification involving the loss of verb-object agreement and possessive nominal suffixes. Typologically speaking, Imbabura Quechua is much more analytic than Peruvian and Bolivian Quechua, even though it preserves the typical agglutinative character of all Quechua languages.

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Gómez Rendón, Jorge A. 2009. Imbabura Quechua vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1179 entries.

Figure 1: The Quechua language family

Quechua I: Huaylas-Conchucos, Alto Pativilca, Alto Marañón, Alto Huallaga, Yaru, Jauja-Huanca, Huangáscar-Topará
Quechua II
  Quechua IIA: Ferreñafe (Cañaris), Cajamarca, Lincha, Pacaraos
  Quechua IIB: Chachapoyas, San Martín, Ecuadorian Quechua
    Highland Quichua: North Highland (Imbabura, …), South Highland (Salasaca, Cañar, …)
    Lowland Quichua: North Lowland (Napo), South Lowland (Pastaza)
  Quechua IIC: Ayacucho, Cuzco, Collao (Puno), Northern Bolivian (Apolo), Southern Bolivian, Santiago del Estero (Argentina)

Imbabura Quechua shows lexical and phonetic differences from other Ecuadorian varieties of Quechua. A number of localisms are due to pre-Inca substrata while others result from semantic specialization of Quechua words. Salomon & Grosboll (1986) show that substratum influence in Imbabura Quechua comes from Cara, an indigenous language once spoken in Imbabura and Pichincha. Cara was eventually replaced by Imbabura Quechua in the early eighteenth century (Caillavet 2000: 103).

Phonetically, Imbabura Quechua differs from other Ecuadorian dialects in that the stops /p/ and /k/ can be fricativized as [f] and [j] in all positions except after a nasal. The same phonemes are aspirated ([pʰ], [kʰ]) or non-aspirated ([p], [k]) in the rest of the Ecuadorian dialects. Some examples are pukuna ‘to blow’, realized as [fukuna] in Imbabura but [pʰukuna] in Bolívar (central) and [pukuna] in Loja (southern); upiana ‘to drink’, realized as [ufyana] in Imbabura but [upʰyana] in Cotopaxi and Tungurahua (central) and [upyana] in Azuay (southern).

According to Stark et al. (1973) (quoted in Cole 1982), Imbabura Quechua is divided into five subdialects: “from Cayambe through San Pablo and from the east of Mount Imbabura to Angla, Zuleta, Angachawa [sic], and Rinconada, and from these communities to Mariano Acosta and Pimampiro, hereafter Rinconada; (2) San Roque; (3) the zone from San Rafael in the north to San Roque on the east side of the Ambi River, hereafter Otavalo; (4) to the north of San Roque until San Antonio de Ibarra on the east side of the Ambi River, hereafter San Antonio; and (5) to the north of San Rafael and to the east of the River Ambi through the area near Cotacachi, hereafter Cotacachi” (Cole 1982: 7f). The present chapter is based on the Rinconada dialect.

Sociolinguistically, the province of Imbabura ranks second among the nine Quechua-speaking provinces of Ecuador in number of speakers (Haboud 1998: 91–92). Imbabura also shows the largest number of Quechua-Spanish bilinguals in the country (Büttner 1993: 48–49). While there is a small number of Imbabura Quechua monolinguals, the tendency nowadays is one of increasing levels of bilingualism accompanied by maintenance of the native language. The language is vigorously spoken at community and family levels and is taught in schools as part of the Bilingual Intercultural Education Program implemented since 1986. In recent decades Imbabura Quechua has entered oral media through regular radio broadcasting. The language has had a unified writing system since 1980.

Map 1: The geographical setting of Imbabura Quechua

The fact that Imbabura Quechua shows a comparatively strong vitality in Ecuador should not veil its non-indigenous origin. According to ethno-historical evidence (Torero 1974; 1984–1985), long-distance traders or mindaláes¹ brought and disseminated Quechua from the central Peruvian coast to the northern Andes. Later, in the late fifteenth century, Quechua became a lingua franca for different ethnic groups.² The variety disseminated was Chinchay Quechua, so named by Torero (2002: 93) because of its assumed association with the commercial port of Chincha in Peru. By the time of the Inca conquest at the beginning of the sixteenth century, Quechua was extensively spoken in the northern Andes and the Incas used this language to communicate with the local peoples (Cerrón-Palomino 1987: 365). In the sixty years between the invasion of the northern Andes by Tupac Yupanqui (ca. 1470) and the fall of the Inca empire in 1532, Chinchay Quechua became consolidated but could not displace the native languages.

That not all the peoples from Imbabura were bilingual in their native languages and Quechua by the early years of the Spanish colonization is demonstrated by several chroniclers. Andres Rodríguez reports that in the curacy of Lita (western slopes) “only a few speak the lengua general [Quechua]” (1991 [1582]: 413). Antonio de Borja admits in similar terms that “very few Indians of this curacy [Pimampiro, eastern slopes] speak the language of the Inca while none of the women know the language” (1991 [1591]: 483). Compare these reports with the statement of Jerónimo de Aguilar, who notes that “most of these Indians [from the curacy of Caguasquí] either speak the language of the Inca or understand it sufficiently” (1991 [1582]: 416). From early records we know that Caguasquí and Salinas were settlements of the Otavalo Indians, where salt was produced for domestic consumption or exchanged with other peoples to the west and the east of the Andes (Caillavet 1981a). These reports suggest that of the native peoples from Imbabura only the Otavalo Indians were bilingual in their own language (Cara) and Quechua.
The Indians settled on the slopes of the Andes had only a few incipient bilinguals. From the chronicles it is clear that at least some of these Indians spoke Barbacoan languages. The reason why Otavalo Indians were much more proficient in Quechua than their neighbors was their permanent and intense contact not only with the Inca invaders but also with several groups of forced migrants, the so-called mitimaes, who were resettled in Imbabura after the defeat of the Cara around 1505.

The dissemination of Quechua in the northern Andes did not imply the replacement of the pre-Inca languages. The Diocesan Synod of Quito ordered in 1593 the preparation of catechisms and confessionaries in these languages for the evangelization of peoples whose mother tongue was not Quechua (Adelaar & Muysken 2004: 392). It is generally assumed that pre-Inca languages survived throughout the sixteenth century to be finally replaced by Quechua around the second half of the seventeenth century or the early years of the eighteenth century.

¹ The word mindalá itself is not Quechua but a local expression from one of the indigenous Ecuadorian languages.
² Before the Inca invasion the territory of today’s Imbabura was inhabited by several ethnic groups: central Imbabura was the home of speakers of the Cara language (belonging to the Barbacoan family) while the western and eastern slopes were populated by other Indians of Barbacoan affiliation, probably Cayapas and Pastos (Caillavet 1981b).

During the Spanish colonization (1532–1810), Quechua too was used as a means of evangelization, in particular after the three Councils held in Lima between 1551 and 1583. Efforts were made to standardize Quechua in order to make its learning easier for priests and facilitate the printing of books in the language. The basis for the standardization was Cuzco Quechua, a variety directly associated with the center of the Inca empire. Cuzco Quechua presented several phonetic intricacies which were eventually omitted in the standardized version: e.g. the velar-uvular distinction /k/ – /q/ and the ejective-aspirated distinction of stops (Mannheim 1991: 142). Closely resembling the Quechua variety spoken in the northern Andes because of its simplified phonetics, the standard was used until the first half of the seventeenth century (Adelaar & Muysken 2004: 183). Some scholars maintain that the missionary use of standardized Quechua decisively influenced the development of Ecuadorian Quechua, especially in the Amazon Lowlands (Oberem & Hartmann (1971); but see Muysken (2000) for an evaluation of this hypothesis). The influence of standardized Quechua may not have been as decisive, but its use by missionaries certainly promoted the dissemination of Quechua in the northern Andes at the expense of indigenous languages.

Because these languages were spoken along with Quechua for a couple of centuries, their influence on the development of Ecuadorian Quechua in general and Imbabura Quechua in particular is obvious. In addition, the contact between speakers of Quechua in Imbabura and nearby ethnolinguistic groups presumably continued for at least another hundred years after the extinction of the local pre-Inca language due to an extensive network of trade that survived into the eighteenth century (Caillavet 2000: 81).
These groups spoke several Barbacoan languages, including now extinct Pasto (southern Colombia), living Tsafiki (western slopes of the Andes) and living Awa Pit (southern Colombia and the Ecuadorian Province of Carchi).³

Because Quechua was not the mother tongue of the local peoples of the northern Andes until their native languages were eventually replaced, it is not possible to speak of Imbabura Quechua as a distinct variety before the end of the seventeenth century. It is only from the moment that the native people of Imbabura abandoned their pre-Inca language (Cara) and adopted Quechua that something like an Imbabura variety of Quechua emerged. The historical record shows that the shift to Quechua was a gradual process that lasted over one hundred years. From a linguistic examination of early grammatical descriptions, Muysken (2009) shows that Quechua in Ecuador kept many features of Peruvian dialects in the seventeenth century, and that these features were gradually replaced by those typical of present-day Quechua in the course of the next two centuries (e.g. the loss of an inclusive-exclusive distinction in the nominal and verbal paradigms and the loss of object encoding in the verb).

³ Notice, however, that the presence of Awa Pit in Ecuador is the result of recent migration from southern Colombia.

Further changes in Ecuadorian Quechua continue to date, but now they are motivated by language contact rather than internal evolution. The role played by Spanish in this case is decisive. Spanish influence on Ecuadorian Quechua dates back to the early years of the European conquest but the degree of influence has grown dramatically in the last century as a result of the expanding mainstream society. Increasing levels of bilingualism among Quechua speakers strengthen the influence of Spanish on Quechua lexicon and grammar. While Spanish influence is important, it is not the same across dialects and idiolects and often depends on geographical location and individual factors such as age and gender. More recently, the use of Quechua in radio broadcasting has introduced a number of structural changes in the language (Fauchois 1988). Contemporary Imbabura Quechua is a living language after four centuries of contact with Spanish because it made a compromise between the communicative needs imposed by the dominant culture and the speakers’ need to preserve their identity.

2. Sources of data

The major obstacle to the present investigation was the lack of specific lexicographic studies on Imbabura Quechua. This situation is not unique to Imbabura Quechua but holds for Ecuadorian Quechua in general. Four dictionaries of Ecuadorian Quechua were consulted for the preparation of the database accompanying this study: Cordero’s (1992 [1892]) dictionary, based mainly on southern Ecuadorian Quechua; Stark & Muysken’s (1977) Quichua-Spanish dictionary, with lexical information on dialectal zones and a large number of contemporary Spanish loanwords; Haboud et al.’s (1982) Quichua monolingual dictionary, with valuable phonetic information; and Torres Fernández de Córdova’s (2002) three-volume dictionary, with dialectal information about Ecuador, Peru and Bolivia. Data on Imbabura Quechua came from: (1) personal knowledge; (2) information provided by speakers; and (3) fieldwork notes collected by Gómez Rendón for his doctoral dissertation on Spanish lexical borrowing in Imbabura Quechua plus a corpus of spontaneous speech collected in several Quechua communities in Imbabura (Gómez Rendón 2008a).

Reference works consulted for the preparation of this chapter include: Cerrón-Palomino (1987) for a discussion of the hypotheses about the origin and expansion of Quechua in the Andes; Torero (2002) for a discussion of the use of Quechua in Ecuador before the Inca conquest and the existence of a trade network between the northern Andes and the Peruvian coast; Adelaar & Muysken (2004) for the genealogical classification of dialects and a general overview of pre-Inca languages in Ecuador; Jijón y Caamaño (1940–1945) for a discussion of the aboriginal languages of the northern Andes, in particular chapter IX of the first volume, which deals with the pre-Inca Cara (or Caranqui) language; Caillavet (2000) for an updated evaluation of linguistic, archaeological and historical data from Imbabura; and Cole (1982) for a discussion of the typological features of Imbabura Quechua and the integration of Spanish loanwords.

Data on Spanish loanwords came from personal knowledge, except in a few cases of localisms and obsolete words no longer used in the modern language. While the identification of Spanish loanwords was relatively easy, that of non-Spanish loanwords proved a major challenge in so far as they come from insufficiently described or undescribed languages of the Barbacoan family (e.g. Tsafiki, Awa Pit), and from extinct pre-Inca languages of which neither vocabularies nor grammars are available. Moore’s (1966) dictionary of Spanish and Tsafiki (the traditional name of the Colorado language) was of great help in establishing the origin of several Barbacoan loanwords. Finally, the identification of Quechua loanwords from non-Ecuadorian dialects could be established through lexicographic comparison.

3. Contact situations

The language that has most influenced Imbabura Quechua is Spanish. This is not surprising when the duration and the intensity of contact are considered. Equally decisive for Imbabura Quechua was the contact with pre-Inca Cara. Less influential was the contact with Peruvian Quechua, in particular with Cuzco Quechua, the language of the ruling Inca elite. Finally, the contact with neighboring Barbacoan languages may have been regular before the Spanish conquest but was presumably interrupted one century later. These contact situations each correspond to a specific period of the history of the northern Andes.

3.1. Contact with pre-Inca languages

Chronologically, the first language in contact with Quechua in Imbabura was the Cara language of the Otavalo Indians (Caillavet 1981b: 109ff). The affiliation of Cara has been disputed over the years, but most scholars agree nowadays that it was a Barbacoan language (cf. Adelaar & Muysken 2004: 393–394). Cara is therefore affiliated with other languages of southern Colombia such as Pasto or Muellamués, both extinct, but also with Tsafiki, Cha’palaa and Awa Pit, spoken today in the provinces of Santo Domingo de los Tsáchilas, Imbabura, Esmeraldas and Carchi in northern Ecuador. Therefore, it is not possible to trace a clear-cut distinction between Cara and other Barbacoan influence on Imbabura Quechua. For strictly practical purposes we have established a distinction in the following terms: a loanword is considered a borrowing from Cara in so far as similar word forms are not present in the living Barbacoan languages of the area (Tsafiki and Awa Pit); by contrast, if a loanword has a clearly identified counterpart in either of these languages, it is considered to be of Barbacoan origin, i.e. Tsafiki or Awa Pit. We are aware that this procedure is rather artificial to the extent that loanwords assigned to living Barbacoan languages could have had similar forms in Cara, but the distinction is helpful in providing a more accurate classification of loanwords.

Besides Cara, the Otavalo Indians began to use Chinchay Quechua as a lingua franca in the first half of the fifteenth century. After the Inca conquest (ca. 1470) Chinchay Quechua became the official language of the Inca administration in the northern Andes (Torero 1983: 68) but Cara continued to be spoken by the majority of the local population. After the Spanish conquest in 1532, Chinchay Quechua was used for evangelization while Cara was still vital. Finally, in the early eighteenth century Quechua replaced Cara as the native language of the Indian population of the northern Andes. In sum, Quechua was in contact with Cara for at least three centuries.

Cara influence on Quechua must have been minor during the pre-Inca and Inca periods because Quechua was used only by small sectors of the population such as traders and local elites. However, by the time the Otavalo Indians adopted Quechua as their second language in the early 1600s, a greater influence from Cara must be assumed. In general, Cara-Quechua contacts involve two scenarios: one of slight borrowing, before the Spanish conquest, and another of moderate or intense borrowing from the Spanish invasion onwards until the eventual demise of the Cara language. It is expected that a long contact with Cara may have induced important language changes in Quechua that go beyond lexical borrowing. Phonetically, for example, the fricativization of stops in Imbabura Quechua and the non-aspiration of consonants in any position may be the result of Cara substratum (Torero 2002: 106, 371). This substratum could also explain the re-ordering of the switch-reference system (cf. Adelaar & Muysken 2004: 149).
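The working distinction between Cara and Barbacoan loanwords set out at the start of this section amounts to a simple decision rule. A minimal Python sketch of that rule follows, assuming the word has already been identified as a non-Quechua, non-Spanish loanword. The record layout, the function name and the placeholder forms are illustrative assumptions, not part of the chapter's database or of any attested vocabulary.

```python
from dataclasses import dataclass, field

# Living Barbacoan languages used as the point of comparison in the chapter.
LIVING_BARBACOAN = ("Tsafiki", "Awa Pit")


@dataclass
class Loanword:
    """A loanword candidate (hypothetical record layout, for illustration only)."""
    form: str
    gloss: str
    # Assumed field: similar forms found in living Barbacoan languages,
    # keyed by language name. An empty dict means no counterpart was found.
    counterparts: dict = field(default_factory=dict)


def classify_donor(word: Loanword) -> str:
    """Label a word Barbacoan if a living counterpart is recorded, else Cara."""
    matches = [lang for lang in LIVING_BARBACOAN if word.counterparts.get(lang)]
    if matches:
        return "Barbacoan (" + " or ".join(matches) + ")"
    return "Cara"


# Placeholder forms only; these are not attested words.
examples = [
    Loanword("form_a", "gloss_a"),
    Loanword("form_b", "gloss_b", {"Tsafiki": "similar_form_b"}),
]
labels = [classify_donor(w) for w in examples]
print(labels)
```

As the authors themselves note, the rule is a practical convenience: a word labelled Barbacoan here could equally have had a similar form in Cara, so the label records where a counterpart is attested rather than a proven donor.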

3.2. Contact with Peruvian Quechua

Quechua entered the northern Andes several times and in the form of different dialects: first, through long-distance traders from Chincha in the fifteenth century; second, through the Inca rulers from Cuzco between 1470 and 1532; and third, through mitimaes (populations uprooted from their traditional homelands and re-settled in distant areas of the Inca empire for political reasons) and segments of the Inca army during the Inca occupation. Each dialect made its own contribution. Chinchay Quechua provided the lexical and grammatical basis for the emergence of Quechua in Imbabura. Cuzco Quechua influenced Chinchay Quechua as a source of lexical innovation during the Inca occupation of the northern Andes. Cuzco Quechua was also a point of reference for all Quichua varieties in early colonial times (Garcés 1999: 35) because of the prestige associated with the former Inca capital. The influence of Cuzco Quechua did not go beyond schooling circles, however.⁴

⁴ But see Itier (1991) for an example of its use in letter correspondence in 1616.

Peruvian varieties other than Cuzco Quechua also made their own contribution to the Quechua spoken in Imbabura through: (1) mitimaes uprooted from other Quechua-speaking areas of the Inca empire who were resettled in the northern Andes (Espinosa Soriano 1988b: 15, 362);⁵ (2) Inca soldiers who stayed in the conquered territories after their pacification or could not return to their Peruvian homes after the fall of the Inca empire (Torero 2002: 102). In any case, the languages of mitimaes and soldiers were less influential because their speech eventually merged into the pool of local Chinchay.

3.3. Contact with Barbacoan languages

Historical records show that well before the Spanish conquest the Otavalo Indians were part of a regional trade network that involved groups from the Andean western slopes (Caillavet 2000: 46ff). Today the western slopes (western Imbabura) are inhabited by speakers of Barbacoan languages (Tsafiki, Awa Pit). Therefore, the ethnic groups mentioned by the records must have spoken one or more Barbacoan languages. With the transformation of the regional economy during the second half of the seventeenth century (Caillavet 2000: 59ff), the relations between the highlands and the western slopes became less important. The Barbacoan-Quechua contact must have reached a peak in the Inca period and continued in early colonial times to eventually wane in the eighteenth century.

3.4. Contact with Spanish

Spanish is by far the most important of the languages in contact with Quechua in Imbabura. Apart from the time factor (four centuries of contact), other influencing factors such as the inferior status of Quechua vis-à-vis Spanish and the increasing rates of bilingualism among Imbabura Quechua speakers as a result of their participation in the market economy have induced major changes in the lexicon and the structure of the indigenous language (Gómez Rendón 2007). Interestingly, bilingualism in Imbabura is not accompanied by the loss of Quechua as in other provinces. Compared to conservative varieties, Imbabura Quechua shows an important degree of Spanish lexical borrowing, the end point of which is the emergence of mixed varieties with Spanish lexicon and Quechua morphology (Gómez Rendón 2005, 2008b).

[5] There are several cases of Peruvian mitimaes in the northern Andes of Ecuador, but the best documented case concerns the Huayucuntus from Cajamarca, who served as a military force to control Quito and Otavalo. There are also reports of Aymaran mitimaes in the central Highlands (the province of Cotopaxi) but no documents exist that prove their presence in Imbabura.

4. Number and kinds of loanwords

The Imbabura Quechua subdatabase contains 1482 meaning-word pairs, of which 172 are meanings without equivalents in this language. There are 257 words that correspond to two or more Loanword Typology (LWT) meanings (supercounterparts). Out of 1310 meanings with established equivalents, 389 (31.3%) correspond to loanwords or probable loanwords. The number of distinct loanwords and probable loanwords amounts to 359 different lexical items. This represents 24.2% of all the entries and 27.4% of all LWT meanings with established equivalents. Out of the set of distinct loanwords, 342 items (95.3%) are of Spanish origin, 7 items (1.8%) of Peruvian Quechua origin, 5 items (1.4%) of pre-Inca origin, 3 items (0.9%) of Barbacoan origin, and 2 items (0.2%) of unknown origin. The borrowed status is clearly established for all of the Spanish, Barbacoan and Peruvian Quechua loanwords. The status of “probably borrowed” items has been assigned to all of the pre-Inca loanwords. While there is no evidence of calquing for any borrowed item in the database, 18 entries contain analyzable compounds created on a Spanish loan basis.

4.1. Loanwords compared

A comparison of loanwords per source language confirms the duration, intensity and level of bilingualism associated with each of the contact situations described in §3. The overwhelming presence of Spanish loanwords in the database is not surprising given the sociolinguistic situation of Imbabura Quechua: four centuries of contact, higher levels of bilingualism and active participation in the Spanish-speaking society. The percentage of Spanish lexical borrowing is closely similar to the percentage reported in a corpus-based investigation on Ecuadorian Quechua (cf. Gómez Rendón 2006, 2008).

Much less numerous in the database are pre-Inca loanwords. Nevertheless, their presence is indicative of a clear pre-Quechua substratum. Some pre-Inca loanwords have been inserted while others have replaced Quechua items and still others coexist with them. Because the local pre-Inca language (Cara) was spoken during the Inca occupation and throughout the early colonial period, pre-Inca loanwords come from different periods. Among these loanwords are not only names of plants and animals but also a few of the basic vocabulary items that must have been adopted when the majority of the native population became bilingual in Cara and Quechua. On closer inspection it is possible to find a larger number of pre-Inca loanwords in Imbabura Quechua, most of them corresponding to zoological and botanical concepts not included in the database. For non-LWT meanings corresponding to pre-Inca loanwords, see the Appendix.

Loanwords from Cuzco Quechua (Peruvian) all correspond to basic vocabulary. It is therefore reasonable to hypothesize that they entered Quechua in Imbabura before or during the Inca conquest rather than in colonial times. One loanword from Cuzco Quechua is an originally Aymaran loanword: allchi ‘grandchild’ < allchhi ‘grandson’.[6] Alternatively, the loanword may have come directly from Aymara through forced immigrants in the northern Andes. We have also found a couple of loanwords from dialects of the Quechua I group: shigra ‘netbag’ (Huaylas-Conchucos dialect) and pikpiga ‘burrowing owl’ (Jauja-Huanca dialect), even though the latter is not part of the LWT core list.

Of the three Barbacoan loanwords one belongs to basic vocabulary (puzun ‘stomach’ < Awa Pit puzan), while the other two are related to the semantic field Animals (tupan ‘bat’ < Tsafiki supãn; tazin ‘nest’ < Tsafiki ta’sin). These words might have entered Quechua before the Spanish conquest or in the first century of colonization, when contact with Barbacoan groups was still intense. Barbacoan loanwords would be much more numerous if we expanded our database to include endemic concepts.

The large number and varied origin of loanwords in the lexicon of Imbabura Quechua demonstrate not only the intensive contact with other languages but also a permissive attitude towards language mixing. It is not unreasonable to attribute this openness to external influences to the non-indigenous origin of Imbabura Quechua and its development from regional lingua franca to local first language.

4.2. Loanwords and semantic word class

Loanwords are classified according to semantic word class in Table 1. Nouns are by far the largest semantic word class, followed by verbs and adjectives. No loan adverbs have been found. As far as their origin is concerned, Barbacoan loanwords are all nouns. Similarly, Peruvian Quechua loanwords are all nouns, except for one verb. By contrast, Spanish and pre-Inca loanwords include not only nouns but also verbs and adjectives. In addition, there are a couple of function words from Spanish, although these are not the only ones reported for Imbabura Quechua (cf. Gómez Rendón 2007).

The primacy of loan nouns over other word classes, a fact amply corroborated by most case studies, is explained by the need to name new objects and practices introduced by other speakers. While this need may be catered for by the creation of new words (neologisms) or the borrowing of lexical items from the contact language, the use of either mechanism depends on: (i) the stage of contact (the first strategy is preferred in earlier stages); (ii) the level of bilingualism of speakers (bilingual speakers usually borrow most); and (iii) the social attitude towards language mixing (if the speech community values purism, borrowings will not find their way into the language). In the case of Imbabura Quechua, the conditions that make Spanish borrowing the preferred choice are all met: centuries-long contact with Spanish, high levels of Quichua-Spanish bilingualism, and the absence of sociocultural restrictions on language mixing because of the prestige associated with Spanish.

[6] Lexical borrowing between Southern Peruvian Quechua and Aymara goes far beyond one single item. This fact has misled some scholars into proposing a genealogical relation between the two languages (cf. Adelaar & Muysken 2004: 34ff).

Table 1: Loanwords in Imbabura Quechua by semantic word class (percentages)

                Spanish  Peruvian  Pre-Inca   Barbacoan  Unidentified  Total      Non-
                         Quechua   languages  languages                loanwords  loanwords
Nouns           33.7     0.8       1.4        0.3        0.3           36.5       64.6
Verbs           14.3     0.3       0.3        -          -             14.9       85.1
Adjectives      12.5     -         0.3        -          -             12.8       87.2
Adverbs         -        -         -          -          -             -          100.0
Function words  2.0      -         -          -          -             2.0        98.0
all words       25.7     0.5       0.4        0.2        0.2           27.0       73.0

As regards function words, note on the one hand that numerals from one to ten are all Quechua and that Spanish numerals coexist with native forms in less conservative idiolects. On the other hand, function words such as the loan adverb simpri ‘always’ (< Spanish siempre) and the conjunction o ‘or’ (< Spanish o) occur not only in Imbabura but also in the rest of Ecuadorian Quechua (cf. Gómez Rendón 2008a). These and other function words are reported in many indigenous languages in contact with Spanish around the world (cf. Stolz & Stolz 1996, 1997).

4.3. Loanwords and semantic field

Although Spanish loanwords are the great majority, most semantic fields have at least one loanword of non-Spanish origin. Only the field of Miscellaneous function words has Spanish loanwords exclusively. Spanish influences all semantic fields. An important number of Spanish loanwords are lexical insertions referring to new entities (objects, concepts, practices) of the dominant society. In 25% of the entries, Spanish loanwords have replaced native items in different periods of time. Here are some examples:

LWT meaning    Quechua word replaced   Spanish borrowing   Time of replacement
‘the star’     kuillur                 luziro < lucero     Late Colonial
‘to measure’   tupuna                  midina < medir      Early Republic
‘the cow’      wagra                   baka < vaca         Contemporary

In other cases Spanish loanwords coexist with native items, although there is no exact semantic equivalence between them. Such is the case of micha ‘light’ (< Spanish mecha ‘wick’) and tayta ‘father’ (< old Spanish taita), which overlap with Quechua nina ‘light’ and yaya ‘father’, respectively. While Spanish-derived micha is used to refer to the light produced by a candle, Quechua nina refers to any kind of light. Similarly, Spanish-derived tayta marks respect towards a male individual for his age or social position[7] while Quechua yaya refers to one’s father or grandfather. It is worth noting that the semantic differentiation between the elements of both pairs is not an independent development of Quechua.

[7] For example, taita was a term of respect for priests and other individuals of high position in sixteenth-century Spanish.

The distribution of loanwords across semantic fields is given in Table 2. Since the overwhelming number of Spanish loanwords does not allow a meaningful crosslinguistic comparison, we will focus on the semantic distribution of Spanish borrowings only. We have grouped semantic fields according to levels of influence: the first group contains semantic fields where Spanish lexical borrowing is particularly high (66%–100%); the second group includes semantic fields where such borrowing is moderate (36%–65%); finally, the third group contains semantic fields where Spanish lexical borrowing is generally low (0%–35%).

The semantic fields Modern world and Religion make up the group of heavy borrowing. The number of Spanish loanwords in these fields represents 72% and 66% of their respective entries. The well-attested use of evangelization for the acculturation of native American peoples from the early years of Spanish colonization explains the occurrence of such a large number of loanwords in these fields.

Semantic fields with moderate lexical borrowing include Clothing and grooming, The house, Kinship, Basic actions and technology, and Law. The presence of Spanish items in the first two fields reflects the new clothing and housing practices of Quechua speakers in Imbabura. Thus, for example, the loanword biga ‘rafter’ was introduced in the second half of the twentieth century, when straw houses began to be replaced by log cabins, and brick houses with tiled roofs. The semantic field Kinship includes loanwords for kinship terms which result from the rearranging of family relations. Most of the loanwords referring to basic actions and technology are associated with the introduction of implements used in Western arts and crafts. A further field of moderate borrowing concerns law. Here we find an interesting mixture of Quechua words and Spanish items. On closer inspection it becomes clear that concepts referring to spaces (e.g. karsil ‘prison’, tribunal ‘court’), actions performed therein (e.g. jwizhu ‘judgment’, jurana ‘to swear’) and performers themselves (e.g. jwis ‘judge’, tistigo ‘witness’) use Spanish loanwords while other, more general concepts such as shuwana ‘to steal’ or wañuchina ‘to murder’ are mainly Quechua.

There are sixteen semantic fields in which Spanish lexical borrowing is low. Those with a minimum of Spanish loanwords include Quantity, Emotions and values, and The physical world. For the first of these fields, it is necessary to consider that Quechua numerals often coexist with Spanish forms in the speech of young bilinguals. Semantic fields on the verge of moderate borrowing are Food and drink, Agriculture and vegetation, Animals, and Possession. Spanish loanwords in the first two of these fields result from the replacement of native practices with those of Western society. For example, the Spanish names for ‘fork’, ‘spoon’ and ‘knife’ refer to new utensils for eating. On the other hand, Quechua words referring to native cooking objects still in use such as manga ‘kettle’ have not been replaced by loanwords referring to similar objects in the mestizo society.

Table 2: Loanwords in Imbabura Quechua by semantic field (percentages)

    Semantic field                 Spanish  Peruvian  Pre-Inca   Barbacoan  Unidentified  Total      Non-
                                            Quechua   languages  languages                loanwords  loanwords
 1  The physical world             8.0      2.6       0.0        -          -             10.6       89.4
 2  Kinship                        38.8     1.2       1.2        -          -             41.2       58.8
 3  Animals                        32.0     1.0       1.0        1.5        2.0           37.5       62.5
 4  The body                       15.8     -         0.6        0.6        -             17.0       83.0
 5  Food and drink                 29.7     1.2       -          -          -             30.9       69.1
 6  Clothing and grooming          35.0     -         -          -          -             35.0       65.0
 7  The house                      40.4     -         -          -          -             40.4       59.6
 8  Agriculture and vegetation     31.1     -         -          -          -             31.1       68.9
 9  Basic actions and technology   32.0     1.3       0.6        -          -             34.0       66.0
10  Motion                         13.2     1.2       -          -          -             14.4       85.6
11  Possession                     32.6     -         -          -          -             32.6       67.4
12  Spatial relations              21.3     -         -          -          -             21.3       78.7
13  Quantity                       2.5      -         -          -          -             2.5        97.5
14  Time                           21.0     -         -          -          -             21.0       79.0
15  Sense perception               16.3     -         2.1        -          -             18.4       81.6
16  Emotions and values            41.6     -         -          -          -             41.6       58.4
17  Cognition                      15.7     -         -          -          -             15.7       84.3
18  Speech and language            16.6     -         -          -          -             16.6       83.4
19  Social and political relations 22.2     -         -          -          -             22.2       77.8
20  Warfare and hunting            15.0     -         -          -          -             15.0       85.0
21  Law                            46.1     -         -          -          -             46.1       53.9
22  Religion and belief            66.0     -         -          -          -             66.0       34.0
23  Modern world                   72.0     -         -          -          -             72.0       28.0
24  Miscellaneous function words   2.0      -         -          -          -             2.0        98.0
    all words                      25.7     0.5       0.4        0.2        0.2           27.0       73.0
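The three-way grouping described in the text (high 66%–100%, moderate 36%–65%, low 0%–35%) can be applied mechanically to the Spanish column of Table 2. The following is a minimal sketch in Python and not part of the original study: the name `borrowing_level` and the handling of values that fall between the stated bands are assumptions made for illustration. Its output need not coincide with the grouping given in the prose, which counts some fields as moderate although their printed share lies just below 36%.

```python
# Spanish loanword shares (%) per semantic field, copied from the
# Spanish column of Table 2 (fields 1-24, "all words" row omitted).
SPANISH_SHARE = {
    "The physical world": 8.0, "Kinship": 38.8, "Animals": 32.0,
    "The body": 15.8, "Food and drink": 29.7,
    "Clothing and grooming": 35.0, "The house": 40.4,
    "Agriculture and vegetation": 31.1,
    "Basic actions and technology": 32.0, "Motion": 13.2,
    "Possession": 32.6, "Spatial relations": 21.3, "Quantity": 2.5,
    "Time": 21.0, "Sense perception": 16.3,
    "Emotions and values": 41.6, "Cognition": 15.7,
    "Speech and language": 16.6,
    "Social and political relations": 22.2,
    "Warfare and hunting": 15.0, "Law": 46.1,
    "Religion and belief": 66.0, "Modern world": 72.0,
    "Miscellaneous function words": 2.0,
}


def borrowing_level(share: float) -> str:
    """Assign a percentage to one of the three bands named in the text.

    Assumption: the bands are read as >= 66 (high), >= 36 (moderate),
    otherwise low, so a value between 35 and 36 counts as low.
    """
    if share >= 66.0:
        return "high"
    if share >= 36.0:
        return "moderate"
    return "low"


# Collect field names per band, preserving the order of Table 2.
groups = {"high": [], "moderate": [], "low": []}
for field, share in SPANISH_SHARE.items():
    groups[borrowing_level(share)].append(field)

for level, fields in groups.items():
    print(f"{level}: {', '.join(fields)}")
```

Keeping the thresholds in a single function makes it easy to try a different reading of the band limits and see which fields change group.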

In the field of Agriculture and vegetation it is striking that Imbabura Quechua speakers, active farmers themselves, have replaced Quechua words with Spanish items or use both interchangeably. Thus, Quechua lampa was replaced by asadun (< Spanish azadón ‘spade’), but pallana ‘harvest’ coexists with kuzicha (< Spanish cosecha ‘harvest’). Finally, there are cases of lexical insertions to name objects introduced in agriculture by Spaniards (e.g. jurkita ‘pitchfork’). Both replacements and insertions mirror past and present changes in agricultural practices among the indigenous population of Imbabura.

Notice, to conclude, that most of the Spanish names of animals and plants correspond to those introduced by Spaniards since the beginning of colonization: e.g. pullu ‘chicken’; trigu ‘wheat’. In addition to Spanish loanwords there is one pre-Inca loanword (pilis ‘body louse’ < pre-Inca pilis)[8] and two items of Barbacoan origin (Tsafiki). Several pre-Inca loanwords referring to local flora and fauna which are not part of the core LWT list were not included in the statistics (cf. Appendix).
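The headline counts given at the start of §4 can be cross-checked with a few lines of arithmetic. The sketch below, in Python, is an illustration added for the reader and not a procedure used by the authors; the variable names are hypothetical. It recomputes shares from the reported counts only. Some recomputed figures may differ from the printed percentages, which could reflect rounding or a different base, and the sketch does not attempt to decide between them.

```python
# Counts as reported in the text of section 4.
TOTAL_ENTRIES = 1482          # meaning-word pairs in the subdatabase
WITHOUT_EQUIVALENT = 172      # meanings with no Imbabura Quechua equivalent
DISTINCT_LOANWORDS = {
    "Spanish": 342,
    "Peruvian Quechua": 7,
    "pre-Inca": 5,
    "Barbacoan": 3,
    "unknown": 2,
}

# Meanings with an established equivalent.
with_equivalent = TOTAL_ENTRIES - WITHOUT_EQUIVALENT

# Number of distinct loanwords across all sources.
distinct_total = sum(DISTINCT_LOANWORDS.values())


def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of each source among the distinct loanwords.
shares = {src: pct(n, distinct_total) for src, n in DISTINCT_LOANWORDS.items()}

print("meanings with equivalents:", with_equivalent)
print("distinct loanwords:", distinct_total)
print("share of all entries:", pct(distinct_total, TOTAL_ENTRIES))
print("share of meanings with equivalents:", pct(distinct_total, with_equivalent))
for src, share in shares.items():
    print(f"{src}: {share}%")
```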

5. Integration of (Spanish) loanwords

In this section we deal with the mechanisms for the integration of Spanish loanwords in Imbabura Quechua. The processes described below also apply to non-Spanish loanwords in so far as they follow Quechua morpho-phonological rules. The majority of Spanish loanwords (88%) are fully or almost fully assimilated to the phonological patterns of Imbabura Quechua. Partially integrated loanwords amount to 5% while unintegrated loanwords represent 7% of the whole set. Unlike loanwords of non-Spanish origin (e.g. Cuzco Quechua or pre-Inca), which always occur as integrated forms regardless of other considerations (e.g. age, semantics), the integration of Spanish loanwords depends heavily on an interaction of factors including age, frequency, pragmatics and discourse. Thus, an old loanword frequently used in discourse may be more integrated into Quechua phonology than a recent loanword whose frequency is also high.

The phonological integration of Spanish loanwords involves mainly vocalic changes. Spanish medial vowels are generally raised (/e/>/i/, /o/>/u/) or otherwise pronounced as close as possible to their Quechua equivalents, as illustrated in (1).

(1) [misa] / [miza] < Spanish /mesa/ ‘table’

Partial assimilation is frequent in words with several medial vowels, as shown in (2). Assimilation varies across idiolects, resulting in different pronunciations of the same word.

(2) [prizidinti] / [presidinti] / [presidente] < Spanish /presidénte/ ‘president’
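The full assimilation pattern seen in (1) and (2) can be sketched as a pair of string rewrites. This is a simplified illustration in Python, not a phonological model proposed by the authors: the function names are hypothetical, the input is assumed to be a lowercase broad transcription without stress marks, and the rule raises every /e/ and /o/, so it produces only the fully assimilated variant and none of the partially assimilated forms in (2).

```python
import re

# Vowel raising reported in the text: /e/ > /i/, /o/ > /u/.
RAISING = str.maketrans({"e": "i", "o": "u"})


def raise_vowels(form: str) -> str:
    """Raise all mid vowels of a transcribed Spanish form."""
    return form.translate(RAISING)


def voice_intervocalic_s(form: str) -> str:
    """Voice /s/ between vowels, as in the variant [miza] of (1)."""
    return re.sub(r"(?<=[aeiou])s(?=[aeiou])", "z", form)


def assimilate(form: str, voice_s: bool = False) -> str:
    """Return a fully assimilated variant of a Spanish form.

    voice_s selects the variant with a voiced intervocalic sibilant;
    both variants are attested for 'table' in example (1).
    """
    raised = raise_vowels(form)
    return voice_intervocalic_s(raised) if voice_s else raised


for source in ("mesa", "presidente"):
    print(source, "->", assimilate(source), "/", assimilate(source, voice_s=True))
```

Because the two steps are separate functions, the variation across idiolects described above could be approximated by applying only one of them, or by restricting raising to selected vowels.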

Words with more than one medial vowel have different phonetic realizations depending on their environment and frequency of use. The less frequent a word is in everyday speech (i.e. the more external to basic vocabulary), the less assimilated it is to Quechua phonology. A further factor influencing phonological integration is the speaker’s level of bilingualism. From this perspective the three realizations of the Spanish loanword in (2) can be correlated to three increasing levels of bilingualism, with the first realization corresponding to an incipient bilingual, the second to a subordinate bilingual, and the third (unassimilated) to a coordinate bilingual.

[8] This non-Quechua word might also be related to Guambiano /palitï/ ‘louse’. The other Barbacoan languages do not yield similar forms.

The phonological adaptation of Spanish consonants is less frequent. One of the few consonant changes concerns the velarization of the labiodental fricative /f/, as illustrated in the following examples:

(3) [xi!u] < Spanish /fie!o/ ‘(piece of) iron’

(4) [xurkita] < Spanish /forketa/ ‘pitchfork’

Notice that both word forms reflect a typical Spanish American pronunciation and contrast with their Peninsular equivalents hierro [ye!o] and horqueta [orketa], which do not involve consonant onsets. In turn, the presence of a velar onset in the following loanword, originally lacking a consonant onset, suggests that it was borrowed in an earlier phonological stage of the source language:

(5) [xazinda] < old Spanish /fasienda/ ‘estate’

The loanword [xazinda] in (5) resembles the sixteenth-century pronunciation of contemporary Spanish hacienda ‘estate’. Accordingly, the velarization illustrated in (5) results from the phonological adaptation of an old Spanish word form and not from dialectal pronunciation as in the previous examples. Notice also the sonorization of the intervocalic sibilant in (5). The same sonorization is attested in the following word:

(6) [xui"u] < Spanish juicio /xwisio/ ‘judgment’

Another process of loanword assimilation is metathesis. The nature of this process is not only phonological but also morphological in so far as it affects the syllable structure of loanwords. The order of syllables changes in some cases while syllables are replaced or simply deleted in others. Consider the following case of syllable deletion:

(7) tempora < Spanish temporada ‘season’

In a few other cases metathesis affects not the syllable proper but only a particular feature. This is the case of (8), where the palatality of /r/ goes to /n/.

(8) sañora < Spanish zanahoria /sanaoria/ ‘carrot’

The morpho-phonological integration of loanwords may involve semantic changes too. Accordingly, certain nouns and verbs are borrowed in the guise of other nouns and verbs but with different meanings:

(9) rifuirso ‘effort’ < Spanish refuerzo ‘reinforcement’; compare esfuerzo ‘effort’

Verbs are particularly prone to morpho-phonological changes whereas nouns, adjectives and adverbs are less so. The integration of Spanish verbs in Imbabura Quechua involves the dropping of inflectional endings. The resulting verbal root becomes the base form to which Quechua verbal morphology is added. The following example illustrates this process for the verb volar ‘to fly’. The raising of the stem vowel occurs also in this case.

(10) bula-na < bula- < Spanish vola-r ‘fly-INF’
     fly-INF
     ‘to fly’

Once adapted to Quechua morpho-phonology, loan verbs behave exactly as any other verb. In a few cases verbs are derived directly from loan nouns by simply adding the infinitive marker. Consider the following example created on a loan basis:

(11) kaballu-na < kaballu < Spanish caballo ‘horse’
     ride-INF
     ‘to ride (a horse)’

Spanish nouns and adjectives are sometimes borrowed along with their plural and gender markers (frozen borrowing). This is illustrated in (12) and (13) below.

(12) barbas < Spanish barba-s ‘beard-PL’

(13) awila < Spanish abuel-a ‘grandparent-FEM’

The borrowing of roots along with bound morphemes does not imply the effective borrowing of the latter, because Spanish bound morphemes do not occur in native Quechua forms. Another case of frozen borrowing is the occurrence of Spanish gender markers in loan adjectives. Here are two examples:

(14) santu < Spanish sant-o ‘holy-MASC’
     surdu < Spanish sord-o ‘deaf-MASC’

A unique type of frozen borrowing involves Spanish words and Quechua particles. This is the case of nakrina ‘to doubt’, in which the Quechua negative form (ma)na is prefixed to the borrowed verb root kri- ‘to believe’ (from Spanish creer) and followed by the Quechua infinitive -na. The main characteristic of these loanblends is that their original constituents cannot be detached, modified or otherwise subjected to derivational or inflectional mechanisms. The same feature is shared by phrasal borrowings, i.e. phrasal constructions created on a loan basis such as (15).

(15) afila-na      rumi
     sharpen-INF   stone
     ‘whetstone’ (Spanish piedra de afilar)

In this case no adjectival modifier can be inserted between afilana and rumi. Should we like to modify this compound with an adjective such as jatun ‘big’, we would have to put the latter immediately before afilana. This means that a phrase such as *afilana jatun rumi is ungrammatical because the adjective splits the compound in two.
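The verb-integration steps illustrated in (10) and (11) earlier in this section (drop the Spanish infinitive ending, raise the stem vowels, add the Quechua infinitive -na) can be summarized in a short sketch. The Python below is an illustration only: the function names are hypothetical, the orthographic rewrite of v as b is an assumption taken from the spelling of the examples, and irregular adaptations such as kri- from creer are not covered.

```python
# Rewrites applied to the Spanish stem: vowel raising (e > i, o > u)
# plus the spelling change v > b seen in bulana and baka.
STEM_REWRITES = str.maketrans({"e": "i", "o": "u", "v": "b"})

QUECHUA_INFINITIVE = "na"


def loan_verb(spanish_infinitive: str) -> str:
    """Build a Quechua infinitive from a Spanish infinitive in -r."""
    if not spanish_infinitive.endswith("r"):
        raise ValueError("expected a Spanish infinitive ending in -r")
    stem = spanish_infinitive[:-1]          # vola-r -> vola-
    stem = stem.translate(STEM_REWRITES)    # vola-  -> bula-
    return stem + QUECHUA_INFINITIVE        # bula-  -> bulana


def denominal_verb(loan_noun: str) -> str:
    """Derive a verb from an already adapted loan noun, as in (11)."""
    return loan_noun + QUECHUA_INFINITIVE


for verb in ("volar", "medir", "afilar", "jurar"):
    print(verb, "->", loan_verb(verb))
print("kaballu ->", denominal_verb("kaballu"))
```

The forms produced for these inputs match loan verbs cited in the chapter (bulana, midina, afilana, jurana, kaballuna); other verbs may require adaptations the sketch does not model.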

6. Grammatical borrowing

The abundance of Spanish loanwords in Imbabura Quechua goes hand in hand with changes at the levels of the clause and the sentence. Even if syntactic developments are not necessarily explained by lexical borrowing, the co-occurrence of loanwords and syntactic calquing suggests a close relation between both phenomena. The outcomes of grammatical borrowing in Imbabura Quechua are many and varied. A comprehensive study of this phenomenon has been presented elsewhere (Gómez Rendón 2007). Three contact-induced changes are worthy of notice. The first is the replacement of embedded nominalized constructions with subordinate clauses that use loan connectives, including relativizers (e.g. que ‘that’, lo que ‘that which’) and conjunctions (e.g. purki ‘because’, si ‘if’). The second is the occurrence of SVO word order in declarative sentences and the use of non-verbal predicative constructions with copulas. The third is the shift from relative clause-head to head-relative clause order, with Quechua interrogative pronouns used as relative markers.

These and other contact-induced changes have modified and continue to modify the typological character of Imbabura Quechua as a typical agglutinative, verb-final language that uses clause embedding instead of clause subordination through connectives. In general terms, the combined effects of lexical and grammatical borrowing have made Imbabura Quechua a Hispanicized variety different from conservative dialects. At the same time, these changes have made the language extremely adaptive to new communicative settings.

7. Conclusion

Quechua has been in contact with different languages since its Chinchay variety entered Imbabura as a regional lingua franca in the fifteenth century. Four distinct contact languages can be distinguished, each associated with a specific period of time: Cara (pre-Inca, Inca and early Colonial); Peruvian Quechua (Inca); Barbacoan languages (pre-Inca, Inca, early Colonial); and Spanish (early Colonial to the present). While all of these situations left imprints on the lexicon, the influence from Spanish has been by far the largest. Four centuries of intense long-term contact with the European language have led to borrowing in the lexicon and the grammar. The higher levels of bilingualism among Imbabura Quechua speakers and their permissiveness to language mixing have further encouraged Spanish borrowing. It remains to explore to what extent the present-day configuration of Imbabura Quechua is due to the contribution of languages other than Spanish, in particular pre-Inca; and to what extent the non-indigenous origin of Imbabura Quechua as a pidginized variety turned into the first language of an ethnic group made it permeable to language mixing.

References

Adelaar, Willem with Muysken, Pieter. 2004. The Languages of the Andes. Cambridge: Cambridge University Press.
Büttner, Thomas. 1993. Uso del quichua y del castellano en la Sierra ecuatoriana. Quito: Ediciones Abya Yala.
Caillavet, Chantal. 1981a. La sal de Otavalo-Ecuador: Continuidades indígenas y rupturas coloniales. Revista Sarance 9:47–89. Otavalo: Instituto Otavaleño de Antropología.
Caillavet, Chantal. 1981b. Etnohistoria ecuatoriana: Nuevos datos sobre el Otavalo prehispánico. Revista Cultura 4(11):109–127. Quito: Banco Central del Ecuador.
Caillavet, Chantal. 2000. Etnias del Norte: Etnohistoria e historia del Ecuador. Quito: Ediciones Abya Yala & IFEA.
Cerrón-Palomino, Rodolfo. 1987. Lingüística Quechua. (Biblioteca de la Tradición Andina 10). Cusco: Centro de Estudios Rurales Andinos Bartolomé de las Casas.
Cole, Peter. 1982. Imbabura Quechua. Amsterdam: North-Holland Publishing Company.
Cordero, Luis. 1992. Diccionario Quichua-Castellano y Castellano-Quichua [Quichua-Spanish and Spanish-Quichua dictionary]. Quito: Corporación Editora Nacional.
Cusihuamán Gutiérrez, Antonio. 1976. Diccionario Quechua Cuzco-Collao. Lima: Ministerio de Educación & Instituto de Estudios Peruanos.
de Aguilar, Jerónimo. 1991 [1582]. Relación hecha por mí, Fray Gerónimo de Aguilar, de la Orden de Nuestra Señora de las Mercedes Redención de Cautivos, de la Doctrina y pueblo de Caguasquí y Quilca, que doctrino y tengo a mi cargo, en cumplimiento de lo que por Su Magestad se me manda y en su nombre, el Muy Ilustrísimo Señor Licenciado Francisco de Auncibay, Oidor de la Real Audiencia de Quito. In Ponce Leiva, Pilar (ed.), Relaciones histórico geográficas de la Audiencia de Quito (siglo XVI-XIX), I, 415–418. Madrid: Consejo Superior de Investigaciones Científicas.
de Borja, Antonio. 1991 [1591]. Relación en suma de la doctrina y beneficio de Pimampiro y de las cosas notables que en ella hay, de la cual es beneficiado el Padre Antonio Borja. In Ponce Leiva, Pilar (ed.), Relaciones histórico geográficas de la Audiencia de Quito (siglo XVI-XIX), I, 480–488. Madrid: Consejo Superior de Investigaciones Científicas.

de Cieza de León, Pedro. 1984 [1553]. Crónica del Perú: Primera parte. Lima: Fondo Editorial de la Pontificia Universidad Católica del Perú.
de Paz Ponce de León, Sancho. 1991 [1582]. Relación y Descripción de los pueblos del Partido de Otavalo. In Ponce Leiva, Pilar (ed.), Relaciones histórico-geográficas de la Audiencia de Quito (siglos XVI-XIX), Vol. 1, 359–371. Madrid: Consejo Superior de Investigaciones Científicas.
Espinosa Soriano, Waldemar. 1988a. Carangues y Cayambes. Siglos XV y XVI: El testimonio de la etnohistoria. Otavalo: Instituto Otavaleño de Antropología.
Espinosa Soriano, Waldemar. 1988b. Etnohistoria ecuatoriana: Estudios y documentos. Quito: Ediciones Abya Yala.
Fauchois, Anne. 1988. El quichua serrano frente a la comunicación moderna. Quito: Proyecto EBI & Ediciones Abya Yala.
Garcés, Fernando. 1999. Cuatro Textos Coloniales del Quichua de la ‘Provincia de Quito’. Quito: EBI-GTZ.
Gómez Rendón, Jorge. 2005. La media lengua de Imbabura. In Muysken, Pieter & Olbertz, Hella (eds.), Encuentros y conflictos: Bilingüismo y contacto de lenguas en el mundo andino, 39–57. Madrid: Vervuert Iberoamericana.
Gómez Rendón, Jorge. 2006. Condicionamientos tipológicos en los préstamos léxicos del castellano: el caso del quichua de Imbabura. Actas del XIV Congreso del ALFAL–2005. Monterrey: ALFAL.
Gómez Rendón, Jorge. 2007. Grammatical borrowing in Imbabura Quichua (Ecuador). In Matras, Yaron & Sakel, Jeannette (eds.), Grammatical borrowing in cross-linguistic perspective, 481–521. Berlin: Mouton de Gruyter.
Gómez Rendón, Jorge. 2008a. Typological and social constraints on language contact. Amsterdam: ACLC, University of Amsterdam Ph.D. dissertation.
Gómez Rendón, Jorge. 2008b. Mestizaje lingüístico en los Andes. Génesis y Estructura de una Lengua Mixta. Quito: Editorial Abya Yala.
Haboud de Ortega, Marleen & Montaluisa Chasiquiza, Luis & Muenala Pineda, Fabián & Viteri Gualinga, Froilán. 1982. Caimi Ñucanchic Shimiyuc Panca. Quito: Ministerio de Educación y Cultura & Pontificia Universidad Católica del Ecuador.
Haboud, Marleen. 1998. Quichua y Castellano en los Andes Ecuatorianos: Los efectos de un contacto prolongado. Quito: Ediciones Abya Yala.
Itier, Cesar. 1991. Lengua general y comunicación escrita: cinco cartas en Quechua de Cotahuasi – 1616. Revista Andina 9:1, 65–107. Cuzco: Centro Bartolomé de Las Casas.
Jijón y Caamaño, Jacinto. 1940–1945. El Ecuador Interandino y Occidental. 4 vols. Quito: Editorial Ecuatoriana.
Jiménez de la Espada, Marcos. 1965 [1586]. Relaciones geográficas de Indias: Perú. 3 vols. Biblioteca de Autores Españoles. 183–5. Madrid: Atlas.
Mannheim, Bruce. 1991. The Language of the Inka since the European Invasion. Austin: University of Texas Press.

Moore, Bruce R. 1966. Diccionario Castellano-Colorado Colorado-Castellano [Spanish-Colorado Colorado-Spanish dictionary]. Quito: Instituto Lingüístico de Verano-ILV.
Muysken, Pieter. 2000. Semantic transparency in Lowland Ecuadorian Quechua morphosyntax. Linguistics 39, 5:873–988.
Muysken, Pieter. 2009. Gradual Restructuring in Ecuadorian Quichua. In Selbach, Rachael & Cardoso, Hugo C. & van den Berg, Margot (eds.), Gradual Creolization: Studies celebrating Jacques Arends (Creole Language Library 24), 77–100. Amsterdam: John Benjamins.
Oberem, Udo & Hartmann, Roswith. 1971. Quechua Texte aus Ost-Ecuador [Quechua texts from East Ecuador]. Anthropos 66:673–718.
Proyecto de Educación Bilingüe Intercultural. 1999. Cuatro textos coloniales del Quichua de la ‘Provincia de Quito’. Estudio Introductorio de Fernando Garcés. Quito: Proyecto EBI & Ministerio de Educación y Cultura.
Rodríguez, Andrés. 1991 [1582]. Relación hecha por el Muy Reverendo Padre Fray Andrés Rodriguez, de la Orden de Nuestra Señora Santa María de las Mercedes Redención de Cautivos, de lo que en este pueblo de Lita hay. In Pilar Ponce Leiva (ed.), Relaciones histórico geográficas de la Audiencia de Quito (siglo XVI-XIX), Vol. 1, 413–415. Madrid: Consejo Superior de Investigaciones Científicas.
Salomon, Frank & Grosboll, Sue. 1986. Names and peoples in Incaic Quito: Retrieving undocumented historic processes through anthroponymy and statistics. American Anthropologist 88:387–99.
Stark, Louisa & Carpenter L. K. with Concha, M. A. & Conterón Córdova, C. A. 1973. El Quichua de Imbabura: Una gramática pedagógica. Otavalo: Instituto Otavaleño Nacional de Antropología e Historia.
Stark, Louisa & Muysken, Pieter. 1977. Diccionario Español-Quichua Español. (Publicaciones de los Museos del Banco Central del Ecuador). Guayaquil: Banco Central del Ecuador.
Stolz, Christel & Stolz, Thomas. 1996. Funktionswortentlehnung in Mesoamerika: Spanisch-amerindischer Sprachkontakt [Borrowing of function words in Mesoamerica: Spanish-Amerindian language contact]. Sprachtypologie und Universalienforschung 49(1):86–123.
Stolz, Christel & Stolz, Thomas. 1997. Universelle Hispanismen? Von Manila über Lima bis Mexiko und zurück: Muster bei der Entlehnung spanischer Funktionswörter in die indigenen Sprachen Amerikas und Austronesiens [Universal hispanisms? From Manila via Lima to Mexico: Patterns in the borrowing of Spanish function words to the indigenous languages of America and Austronesia]. Orbis 39:1–77.
Torero Fernández de Córdova, Alfredo. 1964. Los dialectos quechuas [The Quechua dialects]. Anales Científicos de la Universidad Agraria 2(4):446–478.
Torero Fernández de Córdova, Alfredo. 1974. El quechua y la historia social andina. Lima: Universidad Ricardo Palma.

37. Loanwords in Imbabura Quechua

965

Torero Fernández de Córdova, Alfredo. 1983. La familia lingüística quechua. In Pottier, Bernard (ed.), América Latina en sus lenguas indígenas, 61–192. Caracas: Unesco & Monte Avila. Torero Fernández de Córdova, Alfredo. 1984–1985. El comercio lejano y la difusión del quechua: El caso del Ecuador. Revista Andina 6, 7:367–402, 107–114. Cusco: Centro de Estudios Rurales Andinos Bartolomé de las Casas. Torero Fernández de Córdova, Alfredo. 2002. Idiomas de los Andes. Lima: Editorial Horizonte e Instituto Francés de Estudios Andinos. Torres Fernández de Córdova, Glauco. 2002. Lexicon Etnolectológico del Quichua Andino. 3 vols. Cuenca: Impresora Rocafuerte.

Loanword Appendix (*entry not in Loanword Typology meaning list)

Cuzco Quechua (Peruvian)
fuyu ‘cloud’
waña ‘mosquito’
chafsina ‘to shake’
mullapa ‘bunch, knot’
allchi ‘grandchild’ (from Aymara allchhi ‘grandson’)

Quechua I (Peruvian)
pikpiga* ‘Andean owl’
shigra ‘netbag’

A Pre-Inca language (Cara)
kuytsa ‘girl’
amfana ‘to yawn’
amuklla ‘soft’
pilis ‘body louse’
zunfa* ‘singing bird’
zunfu* ‘crawling plant’
chugunda* ‘dark-red berry’
pigala* ‘Andean weed’
kintsilgu* ‘poisonous berry’

Barbacoan languages
tupan ‘bat’ (Tsafiki)
tazin ‘nest’ (Tsafiki)
puzun ‘stomach’ (Awa Pit)

Spanish
luziru ‘star’
rilampagu ‘lightning’
rayu ‘bolt of lightning’
micha ‘wick’
bichi ‘pan’
fúsfuru ‘match’
karbun ‘charcoal’
jinti ‘people’
kazarana ‘to get married’
kazamintu ‘marriage’
dibursyu ‘divorce’
tayta ‘father’
millisus ‘twins’
awilu ‘grandfather’
awila ‘grandmother’
ñitu ‘grandson’
ñita ‘granddaughter’
tiyu ‘uncle’
tiya ‘aunt’
subrinu ‘nephew’
subrina ‘niece’
primu ‘cousin’
swidru ‘father-in-law’
swidra ‘mother-in-law’
yirnu ‘son-in-law’
padastru ‘stepfather’
madastra ‘stepmother’
intinadu ‘stepson’
intinada ‘stepdaughter’
byuda ‘widow’
byudu ‘widower’
animal ‘animal’
putru ‘pasture, colt’
istablu ‘stable, stall’
buyi ‘ox’
baka ‘cow’
shiku ‘small’
karneru ‘ram’
kaballu ‘horse’
bistya ‘beast’
burru ‘donkey’
mula ‘mule’
gallu ‘rooster’
pullu ‘chicken’
gansu ‘goose’
patu ‘duck’
garsa ‘heron’
luru ‘parrot’
paluma ‘dove’
kunu ‘rabbit’
rapusa ‘opossum’
piji ‘fish’
usu ‘bear’
munu ‘monkey’
elefanti ‘elephant’
sintupis ‘centipede’
alakrán ‘scorpion’
añara ‘spider’
abija ‘bee’
kulibra ‘snake’
dyablillu ‘grasshopper’
sapu ‘frog’
kukudrilu ‘crocodile’
turtuga ‘turtle’
barbas ‘beard’
kaspa ‘dandruff’
kustilla ‘rib’
piku ‘beak’
umbru ‘shoulder’
subaku ‘armpit’
didu ‘finger’
talun ‘heel’
alas ‘wings’
pichu ‘chest’
sintura ‘waist’
madri ‘womb’
sudana ‘to sweat’
lanzana ‘to throw’
lambina ‘to lick’
runkana ‘to snore’
awarina ‘to drown’
kalintura ‘fever’
rumas ‘a cold, a flu’
bininu ‘poison’
surdu ‘deaf’
chupana ‘to suck’
awana ‘to choke’
jurnu ‘oven’
platu ‘dish’
platillu ‘saucer’
kuchara ‘spoon’
almuza ‘lunch’
sina ‘dinner’
masa ‘dough’
masana ‘to knead’
igus ‘fig’
uba ‘grape’
nuwis ‘nut’
asituna ‘olive’
asiyti ‘oil’
lichi ‘milk’
kizu ‘cheese’
binu ‘wine’
serbisa ‘beer’
algudun ‘cotton’
seda ‘silk’
tilar ‘loom’
awuja ‘needle’
tiñina ‘to dye’
shwiter ‘coat’
kamiza ‘shirt’
pantalun ‘trousers’
midias ‘socks’
butas ‘boots’
bulsiku ‘pocket’
butun ‘button’
surtijas ‘ring’
sarsillu ‘earring’
sinta ‘band’
panilu ‘handkerchief’
twalla ‘towel’
sipillu ‘brush’
jabun ‘soap’
ispiju ‘mirror’
wirta ‘garden’
tuldu ‘tent’
llabi ‘key’
bintana ‘window’
kama ‘bed’
miza ‘table’
lampara ‘lamp’
bila ‘wax’
batiya ‘trough’
kumbrira ‘ridgepole’
biga ‘rafter’
tabla ‘board’
arku ‘arch’
albañil ‘mason’
ladrillu ‘brick’
adubi ‘adobe’
labradur ‘farmer’
sanja ‘ditch’
azadun ‘spade’
pala ‘shovel’
jurkita ‘pitchfork’
kuzichana ‘to mow’
usis ‘scythe’
kuzicha ‘harvest’
granu ‘grain’
trigu ‘wheat’
sibada ‘barley’
sintinu ‘rye’
abina ‘oats’
arrusa ‘rice’
pinu ‘pine’
tabaku ‘tobacco’
koku ‘coconut’
platanu ‘banana’
kamuti ‘sweet potato’
trabajana ‘to work’
zafana ‘to untie’
kadena ‘chain’
kuchillu ‘knife’
kuchilluna ‘to stab’
tijiras ‘scissors’
jacha ‘axe’
aswila ‘adze’
ajustana ‘to press’
jirraminta ‘tool’
karpinteru ‘carpenter’
sirruchu ‘saw’
martillu ‘hammer’
klabus ‘nails’
jirriru ‘blacksmith’
jirru ‘iron, copper’
kanastu ‘basket’
ishtira ‘mat’
sinsil ‘chisel’
tiñi ‘paint’
bulana ‘to fly’
lisyana ‘to limp’
manijana ‘to drive’
kaballuna ‘to ride a horse’
rwida ‘wheel’
yugu ‘yoke’
kanuwa ‘canoe’
dibina ‘to own’
debi ‘debt’
pagana ‘to pay’
kwinta ‘bill’
impwesto ‘tax’
alkilana ‘to hire’
ganana ‘to earn’
nigusyanti ‘trader’
tyinda ‘shop’
prisyu ‘price’
karu ‘expensive’
baratu ‘cheap’
repartina ‘to share’
pushtu ‘place’
muntunana ‘to pile up’
partina ‘to divide, to split’
filu ‘sharp’
midina ‘to measure’
brasa ‘fathom’
anchu ‘wide’
jundu ‘deep’
dirichu ‘right, straight’
krus ‘cross’
kwadradu ‘square’
kambyana ‘to trade’
timpu ‘time’
simpri ‘always’
uras ‘hours’
riluju ‘watch’
simana ‘week’
dumingo ‘Sunday’
lunis ‘Monday’
martis ‘Tuesday’
mircules ‘Wednesday’
juibes ‘Thursday’
birnes ‘Friday’
sabadu ‘Saturday’
gushtana ‘to like’
bulla ‘noise’
kulur ‘color’
asul ‘blue’
birdi ‘green’
sintina ‘to feel’
filulla ‘sharp edge’
alma ‘soul’
atribina ‘to dare’
kriyina ‘to believe’
intindina ‘to understand’
adibinana ‘to guess’
siguru ‘sure’
kawza ‘cause’
minishtina ‘need, duty’
o ‘or’
silbana ‘to whistle’
awllana ‘to howl’
ufrisina ‘to offer’
papil ‘paper’
flawta ‘flute’
tambur ‘drum’
trumpita ‘trumpet’
billa ‘town, village’
linderu ‘boundary’
amu ‘master’
sirbyenti ‘servant’
swiltu ‘loose’
bizinu ‘neighbor’
koshtumbri ‘custom’
ispada ‘sword’
iskupita ‘gun’
armadura ‘armor’
durri ‘tower’
trampa ‘trap’
ley ‘law’
tribunal ‘court’
jwizhu ‘judgment’
jwis ‘judge’
dimandanti ‘plaintiff’
tistigu ‘witness’
jurana ‘to swear’
juramintu ‘oath’
inusinti ‘innocent’
kashtigu ‘punishment’
multa ‘fine’
karsil ‘prison’
rilijyún ‘religion’
iglizya ‘temple’
altar ‘altar’
ufrinda ‘offering’
kultu ‘worship’
rizana ‘to pray’
tayta kura ‘priest’
santu ‘holy’
bindisina ‘to bless’
ayunana ‘to fast’
silu ‘heaven’
infirnu ‘hell’
brujyana ‘witch’
bruju ‘sorcerer’
dwindi ‘elf’
fantasma ‘ghost’
radyu ‘radio’
tilibisyún ‘television’
tilefunu ‘telephone’
mutu ‘motorcycle’
awtu ‘car’
abyun ‘airplane’
pila ‘battery’
uspital ‘hospital’
pastilla ‘pill’
indijsyún ‘injection’
lababu ‘sink’
iskusadu ‘toilet’
mantana ‘blanket’
gubyernu ‘government’
prisidinti ‘president’
ministru ‘minister’
pulisiya ‘police’
karta ‘letter’
banku ‘bank’
lata ‘can’
turnillu ‘screw’
butilla ‘bottle’
antyujus ‘glasses’
plastiku ‘plastic’
trwinu ‘thunder’
fabrika ‘factory’
almanaki ‘calendar’
pilíkula ‘movie’
muzika ‘music’
trin ‘train’
partida ‘certificate’
pitrulyu ‘petroleum’
mutur ‘motor’
makina ‘machine’
bisiklita ‘bicycle’
armadillu ‘armadillo’
ardilla ‘squirrel’
ataju ‘herd’
briya ‘tar’
kurtina ‘to tan’
tamal ‘tamale’
jamaka ‘hammock’
aru ‘ring’
bisya* ‘vetch’
liyón ‘lion’
tenedur ‘fork’
karritilla ‘cart’
kosa ‘thing’
kafé ‘coffee’
te ‘tea’
swidrus ‘parents-in-law’

Unknown origin
chita ‘goat/he-goat’

Chapter 38

Loanwords in Kali’na, a Cariban language of French Guiana*

Odile Renault-Lescure

1. The language and its speakers

The Kali’na (self-denomination Kali’na) are an indigenous group living in the northeast of South America, in the vast regions of the Guianas, speaking the Kali’na language. The Guianas Shield forms an “island” of approximately 1,800,000 km² bordered by the Amazon River, the Negro River, the Casiquiare Canal, the Orinoco River and the Atlantic Ocean. It is constituted by the five Guianas: Venezuelan Guiana, (formerly British) Guyana, Suriname (formerly Dutch Guiana), French Guiana, and the state of Amapá in Brazil.

The Kali’na represent a relatively important group with about 20,000 members distributed in these countries: 11,141 live in Venezuela, 3,000 in Guyana (Forte 2000), 3,000 in Suriname (Boven & Morroy 2000), about 4,000 in French Guiana (Renault-Lescure 2009), and 28 in Brazil (ISA 2008). The number of speakers of Kali’na is lower, as in all countries the Kali’na communities have faced language loss, but this varies enormously from one place to another in the same country (30% retention in Venezuela, 80% in Guyana, 50% in Suriname; there are no precise data for French Guiana, but the retention rate is high).

The Kali’na language consists of three main dialects: the Venezuelan dialect, the Mɨlato dialect in western Suriname, and the Tɨlewuyu dialect in eastern Suriname, French Guiana and Brazil. (Nothing is known about the Kali’na spoken in Guyana.) This chapter is based on the Tɨlewuyu dialect as spoken in French Guiana. In French Guiana, a sub-dialectal frontier divides the speakers into a Western and an Eastern group.

The Kali’na are spread over various places in the region: they live in the indigenous municipality of Awala-Yalimapo and partially in other western or eastern villages (Village Javouhey of Mana, Terre Rouge, Village Pierre, Espérance, Paddock, Prospérité, Bellevue-Yanou, Dégrad Savane, Village amérindien of Kourou, scattered settlements of Organabo), as well as in the western and eastern towns of Mana, Saint-Laurent, Cayenne and Kourou.

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Renault-Lescure, Odile. 2009. Kali’na vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1191 entries.


The Kali’na language, also known as Galibi in French Guiana, as Carib in Suriname and as Kari’ña in Venezuela, belongs to the Cariban language family. Kali’na was the first Cariban language to be described, as it was the first to come in contact with Europeans (Introduction à la langue des Galibis, written by the priest Pierre Pelleprat (1606–1667)). The first attempt to classify the Cariban languages was made by Salvatore Gillij (1721–1789) for the Venezuelan languages. To date, there is no accepted internal classification of the Cariban languages. They have been treated quite differently in the three most recent classifications (Gildea 1998). Gildea’s more recent work (2002) proposes three main branches: the Venezuelan branch, the Southern branch, and the North-Amazonian branch, plus some languages that do not belong to any of these branches, among them Kali’na.

Map 1: Geographical setting of Kali’na

French Guiana is a French administrative entity (an overseas department). The official language is French, but since 2001, the indigenous languages have been recognized as “Languages of France” defined as follows by the Délégation Générale à la langue française et aux langues de France (Ministry of Culture): “The Languages of France include the regional and minority languages that are traditionally spoken by French citizens on the territory of the Republic and are not official languages in any other country […] These criteria of definition are inspired by the European Charter for regional or minority languages”. Among the “Languages of France” in French Guiana, (French) Guianese Creole (or Guyanais) is the only “regional language”, while the other languages (Amerindian languages, creole languages of the Maroons, and Hmong) remain minority languages. This status does not confer any linguistic rights, but recognizes the linguistic diversity of the French nation and allows these languages to benefit from some advantages.

Kali’na has the traditional vernacular uses in the families and villages. It is spoken almost exclusively by the Kali’na themselves, and its knowledge by outsiders is very exceptional. Its uses extended rapidly in the last decades of the 20th century and the beginning of the 21st in various unusual domains, especially in the indigenous municipality of Awala-Yalimapo, which was created as a separate administrative entity in 1989. There, it has entered into political speeches, administrative interactions (management of the municipality, post office), and in the primary school where the pupils have received a partial instruction in Kali’na since 1998. The major religion, the Catholic Church, does not make use of Kali’na in its practices, but the growing influence of evangelical missions that translate the Bible into the vernacular languages is changing the linguistic landscape in religious contexts, especially in the villages where larger numbers of Surinamese live.

In the media, the language does not have a representative presence, but there are some local stations in the western part of the region where the people prefer to listen to a popular Surinamese Kali’na programme. No written media exist, other than some texts in a periodical published by some Kali’na people of Kourou.

After a four-year process involving workshops bringing together native speakers, linguists and an anthropologist, an orthography was adopted by the community in 1997, giving rise to a document known as “Déclaration de Bellevue”. This writing system is used in the school for elaborating pedagogical materials. Some individuals are beginning to propose bilingual pieces of literature (myths, tales or poems) for bilingual publications (Renault-Lescure 2007). This orthography is used in the present work.
But the favorite cultural domain for using and spreading the language is music. Various musical associations exist, influenced by the creativity of the Kali’na of Suriname who mix traditional rhythms and songs, and contemporary music of the Caribbean.

Recent archeological research shows that the Amerindian presence in the Guianas and the Amazon region is very ancient. Around 2000 BCE we see the beginning of an increasing number of settlements of sedentary farmers and the dawn of complex societies, testified by elaborate ceramics. Other waves of different populations, coming from the east (500 CE) and later (1000 CE) from the south occupied the Guiana coast until the arrival of the Europeans (Rostain 2009+, in Renault-Lescure & Goury 2009+).

When Europeans initially established contact with the “savage coast” in the sixteenth century to trade goods, the territory that the Kali’na, estimated at 5,500 (Grenand & Grenand 1979), shared with other peoples eventually became French Guiana and Dutch Guiana. At the end of the 17th century, weakened by new diseases and epidemics brought by more recent immigrants, the Kali’na began to forge new relationships with the now dominant colonial occupiers. Notwithstanding their demographic decline, the Kali’na represented the most important Amerindian population between Cayenne and Paramaribo. Furthermore, their language became the lingua franca among Amerindians of different ethnic groups and the primary means of communication with the Europeans along much of the coast.

In the mid-nineteenth century, the number of Tilewuyu Kali’na living along the Maroni River dwindled to a few hundred. They lived in plots of land between French Guiana and Dutch Guiana that the Europeans had not yet exploited. However, colonization soon extended into these western areas and the Kali’na increasingly found themselves engulfed in a largely foreign-influenced environment. Their family-based economy became dependent on activities engendered by the colonial situation while their collective mobility became increasingly restricted. Their relations with other Amerindian groups grew more sporadic until, by the end of the 19th century, they found themselves interacting almost exclusively with foreigners and foreign-influenced populations: Europeans, Creoles, and Maroons from the Lower Maroni. Thus, economic domination and a social order stratified into “civilized” and “primitive” peoples defined the colonial environment surrounding the Kali’na.

2. Sources of data

Lexical data for this study were mainly taken from the earlier systematic work on loanwords in Kali’na, Evolution lexicale du galibi, langue caribe de Guyane by Renault-Lescure (1985). The lexicon contains 347 loanwords. This work was based on various sources.

Because of its coastal distribution, Kali’na was in contact from an early date with European languages. It attracted the attention of the colonizers, particularly missionaries, who wrote the first sketches of its grammar and lexicon. The result of this attention is that documentation of the language dates as far back as the 17th century. The first attempts to write grammatical sketches of Kali’na with lists of words and phrases date from this period.

The first work is Biet’s (1661) Les Galibis: Tableau véritable de leurs moeurs avec un vocabulaire de leur langue, an account of his journey in Guiana from 1652 to 1653. It contains a chapter De la langue galibi, followed by a lexicon with 400 words. In actual fact, it describes a lingua franca used along the coast for evangelization and trade needs (Taylor & Hoff 1980; Renault-Lescure 1984), but while the phrases are a description of a kind of “Pidgin Galibi”, the word list is still useful. The second work is Introduction à la langue des Galibis, Sauvages de la Terre ferme de l’Amérique méridionale written by Pelleprat, a missionary based on the Guarapiche river (in Venezuela today) from 1653 to 1654, a document of thirty-six pages, of which nineteen are devoted to grammatical observations and seventeen to a thematic lexicon. The other material consists of word lists collected by naturalists and traders. All these works were compiled by an agronomist (De la Salle de l’Estaing 1763) in a Dictionnaire galibi that does not give any new information.

The next useful production appears with Les Etudes linguistiques Caraïbes by de Goeje (1909). This important work adds new information based on de Goeje’s own data from the eastern Kali’na variant and data compiled from colonial documents written in languages other than French (Spanish, English, German, Dutch). It brings to our knowledge the first loanwords from the period called “rise of creoles”. Two main works from the 20th century, the Encyclopaedie der Karaïben by Ahlbrinck (1931), and The Carib language: phonology, morphonology, morphology, texts and word index by Hoff (1968) provide some additional information on loanwords.

The other data used in Renault-Lescure (1985) come from my own fieldwork in the years 1979–1981. During my recent research (from 1998), I added new data and new ideas based on current research in the area of language contact.

3. Contact situations

3.1. Amerindian language contacts

Contact is ancient in the languages of the Guianas. Recent work in the field of ethno-history points to the complexity of the contact situation, due to the presence, the overlap and the mobility of numerous groups populating the area. Grenand & Grenand (1979) point out that there were more than thirty different indigenous ethnic groups encountered by the first Europeans that landed on the Guianese coast during the 16th century. The landscape of the pre-Columbian indigenous Guianas comprises ethnic groups in intense contact, with languages belonging to three main linguistic families, Cariban, Arawakan and Tupi-Guarani. The lexicons of these unrelated languages share a lot of common elements, which represent, depending on the situation, mutual borrowings whose ancestry is generally difficult to trace. Even though we can clearly trace the Tupi-Guarani origin of the word alawe ‘beetle (Periplaneta americana)’, or the Cariban origin of the word awala ‘palm-tree (Astrocaryum vulgare Mart.)’, it is not possible to trace the origin of a lot of common words in Amazonian languages, such as kuwata ‘spidermonkey (Ateles paniscus L.)’ (Renault-Lescure 2005).

3.2. First contacts with European languages

The arrival of the Europeans during the 17th century and the commercial links left their mark. The Kali’na referred to contact goods with the words of the first Europeans they met. Borrowings are primarily from Spanish and Portuguese, and less commonly from English, Dutch and French. Despite the rapid settling of the French, there are surprisingly few early borrowings from French. This first stock of Kali’na borrowings was diffused and used all along the coasts of the Guianas, from the mouth of the Orinoco to Approuague (in eastern Guyana), and later spread southwards along the rivers. The indigenous southern languages in turn borrowed these words from Kali’na, or from the Kali’na-based lingua franca (e.g. Spanish arcabuz > Kali’na alakaposa ‘rifle’ > Wayana alakapuha, Wayampi alakausa).


During this period there was no situation of general bilingualism. Exchanges were carried out with the help of interpreters and through the use of a lingua franca with a Kali’na lexicon (Taylor & Hoff 1980; Renault-Lescure 1984).

The 18th century was a transitional period during which the Dutch and French colonies established themselves more firmly, creating the economic and political setting in which the Kali’na live to this day. Contacts with European languages decreased, being replaced by contacts with languages that arose in the colonies with the slave trade and the construction of Creole and Maroon societies.

3.3. Contact with creole languages

3.3.1. English-based creoles

The numerous lexical borrowings from English-based creole languages in the 19th century do not have their origin in the first contacts with Maroon groups, the Aluku, speakers of the English-based creole Nengee, who had arrived some time before in the lower Maroni area. Instead, they show the privileged relationship of the Kali’na with the Dutch colony, whose contact language was Sranan, another English-based creole. Kali’na speakers were often bilingual in Sranan until the end of the 20th century as a result of their history as refugees in the Dutch colony in the 18th and 19th centuries, as well as strong trade relations across the border. Knowledge of Sranan was strengthened by the arrival of Surinamese refugees after the civil war in Suriname in the 1980s. Sranan became a second language for the Kali’na, and in the last decades of the 20th century, a third language after Guianese Creole. In certain western settlements, it has replaced the lost Kali’na.

3.3.2. French-based creoles

The economic and social recentralization of the Kali’na on the right bank of the Maroni and Mana into the French colony, which began in the middle of the 19th century, is visible through the appearance of borrowings from Guianese Creole (or Guyanais), a French-based creole used by the slaves and later, after the end of slavery in 1848, by a new Creole society. During this period and the 20th century, Guianese Creole (as well as other French-based creoles introduced during the gold rush) was largely used as a vehicular language in French Guiana. It played an important role in this usage among the Kali’na, who became bilingual. Some Eastern Kali’na of French Guiana, who were cut off from the Surinamese Kali’na, began to use Guianese Creole as a first language, with loss of their original language. Kali’na speakers began to replace the loanwords from Sranan with borrowings from Guianese Creole. This fact strengthened the dialect boundary between the western and the eastern varieties of the Tɨlewuyu Kali’na language of French Guiana.

(1)  western Kali’na    eastern Kali’na
     kelege             légliz             ‘church’
     siwitisoli         dilwil             ‘oil’
     kalden             musike             ‘mosquito-net’

3.4. New contact with French

In the fourth phase, Guiana underwent the unification of its administrative system as a French department, and the settlement of French institutions such as administrative representations and schools (the first Kali’na children went to school in 1945). As a consequence, contact with French became more intense, and French now tends to replace the creole languages as a vehicular language. Today, bilingualism with French is widespread among Kali’na speakers, and for most of them Guianese Creole has become the third language.

During the transition between the last periods, one observes a competition between the loanwords of creole origin and new borrowed forms of French origin. This transition frequently ended with the replacement by a French word. For example, ‘shovel’ was borrowed from Sranan as sikopu and from Guianese Creole as lapèl, but is now sometimes replaced by the French pelle. Thus some loanwords from the creole languages continue to be used, especially those from Sranan, which are older, more firmly integrated, and reinforced by immigration from Suriname, while others are disappearing; as the contact situations vary in time and space, there is no absolute rule. Nevertheless, French has been the most important language in contact with Kali’na in French Guiana since the second half of the 20th century.

It is worth noting that these language contacts are taking place in a context of wider multilingualism in French Guiana. Migrations of the late 20th century have led to an increasing number of speakers of Surinamese English-based creoles, spoken by the Maroons (Nengee, with its varieties Aluku, Ndyuka, Pamaka), as well as of other languages such as Haitian Creole, Brazilian Portuguese, Sranan, Dutch, and English from the other Guianas.


Table 1: Kali’na history of contacts (adapted from Rose & Renault-Lescure 2008)

Historical phases | Main contact language | Type of contact | Social contact situation
(1) Pre-Columbian time | Arawakan, Cariban, Tupi-Guarani, other Amerindian languages | direct and indirect, intense | ethnic alliances and wars/trade, celebration and marriage exchanges
(2) Early colonial time | Spanish, Portuguese, Dutch, French | direct and occasional | first contacts/alliances and war or slavery, technological and political changes; extinction of the majority of the indigenous languages
(3) rise of the creoles | creoles (Sranan and Guianese Creole) | some bilingualism | trade contacts/social networks with Creole populations
(4) “Francisation” | French | intense (widespread bilingualism) | political/government domination, schooling

4. Number and kinds of loanwords

Of the 1460 meanings on the Loanword Typology (LWT) list, 313 have no counterpart in Kali’na. Of the 1201 words in the Kali’na subdatabase, 103 represent sub-counterparts of LWT meanings, for example species of palm tree or of monkeys, and the subdatabase contains a number of words for common concepts of the Amazonian indigenous environment and life. Many other words represent super-counterparts of LWT meanings, such as nono ‘land; earth; soil’. 1005 Kali’na words show no evidence of borrowing, and the rest are perhaps (10), probably (9), or clearly borrowed (176). Out of this set of loanwords, 80 are from Sranan, 36 from Guianese Creole, 33 from French, 27 from Spanish, 9 from Spanish or Portuguese, 6 from Tupi-Guarani, 5 from Dutch, 2 from Portuguese, one is of presumed Arawakan origin, and one is from the Aluku dialect of Nengee (an English-based creole).

4.1. Loanwords compared
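The source-language figures quoted in Section 4 can be turned into relative shares, which makes it easier to see that no single language accounts for half of the listed loanwords. The following is a minimal Python sketch and not part of the original study: the names SOURCE_COUNTS, source_shares and format_report are hypothetical, and the denominator is simply the sum of the per-language figures as printed (200), which the sketch takes at face value without reconciling it with the 195 words counted as perhaps, probably or clearly borrowed.

```python
# Loanword counts by source language, copied from the figures quoted in Section 4.
# These are the printed totals, not values re-derived from the WOLD subdatabase.
SOURCE_COUNTS = {
    "Sranan": 80,
    "Guianese Creole": 36,
    "French": 33,
    "Spanish": 27,
    "Spanish or Portuguese": 9,
    "Tupi-Guarani": 6,
    "Dutch": 5,
    "Portuguese": 2,
    "Arawakan (presumed)": 1,
    "Nengee (Aluku dialect)": 1,
}


def source_shares(counts):
    """Return each source language's share of the summed counts (hypothetical helper)."""
    total = sum(counts.values())
    return {language: n / total for language, n in counts.items()}


def format_report(counts):
    """Build one text line per source language, largest contributor first."""
    shares = source_shares(counts)
    ranked = sorted(counts, key=counts.get, reverse=True)
    return [
        f"{language}: {counts[language]} ({shares[language]:.1%})"
        for language in ranked
    ]


if __name__ == "__main__":
    for line in format_report(SOURCE_COUNTS):
        print(line)
```

Run as a script, this prints one line per source language, starting with Sranan.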

A comparison of loanwords by source language confirms the socio-historical scenario as described for the colonial period. The lack of sources and the complexity of the situation in pre-Columbian times do not give us a solid basis for comparing Amerindian loanwords. Moreover, the greater part of the probable loanwords from these sources do not appear in the database, as they are part of a specific Amazonian lexicon with names of plants, animals and some basic vocabulary (Renault-Lescure 2005).

It is noteworthy that no contact language has an overwhelming presence in the borrowings. This fact may be correlated with the socio-linguistic situation in the first centuries of colonial times: sporadic direct or indirect contacts with European languages and the use of the Kali’na-based lingua franca as the primary means of communication with the Europeans. The demographic decline may have played a role too.

Loanwords from Spanish (or Portuguese), Dutch and French from the early colonial phase result from the first contacts with Europeans. As noted by Kloos (1971: 5): “The impression the Spanish seafarers made on Amerindian must have been tremendous. Although the Spanish after only relatively few voyages made room for English and Dutch discoverers and traders, the impact of the first years of contact is still clearly evident in many words of Spanish origin in the Carib language”. The narratives of the Kali’na about the first contact with White people bear out the brutal shock and the importance of goods such as iron tools. First, they became essential to their basic activities and created a subordination to Europeans. Secondly, the Kali’na were involved in the trade along commercial routes on the coast and into the interior of the Guianas.

Very numerous are the loanwords pertaining to the third phase (“rise of creoles”). The source languages of this period are Sranan (the most important language), and Guianese Creole. The quasi-absence of loanwords from Nengee (the English-based creole spoken by the Maroons) is interesting but in fact not surprising. The Maroons had a prolonged association with the Kali’na, sharing the same environment. As the Maroon cultures adopted some Amerindian methods in exploiting the natural resources, borrowing occurred in the direction from the Amerindian languages into Nengee.

The relatively greater importance of borrowing from Sranan started during the second half of the 19th century, a period characterized by a general increase of contact with the European and Creole populations of the French and Dutch colonies. New economic and social relations were established with two recently created small towns, Albina on the left bank of the Maroni and Saint-Laurent on the right bank, the latter established as a penitentiary. They became economic centers, with European products, attracting all the different groups of the Maroni River. Kloos (1971: 10) reports:

“[…] Albina is almost a miniature of Surinam’s plural society. It counts Chinese already since Kappler’s time [1848 the foundation of the settlement], East Indians or Hindoustani, Javanese, Creoles, Dutch and strongly acculturated Amerindians among its inhabitants. Languages of all the people mentioned are spoken, although the creole language (Negro English, called Sranan by the sophisticated) is the lingua franca. The town is frequented by the Djuka and other Bush negroes, mainly from the upper and middle course of the Maroni, and by Waiyana from the upper reaches of the Maroni, the Litani and the Palumeu. Both groups have built their traditional houses in Albina. An Arowak village […] forms part of the town […] and tribal Caribs are daily visitors.[…] St Laurent, situated opposite Albina, shows the same picture, albeit on a larger scale.”

Because of Saint-Laurent’s status as a penitentiary, and because Albina was more easily connected to Paramaribo and its European products, Albina was the town more frequented by the Kali’na. It is worth remembering that a significant number of the Kali’na of the western part of French Guiana moved to Dutch Guiana during the period of the penitentiary. This situation clearly explains the numerous loanwords of Sranan origin.

The loanwords from Guianese Creole correspond to a new colonial period beginning in the last decades of the 19th century. The transformation of the western part of the region is due to three major phenomena. The first is the abolition of slavery (1848) and the progressive construction of a Creole society that spoke a French-based creole. The second is the gold rush, especially on the Mana river, and the creation of the town of Mana that became an active commercial center. It attracted various populations, Creoles, Maroons, Chinese, Vietnamese, East Indians, mixed Brazilians, Arabs, each of them speaking its own language and using the French-based creole for communicating (Collomb & Tiouka 2000). The third phenomenon playing a major role in the growing influence of Guianese Creole is the new social and economic settlement of the Kali’na people in the western part of French Guiana. Some villages came back from the Dutch colony, and the Kali’na villages already established in this area left their isolation from the rest of the colony. The Kali’na population established intense social relations with the Guianese Creoles later on, comparable in a certain way to the relations with the creoles of Suriname that were described earlier.

Loanwords from French correspond to the fourth phase (“new contact with French”). In 1946, the colony of “Guyane” became an overseas department, i.e. a territorial and administrative entity that is an integral part of the French nation.
“The transformation of French Guiana into a Department in 1946 removed the colonial distinction between ‘French citizen’ and ‘Indigenous people’, but it was only in the beginning of the 1960s that the state has allotted French citizenship to the Amerindians. This was the first step towards a policy of cultural and social integration, by grouping previously isolated families and re-settling them into large villages, and by schooling of children within the framework of boarding schools managed by the Catholic clergy. Their integration into an expanded France had important economic and social effects by granting Amerindians the right to access the welfare system, which would eventually represent for these resettled families a new and important source of income, with consequences for Amerindian economies and social systems.” (Collomb 2006: 200)

As a consequence, contact with French has become more intense, and French now tends to replace the creoles as the vehicular language. Contacts with the creoles are decreasing, and speakers’ attitudes towards these languages are changing too. Mastering French is more or less seen as a key to social success (for work, studies, and involvement in the political and administrative structures). These tendencies are further differentiated by changes in the vehicular role assigned to the creole languages. Guianese Creole, which was widely attested in this function in all parts of French Guiana, has begun to lose it. This is due to two reasons: the demographic decrease of the Creole population and the arrival of numerous speakers of English-based creoles on the western border of the country. In the region of Saint-Laurent, these creoles are replacing Guianese Creole as the vehicular language. So the contact situation is changing, influenced by new migrations.

4.2. Loanwords and semantic word class

Table 2 presents a classification of loanwords according to their lexical class.

Table 2: Loanwords in Kali’na by semantic word class (percentages)

                 Sranan  French  Guianese  Spanish  Portuguese  Tupi-    Dutch  Aluku  Unidentified  Total      Non-
                                 French                         Guaraní                              loanwords  loanwords
                                 Creole
Nouns               7.5     3.9       3.9      3.7         0.8      0.6    0.3    0.2           0.3       21.2       78.8
Verbs               2.4       -       0.8        -           -        -      -      -           0.4        3.6       96.4
Adjectives            -       -         -        -           -        -      -      -             -        0.0      100.0
Adverbs               -       -         -        -           -        -      -      -             -        0.0      100.0
Function words      2.3     3.5       1.2        -           -        -      -      -             -        7.0       93.0
All words           5.2     2.6       2.6      2.2         0.5      0.4    0.2    0.1           0.3       14.1       85.9
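The figures of Table 2 can be cross-checked arithmetically: within each word class, the donor-language shares should add up to the "Total loanwords" figure. The following Python sketch performs that check. It assumes that cells left blank in the table count as zero and that each value belongs to the cell given in the dictionary below; the names `DONOR_SHARES`, `summed_shares` and `mismatches` are illustrative and not part of the original study. Since the published percentages are rounded to one decimal place, exact agreement cannot be expected in general, which is why a tolerance is used.

```python
# Donor-language shares per word class, transcribed from Table 2 (percentages).
# Assumption: cells that are blank in the table are treated as 0.
DONOR_SHARES = {
    "Sranan": {"Nouns": 7.5, "Verbs": 2.4, "Function words": 2.3, "All words": 5.2},
    "French": {"Nouns": 3.9, "Function words": 3.5, "All words": 2.6},
    "Guianese French Creole": {"Nouns": 3.9, "Verbs": 0.8, "Function words": 1.2, "All words": 2.6},
    "Spanish": {"Nouns": 3.7, "All words": 2.2},
    "Portuguese": {"Nouns": 0.8, "All words": 0.5},
    "Tupi-Guaraní": {"Nouns": 0.6, "All words": 0.4},
    "Dutch": {"Nouns": 0.3, "All words": 0.2},
    "Aluku": {"Nouns": 0.2, "All words": 0.1},
    "Unidentified": {"Nouns": 0.3, "Verbs": 0.4, "All words": 0.3},
}

# "Total loanwords" row of Table 2 (Adjectives and Adverbs are 0.0 and omitted).
TOTAL_LOANWORDS = {"Nouns": 21.2, "Verbs": 3.6, "Function words": 7.0, "All words": 14.1}


def summed_shares(shares):
    """Add up the donor-language percentages for every word class."""
    sums = {}
    for cells in shares.values():
        for word_class, pct in cells.items():
            # Round after each step so float noise does not accumulate.
            sums[word_class] = round(sums.get(word_class, 0.0) + pct, 1)
    return sums


def mismatches(shares, totals, tolerance=0.1):
    """Return the word classes whose summed shares differ from the stated total."""
    sums = summed_shares(shares)
    return {
        word_class: (sums.get(word_class, 0.0), total)
        for word_class, total in totals.items()
        if abs(sums.get(word_class, 0.0) - total) > tolerance
    }
```

Calling `mismatches(DONOR_SHARES, TOTAL_LOANWORDS)` returns an empty dictionary when every column is consistent with its stated total.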

Kali’na differentiates only four major word classes: nouns, verbs, adverbs (which play the role of qualifiers once nominalized) and function words. Words switch easily from one class to another through derivational processes. Nouns are by far the largest semantic word class represented among the loanwords, followed by function words and verbs. Neither adverbs nor adjectives appear among the loanwords in the database. Borrowed nouns, regardless of their source language and the period of borrowing, are integrated into the class of Kali’na nouns.

(2) a. diletɨ ‘milk’ < Guianese Creole
       oto ‘car’ < French
    b. alakaposa ‘gun’ < Spanish
       pusipusi ‘cat’ < Sranan

The vast majority of them fall into the sub-class of alienably possessed nouns (cf. 2a), while only a few fall into the sub-class of non-possessed nouns which have a suppletive form in the possessive construction (cf. 2b):

(3) paila ‘bow’, alakaposa ‘gun’
    Ø-ɨlapalɨ
    1-weapon.POSS
    ‘my weapon, my gun’

    kulewako ‘parrot’, pusipusi ‘cat’
    y-ekɨ
    1-pet.POSS
    ‘my pet, my parrot, my cat’

There are no instances of words being transferred into the sub-class of inalienably possessed nouns (kin terms, body parts, parts of a whole).

Loanwords that appear in the semantic class of function words are from the creoles and French. They consist of a few conjunctions and, especially from French, numerals. Borrowings that appear in the semantic class of verbs are limited to the creole languages. The creole source languages offer examples of multifunctionality (Bruyn 2002): some words function as noun, verb or adjective without any change in word class being morphologically marked. This ability seems to be quite common in Sranan, especially with forms used both as a verb and as a noun, and less common in Guianese Creole.

4.3. Loanwords and semantic field

Table 3 presents the breakdown of loanwords by semantic field. Certain semantic areas avoid borrowing altogether, such as Kinship, The body and Sense perception, and others seem to be barely permissive, such as The physical world, Spatial relations, Emotions and values, and Law. In these last domains, Sranan is the most present donor language, followed by Guianese Creole and French.

In the semantic fields with a low rate of loanwords, such as Social and political relations or Quantity, it is noteworthy that the more recent contact languages played a role. The exceptions are Speech and language, with the presence of the Spanish borrowing sanpula ‘drum’, which became the most important musical instrument in Kali’na culture, and Warfare and hunting, with the introduction of firearms.

A higher number of loanwords appears in two fields, which however attest different phases of contact. Animals shows a high rate of Spanish loanwords from the early colonial period. Among them, only pelo ‘dog’ became a normal part of family life; the others, such as pɨiliku ‘pig’, paka ‘ox, cow’, kawale ‘horse’ and paliko ‘donkey’, remain in the lexicon because the Kali’na know about the existence of these animals, but they have not become part of the Kali’na way of life. Loanwords in the field Time, on the other hand, predominantly originate from French, with the divisions of the calendar.

Agriculture and vegetation, Motion, and Basic actions and technology present comparable distributions of loanwords: a higher number from Sranan, a moderate number from Spanish (but including important domesticated plants such as asikalu ‘sugar cane’), and a lower number from Guianese Creole or French.


Table 3: Loanwords in Kali’na by semantic field (percentages)

Donor-language columns: Sranan, French, Guianese French Creole, Spanish, Portuguese, Tupi-Guaraní, Dutch, Aluku, Unidentified

    Semantic field                    Total loanwords   Non-loanwords
 1  The physical world                            1.9            98.1
 2  Kinship                                       0.0           100.0
 3  Animals                                       8.8            91.2
 4  The body                                      0.0           100.0
 5  Food and drink                               24.9            75.1
 6  Clothing and grooming                        51.8            48.2
 7  The house                                    47.9            52.1
 8  Agriculture and vegetation                   19.5            80.5
 9  Basic actions and technology                 17.8            82.2
10  Motion                                       11.4            88.6
11  Possession                                   22.0            78.0
12  Spatial relations                             1.6            98.4
13  Quantity                                     10.7            89.3
14  Time                                         21.4            78.6
15  Sense perception                              0.0           100.0
16  Emotions and values                           2.5            97.5
17  Cognition                                     8.1            91.9
18  Speech and language                           8.3            91.7
19  Social and political relations               17.1            82.9
20  Warfare and hunting                          16.7            83.3
21  Law                                           9.1            90.9
22  Religion and belief                          43.5            56.5
23  Modern world                                 63.7            36.3
24  Miscellaneous function words                  0.0           100.0
    All fields                                   14.1            85.9

All fields, by donor language: Sranan 5.2; French 2.6; Guianese French Creole 2.6; Spanish 2.2; Portuguese 0.5; Tupi-Guaraní 0.4; Dutch 0.2; Aluku 0.1; Unidentified 0.3.

Per-field values by donor language, in the order printed (the alignment of individual cells with the rows above is not preserved):
Sranan, French, Guianese French Creole, Spanish: 1.9 1.8 - 4.8 13.6 1.6 6.4 2.4 21.2 7.1 - 15.3 21.0 6.7 13.4 3.4 7.1 1.8 1.8 1.8 8.1 3.2 1.6 4.8 0.9 3.5 1.7 3.5 12.6 - 3.1 3.1 - 1.6 2.7 8.0 6.0 13.2 2.2 2.5 2.7 - 5.4 3.6 - 4.7 12.4 4.7 7.2 - 3.6 4.8 - 9.1 17.4 - 17.4 12.4 20.1 28.6 2.1 -
Portuguese: 1.3 0.8 3.5 1.7 1.2 0.5 -
Tupi-Guaraní: 0.9 5.3 -
Dutch: 2.4 1.8 -
Aluku: 8.7 -
Unidentified: 2.4 3.4 3.1 -

An important creole influence is found in the fields Food and drink and The house. This results from the geographical and social proximity of the Creole societies: the Kali’na came to visit, and sometimes to live in, small towns where they heard these languages used as lingua francas and encountered changing housing practices. The field Clothing and grooming contains loanwords from almost all languages, with a particular influence of Spanish (general loanwords for cloth, dress and shoes), Sranan and French (more specific loanwords). Finally, the most important field in terms of number of loanwords is Modern world, where a growing number of loanwords from Guianese Creole and French is observed. It bears out the division between the eastern and western Kali’na Tɨlewuyu dialects and shows that their linguistic practices are influenced by sociolinguistic contacts with the official languages of French Guiana and Suriname, where the influence of Dutch on Kali’na is similar to the influence of French on the other side of the border.

Some features are interesting to note across various semantic fields. For example, the influence of schooling and purchasing is obvious in the French names of days and numbers in Time and Quantity. Another characteristic is the mapping of semantic fields. Older loanwords have acquired broad semantic values through various kinds of semantic change: alakaposa ‘gun’ (from Spanish arcabuz) changed its meaning with technological change, and kaleta ‘book, paper’ acquired many new referential values.

Kali’na essentially borrowed words relating to material items and other concepts associated with non-Indian products or practices, by insertion. Co-existence is observed in the case of dialectal competition, and replacement of native words is rare. It nevertheless appears with the increasing number of French words inserted in today’s speech.

5. Integration of loanwords

In this section, we deal first with the major processes that loanwords have undergone, regardless of the source language (disregarding the Amerindian borrowings, for which we do not know the source words). All Spanish, Portuguese, Dutch, older French, early Sranan and Guianese Creole loanwords are fully assimilated to the phonological, morphonological and morphological patterns of Kali’na, while some Sranan and Guianese Creole loanwords are only partially integrated.

Phonological integration involves various processes of adaptation of the borrowed item, among them substitution of phonemes. Although there is no opposition between p/b, t/d, and k/g in Kali’na, the stops /p/, /t/ and /k/ have voiced allophones, except word-initially. Initial voiced stops in loanwords were regularly replaced by the corresponding voiceless obstruents:

(4) pantila [panˈdiːɾa] ‘flag’ < Spanish bandera
    paliki [paːˈɾigi] ‘bark’ < French barque
    kalasi [kaˈɾaːçi] ‘glass’ < Sranan grasi
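The substitution illustrated in (4) can be written down as a single rewrite rule. The Python sketch below is only an illustration of that one step, assuming ordinary orthographic input; the function name `devoice_initial_stop` is hypothetical, and the other adaptations visible in (4), such as the change of r to l or the vowel changes, are deliberately not modeled.

```python
# Word-initial voiced stops and their voiceless counterparts.
DEVOICE = {"b": "p", "d": "t", "g": "k"}


def devoice_initial_stop(word):
    """Replace a word-initial b/d/g with p/t/k; leave everything else untouched."""
    if word and word[0] in DEVOICE:
        return DEVOICE[word[0]] + word[1:]
    return word
```

Applied to the source words of (4), the rule turns bandera into pandera and grasi into krasi, which are intermediate forms only; the attested loanwords pantila and kalasi show that further adaptations apply afterwards.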

During the final phase of the rise of the creoles, loanwords that maintain a word-initial voiced stop were introduced, and a new opposition emerged:

(5) pali ‘barrage’ ≠ bali ‘barrel’ < French barril
    panki ‘skirt’ ≠ banki ‘bank (seat)’ < Sranan

Another characteristic of the phonological system of Kali’na is the palatalization of consonants after /i/ (except when followed by the same vowel) and of /s/ after and before this vowel. Phonological integration involves these allophonic realizations:

(6) asikalu [açiˈcaːɾu] ‘sugar-cane’ < Spanish açucar
    sikopu [çiˈcopu] ‘shovel’ < Sranan skopu

This adaptation is beginning to weaken, as the consonant [t] may now occur after /i/ instead of the allophone [c]:

(7) lopital [loˈpital] ‘hospital’ < Guianese Creole lopital

Kali’na has a phonological constraint that restricts consonant clusters, and the source words often contain clusters that Kali’na does not allow. Consonant clusters that violate the syllable structure constraints are broken up by inserting an epenthetic vowel between the two consonants, or by adding a final vowel:

(8) paliki ‘bark’ < French barque
    sipunu ‘spoon’ < Sranan spun
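The two repairs shown in (8), a vowel inserted inside a cluster and a vowel added after a final consonant, can be sketched in Python as follows. This is a deliberately simplified illustration and not a description of Kali’na phonology: it assumes that every sequence of two consonants is disallowed, that the cluster-breaking vowel is always i, and that the final vowel copies the nearest preceding vowel. These vowel-choice rules are assumptions chosen because they reproduce the forms in (8). The function name `break_clusters` is hypothetical, and the input `palk` is an assumed intermediate form of barque after consonant substitution.

```python
VOWELS = set("aeiou")


def break_clusters(stem, default_vowel="i"):
    """Insert vowels so that no two consonants are adjacent and the word ends in a vowel."""
    out = []
    for i, ch in enumerate(stem):
        out.append(ch)
        if ch in VOWELS:
            continue
        at_end = i == len(stem) - 1
        if at_end:
            # Word-final consonant: add a copy of the nearest preceding vowel,
            # falling back to the default vowel if there is none.
            previous = [c for c in out if c in VOWELS]
            out.append(previous[-1] if previous else default_vowel)
        elif stem[i + 1] not in VOWELS:
            # Consonant followed by another consonant: break the cluster.
            out.append(default_vowel)
    return "".join(out)
```

With these assumptions, `break_clusters("spun")` yields sipunu and `break_clusters("palk")` yields paliki. The sketch is too strict for forms such as panki, which keep a nasal plus stop sequence, and it does not reproduce cases where the inserted vowel assimilates to a neighboring vowel, as in kalasi from grasi.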

The quality of epenthetic vowels is determined by progressive or regressive assimilation, or is by default a vowel prone to devoice, such as [i] or [$]. It seems that Kali’na does not use simplification processes because it tends to accept (and sometimes even favor) polysyllabic stems and words. These rules do not apply to the most recent borrowings (from the creoles or French), which preserve consonant clusters:

(9) kalden ‘mosquito-net’ < Sranan garden

The morphological integration of loanwords depends on historical factors, but also on the source languages. Kali’na clearly differs in structure from the European languages, which are fusional, and from the creoles, which are isolating. All borrowed nouns enter into the morphological patterns of Kali’na nouns. The numerous processes of derivation, with or without change of word class, are very productive, as are semantic changes:

• semantic derivation

(10) winu-menpo ‘a glass of wine’ < Spanish/Portuguese vino/vinho ‘wine’
     wine-DIMIN

• derivation with transposition

(11) lakele-to ‘to lock’ < Guianese Creole laklé ‘key’
     key-TR.VERBALIZER

• derivation with transposition to adverb and semantic change of the loanword (12a), followed nowadays by a new regular morphological transfer into the noun class with a new semantic change (12b):

(12) a. i-kaleta-pa ‘without paper, without document’ < Spanish carta
        PRIV-paper-PRIVATIVE.ADV
     b. i-kaleta-p[a]ɨn ‘someone without documents, in an irregular situation’
        PRIV-paper-PRIVATIVE.ADV-NMLZ
        (French un immigrant en situation irrégulière, un “sans papier”)

• possessive inflection

(13) a-serviette-ɨ-lɨ [asɛʁvjɛtɨlɨ] ‘your towel’ < French serviette
     2-towel-$-POSS

In contrast to this flexible adaptation of borrowed nouns to the patterns of Kali’na nominal morphology, verbs, like adjectives, are borrowed by means of a transfer into new invariable sub-categories of nouns (Rose & Renault-Lescure 2008). Borrowings from the Sranan verb/noun category are always integrated as uninflected nominal stems to which a verbalizing suffix is attached to form transitive verbs, carrying a person prefix and a tense suffix:

(14) tamusi si-begi-ma-e
     God    1A-prayer-VERBALIZER-PRS
     ‘I pray God’

The Kali’na verb begima ‘to pray’ has its origin in the Sranan form begi ‘to pray, prayer’. To this form Kali’na suffixes its transitive verbalization morpheme -ma. This regular strategy involves a creative adaptation process and conforms fully to Kali’na patterns of derivation. Thus Kali’na borrowed the Sranan verb/noun forms as nominal stems, but kept their verbal meanings by using them in a verbal construction.

Why is the Sranan verb, unspecified for valency, always used in a transitive construction? Many linguists agree that morphological adaptation and class assignment may be hindered when the recipient language has complex rules. This is the case with the verbal system of Kali’na, which displays split intransitivity. This strategy may allow speakers to avoid assigning the borrowed verb to one of the intransitive verbal sub-classes (active or stative). Some of these loanwords are kept in the lexicon of French Guiana Kali’na; others disappear. Notably, this kind of formation stopped when the borrowing of verbs from Guianese Creole began.

Borrowings from Guianese Creole follow a similar strategy, but the result is a different construction. The invariable forms of the Creole verb follow another pattern: in order to be used as predicates, they need to be embedded in predicative structures involving nominalized verbs in a postpositional phrase with poko ‘busy with’.


The postpositional group functions in a single-participant copular construction:

(15) pentiré poko      man
     paint   busy.with 3S.COP.PRS
     ‘He is painting.’

Nowadays an increasing number of such structures is invading Kali’na everyday interactions, with two noteworthy features: the use of a two-participant construction with the transitive verb ɨlɨ ‘to put’, and the opening of this structure to any French verb in its infinitive form (Renault-Lescure 2005; Alby & Renault-Lescure 2009+):

(16) woto nettoyer poko      s-ɨ-ya
     fish clean    busy.with 1A-put-PRS
     ‘I am cleaning the fish.’ (lit. ‘I am putting it out for cleaning.’)

In this case we observe a reanalysis of a verb form from the donor language as an uninflected nominal form in the recipient language. In the sentence above, the postpositional phrase has acquired the function of a complement, the noun woto ‘fish’ being the object of the verb ɨlɨ ‘to put’. This open structure now leads speakers to insert many French verbs, frequently replacing native forms, as in example (16). These cases do not appear in the database, nor do the cases of insertion of French adjectives. Adjectives are undergoing a similar transposition to the class of adverbs in Kali’na: they are inserted as nominal roots adverbialized by means of an adverbializer:

(17) rouge-me man       < French rouge ‘red’
     red-ADVB 3.COP
     ‘it is red’

The large number of loan nouns relative to the other semantic word classes seems to be due to various factors. One of them is the need to name new objects introduced through contact with other groups. While this need may be met by the creation of new words (neologisms) or by the borrowing of lexical items, the use of either mechanism depends on the interpretation of the new entities. White men in general were named palanakɨlɨ (a compound noun meaning ‘spirits of the sea’) and thus reinterpreted as new entities of the indigenous world. In turn, the different colonial peoples were differentiated with borrowings such as palansi[sin] ‘French’ and pulutekesi ‘Portuguese’, as the colonies were perceived as foreign socio-political organizations. One lexical mechanism is regular: an object is named with a borrowed word, but parts of the object and activities performed with it are described with native words: pila ‘sail’, from Spanish, and pila epɨ ‘mast, support of the sail’; oto ‘car’, from French, and oto untɨma ‘to drive (a car)’ (as in ‘to drive a canoe’). Another factor is the vitality in this language of the derivational processes (see below) and of the compound forms based on possessive phrases (cf. 18a) and on nominalized postpositional (18b), verbal (18c) or adverbial groups (18d). In certain cases, it led to the creation of lexicalized compound phrases:

(18) a. kawale ‘horse’ < Spanish cavallo
        kawale elepa-lɨ ‘herb (sp. Cymbopogon citratus Stapf)’
        horse  food-POSS
     b. alakaposa ‘gun’ < Spanish arcabuz
        alakaposa ta-no ‘cartridge’
        gun       in-NMLZ
     c. lekol ‘school’ < Guianese Creole lékol
        lekol  ka[pɨ]'-nen ‘teacher’
        school make-NMLZ.AGT
     d. sopo ‘soap’ < Sranan sopo
        sopo tɨ-yana-le-n ‘bar of soap’
        soap ADV-hardness-ADV-NMLZ

6. Grammatical borrowing

Direct grammatical borrowings are not observed, but syntactic changes that may have been triggered by contact are described in Rose & Renault-Lescure (2008). Three contact-induced changes are worthy of notice.

The first change is induced by the borrowing of a frequent function word, the coordinating conjunction nanka ‘and’ (< Sranan), inserted between two noun phrases (cf. 19a). In the traditional Kali’na construction, a comitative postposition malo ‘with’ is postposed to the noun (cf. 19b):

(19) a. wayamaka nanka akale
        Iguana   and   Caïman
        ‘Iguana and Caïman’
     b. wayamaka akale  malo
        Iguana   Caïman with
        ‘Iguana with Caïman’

The coexistence of the two constructions creates a significant typological change. Furthermore, this coordinator now occurs both between two phrases and between two sentences.

Another typological change is linked to the borrowing of subordinators. Kali’na is starting to replace its inherited nonfinite subordinate constructions marked by postposed subordinators (such as yako ‘when’) with finite constructions introduced by borrowed preposed subordinators (such as paske ‘because’ < Guianese Creole). But so far this change is restricted to subordinate constructions with borrowed subordinators.

A third change is the reinforcement of analytical constructions through the very frequent use of copular constructions and of the construction with the verb ɨlɨ ‘to put’ in order to integrate foreign verbs. The copula construction is now the only way to integrate novel verbs from any source language into Kali’na. Examples with Brazilian Portuguese as the source language have also recently been collected in Amapá, Brazil:

(20) misa ta reza   poko      wai          < reza(r) (Amazonian Portuguese)
     mass in prayer busy.with 1.COP.PRF
     ‘I prayed at the mass.’

7. Final remarks

To understand Kali’na loanwords, we need to keep in mind the various linguistic and sociolinguistic factors that play a role in the contact phenomena (Alby & Renault-Lescure 2009+).

7.1. Linguistic factors

The most important structural linguistic factors include the following:

(i) The relative timing of borrowed elements: the hierarchies of borrowability predict that when contact is initiated, the first elements to be borrowed will be content items because they are the most salient and transparent of all potentially borrowable elements (Field 2002). It seems particularly important to note that entities with visible referents are easiest to learn for speakers who have visibility as a cognitive category, as is the case with Kali’na;

(ii) The impossibility of directly borrowing verbs when the verbal system of the recipient language is complex, or adjectives when the recipient language does not have the same word class; verbs from the source language must therefore be reanalyzed in order to enter Kali’na. In this language, relations between the arguments and the predicate are marked on the predicate with person indexes, following a hierarchical system (Renault-Lescure 2002). We also noted the absence in the database of loanwords coming from an adjectival class in any source language: in the incipient borrowings of today, we showed that they are also reanalyzed and integrated into an attributive sub-category of adverbs;

(iii) The convergence between features of internal and external change: the Kali’na language today tends to use analytical sentence patterns (for example, young people avoid the use of nominal incorporation), and this seems to parallel the increasing use of auxiliary structures (for borrowing verbs);

(iv) The morphosyntactic changes triggered by loanwords, as we have seen in §6;

(v) The foreign items that are not directly borrowable are marked with specific morphemes that underline their foreign origin. De Goeje (1946) already gave the same meaning to the verbalizer -ma ‘to apply, in the way of’ and the adverbializer -me ‘in the way of’.

7.2. Sociolinguistic factors

The most important sociolinguistic factors include the following:

(i) The intensity of contacts: three major waves of contact have left their mark on the language: violent contacts with the Spanish, contact involving more social interaction with the Creoles in the 19th century, and contact with French under administrative and political pressure after the mid-twentieth century. It is interesting to note that borrowings are sensitive to political and economic domination. That is the case for the Kali’na, who are in a double minority situation in relation to the Creole and French societies. This may explain why the social contacts with Maroon peoples did not result in borrowings from their languages;

(ii) The attitudes of the speakers and the social context of interactions play an important role. Although some “purist” speakers are careful with the use of loanwords, attitudes differ from one speaker to the next, following various parameters such as the origin of the village and family, level of multilingualism, presence of foreigners, etc. By contrast, the use of borrowings is completely generalized. In addition, the path is now open for incipient or nonce borrowings or insertions, and each speaker has various registers for use in various situations. Léglise & Alby (2006) write about Kali’na:

“[…] language contact has led to numerous lexical changes and the question of whether language contact has brought about morphosyntactic change is still unresolved. Nonetheless, language games, alternations, bilingual speech as a code among adolescents […] with identity-building and cryptic functions, are signs of vitality in that the language is alive and functional language as a linguistic resource […] for young speakers who, as the first generation attending secular schools, and experiencing the difficulties of drops in status and the creation of social outcasts…”


8. Conclusion

Kali’na has been in contact with different languages since pre-Columbian times. Apart from Amerindian languages, which are not well identified, seven main donor languages have left imprints on the lexicon, the influence of each differing according to the type of the language in contact, its socially dominant or minority situation, the intensity of the contact and the speakers’ attitudes. The present-day configuration is leading to new contact situations with the other languages present, a higher level of multilingualism, particularly bilingualism with French, and a new permissiveness with respect to code-switching.

References

Ahlbrinck, Wilhelmus G. 1956 [1931]. Encyclopaedie der Karaïben. (Verhandelingen der Koninklijke Nederlandsche Akademie van Wetenschappen, Afdeeling Letterkunde XXVII). Reprinted in Encyclopédie des Caraïbes, translated from Dutch by von Herwijnen, Doude. Typescript.

Alby, Sophie. 2001. Contacts de langues en Guyane française: Une description du parler bilingue kali’na-français. Thèse de doctorat, Université de Lyon II.

Alby, Sophie & Renault-Lescure, Odile. 2009+. Stratégies prédicatives en contact: Langue kali’na et discours des jeunes Kali’na. In Chamoreau, Claudine & Goury, Laurence (eds.), Changement linguistique et langues en contact: Approches plurielles du domaine prédicatif. Paris: CNRS Editions.

Biet, Antoine. 1896. Les Galibis: Tableau véritable de leurs moeurs avec un vocabulaire de leur langue (1661). Revue de Linguistique juillet/octobre. Paris: Aristide Massé.

Boven, Karin & Morroy, Robby. 2000. Indigenous Languages of Suriname. In Queixalós, Francisco & Renault-Lescure, Odile (eds.), As línguas amazônicas hoje, 377–384. São Paulo: IRD-ISA-MPEG.

Boyer du Petit Puy, Paul. 1654. Véritable relation de tout ce qui s’est fait et passé au voyage que M. de Bretigny fit à l’Amérique occidentale. Paris: P. Rocolet.

Bruyn, Adrienne. 2002. The structure of the Surinamese Creoles. In Carlin, Eithne B. & Arends, Jacques (eds.), Atlas of the Languages of Suriname, 153–182. KITLV Press.

Collomb, Gérard. 2006. Disputing Aboriginality: French Amerindians in European Guiana. In Forte, Maximilian C. (ed.), Indigenous resurgence in the contemporary Caribbean: Amerindian survival and revival, 197–212. NY: Peter Lang.

Collomb, Gérard & Tiouka, Félix, with Appolinaire, Jean & Renault-Lescure, Odile. 2000. Na’na Kali’na: Une histoire des Kali’na en Guyane. Petit Bourg, Guadeloupe: Ibis Rouge Editions.

de Goeje, Claudius H. 1909. Etudes linguistiques caraïbes. (Verhandelingen der Koninklijke Nederlandsche Akademie van Wetenschappen, Afdeeling Letterkunde, Nieuwe Reeks 10.3). Amsterdam.

de Goeje, Claudius H. 1946. Etudes linguistiques caribes. Vol. 2. (Verhandelingen der Koninklijke Nederlandsche Akademie van Wetenschappen, Afdeeling Letterkunde, Nieuwe Reeks 49.2). Amsterdam: Noord-Hollandsche Uitgevers Maatschappij.

de la Salle de l’Estaing, M. 1763. Dictionnaire galibi, présenté sous deux formes: I° Commençant par le mot français; II° par le mot galibi. Précédé d’un essai de grammaire. In Bruletout de Préfontaine (ed.), Maison rustique à l’usage des habitants de Cayenne, 1–127. Paris: Bauche.

Field, Fredric W. 2002. Linguistic Borrowing in Bilingual Contexts. Amsterdam/Philadelphia: John Benjamins Publishing Company.

Forte, Janette. 2000. Amerindian Languages of Guyana. In Queixalós, Francisco & Renault-Lescure, Odile (eds.), As línguas amazônicas hoje, 317–331. São Paulo: IRD-ISA-MPEG.

Gildea, Spike. 1998. On Reconstructing Grammar, Comparative Cariban Morphosyntax. New York/Oxford: Oxford University Press.

Gildea, Spike. 2002. Etat de l’art des descriptions linguistiques des langues du groupe caribe. In Landaburu, Jon & Queixalós, Francesc (eds.), Faits de Langues: Méso-Amérique, Caraïbes, Amazonie, Vol. 2, 79–85.

Gildea, Spike & Payne, Doris. 2007. Is Greenberg’s Macro-Carib viable? Lingüística Histórica na América do Sul. In Galucio, Ana Vilacy & Muysken, Pieter (eds.), Boletim do Museu Emilio Goeldi (Série de Ciências Humanas). Belém: Museu Goeldi.

Grenand, Pierre & Grenand, Françoise. 1979. Les Amérindiens de Guyane française aujourd’hui: Eléments de compréhension. Journal de la Société des Américanistes 66:361–382.

Hoff, Berend J. 1968. The Carib language, phonology, morphonology, morphology, texts and word index. The Hague: Martinus Nijhoff.

ISA = Instituto Socio-Ambiental. 2008. A Enciclopédia dos Povos Indígenas no Brasil.

Kloos, Peter. 1971. The Maroni River Caribs of Surinam. Assen: Van Gorcum.

Léglise, Isabelle & Alby, Sophie. 2006. Minorization and the process of (de)minorization: The case of Kali’na in French Guiana. International Journal for the Sociology of Language 182:67–85.

Renault-Lescure, Odile. 1984. A propos des premières descriptions d’une langue caribe, le Galibi. In Auroux, Sylvain & Queixalós, Francesc (eds.), Amerindia n° spécial 6:183–208.

Renault-Lescure, Odile. 1985. Evolution lexicale du galibi, langue caribe de Guyane française. Paris: TDM ORSTOM, F 16.

Renault-Lescure, Odile. 1990. Contacts interlinguistiques entre le karib et les créoles des côtes guyanaises. Etudes Créoles 13(2):86–93.

Renault-Lescure, Odile. 2002. As palavras e as coisas do contato: Os neologismos Kali’na (Guiana Francesa). In Albert, Bruce & Ramos, Alcida Rita (eds.), Pacificando o Branco: Cosmologias do contato no Norte-Amazônico, 85–112. São Paulo: Editora UNESP.

Renault-Lescure, Odile. 2005. Intégration grammaticale des emprunts et changements linguistiques dans la langue kali’na de Guyane française (famille caribe). In Chamoreau, Claudine & Lastra, Yolanda (eds.), Dinámica lingüística de las lenguas en contacto, 103–120. Universidad de Sonora.

Renault-Lescure, Odile. 2007. L’écriture du kali’na en Guyane: Des écritures coloniales à l’écriture contemporaine. In Léglise, Isabelle & Migge, Bettina (eds.), Pratiques et représentations linguistiques en Guyane: Regards croisés, 425–453. Paris: IRD Editions.

Renault-Lescure, Odile. 2009+. Guyana Francesa. In Sichra, Inge (ed.), Atlas etnográfico y sociolingüístico de Latinoamérica y El Caribe. UNICEF / FUNPROEIB.

Renault-Lescure, Odile & Goury, Laurence. 2009+. Les langues de Guyane [The languages of Guyana]. La Roque d’Anthéron: Vents d’Ailleurs/IRD.

Rose, Françoise & Renault-Lescure, Odile. 2008. Contact-induced changes in Amerindian Languages of French Guiana. In Stolz, T. & Salas Palomo, R. & Bakker, D. (eds.), Romancisation world-wide, 349–376. Mouton de Gruyter.

Taylor, Douglas M. & Hoff, Berend J. 1980. The linguistic repertory of the Island-Carib in the seventeenth century: The men’s language – a Carib pidgin. International Journal of American Linguistics 46:301–312.

Loanword Appendix Aluku mupelu

priest

Arawak kuyapa

guava

Dutch supikili paipa palantuwini mantala salaika

mirror pipe liquor, rum almond rake

French mille onze vendredi lundi samedi koto w!w! velo paliki kanon bouton

thousand eleven Friday Monday Saturday dress axe bicycle boat brick button

calendrier oto fromage kapiteni ko’ko pena maloto machine numéro plastique cachot simona tele serviette jeudi mardi douze mercredi alopon sipol!

machine

calendar car cheese chieftain coconut door hammer motor number plastic prison rudder television towel Thursday Tuesday twelve Wednesday harpoon convict, white man (pejorative) motor, machine

Guianese French Creole paske bal labye dibe kafe leta lopital lakele

pil[!] maché dilet! sis! kaché lapos! radio alato lekol! vis lapalet! linet! simenn beni

because ball, football beer butter coffee government hospital key, latch, door-bolt, lock lamp, torch market milk petroleum pill, tablet post/mail radio raft school screw sling glasses week to bless

38. Loanwords in Kali’na pentire dilwil legliz loto lafinèt lit! lapèl zonyon palan puwela buret! zenk

to paint oil church car window bed shovel onion long line canvas sheet wheelbarrow corrugated iron

welasi kapala sapato asikalu tapala pot!iya mank! m!lato Portuguese kasulu kawale

Spanish or Portuguese

Sranan

kapilita suntat!

nanka fulu sonde oplan[!] pasikita pesi bedi bais!k!l! palanka patele pelele talapu

kamisa winu palansi(sin) pulutekesi lem! pila

goat army, soldier, police cloth, shirt wine French men Brazilian (people) oar sail

Spanish nopon saiya ankala kaleta

palila supala palapi pelo paliko sanpula alakaposa sanpelelu p!loto p!lata akusa paka p!iluku mawasa kapuya

blanket (woman’s) dress anchor paper, book, document, letter, postcard, newspaper comb cutlass dish dog donkey drum gun hat lead money needle ox, cow pig razor rope

scissors sheep shoes sugar cane table watermelon mango mixed-race Kali’na

posolo konopu pusipusi keti keleke alimiki olosi kopolo kontele komiki peleti bakis! sipi polom!n polomiki sipunu ank!sa watalakan konu

lampu

bead horse

and many Sunday airplane basket beans bed bicycle board bottle bread bridge, ladder, port, harbor brush button cat chain church citrus fruit clock copper country, town cup dish fish trap fishnet flour flower fork, spoon handkerchief, rag jug/pitcher king, queen, head of state, president lamp

wowoiyo suwapulu meliki alata kalasinoli pini talapu moito aleisi sasa yulu empi wenkele sikopu panki sopo kosu sukulu puluku pesele posima epema kofuma bekima lesima wekima kuwasi kalden boketi tolonpulu ap!lisina ayunu mati

sinesi suma senki pesi


market matches milk rat petroleum pin harbor, ladder, bridge prostitute rice saw hour, period, time shirt shop shovel skirt soap socks sugar trousers window to kiss to pay to pound to pray to read to weigh pom-pom mosquito net bucket case orange onion Creole or Black men, friend, mate Chinese somebody corrugated iron pea, bean

Tupi-Guarani languages alawe awasi kunami ulupe kapiwa nana

cockroach maize fish poison sp. mushroom sp. capybara pineapple

Chapter 39

Loanwords in Hup, a Nadahup language of Amazonia*

Patience Epps

1. The language and its speakers

The Hup language (also known as Hupda or Jupde) is spoken in the Vaupés region of the northwest Amazon, an area of roughly 2000 square kilometers that straddles the border between Brazil and Colombia. The approximately 1500 speakers of Hup, who call themselves Hup-d’əh [person-PL], live in scattered villages (with populations numbering from a dozen to a hundred people, in most cases), and rely on hunting, fishing, gathering, and the small-scale cultivation of bitter manioc for subsistence.

Hup belongs to the small family of Nadahup languages, also known as Makú.1 Until recently, there was almost no comprehensive description of any member of this language family. Information remains limited, and for this reason the membership of the family itself is still in question. Evidence for a genealogical relationship between four languages – Hup, Yuhup, Dâw, and Nadëb – is fairly clear (both lexical and grammatical; see Martins 2005; Epps 2008a: 3–7), and suggests the family tree in Figure 1. Lexical comparison has so far revealed no clear evidence for a relationship between these languages and the pair of sisters Kakua and Nukak, or the language Puinavé, spoken in Colombia, although these have been treated as members of the ‘Makú’ family in much of the literature (see Epps (2008a) for discussion; Martins (2005) proposes some possible cognates but finds no regular sound correspondences).

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Epps, Patience. 2009. Hup. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 994 entries.

1 The name Nadahup is preferred here because (a) the name Makú occurs in the literature in reference to several unrelated language groups in Amazonia and is thus prone to confusion, and (b) the name Makú (likely from Arawak ‘do not talk’; cf. Koch-Grünberg 1906) is widely recognized in the Vaupés region as an ethnic slur, directed against the members of this ethnic/linguistic group. Nadahup combines elements of the names of the four established languages that make up the family (Nadëb, Dâw, Yuhup, Hup). This language family has also been referred to as Vaupés-Japura (cf. Epps 2005; Ramirez 2001b).


Figure 1: The Nadahup family [tree diagram: Nadahup branches into Nadëb and a subgroup comprising Dâw and the closely related pair Hup and Yuhup]

Hup itself can be divided into at least three main dialects, defined geographically. These are spoken in the region between the lower Tiquié and upper Vaupés Rivers, along the middle Tiquié and Papuri Rivers, and in the area of the upper Tiquié in Brazil and into Colombia. The extent to which dialectal variation continues on the Colombian side is not known. These three dialects are all clearly mutually intelligible; differences are most profound between the upper Tiquié form and those of the other two regions. This study draws primarily on data from the middle Tiquié dialect, as spoken in the village of Barreira Alta.

The Vaupés region is well known for the multilingualism of its inhabitants, fostered by the regional practice of linguistic exogamy, or marriage across language groups. The region is remarkably diverse, containing representatives of three different language families – Nadahup, East Tukanoan, and Arawak (see Map 1) – as well as speakers of Portuguese and Nheengatú (also known as Lingua Geral, a Tupian language spread by missionaries in the 17th–19th centuries). Hup speakers are in frequent contact with speakers of East Tukanoan languages, and appear to have maintained a stable but unilateral (unreciprocated) bilingualism in Tukano for many generations (see §3). Interactions with Portuguese speakers also take place, but are much more limited.

In the regional social hierarchy, speakers of Hup and other Nadahup languages are positioned at the bottom, and the sociolinguistic status of their languages is accordingly very low. Despite this fact and the general bilingualism in Tukano (the regional lingua franca), Hup is currently fully viable, being learned by all Hup children as a first language, and as yet shows no signs of shifting to Tukano. Its viability is probably nevertheless quite fragile. In their home villages, Hup speakers use Hup nearly one hundred percent of the time. However, there are some domains where Tukano is frequently used.
Larger Hup villages typically have a small primary school and a church, instituted in the past 20–30 years by Catholic missionaries, who entrusted their direction to local Tukanoans. For this reason, many Hup villages today have a resident Tukanoan schoolteacher and catechist (often the same person), who almost invariably speaks only Tukano. Hup speakers typically use a mixture of Hup and Tukano in their interactions with these people, particularly in the school and church contexts, depending on the degree to which the Tukanoan person is integrated into the village and understands some Hup. Children in particular are often not comfortable in Tukano, and most use Hup even in the school context, although instruction is almost completely in Tukano.

Map 1: Hup and other languages of the Upper Rio Negro region

Outside of their own villages, Hup speakers usually speak Tukano when they are in a Tukanoan community, often even amongst themselves; this is clearly encouraged by feelings of linguistic insecurity engendered by the unequal social status of these languages. They likewise favor Tukano (at least while in public) on the rare occasions when they travel to São Gabriel da Cachoeira, the nearest Brazilian town, which is also home to many Tukanoan and other indigenous people. Those few Hup people who speak Portuguese normally use it only with non-Indian people; contexts for this interaction are limited mostly to visits from missionaries, health agents, and linguists and anthropologists, or to the occasional trip to São Gabriel.

Hup is effectively an unwritten language. The last five years have seen the development of a Hup orthography and some native-language literacy classes, but at this point few, if any, Hup people write the language on a regular basis. It is not used regularly in the village schools, which for the most part introduce literacy in Portuguese (ineffectively, in most cases, since children do not speak Portuguese).

Little is known about the histories of the indigenous Vaupés peoples. There is some speculation that the Nadahup peoples are the region’s autochthonous inhabitants, which is supported by the fact that all the known Nadahup languages are spoken in this general area. Ethnohistorical accounts suggest that the Tukanoans entered the Vaupés between 500 and 1000 years ago, probably from the west (cf. Aikhenvald 1999: 390, 2002: 24, Nimuendajú 1982), and the Arawaks are thought to be the most recent arrivals, entering from the northeast around the end of the 16th century (cf. Cabalzar & Ricardo 1998: 55; Aikhenvald 2002: 24).

Contact with non-Indians began with the arrival of the Portuguese at São Gabriel da Cachoeira around 1700, initiating a brutal epoch of slaving and epidemics that decimated the local indigenous populations. A new wave of non-Indian contact took place during the rubber boom, lasting from approximately 1870 to 1920. Catholic missionaries have also been active in the region since the early days, and have done much to alter indigenous culture. However, the river-dwelling Tukanoan and Arawak peoples bore the brunt of this contact, while for many years the Hupd’əh remained more isolated in their interfluvial forests. Salesian Catholic missionaries began to approach the Hupd’əh seriously in the 1940s, intensifying their efforts in the early 1970s. This led to the concentration of small, semi-nomadic Hup communities into many of the larger, more sedentary villages (100 people or more) that exist today. The greater accessibility of these villages to outsiders led to more regular contact from health agents, missionaries, and other Portuguese-speaking visitors, and the missionaries’ practice of instituting a Tukanoan teacher and catechist has undoubtedly increased the role of Tukano in the everyday life of many Hup speakers.

2. Sources of data

The greatest challenge in completing this study was the lack of detailed lexical data on most of the languages spoken in and around the Vaupés region. Data on Hup was itself fully accessible to me, coming almost exclusively from my own fieldnotes, which were compiled during fieldwork along the Rio Tiquié (Amazonas, Brazil) conducted in 2000–2004. A short dictionary of Hup also now exists (Ramirez 2006), which I used for crosschecking, and a reference grammar of Hup has recently been published (Epps 2008a). Aside from Hup, data on Portuguese is also of course readily available, and for the purposes of this study came mostly from personal knowledge.

Data on Tukano, Hup’s primary contact language, was unfortunately more limited, coming primarily from the Tukano dictionary and grammar by Ramirez (1997), and supplemented by information provided in the field by Hup speakers fluent in Tukano. Little lexical data on other Tukanoan languages exists, although this would have been useful for comparative purposes. The dictionary of Desano (a relative of Tukano spoken in the same region) by Aleman et al. (2000) was consulted, although it was of limited use due to a lack of detail; Huber & Reed (1992) was also a very limited source of comparative data on several Tukanoan and other regional languages.


Of other languages spoken in the region, information on Nheengatú (Tupian; also known as Lingua Geral) comes from a dictionary by Grenand & Ferreira (1989). Available sources on local Arawak languages were also consulted, although these revealed very little evidence for borrowing from Arawak into Hup; these are Aikhenvald’s (2001) dictionary of Tariana and Ramirez’ (2001a) dictionary of Baniwa. Huber & Reed (1992) was also a source of some data on Arawak. Data on Hup’s relatives, the three other languages that make up the Nadahup family, is also extremely scarce, although of considerable importance to this study. Lexical data on Yuhup, Hup’s closest sister, is limited to a short wordlist (of around one hundred words) that I collected while in the field, to the words that appear in the examples in Ospina’s (2002) grammar, and to a list of words in Martins (2005). The similarly scanty data on Dâw is also drawn from a wordlist I collected, lists in Martins (1994) and Martins (2005), and those appearing in the examples in the grammar by S. Martins (2004). For Nadëb – Hup’s most distant relative – a lengthy but very rough draft of a lexicon (Martins 1999) and a wordlist (Martins 2005) exist, as well as those words appearing in the examples in Weir’s (1984) grammar sketch. Determining the likelihood of borrowing and the immediate sources of loans was constrained by the information available. Ramirez’ Tukano dictionary (1997) is by far the most extensive available dictionary of any East Tukanoan language, but it still is quite limited, and includes almost no Portuguese borrowings or names for non-native concepts. The loanword status of many Hup words in the subdatabase is thus given as ‘no information’ where no Tukanoan form was available for comparison (except in the few cases where there was some other indication that the word might be borrowed). 
Furthermore, without a good documentation of Portuguese loans into Tukano, it was impossible in many cases to determine whether borrowings of Portuguese origin in Hup came via Tukano or directly from Portuguese (see the discussion below). The lack of detailed lexical data on Hup’s sister languages also made intra-family comparison difficult. Given these limitations, the likelihood of borrowing from Tukano was determined by the following criteria. If a Tukano word nearly identical in form and meaning to the Hup word could be identified (e.g. Hup b"# and Tukano bi#i ‘rat/mouse’), the degree of certainty was rated 3 (“probably borrowed”) or 4 (“clearly borrowed”), depending on the closeness of the match, intuitions of consultants, and whether information was available on the form of the word in Hup’s sister languages (i.e. indications that the word is not cognate were considered additional evidence of loanword status). The degree of certainty decreased where the word form and meaning were only an approximate match (e.g. Hup pi# and Tukano yãpi ‘potato’), to 3 or 2 (“possibly borrowed”) if Hup’s sister languages had unrelated forms, and to 2 or 1 (“very little evidence”) where there was no comparative information available on the word in the other Nadahup languages. Finally, while existing data and the current sociolinguistic situation (see §3) suggest that the direction of borrowing was consistently Tukanoan > Hup rather than vice versa, the

absence of good comparative data on the Tukanoan and Nadahup families has made this impossible to establish definitively in every case. The limited data (and lack of any text record) also made the dating of words in the subdatabase difficult. Relative dates were established using several criteria. First, comparison of word lists across the Nadahup family, and particularly the reconstructions in Martins (2005), were used to determine whether a given word was cognate across two or more Nadahup languages. Native words are dated according to the highest node of the tree to which they appear to reconstruct (e.g. Proto-Nadahup; Proto-Hup-Yuhup-Dâw; etc.; but note that these dates are constrained by the available lexical data, and intra-family borrowing cannot always be ruled out). Loanwords of Portuguese origin are dated simply as “relatively early/late” on the basis of formal accommodation and my best guess at when the concept became known to Hup speakers generally; for example, Hup speakers obtained items like knives and axes through trade long before they entered into direct contact with non-Indians, whereas other things like ice, wheat bread, and canned beer are very recent and still rarely encountered. Finally, Tukano borrowings are in general not given a date, but those cases where semantics, phonological form, speaker intuitions, and other clues indicated that they were part of the newer stratum (see §5.1) are labeled ‘relatively recent’.
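The certainty criteria for Tukano loans described earlier in this section can be summarized as a small lookup. The following is a minimal sketch, not the procedure actually used to build the subdatabase: the function name, the string labels for match quality and sister-language evidence, and the choice to return a range of levels rather than a single level are all illustrative assumptions. The final choice within a range depended on closeness of match and consultant intuitions, which are not modeled here.

```python
# Certainty levels and their labels as given in the text.
LABELS = {
    1: "very little evidence",
    2: "possibly borrowed",
    3: "probably borrowed",
    4: "clearly borrowed",
}


def certainty_range(match, sister_evidence):
    """Return the (low, high) certainty levels the criteria allow.

    match: "near-identical" or "approximate" (hypothetical labels for how
        closely the Tukano form and meaning match the Hup word).
    sister_evidence: "unrelated" if Hup's sister languages have unrelated
        forms, "none" if no comparative information is available.
    Returns None for combinations the text does not spell out.
    """
    if match == "near-identical":
        # Rated 3 or 4 depending on closeness, consultants, sister languages.
        return (3, 4)
    if match == "approximate":
        if sister_evidence == "unrelated":
            return (2, 3)
        if sister_evidence == "none":
            return (1, 2)
    return None


low, high = certainty_range("approximate", "none")
print(LABELS[low], "to", LABELS[high])
```

Returning a range keeps the sketch honest about how much of the rating rested on judgment rather than on a mechanical rule.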

3. Contact situations

Of Hup speakers’ experiences with other languages, contact with Tukano appears to have been by far the most profound, followed by Portuguese. Hup has also had direct or indirect contact with other East Tukanoan languages, Tariana (Arawak), and Nheengatú (Tupian).

3.1. Contact with Tukanoan

Interaction between speakers of Hup (and other Nadahup languages) and speakers of East Tukanoan languages has a primarily economic basis. The Hupd’əh, traditionally forest-dwellers with a subsistence focus on hunting and gathering, supply their fisher-farmer neighbors with meat and labor in exchange for agricultural products and trade goods (cf. Milton 1984; Reid 1979, etc.). This patron-client relationship appears to have been in place for many generations, and has fostered almost 100% bilingualism in Tukano on the part of Hup speakers. This bilingualism is unreciprocated, however, due to the marked social inequality in the relationship. While Hup speakers do interact with speakers of other East Tukanoan languages besides Tukano, they use only Tukano – which is also the regional lingua franca – in all cases of such interaction I have observed. To date, there is no evidence that Tukanoan loanwords in Hup come from any East Tukanoan languages other than Tukano; however, given the paucity of data on these languages, this possibility cannot yet be discounted.

Although occasional marriages between a Hup woman and a Tukanoan man do take place, Hup speakers do not in general participate in the local system of linguistic exogamy, preferring to marry within their own language group (but across clans). However, the practice of exogamy is quite strictly adhered to by the members of the various East Tukanoan language groups (Tukano, Tuyuca, Desano, etc.), and by the speakers of Tariana (an Arawak language) within the Vaupés (although in recent years the linguistic basis of the system has been threatened by language shift). This practice has fostered, on one hand, an extensive region-wide multilingualism, but on the other, a clear equation between one’s own identity and one’s ‘father-language’ (since membership in language groups and clans passes through the male line), which in turn leads to a strong cultural prohibition against mixing languages. The Hupd’əh have internalized this cultural outlook, presumably as a result of their deep involvement in the regional cultural and economic systems and the relative social dominance of the Tukanoan and Arawak peoples, despite the fact that they do not themselves practice linguistic exogamy.

The linguistic outcome of this remarkable sociocultural situation is that the Hupd’əh, like their Tukanoan and Arawak neighbors (though perhaps not to the same degree; cf. Aikhenvald 2002; Jackson 1983), consciously avoid mixing elements of Tukano into their Hup, despite long-term and pervasive bilingualism. Code-switching into Tukano is constrained and generally limited to specific contexts, such as the discourse of spirits or animals in narratives, or descriptions of interactions with Tukanoan people (example 1).

(1)

#$n        pihi-tæ%n,      wetam!-tæ&#-æ&y
1SG.OBJ    call(T)-COND    help(T)-CNTRFCT-DYNM
#ãh-ãh,    #ãh    n'(-'(h
1SG-DECL   1SG    say-DECL

‘If they (Tukanos) call me, I should help, I say.’

The result of these attitudes has been that relatively few loanwords from Tukano have entered the Hup vocabulary (see §4). At the same time, however, less salient forms of language mixing have gone on virtually unchecked. Hup has undergone heavy diffusion of grammatical forms from Tukano, as well as considerable calquing and some phonological influence (see §6–7). This is very similar to the effects of Tukano on Tariana (Aikhenvald 2002, etc.), although in the case of Tariana contact was directly motivated by linguistic exogamy.

3.2. Contact with Portuguese

While virtually all Hup adults are bilingual in Tukano, fluency in Portuguese is very low, and interaction with non-Indians is limited. Most of the larger villages have one or two bilinguals, who are for the most part young adults who spent some time in the mission schools as children; in all, perhaps 4% of Hup adults are fluent enough in Portuguese (and confident enough) to use it in conversation with non-Indians, and a few more have some passive understanding and can use it for basic communication if they have to. Many children have learned some Portuguese words in the village schools – such as numerals and the names of unfamiliar animals pictured in schoolbooks (e.g. rabbits, lions, etc.) – but are otherwise generally unable to speak or understand Portuguese at all.

Nevertheless, there are many loanwords of Portuguese origin in the subdatabase. While almost all of these refer to previously unfamiliar (and therefore nameless) items or concepts (see §4), it is not fully clear how they entered the Hup vocabulary in the absence of widespread bilingualism. A few have probably entered via direct interaction with non-Indians (e.g. in trade contexts) or via the few Hup speakers who are themselves fluent in Portuguese. Others have come via Tukano speakers and other River Indians (almost all of whom speak some Portuguese), particularly the village schoolteachers and the Portuguese-language schoolbooks they teach from; however, many of these words (such as animal names like ‘lion’ and ‘shark’) are treated here as incipient borrowings (similar to what are elsewhere termed ‘nonce borrowings’) rather than true loanwords because they appear to be known primarily only to school-aged children or other subgroups of Hup speakers (see §5 for more discussion). It seems most likely that the majority of the Portuguese borrowings entered Hup via the Tukano language itself, but this is unfortunately not known in the majority of cases because information on borrowed words and even non-native entities is mostly lacking from the Tukano dictionary (Ramirez 1997; see §2).
However, a distinction between Portuguese as the immediate or original donor language may be somewhat unnecessary, since most Hup speakers are probably able to identify a Portuguese word used in Tukano discourse as Portuguese – by its unfamiliarity, its reference to a non-native entity, and its phonological form. They would thus have been aware that they were borrowing a Portuguese word even when encountering it via Tukano.

3.3. Contact with Nheengatú

Nheengatú, or Lingua Geral, is a form of Tupinambá (Tupian family) that was adopted by Jesuit missionaries as a lingua franca in the 17th century and imported to many parts of Brazil. It was probably never widespread in the Vaupés region itself, however, and there was never any significant bilingualism of Hup speakers in Nheengatú. However, its use as a contact language between Indians and non-Indians led to a few early borrowings from Nheengatú into Tukano and other regional languages, probably around the end of the 19th century (Aikhenvald 2002: 37). Some of these borrowings then made their way from Tukano to Hup (see §4).

3.4. Contact with Arawak

Interaction between Hupd’əh and Arawak peoples seems to have been limited. Ethnohistorical accounts (cf. Aikhenvald 2007: 239) mention groups of client “Makú” (likely Hup) people associated with patron villages of Tariana (the sole Arawak group within the Vaupés) in the past, but no such relationship exists today. There is no evidence that Hup speakers were ever bilingual in an Arawak language, as is reflected in the apparent lack of direct linguistic borrowing of any kind from Arawak; the few forms of probable Arawak origin in Hup (not included in this subdatabase) have all entered via Tukano.

4. Number and kinds of loanwords

4.1. General observations

Of the 1460 Loanword Typology meanings, 446 have no established equivalents in Hup (this includes incipient borrowings). Of the 981 Hup lexical items in the subdatabase, 130 (13%) are loanwords or possible loanwords. Out of this set of loanwords, 40 (31%) are of Tukano origin, 86 (67%) are Portuguese, and 2 (2%) are Nheengatú. The Tukano loanwords are the only forms for which loanword status is not fully established in all cases; of the 40 total, 5 are ranked at level 1 (very little evidence), 8 at level 2 (perhaps), 10 at level 3 (probably), and 17 at level 4 (clearly). One further possible loanword, ranked at level 1, is of unknown source. Because the immediate donor language is frequently unclear (particularly for the loans of Portuguese origin), these figures correspond to the original source of the loanwords in Hup, to the extent that this can be determined. Tukano was the immediate donor of the Nheengatú words to Hup; the same can be said for the very few words (loans and calques) of probable Arawak origin (e.g. kapi# ‘ayahuasca’, not in the core of the subdatabase). Many of the Portuguese words also passed through Tukano before reaching Hup, and some apparently followed an even more elaborate trajectory of Portuguese > Nheengatú > Tukano > Hup (see example 2 below). The overall loanword figures in Hup suggest some interesting observations. First, the lack of any direct borrowings from Nheengatú and Arawak languages is consistent with what is known about the degree of contact and bilingualism between Hup speakers and speakers of these languages (see §3). That Tukano was the immediate source for these words is also clear; Tukano shares almost identical forms, and the Hup loans from Nheengatú are a subset of the words of the same origin found in Tukano (some of the same Nheengatú borrowings are also found in Tariana; it is not clear whether they are native roots in Nheengatú or were borrowed from some other language). 
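The counts reported at the start of this section can be tallied as follows. This is only an arithmetic sketch using the figures given in the text; the variable and function names are illustrative, and shares computed to one decimal place may differ by a point from the rounded percentages quoted above.

```python
# Counts as reported in the text of this section.
TOTAL_ITEMS = 981   # Hup lexical items in the subdatabase
TOTAL_LOANS = 130   # loanwords or possible loanwords
LOANS_BY_ORIGIN = {"Tukano": 40, "Portuguese": 86, "Nheengatú": 2}
TUKANO_BY_CERTAINTY = {1: 5, 2: 8, 3: 10, 4: 17}  # certainty level -> count


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of all lexical items that are loanwords or possible loanwords.
loan_share = percent(TOTAL_LOANS, TOTAL_ITEMS)

# Share of the loanword set contributed by each original source language.
origin_shares = {o: percent(n, TOTAL_LOANS) for o, n in LOANS_BY_ORIGIN.items()}

# Share of all lexical items that are loans of Tukano origin.
tukano_share_of_items = percent(LOANS_BY_ORIGIN["Tukano"], TOTAL_ITEMS)

# The four certainty levels should account for every Tukano loan.
assert sum(TUKANO_BY_CERTAINTY.values()) == LOANS_BY_ORIGIN["Tukano"]

print(loan_share, origin_shares, tukano_share_of_items)
```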
On the other hand, the fact that not more than 4% of the Hup words in the subdatabase are identifiable as loans of Tukano origin is quite remarkable, given that Hup-Tukano interaction and bilingualism has been intense, long-term, and accompanied by social inequality and some level of linguistic insecurity on the part of Hup speakers. (But note that identification of Tukano loans is constrained by the available data; see §2.) From a cross-linguistic perspective, such a sociolinguistic context would be expected to promote heavy lexical borrowing (cf. Thomason & Kaufman 1988: 50; also compare Heath’s (1981) discussion of lexical borrowing in Arnhem Land, Australia); this has not occurred in Hup, although it has resulted in heavy borrowing of grammatical structures (see §6) and considerable calquing (§7). In the sociolinguistic context of the Vaupés, however, this resistance to lexical borrowing makes sense; regional attitudes, fostered by linguistic exogamy, condemn language mixing in general – even for the Hupd’əh, who do not practice linguistic exogamy but are nevertheless part of the local cultural system. Very similar contact effects are found in Tariana (e.g. Aikhenvald 2002).

In contrast, it is striking that borrowings of Portuguese origin outnumber loans of Tukano origin by more than two to one – despite the fact that bilingualism in Portuguese has always been minimal, and contact with non-Indians is recent and still very limited. This apparent paradox can be addressed on several points. First, not a few of the Portuguese borrowings reached the Hupd’əh via Tukano speakers, who have considerably more contact with non-Indians, and had borrowed the words into their Tukano. In the case of older, well integrated borrowings that were accommodated in Tukano and then borrowed into Hup (such as Portuguese lancha ‘boat’ > Tukano nasia > Hup náciya), Hup speakers may have considered these Tukano words at the time they were borrowed. There are perhaps ten such words among the 85 Portuguese borrowings in the subdatabase; of these, at least three (and probably more) also passed through Nheengatú before entering Tukano (example 2).2

(2)

Portuguese            Nheengatú   Tukano    Hup
bicho (‘creature’)    pisana      pisana    picána ‘cat’
limão                 wirimo      wirimo    wirim'" ‘lime’
sábado                sauru       sauru     cáuru ‘Saturday’

However, the majority of words of Portuguese origin are less well integrated (see §5), and it is probable that Hup speakers recognized most of them as Portuguese even when encountering them through Tukano (as discussed in §3.2 above). In fact, a relatively high number of Portuguese borrowings are found in all the languages of the Vaupés, including Tukano, Tariana, and others (Aikhenvald 2002, etc.). In all of these languages, Portuguese lexical borrowings were undoubtedly motivated in part simply by the need to label the incoming flood of new material objects. An additional motivation has certainly been the perception of non-Indians’ socially privileged status, monolingual norms, and of their position (for the most part) outside the regional marriage system. On the other hand, calquing and grammatical borrowing from Portuguese has been almost nonexistent (see §6 and §7).

2 The symbol c represents a palatal stop in Hup, which has the allophones [#], [$], and [s] in prevocalic position. The allophones [s] and [$] are typical in borrowed words, corresponding to Tukanoan and Portuguese s and [$]. Tukano orthographic ‘o’ is equivalent to Hup ‘%’.

4.2. Loanwords and semantic word class

The most striking difference between loanwords of Tukano and Portuguese origin is their part of speech. As Table 1 shows, Hup has borrowed roughly twice as many verbs as nouns from Tukano, but very few verbs from Portuguese.

Table 1: Loanwords in Hup by donor language and semantic word class (percentages)

                  Portuguese   Tukano   Total loanwords   Non-loanwords
Nouns             11.1         2.7      13.8              86.2
Verbs             1.5          6.8      8.3               91.7
Adjectives        -            1.3      1.3               98.7
Adverbs           -            -        0.0               100.0
Function words    14.7         2.0      16.6              83.4
all words         7.9          3.6      11.5              88.5

The high proportion of verbs among Tukano borrowings is typologically highly unusual. Cross-linguistic studies of lexical borrowing have shown verbs to be much less prone to borrowing than nouns, so much so that Moravcsik (1975: 111) even proposed that the borrowing of verbs (as verbs) was impossible (cf. Campbell 1993: 102–103; Wichmann & Wohlgemuth 2008). Explanations for the lower rate of verb borrowing cross-linguistically include the fact that verbs are less likely to correspond to culturally unfamiliar concepts than are nouns, and that verbs typically involve more bound morphology, which they must be both separated from in the donor language and accommodated to in the recipient language (Campbell 1993: 99; Heath 1978: 72; Weinreich 1953: 41).

What explains the favoring of verbs as loans in Hup? First, Hup speakers’ high level of bilingualism in Tukano allows them to easily pluck Tukano verbs from their morphosyntactic context. Second, and most importantly, Hup speakers are probably less resistant to the borrowing of verbs than they are to nouns because verbs can be smuggled into the language in long, morphosyntactically complex compounds, where they are less easily noticed (example 3; cf. Aikhenvald 2002: 224 for a similar phenomenon in Tariana).

(3)

n)          #ín      #an        d’o#-maca-yo#…
1SG.POSS    mother   1SG.OBJ    take-be.born/raised-SEQ
‘My mother having raised me…’ [Tukano borrowing maca- ‘be born/raised’]


Of the remaining Tukano loans in the subdatabase, one (‘top of, over’) is an adposition; another adposition ‘near to’ is possibly derived from a Tukano verb ‘approach’. The adjective ‘hot’ may also be borrowed, but occurs as a noun (‘hot season/year’) as well as an adjective in both languages.

In contrast to Tukano, the majority of Portuguese loans are nouns. The scarcity of borrowed Portuguese verbs is presumably due to the fact that actions are less likely than objects to be completely new (and thus nameless); Hup speakers’ general lack of fluency in Portuguese (making it more difficult for them to separate the verb root from the inflection) is undoubtedly also a factor. Portuguese loans in Hup also include the function words #o ‘or’ (from Portuguese ou ‘or’), te ‘until, up to’ (both temporal and spatial, from Portuguese até ‘until, up to’), and possibly the negative particle næ, which may be from Portuguese nem (the latter two are not included in the subdatabase). These loans are widespread in the Vaupés, occurring in Tukano (undoubtedly their source in Hup), Tariana, and various other East Tukanoan and Arawak languages; their prevalence is consistent with Matras’ (1998) observation that such “utterance modifiers” are particularly prone to borrowing cross-linguistically.

As for other lexical classes, Hup speakers favor borrowed Portuguese terms for numerals above five (and occasionally for terms below five, especially in reference to money). Hup speakers reportedly used some Nheengatú terms for days of the week in the past, but now use Portuguese terms for these (with the exception of ‘restday’ for ‘Sunday’, calqued from Tukano).

4.3. Loanwords and semantic field

In general, semantic domains in Hup corresponding to “basic vocabulary” tend to be relatively conservative. Nouns corresponding to such semantic domains as native flora and fauna, body parts, and kin terms tend to be cognate across the Nadahup family, showing relatively little replacement, although they are not immune to borrowing. Hup lexical innovation in general appears to have been motivated largely by need for a new word to accompany a new concept. Domains relating to agriculture and ritual practice (concepts shared by all local indigenous groups) are quite innovative, including loans and calques in addition to native coinages, suggesting that the Hupd’əh acquired many of these practices from their neighbors. Words relating to material items and other concepts associated with non-Indians have a particularly high percentage of loans, mostly from Portuguese, but also including the two loans of apparent Nheengatú origin (‘study’ and ‘pray’, which appear to derive from the same Nheengatú root).

Of the loanwords of Tukano origin in Hup, the relatively few nouns are a semantically mixed group. The two terms for humans or kin (‘infant’, ‘sibling’) are perhaps most likely to be core borrowings. Many of the others are good candidates for “cultural borrowings” (cf. Myers-Scotton 2002: 239), that is, words that correspond to material items or technological concepts that were likely introduced to the Hupd’əh by their River Indian neighbors. These include agricultural terms such as those in (4):

(4)  Hup                   Tukano
     can$                  s!rá ‘pineapple’
     b’$# ‘manioc bread’   ba#a ‘eat’

Borrowings or possible borrowings also include words for (early) European trade goods like ‘clothes’, ‘thread’, and ‘kerchief’. The loanword má ‘large waterway’ may have been motivated by the fact that the Tukanoans are settled river-dwellers, while the Hupd’əh are nomadic forest people; similarly, it is possible that ‘rat’ was borrowed because of its association with a more settled context. The loan of kedó ‘firefly’ has a likely explanation in a popular children’s game, which involves calling ‘kero, kero’ to entice fireflies to land nearby.

There is little evidence that any of the loan verbs of Tukano origin are “cultural borrowings”, although a few (such as ‘whistle with the fingers’) may be. They do not belong to any clear semantic domain, as the examples in (5) illustrate.

(5)  Hup      Tukano
     maca-    masá ‘be born, come to senses, grow’
     co-      soó ‘rest, relax’
     m'næ-    moné ‘mix together’
     du#-     du#ú ‘drop, let fall’
     yo-      yoó ‘carry dangling from hand’

Almost all loans of Portuguese origin (of which many may have been borrowed through Tukano; see §4) refer to new or previously unfamiliar concepts. Examples of the replacement of a native term by a Portuguese borrowing are limited to numerals, and of these consistently only six through twenty, for which the native terms were no more than semi-lexicalized Tukano calques to begin with. The majority of loans of Portuguese origin correspond to previously unfamiliar material items, such as ‘button’, ‘soap’, ‘plate’, ‘ice’, and ‘calendar’; others refer to non-native foods, such as ‘bean’, ‘rice’, and ‘soup’, and to animals such as ‘domesticated duck’ and ‘horse’. Other loans correspond to temporal concepts like ‘hour’, ‘week’, and ‘age (in years)’. Borrowed Portuguese verbs in the subdatabase are limited to ‘kiss’, ‘win (a game/contest)’, ‘fry in oil’, and ‘paint/color in’ (which border on incipient borrowings, see §5.1); all are culturally imported concepts.

Table 2: Loanwords in Hup by donor language and semantic field (percentages)

                                      Portuguese  Tukano  Total loanwords  Nonloanwords
 1  The physical world                    2.1       2.1         4.2           95.8
 2  Kinship                                –        2.1         2.1           97.9
 3  Animals                               2.7       5.4         8.0           92.0
 4  The body                               –        1.4         1.4           98.6
 5  Food and drink                       17.1       1.6        18.7           81.3
 6  Clothing and grooming                25.0       5.6        30.6           69.4
 7  The house                            22.0       1.6        23.6           76.4
 8  Agriculture and vegetation           11.0       2.8        13.8           86.2
 9  Basic actions and technology          7.9       1.6         9.5           90.5
10  Motion                                1.6       5.6         7.2           92.8
11  Possession                            4.2      12.6        16.8           83.2
12  Spatial relations                     3.7       6.5        10.2           89.8
13  Quantity                             34.6        –         34.6           65.4
14  Time                                 17.8       2.2        20.0           80.0
15  Sense perception                       –        2.6         2.6           97.4
16  Emotions and values                   6.3        –          6.2           93.8
17  Cognition                              –        6.5         6.5           93.5
18  Speech and language                   6.9       6.9        13.7           86.3
19  Social and political relations        4.8        –          4.8           95.2
20  Warfare and hunting                   9.0       2.3        11.3           88.7
21  Law                                  33.3        –         33.3           66.7
22  Religion and belief                    –       25.4        25.4           74.6
23  Modern world                         26.2       7.5        33.6           66.4
24  Miscellaneous function words           –        9.5         9.5           90.5
    all words                             7.9       3.6        11.5           88.5

5. Integration of loanwords

Loanwords in Hup have undergone varying degrees of phonological and morphological adaptation, with some differences depending on their original source language.

5.1. Integrating loans of Tukano origin

Loanwords of Tukano origin can be divided into relatively older and more recently borrowed sets, depending largely on their degree of integration into Hup. Hup has virtually no regular Tukano code-switches or “nonce” type borrowings, probably due to constraints against language mixing and to the lack of concepts (of native origin) that are not equally familiar to both Tukano and Hup speakers (thus limiting contemporary “cultural borrowings”).


Some examples of probable older loans are given in (6) below. These words have in general been truncated to one syllable to fit the preferred Hup morpheme structure, although the CV structure that frequently results is relatively rare for Hup morphemes generally (CVC is the most common pattern). Speakers tend to consider these older loans to be native Hup words, and exhibit little variation in their use. No coexisting native Hup synonym exists for most of these, although the concept presumably predated the loanword in most cases (thus the native word was either lost or underwent semantic shift). In some cases, the form and/or semantics of the Hup word and its Tukano counterpart do not match exactly, presumably due to later changes as the loan was integrated (but also calling into question the word’s status as a loan, see §2 above).

(6)  Hup     Tukano
     b"#     bi#i ‘rat’
     du#-    du#ú ‘drop, let fall’
     yo-     yoó ‘carry dangling from hand’
     n)-     n)ro ‘keep, look after’

Obviously newer Tukano loans, such as those in (7), are fewer (approximately 20% of Tukano-origin loanwords). Evidence for their more recent adoption includes their lower degree of phonological integration; they typically involve little or no change from the Tukano source word, and the most obvious examples are composed of two syllables with different vowels (whereas native Hup words of two syllables almost always have identical vowels). These may be identified as Tukano borrowings by speakers, and sometimes coexist with a native Hup word as a synonym or hyponym; for example, two of the words in (7) have the native counterparts hikaku- ‘mix together’ and huh*y ‘firefly’. Their use frequently varies across speakers and/or dialect areas. Of the words in (7), m'næ- ‘mix together’ is standard in one region, but labeled Tukano in another; kedó ‘firefly’ is said by some to be a synonym of huh*y, while others say it refers to one type of firefly, and huh*y to another (i.e. they are hyponyms).

(7)  Hup      Tukano
     m'næ-    mone ‘mix together’
     kedó     kero ‘firefly’
     h'&w+    hõwe ‘infant’

Borrowed roots are easily incorporated into Hup morphosyntax, regardless of their degree of integration; loan verbs are always accommodated via a ‘direct insertion’ strategy (Wichmann & Wohlgemuth 2005: 7), as are the rare code-switches (examples 1 and 3 above; cf. 12 below for Portuguese).

As discussed above (§1) speakers tend to be highly aware of language mixing, and attitudes toward lexical borrowing from Tukano are largely unfavorable. Loans like those in (7), which are recognized by many as Tukano but widely used, seem on the whole to be accepted; however, where variation exists among dialect areas or speakers, those who do not use the loanword may be mildly critical of those who do (saying, for example, ‘hmm, that’s a Tukano word; we use the Hup word’) – as in the case of m'næ- ‘mix together’, which has the native Hup counterpart hikaku-. There is one known example of a folk etymology involving a place name borrowed from Tukano: Tukano yuyu sá ‘ritual.instrument creek’ has become Hup y)y)w deh ‘ant sp. creek’.

5.2. Integrating loans of Portuguese origin

Unlike Tukano borrowings, which include many nativized loans, fewer less-integrated loans, and almost no regular code-switches, loans of Portuguese origin form a continuum ranging from a few fully nativized loans to many marginally integrated loans to incipient borrowings. The handful of well-integrated loans from Portuguese have virtually all entered Hup via Tukano, which shares nearly identical forms. These loans are in common use among Hup speakers; they tend to have undergone some accommodation to Hup phonological requirements, and to have fixed pronunciations regardless of the speaker’s degree of fluency in Portuguese. Several (e.g. nasiya ‘boat’) may be recognized as Tukano borrowings, but may not be considered to be of Portuguese origin even by speakers of Portuguese. Examples are given in (8).

(8)  Hup       gloss       Portuguese
     náciya    ‘boat’      lancha
     cudáda    ‘soldier’   soldado
     peyãw     ‘bean’      feijão
     y!du      ‘money’     dinheiro

This nativized set blends into the semi- and non-nativized loans of Portuguese origin (example 9). These are more easily recognized as coming originally from Portuguese, and in some cases are not quite as widely used.

(9)  Hup         gloss     Portuguese
     cemána      ‘week’    semana
     gaña        ‘win’     ganhar
     cabonéci    ‘soap’    sabonete

This set blends in turn into the set of incipient borrowings from Portuguese, such as those in (10). These are in general recently encountered and not in widespread use; they are clearly recognized as Portuguese and fill an obvious lexical gap. Their pronunciation tends to vary with the speaker’s Portuguese competence; bilingual speakers will tend to pronounce them more as they are pronounced in Portuguese.


(10) Hup       gloss      Portuguese
     koéyu     ‘rabbit’   coelho
     káma      ‘bed’      cama
     #úrcu     ‘bear’     urso

While not in common use, these incipient borrowings are not necessarily restricted to bilingual speakers, but rather to those who have encountered the concept and its label through other speakers of Hup or Tukano; for example, many children are familiar with the Portuguese word ‘rabbit’ from pictures in schoolbooks, but many of their grandparents are not.

Portuguese also enters the language through personal names. Virtually all Hup people today have both a Hup and a Portuguese name, which is adapted phonologically (more or less identically in Hup and the local Tukanoan languages), yielding such names as Mandu (Manuel), Mingu (Domingo), Pedu (Pedro), Ciri (Silvina), Tede (Teresa), etc. Dogs are always named in Portuguese, using words for objects and animals that are not in general widely known among Hup speakers, as example (11) illustrates. This practice is presumably motivated by the association of dogs (which are not native) with the non-Indian world.

(11) Hup dog name   Portuguese     gloss
     Kupí           Cupim          ‘termite’
     Tuberãw        Tubarão        ‘shark’
     Moto-céha      Motor-serra    ‘chainsaw’
     Badánka        Branca         ‘white’

As with Tukano loans, words of Portuguese origin – whether nativized loans or incipient borrowings – are easily incorporated into Hup morphosyntax. Verbs are inserted directly into the Hup verb template (example 12). The form of the Portuguese verb appears to be the infinitive minus -r; it could also be interpreted as the third person singular form with a shift of stress to the second syllable, but this is less appealing given that such a stress shift does not occur with borrowed nouns, as in (13).

(12) #*y    gañá-a#?
     who    win-INT
     ‘Who won?’ (Portuguese ganhar ‘to win’)

(13) méca-át     wób-óy
     table-OBL   rest.on-DYNM
     ‘(something is) resting on the table’ (Portuguese mesa ‘table’)

Although Hup speakers accept Portuguese loans more readily than they do Tukano loans, they have often favored native neologism over direct borrowing (see §7). In many other cases, doublets exist, where one term is a loan and the other a native creation:

(14) Hup native word                  loanword   Portuguese source
     #!%g-b’'k [drink-dish]           kópu       copo ‘cup’
     wæ%d-b’ah [eat-FLAT.THING]       koyéw      colher ‘spoon’
     cák-ap-teg [climb-DEP-THING]     cikáda     escada ‘ladder’

In still other cases, borrowed Portuguese nouns are integrated via combination with a Hup class term, as in (15). This strategy affects both relatively well-integrated borrowings and nonce-like words; the class term is frequently optional. The strategy is particularly intriguing because such class terms or classifying nouns are otherwise very infrequent in Hup, and some semantic extensions have even been driven by new types of entity imported from the non-Indian world (e.g. ‘leaf’ > ‘paper, book’). In fact, this ‘loanblend’ strategy seems to be giving rise to a new system of noun classifiers (see Epps 2007a), motivated both by the vast number of new items requiring names that have entered Hup life in the past few decades, and by the desire to give the new terms a Hup identity, rather than to just borrow hundreds of words indiscriminately.

(15) Hup            gloss                    meaning           Portuguese
     bóda-tat       [ball-ROUND.THING]       ‘ball’            bola ‘ball’
     d*c-tat        [light-ROUND.THING]      ‘lightbulb’       luz ‘light’
     pídiya-w)g     [battery-SMALL.ROUND]    ‘small battery’   pilha ‘battery’
     tábwa-b’ah     [board-FLAT.THING]       ‘board’           tábua ‘board’
     motúru-teg     [motor-TREE/THING]       ‘motor boat’      motor ‘motor’

Finally, a few Portuguese loans are accommodated via folk etymology (sometimes with additional phonological adjustments). In the first two cases, these accommodations are recent and are still treated as jokes, while in the last (‘Colombian’) the folk etymology seems to be older and more conventionalized:

(16) y*y < Portuguese juiz ‘judge’ (lit. ‘ant sp.’)
     h'wæd-nuh-dó < Portuguese governador ‘governor’ (lit. ‘thirsty red head’)
     kod’,b-d’!h ‘Colombian’ (k!d-d’ob-d’!h [pass-go.to.river-PL], lit. ‘those who go down to the river quickly’)

6. Grammatical borrowing

Grammatical borrowing from Tukanoan has deeply affected Hup, but is limited almost entirely to the borrowing of structures rather than forms. This is undoubtedly due to the relative salience of form to speakers, whereas they tend to be much less conscious of categories and patterns. A very similar state of affairs is described for Tariana, whose speakers have experienced long-term bilingualism in Tukano and cultural constraints against language mixing much like those experienced by Hup speakers (Aikhenvald 2002, etc.).

The profound Tukanoan influence is probably responsible for Hup’s verb-final word order, its system of object marking according to the animacy and definiteness of nominal referents, its incipient noun classification system (which organizes inanimates according to shape, just as Tukano does), its evidential system (which has been augmented from one marker to four, with categories that parallel those in Tukanoan languages almost exactly), and many other features of Hup grammar. These effects of areal diffusion are discussed at length in Epps (2007b, 2008b).

Hup bears virtually no trace of grammatical borrowing from Portuguese, reflecting the low intensity and short duration of contact, and the general lack of bilingualism.

7. Other strategies for neologism

The reluctance to mix languages has fostered the productivity of neologism strategies other than lexical borrowing. In particular, Tukanoan influence on the Hup lexicon has included at least as much calquing and loan translation as the borrowing of lexical forms. For example, in both Hup and Tukano ‘moon’ and ‘sun’ are the same word, as are ‘deer’ and ‘manioc-processing tripod’; both languages use the expressions ‘star-saliva’ for ‘dew’, ‘fire-person’ for ‘non-Indian’, ‘have sibling/companion’ for the numeral four, and ‘Bone-Son’ for the principal deity or culture hero. Many of the same calques are widespread throughout the region (e.g. in Tariana; Aikhenvald 2002: 228–30).

Hup speakers also create new words entirely from their own linguistic resources. This may involve adapting the semantics of an existing Hup word; for example, one dialect of Hup calls beans s)(s)(b’ ‘flies’ instead of the borrowed peyã(w (from Portuguese feijão). Another extremely productive approach is to derive a new lexical item from the combination of a verb stem (or phrase) and a bound classifying noun (much like the loanword + classifying noun strategy in example 15 above):

(17) pæ&y-ca#               [thunder/electricity-box]   ‘boat/car battery’
     h)%#-g’æt              [write-LEAF]                ‘notebook’
     h'&-tat                [burn-ROUND.THING]          ‘lightbulb’
     #)d d’!hd’!hhám=teg    [speech send-THING]         ‘telephone’

8. Conclusion

The Vaupés region presents a unique sociolinguistic situation of language contact. Hup speakers have experienced long-term, heavy bilingualism in Tukano, coupled with social inequality, but outright language shift has been prevented by a regional ideology that tightly links language and identity. The linguistic outcome has been similarly unusual: despite decades of bilingualism, Hup has undergone relatively little lexical borrowing from Tukano, and what has occurred has involved far more verbs than nouns. Conversely, calquing and grammatical borrowing from Tukano have been heavy. At the same time, an additional layer of contact with Portuguese has involved much less bilingualism but somewhat more relaxed attitudes toward language mixing, yielding almost the opposite effect: more and freer lexical borrowing, involving mostly nouns, but little or no calquing or grammatical borrowing.

Acknowledgments

Support from Fulbright-Hays, NSF (Grant no. 0111550), and MPI EvA, Leipzig is gratefully acknowledged. Many thanks go to my Hupd’əh hosts and language teachers, as well as to the Museu Paraense Emílio Goeldi and the Instituto Socioambiental in Brazil for practical assistance with fieldwork.

Special Abbreviations

CNTRFCT   Counterfactual
DYNM      Dynamic
INT       Interrogative
SEQ       Sequential

References

Aikhenvald, Alexandra Y. 1999. Areal diffusion and language contact in the Içana-Vaupés basin, north-west Amazonia. In Dixon, R. M. W. & Aikhenvald, Alexandra Y. (eds.), The Amazonian Languages, 385–416. Cambridge: Cambridge University Press.
Aikhenvald, Alexandra Y. 2001. Dicionário Tariana-Português e Português-Tariana [Tariana-Portuguese and Portuguese-Tariana dictionary]. (Boletim do Museu Goeldi 17.1). Belém: Museu Goeldi.
Aikhenvald, Alexandra Y. 2002. Language Contact in Amazonia. Oxford/New York: Oxford University Press.
Aleman, Túlio & Lopez, Reinaldo & Miller, Marion. 2000. Diccionario Desano-Español [Desano-Spanish dictionary]. Bogotá, Colombia: Editorial Buena Semilla.
Cabalzar, Aloisio & Ricardo, Carlos Alberto. 1998. Povos Indígenas do Alto e Medio Rio Negro. AM: FOIRN – Federação das Organizações Indígenas do Rio Negro. São Paulo: Instituto Socioambiental, São Gabriel da Cachoeira, AM: Federação das Organizações Indígenas do Rio Negro (FOIRN).
Campbell, Lyle. 1993. On proposed universals of grammatical borrowing. In Aertsen, Henk & Jeffers, Robert J. (eds.), Historical Linguistics 1989: Papers from the 9th International Conference, 91–109. Amsterdam: Benjamins.
Epps, Patience. 2005. Areal diffusion and the development of evidentiality: Evidence from Hup. Studies in Language 29(3):617–650.
Epps, Patience. 2007a. Birth of a noun classification system: The case of Hup. In Wetzels, L. (ed.), Language Endangerment and Endangered Languages: Linguistic and anthropological studies with special emphasis on the languages and cultures of the Andean-Amazonian border area (Indigenous Languages of Latin America series = ILLA), 107–128. Leiden, The Netherlands: Publications of the Research School of Asian, African, and Amerindian Studies = CNWS, Leiden University.
Epps, Patience. 2007b. The Vaupés melting pot: Tukanoan influence on Hup. In Aikhenvald, Alexandra & Dixon, R. M. W. (eds.), Grammars in Contact: A cross-linguistic typology (Explorations in Linguistic Typology 4), 267–289. Oxford: Oxford University Press.
Epps, Patience. 2008a. A Grammar of Hup. (Mouton Grammar Library 43). Berlin: Mouton de Gruyter.
Epps, Patience. 2008b. Hup. In Matras, Yaron & Sakel, Jeanette (eds.), Grammatical Borrowing in Cross-Linguistic Perspective, 551–566.
Grenand, Françoise & Ferreira, Epaminondas Henrique. 1989. Pequeno Dicionário da Língua Geral. (Série Amazônas, Cultura Regional 6). Manaus: SEDUC.
Heath, Jeffrey. 1978. Linguistic Diffusion in Arnhem Land. (Australian Aboriginal Studies Research and Regional Studies, 13). Canberra: Australian Institute of Aboriginal Studies.
Heath, Jeffrey. 1981. A case of intensive lexical diffusion: Arnhem Land, Australia. Language 57(2):335–367.
Huber, Randal Q. & Reed, Robert B. 1992. Vocabulario Comparativo: Palabras selectas de lenguas indígenas de Colombia. Bogotá: Asociación Lingüístico de Verano.
Jackson, Jean. 1983. The Fish People: Linguistic Exogamy and Tukanoan Identity in Northwest Amazonia. Cambridge: Cambridge University Press.
Koch-Grünberg, Theodore. 1906. Die Indianer-Stämme am oberen Rio Negro und Yapurá und ihre sprachliche Zugehörigkeit [The Indian tribes on the upper Rio Negro and their linguistic affiliation]. Zeitschrift für Ethnologie 38:167–205.
Martins, Silvana A. 2004. Fonologia e Gramática Dâw [Dâw phonology and grammar]. PhD Dissertation, University of Amsterdam. Amsterdam: LOT.
Martins, Valteir (ed.). 1999. Dicionário Nadëb-Português [Nadëb-Portuguese dictionary]. Unpublished manuscript.
Martins, Valteir. 1994. Análise Prosódica da Língua Dâw (Makú-Kamã) numa Perspectiva Não-linear. Master’s thesis. Santa Catarina: UFSC.
Martins, Valteir. 2005. Reconstrução Fonológica do Protomaku Oriental. Ph.D. thesis. Amsterdam: Vrije Universiteit.
Matras, Yaron. 1998. Utterance modifiers and universals of grammatical borrowing. Linguistics 36(2):281–331.
Milton, Katherine. 1984. Protein and carbohydrate resources of the Maku Indians of northwestern Amazonia. American Anthropologist 86:7–27.
Moravcsik, Edith. 1975. Borrowed verbs. Wiener Linguistische Gazette 8:3–30.
Myers-Scotton, Carol. 2002. Contact linguistics: Bilingual encounters and grammatical outcomes. Oxford: Oxford University Press.
Nimuendajú, Curt. 1982. Textos Indigenistas. São Paulo: Edições Loyola.
Ospina Bozzi, Ana Maria. 2002. Les structures élémentaires du Yuhup Makú, langue de l’Amazonie Colombienne: morphologie et syntaxe [The basic structures of Yuhup Makú, a language of the Colombian Amazonia: Morphology and syntax]. Ph.D. thesis. Paris: Université Paris 7, Denis Diderot.
Ramirez, Henri. 1997. A Fala Tukano dos Ye’pa-Masa. Vol. 2: Dicionário. Manaus: Inspetoria Salesiana Missionária da Amazônia, CEDEM.
Ramirez, Henri. 2001a. Dicionário Baniwa-Português [Baniwa-Portuguese dictionary]. Manaus: Editora da Universidade do Amazonas.
Ramirez, Henri. 2001b. Família Makú ou família Uaupés-Japura? [Makú family or Vaupés-Japura family?] Paper presented at the meeting of ANPOLL. Belém, Brazil: Encontro da ANPOLL.
Ramirez, Henri. 2006. A Língua dos Hupd'äh do Alto Rio Negro: Dicionário e Guia de Conversação. 1st edn. São Paulo: SSL/Libro.
Reid, Howard. 1979. Some Aspects of Movement, Growth and Change Among the Hupdu Maku Indians of Brazil. Ph.D. dissertation. University of Cambridge.
Thomason, S. & Kaufman, T. 1988. Language Contact, Creolization, and Genetic Linguistics. (Publications of the Linguistic Circle of New York 1 1). Amsterdam: Benjamins.
Weinreich, Uriel. 1953. Languages in Contact: Findings and Problems. (Publications of the Linguistic Circle of New York 1). 9th printing. New York: The Hague: Mouton.
Weir, E. M. Helen. 1984. A Negação e outros Tópicos da Gramática Nadëb. Master’s thesis. Campinas: UNICAMP.
Wichmann, Søren & Wohlgemuth, Jan. 2008. Loan verbs in a typological perspective. In Stolz, Thomas & Palomo, Rosa & Bakker, Dik (eds.), Aspects of Language Contact, 89–121. Berlin: Mouton de Gruyter.

Loanword Appendix

Portuguese
cédu              ice
kawádu            horse
pátu              domesticated duck
picána            cat
kópu              cup
pãw               bread
cópa              soup
vedúra            vegetables, greens
peyãw             beans
#acúka            sugar
cervéca           beer
méya              sock
candária          sandal
bóta              boot
bóca              pocket, backpack
mutãw             button
toáya             towel
cabonéci          soap
cáwi              key
cikáda            ladder
káma              bed
méca-b’ah         table
wéda              candle, wax
tábwa-b’ah        board
pá                shovel, spade
#arócu            rice
kóku              coconut
naránya           orange
barutéru          hammer
p!régu            nail
cúmbu             lead
pinta-            to paint
nácia             boat
báwca             raft
yénu              money
kwádru            square
bóda              ball
céc               six
céci              seven
#óytu             eight
nówi              nine
déc               ten
#õci              eleven
dóci              twelve
k.ci              fifteen
v.(ci             twenty
c!                hundred
míw               thousand
#óra              time of day, hour
#idáci            age
camána            week
cegúnda-wag       Monday
téca-wag          Tuesday
kwáta-wag         Wednesday
kínta-wag         Thursday
cécta-wag         Friday
cáuru             Saturday
beca-             to kiss
#ó                or
dápi              pen
kapitãw           village leader
cudáda            soldier
gaña-             to win (a game)
y*y               judge
p!récu (m'y)      prison
p’$y              priest
tedevicãw         television
pídiya (w)g)      battery
bacíya            basin
h'wædnuhdó        governor
podíciya          police
karendário        calendar
númeru            number
motúdu-tat        motor
kapé              coffee
prita-            to fry
padátu            plate, dish
koyéw             spoon
déti              powdered milk (in a package)
capátu            shoe
boné              cap
kudúbi            community building
wirim'(           lime
kúpa              fault, blame
de-               to read aloud
wídriu            glass

Tukano
h'&w+             infant
cadak$#           chicken
b"#               rat
kedó              firefly
macã-             to be born
co-               to rest
yutá              thread
n+c/-b’ah         kerchief
du#-              to drop, to miss (a target)
yo-               to carry dangling from hand
tuwidud’o#næbuycó#       to push; to give back; to buy, to sell, to trade; to join; top, upper part, above
puwiçkuku#dohnihidoho-   to get wet, to be wet; to whistle (with fingers); to stammer, to stutter; to make harmful spell; to exist, to be at, to dwell; to change, to transform (often magically), to become
má                river, stream (usually relatively large)
m'næ-             to mix together
kukú              permanent knot, bump on tree
n)-               to have temporarily, to safeguard, to have second spouse after first spouse dies
d)#-              to be left over, to be the remainder

Nheengatu
b’oy-             to learn, to teach, to study
yumb’oy-          to pray, to worship

Chapter 40

Loanwords in Wichí, a Mataco-Mataguayan language of Argentina*

Alejandra Vidal and Verónica Nercesian

* The subdatabase of the World Loanword Database that accompanies this chapter is available online at http://wold.livingsources.org. It is a separate electronic publication that should be cited as: Vidal, Alejandra. 2009. Wichí vocabulary. In Haspelmath, Martin & Tadmor, Uri (eds.) World Loanword Database. Munich: Max Planck Digital Library, 1187 entries.

1. The language and its speakers

Wichí has approximately 40,000 speakers in Argentina and Bolivia. In Argentina, it is spoken in the western and central parts of the provinces of Salta and Formosa, and in the northeastern part of the province of Chaco. In Bolivia, it is spoken in Tarija County. The data for this study are based on fieldwork on the Bermejo dialect, which has approximately 3,000 speakers in the province of Formosa, Argentina.

All of the communities in Formosa are organized as Civil Associations with legal status. In addition to hunting and gathering activities, they have also developed textile weaving, pottery, and, to a lesser extent, ranching and some degree of farming. There is an ongoing tendency for families in rural communities and their urban relatives to migrate. This has to do, among other reasons, with the search for part-time jobs, the sale of their crafts, the need for health care services, the completion of administrative procedures, and the payment of subsidies. Some of the elderly and the heads of families receive government pensions, and a small group (3%) is employed by the state.

An alternative name for Wichí that was current until recently is Mataco. The Wichí language belongs to the Mataco-Mataguayan family, spoken in the region known as Chaco. The other languages of the family are Maká, Chorote and Nivaklé. The Chaco region is located in the South American Lowlands and includes the great woody plain bounded on the west and southwest by the Andes and the Salado River basin, on the east by the Paraguay and Paraná Rivers, and on the north by the Moxos and Chiquitos Plains. This vast region spans 1,000,000 square kilometers across western Paraguay, eastern Bolivia, northeastern Argentina and a small portion of Brazil.

The Wichí language exhibits several dialects. Tovar (1964) mentions the existence of two dialects in the province of Salta, Argentina (Vejoz and Guisnhay) and a third in Bolivia (Noctén, also called Weenhayek). Gerzenstein (2003) introduced a slightly different dialect division, though she kept the number of Wichí dialects at three. She grouped the Guisnhay and Vejoz dialects of Tovar’s study together as Salteño (spoken in eastern Salta, Argentina), added a linguistic variety that was not recognized in Tovar’s classification, the Bermejo (also called Teuco) dialect (spoken in Formosa and Chaco, Argentina), and retained Noctén (in Tarija, Bolivia) as the third dialect. Geographically, the Guisnhay dialect can be found in the cities of Embarcación and Tartagal, east of Salta and west of Formosa (district of Ramón Lista). The Bermejo (or Teuco) dialect is spoken by the communities in the district of Rivadavia (Salta), on the Bermejo riverbanks (in Chaco and Formosa) and along National Route 81 in Formosa going from Pozo del Tigre to Laguna Yema. Finally, the Noctén dialect is spoken between the mouth of the Bermejo River and Parallel 64 in Bolivia.

Map 1: Geographical setting of Wichí

Lexical, morphological, and phonological differences exist among these varieties, which the speakers themselves can identify. The division into dialects, however, is not so clear-cut. As the lexical and grammatical elements of one variety can also be found in another, it would be necessary to establish isoglosses within the three major dialects. The Bermejo dialect, for example, is spoken in 25 Wichí communities (Braunstein & Dell’Arciprete 1997) distributed in three Argentinian provinces (Salta, Chaco and Formosa). However, there are a few linguistic differences among the speakers of this dialect. In Rivadavia county (Salta), for instance, the clausal negation –hit’e can be interrupted (hi…t’e) by other verbal suffixes like the directional or object markers (Terraza 2005). Our data, which were collected in the Pozo del Tigre, Las Lomitas, and Bazán (Formosa) communities, show that the same clausal negation morpheme has a formally different shape, ha-……-hi. Likewise, we have detected phonological differences: the voiceless palatalized velar stop of the Rivadavia variety corresponds to the voiceless palatal affricate of the Formosa variety.

Of the languages spoken in the Chaco region, Wichí is the language with the greatest number of speakers, along with Toba. The degree of vitality of the Wichí language, however, differs from one region to another. While speakers in the province of Formosa are Wichí-dominant, those in the district of Rivadavia (Salta) are Spanish-dominant (Terraza 2002). In general, we have witnessed a growing tendency toward bilingualism, accompanied nowadays by an effort to maintain the native language. The language is transmitted across generations and is spoken vigorously at the community and family levels. It is also used as a means of communication on local radio broadcasts that are produced and anchored by the speakers themselves. Wichí is not used as a means of communication by non-Wichí persons.

Wichí has been taught in Wichí schools as part of the bilingual intercultural education program implemented since 1984. However, the government has not developed key bilingual education programs or curricula for the Wichí communities.
For that reason, the program is rather ineffective for its lack of scope or sequence in Wichí instruction and its deficiency of didactic materials. The most widely used writing system was created by Anglican missionaries. In 1937, Richard Hunt, an Anglican preacher, developed the first Wichí alphabet. To represent certain special sounds in Wichí, they used a combination of specific letters, for example th, to the voiceless lateral fricative sound. In 1998, the Anglican missionaries introduced some modifications into the alphabet so that each letter is now associated with a phoneme. In our transcriptions, we use the modified version of the alphabet. Once the first alphabet of the Wichí language had been developed, the Anglicans translated the bible into Wichí and promoted reading and writing for the Wichí speakers to be able to access biblical texts. The Anglicans’ first contact with the Wichí was in the province of Salta and can th be traced back to the beginning of the 20 century. The first written materials belong to a dialect of that area (Salteño or Guisnhay, according to the dialect classification presented above). The emergence of written texts and of literacy in Wichí in the area of Bermejo, Formosa appeared later on in the 1980s. Some Wichí speakers of different varieties were trained as Anglican preachers and educated by

Alejandra Vidal and Verónica Nercesian

the Anglican missionaries in Salta. They fostered literacy in Wichí years later. This is, for instance, the case of Francisco López who, having trained as an Anglican preacher in Salta, conducted teaching tasks in his own community in the province of Formosa. Part of Francisco López’s personal project was supported and financed by the DOBES Project “Chaco Languages” (2002–2005). Sustained contact with nearby society took longer – from the early 20th century – and for this reason, Wichí was used as the sole means of communication among these peoples. The first published linguistic texts (vocabularies, grammars) date back to the late 19th century and continued to appear until the first half of the 20th century. These were undertaken by missionaries and by European travelers who had been hired by the Instituto Geográfico Militar Argentino to explore the area. Their work did not lead to the production of reference grammars and/or dictionaries and, hence, linguistic documentation is still work in progress.

2. Sources of data

Given that no prior studies exist on contact between Wichí and Spanish, and/or other languages, the data for this chapter had to be specially compiled. We collected the information in the communities of Tres Pozos (Bazán), and Lote 27 (Las Lomitas), Formosa, with young and adult native speakers. We complemented the corpus with other data elicited on prior occasions in these communities and in Lakhawichí (Pozo del Tigre) in the province of Formosa and the Sauzalito neighborhood in the province of Chaco. Other sources for this paper are: Pelleschi (1886, 1897), Remedi (1890), Massei (1895), D’Orbigny (1896) and Hunt (1913, 1937, 1940).

2.1. Nineteenth century sources (1850–1900)

Pelleschi (1886) is a traveler’s diary describing his experiences across the Bermejo River from east to west. The author records the characteristics and particularities of two indigenous groups that inhabited the zone: the Tobas in Chaco and the Wichí from northeastern Chaco (on the border of Salta) and in eastern Salta. The book contains descriptions of diverse aspects of the culture of each ethnic group and of the relationships between the peoples. The author offers his impressions of different aspects of the grammar and phonology of the Wichí language (which he calls “Mataco”), and includes the systematization of lexical and grammatical categories, together with examples in Wichí, and the comparison of its phonological system with Spanish, with Toba, and at times, with Italian. Pelleschi’s second book (1897) contains a grammatical description of Wichí, and references to the phonological adaptations of Spanish loanwords. These two books constitute the most important grammatical account of the Wichí language written during the second half of the 19th century.

40. Loanwords in Wichí


Remedi was a Franciscan missionary from the Colegio Apostólico de Salta. His book (Remedi 1890) includes brief comments on some aspects of the language: sounds, nouns, verbs and adjective classes, and subject and possessive pronominal paradigms. It also contains an appendix with a short list of Wichí-Spanish words. Inocencio Massei published a series of notes on the Noctén variety of Wichí as a contribution to the dialectal documentation undertaken by his contemporaries (Massei 1895). The author includes a grammatical appraisal of the Noctén variety of Wichí together with an analysis of the noun and verb categories, and the subject and possessive person paradigms. The work also provides a description of the group’s customs and activities. Finally, D’Orbigny (1896), based on the data provided by his contemporary, Father Doroteo Gionnecchini in Tarija (Bolivia), focuses on the Vejoz dialect located in the province of Salta (Argentina) from the Orán River to the Seco River, quite near the Noctén group. Using sources from other missionaries and his own information, D’Orbigny proposes to develop a grammar of this Wichí dialect. He presents subject and possessive pronoun paradigms and interrogative pronouns, as well as references to the noun, to number and case markers, and a description of the verb forms and their structure. He concludes with a comparison between Wichí and Toba, suggesting a possible historical relationship between them, though he does not develop this idea any further.

2.2. Twentieth century sources (1900–1950)

The most important sources from the first half of the 20th century are Hunt (1913, 1937 and 1940). Based on the hypothesis of a possible genealogical relationship between the Mataco-Mataguayan and Guaycuruan languages that Lafone Quevedo had proposed, Hunt establishes comparisons between Wichí and Toba throughout his work, which was published in 1913. His Wichí grammar based on the study of the Vejoz dialect includes a lexicon of about 2,000 words in alphabetical order in Spanish with English and Wichí translations, and the same in alphabetical order in Vejoz with translations into Spanish and English. Hunt (1937) is a bilingual Wichí-English dictionary that contains an appendix with brief grammatical notes. Lastly, his grammar (Hunt 1940) represents the work of approximately ten years of study of the Wichí language and was published with a slight modification one year before he died. It was of major importance for the missionaries to be able to speak the language of the group with whom they were carrying out their work. To this end, and as a way to help his fellow missionaries learn Wichí, Hunt developed this grammar, including exercises to practice the grammatical structures, which was then published in English. Likewise, each of the chapters contains a corpus of words and phrases in Wichí-English.


The lexical forms found in the secondary sources do not differ greatly as far as we currently know. At any rate, fewer than 50 percent of the total entries in the database used appear in the bibliographical sources that were cited.

3. Contact situations

In the province of Formosa several indigenous groups coexist whose languages belong to different families: Pilagá and Toba (Guaycuruan), and Wichí and Nivaklé (Mataco-Mataguayan). In eastern Formosa, the population originally from Paraguay speaks Guaraní, and all of these languages have had contact with Spanish since the conquest. No multilingual communities exist in this province with speakers of several of these languages. The overall social norm reflects marriages within the same ethnic group, although the partners belong to distinct bands.¹ There are a relatively small number of interethnic marriages. However, we can see a growing tendency for marriages to take place between indigenous and non-indigenous persons (Spanish-speaking Criollos), which fosters the advance of bilingualism and makes it likely that Spanish loanwords will be progressively incorporated into Wichí in the future.

3.1. Contact with languages in the Chaco area

Despite scarce archeological data, Braunstein proposes a hypothesis on the time when the area was populated (from 6,000 to 2,000 BP). He claims that two principal groups, the Mataco-Mataguayan and the Guaycuruan, settled in Chaco. The first came from the north by way of the west and followed the Pilcomayo and Bermejo river basins toward the southeast. The second inversely came from the south by way of the east and moved toward the northwest following the same river basins (Vidal & Braunstein 2009+). The Chaco thus became an area of migration and displacement where these peoples were organized internally into tribes. That they had to share the same geographical area and its resources promoted interchange and relations between these peoples. In addition, it was a propitious scenario for linguistic and cultural contact over prolonged and somewhat stable periods. Although the sustained contact between Wichí and other languages in the area is undeniable, from the linguistic perspective it is still difficult to identify which loanwords originated in which language and what direction they could have taken. Some hypotheses on possible genetic and contact relations among the Chaco languages were pointed out in the nineteenth century sources. Based on the individual

¹ Band is the term used in the literature for a bilateral group perceived like a single family that migrated together and was represented by a single principal leader. The exogamic local groups or bands kept more or less permanent alliances with other bands, and the result was the formation of larger groups that we name “tribes”. Each tribe was mainly endogamous, and postmarital residence tended to be that of the woman/uxorilocal (Braunstein 1983; Braunstein & Miller 1999).


works of the missionaries and travelers in the Chaco (D’Orbigny, Massei, Remedi, Pelleschi), Lafone Quevedo (1896) states how strikingly similar the pronoun systems of the Mataguayan languages are to those of the Guaycuruan languages, but at the same time, he observes that the percentage of lexical items they share is quite low. Our data confirm this.² According to Lafone Quevedo, the Mataco-Mataguayan languages were closer to Lule (of the Lule-Vilela family) with respect to the amount of shared lexicon, despite their grammar being notably more similar to that of Toba (Guaycuruan). The questions Lafone Quevedo posed, which still remain unanswered, concerned, on the one hand, which of these similarities between the languages could be attributed to genetic relations and which to linguistic contact.³ On the other, he debated the direction of these loanwords. Lafone Quevedo was well aware that one needed to know more about the languages spoken in the Chaco area before one could offer a thorough explanation of the linguistic situation. He maintained that the picture was extremely complex, with linguistic groups and subgroups, though he could not account for the similarities that, according to Braunstein, are the result of migratory movements of the populations in the Chaco centuries beforehand. For the moment, the impossibility of clarifying the outcome of this contact for the Chaco languages involved can be partly attributed to the absence of complete or specific documentation in each case. Also, by studying only one dialect, we cannot be sure whether the other Wichí varieties were more influenced by neighboring languages than the Bermejo dialect, selected for the present study.

3.2. Contact with Spanish

The Chaco indigenous languages’ contact with Spanish developed relatively late when compared to other languages in the Americas like Quechua (in this volume), Nahuatl and Quiché. In the Chaco, the Spanish conquerors arrived in the 16th century, reaching the Bermejo River in the late 18th century (Kersten 1968 [1905]). However, sustained contact with the European population began with missionary activities and then the evangelization of the indigenous peoples when the missions were established. The Catholic Franciscans first arrived in the last quarter of the 19th century, settling on the right bank of the Bermejo River. They were followed by the Anglican South American Mission, who founded the Misión Chaqueña in Salta in the 1920s.

² At least with respect to Pilagá, a Guaycuruan language.

³ “Vejoz, the language of the Mataco group, has a pronominal marking mechanism that is almost identical to that of the Guaycuruan group. However, its vocabulary is far from manifesting the same analogies. Undoubtedly, we could find some common roots between the two languages, but homophonies, which are the rule between pronouns, are more the exception in the rest of the vocabularies. Now the question is, should we admit a linguistic relation based on the first or reject it based on the second?” (Lafone Quevedo 1896: 131–132) (our translation).


Sources dating back to the 19th century mention some contact between indigenous peoples, and between these communities and the European population on the plantations and the Franciscan and Anglican missions (Palmer 2005). In the early 20th century, the Wichí were incorporated into the workforce as laborers on the sugar and cotton plantations and in manufacturing. The indigenous workers thus began to come into greater contact with Spanish (also used as the lingua franca among the indigenous peoples that spoke different languages). Contact with small farmers and ranchers also grew during this period (García 2005: 56ff). In sum, the incorporation of the indigenous population into the capitalist system and the labor market occurred relatively late, as did their contact with the European population and its language, Spanish. Consequently, we could say that bilingualism among the Wichí developed early within the past 100 years and even more recently among the Bermejo communities. That might explain why the contact influence of Spanish on the indigenous languages plays such a small role in the works of 19th and early 20th century authors. Of all the sources we have examined, only Pelleschi (1897) notes the way in which the Wichí pronounce certain Spanish loanwords like: cailá < cabra (‘goat’), Peiló < Pedro (‘Peter’), nelom < melón (‘melon’), thilalol < tirador (‘suspender’), hléno < freno (‘brake’), húyelo < pueblo (‘people’), tles < tres (‘three’), poole < pobre (‘poor’)⁴ (Pelleschi 1897: 181, 237–238). Despite the centuries of contact, the share of Spanish loanwords is 15.5 percent, of which 10 percent are co-existent words and very few are replacements (8 words altogether, of which 5 are the words for numbers). Mixed varieties of Spanish-Wichí have not emerged. Nor is code-switching a widespread phenomenon in the older generation (Vidal 2006).

4. Number and kinds of loanwords in Wichí

Of the 1460 Loanword Typology (LWT) meanings, 195 have no equivalent in Wichí. There are 1361 meaning-word pairs in the Wichí subdatabase: 820 pairs have an exact counterpart in Wichí; 269 pairs have a super-counterpart (i.e. the word corresponds to several meanings of the LWT list, e.g. hunhat ‘world, land, floor, soil’; iyhot ‘mud, clay’; lhip ‘half, side, part, piece’); and 101 have sub-counterparts (i.e. several words correspond to a single meaning, e.g. hulu, lamukw ‘dust’; wuk’u, winalhch’u ‘owl’; nichay’uhi, nichay’ukwe ‘warm’). Finally, 171 pairs are para-counterparts (their meanings are not completely equivalent, e.g. tshotoyw’et (‘place of animals’) for the LWT meaning ‘stable or stall’; ts’iwase (‘a species of the same family like the reindeer/caribou or elk/moose’) for the LWT meaning ‘reindeer/caribou, elk/moose’; chelhchep (‘time after summer’) for the meaning ‘the autumn/fall’).
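The breakdown of meaning-word pairs above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the counts are those given in the text, while the variable names and the derived percentage shares are our own additions and do not appear in the database.

```python
# Counts of meaning-word pairs by counterpart type, as reported in the text.
# The dictionary keys are illustrative labels, not database field names.
counterparts = {
    "exact": 820,
    "super-counterpart": 269,
    "sub-counterpart": 101,
    "para-counterpart": 171,
}

# The four types should add up to the reported number of pairs.
total_pairs = sum(counterparts.values())

# Share of each counterpart type, in percent (derived here, not in the source).
shares = {kind: round(100 * n / total_pairs, 1) for kind, n in counterparts.items()}

print(total_pairs)  # 1361
for kind, pct in shares.items():
    print(f"{kind}: {pct}%")
```

The four counts add up to the 1361 pairs reported for the subdatabase.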

⁴ These examples reflect the transcription of Wichí that Pelleschi developed in his book.


The Wichí subdatabase contains 197 words that show some evidence of loanword status. Of these, 194 are Spanish loanwords (95.7%). Quechua is the earliest source language for 6 of the total number of loanwords (2.8%). However, they were probably borrowed into Spanish first and then into Wichí. Of the words coded in the database as “perhaps borrowed” (10 total items), 9 (4.2%) are also present in the Guaycuruan lexicon, presumably dating back to pre-Hispanic contact. Spanish loanwords are distributed in the following categories: 4 probably borrowed, 1 perhaps borrowed, and 198 clearly borrowed. Of this total, 165 are insertions of new terms, 8 are replacements and 22 co-exist with the native word; for the remaining 3 there is no information. Of the total number of insertions, 32.5 belong to the semantic field Modern world (71.8% of all words in the semantic field). The remaining insertions are basically distributed in the fields of Food and drink, Law, and Quantity. When examining the number of insertions according to time periods, we can see almost the same number of borrowed words during the early period and modern times.

4.1. Loanwords by semantic word class

Table 1 shows the Spanish loanwords in the database by lexical class. All maintain the grammatical category to which they belong in the donor language.

Table 1: Loanwords in Wichí by semantic word class (percentages)

                  Spanish loanwords   Non-loanwords
Nouns             23.1                76.9
Verbs             2.7                 97.3
Adjectives        1.7                 98.3
Adverbs           –                   100.0
Function words    21.5                78.5
all words         15.8                84.2

Of the word classes in the database (noun, verb, function word, adjective and adverb), nouns show the greatest number of Spanish loanwords. There is at least one borrowed noun in each field with the exception of Sense perception, Emotions and values and Miscellaneous function words. The fields Modern world and Food and drink include most of the borrowed nouns, 33 in the first and 23 in the second of the 160.5 borrowed nouns. The word class with the second highest percentage of loanwords is “Function words” (21%). All Spanish loanwords in the function word class are cardinal numbers. A native numeral system from 1 to 5 existed, so the Spanish numerals for these values count as lexical replacements, whereas new words were incorporated from 6 onward. Wichí’s numeral system was thus replaced by the western system. The same occurred regarding the division and organization of time. The months of the year and the days of the week are terms that were incorporated into the Wichí lexicon.


Verbs are the third word class with loanwords. The 8 borrowed verbs represent 2.4% of the total words in this class (see the examples in (1)).

(1) Spanish loanword verb class    Gloss
    pinta < pintá                  ‘to paint’
    wayla < bailá                  ‘to dance’
    manija < manejá                ‘to drive’
    wende < vendé                  ‘to sell’
    pesa < pesá                    ‘to weigh’
    meli < medí                    ‘to measure’
    kunta < contá                  ‘to count’
    fwulena < frená                ‘to brake’

The Spanish input form for these loanwords is the second person of the imperative mood. It is used as a verb root (like others from Wichí) and receives the same inflectional affixes as any other non-borrowed verb (cf. §5.2). This imperative form may have been chosen for its prosodic shape (with word-final stress in Spanish, which is more similar to the Wichí stress pattern) and consequently requires less phonological integration. Adjectives do not exist as a word class in Wichí. Rather, they belong to the word class of stative verbs. There is, however, one borrowed adjectival form (pwili ‘poor’, from Spanish pobre) that, interestingly, was integrated as a verb and behaves like a Wichí stative verb. Finally, the adverb word class is the only one to show no loanwords.

4.2. Loanwords by semantic field

All of the semantic fields contain Spanish loanwords except two, Emotions and values and Miscellaneous function words. The distribution of loanwords by semantic field is given in Table 2. Note the three fields with a striking percentage of loanwords: Modern world (71.8%), Quantity (54.3%) and Law (42.3%). Of the remaining semantic fields, 9 exhibit 33%–20% of loanwords, and the other 9 exhibit 11%–0% of loanwords. Interestingly, 29.2% of non-loanwords are innovations (compounds or derivations of native bases, e.g. wej itoj [end fire] ‘car’; wiy’o-taj [fly-AUG] ‘airplane’; tochemet+cha [POSS.INDEF-work+tool] ‘machine’; or words whose meaning was extended to embrace new concepts, e.g. niyat ‘any person with power’, later ‘president, rich, government, minister, queen, king’; lanek ‘shell’, later ‘spoon’). Both the incorporation of loanwords and lexical innovations are the speakers’ responses to their modern lifestyle. The field Quantity comprises a large number of loanwords designating numbers. Wichí, by tradition, has a system of five numbers. Amounts over five are conceived of or measured as sets of elements.

Table 2: Loanwords in Wichí language by semantic field (in percentages)

                                      Spanish loanwords   Non-loanwords
 1  The physical world                3.8                 96.2
 2  Kinship                           4.2                 95.8
 3  Animals                           12.0                88.0
 4  The body                          0.7                 99.3
 5  Food and drink                    33.2                66.8
 6  Clothing and grooming             23.6                76.4
 7  The house                         27.2                72.8
 8  Agriculture and vegetation        20.7                79.3
 9  Basic actions and technology      23.3                76.7
10  Motion                            20.8                79.2
11  Possession                        20.5                79.5
12  Spatial relations                 4.2                 95.8
13  Quantity                          54.3                45.7
14  Time                              23.7                76.3
15  Sense perception                  2.2                 97.8
16  Emotions and values               –                   100.0
17  Cognition                         4.4                 95.6
18  Speech and language               5.3                 94.7
19  Social and political relations    11.4                88.6
20  Warfare and hunting               8.1                 91.9
21  Law                               42.3                57.7
22  Religion and belief               20.0                80.0
23  Modern world                      71.8                28.2
24  Miscellaneous function words      –                   100.0
    all words                         15.8                84.2
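As a sanity check on Table 2, the percentages can be entered as data and inspected programmatically. This is a minimal sketch under two assumptions of ours: that the loanword and non-loanword percentages pair with the semantic fields in the order in which they are listed, and that the two fields without Spanish loanwords can be entered as 0.0. The variable names are illustrative only.

```python
# (field, % Spanish loanwords, % non-loanwords), read off Table 2.
# Fields with no Spanish loanwords are entered as 0.0 (an assumption).
table2 = [
    ("The physical world", 3.8, 96.2),
    ("Kinship", 4.2, 95.8),
    ("Animals", 12.0, 88.0),
    ("The body", 0.7, 99.3),
    ("Food and drink", 33.2, 66.8),
    ("Clothing and grooming", 23.6, 76.4),
    ("The house", 27.2, 72.8),
    ("Agriculture and vegetation", 20.7, 79.3),
    ("Basic actions and technology", 23.3, 76.7),
    ("Motion", 20.8, 79.2),
    ("Possession", 20.5, 79.5),
    ("Spatial relations", 4.2, 95.8),
    ("Quantity", 54.3, 45.7),
    ("Time", 23.7, 76.3),
    ("Sense perception", 2.2, 97.8),
    ("Emotions and values", 0.0, 100.0),
    ("Cognition", 4.4, 95.6),
    ("Speech and language", 5.3, 94.7),
    ("Social and political relations", 11.4, 88.6),
    ("Warfare and hunting", 8.1, 91.9),
    ("Law", 42.3, 57.7),
    ("Religion and belief", 20.0, 80.0),
    ("Modern world", 71.8, 28.2),
    ("Miscellaneous function words", 0.0, 100.0),
]

# Every row should add up to 100% (allowing for floating-point rounding).
bad_rows = [name for name, loan, native in table2
            if abs(loan + native - 100.0) > 0.05]

# The three fields with the highest share of Spanish loanwords.
top3_names = [name for name, loan, _ in
              sorted(table2, key=lambda row: row[1], reverse=True)[:3]]

print(bad_rows)    # []
print(top3_names)  # ['Modern world', 'Quantity', 'Law']
```

Under this pairing every row sums to 100, and the three highest-ranking fields are the ones singled out in the text.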

5. Integration of Spanish loanwords

Of the total 198 Spanish loanwords, 22 are unintegrated. That is, they have been incorporated into Wichí but have maintained the phonological particularities of the word in the source language, for example, herrero (blacksmith), oro (gold), plata (silver), bronce (bronze), impuesto (tax), puerto (port), carne (meat), and ora (pray). Three arguments could explain their incorporation without integration into the Wichí phonological system: that they have only recently been incorporated and show little use, as in the case of oro, bronce, timón (rudder), impuesto; that the phonological form of the word is acceptable to the phonological patterns of Wichí, as in the case of papel (paper), país (country), azul (blue); and that the degree of bilingualism and the use of Spanish has increased in recent years and shows a tendency to rise. The rest of the 177 loanwords were adapted to the patterns and phonological and morphosyntactic rules of Wichí.

5.1. Phonological integration

The phonological adaptation of Spanish loanwords to Wichí takes place at the level of the phoneme inventory, at the level of the syllable structure, and at the prosodic level (adaptation to Wichí’s stress pattern). Adapted loanwords are adjusted in all three respects.

5.1.1. Phonological adaptation of Spanish vowels

The phonological integration of Spanish loanwords involves vocalic changes like vowel raising. Mid vowels of Spanish loanwords, /e/ and /o/, are raised to /i/ and /u/ (despite the fact that the Wichí vowel system is composed of five vowels like Spanish). This is quite a regular and predictable mechanism. The change in the Spanish vowel /a/ for the Wichí /u/ only occurs at the end of the word. A similar phenomenon is found in the lexical loanwords of Imbabura Quechua (this volume).

(2) Wichí       Spanish      English
    platu     < plato        ‘the plate’
    tulu      < toɾo         ‘the cow’
    pulutu    < poɾoto       ‘the bean’
    pusti     < poste        ‘the post or pole’
    mati      < mate         ‘type of hot drink’
    munelu    < moneda       ‘the coin’
    semanu    < semana       ‘the week’
    eskʷelu   < eskwela      ‘the school’

5.1.2. Phonological adaptation of Spanish consonants

The other phonological adaptations consist of replacements for those Spanish consonants that do not exist in Wichí. The Spanish consonants /b/, /ʎ/, /r/, /ɾ/, /d/ and /f/ are replaced by those of the Wichí inventory whose features have some resemblance to the Spanish consonants: example (3a) labial; (3b) palatal; in (3c) the two consonants belonging to the group of liquids; in (3d) sharing the coronal feature, and (3e), the labiodental articulation point. In example (3f), the change in sound is not motivated by the absence of the voiceless velar fricative in Wichí’s phonological inventory, but, as the /x/ never occurs in the syllabic onset position, it is replaced by the continuant labialized consonant. In the last case, (3g), the consonant in the onset position changes its point of articulation from palatal to alveolar (Nercesian 2009+).


(3) Consonant adaptations (Spanish > Wichí)

    a. b > w
    b. ʎ > y
    c. r, ɾ > l
    d. d > l
    e. f > fʷ
    f. x > fʷ
    g. tʃ > ts

    Examples

    Spanish     Wichí       English
    batata      watata      ‘the sweet potato’
    banana      wanana      ‘the banana’
    pava        pawa        ‘the kettle’
    poʎeɾa      puyelu      ‘the skirt’
    ʎeɾba       yelwa       ‘type of herb’
    boteʎa      wuteya      ‘the bottle’
    aɾena       alena       ‘the sand’
    kareta      kalet#j     ‘the cart or wagon’
    maɾtes      maltis      ‘Tuesday’
    domingo     luminku     ‘Sunday’
    toɾo        tulu        ‘the cow’
    moneda      munelu      ‘the coin’
    fideos      fʷilel      ‘pasta’
    fohfoɾo     fʷufʷulu    ‘the match’
    kafe        kafʷe       ‘the coffee’
    xavon       fʷawun      ‘the soap’
    xweves      fʷewis      ‘Thursday’
    tʃalana     tsalana     ‘the canoe’
    pontʃo      puntsu      ‘the poncho’
    letʃe       letsi       ‘the milk’
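Because the substitutions in (2) and (3) are described as regular, a handful of them can be written down as a simple segment-by-segment mapping. The sketch below is ours, not the authors’ analysis: it models only four of the attested changes (b > w, r > l, d > l and the raising of o to u), it assumes a simplified ASCII spelling with one character per segment, and it deliberately ignores stress, syllable structure, the word-final change of /a/, and the remaining consonant changes. The names `SUBSTITUTIONS` and `adapt` are hypothetical.

```python
# A subset of the substitutions attested in examples (2) and (3).
# Input is a simplified ASCII spelling, one character per segment
# (an assumption of this sketch, not the chapter's transcription).
SUBSTITUTIONS = {
    "b": "w",  # (3a) labial
    "r": "l",  # (3c) liquids
    "d": "l",  # (3d) coronal
    "o": "u",  # raising of the mid back vowel
}


def adapt(spanish: str) -> str:
    """Apply the substitutions segment by segment; other segments pass through."""
    return "".join(SUBSTITUTIONS.get(segment, segment) for segment in spanish)


# Forms whose attested Wichí shape is fully covered by the four rules above.
for word in ["batata", "banana", "toro", "arena", "poroto", "plato"]:
    print(word, "->", adapt(word))
```

For these six inputs the sketch yields the attested forms watata, wanana, tulu, alena, pulutu and platu. Forms such as munelu or letsi would need further rules that are not modeled here.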

The phonological adaptations of consonants are quite regular and predictable processes. However, Spanish-dominant speakers tend to use the Spanish words instead.

5.1.3. Integration to syllabic structure

The syllable structure in Wichí can be CV, CVC, CCV, CCVC, of which CV and CVC are most frequent. Only three consonant clusters exist in this language: /tl/, /pl/ and /kl/. So if the Spanish loanword has a cluster, its consonants will be replaced according to the above mechanism (see examples (4d) and (4e) with acceptable clusters). Yet, if an unacceptable cluster still results, as in examples (4a), (4b) and (4c), it will be adapted to form a good syllable structure in different ways. In (4a) and (4b), the adapted cluster would consist of two elements with an unexpected degree of sonority, /bɾ/ > /wl/, violating the sonority hierarchy sequence, which runs from lesser to greater sonority. Accordingly, the two consonants appear in two different syllables and the cluster is broken up. Example (4c) is more difficult to explain because we expect the adaptation to be similar to (4b), something like “pu.wi.li”. However, it is preferable to form two syllables instead of three and add the labialized feature of /u/ and /w/ to the voiceless bilabial stop /p/. All of these mechanisms tend to preserve Wichí’s preferred syllable structure.
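The cluster restriction described above lends itself to a small check. The following sketch is an illustration under our own assumptions: syllables are written in a simplified ASCII spelling with one character per consonant, only the three clusters named in the text are licensed, and glottalized or labialized consonants are not modeled. The names `ALLOWED_CLUSTERS`, `onset` and `has_licit_onset` are hypothetical.

```python
# Onset clusters reported for Wichí in the text: /tl/, /pl/ and /kl/.
ALLOWED_CLUSTERS = {"tl", "pl", "kl"}
VOWELS = set("aeiou")


def onset(syllable: str) -> str:
    """Return the consonants that precede the first vowel of a syllable."""
    for i, segment in enumerate(syllable):
        if segment in VOWELS:
            return syllable[:i]
    return syllable


def has_licit_onset(syllable: str) -> bool:
    """True if the onset is at most one consonant or one of the allowed clusters."""
    consonants = onset(syllable)
    return len(consonants) <= 1 or consonants in ALLOWED_CLUSTERS


for syllable in ["klus", "tlu", "li", "bro"]:
    print(syllable, has_licit_onset(syllable))
```

A syllable such as bro fails the check, which corresponds to the situation in which the text reports that the cluster is broken up across two syllables.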


(4) a. b. c. d. e.

Wichí li.wu.lu su.wu.la’ p$i.li klus k$a.tlu

< < < <
k

/ˈq’atʃu/ > /kaˈtʃu/ ‘grass’

Adjustments of Quechua plosives /q/ and /t/

(a) Velarization of /q/

(5) q > k / __a    /ˈawqa/ > /awˈka/ ‘rebel, enemy’

(b) Palatalization and fricativization of /q/

(6) q > tʃ / __i    /ˈqiʎqay/ ‘to write, to draw’ > /tʃiʎˈkatun/ ‘to write’⁶

(c) Palatalization and affrication of /t/

(7) t > tʃ    /ataˈwaʎpa/ > /atʃaˈwaʎ/ ‘the chicken’

5.2.3. Adjustments of aspirated fricatives and affricates: Plosivization

Since Mapudungun registers no aspirated consonants, those in Quechua are replaced by plosives.

(8) tʃʰ > tʃ    /iˈtʃʰuna/ > /iˈtʃuna/ ~ /eˈtʃuna/ ‘sickle’

5.2.4. Adjustments of the Quechua alveolar tap /ɾ/

(a) Retroflexion

(9) ɾ > ɻ / V__C    /ˈtʃ’aɾki/ > /tʃaɻˈki/ ‘the meat (jerked meat)’

(b) Lateralization (sometimes with palatalization)

(10) ɾ > l ~ ʎ    /ˈqaɾqu/ ‘ill omen’ > /kaʎˈku/ ‘sorcerer or witch/wizard’

5.2.5. The maintenance of vowel variation in Mapudungun

The Proto-Quechua vowel system has been described as a three-vowel system with wide allophonic variation (Adelaar with Muysken 2004). The vowel variation, which is also characteristic of Mapudungun, is present in some Quechua borrowings, as shown below.

(11) /itʃʰuna/ > /itʃuna/ ~ /etʃuna/ ‘the sickle or scythe’

⁶ Chillkatun ‘to write’ derives from qillqay ‘to write, to draw’; chillkatun ‘to read’ comes from Quechua chillka ‘book’, ‘letter’, which in turn derives from the Quechua qillqa ‘letter’, ‘drawing’ (Adelaar, personal communication).

41. Loanwords in Mapudungun


(12) /miʃki/, /misk’i/ > /miʃki/ ~ /mɨʃki/ ‘sweet, the beeswax, honey’
(13) /wampu/ ~ /wampo/ ‘the canoe’ > /wampu/ ~ /wampo/ ‘the canoe’

5.2.6. Syllable structure in borrowing

Mapudungun usually preserved Quechua’s syllable structure. The CV(C) occurrences exhibit the following consonants as onsets in descending order: (a) the affricate /tʃ/; (b) a voiceless stop (/p/, /t/, /k/, /q/); (c) the semivowel /w/, and (d) a nasal /m/, /n/ or /)/. VC occurrences are less frequent, with /aw/ as the most recurrent in our subdatabase. See the Appendix.

5.2.7. Stress adaptations

Borrowing produces a change of stress in disyllabic words from the penultimate to the ultimate syllable, according to the Mapudungun primary stress pattern (Hayes 1995, based on Echeverría & Contreras 1965), as examples (14) and (15) show:

(14) /ˈawqa/ > /awˈka/ ‘the enemy’
(15) /ˈtʃaʎwa/ > /tʃaʎˈwa/ ‘fish, to fish’

Trisyllabic monomorphemic words generally maintain the original stress. Note the coincidence between the Quechua right-to-left stress pattern on the penultimate syllable and the Mapudungun left-to-right primary stress pattern on the second syllable (see §5.1) in the following example.

(16) /ka!witu/ > /ka!witu/ ‘the bed’

Some Quechua polysyllabic loanwords may have undergone syllable reduction while preserving the original stress, now on the final syllable, as shown in (17).

(17) /ataˈwaʎpa/ > /atʃaˈwaʎ/ ‘the chicken’

For a complete list of adaptations, see Smeets (1989, 2008).

5.3. Gününa Yajüch loanwords

Two processes have been documented in the Mapudungun integration of Gününa Yajüch lexicon: (a) the creation of loanwords and (b) the creation of calques. Pedro Viegas Barros suggests three ages of loanwords, including one that may date far earlier to Mapuche’s adoption of the horse. This hypothesis is supported by grammatical similarities, such as the Mapudungun plural (-ün, -ñ) and dual (-u) person suffixes, and the nominalizer -we which functions as an instrument and location marker, all of which indicate old contact processes (Viegas Barros 2005).


Lucía A. Golluscio

(a) Some examples of Mapudungun loanwords from Gününa Yajüch appear to have kept their source language patterns, while others have adapted to the recipient’s structure. See, for example, /x/ and its maintenance or adaptation in borrowing in (18) and (19).

(18) /ˈtʃaxaʎ/ > /tʃaˈxaʎ/ ‘Southern Mountain Cavy (a small rodent)’
(19) /xitran/ ‘salt-water lake’ > /koˈ0ɻɨ/ ‘salty’⁷

Viegas Barros suggests that /x/ could have been introduced in Mapudungun not only through Spanish, but also through Gününa Yajüch. He lists three Gününa loanwords that preserve the original /x/ in the variety of Mapudungun spoken in what was once Gününa Küna territory (2005: 154). In contrast, he affirms that Mapuche speakers from other regions reject /x/. Hence, the Gününa Yajüch substratum is evident only in a few words, some onomatopoeic short expressions, and the recurrent allophone [h] in the tayül (the Mapuche sacred songs). See Viegas Barros (2005) for more details.

(b) Calques result from the Mapudungun translation of Gününa Yajüch words and expressions, especially place names, fauna and flora. According to Casamiquela, “All Mapudungun place names of northern Patagonia are faithful or free translations of the original Gününa Küne place names” (Casamiquela 1962: 88 in Viegas 2005, my translation). Harrington also refers to the Mapudungun names of Patagonian plants as possible semantic calques from Gününa Yajüch. See example (20) below and other examples in Viegas Barros (2005: 162).

(20) trintri lawen ‘curly medicine’ (Mapudungun) < akïc a tïltïl ‘curly medicine’ (Gününa Yajüch)

5.4. Spanish loans

Borrowing from Spanish is a productive process that can be traced back to the early post-conquest period. Unlike other indigenous American languages, Mapudungun did not develop a syncretic or a mixed variety – see Hill & Hill (1986) on Mexicano and Dreidemie (2007) on Quechua spoken by Bolivian migrants in Greater Buenos Aires, among other examples. Nor was it overwhelmed by an avalanche of Spanish function and content words. On the contrary, my fieldwork over the years has shown an extended bilingual strategy. Fluent Mapudungun-Spanish Mapuche speakers keep each language in relatively separate domains and choose which language to speak according to the topics, the addressees, and the situation. In the case of non-fluent Mapudungun speakers, the use of the heritage language is limited to only a few terms and expressions that they incorporate into their Spanish discourse.

⁷ This is similar also to Quechua katri ‘salt’; kachi in Cuzco and Ayacucho Quechua (Adelaar, personal communication).

5.4.1. Age of loanword and degree of integration

The age of a Spanish loanword is crucial in explaining its degree of integration. Words introduced by the Spaniards soon after the conquest show significant phonological modifications required by the Mapudungun phonemic inventory and distribution restrictions (Echeverría 1964; Golbert 1975). See the examples below.

(21) /ˈbaka/ > /waˈka/ ‘cow’
(22) /boˈriko/ (from “borrico”, a Spanish diminutive of burro ‘donkey’ that is more common in Spain and could have been introduced during early contact) > /fuˈʐiku/ ‘donkey’
(23) /xaˈbon/ > /kaˈfon/ ‘soap’

Words introduced after the 1880s, and especially during the 20th century, may exhibit fewer adaptations to Mapudungun. The following examples (Smeets 1989), which belong to the semantic field of social and political relations, show phonemes and syllable structure patterns that do not correspond to the native Mapudungun phonological system.

(24) /pɾesiˈdente/ > /pɾeseˈdente/ ‘president’
(25) /goˈbjeɾno/ > /goˈbjeɻnu/ ‘government’
(26) /sosiaˈlismo/ > /sosiaˈlimu/ ‘socialism’

Notwithstanding examples such as the above, Mapudungun phonotactics still determine the constraints on the distribution of phonemes in modern loans (see §5.2.6). The absence of /x/ and of voiced plosives in the original phoneme inventory, along with the above-mentioned restrictions on tautosyllabic consonant clusters and on the number of syllables in monomorphemic roots, has triggered complex adjustments in most Spanish loanwords, some of which are listed below.

5.4.2. Adjustments of Spanish plosives

When the Spanish plosive is voiced, it is replaced by a Mapudungun voiceless plosive or fricative, and the place of articulation is at least partially maintained. See the following examples of these processes.

Lucía A. Golluscio

(a) Devoicing

(27) b > p  /somˈbɾeɾo/ > /tʃumˈpiʐu/ ‘hat’
(28) g > k  /ˈganso/ > /kanˈsu/ ‘goose’

(b) Devoicing and fricativization

(29) b > f  /ˈbolso/ > /folˈso/ ‘bag’
(30) d > θ  /diɾekˈsjon/ > /θiʐekˈsjon/ ‘address’
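The substitutions in (27)–(30) can be read as a small lookup from Spanish voiced plosives to their Mapudungun replacements. The following Python sketch is only an illustration under that reading: the table names and the function `adapt_voiced_plosive` are hypothetical, the tables contain nothing beyond the four correspondences exemplified above, and θ is assumed to stand for the voiceless interdental fricative that replaces /d/ in (30). The sketch does not predict which strategy applies to a given word; the caller chooses it.

```python
# Illustrative only: correspondences taken from examples (27)-(30).
# All names are hypothetical and not part of the chapter's analysis.
DEVOICING = {"b": "p", "g": "k"}                      # (27), (28)
DEVOICING_AND_FRICATIVIZATION = {"b": "f", "d": "θ"}  # (29), (30); θ is assumed


def adapt_voiced_plosive(segment, fricativize=False):
    """Return the replacement for a Spanish voiced plosive.

    Segments not covered by the chosen table are returned unchanged.
    """
    table = DEVOICING_AND_FRICATIVIZATION if fricativize else DEVOICING
    return table.get(segment, segment)


if __name__ == "__main__":
    # Initial consonant of "ganso", cf. (28)
    print(adapt_voiced_plosive("g"))
    # Initial consonant of "bolso", cf. (29)
    print(adapt_voiced_plosive("b", fricativize=True))
```

Keeping the two strategies in separate tables mirrors the presentation in (a) and (b); it also makes explicit that /b/ has two attested outcomes.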

At least three examples that retain the Spanish /g/ in word-initial and onset positions have been documented (Smeets 1989: 68; my phonological transcription).

(31) /gaˈʎeta/ > /gaˈjeta/ ~ /gaˈʎeta/ (Golluscio 2006: 176) ‘cookie’
(32) /goˈbjeɾno/ > /goˈbjeʐnu/ ‘government’
(33) [eŋgaˈɲaɾ] > /eŋgaˈɲan/ ‘to deceive’

5.4.3. Adjustments of voiced Spanish fricatives

When the Spanish fricative is voiced, the following strategies, which involve changes in place and manner of articulation, have been documented.

(a) Elision

[ð], the Spanish fricative allophone of /d/ in intervocalic position, is elided in the Spanish past participle ending -ado, giving rise to a falling diphthong and thus reducing the number of syllables. This phenomenon, which is also common in substandard Chilean and Argentine varieties of Spanish, was present as early as the 16th century in some dialects introduced by the conquerors, for example Andalusian Spanish.

[ð] > Ø / a__o#

(34) [seɾtifiˈkaðo] > /seʐtifiˈkaw/ ‘certificate’
(35) [aβoˈɣaðo] > /awoˈkaw/ ‘lawyer’

(b) Devoicing or defricativization

[β], the Spanish fricative allophone of /b/ in intervocalic and consonant cluster positions, partially retains its place of articulation, being replaced by the voiceless labio-dental fricative /f/ or the voiced labio-velar semivowel /w/.

[β] > f
(36) [ˈpoβɾe] > /poˈfʐe/ ‘poor’
(37) [aˈβjon] > /aˈfjon/ ‘airplane’
(38) [oˈβexa] > /ofiˈʃa/ ~ /ufiˈʃa/ ~ /wiˈʃa/ ‘sheep’

[β] > w
(39) [aˈβwela] > /aˈwela/ ‘grandmother’
(40) [aβoˈɣaðo] > /awoˈkaw/ ‘lawyer’
(41) [oˈβexa] > /wiˈʃa/ ‘sheep’
(42) [kaˈβaʎo] > /kaˈweʎu/ ~ /kaˈweʎ/ ‘horse’

(c) Devoicing and plosivization

[ɣ], the Spanish fricative allophone of /g/ in intervocalic position, is replaced by the voiceless velar plosive /k/.

[ɣ] > k
(43) [aβoˈɣaðo] > /awoˈkaw/ ‘lawyer’
(44) [aˈɣuxa] > /aˈkutʃa/ ‘needle’

5.4.4. Adjustments of voiceless Spanish fricatives

The voiceless Spanish fricatives /x/ and /s/ are subject to varied adaptation strategies. The voiceless velar fricative /x/ is replaced by the voiceless velar plosive /k/, the voiceless palatal fricative /ʃ/, or the voiceless palatal affricate /tʃ/. In at least one documented example, /x/ is maintained; see (48). The voiceless alveolar fricative /s/, of low frequency in Mapudungun (see §5.1), is replaced by the voiceless palatal affricate /tʃ/ or the voiced dental fricative /ð/.

(a) Plosivization

x > k⁸
(45) /keˈxaɾ-se/ [complain-RR] > keka-wü-n [complain-RR-N.FIN] ‘to complain’

8

Note that the loanword has adapted both the phonological and the morphological Spanish structure to Mapudungun patterns.

(b) Palatalization

x > ʃ (most likely a remnant of the 16th-century Spanish pronunciation of /x/) (Entwistle 1969, cited after Smeets 1989: 69)
(46) [oˈβexa] > /ofiˈʃa/ ‘sheep’

(c) Affrication and palatalization

x > tʃ
(47) [aˈɣuxa] > /aˈkutʃa/ ‘needle’

(d) No change

(48) /asuˈlexo/ > /asuˈlexu/ ‘the blue-blackish color of the horse’s coat’

(e) Affrication and palatalization

s > tʃ
(49) /seˈɲoɾa/ > /tʃiˈɲoʐa/ ‘lady’
(50) /somˈbɾeɾo/ > /tʃumˈpiʐu/ ‘hat’ (in communities west and east of the Andes)

(f) Voicing and dentalization

s > ð
(51) /ˈpɾeso/ > /pɨˈʐeðu/ ‘prisoner’ (for the treatment of consonant clusters in Mapudungun, see below)

(g) Elision

The elision of /s/ in disyllabic clusters with /p/ and /f/, as well as in word-final position, calls for an explanation. One hypothesis is that [h] rather than /s/ is elided (Claudio Kairuz, personal communication). [h] is a frequent allophone of /s/ in word-medial and word-final coda positions in spoken Chilean and Argentine varieties of Spanish. Note that the aspiration of /s/ in coda position was present in some varieties of Spanish, such as the Andalusian dialect, as early as the 16th century. Thus, it could have been introduced during the early post-conquest period.

[h] > Ø / __ p
s > Ø / __ f

(52) /desˈpwes/ [deʰpweʰ] > /deˈpwe/ ‘after’
(53) /respeˈta-/ [reʰpeˈta-] > /ʐepeˈta-/ ‘to respect’
(54) /ˈfosforo/ [ˈfoʰfoɾo] > /fofoˈʐo/ ‘match’

However, Smeets (2008) identifies at least one exception:

(55) /esˈpwela/ > /isˈpwela/ ‘spur’
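The elision pattern in (52)–(54) can be stated as two deletions: /s/ (or its aspirated allophone) is dropped before /p/ or /f/, and word-finally. The Python sketch below is a rough illustration under simplifying assumptions: forms are plain strings without stress marks, the function name `elide_s` is hypothetical, and no other adaptation (retroflexion, vowel raising, stress shift) is attempted. It also applies the deletion blindly, so it would wrongly delete the /s/ that survives in the exception (55).

```python
import re


def elide_s(form):
    """Delete /s/ before p or f and in word-final position.

    Illustrative only; this hypothetical helper covers just the
    elision step seen in (52)-(54) and ignores exceptions such as (55).
    """
    form = re.sub(r"s(?=[pf])", "", form)  # cluster with a following p or f
    form = re.sub(r"s$", "", form)         # word-final position
    return form


if __name__ == "__main__":
    for spanish in ("despwes", "respeta", "fosforo", "sapato"):
        print(spanish, "->", elide_s(spanish))
```

Word-initial /s/ and /s/ before /t/ are left untouched by both patterns, which matches the retention cases described for /s/ elsewhere in this section.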

(h) No change

/s/ tends to be maintained in word-initial position and in clusters with consonants other than /p/ and /f/:

s > s / #__
s > s / __ t

(56) /saˈpato/ > /saˈpatu/ ‘shoe’
(57) /esˈtudjo/ > /esˈtudjo/ ‘study’

5.4.5. Adjustments of the Spanish voiced alveolar tap /ɾ/ and trill /r/

(a) Retroflexion

Both Spanish /ɾ/ and /r/ are replaced by retroflex /ʐ/ in word-initial and intervocalic positions.

ɾ > ʐ / #__
r > ʐ / V__V

(58) /ˈrosa/ > /ʐoˈsa/ ‘Rose’ (see other examples in Smeets 2008)

The Spanish /ɾ/ in coda position undergoes the following changes in borrowing:

(b) Deletion

When /ɾ/ is the coda in word-medial position preceding another alveolar tap, it tends to be deleted. However, I have documented its replacement by [ɻ], the Mapudungun retroflex approximant allophone of /ʐ/, in communities east of the Andes.

ɾ > Ø / CV__

(59) /kaɾˈneɾo/ > [kaˈniʐu ~ kaɻˈniʐu] ‘sheep’

(c) Metathesis

In word-final position, an idiosyncratic example recorded in the field documents a complex change triggered by an analogical process that encompasses (a) metathesis between the Spanish vocalic nucleus and /ɾ/ in coda position, and (b) epenthesis of /u/ to break up the consonant cluster /kɾ/:

(60) /aˈsukaɾ/ > /aˈsukɾa/ > /aˌsukuˈʐa/ ‘sugar’ (probably associated with kura ‘stone’ in the case of sugar cubes)

(d) Reassignment of stress

As explained in §5.1, in disyllabic words recorded in Argentine communities, stress is generally displaced to the second syllable:

(61) [ˈjeɾβa] > [jeɻˈfa] ‘green tea’
(62) /ˈbolso/ > /folˈso/ ‘bag’
(63) /ˈbaka/ > /waˈka/ ‘cow’

Other phonotactic adjustments recorded east and west of the Andes are described below (for a more detailed list of changes, see Smeets 2008: 55–8).

5.4.6. Simplification of Spanish consonant clusters

In Mapudungun, consecutive vowels are the nuclei of different syllables, and consonant clusters do not occur within the same syllable. The only complex onsets and codas occurring in the language are those consisting of a consonant and a semivowel. Therefore, most Spanish loans insert a vowel (most frequently /ɨ/ in its [ə] realization) into consonant clusters with /l/ or /r/ as the second consonant, for syllabification.

(64) /ˈpɾeso/ > /pɨ.ˈʐe.ðu/ ‘prisoner’
(65) /ˈkabɾa/ > /ka.ˈpɨ.ʐa/ ‘goat’
(66) /ˈfloɾ/ > [fɨ.ˈloɻ] ‘flower’

Few lexical items have kept the Spanish consonant clusters. In such cases, the Spanish voiced labial /b/ is reinterpreted as /f/ in intervocalic position and as /p/ in initial position.

(67) /ˈblanko/ > /ˈplan/ ‘white’
(68) /ˈpoβɾe/ > [poˈfʐe] ‘poor’

Finally, the deletion of the liquid /ɾ/ has also been recorded, as (69) shows.

(69) /somˈbɾeɾo/ > /tʃumˈpiʐu/ ‘hat’

5.4.7. Syllable reduction

The number of syllables acts as another phonotactic restriction. Mapudungun monomorphemic roots are di- or trisyllabic. Deletion of the first syllable, counting from left to right, has been documented in tri- and polysyllabic Spanish words where the first syllable is (a) VC, with the consonant /s/, an infrequent sound in Mapudungun, as in (70), (71), and (72) below, or (b) CV, where C is not a native Mapudungun phoneme, as in (73) below.

(70) /eskaˈleɾa/ > /kaˈleʐa/ ‘staircase, ladder’
(71) /estamˈpiʎa/ > /tamˈpiʎa/ ‘stamp’
(72) /ospiˈtal/ > /piˈtal/ ‘hospital’
(73) /desaˈjuno/ ~ /desaˈʒuno/ > /saˈjuno/ ‘breakfast’

The deletion of the first syllable in polysyllabic Spanish words yields three-syllable words, all of which are possible in Mapudungun without having to change either the Mapudungun (see §5.1) or the Spanish stress patterns.
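The two conditions for first-syllable deletion can be made concrete with a short Python sketch. It rests on several assumptions that are not in the chapter: the word arrives already syllabified as a list of strings, segments are simplified to single plain letters without stress marks, and the set `NON_NATIVE` contains only /x/ and the voiced plosives, which the chapter describes as absent from the original phoneme inventory. The function name `drop_first_syllable` is hypothetical. The pattern is documented rather than exceptionless; (57), for instance, keeps its initial syllable, and the sketch does not model such cases.

```python
VOWELS = set("aeiou")
# Assumption: consonants described in the chapter as absent from the
# original Mapudungun inventory (/x/ and the voiced plosives).
NON_NATIVE = set("bdgx")


def drop_first_syllable(syllables):
    """Drop the first syllable of a word with three or more syllables
    when it is (a) V + s or (b) non-native C + V. Illustrative only."""
    if len(syllables) < 3:
        return list(syllables)
    first = syllables[0]
    vowel_plus_s = len(first) == 2 and first[0] in VOWELS and first[1] == "s"
    foreign_cv = len(first) == 2 and first[0] in NON_NATIVE and first[1] in VOWELS
    if vowel_plus_s or foreign_cv:
        return list(syllables[1:])
    return list(syllables)


if __name__ == "__main__":
    print(drop_first_syllable(["es", "ka", "le", "ra"]))  # cf. (70)
    print(drop_first_syllable(["de", "sa", "ju", "no"]))  # cf. (73)
```

Passing syllables rather than a flat string keeps the sketch independent of any particular syllabification procedure, which the chapter does not specify.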

6. Final remarks Despite documented processes of language attrition, Mapudungun is a strong distinguishing component in Mapuche social and personal identity. Throughout history Mapudungun and its speakers have maintained long-lasting relationships with other peoples and their languages – Quechua, Gününa Yajüch, Aymara, and Spanish. Spanish has unquestionably had the greatest impact on Mapudungun. However, unlike other Latin American languages, Mapudungun does not manifest language mixing or a strikingly large number of Spanish loanwords. Moreover, since early post-conquest contact, borrowing has been ruled by Mapudungun phonological and morphosyntactic patterns and guided by “pragmatic” needs. Finally, Spanish loanwords currently constitute an open and productive domain. The creation of the Mapudungun loanword database is, thus, an ongoing undertaking.

Acknowledgments I am indebted to Fresia Mellico and Cecilio Melillán, linguistic consultants for the construction of the Mapudungun subdatabase for the Loanword Typology project, and to the elderly and superb Mapuche speakers who have shared their knowledge with me over the years. I am grateful to Martin Haspelmath and Uri Tadmor, editors of this volume, for their careful reading of the manuscript and relevant comments, and to Cecilio Melillán and Eva-Maria Schmortte for their invaluable help with the map. Special thanks go to Willem Adelaar for his wise and generous comments to the final draft. I would like to acknowledge Adriana Fraguas, who assists me in my course in Ethnolinguistics at the Universidad de Buenos Aires, for her efficiency, good nature, and commitment to this research, and the anthropologist Morita Carrasco, for her updated information about the distribution of Mapuche communities in Argentina. Finally, I would like to thank Claudio Kairuz for inspiring discussion, and Guillaume Boccara, María Teresa Boschin, Walter Delrio, Antonio Díaz-Fernández, Diana Lenton, Lidia Nacuzzi, Laura Pakter, Julio Vezub, Pedro Viegas, and Fernando Zúñiga for their generosity and help.

1062

Lucía A. Golluscio

Transcription conventions For phonological transcription, I use standard IPA symbols in Unicode. For orthographic transcription purposes, I use the Mapuche Unified Alphabet (Croese, Salas, & Sepúlveda 1978), which was adopted by the Sociedad Chilena de Lingüística (Linguistic Society of Chile) in 1988. Most orthographic symbols have roughly their IPA values, with the following exceptions: ü high central-back unrounded vowel (in stressed positions) or mid-central unrounded (in unstressed positions); t voiceless dental or interdental stop, tr alveopalatal retroflex affricate, ch palatal affricate, d voiceless interdental fricative, n voiced dental or interdental nasal, ñ voiced palatal nasal, ng voiced velar nasal, l voiced dental or interdental lateral, l voiced alveolar lateral, ll voiced palatal lateral, and q, back unrounded semivowel (sometimes with velar spirantization). When citing other authors, I follow their transcription system.

Special Abbreviations

[3]      unmarked third person
COLL     collective
INV      inverse (-e)
NEG.IND  negation for indicative
N.FIN    non finite (-n)
OBL      oblique marker for inverse (-(m)ew)
RR       reflexive-reciprocal
STAT     stative

References Adelaar, Willem F. H. with Muysken, Pieter C. 2004. The Languages of the Andes. Cambridge: Cambridge University Press. Alderetes, Jorge R. 2001. El quichua de Santiago del Estero: Gramática y vocabulario. Tucumán: Universidad Nacional de Tucumán. Arguedas, José María. 1958. Los ríos profundos. Buenos Aires: Losada. Arnold, Jennifer. 1996. The inverse system in Mapudungun and other languages. Revista de Lingüística Teórica y Aplicada 34:9–47. Baker, Mark & Aranovich, Roberto & Golluscio, Lucía. 2005. Two Types of Syntactic Noun Incorporation: Noun Incorporation in Mapudungun. Language 81(1):138–176. Bengoa, José. 2000. Historia del Pueblo Mapuche: Siglos XIX y XX. Santiago: LOM. Bengoa, José. 2003. Historia de los antiguos mapuches del sur. Desde antes de la llegada de los españoles hasta las paces de Quilín. Siglos XVI y XVII. Santiago: Catalonia.


Boccara, Guillaume. 1996. Notas acerca de los dispositivos de poder en la sociedad colonialfronteriza, la resistencia y la transculturación de los reche-mapuche del centro-sur de Chile (XVI-XVIII). Revista de Indias LVI(208):659–695. Madrid. Boccara, Guillaume. 1998. Guerre et Ethnogenèse mapuche dans le Chili coloniale: L’invention du Soi. Paris: L’Harmatian. Carrasco, Morita. 2007a. Mapas de distribución de comunidades mapuches reconocidas por el Registro Nacional de Comunidades Indígenas en Argentina: Neuquén, Río Negro y Chubut. Carrasco, Morita. 2007b. Mapa de distribución de comunidades indígenas reconocidas por el Registro Nacional de Comunidades Indígenas en Argentina. Unpublished manuscript. Casamiquela, Rodolfo M. 1962. El contacto Araucano-Gününa Kena: Influencias recíprocas en sus producciones espirituales. In Actas de Jornadas Internacionales de Arqueología y Etnografía "Vinculaciones de los aborígenes argentinos con los de los países limítrofes", 83–97. 11th–15th November 1957. Buenos Aires. Casamiquela, Rodolfo M. 1987a. Pelajes criollos. Revista Patagónica VII(32, October):19–32. Buenos Aires. Censo Nacional Indígena de Chile. 1992. Centro de Estudios Públicos. 2002. Estudio Nacional de Opinión Pública 15, Tercera Serie. Tema especial: Una radiografía de los mapuches. (Documento de trabajo 345). Santiago de Chile. Cited after Zúñiga (2007). Centro de Estudios Públicos. 2006. Estudio de Opinión Pública: Los Mapuches rurales y Urbanos Hoy. May 2006. Cited after Zúñiga (2007). Cooper, John. 1946. The Araucanians. In Handbook of South American Indians, Vol. 2, 687– 760. Bulletin 143. Washington: Smithsonian Institution, Bureau of American Ethnology. Croese, Rober & Salas, Adalberto & Sepúlveda, Gastón. 1978. Proposición de un sistema unificado de transcripción fonémica para el mapudungu. Revista de Lingüística Teórica y Aplicada 16:151–160. Concepción, Chile. Croese, Robert. 1980. Estudio dialectológico del mapuche. Estudios filológicos 15:7–38. 
Valdivia: Universidad Austral de Chile. Croese, Robert. 1985. Mapuche dialect survey. In Klein, Harriet M. & Stark, Louise (eds.), South American Indian Languages: Retrospect and Prospect, 784–801. Austin: Texas University Press. de Augusta, Félix José. 1903. Gramática araucana. Valdivia: Imprenta Central L. Lampert. de Augusta, Félix José. 1966 [1916]. Diccionario araucano-español y español-araucano. Padre Las Casas, Chile: Imprenta y Editorial San Francisco. de Valdivia, Luis. 1606. Arte y Gramática General de la lengua que corre en todo el Reyno de Chile. Lima: Francisco del Canto. Delrio, Walter. 2005. Memorias de expropiación: Sometimiento e incorporación indígena en la Patagonia (1872–1943). Bernal: Universidad Nacional de Quilmes.


Díaz-Fernández, Antonio. 2004. Panorama dialectal mapuche en la Provincia del Chubut. Congreso Internacional "Políticas Culturales e Integración Regional", Buenos Aires, 30th March–2nd April 2004. Díaz-Fernández, Antonio. 2008. Transferencias léxicas del quechua en el mapuzungun. II Congreso Internacional de Lenguas y Literaturas Indoamericanas and XIII Jornadas de Lengua y Literatura Mapuche, Facultad de Educación y Humanidades, Universidad de La Frontera, 22–24 October 2008. Díaz-Fernández, Antonio. n.d. Situación actual del mapuzungun en Chubut. Unpublished manuscript. Dreidemie, Patricia. 2007. Estrategias discursivas de persistencia cultural: (Dis)continuidad del quechua en el habla mezclada de migrantes bolivianos en Buenos Aires. Master's thesis. Buenos Aires: Universidad de Buenos Aires. Echeverría Weasson, Sergio. 1964. Descripción fonológica del mapuche actual. Boletín del Instituto de Filología de la Universidad de Chile 16:13–59. Santiago de Chile. Echeverría, Max S. & Contreras, Heles. 1965. Araucanian phonemics. International Journal of American Linguistics 31(2):132–135. Englert van Dillingen, Sebastián. 1936. Lengua y literatura araucanas. Anales de la Facultad de Filosofía y Educación 1(2–3):62–109. Universidad de Chile. Entwistle, William James. 1969 [1936]. The Spanish language, together with Portuguese, Catalan and Basque. 1st edn. London: Faber and Faber. Cited in Smeets (1989:69). Fabre, Alain. 1998. Mapudungu (mapuche, araucano). In Fabre, Alain (ed.), Manual de las lenguas indígenas sudamericanas, 2 vols. 720–748. Munich: LINCOM Europa. Febrés, Andrés. 1884 [1765]. Gramática Araucana, o sea, Arte de la Lengua General de los Indios de Chile, reproducción de la edición de Lima de 1765, con los textos completos, por Juan M. Lársen, impreso por Juan A. Alsina. Fernández Garay, Ana V. 1997. El sustrato tehuelche en una variedad del mapuche argentino. Actas. Jornadas de Antropología de la Cuenca del Plata y Segundas Jornadas de Etnolingüística. 
Rosario: Universidad Nacional de Rosario. Fernández Garay, Ana V. 2001. Ranquel-Español/Español-Ranquel: Diccionario de una Variedad Mapuche de La Pampa (Argentina). (Indigenous Languages of Latin America 2). Leiden: Research School of Asian, African, and Amerindian Studies (CNWS). Golbert de Goodbar, Perla. 1975. Epu Peñiwen (‘Los dos hermanos’). Cuento tradicional araucano. Transcripción fonológica, traducción y análisis. Buenos Aires: CICE. Golluscio, Lucía. 1988. La comunicación etnolingüística en comunidades mapuches de la Argentina: Gramática, textos, etnografía del habla. Universidad Nacional de La Plata Doctoral Dissertation. Golluscio, Lucía. 1990. La imagen del dominador en la literatura oral mapuche y su relación con lo ‘no dicho’, una estrategia de resistencia cultural. Annales Littéraires Special Issue:695–710. Besançon: Univ.de Franche-Comte.


Golluscio, Lucía. 1997. Operadores gramaticales metapragmáticos: evidencialidad y modalidad en mapudungun. Papeles de Trabajo 6:53–66. Rosario: Universidad Nacional de Rosario. Golluscio, Lucía. 2000. Rupturing implicature in the Mapudungun verbal system: The suffix -fï. Journal of Pragmatics 32:239–263. Golluscio, Lucía. 2006. El Pueblo Mapuche: Poéticas de pertenencia y devenir. Buenos Aires: Editorial Biblos. Golluscio, Lucía. 2007. Morphological causatives and split intransitivity in Mapudungun. International Journal of American Linguistics 73(2):209–238. April 2007. Golluscio, Lucía & Ramos, Ana. 2007. El “hablar bien” mapuche en zona de contacto: valor, función poética e interacción social. In Golluscio, Lucía & Dreidemie, Patricia (eds.), Prácticas comunicativas indígenas en contextos urbanos: exploraciones teóricas y metodológicas. (Signo y Seña 17). Instituto de Lingüística, Universidad de Buenos Aires. Gómez Rendón, Jorge & Adelaar, Willem. Loanwords in Imbabura Quechua. (this volume). Greenberg, Joseph H. 1987. Languages in the Americas. Stanford: Stanford University Press. Grimes, Joseph. 1985. Topic inflection in Mapudungun verbs. International Journal of American Linguistics 51:141–63. Guevara Silva, Tomás. 1925–1927. Historia de Chile: Chile Prehispano. 2 vols. Santiago: Universidad de Chile, Balcells & co. Hajduk, A. 1982. Cultura mapuche de la Argentina. 7–9. Buenos Aires: Ministerio de Cultura y Educación. Hamp, Eric. 1971. On Mayan-Araucanian comparative phonology. International Journal of American Linguistics 37:156–159. Harrington, Tomás. 1912–1955. Cuadernos. Vol. 1 (pp. 1–178) and 2 (pp. 1–176). Fondo Documental del Programa Pilcaniyeu, Centro Nacional Patagónico-Consejo Nacional de Investigaciones Científicas y Técnicas. Harrington, Tomás. 1935. Publicaciones del Museo de Antropología y Etnografía. A.3. 59–69. 
Buenos Aires. Harrington, Tomás. 1946. Contribución al estudio del indio Gununa Kuna. Revista del Museo de La Plata (Nueva Serie) 2, Antropología: 14:237–275. La Plata. Havestadt, Bernardo. 1883 [1777]. Chilidúgú sive Tractatus Linguae Chilensis. 2 vols. Leipzig: B. G. Teubner. Hayes, Bruce. 1995. Metrical Stress Theory: Principles and Case Studies. Chicago: The University of Chicago Press. Hill, Kenneth & Hill, Jane. 1999 [1986]. Hablando mexicano: La dinámica de una lengua sincrética en el centro de México. México DF: Instituto Nacional Indigenista, CIESAS y SEP-CONACYT. Instituto Nacional de Estadísticas y Censos. 2004/2005. Encuesta Complementaria de Pueblos Indígenas. Buenos Aires: INDEC.


Kaufman, Terrence. 1990. Language History in South America: What we know and how to know more. In Payne, Doris (ed.), Amazonian Linguistics: Studies in Lowland South American Languages, 13–73. Texas: Texas University Press. Kenstowics, Michael J. 1994. Phonology in Generative Grammar. Oxford: Blackwell. Key, Mary Ritchie. 1978. Araucanian genetic relationships. International Journal of American Linguistics 44:280–293. Key, Mary Ritchie. 1984–. Intercontinental Dictionary Series. South American Languages Database. Mapudungun. . Latcham, Ricardo E. 1924. La organización social y las creencias religiosas de los antiguos araucanos. Santiago: Imprenta Cervantes. Latcham, Ricardo E. 1927. El problema de los Araucanos: Sus orígenes y su lengua. Revista Mensual de Ciencias, Letras y Bellas Artes IV, II, (6). Concepción: Universidad de Concepción. Latcham, Ricardo E. 1928. Prehistoria Chilena. Santiago: Oficina del Libro. Lenton, Diana & Lazzari, Axel. 2002. Araucanization and Nation, or How to inscribe foreign indians upon the Pampas during the last century. In Briones, Claudia & Lanata, José L. (eds.), Contemporary Perspectives on the Native Peoples of Pampa, Patagonia, and Tierra del Fuego: Living on the Edge, 33–46. Westport: Greenwood Publishing Group. Lenz, Rodolfo. 1895–1897. Estudios Araucanos (1–12). (Anales de la Universidad de Chile, Santiago XC–XCVIII). 115–485. Santiago de Chile: Imprenta Cervantes. Lenz, Rodolfo. 1905–1910. Los elementos indios del castellano de Chile. Estudio lingüístico y etnológico. Diccionario etimológico de las voces chilenas derivadas de lenguas indígenas americanas. Anexo a los Anales de la Universidad de Chile. Santiago de Chile: Imprenta Cervantes. Loos, Eugene E. 1973. Estudios Panos 2. (Serie Lingüística Peruana 11). 263–282. Yarinacocha, Perú: Summer Institute of Linguistics. Mandrini, Raúl & Ortelli, Sara. 2005. Volver al País de los Araucanos. Buenos Aires: Ed. Sudamericana. Mannheim, Bruce. 1991. 
The Language of the Inka since the European Invasion. Austin: University of Texas Press. Márquez Eyzaguirre, Luis. 1955. Intromisión de la lengua quechua en Chile. In Anales de la Universidad Católica de Valparaíso, Vol. 3 (1956), 15–38. Mellico, Fresia & Pereira, Petrona. 1997. Actas de III Jornadas de Lingüística Aborigen. 407– 412. Buenos Aires: Universidad de Buenos Aires. Mostny, Grete. 1992. Prehistoria de Chile. Santiago: Editorial Universitaria. Nacuzzi, Lidia. 1998. Identidades impuestas: Tehuelches, aucas y pampas en el norte de la Patagonia. Buenos Aires: Sociedad Argentina de Antropología.


Nacuzzi, Lidia. 2002. Los grupos, los nombres, los territorios y los blancos: Historia de algunos nombres étnicos. In Boccara, Guillaume (ed.), Capítulo IX de Colonización, resistencia y mestizaje en las Américas (siglos XVI-XX). Quito, Ecuador: Ediciones AbyaYala e IFEA Lima, Perú. Nardi, Ricardo. 1962. El quechua de Catamarca y La Rioja. Cuadernos del Instituto Nacional de Investigaciones Folklóricas 3:189–285. Buenos Aires. Nardi, Ricardo. 1981–1982. Cultura Mapuche en Argentina. 11–38. Buenos Aires: Instituto Nacional de Antropología, Ministerio de Cultura y Educación. Payne, David & Croese, Robert. 1988. On Mapudungun linguistic affiliations: An evaluation of previous proposals and evidence for an Arawakan relationship. Paper read at the 46th International Congress of Americanists, July 1988, Amsterdam. Religiosos franciscanos misioneros de los Colegios de Propaganda Fide del Perú. 1998 [1905]. Vocabulario políglota incaico: Quechua, Aimara, Castellano. Palomino, Rodolfo Cerrón (ed.). Lima: Ministerio de Educación de Perú. Salas, Adalberto. 1979. Semantic Ramifications of the Category of Person in the Mapuche Verb. Doctoral Dissertation. State University of New York at Buffalo, University Microfilms International. Salas, Adalberto. 1984. Textos orales en mapuche o araucano del centro-sur de Chile. Concepción, Chile: Universidad de Concepción. Salas, Adalberto. 1992. El mapuche o araucano: Fonología, gramática y antología. Madrid: MAPFRE. Silveira, Mario Jorge. 1996. Alero Los Cipreses (Provincia de Neuquén). Actas II Jornadas de Arqueología de la Patagonia p.107–118. Centro Nacional Patagónico-Consejo Nacional de Investigaciones Científicas y Técnicas. Smeets, Ineke. 1989. A Mapuche Grammar. Doctoral Dissertation. Leiden: University of Leiden. Smeets, Ineke. 2008. A grammar of Mapuche. Berlin: Mouton de Gruyter. Sociedad Chilena de Lingüística. 1988. Alfabeto Mapuche Unificado. Temuco: Universidad Católica. Stark, Louisa. 1970. 
Mayan Affinities with Araucanian. Papers from the Meeting of the Chicago Linguistic Society 6:57–69. Suárez, Jorge A. 1959. The phonemes of an Araucanian dialect. Internacional Journal of American Linguistics 25(3), April:177–181. Suárez, Jorge A. 1988 [1954]. Observaciones sobre el dialecto manzanero. In Fontanella de Weinberg, Beatriz (ed.), Estudios sobre lenguas indígenas sudamericanas, 107–21. Bahía Blanca: Universidad Nacional del Sur.


Vezub, Julio. 2008. Historiar las prácticas etnográficas. Tomás Harrington y la morfología de la cultura en Patagonia septentrional hacia 1940. Suplemento Anuario IEHS, Actas del Seminario Internacional Pueblos indígenas de América Latina, Siglo XIX, Sociedades en movimiento. Tandil: Instituto de Estudios Histórico-Sociales (IEHS), Centro de Investigaciones y Estudios Superiores en Antropología Social (CIESAS). Viegas Barros, J. Pedro. 2005. Los préstamos del gününa küne al mapudungun. In Voces en el viento, Raíces lingüísticas de la Patagonia (Colección “El Suri”), 153–163. Buenos Aires: Ediciones Mondragón. Viegas Barros, J. Pedro. n.d. Quichuismos en la variedad ranquel. Zúñiga, Fernando. 2000. Mapudungun. Münster: LINCOM Europa. Zúñiga, Fernando. 2006a. Mapudungun: El habla mapuche. Santiago de Chile: Centro de Estudios Públicos. Zúñiga, Fernando. 2006b. Deixis and alignment: Inverse systems in the indigenous languages of the Americas. (Typological Studies in Language 70). Amsterdam/Philadelphia: John Benjamins. Zúñiga, Fernando. 2007a. ‘Mapudunguwelaymi am? Acaso ya no hablas mapudungun?’: Acerca del estado actual de la lengua mapuche. Estudios Públicos 105(Summer 2007):9– 24. Zúñiga, Fernando. 2009. Applicatives in Mapudungun. Paper read at the Society for the Study of the Indigenous Languages of the Americas (SSILA) Annual Meeting, San Francisco, January 2009.

Loanword Appendix

Spanish

ufisa

sheep

potrillo

foal, colt

ofisha

sheep

furiku

donkey

kaniru

ram

mula

mule

pichi ofisha

lamb

kansu

goose

korderu

lamb

patu

duck

foraku

boar

lofo

wolf

zomo sanchu

sow

kolmenia

bee

lawuna

lagoon

foforo

match

karfon

charcoal

kompañ

partner

don

don, title of respect

awela

grandmother

sanchu

pig

kormeña

beehive

wachol

orphan

sañwe

pig

faltan

to die

familia

family

kapüra

goat

iñchalen

swelling

tigre

tiger

chifu

he-goat

doktor

physician

potreru

pasture

pichi kapüra

kid

kansalen

tired

walpon

stable, stall

kawell

horse

kansatun

to rest

toro

bull

potüro

stallion

galleta

biscuit

manshun

ox

zomo kawellu

mare

fürin

to roast, to fry

waka

cow

yewa

mare

ronotun

to bake


rono

oven

fota

boot

asaon

hoe

tetera

kettle

sapatufe

shoemaker

wilkita

fork (2) / pitchfork

kaserola

pan

chumpiru

hat, cap

fwente

dish

fonsiku

pocket

ratrillu

rake

tason

bowl

foton

button

llasu

lasso

tasa

cup

alfiler

pin

semilla

seed

charu

cup

pülata

jewel

wazaña

sickle, scythe

pülatillu

saucer

pañuelu

handkerchief, rag

trillan

to thresh

kosecha

harvest

kuchara

spoon

kuchillu

knife (1)

tuwalla

towel

kachilla

wheat

fork

peine

comb

kawella

barley

tenasa

tong

puma

ointment

afena

oats

sayuno

breakfast

kafon

soap

aro

rice

almuerso

lunch

espeko

mirror

pinu

hay

pan

bread

tolto

tent

fülor

flower

masa

dough

kawle

cable

monte

tree

sofalün

to knead

piesa

room

sause

willow

masan

to knead

pwerta

door, gate

mate

mate

molinu

mill

kanzaw

lock

ufas foki

vine

ofaz

grape

llafe

key

pillaw

to catch, caught

asukura

sugar

fentana

window

lichi

milk

kalera

ladder

kadena

chain

entulichin

to milk

fresada

blanket

puñete

to pound

kesu

cheese

silla

chair

tikera

scissors, shears

pulku

wine

mesa

table

kalpinteru

carpenter

folso

bag

fela

candle

seruchu

saw

folsa

bag

fatia

trough

martillu

hammer

ekota

sandal

viga

rafter

külafo

nail

ispwela

spur

poste

post, pole

pulata

silver

felantar

smock

trafla

board

kofre

copper

pañu

cloth

fazofe

adobe

vidrio

glass

seza

silk

sofaltrülken

to tan

kanesta

basket

akucha

needle (1)

chaküra

field

pintura

paint

abrigo

coat

kampo

field

pintan

to paint

kamisa

shirt

werta

garden

etaka

peg

kwellu

collar

koral

fence

kompañan

to accompany

pollera

skirt

sanka

ditch

fürollan

to muddle

pantalon

trousers

awar

lima beans

modan

to move

mezia

sock, stocking

napor

turnips

marchan

to walk

sapatu

shoe

surko

furrow

pala

shovel

tenedor


angkashün

to let sb. ride on the back of the horse

marte

Tuesday

pürezu

captive, prisoner

mierkole

Wednesday

kwefe

Thursday

ley

law

fiene

Friday

kues

judge

safado

Saturday

awokaw

lawyer

fülang

white

testiku

witness

plan

white

kulpatun

to convict

pañush

smooth

kulpafle

guilty

kürasia

thanks

kastiku

penalty, punishment

kekawün

to complain

multa

fine

karsel

prison religion

kareta

cart, wagon

rweda

wheel

eje

axle

yuku

yoke

fote

boat

remo

oar

pala

paddle

timun

rudder

karpa

sail

pwerto

port

repetan

to respect

relijion

fotetuwe

port

animawün

to dare

dios/dio

god

kapatangen

foreman

wapo

brave

malisian

to curse

resiwin

to receive

perdonan

to forgive

razio

radio

plata

money

blame

tele

television

peso

coin

pwede

can

telefonu

telephone

pofre

poor

lapi

pencil

sikleta

bicycle

zefen

to owe

estudio

study

wingka kawell

car

zefe

debt

eskwela

school

auto

car

alkilan

to hire

kolekio

school

türen

train

sweldo

wages

malisian

to suspect

afion

airplane

kanan

to earn

porke

because

makina

machine

fenzen

to sell

no

no

faril

barrel

tuntefalin

price

kontestan

to answer

pital

hospital

alü falin

expensive

neqan

to deny

antiojo

pichi falin

cheap

papel

paper

spectacles, glasses

faratu

cheap

lifru

book

presidente

president

ministro

minister

polisia

police

patente

license plate

sertificaw

birth certificate

entusiasmawün

to be enthusiastic

kulpa

depwe

after

pülauta

flute

abajo

under

kazkawilla

rattle

sofran

remains

paisano

peasant

seran

to shut

pweblo

town

kürus

cross

limite

boundary

pelota

ball

sosialimu

socialism

pedaso

part

pekan

to fish

ora

hour

karoti

club

relo

clock

entrekawün

to surrender

kawtifangey

captive, prisoner

semana

week

zomingku

Sunday

lune

Monday

zullipresizenten

election

zireksion

address

kalle

street

korew

post/mail

tampilla

postage stamp

karta

letter

kolchon

mattress

lata

tin/can

41. Loanwords in Mapudungun

tornillu

screw

müski

honey

entutornillan

screwdriver

wallka

bag

fotilla

bottle

ükülla

cloak

patilla

candy/sweets

tupu

pin

pülatiko

plastic

lilpu

mirror

fomfa

bomb

kawitu

bed

kalenzario

calendar

ichona

sickle, scythe

te

tea

kachu

kafe

coffee

wañu

Quechua


Gününa Yajüch

kalel

mountain, hill

churchur

chorlo (Pluvialis dominica)

tartajar

avutarda (Chloephaga pipta)

grass

kululu

butterfly

dung

kelesia

lizard

sapallu

pumpkin, squash

yüskalaw

mamuelchoique (Adesmia campestris)

weike

willow

külüf

mate

kotrü

salty

kotrü ko

brackish

waylün

to cry

kechuwe

handle of hunting sling

trawil

bola of hunting sling

lüpümün

to burn (1)

titi

lead

ñaña

sister

wampo

canoe

kaka

mother’s sister

kawewe

oar

zomo achawall

hen

kawen

to row

achawall

chicken

chawcha

coin

challwa

fish

kelü

red

puma

lion

muchan

to kiss

pike

flea

chillkatun

to write

müski

beeswax

awka

enemy

charaypuka

lizard

awkan

war, battle

pütra

belly

challwan

to fish

Aymara

pongkün

swelling

challwafe

fisherman

pataka

a hundred

kangkan

to roast, to fry

lüftukun

arson

warangka

a thousand

patay

bread

kalkutun

magic

charki

meat (jerked meat)

kalku

sorcerer, witch

cha

question tag

Index of languages

A-Hmao, 638 Afro-Asiatic, 4f, 135, 149, 205 Ainu, 545, 548, 550, 562, 564 Akha, 640 Akkadian, 51, 332, 345 Akkala, 384f, 393–395 Alagwa, 103, 107, 113, 118 Alamannic, 330 Albanian, 233–235, 239–243, 245f, 249 Altaic, 530, 545, 577, 579f, 582f, 588 Altay, 496 Aluku, 973–975, 978, 980 Amharic, 125, 127f, 131–138 Colloquial, 128 Ancient Egyptian, 194 Arabic, 42, 44, 51, 63, 77, 79–83, 85–93, 143, 145, 147f, 150f, 153, 156–158, 160, 168f, 172, 174–178, 180–184, 191–193, 195–202, 204–206, 222, 224, 271, 341, 343, 365, 368, 371, 417–421, 425, 432f, 435f, 438–442, 602f, 610, 688f, 695–699, 701, 703–708, 719–721, 728–731, 733–739 Classical, 195, 198, 695f Indian Ocean, 80, 85f Moroccan, 40, 48, 192f, 196f, 199–203, 205, 341f Omani, 83, 86, 730 Standard, 192, 195, 197–200, 204 Arawakan, 47, 929, 972, 975, 1039 Archi, 4, 14, 56, 60, 62, 430–446 Arin, 471, 473, 475, 478, 480f, 492 Armenian, 268f, 274–276, 278 Aromanian, 230, 235 Asax, 113 Assan, 471, 473, 475, 478, 480, 492 Assyrian, 776 Austro-Asiatic, 4f

Austronesian, 4f, 44, 545, 547, 577, 584, 595, 617, 621, 640, 642, 652, 665f, 668, 670f, 673f, 678f, 686f, 693, 705, 717f, 721f, 724, 726, 728, 735–737, 739f, 747, 750, 752f, 757, 764, 771 Avar, 14, 16, 415–424, 431–443 Awa Pit, 948, 950, 952, 954 Aymara, 954, 1039, 1042f, 1046, 1048f, 1061 Azerbaijani, 16, 418–421, 433–435, 437f Bade, 170, 175 Bagirmi, 149, 170, 173 Bahasa Melayu, 675 Balinese, 692f, 697, 707 Balochi, 268 Baltic, 160, 185, 397f, 401f Banda, 176 Baniwa, 996 Banjar Malay, 720, 722, 733, 739 Banjar, 686, 719f, 722, 733, 739 Banjarese, 722f, 733, 735 Bantu, 5, 77, 79f, 87, 89, 92, 103, 107, 113f, 117f, 120, 149, 153, 218–220, 222–226, 718–723, 726–729, 733–736, 739f, 920, 924, 926f Barbacoan, 947f, 950, 952–955, 957f, 961 Bargam, 752, 754f, 757–764 Batavian Portuguese Creole, 690f, 699 Batek, 664, 667 Bavarian, 330 Bazaar Malay, 736 Bel, 748f, 751–755, 758, 761, 764 Bemba, 219 Bengali, 602, 610 Benue-Congo, 142, 145, 148, 151, 153 Berber, 60, 63, 145, 148, 151, 156f, 159, 175f, 180, 191–196, 199–205, 342 Beni Iznasen, 192, 199 Central Moroccan, 192 Bermejo, 1015–1018, 1020–1022, 1030f Berti, 166f, 171

Betsimisaraka, 722, 724, 735 Bezhta, 4, 16, 56, 60, 62, 414–429 Bianjida-Datooga, 114 Bilibil, 748, 754, 764 Birlinarra, 791, 793, 804, 812 Bondei, 80 Brabants, 338 Buginese, 725f Bulgarian, 230, 233, 235f, 239, 241, 243–245, 248, 271 Burmese, 601, 639 Burunge, 103, 113, 118, 120 Burushaski, 44, 265, 269 Buryat, 501f, 509 Cahabón, 877f Cambodian, 627 Cantonese, 81, 575, 584, 587f, 590, 595, 622–624, 628, 689, 774 Cara, 945, 947–951, 953, 961 Carib, 929, 931–933, 969, 972, 976 Celtic, 51, 234, 332, 345, 350, 352, 363–365, 370, 375, 377, 381 British, 363f, 370, 373, 375 Ceq Wong, 3f, 46, 48, 56, 60, 62, 659–685 Ch’ol, 852, 857, 865, 888 Ch’olan, 877, 879–887, 890 Ch’olti’, 865, 876f, 880, 886–888 Ch’orti’, 865, 888 Chadic, 4, 142–144, 147–149, 153, 159f, 168–173, 175f, 180, 183f Chamelco, 876, 878 Chamic, 44, 619, 622f Chantyal, 447, 449, 466 Chatino, 858, 866 Chepang, 450 Chichimec, 868 Chichimeca, 897, 900 Chilean Pehuenche, 1039


Chinese, 3, 37, 43, 58, 79f, 82, 86, 222, 224, 371, 381, 480, 482f, 497, 525, 528, 530f, 533–538, 540f, 546–550, 552, 554–557, 559–565, 575–582, 584, 586–591, 593–595, 599, 601, 604f, 607, 611, 617–632, 638– 640, 642–652, 666, 689, 694, 698f, 701, 737, 774, 976f Northern, 580 Chinook Jargon, 864 Chocho, 866 Chontal, 852–854, 865, 888 Chorote, 1015 Chuj, 852–854, 865, 867 Chujean, 880 Cobán, 873, 876, 878, 880, 891 Comorian, 77f, 80, 82, 719–721, 726–728, 730, 733–735 Coptic, 51, 371 Croatian, 271 Cuicatec, 857, 866 Cuitlatec, 866–868 Cutchi/Sindhi, 82 Cuzco Quechua, 948, 950f, 953, 958, 1051 Czech, 39, 58, 263, 272, 274–278, 282–285, 287f, 305, 308, 311–313, 774, 776, 778–780 Daco-Romanian, 230, 235f Dagur, 527, 530–535, 537–540 Dami, 748 Danish, 330, 338, 361, 364, 366 Dargwa, 434 Datooga, 103, 105f, 108–113, 118, 120 Daw, 638 Dazaga, 166f Desano, 995, 998 Dharuk, 367, 371 Dolgan, 496 Dravidian, 45, 51, 82, 265f, 274, 283, 368, 545, 689, 694, 696 Dusun Malang, 717 Dusun Witu, 717


Dutch, 3f, 16, 40, 45, 47, 57, 60, 62, 197, 217, 338–359, 366f, 370–373, 375, 377, 381, 451, 550, 552, 561f, 564, 593, 660, 690f, 693, 696, 698–708, 731, 733–735, 918, 920f, 925f, 928, 931–934, 936, 968, 970– 978, 980f Early New High German, 320, 322 East Middle German, 313, 315, 318–320 East Slavic, 396 Egyptian, 51 Elamite, 579, 582–584 Elwana, 77, 719 Enets, 473, 480 English, 3–6, 8f, 11, 15, 17f, 35, 37–40, 42f, 45–48, 50f, 56, 59f, 62f, 73, 76, 79, 84–88, 90–92, 106, 108–110, 115, 117f, 128, 131f, 134, 136, 145–148, 150, 153, 156–158, 160, 169, 172–180, 183f, 216f, 220–224, 226, 248, 288, 330, 338, 341, 343f, 347–354, 360–383, 450f, 453f, 456, 458f, 475, 516, 552–554, 557f, 561–564, 575, 581, 583– 585, 589f, 593, 595, 603–605, 607–611, 618–620, 622f, 629, 631, 642, 644f, 647, 649, 671f, 676, 690f, 696, 698–704, 707f, 719–721, 731–733, 735–740, 747, 751, 754, 756–760, 762f, 772, 774–776, 778–785, 790, 794–796, 806, 809, 812, 827, 851, 855f, 859, 875, 913, 920–924, 927, 930f, 933f, 972, 974, 976, 1019, 1026–1028 Indian, 50, 221 Papua New Guinea, 757 Etruscan, 368 Even, 37f, 45, 61, 92, 106, 200, 236, 240, 246, 250, 315, 345, 360, 370, 388, 395, 397f, 424, 473, 475, 483, 499, 503, 508, 540f, 545, 563, 586f, 646, 673, 695f, 704, 759, 903, 913, 961, 972, 1040 Evenki, 473, 480, 482f, 499, 502f, 505f, 508, 514f, 525, 529f, 537 Ewe, 925 Faroese, 330, 338 Fars, 268

Fennic, 394–396, 398f, 401f Fijian, 759, 783 Finnish Romani, 270 Finnish, 270, 389f, 392, 394f, 399, 475 Finno-Ugric, 504 Flemish, 338, 341, 343, 366 Fon, 921, 935 Fongbe, 217, 925 Frankish, 330, 365, 368, 371 French, 17, 35, 39, 42, 45, 47, 50f, 79, 82, 84, 128, 134, 136, 145–148, 150, 153, 156f, 160, 169, 171, 173–180, 183f, 192, 196– 198, 204, 216–218, 220, 222–226, 234, 237–239, 241–244, 249, 275, 340, 342–344, 347–350, 352–354, 365–375, 377f, 381, 451, 553, 559, 562, 564, 583, 618–623, 629, 644, 691, 717, 719–721, 731–739, 776–778, 780, 918, 930f, 933, 968–984, 987f Norman, 363–366, 369f, 375 Frisian, 330, 338f, 343, 347, 360, 363, 368 Ful, 156f Fulfulde, 151f Gaelic, 365 Gagauz, 231 Gawwada, 3f, 11, 57, 60, 62, 124–141 Gbe, 920f, 925–927, 930–933, 935 Gedaged, 748f, 751f, 754–756, 758, 760, 763f Georgian, 268, 414, 416–423, 425 German, 11, 37–40, 42f, 45, 47, 79, 84–86, 106, 115, 171, 173, 176, 197, 232f, 237, 239, 241f, 244, 247, 271, 275, 304–322, 330f, 334, 336, 338, 342, 344, 348f, 351, 353f, 362, 368, 371, 377, 451, 475, 553, 558, 562, 564, 575, 591, 749, 755, 757–760, 763, 877, 972 High German, 13, 313f, 316, 318f, 330f, 334–336, 338, 348 Germanic, 4, 38, 44f, 191, 234, 275, 308, 313, 330–334, 336, 338, 345–347, 353f, 360, 364f, 368, 370f, 375–377, 381, 389, 394– 402, 406, 489 Ghomaran Berber, 205

Giryama, 77 Gorwaa, 103f, 112, 118 Gothic, 308, 313, 330, 360 Greek, 38f, 42, 233, 236, 239, 241f, 245, 248f, 269f, 272–280, 282–287, 333–335, 343, 347, 364f, 367–372, 374, 377f, 381, 418, 439, 592, 603, 626, 631, 696, 732, 734, 776 Classical, 774, 778, 780f Green Hmong, 642 Guarani, 286, 904, 911 Guaraní, 864, 1020, 1046 Guarijio, 823 Guianese Creole, 969, 973–983, 985f Guisnhay, 1015–1017 Gujarati, 82, 371, 696 Gününa Yajüch, 1042–1044, 1046–1050, 1053f, 1061 Gurindji, 4, 56, 60–62, 790–822 Gurung, 447–449, 467 Guugu Yimidhirr, 371 Gyalsumdo, 447f Hadza, 103f, 118 Haitian Creole, 217, 974 Hakka, 575, 774 Hausa, 3f, 45, 57, 60, 62, 142–165, 168f, 171–174, 176–178, 180, 182–184 Hawaiian, 4, 57, 60, 62, 771–789 Hebrew, 44, 63, 194, 364, 699, 776, 778, 780 Hindi-Urdu, 221, 602, 610, 689 Hindi, 77, 79, 81f, 85–88, 90, 93, 107, 223, 451, 700 Hmong-Mien, 5, 545, 577f, 582f, 595, 638–640, 643, 645f, 648, 650–652 Hmu, 638 Hokkien, 602, 689, 699 Hollands, 338 Huastec, 852, 858, 865, 867f Huilliche, 1037, 1039 Hungarian, 48, 58, 230, 232, 236f, 239–241, 243, 245, 247, 260–263, 270–288 Hup, 3f, 57, 59f, 62, 992–1014


Iban, 686 Icelandic, 330, 338 Igbo, 142, 931, 933 Ilocano, 772 Imbabura Quechua, 3f, 46, 56, 59f, 62, 944– 967, 1026, 1046, 1050 Inari, 385, 389, 395f, 399 Indian, 50, 79–87, 90, 107, 115, 117, 215f, 218, 221, 264–268, 274–276, 278, 285, 450, 601f, 688, 694, 697f, 706, 725, 738, 856, 929, 951, 1003 Indic, 44f, 371, 451, 466, 602, 608f, 611, 689, 693, 696 Indo-European, 4–6, 44, 230, 234, 260f, 264, 330f, 338, 345, 360, 375, 397, 401f, 406, 450, 462, 545, 578, 608, 622f, 689, 900 Indo-Iranian, 5, 260f, 264, 401f, 577, 579, 582f Indo-Portuguese, 221 Indonesian, 3f, 6, 9–12, 16f, 51, 56, 60, 62, 341, 686–716, 725, 729, 734 Iranian, 267f, 275f, 278, 281, 286f, 480 Iraqw, 4, 57, 60, 62, 103–123 Istro-Romanian, 230 Italian, 39, 45, 47, 79, 86, 127f, 131f, 134, 137, 237, 239, 241–244, 275, 334f, 341, 343, 348, 365, 367f, 371, 553, 592, 1018 Itzaj, 852, 865, 877 Ixcatec, 857, 866, 868 Jah Hut, 659, 662, 664, 667–671, 673f Jahai, 667, 671, 677, 679 Jakarta dialect, 687 Jaminjung, 792f, 795f, 798f, 802, 805, 807 Japanese, 4, 20, 37, 42f, 47, 56, 60, 62, 528, 545–574, 579, 581, 588, 591, 593, 624, 772, 775f, 781 Jaru, 791, 794–796, 799f, 803, 807–809 Javanese, 17, 51, 548, 603–605, 607, 610, 691–693, 697, 705, 707f, 718–725, 733–736, 739, 976 Jingulu, 792, 795f Jova, 826 Jurchen, 580

451, 620,

670, 730,


K’iche’, 852, 858, 865, 878 K’iche’an, 873, 886, 890 Kaike, 447 Kajirrabeng, 792 Kakua, 992 Kali’na, 4, 57, 60, 62, 968–991 Kalmyk, 501f, 509 Kanembu, 166f, 170 Kannada, 45 Kanuri, 4, 12, 45, 57, 60, 62, 145–150, 153, 156–160, 166–190 Karelian, 389, 394–396, 398–403, 405f Karen, 639 Karranga, 791, 793, 799, 802 Ket, 3f, 15, 57, 60, 62, 471–495, 508 Central, 474, 476, 480 Southern, 473f, 476f, 485 Khakas, 496 Khalkha, 501f, 509, 530, 533f, 538 Kham, 450 Khmer, 548, 599, 601f, 604–606, 608f, 611, 665, 693f Khmu, 640 Kiamu, 84 Kija, 792f, 795f, 808 Kikongo, 51, 924 Kildin Saami, 3f, 56, 60, 62, 384–413 Kimvita, 76, 84 Kiranti, 450 Kiunguja, 76, 78, 83f Komi, 387, 389, 393f, 400–402, 406 Kongo, 219f, 924 Konso, 125–136 Korean, 47, 451, 545, 547f, 562, 564 Kott, 471, 473, 475, 478, 480f, 492 Kriol, 790f, 794–796, 802, 812 Kristang, 690 Kukatja, 791 Kumyk, 418, 432–435, 437f Kurdish, 267, 274f, 285, 342 Lacandon, 877 Lahu, 640

Lak, 432–436, 438–443 Langobardic, 330 Lao, 599, 640–642, 644–649, 652 Laotian, 640 Late Latin, 334f, 370 Latin, 14, 16f, 35, 38f, 42, 85, 110, 153, 176, 193, 195, 199, 203f, 222, 232–235, 237– 242, 244, 246–248, 250, 275, 312f, 331, 333–336, 340, 344, 346f, 349–354, 361, 363–374, 377f, 381, 432, 474, 592, 626, 631, 696, 702, 705, 724, 776, 856, 861, 903, 906, 909, 923, 1039, 1041, 1046, 1061 Vulgar, 333–335 Lezgian, 434 Lhomi, 452 Limburg, 342 Loango Bantu, 925, 927, 931f Loango, 920, 925, 927, 931–933 Low German, 43, 313f, 316, 318f, 330, 338, 348, 350, 352, 366, 371, 373, 377 Low Saxon, 338 Lower Sorbian, 3f, 57, 60, 62, 304–329 Lule, 1021 Maanyan, 717 Macassarese, 726 Macedonian, 235, 271 Madang, 747f, 752, 755 Madurese, 707 Magar, 450 Mah Meri, 664, 668, 677–679 Maká, 1015 Makhua, 219, 728 Malagasy, 4, 57, 60, 62, 79f, 82, 85f, 219, 222– 224, 226, 717–746 Malay, 3, 17, 46, 48, 77, 79f, 82, 85f, 371, 584, 599, 601, 603–605, 607, 610, 659, 661–679, 686, 688, 690, 692–695, 697, 700, 703–705, 718–725, 728–730, 733–737 Brunei, 721 Sumatra, 720–722 Malayo-Polynesian, 686, 717, 722, 753 Malgwa, 175

Malngin, 791, 793 Manange, 4, 49, 57, 60, 62, 447–470 Manchu, 530, 532f, 535, 537, 580 Mandarin Chinese, 3f, 57f, 60, 62f, 525, 530f, 535, 540, 575–598, 642, 644, 650f, 689 Mande, 145, 148f, 151–153 Manga, 167f, 171 Maore, 728 Maori, 771, 776, 778, 780 Mapuche, 1035–1037, 1039–1046, 1048, 1050, 1053f, 1061f Mapudungun, 3f, 57, 60, 62, 371, 1035–1071 Marquesan, 771 Massachusett, 371 Matawai, 918 Matlatzinca, 897 Matugar, 748 Maya, 848, 875f, 879, 886f, 890, 897 Mayo, 826 Mazahua, 857, 866f, 897 Medieval Latin, 334f Megiar, 747–749, 754, 764 Megleno-Romanian, 230, 235 Mende, 931, 933 Merina, 717, 730, 735f Mgao, 83 Middle Chinese, 556, 584, 601, 605f, 611, 618, 620, 625, 643, 645, 647, 651f Middle Dutch, 344, 361 Middle French, 365 Middle High German, 13, 313, 323, 331 Middle Latin, 333 Middle Low German, 313, 323, 362, 370, 372 Middle Mon, 501 Middle Persian, 267 Mienic, 638f, 652 Mijikenda, 77, 719 Mikmaq, 371 Min, 575, 611, 629, 689 Minangkabau, 686, 692, 729 Miriwung, 792f, 795f, 798, 800f Mixe-Zoque, 852, 856–858, 865 Mixteco, 897


Mocho, 852, 857, 867f Moluche, 1037, 1039 Mon-Khmer, 44f, 604f, 607, 611, 617, 639f, 642, 652, 659, 693 Mon, 602, 604, 665, 693 Mongolian, 368, 371, 477f, 482f, 501, 511, 530, 532–538, 580, 582, 584 Mongolic, 16, 498f, 501, 504–507, 509–516, 527, 530, 532f, 535f Mopan, 857, 865, 867, 875, 877, 885 Mrima, 83, 89 Mudburra, 791–793, 795f, 799, 802, 806 Multani, 266 Mun, 638 Munda, 45, 265f, 274 Mvita, 83 Mwani, 719 Nahuatl, 371, 827f, 831, 835f, 851–854, 857f, 864f, 867–869, 882f, 885, 897, 899f, 902, 913, 1021, 1046 Nanai, 527 Nar-Phu, 447–449, 452, 466 Nara, 113, 548–550 Ndyuka, 918, 922, 924–926, 974 Nebome, 826 Negidal, 529 Nenets, 393f, 480 Nengee, 973–976 Neo-Latin, 79, 86 Nepali, 49, 448–464, 466 Nevome, 826 New High German, 318, 323, 331 Newar, 450 Ngaliwurru, 792f, 795f, 802, 807 Ngardi, 791, 795f, 799 Ngarinyman, 791–793, 801, 804, 812 Ngizim, 175 Nheengatú, 993, 996f, 999–1001, 1003 Niger-Congo, 5, 103, 146f, 149, 153, 159, 173, 176 Nilo-Saharan, 103, 147, 149, 153 Nivaklé, 1015, 1020


Noctén, 1015f, 1019 North Germanic, 330, 338, 360, 389, 392, 394– 396, 398–402, 406 North Saami, 384, 389, 391, 399 Northern Tepehuan, 825 Norwegian, 330, 338, 361, 366, 389, 394f, 400– 402, 405f, 516 Nukak, 992 Nungali, 792, 808 Nupe, 152, 175f Nyamwezi, 79, 84, 86 Nyaturu, 113, 118 Nyika, 80 Nyiramba, 104, 107, 113, 117 Ocuilteca, 897 Oghur, 275 Oirat, 501 Old Chinese, 576–578, 584, 606, 618, 620, 624f, 643, 645, 647, 652 Old French, 50, 334f Old Frisian, 347, 363 Old High German, 4, 55, 57, 60, 62, 308, 313, 323, 330–337, 360 Old Irish, 334f, 347 Old Javanese, 548, 691, 693, 723, 725, 739 Old Khmer, 604, 609, 693f Old Mon, 578, 602 Old Norse, 35, 39, 347, 360–362, 364–366, 370–372, 375f Old Persian, 583 Old Russian, 396, 403 Old Slavic, 346 Old Spanish, 832, 908 Old Thai, 608, 693 Opata, 826 Oromo, 80, 127f, 130–136 Oroqen, 4, 57, 60, 62, 525–544 Oscan, 368 Ossetic, 267f, 275 Otomanguean, 4, 856–858, 866, 897, 902, 912 Otomi, 4, 57, 60, 62, 856–858, 866, 897–917 Pacific Pidgin English, 756

Pali, 602, 604–609, 611, 693 Pama-Nyungan, 4, 790, 792f, 797, 800f, 805, 808–812 Pamaka, 974 Pame, 867, 897 Parya, 266 Pasto, 948, 950 Persian, 77, 79–82, 85f, 88, 90, 93, 265, 268f, 275, 283, 365, 417f, 420, 425, 433–439, 441f, 504, 602, 610, 688f, 695, 697–699, 701, 706, 734 Phoenician, 194 Picardic, 348f Picunche, 1037, 1039 Pilagá, 1020f Pipil, 823, 857, 865 Pokomo, 77, 719 Polabian, 305 Polish, 45, 243, 305, 312 Poqomchi’, 865, 876, 886 Portuguese, 37, 39, 56, 61, 79, 83–88, 91, 175f, 221f, 224, 341, 348, 367, 371, 552, 554, 562, 564, 690f, 698–701, 705, 728, 731, 733–735, 756, 777, 920, 922, 924–927, 931f, 935f, 972, 975f, 978, 980–982, 984, 986, 993–1011 Brazilian, 974, 986 Pre-Inca, 955, 957f Pre-Rangi, 108, 111 Pre-Sandawe, 108, 111 Proto-Austro-Tai, 449 Proto-Austroasiatic, 578, 618 Proto-Chadic, 144, 149, 159 Proto-Chinese, 449 Proto-Germanic, 330–332, 360, 396f, 399, 406 Proto-Greater Q’anjob’alan, 852 Proto-Hmong-Mien, 582, 644 Proto-Indo-European, 13, 38, 261, 265, 330f, 360, 397, 399, 406, 504 Proto-Malayo-Polynesian, 547 Proto-Mixe-Zoquean, 852–854 Proto-Mon-Khmer, 621, 667, 693f Proto-Mongolic, 534, 538

Proto-Semitic, 44 Proto-Tai, 618, 620–623 Proto-Tamangic, 453 Proto-Yucatecan, 852 Provençal, 365 Pumpokol, 471, 473, 475, 478, 492 Punic, 193f, 199, 203, 368 Q’anjob’alan, 880–883 Q’eqchi’, 4, 57, 60, 62, 851–854, 856f, 865, 868f, 873–896 Qo Xiong, 638 Quechua, 35, 47, 827, 864, 899, 904, 911, 944–958, 960f, 1021, 1023, 1037, 1039, 1042–1044, 1046–1049, 1051–1054, 1061 Quiche, 876 Ramoaaina, 756 Rangi, 110, 114f, 117f Ranquel, 1035, 1039, 1042–1044, 1048, 1050 Rarotongan, 776 Romance, 4, 45, 191, 195, 230, 234, 236, 240, 248, 250, 333, 338, 347, 349–352, 371, 377, 381, 631 Romani, 57f, 231, 260–276, 280–288 Romanian, 3f, 56, 60, 62, 230–259, 262 Russian, 15, 37, 42, 49, 231, 238f, 241, 243, 263, 272, 346, 384f, 387–389, 391–396, 398–406, 414, 416–425, 431–443, 472, 474–476, 478f, 481–484, 486, 491f, 496–499, 502–508, 510–515, 530f, 533–536, 538, 553 Ryukyuan, 545f, 548, 562, 564 Sabaki, 77–79, 719, 726f Saharan Tubu, 152 Saharan, 4, 147–149, 152f, 166f, 171, 173, 175f, 183f Sakha, 5, 49, 56, 60, 62, 496–524 Sambaa, 79f, 86, 89 Samoan, 45, 749 Samoyedic, 487f, 491 Sandawe, 107, 110, 118


Sanskrit, 17, 44f, 267, 371, 504, 549, 579, 582, 585, 588, 601f, 604–609, 611, 688, 693f, 697f, 700, 703–707, 719–722, 724f, 734– 736 Saramaccan, 3, 5, 51, 55f, 60–62, 218, 222, 918–943 Scandinavian, 370, 377 Selice Romani, 5, 48, 56–58, 60, 62, 260–303 Selkup, 473, 478, 480–483, 508, 511 Semaq Beri, 664, 668 Semelai, 664, 667–669, 673, 677 Semitic, 136f, 176, 180, 195, 265, 364, 608, 699 Sena, 219 Serbian, 230, 233, 236, 239, 241, 243, 247 Serbo-Croatian, 271, 274, 281–284 Seri, 826, 866f Seychelles Creole, 3, 5, 55–57, 60, 62, 215– 229, 921 Shan, 599, 640 Shingazidja, 82 Shingazija, 719, 728 Shinzwani, 728 Shirazi, 82 Shuwa, 168, 172, 174f Sindhi, 266 Skolt, 384f, 391f, 394f, 404 Slavic, 4, 230f, 233, 235f, 238–249, 270f, 274f, 304, 371, 399, 401f, 406, 530 Slovak, 58, 260, 262f, 270, 272, 274–278, 282– 285, 287f Slovene, 271 Slovenian, 260 Solomons Pijin, 756 Solon, 527, 529–531, 539 Songhai, 145, 149, 152 Songhay, 145, 147–149, 151–153, 156f, 159, 172, 175f South Cushitic, 79, 86 South Slavic, 235f, 239, 244, 248f, 270–272, 274–276, 278, 280, 283f, 286–288 South Sulawesi, 719–721, 725f, 733–735, 740 South Toraja, 726


Spanish, 35, 46–48, 59, 191, 196–199, 202, 233f, 340, 343, 348, 367, 371, 451, 562, 564, 739, 763, 777, 823–825, 827–840, 850–856, 858–865, 867–869, 876–878, 880– 885, 887–890, 899–913, 947–961, 972, 975f, 978–985, 987, 1018–1031, 1036f, 1040, 1042–1051, 1054–1061 Mexican, 901, 912 Sranan, 341, 918, 922, 924–928, 930–936, 973– 983, 985 Sukuma, 113 Sulawesi, 725f Sumatran, 670, 719, 722 Sundanese, 17, 692f, 697, 707 Suriname Portuguese, 927, 930–935 Swahili, 5, 11, 56, 60, 62, 76–102, 103, 105– 118, 120, 219, 226, 719–721, 726–728, 730f, 736 Swedish, 330, 338, 392, 516 Sydney Pidgin, 756 Syriac, 371 Tagalog, 724, 763, 772 Tahitian, 371, 771, 776 Tai-Kadai, 5, 545, 595, 599, 617f, 620f, 639f, 642, 644, 650, 652, 693 Tai, 599, 601–604, 606f, 611, 618–621, 628, 692f Taimoro, 729f Taíno, 371 Takia, 5, 56, 60, 62, 747–770 Tamang, 447, 449 Tamasheq, 175f Tamil, 221, 266, 368, 371, 601, 689, 694, 696 Tanoan, 897 Taqer’iyt, 191, 196f Tarahumara, 823, 825, 866f Tariana, 47, 996–998, 1000–1003, 1010 Tarifiyt Berber, 3, 5, 48, 56, 62, 191–214 Tashelhiyt, 192, 194 Tedaga, 166f Tehueco, 823, 834 Telugu, 602, 610, 694, 696

Temoq, 668 Temuan, 659, 664, 666f, 669–671, 678f Teochew, 609, 689 Tequistlatec, 857, 866f Ter, 384f, 393, 395 Thai, 5, 17, 56, 60, 62f, 584, 599–616, 618, 627, 640, 644f, 647, 693f Thakali, 447, 449 Thraco-Dacian, 232, 234f Tibetan, 447, 449, 452f, 456, 458, 639 Tibeto-Burman, 371, 447, 449f, 452–454, 456, 465–467, 577, 639f, 644–647, 652 Tocharian, 577f Tojolab’al, 852, 857, 865, 868 Tojolabal, 853f Tok Pisin, 747, 749, 751, 754–764 Tolai, 756 Tongan, 45, 371, 783 Torres Strait Broken, 756 Transylvanian, 232 Trique, 857f Tsafiki, 948, 950, 952, 954, 958 Tuareg, 147f, 151, 173, 175, 205 Tubu, 149, 152, 166, 170 Tukano, 993–1008, 1010 Tupi-Guaraní, 978, 980 Tupí, 371 Turkic, 5, 231, 235, 269, 275, 417f, 420, 433– 435, 437–439, 441f, 473, 478–481, 487f, 491f, 496–499, 504, 510, 514, 579 Turkish, 84, 233, 237, 239, 241, 244f, 247f, 267, 271, 276, 341f, 418, 424f, 433, 690 Tuyuca, 998 Twi, 927, 931, 933 Tzeltal, 46, 848, 852, 857f, 865 Tzeltalan, 848, 890 Ubangi, 176 Udi, 430 Ukrainian, 230, 236, 239, 241, 243 Upper Saxon, 313 Upper Sorbian, 307, 310–318, 321–323

Uralic, 4, 384, 389f, 393, 396–398, 478, 481, 503 Urdu, 82, 695 Ute, 776, 779 Vanuatu Bislama, 756 Varogio, 826 Vejoz, 1015, 1019, 1021 Vieil Arzeu, 193, 200 Vietnamese, 3, 5, 56, 60, 62, 601, 617–637, 650, 977 Virginia Algonquian, 367, 371 Vlax Romani, 275–278 Voltaic, 152 Walmajarri, 791, 795–797, 800 Wambaya, 792 Wanyjirra, 791, 809 Wardaman, 792f, 795–798, 800–802, 806, 808 Warlmanpa, 792, 800 Warlpiri, 790f, 799f, 802, 808 Warluwarric, 792, 797, 806 Warumungu, 792 Waskia, 747, 749, 752–755, 757f, 760–764 Welsh Romani, 270 Welsh, 269f, 363


West Germanic, 17, 330, 332f, 338, 345f, 360, 364, 376, 396 Western Mayan, 881–883, 890 White Hmong, 3, 5, 57, 59f, 62, 638–658 Wichí, 5, 57, 59f, 62, 1015–1034 Wolof, 221, 371, 931, 933 Xiongnu, 579, 584 Yaaku, 114, 124 Yangman, 792 Yao, 219, 638 Yaqui-Mayo, 823 Yaqui, 3, 5, 56, 60, 62, 823–847 Yoruba, 142, 146, 148, 151, 156f, 159, 173, 175, 182 Yucatec, 858, 865, 867f Yucatecan, 877, 879–885, 887 Yugh, 471, 473, 475f, 479–481, 492 Yuhup, 992, 996 Yukaghir, 503 Zapotec, 856–858, 866f, 897 Zaramo, 79f Zealands, 338 Zhuang, 599, 640 Zigua, 79f Zinacantán Tzotzil, 5, 20, 57, 60, 62, 848–87