The Languages and Linguistics of Mainland Southeast Asia: A comprehensive guide 9783110556063, 9783110558142, 9783110556124, 2021934687



English Pages [984] Year 2021


Table of contents :
Preface
Table of Contents
List of abbreviations
1 Introduction
2 The Neolithic occupation of Southeast Asia
3 Homelands and dispersal histories of Mainland Southeast Asian language families: a multidisciplinary perspective
4 The origins and spread of cereal agriculture in Mainland Southeast Asia
5 History of MSEA Austroasiatic studies
6 History of Tai-Kadai studies
7 Scholarship on Trans-Himalayan (Tibeto-Burman) languages of South East Asia
8 Historiography of Hmong-Mien linguistics
9 French contributions to the study of Mainland Southeast Asian languages and linguistics
10 The SIL contribution to language and linguistics in Mainland Southeast Asia
11 Classification of MSEA Austroasiatic languages
12 Classifying Trans-Himalayan (Sino-Tibetan) languages
13 Classification of (Tai-)Kadai/Kra-Dai languages
14 Classification and historical overview of Hmong-Mien languages
15 Language macro-families and distant phylogenetic relations in MSEA
16 Typological profile of Hmong-Mien languages
17 Typological profile of Burmic languages
18 Typological profile of Karenic languages
19 Typological profile of Kuki-Chin languages
20 Typological profile of the Kachin languages
21 Typological profile of Kra-Dai languages
22 Typological profile of Vietic
23 Northern Austroasiatic languages of MSEA
24 Eastern Austroasiatic languages
25 The national languages of MSEA: Burmese, Thai, Lao, Khmer, Vietnamese
26 South Asian influence on the languages of Southeast Asia
27 Linguistic influence of Chinese in Southeast Asia
28 The influence of contact between Austroasiatic and Austronesian
29 Register in languages of Mainland Southeast Asia: the state of the art
30 Contact and convergence in the semantics of MSEA
31 Classifiers in Southeast Asian languages
32 Grammaticalization in Mainland Southeast Asian languages
33 Expressives in languages of Mainland Southeast Asia
34 Pragmatics and syntax in the languages of MSEA
35 MSEA epigraphy
36 Writing systems of MSEA
37 Language policy and planning in Mainland Southeast Asia
38 Language and the building of nations in Southeast Asia
Subject index
Language index

The Languages and Linguistics of Mainland Southeast Asia WOL 8

The World of Linguistics

Editor Hans Henrich Hock

Volume 8

The Languages and Linguistics of Mainland Southeast Asia
A Comprehensive Guide
Edited by Paul Sidwell and Mathias Jenny

ISBN 978-3-11-055606-3
e-ISBN (PDF) 978-3-11-055814-2
e-ISBN (EPUB) 978-3-11-055612-4
Library of Congress Control Number: 2021934687

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.

© 2021 Walter de Gruyter GmbH, Berlin/Boston
Cover image: YODAPIX / iStock / Getty Images Plus
Typesetting: Dörlemann Satz, Lemförde
Printing and binding: CPI books GmbH, Leck
www.degruyter.com

Paul Sidwell and Mathias Jenny

Preface

The present volume began early in 2017 when Sidwell was approached by De Gruyter to edit a volume for their World of Linguistics (WoL) series. Some years before, a volume had been initiated for the series which would ambitiously encompass both East Asia (EA) and Mainland Southeast Asia (MSEA), and some ten individuals had undertaken to contribute as editors, sub-editors, and authors. That project eventually stalled, was effectively abandoned, and had to be rethought; the intended scope was too great, and most of the intended authors had directed their attention to other things before any progress could be achieved.

Handbooks such as these are not simple projects; they require goodwill and stamina from all involved. The editors must persuade enough suitable authors to coalesce around the plan in a reasonable timeframe, and all this is done by academics who by and large already have jobs taking the bulk of their waking hours, and who agree to assume additional responsibilities for no additional compensation. To be sure, there is some prestige associated with the publication, yet the more senior the scholar, the more diluted that value becomes. In this case it seems that the challenges were excessive.

The suggestion that subsequently emerged was to divide the scope and propose EA and MSEA volumes, and Sidwell was approached to edit the latter. He had previously edited the Brill Handbook of Austroasiatic Languages (2014) with Mathias Jenny, and after some discussions Jenny came on board to jointly edit the new WoL volume.

The new editors’ initial vision was rather different from what you see today. At first we imagined broad typological chapters, followed by sketch grammars of selected languages highlighting the features described in the typological discussions. A number of potential authors signed on to provide sketches, but after discussions with the publisher we pivoted the plan to the table of contents you see now. Rather than sketches of specific languages, the focus is on genealogical and areal language groups with no one chapter devoted to a single language – this provides for more comprehensive coverage while putting languages into context.

The chapters of the volume are arranged into six sections under the following themes:
1. Deep history of the peoples and languages of MSEA
2. History of linguistics of MSEA
3. Language classification
4. Typological profiles by areal-genetic language groups
5. Areality and contact
6. Language and society

https://doi.org/10.1515/9783110558142-201


We originally planned to have more but shorter chapters. The idea was that by restricting chapters to 10–20 pages it would be less onerous on authors and encourage timely submissions. However, as time passed the working plan was substantially adjusted. Some chapters needed to be longer to cover the material, and consequently several were merged and others lengthened. Additionally, as we began with a long list of planned chapters, we had scope to adjust the table of contents as a number of authors withdrew. The final result is a mix of long and short chapters with a breadth of coverage over six sections.

Overall we have endeavoured to ensure that the content reflects not just what we know about MSEAn languages, but also the various scholarly traditions, theoretical approaches, transcription methods, and the varied historical paths as the field has emerged. While today much MSEAn linguistics is carried out in terms broadly consistent with international practices in typological and descriptive linguistics, there remain nuances of national and local traditions, and legacy effects from the long history of colonial and post-colonial national divisions.

We also decided to give special emphasis to the historical circumstances that created the MSEA language area. The first section brings together chapters from historical linguists and archaeologists that ground the work in the story of Neolithic Southeast Asia, which saw a great transformation from hunter-gatherer societies to cereal cultivators who eventually tamed the lowlands and built some of the world’s great monumental cultures. The combination of rich ecosystems capable of supporting hunting and fishing, and high-yield agriculture, facilitated the spread of Austroasiatic languages over a vast area, and drew other peoples, speaking Tibeto-Burman, Austronesian, Kra-Dai, and Hmong-Mien languages, into the region. The uplands of MSEA also functioned as refugia for peoples seeking relief from the growth of the Chinese state to the north and the emerging states of the MSEAn lowlands. Thus, an understanding of the human history of the region extending back to prehistoric times is integral to interpreting the very complicated contemporary distribution of language groups and typological features. These are important clues for what they too can tell us about migration pathways, technological and cultural shifts, and histories of social interactions. Thus we have delivered a handbook of languages and linguistics that locates the material within the historical and geographical context that helps to define the language area.

The eventual completion of the volume took several years and saw significant turnover in terms of contributors. We initially approached more than 50 individuals, and more than 30 agreed to write chapters. However, as time progressed some authors withdrew and replacements were recruited. We would like to thank all of them for their contributions, their patience, and great collaboration over the last years.

We would like to thank the Southeast Asian Linguistics Society (SEALS) for endorsing this project at its 2017 meeting. A number of SEALS regulars subsequently contributed chapters, and two individuals, Jean Pacquement and Franklin Huffman, made financial donations towards editorial costs following an appeal made at SEALS. We would also like to thank Mark Alves for help and advice with editing. Thanks are due to the Department of Comparative Language Science at the University of Zurich for allowing Mathias Jenny to use the department’s infrastructure and part of his working hours on this project, and for hosting Paul Sidwell on several occasions, facilitating the cooperation of the two editors.

Table of Contents

Preface (Paul Sidwell and Mathias Jenny)
List of abbreviations
1 Introduction (Paul Sidwell and Mathias Jenny)
2 The Neolithic occupation of Southeast Asia (Charles F. W. Higham)
3 Homelands and dispersal histories of Mainland Southeast Asian language families: a multidisciplinary perspective (Peter Bellwood)
4 The origins and spread of cereal agriculture in Mainland Southeast Asia (Dorian Q. Fuller and Cristina Cobo Castillo)
5 History of MSEA Austroasiatic studies (Paul Sidwell)
6 History of Tai-Kadai studies (Paul Sidwell and Mathias Jenny)
7 Scholarship on Trans-Himalayan (Tibeto-Burman) languages of South East Asia (Nathan W. Hill)
8 Historiography of Hmong-Mien linguistics (Yoshihisa Taguchi)
9 French contributions to the study of Mainland Southeast Asian languages and linguistics (Jean Pacquement, Paul Sidwell and Mathias Jenny)
10 The SIL contribution to language and linguistics in Mainland Southeast Asia (Carolyn P. Miller and Kirk R. Person)
11 Classification of MSEA Austroasiatic languages (Paul Sidwell)
12 Classifying Trans-Himalayan (Sino-Tibetan) languages (Scott DeLancey)
13 Classification of (Tai-)Kadai/Kra-Dai languages (Peter Norquest)
14 Classification and historical overview of Hmong-Mien languages (Martha Ratliff)
15 Language macro-families and distant phylogenetic relations in MSEA (Paul Sidwell and Lawrence A. Reid)
16 Typological profile of Hmong-Mien languages (David Strecker)
17 Typological profile of Burmic languages (David Bradley)
18 Typological profile of Karenic languages (Atsuhiko Kato)
19 Typological profile of Kuki-Chin languages (Kenneth Van Bik)
20 Typological profile of the Kachin languages (Keita Kurabe)
21 Typological profile of Kra-Dai languages (Pittayawat Pittayaporn)
22 Typological profile of Vietic (Mark J. Alves)
23 Northern Austroasiatic languages of MSEA (Paul Sidwell)
24 Eastern Austroasiatic languages (Paul Sidwell)
25 The national languages of MSEA: Burmese, Thai, Lao, Khmer, Vietnamese (Mathias Jenny)
26 South Asian influence on the languages of Southeast Asia (Tom Hoogervorst)
27 Linguistic influence of Chinese in Southeast Asia (Mark J. Alves)
28 The influence of contact between Austroasiatic and Austronesian (Graham Thurgood)
29 Register in languages of Mainland Southeast Asia: the state of the art (Marc Brunelle and Tạ Thành Tấn)
30 Contact and convergence in the semantics of MSEA (Stefanie Siebenhütter)
31 Classifiers in Southeast Asian languages (Alice Vittrant and Marc Allassonnière-Tang)
32 Grammaticalization in Mainland Southeast Asian languages (Walter Bisang)
33 Expressives in languages of Mainland Southeast Asia (Jeffrey P. Williams)
34 Pragmatics and syntax in the languages of MSEA (Mathias Jenny)
35 MSEA epigraphy (Paul Sidwell and Mathias Jenny)
36 Writing systems of MSEA (Mathias Jenny)
37 Language policy and planning in Mainland Southeast Asia (Kimmo Kosonen and Kirk Person)
38 Language and the building of nations in Southeast Asia (Andrew Simpson)
Subject index
Language index

List of abbreviations

/.../ phonemic representation [...] phonetic representation orthographic representation/transliteration 1 first person 2 second person 3 third person i, ii, iii, ... morphological class; degree of remoteness A agent ABL ablative ABIL abilitive ABS absolutive ACC accusative ACCOMP accomplishment ACP attention calling particle ACS accessible ACT active ADD additive ADDR address particle ADH adhortative ADJ adjective ADP adpositional phrase ADV(Z) adverb(alizer) ADVRS adversative AFF affirmative AGR agreement AGT agent(ive) ALL allative ANA anaphoric AND andative ANIM animate ANTI_AGT anti-agentive ANTIP antipassive AO agent orientating AOR aorist APPL applicative APPR approving particle APRX approximative ART article ASP aspect ASRT assertive particle ASSOC associative ATTR attributive AUG augmentative AUX auxiliary BEN benefactive

https://doi.org/10.1515/9783110558142-203

CAP capability CAUS causative CLF classifier COLL collective COM comitative COMP complementizer COMPAR comparative COMPL completive COND conditional CONS consequence CONSUL consultative CONT continuous CONTR contrastive CONTRADICT contradictory COP copula COORD coordinating particle CQ content question marker CRS change of currently relevant state CSP consent seeking particle CT class term CVB converb DAT dative DCON discontinuous DECL declarative DEF definite DEM demonstrative DEP departitive DEP dependent form DES desiderative DESC descriptive DET determiner DIM diminutive DIR directional DIR.O direct object DRCT direct (speech) DISC discourse particle DISP displacement (in space/time) DIST distal demonstrative DISTR distributive DP discourse particle DS different subject DU, d dual DUR durative DYN dynamic aspect ECHO echo word EMPH emphatic EMOT emotional particle


ENUM enumerative particle EQUD equidistal demonstrative EPIST epistemic particle ERG ergative EUPH euphonic EXCL exclusive marker EXCLAM exclamation EXP experience EXPER experiential FEM, f feminine FAM familiar FILLER filler FIN finite FOC focus FREQ frequentative FUT future G goal, recipient GEN genitive GOAL goal GRNDV gerundive GRP generic referential prefix H high honorific HAPP happenstance HES hesitation HON honorific HORT hortative HUM human IDEOPH ideophone IDF identificational particle IMM immediate (future) IMP imperative INAN inanimate INCEP inceptive INCH inchoative INCL inclusive marker INC.OBJ incorporated object IND indicative IND.O indirect object INDF indefinite INDR indirect (speech) INF infinitive INS instrumental INTER interrogative INTR intransitive INTS intensifier INV invariant (verb form) IPFV imperfective IRR irrealis ITER iterative L low honorific

LINK linker LOC locative MANN manner MASC, m masculine MDF modifier MEDL medial demonstrative MENS mensural classifier MID middle voice MIR mirative MPROX mid-proximal demonstrative N neuter N- nonNC numeral classifier NEG negation NEWINF new information NF non-final NFIN non-finite NFUT non-future NM noun modifier NML(Z) nominal(izer) NOM nominative NSIT new situation NUM numeral NVIS non-visible NVOL non-volitional OBJ object OBL oblique OPP opportunity OPT optative ORD ordinal number P patient PA past anterior PART partitive PASS passive PERF perfect PFV perfective PL, p plural PM predicate marker PN proper name POL politeness particle POSS possessive PP preposition phrase PQ polar question marker PRED predicative PROB probability PROG progressive PROH prohibitive PRON pronoun PROS prospective PROX proximal demonstrative

PRS present PST past PTCL particle PTCP participle PURP purposive Q question marker QNT quantifier QUOT quotative RC relative clause RDP reduplicated syllable REAL realis RECP reciprocal RED reduplication REF referential REFL reflexive REL relativizer REM remote REP reported REP.SP reported speech REPET repetitive REQ request particle RES resultative RESTR restrictive RHET rhetorical question RLZ realizing particle ROY royal language RTOP resumptive topic S single argument SBJ subject

SBJV subjunctive SEQ sequential SFP sentence final particle SG, s singular SRP self reflecting particle SS same subject STAT static aspect SUB subordinate SUGG suggestive SUPER superlative SURP particle expressing surprise T theme TAG tag particle TAM tense, aspect, modality TCL topic-comment linker TEMP temporal TITLE title TOP topic TPC topic-comment TR transitive UNIT unitized V verb VD directional verb Vh verb head Vv versatile verb VENT venitive VIS visible VOC vocative

Paul Sidwell and Mathias Jenny

1 Introduction

https://doi.org/10.1515/9783110558142-001

1.1 How do we define MSEA?

Mainland Southeast Asia (MSEA) has no fixed delineation: it is common to identify a core area consisting of present-day Myanmar, Thailand, Laos, Vietnam, Cambodia, and (peninsular) Malaysia, and a greater MSEA region reflecting historical ethno-linguistic and political relations and origins, which includes parts of NE India and SW China.

Five language families are found in the region: Austroasiatic (AA), Tai-Kadai (KD), Tibeto-Burman (TB), Hmong-Mien (HM), and Austronesian (AN). Of these, AA has been spoken there the longest and may well have originated in northern MSEA, possibly emerging from a fusion of local and invasive populations in Neolithic times. As the chapters in section 1 show, there is physical anthropological and genetic evidence that the Stone Age peoples of MSEA were of Australo-Papuan type and intermixed with East Asian peoples who brought rice, millet and other domesticates into the region. Today there are no language isolates or other indications of cultures residual from that earlier time. We can be confident that the AA-speaking groups that today live in India (Munda, Khasian, Nicobarese) and in southern China (Bolyu, Bugan) arrived as migrants in Neolithic times or later and do not reflect purely archaic populations but outcomes of migrations and mixing. In contrast to this, TB, KD, and AN all began to spread into MSEA in the early metal age, while the bulk of HM migration southward from China into core MSEA has been quite recent, dating from the mid-19th century. Thus, the pattern of language distribution in MSEA is one of a large AA substratum, particularly well preserved in eastern Indochina, and discontinuous groups of AA-speaking communities spread out widely, divided by intrusive wedges of mainly TB and KD (especially the national languages Burmese and Thai and their various close relatives). This region corresponds broadly to the core of MSEA, to which we can add the southern Chinese provinces and NE India in which languages are spoken which have close relatives in the neighbouring countries.

Across this core MSEA region there is a clear tendency for linguistic convergence, forming a language area or Sprachbund. Prolonged and intense language contact, combined with internal drift, has conditioned restructuring towards monosyllabic or sesquisyllabic morphemes, lexical tones, analytical grammar, and other features discussed in more detail below. This linguistic area is the hallmark of MSEA, and is itself a reasonable basis for delineating it on the map, and has strong implications for defining the southern periphery of the area, where it interacts with the Austronesian-dominated peninsular and insular Southeast Asia.

Within Indochina, in southern Vietnam and along the waterways of eastern Cambodia, live Chamic speakers whose languages are recognisably related to Malay. Yet over many centuries their speech has shifted inexorably towards the MSEA areal type, abandoning disyllables, simplifying morphology, and in some cases becoming tonal, providing natural laboratory specimens of areal convergence. At the same time, something like the reverse has occurred on the Malay peninsula. There, Malay became dominant, extending northward into three southern provinces of Thailand. Malay retains much of its inherited AN typology, with robust disyllables, inflexional morphology, and simple phonology. The small Aslian (AA) groups in the interior of the peninsula have converged with Malay to some extent, avoiding many of the changes that characterise the MSEA area (as have other AA languages spoken outside the MSEA core). Given this very different areal trajectory, we have taken the approach of excluding Malaysia and the Malay-dominated regions from our functional definition of MSEA in this volume.

Map 1: Language Families of Southeast Asia

In addition to the language families that historically fall within our defined MSEA area, there are also speaker communities that reflect more recent migrants into the area. These include the many ethnic Chinese, often Cantonese, Hokkien, Hakka, or Mandarin speakers who live within the region, especially in urban centers, as well as Indo-Aryan communities, such as Bangla and Gorakha, in Myanmar, which are remnants of the British colonial administration. We have taken the view that such language varieties do not fall within the scope of this volume, but are peripheral to it, and are more appropriately discussed in volumes dealing specifically with the language groups they spring from.

On these bases we come to the definition of MSEA that constrains the scope of this volume. It is the language groups that fall within the conventionally recognised core MSEA area, minus the Malay peninsula and Malay-speaking area. At the same time, it does include languages of the border regions on the northern and north-western periphery of core MSEA that are closely related to languages within core MSEA.

1.2 The MSEA language area

The recognition of an MSEA language area has roots in the European scholarship on the languages of the Far East and Further India in the 19th century. Linguists of the colonial era, before the rise of the Neo-grammarians and later Structuralists, did not have well developed notions of how Asian languages formed phylogenetic groups, and instead placed importance on perceived typological commonalities and commonly recurring lexical forms. As late as 1862 Müller was advocating a grand Turanian family embracing far eastern languages in opposition to Aryan (Indo-European), the latter extending from Europe into India and the Himalayas and being more civilised in his conception. The Turanians were characterised as more prone to nomadism and tribalism, roaming from Siberia to the Spice Islands and Pacific atolls. Keane (1880) proposed two great Asian families, Indo-Chinese and Indo-Pacific, the former including what we would now recognise as TB and KD, while the latter included AN and AA languages of Indochina, including Vietnamese.

While the views of Müller, Keane, and others were coloured by Eurocentric concerns, such scholars were sincerely dealing with the data at hand, and while lacking adequate tools and documentation, they nonetheless recognised that common structural features could be seen in languages across Asian regions. In particular, features such as monosyllabism, tonicity, and lack of inflexional morphology were evidently non-European and exotic. Furthermore, the idea of widespread nomadism in Asia (giving undue emphasis to the Mongolian example) fed the idea of language mixing leading to widespread structural similarity in languages.

Thus in the 19th century the field was already familiar with rudimentary versions of the language area concept, with scholars discussing notions of structural borrowing and of “mixed languages”. Kopitar (1829, 1857) had identified various Balkan areal traits, and a century later in the Americas Boas (1917, 1920, 1929) was making observations on structural traits of Native American languages that did not pattern geographically with genetic groupings, strongly influencing 20th century thinking on areality.

The contemporary conception of areal linguistics emerged in the 20th century with Trubetzkoy (1923) – introducing the term Sprachbund or ‘language union’ – and Emeneau’s “India as a Linguistic Area” (1956) was particularly influential, being based on systematically listing and comparing features that appear to cross-cut genetic families. The theoretical underpinnings received more attention in the 1970s. For example, Sherzer (1973) asserted:

A linguistic area is defined here as an area in which several linguistic traits are shared by the languages of the area and furthermore, there is evidence (linguistic and non-linguistics) that contact between the speakers of the languages contributed to the spread and/or retention of these traits and thereby to a certain degree of linguistic uniformity within the area. (Sherzer 1973: 760)

Other scholars offered similar formulations, with the common theme that an area arises when a number of features are shared between various not closely related languages of a defined geographical area. However, the definition remains somewhat vague, with no clear guidance as to the number of shared traits and languages involved, or the extent of similarity that justifies the identification.

Alieva (1984) explicitly identified a “language union” in MSEA (building on earlier observations by Cowan [1948], Gorgoniev [1960, 1965], Henderson [1965], Benedict [1976], Egerod [1980]), taking as her starting point the restructuring of Cham (AN) in Indochina, listing the following structural changes (Alieva 1984: 12):
– monosyllabic morphemes prevail;
– diphthongs have developed into a final, strong syllable;
– register and tonal oppositions appear on the phonological level;
– prefixes and infixes have been lost as productive markers and, having turned into presyllables, are disappearing altogether;
– all the synthetic means of expressing grammatical meanings and relations have been replaced by analytical means of a definite type;
– word-composition prevails in word-formation.

It is striking that Alieva’s (1984) formulation of the area specifically applied to the bigger languages such as Khmer, Vietnamese, and Thai, noting of them that, “[a]t the same time hardly any of the above idioms are typical of their families in general; they have been alienated from their genetic stocks” (Alieva 1984: 19). In her conception, identifying an area crucially requires distinguishing the members of the language area from the mass of languages that happen to be spoken in the same territory.

However, in the era of documentary and typological linguistics that developed rapidly from the 1990s, the idea of a linguistic area pivoted to a more inclusive notion. With access to more and better quality data, and a drive to describe and explain both the common properties and the structural diversity of the world’s languages, scholars began to approach the question of areality in a more systematic way. The research methods shifted to a search for what emerges from a thorough investigation of a presumptive area in terms of commonalities and differences. The idea is that a language area is not an artifact of selection but reflects the identification of features that truly permeate and thus define a coherent region. We see an example of this approach taken by Comrie, explaining:

By using the maps in the World Atlas of Language Structures, it is possible to build up a more structured assessment of the extent to which Mainland Southeast Asia constitutes a linguistic area. (Comrie 2007: 18)

With this method, Comrie tested languages for 21 diagnostic features, with Thai evidencing 19 of these, and the number decreasing towards the periphery (e.g. Burmese with 11) and languages outside MSEA falling well below this (e.g. Mandarin with only 7). By this measure, Comrie characterised Thai as “the most typical of the three major national languages” (Comrie 2007: 45). Yet one can question the use of “typical” here. Arguably, Thai stands out as an extreme case, and in a fair sample one would find languages without so many of the nominated diagnostic features, and thus they would be arguably more typical of MSEA, a point made by Enfield (2019: 55).

Enfield and Comrie (2015) offer a list of 7 phonological and 11 morphosyntactic-semantic features diagnostic of the MSEA languages, listed as follows (abbreviated somewhat):
1. Large vowel inventories
2. Common underlying structure of vowel systems
3. Long versus short vowel distinctions
4. Syllable with onset-rhyme structure
5. Preference for one major syllable per word
6. Lexical contrast of pitch and/or phonation
7. Velar gap in voiced stop series
8. Lack of inflectional morphology
9. Nouns and verbs serving as functional morphemes
10. Widespread verb serialization
11. Relative flexibility of constituent order
12. Zero anaphora
13. Topic-comment structure
14. Large set of ambitransitive verbs
15. Rich inventory of sentence final particles
16. Rich inventory of ideophones
17. Numeral classifiers
18. Pronouns with multi-level social-deictic meanings
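The feature-counting method attributed to Comrie (2007) above comes down to simple arithmetic, which the following minimal Python sketch makes explicit. The counts (19, 11 and 7 out of 21) are those reported in the text. The function names, the idea of expressing each count as a share of the total, and the three-feature profile in the usage note are illustrative assumptions and are not taken from Comrie or from this volume.

```python
from __future__ import annotations

# Counts reported in the text for Comrie (2007): how many of the
# 21 diagnostic features each language exhibits.
TOTAL_FEATURES = 21
REPORTED_COUNTS = {"Thai": 19, "Burmese": 11, "Mandarin": 7}


def areal_score(count: int, total: int = TOTAL_FEATURES) -> float:
    """Return the share of diagnostic features a language exhibits.

    Illustrative assumption: the source reports raw counts only.
    """
    if not 0 <= count <= total:
        raise ValueError("count must lie between 0 and total")
    return count / total


def count_features(profile: dict[str, bool]) -> int:
    """Count the features marked as present in a feature profile."""
    return sum(1 for present in profile.values() if present)


def rank_by_score(counts: dict[str, int]) -> list[tuple[str, float]]:
    """Order languages from the most to the fewest diagnostic features."""
    scored = [(name, areal_score(n)) for name, n in counts.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_by_score(REPORTED_COUNTS):
        print(f"{name}: {score:.2f}")
```

As a usage example, `count_features({"tone": True, "classifiers": True, "inflection": False})` returns 2 for a made-up three-feature profile. A high share shows that a language is rich in the nominated features, not that it is typical of the area, which is the objection attributed to Enfield (2019) above.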

This list can be fairly taken as characterising the MSEA language area, with some caveats. Features such as (2) and (3) apply to only a subset of the languages, with many historically derivative systems drastically changing the number of vowel contrasts or losing the length contrast completely. Feature (7), which should be extended to cover palatals, is the result of the general devoicing wave that swept across most of MSEA between the 13th and 16th centuries, sparing only a few peripheral languages.1 Voiced velar (and palatal) stops were present in earlier stages of all languages of the area, as evidenced by diachronic (epigraphic) data and synchronic phonological features (tones, registers). Feature (18) only applies to highly elaborated languages, and not to hundreds of smaller lects. And other features, such as (17) may be reported for many languages, but are optional in speech and may be infrequent in spontaneous texts. Despite such considerations, from a global perspective it is clear that areality as we understand it is strongly reflected in MSEA, and remains a matter of ongoing research interest. Consideration has also turned to the mechanisms of language restructuring and convergence, so as to better understand how areality arises beyond simple notions of language mixing. A widely received assumption is that parallelism in neighbouring languages is caused by contact and is thus also indicative evidence of contact. The mechanisms of contact are multiple and not all are obvious or universally agreed upon. It is uncontroversial that languages can come to resemble each other by direct

1 What appears today as voiced /b/ and /d/ in some MSEAn languages originate in (and in many languages still are) implosives /ɓ/ and /ɗ/, respectively. Cross-linguistically, labial and dental implosives are much more common than palatal and velar, and only the former were present in the inventories of most MSEAn languages, leading to the present-day “velar gap”.

Introduction 

borrowing of morphemes along with their phonology and syntax. However, while there are some clear cases of such borrowing in MSEA, such as the Khmer influence on Siamese/Thai (Huffman 1973, Varasarin 1984, Khanittanan 2004), this cannot be the principal mechanism behind the widespread distribution of features such as the 18 listed by Enfield and Comrie. In addition to communication, language serves in part a social-identifying function, and this may condition a resistance to lexical borrowing, while speakers still adopt or approximate linguistic structures of contact languages. Also, through imperfect language acquisition they may carry over features of their native tongues into other languages. Thus there can be linguistic interference or metatypical change (or “metatypy”, see Ross 2006) with minimal morpheme transfer (complete lack of lexical borrowing is unlikely).

At the same time, structural parallelism can arise by coincident internal changes, and it is clear that this plays a role in MSEA. For example, it is not clear how far the general preference for SV/AVP word order in MSEA has resulted from contact or is merely a case of the strong tendency for languages to favour SV/AVP for cognitive reasons (Marno et al. 2015). There are also cases where it has been strongly asserted that language contact is the main driver of particular examples of language convergence while further research has cast doubt upon it. A good example is the spread of lexical contour tones, which superficially appear to have spread in a historical wave of tonogenesis out of China and across the region. Matisoff (1973) provides an indication of views held widely at the time when he wrote:

It seems likely that the development of true tones in Vietnamese was precipitated not only by influence from Chinese, but also from Siamese as well. This indicates that Tai (and Miao-Yao) acquired their tone systems from Chinese before Vietnamese did; that is, the ST > AT influence preceded the ST-cum-AT > AA influence. (Matisoff 1973: 88)

Yet it is not at all clear that the influence of Chinese and Tai on Vietnamese tonogenesis is more than peripheral. We now understand that tones arise from features of segmental phonology, such as the tendency of voicing to depress pitch and glottal codas to induce rising and falling contours (Abramson 2004, Ratliff 2015), and languages develop similar tone systems by such processes without any evidence of contact. At the same time, case studies (e.g. Sidwell 2015) demonstrate that languages in close contact can develop tones in radically differing ways such that it is difficult to see any clear role for contact. Other MSEA areal features, such as monosyllabism, large vowel inventories, velar gap, and others, may similarly arise independently without needing to invoke contact, which relates to another observation: the most important predictor of typology in MSEA is inheritance, not proximity (Brunelle and Kirby 2015). What the languages of MSEA do have in common, despite apparently separate origins (although some do still assert common origins, see Sidwell and Reid in this volume), is similarities in typological starting points (such as shared sesquisyllabism) which would already favour

 Paul Sidwell and Mathias Jenny

common tendencies in internal changes, and which contact may act to reinforce to some extent. The fact that MSEA today appears as a model case of a linguistic area with (standard) Thai as its most typical representative may suggest a different explanation. As has been widely demonstrated, Thai and Khmer, though belonging to two different families, share not only many lexical and grammatical features, but also a long cultural and religious heritage. With Thai as the most influential language in central MSEA in modern times, the MSEA convergence area could also be seen as an area of languages converging on the Thai model, which in turn has been influenced in its development by Khmer (and, at an earlier period, Chinese lects). As seen above, Burmese, being outside the Thai sphere of influence, notably shares only a small number of the features typically listed for MSEAn languages. Masica (2005: 183) speaks of a “profound hiatus between India and Southeast Asia beyond Burma”, suggesting that Burmese does not belong to the MSEAn linguistic area, but rather shares features with South Asian languages, depending on the features one decides to consider. MSEA certainly appears as a convergence area, although its status is far from static: it is rather a continuously changing sequence of developments, with a dynamic interaction of inherited and contact-induced features converging and diverging at different times and places. The following section outlines what we can tell about the development of this area, not only in terms of linguistics, but also socio-cultural scenarios.

1.3 Early linguistic landscape of MSEA

As the preceding section has made clear, there is striking linguistic parallelism across unrelated language families in MSEA, and this is explained in part by prolonged and often intimate contact. Yet among these families only AA is apparently indigenous, with much of the language diversity resulting from inward migration in historic and prehistoric times. Much of the history, at least in broad outline, is well understood for more than a thousand years. Inscriptional Mon, Khmer, Cham, and Pyu texts are known from the 1st millennium (Sidwell and Jenny, chapter 35) and there are Chinese records relating to Vietnam going back to the second century BCE. And while broad inferences can be made based on comparative linguistic reconstruction, it is crucial to have a well-grounded understanding of regional prehistory based on archaeology and population genetics. Given the importance of such considerations, this volume has several chapters on the prehistory of MSEA from the perspectives of archaeology (Higham, Bellwood, Fuller, and Castillo) in addition to contributions by linguists on language history and classification (Sidwell and Reid, Sidwell, Ratliff, Norquest, DeLancey), and out of these we can see the outlines of the early linguistic landscape of the region.

There is a clear consensus among contemporary scholars that before around 4000 BP the inhabitants of MSEA were stone age hunters and gatherers, generally referred to as Hoabinhians (based on archaeological finds in Hòa Bình, Vietnam). Physically they resembled peoples of Australia and Melanesia, and they left behind stone tools, shell mounds, and some pottery, and buried their dead without offerings. A cultural shift occurred as people of East Asian type began moving into northern Indochina, bringing with them various domesticated plants and animals, more diverse pottery and tools, and distinct housing and funerary practices. A new mixed population of Neolithic farmers emerged, which scholars mostly associate with early Austroasiatic speakers on strong circumstantial grounds. Later, in the SEA Bronze and Iron Ages, speakers of other language families began moving into the region, beginning a pattern that has been repeated multiple times since. These migrations can be associated with both pull- and push-factors; Southeast Asia is ecologically rich, supporting year-round hunting and fishing, and the lowlands, if properly irrigated, support multiple rice cropping through the year. Additionally, the expansion of the Chinese empire, and conflicts with neighbors and within conquered territories, have motivated groups to seek relief by moving south. The latter factor can be seen starkly in the historically recent movement of Hmong and Mien speakers into Indochina, fleeing 19th century Qing dynasty China, disrupted by events such as the Opium Wars and the Taiping Rebellion, and in the 20th century movement of Chinese groups fleeing Maoist persecution into Thailand and Myanmar. At the same time, we should not underestimate the effects of natural population growth in the context of the political and geographical realities. China established a strong foothold in the Red River Valley in Han times, coveting the delta as a rice bowl. Once the Vietnamese achieved full independence in the 10th century, the only natural expansion corridor was southward. Similarly, the movement of Tai speakers out of Guangxi and into what is now Thailand, Laos, Vietnam, and Myanmar in the 8th–10th centuries was along already established corridors, while all other directions ran directly up against Tang China. And a similar perspective applies to the Burman migrations from Yunnan to the Irrawaddy valley from the 7th century onward.

Two MSEA language groups that stand out as having anomalous geographical distributions are Karenic and Chamic. Karenic is a Tibeto-Burman group whose lects are mostly spoken along the hilly borderlands between Myanmar and Thailand and in parts of the Irrawaddy delta further to the west. Its speakers have the appearance of a relic population, with the national languages Burmese and Thai dominating to the west and east respectively. While the internal classification of Tibeto-Burman (DeLancey this volume) remains unresolved and controversial, it seems that the closest relatives of Karenic are the Kachinic and other languages of the Myanmar-India borderlands. Thus, Karenic appears to be a residue of a wider patchwork of Tibeto-Burman languages now lost due to the dominance of Burmese (and previously Mon and Pyu) over the fertile plains and banks of the Irrawaddy river. It is difficult to be more precise than this, however, due to the lack of relevant historical and archaeological materials.

On the other hand, Chamic, an Austronesian group closely related to Malayic, is spoken on the south-central coast of Vietnam and in diasporic communities known to have migrated away from the Vietnam central coast over the past millennium. We know from numerous historical accounts and monumental remains that in the 1st millennium the Cham built a Hinduized civilization that rivaled Angkor, although it lacked arable land, such that its growth was limited and it ultimately could not resist the expansion of the Vietnamese in their long drive south. Archaeological and linguistic indications place the initial arrival of Austronesians on the Indochinese coast in the latter half of the 1st millennium BCE (Bellwood 1997). The subsequent development of coastal trade, combined with the advantage of an established maritime culture, put the Cham in a favourable position to develop economically, and like the Khmer they initially adopted an Indic model for social organization and ideology. Immediately to the north of the Cham, their future rivals the Vietnamese developed out of an Austroasiatic-speaking population who – colonised by the Han Chinese from 111 BCE to 938 CE – acquired the cultural and economic capacity for both self-sufficiency and regional power. Two other Austroasiatic groups also developed regional civilizations during the 1st millennium, the Mon and the Khmer, adopting Indic models as did the Cham. These historical processes of Indianization and development linked to (mainly) coastal trade are evident from the inscriptional records and archaeological data, and for the first century or so of serious western interest in MSEA languages and cultures, these cultures were emblematic of the region. The terms Mon-Annam and Mon-Khmer, for example, were coined to name the language family they belong to, which we now more usually call Austroasiatic, following Schmidt (1906), who stood almost alone for half a century in his efforts to examine the languages coherently as a family.
This brings us back to Austroasiatic; uniquely among the MSEA language families, there are no clear indications that AA originated outside of MSEA, and it must be recognised as the principal regional substratum. Various proposals over decades have placed the AA homeland in India or various locations within China even as far north as the mid-Yangtse (see van Driem [2001: 262–332] for wide ranging discussion). The arguments that AA is not native to MSEA are of two kinds: 1. Supposed lexical or morphological parallels with Sanskrit, Old Chinese, etc. are invoked to show that AA was historically spoken in the area of the apparent contact language, and 2. Agricultural vocabulary is reconstructed for proto-AA, and the homeland suggested as located in a zone of known ancient agriculture/domestication. Such proposals have enjoyed strong advocacy by individual scholars, yet have problematic aspects. As Sidwell and Blench (2011) discuss, the centre of diversity of AA is located in Indochina, and branches located in India and China all show strong indications of having migrated into, not out of, those regions. The agricultural lexicon argument is problematic as none of the reconstructed word forms (names for rice, millet,

etc.) resemble their semantic equivalents in Indic or Sinitic or other non-AA languages, providing little comfort for the India or China homeland hypotheses. In the Hmong-Mien reconstructions by Wang and Mao (1995) and Ratliff (2010) many words with origins in Old Chinese, including agricultural items, provide testimony to a real history of contact between those language groups, but the same kind or quality of evidence has not emerged from strenuous efforts to compare AA with neighbouring families.2

State formation in the 1st millennium saw the rise of Mon, Khmer, Pyu, Cham, and Vietnamese as important regional languages, for trade, culture and as vectors for Hinduism and Buddhism. In the latter 1st millennium Burmans and Tai speakers began moving south, and by the medieval period began to achieve dominance in the Irrawaddy, Chaophraya, and Mekong regions. The Siamese (Thai) took over from Khmer in the 14th century, becoming the most influential polity and language in the center, with Vietnam in the east and Burma in the west both creating their own spheres of influence in terms of politics and languages, while still sharing a basis of Indic-Buddhist culture (except for northern Vietnam). The “Burma Zone” is characterized by convergence in voiceless sonorants, which were present in Old Burmese but not Old Mon, for example, and are found in the modern varieties of both. In the “Thai-Khmer Zone”, voiceless sonorants were present in Old Thai of the 14th century, but not in Angkorian Khmer, and they are absent from both modern varieties. In the Burma Zone, the verbal negators tend to be bound morphemes, as in Old and Modern Burmese, Modern Mon (but not Old Mon), Karenic and Shan, while in the Thai-Khmer Zone they are free forms. Pali is the main Indic source of traditional loanwords in the Burma Zone, while Sanskrit is more prominent in the Thai-Khmer Zone.
Thus emerged the basis of the present day majority language distribution in MSEA, conditioning much of the linguistic areality that characterises the region.

1.4 MSEA – the state of linguistics today

Linguistics in MSEA has developed into a vibrant field, with activity in many institutions supporting international conferences, journals, and an enormous amount of language documentation and analysis especially represented in graduate dissertations. At the same time, MSEA linguistic research is surprisingly dispersed, lacking any true organizational locus. There is a good deal of activity within Southeast Asia, and also in Europe, North America, and Australia. Linguistics programs inside the

2 Southwestern Tai, which interestingly almost completely lacks elaborated rice vocabulary, may have borrowed AA *rŋkoːʔ ‘rice’, cf. Thai kʰaːw, but this must belong to a relatively late period. The dictionary of Old Chinese by Schuessler (2006) claims to demonstrate some 1,500 Chinese-Austroasiatic parallels, but the equations are based on loose criteria and have found no significant support among scholars.

countries of Southeast Asia naturally have a focus on languages in those countries, and while each country has its own language policies and goals, activities are still very much specific to individual institutions. National professional bodies representing linguists are lacking, and the supra-national Southeast Asian Linguistics Society is an informal body that meets annually, having no institutional home in the region; its journal JSEALS is published by the University of Hawai’i Press. Looking at the state of linguistics as a profession on a country by country basis, we offer some broad indications:

Thailand: Linguistics in Thailand is quite strong, with many universities having programs. There has been a strong focus on Thai language development and standardization for many decades, with special responsibility falling to the Royal Institute of Thailand, plus significant academic interest in the linguistic diversity of the country. The field got a significant boost when linguists who had been working in Indo-China moved to Thailand from the mid-1970s, many of whom supervised students who went on to become linguistic scholars and departmental heads in Thailand. Notable are:
– Mahidol RILCA is strong on language description, documentation, and revitalization, with numerous MA theses and PhD dissertations on local languages; they publish the Journal of Language and Culture.
– Chulalongkorn University linguistics department has a focus on theoretical studies, and publishes the journal MANUSYA with some linguistic articles.
– Chiang Mai University has both a linguistics department and the Myanmar Study Centre.
– Silpakorn University supports epigraphic research.
– Payap University hosts its Linguistics Institute, pursuing descriptive work focused on northern Thailand, Laos, and Myanmar, and the training of native speakers.
– The Siam Society sporadically organizes public lectures and publishes material with linguistic content, often with a historical focus.
– At the community level, language revitalization efforts for local languages are often assisted by universities.

Also operationally based out of Thailand, the Centre for Research in Computational Linguistics hosts a number of projects under its sealang.net domain. These include: online dictionaries and corpora and tools for working with complex scripts; online corpora of inscriptional Khmer and Mon; the Mon-Khmer Languages Project with extensive lexical resources and search tools; extensive archives of unpublished works; and, at sealang.net/sala, thousands of papers from linguistics journals, edited volumes, conferences and other sources available for search and download. All of these resources provide invaluable research tools for MSEA linguistics.

Myanmar: Linguistics in Myanmar was neglected in the latter part of the 20th century but is developing more seriously these days. The major drive is for development of Burmese as the national language, though research in and promotion of local languages is increasing.
– General linguistics departments at the Yangon University of Foreign Languages (YUFL) and Mandalay University of Foreign Languages (MUFL) provide supportive courses for language teaching departments.
– Research in local minority languages is mostly based at Yangon University’s Myanmar-sar (Burmese) department; graduates have produced a number of MA theses and PhD dissertations, mostly written in Burmese and not officially published, though available as photocopies at local bookshops in Yangon.
– Studies in local languages are done at some regional institutions by personal interest and initiative of local staff members, though not officially supported by the centralized curriculum committee. With a major university reform initiated in 2020 giving universities more independence, this may change in the near future.
– Community activities in literacy, language documentation, and conservation are gaining ground (dictionaries, textbooks, primers, some in print, some online, often for local distribution).

Cambodia: Linguistics in Cambodia is modestly developed. Research is predominantly concerned with Khmer-related subjects, and while there is some work on minority languages, it is mostly done by students for their theses, or by foreign researchers. The latter include civil society organizations, such as SIL International, which are able to operate in the country reasonably freely. Linguistic and related conferences are readily held in Cambodia, and there is broadly an attitude of openness among local scholars and officials to foreign engagement. In terms of linguistics courses:
– The Royal University of Phnom Penh has a degree program in linguistics, with extensive course offerings at both BA and MA levels.
– The Royal University of Fine Arts offers some linguistics courses although not a degree program.

Vietnam: Linguistics has a long-established tradition in Vietnam. While linguistic research largely began in the French-colonial period through French researchers of l’École française d’Extrême-Orient, from the 1950s Vietnamese researchers took the main role, with the state actively supporting language development work and research positions. Teams of linguists conducted significant field research in the 1980s and 1990s, also with a number of Vietnamese-Russian collaborations, although field research has waned in recent decades. While there is much activity and many publications in linguistics, the work is largely inaccessible to scholars outside of Vietnam as the vast majority of publications there are in Vietnamese, and local scholars infrequently publish in international journals.
– The Institute of Linguistics in Hanoi, founded in 1968, conducts research mainly on Vietnamese and languages of the 50-plus ethnic minority groups inside and bordering Vietnam. The Institute has published the journal Ngôn Ngữ (‘Language’), which has been the primary linguistics journal in Vietnam since 1969. Some other linguistics journals include Từ điển học và Bách khoa thư (‘Lexicography and encyclopedia’) and Ngôn Ngữ và Đời Sống (‘Language and life’), the latter of which publishes works with a more popular tone.
– Numerous linguistics books, theses and dissertations have been produced over the past several decades, almost exclusively in Vietnamese.
– Many Vietnamese universities have linguistics programs with a focus on the Vietnamese language and periodically publish linguistics articles in their own journals.
– In recent years, locally hosted linguistics conferences have become a common way to share linguistic research, with many proceedings published, sometimes running over 1,000 pages.

Laos: Linguistics in Laos is the least developed regionally, there being no linguistics courses taught, no linguistics conferences held, and no linguistics journals published. State support exists for applied linguistics such as foreign language teaching, translation, and development and promotion of Lao as the national standard. The most developed of these programs are conducted by the Faculty of Letters at the National University of Laos. Many foreign linguists have conducted language survey and documentation work in the country since 1975, sometimes with formal approval and often without, as civil society activities are strongly regulated by the state. Into the early 2000s the Ministry of Information and Culture did operate an Institute for Research on Lao Culture which officially conducted research on minority languages including joint projects with foreign nationals and organisations, and these efforts resulted in the publication of a series of dictionaries and text collections, among other works.
Later, responsibility moved to the National University, which does still approve some research projects, but the attitude towards linguistic research and language documentation remains unenthusiastic, and authorities prefer to support efforts to educate ethnic groups in the national language.

Beyond MSEA: In the USA, there was a surge of interest in MSEA and linguistics after World War II, and many universities established programs and centres that supported work on MSEA languages, primarily for language instruction but also linguistic research. During the 1960s to 1970s, the US involvement in the Indo-China war furthered interest in research on the languages in the region. It was in this period that many linguistic researchers in the Summer Institute of Linguistics (nowadays known as SIL International, see Miller and Person, chapter 10), a US-headquartered faith-based organisation, began conducting fieldwork on many minority languages in the region. SIL publications in the 1960s and 1970s were some of the earliest – and certainly the most extensive at the time – on those languages, often including grammars and dictionaries. From 1975 on, SIL largely relocated to Thailand and the Philippines, only gradually returning to Indo-China from the 1990s, and also more recently becoming quite active in Myanmar. As a specific area of study in US institutions, Southeast Asian linguistics is not housed in a single program, but usually involves combined activities of SEA studies programs and linguistics programs. Examples of this distributed nature of Southeast Asian linguistic research are the Endangered Language Fund Projects at Yale University, the Language Documentation and Conservation program at the University of Hawaii, and the National Council of Less Commonly Taught Languages (NCOLCTL). All three provide support for research in general that ultimately reaches MSEA linguistic research, though none are focused on a particular geographic region. Although the MSEA linguistic research profile has declined somewhat in the new millennium, upwards of 20 major Southeast Asian studies programs and Southeast Asian languages and literatures programs are found across the US. These include the programs at Cornell University, the University of California at Berkeley, the University of Hawaii at Manoa, and the University of Minnesota, among others throughout the country. The Southeast Asian Studies Summer Institute (SEASSI) has, since 1983, brought together college students from SEA Studies programs nationwide. Southeast Asian languages are sometimes part of special subgroups of larger linguistics conferences, resulting in publications in proceedings or linguistics journals. The Southeast Asian Linguistics Society (SEALS) was founded in the US by Martha Ratliff and Eric Schiller, holding its first meeting at Arizona State University in 1991. Ten of the first 13 annual meetings were held in the US and only later would meetings be predominantly held in Asian locations (with a strongly international profile of participants).
Today, the journal JSEALS, which is associated with SEALS although it is not restricted to publishing articles presented at SEALS meetings, has been hosted by the University of Hawaii Press since 2017 (being previously published by Pacific Linguistics at the Australian National University from 2009). The UH Press itself has a long history of publication on SEA languages and linguistics, as does the University of California Press. Altogether, while US-based researchers of Southeast Asian linguistics are active, they are scattered geographically and thus must take advantage of the ease of communication and opportunities to interact at conferences. Funds for language study and linguistic research similarly come from a variety of sources.

In Australia, linguistic research enjoyed important growth from the 1960s, especially with support from the Research School of Pacific and Asian Studies at the Australian National University, and also linguistics departments established later at the University of Sydney and University of Melbourne. While the main foci of their attentions were indigenous and Pacific languages, there was always a variety of linguistic work on MSEA, plus a commitment to teaching of MSEA languages, especially given the large Indo-Chinese refugee movement to Australia after 1978. The publishing house Pacific Linguistics at ANU has published much relevant work, including grammars and dictionaries, often by scholars based outside of Australia. In the 2010s the locus

of MSEA linguistics within Australia shifted from ANU to the University of Sydney, following important personnel changes.

In Europe, a few universities have departments dedicated to SEAn languages and cultures. In Germany, there are the Humboldt University in Berlin with a strong Burmese department and the University of Hamburg with SEA languages and culture studies, focusing on Thai, Vietnamese, and Indonesian. In France, it is especially the INALCO and several labs of the CNRS that are dedicated to the study of SEAn languages (see Pacquement et al. in this volume). In Sweden, the University of Lund houses the RWAAI archive of Austroasiatic and has a focus on AA languages of SEA, to name just a few. Often, the SEA centered research and teaching programs of these and other places, though institutionalized to some extent, are closely linked to the local staff in charge and prone to fluctuation with changes in academic personnel. In most places, if areal studies include SEA, linguistics is only a peripheral element in the respective departments, with the focus rather on socio-political and cultural issues (e.g. at the universities of Bonn, Frankfurt, Leiden, Amsterdam).

Japan has a number of research institutions focusing on Southeast Asian studies (Center for Southeast Asian Studies at the University of Kyoto, Tokyo University of Foreign Studies, among others), and Japanese scholars have been active in descriptive work on MSEAn languages, including several poorly described languages of Myanmar (Karen, Kadu, Sak, Kachin, among others).

Several recurring conferences are dedicated to or include SEAn linguistics; the most important are the following:
– ICSTLL: The annual International Conference on Sino-Tibetan Languages and Linguistics has been around for over 50 years (2020 seeing the 53rd meeting). Its main focus is Sino-Tibetan languages, but the conference usually includes panels on other East and Southeast Asian language families. The ICSTLL is held at venues alternating between Asia and Europe/USA, enabling a wide range of scholars to attend and share their research.
– ICAAL: The International Conference on Austroasiatic Linguistics has been held every two years since 2007, after a long hiatus following the first two meetings in 1973 and 1978. The scope includes AA languages of India, and meetings are variously held in India, SEAsia, Europe, and the USA. Some meetings have led to proceedings volumes.
– SEALS: The Southeast Asian Linguistics Society has met annually since 1991. Beginning as a graduate seminar in the US, it grew quickly into the premier international conference for SEAsian linguistics, now meeting more often in Asian locations. Proceedings volumes were published for the early meetings, and since 2009 SEALS has supported a peer reviewed journal – JSEALS – nowadays published by the University of Hawaii Press. In recent years the scope of SEALS has expanded to include Insular SEAsia and meetings can exceed 100 participants.

Two of these conferences specialize in two of the major language families of MSEA, namely Sino-Tibetan and Austroasiatic, while the third includes languages and linguistics of the whole area. Tai-Kadai is sometimes part of workshops in other conferences as well, while Hmong-Mien is generally the least represented in conferences. In addition, there are several general conferences with a regional or country focus, usually including some linguistic panels (e.g. the International Conference on Thai Studies, the International Conference on Lao Studies, the Burma Studies Conference organized biennially by the Northern Illinois University, the more recently established International Conference on Burma/Myanmar Studies [biennial] by Chiang Mai and Mandalay Universities). The increasing participation of SEAn scholars in linguistic activities in recent decades, and the growing tendency for regionally themed conferences to be held in Asia, show a very welcome shift away from Eurocentric research in the field and growing confidence among the home-grown research communities.

In terms of what MSEAn linguistics has achieved in the last century or so of activity, we can point to a number of significant milestones:
– All of the regional language families have been successfully delineated, and their internal structures, such as sub-groups and branching relations, have been largely determined. Where problems remain in language classification, it is apparent that we are running up against the limits of what our data and methods can achieve.
– Comparative historical reconstructions have been proposed for all MSEA language families, delivering broad outlines of the historical phonologies and lexicons going back to Neolithic times. Nonetheless, much of this comparative work has been done on the basis of limited data and pre-computational methods, and we can expect significant improvement particularly as more detailed sub-group reconstructions emerge and improved computational methods are implemented.
– A more sophisticated understanding of areality has emerged since the 1960s, based on theoretical advances and a grounding in more thorough typological studies, complemented by inclusion of historical data, both linguistic and cultural/political, as well as human and plant genetics.
– The description and analysis of grammar has transitioned from a Eurocentric model to a more appropriate appreciation of the configurational encoding of grammatical relations and their interaction with lexical semantics. And more broadly, linguistics in MSEA has benefited from becoming more generally grounded in the real typology of the region.
– Naive ideas of loanwords and scripts, or superficial grammatical features, as determinants of language affiliations have been transcended in favour of a contemporary framework of language phylogenetics and areality.
– Description and analysis of phonological structures and processes has progressed tremendously, in particular in terms of understanding mono- and sesqui-syllables and suprasegmentals.

Generally, MSEAn linguistics has benefited greatly from the advances in and application of general linguistic methodology, both quantitative and qualitative, combined with an ever-expanding database of linguistic material being made available to the research community. Looking forward, it is our contention that all these trends are positive, and would be further fostered by increased involvement by MSEAn-based scholars, and in particular by more international cooperation between those scholars based within the region. In the currently emerging era of big data, data sharing and cooperation, and new analytical tools and opportunities, a transcending of narrow national priorities and habits in favour of a more truly regional linguistics holds strong promise of consolidating and growing linguistics in MSEA.

References

Abramson, Arthur S. 2004. The plausibility of phonetic explanations of tonogenesis. In Gunnar Fant, Hiroya Fujisaki, Jianfen Cao & Yi Xu (eds.), From traditional phonology to modern speech processing: Festschrift for Professor Wu Zongji’s 95th birthday, 17–29. Beijing: Foreign Language Teaching and Research Press.
Alieva, Natalia F. 1984. A language-union in Indo-China. Asian and African Studies 20. 11–22.
Bellwood, Peter. 1997. Prehistory of the Indo-Malaysian Archipelago, 2nd edn. Honolulu: University of Hawaii Press.
Benedict, Paul K. 1976. Austro-Thai and Austroasiatic. In Philip N. Jenner, Laurence C. Thompson & Stanley Starosta (eds.), Austroasiatic studies, part I: 1–36 (Oceanic Linguistics Special Publication 13). Honolulu: University of Hawaii.
Boas, Franz. 1917. Introduction. International Journal of American Linguistics.
Boas, Franz. 1920. The classification of American languages. American Anthropologist 22. 367–376.
Boas, Franz. 1929. The classification of American Indian languages. Language 5. 1–7.
Cowan, H. K. J. 1948. Aantekeningen betreffende de verhouding van het Atjehsch tot de Mon-Khmer talen. Bijdragen tot de Taal-, Land- en Volkenkunde van Nederl. Indie 104. 429–514.
Egerod, Søren. 1980. To what extent can genetic-comparative classifications be based on typological considerations? In Typology and Genetics of Language. Travaux du Cercle Linguistique de Copenhague, vol. XX.
Enfield, Nicholas. 2005. Areal linguistics and mainland Southeast Asia. Annual Review of Anthropology 34. 181–206.
Enfield, Nicholas. 2019. Mainland Southeast Asian languages: A concise typological introduction. Cambridge: Cambridge University Press.
Gorgoniev, Yu., Yu. Ya. Plam, Yu. V. Rozhdestvenskii, G. P. Serdyuchenko & V. M. Solntsev. 1960. Obshchie cherty v stroe kitaisko-tibetskikh i tipologicheski blizkikh k nim yazykov Yugo-Vostochnoi Azii. Doklady sovetskikh delegatov na XXV mezhdunarodnom kongresse vostokovedov v Moskve.
Gorgoniev, Yu. 1965. Yavlenie parallelizma v stanovlenii grammaticheskikh kategorii v yazykakh izoliruyushchego tipa. In Lingvisticheskaya tipologiya i vostochnye yazyki, 132–143. Moscow: Nauka.
Henderson, Eugénie. 1965. The topography of certain phonetic and morphological characteristics of South East Asian languages. Lingua 15. 400–434.
Huffman, Franklin E. 1973. Thai and Cambodian: A case of syntactic borrowing. Journal of the American Oriental Society 93(4). 488–509.
Keane, Augustus Henry. 1880. On the relations of the Indo-Chinese and Indo-Oceanic races and languages. Journal of the Royal Anthropological Institute of Great Britain and Ireland 9. 254–289.
Khanittanan, Wilaiwan. 2004. Khmero-Thai: The great change in the history of the Thai language of the Chao Phraya Basin. In Somsonge Burusphat (ed.), Papers from the Eleventh Annual Meeting of the Southeast Asian Linguistics Society, 375–391. Tempe, AZ: Arizona State University, Program for Southeast Asian Studies.
Kopitar, Jernej. 1857 [1829]. Albanische, walachische und bulgarische Sprache. Jahrbüchern der Literatur 46. 59–106. Vienna. [Reprinted 1857: Kleinere Schriften sprachwissenschaftlichen, geschichtlichen, ethnographischen, und rechtshistorischen Inhalts, ed. by Fr. Miklosich, Theil 1. Vienna: F. Beck].
Marno, Hanna, Alan Langus, Mahmoud Omidbeigi, Sina Asaadi, Shima Seyed-Allaei & Marina Nespor. 2015. A new perspective on word order preferences: The availability of a lexicon triggers the use of SVO word order. Frontiers in Psychology 6. 1–8. DOI: 10.3389/fpsyg.2015.01183.
Masica, Colin P. 2005. Defining a linguistic area: South Asia. New Delhi: Pauls Press. [1st edition 1976 by University of Chicago Press].
Müller, Friedrich Wilhelm Karl. 1862. Lectures on the science of language, 3rd edn. London: Longman, Green, Longman and Roberts.
Ratliff, Martha. 2010. Hmong-Mien language history. Canberra: Pacific Linguistics.
Ratliff, Martha. 2015. Tonoexodus, tonogenesis, and tone change. In Patrick Honeybone & Joseph Salmons (eds.), The Oxford handbook of historical phonology, 245–261. Oxford: Oxford University Press.
Ross, Malcolm. 2006. Metatypy. In K. Brown (ed.), Encyclopedia of language and linguistics, 2nd edn. Oxford: Elsevier.
Sherzer, Joel. 1973. Areal linguistics in North America. In Thomas Sebeok (ed.), Current Trends in Linguistics 10. 749–795. The Hague: Mouton.
Schmidt, Wilhelm. 1906. Die Mon-Khmer-Völker, ein Bindeglied zwischen Völkern Zentralasiens und Austronesiens. Archiv für Anthropologie, Braunschweig 5. 59–109.
Schuessler, Axel. 2006. ABC etymological dictionary of Old Chinese. Honolulu: University of Hawaii Press.
Sidwell, Paul & Roger Blench. 2011. The Austroasiatic Urheimat: The Southeastern Riverine hypothesis. In Nicholas Enfield (ed.), Dynamics of human diversity, 315–344. Canberra: Pacific Linguistics.
Sidwell, Paul. 2015. Local drift and areal convergence in the restructuring of Mainland Southeast Asian languages. In Nicholas Enfield & Bernard Comrie (eds.), Languages of Mainland Southeast Asia: The state of the art, 51–81. Berlin: Mouton de Gruyter.
Trubetzkoy, Nikolai Sergeevich. 1923. Vavilonskaja bashnja i smeshenie jazykov. Evrazijskij vremennik 3. 107–124.
van Driem, George. 2001. Languages of the Himalayas: An ethnolinguistic handbook of the Greater Himalayan region: Containing an introduction to the symbiotic theory of language. Leiden: Brill.
Varasarin, Uraisi. 1984. Les éléments khmers dans la formation de la langue siamoise. Paris: SELAF.
Wang Fushi 王辅世, Mao Zongwu 毛宗武. 1995. Miao-Yao yu guyin gouni 苗瑤语古音构拟. Beijing: China Social Sciences Academy Press 中国社会科学出版社.

Charles F. W. Higham

2 The Neolithic occupation of Southeast Asia

2.1 Introduction

Anatomically modern humans reached Southeast Asia from their African homeland at least 50,000 years ago. Archaeologically, they are documented in many inland rock shelters, and less frequently in open sites often located near rivers. During the last glacial, the sea level fell to over 100 m below its present level, exposing a low-lying landscape the size of modern India known as Sundaland that we presume was occupied by hunter-gatherers. With the Holocene rise in global temperature, the sea rose above the present level, and with the formation of shorelines behind the present coast, we can identify mature marine communities, as at Cồn Cổ Ngựa in northern Vietnam. Such communities hunted big game animals, exploited coastal resources and interred their dead in a flexed position with few if any mortuary offerings (Figure 1; Oxenham et al. 2018). There are thus at least two distinct adaptations: one inland, where rock shelters present brief periods of occupation over many millennia, the other comprising complex sedentary communities in favoured coastal or riverine locations whose inhabitants made pottery vessels, ground stone axes, and interred their dead in a flexed position with few mortuary offerings.

A widespread dislocation in this long cultural sequence took place about 4,000 years ago. Newly founded, permanent villages incorporated residences, and the dead were interred in cemeteries in a supine, extended position, accompanied by a range of grave goods that included complete ceramic vessels, jewelry and animal bones. Adzes were fashioned from stone that had to be obtained from some distance. Flotation of cultural deposits has yielded domestic rice and millet. Some of the faunal remains come from domestic pigs, dogs and cattle. This change, widely referred to as the onset of the Neolithic period in Southeast Asia, has raised a debate over origins. Was it the result of the arrival of migrant, fully-fledged farmers from elsewhere, or could it reflect an indigenous transition that involved the local domestication of plants and animals, and the associated radical changes in technology?

There are two ways of resolving this question. The first is to employ the traditional archaeological approach of seeking either evidence for indigenous innovation or external sources for the documented changes that are earlier, and therefore potentially ancestral. The second is to employ biological information that might evidence the arrival in Southeast Asia of immigrant farmers.

https://doi.org/10.1515/9783110558142-002

Fig. 1: Map showing sites mentioned in the text. 1. Cồn Cổ Ngựa, 2. Baiyangcun, 3. Mán Bạc, 4. An Sơn, 5. Khok Phanom Di, 6. Non Ratchabat, 7. Ban Kao, 8. Phum Snay, 9. Weidun, 10. Ban Non Wat, 11. Noen U-Loke, 12. Liangzhu, 13. Ban Chiang, 14. Zhuangbianshan, 15. Non Pa Wai.



2.2 Archaeological evidence

The key issue is to identify the origins of rice and millet domestication. No such transition has been documented in Southeast Asia, nor is there any reason to anticipate one. With its consistently warm climate and rich bioproductivity both inland and marine, there would have been little reason for the enduring tradition of hunting and gathering to innovate. Indeed, hunter-gatherers survive to this day in peninsular Malaysia, the Philippines and the Andaman Islands. However, the long process of rice domestication has now been tracked in the Yangtze Valley (Fuller et al. 2010; Fuller and Castillo this volume), and further north in the Central Plains, there is growing evidence for the domestication of millet. Between 3300 and 2300 BC, a state society flourished, centred at Liangzhu in the lower reaches of the Yangtze Valley (Renfrew and Liu 2018).

Archaeologically, it is now possible to trace a southward thrust of rice farmers. Recent research in the Fuzhou Basin has made a crucial contribution to tracing a possible maritime route of expansion by rice farmers originating in the lower reaches of the Yangtze (Ma et al. 2016). The recovery of rice phytoliths from Zhuangbianshan has placed initial rice cultivation in the Tanshishan Phase of the local sequence, dated to 3000–2300 BC. This was not the only possible route south, for at Baiyangcun in Yunnan, a Neolithic community was cultivating rice and millet by 2650 BC (Dal Martello et al. 2018). Critically, the DNA of rice from two later Thai sites, Ban Non Wat and Noen U-Loke, is of the japonica variety native to the Yangtze region (Castillo et al. 2016).

There are several key sites in mainland Southeast Asia that provide evidence not only for the origins of the Neolithic, but also for the interactions that took place with the indigenous hunter-gatherers. Ceramics is a medium that can take a virtually limitless variety of pot forms and decorative motifs. To find similarities is therefore a means of tracking down relationships. There is a remarkable common grammar for the incised and impressed designs found on Neolithic pottery vessels that links sites from coastal and inland Southeast Asia with southern China (Rispoli 2007).

The cemetery uncovered at Mán Bạc in Northern Vietnam is one such site that has provided both mortuary ceramics and hard stone ornaments with northern parallels (Oxenham et al. 2010). Dated to about 2000 BC, the human remains include both incoming farmers and indigenous hunter-gatherers, seen in the shape of the skulls and their DNA. Whereas some crania and DNA are matched at the Yangtze Neolithic site of Weidun, others are similar to the local hunter-gatherers. Yet all were interred in the same way, extended and supine. This is the clearest evidence to date for the mixing of two populations at the onset of the Neolithic, a finding in distinct contrast to the separation of incoming farmers and local hunter-gatherers that occurred in Central Europe.

An Sơn is located in the Đồng Nai River catchment of southern Vietnam. Dated from the late 3rd millennium BC, it too includes burials accompanied by pots with incised and impressed designs that were tempered with rice. These people also raised domestic pigs and dogs. It was one of many such Neolithic sites in this region (Bellwood et al. 2013).

Khok Phanom Di was located next to the estuary of the Bang Pakong River on the eastern shore of the Gulf of Siam (Higham and Thosarat 2004). It was a settlement of maritime hunter-gatherers, occupied between ca. 2000 and 1500 BC, at the time the first farmers arrived about four thousand years ago. The estuarine location favoured coastal and inland exchange, but the mangroves and salt flats did not encourage rice cultivation. The cultural sequence is divided into seven mortuary phases over about 20 human generations, the dead being interred in clusters over the ancestors as the occupation mound accumulated. One reason for the rapid deposition of cultural remains was the thick lenses of the discarded shellfish that comprised a major component of the diet. The rice that was recovered through flotation from the earlier layers might have been obtained through exchange, but halfway through the occupation the sea level fell, and freshwater conditions favoured local cultivation, which was effected using granite hoes and shell harvesting knives. When the sea rose again, the occupants reverted to a principally marine diet.

Khok Phanom Di was also a major production site for pottery vessels. The women were interred with the tools of their trade, including the ceramic anvils for shaping pots, and the stones used to burnish them before firing. Some of the vessels were superbly formed, decorated and burnished, the surfaces glowing to this day. The motifs incised and impressed on their vessels are virtually identical with those from inland sites of the same period. Others were mass-produced storage pots that were probably taken by boat on trading voyages along the coast or up river. Stone for adzes was brought in, as was the fine marine shell for fashioning ornaments. Some women potters were interred with wealth unmatched elsewhere in Neolithic Southeast Asia. One wore over 120,000 exotic shell beads, horned shell discs on her chest, a shell bangle and ear discs (Figure 2). Her anvil and burnishing stones lay beside her right ankle. An infant in an adjacent grave, who died when about 18 months of age, also wore many thousands of beads. A miniature clay anvil had also been placed beside the right ankle. Men, too, were wealthy, and often interred with decorated marine turtle carapaces.

Migrant farming communities also colonized the river valleys that flow into the plains of Central Thailand. The pioneer excavations at Ban Kao in the Khwae Noi River valley by Sørensen uncovered a Neolithic cemetery in which the dead were interred with a remarkable assemblage of pottery vessels and stone adzes (Sørensen and Hatting 1967). He suggested that migrants who came via the Salween River from southern China occupied this site. A site survey in Ratchaburi Province has more recently identified similar Neolithic settlements, and excavations at one of these, Non Ratchabat, have dated the initial settlement to about 2000 BC (Doungsakul pers. comm.). Some of the pots, embellished with horns or female breasts, are in stark contrast to those from Khok Phanom Di, and must surely represent a different origin and route from the north. Further east, in Lopburi Province, the ceramics and shell beads are more similar to those from Khok Phanom Di. However here, at sites like Non Pa Wai, millet was preferred to rice, and some individual millet grains have been dated from the late 3rd millennium BC (Weber et al. 2010; Higham et al. 2020).



Fig. 2: Burial 15, a Neolithic female potter at Khok Phanom Di, was interred wearing garments encrusted with over 120,000 exotic marine shell beads. Adjacent lay a headless man with a couple of pots.

The settlement of the inland Khorat Plateau by the first farmers appears on present evidence to be rather later than the coastal or near coastal sites. They reached Ban Non Wat in the 17th century BC, while at Non Nok Tha and Ban Chiang, the earliest Neolithic burials date from the 15th century BC (Higham et al. 2015). The early Neolithic cemetery at Ban Non Wat is highly intriguing (Higham and Kijngam 2010). In addition to the typical extended, supine burials with fine Neolithic ceramic vessels, there are flexed interments with a quite distinct set of mortuary offerings and, to judge from the isotopes in their teeth, a different diet (Figure 3; King et al. 2015). There is a distinct possibility that, as at Mán Bạc, some graves contain indigenous hunter-gatherers. What is certain is that the biological remains from this Neolithic settlement include rice and the bones of domestic cattle, pigs and dogs.

The traditional archaeological evidence, including ceramic vessels and stone adzes, is thus unanimous in identifying the settlement of mainland Southeast Asia by at least 2000 BC by migrants from the north (Figure 4). It is conceivable that the hunter-gatherers they encountered modified their habitat by fire to encourage some plants or herbivorous animals, or tended yams, taro or other indigenous plants. However, there is no evidence for the local domestication of rice, which was to prove the economic basis of later complex societies. This phenomenon is widely known as the “Two Layer Hypothesis”, and it can be tested and refined through examining physical remains.

2.3 Human biology

The study of human remains has involved the comparative study of bones, particularly crania and teeth, and the recovery and interpretation of ancient DNA. Matsumura et al. (2018) have taken 16 cranial measurements from a wide range of prehistoric and modern crania, and subjected them to multivariate statistical analyses. Their results identify two distinct groups. One of these is labeled the Northeast Asian cluster, while the other is called the Australo-Papuan and Hoabinhian Southeast Asian cluster. The former represents incoming farmers, the latter the indigenous hunter-gatherers. The skulls from Khok Phanom Di, An Sơn, Ban Chiang and Mán Bạc subset (1) all cluster close to those from Hemudu and Weidun, both rice farming sites of the lower Yangtze. Mán Bạc subset (2) and Cồn Cổ Ngựa belong to the Australo-Papuan group.

Variation in the form of human teeth is genetically determined, and can therefore provide valuable information on population history. In a study of 21 dental traits in over 7,000 prehistoric and modern individuals, Matsumura and Oxenham (2014) have identified two major groups, one Northeast Asian and the other characteristic of the indigenous Southeast Asian hunter-gatherers. The relative position of certain prehistoric samples within this dichotomy is particularly revealing. Thus, some of those at Mán Bạc, and the people of Khok Phanom Di, are virtually twinned with those from Weidun and Songze in the lower Yangtze Valley. It is beyond reasonable doubt that there was a rapid coastal colonization by rice farmers. However, there are inland sites in Thailand, such as Noen U-Loke, Ban Chiang and Ban Na Di, that indicate a mixture of indigenous dental traits and those of incoming farmers. This is particularly marked at the Cambodian Iron Age site of Phum Snay, which falls squarely within the indigenous orbit. The Two Layer Hypothesis stands intact, but the arrival of farmers resulted in population mixture rather than replacement.



Fig. 3: The flexed burials from Ban Non Wat are quite different from the early Neolithic graves and might represent a community of indigenous hunter-gatherers.

Fig. 4: The arrival of rice farmers at Ban Non Wat is seen, archaeologically, in distinctive incised, impressed and painted pottery vessels and polished adzes.

The study of ancient DNA has had a slow start in Southeast Asia, due to the impact of heat and moisture on its rapid degradation. However, the finding that DNA has a much higher chance of survival in the very hard petrous bone, linked with new-generation methods for extracting endogenous DNA, has provided sufficient information for two major pioneer studies. In the first of these, McColl et al. (2018) have compared the aDNA extracted from the petrous bones of Hoabinhian hunter-gatherers with modern peoples and found that they relate most closely to the Onge, a group of hunter-gatherers that survive in the Andaman Islands. On the other hand, Neolithic farmer aDNA is similar to modern rice farmers who speak Austroasiatic languages. In a second study, Lipson et al. (2018) have extracted aDNA from the Neolithic inhabitants of Mán Bạc and Ban Chiang, and concluded that they were early farmers who originated in what is now Southern China and who may also have spoken proto-Austroasiatic languages ancestral to those spoken today from the Munda group in India to modern Mon, Khmer and Vietnamese.

2.4 Summary

The transition in Southeast Asia from hunting and gathering to the cultivation of rice and millet, and raising domestic animals, known widely as the “Neolithic Revolution”, could have involved local innovation or the arrival of migrant farmers from elsewhere. The archaeological evidence strongly supports the latter model. There was a sharp and widespread dislocation between the late hunter-gatherer settlements and those of the first farmers. This is seen in the foundation of sedentary villages, permanent houses, virtually all aspects of material culture, subsistence incorporating agriculture and stock raising, and the methods of interring the dead. This “two layer model” is now further supported by human biology: the form of cranial and dental variation and aDNA. However, settlement by farmers who brought with them domestic rice, millet, cattle, dogs and pigs did not displace the indigenous inhabitants, whose ancestors had been adapted to Southeast Asia for at least 50,000 years.

The archaeological evidence suggests several migratory routes. A proposed coastal route brought farmers to Mán Bạc, An Sơn and Khok Phanom Di by at least 2000 BC. They met and interacted with hunter-gatherers who themselves made pottery vessels, polished stone adzes and maintained a semi-sedentary existence that led to the establishment of extensive cemeteries. An inland route brought millet and rice farmers to Central Thailand. The more remote Khorat Plateau of Northeast Thailand was settled several centuries later, and again there are hints at Ban Non Wat of admixture with the local hunter-gatherers. Indeed, the crania and dentition of the inhabitants of inland Iron Age Phum Snay suggest that some indigenous groups adopted agriculture with little or no genetic input from beyond Southeast Asia.

We do not know how important rice and/or millet were in the diet of the first farmers. Southeast Asia abounds with both animals and plants that can be hunted or collected. The detailed recovery of biological remains from early Neolithic sites has shown that fishing, collecting and hunting were significant contributors to the diet. The early farmers soon developed trading networks that involved high quality stone, marine shell, ceramics and doubtless less durable products. No cemetery has yet provided evidence to support a social hierarchy, although some individuals, such as burial 15 at Khok Phanom Di, were interred with extraordinary wealth that surely reflects high status or achievement within that community of traders. The Neolithic of Southeast Asia lasted for about a millennium, until a second stimulus from northerly sources brought knowledge of bronze casting during the 11th century BC. It represents the foundation upon which the early states of Southeast Asia were constructed.

References

Bellwood, Peter, Marc Oxenham, Bui Chi Hoang, Nguyen Kim Dzung, Anna Willis, Carmen Sarjeant, et al. 2011. An Sơn and the Neolithic of Southern Vietnam. Asian Perspectives 50. 144–175.
Cobo Castillo, Cristina, Katsunori Tanaka, Yo-Ichiro Sato, Ryuji Ishikawa, Bérénice Bellina, Charles Higham, Nigel Chang, Rabi Mohanty, Mukund Kajale & Dorian Q. Fuller. 2016. Archaeogenetic study of prehistoric rice remains from Thailand and India: evidence of early japonica in South and Southeast Asia. Archaeological and Anthropological Sciences 8. 523–543.
Dal Martello, Rita, Rui Min, Chris Stevens, Charles Higham, Thomas Higham, Ling Qin & Dorian Q. Fuller. 2018. Early agriculture at the crossroads of China and Southeast Asia: Archaeobotanical evidence and radiocarbon dates from Baiyangcun, Yunnan. Journal of Archaeological Science: Reports 20. 711–721.
Fuller, Dorian Q., Yo-Ichiro Sato, Cristina Cobo Castillo, Ling Qin, Alison R. Weisskopf, Eleanor J. Kingwell-Banham, Jixiang Song, Sung-Mo Ahn & Jacob van Etten. 2010. Consilience of genetics and archaeobotany in the entangled history of rice. Archaeological and Anthropological Sciences 2. 115–131.
Higham, Charles, Katerina Douka & Thomas Higham. 2015. A new chronology for the Bronze Age of Northeastern Thailand and its implications for Southeast Asian prehistory. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0137542 (last accessed 7 December 2020).
Higham, Charles & Amphan Kijngam (eds.). 2010. The origins of the civilization of Angkor. Volume IV. The excavation of Ban Non Wat: The Neolithic occupation. Bangkok: The Fine Arts Department of Thailand.
Higham, Charles & Rachanie Thosarat. 2004. The excavation of Khok Phanom Di: Volume VII. Summary and conclusions. London: The Society of Antiquaries of London.
Higham, Thomas, Andrew Weiss, Charles Higham, Christopher Bronk Ramsey, Jade D’Alpoim Guedes, Sydney Hanson, et al. 2020. A prehistoric copper-production centre in Central Thailand: Its dating and wider implications. Antiquity 94. 948–965.
King, Charlotte Louise, Nancy Tayles, Charles Higham, Una Strand-Viđarsdóttir, R. Alexander Bentley, Colin Macpherson & Geoff M. Nowell. 2015. Using isotopic evidence to assess the impact of migration and the two-layer hypothesis in prehistoric Northeast Thailand. American Journal of Physical Anthropology 158. 141–150.
Lipson, Mark, Olivia Cheronet, Swapan Mallick, Nadin Rohland, Marc Oxenham, Michael Pietrusewsky, et al. 2018. Ancient genomes document multiple waves of migration in Southeast Asian prehistory. Science 361. 92–95.
Ma, Ting, Zhuo Zheng, Barry V. Rolett, Gongwu Lin, Guifang Zhang & Yuanfu Yue. 2016. New evidence for Neolithic rice cultivation and Holocene environmental change in the Fuzhou Basin, Southeast China. Vegetation History and Archaeobotany 25. 375–438.
Matsumura, Hirofumi, Ken-ichi Shinoda, Truman Simanjuntak, Adhi Agus Oktaviana, Sofwan Noerwidi, Harry Octavianus Sofian, et al. 2018. Cranio-morphometric and aDNA corroboration of the Austronesian dispersal model in ancient Island Southeast Asia: Support from Gua Harimau, Indonesia. PLoS ONE 13(6). DOI: https://doi.org/10.1371/journal.pone.0198689.
Matsumura, Hirofumi & Marc Oxenham. 2014. Demographic transitions and migration in prehistoric East/Southeast Asia through the lens of nonmetric dental traits. American Journal of Physical Anthropology 155. 45–65.
McColl, Hugh, Fernando Racimo, Lasse Vinner, Fabrice Demeter, Takashi Gakuhari, Eske Willerslev, et al. 2018. The prehistoric peopling of Southeast Asia. Science 361. 88–92.
Oxenham, Marc, Hirofumi Matsumura & Kim Dung Nguyen (eds.). 2010. Man Bac. The excavation of a Neolithic site in Northern Vietnam. The Biology (Terra Australis 33). Canberra: Australian National University.
Oxenham, Marc, Hiep Hoang Trinh, Anna Willis, Rebecca Jones, Kathryn Domett, Cristina Cobo Castillo, et al. 2018. Between foraging and farming: Strategic responses to the Holocene thermal maximum in Southeast Asia. Antiquity 92. 940–957.
Renfrew, Colin & Bin Liu. 2018. The emergence of complex society in China: The case of Liangzhu. Antiquity 92. 975–990.
Rispoli, Fiorella. 2007. The incised & impressed pottery of mainland Southeast Asia: Following the paths of Neolithization. East & West 57. 235–304.
Sørensen, Per & Tove Hatting. 1967. Archaeological investigations in Thailand. Volume II, Ban Kao, Part 1: The archaeological materials from the burials. Copenhagen: Munksgaard.
Weber, Steve, Heather Lehman, Timothy Barela, Sean Hawks & David Harriman. 2010. Rice or millets: Early farming strategies in prehistoric central Thailand. Archaeological and Anthropological Sciences 2. 79–88.

Peter Bellwood

3 Homelands and dispersal histories of Mainland Southeast Asian language families: a multidisciplinary perspective

3.1 Introduction

Every widespread language family reflects a history of dispersal from a homeland region, involving both ancestral languages and their speakers. To some degree, processes of language shift can help to explain certain subgroup expansions, but language shift is clearly insufficient as a sole explanation for the totality of any major language family distribution, at least from the perspective of comparative language history (Ostler 2005; Bellwood 2013; and see Ross [2008: 164] for the specific case of Austronesian). Within the histories of all major language families there have been significant movements of human populations.

The four major language families that occupy mainland Southeast Asia reflect this important observation very clearly in their internal phylogenies, subgroup distributions, and linkages with archaeological and genetic expansions. Reconstruction of these past expansions is therefore not simply a linguistic exercise. Other sources of information, from archaeology, the genomics of both dead and living populations, and craniofacial analysis, all assist in pointing to the most likely hypothesis for the existence of any given language family. We need to look carefully at language family distributions in mainland Southeast Asia both today and in the historical record, and to search for parallel distributions at suitable time depths in the data available from other sciences of human prehistory.

3.2 Mainland Southeast Asian language families The four major language families that occupy Mainland Southeast Asia today are Sino-Tibetan (Tibeto-Burman subgroup), Kra-Dai (or Daic, or Tai-Kadai), Austroasiatic, and Austronesian (Malayo-Chamic subgroup) (Sidwell 2013; and this volume). This chapter does not discuss the Hmong-Mien languages, owing to the historical recency of their spread from southern China. Of these language families/subgroups, Tibeto-Burman occupies a continuous distribution from the Himalayan region into southwestern China and down the western side of mainland Southeast Asia. Austronesian occupies a peripheral distribution in central Vietnam (Chamic languages) and far to the south in the Thai-Malay ­Peninsula (Moken and Malayan languages). Kra-Dai occupies a fairly continuous region of the https://doi.org/10.1515/9783110558142-003

34 

 Peter Bellwood

northern and central mainland, extending far into southern China. Austroasiatic reveals all signs from its distribution of being the most ancient linguistic layer that survives in the region today. Its original extent has been clearly overlain by the more recent expansions of Kra-Dai and Austronesian languages, as well as by the historical-era expansions of the Austroasiatic state-level languages Khmer and Vietnamese. In order to reconstruct the history of any language family, we must first ask two questions. How long ago did the proto-language exist, and what cultural attributes can be recovered from the list of word meanings contained in it? It is necessary also to remember that a reconstructed proto-language can only reflect the results of comparison between existing or attested languages and subgroups within the family. Should any subgroups be extinct without written record or historical reference, an early phase of the history of the whole family might well be lost forever. We can only work from what we know. Two very important observations stand out for the four major language groups of Mainland Southeast Asia. The first is that their proto-languages contain meanings related to food production, of both crops (especially rice) and domesticated animals, especially pigs and dogs (Sagart 2003; Blust 2017). The second concerns chronology. Modern lexicon-based phylogenetic analyses that use historical data on the evolution of specific vocabulary items reveal proto-language ages between 6,000 and 4,000 years ago for Austronesian (Gray et al. 2009) and Sino-Tibetan (Sagart et al. 2018; Zhang et al. 2019). Linguistic estimates based on other sources of data, including genetic molecular clocks, for Kra-Dai and Austroasiatic are similar (Ostapirat 2005; Diffloth 2005; Singh et al. 2019). 
These observations for these four major language families, of an approximate mid-Holocene date of expansion and a proto-vocabulary with terms related to food production, are extremely important. The Mainland of Southeast Asia and adjacent southern China between 6,000 and 4,000 years ago was an active place in terms of population movement. The expansion of food production was giving rise during that period to one of the most rapid episodes of population growth in the Neolithic prehistory of Eurasia. With those Neolithic populations spread the foundations of many of the world’s major language families (Bellwood 1997, 2005).

3.3 Archaeology, craniofacial biology, and the origins of agriculture in eastern Asia

Understanding of the Neolithic in Mainland Southeast Asia, and the succeeding Bronze and Iron Ages, has progressed rapidly in recent years, especially in terms of chronology and archaeological detail. Such information is now especially detailed for China, Thailand and Vietnam (Higham 2018a, 2018b; Bellwood 2015, 2017a; Bellwood et al. 2011; Kim and Higham in press). However, the time perspective required to understand how modern language family distributions in Mainland Southeast Asia have come about is much greater than that of the Neolithic alone.

Before the development of food production, the whole of Southeast Asia, including southern China to at least as far north as the Yangzi (this being the most relevant geographical region of China for the purposes of this chapter), was peopled by hunter-gatherer populations who left behind shell middens, human burials, animal bones, stone tools (sometimes with ground cutting edges), and occasionally pottery. Such sites, mostly in caves or within shell mounds, are widely reported in southern China and the northern half of Vietnam, and also occur in the Malay Peninsula and Sumatra (an excellent example, Con Co Ngua in northern Vietnam, is described by Oxenham et al. 2018). The earlier phases of these assemblages are generally referred to by archaeologists as “Hoabinhian”, after the caves of Hoa Binh Province in Vietnam. Later phases are termed “Para-Neolithic” by this author (Bellwood 2017a) because of the possession of pottery and ground stone tools, especially after 8,000 years ago. Hoabinhian and Para-Neolithic populations buried their dead in flexed or seated postures in pits, usually without any grave offerings. Their craniofacial features group them with living and recent Australo-Papuan indigenous peoples of Australia and New Guinea (Matsumura et al. 2017, 2018), although genetic evidence (discussed below) reveals that those in southern China had East Asian genetic connections as well.

By 8,000 years ago, potent changes were occurring in the archaeological record of the middle and lower regions of the Yangzi Valley, and in the Huai Valley to its immediate north (Stevens and Fuller 2017; Bellwood 2017a: 218–231; Ma et al. 2018; Yang et al. 2018).
People living in lowland riverine environments, mostly inland to begin with, were beginning to plant a wild precursor of the domesticated short-grained rice subspecies Oryza sativa japonica. To the north, along the Yellow River, other groups began to plant the wild precursors of domesticated common and foxtail millet. The combined result of these agricultural beginnings, by 7,000 years ago, was that central China supported perhaps the densest human population on earth at the time (see population density reconstructions in Hosner et al. 2016; also Bellwood 2017b), living in large timber villages, growing rice and millets, keeping domesticated pigs and dogs, and familiar with weaving and the skilled carpentry required to construct houses and boats.

By this time also, the post-glacial rise of global sea level was virtually complete, giving rise to a drowned East Asian coastline flanked by sloping hinterlands (Bellwood et al. 2008; Carson and Hung 2018). The coastal lowlands that exist now have been formed by subsequent sediment deposition, as the progression of agriculture has released more and more soil from hinterlands, right up to the present. Six thousand years ago, as early farmers commenced their coastal migrations, available low-lying riverine and coastal land suitable for agriculture across the whole of Southeast Asia was less extensive than today. The rapidly growing human populations of the time, who needed increasing quantities of fertile land for food production, sometimes had to travel far to find those lands. Many switched to shifting cultivation away from low-lying soils, in the process increasing the demand for land and increasing the rate of population movement. The finer details of these early farmer migrations will forever elude us, but we can read the overall picture very clearly in the spreads of Neolithic human populations and archaeological assemblages (with radiocarbon dates) out of central and southern China.

The Southeast Asian Neolithic human population identified from both craniofacial and genomic evidence was different from the hunter-gatherer population that dominated beforehand, but with its expansion there occurred episodes of admixture, documented especially in Neolithic northern Vietnam (Oxenham et al. 2011; Matsumura et al. 2017, 2019). The whole of Mainland Southeast Asia, and most of Island Southeast Asia (except close to New Guinea), underwent unprecedented cultural and human biological change between 6,000 and 3,500 years ago. By 3,000 years ago, the Austronesian-speaking populations of Island Southeast Asia were undertaking the long-distance maritime settlements of Micronesia and western Polynesia (to as far east as Tonga and Samoa). The populations of Australia and New Guinea were only marginally affected by these movements, and retain their indigenous Australo-Papuan populations today.

One important point needs to be emphasised, however. The region that the world today recognises as central and southern China undoubtedly has the oldest evidence for food production, population growth and out-migration of any region of Asia beyond the Fertile Crescent of the Middle East. But this does not mean that the Neolithic inhabitants of central and southern China were all “Chinese”, of Han ethnicity, speakers of Sinitic languages. Chinese culture and civilisation grew along the Yellow River, and only spread south of the Yangzi around 3,000–2,500 years ago.
The Neolithic inhabitants of the Yangzi basin and regions to the south became the ancestral speakers within the major language families that now exist across the Mainland and Islands of Southeast Asia. Sinitic languages now dominate much of China, but they did not do so during Neolithic times.

3.4 Genomics

I have discussed craniofacial evidence above, with the archaeology, because skeletons are such a major part of the archaeological record. Those skeletons also contain osseous elements, particularly from the base of the skull close to the ear, that can yield ancient DNA. Until 2015, most attempts by geneticists to unravel the history of human populations depended on DNA data from living groups. Many analyses also focused on the two genetic elements that are inherited uniparentally – mitochondrial DNA through females and the Y-chromosome through males. But when techniques were developed to extract whole genomic information from ancient bones, it was rapidly realised that DNA analysis of populations alive now can give ambiguous results about population distributions that existed thousands of years ago. Humans, basically, have never stopped migrating, and each new migration will tend to absorb or erase the records left by older ones (Reich 2018). As a result, coherent population histories are now derived from whole genomic analyses of ancient DNA, focused on admixture proportions between potential ancestral populations that can be defined from ancient skeletons and living peoples.

Two major studies published in 2018 (Lipson et al. 2018; McColl et al. 2018; see commentary in Bellwood 2018) emphasise the importance of the Southeast Asian Neolithic population dispersal that overlay the original Australo-Papuan distribution of the pre-Neolithic period. The most recent research emphasises even more that Southeast Asian Neolithic populations had genetic ancestries located in southern China, rather than in Hoabinhian Southeast Asia (Yang et al. 2020; Wang et al. 2021). Later migrations can also be identified, especially those that resulted from the Chinese conquest of northern Vietnam around 2,000 years ago. The genetic intrusions that resulted from Indian contact around 2,000 years ago, and from European colonial powers, were far more limited in comparison.

We now come to examine the major language families of Mainland Southeast Asia in terms of what can be stated about their origins and histories of dispersal.

3.5 Austroasiatic

By virtue of its “swamped” distribution, the Austroasiatic language family surely represents the most ancient layer of Mainland Southeast Asian language history that can be recognised today. The geographical overlays by the Tibeto-Burman, Kra-Dai and Austronesian language families are strongly evident. The ultimate distribution of the Austroasiatic languages extends from the Munda languages of northeastern India across to Vietnam, and southwards to almost the end of the Malay Peninsula (the Aslian languages). Nicobarese represents an island outlier, as does the recent historical movement of Khasi from Mainland Southeast Asia into Assam.

Because of the linguistic swamping of so much of the former distribution of this language family throughout southern China and Thailand, its exact homeland will probably never be known. A southwest Chinese origin followed by migration down the Mekong River is a likely possibility (Sidwell and Blench 2011). Other suggested homelands include northeastern India and the Yangzi basin, but there is no specific archaeological or genetic evidence that can help to distinguish between these possibilities.

The expansion history of the Austroasiatic languages is obscure, but if the Mekong Valley was their main conduit, then early spreads into and across the Truong Son Range (Vietic), down the Malay Peninsula (Aslian), and across to northeastern India (Munda) are all possibilities. Many Austroasiatic subgroups would then descend from those groups who stayed closest to home, along the Mekong itself. The archaeological record from Thailand and Vietnam suggests that this migration was well underway by around 4,000 years ago, carried by rice and millet farming populations who created large village mounds with extensive cemeteries of supine burials. Grave goods became plentiful during the Neolithic, including fine pottery, stone axes, and stone and shell jewellery (Higham 2014). Bronze working was introduced from China much later, around 3,000 years ago. Cemetery analyses indicate that both birth rates and infant death rates were high during the Neolithic, reflecting a common pattern amongst pioneering agricultural populations in tropical regions subject to malaria (Bellwood and Oxenham 2008).

Details of interactions with pre-Austroasiatic-speaking populations are scarce, but it is interesting that the negrito populations of Peninsular Malaysia, of indigenous (pre-Neolithic) genetic heritage, all switched long ago from their unrecorded original languages to speaking Aslian languages (see Reid 2013 for a similar situation with negrito populations in the Philippines). Because Aslian is well defined apart from the other Austroasiatic languages, a considerable time depth is implied, presumably in millennia. Such a switch implies a considerable level of social interaction, even if there was little genetic interaction.

3.6 Kra-Dai (Tai-Kadai)

The Kra-Dai languages clearly have a recent history in Thailand and Laos, having entered those regions, according to historical sources, around AD 1250. But the much greater linguistic diversity in Guangxi, presumably Guangdong (before Chinese conquest), Hainan and northern Vietnam suggests a Kra-Dai homeland somewhere in this region (Ostapirat 2005). In northern Vietnam, Neolithic cultures with pottery styles paralleled in Guangdong and Hong Kong made an appearance by at least 4,000 years ago (Bellwood 2015), but archaeologists have yet to make similar comparisons between Vietnam and Hainan. The people buried in the Neolithic cemetery of Man Bac in northern Vietnam had predominantly East Asian Neolithic craniofacial affinities at around 3,900 years ago, albeit with a continued presence of a few individuals of Hoabinhian ancestry (Oxenham et al. 2011). These people grew rice and kept domesticated pigs and dogs, and presumably spread through much of northern Vietnam at the same time as Austroasiatic-speaking populations were penetrating what is now Vietnam from further west and south. Pre-existing Austroasiatic settlement of Neolithic origin throughout the rest of Mainland Southeast Asia may have stopped any further Kra-Dai advance southwards, prior to the medieval movements into Thailand and Laos. But we can certainly expect to find that the Kra-Dai languages dominated much of southeastern China during Neolithic times, possibly from as early as 4,500 years ago.




3.7 Tibeto-Burman

The scarcity of archaeological research in Burma and around the Himalayas means that little can be said about Tibeto-Burman expansion. However, it presumably commenced with millet farmers from the Sino-Tibetan homeland region in the middle Yellow River Valley during Neolithic times, most probably during the Yangshao and Majiayao cultures of Chinese archaeologists. The route of expansion was probably from the Yellow River down to the Yangzi in Sichuan, and thence via Yunnan into Upper Burma, leaving a linguistic trail that has since been eradicated by the expansion of Sinitic languages. Strikingly, the oldest Neolithic sites in central Thailand, which date to about 4,300 years ago, occur only with foxtail millet rather than rice, thus raising the possibility of an association there with Tibeto-Burman migration from ultimate Yellow River sources (d’Alpoim Guedes et al. 2020).

The timescale for such a movement might have fallen between 5,000 and 4,000 years ago, and one wonders if Tibeto-Burman languages replaced older Austroasiatic languages throughout what is now Burma. The presence of the Munda languages in India might suggest that there was once a continuous Austroasiatic distribution across the whole of the Austroasiatic-speaking area, including Burma, but the historical lesson provided by Khasi in Assam (an Austroasiatic language) advises caution. Khasi was a fairly recent spread, presumably through areas that were already occupied by Tibeto-Burman speaking peoples. Therefore “leap-frogging” cannot be dismissed lightly. A similar process must have occurred with the Tai language Ahom, which also leap-frogged into Assam in historical times. Indeed, linguists Rau and Sidwell (2019) have recently suggested that the ancestral Munda languages spread by sea to the Mahanadi Delta in Odisha from Mainland Southeast Asia about 4,000 years ago, thus removing any need to worry about land migrations. A similar antiquity for Munda migration is also suggested by a genetic molecular clock analysis of Munda DNA (Singh et al. 2019). It is therefore not impossible that Tibeto-Burman languages reached Burma at the same time as did Austroasiatic ones, causing an early halt to westward migration of the latter by land.

3.8 Austronesian

The Austronesian languages have a fairly peripheral presence on the Southeast Asian mainland, represented by Moken and Malayic in the south, and the Chamic languages that created the Indianised civilisation of Champa in central Vietnam. The language family as a whole originated from a proto-Austronesian stage in Taiwan about 5,000 years ago, with any former relatives in southern China now erased by Sinitic expansion (unless they are represented by the Kra-Dai languages, as suggested by Sagart 2005). The expansion of the Malayo-Polynesian subgroup of Austronesian reached the Philippines from Taiwan about 4,000 years ago, and major expansions through Indonesia and into the Pacific Islands followed after 3,500 years ago (Ross 2008; Blust 2017).

Linguist Robert Blust has presented an argument that Malayo-Polynesian languages were once much more widespread between coastal Vietnam and Peninsular Malaysia than they are now (Blust 1994). Iron Age pottery dating from c. 2,000 years ago found in Cam Ranh Bay (south-central Vietnam) and in southern Thailand has stylistic parallels in the Iron Age Philippines, so this is quite possible (Bellwood 2017a: Fig. 9.10). Perhaps former Malayo-Polynesian languages were replaced by Khmer and Thai expansion. However, the evidence is still fairly thin, and suggestions that the Indic-influenced civilisation of Funan was once Austronesian-speaking seem unlikely (at least to this author).

Because the Malayic and Chamic languages are very closely related within a single Malayo-Chamic subgroup of western Borneo origin (closest siblings are Iban, Selako and other western Borneo languages), it is quite likely that both moved together, albeit in rather different directions. The ancestral Chamic languages perhaps crossed the South China Sea directly, whereas the expansion of Malayic perhaps went through Sumatra to the Malay Peninsula (or did it move down from the north, along the mainland coast?). Blust has noted a Malayo-Chamic association with a reconstruction for “iron”, making Iron Age expansions after 500 BC for both Malayic and Chamic very likely (Blust 2005).

There is little in the regional archaeological records that can be equated directly with these two language groups, and the Iron Age Sa Huynh jar burial culture of central Vietnam is not identical to any contemporary cultures in Borneo or the Philippines. However, there are many specific parallels between Vietnam and the Philippines in the decoration of the small accessory pottery vessels buried with the dead, as well as in earring styles in Taiwan nephrite and glass (Hung 2017). Archaeologist Wilhelm Solheim II stressed the similarities between Sa Huynh and the Kalanay culture of the central Philippines (Solheim 2002), although there appears to be no possibility of deriving the Malayo-Chamic languages themselves from the Philippines.

The Malayo-Chamic migration, like that to Madagascar, was later in time than the Neolithic migrations of the bulk of the Austronesian-speaking population in Island Southeast Asia. Because Indic contacts and influences spread through most of Island Southeast Asia after 2,500 years ago, including through traded items to as far north as Taiwan, one likelihood is that extensive trade networks fuelled the Malayo-Chamic expansion, with core linguistic populations moving out of western Borneo. Many other groups, especially from the Philippines, were evidently involved as well. Whether such an explanation can also account for the existence of the Moken languages on the other side of the Thai-Malayan Peninsula remains unknown.




3.9 Conclusion and further observations

In Mainland Southeast Asia, it is a reasonable inference that the early dispersal histories of the three major language families that occur there now (Tibeto-Burman, Austroasiatic, Kra-Dai) involved migration, generally from north to south, by Neolithic farmers. Malays and Chams (Austronesians), of Taiwan and Island Southeast Asian linguistic origin, arrived somewhat later, most probably during the Iron Age. However, in the specific case of Malayo-Polynesian dispersal out of Taiwan, there is a distinct but undemonstrated possibility, supported by comparisons of Neolithic circle- and dentate-stamped pottery, that there were contacts around 3,500–4,000 years ago involving Luzon and the northern coastline of Vietnam. Unfortunately, no data are available yet for Hainan, but we might expect this island to have been involved in such contacts, given its geographical position. These observations support linguistic opinions of close ancestral relationships between Kra-Dai (Tai-Kadai) and Austronesian, especially Malayo-Polynesian, by Sagart (2005) and Blench (2018).

Genomic and archaeological data reveal a similar picture of human population expansion, spreads of domesticated animals and plants, and the spread of related sets of Neolithic material culture with pottery and polished stone technology. The Neolithic material culture concerned, like the human populations, did not descend in place from pre-Neolithic forebears in Southeast Asia, but reveals strong evidence for ultimate introduction from a developmental region for early agriculture in what is now central and southern China, between 6,000 and 4,000 years ago. It therefore seems very likely that the ancestral languages that founded the major language families that exist today were also carried with these human migrations.

This need not mean that ancestral Tibeto-Burman, Kra-Dai and Austroasiatic necessarily began their expansion careers in what is now southern China, but the consensus of genomic, archaeological and linguistic evidence makes this a likely hypothesis. The precise origin regions for these major language families, especially Austroasiatic, are by no means obvious, and this may be because their original subgroup patterns in geographical space have been erased by subsequent linguistic expansions, these expansions continuing into very recent times with metropolitan/imperial languages such as “Chinese” (multiple Sinitic languages), Thai, Vietnamese, and Burmese.

References

Bellwood, Peter. 1997. Prehistoric cultural explanations for widespread language families. In Patrick McConvell & Nicholas Evans (eds.), Archaeology and linguistics, 123–134. Oxford: Oxford University Press.
Bellwood, Peter. 2005. First farmers: The origins of agricultural societies. Oxford: Blackwell.
Bellwood, Peter. 2013. First migrants: Ancient migration in global perspective. Chichester: Wiley-Blackwell.


Bellwood, Peter. 2015. Vietnam’s place in the prehistory of Eastern Asia – A multidisciplinary perspective on the Neolithic. In A. Reinecke (ed.), Perspectives on the archaeology of Vietnam, 47–70. Berlin & Bonn: German Archaeological Institute. (In English and Vietnamese).
Bellwood, Peter. 2017a. First Islanders: Prehistory and human migration in Island Southeast Asia. Hoboken: Wiley-Blackwell.
Bellwood, Peter. 2017b. A model for the expansion of agricultural societies in southern China and into Southeast Asia. In Emergence of Neolithic culture and preservation of Neolithic sites, vol. 1, 23–41. Seoul: Amsadong Site Research Series.
Bellwood, Peter. 2018. The search for ancient DNA heads east. Science 361. 31–32.
Bellwood, Peter, Janelle Stevenson, Eusebio Dizon, Armand Mijares, Gay Lacsina & Emil Robles. 2008. Where are the Neolithic landscapes of Ilocos Norte? Hukay 13. 25–38. Manila.
Bellwood, Peter, Marc Oxenham, Bui Chi Hoang, Nguyen Kim Dzung, Anna Willis, Carmen Sarjeant, et al. 2011. An Sơn and the Neolithic of Southern Vietnam. Asian Perspectives 50. 144–175.
Blench, Roger. 2018. Tai-Kadai and Austronesian are related at multiple levels. Unpublished manuscript.
Blust, Robert. 1994. The Austronesian settlement of mainland Southeast Asia. In Kathleen Adams & Thomas Hudak (eds.), Papers from the Second Annual Meeting of the Southeast Asian Linguistics Society, 25–83. Tempe, AZ: Program for Southeast Asian Studies, Arizona State University.
Blust, Robert. 2005. Borneo and iron: Dempwolff’s *besi revisited. Bulletin of the Indo-Pacific Prehistory Association 25. 31–40.
Blust, Robert. 2017. The linguistic history of Austronesian-speaking communities in Island Southeast Asia. In Peter Bellwood (ed.), First Islanders, 190–197. Malden, MA: Wiley Blackwell.
Carson, Mike T. & Hsiao-chun Hung. 2018. Learning from paleo-landscapes. Current Anthropology 59. 790–813.
d’Alpoim Guedes, J. et al. 2020. 3000 years of farming strategies in central Thailand. Antiquity 94. 966–982.
Diffloth, Gérard. 2005. The contribution of linguistic palaeontology to the homeland of Austro-Asiatic. In Laurent Sagart, Roger Blench & Alicia Sanchez-Mazas (eds.), The peopling of East Asia: Putting together archaeology, linguistics and genetics, 77–80. Abingdon: Routledge Curzon.
Gray, Russell, A. J. Drummond & Simon Greenhill. 2009. Language phylogenies reveal expansion pulses and pauses in Pacific settlement. Science 323. 479–483.
Higham, Charles. 2014. Early Mainland Southeast Asia. Bangkok: River Books.
Higham, Charles. 2018a. East Asian agriculture and its impact. In Chris Scarre (ed.), The human past, 230–260. London: Thames and Hudson.
Higham, Charles. 2018b. Complex societies of East and Southeast Asia. In Chris Scarre (ed.), The human past, 547–589. London: Thames and Hudson.
Hosner, D. et al. 2016. Spatiotemporal distribution patterns of archaeological sites in China. The Holocene 26. 1576–1593.
Hung, Hsiao-chun. 2017. Nephrite and other Early Metal Age exchange networks across the South China Sea. In Peter Bellwood (ed.), First Islanders, 333–335. Hoboken: Wiley Blackwell.
Kim, Nam C. & Charles Higham (eds.). In press. Oxford handbook to Southeast Asian archaeology. Oxford: Oxford University Press.
Lipson, Mark, Olivia Cheronet, Swapan Mallick, Nadin Rohland, Marc Oxenham, Michael Pietrusewsky, et al. 2018. Ancient genomes document multiple waves of migration in Southeast Asian prehistory. Science 361. 92–95.
Ma, Yongchao, Xiaoyan Yang, Xiujia Huan, Yu Gao, Weiwei Wang, Zhao Li, et al. 2018. Multiple indicators of rice remains and the process of rice domestication. PLoS ONE 13(12). e0208104.




Matsumura, Hirofumi, Marc Oxenham, Truman Simanjuntak & Mariko Yamagata. 2017. The biological history of Southeast Asian populations from Late Pleistocene and Holocene cemetery data. In Peter Bellwood (ed.), First Islanders, 98–106. Hoboken: Wiley Blackwell.
Matsumura, Hirofumi, Hsiao-chun Hung, Charles Higham, Chi Zhang, Mariko Yamagata, Lan Cuong Nguyen, et al. 2019. Craniometrics reveal “two layers” of prehistoric human dispersal in Eastern Eurasia. Scientific Reports 9. 1451.
McColl, Hugh, Fernando Racimo, Lasse Vinner, Fabrice Demeter, Takashi Gakuhari, Eske Willerslev, et al. 2018. The prehistoric peopling of Southeast Asia. Science 361. 88–92.
Ostapirat, Weera. 2005. Kra-Dai and Austronesian. In Laurent Sagart, Roger Blench & Alicia Sanchez-Mazas (eds.), The peopling of East Asia: Putting together archaeology, linguistics and genetics, 107–131. London: Routledge Curzon.
Ostler, Nicholas. 2005. Empires of the word. London: Harper Perennial.
Oxenham, Marc, Hirofumi Matsumura & Nguyen Kim Dung (eds.). 2011. Man Bac: The excavation of a Neolithic site in Northern Vietnam. Canberra: Terra Australis Volume 33, ANU E Press.
Oxenham, Marc, Hiep Hoang Trinh, Anna Willis, Rebecca Jones, Kathryn Domett, Cristina Cobo Castillo, et al. 2018. Between foraging and farming: Strategic responses to the Holocene Thermal Maximum in Southeast Asia. Antiquity 92(364). 940–957.
Rau, Felix & Paul Sidwell. 2019. The Munda maritime hypothesis. Journal of the Southeast Asian Linguistics Society 12. 35–57.
Reich, David. 2018. Who we are and how we got here. Oxford: Oxford University Press.
Reid, Lawrence. 2013. Who are the Philippine Negritos? Human Biology 85. 329–358.
Ross, Malcolm. 2008. The integrity of the Austronesian language family: From Taiwan to Oceania. In Alicia Sanchez-Mazas, Roger Blench, Malcolm Ross, Ilia Peiros & Marie Lin (eds.), Past human migrations in East Asia: Matching archaeology, linguistics and genetics, 161–181. Abingdon: Routledge Curzon.
Sagart, Laurent. 2003. The vocabulary of cereal cultivation and the phylogeny of East Asian languages. Bulletin of the Indo-Pacific Prehistory Association 22. 127–136.
Sagart, Laurent. 2005. Tai-Kadai as a subgroup of Austronesian. In Laurent Sagart, Roger Blench & Alicia Sanchez-Mazas (eds.), The peopling of East Asia: Putting together archaeology, linguistics and genetics, 161–178. London: Routledge Curzon.
Sagart, Laurent, Guillaume Jacques, Yunfan Lai, Robin J. Ryder, Valentin Thouzeau, Simon J. Greenhill & Johann-Mattis List. 2018. Dated language phylogenies shed light on the ancestry of Sino-Tibetan. Proceedings of the National Academy of Sciences 116. 10317–10322.
Sidwell, Paul. 2015. Southeast Asian mainland: Linguistic history. In Peter Bellwood (ed.), The global prehistory of human migration, 259–268. Chichester: Wiley Blackwell.
Sidwell, Paul & Roger Blench. 2011. The Austroasiatic Urheimat: The Southeastern Riverine hypothesis. In Nick J. Enfield (ed.), Dynamics of human diversity, 315–344. Canberra: Pacific Linguistics.
Singh, Prajjval Pratap, Shani Vishwakarma, Gazi Nurun Nahar Sultana, Arno Pilvar, Monika Karmin, Siiri Rootsi, et al. 2019. Counting the paternal founders of Austroasiatic language speakers associated with the language dispersal in south Asia. BioRxiv preprint (posted online 18 November 2019), doi: http://dx.doi.org/10.1101/843672.
Solheim, Wilhelm G. II. 2002. The archaeology of Central Philippines. Diliman: University of the Philippines Archaeological Studies Program.
Stevens, Chris J. & Dorian Q. Fuller. 2017. The spread of agriculture in Eastern Asia. Language Dynamics and Change 7. 152–186.
Wang, C. et al. 2021. Genomic insights into the formation of human populations in East Asia. Nature 591. 413–419.
Yang, M. et al. 2020. Ancient DNA indicates human population shifts and admixture in northern and southern China. Science 369(6501). 282–288.

Yang, Xiaoyan, Qiuhe Chen, Yongchao Ma, Zhao Li, Hsiao-chun Hung, Qianglu Zhang, et al. 2018. New radiocarbon and archaeobotanical evidence reveal the timing and route of southward dispersal of rice farming in South China. Science Bulletin 63. 1495–1501. Zhang, Menghan, Shi Yan, Wuyun Pan & Li Jin. 2019. Phylogenetic evidence for Sino-Tibetan Origin in northern China in the late Neolithic. Nature 569. 112–115.

Dorian Q. Fuller and Cristina Cobo Castillo

4 The origins and spread of cereal agriculture in Mainland Southeast Asia

4.1 Introduction

Cereal agriculture was established through much of mainland Southeast Asia around 4,000 years ago, but underwent important subsequent transformations. The first evidence for cereal-based agriculture in the southern parts of China and Taiwan dates to ca. 2500 BC or a few centuries earlier. While there are a few sites in mainland Southeast Asia that might have involved cereal farming between 2500 and 2000 BC, including millet cultivation at Non Pa Wai, Thailand, most evidence dates to after 2000 BC. Current evidence suggests that lower-yielding, lower-input dry rice and millet cultivation became established first in mainland Southeast Asia. Wet rice, which is so common in the landscapes today, represents a later development, involving labour intensification and probably new varietal diversity in rice. The earliest evidence for this at present comes from ca. 100 BC at Ban Non Wat (Castillo et al. 2018b). Some of that new diversity included indica rice introduced from India (perhaps by the 3rd century AD), which dominates much of the lowland irrigated rice in the plains of mainland Southeast Asia today. Additional diversity involved several pulse crops introduced from India, such as mungbean and pigeon pea, as well as cotton, all of which were established by Iron Age times (Castillo et al. 2016; D’Alpoim Guedes et al. 2020). Another strand of new diversity included glutinous rice, which spread as new japonica rice varieties from China, perhaps also in the last 2000 years. There is much less evidence for cereal agriculture in Island Southeast Asia, and that is mostly within the past 3500 years; for example, rice phytoliths of this age have recently been reported from Sulawesi (Deng et al. 2020a).
https://doi.org/10.1515/9783110558142-004

The shift from hunter-gatherer economies to those based on agriculture was a fundamental change in how human societies affected their environments, and it created a fundamental demographic transition, with the potential to support ever greater populations. As such, the transition to agriculture has been seen as fundamental to the distribution of language families, as farmers have tended to expand geographically through migration at the expense of foraging populations (e.g. Bellwood 2005). Nevertheless, not all agricultural systems are equivalent: some support much higher population densities and therefore pack the landscape, whereas other farming systems have lower carrying capacities and are prone to more frequent community fission and migration. Therefore, in terms of understanding early agriculturalist migrations, it is not enough to establish the presence of agriculture. Instead, what is important is to establish the basis of that agricultural system and its potential productivity – the details of which crops and cropping systems really do matter. It

has recently been demonstrated, for example, that the lower productivity of African savannah millets meant that African agriculture spread faster than the more productive Eurasian wheat and barley (Fuller et al. 2019), and that early wet rice in China was less prone to support migrations than rainfed rice and millets (Qin and Fuller 2019), a key insight we return to below. Mainland Southeast Asia is generally regarded as a recipient region of agriculturalist migrations (Higham 2003; Bellwood 2005), which seems clearly to be the case for major cereal farming traditions, such as millets and rice. Recent archaeological synthesis suggests that there were as many as 20–24 separate regional foci of plant domestication (Purugganan and Fuller 2009; Larson et al. 2014), and of these, probably a dozen involved the domestication of a cereal. This means that the other half of the centres of origin were likely to have focused on various vegetatively reproduced crops, the tubers and arboricultural crops that have often been prominent in the tropics. While mainland Southeast Asia may well have had earlier Holocene traditions of tuber cultivation or even palm starch use (sago), which has been better documented for Island Southeast Asia, the archaeobotanical evidence for documenting this remains scarce (see Fuller et al. 2014; Denham et al. 2020). We therefore focus on the transition to grain-based agriculture, the origins of staple cereals, which would have displaced prior hunter-gatherer or vegeculturalist traditions over the long term. From the perspective of early Southeast Asia, the most important centres of cereal domestication were doubtless those in China, where a distinct northern macro-region provided domesticated millets (Panicum miliaceum and Setaria italica), and further south in the Yangtze basin rice (Oryza sativa) was domesticated.
It is fundamental, however, not to treat rice as a single monotypic crop, as its forms and its agricultural systems vary greatly, from highly intensive irrigated flooded paddy fields to lower productivity rainfed systems with very different ecological and demographic implications (Fuller et al. 2011; Qin and Fuller 2019), as well as significant cultural variants, such as sticky rices. So we treat three major rice culture types in turn. Nevertheless, other Old World regions contributed to the cereal diversity in Southeast Asia. India, in its broad modern sense, was the source of new forms of rice (Oryza sativa subsp. indica, and cultivars grouped as “circum-aus”), as well as some minor millets, and a stepping stone for African cereals. Cereals of Near Eastern origin found niches in some of the mountain regions of Southeast Asia, and there are even a few secondary cereal crops of potential local origin, perhaps focused in particular on the mountainous region of the Indo-Burma borderlands (Fuller and Castillo in press).



4.2 Chinese domestication processes: millets and early rice

China is recognised as having at least two and probably more centres of independent, initial crop domestication (Zhao 2011; Stevens and Fuller 2017). These include a northern Chinese millet centre of origin. Soybeans (Glycine max) also come from this region but were added later (domesticated ca. 2500 BC). This is also one of the regions where pigs were domesticated, also by ca. 5000 BC (Cucchi et al. 2016; Dong and Yuan 2020). A second region is the Yangtze valley, where rice (Oryza sativa) was domesticated, but modern genetic inferences suggest this was subspecies japonica only, with subspecies indica and the separate groups of “circum-aus” having different regional origins, which we return to below (Choi et al. 2017; Civan and Brown 2018; Gutaker et al. 2020). A possible third region is the far south of China (including Guangdong and Guangxi) and the Pearl River Basin, which has long been suggested to have been a region of vegecultural origins of some tuber crops, although clear archaeological evidence remains limited (e.g. Li 1970; Yang et al. 2013; Denham et al. 2018). In northern China, two millet species (Panicum miliaceum and Setaria italica) were domesticated and were the initial focus of cereal agriculture. It is unclear whether more than one cultural area was involved in domestication, and there are at least five cultures generally regarded as likely early millet cultivators, which stretch throughout the Yellow River basin and the upper Liao River basin in eastern Inner Mongolia; these cultures date to 6500–5000 BC and include the Dadiwan, Peiligang, Cishan, Houli and Xinglongwa cultural complexes (Cohen 2011; Ren et al. 2016; Stevens and Fuller 2017). Foxtail millet was the dominant cereal in the middle Yellow River basin through the Yangshao and Longshan cultures and into the Shang Dynasty (Stevens and Fuller 2017; Li et al. 2020; Deng et al. 2020b), making it central to the rise of the first Chinese states, presumably related to population growth in Sinitic-speaking groups. The Yangtze River basin, including its middle and lower reaches (east of the Three Gorges region) and its tributaries (such as the Han and Huai Rivers), is broadly accepted as a region of rice domestication at roughly the same time as millet domestication, i.e. 7000–5000 BC (Cohen 2011; Stevens and Fuller 2017). However, the Yangtze basin was certainly not culturally unified in this period, with distinctive Neolithic traditions in the Middle Yangtze (Hunan province), the Lower Yangtze (the regions around Taihu Lake and Hangzhou Bay), and along tributary rivers. It has been argued that Middle and Lower Yangtze rice domestication processes were separate (Fuller and Qin 2009; Zhao 2011; Makibayashi 2014; Silva et al. 2015). What is clear for rice is the evolution of morphological features of domestication, including non-shattering panicles, larger grain sizes and a change in plant habit, which were evolving rapidly between 6500 and 4500 BC, with domesticated rice spreading thereafter (Fuller et al. 2016; Stevens

and Fuller 2017; Ishikawa et al. 2020). Early exploitation (>6500 BC) and low-level pre-domestication cultivation is likely, but remains only partially documented. Two millet crop species need to be considered in terms of early Chinese agriculture and its potential spread to Southeast Asia, foxtail and broomcorn millet, although only foxtail millet became widespread throughout Southeast Asia. Broomcorn millet (Panicum miliaceum) is traditionally grown in Taiwan, the southern parts of China and parts of Myanmar, where a Burmese name is recorded from at least the 12th century (Bradley 2011), but otherwise has had little penetration in Southeast Asia as a crop. Nevertheless, P. miliaceum is present among the earliest archaeological cereal remains in Yunnan, in near-coastal Fujian and in Taiwan, all by ca. 2500 BC (Dal Martello et al. 2018; Tsang et al. 2017; Deng et al. 2018a). Foxtail millet (Setaria italica) is widespread through Southeast Asia, which is readily attested by vernacular names across the major language families, from Tai-Kadai (e.g. Lao kʰǎo fāːŋ, Vidal 1962), Tibeto-Burman (e.g. Old Burmese tɕhap, Proto-Burmic *tsap, Bradley 2011), Munda languages (Proto-Munda *hoxy, Zide and Zide 1976) and Austronesian languages (Proto-Malayo-Polynesian *beteŋ), with plausible correspondence to an Old Chinese term, *tsek (Sagart et al. 2017). Archaeologically it is more widespread than Panicum miliaceum, although it co-occurs with Panicum in Yunnan (Baiyangcun, ca. 2600 BC: Dal Martello et al. 2018), Fujian (Pingfengshan and Huangguashan, 2200–1600 BC: Deng et al. 2018), and Taiwan (Nuankuanli East, ca. 2500 BC: Tsang et al. 2017). Foxtail millet and rice both spread early to Southeast Asia. Foxtail millet on its own, without rice, is reported from central Thailand with a direct date of ca. 2200 BC, from Non Pa Wai (Weber et al. 2010), whereas in later phases in this valley rice appeared (ca. 1500 BC) and increased through time (D’Alpoim Guedes et al. 2020). Elsewhere foxtail millet and rice co-occur, such as at Rach Nui in Vietnam, ca. 1500 BC, and Khao Sam Kaeo in Thailand, ca. 200 BC (Castillo et al. 2016; Castillo et al. 2018a). Both Panicum and Setaria later developed sticky varieties, which spread secondarily through much of eastern Asia and parts of Southeast Asia. Rice is the best known, and most widely spread, of the cereals that originated in China. Although the origins of Asian rice (Oryza sativa) were not restricted to China, its Neolithic origins in China are the source for the initial spread to Southeast Asia. Later rice dispersals from India, including subspecies indica and the circum-aus rices, are dealt with below. In addition, we address the origins and spread of glutinous rice varieties, which are important in many Southeast Asian regions. Like rice, the millets also later evolved “glutinous” or waxy varieties through post-domestication mutations (Araki et al. 2012; Hachinken et al. 2013). A secondary dispersal of new sticky millet varieties is parallel to the spread of glutinous rice to Southeast Asia (Fuller and Castillo 2016). After being domesticated, rice spread to other regions (Fuller et al. 2010; Silva et al. 2015; Stevens and Fuller 2017). The first direction of dispersal was northwards to the Yellow River (by 3800 BC at the latest, based on direct AMS dates from Nanjiaokou, Henan). Rice was slower to spread southwards or westwards, but reached northern Yunnan, Guangdong near the Pearl River Delta, and the island of Taiwan by ca. 2600



BC (Dal Martello et al. 2018; Qin and Fuller 2019; Gao et al. 2020). As already noted, the spread of rice was usually accompanied by the Chinese millets, especially into Southwest and Southeast China. The earliest evidence for domesticated rice in Southeast Asia is from Khok Phanom Di, on the eastern margin of the lower Chao Phraya Plain in Thailand (Thompson 1996). The site was occupied between 2000 and 1500 BC and yielded twenty-seven domesticated-type rice spikelet bases. The absence of millets here should be noted, in contrast to the presence of millet but not rice at the near-contemporary Non Pa Wai (2200–1500 BC) in central Thailand (Weber et al. 2010; D’Alpoim Guedes et al. 2020). Since then, numerous other sites with domesticated rice have been found across Southeast Asia, spanning the Neolithic to the Historic periods. The evidence so far points to multiple dispersal routes at different times. However, it remains clear that rice was adopted widely in Southeast Asia during the Neolithic, starting in the period 2000–1500 BC (Castillo 2017; D’Alpoim Guedes et al. 2020).

4.3 The problem of indica rice origins

It has long been recognised that there are at least two distinct subspecies of rice, indica and japonica, which have long been attributed to separate geographical origins, with japonica associated with origins in China and indica plausibly from South Asia (e.g. Vitte et al. 2004). This is important because both indica and japonica rices have long been cultivated in Southeast Asia. In mainland Southeast Asia much upland rainfed rice is japonica, while indica dominates lowland irrigated rice. As noted above, the first rices to spread from China to Southeast Asia appear to have been subspecies japonica (Gutaker et al. 2020). Rices of the indica subspecies were later adopted in Southeast Asia alongside the established japonica rices. This likely involved multiple dispersal events (Gutaker et al. 2020). Evidence from grain metrics in Thailand suggests that the earliest indica may have been present in the early centuries AD (Castillo et al. 2018b) and became more widespread before Angkorian times (Castillo 2017; Castillo et al. 2018c). Interestingly, other crops of Indian origin, including pulses (Vigna radiata, Macrotyloma uniflorum, Cajanus cajan) and cotton (Gossypium arboreum), were adopted earlier in Mainland Southeast Asia, certainly by ca. 300–200 BC (Castillo et al. 2016). One find of Cajanus cajan may even be older than 1000 BC (D’Alpoim Guedes et al. 2020), hinting at very early interactions across the Bay of Bengal (Fuller et al. 2011). Archaeological evidence indicates early rice cultivation in the Ganges plains of India by ca. 2500 BC, which has been argued to represent the origins of indica (Fuller 2006). After 1000 BC rice cultivation became widespread in India, spreading to the far south and Sri Lanka, and differentiated into wet and dry cultivation ecologies (Fuller and Qin 2009; Kingwell-Banham 2019).

However, more recent advances in the genomics of rice diversity, the genetics underlying domestication features in rice, and archaeology have complicated the picture further. Genomic evidence also indicates that a distinct group of “circum-aus” rices should be segregated from indica, as they are as distant as indica from japonica (McNally et al. 2009; Choi et al. 2017; Civan et al. 2018; Gutaker et al. 2020). The circum-aus rices are particularly diverse in northeast India and Bangladesh, where they perhaps originated. These circum-aus rices also came into Southeast Asia, represented by the so-called “Champa rices”, which had a short growing season, were introduced from Vietnam to China in the 10th century, and set off something of a demographic and agricultural revolution (Barker 2011).

4.4 Dry rice and secondary transitions to irrigated rice

Another fundamental axis of variation in rice is that between rainfed (dry) and wet (flooded or irrigated) cultivation systems. Historically, major states in mainland Southeast Asia have been supported by wet rice, which both provided large surpluses and supported the large populations from which state armies were drawn (Scott 2009). In the uplands of Southeast Asia the highest diversity of ethnolinguistic groups is associated with dry rice and other rainfed crops. Thus, understanding the nature of rice cultivation systems is fundamental to assessing their potential to support population density as well as their impact on the environment. In order to distinguish wet from dry rice systems it is necessary to document the companion species of rice, namely the weeds that grew with rice, which have increasingly been documented in recent years (Fuller and Qin 2009; Weisskopf et al. 2014, 2015; Castillo et al. 2018b; Kingwell-Banham 2019). Currently we infer that the first form of cultivation in mainland Southeast Asia was dry rice (Fuller et al. 2016; Castillo 2017; Qin and Fuller 2019; D’Alpoim Guedes et al. 2020). The transition to wet rice and irrigation took place in some regions during the Iron Age. Archaeobotanical evidence from Northeast Thailand documents this transition between 100 BC and AD 400 (Castillo et al. 2018b). Wet and dry ecologies occur across each of the rice subspecies. A key reason why this matters is the issue of potential yield, which impacts carrying capacity and in turn the frequency with which we can expect outward migration due to community fission. While ancient yields cannot be directly recovered, we can make reasonable inferences from ethnographic and historic parallels (Qin and Fuller 2019). Over the long term we might expect yields to tend to increase, but nevertheless the contrast in productivity between dry and wet rice should remain.
An average taken from pre-modern wet rice yields is 1,897 kg/ha, while 1,300 kg/ha was achieved in 10th century Japan, and 1,000 kg/ha in the Han Dynasty at Hangzhou nearly 2000 years ago (Qin and Fuller 2019). In contrast, the average of recent dry rice yields is 1,062 kg/ha



(Qin and Fuller 2019), although data from Palawan and Borneo swiddens average just 578 kg/ha, with as low as 229 kg/ha being reported (Barton 2012). Millet yields tend to be closer to those of dry rice, with 500–650 kg/ha being reasonable historical estimates for Chinese millets, and ca. 400 kg/ha being typical of several Indian millets (Qin and Fuller 2019). Taken together with the need to fallow some land when growing rainfed rice or millet, it is possible to estimate the reasonable agricultural carrying capacity of self-sufficient villages, with population growth above this level expected to lead to group fission and outward migration (Fuller et al. 2019). Based on such estimates, the potential carrying capacity of early wet rice farming in the Neolithic Yangtze is up to seven times that of dry rice or millet farming, which means that migrations of dry rice or millet farmers would have been more likely and more frequent. Wet rice farmers tended to pack into a landscape, urbanise and invest labour in further agricultural intensification, represented in the Lower Yangtze by the emergence of the urban centre of Liangzhu at ca. 3000 BC (Zhuang et al. 2014; Qin and Fuller 2019; Liu et al. 2020), and by Shijiahe in the Middle Yangtze in a similar timeframe. Contrary to the Bellwood (2005) hypothesis, the wet rice farmers of the Yangtze basin tended to grow in population density, urbanise and intensify their agriculture, already by 3000 BC, whereas the dry rice and millet farmers who emerged on their more mountainous peripheries in the era 3000–2500 BC are more likely to be the cultural groups who expanded through migration into Southeast Asia, owing to their much lower population densities and a recurrent quest for new agricultural lands (cf. Deng et al. 2018b).
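
The carrying-capacity reasoning above can be made concrete with a small calculation. The following Python sketch is illustrative only: the yields are those quoted in the text, but the fallow fractions, the village land area and the per-person grain requirement are hypothetical placeholders, not values taken from Qin and Fuller (2019) or Fuller et al. (2019), and the output is not meant to reproduce their estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CroppingSystem:
    name: str
    yield_kg_per_ha: float   # grain yield of one cropped hectare in one year
    cropped_fraction: float  # share of village land under crop each year (rest fallow)


def carrying_capacity(system: CroppingSystem, land_ha: float,
                      grain_kg_per_person: float) -> float:
    """People the annual harvest could feed (a deliberately crude estimate)."""
    if not 0 < system.cropped_fraction <= 1:
        raise ValueError("cropped_fraction must be in (0, 1]")
    if land_ha <= 0 or grain_kg_per_person <= 0:
        raise ValueError("land_ha and grain_kg_per_person must be positive")
    harvest_kg = system.yield_kg_per_ha * system.cropped_fraction * land_ha
    return harvest_kg / grain_kg_per_person


# Yields as quoted in the text; cropped fractions are ASSUMED for illustration
# (continuous cropping for wet rice, one year in three for rainfed systems).
WET_RICE = CroppingSystem("wet rice (pre-modern average)", 1897, 1.0)
DRY_RICE = CroppingSystem("dry rice (recent average)", 1062, 1 / 3)
MILLET = CroppingSystem("Chinese millets (midpoint of 500-650)", 575, 1 / 3)

LAND_HA = 100               # hypothetical arable land of one village
GRAIN_KG_PER_PERSON = 250   # hypothetical annual grain need per person

if __name__ == "__main__":
    baseline = carrying_capacity(WET_RICE, LAND_HA, GRAIN_KG_PER_PERSON)
    for system in (WET_RICE, DRY_RICE, MILLET):
        people = carrying_capacity(system, LAND_HA, GRAIN_KG_PER_PERSON)
        print(f"{system.name}: about {people:.0f} people "
              f"(wet rice supports {baseline / people:.1f} times as many)")
```

In this sketch the ratio between two systems depends only on yield and cropped fraction, since land area and grain requirement cancel out. The comparison is therefore most sensitive to the assumed fallow length, which is exactly the parameter that is hardest to recover archaeologically.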

4.5 The creation and spread of sticky rice

An important cultural pattern in much of eastern Asia and Southeast Asia is the cultivation of sticky varieties of the major cereals, especially rice and millet. These cereals have become an important component of cuisines and alcohol brewing to varying degrees throughout the region, but are conspicuously little known in agriculture further west (in the Indian subcontinent or Central Asia). This highlights that the “sticky cereal zone” represents a geographical region defined by a cultural frontier that operated against the westward dispersal of sticky cereals (Sakamoto 1996; Fuller and Castillo 2016; Fuller and Lucas 2017). This zone includes much of China, mainland Southeast Asia, Malaysia and the Philippines, where sticky and non-sticky varieties of cereals are both grown. Two things are striking about this pattern. First, there is no ecological advantage or disadvantage that explains the presence or absence of sticky cereals. Second, the sticky forms of cereals do not exist among wild populations, and therefore evolved after domestication as a result of cultural selection on several cereal taxa. Beyond rice and the major Chinese millets, sticky mutant varieties have been selected in some populations of barley, sorghum, Job’s tears and maize (Sakamoto 1996; Fuller and Castillo 2016). The parallel mutations that produce the resulting characteristic texture

arise from recessive genes and affect starch synthesis, so-called waxy mutations. Such grains when cooked are extremely sticky, and taste sweeter as a result of the increased starch molecule branch endings interacting with salivary amylase, as well as having different cooking properties (see Wani et al. 2012; Bertoft 2017). From the point of view of early agricultural history, this is significant as these sticky forms are likely to have evolved later, after the Neolithic in China, and then dispersed into Southeast Asia. When this happened is still unknown, but they could have evolved in China by 1500 BC and started reaching Southeast Asia after that (Fuller and Castillo 2016). As with the displacement of japonica rices by subspecies indica (and circum-aus), sticky rices (which are mainly japonica) represent a significant agricultural transformation. Rice agriculture was not of a single form, but experienced several major waves of dispersal and processes of historical transformation.

4.6 Additional cereals, later or localized

Beyond the several strands of rice and the earliest Chinese millets, there are numerous other cereals that can be found in cultivation in parts of Southeast Asia, mostly with a patchy distribution. In broad terms we can divide these other cereals into three categories in terms of where they came from. First, there are those cereals that derive ultimately from the Mediterranean or temperate Eurasian world, but have proved useful in mountainous regions, having spread from higher elevation cultivation traditions around the Tibetan Plateau and through the Himalayas into the mountain ranges of southern China, northern Myanmar or adjacent countries: forms of wheat (mainly Triticum aestivum), barley (mainly Hordeum vulgare) and oats (Avena sativa, A. byzantina and A. chinensis, the latter plausibly natively domesticated in northwest China). Second, there are additional tropical millets of African origin, especially finger millet (Eleusine coracana) and sorghum (Sorghum bicolor). The spread of these two cereals from African origins through India is well reviewed (Fuller 2014; Blench 2016; Fuller and Stevens 2018), but their incorporation into agriculture in parts of China or Southeast Asia remains poorly understood. In Southeast Asia, a single early occurrence of finger millet is at Phu Khao Thong, dating to around the first century BC (Castillo et al. 2016). Written sources suggest that sorghum spread into the Sichuan basin, Southwest China, by the mid first millennium AD (Hagerty 1941), while it is recorded in Burmese texts by the 12th century (Bradley 2011). Third, there are a few endemic cereals that are minor or highly localised in some parts of Southeast Asia (Digitaria cruciata, Spodiopogon formosanus, Coix lacryma-jobi), and some Indian millets that have made minor inroads into Southeast Asia (Panicum sumatrense, Echinochloa frumentacea).
The origins of agriculture in the eastern Mediterranean and Middle East were based on wheat and barley, of which several forms were domesticated by 7500 BC (Zohary



et al. 2012; Fuller et al. 2018). But it was only a few of these wheat and barley forms that made it to the East. It was hexaploid bread wheat (Triticum aestivum sensu lato), which had emerged as a hybrid weed in the early Neolithic, that became the dominant wheat in India (from the third millennium BC) and the only wheat to spread through central Asia into China, also arriving as a very minor crop in the Yellow River basin before 2000 BC (Deng et al. 2020b). Similarly, the derived six-row barleys (which are more productive), including the naked form, which are widely grown in the Himalayas, on the Tibetan Plateau and into the mountains of northern Southeast Asia, especially in Yunnan and Myanmar, arrived from the west. Barley and wheat also appear in wordlists reconstructed for the Proto-Burmic languages (Bradley 2011); these terms appear to be borrowed from Indic terms, which fits with an eastwards dispersal from India, where wheat and barley were well established in the Ganges plains by 2000 BC (Pokharia et al. 2017). They were established crops in Bangladesh ca. 400–200 BC (Rahman et al. 2020). An earlier dispersal from northwest China into central Yunnan by ca. 1500 BC is indicated by the wheat and barley finds at Haimenkou, but these did not penetrate further south at that time (D’Alpoim Guedes and Butler 2014; cf. Liu et al. 2017). A locally domesticated cereal in Southeast Asia may be Job’s tears (Coix lacryma-jobi), a cereal of minor importance currently cultivated throughout the tropics and subtropics, including China, India, the Philippines, Myanmar, Thailand and Malaysia, and also on a small scale in Korea, Japan and Taiwan (Arora 1977; Simoons 1991; Jiang et al. 2008). Despite speculation that its cultivation in Southeast Asia might have preceded rice (Li 1970: 12; Simoons 1991: 81), this is not borne out by the available archaeobotanical evidence. Job’s tears is a staple food amongst some tribal groups in the hills of the Assam region (Arora 1977).
Bradley (2011) infers domestication in the Tibeto-Burman region, with diffusion into China during the Han Dynasty. As this species may occur as a weed around wet rice fields, its few archaeological finds remain ambiguous as evidence for its cultivation history. Recently, starch grains attributed to this species have been reported from many hunter-gatherer and Neolithic sites across northern China (Liu et al. 2019), but given their morphological similarity to maize starch – a universal modern starch contaminant (Crowther et al. 2014) – and the lack of any evidence for this species in verified archaeological seed/husk assemblages across northern China during the later Neolithic, its ancient presence as an early cultivar must be considered suspect.

4.7 Concluding remarks

It remains the case that we have a poor understanding of pre-cereal subsistence throughout Southeast Asia, and of whether or not there were major developments in terms of vegetative agriculture that could have had major impacts on demography and the dispersal of human populations. Vegecultural systems – based on tuber crops

(Colocasia esculenta, Dioscorea sp.), bananas (Musa spp.), and sago (Metroxylon and other species) – could have preceded cereal agriculture in many regions, but remain undocumented. Domestication processes of tuber crops are harder to document archaeologically than cereal cultivation systems (Denham et al. 2020). It therefore remains speculative whether the presence of these taxa in some regions, such as the Pearl River region in Guangdong, prior to the earliest rice finds (Yang et al. 2013; Denham et al. 2018) represents cultivation or forms of wild gathering. Forms of vegeculture and exploitation of vegetative food plants by hunter-gatherers form a spectrum of behavioural orientations (Barton and Denham 2018); the distinction is less clear-cut than the transition from gathering to cultivation in cereals. Nevertheless, the spread of cereals, especially rice, involved several dispersal episodes, several varieties, and alternative cultivation systems. It was this diversity of rice cultivation traditions that helped to support both high ethnolinguistic diversity in parts of the hilly tracts of Southeast Asia, and denser populations of lowland rice states.

References

Araki, Mie, Aya Numaoka, Makoto Kawase & Kenji Fukunaga. 2012. Origin of waxy common millet, Panicum miliaceum L. in Japan. Genetic Resources and Crop Evolution 59. 1303–1308. Arora, R. K. 1977. Job’s-tears (Coix lacryma-jobi) – A minor food and fodder crop of northeastern India. Economic Botany 31(3). 358–366. Barker, Randolph. 2011. The origin and spread of early-ripening champa rice: Its impact on Song Dynasty China. Rice 4(3). 184–186. Barton, Huw. 2012. The reversed fortunes of sago and rice, Oryza sativa, in the rainforests of Sarawak, Borneo. Quaternary International 249. 96–104. Barton, Huw & Tim P. Denham. 2018. Vegecultures and the social-biological transformations of plants and people. Quaternary International 489. 17–25. Bellwood, Peter. 2005. First farmers. The origins of agricultural societies. Oxford: Blackwell. Bertoft, Eric. 2017. Understanding starch structure: Recent progress. Agronomy 7(3). 56. Blench, Roger. 2016. Vernacular names for African millets and other minor cereals and their significance for agricultural history. Archaeological and Anthropological Sciences 8. 1–8. Bradley, David. 2011. Proto-Tibeto-Burman grain crops. Rice 4. 134–141. Castillo, Cristina Cobo. 2017. Development of cereal agriculture in prehistoric Mainland Southeast Asia. Man In India 95(4). 335–352. Castillo, Cristina Cobo, Berenice Bellina & Dorian Q. Fuller. 2016. Rice, beans and trade crops on the early maritime Silk Route in Southeast Asia. Antiquity 90(353). 1255–1269. Castillo, Cristina Cobo, Dorian Q Fuller, Phil J. Piper, Peter Bellwood & Marc Oxenham. 2018a. Hunter-gatherer specialization in the Late Neolithic of southern Vietnam – The case of Rach Nui. Quaternary International 489. 63–79. Castillo, Cristina C., Charles FW Higham, Katie Miller, Nigel Chang, Katerina Douka, Thomas FG Higham & Dorian Q. Fuller. 2018b. Social responses to climate change in Iron Age north-east Thailand: New archaeobotanical evidence. Antiquity 92(365). 1274–1291.



The origins and spread of cereal agriculture in Mainland Southeast Asia 


Castillo, Cristina Cobo, Martin Polkinghorne, Brice Vincent, Tan Boun Suy & Dorian Q. Fuller. 2018c. Life goes on: Archaeobotanical investigations of diet and ritual at Angkor Thom, Cambodia (14th–15th centuries CE). The Holocene 28(6). 930–944.
Choi, Jae Young, Adrian E. Platts, Dorian Q. Fuller, Rod A. Wing & Michael D. Purugganan. 2017. The rice paradox: Multiple origins but single domestication in Asian rice. Molecular Biology and Evolution 34. 969–979.
Civáň, Peter & Terence A. Brown. 2018. Role of genetic introgression during the evolution of cultivated rice (Oryza sativa L.). BMC Evolutionary Biology 18. 57.
Cohen, David J. 2011. The beginnings of agriculture in China: A multiregional view. Current Anthropology 52(S4). S273–S293.
Crowther, A., M. Haslam, N. Oakden, D. Walde & J. Mercader. 2014. Documenting contamination in ancient starch laboratories. Journal of Archaeological Science 49. 90–104.
Cucchi, Thomas, Lingling Dai, Marie Balasse, Chunqing Zhao, Jiangtao Gao, Yaowu Hu, Jing Yuan & Jean-Denis Vigne. 2016. Social complexification and pig (Sus scrofa) husbandry in ancient China: A combined geometric morphometric and isotopic approach. PLoS One 11(7). e0158523.
Dal Martello, Rita, Rui Min, Chris Stevens, Charles Higham, Thomas Higham, Ling Qin & Dorian Q. Fuller. 2018. Early agriculture at the crossroads of China and Southeast Asia: Archaeobotanical evidence and radiocarbon dates from Baiyangcun, Yunnan. Journal of Archaeological Science: Reports 20. 711–721.
D’Alpoim Guedes, J. & E. E. Butler. 2014. Modelling constraints on the spread of agriculture to Southwest China with thermal niche models. Quaternary International 349. 29–41.
D’Alpoim Guedes, Jade, Sydney Hanson, Thanik Lertcharnrit, Andrew D. Weiss, Vincent C. Pigott, Charles FW Higham, Thomas FG Higham & Steven A. Weber. 2020. Three thousand years of farming strategies in central Thailand. Antiquity 94(376). 966–982.
Fuller, Dorian Q., Nicole Boivin, Tom Hoogervorst & Robin Allaby. 2011. Across the Indian Ocean: The prehistoric movement of plants and animals. Antiquity 85(328). 544–558.
De Wet, J. M. J., K. E. Prasada Rao & D. E. Brink. 1983a. Systematics and domestication of Panicum sumatrense (Graminae). Journal d’agriculture traditionnelle et de botanique appliquée 30(2). 159–168.
De Wet, J. M. J., K. E. Prasada Rao, M. H. Mengesha & D. E. Brink. 1983b. Domestication of sawa millet (Echinochloa colona). Economic Botany 37(3). 283–291.
Deng, Zhenhua, Ling Qin, Yu Gao, Alison Ruth Weisskopf, Chi Zhang & Dorian Q. Fuller. 2015. From early domesticated rice of the middle Yangtze Basin to millet, rice and wheat agriculture: Archaeobotanical macro-remains from Baligang, Nanyang Basin, Central China (6700–500 BC). PLoS One 10(10). e0139885.
Deng, Zhenhua, Hsiao-chun Hung, Xuechun Fan, Yunming Huang & Houyuan Lu. 2018a. The ancient dispersal of millets in southern China: New archaeological evidence. The Holocene 28(1). 34–43.
Deng, Zhenhua, Hsiao-chun Hung, Mike T. Carson, Peter Bellwood, Shu-ling Yang & Houyuan Lu. 2018b. The first discovery of Neolithic rice remains in eastern Taiwan: phytolith evidence from the Chaolaiqiao site. Archaeological and Anthropological Sciences 10(6). 1477–1484.
Deng, Zhenhua, Hsiao-chun Hung, Mike T. Carson, Adhi Agus Oktaviana, Budianto Hakim & Truman Simanjuntak. 2020a. Validating earliest rice farming in the Indonesian Archipelago. Scientific Reports 10(1). 1–9.
Deng, Zhenhua, Dorian Q. Fuller, Xiaolong Chu, Yanpeng Cao, Yuchao Jiang, Lizhi Wang & Houyuan Lu. 2020b. Assessing the occurrence and status of wheat in late Neolithic central China: the importance of direct AMS radiocarbon dates from Xiazhai. Vegetation History and Archaeobotany 29(1). 61–73.


 Dorian Q. Fuller and Cristina Cobo Castillo

Denham, Tim, Yekun Zhang & Aleese Barron. 2018. Is there a centre of early agriculture and plant domestication in southern China? Antiquity 92. 1165–1179.
Denham, Tim, Huw Barton, Cristina Castillo, Alison Crowther, Emilie Dotte-Sarout, S. Anna Florin, Jenifer Pritchard, Aleese Barron, Yekun Zhang & Dorian Q. Fuller. 2020. The domestication syndrome in vegetatively propagated field crops. Annals of Botany 125(4). 581–597.
Dong, Ningning & Jing Yuan. 2020. Rethinking pig domestication in China: Regional trajectories in central China and the Lower Yangtze Valley. Antiquity 94. 864–879.
Fuller, Dorian Q. 2006. Agricultural origins and frontiers in South Asia: A working synthesis. Journal of World Prehistory 20. 1–86.
Fuller, Dorian Q. 2014. Finger millet: Origins and development. In Claire Smith (ed.), Encyclopedia of global archaeology, 2783–2785. New York: Springer.
Fuller, Dorian Q. & Cristina Castillo. 2016. Diversification and cultural construction of a crop: The case of glutinous rice and waxy cereals in the food cultures of eastern Asia. In J. Lee-Thorp & M. A. Katzenberg (eds.), The Oxford handbook of the archaeology of diet. Oxford: Oxford University Press. DOI: 10.1093/oxfordhb/9780199694013.013.8.
Fuller, Dorian Q. & Cristina Castillo. In press. Cereals of Southeast Asia. In N. C. Kim & C. Higham (eds.), The Oxford handbook of Southeast Asian archaeology. Oxford: Oxford University Press.
Fuller, Dorian Q. & Leilani Lucas. 2017. Adapting crops, landscapes and food choices: Patterns in the dispersal of domesticated plants across Eurasia. In M. Petraglia, N. Boivin & R. Crassard (eds.), Human dispersal and species movement: From prehistory to the present, 304–331. Cambridge: Cambridge University Press.
Fuller, Dorian Q. & Ling Qin. 2009. Water management and labour in the origins and dispersal of Asian rice. World Archaeology 41. 88–111.
Fuller, Dorian Q. & Chris J. Stevens. 2018. Sorghum domestication and diversification: A current archaeobotanical perspective. In A. Mercuri, A. D’Andrea, R. Fornaciari & A. Höhn (eds.), Plants and people in the African past, 427–452. Cham: Springer.
Fuller, Dorian Q., Ling Qin, Yunfei Zheng, Zhijun Zhao, Xugao Chen, Leo Aoi Hosoya & Guo-Ping Sun. 2009. The domestication process and domestication rate in rice: Spikelet bases from the Lower Yangtze. Science 323. 1607–1610.
Fuller, Dorian Q., Yo-Ichiro Sato, Cristina Castillo, Ling Qin, Alison R. Weisskopf, Eleanor J. Kingwell-Banham, Jixiang Song, Sung-Mo Ahn & Jacob Van Etten. 2010. Consilience of genetics and archaeobotany in the entangled history of rice. Archaeological and Anthropological Sciences 2(2). 115–131.
Fuller, Dorian Q., Tim Denham, Manuel Arroyo-Kalin, Leilani Lucas, Chris J. Stevens, Ling Qin, Robin G. Allaby & Michael D. Purugganan. 2014. Convergent evolution and parallelism in plant domestication revealed by an expanding archaeological record. Proceedings of the National Academy of Sciences 111. 6147–6152.
Fuller, Dorian Q., Alison R. Weisskopf & Cristina Castillo. 2016. Pathways of rice diversification across Asia. Archaeology International 19. 84–96.
Fuller, Dorian Q., Leilani Lucas, Lara Gonzalez Carretero & Chris Stevens. 2018. From intermediate economies to agriculture: Trends in wild food use, domestication and cultivation among early villages in Southwest Asia. Paléorient 44. 61–76.
Fuller, Dorian Q., Louis Champion & Chris Stevens. 2019. Comparing the tempo of cereal dispersal and the agricultural transition: Two African and one West Asian trajectory. In Barbara Eichhorn & Alexa Höhn (eds.), Trees, grasses and crops – People and plants in sub-Saharan Africa and beyond (Frankfurter Archäologischen Schriften 37), 119–140. Bonn: Verlag Dr. Rudolf Habelt.
Gao, Yu, Guanghui Dong, Xiaoyan Yang & Fahu Chen. 2020. A review on the spread of prehistoric agriculture from southern China to mainland Southeast Asia. Science China Earth Sciences 63. 615–625.




Gutaker, Rafal M., Simon C. Groen, Emily S. Bellis, Jae Y. Choi, Inês S. Pires, R. Kyle Bocinsky, Emma R. Slayton, O. Wilkins, C. C. Castillo, S. Negrao, M. M. Oliveira, D. Q. Fuller, J. A. d’Alpoim Guedes, J. R. Lasky & M. D. Purugganan. 2020. Genomic history and ecology of the geographic spread of rice. Nature Plants 6. 492–502.
Hachiken, Takehiro, Kei Sato, Takahiro Hasegawa, Katsuyuki Ichitani, Makoto Kawase & Kenji Fukunaga. 2013. Geographic distribution of Waxy gene SNPs and indels in foxtail millet, Setaria italica (L.) P. Beauv. Genetic Resources and Crop Evolution 60. 1559–1570.
Hagerty, Michael J. 1941. Comments on writings concerning Chinese sorghums. Harvard Journal of Asiatic Studies 5. 234–260.
Higham, Charles. 2003. Languages and farming dispersals: Austroasiatic languages and rice cultivation. In Peter Bellwood & Colin Renfrew (eds.), Examining the farming/language dispersal hypothesis, 223–232. Cambridge: McDonald Institute for Archaeological Research.
Ishikawa, Ryo, Cristina C. Castillo & Dorian Q. Fuller. 2020. Genetic evaluation of domestication-related traits in rice: Implications for the archaeobotany of rice origins. Archaeological and Anthropological Sciences 12(8). 1–14.
Jiang, Hong-En, Bo Wang, Xiao Li, En-Guo Lü & Cheng-Sen Li. 2008. A consideration of the involucre remains of Coix lacryma-jobi L. (Poaceae) in the Sampula Cemetery (2000 years BP), Xinjiang, China. Journal of Archaeological Science 35(5). 1311–1316.
Kingwell-Banham, Eleanor. 2019. Dry, rainfed or irrigated? Reevaluating the role and development of rice agriculture in Iron Age-Early Historic South India using archaeobotanical approaches. Archaeological and Anthropological Sciences 11(12). 6485–6500.
Larson, Greger, D. R. Piperno, R. G. Allaby, M. D. Purugganan, L. Andersson, M. Arroyo-Kalin, L. Barton, C. Climer Vigueira, T. Denham, K. Dobney, A. N. Doust, P. Gepts, M. T. P. Gilbert, K. J. Gremillion, L. Lucas, L. Lukens, F. B. Marshall, K. M. Olsen, J. C. Pires, P. J. Richerson, R. Rubio de Casas, O. I. Sanjur, M. G. Thomas & Dorian Q. Fuller. 2014. Current perspectives and the future of domestication studies. Proceedings of the National Academy of Sciences 111. 6139–6146.
Li, H. 1970. The origin of cultivated plants in Southeast Asia. Economic Botany 24. 3–19.
Li, Ruo, Feiya Lv, Liu Yang, Fengwen Liu, Ruiliang Liu & Guanghui Dong. 2020. Spatial–temporal variation of cropping patterns in relation to climate change in Neolithic China. Atmosphere 11(7). 677.
Liu, Bin, Ling Qin & Yijie Zhuang. 2020. Liangzhu culture. Society, belief, and art in Neolithic China. London: Routledge.
Liu, L., N. A. Duncan, X. Chen & J. Cui. 2019. Exploitation of job’s tears in Paleolithic and Neolithic China: Methodological problems and solutions. Quaternary International 529. 25–37.
Liu, X., D. L. Lister, Z. Zhao, C. A. Petrie, X. Zeng, P. J. Jones, R. A. Staff, A. K. Pokharia, J. Bates, R. N. Singh & S. A. Weber. 2017. Journey to the east: Diverse routes and variable flowering times for wheat and barley en route to prehistoric China. PLoS One 12(11). e0187405.
Makibayashi, Keisuke. 2014. The transformation of farming cultural landscapes in the Neolithic Yangtze area, China. Journal of World Prehistory 27. 295–307.
McNally, Kenneth L., Kevin L. Childs, Regina Bohnert, Rebecca M. Davidson, Keyan Zhao, Victor J. Ulat, Georg Zeller et al. 2009. Genomewide SNP variation reveals relationships among landraces and modern varieties of rice. Proceedings of the National Academy of Sciences 106. 12273–12278.
Nasu, Hiroo, Hai-Bin Gu, Arata Momohara & Yoshinori Yasuda. 2012. Land-use change for rice and foxtail millet cultivation in the Chengtoushan site, central China, reconstructed from weed seed assemblages. Archaeological and Anthropological Sciences 4. 1–14.


Pokharia, A. K., S. Sharma, D. Tripathi, N. Mishra, J. N. Pal, R. Vinay & A. Srivastava. 2017. Neolithic – Early historic (2500–200 BC) plant use: The archaeobotany of Ganga Plain, India. Quaternary International 443. 223–237.
Purugganan, Michael D. & Dorian Q. Fuller. 2009. The nature of selection during plant domestication. Nature 457. 843–848.
Qin, Ling & Dorian Q. Fuller. 2019. Why rice farmers don’t sail: Coastal subsistence traditions & maritime trends in Early China. In C. Wu & B. Rolett (eds.), Prehistoric maritime cultures and seafaring in East Asia. The archaeology of Asia-Pacific navigation, vol 1, 159–191. Singapore: Springer.
Rahman, Mizanur, Cristina Cobo Castillo, Charlene Murphy, Sufi Mostafizur Rahman & Dorian Q. Fuller. 2020. Agricultural systems in Bangladesh: The first archaeobotanical results from Early Historic Wari-Bateshwar and Early Medieval Vikrampura. Archaeological and Anthropological Sciences 12(1). 37.
Ren, Xiaolin, Ximena Lemoine, Duowen Mo, Tristram R. Kidder, Yuanyuan Guo, Zhen Qin & Xinyi Liu. 2016. Foothills and intermountain basins: Does China’s fertile arc have “hilly flanks”? Quaternary International 426. 86–96.
Sagart, Laurent, Tze-Fu Hsu, Yuan-Ching Tsai & Yue-Ie C. Hsing. 2017. Austronesian and Chinese words for the millets. Language Dynamics and Change 7(2). 187–209.
Sakamoto, S. 1996. Glutinous-endosperm starch food culture specific to Eastern and Southeastern Asia. In R. Ellen & F. Katsuyoshi (eds.), Redefining nature: Ecology, culture and domestication, 215–231. London: Berg.
Scott, James. 2009. The art of not being governed: An anarchist history of upland Southeast Asia. New Haven: Yale University Press.
Silva, Fabio, Chris J. Stevens, Alison Weisskopf, Cristina Castillo, Ling Qin, Andrew Bevan & Dorian Q. Fuller. 2015. Modelling the geographical origin of rice cultivation in Asia using the rice archaeological database. PLoS One 10. e0137024.
Simoons, Frederick J. 1991. Food in China: A cultural and historical inquiry. Boca Raton: CRC Press.
Singh, H. B. & R. K. Arora. 1972. Raishan (Digitaria sp.) – A minor millet of the Khasi Hills, India. Economic Botany 26. 376–380.
Stevens, Chris J. & Dorian Q. Fuller. 2017. The spread of agriculture in Eastern Asia: Archaeological bases for hypothetical farmer/language dispersals. Language Dynamics and Change 7. 152–186.
Thompson, Gill B. 1996. The excavations of Khok Phanom Di, a prehistoric site in Central Thailand. Volume IV. Subsistence and environment: the botanical evidence. The biological remains part III. London: The Society of Antiquaries of London.
Tsang, Chen-Hwa, Kuang-Ti Li, Tze-Fu Hsu, Yuan-Ching Tsai, Po-Hsuan Fang & Yue-Ie Caroline Hsing. 2017. Broomcorn and foxtail millet were cultivated in Taiwan about 5000 years ago. Botanical Studies 58(1). 3.
Vidal, Jules. 1962. Noms vernaculaires de Plantes en usage au Laos, 2nd edn. Paris: École Française d’Extrême-Orient.
Vitte, C., T. Ishii, F. Lamy, D. Brar & O. Panaud. 2004. Genomic paleontology provides evidence for two distinct origins of Asian rice (Oryza sativa L.). Molecular Genetics and Genomics 272. 504–511.
Wani, Ali Abas, Preeti Singh, Manzoor Ahmad Shah, Ute Schweiggert‐Weisz, Khalid Gul & Idrees Ahmed Wani. 2012. Rice starch diversity: Effects on structural, morphological, thermal, and physicochemical properties – A review. Comprehensive Reviews in Food Science and Food Safety 11(5). 417–436.
Weber, Steve, Heather Lehman, Timothy Barela, Sean Hawks & David Harriman. 2010. Rice or millets: Early farming strategies in prehistoric central Thailand. Archaeological and Anthropological Sciences 2. 79–88.




Weisskopf, Alison, Emma Harvey, Eleanor Kingwell-Banham, Mukund Kajale, Rabi Mohanty & Dorian Q. Fuller. 2014. Archaeobotanical implications of phytolith assemblages from cultivated rice systems, wild rice stands and macro-regional patterns. Journal of Archaeological Science 51. 43–53.
Weisskopf, Alison, Ling Qin, Jinglong Ding, Pin Ding, Guoping Sun & Dorian Q. Fuller. 2015. Phytoliths and rice: From wet to dry and back again in the Neolithic Lower Yangtze. Antiquity 89(347). 1051–1063.
Yang, Xiaoyan, Huw J. Barton, Zhiwei Wan, Quan Li, Zhikun Ma, Mingqi Li, Dan Zhang & Jun Wei. 2013. Sago-type palms were an important plant food prior to rice in southern subtropical China. PLoS One 8(5). e63148.
Zhao, Zhijun. 2011. New archaeobotanic data for the study of the origins of agriculture in China. Current Anthropology 52. S295–S306.
Zhuang, Yijie, Pin Ding & Charles French. 2014. Water management and agricultural intensification of rice farming at the late-Neolithic site of Maoshan, Lower Yangtze River, China. The Holocene 24(5). 531–545.
Zide, A. R. K. & N. H. Zide. 1976. Proto-Munda cultural vocabulary: evidence for early agriculture. In P. N. Jenner, L. C. Thompson & S. Starosta (eds.), Austroasiatic studies, Part II, 1295–1334. Honolulu: University of Hawaii Press.
Zohary, Daniel, Maria Hopf & Ehud Weiss. 2012. Domestication of plants in the Old World, 4th edn. Oxford: Oxford University Press.

Paul Sidwell

5 History of MSEA Austroasiatic studies

5.1 Introduction

This chapter offers an overview of the history of Austroasiatic (AA) linguistics in the MSEA region, focusing particularly on description, comparative reconstruction, and classification, and on how the field has been conducted and communicated. AA languages that fall geographically outside of MSEA are only mentioned when the wider context is relevant. Although the AA family is quite large, with at least 170 languages spoken over nine countries, as a field of study it is quite modest, attracting an order of magnitude less attention than, say, Austronesian linguistics. Nonetheless, it is today an active ongoing field of study.

The first century of AA studies, from the mid-19th to mid-20th century, spans the discovery of the family and many important aspects of its history, yet lacks coherence; there were no dedicated conferences, journals, or other mechanisms to foster cooperation and exchange, although there were important data aggregations and analyses that made a lasting impact. From the 1950s onward the situation changed rapidly as important conceptual and theoretical breakthroughs, combined with a boom in fieldwork – especially in Indo-China – transformed the field; by the early 1970s there were dedicated meetings, publications, and an emerging coherence of approach and programmatic priorities. From the 1960s to the time of writing the field has transformed as a substantial amount of descriptive work has been done, extensive online resources have become available, and there has been strong progress in comparative work. A substantial handbook of AA languages appeared in 2014 (Jenny and Sidwell eds.) and the International Conference on Austroasiatic Linguistics (ICAAL) has become a regular biennial event.

https://doi.org/10.1515/9783110558142-005

5.2 Historiographs of Austroasiatic studies

Thomas (1964) contributes to the first issue of Mon-Khmer Studies with a 15-page survey of comparative-historical and classification studies. This was a milestone that set the tone for much work over the following two decades, especially regarding SIL (Summer Institute of Linguistics) efforts in Indo-China. Thomas bemoaned the lack of published data until then and the limits this placed on progress, while expressing confidence that increasing data availability would facilitate much progress. The main programmatic point that emerges from Thomas is the question of regularity of phonological correspondences among AA languages, especially between vowels, and the problems this poses for reconstruction (harking back to Schmidt 1905). Thomas advocated that a strongly phonemic and bottom-up approach should be taken to the problem, and this advice was widely received, although on reflection we can say that he was overly optimistic in this respect.

In the 1990s Diffloth (ms)¹ discussed the discovery of the AA family, including a detailed critique of Schmidt’s contribution to the field at the beginning of the 20th century. The difficulties for historical analyses presented by Munda and Nicobarese due to their typological peculiarities are briefly discussed, and Schmidt is given appropriate credit for showing systematically how these morphologically complex groups can be related to the more isolating MSEA AA languages. Diffloth also discusses the problems that “expressive” lexicon poses for systematizing correspondences. AA languages are rich in sound-symbolic and iconic forms that often seem to violate principles of regularity in sound change, and this has been a fundamental problem for comparative reconstruction.

The turn of the millennium was marked by a substantial contribution to AA historiography: van Driem (2001: 262–312) gives us a 50-page review that ranges broadly over the history of the field, and is particularly sympathetic to the various AA speaker communities, especially those in India (i.e. within his conceptual Himalayan region). The discussion is wide ranging, including consideration of Austric and various proposals for regional substrates in SE Asia, with commentary on regional first contacts and early linguistic speculations. Van Driem is particularly interested in the possibility of early AA presence in India and hypothetical contact with Indo-Aryans and Dravidians, with reference to the work of scholars such as Lévi (1923), Kuiper (1948 and elsewhere), Shafer (1952), Emeneau (1954), and Witzel (1999a, 1999b). Various language classification proposals are showcased, but not critically engaged with.

Nagaraja (2010) offers a 30-page summary of the state of the art of AA studies; the work can be seen as a follow-on from his (Nagaraja 1989) annotated bibliography of AA studies.
The chapter lays out various published views on the classification of AA languages, as well as demographic data on AA speaking communities in India, and the ongoing need for descriptive materials, notwithstanding Anderson’s (2008) The Munda Languages handbook. In relation to the origins of AA, Nagaraja notes, “[t]he date and place of the origin of Austroasiatic is still unknown. The place was probably southern or southeastern China; the date was at least circa 2–2500 BC and possibly much earlier” (Nagaraja 2010: 1).

Sidwell (2010) provides a 67-page history of comparative AA reconstruction, with commentary on all known published works up to 2006, plus the short historiography of AA comparative reconstruction offered by Sidwell (2006). The narrative is constructively critical, and expresses concern that personal factors have limited cooperation and resource utilization to the detriment of the field over the decades, while the new era of electronic resource sharing and management and computer assisted methods in reconstruction is seen as crucial to future progress.

Sidwell (2009a) is a monograph-length treatment of the history of classification of AA languages. The text recapitulates much of the history of AA studies mentioned above. Included is a history of the application of lexicostatistics in AA studies, which has enjoyed strong support over decades, compared to other approaches. A repeated theme is a broad lack of transparency in works that discuss classification of AA languages, such as failure to reproduce source data or to discuss the theoretical underpinnings of methods used. The work was followed by considerable activity focused on AA classification, and more up-to-date results for the whole family are reported in Sidwell (2014a).

1 This work is a series of lectures written in 1996 and 1997, giving typological and historical summaries for each of the non-Munda AA language branches. It was intended as a monograph but was not published, although a draft was widely circulated, so it is treated as an available source here.

5.3 The early years of Austroasiatic studies

The story of AA studies begins with Logan’s (1850) observation of lexical similarities between Vietnamese (“Anamese”)² and the Aslian language Mah Meri (“Besisi”), and Mason’s (1854) list of lexical correspondences between Mon (“Talaeng”) and Ho (“Kole”) of India. The situation was advanced with Logan (1859) presenting more systematic data, defending an explicit claim for a Kol-Anam family that is recognizably AA, spanning from India to Vietnam. However, comparative linguistics was a divided and evolving field in the second half of the 19th century, and scholars differed strongly in their explanations of why languages shared similarities, with differing models of diffusion versus inheritance and the nature of human migrations. Of particular influence, Müller (1862) characterized Asian races as essentially nomadic and had no particular expectation that languages would form clear genealogical groups. Yet over time language families and sub-groups became recognized by scholars, especially as data started coming in from French Indo-China (works such as Azémar’s [1886] Stieng dictionary and Dourisboure’s [1889] Bahnar dictionary). In this context, Cust (1878) supported the Mon-Annam family (later called Mon-Khmer), and recognized four subgroups: Mon, Kambojan, Annamite, and Paloung. Yet the primacy of diffusion in Asia was well established, with Keane (1880) seeing only language mixing and coincidence when comparing Asian languages, firmly rejecting Mon-Annam, and recognizing only two vast Indo-Chinese and Indo-Pacific language types. Otto Blagden, who would be highly influential in Aslian, Monic, Malay and Burmese studies well into the 20th century, had from the 1890s begun offering a grand diffusion hypothesis imagining multiple waves of migration to the Malay peninsula from a vague Mon-Annam ancestral stock. By the end of the 19th century there was no broad scholarly consensus on the proper classification of languages into families, or even on why languages shared recognizable lexical and structural correspondences.

2 Logan consistently spelled Anam and Anamese with a single n, although it is more usually spelled with double nn, reflecting Vietnamese An Nam.

5.4 The first half of the 20th century

The first decade of the 20th century saw a revolution in AA studies as scholars were able to aggregate numerous sources and important analytical progress was made. The initial impetus was the data from Grierson’s Linguistic Survey of India (LSI), the Aslian lexical compilation of Blagden (1906), the Indo-China compilation of Cabaton (1906), and other lexicons emerging from the Far East. Such resources were utilized for Schmidt’s (1901, 1903, 1904, 1905) analyses of AA sub-groups: Aslian, Khasi-Palaungic, and Khmer-Mon-Stieng-Bahnar. His works delivered important insights into phonology and morphology, and established sets of lexical comparisons that would underlie subsequent comparative work throughout the 20th century, only being superseded in print by Shorto (2006).

Otto Blagden spent many years assembling wordlists and making comparisons with other AA languages, effectively building a nascent AA etymological dictionary, first presented as the second volume of Skeat and Blagden (1906). The influence of this compilation on 20th century AA linguistics is profound; the data, reorganized to put Mon as the head entries, provided much of the foundation for Shorto’s (1971) dictionary of Mon inscriptions³ and (2006) Mon-Khmer Comparative Dictionary. And Blagden, although strongly diffusionist in his orientation, explicitly recognized Austroasiatic in its modern form, writing, “it is certain that a considerable common element runs through Munda, Khasi, and Nicobarese, and this common element is identical with the main constituents of the Mon-Annam family” (Blagden 1906: 444).

The era of large AA data aggregations and new broad insights was short-lived; the sweep of available sources had been thorough and the field needed time to digest the results. Blagden focused more on Malay and Burmese linguistics, while Schmidt examined other language families and devoted his theoretical considerations to theological questions. Some French scholars took exception to the idea that Vietnamese belongs within AA; in particular, Maspero (1912 and elsewhere) articulated the idea that it is underlyingly a Tai language. This was challenged by Przyluski (1924), but not soon resolved. Hostility to the AA hypothesis emerged elsewhere: von Hevesy (1928, 1930, 1932, 1934) ridiculed the relation of Munda to Mon-Khmer, and Sebeok (1942) denied that Munda, Aslian or Vietic could relate to Mon-Khmer in a single family. The field was split more or less neatly along lines corresponding to the political-imperial alliances of the time, with French scholars somehow reluctant to acknowledge that languages within French Indo-China could be closely related to those spoken within the British Empire, and that skepticism extended also to the USA.

3 Shorto made use of a substantial index card collection left to him by Blagden, compiled over several decades of epigraphic research on languages of Burma. Regrettably most of this compilation was lost after Shorto’s passing in 1995.

5.5 1950s to the present

After World War II, Cold War rivalries saw renewed interest in and access to parts of the MSEA region, while revolution in China and North Vietnam blocked access to those countries and their sensitive border areas. Over the following four decades extensive linguistic work would be done, an enormous amount of new data would become available, and many fundamental problems would be investigated and solved by linguists. The progress set up AA studies for a golden age, which arguably began with Haudricourt (1952, 1953, 1954) placing Vietnamese solidly within AA, demonstrating the emergence of tones from AA phonological features. By the early 1970s the clear division of AA into at least a dozen branches was also demonstrated, and a number of researchers were applying comparative reconstruction to sub-groups and branches. In the late 1960s Harry Shorto compiled his first draft Mon-Khmer Comparative Dictionary (MKCD) and presented part of his results at the 1973 International Conference on Austroasiatic Linguistics (ICAAL) (Shorto 1976); the full MKCD was published posthumously in 2006, edited by P. Sidwell, D. Cooper, and C. Bauer.

The 1973 ICAAL meeting took stock of progress, and theoretical advances were presented in the form of proposals that began to pivot phonological analyses away from straightforward segmental and phonemic approaches to more sophisticated suprasegmental and syllable-based models. Much of the new thinking was grounded in the recognition of phonological patterns across many languages, facilitated by increasingly rich comparative studies. An example is Huffman’s 1973 ICAAL presentation on the “Register Problem” (Huffman 1976), which analysed the origins of tones and phonation types, and explored the patterns of vowel shifts that correlate with phonation settings. Such work revealed how complex vowel correspondences arise from features shifting within syllables, removing the need to project complex vocalism into the past to explain modern patterns.⁴

Another methodological innovation of great impact for AA studies was lexicostatistics. While the method was already strongly criticized (e.g. since Bergsland and Vogt 1962), it enjoyed wide application in practice, and in MSEA much effort went into the collection of standardized wordlists allowing for comparability of data. This helped to inform studies that went beyond simple exercises in classification, and resources began to circulate such as the highly influential Huffman (1971) vocabulary list of some 900 items for 20 AA languages. Various scholars and institutions developed standard elicitation lists for field studies beyond the Swadesh 100 and 200 lists, and these remain important and in use today (see Mann 2004).

While many technical papers, including grammar studies on AA languages, were published through the 1960s to 1990s, dissertation- and monograph-length grammars were rather rare, and we must look to journal articles and book chapters for the most important works. It was a time of competing theoretical approaches and methods, especially variants of Generative Grammar and Tagmemics, and much of the work that was produced is difficult to interpret in more contemporary typological terms. Descriptive efforts in recent decades have overcome some of these difficulties and there are now many grammars of AA languages available. Below I briefly review work on each MSEA AA branch, including references to key descriptive and other works, followed by a summary discussion of journals, conferences, and reference materials.

4 It took decades for this insight to filter through the community of concerned scholars. For example, Peiros (1996) reconstructed 44 distinct Proto-Katuic nuclei.

5.6 Branch by branch survey

5.6.1 Aslian

The speakers of Aslian languages of the Malay Peninsula are small in number, and some northern communities in particular remain semi-nomadic. Through the colonial period researchers collected wordlists of mixed quality and wrote brief grammar notes; these were the bases of early data aggregations and studies into classification and history (e.g. Schmidt 1901, 1903; Skeat and Blagden 1906). The ethnographer Schebesta importantly documented Aslian languages and cultures over several decades into the 1950s (1926, 1926–1928, 1931, 1952–1957, etc.). The situation changed as the “Malaysian Emergency” saw many Aslians move into settlements, some proximal to the coast. The people became more accustomed to interactions with officials and outsiders, and linguists could gain ready access to speakers. Benjamin (1976a: 127) describes how he collected wordlists for his lexicostatistical study by visiting the Orang Asli Department’s hospital at Ulu Gombak, Kuala Lumpur. Work in the 1960s and 1970s transformed our understanding of the group, with descriptive publications by both local and international scholars (e.g. Carey 1961; Asmah 1963, 2014; Dentan n.d.; Benjamin 1976b; Diffloth 1976), as well as phonological and comparative studies (e.g. Diffloth 1968, 1975, 1977; Benjamin 1976a). Presentations and discussion at the 1973 ICAAL meeting saw recognition of a single AA branch and the adoption of the term Aslian at the suggestion of Benjamin. Benjamin (1976a) also applied lexicostatistics to work out the internal classification and offer hypotheses about migrations and contact.



History of MSEA Austroasiatic studies 


Researchers gave special attention to northern Aslian languages spoken in Thailand from the 1980s onward, yielding works on phonology, lexicon, and grammar (e.g. Phaiboon 1984; Bishop and Peterson 1994a; Bishop 1996; Peterson 1997). A renewed focus on Aslian languages of Malaysia emerged from the turn of the millennium, with grammars and advances in comparative studies. Notable grammars and sketches include Kruspe (2004, 2014), Burenhult (2005), and Kruspe et al. (2014). Lexicons, dictionaries, and comparative lexicons have also appeared since the 1980s, such as Means and Means (1987), Means (1999), Burenhult and Wegener (2009), and Kruspe (2010), although this aspect of Aslian studies remains frustratingly underdeveloped. A cluster of scholars based at Lund University (Sweden) continues to work on Aslian languages, with a particular focus on cognitive linguistics, as the semi-nomadic lifestyle of some Aslian groups creates rich opportunities in these areas. Comparative historical work also saw progress in the 21st century: Philips (2012) produced a proto-Aslian phonology and lexicon, although it is based on fewer than 300 etymologies. Dunn et al. (2011), applying computational methods, confirmed and refined the classification offered by Benjamin (1976) based on glottochronology. Aslian linguistic bibliographies have been produced by Bishop and Peterson (1994b), and Lye (2001). Additionally, Matisoff (2003) offers a 58-page typological profile of Aslian, which includes extensive historiographic detail up to the 1980s, plus extensive bibliographic references.

5.6.2 Bahnaric

The Bahnaric branch falls within the territory of French Indo-China, so the earliest works emerge from French scholarship in the latter 1800s, and the peak of this tradition is reflected in work such as the Köho dictionaries of Dournes (1950) and Drouin and K’Naǐ (1962), and the Bahnar dictionary of Guilleminet and Alberty (1959–1961) (the latter documents seven lects and remains a rich resource to this day). An extensive listing of linguistic resources is provided by Cheeseman et al.’s (2013) annotated bibliography. Through the 1950s and 1960s French research activity in Indo-China was augmented by Americans, and SIL researchers undertook extensive fieldwork in the Central Highlands. Publications began to appear in the early 1960s; for example, half the pages of the first issue of Mon-Khmer Studies were devoted to Bahnar. Over time many lexicons, primers, and papers dealing with grammar topics and short grammar sketches were produced, although monograph length grammars were uncommon and produced in frameworks (such as Tagmemics) that are difficult to interpret today. Notable examples are: Gregerson (1971), Thomas (1971), Miller (1976), Smith (1979). South Bahnaric languages received more attention over the decades than North or West Bahnaric, the latter only getting detailed treatments in more recent decades, e.g. Jacq (2001), Luang-Thongkum (2001). Further grammars and sketches have appeared
more recently (e.g. Bon 2014; Olsen 2014a, 2014b; Butler 2014; Smith and Sidwell 2014) yet much ground remains to be covered. Bahnaric has also been the subject of much effort in comparative reconstruction: Blood (1966), Thomas and Smith (1967), Smith (1972), Efimov (1990), Sidwell (1999, 2000), Jacq and Sidwell (2000), Luang-Thongkum (2001), Sidwell and Jacq (2003), Sidwell (2011). Within Vietnam home-grown researchers had been working on various Bahnaric languages. Their output includes some substantial printed dictionaries (e.g. Lê Đông and Tạ Văn Thông 2008; Nguyễn Văn Thanh and Bùi Ðăng Bình 2009) and grammars (e.g. Hoàng Tuệ et al. 1986; Lý Toàn Thắng et al. 1985; Nguyễn Văn Thanh and Bùi Ðăng Bình 2011). Nonetheless, much field data collected over the past half century remains unpublished or difficult to access, although the situation is improving with increasing online publication and data sharing.

5.6.3 Katuic

The earliest sources of data on Katuic languages are short survey wordlists from European expeditioners of the 1860s and 1870s, e.g. Harmond (1878–1879), Bastian (1868), Garnier (1873), although important Katuic languages such as Katu, Pacoh, Ta-oih were not seriously investigated until the 1960s (Tă-hoi [Ta-oih] and Kontu [Katu] are recognizable in the listing of Przyluski [1924] but are lacking associated data). Katuic gained more attention in the 1960s as French, Vietnamese, and SIL scholars increased activity, and four papers in the first (1964) issue of Mon-Khmer Studies concern Brôu (Bru) and Pacốh (Pacoh). Lexicons, dictionaries, primers, and comparative reconstructions rapidly emerged. Decades of comparative work have yielded a significant legacy: Thomas’ (1967) proto-East-Katuic, Diffloth’s (1982) proto-Katuic phonology, Efimov (1983) extends Thomas’ etymologies, Gainey (1985) compares Kui, Bru and So, Shorto (n.d., ms.) proposes 794 Katuic etymologies and reconstructions, Peiros (1996) revises and extends Efimov, Theraphan (2001) documents six lects and reconstructs 1,406 proto-Katuic forms, and Sidwell (2005) reconstructs 1,395 proto-Katuic forms, although none of those reconstructions adequately deals with the problem of creaky registers in Katuic languages. More recently, Gehrmann (2015, 2016, 2019) develops ideas incorporating connections between vowel height, phonation, and syllable structure, to substantially solve the problem of register formation and distribution in Katuic. Various grammars, sketches, and dictionaries have appeared, making Katuic now one of the AA branches that is more accessible to scholarship, e.g. Miller and Miller’s (1963) Brou (Bru) dictionary; Costello’s (1971, 1991) Katu dictionaries; Prasert’s (1978) dictionary of Kui/Suai; Luang-Thongkum and Puengpa’s (1980) dictionary of Bru; Nguyễn Văn Lợi et al.’s (1986) materials on Pacoh and Taoih; Nguyễn Hữu Hoành (1995) and Nguyễn Hữu Hoành and Nguyễn Văn Lợi (1998) on Katu grammar; Migliazza’s (1998) grammar of So; Alves’ (2006, 2014) grammars of Pacoh; Bos and Sidwell’s
(2014) sketch of Kui Ntua. Costello and Khamluan (1993) is a large collection of Katu folk tales in trilingual format. Most discussion of Katuic grammar is distributed in many journal articles, and readers can consult the two substantial Katuic linguistics bibliographies and the profile of the Katuic languages by Choo (2009, 2010, 2012). An extensive database of wordlist audio recordings is accessible online at http://sealang.net/archives/huffman2/.

5.6.4 Khmer

The first western accounts of Khmer lexicon and grammar, mainly by French scholars, appear in the second half of the 1800s (such as Aymonier’s 1874, 1878 dictionaries), and Maspero’s (1915) grammar and Tandart’s (1910–1911, 1935) dictionaries were particularly influential. Attention to inscriptional Khmer particularly took off with the work of Coedes (1937–1966) and is still carried on today by the EFEO (Paris) as a signature project (see also Sidwell and Jenny this volume regarding inscriptional Khmer). Broadly, through the 1920s to 1960s, priority went into language development with compilation and standardization efforts such as the Buddhist Institute Chuon Nath Khmer Dictionary (1938–1966). Western linguists were also interested in the development of Khmer in the context of efforts to support the nation before and after the tragic Khmer Rouge period, and we note the English-Cambodian dictionary of Huffman and Proum (1978), and Cambodian-English dictionaries of Headley et al. (1977), Headley et al. (1997), and Headley and Chim (2014). The first two of these were relied upon significantly during and after the United Nations Transitional Authority period in Cambodia (1992–1993). Both Huffman and Headley have independently made significant contributions to Khmer linguistics, including phonological, historical, and lexical studies as well as teaching and reference materials. A distinct northern dialect of Khmer is spoken in Thailand, and has attracted less attention from scholars, and the substantial Surin Khmer-Thai-English dictionary of Dhanan and Chartchai (1978) remains a principal resource. Khmer has been of prime importance to comparative AA linguistics; for example Pinnow (1959) and Shorto (2006) relied heavily on Khmer data in their analyses of AA linguistic history.
Ferlus (1992) offered a more fine-grained account of Khmer historical phonology, integrating evidence from Siamese and Sanskrit loan phonology, and Sidwell and Rau (2014) offer a summary reconstruction of proto-Khmeric. Khmer morphology has also been of special interest; extending Jenner’s 1969 PhD thesis, Jenner and Pou’s (1980–1981) A Lexicon of Khmer Morphology attempts an exhaustive morphological analysis of the Khmer lexicon, harking back to Schmidt’s (1905, 1906) thinking that the AA lexicon was built on CVC roots, and that rhymes (-VC) may have had meaningful functions. Khmer morphology remains an active area of research, including among linguists based in Cambodia (especially at the Buddhist Institute, Phnom Penh).

5.6.5 Khmuic

The Khmuic branch is dominated by Khmu (also written Kammu, Khmu’), spoken by more than 700,000 people mostly in northern Laos, plus a handful of smaller languages on the geographical periphery in Laos, Thailand, Vietnam and marginally in China (with some of the lesser members seriously under-documented). The study of Khmu has been on a strong basis for many decades, and since the 1970s multiple bibliographies of Khmu and Khmuic have been produced (Smalley 1973; Proschan 1987, 1996; Renard and Singhanetra-Renard 2015; Lund University 2015; and Cheeseman et al. 2017). There has been a Kammu project based at Lund since the 1970s, with key personnel Kristina Lindell, Jan-Olof Svantesson, and Kàm Ràw (Damrong Tayanin) producing numerous publications including the (2014) encyclopedic Dictionary of Kammu Yùan Language and Culture. Also Premsrirat (2002, plus a number of derived volumes) presents a large multidialectal dictionary of Khmu. Other dictionaries and lexicons include: Delcros and Subra (1966), Preisig et al. (1994), Kato (2001). Khmu has been known to linguistics since Garnier (1873) provided a short lexicon, and there are early sources of lexicon for smaller Khmuic lects such as Macey (1906). The mid-20th century interest in Khmu is also reflected in Maspéro’s (1955) materials on the Theng dialect, and Smalley’s (1956) dissertation on Khmu grammar, much edited and published as a 45-page pamphlet in 1961. Yet no long-form grammar of Khmu has appeared; instead diverse aspects of its grammar have been discussed in many journal articles over the decades. For a recent overview of the grammar, see the sketch by Svantesson and Holmer (2014). Work on the lesser Khmuic languages has been largely done by various missionaries, and Thai and Vietnamese scholars.
Notable works in English include: comparative works on Mal-Pray (Thinic) by Filbeck (1978), Unchalee (1988) and Rischel (1989); Mingkwan’s (1989) grammar of Pray; documentation and comparative work on Mlabri such as Rischel (1995, 2007), Bätscher (2014); description of Ksingmul by Pogibenko and Bùi Khánh Thế (1990), a sketch and lexicon of Phong by Bùi Khánh Thế (2000) and there are many other works that can be found via the various bibliographies mentioned above.

5.6.6 Monic

Mon has been a literary language since the first millennium. Europeans became aware of it around 1600, and with Mason (1854) and Haswell (1874) useful linguistic data began to appear in print. Although a few scholars have built their careers specializing in Mon, it is not widely studied: “Mon has not received much attention from the linguistic community” (Jenny 2014: 554). Halliday’s (1922) dictionary became a standard reference and basis for important later works. Both Otto Blagden and Harry Shorto published on inscriptional/written
Mon in English; Blagden’s legacy includes many articles in Epigraphia Birmanica, and Shorto produced dictionaries of Spoken Mon (1962) and Old Mon (1971) making liberal use of Blagden’s notes. See also Sidwell and Jenny (this volume) on Mon as an inscriptional language. A student of Shorto at SOAS, Christian Bauer, produced a grammar of Mon (PhD 1982), and subsequently wrote widely on Mon and Old Mon, and taught Mon language in Berlin for a time. Bauer’s (1984) A Guide to Mon Studies provides a useful survey of works on Mon up to that time. Active work on Mon grammar, and even teaching of Modern Mon in Europe, was carried on by Mathias Jenny (Paris, Zurich) whose (2005) monograph deals with the syntax of the Mon verb. Modern Mon has also been of interest to Japanese scholarship; the dictionaries of Sakamoto (1994, 1996) (based on the Pakkred variety in Thailand) being a notable example. Nai Pan Hla’s (1988–1989) Introduction to Mon Language was published in Tokyo, and he has also written extensively on Mon inscription, literature and grammar. Another ethnic Mon scholar, Nai Tun Way, has also authored a dictionary of English-Mon (1997) among other works. Grammatical sketches of Old Mon (Jenny and McCormick 2014) and Mon (Jenny 2014) are welcome contributions. Studies on the history of Mon and Monic include Blagden (1910), Haudricourt (1965), Shorto (1965), and Ferlus (1983), and Diffloth (1984) produced a comparative reconstruction of proto-Monic based on a comparison of Mon and Nyah Kur. The latter established the scope and time frame of the Monic branch, correlating it with the Dvaravati kingdom of 1st millennium Thailand. There has also been attention to the acoustics of Mon as a register language (e.g. Theraphan 1987).

5.6.7 Pakanic, Mang

The small Pakanic group, consisting of Bugan (Pakan) and Bolyu (Paliu, Lai), only came to the attention of linguists in the 1980s, initially with Liang Min’s (1984) sketch of Bolyu (in Chinese). Benedict (1990) drew on Chinese sources to discuss the tonal system, lexicon, and classification of Bolyu, characterising it as an independent AA branch. Discussion of the sister language Bugan appeared in Chinese in the 1990s with Wu Zili (1992) and Li Yunbing (2005); a sketch in both English and Chinese by Li Jinfang (1996a, 1996b) and another in English by Li Jinfang and Luo Yongxian (2014) deal to some extent with grammar. Pakanic is of particular interest for developing tones somewhat similar to Vietnamese, and yet located in Guizhou, and subject to extensive language contact effects. Hsiu (2016) proposes a reconstruction of proto-Pakanic segments, and Sidwell and Hsiu (2019) discuss the reconstruction of Pakanic tones. Mang is spoken on either side of the Vietnam-China border, and has been studied by scholars from both nations, with the Vietnam variety first documented in print by Vương Hoàng Tuyên (1963). Gao Yongqi (2003) provides a sketch of Chinese
Mang, while Nguyễn Văn Lợi (2008) offers a monograph length study of three Mang lects spoken in Vietnam. Both scholars describe six-tone systems, although there are some difficulties reconciling the transcriptions and tonal values they document. The origin of Mang tones remains obscure at this time. Thomas and Headley (1970) provisionally grouped Mang with Palaungic on the basis of lexicon in this source, and Peiros’ (2004) lexicostatistics put Pakanic and Mang into one AA branch. It is not clear whether such a Mangic branch is justified, although the studies have found statistical support (Sidwell 2014a, and Sidwell this volume on AA classification).

5.6.8 Palaungic

Schmidt (1904) recognized a Palaung-Wa ~ Palaung-Wa-Riang grouping and later Shafer (1952) introduced Palaungique, adapted to Palaungic by Thomas and Headley (1970). Various lexicons and mentions begin to appear around the turn of the last century, e.g. Scott and Hardiman (1900), Drage (1907), Davies (1909), and Milne’s (1921) grammar and (1931) dictionary of Namhsan Palaung were standard references for decades, hardly surpassed by more recent works such as Mak (2012) in terms of data and scope. Shorto, working in Burma before the 1962 anti-British crackdown, worked on Palaung (see Shorto 1957, 1960, 1963), and Luce (1965) documented Danaw, which was not described again until Nu Nu Thein (2005) and Si (2014). Herman and Margarete Janzen worked on Pale (Rucing) in the Shan State through the 1960s and published a grammar (1972) and various related papers. The full extent of the Palaungic branch took a century to become clear, given that languages are distributed across Myanmar, Thailand, China, Laos, and Vietnam, with many in problematic border areas. For some decades fieldwork in Burma was not viable and linguists directed attention to groups who were living in, or had taken refuge in, Thailand, especially Wa and Lawa. Notable works by Thai linguists from this time include: Narumol (1980, 1982) on Lamet, Suriya and Vinya (1997) on Lawa, and later Sujaritlak (2009), Sujaritlak and Pattama (2010), Supakit (2012), and Sujaritlak, Ampika and Supakit (2014). Many Wa were evangelized from the British period, and a Roman orthography (based on gospel translations such as Young and Yaw Su 1938) became a de facto standard and this Wa Bible came to be used by linguists. There is considerable detail on the history of Wa orthographies and lexical sources in Diffloth (1980) and Watkins (2013), the latter being the most important dictionary of Wa to date. From the 1980s Chinese linguists became more active, and a substantial output of survey data, sketches, and grammar papers related to Palaungic languages became available: Yan Qixiang et al. (1981), Chén Xiāng-Mù et al. (1986), Li Dao Yong et al. (1986), Yan Qixiang and Zhou Zhizhi (2012), Chen Guoqing (2005), Dao Jie (2007), Tao Chengmei (2016). Since the turn of the millennium there has also
been a significant contribution to Palaungic linguistics emerging from Payap University (Chiang Mai), such as the dissertations of Harper (2009), Hall (2010), Block (2013), Munn (2018) and the bibliographies by Gordon (2014) and Cheeseman et al. (2015). Palaungic is of considerable interest typologically and historically; some varieties have undergone significant phonological restructuring. For example, the Angkuic sub-group is particularly known for dramatic syllable reduction, Germanic consonant shift, and tonogenesis (see Svantesson 1988, 1989, 1991). Comparative-historical work has been keenly pursued since a renewal in the 1950s: Shafer (1952), Mitani (1977, 1979), Diffloth (1991) focus on the Palaung-Riang sub-group, Diffloth (1980) reconstructs proto-Waic, and Paulsen (1989–1990) reconstructs proto-Plang which falls within Waic. Shorto worked up a proto-Palaungic reconstruction in the 1960s and a fragment of that is presented as Appendix B to his (2006) Mon-Khmer Comparative Dictionary, and Sidwell (2015) presents the most recent reconstruction of the proto-Palaungic branch, including a comprehensive review of published classification and reconstructions.

5.6.9 Pearic

The Pearic languages of Cambodia and the Trat province of Thailand were among the first AA languages to be noticed by western scholarship (e.g. Crawford 1828), yet adequately documenting them has been problematic. The languages are small and endangered, and they have been heavily relexified and structurally remodeled by Khmer and Thai to some extent. The French colonial period saw recording of lexicons, e.g. Bastian (1868), Garnier (1873), Harmond (1878–1879), Morizon (1926) and the large comparative compilation of Baradat (1941). These early scholars did not have a clear idea that Pearic was a separate branch from Khmer, and sometimes referred to the languages as Khmer Boran or Khmer Daeum (‘ancient Khmer’). Haudricourt (1965) recognized distinctive phonological innovations in Pearic, and Thomas and Headley (1970) noted lexicostatistical distance of Pearic, so that by the first ICAAL meeting (1973) Pearic was squarely recognized as a distinct branch and given the name Pearic at the suggestion of Headley. Grammars and sketches of Pearic languages are few (e.g. Ploykaew 2001; Kamnuansin 2002; Rojanakul 2009; Premsrirat and Rojanakul 2014). The most extensive recent published lexical collection is the dictionary of Chong dialects by Premsrirat (2008), which is weighted to lects spoken in Thailand. Most recent descriptive materials have emerged from Thai researchers working with speakers in Trat; Khmer scholars have given Pearic a lower priority. Linguists have been fascinated by the complex of modal-breathy-creaky phonation in Pearic, which began to be recognized properly in the 1970s (e.g. Martin 1974; Surekha 1982; Huffman 1976, 1985; Theraphan 1984) and confirmed by instrumental
studies soon after (Luang-Thongkum 1988, 1991). This issue was explored in several dissertations (e.g. Ungsitipoonporn 2001; Thongkham 2003; Choosi 2007). Headley (1985) offers a reconstruction of proto-Pearic (149 roots), having previously published some lexical aggregations (Headley 1977, 1978), although he put aside the creaky register issue due to limitations of the sources. Various proposals for the origin of Pearic creak are discussed by Diffloth (1989), Ostapirat (2009), Ferlus (2011), the latter usefully listing all known Pear lexicons up to 2009.

5.6.10 Vietic

The Vietic branch is dominated demographically by Vietnamese, which has been subject to substantial efforts in standardization, elaboration, and education for national development goals. In addition there are the closely related Mường lects and maybe a dozen[5] minor Vietic languages spoken in the hinterlands of northern Vietnam and Laos, which remain seriously under-documented (the best grammar in print is that of May by Babaev and Samarina [2018], in Russian). From the wider linguistic perspective, the Vietic group has been of intense interest as an example of languages acquiring tonal systems and glottalised syllables.[6] Vietnamese is interesting as a language restructured substantially by contact – principally with Chinese – in its lexicon, phonology, and syntax, as well as for challenging theoretical issues such as the bases of genetic classifications and the notion of wordhood in linguistics (Thomas 1962; Schiering et al. 2010). The evidence of the Sino-Vietnamese vocabulary has been important for studies of Chinese linguistic history, and knowledge of Middle and Classical Chinese is essential for Vietnamese etymology. Early lexical compilations include Dumoutier (1891), Tharaud (1904), Lunet de Lajonquière (1906), and such sources facilitated the comparison of Vietnamese (modern and archaic), Mường, Sách, and Nguồn with other AA languages by Cheon (1907), demonstrating the subgroup and its alignment with AA rather than with Tai. Yet some scholars still placed excessive importance on typological parallels with Southern Chinese and Tai. The situation turned decisively with breakthroughs by Haudricourt (1952, 1954, etc.), who compared Vietnamese to Khmu, demonstrating the development of tones in the native lexicon. That insight ushered in a new era of comparative linguistics in MSEA in which more attention was given to the evidence of small languages, as fieldwork delivered much new data, permitting new analytical work.
Important comparative-historical studies include Barker (1963, 1966), Barker and Barker (1970, 1976), Thompson (1967, 1976), laying much groundwork for the reconstruction of Proto-Vietic, which is most fully realised in the works of Ferlus (e.g. 1998, 2004, 2007, the latter available as a searchable database at http://sealang.net/monkhmer). Additionally there are studies of historical Vietic by native scholars, such as Trần Trí Dõi (2011, 2018). Significant data collection on minor Vietic languages has been conducted by joint Vietnamese-Soviet/Russian field expeditions since the 1980s. Expeditions from 1979 to 2015 studied Mường, Ruc, Sach, Malieng, Arem, and Kri; while much of that material remains unpublished, substantial descriptions of Mường, Ruc, and May have appeared (Sokolovskaja and Nguyên Van Tai 1987; Solntseva and Nguyễn Văn Lợi 2001; Babaev and Samarina 2018). Notable sketches and reference papers by Vietnamese scholars on the minor Vietic languages include: Hà Văn Tấn and Phạm Đức Dương (1978), Nguyễn Phú Phong et al. (1988), Nguyễn Văn Lợi (1993). Language descriptions in English are somewhat rarer; examples include: Preedaporn (2008), Enfield and Diffloth (2009) (see Alves’ chapter on the typology of Vietic languages in this volume for more descriptive material on Vietic languages). For further readings on Vietic linguistics: Barker (1993) offers an annotated bibliography of Vietic languages, Babaev and Samarina (2018) offer an extensive current bibliography of Vietic linguistics, and there is an online open-access Zotero group.[7]

[5] Sources list 20 or more named Vietic lects, but it is clear that some of these are just different local names for varieties of the same language.
[6] “The situation of tone systems in Viet-Muong languages is of the highest importance for the theory of tonogenesis” (Ferlus 1998: 25).

[7] https://www.zotero.org/groups/956729/vietic_languages_and_cultures (last accessed 12 January 2021).

5.7 Conferences and journals

In 1959 SIL members and colleagues from the University of Saigon began meeting monthly as the Linguistic Circle of Saigon, reading papers in Vietnamese, French, and English. The group ran a workshop in Hue in 1963, and the papers presented there formed the basis of the journal Mon-Khmer Studies (MKS), which was launched in 1964. The journal quickly reached around the world and catalysed international cooperation in AA linguistics. At around the same time, there was a developing research focus on SE Asia at SOAS (London), and in 1961 Harry Shorto organised a meeting of some two dozen scholars from around Western Europe to discuss issues of comparative linguistics in Mainland and Insular SE Asia. The proceedings (Shorto 1963 ed.) demonstrated progress in analysis and reconstruction in phonology and morphology across multiple language families of the region. That conference was followed with another in 1965, and a two-volume set of proceedings was also published that year (Milner and Henderson eds. 1965). Attending the 1965 meeting was Norman Zide, then professor of Hindi at the University of Chicago, and leading a research effort into Munda and Nicobarese languages. Zide urged various American colleagues to organise an AA focused conference in the
USA, and given strong links between staff at Chicago and the University of Hawaii at Manoa, an organising committee was formed, led locally in Manoa by Lawrence Reid. In January 1973 the first International Conference on Austroasiatic Linguistics (ICAAL) was held to great success and a two-volume proceedings was published in 1976 as Austroasiatic Studies (Philip N. Jenner, Laurence C. Thompson and Stanley Starosta eds.). As Thompson noted, “The conferees were thus gratified to note progress on nearly all fronts” (Thompson 1976: viii). That first ICAAL was intended to be the first of a regular series, but it did not work out as such. The Second International Conference on Austroasiatic Linguistics (SICAL, as it was dubbed, and is referred to in Huffman’s 1986 bibliography) was held at the Central Institute of Indian Languages (CIIL) in Mysore in 1978, and some 40 papers were read. Much effort went into preparing proceedings for publication, yet in 1983 the editors abandoned the project. Bound sets of mimeographed papers from SICAL are kept in the Cambridge University library and the CIIL Library (Mysore), and most are available for download at an online archive.[8] After SICAL the momentum waned; in 1979 an AA symposium was held at Helsingør (Denmark) and, for a quarter century after, AA meetings were reduced to occasional sessions at other regional linguistic meetings. See the statement by van Driem[9] on the “Long Pause” in Austroasiatic studies. The era of documentary linguistics dawned in the 1990s, and in the 2000s new AA grammars and comparative reconstructions were appearing. 2006 saw a modest “Pilot Meeting” held in Siem Reap, which led to the formal resurrection of ICAAL as K. S. Nagaraja energetically organised ICAAL 3 at Deccan College Post-Graduate and Research Institute in Pune in 2007. A proceedings volume appeared in 2010 (Nagaraja and Mankodi eds.), and plans were made to hold the renewed ICAAL conferences every two years.
The 4th ICAAL was held at Mahidol University (Thailand) in 2009. This was a larger meeting and included a ceremony celebrating the elder statesmen and -women of the field, with gifts and speeches recounting the earlier meetings and progress of the field over previous decades. At that meeting Mathias Jenny proposed the Austroasiatic Handbook Project which eventually saw publication in 2014 (Jenny and Sidwell eds.). A meeting planned for 2011 was cancelled due to flooding in Bangkok, and the 5th ICAAL was held in 2013 in Canberra. This was a smaller meeting with fewer than 30 presentations, although this would prove to be a trend for ICAAL subsequently. The 6th ICAAL conference was held in Siem Reap in 2015, organised by Meng Vong, at which 34 papers were programmed. ICAAL7 was held in Kiel (Germany) in 2017, organised by John Person and Tobias Weber. Some 22 papers were read over two days; although a modest meeting in size, it importantly drew in participation from the Lund (Sweden) based AA researchers, who agreed to host an ICAAL in Lund in 2021. The last ICAAL to be held at the time of writing was ICAAL8 in 2019, at Chiang Mai University’s Myanmar Center, organised by Mathias Jenny, Paul Sidwell and Ampika Rattanapitak. Also in 2016 an ICAAL workshop on historical syntax was held in Chiang Mai, out of which emerged a volume of papers (Jenny, Sidwell and Alves eds. 2020). Over more than a century of AA linguistics, relevant scholarly output has mostly appeared in journals, such as Mon-Khmer Studies (MKS), Bulletin de l’Ecole française d’Extrême-Orient (BEFEO), Journal Asiatique, Journal of the Siam Society, Journal of the American Oriental Society, Federation Museums Journal, Bulletin of the School of Oriental and African Studies, Indian Linguistics, Linguistics of the Tibeto-Burman Area, and the Journal of the Southeast Asian Linguistics Society. At the time of writing MKS, published semi-regularly for over 50 years, had become moribund, and these days there is no journal dedicated to Austroasiatic language and linguistics in MSEA. The existence and continuity of MKS was substantially due to the energetic effort of one dedicated scholar over multiple decades, David Thomas of SIL. Thomas passed away in 2006, and this setback combined with increasing general difficulties in journal publishing saw a decline in MKS circulation and by 2016 publication had ceased. A total of 45 issues of MKS were published, plus several monograph length special issues, and most of the content is available online.[10] That legacy includes landmark publications, such as Jenner and Pou’s (1980–1981) A Lexicon of Khmer Morphology, Thompson’s (1987) Vietnamese Reference Grammar, and various proto-language reconstructions and bibliographies for Austroasiatic branches. In many ways, MKS kept AA studies alive during the Long Pause between ICAAL meetings, and its impact on the field is enduring.

[8] https://sites.google.com/view/paulsidwell/the-sical-papers (last accessed 12 January 2021).
[9] Read van Driem’s potted history at https://www.himalayanlanguages.org/icaal (last accessed 12 January 2021).

5.8 Reference materials and archives

Various reference works are discussed below, chosen principally for their programmatic contributions and reference to key figures in the history of the field. Shorto et al. (eds.) (1963) reflects the new programmatic zeal for SE Asian linguistics that was manifest at SOAS in the 1960s. The authors compiled an annotated bibliography of Mon-Khmer and Tai linguistics that was a standard reference until Huffman’s (1986) magisterial bibliography of MSEA linguistics appeared. Jenner et al. (eds.) (1976) is the two-volume proceedings of the first ICAAL meeting (1973). It includes 53 papers (in 1,343 pages) by 38 authors, reflecting the state of the art in AA studies in the mid-1970s. The topics covered include comparative reconstruction, classification, epigraphy, phonology, and morphosyntax, reflecting the peak of activity in AA studies before the generational “Long Pause” set in after the failure to

10 http://mksjournal.org, http://sealang.net/mks (last accessed 12 January 2021).

78 

 Paul Sidwell

plan a third ICAAL meeting or to follow through with publication of proceedings of the second such conference. Nagaraja (1989) is a bibliography of AA linguistics; although the material is heavily weighted to Munda and Khasian languages, reflecting the anticipated Indian readership, it nonetheless includes a substantial number of references to important works on AA languages of MSEA. Parkin (1991) offers an exhaustive aggregation of demographic, geographic, and language affiliation data for language speaker groups. Much of the text is derived from Lebar et al. (1964) and colonial-era sources, and it contains much data on the historical locations and populations of AA communities. Shorto (2006) is the posthumous publication of the author’s magnum opus, a comparative dictionary of the Mon-Khmer languages (effectively an AA etymological dictionary) with more than 2,000 entries, and preliminary chapters discussing the historical phonology of the family based principally on the comparison of Mon and Khmer (and their inscriptional antecedents, as partially explained by Shorto 1976). It was edited together from partly incomplete notes salvaged a decade after Shorto passed away, including multiple preparatory drafts that were produced over a period that stretched from the late 1960s into the late 1980s. Jenny and Sidwell’s (2014) Handbook of the Austroasiatic Languages is a unique reference work in relation to AA studies. The two volumes, totaling 1,330 pages, present the work of 27 authors, delivering sketches of 21 AA languages, with at least one from each branch. Other chapters include typological, historical and classification overviews, with much historiographical detail throughout. There is no book series dedicated specifically to Austroasiatic linguistics; grammars, conference papers and other collections have been published over the decades by a wide range of publishers.
1986 saw the publication of Franklin Huffman’s 640-page Bibliography and Index of Mainland Southeast Asian Languages and Linguistics, an attempt at an exhaustive aggregation of works up to that time, which makes it a fundamental resource for AA studies (among others). More than two decades later the need for an updated resource of this type was recognized, and the Center for Research in Computational Linguistics (CRCL, Bangkok) responded by aggregating bibliographic resources and links to PDF copies of thousands of papers, book chapters, and theses related to MSEA languages and linguistics, making these available online.11 Additionally, with funding support from the National Endowment for the Humanities (Washington), CRCL compiled a project site with substantial data resources (more than a quarter of a million lexical items) and research tools for AA languages of MSEA, plus a sister site for Munda languages, with links to related sites and resources.12

11 http://sealang.net/sala (last accessed 12 January 2021).
12 http://sealang.net/monkhmer, http://sealang.net/munda (last accessed 12 January 2021).




Since 2012 Lund University (Sweden) has hosted the RWAAI (The Repository and Workspace for Austroasiatic Intangible Heritage) digital archive.13 This archive is uniquely committed to the preservation of research collections documenting the languages and cultures of communities from the Austroasiatic language family of Mainland Southeast Asia and India. Collections are accessible free of charge, although they may be subject to special access conditions. The digital archive of the Pangloss Collection14 is maintained by a consortium headed by LACITO (Paris). Although global in scope, and focusing on endangered languages, it happens to archive a significant amount of audio data and wordlists for AA languages, reflecting the great legacy of French linguists in MSEA. The files are generally freely available for download.

References

Alves, Mark. 2006. A grammar of Pacoh: A Mon-Khmer language of the central highlands of Vietnam. Canberra: Pacific Linguistics.
Alves, Mark. 2014. Pacoh. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 881–906. Leiden & Boston: Brill.
Anderson, Gregory (ed.). 2008. The Munda languages. London & New York: Routledge.
Asmah, Haji Omar. 1963. Bahasa Semang: Dialek Kentakbong. Jabatan Pengajian Melayu: Universiti Malaya, Kuala Lumpur honors thesis.
Asmah, Haji Omar. 2014. The Mah Meri language: An introduction. Kuala Lumpur: University of Malaya Press.
Aymonier, Etienne François. 1874. Dictionnaire français-cambodgien; précédé d’une notice sur le Cambodge et d’un aperçu de l’écriture et de la langue cambodgiennes. Paris: Challamel.
Aymonier, Etienne François. 1878. Dictionnaire khmer-français. Saigon. (Self-published).
Azémar, Henri. 1886. Dictionnaire Stieng. Recueil de 2,500 mots fait à Brơlâm en 1865. Excursions et Reconnaissances 12. 99–146, 251–344.
Babaev, Kirill V. & Irina V. Samarina. 2018. Язык рук Май. Материалы российско-вьетнамской лингвистической экспедиции. Вып. 5. Москва: ЯСК.
Baradat, R. 1941. Les dialectes des tribus sâmrê. Paris: Manuscrit de l’Ecole Française d’Extrême-Orient.
Barker, Milton E. 1963. Proto-Vietnamuong initial labial consonants. Văn-hoa Nguyêt-san 12(3). 491–500.
Barker, Milton E. 1966. Vietnamese and Mương tone correspondences. In Norman Herbert Zide (ed.), Studies in comparative Austroasiatic linguistics, 9–25. The Hague: Mouton.
Barker, Milton E. & Muriel A. Barker. 1970. Proto-Vietnamuong (Annamuong) final consonants and vowels. Lingua 24(3). 268–285.
Barker, Milton E. & Muriel A. Barker. 1976. Muong-Vietnamese-English dictionary. Dallas: Summer Institute of Linguistics (microfiche).

13 https://projekt.ht.lu.se/rwaai (last accessed 12 January 2021).
14 http://lacito.vjf.cnrs.fr/pangloss/ (last accessed 12 January 2021).


Barker, Miriam. 1993. Bibliography of Mường and other Vietic language groups, with notes. Mon-Khmer Studies 23. 197–243.
Bastian, Adolf. 1868. Reise durch Kambodja nach Cochinchina. Die Voelker des oestlichen Asien: Studien und Reisen von Dr. Adolf Bastian, Vierter Band. Jena: Hermann Costenoble.
Bätscher, Kevin. 2015. Mlabri. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 2 vols., 1003–1030. Leiden & Boston: Brill.
Bauer, Christian. 1982. Morphology and syntax of spoken Mon. London: University of London PhD dissertation.
Bauer, Christian. 1984. A guide to Mon studies (Working Papers / Centre of Southeast Asian Studies). Melbourne: Monash University.
Benedict, Paul K. 1990. How to tell Lai: An exercise in classification. Linguistics of the Tibeto-Burman Area 13(2). 1–26.
Benjamin, Geoffrey. 1976a. Austroasiatic subgroupings and prehistory in the Malay Peninsula. In Philip N. Jenner, Laurence C. Thompson & Stanley Starosta (eds.), Austroasiatic studies (Oceanic Linguistics Special Publications 13), 37–128. Honolulu: University of Hawaii Press.
Benjamin, Geoffrey. 1976b. An outline of Temiar grammar. In Philip N. Jenner, Laurence C. Thompson & Stanley Starosta (eds.), Austroasiatic studies (Oceanic Linguistics Special Publications 13), 129–188. Honolulu: University of Hawaii Press.
Bergsland, Knut & Hans Vogt. 1962. On the validity of glottochronology. Current Anthropology 3. 115–153.
Bishop, Nancy M. 1996. A preliminary description of Kensiw (Maniq) phonology. Mon-Khmer Studies 25. 227–253.
Bishop, Nancy M. & Mary M. Peterson. 1994a. Kensiw glossary. Mon-Khmer Studies 23. 163–195.
Bishop, Nancy M. & Mary M. Peterson. 1994b. A selective Aslian bibliography. Mon-Khmer Studies 24. 161–169.
Blagden, Charles Otto. 1906. Language and comparative vocabulary of Aboriginal dialects. In Walter William Skeat & Charles Otto Blagden, Pagan races of the Malay Peninsula, vol. 2, 379–472, 481–775. London: Macmillan.
Blagden, Charles Otto. 1910. Quelques notions sur la phonétique du talain et son évolution historique. Journal Asiatique 15. 477–505.
Blok, Gregory Robert. 2013. A descriptive grammar of Eastern Lawa. Thailand: Payap University MA thesis.
Blood, Henry F. 1966. A reconstruction of Proto-Mnong. Bloomington: Indiana University MA thesis. (Mimeograph).
Bon, Noëllie. 2014. Une grammaire de la langue stieng. Lyon: Université Lumière Lyon 2 doctoral thesis.
Bos, Kees Jan & Paul Sidwell. 2014. Kui Ntua. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 2 vols., 837–880. Leiden & Boston: Brill.
Bùi Khánh Thế. 2000. The Phong language of the Ethnic Phong which live near the Melhir Muong Pon Megalith in Laos. In Pan-Asiatic Linguistics: The Fifth International Symposium on Languages and Linguistics, 199–253. Ho Chi Minh City: National University.
Burenhult, Niclas & Claudia Wegener. 2009. Preliminary notes on the phonology, orthography, and vocabulary of Semnam. Austroasiatic, Malay Peninsula. Journal of the Southeast Asian Linguistics Society 1. 283–311.
Burenhult, Niclas. 2005. A grammar of Jahai. Canberra: Pacific Linguistics.
Butler, Becky. 2014. Bunong. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 2 vols., 719–745. Leiden & Boston: Brill.




Cabaton, Antoine. 1905. Dix dialectes indochinois recueillis par Prosper Oden’hal. Etude linguistique par Antoine Cabaton. Journal Asiatique, Dixième série, tome V. 265–344.
Carey, H. J. Iskandar. 1961. Tengleq Kui Serok: A study of the Temiar language with an ethnographic summary. Kuala Lumpur: Dewan Bahasa dan Pustaka.
Cheeseman, Nathaniel, Elizabeth Hall & Darren Gordon. 2015. Palaungic linguistic bibliography. Mon-Khmer Studies 44. i–iiv.
Cheeseman, Nathaniel, Jennifer Herington & Paul Sidwell. 2013. Bahnaric linguistic bibliography with selected annotations. Mon-Khmer Studies 42. xxxiv–xlvii.
Cheeseman, Nathaniel, Paul Sidwell & R. Anne Osborne. 2017. Khmuic linguistic bibliography with selected annotations. Journal of the Southeast Asian Linguistics Society 10(1). i–xlvi.
Chen Guoqing [陈国庆]. 2005. A study of Kemie [Kemie yu yan jiu 克蔑语研究]. Beijing: Ethnic Publishing House [民族出版社].
Chén Xiāng-Mù, Wáng Jìng Liú & Làn Yǒng Liáng. 1986. Dé-ángyǔ jiǎnzhì [A description of the Ta-ang language]. Beijing: National Minorities Press. (In Chinese)
Cheon, Jean-Nicolas-Arthur. 1907. Note sur les dialectes nguon, sac et muong. Bulletin de l’École Française d’Extrême-Orient 7. 87–99.
Choo, Marcus. 2009. Katuic bibliography. Chiang Mai, Thailand: Survey Unit, Linguistics Institute, Payap University.
Choo, Marcus. 2010. Katuic bibliography with selected annotations. Chiang Mai, Thailand: Survey Unit, Linguistics Institute, Payap University.
Choo, Marcus. 2012. The status of Katuic. Chiang Mai, Thailand: Survey Unit, Linguistics Institute, Payap University.
Choosri, Isara. 2007. Investigating contact-induced language change: Cases of Chung (Saoch) in Thailand and Cambodia. Thailand: Mahidol University PhD thesis.
Coedès, George. 1937–1966. Inscriptions du Cambodge (EFEO collections de textes et Documents sur l’Indochine, 8 vols.). Hanoi: Imprimerie d’Extrême-Orient.
Costello, Nancy A. 1971. Ngữ-vựng Katu: Katu vocabulary (Vietnam Montagnard Language Series 5). Saigon: Department of Education (Summer Institute of Linguistics Dallas Microfiche).
Costello, Nancy A. 1991. Katu dictionary (Katu-Vietnamese-English). Dallas: Summer Institute of Linguistics.
Costello, Nancy A. & Khamluan Sulavan. 1993. Katu folktales and society, Katu-Lao-English. Vientiane: Ministry of Information and Culture.
Crawfurd, John. 1828. Journal of an embassy from the Governor-General of India to the Courts of Siam and Cochin China. London: Henry Colburn.
Cust, Robert Needham. 1878. A sketch of the modern languages of the East Indies. London: Trübner and Company.
Dao Jie 刀洁. 2007. Bumang yu yanjiu 布芒语研究 [A study of Bumang]. 北京: 民族出版社 [Beijing: Nationalities Publishing House].
Davies, Henry Rudolph. 1909. Yün-nan: The link between India and the Yangtze. Cambridge: Cambridge University Press.
Delcros, Henri & Jean Subra. 1966. Petit dictionnaire du langage des Khmu’ de la région de Xieng-Khouang. Vientiane: Mission Catholique (Mimeograph).
Dentan, Robert K. ms. An outline of Semai grammar. n.d.
Dhanan Chantrupanth & Chartchai Phomjakagarin. 1978. Khmer (Surin) – Thai – English dictionary. Bangkok: Chulalongkorn University.
Diffloth, Gérard. 1968. Proto-Semai phonology. Federation Museums Journal (new series) 13. 65–74.
Diffloth, Gérard. 1975. Les langues Mon-Khmer de Malaisie: classification historique et innovations. Asie du Sud-Est et Monde Insulindien 6(4). 1–19.


Diffloth, Gérard. 1976. Jah-Hut, an Austroasiatic language of Malaysia. In Nguyen Dang Liem (ed.), Southeast Asian Linguistic Studies (Pacific Linguistics C-42), vol. 2, 73–118. Canberra: Australian National University.
Diffloth, Gérard. 1977. Towards a history of Mon-Khmer: Proto-Semai vowels. Tônan Ajia Kenkyû (Southeast Asian Studies) 14(4). 463–495.
Diffloth, Gérard. 1980. The Wa languages (Linguistics of the Tibeto-Burman Area 5.2). Berkeley: University of California.
Diffloth, Gérard. 1982. Registres, dévoisement, timbres vocaliques: leur histoire en Katouique. Mon-Khmer Studies 11. 47–82.
Diffloth, Gérard. 1984. The Dvaravati-Old Mon Language and Nyah Kur (Monic Language Studies 1). Bangkok: Chulalongkorn University Printing House.
Diffloth, Gérard. 1989. Proto-Austroasiatic creaky voice. Mon-Khmer Studies 15. 139–154.
Diffloth, Gérard. 1991. Palaungic vowels in Mon-Khmer perspective. In Jeremy H. C. S. Davidson (ed.), Austroasiatic languages, essays in honour of H. L. Shorto, 13–28. London: School of Oriental and African Studies, University of London.
Diffloth, Gérard. ms. The Mon-Khmer family of languages: An introduction. Taipei: Institute of History and Philology, Academia Sinica. Unpublished.
Dourisboure, Le Père P.-X. 1889. Dictionnaire bahnar-français. Hongkong: Imprimerie de la Société des Missions Etrangères 45.
Dournes, Jacques. 1950. Dictionnaire Srê (Köhö)-Français. Saigon: Des Missions Etrangères de Paris.
Drage, Godfrey. 1907. A few notes on Wa. Rangoon: Superintendent, Government Press.
Drouin, S. & K’nai. 1962. Dictionnaire français-montagnard (Köho). 4 vols. Dalat. [Mimeographed: Wason film 2359, Cornell University, Wason Collection].
Dumoutier, Gustave Emile. 1891. Notes sur la Rivière Noire et le Mont Ba-Vi (Tonkin). Bulletin de Géographie historique et descriptive 4. 150–196.
Dunn, Michael, Niclas Burenhult, Nicole Kruspe, Sylvia Tufvesson & Neele Becker. 2011. Aslian linguistic prehistory: A case study in computational phylogenetics. Diachronica 28(3). 291–323.
Efimov, A. Ju. 1990. Istoricheskaja Fonologija Juzhnobaxnaricheskix Jazykov. Moskva: Nauka.
Efimov, Aleksandr. 1983. Problemy fonologicheskoj rekonstrukcii proto-katuicheskogo jazyka. Moscow: Institute of Far Eastern Studies Kandidat Dissertation.
Emeneau, Murray B. 1954. Linguistic prehistory of India. Proceedings of the American Philosophical Society 98. 282–292.
Enfield, Nicholas & Gérard Diffloth. 2009. Phonology and sketch grammar of Kri, a Vietic language of Laos. Cahiers de Linguistique – Asie Orientale 38(1). 3–69.
Ferlus, Michel. 1983. Essai de phonétique historique du môn. Mon-Khmer Studies 12. 1–90.
Ferlus, Michel. 1992. Essai de phonétique historique du khmer (Du milieu du premier millénaire de notre ère à l’époque actuelle). Mon-Khmer Studies 21. 57–89.
Ferlus, Michel. 1998. Les systèmes de tons dans les langues viet-muong. Diachronica 15(1). 1–27.
Ferlus, Michel. 2004. The origin of tones in Viet-Muong. In Somsonge Burusphat (ed.), Papers from the Eleventh Annual Meeting of the Southeast Asian Linguistics Society 2001, 297–313. Tempe, AZ: Arizona State University.
Ferlus, Michel. 2007. Lexique de racines Proto Viet-Muong (Proto Vietic Lexicon). http://sealang.net/monkhmer/database/ (last accessed 12 January 2021).
Ferlus, Michel. 2011. Toward Proto Pearic: Problems and historical implications. In Sophana Srichampa & Paul Sidwell (eds.), Austroasiatic studies: Papers from ICAAL4. Mon-Khmer Studies Journal Special Issue No. 2, 38–51. Dallas: SIL International; Salaya: Mahidol University; Canberra: Pacific Linguistics.
Filbeck, David. 1978. T’in: A historical study. Canberra: Pacific Linguistics.




Gainey, Jerry. 1985. A comparative study of Kui, Bruu and So phonology from a genetic point of view. Thailand: Chulalongkorn University MA thesis.
Gao Yongqi [高永奇]. 2003. A study of Mang [莽语硏究]. Beijing: Ethnic Publishing House [民族出版社].
Garnier, Francis. 1873. Voyage d’exploration en Indo-Chine effectué pendant les années 1866, 1867, et 1868 par une Commission Française présidée par M. le Capitaine de Frégate Doudart de Lagrée. Paris: Librairie Hachette.
Gehrmann, Ryan. 2015. Vowel height and register assignment in Katuic. Journal of the Southeast Asian Linguistics Society 8. 56–70.
Gehrmann, Ryan. 2016. The West Katuic languages: Comparative phonology and diagnostic tools. Thailand: Payap University MA thesis.
Gehrmann, Ryan. 2019. On the origin of Rime laryngealization in Ta’oiq: A case study in vowel height conditioned phonation contrasts. Paper presented at the 8th International Conference on Austroasiatic Linguistics (ICAAL8), Chiang Mai, Thailand, 29–31 August.
Gordon, Darren C. 2014. A selective Palaungic linguistic bibliography. Mon-Khmer Studies Journal 42. xiv–xxxiii.
Gregerson, Kenneth. 1971. Predicate and argument in Rengao grammar. Seattle: University of Washington PhD dissertation.
Gregerson, Kenneth. 2014. Fifty years of Mon-Khmer Studies. Mon-Khmer Studies 43(2). i–iii.
Guilleminet, Paul & R. P. Jules Alberty. 1959–1963. Dictionnaire bahnar-français, 2 vols. Hanoi & Paris: Publications de l’École Française d’Extrême-Orient.
Hà Văn Tấn & Phạm Đức Dương. 1978. Về ngôn ngữ Tiền Việt-Mượng. Hanoi.
Hall, Elizabeth. 2010. A phonology of Muak Sa-aak. Thailand: Payap University MA thesis.
Halliday, R. 1922. A Mon-English dictionary. Bangkok: Siam Society.
Harmand, Jules. 1878–1879. Notes de voyage en Indo-Chine: les Kouys – Ponthey-Kakèk. Annales d’Extrême-Orient 1. 322–339.
Harper, Jerod. 2009. Phonological descriptions of Plang spoken in Man Noi, La Gang, and Bang Deng Villages (in China). Thailand: Payap University MA thesis.
Haswell, James M. 1874. Grammatical notes and vocabulary of the Peguan language. Rangoon: American Baptist Mission.
Haudricourt, André-Georges. 1952. L’origine môn-khmèr des tons en viêtnamien. Journal Asiatique 240. 264–265.
Haudricourt, André-Georges. 1953. La place du viêtnamien dans les langues austroasiatiques. Bulletin de la Société de Linguistique de Paris 49(1). 122–128.
Haudricourt, André-Georges. 1954. De l’origine des tons en viêtnamien. Journal Asiatique 242. 69–82.
Haudricourt, André-Georges. 1965. Mutation consonantique en Mon-Khmer. Bulletin de la Société de Linguistique de Paris 60. 160–172.
Hayes, La Vaughn H. 1982. The mutation of *r in pre-Thavưng. Mon-Khmer Studies 11. 83–100.
Headley, Robert K., Jr. 1977. A Pearic vocabulary. Mon-Khmer Studies 6. 69–150.
Headley, Robert K., Jr. 1978. An English-Pearic vocabulary. Mon-Khmer Studies 7. 61–94.
Headley, Robert K., Jr. 1985. Proto-Pearic and the classification of Pearic. In Suriya Ratanakul, David Thomas & Suwilai Premsrirat (eds.), Southeast Asian Linguistic Studies presented to André-G. Haudricourt, 428–478. Bangkok: Mahidol University.
Headley, Robert K., Jr. & Rath Chim. 2014. Modern Cambodian-English dictionary, 2nd edn. Hyattsville: Dunwoody Press.
Headley, Robert K., Jr., Kylin Chhlor, Lim Hak Kheang, Lam Kheng Lim & Chen Chun. 1977. Cambodian-English dictionary. Washington: Catholic University of America Press.


Headley, Robert K., Jr., Rath Chim & Ok Soeum. 1997. Modern Cambodian-English dictionary. Kensington: Dunwoody Press.
Hoàng Tuệ, Lý Toàn Thắng, Tạ Văn Thông, et al. 1986. Ngữ pháp Tiếng Kơho [A grammar of Koho]. Lam Dong, Vietnam: Sở Văn Hóa và Thông Tin Lâm Đồng.
Hsiu, Andrew. 2016. A preliminary reconstruction of Proto-Pakanic. Thailand: Payap University manuscript.
Huffman, Franklin E. 1971. Vocabulary lists (Mon-Khmer) 20 languages. http://sealang.net/archives/huffman (last accessed 12 January 2021).
Huffman, Franklin E. 1976. The register problems in fifteen Mon-Khmer languages. In Philip N. Jenner, Laurence C. Thompson & Stanley Starosta (eds.), Austroasiatic studies (Oceanic Linguistics, Special Publication 13), Part I, 575–589. Honolulu: University of Hawaii.
Huffman, Franklin E. 1985. The phonology of Chong, a Mon-Khmer language of Thailand. In Suriya Ratanakul, David Thomas & Suwilai Premsrirat (eds.), Southeast Asian Linguistic Studies presented to André-G. Haudricourt, 355–388. Bangkok: Mahidol University.
Huffman, Franklin E. & Im Proum. 1978. English-Khmer dictionary. New Haven: Yale University Press.
Huffman, Franklin E. 1986. Bibliography and index of mainland Southeast Asian languages and linguistics. New Haven & London: Yale University Press.
Jacq, Pascale & Paul Sidwell. 2000. A comparative West Bahnaric dictionary. München: Lincom Europa.
Jacq, Pascale. 2001. A description of Jruq (Loven): A Mon-Khmer language of the Lao PDR. Canberra: Australian National University MA thesis.
Janzen, Herman & Margarete Janzen. 1972. Grammar analysis of Pale clauses and phrases. Journal of the Burma Research Society 55(1/2). 47–99.
Jenner, Philip N. & Saveros Pou. 1980–1981. A lexicon of Khmer morphology. Mon-Khmer Studies 9/10.
Jenner, Philip N., Laurence C. Thompson & Stanley Starosta (eds.). 1976. Austroasiatic studies (Oceanic Linguistics, Special Publication 13), 2 vols. Honolulu: University of Hawaii.
Jenny, Mathias & Patrick McCormick. 2014. Old Mon. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 2 vols., 519–552. Leiden & Boston: Brill.
Jenny, Mathias, Paul Sidwell & Mark Alves (eds.). 2020. Austroasiatic syntax in areal and diachronic perspective. Leiden: Brill.
Jenny, Mathias. 2005. The verb system of Mon. Zurich: Universität Zürich.
Jenny, Mathias. 2014. Modern Mon. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 2 vols., 553–600. Leiden & Boston: Brill.
Kamnuansin, Sunee. 2002. Kasong syntax. Thailand: Institute of Language and Culture for Rural Development, Mahidol University MA thesis.
Kato, Takashi. 2001. Khmu vocabulary. In Tasaku Tsunoda (ed.), Endangered languages of the Pacific Rim. Basic materials in minority languages 2001, 95–104. Osaka: Osaka Gakuin University.
Keane, Augustus Henry. 1880. On the relations of the Indo-Chinese and Indo-Oceanic races and languages. Journal of the Royal Anthropological Institute of Great Britain and Ireland 9. 254–289.
Kruspe, Nicole D. 2004. A grammar of Semelai. Cambridge: Cambridge University Press.
Kruspe, Nicole, Niclas Burenhult & Ewelina Wnuk. 2014. Northern Aslian. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 419–474. Leiden: Brill.
Kruspe, Nicole. 2010. A dictionary of Mah Meri as spoken at Bukit Bangkong (Oceanic Linguistics Special Publications 36). Honolulu: University of Hawai’i Press. https://muse.jhu.edu/book/830 (last accessed 12 January 2021).
Kruspe, Nicole. 2014. Semaq Beri. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 475–516. Leiden: Brill.




Kuiper, F. B. J. 1948. Proto-Munda words in Sanskrit. Amsterdam: Noord-Hollandsche Maatschappij.
Lê Đông & Tạ Văn Thông. 2008. Từ điển Việt – Xơ Đăng [Vietnamese-Sedang dictionary]. Hanoi: Nhà xuất bản văn hóa thông tin.
Lebar, Frank M., Gerald C. Hickey & John K. Musgrave. 1964. Ethnic groups of mainland Southeast Asia. New Haven, CT: Human Relations Area Files Press.
Lévi, Sylvain. 1923. Pré-Aryen et pré-Dravidien dans l’Inde. Journal Asiatique 203. 1–57.
Li Dao Yong, Nie Xi Zhen & Qiu E Feng. 1986. Bùlàngyu ǰiánzhì [A description of Bulang (Lamet)]. Beijing: Chinese Academy of Social Sciences.
Li Jinfang [李錦芳]. 1996a. A sketch of Bugan [布干语概况]. Minzu Yuwen.
Li Jinfang. 1996b. Bugan – A new Mon-Khmer language of Yunnan Province, China. Mon-Khmer Studies 26. 135–160.
Li Jinfang & Luo Yongxian. 2014. Bugan. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 1031–1062. Leiden: Brill.
Li Yunbing [李云兵]. 2005. A study of Bugeng [Bugan] [布赓语研究]. Beijing: Ethnic Publishing House [民族出版社].
Liang Min [梁敏]. 1984. A sketch of Bolyu [俫语概况]. Minzu Yuwen.
Logan, James Richardson. 1850. On the leading characteristics of the Papuan, Australian, and Malayu-Polynesian nations. The Journal of the Indian Archipelago IV. 344–478.
Logan, James Richardson. 1859. The Mon-Anam formation. Journal of the Indian Archipelago 3(1). 153–183.
Luang-Thongkum, Theraphan & See Puengpa. 1980. A Bruu-Thai-English dictionary. Bangkok: Chulalongkorn University Printing House.
Luang-Thongkum, Theraphan. 1984. Samre language. Journal of Thammasat University 15(1). 16–128. (In Thai)
Luang-Thongkum, Theraphan. 1987. Another look at the register distinction in Mon. UCLA Working Papers in Phonetics 67. 132–165.
Luang-Thongkum, Theraphan. 1988. Phonation types in Mon-Khmer languages. In Osama Fujimura (ed.), Vocal physiology: Voice production, mechanisms and function, 319–333. New York: Raven Press.
Luang-Thongkum, Theraphan. 1991. An instrumental study of Chong Register. In Jeremy H. C. S. Davidson (ed.), Austroasiatic languages. Essays in honour of H. L. Shorto, 141–160. London: School of Oriental and African Studies, University of London.
Luang-Thongkum, Theraphan. 2001. Languages of the tribes in Xekong Province Southern Laos. Bangkok: Chulalongkorn University Press.
Luce, Gordon H. 1965. Danaw, a dying Austroasiatic language. Lingua 14. 98–129.
Lund University. 2015. Bibliography of the Kammu Project. https://projekt.ht.lu.se/en/rwaai/austroasiatic/the-kammu-project/bibliography/ (accessed November 2019).
Lunet de Lajonquière, Etienne Edmond. 1906. Ethnographie du Tonkin septentrional. Paris: Ernest Leroux.
Lý Toàn Thắng, Tạ Văn Thông, K’Brêu & K’Bròh. 1985. Ngữ pháp tiếng Kơho [Kơho grammar]. Lâm Đồng: Sở Văn hóa và Thông Tin Lâm Đồng.
Lye, Tuck Po. (ed.). 2001. Orang Asli of Peninsular Malaysia: A comprehensive and annotated bibliography. Kyoto: Kyoto University Center for Southeast Asian Studies.
Ma Nu Nu Thein. 2005. Danaw/Htanaw grammar and phonology. Yangon: Yangon University PhD dissertation. (In Burmese).
Macey, Paul. 1906. Etude ethnographique sur diverses tribus, aborigènes ou autochtones, habitant les provinces de Hua-phans, Ha-tang-hoc et du Cammon, au Laos. In Actes du XIVe Congrès International des Orientalistes: 1e Tome, 5e Section, 3–63. Paris: Ernest Leroux.
Mak, Pandora. 2012. Golden Palaung: A grammatical description. Canberra: Asia-Pacific Linguistics.


Mann, Noel. 2004. Mainland Southeast Asia Comparative wordlist for lexicostatistic studies. Chiang Mai: Payap University Graduate School.
Martin, Marie A. 1975. Les dialectes Pears dans leurs rapports avec les langues nationales. Journal of the Siam Society 63(2). 86–95.
Mason, Francis. 1854. The Talaeng language. Journal of the American Oriental Society 4. 277–288.
Maspéro, Henri. 1912. Etude sur la phonétique historique de la langue annamite. Les initiales. Bulletin de l’Ecole Française d’Extrême Orient 12. 1–27.
Maspéro, Henri. 1915. Grammaire de la langue khmère (cambodgien). Paris: Imprimerie Nationale.
Maspéro, Henri. 1955. Matériaux pour l’étude de la langue t’èng. Bulletin de l’Ecole Française d’Extrême Orient 47. 457–507.
Matisoff, James. 1973. Tonogenesis in Southeast Asia. In Larry M. Hyman (ed.), Consonant type and tones (Southern California Occasional Papers in Linguistics 1), 71–95. Los Angeles: University of Southern California, Linguistics Program.
Matisoff, James. 2003. Aslian: Mon-Khmer of the Malay Peninsula. Mon-Khmer Studies 33. 1–58.
Means, Nathalie. 1999. Temiar-English, English-Temiar dictionary. St. Paul, MN: Hamline University Press.
Means, Nathalie & Paul B. Means. 1987. Senoi-English English-Senoi dictionary. Toronto: Joint Centre on Modern East Asia.
Migliazza, Brian. 1998. A grammar of So: A Mon-Khmer language of northeast Thailand. Thailand: Mahidol University PhD thesis.
Miller, John & Carolyn Miller. 1963. Brou-English-Vietnamese dictionary. Dallas, TX: Summer Institute of Linguistics. (195 pp., microfiche).
Miller, Vera Grace. 1976. An overview of Stiêng grammar. Grand Forks, ND: University of North Dakota MA thesis. Published as Summer Institute of Linguistics Workpapers Vol. 20.
Milne, Leslie. 1921. An elementary Palaung grammar. Oxford: Clarendon Press.
Milne, Leslie. 1931. A dictionary of English-Palaung and Palaung-English. Rangoon: Superintendent, Government Printing and Stationery.
Milner, G. B. & Eugénie J. A. Henderson (eds.). 1965. Indo-Pacific Linguistic Studies (Lingua 14/15). Vol. 2: Descriptive Linguistics. Amsterdam: North Holland.
Mingkwan Malapol. 1989. Pray grammar at Ban Pae Klang, Thung Chang district, Nan province. Thailand: Mahidol University MA thesis.
Mitani, Yasuyuki. 1977. Palaung dialects: A preliminary comparison. Tônan Ajia Kenkyû (South East Asian Studies) 15(2). 193–212.
Mitani, Yasuyuki. 1979. Vowel correspondences between Riang and Palaung. In Theraphan L. Thongkum (ed.), Studies in Thai and Mon-Khmer phonetics and phonology in honour of Eugénie J. A. Henderson, 142–150. Bangkok: Chulalongkorn University Press.
Morizon, René. 1936. Essai sur le dialecte des populations Pears des Cardamomes. Paris: Les Éditions internationales.
Müller, Friedrich Max. 1862. Lectures on the science of language. London: Longman, Green, Longman and Roberts.
Munn, Elizabeth. 2018. A phonological comparison of Eastern Lawa varieties in Hot District, Chiang Mai Province, Thailand. Thailand: Payap University MA thesis.
Nagaraja, K. S. 1989. Austroasiatic languages: A linguistic bibliography. Pune: Deccan College, Post-Graduate and Research Institute.
Nagaraja, K. S. 2010. Austroasiatic languages – An introduction. In K. S. Nagaraja & Kashyap Mankodi (eds.), Austro-Asiatic linguistics: In memory of R. Elangaiyan, 1–32. Mysore: Central Institute of Indian Languages.
Nagaraja, K. S. & Kashyap Mankodi (eds.). 2010. Austro-Asiatic linguistics: In memory of R. Elangaiyan. Mysore: Central Institute of Indian Languages.




Nai Pan Hla. 1988–1989. An introduction to Mon language. Kyoto: Center for Southeast Asian Studies, Kyoto University.
Nai Tun Way. 1997. The modern English-Mon dictionary. New York: Open Society Institute, Burma Project.
Narumol Charoenma. 1980. The sound systems of Lampang Lamet and Wiangpapao Lua. Thailand: Mahidol University MA thesis.
Narumol Charoenma. 1982. The phonologies of a Lampang Lamet and Wiang Papao Lua. Mon-Khmer Studies 11. 35–45.
Nguyễn Hữu Hoành. 1995. Tiếng Katu cấu tạo từ [Katu Language word formation]. Hà Nội: Nhà xuất bản khoa học xã hội.
Nguyễn Phú Phong, Trần Trí Dõi & Michel Ferlus. 1988. Lexique vietnamien – ruc – français. Paris: Université de Paris VII.
Nguyễn Văn Lợi, Ðoàn Văn Phúc & Phan Xuân Thành. 1986. Sách học tiếng Pakôh-Taôih [Text to study the Pacoh and Taoih languages]. Hà Nội: Ủy Ban Nhân Dân.
Nguyễn Văn Lợi. 1993. Tiếng Rục. Hà Nội: Khoa học Xã hội.
Nguyễn Văn Lợi. 2008. Tiếng Mảng. Hanoi: Nhà xuất bản Khoa học Xã hội.
Nguyễn Văn Thanh & Bùi Ðăng Bình. 2011. Tiếng Bhnong [Bhnong language]. T.P. Hồ Chí Minh: Nhà Sách Tổng Hợp.
Nguyễn Văn Thanh & Bùi Ðăng Bình. 2009. Từ Điển Việt – Bhnong [Vietnamese-Bhnong dictionary]. Hanoi: Nhà xuất bản văn học.
Nguyễn Hữu Hoành & Nguyễn Văn Lợi. 1998. Tiếng Katu [The Katu language]. Hà Nội: Nhà Xuất Bản Khoa Học Xã Hội.
Olsen, Neil Hayes. 2014a. A descriptive grammar of Kơho-Sre: A Mon-Khmer language. Salt Lake City: University of Utah PhD thesis.
Olsen, Neil Hayes. 2014b. Kơho-Sre. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 2 vols., 746–788. Leiden & Boston: Brill.
Ostapirat, Weera. 2009. Early and Modern Pearic registers/tones. Paper presented at the 4th International Conference on Austroasiatic Linguistics, Mahidol University, 29–30 October.
Paiboon Duangchand. 1984. A phonological description of the Kensiw language (a Sakai dialect). Thailand: Mahidol University MA thesis.
Parkin, Robert. 1991. A guide to Austroasiatic speakers and their languages (Oceanic Linguistics Special Publications 23). Honolulu: University of Hawaii Press.
Paulsen, Debbie. 1989–1990. A phonological reconstruction of Proto-Plang. Mon-Khmer Studies 18/19. 160–222.
Peiros, Ilia. 1996. Katuic comparative dictionary. Canberra: Pacific Linguistics.
Peterson, Mary M. 1997. Kensiw grammar: Basic declarative, interrogative and imperative propositions. Bangkok: TU-SIL-LRDP Thammasat University.
Phaiboon Duangchan. 1984. A phonological description of the Kensiw language (a Sakai dialect). Thailand: Mahidol University MA thesis.
Phaiboon Duangchan. 2006. Glossary of Aslian languages: The northern Aslian languages of southern Thailand. Mon-Khmer Studies 36. 207–224.
Phillips, Timothy C. 2012. Proto-Aslian: Towards an understanding of its historical linguistic systems, principles and processes. Bangi, Malaysia: Institut Alam Dan Tamadun Melayu Universiti Kebangsaan PhD thesis.
Pinnow, Heinz-Jürgen. 1959. Versuch einer historischen Lautlehre der Kharia-Sprache. Wiesbaden: Otto Harrassowitz.
Ploykaew, Pornsawan. 2001. Samre grammar. Thailand: Institute of Languages and Culture for Rural Development, Mahidol University PhD thesis.

88 

 Paul Sidwell

Pogibenko, T. G. & Bùi Khánh-Thê. 1990. Iazyk Ksingmul. Materialy sovetsko-v’etnamskoj lingvisticheskoj ekspeditsii 1979 goda. Moscow: Nauka. Prasert Sriwises. 1978. Kui (Suai)-Thai-English dictionary. Bangkok: Chulalongkorn University. Preedaporn Srisakorn. 2008. So (Thavung) grammar. Thailand: Mahidol Univesity PhD thesis. Preisig, Elisabeth, Somseng Sayavong & Suksavang Simana’. 1994. Kmhmu’-Lao-French-English dictionary. Vientiane: Ministry of Information and Culture. Proschan, Frank. 1987. Bibliography of Kmhmu (Khmu, Kammu). Manuscript. Washington, DC. Proschan, Frank. 1996. A survey of Khmuic and Palaungic languages in Laos and Vietnam. Pan-Asiatic Linguistics 3. 895–919. Przyluski, Jean 1924. Les langues austroasiatiques. In Antoine Meillet & Marcel Cohen (eds.), Les Langues du Monde (Collection linguistique publiée par la société de linguistique de Paris), 16, 335–403. Paris: Librarie Ancienne Edouard Champion. Renard, Ronald D. & Anchalee Singhanetra-Renard (eds.). 2015. Mon-Khmer peoples of the Mekong region. Chiang Mai: Chiang Mai University Press. Kmhmu’ bibliography at pp. 443–382. Ring, Hiram & Felix Rau (eds.). 2018. Papers from the Seventh International Conference on Austroasiatic Lingusitics (JSEALS Special Publication 3). Manoa: University of Hawaii Press. Rischel, Jørgen. 1989. Can the Khmuic component in Mlabri (“Phi Tong Luang”) be identified as old T’in? Acta Orientalia 50. 79–115. Rischel, Jørgen. 1995. Minor Mlabri: A hunter-gatherer language of Northern Indochina. Copenhagen: Museum Tusculanum Publishers. Rischel, Jørgen. 2007. Mlabri and Mon-Khmer. Copenhagen: Historisk-filosofiske Meddelelser 99. Rojanakul, Nattamon. 2009. Chong syntax. Thailand: Research Institute for Languages and Cultures of Asia, Mahidol University MA thesis. Sakamoto, Yasuyuki. 1994. Mon-Japanese dictionary. Tokyo: University of Foreign Studies. Sakamoto, Yasuyuki. 1996. Japanese-Mon dictionary. Tokyo: University of Foreign Studies. Schebesta, Paul. 1926. 
The jungle tribes of the Malay Peninsula (translated by C. O. Blagden). Bulletin of the School of Oriental Studies, London Institution 4. 269–278. Schebesta, Paul. 1926–1928. Grammatical sketch of the Jahai dialect, spoken by a Negrito tribe of Ulu Perak and Ulu Kelantan, Malay Peninsula (translated by C. O. Blagden). Bulletin of the School of Oriental Studies, London Institution 4. 803–826. Schebesta, Paul. 1931. Grammatical sketch of the Ple-Temer language (translated by C. O. Blagden). Journal of the Royal Asiatic Society 1931. 641–652. Schebesta, Paul. 1952–1957. Die Negrito Asiens: Geschichte, Geographie, Umwelt, Demographie und Anthropogie der Negrito, vol. 1, vol. 2, parts 1 and 2. Wien-Mödling: Anthropos Institut, St. Gabriel Verlag. Schiering, René, Balthasar Bickel & Kristine Hildebrandt. 2010. The prosodic word is not universal, but emergent. Journal of Linguistics 46. 657–709. Schmidt, Wilhelm. 1901. Die Sprachen der Sakai und Semang auf Malacca und ihr Verhältnis zu den Mon-Khmer-Sprachen. Bijdragen tot de Taal-, Land-, en Volkenkunde van Nederlandsch-Indië 52. 399–583. Schmidt, Wilhelm. 1903. The Sakai and Semang languages in the Malay Peninsula and their relation to the Mon-Khmer languages. Journal of the Straits Branch of the Royal Asiatic Society 39. 38–45. Schmidt, Wilhelm. 1904. Grundzüge einer Lautlehreder Khasi-Sprache in ihren Beziehungen zu derjenigen der Mon-Khmer-Sprachen. Mit einem Anhang: die Palaung-Wa-, und Riang-Sprachen des mittleren Salwin. Abh. Bayrischen Akademie der Wissenschaft 1.22.3. 677–810. Schmidt, Wilhelm. 1905. Grundzüge einer Lautlehre der Mon-Khmer-Sprachen. Denkschrift der Akademie der Wissenschaften, Wien, Philologisch-Historische Klasse 51. 1–233.



History of MSEA Austroasiatic studies 

 89

Schmidt, Wilhelm. 1906. Die Mon-Khmer-Völker, ein Bindeglied zwischen Völkern Zentralasiens und Austronesiens. Archiv für Anthropologie, Braunschweig 5. 59–109. Scott, James George & John Percy Hardiman. 1900. Gazetteer of Upper Burma and the Shan States 1.1. Rangoon: Superintendent, Government Printing. Sebeok, Thomas A. 1942. An examination of the Austro-Asiatic Language family. Language 18. 206–217. Shafer, Robert. 1952. Études sur l’Austroasian. Bulletin de la Société de Linguistique de Paris 48. 111–158. Shorto, Harry L. (ed.). 1963. Linguistic comparison in South-East Asia and the Pacific. London: University of London. School of Oriental and African Studies. Shorto, Harry L. 1957. Palaung wordlist (based on material collected from Paw´shwe Kya, Namhsan, MS. Sept–Oct.) Published in facsimile 2013. https://openresearch-repository.anu.edu.au/ handle/1885/9782 (last accessed 12 January 2021). Shorto, Harry L. 1960. Word and syllable patterns in Palaung. Bulletin of the School of Oriental and African Studies 23. 544–557. Shorto, Harry L. 1962. A dictionary of Modern Spoken Mon. London: Oxford University Press. Shorto, Harry L. 1963. The structural pattern of northern Mon-Khmer languages. In Harry L. Shorto (ed.), Linguistic comparison in South-East Asia and the Pacific, 45–61. London: University of London, School of Oriental and African Studies. Shorto, Harry L. 1965. The interpretation of archaic writing systems, illustrated by the analysis of the phonological systems in early Mon dialects. In G. B. Milner & Eugénie J. A. Henderson (eds.), Indo-Pacific Linguistic Studies 1, 88–97. Amsterdam: North-Holland. Shorto, Harry L. 1971. A dictionary of the Mon inscriptions, from the sixth to the sixteenth centuries, incorporating materials collected by the late C.O. Blagden. London: Oxford University Press. Shorto, Harry L. 1976. The vocalism of Proto-Mon-Khmer. In Philip N. Jenner, Laurence C. 
Thompson & Stanley Starosta (eds.), Austroasiatic studies (Oceanic Linguistics, Special Publication 13), Part II, 1041–1067. Honolulu: University of Hawaii. Shorto, Harry L. 2006. A Mon-Khmer comparative dictionary. Canberra: Pacific Linguistics. Shorto, Harry L. ms. (circa. 1980). Draft Proto-Katuic reconstruction. https://drive.google.com/ file/d/1UiH8l9suYtGPdpgggmWcHm4v1hDgkuil/view?usp=sharing (last accessed 6 January 2021). Shorto, Harry L., Judith M. Jacob & E. H. S. Simmonds (eds.). 1963. Bibliographies of Mon-Khmer and Tai linguistics. London: Oxford University Press. Si, Aung. 2014. Danau. In Mathias Jenny & Paul Sidwell (eds.). 2014. The handbook of Austroasiatic languages, 2 vols., 1106–1141. Leiden & Boston: Brill. Sidwell, Paul & Andrew Hsiu. 2019. Pakanic tonogenesis in areal and etymological perspective. Paper presented at The 52nd International Conference on Sino-Tibetan Languages and Linguistics, Sydney, 24–26 June. Sidwell, Paul & Felix Rau. 2014. Austroasiatic comparative-historical reconstruction: An overview. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 221–362. Leiden & Boston: Brill. Sidwell, Paul & Pascale Jacq. 2003. A handbook of comparative Bahnaric: Volume 1, West Bahnaric. Canberra: Pacific Linguistics. Sidwell, Paul. 1999. A reconstruction of Proto-Bahnaric. Melbourne: University of Melbourne PhD thesis. Sidwell, Paul. 2000, Proto South Bahnaric: A reconstruction of a Mon-Khmer language of Indo-China. Canberra: Pacific Linguistics. Sidwell, Paul. 2002. Genetic classification of the Bahnaric languages: A comprehensive review. Mon-Khmer Studies 32. 1–24.

90 

 Paul Sidwell

Sidwell, Paul. 2005. The Katuic languages: Classification, reconstruction and comparative lexicon. Munich: Lincom Europa. Sidwell, Paul. 2006. Preface. In Harry L. Shorto, A Mon-Khmer comparative dictionary, vii–xxv. Canberra: Pacific Linguistics 579. Sidwell, Paul. 2009a. Classifying the Austroasiatic languages: History and state of the art. Munich: Lincom Europa. Sidwell, Paul. 2009b. Proto-Mon-Khmer vocalism: Moving on from Shorto’s “alternances”. Journal of the Southeast Asian Linguistics Society 1. 205–214. Sidwell, Paul. 2010. Comparative Mon-Khmer linguistics in the 20th century: Where from, where to? In K. S. Nagaraja (ed.), Austro-Asiatic linguistics: In memory of R. Elangaiyan (Proceedings of the 3rd International Conference on Austroasiatic Languages), 38–104. Mysore: Central Institute of Indian Languages. Sidwell, Paul. 2011. Proto-Bahnaric. http://sealang.net/monkhmer/database/ (last accessed 12 January 2021). Sidwell, Paul. 2013. Proto-Khmuic. http://sealang.net/monkhmer/database/ (last accessed 12 January 2021). Sidwell, Paul. 2014a. Austroasiatic classification. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 144–220. Leiden & Boston: Brill. Sidwell, Paul. 2014b. Khmuic classification and homeland. Mon-Khmer Studies 43(1). 47–56. Sidwell, Paul. 2015. The Palaungic languages: Classification, reconstruction and comparative lexicon. Munich: Lincom Europa. Skeat, Walter Willian & Charles Otto Blagden. 1906. Pagan races of the Malay Peninsula, 2 vols. London: Macmillan. Smalley, William A. 1956. Outline of Khmu structure. New York: Columbia University PhD dissertation. Smalley, William A. 1961. Outline of Khmu’ structure (Journal of the American Oriental Society Essay 2). New Haven: American Oriental Society. Smalley, William A. 1973. Bibliography of Khmuʔ Mon-Khmer Studies 4. 23–32.  Smith, Kenneth & Paul Sidwell. 2014. Sedang. 
In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 2 vols., 789–833. Leiden & Boston: Brill. Smith, Kenneth, D. 1979. Sedang grammar. Canberra: Pacific Linguistics. Smith, Kenneth. 1972. A phonological reconstruction of Proto-North-Bahnaric (Language Data, Asian-Pacific Series 2). Santa Ana: Summer Institute of Linguistics. Sokolovskaia, Natalia Ksenofontovna & Nguyễn Văn Tài. 1987. La langue muong: Materiaux de l’expedition linguistique Sovieto-Vietnamienne de 1979. Moskva: Nauka. Solntseva, Nina V. & Nguyễn Văn Lợi. 2001. Язык рук. Материалы российско-вьетнамской лингвистической экспедиции. Вып. 4. Издат. Москва: фирма «Восточная литература» РАН. Sophana Srichampa, Paul Sidwell & Kenneth Gregerson (eds.). 2011. Austroasiatic studies: Papers from ICAAL4. Mon-Khmer Studies Journal Special Issue No. 3, 2 vols. Dallas: SIL International; Salaya: Mahidol University; Canberra: Pacific Linguistics. Sujaritlak Deepadung & Pattama Patpong. 2010. Dara’ang: Language, culture and ethnic identity maintenance at the Thai-Myanmar border. [Research report]. Thailand: Research Institute for Languages and Cultures of Asia, Mahidol University. (In Thai). Sujaritlak Deepadung, Ampika Rattanapitak & Supakit Buakaw. 2015. Dara’ang Palaung. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 1065–1103. Leiden & Boston: Brill. Sujaritlak Deepadung. 2009. Ethnicity and the Dara-ang (Palaung) in Thailand. Journal of Language and Culture (Research Institute for Languages and Cultures of Asia, Mahidol University) 28(1). 7–30.



History of MSEA Austroasiatic studies 

 91

Supakit Buakaw. 2012. A phonological study of Palaung dialects spoken in Thailand and Myanmar, with focuses on vowels and final nasals. Thailand: Mahidol University PhD thesis. Surekha Suphanphaiboon. 1982. The phonological system of Chong language, Muban Takhianthong, Tambon Takhianthong, Amphoe Makham, Changwat Chanthaburi. Bangkok: Srinakharinwirot Prasarnmit University MA thesis. (In Thai). Suriya Ratanakul & Vinya Sysamouth. 1997. Lawa-English dictionary. Parkville & Salaya: Dept. of Linguistics and Applied Linguistics, University of Melbourne and Institute of Language and Culture for Rural Development, Mahidol University. Suwilai Premsrirat & Nattamon Rojanakul. 2014. Chong. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 2 vols., 603–640. Leiden & Boston: Brill. Suwilai Premsrirat. 2008. Chong dictionary. Mahidol: Center for Revitalization of Endangered languages, Institute of Language and Culture for Rural Development. Svantesson, Jan-Olof & Arthur Holmer. 2014. Kammu. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 2 vols., 957–1002. Leiden & Boston: Brill. Svantesson, Jan-Olof, Kàm Ràw (Damrong Tayanin), Kristina Lindell & Håkan Lundström. 2014. Dictionary of Kammu Yùan language and culture (NIAS Reference Library 6). Copenhagen: NIAS Press. Svantesson, Jan-Olof. 1988. U. Linguistics of the Tibeto-Burman Area 11(1). 64–133. Svantesson, Jan-Olof. 1989. Tonogenic mechanisms in Northern Mon-Khmer. Phonetics 46. 60–79. Svantesson, Jan-Olof. 1991, Hu – A language with unorthodox tonogenesis. In Jeremy H. C. S. Davidson (ed.), Austroasiatic languages, essays in honour of H. L. Shorto, 67–80. London: School of Oriental and African Studies, University of London. Tandart, Sindulphe Joseph. 1910–1911. Dictionnaire français-cambodgien. Hong Kong: Imprimerie de la Société des Missions Etrangès. Tandart, Sindulphe Joseph. 1935. Dictionnaire cambodgien-français. Phnom Penh: Imprimerie Albert Portail. 
Tao Chengmei 陶成美. 2016. Bulangyu Lawahua de zhicheng daici i55 布朗语拉瓦话的指称代词 i55. Minzu Fanyi 民族翻译 1. 68–74. Tharaud, L. 1904. Les provinces du Tonkin: Hung Hoa. Revue Indochinoise 2. 174–187, 282–289, 358–369. Thomas David. 1964. A survey of Austro-asiatic and Mon-Khmer comparative studies. Mon-Khmer Studies 1. 49–163. Thomas, David & Marilyn Smith. 1967. Proto-Jeh-Halang. Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung 20. 157–175.  Thomas, David D. 1962. On defining the “word” in Vietnamese. Van-hóa Nguyêt-san 11(5). 519–523. Thomas, David D. 1971. Chrau grammar (Oceanic Linguistics Special Publication 7). Honolulu: University of Hawaii Press. Thomas, David D. & Robert K. Headley. 1970. More on Mon-Khmer subgroupings. Lingua 25. 398–418. Thomas, Dorothy M. 1967. A phonological reconstruction of Proto-East-Katuic. Grand Forks: University of North Dakota MA thesis. [Published 1976, University of North Dakota Work Papers 20, supplement 4). Thompson, Laurence C. 1967. The history of Vietnamese finals. Language 43(1). 362–371.  Thompson, Laurence C. 1976. Preface. In Philip N. Jenner, Laurence C. Thompson & Stanley Starosta (eds.), Austroasiatic studies (Oceanic Linguistics, Special Publication 13), 2 vols, ix–xii. Honolulu: University of Hawaii. Thompson, Laurence C. 1976. Proto-Viet-Muong phonology. In Philip N. Jenner, Laurence C. Thompson & Stanley Starosta (eds.), Austroasiatic studies, vol. 2, 1113–1204. Honolulu: University of Hawaii Press.

92 

 Paul Sidwell

Thompson, Laurence C. 1984–1985. A Vietnamese grammar. Mon-Khmer Studies Journal 13/14. 1–367. Thongkham, Noppawan. 2003. The phonology of Kasong at Khlong Saeng Village, Danchumphon Sub-District, Bo Rai District, Trat Province. Thailand: Mahidol University MA thesis. Trần Trí Dõi. 2011. Giáo trình lịch sử tiếng Việt [A textbook of the history of Vietnamese]. Hanoi: Nhà Xuất Bản Giáo Dục Việt Nam.  Trần Trí Dõi. 2018. Về vấn đề nguồn gốc của tiếng Việt [On the question of the origin of Vietnamese]. In Đinh Văn Đức (ed.), Tiếng Việt Lịch Sử: Một Tham Chiều Hồi Quan [Vietnamese language history: A reflective reference], 13–86. Hanoi: Nhà Xuất Bản Văn Học. Unchalee Singnoi. 1988. A comparative study of Pray and Mal phonology. Thailand: Mahidol University MA thesis. Ungsitipoonporn, Siripen. 2001. A phonological comparision between Khlong Phlu Chong and Wangkraphrae Chong. Thailand: Institute of Language and Culture for Rural Development, Mahidol University MA thesis. van Driem, George. 2001. Languages of the Himalayas: An ethnolinguistic handbook of the Greater Himalayan Region: Containing an introduction to the symbiotic theory of language. Leiden: Brill. von Hevesy, Wilhelm. 1928. Munda-Magyar-Maori: An Indian link between the antipodes; new tracks of Hungarian origins. London: Luzac. von Hevesy, Wilhelm. 1930. On Wilhelm Schmidt’s Munda–Mon-Khmer comparisons (Does an “Austric” family of languages exist?). Bulletin of the School of Oriental Studies 6(1). 187–200. von Hevesy, Wilhelm. 1932. Finnish-Ugrische aus Indien: Es gibt keine austische Sprachfamilie – das vorarische Indien reilweise finnish-ugrisch. Wien: Manzsche Verlags- und Universitätsbuchhandlung. von Hevesy, Wilhelm. 1934. A false linguistic family “The Austro-Asiatic”. Journal of the Bihar and Orissa Research Society 20(3/4). 251–259. Vương Hoàng Tuyên. 1963. Các dân tộc nguồn gốc Nam Á ở Tây Bắc Việt Nam. Hà Nội: Nxb Giáo dục. Watkins, Justin. 2013. Dictionary of Wa, 2 vols. Leiden & Boston: Brill. 
Witzel, Michael. 1999a. Early sources for South Asian substrate languages. Mother Tongue Special Issue Oct. 1–76. Witzel, Michael. 1999b. Substrate languages in Old Indo-Aryan (Rgvedic, Middle and Late Vedic). Electronic Journal of Vedic Studies 5(1). 1–67. Wu Zili. 1992. Guangnan Ben’ganyu Chutan [An initial description of Bangan]. Yunnan Minzu Yuwen 4. Yan Qixiang [颜其香] & Zhou Zhizhi [周植志]. 2012. Mon-Khmer languages of China and the Austroasiatic family [中国孟高棉语族语言与南亚语系]. Beijing: Social Sciences Academy Press [社会科学文献出版社]. Yan Qixiang [颜其香], Zhou Zhizhi [周植志], Li Daoyong [李道勇] et al. 1981. Pug lai cix ding yiie sindong lai Vax mai lai Hox: 佤汉简明词典 [A concise dictionary of Wa and Chinese]. 昆明: 云南民族出版社 [Kunming: Yunnan Nationalities Publishing House]. Young, M. Vincent & Yaw Su (translators). 1938. Hpuk lai sigang si siyeh pa hkrao [The New Testament]. Rangoon: British and Foreign Bible Society, American Baptist Mission.

Paul Sidwell and Mathias Jenny

6 History of Tai-Kadai studies

6.1 Introduction

The Tai-Kadai languages, and in particular members of the Southwestern Tai (SWT) branch (Thai, Lao, Shan, and their local varieties), have tremendous social importance in MSEA. Since they spread out from southern China and over much of the MSEA region – largely over the 2nd millennium – these languages have ascended to become major vectors for cultural, economic and political life. It can even be argued that the spread and ascendancy of SWT has substantially conditioned the emergence of the SE Asian Linguistic Area, as many other languages converge on prototypical Tai patterns of word structure, tonality, syntax, etc.1

By the beginning of the 21st century we can say that Tai-Kadai linguistics has achieved a high level of descriptive coverage and analytical insight within MSEA, and is making rapid progress in regard to the languages of southern China. This includes excellent linguistic grammars of the more important languages and significant insights into language history, tonology, and language contact phenomena.

It is now widely recognized that the Tai-Kadai family originated in southern China, and it is there that its greatest diversity remains. Only the Tai sub-family (with Northern, Central and Southwestern groups) and small Kra-speaking communities are found within the area we recognize as MSEA, and consequently the main thrust of this chapter is on the history of Tai linguistics, with only marginal reference to other Tai-Kadai (sub)branches (Be, Hlai, Kam-Sui, etc.).

In this chapter we provide a historiographic overview of Tai-Kadai linguistic studies relevant to MSEA, with most emphasis given to the major languages. Regrettably we do not have space to do justice to the full range of relevant work which has been done in Chinese, Vietnamese, Thai, or Burmese, although much is recognized in the references and referenced works. Additionally, please note that while we use the term Tai-Kadai throughout this chapter to refer to the language family, there is a history of alternate naming preferences, with scholars variously using Kadai, Kra-Dai, Daic and similar terms for the family or various subgroup proposals (see Norquest chapter 13 for more discussion).

1 The authors would like to express thanks to Pittayawat Pittayaporn (Joe) for useful suggestions and comments in the writing of this chapter. https://doi.org/10.1515/9783110558142-006


6.2 Tai in Thailand and beyond

6.2.1 Thai

Not surprisingly, Thai is the Tai language with the longest-standing and strongest history in linguistic research and description. Diller (2008) provides a wide-ranging overview of resources for Thai language research, including a 27-page bibliography.

The Cindamani (ca. 1670), traditionally assigned to Horathibodi, a prominent astrologer, poet and royal teacher at the court of King Narai at Ayutthaya, is an early account of the Thai language written in Thai. It is modelled after Indian works on language, mostly written in verse, and deals mainly with Thai poetry and rhyme patterns, as well as tones and orthography, making it an important source for early modern Thai. The Cindamani first appeared in print in Bangkok in 1870 and was subsequently published several times (e.g. Fine Arts Department 1971). A modern analysis of the text is given by Panarut (2015, 2018), and an analysis of 17th century Thai tones on the basis of the Cindamani is offered by Pittayaporn (2016).

Sporadic mention of the Thai language is made by early western missionaries and travelers, and a few Thai words can be found in their accounts of the country and the people, but no grammatical description or vocabulary survives from before the 19th century. There are apparent mentions of Thai grammars and dictionaries by French missionaries in the 17th century, but nothing concrete survives. The first attested grammar of Thai in a western language is probably Low (1828), followed by Jones’s (1842) Brief grammatical notices, which includes exercises and numerous texts in Thai prose and poetry, as well as an appendix giving rules for spelling Hebrew and Greek names in Thai script. Only with Pallegoix (1850a) does Thai get a full systematic description (in Latin), together with a Latin-Thai dictionary (Pallegoix 1850b), followed in 1854 by a Thai-Latin-French-English dictionary. Pallegoix reportedly was close to the future King Mongkut (Rama IV), then still in the monkhood, who was very fond of European culture and languages, and was keen on standardizing Thai as a “civilized” language according to European models. This was seen at the time as one important factor in keeping Siam an independent kingdom between British-controlled Burma to the west and French-controlled Indochina to the east (see also Winichakul 1994).

The Thai national agenda, which started with the modernization of the language (among other things) in the mid-19th century, saw the establishment of Thai as a full-fledged civilized national language (see Smalley 1994 for an extensive discussion). Prescriptive treatises, like the làk pʰaːsǎː tʰaj ‘fundaments of the Thai language’ (e.g. the 1987 edition by Thonglor), naturally followed from this endeavor and formed the basis for teaching Thai throughout the 20th century in Siam/Thailand. With Thai as the only national language, as well as a means of ensuring the “Thainess” of the people, local varieties were at different points actively suppressed by the central administration. Only towards the mid-20th century did interest in non-Thai Tai varieties gain prominence, partly along with imperialistic ideas of uniting the Tai people in a Tai/Thai kingdom (hence the official change of the country’s name from Siam to Thailand in 1939).2

From the 1940s to the 1960s Mary Haas made a particularly influential contribution to Thai linguistics and to the teaching of Thai internationally. Her work on Thai began with the need to produce instructional materials for the US military forces in wartime (Haas 1942, 1945), and led to a grammar of Thai (Haas 1956b), numerous manuals including a guide to Thai writing (Haas 1956a), and ultimately a dictionary of Thai for students (Haas 1964) that proved tremendously impactful, in part due to its use of a system of phonetic transcription that is still used by many as a standard for Thai. Haas also wrote on diverse linguistic aspects of Thai such as classifiers, reduplication, tones, word play and others.

Noss’s (1964) Reference grammar of Thai is still unsurpassed in its detailed description of Thai, including pragmatically and intonationally conditioned phonetic nuances. The language in this excellent, if not easily accessible, grammar is rather natural colloquial Thai as spoken in Bangkok.

From the 1980s, Diller (1985, 1993, 2001, 2006) conducted research on different aspects of Thai, including diglossia and its development and role in Thailand. In addition to a prolific research output, Diller headed the National Thai Studies Centre at the Australian National University for many years.

Iwasaki and Ingkaphirom (2005), one of the most recent comprehensive grammars of Thai, is based on real-life corpora and represents the everyday language, rather than a “correct” form as propagated by Thai government bodies. The analysis is done according to modern linguistic practice, which makes the text accessible to language learners as well as linguistic typologists. Together with Noss (1964), this reference grammar is among the most valuable sources for present-day Thai.

More descriptive/functional and theoretical approaches to all aspects of Thai have been pursued by Thai and other scholars since the late 20th century, for example Warotamasikkhadit’s generative analysis of Thai syntax (1972), alongside the linguistic activities of Thai language departments at universities in Thailand, among which Chulalongkorn University and Mahidol University are probably the most active and well-known internationally.

2 The name was changed back to Siam between 1946 and 1948, after which “Thailand” has remained the official name.


6.2.2 Lao/Isaan

Enfield (2008) offers an informative historiography of Lao linguistics. Broadly speaking, Lao was hardly recognized as a language distinct from Thai until the mid-20th century, being often referred to as Thai Noi (‘Little Thai’), and neglected for official purposes in Laos in favor of French. The situation changed with the emergence of Lao nationalism in the 1930s, and concern quickly grew for describing and standardizing Lao as an emergent national standard. The first important grammar of Lao is Hospitalier (1937, in French), written in traditional European terms, with many constructed examples translating French grammatical categories. Once the country achieved self-government in 1949, Lao intellectuals split between those who would base the language on the speech and writing of the (largely Buddhist) educated elite and those who favored a rural demotic with simple spelling conventions. The former is reflected in the grammars of Viravong (1962), Royal Lao Government (1972) and Nginn (1984), and the latter by the grammar of Vongvichit (1967), with the latter winning out after 1975. In terms of modern linguistic grammars, the most important recent works are Rehbein and Sayaseng (2004, in German) and Enfield (2007).

It has to be said that linguists appeared to pay scarce attention to Lao until recently. Roffe (1946) considered Lao phonemic structure; Morev et al. (1972, in Russian) is a grammar sketch of Lao; some papers considered aspects of Lao grammar (e.g. Crisfield 1974; Honts 1979); some particular attention was paid to expressives in Lao (Crisfield 1978; Trongdee 1996; Wayland 1996); and there is a dissertation on Lao tone by Osatanada (1997). More recently Enfield has written prolifically on linguistic aspects of Lao, and extensive listings can be found in Enfield (2008) and his University of Sydney profile.

Demographically the greatest number of Lao speakers live in Northeastern Thailand, but everyday language use was historically discouraged by authorities, and their vernacular is officially regarded as Isaan Thai. Nonetheless, some Thai linguists have shown interest and produced various studies (e.g. Prakhong 1976; Premchu 1979; Luang-Thongkhum 1979), as well as a comprehensive Isaan-Thai-English dictionary of over 1,000 pages (Phinthong 1989).

6.2.3 Shan

Shan first received some attention from western officers and missionaries in the late 19th and early 20th centuries as the major language used in the British-controlled Shan States. Most important among these is J. N. Cushing, who published two editions of his Grammar of the Shan language (1871, 1887), as well as a learners’ Handbook of Shan (1880), followed by a comprehensive Shan-English Dictionary (1914). The dictionary gives all entries in traditional Shan script with additional indications of vowel quality and tones, which are only partially expressed in the pre-reformed orthography. Despite the relative political and cultural importance of Shan, including its long-standing role as a lingua franca in much of northern Burma, it disappeared from the radar of linguistics for several decades.

Young’s Shan chrestomathy (1985) was a first step towards making Shan data available to typologically oriented linguists, though its main focus is philological rather than general linguistic. Glick and Sao Tern Moeng’s Shan for English speakers (1991) is the first modern textbook for Shan in English. Though not a reference grammar, the volume of over 700 pages is a rich source for Shan data, including conversations, narratives, and grammatical explanations. The Shan text is given in transcription, with the reformed Shan orthography (marking all vowel qualities and tone distinctions) of all texts and vocabulary entries in the appendix. Sao Tern Moeng (1995) is a remake of Cushing’s classic Shan-English dictionary, this time in the reformed orthography, and includes a number of new lexemes. The history of the different Shan and related scripts is presented in Sai Kam Mong (2004), providing a comprehensive overview of the development and uses of Shan writing. Edmondson (2008) provides a sociolinguistic and dialectal overview of Shanic languages across northern Myanmar, based on a number of phonological features and sound changes.

Although the above-mentioned publications, together with a moderate printing and publishing activity in Shan (both on paper and online), provide abundant source material for linguistic studies, not much has been done in recent years in terms of Shan studies. Some local Shanic varieties are the subject of sporadic studies and publications, but to date there is no comprehensive Shan grammar in modern linguistic terminology.

6.2.4 Zhuang Zhuang is the largest linguistic minority in the PRC with some 18 million speakers; at least 16 named varieties are distinguished in the literature, while most are spoken in Southern China some are also found in Vietnam. Officially, Zhuang dialects are divided into two groups labeled Southern and Northern Zhuang, with Southern Zhuang falling into the Central Tai branch, and Northern Zhuang into the Northern Tai branch. However, this classification is open to challenge, with ambiguity as to whether some Zhuang varieties are clearly northern or southern, and it is safer to regard Zhuang as a dialect continuum. Chinese studies of the Zhuang people and their language go back to the 1950s, but western interest in the language only increased in the late 20th and early 21st century, with an in-depth research in the traditional Zhuang script (Holm 2013) and sketch grammars of a Northern Zhuang variety (Luo 1991, 2008). Sybesma (2008) investigates Chinese influence in Zhuang and places the language in a general areal perspective. Luo (2008) also provides a broad overview of Zhuang varieties, geography, history and writing. Dictionaries of


 Paul Sidwell and Mathias Jenny

several Zhuang varieties are available in Chinese, Thai, and English (e.g. Hudak and Fiedler 2004).

6.2.5 Other Tai languages

Morey (2005a, 2008) provides an overview and comparative grammar of the Tai languages of Assam (NE India), focusing on Aiton, Khamti, and Phake, of which there are only some thousands of active speakers (with some language revival activities being pursued). Several journal articles by Morey (e.g. 2005b, 2006) are dedicated to different aspects of the NE Indian Tai languages. Morey has been instrumental in transforming linguistics in NE India, playing a founding role in the Northeast Indian Linguistics Society, among other activities. He has put significant effort into preserving and documenting the written legacy of Tai languages in that region, and his pioneering 2005 monograph is accompanied by a CD of texts and linked sound files.

The extinct Tai language of the former Ahom kingdom in Assam has attracted the interest of local Indian, Thai and western scholars. The main text in the language, the Ahom Buranji ‘History of the Ahom’, was published in English and Ahom (Barua 1930) and in Ahom and Assamese, though the careless edition of the Ahom text and the often inaccurate English translation make this publication problematic as a source for linguistic studies. A much more thoroughly researched and elaborate edition is Wichasin (1996), which presents the text in Ahom script, transliteration in Thai, comparison with Thai cognates, and Thai translation. Complete with an Ahom-Thai vocabulary list (in Thai script) which refers to occurrences in the text, this is a valuable resource for Ahom studies, though the fact that it is only accessible through Thai reduces its usefulness for the non-Thai linguistic community. Terwiel and Wichasin (1992) provides a shorter text in Ahom, with English transliteration and translation, and a vocabulary list.
Apart from the editions of primary texts in Ahom, only a few linguistic studies of the language are available, among them the recent publications by Jacquesson (2010) and Morey (2011). The former is an edition of Phukan’s unfinished manuscript of a textbook for Ahom, the latter a grammar sketch of the language, the only one of its kind to date. In recent years, Ahom studies have led to efforts at language revival, but not necessarily to more general linguistic interest. Morey has an Ahom online dictionary project.3

The situation with Tai languages of Assam has parallels in many ways with other SWT languages spoken across Myanmar, northern Laos and Vietnam: many have seen the publication of texts and a few dictionaries (e.g. Don et al. 1989 for Tai Dam; Hanna 2012 for Dai Lue; Luo 1999 for Dehong), but are less often subject to linguistic investigation. Ferlus (2008, in French) surveys Tai varieties spoken in Nghệ An Province (Vietnam), comparing phonologies and lexicon, and although he provides some short textual samples he is less concerned with documenting the grammar. Although most of the Tai varieties of the region are quite closely related to each other and may not offer much new insight in terms of reconstructing proto-Tai, there is much interest in their more recent development from an areal and sociolinguistic perspective. The dialect continuum ranging from northern Thailand (Lanna, Kammueang) through the eastern Shan State of Kengtung (Khuen) to Xishuangbanna in Yunnan (Lue) today stretches over three countries, which is reflected in recent loanwords from the different dominant languages, namely Thai, Burmese/Shan, and Chinese, respectively, leading to divergence among the three varieties.

The Saek/Sek language, spoken by small communities in Nakhon Phanom Province (Thailand) and Khammouane Province (Laos), is somewhat anomalous. The language was first identified as Tai by Fraisse (1950), and subsequently documented in various studies (e.g. Gedney 1970, 1993; Khanittanan 1976). Saek speakers appear to have migrated out of Guangxi several hundred years ago, as the language has distinct Northern Tai features, and archaisms such as allowing for lateral /-l/ codas. The history of migration and contact with other languages has produced confusing correspondences which have been the subject of some discussion (Haudricourt 1958; Gedney 1989; Kosaka 1992; Chamberlain 1998).

Somsonge Burusphat and her team at the Institute of Language and Culture for Rural Development, Mahidol University, have long-standing projects on local Tai languages, with a special focus on the documentation of oral literature. A number of theses on Tai languages in MSEA have been produced in the last decades at Mahidol University, making it an important center of Tai studies in the region.

3 http://www.sealang.net/ahom/ (last accessed 4 December 2020).

6.3 Comparative Tai

The first short wordlists and phrases of several Tai languages of India appear in Campbell (1874), supplemented with a comparison with Shan, but these offer no more than impressionistic glimpses of Ahom, Aiton, and Khamti. As the author states in the introduction:

I do not purpose to attempt here any comparisons of the languages shown. I have neither the time nor the ability to do so. Fortunately the language-specimens were obtained before famine came upon us, and now that the printing is completed, I issue them with the briefest possible note. I will only mention one or two salient features in the classification of the non-Aryan tribes of these territories, which the specimens render self-evident. (Campbell 1874: 2)

Campbell’s publication was followed in 1904 by Grierson’s more comprehensive Linguistic survey of India, volume two of which is dedicated to the “Mon-Khmer and Siamese-Chinese families”. Grierson’s survey is a large step forward from Campbell,


not only providing standardized vocabularies and phrases, but also short grammar sketches and lists of references for the languages included. De Lacouperie (1887) compares the structures and vocabularies of several languages in Southern China and adjacent areas, including Siamese. He attempts a classification of the languages based on structural and lexical similarities and differences, combining (or mixing) what today would be distinct methods of areal-typological and historical-comparative research. In 1923, Dodd published his pioneering work The Tai race, elder brother of the Chinese, presenting an overview of Tai peoples in MSEA and China. The volume provides a short comparative Tai vocabulary list, unfortunately with many gaps and unsystematic transcription of the words in several Tai varieties. The author gives the following “key to the Romanization of Lao [= Tai] words in this work”:

In general, no attempt has been made to indicate Lao tones, fine vowel distinctions, and aspirations of consonants. (Dodd 1923: xxv)

Despite these and other methodological shortcomings, Dodd deserves the honor of compiling the first broad account of the Tai groups and their histories, customs, and languages. The problem of reconstructing the history of Tai was first seriously tackled by Haudricourt (1952, 1956), who investigated the comparative reconstruction of syllable onsets of “thaï commun” (proto-Tai). Strikingly, Haudricourt proposed various onset clusters and disyllabic forms to explain problematic correspondences. This aspect of Haudricourt’s work has influenced subsequent reconstructions and efforts to link Tai-Kadai to other language families, especially Austronesian, into recent decades (see Section 5). For example, Haudricourt’s discussion implies forms *bulanA ‘moon’ and *manukD ‘bird’, with clearly Malay prototypes in mind (compare Li 1977 *ˀblɯənA, *nrokD for the same etyma). Haudricourt is also well known for his pioneering work on tonogenesis, and abstracted the basic paradigm of proto-Tai having A (mid), B (rising), C (falling) tones in unchecked syllables and an unmarked D tone in checked syllables. Nishida (1954, in Japanese) focused on the comparative study of Tai tones, with tentative proto-onsets discussed as part of the analysis.

J. Marvin Brown’s (1965) From Ancient Thai to Modern Dialects is a foundational work whose impact on the field is hard to overestimate. While the title suggests a historical reconstruction, the work is more a phonological characterization of SWT varieties throughout Thailand and Laos, laying out segmental, tonal, and syllabic correspondences. The analysis and presentation are a type example of applying phonological theory tailored to the typology of the languages: all surveyed lects are described in terms of syllable components rather than phonemes (i.e. onsets, nuclei, codas), and tones are discussed in articulatory and acoustic terms and phonologically characterised in the tone-box format that was later so popularized by Gedney (1972) that the boxes came to be known as Gedney boxes. While Brown’s analysis of tonemes based on five




tones (three in live syllables, two in dead syllables)4 and three laryngeal onset settings is perfectly adequate, his detailed description of actual tonal values of the tonemes in older stages of the varieties is more problematic. The phonetic realization of tones is notoriously fluid and reconstructing historical values is not an easy task. Brown does not give a rationale for the suggested historical phonetic tones, so these must be seen as hypothetical, at best. He does offer a reconstruction of “ancient Thai” vocabulary and phonology based on a diagnostic vocabulary of several hundred words, but this was done without proper reference to language sub-grouping or relevant spoken lects outside of the data collection area. We can fairly say that the era of modern comparative work on Tai languages really begins with Brown.

William J. Gedney built on Brown’s analytical legacy with extensive field work that saw him produce text collections and dictionaries of numerous Tai languages plus comparative-historical analyses. Gedney refined the tone-box methodology, creating a short diagnostic list for quick survey and substantially building our knowledge of Tai tonal systems. Gedney’s pioneering comparative work was edited by Hudak (2008) as William J. Gedney’s comparative Tai source book, which gives phonological overviews of several languages in each of the three established subgroups, Southwestern Tai, Central Tai, and Northern Tai. The description covers segmental and tone inventories, tone categories, as well as syllable structures. The bulk of the 220-page volume is a list of some 1,100 cognates in Tai languages. No proto-Tai reconstruction is attempted; only the tone category of the proto-form is indicated for each entry, with cognate forms listed according to subgroups. An English index makes Gedney’s book a convenient source for quickly checking cognates in the family.
Comparative Tai reached an important milestone with Sarawit (1973), which proposed a proto-Tai vowel system with seven long and short monophthongs plus several diphthongs. This was significant as it follows languages such as Thai and Lao which do contrast length, while many Tai languages do not, and later scholars, especially Li (1977) treated length as an innovation. Li Fang Kuei (1977) was the standard work of Tai reconstruction for several decades, influencing many scholars. Li particularly stood out as authoritative as he already had decades of experience of working on Tai languages, especially in Southern China, having begun to collect comparative data from 1936. Consequently, his work was seen to be much more grounded than more Thailand/Indo-China centered works that had appeared earlier. His 1977 monograph treats more than 1200 etymologies. Phonological correspondences are presented in three column format, with reflexes in Thai (SWT), Lungchow (Central Tai), and Po-ai (Northern Tai) and extensive notes giving reflex forms in other languages. All roots are treated as monosyllabic, although

4 Traditionally, syllables ending in a vowel or sonorant are called “live” syllables, the ones ending in a stop are called “dead” syllables in Thai/Tai linguistics.


many onset clusters are posited. No contrastive vowel length is reconstructed; instead Li proposed (contra Sarawit) some nine monophthongs, plus various diphthongs and triphthongs. Contrastive length was explained as arising secondarily from the three low vowels and open syllable vowels. The ABCD tone scheme was retained. While Li’s reconstruction is often referred to as the “standard work”, it has been substantially superseded by subsequent work. In particular, the field has generally returned to the view that vowel length was contrastive, and a greater diversity of onset segments is now generally recognized. Strecker (1983) offers a critical review of the work of Sarawit and Li on proto-Tai vowels, refining their analysis in some points. Strecker’s 1984 PhD dissertation is an in-depth study of pronouns in Tai languages, expanding the scope of comparative Tai studies beyond the phonological domain and lexical reconstruction. Nanna Jonsson (1991) is an unpublished PhD dissertation that reconstructs PSWT based on 10 languages. The findings are not radically different from Li (1977, see above for commentary) and overall the work has the appearance of largely reconciling SWT correspondences to Li’s framework. Ferlus (1990) was the first to present in writing a revision of Li’s reconstruction, under the rubric proto-Thai-Yai (following Haudricourt, Yai representing Northern Tai). Ferlus systematically works through Li’s onset correspondences, in particular revising the proposed onset clusters (e.g. *vr > *mr, *tl > *kt, *tr > *pt, *thl > *chr, and others) and in some cases suggesting sesquisyllabic forms, although these are somewhat underspecified. For example, a *Cr- onset is proposed for the ‘bird’ etymon. Some of Ferlus’ revisions are quite dramatic and did not all catch on, although the principle of reconstructing sesquisyllables was subsequently generally adopted.
The state of comparative Tai studies is effectively summarized by Edmondson and Solnit (1997), whose edited volume presents some 15 chapters by various scholars on aspects of comparative Tai. Luo (1997) is a substantial monograph-length study that extends and revises Li’s reconstruction, adding nearly a thousand additional etymologies. This is done in part with reference to higher-level Kra-Dai data and the inclusion of what are assumed to be very old borrowings. The revised phonology includes many new proposals for clusters, especially involving sibilant consonants. The classification is also revised to suggest a Northwestern Tai sub-branch. However, much like the proposals of Ferlus (1990), many of Luo’s proposals failed to gain traction with concerned scholars, and have now been superseded by Pittayaporn (2009).

Pittayaporn (2009) represents the current state of Tai reconstruction. His analysis departs from Li (1977) in several ways: proto-Tai sesquisyllabic words are reconstructed, a larger set of onsets and codas is posited, including a uvular series, and the proto-vocalism more closely follows the scheme of Sarawit (1973) with long and short monophthongs. Pittayaporn also reconstructs somewhat more diverse onset clusters, and adopts some aspects of Ferlus’ (1990) proposals (such as *p.t- and *k.t- for Li’s *tr- and *tl-, and additional sesquisyllabic forms not previously suggested).




Another novel aspect of Pittayaporn’s study is his four-branch model of Tai phylogeny, essentially merging SWT and Central Tai, and splitting Northern Tai into three subgroups, all based on phonological changes. No assessment of this model is offered here, but one can see how such a pattern of nested branching could reflect higher diversity close to the homeland area, with other groups reflecting one principal branch migrating to the south and west. The general progress in different aspects of Tai studies in Thailand and abroad is reflected also in various volumes of papers on Tai linguistics, including Harris and Noss (1972), Harris and Chamberlain (1975), and Tingsabadh et al. (2001), among others.

6.4 Kadai beyond Tai

The bulk of Kadai languages beyond Tai fall geographically beyond our definition of MSEA, the main exception being the Kra group, also known as Geyang, from Ge- for Gelao and -yang for Buyang (so called by Liang 1990). Kra is rather small demographically, with only around 20,000 speakers, in communities scattered across Yunnan, Guangxi, and Vietnam. Views differ as to where Kra fits into the Tai-Kadai family tree (see Norquest, this volume), but scholars generally place it as branching off at a rather high level, and Ostapirat (2000) in particular treats it as a coordinate-level branch. In terms of linguistic significance, various Kra languages, especially Buyang and Gelao, have been identified as possessing disyllabic structures and lexical roots that are crucial for the Austro-Tai hypothesis (discussed above). Ostapirat (2000) reconstructs proto-Kra phonology and lexicon, providing a list of about 300 proto-Kra forms, and while this moved the field forward significantly in terms of historical reconstruction, a thorough treatment of proto-Tai-Kadai is yet to be presented. A more pressing concern for scholars has been to view Tai-Kadai linguistic history in terms of language contact – particularly the historical interaction with Chinese – and the issues around the origins and development of tonal systems. Both of these concerns have to be addressed.

Much could be said about the study of Kadai languages, and readers are referred to the various handbooks and review articles that have appeared in recent decades. Reconstructions of proto-languages of several branches have been proposed (e.g. Norquest [2015] for Hlai; Thurgood [1988] and Ferlus [1996] for Kam-Sui; Ostapirat [2000] for Kra), and Edmondson and Yang (1988) present initial correspondences among Kam-Sui languages, reconstructing “preconsonants” (sesquisyllables), even though all modern Kam-Sui languages are monosyllabic.
It is evident that “Kadai beyond Tai” is gaining ground in comparative and typological linguistic studies. The state of the art is summarized in two edited volumes, one of which is Edmondson and Solnit (1988), who sum up the research in the field up to the late 1980s in the introduction to their edited volume which presents some 13 chapters by various authors, covering different


aspects of Kadai linguistic studies. The more recent volume, edited by Diller, Edmondson, and Luo (2008), appeared in the Routledge series on language families. Departing from the other volumes in the series, Diller et al. do not focus on grammar sketches of representative languages of the family, but rather present topical research alongside sketches and historical-comparative studies, providing a broad overview of the current state of Tai-Kadai linguistics in all its aspects. These publications pave the way for future research in the field. Many aspects of Tai-Kadai await further investigation, and with the appearance of more descriptive work, we can look forward to the further integration of the family in historical, typological, and areal linguistics. Additionally, Peter Jenks and Pittayawat Pittayaporn compiled an annotated bibliography of Kra-Dai languages in the Oxford Bibliography series, published electronically.5

6.5 Tai, Austro-Tai and Sino-Tibetan

In the later 19th and the first half of the 20th centuries, the typological and lexical parallels between Tai and Chinese persuaded scholars in the West and Asia that Tai is related to the Sinitic languages, the Sino-Tai hypothesis (see Luo 2008 for a detailed review). Over the decades authors have either advocated this position or regarded it as a possibility to be investigated (e.g. de Lacouperie 1887; Conrady 1896; Schlegel 1902; Grierson 1904; Wulff 1934; Nishida 1960; Denlinger 1967, 1989; Manomaivibool 1975; Xing 1999). As discussed by Luo (2008), substantial lexical compilations and associated historical analyses (such as Manomaivibool 1975; Xing 1999) of Tai and Sinitic demonstrate a close mapping of pronunciation and tone categories of Sino-Tai vocabulary to 1st millennium Chinese. This is now widely interpreted as indicative of borrowing into Tai, with important implications for the reconstruction of Old and Middle Chinese as well as various Tai-Kadai proto-languages.

Since the beginning of the 20th century, scholars have also pondered whether Tai may share an origin with Austronesian. Schlegel (1901) pioneered the idea, offering numerous lexical comparisons between Siamese and Malay (in addition to Indic and Chinese parallels). The claim was formalized by Benedict, whose 1942 study proposed a “new alignment” in Southeast Asia, launching the Austro-Thai hypothesis (later renamed Austro-Tai).6 The bulk of that paper fleshes out the Kadai family, approximating our current understanding of Tai-Kadai, and the latter part introduces a modest number of comparisons with Indonesian (Dempwolff’s 1934–1938 proto-Austronesian). He makes the point not only that the forms are similar, but also that invoking the disyllabic structure of

5 https://www.oxfordbibliographies.com/view/document/obo-9780199772810/obo-97801997728100178.xml (last accessed 4 December 2020).

6 It is likely that Benedict (1942) influenced Haudricourt’s initial attempts at Tai reconstruction a decade later.




Austronesian potentially helps to explain various complicated onset correspondences among Tai-Kadai forms. Benedict continued to elaborate his Austro-Tai hypothesis in two monographs (Benedict 1975, 1990), the latter volume linking Japanese to Austronesian and Tai-Kadai, although that proposal failed to gain traction with scholars, and Benedict’s works attracted significant criticism for lax semantic and phonological criteria in many lexical comparisons. The view emerged among scholars that, of the bulk of Benedict’s Austro-Tai comparisons, only a modest core of quite strong lexical agreements is of value. Thurgood (1994) asserted that such a pattern is indicative of borrowing, and posited contact between the two language families in “the Guizhou and Guangxi area” (Thurgood 1994: 361) more than 6,000 years BP. A decade later the arguments shifted decisively towards recognizing some form of Austro-Tai: Sagart (2004, 2005) and Ostapirat (2005) independently argued for common inheritance as the explanation for the most regular correspondences. Both scholars crucially referenced data from Buyang (a Kra language of Yunnan), which appears to retain precisely the long-posited disyllables that scholars had speculated would explain various problematic consonant and vowel correspondences within Tai-Kadai. While views remain divided on how Tai-Kadai and Austronesian coordinate historically, opinion now appears to favor a genetic relationship between these families.

References

Barua, Golap Chandra. 1985 [1930]. Ahom Buranji – From the earliest time to the end of the Ahom rule. Guwahati: Spectrum Publications.
Benedict, Paul K. 1942. Thai, Kadai, and Indonesian: A new alignment in Southeastern Asia. American Anthropologist 44. 576–601.
Benedict, Paul K. 1975. Austro-Thai: Language and culture, with a glossary of roots. New Haven: HRAF Press.
Benedict, Paul K. 1990. Japanese/Austro-Tai (Linguistica Extranea, Studia 20). Ann Arbor: Karoma.
Brown, J. Marvin. 1965. From Ancient Thai to modern dialects. In J. Marvin Brown, From Ancient Thai to modern dialects and other writings on historical Thai linguistics, 69–254. Bangkok: White Lotus.
Campbell, George. 1874. Specimens of languages of India, including those of the aboriginal tribes of Bengal, the central provinces, and the eastern frontier. Calcutta: Bengal Secretariat Press.
Chamberlain, James R. 1998. The origin of the Sek: Implications for Tai and Vietnamese history. In Somsonge Burusphat (ed.), The International Conference on Tai Studies, 97–128. Bangkok, Thailand: Institute of Language and Culture for Rural Development, Mahidol University.
Conrady, August. 1896. Eine indochinesische Causative-Denominativ-Bildung und ihr Zusammenhang mit den Tonaccenten. Leipzig: Otto Harrassowitz.
Crisfield, Arthur. 1974. Lao final particles. In D. L. Nguyen (ed.), Southeast Asian linguistic studies, vol. 1, 41–45. Canberra: Pacific Linguistics.
Crisfield, Arthur. 1978. Sound symbolism and the expressive words of Lao. Manoa: University of Hawaii PhD dissertation.


Cushing, Josiah Nelson. 1871. Grammar of the Shan language. Rangoon: American Mission Press.
Cushing, Josiah Nelson. 1880. Elementary handbook of the Shan language. Rangoon: American Baptist Mission Press.
Cushing, Josiah Nelson. 1887. Grammar of the Shan language, 2nd revised and enlarged edn. Rangoon: American Baptist Mission Press.
Cushing, Josiah Nelson. 1914. A Shan and English dictionary. Rangoon: American Baptist Mission Press.
de Lacouperie, Terrien. 1887. The languages of China before the Chinese: Researches on the languages spoken by the pre-Chinese races of China proper previously to the Chinese occupation. London: D. Nutt.
Dempwolff, Otto. 1934. Vergleichende Lautlehre des austronesischen Wortschatzes. 1. Band: Induktiver Aufbau einer indonesischen Ursprache (Beihefte zur ZES 15). Berlin: Dietrich Reimer.
Dempwolff, Otto. 1937. Vergleichende Lautlehre des austronesischen Wortschatzes. 2. Band: Deduktive Anwendung des Urindonesischen auf austronesische Einzelsprachen (Beihefte zur ZES 17). Berlin: Dietrich Reimer.
Dempwolff, Otto. 1938. Vergleichende Lautlehre des austronesischen Wortschatzes. 3. Band: Austronesisches Wörterverzeichnis (Beihefte zur ZES). Berlin: Dietrich Reimer.
Denlinger, Paul. 1967. Chinese and Thai. Monumenta Serica 26. 35–41.
Denlinger, Paul. 1989. The Chinese-Tai linguistic relationship: A formal proof. Monumenta Serica 38. 167–171.
Diller, Anthony, Jerold A. Edmondson & Yongxian Luo (eds.). 2008. The Tai-Kadai languages (Routledge Language Family Series). London & New York: Routledge.
Diller, Anthony. 1985. High and Low Thai: Views from within. In David Bradley (ed.), Language policy, language planning and sociolinguistics in South-East Asia (Papers in South-East Asian Linguistics No. 9, PL A-67), 51–76. Canberra: Pacific Linguistics.
Diller, Anthony. 1993. Diglossic grammaticality in Thai. In William A. Foley (ed.), The role of theory in language description, 393–420. Berlin & New York: Mouton de Gruyter.
Diller, Anthony. 2001. Thai grammar and grammaticality. In Hannes Kniffka (ed.), Indigenous grammars across cultures, 219–244. Frankfurt am Main: Peter Lang.
Diller, Anthony. 2006. Polylectal grammar and Royal Thai. In Felix K. Ameka, Alan Dench & Nicholas Evans (eds.), Catching language, 565–608. Berlin: De Gruyter Mouton.
Diller, Anthony. 2008. Resources for Thai language research. In A. Diller, J. A. Edmondson & Yongxian Luo (eds.), The Tai-Kadai languages (Routledge Language Family Series), 31–82. London & New York: Routledge.
Dodd, William Clifton. 1923. The Tai race, elder brother of the Chinese. Cedar Rapids, IA: The Torch Press.
Don Baccam, Baccam Faluang, Baccam Hung, Dorothy Fippinger. 1989. Tai Dam–English, English–Tai Dam vocabulary book. Eastlake, CO: Summer Institute of Linguistics.
Edmondson, Jerold A. 2008. Shan and other northern tier Southeast Tai languages of Myanmar and China: Themes and variations. In Anthony V. N. Diller, Jerold A. Edmondson & Yongxian Luo (eds.), The Tai-Kadai languages, 184–206. London & New York: Routledge.
Edmondson, Jerold A. & Yang Quan. 1988. Kam Tai initials and tones. In J. A. Edmondson & D. B. Solnit (eds.), Comparative Kadai: Linguistic studies beyond Tai, 143–166. Arlington, TX: The Summer Institute of Linguistics and the University of Texas at Arlington.
Edmondson, Jerry A. & David B. Solnit. 1988. Introduction. In J. A. Edmondson & D. B. Solnit (eds.), Comparative Kadai: Linguistic studies beyond Tai, 1–17. Arlington, TX: The Summer Institute of Linguistics and the University of Texas at Arlington.




Edmondson, Jerry A. & David B. Solnit. 1997. Introduction. In J. A. Edmondson & D. B. Solnit (eds.), Comparative Kadai: The Tai branch, 1–26. Arlington, TX: The Summer Institute of Linguistics and the University of Texas at Arlington.
Enfield, Nicholas. 2007. A grammar of Lao. Berlin: Mouton de Gruyter.
Enfield, Nicholas. 2008. Lao linguistics in the 20th century and since. In Y. Goudineau & M. Lorrillard (eds.), Recherches nouvelles sur le Laos, 435–452. Paris: Ecole Française d’Extrême-Orient.
Ferlus, Michel. 1990. Remarques sur le consonantisme de proto thai-yay (révision du proto-tai de Li Fangkuei). Paper circulated at the 23rd International Conference on Sino-Tibetan Languages and Linguistics, University of Texas at Arlington.
Ferlus, Michel. 1996. Remarques sur le consonantisme du proto kam-sui. CRLAO 25(2). 235–278.
Ferlus, Michel. 2008. The Dai dialects of Nghệ An, Vietnam (Tay Daeng, Tay Yo, Tay Muong). In A. Diller, J. A. Edmondson & Yongxian Luo (eds.), The Tai-Kadai languages (Routledge Language Family Series), 298–316. London & New York: Routledge.
Fine Arts Department (Thailand). 1971. Chindamani vol. 1–2, Record of the Chindamani and Chindamani by Phra Chao Boromkos [in Thai]. Bangkok: Bannakharn.
Fraisse, André. 1950. Les tribus Sô de la province de Cammon (Laos). Bulletin de la Société des Etudes Indo-chinoises 25. 171–185.
Gedney, William J. 1970. The Saek language of Nakhon Phanom Province. Journal of the Siam Society 58. 67–87.
Gedney, William J. 1972. A Checklist for determining tones in Tai dialects. In M. Estellie Smith (ed.), Studies in linguistics in honor of George L. Trager. The Hague: Mouton.
Gedney, William J. 1989. The Saek language of Nakhon Phanom province. In R. J. Bickner, J. Hartmann, T. J. Hudak & P. Peyasantiwong (eds.), Selected papers on comparative Tai studies, 373–400. Ann Arbor, MI: Center for Southeast Asian Studies, the University of Michigan.
Gedney, William J. 1993. William J. Gedney’s The Saek language: Glossary, texts and translations. In Thomas John Hudak (ed.), Michigan Papers on South and Southeast Asia Number 41. Ann Arbor, MI: Center for Southeast Asian Studies, the University of Michigan.
Glick, Irving & Sao Tern Moeng. 1991. Shan for English speakers. Hyattsville MD: Dunwoody Press.
Grierson, G. A. 1904. Linguistic survey of India, Vol. 2: Mon-Khmer and Siamese-Chinese. Calcutta: Office of the Superintendent of Government Printing House.
Haas, Mary R. 1942. Beginning Thai: Introductory lessons in the pronunciation and grammar of the Thai language. Washington, DC: American Council of Learned Societies.
Haas, Mary R. 1945. Special dictionary of the Thai language. Berkeley: Army Specialized Training Program.
Haas, Mary R. 1956a. The Thai system of writing. Washington, DC: American Council of Learned Societies.
Haas, Mary R. 1956b. Brief description of Thai, with sample texts. Outline for types of linguistic structure. Berkeley: University of California.
Haas, Mary R. 1964. Thai-English student’s dictionary. Stanford: Stanford University Press.
Hanna, William J. 2012. Dai Lue–English dictionary. Chiang Mai: Silkworm Books.
Harris, Jimmy G. & James R. Chamberlain (eds.). 1975. Studies in Tai linguistics in honor of William J. Gedney. Bangkok: Central Institute of English Language.
Harris, Jimmy G. & Richard B. Noss (eds.). 1972. Tai phonetics and phonology. Bangkok: Central Institute of English Language, Office of State Universities.
Haudricourt, André-Georges. 1948. Les phonèmes et le vocabulaire du thai commun. Journal Asiatique 236. 197–238.
Haudricourt, André-Georges. 1952. Les occlusives velaires en thai. Bulletin de la Société de Linguistique de Paris 48. 86–89.









Nathan W. Hill

7 Scholarship on Trans-Himalayan (Tibeto-Burman) languages of South East Asia

7.1 Introduction

The spread of the Trans-Himalayan family¹ naturally paid no attention to 21st century political boundaries.² The family includes languages with a geographic range from Balti Tibetan in Pakistan to Hokkien Chinese in Indonesia, with the foothills of the Himalayas and the South East Asian highlands as its center of gravity. Van Driem (2001) and Thurgood (2017a) provide helpful introductions to the family overall.³ Here we restrict the focus to South East Asia, in more specific terms treating those Trans-Himalayan branches that include languages spoken in today’s Myanmar and Thailand and excluding Chinese. I include discussion of subbranches entirely contained within the boundaries of the People’s Republic of China (Rgyalrongic, Qiangic, Ersuic, and Naish), but omit treatment of primary branches entirely confined to regions under the control of China, India, Nepal or Bhutan (Bai, Tujia, Kiranti, etc.); these criteria yield Burmo-Qiangic, Kuki-Chin, Karen, Sal, Mruic, and Nungish as the branches for discussion.⁴

The farther south and east a language is spoken, the more it exhibits the typical South East Asian typological profile of simple syllable structure, lack of inflection, and concatenating auxiliary verbs. Karenic, as the most southern of the Trans-Himalayan subgroups, reflects the vanguard of this transition, whereas the Rgyalrongic languages of Sichuan exhibit the opposite extreme. The frequency of the South East Asian typology in the Trans-Himalayan family is what led Meillet to despair that “la restitution d’une ‘langue commune’ dont le chinois, le tibétain, etc., par exemple, seraient des formes postérieures, se heurte à des obstacles quasi invincibles” (Meillet 1954: 26–27).⁵ Such pessimism is not entirely warranted. On the one hand, historical linguistics is still profitably undertaken even in such innovative branches as Naic (Jacques and Michaud 2011) and Karenic (Haudricourt 1946, 1975). On the other hand, Kuki-Chin, Sal, Mruic, and Nungish all have inflectional morphology of the kind that has facilitated progress in the reconstruction of Indo-European. As data on more languages become available, it is increasingly clear that the typological profile of the Trans-Himalayan proto-language is close to that of the Rgyalrongic languages, with complex syllable structure and ornate inflection; the more typically South East Asian languages have lost these features more or less independently.

While it is inappropriate to speculate too precisely about prehistoric migrations on the basis of language distributions today, without the corroborating evidence of genetics or archaeology (pace LaPolla 2012), the broad pattern – languages with complex syllable structure and abundant inflectional morphology spoken in more mountainous terrain contrasting with languages of simpler structure spoken in flatter and more southern regions – points to an Urheimat inside of what is now China. Lolo-Burmese seems to be a relative newcomer to South East Asia, with Nungish, Sal, Kuki-Chin, and Karen having spread earlier.

¹ This family is also called Indo-Chinese, Tibeto-Burman, or Sino-Tibetan (see van Driem 2014).
² I would like to thank Guillaume Jacques, Mathias Jenny, Lai Yunfan, Alexis Michaud, and David Solnit for helpful pointers while I wrote this piece.
³ Other useful reference works include the 言語学大辞典 Gengogaku Daijiten (1988–2001), which includes over 40 entries on Trans-Himalayan languages, and 云南特殊语言研究 Yúnnán tèshū yǔyán yánjiū (2004), which focuses on languages spoken in Yunnan.
⁴ Bodic is also very marginally reflected because of the handful of Tibetan speaking villages of northern Burma (Suzuki 2012), but this is hardly the place for a survey of Tibetan linguistics; for part of such a survey see Hill and Gawne (2017).
⁵ In English: “the reconstruction of a proto-language of which Chinese, Tibetan, etc. for example, would be the descendants, faces almost insurmountable obstacles”.

https://doi.org/10.1515/9783110558142-007

7.2 Research trends

The practical need of communication, particularly in commercial and diplomatic circumstances, is typically what drove the study of foreign languages in the pre-modern period. The first records of Trans-Himalayan languages of South East Asia arose in the diplomatic entanglements of successive Chinese empires. The earliest evidence of a Trans-Himalayan language of South East Asia is three songs that a Bailang delegation presented to the Chinese court (58–75 CE). The songs so delighted the emperor that he had them recorded for posterity. Many centuries later the diplomatic requirements of the Ming and Qing dynasties led to the compilation of the Huáyí yìyǔ (華夷譯語) vocabularies. In many cases these reflect an older stage of a particular language (in particular, Rgyalrong and Tosu) than we would otherwise know.

The advent of European colonialism brought increased information on languages of South East Asia. Investigators were generally either colonial officials or missionaries. Later, the explicitly atheist Marxist ideology of the People’s Republic of China (PRC) also required the classification and, to some extent, the documentation of that state’s subject peoples. The major ethnolinguistic surveys of the 1950s in the PRC were only published after the turbulence of the Cultural Revolution subsided. The publications, when they did emerge, provided invaluable comparative word lists of Trans-Himalayan, namely Huáng and Dài (1992) and Sūn (1991).⁶ Today, most language documentation available to academic linguists is produced by academic linguists, but missionaries also continue to produce useful resources. Most linguists engaged in the documentation of the Trans-Himalayan languages have a functional-typological rather than a historical orientation. As such, the collection and publication of lexicographical resources and text collections receive much less attention than grammar. Although language communities themselves produce a certain amount of text in their own languages, this work tends to be undervalued, underused, and not even collected.

⁶ These sources have been conveniently digitized by the STEDT project, based at the University of California at Berkeley (1987–2015).

General works on historical linguistics often present research on the Trans-Himalayan family as in keeping with the standards and methods of the discipline at large (e.g. Abondolo 1998: 8; Campbell and Poser 2008: 114). Such authors paint an overly rosy picture. In particular, the influential research tradition that Benedict (1972) inaugurated and Matisoff (2003) led does not adhere to the comparative method in an orthodox manner (Chang 1973; Fellner and Hill 2019; Miller 1974; Sagart 2006). The contributions of Tatsuo Nishida, unfortunately rather neglected in the West, are generally more reliable; I make a point of mentioning his work in the relevant places below. An element of Nishida’s judiciousness is that he avoids proposing reconstructions, but focuses on establishing the facts of language change. Reconstructions of Trans-Himalayan subgroups typically associate proto-forms with the prominent correspondence patterns found in the group, but do not formulate sound changes and their relative chronology with sufficient precision to trace the reconstructions down to the attested forms. As such, one must approach the findings of such research with caution.

If two languages share a sound change which occurred in their respective histories subsequent to a change that they do not share, then the shared sound change must be due to contact. There are a number of cases like this among the Trans-Himalayan languages of South East Asia. Chinese, Karen, Burmish, and Loloish all underwent quite similar tonal splits conditioned by initial manner (as did Thai and Vietnamese); the conditioning in each case is different. In the Burmish languages the split only occurred in stop-final syllables (Hill 2019: 55–56); the “Loloish tonal split” appears to have occurred independently with the same conditioning, but the locus classicus for its description is a work that I find uniquely impenetrable (Matisoff 1972). In Karen the tonal split affected open and closed syllables equally, and in addition to a low register associated with original voiced onsets and a high register associated with original voiceless onsets, there is a mid register associated with implosive onsets (Kato 2018). Van Bik takes the change *s- > t- as diagnostic of Kuki-Chin, but it is a wider areal feature and is also attested in Tangkhul (Mortensen 2003), Bodo-Garo (Burling 1959), and Bangru (Bodt and Lieberherr 2015). To give a third example, Standard Burmese has changed r- to y- [j-] and shares this sound change with several Burmish languages (WBur. raṅ, SBur. jin, Lacid jaŋ³¹- ‘chest’). However, the Arakanese dialect of Burmese preserves r- distinct from y- (Okell 1995: 2), so the parallel change of Standard Burmese and Lacid (Lashi) occurred separately in the two languages.

In terms of historical morphology, causative formations and person agreement are the most widely discussed issues. It is traditional to see a devoicing prefix *s- as responsible for alternations between voiced intransitives and voiceless transitives or alternations between unaspirated intransitives and aspirated transitives. Jacques has shown that this explanation is untenable. Rgyalrongic languages have both an s-prefix causative and a nasal prefix anticausative, and the Burmese unaspirated versus aspirated pattern is cognate with the anticausative in Japhug (Bur. kya ‘fall, drop’ and Japhug ŋgra ‘fall’, versus khya ‘bring down, lower’ and kra ‘bring down’) (Jacques 2014: 250; see also Jacques 2020).

As we saw, early researchers like Meillet regarded an absence of agreement as characteristic of the Trans-Himalayan family overall. As more languages with agreement have come to be described, two logical possibilities arise to explain the typological diversity of languages in the family. LaPolla argues that isolating languages continue the typological pattern of the Trans-Himalayan proto-language, and that agreement emerged via pronoun incorporation, partly independently in different branches, but largely through a single common innovation that in his proposal constitutes an enormous and geographically dispersed “Rung” branch (LaPolla 2013). Most other researchers, most vocally DeLancey, believe that the proto-language had verb agreement that was independently lost in various branches (DeLancey 2015). This inky debate has generated more heat than light, but slowly the detailed work of subbranch level morphological reconstruction is making progress.

Comparative syntax is only possible once phonology and morphology are advanced to a certain level and even in Indo-European, the world’s best studied language family, the methodology of comparative syntax remains insecure and controversial (Watkins 1976). The publication of texts in Trans-Himalayan languages has not been a research priority and as such in most languages very few continuous texts are available for study, if any at all.
The prospects of syntactic study in such circumstances are dismal. Trans-Himalayan languages are overwhelmingly verb final; the only exceptions in South East Asia are the Karenic languages. Their exceptional status is understood to have arisen through contact, but there is unlikely to ever be sufficient evidence to trace this transition. An area of great promise in Trans-Himalayan historical syntax is the variation between verb stems in Kuki-Chin. Broadly speaking each Kuki-Chin language has one verb stem predominantly used in finite contexts and another that favors non-finite contexts, but the specific distribution of the two forms varies quite subtly from one language to another. A close study across the Kuki-Chin family could teach a lot about the syntax of Proto-Kuki-Chin and about syntactic change in general. In addition, Scott DeLancey in forthcoming work proposes that lexical doublets in Jinghpaw are explainable as cognates to different Kuki-Chin verb forms.

There is a tendency in descriptive work on Trans-Himalayan languages, particularly common in discussions of clause chaining, to speculate about particular grammaticalization pathways that give rise to converbial constructions, rather than to carefully illustrate the precise meaning of one construction versus another. My hunch is that many grammatical patterns wait to be discovered in the clause chaining of Trans-Himalayan languages; Tibetan has a switch-reference system that was only clearly explained in 2019 (Beer 2019), even though Tibetan is the best studied member of the family other than Chinese.

Looking to the future, research on Trans-Himalayan languages is constrained by the absolute number of investigators; the achievements seen in better studied families will only be matched through methodological innovations that increase the productivity of an individual researcher. Fortunately, these innovations are nearly at hand. Automatic transcription of under-resourced languages has the potential to speed up the collection and processing of fieldwork data into textual corpora (Adams et al. 2018; Do et al. 2014). Off-the-shelf natural language processing (NLP) tools speed up the glossing and translation of texts and the compilation of dictionaries. Automatic cognate detection can speed up the articulation of sound laws and the reconstruction of proto-forms (Bodt and List 2019; List and Hill 2017). To fulfill the promise of these new techniques, it is only necessary to incorporate more advanced technological training into the education of the next generation of linguists.
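As a rough illustration of the bookkeeping that computer-assisted comparison automates, the following Python sketch tallies segment correspondences across hand-aligned cognate pairs. It is a toy under stated assumptions, not the method of List and Hill (2017) or Bodt and List (2019): the function name count_correspondences, the data layout, and the manual segmentation and alignment are all hypothetical, tones and phonation marks are omitted, and the forms are the Burmese and Zaiwa words for ‘six’ and ‘frighten’ quoted in section 7.4.5.

```python
from collections import Counter

# Hand-aligned cognate pairs (Burmese, Zaiwa), segmented by hand for this
# illustration. Tones and the creaky-voice mark are left out (an assumption
# made to keep the example small).
ALIGNED_PAIRS = {
    "six":      (("kh", "r", "o", "k"), ("kh", "j", "u", "ʔ")),
    "frighten": (("kh", "r", "o", "k"), ("k",  "j", "u", "ʔ")),
}


def count_correspondences(aligned_pairs):
    """Count how often each pair of aligned segments co-occurs."""
    counts = Counter()
    for concept, (form_a, form_b) in aligned_pairs.items():
        if len(form_a) != len(form_b):
            raise ValueError(f"forms for {concept!r} are not aligned")
        # zip pairs up the segments position by position
        counts.update(zip(form_a, form_b))
    return counts


if __name__ == "__main__":
    for (seg_a, seg_b), n in count_correspondences(ALIGNED_PAIRS).most_common():
        print(f"Burmese {seg_a} : Zaiwa {seg_b}  x{n}")
```

Recurring pairs such as r : j are candidates for regular correspondences, while the contrast between kh : kh and kh : k is the kind of pattern a reconstruction has to explain; section 7.4.5 attributes it to plain versus preglottalized initials. A realistic workflow would need far more data and an explicit alignment procedure.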

7.3 Subgroups

In very few instances are the Stammbäume of Trans-Himalayan languages rigorously justified by the identification of shared isoglosses. Nonetheless, a genealogically organized account provides a convenient way of structuring this survey of research on a large number of languages, and will orient the reader within the current state of discussions on subgrouping, despite the tentativeness of current understandings.

Burmo-Qiangic
    Na-Qiangic
        Naic
        Ersuic
        Qiangic
            Prinmi
            Muya
            Rma
            Rgyalrongic
    Lolo-Burmese
        Loloish
        Burmish
            Burmic
            Maruic

Fig. 1: The Stammbaum of Burmo-Qiangic, based on Jacques and Michaud (2011).


7.4 Burmo-Qiangic

The view that Lolo-Burmese and Qiangic languages are closely related has circulated for some time (Dempsey 1995: 13; Bradley 1997; Peiros 1998; Jacques and Michaud 2011). The subgroups within Burmo-Qiangic are Naic, Ersuic, Qiangic (including Rgyalrongic), Loloish, and Burmish. The grouping together of Loloish and Burmish as “Lolo-Burmese” is universally accepted. Tangut (Rgyalrongic, txg), Burmese (Burmish, mya), and Yi (Loloish, iii) are the Burmo-Qiangic languages of medieval attestation, respectively attested from 1036, 1113, and 1485. Naxi (Naic, nxq) has been written since the 18th century.

Features characteristic of the Burmo-Qiangic languages include a contrast of velarized and plain vowels, a complex system of directional prefixes that double as past tense markers, and inverse agreement marking in the verb. The more conservative languages show all of these features, but in many subbranches only relics or indirect evidence is preserved.

In the 1960s and 1970s the easy accessibility of Thailand to outsiders, compared to Burma and China, made Loloish the focus of both documentation and reconstruction by Western scholars, but now the Qiangic side of Burmo-Qiangic is where one finds the vanguard of Neogrammarian progress in Trans-Himalayan reconstruction. Jacques (2014), consolidating earlier forays, devotes a monograph to Tangut-Rgyalrong comparative phonology and grammar. The clarity and comprehensiveness of his presentation have in turn enabled others to make quick discoveries. In particular, Gong (2020) finds that the mysterious Tangut “grades” (等 děng) correspond to the velarized versus non-velarized syllabic contrast met in Rgyalrongic languages and Sims (2020) shows that the tonal contrasts of northern Qiang dialects correspond to the Tangut tonal contrasts.

Naic
    Namuyi
    Xumi
    Naish
        Naxi
        Na
        Laze

Fig. 2: The Stammbaum of Naic, based on Jacques and Michaud (2011).




7.4.1 Naic

Naic consists of Namuyi (nmy), Xumi (sxg), and “Naish”, with the latter made up of Naxi (nxq), Na (nru), and Laze (Jacques and Michaud 2011). For Namuyi, there is a deposit at the Endangered Languages Archive (number 0217), and three recent treatments of its grammar (Li 2017; Pavlík 2017; Yǐn 2016). Very little work has been done on the Xumi language. There is a short grammar by Sun (2014) and a series of articles by Chirkova (see Chirkova 2017). The best-known language of the family is Naxi, famed for the representational writing system used in the liturgical writings of Dongba priests. Michaud et al. (2017) provide an up-to-date survey of research on Naxi, including its written forms. Within Naish, Yongning Na has received the most attention. Lidz (2010) provides an overall grammar. Michaud (2017) is specifically devoted to the very complicated tone system. The least documented language of this group is Laze (see Michaud and Jacques 2012).

Jacques and Michaud (2011) provide a preliminary reconstruction of Proto-Naish. Li has made a few further forays into Naish reconstruction (Li 2018, 2020). He is paradoxically critical of his predecessors for letting non-Naic languages inform their reconstructions, but he himself uses Tibetan comparisons (not always correctly) to inform his own reconstructions.⁷

7.4.2 Ersuic

The Ersuic subbranch of Burmo-Qiangic consists of only three languages, Ersu (ers), Lizu, and Tosu.⁸ Tosu is recorded among the Huáyí yìyǔ from the Qianlong (1735–1796) period (Nishida 1973b). Ersu was first recorded by Baber (1882). Lizu appears to have first been studied by Sun Hongkai, with partial publication of his data in Nishida and Sun (1990). Yu (2012) surveys the existing work on the family and provides a preliminary reconstruction for this subfamily, based on Ersu and Lizu, as he had insufficient Tosu data available. Unfortunately, most of the changes, e.g. the brightening of *-a- to *-i-, are not very diagnostic.

⁷ Li (2020) suggests that the C1 of his own previous reconstruction is probably a nasal because it corresponds to “pre-initial” nasals in Written Tibetan. I have to admit that this ends up being a bit odd; if we allow ourselves to symbolize this as “N”, we end up with a contrast in the proto-language between *Nŋg and *Ng-. I would also point out that one of the pre-initials that he suggests comparison to in Written Tibetan is, in my own analysis and that of other investigators, a voiced velar fricative, and not prenasalization, but this is a point of controversy in Tibetan historical phonology.
⁸ This Lizu language is not to be confused with the similarly named Lisu of the Loloish subbranch.


Ersuic
    Tosu
    Ersu
    Lizu

Fig. 3: The Stammbaum of Ersuic, based on Yu (2019).

Since Yu’s study, considerably more data on these three languages have become available. For Ersu, there is a deposit at the Endangered Archives Project (MPI655457) and a PhD dissertation (Zhang 2013). Katia Chirkova has published new data on Tosu (2014), including a grammar (Han et al. 2019).

7.4.3 Qiangic

Qiangic has the four daughters Prinmi (pmi, pmj), Munya (mvm), Rma (cng, qxs), and Rgyalrongic. It is convenient to treat here work on those Qiangic languages other than Rgyalrongic, since the body of scholarship on the latter is rather large. The documentation of Qiangic languages is not vast but is increasing. For Prinmi, we have grammars by Daudey (2014) and Ding (2014). The Munya language is documented in a few articles by Ikeda (2002, 2006, 2008) and two PhD theses (Bai 2019; Gao 2015). The Rma language, spoken in Sichuan, consists broadly of two sets of dialects, a Southern set and a Northern set; LaPolla and Huang (2003: 16–17) survey early work on Rma, and themselves offer a grammar of northern Rma as spoken in Ronghong Village.

7.4.4 Rgyalrongic Apart from the dead language Tangut, the Rgyalrongic languages are confined to Sichuan. Under pressure from the expanding Tibetan empire, the ancestors of the Tanguts left the Rgyalrongic homeland heading north east; they founded a polity in 984 in what is today Ningxia. On the basis of lexical isoglosses and the distribution of morphological features such as case marking and directional prefixes on the verb, Lai et al. (2020) divide the Rgyalrongic branch into a western and eastern subbranch, with Tangut on the western branch. Galambos (2015) provides a convenient entry point to Tangut studies for the anglophone reader. The remaining West



Scholarship on Trans-Himalayan (Tibeto-Burman) languages of South East Asia 

 119

Rgyalrongic languages are less well studied. For a survey of Horpa (ero) see Jacques et al. (2017). The Geshiza variety of Horpa was more recently the subject of a PhD (Honkasalo 2019). On Khroskyab (jiq), Lai has published several articles and a PhD dissertation (Lai 2015, 2016, 2017). The documentation of East Rgyalrongic languages begins in the 18th century with a Huáyí yìyǔ vocabulary of 800 words (Nishida and Sun 1990). Jacques (2004: 9–13) gives a history of Rgyalrong studies. In the intervening 16 years since his survey, the field has been very active. Jacques’ many contributions have made Japhug Rgyalrong (jya) among the best described Trans-Himalayan languages (Jacques 2017a). Sun has a number of papers on Tshobdun (jya) (Sun 2017), and has also contributed to the study of other Rgyalrong varieties. For Situ (jya) there is a grammar of the Kyom-kyo variety (Prins 2016), and a collection of text in Cogtse (Lin 2016). For an overview of the East Rgyalrongic languages, see Jacques (2017b). Rgyalrongic

[Figure 4: tree diagram. Node labels: Rgyalrongic; West Rgyalrongic; East Rgyalrongic; Tangut; Horpa; Stau; Geshiza; Khroskyabs; Situ; Cogtse; Kyomkyo; Bragbar, etc.; Japhug; Tshobdun; Zbu.]

Fig. 4: The Stammbaum of Rgyalrongic, based on Lai et al. (2020), incorporating suggestions of Guillaume Jacques.

7.4.5 Burmish There are around seven Burmish languages, spoken in the hills of northern Burma and on the other side of the border in China, namely Ngochang (also called Achang, acn), Zaiwa (Atsi, atb), Pela (Bela, bxd), Lacid (Leqi, Lashi, lsi), Lhao Vo (Langsu, Maru, mhx) and Hpun (Phon, hpo), and of course Burmese itself. The Burmese wandered out of the ancestral homeland and onto the plains around the 10th century, to the linguistic disadvantage of the Pyu. The Burmish group of languages itself splits into two, which Nishi calls Burmic and Maruic (1999: 70). The Burmic subbranch is characterized by the merger of inherited plain and preglottalized stops as aspirates, as seen in the merger of *kruk ‘six’ (Bur. khrok, Longchuan Ngochang xʐoʔ⁵⁵ versus Zaiwa khjuʔ⁵⁵, Pela khjauʔ⁵⁵) and *ʔkruk ‘frighten’ (Bur. khrok, Longchuan Ngochang xʐoʔ⁵⁵ versus Zaiwa kju̱ʔ⁵⁵, Pela kja̱uʔ⁵⁵). I previously expressed reservations about the validity of Maruic, because Nishi characterized it by the shared retention of preglottalized initials (Hill 2019: 51–52), but the validity of the subgroup is confirmed by what we can call the “chicken-mouse” split, whereby for ‘chicken’ and ‘mouse’ Burmic reflects respectively *grak (Bur. krak, Xiandao Ngochang kʐɔʔ⁵⁵) and *grok (OBur. kro₁k, Longchuan Ngochang kʐoʔ⁵⁵) whereas Maruic instead reflects *rak (Lhao Vo ɣɔʔ³¹, Pela ɣaʔ³¹pha³⁵) and *rok (Lhao Vo ɣuk³¹nɔʔ³¹, Pela ɣɔʔ³¹naʔ³¹).9

Dai Qingxia and his collaborators have contributed massively to the documentation of Burmish languages, including book-length grammars of Ngochang (Dài and Cuī 1985), Pela (Dài et al. 2007), Lacid (Dài and Lǐ 2007), Lhao Vo (Dài 2005), and Chashan (Dài et al. 2010). The best documented Burmish language, other than Burmese itself, is Zaiwa, with three book-length grammars (Xú and Xú 1984; Lustig 2010; Zhū and Lèpáizǎozā 2013).10
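The merger diagnostic for the Burmic subbranch can be illustrated with a minimal sketch. It uses only the reflexes of *kruk ‘six’ and *ʔkruk ‘frighten’ cited above; the helper name `languages_with_merger` is hypothetical, and strict string identity of the two transcriptions is a simplifying assumption rather than a general test for sound mergers. Absence of the merger is a shared retention, so the sketch reports only which languages show the innovation and does not assign the others to a subgroup.

```python
# Reflexes of *kruk 'six' and *ʔkruk 'frighten' as cited in the text.
# Each value is a (six, frighten) pair of transcriptions.
REFLEXES = {
    "Burmese": ("khrok", "khrok"),
    "Longchuan Ngochang": ("xʐoʔ⁵⁵", "xʐoʔ⁵⁵"),
    "Zaiwa": ("khjuʔ⁵⁵", "kju̱ʔ⁵⁵"),
    "Pela": ("khjauʔ⁵⁵", "kja̱uʔ⁵⁵"),
}


def languages_with_merger(reflexes):
    """Return the languages whose two reflexes are identical.

    Hypothetical helper: identity of the cited forms stands in for
    the merger of plain and preglottalized stops.
    """
    return [
        language
        for language, (six, frighten) in reflexes.items()
        if six == frighten
    ]


if __name__ == "__main__":
    # Lists the languages in which 'six' and 'frighten' have fallen together.
    print(languages_with_merger(REFLEXES))
```

On the cited data the check picks out Burmese and Longchuan Ngochang, while the Zaiwa and Pela forms remain distinct in both aspiration and voice quality.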

[Figure 5: tree diagram. Node labels: Burmish; Burmic; Maruic; Old Burmese; Ngochang; Pela; Lacid; Lhao Vo; Zaiwa; Arakanese; Rangoon; Tavoyan, etc.]

Fig. 5: The Stammbaum of Burmish, based on Hill (2019: 51–54), with changes discussed in text.

9 An easy solution for reconstructing these words at the proto-Burmish level is to posit *gərak ‘chicken’ and *gərok ‘mouse’.
10 In addition, Yabu (1982) and Wannemacher (1994, 1998) have made smaller contributions.

In some ways the most intriguing Burmish language is Hpun, which unfortunately has now died. Henderson (1986) first brought the attention of the scholarly public to this language. More recently U Tun Aung Kyaw (2007) did fieldwork among the few elderly Hpun who still remembered some vocabulary. Both Yabu (2003) and Wāng and Cài (2018) also treat Hpun, but not independently of U Tun Aung Kyaw. The Hpun language is quite striking for beginning many of its nouns with tă- and kă-, a potentially inherited feature not seen in any other of the Burmish languages (Hpun kălíʔ ‘wind’, kăʃàiʔ ‘wood’, tămi ‘fire’, tăphà ‘frog’ versus Bur. le ‘wind’, sac ‘wood’, mīḥ ‘fire’, phāḥ ‘frog’).

The oldest document in Burmese is the Myazedi inscription of 1113 CE (Nishida 1955, 1956; Yabu 2006). Essentially all documents in Old Burmese are stone inscriptions recording land grants to Buddhist establishments (Frasch 2018). The family that consists of languages descending from Old Burmese, known as Burmese dialects, includes “Standard Burmese” of Rangoon (Yangon) and Mandalay, Tavoyan (Dawei) and the closely related Palaw, Yaw, Merguese (Beik), Intha, Danu, Arakanese (Rakhine) and the closely related Marma and Taung’yo (Bernot 1965; Jones 1972; Okell 1995; Naksuk 2012). I call this the “Mranmaic” language family. Compared to its older cousin Tangut, serious study of the history of Burmese from a linguistic perspective is languishing. Hill (2019: 46–83) provides a general orientation to Burmese and its place within the Burmish family.

There have been a handful of forays into the reconstruction of proto-Burmish, none of them wholly successful. Neither Burling (1967) nor Mann (1998) employs Written Burmese or Old Burmese data in his correspondence sets. Nishi (1999) brings together a number of cognate sets, assembled from Huáng and Dài (1992). A disadvantage of Nishi’s approach is that he only offers cognate sets where there is a cognate in Burmese. In addition, he throws out the full words. Dempsey offers two insightful contributions on Burmish language history (Dempsey 2001, 2003), but does not integrate his findings into an overall reconstruction.

7.4.6 Loloish There are many Loloish languages, some among them quite well studied. The majority, however, are only known from comparative word lists used in more or less naïve lexicostatistical studies (Lama 2012; Satterthwaite-Phillips 2011). There was a great deal of work on this branch during the 1970s; work continued subsequently, but at a slower pace. Figure 6 presents the Loloish Stammbaum according to Bradley (2007), but the “classification of the Loloish languages must be re-established in the future by considering more data sets and also grammatical features” (Gerner 2013: 7).

The general disjuncture between linguistic reality and the official Chinese classification of its subject peoples reaches its apogee in the treatment of speakers of Loloish languages. Katso (kaf) speakers are absurdly classified as Mongolian. Speakers of Hani (hni), Lahu (lhu), Lisu (lis), and Jino (jiu, jiy) are recognized as having their own respective nationalities. The remaining speakers of Loloish languages are mostly grouped together as “Yi”, although this term is particularly associated with Nosu (iii), the official language of the Yi nationality and the only Loloish language to enjoy limited government support (Gerner 2016). A number of the languages called Yi in China used, and to some extent still use, a family of logographic scripts, predominantly for religious texts (Iwasa 2018; Wasilewska 2014).


[Figure 6: tree diagram. Node labels: Loloish; Northern; Central; Southern; Southeastern; Nosu; Nasu; Lisu; Lahu; Lalo; Akha; Jino, etc.; Hani; Bisu; Akoid; Bisoid; Laomian; Sangkong; Phunoi; Pula; Muji.]

Fig. 6: The Stammbaum of Loloish, based on Bradley (2007).

If we concentrate on those languages for which a reasonable level of documentation has been achieved, for Northern Loloish we have only Nosu (Gerner 2013). The Central Loloish branch contains the best studied languages, including Lahu, for which there are a number of grammars (Cháng 1986; Lǐ 2014; Matisoff 1973) and dictionaries (Matisoff 1988, 2006). There are dictionaries of both Northern (Bradley 1994) and Southern (Bradley 2006) dialects of Lisu. The Southern Lisu use the Fraser script, invented by the eponymous missionary. Turning to Southern Loloish, Lewis and Bai prepared comprehensive dictionaries for both Hani (Lewis and Bai 1996) and Akha (ahk) (Lewis 1968), as well as text collections, again for both Hani (Lewis and Bai 2002) and Akha (Lewis 2003). The 12 volumes of the series “Regional Culture Investigation of International Hani/Aka” (国际哈尼/阿卡区域文化调查), published in 2011, provide information on Hani and Akha as spoken inside China at a county-by-county level. Tatsuo Nishida has done important work on Central and Southern Loloish, including field work on Akha (Nishida 1966), Lisu (Nishida 1967, 1968), Lahu (Nishida 1969), and particularly Bisu (bzi), a language that he was the first to document (Nishida 1966, 1973). Not many resources are available for the Southeastern Loloish languages, but Pelkey (2011) provides some useful lexical lists for Phula.

The only book-length treatment of Loloish reconstruction is Bradley (1979). Bradley compares Lahu, Lisu, Bisu, Phunoi, Akha and Mpi (mpz). His work very thoroughly discusses the history of research on these languages, previous sources, and previous efforts at reconstruction. The reconstruction per se is, however, not very successful: it was out of date at the time of its publication (Thurgood 1981) and fails to predict attested forms (Hill 2012: 64–65, 2015: 192–195). Nishida has also contributed to Loloish comparative reconstruction (Nishida 1966–1967, 1969, 1977, 1979). Dempsey (2005) suggests a new approach to Loloish initial reconstruction in relationship to the tonal split.




7.5 Kuki-Chin Kuki-Chin languages are spoken throughout the hilly terrain of Myanmar (the Chin hills), North East India, and Bangladesh (the Chittagong hills). The verbal systems of these languages typically have person agreement. Their verbal systems are also noteworthy for the use of two verbal stems, the second usually having a more complex final and typically thought to originate from nominalization. The patterns of use of these two stems vary considerably from language to language and should be investigated more.11

Peterson (2017b) discusses subgrouping extensively. He divides the branch into the Maraic, Northwestern, Central, and Peripheral branches. Lorrain (1951) is the locus classicus for Mara (mrh), which unfortunately does not mark tones. Löffler (2002) provides a discussion of earlier work on Mara as well as insightful comments on Mara in its comparative context. Northwestern (formerly called Old Kuki) centers on the Indian state of Manipur; it is a poorly studied group of languages. A synthesis of work on this group is a major desideratum of Kuki-Chin studies. In contrast, the Central branch has received most attention, with Mizo (lus) as probably the best studied of the Kuki-Chin languages. Lewin (1874) provides an early textbook. Lorrain (1940) offers a full Mizo dictionary; this work has been extremely influential in Trans-Himalayan studies, used by Benedict (1972) and Peiros and Starostin (1996).12 Chhangte has written both a preliminary grammar (Chhangte 1986) and a more complete study of syntax (Chhangte 2001). For Hakha Lai (cnh), see Peterson (2017a) and references therein. Reichle (1981) offers a grammar of Bawm (bgr).

According to Peterson the Peripheral branch itself includes the three subbranches Northeastern, Southern, and Khomic. Of these three, the Northeastern branch is better studied and has played more of a role in reconstruction. Henderson’s (1965) treatment of Tedim (ctd), despite its unambitious title, is one of the more extensive treatments of any Kuki-Chin language. Sizang (csy) was the subject of some early pedagogical grammars (Naylor 1925; Rundall 1891) and two articles in the 20th century (Stern 1963, 1985), and is now being studied earnestly by Davis (2017). The change r- > g- is particularly characteristic of the Northeastern branch (Solnit 1979); Peterson (2017b) discusses this change, convincingly showing that it is relatively recent and spread through contact.

In general, the southern languages have received less attention than the northeastern languages. Houghton (1892) offers an early and extremely insightful Asho (csh) vocabulary with comparative notes, which would reward careful study in light of more recent developments. There are full grammars of Daai (dao) (So-Hartmann 2009) and Hyow (Zakaria 2018). The Language and Social Development Organization, a missionary organization affiliated with the Summer Institute of Linguistics, conducted a dialect survey of the Southern Chin languages from 2005 to 2014. The data, recently published, is an invaluable resource for future research (LSDO 2019a, 2019b).

David Peterson is the researcher most active on the Khomic group. He has written on Khumi (cnk) extensively (Peterson 2019 and citations therein).13 Peterson is also documenting Rengmitca (Peterson 2014), a highly endangered and phonologically archaic language.

There was a long hiatus in Kuki-Chin reconstruction following the pioneering efforts of Ohno (1965). Efforts in the reconstruction of Proto-Kuki-Chin accelerated in the 21st century (Button 2012; Hill 2014; Khoi 2001; Van Bik 2009). In general, the more northern languages present rimes in something like their ancestral form but radically simplify the onset, whereas the southern languages have more conservative onsets and innovate more in the rimes. The existing reconstructions rely mostly on northern languages, because more data has been available on them.

11 It merits mentioning that the term “Zomia”, the discussion of which is a recent fad in anthropological circles, refers at least etymologically to this area. The autonym of its speakers is Zo in many languages.
12 James Matisoff claims to be in possession of a copy of Lorrain’s dictionary to which Siamkima Khawlhring added tone marks, but has not made it available to the scholarly public.

13 Justin Watkins has unpublished materials on Burmese Khumi.

[Figure 7: tree diagram. Node labels: Kuki-Chin; Maraic; Northwestern; Central; Peripheral; Northeastern; Southern; Khomic; Anal; Lamkang, etc.; Mara; Zotung, etc.; Mizo; Lai; Bawm, etc.; Tedim; Sizang; Cho; Hyow/Asho; Khumi; Mro; Rengmitca.]

Fig. 7: The Stammbaum of Kuki-Chin according to Peterson (2017b).

7.6 Karen The southernmost members of the Trans-Himalayan family are known for their SVO syntax and their simple syllable structure; most languages have only open syllables. According to Manson “it would appear from the literature that there are between 20–30 distinct Karen languages” and 16 of them “have been reasonably documented” (Manson 2017: 150). Reasonably well documented varieties include Sgaw (ksw) (Binney 1883), Pwo (Kato 2017; Purser and Aung 1920), Kayah Li (Red Karen, eky, kyu, kvy, kxf) (Brown 1900; Solnit 2017), and Bwe (bwe) (Henderson and Allott 1997). The Pa-o language, because it retains nasal finals, is of great importance for reconstruction, but is unfortunately relatively understudied. Some Pwo data is included by Jones (1961) and two Pwo dialects are compared by Nishida (1967a). David Solnit is nearing completion of a new subgrouping of Karen varieties, in deference to which I omit a tree diagram here.

Comparative work on the branch begins with a brilliant essay by A. G. Haudricourt (1946). He produced a reconstruction of Proto-Karen relying only on data from Pwo and Sgaw. The crux of his insight was to realize that the complex correspondence patterns for the onset manner and tones that he saw between the two languages reflected the aftermath of a tonal split according to manner class, just as happened in Thai, Vietnamese, and Middle Chinese. Discussions with Gordon Luce, and the addition of data from Red Karen and White Karen, allowed him to correct a few mistakes and to yield the reconstruction of onsets that is still accepted today (Haudricourt 1953). In his third intervention (Haudricourt 1975), he relies on subsequent work of Luce (1959), Jones (1961), and Burling to add a fourth tone. Kato (2018) provides a synthesis of the reconstructed tonal system of Haudricourt’s first and third papers, helpfully adding data in the original Sgaw and Pwo scripts. Although a number of researchers have been active in Karen reconstruction since, very little further progress appears to have been made (Luangthongkum 2019). Haudricourt reconstructs three inherited manner types (the same as proto-Burmish incidentally) by making use of a tonal split, the same as seen in Chinese and Tai. He reconstructs a series of voiceless nasals, which were later confirmed by Luce’s fieldwork. Haudricourt proposes *ŋ- > *ɲ- as an isogloss for the Karenic branch.

7.7 Sal Burling (1983) proposes to call a group of languages “Sal” on the basis of a shared lexical item for ‘sun’ that is otherwise not seen in Trans-Himalayan languages. This family consists of three subbranches, Bodo-Garo, Konyak, and Jingpho-Luish (Post and Burling 2017: 224–227). Of these three, only the third has members in South East Asia, so it is our focus here. The Bodo-Garo family, spoken mostly in Northeast India, but also in parts of Bangladesh, is one of the better studied Trans-Himalayan families. Joseph and Burling (2006) and Jacquesson and Breugel (2017) discuss subgrouping and provide reconstructions. More recent publications include a complete grammar of Rabha (Joseph 2007) and a survey article on Garo (Burling 2017). French (1983) provides a reconstruction of the Konyak branch, but enough descriptive work has subsequently taken place to merit a fresh look at this reconstruction.

The third Sal subbranch is Jingpho-Luish, which consists of only seven languages, five on the Luish side and two on the Jingpho side. Progress in both subfamilies is now steady, thanks mostly to the intrepid efforts of two young Japanese scholars. Jingpho (kac) is the major language of Kachin State in Myanmar and is also spoken in India and China. As such, it was extensively researched already in the British colonial period, and has received a lot of attention from Chinese linguists as well. It is one of the five languages used in Benedict’s (1972) influential study. The apparently archaic iambic syllables of the language, along with its preservation of final -r and -j, make it a useful language for comparative linguistics. Kurabe provides a summary article (Kurabe 2017), a grammar (Kurabe 2016), and a discussion of historical phonology (Kurabe 2015). The major obstacle to the use of Jingpho for comparative linguistics has been its isolated status, but better study of the Luish languages has improved this situation. On the Luish side, Huziwara has come out with a grammar of Cak (ckh) (Huziwara 2008), a Cak dictionary (Huziwara 2016), and a reconstruction of the Luish subbranch (Huziwara 2012). The relationship between Jingpho and Luish will doubtless be an exciting area of research in the coming years.

7.8 Nungish The Nungish family consists of the three languages, Trung (duu), Rawang (raw), and Anong (nun), spoken on both sides of the Sino-Burmese border. For Trung there are a few survey articles (LaPolla 2017; Nishida 1987), a short grammar (Sun 1982) and a longer grammar (Perlin 2019). Perlin has also made available a deposit on Trung at the Endangered Languages Archive (Id: 0235). For Rawang there is a pedagogical grammar (Barnard 1934) and a text collection (LaPolla and Poa 2001). Thurgood (2017b) provides an overview of Anong, including a discussion of previous studies. Straub compiled an extensive bibliography of work on the Nungish languages (Straub 2020). To date there have been no efforts to reconstruct Proto-Nungish.

7.9 Mruic The languages Mru (mro) and Anu-Hkongso (anl) together make up the small Mruic branch of Trans-Himalayan. These languages are spoken in the highlands of Burma and across the border in Bangladesh. Löffler (1966) elaborates correspondences between Mru, Mizo, Burmese, and reconstructed stages of Chinese.14 His work provides an excellent starting point for further historical consideration of Mru, but naturally must be updated, in particular with reference to newer Chinese reconstructions. Unusually for a Trans-Himalayan language, Mru distinguishes final -r, -l, and -j, although it appears that -l may not be inherited. In a few interesting cases, Mru vocabulary compares suspiciously well with Tibetan (Mru pak, Tib. phag ‘pig’, Mru kim, Tib. khyim ‘house’, Mru tom, Tib. dom ‘bear’); these similarities are indicative of Trans-Himalayan archaisms. The numerals are intriguing and thus merit repetition here: chum ‘three’, tali ‘four’, tanga ‘five’, taruk ‘six’, ranit ‘seven’, riat ‘eight’, taku ‘nine’. My proposal is that ta- is etymological only in ‘six’ and ‘nine’ and the ta- of ‘six’ spread through contamination to ‘four’ and ‘five’. One may be tempted to see the ch- of ‘three’ as somehow preserving a *k- prefix (compare Tib. gsum), but the word ching ‘tree’ (compare Tib. śiṅ) instead suggests a regular change *s- > ch-.

14 David Peterson has done extensive fieldwork on Mru that remains as yet unpublished.

7.10 Trümmersprachen There are at least two ancient Trans-Himalayan languages of fragmentary attestation (Trümmersprachen) that fall within the scope of our survey. The older and more fragmentary of the two is Bailang, known only from the three “Songs of Bailang” (白狼歌). These are poems transliterated with Chinese characters and translated into Chinese during the Han dynasty (specifically 58–75 CE). Making extensive use of previous research, W. S. Coblin (1979) published a study of these songs. The author of these lines revisited the songs in light of ensuing progress in Chinese reconstruction (Hill 2017), with reference to the intervening smaller-scale studies (Mǎ and Dài 1982; Zhèngzhāng 1993; Beckwith 2008). Whereas Coblin and Beckwith propose that the language is Loloish, I do not think that this conclusion is obvious. There are many etyma of obvious Trans-Himalayan provenance, but they are etyma also attested in other branches, e.g. 蘇 *sa ‘meat’ (Tib. śa, Bur. sāḥ, Mizo sâ ‘meat’), 螺 *ruai ‘rain’ (Bur. rwā, Mizo rùah), 毗 *bi ‘give’ (OBur. piyḥ, Mizo pè). The only lexical innovation found in the Bailang songs that Sagart et al. (2019) consider indicative of Burmo-Qiangic is 冒 *mus ‘heaven’, which compares with OBur. muiwḥ ‘sky’ and Japhug tɯ-mɯ ‘rain, sky, weather’, but this word occurs in other branches as well (Tib. dmu ‘sky god’, Rawang dvmø̀ ‘celestial spirit’). As these examples show, Bailang is phonologically quite innovative, and lest one credit the filter of Chinese transcription entirely, note that the Chinese were perfectly able a few centuries later to transliterate Tibetan clusters where they heard them. For example, the Sino-Tibetan treaty inscription of 821–822 has Tib. stag ‘tiger’ (in a name) as 悉諾 [sir-ndak] (Preiswerk 2014: 51).

The Pyu (pyx) were the urban civilization that preceded and was absorbed by the Burmese. Pyu literary activity covers the 6th to the 13th century CE. Griffiths et al. (2017) provide a complete inventory of Pyu inscriptions and a detailed study of a previously unpublished Sanskrit-Pyu bilingual inscription around the base of a headless Buddha statue. Miyake discusses the rimes of Pyu (Miyake 2018) and Pyu grammar (Miyake 2019). The team responsible for this renaissance of Pyu scholarship envisions a number of further studies on aspects of Pyu linguistics and epigraphy. Pyu has quite conservative phonology, which is atypical for Trans-Himalayan languages of South East Asia: täk ‘one’, kni ‘two’, plä ‘four’, pä.ṅa ‘five’, tko ‘nine’, and tdü ‘water’ look quite Tibetan in their tolerance for cluster onsets (compare Tibetan gcig ‘one’, gñis ‘two’, bźi ‘four’, lṅa ‘five’, dgu ‘nine’, and chu ‘water’).

Having admitted bafflement about the wider relations of either of our Trümmersprachen, it is interesting to note that they both share the same form of the first person pronoun, namely Bailang 支 *ke ‘we, us’ and Pyu gäy. In contrast, the vast majority of Trans-Himalayan languages have velar nasal initial first person pronouns. Ken Van Bik (this volume) points to a first person pronoun like kay as indicative of Kuki-Chin, but before we attach our Trümmersprachen to that family, note similar looking forms like Olekha kö ‘I’ and Puxi Qiang qa ‘me’ (Jacques 2007).

References

Abondolo, Daniel. 1998. Introduction. In Daniel Abondolo (ed.), The Uralic languages, 1–42. London: Routledge.
Adams, Oliver, Trevor Cohn, Graham Neubig, Hilaria Cruz, Steven Bird & Alexis Michaud. 2018. Evaluating phonemic transcription of low-resource tonal languages for language documentation. In Proceedings of LREC 2018. https://www.aclweb.org/anthology/L18-1530/ (last accessed 14 December 2020).
Baber, E. Colborne. 1882. Travels and researches in the interior of China. Royal Geographical Society of London, Supplementary Papers 1.
Bai, Junwei. 2019. A grammar of Munya. Cairns: James Cook University PhD dissertation.
Barnard, J. T. O. 1934. A handbook of the Rawang dialect of the Nung language. Rangoon: Superintendent of Government Printing and Stationery.
Beckwith, Christopher I. 2008. The Pai-lang songs: The earliest texts in a Tibeto-Burman language and their Late Old Chinese transcriptions. In Christopher Beckwith (ed.), Medieval Tibeto-Burman Languages III, 87–110. Halle: International Institute for Tibetan and Buddhist Studies.
Beer, Zack. 2019. Switch-reference in the Ye shes rgyas pa’i mdo. Journal of the Royal Asiatic Society. New Series 29(2). 249–256. DOI: 10.1017/S1356186318000731.
Benedict, Paul K. 1972. Sino-Tibetan: A conspectus. Cambridge: Cambridge University Press.
Bernot, Denise. 1965. The vowel systems of Arakanese and Tavoyan. Lingua 15. 463–474.
Binney, J. P. 1883. The Anglo-Karen dictionary begun by J. Wade. Rangoon: American Mission Press, 793.
Bodt, Tim & Johann-Mattis List. 2019. Testing the predictive strength of the comparative method: An ongoing experiment on unattested words in Western Kho-Bwa languages. Papers in Historical Phonology 4. 22–44.
Bodt, Timotheus A. & Ismael Lieberherr. 2015. First notes on the phonology and classification of the Bangru language of India. Linguistics of the Tibeto-Burman Area 38(1). 66–123. DOI: https://doi.org/10.1075/ltba.38.1.03bod.
Bradley, David. 1979. Proto-Loloish. London: Curzon Press.
Bradley, David. 1994. A dictionary of the northern dialect of Lisu. Canberra: Department of Linguistics, Research School of Pacific and Asian Studies, Australian National University.




Bradley, David. 1997. Tibeto-Burman languages and classification. In David Bradley (ed.), Papers in Southeast Asian Linguistics. Vol. 14, 1–72. Canberra: Pacific Linguistics.
Bradley, David. 2006. Southern Lisu dictionary. Berkeley: Sino-Tibetan Etymological Dictionary and Thesaurus Project.
Bradley, David. 2007. East and South East Asia. In R. E. Asher & Christopher Moseley (eds.), Atlas of the world’s languages, 2nd edn., 159–208. London: Routledge.
Brown, R. J. R. 1900. Elementary hand-book of the Red Karen language. Rangoon: Printed by the Supt., Govt. Print., Burma.
Burling, Robbins. 1959. Proto-Bodo. Language 35(3). 433–453.
Burling, Robbins. 1967. Proto-Lolo-Burmese. Bloomington: Indiana University.
Burling, Robbins. 1983. The Sal languages. Linguistics of the Tibeto-Burman Area 7(2). 1–32.
Burling, Robbins. 2017. Garo. In Randy J. LaPolla & Graham Thurgood (eds.), The Sino-Tibetan languages, 2nd edn., 243–257. London: Routledge.
Button, Christopher. 2012. Proto-North Chin. Berkeley: STEDT.
Campbell, Lyle & William John Poser. 2008. Language classification: History and method. Cambridge: Cambridge University Press.
Chang, Kun. 1973. Review of Benedict 1972. The Journal of Asian Studies 32(2). 335–337.
Chhangte, Lalnunthangi. 1986. A preliminary grammar of the Mizo language. Arlington, TX: University of Texas MA thesis.
Chirkova, Katia. 2014. The Duoxu language and the Ersu-Lizu-Duoxu relationship. Linguistics of the Tibeto-Burman Area 37(1). 104–146.
Chirkova, Katia. 2017. Xùmǐ 旭米 Language. In Rint Sybesma, Wolfgang Behr, Yueguo Gu, Zev Handel, C.-T. James Huang & James Myers (eds.), Encyclopedia of Chinese Language and Linguistics. DOI: 10.1163/2210-7363_ecll_COM_00000253.
Cháng, Hóng ēn. 1986. Lāhùyǔ jiǎnzhì. Mínzú chūbǎnshè 民族出版社.
Coblin, W. South. 1979. A new study of the Pai-lang songs. Tsing Hua Journal of Chinese Studies 12. 179–216.
Dài, Qìngxià. 2005. Làngsùyǔ yánjiū. 民族出版社 Mínzú chūbǎnshè.
Dài, Qìngxià & Zhìchāo Cuī. 1985. Āchāng yǔ Jiǎnzhì. Beijing: 民族出版社 Mínzú chūbǎnshè.
Dài, Qìngxià, Yǐng Jiǎng & Zhì’ēn Kǒng. 2007. Bōlāyǔ yánjiū. 民族出版社 Mínzú chūbǎnshè.
Dài, Qìngxià & Jié Lǐ. 2007. Lèqīyǔ yánjiū. 中央民族大学出版社 Zhōngyāng mínzú dàxué chūbǎnshè.
Dài, Qìngxià, Jīnzhī Yú, Chénglín Yú, Xīnyǔ Lín, Yànhuá Zhū, Líjūn Fàn, Jiànxióng Pǔ & Línróng Zhù. 2010. Piānmǎ Cháshān rén jí qí yǔyán. 商務印書館 Shangwu yinshuguan.
Daudey, Henriëtte. 2014. A grammar of Wadu Pumi. Melbourne: LaTrobe University PhD dissertation.
Davis, Tyler D. 2017. Verb stem alternation in Sizang Chin. Chiang Mai: Payap University MA thesis.
DeLancey, Scott. 2015. The historical dynamics of morphological complexity in Trans-Himalayan. Linguistic Discovery 13(2). 60–79.
Dempsey, Jakob. 1995. A reconsideration of some phonological issues involved in reconstructing Sino-Tibetan numerals. Seattle: University of Washington PhD dissertation.
Dempsey, Jakob. 2001. Remarks on the vowel system of old Burmese. Linguistics of the Tibeto-Burman Area 24(2). 205–234. Errata: 26(1). 183.
Dempsey, Jakob. 2003. Analysis of rime-groups in Northern-Burmish. Linguistics of the Tibeto-Burman Area 26(1). 63–124.
Dempsey, Jakob. 2005. Tonogenesis in Yipo-Burmic syllables with final stops. Tsing Hua Journal of Chinese Studies 35(2). 405–434.
Ding, Picus Sizhi. 2014. A grammar of Prinmi. Leiden: Brill. DOI: https://doi.org/10.1163/9789004279773.


Do, Thi-Ngoc-Diep, Alexis Michaud & Eric Castelli. 2014. Towards the automatic processing of Yongning Na (Sino-Tibetan): Developing a “light” acoustic model of the target language and testing “heavyweight” models from five national languages. In Proceedings of the 4th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU 2014), St. Petersburg, 153–160. https://halshs.archives-ouvertes.fr/halshs-00980431v2 (last accessed 4 December 2020).
Driem, George van. 2001. Languages of the Himalayas. Leiden: Brill.
Driem, George van. 2014. Trans-Himalayan. In Nathan W. Hill & Thomas Owen-Smith (eds.), Trans-Himalayan linguistics, 11–40. Berlin: Mouton de Gruyter.
Fellner, Hannes & Nathan W. Hill. 2019. Word families, allofams, and the comparative method. Cahiers de linguistique – Asie Orientale 48(2). 91–124.
Frasch, Tilman. 2018. Myanmar epigraphy – Current state and future tasks. In Daniel Perret (ed.), Writing for eternity: A survey of epigraphy in Southeast Asia, 47–72. Paris: École française d’Extrême-Orient.
French, Walter Thomas. 1983. Northern Naga: A Tibeto-Burman mesolanguage. New York: City University of New York PhD dissertation.
Galambos, Imre. 2015. Translating Chinese tradition and teaching Tangut culture: Manuscripts and printed books from Khara-Khoto. Berlin: Mouton de Gruyter.
Gao, Yang. 2015. Description du menya. Paris: EHESS PhD dissertation.
Gerner, Matthias. 2013. A grammar of Nuosu. Berlin & Boston: Mouton de Gruyter. DOI: https://doi.org/10.1515/9783110308679.
Gerner, Matthias. 2016. Yí 彜 Languages. In Rint Sybesma, Wolfgang Behr, Yueguo Gu, Zev Handel, C.-T. Huang & James Myers (eds.), Encyclopedia of Chinese language and linguistics. Leiden: Brill.
Gong, Xun. 2020. Uvulars and uvularization in Tangut phonology. Language and Linguistics 21(2). 175–212.
Griffiths, Arlo, Bob Hudson, Marc Miyake & Julian Wheatley. 2017. Studies in Pyu epigraphy and Pyu language, I: State of the field, edition and analysis of the Kan Wet Khaung Mound inscription, and inventory of the corpus. Bulletin de l’École française d’Extrême-Orient 103. 43–205.
Han, Zhengkang, Xiaowen Yuan & Katia Chirkova. 2019. Sichuan Mianning Duoxu hua. Beijing: Shangwu yin-shuguan 商务印书馆.
Haudricourt, André-Georges. 1946. Restitution du karen commun. Bulletin de la Société de Linguistique de Paris 42(1). 103–111.
Haudricourt, André-Georges. 1953. A propos de la restitution du Karen commun. Bulletin de la Société de Linguistique de Paris 49. 129–132.
Haudricourt, André-Georges. 1975. Le système des tons du karen commun. Bulletin de la Société de Linguistique de Paris 70(1). 339–43.
Henderson, Eugénie J. A. 1965. Tiddim Chin: A descriptive analysis of two texts. London: Oxford University Press.
Henderson, Eugénie J. A. 1986. Some hitherto unpublished material on Northern (Megyaw) Hpun. In John McCoy & Timothy Light (eds.), Contributions to Sino-Tibetan studies, 101–134. Leiden: Brill.
Henderson, Eugénie J. A. & Anna Allott. 1997. Bwe Karen dictionary: With texts and English-Karen word list. London: School of Oriental and African Studies, University of London.
Hill, Nathan W. 2012. Evolution of the Burmese vowel system. Transactions of the Philological Society 110(1). 64–79.
Hill, Nathan W. 2014. Proto-Kuki-Chin initials according to Toru Ohno and Kenneth Van Bik. Journal of the Southeast Asian Linguistics Society 7. 11–30.



Scholarship on Trans-Himalayan (Tibeto-Burman) languages of South East Asia 

 131

Hill, Nathan W. 2015. The contribution of Tangut to Trans-Himalayan comparative linguistics. Archiv orientální 83(1). 187–200. Hill, Nathan W. 2017. Songs of the Bailang: A new transcription with etymological commentary. Bulletin de l’École française d’Extrême-Orient 103. 387–429. Hill, Nathan W. 2019. The historical phonology of Tibetan, Burmese, and Chinese. Cambridge: Cambridge University Press. Hill, Nathan W. & Lauren Gawne. 2017. The contribution of Tibetan languages to the study of evidentiality. In Nathan W. Hill & Lauren Gawne (eds.), Evidential systems of Tibetan languages, 1–40. Berlin: Mouton de Gruyter. Honkasalo, Sami. 2019. A grammar of the Geshiza language, a culturally anchored description. Helsinki: University of Helsinki PhD dissertation. Houghton, Bernhard. 1892. Essay on the language of the southern Chins and its affinities. Rangoon: Government Printing Office. Huáng, Bùfán and Dài Qìngxià (eds.) 1992. Zàngmiǎn yǔzú yǔyán cíhuì. Běijīng 北京: Zhōngyāng Mínzú Dàxué 中央民族大学. https://stedt.berkeley.edu/~stedt-cgi/rootcanal.pl/source/ TBL. Huziwara, Keisuke. 2008. Chakku-go no kijutsu gengogakuteki kenkyuu. Kyoto: Kyoto University PhD dissertation. Huziwara, Keisuke. 2012. Rui sogo no saikou ni mukete. Kyōto Daigaku Gengogaku Kenkyū (Kyoto University Linguistic Research) 31. 25–131. Huziwara, Keisuke. 2016. Cak-English-Bangla dictionary (a Tibeto-Burman language spoken in Bangladesh). Ḍhākā: A H Development Publishing House. Ikeda, Takumi. 2002. On pitch accent in the Munya language. Linguistics of the Tibeto-Burman Area 25(2). 27–45. Ikeda, Takumi. 2006. 200 basic words of the Munya language. Zinbun 39. 81–147. Ikeda, Takumi. 2008. 200 example sentences in the Munya language (Tanggu Dialect). Zibun 40(3). 71–140. Iwasa, Kazue. 2018. Remarks on maps of the Yi script based on the Swadesh 100 wordlist (Studies in Asian Geolinguistics No. 5). 
Tokyo: Research Institute for Languages, Cultures of Asia, and Africa (ILCAA), Tokyo University of Foreign Studies. Jacques, Guillaume. 2004. Phonologie et Morphologie du Japhug (rGyalrong). Paris: Université Paris VII – Denis Diderot. Jacques, Guillaume. 2007. A shared suppletive pattern in the pronominal systems of Chang Naga and Southern Qiang. Cahiers de Linguistique – Asie Orientale 36(1). 61–78. Jacques, Guillaume. 2014. Esquisse de phonologie et de morphologie historique du tangoute. Leiden: Brill. Jacques, Guillaume. 2017a. Japhug. In Graham Thurgood & Randy J. LaPolla (eds.), The Sino-Tibetan languages, 2nd edn., 614–634. London: Routledge. Jacques, Guillaume. 2017b. Rgyalrong. In Rint Sybesma, Wolfgang Behr, Yueguo Gu, Zev Handel, C.-T. Huang & James Myers (eds.), Encyclopedia of Chinese language and linguistics, Vol. 3, 583–589. Leiden: Brill. Jacques, Guillaume. 2020. Voicing alternation and sigmatic causative prefixation in Tibetan. Bulletin of the School of Oriental and African Studies 83(2). 283–292. DOI: 10.1017/ S0041977X20002189. Jacques, Guillaume & Alexis Michaud. 2011. Approaching the historical phonology of three highly eroded Sino-Tibetan languages: Naxi, Na and Laze. Diachronica 28(4). 468–498. Jacques, Guillaume, Lai Yunfan, Anton Antonov & Lobsang Nima. 2017. Stau (Ergong, Horpa). In Graham Thurgood & Randy J. LaPolla (eds.), The Sino-Tibetan languages, 2nd edn., 597–613. London: Routledge.

132 

 Nathan W. Hill

Jacquesson, François & Seino van Breugel. 2017. The linguistic reconstruction of the past: The case of the Boro-Garo languages. Linguistics of the Tibeto-Burman Area 40(1). 90–122. DOI: https:// doi.org/10.1075/ltba.40.1.04van. Jones, Robert B. 1961. Karen linguistic studies: Description, comparison, and texts. Berkeley: University of California Press. Jones, Robert B. 1972. Sketch of Burmese dialects. In M. Estellie Smith (ed.), Studies in linguistics in honor or George L. Trager, 413–422. The Hague: Mouton. Joseph, Umbavu Varghese. 2007. Rabha. Vol. 1. Languages of the Greater Himalayan Region (Brill’s Tibetan Studies Library 5/1). Leiden: Brill. Joseph, Umbavu Varghese & Robbins Burling. 2006. Comparative phonology of the Boro Garo languages. Mysore: Central Institute of Indian Languages Publication. Kato, Atsuhiko. 2017. Pwo Karen. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan Languages, 2nd edn., 942–958. London: Routledge. Kato, Atsuhiko. 2018. How did Haudricourt reconstruct Proto-Karen tones? Reports of the Keio Institute of Cultural and Linguistic Studies 49. 21–44. Khoi, Lam Thang. 2001. A phonological reconstruction of Proto Chin. Chiang Mai: Payap University MA thesis. Kurabe, Keita. 2015. Issues in the historical phonology of Gauri Jingpho. Archives internationales d’ethnographie: Supplement 14(1). 1–19. Kurabe, Keita. 2016. A grammar of Jinghpaw, from Northern Burma. Kyoto: Kyoto University PhD dissertation. Kurabe, Keita. 2017. Jinghpaw. In Graham Thurgood & Randy J. LaPolla (eds.), The Sino-Tibetan languages (Routledge Language Family Series), 2nd edn., 993–1010. London & New York: Routledge. Lai, Yunfan. 2015. The person agreement system of Wobzi Lavrung (Rgyalrongic, Tibeto-Burman). Transactions of the Philological Society 113(3). 271–285. Lai, Yunfan. 2016. Causativisation in Wobzi and other Khroskyabs dialects. Cahiers de Linguistique – Asie Orientale 45(2). 148–175. Lai, Yunfan. 2017. Grammaire du khroskyabs de Wobzi. 
Paris: Université Paris III PhD dissertation. Lai, Yunfan, Guillaume Jacques, Gong Xun & Jesse Gates. 2020. Tangut as a West Rgyalrongic language. Folia Linguistica Historica 54.41: 171–203. Lalnunthangi, Chhangte. 2001. Mizo syntax. Munich: Lincom Europa. Lama, Ziwo Qiu-Fuyuan. 2012. Subgrouping of Nisoic (Yi) languages: A study from the perspective of shared innovation and phylogenetic estimation. Arlington, TX: University of Texas PhD dissertation. LaPolla, Randy. 2013. Subgrouping in Tibeto-Burman: Can an individual-identifying standard be developed? How do we factor in the history of migrations and language contact? In Balthasar Bickel, Lenore A. Grenoble, David A. Peterson & Alan Timberlake (eds.), What’s where why? Language typology and historical contingency, 463–474. Amsterdam: Benjamins. LaPolla, Randy. 2017. Dúlóng 獨龍 language. In Rint Sybesma, Wolfgang Behr, Yueguo Gu, Zev Handel, C.-T. James Huang & James Myers (eds.), Encyclopedia of Chinese language and linguistics. Vol. 2, 134–141. DOI: 10.1163/2210-7363_ecll_COM_00000251. LaPolla, Randy & Chenglong Huang. 2003. A grammar of Qiang. Berlin & New York: Mouton de Gruyter. LaPolla, Randy & Dory Poa. 2001. Rawang texts. München: Lincom Europa. LaPolla, Randy J. 2012. Comments on methodology and evidence in Sino-Tibetan comparative linguistics. Language and Linguistics 13(1). 117–132.



Scholarship on Trans-Himalayan (Tibeto-Burman) languages of South East Asia 

 133

Lewin, Thomas Herbert. 1874. Progressive colloquial exercises in the Lushai dialect of the ʹDzoʹ or Kúki language, with vocabularies and popular tales (notated). Calcutta: Calcutta central press company, limited. Lewis, Paul W. 1968. Akha-English dictionary (Linguistics Series 3), Data paper no. 70. Ithaca, NY: Cornell University, Department of Asian Studies, Southeast Asia program. Lewis, Paul W. 2003. Akha oral literature. Bangkok: White Lotus Press. Lewis, Paul W. & Bibo Bai. 1996. Hani-English/English-Hani dictionary: Haqniqdoq-yilyidoq, Doqlo-Soqdaoq. London: Kegan Paul International in association with the International Institute for Asian Studies. Lewis, Paul W & Bibo Bai. 2002. 51 Hani stories. Bangkok: White Lotus Press. Li, Jianfu. 2017. A descriptive grammar of Namuyi Khatho spoken by Namuyi Tibetans. Melbourne: La Trobe University PhD dissertation. Li, Zihe. 2018. yuánshǐ nàxīyǔ de qiánguānyīn hé * -r- jièyīn. 民族語文 Minzu Yuwen 1. 10–25. Li, Zihe. 2020. Yuánshǐ nàxīyǔ qiánguānyīn de láiyuán yǔ yǎnbiàn. Bulletin of Chinese Linguistics 12(2). 201–228. DOI: https://doi.org/10.1163/2405478X-01202003. Lidz, Liberty A. 2010. A descriptive grammar of Yongning Na (Mosuo). Austin, TX: University of Texas PhD dissertation. Lin, Youjing. 2016. Jiāróngyǔ Zhuókèjīhuà yǔfǎ biāozhù wénběn. Beijing: Shehui Kexue chubanshe. List, Johann-Mattis & Nathan W. Hill. 2017. Computer-assisted approaches to linguistic reconstruction. A case study from the Burmish languages. Cologne: Paperworkshop, Universität zu Köln. Lorrain, James Herbert. 1940. Dictionary of the Lushai language. Calcutta: The Asiatic Society. Lorrain, Reginald Arthur. 1951. Grammar and dictionary of the Lakher or Mara language. Gauhati: Department of Historical and Antiquarian Studies, Government of Assam. LSDO = Language and Social Development Organization. 2019a. A Chin dialect survey (Part 1 of 2). LSDO = Language and Social Development Organization. 2019b. A Chin dialect survey (Part 2 of 2). 
Luangthongkum, Theraphan. 2019. A view on Proto-Karen phonology and lexicon. Journal of the Southeast Asian Linguistics Society 12(1). i–iii. Luce, Gordon Huntington. 1959. Introduction to the comparative study of Karen languages. Journal of the Burma Research Society 42(1). 1–18. Lustig, Anton. 2010. A grammar and dictionary of Zaiwa. Leiden: Brill. Löffler, Lorenz G. 1966. The contribution of Mru to Sino-Tibetan linguistics. Zeitschrift der Deutschen Morgenländischen Gesellschaft 116(1). 118–159. Löffler, Lorenz G. 2002. Some notes on Maraa. Linguistics of the Tibeto-Burman area 25(1). 123–136. Lǐ, Chūnfēng. 2014. Bāng duǒ lāhù yǔ cānkǎo yǔfǎ. Beijing: 中国社会科学出版社 zhōngguó shèhuìkēxué chūbǎnshè [China Social Science Press]. Mann, Noel Walter. 1998. A phonological reconstruction of Proto Northern Burmic. Arlington, TX: The University of Texas MA thesis. Manson, Ken. 2017. The characteristics of the Karen branch of Tibeto-Burman. In Picus Sizhi Ding & Jamin Pelkey (eds.), Sociohistorical linguistics in Southeast Asia, 149–168. Leiden: Brill. DOI: https://doi.org/10.1163/9789004350519_010. Matisoff, James A. 1972. The Loloish tonal split revisited. Berkeley: Center for South and Southeast Asia Studies, University of California. Matisoff, James A. 1973. The grammar of Lahu. Berkeley & London: University of California Press. Matisoff, James A. 1988. The dictionary of Lahu (University of California Publications in Linguistics 111). Berkeley & Los Angeles: University of California Press. Matisoff, James A. (ed.). 2003. Handbook of Proto-Tibeto-Burman: System and philosophy of Sino-Tibetan reconstruction. Berkeley: University Presses of California. Matisoff, James A. 2006. English-Lanu Lexicon. Berkeley: University of California Press.

134 

 Nathan W. Hill

Meillet, Antoine. 1954. La méthode comparative en linguistique historique. Reprint. Paris: Honoré Champion. Michaud, Alexis. 2017. Tone in Yongning Na. Lexical tones and morphotonology. Berlin: Language Science Press. Michaud, Alexis & Guillaume Jacques. 2012. The phonology of Laze: Phonemic analysis, syllabic inventory, and a short word list. Yǔyánxué lùn cóng 语言学论丛 45. 196–230. Michaud, Alexis, Yaoping Zhong & Limin He. 2017. Nàxī 納西 language / Naish languages. In Rint Sybesma, Wolfgang Behr, Yueguo Gu, Zev Handel, C.-T. James Huang & James Myers (eds.), Encyclopedia of Chinese language and linguistics, vol. 3, 144–157. DOI: 10.1163/2210-7363_ ecll_COM_00000247. Miller, Roy Andrew. 1974. Sino-Tibetan: Inspection of a conspectus. Journal of the American Oriental Society 94(2). 195–209. Miyake, Marc Hideo. 2018. Studies in Pyu phonology, ii: Rhymes. Bulletin of Chinese Linguistics 11(1/2). 37–76. DOI: 10.1163/2405478X-01101008. Miyake, Marc Hideo. 2019. A first look at Pyu grammar. Linguistics of the Tibeto-Burman area 42(2). 150–221. DOI: 10.1075/ltba.18013.miy. Mortensen, David. 2003. Comparative Tangkhul. Berkeley: University of California unpublished qualifying paper. Mǎ, Xuéliáng & Qìngxià Dài. 1982. ‘Báilánggē’ Yánjiū. 民族語文 Minzu yuwen [Nationality languages] 5. 16–26. Naksuk, Yuttaporn. 2012. Intha phonology and lexicon with comparison to three Burmese dialects (Yangon, Arakan, and Tavoyan). Bangkok: Mahidol University PhD dissertation. Naylor, L. B. 1925. A practical handbook of the Chin language (Siyin dialect): Containing grammatical principles with numerous exercises and a vocabulary. Rangoon: Superintendent, Union Government Printing and Stationery. Nishi, Yoshio. 1999. Four papers on Burmese: Toward the history of Burmese (the Myanmar language). Tokyo: Institute for the study of languages, cultures of Asia, and Africa, Tokyo University of Foreign Studies. Nishida, Tatsuo. 1955. 
Myazedi hibun ni okeru chūko Biruma-go no kenkyū [Studies in the ancient Burmese language through the Myazedi inscriptions] 1. Palaeologia IV(1). 17–32. Nishida, Tatsuo. 1956. Myazedi hibun ni okeru chūko Biruma-go no kenkyū [Studies in the ancient Burmese language through the Myazedi inscriptions] 2. In Palaeologia V(1). 22–40. Nishida, Tatsuo. 1966a. Akago no onso taikei: Taikoku hokubu niokeru sanchimin Akazoku no gengo no kijutsuteki kenkyū. Onsei Kagaku Kenkyū 音聲科學研究 (Studia phonologica) 4(1). 1–36. Nishida, Tatsuo. 1966b. Bisugo no kenkyū: Taikoku hokubu niokeru Bisuzoku no gengo no yobite-ki kenkyū. 東南アジア研 究 Tōnan Ajia Kenkyū 4(1). 65–87. Nishida, Tatsuo. 1966–1967. Bisugo no keitō. In 東南アジア研究 Tōnan Ajia Kenkyū 4(3–5). 440–466 and 854–870. Nishida, Tatsuo. 1967a. Biruma niokeru Pazoku no gengo nitsuite: Nampō Paogo Paanhōgen oboegaki. 言語研究 Gengo Kenkyū (Journal of the Linguistic Society of Japan) 50. 15–33. Nishida, Tatsuo. 1967b. Risugo no kenyū: Taikoku Tākuken niokeru Risuzoku no kotoba no yobi hōkoku. 東南アジア研究 Tōnan Ajia Kenkyū 5(2). 276–307. Nishida, Tatsuo. 1968. Risugo hikaku kenyū. 東南アジア研究 Tōnan Ajia Kenkyū 6(1/2). 2–35 and 261–289. Nishida, Tatsuo. 1969a. Rahu shigo no kenyū: Taikoku Chenraiken niokeru Rahushizoku no gengo no yobi hōkoku. 東南アジア研究 Tōnan Ajia Kenkyū 7(1). 2–39. Nishida, Tatsuo. 1969b. Roro Birumago hikaku kenkyū niokeru mondai. 東南アジア研究 Tōnan Ajia Kenkyū 6(4). 868–899.



Scholarship on Trans-Himalayan (Tibeto-Burman) languages of South East Asia 

 135

Nishida, Tatsuo. 1973a. A preliminary study of the Bisu language: A language of northern Thai-land, recently discovered by us. In D. W. Dellinger (ed.), Papers in South East Asian Linguistics, no. 3 (Pacific Linguistics, Series A-30), 55–82. Canberra: Linguistic Circle of Canberra. Nishida, Tatsuo. 1973b. Tosu Yakugo no kenkyū: Shin gengo Tosugo no kōzō to keitō 多續譯語の 研究:新言語トス語の構造と 系統 [A study of the Tosu-Chinese vocabulary Tos i-yu: The structure and lineage of Tosu, a new language]. Vol. 6. Ka-i Yakugo kenkyū sōsho. Shōkadō. Nishida, Tatsuo. 1977. Some problems in the comparison of Tibetan, Burmese and Kachin languages. Onsei Kagaku Kenkyū 11. 1–24. Nishida, Tatsuo. 1979. Lolo-Burmese studies I. Onsei Kagaku Kenkyū 12. 1–24. Nishida, Tatsuo. 1987. Trunggo oyobi Nugo no ichi nitsuite. In Tōhō Gakkai (ed.), Tōhō Gakkai sōritsu 40 shūnen kinen Tōhō ronshū 東方學會創立 40 周年記念東方學論集 [“Eastern Studies” fortieth anniversary volume], 973–988. Tokyo: Tōhō Gakkai. Nishida, Tatsuo & Hongkai Sun. 1990. Hakuba Yakugo no kenkyū: Hakubago no kōzō to keitō. Ka-i Yakugo kenkyū sōsho 7. Kyoto: Shōkadō Shoten. Ohno, Toru. 1965. Kyotsu-kuchi-chin-go no saikosei I: Goto shi in. 言語研究 Gengo Kenkyū (Journal of the Linguistic Society of Japan) 47(3). 8–20. Okell, John. 1995. Three Burmese dialects. In David Bradley (ed.), Studies in Burmese languages, 1–138. Canberra: Department of Linguistics, Research School of Pacific Studies, Australian National University. Pavlík, Štěpán. 2017. The description of Namuzi language. Prague: Charles University PhD dissertation. Peiros, Ilia. 1998. Comparative linguistics in Southeast Asia (Pacific linguistics). Canberra: Pacific Linguistics, Research School of Pacific and Asian Studies, Australian National University. Peiros, Ilia & Sergei Starostin. 1996. A comparative vocabulary of five Sino-Tibetan languages. Melbourne: University of Melbourne, Department of Linguistics. Pelkey, Jamin. 2011. A Phula comparative lexicon. SIL International. Perlin, Ross. 
2019. A grammar of Trung. Himalayan Linguistics 18(2). DOI: 10.5070/H918244579. Peterson, David. 2017a. Hakha Lai. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 2nd edn., 258–276. London: Routledge. Perlin, Ross. 2017b. On Kuki-Chin subgrouping. In Picus Sizhi Ding & Jamin Pelkey (eds.), Sociohistorical linguistics in Southeast Asia: New horizons for Tibeto-Burman studies in honor of David Bradley, 189–209. Leiden: Brill. Peterson, David A. 2014. Rengmitca: The most endangered Kuki-Chin language of Bangladesh. In Nathan W. Hill & Thomas Owen-Smith (eds.), Trans-Himalayan linguistics, 313–327. Berlin: Mouton de Gruyter. Peterson, David A. 2019. Bangladesh Khumi. In Alice Vittrant & Justin Watkins (eds.), The Mainland Southeast Asia Linguistic Area, 12–55. Berlin: Mouton de Gruyter. DOI: 10.1515/9783110401981-002. Post, Mark W. & Robbins Burling. 2017. The Tibeto-Burman languages of Northeast India. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 2nd edn., 213–242. London: Routledge. Preiswerk, Thomas. 2014. Die Phonologie des Alttibetischen auf Grund der chinesischen Beamtennamen im chinesisch-tibetischen Abkommen von 822 n. Chr. Zentralasiatische Studien 43. 7–158. Prins, Marielle. 2016. A grammar of rGyalrong, Jiǎomùzú (Kyom-kyo) dialects. Leiden: Brill. DOI: https://doi.org/10.1163/9789004325630. Purser, W. C. B. & Saya Tun Aung. 1920. A comparative dictionary of the Pwo-Karen dialect. Rangoon: American Baptist Mission Press. Reichle, Verena. 1981. Bawm language and lore: Tibeto-Burman area. Bern: Peter Lang.

136 

 Nathan W. Hill

Rundall, Frank M. 1891. Manual of the Siyin dialect spoken in the Northern Chin Hills. Rangoon: Government printing, 121. Sagart, Laurent. 2006. Review of Handbook of Proto-Tibeto-Burman: System and philosophy of Sino-Tibeto-Burman reconstruction. By James A. Matisoff. Diachronica 23(1). 206–223. Sagart, Laurent, Guillaume Jacques, Yunfan Lai, Robin J. Ryder, Valentin Thouzeau, Simon J. Greenhill & Johann-Mattis List. 2019. Dated language phylogenies shed light on the ancestry of Sino-Tibetan. Proceedings of the National Academy of Sciences 116(21). 10317–10322. DOI: 10.1073/pnas.1817972116. eprint: https://www.pnas.org/content/116/21/10317.full.pdf. Satterthwaite-Phillips, D. 2011. Phylogenetic inference of the Tibeto-Burman languages or on the usefulness of lexicostatistics (and “megalo”-comparison) for the subgrouping of Tibeto-Burman. Stanford: Stanford University PhD dissertation. Sims, Nathaniel A. 2020. Reconsidering the diachrony of tone in Rma. Journal of the Southeast Asian Linguistics Society 13(1). 53–85. So-Hartmann, Helga. 2009. A descriptive grammar of Daai Chin. Berkeley: STEDT Monograph. Solnit, David. 2017. Eastern Kayah Li. In Graham Thurgood & Randy J. LaPolla (eds.), The Sino-Tibetan languages (Routledge Language Family Series), 2nd edn., 932–941. London & New York: Routledge. Solnit, David B. 1979. Proto-Tibeto-Burman *r in Tiddim Chin and Lushai. Linguistics of the Tibeto-Burman Area 4(2). 111–121. Stern, Theodore. 1963. A provisional sketch of Sizang (Siyin) Chin. Asia Major, New Series 10(2). 222–278. Stern, Theodore. 1985. Sizang (Siyin) Chin texts. Linguistics of the Tibeto-Burman Area 8(1). 43–58. Straub, Nathan. 2020. Annotated bibliography of Nungish (Version 2016.11.21). Zenodo. https:// zenodo.org/record/3996184 (last accessed 4 December 2020). Sun, Hongkai. 1982. Dúlóngyǔ jiǎnzhì 獨龍語簡志 [A brief description of Dulong]. Beijing: Minzu chubanshe. Sun, Hongkai. 2014. Shǐxīngyǔ yánjiū. Beijing: Minzu University Press. Sun, Jackson T. S. 
2017. Tshobdun Rgyalrong. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 2nd edn., 557–571. London: Routledge. Suzuki, Hiroyuki. 2012. Kamuchibettogo Sangdam hōgen no onsei bunseki to sono hōgen tokuchō. In アジア·ア フリカ言語文化研究 Ajia afurika gengobunka kenkyū [Journal of Asian and African studies] 83. 37–58. Sūn, Hóngkāi (ed.) 1991. Zàngmiǎnyǔ yǔyīn hé cíhuì. Zhōngguó Shèhuì Kēxué 中国社会科学. Thurgood, Graham. 1981. Review of Bradley 1979. Bulletin of the School of Oriental and African Studies 44(3). 622–623. DOI: 10.1017/S0041977X0014474X. Thurgood, Graham. 2017a. Sino-Tibetan: Genetic and areal subgroups. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 2nd edn., 3–39. London: Routledge. Thurgood, Graham. 2017b. Ānóng 阿儂 language. In Rint Sybesma, Wolfgang Behr, Yueguo Gu, Zev Handel, C.-T. James Huang & James Myers (eds.), Encyclopedia of Chinese language and linguistics. Vol. 1, 156–162. DOI: 10.1163/2210-7363_ecll_COM_00000251. Tun Aung Kyaw. 2007. Phwanḥ desiyacakāḥ leʹlā khyak. ဖွန်းဒေသိယစကားလေ့လာချက် [A study of the Hpun language]. Rangoon: Yangon University PhD dissertation. Van Bik, Kenneth. 2009. Proto-Kuki-Chin. A reconstructed ancestor of the Kuki-Chin languages. Berkeley: University of California. Wannemacher, Mark W. 1994. A preliminary phonology of the Atsi language. Chiangmai: Payap Research, Development Institute, Payap University, and The Summer Institute of Linguistics. Wannemacher, Mark W. 1998. Aspects of Zaiwa prosody: An autosegmental account. Dallas, TX: Summer Institute of Linguistics.



Scholarship on Trans-Himalayan (Tibeto-Burman) languages of South East Asia 

 137

Wasilewska, Halina. 2014. Unity and diversity: The Yi traditional writing system and its multiple representations: A study. Stęszew: International Institute of Ethnolinguistic and Oriental Studies. Watkins, Calvert. 1976. Towards Proto-Indo-European syntax: Problems and pseudo-problems. In Sanford B. Steever, Carol A Walker & Salikoko S. Mufwene (eds.), Papers from the Parasession on Diachronic Syntax, 305–326. Chicago: Chicago Linguistic Society, University of Chicago. [Reprinted in 1994, Selected writings. In Lisi Oliver (ed.), Vol. I, Language and linguistics; vol II, Culture and poetics, xvi + 771. Innsbruck: Innsbrucker Beiträge zur Sprachwissenschaft, Band 80]. Wāng, Dànián & Xiàngyáng Cài. 2018. Miǎndiànyǔ fāngyán yánjiū. Beijing: 北京大学出版社 Běijīng dàxué chūbǎnshè and 新华书店 Xīnhuá shūdiàn. Xú, Xījiān & Guìzhēn Xú. 1984. Jǐngpōzú yǔyán jiǎnzhì (Zǎiwǎyǔ). Beijing: 民族出版社 Mínzú chūbǎnshè. Yabu, Shirō. 1982. Atsigo kiso goishū. Tokyo: 東京外国語大学アジア·アフリカ言語文化研究 所 Tōkyō Gaikokugo Daigaku Ajia Afurika Gengo Bunka Kenkyūjo. Yabu, Shirō. 2003. The Hpun language endangered in Myanmar. Osaka: Osaka University of Foreign Studies. Yabu, Shirō. 2006. Old Burmese (OB) of Myazedi inscription in OB materials. Osaka: Osaka University of Foreign Studies. Yu, Dominic. 2012. Proto-Ersuic. Berkeley: University of California PhD dissertation. Yu, Dominic. 2019. Proto-Ersuic and Doshu. Unpublished paper presented at the 52nd International Conference on Sino-Tibetan Languages and Linguistics, University of Sydney. Yǐn, Wèibīn. 2016. Nàmùzīyǔ yǔfǎ biāozhù wénběn. Beijing: 社會科學文獻出版社 Shèhuì kēxué wénxiàn chūbǎn-shè. Zakaria, Muhammad. 2018. A grammar of Hyow. Nanyang: Nanyang Technological University, 882, PhD dissertation. Zhang, Sihong. 2013. A reference grammar of Ersu: A Tibeto-Burman language of China. Cairns: James Cook University PhD dissertation. Zhèngzhāng, Shàngfāng. 1993. Shànggǔ miǎngē ‘Báilánggē’ de quánwén jiědú. 民族语文 Minzu yuwen [Nationality languages] 1/2. 10–21 and 64–70. 
Zhū, Yànhuá & Lèpáizǎozā. 2013. Zhēfàng zàiwǎyǔ cānkǎo yǔfǎ. Beijing: 中国社会科学出版社 Zhōngguó shèhuìkēxué chūbǎnshè.

Yoshihisa Taguchi

8 Historiography of Hmong-Mien linguistics

https://doi.org/10.1515/9783110558142-008

8.1 Introduction

Although reports on individual Hmong-Mien (Miao-Yao) languages go back to the beginning of the twentieth century (e.g., Bonifacy 1905), or even earlier if we include vocabulary records compiled by Chinese officials, true linguistic study of the Hmong-Mien languages began towards the middle of the twentieth century. The earliest serious linguistic studies were conducted by Chinese linguists: Fanggui Li’s (1930) comparative study of some Hmong-Mien words, Yuenren Chao’s (1930) phonological study of Yao songs, and Sikling Wong’s (1939) description of a Mienic language, Zao-Min.

In the early period of Hmong-Mien linguistics, only a limited number of language varieties were accessible to Western scholars: White Hmong, Green Mong, and some Mienic varieties, all spoken in Southeast Asian countries. As detailed in the next section, these languages have been extensively studied by Western scholars, yet the list of well-studied Hmong-Mien languages remains short to this day.

Most varieties of Hmong-Mien, however, are spoken in Mainland China, and their existence became known to the academic world only later. A comprehensive picture of Hmong-Mien began to emerge only after the Chinese Academy of Sciences conducted a language survey among minorities in China. The survey started in 1956, and its results began to appear in major academic journals from the 1960s (e.g., Second research team of Minority Language Survey of Chinese Academy of Sciences, 1962).

8.2 Synchronic studies

8.2.1 Language sketches and dictionaries

Grammar sketches of individual languages of this family were first published in the late 1970s: for instance, Mottin (1978), a grammar course of White Hmong, and Lyman (1979), a compact grammar of Green Mong. Both describe a Hmongic variety spoken in Southeast Asia. In the 1980s, as the above-mentioned language survey in China bore fruit, a series of minority language sketches was published, which included Hmong-Mien languages: A Sketch of Yao People’s Languages (Mao et al. 1982), A Sketch of the Miao Language (F. Wang 1985), and A Sketch of the She language (Mao and Meng 1986). These titles reveal that ethnic identity is given great significance in language classification in China: the Miao language in F. Wang (1985) includes only Hmongic languages spoken by the ethnic Miao, while the She language (also known as Ho-Ne), spoken by the She ethnic group, is actually a Hmongic language (see Ratliff 1998).

From the late 1990s onward, sketches of other Hmong-Mien languages were published, providing a fuller picture of the family. The main contributors are Zongwu Mao and Yunbing Li, who published a series of general descriptions of Hmongic languages; the series comprises Pa-Hng (Baheng, Mao and Li 1997), Pu-Nu (Bunu, Meng 2001), Kiong-Nai (Jiongnai, Mao and Li 2002), and You-Nuo (Younuo, Mao and Li 2007). Mao also published a vocabulary list containing lexical data from ten Mienic lects (Mao 2004). Other publications that provide extensive lexical information include Shintani and Yang (1990), which describes one of the Kim-Mun lects, Nakanishi (2003), which describes the She language, and Taguchi (2008), which describes the West Hmongic language Lan Hmyo (also known as Luobohe Miao).

Turning to dictionaries, the earliest ones were compiled by French scholars: Savina (1926), a Mun-French dictionary (Kim-Di-Mun, a Mienic language), and Bertrais (1964), a Hmong-French dictionary. Today, in the English-speaking world, Heimbach (1979) is the most popular Hmong-English dictionary, and Lombard (1968) has been widely used as an Iu-Mien-English dictionary. More recently, Purnell (2012) has provided more extensive lexical information about Iu-Mien. In China, too, many dictionaries have been published since 1990: Xiang (1992, Chinese-Xiangxi Miao), C. Wang (1992, Chinese-Qiandong Miao), Zhang and Xu (1990, Qiandong Miao-English), Xian (2000, Chuanqiandian Miao-English), Mao (1992, Chinese-Iu-Mien), Meng (1996, Chinese-Pu Nu), and Meng and Meng (2008, Pu Nu-Chinese).

8.2.2 Grammar studies

Apart from the language sketches mentioned in the previous section, studies on specific grammar topics began to appear from the 1980s: Clark (1985, 1989) are early contributions to the areal-linguistic study of White Hmong; Court (1986) is a descriptive grammar of Iu-Mien focusing on its nominal structures; and Harriehausen (1990) is the first published reference grammar of Green Mong.

In China, descriptive work has been conducted quite independently from Western linguistics. The grammar books published in the late twentieth century include C. Wang (1986), which describes Qiandong Miao (also known as Hmu, East Hmongic), and Luo (1990), which describes Xiangxi Miao (also known as Xong, North Hmongic). A grammar of Diandongbei (also known as A-Hmao, West Hmongic) was published more recently as well (W. Wang 2005).

In the 1990s, extensive grammatical studies appeared. Their main targets were Green Mong and White Hmong, two closely related West Hmongic lects. Jarkey (1991, later published as Jarkey 2015) is a detailed analysis of serial verb constructions in White Hmong. Ratliff (1992) is a thorough study of morphological processes in White Hmong. Bisang (1993, 1999) are theoretical studies of classifiers with particular attention to White Hmong. In Hmong grammar studies, Nerida Jarkey has been active as a leading scholar. Her recent publications include Jarkey (2006), a study of complementation strategies in White Hmong; Jarkey (2010), a study of transitivity in White Hmong; and Jarkey (2019), a study of discourse strategies in White Hmong.

In the twenty-first century, reference grammars of individual Hmong-Mien lects have been submitted as dissertations: Yu (2011, Aizhai Miao, North Hmongic), Ji (2012, Taijiang Miao, East Hmongic), Sposato (2015, Xong, North Hmongic), and Arisawa (2016, Iu-Mien, Mienic). Typology-oriented studies on Hmong-Mien have also been increasing: Sposato (2012), a descriptive study of relative clauses in Xong; Sposato (2014), a typological study of word order in Hmong-Mien; and White (2019), a typological study of White Hmong classifiers.

Despite these endeavors, the grammar of most languages of the family remains undocumented; the exceptions are White Hmong, Green Mong, Iu-Mien, and a couple of other Hmongic lects. More descriptive work is necessary to fill the gap and provide typology-sensitive information for crosslinguistic study.

8.2.3 Phonetic studies

The first transcription of a Hmong-Mien language in a phonetic alphabet was made by the celebrated phonetician Daniel Jones, who transcribed a passage of A-Hmao (Jones 1923). Since then, much phonetic and phonological work has been done, especially to develop an orthography for each language (see, for example, Smalley 1976 and Smalley et al. 1990 for White Hmong orthography). Tone and phonation have attracted scholars’ special attention in Hmong-Mien phonetics. Huffman (1985) is an early contribution to phonation studies; Kong (1992) is a study of the five level tones of a Hmongic lect; Andruski and Ratliff (2000) is a study of the interaction between tone and phonation in Green Mong; and Esposito (2012) is a new contribution to the same field. However, the phonetic details of most Hmong-Mien languages remain understudied.

8.3 Diachronic studies

Diachronic research on this language group in the twentieth century had three main targets: genetic relationship with other families, internal phylogeny (genealogical relationship within the group), and reconstruction of the proto-language and the language history. Contributions to other fields of historical studies, such as studies of grammatical change, have only recently emerged.


8.3.1 Genetic relationship with other families and phylogeny

Despite the long-standing debate about which family Hmong-Mien is genetically related to, no consensus has been reached. The Sino-Tibetan theory has been one of the candidates (F. Li 1937, 1973) and still has some supporters in China (see F. Wang 1985, 1986; Chen 2001). Other genetic theories include the Austroasiatic theory (Forrest 1948; Haudricourt 1966) and the Austro-Tai theory (Benedict 1975, 1997). It is also possible that the Hmong-Mien languages constitute an independent language family (or phylum) by themselves.

Although the genetic relationship between Hmongic and Mienic has long been recognized (e.g., F. Li 1930, 1937), the internal phylogeny of the lects was studied extensively only after data were obtained from the language survey in China. The first classification of Hmongic languages was submitted in 1957 (Second research team of Minority Language Survey of Chinese Academy of Sciences 1957). This classification was based on the ratio of shared vocabulary and divided the Hmongic lects into five groups. The first phylogenetic tree of Hmong-Mien was drawn by Purnell (1970), who divides the Hmongic lects into four groups based on shared phonological innovations (Purnell 1970: 40, 137). In 1987, Strecker submitted a new classification recognizing Pa-Hng as an independent branch of Hmong-Mien (Strecker 1987a, 1987b, 1987c).

The position of Pa-Hng within the family has been one of the focal points in Hmong-Mien phylogeny. Scholars in Western countries have treated Pa-Hng as one of the earliest split-offs of the family, or at least the earliest split-off of the Hmongic branch (see Niederer 2004: 141; Ratliff 2010: 3). In China, Pa-Hng is treated as one of the equal daughters of Proto-Hmongic along with other Hmongic languages (see Wang and Mao 1995: 2–3; Y. Li 2018: 21).

8.3.2 Proto-language reconstruction and related studies

The earliest comparative study was done by Chang Kun. He began his Hmong-Mien studies with a paper reconstructing the proto-tones of Hmong-Mien (Chang 1947), and went on to study the tones and the initial consonants of the proto-language (Chang 1953, 1966, 1972, 1976). Downer is another scholar who was active in the 1960s–1970s: Downer (1963) provides a historical survey of the family; Downer (1973) is a study of Chinese loanwords in Iu-Mien. Meanwhile, important contributions were also made by other celebrated linguists: Haudricourt (1954, 1960) and Shafer (1964). The first full-fledged reconstruction of Proto-Hmong-Mien (Proto-Miao-Yao in his terminology) was accomplished by Purnell (1970). Although the data available to him were limited compared with the present state of knowledge, his reconstruction, based on strict sound correspondences, is still worth consulting.

In China, after the Cultural Revolution, Chen (1979), a paper on the development of Tone D, opened a series of publications on Hmong-Mien historical phonology. The representative scholars from this period are Fushi Wang and Qiguang Chen. F. Wang, who had completed his reconstruction of Proto-Hmongic in 1979 (F. Wang 1979), went on to publish a succession of papers and books on Hmong-Mien historical linguistics. His main works include a classification of the Miao languages (F. Wang 1983), a reconstruction of Proto-Miao (F. Wang 1994, a revised version of the 1979 manuscript), and a reconstruction of Proto-Hmong-Mien, co-authored with Zongwu Mao (Wang and Mao 1995). F. Wang (1994), a reconstruction of the proto-language in the style of traditional Chinese phonology, still serves as a cornerstone of Hmong-Mien historical linguistics. Chen (1979) first recognized that a part of Tone C in Hmongic comes from proto-Tone D of syllables with a coda *-k. Chen’s representative works include an introduction to Hmong-Mien linguistics (Chen 1991), a reconstruction of Proto-Hmong-Mien (Chen 2001), a study of Hmong-Mien nominal prefixes (Chen 1993), and a handbook of Hmong-Mien linguistics (Chen 2013). In addition to the works of F. Wang and Chen, there are Lixin Jin’s reconstruction of the Proto-Hmong-Mien open-syllable rime system (Jin 2007) and Yunbing Li’s recent reconstruction of Proto-Hmong-Mien (Y. Li 2018).

Turning to publications outside China, many important contributions have been made since 1990: Downer (1991) recognizes the above-mentioned tonal change of *D > *C as an innovation that characterizes the Hmongic branch; L.-Thongkum (1993) is a reconstruction of Proto-Mienic based on data that she collected herself; Solnit (1996) proposes a new view of proto-cluster initials based on data from a Mienic language, Biao-Min; Niederer (1998) provides a useful survey of the historical phonology of the family; Nakanishi (2007) is a study of nasal codas in Hmongic, which shows that Proto-Hmongic had a single nasal coda, the distinction in place of articulation having been lost.
Martha Ratliff has been the most active scholar in historical Hmong-Mien studies since the early 1990s. Her main contributions include a study of Hmong-Mien tonogenesis (Ratliff 2002), a historical study of prefixes (Ratliff 2006), and a reconstruction of Proto-Hmong-Mien (Ratliff 2010). Ratliff (2010) also includes a study of grammatical change, such as the genesis of classifiers, and a cultural reconstruction of the ancient Hmong-Mien world.

References

Andruski, Jean E. & Martha Ratliff. 2000. Use of phonation type in distinguishing tone: The case of Green Mong. Journal of the International Phonetic Association 30(1/2). 37–61.
Arisawa, Tatsuro Daniel. 2016. An Iu Mien grammar: A tool for language documentation and revitalization. Melbourne: La Trobe University PhD dissertation.
Benedict, Paul K. 1975. Austro-Thai language and culture with a glossary of roots. New Haven: Human Relations Area Files Press.
Benedict, Paul K. 1997. Interphyla flow in Southeast Asia. Mon-Khmer Studies 27. 1–11.
Bertrais, Yves. 1964. Dictionnaire hmong-français. Vientiane, Laos: Mission Catholique.
Bisang, Walter. 1993. Classifiers, quantifiers and class nouns in Hmong. Studies in Language 17(1). 1–51.


Bisang, Walter. 1999. Classifiers in East and Southeast Asian languages: Counting and beyond. In Jadranka Gvozdanović (ed.), Numeral types and changes worldwide, 113–185. Berlin: Mouton de Gruyter.
Bonifacy. 1905. Etudes sur les langues parlées par les populations de la haute rivière claire. Bulletin de l’École française d’Extrême-Orient 5. 306–327.
Chang, Kun. 1947. Miaoyaoyu shengdiao wenti [On the tone system of the Miao-Yao languages]. Bulletin of the Institute of History and Philology 16. 93–110.
Chang, Kun. 1953. On the tone system of the Miao-Yao languages. Language 29. 374–378.
Chang, Kun. 1966. A comparative study of the Yao tone system. Language 42. 303–310.
Chang, Kun. 1972. The reconstruction of Proto-Miao-Yao tones. Bulletin of the Institute of History and Philology 44. 541–628.
Chang, Kun. 1976. Proto-Miao initials. Bulletin of the Institute of History and Philology 47(2). 155–218.
Chao, Yuenren. 1930. Guangxi Yaoge Jiyin [Phonetics of the Yao Folk-songs]. Monograph A, No.1. Beijing: Academia Sinica.
Chen, Qiguang. 1979. Miaoyaoyu rusheng de fazhan [Development of the entering tone in the Miao-Yao languages]. Minzu Yuwen 1. 25–30.
Chen, Qiguang. 1991. Miaoyaoyu pian [Hmong-Mien languages]. In Xueliang Ma (ed.), Hanzangyu gailun [A general introduction to Sino-Tibetan languages], 601–806. Beijing: Beijing Daxue.
Chen, Qiguang. 1993. Miaoyaoyu qianzhui [Hmong-Mien prefixes]. Minzu Yuwen 1. 1–9.
Chen, Qiguang. 2001. Hanyu Miaoyaoyu bijiao yanjiu [A comparative study of Chinese and Miao-Yao]. In Bangxin Ding & Hongkai Sun (eds.), Hanzangyu Tongyuanci Yanjiu [A Study of Sino-Tibetan cognate vocabulary], 129–651. Nanning: Guangxi Minzu Chubanshe.
Chen, Qiguang. 2013. Miao Yao Yuwen [A Handbook of the Miao-Yao languages]. Beijing: Zhongguo Minzu Daxue Chubanshe.
Clark, Marybeth. 1985. Asking questions in Hmong and other Southeast Asian languages. Linguistics of the Tibeto-Burman Area 8(2). 60–67.
Clark, Marybeth. 1989. Hmong and areal South-East Asia. In David Bradley (ed.), South-East Asian Syntax. Papers in South-East Asian linguistics, 175–230. Canberra: Department of Linguistics, Research School of Pacific Studies, the Australian National University.
Court, Christopher Anthony Forbes. 1986. Fundamentals of Iu Mien (Yao) grammar. Berkeley: University of California PhD dissertation.
Downer, Gordon B. 1963. Chinese, Thai, and Miao-Yao. In H. L. Shorto (ed.), Linguistic comparison in South East Asia and the Pacific, 133–139. London: School of Oriental and African Studies, University of London.
Downer, Gordon B. 1973. Strata of Chinese loanwords in the Mien dialect of Yao. Asia Major 18(1). 1–33.
Downer, Gordon B. 1991. The relationship between the Yao and the Miao languages. In Jacques Lemoine & Chiao Chien (eds.), The Yao of South China: Recent international studies, 39–45. Paris: Pangu, Editions de l’A.F.E.Y.
Esposito, Christina M. 2012. An acoustic and electroglottographic study of White Hmong tone and phonation. Journal of Phonetics 40. 466–476.
Forrest, R. A. D. 1948. The Chinese language. London: Faber and Faber.
Harriehausen, Bettina. 1990. Hmong Njua: Syntaktische Analyse einer gesprochenen Sprache mithilfe datenverarbeitungstechnischer Mittel und sprachvergleichende Beschreibung des südostasiatischen Sprachraumes. Tübingen: Max Niemeyer Verlag.
Haudricourt, André G. 1954. Introduction à la phonologie historique des langues Miao-Yao. Bulletin de l’École française d’Extrême-Orient 44. 555–576.




Haudricourt, André G. 1960. V. Note sur les dialectes de la région de Moncay. Bulletin de l’École française d’Extrême-Orient 50(1). 161–177.
Haudricourt, André G. 1966. The limits and connections of Austroasiatic in the northeast. In Norman H. Zide (ed.), Studies in comparative Austroasiatic linguistics, 44–56. The Hague: Mouton.
Heimbach, Ernest E. 1979. White Hmong-English Dictionary. Ithaca: Cornell University.
Huffman, Marie K. 1985. Measures of phonation type in Hmong. UCLA Working Papers in Phonetics 61. 1–25.
Jarkey, Nerida. 1991. Serial verbs in White Hmong. Sydney: University of Sydney PhD dissertation.
Jarkey, Nerida. 2006. Complement clause types and complementation strategy in White Hmong. In R. M. W. Dixon & Alexandra Y. Aikhenvald (eds.), Complementation: A cross-linguistic typology, 115–136. Oxford: Oxford University Press.
Jarkey, Nerida. 2010. Shiromongo niokeru tadousei [Transitivity in White Hmong]. In Yoshihiro Nishimitsu & Prashant Pardeshi (eds.), Zidoushi Tadoushi No Taisho. Tokyo: Kuroshio.
Jarkey, Nerida. 2015. Serial verbs in White Hmong. Leiden: Brill.
Jarkey, Nerida. 2019. Bridging constructions in narrative texts in White Hmong (Hmong-Mien). In Valérie Guérin (ed.), Bridging constructions, 129–156. Berlin: Language Science Press.
Ji, Anlong. 2012. Miaoyu Taijianghua Cankao Yufa [A Reference Grammar of the Taijiang Miao]. Kunming: Yunnan Minzu Chubanshe.
Jin, Lixin. 2007. The rhyme system of Yinsheng in Miao-Yao. Yuyan Yanjiu 27(3). 99–111.
Jones, Daniel. 1923. tʃainːz. Le Maître Phonétique 38(1). 4–5.
Kong, Jiangping. 1992. Ziyun Miaoyu wupingdiao xitong de shengxue ji ganzhi yanjiu [An acoustic and auditory study of the five level tones of Ziyun Miao]. In Ma Xueliang (ed.), Minzu Yuwen Yanjiu Xintan, 152–163. Chengdu: Sichuan Minzu Chubanshe.
Li, Fanggui. 1930. Guangxi Lingyun Yaoyu. Bulletin of the Institute of History and Philology 1(4). 419–426.
Li, Fanggui. 1937. Languages and dialects of China. Chinese year book. Shanghai: The Chinese Year Book Publishing Company.
Li, Fanggui. 1973. Languages and dialects of China. Journal of Chinese Linguistics 1(1). 1–13.
Li, Yunbing. 2018. Miaoyaoyu Bijiao Yanjiu [A comparative study of the Miao-Yao languages]. Beijing: Shangwu Yinshuguan.
Lombard, Sylvia J. 1968. Yao-English Dictionary. Ithaca: Cornell University.
Luo, Anyuan. 1990. Xiandai Xiangxi Miaoyu Yufa [A grammar of the Modern Xiangxi Miao]. Beijing: Zhongyang Minzu Xueyuan Chubanshe.
Lyman, Thomas Amis. 1979. Grammar of Mong Njua (Green Miao): A descriptive linguistic study. Sattley, CA: The Blue Oak Press.
L.-Thongkum, Theraphan. 1993. A view on Proto-Mjuenic (Yao). Mon-Khmer Studies 22. 163–230.
Mao, Zongwu. 1992. Hanyao Cidian [Chinese-Yao dictionary]. Chengdu: Sichuan Minzu Chubanshe.
Mao, Zongwu. 2004. Yaozu Mianyu Fangyan Yanjiu [A study of the dialects of Mienic]. Beijing: Minzu Chubanshe.
Mao, Zongwu, Zhaoji Meng & Zongze Zheng. 1982. Yaozu Yuyan Jianzhi [A sketch of the languages of the Yao people]. Beijing: Minzu Chubanshe.
Mao, Zongwu & Zhaoji Meng. 1986. Sheyu Jianzhi [A sketch of the She language]. Beijing: Minzu Chubanshe.
Mao, Zongwu & Yunbing Li. 1997. Bahengyu [The Pa-Hng language]. Shanghai: Shanghai Yuandong Chubanshe.
Mao, Zongwu & Yunbing Li. 2002. Jiongnaiyu Yanjiu [A study of the Kiong-Nai language]. Beijing: Zhongyang Minzu Daxue Chubanshe.


Mao, Zongwu & Yunbing Li. 2007. Younuoyu Yanjiu [A study of the You-Nuo language]. Beijing: Minzu Chubanshe.
Meng, Zhaoji. 1996. Hanyao Cidian (Bunuyu) [Chinese-Yao dictionary: Pu-Nu language]. Chengdu: Sichuan Minzu Chubanshe.
Meng, Zhaoji. 2001. Yaozu Bunuyu Fangyan Yanjiu [A study of the Bunu language]. Beijing: Minzu Chubanshe.
Meng, Zhaoji & Fengjiao Meng. 2008. Yaohan Cidian (Bunuyu) [Yao-Chinese dictionary: Pu-Nu language]. Beijing: Minzu Chubanshe.
Mottin, Jean. 1978. Eléments de grammaire hmong blanc. Bangkok: Don Bosco Press.
Nakanishi, Hiroki. 2003. A She vocabulary: Haifeng dialect. Kyoto: Institute for Research in Humanities, Kyoto University.
Nakanishi, Hiroki. 2007. Xiandai Sheyu Biyin Yunwei de Laili [Origin of nasal codas in the modern She language]. Minzu Yuwen 4. 10–20.
Niederer, Barbara. 1998. Les langues Hmong-Mjen (Miáo-Yáo): Phonologie historique. Munich & Newcastle: Lincom Europa.
Niederer, Barbara. 2004. Pa-Hng and the classification of the Hmong-Mien languages. In Nicholas Tapp, Jean Michaud, Christian Culas & Gary Y. Lee (eds.), Hmong/Miao in Asia, 129–146. Chiang Mai: Silkworm Books.
Purnell, Herbert C. 1970. Toward a reconstruction of Proto-Miao-Yao. Ithaca: Cornell University PhD dissertation.
Purnell, Herbert C. (ed.) 2012. An Iu-Mienh-English dictionary: With cultural notes. With the assistance of Gueix-Fongc Zanh, V. Ann Burgess, Greg Aumann, Chiang Mai. San Francisco: Silkworm Books.
Ratliff, Martha. 1992. Meaningful tone: A study of tonal morphology in compounds, form classes and expressive phrases in White Hmong. DeKalb, IL: Northern Illinois University Center for Southeast Asian Studies.
Ratliff, Martha. 1998. Ho Ne (She) is Hmongic: One final argument. Linguistics of the Tibeto-Burman Area 21(2). 97–109.
Ratliff, Martha. 2002. Timing tonogenesis: Evidence from borrowing. In Proceedings of the Twenty-Eighth Annual Meeting of the Berkeley Linguistics Society: Special Session on Tibeto-Burman and Southeast Asian Linguistics, 29–41. DOI: https://doi.org/10.3765/bls.v28i2.1043.
Ratliff, Martha. 2006. Prefix variation and reconstruction. In Thomas D. Cravens (ed.), Variation and reconstruction, 165–178. Amsterdam & Philadelphia: John Benjamins.
Ratliff, Martha. 2010. Hmong-Mien language history. Canberra: Pacific Linguistics, The Australian National University.
Savina, F. M. Dictionnaire français-mán, précédé d’une note sur les Mán Kim-đi-mun et leur langue. Bulletin de l’École française d’Extrême-Orient 26. 11–255.
Second research team of Minority Language Survey of Chinese Academy of Sciences. 1957. Miaoyu fangyan de huafen he chuangli Miaowen de wenti [On the classification of the Miao dialects and the establishment of the Miao orthography]. In Guizhou Oversight Committee of Minority Languages (ed.), Miaozu Yuyan Wenzi Wenti Kexue Taolunhui Huikan [Proceedings of the symposium on the problems of the Miao language], 11–74. Guiyang.
Second research team of Minority Language Survey of Chinese Academy of Sciences. 1962. A general sketch of the Miao language. Zhongguo Yuwen 1. 28–37.
Shafer, Robert. 1964. Miao-Yao. Monumenta Serica: Journal of Oriental Studies of the Catholic University of Peking 23. 398–411.
Shintani, Tadahiko & Zhao Yang. 1990. Hainandao Menyu [The Mun language of Hainan Island: Its classified lexicon]. Tokyo: ILCAA.




Smalley, William A. 1976. The problems of consonants and tones: Hmong (Meo, Miao). In William A. Smalley (ed.), Phonemes and orthography: Language planning in ten minority languages of Thailand, 85–123. Canberra: Pacific Linguistics.
Smalley, William A., Chia Koua Vang & Gnia Yee Yang. 1990. Mother of writing: The origin and development of a Hmong messianic script. Chicago: University of Chicago Press.
Solnit 1996. Some evidence from Biao Min on the initials of Proto-Mienic and Proto-Hmong-Mien (Miao-Yao). Linguistics of the Tibeto-Burman Area 19(1). 1–18.
Sposato, Adam. 2012. Relative clauses in Xong (Miao-Yao). Journal of the Southeast Asian Linguistics Society (JSEALS) 5. 49–66.
Sposato, Adam. 2014. Word order in Miao-Yao (Hmong-Mien). Linguistic Typology 18(1). 83–140.
Sposato, Adam. 2015. A grammar of Xong. Buffalo: The University at Buffalo, State University of New York PhD dissertation.
Strecker, David. 1987a. The Hmong-Mien languages. Linguistics of the Tibeto-Burman Area 10(2). 1–11.
Strecker, David. 1987b. Some comments on Benedict’s “Miao-Yao enigma: The Na-e language”. Linguistics of the Tibeto-Burman Area 10(2). 22–42.
Strecker, David. 1987c. Some comments on Benedict’s “Miao-Yao enigma”: Addendum. Linguistics of the Tibeto-Burman Area 10(2). 43–53.
Taguchi, Yoshihisa. 2008. Luobohe Miaoyu Cihuiji [A vocabulary of Luobohe Miao]. Tokyo: ILCAA.
Wang, Chunde. 1986. Miaoyu Yufa (Qiandong Fangyan) [A grammar of Miao: Qiandong dialect]. Beijing: Guangming Ribao Chubanshe.
Wang, Chunde. 1992. Hanmiao Cidian (Qiandong Fangyan) [Chinese-Miao dictionary: Qiandong dialect]. Guiyang: Guizhou Minzu Chubanshe.
Wang, Fushi. 1979. Miaoyu fangyan sheng yun mu bijiao [The comparison of the initials and finals of the Miao dialects]. Unpublished manuscript. Beijing.
Wang, Fushi. 1983. Miaoyu fangyan huafen wenti [On the dialect classification of the Miao language]. Minzu Yuwen 5. 1–22.
Wang, Fushi. 1985. Miaoyu Jianzhi [A sketch of the Miao language]. Beijing: Minzu.
Wang, Fushi. 1986. Miaoyaoyu de xishu wenti chutan [A preliminary investigation of the genetic affiliation of the Miao-Yao languages]. Minzu Yuwen 1. 1–18.
Wang, Fushi. 1994. Miaoyu Guyin Gouni [Reconstruction of the sound system of Proto-Miao]. Tokyo: ILCAA.
Wang, Fushi & Zongwu Mao. 1995. Miaoyaoyu Guyin Gouni [Reconstruction of the sound system of Proto-Miao-Yao]. Beijing: Zhongguo Shehui Kexue Chubanshe.
Wang, Weiyang. 2005. Miaoyu Lilun Jichu Diandongbei Fangyan [Theoretical foundation of the Miao language: Diandongbei dialect]. Kunming: Yunnan Minzu Chubanshe.
White, Nathan M. 2019. Classifiers in Hmong. In Alexandra Y. Aikhenvald & Elena I. Mihas (eds.), Genders and classifiers, 222–248. Oxford: Oxford University Press.
Wong, Sikling. 1939. Phonetics and phonology of the Yao language: Description of the Yau-Ling dialect. Lingnan Science Journal 18(4). 425–455.
Xian, Songkui. 2000. Xin Miaohan Cidian (Xibu Fangyan) [New Miao-Chinese dictionary: Western dialect]. Chengdu: Sichuan Minzu Chubanshe.
Xiang, Rizheng. 1992. Hanmiao Cidian (Xiangxi Fangyan) [Chinese-Miao dictionary: Xiangxi dialect]. Chengdu: Sichuan Minzu Chubanshe.
Yu, Jinzhi. 2011. Xiangxi Aizhai Miaoyu Cankao Yufa [A Reference Grammar of the Aizhai Miao]. Beijing: Zhongguo Shehui Kexue Chubanshe.
Zhang, Yongxiang & Xu Shiren. 1990. Miaohan Cidian (Qiandong Fangyan) [Miao-Chinese dictionary: Qiandong dialect]. Guiyang: Guizhou Minzu.

Jean Pacquement, Paul Sidwell and Mathias Jenny

9 French contributions to the study of Mainland Southeast Asian languages and linguistics

9.1 Introduction

In this chapter we examine the advent and development of French linguistics from the early colonial era into the decades that followed WWII. References to, and discussion of, work by French scholars in more recent decades are subsumed within the other chapters in this volume that deal with the history of linguistics in MSEA, reflecting the greater internationalization of linguistics in the post-colonial era.

Among the colonial powers of the 19th century, France colonized Indochina, which corresponds to present-day Vietnam, Cambodia and Laos. As a result, many French individuals could and did go to live in Indochina, and, for a few of them, whether explorers, soldiers, administrators, missionary priests, or scholars, the colonial experience turned out to be an opportunity to observe the ethnic and linguistic diversity of MSEA directly.

France has had a long history of interest in language, with two main traditions and intellectual stances when it comes to oriental languages and their place in oriental studies. In the first, the focus is on teaching languages deemed useful for diplomacy and trade. This tradition is mostly represented by the École des langues orientales, established in 1795, which was renamed the Institut national des langues et civilisations orientales in 1971.1 In the second, the main goal is to study and understand civilizations: language is thus important, because it gives both unique access to primary sources and a window into the epistemology and values of any society. The second tradition is represented by orientalist scholars belonging to the Académie des Inscriptions et Belles-Lettres and their contributions in the Journal Asiatique, which was launched in 1822.2

Modern linguistics in Europe begins with the development of comparative and historical linguistics in the field of Indo-European studies during the 19th century, an approach notably cultivated within France in the early 20th century by Antoine Meillet and the generation of Indo-Europeanists he trained. That period also saw the rise of structural

1 When the École des langues orientales was reorganized in 1869, a tenured position for Annamite was created in order to facilitate the recruitment of administrative officers to be posted in the new French colony of Cochinchina. Annamite was thus the first MSEAn language taught at the school, with Abel Des Michels (1833–1910) as its instructor.
2 Among them could be found, during the 19th century, Egyptologists, Assyriologists, specialists of Arabic and, more broadly, of Semitic languages, Indologists (called “Indianistes” in French), and Sinologists.
https://doi.org/10.1515/9783110558142-009


linguistics generally, following the posthumous publication in 1916 of Ferdinand de Saussure’s Cours de Linguistique Générale, which laid the foundation for broader developments in synchronic linguistics and linguistic theory throughout the 20th century. Both tendencies were pursued and developed in France and elsewhere, and together they underlie modern linguistics. While many French scholars had studied MSEA languages from the later 19th century, much of their work was rather impressionistic and hardly compatible with modern linguistics.3 The situation changed profoundly in the early 20th century: the study of languages in Indo-China was put on a strong institutional foundation with the creation of the École française d’Extrême-Orient (EFEO), and individual scholars, particularly Henri Maspero (1883–1945) and André-Georges Haudricourt (1911–1996), now conducted their work on the basis of firm philological and structuralist principles, driving real progress and new insights into the languages of MSEA.

9.2 Pre-1900s: missionary and colonial activities

We begin by considering the French missionary priests, who belonged to the Missions étrangères de Paris (MEP hereafter), a Roman Catholic missionary organization established in Paris in 1663.4 The foundation of the MEP is actually linked to Alexandre de Rhodes (1591–1660) himself, who, after returning from Vietnam in 1649, pleaded for more Catholic missions to that country. By the end of the 19th century, French missionary priests had already been present for decades in various parts of MSEA such as Siam, Cambodia, Cochinchina, Tonkin, and Southeastern China. They were not linguists and were not expected to focus on the languages of the people they had come to evangelize. They nevertheless communicated with the people they lived among and knew local customs. As French missionary priests were expert Latinists, the way they learnt and understood indigenous languages was mostly through the framework of Latin and its grammar.5

3 “Fonctionnaires, officiers et missionnaires avaient besoin de connaître la langue du pays et en effet l’apprenaient pour la comprendre et la parler. Ceux d’entre eux, et ils ont été nombreux, qui avaient le goût d’écrire ont rédigé des manuels et des grammaires sans avoir reçu aucune formation linguistique.” (Martini 1960: 80)
4 Among the missionary priests who did not belong to the MEP, one may mention the Missionary Oblates of Mary Immaculate (OMI). The missionaries who documented Khmu in Laos in the 20th century, such as Jean Subra (1923–2000) and Henri Delcros (1925–1994), belonged to that congregation, which was founded in 1816.
5 Such an approach has been discussed for the first grammars of Siamese and Vietnamese by Babu (2007) and Pham (2018). Both of them refer to the concept of ‘grammatization’ coined by Auroux (1994).

Among missionary priests who made lasting contributions to MSEA language studies, Jean Louis Taberd (1794–1840) published his own Latin-Annamite dictionary (Taberd 1838a), as well as an Annamite-Latin dictionary (Taberd 1838b), which had been previously prepared by Pierre Pigneau de Behaine (1741–1799). The latter dictionary contains Taberd’s Annamite grammar composed in Latin,6 the Grammaticae compendium. Jean-Baptiste Pallegoix (1805–1862), apart from compiling one of the first dictionaries available for Thai (Pallegoix 1854), wrote a Thai grammar (Pallegoix 1850), which partly consists of a free translation into Latin of the Chindamani, a treatise dealing with orthography and poetry, composed in the 17th century by Phra Horathibodi. In other parts, the grammar is quite insightful: it contains examples of sociolects, lists latinized verbal and nominal paradigms, and illustrates tones with European musical notation.

Several other authors who were in French Indochina or in neighboring parts of MSEA at the end of the 19th century also compiled dictionaries and grammars of different local languages. In the same way that Taberd and Pallegoix had dealt with Annamite and Thai respectively – major languages of MSEA associated with historical kingdoms with long written traditions – Joseph Guesdon (1852–1939) dealt with Khmer (Guesdon 1930),7 Marie-Joseph Cuaz (1862–1950) with both Thai and Lao (Cuaz 1903, 1904a, 1906), and Théodore Guignard (1864–1930) with Lao (Guignard 1912).8

Missionary priests were also sent to areas where languages had no written form.9 Among them were the missionary priests of the Kontum and Brơlâm missions in Cochinchina, who were living among highlanders speaking Bahnaric languages (Austroasiatic family): the names of Pierre Dourisboure (1825–1890) and Henri Azémar (1834–1895) are associated with dictionaries of Bahnar (Dourisboure 1889) and Stieng (Azémar 1886). Dourisboure and Azémar can be compared to the missionary priests belonging to French Catholic Missions located on the Tonkin-Yunnan border, whom Michaud (2007) has described as “incidental ethnographers”.
Paul Vial (1855–1917) and Alfred Liétard (1872–1912) documented Loloish languages (Tibeto-Burman branch of the Sino-Tibetan family) of Yunnan: “Gni” (Vial 1909) for the first, and “A-hi” (Liétard 1909a) as well as other Lolo dialects (Liétard 1909b) for the latter. Vial also produced a grammar and a lexicon of

6 As for an Annamite grammar written in French, we have Gabriel Aubaret’s Grammaire annamite (Aubaret 1867), which is, according to Pham (2018: 8–9), a more or less accurate translation of Taberd’s grammar into French. Aubaret (1825–1894), a naval officer, also produced French-Annamite and Annamite-French lexicons (Aubaret 1861, 1867).
7 Guesdon had already started various lexicographical works related to Khmer in the 1880s, but the complete version of his comprehensive Cambodian-French dictionary was published much later.
8 As Guignard was based in a part of Nghệ An province (Vietnam) inhabited by Tai Muong speakers, his dictionary contains a few Tai Muong words (Personal communication, Michel Ferlus, July 2015).
9 Cuaz and Guignard themselves did not deal only with languages having a written form. In his Étude sur la langue laocienne (Cuaz 1904b), Cuaz gives vocabularies of various languages including Saek (Northern Tai) and So (a Katuic language belonging to the Austroasiatic family). As for Guignard, he mentions nomadic groups speaking Vietic languages in the area of Khamkeuth and Khammouane of Laos (Guignard 1911a) and in Quảng Bình province of Vietnam (Guignard 1911b).


“Miaotse” (Hmong-Mien family) (Vial 1908a, 1908b).10 We will finally mention Joseph Esquirol (1870–1934), who, together with Gust Williatte (Esquirol and Williatte 1908), “produced a Yay (Northern Tai) dictionary based on a Pu-Yi dialect in southwestern Guangxi” (Hudak 2008: 50).

However, the scholarship of missionary priests is not limited to dictionaries, grammars, and other language accounts. Such works coexisted with personal diaries, correspondence, and also various texts produced for annual reports or religious publications, often containing ethnographic and cultural accounts (Michaud 2007: 134). With respect to texts authored by the three missionary priests he studied, Paul Vial, Alfred Liétard, and François Marie Savina (1876–1941), Michaud describes those texts as “detached from the missionary society’s needs and expectations” and their “authors in the highland examining the ‘natives’ and writing in a studious style” which was “characterized by […] a scientific longing” (Michaud 2007: 215–216). It is no surprise, therefore, that some missionary priests, notably François Marie Savina and Léopold Cadière (1869–1955), had their writings published in scientific journals such as T’oung Pao, the Bulletin de l’École française d’Extrême-Orient, or Anthropos.

Other “first Western ethnographers” (Michaud 2007: 5) include explorers, administrators, and military men, who arrived at the time of colonial rule.
Among those who wrote about languages, we find Étienne Aymonier (1844–1929), an army officer who became a colonial administrator, writing on Khmer (Aymonier 1874, 1878) and Cham (Aymonier 1889; Aymonier and Cabaton 1906); Albert Morice (1848–1877), a doctor who had traveled to Indochina as an assistant medical officer of the navy, who compared Cham and Stieng with Khmer (Morice 1875); Jean Nicolas Arthur Chéon (1857–1928), a teacher who later became a colonial administrator, working on Chrau (South Bahnaric) (Chéon 1890; Chéon and Mougeot 1890), Annamite (Chéon 1905a),11 and other Vietic languages (Muong, Nguon and Sach) (Chéon 1905b, 1907); Édouard Diguet (1861–1921), an army officer, who studied Annamite, Tai Dam, and Tho (Diguet 1892, 1895, 1910); and P. Silve, also an army officer, investigating Tho (Silve 1906). All of them, with the possible exception of Chéon,12 would at one time or another be labeled as autodidacts. This was certainly true when they arrived in French Indochina, especially when compared with either missionary priests, who had the background of Latin grammar, or with the philologists of the EFEO, who already knew Sanskrit or Chinese. One could be considered an amateur in philology and language studies and at the same time be recognized, sooner or later, in another field. For example, according to Clémentin-Ojha and Manguin (2007), Aymonier is now the “pioneer of Indochinese epigraphy”, and “his formative role in research has today been rehabilitated” (Clémentin-Ojha and Manguin 2007: 126). Another name worth mentioning is Diguet: whereas his linguistic work on Tai Dam (Diguet 1895) was criticized in many accounts because of his “impressionistic method of transcribing Black Tai sounds in French spelling” (Gedney 1989: 417), the book he authored on the highlanders of Tonkin (Diguet 1908) is among “the most reliable sources available on the early ethnography of upland Tonkin” (Michaud 2013: 37).

10 According to Michaud, Vial appears to be “the Hmong script pioneer”: he uses “an alphabet based on French pronunciation” (Michaud 2020: 243) to write the “Miaotse” language.
11 In that article, Chéon deals with secret languages used by specific groups of people.
12 Chéon had an in-depth knowledge of Annamite, especially of the Chữ Nôm Sino-Vietnamese script (Goloubew 1928).

In addition to the efforts of amateurs and autodidacts, there were also officially commissioned exploratory expeditions into the interior of Indo-China, known as the Pavie Mission. While these were primarily concerned with efforts such as topographic mapping and establishing political relations and authority, there was also earnest collection of vocabulary lists as part of ethnographic documentation. Accounts of these were published between 1898 and 1911 (and a two-volume Atlas of the Pavie Mission was published in English translation, Pavie 1999), and include some of the first useful data on various languages of the highlanders of the Annamite Range, e.g.:

– Harmand (1878–1879): lexicon of four Koui (Kui) lects, plus an unpublished but widely circulated comparative lexicon of West Bahnaric languages.
– Lefèvre-Pontalis (1892, 1892–1896): lexicons for some 15 languages.
– Rivière (1902): lexicons of Hang-Tchek (Saek, Tai), Khas Xos (Sô/Bru, Katuic), Harème (Arem, Vietic).

One of the Pavie expeditioners, Paul Macey (1852–19??), became a colonial administrator and independently collected linguistic and ethnographic data, publishing wordlists for various Tai, Khmuic, and Katuic languages (Macey 1906, 1907a, 1907b). Macey’s wordlists are based upon a questionnaire of 384 lexical items, which follows the EFEO language questionnaire of 1900 (see below).13

13 Macey’s death date is not known. His notes, including wordlists, are much more extensive than the fraction that was actually published.

9.3 The EFEO era: first half of the 20th century

French research activities were put on a firm institutional basis with the founding of the Mission archéologique d’Indo-Chine in 1898 in Saigon, renamed the École française d’Extrême-Orient (EFEO) in 1900. In 1902 the headquarters were moved to Hanoi, where a library and museum were established. The remit of the EFEO was broad: initially commissioned to inventory and preserve the cultural heritage of Indochina through activities such as archaeological exploration, collection of manuscripts, preservation of monuments, study of ethnic groups and linguistic studies, its scope was further expanded to encompass the history of all Asian civilizations from India to Japan. Subsequently museums and other facilities were established at locations including Da Nang, Saigon, Hue, Phnom Penh, Battambang, and Angkor, and later in other locations and countries. In relation to linguistic studies, the early decades of the EFEO are particularly notable for extensive language surveys and for the work in Indochinese epigraphy by Louis Finot and George Cœdès, which laid the basis for the epigraphic work carried on today by EFEO scholars (see Sidwell and Jenny, this volume). French orientalists had been aware of the Englishman George Abraham Grierson’s project of a linguistic survey for India. Grierson had proposed such a survey during the Seventh International Oriental Congress held at Vienna in September 1886, and it was formally launched in 1894. The EFEO conceived a similar undertaking for French Indochina, and prepared a language questionnaire which included French, Vietnamese, Khmer, and Lao, printed in 1900 (Haudricourt 1958: 213). In the same way that Grierson had Government officers collect data from all over British India, that questionnaire was distributed to the EFEO’s collaborators and correspondents. Subsequently wordlists were collected from the North of Indo-China by Lunet de la Jonquière (1906) and Bonifacy (1904, 1905, 1907, 1908), from the Center by Cadière (1911, 1940), Guignard (1907), and Macey (1906, 1907a, 1907b), and from the South by Lavallée (n.d.) and Odend’hal (Cabaton 1905). A second EFEO survey was initiated in 1937. The linguistic questionnaire was expanded to 462 lexical items and 29 phrases and sentences, and informants were expected to give a version of the parable of the prodigal son in their dialects.14 On the basis of work done up to 1938, the EFEO drew up a first ethnolinguistic map of Indochina, which was exhibited at the Hanoi exhibition fair in 1941.
Apart from the language mapping, not much was done with the questionnaire data at that time, although later Haudricourt would consult the EFEO language surveys when he was based at the EFEO’s library (1948–1949) and use their data extensively. Those language questionnaires were moved to Paris and can be consulted at the library of the EFEO there, in the collection “manuscrits européens”. Most language studies at the EFEO when it was focused on MSEA looked at languages one by one. In the early 20th century the linguistic families were not yet identified, although an early understanding of how the languages sub-group did begin to emerge from the survey data being compiled. Haudricourt (1958), discussing the history of the EFEO’s work in Indo-China, identifies four periods for 1898–1945:

14 Another linguistic questionnaire, which had been prepared by the Institut d'Ethnologie de Paris with 391 lexical items and 35 phrases and sentences, was used by R. Robert (1941) in a monograph about the Tai Daeng.


– In the first period (1898–1908) the members of EFEO were more philologists than linguists; their main occupation was the inventory and editing of Sanskrit, Khmer and Cham inscriptions; language surveys based on the first questionnaire were conducted.
– The second period (1908–1920) was the time Henri Maspero was an active member.
– The third period (1920–1936) is regarded as the least fruitful for language studies in Haudricourt’s assessment. Nonetheless, he pays tribute to the work of François Marie Savina, who produced various influential works of lexicography including dictionaries and lexicons of Miao (Mien) (1916, 1920), Ɖày (Tai) (1931), Bê (Be-Tai) (1965) and comparative vocabularies (1939).
– The fourth period (1937–1945) started with the second language survey.

François Martini (1895–1965) also discussed the history of language studies in MSEA, offering a different perspective. He focuses on the context in which they were conducted and their methods. His presentation (Martini 1960: 80–82) characterized the view of Indo-Chinese languages as a kind of language ecology with three components:

– Languages of the countries of French Indochina: Vietnamese, Khmer, Lao,
– Important regional dialects such as Tai Don and Tai Dam,
– Languages spoken by minorities in a “mosaic”.

He also distinguished the language studies and surveys according to the transcription methods (Martini 1960: 84):

– Transcriptions for French people; words represented/reproduced as they were heard without a phonemic interpretation, although the data could be difficult to use,
– Transcriptions using the Quốc ngữ alphabet; these were adapted to Tai dialects, for example, but they are not adequate for languages with phonological systems very different from Vietnamese.
The language surveys of the first half of the 20th century provided systematic comparable data from across the region, inventorying languages according to locations and demographics and providing a valuable resource, yet progress in terms of analytical understanding was only modest. Progress in understanding how the languages form families, come to have their distinct typological features, and so forth would fall to particular outstanding scholars who made good use of that data. In particular:

– Maspero deserves credit for being the first to apply the methods of comparative linguistics to Tai languages and Vietnamese (Maspero 1911, 1912). However, he later put linguistics aside to focus on the history and the religion of ancient China.
– Haudricourt coherently studied and described various language families of MSEA, including the Tai, Karen, Austroasiatic, and Hmong-Mien languages. His work can be regarded as foundational to our modern understanding of MSEA linguistics, including the origins of tones in Vietnamese and other languages of the region, language classification, and other historical insights. His work and leadership role are still held in high esteem within France, leaving a lasting influence.

In an autobiographical account (Haudricourt and Dibie 1987), Haudricourt recalls how he became familiar with linguistics during his formative years in Paris. His interest in linguistics at that time appears to have been a part of a broader interest in human sciences in general. After a first linguistic and cultural shock in Albania in 1932, he registered for phonetic classes at the Sorbonne during the 1932–1933 academic term and attended Marcel Mauss’ lectures at the Collège de France. Marcel Mauss (1872–1950) was an ethnologist, whose participation in the organization of ethnographic research in French Indochina is mentioned by Michaud (2007): “In the year 1900, aged 28, he had been consulted on how to organize field research in the highlands and contributed to preparing ethnographic enquiry tools” (Michaud 2007: 226). Haudricourt recalls Mauss as a kind of structuralist and as a good philologist studying each word in its context (Haudricourt and Dibie 1987: 33). In the mid-1930s, Haudricourt became acquainted with Marcel Cohen, who was then teaching Ethiopian at the École des langues orientales as well as linguistics at the Institut d’Ethnologie, and who later entrusted him with his personal collection of books during the German occupation. After the liberation of Paris, Haudricourt became a student of André Martinet at the École pratique des hautes études. Martinet had just begun teaching phonology, a new field of linguistics established as a field separate from phonetics by Nikolai Trubetzkoy, a member of the Prague School of structural linguistics. Haudricourt’s account gives us an idea of how the important branches of linguistics were already represented in French universities and other higher education institutions in the 1930s and 1940s.
Encounters with linguists during Haudricourt’s formative years certainly prepared his mind for future achievements which made him a leading figure in MSEA linguistics. They might also explain why Haudricourt, who had studied agronomy at the Institut national agronomique and was first appointed as a researcher of the Centre national de la recherche scientifique (CNRS) in the section of botany in 1940, chose to move to the section of linguistics in 1945. Haudricourt’s interest in MSEA can tentatively be traced back to 1934. While in the USSR he was asked by the agronomist Nikolai Vavilov to make a report on Indochina’s cultivated plants. He later studied Siamese at the École des langues orientales during the German occupation, where he met Martini who was then teaching Khmer. Haudricourt was compiling a comparative dictionary of Tai languages when Paris was bombed before its liberation. An overview of French scholars’ contribution to MSEA linguistics, which would focus on the development of structural linguistics and phonology, while following Haudricourt’s intellectual itinerary, would most probably start in the middle of the 20th century, when Haudricourt’s first article dealing with Tai languages was published (Haudricourt 1948). However, if there is a French perspective on language studies in MSEA, it might have to be found before that date. Although Haudricourt’s article about Tai languages proposes a new analysis, it needed to be supported by language data, and Haudricourt, who had not yet gone to MSEA, obtained his data from published dictionaries and accounts. The authors of those sources were guided by principles of traditional grammar or philology, rather than linguistic description as we understand it today. But more importantly, those authors had themselves lived in MSEA and had thus had access to speakers. Both those notions of interest in language and access to speakers might be considered to be trivial in linguistics nowadays, yet from a historical perspective they are of particular significance in the case of French studies related to MSEA languages.

While the EFEO’s work was interrupted by the war, activities were revitalized from 1945 onward. Nonetheless, the school was required to abandon Hanoi in 1957 and Phnom Penh in 1975, and subsequently built a vital distributed network of institutions and scholars across many countries. From 1990 onward there was a return to Indochina, and these days EFEO activities are well established in Vietnam, Cambodia, and Laos, as well as in Thailand and Myanmar. Other important French institutions contributing to research on MSEAn languages and linguistics are the Institut national des langues et cultures orientales (INALCO) and the Langues et civilisations à tradition orale (LACITO) lab of the CNRS. The former was established in 1795, initially teaching languages of East and Southeast Asia. At present, the INALCO offers courses in over 100 languages and civilisations from around the globe up to the level of PhD. Among the languages taught at the INALCO are several SEAn languages not much taught elsewhere: Lao has a full-fledged curriculum, while Tai Lü and Mon have been taught at times.
Research at the INALCO is closely linked to groups at the CNRS, among which the LACITO, established in 1976, is of special importance to MSEA, as it manages the Collection Pangloss, an online archive with recordings and texts in more than 150 languages, including many from the MSEA area.

9.4 Conclusion

The French contribution to the study of MSEAn languages, both individual idioms and language families, started early and left an ongoing impact in the field, most prominently in Indochina, but also in adjacent areas, including Thailand, Myanmar, and southern China. In many cases, French researchers and laypeople were the first to describe local languages of the area, and some succeeded in greatly adding to the general understanding of the relations and development of languages and language families. French work on MSEAn languages and linguistic research continues to be a strong factor contributing to the field, even though most publications by French scholars and researchers are traditionally written in French, which to some extent restricts their impact outside of the French speaking world.


References Aubaret, Gabriel. 1861. Vocabulaire français-annamite et annamite-français, précédé d’un traité des particules annamites. Bangkok: Imprimerie de la Mission Catholique. Aubaret, Gabriel. 1867. Grammaire annamite suivie d’un vocabulaire français-annamite et annamitefrançais. Paris: Imprimerie Impériale. Auroux, Sylvain. 1994. La révolution technologique de la grammatisation. Liège: Pierre Mardaga. Aymonier, E. 1874. Vocabulaire cambodgien–français. Saigon: Collège des stagiaires. Aymonier, E. 1878. Dictionnaire khmêr-français. Saigon: autographié par So’n Diép. Aymonier, Etienne & Antoine Cabaton. 1906. Dictionnaire čam-français (Publications de l’École française d’Extrême-Orient 7). Paris: Imprimerie Nationale, Ernest Leroux, publisher. Aymonier, Étienne. 1889. Grammaire de la langue chame. Saigon: Imprimerie coloniale. Azémar, H. 1886. Dictionnaire Stieng: recueil de 2,500 mots, fait à Brơlâm en 1865. Excursions et Reconnaissances 12 (27/28). 93–146, 251–344. Babu, Jean-Philippe. 2007. L’influence de la tradition grammaticale gréco-latine sur la grammaire du thaï. Journal of Humanities Naresuan University 4(3). 21–41. Bonifacy, Auguste Louis. 1904. Les groups ethniques de la Rivière Claire. Revue indochinoise 2. 1–16. Bonifacy, Auguste Louis. 1905. Etude sur les langues parlées par les populations de la Haute Rivière Claire. Bulletin de l’Ecole française d’Extrême-Orient 5. 306–327. Bonifacy, Auguste Louis. 1907. Etude sur les Cao-Lan. T’oung Pao, Archives concernant l’histoire, les langues, la géographie et les arts de l’Asia Orientale 2(8). 429–438. Bonifacy, Auguste Louis. 1908. Etude sur les coutumes et la langue des Lolo et des La-qua du Haut Tonkin. Bulletin de l’Ecole française d’Extrême-Orient 8. 531–558. Cabaton, Antoine. 1905. Dix dialectes indochinois recueillis par Prosper Odend’hal. Etude linguistique par Antoine Cabaton. Journal Asiatique, Dixième série, tome V. 265–344. Cadière, Léopold. 1911. Le dialecte du Bas-Annam. 
Bulletin de l’Ecole française d’Extrême-Orient 11. 67–110. Cadière, Léopold. 1940. Note sur les Moï du Quâng-tri. Institute Indochinois pour L’Etude de L’Homme (Bulletins et Travaux) 3(1). 101–107. Chéon, Jean Nicolas. 1890. Notice sur la langue des Chraus ou Moïs de l’arrondissement de Biên-hòa. Bulletin de la Société des Études Indochinoises de Saigon 20. i–xiii. Chéon, A. 1905a. L’argot annamite. Bulletin de l’Ecole française d’Extrême-Orient 5. 47–75. Chéon, A. 1905b. Notes sur les Muong de la province de Son-tay. Bulletin de l’Ecole française d’Extrême-Orient 5. 328–348. Chéon, A. 1907. Note sur les dialectes nguon, sac et muong. Bulletin de l’Ecole française d’Extrême-Orient 7. 87–99. Chéon, Jean Nicolas & A. Mougeot. 1890. Essai de dictionnaire de la langue chrău (dialecte moï), par MM. Chéon et Mougeot, comprenant 1,400 mots et un grand nombre d’expressions et d’idiotismes, recueillis par M. Chéon à Bŭt Doc (arrondissement de Biên-Hoà), Cochinchine française. Bulletin de la Société des Études Indochinoises de Saigon 20. 1–106. Clémentin-Ojha, Catherine & Pierre-Yves Manguin. 2007. A century in Asia: The history of the École Française d’Extrême-Orient, 1898–2006. Singapore: Editions Didier Millet. Cuaz, M. J. 1903. Essai de dictionnaire français-siamois. Bangkok: Imprimerie de la Mission Catholique. Cuaz, M. J. 1904a. Lexique français-laocien. Hongkong: Imprimerie de la Société des Missions Étrangères. Cuaz, M. J. 1904b. Étude sur la langue laocienne. Hongkong: Imprimerie de la Société des Missions Étrangères.


Cuaz, M. J. 1906. Manuel de conversation franco-laocienne. Hongkong: Imprimerie de Nazareth. Diguet, Édouard. 1892. Éléments de grammaire annamite. Paris: Imprimerie Nationale. Diguet, Edouard. 1895. Étude de la Langue Taï, précédée d’une notice sur les races des Hautes Régions du Tonkin, comprenant Grammaire, Méthode d’écriture Taï et Vocabulaires. Hanoi: F.-H. Schneider, Imprimeur-Éditeur. Diguet, Édouard. 1908. Les montagnards du Tonkin. Paris: Augustine Challamel. Diguet, Édouard. 1910. Étude de la langue Thô. Paris: Augustine Challamel. Dourisboure, Le Père P. X. 1889. Dictionnaire bahnar-français. Hongkong: Imprimerie de la Société des Missions Étrangères. Esquirol, Jos. & Gust Williatte. 1908. Essai de dictionnaire dioi₃-français, reproduisant la langue parlée par les tribus Thai de la haute Rivière de l’Ouest (西 汀), suivi d’un vocabulaire français-dioi₃. Hongkong: Imprimerie de la Societé des Missions-Etrangères. Gedney, William J. 1989. A comparative sketch of White, Black, and Red Tai. In R.J. Bickner, J. Hartmann, T.J. Hudak & P. Peyasantiwong (eds.), Selected Papers on Comparative Tai Studies, 415–462. Ann Arbor: Center for South and Southeast Asian Studies, the University of Michigan. Goloubew Victor. 1928. J.-N.-A. Chéon (1856–1928). Bulletin de l’Ecole française d’Extrême-Orient 28(3). 665. Guesdon, Joseph. 1930. Dictionnaire cambodgien-français, 2 vols. Paris: Librairie Plon. Guignard Théodore. 1911b. Note sur une peuplade des montagnes du Quáng-Bīnh: les Tầc-cúi. Bulletin de l’Ecole française d’Extrême-Orient 11. 201–205. Guignard, Théodore. 1911a. Note historique et ethnographique sur le Laos et les Thai. Revue indochinoise 3. 233–244. Guignard, Théodore. 1912. Dictionnaire laotien-français. Hongkong: Imprimerie de Nazareth. Harmand, Jules. 1878–1879. Notes de voyage en Indo-Chine. Chapter II: les kouys. Annales de l’Exrême-Orient 1. 332–337. Haudricourt, André-Georges & Pascal Dibie. 1987. Les pieds sur terre. Paris: Métailié. 
Haudricourt, André-Georges. 1948. Les Phonèmes et le vocabulaire du thai commun. Journal Asiatique 236. 197–238 Haudricourt, André-Georges. 1958. L’œuvre linguistique de l’École française d’Extrême-Orient en Indochine. Orbis 7(1). 212–219. Hudak, Thomas John. 2008. William J. Gedney’s comparative Tai source book (Oceanic Linguistics Special Publication 34). Honolulu: University of Hawai’i Press. Lavallée, Alfred. n.d. Vocabulaire comparé des dialectes sauvages du Bas-Laos: boloven, niaheun, alak, lăvé, kaseng, halang (ou selang), bahnar, sedang, djarai, Ecole française d’Extrême-Orient, ms. Lefèvre-Pontalis, Pierre. 1892. Étude sur quelque alphabets et vocabulaires thaïs. T’oung Pao 3(1). 39–64. Lefèvre-Pontalis, Pierre. 1892–1896. Notes sur quelques populations du nord de l’Indo-Chine. Journal Asiatique, 8e sér., 19(1892). 237–269; 9e sér., 8(1896). 129–154, 291–303. Liétard, Alfred. 1909a. Notions de grammaire lo-lo (dialecte a-hi). Bulletin de l’Ecole française d’Extrême-Orient 9. 285–314. Liétard, Alfred. 1909b. Notes sur les dialectes lo-lo. Bulletin de l’Ecole française d’Extrême-Orient 9. 549–572. Lunet de la Jonquière, Etienne Edmond. 1906. Ethnographie du Tonkin septentrional. Paris: Ernest Leroux. Macey, Paul. 1906. Étude ethnographique sur diverses tribus, aborigènes ou autochtones, habitant les provinces de Hua-phans, Ha-tang-hoc et du Cammon, au Laos. In Actes du XIVe Congres International des Orientalistes, 1e Tome, 5e Section, 3–63. Paris: Ernest Leroux.


Macey, Paul. 1907a. Études ethnographiques sur les khas. Revue Indochinoise 5. 869–874. Macey, Paul. 1907b. Étude ethnographique et linguistique sur les K’Katiam-Pong-Houk, dits: Thai Pong (Province du Cammon-Laos). Revue Indochinoise 5. 1411–1424. Martini, François. 1960. Conditions et méthode de la recherche linguistique dans le Sud-Est asiatique. Colloque sur les recherches des instituts français de sciences humaines en Asie organisé par la Fondation Singer-Polignac, en son hôtel, 43 avenue George-Mandel Paris, 23 au 31 octobre 1959, 79–87. Paris: Éditions de la Fondation Singer-Polignac. Maspero, Henri. 1911. Contribution à l’étude du système phonétique des langues thaï. Bulletin de l’Ecole Française d’Extrême Orient 11. 153–169. Maspero, Henri. 1912. Etude sur la phonétique historique de la langue annamite. Les initiales. Bulletin de l’Ecole Française d’Extrême Orient 12. 1–124. Michaud, Jean. 2007. “Incidental” ethnographers: French Catholic missions on the Tonkin-Yunnan frontier, 1880–1930. Leiden & Boston: Brill. Michaud, Jean. 2013. French military ethnography in Colonial Upper Tonkin (Northern Vietnam), 1897–1904. Journal of Vietnamese Studies 8(4). 1–46. Michaud, Jean. 2020. The art of not being scripted so much: The politics of writing Hmong language(s). Current Anthropology 61(2). 240–263. Morice, Albert. 1875. Étude sur deux dialectes de l’Indo-Chine: les Tiams et les Stiengs (Cochinchine et Cambodge). Paris: Maisonneuve et Cie, libraires-éditeurs. Pallegoix, Jean Baptiste. 1850. Grammatica Linguae Thai. Bangkok: Assumption College. Pallegoix, Jean Baptiste. 1854. Dictionarium linguae thai sive Siamensis interpretatione Latina, Gallica et Anglica Illustratum. Parisiis: Jussu Imperatoris Impressum in Typographeo Imperatorio. Pavie, Auguste. 1999. Atlas of the Pavie Mission: Laos, Cambodia, Siam, Yunnan and Vietnam, 2 vols. Translated by Walter E. J. Tips. Bangkok: White Lotus Press. Pham, Thi Kieu Ly. 2018. 
REI (21) – Les premières grammaires du vietnamien. CTLF – Articles. Colloque ‘Refonte et extension internationale du CTLF: Corpus de textes linguistiques fondamentaux’. Université Paris Diderot (UMR 7597 « Histoire des théories linguistiques »), 31st May–1st June 2018. 1–10. Rivière, Capitaine. 1902. Vocabulaire Hang-Tchek, khas Xos, Harème, recueilli par M. Rivière dans son voyage de Lakhône à Vinh. Mission Pavie: Géographie et voyages IV, 285–290. Paris: E. Leroux. Robert, Romain. 1941. Notes sur les Tay Dèng de Lang Chánh (Thanh-hoá – Annam). Institut Indochinois pour l’Etude de L’Homme (Mémoire N°1). Hanoi: Imprimerie d’Extrême-Orient. Saussure, Ferdinand de. 1978. Cours de linguistique générale. Publié par Charles Bally et Albert Sechehaye avec la collaboration de Albert Riedlinger. Édition critique préparée par Tullio de Mauro. Paris: Payot. Savina, François. 1916. Dictionnaire Miao-tseu-Français, précédé d’un précis de grammaire miao-tseu et suivi d’un vocabulaire français-miao-tseu. Bulletin de l’Ecole Française d’Extrême Orient 16(2). 1–246. Savina, François Marie. 1920. Lexique Français-méo. Hanoi: Imprimerie d’Extreme Orient. Savina, François Marie. 1931. Lexique đày-français accompagné d’un petit lexique français-đày et d’un tableau des différences dialectales. Bulletin de l’Ecole Française d’Extrême Orient 31. 103–199. Savina, François Marie. 1939. Guide linguistique de l’Indochine française, 2 vols. Hongkong: Imprimerie de la Société des Missions Etrangères. Savina, François. 1965. Le vocabulaire Bê (Publications Ecole Française d’Extrême Orient 7). Presented by André-Georges Haudricourt. Paris: Ecole Française d’Extrême Orient. Silve, P. 1906. Etude de la langue tai: grammaire thô. Hanoi: F.-H. Schneider.


Taberd, Jean Louis. 1838a. Dictionarium Latino-Anamiticum. Fredericnagori vulgo Serampore: Ex typis J. C. Marshman. Taberd, Jean Louis. 1838b. Dictionarium Anamitico-Latinum, primitus inceptum ab Illustrissimo et Reverendissimo P. J. Pigneaux, Episcopo Adranensi, Vicario apostolico Cocincinae, &c. Dein absolutum et editum a J. L. Taberd, Episcopo Isauropolitano, Vicario apostolico Cocincinae, Cambodiae et Ciampae, Asiaticae Societatis Parisiensis, nec non Bengalensis Socio honorario. Fredericnagori vulgo Serampore: Ex typis J. C. Marshman. Vial, Paul. 1908a. Petite grammaire miaotse. Annales de la Société des Missions-Étrangères et de l’Œuvre des Partants 11. 154–157. Vial, Paul. 1908b. Petit lexique français miaotse. Annales de la Société des Missions-Étrangères et de l’Œuvre des Partants 11. 158–169. Vial, Paul. 1909. Dictionnaire Français-Lolo, Dialecte Gni tribu située dans les sous-préfectures de Loú nân tcheōu 路南州 Lŏu leâng tcheōu 陸涼州 Koùang-si tcheōu 廣西州 Province du Yunnan. Hongkong: Imprimerie de la Société des Missions Étrangères.

Carolyn P. Miller and Kirk R. Person

10 The SIL contribution to language and linguistics in Mainland Southeast Asia

10.1 Introduction

Founded in 1934, SIL International has grown from a summer linguistics training program with two students to an international, faith-based, nonprofit organization with over 5,000 staff members from 89 countries of origin.1 SIL has trained over 20,000 students in various aspects of linguistics, literacy, and other cross-cultural work through a network of training programs that now involves 26 institutions in 18 countries.2 Today SIL works alongside speakers of more than 1,660 languages, representing 1.07 billion people in 162 countries. More than two million people have learned to read and write as a direct result of SIL’s literacy efforts. SIL’s Language and Culture Archives houses over 60,000 works of various kinds, including scholarly publications, vernacular literacy materials, and Bible translations. SIL’s flagship publication, the Ethnologue,3 is an often-cited database of the world’s more than 7,000 living languages. SIL’s collective scholarly output is widely utilized in documentary and typological linguistics. Many SIL staff are professional linguists, with advanced degrees and significant engagement with academia.

From its beginnings in Latin America, SIL extended its work through the mid-twentieth century to other regions of the world, including mainland and insular Southeast Asia. Instrumental in that expansion in the 1950s was Asia Area Director Richard Pittman who, from his base in the Philippines, established relations with governments and educators in the region. His visit to Saigon in early 1957 opened the way for SIL personnel to work initially in Vietnam, and later other mainland Southeast Asia (MSEA) countries (Lynip 2013). David and Dorothy (Dot) Thomas were the first SIL staff members to arrive in Vietnam in December 1957.
Others followed, some of whom did not survive the tragic conflict that continued until 1975.4 The efforts of SIL staff in MSEA during the 1960s and 1970s were fundamental to transforming the understanding of the typology and history of languages in the region. Their engagement with local and international academics, scholarly publications (including founding the Mon-Khmer Studies journal), participation in international meetings (such as the International Conference on Austroasiatic Linguistics), and contributions to language classification, historical reconstruction, phonology, and grammar in MSEA advanced knowledge of the field and remain important to this day. A particular case in point is historical linguistics, where SIL staff authored papers and theses pursuing comparative reconstructions of Austroasiatic language history in MSEA. They were encouraged by David Thomas’ prediction in the first issue of Mon-Khmer Studies (Thomas 1964) that careful phonemic and bottom-up comparisons would bring rapid progress in reconstruction. The studies that followed strongly energized the field and underwrote subsequent decades of progress in Austroasiatic studies.

1 The authors would like to express their appreciation to Brian Migliazza and the many other SIL staff who provided input to and comments on this paper.
2 See https://sil.org/ (accessed 7 January 2021).
3 See https://www.ethnologue.com/ (accessed 7 January 2021).
4 On 4 March 1963 Gaspar Makil and his infant daughter, Janie, along with Elwood Jacobsen, were killed at a roadblock between Saigon and Dalat. Gaspar’s three-year-old son, Thomas Makil, was injured, but survived. In 1968, Henry Blood was captured in Banmethuot and died in captivity.

https://doi.org/10.1515/9783110558142-010

10.2 Vietnam

In the years following the arrival of the Thomases in Vietnam, others came to join them. As neophyte linguists they were assigned to specific languages such as Cham, Roglai, Stieng, Bahnar, Mnong and Bru, while the Thomases worked with a Chrau (South Bahnaric, Austroasiatic) language community located to the east of Saigon. Most of the languages to which SIL teams were assigned had no written tradition. Thus, after learning Vietnamese and the minority language, SIL staff were tasked to work with community members to analyze the phonology, devise orthographies, compile basic dictionaries, and assist the people in learning to read and write their own language.

Not much was known about the phylogeny and history of the languages of southern Vietnam at that time. David Thomas undertook an initial comparative study and language classification (Thomas 1966), which was later refined in collaboration with Headley (Thomas and Headley 1970). All SIL Vietnam staff were expected to contribute data in the form of basic wordlists, and this facilitated lexicostatistical analysis which distinguished the branches and sub-branches of Austroasiatic languages in MSEA and showed the relation of Chamic languages to Malayo-Polynesian (e.g. Blood 1962; Lee 1966). Often data gathering and structural analysis generated unanticipated results. When the Millers reported finding 41 contrastive vowel nuclei in Bru, SIL colleagues were highly skeptical. Further studies (e.g. Phillips et al. 1976) convinced them of what has since been found to be not unusual for Austroasiatic languages in MSEA. It was work of this kind by SIL staff that began to reveal the real diversity and complexity of vocalic systems, exploiting features such as diphthongization and breathy and creaky phonation, adding fundamentally to the understanding of language typology in MSEA. SIL staff also taught classes at both Saigon and Hue Universities, and David Thomas’s friendship with Dr. 
Nguyễn Đình Hòa5 led to the formation of the Linguistic Circle of Saigon. This, in turn, led to the launch of the Mon-Khmer Studies journal in 1964. This journal, which resided for a time at the University of Hawaii and then Mahidol University in Thailand, provided an outlet for many articles about languages and cultures in the region, with some 45 volumes over 52 years.6

Throughout South Vietnam, primary and secondary schools, where they existed, taught only in Vietnamese. Schools of any kind were seldom found in the more remote areas where ethnic minority groups lived. A cooperative program between SIL and the Department of Education of Vietnam, with funding from USAID, was established in 1967, focused on the educational needs of the minority groups. The “Highlander Education Program” (HEP) would allow for minority children to study in their own language for a “primer” year before beginning instruction in Vietnamese as a second language in the first grade. After bilingual materials were used for the first few years, it was anticipated that minority language-speaking children would be able to continue their studies in Vietnamese. For HEP, SIL teams and local speakers prepared 1,140 “teaching items,” including primers, posters, teacher guides, textbooks (math, science), etc. (Gregerson and Smith 2016; Person 2015). HEP began in six languages, expanding to 12 and eventually 19. Apart from HEP, non-formal literacy programs were started by SIL teams working in another 8–9 languages (Smith 1978).

Despite the conflict, the SIL “Vietnam Branch” was incredibly productive.7 From 1957 to 1975, 84 SIL staff worked directly with 22 language groups in Vietnam and two in Cambodia, while collecting linguistic data on an additional 33 Asian languages (Smith 1978). They authored 681 technical documents related to linguistics and applied linguistics, often in collaboration with Vietnamese academics, national co-workers, and missionaries (Smith 1978).

5 Nguyễn Đình Hòa was a professor and cultural official in Hue, Da Lat, and Saigon, and later held academic positions at Southern Illinois University, the University of Hawaii, and SOAS.
In 1975, SIL’s presence in South Vietnam concluded, and personnel were resettled in other countries, such as Thailand and the Philippines. Nevertheless, significant linguistic work continued based on data collected up to that point. For example, Kenneth Smith’s work on Sedang (North Bahnaric, Austroasiatic) yielded a monograph-length grammar (Smith 1979) and a substantial dictionary (Smith 2000), in addition to an important legacy of comparative-historical work (e.g. Smith 1967, 1972, 1974). In 1978, Smith reported that SIL staff in Vietnam had published technical articles on forty of Vietnam’s languages. The SIL Vietnam bibliography for that year listed 291 entries in the general and descriptive linguistics section and 164 entries in the applied linguistics section.

6 Mon-Khmer Studies Journal (MKS) was a peer-reviewed publication of record for research in Austroasiatic linguistics. Founded in 1964, it ceased publication in 2016 with volume 45. All articles are archived at http://www.sealang.net/mks (accessed 7 January 2021).
7 For SIL administrative purposes, Vietnam and Cambodia were combined into the “Mainland Southeast Asia Group,” often referred to as the “Vietnam Branch,” until its official closure in June of 1978.


 Carolyn P. Miller and Kirk R. Person

10.3 Cambodia

The work of SIL in Cambodia began in 1973 under the leadership of David and Dorothy Thomas, following contacts with academic colleagues at the University of Phnom Penh and the Khmer-Mon Institute.8 It was decided to pursue work on Western Cham as a logical extension of the work that had been done on Eastern Cham in Vietnam, and on Brao (West Bahnaric, Austroasiatic) spoken in Ratanakiri Province in the northeast of the country. These were the only practically accessible minority communities during a tumultuous time, and teams were limited in how long they could function in the field. After 1975, SIL work in Cambodia stopped for 18 years. The situation changed when Charles and Sally Keller, who had maintained contact with Brao refugees in the USA and Europe, returned to Cambodia in 1993 under the sponsorship of World Concern.9 In 1995, SIL President Dr. Kenneth Gregerson (who had previously worked in Vietnam) participated in the first international academic conference held in Cambodia since the civil war, centering on the languages and cultures of the country. Ken and his wife Marilyn later moved to Cambodia as senior consultants in linguistics and anthropology, respectively, while working with the Tampuan (Central Bahnaric, Austroasiatic) language project. In 2001, SIL joined several other international development agencies in forming an umbrella organization, International Cooperation Cambodia (ICC).10 Projects under ICC in Ratanakiri and Mondulkiri Provinces included community development and language development. Work in these provinces led to government approval for Khmer-based orthographies for the Bahnaric languages Brao Krung, Tampuan, and Mnong, as well as primer development and non-formal mother tongue-based bilingual education programs (Person 2015).
Those programs later became mainstreamed into the formal Cambodian school system; as a result, Cambodia now has one of the strongest pro-minority language education policies in Asia (Kosonen and Benson 2021). In 2017, a cooperative agreement between SIL and the National Polytechnic Institute of Cambodia (NPIC) established the Language Software Development Unit (LSDU). A key activity of LSDU is to continue development of SIL’s Keyman software, which provides keyboarding and font solutions for MSEA languages. Keyman recently added predictive texting to its capabilities, to make it easier for minority language speakers to communicate via mobile phone in their own language. This is expected to contribute to language preservation, revitalization and literacy. The SIL team in Cambodia is also working on a Khmer language corpus and on dictionary projects with the Institute of National Language, Royal Academy of Cambodia. One notable output of that cooperation has been work on security risks, poor searchability, and other complications resulting from ambiguities in how Khmer script can be input on computers and mobile devices, as detailed in Horton et al. (2017).

8 The Khmer-Mon Institute was a Lon Nol-era nationalist, Buddhist institute that promoted the study of Khmer national culture, history, and language (Harris 2005).
9 World Concern is a global relief and development organization; see https://worldconcern.org (accessed 7 January 2021).
10 See http://www.icc.org.kh (accessed 7 January 2021).

10.4 Thailand

In 1975, Kenneth Smith and David Thomas took a seven-week trip to Thailand to explore opportunities for SIL work in that country. Several universities expressed interest, leading David and Dorothy Thomas to move to Bangkok in 1975. Subsequent years saw memorandums of agreement signed between SIL and Mahidol, Thammasat, Chulalongkorn and Payap Universities, as well as cooperation with scholars at the Royal Institute (now Society) of Thailand.

10.4.1 Mahidol University

Through their work at Mahidol University’s Institute of Language and Culture for Rural Development (later renamed the Research Institute for Languages and Cultures of Asia), David and Dorothy Thomas helped foster a commitment to minority language research among key Thai faculty members and graduate students, who in turn passed on this interest to other generations of students. Mahidol students and faculty have since produced linguistic descriptions of most of the minority languages in Thailand. Subsequent SIL staff both served alongside and studied under their Thai Mahidol colleagues. SIL’s Brian Migliazza, for example, produced A grammar of So – a Mon-Khmer language of Northeast Thailand (Migliazza 1998) for his Mahidol University PhD dissertation. Philip Dill assisted the Institute with its ambitious Thailand ethnolinguistic mapping project (Premsrirat 2004). The university awarded honorary doctorates to two SIL staff members: David Thomas and Susan Malone.

In 2000, senior SIL multilingual education consultants Dennis and Susan Malone began working with Dr. Suwilai Premsrirat and her Mahidol University team on language revitalization and education projects. As a result, Mahidol University, UNESCO Bangkok and SIL joined UNICEF’s East Asia and Pacific Regional Office in 2008 to found the Asia-Pacific Multilingual Education Working Group. Hosted by UNESCO Bangkok, this network has facilitated six international conferences bringing ethnic minority people, academics, international development workers, and government officials together to discuss the use of ethnic minority languages in formal education systems. A high-level policymakers’ forum in 2019 produced the “Bangkok Statement on Language and Inclusion,” in which 16 Asia-Pacific nations pledged to give greater attention to the needs of ethnic minority children, including expansion of mother tongue-based multilingual education programs – and to report on their progress in subsequent UNESCO-sponsored conferences. Grammars, dictionaries, orthography statements, and sociolinguistic surveys done by SIL and other researchers have informed many of these mother tongue-based education projects, which hold great potential for raising the educational achievement of disadvantaged minority children while simultaneously contributing to language revival and maintenance.

One clear example is the “Patani Malay-Thai Multilingual Education Programme” (PMT-MLE), launched in 2007 by Mahidol University, UNICEF and the Thailand Research Fund, with technical assistance from SIL.11 At the outset of the program, pilot schools were paired with comparison schools as part of a longitudinal study. The pilot schools followed an adapted curriculum which used Patani Malay as the main language of instruction in the early years along with a specially developed “Thai for Ethnic Children” language acquisition course. The amount of Thai language used in the classroom increases each year. The comparison schools followed the “normal” monolingual Thai curriculum. Annual evaluations tracked student performance and community attitudes. UNICEF (2018) reports that PMT-MLE has succeeded in helping students achieve significantly higher marks in Thai language learning, while improving the learning of other subjects such as math and science. Community interviews conducted by the Thailand Research Fund in 2010 and 2015 found strong parental support. PMT-MLE has garnered national and international recognition, including the 2016 UNESCO King Sejong Prize for Literacy and “Honorable Commendation” for the 2017 UNESCO Wenhui Award for Innovations in the Professional Development of Teachers – the only program in UNESCO history to be recognized by both award juries.

10.4.2 Thammasat University

The “Thammasat University-SIL Language Research and Development Project” began in 1985. For the next thirteen years, SIL staff taught undergraduate and graduate linguistics, undergraduate English, and anthropology, while also serving in the main library and in the Office of Foreign Relations. Under the terms of the agreement, SIL staff were to teach for one semester and conduct fieldwork the other semester. Kensiw, Western Pwo Karen, Shan, So, Lahu Shi, and Bru were among the languages studied, resulting in such publications as Arthur Cooper’s (2000) Lahu Shi orthography and Julie Green’s (1996) A preliminary description of Bru (Khong Chiam) phonology. Annual academic seminars were organized, often bringing in outside speakers. In 1986, SIL’s Kenneth L. Pike was the keynote speaker for such a gathering, resulting in the publication Linguistics and worldview: in honor of Professor Kenneth L. Pike (Samermit and Thomas 1989).

11 Ironically, Mohammad Kadir, a Patani Malay-speaking Ministry of Education official, had presented the case for just such a bilingual education program 20 years earlier during the “Linguistics and Worldview” conference at Thammasat University mentioned later in this chapter (Samermit and Thomas 1989).

10.4.3 Chulalongkorn University

Cordial relationships with Chulalongkorn University allowed for cooperative work in language survey and the documentation of Mon-Khmer languages in Northeast Thailand. Results of the survey were published in Mon-Khmer Studies (Miller and Miller 1993b, 1995, 1996). Word lists, thesauruses, phrase books and text material from six of the Bru-So groups (Miller and Miller 1993a) were placed on file with the National Research Council of Thailand and SIL’s David Thomas Library (formerly located in SIL’s Bangkok office but now housed at Payap University).

10.4.4 The Royal Institute

Dr. Udom Waromakasadit, the doyen of Thai linguists, came to know SIL while studying under Kenneth Pike at the University of Michigan in the 1960s. In 1975, Dr. Udom introduced SIL to Mahidol University linguists, resulting in the program previously mentioned. Thirty years later, Dr. Udom invited Kirk Person to represent SIL on the Royal Institute’s newly formed National Language Policy (NLP) drafting committee. The NLP was approved by Prime Minister Abhisit Vejjajiva in 2010 and endorsed by his successor Prime Minister Yingluck Shinawatra in 2012. As of early 2020, the NLP implementation plan is in the process of being submitted to the Thai cabinet. The NLP is supportive of the language rights of ethnic minority people, including the right to mother tongue-based education in the formal school system. In explaining the need for such provisions, Waromakasadit and Person (2011) stated:

As many as 1 out of every 15 children in Thailand speak a non-Tai language in the home. In terms of education, the monolingual Thai approach is not producing satisfactory results among ethnic children. Ministry of Education statistics from 2007, for example, found that 25–35 % of second grade children in the far North, deep South, and Northeast border regions were functionally illiterate in Thai, compared to 1 % in Bangkok. […] the National Language Policy of Thailand (2010) represents a significant first step in a systematic effort to develop the language resources of the kingdom. (Waromakasadit and Person 2011: 35, 41)

10.4.5 Payap University

Cooperation with Payap University in the northern Thai city of Chiang Mai began in 1985, when SIL’s Mainland Southeast Asia Director, Paulette Hopple, approached the university with a proposal for a joint program in linguistics, research, and community development. The university responded positively, prompted in large part by the enthusiastic support of the Payap Board of Trustees chairperson, Dr. Saisuree Chutikul, a social reformer and United Nations advisor who would soon become the third woman in Thai history to hold a cabinet position. Payap Vice President Boonthong Phoocharoenthon, SIL’s Frances Woods and Howard McKaughan (who had worked with SIL in Southeast Asia for many years before becoming the founding chair of the University of Hawaii’s linguistics department) developed plans for a graduate program in linguistics, receiving Ministry of Education authorization in 1988 (Person 2009). The program began with two students in 1989 and, as of 2019, has awarded master’s degrees in linguistics to 139 students from 23 countries. Of these graduates, five have completed PhDs and eight are currently enrolled in PhD programs. Sixteen graduates returned to teach at Payap, and 71 have worked in minority language development projects in the area. Theses done by Payap students under the guidance of SIL professors include phonological, grammatical, discourse, and sociolinguistic studies of many of the languages of the region, as shown in Table 1.

Tab. 1: Asian languages and language families featured in Payap University theses.

Agusan Manobo Akha Anung Bai Baloch Bisu Bisoid (family) Brao Bru Burmese Bwe Karn Chin (proto) Chodri Daai Chin Dermuha Karen Dong Dehong Dai Eastern Lawa English Falam Chin Guinaang Hawa Nocte Helong Hmong

Hmong Daw Hmong Ntsuab Hokkien Iu-Mienh Japanese Jejara (Para Naga) Jingpaw Jirel K’cho Karenic (family) Kayah Monu Kayan Lahta Khmer Khuen Kim Mun Khmu Koho Kuy Lacid Lahu Aga Lahu Na Lahu Bakeo Lahu Si/Shi Lai

Lao Leinong Naga Lemei Lemi Chin Lhasa Tibetan Lisu Maa Maguindanaon Makyam Naga Makuri Naga Mandarin Chinese Manobo Meung Yum Savaiq Muak Sa-aak Nepali Ngochang Northeastern Thai Northern Thai Persian Phowa Plang Punjabi

Pwo Karen Rawang Rera Santa Mongolian Senthang Chin Sgaw Karen Karenic (family) Shan Sizang Chin Solu Sherpa Suyot Tai (family) Tai Dam Tai Lue Tai Nua Tangshang Naga Thai Tibetan Vietnamese Wa Wadiyari West Katuic (family) Yong




As was the case with Thammasat University, SIL staff working in Payap’s linguistics department are expected to spend one semester teaching and the other semester conducting research (although some exceptions have been made for full-time teaching staff). That research is to be published either internally by the University (through a series of technical reports) or externally (conference proceedings, international journals, etc.). Payap Linguistics Department faculty have thus produced over 200 technical papers, available via SIL’s David Thomas Library, the Payap University Central Library, or SIL’s International Language and Culture Archives. Recent work by SIL-Payap staff includes Ikeda and Lew (2017) on alveolar fricative rhotics in Nusu, Manson (2017) on the Karenic family, Tehan (2017) on Kriang clauses, Inglis (2018) on Khamti anti-ergative constructions, Lew (2014) on Lao orthography, Adams (2014) on orthography decision making, Diller and Diller (2018) on Prai story structure, Fraiser (2019) on indigenous agroforestry, Gehrmann (2017) on Kriang historical phonology, Owen (2017) on Khuen script, and Phillips and Hanna (2019) on Tai Lue classifiers.

The Payap-SIL partnership was not limited to the graduate linguistics program. In 1991, the “Tribal Training Program” (TTP) was established under the auspices of the Payap Research and Development Institute (PRDI). As part of the university’s commitment to public service, this program provided language development training directly to minority language communities. Workshops lasting from a few days to several weeks were organized on multiple language development topics, including orthography creation/revision, story writing/editing, primer construction, program planning, literacy teacher training, software tools, and translation principles.
The output of these workshops can be seen in hundreds of vernacular publications (story books, picture dictionaries, primers, Scripture portions) on file in the David Thomas Library. TTP was renamed the “Applied Linguistics Training Program” (ALTP) in 1995. ALTP continued to develop the workshop series, introducing new topics as needs arose. From 2003 to 2006, ALTP cooperated with the Office of the Non-Formal Education Commission to provide support for a mother tongue-based multilingual education program for Pwo Karen children – the first time a minority language was officially authorized to be used as a medium of instruction in Thai government schools. The success of this UNESCO-funded project prompted school visits from Minister of Education Chaturon Chaisang and Her Royal Highness Princess Maha Chakri Sirindhorn.

In 2006, ALTP became the “Linguistics Institute” (LI), with a broadened scope. Today, the LI comprises a training unit (carrying out activities similar to those described above), a computing unit, a research/documentation unit, and an extension site in Chiang Rai Province called the Northern Training Center. The LI’s computing unit has created computer programs to aid linguistic research and language development in cooperation with SIL’s global “Language Software Development” and “Non-Roman Script Initiative” units. They have developed dictionary and word-gathering tools such as WeSay and LanguageForge, in addition to tools to aid translation work. Major contributions have also been made in developing Unicode resources for Lanna, Tai Lue and Tai Khuen, as well as systems like Graphite and Keyman that enable the use of complex scripts on computing systems. The LI’s research and documentation12 unit supports researchers by providing visas, work space, planning, and follow-up services. One key activity has been the collection of linguistic and sociolinguistic survey data, resulting in language maps and other tools important to clarify language and dialect boundaries. Notable publications by LI researchers include Kosonen (2017) on language policy, Hall (2017) on Tai Loi, Cheeseman, Hall and Gordon (2015) on Palaungic, Schmutz (2013) on Ta’oi, Miller and Miller (2011) on Khmu literacy, Wannemacher (2000) on Zaiwa and Vitrano-Wilson (2016) on Hmong Daw reading.

10.5 Laos

In the 1960s and 1970s, some SIL staff began to study languages which extended into Laos from the neighboring countries. These included Tai languages such as Black Thai, White Thai, Tho, Nung, and Austroasiatic languages such as Khmu, Rmeet, and Bru. Some study was carried out with speakers of the languages who settled in other countries such as the United States or France. In 1991, SIL signed an agreement with the Lao government’s Committee for Social Sciences to survey minority languages in the southern part of the country. Though the agreement was suspended by the government before the survey was able to take place, some staff were able to serve in other capacities. Dr. John Durdin spent several years developing computer programs and scripts for the Lao, Bru and Khmu languages. Others helped with linguistic description and language development projects in cooperation with various international non-government agencies. Notable works include Pamela Sue Wright’s (1996) A Lao grammar for language learners and several publications by Nancy Costello and her co-worker, Khamluan Sulavan. These included several trilingual (Katu-Lao-English) books published by the Lao government on Katu agriculture, education, traditional medicine, and a collection of Katu folktales, as well as numerous Mon-Khmer Studies journal articles covering Katu nouns, affixes, dialects, aspect and tense. Elisabeth Preisig worked with Suksavang Simana and Somseng Sayavong to produce the Kmhmu’-Lao-French-English Dictionary (Simana, Sayavong and Preisig 1994), published by the Lao Ministry of Information and Culture.

12 Payap University Linguistics Institute publications can be seen here: https://li.payap.ac.th/index.php?option=com_content&view=article&id=15&Itemid=24 (accessed 7 January 2021). Reports and Conference Presentations are at: https://li.payap.ac.th/index.php?option=com_content&view=article&id=10&Itemid=57 (accessed 7 January 2021). Language Resources are at: https://li.payap.ac.th/index.php?option=com_content&view=article&id=28&Itemid=38 (accessed 7 January 2021).




10.6 Myanmar

SIL’s contribution to language communities in Myanmar has been largely in the area of training. Thirty-six of the graduates from the MA program at Payap University have been from Myanmar. Their theses have contributed much to the understanding of minority languages in that country. Most of these graduates have gone on to promote language development in their own and other languages, generally working with local organizations or churches. Several Payap graduates were instrumental in founding the Yangon-based Language and Social Development Organization (LSDO), which does advocacy at a regional and national level and assists minority communities with language development. SIL has provided formal and informal consultancy services to language communities and local organizations that have requested training in language development, literacy programming and language planning. While providing such technical assistance, SIL staff have authored articles on such languages as Muak-Sa-aak, Riang Lang, Lacid, Zaiwa, Mok, Plang, Khamti Shan, Pyen, Akha and Lahu. In recent years, SIL partnered with the LSDO, the Myanmar Information Management Unit (a service of the Office of the United Nations Resident and Humanitarian Coordinator), the European Union, and the Embassy of Canada to produce language maps13 of Myanmar in the national language and in English, based on Ethnologue data. SIL education consultants have been invited speakers at academic conferences and United Nations-sponsored symposiums related to mother tongue-based multilingual education for ethnic minority children.

10.7 Conclusion

SIL staff have worked in MSEA for more than 60 years, generating a significant amount of language data, linguistic analysis, and vernacular language publications.14 In that time, the region and its peoples have witnessed dramatic changes, as rapid economic growth has transformed rice paddies into megacities. Yet much remains the same for ethnic minority peoples living on the fringes of Asia’s urban societies.

13 These are available online at https://www.themimu.info/mm/search/node/Ethnologue (accessed 7 January 2021).
14 SIL’s Language and Culture Archives lists 688 entries for Vietnam, 499 for Thailand, 117 for Myanmar, 157 for Cambodia, and 88 for Laos. These and other resources can be found at https://www.sil.org/resources/language-culture-archives (accessed 7 January 2021). The David Thomas Library at Payap University is another excellent repository of technical and vernacular publications; the library is open to the public, with a searchable online catalog available via https://sites.google.com/a/sil.org/mseag/david-thomas-library (accessed 7 January 2021).


Earlier generations of SIL linguists often lived in remote villages, working alongside community members with little formal education to learn and analyze previously undocumented languages. They filled notebooks and index cards with data which steadily made its way to linguistic conferences, journal articles, and books.

Today, SIL’s work is considerably more multifaceted. Some SIL staff still conduct “traditional” fieldwork, including sociolinguistic assessments and other forms of linguistic and anthropological data collection. Yet increasingly this is done in the context of carefully constructed partnerships with language communities who have requested SIL’s technical assistance to help them achieve their own goals. Toward that end, SIL provides training, ranging from short workshops to graduate programs, aimed at empowering minority communities. SIL computer experts work alongside local communities to ensure they have the technical tools they need, developing easy-to-use software to create dictionaries, analyze sound systems, produce pedagogically sound primers and reading books, and facilitate accurate and meaningful translations. Meanwhile, SIL education consultants strive to build bridges of understanding between government leaders, United Nations staff, and ethnic communities. What has not changed is SIL’s commitment to serve ethnic minority people through language development partnerships that foster individual and community development, while contributing to the scientific study of humanity’s amazing linguistic and cultural diversity.

References

Adams, Larin. 2014. Case studies of orthography decision making in mainland Southeast Asia. In Michael Cahill & Keren Rice (eds.), Developing orthographies for unwritten languages, 177–189. Dallas: SIL International.
Blood, Doris E. 1962. Reflexes of Proto-Malayopolynesian in Cham. Anthropological Linguistics 4(9). 11–20.
Bryant, John R., Khu Klawreh & Khu Noah. 1992. Notes on Western Kayah Li (Western Red Karen) phonology. Chiang Mai: Payap Research and Development Institute and SIL International.
Cheeseman, Nathaniel, Elizabeth Hall & Darren Gordon. 2015. Palaungic linguistic bibliography with selected annotations. Mon-Khmer Studies 44. i–iiv.
Cooper, Arthur D. 2000. Lahu Shi orthography report. Bangkok: Thammasat University and Summer Institute of Linguistics Language Research and Development Project.
Cooper, Michael. 1996. Some fortitions and lenitions in Vietnamese. PYU working papers in linguistics, vol. 1, 105–118. Chiang Mai: Payap University.
Costello, Nancy A. 1998. Affixes in Katu of the Lao P. D. R. Mon-Khmer Studies 28. 31–42.
Costello, Nancy A. 2000. Dialect differences for Katu prepositional phrases. Mon-Khmer Studies 30. 65–73.
Costello, Nancy. 2001. Aspect and tense in Katu of the Lao P. D. R. Mon-Khmer Studies 31. 121–125.
Diller, Kari Jordan & F. Jason Diller. 2018. Twos and fore: Dual organization and the importance of foreshadowing in Prai story structure. Journal of the Southeast Asian Linguistics Society 11(2). 118–178.
Fraiser, Douglas. 2019. Environmental invasion and social response: Of a forest and those who dwell therein. Dallas: SIL International.
Gehrmann, Ryan. 2017. The historical phonology of Kriang, a Katuic language. Journal of the Southeast Asian Linguistics Society 10(1). 114–139.
Green, Julie. 1996. A preliminary description of Bru (Khong Chiam) phonology. Bangkok: Thammasat University and Summer Institute of Linguistics Language Research and Development Project.
Gregerson, Kenneth & Kenneth Smith. 2016. Témoignage de l’intérieur: le Summer Institute of Linguistics au Viêt Nam et sa présence en Asie du Sud-Est [Testimony from inside: the Summer Institute of Linguistics in Vietnam and its work in Southeast Asia]. In Pascal Bourdeaux & Jeremy Jammes (eds.), Chrétiens évangéliques d’Asie du Sud-Est. Expériences d’une ferveur conquérante [Southeast Asian evangelicals: Conquering fervor narratives], 89–106. Rennes: Presses Universitaires de Rennes.
Haak, Feikje van der & Brigitte Woykos. 1987. Kui dialect survey in Surin and Sisaket. Mon-Khmer Studies 16/17. 109–142.
Hall, Elizabeth. 2017. On the linguistic affiliation of Tai Loi. Journal of the Southeast Asian Linguistics Society 10(2). xix–xxii.
Hall, Elizabeth. 2019. Muak Sa-aak: Challenges of an extensive phoneme inventory for a contained Latin-based orthography. Journal of the Southeast Asian Linguistics Society 12(2). i–viii.
Hanna, William J. 2012. Dai Lue-English dictionary. Chiang Mai: Silkworm Books.
Harris, Ian. 2005. Cambodian Buddhism: History and practice. Honolulu: University of Hawai’i Press.
Horton, Joshua, Makara Sok, Mark Durdin & Rasmey Ty. 2017. Spoof-vulnerable rendering in Khmer Unicode implementations. Proceedings of the Sixth Asian Conference on Information Systems, 177–180. Phnom Penh: Royal University of Phnom Penh.
Ikeda, Elissa & Sigrid Lew. 2017. The case for alveolar fricative rhotics with evidence from Nusu. Linguistics of the Tibeto-Burman Area 40(1). 1–39.
Inglis, Douglas. 2018. Khamti Shan anti-ergative construction: A Tibeto-Burman influence? Linguistics of the Tibeto-Burman Area 40(2). 133–160.
Kosonen, Kimmo. 2017. Language policy and education in Southeast Asia. In Teresa L. McCarty & Stephen May (eds.), Language policy and political issues in education (Encyclopedia of Language and Education), 477–490. Cham, Switzerland: Springer.
Kosonen, Kimmo & Carol Benson. 2021. Bringing non-dominant languages into education systems: Change from above, from below, from the side – or a combination? In C. Benson & K. Kosonen (eds.), Language issues in comparative education II: Policy and practice in multilingual education based on non-dominant languages. Leiden: Brill Publishers.
Kya Heh, Noel & Thomas Tehan. 2000. The current status of Akha (Technical Paper #57). Chiang Mai: Payap Research and Development Institute and SIL International.
Lee, Ernest W. 1966. Proto-Chamic phonological word and vocabulary. Bloomington: University of Indiana doctoral dissertation.
Lew, Sigrid. 2014. A linguistic analysis of the Lao writing system and its suitability for minority language orthographies. Writing Systems Research 6(1). 25–40. DOI: 10.1080/17586801.2013.846843.
Lynip, Arthur. 2013. Richard S. Pittman: SIL statesman linguist and the Asia-Pacific rim of fire. Manila: SIL Philippines.
Manson, Ken. 2017. The characteristics of the Karen branch of Tibeto-Burman. In Picus Shizhi Ding & Jamin Pelkey (eds.), Sociohistorical linguistics in Southeast Asia: New horizons for Tibeto-Burman studies in honor of David Bradley (Languages of the Greater Himalayan Region 20), 149–168. Leiden: Brill.
Migliazza, Brian. 1998. A grammar of Sô – A Mon-Khmer language of northeast Thailand. Salaya, Thailand: Mahidol University doctoral dissertation.
Miller, John & Carolyn Miller. 1993a. A WordSurv database of Katuic languages. Submitted to the National Research Council of Thailand. Unpublished manuscript.
Miller, John & Carolyn Miller. 1993b. Perceptions of ethnolinguistic identity, language shift and language use in Mon-Khmer language communities in Northeast Thailand. Mon-Khmer Studies 23. 83–101.
Miller, John & Carolyn Miller. 1995. Notes on phonology and orthography in several Katuic Mon-Khmer groups in Northeast Thailand. Mon-Khmer Studies 24. 27–51.
Miller, John & Carolyn Miller. 1996. Lexical comparison of Katuic Mon-Khmer languages with special focus on So-Bru groups in northeast Thailand. Mon-Khmer Studies 26. 255–290.
Miller, John & Carolyn Miller. 2017. Bru-English-Vietnamese-Lao dictionary. SIL International. http://bru.webonary.org/ (accessed 8 April 2020).
Miller, Michelle M. & Timothy M. Miller. 2011. A study of language use and literacy practices to inform local language literature development among Khmu in Thailand. Mon-Khmer Studies Journal Special Issue No. 2. 98–111.
Owen, Robert Wyn. 2017. A description and linguistic analysis of the Tai Khuen writing system. Journal of the Southeast Asian Linguistics Society 10(1). 140–164.
Page, Christina Joy. 2013. A new orthography in an unfamiliar script: A case study in participatory engagement strategies. Journal of Multilingual and Multicultural Development 34(5). 1–16.
Person, Kirk R. 2009. Celebrating 20 years of cooperation: Payap University and SIL International. Chiang Mai: Payap University Press. https://drive.google.com/file/d/0B7VcgyXE37XZ3k4ajc1a2pXZHM/view?pli=1 (accessed 7 January 2021).
Person, Kirk R. 2015. SIL International and the mother tongue-based multilingual education (MTB-MLE) movement. Paper presented at The Mission of Development: Religion and TechnoPolitics in Asia conference, National University of Singapore, 3–4 December.
Phillips, Audra & William J. Hanna. 2019. Numeral classifiers in Tai Lue (Xishuangbanna). Journal of the Southeast Asian Linguistics Society 12(2). 1–34.
Phillips, Richard, John Miller & Carolyn Miller. 1976. The Brũ vowel system: Alternate analyses. Mon-Khmer Studies 5. 203–217.
Premsrirat, Suwilai, Sujaritlak Depadung, Akapong Buasawnwong, Isara Choosri, Sophana Srichampa, kapong Bua & Mayuree Thawornpat. 2004. Ethnolinguistic maps of Thailand. Bangkok: Ministry of Culture and Mahidol University.
Samermit, Potchanat & David Thomas (eds.). 1989. Linguistics and worldview: In honor of Professor Kenneth L. Pike. Bangkok: Thammasat University and Summer Institute of Linguistics Language Research and Development Project.
Schmutz, Jonathan. 2013. The Ta’oi language and people. Mon-Khmer Studies 42. i–xiii.
Simana, Suksavang, Somseng Sayavong & Elisabeth Preisig. 1994. Kmhmu’ – Lao – French – English dictionary. Vientiane: Ministry of Information and Culture, Institute of Research on Culture.
Smith, Kenneth D. 1967. Phonological reconstruction of proto Central North Bahnaric. Work Papers of the Summer Institute of Linguistics, University of North Dakota 11. 85–112.
Smith, Kenneth D. 1972. A phonological reconstruction of proto North-Bahnaric (Language Data, Asian-Pacific Series 2). Santa Ana, CA: Summer Institute of Linguistics.
Smith, Kenneth D. 1974. A computer analysis of Vietnam language relationships. Work Papers of the Summer Institute of Linguistics, University of North Dakota 18. 99–113.
Smith, Kenneth D. 1978. Summary report of the Mainland Southeast Asian Branch, Summer Institute of Linguistics 1957–1978. Unpublished ms.
Smith, Kenneth D. 1979. Sedang grammar: Phonological and syntactic structure (Pacific Linguistics Series B No. 50). Canberra: Australian National University.



The SIL contribution to language and linguistics in Mainland Southeast Asia 


Smith, Kenneth. 2000. Sedang dictionary with English, Vietnamese and French glossaries: A thesaurus-alphabetical listing of Sedang words and word groups (Mon-Khmer Studies Special Volume 1). Salaya, Thailand: Mahidol University and SIL International.
Smith, Kenneth. 2015. Personal communication.
Sulavan, Kamluan, Thongpeth Kingsada & Nancy A. Costello. 1996. Katu traditional education for daily life in ancient times. Vientiane: Institute of Research on Lao Culture.
Sulavan, Khamlouan, Thongpheth Kingsada & Nancy Costello. 1998. Katu-Lao-English Dictionary. Lao P. D. R.: The Ministry of Information and Culture & The Institute of Research on Lao Culture.
Tehan, Tom. 2017. Kriang clause structure: Active (dynamic) events. Proceedings of the Payap University research symposium. Chiang Mai: Payap University.
Thomas, David. 1964. A survey of Austro-asiatic and Mon-Khmer comparative studies. Mon-Khmer Studies 1. 49–163.
Thomas, David. 1966. Mon-Khmer subgroupings in Vietnam. In Norman Herbert Zide (ed.), Studies in comparative Austroasiatic linguistics (Indo-Iranian Monographs V), 194–202. The Hague: Mouton.
Thomas, David D. & Robert K. Headley. 1970. More on Mon-Khmer subgroupings. Lingua 25. 398–418.
UNICEF. 2018. Bridge to a brighter tomorrow: The Patani Malay-Thai multilingual education programme. Bangkok: UNICEF Thailand Country Office.
Vitrano-Wilson, Seth. 2016. Reading syllable-spaced versus word-spaced text in Hmong Daw: Breaking up isn’t so hard to do. Writing Systems Research 8(2). 234–256.
Wannemacher, Mark & Zau Mo. 2000. A preliminary Zaiwa – English lexicon with English Zaiwa glossary (Technical Paper #54). Chiang Mai: Payap University Research and Development Institute and SIL International.
Warotamasikkhadit, Udom & Kirk Person. 2011. Development of the national language policy (2006–2010). Journal of the Royal Institute of Thailand 3. 29–44.
Wright, Pamela Sue. 1996. A Lao grammar for language learners. Bangkok: SIL and Thammasat University.

Paul Sidwell

11 Classification of MSEA Austroasiatic languages

11.1 Introduction

The Austroasiatic (AA) language family comprises around 170 named languages that can be grouped into 13 or 14 primary branches (Eberhard et al. [2019] name some 167 languages). All but three of these branches are spoken in MSEA: Palaungic, Aslian, Bahnaric, Katuic, Khmer, Khmuic, Mang/Pakanic, Monic, Pearic, Vietic. Broadly speaking, the AA family in one form or another has been recognised by linguists since the second half of the 19th century, yet an understanding of the full extent of the family and of how the languages fall into groups only seriously emerged around 1970, after a surge of fieldwork in the region and the application of lexicostatistical methods to language classification (crucially Thomas and Headley 1970). Classification was an important issue at the first International Conference on Austroasiatic Linguistics (ICAAL) in 1973; with work pursued for that meeting and in its aftermath, a consensus emerged that identified the principal AA branches, as seen, for example, in Diffloth’s (1974) Encyclopædia Britannica article. However, how those branches coordinate into a nested hierarchy, if at all, has remained a problematic issue, and it is not at all clear that a convincing solution has been presented to date. These issues are reviewed in detail by Sidwell (2009a, 2014a), and much of this chapter summarizes what is reported in those works, with some updates reflecting more recent work.

It is straightforward to recognise a language as belonging to the AA family, based on the stability of various body part and animal terms; very stable AA etyma include *mat ‘eye’, *tiːʔ ‘hand’, *ɟəːŋ ‘foot’, *cʔaːŋ ‘bone’, *ceːm~*ciːm ‘bird’, *cɔːʔ ‘dog’ and others including personal pronouns and lower numerals 1–4 (an AA basic lexicon is provided by Sidwell and Rau 2014). All AA branches have a recognisable cohort of roots from this common stock, while each is distinguished from the others variously by diagnostic lexical and phonological innovations. At the same time, some scholars have invoked typological considerations in arguments over AA classification: Maspero (1912) refused to recognise Vietnamese as AA, arguing for Tai ancestry due to the presence of tones; Sebeok (1942) cited structural differences between AA groups as fundamental obstacles to recognising the AA family at all; and Pinnow (1963) invoked morpho-syntactic typology to justify splitting AA into Munda and Mon-Khmer sub-families. The term Mon-Khmer has since enjoyed widespread currency as a label for all or some non-Munda branches.1

1 The use of the term Mon-Khmer continues to have some currency no doubt due to its use in the name of the journal Mon-Khmer Studies and in Shorto’s (2006) influential A Mon-Khmer Comparative Dictionary.

https://doi.org/10.1515/9783110558142-011


Map 1: Map of MSEA Austroasiatic languages.




Notable works in recent decades that discuss the possible nesting relations between AA branches include: Parkin (1991), Diffloth and Zide (1992), Peiros (1998, 2004), Chazée (1999), van Driem (2001), Diffloth (2005, 2009), Sidwell (2009a), Nagaraja (2010), Sidwell (2014a). These include some novel proposals, but show little overall movement towards a consensus, beyond the clear tendency for concerned scholars to step away from the binary model of AA consisting of Munda and Mon-Khmer, effectively abandoning Mon-Khmer as a phylogenetic label. From here on, among the AA languages of India, Munda is not discussed, and Khasian and Nicobarese are discussed only in terms of their relation to MSEA languages. At the time of writing the most influential classificatory schemes are those of Diffloth (2005, 2009) and Sidwell (2014a); the former offers a model of deeply nested branching that coordinates all AA languages of MSEA into two sub-families, Khasi-Pakanic and Mon-Khmer, while the latter study finds no strong evidence for nested branching and arranges most branches into a rake-like tree in which no two MSEA branches apparently coordinate (see Figures 1, 2).

[Figure 1: tree diagram. Node labels: Khasi-Aslian, Khasi-Pakanic, Mon-Khmer. Branches in figure order: Khasian, Palaungic, Pakanic, Khmuic, Vietic, Katuic, Bahnaric, Khmeric, Pearic, Monic, Aslian, Nicobarese.]

Fig. 1: Classification of Mainland AA branches by Diffloth (2009).

[Figure 2: tree diagram. Root: Austroasiatic. Branches in figure order: Munda, Palaungic, Khasian, Mangic, Khmuic, Vietic, Katuic, Bahnaric, Monic, Nicobarese, Khmeric, Pearic, Aslian.]

Fig. 2: Classification of AA branches by Sidwell (2014a).


Since 2014 Sidwell has conducted computational phylogenetic experiments on an AA dataset of 200 words in 121 languages (in collaboration with Simon Greenhill and Russell Gray, see Sidwell 2015a). The results of those experiments suggest a revision of the flat rake model to recognise some six primary branches, with the Indo-Chinese groups in particular (Pearic, Khmeric, Bahnaric, Vietic, Katuic) forming a loose Eastern-AA sub-family (see Figure 3). However, such results are difficult to assess as they fall out statistically from the identification of cognate and innovated lexicon, and need to be validated against comparative reconstruction of phonological and other structural evolution.

[Figure 3: tree diagram. Root: Austroasiatic. Node labels: Southern-AA, Northern-AA, Eastern-AA. Branches in figure order: Nicobarese, Aslian, Khasian, Palaungic, Khmuic, Munda, Mang, Monic, Pearic, Khmeric, Bahnaric, Vietic, Katuic.]

Fig. 3: Classification of AA branches suggested by computational phylogenetic analysis of 200-word list (Sidwell 2018).

Thus, the overall configuration of the AA family tree remains an unsettled problem. This is not entirely surprising, as the differentiation of AA into distinct branches probably occurred over millennia, before and during the Neolithic period of MSEA prehistory (see chapters by Higham, Bellwood, this volume), and the march of time erases linguistic evidence through both internal change and the effects of prolonged contact. By contrast, the phylogenetic structures within branches stand on much firmer ground; the time-depths are much shallower, the language histories are now usefully reconstructed for each branch, and results across multiple studies have shown significant convergence. In the sections that follow, we touch upon each branch and review the state of understanding of its internal classification.




11.2 Aslian

Aslian is a group of between a dozen and 18 languages2 spoken in the interior of the Malay Peninsula by relatively small tribal groups, totalling around 50,000 speakers. Broadly, they fall into Southern, Central, and Northern subgroups, with a question over the status of one – Jah Hut – which may form a fourth sub-group. The question of internal Aslian classification has been mostly settled since the 1970s, with the lexicostatistical study of Benjamin (1976), and only minor refinements added by Dunn et al. (2011, see Figure 4), who applied Bayesian computational phylogenetic methods to an augmented version of Benjamin’s original dataset.

Aslian
    North Aslian: Kensiu (kns); Chewong (cwg); Mendriq (mnq); Jehai (jhi); Kintaq (knq); Batek (btq); Mintil (mzt); Tonga (tnz)
    Jah Hut (jah)
    Central Aslian: Lanoh (lnh); Semnam (ssm); Temiar (tea); Semai (sea); Sabum (sbo)
    South Aslian: Semaq Beri (szc); Semelai (sza); Mah Meri (mhe); Temoq (tmo)

Fig. 4: Aslian family tree, based on Dunn et al. (2011).

Until the 1970s Aslian was not generally treated as a unitary branch, but was commonly listed as three, using the names Semelaic, Senoic, and Jahaic (corresponding to Southern, Central, and Northern). In earlier work, such as Schmidt (1901) and Skeat and Blagden (1906), the Southern and Central groups were termed Sakai while the Northern were called Semang. The recognition of a coherent branch, with the name Aslian (from Malay Asli ‘original’), was proposed by Geoffrey Benjamin at the 1973 ICAAL meeting and emerged as the standard with the publication of Benjamin (1976). The current classification maps neatly onto the historical reconstruction by Phillips (2012), adding weight to the idea that the classification is essentially settled.

The justification of Aslian as a coherent AA branch, while not contested, is not based on an agreed list of shared innovations but seems reasonable on geographical and lexical considerations. The languages are lexically diverse, with many replacements of otherwise stable AA etyma, yet all lexicostatistical studies unambiguously group them against the rest of AA. An apparently high rate of lexical change within Aslian may be related to cultural practices of word tabooing among the Aslian (see Diffloth 1980b). Nonetheless, it is possible to identify some lexical innovations in the basic lexicon that appear to reconstruct to Proto-Aslian, and some examples are given in Table 1, based on Phillips (2012).

Tab. 1: Selected Aslian lexical innovations.

Gloss       Kensiu   Jahai    Semnam    Semelai          pAslian
‘bitter’    kadek    kadɛk    kdɛk      kədɛc            *kədɛk
‘marrow’    –        suwəp    lasuoːm   smsɔm            *suəm
‘bear’      kawap    kawip    kaweːp    –                *kawãːp
‘sweet’     gəhɛt    bhɛt     bhɛt      gɛhɛt (Temoq)    *gəhɛːt

2 Eberhard et al. (2019) list 18 Aslian languages, yet as Kruspe et al. (2014) explain, the North Aslian sub-group is a chain of very closely related lects, so it is not quite clear whether more than one language should properly be distinguished within the northern group.

11.3 Bahnaric

Bahnaric is one of the more diverse AA branches; as many as 30 Bahnaric languages are spoken by hundreds of thousands of people in the southern highlands in Vietnam, southern Laos, and eastern Cambodia, and their speakers range from hillside swiddeners to lowland paddy cultivators. Bahnaric languages have been recognised and documented since the latter 1800s (particularly Bahnar, Stieng, Sre, Sedang, and others) but into the 1960s were basically treated by scholars as lects in a long chain of highland languages stretching the length of the Annamite Range. The pioneering lexicostatistical study of Thomas (1966) differentiated Bahnaric from Katuic languages, and identified a north-south division within Bahnaric; Thomas and Headley (1970) recognised a western sub-group (spoken in Laos and Cambodia). The classification was further investigated and refined at the 1973 ICAAL and in papers published subsequently. Smith (1973) proposed a North-East sub-group, and Gregerson et al. (1976) proposed that Bahnar and Alak (spoken in Laos) form a Central Bahnaric sub-branch. Thomas (1979) conducted a more extensive review, and proposed five sub-groupings, as listed in Table 2.




Tab. 2: Bahnaric classification by Thomas (1979).

North Bahnaric:     Sedang, Hrê, Halăng, Jeh, Rengao
South Bahnaric:     Kơho, Chrau, Mnong, Stieng
West Bahnaric:      Laven, Nyaheun, Cheng, Oi, Laveh, Brao
Central Bahnaric:   Bahnar, Tampuan, Alak
Eastern Bahnaric:   Cua, Takua (?)

In the new millennium Sidwell (2002, 2009a, 2009b, 2014a) refined and extended Thomas’ scheme by modelling the break-up of proto-Bahnaric according to reconstructed phonological history; those results are diagrammed in Figure 5, which remains the state of our understanding at the time of writing.

Bahnaric
    North Bahnaric: Halang (hal), Doan (hld); Jeh (jeh); Kayong (kxy); Takua (tkz); Katua (kta); Hre (hre); Sedang (sed); Todrah (tdr); Monom (moo); Rengao (ren); Kaco’ (xkk), Romam (rmx), Lamam (lmm)
    East Bahnaric: Cua (cua)
    West Bahnaric: Kavet (krv), Krưng (krr), Lave (brb); Jru’ (lbo), Sou (sqq); Oy (oyb), Sapuan (spu), Sok (skk), The (thx), Jeng (jeg); Nyaheun (nev); Lavi (lvi)
    Central Bahnaric: Alak (alk); Trieng (stg), Talieng (sdf), Kasseng (kgc); Tampuon (tpu); Bahnar (bdq)
    South Bahnaric: Stieng (sti), Budeh Stieng (stt); Chrau (crw); Sre (kpm); Maa (cma); Eastern Mnong (mng); Central Mnong (cmo), Kraol (rka); Southern Mnong (mnn)

Fig. 5: Current Bahnaric classification based on Sidwell (2002, 2009a, 2009b, 2014a).


Tab. 3: Selected Bahnaric lexical innovations.

Gloss       Laven (West)   Sedang (North)   Stieng (South)   pBahnaric   pAA
‘bone’      ktɨəŋ          kəsiəŋ           ntiːŋ            *kʦɨːŋ      *cʔaːŋ
‘tongue’    hapiat         rəpiɩ            lpiat            *lpiət      *lntaːk
‘fire’      ʔuɲ            ʔuɲ              ʔoɲ              *ʔuɲ        *ʔus

The Bahnaric branch is readily distinguished from the rest of AA by several lexical and phonological innovations (see Table 3). The etyma for ‘bone’ and ‘tongue’ are particularly diagnostic, because they are found in all Bahnaric languages. Phonologically the ‘bone’ word looks suspiciously like an adaptation from Vietic (cf. Vietnamese xương ‘bone’) although other etymologies indicate that Bahnaric may have innovated both the *ʦ and *ɨː segments independently. The ‘tongue’ word is clearly an infixed reflex of *liət ‘to lick’, and we do not find this derivative in other branches. The ‘fire’ word with nasal coda reflects an AA root with a fricative coda, but phonologically altered by word-play. Interestingly, two Bahnaric languages still have archaic reflexes of ‘fire’: Srê (in the south) ʔos ‘fire’, and Cua (in the extreme north-east) ʔolh ‘fire’ (with -lh the regular reflex of *-s); it is not clear if these are retentions or were influenced by neighbouring Khmer or Katuic languages.

11.4 Katuic

Katuic is a branch of approximately 15 distinct languages spoken in Thailand, Cambodia, Laos, and Vietnam; many named lects are reported in the literature, but any impression that this indicates very high diversity is incorrect (see Choo 2010, 2012 and references therein). The centre of diversity of Katuic is in southern Laos, between Salavan and the Vietnam border; outside of this relatively compact area a West-Katuic dialect chain, with many named lects, spreads to the north, west and south-west across large areas of Thailand and Cambodia. Into the 1960s there was no recognition of a coherent Katuic branch; sources such as Lebar et al. (1964) applied vague labels such as “mountain Khmer” to upland AA speakers across the region. Thomas’ (1966) lexicostatistical study first properly distinguished Katuic from Bahnaric, and Thomas and Headley (1970) listed 18 putative Katuic languages, although that list wrongly included three Bahnaric lects as Katuic (Alak, Kasseng, Talieng) based on geographical considerations. Various lexicostatistical studies followed (Smith 1981; Migliazza 1992; Miller and Miller 1996; Peiros 1996, 2004), producing conflicting results in terms of nested branching, although all recognised a West Katuic sub-group and a special status for Katu alongside or within an Eastern Katuic sub-group.

Sidwell (2005) proposed a historical phonology for Katuic, and a classification into four coordinate sub-branches based on sound changes (Figure 6). It is difficult to justify proposals for nested relations between these four sub-groups, and this remains an open research question, so the division into four sub-groups remains a conservative position at this time.

Katuic
    West Katuic: Bru (bru, brv, xhv); Sô (sss); Kuy (kdt); Nyeu (nyl)
    Ta’oih: Ir (irr), Ong (oog); Ta’oih (tto, tth); Kataang (kgd); Ngeq (ngt), Khlor (llo)
    Pacoh (pac)
    Katu: Eastern Katu (ktv); Western Katu (kuf); Phuong (phg)

Fig. 6: Katuic classification based on Sidwell (2005).

In addition to the lexicostatistical indications, Katuic is readily distinguished from other AA branches by numerous lexical innovations (some of which were borrowed into West Bahnaric, see Sidwell 2005). These innovations include a special set of numerals for six through ten, plus many other items, and some examples are given in Table 4.

Tab. 4: Selected Katuic isoglosses.

Gloss         Katu     Pacoh    Bru       pAA
‘year’        kamɑː    kumɔː    kumɒː     *cnam
‘cobra’       tuːr     tur      tṳːr      *ɟaːt
‘mushroom’    triː     triə     tri̤aʔ     *psit
‘bone’        ŋhaːŋ    ŋhaːŋ    ŋhaːŋ     *cʔaːŋ
‘head’        ploː     ploː     pləː      *b/ɓuːk; *kuːj


11.5 Khmer

Khmer, also called Cambodian, is the national language of Cambodia, and is also spoken by significant numbers of people in the Mekong delta region of Vietnam, Northeast Thailand and Thailand’s Trat province. Effectively, Khmer is a branch of the AA family consisting of a single language with several regional dialects. The most divergent dialect is Cardamom Khmer, spoken in the west of the country; it reportedly maintains the breathy voice register inherited from Middle Khmer but lost in other dialects (Wayland and Jongman 2001). Other dialects are considered to be forms of central Khmer, which includes the National Standard, Surin or Northern Khmer (spoken in Thailand), and Krom Khmer spoken in Vietnam. It is apparent (following Ferlus 1992, see Figure 7) that the modern forms of Khmer diverged variously during the Middle Khmer period (after 1440 CE) with the loss of the unifying influence of the Angkorian state. Prior to that, the Old Khmer state was so centralised and powerful that any significant diversity within Khmer was apparently levelled.

Old Khmer (Pre-Angkorian/Angkorian)
    Central Khmer (khm): Surin Khmer; Standard Khmer and its dialects
    Cardamom Khmer (kxm)

Fig. 7: Khmer classification based on Ferlus (1992).

Khmer shows no clear affinity to another AA branch, although lexicostatistical studies tend to indicate a somewhat higher percentage of cognates with Pearic, and Diffloth (2005, 2009) groups Khmer with Bahnaric. Headley (1976) examined lexical, phonological, and morphological data to investigate the problem, and found numerous conflicting indications, concluding:

    I suggest that Khmer stands alone as a language isolate. It has its closest ties with the Eastern Mon Khmer-Mon Subfamily. Khmer exerted a strong influence on its neighbors, especially Pearic. This accounts for the high lexicostatistical figures between Khmer and Pearic. (Headley 1976: 450)

In terms of formally distinguishing Khmer from the rest of AA, there are various unique lexical replacements, and in many cases these may reflect outcomes of word tabooing. This is particularly seen in animal names; some of these are given in Table 5.




Tab. 5: Selected Khmer lexical innovations.

Gloss        Old Khmer    Modern Khmer   Surin Khmer   pAA
‘fish’       triː~treː    trəj           trɛj          *kaʔ
‘chicken’    –            moan           mɯan          *ʔiər
‘dog’        cʰkɛː        ckae           ʨkɛː          *cɔːʔ

Additionally, there are marked sound changes that occurred early in the history of Khmer, in particular etyma that reflect pAA *aː as a short raised vowel [ə, ɨ~ɯ] or a long front vowel [iː, ɛː], while *aː is preserved intact in many words. The raising of *aː is not unusual in AA languages, but the reduction to a short vowel is particularly identifying for Khmer (see Table 6).

Tab. 6: Selected Khmer reflexes of pAA *aː.

Gloss                Old Khmer    Modern Khmer   Surin Khmer   pAA
‘bone’               cᵊʔɤŋ        cʔəŋ           ʨʔʌŋ          *cʔaːŋ
‘water’              dɪk ~ dɯk    tɨʔ            tɯʔ           *ɗaːk
‘two’                biːr         piː            piːr          *ɓaːr
‘hawk/eagle/kite’    kʰlɛːŋ       khlaeŋ         khlɛːɲ        *(k)laːŋ
‘crab’               kʰɗaːm       kdaːm          kdaːm         *ktaːm

11.6 Khmuic

The Khmuic branch is dominated by Khmu, a large dialect chain in Northern Laos, plus smaller communities in Thailand, Vietnam, and China. Premsrirat (2002) documents seven named lects in her thesaurus of Khmu dialects. Additionally, there exist a handful of minor Khmuic languages spoken on the western and eastern peripheries of the Khmu-dominated area, extending into Thailand and Vietnam. Demographically Khmu is dominant with about 700,000 speakers, while the smaller Khmuic languages number only a few thousand speakers (the Mlabri perhaps fewer than 200). A listing of Khmuic languages and variant names is given by Cheeseman et al. (2017). The recognition of Khmuic as a coherent branch emerged out of the lexicostatistics of Thomas and Headley (1970), who reported that the term Khmuic had been suggested by William Smalley. However, Thomas and Headley’s study only sampled two languages, Khmu and Mal, classifying additional languages as Khmuic based on their geographical proximity, and confusion continues today as to whether various small languages belong within Khmuic or Palaungic.


On the western periphery, in the provinces of Nan (Thailand) and Xayaboury (Laos), are the Tinic/Mal-Pray languages, phonologically marked by aspiration of the historical voiceless stop series. In proximity is the phonologically conservative Mlabri, marked by an unusually high rate of lexical replacement. Rischel (2007) wonders whether Mlabri is not Khmuic but an archaic branch of its own, although consensus appears to hold that Mlabri is a Khmuic language affected by periods of social self-isolation.

On the eastern periphery, in Houaphan Province (Laos) and neighbouring areas of Vietnam, are the Ksing Mul/Puoc and Pramic languages such as Tayhat, Kniang, Ơdu/Ưdu, Phong.3 Pramic is a group of closely related lects, perhaps better regarded as varieties of one language; the name Pramic derives from their common word for ‘person’ /pram/. Several other languages have been listed as Khmuic from time to time, with suggestions that they are related to the Pramic group, especially Khang/Khao, Buxing/Bit, although Sidwell (2015b) classifies these as Palaungic based on lexical and phonological criteria.

Justifying Khmuic as a single branch is a unique sound change in the form of the regular loss of pAA medial *h. This is found only in a small number of lexical items, yet is prominent in the etymon for ‘blood’, e.g. Khmu maːm, Mlabri mɛːm, Mal miam, Ksing Mul miəm, Ơdu miːm, etc. (cf. Stieng mhaːm, Chong məhaːm etc.). Within Khmuic, nested branching is suggested on the basis of reflexes of pAA *aː (seen in the ‘blood’ etymon, among others). As pointed out by Sidwell (2014b), there is a split in the reflexes of pAA *aː; Khmu generally shows [aː], while other Khmuic languages show [aː] in some etyma and a fronted/raised reflex in others. The leading hypothesis is that the fronted/raised reflexes are regular, while the [aː] is reflected in loans from Khmu. Assuming that the shift proceeded as *aː > *ɛː > *iə > *iː, nested branching is implied which matches neatly the classification of Chazée (1999, citing Diffloth and Proschan without providing the specific sources) and Sidwell (2014a), see Table 7 and Figure 8.

Tab. 7: Khmuic reflexes of *aː suggesting nested branching relations.

Proto-Khmuic *aː
    *aː    Khmu
    *ɛː    Mlabri
    *iə    Ksing Mul, Mal-Pray
    *iː    Pramic (Tayhat, Kniang, Ơdu/Ưdu, Phong)

3 Phong/Pong is also used to designate some Vietic lects also spoken in the region and special care needs to be taken when dealing with secondary sources using this language name.
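
The nesting implied by the shift chain *aː > *ɛː > *iə > *iː can be made explicit with a short sketch. It is only an illustration, under the assumption that each step of the chain is a shared innovation undergone once, so that every group showing a later stage belongs to a subgroup excluding those that stopped earlier. The reflexes are those listed in Table 7; the function names are hypothetical.

```python
# Stages of the assumed shift *aː > *ɛː > *iə > *iː, in chronological order.
STAGES = ["aː", "ɛː", "iə", "iː"]

# Regular reflex of Proto-Khmuic *aː in each group, as listed in Table 7.
REFLEX = {
    "Khmu": "aː",
    "Mlabri": "ɛː",
    "Ksing Mul": "iə",
    "Mal-Pray": "iə",
    "Pramic": "iː",
}


def nested_grouping(reflex, stages):
    """Return a nested list: the groups that stopped at each stage, followed
    by one sub-list holding every group that moved on to a later stage."""
    def build(level):
        here = sorted(g for g, r in reflex.items() if stages.index(r) == level)
        later = [g for g, r in reflex.items() if stages.index(r) > level]
        if not later:
            return here
        return here + [build(level + 1)]
    return build(0)


def to_brackets(node):
    """Render the nested list as a bracketed tree string."""
    parts = [to_brackets(n) if isinstance(n, list) else n for n in node]
    return "(" + ", ".join(parts) + ")"


if __name__ == "__main__":
    print(to_brackets(nested_grouping(REFLEX, STAGES)))
    # prints: (Khmu, (Mlabri, (Ksing Mul, Mal-Pray, (Pramic))))
```

The bracketing reflects nothing beyond the ordering of the stages; it does not by itself show that the shift happened only once, which is the assumption doing the work here.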




[Figure 8: tree diagram. Root: Khmuic. Internal node labels: Pray-Pram, Pramic. Languages in figure order: Khmu’ (kjg), Khuen (khf), Kuanhua (xnh); Mlabri (mra); Mal (mlf); Prai (prt), Pray (pry); Khsing Mul, Puoc (puo); Phong, Kniang (pnx); Tayhat, Ơdu (tyh).]

Fig. 8: Khmuic classification based on Chazée (1999) and Sidwell (2014a).

11.7 Mang, Pakanic

The status of Mang (ISO 639-3 zng, spoken in Northern Vietnam and China), and the Pakanic languages Bolyu/Paliu/Lai (ISO 639-3 ply) and Bugan/Pakan (ISO 639-3 bbh), both spoken in southern China, remains an unsolved problem. Lexicostatistical and computational-phylogenetic studies4 indicate that these three languages group lexically, and Peiros (2004) and Jenny and Sidwell (2014) recognised a Mangic branch based on these indications. However, Mang and Pakanic lack a clear set of shared innovations, and the statistical results may reflect only retained archaisms. Mang came to the attention of scholars with Vương Hoàng Tuyên (1963) reporting data from Lai Châu Province in Vietnam, and this was used by Thomas and Headley (1970), who remarked:

    Mang (Mang U’) in North Vietnam near the China border (Tuyên 1963), which we expected from its geographical location to be Khmuic, shows highest cognateness with Palaungic, next highest with Viet-Mương. Though the data is sketchy, we are tentatively classifying Mang with Palaungic. (Thomas and Headley 1970: 403)

Gao (2003) delivered a sketch of Chinese Mang, and Nguyễn Văn Lợi et al. (2008) a monograph-length study of Mang in Vietnam. The latter speculate that Mang subgroups with the Waic sub-branch of Palaungic, citing the Mang words for ‘two’ ʑɨəj⁴ and ‘water’ ʑum¹ as derivable from proto-Waic (Diffloth 1980a) *lʔar and *rʔom respectively. However, it is apparent that Mang words with initial /ʑ/ reflect forms with original clusters *kr-, *kj-, e.g. ‘wind’ ʑiː⁴ < *kjaːl, ‘road’ ʑaː¹ < *kraʔ. These suggest a pre-Mang *krum/*kjum ‘water’ and *kraːj/*kjaːj ‘two’, neither of which supports comparison with Waic/Palaungic. Bolyu was first described by Liang Min (1984), and later Wu Zili (1992), Li Jinfang (1996), Li Yunbing (2005). Benedict (1990) proposed that Bolyu is an independent AA branch, and Edmondson and Gregerson (1996) compared Bolyu typologically to Vietic, but offered no conclusions about classification. Bugan materials include the sketch in English by Li and Luo (2014).

4 This writer’s (Sidwell 2010) lexicostatistical analysis grouped the three languages, scoring Bolyu-Bugan at 47 % cognate, Mang-Bolyu at 29 %, and Mang-Bugan 25 %. The highest inter-branch count is 22 % between Mang and Khmu; other pairwise comparisons across AA yield cognate counts no higher than 20 %. Peiros’ (2004) lexicostatistics groups Mangic, Bolyu and Bugan, coordinating with Vietic and Palaungic-Khmuic.

Tab. 8: Selected Mang and Pakanic etyma compared to other AA.

Gloss      Mang     proto-Waic / proto-Palaungic   Khmu             Bolyu     Bugan    proto-Vietic   Mon / Old Mon
‘I’        ʔuː⁴     *ʔɨʔ / *ʔɔːʔ                   ʔoʔ              ʔaːu⁵⁵    ɔ³¹      *soː           ʔoa / ʔɔj
‘water’    ʑum¹     *rʔom / *ʔoːm                  ʔom              nde⁵³     nda²⁴    *ɗaːk          dac / ɗaik
‘two’      ʑɨəj⁴    *ləʔar / *ləʔaːr               baːr             mbi⁵⁵     bi³¹     *haːr          ba / ɓar
‘fire’     ɲɛ²      *ŋɒl / *ŋal                    pʰrɨə            mat³³     a̠u³¹     *guːs          kəmot / –
‘blood’    haːm¹    *hnam / *snaːm                 maːm             saːm⁵³    sa⁴⁴     *ʔasaːmʔ       chim / chim
‘five’     han²     *phɒn / *pəsan                 (Tayhat sɔːŋ)    me³¹      mi⁴⁴     *ɗam           pəsɔn / sun
‘eye’      mat⁷     *ʔŋaj / *ˀŋaːj                 mat              mat⁵³     mɛ̱³³     *mat           mòt / mɔt

Table 8 provides a selection of relevant basic vocabulary items. All three languages appear to agree with both Palaungic and Khmu for the ‘I’ pronoun, suggesting a common Northern AA origin, yet this is contradicted by other items. Mang etyma for ‘water’, ‘two’, ‘fire’ appear to agree specifically with Palaungic, ‘blood’ agrees with Pakanic and Vietic, ‘five’ with Palaungic and Monic, and ‘eye’ is the general AA root which is otherwise completely replaced in Palaungic. Extending the list provides a further mix of ambiguous indications, so the Mang-Pakanic problem remains unsettled at this time.
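
The pairwise cognate percentages reported in footnote 4 are the basic quantity of lexicostatistics: for each pair of languages, the share of compared meanings whose words are judged cognate. The following Python sketch shows that arithmetic only. The languages X, Y and Z and their cognate labels are invented for illustration, and do not reproduce the word lists or judgements behind the published figures.

```python
from itertools import combinations

# Hypothetical cognate judgements for a tiny word list: within each meaning,
# languages sharing a label are counted as cognate. None = no data.
# Language names and labels are invented for illustration only.
WORDLIST = {
    "eye":   {"X": 1, "Y": 1, "Z": 1},
    "water": {"X": 1, "Y": 1, "Z": 2},
    "fire":  {"X": 1, "Y": 2, "Z": 3},
    "two":   {"X": 1, "Y": 1, "Z": None},
}


def cognate_percentage(wordlist, lang_a, lang_b):
    """Percentage of meanings attested in both languages that are cognate."""
    compared = shared = 0
    for codes in wordlist.values():
        a, b = codes.get(lang_a), codes.get(lang_b)
        if a is None or b is None:
            continue  # skip meanings lacking data for either language
        compared += 1
        shared += a == b
    return 100.0 * shared / compared if compared else float("nan")


def pairwise_table(wordlist):
    """Cognate percentage for every pair of languages in the word list."""
    languages = sorted({lang for codes in wordlist.values() for lang in codes})
    return {
        pair: cognate_percentage(wordlist, *pair)
        for pair in combinations(languages, 2)
    }


if __name__ == "__main__":
    for pair, pct in pairwise_table(WORDLIST).items():
        print(pair, round(pct, 1))
```

A percentage of this kind counts retained and innovated vocabulary alike, which is why high scores may reflect only shared archaisms rather than shared innovations.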

11.8 Monic

The Monic branch consists of two languages, Mon and Nyah Kur (and their dialects); the history of Monic is reconstructed by Ferlus (1983) and Diffloth (1984), indicating that today’s Mon and Nyah Kur languages descend from vernacular speech of the Dvāravatī culture that flourished in central and northeastern Thailand from the 6th to the 10th centuries CE. The reconstructions suggest that proto-Monic and Old Mon (attested in inscriptions from the 6th to 13th centuries CE) are essentially the same language in their basic vocabulary, but the inscriptional language reflects a written register which was heavily lexified from Pali and Sanskrit, and is thus not simply the direct ancestor of the Monic vernaculars that exist today. Today Mon is spoken in southern Burma, plus some small communities in Thailand reflecting back-migration from Burma. Small communities of Nyah Kur survive in upland areas regarded as less suitable for paddy farming by the Siamese, in the Phetchabun and Sankambeng Ranges separating central and northeast Thailand. Nyah Kur is documented and discussed by Diffloth (1984) and Luang-Thongkum (1984). A family tree of the Monic lects is given as Figure 9.

[Figure 9: tree diagram. Root: Monic. Node labels: Proto-Monic, Old Mon, Middle Mon, Literary Mon. Modern languages: Mon (mnw) (Mon Ro, Mon Rao, Thai Mon); Nyah Kur (cbn) (Southern, Central, Northern dialects).]

Fig. 9: Monic family tree based on Diffloth (1984).

Phonologically speaking, while the modern vernaculars are highly innovative, Old Mon is revealed to be highly conservative (Diffloth 1984; Jenny and McCormick 2014), so we rely upon lexical innovation to identify the branch; some examples are given in Table 9.

Tab. 9: Selected Monic lexical innovations.

Gloss       Proto-Monic   Old Mon   Nyah Kur   pAA
‘knee’      *ɟroːm        –         chròːm     *psaɲ
‘monkey’    *knuːj        knuj      khǝnúːj    *swaːʔ
‘chicken’   *tjaːŋ        tyaiŋ     cháːŋ      *ʔiər
‘dog’       *clur         kløw      chúr       *cɔːʔ


11.9 Palaungic

Palaungic (Palaung-Wa in older sources) is a group of some 26 languages (Eberhard et al. 2019) spoken in unconnected pockets over a large region that extends through Thailand, Myanmar, China, Laos, and into Vietnam. The complex distribution, often in border areas, has meant that a coherent account of the branch has emerged only rather recently. Historically, comparative-historical and classification studies have tended to focus on easily recognised sub-groupings (e.g. Schmidt 1904; Shafer 1952; Diffloth 1977, 1980a, 1991a; Mitani 1977, 1978, 1979; Paulsen 1989–1990), and a consolidated Proto-Palaungic has appeared only recently (Sidwell 2015b). Scholarship has generally divided Palaungic languages into two main groups: a western group of Palaung-Riang⁵ lects, and a complex of eastern groups that includes Waic, Lawa, Angkuic, and Lameet languages. Additionally, there is the outlier Danaw (also spelled Htanaw), spoken near Inle Lake in the southwest of the Shan hills, which may coordinate with Palaung-Riang or reflect its own sub-branch. This is reflected in the family tree presented by Mitani (1978) based on lexicostatistical analysis.⁶ The Mitani scheme is largely confirmed by Sidwell (2015b), the latter adding an additional Eastern sub-branch, Bit-Khang. The classification by Sidwell underlies the tree given here as Figure 10. Palaungic languages are readily recognised by the presence of various lexical innovations, and three of the significant ones are given in Table 10.

Tab. 10: Indicative Palaungic lexical innovations.

Gloss     proto-Palaungic   Danaw       Riang-Lang   Wa      U     Lameet   Buxing    Bumang   proto-AA
‘eye’     *ˀŋaːj            ŋɑi²        ŋɑi²         ŋai     ŋâj   ŋaːj     pɤŋŋai    ŋai⁵⁵    *mat
‘fire’    *ŋal              ɲɔːn⁴       ŋal²         ŋṳ      ŋàw   ŋal      ʧiŋal     ŋăn⁵⁵    *ʔɔːs~*ʔuːs
‘laugh’   *kəɲaːs           kăɲɑʔ¹ˈ³    kɤ̆ɲɑs²       ɲi̤ah    ɲǎʕ   kǝɲaːs   kɤȵaih    ȵai³³    –

5 “Palaung” is the general Burmese designation, and in the linguistic literature often refers to the speech forms in or around Namhsan in Shan State. “Riang” is a Romanised form of their autonym which reconstructs as *rəʔaːŋ, a Palaungic word for ‘stone’ used in the sense of ‘cliff’ as a place of shelter (Ferlus 2014).
6 The paper is a conference presentation archived at https://drive.google.com/file/d/1glBuoVTiZG0k6bMLy-dUez5RFtv5nZdT (last accessed 16 December 2020).



[Figure 10: tree diagram. Node labels: Palaungic; Danaw (dnu); Palaung-Riang; Palaung (pce, pll); Rumai (rbb); Riang (ril); Yinchia (yin); Waic; Wa (vbm); Parauk (prk); Awa (vwa); Lawa (lwo, lcp); Phalok (lwl); Blang (blr); Bumang (bvp); Samtao (stu); West Palaungic; Lameet (lbn); Bit-Khang; Khabit (bgk); Buxinhua (bxt); Kháng (kjm); Khao (xao); Angkuic; U (uuu); Hu (huo); Mok (mqt); Man Met (mml); Kemiehua (kfj); Tai Loi (tlq); Kiorr (xko); Con (cno); Kon Keu (kkn).]

Fig. 10: Palaungic languages classification (based on Sidwell 2015b).

11.10 Pearic

Pearic languages are a small and highly endangered group spoken mainly in southwest Cambodia and Trat Province of Thailand, plus small communities to the east of Siem Reap and some which relocated further afield within Thailand. The most extensive available documentation of Pearic in Cambodia was collected in the 1930s by Baradat (1941a, 1941b), and Ferlus (2011) provides a comprehensive list of sources of published Pearic data up to 2009. The Pearic languages are distinguished by having a four-voice register system (modal, breathy, creaky, breathy-creaky) plus numerous lexical innovations, a selection of which is given in Table 11 (see Headley 1985: 463–465 for more lexical innovations).


Tab. 11: Select Pearic lexical innovations.

Gloss       pPearic   Kasong     Chong    Samre   Pear of Kompong Thom   pAA
‘fish’      *meːˀw    me̤ː⁴⁵³     me̤ːˀw    miːɹ    miəl                   *kaʔ
‘fire’      *pleːw    ple̤ːw²¹    ple̤ːw    pliːw   phlou                  *ʔuːs
‘bone’      *klɔːŋ    klɔːŋ³³    klɑːŋ    kluəŋ   –                      *cʔaːŋ
‘chicken’   *hlɛːk    lɛːk⁴⁵     læːk     liək    lék                    *ʔiər
‘banana’    *hlɔːŋ    lɔːŋ³³     lɑːŋ     luəŋ    lâng                   –

Note: Reconstructions are by the author; other sources: Kasong: Thongkham (2003); Chong: Huffman (1985); Samre: Ploykaew (2001); Pear of Kompong Thom: Baradat (1941b) (the latter are Romanised forms as they appear in the original ms.).

Broadly, Pearic appears to have two principal branches: (i) Pear of Kompong Thom (KPT), a small language east of Tonle Sap, and (ii) the dialect chain of Chong, Samre, Somray, Suoy, etc. The languages of the latter group share between 77% and 98% of their basic vocabulary (Martin 1974a), and a high degree of mutual intelligibility is reported. This leaves the impression that Pearic may be regarded as just two languages, one of which is internally diverse. Headley (1985: 464) offers a phonological isogloss map for the group, indicating patterns of vowel diphthongization, lenitions among historical palatal segments, and mergers among coda consonants, but it does not clearly indicate nested relations, so we are left with the simple tree diagram in Figure 11.

[Figure 11: tree diagram. Node labels: Pearic; Pear (Kompong Thom) (pcb); Chong (cog), Saoch/Chung (scq), Samre (sxm), Somray (smu), Suoy/Su’ung (syo).]

Fig. 11: Pearic classification.

11.11 Vietic

The Vietic branch consists of at least 14 distinct languages; ten are recognised by Eberhard et al. (2019), although inadequate documentation of the smaller Vietic lects complicates matters. Scholars in recent decades have distinguished two main subgroups: Northern Vietic or Viet-Muong, which includes at least Vietnamese, Mường varieties, and Nguon; and Southern Vietic, consisting mostly of small languages spoken in the hills of the Annamite Chain. This was made explicit by Ferlus (1979) and replicated by various scholars, as in the recent scheme presented by Trần Trí Dõi (2018: 61). A problematic issue has been whether a group of languages including Cuối, Phong, Tuom, and Liha falls closer to Viet-Muong or Southern Vietic, and recent (presently unpublished) work by Sidwell and Alves finds lexical and phonological indications that they group with the northern clade.




The southern languages appear to fall into two distinct groups, both marked by phonological conservatism: the Chut group (Arem, Sac, Ruc, May) and the Thavung-Malieng group. Comparative reconstruction indicates that the latter retains the most archaic lexicon and phonological features, while the Chut group appears to share a merger of *-r and *-l codas to *-l with the northern languages. This leads us to propose the tree given here as Figure 12.

[Figure 12: tree diagram. Node labels: Vietic; Viet-Muong; Vietnamese (vie); Mường (mtq); Nguon (nuo); Cuối (tou); Tho (tou); Phong (hnu); Tuom; Liha; Arem (aem); Sach (scb); Ruc (scb); May (scb); Kri Maleng (pkt); Malieng (pkt); Ahao/Ahlao (thm); Thavung (thm).]

Fig. 12: Vietic classification following Sidwell and Alves (in preparation).

11.12 AA sub-family proposals

11.12.1 Northern AA / Khasi-Pakanic

Since the 1970s scholars have been inclined to recognise a northern AA sub-family with Palaungic, Khmuic, and Khasian branches (Thomas and Headley 1970; Ferlus 1974; Diffloth 1979; Diffloth and Zide 1992; Peiros 1998; and others), based largely on lexicostatistical indications. Additionally, Thomas and Headley (1970) and Diffloth and Zide (1992: 137) have suggested that Mang belongs in such a northern clade. However, those lexicostatistical results are based on margins of only two or three items out of the 100-word list, so it is possible that they are artifacts of language contact or of variable rates of change in the lexicon, and thus other evidence, such as lexical innovations, is required to make the case.

Investigations by this author (Sidwell 2014a, 2015a, 2015b) find strong lexical support for the grouping of Palaungic and Khasian, but only one lexical innovation – the 1st person pronoun reconstructable as *ʔɔːʔ ‘I’ – is apparently found in all putative northern branches, plus Pakanic. It is possible to compile many lexical isoglosses between Khmu and neighbouring Palaungic languages; more than two dozen are discussed by Sidwell (2015b), who finds that these are indicative of language contact and are not to be taken as evidence of a deeper Palaungic-Khmuic relationship. A good example is found in words for ‘water’: the AA root *ɗaːk was replaced with *ʔoːm in Palaungic, and while ʔom is attested in Khmu, so is ʔɔːk ‘water’ (< ‘to drink’), and all other Khmuic sub-groups have their own etymon for ‘water’ distinct from *ɗaːk or *ʔoːm (see Table 12). Thus, it seems that early studies which identified Khmuic as falling into a northern clade were unduly influenced by focussing on Khmu, which has a particular history of contact with Palaungic. Select other lexical comparisons are made in Table 12 which illustrate the difficulty of justifying a northern clade larger than Khasi-Palaung, and we are left with the problem of explaining the distribution of reflexes of *ʔɔːʔ ‘I’ in various geographically northern branches.

Tab. 12: Key Northern AA isoglosses.

              Khasian     Palaungic                            Khmuic                                          Mang     Pakanic
Gloss         Khasi       pPalaungic   Danaw        Lameet    Khmu         Mlabri    Ơdu      Ksing Mul     Mang     Bugan    Bolyu
‘blood’       snaːm       *snaːm       kᵊnɑn⁴       naːm      maːm         mɛːm      miːm     miəm          ham¹     sa³³     saːm⁵³
‘rain’        slap        *clɛːʔ       kᵊlɪ¹        səlɛʔ     kmaʔ         mɛːʔ      kmʌj     ʔəmĩə         ma²      kʰou³⁵   qɔ⁵⁵
‘two’         ʔaːr        *ləʔaːr      ʔɑn⁴         ʔlaːr     baːr         bɛːr      baːr     (sɔːŋ)*       ʑɨəj⁴    biɔ³¹    mbi⁵⁵
‘water’       ʔum         *ʔoːm        ʔun⁴         ʔoːm      ʔom, ʔɔːk    ɟrʌːk     paj      hɔːt          ʑum¹     da³⁵     nde⁵³
‘nail/claw’   trsim       *rəmsiːm     (kᵊleəŋ⁴)    lmhiːm    tmʰmɔːŋ      chŋkɛr    hawkar   (moŋ suəŋ)    (?)      mi³³     maːi¹³
‘I’           (Pnar ʔɔ)   *ʔɔːʔ        oʔ¹          ʔɔːʔ      ʔoʔ          ʔoh       naɲ      ʔaɲ           ʔuː⁴     ɔ³¹      ʔaːu⁵⁵

* Borrowed from Lao.
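The caution raised at the start of this section, that margins of two or three items on a 100-word list are a weak basis for subgrouping, can be made concrete with a simple calculation. The sketch below is illustrative only: it assumes that each item on the list behaves as an independent trial (a strong simplification), and both the function name and the 30% sharing rate are hypothetical, not figures from the studies cited.

```python
import math


def shared_items_standard_error(shared_fraction: float, list_size: int = 100) -> float:
    """Binomial standard error, in list items, of a shared-vocabulary count.

    Assumes every item is an independent trial with the same probability
    of being shared, which real basic vocabulary does not strictly satisfy.
    """
    return math.sqrt(list_size * shared_fraction * (1.0 - shared_fraction))


# Hypothetical example: two branches sharing 30% of a 100-item list.
se = shared_items_standard_error(0.30, 100)
print(f"standard error: about {se:.1f} items")  # about 4.6 items
```

Under these assumptions a difference of two or three items falls inside a single standard error, which is consistent with treating such margins as inconclusive without supporting lexical innovations.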

11.12.2 Katuic-Vietic

A sub-family of Katuic and Vietic was proposed by Diffloth (1991b) and maintained in his subsequent publications. The principal evidence is a set of lexical comparisons illustrating a correspondence of pAA onsets *ʔ- to Katuic *h- and Vietic *s-. The relevant data are presented in Table 13.




Tab. 13: Katuic-Vietic isoglosses from Diffloth (1991b).

Gloss

‘centipede’ ‘bone’ ‘to cough’ ‘to fart’ ‘to breathe’ ‘blood’

Vietic

Katuic

Other AA

Thavung Maleng

Tum

Pacoh

Kuay

Katu

kasḭːp – – – pəsʌ̰ːmʔ –

liːp-siːp siəŋ – somʔ – –

kahḛːp ŋha̰ːŋ kahɔ̰ːʔ – palho̤ːm ʔaha̰ːm

kahɛ̰ːp ŋha̰ːŋ ŋho̰ʔ – pəhɒ̰ːm ŋha̰ːm

kahip ŋhaːŋ – – – ʔahaːm

kasḭːp səːŋ – – pəsʌːmʔ ʔasaːmʔ

Bahnar kəʔɛːp, Khmer kʔaɛp Khmer cʔəŋ, Khmu cʔaːŋ Khmer kʔoːʔ, Brao kʔɔk Bahnar phoːm, Khmu puːm Khmer ɗɒŋhaəm, Car ʔuhɔːm Bahnar phaːm, Khmu maːm

Diffloth’s hypothesis suggests an unusual direction of phonetic change (ʔ > h > s) lacking specific conditioning or motivation. Another speculative explanation is that the relevant pAA phoneme was perhaps a strident segment that merged with *s in Vietic, lenited to h in Katuic, and became a glottal stop elsewhere. If this alternative hypothesis is correct it would not necessarily support a Vietic-Katuic sub-family, since no shared innovation would be involved. At present we have no clear basis for deciding how to interpret this correspondence set. Alves (2005) does discuss various Katuic and Vietic lexical agreements, but it is not clear that any of these reflect shared innovations, so the Katuic-Vietic hypothesis still lacks convincing support. Strikingly, published lexicostatistical studies (including all those referenced in this chapter) find a marked level of lexical agreement between Katuic and Bahnaric, in contradiction to the Katuic-Vietic hypothesis, and in the face of contradictory indications we are left with an unresolved problem.

11.12.3 Bahnaric-Khmeric

The arguments for a Bahnaric-Khmeric sub-family have not been discussed in print, although Gérard Diffloth has remarked at various international meetings that a grouping is suggested on the basis that both branches share a metathesized reflex of the word for ‘mushroom’: Khmer psət; Bahnaric Sre bəsit, Laven pseːt, etc. Other AA reflexes, such as Jahai tis, Khmu tih, Mon pətɑh, Nyah Kur pətíh, etc., suggest pAA *ptis. While such shared metathesis is striking, it is not beyond explanation by diffusion; mushrooms are highly sought after as food and traded over distances as a dried comestible. Additionally, we know that Bahnar bəməw and Tampuon ma̤w ‘mushroom’ were borrowed from Chamic (cf. Written Cham bimaw, Jarai bəmau, etc.), so borrowing of this item is attested in the area. Consequently, strong evidence for a Bahnaric-Khmeric relation has yet to be presented.


11.12.4 Southern AA/Nico-Monic

The identification of a southern clade consisting of Monic, Aslian, and Nicobarese is a feature of the family trees presented by Diffloth (2005, 2009). There are some indications of shared lexical and phonological traits; among these are half a dozen apparent lexical isoglosses suggestive of a southern sub-family (see Table 14).

Tab. 14: Apparent Southern-AA isoglosses.

Monic Gloss

Mon

Aslian Nyah Kur

Nicobarese

Kensiu Jahai Jahut

‘night’ hətɔm pǝtám ‘rotten’ ʔut ŋʔúːc ‘to perch’ dun ‘nail/ cas claw’ ‘to die’ gabis ‘to grate’ gɯc

siʔin

Temiar Semai

sǝʔuʔ

tɯp

Semelai Mah Meri

Car

ptɔm sʔĩt

hataːm hatɔm ʔomnɔc* dɯən sɔh kisoah

cnrɔs cərwɛs cənɹos cəŋɹoʊs cros kbis

kəbɨs

kǝbǝs kiːɟ

kəbəs

suɁũt

Nancowry

kǝbǝs kapah fah ʔitkic**

* ‘rotten log’, nominalized with infix. ** ‘to cut with knife’

These isoglosses are certainly suggestive of an Aslian-Nicobarese relationship, while the Monic comparisons are fewer in number and it is not clear whether they are innovations or retentions. Aslian-Nicobarese also has phonological support, with Diffloth proposing correspondences linking diphthongs in these languages (presented at the 18th SEALS meeting, 21–23 May 2008, Universiti Kebangsaan Malaysia). The examples from Diffloth’s handout are given here in Table 15 (Proto-Aslian reconstructions are Diffloth’s). These data are also consistent with the computational-phylogenetic results reported by Sidwell (2014a), and hence we can have some confidence that Aslian and Nicobarese do form a southern AA clade.

Tab. 15: Nicobarese-Aslian diphthong correspondences (Diffloth 2008).

        Gloss           Nancowry   Proto-Aslian
*uə     ‘house-fly’     juəj       *ruəj
        ‘wasp’          tũəʔ       *gr-tuəʔ
        ‘child’         kuən       *kuən
*uɔ     ‘to scratch’    koac       *kuɔc
        ‘dream’         ʔinfoaʔ    *-mpuɔʔ
        ‘fingernail’    kisoah     *c-(n)r-uɔs
        ‘four’          foan       *puɔn
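The regularity of the Table 15 correspondences can be checked mechanically. The following is a minimal sketch, not Diffloth’s procedure: it takes the seven Nancowry and Proto-Aslian forms exactly as listed in the table, strips combining diacritics such as nasalisation, and tallies which diphthong in one column matches which in the other. The function names and the three-member diphthong inventory are assumptions made for this illustration only.

```python
import unicodedata
from collections import Counter

# (gloss, Nancowry, Proto-Aslian) as listed in Table 15.
PAIRS = [
    ("house-fly", "juəj", "*ruəj"),
    ("wasp", "tũəʔ", "*gr-tuəʔ"),
    ("child", "kuən", "*kuən"),
    ("to scratch", "koac", "*kuɔc"),
    ("dream", "ʔinfoaʔ", "*-mpuɔʔ"),
    ("fingernail", "kisoah", "*c-(n)r-uɔs"),
    ("four", "foan", "*puɔn"),
]

# Assumed inventory: only the diphthongs that occur in this table.
DIPHTHONGS = ("uə", "uɔ", "oa")


def strip_marks(form: str) -> str:
    """Remove combining marks (e.g. the tilde on a nasalised vowel)."""
    decomposed = unicodedata.normalize("NFD", form)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def find_diphthong(form: str):
    """Return the first listed diphthong found in the form, or None."""
    plain = strip_marks(form)
    for diphthong in DIPHTHONGS:
        if diphthong in plain:
            return diphthong
    return None


correspondences = Counter(
    (find_diphthong(nancowry), find_diphthong(proto_aslian))
    for _gloss, nancowry, proto_aslian in PAIRS
)

for (nancowry_d, aslian_d), count in sorted(correspondences.items()):
    print(f"Nancowry {nancowry_d} : Proto-Aslian *{aslian_d} ({count} items)")
```

With these seven items the tally yields just two pairings, Nancowry uə with Proto-Aslian *uə and Nancowry oa with Proto-Aslian *uɔ, which is the pattern the table is arranged to show. A check of this kind only confirms internal consistency of the listed forms; it says nothing about whether the correspondences are shared innovations.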




11.13 Concluding remarks

The overall picture that emerges for AA languages of MSEA is that the branch-level groups have been reliably identified since the early 1970s, and a reasonably complete understanding of their internal structures has emerged in recent decades. However, how the AA branches coordinate into nested relations remains a largely unsolved problem, and only the Aslian-Nicobarese and Khasi-Palaung hypotheses find any strong support based on multiple lines of evidence.

References

Alves, Mark. 2005. The Vieto-Katuic hypothesis: Lexical evidence. In Paul Sidwell (ed.), SEALS XV Papers from the 15th Annual Meeting of the Southeast Asian Linguistics Society 2005, 169–176. Canberra: Pacific Linguistics, Research School of Pacific and Asian Studies, The Australian National University.
Baradat, R. 1941a. Les Samrê ou Pear, population primitive de l’Ouest du Cambodge. BEFEO 4(1). 1–150.
Baradat, R. 1941b. Les dialectes des tribus samre. Manuscript. Paris: l’Ecole Francaise d’Extreme-Orient. 267pp.
Benedict, Paul K. 1990. How to tell Lai: An exercise in classification. Linguistics of the Tibeto-Burman Area 13(2). 1–26.
Benjamin, Geoffrey. 1976. Austroasiatic subgroupings and prehistory in the Malay Peninsula. In Philip N. Jenner, Laurence C. Thompson & Stanley Starosta (eds.), Austroasiatic studies (Oceanic Linguistics Special Publications 13), 37–128. Honolulu: University of Hawaii Press.
Blagden, Charles Otto. 1906. Pagan races of the Malay Peninsula vol. 2, edited by W. W. Skeat & C. O. Blagden. London: Macmillan.
Chazée, Laurent. 1999. The peoples of Laos: Rural and ethnic diversities. Bangkok: White Lotus.
Cheeseman, Nathan, Paul Sidwell & Anne Osborne. 2017. Khmuic linguistic bibliography with selected annotations. Journal of the Southeast Asian Linguistics Society 10(1). i–xlvi.
Choo, Marcus. 2010. Katuic bibliography with selected annotations. Chiang Mai, Thailand: Survey Unit, Linguistics Institute, Payap University.
Diffloth, Gérard & Norman Zide. 1992. Austro-Asiatic languages. In W. Bright (ed.), International encyclopedia of linguistics 1, 137–142. New York: Oxford University Press.
Diffloth, Gérard. 1974. Austro-Asiatic languages. In Encyclopaedia Britannica, Macropaedia 2, 15th edn., 480–484. Chicago, London, Toronto & Geneva: Encyclopaedia Britannica Inc.
Diffloth, Gérard. 1977. Mon-Khmer initial palatals and “substratumized” Austro-Thai. Mon-Khmer Studies 6. 39–57.
Diffloth, Gérard. 1979. Aslian languages and Southeast Asian prehistory. Federation Museums Journal 24. 3–16.
Diffloth, Gérard. 1980a. The Wa languages. (Linguistics of the Tibeto-Burman Area 5[2]). Berkeley: University of California.
Diffloth, Gérard. 1980b. To taboo everything at all times. Proceedings of the Berkeley Linguistic Society 6. 157–165.
Diffloth, Gérard. 1984. The Dvāravatī-Old Mon Language and Nyah Kur (Monic Language Studies). Bangkok: Chulalongkorn University Printing House.


Diffloth, Gérard. 1991a. Palaungic vowels in Mon-Khmer perspective. In J. H. C. S. Davison (ed.), Austroasiatic languages, essays in honour of H. L. Shorto. London: SOAS, University of London.
Diffloth, Gérard. 1991b. Vietnamese as a Mon-Khmer language. In M. S. Ratliff & E. Schiller (eds.), Papers from the First Annual Meeting of the Southeast Asian Linguistics Society, 125–139. Tempe, AZ: Arizona State University, Program for Southeast Asian Studies.
Diffloth, Gérard. 2005. The contribution of linguistic palaeontology to the homeland of Austroasiatic. In L. Sagart, R. Blench & A. Sanchez-Mazas (eds.), The peopling of East Asia: Putting together archaeology, linguistics and genetics, 79–82. London & New York: Routledge/Curzon.
Diffloth, Gérard. 2008. Proto-Aslian diphthongs and historical parallels in other Austroasiatic languages. Paper presented at the 18th Meeting of the Southeast Asian Linguistic Society, Universiti Kebangsaan Malaysia, Bangi, Selangor, 22 May.
Diffloth, Gérard. 2009. More on Dvaravati Old Mon. Paper presented at the Fourth International Conference on Austroasiatic Linguistics. Mahidol University, Salaya. (Austroasiatic classification reproduced in Van Driem 2012.)
Dunn, Michael, Niclas Burenhult, Nicole Kruspe, Sylvia Tufvesson & Neele Becker. 2011. Aslian linguistic prehistory: A case study in computational phylogenetics. Diachronica 28(3). 291–323.
Eberhard, David M., Gary F. Simons & Charles D. Fennig (eds.). 2019. Ethnologue: Languages of the world, 22nd edn. Dallas: SIL International. https://www.ethnologue.com.
Edmondson, Jerold A. & Kenneth Gregerson. 1996. Bolyu tone in Vietic perspective. Mon-Khmer Studies 26. 117–133.
Ferlus, Michel. 1974. Les langues du groupe austroasiatiques-nord. Asie du Sud-Est et Monde Insulindien 5(1). 39–68.
Ferlus, Michel. 1979. Lexique thavung-français. Cahiers de Linguistique, Asie Orientale 5. 71–94.
Ferlus, Michel. 1983. Essai de phonétique historique de môn. Mon-Khmer Studies 12. 1–90.
Ferlus, Michel. 1992. Essai de phonétique historique du khmer (Du milieu du premier millénaire de notre ère à l’époque actuelle). Mon-Khmer Studies 21. 57–89.
Ferlus, Michel. 2011. Toward Proto-Pearic: Problems and historical implications. In Sophana Srichampa, Paul Sidwell & Kenneth Gregerson (eds.), Austroasiatic studies: Papers from ICAAL4. Mon-Khmer Studies Journal Special Issue No. 3, part 1, 38–51. Dallas: SIL International; Salaya: Mahidol University; Canberra: Pacific Linguistics.
Gao Yongqi [高永奇]. 2003. A study of Mang [莽语硏究]. Beijing: Ethnic Publishing House [民族出版社].
Gregerson, Kenneth J., Kenneth D. Smith & David D. Thomas. 1976. The place of Bahnar within Bahnaric. In Philip N. Jenner, Laurence C. Thompson & Stanley Starosta (eds.), Austroasiatic studies, part I (Oceanic Linguistics Special Publication 13), 371–406. Honolulu: University of Hawaii Press.
Headley, Robert K., Jr. 1976. Some considerations on the classification of Khmer. In Philip Jenner, Laurence Thompson & Stanley Starosta (eds.), Austroasiatic studies, 431–452. Honolulu: The University Press of Hawaii.
Headley, Robert K., Jr. 1985. Proto-Pearic and the classification of Pearic. In Suriya Ratanakul, David Thomas & Suwilai Premsrirat (eds.), Southeast Asian linguistic studies presented to André-G. Haudricourt, 428–478. Bangkok: Mahidol University.
Huffman, Franklin E. 1985. The phonology of Chong, a Mon-Khmer language of Thailand. In Suriya Ratanakul, David Thomas & Suwilai Premsrirat (eds.), Southeast Asian linguistic studies presented to André-G. Haudricourt, 355–388. Bangkok: Mahidol University.
Jenny, Mathias & Patrick McCormick. 2014. Old Mon. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 519–552. Leiden & Boston: Brill.
Jenny, Mathias & Paul Sidwell (eds.). 2014. The handbook of Austroasiatic languages, 2 vols. Leiden & Boston: Brill.




Kruspe, Nicole, Niclas Burenhult & Ewelina Wnuk. 2014. Northern Aslian. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 419–474. Leiden: Brill.
Lebar, Frank M., Gerald C. Hickey & John K. Musgrave. 1964. Ethnic groups of mainland Southeast Asia. New Haven, CT: Human Relations Area Files Press.
Li Jinfang & Luo Yongxian. 2014. Bugan. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages. Leiden: Brill.
Li Jinfang. 1996. Bugan – a new Mon-Khmer language of Yunnan Province, China. Mon-Khmer Studies 26. 135–160.
Li Yunbing [李云兵]. 2005. A study of Bugeng [Bugan] [布赓语研究]. Beijing: Ethnic Publishing House [民族出版社].
Liang Min [梁敏]. 1984. A sketch of Bolyu [俫语概况]. Beijing: Minzu Yuwen 4.
Luang-Thongkum, Theraphan. 1984. Nyah Kur (Chao Bon) – Thai – English dictionary. Bangkok: Chulalongkorn University Printing House.
Martin, Marie A. 1974a. Remarques générales sur les dialectes pear. Asie du Sud-Est et Monde Insulindien 5(1). 25–37.
Maspero, Henri. 1912. Etude sur le phonétique de le langue annamite. Les initials. Bulletin de l’Ecole Française d’Extrême Orient 12(1). 1–127.
Migliazza, Brian. 1992. Lexicostatistic analysis of some Katuic languages. In Proceedings of the Third International Symposium on Language and Linguistics. Volume III, 1320–1325. Bangkok: Chulalongkorn University Printing House.
Miller, John & Carolyn Miller. 1996. Lexical comparison of Katuic Mon-Khmer languages with special focus on So-Bru groups in northeast Thailand. Mon-Khmer Studies 26. 255–290.
Mitani, Yasuyuki. 1977. Palaung dialects: A preliminary comparison. Tônan Ajia Kentyû (South East Asian Studies) 15(2). 193–212.
Mitani, Yasuyuki. 1978. Problems in the classification of Palaungic. Paper presented at 2nd International Conference on Austroasiatic Linguistics, 19–21 December 1978. Mysore, India.
Mitani, Yasuyuki. 1979. Vowel correspondences between Riang and Palaung. Studies in Thai and Mon-Khmer phonetics and phonology in honour of Eugénie J. A. Henderson, 142–150. Bangkok: Chulalongkorn University Press.
Nagaraja, Keralapura S. 2010. Austroasiatic languages – An introduction. In Keralapura S. Nagaraja & Kashyap Mankodi (eds.), Austro-Asiatic linguistics: In memory of R. Elangaiyan, 1–32. Mysore: Central Institute of Indian Languages.
Nguyễn Văn Lợi, Nguyễn Hữu Hoành & Tạ Văn Thông. 2009. Tiếng Mảng. Hanoi: Nhà xuất bản Khoa học Xã hội.
Parkin, Robert. 1991. A guide to Austroasiatic speakers and their languages (Oceanic Linguistics Special Publications 23). Honolulu: University of Hawaii Press.
Paulsen, Debbie. 1989–1990. A phonological reconstruction of Proto-Plang. Mon-Khmer Studies 18/19. 160–222.
Peiros, Ilia. 1996. Katuic comparative dictionary. Canberra: Pacific Linguistics.
Peiros, Ilia. 1998. Comparative linguistics in Southeast Asia, Series C-142. Canberra: Pacific Linguistics & Canberra Australian National University.
Peiros, Ilia. 2004. Geneticeskaja klassifikacija avstroaziatskix jazykov. Moskva: Rossijskij gosudarstvennyj gumanitarnyj universitet doctoral dissertation.
Phillips, Timothy C. 2012. Proto-Aslian: Towards an understanding of its historical linguistic systems, principles and processes. Bangi: Institut Alam Dan Tamadun Melayu Universiti Kebangsaan Malaysia PhD thesis.
Pinnow, Heinz-Jürgen. 1963. The position of the Munda languages within the Austroasiatic language family. In Harry L. Shorto (ed.), Linguistic comparison in Southeast Asia and the Pacific, 140–152. London: SOAS.


Ploykaew, Pornsawan. 2001. Samre grammar. Salaya, Thailand: Institute of Languages and Culture for Rural Development, Mahidol University PhD thesis.
Premsrirat, Suwilai. 2002. Thesaurus of Khmu dialects in Southeast Asia. Salaya, Thailand: Institute of Language and Culture for Rural Development, Mahidol University.
Rischel, Jørgen. 2007. Mlabri and Mon-Khmer (Historisk-filosofiske Meddelelser 99). Copenhagen: Historisk-filosofiske Meddelelser.
Schmidt, Wilhelm. 1901. Die Sprachen der Sakai und Semang auf Malacca und ihr Verhältnis zu den Mon-Khmer-Sprachen. Bijdragen tot de Taal-, Land-, en Volkenkunde van Nederlandsch-Indië 52. 399–583.
Schmidt, Wilhelm. 1904. Grundzüge einer Lautlehre der Khasi-Sprache in ihren Beziehungen zu derjenigen der Mon-Khmer-Sprachen. Mit einem Anhang: die Palaung-Wa-, und Riang-Sprachen des mittleren Salwin. Abhandlungen der Bayerischen Akademie der Wissenschaft 22(12/3). 677–810.
Sebeok, Thomas A. 1942. An examination of the Austro-Asiatic language family. Language 18. 206–217.
Shafer, Robert. 1952. Études sur l’Austroasian. Bulletin de la Société de Linguistique de Paris 48. 111–158.
Shorto, Harry L. 2006. A Mon-Khmer comparative dictionary. Canberra: Pacific Linguistics.
Sidwell, Paul. 2002. Genetic classification of the Bahnaric languages: A comprehensive review. Mon-Khmer Studies 32. 1–24.
Sidwell, Paul. 2005. The Katuic languages: Classification, reconstruction and comparative lexicon. Munich: Lincom Europa.
Sidwell, Paul. 2009a. Classifying the Austroasiatic languages: History and state of the art. Munich: Lincom Europa.
Sidwell, Paul. 2009b. How many branches in a tree? Cua and East (North) Bahnaric. In Bethwyn Evans (ed.), Discovering history through language. Papers in honour of Malcolm Ross, 193–204. Canberra: Pacific Linguistics.
Sidwell, Paul. 2010. The Austroasiatic central riverine hypothesis. Вопросы языкового родства/ Journal of Language Relationship 4. 117–134.
Sidwell, Paul. 2014a. Austroasiatic classification. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 144–220. Leiden & Boston: Brill.
Sidwell, Paul. 2014b. Khmuic classification and homeland. Mon-Khmer Studies 43(1). 47–56.
Sidwell, Paul. 2015a. Austroasiatic dataset for phylogenetic analysis: 2015 version. Mon-Khmer Studies 44. ixviii–ccclvii.
Sidwell, Paul. 2015b. The Palaungic languages: Classification, reconstruction and comparative lexicon. Munich: Lincom Europa.
Sidwell, Paul. 2018. Austroasiatic deep chronology and the problem of cultural lexicon. Paper presented at the 28th Annual Meeting of the Southeast Asian Linguistics Society. Kaohsiung, Taiwan. https://drive.google.com/file/d/1b_vqZuDTnR9VkcpgAiJZveQ4nlvbxN0D (last accessed 16 December 2020).
Sidwell, Paul & Mark Alves. In preparation. A phylogenetic analysis of Vietic.
Skeat, Walter William & Charles Otto Blagden. 1906. Pagan races of the Malay Peninsula, two vols. London: Franck Cass & Co. [Reprint 1966, New York: Barnes & Noble].
Smith, Kenneth D. 1973. Eastern North Bahnaric: Cua and Kotua. Mon-Khmer Studies 4. 113–118.
Smith, Kenneth D. 1981. A lexico-statistical study of 45 Mon-Khmer languages. In Andre Gonzalez & David Thomas (eds.), Linguistics across continents, 180–205. Manila: SIL.
Thomas, David. 1979. The place of Alak, Tampuon, and West Bahnaric. Mon-Khmer Studies 8. 171–186.




Thomas, David & Robert K. Headley, Jr. 1970. More on Mon-Khmer subgroupings. Lingua 25. 398–418.
Thomas, David. 1966. Mon-Khmer subgroupings in Vietnam. In N. H. Zide (ed.), Studies in comparative Austroasiatic linguistics, 194–202. The Hague: Mouton.
Thongkham, Noppawan. 2003. The phonology of Kasong at Khlong Saeng Village, Danchumphon SubDistrict, Bo Rai District, Trat Province. Salaya, Thailand: Mahidol University MA thesis.
Trần Trí Dõi. 2018. Về vấn đề nguồn gốc của tiếng Việt [On the question of the origin of Vietnamese]. In Đinh Văn Đức (ed.), Tiếng Việt Lịch Sử: Một Tham Chiều Hồi Quan [Vietnamese language history: A reflective reference], 13–86. Hanoi: Nhà Xuất Bản Văn Học.
Van Driem, George. 2001. Languages of the Himalayas: An ethnolinguistic handbook of the Greater Himalayan Region: Containing an introduction to the symbiotic theory of language. Leiden: Brill.
Van Driem, George. 2012. The ethnolinguistic identity of the domesticators of Asian rice. Comptes Rendus Palevol 11(2/3). 117–132.
Vương, Hoàng Tuyên. 1963. Các dân tộc nguồn gốc Nam Á ở miền bắc Việt Nam [The ethnic groups of northern Vietnam]. Hanoi: Nhà Xuất Bản Giáo Dục.
Wayland, Ratree & Allard Jongman. 2001. Chanthaburi Khmer vowels: Phonetic and phonemic analyses. Mon-Khmer Studies 31. 65–82.
Wu, Zili. 1992. Guangnan Ben’ganyu Chutan [An initial investigation of Bengan]. Kunming: Yunnan Minzu Yuwen 4.

Scott DeLancey

12 Classifying Trans-Himalayan (Sino-Tibetan) languages 12.1 Introduction The Trans-Himalayan (TH) or Sino-Tibetan family consists of several hundred languages spread from the Pacific to the mountains of northwest India and northeast Pakistan. Uncertainty about a total number is primarily due to the difficulty of deciding what to count as “languages” among many hundreds of named varieties, although there is also a continuous trickle of reports of previously unnoticed languages. There is broad agreement about the membership of the family, but subclassification is still unsettled. In this chapter I will try to summarize the current situation in TH classification. It is not possible to present a single answer to the question of how many languages there are in the family, as this depends on what we count. If we count languages at the scale of German (including Swiss German and Dutch) and Italian (all language varieties in Italy, including Sicilian), there might be two hundred or so; at a level in which we recognize several languages within the political boundaries of Italy or Germany-Netherlands-Austria-Switzerland, there are several times that many. At a level at which we distinguish Tuscan and Neapolitan, or English and Scots, there must be over a thousand TH languages. There are also local differences in how speakers count languages. For this among other reasons, there is no solidly-established classification of the family at any but the very lowest levels. Van Driem (2014, inter alia) presents a picture of about 40 clades, ranging in size and depth from single languages to small families of a dozen or two languages. Even after a century of research, very little can be said about mid- and higher-level subclassification of the family that can be supported according to the usual standards of comparative linguistics. 
Every grouping that is not inspectionally obvious is controversial, and none have yet been established on the basis of demonstrated shared innovations, other than shared lexical items. The Sinitic clade is distinguished from the rest of the family by substantial lexical and dramatic morphosyntactic shifts, resulting from extensive contact with languages of the Mainland Southeast Asian (MSEA) type (DeLancey 2013). In the past this has been taken as implying a primary split between a Sinitic and a Tibeto-Burman branch. However, there is no evidence that the non-Sinitic languages in the family share a common ancestor which is not also ancestral to Sinitic. In cladistic terms Tibeto-Burman is a paraphyletic group, not a genealogical unit. The label Sino-Tibetan explicitly reflects the bifurcated model; the newer term Trans-Himalayan (TH), suggested by van Driem (2014), avoids the implication of a binary structure. Nevertheless the term “Tibeto-Burman”, although there is no genetic unit for it to refer to, remains useful as a cover term for the non-Sinitic TH languages, which share a number of typological morphosyntactic features.

https://doi.org/10.1515/9783110558142-012
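The cladistic distinction drawn above, between a genealogical unit and a paraphyletic group, can be illustrated with a minimal sketch. The tree below is a toy with an arbitrary, hypothetical shape (it is not a proposed classification, and the position of Sinitic in it is chosen only for illustration); the function names are likewise assumptions made for this example. A group counts as a clade only if some node dominates all and only its members.

```python
# Toy illustration only: the tree shape is hypothetical, not a proposed classification.
# A tree is a nested tuple; leaves are strings.
TOY_TREE = (("Sinitic", "Bodish"), ("Burmese-Ngwi", ("Jinghpaw", "Bodo-Garo")))


def leaves(node):
    """Return the set of leaf names dominated by a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some node dominates all and only the members of `group`."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# "Everything except Sinitic" is not a clade in this toy tree ...
non_sinitic = leaves(TOY_TREE) - {"Sinitic"}
print(is_monophyletic(TOY_TREE, non_sinitic))                 # False

# ... whereas a pair that shares an exclusive ancestor node is.
print(is_monophyletic(TOY_TREE, {"Jinghpaw", "Bodo-Garo"}))   # True
```

In any tree where Sinitic is not the first branch to split off, the complement of Sinitic fails this test, which is all that "paraphyletic" means here.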

12.2 Overview of Trans-Himalayan classification Trans-Himalayan languages include over 40 low-level, inspectionally obvious or formally demonstrated clades. About half of these are single languages or very small, cohesive dialect chains not self-evidently classifiable at any higher level. The largest and presumably deepest demonstrable clades – Sinitic, Burmese-Ngwi, and Bodo-Garo – each have a time depth which we can estimate on independent historical grounds at 2–2.5 millennia. There are numerous suggestions in the literature for other groupings at this depth, and for higher-level classifications. Some of these are widely accepted and probably true, but to date all basically rest on unsystematic lexical correspondences.

12.2.1 Obstacles to classification There are many reasons for the chaotic state of Trans-Himalayan classification. The most obvious is the scant or nonexistent documentation of many critical languages, and, conversely, the initial description over the past two decades of many hitherto quite unknown languages, which has thrown a huge classification task upon a field which was already overwhelmed. Another is the dramatic effects of inter- and extra-familial contact, including contact-induced simplification which has left many clades stripped of almost all inherited morphology (DeLancey 2014), and substantial lexical mixing, which obscures the only evidence left once morphology is gone. More fundamentally, for many subgroups and/or geographical areas the available evidence does not support a neat tree model. We may very well be looking at the product of chain structures, subject over the last 2–3 millennia to considerable ethnic mixture, migration, and consolidation of larger-scale political structures. The field needs to explore alternatives to purely dendritic classification, e.g. the “linkage” model (François 2014). It may be that this is true even at the highest levels.1 Certainly at the present state of knowledge any tree representation of the family must be taken as a suggestive hypothesis, not as a summary of current knowledge.

1 Benedict (1972) famously presented a radial representation rather than a tree to show the relationships of the subgroups, with Jinghpaw at the center.




12.2.2 Past and present classification The most influential tradition has its roots in Sten Konow’s classification in the Linguistic Survey of India (Grierson 1909: 11) where we first encounter issues like the centrality of Jinghpaw (see Benedict 1972: 4–6), the internal structure of Kuki-Chin (see Shafer 1950), and the difficulty of arranging the lower-level groups in a tree structure (cf. van Driem 2014). Data from the LSI formed a major part of the lexical database assembled by Robert Shafer and Paul Benedict in the Berkeley Sino-Tibetan Project in the 1930s, which formed the basis for the classifications and reconstructions in Shafer (1966), Benedict (1972), and Matisoff (2003), and the Sino-Tibetan Etymological Dictionary and Thesaurus (STEDT). This means that the early classification which continues to influence most later and current work is based on haphazardly transcribed lexical data from a limited and geographically skewed set of languages. Subsequent attempts at comprehensive classification include Shafer (1966), Benedict (1972), Sun (1988), Matisoff (1996, 2003), Bradley (2002, 2012, forthcoming), van Driem (2001), Thurgood (2017). These differ in terms of their willingness to recognize speculative higher-order groupings: Thurgood’s classification is very conservative, while Bradley’s is very much a “lumper” classification with a place for almost everything. The most agnostic view is the “fallen leaves” presentation of Van Driem (2014: 19), who simply lists 42 (almost)2 indisputable clades with no attempt at higher-level grouping. (This representation is widely misunderstood as a competing model of the classification of the family, but van Driem is explicit that it is intended simply as a realistic representation of our current state of knowledge.) Work on classification in China has concentrated on languages of China, and neglects those spoken only in South or Southeast Asia. 
With the exception of van Driem, all of the authors mentioned above accept the bifurcated Sino-Tibetan model. Bradley then divides the TB languages into three major branches, Western, Central, and Eastern. Within each of these branches, member clades tend to share typological features and sporadic lexical connections of a sort which can easily be interpreted in either genealogical or areal terms (or both). To date nothing has been published in the way of shared phonological or morphological innovations, or irregular shared retentions, which would constitute compelling evidence for the cladistic status of any of the three. Still each proposal is supported by some evidence, and the three-branch model is perfectly plausible. (The only higher-level grouping proposals in the literature which are inconsistent with this model are Shafer’s grouping of Rgyalrongic with Bodic, which derives from a failure to distinguish cognates from borrowings in shared vocabulary, and LaPolla’s [2013] grouping of Rgyalrong with Kiranti, Nungish, and West Himalayan, which derives from a failure to distinguish retention from innovation in shared morphology.)

2 In fact not even all of van Driem’s basic units are universally recognized; see in particular the discussion of Kiranti in van Driem (2014: § 3.2).


12.2.3 Prospects The recent fashion for experimentation with statistical methods based on biological phylogenetics has been applied to Trans-Himalayan languages in two recent studies. Zhang et al. apply Bayesian phylogenetic analysis to “949 binary-coded lexical root-meanings for 109 languages” (Zhang et al. 2019: 112) – that is, all 949 reconstructed roots found in STEDT for the 100 meanings in the basic Swadesh list. This is lexicostatistics with a better statistics package, with a head start from previous work (STEDT). It is hard to see how the inherent problems of genealogical classification by lexicostatistics are eliminated by a better statistics package. Since Zhang et al. are classifying based on shared vocabulary, and using STEDT data, one would expect a result compatible with the Benedict-Matisoff model, and for the most part that is what they get. They find a basic Sino-TB split, and their classification of Tibeto-Burman is very close to Matisoff’s. They find support for the Eastern branch described in this chapter (minus Karen, which they attach to Kuki-Naga), but none for our Central or Western. Their tree also has a few unexpected associations which will be noted where relevant. The authors correctly note that the tree model is not adequate for sorting out lexical data given the long history of language contact in the family – which is to say that the results we have here are no improvement over what we had before. Sagart et al. likewise apply Bayesian phylogenetic methods, but construct their own lexical set, “a lexical database of 180 basic vocabulary concepts from 50 languages” (Sagart et al. 2019: 10318) strongly biased toward items which they can try to correlate with archaeological evidence for domestication of various plant and animal species. So again this is a sort of lexicostatistics, but explicitly biased toward a category of lexical items which are often borrowed along with their referents.
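To make concrete what “binary-coded lexical root-meanings” amounts to, the following is a minimal sketch under stated assumptions. The language names, root labels, and data are invented placeholders, and the function names are hypothetical; this is not the pipeline of either study, and the Bayesian inference step itself is not shown. Each (meaning, root) pair becomes one character, and a language is coded 1 if it attests a reflex of that root in that meaning.

```python
# Hypothetical toy data: language names and root labels are placeholders.
# Maps (meaning, root) -> set of languages attesting a reflex of that root in that meaning.
cognate_sets = {
    ("dog", "root_A"): {"Lang1", "Lang2", "Lang3"},
    ("dog", "root_B"): {"Lang4"},
    ("water", "root_C"): {"Lang1", "Lang4"},
    ("water", "root_D"): {"Lang2", "Lang3"},
}
languages = ["Lang1", "Lang2", "Lang3", "Lang4"]


def binary_matrix(cognate_sets, languages):
    """Code each language as a row of 0/1 values, one per (meaning, root) character."""
    characters = sorted(cognate_sets)
    matrix = {
        lang: [1 if lang in cognate_sets[char] else 0 for char in characters]
        for lang in languages
    }
    return characters, matrix


def shared_fraction(row_a, row_b):
    """Of the characters present in either language, the fraction present in both."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return both / either if either else 0.0


characters, matrix = binary_matrix(cognate_sets, languages)
print(matrix["Lang1"])                                    # [1, 0, 1, 0]
print(shared_fraction(matrix["Lang2"], matrix["Lang3"]))  # 1.0
```

Nothing in such a matrix records whether a shared 1 reflects common inheritance or borrowing, which is the limitation of lexically based classification raised in this section.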
Sagart et al.’s midlevel results are mostly conventional, except that they revive a long-discarded idea of a special relationship between Rgyalrong and Tibetan. The higher-order connections produced by their model are less conventional, including the very surprising idea of a special connection between Sinitic and Bodo-Garo, implying a basic split in the family of Sino-Sal versus everything else.

I am sympathetic to Bradley’s tripartite division of the TB languages, but I see Sinitic as simply one more branch of TH. So I will present the languages in four sections: Western (12.3), Eastern (12.4), Central (12.5), and Sinitic (12.6). Within each section I will list the languages and their locations, followed by a part summarizing different contemporary opinions about their classification. It is important to emphasize that the listed, bolded clades in the first part of each section are the highest-level groupings that can be stated with certainty at our present state of knowledge. The genealogical unity of most of these is self-evident; where there is any formal reconstruction it will be cited. The reader is warned that none of the higher-level groupings discussed in the second part of each section are currently supported by anything more than a list of strongly resemblant lexical items not attested elsewhere in the family, and some by rather less than that. So if I call a proposal “plausible” that is a positive evaluation, and “likely” means I consider it the strongest hypothesis.

For a variety of reasons, many Tibeto-Burman languages and groups are known in the literature by two or more different names. I will not attempt to recite all of the names and spellings here, but will include in brackets other names or spellings commonly found in the literature, e.g. Jinghpaw [Jingpho, ‘Kachin’]. By placing an alternate name within brackets also in scare quotes, e.g. [‘Kachin’], I mean to suggest that for one reason or another – it is obsolete, inaccurate, inappropriate, or is better used for a different purpose – it should not be used to refer to this language or group. Quote marks “ ” around a group or branch name mentioned in the text simply indicate that I am discussing the suggestion or usage of a particular author rather than a widely accepted label, and are not intended as judgmental.

12.3 The Western languages The languages spoken along the Himalayas from Bhutan to Himachal Pradesh and Ladakh have long been regarded as constituting a major branch of the family, which is recognized in most classification schemes, as Shafer’s (1966) “Bodic”, Matisoff’s (2003) “Himalayish”, van Driem’s (2001) “Northwestern”, and Bradley’s (2002) “Western”. But to date this grouping is based only on lexical similarities in the data in the LSI, so it can only be a provisional hypothesis. These languages are not represented in Southeast Asia, and I will not go into great detail here; for more detailed discussion see Genetti (2016). The uncontroversial Himalayan clades are (roughly from east to west):
Bodish: Tibetic [Tibetan, Bodic, Yarlungic], the languages referred to as “dialects” of Tibetan (Tournadre 2014), and East Bodish, a half-dozen languages of Bhutan and neighboring areas of Tibet and India (Hyslop 2014).
Tshangla [‘Central Monpa’]: A dialect chain centered in eastern Bhutan. The place of Tshangla among the Western languages, and even whether it belongs with the others at all, is unclear. (Monpa, from a Tibetan word for regions to the south of Tibet proper, occurs in several language names of Arunachal; it has no cladistic significance, and no connection to Austroasiatic Mon of Southeast Asia.)
‘Ole [Black Mountain], Gongduk, Lhokpu: Three languages of Bhutan; their relation to one another and to anything else is still unclear (Gerber and Grollmann 2018a).
Dhimalish: Dhimal in southeastern Nepal, and Toto in West Bengal; van Driem (2001) suggests a relationship to Bodo-Garo, with little evidence; see Gerber and Grollmann (2018a).
Eastern Kiranti: Bantawa, Limbu, and other languages of eastern Nepal and Sikkim.
Western Kiranti: Thulung, Hayu, Khaling, and other languages of east-central Nepal.
Newaric: Newar, in the Kathmandu Valley, with several quite distinct varieties, plus Baram and Thangmi.


Tamangic [Tamang-Gurung-Thakali-Manang]: A half-dozen languages of central-western Nepal.
Central Himalayan [‘Magaric’]: Chepang, Magaric (Kham and Magar), and Dura in west-central Nepal.
Raji-Raute: At least four languages of western Nepal and far eastern Uttarakhand, spoken by still nomadic groups.
West Himalayan: Two dozen languages of Uttarakhand and Himachal Pradesh, including the extinct Zhang-zhung language.

No genealogical unit including any two or more of these clades has yet been demonstrated. All current models recognize a Kiranti unit, subsuming my East and West Kiranti headings, but Gerber and Grollmann (2018b) show that this still needs to be proven. Neither Newaric nor Central Himalayan is proven, although both seem very likely (Turin 2004; Schorer 2016). Bodish and Tamangic are typologically very similar, and sometimes thought to be closely related, while Benedict links Bodish and West Himalayan in a “Tibeto-Kinnauri” subbranch, but neither hypothesis can be considered established.

12.4 The Eastern languages The languages of the putative Eastern group are distributed through the upper Yangtse and Mekong drainages, in Southwest China and northeastern Myanmar, with some recent spread into parts of Thailand, Laos, and Vietnam. Karen is the only Trans-Himalayan clade spoken entirely within the political boundaries of Southeast Asia. The largest number of Burmese-Ngwi languages and speakers are in Yunnan, but languages of the branch are also spoken in Myanmar, Thailand, Laos and Vietnam.

12.4.1 The low-level Eastern clades Roughly from north to south the low-level Eastern clades are:
Rgyalrongic: Perhaps a dozen languages of western Sichuan. One subgrouping (J. Sun 2000) is:
   Rgyalrong/Jiarong: Situ, Japhug, etc.
   Kroskyabs [‘Lavrung’]
   Stau [‘Horpa’, Ergong]: Stau/Rtau, Dgebshes/Geshizha, etc.
Some morphological reconstruction by J. Sun (2004, 2019).




Qiangic: Rma [Qiang], Prinmi/Pumi, Minyag/Muya, also Tangut [Xixia], the extinct language of the Tangut or Xixia Empire. Evans (2004) reconstructs the verbal morphology for Rma proper.
Bai: Two or three languages of northwestern Yunnan; see Wang (2005).
Tujia: Two languages of northwestern Hunan.
Ersuic: Three languages of western Sichuan, reconstructed by Yu (2012).
Naic: Several varieties of Naxi, Laze, Na [Mosuo] spoken in the Sichuan-Yunnan border area; see Lidz (2010), Guillaume and Michaud (2011) for some work on internal classification.
Burmese-Ngwi [Lolo-Burmese, Burmese-Lolo, Yi-Mian]: Several dozen languages centered around the Mekong drainage in Yunnan, but spread from easternmost Arunachal to northwestern Vietnam. There are two subbranches: Burmish, in Myanmar and Yunnan, and Ngwi [Yi, Nisoic, Loloish] in Guizhou, Yunnan, southeastern Tibetan Autonomous Region, easternmost Arunachal Pradesh, and in the northern mountains of Vietnam, Laos, Thailand and Myanmar. Several Ngwi languages are spoken in Thailand, including Akha, Lahu, Lisu, Phunoi. A few – Bisu, Mpi, and Ugong [Gong] – are spoken by small villages of settled farmers. Ngwi languages in Laos include Akeu, Akha, Hani, Kaduo, Khongsat, Laoseng, Lahu, Phunoi, Phusang, Sila, and Yi. The phonology has been substantially reconstructed by James Matisoff in a long series of papers, see Matisoff (2003); Bradley (1979) and Lama (2012) deal specifically with Ngwi.
Karen: Karen languages are spoken in the lower Salween drainage in Kayah and Kayin States in Myanmar, and across the border in Thailand. There are 16 named varieties, in three major groups: Sgaw (including Kayan and Karenni), Pwo and Pa’o. Manson (2011) is a survey of the group; Luangthongkhum (2019) presents a reconstruction.

12.4.2 Classification With the exception of Karen, Bai, and Tujia, all the languages in 12.4.1 form a north-south chain in terms of phonological typology, word and syllable structure, verbal morphology, and lexicon. This undoubtedly to some extent reflects areal effects (this is Matisoff’s “Sinosphere”), but most scholars accept a Macro-Qiangic group including Rgyalrongic, Qiangic, Ersuic, and debatably Naic, which is variously considered Qiangic or Burmese-Ngwi. Bai and Tujia show significant resemblances to Sinitic, and in both cases opinion is divided as to whether this shows some kind of transitional status or is the result of intense Sinitic influence on a Tibeto-Burman language.

The issues of particular interest to Southeast Asianists are the further affiliations of Burmese-Ngwi and Karen. A popular proposal unites Macro-Qiangic, Naic, and Burmese-Ngwi (Sun 1988, 2001; Li 1998; Jacques 2014; Sagart et al. 2019; Zhang et al. 2019). This is Bradley’s Eastern branch, which he divides into Northeastern, including Macro-Qiangic, Naic, Bai and Tujia, and Southeastern, consisting of Burmese-Yi and perhaps Karen. Matisoff links Lolo-Burmese with Naic, but is skeptical of any special connection of this group with Qiangic. The Karen languages are sufficiently divergent from the rest of the family that Benedict treated them as a distinct branch coordinate with “Tibeto-Burman” proper, but no modern scholar endorses this idea. Most scholars treat Karen as not particularly affiliated with any other group, but van Driem (2001) and Bradley (2002) note lexical connections between Karen and Burmese-Ngwi and suggest a Southeastern branch consisting of these two clades.

12.5 The Central languages In this section we deal with the numerous languages of the Irrawaddy-Chindwin and Brahmaputra-Siang drainages, spoken in northern Myanmar, Northeast India, and adjoining districts in China. These include the languages of Arunachal Pradesh and adjacent parts of China, for which we have only recently begun to have serious documentation. While the Western and Eastern branches have been generally recognized for some time, the idea that all or most of the languages in between might constitute a genealogical unit is relatively new and not widely accepted; indeed, they have been treated in the literature as highly divergent. However, as one of the proponents of the idea I will adopt it for this presentation; this will be discussed further in 12.5.5. The Central branch, as I define it, includes a great number of languages spoken only or predominantly in SEA; moreover there is reason to think that some of the languages spoken further west in NE India may in fact have SEA geographical origins (Post 2015). Whether or not the Central languages are in fact a clade, they constitute an areal group (cp. STEDT’s North East India Areal Group) with strikingly SEA defining features, e.g. sesquisyllabicity and verb serialization.

12.5.1 Languages of the upper Salween and Irrawaddy drainage Since, again, all mid- and higher-level groupings must be regarded as hypothetical, I will present the established clades according to geographical rather than genealogical affinities, and then consider proposals for higher-order structure.




Jinghpaw-Asakian: This clade consists of Jinghpaw [‘Kachin’] in northern Myanmar, western Yunnan, and upper Assam, and the small Asakian [Sak, Luish] group – Kantu [Kadu, Sak] in Sagaing, Sak in Rakhine, Cak in Chittagong, and Andro, Chairel, and Sengmai, three extinct languages of Manipur. Huziwara (2014) is a reconstruction of Proto-Asakian. Luce suggests that Asakian languages “at one time spread over the whole north of Burma, from Manipur perhaps to northern Yünnan” (Luce 1985: 36). The validity of this clade is demonstrated by Matisoff (2013).
Pyu: Inscriptions in apparently Tibeto-Burman languages from several upper and middle Irrawaddy Valley urban centers of the 1st millennium CE. Bradley (2002) suggests that these may be Asakian; this is consistent with Luce’s suggestion, but in the current state of Pyu decipherment (Griffiths et al. 2017) still premature.
Nungish [Nung, Nungic]: An extensive dialect chain or nascent linkage spoken in the N’Mai/upper Irrawaddy and Nujiang/upper Salween drainages in Myanmar and China, generally presented as three languages, Rawang, Anong, and Trung [Dulong, Tarong].
Mru-Hkongso: Two little-known languages, Hkongso in southern Chin State and the long-mysterious Mru of Chittagong, are now shown to be related, but the place of this clade in the family is unknown (Peterson 2017).

12.5.2 Languages of the Patkai Range and the Brahmaputra Valley The valley region which is the contemporary state of Assam has been ruled by one or more city-states for at least two millennia, and thus has relatively little linguistic diversity compared to the surrounding hills. Assam has many minority communities of migrants from the hills, but the Bodo-Garo group seems to have formed and diverged entirely in the valley. Straddling the mountains which separate the Irrawaddy-Chindwin and Brahmaputra-Siang drainages is a set of small, low-level clades, all of which except Northern Naga probably belong to a single Kuki-Naga subgroup (see 12.5.4):
Bodo-Garo [Bodo-Koch, Baric, Barish]: A dozen languages spoken in and around the Brahmaputra Valley, including Bodo [Boro] with over a million speakers in Assam. Burling (2012) suggests a subclassification, but it is not clear that the group has a tree-like structure. Joseph and Burling (2006) and Debnath (2014) give phonological/lexical reconstructions.
Northern Naga [‘Konyak’]: Perhaps dozens of languages of Sagaing, Arunachal and Assam, in two subbranches. The first includes Chang, Konyak, Phom, and probably Wancho, the second Nocte, Tangsa and Tutsa, all umbrella ethnic designations which include speakers of very distinct linguistic varieties (Morey 2019). French (1983) is a phonological/lexical reconstruction.
Ao [Central Naga]: Very closely related Ao, Lotha, Sangtam, Yimchungrü in Nagaland, and other undescribed varieties or languages. Bruhn (2014) is a phonological/lexical reconstruction.


Angami-Pochuri: Very closely related Ntenyi, Pochuri, Rengma and Sumi, in Nagaland, are the Pochuri group; Angami includes Angami, Chokri, Kheza, Mao [Sopvoma], and Poula in Nagaland, Manipur and Sagaing.
Zeme [Zeliangrong]: A dialect chain with named varieties Liangmai, Rongmei, Zeme, spoken in and around Tamenglong District, Manipur.
Meitei [Manipuri]: The language of the Imphal Valley of Manipur, and the dominant language and lingua franca of Manipur.
Tangkhulic: Tangkhul (in many varieties, some not mutually intelligible, mostly undescribed) in Ukhrul and Kamjong Districts of Manipur and in Sagaing, and Maring and Uipo [‘Khoibu’] in Chandel District, Manipur. Mortensen (2003) is a phonological/lexical reconstruction of Tangkhul.
South Central [Kuki-Chin]: Dozens of varieties spoken throughout Chin State, Mizoram, Manipur, Chittagong, and in adjacent states in Myanmar and India. Van Bik (2009) is an extensive phonological and lexical reconstruction. The clade divides into several subbranches, including at least (roughly from north to south):
   Northwestern [‘Old Kuki’]: Two dozen or more varieties spoken in the Manipur River drainage in southern Manipur, and the Barak River valley in Assam and Tripura. The Manipur languages include Aimol, Anal, Chiru, Chhothe, Kharam, Koireng, Kom, Lamkang, Monsang, Moyon, Purum, Sorte, Sorbung, Tarao, and others; the Barak Valley group includes Biate, Chorei, Hallam, Hmar, Hrangkhol, Ranglong, Saihriem, Sakachep, and others.
   Peripheral: See Peterson (2017) for justification for this grouping.
   Northeastern [Northern Chin]: A dozen or more varieties of Chin State and southern Manipur; Thado is more widely spoken. Languages include Gangte, Paite, Ralte, Sizang (Siyin), Tedim (Tiddim), Thado, Zomi, and others.
   Southern [Southern Chin]: Asho, Daai, Hyow, K’cho, Khyang, Paletwa, and more; Peterson distinguishes Khumi as a distinct Khomic subbranch.
   Central: Languages of Chin State, Mizoram, and Chittagong, including Bawm, Lai, Mizo (Lushai, Lushei), Pangkhua, Zahao, and more. The Maraic subgroup includes Mara (Lakher), Zotung, and Senthang in Mizoram and Chin State.
Karbi: A relatively undifferentiated language with nearly a million speakers in and near Karbi Anglong district in Assam.

12.5.3 Languages of the Eastern Himalayas The languages spoken along the Eastern Himalayas, in Arunachal Pradesh and adjacent districts of China, are not yet very well-known, and we are at a very preliminary stage in classifying them. Roughly from east to west, the clades which I am assigning to the Central branch are:




Kaman-Meyor [Mijuish (Mishmi)]: K’man [Kaman, Geman, ‘Miju’] and Meyor [Dza, Zaiwa, ‘Zakhring’] on both sides of the line of control. This is controversial, but I have seen unpublished morphological data which will eventually confirm it.
Kera’a-Tawrã [Idu-Taruang, Digarish, ‘Mishmi’]: Kera’a [Idu/Yidu, Idu Mishmi, ‘Chulikata’] and Taruang [Taraon/Tawrã, Digaru, Digaru Mishmi], perhaps a dialect chain rather than distinct languages, spoken just west of Kaman-Meyor in the Dibang valley in Arunachal Pradesh.
Milang: A language spoken in the upper Siang drainage.
Tani [Adi-Mising-Nishi, ‘Abor-Miri-Dafla’]: Adi, Apatani, Bangni, Bokar, Galo, Mising, Nyishi, Tagin, and others. Spoken in central and eastern Arunachal Pradesh and adjacent districts in Assam and the Tibetan Autonomous Region. Reconstructed (J. Sun 1993).
Koro [Koro Aka]: A language of East Kameng district, once mistakenly considered a variety of Hruso.
Hrusish: Hruso [Aka], Miji-Bangru.
Kho-Bwa [Bugun-Khowa, Bugunish, Kamengic]: Bugun [Khowa], Puroik [‘Sulung’], Sherdukpen, Sartang [Butpa Monpa], spoken in western Arunachal. Preliminary reconstruction in Lieberherr and Bodt (2017).

12.5.4 Mid-level clades A number of linguists have proposed affiliations among various sets of the languages listed in the preceding sections, usually with little explicit basis. In this section I will summarize proposals which are currently in play.

Jinghpaw-Asakian and Sal The classification of Jinghpaw, spoken from western Yunnan across northern Myanmar into Assam, has always been problematic. Benedict (1972) placed it at the center of a star-like representation of family relationships; since his classification is based on lexical similarities, this might simply reflect the fact that Hanson (1906) provides more extensive lexical coverage than was available for most other languages. Matisoff (2013) confirms the long-held belief that Jinghpaw’s closest connection is to Asakian. At the next higher level there is broad, if not ungrudging, agreement on the validity of a Sal [Bodo-Konyak-Jinghpaw] branch including Jinghpaw, Northern Naga, and Bodo-Garo. Burling (1983) presents some lexical evidence for this grouping (see also Matisoff 2013); there is also morphological evidence linking Northern Naga and Jinghpaw (DeLancey 2015).


Kuki-Naga The idea of a Naga clade, comprising Ao, Angami-Pochuri, Zeme, and Tangkhulic, but none of their neighbors, is generally out of favor. Conservative opinion (Post and Burling 2017; Thurgood 2017) sees only dubious evidence for any higher-order affiliation for any of these. But the idea of a Kuki-Naga [Kuki-Chin-Naga] branch, including these four plus Meitei, South Central, and probably also Karbi, was self-evident to Shafer (1950), who regarded South Central and Tangkhulic as manifestly a low-level linkage (see also Mortensen and Keogh 2011), and is assumed in Benedict (1972), Matisoff (2003), and Bradley’s work, based on a number of lexical forms found only among these languages. Skeptics note that few of these shared lexical items are found in all, or even most, of the languages. Nevertheless a Kuki-Naga clade, including Karbi, seems very likely.

Nungish Nungish has been linked by one or another author with every neighboring or nearby language group, generally without much evidence. It shows similarities to Jinghpaw which lead to them being regularly grouped together (H. Sun 1988; Matisoff 1996), but these may be explainable by contact (Matisoff 2013). Zhang et al. group Nungish with Kiranti and Central Himalayan, i.e. as a Western language, while for Sagart et al. it is the outgroup in a clade otherwise comprising Burmo-Qiangic and Tibetic.

Grouping the Eastern Himalayan languages The LSI lumped these languages together in a “North Assam Group” as a matter of convenience. Recent documentation and comparison, some not yet published, provide lexical support for various pairings and groupings of these languages, but their position within any higher branch is unclear. Sun Hongkai (1988), who did some of the earliest modern work on the Eastern Himalayan languages, links Jinghpaw, Nung, Meyor-Kaman, Kera’a-Tawrã, Tani, and Kho-Bwa in a Jingpho branch; in terms of the Central hypothesis (12.5.5) this would make them all Central. Blench and Post (2014) critique several proposals, and suggest that some or all of these may not even be Trans-Himalayan, but this is not an idea that needs to be taken seriously. The position of Milang and Koro remains a subject of discussion; Post and Modi (2011) note strong connections between Milang and Tani, and suggest a Macro-Tani clade with Milang as the first offshoot. Unpublished suggestions by Blench, Post, and Modi link Milang with Koro in a “Siangic” clade, with possible links to Tani and Kera’a-Tawrã.




12.5.5 The Central hypothesis Several of the Central languages made an outsized contribution to earlier work in classification. Benedict (1972) bases his reconstruction on five key languages – Written Tibetan and Written Burmese, representing the Western and Eastern branches, and Jinghpaw, Garo, and Mizo, a South Central language. That means that the reconstruction scheme was built on the assumption that Garo, Jinghpaw, and Mizo represent three different branches of the family. If in fact all three belong to one major branch, this will be difficult to discern in terms of a reconstruction scheme which assumes otherwise.

Several scholars have suggested various mid-level groupings. Matisoff (2003) suggests the possibility of a closer relationship between South Central and Bodo-Garo; Burling and others see Bodo-Garo as close to Jinghpaw; the Central hypothesis is basically that both of these are correct. The central element to the hypothesis of a major Central clade is evidence linking Sal and Kuki-Naga. Strong typological similarities, in particular a predilection for sesquisyllabic noun and adjective stems, and stray lexical connections between Jinghpaw and South Central have long been noted, but not presented as evidence for relationship. But DeLancey (2015) presents thin but strong morphological correspondences linking Jinghpaw, Northern Naga, Kaman-Meyor, and South Central. A demonstrable relationship between Jinghpaw and South Central necessarily implies a relationship between their next-higher nodes, so if Sal and Kuki-Naga are both valid, then they are related, and we have a substantial Central branch including all the languages of the Chindwin-Irrawaddy and Brahmaputra drainages.

The assignment of other groups to this branch is speculative. Sun Hongkai (1988) links Jinghpaw, Nung, Meyor-Kaman, Kera’a-Tawrã, Tani, and Kho-Bwa in a Jingpho branch, and Bradley likewise provisionally places these in the Central branch. These clades share the typical Central sesquisyllabic pattern, but that need not be shared inheritance.

12.6 Sinitic Sinitic consists of 7–10 languages or low-level clades. All scholars recognize Mandarin, Wu [Shanghai], Gan, Xiang, Min, Yue [Cantonese], and Kejia [Hakka]. Sagart (2011) and Bradley (forthcoming) recognize a distinct Southwest language consisting of Waxiang in Hunan and perhaps Caijia in Guizhou. Many other scholars, particularly in China, recognize the distinctness of some or all of three more languages, Jin in Shanxi, Huizhou in Anhui, and Pinghua in Guangxi and Hunan. Min, Yue and Mandarin are each themselves highly differentiated into many mutually unintelligible varieties.

Sinitic shares considerable basic vocabulary with various Tibeto-Burman languages, but also has strong lexical, phonological, and syntactic similarities to Kradai and Hmong-Mien. The basic morphosyntactic profile of Sinitic is the isolating SVO type characteristic of mainland Southeast Asia rather than the agglutinating SOV structure characteristic of Tibeto-Burman (but see Bisang 2006). The southernmost Sinitic varieties are even closer to the MSEA prototype (de Sousa 2015). This may reflect an origin for Proto-Sinitic through adoption of a Tibeto-Burman language by a population speaking either an early form of Kradai or Hmong-Mien, or some kind of already creolized lingua franca involving elements of those groups (DeLancey 2013).

Although there is no evidence for a Tibeto-Burman clade, Sinitic could well have been the first branch to diverge from the parent stock. Hypothetically it also could belong in some other branch, and several authors have suggested connections on the basis of specific lexical resemblances: Sino-Bodic (van Driem 1997), Sino-Kiranti (Starostin 1994), or Sagart et al.’s (2019) suggested connection between Sinitic and Bodo-Garo. All of these are based on sparse lexical connections.

References

Benedict, Paul. 1972. Sino-Tibetan: A Conspectus. Cambridge: Cambridge University Press.
Bisang, Walter. 2006. Southeast Asia as a linguistic area. In Keith Brown (ed.), Encyclopedia of Languages & Linguistics, vol. 11, 2nd edn., 587–595. Oxford: Elsevier.
Blench, Roger & Mark Post. 2014. Re-thinking Sino-Tibetan phylogeny from the perspective of North East Indian languages. In Nathan Hill & Thomas Owen-Smith (eds.), Trans-Himalayan linguistics: Historical and descriptive linguistics of the Himalayan area, 71–104. Berlin: Mouton de Gruyter.
Bradley, David. 1979. Proto-Loloish. London: Curzon.
Bradley, David. 1997. Tibeto-Burman languages and classification. In David Bradley (ed.), Tibeto-Burman languages of the Himalayas, 1–72. Canberra: Australian National University.
Bradley, David. 2002. The subgrouping of Tibeto-Burman. In Christopher Beckwith (ed.), Medieval Tibeto-Burman languages, 73–112. Leiden: Brill.
Bradley, David. 2012. Tibeto-Burman languages of China. In Rint Sybesma (ed.), Encyclopedia of Chinese languages and linguistics. Leiden: Brill. http://dx.doi.org/10.1163/2210-7363_ecll_COM_00000419 (last accessed 16 December 2020).
Bradley, David. Forthcoming. Sino-Tibetan. In Routledge atlas of the world’s languages.
Bradley, David. 2018. Subgrouping of the Sino-Tibetan languages. Presented at the 10th International Conference on Evolutionary Linguistics, Nanjing University, 27–28 October 2018.
Bruhn, Daniel. 2014. A phonological reconstruction of Proto-Central Naga. Berkeley: University of California dissertation.
Burling, Robbins. 1983. The Sal languages. Linguistics of the Tibeto-Burman Area 7(2). 1–32.
Burling, Robbins. 2003. The Tibeto-Burman languages of Northeastern India. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 169–191. London: Routledge.
Burling, Robbins. 2012. The Stammbaum of Bodo-Garo. In Gwendolyn Hyslop, Stephen Morey & Mark Post (eds.), North East Indian Linguistics 4, 21–35. Delhi: Foundation.
de Sousa, Hilário. 2015. The far southern Sinitic languages as part of Mainland Southeast Asia. In N. J. Enfield & Bernard Comrie (eds.), Languages of Mainland Southeast Asia: The state of the art (Pacific Linguistics 649), 356–439. Berlin: Mouton de Gruyter.
Debnath, Rupak. 2014. A reconstruction of Proto-Barish. Delhi: Akansha.




DeLancey, Scott. 2013. The origins of Sinitic. In Zhou Jing-Schmidt (ed.), Increased empiricism: Recent advances in Chinese linguistics (Studies in Chinese Language and Discourse 2), 73–99. Amsterdam: John Benjamins.
DeLancey, Scott. 2015. Morphological evidence for a Central branch of Trans-Himalayan (Sino-Tibetan). Cahiers de Linguistique – Asie Orientale 44(2). 122–149.
Driem, George van. 1997. Sino-Bodic. Bulletin of the School of Oriental and African Studies 60(3). 455–488.
Driem, George van. 2001. Languages of the Himalayas. Leiden: Brill.
Driem, George van. 2014. Trans-Himalayan. In Nathan Hill & Thomas Owen-Smith (eds.), Trans-Himalayan linguistics: Historical and descriptive linguistics of the Himalayan area, 11–40. Berlin: Mouton de Gruyter.
Evans, Jonathan. 2004. The reconstruction of Proto-Qiang verb inflection. In Ying-chin Lin et al. (eds.), Studies on Sino-Tibetan languages: Papers in honor of Professor Hwang-Cherng Gong on his seventieth birthday (Language and Linguistics Monograph Series W-4), 201–238. Taipei: Institute of Linguistics, Academia Sinica.
François, Alexandre. 2014. Trees, waves and linkages: Models of language diversification. In Claire Bowern & Bethwyn Evans (eds.), The Routledge handbook of historical linguistics, 161–189. Oxford: Routledge.
French, Walter. 1983. Northern Naga: A Tibeto-Burman mesolanguage. New York: City University of New York dissertation.
Genetti, Carol. 2016. The Tibeto-Burman languages of South Asia. In Hans Hock & Elena Bashir (eds.), The languages and linguistics of South Asia, 130–155. Berlin: Mouton de Gruyter.
Gerber, Pascal. 2019. Gongduk agreement morphology in functional and diachronic perspective. Presented at the International Society of Bhutan Studies Inaugural Conference, Magdalen College, Oxford, 10 January 2019.
Gerber, Pascal & Selin Grollmann. 2018a. Linguistic evidence for a closer relationship between Lhokpu and Dhimal. Cahiers de Linguistique – Asie Orientale 47(1). 1–96.
Gerber, Pascal & Selin Grollmann. 2018b. What is Kiranti? A critical account. Bulletin of Chinese Linguistics 11. 99–152.
Grierson, George (ed.). 1903–1909. Linguistic survey of India, vol. 3: Tibeto-Burman family. Calcutta: Office of the Superintendent of Government Printing [Reprint 1967: Delhi: Motilal Banarsidass].
Griffiths, Arlo, Bob Hudson, Marc Miyake & Julian Wheatley. 2017. Studies in Pyu epigraphy, I: State of the field, edition and analysis of the Kan Wet Khaung Mound inscription, and inventory of the corpus. Bulletin de l’École Française d’Extrême-Orient 103. 43–205.
Huziwara, Keisuke 藤原敬介. 2014. ルイ祖語の再考 Ruisogo no saikō [Proto-Luish reconsidered]. Kyoto University Linguistic Research 33. 1–32.
Huziwara, Keisuke. 2016. Cak-English-Bangla dictionary. Dhaka: AH Development.
Hyslop, Gwendolyn. 2014. A preliminary reconstruction of East Bodish. In Nathan Hill & Thomas Owen-Smith (eds.), Trans-Himalayan linguistics: Historical and descriptive linguistics of the Himalayan area, 155–179. Berlin: Mouton de Gruyter.
Jacques, Guillaume. 2014. Esquisse de phonologie et de morphologie historique du tangoute. Leiden: Brill.
Jacques, Guillaume & Alexis Michaud. 2011. Approaching the historical phonology of three highly eroded Sino-Tibetan languages: Naxi, Na and Laze. Diachronica 28. 468–498.
Joseph, U. V. & Robbins Burling. 2006. The comparative phonology of the Boro Garo languages. Mysore: Central Institute for Indian Languages.
Kato, Takashi. 2008. Linguistic survey of Tibeto-Burman languages in Lao P.D.R. Tokyo: Institute for Languages and Cultures of Asia and Africa.


Konnerth, Linda. 2017. Karbi. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 2nd edn., 302–321. London: Routledge.
Konnerth, Linda. 2018. The historical phonology of Monsang (Northwestern South-Central/“Kuki-Chin”): A case of reduction in phonological complexity. Himalayan Linguistics 17(1). 19–49.
Konow, Sten. 1902. Zur Kenntnis der Kuki-Chinsprachen. Zeitschrift der Deutschen Morgenländischen Gesellschaft 56. 486–517.
Kurabe, Keita. 2016. A grammar of Jinghpaw, from northern Burma. Kyoto: Kyoto University dissertation.
Lama, Ziwo Qiu-Fuyuan. 2012. Subgrouping of Nisoic (Yi) languages: A study from the perspectives of shared innovation and phylogenetic estimation. Arlington, TX: University of Texas at Arlington dissertation.
LaPolla, Randy. 2013. Subgrouping in Tibeto-Burman. In Balthasar Bickel, Lenore Grenoble, David Peterson & Alan Timberlake (eds.), Language typology and historical contingency, 463–474. Amsterdam: Benjamins.
Lieberherr, Ismael. 2015. A progress report on the historical phonology and affiliation of Puroik. In Linda Konnerth et al. (eds.), North East Indian linguistics 7, 235–286. Canberra: Asia Pacific Linguistics.
Lieberherr, Ismael & Timotheus Bodt. 2017. Subgrouping Kho-Bwa based on shared core vocabulary. Himalayan Linguistics 16(2). 26–63.
Li, Yongsui 李永燧. 1998. Qiang-Mian qun chuyi 羌缅语群刍议 [On Qiang-Burmese group]. Minzu Yuwen 1. 16–28.
Lidz, Liberty A. 2010. A descriptive grammar of Yongning Na (Mosuo). Austin, TX: University of Texas PhD dissertation.
Luangthongkhum, Theraphan. 2019. A view on Proto-Karen phonology and lexicon. Journal of the Southeast Asian Linguistics Society 12(1). xxx.
Luce, Gordon. 1985. Phases of Pre-Pagán Burma. Oxford: Oxford University Press.
Macario, Florens. 2015. The genetic position of Apatani within Tibeto-Burman. In Linda Konnerth et al. (eds.), North East Indian linguistics 7, 213–233. Canberra: Asia Pacific Linguistics.
Manson, Ken. 2011. The subgrouping of Karen. Presented at the 21st Meeting of the Southeast Asian Linguistics Society. http://jseals.org/seals21/manson11subgroupingd.pdf (accessed 27 September 2019).
Matisoff, James. 1996. Languages and dialects of Tibeto-Burman (STEDT Monograph # 2). Berkeley: Sino-Tibetan Etymological Dictionary and Thesaurus Project.
Matisoff, James. 2003. Handbook of Tibeto-Burman: System and philosophy of Sino-Tibetan reconstruction. Berkeley: University of California Press.
Matisoff, James. 2013. Re-examining the genealogical position of Jingpho: Putting flesh on the bones of the Jingpho/Luish relationship. Linguistics of the Tibeto-Burman Area 36(2). 15–95.
Michaud, Alexis, Limin He & Yaoping Zhong. 2017. Nàxī 納西 language / Naish languages. Encyclopedia of Chinese language and linguistics 3, 144–157. Leiden: Brill.
Morey, Stephen. 2019. The Nocte-Tangsa languages: An introduction. Himalayan Linguistics 18(1). 134–140.
Mortensen, David. 2003. ms. Comparative Tangkhul.
Mortensen, David & Jennifer Keogh. 2011. Sorbung, an undescribed language of Manipur: Its phonology and place in Tibeto-Burman. Journal of the Southeast Asian Linguistics Society 4(1). 62–114.
Peiros, Ilia. 1998. Comparative linguistics in Southeast Asia (Pacific Linguistics Series C-142). Canberra: Australian National University.
Peterson, David. 2017. On Kuki-Chin subgrouping. In Picus Ding & Jamin Pelkey (eds.), Sociohistorical linguistics in Southeast Asia, 189–209. Leiden: Brill.




Post, Mark. 2015. Morphosyntactic reconstruction in an areal-historical context – A pre-historical relationship between North East India and Mainland Southeast Asia? In N. J. Enfield & Bernard Comrie (eds.), Languages of Mainland Southeast Asia: The state of the art, 205–261. Canberra: Australian National University.
Post, Mark & Yankee Modi. 2011. Language contact and the genetic position of Milang in Tibeto-Burman. Anthropological Linguistics 53(3). 215–369.
Post, Mark & Robbins Burling. 2017. The Tibeto-Burman languages of Northeast India. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 2nd edn., 213–242. London: Routledge.
Sagart, Laurent. 2006. Review of Matisoff, handbook of Proto-Tibeto-Burman: System and philosophy of Sino-Tibetan reconstruction. Diachronica 22(1). 206–223.
Sagart, Laurent, Guillaume Jacques, Yunfan Lai, Robin Ryder, Valentin Thouzeau, Simon Greenhill & Johann-Mattis List. 2019. Dated language phylogenies shed light on the ancestry of Sino-Tibetan. Proceedings of the National Academy of Sciences 116(21). 10317–10322.
Schorer, Nicolas. 2016. The Dura language: Grammar and phylogeny. Leiden: Brill.
Shafer, Robert. 1950. The Naga branches of Kukish. Rocznik Orientalistyczny 16. 467–530.
Shafer, Robert. 1966. Introduction to Sino-Tibetan. Wiesbaden: Otto Harrassowitz.
Singh, Chungkham Yashawanta. 2002. The impact of historical events on Manipuri language. Indian Linguistics 63. 77–85.
Starostin, Sergei. 1994. The reconstruction of Proto-Kiranti. Presented at the 27ème Congrès International sur les Langues et la Linguistique Sino-Tibétaines. Centre International d’Études Pédagogiques à Sèvres, 14 October 1994.
STEDT. Sino-Tibetan etymological dictionary and thesaurus. https://stedt.berkeley.edu/ (accessed 5 December 2020).
Sūn, Hóngkāi 孙宏开. 1988. Shilun woguo jingnei Zang-Mianyude puxi fenlei 试论我国境内藏缅语的谱系分类 [A classification of Tibeto-Burman languages in China]. In Tatsuo Nishida & Paul Kazuhisa Eguchi (eds.), Languages and history in East Asia: Festschrift for Tatsuo Nishida on the occasion of his 60th birthday, 61–73. Kyoto: Shokado.
Sūn, Hóngkāi. 1990. Languages of the Ethnic Corridor in western Sichuan. Linguistics of the Tibeto-Burman Area 13(1). 1–31.
Sūn, Hóngkāi 孙宏开. 2001. Lun Zang-Mian yuzu zhong de Qiang yuzhi yuyan 论藏缅语族中的羌语支语言 [On the Qiangic branch of the Tibeto-Burman language family]. Language and Linguistics 2(1). 157–181.
Sun, Jackson Tian-Shin. 1993. A historical-comparative study of the Tani [Mirish] branch in Tibeto-Burman. Berkeley: University of California dissertation.
Sun, Jackson Tian-Shin. 2000. Stem alternations in Puxi verb inflection: Toward validating the rGyalrongic subgroup in Qiangic. Language and Linguistics 1(2). 211–232.
Thurgood, Graham. 2017. Sino-Tibetan: Genealogical and areal subgroups. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 2nd edn., 1–39. London: Routledge.
Turin, Mark. 2004. Newar-Thangmi lexical correspondences and the linguistic classification of Thangmi. Journal of Asian and African Studies 68. 97–120.
Van Bik, Kenneth. 2009. Proto-Kuki-Chin: A reconstructed ancestor of the Kuki-Chin languages. Berkeley: STEDT.
Wang, Feng. 2005. On the genetic position of the Bai language. Cahiers de Linguistique – Asie Orientale 34(1). 101–127.
Yu, Dominic. 2012. Proto-Ersuic. Berkeley: University of California PhD dissertation.
Zhang, Menghan, Shi Yan, Wuyun Pan & Li Jin. 2019. Phylogenetic evidence for Sino-Tibetan origin in northern China in the Late Neolithic. Nature 569. 112–115.

Peter Norquest

13 Classification of (Tai-)Kadai/Kra-Dai languages

13.1 Introduction

The term Kadai was coined by Paul Benedict in 1940 to group three languages spoken in southern China and Vietnam: Gelao, Laqua (Qabiao) and Lachi, not previously recognized as related. Based on certain core vocabulary items, Benedict also connected these languages to the Hlai languages of Hainan island. He then went on to relate Kadai to Thai and Indonesian in his famous 1942 paper, “Thai, Kadai and Indonesian: A new alignment in Southeast Asia”. Before the publication of this article, it was taken for granted that Tai was a member of the Sino-Tibetan stock. The traditional view of a Sino-Tai relationship was based on the fact that Tai and Chinese share many typological similarities, both in phonology and grammar, along with a substantial number of lexical items (Luo 2008).

Nearly contemporaneously, Fang Kuei Li had been conducting fieldwork on the Sui and Mak languages, which he characterized in Li (1943: 2–3) as related to Tai before it had split into daughter languages. In fact, in 1943 he was already using terms such as “Kam-Sui group” and “Tai group” to describe the two divisions. The basic bifurcation of this linguistic stock, which Li called Kam-Tai, was confirmed in his classic article on Tai and Kam-Sui (Li 1965), in which he demonstrated that the Kam-Sui languages were definitely related to but distinct from the Tai languages. However, Haudricourt (1967) later argued against the unified nature of these Kadai languages, stating that they should be analyzed as independent primary branches of the parent stock (Edmondson and Solnit 1988).

Note that, while the term Kra-Dai is the one currently favored for the entire phylum, it is problematic in the current context due to its similarity to the term Kra-Tai, which is proposed below as a mid-level node in the phylogenetic tree. On the other hand, the traditional term Kadai is also problematic, due to its historical use as a referent to a subgroup (see discussion in Diller et al. 2008: 23). The name of the phylum will therefore be abbreviated in the following to KD, unless specifically quoting another author’s preferred term.

https://doi.org/10.1515/9783110558142-013

13.2 History of classification

Haudricourt’s analysis was initially adopted in both China and the West. In China, the term Kam-Tai was adopted for the entire phylum, and a flat structure was employed with no internal hierarchy:

Kam-Tai
– Gelao
– Lachi
– Laha
– Hlai
– Kam-Sui
– Tai

Outside of China, however, a unified Kam-Tai node was assumed:

Kadai
– Gelao
– Lachi
– Laha
– Hlai
– Kam-Tai
  – Kam-Sui
  – Tai

A major breakthrough in classification occurred when Liang (1990) identified the languages of Benedict’s Kadai branch (excluding Hlai) as a unified group, which he placed under the Chinese label Geyang (from Ge- in Gelao + -yang in Buyang) and separated from Hlai. Recently documented languages such as Buyang and En were also placed in this group, improving on the previous flat-structured phylogenies which had included these smaller groups as primary branches of KD. A decade later, Ostapirat (2000) definitively demonstrated that the majority of the original Kadai languages form a unified genetic node, which he dubbed Kra, the common reconstructed ethnonym of these groups, which is cognate with the word in Tai for ‘mountain person’ *kraːʔ (Hudak 2008: 193). Ostapirat also renamed the KD phylum Kra-Dai, and this is now the most common name in the current literature, replacing Tai-Kadai, which had been the default name of the phylum up until that point. Meanwhile, other smaller groups including Biao, Lakkja and (Ong-)Be have been integrated over time as members of the KD phylum.

Debate about the phylogeny of KD has since involved primarily the subgrouping of the primary families within the overall tree on the one hand, and the hierarchical relation of subgroups within these families on the other. Liang and Zhang (1996) provided the following classification (using the Chinese term Dong-Tai), which includes a unified Kam-Tai group in which Biao-Lakkja is subsumed under Kam-Sui and Be is posited as a sister of Tai:




Dong-Tai
– Geyang (Kra)
– Hlai
– Kam-Tai
  – Kam-Sui (incl. Biao-Lakkja)
  – Be-Tai
    – Be
    – Tai

This general schema was followed by Edmondson and Solnit (1997: 2), but with Biao-Lakkja split off as a sister of Kam-Sui:

Kadai
– Geyang (Kra)
– Hlai
– Kam-Tai
  – Kam-Sui
    – Biao-Lakkja
    – Kam-Sui
  – Be-Tai
    – Be
    – Tai

Diller (2008: 7) joined Be with Lakkja under the Kam-Tai node:

Tai-Kadai
– Kra
– Hlai
– Kam-Tai
  – Lakkja-Be
    – Lakkja
    – Be
  – Kam-Sui
  – Tai

Chamberlain (2016: 38) also maintained a Kam-Tai node but did not place Lakkja; he also moved the Northern Tai language Saek upwards to join Be in a Be-Saek group:


Kra-Tai
– Kra
– Hlai
– Kam-Tai
  – Kam-Sui
  – Be-Tai
    – Be-Saek
    – Tai

Ostapirat (2005: 108) proposed an original bifurcation between Northern and Southern groups, making an important association between Hlai and Tai in the latter group, although associating Kra with Kam-Sui in the former. This basic schema was augmented in Norquest (2007) by associating the Biao-Lakkja group with Kam-Sui and the Be group with Tai:

Kra-Dai
– Northern
  – Kra
  – Northeastern
    – Biao-Lakkja
    – Kam-Sui
– Southern
  – Hlai
  – Be-Tai
    – Be
    – Tai

Norquest (2015) later revised the above classification to the following, maintaining Ostapirat’s Southern Kra-Dai node as Western Kam-Tai but disassociating the Biao-Lakkja and Kam-Sui nodes from Kra:

Kra-Dai
– Kra
– Eastern Kra-Dai
  – Biao-Lakkja
  – Kam-Tai
    – Kam-Sui
    – Western Kam-Tai
      – Hlai
      – Be-Tai
        – Be
        – Tai




13.3 Classification by position in phylogenetic tree

This section is devoted to a discussion of the present state of KD classification, followed by a survey of each individual node in the tree and the individual KD families and subgroups.

13.3.1 KD

The proposal offered here for the highest-level subgrouping of the phylum is given below:

KD
– Biao-Lakkja
– Kam-Tai
  – Kam-Sui
  – Kra-Tai
    – Kra
    – Hlai-Tai
      – Hlai
      – Be-Tai
        – Be
        – Tai

There are two significant points in the above tree which contrast with the phylogenies given in 13.2 above. The first is that the Biao-Lakkja node has been moved to the top of the tree, and is one of the two main nodes which result from the initial bifurcation in the phylum. The second is that the Kra family has been demoted to the position of sister of the Hlai-Tai branch (Southern Kra-Dai in Ostapirat’s terms). Both of these innovations to the phylogenetic tree are based on shared lexical innovations in core vocabulary, where the Kra family, although exhibiting many internal innovations of its own, nevertheless aligns more with the Hlai-Tai branch than it does with either the Biao-Lakkja group or with Kam-Sui. Likewise, as shown in L-Thongkum (1992), there are a number of isoglosses which are shared by Lakkja on the one hand with Kam-Sui, and on the other with Tai, supporting the hypothesis that Kam-Sui and Tai (by way of Kra-Tai) each inherited different but overlapping items from the Proto-KD lexicon after the break-up of the Kam-Tai group.

Note that there is some extra-linguistic evidence which supports this: the nexus of the KD phylum is rooted geographically towards the mouth of the Pearl River Delta, and then proceeds in a general westward direction with Kam-Sui to the north and the Kra-Tai complex to the south. It may then be hypothesized that the Kra group first split away from this complex by moving further west, while the Hlai, and then the Be, groups split off later to the south (Hainan island), leaving the core Tai group to eventually expand southwestward.
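The sister relations claimed for the proposed KD tree can be checked mechanically by encoding the tree as a nested structure. The following Python sketch is purely illustrative and not part of the proposal itself: the names KD_TREE, lineage, sisters, and common_ancestor are assumptions introduced here, and only the node labels are taken from the tree in 13.3.1.

```python
# Proposed KD tree from 13.3.1, encoded as nested dicts.
# A leaf is a node with an empty dict; the encoding is an
# illustrative assumption, only the labels come from the text.
KD_TREE = {
    "KD": {
        "Biao-Lakkja": {},
        "Kam-Tai": {
            "Kam-Sui": {},
            "Kra-Tai": {
                "Kra": {},
                "Hlai-Tai": {
                    "Hlai": {},
                    "Be-Tai": {"Be": {}, "Tai": {}},
                },
            },
        },
    }
}


def lineage(tree, target, path=()):
    """Return the node names from the root down to target, or None."""
    for name, children in tree.items():
        here = path + (name,)
        if name == target:
            return here
        found = lineage(children, target, here)
        if found is not None:
            return found
    return None


def sisters(tree, target):
    """Return the other daughters of target's mother node."""
    path = lineage(tree, target)
    if path is None or len(path) < 2:
        return []  # unknown node, or the root (which has no sisters)
    node = tree
    for name in path[:-1]:
        node = node[name]  # walk down to the mother node's daughters
    return [name for name in node if name != target]


def common_ancestor(tree, a, b):
    """Return the lowest node dominating both a and b, or None."""
    path_a, path_b = lineage(tree, a), lineage(tree, b)
    if path_a is None or path_b is None:
        return None
    lowest = None
    for x, y in zip(path_a, path_b):
        if x != y:
            break  # the two lineages diverge here
        lowest = x
    return lowest


print(sisters(KD_TREE, "Kra"))                     # ['Hlai-Tai']
print(sisters(KD_TREE, "Biao-Lakkja"))             # ['Kam-Tai']
print(common_ancestor(KD_TREE, "Kam-Sui", "Tai"))  # Kam-Tai
```

Under this encoding, Kra comes out as the sister of Hlai-Tai and Biao-Lakkja as the sister of Kam-Tai, which are the two points on which the tree differs from the earlier phylogenies.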

13.3.2 Biao-Lakkja

Biao [byk] consists of three mutually unintelligible languages spoken primarily in Huaiji County, Guangdong Province, China, in the Shidong, Yonggu, and Dagang townships (Hsiu 2014). An internal structure of the Biao group is offered below, with the southern group exhibiting more internal divergence than the northern:

Biao
– Southern
  – Shidong
  – Yonggu
– Northern
  – Dagang
  – Chang’an

The Lakkja [lbc] language is spoken in the Jinxiu Yao Autonomous County in east-central Guangxi, China. L-Thongkum (1992) considers the varieties of Lakkja to be subdialects of one monolithic dialect rather than to be separate dialects proper; she also considers the Western (Jintian) variety to be more innovative:

Lakkja
– Western
  – Jintian
– Eastern
  – Liula
  – Jinxiu

There is currently no consensus on the classification of either Biao or Lakkja within the KD family. Hsiu (2014) suggests that Biao could either subgroup with Lakkja, or form an independent branch of KD on its own. Biao shares certain common etyma and phonological traits with Lakkja, which Liang (2002) believes forms a subgroup with Kam-Sui. See Table 1 for examples of Biao-Lakkja isoglosses.




Tab. 1: Examples of Biao-Lakkja isoglosses.

Gloss      Biao-Lakkja   Kam-Sui    Kra       Hlai       Be        Tai
‘house’    *ljaːk        *r̥aːn      *qran     *hrɯːn     *raːn     *rɤːn
‘road’     *tsaːŋ        *qʰwən     *qron     *kuːn      *ʃwən     *r̥wɤn
‘heavy’    *N-tsak       *C-dʑan    *qχəl     *kʰɯn      *xən      *n̥ak
‘leg’      *puk          *p-qaː     *C-qaː    *kʰok      *kok      *f-qaː
‘neck’     *ʔən          *ʔdənʔ     *C-joː    *hljoŋʔ    *liəŋX    *ɣoː

Solnit (1988) and Hansell (1988) classify Lakkja as a sister of the Kam-Sui branch. Solnit (1988) also classifies Biao and Lakkja together as part of a Biao-Lakkja branch that is coordinate with Kam-Sui. However, L-Thongkum (1992) considers Lakkja to be most closely related to the Tai branch, based on their total number of shared lexical items: she concludes that Lakkja shares more lexical retentions and innovations with Tai than with Kam-Sui, and is therefore ultimately closer to the former than to the latter. A third possibility, adopted here, is that Kam-Sui and Tai are closer to each other than either is to the Biao-Lakkja branch, which constitutes one of the two highest-order nodes of the KD phylum.

13.3.3 Kam-Tai

As discussed above, the Kam-Tai branch was proposed early in the history of KD studies, although scholars disagree as to which families should be included and excluded. The Kam-Tai grouping, sometimes also called Zhuang-Dong in the Chinese literature, is primarily used in China; following Ostapirat (2005), scholars outside of China now usually do not make use of the Kam-Tai grouping. Kam-Tai by definition always includes minimally the Kam-Sui and Tai families, and historically the Kra and Hlai families have been excluded. Biao and Lakkja are often included as either part of or sisters of the Kam-Sui family, and Be is included most often as a sister of Tai.

Edmondson and Solnit (1997: 8–10) provide the following list of phonological structures which distinguish Kam-Sui from Tai:

1. Kam-Sui lacks the back nonrounded vowels /ɯ ɤ/ so common in Tai.
2. There is a class of forms that originally come from dyadic roots (those that may once have consisted of two syllables) in which Kam-Sui tends generally to weaken the medial consonant to form a cluster with appropriate consequences on tonal development; Tai languages tend by contrast to truncate the first syllable.
3. Kam-Sui has tone splitting that follows the voiced-low principle to the letter. Northern Tai languages generally do as well, but Central and Southwestern Tai evidence a wide range of developments other than voiced-low.
4. Although dead syllables (with final stops) [tone category D] differ from live syllables (with final sonorants) [tone categories A, B, C] in tonal development, the pitch shapes of dead-syllable tones can nearly always be equated with the pitch shapes of some subset of live-syllable tones. In Kam-Sui these equations are generally A = DS(hort) and C [*-ʔ] = DL(ong), whereas in Tai B [*-h] = DL or C = DL are common, and reflexes of proto-tone A seldom enter the picture.
5. There is systematic variation of diphthong and monophthong (the ‘Gedney Puzzle’ phenomenon also found between Northern and Central-Southwestern Tai) between Tai and Kam-Sui, as in fajA2 versus fiA1 ‘fire’.
6. There are a number of diagnostic vocabulary items or tone categories. Some items are cognate but differ in proto-tone. For example:

Gloss     Kam-Sui     Tai
‘pig’     *qʰ-muːh    *m̥uː
‘rat’     *hnɔːʔ      *n̥uː
‘long’    *ʔraːjʔ     *rɯj

Liang and Zhang (1996) classify Kam-Sui, Be, and Tai together as the Kam-Tai branch of their Dong-Tai phylum, due to the large number of lexical items shared by all three branches vis-à-vis the more divergent Kra and Hlai branches:

Kam-Tai
– Kam-Sui (incl. Biao-Lakkja)
– Be-Tai
  – Be
  – Tai

A Kam-Tai group consisting of Kam-Sui and Tai is accepted by Edmondson and Solnit (1988, 1997). They argue that Hlai and Geyang (Kra) left the parent language at an earlier date while Kam-Sui and the Tai branch divided later. They suggest that Hlai may have diverged from the others early and gradually, accounting for it sharing slightly more cognates with Zhuang (Tai) than with Kam (Kam-Sui), whereas the Geyang group separated from Kam-Tai later but preserved many features of the parent language. Hansell (1988) considers Be to be a sister of the Tai branch based on shared vocabulary, and proposes a Be-Tai grouping within Kam-Tai; this hypothesis is incorporated by Edmondson and Solnit:

Kam-Tai
– Biao-Lakkja/Kam-Sui
  – Biao-Lakkja
  – Kam-Sui
– Be-Tai
  – Be
  – Tai

The proposal provided here for Kam-Tai is repeated below. Crucial differences from earlier proposals include the inclusion of Kra as part of a Kra-Tai group, a sister of Kam-Sui, and the inclusion of Hlai as part of a Hlai-Tai group. This version of Kam-Tai therefore includes every KD group except for Biao-Lakkja, of which it is a sister:

Kam-Tai
– Kam-Sui
– Kra-Tai
  – Kra
  – Hlai-Tai
    – Hlai
    – Be-Tai
      – Be
      – Tai

13.3.4 Kam-Sui

The Kam-Sui languages are spoken mainly in eastern Guizhou, western Hunan, and northern Guangxi in southern China, with small pockets found in northern Vietnam and Laos. They include Mulam [mlm], Kam (Northern [doc], Cao Miao [cov], Southern [kmc]), Then (T’en) [tct], Sui [swi], Chadong [cdy], Maonan [mmd], Ai-Cham [aih], and Mak [mkg]. As mentioned above, the Kam-Sui group was first posited in Li (1943), and has remained uncontroversial up to the present, with the exception of the question of whether to treat the Biao-Lakkja group as a sister or to subsume it under Kam-Sui itself. Thurgood (1988) proposed the following Kam-Sui classification:

Kam-Sui
– A
  – Mulam
  – Kam
– B
  – T’en
– C
  – Sui
  – Maonan
  – Mak


A more detailed phylogeny is proposed below. The primary differences from Thurgood’s 1988 proposal above are (i) the disassociation of Mulam from Kam, placing the latter closer to the Macro-Sui group, and (ii) the grouping of Maonan and Mak together under the Para-Sui node, which is coordinate with Sui:

Kam-Sui
– Mulam
– Northern Kam-Sui
  – Kam
  – Macro-Sui
    – T’en
    – Greater Sui
      – Sui
      – Para-Sui
        – Chadong
        – Maonan
        – Ai-Cham/Mak
          – Ai-Cham
          – Mak

13.3.5 Kra-Tai

This node in the KD phylogenetic tree is being proposed here for the first time. Kra is normally considered to be either a primary (one of the two primary branches of Ostapirat’s Proto-Kra) or at most a secondary node of the tree. The decision to place it here as a sister of Hlai-Tai is based primarily on isoglosses which Kra shares with the Hlai-Tai families (particularly Hlai and Be) but lacks with either Biao-Lakkja or Kam-Sui (see Table 2).

Tab. 2: Examples of Kra-Hlai-Tai isoglosses.

Gloss         Biao-Lakkja   Kam-Sui    Kra      Hlai        Be        Tai
‘beard’       *m-luːt       *m-nrut    *mumʔ    *hmɯːmʔ     *mumX     *mumh
‘wetfield’    *raːh         *ʔraːh     *naː     *hnaːɦ      *njaː     *naː
‘crow’        *kaː          *qaː       *ʔak     *ʔaːk       *ʔak      *kaː
‘needle’      *tɕʰəm        *tɕʰəm     *ŋot     *hŋuc       *ŋaːʔ     *qjem
‘mortar’      –             *krˠəm     *ʔdru    *ɾəw        *ɦoːk     *grok




The Kra languages are spoken in southern China (Yunnan, Guangxi) and in northern Vietnam. They include Gelao (including the A’ou [aou], Duoluo [giw], Mulao [giu], Qau [gqu], Green Gelao [gig] and Red Gelao [gir] dialects), Lachi [lbt] (including the possibly extinct White Lachi [lwh]), Laha [lha], Paha [yha], Buyang (including the Langnian [yln], E’ma [yzg], and Yerong [yrn] dialects), Qabiao [laq] and En [enc].

As discussed above, Benedict (1942) used the term Kadai for both the Kra and Hlai languages. Liang (1990), using a classification based largely on cognate percentages (Edmondson and Solnit 1997: 2), excluded Hlai and called the remaining group Geyang. The name Kra was proposed for this group by Ostapirat (2000), and is the term usually used by scholars outside of China, whereas Geyang is the name currently used within China. Ostapirat (2000: 25) proposed the following phylogeny for Kra:

Kra
– Southwestern
  – Western
    – Gelao
    – Lachi
  – Southern
    – Laha
– Central-east
  – Central
    – Paha
  – Eastern
    – Buyang
    – Qabiao
    – En

Hsiu (2014) proposed the phylogeny below, the main difference from Ostapirat’s being the realignment of the Laha language (following Edmondson [2011], who argues for a special link to Paha):

Kra
– Northern
  – Lachi
  – Gelao
    – Red Gelao
      – Vandu
      – A’ou
    – Core Gelao
      – Dongkou Gelao
      – White Gelao (Telue)
      – Central Gelao (Hagei, Qau)
– Southern
  – Guangxi Buyang (Yerong)
  – Yunnan Buyang (Ecun, Langjia, En)
  – Laha, Paha
  – Qabiao

13.3.6 Hlai-Tai

The Hlai-Tai branch was first proposed in Ostapirat (2005) as Southern Kra-Dai, and was followed by Norquest (2015), where it was called Western Kam-Tai. It has been renamed here by analogy with other nodes in the KD tree: Be-Tai, Kra-Tai and Kam-Tai. The Hlai-Tai branch has been proposed based partly on isoglosses shared with Be-Tai (see Table 3).

Tab. 3: Examples of Hlai-Be-Tai isoglosses.

Gloss         Biao-Lakkja   Kam-Sui     Kra       Hlai       Be         Tai
‘tongue’      *m-laː        *maː        *l-maː    *hliːnʔ    *liːn      *linʔ
‘wing’        –             *C-faːh     *ʀwaː     *pʰiːk     *pik       *piːk
‘skin’        –             *ŋʀaː       *taː      *n̥əːŋ      *n̥aŋ       *n̥aŋ
‘to shoot’    –             *pɛŋh       –         *hɲɯː      *ɲəː       *ɲɯː
‘to fly’      *[C-]pənh     *C-pˠənʔ    –         *ɓin       *ʔbjənX    *ʔbil

The Hlai [lic] languages are spoken in central and south-central Hainan island in China. They include Cun [cuq], the speakers of which are ethnically distinct, and Jiamao [jio], an aberrant Kra-Dai language with a Hlai superstratum and a non-Hlai substratum (Thurgood 1992; Norquest 2015). In the initial stages of Chinese scholarship, the Hlai languages were classified into the following groups, with no internal substructure:

Hlai
– Ha (Bouhin, Ha Em, Lauhut)
– Qi (Tongzha, Zandui, Baoting)
– Meifu (Moyfaw)
– Bendi (Baisha, Yuanmen)

Norquest (2015) proposes the following subgrouping of the Hlai languages, dividing the Ha group and adding the Cun, Nadou, and Changjiang languages:

Hlai
– Bouhin
– Greater Hlai
  – Ha Em
  – Central Hlai
    – East Central Hlai
      – Lauhut
      – Qi (Tongzha, Zandui, Baoting)
    – North Central Hlai
      – Northwest Central Hlai
        – Cun
        – Nadou
      – Northeast Central Hlai
        – Meifu (Changjiang, Moyfaw)
        – Run (Baisha, Yuanmen)

13.3.7 Be-Tai

There is no absolute consensus on the precise relationship of Be to other branches within the KD family; however, most KD scholars currently accept Hansell’s (1988) classification. Hansell suggests that Be falls outside of Tai but is closer to it than other branches. However, Be also shows a certain amount of distance from Tai, with certain retentions shared with Kam-Sui; it is therefore close to Tai, but not close enough to be a direct descendant of Proto-Tai, nor is it far enough away to belong to Kam-Sui. The best possible conclusion is therefore that Proto-Be-Tai is a daughter language of Proto-Kam-Tai, and a sister language of Kam-Sui (Hansell 1988: 281–286). This placement of Be with respect to Tai is essentially the same as that implied in Haudricourt (1967). Examples of Be-Tai isoglosses are given in Table 4.

Tab. 4: Examples of Be-Tai isoglosses.

Gloss          Biao-Lakkja   Kam-Sui     Kra      Hlai          Be       Tai
‘bee’          *mlet         *luk        *reː     *kəːj         *ʃaːŋ    *prɯŋʔ
‘vegetable’    –             *ʔmaː       *ʔop     *ɓɯː ʈʂʰəj    *ʃak     *prak
‘red’          –             *hlaːnʔ     –        *hraːnʔ       *r̥iŋ     *C-djeːŋ
‘to bite’      *kat          *klət       *ʈajh    *hŋaːɲʔ       *gap     *ɢɦap
‘to descend’   *lojʔ         *C-ɭuːjh    *caɰʔ    *l̥uːj         *roːŋ    *N-ɭoŋ


(Ong-)Be [onb] is a language spoken on the north-central coast of Hainan island, including the provincial capital Haikou. Ostapirat (1998), analyzing data from Zhang (1992), notes that Be and Jizhao share many lexical similarities and sound correspondences, and that Jizhao may be a remnant Be-related language on the Leizhou peninsula of Guangdong province on the Chinese mainland. Chen (2018) proposes two main branches for Ong-Be:

Ong-Be
– Western
– Eastern

Incorporating Ostapirat’s (1998) proposal to include Jizhao, the above phylogeny can be expanded in the following way:

Ong-Be
– Jizhao
– Hainan Ong-Be
  – Western
  – Eastern

13.3.8 Tai

Tai languages are currently spoken in China, Vietnam, Laos, Cambodia, Thailand, Malaysia, Myanmar, and India (Pittayaporn 2009), and it has been speculated that Proto-Tai was spoken in the area around the Guangxi-Vietnam border (Gedney 1995; Diller 2000). Following the tripartite classification originally proposed by Fang Kuei Li (1960; see below), the Tai languages can be listed as follows:

Northern
  China: Bouyei [pcc], Zhuang (including Central Hongshuihe [zch], Eastern Hongshuihe [zeh], Guibei [zgb], Guibian [zgn], Lianshan [zln], Liujiang [zlj], Liuqian [zlq], Qiubei [zqe], Yongbei [zyb], and Youjiang [zyj])
  Laos: Saek [skb]
  Thailand: Yoy [yoy]

Central
  China: Zhuang (including Dai [zhd], Min [zgm], Nong [zhn], Yang [zyg], Yongnan [zyn], and Zuojiang [zzj])
  Vietnam: Cao Lan [mlc], Nung [nut], Tày [tyz], Ts’ün-Lao [tsl]

Southwestern
  India: Ahom [aho], Aiton [aio], Khamyang [ksu], Phake [phk]
  Myanmar: Khamti [kht], Khün [kkh], Shan [shn], Tai Laing [tjl]
  China: Lü [khb], Pa Di [pdi], Tai Nüa [tdd], Tai Ya [cuu], Tai Hongjin [tiz]
  Laos: Tai Long [thi], Lao [lao]
  Thailand: Nyaw [nyw], Phu Thai [pht], Phuan [phu], Thai [tha] (including Song [soa], Northeastern [tts], Northern [nod], and Southern [sou]), Yong [yno]
  Vietnam: Tai (including Daeng [tyr], Dam [blt], Dón [twh], and Thanh [tmm]), Tày (including Sa Pa [tys] and Tac [tyt]), Thu Lao [tyl]

Unclassified
  India: Nora [nrr]
  Laos: Kang [kyp], Kuan [uan], Tai (including Khang [tnu] and Pao [tpo])
  Vietnam: Tai Yo [tyj]

Li (1960, 1977) divided the Tai languages into three branches: Northern Tai, Central Tai, and Southwestern Tai, based on phonological and lexical criteria. The Northern branch encompasses the Bouyei speakers in China’s Guizhou Province and a large part of the Zhuang speakers on the north bank of the Xi River in Guangxi. The Central branch is made up of dialects spoken from the south bank of the Xi River to the north bank of the Red River in Vietnam, including Lungchow, Tay, Nung and Caolan. The Southwestern branch, the largest of all, comprises the Dai (Tai) of Yunnan, the many varieties of Shan (both in China and in Burma), and Thai and Lao, along with White Tai, Black Tai and Red Tai spoken across the Vietnamese-Lao border areas (Luo 1997: 9). Other scholars who follow this tripartite schema include Liang and Zhang (1996), Edmondson and Solnit (1997) and Diller et al. (2008):

Proto-Tai
– Northern
– Central
– Southwest

The retroflex initial series provides a good example of how Proto-Tai initials have developed differently in the three branches. Examples in Table 5 include Proto-Tai initial and medial *ʈ, where the latter occurs in *p-ʈ clusters.


Tab. 5: Examples of voiceless retroflex stops in Proto-Tai.

Gloss           Proto-Tai   North Tai   Central Tai   Southwest Tai
‘lift’          *ʈaːm       *r̥aːm       *tʰraːm       *haːm
‘headlouse’     *ʈaw        *r̥aw        *tʰraw        *haw
‘to see’        *ʈaȵ        *r̥aȵ        *tʰran        *hen
‘eye’           *p-ʈaː      *p-ʈaː      *p-tʰraː      *taː
‘die’           *p-ʈaːj     *p-ʈaːj     *p-tʰraːj     *taːj
‘grasshopper’   *p-ʈak      *p-ʈak      *p-tʰrak      *tak
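The initial correspondences in Table 5 can be stated as ordered rewrite rules. The sketch below, in Python, is illustrative only: the rule table is read off the table above, the function name `reflex` is hypothetical, and rimes are copied unchanged, which is a simplification (it does not reproduce the rime changes seen in ‘to see’).

```python
# Initial reflexes read off Table 5. Longer initials are listed first so that
# the cluster *p-ʈ is matched before plain *ʈ.
# Assumption: only initials are rewritten; rimes are left untouched.
INITIAL_REFLEXES = {
    "North Tai": [("p-ʈ", "p-ʈ"), ("ʈ", "r̥")],
    "Central Tai": [("p-ʈ", "p-tʰr"), ("ʈ", "tʰr")],
    "Southwest Tai": [("p-ʈ", "t"), ("ʈ", "h")],
}


def reflex(proto_form, branch):
    """Rewrite the initial of a starred Proto-Tai form for the given branch."""
    body = proto_form.lstrip("*")
    for old, new in INITIAL_REFLEXES[branch]:
        if body.startswith(old):
            return "*" + new + body[len(old):]
    return proto_form  # no rule applies: return the form unchanged


for form in ("*ʈaːm", "*ʈaw", "*p-ʈaːj", "*p-ʈak"):
    print(form, [reflex(form, b) for b in INITIAL_REFLEXES])
```

Keeping the rules in an ordered list rather than a plain mapping makes the precedence of the cluster over the simple initial explicit.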

Luo (1997: 232) classifies the Tai languages as follows, proposing a fourth branch called Northwestern Tai that includes Ahom, Shan, Dehong Dai, and Khamti. All branches are considered to be coordinate with each other:

Tai
– Northern
– Central
– Southwestern
– Northwestern

While there is generally no question about the Northern division of Tai (although see Pittayaporn’s classification below), the placement of Li’s Central and Southwestern branches on a par with the Northern branch has been controversial. Gedney (1989: 62–66) argues that the Southwestern and the Central languages should be placed into one group, with a line drawn between the Northern and non-Northern languages, and that if they are to be further divided, it should be done at a lower level (others who have adopted this schema include Haudricourt 1956, Chamberlain 1975, Strecker 1985, and Ferlus 1990):

Tai
– Northern
– (non-Northern)
  – Central
  – Southwest

Edmondson (2013) has reaffirmed this general subgrouping schema using a computational phylogenetic analysis, which shows a much more diversified Central Tai phylogeny, supporting Pittayaporn’s more detailed analysis below, while generally disconfirming Luo’s (1997) proposal for a separate Northwestern branch, with the possible exception of Ahom:

Tai – Northern – Central – Core – Tay – Nung – Southwest

One distinguishing feature between these two primary branches is that Northern Tai lacks the tightness of voice in the C [*-ʔ] (or sometimes the B [*-h]) tone category found in Central Tai and Southwestern Tai (Edmondson and Solnit 1988: 10). Li (1977) also noted that Northern Tai lacks a stable aspiration contrast in its initials inventory, whereas Central Tai and Southwestern Tai possess this feature; this is evident specifically in the series of lexical items which can be reconstructed with breathy-voiced initials in Proto-Tai (Table 6).

Tab. 6: Examples of breathy-voiced initials in Proto-Tai.

Gloss        Proto-Tai   North Tai   Central Tai   Southwest Tai
‘person’     *bɦuːʔ      *buːʔ       *pʰuːʔ        *pʰuːʔ
‘bowl’       *dɦuəjʔ     *duəjʔ      *tʰuəjʔ       *tʰuəjʔ
‘eggplant’   *ɡɦɯə       *gɯə        *kʰɯə         *kʰɯə
‘rice’       *ɢɦawʔ      *ɣawʔ       *kʰawʔ        *kʰawʔ

Some diagnostic vocabulary items include the following, where Northern Tai either has an etymon unrelated to that in Central and Southwestern Tai, or otherwise differs in either initial, rime, or both (Table 7).

Tab. 7: Examples of differences between Northern Tai and Central-Southwestern Tai.

Gloss            North Tai   Central Tai   Southwest Tai
‘tiger’          *kuːk       *sɯə          *sɯə
‘thorn’          *ʔon        *n̥aːm         *n̥aːm
‘crow’           *ʔaː        *kaː          *kaː
‘steam, vapor’   *soːj       *ʔjaːj        *ʔaːj
‘to tear’        *siːk       *cʰiːk        *cʰiːk
‘knife’          *mit        *miːt         *miːt


Pittayaporn (2009) classifies the Tai languages based on clusters of shared innovations (which, individually, may be associated with more than one branch) (Pittayaporn 2009: 298). In Pittayaporn’s preliminary classification system of the Tai languages, Central Tai is split up into multiple branches, with the Zhuang varieties of Chongzuo in southwestern Guangxi having the most internal diversity. The Southwestern Tai and Northern Tai branches remain intact, but several of the Southern Zhuang languages are considered to be paraphyletic. Pittayaporn’s subgroup structure thus shows characteristics that depart greatly from Li’s conventional three-way classification. It does not recognize either Central Tai or Northern Tai as genealogical subgroups, even though subgroup N resembles Northern Tai rather closely; nor does it view Southwestern Tai as a primary branch. Most importantly, it claims that the Tai family tree is rather flat as it consists of more than two primary branches (Pittayaporn 2009: 301–304):

Tai
– D
  – I (Qinzhou)
  – J
    – M (Wuming, Yongnan, Long’an, Fusui)
    – N (Saek, Po-Ai, Yay, Lingyue, Rong’an, Qiubei, Bouyei, other N Tai)
– C (Chongzuo, Shangsi, Caolan)
– B (Ningming)
– A
  – F (Lungchow, Leiping)
  – E
    – H (Lungming, Daxin)
    – G
      – L (Debao, Jingxi, Nung)
      – K
        – P (Bao Yen, Cao Bang, Wenma)
        – O
          – R (Sapa)
          – Q (Shan, Siamese, Black Tai, Lue, other SW Tai)

Pittayaporn’s diagnostic criteria for the above groups include those shown in Table 8, where each dependent group has inherited the innovations of its mother node.




Tab. 8: Phonological criteria for subgrouping of Tai languages (Pittayaporn 2009: 299–301).

Group   Innovations                              Varieties
D       1) *ɤj, *ɤw, *ɤɰ > *iː, *uː, *ɯː         Subgroups I, M, and N
        2) *weː, *woː > *iː, *uː
        3) *k.t- > *tr-
        4) *ɟm̩.r- > *ɟr-
I       *ɤː# > *aː#                              Qinzhou
J       *ɤn, *ɤt, *ɤc > *an, *at, *ac            Subgroups M and N
M       *p.t- > *tr-                             Wuming, Yongnan, Long’an, Fusui
N       *ɯj, *ɯw > *aj, *aw                      Saek, Po-Ai, Yay, Lingyue, Rong’an, Qiubei, Bouyei, other Northern Tai
C       1) *ɯj, *ɯw > *iː, *uː                   Chongzuo, Shangsi, Caolan
        2) *weː, *woː > *eː, *oː
        3) *k.t- > *tr-
        4) *ɟm̩.r- > *ɟr-
B       1) *ɯj, *ɯw > *iː, *uː                   Ningming
        2) *ɤj, *ɤw, *ɤɰ > *iː, *uː, *ɯː
        3) *weː, *woː > *eː, *oː
A       1) *ɯj, *ɯw > *iː, *uː                   Subgroups F, H, L, P, R, and Q
        2) *ɯːk > *uːk
        3) *ɤj, *ɤw, *ɤɰ > *aj, *aw, *aɰ
        4) *weː, *woː > *eː, *oː
        5) *ɟm̩.r- > *br-
F       –                                        Lungchow, Leiping
E       1) *ɯm > *ɤm                             Subgroups H, L, P, R, and Q
        2) *p.t- > *p.r-
H       *qr- > *ʰr-                              Lungming, Daxin
G       *k.r- > *qr-                             Subgroups L, P, R, and Q
L       *qr- > *kr-                              Debao, Jingxi, Western Nung, Guangnan Nung, Yanshan Nung
K       *eː, *oː > *ɛː, *ɔː                      Subgroups P, R, and Q
P       –                                        Bao Yen, Cao Bang, Wenma
O       *ɤn > *on                                Subgroups R and Q
R       *kr- > *s-                               Sapa
Q       *kr- > *ʰr-                              Shan, Siamese, Black Tai, Lue, other Southwestern Tai
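Because each dependent group inherits the innovations of its mother node, the full set of changes that characterizes a given group can be computed by walking up the tree. The following Python sketch does this for the lineage leading to subgroup Q only. It is an illustration rather than Pittayaporn's own procedure: the parent links follow the tree given above, the innovation lists are a reading of Table 8, and the names `PARENT`, `OWN` and `inherited` are hypothetical.

```python
# Parent links for the lineage A > E > G > K > O > Q (A is a primary branch).
PARENT = {"A": None, "E": "A", "G": "E", "K": "G", "O": "K", "Q": "O"}

# Innovations introduced at each node, as read from Table 8 (assumption:
# the row assignments follow the order of groups in the table).
OWN = {
    "A": [
        "*ɯj, *ɯw > *iː, *uː",
        "*ɯːk > *uːk",
        "*ɤj, *ɤw, *ɤɰ > *aj, *aw, *aɰ",
        "*weː, *woː > *eː, *oː",
        "*ɟm̩.r- > *br-",
    ],
    "E": ["*ɯm > *ɤm", "*p.t- > *p.r-"],
    "G": ["*k.r- > *qr-"],
    "K": ["*eː, *oː > *ɛː, *ɔː"],
    "O": ["*ɤn > *on"],
    "Q": ["*kr- > *ʰr-"],
}


def inherited(group):
    """List all innovations of `group`, oldest (highest node) first."""
    chain = []
    while group is not None:
        chain.append(group)
        group = PARENT[group]
    result = []
    for node in reversed(chain):  # start from the primary branch
        result.extend(OWN[node])
    return result


for change in inherited("Q"):
    print(change)
```

Ordering the output from the highest node downwards reflects the relative chronology implied by the tree: a change listed for A must precede one listed for Q.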

13.4 Unsolved challenges

The primary challenge in ongoing KD classification involves collection and analysis of lexical data. While ample data is currently available for many of the KD families and individual languages, there are still significant gaps in key places. The most important ones are listed below; in the case of Biao, Mulam and T’en, at least one dialect has been well described, but data for other dialects is still minimal. Additional fieldwork on all of the languages below is therefore highly desirable, especially in the case of the more endangered Kra languages Red Gelao (50 speakers), White Lachi (possibly extinct), Laha (5,700 speakers as of 1999), Paha (600 speakers in 2007) and En (200 speakers in 1998):

– Biao
– Mulam
– T’en
– Red Gelao
– White Lachi
– Laha
– Paha
– En

The most controversial subgrouping assertions presented above are also in need of either confirmation or refutation through comparative work. The foremost of these is the demotion of Kra from a primary branch to a mid-level branch, below Biao-Lakkja and Kam-Sui and as a sister of Hlai-Tai. The next most important is the placement of Hlai, which is again moved from its traditional position as a primary branch to a lower position as a sister of Be-Tai.

Although a quantitative phylogenetic analysis has been performed on the Tai branch (Edmondson 2013), no such analysis has been performed over the KD phylum as a whole (at least not in published form). Such a quantitative analysis is very desirable, given the challenges which have faced scholars attempting to perform a qualitative comparative analysis with the currently available data. It would be valuable for suggesting new hypotheses or testing existing ones about the internal structure of the larger KD branches, as well as for helping to confirm the positions of the Kra and Hlai branches mentioned above.

Finally, full reconstructions at all levels of the KD phylum are still needed. Thurgood (1988) provided a preliminary reconstruction of Proto-Kam-Sui, but a full reconstruction has yet to be published, although one exists in manuscript form (Norquest ms). Ostapirat (2000) provides a valuable core of reconstructed Proto-Kra vocabulary, but the inventory of reconstructions must be greatly expanded in order to be of maximum subgrouping value. Lastly, a reconstruction of Proto-Biao (and thereafter a reconstruction of Proto-Biao-Lakkja) must be considered an important goal, since the Biao-Lakkja group is posited here as a primary branch of the KD tree, and a unification of Proto-Biao-Lakkja on the one hand with Proto-Kam-Tai on the other will therefore be the only way to achieve a full Proto-KD reconstruction.




References

Benedict, Paul K. 1942. Thai, Kadai, and Indonesian: A new alignment in South-Eastern Asia. American Anthropologist 44. 576–601.
Benedict, Paul K. 1975. Austro-Thai language and culture, with a glossary of roots. New Haven: HRAF Press.
Chamberlain, James R. 1975. A new look at the history and classification of the Tai dialects. In Jimmy G. Harris & James R. Chamberlain (eds.), Studies in Tai linguistics in honor of William J. Gedney, 49–60. Bangkok: Central Institute of English Language, Office of State Universities.
Chamberlain, James R. 2016. Kra-Dai and the proto-history of South China and Vietnam. Journal of the Siam Society 104. 27–77.
Chen, Yen-ling. 2018. Proto-Ong-Be. Manoa: University of Hawaii at Manoa PhD dissertation.
Diller, Anthony. 2000. The Tai language family and the comparative method. In Somsonge Burusphat (ed.), Proceedings: The International Conference on Tai Studies, July 29–31, 1998, 1–32. Nakhon Pathom: Institute of Language and Culture for Rural Development, Mahidol University.
Diller, Anthony, Jerold A. Edmondson & Luo Yongxian (eds.). 2008. The Tai-Kadai languages. London & New York: Routledge.
Edmondson, Jerold A. 2011. Geyang yuyan fenlei buyi [Notes on the subdivisions in Kra]. Journal of Guangxi University for Nationalities 33(2). 8–14.
Edmondson, Jerold A. 2013. Tai subgrouping using phylogenetic estimation. Presented at the 46th International Conference on Sino-Tibetan Languages and Linguistics (ICSTLL 46), Dartmouth College, Hanover, New Hampshire, United States, 7–10 August 2013 (Session: Tai-Kadai Workshop).
Edmondson, Jerold A. & David B. Solnit (eds.). 1988. Comparative Kadai: Linguistic studies beyond Tai (Summer Institute of Linguistics Publications in Linguistics 86). Arlington, TX: Summer Institute of Linguistics.
Edmondson, Jerold A. & David B. Solnit (eds.). 1997. Comparative Kadai: The Tai branch (Summer Institute of Linguistics Publications in Linguistics 124). Arlington, TX: Summer Institute of Linguistics.
Ferlus, Michel. 1990. Remarques sur le consonantisme de proto thai-yay (révision du proto-tai de Li Fangkuei). Paper circulated at the 23rd International Conference on Sino-Tibetan Languages and Linguistics, University of Texas at Arlington.
Gedney, William J. 1989. Selected papers on comparative Tai studies. In Robert Bickner, John Hartmann, Thomas J. Hudak & Patcharin Peyasantiwong (eds.), Michigan papers on South and Southeast Asia 29. Ann Arbor: Center for South and Southeast Asian Studies, University of Michigan.
Gedney, William J. 1995. Linguistic diversity among Tai dialects of southern Guangxi. In Thomas J. Hudak (ed.), William J. Gedney’s Central Tai dialects: Glossaries, texts, and translations, 803–822. Ann Arbor: Center for South and Southeast Asian Studies, University of Michigan.
Hansell, Mark. 1988. The relation of Be to Tai: Evidence from tones and initials. In Jerold A. Edmondson & David B. Solnit (eds.), Comparative Kadai: Linguistic studies beyond Tai (Summer Institute of Linguistics Publications in Linguistics 86), 239–288. Arlington, TX: Summer Institute of Linguistics.
Haudricourt, André-Georges. 1956. De la restitution des initiales dans les langues monosyllabiques: le problème du thai commun. Bulletin de la Société de Linguistique de Paris 52. 307–322.
Haudricourt, André-Georges. 1967. La langue lakkia. Bulletin de la Société de Linguistique de Paris 62. 165–182.
Hsiu, Andrew. 2014. The Biao languages of northwestern Guangdong, China. Presented at SEALS 24, Yangon University, Yangon, Myanmar, 28 May.
Hudak, Thomas J. 2008. William J. Gedney’s comparative Tai source book (Oceanic Linguistics Special Publication 34). Honolulu: University of Hawai’i Press.
L-Thongkum, Theraphan. 1992. A preliminary reconstruction of Proto-Lakkja (Cha Shan Yao). Mon-Khmer Studies 20. 57–90.
Li Fang Kuei. 1943. Notes on the Mak language (Institute of History and Philology, Monograph series A, no. 20). Shanghai: Academia Sinica.
Li Fang Kuei. 1965. The Tai and Kam-Sui languages. Lingua 14. 148–179.
Li Fang Kuei. 1977. Handbook of Comparative Tai. Honolulu: University of Hawai’i Press.
Li Rulong, Hou Xiaoying, Lin Tiansong & Qin Kai. 2012. A study of Chadong. Beijing: Ethnic Publishing House.
Liang Min. 2002. A study of Biao. Beijing: Minzu University Press.
Liang Min & Zhang Junru. 1996. An introduction to the Kam-Tai languages. Beijing: China Social Sciences Academy Press.
Luo Yongxian. 1997. The subgroup structure of the Tai languages: A historical-comparative study. Journal of Chinese Linguistics Monograph Series 12.
Luo Yongxian. 2008. Sino-Tai and Tai-Kadai: Another look. In Anthony Diller, Jerold A. Edmondson & Luo Yongxian (eds.), The Tai-Kadai languages, 9–28. London & New York: Routledge.
Norquest, Peter K. 2007. A phonological reconstruction of Proto-Hlai. Tucson: University of Arizona PhD dissertation.
Norquest, Peter K. 2015. A phonological reconstruction of Proto-Hlai (Languages of Asia 13). Leiden: Brill.
Ostapirat, Weera. 1998. A mainland Be language? Journal of Chinese Linguistics 26(2). 338–344.
Ostapirat, Weera. 2000. Proto-Kra. Linguistics of the Tibeto-Burman Area 23(1). 1–251.
Ostapirat, Weera. 2005. Kra-Dai and Austronesian: Notes on phonological correspondences and vocabulary distribution. In Laurent Sagart, Roger Blench & Alicia Sanchez-Mazas (eds.), The peopling of East Asia: Putting together archaeology, linguistics and genetics, 107–131. London & New York: Routledge-Curzon.
Ostapirat, Weera. 2008. The Hlai language. In Anthony Diller, Jerold A. Edmondson & Luo Yongxian (eds.), The Tai-Kadai languages, 623–652. London & New York: Routledge.
Pittayaporn, Pittayawat. 2008. Proto-Southwestern Tai: A new reconstruction. Paper presented at the 18th Annual Meeting of the Southeast Asian Linguistics Society, Universiti Kebangsaan Malaysia, Bangi, 21–22 May.
Pittayaporn, Pittayawat. 2009. The phonology of Proto-Tai. Ithaca, NY: Department of Linguistics, Cornell University PhD dissertation.
Solnit, David B. 1988. The position of Lakkia within Kadai. In Jerold A. Edmondson & David B. Solnit (eds.), Comparative Kadai: Linguistic studies beyond Tai (Summer Institute of Linguistics Publications in Linguistics 86), 219–238. Dallas: Summer Institute of Linguistics and the University of Texas at Arlington.
Strecker, David. 1985. The classification of the Caolan languages. In Suriya Ratanakul, David Thomas & Suwilai Premsrirat (eds.), Southeast Asian linguistic studies presented to André-G. Haudricourt, 479–492. Nakhon Pathom: Institute of Language and Culture for Rural Development, Mahidol University.
Thurgood, Graham. 1992. The aberrancy of the Jiamao dialect of Hlai: Speculation on its origins and history. In Martha S. Ratliff & Eric Schiller (eds.), Papers from the First Annual Meeting of the Southeast Asian Linguistics Society, 417–433. Tempe, AZ: Arizona State University, Program for Southeast Asian Studies.
Zhang Zhenxing. 1992. A brief account of the Wuchuan dialect in Guangdong Province. Fangyan 3.

Martha Ratliff

14 Classification and historical overview of Hmong-Mien languages

14.1 Introduction

Some aspects of the history of Hmong-Mien are well known: the tone system of the family has been understood since Kun Chang’s work in the middle of the last century (e.g. Chang 1947, 1953, 1966, 1972), as has the basic division of the family into two branches, Hmongic and Mienic (or Miao and Yao). Much reconstruction work has been done: Proto-Hmong-Mien reconstructions include Purnell (1970), Wang and Mao (1995), Chen (2001), Ratliff (2010b), and Li (2018); lower-level reconstructions include Wang’s reconstruction of Proto-Hmongic (1979, 1994) and L-Thongkum’s reconstruction of Proto-Mienic (1993) (see Taguchi, this volume). Understandably, much remains poorly understood, including the internal structure of each main branch of the family, the precise sources and timing of borrowings from different varieties of Chinese (both words and structures), and the deeper relationship of Hmong-Mien to one or more neighboring language families (see also Sidwell and Reid, this volume).

14.2 Age of the family

The age of the reconstructed Hmong-Mien protolanguage has been given as roughly 500 BCE because it contains a stratum of loanwords from Old Chinese, and unlike Hmong-Mien, Chinese is attested in texts that can be dated. Borrowing from Chinese has been continuous over the history of Hmong-Mien (Ratliff 2009). There are far more loanwords from Middle Chinese (c. 600 CE) and modern Chinese dialects than from Old Chinese in Hmong-Mien languages, but a stratum of Old Chinese loanwords in the protolanguage is undeniable (see Table 1). “Old Chinese” refers to varieties of Chinese used before the unification of China under the Qín dynasty in 221 BCE, and encompasses the language of the oracular inscriptions that date roughly 1,000 years earlier to 1250 BCE. However, “Old Chinese” is also used as a name for the reconstructed language of the pre-Qin period. This reconstructed language is based on the traditional comparative method, but is constrained by a number of other sources of evidence, including the reconstructed forms of ancient loanwords to Hmong-Mien and other languages (Baxter and Sagart 2014: 1–4). We have arbitrarily picked the middle of the Old Chinese millennium as the time of contact that led to these loans, although of course contact could have occurred somewhat earlier or later.

https://doi.org/10.1515/9783110558142-014


There are also ancient loanwords from some Tibeto-Burman donor in Proto-Hmong-Mien, but since the donor is not known, we are not able to date these loans. If the donor were to be identified, and if, as Benedict (1987: 20) believed, these loans predated contact with Chinese, the age of the protolanguage might be pushed back further. Additional support for an earlier date comes from the fact that two Old Chinese words, 鐵 tiě ‘iron’ and 下 xià ‘descend’, were borrowed separately into Proto-Hmongic and Proto-Mienic, which suggests that the period before the split, when Proto-Hmong-Mien was spoken, was even earlier than 500 BCE. However, the Hmong-Mien family certainly appears to be shallower than neighboring language families. For many meanings, simple inspection of a word list suffices to suggest cognate candidates, especially since historical tone categories are so stable. To be sure, this is an impressionistic measure of age since we do not expect languages to change at a uniform rate, but Hmong-Mien gives the appearance of a subfamily of some ancient family rather than an ancient family itself. This is why the hunt for external relatives has been joined by so many scholars. Many Chinese scholars argue that Hmong-Mien and Tai-Kadai belong to the Sino-Tibetan language family (e.g. Wang 1986; Pan 2006). Forrest (1973 [1948]: 93–103), Downer (1963), Haudricourt (1966), and Peiros (1998: 155–160) favor the possibility of a relationship with Austroasiatic, while Benedict links Hmong-Mien to Austronesian and Tai-Kadai as part of “Austro-Tai” (Benedict 1975). Kosaka (2002) proposes a “Miao-Dai” family uniting Hmong-Mien and Tai-Kadai. Finally, Starosta (2005) suggests all of these families may ultimately be related in a macro-family, “East Asian”.

14.3 The Hmong-Mien homeland

The traditional method of “linguistic paleontology” (locating the homeland at the nexus of words for flora and fauna that can be reconstructed for the protolanguage) suggests that the speakers of Proto-Hmong-Mien occupied the middle Yangtze River valley c. 500 BCE (Ratliff 2004, 2010b). However, it is still unclear whether or not the ancestors of these speakers moved to the Yangtze valley from someplace else before that, and if so, from what direction. We believe that speakers of Proto-Hmong-Mien must have lived in southern China because cognates of the ancient word for tropical cogongrass or thatch grass (Imperata cylindrica) are found in both Hmongic and Mienic (Proto-Hmong-Mien *NKan), and cogongrass stops growing between the 33rd and 34th latitudes (Zhongguo Caodi Ziyuan Tuji 1996).

Other flora and fauna words also link the Hmong-Mien to southern China, some more specifically to the middle Yangtze River valley. These words can only be reconstructed to Proto-Hmongic, but since their Mienic counterparts are Chinese borrowings they may well be preservations from Proto-Hmong-Mien. The two words for plants are for rakkyo (Allium chinense), PH *ɢləŋB, and the chameleon plant (Houttuynia cordata), PH *truwD. These plants are used for food and for medicinal purposes. Rakkyo is cultivated in Guangxi, Hunan, Guizhou, Sichuan, and Guangdong, and the chameleon plant grows in provinces south of the Yangtze River (Ci Hai 1979; Anderson 1993: 213). Animals specifically associated with this area for which old Hmong-Mien words can be reconstructed are the painted eyebrow thrush (Garrulax canorus canorus), Proto-Hmongic *cɔŋA, and the river deer (Hydropotes inermis), Proto-Hmongic *ŋgu̯eiB. The painted eyebrow thrush is a non-migratory bird, common in the south Yangtze River basin, and the river deer is found in the middle and southern Yangtze River valley as well as on reed beaches and grasslands on the southeast coast (Ci Hai 1979). And although the endangered Chinese pangolin (Manis pentadactyla), Proto-Hmong-Mien *rɔiH, has been found throughout southern Asia in a large area that stretches from northeast India east to Taiwan, it is specifically noted as having been present in the Yangtze River basin (Wildlife of China n.d.: 69).

The question of where the ancestors of these people lived in even earlier times is difficult to answer, however. It would be easier to explain the significant number of Tibeto-Burman loanwords in Proto-Hmong-Mien (the numerals ‘four’ through ‘ten’, ‘son-in-law’, ‘daughter-in-law’, ‘sun’, ‘moon’, etc.) had their ancestors moved into the middle Yangtze River valley from a location further west. Benedict believed, “… that the Tibeto-Burman loanwords had already become part of the parent MY [Miao-Yao = Hmong-Mien] language by [the time of Old Chinese] in view of the continuous nature of the subsequent MY/Chinese relationship” (Benedict 1987: 20). Clearly, given the age difference between the two families, with Hmong-Mien at c. 2500 years and Tibeto-Burman close to the age of Sino-Tibetan itself at c. 7200 years (Sagart et al. 2019), we should try to identify a Tibeto-Burman subfamily source for these loanwords that might help us locate a prehistoric contact zone for the two groups. This is a challenge, however, since the words Hmong-Mien borrowed from Tibeto-Burman show similarities to different Tibeto-Burman subfamilies. For example, ‘sun/day’ (TB *s-nəy > HM *hnu̯ɔj) shows a trace of the s- prefix that is also retained in Qiangic and Gyalrongic, but ‘son-in-law’ (TB *krwəy > HM *ʔweiX) does not appear to have been borrowed from Qiangic or Gyalrongic, in which the word means ‘daughter-in-law’, but rather perhaps from Kachin, in which it also means ‘son-in-law’. The final problem is that Hmong-Mien has no candidate sister families in the west. The relationship with Tibeto-Burman is primarily a contact relationship, even though it is possible that Tibeto-Burman and Hmong-Mien are related through a higher-order construct at a very deep level.

Several scholars have speculated that the speakers of Proto-Hmong-Mien may have been the subject people of the ancient state of Chu 楚 (770–223 BCE) under a ruling Chinese elite (Erkes 1930; Pulleyblank 1983; Sagart 1999). This proposal would place them in the right place, south of the Yangtze River valley, at the right time, c. 500 BCE. But there is no convincing evidence of a Hmong-Mien linguistic substratum in Chu texts; Chamberlain (2016) finds evidence of a Tai-Kadai substratum instead. It would also be difficult to explain the Tibeto-Burman element in Hmong-Mien had speakers of Proto-Hmong-Mien been closer to the eastern coast c. 500 BCE.

14.4 Classification

The basic division between the two major branches of the family, Hmongic and Mienic, is not in dispute. Although clearly related to one another, the two groups of languages diverge in phonology, syntax, and lexicon. In Mienic, prenasalized stops have become voiced stops, uvulars have merged with velars, and vowel length contrasts have been innovated (this is controversial: Purnell (1970) and Wang and Mao (1995) reconstruct vowel length for the protolanguage). In Hmongic, on the other hand, the numerous rimes that must be reconstructed for the protolanguage have merged into a relatively small set of 28, all coda consonants with the exception of -ŋ have been lost, and words ending in *-k have shifted tone category. The order of elements within the noun phrase is different: in Hmongic languages the demonstrative is on the right edge of the noun phrase, but in Mienic languages the demonstrative is on the left edge. Finally, although Hmongic languages are full of Chinese loanwords, Mienic languages have even more. As several of these lexical replacements are shared by all Mienic languages (e.g. Mienic *hmienA ‘face’ < Chinese 面 miàn, Mienic *nɔŋC ‘pus’ < Chinese 膿 nóng, Mienic *pouB ‘axe’ < Chinese 斧 fǔ, Mienic *ʔwənB ‘bowl’ < Chinese 碗 wǎn), we can consider lexical replacement as another type of innovation that defines the Mienic branch.

Beyond this basic two-way division, the structure of the family is less clear. Within Mienic, the main languages that need to be classified are Iu Mien (in distinct varieties), Kim Mun, Biao Min, Chao Kong Meng, and Dzao Min. Wang and Mao (1995: 3) place Biao Min and Chao Kong Meng together and show four separate branches stemming from Proto-Mienic on the basis of a set of phonological similarities, rather than on the basis of shared innovations from a reconstructed protolanguage. Four other linguists have selected what they consider to be one key individual phonological innovation to distinguish Mienic languages. For Purnell (1970) it is the loss of voiceless sonorants in Mun that distinguishes Mun from Mien, and for L-Thongkum (1993) it is the split of upper-register tone categories in Mun that distinguishes Mun from Mien (neither includes Biao Min, Chao Kong Meng, or Dzao Min). Aumann and Sidwell (2004) use the development of rhotics as the basis for subgrouping. Their classification pairs two varieties of Mien and Biao Min (in which rhotics > laterals) against two other varieties of Mien, Kim Mun, and Dzao Min (in which rhotics > g, ð, or dz). Finally, Taguchi (2008) argues that both lexical and phonological innovations should be taken together to determine subgrouping. His tree shows a close relationship between two varieties of Mien (Changping and Luoxiang) and Kim Mun on the lowest (most innovative) node of the tree.



Classification and historical overview of Hmong-Mien languages 


Until these various proposals can be reconciled, we cannot suggest which one of them is most likely to be right. Thus the most conservative tree, from Wang and Mao (1995), appears in Figure 1.

Proto-Mienic
    Iu Mien
    Kim Mun
    Biao Min/CKM
        Biao Min
        Chao Kong Meng
    Dzao Min

Fig. 1: Mienic languages.

Within Hmongic there are even greater challenges since many more languages are involved. In Chinese scholarship, the Miao division comprises four languages: (i) the Miao language, divided into three major “dialects”: Qiandong (East Hmongic, e.g. Hmu), Xiangxi (North Hmongic, e.g. Xong), and Chuanqiandian (West Hmongic, e.g. Hmong, Mong, A-Hmao, Hmyo), (ii) Pu Nu, (iii) Pa Hng, and (iv) Kiong Nai. In this tradition, Ho Ne (She) constitutes a division of its own, on a par with Miao and Yao (Wang 1995: 2). It is clear, however, that Ho Ne is Hmongic since it participates in the major innovations that distinguish Hmongic from Mienic (Ratliff 1998). Pa Hng has been shown in studies by Ratliff (2010) and Taguchi (2013) to share both phonological and lexical features with North Hmongic. Moreover, Pa Hng and North Hmongic show archaisms in their rimes which suggest that they were the first languages to split off from Proto-Hmongic. The tree in Figure 2, adapted from Taguchi (2013), represents an advance in our understanding.

Proto-Hmongic
    North Hmongic/Pa Hng
    West Hmongic/Pu Nu
        Hmong/A-Hmao/Hmyo
        Pu Nu/Nao Klao
    Kiong Nai/Ho Ne/Pa Na
        Kiong Nai
        Ho Ne/Pa Na
    East Hmongic

Fig. 2: Hmongic languages.

The internal structure of the Hmongic sub-family is not a settled matter, however. Compared to the Mienic sub-family, surprisingly little work has been done, perhaps because Hmongic is a family of daunting complexity. The subgrouping of languages shown above is likely to be revised on the basis of the discovery of other shared innovations.


 Martha Ratliff

14.5 Selected issues in Hmong-Mien language history

In this section I present brief sketches of the history of three Hmong-Mien language features: tones, monosyllabic word structure, and noun classifiers (for more detail, see Ratliff 2010b). These three features are of special interest because each one is a defining feature of the modern-day Southeast Asian language type (see Strecker, this volume), but there is reason to think that not one of them was type-defining for the Hmong-Mien protolanguage.

14.5.1 Tones

The earliest in-depth work on the Hmong-Mien language family involved the reconstruction of the tones by Kun Chang in a series of articles published over several decades (Chang 1947, 1953, 1966, 1972). For the most part, Hmong-Mien tonal categories correspond to those of Chinese, Vietnamese, and Tai, the languages of the ‘Sinosphere’ (Matisoff 1990) – Chinese and those languages in close contact with and profoundly influenced by Chinese – and the similarity of their tonal systems makes it likely that these languages all developed tones in the same way. We therefore assume that tonogenesis in Hmong-Mien followed a series of developments similar to those first elucidated by Haudricourt (1954) to explain tonogenesis in Vietnamese. By this account, tonal contrasts arose upon the loss of syllable-final laryngeal consonants *-ʔ and *-h (< *-s), yielding a three-way tonal contrast:

A (*-ø)    B (*-ʔ)    C (*-h)

Syllables with final voiceless stops *-p, *-t, *-k developed their own tonal characteristics at a later date, yielding a fourth major tone category, D. These original tones subsequently doubled to eight following the merger of syllable-initial voiced and voiceless obstruents (preglottalized and voiceless sonorants patterned with the voiceless initials) in most, but not all languages, as represented below:

1 (voiceless initials)    A1    B1    C1    D1
2 (voiced initials)       A2    B2    C2    D2
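
The two-step account just given (final consonant determines the category letter, initial voicing determines the register) can be sketched as a small lookup. This is only an illustration in Python, not a reconstruction procedure: the function name `tone_category`, the coda labels, and the choice to treat every coda other than a laryngeal or a voiceless stop as category A are assumptions made for the example.

```python
# Illustrative sketch of the two-step tonogenesis model described in the text.
# Step 1: the syllable-final consonant fixes the category letter (A-D).
# Step 2: the voicing of the initial fixes the register (1 or 2).
# The function name and the coda inventory are assumptions for this example.

STOP_CODAS = {"p", "t", "k"}


def tone_category(initial_voiced: bool, coda: str) -> str:
    """Return the historical tone category of a pre-tonal syllable.

    Pass initial_voiced=False for preglottalized and voiceless sonorants,
    which pattern with the voiceless initials.
    """
    if coda == "ʔ":
        letter = "B"
    elif coda in {"h", "s"}:  # *-h continues an earlier *-s
        letter = "C"
    elif coda in STOP_CODAS:
        letter = "D"
    else:  # assumed: no laryngeal and no stop coda -> category A
        letter = "A"
    register = "2" if initial_voiced else "1"
    return letter + register


if __name__ == "__main__":
    # A voiced initial with a final glottal stop falls into B2.
    print(tone_category(initial_voiced=True, coda="ʔ"))
```

Because the category is a function of syllable shape alone, every word with the same original shape lands in the same category, whatever pitch that category later acquires in a given language.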

All the words in a particular historical tone category have a common historical origin in terms of initial and final consonants (e.g., B2 < syllable with a voiced initial and a final glottal stop). This ensures that when consonants are trans-phonologized into tone, all of the words belonging to each original category as defined by syllable type will continue to pattern together tonally. Although phonetic studies have shown that the newly emergent tones will have certain properties due directly to the type of consonant lost, once tones are created, they morph quite quickly into other things, due in good part to tonal truncation in connected speech (Yang and Xu 2019): originally high tones may lower, low tones may raise, tones may merge, contours may simplify, etc. Therefore, across languages in a family, words that belong to a particular tone category may have quite different phonetic realizations. For example, Hmong-Mien tones in the A1 category have a variety of different phonetic values: they may be mid rising, high level, low rising, mid falling, or mid level. This cross-linguistic variability is true of every tone category. The categories themselves, on the other hand, are remarkably stable: in all Hmong-Mien languages, the members of the group of cognates which includes ‘deep’, ‘three’, ‘thatch grass’ and ‘snake’ will have the same tone (the A1 reflex), regardless of the phonetic value of that tone in any particular language. The stability of historical tonal categories explains why, of all aspects of Hmong-Mien historical phonology, tone was the first to be reconstructed. Relatively speaking, it was easy to do.

Today, tone is one of the first features mentioned in a typological sketch of Hmong-Mien languages. All Hmong-Mien languages are tonal, and their tone inventories are of great complexity: a variety of Hmu (Qingjiang Miao or Black Miao), for example, is reported to have a record five level tone contrasts (Kuang 2013). But were tones present in the protolanguage, spoken approximately 2,500 years ago? There is evidence from Chinese loanwords to suggest that they were not. As explained above, Chinese, Vietnamese, Tai, and Hmong-Mien languages all developed tones in the same way, first from the loss of certain syllable-final consonant contrasts, and second from a tone split following a merger of syllable-initial voiceless and voiced obstruents. So all have tone categories A-D, and most have doubled these into A1, A2, B1, B2, etc.
What is striking is the correspondence between the tone categories of Chinese and Hmong-Mien in the loanwords from Old Chinese, as seen in Table 1 (PHM reconstructions from Ratliff 2010b, OC forms from the online appendix to Baxter and Sagart 2014):

Tab. 1: Loanwords from Old Chinese.

Character | Pinyin and gloss | OC | PHM | Tone category in both
迂 | yū ‘far’ | *qʷ(r)a | *qʷuw | A1
號 | háo ‘sing/cry out’ | *[C.g]ˤaw | *Gæw | A2
下 | xià ‘descend/low/short’ | *gˤraʔ | H *ɴɢaB | B2
下 | xià ‘descend/low/short’ | *m-gˤraʔ-s | M *ɣaC | C2
里 | lǐ ‘village’ | *(mə.)rəʔ | *rəŋX | B2
廩 | lǐn ‘granary’ | *p.rimʔ | *rɛmX | B2
鐵 | tiě ‘iron’ | *l̥ˤik | H *hluwC, M *hrɛkD | D1 (D1>C1 HM-internal)
舌 | shé ‘tongue’ | *mə.lat | *mblet | D2
力 | lì ‘strength’ | *k.rək | *-rək | D2


According to specialists in Old Chinese phonology, Chinese did not have tones at the time of these borrowings (Mei 1970; Baxter 1992; Sagart 1999). The simplest explanation for the correspondence in historical tone category between donor and borrower is that Hmong-Mien borrowed these words before either language developed tones, and then both developed tones in the same way out of final consonant contrasts as part of a prosodic shift that affected all the languages of the Sinosphere. The three other logical contact scenarios (Chinese tonal/Hmong-Mien atonal, Chinese atonal/Hmong-Mien tonal, Chinese tonal/Hmong-Mien tonal) cannot account for the correspondence of historical categories as well as this one. For example, if we take the usual assumption that Hmong-Mien borrowed tones from Chinese as part of the words themselves (see for example, Ying 1972; Benedict 1997), we would not expect to see a correspondence of abstract historical tone categories, which of course no speaker can hear. And if these old borrowings took place after both languages had developed tones, we would expect to see the irregularity across Hmong-Mien languages that characterizes recent Chinese loans, in which each speech community assigns one of its own tones to the Chinese loanword on the basis of phonetic similarity. For example, 對 duì ‘correct’ has been widely borrowed, but relatively recently: in Chinese, the historical tone category is C1, but in Hmong-Mien the tones are reflexes of categories A1, B1, C1, C2, and D1 (Chen 2013: 762). Phonetic matching between two tonal languages clearly does not lead now – and would not have led then – to a correspondence of historical category. For discussion of the three less probable contact scenarios – tonal/atonal, atonal/tonal, tonal/tonal – with examples from modern-day languages of the outcomes to be expected, see Ratliff 2010b: 186–192.
There is an even larger group of loanwords in Hmong-Mien from Middle Chinese that show regular correspondences of historical tone category (Table 2).

Tab. 2: Loanwords from Middle Chinese.

Character | Pinyin and gloss | MC | PHM | Tone category in both
金 | jīn ‘metal’ | kim | *kjeəm ‘gold’ | A1
秧 | yāng ‘seedling’ | ʔjang | *ʔjɛŋA (PH) | A1
千 | qiān ‘thousand’ | tshen | *tshi̯en | A1
銅 | tóng ‘copper’ | duwng | *dɔŋ | A2
羊 | yáng ‘sheep/goat’ | yang | *juŋ | A2
銀 | yín ‘silver’ | ngin | *ɲʷi̯ən | A2
桶 | tǒng ‘bucket’ | thuwngX | *thɔŋ(X) | B1
瓦 | wǎ ‘tile’ | ngwæX | *ŋʷæX | B2
甑 | zèng ‘rice steamer’ | tsingH | *tsjɛŋH | C1
炭 | tàn ‘charcoal’ | thanH | *thanH | C1
竈 | zào ‘stove’ | tsawH | *N-tsoH | C1
箸 | zhù ‘chopsticks’ | drjoH | *drouH | C2
漆 | qī ‘lacquer’ | tshit | *thjet | D1
百 | bǎi ‘hundred’ | pæk | *pæk | D1
十 | shí ‘ten’ | dzyip | *gju̯ɛp | D2
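
The regularity of the Middle Chinese correspondences can be checked mechanically. In the Middle Chinese transcription used for these forms, a final X marks the rising tone (category B) and a final H the departing tone (category C), while a final -p, -t, or -k marks the entering tone (category D). The Python sketch below reads the category off each form and compares it with the category listed for the loan. The function name `mc_tone_category` and the short list of voiced-initial spellings are assumptions made for this example; the list covers only the initials that occur in these fifteen words.

```python
# Sketch: derive the tone category from a Middle Chinese transcription and
# compare it with the category given for the Hmong-Mien loanword.
# VOICED_STARTS is a simplification that only covers the initials in this data.

VOICED_STARTS = ("b", "d", "g", "z", "m", "n", "l", "y")


def mc_tone_category(mc: str) -> str:
    """Return the category (A1 ... D2) implied by an MC transcription."""
    if mc.endswith("X"):        # rising tone
        letter = "B"
    elif mc.endswith("H"):      # departing tone
        letter = "C"
    elif mc[-1] in "ptk":       # entering tone (stop-final syllables)
        letter = "D"
    else:                       # level tone
        letter = "A"
    register = "2" if mc.startswith(VOICED_STARTS) else "1"
    return letter + register


# (MC form, tone category listed for the loan)
LOANS = [
    ("kim", "A1"), ("ʔjang", "A1"), ("tshen", "A1"),
    ("duwng", "A2"), ("yang", "A2"), ("ngin", "A2"),
    ("thuwngX", "B1"), ("ngwæX", "B2"),
    ("tsingH", "C1"), ("thanH", "C1"), ("tsawH", "C1"), ("drjoH", "C2"),
    ("tshit", "D1"), ("pæk", "D1"), ("dzyip", "D2"),
]

# Pairs whose derived category differs from the listed one.
mismatches = [(mc, cat) for mc, cat in LOANS if mc_tone_category(mc) != cat]

if __name__ == "__main__":
    print(mismatches)
```

An empty mismatch list is what a correspondence of historical category looks like; loans adapted by phonetic matching, such as the recent borrowings of 對 duì, would instead scatter across several categories.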




The fact that the tone categories of HM and MC correspond presents a problem for the atonal/atonal borrowing hypothesis, because by the date usually given for Middle Chinese, c. 600 CE, Chinese had developed tone: the Middle Chinese Qieyun rime tables are organized by tone. However, there is still no way to account for the tone category correspondences without assuming that these must have been early Middle Chinese loans, with contact closer to the beginning of the Common Era, and that tone had only been incipient in the Chinese donor language at this point. Hmong-Mien could have borrowed these words with the (perhaps already decomposing) segmental material which eventually gave rise to tones intact. Then if both developed tones in exactly the same way, out of the laryngeal features of word-final consonants as tonogenesis swept across the area, we would expect to see these regular correspondences.

14.5.2 Monosyllabic word structure

Although Hmong-Mien languages are excellent examples of the isolating type, with almost no affixal morphology and a near-perfect equivalence between syllable and morpheme, certain aspects of modern-day word structure point to a syllable-and-a-half word structure in the protolanguage. First, the complex consonant onset clusters (and simple to non-existent consonant codas) in Hmong-Mien suggest an older syllable-and-a-half word structure that underwent a process of “front-end collapse” (Ratliff 2018), that is, the reduction of the first syllable in an original iambic sesquisyllabic language. These consonant clusters include prenasalized voiced, voiceless and voiceless aspirated stops and affricates, and stops followed by the sonorants -r-, -l-, -lj-, -j- and -w-. On the basis of loanwords from Tibeto-Burman, voiceless sonorants in modern-day languages are best reconstructed as clusters of *hm-, *hn-, *hl- from an even older set *s-m-, *s-n-, *s-l-. Finally, certain words have to be reconstructed with odd double sonorant onsets, like *m-nɔk ‘bird’ (cf. AN *manuk ‘chicken’ [Blust and Trussel n.d.], Tai *C̬.nok [Pittayaporn 2009], Kam-Sui *mluk [Thurgood 1988]), *n-mɛj ‘to have’ and *mlu̯ɛjH ‘soft’.

Second, when there are affixes, they are prefixes, not suffixes. There is an active process by which nominal prefixes are generated from class nouns in some Hmongic languages today. For example, White Hmong /tu55-ʈo̤42/ ‘soldier’, literally ‘boy-war’, shows a generalization of the word /tu55/ ‘son/boy’ to serve as a human prefix (see also /tu55-lua21ʔ/ ‘merchant’, /tu55-dua33 ke24/ ‘traveler’, /tu55-nji̤a̤42/ ‘thief’, /tu55-tsi55/ ‘missionary; maid’). As discussed in the next section, this represents the native system of noun classification by prefix that existed before a set of classifiers and the classifier construction itself were borrowed from Chinese.
Recently derived prefixes such as /tu55-/ ‘human’ continue a history of noun prefixation attested by phonologically and semantically reduced prefixes across the family. For example, (i) older semantically opaque prefixes are preserved in Hmong-Mien language names: Pa-Na, Pa-Hng, A-Hmao, Pu-Nu, etc. (ii) There are two prefixes on nouns in Hmu (Hmongic), /a-/ and /qa-/, that have developed from a single ancestral form *qa-. /a-/ signals animacy and definiteness, while /qa-/ serves a variety of other functions which are difficult to reduce to a common core (Shi 2016). (iii) Although the older layer of nominal prefixes is better preserved in Hmongic than in Mienic languages, a single nominal prefix /ʔa-/ is retained in the conservative Mienic language Dzao Min, and functions simply to mark a word as a noun (Mao, Meng and Zheng 1982). (iv) In A-Hmao (Hmongic), there is a special tone class for nouns in the B2, C2, and D2 tone categories (Wang 1979; Wang and Wang 1984). Based on comparative evidence, it seems clear that an ancient nominal prefix comparable to Hmu /a-/, /qa-/ and Dzao Min /ʔa-/ triggered tone sandhi in these nouns and subsequently disappeared (Ratliff 1991). Irregularities in initial consonant correspondences can often be attributed to the replacement of the original consonant by the initial consonant of a prefix: for example White Hmong /te22/ ‘hand’ has clear cognates in other languages that have either a /p-/ or /k-/ onset (from PHM *bɔuX ‘hand’). Moreover, in White Hmong, although ‘hand’ does not bear a prefix, /ko33-taw33/ ‘foot’ must always bear one, suggesting the likelihood that ‘hand’ was once prefixed as well.

Finally, one verbal prefix may be reconstructed on the basis of cross-linguistic comparative evidence. The tonal contrast between ‘die’ and ‘kill’ in modern Hmong-Mien languages (e.g. White Hmong /tua33/ ‘kill’, /tṳa̤42/ ‘die’) can be traced back to a voicing contrast in the protolanguage (*təjH ‘kill’/*dəjH ‘die’), which in turn can be explained by the effect of a voiced stative prefix on the word for ‘die’. This is supported by evidence from Austronesian, which has the same pair of words (*pa-aCay ‘kill’/*ma-aCay ‘die’, Blust and Trussel n.d.).
Another prefix, impossible to recover, must account for the part/whole relationship between Proto-Hmongic *cæwB ‘body/trunk’ and *ɟæwB ‘leg/branch’.

14.5.3 Classifiers

Along with tones and monosyllabic word structure, classifiers (or measure words) that categorize nouns into groups are also a hallmark feature of the Southeast Asian type. The classes into which the world is sorted by these words will be familiar to those who know other languages of the area: animates (including humans), humans specifically, things of a particular shape (bulky, stubby, stringy, flat, etc.), things in pairs, useful things (tools, instruments, weapons), units of length/weight/area/time, collectives, kinds, plus many highly specific “categories” that may have no more than one member. In White Hmong, classifiers must be used if other markers of definiteness are also present in the noun phrase: a quantifier, a possessive, a demonstrative, or a combination of these elements, and may also be used in the bare classifier construction, without another marker of definiteness (Simpson et al. 2011). Nichols (1992: 132–133) places Hmong-Mien in the middle of a numeral classifier “hotbed”, i.e., a part of the world where numeral classifiers occur in many or most languages.

However, it appears that classifiers and the classifier construction are not native to Hmong-Mien. Classifiers exist side-by-side with another type of noun classification system which appears to be the older one, a system of classifying prefixes. Furthermore, many major classifiers are loanwords from Chinese.

The prefixal system of noun classification was briefly introduced in section 14.5.2 above. Depending on the language, there are either phonologically and semantically degraded ancient prefixes (Hmu, Dzao Min) or newer prefixes recruited from class nouns, such as /tu55/ ‘son/boy’ (White Hmong). The difference between class nouns and prefixes is primarily that the prefixes are semantically generalized: /tu55/ as a prefix marks a member of the human class, and the noun bearing this prefix can refer to an adult male or to a female. Prefixes are also more tightly bound to the root phonologically since in West Hmongic languages they may trigger tone sandhi in the following noun if they bear reflexes of historical categories A1 or A2, e.g. White Hmong /po55-ɳtȿe42/ ‘earlobe’ from /po55/ ‘round’ and /ɳtȿe52/ ‘ear’ (for an explanation of West Hmongic tone sandhi rules, see Ratliff 2010a). The semantic categories marked by prefixes overlap with the semantic categories marked by classifiers: animacy, shape, function. As a result, it is possible to find examples of both in the same noun phrase, such as in White Hmong /i55 lu55 po55-ʐe55/ ‘one round round-stone’ (‘a stone’) and /i55 tu22 tu55-ʈo̤42/ ‘one animate human-war’ (‘a soldier’). These constructions are reminiscent of redundant, historically layered English constructions such as ‘more better’: the newer layer is external and rule-governed, while the older layer is internal and lexicalized. The classifier is also free to occur by itself as an anaphor for the full noun phrase; the prefix is bound by definition.
The fact that many basic classifiers are of Chinese origin also argues for the secondary nature of this classification system. For example, the White Hmong tool classifier /ʈa55/, Proto-Hmongic *traŋA, comes from 張 Old Chinese *traŋ > Middle Chinese trjang > zhāng ‘CLF:flat’ (< ‘spread’). It was first used as a classifier for ‘bow’ and then for the instrument ‘zither’ in the Han period. Recent loans for abstract classifiers like ‘kinds/sorts’ are also Chinese: White Hmong /ja21ʔ/ from Mandarin 樣 yàng, /ho21ʔ/ from Mandarin 號 hào. For more examples, see Ratliff 2010b: 228–234.

14.6 Conclusion

The field of Hmong-Mien historical linguistics is small but active. As more field work is conducted on undescribed and under-described languages, more comparative work will be possible: the refinement of existing reconstructions, the mapping and historical interpretation of syntactic features in light of the distribution of Chinese varieties in south China, and the creation of an on-line etymological dictionary modelled upon the Sino-Tibetan Etymological Dictionary and Thesaurus. A comprehensive etymological dictionary searchable by meaning as well as by form will be an important resource for comparative projects, the study of language contact, and the study of Hmong-Mien prehistory. Finally, work on the internal structure of the family needs to proceed with consideration of shared innovations at every level: lexical, phonological, morphological, and syntactic.

References

Anderson, Edward F. 1993. Plants and people of the Golden Triangle: Ethnobotany of the hill tribes of Northern Thailand. Portland: Dioscorides Press.
Aumann, Greg & Paul Sidwell. 2004. Subgrouping of Mienic languages: Some observations. In Somsonge Burusphat (ed.), Papers from the Eleventh Annual Meeting of the Southeast Asian Linguistics Society 2001, 13–27. Tempe: Program for Southeast Asian Studies, Arizona State University.
Baxter, William H. 1992. A handbook of Old Chinese phonology. Berlin: Mouton de Gruyter.
Baxter, William H. & Laurent Sagart. 2014. Old Chinese: A new reconstruction. Oxford: Oxford University Press.
Benedict, Paul K. 1975. Austro-Thai language and culture with a glossary of roots. New Haven: Human Relations Area Files Press.
Benedict, Paul K. 1987. Early MY/TB loan relationships. Linguistics of the Tibeto-Burman Area 10(2). 12–21.
Benedict, Paul K. 1997. Interphyla flow in Southeast Asia. Mon-Khmer Studies 27. 1–11.
Blust, Robert & Stephen Trussel. n.d. The Austronesian comparative dictionary, web edition. https://www.trussel2.com/acd/ (accessed 19 April 2020).
Chamberlain, James R. 2016. Kra-Dai and the proto-history of South China and Vietnam. Journal of the Siam Society 104. 27–77.
Chang, Kun. 1947. Miaoyaoyu shengdiao wenti [On the tone system of the Miao-Yao languages]. Bulletin of the Institute of History and Philology 16. 93–110.
Chang, Kun. 1953. On the tone system of the Miao-Yao languages. Language 29. 374–378.
Chang, Kun. 1966. A comparative study of the Yao tone system. Language 42. 303–310.
Chang, Kun. 1972. The reconstruction of Proto-Miao-Yao tones. Bulletin of the Institute of History and Philology 44. 541–628.
Chen, Qiguang. 2001. Hanyu Miaoyaoyu bijiao yanjiu [A comparative study of Chinese and Miao-Yao]. In Bangxin Ding & Hongkai Sun (eds.), Hanzangyu Tongyuanci Yanjiu [A study of Sino-Tibetan cognate vocabulary], 129–651. Nanning: Guangxi Minzu Chubanshe.
Chen, Qiguang. 2013. Miao Yao Yuwen [Miao and Yao languages]. Beijing: Zhongguo Minzu Daxue Chubanshe [China Minorities University Press].
Ci Hai [Sea of Words], 3rd edn. 1979. Shanghai: Shanghai Cishu Chubanshe [Shanghai Dictionary Press].
Downer, Gordon B. 1963. Chinese, Thai, and Miao-Yao. In H. L. Shorto (ed.), Linguistic comparison in South East Asia and the Pacific, 133–139. London: School of Oriental and African Studies, University of London.
Erkes, Eduard. 1930. Die Sprache des alten Ch’u. T’oung Pao 27. 1–11.
Forrest, R. A. D. 1973 [1948]. The Chinese language, 3rd edn. London: Faber and Faber.




Haudricourt, André G. 1954. De l’origine des tons en vietnamien. Journal Asiatique 242. 69–82. (Reprinted in Haudricourt, André G., 1987, Problèmes de Phonologie Diachronique, 147–160. Paris: Société pour l’Étude des Langues Africaines=SELAF.)
Haudricourt, André G. 1966. The limits and connections of Austroasiatic in the northeast. In Norman H. Zide (ed.), Studies in comparative Austroasiatic linguistics, 44–56. The Hague: Mouton.
Kosaka, Ryuichi. 2002. On the affiliation of Miao-Yao and Kadai: Can we posit the Miao-Dai family? Mon-Khmer Studies 32. 71–100.
Kuang, Jianjing. 2013. The tonal space of contrastive five level tones. Phonetica 70. 1–23.
Li, Yunbing. 2018. Miaoyaoyu Bijiao Yanjiu [A comparative study of the Miao-Yao languages]. Beijing: Shangwu Yinshuguan.
L-Thongkum, Theraphan. 1993. A view on Proto-Mjuenic (Yao). Mon-Khmer Studies 22. 163–230.
Mao, Zongwu, Chaoji Meng & Zongze Zheng. 1982. Yaozu Yuyan Jianzhi [A sketch of the languages of the Yao people]. Beijing: Minzu Chubanshe [Nationalities Press].
Matisoff, James A. 1990. On megalocomparison. Language 66(1). 106–120.
Mei, Tsu-lin. 1970. Tones and prosody in Middle Chinese and the origin of the rising tone. Harvard Journal of Asiatic Studies 20. 86–110.
Nichols, Johanna. 1992. Linguistic diversity in space and time. Chicago: University of Chicago Press.
Pan, Wuyun. 2006. On the genetic relationship between the Miao-Yao languages and the Sino-Tibetan languages. Paper presented at the Workshop on Language and Genes in East Asia/Pacific, December 12–13, Uppsala, Sweden.
Peiros, Ilia. 1998. Comparative linguistics in Southeast Asia. Canberra: Pacific Linguistics.
Pittayaporn, Pittayawat. 2009. The phonology of Proto-Tai. Ithaca, NY: Cornell University PhD dissertation.
Pulleyblank, Edwin G. 1983. The Chinese and their neighbours in prehistoric and early historic times. In David N. Keightley (ed.), The origins of Chinese civilization, 411–466. Berkeley: University of California Press.
Purnell, Herbert C., Jr. 1970. Toward a reconstruction of Proto-Miao-Yao. Ithaca, NY: Cornell University PhD dissertation.
Ratliff, Martha. 1991. The development of nominal/non-nominal class marking by tone in Shimen Hmong. Proceedings of the Seventeenth Annual Meeting of the Berkeley Linguistics Society, 267–282. DOI: https://doi.org/10.3765/bls.v17i0.1631.
Ratliff, Martha. 1998. Ho Ne (She) is Hmongic: One final argument. Linguistics of the Tibeto-Burman Area 21(2). 97–109.
Ratliff, Martha. 2004. Vocabulary of environment and subsistence in the Hmong-Mien protolanguage. In Nicholas Tapp, Jean Michaud, Christian Culas & Gar Yia Lee (eds.), Hmong/Miao in Asia, 147–165. Chiang Mai, Thailand: Silkworm Press.
Ratliff, Martha. 2009. Loanwords in White Hmong. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages: A comparative handbook, 638–658. Berlin & New York: Mouton de Gruyter.
Ratliff, Martha. 2010a [1992]. Meaningful tone: A study of tonal morphology in compounds, form classes, and expressive phrases in White Hmong, 2nd edn. DeKalb, IL: Northern Illinois Press. Distributed by Cornell University Press [1st edition published by Northern Illinois University Center for Southeast Asian Studies].
Ratliff, Martha. 2010b. Hmong-Mien language history. Canberra: Pacific Linguistics, The Australian National University.
Ratliff, Martha. 2018. Against a regular epenthesis rule for Hmong-Mien. Papers in Historical Phonology 3. 123–136.
Sagart, Laurent. 1999. The roots of Old Chinese. Amsterdam & Philadelphia: John Benjamins.


Sagart, Laurent, Guillaume Jacques, Yunfan Lai, Robin J. Ryder, Valentin Thouzeau, Simon J. Greenhill & Johann-Mattis List. 2019. Dated language phylogenies shed light on the ancestry of Sino-Tibetan. Proceedings of the National Academy of Sciences of the United States of America (PNAS) 116(21). 10317–10322.
Shi, Defu. 2016. The functions of proclitic Ab and Ghab in Hmub. Languages and Linguistics 17(4). 575–622.
Simpson, Andrew, Hooi Ling Soh & Hiroki Nomoto. 2011. Bare classifiers and definiteness: A cross-linguistic investigation. Studies in Language 35(1). 168–193.
Starosta, Stanley. 2005. Proto-East Asian and the origin and dispersal of the languages of East and Southeast Asia and the Pacific. In Laurent Sagart, Roger Blench, & Alicia Sanchez-Mazas (eds.), The peopling of East Asia, 182–197. London: Routledge Curzon.
Taguchi, Yoshihisa. 2008. On the subgrouping of Mien dialects. Paper presented at the 41st International Conference on Sino-Tibetan Languages and Linguistics, London.
Taguchi, Yoshihisa. 2013. On the phylogeny of Hmongic languages. Paper presented at the 23rd Annual Meeting of the Southeast Asian Linguistics Society, Chulalongkorn University, Bangkok.
Thurgood, Graham. 1988. Notes on the reconstruction of Proto-Kam-Sui. In Jerold A. Edmondson & David B. Solnit (eds.), Comparative Kadai: Linguistic studies beyond Tai, 179–218. Dallas: Summer Institute of Linguistics and the University of Texas at Arlington.
Wang, Fushi. 1979. Miaoyu Fangyan Sheng Yun Mu Bijiao [The comparison of initials and finals of Miao dialects]. Monograph presented at the 12th International Conference on Sino-Tibetan Languages and Linguistics, Paris.
Wang, Fushi. 1986. A preliminary investigation of the genetic affiliation of the Miao-Yao languages. Paper presented at the International Symposium on the Minority Nationalities of China, January 27–29, Santa Barbara, California.
Wang, Fushi. 1994. Miaoyu Guyin Gouni [Reconstruction of the sound system of Proto-Miao]. Tokyo: ILCAA.
Wang, Fushi & Chunde Wang. 1984. Guizhou Weining Miaoyu de sheng diao [Tones of the Miao language of Weining, Guizhou]. In Fu Maoji (ed.), Zhongguo Minzu Yuyan Luwenji [Collection of articles on the minority languages of China]. Chengdu: Sichuan Minzu Chubanshe.
Wang, Fushi & Zongwu Mao. 1995. Miaoyaoyu Guyin Gouni [Reconstruction of the sound system of Proto-Miao-Yao]. Beijing: Zhongguo Shehui Kexue Chubanshe [China Social Sciences Press].
Wildlife of China. n.d. Zhongguo Yesheng Dongwu Baohu Xiehui [China Wildlife Conservation Association]. Beijing: China Forestry Publishing House.
Yang, Cathryn & Yi Xu. 2019. A review of tone change studies in East and Southeast Asia. Diachronica 36(1). 417–459.
Ying, Lin. 1972. Chinese loanwords in Miao. In Herbert C. Purnell, Jr. (ed.), Chang Yu-hung & Chu Kwo-ray (trans.), Miao and Yao linguistic studies: Selected articles in Chinese (Linguistics Series 5, Data Paper 88), 55–81. Ithaca, NY: Cornell University Southeast Asia Program, Department of Asian Studies.
Zhongguo Caodi Ziyuan Tuji [Atlas of grassland resources of China]. 1996. Zhongguo Kexueyuan Guojia Jihua Weiyuanhui Ziran Ziyuan Zonghe Kaocha Weiyuanhui [The Comprehensive Investigation Committee on Natural Resources under the National Strategy Committee of the Chinese Academy of Sciences]. Beijing: China Cartographic Publishing House.

Paul Sidwell and Lawrence A. Reid

15 Language macro-families and distant phylogenetic relations in MSEA

15.1 Introduction

The other chapters on language classification in this volume deal on the whole with matters of substantial consensus between scholars, and with problems that are generally regarded as tractable given established methods and a reasonable body of data. However, the MSEA region is also one in which, for as long as language classification has been discussed, claims and counter-claims have been made about distant linguistic relations. Such proposals and associated discussions routinely test the limits of our methods as well as the patience of scholars whose relations are otherwise collegial and more or less productive.

In order to appropriately contextualize this discussion, we begin with what is uncontroversially known about the historical disposition of language families in MSEA. Austroasiatic (AA) is the principal linguistic substrate in MSEA, being well established there since at least the late Neolithic transition (Sidwell and Blench 2011), and there is no unambiguous evidence of other language families preceding AA in the same region. If anything, scholars have suspected an even wider ancient dispersal of AA in the SEA region, and have variously pointed to regional typological and cultural parallels and lexical oddities to tentatively suggest that there may have been an AA presence in insular SEA (Adelaar 1995; Blench 2010). The other language families found in MSEA today (Austronesian, Hmong-Mien, Tibeto-Burman, Sinitic, Kra-Dai/Tai-Kadai) are all immigrants known to have penetrated the region – essentially since the adoption of metal-working regionally – often displacing and/or absorbing AA communities in the process.¹

However, this rather broad picture was not an obvious interpretation to the 19th century linguists who first pondered these matters, and it is really only since the 1970s that the present general conception of MSEA linguistic history firmly emerged. Key insights in this process include the emerging understanding of a link between language and farming dispersal and the discovery of rice domestication in ancient China (Bellwood, this volume), and the identification of Formosa as the homeland of the Austronesians (Blust 2019). Such discoveries firmly anchored the origins of the non-AA languages outside of Indo-China, providing a much firmer footing for discussions of ancient language relationships and migrations in more recent decades.

1 This was far from the general outcome. The Chinese presence in Vietnam arguably led to the elevation of Vietnamese as a national language; Tai and AA languages have successfully lived alongside one another for centuries in much of Indo-China, and the historically recent influx of Hmong-Mien into the region has been characterised substantially by strong maintenance of identity.
https://doi.org/10.1515/9783110558142-015


15.2 Early language classifications and the emergence of “Austric”

A hint of what we now recognize as the AA family first emerged in the works of scholars such as Logan (1850, 1856), who recognized a “Mon-Anam Formation” (spelled Mon-Annam in later works) within a vast and all-encompassing “Turanian” language family that gathered all the languages of east and southeast Indo-Iranian and Altaic (Müller 1862). These were not language families in the contemporary conception, but more like typological and areal constructs, with strongly diffusionist conceptions prevailing among linguists. Nonetheless, new data was coming in from the colonial world and theoretical advances in comparative linguistics (especially of the neo-grammarian school) were gaining traction, and divisions recognizably relating to real language histories began to emerge in scholarly analyses. A significant part of the Austronesian (AN) family was already identified quite early in the era of comparative linguistics, Malayo-Polynesian2 being recognized in the 1700s (Réland 1708; Förster 1778, and others), and by the 1870s scholars were making useful progress in comparative studies (e.g. H. Gabelentz 1861–1873; G. Gabelentz 1881). The Mon-Annam family was increasingly recognized, being defended by Cust (1878) and Forbes (1881) among others, although the question of its affinity with the Munda languages of India remained controversial. Importantly, the vague Turanian hypothesis was being abandoned and the idea that languages of the east fall out into multiple distinct families was emerging.

The period from the 1880s to the first decade of the 20th century saw tremendous progress as something approaching our contemporary view of Asian language families emerged. It was in this period that important data compilations were made and became available to scholars. For example, the French Pavie Expeditions into Indo-China yielded lexicons of montagnard languages, the British in Burma and Malaya collected and collated surveys, and the enormous Linguistic Survey of India began, with volumes appearing from 1903 onward (edited by Grierson). While linguistics was strongly divided between diffusionist and neo-grammarian perspectives, scholars were noting lexical and typological similarities between the better-known AA and AN languages, such as Khmer and Malay, and further contrasting these to the monosyllabic and tonal East-Asian tongues. The diffusionist-racialist Keane (1880) divided the languages of the east into “Indo-Chinese” and “Indo-Pacific” on these bases. At about the same time G. Gabelentz (1881) noted various similarities between Nicobarese and the Malayo-Polynesian family. These and other observations strongly hinted at a deeper relation between the AA and AN families, which would soon become an important line of investigation for comparatists.

2 Understood as a relationship only between the opposite ends of a family, somewhat different from current views of Malayo-Polynesian as consisting of all the Austronesian languages spoken outside mainland Taiwan, elsewhere known as Extra-Formosan.

From the MSEA perspective, the emergent problem was the homeland of AA, and if and how the other regional language families of MSEA are related to it, and this would be seriously tackled at the turn of the century. It was Schmidt who in 1906 made a systematic study of the Austroasiatic family and gave the hypothesis a much firmer basis; the same scholar also first proposed the names Austroasiatic, Austronesian, and Austric, the last of which he conceived as a genetic relationship between Austroasiatic and Austronesian (Schmidt 1906: 121–157, and see Map 1). Schmidt considered some 215 AA-AN lexical similarities, mostly direct comparisons of Khmer and various Malayo-Polynesian languages, and discussed similarities in prefixes and infixes between the two families. Additionally, Schmidt was struck by the Chamic group of Indo-China, which he characterized as mixed languages, implying long-standing close contact between AA and AN in Indo-China. This suggested to him a common homeland in that proximal region. The apparent coherence of the Austric hypothesis, reconciling lexical, morphological, and homeland arguments, was immediately appealing to Schmidt, although it attracted little serious interest at the time or in the immediately following decades. Nonetheless, versions of the Austric hypothesis did attract a small but serious following from the 1940s onward, and it remains a feature of contemporary discussions of macro-families and linguistic prehistory of MSEA today.

Map 1: Austric languages (Schmidt 1906: 71).


Aside from AN, the other language family of MSEA which has had a significant history of proximity and contact with AA is Kra-Dai/Tai-Kadai (KD). The KD family (see Diller et al. 2008) was generally considered to be related to Chinese for many decades because of the large body of shared vocabulary items. However, Benedict (1942) proposed that KD languages were not genetically related to Chinese at all, but to AN (Benedict used the terms Thai and Indonesian in that paper), initially proposing a “Thai-Kadai-Indonesian Complex” as a sister of AA within Austric, a “northern division of Schmidt’s Austric superstock” (Benedict 1942: 599, see Figure 1).

Proto-Austric
    Thai, Kadai, Indonesian
    Mon-Khmer, Annamite, ?Miao-Yao

Fig. 1: Austric languages (Benedict 1942: 600).

In subsequent works, Benedict (1966, 1967, 1975, 1976) elaborated and modified his model, renaming his Thai-Kadai-Indonesian “Austro-Tai” (AT), and in 1990 further included Japanese (although this was not well received). In the course of those works Benedict came to abandon Austric, seeing ancient contact as explaining apparent AA-AT similarities. The evidence for AT that Benedict produced was challenged by many linguists (see Reid 1984–1985; Diller 1998), yet as the millennium turned it received new support from scholars such as Sagart (2004, 2005a, 2005b) and Ostapirat (2005). They focused on regularities in a core set of lexical comparisons, and on new data from small KD languages in southern China (especially the Kra language Buyang) that provide evidence for the reorganization of syllable and word structure necessary to reconcile the KD and AN families. Generally, opinion has now firmed strongly around the view that KD and AN are related, although opinions differ sharply as to the configuration of their grouping. In any case, we can now speak with some confidence about an AT macro-family which occupies a substantial proportion of MSEA. At the same time, Schmidt’s original Austric hypothesis is necessarily weakened, although it still enjoys some support.

In the 1970s, as AA studies were in something of a heyday, there were serious attempts to assess and evaluate Schmidt’s Austric. Pou and Jenner (1975) discussed some 65 lexical comparisons, mostly directly comparing Malay and Khmer, and compiled 18 sets of parallel pre-syllables, emphasizing structural-phonological similarities between the languages. Shorto (1976) enthusiastically supported Austric, and his posthumously published Mon-Khmer Comparative Dictionary (Shorto 2006) identifies some 279 AA-AN comparisons, building substantially on the set originally proposed by Schmidt. There is a strong emphasis on Khmer and Malay in these works, and it is evident that historical contact between these important regional languages can explain many of the lexical similarities.

Another feature of Benedict’s (1942) Austric is the tentative inclusion of Hmong-Mien/Miao-Yao (HM) as another branch. Lexical comparisons indicating this possibility were noted by Shafer (1964), Haudricourt (1966) and Jakhontov (1977). Strikingly, Benedict himself (1975, 1990) later took the view that HM belongs within the AT macro-family. Peiros (1998) supported the idea of a special relation between AA and HM, presenting 26 comparisons that he regards as particularly convincing. Taking these together with Benedict’s AT evidence, Peiros proposed an “extended Austric macrofamily” (Peiros 1998: 167), as shown in Figure 2. Generally, the linking of HM to Austroasiatic has not received wide support, and among scholars who have seen some value in the comparisons, these have also been taken as possible evidence of contact in the Yangtse region in the context of early rice cultivation (see van Driem 1999 for discussion).

Austric
    Miao-Austroasiatic
        Miao-Yao
        Austroasiatic
    Austro-Tai
        Kadai
        Austronesian

Fig. 2: Austric languages (Peiros 1998: 168).

The 1990s also saw critical discussion of Austric by other scholars, some quite positive, some less so. Hayes (1992, 1996, 1997, 1999) compiled extensive comparanda in favor of the hypothesis, although these were not well received, in particular being questioned by Blust (1999). Diffloth (1994) discusses various “probable or possible” Austric comparisons, but finds many problems with them, characterizing the lexical evidence as “neither abundant nor obvious” (Diffloth 1994: 311), and places more weight on the evidence of shared morphology (discussed in more detail below), although even this is problematic. Reid (1994, 1999) took a more detailed look at the morphological evidence for Austric, with a special focus on the evidence from Nicobarese languages, finding much merit in this approach. Reid’s contribution impressed Blust (1996), who provisionally accepted the proposition that AA and AN form coordinate families. However, little in the way of new linguistic evidence has emerged in favor of the classic Austric thesis since the turn of the millennium, and it has languished somewhat.

What has emerged with renewed vigor are grand proposals for macro-phyla that seek to explain the origins of all East and Southeast Asian language families, particularly in a manner that dovetails with the emerging results of genetics, especially the proposals of Stanley Starosta. Starosta (2005) includes Sino-Tibetan, Austronesian, Kra-Dai, Austroasiatic, and Hmong-Mien as part of an East Asian superphylum that he also referred to as Sino-Tibetan Austronesian (STAN3) or Proto-East-Asian. His ideas (and those of Sagart 2004, 2005) are supported by Kutanan et al. (2018) who claim (based on mitochondrial analysis), “[f]inally, we used simulations to test hypotheses concerning the genetic relationships of groups belonging to different language families. We found that Starosta’s model […] provided the best fit to the mtDNA data; however, Sagart’s model […] was also highly supported” (Kutanan et al. 2018). The boldness of such hypotheses is striking; they are highly speculative, being based mainly on perceived similarities in affixation and word-structure, and a confidence that the domestication of millet and rice in pre-historic China was important for the formation and ultimate dispersal of relevant language families.

[Figure 3: tree diagram rooted in Proto-East Asian. Node labels: ST-Yangzian, Pre-Sino-Tibetan, Sino-Bodic, Proto-Himalayo-Burman, Tangut-Bodic, Tangut-Himalayan, Sinitic, Tibetan, Proto-Yangzian, Proto-Austro-Asiatic, Munda, Mon-Khmer, Proto-Hmong-Mien, Proto-Austronesian, Proto-Extra Formosan, Tai-Kadai, Malayo-Polynesian.]

Fig. 3: PEA and the origin and dispersal of the languages of East and Southeast Asia and the Pacific (Starosta 2005: 183).

3 The origins of the Sino-Tibetan-Austronesian hypothesis lie in part in proposals by Sagart (1993, 1994, 2002) for “Sino-Austronesian”. Originally proposed as a special relationship between those two families, the hypothesis did not find broad support and was subsequently revised to include other language families.




In the following sections we take a closer look at the evidence for the traditional Austric hypothesis, and see that while indications are there, the evidence is difficult to assess and may be beyond the limits of our methods to resolve clearly. With the firming of support for AT in recent years, it may be that Austric withers as a proposal that was initially promising, yet subsequently could not be coherently reconciled with developments in regional language history.

15.3 Proposed evidence for Austric

15.3.1 Morphology

The primary evidence for a genetic relationship between Austroasiatic and Austronesian languages comes from morphology (Diffloth 1994: 310–312; Reid 1994). As Diffloth notes, “[b]oth families [Austroasiatic and Austronesian] can infix *-m- and *-n- after the first consonant of the root, with vocalization to *-um- and *-in- in Austronesian, and this infix can be reconstructed to the highest level in both families” (Diffloth 1994: 310–312). He also notes, “[a]nother infix, *-r-, is also inserted after the first consonant, with vocalization to *-ar- in Austronesian and to *ra- in Austroasiatic, and with very similar meanings in both families” (Diffloth 1994: 310–312). Reid (1994) notes that, in addition to these infixes, causative affixes *pa- and *ka- are also common to the two families. It is not only these common affixes and their functions which were considered to be unique to the two families, but also their historical developments. In Austronesian, there is a relatively common morphophonemic process whereby the prefix mu- alternates with the infix -um-, and the infix -in- alternates with the prefix ni- (sometimes in the same language). In both Nancowry and Car (Nicobarese languages) a similar process occurs, whereby ma- alternates with -am- in the former, and mə- alternates with -am- in the latter, deriving agentive nominals. Other affixes also show parallelism in their alternations: in Chrau, pa- alternates with -ap- (Reid 1994: 326–327). Even in Nancowry there is a root prefix ha- (< *pa-), which according to Radhakrishnan (1970) was “probably originally a causative … [and] there is some evidence to support treating as a variant of /-ah-/, a nominalizer affix” (Radhakrishnan 1970: 48).
The process by which these affixes alternate is clearly a metathesis of the first two consonants of the derived word, so that *mu-C1, *ma-C1, *ni-C1, *ra-C1, and *pa-C1 become respectively *C1-um-, *C1-am-, *C1-in-, *C1-ar- and *C1-ap-. While this remarkable similarity in affixes found in both the AA and AN families has been taken as convincing evidence for an ancient common origin by various scholars for over a century, it has also been critiqued as problematic by others. For example, Sagart (2016) claims that the resemblance among infixes is only superficial, asserting:


The functions of *[-m-] and [-n-] in the two groups are quite different: *[-m-] derives agentive nouns out of verbs in Austroasiatic while the Austronesian infix marks verbs taking their semantic agent for subject: it acquires a deverbal function only in Malayo-Polynesian. The *[-n-] infix derives names of instruments in Austroasiatic but in Proto-Austronesian it only served for perfective nominalizations. (Sagart 2016: 256)

Sagart (2005b, 2016 and elsewhere) also suggests that *p(V)- and *k(V)- prefixes and an infix *-r- are reconstructable for Sino-Tibetan (ST) with some similarities in function to the counterparts in the other families. This is regarded by Sagart as supporting his Sino-Tibetan-Austronesian hypothesis (STAN), or potentially an even deeper Asian phylum comparable to Starosta’s similarly named macrofamily. In any case, the argument for shared morphology between AA and AN is no longer an exclusive one, but has pivoted more towards nesting these families within an older grouping, a sentiment expressed by Reid (2005a: 150).
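
The prefix/infix alternation described in this section, in which *mu-C1 corresponds to *C1-um-, can be illustrated with a minimal Python sketch. The function names, the vowel inventory, and the sample roots (kan, tak) are hypothetical and chosen only for illustration; they are not reconstructions from either family. The sketch assumes a CV affix and a root that begins with a consonant written as a single character.

```python
# Illustrative sketch only: the roots and vowel set below are hypothetical.
VOWELS = set("aeiouə")


def prefix_form(affix: str, root: str) -> str:
    """Attach a CV affix as a prefix: mu- + kan -> mukan."""
    return affix + root


def infix_form(affix: str, root: str) -> str:
    """Metathesize the affix consonant with the root-initial consonant (C1).

    A prefixed sequence C-V-C1-... becomes C1-V-C-..., so that
    mu-C1 surfaces as C1-um- and ni-C1 surfaces as C1-in-.
    """
    if len(affix) != 2 or affix[0] in VOWELS or affix[1] not in VOWELS:
        raise ValueError("affix must have the shape CV")
    if not root or root[0] in VOWELS:
        raise ValueError("root must begin with a consonant")
    consonant, vowel = affix
    return root[0] + vowel + consonant + root[1:]


if __name__ == "__main__":
    # Show both alternants for each affix shape mentioned in the text.
    for affix, root in [("mu", "kan"), ("ma", "kan"), ("ni", "kan"),
                        ("ra", "tak"), ("pa", "tak")]:
        print(affix, root, prefix_form(affix, root), infix_form(affix, root))
```

The two functions differ only in the order of the affix consonant and C1, which is the sense in which the alternation is described above as a metathesis rather than as two unrelated affixes.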

15.3.2 Syntax

While many mainland AA languages have been heavily influenced both in their morphology and syntax by Chinese, the island languages, Nancowry and Car (Nicobarese), have had less such influence, if any. While most mainland languages have no case-marking forms, Nicobarese has a set of initial monosyllabic forms introducing noun phrases, of a kind that in many Austronesian languages mark case. Some AA languages also mark case by using introductory particles, some of which resemble Austronesian markers. In particular, Reid (1994) points out that Old Khmer ta marked locative phrases and could also mark direct objects, and Sidwell (2020) discusses the cognate morpheme in Nicobarese languages marking indirect objects and demoted arguments, and similar functions in other AA languages. Reid compares this to Proto-AN *ta ‘locative preposition, demonstrative’, which functions as a case marker in some Formosan languages. Some analyses of Proto-AN suggest that the language was ergative, with possessive pronouns also being used as agentive pronouns on transitive verbs. Reid (1994) suggests that Nicobarese also was ergative at an early stage, with possessive pronouns agreeing in form with agentive pronouns and absolutive/nominative pronouns marking patients, and this proposal finds some support in Sidwell’s (2020) reconstruction of Proto-Nicobarese syntax.

It is also striking that recent work (e.g. Jenny 2015, 2020) increasingly points to Proto-AA having preferred verb-initial word order, converging on the established consensus view (Blust 2009: 465) that Proto-AN was also verb-initial. This is particularly significant given the present-day areal typology of word order patterns in MSEA. As the relevant chapters elsewhere in this volume clearly demonstrate, the dominant word order in MSEA is verb-medial (SV/AVP), as it is also in many proximal AN languages. However, in terms of historical word order AA and AN both converge on verb-initial, in contrast to Proto-ST, reconstructed as verb-final (LaPolla 2019), and to the verb-medial order reconstructed for Kra-Dai (Luo 2019) and Hmong-Mien (Ratliff 2010).

15.3.3 Lexicon

From the beginning of comparative linguistics in MSEA, scholars have pointed to lexical parallels between AA and AN languages. The earlier observations of this type tended to focus on well-known languages such as Khmer and Malay, yet given their proximity and long history of contact it is understandable that there are many loan words shared between them, some evidently quite old. Some examples are given in Table 1.

Tab. 1: Selected ancient AA-AN lexical borrowings.

AA > AN      Khmer      Malay
‘crab’       kʰɗaːm     kətam
‘ant’        srəmaoc    səmut
‘hawk’       klaeŋ      həlaŋ

AN > AA      Malay      Khmer
‘silver’     pərak      praʔ
‘gold’       əmas       meah
‘village’    kampuŋ     kɑmpuəŋ ‘dock; bank’

As noted above, Schmidt (1906) considered some 215 AA-AN comparisons, with later authors extending this list somewhat further, although serious examinations of these comparanda have found problems with many of them, including suspected borrowing, poor semantic matches, irregular phonological matches, and restriction to imitative or expressive vocabulary. For example, see Reid’s (2005a) critique of Hayes’s (1997, 1999) Austric vocabularies, which ranks sets into the classes “probable”, “possible”, “weak”, and “rejected”, with commentary. Such critical analyses have reduced the serious contenders for Austric lexicon to just a few dozen terms, approximating the extent of lexical resemblance one can compile by comparing any random pair of East or Southeast Asian language families. Among the strongest comparisons, terms for fauna, flora, and body parts stand out as the most promising. In other semantic fields that we might expect to hold promise if the relationship is genetic, such as pronouns or numerals, good comparisons are lacking. For the sake of illustration, ten of the most important comparisons as judged by Diffloth (1994) are shown in Table 2, with commentary. AA reconstructions are Diffloth’s, AN reconstructions are Blust’s (Austronesian Comparative Dictionary ms).4

4 The ACD has since become available online: https://www.trussel2.com/ACD/ (last accessed 4 April 2021).


Tab. 2: Ten selected Austric lexical comparisons.

Gloss          AA               AN             Commentary
‘fish’         *ʔaka̰ːʔ          *Sikan         Diffloth rates as “probable”, although the phonological disagreement, particularly in the final, makes it problematic.
‘dog’          *ʔac(ṵə)ʔ        *asu           Diffloth rates as “probable”.
‘wood’         *kəɟh(uː)ʔ       *kaSiw         Phonological resemblance is apparent, and can be reconciled if one accepts specific sound changes (Diffloth 1990).
‘eye’          *ma̰t             *maCa          Diffloth rates as “probable”, but the difference in syllable structure is problematic.
‘bone’         *ɟlʔaːŋ          *CuqelaN       Diffloth rates as “probable” if one accepts specific sound changes (Diffloth 1990).
‘hair’         *s(ɔ)k           *bukeS         Possible if the AN bu syllable can be explained (Diffloth 1990).
‘bamboo rat’   Khmu dəkən       Malay dəkan    “Too far apart for borrowing and without attestations in between” (Diffloth 1994: 314).
‘molar’        Khmer thkìəm     Malay gərham   Possible but there is phonological interference from another AA etymon *-gaːm ‘jaw’.
‘left’         pMonic *ɟwiːʔ    *ka-wiʀi       Diffloth rates as “probable”.
‘ashes’        Stieng *buh      *qabu          Diffloth rates as “possible” although distribution limited in AA.

It is apparent that some of the comparisons in Table 2 still require unexplained adjustments to word or syllable structure, or sound changes that involve a certain amount of special pleading. Very close phonological and semantic matches, such as ‘bamboo rat’, are oddly specific and precise, which suggests some yet to be explained borrowing or a remarkable accident. On the whole the lexical evidence for Austric is highly suggestive, but does not reach the level of making a compelling case for a special relationship between AA and AN within or aside from other linguistic relations. The comparisons are also far too few to allow a statement of regular sound correspondences.

15.4 Conclusion

The presence of infixation after the initial consonant of a root, together with an associated metathesis of the first two consonants of a derived root that produces alternations found in both families, remains the strongest evidence of a relation between AA and AN. As Diffloth writes, “even that phonological process is unusual and the likelihood of it having happened by chance twice in Southeast Asia is, I feel, infinitesimal” (Diffloth 1994: 311). The possibility of Nicobarese (lying off the north-west coast of Sumatra) being influenced by contact with Austronesian is also discussed in Reid (1994: 339–340), although the question of how a language can share prefixes, infixes and suffixes without sharing a large portion of its lexicon prompted Diffloth to comment, “[t]he Austric hypothesis presents a real challenge to comparativists, and it will only receive its real test when we have full dictionaries of several languages in each branch of Mon-Khmer” (Diffloth 1994: 320). We may be more cautious than Diffloth and suggest that the data is the real problem here. The history of the Austric hypothesis is such that the most compelling evidence was collected already more than a century ago, and in spite of the great flood of primary data that has been collected since, it has only added incrementally in favor of Austric, and arguably has also introduced as many, if not more, difficulties for the hypothesis. As it presently stands, it remains an intriguing possibility within a wider family of ideas concerning the ancient linguistic history of East and Southeast Asia.

References

Adelaar, K. Alexander. 1995. Borneo as a crossroads for comparative Austronesian linguistics. In Peter Bellwood, James J. Fox & Darrell Tryon (eds.), The Austronesians: Historical and comparative perspectives, 75–95. Canberra: Department of Anthropology, Research School of Pacific and Asian Studies, Australian National University.
Benedict, Paul K. 1942. Thai, Kadai and Indonesian: A new alignment in Southeastern Asia. American Anthropologist 44. 576–601.
Benedict, Paul K. 1966. Austro-Thai. Behavior Science Notes 1. 227–261.
Benedict, Paul K. 1967. Austro-Thai studies. Behavior Science Notes 2. 203–244.
Benedict, Paul K. 1975. Austro-Thai: Language and culture, with a glossary of roots. New Haven: HRAF Program.
Benedict, Paul K. 1976. Austro-Thai and Austroasiatic. In Philip Jenner, Laurence C. Thompson & Stanley Starosta (eds.), Austroasiatic studies, Part I (Oceanic Linguistics Special Publication 13), 1–36. Honolulu: University Press of Hawaii.
Benedict, Paul K. 1990. Japanese/Austro-Tai. Linguistica Extranea, Studia 20. Ann Arbor: Karoma Publishers.
Blench, Roger M. 2010. Was there an Austroasiatic presence in Island Southeast Asia prior to the Austronesian expansion? Bulletin of the Indo-Pacific Prehistory Association 30. 133–144.
Blust, Robert A. 1996. Beyond the Austronesian homeland: The Austric hypothesis and its implications for archaeology. Transactions of the American Philosophical Society, New Series 86(5). 117–158.
Blust, Robert A. 1999. Comments on La Vaughn H. Hayes, “The Austric denti-alveolar sibilants.” Mother Tongue 5. 19–21.
Blust, Robert A. 2009. The Austronesian languages. Canberra: Pacific Linguistics.
Blust, Robert A. 2019. The Austronesian homeland and dispersal. Annual Review of Linguistics 5(1). 417–434.
Cust, Robert Needham. 1878. A sketch of the modern languages of the East Indies. London: Trübner & Co.
Diffloth, Gérard. 1990. What happened to Austric? Mon-Khmer Studies 16/17. 1–9.
Diffloth, Gérard. 1994. The lexical evidence for Austric, so far. Oceanic Linguistics 33(2). 309–322.


Diller, Anthony V. N. 1998. The Tai language family and the comparative method. Proceedings of the International Conference on Tai Studies, 1–32. Bangkok: Mahidol University.
Diller, Anthony V. N., Jerrold A. Edmondson & Yongxian Luo (eds.). 2008. The Tai-Kadai languages (Routledge Language Family Series). New York: Routledge.
Driem, George van. 1999. Four Austric theories. Mother Tongue 5. 23–27.
Forbes, C. J. F. S. 1881. Comparative grammar of the languages of Further India: A fragment. And other essays … London: W. H. Allen.
Förster, J. R. 1996 [1778]. Observations made during a voyage round the world, edited by N. Thomas, H. Guest & M. Dettelbach, with a linguistic appendix by K. H. Rensch. Honolulu: University of Hawaii Press.
Gabelentz, Georg von der. 1881. Sur la possibilité de prouver l’existence d’une affinité généalogique entre les langues dites indochinoises. Proceedings of the International Congress of Orientalists 4 (Florence). 283–293.
Gabelentz, Hans Conon von der. 1861–1873. Die melanesischen Sprachen nach ihrem grammatischen Bau und ihrer Verwandtschaft unter sich und mit den malaiisch-polynesischen Sprachen. Abhandlungen der philologisch-historischen Classe der Königlich Sächsischen Gesellschaft der Wissenschaften 3(7).
Haudricourt, André-Georges. 1966. The limits and connections of Austroasiatic in the northeast. In Norman H. Zide (ed.), Studies in comparative Austroasiatic linguistics, 44–56. The Hague: Mouton.
Hayes, La Vaughn H. 1992. On the track of Austric: Part I. Mon-Khmer Studies 21. 143–178.
Hayes, La Vaughn H. 1996. Another look at final spirants in Mon-Khmer. Mon-Khmer Studies 26. 41–64.
Hayes, La Vaughn H. 1997. On the track of Austric: Part II. Consonant mutation in early Austroasiatic. Mon-Khmer Studies 27. 13–44.
Hayes, La Vaughn H. 1999. On the track of Austric: Part III. Basic vocabulary. Mon-Khmer Studies 29. 1–34.
Jakhontov, Sergej E. 1977. In defence of Austro-Thai. In Vjacheslav Ivanov (ed.), Konferencija ‘Nostraticheskie jazyki i nostraticheskoe jazykoznanie’, 50–51. Moscow: Nauka.
Jenny, Mathias. 2015. Syntactic diversity and change in Austroasiatic languages. In Carlotta Viti (ed.), Perspectives on historical syntax, 317–340. Amsterdam & Philadelphia: John Benjamins.
Jenny, Mathias. 2020. Verb-initial structures in Austroasiatic languages. In Mathias Jenny, Paul Sidwell & Mark Alves (eds.), Austroasiatic syntax in areal and diachronic perspective, 21–45. Leiden & Boston: Brill.
Kayser, Manfred, Silke Brauer, Gunter Weiss, Wulf Schiefenhovel, Peter Underhill, Peidong Shen, Peter Oefner, Mila Tommaseo-Ponzetta & Mark Stoneking. 2003. Reduced Y-chromosome, but not mitochondrial DNA, diversity in human populations from West New Guinea. American Journal of Human Genetics 72. 281–302.
Keane, A. H. 1880. On the relations of the Indo-Chinese and Inter-Oceanic races and languages. Journal of the Anthropological Institute 9. 254–289.
Kuiper, F. B. J. 1948. Proto-Munda words in Sanskrit. Amsterdam: Noord-Hollandsche Uitgevers Maatschappij.
Kuiper, F. B. J. 1950. An Austro-Asiatic myth in the RV. Amsterdam: Noord-Hollandsche Uitgevers Maatschappij.
Kutanan, Wibhu, Jatupol Kampuansai, Andrea Brunelli, Silvia Ghirotto, Pittayawat Pittayaporn, Sukhum Ruangchai, Roland Schröder, Enrico Macholdt, Metawee Srikummool, Daoroong Kangwanpong, Alexander Hübner, Leonardo Arias & Mark Stoneking. 2018. Complete mitochondrial genomes of Thai and Lao populations indicate an ancient origin of Austroasiatic groups and demic diffusion in the spread of Tai-Kadai languages. European Journal of Human Genetics 26. 898–911. DOI: https://doi.org/10.1038/s41431-018-0113-7.
LaPolla, Randy J. 2019. On the structure of the clause in Proto-Sino-Tibetan and its development in the daughter languages. In Kong Jiangping (ed.), The ancestry of the languages and peoples of China (Journal of Chinese Linguistics Monograph Series 29), 1–23. Hong Kong: Chinese University Press of Hong Kong.
Logan, James Richardson. 1850. On the leading characteristics of the Papuan, Australian, and Malayu-Polynesian nations. The Journal of the Indian Archipelago IV. 344–478.
Logan, James Richardson. 1856. Ethnology of the Indo-Pacific islands, Part 2, Chapter 6, Appendix A: Comparative vocabulary of the numerals of the Mon-Anam formation. Appendix B: Comparative vocabulary of miscellaneous words of the Mon-Anam formation. Journal of the Indian Archipelago 1.
Luo, Yongxian. 2019. The Kra-Dai languages. In Oxford Research Encyclopedia of Linguistics. DOI: https://doi.org/10.1093/acrefore/9780199384655.013.346
Müller, Friedrich Max. 1862. Lectures on the science of language, 3rd edn. London: Longman, Green, Longman and Roberts.
Ostapirat, Weera. 2005. Kra–Dai and Austronesian: Notes on phonological correspondences and vocabulary distribution. In Laurent Sagart, Roger Blench & Alicia Sanchez-Mazas (eds.), The peopling of East Asia: Putting together archaeology, linguistics and genetics, 107–131. London: Routledge Curzon.
Peiros, Ilia. 1998. Comparative linguistics in Southeast Asia. Canberra: Pacific Linguistics.
Pou, Saverous & Philip Jenner. 1974. Proto-Indonesian and Mon-Khmer. Asian Perspectives 17(2). 112–124.
Radhakrishnan, R. 1970. A preliminary descriptive analysis of Nancowry. Chicago: University of Chicago Ph.D. dissertation.
Ratliff, Martha. 2010. Hmong-Mien language history. Canberra: Pacific Linguistics.
Reid, Lawrence A. 1984–1985. Benedict’s Austro-Tai hypothesis – An evaluation. Asian Perspectives 26(1). 19–34.
Reid, Lawrence A. 1994. Morphological evidence for Austric. Oceanic Linguistics 33(2). 323–344.
Reid, Lawrence A. 1999. New linguistic evidence for the Austric hypothesis. In Elizabeth Zeitoun & Paul Jen-kuei Li (eds.), Selected papers from the Eighth International Conference on Austronesian Linguistics, 5–30. Taipei, Taiwan: Academia Sinica.
Reid, Lawrence A. 2005a. The current status of Austric: A review and evaluation of the lexical and morphosyntactic evidence. In Laurent Sagart, Roger Blench & Alicia Sanchez-Mazas (eds.), The peopling of East Asia: Putting together archaeology, linguistics and genetics, 132–160. London & New York: Routledge Curzon.
Reid, Lawrence A. 2005b. The Austric hypothesis. In Keith Brown (ed.), Encyclopedia of language and linguistics, second edition, vol. 1, 596–598. Oxford: Elsevier.
Reid, Lawrence A. 2005c. Austro-Tai hypotheses. In Keith Brown (ed.), Encyclopedia of language and linguistics, second edition, vol. 1, 609–611. Oxford: Elsevier.
Réland, Hadrianus. 1708. Dissertationum miscellanearum partes tres, 3 vols. Utrecht: Trajecti ad Rhenum.
Sagart, Laurent. 1990. Chinese and Austronesian are genetically related. Paper presented at the 23rd International Conference on Sino-Tibetan Languages and Linguistics, October 1990, Arlington, Texas.
Sagart, Laurent. 1993. Chinese and Austronesian: Evidence for a genetic relationship. Journal of Chinese Linguistics 21(1). 1–62.
Sagart, Laurent. 1994. Old Chinese and Proto-Austronesian evidence for Sino-Austronesian. Oceanic Linguistics 33(2). 271–308.


 Paul Sidwell and Lawrence A. Reid

Sagart, Laurent. 2002. Sino-Tibeto-Austronesian: An updated and improved argument. Paper presented at the 9th International Conference on Austronesian Linguistics, Canberra, 8–11 January 2002. https://www.academia.edu/2460148/Sino-Tibetan-Austronesian_an_updated_and_improved_argument (accessed 9 January 2021).
Sagart, Laurent. 2004. The higher phylogeny of Austronesian and the position of Tai-Kadai. Oceanic Linguistics 43(2). 411–444.
Sagart, Laurent. 2005a. Sino-Tibetan–Austronesian: An updated and improved argument. In Laurent Sagart, Roger Blench & Alicia Sanchez-Mazas (eds.), The peopling of East Asia: Putting together archaeology, linguistics and genetics, 161–176. London: Routledge Curzon.
Sagart, Laurent. 2005b. Tai-Kadai as a subgroup of Austronesian. In Laurent Sagart, Roger Blench & Alicia Sanchez-Mazas (eds.), The peopling of East Asia, 177–181. London & New York: Routledge/Curzon.
Sagart, Laurent. 2011. The Austroasiatics: East to West or West to East. In Nicholas Enfield (ed.), Dynamics of human diversity, 345–359. Canberra: Pacific Linguistics.
Sagart, Laurent. 2013. The higher phylogeny of Austronesian: A response to Winter. Oceanic Linguistics 52(1). 249–255. http://hal.archives-ouvertes.fr/hal-00781136 (accessed 9 January 2021).
Sagart, Laurent. 2016. The wider connections of Austronesian: A response to Blust (2009). Diachronica 33(2). 255–281.
Schiller, Eric. 1987. Causativity in Southeast Asia. University of California Working Papers in Linguistics 3. 201–219.
Schiller, Eric. 1988. Which way did they grow? (Morphology and the Austro-Tai/(Macro)Austric debate). Papers from the Chicago Linguistic Society 24th Regional Meeting, 235–246. Chicago: Chicago Linguistic Society.
Schmidt, W. 1906. Die Mon-Khmer-Völker, ein Bindeglied zwischen Völkern Zentralasiens und Austronesiens. Archiv der Anthropologie 5. 59–109.
Schmidt, W. 1916. Einiges über das Infix mn und dessen Stellvertreter p in austroasiatischen Sprachen. Aufsätze zur Kultur- und Sprachgeschichte, vornehmlich des Orients, Ernst Kuhn zum 70. Breslau: Marcus.
Schmidt, Wilhelm. 1906. Die Mon-Khmer Völker, ein Bindeglied zwischen Völkern Zentralasiens und Austronesiens. Braunschweig: Friedrich Vieweg und Sohn.
Shafer, Robert. 1964. Miao-Yao. Monumenta Serica 23. 398–411.
Shorto, Harry L. 1976. In defense of Austric. Computational Analyses of Asian and African Languages 6. 95–104.
Shorto, Harry L. 2006. A Mon-Khmer comparative dictionary. Canberra: Pacific Linguistics.
Sidwell, Paul & Roger Blench. 2011. The Austroasiatic Urheimat: The Southeastern Riverine hypothesis. In Nicholas Enfield (ed.), Dynamics of human diversity, 315–344. Canberra: Pacific Linguistics.
Sidwell, Paul. 2020. Nicobarese comparative grammar. In Mathias Jenny, Paul Sidwell & Mark Alves (eds.), Austroasiatic syntax in areal and diachronic perspective, 82–104. Leiden & Boston: Brill.
Starosta, Stanley. 1995. The Chinese-Austronesian connection: A view from the Austronesian morphology side. In William S.-Y. Wang (ed.), The ancestry of the Chinese language (Journal of Chinese Linguistics Monograph Series 8), 373–392. Hong Kong: The Chinese University of Hong Kong.
Starosta, Stanley. 2005. Proto-East Asian and the origin and dispersal of the languages of East and Southeast Asia and the Pacific. In Laurent Sagart, Roger Blench & Alicia Sanchez-Mazas (eds.), The peopling of East Asia: Putting together archaeology, linguistics and genetics, 182–197. London & New York: Routledge/Curzon.



Language macro-families and distant phylogenetic relations in MSEA 


Trivedi, Rajni, T. Sitalaximi, Jheelam Banerjee Anamika Singh, P. K. Sircar & V. K. Kashyap. 2006. Molecular insights into the origins of the Shompen, a declining population of the Nicobar archipelago. Journal of Human Genetics 51. 217–226. DOI: 10.1007/s10038-005-0349-2. Xing Gongwan. 1991. Guanyu Hanyu Nandaoyu de fashengxue guanxi wenti – L. Shajia’er Hanyu-Nandaoyu tongyuan lun shuping buzheng [On the question of the genetic relationship between Chinese and Austronesian – An evaluation and supplementation of L. Sagart’s Sino-Austronesian theory]. Minzu Yuwen 3. 1–14; 4. 23–35; 5. 13–25.

David Strecker

16 Typological profile of Hmong-Mien languages

16.1 Introduction

Although the Hmong-Mien (HM) languages originated in Central China (Ratliff 2010 and this volume), they are presently spoken in Southern China, considered by many to be part of MSEA.1 The migration into Vietnam, Laos, and Thailand is relatively recent, beginning in the 18th century, with the bulk of the migration in the 19th century. The typological profiles of HM languages share many features with what is found in other groups of the MSEA linguistic area, and HM may be regarded as belonging to the northern periphery of that area. In this overview, we identify and discuss various of these features.

Among notable phonological features are: sesquisyllabic words in which the first syllable may, under certain grammatical and phonological conditions, be dropped (Matisoff 1989, 1992; Ratliff 1991, 2010: 199–213; Chen 1993; Sposato 2015: 183–209, 495–501); large consonant inventories; tonal systems that typologically are like those of Kra-Dai [KD] and Vietnamese both with regard to the historical development of the tones and with regard to the phonological features which characterize tone (Chang 1947; Haudricourt 1954, 1961; Gedney 1972; Brown 1975; Li 1977; Strecker 1979, 1990; Wang Fushi 1994; Ratliff 2010: 184–198; and Brunelle and Ta Thành Tấn this volume); and several distinct types of tone sandhi (Ratliff 1986; Sposato 2015: 94–117).

In terms of morpho-syntax, the languages are predominantly isolating with certain exceptions (Ratliff 1986, 1991a, 2010: 199–213). They prefer SV/AVP word order in clauses but variation in word order within constituents suggests that HM may be at an intersection zone between Sinitic and KD (Sposato 2014).

As Ratliff (this volume) shows, HM languages fall into two groups, Hmongic and Mienic. The principal typological differences between the two groups are phonological. Mienic languages typically have a richer inventory of finals – often including vowel length contrasts, three contrasting final nasals, and final stops – than do Hmongic languages, which usually have no vowel length contrasts, a single final nasal archiphoneme (realized as /n/, /ŋ/, or nasalization of the vowel), and no final stops. In the initials, Hmongic languages typically have more point of articulation contrasts, including retroflex and uvular consonants, which are not found in Mienic languages.

1 The author is deeply grateful to Paul Sidwell, Mathias Jenny, and Martha Ratliff for their input in the preparation of this chapter. https://doi.org/10.1515/9783110558142-016


16.2 Phonology

16.2.1 Syllables

Words in HM languages are predominantly monosyllabic, but there are also sesquisyllabic words, the proportion of which in the lexicon varies from language to language. The first component of sesquisyllabic words, the presyllable, is often optional and generally exhibits a restricted range of phonological shapes. In Xong [ISO 639-3 mmr], for example, presyllables end in either i /i/, a /ɑ/, or ao /ɔ/, and are optional or have grammatical or semantic functions, e.g. ‘to pass’ appears as both jidchat /ʨi44-ʈhɑ53/ and chat /ʈhɑ53/ and ‘tiger’ appears as both dabjod /tɑ35-ʨo44/ and jod /ʨo44/.2 Two questions which have long puzzled students of HM languages are: “What determines when a presyllable can be omitted?” and “What do presyllables mean?”. At this point both questions can be answered only to a limited extent, and more in-depth research in this area is needed, as is the case with much of HM studies.

In languages of the Mienic branch of the HM family, many presyllables are contractions of fuller forms. In Iu Mien [ISO 639-3 ium], for example, bu’juonh /pu ʨwon31/ ‘fist’ is a contraction of buoz-juonh /pwo31ʨwon31/ < buoz /pwo231/ ‘hand’ + juonh /ʨwon31/ ‘to be curled up’. Iu Mien speakers regard all presyllables as contractions. For example, the Iu Mien word for ‘child’ appears as fu’jueiv /fu ʨwei453/ and fa’jueiv /fa ʨwei453/ and in a fuller form fuh jueiv /fu31ʨwei453/ and also as jueiv /ʨwei453/ with no presyllable. Iu Mien speakers regard fu’ /fu/ and fa’ /fa/ as contractions of fuh /fu31/ even though fuh /fu31/ has no independent meaning (Purnell 2012: xxxi).
In languages of the Hmongic branch, presyllables can be divided into those which have a primarily lexical function, that is, they identify the semantic class of the word (most often a noun) to which the syllable is preposed, and those which have a primarily grammatical function, that is, they refer to such concepts as definiteness (preposed to a noun) and durativity (preposed to a verb). Lexical presyllables are exemplified by Xong, which has minimal pairs of nouns combined with general and specific presyllables, as seen in examples (1) to (3):

(1) ghaobbaod /qɔ35-pɔ44/ ‘stalk of a plant’ (ghaob /qɔ35-/, general presyllable for nouns)
    bidbaod /pji44-pɔ44/ ‘node on a stalk’ (bid /pji44-/, presyllable for round objects)

2 Examples taken from sources that use indigenous orthography are written in the indigenous orthography followed by IPA transcription. In the indigenous orthographies of Hmong-Mien languages morpheme boundaries are unmarked. In the IPA transcription morpheme boundaries are marked with a hyphen. (Not to be confused with the hyphen used in the indigenous Iu Mien orthography to mark tone sandhi.)




(2) ghaobdaob /qɔ35-tɔ35/ ‘gourd’
    galdaob /kɑ̤33-tɔ35/ ‘gourd ladle’ (gal /kɑ̤33-/, presyllable for carved or cut objects)

(3) ghaobgaod /qɔ35-kɔ44/ ‘root of the tongue’
    ghadgaod /qɑ44-kɔ44/ ‘root of a plant’ (ghad /qɑ44-/, presyllable for parts of plants)

Grammatical presyllables, or proclitics (procl), as Shi (2016) prefers to call them, are exemplified by Hmu [ISO 639-3 hea]. Hmu has two proclitics, ab /ʔa33-/ and ghab /qa33-/, preposed to nouns to mark definiteness, as in this example from Shi (2016: 588). The following sentence (4a) is quite acceptable to speakers:

(4a) Mongx mongl ghab jangs liod tiet dail liod dax!
     moŋ55 mo̤ŋ̤11 qa33 ʨa̤ŋ̤13 ljo35 tjhə44 tɛ̤11 ljo35 ta55
     2sg go procl- lair scalper lead clf scalper out
     ‘Go to the lair and lead the scalper out here!’

But if one replaces ghab /qa33-/ with a classifier, which conveys indefiniteness, the sentence (4b) sounds odd to speakers:

(4b) ?Mongx mongl laib jangs liod tiet dail liod dax.
     moŋ55 mo̤ŋ̤11 lɛ33 ʨa̤ŋ̤13 ljo35 tjhə44 tɛ̤11 ljo35 ta55
     2sg go clf lair scalper lead clf scalper out
     ‘Go to a lair and lead a scalper out here.’

Xong has a grammatical presyllable, jid /ʨi44-/, preposed to verbs to mark durativity, as in the following example (5) from the editors’ preface to a collection of folktales (Guizhou Minzu Chubanshe 1958a: 1). “For thousands of generations and hundreds of years,” the editors write, “our generations of old”:

(5a) nax peut ghobqib jidjanb, ghadlot jidchat,
     nɑ31 phə53 qo35-ʨhi35 ʨi44-ʨɛ35 qɑ44-lo53 ʨi44-ʈhɑ53
     only keep nml-belly dur-remember nml-mouth dur-pass
     ‘only remembered them in their hearts, and passed them on with their mouths,’

(5b) ad reux chat baod ad reux,
     ʔɑ44 ʐə31 ʈhɑ53 pɔ44 ʔɑ44 ʐə31
     one generation pass tell one generation
     ‘each generation passing them to the next,’

(5c) nangbnangd jidchat dand maxnend lol.
     nɑ̃35nɑ̃44 ʨi44-ʈhɑ53 tɛ44 mɑ31nẽ44 lo̤11
     therefore dur-pass reach present come
     ‘until they were passed on to the present.’


What is noteworthy in this example is that although the meaning is durative throughout, the form with a presyllable, jidchat /ʨi44-ʈhɑ53/ ‘DUR-pass’, alternates with the form with no presyllable, chat /ʈhɑ53/ ‘pass’. There is much that is still not known about the conditions under which presyllables in HM can be dropped, but it appears that both phonological and grammatical factors are involved. Among phonological factors is rhythm, as in an Iu Mien poem about the flood myth in which the alternation between monosyllabic lueih /lwei31/ and sesquisyllabic m’lueih /m lwei31/ ‘thunder’ is necessary to preserve the seven-syllable line, as seen in example (6) (Purnell 2012: 393–394):

(6a) Houh louh mbiouh zaangc tin- dorngh noic,
     hou31 lou31 bjou31 tsaːŋ11 thin31 tɔŋ31 nɔi11
     calabash gourd float ascend sky place endure
     ‘The gourd floated up into the heavens,’

(6b) faan- dongz lueih dingh fiem- binc ging.
     faːn31 toŋ231 lwei31 tiŋ31 fjem31 pin11 kiŋ44
     overturn move thunder dwelling heart comfort bother
     ‘Banged into thunder’s dwelling alarming him.’

(6c) M’ lueih dox buonv njiec dong koiv,
     m lwei31 to24 pwon453 ɟje11 toŋ44 khɔi453
     presyll thunder pour clf descend east sea
     ‘He sent a message down into the Eastern Sea,’

(6d) Binv mbuox luangh hungh koi suiv zingh.
     pin453 bwo24 lwaŋ31 huŋ31 khɔi44 swi453 tsiŋ31
     say tell dragon king open water gate
     ‘Asking the Dragon King to open the water gate.’

Grammatical factors are exemplified by the rules governing the presence or absence of the presyllable ghaob- /qɑ41-/ ~ aob- /ɑ41-/ ~ ob- /o41-/ ‘nominal’ in the Fenghuang dialect of Xong (Sposato 2015: 188–189)3.

1. ghaob- /qɑ41-/ ~ aob- /ɑ41-/ ~ ob- /o41-/ is obligatory when the noun is (i) in clause-initial position, (ii) in fronted preverbal position, (iii) in a possessive construction involving the marker naond, or (iv) immediately following the copula nins /ᶇĩ̤22/, as illustrated in examples (7) to (10).

3 Examples in the Fenghuang dialect of Xong are written in Sposato’s practical orthography, which, unlike the indigenous orthography, uses hyphen to mark morpheme boundaries.




(7) Aob-roub id-lons leut-gheul geud roub zox.
    ɑ41-ʐɛɯ41 i43-lõ̤22 lɤ14-qɤ̤43 kɤ43 ʐɛɯ41 tso454
    nml-stone dur-gather top-place hold stone smash
    ‘(The villagers) gathered stones at the top (of the cliff, then they) used the stones to attack (the bandits).’

(8) Beul ghaob-nhaub, ghaob-hlat, ghaob-mok at mex.
    pɤ̤43 qɑ41-ɳau41 qɑ41-lɦa14 qɑ41-mo22 a14 me454
    3 nml-seed nml-rope nml-sickle all exist
    ‘(They’ve got everything at that store,) they’ve got seeds, ropes, and sickles.’

(9) Beul naond ob-doul gueub npeif.npeif!
    pɤ̤43 nɑ̃43 o41-tɛ̤ɯ̤43 qwɤ41 mbɤ̤i̤21 mbɤ̤i̤21
    3 poss nml-hand white ideo
    ‘His hands are so white!’

(10) Beul jix nins ghaob-Xonb.
     pɤ̤43 ʨi454 ᶇĩ̤22 qɑ41-ɕõ41
     3 neg cop nml-Miao
     ‘He’s not Miao.’

2. ghaob- /qɑ41-/ ~ aob- /ɑ41-/ ~ ob- /o41-/ is optional when the noun is (i) preceded by a classifier, (ii) the initial element in an attributive compound, or (iii) immediately following a verb other than the copula, as seen in examples (11) to (13).

(11) oub-leb (ghaob-)dab
     ɛɯ41-lɛ41 (qɑ41-)tɑ41
     two-clf (nml-)box
     ‘two boxes’

(12) (ghaob-)deud-guoud
     (qɑ41-)tɤ43-qwɛɯ43
     (nml-)skin-dog
     ‘dog-skin’

(13) Wel lis nieus (ghaob-)njib.
     ʋe̤43 lji̤22 njɤ̤22 (qɑ41-)nʥi41
     1sg want buy (nml-)scissors
     ‘I want to buy some scissors.’


3. ghaob- /qɑ41-/ ~ aob- /ɑ41-/ ~ ob- /o41-/ is obligatorily dropped when the noun is the non-initial constituent in an attributive compound, as seen in (14a) (compared to [14b] which is never acceptable).

(14a) sond-doul
      sõ43-tɛ̤ɯ̤43
      bone-hand
      ‘hand bones, the bones of one’s hands’

(14b) *sond-ghaob-doul
      *sõ43-qɑ41-tɛ̤ɯ̤43
      *bone-NML-hand
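The three rules above amount to a lookup from syntactic context to the status of the presyllable. The following Python fragment is only a sketch of that lookup: the `Context` labels and the function name are hypothetical conveniences introduced here, not terminology from Sposato (2015), and the sketch assumes that each noun token has already been assigned exactly one of the eight contexts.

```python
from enum import Enum, auto


class Context(Enum):
    """Hypothetical labels for the eight contexts named in rules 1 to 3."""
    CLAUSE_INITIAL = auto()
    FRONTED_PREVERBAL = auto()
    POSSESSED_WITH_NAOND = auto()
    AFTER_COPULA = auto()
    AFTER_CLASSIFIER = auto()
    INITIAL_IN_COMPOUND = auto()
    AFTER_NON_COPULA_VERB = auto()
    NON_INITIAL_IN_COMPOUND = auto()


# Rule 1: contexts in which the nominal presyllable must appear.
OBLIGATORY = {
    Context.CLAUSE_INITIAL,
    Context.FRONTED_PREVERBAL,
    Context.POSSESSED_WITH_NAOND,
    Context.AFTER_COPULA,
}

# Rule 2: contexts in which it may appear or be left out.
OPTIONAL = {
    Context.AFTER_CLASSIFIER,
    Context.INITIAL_IN_COMPOUND,
    Context.AFTER_NON_COPULA_VERB,
}


def presyllable_status(context: Context) -> str:
    """Return 'obligatory', 'optional', or 'dropped' for a noun in this context."""
    if context in OBLIGATORY:
        return "obligatory"
    if context in OPTIONAL:
        return "optional"
    # Rule 3: non-initial constituent of an attributive compound.
    return "dropped"


print(presyllable_status(Context.AFTER_COPULA))             # example (10)
print(presyllable_status(Context.NON_INITIAL_IN_COMPOUND))  # example (14a)
```

The sketch does not model the choice among the variants ghaob-, aob-, and ob-, which the rules above leave open.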

16.2.2 Segments

HM languages are noteworthy for the size of their consonant inventories, which often include contrasts such as uvular/velar, voiced/voiceless sonorant, and plain/prenasalized stop that are unusual world-wide yet typical of HM. Tables 1 to 4 present the syllable onsets and rhymes of Iu Mien, representing the Mienic branch of the family, and of Xong, representing the Hmongic branch.

Tab. 1: Iu Mien syllable onsets.
Voiceless unaspirated stops and affricates | p pj pw t tj tw ts tsj tsw ʨ ʨw k kw kjw ʔ
Voiceless aspirated stops and affricates | ph phj phw th thj thw tsh tshj tshw ʨh ʨhw kh khw khjw
Voiced stops and affricates | b bj bw d dj dw dz dzj dzw ʥ ʥw ɡ ɡw ɡjw
Voiced nasals | m mj mw n nj ᶇ ᶇw ŋ ŋw
Voiceless nasals | m̥ m̥j m̥w n̥ n̥j ᶇ̥ ᶇ̥w ŋ̥
Approximants | w l lj lw j jw
Voiceless fricatives | f fj fw ɬ ɬj ɬw s sj sw ɕ ɕw h hw hjw

Tab. 2: Iu Mien rhymes.
Open | i e ɛ a ɔ u ə ɿ
Final -i, -u | iu ei eu aːi ai ɔi aːu au uːi ui ou
Final -m | iːm im eːm aːm am ɔm om
Final -n | iːn in eːn en aːn an ɔn on un ən
Final -ŋ | iːŋ eŋ ɛŋ aːŋ aŋ ɔŋ oŋ
Final -p | ip ep aːp ap ɔp op
Final -t | it et aːt at ɔt ot ut
Final -k | ek ak ɔk ok
Final -ʔ | eʔ ɛʔ aʔ ɔʔ oʔ

Tab. 3: Xong syllable onsets.
Plain stops and affricates | p pj pʐ t ts ʈ ʨ c k kw q qw
Aspirated stops and affricates | ph pjh phʐ th tsh ʈh ʨh ch kh kwh qh qwh
Prenasalized stops and affricates | mp nt nts ɳʈ ᶇʨ ɲc ŋk ŋkw ɴq ɴqwh; mph mphʐ ntsh ntsh ɳʈh ᶇʨh ɲch ŋkh ŋkwh ɴqh
Voiced nasals | m mj mʐ n ɳ ᶇ ŋ ŋw
Voiceless aspirated nasals | m̥h n̥h
Voiced continuants | w l lj ʐ ʑ
Voiceless continuants | l̥h l̥jh s ʂ ɕ h hw

Tab. 4: Xong rhymes.
Oral | i e ei ɛ æ ɯ ɚ ə ɑ u o ɔ
Nasalized | ĩ ẽ ɑ̃


16.2.3 Tones4

Proto-Hmong-Mien [PHM] had three tones on sonorant-final syllables, conventionally referred to as A, B, and C. Stop-final syllables constituted a fourth tone, D. In most HM languages, each of the tones split into two or more tones conditioned by the phonation type of the initial consonant. In Iu Mien each of the tones split into two tones depending on whether the PHM initial consonant was voiceless or voiced (Table 5).

Tab. 5: Hmong-Mien tone split reflected in Iu Mien.
PHM tone class | PHM onset type | Iu Mien
A | –voice | tɔn 44 ‘son’
A | +voice | taːi 31 ‘to come’
B | –voice | twei 453 ‘tail’
B | +voice | tə̤ṳ 231 ‘fire’
C | –voice | tai 24 ‘to kill’
C | +voice | ta̰ḭ 11 ‘to die’
D | –voice | ʨop 55 ‘bear’ (n.)
D | +voice | top 11 ‘bean’
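The split in Table 5 can be read as a mapping from a PHM tone class and the voicing of the PHM onset to a modern Iu Mien tone. The short Python sketch below encodes that mapping using the Chao tone values of the example words in Table 5; the dictionary and function names are illustrative only, and phonation differences (breathy, creaky) are deliberately left out.

```python
# (PHM tone class, PHM onset voiced?) -> Iu Mien tone in Chao digits,
# read off the example words in Table 5.
IU_MIEN_TONE = {
    ("A", False): "44",
    ("A", True): "31",
    ("B", False): "453",
    ("B", True): "231",
    ("C", False): "24",
    ("C", True): "11",
    ("D", False): "55",
    ("D", True): "11",
}


def iu_mien_tone(tone_class: str, voiced_onset: bool) -> str:
    """Look up the Iu Mien reflex of a PHM tone class for a given onset type."""
    return IU_MIEN_TONE[(tone_class, voiced_onset)]


# Classes A, B, C occur on sonorant-final syllables; class D on stop-final ones.
sonorant_final = {iu_mien_tone(c, v) for c in "ABC" for v in (False, True)}
stop_final = {iu_mien_tone("D", v) for v in (False, True)}

print(len(sonorant_final), len(stop_final))  # 6 2
```

Counting the distinct values per syllable type gives six tones on sonorant-final syllables and two on stop-final syllables.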

Table 5 shows that Iu Mien has six tones on sonorant-final syllables and two tones on stop-final syllables. In another Mienic language, Mun [ISO 639-3 mji], each of the PHM tones split into three tones depending on whether the PHM initial consonant was unaspirated, aspirated, or voiced. In the Houei Sai dialect of Mun the pattern is as shown in Table 6.

Tab. 6: Hmong-Mien tone split reflected in Houei Sai Mun.
PHM tone class | PHM onset type | Houei Sai Mun
A | –aspirated | tɔːn 534 ‘son’
A | +aspirated | tin 554 ‘thousand’
A | +voice | taːi 31 ‘to come’
B | –aspirated | tei 44 ‘tail’
B | +aspirated | sa̤ːm̰ 33 ‘blood’
B | +voice | toṵ 53 ‘fire’
C | –aspirated | tai 24 ‘to kill’
C | +aspirated | taːn 554 ‘charcoal’
C | +voice | ta̰ḭ 22 ‘to die’

4 Tones are transcribed with Chao notation. Phonation is indicated as follows: breathy with subscript dieresis and creaky with subscript tilde. To facilitate comparison among the different languages, the transcriptions in this subsection are broadly phonetic rather than phonemic.



Tab. 6 (continued)
PHM tone class | PHM onset type | Houei Sai Mun
D (V̆) | –aspirated | kjap 22 ‘bear’ (n.)
D (V̆) | +aspirated | hɔp 55 ‘to drink’
D (V̆) | +voice | tɔp 33 ‘bean’
D (Vː) | –aspirated | ʔaːp 42 ‘duck’
D (Vː) | +aspirated | gaːt 55 ‘thirsty’5
D (Vː) | +voice | bjaːt 33 ‘peppery’

In theory, since the three proto-tones on sonorant-final syllables each split three ways, the Houei Sai variety of Mun ought to have nine tones on sonorant-final syllables, but Table 6 shows that the tone of ‘thousand’ merged with the tone of ‘charcoal’ so there are only eight, as well as three tones on stop-final syllables with a short vowel and three tones on stop-final syllables with a long vowel.

Proto-Hmongic [PH] lost final stops. All four tones occurred on sonorant-final syllables. In other respects, the patterns of tonal development in Hmongic languages are similar to the patterns in Mienic languages. In Hmu, for example, the pattern is as presented in Table 7.

Tab. 7: Hmong-Mien tone split reflected in Hmu.
PH tone class | PH onset type | Hmu
A | –voice | tɛ 33 ‘son’
A | +voice | ta 55 ‘to come’
B | –voice | tɛ 35 ‘tail’
B | +voice | tṳ 11 ‘fire’
C | –voice | tɛ 44 ‘to sever’
C | +voice | ta̤ 13 ‘to die’
D | –voice | ta 33 ‘wing’
D | +voice | tə 31 ‘bean’

Table 7 shows that Hmu has eight tones, all on sonorant-final syllables.

Tables 5–7 show that tones in HM are characterized by phonation as well as pitch:
1. Modal phonation, e.g. Iu Mien tɔn 44, Houei Sai Mun tɔːn 534, Hmu tɛ 33 ‘son’.
2. Breathy phonation, e.g. Iu Mien tə̤ṳ 231, Hmu tṳ 11 ‘fire’.
3. Breathy phonation followed by creaky phonation, e.g. Houei Sai Mun sa̤ːm̰ 33 ‘blood’.

5 The Mun voiced onsets in ‘thirsty’ and ‘peppery’ are derived from PHM prenasalized onsets.

4. Creaky phonation distributed throughout the pitch contour, e.g. Iu Mien ta̰ḭ 11, Houei Sai Mun ta̰ḭ 22 ‘to die’.
5. Creaky phonation concentrated at the end of the pitch contour, e.g. Houei Sai Mun toṵ 53 ‘fire’.

Phonation contrasts in the tones of HM languages may derive from phonation contrasts in PHM tones and from transphonologization of phonation contrasts in initial consonants, as is suggested by a comparison of tones on sonorant-final syllables in Iu Mien, Houei Sai Mun, and a third Mienic language, Biao Min [ISO 639-3 bje] (Table 8).

Tab. 8: Origins of phonation contrasts.
PHM tone class | PHM onset type | Iu Mien | Houei Sai Mun | Biao Min | Gloss
A | –aspirated | pjei 44 | pjei 534 | pli 44 | ‘hair’
A | +aspirated | n̥hɔi 44 | nɔːi 554 | n̥wai 44 | ‘day’
A | +voice | pjaːŋ 31 | faŋ 31 | pja̤ŋ̤ 21 | ‘flower’
B | –aspirated | pjau 453 | pjau 44 | pla 35 | ‘house’
B | +aspirated | n̥hiə 453 | ni̤ḭ 33 | n̥i 35 | ‘heavy’
B | +voice | bja̤ṳ 231 | bjaṵ 53 | blaʔ 42 | ‘fish’
C | –aspirated | tai 24 | tai 24 | tai 24 | ‘to kill’
C | +aspirated | thaːn 24 | taːn 554 | than 24 | ‘charcoal’
C | +voice | ta̰ḭ 11 | ta̰ḭ 22 | taiʔ 42 | ‘to die’

Table 8 suggests that:
1. Proto-tone B may have involved final glottal stop, preserved in Biao Min blaʔ 42 ‘fish’ and becoming creaky phonation at the end of the pitch contour in Houei Sai Mun ni̤ḭ 33 ‘heavy’ and bjaṵ 53 ‘fish’.
2. Proto-tone C may have involved creaky phonation, preserved in Iu Mien ta̰ḭ 11 and Houei Sai Mun ta̰ḭ 22 ‘to die’ and becoming final glottal stop in the Biao Min cognate taiʔ 42.
3. The breathy phonation in the first part of the pitch contour in Houei Sai Mun ni̤ḭ 33 ‘heavy’ may be a transphonologization of aspiration of the initial consonant, preserved in the Iu Mien cognate n̥hiə 453.

16.2.4 Tone sandhi

Two very different types of tone sandhi in HM are exemplified by Iu Mien and Sichuan-Guizhou-Yunnan Hmong.

In Iu Mien, sandhi involves neutralization of tonal contrasts and affects syllables in non-final position in a phrase. In non-final position, syllables which end in a sonorant have /31/ regardless of what the underlying tone is and syllables which end in a stop have /11/ regardless of what the underlying tone is. These rules are illustrated in Table 9.

Tab. 9: Iu Mien tone sandhi.
/wom44/ ‘water’ + /nam24/ ‘cold’ > /wom31 nam24/ ‘cold water’
/ŋoŋ31/ ‘cow’ + /tɔn44/ ‘offspring’ > /ŋoŋ31 tɔn44/ ‘calf’
/ʨau453/ ‘road’ + /hep11/ ‘narrow’ > /ʨau31 hep11/ ‘narrow road’
/tuŋ231/ ‘pig’ + /tɔn44/ ‘offspring’ > /tuŋ31 tɔn44/ ‘piglet’
/fai24/ ‘small’ + /fai24/ ‘small’ > /fai31 fai24/ ‘quite small’
/mwo11/ ‘hat’ + /ɬjeʔ55/ ‘metal’ > /mwo31 ɬjeʔ55/ ‘helmet’
/ʔaːp55/ ‘duck’ + /tɔn44/ ‘offspring’ > /ʔaːp11 tɔn44/ ‘duckling’
/top11/ ‘bean’ + /ᶇa31/ ‘tooth’ > /top11 ᶇa31/ ‘bean sprout’
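The Iu Mien rule is simple enough to state as a few lines of code. The Python sketch below applies it to syllables written as in Table 9, that is, IPA segments followed by Chao tone digits. The function names are illustrative, and the set of final stops (p, t, k, ʔ) is an assumption read off the rhymes in Table 2; in particular, treating final ʔ as a stop for this purpose is not confirmed by the examples in Table 9.

```python
# Finals counted as stops. Including the glottal stop is an assumption.
FINAL_STOPS = ("p", "t", "k", "ʔ")
DIGITS = "0123456789"


def split_syllable(syllable: str) -> tuple[str, str]:
    """Split e.g. 'wom44' into its segments 'wom' and tone digits '44'."""
    segments = syllable.rstrip(DIGITS)
    return segments, syllable[len(segments):]


def apply_sandhi(phrase: list[str]) -> list[str]:
    """Neutralize the tone of every non-final syllable in a phrase."""
    result = []
    for index, syllable in enumerate(phrase):
        segments, tone = split_syllable(syllable)
        if index < len(phrase) - 1:
            # Non-final position: stop-final -> 11, sonorant-final -> 31.
            tone = "11" if segments.endswith(FINAL_STOPS) else "31"
        result.append(segments + tone)
    return result


print(apply_sandhi(["wom44", "nam24"]))    # ['wom31', 'nam24']
print(apply_sandhi(["ʔaːp55", "tɔn44"]))   # ['ʔaːp11', 'tɔn44']
```

The final syllable is always left untouched, which is why /top11 ᶇa31/ and /ŋoŋ31 tɔn44/ look unchanged: their first syllables already carry the neutralized tone.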

In Sichuan-Guizhou-Yunnan Hmong, sandhi is conditioned by the tone of the first syllable of the phrase and affects the tone of the second syllable of the phrase. The rules are best described in historical terms. The description below refers to the Green Mong variety [ISO 639-3 hnj] of Sichuan-Guizhou-Yunnan Hmong, but in other varieties the rules are similar. The historical sources of the Green Mong tones are set forth in Table 10.

Tab. 10: Sources of Green Mong tones.
PH onset \ PH tone class | A | B | C | D
–voice | 55 | 24 | 33 | 22
+voice | 52 | 4̤2̤ | 4̤2̤ | 2̰1̰

Sandhi occurs when the first syllable of the phrase has a tone from the A column. The tonal changes occur in the second syllable of the phrase. In the first row of Table 11, tones in the B and C columns move one column to the right. In the second row, tonal contrasts are neutralized. Tones in the second row all remain or change to /4̤2̤/.

Tab. 11: Green Mong tone sandhi.
PH onset \ PH tone class | A | B | C | D
–voice | | /24/ > /33/ | /33/ > /22/ | /22/
+voice | /52/ > /4̤2̤/ | /4̤2̤/ | /4̤2̤/ | /2̰1̰/ > /4̤2̤/
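As a rough illustration, the Green Mong pattern can be written as a trigger set plus a replacement table. The Python sketch below is based on our reading of the prose description together with Table 10 (B and C tones after voiceless onsets shift one column to the right; tones after voiced onsets neutralize to breathy 42). The plain-text labels "42 breathy" and "21 creaky" and the function name are conventions of this sketch only, and real sandhi is also subject to lexical and syntactic conditions that are not modelled here.

```python
# Tones of the A column in Table 10 trigger sandhi on the following syllable.
TRIGGER_TONES = {"55", "52"}

# Changes undergone by the second syllable, keyed by its underlying tone.
# "42 breathy" and "21 creaky" stand in for the tones written with diacritics.
SANDHI = {
    "24": "33",                 # B, voiceless onset: one column to the right
    "33": "22",                 # C, voiceless onset: one column to the right
    "52": "42 breathy",         # A, voiced onset
    "21 creaky": "42 breathy",  # D, voiced onset
}


def second_syllable_tone(first_tone: str, second_tone: str) -> str:
    """Return the surface tone of the second syllable of a two-syllable phrase."""
    if first_tone not in TRIGGER_TONES:
        return second_tone
    # Tones without an entry (55, 22, 42 breathy) surface unchanged.
    return SANDHI.get(second_tone, second_tone)


print(second_syllable_tone("55", "24"))         # 33
print(second_syllable_tone("52", "21 creaky"))  # 42 breathy
print(second_syllable_tone("24", "24"))         # 24
```

Because the table is keyed by the modern tone, the two historical sources of breathy 42 (B and C after voiced onsets) need no separate entries: both simply remain 42.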


16.3 Morphology

Morphologically, HM languages are predominantly isolating. Verbs are not inflected for TAM, person, or number. Nouns are not inflected for number or case. Derivation is through compounding, as in White Hmong [ISO 639-3 mww] and Green Mong kev /ke24/ ‘way’ + sib /ʂi55/ ‘RECP’ + hlub /l̥u55/ ‘love’ (verb) > kev sib hlub /ke24 ʂi55 l̥u55/ ‘love’ (noun).

One exception is the Xong durative marker jid /ʨi44-/, discussed in subsection 16.2.1. Another exception is the classifier in A-Hmao [ISO 639-3 hmd], which has three forms. The form with the vowel unchanged is augmentative. It refers to things considered grand and imposing. The form with the vowel changed to /ai/ refers to ordinary things. The form with the vowel changed to /a/ refers to things that are few in number, delicate, or lovable. In a story about a group of monkeys who steal a man’s hats, the narrator uses the unchanged form of /ti55/ ‘classifier for a group of things’:

(15) ti55 kau11 hi33–55 bɦo31
     clf.aug hat neg see
     ‘There were no hats’

But the man in the story shouts to the monkeys:

(16) ma55 ku55 tai55 kau11 ʈai11 ʈhau33 ku55
     take 1sg clf.ordinary hat return give 1sg
     ‘Give my hats back!’

The narrator considers the hats grand and imposing so he uses ti55 for the classifier. The owner of the hats in the story, however, is humble and treats the hats as being rather ordinary so he uses the form tai55 (Wang Fushi 1972: 167). In a folktale, a woman is referred to as i55 lɯ55 a33bɦo35 (‘one CLF.AUG woman’) with the unchanged form of lɯ55 ‘classifier for human beings’, but her lovable little daughter is i55 la35 ntshai11 (‘one CLF.DIM daughter’),6 with the diminutive /a/-form of the classifier:

(17a) ᶇi31 hi11 pi55dɦau31 ku11 a33thau33li33
      people tell story rel long.ago
      ‘People tell the story that long ago’

(17b) mɦa35 i55 lɯ55 a33bɦo35 ku11 ntsi33 ᶇi11bo55bu55bɦa11,
      exist one clf.aug woman rel be.named pn
      ‘there was a woman named ᶇi11bo55bu55bɦa11,’

(17c) mɦa35 tau33–11 i55 la35 ntshai11
      have get one clf.dim daughter
      ‘who gave birth to a daughter’

(17d) ku11 zau33 ᶇi31 qhau55 ta55ku11gɦi11
      rel good people kiss extremely
      ‘who was very lovable’
      (Wang Deguang 1986: 69; story told by Zhang Minglao)

6 In /i55 la35 ntshai11/ the classifier also undergoes tone sandhi.

An example of derivational morphology is the lexicalization of a tonal prosody in White Hmong and Green Mong. In White Hmong and Green Mong, tone /2̰1̰/ (low falling creaky) has a prosodic variant, [13] (low rising) ~ [213] (low falling rising), used in utterance-final position, as in the following pairs of contrastive examples in White Hmong:

(18a) Nyob ntawm lub tsev.
      ɲɔ55 ndɤ̰ɯ̰21 lu55 tʂe24
      be.located at clf house
      ‘It’s at the house.’

(18b) Nyob ntawd.
      ɲɔ55 ndɤɯ13
      be.located there
      ‘It’s there.’

(19a) kuv tus niam
      ku24 tu22 nḭa̰21
      1sg clf mother
      ‘my mother’

(19b) Niad!
      nia13
      mother!
      ‘Mother!’

This low-rising prosody has been lexicalized to create a pronoun, White Hmong and Green Mong nkawd /ŋɡɤɯ13/ ‘third person dual’, which is derived from the noun nkawm /ŋɡɤ̰ɯ̰21/ ‘a pair, a couple’, but which behaves as a regular member of the lexicon, not restricted to utterance-final position, as in a White Hmong folktale in which the narrator describes the parents’ reaction upon learning that one of their daughters has been killed and dismembered by a tiger:

(20) Ces nkawd cem cem niag tsov tas zog.
     ce22 ŋɡɤɯ13 cḛ21 cḛ21 ni̤a̤42 tʂɔ24 ta22 ʐɔ̤42
     then 3du scold scold nml tiger finish ints
     ‘Then they [dual] were furious at the tiger.’
     (Johnson 1985: 417, story told by Maiv Yaj)

16.4 Syntactic typology

16.4.1 Word order within constituents

The principal word orders within the noun phrase in HM are not consistent with regard to the order of head and dependents. There is both dependent-head word order, suggesting contact with Sinitic, and head-dependent word order. As the following examples illustrate, dependent-head word order may be more common in Mienic languages (here exemplified by Iu Mien) than in Hmongic languages (here exemplified by Xong, Hmu, and Green Mong), but dependent-head word order does occur in Xong and head-dependent word order does occur in Iu Mien.

Demonstrative + Noun, e.g. (21):

Iu Mien:
(21) mh norm norqc
     m31 nɔm44 nɔʔ11
     this clf bird
     ‘this bird’

Noun + Demonstrative, e.g. (22) to (24):

Xong:
(22) hant gut nend
     hɛ53 ku53 nẽ44
     pl story this
     ‘these stories’

Hmu:
(23) dol gheib nend
     to̤11 qei33 nen44
     pl chicken that
     ‘those chickens’

Green Mong:
(24) phau ntawv nuav
     phau33 ndɤɯ24 noə24
     clf book this
     ‘this book’

Relative Clause + Noun, e.g. (25) to (27):

Xong:
(25) boub louxdongd-louxnius danglend-danglmongs nangd ndeud xongb
     pɯ35 lɯ31tõ44lɯ31ᶇṳ42 tɑ̤̃33lẽ44tɑ̤̃33mõ̤42 nɑ̃44 ntə44 ɕõ35
     1pl long.winter-long.day wait-attend rel book Miao
     ‘the Miao written language that we have long awaited’7

7 The hyphens in the orthographic form of (25) are an exception to the rule that in indigenous HmongMien orthographies morpheme boundaries are unmarked.



(26) nib kianx guex nangd nex dadxib
     ᶇi35 chɛ31 kwe31 nɑ̃44 ne31 tɑ44ɕi35
     be.located Guizhou country rel person nml.all
     ‘everyone who lives in Guizhou’

Iu Mien:
(27) dongh hnyouv nyei mienh
     toŋ31 ᶇ̥ou453 ᶇei44 mjen31
     be.similar alimentary.canal rel person
     ‘people who are in full agreement’

Noun + Relative Clause, e.g. (28) to (30):

Hmu:
(28) dol zaid dal gheib id
     to̤11 tsɛ35 ta̤11 qei33 ʔi35
     pl family lose chicken that
     ‘those families who had lost their chickens’

Green Mong:
(29) qhov kws tseem ceeb tshaaj plawg
     qhɔ24 kʉ22 tʂḛŋ̰21 ceŋ55 tʂhaŋ52 plɤɯ4̤2̤
     thing rel be.important most ints
     ‘the thing which is most important’

Iu Mien:
(30) mbuo mienh maaih uix
     bwo44 mjen31 maːi31 ʔui24
     pl person have defile
     ‘people who are ritually unclean’

Modifier + Head, as in (31):

Iu Mien:
(31) ndiangx- biauv
     djaŋ31 pjau453
     wood house
     ‘a wooden house’

Head + Modifier, e.g. (32) to (36):

Xong:
(32) ad ngongl deb nenb ghueub
     ɑ44 ŋõ̤33 te35 nẽ35 qwə35
     one clf child snake white
     ‘a little white snake’

(33) denb xongb
     tẽ35 ɕõ35
     country Miao
     ‘the Miao region’

Hmu:
(34) vongx hlieb vongx yut
ɣoŋ55 ɬjhə33 ɣoŋ55 ʑu44
dragon big dragon small
‘great and small dragons’

(35) mos dud
mo̤13 tu35
hat paper
‘a paper hat’

Iu Mien:
(36) douz henz haic
tou231 hen231 hai11
fire strong very
‘very strong fire’

16.4.2 Ordering of constituents to form clauses

The most common word order in intransitive clauses is SV, but VS occurs as well, as the following examples illustrate.

SV, e.g. (37) to (39):

Xong:
(37) Ghobbleid xeb nis ad leb nex ngoub ndeud.
qo35-pʐei44 ɕe35 ᶇi̤42 ɑ44 le35 ne31 ŋɯ35 ntə44
nml-eldest son-in-law cop one clf person study book
‘The eldest son-in-law was a student.’

Hmu:
(38) Jox jux det id muk yangx.
ʨo55 ʨu55 tə44 ʔi35 mu53 ʑaŋ55
clf bridge wood that rot pfv
‘That wooden bridge is rotten now.’




Iu Mien:
(39) Meih mbuo mv zuqv nzauh hex.
mei31 bwo44 m453 tsuʔ55 dzau31 he31
2 pl neg advrs be.sad decl
‘You need not be sad.’

The order VS is seen in (40):

Xong (Fenghuang dialect):
(40) Xub aub hint.
ɕu41 æu41 hĩ14
be.small water very
‘The water pressure (in our home) is really low.’

The most common word order in transitive clauses is AVP, as seen in examples (41) to (43).

AVP, e.g. (41) to (43):

Xong:
(41) Boub lies xox ad leb ghobblab xeb.
pɯ35 lje̤42 ɕo31 ɑ44 le35 qo35-pʐɑ35 ɕe35
1pl irr emulate one clf nml-youngest son-in-law
‘We should emulate the youngest son-in-law.’

Hmu:
(42) Dail mif gheib nas git yangx.
tɛ̤11 mi31 qei33 na̤13 ki44 ʑaŋ55
clf female chicken lay egg pfv
‘The hen laid an egg.’

Iu Mien:
(43) Ninh mbuo mingh naaic ninh mbuo nyei fin-saeng aqv.
nin31 bwo44 miŋ31 naːi11 nin31 bwo44 ᶇei44 fin31-sɛŋ44 ʔaʔ55
3 pl go ask 3 pl poss celestial-life decl
‘They went and asked their spiritual teacher.’

For pragmatic reasons, the P argument may be fronted, as in (44):

Xong (Fenghuang dialect):
(44) Aod ngonl at jix beux.
ɑ43 ŋõ̤43 æ14 ʨi454 pɤ454
one clf even neg hit
‘(He) won’t even kill a single (bug).’


The order of constituents in WH-questions is the same as the order of constituents in declarative sentences, as illustrated in the following examples.

SV, e.g. (45) and (46):

Hmu:
(45) Mangx mongl hangd deis, ghaib daib gheib?
maŋ55 mo̤ŋ̤11 haŋ35te̤i̤13 qɛ33tɛ33 qei33
2pl go where small chicken
‘Where are you going, little chickens?’

Iu Mien:
(46) Haaix dauh caux haaix dauh mingh?
haːi24-tau31 tshau24 haːi24-tau31 miŋ31
which-clf with which-clf go
‘Who all is going?’

AVP, e.g. (47) to (49):

Xong:
(47) Gheab wel jiddas mongd daot ghobnangb?
qæ35 we̤33 ʨi44-tɑ̤42 mõ44 tɔ53 qo35-nɑ̃35
bite 1sg comp-die 2sg get nml-what
‘If you kill me, what will you get?’

Hmu:
(48) Mangx dax ait gheix xid?
maŋ55 ta55 ʔɛ44 qei55ɕi35
2pl come do what
‘What have you come to do?’

Iu Mien:
(49) Meih daaih lorz haivdauh?
mei31 taːi31 lɔ213 hai453tau31
2sg come seek who
‘Who have you come to look for?’

The order of constituents in YES-NO questions is similar to the order of constituents in declarative sentences, as illustrated in the following examples. The question can be unmarked, as in (50).

Hmu:
(50) Mongx nangx laib gib mongl, baif?
moŋ55 naŋ55 lɛ33 ki33 mo̤ŋ̤11 pɛ31
2sg eat clf snail go cat
‘Did you eat the snail, cat?’




An overt sentence-final question marker may also be added (SV Q; AVP Q), as in (51) and (52):

Iu Mien:
(51) Meih yiem naaiv lauh nyei saah?
mei31 jem44 naːi453 lau31 ᶇei44 sa31
2sg be.located dem be.long adv q
‘You will be here for a long time, won’t you?’

Iu Mien:
(52) Ninh nimc hungh diex nyei ga’naaiv fai?
nin31 nim11 huŋ31 tje24 ᶇei44 ka naːi453 fai44
3sg steal king father poss thing q
‘Did he steal the king’s things?’

Alternatively, the verb is repeated with the negator (S V-NEG-V), e.g. (53):

Xong:
(53) Mongd kint jex kint nax lies jidlieas nangd.
mõ44 chĩ53 ʨe31 chĩ53 nɑ31 lje̤42 ʨi44-ljæ̤42 nɑ̃44
2sg be.willing neg be.willing or irr recp-exchange nml
‘Are you willing to exchange?’

In (54), the negated existential copula alone is used to form a question.

Xong:
(54) Mongd mex lut fangd las fangd jex mex?
mõ44 me31 lu53 fɑ̃44 lɑ̤42 fɑ̃44 ʨe31 me31
2sg have ground lie.waste field lie.waste neg have
‘Do you have any uncultivated fields?’

There is a limited set of ditransitive predicates found in HM languages; the archetypal ditransitive verb is ‘give’. The word order in ditransitive clauses is not consistent with regard to the ordering of Theme and Goal/Recipient, as the following examples illustrate.

ATG, e.g. (55):

Iu Mien:
(55) Mienh bun nyaanh ninh.
mjen31 pun44 ᶇaːn31 nin31
person give money 3sg
‘People gave him money.’

AGT, e.g. (56) and (57):


Xong:
(56) Nusreib doub gangs wud ghobmis ghoub ghobyangb.
nṳ42ʐei35 tɯ35 ɡɑ̤̃42 wu44 qo35-mi̤42 qɯ35 qo35-ʑɑ̃35
pn then give 3sg nml-many clf nml-seedling
‘Nu Rei then gave him many seedlings.’

Hmu:
(57) Nenx baib wil benx dud.
nen55 pɛ33 vi̤11 pen55 tu35
3sg give 1sg clf book
‘He gave me a book.’

16.5 Conclusion

HM lies between Sinitic to the north, Tibeto-Burman [TB] to the west, and KD and AA to the south. Phonologically, it is linked to KD and Vietnamese by the typology of its tonal systems, but its large consonant inventories set it apart from other MSEA language groups. Its predominantly isolating morphology links it to KD, AA, and Sinitic, and sets it apart from TB. Syntactically, it is linked to KD, AA, and Sinitic (and set apart from TB) by its predominantly SV/AVP word order, but the variation in word order within noun phrases between dependent-head and head-dependent suggests that HM may be at an intersection zone between Sinitic on the one hand and KD and AA on the other.


David Bradley

17 Typological profile of Burmic languages

17.1 Introduction

The Burmic languages, to use the original term of Shafer (1966–1974), are also known as Burmese-Lolo (Benedict 1972; Bradley 1979b), Lolo-Burmese (Burling 1967; Matisoff 2003) and Mran-Ngwi (Bradley 1995, 2005a). They are now spoken in southwestern China, Myanmar, northern Thailand, northern Laos and northern Vietnam, with small numbers of speakers also in southeastern Bangladesh and northeastern India. The Ngwi or Loloish branch of Burmic, also known as the Yi Branch in China, includes many languages only spoken in southwestern China but also some extending into northern MSEA. The major language of the Burmish or Mran subgroup of Burmic is Burmese; the term Mran for this subgroup reflects the spelling of Myanmar. The speakers of Burmese probably came to what is now Myanmar circa 832 AD as part of the Nanzhao invasion and conquest of the Pyu in the central part of the country (Luce 1959; Stargardt 1990: 78); the other Burmic languages in MSEA are much more recent arrivals. Apart from the speakers of Burmese and its dialects, who have lived in the plains of Myanmar for more than a millennium, all Burmic groups in MSEA traditionally lived in upland settings, and nearly all continue to do so.

There are approximately 35 million mother-tongue speakers of Burmese, and over ten million fluent second-language speakers in Myanmar. Nearly everyone else in Myanmar has at least some knowledge of the language (Bradley 1996a), which has long been spreading to replace other languages: first the distantly related Tibeto-Burman languages Pyu and Kadu in the central plains, later unrelated Austroasiatic Mon further south. Kadu and some smaller related Sak languages, and to a much greater extent Mon, still persist in some areas. Burmese has been the national and official language of Myanmar since independence in 1948, with this status officially recognized in the constitutions of 1947, 1974 and 2008.
https://doi.org/10.1515/9783110558142-017

In Burmese diglossia, the literary High is based on the written version of the language which stabilized in the early 13th century, now pronounced reflecting nearly all subsequent phonological changes in the spoken language. The spoken Low differs from it mainly in the forms of nearly every grammatical function word and some other very frequent words; for example, ‘this’ is /i/ in older literary, /θi/ in less conservative literary and /di/ in spoken language. In most cases there is a one-for-one equivalence, but sometimes there are extra contrasts in one or the other. For a grammar of spoken Burmese, see Okell (1969); for a comparison of literary and spoken Burmese grammatical forms, see Okell and Allott (2001). For a comprehensive and modern grammar of Burmese, primarily spoken but also with some supplementary literary forms, see Jenny and San San Hnin Tun (2016). Most verbs and some nouns also have longer


literary forms. Some of the literary forms are occasionally used in spoken language; for example the literary plural suffix /mjà/ instead of spoken /twe/ ~ /te/. Diglossia is the source of the competing names for the country and the language: literary /mjəma/ ‘Myanmar’ and spoken /bəma/; the English terms Burma, Burman and Burmese come from the latter. In this chapter, the nation is called Myanmar and the language is called Burmese; the majority Burman ethnic group is now called Bamar in Myanmar. Apart from standard Burmese, there are many regional varieties which derive from the same original migration but have differentiated since then: Rakhine (Arakanese) in the west and into Bangladesh and India, Tavoyan in the southeast, Intha around Inle Lake and so on, with a total of over 2.6 million speakers; eight of these groups are recognized as separate ethnic groups in Myanmar, distinct from the Bamar majority group who speak Burmese. All are more conservative phonologically than standard Burmese, nearly all speakers also speak the standard fluently, and there is gradual convergence towards and in some areas replacement by the standard. Two more divergent languages of this group, Hpun in northern Myanmar and Gong in western central Thailand, are now disappearing; Hpun has no speakers left, Gong has only very few, all over 50. In Dehong Prefecture in western Yunnan, China, there are various small groups who speak North Burmish languages closely related to Burmese; this is probably the area from which the Burmans originally came nearly 1,200 years ago. These groups now also extend across the border into northeastern Myanmar; most are connected with the Jinghpaw as part of the Kachin culture complex, some Ngochang are not. The four main groups are Ngochang (Achang, Maingtha), Zaiwa (Atsi, Zi), Lawngwaw (Langsu, Maru) and Lachik (Lashi), each with substantial internal linguistic differences. 
All four are recognized as ethnic groups of Myanmar, with a total population of about 195,000 in Myanmar and over 150,000 in China. Most of those in Myanmar and some in China speak their group language, Jinghpaw Kachin and sometimes another North Burmish language, as language exogamy is normal in the area (Bradley 1996b).

The other branch of Burmic or Mran-Ngwi is Ngwi, an old autonym originally based on the Tibeto-Burman etymon for ‘silver’ (Bradley 2005a); Lolo is an old pejorative Chinese exonym for some Ngwi groups. A post-1950 Chinese exonym is the basis for the current Chinese term Yi Branch. Major Ngwi languages of MSEA are Lisu (Bradley 2017a, 2020b), Lahu (Matisoff 1982, 2017), Akha (Hansson 2017) and Phunoi (Bradley 1977), all with various subvarieties. There are over 1.2 million speakers of Ngwi languages in MSEA, with far more in China and elsewhere; for example, there are about 770,000 Lisu in China, about 330,000 in Myanmar, about 75,000 in Thailand and 3,000 in India. There are about 395,000 Akha, 325,000 Lahu and 42,000 Phunoi in MSEA; other Ngwi languages of small groups living mainly or only in MSEA include Bisu, Akeu/Gokhu, Sila and Mpi. Some groups mainly living in China also extend into MSEA, such as the large groups Hani, Phula and Nisu into Vietnam and the small groups Nusu and Laomian into Myanmar. A much larger and more diverse variety of Ngwi languages is spoken only in China; see Bradley (1979b) for their subclassification and Bradley (2020) for current information on distribution and populations. Their migrations into MSEA are fairly recent and in some cases ongoing; for example, the Lisu have been coming into northeastern Myanmar from China since circa 1820, reaching Thailand from 1919; also into far northern Myanmar since circa 1880 and thence to northeastern India from 1942. Most Ngwi groups in MSEA are officially recognized as ethnic groups where they live in Myanmar, Thailand, Laos and Vietnam. The smallest groups, such as Akeu/Gokhu, Nusu and Laomian, are not officially recognized in MSEA. Most speakers of Ngwi languages also speak the national language and sometimes other local lingua francas; Lahu is one such local lingua franca, used by Lisu, Akha, Wa and others in the Eastern Shan State of Myanmar (Bradley 1996c); Lisu is another, widely used in Putao District of the Kachin State.

Many of the languages of smaller groups within Burmic are endangered or disappearing. For example, Hpun, a Burmish language formerly spoken in northeastern Myanmar, has been completely replaced by Burmese over the last 70 years; Bisu in eastern Myanmar and northern Thailand has been receding for more than a century in the face of Tai languages, though it is still spoken in two villages in Thailand by about 400 people, two villages in Burma by about 600 people, and one village in China by about 240 people. Gong in western central Thailand is in the final stages of replacement by Thai (Bradley and Bradley 2019).

The ethnic classification of some Ngwi groups is not always congruent with linguistic differences. For example, Lahu is now classified as two ethnic groups in Burma, Lahu and Kwi or Lahu Shi (‘yellow Lahu’), and was formerly classified into the same two groups in Laos but is now combined as Lahu. In Vietnam, the local variety of Lahu was formerly called Cosung from a Chinese exonym, Kucong, but has now been renamed Lahu according to their autonym.
In China the Lahu Shi have always been classified as Lahu, and the Kucong were reclassified as Lahu in 1989. Linguistically the three are very closely related, though Lahu Shi and Kucong are sometimes difficult for other Lahu to understand; Lahu Na ‘black Lahu’ serves as a lingua franca among the Lahu and several other nearby mountain groups. Bisu were formerly classified as Lawa along with various small groups speaking Austroasiatic languages in northern Thailand, but are now treated as a separate group there; in Burma they are officially called Pyen, a Shan exonym, and in China they are unclassified for ethnic group. Names also differ and change; Phunoi went through a period in the 1990s when they were officially called Singsali in Laos, and they are officially called Côông in Vietnam. In Myanmar during the 1980s, Lisu was divided into two ethnic groups, Lisu and Lishaw. The latter term is from the Shan name for this group; now they are officially reunited. The principal source of loanwords in Burmese is Pali, the liturgical language of Theravada Buddhism; Pali loanwords are mostly spelled as in Pali using Burmese script and pronounced according to the phonological pattern of reading Burmese; for example, Pali dhamma, Burmese /dəmá/ ‘(Buddhist) law’; some Pali loanwords are shortened. Burmese also has a substantial component of English loanwords, again


adapted into Burmese phonology; for example /mɔtɔkà/ or /kà/ from ‘(motor)car’. Some Ngwi languages like Lisu have numerous Chinese loans; others like Lahu and Phunoi have more loans from various Southwestern Tai languages; Akha also has loans from Lahu. There are also loans from Burmese in all other Burmic languages spoken in Myanmar; for example, Lisu /modo/ ‘car’ from the first part of the English loan into Burmese; and of course loans from Thai, Lao and Vietnamese into the Ngwi languages spoken in Thailand, Laos and Vietnam.

17.2 Phonetics and phonology

There is major diversity of phonology among the Burmic languages; Burmese and its varieties have become much more typical of MSEA over the last millennium, while North Burmish and most Ngwi languages remain typical of the adjacent East Asian linguistic area. All have relatively numerous syllable-onset consonantal contrasts, sometimes including a few clusters, relatively less complex systems of nuclei and fairly restricted or no coda consonantal contrasts. All forms are transcribed here using IPA symbols; the pitch of tones in Ngwi languages is indicated by numbers 1 (low) to 5 (high). There are various coda contrasts which are represented in Burmese spelling, which is based on 12th century pronunciation, but not distinguished in modern Burmese pronunciation; however, many of these former coda contrasts have left traces in the preceding nuclei. There are also some written Burmese onset clusters which have coalesced to single onsets, and others where formerly distinct medial consonant contrasts have merged.

Of the onsets, nearly all Burmic languages other than subvarieties of Burmese have a contrast between alveolar affricates and fricatives like /ts/ /tsʰ/ /dz/ /s/ /z/ versus alveopalatal affricates and fricatives like /tɕ/ /tɕʰ/ /dʑ/ /ɕ/ /ʑ/. This contrast has been completely lost in Burmese from the earliest stage of writing in the 11th century, merging all affricates to what was written with Indic palatal consonants and probably then pronounced /tɕ/ /tɕʰ/ /dʑ/ judging from contemporary Chinese representations of Burmese pronunciation, and merging all coronal fricatives to alveolar /s/. These mergers may have been triggered by the absence of alveolar affricates, alveopalatal fricatives and voiced fricatives in the languages such as Mon which were spoken by most of the local population other than the Burman invaders when they arrived in the 9th century.
Subsequent changes in Burmese onsets can be dated from foreign representations of the language (Bradley 2011). By the 17th century, the palatal affricates had become alveolar /ts/ /tsʰ/ /dz/. Clusters with medial /ɹ/ merged by the mid-18th century with medial /j/ clusters to /kj/ /kʰj/ /gj/ /ŋj/. These shifted to their modern /tɕ/ /tɕʰ/ /dʑ/ /ɲ/ pronunciation in the late 18th century; the merger of initial /ɹ/ to /j/ only took




place somewhat later, and was virtually completed in the early 19th century; syllable-initial /ɹ/ still persists in a few Pali and English loanwords. In the late 18th century, /s/ shifted to its modern /θ/ pronunciation (now actually an affricate [tθ]), leaving a gap for /ts/ /tsʰ/ /dz/ to shift to their modern /s/ /sʰ/ /z/ pronunciations in the early to mid-19th century. In the early 21st century, /sʰ/ is in the process of merging into /s/, thus eliminating the typologically unusual aspirated fricative.

There was a four-way set of manner contrasts in Proto-Burmic initial stops and affricates: voiceless unaspirated, voiceless aspirated, voiced and prenasalized voiced. This is retained in some Ngwi languages spoken in China, but is only reflected by distinct regular correspondences in Burmic languages spoken in MSEA: Lisu and Akha merge prenasalized voiced with voiced, Lahu merges voiced with voiceless unaspirated and then shifts prenasalized voiced to plain voiced. Most other Burmic languages in MSEA including Burmese have merged earlier voiced and prenasalized voiced stops and affricates to voiceless unaspirated stops. This reduction in manner contrasts can be seen as converging towards the MSEA norm.

Modern Burmese has the following onset segments; the arrangement of Table 1 reflects the history, orthography and phonology of the standard language.

Tab. 1: Burmese onsets.

p      t      s      tɕ     k
pʰ     tʰ     sʰ     tɕʰ    kʰ
b      d      z      dʑ     g
       θ
hm     hn            hɲ     hŋ
m      n             ɲ      ŋ
(hw)   hl            ʃ             h
w      l             j
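The dated onset changes described before Table 1 form an ordered chain, and the ordering matters: the older palatal affricates had to become /ts/ /tsʰ/ /dz/ before the velar-plus-/j/ clusters produced the modern palatals, and /s/ had to move to /θ/ before /ts/ could become /s/. The following is a minimal Python sketch of that chronology as a list of rewrite stages. The stage labels paraphrase the dates given in the text; the function name `derive`, the one-onset-at-a-time simplification and the separate treatment of the ongoing /sʰ/ merger are illustrative assumptions, not part of the chapter's analysis.

```python
# Ordered onset changes in Burmese, following the dates given in the text.
# Each stage maps an older onset to its reflex; unlisted onsets are unchanged.
STAGES = [
    ("by the 17th century", {"tɕ": "ts", "tɕʰ": "tsʰ", "dʑ": "dz"}),
    ("mid-18th century", {"kɹ": "kj", "kʰɹ": "kʰj", "gɹ": "gj", "ŋɹ": "ŋj"}),
    ("late 18th century", {"kj": "tɕ", "kʰj": "tɕʰ", "gj": "dʑ", "ŋj": "ɲ", "s": "θ"}),
    ("early 19th century", {"ɹ": "j"}),
    ("early to mid-19th century", {"ts": "s", "tsʰ": "sʰ", "dz": "z"}),
]
# Merger reported as still in progress, so it is kept optional here.
ONGOING = ("early 21st century", {"sʰ": "s"})


def derive(onset, include_ongoing=False):
    """Return the successive forms of an 11th-century onset, oldest first."""
    stages = STAGES + [ONGOING] if include_ongoing else STAGES
    history = [onset]
    for _label, rules in stages:
        onset = rules.get(onset, onset)  # apply at most one rule per stage
        if onset != history[-1]:
            history.append(onset)
    return history


for old in ("tɕ", "kɹ", "s", "ɹ"):
    print(" > ".join(derive(old)))
```

Because each stage is applied only after the previous one has finished, the /tɕ/ that arises from /kj/ in the late 18th century is not caught by the earlier change of the original /tɕ/ to /ts/, which is the effect the chronology in the text describes.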

All of the voiced segments in the third row are secondary; they are a direct result of juncture voicing of voiceless stops and affricates within compound words, in some cases phonologized after the loss of the conditioning first syllable, as in /ù kʰàun/ > [ù gàun] > /gàun/ ‘head’. Inscriptional evidence suggests that this juncture voicing was already present in early 11th century pronunciation. /θ/ also voices to [ð] in juncture environments; see further discussion below concerning juncture voicing within words in modern Burmese. In examples in text here, underlying forms without voicing are given; but in sentence examples the juncture voiced forms are given. There are clusters with medial /j/ after bilabials other than /hw/ and /w/. Clusters of any of the above onsets other than /hw/ and /w/ plus medial /w/ occur, but here the /w/ is best regarded as part of a diphthongal nucleus. The combinations of /h/ plus sonorant start with a voiceless sonorant and become voiced. /hw/ is very marginal, occurring only in a few literary and archaic words. /ʃ/ is diachronically derived from hj and pre-19th century


hr, and is still written as such. Unlike /θ/, /h/ does not voice and can be regarded as a voiceless sonorant. Some additional segments occur in other varieties; for example, Arakanese has initial and medial /ɹ/ which merged with /j/ during the 18th and 19th century in standard Burmese pronunciation but is maintained in standard Burmese spelling; Tavoyan and Intha retain medial /l/ which existed in 11th century Burmese inscriptions but had merged with other medials by the early 12th century when the orthography was standardized. Most North Burmish and Ngwi languages have onset systems with the places of articulation seen in Burmese, plus a contrast between alveolar affricates and fricatives /ts tsʰ s z/ versus alveopalatal affricates and fricatives like /tɕ tɕʰ ɕ ʑ/; Lahu has this contrast only in complementary distribution, with alveolar affricates or fricatives only before the vowel /ɨ/ realized as [ɿ], and alveopalatals elsewhere. Mpi, spoken in Thailand for over 200 years in close contact with Northern Thai which lacks alveolar affricates, has replaced the alveolar affricates with alveolar stops. Akha as spoken in Thailand is also losing this contrast, merging alveolar affricates to alveopalatals which also exist in Thai. Unlike Burmese, many other Burmic languages in MSEA have a three-way manner contrast for stops and affricates: in Lahu, Lisu and Gong voiceless unaspirated, voiceless aspirated and voiced, and in North Burmish languages usually voiceless unaspirated, voiceless aspirated and voiceless creaky. Southern Ngwi languages like Akha, Phunoi, Bisu and Mpi have only a two-way manner contrast: Akha voiced versus voiceless (allophonically aspirated in non-creaky syllables and unaspirated in creaky syllables) and the others like Burmese with voiceless unaspirated versus aspirated, with secondary voiced stops developing from some nasal onsets in Phunoi and Bisu. 
Most North Burmish and Ngwi languages have a much larger inventory of initial fricatives than Burmese, including a voiceless/voiced contrast. The voiceless-onset nasals and lateral in Burmese correspond to voiceless or creaky nasals and lateral in most North Burmish languages. In Ngwi languages with complex tone systems like Lahu and Lisu, there are tonal developments conditioned by former initial nasal and lateral voicing contrasts, but these voicing contrasts have now disappeared. Some Ngwi languages in China and a couple in MSEA like Phunoi and Bisu retain voiceless nasal and lateral initials, and the Kucong variety of Lahu retains a voiceless lateral. For other initial sonorants such as rhotics and glides, there is greater diversity; some Ngwi and North Burmish languages have none, having replaced them with fricatives. Some Ngwi languages in MSEA have initial consonant clusters parallel to those seen in Burmese orthography; Akha, Lisu and Phunoi have medial /j/ after bilabial stops and nasal; Bisu, Akeu/Gokhu and one dialect of Akha also have medial /l/; and Lisu also has medial /j/ after /h/, in all cases before a restricted subset of vowel nuclei; Lahu has no such clusters. The original early 11th century Burmese orthography had a system of seven nuclei in open syllables, four of them written in ways reflecting /j/ or /w/ offglides. Subsequent monophthongization has resulted in a symmetrical seven-vowel system /i e ɛ a




ɔ o u/ in modern Burmese, plus a schwa in reduced nonfinal syllables of many words. The more restricted written Burmese nucleus system before final stops and nasals was /i a u/, but with two nuclei before Burmic final velars written differently: Proto-Burmic *ik and *iŋ written in Burmese as atɕ aɲ, and *uk and *uŋ written awk awŋ and now pronounced /au’/ /aun/; a further written sequence was ək and əŋ, almost entirely in loanwords from Mon and other sources, which become modern spoken /ai’/ and /ain/. Modern Burmese has restructured and expanded these to a seven-nucleus symmetrical system including four diphthongs: /i ei ai a au ou u/, all either with glottal-stop final tone or nasalized with any of the three other tones, plus /ɛ/ with glottal-stop final tone only. Most other Burmese dialects have substantially different nucleus systems, though these are converging with the standard; for one example, compare the Marma system (Bernot 1957–1958) probably reflecting the late 18th century Arakanese system with the modern Arakanese system (Bradley 1985). However, differences persist; for example, Arakanese has one fewer front vowel. Burmese and the most closely related languages also have open-syllable diphthongs /wa/, /we/ and rarely /wi/. North Burmish languages and some Ngwi languages like Phunoi have relatively simple vowel systems, /i e a o u/ plus a couple of these with /j/ and/or /w/ offglides in non-loan vocabulary, reflecting a similar pattern to that implied by the 12th-­century Burmese orthography. Most Ngwi languages have relatively complex systems of monophthongs: Lisu and Akha have /i e ɛ y ø ɯ ɤ a u o/ with Akha also having /ɔ/ as well as nasalized /ɔn/ and syllabic /m/, while Lahu has /i e ɛ ɨ ə a u o ɔ/. 
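The monophthong inventories listed above can be compared mechanically. The short Python sketch below encodes them as sets exactly as they are given in the text, leaving out the Burmese schwa of reduced syllables, the nasalized nuclei and Akha syllabic /m/; the variable and function names are illustrative assumptions.

```python
# Monophthong inventories as listed in the text.
INVENTORIES = {
    "Burmese": set("i e ɛ a ɔ o u".split()),
    "Lisu": set("i e ɛ y ø ɯ ɤ a u o".split()),
    "Akha": set("i e ɛ y ø ɯ ɤ a u o ɔ".split()),
    "Lahu": set("i e ɛ ɨ ə a u o ɔ".split()),
}
FRONT_ROUNDED = {"y", "ø"}


def languages_with(vowels, inventories=INVENTORIES):
    """Names of the languages whose inventory contains every vowel given."""
    return sorted(name for name, inv in inventories.items() if vowels <= inv)


print(languages_with(FRONT_ROUNDED))
print(sorted(INVENTORIES["Lahu"] - INVENTORIES["Burmese"]))
```

On these listings, the first call picks out the languages that have the front rounded vowels, and the second shows which Lahu vowels have no counterpart in the Burmese open-syllable system.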
In some Ngwi languages, as in Chinese, [ɿ] is an allophone of another vowel: in Lahu, of /ɨ/ after alveolar affricate and fricative onsets; and in Lisu, of /i/ after alveolar affricates and fricatives as well as of /y/ after retroflex allophones of alveopalatal affricates and voiceless fricative. The front rounded vowels of several Ngwi languages are unusual in MSEA. The Lahu system is a symmetrical nine-vowel pattern more typical of Southwestern Tai languages, with which it has long been in close contact.

The coda position has a very restricted range of possibilities in Ngwi languages, ranging from none in Akha, through glottal stop (best regarded as part of the tone) in Lisu and Lahu, to final /p t m n/ in Phunoi and additionally /ŋ/ in Bisu. Burmese has graphic final stop p t tɕ k and nasal m n ɲ ŋ as well as j w finals, reflecting an earlier stage of the language. The final stops written in Burmese are all now pronounced in isolation with a final glottal stop, best regarded as a component of a tone. The graphic sequence atɕ in written Burmese is the reflex of the Tibeto-Burman and Burmic rhyme *ik and is pronounced /i’/ in modern Burmese and /ai’/ in Arakanese; the 12th-century phonetic value is a matter of speculation. The development of *iŋ is more complex, initially to graphic -aɲ with a subsequent split between various non-nasal front vowels, mainly /i/ in literary and most often /ɛ/ in spoken forms, and occasionally /in/ in some words; this was regularized in the standard Burmese orthography in the 1970s. In modern Burmese, apart from most of those derived from *iŋ, all the final nasals have resulted in nasalization (here represented by postscript /-n/) and some vowel quality changes in the nucleus but then disappeared; all the stops have resulted in nearly the same


 David Bradley

vowel quality changes as the homorganic nasals and then been replaced by the glottal stop of the stop-final tone. The nuclei written with final glides have become monophthongs. The codas of North Burmish languages mainly reflect the reconstructed system and the written Burmese system, with a full set of final stops /p t k/ and final nasals /m n ŋ/ (Burling 1967: 16–22).

A characteristic of Burmese is the development of a very frequent sesquisyllabic pattern by reducing the first syllable of a two-syllable word and the second and sometimes the first syllable of a three-syllable word to schwa and neutralizing its tone; this is often attributed to close contact with Mon over more than a millennium. The reduced syllables are very often still written with an etymologically expected nucleus, coda and tone in the reduced syllable, and may appear in unreduced form in other words; see further discussion below. Some extremely frequent morphemes, such as the numerals /ti’/ ~ [tə-] ‘one’ and /hni’/ ~ [hnə-] ‘two’, always have a reduced form when they are combined with a following numeral classifier (clf); the full form is used only in counting.

All Burmic languages have tonal systems of three or more tones which include pitch as the primary parameter of their realization. In addition, some have phonation contrasts, either as an additional component of the suprasegmental, or as a cross-cutting category. Burmese has a combined system (Bradley 1980, 1982) with three suprasegmental packages for syllables other than stop-final coda: the “creaky” tone with creaky phonation and mid-high level pitch, the “heavy” tone with breathy phonation and falling pitch, and the “even” tone with modal phonation and mid-low level pitch. For stop-final codas, the pitch is very high and the syllable is short and ends in a stop: glottal stop in isolation, assimilated to the position of a following onset within the same word.
Here, the creaky /44/ tone is indicated with an acute accent, the breathy or heavy /41/ tone is indicated by a grave accent, the even or modal /22/ tone is unmarked, and the stop-final /5ʔ/ tone is represented with a postscript ’. Within a word, an unreduced breathy tone remains fairly high before another breathy tone. As a result of the process of creating sesquisyllabic words discussed above, non-stop coda tones in a reduced nonfinal syllable lose their original tone. Thus there are six surface possibilities for a Burmese syllable: creaky [44], breathy [41] or its breathy [44] alternative, modal [22], short [5ʔ] and reduced, typically [3], the last only in a nonword-final syllable.

Most other Burmic languages have tone systems which contrast at least three level tones. North Burmish languages usually have three-tone systems: high, mid and low, but some have a fourth tone, most often high falling, or a high falling tone instead of a mid level tone. Most Ngwi languages have a cross-cutting phonation contrast which applies to two or occasionally all three of these pitch categories. For example, Akha has high /55/, mid /33/ and low /21/ tones with modal phonation, and mid /33/ and low /21/ tones with creaky phonation (here indicated by vowel underline); also a very marginal /55/ creaky tone in a few words. Phunoi lacks the phonation contrast as it retains final /p/ and /t/ codas; the presence of a stop coda is the conditioning environment for




the development of creaky phonation elsewhere in Ngwi, as in Akha. Lisu and Lahu, like many other Ngwi languages spoken only in China, have developed more complex tonal systems with additional contour tones. Lisu has a blended system, with high level /55/, rising /35/, creaky /44/, modal /33/, low falling /21/ and short low falling with final glottal stop /21ʔ/; some dialects of Lisu are merging the creaky /44/ and the modal /33/, thus removing the phonation contrast. Lahu has seven tones, with a high falling /54/ tone but no high level tone, a high stop-final tone /45ʔ/ where Lisu has creaky /44/ and an additional low level /11/ which contrasts with low falling /21/. Burmese has one tone sandhi phenomenon which reflects several types of syntactic relations: the so-called “induced creaky tone” (Okell and Allott 2001: 273–274) or dependent creaky tone (Jenny and San San Hnin Tun 2016), in which an element linked to a following element changes its tone from even /22/ tone (or very occasionally from heavy /41/ tone) to creaky /44/ tone. This is used to indicate possession in spoken Burmese, replacing the spoken gen postposition /jɛ´/ as seen in (1); both forms occur. This also applies to the question pronoun /bəθu/ [bəðu] ‘who?’ to give [bəðú] ‘whose?’, as well as to other pronouns ending in a syllable with even tone. Another use is before the spoken object (obj) postposition /ko/ and locative (loc) postposition /hma/ as seen in (2), where the creak is obligatory. It is also obligatorily used in compound numeral plus clf phrases on the end of each of the round-number constituents in sequence, as seen in (3). Another use is in a four-syllable ABCB expression on the first of two occurrences of a stative verb in the B position to express a partial degree, as in (4). 
Note that juncture voicing links a nominal postposition to the preceding noun as in (2) but not the non-round numeral to the preceding round numeral as in (3) nor the possessed noun to the preceding possessor as in (1).

(1)  əme     jɛ´   phəna’          əmé       phəna’
     mother  poss  sandals         mother’s  sandals
     ‘mother’s sandals’            ‘mother’s sandals’

(2)  /θu      +  ko/   >  θú      go
      he/she     obj      he/she  obj
     ‘him/her, to him/to her’

(3)  /hnə  thaun  +  kò  ja/   >  hnətháun  kòja
      2    1000      9   100      2,000     900
     ‘2,000’         ‘900’        ‘2,900’

(4)  /mə   hman  ə     hman/  >  məhmán    əhman
      neg  true  nmlz  true      neg true  nmlz true
     ‘fairly true’
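The induced creaky tone in the possessive and compound-numeral patterns of (1) and (3) can be sketched as a small string operation. This is only an illustration under stated assumptions: syllables are supplied already segmented, the acute accent is placed on the first vowel letter of the syllable, only unmarked (even-tone) syllables are changed, and juncture voicing (as in (2)) is not modelled. The function names `is_even_tone`, `induce_creaky` and `link` are hypothetical.

```python
import unicodedata

ACUTE = "\u0301"       # combining acute: creaky tone in this transcription
GRAVE = "\u0300"       # combining grave: heavy tone
STOP_FINAL = "\u2019"  # postscript ’ marks the stop-final tone
VOWELS = set("aeiouɛɔə")


def is_even_tone(syllable):
    """True if the syllable carries no tone mark (treated here as even tone)."""
    decomposed = unicodedata.normalize("NFD", syllable)
    return (ACUTE not in decomposed
            and GRAVE not in decomposed
            and not decomposed.endswith(STOP_FINAL))


def induce_creaky(syllable):
    """Mark an even-tone syllable as creaky by accenting its first vowel."""
    if not is_even_tone(syllable):
        return syllable  # the rare heavy-tone cases are ignored in this sketch
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            marked = syllable[:i + 1] + ACUTE + syllable[i + 1:]
            return unicodedata.normalize("NFC", marked)
    return syllable


def link(constituents):
    """Apply induced creak to the last syllable of each non-final constituent."""
    out = []
    for idx, syllables in enumerate(constituents):
        syllables = list(syllables)
        if idx < len(constituents) - 1:
            syllables[-1] = induce_creaky(syllables[-1])
        out.append("".join(syllables))
    return " ".join(out)


if __name__ == "__main__":
    print(link([["ə", "me"], ["phə", "na\u2019"]]))  # possessor + possessed noun
    print(link([["hnə", "thaun"], ["kò", "ja"]]))     # 2,000 + 900
```

Treating every non-final constituent alike is a simplification; which constituents actually take the induced creak depends on the syntactic relation, as described above.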

A final instance of induced creak is seen for clause-final literary and spoken realis (real) and irrealis (irr) markers in Burmese relative clauses, followed by the head NP; a literary example is given in (5) and a spoken example in (6). Again, underlying forms are given on the left and surface forms showing juncture voicing are given on the right.


(5)  a  /jau’tɕà  kàun  θi/   >  [jau’tɕà  kàun  ði]
         man      good  real      man      good  real
        ‘The man is good.’

     b  /kàun  θí        jau’tɕà/  >  [kàun  ðí        jau’tɕà]
         good  real.rel  man           good  real.rel  man
        ‘the man who is good’

(6)  a  /əphje   mə    hman  phù/   >  [əphje   mə    hman  bù]
         answer  neg1  true  neg2       answer  neg1  true  neg2
        ‘The answer is not true.’

     b  /mə   hman  tɛ´       əphje/   >  [mə   hman  dɛ´       əphje]
         neg  true  real.rel  answer       neg  true  real.rel  answer
        ‘the answer which is not true’

In other sections of this chapter, Burmese forms in isolation are given in their underlying voiceless forms, but in examples they are cited as pronounced, with juncture voicing where it applies.

Burmese has two intonation phenomena. One is a rising intonation on the last syllable of quoted speech embedded by a following quotation marker, spoken /ló/ or /tɛ´/ or literary /hú/, as discussed in Bradley (2018a, 2018b) and Tian and Lee (2019); it most frequently affects a clause-final realis or irr marker, but can also apply to other clause-final forms. This rise is absolutely regular and obligatory. The other is a sentence-final rising intonation to indicate a question without an overt sentence-final question marker, as discussed in Okell and Allott (2001: 276); this is relatively infrequent.

Burmese is written in an Indic script derived from Mon in a spelling which probably reflects early 12th century pronunciation fairly accurately. Some of the graphic contrasts are completely merged in modern pronunciation, such as graphic final p and t, graphic initial and medial r and j and so on. Most other Ngwi languages in MSEA which have established orthographies use romanizations; Lahu and Akha have several competing romanizations. Lisu has an interesting blend which uses some of the conventions of Burmese script in its romanization (Bradley and Bradley 1999); a few Buddhist Lisu in Myanmar instead use a script based on Burmese. The Gong and the Bisu in Thailand use Thai-based scripts which I devised with them in the late 1970s, but the Bisu in Myanmar are developing a romanization. Several Burmish languages in northeastern Myanmar also have recently developed romanizations.




17.3 Word classes

All Burmic languages have two large word classes, verbs and nouns; within each there are subclasses as well as additional smaller word classes which combine with verbs and/or with nouns. In all Burmic languages, all NPs in a clause normally precede the verb, though some languages can postpose one or more NPs for focus; the Southern dialect of Lisu does this quite frequently. The verb is usually followed by words in small closed classes of postverbal TAM and clause-final clause-linking words. While temporal and locational expressions are NPs, all Burmic languages also have a category of manner and expressive ideophonic adverbs which precede the verb and also include negative and prohibitive forms. Most verb and noun stems are monosyllabic; two-syllable compounds are extremely frequent for nouns, but much less so for verbs. In literary and formal Burmese, most verbs also have a two-syllable alternative, usually with the one-syllable verb form as the first syllable. Most adverbs have two syllables, often including complete or partial reduplication. In formal and creative speech styles, four-syllable expressions with ABAC or ABCB reduplication are highly frequent, especially for nouns and adverbs but also in words which combine verbs with associated preverbal or postverbal elements.

Nouns can be defined by their possibility of occurrence immediately followed by an NP-final case marker such as the obj marker, spoken Burmese /ko/, Lahu /tʰaʔ21/, Lisu /tɛ55/ and so on. Within the major class of nouns, there are also three subclasses: (i) pronouns which may replace nouns, including question pronouns; (ii) personal names of people; and (iii) kinship terms. In addition, there are small closed word classes which may combine with a noun to form an NP.
These include deictics and the combination of a numeral plus a clf, either or both of which may be the sole constituents in an NP or combine with a head noun, also bound NP-final case and topic markers which can be used to define the overall category of NP and can follow any NP. There are also various bound nominal prefixes, suffixes and stems which do not occur alone as free nouns. There are very interesting syntactic differences between the deictic systems of different Burmic languages which are discussed below; no one syntactic frame can define the deictic class overall.

The pronouns in most Burmic languages include forms which have widespread Tibeto-Burman cognates, first person *ŋa1 and second person *naŋ1, reflected by modern Burmese /ŋa/ and /nin/. Due to long-term social stratification and the introduction of self-humbling and other-honouring forms in addition to these original cognates, modern Burmese now only uses these forms in very intimate or derogatory situations, and has a wide range of alternatives, such as /tɕənɔ/ ‘I (male)’, /tɕəmá/ ‘I (female)’ or the colloquial and somewhat archaic forms /tɕənou’/ and /tɕou’/ ‘I’, all derived from contracted forms of the noun /tɕun/ ‘slave’ with various second syllables; and /kʰəmjà/ ‘you (male speaker)’ (with various alternative older forms such as /kʰinbjà/, /kʰinbja/ and so on) and /ʃin/ ‘you (female speaker)’, both derived from nouns referring to ranks in the former royal aristocracy; these polite ‘you’ forms are


also used as sentence-final polite tags in direct address; the male tag form can be reduced to /bja/. Third person forms are more diverse, as is often the case in Tibeto-Burman; for animate referents, most North Burmish languages have forms like /njaŋ/ with low tone. Most Ngwi languages of MSEA have forms reflecting a reconstruction *jaŋ2 like Lahu /jɔ54/, Lisu /ji55/, Akha /a55 jɔ21/, Phunoi /jã21/ and Bisu /jaŋ21/; some noncognate forms are also found in a few Ngwi languages of China. Burmese instead uses a third person animate pronoun form /θu/ which is a cognate of a very widespread Tibeto-Burman human subject (sbj) nominalizer suffix *su1, with cognates also used as a third person remote form in most Burmish languages as well as many Ngwi languages such as Lisu /su33/ and Lahu /ʃu33/, and a related form still used as a productive agent/sbj nominalizer suffix in these languages and elsewhere; in Burmese it is also sporadically used as a prefix, usually reduced from /θu/ to /θə/ as in /θəkʰò/ ‘thief’, from the verb /kʰò/ ‘steal’ or /θəŋɛ/ ‘child’, from the verb /ŋɛ/ ‘be.young/small’. These forms are etymologically related to the Tibeto-Burman agent/sbj nominalizer suffix *su1. However, even in Burmese it is much more productive as an agent/sbj nominalizer suffix.

In some languages, the pronouns can be defined as a separate word class by the possibility of adding a pronoun plural suffix not normally used to pluralize other nouns, such as Burmese /dó/, Lisu /wa21/ and Lahu /hɨ33/; but there are also some pronouns which cannot be pluralized this way, such as the homophonous Burmese first plural pronoun /dó/ and the various first dual and plural inclusive pronoun forms of many Ngwi languages which are already inherently plural. The question pronouns ‘who?’, ‘what?’, ‘when?’, ‘where?’ and so on typically pattern together in each Burmic language. In each language they typically start with a specific question prefix.
In modern spoken Burmese, all the question pronouns start with a /b/ initial, from literary Burmese /(ə)mji/ ‘the thing named’ also used as a question prefix as in modern literary /mji θu/ ‘who?’ and so on, grammaticalized into /əbɛ/ and then shortened to /bɛ/ ‘which?’ in modern spoken Burmese. This new form is now prefixed in all other spoken Burmese substance question pronouns: in reduced form in /bəθu/ ‘who?’ and sometimes /bəlau’/ ‘how much?’; also /bɛ/ plus the dummy noun /ha/ ‘thing’ almost always contracted to /ba/ ‘what?’, and unreduced in /bɛ tòun ká/ ‘when (in past)?’, /bɛ tɔ´/ ‘when (in future)?’, /bɛ hma/ ‘where?’, /bɛ lo/ ‘how?’ and sometimes in /bɛ lau’/ ‘how much?’. Juncture voicing takes place within these words giving surface [bəðu] ‘who?’, [bɛ dòun gá] ‘when (in past)?’ and [bɛ dɔ´] ‘when (in future)?’. There is also a question form which must be followed by a clf, /bɛ hnə-/ ‘how many + clf?’; also source and goal forms /bɛ gá/ ‘from where?’ and /bɛ go/ ‘to where?’. A why question is expressed clausally in spoken Burmese, /ba pʰji’ ló/ ‘what happen cause > why?’, used as an NP in a clause without further embedding marking. Question pronoun forms with bilabial stops like /b/ or related voiceless /p/ or /pʰ/ occur in some Burmese dialects; however, Arakanese uses /za/, North Burmish languages mostly have a /kʰə-/ prefix, and Ngwi language forms begin with /a/ in Lisu, /a21/ in Akha, and in Lahu mostly /qʰa21/ but also a few with /a21/ or /a33/; for example




‘who?’ Akha /a21 su55/ Lisu /a21 ma33/ Lahu /a33 ɕu33/; ‘which?’ Akha /a21 gɤ33/ Lisu /a44 li44/ Lahu /qʰa21 ve33/ and ‘when?’ Akha /a21 mjɔn33/ Lisu /a55 tʰæ21/ Lahu /qʰa21 tʰaʔ45/. Interestingly, the earliest written Burmese used a couple of older forms with /ə/ prefix, such as /əθu/ ‘who?’ and /əθó/ ‘how?’, more similar to some Ngwi forms, so this prefix may be the most archaic. In some Burmic languages including Burmese and Lahu, a clause containing a question pronoun requires a clause-final question marker, but in Lisu this is optional and in Akha it is not required, as discussed in 17.5.2 below. Personal names often have unusual phonological and syntagmatic characteristics which enable them to be defined as a separate subclass of nouns. In Burmese, most personal names are two syllables (or less often one, three or four), usually meaningful, and have no internal juncture voicing, unlike most other two or more syllable compound nouns. Uniquely, they may be prefixed by an honorific derived from a kin term: /ù/ ‘high-status male’ (from the term for mother’s brother), /dɔ/ ‘high-status female’ (from a relatively recent term for aunts in general), /ko/ ‘equal-status male’ (from a term for elder brother) and /má/ ‘equal-status female’ (from the female suffix found among other places in the terms for elder and younger sister); these prefixes do not trigger juncture voicing in the first syllable of the name. In recent times, the child of a high-status father may add the father’s personal name before their own; for example, /dɔ aun sʰàn sú tɕi/ Daw Aung San Su Kyi, the daughter of the independence leader /bodʑou’ aun sʰàn/ General Aung San. If this were a compound noun, it would be voiced to [dɔ aun zàn zú dʑi] but it is not. As the name of General Aung San shows, various other military, religious and occupational titles can also precede a personal name; sometimes the honorific may intervene, so he could also have been addressed or called /bodʑou’ ù àun san/. 
Most Burmese kin terms show a similar resistance to juncture voicing which is one way in which they form a distinct subclass of nouns in Burmese. Most kin terms are two syllables in all Burmic languages; the most frequent form of all kin terms has an /a/ prefix, reduced to [ə] in Burmese, which usually does not trigger Burmese juncture voicing as in /əko/ ‘elder brother’, /əpʰe/ ‘father’ and so on. This /a/ prefix is not unique to kin terms; it is also seen in many animal and plant terms where juncture voicing does take place in Burmese, resulting in some voiced-initial animal and plant terms such as /dʑò/ ‘dove’ and /zì/ ‘jujube’ after the /ə/ prefix is lost, as well as being a productive verb-nominalizing prefix in Burmese, which does not trigger juncture voicing, as in /tʰin/ ‘to.think’, /ətʰin/ ‘thought’ or in a stative verb such as /sa’/ ‘be.spicy’, /əsa’/ ‘spicy.food’ or in a non-stative verb like /tɕá/ ‘to.fall’, /ətɕá/ ‘a.fall’. In Burmese, many close kin terms have an alternative reduplicated form like /ko ko/ ‘elder brother’, /pʰe pʰe/ ‘father’ and so on which would be voiced to [ko go], [pʰe be] and so on if they were normal reduplicated compound nouns, but are not. Conversely, some reduplicated kin terms have a voiced initial in both syllables, as in /dɔdɔ/ ‘aunt’ and /bábá/ ‘uncle’. The quantifier phrase of numeral plus clf follows the head nominal in all Burmic languages, or it may occur as an NP without a head nominal. It consists of a numeral followed by a clf. The clf include sortal clf which are based on semantic properties


of the nominal, such as human, non-human animate, shape or function, and a default general clf, such as Burmese /kʰú/, Lahu /ma21/, Lisu /ma33/ and Akha /hm21/. Other clf include some derived from free nouns used to classify themselves and similar nouns, also space, time and other measure clf, and round numbers like ‘10’, ‘100’ and so on. Numerals include a few quantifiers (‘some’, ‘many’, etc.) in most languages, but some languages express some of the same concepts with adverbial or nominal forms. The form class of numerals can be defined by their occurrence with a following clf; in some languages such as Burmese they also occur alone in counting. The form class clf can be defined by their occurrence after a numeral; some occur productively after any numeral, some only occur after the numeral ‘one’, and some are also independent head nouns as well. The combinations of ‘one’ plus clf include forms like Burmese /tətɕó/ ‘some’ which can occur alone with nominal meanings, others which occur alone in comparisons such as Lisu /tʰi21 le35/ ‘the same as’ or in adverbial forms such as Lahu /te54 gɛ33/ ‘together’. The verb can be defined by its possibility of being negated by a preceding negative, cognate Burmese /mə-/, Lahu /ma54/, Lisu /ma21/, Akha /ma21/ and so on. Active verbs, intransitive, transitive or ditransitive, have a prohibitive form; Ngwi languages use cognate preverbal Lahu /ta54/, see (28), Lisu /tʰa21/, Akha /ta21/; spoken Burmese uses preverbal /mə/ plus postverbal /nɛ´/ and literary Burmese /mə/ + verb + /hnín/; these prohibitive forms distinguish the active verbs from the stative verbs with adjectival meanings which have no prohibitive form. Other elements associated with the verb include a wide variety of grammaticalized serial verbs; most occur in post-head position. 
These include various semantic and syntactic subcategories, including directional, modal and so on; they combine with each other in complex and interesting ways; see 17.5 for more discussion. After these elements, there may be one of a small set of bound tense/aspect markers, then one or more of a small closed set of clause-final markers including imperative, question, epistemic and evidential markers. All of these form part of the verbal component at the end of the clause.

The much smaller class of expressive and manner adverbs mainly occurs immediately preverbally; some adverbs are derived productively from one-syllable verbs by AA reduplication, like Burmese /kàungàun/ ‘well’ from /kàun/ ‘be.good’ and from two-syllable verbs by AABB reduplication. There are also some two-syllable adverbs derived from noun plus verb combinations, such as Lahu /ɣa21 tʰeʔ21/ ‘vigorously’ from /ɣa21/ ‘strength’ plus /tʰeʔ21/ ‘to.strike’, as well as unanalyzable two-syllable adverbs like Lisu /bɤʔ21 lɯ55/ ‘quickly’. Many four-syllable expressive adverbs without a transparent verbal origin have ABAC or ABCB reduplication, and some of these are to a greater or lesser extent onomatopoetic. Various Ngwi languages also use many ABB adverbial forms, such as Lisu /bɯ21 lɯ21 lɯ21/ ‘(rain) heavily’. The most frequent preverbal adverb is the one-syllable negative marker discussed above, which is a cognate of a form very widespread in Tibeto-Burman languages; another is the one-syllable prohibitive marker which occurs preverbally before non-stative verbs in most Ngwi




languages. These two are the only one-syllable adverbs in Burmic languages. Reduplication, whether AA, ABAC, ABCB, AABB or ABB, is not limited to adverbial forms. In formal styles, many nouns and verbs also have longer forms; when these are three or more syllables they usually include some reduplication.

17.4 Words and word formation

In the orthographies of Burmic languages, word boundaries are not indicated. In Burmese, there is a fairly clear phonological definition of the word: the unit within which there is juncture voicing. Unless blocked by a syllable with a stop-final tone, juncture voicing of medial consonants which can voice is fairly regular between a noun and its following attribute as in /ein/ ‘house’ + /pʰju/ ‘white’ > [ein-bju] ‘white house’, between a numeral and its following clf as in /lè/ ‘four’ + /kaun/ clf.animal > [lè-gaun] ‘four animals’ and between a verb and most postverbal elements with obstruent initials, as in /θwà/ ‘to.go’ + /hnain/ ‘able’ + /pa/ polite + /tɛ/ realis > [θwà-hnain-ba-dɛ] ‘(I) can go.’ The exceptions include personal names and a decreasing number of kin terms as discussed above, also certain prefixes such as the nominalizer /ə/, a clf after the combining forms of numerals /hnə-/ ‘two’, /kʰuhnə-/ ‘seven’ and /bɛ hnə-/ ‘how many?’ which historically had stop-final tone blocking the voicing, the reflexive forms discussed in 17.5 below and some others.

Another kind of Burmese word, mainly nominal, is defined by syllable reduction: the first syllable of two or the second and sometimes also the first syllable of three are reduced to their onset plus schwa and have their tones neutralized, as in /kəlà/ ‘Indian’ + /tʰain/ ‘to.sit’ > [kələ-tʰain] ‘chair’; a few synchronically monomorphemic disyllabic nouns and verbs also follow this pattern, like /kʰəlè/ ‘child’ and /gəzà/ ‘to.play’. Juncture voicing and syllable reduction are not reflected in Burmese spelling, are somewhat more restricted in other closely related dialects of Burmese such as Arakanese, and are completely absent from North Burmish and Ngwi languages.
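The regular core of juncture voicing described above can be sketched as follows. This is a rough illustration, not a full account: the word is assumed to be supplied as a list of already segmented syllables, the voicing table is an assumption built from the pairs attested in this chapter's examples plus aspirated /tʰ kʰ tɕʰ/ added by analogy, and none of the listed exceptions (personal names, kin terms, the nominalizer /ə/, certain numeral forms) is modelled. The names `VOICING`, `voice_onset` and `juncture_voice` are hypothetical.

```python
STOP_FINAL = "\u2019"  # postscript ’: stop-final tone, which blocks voicing

# Voiceless onset -> voiced counterpart (illustrative, incomplete table).
VOICING = {
    "pʰ": "b", "p": "b",
    "tʰ": "d", "t": "d",
    "kʰ": "g", "k": "g",
    "tɕʰ": "dʑ", "tɕ": "dʑ",
    "sʰ": "z", "s": "z",
    "θ": "ð",
}
# Try longer onsets first so that "tɕ" is not mistaken for "t".
ONSETS = sorted(VOICING, key=len, reverse=True)


def voice_onset(syllable):
    """Voice the initial consonant of a syllable if it has a voiced pair."""
    for onset in ONSETS:
        if syllable.startswith(onset):
            return VOICING[onset] + syllable[len(onset):]
    return syllable  # sonorant or already voiced onsets are left unchanged


def juncture_voice(syllables):
    """Voice each non-initial onset unless the preceding syllable is stop-final."""
    out = [syllables[0]]
    for prev, cur in zip(syllables, syllables[1:]):
        out.append(cur if prev.endswith(STOP_FINAL) else voice_onset(cur))
    return "-".join(out)


if __name__ == "__main__":
    print(juncture_voice(["ein", "pʰju"]))             # 'white house'
    print(juncture_voice(["lè", "kaun"]))              # 'four animals'
    print(juncture_voice(["θwà", "hnain", "pa", "tɛ"]))  # '(I) can go.'
    print(juncture_voice(["jau\u2019", "tɕà"]))        # voicing blocked
```

Checking the preceding syllable in its underlying form, rather than its surface form, keeps the blocking condition tied to the stop-final tone as the text describes it.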
The main free word forms in all Burmic languages are (i) nouns (including pronouns, personal names and kinship terms), (ii) verbs, (iii) expressive and ideophonic adverbs, (iv) numerals when counting in some languages and (v) clausal interjections such as expressions of agreement, disagreement, grief, pain, surprise and so on. A numeral plus clf and/or a deictic with whatever nominalizing marking each language requires with deictics can also occur as an independent word. Never occurring as independent words are the nominal prefix and nominal suffix forms which are not also independent nominal forms, the NP-final case and topic marking forms, the clf which are not also independent nominal forms, the preverbal adverbs of negation and prohibition, preverbal and postverbal TAM and other serial verb elements in their serial verb meanings (though in most Burmic languages most of these are ultimately derived from and often still used as homophonous independent verbs with other meanings),


and clause-final pragmatic markers of politeness, epistemicity, evidentiality, imperatives and questions as well as clause linkage and embedding markers. It is sometimes difficult to define what forms part of the same verbal word, as clause-linking elements may sometimes be inserted inside some sequences of verb and postverbal elements which also occur without these linking forms. Some postverbal elements, particularly some serial verbs such as modals, may occur in their serial meaning in an independent word combined with other preverbal and/or postverbal elements but without a verb, usually in answers to questions containing them.

In several North Burmish languages, there is some tone sandhi within words, but this remains poorly described. In the Ngwi language Mpi spoken in Thailand and the Burmish languages Gong spoken in Thailand and Lawngwaw spoken in Myanmar and China, verbs have two sandhi forms of all tones in different syntactic environments; and in Gong there is a word-final sandhi form of one tone in nouns (Bradley 1979b: 49; Bradley 2012; Okell 1989). Some numerals, mainly ‘3’, ‘4’ and ‘9’, have two tone sandhi forms in Ngwi languages such as Lisu and Lahu (Bradley 2005b). Apart from Burmese where juncture voicing and syllable reduction give some clear indications, and the limited evidence from tone sandhi in a few languages, the only evidence for word status and word boundaries in most Burmic languages is the ability of a verbal form to occur as an independent utterance and the position of possible pauses and intonation breaks within a clause.

All verbs can be the predicate of a clause; there is no morphology distinguishing them from other form classes, the differences are only syntactic. There are two subclasses of verb: stative verbs and active verbs. All verbs can be negated by a preceding negative in all Burmic languages, such as Burmese /mə-/ and so on. All active verbs can have a preceding prohibitive as discussed below.
The active subclass has intransitive, transitive and ditransitive subcategories defined by the number of core NP arguments which may occur with them. The stative subclass has a subcategory including the dimension and colour verbs which may occur in adjunct position immediately postnominally. One morphosyntactically distinct subclass is a small subset of grammaticalized positive dimensional verbs, slightly different in its membership and behaviour in different Ngwi languages (Bradley 1995, 2019). There is also a productive valency-increasing postverbal serial verb for active verbs, Burmese /se/ with cognate forms in most Ngwi languages, discussed further below in 17.5.

As would be predicted for verb-final languages, most affixation in Burmic languages is left-headed suffixing, whether in compounds where all components are also independent words such as cognates of reconstructed Tibeto-Burman nominal forms such as *ne3 ‘day’ and *s-nik ‘year’, or with bound suffixes which are cognates of reconstructed Tibeto-Burman suffixes such as *ma3 ‘female’; the relevant Burmese forms are /né/, /hni’/ and /-má/; Lisu has /ɲi44/, /-ni35/ (which now only occurs as a bound form in time ordinals, see 17.6 below) and /-ma44/. Other suffixes are more language-specific. There are also a few widespread prefixes, such as cognates of reconstructed Tibeto-Burman *a- which is seen on kinship terms, as a verb nominalizer and so on, and the preverbal negative adverb derived from reconstructed Tibeto-Burman




*ma-. Most such prefixes have a full syllable with a tone in Ngwi languages but a reduced vowel /ə/ in Burmese, such as the kin term and other prefix /ə/ and the preverbal negative adverb /mə-/. There are no infixes. Reduplication is widely used to form manner adverbs from stative verbs and to express repeated action of active verbs. It is also seen in some nominal forms such as kinship terms, and particularly in expressive adverbs which tend to be three or four syllables, as discussed elsewhere.

17.5 Phrase and clause structure

In the basic word order of all Burmic languages, all NPs precede the verbal element in the clause; an adverbial element (if any) immediately precedes the verbal element. The order of NPs within the clause when more than one NP is present is largely pragmatically determined but the unmarked order is temporal NP + sbj/erg NP + dat NP + obj NP + loc NP; needless to say, clauses containing this many NPs are most infrequent. NPs are often fronted within the clause to mark topicality, with or without an NP-final topic postposition. In some languages, one or more NPs can be moved after the verbal element to mark focus, but this is unusual and infrequent. Such postposing is only particularly frequent in the southern dialect of Lisu as spoken in Thailand, and is normally also indicated by pause and/or intonation break between the final verbal element and the postposed NP or NPs. Word order in main and subordinate clauses is the same. Word order remains the same across statements, questions and imperatives, with differences between these clause types expressed by clause-final elements and/or question pronouns.

The constituency of the NP differs somewhat across Burmic languages. In general, relative clauses embedded by a final relative marker and possessors with or without marking of their possessive status precede the head noun, though in some languages short relative clauses may also occur in the postnominal adjunct position. This adjunct position allows a variety of stative verbs of dimension, colour and so on to occur immediately after the head noun, not marked in any way but in Burmese undergoing juncture voicing as noted above. The head noun including any adjunct can be followed by the numeral phrase of numeral or other quantifier plus appropriate clf, with juncture voicing on the clf in Burmese after most numerals.
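The unmarked constituent order stated above can be written down as a simple ranking. The sketch below is only a schematic restatement under stated assumptions: role labels follow the text, the forms are placeholders rather than real words, and pragmatically driven fronting or postposing is deliberately left out. The names `UNMARKED_ORDER` and `unmarked_clause` are hypothetical.

```python
# Unmarked order of NPs as stated in the text; the verb is clause-final.
UNMARKED_ORDER = ["temporal", "sbj/erg", "dat", "obj", "loc"]
RANK = {role: i for i, role in enumerate(UNMARKED_ORDER)}


def unmarked_clause(nps, verb, adverb=None):
    """Order (role, form) pairs by the unmarked NP order, then add adverb and verb."""
    ordered = sorted(nps, key=lambda np: RANK[np[0]])
    forms = [form for _, form in ordered]
    if adverb is not None:
        forms.append(adverb)  # an adverbial immediately precedes the verbal element
    forms.append(verb)        # the verbal element follows all NPs
    return forms


if __name__ == "__main__":
    nps = [("obj", "OBJ"), ("sbj/erg", "SBJ"), ("temporal", "TIME")]
    print(unmarked_clause(nps, "VERB", adverb="ADV"))
```

Because the order is largely pragmatically determined, this ranking describes only the default; topical NPs are often fronted, and in a few varieties NPs may be postposed for focus.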
Plural marking is unusual in Burmic languages; it is somewhat more frequent in Burmese than in others, perhaps due to Pali influence. The Burmese plural marking on nouns follows the head noun; literary plural /mjà/ is homophonous with the verb ‘be.many’; spoken plural /twe/ ~ /te/, which undergoes juncture voicing, is sometimes replaced in more formal spoken Burmese by /mjà/. However, plural marking is not very frequent even in Burmese. The final constituents in the Burmic NP are case and topic markers. These are not obligatory; their presence is pragmatically marked, and their rate of use and the


 David Bradley

preferences and restrictions about using them differ between languages. The literary Burmese sbj is often marked by /θi/, but unmarked or if anything marked for topic with /ká/ in spoken Burmese; the existence of an agent/sbj case marker in literary Burmese is sometimes attributed to the influence of Pali. A Burmese obj or dat is marked by /à/ in literary Burmese and /ko/ in spoken. loc is marked by /hnai’/ ‘loc’ or less formally /twin/ ‘inside’ in literary Burmese, and by /hma/ ‘loc’ in spoken; directionals also differ: ‘from’ is /hmá/ in literary and /ká/ in spoken; allative ‘to’ is /θó/ in literary and /ko/ in spoken; note the greater homophony across categories in spoken Burmese. In some Ngwi languages such as Lisu, it is fairly frequent to combine a case marker and a topic marker in this order, but in Burmese and most other Burmic languages, this is not possible, and only one final marker per NP may occur. The deictics differ greatly across Burmic, both in form, in the systems of meanings, and in structure. Where a head noun is present, in Burmese the deictic precedes the head noun, and requires no further marking; in Ngwi languages the deictic follows the head noun. In Lahu, most deictics must be followed by the nominalizer /ve33/, though the proximal deictic may also occur without a following /ve33/ in anaphoric use. In Lisu, a deictic must be followed by a nominalizing element /ma44/ or some other postnominal case marker such as loc /kwa44/ or manner /lø33/; in Southern Lisu the anaphoric (ana) deictics have a prefix /a55/ as in /a55 thø33/ ‘this (previously mentioned)’, but the same form has a different meaning, a medial form ‘that close to addressee’ in Central Lisu, as seen in (9) below. In Akha, a deictic must be followed by a clf to complete the deictic phrase. Note that when a deictic and a numeral + clf phrase occur in the same NP, there are many alternative patterns. 
The deictic in Burmese occurs alone before the head noun and the numeral + clf phrase follows the head noun, as in (7), which also shows that plural marking is not obligatory in Burmese. In Lahu, the deictics occur after the head with (or for the anaphoric proximal deictic only without) a nominalizer, either before or after the numeral + clf phrase, as in (8); the unmarked order has the deictic phrase first. In Lisu, the numeral + clf phrase is embedded inside the deictic phrase, after the deictic and before the nominalizer, as in (9). In Akha, the deictic requires the presence of a clf, and the numerals if any are inserted between the deictic and the clf, as seen in (10). (7)

di lu this person ‘this/these people’

di lu ŋà jau’ this person five clf.human ‘these five people’ (Burmese)

(8)

tɕhɔ33 tɕhi33 ve33 ŋa54 ɣa54 person this nmlz five clf ‘these five people’

tɕhɔ33 ŋa54 ɣa54 tɕhi33 ve33 person five clf this nmlz ‘these five people’

tɕhɔ33 tɕhi33 ŋa54 ɣa54 person this.ana five clf ‘these five previously mentioned people’

tɕhɔ33 ŋa54 ɣa54 tɕhi33 person five clf this.ana ‘these five previously mentioned people’ (Lahu)



Typological profile of Burmic languages 

(9)

la21 tsho33 thø33 ma33 person this nmlz ‘this person’

la21 tsho33 thø33 ŋwa21 ʐo44 ma33 person this five clf nmlz ‘these five people’ (Lisu)

la21 tsho33 a55 thø33 ŋwa21 ʐo44 ma33 person ana-this five clf nmlz ‘these five previously mentioned people’ (Southern Lisu)

(10)

tsɔ55 ha21 hɤ33 ɣa21 person this clf ‘this person’

tsɔ55 ha21 hɤ33 ŋa21 ɣa21 person this five clf ‘these five people’ (Akha)

The inventory of deictics also differs greatly among Burmic languages. In Burmese, spoken for more than a millennium in the plains, there is a proximal, literary /i/ or less formally /θi/ and spoken /di/ and a distal, literary /tʰo/ and spoken /ho/; also anaphoric deictics with an /ɛ`/ prefix: spoken /ɛ`di/ ‘this (previously mentioned)’ and much less frequently /ɛ`ho/ ‘that (previously mentioned)’, also very occasionally /ɛ`θi/ ‘this (previously mentioned)’. All Ngwi languages, spoken by groups who still live in the mountains, have complex systems with distinct deictic stems distinguishing relative height and distance; for more information see Bradley (2003). While a deictic phrase can occur alone as an NP in Ngwi languages, in Burmese a deictic alone cannot be a full NP; a following dummy noun /ha/ ‘thing’ is required. This is normally contracted together with the proximal deictic to produce spoken Burmese /da/ ‘this one’ and /ɛ`da/ ‘this one (anaphoric)’; the distal deictic does not contract this way. In Lisu, the deictics usually contract with a following loc suffix /kwa44/, for example /tʰø33/ + /kwa44/ > /tʰa33/ ‘here’. Several Ngwi languages have very complex systems of deictics with different tone and initial forms and reduplicated forms expressing degrees of distance; for the system in various dialects of Lisu, see Bradley (2017b, 2022).

Tense/aspect marking is postverbal; an imperfective is among the post-head modals, Burmese /ne/, Lahu /tɕʰɛ54/, Lisu /tja33/, Akha /dʑɔ55/ and so on, all grammaticalized from the verb ‘to.live/to.stay/to.be.at’. There is also a verb meaning ‘to.finish’ which grammaticalizes into a completive post-head modal, Burmese /pjì/, Lahu /pə21/, Lisu /gu33/ and Akha /dʑi55/. Lisu also has a past tense marker, /gɯ33/ or /kɤ33/ in different dialects, which follows all modals in a clause.
In addition, there are clause-final markers of perfectivity such as Lahu /o21/ and Lisu /o44/; Burmese uses /pji/ which is related to the verb ‘finish’ but has a different tone; sequences with multiple completion marking, such as Burmese Verb + /pjì/ ‘finish.Ving’ + /pji/ pfv and Lisu Verb + /gu33/ ‘finish.Ving’ + /kɤ33/ past + /o44/ pfv are extremely frequent. As discussed concerning the identification of the form class of verbs, negation is expressed by a one-syllable adverb preceding the verb of the negated clause; it has the entire clause as its scope. A few Ngwi languages in China use a Verb-neg-Verb form for polar questions, like one of the possible Chinese structures, but this is not usual in any Burmic language spoken in MSEA. In Ngwi languages but not in Burmese, clauses


with most but not all post-head modals have the neg on that modal instead of on the main verb. Sequences of verb plus resultative verb also usually have the neg on the result verb in all Burmic languages. There is a nominal negative suffix, Burmese /hmá/, Lisu /e55/, Lahu /ɛ35/, Akha /i21/ ‘at.all’ which can be added to the numeral ‘one’ plus clf in the NP slot, as seen in (11) and (12) or to a question noun as in (55A) below. These can only occur when the verb is also negated.

(11)

ta jau’ hmá ma la bù one clf.human at.all neg1 come neg2 ‘Not one person came.’ (spoken Burmese)

(12)

te54 ɣa54 ɛ35 ma54 la21 one clf.human at.all neg come ‘Not one person came.’ (Lahu)

The minimum clause usually contains a verb with associated TAM and final markers. It can also be a positive equational copula clause with two juxtaposed NPs and potentially final markers, though in Lisu a copula may occur in such clauses. However, in negated equational copula clauses an overt copula is obligatory, as shown in spoken Burmese (13), Lahu (14) and Lisu (15) below; in Lisu, final markers in positive equational clauses like (15) are optional. (13)

θu sʰəja he/she teacher ‘He/she is a teacher.’

θu sʰəja məhou’pʰù he/she teacher neg1 be.the.case neg2 ‘He/she is not a teacher.’ (spoken Burmese)

(14)

jɔ54 ɕa11la11 (jo21) he/she teacher decl ‘He/she is a teacher.’

jɔ54 ɕa11la11 ma54 heʔ45 he/she teacher neg be.the.case ‘He/she is not a teacher.’ (Lahu)

(15)

ji55 ma55 pha21 (ŋa33) he/she teacher cop ‘He/she is a teacher.’

ji55 ma55 pha21 ma21 ŋa33 he/she teacher neg cop ‘He/she is not a teacher.’ (Lisu)

As noted above, there is no morphological person marking directly on verbs. Clauses with stative, intransitive, transitive and ditransitive verbs and valency-increased ‘causative’ verbs may have all, some or none of the NPs overtly present, in unmarked order or in pragmatically determined order. While in most Burmic languages, comitative and instrumental NPs are marked with the same postposition, the comitative NP usually follows the NP with which it is associated, while the instrumental NP is immediately preverbal in the manner slot. Manner is marked by a postnominal /lo/ in spoken Burmese, /θɔ´/ in literary Burmese and /lø33/ in Lisu; in Lahu this is expressed by a following /qʰe33/ ‘like’. Nominal instrument and accompaniment are marked by postpositional spoken Burmese /nɛ´/, literary Burmese /hnín/, Lahu /gɛ33/ and Akha /nɛ21 ɛ55/. In Lisu, comitative is expressed by postpositional numeral + clf /tʰi21 le35/




‘one same’; instrumental is postpositional /du33/, the same as the postverbal instrumental nominalizer. Another very frequent main clause type in all Burmic languages is a final nominalized clause, ending in spoken Burmese realis.nmlz /ta/ or irr.nmlz /hma/, Lahu nmlz /ve33/ and similar, often with additional clause-final elements following it, as in Burmese (16) and Lahu (17). Syntactically, this is a single NP containing an embedded sentence. (16)

la22 da22 bɛ̀ 41 come nmlz int ‘(I) definitely come/came!’ (Burmese)

(17)

la21 ve33 jo21 come nmlz decl ‘(I) definitely come/came!’ (Lahu)

Reflexives in Burmic languages are expressed in a variety of ways using nominal material, especially pronouns. Literary Burmese uses /mí mí/; this can be an obj or gen coreferential to an expressed or unexpressed agent in the same clause as in (18). When it is an obj, this form can also be extended with a following noun /ko/ ‘body’ and the obj marker /ko/ as in (19). Spoken Burmese uses either an invariant /kó ko ko/ ‘self’s body obj’ with a possible following pronoun or NP agent as in (20), or a reduplicated pronoun with the first having an induced creak, then the noun ‘body’ (or the obj form) without juncture voicing and then the pronoun repeated without a pause, as in (21). Other body part or abstract nouns can replace the middle /ko/ ‘body’ in the construction seen in (20), as discussed in Bradley (2005c). Note also the use of the literary quote marker /hú/ for a reported thought here, not only for reported speech.

(18)

mí mí əhmú mí mí go pa’ dɔ´ mji hú θí θɔ` self deed self obj trap at.last irr quot know nf.real ‘(He) realized that he was about to be trapped by his own acts.’ (literary; Okell and Allott 2001: 154)

(19)

mí mí ko go pà na’ θu hú tin bjì self body obj shrewd nmlz quot think pfv ‘He thought that he himself was shrewd.’ (literary; Okell and Allott 2001: 154)

(20)

kó ko ko tɕənɔ jai’ tɛ self.obj i.m hit real ‘I hit myself.’ (spoken; Bradley 2005c: 73)

(21)

θú ko θu kai’ tɛ him/herself bite real ‘He/she bites him/herself.’ (spoken; Bradley 2005c: 74)


Unlike Burmese, which normally fronts the reflexive NP as seen above, Ngwi languages often leave it after the agent if one is present, as in Lisu (22). Note that the reflexive can be directly marked by nominal case markers; no intervening noun is needed, unlike Burmese. The first syllable of the Lisu form /tsɿ55 tɕʰa21/ may be borrowed from Chinese 自 zì. (22)

ji55 tsɿ55 tɕʰa21 tɛ55 ka55 seʔ21 kɤ33 o44 he/she self obj stab kill pst pfv ‘He stabbed himself to death.’ (Lisu; Bradley et al. 2006: 158)

Unlike Lisu, Lahu can only mark objects as reflexives. These reflexives include the question pronoun ‘who?’ plus a pronoun plus the noun ‘body’, in this case the Shan loanword /ɔ21 to33/, in the obj slot, as in (23). (23)

a33 ɕu33 jɔ54 ɔ21 to33 dɔʔ45 ve33 who he/she body hit nmlz ‘He/she hits him/herself.’ (Lahu)

In addition to the reflexive forms, Burmic languages also have a sbj intensifier form, like English ‘I myself’. In literary and spoken Burmese, this uses the noun /ko/ ‘body’ plus the intensifier form /tain/ as in (24). This may also imply that no one else was involved, it was done alone, like English ‘by myself’. (24)

θu ko dain lou’ tɛ he/she self ints do realis ‘He/she does it by him/herself.’ (spoken Burmese; Bradley 2005c: 81)

In most Ngwi languages, the form is usually a repeated pronoun separated by an intervening word, such as Lisu /da33/ ‘by.oneself’ as in (25) or Lahu /qʰa54/ ‘by.oneself’ as in (26). In Lahu, the first of the two identical pronouns can be omitted, but not in Lisu. (25)

nu33 da33 nu33 dʑe33 ʑu33 aʔ21 you by.self you go take imp ‘Go and bring it yourself!’ (Lisu; Bradley et al. 2006: 45)

(26)

(jɔ54) qʰa54 jɔ54 te33 ve33 he/she by.self he/she do nmlz ‘He/she does it by him/herself.’ (Lahu)

Some person-related marking is found in a few Burmic languages in the form of postverbal and/or clause-final markers. For example, Lahu has a benefactive /la54/ for speech act participant (SAP, first and second person) and /pi54/, homophonous with the verb ‘give’, for non-SAP recipients (Matisoff 1982: 324  ff.) as in (27) and (28). (27)

tho33 la54 o21 la54 tell to.SAP pfv q.yes.no ‘Has he/she told you?’ (Lahu; Matisoff 1982: 325)



(28)

ya54 mi54 ɛ35 tɕʰi33 ta54 kɔʔ45 pi54 ʔ girl baby this proh scare to.non-SAP imp ‘Don’t scare this little girl!’ (Lahu; Matisoff 1982: 326)

Burmese has a postverbal marker /tɕá/ which indicates distributed multiple action as in (29), which often involves a plural sbj; for discussion, see Okell and Allott (2001: 16–17). Like most postverbal elements, it undergoes juncture voicing. This form can also be used with a reciprocal meaning as in (30). The Burmese reciprocal has a reduplicated preverbal adverb form /tɕʰìndʑìn/ as seen in spoken Burmese (30) combined with /tɕá/ and in (31) without. This is derived from the single-syllable postnominal reciprocal as seen in (32) and (33); less often, this has a sequential meaning as seen in (33) and (34); unusually, in the latter it is postverbal rather than single-syllable postnominal or reduplicated preverbal. (29)

təjau’ nɛ´ təjau’ mə tɛ´ dʑá bù one clf com one clf neg1 get.along distr neg2 ‘(They) do not get along well with each other.’ (Okell and Allott 2001: 37)

(30)

təjau’ nɛ´ təjau’ tɕʰìndʑìn mə tɛ´ dʑá bù one.clf com one.clf recp neg1 get.along distr neg2 ‘(They) do not get along well with each other.’ (modified)

(31)

təjau’ nɛ´ təjau’ tɕʰìndʑìn mə tɛ´ bù one clf com one clf recp neg1 get.along neg2 ‘(They) do not get along well with each other.’ (modified 2)

(32)

mèinmá dʑìn tʰain dɛ woman recp sit real ‘The women sit beside each other.’ (Okell and Allott 2001: 37)

(33)

təjau’ tɕʰìn kʰɔ ba one.clf recp call pol ‘Please call them one after the other.’ (Myanmar Language Commission 1983: 71)

(34)

θwà θwà dʑìn go go recp ‘Go one after the other!’ (Myanmar Language Commission 1983: 71)

Reciprocal meanings in Ngwi languages are expressed by postverbal markers, such as partly cognate Lahu /daʔ21/ and Lisu /lɛʔ21 xo33/ or /lɛʔ21 kʰo33/ as in (35) and (36), unlike Burmese (37), which illustrates the most frequent Burmese reciprocal, a preverbal reduplicated adverb. Note that such clauses inherently contain a plural agent/sbj; here, Burmese /tɕá/ can be added after the verb in (37) but is not necessary.


(35)

mɔ21 daʔ21 ve33 see recp nmlz ‘(We) see each other.’ (Lahu; Matisoff 1982: 317)

(36)

mo33 lɛʔ21 xo33 a44 see recp decl ‘(We) see each other.’ (Lisu)

(37)

tɕʰìndʑìn mjin (dʑá) dɛ recp see (distr) real ‘(We) see each other.’ (spoken Burmese)

All Burmic languages have a very large inventory of modal and other grammaticalized verbs, some in pre-head but most in post-head position. Many are clearly related to full verbs, like post-head Lisu /pɯ55/ ‘dare.to.V’ from the full verb /pɯ55/ ‘to.dare’; others are derived from former verbs which no longer occur as main verbs, like post-head Lahu /ga54/ ‘want.to.V’ which still occurs in bound form as a pair for /dɔ54/ ‘to.think’ as in /dɔ54 ve33 ga54 ve33/ ‘think’ or /ma54 dɔ54 ma54 ga54/ ‘not think’. Some have no obvious synchronic verbal source, like post-head Burmese /tɕʰin/ ‘want.to.V’. There are extremely complex patterns of combination of these preverbal and postverbal elements, as very extensively discussed in Okell (1969) for spoken Burmese, Matisoff (1982) for Lahu and Bradley (2022) for Lisu. The main post-head subcategories, as in many Tibeto-Burman languages, include directionals (Matisoff’s juxtacapitals) after the verb, then a wide variety of modals, sometimes with more than one sequence possible to express different meanings, and finally some expressives. Sequences of two of these elements after a verb are quite frequent; longer strings are less so. Where a particular sequence is not grammatical, and sometimes even when it is, a string of post-head elements may be interrupted by a nonfinal clause marker, such as Burmese /ló/ ‘cause’ or /pʰó/ ‘for’ or Lisu /sɿ55/ ‘sequence’. The position and scope of negation differs between languages; in Burmese, negation normally precedes the main verb as in (38), but in most Ngwi languages it tends to precede most post-head modals rather than a preceding main verb as in (39).

(38)

mə θwà nain bù neg1 go can neg2 ‘(I) can’t go.’ (spoken Burmese)

(39)

dʑe33 ma21 da33 go neg can ‘(I) can’t go.’ (Lisu)

All Burmic languages have some pairs of verbs which are in a simplex/causative relationship: one intransitive or stative and the other transitive. The latter reflects the Tibeto-Burman *s- valency-increasing prefix, which is no longer productive in any Burmic language. The largest inventory is in Burmese with 65 so-called H/Non-H pairs;




the non-H forms have unaspirated stop or affricate or voiced sonorant initials, and the valency-increased H forms have aspirated stop or affricate or voiceless sonorant initials and sometimes creaky tone (Okell 1969 I: 205–208); for example, Burmese /lu’/ ‘be.free’, /hlu’/ ‘to.set.free’; /pwín/ ‘be.open’, /pʰwín/ ‘to.open’. In Ngwi languages the forms differ in tone, as in Lahu /tɕa54/ ‘to.eat’, /tɕa11/ ‘to.feed’, and sometimes in initial as well, as in Lisu /dʑo44/ ‘be.afraid’, /tɕo35/ ‘to.scare’ and /do33/ ‘to.drink’ and /to44/ ‘to.give.to.drink’. In addition to these lexicalized forms, all Burmic languages also have a syntactic valency-increasing post-head modal which occurs after active verbs; the form is cognate across most Burmic languages: Burmese /se/, Lahu /cɨ/, Lisu /tsi44/ and so on. This follows a post-head directional but precedes other post-head elements. In sentences with a valency-increased form, the case role of the causer of the action can be marked by the agt postposition in languages which have these; the causee (who is the agt or sbj in the corresponding clause without valency increase) is extremely likely to be marked by the obj postposition; this marking is much more frequent than obj marking in other syntactic contexts. Many Burmese H verbs and lexicalized causative forms in Ngwi languages which are already valency-increased by the etymological prefix can have their valency further increased by this postverbal element, such as Burmese /pʰwín/ ‘to.open’+ /se/ ‘cause’ or Lisu /to44/ ‘to.give.to.drink’ + /tsi44/ ‘cause’. 
In some Ngwi languages, there is an alternative causative with grammaticalized verb ‘to.give’ plus another verb; this is the only such construction in Akha, as in /bi21/ ‘to.give’ + /dza21/ ‘to.eat’ > ‘to.feed’, and also occurs sometimes in Lisu, as in /gɯ21/ ‘to.give’ + /dza21/ ‘to.eat’ > ‘to.feed’; conversely, Burmese uses verb plus grammaticalized ‘to.give’ in the opposite order, as in /θin/ ‘to.study’, /θin/ + /pè/ ‘to.teach’, as does Lahu (Matisoff 1982: 247–248).

17.5.1 Clause combining

Non-final clauses are linked to and precede a final main clause or are embedded as an NP complement or a relative clause component of an NP within a main clause. In some Burmic languages such as Burmese and Lahu, nominalized clauses in embedded form also occur very frequently in sentence-final position. In all cases, the main marker of linkage occurs at the end of the clause. Interestingly, the realis versus irrealis opposition which is neutralized in spoken Burmese final negative clauses is preserved in negated spoken Burmese relative and complement clauses, including nominalized clause-final complements. Sequences of clauses can be combined in Burmic languages without any overt linkage marker. With extensive zero anaphora, this often results in a sequence of very short clauses, sometimes just two or more verbs directly juxtaposed. Alternatively, there may be any of a variety of markers after the verb of a nonfinal clause, with a possible intonation break and then the final clause. These clause-final markers in non-final clauses have various types of meanings: simultaneous, sequential, conditional, causal and so on; the semantic packaging differs between Burmic languages. Simultaneity and weak causation is expressed by literary Burmese /θɔ/ and spoken Burmese /tɔ´/; simultaneity by Lahu /tʰa54/ and Lisu /tʰɛ21/; prior sequence (‘before’) by literary Burmese /mi/ and spoken Burmese /kʰin/, subsequent sequence (‘and then’) by Burmese /pjì/ and Lisu /sɿ55/, sequence and cause by Lahu /lɛ33/, consequence by Lisu /ɲi44/, condition and cause by Lisu /ɲa44/, condition by Lahu /qo33/, literary Burmese /hljin/ and spoken Burmese /jin/, cause by spoken Burmese /ló/ or less frequently /mó/ which is more often postnominal and by literary Burmese /tɕáun/, and so on. Some languages also have possible combinations of two or more of these forms, such as Lisu /sɿ55 ɲi44/ ‘sequence + consequence’ or Burmese /pjì jin/ ‘finish + cond’, /mó ló/ ‘cause + cause’ or even /tɕáun mó ló/ ‘cause + cause + cause’ mixing literary and spoken forms. For more discussion and examples, see Okell (1969), Okell and Allott (2001), Matisoff (1982) and Bradley (2022). Semantically parallel but structurally different clause linkage strategies include the use of an entire short clause linking a preceding nonfinal and a following final clause, such as Lahu /qʰe33 te33 lɛ33/ ‘like do sequence’ as in (40).

(40)

qai33 ve33 qʰe33 te33 lɛ33 … go nmlz like do sequence/cause ‘(I) went, and so then …’ (Lahu)

The Lahu linkage marker /qʰe33 te33 lɛ33/ can also be used as a final clause with the meaning ‘and so on’ following a nonfinal clause, or as an initial clause without a preceding linked nonfinal clause; it is also often used without /qʰe33/ in narrative (Matisoff 1988: 291). Some combinations of nominalized clause plus abstract head NP have semantics similar to a nonfinal clause with a cause marker, but they are headed complement clauses.
These have a stronger causative meaning than nonfinal clauses with the cause marker like Lisu /ɲi44/ or Lahu /lɛ33/ in (41) and (43). One such noun is Lisu /pɯ55 du33/, Lahu /pa33 tɔ33/ ‘cause’ as in (42) and (44); another frequent abstract noun in this frame is Lisu /wu55 ɲi44/, Lahu /ɔ21 lɔn33/ ‘reason’. (41)

phu33 ma21 dʑo33 ɲi44, dʑe33 ma21 da33 money neg have cause go neg can ‘Having no money, (we) can’t go.’ (Lisu)

(42)

phu33 ma21 dʑo33 a44 ma44 pɯ55 du33, dʑe33 ma21 da33 money neg have nmlz cause go neg can ‘Because (we) have no money, (we) can’t go.’ (Lisu)

(43)

pʰu33 ma54 cɔ21 lɛ33, qai33 ma54 pʰɛʔ21 money neg have cause go neg can ‘Having no money, (we) can’t go.’ (Lahu)



(44)

pʰu33 ma54 cɔ21 ve33 pa33 tɔ33, qai33 ma54 pʰɛʔ21 money neg have nmlz cause go neg can ‘Because (we) have no money, (we) can’t go.’ (Lahu; Matisoff 1982: 159)

For simultaneous clauses, other patterns of structure with a linking nominal also occur; for example, in Lahu (45), clause + /ɔ21 jan54 tʰa54/ ‘time simultaneous’; this is structurally ambiguous between a headed complement and a headed relative clause construction. In spoken Burmese, a relative clause followed by a head noun as in (46), Clause + /tɛ´ ətɕʰein/ ‘rel time’, is semantically analogous to a nonfinal clause.

(45)

qai33 ve33 ɔ21 jan54 tʰa54 … go nmlz time simultaneous ‘At the time when (I) went, …’ (Lahu)

(46)

θwà dɛ´ ətɕʰein … go rel time ‘The time when (I) went, …’ (spoken Burmese)

There is no lexical disjunction ‘or’ form in Burmic languages; this meaning is expressed by the first alternative followed by a short conditional clause neg be.the.case cond ‘if it is not the case that’, then the clause giving the second alternative: spoken Burmese Clause + /məhou’jin/ + Clause, Lisu Clause + /ma21 ŋa33 ɲa44/ + Clause, Lahu Clause + /ma54 heʔ45 qo33/ + Clause and so on.

All Burmic languages can nominalize a clause and have it function as an NP within another clause. In most Ngwi languages it can be difficult to distinguish nominalized complement clauses from relative clauses as some of the forms used are the same and both may have a following NP head or be headless, but in Burmese the two are distinct; relative clauses are headed by a following NP and the relative and nominalizing markers are distinct. The spoken Burmese relative markers are realis /tɛ´/ and irr /mɛ´/; in literary Burmese the forms are realis /θí/ or /θɔ´/ and irr /mjí/; all these are derived from clause-final markers with even tone by grammatical induced creak as discussed in 17.3 above; see (5b) and (46) for literary and spoken examples. The unmarked complement nominalizer which does not indicate temporal, causal or other linking relationships between clauses in sequence is spoken Burmese realis /ta/ and irr /hma/, both contractions of relative /tɛ´/ and /mɛ´/ plus the dummy noun /ha/ ‘thing’; literary Burmese also uses irr /hma/, but uses realis /θi/ rather than /ta/. Lahu uses /ve33/, Lisu uses /a44 ma44/, Akha uses /ɤ33/. There are various other clause nominalizers in Burmic languages which follow the final verb of the clause and relate to the case role of the NP within the clause.
In Burmese, /θu/ is an agent/sbj nominalizer of verbs or clauses; /ja/ is a loc nominalizer of verbs or clauses which may also occur after nominals; this same form also has various clause-linking functions: alternative ways of expressing simultaneity, purpose or means (Okell and Allott 2001: 180–182). The loc nominalizer in Lahu is /kɨ21/; the Lisu form is /gu33/; this has nominalizer cognates elsewhere in Tibeto-Burman. The


instrumental nominalizer is Burmese /nì/, Lahu /tu21/, Lisu /du33/; Akha uses cognate /du21/ both for instrumental and locative nominalization; *du1 is a widespread Tibeto-Burman instrument nominalizer etymon. In all Burmic languages, the most typical relative clause (without the equi NP) precedes a clause-final relative marker followed by a head NP; such heads are typically in a core syntactic case role (agent, sbj, obj) or a loc or temporal NP in the relative clause; restrictions on case roles differ between languages. In Ngwi languages, a relative clause may fairly frequently lack an overt external head. In Burmese, the relative markers are distinguished by having grammatical induced creak as discussed in 17.3 above. In most Ngwi languages, relative markers are identical to some complement markers and so relative clauses which due to zero anaphora happen to lack an overt head and complement clauses without a head NP can be difficult to distinguish, as are relative clauses and complement clauses with a following external head NP. The main relative marker in Lisu is /a44 ma44/, identical to the nominalizer discussed above; this is used with heads of all types except for an animate agent or sbj. Lisu uses /su44/ as a relativizer or nominalizer with an animate agent/sbj equi-NP, with or without an external NP head; this adds relative clauses to the uses of the corresponding Burmese form /θu/. If a Lisu relative clause with either of the two relativizers is short, typically containing only a single stative verb, it normally follows the head NP. The same pattern, which Matisoff calls right relative clauses, is also seen in Lahu (Matisoff 1982: 490–503). Longer and more complex relative clauses with NP arguments and sequences of relative clauses must precede the head NP in Lisu and Lahu. For an in-depth discussion and examples, see Bradley (2022). 
In Lahu, as Matisoff (1972) has shown, nominalized final main clauses, nonfinal nominalized complement clauses with or without a following NP head and relative clauses with or without a following external NP head have the same clause-final marker /ve33/; this is also the genitive postposition following a possessor NP which precedes the possessed NP. All other Burmic languages distinguish at least some of these; Lahu Shi does not use /ve33/ clause-finally, only as a genitive, nominalizer and relative marker (Bradley 1979a: 202). Unlike Lahu, Burmese distinguishes all three: clauses can be nominalized by a final realis /ta/ or irr /hma/ both finally and as non-final complement clauses, but relative clauses are marked by final literary realis /θí/ or /θɔ´/ or spoken realis /tɛ´/ or by literary irr /mjí/ or spoken irr /mɛ´/; possessors are marked by following /jɛ´/ or zero. Another type of clause combining is quoted and reported speech. In Burmese, quoted and reported speech and thought are not formally distinct. They are marked by a form after the framed speech or thought: literary Burmese /hú/ and spoken Burmese /ló/ and /tɛ´/ as seen in (47) to (49) below; for examples of reported thought, see (18) and (19) above. All three forms show grammaticalized induced creak: /hú/ from an obsolete verb /hu/ ‘speak’; /ló/ most likely from the postposition /lo/ ‘manner/way’; and /tɛ´/ from the clause-final realis /tɛ/. Of the two spoken forms, /ló/ is normally used with a following clause containing a verb of speaking or thinking, while /tɛ´/ is




normally used without a following framing clause. For an extended discussion with examples and statistics from a spoken Burmese corpus, see Bradley (2018b). (47)

tʰo sʰan mjò ko “lòundi sʰan” hú kʰɔ ði that paddy kind obj round-beat paddy quot call real ‘That kind of paddy is called “lòundi sʰan”.’ (Okell and Allott 2001: 250)

(48)

sìn zà òun mɛ ló pjɔ` dɛ consider would irr quot say real ‘(He) said that he would consider it.’ (Okell and Allott 2001: 209)

(49)

di au’ hma dɔ´ pò əsi’ tɛ´ this below loc yet silk real quot ‘At the bottom here, it says “real silk”.’ (Okell and Allott 2001: 76)

Quoted speech in Ngwi languages is usually immediately followed by a quotation marker, Lahu /tɛʔ21/, Lisu /be33/ and Akha /lɛ55/; the quoted material is normally preceded and/or followed by framing clauses which may contain some or all of a temporal NP, a loc NP, a speaker NP, an addressee NP, a speaking verb and relevant clause-final elements, as in (50) and (51). Extended examples from Lisu, Lahu and Akha are given in Bradley (2018a).

(50)

mɔ55 hɯ55 hɔ55 ɤ33 mɔ55 ɲi55 a21 ha33 ɲa21 nm55 ɔn55 ɛ55nɛ21 ɤ33 Maw big and Maw two they.two sibling obj tell decl

[long quote of five clauses] lɛ55 ɛ55 nɛ21 ɤ33 mɛ33 quot tell decl 3.positive

‘He told the two siblings Maw Hui and Maw Nyi “[quote]”, (he) told them.’ (Akha; quoted in Bradley 2018a)

(51)

da11 viʔ45 jɔ54 tʰaʔ21 “dʑɔ54 mɔ54” tɛʔ21 qoʔ45 ve33 ko33  … David he obj Lord quot say nmlz cond ‘If David calls him “Lord”, …’ (Lahu; Matthew 22: 45, quoted in Bradley 2018a)

Quoted and reported speech are usually distinct in Ngwi languages. There is a clause-final reported speech evidential, Lahu /tɕe54/, Lisu /dʑo21/ (with various dialect forms including Southern /do21/) and Akha /dʑe55/, distinct from direct quote framing. In Lisu, sometimes the quote marker /be33/ can be omitted if a postquote frame is present; and sometimes the reported speech evidential may also be used after the postquote frame of a direct quote, especially in Southern Lisu, as in (52).

(52) a55 tʰe33 a44 “a55 ŋwa33 a55 ti55 gɯ21 iʔ21 ta55 ma21” bæ55 do21
ana-this loc interj I a.bit let sleep stat pol say.dec quot
‘Here he said “Ah, let me stay sleeping a bit”.’ (Southern Lisu; Bradley 2018a)


17.5.2 Pragmatics and syntax

Burmic languages have a range of strategies for expressing topic and focus. In general, a topic NP is moved to initial position in the clause, often overtly marked by an NP-final topic postposition. Each language has a substantial inventory of topic markers; the most frequent are literary Burmese /hma/, spoken Burmese /ká/, Lisu /ɲa44/ and Lahu /ɔ11/. In Lisu, a secondary topic is marked by /na21/. Matisoff (1982: 167–177) discusses various other Lahu topic markers, the strongest of which is /mɛ33/; some are also clause-final markers like /tʰɔ54/ and /kaʔ21/ ‘even/although’ or forms from other dialects like Lahu Nyi /pɔʔ21/; Lahu Shehleh uses the form /ka11/ instead of /kaʔ21/ (Bradley 1979a: 202). Sometimes a topic has no syntactic role in the clause and is simply added clause-initially. There is also a focus or afterthought position after the verb; NPs in this position are usually separated by an intonation break after the verb.

There are various pragmatic markers found in all Burmic languages. One category relates to questions. All Burmic languages have sentence-final question markers, often distinguishing yes/no questions from substance questions. In yes/no questions, most Burmic languages use a sentence-final polar question form cognate with Proto-Burmic *la2. The usual Burmese form is /là/; there is also an archaic literary form /lɔ`/ and a rural form /sá/; where a yes answer (or agreement to a statement) is expected or desired, the form is /nɔ/ or /nɔ`/. The Lahu form is /la54/; some dialects of Lisu use /la21/ and others use the shortened form /a21/ or the alternative forms /lɛ21/ or /ɛ21/. In Akha, the corresponding form is /lo55/, but closely related Hani instead uses /la21/; Akha only uses /la21/ as a tag requesting agreement.
In most Burmic languages, a polar question can be answered with a single word interjection or short positive or negative clause, or by repeating the verb with positive or negative polarity, or both, with the interjection or short clause first, as in the spoken Burmese examples in (53) and the Lisu examples in (54) giving a question and then positive and negative answers.

(53) Q  nɛ’ pʰjan la mə là
        tomorrow come irr q.yes.no
        ‘Will (you) come tomorrow?’ (spoken Burmese)

     A  hou’ kɛ´                          mə hou’ pʰù
        be.the.case compl                 neg1 be.the.case neg2
        ‘Yes.’                            ‘No.’

        la mɛ                             mə la bù
        come irr                          neg1 come neg2
        ‘(I) will come (tomorrow).’       ‘(I will) not come (tomorrow).’

        hou’ kɛ´, la mɛ                   mə hou’ pʰù, mə la bù
        ‘Yes, (I) will come (tomorrow).’  ‘No, (I will) not come (tomorrow).’

(54) Q  sa55 nɛʔ21 la33 la21
        tomorrow come Q.yes.no
        ‘Will (you) come tomorrow?’ (Lisu)

     A  ŋo33                              ma21 ŋa33
        cop                               neg cop
        ‘Yes.’                            ‘No.’

        la33 a44 ŋo33                     ma21 la33
        come decl fut                     neg come
        ‘(I) will come.’                  ‘(I will) not come.’

        ŋo33, la33 a44 ŋo33               ma21 ŋa33, ma21 la33
        ‘Yes, (I) will come.’             ‘No, (I will) not come.’

In substance questions, the question pronoun is usually in its normal position in the clause, though it may also be fronted for pragmatic reasons. Substance questions typically end in /lɛ`/ in spoken and literary Burmese, though there is also a decreasingly used literary form /nì/ and an abrupt spoken form /tòun/. The Lahu counterpart is /le33/; Lisu and Akha do not require a final question marker where the clause contains a question pronoun, but Lisu may use the same final form /la21/ for both yes/no and substance questions. These can be answered with just the requested nominal material, or with the nominal material in a short full clause with the same verb as the question. Lahu has a distinct marker /na11/ for embedded questions which are contained within another clause, such as in quotes; this replaces either the yes/no question marker /la54/ or the substance question marker /le33/ (Matisoff 1982: 471). A negative answer to a substance question involves the repetition of the question noun with case markers if any, followed by Burmese /hmá/, Lisu /e55/, Lahu /ɛ35/ and Akha /i21/ and then the verb negated, as seen in (55) below.

(55) Q  bəðu la lɛ`
        who come q
        ‘Who came?’ (Burmese)

     A  bəðu hmá mə la bù
        who at.all neg1 come neg2
        ‘Nobody came.’

     Q  a21 ma33 la33 (la21)
        who come q
        ‘Who came?’ (Lisu)

     A  a21 ma33 e55 ma21 la33
        who at.all neg come
        ‘Nobody came.’

There are also extensive systems of imperative, hortative, urging and emphasis markers which likewise occur sentence-finally. In Ngwi languages, the most frequent and unmarked of the imperatives is simply an added glottal stop after the verb. Most Burmic languages have substantial systems of sentence-final epistemic and evidential markers which are mainly used in statements with reference to first-person degree of certainty or source of evidence. Some of these combine evidential and epistemic meanings, like Central Lisu /bɯ33/ which indicates that the speaker used to think otherwise, but is now absolutely certain that something is so, as in (56); for a fuller discussion of the Lisu evidential and epistemic system, which differs substantially between dialects, see Bradley (2010); concerning Akha, which is even more complex, see Kya Heh (2003).

(56) mu21 ɣɯʔ21 tʰi21 ɣɯʔ21 de21 dʐa44 bɯ33
horse blessing one blessing beg help epistemic4
‘I am certain that you will help me to beg for the blessing of horses; I used to think otherwise.’ (Central Lisu; adapted from Bradley 2010: 66)

Epistemic systems in these languages usually involve a number of degrees of certainty. Lahu has a relatively simple system in which /hɛ35/ implies uncertainty or doubt and /nɛ21ɔ11/ implies near-certainty (Matisoff 1982: 469). Northern Lisu has a five-degree system: /lo44/ ‘absolutely certain’; /pʰa55a21/ ‘most likely to’; /bɯ33/ ‘fairly sure (used to think otherwise)’; /do44/ ‘likely’ and /pʰe55/ ‘doubtful’ (Bradley 2010: 65); see (57).

(57) mo21 ɣɯʔ21 tʰi21 ɣɯʔ21 de21 dʒa44 bɯ33
horse blessing one blessing beg help epistemic3
‘I am fairly sure that you will help me to beg for the blessing of horses; I used to think otherwise.’ (Northern Lisu; Bradley 2010: 66)

Note that the same form can also have a completely different meaning in another dialect. In Southern Lisu, the related form /bo35/ is a visual evidential, not an epistemic, as in (58).

(58) mo21 ɣɯʔ21 tʰi21 ɣɯʔ21 de21 dza44 bo35
horse blessing one blessing beg help evidential.visual
‘I see that you will help me to beg for the blessing of horses.’ (Southern Lisu; adapted from Bradley 2010: 66)

Most Ngwi languages have a clause-final reported speech evidential; this can sometimes follow another epistemic or evidential. These include Lisu /dʑo21/, Lahu /tɕe54/ and Akha /dʑe55/ as noted above in 17.5.1 in the discussion of quoted speech. Syntactically, these are like the other epistemics and evidentials in these languages; pragmatically, they suggest somewhat less certainty by the speaker about the accuracy of the quote and the information which it contains than in a direct quote.

17.6 Other interesting features

One striking discourse phenomenon in all Burmic languages is a strong preference for zero anaphora where the NP referents are clear, and even sometimes when they are not so obvious. Unless previous context suggests otherwise, without an overt agent/sbj, a statement is normally interpreted as first person, and a question or imperative as second person; an NP argument is unlikely to occur more than once per sentence, no matter how many clauses the sentence contains. Pronouns are relatively infrequent in normal conversation and narrative, and their presence is pragmatically marked.




However, unlike some other languages, in Burmic languages, overt agent/sbj pronouns do occur in imperatives.

Under new laws and practices concerning incest which came in with British rule between 1825 and 1885, cross-cousin marriage, formerly preferred among the Burman population, was forbidden, and this affected the usage of some kinship terms. Judson (1852), compiled mainly in the 1830s and 1840s in the independent Myanmar kingdom, reports a system with distinct terms for uncles and aunts: /bá dʑì/ ‘father’s elder brother/mother’s elder sister’s husband’, /bá tʰwè/ ‘father’s younger brother/mother’s younger sister’s husband’, /mí dʑì/ ‘mother’s elder sister/father’s elder brother’s wife’, /mí tʰwè/ ‘mother’s younger sister/father’s younger brother’s wife’, /ù jì/ ‘mother’s elder brother/father’s elder sister’s husband’, /ù mìn/ ‘mother’s younger brother/father’s younger sister’s husband’ and /əjì/ ‘father’s sister/mother’s brother’s wife’, with a colloquial alternative /dʑì dɔ/ for ‘mother’s elder sister’. By the time of Tun Nyein (1906), the term /ù mìn/ had gone out of use for addressing or referring to an uncle, but was still used as an address term to older respected males. A distinction for father’s sisters was added with productive ‘big’ and ‘little’ suffixes, /əjì dʑì/ for ‘father’s elder sister’ versus /əjì gəlè/ for ‘father’s younger sister’. Also, a number of additional general terms for aunts not reported by Judson are cited: /ədɔ/ and /dɔ dɔ/ ‘elder aunt’ and /tʰwè dɔ/ and /dɔ lè/ ‘younger aunt’. An additional term /dʑì dʑì/ ‘elder aunt’ also came into use during the early 20th century, but is not included in most dictionaries.
By the time of the 1993 Myanmar Language Commission standard dictionary, regularized terms for maternal uncles were /ù dʑì/ for ‘mother’s elder brother’ and /ù lè/ for ‘mother’s younger brother’, with the former /ù jì/ and a reduced form /wəjì/ also cited for the elder maternal uncle, and the former form for younger maternal uncle /ù mìn/ still used for address to older men. The intimate term /bábá/ ‘father’s elder brother’ is also included. Confusion reigns with the elder maternal aunt term, which is given only as /dʑì dɔ/, not the former /mí dʑì/; general elder aunt terms /dɔ dʑì/ and again /dʑì dɔ/ are cited, also general younger aunt terms /dɔ lè/ and /dɔ dɔ/; all these /dɔ/ are now respelled with ‘d’ instead of ‘t’, reflecting changes in juncture voicing for these terms over the last couple of centuries. Usage of these uncle and aunt terms now varies greatly across Myanmar, often not in accord with earlier meanings, especially for the extended uses of the original uncle and aunt terms for uncles and aunts by marriage, who by law should not be related by blood.

Many Ngwi languages in China and some of those in MSEA have special two-syllable clf systems for groups of family members; nearly all other clf are one syllable. The most widespread are for a group of a father and his children, a group of a mother and her children, and for a grandparent and his or her grandchildren; also one or more for groups of siblings and cousins. Table 2 shows the Lisu and Akha forms; for more discussion and examples, see Bradley (2001). The Akha forms are compounds of the senior kin term ‘father’/‘mother’/‘grandfather’/‘grandmother’ plus /za21/ ‘child’, but in most of the Lisu forms neither syllable is homophonous with a current kinship term and all have /55/ tone in the first syllable.

Tab. 2: Lisu and Akha Family Group clf.

                                            Lisu             Akha
father and children                         /pa55 laʔ21/     /da33 za21/
mother and children                         /ma55 laʔ21/     /ma33 za21/
grandparent and grandchildren               /pi55 liʔ21/     –
grandfather and grandchildren               –                /bɔ55 za21/
grandmother and grandchildren               –                /pi21 za21/
siblings and cousins                        /ʂʅ55 (ɲi33)/    –
male siblings and cousins (male speaker)    –                /mɛ55 nm55/

In most dialects of Lisu including all in MSEA, the collective sibling term has only the first syllable. In Northern Ngwi languages in China, most or all of these two-syllable forms are shortened to one to conform to the normal one-syllable pattern for clf. Lahu, North Burmish languages and Burmese have no such special clf, and simply use the normal human clf. Because of the clear meaning of such a group, these family group clf are very often used after a numeral with no other elements in the NP. To express the same meaning clearly in Burmese, Lahu and other languages which do not have these clf, such a group is referred to or addressed using kin terms before the numeral plus human clf; compare Lisu and Lahu (59).

(59) sa44 ma55 laʔ21
three clf.mother.children
‘mother and two children’ (Lisu)

ɔ21 e33 ɔ21 ja54 ɕɛʔ21 ɣa54
mother child three clf.human
‘mother and two children’ (Lahu)

Most Burmic languages have grammatical markers of politeness. In Ngwi languages, these include various markers which come after everything else in the clause, such as polite imperatives Lahu /mɛ11/ in (60), Lisu /ma21 a21/ in (61), Akha /nɛ21 la21/ in (62) and so on as seen below.

(60) qha21 buʔ45 tɕa54 mɛ11
adv full eat pol
‘Please eat your fill!’ (Lahu; Matisoff 1982: 377)

(61) tɕʰi44 bɯ44 sɿ21 wu55 ta55 kɤ44 o44 ma21a21
Lisu.banjo rely.on stop past pfv pol
‘Please rely on the Lisu banjo and stop!’ (Lisu; Bradley et al. 2008: 22)

(62) tɕɛ55 pju55 ɯ33 tɕɯ33 bi21 nɛ21 la21
paddy white a.bit give pol
‘Please give me a bit of pounded rice.’ (Akha; Kya Heh 2003: 102)

Burmese has a postverbal politeness marker /pa/ which undergoes juncture voicing and is used especially in imperatives but also in questions and statements both in spoken and in literary Burmese. It follows the verb and modal if any but precedes tense-aspect and other clause-final markers; it also occurs finally in positive equational copula sentences, as in (64) which is otherwise identical to the first sentence in (13) above. In literary Burmese, which does not use the second negative, it also occurs finally in negated statements, as in (66), and particularly in imperatives.

(63) tʰəmìn mə sà dɔ´ ba bù
rice neg1 eat yet pol neg2
‘I won’t eat after all’ (spoken Burmese; Okell and Allott 2001: 114)

(64) θu sʰəja ba
he/she teacher pol
‘He/she is a teacher.’ (spoken/literary Burmese)

(65) θu sʰəja mə hou’ pa bù
he/she teacher neg1 be.the.case pol neg2
‘He/she is not a teacher.’ (spoken Burmese)

(66) mə hou’ pa
neg be.the.case pol
‘It is not the case/No.’ (literary Burmese)

These politeness markers are particularly frequent in imperatives. Burmese also has various ways of marking respect for others and humbling of self and ingroup, embedded in the pronoun system and various other areas of the lexicon.

All Burmic languages have substantial lexicalized sets of time ordinal forms for days and years; for example, Lisu /tsʰi44 ni35/ ‘this.year’, /a21 ni35/ ‘last.year’, /ʂʅ44 ni35/ ‘year.before.last’, /ʂʅ55 wu21 ni35/ ‘3.years.ago’, also /nɛ44 ni35/ ‘year.after.next’ and /pʰɛʔ21 ni35/ ‘3.years.in.the.future’. These forms are diachronically unstable; Bradley (2013) shows how they have changed rapidly in the recent history of Burmese. They often contain fossilized forms, like the /-ni35/ ‘year’ form in Lisu which occurs only in year ordinals and has been replaced in its core meaning by the Lisu form /kʰoʔ21/ ‘year’ with widespread cognates across Ngwi languages; different Lisu dialects also have different forms, and some forms are particularly irregular, such as Lisu /nɛ55 hɛʔ21/ or /na55 hjaʔ21/ ‘next.year’ with /hɛʔ21/ or /hjaʔ21/ instead of expected /ni35/; compare Lahu /nɛ35 qʰɔʔ21/ ‘next.year’ with a related first syllable plus the Lahu ‘year’ form and Burmese /nau’ hni’/ which is transparently ‘behind’ + ‘year’. The future cannot yet be seen and is conceptualized as being spatially behind in all Burmic languages; the past is conceptualized as being spatially in front. In Lisu, /ka55 nɛ55/ or /ka55 ɲa55/ is ‘behind’ and also ‘after’; Lahu has /qʰɔʔ21 nɔ35/; thus in ‘next.year’ the nasal-initial first syllable is also irregular in Lahu and some dialects of Lisu.

There are many other interesting areas of the structure of Burmic languages discussed in more depth in the references below.


References

Benedict, Paul K. 1972. Sino-Tibetan: A conspectus. Cambridge: Cambridge University Press.
Bernot, Denise. 1957–1958. Rapports phonétiques entre le dialecte marma et le birman. Bulletin de la Société de Linguistique de Paris 53(1). 273–294.
Bradley, David. 1977. Phunoi or Côông. In David Bradley (ed.), Papers in Southeast Asian linguistics No. 5 (Pacific Linguistics A-49), 67–98. Canberra: Department of Linguistics, Research School of Pacific Studies, Australian National University.
Bradley, David. 1979a. Lahu dialects. Canberra: Australian National University Press.
Bradley, David. 1979b. Proto-Loloish (Scandinavian Institute of Asian Studies Monograph Series 39). London & Malmö: Curzon Press.
Bradley, David. 1980. Phonological convergence between languages in contact: Mon-Khmer structural borrowing in Burmese. Berkeley Linguistics Society 6. 259–267.
Bradley, David. 1982. Register in Burmese. In David Bradley (ed.), Tonation (Pacific Linguistics A-62), 117–132. Canberra: Department of Linguistics, Research School of Pacific Studies, Australian National University.
Bradley, David. 1985. Arakanese vowels. In Graham Thurgood, James Matisoff & David Bradley (eds.), Linguistics of the Sino-Tibetan area: The state of the art (Pacific Linguistics C-87), 180–220. Canberra: Department of Linguistics, Research School of Pacific Studies, Australian National University.
Bradley, David. 1995. Grammaticalization of extent in Mran-Ni. Linguistics of the Tibeto-Burman Area 18(1). 1–28.
Bradley, David. 1996a. Burmese as a lingua franca. In Steven A. Wurm, Peter Mühlhäusler & Darrell T. Tryon (eds.), Atlas of languages of intercultural communication in the Pacific, Asia and the Americas II(1), 745–747. Berlin: Mouton de Gruyter.
Bradley, David. 1996b. Kachin. In Steven A. Wurm, Peter Mühlhäusler & Darrell T. Tryon (eds.), Atlas of languages of intercultural communication in the Pacific, Asia and the Americas II(1), 749–751. Berlin: Mouton de Gruyter.
Bradley, David. 1996c. Lahu. In Steven A. Wurm, Peter Mühlhäusler & Darrell T. Tryon (eds.), Atlas of languages of intercultural communication in the Pacific, Asia and the Americas II(1), 753–755. Berlin: Mouton de Gruyter.
Bradley, David. 2001. Counting the family: Family group classifiers in Yi Branch languages. Anthropological Linguistics 43(1). 1–17.
Bradley, David. 2003. Deictic patterns in Lisu and Southeastern Tibeto-Burman. In David Bradley, Randy LaPolla, Boyd Michailovsky & Graham Thurgood (eds.), Language variation: Papers on variation and change in the Sinosphere and in the Indosphere in honour of James A. Matisoff (Pacific Linguistics 555), 219–236. Canberra: Research School of Pacific and Asian Studies, Australian National University.
Bradley, David. 2005a. Sanie and language loss in China. International Journal of the Sociology of Language 173. 159–176.
Bradley, David. 2005b. Why do numerals show “irregular” correspondence patterns in Tibeto-Burman? Some Southeastern Tibeto-Burman examples. Cahiers de Linguistique – Asie Orientale 34(2). 222–238.
Bradley, David. 2005c. Reflexives in literary and spoken Burmese. In Justin Watkins (ed.), Studies in Burmese linguistics (Pacific Linguistics 570), 67–86. Canberra: Research School of Pacific and Asian Studies, Australian National University.
Bradley, David. 2010. Evidence and certainty in Lisu. Linguistics of the Tibeto-Burman Area 33(2). 63–84.




Bradley, David. 2011. Changes in Burmese phonology and orthography. Keynote, Southeast Asian Linguistics Society 21, Bangkok. https://academia.edu/1559757/Changes_in_Burmese_Phonology_and_Orthography (accessed 10 January 2021).
Bradley, David. 2012. Tone alternations in Ugong. In Cathryn Donohue, Shunichi Ishihara & William Steed (eds.), Quantitative approaches to problems in linguistics: Studies in honour of Phil Rose (LINCOM Studies in Phonetics 08), 55–62. München: Lincom Europa.
Bradley, David. 2013. Time ordinals in Ngwi languages. Paper presented at 3rd Workshop on Sino-Tibetan Languages of Sichuan, Paris. https://academia.edu/6966367/Time_ordinals_in_Ngwi_languages (accessed 10 January 2021).
Bradley, David. 2017a. Lisu. In Graham Thurgood & Randy J. LaPolla (eds.), The Sino-Tibetan languages, 2nd edn., 902–917. London: Routledge.
Bradley, David. 2017b. Space in Lisu. Himalayan Linguistics 16(1). 1–22.
Bradley, David. 2018a. Reported speech in Lisu and Burmic. Paper presented at ICSTLL 51, Kyoto. http://repository.kulib.kyoto-u.ac.jp/dspace/bitstream/2433/235263/1/proc_icstll51_8.pdf (accessed 10 January 2021).
Bradley, David. 2018b. When was or is an evidential not an evidential? Paper presented at CoEDL workshop, Melbourne. https://academia.edu/39871134/WHEN_WAS_OR_IS_AN_EVIDENTIAL_NOT_AN_EVIDENTIAL (accessed 10 January 2021).
Bradley, David. 2019. Dimensional extent in Lisu. Paper presented at ICSTLL 52, Sydney. https://academia.edu/40823708/Dimensional_Extent_in_Lisu (accessed 10 January 2021).
Bradley, David. 2020. East and Mainland Southeast Asia. In Christopher Moseley (ed.), Atlas of the world’s languages, 3rd edn. London: Routledge.
Bradley, David. 2022. Grammar of Lisu. Berlin: Mouton de Gruyter.
Bradley, David, Byabe Loto & David Ngwaza. 2008. Lisu New Year Song. Chiang Mai: Actsco.
Bradley, David & Maya Bradley. 1999. Standardisation of transnational minority languages; Lisu and Lahu. Bulletin Suisse de Linguistique Appliquée 69(1). 75–93.
Bradley, David & Maya Bradley. 2019. Language endangerment. Cambridge: Cambridge University Press.
Bradley, David, with Edward R. Hope, Maya Bradley & James Fish. 2006. Southern Lisu dictionary (STEDT Monograph Series 4). Berkeley: STEDT Project.
Burling, Robbins. 1967. Proto-Lolo-Burmese. International Journal of American Linguistics 33(1). Part II.
Hansson, Inga-Lill. 2017. In Graham Thurgood & Randy J. LaPolla (eds.), The Sino-Tibetan languages, 2nd edn., 885–901. London: Routledge.
Jenny, Mathias & San San Hnin Tun. 2016. Burmese: A comprehensive grammar. London: Routledge.
Judson, Adoniram. 1852. Judson’s Burmese-English dictionary. Moulmein: American Baptist Mission Press.
Kya Heh, Noel. 2003. A descriptive study of Akha sentence final particles. Chiang Mai: Payap University MA thesis.
Luce, Gordon H. 1959. Old Kyaukse and the coming of the Burmans. Journal of the Burma Research Society 42. 75–109.
Matisoff, James A. 1972. Lahu nominalization, relativization and genitivization. In John Kimball (ed.), Syntax and semantics I, 237–257. New York: Seminar Press.
Matisoff, James A. 1982. The grammar of Lahu (University of California Publications in Linguistics 75), 2nd printing. Berkeley: University of California Press.
Matisoff, James A. 1988. The dictionary of Lahu (University of California Publications in Linguistics 111). Berkeley: University of California Press.
Matisoff, James A. 2003. Handbook of Proto-Tibeto-Burman (University of California Publications in Linguistics 135). Berkeley: University of California Press.


Matisoff, James A. 2017. Lahu. In Graham Thurgood & Randy J. LaPolla (eds.), The Sino-Tibetan languages, 2nd edn., 918–931. London: Routledge.
Myanmar Language Commission. 1993. Myanmar-English dictionary. Yangon: Ministry of Education.
Okell, John. 1969. A reference grammar of colloquial Burmese. London: Oxford University Press.
Okell, John. 1989. Notes on tone alternation in Maru. In David Bradley, Eugénie J. A. Henderson & Martine Mazaudon (eds.), Prosodic analysis and Asian linguistics: To honour R. K. Sprigg (Pacific Linguistics C-104), 109–114. Canberra: Department of Linguistics, Research School of Pacific Studies, Australian National University.
Okell, John & Anna Allott. 2001. Burmese/Myanmar dictionary of grammatical forms. Richmond, Surrey: Curzon Press.
Shafer, Robert. 1966–1974. Introduction to Sino-Tibetan, parts I–V. Wiesbaden: Harrassowitz.
Stargardt, Janice. 1990. The Ancient Pyu of Burma. Vol. one: Early Pyu cities in a man-made landscape. Cambridge: Publications on Ancient Civilizations in South East Asia; Singapore: Institute of Southeast Asian Studies.
Tian, Mimi & Albert Lee. 2019. Burmese question intonation. Paper presented at ICPhS 20, Melbourne. https://assta.org/proceedings/ICPhS2019/papers/ICPhS_2484.pdf (accessed 10 January 2021).
Tun Nyein, U. 1906. The student’s English-Burmese dictionary. Rangoon: Burma Secretariat.

Atsuhiko Kato

18 Typological profile of Karenic languages

18.1 Introduction

Karenic languages constitute the Karenic branch of the Tibeto-Burman family. The ethnic groups that speak Karenic languages (called “Karenic people” here) include the Bwe, Geba, Gek(h)o, Kayah, Kayaw, Kayan, Manu, Monebwa, Mopwa (Mobwa), Paku, Pa-O, Pwo Karen, Sgaw Karen, Thalebwa, Yeinbaw, Yintale, and others (listed in alphabetical order). Today, these ethnic groups live in the eastern, southeastern, and southwestern parts of Myanmar and the northern and western parts of Thailand.

As pointed out by Kato (2016), the range of people who consider themselves to be ethnic “Karen” varies with political, ethnic, and linguistic context. The Geba people, for example, have their own identity as a group separate from the Karen or Kayah; however, they may sometimes identify themselves as belonging to the Karen and, less often, to the Kayah. Despite this complicated situation, Pwo Karen and Sgaw Karen are always recognized as Karen, and, in the narrowest sense, can be considered to constitute the ethnic “Karen”. In Myanmar, the Pwo Karen and Sgaw Karen frequently hold joint events, including traditional festivals such as the Karen New Year Festival. These two groups also often act together in political movements; the Karen National Union (KNU), one of the largest and keenest anti-government armed groups in Myanmar, is mainly composed of Pwo and Sgaw. Intermarriage between Pwo and Sgaw is also fairly common. In addition to Pwo Karen and Sgaw Karen, the ethnic groups that speak languages close to Sgaw, i.e., Monebwa, Paku, and Thalebwa, often consider themselves Karen. Moreover, the Bwe, Geba, and Mopwa, whose languages cannot be said to be genealogically very close to Pwo or Sgaw but who have strong contact with the Sgaw people, may also consider themselves Karen.
Other groups such as the Kayah, Kayan, and Pa-O usually see themselves as belonging to ethnic groups that are separate from Karen. However, all ethnic groups speaking Karenic languages may be considered as a unified ethnic unit that constitutes “Karen”; for example, a radical Karen nationalist would prefer this because the concept of “Karen” comprises more groups and has a larger population when all Karenic people are treated under the same rubric.

The English word “Karen” probably originated from an old form of the Burmese exonym /kăyìɴ/ (cf. written form: )1, which can be further traced back to a Proto-Karen autonym. Some scholars have attempted to reconstruct it, e.g., *k-ɲaŋA (Solnit 2001), *bra-ka-louŋ (Shintani 2003a), and *k-rjaŋA (Luangthongkum 2012). Autonyms in contemporary Karenic languages include /phlòʊɴ/ (Eastern Pwo), /phlóuɴ/ (Western Pwo), /pɣākəɲɔ́/ (Sgaw), /ɡèɓá/ (Geba), /kəjɛ̄/ (Kayah Li; Solnit 1997: xviii), and /kayân/ (Kayan; Manson 2010: xxi).

1 The Mon exonym for Karen is /kəriəŋ/ (written form: ) (Mathias Jenny, p.c.). This Mon form could also be the origin of the word “Karen”.

https://doi.org/10.1515/9783110558142-018

The Ethnologue (22nd edition; Eberhard et al. 2019) lists 21 Karenic languages: Bwe, Geba, Geko, Mobwa, Paku, Phrae Pwo, Eastern Pwo, Northern Pwo, Western Pwo, Sgaw, Kawyaw, Eastern Kayah, Western Kayah, Kayan, Kayaw, Lahta, Pa-O, Wewaw, Yinbaw, Yintale, and Zayein. Based on its estimated population data, the Karenic languages that have a relatively large population of speakers are: Sgaw (2,250,000), Pwo (1,326,000), Pa-O (858,740), Kayah (178,000), Kayan (133,180), Geba (40,000), Kayaw (20,100), and Bwe (17,200) (population counts in parentheses). The genealogical relationships among the Karenic languages have been discussed by scholars, including Jones (1961), Manson (2002, 2017a), and Shintani (2003b), but opinions vary regarding the subgrouping of the languages.

The Union of Myanmar has two states for Karenic people: Kayin State (Karen State) and Kayah State. These are the only administrative units in the world that are legitimately established for the Karenic people. People speaking Karenic languages are mostly farmers, who make swiddens in mountainous areas and wet rice fields in the plains. The majority of the Pwo Karen and Sgaw Karen population in Myanmar live in the plains of Kayin State, Bago Region, Yangon Region, Mon State, Tanintharyi Region, and Ayeyarwady Region, and some of the population live in urban areas including Yangon and Mawlamyine. However, Pwo Karen and Sgaw Karen in Thailand mainly live in the mountainous areas. Karenic people other than Pwo and Sgaw usually live in the mountainous areas of the southern Shan State, Kayah State, Kayin State, and Bago Region, with the exception of Pa-O, some of whom live in the plains of Kayin State and Mon State.
In the plains, people who speak the Karenic languages are usually Buddhists or Christians, whereas in the mountainous areas, they are usually Christians or animists. The majority of Pa-O and Pwo Karen are Buddhists; Sgaw Karen also has a large Buddhist population; the rest are mostly Christians or animists.

Some of the Karenic languages have writing systems. Pwo Karen in Kayin State has several writing systems, including the Buddhist Pwo Karen Script (called /láithɯ̂lì/ in Pwo; I assume that it was created by Pwo Karen Buddhist monks in the late 18th or early 19th century based on the Mon script, though legend has it that it dates back to the Pagan Period), the Christian Pwo Karen Script (created by American Baptist missionaries in the middle of the 19th century based on the Christian Sgaw Karen Script mentioned below), and the Leke Script (with original new shapes, which may have been created in the middle of the 19th century). For the Pwo Karen scripts, see Stern (1968) and Kato (2001a, 2001b, 2006). Followers of different religions among the Pwo Karen tend to use different writing systems (Leke is the religion specific to Pwo Karen in Kayin State). Many Sgaw Karen use the Christian Sgaw Karen Script; this was created in the 1830s by Wade, an American missionary, based on the Burmese script with considerable modification. Some Buddhist Sgaw also use this, whereas some Buddhist Sgaw use the Buddhist Sgaw Karen Script (called /lìʔtəláʔɲá/ in Sgaw), which was probably created after the 1960s based on the Burmese script; moreover, some Buddhist Sgaw use the Myaing Gyi Ngu Script with original new shapes that were created probably in the 1990s. According to Solnit (1997: 304‒308), Kayah has a roman-letter orthography developed by Catholic missionaries and an Indic-style orthography created by indigenous people. Geba and Gekho also have roman-letter writing systems created by Catholic missionaries. The Pa-O living in Shan State have a Burmese-based writing system that reportedly has a history of hundreds of years, and the Pa-O living in Mon and Kayin States use a modified version of this. There are several other scripts that have been developed recently: for example, the roman-letter Kayan orthography (Manson 2010: 67‒69) and the Burmese-based Manu orthography (Wai Lin Aung 2013: 14).

All Karenic languages have an SVO word order, which is aberrant among the Tibeto-Burman languages, the majority of which are of the SOV-type. Most likely, the ancestor of the Karenic languages originally had an SOV word order, which changed to SVO at the Proto-Karen stage. This change was likely due to contact with some Mon-Khmer language(s). Matisoff (2000) suggests heavy contact with Mon in the late first millennium AD. Manson (2009) suggests that Mon-Khmer loanwords in Karenic languages imply a greater connection with the Palaungic branch of Mon-Khmer than the Monic branch. Kato (2019c) argues that Pwo and Sgaw have loanwords from Mon that can be considered to have been borrowed before Pwo and Sgaw split because they match the regular historical tonal changes observed in pure Karen words; he concludes that Pwo and Sgaw have a long history of contact with Mon, although it is not obvious whether the contact can be traced back to the Proto-Karen stage. Below are some examples. Mon forms shown in angle brackets “< >” indicate Shorto’s (1962) literary forms.

(1) Eastern Pwo /pàiɴtərâɴ/, Western Pwo /təràɴ/, Sgaw /pɛ́trɔ́/ ‘window, door’, cf. Mon
Eastern Pwo /phjâ/, Western Pwo /phjà/, Sgaw /phjá/ ‘market’, cf. Mon
Eastern Pwo /kəmâ/, Western Pwo /kəmà/, Sgaw /kəmá/ ‘pond’, cf. Mon

Kato also notes in the same paper that Burmese and Tai loanwords in Pwo and Sgaw do not match the regular patterns of tonal change, and concludes that the contact of Pwo and Sgaw with Burmese and with Tai languages, including Thai and Shan, is relatively recent. At present, speakers of Karenic languages generally interact extensively with speakers of Burmese or Thai, and the languages have many Burmese loanwords in Myanmar and many Thai loanwords in Thailand.

Dialectal variation within some Karenic languages is considerable. For example, Kato (2019) lists four dialectal groups of Pwo Karen that are not mutually intelligible: Western Pwo Karen, Htoklibang Pwo Karen, Eastern Pwo Karen, and Northern Pwo Karen. For more details regarding Pwo Karen dialects, see Kato (1995, 2009b), Dawkins and Phillips (2009a, b), and Phillips (2017). According to Manson (2017a, 2019), Kayan seems to have an even wider variety of dialects. He says that the “Kayan cluster” is

340 

 Atsuhiko Kato

highly diverse both phonologically and lexically, and that several speech varieties are mutually unintelligible (Manson 2019: 5).

In this chapter, to describe the general typological profile of Karenic languages, I will mainly use samples of the Hpa-an dialect of Eastern Pwo Karen (hereafter, Eastern Pwo Karen is abbreviated as EPK, and Pwo Karen as PK), as this is the Karenic variety with which I am most familiar.2 I will also use many samples of the Hpa-an dialect of Sgaw Karen (hereafter abbreviated as SK), with which I am familiar to a certain extent. As stated by Manson (2017a), the modern comprehensive grammatical studies currently available for Karenic languages are Solnit (1997) for Kayah Li, Manson (2010) for Kayan, and my grammar of Pwo Karen (Kato 2004); thus, in this chapter, Solnit (1997) and Manson (2010) are often referenced for comparison.

18.2 Phonology

18.2.1 Syllable structure and segmental phonemes

The syllable structure of EPK can be represented as C1(C2)V1(V2)(C3)/(T). “C” stands for a consonant, “V” for a vowel, and “T” for a tone. C1 is an initial consonant, C2 is a medial consonant, and C3 is a final consonant. One or two vowels may occur, and are represented with V1 and V2. Bracketed elements may or may not occur. The sequence of V1, V2, and C3 has limited combinations; thus, it is better to treat them as a unit, that is, a rhyme. Many EPK basic words consist of one syllable; thus, we can consider EPK to be a monosyllabic language. This monosyllabicity is also true of the other Karenic languages.

Table 1 shows all the EPK simple onsets and rhymes. The phonetic values that should be noticed are: /θ/[t̪~t̪θ~θ], /c/[tɕ], /ph/[pʰ], /th/[tʰ], /ch/[tɕʰ], /kh/[kʰ], /b/[ɓ], /d/[ɗ], /j/[j~ʝ], /r/[r~ɹ], /i/[əi], /i̠/[ɪ], and /ʁ/[ʁ~ɦ]. The uvular nasal /ɴ/ can only occur as C3, and all the other consonants can occur as C1. The consonants that can occur as C2 are: /w, l, r, j/. The consonants /ɲ/, /ŋ/, and /r/ occur mostly in loanwords from Mon or Burmese.

Rhymes are formed with the following four combinations: V1, V1V2, V1ɴ, and V1V2ɴ. There are 21 rhymes. The final uvular nasal /ɴ/ often functions as an element that only nasalizes the last part of the preceding vowel, especially in rapid speech. /ɴ/ of the rhymes /eiɴ/, /əɯɴ/, and /oʊɴ/ is often entirely dropped; in such cases, no nasalization is observed in the vowels, either. /i̠ɴ/ is only found in Burmese loanwords. The vowel of the rhyme /aɴ/ is phonetically realized as a diphthong [ɑ̆ɔɴ].

2 The author has a certain degree of speaking, listening, reading, and writing competence in Eastern Pwo Karen, and reading competence in Sgaw Karen. He has researched many Pwo Karen and Sgaw Karen dialects and conducted some fieldwork on Geba and Palaychi.




This diphthong differs from the diphthongs of /eiɴ/, /əɯɴ/, /oʊɴ/, and /aiɴ/ because it is a rising diphthong, while the first and second elements of the diphthongs in /eiɴ/ [ei(ɴ)], /əɯɴ/ [əɯ(ɴ)], /oʊɴ/ [oʊ(ɴ)], and /aiɴ/ [aiɴ] are of equal prominence.

Tab. 1: Onsets and rhymes in EPK.

Onsets
  p    ph   b    θ    m    w
  t    th   d    n    l    r
  c    ch   ɕ    ɲ    j
  k    kh   x    ɣ    ŋ
  ʔ    h    ʁ

Rhymes
  i    i̠    e    ɛ
  ɨ    ə    a
  ɯ    ʊ    o    ɔ
  ai
  i̠ɴ   aɴ   əɴ   oɴ
  eiɴ  əɯɴ  aiɴ  oʊɴ
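The syllable template C1(C2)V1(V2)(C3)/(T) described above can be made concrete with a small parser. The following is only a sketch under stated assumptions: the function name `parse_syllable` and the tone labels are hypothetical, the inventories are copied from Table 1 as legible here (with the lax vowel read as /i̠/ and its nasal rhyme as /i̠ɴ/), and tones are assumed to be written with the combining acute, macron, grave, and circumflex used in this chapter.

```python
import unicodedata

# Onsets and medials as listed for EPK; digraphs come first so they match first.
ONSETS = ["ph", "th", "ch", "kh", "p", "b", "θ", "m", "w", "t", "d", "n",
          "l", "r", "c", "ɕ", "ɲ", "j", "k", "x", "ɣ", "ŋ", "ʔ", "h", "ʁ"]
MEDIALS = ["w", "l", "r", "j"]

# Rhymes as read from Table 1 (an assumption of this sketch, not a full analysis).
RHYMES = ["i", "i\u0331", "e", "ɛ", "ɨ", "ə", "a", "ɯ", "ʊ", "o", "ɔ", "ai",
          "i\u0331ɴ", "aɴ", "əɴ", "oɴ", "eiɴ", "əɯɴ", "aiɴ", "oʊɴ"]
RHYME_SET = {unicodedata.normalize("NFD", r) for r in RHYMES}

# Combining diacritics used for the four tones (labels are hypothetical).
TONE_MARKS = {"\u0301": "high", "\u0304": "mid", "\u0300": "low", "\u0302": "falling"}


def parse_syllable(syllable):
    """Split one EPK syllable into (C1, C2, rhyme, tone); tone is None if atonal."""
    decomposed = unicodedata.normalize("NFD", syllable)
    tones = [TONE_MARKS[ch] for ch in decomposed if ch in TONE_MARKS]
    if len(tones) > 1:
        raise ValueError("more than one tone mark in " + syllable)
    base = "".join(ch for ch in decomposed if ch not in TONE_MARKS)

    onset = next((o for o in ONSETS if base.startswith(o)), None)
    if onset is None:
        raise ValueError("no valid onset in " + syllable)
    rest = base[len(onset):]

    medial = None
    # A medial is accepted only if what follows it is itself a listed rhyme.
    if rest[:1] in MEDIALS and rest[1:] in RHYME_SET:
        medial, rest = rest[0], rest[1:]
    if rest not in RHYME_SET:
        raise ValueError("rhyme not in the assumed inventory: " + rest)
    return onset, medial, rest, tones[0] if tones else None


print(parse_syllable("phlā"))   # C1 = ph, C2 = l, rhyme = a, mid tone
print(parse_syllable("ɣéiɴ"))   # C1 = ɣ, no medial, rhyme = eiɴ, high tone
```

Because the rhyme is matched as a unit, the sketch mirrors the observation that V1, V2, and C3 combine in only a limited number of ways; forms whose rhyme is missing from the assumed list are rejected rather than guessed at.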

Next, let us look at SK simple onsets and rhymes for comparison. Table 2 shows all the SK onsets and rhymes. The phonemic transcription of SK follows Kato (2002). SK /θ/ is pronounced [t̪~t̪θ~θ], the same as for Eastern Pwo /θ/. Both come from Proto-Karen *s, which was reconstructed by Haudricourt (1946) and is still preserved in many SK dialects spoken in Thailand. /s/ [s] and /sh/ [sʰ] come from *c and *ch (see Manson 2009), which are also preserved in many SK dialects spoken in Thailand. The SK phonemes /b/ and /d/ are implosives, similar to Eastern Pwo /b/ and /d/. These can be considered reflexes of Proto-Karen *ɓ (or *ʔb) and *ɗ (or *ʔd). In many Karenic languages, these Proto-Karen sounds are preserved as implosives or preglottalized voiced stops. Note that Haudricourt’s (1946) Proto-Karen plain voiced stops *b, *d, and *ɡ have changed to /ph/, /th/, and /kh/ in PK, and /p/, /t/, and /k/ in SK, as will be mentioned again in 18.2.2.

Tab. 2: Onsets and rhymes in SK.

Onsets
  p    ph   b    θ    m    w
  t    th   d    s    sh   z    n    l    r
  c    ch   ɕ    ɲ    j
  k    kh   x    ɣ    ŋ
  ʔ    h    ɦ

Rhymes
  i    e    ɛ
  ɯ    ə    a
  u    o    ɔ
  iʔ   eʔ   ɛʔ
  ɯʔ   əʔ   aʔ
  uʔ   oʔ   ɔʔ


It is certain that Proto-Karen had series of syllable-final nasals and stops, as assumed by scholars including Manson (2009), Solnit (2013), and Luangthongkum (2019). Proto-Karen probably had final nasals such as *-m, *-n, and *-ŋ because these are still present in Pa-O (see Jones 1961). However, regarding the final stops, it is difficult to reconstruct them with indisputable certainty because all Karenic languages have reduced them to a glottal stop or completely dropped them. Pa-O has many rhymes with final stops /-p, -t, -k, -ʔ/; however, as Solnit (2013) and Manson (2017a: 154) state, most of these with /-p, -t, -k/ are borrowed from Tai (Shan) or Pali. SK still preserves a final glottal stop, but final nasals have entirely disappeared. In this regard, Pwo is more conservative than Sgaw. In the Hpa-an dialect of Pwo, Proto-Karen final nasals are preserved as a uvular nasal. Final stops have completely disappeared in Hpa-an; however, a final glottal stop is still preserved in the Kyonbyaw dialect of Western PK and the Tavoy dialect of EPK (see Kato 1995). For example, compare Hpa-an /thò/ ‘pig’ with Kyonbyaw /thoʔ/ ‘pig’ and Tavoy /thòʔ/ ‘pig’.

Generally, Proto-Karen rhymes have been highly simplified in every Karenic language. Such simplification has occurred due to the dropping and merging of final consonants, as mentioned above, as well as the monophthongization of diphthongs. Examples of monophthongization would be EPK /ʔɛ́/ ‘to love’ and SK /ʔɛ̂/ ‘to love’, whose vowels probably used to be *ai, which is preserved in Western PK, e.g. /ʔài/ ‘to love’. The simplification of rhymes is an on-going phenomenon. For example, as mentioned above, the final /ɴ/ of /eiɴ/, /əɯɴ/, and /oʊɴ/ in the Hpa-an dialect of EPK is often dropped. Moreover, the EPK final /-aɴ/ can also be pronounced as /-ɔ/ in some words, such as /náɴ ~ nɔ́/ ‘to wake’ and /màɴ ~ mɔ̀/ ‘to bark (as a dog)’.

The possible combinations of C1 and C2 in EPK are shown in Table 3.
The medial /r/ in EPK generally occurs only in Mon or Burmese loanwords. Table 4 shows the possible combinations of C1 and C2 in SK, as they appear in Wade (1849).

Tab. 3: Possible combinations of C1 and C2 in EPK.
[Table: rows C2 = w, l, r, j; columns C1 = p, θ, t, c, k, ʔ, ph, th, ch, kh, b, d, x, h, m, n, ɲ, j, l; attested combinations are marked “+”.]




Tab. 4: Possible combinations of C1 and C2 in SK.
[Table: rows C2 = w, l, r, j, ɣ; columns C1 = p, θ, t, k, ph, th, kh, b, d, s, sh, x, m, n, ɲ, ŋ, j, l, r; attested combinations are marked “+”.]

The medials of Karenic languages are generally semivowels or liquids. Even the SK fricative medial /-ɣ-/ [ɣ] can be pronounced [ɰ], which is also noted by Manson (2017a: 154), e.g. /pɣā/[pɣā~pɰā] ‘person’. In dialects of the Ayeyarwady delta, it has generally merged with /-w-/ when it occurs after a bilabial, e.g. /pwā/ ‘person’. The SK medial /-ɣ-/ can be traced back in many cases to the Proto-Karen medial *-r-. The Proto-Karen medial *-l- is relatively well preserved in many Karenic languages. An example would be words that mean ‘arrow’: EPK /phlā/, SK /plà/, Bwe /blɛ/ (Henderson 1997: vol. 2, 7), Eastern Kayah Li /plè/ (Solnit 1997: 354), and Kayan /plā/ (Manson 2010: 234).

18.2.2 Suprasegmental phonemes

EPK (the Hpa-an dialect) has four tones, as shown below with examples:

  High-level   á   [a55]           /má/ ‘son-in-law’
  Mid-level    ā   [a̤33 ~ 334]     /mā/ ‘very’
  Low-level    à   [a11]           /mà/ ‘to do’
  Falling      â   [a51]           /mâ/ ‘wife’

The mid-level tone is pronounced with a breathy phonation and may be pronounced with a rising pitch in utterance-final position and before a pause.

SK (the Hpa-an dialect) has four plain tones and two checked tones:

  High-level     á    [a55]    /má/ ‘wife’
  Mid-level      ā    [a33]    /mā/ ‘to do’
  Low-level      à    [a11]    /nà/ ‘ear’
  Falling        â    [a51]    /ɲâ/ ‘fish’
  High-checked   áʔ   [aʔ44]   /náʔ/ ‘sword’
  Low-checked    àʔ   [aʔ11]   /màʔ/ ‘son-in-law’

The high- and low-checked tones can be interpreted either as the high- (or possibly mid-) and low-level tones occurring in a syllable with a final glottal stop, or as tones that are independent of the plain ones. It is uncertain which interpretation is appropriate.


Table 5 shows how the Proto-Karen tones (represented as 1, 2, 2′, and 3)3 reconstructed by Haudricourt (1946, 1953, 1975) have changed in EPK and SK. Tone 3 is the tone with final stops. Haudricourt (1946) only reconstructed Tones 1, 2, and 3, and added one more proto-tone, i.e. Tone 2′, in his 1975 paper.4 B, M, and H represent the initial consonant groups of Proto-Karen. B is the group of voiced consonants, M is the group of voiceless non-aspirated stops including glottalized/implosive stops, and H is the group of voiceless aspirated stops as well as voiceless fricatives and sonorants. The Proto-Karen tones are split in this way based on the types of initial consonants. The tonal split in Karen resembles that of the Tai languages (see Li 1977) in that tones split based on the three types of initial consonants. For details regarding Haudricourt’s reconstruction of Proto-Karen tones, see Kato (2018).

Tab. 5: Proto-Karen tones and the tones of EPK (left) and SK (right).

     1 (plain)           2 (plain)           2′ (plain)          3 (checked)
  B  /à/[11] : /ā/[33]   /ā/[33] : /à/[11]   –                   /á/[55] : /àʔ/[11]
  M  /à/[11] : /á/[55]   /á/[55] : /â/[51]   /á/[55] : /á/[55]   /à/[11] : /áʔ/[44]
  H  /â/[51] : /á/[55]   /á/[55] : /â/[51]   /á/[55] : /á/[55]   /à/[11] : /áʔ/[44]
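The correspondences in Table 5 amount to a lookup from a Proto-Karen initial series and proto-tone to the modern reflexes. The sketch below encodes that lookup together with the regular devoicing of *b, *d, and *ɡ mentioned in 18.2.1. The names `TONE_REFLEXES`, `STOP_REFLEXES`, and `tone_reflexes` are hypothetical, the tone labels are spelled out in English for readability, and Tone 2′ is written with an ASCII apostrophe; vowel developments are deliberately not modelled.

```python
# (initial series, proto-tone) -> (EPK tone, SK tone), transcribed from Table 5.
# "2'" stands for Tone 2′; there is no entry for B2′, which has no examples.
TONE_REFLEXES = {
    ("B", "1"): ("low", "mid"),
    ("M", "1"): ("low", "high"),
    ("H", "1"): ("falling", "high"),
    ("B", "2"): ("mid", "low"),
    ("M", "2"): ("high", "falling"),
    ("H", "2"): ("high", "falling"),
    ("M", "2'"): ("high", "high"),
    ("H", "2'"): ("high", "high"),
    ("B", "3"): ("high", "low-checked"),
    ("M", "3"): ("low", "high-checked"),
    ("H", "3"): ("low", "high-checked"),
}

# Regular reflexes of the Proto-Karen plain voiced stops: (PK, SK).
STOP_REFLEXES = {"b": ("ph", "p"), "d": ("th", "t"), "ɡ": ("kh", "k")}


def tone_reflexes(series, proto_tone):
    """Return (EPK tone, SK tone), or None if the combination is not in Table 5."""
    return TONE_REFLEXES.get((series, proto_tone))


# Proto-Karen *ɡo2 'hot': a B-series initial with Tone 2.
print(STOP_REFLEXES["ɡ"], tone_reflexes("B", "2"))
```

For *ɡo2 ‘hot’ this yields the initials /kh/ and /k/ with mid and low tone respectively, which agrees with the attested EPK /khʊ̄/ and SK /kò/ cited later in this section.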

Haudricourt (1946) reconstructed the Proto-Karen tones using the PK dictionary by Purser and Saya Tun Aung (1922), which also shows corresponding SK forms in brackets. Haudricourt (1975) assumes that Tone 2′ merged with Tone 2 in syllables beginning with B-series consonants at the Proto-Karen stage; thus, there are no contemporary examples of B2′.

All Karenic languages still have tones. The tonal correspondences of 16 Karenic languages with Proto-Karen are shown in Shintani (2003b). If we follow his description and treat tones with a final glottal stop (i.e. checked tones) as independent tones, Karenic languages have between two and six tones.

As has already been mentioned in 18.2.1, the Proto-Karen initials *b, *d, and *ɡ have changed to /ph/, /th/, and /kh/ in PK, and /p/, /t/, and /k/ in SK. This devoicing happened in the process of the tonal split. When Haudricourt (1946) reconstructed Proto-Karen *b, *d, and *ɡ, no Karenic language that has [b], [d], or [ɡ] had been found. However, his assumption that Proto-Karen had voiced stops was borne out by the reports of Luce (1959) and Henderson (1961, 1979) on Bwe Karen, which preserves plain

3 Luangthongkum (2019) claims that Haudricourt’s (1975) reconstruction of Tone 2′ is unnecessary. However, as I discussed in Kato (2018), reconstructing Tone 2′ can explain the tonal developments that occurred in the transition from Proto-Karen to modern Karenic languages. This hypothesis that assumes Tone 2′ is accepted by Solnit (2001, 2013), Shintani (2003b), and Manson (2009). Tones 1, 2, 2′, 3 may be represented as Tones A, B, B′, C or A, B, B′, D by different scholars.
4 Haudricourt himself did not number the added tone as 2′. I am using this numbering for convenience.




voiced stops. According to Shintani (2003b), Proto-Karen *b, *d, and *ɡ are still preserved intact as [b], [d], and [ɡ] in Bwe, Geba, Paku, and Monebwa, and these languages have a four-way contrast of homorganic stops, e.g. /p/, /ph/, /b/, and /ɓ/, which is a contrast that Proto-Karen probably used to have. In the other Karenic languages, *b, *d, and *ɡ have changed to [p], [t], and [k], or [ph], [th], and [kh]. For example, Proto-Karen *ɡo2 ‘hot’ became /khʊ̄/ in EPK and /kò/ in SK. The Geba form corresponding to them is /ɡō/ (Kato 2008a: 191). Similarly, in all Karenic languages except Geba, Proto-Karen voiceless sonorants belonging to the H-series have been voiced in the process of tonal split. For example, Proto-Karen *m̥a1 ‘wife’ became /mâ/ in EPK and /má/ in SK. Geba alone preserves voiceless sonorants (see Kato 2008a and Naw Hsar Shee 2008), and the Geba word corresponding to ‘wife’ is /m̥ɛ́/ (Kato 2008a: 184).

EPK has atonal syllables, which are phonetically unstressed and short, and cannot occur utterance-finally. They must be followed by a syllable with a tone when they occur in an utterance. The only vowel that can occur in an atonal syllable is /ə/, and it is transcribed as /Cə/ without a tonal notation. In an atonal syllable, C2 never occurs. EPK has many disyllabic words that have an atonal first syllable. These are “sesquisyllabic” in Matisoff’s (1973: 84) terms, in that they are composed of an unstressed first syllable and a stressed second syllable. Examples are: /kəchā/ ‘lord, master’, /kəchâɴ/ ‘elephant’, /cəxwà/ ‘king’, /chərâ/ ‘teacher’, /chəná/ ‘evil-spirit’, /təkhwâ/ ‘cousin’, /təɕā/ ‘surely’, /təwâɴ/ ‘village’, /thərài/ ‘expense’, /pənā/ ‘buffalo’, /phəjā/ ‘to release’, and /θədáɴ/ ‘shrimp’. SK also has many sesquisyllabic words, such as /kəsà/ ‘lord, master’, /kəshɔ́/ ‘elephant’, /θərâ/ ‘teacher’, /təkhwá/ ‘cousin’, /θəwɔ́/ ‘village’, /pənà/ ‘buffalo’, and /ləpɔ́/ ‘wave’.

Karenic languages generally have sesquisyllabic words. Solnit (1997: 25) presents Kayah Li examples, such as /kədā/ ‘door’ and /təmɔ̀/ ‘sun’. Manson (2010: 57) presents Kayan examples, such as /na.là/ ‘ear’, /ka.nó̤/ ‘brain’, and /ʔa.cʰû/ ‘thorn’. In PK and SK, the minor syllable in a sesquisyllabic word never has a tone; however, in some other Karenic languages, it may bear tones. For example, according to Solnit (1997: 25), the Kayah Li prefix /ʔi-/, which appears as the minor syllable in a sesquisyllabic word, may have two tones (low or high), whereas full syllables may have five tones.

In some Karenic languages, some prefixes or proclitic morphemes with atonal or unstressed syllables show vowel harmony. For example, according to Kato (2008a), in Geba, the vowels of the personal pronouns /jV/ (1sg), /wV/ (1pl), /nV/ (2sg), and /sV/ (3sg) show vowel harmony with the vowel of the following syllable. ‘V’ here represents an atonal vowel that is the same as that of the following syllable. Examples are: /ja ʔā/ (1sg – to.eat) ‘I ate’, /je lē/ (1sg – to.go) ‘I went’, /jɛ mɛ̄/ (1sg – to.do) ‘I did’, and /jɔ ɗɔ̄/ (1sg – to.speak) ‘I spoke’. Apart from these personal pronouns that occur before a verb or a noun, vowel harmony in Geba is also observed in the negative marker /tV/ that occurs before a verb, in the realis marker /kV/ that occurs before a verb, and in the numeral /tV/ ‘one’ that occurs before a numeral classifier. Not all atonal morphemes show vowel harmony. An example would be the nominalizing prefix /ʔa-/, e.g. /ʔaθɛ̄/ ‘fruit’ (< /ʔa-/ + /θɛ̄/ ‘to bear fruit’). Solnit (1997: 26) reports that vowel harmony is


observed in Kayah Li, Kayaw, and Bwe (see also the “Introduction” of Henderson [1997] written by Solnit). Manson (2010: 152) also reports that a Kayan adverbializer /θa-/ shows vowel harmony. In PK and SK, vowel harmony is not observable.

In EPK, the first syllables of some disyllabic words can be pronounced either as a full syllable with a tone or as an atonal syllable. Examples are: /khʊ́lòɴ ~ khəlòɴ/ ‘mountain’, /phɨ́bàiɴ ~ phəbàiɴ/ ‘blanket’, and /kháɴthài ~ khəthài/ ‘base, foundation’. These forms tend to appear in rapid speech with an atonal syllable. Manson (2017a: 153) mentions that the Kayan word /kʰaŋ42.du42/ [kʰan42.du42] ‘thigh’ is also pronounced with forms that have weakened first syllables including [kʰə.du42] and [kə.du42]. These examples imply that one of the sources of sesquisyllabic words in Karenic languages is the weakening of the full first syllables of disyllabic words.
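The Geba vowel harmony reported by Kato (2008a) and described earlier in this subsection, in which the atonal vowel of a proclitic such as /jV/ copies the vowel of the following syllable, can be sketched as a simple copying rule. The function names and the vowel set below are assumptions made for illustration only; the Geba vowel inventory is not given in this chapter, and the sketch covers only hosts with a single vowel, as in the cited examples.

```python
import unicodedata

# Vowel symbols assumed for this sketch (not a documented Geba inventory).
VOWELS = set("aeiouɛɔəɨɯ")


def first_vowel(syllable):
    """Return the first vowel letter of a syllable, ignoring tone diacritics."""
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in VOWELS:
            return ch
    raise ValueError("no vowel found in " + syllable)


def harmonize(proclitic_onset, host):
    """Build the atonal proclitic /CV/ whose V copies the vowel of the host."""
    return proclitic_onset + first_vowel(host)


# The 1sg pronoun /jV/ before the four verbs cited from Kato (2008a).
for verb in ["ʔā", "lē", "mɛ̄", "ɗɔ̄"]:
    print(harmonize("j", verb), verb)
```

The copied vowel carries no tone mark, reflecting the atonal status of the proclitic. Morphemes that do not harmonize, such as the nominalizing prefix /ʔa-/, would simply bypass this rule.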

18.3 Word classes

Kato (2004, 2008b) groups EPK words into five word classes: nouns, verbs, adverbs, particles, and interjections. It is unnecessary to set up the category of adjectives, and words that denote states are considered a subgroup of verbs, i.e., stative verbs. These word classes can be defined in terms of the bundle of syntactic features that each word class has. Interjections are considered as a special word class that cannot have a syntactic relationship with other elements. For the other four word classes, Kato (2004, 2008b) proposes three tests to distinguish them: Test 1 is whether a word can constitute an utterance on its own; Test 2 is whether a word allows a verb particle to occur; and Test 3 is whether a word can be an argument of a verb. See Table 6.

Tab. 6: Tests to distinguish EPK word classes.

              Test 1   Test 2   Test 3
  Nouns       Yes      No       Yes
  Verbs       Yes      Yes      Yes
  Adverbs     Yes      No       No
  Particles   No       No       No

Nouns, verbs, and adverbs can constitute an utterance on their own, but particles cannot. Only verbs can allow a verb particle to occur. Verb particles are one of the seven kinds of particles. Kato (2004: 370‒372) lists over sixty verb particles, including the negative marker /lə/ and the irrealis marker /mə/. For example, in /ʔəwê mə ɣɛ̂/ (3sg – irr – to.come) ‘S/he will come’, the irrealis marker /mə/ cannot occur without the presence of a verb /ɣɛ̂/; thus, a verb allows a verb particle to occur. Nouns and verbs can function as the subject argument or object argument of a verb. For example,




in /jə θànáɴ láiʔàʊ/ (1sg – to.forget – book) ‘I forgot a book’, the noun /láiʔàʊ/ ‘book’ is the object argument of the verb /θànáɴ/ ‘to forget’, and in /jə θànáɴ ɣɛ̂/ (1sg – to.forget – come) ‘I forgot to come’, the verb /ɣɛ̂/ ‘to come’ functions as the object argument of the same verb, /θànáɴ/ ‘to forget’. The grammars of Karenic languages by Solnit (1997) for Kayah Li, Manson (2010) for Kayan, and Kato (2004) for Pwo Karen, establish different word classes; however, all these grammars commonly recognize nouns, verbs, and adverbs, and consider the class of adjectives as unnecessary.
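The three tests summarized in Table 6 work as a small decision table: each of the four syntactically integrated word classes has its own profile of answers. A minimal sketch of that logic follows; the names `PROFILES` and `classify` are hypothetical, and interjections are left out because they are defined separately, by their lack of syntactic relations.

```python
# (Test 1: can stand alone as an utterance,
#  Test 2: allows a verb particle,
#  Test 3: can be an argument of a verb) -> word class, following Table 6.
PROFILES = {
    (True, False, True): "noun",
    (True, True, True): "verb",
    (True, False, False): "adverb",
    (False, False, False): "particle",
}


def classify(stands_alone, takes_verb_particle, can_be_argument):
    """Map the three test results to a word class, or 'unclassified'."""
    key = (stands_alone, takes_verb_particle, can_be_argument)
    return PROFILES.get(key, "unclassified")


# /ɣɛ̂/ 'to come': forms an utterance, takes the irrealis /mə/, and can be an
# object argument of /θànáɴ/ 'to forget', so it patterns as a verb.
print(classify(True, True, True))
```

Profiles that do not appear in Table 6 are returned as "unclassified" rather than forced into a class, since the table itself makes no claim about them.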

18.4 Word formation

EPK is an isolating type of language; its words do not inflect at all. However, there are three productive word-formation processes: compounding, affixation, and reduplication.

Compounding is a highly productive process of word formation in EPK. Nouns and verbs are involved in compounding and the resultant words are also nouns or verbs. There are four patterns in the formation of compound nouns: N+N > N, N+V > N, V+V > N, and V+N > N. In the formation of compound verbs, there are three patterns: V+V > V, N+V > V, and V+N > V. Here, we will look at four of the patterns mentioned above: N+N > N, N+V > N, V+V > V, and V+N > V.

– N+N > N. In this pattern, the second element is usually the semantic head:

(2) /mé/ ‘eye’ + /thî/ ‘water’ > /méthî/ ‘tear’
(3) /ɣéiɴ/ ‘house’ + /khʊ́/ ‘head’ > /ɣéiɴkhʊ́/ ‘roof of a house’
(4) /phɔ̂/ ‘flower’ + /phə̀ɴ/ ‘pot’ > /phɔ̂phə̀ɴ/ ‘vase’
(5) /déiɴ/ ‘sesame’ + /θʊ́/ ‘oil’ > /déiɴθʊ́/ ‘sesame oil’

However, in some words, the former element is the semantic head, as in (6) and (7), and there are also cases in which both elements can be considered the semantic heads, as in (8) and (9):

(6) /láiɴ/ ‘cart’ + /mí̱/ ‘fire’ > /láiɴmí̱/ ‘train’
(7) /phə̀ɴ/ ‘pot’ + /thà/ ‘iron’ > /phə̀ɴthà/ ‘Chinese pot made of iron’
(8) /mʊ̄/ ‘mother’ + /phā/ ‘father’ > /mʊ̄phā/ ‘parents’
(9) /chái/ ‘rice field’ + /xàʊ/ ‘swidden’ > /cháixàʊ/ ‘agricultural land’

In the “N+N > N” pattern, the second element is sometimes a numeral classifier, as in (10) and (11). Numeral classifiers constitute a subgroup of nouns.


(10) /lái/ ‘writing’ + /béiɴ/ ‘nc[flat thing]’ > /láibéiɴ/ ‘booklet’
(11) /lái/ ‘writing’ + /phlóʊɴ/ ‘nc[round thing]’ > /láiphlóʊɴ/ ‘alphabet’

– N+V > N. This pattern can be considered in terms of the syntactic relationship between N and V: ‘N’ may be the subject of the ‘V’ as in (12), the object as in (13), or an adjunct as in (14) and (15). In many cases, ‘N’ denotes the generic category to which the resultant noun belongs, but this is not always so: in (15), a ‘handle’ is not a kind of ‘hand’.

(12) /lōʊɴ/ ‘stone’ + /jî̱/ ‘be green’ > /lōʊɴjî̱/ ‘jade’
(13) /lái/ ‘writing’ + /pɔ̄/ ‘to read’ > /láipɔ̄/ ‘textbook’
(14) /thà/ ‘iron, needle’ + /chà/ ‘to sew’ > /thàchà/ ‘sewing needle’
(15) /cɯ́/ ‘hand’ + /phóɴ/ ‘to hold’ > /cɯ́phóɴ/ ‘handle, grip’

– V+V > V. Examples are below:

(16) /ʔáɴ/ ‘to eat’ + /ʔɔ̀/ ‘to drink’ > /ʔáɴʔɔ̀/ ‘to eat and drink’
(17) /thé/ ‘to be cut’ + /phà/ ‘to be split’ > /théphà/ ‘to crack, explode’
(18) /ɕɯ́/ ‘be calm’ + /máʊ/ ‘be comfortable’ > /ɕɯ́máʊ/ ‘be peaceful’

– V+N > V. Examples are below. In (20), /pərə̂ɴ/ is not the object noun of the verb /kè/ because /kèpərə̂ɴ/ as a whole can take an object, as in /kèpərə̂ɴ lái/ (to.write – letter) ‘to write a letter’.

(19) /bá/ ‘to hit’ + /θà/ ‘heart’ > /báθà/ ‘to want (something)’
(20) /kè/ ‘to write’ + /pərə̂ɴ/ ‘news’ > /kèpərə̂ɴ/ ‘to write (as a letter)’

Compounding is highly productive in Karenic languages in general. Some compound words can apparently be traced back to the Proto-Karen stage. An example would be words that mean ‘tree leaf’: EPK /θéiɴlá/ (< /θéiɴ/ ‘tree’ + /lá/ ‘leaf’), SK /θêlâ/ (< /θê/ ‘tree’ + /lâ/ ‘leaf’), Kayah Li /sɔle/ (< /sɔ/ ‘tree’ + /le/ ‘leaf’) (Solnit 1997: 44), and Kayan /θə̂ŋlâ/ (< /θə̂ŋ/ ‘tree’ + /lâ/ ‘leaf’) (Manson 2007: 58).

Reduplication in EPK only applies to verbs. Nouns are never reduplicated. Reduplication derives an adverb from a stative verb. For example, /phlɛ́phlɛ́/ ‘fast’ is a reduplicated form of the verb /phlɛ́/ ‘be fast’, and it functions as an adverb that occurs after the verb and modifies it. Other examples are: /xɛ̀xɛ̀/ ‘slowly’ (< /xɛ̀/ ‘be slow’), /ʔáʔá/ ‘much’ (< /ʔá/ ‘be many’), and /ɣì̱ɣì̱/ ‘well’ (< /ɣì̱/ ‘be good’). When the verb is disyllabic, it is usually reduplicated in the pattern of AABB, e.g. /thîthîchàchà/ ‘exactly’ (< /thîchà/ ‘be exact’) and /xɯ̂xɯ̂xàɴxàɴ/ ‘unitedly’ (< /xɯ̂xàɴ/ ‘be united’). However, some disyllabic words are reduplicated in the form of ABAB, e.g. /bádàbádà/ ‘moderately’ (< /bádà/ ‘be moderate’). In Kayan, reduplication also has a function of




deriving adverbs from the stative verbs (Manson 2010: 151). Kayah Li reduplication does not seem to have such a function but has an interesting function of expressing the meaning of ‘also, too, either’ (Solnit 1997: 52‒53). For example, in the sentence /vɛ̄ cwá to to/ (I – to.go – neg – rdp) ‘I won’t go either’, the meaning of ‘either’ is expressed by reduplicating the last syllable of the clause. This process can be applied to any syntactic element as long as it is in clause-final position.

EPK has a limited number of affixes. Kato (2004: 57‒62) lists ten affixes; eight of these are prefixes and two are suffixes. Here, we will look at two important affixes.

– /ʔə-/ is a derivational prefix attached to a verb or noun. This morpheme is related to the Proto-Tibeto-Burman prefix *a- (Benedict 1972: 121‒123; see also Matisoff 2003: 104‒117; and Matisoff 2018), which has various functions in Tibeto-Burman languages, including nominalization. It is homophonous with the 3rd person singular pronoun /ʔə/ ‘s/he; his or her’. It is prefixed to verbs and derives nouns. Examples are: /ʔəkhʊ̄/ ‘hot thing’ (< /ʔə-/ + /khʊ̄/ ‘be hot’), /ʔəchâ/ ‘injury, wound’ (< /ʔə-/ + /châ/ ‘to ache’), /ʔədà/ ‘rug’ (< /ʔə-/ + /dà/ ‘to spread’), /ʔəθá/ ‘fruit’ (< /ʔə-/ + /θá/ ‘to bear fruit’), /ʔəθâɴ/ ‘new one’ (< /ʔə-/ + /θâɴ/ ‘be new’), and /ʔəʔwà/ ‘white’ (< /ʔə-/ + /ʔwà/ ‘be white’). This prefix is also prefixed to a noun and derives another noun. Examples are: /ʔəkhâiɴ/ ‘backside, rear’ (< /ʔə-/ + /khâiɴ/ ‘buttock’) and /ʔəkhʊ́/ ‘roof’ (< /ʔə-/ + /khʊ́/ ‘head’). Moreover, this prefix is also utilized to change a dependent nominal morpheme into an independent noun. For example, in /ʔəchóɴ/ ‘body hair’, the part /chóɴ/ cannot occur as an independent noun but evidently indicates the meaning of ‘body hair’ because it is used in a compound noun; an example of this would be /kháchóɴ/ ‘beard’ (< /khá/ ‘chin’ + /chóɴ/ ‘body hair’). The independent noun /ʔəchóɴ/ is formed by prefixing /ʔə-/ to the bound form /chóɴ/. Examples of nouns of this type are: /ʔənā/ ‘blade’, /ʔəphə̀ɴ/ ‘inside’, /ʔəmèiɴ/ ‘name’, /ʔəkhwâ/ ‘female’, /ʔəmɯ́/ ‘male’, and /ʔəlɯ̄/ ‘voice’. Some words have both forms with and without this prefix. That is, both forms can be used as an independent noun. Examples of this are: /(ʔə)chɯ́/ ‘thorn’, /(ʔə)khlî/ ‘seed’, and /(ʔə)nòʊɴ/ ‘horn’. The prefix /ʔə-/ is evidently cognate with the Kayah Li prefix /ʔa-/ (Solnit 1997: 41‒44), the Kayan prefix /a-/ (Manson 2010: 104‒105), and the SK prefix /ʔə-/. Proto-Karen probably had a prefix that can be reconstructed as *(ʔ)a-, from which these prefixes have developed.

– /chə-/ is a derivational prefix related to the noun /chə̄/, meaning ‘thing’, which originated from Proto-Karen *da2 ‘thing’.5 It is mainly prefixed to verbs and derives nouns. Examples are: /chəkhlàiɴ/ ‘language’ (< /chə-/ + /khlàiɴ/ ‘to speak’), /chəkhʊ̄/ ‘heat, hotness’ (< /chə-/ + /khʊ̄/ ‘be hot’), /chəchə̀ɴ/ ‘rain’ (< /chə-/ + /chə̀ɴ/ ‘to rain’), /chədóɴ/ ‘wall’ (< /chə-/ + /dóɴ/ ‘to fence’), /chəmà/ ‘job’ (< /chə-/ + /mà/ ‘to do’), and /chəʔɛ́/ ‘love’ (< /chə-/ + /ʔɛ́/ ‘to love’). This prefix may also occasionally be attached to a noun. Examples are: /chəpərə̂ɴ/ ‘information’ (< /chə-/ + /pərə̂ɴ/ ‘news about somebody’) and /chəɣàɴ/ ‘picture’ (< /chə-/ + /ɣàɴ/ ‘appearance, figure’). Compared to /ʔə-/, nouns that are derived with /chə-/ tend to have more abstract meanings; compare /ʔəkhʊ̄/ ‘hot thing’ with /chəkhʊ̄/ ‘heat, hotness’. This prefix is cognate with the Kayan prefix /ta-/ (Manson 2010: 105‒107) and SK nominalizer /tà/. It is not certain whether the proto-form *da2 ‘thing’ had already been grammaticalized as a prefix at the Proto-Karen stage.

5 The Proto-Karen *d generally became /th/ in Pwo Karen dialects. In this regard, the onset of the prefix /chə-/ shows an irregular correspondence with the Proto-Karen *d. In Northern Pwo Karen, spoken in northern Thailand, the form corresponding to the Eastern prefix /chə-/ is /thə-/ (see Phillips 2017: 70‒80), whose onset shows a regular correspondence with Proto-Karen. Probably, the onset of the Eastern Pwo prefix /chə-/ used to be an alveolar stop and became /ch/ later for some reason.
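The EPK reduplication patterns described earlier in this section (simple doubling of a monosyllabic stative verb, and AABB or, less commonly, ABAB for disyllabic ones) are mechanical enough to sketch in code. The function name `reduplicate` and its `pattern` parameter are hypothetical; which disyllabic verbs take ABAB is a lexical matter that the sketch leaves to the caller.

```python
def reduplicate(syllables, pattern="AABB"):
    """Derive the reduplicated (adverbial) form of a stative verb.

    `syllables` is a list of one or two syllable strings. Monosyllables are
    simply doubled; disyllables follow the requested pattern.
    """
    if len(syllables) == 1:
        return syllables[0] * 2
    if len(syllables) != 2:
        raise ValueError("only mono- and disyllabic verbs are handled")
    a, b = syllables
    if pattern == "AABB":
        return a + a + b + b
    if pattern == "ABAB":
        return a + b + a + b
    raise ValueError("unknown pattern: " + pattern)


print(reduplicate(["phlɛ́"]))                 # 'be fast' -> 'fast'
print(reduplicate(["thî", "chà"]))           # 'be exact' -> 'exactly' (AABB)
print(reduplicate(["bá", "dà"], "ABAB"))     # 'be moderate' -> 'moderately'
```

Taking the verb as a list of syllables, rather than as one string, avoids having to syllabify the input, which would require the onset and rhyme inventories of 18.2.1.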

18.5 Clause structure

The basic structure of the EPK verb-predicate clause can be schematized as presented in Figure 1. In the position of the verb, represented as ‘V’, concatenated verbs, which will be discussed later, may occur. I call the part consisting of the verb and the verb particle(s) a verb complex. In addition to the elements shown in the schema, after the adverbial elements, another verb complex may occur, which is the V2 of a separated type serial verb construction. Furthermore, some adverbial elements may appear clause-initially.

(NP1) (verb particle(s)) V (verb particle(s)) (NP2) (NP3) (adverbial elements)
      |------------ verb complex ------------|

Fig. 1: Basic structure of the EPK clause.

NP1 is the subject, and NP2 and NP3 are the objects of the verb. When the verb is monotransitive, NP2 can appear, and when the verb is ditransitive, both NP2 and NP3 can appear. When the verb is intransitive, only NP1 can appear. (21), (22), and (23) are examples of intransitive, monotransitive, and ditransitive sentences of EPK, respectively. In many Southeast Asian and East Asian SVO languages, existential and phenomenon sentences utilize a VS order; however, EPK does not use that order in such sentences, as seen in (21). In a ditransitive clause, the noun denoting Recipient occurs immediately after the verb, and the noun denoting Theme follows it, as seen in (23). All Karenic languages have basic SVO word order, and the positions of the arguments of verbs are basically the same as in EPK.

(21) phlòʊɴmwì ɣɛ̂   [EPK]
     guest to.come
     ‘A guest came; There comes a guest.’




(22) ɕáphàɴ dʊ́ θàkhléiɴ   [EPK]
     pn to.hit pn
     ‘Shapan hit Thakhlein.’

(23) ɕáphàɴ phí̱lâɴ θàkhléiɴ khòθá   [EPK]
     pn to.give pn mango
     ‘Shapan gave Thakhlein a mango.’
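The linear template in Figure 1 can be expressed as a small data structure that assembles a clause from its slots. The class name `Clause`, its field names, and the `linearize` method are all hypothetical; the sketch covers only the slot order (NP1, preverbal particles, V, postverbal particles, NP2, NP3, adverbial elements) and none of the conditions on which elements may co-occur.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Clause:
    verb: str
    subject: Optional[str] = None                             # NP1
    pre_particles: List[str] = field(default_factory=list)    # before V
    post_particles: List[str] = field(default_factory=list)   # after V
    objects: List[str] = field(default_factory=list)          # NP2, then NP3
    adverbials: List[str] = field(default_factory=list)

    def linearize(self):
        """Join the filled slots in the order given in Figure 1."""
        if len(self.objects) > 2:
            raise ValueError("at most two object NPs (NP2, NP3)")
        parts = [self.subject] if self.subject else []
        # Verb complex: particles surround the verb.
        parts += self.pre_particles + [self.verb] + self.post_particles
        # In a ditransitive clause the Recipient (NP2) precedes the Theme (NP3).
        parts += self.objects + self.adverbials
        return " ".join(parts)


# Example (23): subject, ditransitive verb, Recipient, Theme.
print(Clause(verb="phí̱lâɴ", subject="ɕáphàɴ",
             objects=["θàkhléiɴ", "khòθá"]).linearize())
```

Keeping the particles in separate preverbal and postverbal lists reflects the fact that each verb particle has a fixed position relative to the verb within the verb complex.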


To see elements other than NPs, let us take sentence (24) as an example:

(24) nə mə ʔáɴ bá kʊ́ ʔáʔá lə́ jə ɣéiɴ phə̀ɴ ɕī̱   [EPK]
     2sg irr to.eat opp cake much loc 1sg house inside too
     ‘You will also get a chance to eat a lot of cake inside my house.’

/mə/ and /bá/ are verb particles. /mə/ is the irrealis marker, and /bá/ denotes opportunity. In this example, /mə ʔáɴ bá/ is the verb complex. Kato (2004) lists 11 verb particles that appear before the verb and 50 verb particles that appear after the verb. In the position of “adverbial elements”, adverbs, adpositional phrases, adverbial particles, and numeral-classifier phrases may occur. In (24), /ʔáʔá/ is an adverb, /lə́ jə ɣéiɴ phə̀ ɴ/ is an adpositional phrase, and /ɕī̱/ is an adverbial particle. These adverbial elements occur after the verb. Adpositional phrases are formed by adpositional particles including /lə́ ~ lé/ ‘at; from; to’, /dē/ ‘with’, /thōɴ/ ‘around’, /nî̱/ ‘as much as’, and /bê … θò/ ‘like’. Adpositional particles generally precede the noun phrase, with the exception of /bê … θò/, which is a circumposition. Out of the constituents that occur in the position of adverbial elements, adpositional phrases may be topicalized and placed in the sentence-initial position (see Section 13 for topicalization). The form /phə̀ ɴ/ ‘inside’ in (24) is a locational noun. Its independent form is /ʔəphə̀ ɴ/ ‘inside’ with the prefix /ʔə-/, which is dropped in most cases when it modifies a noun. Locational nouns occur after the noun that they modify. They are a subgroup of nouns; however, they are special because they can function like an adposition. For example, in /ʔəwê ʔɔ́ (lə́ ) ɣéiɴ phə̀ ɴ/ (3sg – be – loc – house – inside) ‘He is in the house’, the adpositional particle /lə́ / can be omitted because /phə̀ ɴ/ can make the preceding noun function as an adjunct. (However the omission seems to occur usually when the adpositional phrase appears immediately after the verb.) Other locational nouns include /ʔəklà/ ‘between’, /ʔəméjâ/ ‘front’, /ʔəlāɴkhâiɴ/ ‘back’, /ʔəphâɴkhʊ́ / ‘on, above’, and /ʔəphâɴlá/ ‘under’. It is difficult whether we should recognize noun-predicate clauses in EPK. 
In a sentence such as /ʔəwê mwɛ̄ phlòʊɴ/ (3sg – cop – Karen) ‘He is a Karen’, the copular verb /mwɛ̄/ is occasionally dropped. However, some native speakers say that dropping it makes the expression sound rude or sloppy. Therefore, it cannot be said with certainty that the expression /ʔəwê phlòʊɴ/ ‘He is a Karen’ is completely acceptable.

Karenic languages generally do not mark tense. However, in EPK, the irrealis marker /mə/ (also pronounced /mʊ̄/) helps to indicate the future, as seen in (24) above. A simpler EPK example is /jə mə ɣɛ̂/ (1sg – irr – to.come) ‘I will come’. The SK irrealis marker /kə/ also helps to indicate the future, as in /jə kə hɛ́/ (1sg – irr – to.come) ‘I will come’. These markers should not be considered future markers because they can also be used to denote a surmise about past time, as in EPK /mɯ̄ɣá ʔəwê mə ʔɔ́ lə́ jò/ (yesterday – 3sg – irr – to.be – loc – here) ‘Yesterday, he was probably here’. It is nevertheless certain that EPK /mə/ and SK /kə/ are usually used in clauses depicting future events. Kayan has an irrealis prefix /kà-/ (Manson 2010: 140‒141), and Geba also has a preverbal irrealis particle /kV/ (Kato 2008a: 215–216). The forms of SK, Kayan, and Geba are probably cognates. EPK /mə/ is related to the Western Pwo Karen verb particle /mô/, which means ‘want to’. The Western Pwo irrealis marker is /kə/, and this is probably cognate with the SK, Kayan, and Geba forms. Thus, it seems that EPK used to have an irrealis marker cognate with Western Pwo /kə/, which was later replaced by /mə/.

Although Karenic languages do not mark tense, they have markers denoting various aspect categories. EPK has a perfective marker, /jàʊ/, which is an adverbial particle. An example is /ʔəwê ɣɛ̂ jàʊ/ (3sg – to.come – pfv) ‘He has come’. The SK equivalent is /lí/, as in /ʔəwɛ́ hɛ́ lí/ (3sg – come – pfv) ‘He has come’. Neither EPK nor SK has a marker that denotes continuous aspect. Thus, EPK /jə ʔáɴ mì̱/ (1sg – to.eat – rice) and SK /jə ʔɔ̂ mē/ (1sg – to.eat – rice) can both express meanings such as ‘I ate rice’, ‘I was eating rice’, ‘I am eating rice’, and ‘I eat rice (every day)’. Kayan (Manson 2010: 137) has a continuous aspect prefix /âu-/, for example, /âu-mé/ (cont-to.sleep) ‘to be sleeping’, which originated from the verb meaning ‘to exist, live’. However, the EPK verb /ʔɔ́/ and the SK verb /ʔɔ̂/, both of which mean ‘to exist, live’, have not been grammaticalized into forms denoting aspect.

18.6 Pronouns
The EPK pronouns have three forms: form I, form II, and the emphatic form, as is shown in Table 7. Form I is used for the subject of a verb, e.g., /jə ʔáɴ khòθá/ (1sg – to.eat – mango) ‘I ate a mango’. It is also used in Slot 3 (see Section 7) of the noun phrase to denote a possessor, e.g., /jə khòθá/ (1sg – mango) ‘my mango’. Form II is mainly used for the object of a verb, e.g., /jə dá nə̀/ (1sg – to.see – 2sg) ‘I saw you’, or the object of an adpositional particle, e.g., /jə ʔáɴ dē ʔə̀/ (1sg – to.eat – with – 3sg) ‘I ate with him/her’. It is also used for the pronouns that are topicalized (see Section 13), e.g., /jə̀ nɔ́ ʔáɴ khòθá/ (1sg – top – to.eat – mango) ‘As for me, I ate a mango’. The emphatic form is typically used when the pronoun is emphasized, whatever syntactic role the emphasized pronoun has, e.g., /jəwê ʔáɴ khòθá/ (1sg – to.eat – mango) ‘I ate a mango’. When the subject of a sentence is third person singular, the emphatic form /ʔəwê/ is usually used instead of the form I /ʔə/, e.g., /ʔəwê ʔáɴ/ (3sg – to.eat) ‘He ate’.



Tab. 7: EPK pronouns.

       Form I     Form II     Emphatic form
1sg    jə         jə̀          jəwê, jəwêdá
1pl    hə (pə)    hə̀ (pə̀)     həwê (pəwê), həwêdá (pəwêdá)
2sg    nə         nə̀          nəwê, nəwêdá
2pl    nəθí       nəθí        nəθíwê, nəθíwêdá
3sg    ʔə         ʔə̀          ʔəwê, ʔəwêdá
3pl    ʔəθí       ʔəθí        ʔəθíwê, ʔəθíwêdá

In many Karenic languages, pronouns placed before verbs and nouns are phonologically weak, as is the case with the EPK pronouns /jə/ (1sg), /nə/ (2sg), and /ʔə/ (3sg), which have atonic syllables. In EPK, a topicalized pronoun in form II sometimes appears together with the same pronoun in form I in the subject slot of a sentence, as in (25).

(25) jə̀ jə mə lɔ̀ nə̀ [EPK]
     1sg[top] 1sg[sbj] irr to.tell 2sg[obj]
     ‘As for me, I will tell you.’

The weak forms /jə/, /nə/, and /ʔə/ can be interpreted as proclitics, and these forms used before verbs could be taken as an example of the pronominalization that is widely observed in Tibeto-Burman languages (for pronominalization, see, e.g., Nishi 1995). LaPolla (1994: 74) discusses a similar phenomenon in SK in the context of pronominalization.

18.7 Noun phrases
The structure of the Pwo Karen noun phrase can be schematized as in Figure 2. This is a revised version of the schema shown by Kato (2019: 144). Brackets denote optional items.

Slot 1   Slot 2   Slot 3   Slot 4      Slot 5   Slot 6   Slot 7       Slot 8   Slot 9
(RC)     (NM)     (PRON)   HEAD NOUN   (RC)     (ADP)    (NUM + NC)   (PL)     (DEM)

Fig. 2: The order of components within the EPK noun phrase.

Slot 1 is for a pre-head relative clause. Slot 2 is for a noun modifier, a noun that modifies the head noun. Slot 3 is for a pronoun, which denotes a possessor. Slot 4 is for the head noun. Slot 5 is for a post-head relative clause. Slot 6 is for an adpositional phrase.


Slot 7 is for “numeral + numeral classifier”. Slot 8 is for a particle that denotes plurality. Slot 9 is for a demonstrative, more precisely, a particle that has a demonstrative function. Below are examples:

(26) [EPK]
     Slot 2    Slot 3    Slot 4
     phā       (ʔə)      chái
     father    3sg       rice.field
     ‘Father’s rice field’

(27) [EPK]
     Slot 3    Slot 4    Slot 6        Slot 8    Slot 9
     jə        θò        dē khʊ́láʊ     θɛ̀        nɔ́
     1sg       friend    with hat      pl        that
     ‘those friends of mine with hats’

(28) [EPK]
     Slot 1        Slot 4    Slot 5    Slot 6                    Slot 7      Slot 8    Slot 9
     ʔəwê xwè      já        phàdʊ́     lə́ cəpwɛ̄ phâɴkhʊ́ ʔò       θə̄ɴ béiɴ    θɛ̀        nɔ́
     3sg to.buy    fish      be.big    loc table above away      three nc    pl        that
     ‘those three big fish on that table, which he bought’

When a possessor noun modifies the head noun as in (26), the third-person pronoun may appear before the possessed noun. /ʔə chái/ (3sg – rice.field) alone can mean ‘his rice field’.

Slot 7 in Figure 2 is the slot for “numeral (num) + numeral classifier (nc)”, which I call an “NC phrase” here. When a noun is modified by a numeral, the numeral must be followed by a numeral classifier corresponding to the noun. Examples of head nouns with an NC phrase are: /thwí lə dɯ̀/ (dog – one – nc[animal]) ‘one dog’, /phlòʊɴ nī ɣà/ (person – two – nc[person]) ‘two people’, /châiɴ θə̄ɴ béiɴ/ (shirt – three – nc[flat thing]) ‘three shirts’, /khòθá lī phlóʊɴ/ (mango – four – nc[round thing]) ‘four mangos’, and /lé jɛ̄ bòɴ/ (stick – five – nc[long thing]) ‘five sticks’. /dɯ̀/ is usually used for mammals. Birds, fish, and insects are counted with /béiɴ/ (flat thing), and snakes and lizards are counted with /bòɴ/ (long thing).

EPK numeral classifiers are a type of dependent noun. When they occur in Slot 7, they must be preceded by a numeral. Thus, /thwí lə dɯ̀ jò/ (dog – one – nc[animal] – this) ‘this one dog’ and the zero-head /lə dɯ̀ jò/ (one – nc[animal] – this) ‘this one (animal)’ are fine, but */thwí dɯ̀ jò/ (dog – nc[animal] – this) and */dɯ̀ jò/ (nc[animal] – this) are ungrammatical.

It is worth noting that an NC phrase can be floated into the position of the “adverbial elements” in Figure 1 if the entire noun phrase is the subject or object. Thus, /təwâphjā nī ɣà ɣɛ̂ lə́ jò/ (student – two – nc[person] – to.come – loc – here) ‘Two students came here’ can be changed to /təwâphjā ɣɛ̂ lə́ jò nī ɣà/ (student – to.come – loc – here – two – nc[person]).
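The slot template in Figure 2 can be read as a linearization rule: whatever components are present surface in ascending slot order. The following Python sketch illustrates that reading only. The function name `linearize_np` and the dictionary-based input are illustrative assumptions, not part of the analysis, and the sketch deliberately ignores NC-phrase floating and the internal requirements of each slot.

```python
# Slot labels follow Figure 2. Items shown in parentheses there are optional;
# a zero head is also possible, so no slot is treated as obligatory here.
SLOTS = {
    1: "RC", 2: "NM", 3: "PRON", 4: "HEAD NOUN", 5: "RC",
    6: "ADP", 7: "NUM + NC", 8: "PL", 9: "DEM",
}


def linearize_np(fillers):
    """Join the fillers of a noun phrase in ascending slot order.

    `fillers` maps slot numbers (1-9) to strings. Hypothetical helper,
    written for illustration only.
    """
    unknown = set(fillers) - set(SLOTS)
    if unknown:
        raise ValueError(f"unknown slot(s): {sorted(unknown)}")
    return " ".join(fillers[slot] for slot in sorted(fillers))


if __name__ == "__main__":
    # Example (27), with the fillers deliberately supplied out of order.
    print(linearize_np({9: "nɔ́", 3: "jə", 6: "dē khʊ́láʊ", 4: "θò", 8: "θɛ̀"}))
```

Keying the input by slot number rather than by category keeps the two relative-clause positions (Slots 1 and 5) distinct.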




According to Manson’s (2017: 160) generalization about the structure of the noun phrase in Karenic languages, the NC phrase follows the demonstratives. He cites Naw Hsa Eh Ywar’s (2013: 62) Kayan Lahta example:

(29) ɲa˧ ʃa˥ pi˥ do˩ shu˩ ba˩ [Kayan Lahta]
     1sg chicken be.small that six clf
     ‘those six small chickens of mine’

In EPK, however, the NC phrase is placed before the demonstrative, as in (28). A simpler EPK example than (28) would be /já phàdʊ́ nī béiɴ jò/ (fish – be.big – two – nc[flat thing] – this). The order in SK is the same as in EPK: /ɲâ pháʔdô khí bê ʔī/ (fish – be.big – two – nc[flat thing] – this). In this way, the order of elements in the noun phrase varies from one Karenic language to another.

Relative clauses are also formed in various ways in different Karenic languages. In EPK, there are two ways of forming a relative clause: the first uses the relative marker /lə́/ and the second does not. We will first observe the latter case, which is the more colloquial. In this case, when the relativized noun is the subject of the relative clause, the relative clause is placed in Slot 5. In the relative clause, the subject noun is gapped:

(30) phlòʊɴ [ɣɛ̂ lə́ phlòʊɴ thîkhāɴ] [EPK]
     person to.come loc Karen country
     ‘the person who came to Kayin State’

The same phonological string as (30) can be used as a full sentence that means ‘People (a person) came to Kayin State’. When the relativized noun is a non-subject noun, the relative clause is placed in Slot 1. The relativized non-subject noun is gapped in the relative clause:

(31) [jə tháʊ lə́ dàʊ phə̀ɴ] kháɴphài nɔ́ [EPK]
     1sg to.ride loc room inside shoes that
     ‘those shoes which I wear in the room’

In the case in which the relative marker /lə́/ is used, relative clauses are placed in Slot 5 regardless of the syntactic role of the relativized noun. The resumptive pronoun corresponding to the head noun occurs in relative clauses with /lə́/, as seen in (32) and (33). (When the relativized noun is a non-subject noun, the resumptive pronoun occurs only if the noun is an animate noun.)

(32) phlòʊɴ [lə́ ʔə ɣɛ̂ lə́ phlòʊɴ thîkhāɴ] [EPK]
     person rel 3sg to.come loc Karen country
     ‘the person who came to Kayin State’

(33) phlòʊɴ [lə́ jə dʊ́ ʔə̀] nɔ́ [EPK]
     person rel 1sg to.hit 3sg that
     ‘that person whom I hit’


SK has only relative clauses with the relative marker /lə́/, and their usage is very similar to that of EPK:

(34) pɣākəɲɔ́ [lə́ ʔə hɛ́ lə́ pɣākəɲɔ́ kɔ̀] [SK]
     person rel 3sg to.come loc Karen country
     ‘the person who came to Kayin State’

(35) pɣākəɲɔ́ [lə́ jə tɔ̀ ʔɔ̄] nê [SK]
     person rel 1sg to.hit 3sg that
     ‘that person whom I hit’

Kayan has internally headed relative clauses as well as externally headed relative clauses that are formed with relative markers. Manson (2010: 325) argues that the noun meaning ‘woman’ in (36) is the head of the internally headed relative clause that is enclosed in square brackets:

(36) [prà̤ mû wân-bâ písāpʰò sʰɨ́ nṳ̀ prà̤] ŋá mè mwē kʰí pʰò [Kayan]
     woman to.wash-ben child water that clf there top to.be 1sg child
     ‘The woman over there washing the baby is my daughter.’

18.8 Negation
In EPK, when the main clause is negated, the adverbial particle /ʔé/ is placed in the clause-final position:

(37) ʔəwê ʔáɴ bá mì̱ dài ʔé [EPK]
     3sg to.eat opp rice still neg
     ‘He has not managed to eat rice yet.’

When a subordinate clause is negated, the verb particle /lə/ is placed immediately before the verb; simultaneously, the verb particle /bá/, whose origin is unknown, is placed in the final position of the subordinate clause, as in (38). That is, “double negation” (Dryer 2005) is employed in a subordinate clause. /lə/ originated from the Proto-Karen negative marker *ta (Manson 2017a: 157)⁶ and is probably related to the Proto-Tibeto-Burman negative imperative marker *ta (Benedict 1972: 97; Matisoff 2003: 162).

⁶ Forms corresponding to EPK /lə/ in many other Karenic languages still preserve the onset of the Proto-Karen negative marker *ta, e.g., SK /tə/ and Geba /tV/. The Proto-Karen onset *t became /l/ in EPK in two morphemes: /lə/ ‘negative marker’ and the numeral /lə̂ɴ ~ lə/ ‘one’ (see Matisoff’s [2003: 262] Proto-Tibeto-Burman form *tan ‘one’). The corresponding Western Pwo Karen forms are /lə/ ‘negative marker’ and /kə/ ‘one’.




(38) ʔəwê lə ɣɛ̂ lə́ jò bá ʔəkhʊ́còɴ, jə bá mà [EPK]
     3sg neg to.come loc here neg because 1sg must to.do
     ‘Because he did not come here, I have to do (it).’

In (38), /bá/ may also be placed immediately after the verb, i.e. /ʔəwê lə ɣɛ̂ bá lə́ jò ʔəkhʊ́còɴ …/. Sometimes, this negative form /lə V bá/ is used in the main clause, as seen in (39). In this case, the sentence typically presupposes that the listener wants to know the reason for something, and the sentence gives that reason. Thus, (39) can be translated into English as ‘it is that he doesn’t remember me anymore’ or ‘it is because he doesn’t remember me anymore’. When /lə V bá/ is used in the main clause, /bá/ can be omitted.

(39) ʔəwê lə θí̱jâ lə̀ɴ (bá) [EPK]
     3sg neg to.know anymore neg
     ‘(It is that) he doesn’t remember (me) anymore.’

Manson (2017b) concisely summarizes the patterns of negation in Karenic languages. He groups them into five types: (I) the negative marker is placed immediately before the verb, (II) the negative marker is placed immediately before the verb and a second marker is placed immediately after the verb, (III) the negative marker is placed immediately before the verb and a second marker is placed in the clause-final position, (IV) the negative marker is placed immediately after the verb, and (V) the negative marker is placed in the clause-final position. Manson assumes that Type (I) is the original pattern. (37) is an example of (V), (38) is an example of (III), the variant of (38) in which /bá/ is placed immediately after the verb is an example of (II), and (39) without /bá/ is an example of (I). Thus, only Type (IV) is not observed in Pwo Karen. A Pa-O example of Type (IV), from Boote Cooper (2018: 29), is presented in (40). According to Manson, Type (I) is observed in Kayan, Lahta, Gekho, and Paku, (II) in Sgaw, (III) in Bwe, Geba, and Sgaw, (IV) in Pa-O, and (V) in Monu (Manu), Kayaw, Kayah, and Palaychi.

(40) khwè phré lə̀n phé bá tâw na mɔ́k.cɔ́k [Pa-O]
     1sg to.buy to.come to.give to.hit neg 2sg orange
     ‘I didn’t buy you oranges.’
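The five-way typology summarized above depends only on where the negative marker or markers sit relative to the verb and the clause edge. As a rough illustration, the Python sketch below classifies a schematic clause. The label inventory ("V" for the verb or the whole verb complex, "NEG" for any negative marker) and the function name `negation_type` are assumptions made for this example. The sketch also assumes that a marker which is both immediately postverbal and clause-final counts as postverbal, a case the typology itself does not settle.

```python
def negation_type(clause):
    """Classify a schematic clause into negation Types I-V.

    `clause` is a list of labels containing exactly one "V" and one or
    two "NEG" tokens; any other label is treated as irrelevant material.
    Hypothetical helper for illustration only.
    """
    verb = clause.index("V")
    negs = {i for i, label in enumerate(clause) if label == "NEG"}
    last = len(clause) - 1

    pre = (verb - 1) in negs                    # marker right before the verb
    post = (verb + 1) in negs                   # marker right after the verb
    final = last in negs and last > verb + 1    # clause-final, not verb-adjacent

    if pre and post:
        return "II"
    if pre and final:
        return "III"
    if pre:
        return "I"
    if post:
        return "IV"
    if final:
        return "V"
    raise ValueError("no negative marker in a recognized position")


if __name__ == "__main__":
    # Schematic renderings of (37), the subordinate clause of (38), and (40).
    print(negation_type(["S", "V", "PTCL", "O", "ADV", "NEG"]))
    print(negation_type(["S", "NEG", "V", "LOC", "N", "NEG"]))
    print(negation_type(["S", "V", "NEG", "O", "O"]))
```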


18.9 Interrogative sentences
In Karenic languages, polar questions (yes-no questions) are generally indicated by placing a question marker in the clause-final position. In EPK, the polar question marker /ʁâ/, a kind of particle, is used. Below is an example:

(41) nə mə ɣɛ̂ ʁâ [EPK]
     2sg irr to.come q
     ‘Will you come?’

In SK, /ɦá/ is used for making a polar question, as in /nə kə hɛ́ ɦá/ (2sg – irr – to.come – q) ‘Will you come?’. In Kayah Li, the polar question marker is /ɛ̄/ (Solnit 1997: 233), and in Kayan, it is /yá/ (Manson 2010: 271). All of these forms are probably cognates.

In the case of a content question, EPK uses the content question marker /lɛ̂/ in the clause-final position, as in the example below:

(42) nə mə ʔáɴ chənɔ́ lɛ̂ [EPK]
     2sg irr to.eat what q
     ‘What will you eat?’

In this sentence, /lɛ̂/ cannot be omitted. In SK, the form /lɛ̂/, which likewise cannot be omitted, is used in content interrogative sentences, as in /nə kə ʔɔ̂ mənɯ̄ lɛ̂/ (2sg – irr – to.eat – what – q) ‘What will you eat?’. In Kayah, /tē/ is generally used (Solnit 1997: 244). In Kayan, /lé/ is used, but it can be omitted, for example, /tará̤ (lé)/ (what – q) ‘What?’ (Manson 2010: 276). All these forms are probably cognates and probably also cognate with the Burmese content question marker /lɛ́/.

18.10 Verb serialization
EPK has two types of serial verb constructions: the concatenated type and the separated type. Here, we will discuss serial verb constructions that contain a minimum of two verbs, represented as V1 (the first one) and V2 (the second one). The concatenated type does not allow other elements, including a noun phrase, to occur between the two verbs, while the separated type does. The concatenated type and the separated type in Pwo Karen correspond respectively to Aikhenvald and Dixon’s (2006) “contiguous serial verb construction” and “non-contiguous serial verb construction”. Using the terms defined by Role and Reference Grammar (see Van Valin and LaPolla 1997), the concatenated type corresponds to “nuclear juncture”, and the separated type to “core juncture”. (43) and (44) below are examples of the concatenated type:



(43) jə xwè ʔáɴ kʊ́ [EPK]
     1sg to.buy to.eat cake
     ‘I bought and ate cake.’

(44) jə dʊ́ θî thò [EPK]
     1sg to.hit to.die pig
     ‘I hit the pig intending to kill it.’

When the combinations of verbs are Vi+Vi, Vi+Vt, and Vt+Vt, their subject arguments are coreferential, and both verbs are volitional, as in (43). However, in the case of Vt+Vi, the object argument of V1 and the subject argument of V2 are coreferential, and V1 and V2 must be volitional and non-volitional, respectively, as in (44). Serialization of the Vt+Vi type has a causative meaning.

In the separated type, V2 denotes a result of V1, as in (45) and (46), or denotes potentiality, including possibility, ability, and permission, as in (47). In the separated type, there is no restriction on the combination of shared arguments, but V2 must be non-volitional.

(45) jə ʔáɴ mì̱ blɛ̀ jàʊ [EPK]
     1sg to.eat rice be.full pfv
     ‘I ate rice and got full.’

(46) jə dʊ́ thò θî mèiɴ [EPK]
     1sg to.hit pig to.die naturally
     ‘When I hit the pig, it died.’

(47) jə nâɴ kā θí̱ [EPK]
     1sg to.drive car be.capable
     ‘I can drive a car.’
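The transitivity and volitionality conditions on the two serialization types can be restated as a small set of checks. The Python sketch below is one hedged way of encoding them. The `Verb` record, the function names, and the returned descriptions are all hypothetical, and the lexical feature values assigned to the sample verbs are assumptions inferred from the glosses of (43) to (46).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verb:
    form: str          # citation form or gloss
    transitive: bool   # Vt (True) or Vi (False)
    volitional: bool


def check_concatenated(v1, v2):
    """Return the argument-sharing pattern for a concatenated V1 V2 pair."""
    if v1.transitive and not v2.transitive:
        # Vt+Vi: V1 must be volitional and V2 non-volitional.
        if v1.volitional and not v2.volitional:
            return "object of V1 = subject of V2 (causative reading)"
        raise ValueError("Vt+Vi needs a volitional V1 and a non-volitional V2")
    # Vi+Vi, Vi+Vt, Vt+Vt: both verbs must be volitional.
    if v1.volitional and v2.volitional:
        return "shared subject"
    raise ValueError("Vi+Vi, Vi+Vt and Vt+Vt need two volitional verbs")


def check_separated(v1, v2):
    """Return a description of V2's role in the separated type."""
    if v2.volitional:
        raise ValueError("V2 of the separated type must be non-volitional")
    return "V2 denotes a result or potentiality"


if __name__ == "__main__":
    buy = Verb("to.buy", transitive=True, volitional=True)
    eat = Verb("to.eat", transitive=True, volitional=True)
    hit = Verb("to.hit", transitive=True, volitional=True)
    die = Verb("to.die", transitive=False, volitional=False)
    print(check_concatenated(buy, eat))   # pattern of (43)
    print(check_concatenated(hit, die))   # pattern of (44)
    print(check_separated(hit, die))      # pattern of (46)
```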

The semantic difference between (44) and (46) is noteworthy. In (44), the death of the pig was intended by the agent of V1, whereas in (46) it was not intended but happened accidentally. Such a difference between the two types is also observed in Kayan (Manson 2010: 301‒302).

Solnit (1997: 56‒57) points out that Karenic languages show a preference for immediate concatenation of verbs. This is also true of Pwo Karen. In the serialization /xwè ʔáɴ kʊ́/ of (43), if the noun /kʊ́/ is put between the verbs, the result, */xwè kʊ́ ʔáɴ/ (to.buy – cake – to.eat), is ungrammatical. In Mainland Southeast Asian languages that are of the SVO type, this type of serialization is often fine, as in Thai /sɯ́ɯ khənǒm kin/ (to.buy – cake – to.eat) ‘buy cake and eat it’. Solnit states that Kayah Li’s preference for immediate concatenation is high even in comparison to what is known of the syntax of other Karenic languages, such as Sgaw and Pa-O. Kayah Li lacks the separated type of serialization. According to my research, Geba also lacks the separated type. In Geba, ability, which is expressed by using the separated type in EPK, as in (47), is expressed with concatenation (Kato 2008a: 213):

(48) jɔ ɗɔ̄ zā ɡadā ʔalē [Geba]
     1sg to.speak be.capable Burma language
     ‘I can speak Burmese.’

In the concatenated type of serialization in EPK, the order of verbs follows the chronological order of the events. In (43), for example, the verb /xwè/ (V1) ‘to buy’ precedes /ʔáɴ/ (V2) ‘to eat’ because the actions of buying and eating happen in this chronological order. However, this principle may be broken when one of the serialized verbs is a motion verb. Typical motion verbs are /lì̱/ ‘to go’ and /ɣɛ̂/ ‘to come’. EPK has a verb order rule that requires a motion verb to occur before a non-motion verb (Kato 2004: 222‒228). For example, if the verbs /ɣɛ̂/ ‘to come’ and /xwè/ ‘to buy’ are concatenated, they must be arranged in the order /ɣɛ̂ xwè/ whatever the chronological order may be, and */xwè ɣɛ̂/ is ungrammatical. Thus, /jə ɣɛ̂ xwè já/ (1sg – to.come – to.buy – fish) has two readings: (a) ‘I came to buy a fish’, and (b) ‘I bought a fish and came with it’. In the case of (b), the serialization does not follow the chronological order. SK has the same rule, and the sentence /jə hɛ́ pɣē ɲâ/ (1sg – to.come – to.buy – fish) has the same two readings. However, Geba does not seem to have such a rule about motion verbs, because the following example with a motion verb occurring after a non-motion verb is found in my Geba data: /zɛ swɛ̄ lē/ (1sg – to.run – to.go) ‘I went running’. To express the same meaning, EPK and SK have to use the sentences /jə lì̱ klí/ (1sg – to.go – to.run) and /jə lɛ̄ xè/ (1sg – to.go – to.run) respectively, in both of which a motion verb occurs before a non-motion verb.

In EPK, some verbs in concatenated serial verb constructions have changed their meanings into more abstract ones and are thus more flexible in terms of co-occurrence with other verbs. These can be called “versatile verbs”, following Matisoff’s (1969) terminology. Kato (2004) treats them as verb particles because each has been grammaticalized in certain respects. Aikhenvald and Dixon’s (2006) “asymmetrical serial verb constructions” are serial verb constructions that contain this type of verb. In the following example, the verb particle /jʊ̄/, originating from the verb /jʊ̄/ meaning ‘to look at’, expresses the meaning ‘to try to V’.

(49) jə ʔáɴ jʊ̄ phlòʊɴ chəʔáɴchəʔɔ̀ [EPK]
     1sg to.eat to.look Karen food
     ‘I tried to eat Karen food.’
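The interaction between chronological ordering and the motion-verb rule described earlier in this section can be pictured as a stable reordering: motion verbs move to the front, and everything else keeps its chronological order. The Python sketch below is a minimal illustration under that reading. The function name `order_concatenated` and the tuple-based input are assumptions, and the treatment of sequences with more than one motion verb (chronological order is simply kept within each group) is not something the description above specifies.

```python
def order_concatenated(verbs):
    """Order concatenated verbs: motion verbs first, otherwise chronological.

    `verbs` is a list of (form, is_motion) tuples given in the chronological
    order of the events. Python's sorted() is stable, so verbs that compare
    equal on the key keep their input (chronological) order.
    Hypothetical helper for illustration only.
    """
    return [form for form, is_motion in sorted(verbs, key=lambda v: not v[1])]


if __name__ == "__main__":
    # Reading (a): coming precedes buying. Reading (b): buying precedes coming.
    reading_a = order_concatenated([("ɣɛ̂", True), ("xwè", False)])
    reading_b = order_concatenated([("xwè", False), ("ɣɛ̂", True)])
    # Both readings surface with the same verb order, hence the ambiguity.
    print(reading_a == reading_b)
```

Because the two chronological orders collapse into one surface order, the sketch also shows why a sentence with ‘to come’ followed by ‘to buy’ is ambiguous between the purposive and the sequential reading.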

For more details on serial verb constructions in EPK, see Kato (1998, 2004: 207‒275). Kato (2017) and Kato (2019a) also discuss them to some extent. Descriptive studies on serial verb constructions in other Karenic languages include Kato (1992) and Weinhold (2011; in Brunelle’s [2011] Sgaw Karen papers) on Sgaw Karen, Solnit (1997: Chapter 4) and Solnit (2006) on Kayah, Manson (2010: 287‒302) on Kayan, Swanson (2011) on Bwe, and Boote Cooper (2017) on Pa-O. Kato (2019c) compares EPK and SK in terms of word order in serial verb constructions.




18.11 Clause linkage
In EPK, clauses can be embedded as arguments without any special marking. Example (50) is interpreted as a construction in which /ʔəwê ʔáɴ mì̱/ is a complement clause embedded as the subject of /phlɛ́/, because this part has some features of nounhood, including the fact that it can be clefted, as in /chə phlɛ́ nɔ́ mwɛ̄ ʔəwê ʔáɴ mì̱/ (thing – be.fast – top – cop – 3sg – to.eat – rice), literally ‘It is his eating that is fast’. If we wish to reflect the syntactic construction of the sentence, (50) can more properly be translated as ‘His eating rice is fast’. An example of a clause embedded as the object is (51).

(50) [ʔəwê ʔáɴ mì̱] phlɛ́ [EPK]
     3sg to.eat rice be.fast
     ‘He eats (rice) fast.’

(51) jə dá [ʔəwê ʔáɴ mì̱] [EPK]
     1sg to.see 3sg to.eat rice
     ‘I saw that he was eating (rice).’

Interestingly, in Kayah Li, the same meaning as in (50) is expressed by using verb concatenation. (52) is an example from Solnit (1997: 91). Kayah Li’s strong preference for immediate concatenation is observed in this respect as well:

(52) ʔa ʔe phrɛ̄ dī [Kayah Li]
     3 to.eat be.fast rice
     ‘He eats (rice) fast.’

Solnit says that syntactically there is no embedding in the verb concatenation of (52). Geba also uses verb concatenation to express situations similar to (52), such as /ja ʔā plá ɗɩ́/ (1sg – to.eat – be.fast – rice) ‘I eat rice fast’ (Kato 2008a: 174). Kayan also seems to have the same characteristic; see example (53) from Manson (2010: 151).

(53) phón phrái dyán ká [Kayan]
     to.cook be.fast rice imp
     ‘Quickly cook the rice.’

Adverbial clauses in EPK are usually placed before the main clause. They are introduced by subordinate clause particles, whose positions within the clause differ from one another. Examples are: /ʔəwê ʔè lì̱, jə lì̱ ʔé/ (3sg – if – to.go | 1sg – to.go – neg) ‘If he goes, I won’t go’; /ʔəwê lì̱ ʔəkhʊ́còɴ, jə lì̱/ (3sg – to.go – because | 1sg – to.go) ‘Because he went, I went’; /ʔəwê lì̱ lānâɴ, jə lì̱ ʔé/ (3sg – to.go – although | 1sg – to.go – neg) ‘Although he went, I didn’t go’; and /kəlā ʔəwê lə lì̱ dài bá, jə mə lì̱/ (while – 3sg – neg – to.go – still – neg | 1sg – irr – to.go) ‘Before he goes, I will go’.


18.12 Voice
As Manson (2017: 159) states, some Karenic languages have a passive construction, but its use is infrequent. SK is one of the Karenic languages that have a passive construction. It uses the verb /bâ/ ‘to hit’ to make a passive clause. (54) is an example cited from Kan Gyi (1915: 19). This example corresponds to the active clause /jə pà thú jā lɔ̄/ (1sg – father – to.kick – 1sg – emph) ‘My father kicked me’. This construction is a very formal expression and is rarely used in daily conversation.

(54) jə bâ tà thú jā lə́ jə pà lɔ̄ [SK]
     1sg to.hit nml to.kick 1sg loc 1sg father emph
     ‘I was kicked by my father.’

EPK does not have a passive construction. However, the agent-defocusing effect (see Myhill 1997), which is a significant functional role of the passive voice in many languages, is fulfilled by the noun meaning ‘thing’ occurring in the subject slot (Kato 2020). (55) is an example. SK also has the same construction, e.g. /tà tɔ̀ jā/ (thing – to.hit – 1sg) ‘I was hit (by someone).’

(55) chə dʊ́ jə̀ [EPK]
     thing to.hit 1sg
     ‘I was hit (by someone).’ (Literally: ‘A thing hit me.’)

As Kato (1999, 2009a) describes, EPK has several causative markers, which belong to the category of verb particles. Of these, /dàʊ/ (also pronounced /dà/ and /dài/) is a “genuine” causative marker because it has no corresponding homophonous verb in EPK. In a clause with /dàʊ/, the causee occurs as NP2 (see Figure 1) and the object argument of the verb occurs as NP3, as seen in (56):

(56) jə dàʊ ʔáɴ ʔəwê kʊ́ [EPK]
     1sg caus to.eat 3sg cake
     ‘I let him eat cake.’

/dàʊ/ is cognate with the Kayah Li verb /dʌ́/ ‘to let; to give’ (Solnit 1997: 65), which can be used as a causative element in serialized verbs. /dàʊ/ is also cognate with the Kayan causative prefix /də̀-/, which is derived from the verb /də̀/ ‘order’ (Manson 2010: 262‒263). It is also cognate with the SK causative marker /dɯ́ʔ/. Thus, its original form at the Proto-Karen stage might have already had a causative use. Jenny (2015: 170) assumes that Kayah Li /dʌ́/ and the SK causative marker /dɯ́ʔ/ originated from Proto-Tibeto-Burman *ter/*s-ter ‘give, causative’ (Matisoff 2003: 399, 615).

It is worth noting that Karenic languages have applicative constructions. Kato (2009a), in discussing valence-changing verb particles in EPK, pointed out that EPK has several applicative markers, including the benefactive applicative, comitative applicative, prioritive applicative, assistive applicative, and substitutive applicative. See (57) below, an example of the comitative applicative:

(57) jə [ʔáɴ ɣə̀ɴ] ʔəwê mì̱ [EPK]
     1sg to.eat appl 3sg rice
     ‘I ate rice with him.’

The bracketed part is a verb complex. The verb particle /ɣə̀ɴ/ is placed after the verb, and the applied noun appears as NP2. A similar meaning can be expressed by using the adposition /dē/ ‘with’, as in /jə ʔáɴ mì̱ dē ʔəwê/ (1sg – to.eat – rice – with – 3sg). However, in (57), in terms of semantics, the referent of the applied noun is more actively engaged in the event denoted by the sentence. Such a meaning is also observed in applicative constructions of other languages (Shibatani 2006: 244). (58) is an example of the comitative applicative in Kayah Li (Solnit 1997: 104), and (59) and (60) are examples of the instrumental applicative in Geba (Kato 2008a: 174) and Palaychi (the author’s field data). Palaychi is one of the languages of the Mopwa group (for more details on Mopwa, see Naw Veronica 2011).

(58) cwá kʌ̄ vɛ̄ [Kayah Li]
     to.go appl 1sg
     ‘Go with me.’

(59) ja ʔā ʔī zwɩ̄ [Geba]
     1sg to.eat appl spoon
     ‘I ate with a spoon.’

(60) zà ʔɔ̀ zɾ̀ʔ jō [Palaychi]
     1sg to.eat appl spoon
     ‘I ate with a spoon.’

Furthermore, EPK has a middle marker, /θà/, which originated from the noun meaning ‘heart’, as is discussed in detail by Kato (2019b). (61) and (62) are examples of the marker used for an anticausative construction and a reflexive construction, respectively. SK also has a cognate middle marker /θáʔ/, whose uses are quite similar to those of EPK /θà/.

(61) ʔəwê ʔáɴlɛ̀ θà jàʊ [EPK]
     3sg to.change(tr.) mid pfv
     ‘He has changed.’

(62) ʔəwê chè làɴ θà [EPK]
     3sg to.stab down mid
     ‘He stabbed himself.’


18.13 Pragmatics
Karenic languages use topicalization frequently. Topicalization in EPK is a left-dislocation of an element, and the element is often followed by a topic marker. EPK has several topic markers, of which /nɔ́/ is the most commonly used. In the sentence /ɕáphàɴ dʊ́ θàkhléiɴ/ (Shapan – to.hit – Thakhlein) ‘Shapan hit Thakhlein’ (=22), the subject can be topicalized, as in /ɕáphàɴ nɔ́ dʊ́ θàkhléiɴ/ (Shapan – top – to.hit – Thakhlein) ‘As for Shapan, he hit Thakhlein’, and the object can also be topicalized, as in /θàkhléiɴ nɔ́ ɕáphàɴ dʊ́/ (Thakhlein – top – Shapan – to.hit) ‘As for Thakhlein, Shapan hit him’. Topicalization can be applied to various syntactic elements, including the subject noun, object noun, adpositional phrase, and complement clause.

In EPK, ellipsis of arguments that are recoverable from the discourse context is quite frequent. For example, when one is asked /nə mə lì̱ ʁâ/ (2sg – irr – to.go – q) ‘Will you go?’, the answer can be given either with or without the subject: /(jə) mə lì̱/ (1sg – irr – to.go) ‘Yes, (I) will go’. However, the frequency of ellipsis can vary from language to language within the Karenic branch. My impression is that in SK, subject pronouns are retained more often than in EPK in daily conversation, but this requires further detailed investigation.

References
Aikhenvald, Alexandra Y. & R. M. W. Dixon (eds.). 2006. Serial verb constructions: A cross-linguistic typology. Oxford: Oxford University Press.
Benedict, Paul K. 1972. Sino-Tibetan: A conspectus. Cambridge: Cambridge University Press.
Boote Cooper, Alys. 2018. Secondary verbs in Pa-O: A preliminary study. In Pittayawat Pittayaporn et al. (eds.), Papers from the Chulalongkorn International Student Symposium on Southeast Asian linguistics 2017, 21‒31. Honolulu: University of Hawai’i Press.
Brunelle, Marc (ed.). 2011. Sgaw Karen papers, presented to Nimrod Andrew. Ottawa: University of Ottawa.
Dawkins, Erin & Audra Phillips. 2009a. A sociolinguistic survey of Pwo Karen in Northern Thailand. Chiang Mai: Linguistic Department, Payap University.
Dawkins, Erin & Audra Phillips. 2009b. An investigation of intelligibility between West-Central Thailand Pwo Karen and Northern Pwo Karen. Chiang Mai: Linguistic Department, Payap University.
Dryer, Matthew S. 2005. Negative morphemes. In Martin Haspelmath, Matthew S. Dryer, David Gil & Bernard Comrie (eds.), The world atlas of language structures, 454‒457. Oxford: Oxford University Press.
Eberhard, David M., Gary F. Simons & Charles D. Fennig (eds.). 2019. Ethnologue: Languages of the world. Twenty-second edition. Dallas, TX: SIL International.
Haudricourt, André-Georges. 1946. Restitution du karen commun. Bulletin de la Société de Linguistique de Paris 42(1). 103‒111. [Reprinted: Haudricourt 1972, 131‒140].
Haudricourt, André-Georges. 1953. A propos de la restitution du karen commun. Bulletin de la Société de Linguistique de Paris 49(1). 129‒132. [Reprinted: Haudricourt 1972, 141–145].




Haudricourt, André-Georges. 1972. Problèmes de phonologie diachronique. Paris: SELAF.
Haudricourt, André-Georges. 1975. Le système des tons du karen commun. Bulletin de la Société de Linguistique de Paris 70(1). 339‒343.
Henderson, Eugénie J. A. 1961. Tone and intonation in Western Bwe Karen. Burma Research Society Fiftieth Anniversary Publication 1. 59‒69.
Henderson, Eugénie J. A. 1979. Bwe Karen as a two-tone language? Pacific Linguistics, Series C, 45. 301‒326.
Henderson, Eugénie J. A. 1997. Bwe Karen dictionary: With texts and English-Karen word list, 2 vols. London: School of Oriental and African Studies, University of London.
Jenny, Mathias. 2015. The far west of Southeast Asia: “Give” and “get” in the languages of Myanmar. In N. J. Enfield & Bernard Comrie (eds.), Languages of Mainland Southeast Asia, 156‒208. Berlin & Boston: Mouton de Gruyter.
Jones, Robert B. 1961. Karen linguistic studies. Berkeley & Los Angeles: University of California Press.
Kan Gyi. 1915. Introduction to the study of Sgaw Karen. Rangoon: American Baptist Mission Press.
Kato, Atsuhiko [加藤昌彦]. 1992. Verb serialization in Sgaw Karen [スゴー·カレン語の動詞連続]. Journal of Asian and African Studies [アジア·アフリカ言語文化研究] 45. 177‒204. (In Japanese)
Kato, Atsuhiko. 1995. The phonological systems of three Pwo Karen dialects. Linguistics of the Tibeto-Burman Area 18(1). 63‒103.
Kato, Atsuhiko [加藤昌彦]. 1998. On head verbs of serial verb constructions in Pwo Karen [ポー·カレン語の動詞連続における主動詞について]. Journal of the Linguistic Society of Japan [言語研究] 113. 31‒61. (In Japanese)
Kato, Atsuhiko. 1999. Two types of causative construction in Pwo Karen. In Tadahiko L. A. Shintani (ed.), Linguistic & anthropological study on the Shan culture area, 55–93. Tokyo: Research Institute for Languages and Cultures of Asia and Africa.
Kato, Atsuhiko [加藤昌彦]. 2001a. Buddhist Pwo Karen script [仏教ポー·カレン文字]. In The Sanseido encyclopaedia of linguistics [言語学大辞典], vol. 7, 847‒851. Tokyo: Sanseido [三省堂]. (In Japanese)
Kato, Atsuhiko [加藤昌彦]. 2001b. Christian Pwo Karen script [キリスト教ポー·カレン文字]. In The Sanseido encyclopaedia of linguistics [言語学大辞典], vol. 7, 333‒337. Tokyo: Sanseido [三省堂]. (In Japanese)
Kato, Atsuhiko [加藤昌彦]. 2002. A contrastive basic vocabulary of Eastern and Western Pwo Karen in Myanmar [ビルマにおける東部および西部ポー·カレン語の対照基礎語彙]. Southeast Asian Studies, Tokyo University of Foreign Studies [東京外大 東南アジア学] 7. 212‒249. (In Japanese)
Kato, Atsuhiko [加藤昌彦]. 2004. A grammar of Pwo Karen [ポー·カレン語文法]. Tokyo: University of Tokyo PhD dissertation. (In Japanese)
Kato, Atsuhiko [加藤昌彦]. 2006. Difference in prevalence of writing systems in the same language: A case of Pwo Karen [同一言語内における文字普及状況の差異について: ポー·カレン語の事例]. In Asako Shiohara [塩原朝子] & Shigeaki Kodama [児玉茂昭] (eds.), Writing unwritten languages [表記の習慣のない言語の表記], 89‒110. Tokyo: Research Institute for Languages and Cultures of Asia and Africa. (In Japanese)
Kato, Atsuhiko [加藤昌彦]. 2008a. Basic materials in Geba [ゲーバー語基礎資料]. Asian and African Languages and Linguistics [アジア·アフリカの言語と言語学] 3. 169‒219. (In Japanese)
Kato, Atsuhiko [加藤昌彦]. 2008b. Is the category “adjective” necessary in Pwo Karen? Asian and African Languages and Linguistics [アジア·アフリカの言語と言語学] 3. 77‒95. (In Japanese)


 Atsuhiko Kato

Kato, Atsuhiko. 2009a. Valence-changing particles in Pwo Karen. Linguistics of the Tibeto-Burman Area 32(2). 71‒102. Kato, Atsuhiko. 2009b. A basic vocabulary of Htoklibang Pwo Karen with Hpa-an, Kyonbyaw, and Proto-Pwo Karen forms. Asian and African Languages and Linguistics 4.169‒218. Kato, Atsuhiko [加藤昌彦]. 2016. Ethnic groups and languages in Myanmar [ミャンマーの諸民族と諸言語]. ICD NEWS [法務省法務総合研究所国際協力部報] 69. 8‒26. (In Japanese) Kato, Atsuhiko. 2017. Pwo Karen. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 2nd edition, 942‒958. London & New York: Routledge. Kato, Atsuhiko. 2018. How did Haudricourt reconstruct Proto-Karen tones? Reports of the Keio Institute of Cultural and Linguistic Studies 49. 21‒44. Kato, Atsuhiko. 2019a. Pwo Karen. In Alice Vittrant & Justin Watkins (eds.), The Mainland Southeast Asia linguistic area, 131‒175. Berlin & Boston: Mouton de Gruyter. Kato, Atsuhiko. 2019b. The middle marker in Pwo Karen. Reports of the Keio Institute of Cultural and Linguistic Studies 50. 21‒62. Kato, Atsuhiko. 2019c. Karen and surrounding languages. In Hayashi Norihiko (ed.). Topics in Middle Mekong linguistics. 123‒150. Kobe: Kobe City University of Foreign Studies. Kato, Atsuhiko. 2020. Impersonal construction with the noun ‘thing’ in subject position in Pwo Karen. In Hayashi Norihiko (ed.) Topics in Middle Mekong linguistics 2. 159–183. Kobe: Kobe City University of Foreign Studies. LaPolla, Randy J. 1994. Parallel grammaticalizations in Tibeto-Burman languages: Evidence of Sapir’s “drift”. Linguistics of the Tibeto-Burman Area 17(1). 61‒80. Li, Fang-Kuei. 1977. A handbook of comparative Tai. Honolulu: The University Press of Hawaii. Luangthongkum, Theraphan. 2012. Proto-Karen (*k-rjaŋA) fauna. Paper presented at SEALS 22, Agay. Luangthongkum, Theraphan. 2019. A view on Proto-Karen phonology and lexicon. Journal of the Southeast Asian Linguistics Society 12(1). i‒lii. Luce, Gordon. H. 1959. 
Introduction to the comparative study of Karen languages. Journal of Burma Research Society 42(1). 1‒18. Manson, Ken. 2002. Karen language relationships: A lexical and phonological analysis. Chiang Mai: Department of Linguistics, Payap University. Manson, Ken. 2007. Pekon Kayan phonology. Chiang Mai: Department of Linguistics, Payap University. Manson, Ken. 2009. Prolegomena to reconstructing Proto-Karen. La Trobe University Working Papers in Linguistics 12. Manson, Ken. 2010. A grammar of Kayan, a Tibeto-Burman language. Bundoora: La Trobe University PhD dissertation. Manson, Ken. 2017a. The characteristics of the Karen branch of Tibeto-Burman. In Picus Sizhi Ding & Jamin Pelkey (eds.), Sociohistorical linguistics in Southeast Asia: New horizons for Tibeto-Burman studies in honor of David Bradley, 149‒168. Leiden & Boston: Brill. Manson, Ken. 2017b. From right to wrong: Negation in the Karen languages. Paper read at the meeting of Australian Linguistic Society. Manson, Ken. 2019. Proto-Kayan. ms. Matisoff, James A. 1969. Verb concatenation in Lahu: The syntax and semantics of “simple” juxtaposition. Acta Linguistica Hafniensia XII(1). 69‒120. Matisoff, James A. 1973. Tonogenesis in Southeast Asia. In Larry M. Hyman (ed.), Consonant types and tone (Southern California Occasional Papers in Linguistics 1), 71‒96. Los Angeles: University of California Press. Matisoff, James A. 2003. Handbook of Proto-Tibeto-Burman: System and philosophy of Sino-Tibetan reconstruction. Berkeley, Los Angeles & London: University of California Press.



Typological profile of Karenic languages 


Matisoff, James A. 2018. Rethinking the Proto-Tibeto-Burman *a- prefix: Glottal and nasal complications. Journal of Asian and African Studies 96. 29‒69. Myhill, John. 1997. Toward a functional typology of agent defocusing. Linguistics 35. 799‒844. Naw Hsa Eh Ywa. 2013. A grammar of Kayan Lahta. Chiang Mai: Payap University MA thesis. Naw Hsar Shee. 2008. A descriptive grammar of Geba Karen. Chiang Mai: Payap University MA thesis. Naw Veronica. 2011. The phonology of Dermuha and a phonological and lexical comparison between Dermuha, Sgaw Karen and Pwo Karen. Chiang Mai: Payap University MA thesis. Nishi, Yoshio. 1995. A brief survey of the controversy in verb pronominalization in Tibeto-Burman. In Yoshio Nishi, James A. Matisoff & Yasuhiko Nagano (eds.), New horizons in Tibeto-Burman morphosyntax (Senri Ethnological Studies 41), 1‒16. Osaka: National Museum of Ethnology. Phillips, Audra. 2000. West-Central Thailand Pwo Karen phonology. 33rd ICSTLL Papers, 99‒110. Bangkok: Ramkhamhaeng University. Phillips, Audra. 2017. Entities and the expression of grounding and referential coherence in Northern Pwo Karen narrative discourse. Edmonton: University of Alberta PhD dissertation. Purser, W. C. B. & Saya Tun Aung. 1922. A comparative dictionary of the Pwo-Karen dialect. Rangoon: American Baptist Mission Press. Shibatani, Masayoshi. 2006. On the conceptual framework for voice phenomena. Linguistics 44(2). 217‒269. Shintani, Tadahiko. 2003a. Notes à propos de l’étymologie du mot karen. Linguistics of the Tibeto-Burman Area 26(1). 15‒21. Shintani, Tadahiko. 2003b. Classification of Brakaloungic (Karenic) languages in relation to their tonal evolution. In Shigeki Kaji (ed.), Proceedings of the Symposium Cross-linguistic Studies of Tonal Phenomena: Historical Development, Phonetics of Tone, and Descriptive Studies, 37‒54. Tokyo: Research Institute for Languages and Cultures of Asia and Africa. Solnit, David B. 1997. Eastern Kayah Li: Grammar, texts, glossary. 
Honolulu: University of Hawai`i Press. Solnit, David B. 2001. Another look at Proto-Karen. Paper presented at the 34th International Conference of Sino-Tibetan Languages and Linguistics, Kunming. Solnit, David B. 2006. Verb serialization in Eastern Kayah Li. In Alexandra Y. Aikhenvald & R. M. W. Dixon (eds.), Serial verb constructions: A cross-linguistic typology, 144‒159. Oxford: Oxford University Press. Solnit, David B. 2013. Proto-Karen rhymes. Paper read at the 46th International Conference on Sino-Tibetan Languages and Linguistics, Dartmouth University. Solnit, David B. 2017. Eastern Kayah Li. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 2nd edition, 932‒941. London & New York: Routledge. Stern, Theodore. 1968. Three Pwo Karen scripts: A study of alphabet formations. Anthropological Linguistics 10(1). 1‒39. Swanson, Kirstie. 2011. Serial verb constructions in Bwe Karen. Chiang Mai: Payap University MA thesis. Van Valin, Robert D., Jr. & Randy J. LaPolla. 1997. Syntax: Structure, meaning and function. Cambridge: Cambridge University Press. Wade, J. 1849. Vocabulary of the Sgau Karen language. Tavoy: Karen Mission Press. Wai Lin Aung. 2013. A descriptive grammar of Kayah Monu. Chiang Mai: Payap University MA thesis. Weinhold, Meredith Lucey. 2011. Serial verb constructions in Sgaw Karen: A comparison of Karenic VP structures. In Marc Brunelle (ed.), Sgaw Karen papers, presented to Nimrod Andrew, 76‒84. Ottawa: University of Ottawa.

Kenneth Van Bik

19 Typological profile of Kuki-Chin languages

19.1 Introduction

Kuki-Chin (KC) speakers live across the borders of Bangladesh, India, and Myanmar (Burma).1 In Bangladesh, KC speakers, e.g., Bawm, Hyow, and Khumi, are found in the Chittagong Hill Tracts. In India, KC speakers live in several states such as Assam, Manipur, and Mizoram. In Myanmar, Chin State is the region with the largest KC population, but adjacent areas in Arakan State and in Magwe and Sagaing Regions also have sizeable numbers of KC speakers. Many Chins have long been migrating to the larger cities of Myanmar, such as Mandalay and Yangon, where there are now substantial Chin communities. The exact population is difficult to calculate, since KC speakers also live in several areas where they are minority groups and where census figures are not readily available. We can nevertheless estimate that there are 1.5–2 million KC speakers, since Mizoram and Chin State alone have about 1.5 million KC speakers (cf. census of India 2012 and census of Myanmar 2010).

The word Chin (Written Burmese: khyang:) is a pseudo-exonym, that is, a Burmese adaptation of the Asho Chin word khlong or khlaung ‘man’ or ‘person’, according to Van Bik (2009). The hypothesis is that when the Burmese met the Asho Chin, that is, the Asho khlong, they called them by the latter part of their name. At that time the Burmese language did not have the initial KHL- consonant cluster needed to pronounce the name Asho Khlong. So they used the closest initial combination of sounds, the KHY- sequence, calling them khyang, and applied this word to other nearby groups, and eventually to all of the Chin. According to Luce (1959), the writer(s) of the Pagan inscriptions attempted to write the term Chin with both the KHL- sequence and the KHY- sequence. This basic word for ‘man’ or ‘person’, still in use among some Chin groups (e.g., the Hyow of Bangladesh), is what gave rise to the modern word Chin.
Note that the word Chin is not accepted by all the people living in Chin State, especially the northern Zo group, who prefer the word Zo for their ethnic identity.2

The origin of the word Kuki is less clear than that of the word Chin. One of the earliest attestations, perhaps the first, is Rawlins in 1787. His term “Cúcì” is explained as referring to the Mountaineers of Tipra. However, it is clear that Rawlins was referring to Kuki-Chin peoples because he indicated that the people in question worshipped a God named “PÁTIYÁN”. God is still called Pasian in Northern Chin languages and Pathian in Central Chin languages. With a different spelling, KUKI, the name continued to be used by British administrators to designate the migrants from further to the East into Manipur State, the Naga Hills, and the North Cachar Hills of India. These administrators themselves admitted that the term KUKI is a name given to this people by outsiders (and hence possibly not the proper self-designation), but they continued to use the term KUKI to designate any Kuki-Chin group (Shakespear 1912).

Traditionally, farming is the most common livelihood among KC people. Even in so-called urban areas such as Aizawl in Mizoram State, many Mizo still cultivate their own gardens at home, growing fruits and vegetables for their own consumption.

Multilingualism is widespread among the KC people. A KC speaker is most likely trilingual, speaking his or her native language, a nearby local language, and the national language. For example, Zakaria (2017) reports that a Hyow speaker in Bangladesh would also speak Marma, the locally dominant language closely related to Arakanese and Burmese, and the national language, Bangla. Similarly, a Lamtuk speaker in Chin State would also be fluent in Tinam (a Sakta Lai dialect), a neighboring language; Hakha Lai, the locally dominant language; and Burmese, the national language of Myanmar. Consequently, loanwords and calques are very prevalent in these KC communities.

There are several criteria for determining whether a language is a member of the KC group within the Tibeto-Burman (TB) branch of the Sino-Tibetan (ST) family. One of them is a set of unique words in the lexicon. Two important words may serve as an illustration here. First, for the first person pronoun, Kuki-Chin languages almost always have the form kay, or something close to it like key. Most other ST languages tend to have other forms of this pronoun, such as Burmese ŋa and Tibetan ṅa. Second, in all Kuki-Chin languages interesting changes occurred in the basic word meaning ‘dog’, which is found in virtually all languages of the group. In Kuki-Chin the form of the word is uy. Comparative study shows that this is a very old Tibeto-Burman root with an onset k, reconstructed at the Proto-Tibeto-Burman level as *kwəy (Matisoff 2003: 20).3 Kuki-Chin languages idiosyncratically lost this k onset in the word for ‘dog’, so its absence indicates that a language belongs to the Kuki-Chin group.

Form alternation in verbs (aka stem alternation) is another hallmark of KC languages not found in other ST languages. In (1a)–(1b), (2a)–(2b), and (3a)–(3b), these verbal form alternations are illustrated in three KC languages: Hakha Lai, Khumi, and Mindat K’Cho (cf. Peterson and Van Bik 2015).4

1 My research in Kuki-Chin languages has been supported by an NSF Grant (BCS-1911385).
2 The people I worked with in Burma were very sensitive to being referred to as “Tedim Chin”; they insisted they are “Zomi”, speaking “Zopau” and using “Zolai” (written) language [MJ, editor].
3 There are hypotheses that the k- was (or was seen as) the common animal prefix. In Karenic, this was replaced by the prefix t-, so ‘dog’ now appears as tʰwi or similar in Karen varieties.
4 For the glosses, the distinction between affix and clitic is not always straightforward and not relevant to the present discussion. The use of dashes marks bound elements, without distinguishing between the different levels and types of boundness.

https://doi.org/10.1515/9783110558142-019



(1) Hakha Lai (Central Chin, Myanmar)
    a. hŋaaktshia-nùu-niʔ  bêel  a-laak
       child-female-erg    pot   3SG.S-take2
       ‘The girl took the pot’
    b. hŋaaktshia-nùu-niʔ  bêel  a-làa        lăw
       child-female-erg    pot   3SG.S-take1  neg
       ‘The girl didn’t take the pot’

(2) Khumi (Chittagong Hill, Bangladesh)
    a. anglo-lö  l’ång  la
       girl-top  pot    take
       ‘The girl took the pot’
    b. anglo-lö  l’ång  lo-lä
       girl-top  pot    take-neg
       ‘The girl didn’t take the pot’

(3) Mindat K’Cho (Southern Chin, Myanmar)
    a. om  ip-ci
       Om  sleep-nf
       ‘Om slept’
    b. om  nei  a-ih
       Om  emp  3SG.S-sleep
       ‘Om (not someone else) slept’

More details about form alternations are given in 19.3.3 below.

The subgrouping of KC languages is still preliminary to some extent; that is, the details change as we learn more about the languages. This paper takes Peterson and Van Bik (2015) as the basis for the classification of KC languages, as shown in Figure 1.

[Figure 1: tree diagram rooted in Proto-Kuki-Chin. Labels in the diagram: Maraic (Lautu, Sentang, Zophei, Mara); Khomic (Khumi, Mro-Khimi, Lemi); Peripheral; Northwestern; Northern (Tedim, Teizang, Vangte, Kuki-Thado); Southern (Matu, K’Cho, Daai, Asho); Central (Lai) (Falam, Lungbang, Hakha, Lamtuk, Mizo).]

Fig. 1: Subgrouping in KC languages.


In addition, as shown in (4), there is another schema (e.g., Peterson 2017 and Konnerth 2018) that uses mostly geographical labels, except for the Maraic group, which Peterson (2017) considers part of Central Chin.

(4) A. Northwestern (previously Old Kuki; e.g., Monsang, Lamkang, etc.)
    B. Northeastern (previously Northern; e.g., Tedim, Sizang, etc.)
    C. Central (e.g., Hakha Lai, Laizo/Falam, Mizo, Bawm, etc.)
    D. Maraic (most prominently, Mara, but also Senthang, Zophei, Zotung, etc.)
    E. Southwestern (e.g., Khumi, Lemi, Mro-Khimi, Rengmitca, etc.)
    F. Southeastern (e.g., Daai, K’Cho, Hyow, Asho, etc.)

19.2 Phonetics/phonology

Most KC languages have the syllable structure [(C1)(C2)V(:)(C3)(C4)]T. The minimal main syllable type is a single vowel plus a tone.5 Table 1 illustrates the possible syllable types with examples from Daai (Southern), Hyow (Southern-plain), Tedim (aka Tiddim, Northern), and Hakha Lai (Central). Note that a gap (“–”) indicates a lack of data for Tedim but a gap in the system for Hyow (Zakaria 2017: 29).

Tab. 1: Syllable types in Kuki-Chin languages.

| Syllable types | Daai (So-Hartmann 2009) | Hyow (Zakaria 2017) | Tedim (Henderson 1965) | Hakha Lai (Hyman and Van Bik 2004) |
|---|---|---|---|---|
| V | /ɔ/ ‘drink’ | /ɘ̂/ ‘good smell’ | /a/ ‘3rd P.S.’ | /a/ ‘3SG.S’ |
| VC | /ip/ ‘harvest’ | /ák/ ‘one’ | /ip/ ‘sleep’ | /it/ ‘sleep 1’ |
| V:C | /a:t/ ‘harvest’ | – | /a:k/ ‘fowl’ | /aat/ ‘cut 1’ |
| CV | /sə/ ‘basket’ | /bî/ ‘work’ | /ba/ ‘owe’ | /saa/ ‘meat’6 |
| CCV | /mɣa/ ‘healthy’ | /klɔ̂/ ‘fall’ | – | /tlaa/ ‘fall 1’ |
| CVC | /dəm/ ‘big’ | /bút/ ‘cook’ | /dam/ ‘be well’ | /dam/ ‘be well 1’ |
| CV:C | | – | /ba:k/ ‘twig’ | laak ‘take 2’ |
| CCVC | /kɣɛt/ ‘firmly’ | /blúŋ/ ‘sound of jumping’ | – | thlop ‘treat INV’ |
| CCV:C | /mɯ:p/ ‘beat (gong)’ | – | – | thlaak ‘drop 2’ |
| VCC | /ujʔ/ ‘burn’ | /ǽʔj/ ‘unactualized event’ | – | /orʔ/ ‘right side’ |
| CVCC | /sajʔ/ ‘clean’ | /bɘ́ʔl/ ‘mingle/mix-II’ | – | /panʔ/ ‘approach.INV’ |
| CCVCC | /hjawʔ/ ‘stop to work’ | /klóʔj/ | – | /tlerʔ/ ‘threaten.INV’ |

5 Note that some authors, e.g., Melnik (2002), analyze vocalic onsets as containing a glottal stop. But that is a matter of phonetic interpretation in KC. There is no contrast between /ʔa/ and /a/ in these languages.
6 Except for pronominal clitics and reduced syllables in compounds, only long vowels occur in open syllables in Hakha Lai.

19.2.1 Onsets

Stops

KC languages typically have a three-way contrast in bilabial and alveolar stops, namely plain, aspirated, and voiced, and a two- or three-way contrast in velar stops, depending on the subgroup. For example, all languages have p, ph, b/ɓ and t, th, d/ɗ. But Central Chin languages only have k and kh, whereas the northern and southern languages have all three velar stops, k, kh, and g. As far as we know, /g/ is a development from PKC *r (Van Bik 2009), and it also occurs in loanwords from Burmese and English. There are no alveo-palatal stops in KC languages.

Fricatives

Voicing contrasts for fricatives in KC are found in labiodentals and alveolars, as exemplified by Hakha Lai and Mizo (Chhangte 1986). Languages that have palato-alveolar or velar fricative onsets show no voicing contrast at those places of articulation. The number of fricatives in onset position ranges from three to five. Languages from different sub-branches illustrate the distribution of fricative onsets in KC, as shown in Table 2.

Tab. 2: Example of KC fricative onsets’ distributions.

| Language | Labiodental | Alveolar | Palato-Alveolar | Velar | Glottal |
|---|---|---|---|---|---|
| Daai (So-Hartmann 2009) | | s | sh | x | h |
| Hakha Lai (Van Bik 2009) | f v | s z | | | h |
| Tedim (Henderson 1965) | | s z | | x | h |
| Mizo (Chhangte 1986) | f v | s z | | | h |

Nasals and approximants

Like many Tibeto-Burman languages, many KC languages contrast voiced and voiceless nasals at all three places of articulation, as well as voiced and voiceless lateral and rhotic onsets. There are some languages, however, that no longer maintain these contrasts. Table 3 lists such distributions.

Tab. 3: Example of sonorant distribution in KC languages.

| Languages | Nasals | Laterals | Rhotics | Labial-Velar Glide | Palatal Glide |
|---|---|---|---|---|---|
| Daai | m, m̥; n, n̥; ŋ, ŋ̥ | l, l̥ | – | w | j |
| Hakha Lai | m, m̥; n, n̥; ŋ, ŋ̥ | l, l̥ | r, r̥ | – | – |
| Hyow | m, m̥; n, n̥; ŋ, ŋ̥ | l, l̥ | – | – | j, j̥ |
| Tedim | m, n, ŋ | l | – | w | – |

Onset clusters

KC languages show a range of onset-cluster inventories, from no clusters at all, e.g., Tedim (aka Tiddim, Henderson 1964), through a modest two clusters, /tl, thl/, in Hakha Lai, to a very complex system in Hyow with /pr kr phr khr/, /br gr/, /kl/, and /ʔm ʔn ʔŋ ʔl ʔj ʔw/ (Zakaria 2017).




Rhymes

Three types of vowel systems are found in KC languages: (i) monophthongs, diphthongs, and vowel length contrasts in closed syllables, as represented by Mizo with five cardinal vowels with a length contrast in closed syllables plus two diphthongs (Chhangte 1993: 43–44); (ii) monophthongs and a rounding contrast in high-front and high-back positions, as represented by Hyow with a total of nine monophthongs (Zakaria 2017: 59); (iii) monophthongs, diphthongs, and a rounding contrast in high-front and high-back positions, as represented by Mara with ten monophthongs plus three diphthongs (Arden 2010: 73).

All KC languages have open syllables with at least one nasal coda. Mara (Arden 2010) represents such a language. Syllables in other KC languages can be closed by a stop /p, t, k, ʔ/, a nasal /m, n, ŋ/, a liquid /l, r/, or a glide /w, j/. The only possible complex coda is a sonorant followed (or preceded) by a glottal stop, known as a “glottalized sonorant” (Roengpitya 1997).

Tones

Lexical tone inventories in KC languages range from two tones (Thlantlang, Hyman 2007) to five tones (Khumi, Peterson 2003). Tones are not very useful for subgrouping purposes in this branch. For example, the three dialects of Lai, namely Thlantlang, Hakha, and Falam Chin, have two, three, and four tones respectively. Typical tonal phenomena such as tone sandhi (Chhangte 1986; Hyman and Van Bik 2004), contour simplification, and downstep (Hyman 2010) have been identified in KC. As listed in Table 4, having three tones seems to be the norm in KC.

Tab. 4: Distribution of tones in KC languages.

| 2 Tones | 3 Tones | 4 Tones | 5 Tones |
|---|---|---|---|
| Bawm, Daai, Thlantlang Lai | Hakha Lai, Hyow, Mara, Senthang, Tedim, Thado Kuki, Zophei | Falam Chin, Mizo | Khumi |
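The claim that three tones is the norm can be checked by simply tallying the languages in Table 4. The following is a minimal sketch in Python; the language names and tone counts follow Table 4 and the running text, while the dictionary and function names are illustrative choices, not part of the source.

```python
from collections import Counter

# Tone counts as listed in Table 4 (language -> number of lexical tones).
TONE_COUNTS = {
    "Bawm": 2, "Daai": 2, "Thlantlang Lai": 2,
    "Hakha Lai": 3, "Hyow": 3, "Mara": 3, "Senthang": 3,
    "Tedim": 3, "Thado Kuki": 3, "Zophei": 3,
    "Falam Chin": 4, "Mizo": 4,
    "Khumi": 5,
}


def tally_by_tone_count(tone_counts):
    """Count how many languages have each number of tones."""
    return Counter(tone_counts.values())


def most_common_inventory(tone_counts):
    """Return the tone-inventory size shared by the most languages."""
    return tally_by_tone_count(tone_counts).most_common(1)[0][0]


if __name__ == "__main__":
    print(dict(tally_by_tone_count(TONE_COUNTS)))
    print(most_common_inventory(TONE_COUNTS))
```

The tally covers only the thirteen varieties listed in the table, so it supports the statement for this sample rather than for the branch as a whole.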


The syllable structure of KC languages can be summarized as follows:
a. KC syllables do not require an onset, that is, it could be a phonemic or phonetic variant of zero onset, and can be open or closed.
b. Coda consonants can be voiceless stops, sonorants, or glottalized sonorants.
c. Underlying vowel length is contrastive only in syllables closed by a sonorant or voiceless stop.
d. Vowels are short before a glottal stop or glottalized sonorant coda.
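The syllable template given at the start of this section, [(C1)(C2)V(:)(C3)(C4)]T, can be read as a pattern over CV skeletons. The sketch below is one possible encoding in Python, under the assumption that a syllable is written as a string over the symbols C, V, and ":" (for length) and that tone is left out; the function name and the skeleton notation are illustrative and do not come from the source.

```python
import re

# Template from the text: [(C1)(C2)V(:)(C3)(C4)]T, with tone (T) omitted.
# Up to two onset consonants, one vowel, optional length, up to two codas.
SYLLABLE_TEMPLATE = re.compile(r"C{0,2}V:?C{0,2}")

# The syllable types listed in Table 1.
TABLE_1_TYPES = [
    "V", "VC", "V:C", "CV", "CCV", "CVC",
    "CV:C", "CCVC", "CCV:C", "VCC", "CVCC", "CCVCC",
]


def fits_template(skeleton: str) -> bool:
    """Return True if a CV skeleton such as 'CCV:C' fits the template."""
    return SYLLABLE_TEMPLATE.fullmatch(skeleton) is not None


if __name__ == "__main__":
    for skeleton in TABLE_1_TYPES:
        print(skeleton, fits_template(skeleton))
```

The pattern only captures the maximal template. It does not encode the further restrictions in (c) and (d) on where vowel length may occur, nor language-specific gaps such as the absence of onset clusters in Tedim.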

19.3 Word classes

KC languages distinguish noun (N) and verb (V) quite clearly, but there is little (if any) syntactic evidence for differentiating verbs from adjectives. As demonstrated in 19.3.3 below, adjectives can be described as quality verbs. Other common word classes include numerals and quantifiers, demonstratives, adverbs, and others under a broad umbrella of particles.

19.3.1 Noun

Several criteria are available for a word to be considered a noun in KC languages. These include the possibility of occurring with a postnominal demonstrative or quantifier, and possession, as illustrated in the following paragraphs.

Demonstratives usually follow a noun (or NP) in KC languages. Consequently, they serve as a good criterion for testing the nominality of a word. The following examples illustrate the relevant patterns: Daai (So-Hartmann 2009: 122): hnashen sun ‘the child’ (hnashen ‘child’; sun ‘DEM’); Falam Chin (King 2010: 64): naam cu ‘that knife’ (naam ‘knife’; cu ‘DEM/TOP’); Mizo (Chhangte 1986: 74): hee2 vok hi1 ‘this pig here’ (hee2 ‘DPRO’; vok ‘pig’; hi1 ‘DEM’); Senthang (Par 2016: 35): ʔɪ́n khɪ́ ‘that house’ (ʔɪ́n ‘house’; khɪ́ ‘DEM’). In some KC languages, e.g., Hakha Lai, the noun is flanked by the demonstrative, as in hi naam hi ‘this knife here’ (hi ‘DEM’; naam ‘knife’). The ungrammatical Hakha Lai example *that hi ‘kill this’ (that ‘kill’; hi ‘DEM’) also shows that this criterion sets N apart from V.

Quantifiers and numerals also play a role in verifying whether a particular word is a noun; that is, nouns are quantifiable in KC languages (see examples in 19.3.2), and numerals follow the noun. For example, naam pakhat ‘one knife’ (Hakha Lai: naam ‘knife’; pakhat ‘one’). Parallel to the negative evidence provided by the ungrammaticality of a verb followed by a demonstrative (*V DEM) in Hakha Lai, *that tàmpìi ‘kill many’ (that ‘kill’; tàmpìi ‘many’) demonstrates that a verb cannot be followed by a quantifier.




Another test for nounhood (and verbhood) in KC is whether a word can be preceded and/or possessed by a pronominal clitic; that is, the possessor (pronominal clitic) precedes the possessed noun, as in the following examples: Daai (So-Hartmann 2009: 81): kah-ksi:m ‘my knife’; Hyow (Zakaria 2017: 109): ɘ́-hɘ́n ‘her belly’; Mizo (Chhangte 1986: 86): ka ui2 ‘my dog’; Paite (Singh 2006: 81): kə-in ‘my house’; Senthang (Par 2016: 33): ka po ‘my father’. Note that the forms of these possessive clitics are identical to the forms of the subject agreement markers (see 19.3.3). Therefore, one needs a complete sentence to verify whether the clitic is the possessor or the subject agreement marker. The Hakha Lai examples in (5a)–(5b) illustrate the ambiguity of the personal marker ka-.7

(5) a. Possessor ka-
       ka-ûy         na-hmŭu     mâa
       1SG.POSS-dog  2SG.S-see1  Q
       ‘Did you see my dog?’
    b. Subject Agreement ka-
       ka-hmŭu        lâw
       1SG.SUBJ-see1  NEG
       ‘I did not see it.’

19.3.2 Numerals and quantifiers

KC languages have zero to three prefixes for their cardinal numbers, as shown in Table 5 with examples from Mindat K’Cho (Jordan 1969: 23), Hakha Lai, and Mara (Savidge 1908).

Tab. 5: KC numeral prefixes.

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| M. K’Cho | tumat | hngih | thum | ph’li | hma | chuk | chih | cheit | ko | gha |
| H. Lai | pakhat | pahniʔ | pathûm | palîi | paŋâa | paruk | pasa-riʔ | pariat | pakûa | pahrâa |
| Mara | pá-khā | pānō | pā-thò | sápālì | sápāngàw | pāchārū | pāsārī | pāchārí | pāchākí | pāhràw |
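The Hakha Lai row of Table 5 shows the same initial pa- on every numeral from one to ten. The short Python sketch below makes that segmentation explicit. It is only an illustration: the segmentation is read off the forms in the table, and the function name and the choice to treat pa- as a single prefix string are assumptions made here, not an analysis taken from the source.

```python
# Hakha Lai cardinal numerals 1-10 as given in Table 5.
HAKHA_LAI_NUMERALS = [
    "pakhat", "pahniʔ", "pathûm", "palîi", "paŋâa",
    "paruk", "pasa-riʔ", "pariat", "pakûa", "pahrâa",
]


def split_prefix(form, prefix="pa"):
    """Split a numeral into (prefix, remainder); prefix is '' if absent."""
    if form.startswith(prefix):
        return prefix, form[len(prefix):]
    return "", form


if __name__ == "__main__":
    for numeral in HAKHA_LAI_NUMERALS:
        print(split_prefix(numeral))
```

The same function leaves the Mindat K’Cho forms such as thum unsegmented, which matches the "zero prefix" end of the range described above; the Mara forms would need more than one prefix shape and are not handled by this sketch.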

7 Although this criterion does not a priori postulate a N-V distinction, the diagnostic is helpful for native speakers to check if a word is at least a N or a V, excluding the possibility of other lexical categories.


Quantifiers follow the noun in KC languages. Examples include: Daai (So-Hartmann 2009: 78): kpa:mi akhak ‘some men’ (kpa:mi ‘man’; akhak ‘some’); Hyow (Zakaria 2017): táá ɔ́bɔ́ng ‘a lot of money’ (táá ‘money’; ɔ́bɔ́ng ‘many’); Mizo (Chhangte 1986: 96): aar1 tleem1 tee2 ‘few hens’ (aar1 ‘hen’; tleem1 ‘few’; tee2 ‘little’); Senthang (Par 2016: 36): ʔɪ́n t̪ám.pɘ́ ‘many houses’ (ʔɪ́n ‘house’; t̪ám.pɘ́ ‘many’); Hakha Lai: în tàmpìi ‘many houses’. In Hakha Lai, one cannot use the nominal quantifier tàmpìi ‘many’ to modify the whole sentence, e.g., *a ṭûan tàmpìi ‘he works a lot’.

19.3.3 Verbs and adjectives

KC languages do not distinguish between verbs and adjectives morphologically or syntactically. For instance, the counterparts of the English adjective ‘good’ and the action verb ‘fall’ show identical grammatical behavior, as exemplified by Hakha Lai in (6a)–(6b). Thus, morphemes which correspond to English adjectives are generally considered stative verbs.

(6) a. na-în           a-ṭhǎa
       2SG.POSS-house  3SG.S-good1
       ‘your house is good’
    b. na-thîl         a-tlǎa
       2SG.POSS-thing  3SG.S-fall1
       ‘your clothes fall’ (“your things fall”)

There are some reliable criteria for testing verb-hood in KC languages. The most reliable is negation. Some languages (e.g., Hakha Lai) allow a noun to be followed (“negated”) by the negative morpheme lăw, as in în-lăw ‘homeless’ (în ‘house’; lăw ‘NEG’). But the negative marker in such cases can be analyzed as part of a compound (N-V1), since its Form-2 can also form a similar compound, în-lawʔ … ‘losing house …’ (N-V2), whereas the sentential negation lăw never shows such form alternation. Also, verbs in KC are those “words” which can take pronominal clitics (aka “agreement forms”) as their subject and/or object, as illustrated by the Mizo examples in (7a)–(7b).

(7) Mizo (Chhangte 1986: 130)
    a. nau1seen1  a-muu1
       infant     3SG.SBJ-sleep1
       ‘An infant is sleeping’
    b. nau1seen1  ka-mut
       infant     1SG.SBJ-sleep2
       ‘I put an infant to sleep’




In addition, KC languages are known for having verbal alternations (known as Form-1/Form-2 or Stem-1/Stem-2). Functionally, the citation form serves as Form-1, while Form-2 is usually more nominal (“participle”-like, subordinate) and can occur with the nominalizer suffix naa/naak (see 19.4.2). The nominal nature of Form-2 is illustrated with the Hakha Lai example in (8c), and (8d) provides a diagnostic for a Form-2 verb as opposed to a verbal noun in Hakha Lai. That is, the presence of an applicative marker, the relinquitive applicative in (8d), shows that the Form-2 verb tliik ‘run’ is indeed a verb.

(8) a. ka-tlìi     lǎw
       1SG.S-run1  NEG
       ‘I did not run’
    b. ka-tliik    lǎw  tsâa-aʔ  ân-ka-tlayʔ
       1SG.S-run2  NEG  for-LOC  3PL.S-1SG.OBJ-catch2
       ‘They caught me because I did not run’
    c. ka-tliik       na-hmŭu     mâa
       1SG.POSS-run2  2SG.S-see1  Q
       ‘Did you see my running?’
    d. ka-hôoy-le-niʔ          ân-ka-tliik-taak
       1SG.POSS-friend-PL-ERG  3PL.S-1SG.O-run2-RELINQ.APPL
       ‘My friends abandoned me’ (Lit. “my friends ran away from me leaving me behind”) (taak ‘relinquitive applicative’, Peterson 1998: 117)

It appears that, on the one hand, the syntactic function of this alternation is more robust when the language has a variety of coda consonants (e.g., Daai, Hyow, Hakha Lai, Mizo, Paite, etc.); that is, the occurrence of Form-2 is predictable. In Hakha Lai, for example, subordinate clauses require Form-2 morphology (8b). On the other hand, KC languages which have fewer coda consonants (e.g., Mara, Khumi, etc.) show little to no alternation of the different verbal forms in different syntactic contexts.

Many KC languages have a middle voice; that is, the reflexive marker is required for certain verbs, such as verbs of grooming (‘comb one’s hair’) and reciprocal events (‘to be neighbors to one another’). These include two southern Chin languages, Daai (So-Hartmann 2009) and Hyow (Zakaria 2017), and two central Chin languages, Hakha Lai (Smith 1999) and Mizo (Chhangte 1986, 1993). Example (9a) shows a body action middle, as opposed to the regular transitive verb in (9b), and (9c)–(9g) illustrate other middle voice constructions in Hakha Lai (cf. Smith 1999 and Tluangneh 2002).

(9) a. Body Action Middle
       ka-kut         kàa-ṭòol
       1SG.POSS-hand  1SG.MID-wash1
       ‘I washed my hand’
    b. khêeŋ           ka-ṭôol
       dish (“plate”)  1SG.S-wash1
       ‘I washed the dish’
    c. Cognition Middle
       kàa-phuhrûŋ
       1SG.MID-be.paranoid1
       ‘I feel paranoid’
    d. Permissive Middle
       si-bòoy          kàa-zawʔ-têr
       medicine-master  1SG.MID-look.at.INV-CAUS
       ‘I let the doctor (“master of medicine”) examine me’
    e. Intensive Middle
       kân-ròol       nân-ii-ʔǎy    nân-ii-ʔǎy    ǐi    kân-lèeŋ       riʔ    ŋat   ûu
       1PL.POSS-food  2PL-MID-eat2  2PL-MID-eat2  CONJ  1PL.O-visit1  still  SUGG  HORT
       ‘you guys ate and ate our food, so you have to visit us for a while’ (teasingly said to urge guests to stay longer; SUGG = Suggestive)
    f. Impersonal Middle
       tu-kǔm     kân-khùa          tsǔu  tâm-tuk    àa-thìi
       this-year  1PL.POSS-village  DEM   many-very  3SG-MID-die1
       ‘this year, so many people died in our village’
    g. Reciprocal Middle
       Ze-phàay  lěe   Khûa-bùŋ   ân-ii-khùa-pǎa
       Ze-phai   CONJ  Khua-bung  3PL-MID-village-be.neighbouring1
       ‘Zephai and Khuabung are neighboring villages to one another’

Many KC languages have N-V collocations; that is, some verbs occur only with a particular noun. For example, in Hakha Lai, the verb dôom ‘be cloudy’ can be used only with the word khûa ‘cosmos’ for the meaning ‘be cloudy’. Psycho-collocation, a term coined by Matisoff (1986), is a type of this N-V collocation. Examples (10a)–(10c) illustrate psycho-collocation in Hakha Lai (Van Bik 1998).

(10) a. a-thǐn          a-tòoy
        3SG.POSS-liver  3SG.S-be.short1
        ‘he is impatient’ (“his liver is short”)
     b. ka-lûŋ          a-dǒŋ
        1SG.POSS-heart  3SG.S-end1
        ‘I am disappointed’ (“my heart ended with respect to this matter”)
     c. na-ôr            a-khùu        tuk
        2SG.POSS-throat  3SG.S-smoke1  really
        ‘you are really gluttonous’ (“your throat is really smoking, i.e. cooking all the time”)



Typological profile of Kuki-Chin languages 


19.3.4 Pre-verbal directionals

Preverbal directionals describe how the participants behave in relation to position, distance, and movement, that is, the “where” of the participants and the “how” of the actions involved; they also indicate the position, distance, and movement of the interlocutors. This word class is closed and may be considered a sub-class of verbs in some languages, since some of these particles still function as independent verbs, e.g. in Daai (Hartmann-So 1989) and Hakha Lai (Van Bik and Tluangneh 2017). The number of directionals also varies from language to language: one particle in Hyow (Zakaria 2017: 344), four pairs in Daai (Hartmann-So 1989: 82), and five pairs in both Hakha Lai (Van Bik and Tluangneh 2017) and Zahao (Laizo) Chin (Osburne 1975). The exact functions of these deictic pairs may differ from one language to another. For example, in Hakha Lai (Van Bik and Tluangneh 2017), the verbal directionals rak and ra⁸ are used when the ground levels of the locations of the agent and the patient are equal, or at least when the difference is so minimal that it can be disregarded. As illustrated in Figure 2, the use of rak (11a) indicates that Ang Mang (A) performs the action on the pig (P) from where he is standing, whereas the verbal directional ra (11b) indicates that Ang Mang moves towards the pig and stones it.⁹

Fig. 2: The function of ra/rak directionals in Hakha Lai. (Diagram labels: ra, rak, A, S, P, L.)

(11) a. âŋ̣măŋ̣-niʔ    vok  a-rak-tsheʔ
        Angmang-ERG  pig  3SG.S-DIR.SAMELEV-throw2
        ‘Angmang stones a pig from afar’
        (A = Agent/Angmang; S = Speaker; L = Listener; P = Patient/Pig)
     b. âŋ̣măŋ̣-niʔ    vok  a-ra-tsheʔ
        Angmang-ERG  pig  3SG.S-DIR.SAMELEV-throw2
        ‘Angmang moves towards a pig (and us) and stones it’

8 These directionals rak and ra are not related in modern Hakha Lai. But this could reflect an example of “old” stem alternation since some of the directional pairs, e.g. vuŋ̣-1, vun-2 ‘go down’, are independent verbs as well. Also, there is a verb raa-1, rat-2 which means ‘come’ in modern Hakha Lai.
9 For scenarios such as S and L being close to A rather than to P, see Van Bik and Tluangneh (2017), where detailed figures and explanations are provided.


19.3.5 Pronouns

KC languages have full personal pronouns and often corresponding pronominal agreement markers (discussed in 19.4). In addition, there are indefinite and interrogative pronouns (detailed in 19.5). KC languages tend to distinguish inclusive from exclusive forms in the personal pronouns: the inclusive form includes the person being spoken to, and the exclusive form excludes that person. Pronoun systems fall into two groups: those which distinguish exclusive from inclusive in the dual and plural, exemplified by Hyow in Table 6, and those which lack this distinction, exemplified by Hakha Lai in Table 7. Note that the prefix section of 19.4.2 deals with the corresponding pronominal clitics of Hakha Lai.

Tab. 6: Personal pronouns in Hyow (Zakaria 2017: 125).

Person   SG     DL                               PL
First    kêy    hníhníʔ (INC), kéyhníʔ (EXCL)    nàngkéʔy (INC), kéyníʔ (EXCL)
Second   náng   nànghníʔ, náháʔy                 nàngníʔ, náʔy
Third    ání    áhníʔ                            áníʔ

Tab. 7: Hakha Lai pronouns.

              1st person          2nd person          3rd person
              SG       PL         SG       PL         SG      PL
Unmarked      kây-maʔ  kân-maʔ    nâŋ-maʔ  nân-maʔ    a-maʔ   ân-maʔ
Contrastive   kǎy      kân-niʔ    nǎŋ      nân-niʔ    a-niʔ   ân-niʔ

The suffix maʔ in Table 7 is a deictic whose meaning is hard to define in Hakha Lai, but it seems to mean ‘self’, as illustrated in (12a)–(12b): in the construction ‘X le X V’, which expresses ‘VERB oneself’, only the maʔ form in (12a) is grammatical.

(12) a. kây-maʔ        lêe  kây-maʔ        kàa-tŭu
        1SG.Pron-self  and  1SG.Pron-self  1SG.REFL-hit1
        ‘I hit myself’
     b. *kǎy      lêe  kǎy       kàa-tŭu
        1SG.Pron  and  1SG.Pron  1SG.REFL-hit1
        ‘I hit myself’




In addition, as shown in (13a)–(13b), the deictic maʔ seems to carry the meaning ‘self’ in the fixed phrase maʔ-le-maʔ, which means ‘among themselves’ in (13a) and ‘among ourselves’ in (13b).

(13) a. maʔ-le-maʔ     ân-i-sìi
        self-and-self  3PL.S-REFL-quarrel1
        ‘They quarrel among themselves’
     b. maʔ-le-maʔ     kân-i-sìi
        self-and-self  1PL.S-REFL-quarrel1
        ‘We quarrel among ourselves’

19.3.6 Demonstratives

Demonstratives form a word class in KC languages, ranging from two in Hyow (14a)–(14b), to four in Hakha Lai (Barnes 1998), to six in Mizo (15a)–(15f), which seems to have the most complex system, including distinctions of relative elevation.

(14) Demonstratives in Hyow (Zakaria 2017: 127)
     a. ní   ‘proximal’
     b. tsú  ‘distal’

(15) Demonstratives in Mizo (Chhangte 1986)
     a. hei3 hi1    ‘this (near speaker)’
     b. khaa3 kha1  ‘that (near addressee)’
     c. khii3 khi1  ‘that (up there)’
     d. khuu3 khu1  ‘that (down there)’
     e. soo3 so1    ‘that (far)’
     f. cuu3 cu1    ‘that (out of sight)’

Some demonstrative modifiers in KC can precede the noun, as exemplified in (16) by the proximal and distal demonstratives of Hakha Lai (Barnes 1998).

(16) a. hìi-ìn-aʔ
        DEM-house-LOC
        ‘in this house’
     b. khàa-ìn-aʔ
        DEM-house-LOC
        ‘in that house’


19.3.7 Adverbs

“Adverbs” do not form a homogeneous class in KC. In addition, the criteria for labelling them vary from one language to another, especially when it comes to temporal adverbs. For example, the words in Table 8 are considered temporal nouns in Hakha Lai, since they can be possessed by pronominal clitics (e.g. na thaizing ‘your tomorrow’) and flanked by deictic determiners (cu thaizing cu ‘that next day’), whereas they are considered “time adverbials” in Daai (So-Hartmann 2009: 116).

Tab. 8: “Time adverbials” in Daai and temporal nouns in Hakha Lai.

Time

Daai

Hakha Lai

‘today’ ‘tonight’ ‘tomorrow’ ‘lastnight’ ‘dawn’ ‘this year’ ‘next year’

tuh-ngooi: shee:p mthan kho-ngooi: tuh-mthan kho-thaai tuh-a kum sheeng kum

ni-hîn (‘day-this’) tu-zăan (‘now-night’) thàay-zìŋ (unanalyzable) ni-zăan (‘day-night’) fîŋ-rày (unanalyzable) tu-kŭm (‘now-year’) hmâay-kŭm (‘front-year’)

There is a common pattern by which “manner adverbials” are derived from verbs by adverbializer(s) in KC languages. Table 9 illustrates two quite similar patterns: a […] =a in Daai and (a) […] (te) in in Hakha Lai. These adverbial phrases usually have freedom of placement in the clause, depending on their scope and other pragmatic/semantic factors.

Tab. 9: “Manner adverbials” in Daai and Hakha Lai.

Manner       Daai (So-Hartmann 2009: 119)      Hakha Lai
             a […] =a                          (a) […] (te) in
a. quickly   akjaa:ng=a (jaa:ng ‘quick’)       ràŋ tèe ìn (ràŋ ‘quick’; te ‘DIM’)
b. well      akdo=a (do ‘be good’)             dâm tèe ìn (dâm ‘be healthy’)
c. equally   angte=a (ngteh ‘be equal’)        (aa) tluk tèe ìn (tluk ‘be equal’)




19.3.8 Ideophones (expressives)

Ideophones, also known as expressives (Diffloth 1979), are well attested in KC languages: Daai (So-Hartmann 2009: 121), Hyow (Zakaria 2017: 170), Hakha Lai (Patent 1998), Mizo (Chhangte 1986), and Paite (“reduplicative adverbs”, Singh 2006: 189). This word class is characterized by (partial) reduplication and by the semantic properties of “vivid imagery” and emotional content. In Hakha Lai, for example, ideophones follow the morphological templates in (17a)–(17b), as illustrated in (18a)–(18d). Note that the vowels V1 and V2 in the templates agree in vowel length.

(17) a. C1V1C2 – C1V2C2
     b. C1V1C2 – MV2C2

(18) a. mûuy  aʔtsûn  na-ka-tăy           tsiktsek
        form  as for  2SG.S-1SG.OBJ-win2  thoroughly (IDEOPH)
        ‘when it comes to beauty, you beat me so bad’
     b. thlâan-aʔ  a-lîam         rǐaŋmâaŋ
        grave-LOC  3SG.S-pass.on1  IDEOPH
        ‘he passed on to the grave (the speaker felt very sad about it)’
     c. (mi-thâaw-pàa-pìi)      a-ṭhûu      detdut  kâw
        (person-fat1-male-AUG)  3SG.S-sit1  IDEOPH  AFF
        ‘a big fat person sat down heavily’
     d. (mi-dèr-pàa-tèe)         a-ṭhûu      ditdet  kâw
        (person-thin1-male-DIM)  3SG.S-sit1  IDEOPH  AFF
        ‘a thin person sat down lightly’
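The templates in (17) can be read as a simple well-formedness check on the two halves of an ideophone. The Python sketch below is illustrative only and is not part of the source analysis: it works on toneless, simplified transcriptions, assumes a small hypothetical onset inventory, reads M in (17b) as a literal m (as in rǐaŋmâaŋ), and approximates vowel length by counting vowel letters.

```python
import re

# Hypothetical, simplified syllable shape: optional onset, vowel(s), coda.
# Digraph onsets are listed first so that "ts" is not split into "t" + "s".
SYLLABLE = re.compile(
    r"^(?P<onset>ts|th|kh|ph|[^aeiou]?)(?P<nucleus>[aeiou]+)(?P<coda>[^aeiou]*)$"
)


def parse(syllable):
    """Split a toneless syllable into (onset, nucleus, coda)."""
    match = SYLLABLE.match(syllable)
    if match is None:
        raise ValueError(f"cannot parse syllable: {syllable!r}")
    return match.group("onset"), match.group("nucleus"), match.group("coda")


def ideophone_template(first, second):
    """Return '17a', '17b', or None for the two halves of an ideophone."""
    onset1, nucleus1, coda1 = parse(first)
    onset2, nucleus2, coda2 = parse(second)
    # Both templates share C2 and require V1 and V2 to agree in length.
    if coda1 != coda2 or len(nucleus1) != len(nucleus2):
        return None
    if onset1 == onset2:
        return "17a"  # C1V1C2 - C1V2C2
    if onset2 == "m":
        return "17b"  # C1V1C2 - MV2C2 (M assumed to be the consonant m)
    return None


print(ideophone_template("tsik", "tsek"))  # tsiktsek in (18a)
print(ideophone_template("riaŋ", "maaŋ"))  # rǐaŋmâaŋ in (18b), tones removed
```

Counting vowel letters treats the diphthong ia as long, on a par with aa; that is a convenience of the sketch, not a claim about Hakha Lai phonology.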

19.4 Words and word formation

19.4.1 Free forms

Some word classes in KC, such as nouns, verbs, adverbs, demonstratives, and ideophones, are syntactically free, since they can stand (and be recognized) on their own, whereas others, such as preverbal directionals, negators, possessives, and subject markers, are syntactically bound. Prefixes and suffixes are also phonologically bound in KC and can be smaller than a syllable (see causative k- below). Consequently, it is useful to distinguish different levels of boundedness in terms of syntactic and phonological criteria.


19.4.2 Bound forms

Bound forms in KC are either prefixes or suffixes. Evidence of infixes or circumfixes is still scanty in this branch of the Tibeto-Burman family.

Prefixes

Pronominal prefixes (also known as pronominal clitics or agreement markers) are considered bound classes in KC languages. These prefixes mark agreement between the verb (or verbal complex) and its subject and object. Table 10 shows an example of a pronominal agreement system, that of Hakha Lai (Bedell 1995; Peterson 2013).

Tab. 10: Hakha Lai pronominal agreement markers.

       Subject   Object                          Reflexive
1SG    ka        ka                              kaa / ka-ii
2SG    na        in / n with sg. subj.           naa / na-ii
3SG    a         Ø                               aa / a-ii
1PL    kan       kan + hnaa                      kan-ii
2PL    nan       in / n with sg. subj. + hnaa    nan-ii
3PL    an        Ø + hnaa                        an-ii

In addition, some KC languages have prefixes with particular semantic functions. For example, Daai has four derivational prefixes: k-, m-, ng-, and a-. Examples (19a)–(19d) illustrate the function of one of them, the causative prefix k-.

(19) Daai (So-Hartmann 2009: 53)
        Simplex Intransitive    Causative Transitive
     a. ak ‘break’              k’ak ‘cause to be broken’
     b. pyak ‘collapse’         kpyak ‘destroy’
     c. pha ‘arrive’            kpha ‘cause to arrive’
     d. seet ‘be firm’          kseet ‘tighten’

Suffixes

Case markers such as ergative, locative, and genitive are considered suffixes in KC since they are affixed to a noun or NP. Table 11 lists nominal cases in several KC languages: Daai (So-Hartmann 2009), Hyow (Zakaria 2017), Khumi (Peterson 2010), Hakha Lai (Peterson 2013), Mizo (Chhangte 1986), and Sizang (Davis 2017).




Tab. 11: Examples of case markers in KC languages. Languages Case

Daai

Hyow

Khumi

Hakha Lai

Mizo

Sizang

Ergative Locative Genitive

noh â a

lâ â

aa

niʔ aʔ ii

in aʔ â

nǎ: a: In

Example (20) illustrates the ergative case in Mindat K’Cho.

(20) Mindat K’Cho (Mang 2006: 29)
     Om-noh    k’khìm  meh   ah-na(k)-ci
     Name-ERG  knife   meat  cut.2-APPL.INST-NF
     ‘Om used the knife for cutting meat’

There is a nominalizer suffix of the form naa/naak across KC languages, deriving verbal nouns with various functions. This suffix often requires a certain verbal form (e.g. Form-2 in Hakha Lai). Some languages, e.g. Mizo (Chhangte 1993: 79) and Sizang (Taylor 2017: 70), have the nominalizer suffix naa, whereas Daai (So-Hartmann 2009: 62) and Hakha Lai (Peterson 2013) have the suffix naak. Examples (21) and (22) illustrate the function of these nominalizers in Mizo and Daai.

(21) Mizo (Chhangte 1993: 79–80)
        Verb                  Noun (+nâ)
     a. cán ‘slice’           cán-nâ ‘slicer’
     b. thíám ‘know.how’      thíám-nâ ‘know.how’
     c. re-they ‘poor’        re-they-nâ ‘poverty’

(22) Daai (So-Hartmann 2009: 63)
        Verb                  Noun (+naak)
     a. ngthei ‘study’        ngthei-naak ‘lesson’
     b. küm-seei ‘be full’    küm-seei-naak ‘fullness’
     c. kum-kyan ‘save’       kum-kyan-naak ‘salvation’

Reduplication

Ideophones in KC are built by partial reduplication (see 19.3.8). Phrasal and adverbial reduplication is also very common in KC, with the function of emphasis. For example, Chhangte (1986: 242–244) lists several attributive types of reduplication in Mizo: trhaa trhaa ‘best one’ (“good good”); a kal1 a kal1 ‘she went back and forth’ (“she went she went”); tak tak ‘very very’ (“INT INT”); zong zong ‘all’ (“also also”). In Hakha Lai, adverbial phrasal reduplication follows a particular template (23), as exemplified by (24a)–(24b).


(23) a-V2-in a-V1

(24) a. a-khaʔ-ìn         a-khat
        3SG.S-full2-ADVZ  3SG.S-full1
        ‘it’s packed full’
     b. a-tsaʔ-ìn          a-tsaak
        3SG.S-dirty2-ADVZ  3SG.S-dirty1
        ‘it’s really dirty’
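Template (23) is mechanical enough to be stated as a small function. The sketch below is illustrative only: it takes both verb stems as given (it does not derive Form-2 from Form-1), fixes the 3SG subject prefix a- that appears in (24), and writes the adverbializer with the tone shown in the examples. The function name and parameters are hypothetical.

```python
def reduplicate_adverbial(form1, form2, adverbializer="ìn"):
    """Build the phrase 'a-V2-in a-V1' from the two stems of one verb."""
    subject = "a"  # 3SG.S, the only subject prefix attested in (24)
    return f"{subject}-{form2}-{adverbializer} {subject}-{form1}"


print(reduplicate_adverbial("khat", "khaʔ"))   # pattern of (24a)
print(reduplicate_adverbial("tsaak", "tsaʔ"))  # pattern of (24b)
```

Whether the same template extends to subjects other than 3SG is not shown in the examples, which is why the prefix is not exposed as a parameter here.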

19.5 Phrase and clause structure

KC languages are generally head-final at both the phrase and the clause level, with the verb generally the last major constituent of the sentence, as will be amply exemplified in this section. KC clauses may be divided into finite and non-finite constructions. Finite clauses usually include a combination of subject agreement, tense, aspect, etc., whereas non-finite clauses are dependent clauses and occur as subordinate clauses and complement clauses, which are dealt with in detail in 19.5.3.

19.5.1 Noun phrase

An NP in KC consists minimally of a noun, which is the only obligatory element. The NP head can be a lexical noun, a personal pronoun, or a demonstrative pronoun, and may be accompanied by one or more other constituents. The constituents that can precede the head noun are demonstratives (which may be either pre- or post-nominal), possessors, and relative clauses; the other possible constituents follow the head. The possible NP structure in (25) is exemplified in (26a)–(26c) with data from Daai, Hyow, and Sizang, and in (27) with Hakha Lai.

(25) (REL CL) (POSS) (DEM) Head-N (Gender) (Stative Verb) (QUANT) (CASE) (DEM)¹⁰

(26) a. Daai (adapted from So-Hartmann 2009: 133)
        angyan-üng    kah-nih        yah-ei  meh
        effort-INSTR  1S.AGR-DU.PL   get-AO  meat
        ‘The meat that we got with great effort’
     b. Hyow (Zakaria 2017: 180)
        pɔtɔ  nú-pɔ          lá   hmútɔ  nú-pɔ
        man   mother-father  and  woman  mother-father
        ‘The man’s parents and the woman’s parents’

10 This means that the phrase structure is not strictly head final.



     c. Sizang (Davis 2017: 13)
        mi      zɔ:ŋ-pɑ̌
        person  be.poor-MASC
        ‘The poor man’

(27) Hakha Lai [Prenominal DEM, REL, POSS, postnominal MDF, CASE, DEM]
     [tsùu  zìiŋ-aʔ      a-àw-tòon-mìi           kân-uy-păa-nak-niʔ
     DEM    Morning-LOC  3SG.S-bark-usually-REL  1PL.POSS-dog-MASC-black1-ERG
     tsun]NP  tshìizoʔ  a-seʔ
     DEM      cat       3SG.S-bite2
     ‘Our male black dog that usually barked in the morning bit a cat’

19.5.2 Verb phrase

A KC verb phrase contains minimally a verb, e.g. in the imperative mood. For example, in Hakha Lai, the VP kâl! ‘go!’ consists only of a verb (the subject ‘you’ is implied here). In other moods, such as the indicative and the negative, a KC verb phrase may contain several elements. For example, the possible Hakha Lai VP structures in (28a)–(28b) are exemplified by the actual elements in (29a)–(29b).

(28) Hakha Lai (adapted from Kavitskaya 1997)
     a. (SBJ)-(DIR)-V1-(AUX)-(ASP#)-(T)-(NEG)
     b. (SBJ)-(DIR)-V2-(CAUS/APPL)-(AUX)-(ASP#)-(T)-(NEG)

(29) a. ka-vŏn-râa        khàw         dèeŋ      riʔ  lâay  lăw
        1SG.S-DIR-come1   able1 (AUX)  about.to  yet  FUT   NEG
        ‘I am not able to come soon yet’
     b. ka-vŏn-rat-têr         khoʔ         dèeŋ      tsàŋ
        1SG.S-DIR-come2-CAUS   able2 (AUX)  about.to  PFV
        ‘I am about to be able to let him come now’
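The slot templates in (28) amount to a fixed linear order with optional positions. A minimal Python sketch of that idea follows; it is a hypothetical illustration, not Kavitskaya's formalism. It assumes that subject, directional, verb stem, and CAUS/APPL suffix are joined by hyphens into one word, that the remaining slots follow as separate words, and that ASP# allows more than one aspect marker, as in (29a). The class and field names are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VerbPhrase:
    """One filler per slot of template (28); empty strings mean 'slot unused'."""
    verb: str                      # V1 or V2 stem (the only obligatory slot)
    subject: str = ""              # SBJ
    directional: str = ""          # DIR
    valence: str = ""              # CAUS/APPL, only available in (28b)
    auxiliary: str = ""            # AUX
    aspects: List[str] = field(default_factory=list)  # ASP#, possibly several
    tense: str = ""                # T
    negation: str = ""             # NEG

    def linearize(self) -> str:
        # Prefixes and the valence suffix attach to the verb stem.
        verb_word = "-".join(
            part
            for part in (self.subject, self.directional, self.verb, self.valence)
            if part
        )
        # The remaining slots follow the verb word in template order.
        trailing = [self.auxiliary, *self.aspects, self.tense, self.negation]
        return " ".join([verb_word, *(word for word in trailing if word)])


# Fillers taken from (29a), template (28a).
print(VerbPhrase(verb="râa", subject="ka", directional="vŏn", auxiliary="khàw",
                 aspects=["dèeŋ", "riʔ"], tense="lâay", negation="lăw").linearize())
```

A bare imperative such as kâl! corresponds to `VerbPhrase(verb="kâl")`, where every optional slot is left empty.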

19.5.3 Alignment and valency

The following sections discuss KC alignment and valency in relation to main and subordinate clauses.

Main clauses

An intransitive clause in KC includes an obligatory verb. Some KC languages, for example Sizang (30a)–(30c), do not require a pronominal agreement marker. Other KC languages require pronominal clitics, which can serve as the subject, as illustrated by Daai in (31).

(30) Sizang¹¹ (PC: Rev. Kip Thian Pau; adapted from Davis 2017: 25)
     a. Dr. Thang Za Lian-sǐ:a  thi:  hî:
        Name-ABS                die1  be
        ‘Dr. Thang Za Lian died’
     b. uǐ-sǐ:a   thi:  hî:
        dog-ABS   die1  be
        ‘The dog died’
     c. ha:usa:pa:-nă  ui-(sǐ:a)  sa:t  hî:
        headman-ERG    dog-(ABS)  hit1  be
        ‘The headman hit the dog’

(31) Daai (So-Hartmann 2009: 93)
     kah-büh
     S.AGR:1SG-look
     ‘I look’

KC transitive clauses usually involve ergative and absolutive markers, as illustrated by Sizang in Figure 3.¹² However, the absolutive marker is optional in many KC languages, as shown in (32a)–(32b) with Hakha Lai examples.

Fig. 3: Ergative-absolutive marking in Sizang (adapted from Davis 2017: 17). (Diagram: A is marked ergative, /nă/; S and O are marked absolutive, /sǐ:a/.)

11 It appears that the Sizang ABS marker is sensitive to animacy (human vs. non-human). For example, in (30a) the ABS sǐ:a is required when it marks a person, whereas in (30b) it refers back to the known topic, as indicated conveniently by the English definite article ‘the’.
12 Note that Davis (2017: 17) includes an ergative case marker /ĭn/ in addition to /nă/ in the figure. However, Rev. Kip Thian Pau (a Sizang pastor) suggests that /ĭn/ is a loan from Falam Chin (interview date: 13 December 2019).



(32) Hakha Lai
     a. vok  (khâa)  a-thîi
        pig  ABS     3SG.S-die1
        ‘the pig died’
     b. Ni Hu-niʔ  vok  (khâa)  a-thaʔ
        Ni Hu-ERG  pig  ABS     3SG.S-kill2
        ‘Ni Hu killed the pig’

Ditransitive clauses in KC involve verbs that allow three participants (arguments) in the clause. In these ditransitive expressions R is encoded identically to O in transitive expressions. This argument may be called “primary object”, as opposed to T, which is treated differently. Hakha Lai (33a)–(33c) illustrates such ditransitive clauses with the verbs ‘give’, ‘send’, and ‘cook’. (See also King [2010: 100] for Falam Chin and So-Hartmann [2009: 229] for Daai.) (33)

Hakha Lai
     a. Ka-nu-niʔ            rôol  a-ka-peek
        1SG.POSS-mother-ERG  food  3SG.S-1SG.OBJ-give2
        ‘My mom gave me a meal’
     b. Ka-nu-niʔ            ca      a-ka-kuat
        1SG.POSS-mother-ERG  letter  3SG.S-1SG.OBJ-send2
        ‘My mom sent me a letter’
     c. Ka-nu-niʔ            rôol  a-ka-chuanʔ
        1SG.POSS-mother-ERG  food  3SG.S-1SG.OBJ-cook2
        ‘My mom cooked me a meal’

KC languages usually display two strategies for causative constructions. The first is a lexical (direct) causative, which usually involves either a causative prefix m- of limited productivity, as in Daai (34a)–(34c), or aspiration/devoicing of the initial consonant, as in Hakha Lai (35a)–(35c).

(34) Daai (So-Hartmann 2009: 55–56)
        Simplex                          Causative
     a. ei ‘eat’                         mbei ‘feed’
     b. oo:k ‘drink’                     mbook ‘give to drink’
     c. ooi ‘hang around the neck’       mbooi ‘put over the neck of another person’

(35) Hakha Lai (Van Bik 2003)
        Simplex                          Causative
        Form-1  Form-2                   Form-1  Form-2
     a. kǎaŋ    kaŋʔ   ‘burn (int)’      khǎaŋ   khaŋʔ  ‘burn (tr)’
     b. tsat    tsaʔ   ‘be severed’      tshat   tshaʔ  ‘sever’
     c. rook    roʔ    ‘break down’      r̥ook    r̥oʔ    ‘destroy’


The second causative construction uses a productive morphological suffix. Examples (36a)–(36c) illustrate such (indirect) causative constructions in several KC languages.

(36) a. Mindat K’Cho (Mang 2006: 59)
        Nu      noh  Yong  am   ak’hmo  m-ih-hlak-ci
        Mother  ERG  Yong  DAT  child   CAUS-sleep2-CAUS-NF
        ‘Mother asked/made Yong to put the child to sleep’
     b. Mizo (Chhangte 1993: 101)
        Kâ-pàà-in       keel  mín-veen-tîîr
        1SG-father-ERG  goat  1SG.ACC-watch.2-CAUS
        ‘My father made me watch the goats’
     c. Tedim (Henderson 1965: 83)
        Á        dám-sák      hî:
        3SG.NOM  heal.1-CAUS  PA
        ‘He healed him’

Another well-attested valence-raising operation in KC is the applicative construction. King (2010) lists the applicative morphemes of several languages; their number ranges from one in Tedim to seven in Hakha Lai, as shown in Table 12, and some applicative constructions are exemplified in (37a)–(37c) for Daai, Hyow, and Mizo.

Tab. 12: Applicative morphemes in Chin languages (adapted from King 2010: 17).

Benefactive Malefactive Instrumental Comitative/ associative Relinquitive Priorative Adversative Locative

Hakha Lai

Mizo

Falam (Zahau)

K’Chò

Daai

Sizang

Tiddim

-piak, -tsemʔ -hnoʔ

-sak

-sak

-pe(k)/ peit -pe(k)/ peit, -shi -na(k) -püi

-pee:t

-sak

-sak13

-pui

-pui

-ta

-taa:k

-san

-naak -pii -taak -kanʔ

-khùŋ

-pûy -sàn

-piǐ

-sak

-cilʔ -khùm -nhàn

13 Note that this morpheme is also a causative suffix (36c). Typologically, this is in line with other SEAn languages that frequently use ‘to give’ as causative and benefactive marker (including some varieties of colloquial Burmese).



(37) a. Daai (Hartmann-So 2002: 93)
        Ling So  noh  lou:   nah      phyoh   püi  kti
        Ling So  ERG  field  1SG.ACC  weed.2  COM  NF
        ‘Ling So weeds the field with me’
     b. Hyow (Zakaria 2017: 373)
        tông-ak èydɘ èy-tsíʔ-êy-pek-khɔ
        then ANA.DEM-take.II-MID-BEN-PM CLF-one
        ‘Then, he took back that, one basket, for himself from him’
     c. Mizo (Chhangte 1993: 102)
        Kór    mî-ley-sak
        dress  1SG.ACC-buy.2-BEN
        ‘S/he bought a dress for/from me.’

Subordinate clauses

Subordinate clauses in KC function as complement, adverbial, and relative clauses, each marked by its own subordinator. Complement markers enable a complement clause to be embedded within the main clause. The verb ‘say’ often acts as a complementizer in KC languages. For example, as shown in (38a), the Falam Chin verb ti ‘say’ is also a complementizer. However, the origin of the Daai complementizer /a/ in (38b) is not transparent within the language itself.

(38) a. Falam Chin (King 2010: 111)
        Mang  cu   [rul-in     a-cuk         ding  ti]   a-phang
        Mang  TOP  snake-ERG   3SG.NOM-bite  FUT   COMP  3SG.NOM-afraid
        ‘Mang is afraid [that the snake will bite him]’
     b. Daai (So-Hartmann 2009: 314)
        Kah-thi-in        kkhai-a  kah-ngngaih     ni
        S.AGR:1S-die-MIR  FUT-CF   S.AGR:1S-think  EMPH
        ‘I think that I will die’

Adverbial clauses also have markers which help add extra information about when, how, or why an action occurs in KC. Again, Falam Chin and Daai illustrate KC adverbial clauses in (39a)–(39b). Examples from Hakha Lai cover conditional, causal, and temporal clauses (40a)–(40c). (39)

     a. Falam Chin (King 2010: 114)
        A-fate-pawl        hmu    lo-in     a-thi
        3SG.POSS-child-PL  see.1  NEG-ADVZ  3SG.NOM-die.1
        ‘He died without seeing his children’
     b. Daai (So-Hartmann 2009: 280)
        Lou:-a     ah-nih          pha     jata …
        Field-LOC  S.AGR:3-DU.PL   arrive  as.soon.as
        ‘As soon as they arrived on the field …’

(40) Hakha Lai
     a. Na-zoot      aʔtsûn  si-bòoy          vàa-zawʔ-têr
        2SG.S-sick2  if      medicine-master  DIR.MID-look2-CAUS
        ‘Let the doctor examine you if you are sick’
     b. Ka-zoot      rùaŋ-aʔ     ka-râa       khàw  lăw
        1SG.S-sick2  reason-LOC  1SG.S-come1  able  NEG
        ‘I could not come because I was sick’
     c. Ka-dăm          tik-aʔ    ka-râa       tĕe  lâay
        1SG.S-healthy2  time-LOC  1SG.S-come1  DIM  FUT
        ‘I will come when I become well’

Relative clauses in KC are usually externally headed. Some languages, such as Hakha Lai, have both externally and internally headed ones (Kathol and Van Bik 1999). Even then, externally headed relative clauses are the most common type in Hakha Lai (Bawi Tawng 2017: 48). In some KC languages, e.g. Hakha Lai and Falam Chin, the verb forms play an important role in the strategies of subject and object relativization. For example, in Falam Chin, a Form-1 verb is required for subject relativization in (41a)–(41b), and a Form-2 verb is required for object relativization in (41c)–(41d).

(41)

Falam Chin (adapted from King 2007: 116–117)
     a. Cinte-in   a-ṭap-rero-mi            naute  a-cawi
        Cinte-ERG  3SG.NOM-cry1-IPFV-REL    baby   3SG.NOM-pick.up.INV
        ‘Cinte picked up the baby who was crying’
     b. Cinte  paisa  a-cawi-mi           nu-in      a-bawm
        Cinte  money  3SG.NOM-lend1-REL   woman-ERG  3SG.NOM-help.INV
        ‘The woman who lent the money to Cinte helped her’
     c. Parte-in   paisa  a-cawih-mi          nu-in
        Parte-ERG  money  3SG.NOM-lend2-REL   woman-ERG
        a-rul              sal
        3SG.NOM-repay.INV  again
        ‘The woman to whom Parte lent the money repaid it’
     d. Parte-in   Cinte-in   a-cawih-mi          paisa
        Parte-ERG  Cinte-ERG  3SG.NOM-lend2-REL   money
        a-rul              sal
        3SG.NOM-repay.INV  again
        ‘Parte repaid the money which Cinte lent her’




On the other hand, Daai employs quite a different strategy which involves the k- prefix. Examples in (42a)–(42c) illustrate subject, direct object, and indirect object relativizations in Daai. (42)

Daai (So-Hartmann 2009: 185–186)
     a. Subject Relativization
        ana-k-ve                     ma         Msi Msääi  ta …
        DIR:in.advance-REL-existing  do.before  Msi Msääi  FOC …
        ‘As for the Msi Msääi who existed first …’
     b. Direct Object Relativization
        Mhnam-pa:-noh         ah-pee:t         mthi-kshon-k-khe        sun
        creator.god-male-ERG  S.AGR:3S-giving  iron-walking-REL-stick  DEM
        Lün-noh  mhnih   lüta …
        Lün-ERG  forget  SR
        ‘Lün forgot the iron walking stick which the creator god had given …’ (SR = Switch Reference)
     c. Indirect Object Relativization
        muti         ah-shoom-ei                   lo   shak
        bead.string  S.AGR:3S-wrap.around.head-AO  ASP  CAUS
        k-khyaangsa:  sun  ngthiimkho-a  pha     lo   hnüh-kti.
        REL-human     DEM  world-LOC     arrive  DIR  finally-NON.FUT
        ‘The human being whom he had caused to wrap bead strings around his head arrived finally on this world.’

Also, it is not uncommon to have relativized clauses without any relative marker, as illustrated in (43) and (44) with examples from Hyow and Sizang respectively. Similar to other KC languages (cf. Falam Chin and Hakha Lai), the subject is relativized in Hyow with Form-1 verb while the object is relativized in Sizang with Form-2 verb. (43)

Hyow (Zakaria 2017: 710 f.)
     èydɘ̂  [èy-thǽʔy        shík-êy-tíʔ]RC        [kɘm-âl-hlɔ]MC
     then  [ANA.DEM-fruit  pluck.I-MID-NMLZ]     [descend-DEP-PM]
     ‘Then [the person] who was plucking the fruits climbed back down’

(44) Sizang (Davis 2018: 46)
     Na-lâːi          hoŋ-tʰák     ka-ŋaâː      aː
     2SG.POSS-letter  CIS-send.II  1SG.S-get.I  NF
     ‘I got your letter that you sent me, and …’


19.5.4 Questions and commands

KC languages have two kinds of questions: polar questions and content questions. Polar questions in KC have some kind of question marker at the end of the sentence. Table 13 lists such polar question markers for Daai (So-Hartmann 2009: 310), Falam Chin (King 2010: 125), Hyow (Zakaria 2017: 573), Hakha Lai (Peterson 1999), Mizo (Chhangte 1986: 204), Paite (Singh 2006: 160), and Sizang (Davis 2017: 42), which has two markers, as illustrated in (45a)–(45b).

Tab. 13: Polar question markers in KC languages.

Language      Polar question markers
Daai
Falam Chin    maw
Hyow          êy
Hakha Lai     măa, mŏo
Mizo          em2
Paite         hiá
Sizang        lɛː, mɔ̌:

(45) a. nǎː-mû:  lɛː
        2-see.I  Q
        ‘Do you see it?’
     b. ɛ̌:i-tɛ̌:ŋ       bɛ̂k   mɔ̌:
        1PL.INCL.PL-1  only  Q
        ‘Are we all alone?’

Content questions (also known as WH-questions) are formed with content question morpheme(s) in KC languages. Table 14 lists such morphemes for Daai (So-Hartmann 2009: 308), Falam Chin (King 2010: 127), Hakha Lai, Mizo (Chhangte 1986: 201–204), and Sizang (Naylor 1925: 19).

Tab. 14: Content question morphemes in KC languages.

KC languages/ WH pronouns

Daai

Falam Chin

Hakha Lai

Mizo

Sizang

who what why

u i ilü

zo ziang ziang ruang ah

tu eng eng ah

koi bang a bang hang

when how where

itüh ihokba ho

ziang tik ziang tin khoi

ahăw zây ziaʔ zây tsàa-aʔ zây tik zây tìn khŏoy

eng tik eng tin khoi3

a bang hun chiang koi bang koilai

Examples in (46a)–(46d) illustrate how these content question morphemes are used in Falam Chin (King 2010: 127).



(46) a. Zo   so   a-hmuh?
        who  FOC  3SG.NOM-see.2
        ‘Whom did he see?’
     b. Mang-in   ziang  si   a-lo-suh?
        Mang-ERG  what   FOC  3SG.NOM-2.ACC-ask.2
        ‘What did Mang ask you?’
     c. Ziang-ruang-ah   so   a-ruk?
        what-reason-LOC  FOC  3SG.NOM-steal.2
        ‘Why did he steal it?’
     d. Buh   ziang-ti-n      na-suang?
        rice  which-way-ADVZ  2SG.NOM-cook.1
        ‘How do you cook rice?’

Commands in KC are expressed with a bare verb or, in simple positive commands, with an imperative marker. Negative imperative (prohibitive) markers usually differ both from the positive imperative markers and from declarative negation, as listed in Table 15 for KC languages: Daai (So-Hartmann 2009: 302–306), Falam Chin (King 2010: 91), Hyow (Zakaria 2017: 573), Hakha Lai (Peterson 1999), Mizo (Chhangte 1986: 204), and Paite (Singh 2006: 151), which distinguishes three imperatives, polite, simple, and persuasive, as illustrated in (47a)–(47c).

Tab. 15: Imperative markers in KC.

                      Daai      Falam Chin   Hakha Lai    Mizo   Paite
Positive Imperative   a         aw           tuaʔ, loo    roʔ    ó, ín, vè
Negative Imperative   kah V a   hlah         hlaʔ         suʔ    kóʔ, ken, key

(47) a. Polite imperative with ó
        Affirmative: dīŋ ó       ‘Stand up! (you sg)’
        Negative:    dīŋ kóʔ     ‘Don’t stand up! (you sg)’
     b. Simple imperative with ín
        Affirmative: dīŋ ŋín     ‘Stand up! (you sg)’
        Negative:    dīŋ kén     ‘Don’t stand up! (you sg)’
     c. Persuasive imperative with vè
        Affirmative: dīŋ vè      ‘Stand up! (you sg)’
        Negative:    dīŋkey vè   ‘Don’t stand up! (you sg)’


19.5.5 Pragmatics and syntax

Fronting and afterthoughts are common pragmatic devices in KC languages. An example of fronting comes from Falam Chin (King 2010: 126–127). Example (48a) is a regular WH question with the head noun in focus. As shown in (48b), the interrogative pronoun may also be moved to the focus position.

(48) a. Zungruk  zo-in    a-ru?
        ring     who-ERG  3SG.NOM-steal.1
        ‘Who stole the ring?’
     b. Zo-in    zungruk  a-ru?
        Who-ERG  ring     3SG.NOM-steal.1
        ‘Who stole the ring?’

An afterthought example is provided in a Hyow narrative text (Zakaria 2017: 560) in (49) where, according to Zakaria (2017), “the topicalized noun phrase argument, kéy-tsæ ‘1SG-TOP’ has the role of an agent, and follows the verb complex, so its clause final position identifies it as an afterthought” (Zakaria 2017: 560). (49)    

èydɘ tsó ní lúkí then DP PROX head ɔ-krɔ ʔêy-tsɔ GRP-be.late eat.II-DIM



kǝ-bʉn-hnɔ kêy-tsæ 1A-get.II-PM 1SG-TOP

púm-shɘk èy-kón-ní CLF-six ANA.DEM-ABL-FOC tîng-ni lúkí-dɘ púm-shɘk QUOT- head=EMPH CLS six FOC

ní DIST

Then, he said, “Take, these are six heads.” Saying, “From these, this one is for dinner (literally, late eating).” He said, “I found six heads, I” (it’s me who got these six heads)

19.6 Conclusion

KC languages are well known for their conservatism in maintaining PTB phonology, that is, they retain many PTB initial and final consonants. Some KC languages, e.g. Khumi and Mara, also retain PTB prefixes. For example, the word for ‘snake’ is reconstructed as PTB *s-b-rul; *s- and *b- are animal prefixes in TB (Matisoff 2003: 43). Interestingly, Mizo still has a sa- prefix for animals, as in sa-vom ‘bear’ and sa-khi ‘deer’ (Lorrain 1940); the pa- prefix in Mara pa-ri ‘snake’ (Lorrain 1951: 237) is cognate with the PTB animal prefix *b-; and the word for ‘snake’ in Hakha Lai is ruul.

There is an urgent need to document the many KC languages that face the threat of extinction due to social and political pressure from locally and nationally dominant languages. Lamtuk, for example, spoken by about 400 speakers in two villages (Lamtuk and Ruavan), is one of them, since the youth who go to Hakha for educational opportunities are shifting to Hakha Lai and Burmese. Situations like these seem inherently unstable and conducive to eventual endangerment and shift, leading to language loss.

References Arden, Michelle J. 2010. A phonetic, phonological, and morphosyntactic analysis of the Mara language. San Jose, CA: San José State University MA thesis. Barnes, Jonathan. 1998. Tsuu kha tii hlaʔ: Deixis, demonstratives and discourse particles in Lai Chin. Linguistics of the Tibeto-Burman Area 21(1). 53–86. Bawi Tawng. 2017. A typology of subordinate constructions in Lai. Chiang Mai, Thailand: Payap University thesis. Bedell, George. 1995. Agreement in Lai. Papers from the Fifth Annual Meeting of the Southeast Asian Linguistics Society, 21–32. Tempe, AZ: Program for Southeast Asian Studies, Arizona State University. Bhaskararao, Peri. 1996. A computerized lexical database of Tiddim Chin and Lushai. Tokyo: ILCAA, Tokyo University of Foreign Studies. Chhangte, Lalnunthangi. 1986. A preliminary grammar of the Mizo language. Arlington: University of Texas at Arlington Master’s thesis. Chhangte, Lalnunthangi. 1993. Mizo syntax. Eugene, OR: University of Oregon PhD dissertation. Davis, Tyler D. 2017. Verb stem alternation in Sizang Chin narrative discourse. Chiang Mai, Thailand: Payap University thesis. Diffloth, Gérard. 1979. Expressive phonology and prosaic phonology in Mon-Khmer. In Theraphan L. Thongkum, Vichin Panupong, Pranee Kullavanijaya & M. R. Kalaya Tingsabadh (eds.), Studies in Thai and Mon-Khmer phonetics and phonology, 49–59. Bangkok: Chulalongkorn University Press. Hartmann-So, Helga.1985. Morphophonemic changes in Daai Chin. In Suriya Ratanakul, David Thomas & Suwilai Premsrirat (eds.), Southeast Asian Linguistic Studies presented to André-G. Haudricourt. Bangkok: Mahidol University. Hartmann-So, Helga. 1988. Notes on the Southern Chin languages. Linguistics of the Tibeto-Burman Area 11(2). 98–119. Hartmann-So, Helga. 1989. Directional auxiliaries in Daai Chin. In David Bradley (ed.), Papers in South-East Asian Linguistics No.11: South-East Asian syntax (Pacific Linguistics, A-77). Canberra: Australian National University. Hartmann-So, Helga. 
1999. Prenasalization and preglottalization in Daai Chin with parallel examples from Mro and Mara. LTBA 24(2). 123–142. Hartmann-So, Helga. 2002. Verb stem alternation in Daai Chin. Linguistics of the Tibeto-Burman Area 25. 81–98. Henderson, Eugénie J. A. 1965. Tiddim Chin: A descriptive analysis of two texts (London Oriental Series 15). London: Oxford University Press. Hyman, Larry M. 2007. Elicitation as experimental phonology: Thlantlang Lai tonology. In Maria-Josep Solé, Pam Beddor & Manjari Ohala (eds.), Experimental approaches to phonology in honor of John J. Ohala, 7–24. Oxford: Oxford University Press. Hyman, Larry M. 2010. Kuki-Thaadow: An African tone system in Southeast Asia. In Franck Floricic (ed.), Essais de typologie et de linguistique générale, 31–51. Lyon, France: Les Presses de l’Ecole Normale Supérieure.


 Kenneth Van Bik

Hyman, Larry M. & Kenneth Van Bik. 2002a. Tone and syllable structure in Hakha-Lai. Proceedings of the Twenty-Eighth Annual Meeting of the Berkeley Linguistics Society [BLS]: Special Session on Tibeto-Burman and Southeast Asian Linguistics, 13–28. Berkeley: Center for South and Southeast Asia Studies. Hyman, Larry M. & Kenneth Van Bik. 2002b. Tone and stem2-formation in Hakha Lai. Linguistics of the Tibeto-Burman Area 25(1). 113–121. Jordan, Marc. 1969. Chin (Cho) dictionary and grammar. Manuscript. Kathol, Andreas & Ken Van Bik. 1999. Morphology-syntax interface in Lai relative clauses. In Pius Tamanji, Masako Hirotani & Nancy Hall (eds.), Proceedings of the 29th Annual Meeting of the North Eastern Linguistic Society, 427–441. Amherst: University of Massachusetts. Kavitskaya, Darya. 1997. Tense and aspect in Lai Chin. Linguistics of the Tibeto-Burman Area 20(2). 173–213. Kemmer, Suzanne. 1993. The middle voice. Amsterdam & Philadelphia: John Benjamins. Konnerth, Linda. 2018. The historical morphology of Monsang (North-western South-central “Kuki-Chin”): A case of reduction in phonological complexity. Himalayan Linguistics 17(1). 19–49. King, Deborah. 2010. Voice and valence-altering operations in Falam Chin: A role and reference grammar approach. Arlington, TX: University of Texas at Arlington PhD dissertation. Lorrain, J. Herbert. 1940. Dictionary of the Lushai language. Calcutta: Royal Asiatic Society of Bengal. [Repr. 1965, 1976]. Lorrain, Reginald Arthur. 1951. Grammar and dictionary of the Lakher or Mara language. Gauhati: Department of Historical and Antiquarian Studies, Government of Assam. Tones marked by Ngo Co Le. Luce, Gordon. H. 1959. Chin Hills – linguistic tour (Dec. 1954) – university project. Journal of the Burma Research Society 42(1). 19–31. Maddieson, Ian. 2004. Timing and alignment: A case study of Lai. Language and Linguistics 5(4). 729–755. Maddieson, Ian & Kenneth Van Bik. 2004. Apical and laminal articulations in Hakha Lai. 
In Marc Ettlinger, Nicholas Fleisher & Mischa Park-Doob (eds.), Proceedings of the Thirtieth Annual Meeting of the Berkeley Linguistics Society, 232–243. Berkeley, CA: Berkeley Linguistics Society. Matisoff, James A. 1986. Hearts and minds in Southeast Asian languages and English: An essay in the comparative lexical semantics of psycho-collocations. Cahiers de Linguistique Asie Orientale 15(1). 5–57. Matisoff, James A. 2003. Handbook of Proto-Tibeto-Burman. Berkeley, Los Angeles & London: University of California Press. Mang, Kee Shein. 2006. A syntactic and pragmatic description of verb stem alternation in K’Chò, a Chin language. Chiang Mai, Thailand: Payap University thesis. Melnik, Nurit. 1997. The sound system of Lai. Linguistics of the Tibeto-Burman Area 20(2). 9–19. Naylor, L. B. 1925. A practical handbook of the Chin language (Siyin dialect): Containing grammatical principles with numerous exercises and a vocabulary. Rangoon: Superintendent, Govt. Print. and Stationery. Osburne, Andrea Gail. 1975. A transformational analysis of tone in the verb system of Zahao (Laizo) Chin. Ithaca, NY: Cornell University PhD dissertation. Par, Ngun Tin. 2016. Agreement and verb stem alternation in Senthang Chin. Chiang Mai, Thailand: Payap University MA thesis. Peterson, David. 1998. The morphosyntax of transitivization in Lai (Haka Chin). Linguistics of the Tibeto-Burman Area 21(1). 87–153.



Typological profile of Kuki-Chin languages 


Peterson, David. 2003. Khumi-English dictionary. Hanover, NH: Dartmouth College MS. Peterson, David. 2004. A sketch of Bangladesh Khumi morphosyntax. Paper presented at International Conference on Sino-Tibetan Languages and Linguistics (ICSTLL 37), University of Lund, 1 October 2004. Peterson, David. 2010. Khumi elaborate expressions. Himalayan Linguistics 9(1). 81–100. Peterson, David. 2016. Lai. In Graham Thurgood & Randy J. LaPolla (eds.), The Sino-Tibetan Languages. London & New York: Routledge. Peterson, David. 2017. On Kuki-Chin subgrouping. In Picus S. Ding & Jemin Pekley (eds.), Social-historical linguistics in Southeast Asia: New horizon for Tibeto Burman studies in honor of David Bradley. Leiden: Brill. 189–209. Peterson, David & Ken Van Bik. 2015. Kuki-Chin languages: An overview. In Van Bik (ed.), Continuum of the richness of languages and dialects in Myanmar. Yangon, Myanmar: Chin Human Rights Organization’s Publication. 33–51. Rawlins, John. 1787. On the manners, religion, and laws of the C úcì’s, or Mountaineers of Tipra. Communicated in Perfian. Asiatick Researches. Calcutta: Manuel Cantopher. 187–193. Roengitya, Rungpat. 1997. Glottal stop and glottalization in Lai (connected speech). Linguistics of the Tibeto-Burman Area 20(2). 21–57. Savidge, Fred. W. 1908. Grammar and dictionary of the Lakher language. Allahabad, India: The Pioneer Press. Shakespear, J. 1912. The Lushai Kuki Clan. Aizawl, Mizoram: Tribal Research Institute [Reprinted 1975]. Singh, Noarem S. 2006. A grammar of Paite. New Delhi: Mittal Publication. Smith, Tomoko Yamashita. 1998. The middle voice in Lai. Linguistics of the Tibeto-Burman Area 21(1). 1–52. So-Hartmann, Helga. 2009. A descriptive grammar of Daai Chin (STEDT [Sino-Tibetan Etymological Dictionary and Thesaurus] Series Monograph 7). Berkeley: University of California. Thuan, Khar. 2008. A phonological description of Falam. Chiang Mai, Thailand: Payap University MA thesis. Tluangneh, Thlasui. 2002. 
The middle voice in Lai. Cincinnati, OH: Cincinnati Christian University thesis. Van Bik, Kenneth. 1998. Lai Psycho-collocation. Linguistics of the Tibeto-Burman Area 21(1). 201–232. Van Bik, Kenneth. 2003. Three types of causative constructions in Hakha Lai. Linguistics of the Tibeto-Burman Area 25(2). 99–122. Van Bik, Kenneth. 2009. Proto-Kuki-Chin: A reconstructed ancestor of the Kuki-Chin languages (STEDT [Sino-Tibetan Etymological Dictionary and Thesaurus] Monograph Series 8). Berkeley: University of California. Van Bik, Kenneth & Thlasui Tluangneh. 2017. Preverbal directional particles in Hakha Lai. Himalayan Linguistics 16(1). 141–150. Zakaria, Muhammad. 2017. A grammar of Hyow. Singapore: Nanyang Technological University Doctoral dissertation.

Keita Kurabe

20 Typological profile of the Kachin languages

20.1 Introduction

Northern Burma (Myanmar) is mainly an upland area of hills and mountains, where, as is typical of other uplands in mainland Southeast Asia (Enfield and Comrie 2015: 4–5), populations are sparser and ethnolinguistically more diverse than in the lowlands. Northern Burma, being home to a great number of Tibeto-Burman and Tai speakers, has been a locus of long-standing intra- and inter-ethnolinguistic contact (e.g., Leach 1954). The Kachin people are one of the major populations inhabiting Kachin State and northern Shan State of northern Burma and adjacent areas of China and India. Traditionally, they are highlanders occupying heavily forested hill tracts, known as the Kachin Hills, where they practice slash-and-burn agriculture in non-irrigated mountain fields. The Kachin are a linguistically diverse people speaking several distinct Tibeto-Burman languages, which, from a phylogenetic point of view, are not always close to each other. These Kachin languages include Jinghpaw, Zaiwa, Lhaovo, Lacid, Rawang, and many other varieties (20.2). The Kachin, in spite of their internal linguistic diversity, constitute more or less a socio-cultural complex with shared traits. This is especially illustrated by the intra-Kachin marriage alliance system with fixed correspondences between clans that crosscut different languages. In this world of multiple languages, Jinghpaw, the speakers of which outnumber those of other linguistic groups, serves as a lingua franca, being a linguistic bond of the Kachin people. Its influence is especially prominent in the lexicon, as demonstrated by a number of Jinghpaw loanwords introduced into other Kachin languages that form a part of the areal lexicon of the Kachin cultural area (see 20.3). The Kachin area thus can be counted as a linguistic convergence zone in the northwestern part of mainland Southeast Asia (hereafter MSEA).
In addition to their shared lexicon, Kachin languages also exhibit a number of typological similarities in their phonology and morphosyntax, although few of them have been fully compared (Yabu 1988; Kurabe 2015; Müller 2018). These include lexical tones, creaky phonation, sesquisyllabic word structures, negative prefixes, verb-finalness, postpositive case marking, numeral classifiers, extensive use of multi-verb constructions, topic prominence, prevalence of clausal nominalization, rich inventory of ideophones, and other similarities. Many of these, however, are general typological features found in MSEA and/or Tibeto-Burman languages, and thus do not single out Kachin languages from neighboring languages. It is beyond the scope of this chapter to provide an exhaustive analysis of contact-induced convergence in the phonology and grammar of Kachin languages, which is not easily distinguishable from inheritance and broad areal traits. Rather, this chapter, as a preliminary approximation toward studies in Kachin contact linguistics, sets out to showcase typological profiles of the major languages of the Kachin in terms of lexico-semantics (20.3), phonology (20.4), morphology (20.5), word classes (20.6), and syntax (20.7). Conclusions are offered in 20.8 with a brief suggestion for possible convergent traits.1

https://doi.org/10.1515/9783110558142-020

20.2 Membership, contact situation, and scale of Jinghpaw influence

20.2.1 Membership

There are a number of languages and dialects spoken by the Kachin people. These languages, as noted above, are not always genetically close to each other. Three distinct branches of Tibeto-Burman (TB) are commonly recognized as being present: Bodo-Konyak-Jinghpaw or “Sal” (e.g., Jinghpaw), Lolo-Burmese (e.g., Zaiwa, Lhaovo, Lacid), and Rung (e.g., Rawang). The following (1) lists the major Kachin languages in terms of language names, ISO 639-3 codes, and affiliation within Tibeto-Burman. Dialect names used in Burma and China are shown in the first and second columns, respectively, with exonyms in the third column. (1)

Major languages of the Kachin.

Burma      China     Exonyms   ISO 639-3   Within TB      Subgroups
Jinghpaw   Jingpo    Kachin    kac         Sal            Jinghpaw-Luish
Zaiwa      Zaiwa     Atsi      atb         Lolo-Burmese   Burmish
Lhaovo     Langsu    Maru      mhx         Lolo-Burmese   Burmish
Lacid      Leqi      Lashi     lsi         Lolo-Burmese   Burmish
Rawang     -         Nung      raw         Rung           Nungish
Lisu       Lisu      Yawyin    lis         Lolo-Burmese   Loloish

Jinghpaw, distributed throughout the Kachin region, belongs to the Boro-Konyak-Jinghpaw branch, showing a special relationship with Boro-Garo, Northern Naga, and Luish (Asakian) languages, which are mainly spoken in northeastern India and adjacent areas. These languages are also known as the “Sal” languages based on certain shared distinctive etyma such as *sal ‘sun’ and *war ‘fire’ (Burling 1983; Huziwara 2012; Matisoff 2013). Jinghpaw, as noted earlier, serves as a lingua franca among the linguistically diverse Kachin people, and is also referred to as the “Kachin” language (e.g., Hanson 1906). The language is also known as “Jingpo” in China (Dai and Xu 1992) and “Singpho” in India (Morey 2010). There are a number of internal varieties such as Gauri, Hkahku, Duleng, Turung, and Numhpuk.

1 All Jinghpaw examples are drawn from my own fieldwork. The data of Zaiwa, Lhaovo, Lacid, Leqi, Ngochang, Rawang, and Proto-Tibeto-Burman (PTB), unless otherwise noted, are from Lustig (2010), Sawada (2004), Hkaw Luk (2017), Dai and Li (2007), Nasaw Sampu et al. (2005), LaPolla and Sangdong (2015), and Matisoff (2003), respectively. Examples quoted from the secondary source are indicated using the transcription in the original source. Abbreviations for languages and language families in this paper are as follows: Jg. (Jinghpaw); Lc. (Lacid); Lq. (Leqi); Lv. (Lhaovo); Ng. (Ngochang); PTB (Proto-Tibeto-Burman); Rw. (Rawang); WB (Written Burmese); Zw. (Zaiwa).

Zaiwa, Lhaovo, and Lacid, also known as “Atsi,” “Maru,” and “Lashi” in Jinghpaw exonyms, respectively, are also regarded as “core” members of the Kachin languages in Burma. These languages are especially concentrated in the Burma-China borderlands that form the southeastern areas of the Kachin region. Together with Burmese, they constitute the Burmish branch within Lolo-Burmese, which is a well-established group with shared innovations (Bradley 1979). Within Burmish, Zaiwa, Lhaovo, and Lacid belong to Northern Burmish in contrast to Burmese, which belongs to Southern Burmish. Northern Burmish languages are grouped together on the basis of sets of lexical innovations with a phonological innovation of Proto-Burmish preglottalized sonorants (Nishi 1999: 70): in contrast to voiceless sonorants in Burmese (e.g., WB nhut ‘mouth’), they are reflected as creaky vowels (e.g., Zw. nvut5, Lv. natF, Lq. nuat55) in Northern Burmish (Hideo Sawada, p.c., 2019).2

2 The creaky voice is marked by v in Zaiwa and by underlined vowels in Lhaovo and Leqi (see 20.4).

In China, the population of Zaiwa speakers is greater than that of Jingpo (Jinghpaw), but, together with speakers of Jingpo and other Burmish languages such as Langsu and Leqi, they constitute the “Jingpo nationality.” Thus, the Jingpo nationality in China roughly corresponds to the Kachin in Burma. There is a strong cross-border Kachin/Jingpo identity that includes Singpho in India. Other Burmish varieties such as Ngochang in Burma (Nasaw Sampu et al. 2005), Lhangsu in Burma (Sawada 2018), and Bola in China (Dai, Jiang and Kong 2007) are also included in this socio-cultural complex.

The speakers of Rawang and Lisu mainly inhabit northern areas of the Kachin region. Rawang speakers are particularly distributed along the Mali Hka and Nmai Hka River valleys. They were formerly known under the name “Nung” or “Hkanung.” Rawang consists of a number of diverse varieties that are not always mutually intelligible, such as Mvtwang, Dvru, Lungmi, and Tangsar. The variety presented in this chapter is Mvtwang, which is seen as the standard variety of Rawang in Burma. Rawang is closely related to languages such as Dulong and Anong, spoken by the Dulong and Nu nationalities in Yunnan Province of China, who do not participate in the Kachin socio-cultural sphere. These languages belong to the Nungish branch of the Rung, established based on shared morphological innovations (LaPolla 2003: 30–31). Lisu, which is in close contact relationship with Rawang among Kachin languages, belongs to the Loloish branch of Lolo-Burmese. The Lisu, whose population is large in China and who have a distinct nationality status there, are often regarded as a group distinct from the Kachin, especially outside Kachin State, Burma.3 As exemplified by this, it is not always obvious what the exact membership of the Kachin is, due in part to the fluidity of ethnic identities in northern Burma (Leach 1954).

20.2.2 Contact situation

In the Kachin region, especially in the southeastern part with a large Burmish Kachin population, Kachin villages, village clusters, and other communities inhabited by different linguistic groups are not uncommon (Leach 1954; Dai, Fu and Liu 1985; Dai 1993; Bradley 1996). Multilingualism is a long-standing phenomenon in these communities, and it is not hard to encounter a Kachin who speaks, say, Jinghpaw, Zaiwa, Lhaovo, and Lacid. In addition to these languages, many Kachins also speak Burmese in Burma, Chinese in China, Assamese in India, and Shan as a trade language with neighboring Tai peoples. Part of the multilingualism stems from marriage preferences built into the Kachin cultural system that promote and perpetuate multilingualism among the Kachin; as Bradley (1996) puts it:

In most mixed villages each Burmish group operates as a Kachin exogamous patrilineal clan, so if the father is from one of the Burmish-speaking groups, the mother must have a different first language from her husband. Grandparents could therefore represent up to four languages, but marriage preferences tend to lead to repeated marriages between the same clans. The clan identity is acquired from the father, but children also speak the mother’s language, especially if it forms a substantial group in the village. Thus, people of one of the Burmish backgrounds may be bilingual in their father’s and their mother’s language, and if neither of these is Jinghpaw they will early on become trilingual and use Jinghpaw as their medium of education, literacy and lingua franca within the group as a whole. (Bradley 1996: 750–751)

Dai (1993) also reports the socio-linguistic situation of language use within the Jingpo nationality in China. He shows that the language choice is ruled by factors such as social setting, generation, age, sex, and occupation:

Many Jingpo families include people from different subgroups. In such families, the language used by each member is stipulated by tradition. The children belong to their father’s subgroup, and they use the language of that subgroup. If a father and mother are from different subgroups, the father and children use one language and the mother uses another. Although husband and wife each master the other’s language, each uses his or her own. In other words, people speak in one language but are spoken to in another. The mother insists on using her own language, and the children may also use their mother’s language when speaking to her … If a family has a grandmother who speaks a third language, the younger generation uses her language when speaking to her. (Dai 1993: 4)

3 Due to this situation, in what follows we will exclude Lisu from the discussion unless otherwise necessary.




20.2.3 Scale of Jinghpaw influence

Jinghpaw, as a lingua franca, has had a significant impact on other Kachin languages. Its influence varies from language to language, as represented by the scale of Jinghpaw influence (2), which exhibits a core–periphery structure (see also Müller 2018). (2)

Scale of Jinghpaw influence (adapted from Kurabe 2018a).

Jinghpaw    Zaiwa    Lhaovo, Lacid    Rawang    Lisu

Zaiwa is considered the most affected by Jinghpaw, as reflected in its phonological similarity to Jinghpaw (20.4) as well as in its abundant Jinghpaw loans, including many kinship terms, even for ‘father’ and ‘mother’ (20.3). Lisu, on the other hand, is least influenced by Jinghpaw, as many of its speakers do not enter into the Kachin socio-cultural complex. This is reflected in the fact that Lisu has very few Jinghpaw loans, if any. Other languages fall somewhere in between. Lhaovo and Lacid are also considered to have been significantly influenced by Jinghpaw while Rawang is less so. The scale appears to roughly correlate with social, psychological, and geographical proximity. Lhangsu, a variety of Lhaovo surrounded by Jinghpaw and isolated from the rest of Lhaovo varieties, is more influenced by Jinghpaw than other varieties (Sawada 2018). More evidence should be collected to validate the scale.

20.3 Lexico-semantics

Jinghpaw influence on other Kachin languages is especially prominent in lexico-semantics. Many Jinghpaw loanwords, identified based on criteria such as phonological patterns, morphological complexity, and cognates in sister languages (Kurabe 2018a), are cultural items from such semantic fields as religion, clothing, and the house, such as ‘to bless’, ‘embroidery’, and ‘front side post of a house’, which are susceptible to borrowing in general. Nevertheless, more borrowing-resistant items with culture-free meanings such as body parts, spatial relations, and sense perception also appear in the loan lexicon. This includes items in the Leipzig-Jakarta list of core/basic vocabulary (Haspelmath and Tadmor eds. 2009), such as ‘salt’, ‘mouth’, and ‘to be sweet’. Zaiwa in particular has extensively adopted Jinghpaw loans, as demonstrated by over 280 Jinghpaw loans, including more than ten loan grammatical items (Kurabe 2018a).


(3)


Jinghpaw loans in other Kachin languages. Jinghpaw Zaiwa Lhaovo ‘to mistake’ ɕút syut5 šatH 31 ‘be different’ ɕáy syai šayH 11 ‘be correct’ jò zyo coF 1 31 ‘song’ məkhón me -kon ‘cat’ ləʔnyaw le1-ngvyau55 lă-ñauL

Ngochang Rawang shuot shut shaih sháy jò mvkún lvnyhau

The intra-Kachin marriage alliance system, as noted above, is a strong socio-cultural bond that ties the Kachin people together. Related to this is the isomorphism found in kinship systems of some Kachin subgroups. Zaiwa, which is most affected by Jinghpaw, is of importance in that it has adopted many Jinghpaw kinship terms, such as ‘elder brother’, ‘elder sister’, ‘grandfather’, ‘grandmother’, ‘grandchild’, ‘mother’s brother’, ‘father’s sister’, ‘sister’s husband’, ‘father’s sister’s husband’, ‘male’s brother’s wife’, ‘female’s brother’s wife’, and so on. This even includes ‘father’ and ‘mother’, which are cross-linguistically resistant to borrowing. This situation can be contrasted with that of Lhaovo, which is genetically closer to Zaiwa but is resistant to adopting individual kinship terms, which are in many cases comparable to those in Burmese. Zaiwa sometimes has both loan and inherited words simultaneously, as illustrated by ‘elder brother’ in (4). Zaiwa thus stands at an important crossroad between genetic and contact relationships. (4)

Roots of kinship terms (adapted from Kurabe 2018a).

                  Jinghpaw   Zaiwa     Lhaovo    WB
‘father’          wà         wa11      phoH      pha
‘mother’          nù         nu31      myiH      mi
‘grandfather’     ji         zvi55     phukH     phui3
‘grandmother’     woy        wvoi55    phyitH    bhe3 (n. 4)
‘grandchild’      ɕù         syu11     myitL     mre3
‘elder brother’   phù        pu11      -         -
‘elder brother’   -          mang11    moŋL      moṅ-krii3

Most of the Lhaovo kin terms, as can be seen, are not of Jinghpaw origin. Jinghpaw and Lhaovo, however, share a strikingly similar kinship term system, where items that are not always cognate are organized into a similar system, as Burling (1971) describes: [T]he [kinship] terms of Jinghpaw and Maru [Lhaovo] are … different, but the systems into which they are organized are very similar. Indeed, the systems are so much alike that each term of one language can generally be paired with a synonym from the other language and the equivalent pairs can be defined together. (Burling 1971: 27)

4 bhe3 denotes ‘great-grandfather’ in Burmese (Hideo Sawada, p.c., 2020).




Another related semantic field is that of personal names. Kachin languages, as illustrated in (5), usually have systems for naming people in terms of their gender and birth order.5 Zaiwa, as can be seen, adopted many personal names from Jinghpaw. Lhaovo and other languages, by contrast, usually maintain their own separate names. According to Jinghpaw speakers, in many cases Jinghpaw and Zaiwa are not distinguishable only by their given names. They also share sets of lineage names, as exemplified by the Jinghpaw lineage names Dashi, Jangma, Mahka, and Sumnut, which directly correspond to the Zaiwa lineage names Dawshi, Jangmaw, Mahkaw, and Sumlut (Leach 1954: 54). Note additionally in (5) that Rawang and Lisu spoken in northern areas share a set of birth-order names. This is due to the fact that Northern Lisu borrowed birth-order names from some varieties of Rawang (probably Anong), which indicates their close contact (Bradley 2007: 58–59). (5)

Birth-order names from 1st to 7th son.

       Jinghpaw   Zaiwa      Lhaovo    Lacid     Rawang   Lisu
1st    gam        gam35      khoŋF     boem      pong     a55pʰu33
2nd    no(ŋ)      nong35     lømH      jɨ́ŋ       dø       a55dɯ55
3rd    làʔ        laq1       tauL      kʰó       kwin     a55kʰi33
4th    tú         dvu31      tseF      tɕʰáŋ     søn      a21tsʰe33
5th    tang       dvang55    xoŋH      sau       nøn      a55tiʔ21
6th    yo(ŋ)      yong35     tsauŋH    láŋ       pi       a21jo35
7th    khá        ka31       kyuŋH     ting      yung     a55gɛ21

Jinghpaw sometimes performs the function of transferring lexical items of high prestige languages in the region such as Pali, Burmese, and Shan into other Kachin languages. Matisoff (2013) provides the following chain of borrowing across several language families: Indo-Aryan (IA), Tibeto-Burman (TB), and Tai-Kadai (Tai). The position of Rawang in the chain could be replaced by other non-Jinghpaw Kachin languages. The Pali word sati ‘recognition’, for example, is likely to have entered recipient languages through the borrowing chain, sometimes with semantic changes: Pali sati ‘recognition’ > WB sati ‘caution’ > Shan sha1ti5 ‘caution’ > Jinghpaw sədìʔ ‘caution, promise’ > Rawang svdiq ‘promise’ (Kurabe 2018a). (6)

Borrowing chain (Matisoff 2013: 24). Pali (IA) > Burmese (TB) > Shan (Tai) > Jinghpaw (TB) > Rawang (TB)

Parallel colexification patterns are also observed. Some Kachin languages, for example, colexify ‘north’ and ‘length’ on the one hand, and ‘south’ and ‘width’ on the other. Jinghpaw and Zaiwa share a fourfold colexification ‘ashes/fireplace/compartment of a house/branch of a clan’. Kachin languages also share many calques that exhibit parallel semantic structures (see 20.5.2).

5 Lhaovo and Lacid data are provided by Hideo Sawada and Hkaw Luk, respectively (p.c., 2020).


20.4 Phonology

20.4.1 Syllable and word structure

Kachin languages typically allow up to two prenuclear consonants and one postnuclear consonant (i.e., CCVC). Sonority must increase in the onset position, where the medial consonant is limited to semivowels or liquids. Medial -y- (-j-) is pervasive in Jinghpaw, Zaiwa, Lhaovo, and Lacid. Medials -r- and -w-, by contrast, occur only in Jinghpaw and Rawang, respectively, in the native phonology. The coda position, as typical in MSEA languages, allows one postnuclear consonant with a restricted set of consonants. Final consonants typically found are -p, -t, -k, -ʔ, -m, -n, and -ŋ, which are usually unreleased. Rawang further allows -r and -l, retaining PTB final liquids, which have merged with -n in Jinghpaw. Syllables exhibit binary branching into an onset (e.g., CC) and a rhyme (e.g., VC), like other neighboring languages (Enfield 2018: 56–57). The internal structure of a rhyme (V–C combination) is relatively free in Jinghpaw, Zaiwa, and Rawang. Lhaovo and Lacid, by contrast, exhibit restricted structures (over 30 gaps) like Burmese (Hideo Sawada, p.c., 2014). Examples illustrating some syllable structures in Kachin languages are given in (7). (7)

Examples of syllable structure.

                    Jinghpaw          Zaiwa    Lhaovo     Lacid    Rawang
CV     ‘to come’    sa                lo31     loF        lo       hé
CCV    ‘bee’        pru ‘to exit’     byo11    pyoL       bjo:     kwá
CVC    ‘to shoot’   gàp               bek1     pakF       bɨk      wvp
CCVC   ‘six’        krúʔ              kyuq5    khyaukH    kʰjuk    gwør ‘to toss’
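The template described in this section can be sketched as a small check, offered only as an illustration and not as part of the chapter's analysis. The segment sets are read off the prose above; the function name, the treatment of rising sonority as "a semivowel or liquid after any other consonant", and the omission of language-specific medial restrictions, tones, and phonation are all simplifying assumptions.

```python
# Hypothetical sketch: the segment sets come from the description above;
# all names are illustrative, and tones and phonation are ignored.
MEDIALS = {"y", "j", "w", "r", "l"}                  # semivowels and liquids
COMMON_FINALS = {"p", "t", "k", "ʔ", "m", "n", "ŋ"}  # typical codas
EXTRA_FINALS = {"Rawang": {"r", "l"}}                # Rawang keeps final liquids


def fits_template(onset, coda="", language=""):
    """Check an onset (list of consonants) and a coda against (C)(C)V(C)."""
    if len(onset) > 2:
        return False
    if len(onset) == 2:
        first, medial = onset
        # Rising sonority is approximated: a semivowel or liquid medial
        # must follow a consonant that is not itself a semivowel or liquid.
        if medial not in MEDIALS or first in MEDIALS:
            return False
    finals = COMMON_FINALS | EXTRA_FINALS.get(language, set())
    return coda == "" or coda in finals


print(fits_template(["k", "r"], "ʔ", "Jinghpaw"))  # Jg. krúʔ 'six'
print(fits_template(["g", "w"], "r", "Rawang"))    # Rw. gwør 'to toss'
print(fits_template(["g", "w"], "r", "Jinghpaw"))  # final -r is Rawang-only here
```

A fuller model would also need the rhyme gaps reported for Lhaovo and Lacid, which this sketch does not attempt.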

The majority of monomorphemic words are monosyllabic or disyllabic. A large number of disyllables, as given in (8), take the form of the iambic “sesquisyllabic” structure (Matisoff 1973), consisting of a heavy or “major” syllable preceded by a light or “minor” syllable with reduced phonemic possibilities. (8)

Sesquisyllabic words. Jinghpaw Zaiwa ‘moon’ ɕəta lva5-mo35 ʔəphyìʔ si1-gvuq5 ‘skin’ ‘leaf’ ʔəlàp a1-haq5 ‘middle’ ləpran ge1ro11

Lhaovo šŏɣitF ʔăfoʔH ʔăkhukF

Leqi Rawang lă55mo55 shvlá ʃŏ55kuk55 shvlvp mvlùng

Minor syllables usually exhibit no or a restricted inventory of tones because they are too short to accommodate full-fledged tonal distinctions. Vowels are also restricted in minor syllables in Jinghpaw and Rawang, where only a schwa is allowed.6 This is in contrast to Burmish Kachin languages that exhibit vowel distinctions in minor syllables. A rich array of syllabic nasals, which can be treated as a type of minor syllable, is found in Jinghpaw (e.g., ǹsàʔ ‘breath’) in contrast to the other Kachin languages, which exhibit no or marginal examples. One of the widespread morphophonological processes in the Kachin languages is “sesquisyllabization,” where the first syllable of a fully disyllabic word is reduced to a minor syllable due to the predominance of the iambic prosodic pattern (e.g., Jg. gìnsúp > gəsúp ‘to play’, Lv. noLkhyeʔH > nŏkhyeʔH ‘earlobe’).

6 The schwa is indicated by v in the Rawang orthography.

20.4.2 Tone

All Kachin languages, as with neighboring languages, are syllable-tone languages, where a tone is assigned to every syllable. In open and sonorant-final syllables, three tones are usually contrastive.7 Some languages have secondary derived tones, as illustrated by Jinghpaw high-falling tone [51], which is derived from an underlying low-falling tone [31] preceded by a high tone [55] by means of tone spreading. (9)

Basic tones in non-checked syllables.

Jinghpaw   H [55~35]   M [33]      F [31]
Zaiwa      H [55]      L [11]      F [31]
Lhaovo     H [44]      L [22~33]   F [21]
Leqi       H [55]      M [33]      F [53]
Rawang     H [55~53]   M [33]      F [31]
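The derived Jinghpaw tone mentioned above can be illustrated with a minimal sketch. It assumes, for illustration only, that tones are given as one string per syllable, that the rule looks at the underlying tone of the immediately preceding syllable, and that no other sandhi applies; the function name is hypothetical, and the actual domain of the rule is not stated in the text.

```python
def surface_tones(underlying):
    """Map underlying Jinghpaw tones to surface tones.

    Only the rule described in the text is modelled: low-falling [31]
    surfaces as high-falling [51] when the preceding syllable is high [55].
    """
    surface = []
    for i, tone in enumerate(underlying):
        if tone == "31" and i > 0 and underlying[i - 1] == "55":
            surface.append("51")  # high tone spreads onto the low-falling syllable
        else:
            surface.append(tone)
    return surface


print(surface_tones(["55", "31"]))  # spreading context
print(surface_tones(["33", "31"]))  # no preceding high tone
```

Checking the underlying rather than the surface tone of the preceding syllable is a design choice of this sketch; with only one rule the two options give the same result.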

Checked syllables usually have reduced sets of tonal distinctions because, like minor syllables, they are too short to accommodate full-fledged tonal distinctions in the case of Kachin languages. Only high- and low-stopped tones are distinguished in checked syllables in Jinghpaw, Zaiwa, and Lacid. Only high tone appears in Rawang. Lhaovo, which distinguishes three tones in checked syllables, is an exception, although its close relative Langsu distinguishes two (Hideo Sawada, p.c., 2019).

7 The Zaiwa tone [35] could be seen as in an allotonic relationship with [55] (Anton Lustig, p.c., 2020).

20.4.3 Segments and phonation

The vowel inventories of Kachin languages are modest by MSEA standards (Enfield 2018: 53–56). Five monophthongs /i, e, a, o, u/ are usually contrastive, plus additional vowels depending on the language. The following (10) gives inventories of non-long monophthongs in some languages. Phonemic contrast of vowel length is alien to Kachin languages, except Lacid and Rawang with non-basic long vowels. Jinghpaw and Zaiwa have exactly the same set of phonetic diphthongs (i.e., [ai, au, oi, ui]). Diphthongs in these two languages can be best analyzed as sequences of a vowel plus a consonant (semivowels) phonologically, given that they do not occur in closed syllables. (10)

Basic monophthongs. Jinghpaw i e a o Zaiwa i e a o Lhaovo i e a o Lacid i e a o Rawang i e a o

u u u u u

ə ue [ɘ] ø ɔ ɨ v [ə]  ø [ɯ]

As for consonant inventories, the places of articulation often used are such typologically common positions as bilabial, dental, alveolar, palatal, velar, and glottal. Retroflex, uvular, and pharyngeal consonants, by contrast, are uncommon. Labiodentals such as /f/ and /v/ are found in some Burmish languages such as Lhaovo and Lacid but not in Jinghpaw, Zaiwa, and Rawang in their native phonologies. Kachin languages usually have aspirated-unaspirated distinctions in stops (i.e., /p, t, k, ph, th, kh/) plus further distinctions in affricates in some Burmish languages (e.g., /ts, tsh, c, ch/). Alveolar and postalveolar fricatives (e.g., /s, ɕ/) are pervasive. Four nasals /m, n, ɲ, ŋ/ are usually contrastive, the third of which is sometimes analyzed as a consonant cluster (i.e., /ny/). Two semivowels /w/ and /y/ are often distinguished, where /w/ appears as /v/ in some languages. Implosives, not unusual in the MSEA context, and voiced/voiceless distinction for sonorants, found in Burmese, are alien to Kachin languages as well as Shan. The lateral /l/ is prevalent in Kachin languages. The rhotic /r/, by contrast, although quite common in Jinghpaw and Rawang, is unusual in the native phonology of Burmish Kachin languages, where PTB *r- is reflected as /w/ and /ɣ/ in Zaiwa and Lhaovo, respectively, and as /y/ in Lacid and Ngochang (Sawada 2018). (11)

Reflexes of PTB *r-. PTB ‘bone’ *rus ‘place’ *s-ra ‘cat/tiger’ *s/k-roŋ ‘to laugh’ *r(y)ay ‘water’ *rəy 

WB rui3 rā kroṅ ray re

Zw. wui11 wo31 wung31 wui31 wui31

Lv. Ng. ɣukL yau ɣoF yos gyungs F ɣi yis ɣitF

Jg. Rw. ǹrút sharø ɕərà shvrà ɕəro(ŋ)

Note additionally that the Zaiwa lexicon, as in (12), has come to contain the rhotic /r/ to some extent due to extensive lexical borrowing from Jinghpaw. Lustig’s (2010) dictionary of Zaiwa, for example, lists over 70 morphemes beginning with /r/. This situation can be contrasted with Nasaw Sampu et al.’s (2005) Ngochang dictionary, which contains only three morphemes beginning with /r/.



(12)

Typological profile of the Kachin languages 

Loan initial r- in Zaiwa.
                       Zaiwa     Jinghpaw
     ‘to be enough’    ram35     rám
     ‘to like’         raq1      ràʔ
     ‘to grow well’    reng11    ríŋ
     ‘to mock’         roi31     róy
     ‘exactly’         rvoq5     róʔ
     ‘to tend’         ruem35    rem
     ‘waterfall’       rum35     rum
     ‘to be hard’      rvuq5     rúʔ

Creaky voice phonation (which is sometimes indicated by an underline or by v in the Zaiwa orthography) is prevalent in Kachin languages save Rawang. Breathy voice, by contrast, is not a prominent phonation type. Creakiness in Kachin languages is independent of tone, unlike in Burmese, but is dependent on the preceding consonant. For example, creaky vowels are compatible with unaspirated stops and sonorants, but incompatible with aspirates and fricatives. Creaky vowels always occur after voiceless unaspirated stops, while plain (non-creaky) vowels occur after other types of stops. Because of this, creakiness can be attributed at the phonological level to either the vowel or the consonant. To illustrate this, consider the interaction between initial stops and phonation types in Jinghpaw in (13a)–(13c). Attributing the creakiness to the vowel (V-analysis below) requires two stop series (aspirated vs. unaspirated) and two vowel series (plain vs. creaky). Attributing it to the consonant (C-analysis), on the other hand, requires three stop series with one vowel type. Sonorants can similarly be analyzed in two different ways, as in (13d)–(13e); under the C-analysis, a preglottalized sonorant series is required (13e). Both treatments are common in previous studies. Yabu (1982), Dai and Xu (1992), Sawada (2004), Dai (2005), and Dai and Li (2007) adopt the V-analysis. By contrast, Burling (1967), Maran (1978), Lustig (2010), Kurabe (2016), Hkaw Luk (2017), and Sawada (2018) adopt the C-analysis (see Lustig [2010: 62–64] and Sawada [2018: 384] for further discussion).

(13) Initial consonant-phonation interaction in Jinghpaw.
          Phonetic   Meaning       V-analysis   C-analysis
     a.   [pa̰ŋ]      ‘class’       /pa̰ŋ/        /paŋ/
     b.   [b̥aŋ]      ‘to put’      /paŋ/        /baŋ/
     c.   [phaŋ]     ‘to begin’    /phaŋ/       /phaŋ/
     d.   [maŋ]      ‘corpse’      /maŋ/        /maŋ/
     e.   [ʔma̰ŋ]     ‘dark’        /ma̰ŋ/        /ʔmaŋ/
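The correspondence between the two analyses in (13) can be stated as a simple mapping from a V-analysis initial plus vowel phonation to a C-analysis initial. The following Python sketch is purely illustrative: the function name and the segment sets are hypothetical, and extending the pattern from /p/ and /m/ to the other unaspirated stops and nasals is an assumption, not something shown in (13).

```python
# Illustrative segment classes. Only /p/, /ph/, and /m/ are exemplified in (13);
# the remaining members are assumed to pattern alike.
UNASPIRATED_TO_VOICED = {"p": "b", "t": "d", "k": "g"}
ASPIRATED = {"ph", "th", "kh"}
SONORANTS = {"m", "n", "ŋ"}


def v_to_c_analysis(initial, creaky):
    """Return the C-analysis initial for a V-analysis (initial, phonation) pair."""
    if initial in ASPIRATED:
        if creaky:
            # Creaky voice is incompatible with aspirated stops.
            raise ValueError("creaky voice does not combine with aspirated stops")
        return initial  # (13c): /ph/ is the same in both analyses
    if initial in UNASPIRATED_TO_VOICED:
        # (13a): creaky vowel -> voiceless stop; (13b): plain vowel -> voiced stop
        return initial if creaky else UNASPIRATED_TO_VOICED[initial]
    if initial in SONORANTS:
        # (13e): creaky vowel -> preglottalized sonorant; (13d): unchanged
        return "ʔ" + initial if creaky else initial
    raise ValueError(f"initial not covered by this sketch: {initial!r}")


print(v_to_c_analysis("p", creaky=True))   # row (13a)
print(v_to_c_analysis("p", creaky=False))  # row (13b)
```

The mapping makes explicit that the two analyses carry the same information: the C-analysis moves the phonation contrast from the vowel onto a third stop series and a preglottalized sonorant series.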

Creaky voice, as noted above, is usually incompatible with aspirated stops and voiceless fricatives in Kachin languages. In Jinghpaw, for example, creaky voice is not compatible with initials /ph, th, kh, s, ɕ/, in contrast to other initials such as /p, t, k, ts, c/, etc. This distributional asymmetry can be accounted for in terms of the mutual exclusivity between creaky voice, which is produced by epilaryngeal constriction, and aspirates (including voiceless fricatives), which are produced by spreading glottis (see Kurabe [2018b] for the laryngeal specification of aspirates and fricatives in Jinghpaw).


20.5 Morphology

20.5.1 Affixation

Prefixation is pervasive in PTB derivational morphology and is reflected in various forms in modern languages (Wolfenden 1929; Benedict 1972; Matisoff 2003). The PTB negative prefix *ma- is well retained in Kachin languages, as illustrated by Lhaovo mă- and Rawang mv-. It has further developed into a syllabic nasal ń- in Jinghpaw and into the simple vowels a1- and ʔa- in Zaiwa and Lacid, respectively.

(14) Negative prefixes.
     Jinghpaw   lá      ‘to take’      >   ń-lá        ‘not to take’
     Zaiwa      zue31   ‘to be late’   >   a1-zue31    ‘not to be late’
     Lhaovo     tsoL    ‘to eat’       >   mă-tsoL     ‘not to eat’
     Lacid      se      ‘to know’      >   ʔa-se       ‘not to know’
     Rawang     yùl     ‘to be easy’   >   mv-yùl      ‘not to be easy’

Another common prefix widespread in modern Tibeto-Burman languages is the reflex of the PTB causative prefix *s-. It is directly reflected as a prefix in Jinghpaw ɕə- and Rawang shv-. The proto-prefix has left only indirect traces in some Tibeto-Burman languages. A well-known example comes from the simplex-causative verb pairs in Burmese, where the prefix *s- is reflected as aspiration except in verbs beginning with vowels and approximants (e.g., WB no3 ‘to be awake’ vs. hno3 ‘to awaken’, ip ‘to sleep’ vs. sip ‘to put to sleep’). The prefix has also left indirect traces in Burmish Kachin languages, where it is typically reflected as creakiness, and sometimes as aspiration (Yabu 1988: 85–86, 101–102, 118–119; Lustig 2010: 575–580; Sawada 2018: 284).

(15) Simplex-causative pairs.
     Jinghpaw   ʔyúp   ‘to sleep’       vs.   ɕə-ʔyúp   ‘to put to sleep’
     Zaiwa      mi11   ‘to be closed’   vs.   mvi11     ‘to close’
     Lhaovo     tsoL   ‘to eat’         vs.   tsoL      ‘to feed’
     Lacid      no     ‘to be black’    vs.   no̰        ‘to blacken’
     Rawang     aq     ‘to drink’       vs.   shv-aq    ‘to make someone drink’

The widespread PTB prefix *ʔa- has the functions of marking kinship terms, body parts, genitive constructions, the third person, and nominalization of verbs (Matisoff 2018). It pervades the modern Kachin languages, where it is reflected as ʔa-, ʔang, or preglottalization.

(16) Reflexes of PTB *ʔa-.
     Jinghpaw   ʔə́-ʔnû      ‘mother’                 ʔə-caŋ      ‘black one’
     Zaiwa      a5-nu11     ‘mother’                 a1-nye31    ‘red one’
     Lhaovo     ʔă-phoH     ‘father’                 ʔă-ɣiL      ‘big one’
     Leqi       a55-maŋ33   ‘elder brother’          a55-sək55   ‘new one’
     Rawang     àng-kàng    ‘his/her grandfather’    v-nǿn       ‘cooked one’




The expansive nominalizing prefix ma-, combined with reduplication of the verb stem, derives nouns with the meaning ‘everything’ in Jinghpaw, Zaiwa, Lhaovo, and Lacid. This productive prefix can be shown to be of Jinghpaw origin on the basis of its etymology (i.e., máʔ ‘to be exhausted’). It is thus a point of interest illustrating structural borrowing among some Kachin languages (Kurabe 2015: 76).8

(17) Expansive nominalizing prefix.
     Jinghpaw   ŋà      ‘to exist’   >   ŋà mə́-ŋâ          ‘everything that exists’
     Zaiwa      ngi31   ‘to exist’   >   ngi11 me5-ngi11   ‘everything that exists’
     Lhaovo     naF     ‘to exist’   >   naF mă-naF        ‘everything that exists’
     Lacid      ɲit     ‘to exist’   >   ɲit ma-ɲit        ‘everything that exists’

20.5.2 Compounding

Compounding, as in neighboring languages, is one of the major word formation processes in Kachin languages. Nouns and verbs are productively involved in compounding. Noun-noun compounds, as is typical of all languages of Sino-Tibetan (LaPolla 2003: 43), are usually right-headed (e.g., Jg. ɕəta-pan ‘sunflower, lit. moon-flower’, Lc. bjo:-jḭ:t ‘honey, lit. bee-water’). Kachin languages sometimes have shared calques, some of which may be outcomes of language contact. The word for ‘thunderbolt’, for example, is expressed by compounding ‘sky/thunder’ and ‘axe’ (e.g., Jg. múq-nìŋwa, Zw. mau-wazung, Lv. mug: vozaung:, Rw. muq-wurdi).9 Other examples of possible calques that are not shared by Burmese include:

(18) Shared calques.
     frog + cloth     >   ‘wet moss’       Jinghpaw, Zaiwa
     sun + leg        >   ‘sunbeam’        Jinghpaw, Zaiwa
     fire + tongue    >   ‘flame’          Jinghpaw, Zaiwa
     oil + fire       >   ‘candle’         Jinghpaw, Zaiwa
     pig + eye        >   ‘mountain oak’   Jinghpaw, Zaiwa, Ngochang
     word + measure   >   ‘example’        Jinghpaw, Zaiwa, Lhaovo
     navel + cut      >   ‘to be born’     Jinghpaw, Zaiwa, Lhaovo, Lacid
     fowl + pig       >   ‘livestock’      Jinghpaw, Zaiwa, Lhaovo, Lacid
     salt + sweet     >   ‘sugar’10        Jinghpaw, Zaiwa, Lhaovo, Lacid

8 Zaiwa, Lhaovo, and Lacid data are provided by Anton Lustig, Hideo Sawada, and Hkaw Luk, respectively (p.c., 2020).
9 The Zaiwa and Lhaovo data in their orthographic forms are taken from Wannemacher (2017) and my fieldwork, respectively.
10 This could be an influence of Shan, which has kɤ1 waan1 (lit. salt-sweet), which is markedly different from Thai and Burmese (Mathias Jenny, p.c., 2020).


Co-compounds whose constituents are in a relationship of coordination are also widely attested. Co-compounds consisting of nouns tend to express generic meanings that go beyond the composition of the meanings of their parts. For example, Kachin languages, like other MSEA languages (Enfield 2018: 97–98), lack a simple word for ‘parents’, expressing it by compounding ‘father’ and ‘mother’ (e.g., Rw. àngpè àngmè). Co-compounds in Jinghpaw include: phún-kəwá ‘plants, lit. tree-bamboo’, jùm-məjàp ‘seasoning, lit. salt-chili’, and ləbù-pəloŋ ‘clothes, lit. trousers-jacket’. In Jinghpaw, the order of the constituents in co-compounds is largely predictable, where the shorter member or the member including the higher vowel comes first (see Kurabe [2016] for more details).

20.5.3 Reduplication

Reduplication is often productive in Kachin languages.11 Reduplication may be manifested as full reduplication (e.g., Zw. dung11-dung11 ‘to ask repeatedly’) or as partial reduplication (e.g., Jg. məŋa-ŋa ‘five each’). Reduplication may be employed in order to mark indefiniteness, plurality, distributivity, habituality, intensity, and completion. Indefiniteness can be indicated by reduplication of nominals, where the numeral ‘one’ is often involved, as in (19a)–(19c). Reduplication sometimes indicates plurality in Jinghpaw, Zaiwa, and Lacid (19d)–(19e). Reduplication encodes distributivity when it takes place with numerals in Jinghpaw and with classifiers in Lacid (19f)–(19g). In Lacid, ordinal numbers can be derived by reduplication from cardinal numbers (19h). An unusual but productive form of marking in Rawang is the marking of translative (i.e., ‘by way of’) arguments by reduplicating the last syllable of the relevant place name (19i). This can be done with any place name, whether a proper name or a common noun (Randy J. LaPolla, p.c., 2020).

(19) Reduplication on nouns.
     a.   Jinghpaw   ləŋây-ŋày          one-red           ‘something’
     b.   Lhaovo     tă-yaukF-yaukF     one-clf-clf       ‘someone’
     c.   Rawang     tiq-kvt-kvt        one-time-time     ‘sometimes’
     d.   Jinghpaw   gəday-day          who-red           ‘who-PL’
     e.   Lacid      tɕʰi-tɕʰi          what-red          ‘what-PL’
     f.   Jinghpaw   məsum-sum          three-red         ‘three each’
     g.   Lacid      nu da du:-du:      cow one-clf-clf   ‘each cow’
     h.   Lacid      sóem dzain-dzain   three-year-red    ‘the third year’
     i.   Rawang     Rapboq-boq         Rapboq-red        ‘by way of Rapboq’

11 The Lhaovo data given in this section are provided by Hideo Sawada (p.c., 2019). The Rawang data in (19c) and (20h) and those in (19i) and (20d) are provided by Nathan Straub and Randy J. LaPolla, respectively (p.c., 2020).




Verb reduplication may indicate habituality (20a)–(20b), intensity (20c), and completion (20d). Reduplication is also used to derive adverbs (20e)–(20f) and sometimes nouns (20g)–(20h) from verbs.

(20) Reduplication on verbs.
     a.   Jinghpaw   gərum-rum        help-red          ‘to help repeatedly’
     b.   Lacid      koit-koit        do-red            ‘to do repeatedly’
     c.   Zaiwa      zvai55-zvai55    be.fine-red       ‘to be very fine’
     d.   Rawang     di-di            go-red            ‘to have gone’
     e.   Zaiwa      han31-han31      be.quick-red      ‘quickly’
     f.   Leqi       tan33-tan33      be.straight-red   ‘straightly’
     g.   Lhaovo     lamF-lamF-tsaL   be.warm-red-tsa   ‘warm one’
     h.   Rawang     goq-dv-goq       bent-caus-bent    ‘bent one’

20.6 Word classes

The noun-verb distinction, as is typical in MSEA languages (Enfield 2018: 86–87), is usually clear in Kachin languages. Nouns do not inflect for case, number, gender, or definiteness. Case marking is usually achieved by means of postpositive case markers. Demonstratives, numerals, and personal pronouns can be regarded as subclasses of the noun (see 20.6.1 to 20.6.4). Common grammatical meanings marked on verbs are negation, aspect, and modality. Tense is alien to Kachin languages except Rawang and its varieties (20.6.5 below). “Adjectives” or property concept words, as in other MSEA languages (Enfield 2018: 87–91), are usually not formally distinct from verbs. Copulas are a subtype of the verb in Kachin languages.

20.6.1 Demonstratives

Being highlanders, speakers of the Kachin languages usually distinguish spatial demonstratives in terms not only of relative distance but also of relative height from the deictic center. The height-based demonstratives, which prevail in major branches of Tibeto-Burman (Post 2019), usually exhibit a three-way split into high, level, and low, a distinction found only in distal demonstratives (Yabu 1988; Kurabe 2015; Müller 2018). Rawang (Mvtwang), whose demonstratives are not sensitive to verticality, is an exception, although its close relatives such as Anong do have the distinction (Randy J. LaPolla, p.c., 2017).

(21) Demonstratives of some Kachin languages.
                      Jinghpaw   Zaiwa    Lhaovo   Lacid   Rawang
     Proximal         nday       hi31     cheL     he:     a/ya
     Medial           day        hau31    ʔayL     hau:    we
     Distal (up)      thó        hu31     thoL     fu:     ku
     Distal (level)   wó         hye31    thøL     tʰɨ:    ku
     Distal (down)    lé         mvo31    moL      mo̰:     ku

20.6.2 Personal pronouns

The Kachin personal pronoun system is relatively simple in that it does not encode differences in age, gender, formality, politeness, register, monkhood, or relative social status of the speech act participants, as is often the case with highland languages of MSEA, in contrast to lowland languages with hierarchically stratified societies such as Burmese (Müller and Weymuth 2017). The Kachin system typically exhibits three-way splits in person (1st, 2nd, 3rd) and in number (singular, dual, plural), as illustrated in (22) by Jinghpaw and Lacid personal pronouns.

(22) Personal pronouns.
     Jinghpaw    sg      du          pl
     1st         ŋay     ʔán         ʔánthe
     2nd         naŋ     nán         nánthe
     3rd         ɕi      ɕán         ɕánthe

     Lacid       sg      du          pl
     1st (ex)    ŋo      ŋá-ta:ŋ     ŋá-mo
     1st (in)    –       ɲá̰-ta:ŋ     ɲá̰-mo
     2nd         naŋ     né-ta:ŋ     né-mo
     3rd         ɲa:ŋ    ɲa:-ta:ŋ    ɲa:-mo

All Kachin languages, as with many other Tibeto-Burman languages, share cognates for the first- and second-person singular pronouns but not for the third person, the form of which is unstable in Tibeto-Burman in general. Duals often involve the numeral ‘two’, as illustrated by the Jinghpaw duals that involve an obsolete numeral ni ‘two’, which only appears in compounds in the modern language (e.g., ni-niŋ ‘two years’). The inclusive/exclusive distinction, scattered throughout most of the Tibeto-Burman branches, prevails in Burmish Kachin languages in the first person dual and plural. Note also that Jinghpaw and Burmish languages have distinct possessive personal pronouns only for the singular (e.g., Jg. nyéʔ ‘my’, náʔ ‘your’, ɕíʔ ‘his/her’).

20.6.3 Interrogatives

Interrogatives in Kachin languages are given in (23). Interrogatives meaning ‘what’ are usually not further segmentable into smaller morphemes. Other interrogative pro-forms, by contrast, are often analyzable into a velar element followed by morphemes denoting categories such as place, time, manner, and classifiers.12 Compare, for example, Jinghpaw interrogatives in (23) with related morphemes such as day ‘that’, ɕərà ‘place’, ɕəlóy ‘then’, and nîŋ ‘thus’. Interrogatives meaning ‘why’ are often formed from ‘what’ and accompanying morphemes (e.g., Jg. pha məjò ‘lit. what-because’, Lc. tɕʰi mu ‘lit. what-happen’, and Rw. pà wá ‘lit. what-do’).

(23) Interrogatives.
                Jinghpaw   Zaiwa        Lhaovo      Lacid      Rawang
     ‘what’     pha        hai31        peH         tɕʰi       (ka)-pà
     ‘who’      gə-day     o55          khŏ-yaukF   kʰa:-juk   ka-gǿ
     ‘where’    gə-rà      ka55-me55    khŏ-meŋF    kʰa-mo:    ka-yv́ng
     ‘when’     gə-lóy     ke5-nvam55   khŏ-neŋH    kʰa-na̰m    ka-dvgvp
     ‘how’      gə-nìŋ     ke5-se55     khŏ-ruL     kʰa-sú     ka-dø

Interrogatives, as in many of the world’s languages (Haspelmath 1997), can also be employed to express indefiniteness. The relationship between interrogative and indefinite meanings in Jinghpaw is summarized in (24).

(24) The interrogative-indefinite relationship in Jinghpaw.
     Categories   Forms     Interrogatives   Indefinite   Negative indefinite
     thing        pha       what             anything     nothing
     person       gə-day    who              anybody      nobody
     place        gə-rà     where            anywhere     nowhere
     time         gə-lóy    when             anytime      never
     manner       gə-nìŋ    how              anyhow       no way

The interrogative ‘what’ can further be used as a nominalizer in Jinghpaw and Rawang, as illustrated by the word for ‘food’ in Jinghpaw (i.e., ɕá-pha ‘lit. eat-what’) and Rawang (i.e., v́m-pà ‘lit. eat-what’).

20.6.4 Numerals and classifiers

All Kachin languages, as is typical of other Tibeto-Burman languages, have a decimal-based numeral system. Many numerals up to one hundred are inherited from PTB, as illustrated by ‘three’ to ‘six’ in (25). Notable exceptions in some languages are the numerals for ‘one’ and ‘two’, which are susceptible to diachronic change in general. The old numeral ni ‘two’ in Jinghpaw, for example, has been replaced by ləkhôŋ ‘two’, although it survives marginally, as noted above, fossilized in dual pronouns and some fixed compounds.

12 The Zaiwa interrogative o55 ‘who’ is an exception. Wannemacher (2010: 97–110) gives hká-yuq or hká-ó ‘who’ in Zaiwa spoken in Burma.

(25) Numerals.
                PTB         Jinghpaw   Zaiwa    Lhaovo    Lacid   Rawang
     ‘two’      *g/s-ni-s   ləkhôŋ     i55      šitH      ʔɨk     ní
     ‘three’    *g-sum      məsum      sum11    samF      sóem    shø̀m
     ‘four’     *b-ləy      məli       mi11     pyitF     mji:t   bì
     ‘five’     *l/b-ŋa     məŋa       ngo11    ŋoH       m̩       pvngwà
     ‘six’      *d-k-ruk    krúʔ       kyuq5    khyaukH   kʰjuk   chuq

Numerals over ten are usually expressed by the operations of addition and multiplication, as in Rawang (tiq)sé ní ‘twelve, lit. (one)ten-two’ and ní-sé ‘twenty, lit. two-ten’. One complication with numbers over ten lies with the unanalyzable Jinghpaw numeral khun ‘twenty’, which is comparable with WB akun ‘all’, with an original meaning like “such a large number that one has to use all the fingers and toes to count up to it” (Matisoff 2003: 278–279). Round numbers over one hundred such as one thousand, ten thousand, and hundred thousand show similar forms in the Kachin languages. They are originally loanwords from Shan, Burmese, or Chinese, but some of them have undergone common semantic changes. Chinese wàn ‘ten thousand’ and yì ‘hundred million’, for example, have been adopted by many Kachin languages with the meaning of ‘one million’ and ‘ten million’, respectively.

Kachin languages often use numeral classifiers to enumerate things in specific quantities. Many classifiers are of nominal origin, as illustrated by echo classifiers in Leqi jɔm33 ta53-jɔm33 ‘one house’, where ta53 is the numeral ‘one’, and jɔm33 ‘house’ appears twice, once as head noun and once as classifier. Many languages of the Kachin, like neighboring languages, are classifier-rich languages with a fine-grained system of obligatory sortal numeral classifiers with semantic oppositions involving animacy, shape, size, structure, etc. Classifiers in some Kachin languages are illustrated in (26) (Lustig 2010: 366–386; Sawada 2011: 267; Hkaw Luk 2017: 48–49; Dai and Li 2007: 96–116; LaPolla and Sangdong 2015: 95).

(26) Classifiers in Kachin languages.
     Languages   Forms     Used for                Examples
     Zaiwa       kat5      long objects            needle, thread, road
     Lhaovo      khyeʔH    flat objects            paper, leaf, longyi
     Lacid       kʰjap     thin and wide objects   plate, blanket, shirt, trousers
     Leqi        tʃham55   round objects           egg, fruit, grain of rice
     Rawang      dv̀m       stick-like objects      pen, pencil, ruler, river

By contrast, Jinghpaw has a very small set of optional classifiers, and nouns usually do not require classifiers to enumerate them (e.g., mà məsum ‘three children, lit. child-three’), as in some neighboring Tibeto-Burman languages such as Chin and Tibetan to its west and north.
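The additive and multiplicative composition of numerals over ten described in this section can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, the lexicon contains just the three Rawang morphemes cited above, and the template of at most one multiplier before ‘ten’ and one addend after it is a simplifying assumption.

```python
# Rawang morphemes cited in the text, with their numeric values.
ONE, TWO, TEN = "tiq", "ní", "sé"
VALUES = {ONE: 1, TWO: 2, TEN: 10}


def numeral_value(morphemes):
    """Evaluate [multiplier] TEN [addend] as multiplier * 10 + addend.

    The one-multiplier, one-addend template is an assumption of this sketch.
    """
    ten_index = morphemes.index(TEN)
    before = morphemes[:ten_index]
    after = morphemes[ten_index + 1:]
    multiplier = VALUES[before[0]] if before else 1  # 'one' before 'ten' is optional
    addend = VALUES[after[0]] if after else 0
    return multiplier * 10 + addend


print(numeral_value([ONE, TEN, TWO]))  # (tiq)sé ní 'twelve'
print(numeral_value([TWO, TEN]))       # ní-sé 'twenty'
```

A numeral placed before ‘ten’ multiplies it and a numeral placed after it is added, which is why the same two morphemes yield ‘twelve’ in one order and ‘twenty’ in the other.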




20.6.5 Verbs

Common grammatical meanings marked on verbs are negation, aspect, and modality. Negation is achieved by adding negative prefixes to verbs (see 20.5.1 above). Many Kachin languages are aspect- and mood-prominent languages. Jinghpaw has a binary distinction between the change-of-state and non-change-of-state aspect. The former is a marked aspect marking a recent change of state, whether the onset or endpoint. For example, the verb thùʔ ‘to rain’ in the change-of-state aspect can express either ‘to begin raining’ or ‘to finish raining’. Lhaovo has a distinction between the realis and irrealis mood (Sawada 2013: 6–11). The former portrays situations that are or were real (e.g., tsoL-TA13 ‘ate’ or ‘have a habit of eating’). The latter, by contrast, portrays situations that are or were not within the realm of reality (e.g., tsoL-neŋH ‘would eat’). Grammatical tense, as in many MSEA languages, is alien to Kachin languages save Rawang, which has a distinction between past and non-past (e.g., di-e ‘go’ and dì bǿ-ì ‘went’).

Verbs may also take directional and valency-changing marking. Jinghpaw has grammaticalized directional suffixes exhibiting a binary distinction between venitive (i.e., -r) and andative (i.e., -s). By means of these suffixes, the orientation of the deictically neutral motion verb sa ‘to go, come’ is specified. Kachin languages of the Burmish group, like some Loloish languages, have a set of deictic motion verbs distinguished in terms of deictic orientation, relative height, and home position, as illustrated by Zaiwa lye35 ‘to come down/away’, lo31 ‘to come up/back’, ye31 ‘to go down/away’, and lo35 ‘to go up/back’.
As Lustig (2010: 497) puts it, there is “a logical link between moving down and moving away on one side, and between moving up and moving back on the other, since, at least in former days, villages are at the top of the hill and the fields are below.”14 These motion verbs are also used as aspectual and benefactive markings in various ways (see Lustig [2010: 495–539] for more details).

Aside from the causative prefix (see 20.5.1), Rawang has a range of affixes for increasing or decreasing the valency of verbs (LaPolla 2000). This is exemplified by the intransitivizing prefix v- (e.g., ngaq ‘to push over’ > v-ngaq ‘to fall over’), the reflexive/middle marker -shì (e.g., kup ‘to cover’ > kup-shì ‘to cover oneself’), the applicative benefactive suffix -ā (e.g., shvlá ‘to be good’ > shvlá-ā ‘to be good for someone’), and the non-productive transitivizing suffix -t (e.g., ngø̄ ‘to cry’ > ngø̄t ‘to cry over someone’).

Person marking on verbs, alien to Burmish languages, is found in Jinghpaw and Rawang, manifested as personal indices suffixed to the verb. Consider (27) for intransitive paradigms of Jinghpaw gədùn ‘to be short’ and Rawang tø̀ ‘to be short’, where -ay is a declarative mood marker and -ē is a non-past tense marker, respectively. Both languages show hierarchical person marking where the indexation system is person-based with or without an inverse marker (see LaPolla [2010] and Kurabe [2017] for more details). It should be noted that the complex verbal endings in Jinghpaw are now on the verge of being lost in modern dialects, presumably due to intensive language contact with Burmish Kachin languages.

13 See 20.7.1 for the realis mood marker -TA in Lhaovo.
14 A similar association of “up” and “home” is also reported for Macro-Tani and some Tibetan languages (Post 2019: 243).

(27)
            Jinghpaw            Rawang
     1sg    gədùn-ŋ̀ŋ-ay         tø̀-ng-ē          ‘I am short.’
     1du    gədùn-gàʔ-ʔay       tø̀-shì-ē         ‘We (du) are short.’
     1pl    gədùn-gàʔ-ay        tø̀-ì-ē           ‘We (pl) are short.’
     2sg    gədùn-ǹd-ay         è-tø̀-ē           ‘You (sg) are short.’
     2du    gədùn-m-y-ìtd-ay    è-tø̀-shì-ē       ‘You (du) are short.’
     2pl    gədùn-m-y-ìtd-ay    è-tø̀-nø̀ng-ē      ‘You (pl) are short.’
     3sg    gədùn-Ø-ʔay         tø̀-ē             ‘S/he is short.’
     3du    gədùn-m-àʔ-ʔay      tø̀-ē             ‘They (du) are short.’
     3pl    gədùn-m-àʔ-ʔay      tø̀-ē             ‘They (pl) are short.’
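The Rawang intransitive paradigm in (27) can be restated as a small slot-filling procedure. The Python sketch below is illustrative only: the function and variable names are hypothetical, and the segmentation into a second-person prefix, a person/number suffix, and the non-past marker is simply read off the hyphenated forms in (27) rather than taken from a published analysis.

```python
STEM = "tø̀"                    # 'to be short', as in (27)
NON_PAST = "ē"                 # non-past tense marker
SECOND_PERSON_PREFIX = "è"     # appears on all second-person forms in (27)

# Person/number suffixes as segmented in (27); third person is unmarked.
NUMBER_SUFFIX = {
    ("1", "sg"): "ng",
    ("1", "du"): "shì",
    ("1", "pl"): "ì",
    ("2", "du"): "shì",
    ("2", "pl"): "nø̀ng",
}


def rawang_intransitive(stem, person, number):
    """Build a non-past intransitive form from (assumed) independent slots."""
    parts = []
    if person == "2":
        parts.append(SECOND_PERSON_PREFIX)
    parts.append(stem)
    suffix = NUMBER_SUFFIX.get((person, number))
    if suffix:
        parts.append(suffix)
    parts.append(NON_PAST)
    return "-".join(parts)


for person in ("1", "2", "3"):
    for number in ("sg", "du", "pl"):
        print(person + number, rawang_intransitive(STEM, person, number))
```

Run over all nine person-number combinations, the sketch makes the neutralization of number in the third person directly visible, since the three third-person forms come out identical.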

Copulas and property concept words (roughly corresponding to English adjectives), as noted above, can best be treated as subclasses of the verb in all Kachin languages. They are like other verbs in terms of negation, tense-aspect-mood marking, nominalization, etc. The Lhaovo copula ŋatF, for example, can take the negative prefix mă- and realis mood marker -TA just like other types of verbs (Sawada 2013: 26–27). The same holds for the property concept word caŋ ‘to be black’ in Jinghpaw, which can take a range of verbal marking like person marking, aspect-mood marking, interrogative marking, negation, and nominalization. Ambitransitives (labile verbs), which are not strictly specified as transitives or intransitives, are not alien to Kachin languages, as in other MSEA languages (Enfield 2018: 17), as illustrated by Jinghpaw ɕəŋày ‘to bear; to be born’, Lhaovo ñamF ‘to lower; to be low’, and Rawang gvyaq ‘to break; to be broken’.

20.7 Syntax

20.7.1 Tone in grammar

MSEA languages do not commonly employ tone in morphosyntactic processes, but some languages of the Kachin area do. Lhaovo, for example, marks grammatical categories such as realis mood, attribution, and verb coordination by means of grammatically conditioned tone alternations (represented as an abstract morpheme -TA), manifested as F > L, L > H, and H > H (Sawada 2013). The verb tsoL ‘to eat’ is thus realized as tsoH in realis mood. Example (28) illustrates additional data on the tonal morpheme in Lhaovo (Sawada 2013: 3). A similar process is also observable in Zaiwa (Lustig 2010). Also, in Rawang, non-high tones often change to high tone when they appear in non-final position (before another verb), before certain suffixes, and in some contexts where the verb is used as a nominal (Randy J. LaPolla, p.c., 2020).

(28) pyŏL-ʔamF-khyoF   loʔF-ñukH     chømH-TA-ɣuH-TA-kaH.
     bee-cluster-all   hand-finger   point-seq-show-real-hs
     ‘(The man) pointed at the honeycomb with his finger, it’s said.’
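The Lhaovo alternation F > L, L > H, H > H triggered by -TA amounts to a lookup on the tone of the verb. The following Python sketch is a minimal illustration under the assumption that a form is written with its tone letter as the final character, as in the transcriptions used here; the function name is hypothetical.

```python
# Tone alternation triggered by the abstract morpheme -TA (Sawada 2013).
TA_ALTERNATION = {"F": "L", "L": "H", "H": "H"}


def apply_ta(form):
    """Replace the final tone letter of a form with its -TA alternant."""
    stem, tone = form[:-1], form[-1]
    return stem + TA_ALTERNATION[tone]


print(apply_ta("tsoL"))  # the realis form of 'to eat' cited in the text: tsoH
```

Because H maps to itself, the alternation is neutralizing: L-tone and H-tone verbs both surface with H under -TA.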

20.7.2 Constituent order

All languages of the Kachin, as with the vast majority of the Tibeto-Burman languages, are strictly verb-final at the clausal level with the possibility of post-verbal elements as afterthoughts. The order of core arguments in transitive clauses is usually determined by pragmatic factors, where more topical NPs tend to occur earlier. NPs are freely omitted under appropriate semantic-pragmatic recoverability conditions.

(29) Jinghpaw.
     day    məjò      ɕəro=gò     ləʔnyaw=phéʔ   grày   n-ju-ʔay.
     that   because   tiger=top   cat=acc        very   neg-like-decl
     ‘That’s why tigers hate cats so much.’

(30) Rawang (adapted from LaPolla and Poa 2001: 24).
     nà-í      nø̄    kà-shǿn    è-shá-ò-ē …
     2sg-agt   top   word-say   n1-know-3.tr.n.pst-n.pst
     ‘You know how to talk …’

In possessive constructions, the possessor precedes the possessee in all Kachin languages. Possession may be expressed by means of a genitive case added to the possessor (31a), special possessive pronouns (31b), or simple juxtaposition (31c). Mixed strategies often coexist in a single language.

(31) Possessive constructions.
     a.   Jinghpaw
          ʔán=ná    gùy
          1du=gen   dog
          ‘our dog’
     b.   Zaiwa
          nga35     syang55
          1sg.gen   companion
          ‘my companion’
     c.   Rawang
          ngàmaq   chø̀m
          1pl      house
          ‘our house’

The numeral plus classifier follows the noun in all Kachin languages, as is often the case with languages spoken in the western part of MSEA (Jones 1970). The classifier usually follows the numeral save Jinghpaw, which like some Chin languages shows the reverse order. Like numeral-classifier phrases, lexical quantifiers such as ‘all’ and ‘some’ follow the noun.

(32) Numeral-classifier constructions.
     a.   Jinghpaw
          məɕà     məray   məsum
          person   clf     three
          ‘three persons’
     b.   Lacid
          la̰:ŋmju   da    du:
          snake     one   clf
          ‘one snake’
     c.   Rawang
          vsv̀ng    vbì    gǿ
          person   four   clf
          ‘four persons’

Adjectives, which are a subtype of verbs, directly follow the nominal head in Jinghpaw. In Burmish languages, adjectival concept words can occur juxtaposed in postnominal position, nominalized by the nominalizing prefix (20.5.1 above), where the prefix sometimes drops (Sawada 2011: 279; Hkaw Luk 2017: 37). In Rawang, in some cases a prefix and/or a tone change serves to nominalize a verb to allow it to follow a noun (see 20.7.1). In all languages, adjectival concept words, like other verbs, can occur in prenominal position nominalized by a clausal nominalizer.

(33) Adjectival concept words.
     a.   Jinghpaw
          məka         gəja
          embroidery   good
          ‘good embroidery’
     b.   Lhaovo
          thauŋF   ʔă-ɣiL
          bag      nml-be.big
          ‘big bag’
     c.   Rawang
          kung    shvlá
          plate   be.good.nml
          ‘good plate’

Demonstratives can occur in both pre- and post-head positions in Jinghpaw (e.g., day mà ~ mà day ‘that child’). This flexible ordering of the demonstratives, as pointed out by Yabu (1988) and Sawada (2011), is also shared by Zaiwa, Lhaovo, and Lacid, sometimes with tonal change (e.g., Lv. ʔayL ʔăšiL ~ ʔăšiL ʔayF ‘that fruit’). Demonstratives always precede the head in Rawang (e.g., ku chø̀m ‘that house’).

20.7.3 Case marking

The relationship that a noun bears to its head can explicitly be marked by means of postpositive case markers, whether they are analyzed as suffixes, clitics, or postpositions. Some of these case markers have their diachronic sources in nouns or verbs (e.g., Jg. thàʔ ‘loc’ < ləthàʔ ‘upper’, Lv. khyoF ‘all’ < khyoF ‘road’, Zw. dong31 ‘prolative’ < dong31 ‘to lead, connect’). The alignment pattern is nominative-accusative, with the patient overtly marked by the accusative in Jinghpaw and Burmish languages. Rawang, by contrast, shows an ergative-absolutive system with the agent overtly marked by the agentive marker.15 Other case markers typically found in Kachin languages include genitives, comitatives, instrumentals, and spatial cases including locatives, ablatives, and allatives.

15 Here, I use these terms in the non-strict sense because in many Kachin languages the marking is based primarily on semantic and pragmatic factors.




(34) Jinghpaw.
     wàʔ-ǹdù    báy     ɕəro=phéʔ   gəwá-dàt-yàŋ=gò …
     pig-boar   again   tiger=acc   bite-away-when=top
     ‘When the wild boar bit the tiger again …’

(35) Rawang (adapted from LaPolla and Poa 2001: 66).
     vkàng-í       cv̀mré   gǿ    rok-ng-vt-ò-nī-ng …
     grandpa-agt   child   clf   watch(1sg)-1sg-dir(1sg)-3.tr.n.pst-will-1sg
     ‘Grandfather (I) will watch the child …’

In Kachin languages, case marking is often used to fulfill semantic and pragmatic functions rather than to express grammatical relations. Core argument marking is often motivated by the need to disambiguate between two potential agents, as in many other Tibeto-Burman languages (e.g., LaPolla 1992). For example, in Jinghpaw, the patient is more likely to be case marked, leaving the agent unmarked, when the patient equals or is higher than the agent in animacy, as in (34), since the animacy of the patient is prototypically lower than that of the agent. The patient is usually unmarked when the agent outranks the patient in animacy. The patient in (34), for example, is often not case marked if it is an inanimate noun such as nàmsì ‘fruit’ or màysàw ‘paper’.
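The animacy-based tendency just described for Jinghpaw can be made concrete with a short Python sketch. It is a deliberate simplification: the three-level animacy scale and the function name are assumptions made for illustration, and the rule is treated as categorical here, whereas the text describes a likelihood conditioned by semantic and pragmatic factors.

```python
# Assumed three-level animacy scale; the text does not specify the levels.
ANIMACY_RANK = {"inanimate": 0, "animate": 1, "human": 2}


def patient_likely_marked(agent, patient):
    """True when the patient equals or outranks the agent in animacy."""
    return ANIMACY_RANK[patient] >= ANIMACY_RANK[agent]


print(patient_likely_marked("animate", "animate"))    # boar bites tiger, as in (34)
print(patient_likely_marked("animate", "inanimate"))  # boar bites fruit
```

On this view, accusative marking appears where animacy alone would not tell the hearer which participant is the agent.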

20.7.4 Multi-verb constructions

Multi-verb constructions such as serial verbs are fairly productive in many Kachin languages, as in other parts of MSEA. The constructions typically describe a sequential action with component verbs ordered iconically in accordance with the temporal order of the subevents they denote (36). The construction can also describe a simultaneously occurring event, the subevents of which are related by serialization (37). Kachin languages sometimes allow lengthy verb strings involving more than three component verbs (38).

(36) Jinghpaw.
     ɕəro=ni    ŋay=phéʔ   sa     gəwá   sàt-káw-na-ràʔ-ʔay.
     tiger=pl   1sg=acc    come   bite   kill-away-irr-infer-decl
     ‘The tigers will come and bite me to death.’

(37) Lhaovo (adapted from Sawada 2017: 198).
     lømHkhoŋF   šoL    vinF-TA-suL-TA-raH.
     Lumkhong    meat   carry-seq-walk-real-ra
     ‘Lumkhong walked carrying meat.’

(38) Lacid (Hkaw Luk 2017: 39).
     ŋo    ɹi    gjit    da    kʰu   jḛ:   ɕoi:n   ju     lɔ     bji:t   ɰè.
     1sg   acc   water   one   cup   go    pour    take   come   give    sfp
     ‘Give me a cup of water. Lit: Go and pour a cup of water and take it, then come and give it to me.’

Observe that multiple verbs, as is typical in verb-final languages in the area, are contiguous, sequenced at the end of a clause with no syntactic elements interposed between the component verbs. Thus, all arguments of the component verbs are required to precede the entire verb sequence. Multiple verbs are usually blocked from containing duplicate roles (e.g., two agents, two patients, two instruments, etc.), which suggests that they are monoclausal constructions (Durie 1997).

20.7.5 Grammaticalization of verbs

Grammaticalization of lexical verbs into auxiliaries is common in the Kachin languages, and many of these auxiliaries appear to have developed through the multi-verb construction. The Jinghpaw lexical verb jòʔ ‘to give’, for example, has developed into a permissive causative and benefactive marker, as in other languages of the area (Jenny 2015). Possibility markers in Kachin languages, as in other MSEA languages (Enfield 2003), often have their lexical sources in the verb ‘to get’. The lexical verb ‘to say’, as in Thai and Khmer (Matisoff 1991), has developed into the quotative complementizer in Kachin languages. Table (39) lists some common verb grammaticalizations in four Kachin languages.

(39) Grammaticalization of verbs.
                            Jg.   Zw.   Lv.   Rw.
     send > causative       yes   yes   yes   yes
     give > benefactive     yes   yes   yes   no
     meet > passive         yes   yes   yes   yes
     live > continuous      yes   yes   yes   no
     lose > completive      yes   yes   yes   no
     look > experiential    yes   yes   yes   no
     look > conative        yes   yes   yes   no
     throw > V away         yes   yes   yes   no
     know > habitual        yes   yes   yes   yes
     know > ability         yes   yes   yes   yes
     get > possibility      yes   yes   yes   yes
     die > intensive        yes   yes   no    no
     say > quotative        yes   yes   yes   yes
     say > hearsay          yes   yes   yes   yes

The degree of grammaticalization varies from verb to verb. For example, Jinghpaw lù ‘to get; possibility’ and ŋà ‘to live; continuous’, although both convey grammatical meanings, are distinguished in terms of their negatability. The former can be directly prefixed with the negative prefix while the latter cannot. This fact suggests that the former retains verbal properties but the latter does not (Kurabe 2016). Another interesting fact, as summarized in (40), is that a set of deverbal auxiliaries in Jinghpaw can be either preposed or postposed to the main verb with little semantic effect (e.g., lù gəlo ~ gəlo lù ‘can make’). This flexibility is shared to some extent by some Burmish Kachin languages but not by Burmese (Kurabe 2015; Müller 2018).

(40) Flexible position of some deverbal auxiliaries.
                                  Jinghpaw    Zaiwa       Lhaovo      Burmese
     know > ability/habitual      pre, post   pre, post   pre, post   –
     defeat > possibility         pre, post   pre, post   –           post
     get > possibility            pre, post   pre         pre         post
     be exhausted > completive    pre, post   pre         post        post
     be good > permissive         pre, post   pre         post¹⁶      post
     be willing > intention       pre, post   pre         –           post
     be brave > dare to           pre, post   pre         pre         post
     want > desiderative          post        post        pre, post   post

20.7.6 Cognate noun-verb constructions

Kachin languages exhibit rich examples of cognate noun-verb constructions (Lustig 2010: 203–209; Dai and Li 2007: 74–75; Sawada 2017: 166). The noun-verb pairs usually co-occur in natural usage to the extent that in some cases one part cannot occur in the absence of its counterpart. For example, the verbal part of the Jinghpaw expression rì rì (lit. thread-spin) ‘to spin thread’ is hardly used on its own without the nominal part. More examples include:

(41) Cognate noun-verb constructions.
     Jinghpaw   mədim dim                ‘to dam a dam’
     Zaiwa      bui11-syum11 syum11      ‘to sweep with the broom’
     Lhaovo     yapF-moʔF moʔF           ‘to dream a dream’
     Leqi       lă⁵³ phɔʔ⁵⁵ phɔ:ʔ⁵⁵      ‘to tie a knot’
     Rawang     pvlu pvlu                ‘to lay out a mat’

16 The grammatical meaning in Lhaovo is ‘possibility by the situation’ (Hideo Sawada, p.c., 2020).

20.8 Conclusions

This chapter has given a typological picture of the major Kachin languages in terms of lexico-semantics (20.3), phonology (20.4), morphology (20.5), word class (20.6), and syntax (20.7). The linguistic convergence of Kachin languages is most prominent in their lexico-semantics, as discussed in 20.3. It is illustrated by a number of Jinghpaw loans that form part of the areal lexicon of the Kachin cultural area, and by the isomorphism in semantic fields related to kinship and the intra-Kachin marriage alliance system, which is a strong socio-cultural bond that ties the Kachin people together. Kachin languages also exhibit phonological and grammatical homogenization on a grand scale. This includes lexical tones, creaky phonation, restricted inventories of final consonants, onset-rhyme syllable-internal structures, sesquisyllables, negative prefixes, postpositive case markers, numeral classifiers, lack of grammatical gender, copulas and adjectives as verbs, verb-finalness, GEN-N and N-Num orders, extensive use of multi-verb constructions, zero anaphora, topic prominence, pivotlessness, a rich inventory of sentence-final particles, and clausal nominalization. Many of these features, however, are general typological features of MSEA and/or Tibeto-Burman languages, and thus do not single out Kachin languages from neighboring languages.

I wish to close this chapter by suggesting some possible candidates for contact-induced language change in some Kachin languages. Because Burmese is not a Kachin language, linguistic traits shared by Jinghpaw and Burmish Kachin languages but not by Burmese are possible candidates for contact-induced change. These properties include: shared kinship system (20.3); birth-order name system (20.3); no lexical distinction for younger siblings; no medial -w- (20.4.1); final -p, -t, -k, -m, -n, -ŋ (20.4.1); tonal contrasts in checked syllables (20.4.2); creaky voice independent of tone (20.4.3); no voiceless sonorants (20.4.3); Vi ma-Vi construction (20.5.1); shared calques (20.5.2); height-based demonstratives (20.6.1); dual pronouns (20.6.2); no politeness distinction in personal pronouns (20.6.2); interrogatives involving a velar element (20.6.3); shared semantic changes in round numbers (20.6.4); flexible order of demonstratives (20.7.2); and a set of prehead deverbal auxiliaries (20.7.5).
Zaiwa is a key language in Kachin contact linguistics in that, as noted in 20.2.3, it stands at an important crossroads between genetic and contact relationships. In terms of genetic relationship, it is closer to Lhaovo, Lacid, and Burmese than to Jinghpaw. On the other hand, it has a closer contact relationship with Jinghpaw than other Burmish languages do. Linguistic traits shared by Jinghpaw and Zaiwa but not by other Burmish languages are good candidates for effects of the special Jinghpaw-Zaiwa contact relationship. These include: more than 280 Jinghpaw loans, including more than ten kinship terms (20.3); more than ten shared grammatical items (20.3); shared lineage and given names (20.3); free VC distribution (20.4.1); exactly the same set of phonetic diphthongs (20.4.3); no diphthongs in closed syllables (20.4.3); more than 70 morphemes with /r/ (20.4.3); shared calques (20.5.2); and shared grammaticalization (20.7.5). More work is required to demonstrate the core–periphery organization found in the scale of receptivity in Kachin contact linguistics.

It should finally be noted that contact influence among Kachin languages is not always unidirectional from Jinghpaw to other Kachin languages. The simplification of person marking on verbs in Jinghpaw, as noted in 20.6.5, is presumably attributable to language contact between Jinghpaw and Burmish Kachin languages that do not have person marking systems. This is not unlikely, given that complex morphology is often easily acquired by infants but challenging for adult second language learners (Lupyan and Dale 2010). Put more specifically, “[t]he use of a somewhat pidginised and grammatically simplified Kachin Jinghpaw throughout northern Burma as a lingua franca between various Kachin communities is a long-standing phenomenon, and the existence of this pidgin clouds the original picture of the Jinghpaw languages, as many of the dialects have been influenced by the morphologically simplified lingua franca” (van Driem 2001: 394).

Acknowledgements: I would like to express my gratitude to Mathias Jenny, Shiro Yabu, Randy J. LaPolla, Hideo Sawada, Anton Lustig, and Nathan Straub for their constructive comments and suggestions, without which this work would not have been possible. Any errors are solely my own. This work was supported in part by JSPS and NUS under the Japan-Singapore Research Cooperative Program “Ethnolinguistic contact across the Indo-Myanmar-Southwestern China mountains: Migration routes, intercultural interactions, and linguistic outcomes.”

References

Benedict, Paul K. 1972. Sino-Tibetan: A conspectus. New York: Cambridge University Press.
Bradley, David. 1979. Proto-Loloish. London: Curzon Press.
Bradley, David. 1996. Kachin. In Stephen A. Wurm, Peter Mühlhäusler & Darrell T. Tryon (eds.), Atlas of languages of intercultural communication in the Pacific, Asia, and the Americas, vol. 2(1), 749–751. Berlin: Mouton de Gruyter.
Bradley, David. 2007. Birth-order terms in Lisu: Inheritance and contact. Anthropological Linguistics 49(1). 54–69.
Burling, Robbins. 1967. Proto-Lolo-Burmese. Bloomington, IN: Indiana University.
Burling, Robbins. 1971. The historical place of Jinghpaw in Tibeto-Burman. In Frederic K. Lehman (ed.), Occasional papers of the Wolfenden Society on Tibeto-Burman Linguistics, vol. 2, 1–54. Urbana: Department of Linguistics of the University of Illinois.
Burling, Robbins. 1983. The Sal languages. Linguistics of the Tibeto-Burman Area 7(2). 1–32.
Dai, Qingxia. 1993. On the languages of the Jingpo nationality. Linguistics of the Tibeto-Burman Area 16(1). 1–11.
Dai, Qingxia. 2005. Langsuyu yanjiu [A study of the Langsu language]. Beijing: Nationality Press.
Dai, Qingxia, Fu Ailan & Liu Juhuang. 1985. Jingpozu Bolahua gaikuang [A brief description of the Bola vernacular spoken by Jingpo nationality]. Minzu Yuwen 6. 56–71.
Dai, Qingxia, Jiang Ying & Kong Zhi’en. 2007. Bolayu yanjiu [A study of the Bola language]. Beijing: Nationality Press.
Dai, Qingxia & Li Jie. 2007. Leqiyu yanjiu [A study of the Leqi language]. Beijing: Central University for Nationalities Press.
Dai, Qingxia & Xu Xijian. 1992. Jingpoyu yufa [A Jingpo grammar]. Beijing: Minzu University of China.
Driem, George van. 2001. Languages of the Himalayas: An ethnolinguistic handbook of the greater Himalayan region. Leiden: Brill.
Durie, Mark. 1997. Grammatical structures in verb serialisation. In Alex Alsina, Joan Bresnan & Peter Sells (eds.), Complex predicates, 289–354. Stanford: CSLI.
Enfield, N. J. 2003. Linguistic epidemiology: Semantics and grammar of language contact in mainland Southeast Asia. London: Routledge Curzon.
Enfield, N. J. 2018. Mainland Southeast Asian languages. Cambridge: Cambridge University Press.
Enfield, N. J. & Bernard Comrie. 2015. Mainland Southeast Asian languages: State of the art and new directions. In N. J. Enfield & Bernard Comrie (eds.), Languages of mainland Southeast Asia: The state of the art, 1–26. Boston & Berlin: Mouton de Gruyter.
Hanson, Ola. 1906. A dictionary of the Kachin language. Rangoon: American Baptist Mission Press.
Haspelmath, Martin. 1997. Indefinite pronouns. Oxford: Oxford University Press.
Haspelmath, Martin & Uri Tadmor (eds.). 2009. Loanwords in the world’s languages: A comparative handbook. Berlin: Mouton de Gruyter.
Hkaw Luk. 2017. A grammatical sketch of Lacid. Chiang Mai: Payap University MA thesis.
Huziwara, Keisuke. 2012. Lui-sogo no saiko ni mukete [Toward a reconstruction of Proto-Luish]. Kyoto University Linguistic Research 31. 25–131.
Jenny, Mathias. 2015. The far west of Southeast Asia: “Give” and “get” in the languages of Myanmar. In N. J. Enfield & Bernard Comrie (eds.), Languages of mainland Southeast Asia: The state of the art, 156–208. Boston & Berlin: Mouton de Gruyter.
Jones, Robert B. 1970. Classifier constructions in Southeast Asia. Journal of the American Oriental Society 90(1). 1–12.
Kurabe, Keita. 2015. Jinghpaw and related languages. In Kenneth Van Bik (ed.), Continuum of the richness of languages and dialects in Myanmar, 71–101, 143–168. Yangon: Chin Human Rights Organization.
Kurabe, Keita. 2016. A grammar of Jinghpaw, from northern Burma. Kyoto: Kyoto University PhD dissertation.
Kurabe, Keita. 2017. Jinghpaw. In Graham Thurgood & Randy J. LaPolla (eds.), The Sino-Tibetan languages, 2nd edn., 993–1010. London & New York: Routledge.
Kurabe, Keita. 2018a. A classified lexicon of Jinghpaw loanwords in Kachin languages. Asian and African Languages and Linguistics 12. 99–131.
Kurabe, Keita. 2018b. Deaspiration and the laryngeal specification of fricatives in Jinghpaw. Gengo Kenkyu 153. 41–55.
LaPolla, Randy J. 1992. Anti-ergative marking in Tibeto-Burman. Linguistics of the Tibeto-Burman Area 15(1). 1–9.
LaPolla, Randy J. 2000. Valency-changing derivations in Dulong/Rawang. In R. M. W. Dixon & Alexandra Y. Aikhenvald (eds.), Changing valency: Case studies in transitivity, 282–311. Cambridge: Cambridge University Press.
LaPolla, Randy J. 2003. An overview of Sino-Tibetan morphosyntax. In Graham Thurgood & Randy J. LaPolla (eds.), The Sino-Tibetan languages, 22–42. London & New York: Routledge.
LaPolla, Randy J. 2010. Hierarchical person marking in the Rawang language. In Dai Zhaoming (ed.), Forty years of Sino-Tibetan language studies: Proceedings of ICSTLL-40, 107–113. Harbin: Heilongjiang University Press.
LaPolla, Randy J. & David Sangdong. 2015. Rawang-English-Burmese dictionary. Privately published for limited circulation.
LaPolla, Randy J. & Dory Poa. 2001. Rawang texts, with grammatical analysis and English translation. Berlin: LINCOM EUROPA.
Leach, Edmund R. 1954. Political systems of highland Burma: A study of Kachin social structure. London: G. Bell and Sons.
Lupyan, Gary & Rick Dale. 2010. Language structure is partly determined by social structure. PLoS ONE 5(1). DOI: https://doi.org/10.1371/journal.pone.0008559.
Lustig, Anton. 2010. A grammar and dictionary of Zaiwa. Leiden: Brill.
Maran, La Raw. 1978. A dictionary of Modern Spoken Jingpho. Unpublished manuscript.
Matisoff, James A. 1973. Tonogenesis in Southeast Asia. In Larry M. Hyman (ed.), Consonant types and tone, 71–95. Los Angeles: UCLA.
Matisoff, James A. 1991. Areal and universal dimensions of grammatization in Lahu. In Elizabeth C. Traugott & Bernd Heine (eds.), Approaches to grammaticalization, vol. 2, 383–453. Amsterdam: Benjamins.
Matisoff, James A. 2003. Handbook of Proto-Tibeto-Burman: System and philosophy of Sino-Tibetan reconstruction. Berkeley, Los Angeles & London: University of California Press.
Matisoff, James A. 2013. Re-examining the genetic position of Jingpho: Putting flesh on the bones of the Jingpho/Luish relationship. Linguistics of the Tibeto-Burman Area 36(2). 15–95.
Matisoff, James A. 2018. Rethinking the Proto-Tibeto-Burman *a- prefix: Glottal and nasal complications. Journal of Asian and African Studies 96. 29–69.
Morey, Stephen. 2010. Turung: A variety of Singpho language spoken in Assam. Canberra: Pacific Linguistics.
Müller, André. 2018. The Kachin as participants of an ethno-linguistic area? In Pittayawat Pittayaporn, Sujinat Jitwiriyanont, Pavadee Saisuwan & Bhimbasistha Tejarajanya (eds.), Papers from the Chulalongkorn International Student Symposium on Southeast Asian Linguistics 2017, 124–135. Hawai’i: University of Hawai’i Press.
Müller, André & Rachel Weymuth. 2017. How society shapes language: Personal pronouns in the Greater Burma Zone. Asiatische Studien/Études Asiatiques 71(1). 409–432.
Nasaw Sampu, Wilai Jaseng, Thocha Jana & Douglas Inglis. 2005. A preliminary Ngochang-Kachin-English lexicon. Chiang Mai: Payap University.
Nishi, Yoshio. 1999. Four papers on Burmese: Toward the history of Burmese. Tokyo: ILCAA, Tokyo University of Foreign Studies.
Post, Mark W. 2019. Topographical deixis in Trans-Himalayan (Sino-Tibetan) languages. Transactions of the Philological Society 117(2). 234–255.
Sawada, Hideo. 2004. A tentative etymological word-list of Lhaovo (Maru) language. In Setsu Fujishiro (ed.), Approaches to Eurasian linguistic areas, 61–122. Kobe: Kobe City College of Nursing.
Sawada, Hideo. 2011. Ronwo-go no meishiku no sosei [Composition of noun phrases of Lhaovo]. Kopasu ni motozuku gengogaku kyoiku kenkyu hokoku 7. 259–283.
Sawada, Hideo. 2013. Ronwo-go no bunkozo no gaikan [Overview of sentence structure of Lhaovo]. In Hideo Sawada (ed.), Chibetto-Biruma kei gengo no bunpo gensho 2 [Grammatical phenomena of Tibeto-Burman languages 2], 1–40. Tokyo: ILCAA, Tokyo University of Foreign Studies.
Sawada, Hideo. 2017. Ronwo-go no fukudoshi kozo [Multi-verb constructions in Lhaovo]. In Tonan ajia shogo kenkyukai (eds.), Tonan ajia tairikubu shogengo no doshirenzoku [Serial verbs in mainland Southeast Asian languages], 162–207. Tokyo: The KEIO Institute of Cultural & Linguistic Studies.
Sawada, Hideo. 2018. The phonology of Lhangsu, an undescribed Northern-Burmish language. In Tooru Hayashi, Tomoyuki Kubo, Setsu Fujishiro, Noriko Ohsaki, Yasuhiro Kishida & Mutsumi Sugahara (eds.), Diversity and dynamics of Eurasian languages: The 20th commemorative volume, 381–404. Kobe: Kobe City College of Nursing.
Wannemacher, Mark. 2010. The basic structure of the Zaiwa noun phrase. Linguistics of the Tibeto-Burman Area 33(2). 85–135.
Wannemacher, Mark. 2017. Zaiwa to English dictionary. Unpublished manuscript.
Wolfenden, Stuart N. 1929. Outlines of Tibeto-Burman linguistic morphology. London: Royal Asiatic Society.
Yabu, Shiro. 1982. Atsi-go kisogoishu [A classified dictionary of the Atsi or Zaiwa language (Sadon dialect) with Atsi, Japanese and English indexes]. Tokyo: ILCAA, Tokyo University of Foreign Studies.
Yabu, Shiro. 1988. A preliminary report on the study of the Maru, Lashi and Atsi languages of Burma. In Yoshiaki Ishizawa (ed.), Historical and cultural studies in Burma, 65–132. Tokyo: Institute of Asian Studies, Sophia University.

Pittayawat Pittayaporn

21 Typological profile of Kra-Dai languages

21.1 Introduction

The Kra-Dai language family (KD) is central to our study of structural convergence in Mainland Southeast Asia. According to Ethnologue (Eberhard et al. 2019), 91 KD languages are spoken natively by approximately 81.67 million¹ people occupying a vast geographical area stretching from China’s Hunan Province in the north to Malaysia in the south and from Hainan Island in the east to the Indian state of Assam in the west. Although it is a rather homogenous group, the family as a whole, or some of its branches, has been genealogically linked to various language families with typological profiles very distinct from its own. Despite attempts to classify it as a branch of Sino-Tibetan due to its structural and lexical similarities to Chinese (F.-K. Li 1976; Nishida 1975; Wulff 1934; Xing 1999), most recent research points to a deeper and most likely genealogical connection with Austronesian (Benedict 1942, 1975; Ostapirat 2005; Sagart 2004). KD is thus crucial in understanding how areal contact may give rise to isolating and predominantly monosyllabic languages.²

In Western scholarship, the language family is known by a number of names. “Tai-Kadai” is the most common label in literature from the late 20th century and the beginning of the 21st century, e.g. Diller et al. (2008); Sagart (2005). While “Tai” in this binomial name refers to the most populous and best studied branch of the family, “Kadai” is a term coined by Benedict (1942), originally to refer to languages now known as Hlai, Gelao, Lachi, and Qabiao. Some specialists, e.g. Edmondson and Solnit (1988, 1997), also employ the term “Kadai” for the entire family, although this usage is less common. The current trend is to adopt the label “Kra-Dai” proposed by Ostapirat (2000).
While “Dai” is the reconstructed pronunciation of the common ethnonyms for Tai speakers, “Kra” is the reconstructed form for ‘human’ in the Kra branch, which includes languages like Gelao, Lachi, and Qabiao on the Chinese mainland but excludes the Hlai dialects in Hainan. This term is gaining currency not only in linguistics, e.g. Jenks and Pittayaporn (2014), but also in related fields, e.g. Srithawong et al. (2015).

1 Thai and Lao are the two most populous languages with 20.2 and 3.7 million L1 speakers, respectively.
2 I would like to thank Nong Changsheng for procuring materials from China, without which this typological survey would have been impossible. In addition, I would like to express my deepest gratitude to Doug Couper and the SEALang Projects. This article is one of the fruits of their ceaseless effort to make resources on Southeast Asian languages and linguistics freely available online. Last but not least, I would like to thank Khet Lengwiriyakul for meticulously rechecking the data and proofreading the manuscripts.
https://doi.org/10.1515/9783110558142-021

Confusion may arise when dealing with sources from China. Chinese scholars, e.g. F.-K. Li (2011) and Liang and J.-R. Zhang (1996), consistently refer to all languages in the KD family as “Dòng-Tái (侗台).” Unfortunately, “Dòng-Tái” is often translated as “Kam-Tai” in English, identical to the label used by scholars outside of China to refer to the subgroup comprising Kam-Sui and Tai, e.g. Edmondson and Solnit (1988) and F.-K. Li (1965), which Chinese scholars refer to as “Zhuàng-Dòng (壮侗).” A recommended practice is thus to translate “Dòng-Tái” as “Kra-Dai,” reserving “Kam-Tai” specifically for the subgroup.

The current typological profile of the language family is certainly an outcome of prolonged contact with various languages whose traces are evidenced in the lexicons of modern KD languages. The most important sources of lexicon are clearly Sinitic varieties, which started to influence KD in the first millennium, if not before (Manomaivibool 1976; Pittayaporn 2014). The reconstructed lexicons of Proto-Kam-Sui (Thurgood 1988), Proto-Hlai (Norquest 2015), and even Proto-Kra (Ostapirat 2000) contain a sizeable number of Chinese etyma relating to agriculture, natural elements and phenomena, belief and society, human and animal body, number and trade, and technology. Among the branches, Kra seems to be the least Sinicized, as its languages still preserve Austronesian-related vocabulary lost elsewhere, including lower numerals (Sagart 2004). Moreover, one of its members, namely Buyang, still preserves sesquisyllabic word structure and some derivational morphology (Jacques 2017), characteristics that bridge KD to Austronesian. Even now, Chinese remains a major lexical source for KD languages. Zhuang (Burusphat and Qin 2012; Wang 1966), Mulam (Zheng 1988), Maonan (Lu 2008: 111–117), and Biao (Liang and J.-R. Zhang 2001: 44–52), among many others, have been shown to have borrowed from different Sinitic varieties, such as Yue and Southwestern Mandarin. Outside of China, Thai also shows an expansive set of cultural vocabulary from diasporic Chinese dialects, such as Chaozhou (Gyarunsut 1983).
In addition to Chinese, another Sino-Tibetan donor language is Burmese, from which Shan and closely related varieties have adopted a significant part of their vocabularies (Khanthaphad 2019). Additionally, Yi has contributed a non-negligible number of loanwords to Kra languages (J.-F. Li 2010). Other important sources are Austroasiatic languages, especially Khmer and Mon. Their lexical influences are most clearly seen in Tai languages of Thailand and Laos, such as Thai (Ferlus 1985; Huffmann 1986; Jenny 2012; Suthiwan and Tadmor 2009; Varasarin 1984), Southern Thai (Karavi 1996), and Lao (Ferlus 1985). In addition, Vietnamese is a significant donor for Tai languages currently and historically spoken in its north and central regions, such as Saek (Kosaka 1997) and Tay-Nung (Toan 1992). Last but not least, the two classical Indic languages of Buddhism and Hinduism, Pali and Sanskrit, also supply Thai and other languages in Mainland Southeast Asia with religious and literary vocabulary, both directly and through Khmer (Gedney 1965).

Almost all KD languages are tonal, isolating, head-initial SVO languages resembling Chinese (Ansaldo and Matthews 2017) but differing from their Austronesian relatives, which are predominantly non-tonal and agglutinative with either VOS or SVO word orders (Himmelmann 2005). This typological profile of KD is based on data from over 60 varieties from all branches. Intriguingly, languages in Vietnam are underrepresented due to the lack of substantial and systematic descriptions. The sources are mainly reference grammars, grammatical sketches, and dialect surveys, but research on specific typological features was also consulted.

21.2 Sounds and sound system

Phonologically speaking, KD languages are exceptionally homogenous. They are all tonal languages with a relatively large vowel inventory and simple syllable structure. With very few exceptions, they are predominantly monosyllabic and have an average-sized consonant inventory.

21.2.1 Consonants

Although consonant inventories in some KD languages are larger than in others, they all make very similar phonemic distinctions, differing only with respect to contrastive places of articulation and laryngeal properties. With respect to the former, all languages of the Tai branch, and most other languages, show a five-way contrast distinguishing labial, alveolar, (alveolo-)palatal, velar, and glottal. The Zhuang dialect of Lóngmíng (Hudak 1991a: xxiii–xxiv) in Table 1 exemplifies a prototypical consonant inventory.

Tab. 1: Consonant inventory of Lóngmíng Zhuang (Hudak 1991a: xxiii–xxiv).

                    Labial   Alveolar   Palatal   Velar   Glottal
Stops/affricates    p pʰ     t tʰ       c cʰ      k kʰ    ʔ
Fricatives          f        s          ɕ                 h
Nasals              m        n                    ŋ
Liquids                      l
Glides              w                   j

With respect to laryngeal properties, KD languages typically show finer distinctions for stops than for fricatives and sonorants. Stops may contrast in terms of voicing, aspiration, breathiness, implosivization, or prenasalization. The majority of languages show a two-way or three-way contrast, with few making use of more properties. The Zhuang dialect of Lóngmíng (Hudak 1991a: xxiii–xxiv) in Table 1 is a good example of a variety with two series of stops, e.g. /kaː⁵⁵/ ‘crow’ vs. /kʰaː⁵⁵/ ‘leg’. A three-way contrast among stops is also quite common in KD. In most varieties, the distinction is based on voicing and aspiration, with voiced stops phonetically realized as plain voiced or implosivized, e.g. /paː¹¹ˀ/ ‘aunt’ vs. /pʰaː¹¹ˀ/ ‘ghost’ vs. /baː¹¹ˀ/ ‘crazy’ in Lue (Hudak 1996: xxii–xxiii). Only a handful of languages have prenasalized stops in place of voiced ones, e.g. /tɯ³³/ ‘steep’ vs. /tʰɯ³³/ ‘tube’ vs. /ntɯ³³/ ‘bow and arrow’ in the Gelao dialect of Píngbà (J.-M. Zhang 1993: 19–20, 66).

Uncommon among present-day KD languages are those with more than three series of stops, although they were more numerous before an epidemic devoicing of voiced obstruents affected most members of the family at various points in their histories (Haudricourt 1961). The Tay dialect of Trùng Khánh (Hoang 1997; Pittayaporn and Kirby 2017) and a few other conservative Tai varieties along the Sino-Vietnamese border exhibit a four-way contrast, distinguishing aspirated, unaspirated, plain voiced, and breathy voiced stops, e.g. /tʰɔːŋ⁵³/ ‘waterfall’ vs. /tɔːŋ⁵³/ ‘big leaf for wrapping’ vs. /dɔːŋ⁵³/ ‘related by marriage’ vs. /d̤ɔːŋ²¹/ ‘copper, brass’ (Haudricourt 1960; L-Thongkum 1997; Ross 1996). As shown in Table 2, a different type of four-way contrast is found in Sui (J.-R. Zhang 1980: 3–8), which has distinctive sets of unaspirated, aspirated, prenasalized, and preglottalized/implosive voiced stops, e.g. /paː³³/ ‘aunt’ vs. /pʰaː²⁴/ ‘gray’ vs. /mbaː³³/ ‘to draw close’ vs. /ɓaː³³/ ‘butterfly’. Rarest and most impressive of all is Maonan (Lu 2008: 75–81) with its five contrastive sets of stops, including unaspirated, aspirated, plain voiced, preglottalized/implosive voiced, and prenasalized voiced, e.g. /paːw⁴²/ ‘package’ vs. /pʰaːw⁴²/ ‘run’, /bɔk²³/ ‘to bend over’ vs. /ɓɔk²³/ ‘basin’, /baːŋ²¹³/ ‘afterbirth’ vs. /mbaːŋ⁴²/ ‘ghost’.

For fricatives, most KD languages only have one voiceless series, but some also display a voicing contrast. Intriguingly, all languages that thoroughly distinguish voiced and voiceless stops also make the same distinction for fricatives. One such language is Trùng Khánh Tay, from which we find /saː⁵³/ ‘to look for’ vs.
/zaː⁵³/ ‘to hide’ (Hoang 1997: 223–224). More interesting are the sonorants, as they typically do not show any laryngeal distinction. However, a small number of conservative dialects display contrastive voicelessness, preglottalization, or both. The most spectacular system is found in Sui, which systematically and thoroughly distinguishes voiced, voiceless, and preglottalized sonorants, e.g. /maː³¹/ ‘tongue’ vs. /m̥aː²⁴/ ‘dog’ vs. /ˀmaː²⁴/ ‘vegetable’ (J.-R. Zhang 1980: 3–8), as illustrated in Table 2.

Tab. 2: Consonant inventory of Sui (modified from J.-R. Zhang 1980: 3–4).

              Labial      Alveolar    Palatal     Velar       Uvular   Glottal
Stops         p pʰ ɓ ᵐb   t tʰ ɗ ⁿd               k kʰ        q qʰ     ʔ
Affricates                ʦ ʦʰ        ʨ ʨʰ
Fricatives    f v         s z         ɕ                       ʁ        h
Nasals        m̥ ˀm m      n̥ ˀn n      ɲ̊ ˀɲ ɲ      ŋ̊ ˀŋ ŋ
Liquids                   l
Glides        ˀw                      ˀj j        ˀɣ ɣ

KD languages are remarkably similar with respect to the distribution of phonemes within the syllable, allowing only very restricted subsets of consonants to occur as codas, specifically stops, nasals, and glides. Only Saek (Gedney 1970, 1993; Hudak 1993: xxv–xxvii), Laha (Ostapirat 1995), and possibly the Hlai dialect of Báishā (Ostapirat 2008)³ also allow final /-l/. Most languages allow just eight or nine consonants in syllable-final position. For example, only eight of the 41 Sui consonants in Table 2 can occur at the end of the syllable: /-p/, /-t/, /-k/, /-m/, /-n/, /-ŋ/, /-w/ and /-j/. This distributional restriction points to neutralization of laryngeal contrasts and continuancy. An interesting case is Hlai, represented by the Bǎodìng dialect (Ouyang and Y.-Q. Zheng 1983: 13–15), which also has the palatal codas /-c/ and /-ɲ/, unattested elsewhere. A much smaller number of languages only allow one or two phonemes syllable-finally, further neutralizing nasality. For instance, out of the 31 consonants in Píngbà Gelao (J.-M. Zhang 1993: 19–24), only /n/ and /ŋ/ may occur in the coda. The most extreme is the Dai dialect of Lǜchūn (Zhou and M.-Z. Luo 2001: 51–53), which only permits the velar nasal in syllable-final position among its 22 consonants. Alternatively, [ŋ] may be analyzed phonologically as vowel nasalization instead of a final consonant, i.e. [taŋ³³] is phonemically represented as /tã³³/.

3 Norquest (2015) suspects that the alleged final /-l/ is in fact an error due to misperception of final /-ɯ/.
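The coda restrictions reported above for Sui, Píngbà Gelao, and Lǜchūn Dai can be written down as a simple lookup. The sketch below is illustrative only: the coda sets are taken from the description in this section, while the dictionary layout, the ASCII language keys, and the function name are assumptions made for the example.

```python
# Coda inventories as reported in the text. The data layout, the ASCII keys,
# and the function name are illustrative assumptions, not an established format.
CODA_INVENTORY = {
    "sui": {"p", "t", "k", "m", "n", "ŋ", "w", "j"},
    "pingba_gelao": {"n", "ŋ"},
    "luchun_dai": {"ŋ"},
}


def is_licit_coda(language: str, coda: str) -> bool:
    """Return True if `coda` may close a syllable in `language`.

    An empty string stands for an open syllable, which is always allowed.
    """
    return coda == "" or coda in CODA_INVENTORY[language]


# Show which of a few candidate consonants each variety accepts as a coda.
for language in CODA_INVENTORY:
    accepted = [c for c in ("p", "m", "ŋ", "s") if is_licit_coda(language, c)]
    print(language, accepted)
```

Representing each inventory as a set keeps the progressive narrowing visible: the Lǜchūn Dai set is a subset of the Píngbà Gelao set, which is in turn a subset of the Sui set.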

21.2.2 Vowels

KD languages typically have average-sized vowel inventories. With respect to vowel qualities, the number of contrastive heights and the presence or absence of front rounded vowels are the two main features that lead to variation among KD languages. With respect to vowel quantity, all but a few KD languages have a phonemic contrast between long and short vowels. For example, Lóngmíng Zhuang (Hudak 1991a) displays a three-way contrast in vowel heights, lack of front rounded vowels, and a length distinction for most vowels, as illustrated in Table 3. In contrast, Biao (Liang and J.-R. Zhang 2001: 37–42) distinguishes two distinct series of mid vowels, has a contrastive series of rounded front vowels, and lacks vowel length distinction altogether, as shown in Table 4. Importantly, no language has both front rounded and back/central unrounded vowels. All have a three-way contrast.

Tab. 3: Vowel inventory of the Zhuang dialect of Lóngmíng (modified from Hudak 1991a).

Monophthongs
          Unrounded              Rounded
          Front       Back
High      i, iː       ɯː         uː
Mid       e, eː       ɤ, ɤː      o, oː
Low                   a, aː

Diphthong: /aɯ/

Tab. 4: Vowel inventory of Biao (based on Liang and J.-R. Zhang 2001: 37–42).

            Front                     Back
            Unrounded    Rounded      Rounded
High        i            y            u
High-mid    e            ø            o
Low-mid     ɛ            œ            ɔ
Low                      a

Diphthong: /ia/

Note: Liang and J.-R. Zhang (2001) also list syllabic [ŋ̍] and [m̩], but they only occur in a handful of grammatical items, e.g. [ŋ̍] ‘five’ and [m̩] ‘neg’. Moreover, they include [iɔ] in their rime table but place it in the position expected for [iaw]. It is thus considered a realization of /iaw/.

An important restriction shared by all languages with phonemic length is neutralization of the contrast in open syllables. In Maonan (Lu 2008: 83–88), for instance, vowels in closed syllables can be either short or long, e.g. /sam⁴²/ ‘heart’ vs. /saːm⁴²/ ‘three’, but must be long in open syllables, e.g. /saː⁴²/ ‘sand’ vs. */sa⁴²/. In Yay (Hudak 1991b), vowel length is contrastive in closed syllables only for /a/ and /aː/, e.g. /kaj³³/ ‘trigger’ vs. /kaːj³³/ ‘to sell’. In some varieties, the contrast may be enhanced by quality differences. For example, in the Bouyei dialect of Wàngmó, short /a/ is realized as centralized [ɐ] (Snyder 1998).

Vowel length is such a prominent feature in KD that many languages without the phonemic contrast also exhibit a sub-phonemic distinction conditioned by tones. For example, in the Kam dialect of Zhānglǔ, /ɐ/ and /ə/ are phonetically short in checked syllables, but /a/, /e/, /i/, and /u/ are pronounced long in the same environment. While the former set only occurs with Tones /55/, /35/, and /21/, the latter occurs solely with /24/, /13/, and /31/. The only vowel that may occur as either long or short is /o/. It is realized as short when it has /55/, /35/, or /21/ and as long when it has /24/, /13/, and
/31/ (Long and G.-Q. Zheng 1998: 32–33). Furthermore, some languages that do contrast short and long vowels also show similar co-occurrence restrictions. For instance, Maonan vowels are short in checked syllables with Tones /55/ and /23/ but long in checked syllables with Tones /44/ and /24/ (Lu 2008: 90–92).

In addition to phonemic distinctions in vowel quality and quantity, a very small number of KD languages also show phonologically contrastive nasality. The most famous is Lakkja, whose nasal vowels lend support to the idea that ancient KD languages had phonologically more complex words than their modern descendants (cf. L.-Thongkum 1993). Lakkja (Lan 2011: 9–15) has four contrastive series of vowels distinguished in terms of vowel length and nasality, e.g. /fin⁵²/ ‘rain’ vs. /fiːn⁵²/ ‘deity’ and /ʔĩn²⁴/ ‘one’ vs. /ʔĩːn⁵²/ ‘cigarette’. The nasal vowels occur in far fewer morphemes and show many distributional gaps. A lesser-known KD variety with nasal vowels is the Zhuang dialect of Wénshān (J.-R. Zhang et al. 1999: 166–168). As in Lakkja, nasality distinguishes only a subset of vowel pairs, e.g. /mɛ³¹/ ‘ant’ vs. /mɛ̃⁴²/ ‘body’.

One final observation about the vowels is that the majority of KD languages have only a few diphthongs, excluding rimes that consist of a vowel plus a glide. However, a very small number of varieties have more than four complex vowels. Not coincidentally, these languages do not allow consonants in syllable-final position, as illustrated by Zoulei (X. Li et al. 2014: 27–30), which exhibits 13 diphthongs, /ai/, /ei/, /ui/, /au/, /əu/, /aɯ/, /əɯ/, /iu/, /ɯu/, /ia/, /ie/, /iɔ/, /ua/, and seven triphthongs, /iau/, /iəɯ/, /iəu/, /uai/, /uau/, /uəɯ/, /uei/.⁴

21.2.3 Tones

All attested KD languages are tonal languages in which pitch is the primary phonetic cue to word-prosodic contrast. In such pitch-based systems, tones are distinguished from each other in terms of pitch height and pitch contour. Within the language family, the inventory sizes vary from three to nine lexical tones, as illustrated by Aiton, a Shan variety spoken in northeast India (Morey 2005a: 160–163, Morey 2005b), shown in Table 5, and Zhānglǔ Kam (based on Long and G.-Q. Zheng 1998: 30–32) in Table 6.

4 Sequences of vocoids ending in [-i] and [-u] in this language are better treated as parts of diphthongs and triphthongs because they do not show any diagnostic for analyzing them as final glides following vowel nuclei. Crucially, if they were final glides, they should be able to follow all vowels that are produced at a different location in the mouth. For example, we would also expect to see *[ɔi], *[ɯi], *[əi], *[əu] and *[eu] in the inventory.

Tab. 5: Tonal inventory of Aiton (based on Morey 2005a: 160–163, Morey 2005b).

Tone    Description            Examples
1       Mid/high level tone    /maː⁴⁴/ ‘dog’
2       High level falling     /maː⁴⁴²/ ‘to come’
3       Mid falling            /maː³¹/ ‘horse’

Tab. 6: Tonal inventory of Zhānglǔ Kam (based on Long and G.-Q. Zheng 1998: 30–32).

Tone    Description    Examples
1       /55/           /saw⁵⁵/ ‘to twist’
1’      /35/           /saw³⁵/ ‘straw’
2       /212/          /saw²¹²/ ‘to rear’
3       /323/          /saw³²³/ ‘to steam’
3’      /13/           /saw¹³/ ‘grass carp’
4       /31/           /saw³¹/ ‘husband’
5       /53/           /saw⁵³/ ‘soup’
5’      /453/          /saw⁴⁵³/ ‘egret’
6       /33/           /saw³³/ ‘to create’

Although all attested KD tonal systems are pitch-based, some do exhibit redundant voice quality properties such as glottalization or laryngealization. For example, two of the six tones in Nung Fan Sling (Saul and Wilson 1980: 9) are described as ending in optional final glottalization or laryngealization, e.g. /maː⁴⁵ˀ/ ‘horse’ and /maː²¹˜/ ‘to no longer be afraid’. This final glottalization appears to be common in Tai (Gedney 1989) and has also been reported for the Red Gelao variety spoken in northern Vietnam (Edmondson and Li 2003). Unfortunately, most sources of data simply ignore such non-contrastive phonetic properties, describing only pitch heights and contours.

Another interesting aspect of KD tones is their relationship to onsets. In a number of languages, only a subset of tones can co-occur with a certain type of onset. For example, the six tones in Trùng Khánh Tay are divided into two series. While tones /53/, /43/, and /34/ do not co-occur with voiced fricatives and breathy voiced stops, /21/, /33/, and /25/ are not found in syllables with unaspirated voiceless, aspirated voiceless, or plain voiced initial obstruents (Pittayaporn and Kirby 2017). More complicated is the co-occurrence restriction in Maonan. As in Trùng Khánh Tay, Maonan tones pattern in two groups of three. Tones /42/, /51/, and /44/ do not occur with implosives, whereas /231/, /24/, and /213/ are not possible when the onset is a prenasalized stop, a preglottalized sonorant, an aspirated stop, or a fricative (Lu 2008: 93–100).

Furthermore, a non-negligible number of KD languages are reported to have tone sandhi, the phonological process across word boundaries by which lexical tones exhibit alternation (Yip 2002: 116) or become another tone in a specific tonal environment (Brunelle et al. 2020). Interestingly, most, if not all, reported cases of productive sandhi are anticipatory, involving tonal substitution triggered by the following syllable. In the Zhuang dialect of Wǔmíng (Snyder and Lu 1997), the rising tones /24/ and /35/ are replaced by the high tone /55/, e.g. /raːm²⁴/ ‘to carry’ + /taːj²¹/ ‘table’ → [raːm⁵⁵ taːj²¹] ‘to carry the table’. Similarly, the low tone /21/ and the mid tone /33/ are replaced by the falling tone /42/, e.g. /wun²¹/ ‘person’ + /huŋ²⁴/ ‘small’ → [wun⁴² huŋ²⁴] ‘children’. Unlike in Wǔmíng Zhuang, sandhi processes in Maonan (Lu 2008: 101–107) do not necessarily replace one lexical tone with another but may also yield non-contrastive phonological alternants. For instance, in non-phrase-final positions, tones in the high series /51/, /24/, and /44/ are lowered to [42], [12], and [22], respectively, e.g. /jaːw⁵¹/ ‘inside’ + /kjeŋ⁵⁵/ ‘mirror’ → [jaːw⁴² kjeŋ⁵⁵] ‘in the mirror’, and /naːn²⁴/ ‘flesh’ + /muː⁵⁵/ ‘pig’ → [naːn¹² muː⁵⁵] ‘pork’. While [42] is a contrastive tone in the language, the sandhi forms [12] and [22] must be analyzed as contextually conditioned allophones of tones /24/ and /44/, respectively.

Crucially, tone sandhi in KD languages, as in Sino-Tibetan and Hmong-Mien, is domain-sensitive, applying within prosodic constituents rather than to simple linear sequences. For instance, the tonal alternation in Wǔmíng Zhuang (Snyder and Lu 1997: 121–126) only occurs within immediate prosodic constituents. In contrast to (1), the verb /ɣaː²⁴/ ‘to read’ does not become [ɣaː⁵⁵] in (2) because it does not form an immediate constituent with /saɯ²⁴/ ‘book’. Instead, the noun /saɯ²⁴/ surfaces as [saɯ⁵⁵] as it forms a prosodic constituent with /kun²⁴/ ‘Chinese’.

(1)

teː²⁴ paj²⁴ [ɣaː⁵⁵ saɯ²⁴]
3sg go read book
‘He goes to look for books.’ (Snyder and Lu 1997: 122)

(2)

teː²⁴ paj²⁴ ɣaː²⁴ [saɯ⁵⁵ kun²⁴]
3sg go read book Chinese
‘He goes to look for Chinese books.’ (Snyder and Lu 1997: 122)
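
The domain-sensitive substitution described for Wǔmíng Zhuang can be sketched in Python. This is only an illustration under simplifying assumptions, not an implementation of Snyder and Lu's analysis: syllables are represented as (segments, tone) pairs, prosodic constituents are supplied by hand as lists, and sandhi is assumed to affect only the first syllable of a two-syllable constituent. The names `SANDHI` and `apply_sandhi` are invented for this sketch.

```python
# Tone substitutions reported in the text for Wǔmíng Zhuang:
# rising /24/ and /35/ -> high [55]; low /21/ and mid /33/ -> falling [42].
SANDHI = {"24": "55", "35": "55", "21": "42", "33": "42"}


def apply_sandhi(phrase):
    """Apply anticipatory sandhi inside each prosodic constituent.

    `phrase` is a list of constituents; each constituent is a list of
    (segments, tone) pairs. Assumption of this sketch: only the first
    syllable of a two-syllable constituent changes, and syllables that
    stand alone keep their lexical tone.
    """
    output = []
    for constituent in phrase:
        syllables = list(constituent)
        if len(syllables) == 2:
            segments, tone = syllables[0]
            syllables[0] = (segments, SANDHI.get(tone, tone))
        output.append(syllables)
    return output


# Example (2): 'book' is bracketed with 'Chinese', so it undergoes sandhi,
# while the verb stands outside the constituent and keeps /24/.
example_2 = [[("teː", "24")], [("paj", "24")], [("ɣaː", "24")],
             [("saɯ", "24"), ("kun", "24")]]
result = apply_sandhi(example_2)
```

Treating the bracketing as input, rather than computing it, keeps the sketch neutral about how prosodic constituents are derived from syntax.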

21.2.4 Prosodic word

The canonical shape of the prosodic word in most KD languages is a stressed monosyllable consisting of an obligatory onset, a vocalic nucleus, and an optional simple coda. Particularly interesting is the ban against stressed *C(C)V in languages with a vowel length contrast, such as Thai (Bennett 1995: 57–60), Hlai (Ouyang and Zheng 1983: 20–22), Maonan (Lu 2008: 81–88), Mulao (Wang and Zheng 1980: 13–17), and Mak (Yang 2000: 24–30), among others. Bennett (1995: 67–70) explains this restriction as an effect of the Stress-to-Weight Principle. To illustrate, possible syllables in the Zhuang dialect of Lóngzhōu (F.-K. Li 1940) include C(C)VV, e.g. /piː³¹/ ‘fat’ and /kɯː³³/ ‘salt’; C(C)VC, e.g. /piŋ³³/ ‘leech’ and /mat⁵⁵/ ‘flea’; and C(C)VVC, e.g. /taːŋ⁵⁵/ ‘window’ and /heːw²⁴/ ‘chestnut’; but not *C(C)V.
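
The syllable shapes listed for Lóngzhōu Zhuang can be restated as a small well-formedness check. The sketch below is illustrative only: it operates on hand-written CV skeletons rather than on IPA strings, and the names `WELL_FORMED` and `is_possible_syllable` are invented here.

```python
import re

# Permitted stressed syllable skeletons listed for Lóngzhōu Zhuang:
# C(C)VV, C(C)VC, and C(C)VVC. The light open shape *C(C)V is excluded.
WELL_FORMED = re.compile(r"C{1,2}(VV|VC|VVC)")


def is_possible_syllable(skeleton):
    """Return True if a CV skeleton such as 'CVVC' matches a permitted shape."""
    return WELL_FORMED.fullmatch(skeleton) is not None
```

Under this encoding, /piː³¹/ is `CVV`, /piŋ³³/ is `CVC`, and /taːŋ⁵⁵/ is `CVVC`, all of which are accepted, whereas `CV` and `CCV` are rejected. Following the classification in the text, a final glide as in /heːw²⁴/ is counted as a coda C.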

There are only a few KD languages that are not monosyllabic. Thai (Bennett 1995), Lao (Enfield 2007: 33–36), and a few other languages of the lower part of Mainland Southeast Asia are unmistakably polysyllabic, having acquired longer words through borrowing, as illustrated by Southern Thai /cʰa⁴³.laːt⁴⁵/ ‘clever’, /sam⁴⁵.mlap⁴⁵/ ‘for’, and /leːŋ⁴³.teːŋ³³/ ‘bilimbi’ (Karavi 1996: 275–278). More interesting typologically and historically is Buyang (J.-F. Li 1999: 35–38), which has a sizable set of sesquisyllabic words, i.e. words consisting of a full stressed syllable preceded by a defective syllable-like unit (Pittayaporn 2015). Examples include /qaˈɕiː¹¹/ ‘to be full’, /maˈðuː³¹²/ ‘eight’, and /taˈnaː³¹²/ ‘thick’. In addition, Paha (J.-F. Li and Y.-X. Luo 2010: 16–17), Long-haired Lachi (Kosaka 2000: 20–24), and Qabiao (J.-R. Zhang 1990) also appear to have sesquisyllabic words.

With respect to syllable structure, the only significant variation is whether complex onsets are allowed. In languages that do allow onset clusters, only typologically unmarked ones consisting of two consonants are permitted. No language has clusters that violate the Sonority Sequencing Principle (Clements 1990). Some allow clusters that are made up of a nasal followed by a liquid, thus showing a sonority distance smaller than two, cf. Zec (2007). For example, the Zhuang dialect of Héngxiàn (J.-R. Zhang et al. 1999: 55–56) has only five clusters with medial /-l-/ and /-w-/, e.g. /pl-/ in /plaː⁴⁵/ ‘fish’ and /kw-/ in /kwaː⁵³/ ‘to cross’. Similarly, the relatively large inventory of clusters in Saek (Hudak 1993: xxv–xxvii) in Table 7 only includes clusters with medial /-l-/, /-r-/, and /-w-/.

Tab. 7: Permissible clusters in Saek (based on Hudak 1993: xxvii).

        -l-                    -r-                           -w-
p-      /plaː³⁴/ ‘fish’        /praː³⁴/ ‘eye’                –
pʰ-     /pʰluː⁴⁵⁴/ ‘betel’     /pʰrak⁴⁵⁴/ ‘vegetable’        –
b-      /blian³⁴/ ‘moon’       –                             –
t-      /tlua³⁴/ ‘salt’        /trɛːŋ⁴⁵⁴/ ‘sharp’            –
tʰ-     /tʰlaː¹¹/ ‘tray’       /tʰruː⁴⁵⁴/ ‘to be similar’    /tʰwaːj¹¹/ ‘to give’
k-      –                      –                             /kwaj³⁴/ ‘to stir’
kʰ-     –                      –                             /kʰwɤː³²/ ‘daughter-in-law’
ʔ-      –                      –                             /ʔwaːj³²/ ‘to turn around’
s-      –                      –                             /swaː⁵²/ ‘to be famous’
h-      –                      –                             /hwaːŋ³¹/ ‘to be worried’
m-      /mlɔː⁵²/ ‘meat’        –                             –
ŋ-      –                      –                             /ŋwaːk⁵²/ ‘to turn the head’
l-      –                      –                             /lwaːt⁵²/ ‘to be overly full’



21.3 Word classes

Establishing word classes in isolating languages is notoriously difficult due to the lack of overt inflectional morphology. For KD languages, it is doubly hard, as available sources often do not give explicit morphosyntactic criteria or provide enough data to carry out further systematic analyses. As there may be no framework that is completely adequate for the entire family, the classification in this chapter is based on Prasithrathsint’s (2010) analysis of Thai word classes, as it has the benefit of being based on an insider’s perspective. KD words can be grouped into eight major categories based on their distribution and grammatical functions. These word classes are illustrated here by data from Nung Fan Sling (Saul and Wilson 1980).

The verb is unique among the word classes in that it can be negated by a negator, as illustrated by /kin⁴⁵/ ‘to eat’ in (3). In addition to prototypical verbs, this category also includes auxiliary verbs as well as stative verbs that denote property concepts corresponding to adjectives in European languages, cf. Prasithrathsint (2000) and Post (2008). However, this criterion may not be absolute in some languages, as evidenced by the Thai sentence in (4), which seems to suggest that negators may also negate nouns in some special constructions.

(3)

ɬaːw⁴⁵ sam²¹˜ miː³³ saŋ³³ kin⁴⁵
sister also have not.yet eat
‘I (=sister) also haven’t eaten yet.’ (Saul and Wilson 1980: 50)

(4)

tuə³³ nan³⁴ maj⁴² wuə³³ kɔ=kʰwaːj³³
clf dist neg cow then=buffalo
‘That one is either a cow or a buffalo.’

The noun can be defined by its function as an argument of verbs and as a complement of prepositions, as well as by its co-occurrence with adjectives, as illustrated by /vaːj³³/ ‘buffalo’ in (5)–(7). This word class subsumes many types of nominals including common nouns, proper names, time words, numerals, classifiers, and personal pronouns.

(5)

kaw⁴⁵ han¹³ tuː⁴⁵ vaːj³³ ɗɔŋ²¹˜ nɯŋ³³
1sg see clf buffalo be.white one
‘I see a white buffalo.’ (Saul and Wilson 1980: 14)

(6)

mɯn³³ paj⁴⁵ sak³³ kʰɯn²¹ taɯ²¹ man²¹˜ vaːj³³ paj⁴⁵
3sg go butt ascend under pl buffalo go
‘He butted underneath the buffalo.’ (Saul and Wilson 1980: 15)

(7)

loːŋ⁴⁵ kaj³³ tuː⁴⁵ vaːj³³ niː⁴⁵ˀ
be.big be.equal clf buffalo prox
‘It was as big as a buffalo.’ (Saul and Wilson 1980: 38)

The next word class is the adjective, which always co-occurs with nouns. Unlike verbs, adjectives cannot be negated by a negator. Moreover, unlike nouns, they cannot co-occur with demonstratives, which are themselves considered a sub-type of adjectives. Importantly, these criteria exclude words denoting properties corresponding to adjectives in European languages. Rather, such words are considered stative verbs. The class of adjectives mostly consists of determiners, including demonstratives, interrogative words, and a small set of other modifiers, exemplified by the distal demonstrative /teː⁴⁵/ and the interrogative /haɯ³³/ ‘which’ in (8)–(9), respectively.

(8)

læːw⁴⁵ˀ vaŋ³³ teː⁴⁵ niː⁴⁵ paj⁴⁵
then boy dist flee go
‘Then that boy fled.’ (Saul and Wilson 1980: 37)

(9)

luk⁴⁵ haɯ³³ paj⁴⁵ haːŋ²¹˜
time which go market
‘When are you going to the market?’ (Saul and Wilson 1980: 119)

The preposition is a closed class of words that co-occur with nouns or verbs in a head-complement relation. In a prepositional phrase, a nominal or verbal complement is obligatory. For example, the preposition /nɯː⁴⁵/ ‘in’ in (10) must be followed by a nominal complement, in this case the noun phrase /baːn²¹ haw¹³/. Similarly, the preposition /tiŋ³³/ ‘on’ in (11) is complemented by the noun /poː³³/ ‘mountain’.

(10)

piː²¹˜-nɔːŋ⁴⁵ˀ nɯː⁴⁵ baːn²¹ haw¹³ haː⁴⁵ˀ vaː²¹˜ …
sibling in village 3pl tell say …
‘The brethren in their village said …’ (Saul and Wilson 1980: 39)

(11)

ɬɔːŋ⁴⁵ vaŋ³³ teː⁴⁵ kʰɯn²¹ tiŋ³³ poː³³ paj⁴⁵
two boy dist ascend on mountain go
‘These two boys went up the mountain.’ (Saul and Wilson 1980: 38)

However, nouns and verbs can also take the function of what in European languages are prepositions. In such cases, it is usually impossible to tell if they have been grammaticalized into prepositions as they occur in ambiguous syntactic environments. For instance, /ʑuː²¹˜/ in (12) can be analyzed as a preposition ‘at’ if /ʨaːŋ⁴⁵/ and /hɛn³³/ are nouns. Alternatively, it can be analyzed as a verb if the two words are considered prepositions.

(12)

ʑuː²¹˜ ʨaːŋ⁴⁵ taː²¹˜ ʑuː²¹˜ hɛn³³ ʔan⁴⁵ huː³³ niː³³ kaw⁴⁵ tæːm¹³ paː⁴⁵
at inside river at side clf hole foc 1sg catch fish
‘In the river at the side of the hole I catch fish.’ (Saul and Wilson 1980: 91)

The adverb is defined by its function as a modifier either preceding or following verbs. Like the adjective, words belonging to this class cannot be negated by a negator and cannot co-occur with demonstratives. In addition to prototypical adverbs such as time, place, and manner adverbs, discourse connectors are also categorized in this word class. In (13), the connectors /ʨiŋ¹³/ ‘then’ and /sam²¹˜/ ‘also’ are adverbs as they precede and modify the verb /paj⁴⁵/ ‘to go’. One could also argue that they belong, with conjunctions, to a class of “connectors” or “linkers”. On the other hand, in (14), the time adverb /ʑaː¹³/ ‘already’ follows and modifies the verbs /nɔːn³³/ ‘to sleep’ and /dak⁴⁵/ ‘to be deep’.

(13)

mɯn³³ maː³³ həːn³³ ʨiŋ¹³ sam²¹˜ paj⁴⁵ nɔːn³³
3sg come house then also go sleep
‘He came home, then he went to sleep.’ (Saul and Wilson 1980: 51)

(14)

ʔɔŋ⁴⁵ teː⁴⁵ nɔːn³³ dak⁴⁵ ʑaː¹³
clf dist sleep be.deep already
‘That person is sleeping soundly already.’ (Saul and Wilson 1980: 97)

The next class is the conjunction, which links two syntactically equal units. Unlike nouns, conjunctions cannot co-occur with demonstratives. Moreover, unlike verbs, they cannot be negated by a negator. The conjunction /saw²¹˜/ ‘and’ in (15) occurs between two noun phrases, while the conjunction /viː²¹˜/ ‘because’ in (16) links two clauses together.

(15)

ʔaw⁴⁵ NP[koː⁴⁵ maːk¹³ naj²¹] saw²¹˜ NP[baɯ⁴⁵ maj⁴⁵ˀ naj²¹] maː³³ kin⁴⁵
take plant fruit prox and leaf tree prox come eat
‘Bring this fruit and this leaf and eat them.’ (Saul and Wilson 1980: 17)

(16)

viː²¹˜ S[mɯn³³ kin⁴⁵ laːj⁴⁵], S[mɯn³³ boː³³ miː³³ sæːn³³ kin⁴⁵]
because 3sg eat much 3sg neg have money eat
‘Because he ate so much, he didn’t have money to eat.’ (Saul and Wilson 1980: 104)

The quantifier is another major word class. It is defined by its occurrence before nouns in head-modifier relations. Moreover, quantifiers cannot be negated, and do not co-occur with verbs and prepositions. A good example of a quantifier is shown in (17) in which /doː¹³/ ‘enough’ comes before and modifies the noun /piː⁴⁵/ ‘year’.

(17)

luk²¹˜ saŋ³³ doː¹³ piː⁴⁵
child not.yet be.enough year
‘The child isn’t old enough yet.’ (Saul and Wilson 1980: 23)

The last word class is the class of particles, which can be defined as words that occur before or after an utterance without a grammatical relation to any individual word. Unlike other word classes, they primarily serve discourse functions rather than grammatical ones. In (18), the emphatic final particle /loː⁴⁵ˀ/ is attached to the end of the imperative sentence /maː³³ kin⁴⁵ si³³/ ‘Come eat bread!’ to convey emphasis.

(18)

koː⁴⁵ ʔəːj⁴⁵, maː³³ kin⁴⁵ si³³ loː⁴⁵ˀ
father ptcl come eat bread ptcl
‘Father! Come and eat bread!’ (Saul and Wilson 1980: 115)

It is important to note that in this classification polyfunctional words are treated as semantically related homophones. For example, the Nung Fan Sling nominal /naj²¹/ in /maː³³ naj²¹/ ‘come here!’ is considered a homophone of the adjectival /naj²¹/ in /baɯ⁴⁵ maj⁴⁵ˀ naj²¹/ ‘this leaf’. Similarly, the verb /paj⁴⁵/ ‘to go’ and the imperative particle /paj³³/ are treated as near-homophones, as they show different distributions and argument structures, even though they are historically related. That such polyfunctionality is common in KD languages is not surprising given that grammaticalization is rampant in Mainland Southeast Asia, cf. Enfield (2003) and Bisang (1996).

21.4 Words and word formation

Like other isolating languages, KD languages do not display any inflectional morphology. Compounding is the primary way of forming new words in all languages of the family. Compounds in KD can be either subordinate or coordinate. A subordinate compound consists of two elements, one of which is subordinate to the other. Examples from Paha (J.-F. Li and Y.-X. Luo 2010: 18–20) are /ðaːŋ³²² mwaː⁴⁵/ ‘handle of knives’ from ‘handle’ + ‘knife’ and /niŋ⁴⁵ θɔ³¹/ ‘to aim, take aim’ from ‘to shoot’ + ‘straight’. In addition, subordinate compounds in the Lue dialect of Jǐnghóng (Zhou and M.-Z. Luo 2001: 176–182) include /laːj⁴¹ mɯː⁴¹/ ‘lines of the palm’ from ‘pattern’ + ‘hand’ and /kam⁵⁵ dɤn⁵⁵/ ‘lying-in’ from ‘to confine’ + ‘moon’.

The second type of compound is the coordinate compound, which can be characterized as headless, or made up of two elements with a parallel structure. Semantically, the referents are either equivalent to those of the parts that make up the compounds, the union of the referents of the parts, or a superset of the referents of the parts (Mortensen 2003: 2). The two elements may form converse, synonym, antonym, or near-synonym pairs. Examples from the Zhuang dialect of Jìngxī (Y.-Q. Zheng 1996: 182) include /pej³²⁴ noːŋ²¹³/ ‘siblings’ from ‘older sibling’ + ‘younger sibling’ and /kam⁵³ hoː²³²³/ ‘to be miserable’ from ‘to be bitter’ + ‘to be poor’. Similar examples are found in Long-haired Lachi (Kosaka 2000: 66), including /cĩ⁴³ wɑ²³/ ‘to do business’ from ‘to buy’ + ‘to sell’ and /m̩²² qʰɛ⁴³/ ‘food’ from ‘cooked rice’ + ‘vegetable’. Note the syntactic parallelism between the two elements of the compounds in both languages.

The second word-formation process in KD is reduplication, which can be either total or partial. Many disyllabic words are created by reduplicating a monosyllabic root that otherwise does not occur alone. For example, in the Buyang dialect of Lángjià (J.-F. Li 1999: 34), the names for ‘toad’, ‘bat’, and ‘cockroach’, /pom¹²~pom¹²/, /tem³³~tem³³/, and /θip⁵³~θip⁵³/, are monomorphemic but display total reduplication. Similarly, certain monomorphemic Buyang verbs such as /ði¹¹~ðap¹¹/ ‘to blink’, /ɓɔk¹¹~ɓaːk¹¹/ ‘to spill over’, and /ɓiʔ¹¹~ɓuːt¹¹/ ‘to scoop out’ appear to have been derived through onset reduplication.



In addition to creating new lexemes, reduplication can also be used as a grammatical device. While functions of reduplication differ from language to language, certain semantic similarities exist. For nouns, the reduplicated forms typically convey universal quantification, i.e. the concept of ‘every’ and ‘each’, as illustrated by /zəɯ¹³~zəɯ¹³/ ‘every person’ and /wuai³³~wuai³³/ ‘every day’ in Zoulei (X. Li et al. 2014: 93). An interesting function of noun reduplication is to indicate a small amount, as illustrated by the reduplicated classifiers /wu²⁵~wu²⁵/ ‘small mouthful’ and /jəm¹¹~jəm¹¹/ ‘few seeds’ in the Kam dialect of Shídòng (Long and G.-Q. Zheng 1998: 94–95). For verbs, reduplication, most typically applied to property-denoting stative verbs, expresses intensity of the situation being described. This type of reduplication is found in Paha (J.-F. Li and Y.-X. Luo 2010: 33), e.g. /ɡaːŋ³¹~ɡaːŋ³¹/ ‘to be very firm’ and /m̥ɔ⁴⁵~m̥ɔ⁴⁵/ ‘to be very happy’. Moreover, other types of verbs can also be reduplicated to convey transience or repetitiveness, e.g. /χia³³~χia³³/ ‘to take a little walk’ and /zai³³~zai³³/ ‘to have a quick look’ in Zoulei (X. Li et al. 2014: 147). In Paha (J.-F. Li and Y.-X. Luo 2010: 33), reduplicated non-stative verbs convey repeated actions, e.g. /na³¹~na³¹-ðɔŋ⁴⁵~ðɔŋ⁴⁵/ ‘walk up and down (repeatedly)’.

In addition to compounding and reduplication, traces of affixation have been reported in a very small number of Kra languages. Long-haired Lachi (Kosaka 2000: 50–64) has a number of phonologically reduced bound forms with transparent meanings, many of which carry no lexical tone. For example, /mji/ is attached to fruit names and nouns referring to fruit-like objects, e.g. /mji=põ⁴⁵/ ‘peach’, /mji=kam⁴⁵/ ‘tangerine’, and /mji=le²²/ ‘bell’. Another example is /nji/, which is found at the beginning of a set of verbs, e.g. /nji=bo⁴⁵/ ‘to yawn’, /nji=tɛ⁴⁵/ ‘to fart’, and /nji=qatji²³/ ‘to vomit’. Despite their productivity, it is not clear whether these are still morphologically dependent but phonologically independent clitics or whether they have become true affixes. An important observation is that many of the preposed morphemes developed from the reduction of full lexical words, i.e. /mji²²/ ‘fruit’ and /nji²²/ ‘to hit’. Crucially, they can be omitted in certain syntactic positions (Kosaka 2000: 50–64), suggesting that they have not become true prefixes, cf. Matisoff (2001).

Among reported cases of affixation in KD, the preposed morphemes in Long-haired Lachi are the closest to true affixes. Many alleged cases are in fact morphologically independent words that co-occur with other lexical morphemes. A good example is /ma⁵⁵/ in Zoulei, which marks female animals, e.g. /ma⁵⁵ ŋɤu⁵⁵/ ‘mare’, /ma⁵⁵ na³³/ ‘cow’, and /ma⁵⁵ qua³¹/ ‘hen’. It is referred to as a “prefix” by X. Li et al. (2014: 59) but is more likely the first element of a subordinate compound, i.e. /ma⁵⁵/ ‘mother’ + /ŋɤu⁵⁵/ ‘horse’. Similarly, /ni³³/ occurs between the main verb and a verb denoting result, e.g. /ɭa³³ ni³³ kʰua⁵⁵/ ‘became tired from speaking’. It is claimed to be a resultative infix but is also described as a complement-marking enclitic in the same book. Also often labelled as “affixes” are presyllables in sesquisyllabic words. For example, J.-F. Li (1999) lists /ma/ as a prefix used to form nouns or verbs in Lángjià Buyang, e.g. /ma.la³¹²/ ‘fish’, /ma.tak¹¹/ ‘grasshopper’, /ma.ɕam⁵⁴/ ‘hair’, and /ma.ta⁵⁴/ ‘eye’. Jacques (2017) shows that this /ma/ is in fact part of a monomorphemic root.

In sum, KD languages are all strongly isolating, relying largely on compounding and reduplication and displaying no productive agglutinative or inflectional morphology.

21.5 Phrase and clause structure

KD languages are strongly head-initial, with few exceptions showing a slight mixture of head-final traits. They all display SVO basic word order, employ prepositions, and have head-modifier order within the noun phrase. Moreover, grammatical relations, aspects, and modalities are expressed periphrastically through linear ordering and grammatical particles.

21.5.1 Basic clause structure

The clause structures of all KD languages are characterized by a fixed SVO constituent order both in main and subordinate clauses, with possible variation in independent clauses due to pragmatic considerations. Without morphological marking of grammatical relations, the linear order of the core arguments relative to the verb determines their roles within each clause. The subject, if overt, always precedes the verb, while the direct object typically follows the verb as shown by the examples from Flowery Lachi (Y.-B. Li 2000: 151–182) in (19) and Nung Fan Sling (Saul and Wilson 1980: 60–85) in (20).

(19)

ku⁴⁴.la⁴⁴ m̩⁵⁵ vei⁵⁵ kje³¹, kje³¹ ɲaŋ⁴⁴ kʰen⁴⁴ vu⁴⁴
if 1sg call 3sg 3sg only.then be.willing go
‘Only if you call him, he would then be willing to go.’ (Y.-B. Li 2000: 209)

(20)

kaw⁴⁵ han¹³ keː¹³ tiː²¹˜ vaː²¹˜ meː¹³ mɯn³³ tʰaːj¹³
1sg see man rel mother 3sg die
‘I saw the man whose mother had died.’ (Saul and Wilson 1980: 81)

Interestingly, the basic clause structure in KD is extremely homogeneous, despite claims that Khamti, a Shan variety spoken in northeastern India and northwestern Myanmar, has become SOV (Khanittanan 1986; Needham 1894: 81). Morey (2006) provides ample empirical data showing that SVO sentences are unmarked and that the SOV order is possible only in predicate focus structures, as illustrated by (21)–(22). The direct object in SOV sentences is marked by the anti-agentive /maj³⁵ˀ/, which marks non-agentive animate arguments.

(21)

sɯ⁴⁴ kaːp⁴⁴ pʰaːn⁴²
tiger bite barking_deer
‘The tiger ate the barking deer.’ (Morey 2006: 360)



(22)

sɯ⁴⁴ pʰaːn⁴² maj³⁵ˀ kaːp⁴⁴
tiger barking_deer anti_agt bite
‘The tiger ate the barking deer.’ (Morey 2006: 360)

In negative clauses, the SVO word order is still strictly adhered to. The only typological variation found is the position of the standard negator relative to the verb. Most KD languages use preverbal negators. As exemplified in (23)–(24), the negators /m̩²⁴/ in the Dai dialect of Déhóng (Kullavanijaya 2001: 86–88) and /pi⁵⁵/ in Paha (J.-F. Li and Y.-X. Luo 2010: 36) come before the verb.

(23)

kaw³³ m̩²⁴ caɯ³³ mo²⁴ laːj⁵⁵
1sg neg cop teacher
‘I’m not a teacher.’ (Kullavanijaya 2001: 86)

(24)

ʨuː³¹ paː³³ ʔan³²² jaː¹¹ lan³¹ piː⁵⁵ dʱam⁴⁵ laːk¹¹
father have wife later neg care child
‘The father refuses to look after the child after he has a second wife.’ (J.-F. Li and Y.-X. Luo 2010: 36)

In contrast, a minority of languages use two-part or clause-final negators in negative clauses. For example, Yang Zhuang (Jackson 2019) employs a bipartite construction /mej³¹ … naːw⁴⁵/ to negate verbs, as shown in (25). While /mej³¹/ comes immediately before the verb, /naːw⁴⁵/ almost always appears clause-finally. At the extreme, Lángjià Buyang (J.-F. Li 1999: 76), like most Kra languages, places the imperfective negator /laːm¹¹/ ‘neg.ipfv’ at the end of the clause, as illustrated in (26).

(25)

ŋo⁴⁵ mej³¹ row¹³ naː³³ˀ kən³¹ kej⁴⁵ mjɛn²⁴ paː²⁴ ni⁴⁵ naːw⁴⁵
1sg neg know person prox cop father 2sg neg
‘I didn’t know this man was your father.’ (Jackson 2019: 58)

(26)

kɛ⁵⁴ tin¹¹ lavi³¹² qʰun⁵⁴ laːm¹¹
3sg able walk way neg.ipfv
‘He still can’t walk.’ (J.-F. Li 1999: 76)
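
The three negator placements described here (preverbal, bipartite, and clause-final) can be compared side by side in a short Python sketch. It is purely illustrative: clauses are plain word lists, the position of the verb is supplied by hand, and the affirmative base used for (25) is an assumption obtained by removing both negators from the attested sentence. The function name `negate` and its parameters are invented for this sketch.

```python
def negate(clause, verb_index, strategy, pre=None, final=None):
    """Insert negators into a clause given as a list of words.

    strategy: 'preverbal' -> `pre` directly before the verb
              'bipartite' -> `pre` before the verb and `final` clause-finally
              'final'     -> `final` at the end of the clause
    """
    if strategy not in ("preverbal", "bipartite", "final"):
        raise ValueError(f"unknown strategy: {strategy}")
    words = list(clause)
    if strategy in ("bipartite", "final"):
        words.append(final)
    if strategy in ("preverbal", "bipartite"):
        words.insert(verb_index, pre)
    return words


# (25) Yang Zhuang: the affirmative base below is hypothetical, derived by
# stripping both negators from the attested negative sentence.
base = "ŋo⁴⁵ row¹³ naː³³ˀ kən³¹ kej⁴⁵ mjɛn²⁴ paː²⁴ ni⁴⁵".split()
negated = negate(base, 1, "bipartite", pre="mej³¹", final="naːw⁴⁵")
```

The same function reproduces the word order of (23) with the `preverbal` strategy and of (26) with the `final` strategy, which makes the point that the three types differ only in where the negator is placed, not in the order of subject, verb, and object.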

Like negative clauses, interrogative clauses in KD are characterized by a lack of constituent order change. For example, Northern Thai (Wimonkasem 2012: 160–166), as illustrated in (27), places the interrogative pronoun /pʰaj²⁴/ ‘who’ before the verb, the normal subject position. Similarly, Paha (J.-F. Li and Y.-X. Luo 2010: 35–36), exemplified by (28), puts the NP containing the adjective /taw⁴⁵/ ‘where’ in the expected position for the complement of the copular verb.

(27)

pʰaj²⁴ sɔːn²⁴ hɯː⁴⁴ˀ ʔu⁴⁴ˀ jaːŋ²¹ ʔiː⁴⁴ˀ
who teach give speak type prox
‘Who teaches you to speak like this?’ (Wimonkasem 2012: 160)

(28)

mə³¹ ka⁴⁵ ha³³ taw⁴⁵
2sg cop person where
‘Where are you from?’ (J.-F. Li and Y.-X. Luo 2010: 35)

As for yes-no interrogatives, the clause may be formed by adding a particle at the end of the corresponding affirmative clause or by the A-not-A construction. For example, Yang Zhuang (Jackson 2019) and Zhānglǔ Kam (Long and G.-Q. Zheng 1998: 177–178) have both kinds of interrogative clauses, as illustrated in (29)–(30) and (31)–(33), respectively. Observe that Zhānglǔ Kam uses the general and imperfective negative particles /kwe²¹²/ ‘neg’ and /mi³¹/ ‘neg.ipfv’ as clause-final interrogative markers.

(29)

waːn³¹ kej⁴⁵ ɗuːt³⁵ kwa⁴⁵ waːn³¹ waː³¹ mi⁴⁵
today be.hot beyond yesterday ptcl
‘Is today hotter than yesterday?’ (Jackson 2019: 58)

(30)

kun⁴⁵ pit⁵³ kej⁴⁵ tsej²⁴ mej³¹ tsej²⁴ kun⁴⁵ ni⁴⁵ naːw⁴⁵
clf pen prox cop neg cop clf 2sg neg
‘Is this pen yours or not?’ (Jackson 2019: 61)

(31)

ɲa²¹² li³²³ tʰɐw⁴⁵³ ɕaŋ⁵⁵.haj³¹ kwe²¹²
2sg have reach Shanghai neg
‘Have you been to Shanghai or not?’ (Long and G.-Q. Zheng 1998: 178)

(32)

ʨi⁵⁵ ʔɐw³¹ mi³¹
eat rice neg.ipfv
‘Have you eaten or not?’ (Long and G.-Q. Zheng 1998: 178)

(33)

tuj⁵⁵ naj³³ kʰwan³⁵ kwe²¹² kʰwan³⁵
plum prox be.sweet neg be.sweet
‘Is this plum sweet or not?’ (Long and G.-Q. Zheng 1998: 177)
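
The two yes-no question strategies can be expressed as simple string-building operations. The Python sketch below is a schematic illustration only, with invented function names; it handles the simple cases in (32) and (33) and makes no attempt to cover longer A-not-A clauses such as (30), where further material follows the repeated verb.

```python
def particle_question(clause, particle):
    """Yes-no question: affirmative clause plus a clause-final particle."""
    return list(clause) + [particle]


def a_not_a_question(subject, predicate, negator):
    """Yes-no question: the predicate is repeated around a negator (A-not-A)."""
    return list(subject) + [predicate, negator, predicate]


# (32) Zhānglǔ Kam: clause-final /mi³¹/ as the interrogative marker.
q32 = particle_question(["ʨi⁵⁵", "ʔɐw³¹"], "mi³¹")

# (33) Zhānglǔ Kam: 'be.sweet' repeated around the negator /kwe²¹²/.
q33 = a_not_a_question(["tuj⁵⁵", "naj³³"], "kʰwan³⁵", "kwe²¹²")
```

In both functions the order of the remaining constituents is left untouched, reflecting the observation that interrogatives in KD involve no change in constituent order.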

Lastly, there is no distinct category for imperative clauses. Positive commands are expressed by regular declarative clauses, often with the subject omitted. Negative commands, on the other hand, are marked by a prohibitive marker instead of a general negator, or in conjunction with one. For example, Northern Thai (Wimonkasem 2012: 167) uses a preverbal prohibitive marker /bɔː²¹ diː³³/, formed by compounding ‘neg’ and ‘to be good’, to express negative commands. Similarly, Zoulei (X. Li et al. 2014: 145–146) uses the preverbal prohibitive /naŋ¹³/ with the clause-final negator /ɔ³³/ to mark a clause as prohibitive.



Typological profile of Kra-Dai languages 


21.5.2 Noun phrases and prepositional phrases

The structure of the noun phrase uniformly shows a head-modifier order. The only major variation is the position of the classifier phrase, which shows a north-south division (Lu 2012: 65–68). Elements within the NP are assembled without inflectional case marking but may sometimes be marked by particles. The linear order of the NP components is schematized, in admittedly oversimplified form, in Figure 1.

a. Northern type
slot 1     slot 2   slot 3   slot 4       slot 5
qnt/clf    n        n/v      poss/rc/pp   det

b. Southern type
slot 1   slot 2   slot 3               slot 4
n        n/v      poss/rc/pp/qnt/clf   det

Fig. 1: Linear orders of elements within NPs.
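
As a rough illustration, the two templates in Fig. 1 can be encoded as ordered lists of slots and used to check whether a sequence of category labels respects the linear order. The Python sketch below is not part of the original description: the name `fits_template`, the lowercase category labels, and the label sequences assigned to examples (35) and (37) are assumptions made for illustration only. The checker is deliberately permissive, treating every slot as optional and as able to hold more than one element.

```python
# Slot inventories transcribed from Fig. 1. The checker is an illustrative
# simplification: every slot is optional and may hold several elements.
NP_TEMPLATES = {
    "northern": [{"qnt", "clf"}, {"n"}, {"n", "v"}, {"poss", "rc", "pp"}, {"det"}],
    "southern": [{"n"}, {"n", "v"}, {"poss", "rc", "pp", "qnt", "clf"}, {"det"}],
}


def fits_template(categories, np_type):
    """Return True if the labels can be read left to right through the slots."""
    slots = NP_TEMPLATES[np_type]
    position = 0
    for category in categories:
        # Move rightward until some slot accepts this category.
        while position < len(slots) and category not in slots[position]:
            position += 1
        if position == len(slots):
            return False
    return True


# The label sequences below are one possible reading of the glosses (assumed).
lachi_35 = ["qnt", "clf", "n", "n", "v", "poss", "det"]   # example (35)
dehong_37 = ["n", "poss", "qnt", "clf", "pp", "det"]      # example (37)

print(fits_template(lachi_35, "northern"))    # True
print(fits_template(dehong_37, "southern"))   # True
print(fits_template(dehong_37, "northern"))   # False
```

Because the slots are scanned strictly left to right, a post-nominal quantifier fails the northern template, while the same sequence passes the southern one. This matches the north-south division in classifier position described above.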

In northern-type languages, classifiers and quantifiers appear before the head noun. Unlike other slots, slots 3 and 4 can host multiple modifiers. Languages of this type, exemplified by Long-haired Lachi, are found in northern Vietnam and in China east of the Red River. In (34) and (35), modifying verbs and possessives are marked by the associative particle /la/ and the possessive particle /ʔa/, respectively. In (36), the quantifier /ta.mja²³/ precedes the head noun but does not co-occur with a classifier.

(34) NP[qu²¹ la=ɲu²²]
     person assoc=be.old
     ‘An old person’ (Kosaka 2000: 102)

(35) NP[te⁴³ fɑ̃⁴³ nja²³ nji=m̩⁴⁵ ʔã²³ ʔa=kʰwi⁴³ la.bjɔ²²]
     three clf baby dog be.small poss=1sg dist
     ‘Those three small puppies of mine’ (Kosaka 2000: 112)

(36) NP[ta.mja²³ qu⁴⁵]
     all person
     ‘All the people’ (Kosaka 2000: 70)

In contrast, southern-type languages place classifiers and quantifiers after the head noun. Notably, slot 3 allows for multiple modifiers whose relative order is determined by scope. Déhóng Dai (Kullavanijaya 2001: 65–74) exemplifies the structure of the NP in southern-type languages, as in (37), where the possessor is not introduced by an overt marker. Similarly, in (38), the two classifier phrases /laŋ²⁴ ʔɛn²⁴/ ‘small one’ and /laŋ²⁴ ʔɔn³³.taːŋ⁵⁵/ ‘first one’ occupy slot 3 and function like relative clauses.

(37) NP[maːk²².moŋ³³ kaw³³ sɔŋ²⁴ hoj²² lə²⁴ phən²⁴ lan⁵³]
     mango 1sg two clf on table dist
     ‘Those two mangoes of mine on the table’ (Kullavanijaya 2001: 74)

(38) haw⁵⁵ ti²⁴ kaː²² le²² NP[ʔən⁵⁵ laŋ²⁴ ʔɛn²⁴ laŋ²⁴ ʔɔn³³.taːŋ⁵⁵]
     1pl pros go visit house clf be.small clf first
     ‘We will go visit the first small house.’ (Kullavanijaya 2001: 66)

A curious feature with respect to the position of the classifier is the numeral ‘one’, which occurs after but not necessarily adjacent to the classifier in southern-type and some northern-type varieties (Lu 2012: 143–146). In the Zhuang dialect of Tiānděng (Langella 2012: 21–22, 101–102), exemplified in (39), post-nominal quantifiers, including the enclitic /o³³/ ‘one’, cannot co-occur with a possessive or a demonstrative, suggesting that this morpheme is in fact a determiner. This analysis is compatible with the fact that /o³³/ is associated with indefiniteness.

(39) mej³³ NP[tu³⁵ noːn³⁵ ʔej³³~ʔej³³=o³³]
     have clf worm be.small~be.small=one
     ‘There is one small worm.’ (Langella 2012: 36)

An exception to the head-modifier pattern is Biao (Liang and J.-R. Zhang 2001: 114, 116–117), which has pre-nominal possessives, classifiers, quantifiers, relative clauses, and determiners but puts other types of modifiers after the head nouns, as illustrated in (40)–(42). The split pattern seems to be a transition toward a complete modifier-head order. This new Chinese-style pattern is now gaining currency in many other KD varieties in China such as Tiānděng Zhuang in (43).

(40) NP[mu²¹⁴ toj³⁵ haj¹³²]
     2sg clf shoe
     ‘Your shoe’ (Liang and J.-R. Zhang 2001: 102)

(41) NP[mui⁵⁵ kʰaːi³⁵ jo²²]
     dist clf rice.field
     ‘This plot of rice field.’ (Liang and J.-R. Zhang 2001: 103)

(42) NP[ki¹³² muk⁵⁵]
     pig be.black
     ‘Black pig’ (Liang and J.-R. Zhang 2001: 116)

(43) NP[kaj³³ kin⁵³ ti³³ mak¹³ kaːm³⁵] daj³³ kin³⁵ jɤj³¹
     place prox assoc orange be.good eat ptcl
     ‘Oranges from here are good, you know’ (Langella 2012: 28)

Closely associated with the NP is the prepositional phrase. NPs that are not core arguments in the clause are often introduced by prepositions. In addition to occurring as modifiers within NPs, prepositional phrases may also occur within the VP or as adjuncts of the clause, as shown by the Nung Fan Sling (Saul and Wilson 1980: 38–39, 90–91) example in (44) and the Long-haired Lachi (Kosaka 2000: 100) example in (45). Importantly, it is often very difficult to distinguish prepositions from their homophonous verbs and nouns because they occur in ambiguous syntactic environments. One may also say that nouns and verbs can take on functions that in other languages would be expressed by prepositions.

(44) ɬɔːŋ⁴⁵ vaŋ³³ teː⁴⁵ kʰɯn²¹ PP[tiŋ³³ poː³³ paj³³]
     two boy dist ascend on mountain go
     ‘Those two boys went up the mountain.’ (Saul and Wilson 1980: 38)

(45) kʰwi⁴³ wu²² PP[qũ⁴⁵ kʰo⁴³ fɔ²¹]
     1sg go in kitchen
     ‘I go into the kitchen.’ (Kosaka 2000: 97)

21.5.3 Verb phrases

The structure of the verb phrase in KD languages displays little cross-linguistic diversity. It consists of at least one verb and optional TAM markers and arguments. While the minimal VP has only a single verb, KD languages also make frequent use of serial verbs. A serial verb construction is a sequence of verbs that forms a single predicate conceptualized as a single event without any overt markers of clause linkage (Aikhenvald 2006: 1). Serial verb constructions in KD are non-contiguous and consist of multiple phonological words. In the Paha example (J.-F. Li and Y.-X. Luo 2010: 42) in (46), the direct object /ŋɯː³¹/ ‘door’ occurs immediately after the first verb, to which it logically belongs. Similarly, in the Wàngmó Bouyei (Boonsawasd 2012: 163–167) data in (47), the negator /mi¹¹/ comes between the verbs and has scope only over the following VP.

(46) VP[tam³²² ŋɯː³¹ waː⁴⁵ qai³²²]
     shut door catch chicken
     ‘Shut the door to catch chickens.’ (J.-F. Li and Y.-X. Luo 2010: 42)

(47) soŋ²⁴ pi³¹ nuaŋ³¹ VP[zan²⁴ po³³ mi¹¹ xaɯ⁵³ ɕen¹¹] je⁵³ ɗi²⁴.xam¹¹
     two brothers see father neg give money also be.angry
     ‘Two brothers got angry when their father did not give them money’ (Boonsawasd 2012: 112)

The serial verb constructions may be either symmetrical or asymmetrical. A symmetrical construction consists of verbs from any semantic or grammatical class (Aikhenvald 2006: 3) and expresses purposive, sequential, or simultaneous actions. While the example from Wàngmó Bouyei (Boonsawasd 2012: 163–167) in (48) illustrates a purposive construction, the Long-haired Lachi (Kosaka 2000: 113–114) example in (49) exemplifies simultaneous actions.

(48) te²⁴ paj²⁴ ziaŋ¹¹ pi³¹ te²⁴ VP[ɕia³⁵ xaw³¹ maː²⁴ kwaː³⁵ ɕiaŋ²⁴]
     3sg go with elder_brother 3sg borrow rice come celebrate
     ‘He borrowed from his brother the rice to be used in the festival.’ (Boonsawasd 2012: 167)

(49) kʰu⁴³ VP[ŋwi²² m̩.pã⁴³ ta.qɑ⁴⁵ fu⁴³ fɑ̃⁴³ qɑ.tʰe²³]
     1sg sleep dream see two clf tiger
     ‘I dreamt of two tigers.’ (Kosaka 2000: 113)

On the other hand, an asymmetrical serial verb construction contains a main verb followed by another verb from a semantically or grammatically restricted set. The secondary verbs may denote results, directions, or aspect. In the Wàngmó Bouyei example (Boonsawasd 2012: 163–167) in (50) and the Long-haired Lachi example (Kosaka 2000: 113–114) in (51), the secondary verbs indicate, respectively, that the motion is directed toward the speaker and the result of the main verb. As seen here, only motion verbs can occur as second verbs in directed-motion constructions, as in (50), and only property-denoting stative verbs in resultative constructions, as in (51).

(50) ɗan²⁴ ɲian¹¹ VP[tok³⁵ maː²⁴ taŋ¹¹ pɯaŋ¹¹ pu³¹ ˀjaj³¹]
     bronze.drum fall come arrive village Bouyei
     ‘The bronze drum fell into the Bouyei village.’ (Boonsawasd 2012: 165)

(51) VP[ko⁴³ m̩²² fa²³]
     eat rice be.satiated
     ‘Eat so as to be satiated’ (Kosaka 2000: 114)

In KD, it is common to encounter sentences with verb complexes that come from serializing multiple constructions. In Maonan (Lu 2008: 246–253), sentences with nine verbs are attested, as illustrated in (52). It is also possible that such complex predicates may in fact consist of multiple predicates without overt arguments. Note that the verb /səːŋ⁵¹/ ‘to want’ is not considered part of the serial verb complex but analyzed as the main verb that takes the serialized verbs as complement.

(52) ɦe²³¹ səːŋ⁵¹ VP[lət²³ paːj⁴² ʣaw²⁴ van²¹³ maː⁴² ɕaː⁴⁴ vɛ²⁴ kaw⁴⁴ fin⁴²] kam⁴⁴
     1sg want walk go take return come try do look accomplish ptcl
     ‘Could I walk over there to bring it back and give it a try?’ (Lu 2008: 246)

In addition to its prototypical usage, verb serialization is also used as a strategy to encode grammatical relations among arguments in KD languages. In the ditransitive construction, the indirect object is introduced by a verb of giving, as illustrated by Nung Fan Sling (Saul and Wilson 1980: 67–69) and Paha (J.-F. Li and Y.-X. Luo 2010: 37) in (53) and (54), respectively. Importantly, this verb of giving is not a full lexical verb but has been grammaticalized to varying degrees. Unlike true verbs, grammaticalized markers cannot be negated and still retain their grammatical interpretation.

(53) sin³³ naj²¹ tuː⁴⁵ ɬɯː⁴⁵ ʨiŋ¹³ VP[ʔaw⁴⁵ ʔan⁴⁵ maːk¹³ hɯː²¹ tuː⁴⁵ nɔk³³ teː⁴⁵]
     now clf tiger then take clf fruit give clf bird dist
     ‘Now the tiger gave the fruit to the bird.’ (Saul and Wilson 1980: 68)

(54) paː³³ taːj³²² VP[diː³²² ta³¹ ʔɔn³³ tiː⁵⁵ laːk³³ lim²⁴]
     elder_brother tell younger_brother one clf sentence
     ‘Elder brother said a sentence to younger brother.’ (J.-F. Li and Y.-X. Luo 2010: 37)

In the causative construction, the caused event is prototypically introduced by the verb ‘to do’ and/or ‘to give’, but other verbs that specifically describe the causing action may also be used. Crucially, the complement of the causative verb is its direct object and, at the same time, the subject of the verb denoting the caused event. Semantically, the construction may convey either causation or permission, as illustrated by the Maonan (Lu 2008: 253–255) and Paha (J.-F. Li and Y.-X. Luo 2010: 38–39) examples in (55) and (56), respectively.

(55) ɦe²³¹ VP[juː⁴⁴ man²³¹ taŋ⁴² jaːn⁴² ɦe²³¹]
     1sg call 3sg come house 1sg
     ‘I asked her/him to come to my house.’ (Lu 2008: 254)

(56) kuː³²² VP[duː³²² naːk¹¹ waj³¹ tiː⁵⁵ kat⁵⁵ maː⁵⁵ tiː³²²]
     1sg do give ruin one clf timber
     ‘I damaged a piece of timber’ (J.-F. Li and Y.-X. Luo 2010: 42)

As in many languages of Mainland Southeast Asia, the passive construction in KD seems to be a bi-clausal construction. Importantly, unlike in prototypical passive constructions, the agent is not demoted to an oblique role but remains in its preverbal subject position. The subject is marked as patient by a verb denoting coming into contact, suffering, or giving. The argument that follows the passive-marking verb is the subject of the second verb and the agent of the action. Typically, the passive construction in KD languages is adversative, indicating that the received action is not desirable, as illustrated by the Maonan (Lu 2008: 235–239) and Paha (J.-F. Li and Y.-X. Luo 2010: 34–35) examples in (57) and (58), respectively. Note that no full lexical verb corresponding to the passive marker /ɲɛ³¹/ in Paha has been identified.

(57) m nuŋ²⁴ VP[tjeŋ⁵¹ niː²⁴ bat⁵⁵ liːw⁴⁴]
     younger_sibling suffer mother smack ptcl
     ‘Little brother/sister was smacked by mother.’ (Lu 2008: 238)

(58) maː⁵⁵ luː³²² VP[ɲɛ³¹ ʨaːj⁴⁵ pjaːŋ³²²]
     money pass use all
     ‘Money has been used up.’ (J.-F. Li and Y.-X. Luo 2010: 35)

In addition to the verbs, the VP may also contain overt markers of aspect and modality, but it lacks any tense distinction. Consistent with the strong analyticity of KD, the aspectual markers are morphologically free verbal elements that occur pre-verbally, post-verbally, or VP-finally. For example, aspectual markers can occur immediately before the verb or at the end of the VP in Long-haired Lachi (Kosaka 2000: 93–96), as exemplified by /jɔ²³/ ‘prospective’ and /bɔ²²/ ‘perfective’ in (59)–(60). They can also occur immediately after the verb, as illustrated by /kwaː³⁵/ ‘perfect’ in Wàngmó Bouyei (Boonsawasd 2012: 127–130) in (61).

(59) kɛ⁴³ VP[jɔ²³ pẽ²¹ bɔ²²]
     3sg pros die pfv
     ‘He is about to die (at any moment).’ (Kosaka 2000: 93)

(60) kʰwi⁴³ VP[ko⁴³ m̩²² bɔ²²]
     1sg eat rice pfv
     ‘I have finished eating.’ (Kosaka 2000: 94)

(61) zi³³ naː²¹ VP[tu³³ paː³³ kwaː³⁵ soŋ²¹⁴ taːw³⁵]
     paddy_field all dig perf two time
     ‘All paddy fields have been dug twice.’ (Boonsawasd 2012: 129)

In contrast to aspectual markers, modality markers can only occur in the pre-verbal and the VP-final positions, as exemplified by the modality markers /ɓaːŋ³¹/ ‘probability’ in Wàngmó Bouyei (Boonsawasd 2012: 130–131) in (62) and /tju²³/ ‘(cap)ability’ in Lachi (Kosaka 2000: 93–96) in (63), respectively.

(62) ɗak³⁵ zin²⁴ ni³¹ ɕi³³ VP[ɓaːŋ³¹ tɯk³³ zin²¹⁴ ɕia³¹] paː³³
     stone prox then prob cop god_stone ptcl
     ‘This stone may be a god stone.’ (Boonsawasd 2012: 130)

(63) ʔu⁴⁵ VP[jɔ²² sɛ⁴⁴ tju²³]
     horse pull cart cap
     ‘The horse can pull the cart.’ (Kosaka 2000: 94)

21.5.4 Clause linkage

In KD languages, clause subordination can be achieved with or without subordinating conjunctions. When conjunctions are used, they appear at the beginning of the subordinate clause. The subordinated sentences from Standard Zhuang (Y.-X. Luo 1990: 11–13) in (64) and from Long-haired Lachi (Kosaka 2000: 106–118) in (65) are good examples of adverbial clauses with overt causal and conditional conjunctions, respectively. Adverbial clauses denoting conditions, causes, or reasons are also often logically linked to main clauses by simple juxtaposition, as illustrated by the data in (66).

(64) [ʔan²⁴ vi³³ te²⁴ sam²⁴ di²⁴], pu⁴²~pu⁴² ɕuŋ³⁵ maːj⁵⁵ te²⁴
     because 3sg heart be.good, clf~clf all like 3sg
     ‘Being kind-hearted, he is liked by all.’ (Y.-X. Luo 1990: 11)

(65) [new⁴⁵ mɑ²² tja⁴³ wu²² lew²¹], lɛ²² kʰwi⁴³ le²¹ wu²² lew²¹
     if elder_brother go neg then 1sg also go neg
     ‘If you don’t go, I won’t go either.’ (Kosaka 2000: 107)

(66) [wu²² qũ⁴⁵ kʰĩ⁴³ nji²²], tjɔ⁴³ tju²³ la⁴⁵
     go in road prox arrive cap ptcl
     ‘If I take this road, can I get there?’ (Kosaka 2000: 107)

In most KD languages, perception and speech verbs may take subordinate clauses as complements without a subordinator. Examples of such languages include Standard Zhuang (Y.-X. Luo 1990: 12) and Long-haired Lachi (Kosaka 2000: 114–116) in (67) and (68), respectively. However, a small number of languages do require complement clauses to be introduced by a complementizer. For example, Thai requires all indirect quotes to be introduced by the complementizer /waː⁴²/, as illustrated by the example in (69).

(67) te²⁴ roː⁴² ɗe³³ [mɯŋ³¹ tow⁵⁵]
     3sg know 2sg come
     ‘He knows that you have come.’ (Y.-X. Luo 1990: 12)

(68) jo⁴⁵ qu²¹ tʰɛ⁴⁵ [tja⁴³ nji⁴ mja²² la.bjɔ²² mji.pi⁴⁵]
     hear person speak elder_sister dist be.kind
     ‘I hear that she is kind.’ (Kosaka 2000: 114)

(69) pʰom²⁴ chɯə⁴² [waː⁴² loːk⁴² ca rɔːn⁴⁵ kʰɯn⁴²]
     1sg believe comp earth pros be.hot ascend
     ‘I believe the earth will become hotter.’

Unlike the other two types of subordinate clauses, relative clauses show significant typological variation within KD. Many languages have true relative clauses, which come after the nouns they modify. Depending on the language, the relative clause may or may not be introduced by an overt relativizer. For example, the relative clause in Standard Zhuang (Y.-X. Luo 1990: 12–13) does not occur with a relativizer but requires a demonstrative at the end of the NP, as illustrated in (70). The same pattern is found in Maonan (Lu 2008: 205–207) and is exemplified in (71).

(70) ʔan²⁴ po²⁴ RC[saːŋ²⁴ tɯk⁵⁵ ju³³ laː⁵⁵ jaːw³⁵ bow⁵⁵ ran²⁴ neŋ³¹] … te²⁴ mi³¹ jaːŋ³³ jɯa²⁴ ni⁴² …
     clf mountain be.tall comp at bottom look neg see top … dist have kind herb prox
     ‘There are herbs of this kind in the mountain which is so high that one can’t see it from top to bottom …’ (Y.-X. Luo 1990: 13)

(71) ɗat²³ jaːn⁴² RC[seː⁴² ɲaːw²¹³] kaː⁴⁴ ˀniː⁴⁴ paːj⁴²
     clf house 2pl live dist be.small go
     ‘The house where you live is too small.’ (Lu 2008: 206)

Examples of languages that use overt relativizers are Nung Fan Sling (Saul and Wilson 1980: 78–81) and Zoulei (X. Li et al. 2014: 138). In the former, relative clauses may be optionally introduced by /tiː/, as shown in (72). In the latter, they are introduced by the relativizer /mie³³/, as shown in (73). Note that the NP does not necessarily end with a demonstrative and that the relativizer is optional in some contexts.

(72) keː¹³ RC[(tiː²¹˜) hɛt¹³ ɬaj⁴⁵ tʰaː¹³ huŋ²¹˜ niː³³], mɯn³³ …
     man rel do sorcerer foc 3sg …
     ‘The man who is a sorcerer, he …’ (Saul and Wilson 1980: 79)

(73) tau³¹ RC[mie³³ ie³¹ zi³¹ na⁵⁵ ʦən⁵⁵ χəɯ¹³] kɔ³³ sɔ³³ ʨʰi⁵ ti³¹ kɔ³³
     clf rel have money prox turn be.poor ptcl laugh inch ptcl
     ‘The person, who used to be wealthy but who is now impoverished, laughed.’ (X. Li et al. 2014: 138)

In contrast, many languages, mostly in China, do not have true relative clauses but use attributive phrases to modify the head noun, without formally distinguishing among the different types of modifiers. As in Chinese, the attributive phrase in Biao (Liang and J.-R. Zhang 2001: 114, 116–117) is marked by the associative marker /kɛ²²/ and precedes the head noun, as illustrated in (74). Following the KD head-modifier pattern, Long-haired Lachi (Kosaka 2000: 101–104) puts the attributive phrase, introduced by the associative marker /la/, after the noun, as in (75).

(74) [jɔ²¹⁴ ʦuŋ³⁵ kɛ²²] taw²²
     1pl plant assoc bean
     ‘The beans that we planted.’ (Liang and J.-R. Zhang 2001: 114)

(75) pu²³ [la=lo⁴⁵ ljɔ²²] qɑ.ɲa⁴⁵
     clothes assoc=grandchild wear be.beautiful
     ‘The clothes that the grandchild wears are beautiful.’ (Kosaka 2000: 102)

Interestingly, pre-nominal modifiers are becoming more common in a number of languages under Chinese influence. This means that both post-nominal relative clauses and pre-nominal attributive phrases are now possible in these languages, as illustrated by Tiānděng Zhuang (Langella 2012: 39–44) in (76) and (77), respectively.

(76) tu³⁵ maː³⁵ cɤɯ³¹ cɯŋ⁵³ pin²² tu³⁵ RC[van³³ ŋwaː³³ len³³~len³³ kʰɤn²² lɯn³³ maː³³]
     clf dog cop raise cop clf yesterday run~run ascend house come
     ‘We also raise dogs like the one that yesterday ran up into the house.’ (Langella 2012: 39)

(77) cɤɯ³¹ lɯk³¹ [kjaː³⁵ ʔok¹³ paj³⁵ ti³³] lok³¹ θaːw³⁵ maː³³
     irr invite move_out exit go assoc daughter come
     ‘We invite our daughters that have left home.’ (Langella 2012: 43)

21.6 Other interesting features

In addition to phonology and morphosyntax proper, KD languages also share a number of typological features that involve interaction between syntax and pragmatics. The interaction between different components of language is a very understudied area for KD languages.

21.6.1 Topic structure

Following the typology of C. N. Li and Thompson (1976), all languages in the family can be regarded as topic-prominent languages. In a topic-comment construction, the topic serves as the center of attention and always occupies the sentence-initial position. The topic structure can be explicitly marked by a prosodic break, a particle, or both. The topics in the Maonan (Lu 2008: 228–235) examples in (78) and (79) are both patient NPs that have been fronted from the object position. The difference is that the latter has an echo pronoun as a co-referent. In the Zoulei (X. Li et al. 2014: 227–230) sentence in (80), on the other hand, the topic is an adjunct and does not have a co-referent. Lastly, in (81) the topic is a series of listed NPs that has been fronted from the object position.

(78) ɕin⁴⁴ᵢ ‖ man²³¹ ɕaː⁵¹ [tᵢ] waːj⁵¹
     letter 3sg write crs
     ‘He has written the letter.’ (Lu 2008: 233)

(79) [ʔaj⁴² ɡiː²¹³ seŋ⁴² kaː⁴⁴]ᵢ ‖ ɦeː²³¹ kam⁵¹ maŋ²⁴ man²³¹ᵢ
     clf teacher dist 1sg not like 3sg
     ‘That teacher, I don’t like him.’ (Lu 2008: 234)

(80) [sai¹³ lu³¹ ke³³] ‖ ma⁵⁵ ve¹³ təɯ³³ su¹³ pu¹³ ple³³ sɯ⁵⁵ təɯ³³ ma⁵⁵.nu³¹
     cut grass ptcl mother 3sg clf mouth breast become one clf bird
     ‘(When she was) cutting grass, his mother’s nipple turned into a bird.’ (X. Li et al. 2014: 229)

(81) [[ei⁵⁵ kɔ³³] ‖ [zɔ³¹ kɔ³³] ‖ [sa¹³ kɔ³³] ‖ [zəɯ¹³ kɔ³³] ‖ [nɔ³¹ kɔ³³]]ᵢ ‖ ma⁵⁵.qəɯ⁵⁵ᵢ tu³³ u³³ na³³ ŋɤu⁵⁵
     Buyi ptcl Miao ptcl Han ptcl Gelao ptcl Nung ptcl everyone all way prox call
     ‘The Buyi, the Miao, the Han, the Gelao, the Nung, everyone calls them this way.’ (X. Li et al. 2014: 270)

Furthermore, it is not uncommon for the topic structure to lack explicit marking, as illustrated by data from Maonan (Lu 2008: 228–235) in (82) and Zoulei (X. Li et al. 2014: 228–229) in (83). Of special interest are cases in which the topic is an agent of a transitive verb. Without explicit marking, it is usually difficult to determine if the sentence-initial NP is a topic or a subject, as is the case for the Maonan sentence in (84). Given that topic and subject are not mutually exclusive notions, it is possible to treat such ambiguous cases as simultaneously subjects and topics.

(82) [ɦuː²⁴] juːn⁵¹ ɕuː⁴² ze²³¹ liːw⁴⁴
     rice all harvest complete crs
     ‘The rice has all been harvested.’ (Lu 2008: 231)

(83) [səw³³ wuai³³ na³³] wuai³³~wuai³³ ti³¹ mie³¹
     two day prox day~day drop rain
     ‘For the past two days, it has been raining every day.’ (X. Li et al. 2014: 228)

(84) [ɗɛ²³¹] jan⁴⁴ laːk²³ liːw⁴⁴
     father praise offspring crs
     ‘The father praised the son.’ (Lu 2008: 230)

21.6.2 Final particles

Another prominent feature in KD is the final particle, which can be roughly defined as a grammatical morpheme that is prosodically attached to the end of an intonational phrase and conveys grammatical, discourse, or sociolinguistic information (Pittayaporn and Pirachula 2012). Syntactic constituents that host these clitics may be single words, phrases, or clauses, as long as they form an intonational phrase. For instance, the particle /niː³³/ in Paha (J.-F. Li and Y.-X. Luo 2010: 29–31) in (85) occurs at the end of the topic phrase. Moreover, the Nung Fan Sling (Saul and Wilson 1980: 15, 37) particle /nɛː²¹/ in (86) is hosted by each of a series of listed NPs.

(85) kɔn³³ ʔɔn³³ paː⁵⁵ mai¹¹ ɲə⁵⁵ niː³³ | ŋuː¹¹ ɲiː³³ kiː³¹
     clf be.young sister that ptcl sleep up stairs
     ‘As for the young sister, she sleeps upstairs.’ (J.-F. Li and Y.-X. Luo 2010: 30)

(86) mɯn³³ ʔaw⁴⁵ man²¹˜ tʰoj²¹ nɛː²¹ | tʰuː²¹˜ nɛː²¹ | poːn³³ nɛː²¹ | ʔaw⁴⁵ maː³³
     3sg take pl bowl ptcl chopstick ptcl plate ptcl take come
     ‘She brought bowls, chopsticks, and plate.’ (Saul and Wilson 1980: 37)

Semantically speaking, final particles can be classified into three types: interrogative particles, modality particles, and status particles (Peyasantiwong 1982). Interrogative particles function as overt markers that structurally turn declarative clauses into yes-no interrogative clauses. All KD languages make use of this type of particle, exemplified by /kɔː²¹/ in Northern Thai (Wimonkasem 2012: 160–166) in (87) and /ja⁴⁵/ in Long-haired Lachi (Kosaka 2000: 123–124) in (88).

(87) saːw²⁴ kʰon³³ nan⁴⁵ ŋaːm³³ kɔː²¹
     girl clf dist be.beautiful ptcl
     ‘Is that girl beautiful?’ (Wimonkasem 2012: 162)

(88) qa.nji²² wu²² to²¹ bjɔ²² qa.le²³ ja⁴⁵
     here go place dist be.far ptcl
     ‘Is it far from here to that place?’ (Kosaka 2000: 123)

The second type is the modality particle, which expresses the speaker’s modal and epistemic knowledge and also marks information structure. This type of particle is found in all KD languages, although the inventory may vary in size from a few in Buyang (J.-F. Li 1999: 63–64) to over 30 in Thai (Cooke 1989). Northern Thai (Wimonkasem 2012: 145–150), for example, has a rich array of modality particles including /kɔːj³³/ in (89), which adds an assertive tone to the message, and /nɔː⁴²/ in (90), which is used in requests or persuasions. Another example is Long-haired Lachi (Kosaka 2000: 104–105) in (91), which has several particles including /wɔ²²/ for expressing determination in the statement.

(89) man³³ kɔː²¹ ɲaʔ⁴⁵ tɛː⁴⁵~tɛː⁴⁵ kɔːj³³
     3sg tcl do be.real~be.real ptcl
     ‘She/he really did it!’ (Wimonkasem 2012: 146)

(90) kin²⁴ kʰaw⁴⁴ kɔːn²¹ nɔː⁴²
     eat rice before ptcl
     ‘I’ll eat first then.’ (Wimonkasem 2012: 149)

(91) kʰu⁴³ wu²² wɔ²²
     1sg go ptcl
     ‘I will go even if you don’t.’ (Kosaka 2000: 105)


The last type of final particle is the status particle, which indexes relationships between discourse participants. In contrast to the other types, status particles are attested in a much smaller number of southerly languages. To illustrate, female speakers of Northern Thai (Wimonkasem 2012: 144–146) use /caw⁴⁴/ to convey politeness, as in (92). Interestingly, status particles can occur alone in responding to a call or acknowledging a statement.

(92) ʔaː³³.caːn²⁴ caw⁴⁴ ‖ juː²¹ kɔː²¹ caw⁴⁴
     professor ptcl stay ptcl ptcl
     ‘Professor, are you here, sir/ma’am?’ (Wimonkasem 2012: 145)

In addition, multiple particles can be stacked at the end of the same intonational phrase, but they are typically ordered so that the interrogative particles come first and the status particles come last. The most elaborate system in KD is that of Thai, which allows up to four particles in a single sequence. An intriguing example is (93), which puts the modality particle /niə⁴²/ after the interrogative particle /maj³⁴/, the modality particle /la²¹/, and the status particle /wa/.

(93) kuː³³ ca ruː³⁴ maj³⁴ la²¹ wa niə⁴²
     1sg pros know ptcl ptcl ptcl ptcl
     ‘How the hell would I know that!’
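
The typical ordering can be stated as a simple rank check over particle types. The Python sketch below is only an illustration: the function name `follows_typical_order` is hypothetical, and placing modality particles between interrogative and status particles is an assumption of the sketch, since the text specifies only which types come first and last.

```python
# Typical ordering of stacked final particles. The text states only that
# interrogative particles come first and status particles last; ranking
# modality particles in between is an assumption of this sketch.
TYPICAL_ORDER = ["interrogative", "modality", "status"]


def follows_typical_order(particle_types):
    """Return True if the particle types never step back to an earlier rank."""
    ranks = [TYPICAL_ORDER.index(kind) for kind in particle_types]
    return all(left <= right for left, right in zip(ranks, ranks[1:]))


# Example (92): interrogative /kɔː²¹/ followed by status /caw⁴⁴/.
example_92 = ["interrogative", "status"]
# Example (93): /maj³⁴/, /la²¹/, /wa/, /niə⁴²/, typed as in the text.
example_93 = ["interrogative", "modality", "status", "modality"]

print(follows_typical_order(example_92))  # True
print(follows_typical_order(example_93))  # False
```

Under these assumptions the sequence in (93) fails the check because /niə⁴²/ follows the status particle, which is what makes the example unusual.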

21.7 Conclusion

From a typological point of view, the KD languages form a relatively homogeneous language family. They are prototypical tonal, isolating, head-initial, SVO languages with very few deviations. Phonologically, they do show considerable cross-linguistic variation, but the differences lie in details rather than in system types. Grammatically, they are even more uniform, displaying typological differences only in a few morphosyntactic traits.

References

Aikhenvald, Alexandra Y. 2006. Serial verb constructions in typological perspective. In Alexandra Y. Aikhenvald & R. M. W. Dixon (eds.), Serial verb constructions: A cross-linguistic typology, 1–68. Oxford: Oxford University Press.
Ansaldo, Umberto & Stephen Matthews. 2017. Typology of Sinitic. In Rint Sybesma (ed.), Encyclopedia of Chinese language and linguistics, vol. 4, 463–466. Leiden: Brill.
Benedict, Paul K. 1942. Thai, Kadai and Indonesian: A new alignment in south-eastern Asia. American Anthropologist 44. 576–601.
Benedict, Paul K. 1975. Austro-Thai language and culture, with a glossary of roots. New Haven, CT: Human Relations Area Files Press.




Bennett, J. Fraser. 1995. Metrical foot structure in Thai and Kayah Li: Optimality-theoretic studies in the prosody of two Southeast Asian languages. Urbana & Champaign, IL: University of Illinois PhD dissertation.
Bisang, Walter. 1996. Areal typology and grammaticalization: Processes of grammaticalization based on nouns and verbs in East and South East Asian languages. Studies in Language 20(3). 519–597.
Boonsawasd, Attasith. 2012. A grammar of Bouyei. Nakhon Pathom: Mahidol University PhD dissertation.
Brunelle, Marc, James Kirby, Alexis Michaud & Justin Watkins. 2020. Prosodic systems: Mainland Southeast Asia. In Carlos Gussenhoven & Aoju Chen (eds.), The Oxford handbook of language prosody, 344–354. Oxford: Oxford University Press.
Burusphat, Somsonge & Xiaohang Qin. 2012. Zhuang word structure. Journal of Chinese Linguistics 40(1). 56–83.
Clements, G. N. 1990. The role of the sonority cycle in core syllabification. In John Kingston & Mary E. Beckman (eds.), Papers in laboratory phonology I: Between the grammar and physics of speech, 283–333. Cambridge, New York, Port Chester, Melbourne & Sydney: Cambridge University Press.
Cooke, Joseph R. 1989. Thai sentence particles: Forms, meaning and formal-semantic variations. In Joseph R. Cooke (ed.), Thai sentence particles and other topics, 1–90. Canberra: Pacific Linguistics.
Diller, Anthony V. N., Jerold A. Edmondson & Yongxian Luo. 2008. The Tai-Kadai languages. London & New York: Routledge.
Eberhard, David M., Gary F. Simons & Charles D. Fennig. 2019. Ethnologue: Languages of the world, 22nd edn. https://www.ethnologue.com (accessed 23 January 2020).
Edmondson, Jerold A. & Jinfang Li. 2003. Red Gelao, the most endangered from the Gelao languages. Paper presented at the 36th International Conference on Sino-Tibetan Languages and Linguistics, La Trobe University, Melbourne.
Edmondson, Jerold A. & David B. Solnit (eds.). 1988. Comparative Kadai: Linguistic studies beyond Tai. Dallas: Summer Institute of Linguistics and the University of Texas at Arlington.
Edmondson, Jerold A. & David B. Solnit (eds.). 1997. Comparative Kadai: The Tai branch. Dallas: Summer Institute of Linguistics and the University of Texas at Arlington.
Enfield, Nick J. 2003. Linguistic epidemiology: Semantics and grammar of language contact in mainland Southeast Asia. London: RoutledgeCurzon.
Enfield, Nick J. 2007. A grammar of Lao. Berlin: Mouton de Gruyter.
Ferlus, Michel. 1985. Les emprunts mons en thaï et en laos. In Suriya Ratanakul, David Thomas & Suwilai Premsrirat (eds.), Southeast Asian Linguistic Studies presented to André-G. Haudricourt, 217–233. Nakhon Pathom, Thailand: Institute of Language and Culture for Rural Development.
Gedney, William J. 1965. Indic loanwords in spoken Thai. New Haven: Yale University PhD dissertation.
Gedney, William J. 1970. The Saek language of Nakhon Phanom Province. Journal of the Siam Society 58. 67–87.
Gedney, William J. 1989. Speculations on early Tai tones. In Robert J. Bickner, John Hartmann, Thomas J. Hudak & Patcharin Peyasantiwong (eds.), Selected papers on Comparative Tai studies, 207–228. Ann Arbor: Center for South and Southeast Asian Studies, University of Michigan.
Gedney, William J. 1993. Saek final -l: archaism or innovation? In Thomas J. Hudak (ed.), William J. Gedney’s Saek language: Glossary, texts, and translations, 917–974. Ann Arbor: Center for South and Southeast Asian Studies, University of Michigan.


Gyarunsut, Pranee [ปราณี กายอรุณสุทธิ์]. 1983. Chinese loanwords in Modern Thai [คำยืมภาษาจีนในภาษาไทยปัจจุบัน]. Bangkok: Chulalongkorn University MA thesis. (In Thai).
Haudricourt, André-Georges. 1960. Notes sur les dialectes des région de Moncay. Bulletin de l’École Française d’Extrême-Orient 50. 167–177.
Haudricourt, André-Georges. 1961. Bipartition et tripartition des systèmes de tons dans quelques langages d’extrême orient. Bulletin de la Société de Linguistique de Paris 56. 163–180.
Himmelmann, Nikolaus P. 2005. The Austronesian languages of Asia and Madagascar: Typological characteristics. In K. Alexander Adelaar & Nikolaus P. Himmelmann (eds.), The Austronesian languages of Asia and Madagascar, 110–181. London & New York: Routledge.
Hoang Van Ma. 1997. The sound system of the Tày language of Cao Bằng Province, Vietnam. In Jerold A. Edmondson & David B. Solnit (eds.), Comparative Kadai: The Tai branch, 221–235. Dallas: Summer Institute of Linguistics and the University of Texas at Arlington.
Hudak, Thomas J. 1991a. William J. Gedney’s The Tai dialect of Lungming: Glossary, texts, and translations. Ann Arbor: Center for South and Southeast Asian Studies, University of Michigan.
Hudak, Thomas J. 1991b. William J. Gedney’s The Yay language: Glossary, texts, and translations. Ann Arbor: Center for South and Southeast Asian Studies, University of Michigan.
Hudak, Thomas J. 1993. William J. Gedney’s The Saek language: Glossary, texts, and translations. Ann Arbor: Center for South and Southeast Asian Studies, University of Michigan.
Hudak, Thomas J. 1996. William J. Gedney’s The Lue language: Glossary, texts, and translations. Ann Arbor: Center for South and Southeast Asian Studies, University of Michigan.
Huffmann, Franklin E. 1986. Khmer loanwords in Thai. In Robert J. Bickner, Thomas J. Hudak & Patcharin Peyasantiwong (eds.), Papers from a conference on Thai studies in honor of William J. Gedney, 199–210. Ann Arbor: Center for South and Southeast Asian Studies, University of Michigan.
Jackson, Eric. 2019. Two-part negation in Yang Zhuang. Journal of the Southeast Asian Linguistics Society 12(1). 52–82.
Jacques, Guillaume. 2017. On the status of Buyang presyllables: A response to Professor Ho Dah-An. Journal of Chinese Linguistics 45(2). 451–457.
Jenks, Peter & Pittayawat Pittayaporn. 2014. Kra-Dai languages. In Mark Aronoff (ed.), Oxford bibliographies in “linguistics”. DOI: 10.1093/obo/9780199772810-0178.
Jenny, Mathias. 2012. The Mon language: Recipient and donor between Burmese and Thai. Journal of Language and Culture 31(2). 5–33.
Karavi, Premin. 1996. Khmer loanwords: The linguistics alien fossilized in the southern Thai dialect. In Pan-Asiatic linguistics: Proceedings of the Fourth International Symposium on Languages and Linguistics, January 8–10, 1996, 1037–1050. Nakhon Pathom, Thailand: Institute of Language and Culture for Rural Development, Mahidol University.
Karavi, Premin [เปรมินทร์ คาระวี]. 1996. Khmer loanwords in Southern Thai language [คำยืมภาษาเขมรในภาษาไทยถิ่นใต้]. Bangkok: Chulalongkorn University PhD dissertation. (In Thai).
Khanittanan, Wilaiwan. 1986. Kamti Tai: From an SVO to an SOV language. In Bhadriraju Krishnamurti (ed.), South Asian Languages: Structure, convergence and diglossia, 174–178. Delhi: Motilal Banarsidass.
Khanthaphad, Phannida [พรรณิดา ขันธพัทธ์]. 2019. Burmese borrowing words in the Shan language [คำยืมภาษาพม่าในภาษาไทใหญ่]. Silpakorn University Journal 39(3). 119–132. (In Thai).
Kosaka, Ryuichi. 1997. On the loans of Vietnamese origin in the Saek language. In Arthur S. Abramson (ed.), Southeast Asian Linguistics Studies in Honor of Vichin Panupong, 127–146. Bangkok: Chulalongkorn University Press.




Kosaka, Ryuichi. 2000. A descriptive study of the Lachi language: Syntactic description, historical reconstruction and genetic relation. Tokyo: Tokyo University of Foreign Studies PhD dissertation. Kullavanijaya, Pranee [ปราณี กุลละวณิชย์]. 2001. Tai Nuea language [ภาษาไทเหนือ]. Bangkok: Faculty of Arts, Chulalongkorn Univerisity. (In Thai). L-Thongkum, Theraphan. 1993. A preliminary reconstruction of Proto-Lakkja (Cha Shan Yao). Mon-Khmer Studies 20. 57–89. L-Thongkum, Theraphan. 1997. Implications of the retention of proto-voiced plosives and fricatives in the Dai Tho language of Yunnan province for a theory of tonal development and Tai language classification. In Jerold A. Edmondson & David B. Solnit (eds.), Comparative Kadai: The Tai branch, 191–220. Dallas: Summer Institute of Linguistics and the University of Texas at Arlington. Lan, Qingyuan [蓝庆元]. 2011. A study in Lakkja language [拉珈语研究]. Nanning: Guangxi Normal University Press. (In Chinese). Langella, François. 2012. The noun phrase structure in the Zhuang dialect of Tian Deng. Bangkok: Chulalongkorn University MA thesis. Li, Charles N. & Sandra A. Thompson. 1976. Subject and topic: A new typology of language. In Charles N. Li (ed.), Subject and topic, 457–489. New York: Academic Press. Li, Fang-Kuei. 1965. The Tai and the Kam-Sui languages. Lingua 14. 148–179. Li, Fang-Kuei. 1976. Sino-Tai. Computational Analysis of Asian and African Languages 3. 39–48. Li, Fang-Kuei. 1977. Handbook of comparative Tai. Honolulu: University of Hawai’i Press. Li, Fang-Kuei. [李方桂]. 1940. The Tai dialect of Longzhou [龍州土語]. Shanghai: The Commercial Press. Li, Fang-Kuei. [李方桂]. 2011. A collection of articles in Kra-Dai languages [侗台語論文集], Bang-xin Ding & Ai-qin Xu [丁邦新, & 余霭芹] (eds.). Beijing: Tsinghua University Press. (In Chinese). Li, Jinfang. 2010. Language contact between Geyang and Yi. Language and Linguistics 28(2). 13–24. Li, Jinfang & Yongxian Luo. 2010. 
The Buyang language of south China: Grammatical notes, glossary, texts and translations. Canberra: Pacific Linguistics. Li, Jinfang [李锦芳]. 1999. A study in Buyang language [布央研究]. Beijing: China Minzu University Press. (In Chinese). Li, Xia, Jinfang Li & Yongxian Luo. 2014. A grammar of Zoulei, Southwest China. Bern: Peter Lang. Li, Yunbing. [李云兵]. 2000. A study in Lachi language [拉基语研究]. Beijing: China Minzu University Press. (In Chinese). Liang, Min & Junru Zhang [梁敏, & 张均如]. 1996. Introduction to Kra-Dai languages [侗台语族概 论]. Beijing: China Social Sciences Press. (In Chinese). Liang, Min & Junru Zhang [梁敏, & 张均如]. 1997. A study in Be language [临高语研究]. Shanghai: Shanghai Far-East Press. (In Chinese). Liang, Min & Junru Zhang [梁敏, & 张均如]. 2001. A study in Biao language [标话研究]. Beijing: China Minzu University Press. (In Chinese). Long, Yaohong & Guoqiao Zheng. 1998. The Dong language in Guizhou Province, China. Arlington, TX: Summer Institute of Linguistics. Lu, Tianqiao. 2008. A grammar of Maonan. Boca Raton, FL: Universal-Publishers. Lu, Tianqiao. 2012. Classifiers in Kam-Tai languages: A cognitive and cultural perspective. Boca Raton, FL: Universal Publishers. Luo, Yongxian. 1990. Tense and aspect in Zhuang: A study of a set of tense and aspect markers. Canberra: Australian National University MA thesis.


Manomaivibool, Prapin. 1976. Layers of Chinese loanwords in Thai. In Thomas W. Gething, Jimmy G. Harris & Pranee Kullavanijaya (eds.), Tai linguistics in honor of Fang-Kuei Li, 179–184. Bangkok: Central Institute of English Language, Office of State Universities. Matisoff, James A. 2001. Genetic versus contact relationship: Prosodic diffusibility in South-East Asian languages. In Alexandra Y. Aikhenvald & R. M. W. Dixon (eds.), Areal diffusion and genetic inheritance: Problems in comparative linguistics, 291–327. Oxford & New York: Oxford University Press. Morey, Stephen. 2005a. The Tai languages of Assam: A grammar and texts. Canberra: Pacific Linguistics. Morey, Stephen. 2005b. Tonal change in the Tai languages of Northeast India. Linguistics of the Tibeto-Burman Area 28(2). 145–212. Morey, Stephen. 2006. Constituent order change in the Tai languages of Assam. Linguistic Typology 10(3). 327–367. Mortensen, David. 2003. Hmong elaborate expressions are coordinate compounds. UC Berkeley. https://www.cs.cmu.edu/~dmortens/papers/elaborate_expressions.pdf (accessed 23 January 2020). Needham, J. F. 1894. Outline grammar of the Khâmtî language, as spoken by the Khâmtîs residing in the neighbourhood of Sadiya, with illustrative sentences, phrase-book and vocabulary. Rangoon: Superintendent of Government Printing. Nishida, Tatsuo. 1975. Common Tai and Archaic Chinese. Studia Phonologica 9. 2–12. Norquest, Peter K. 2015. A phonological reconstruction of Proto-Hlai. Leiden & Boston: Brill. Ostapirat, Weera. 1995. Notes on Laha final -l. Linguistics of the Tibeto-Burman Area 18(1). 173–181. Ostapirat, Weera. 2000. Proto-Kra. Linguistics of the Tibeto-Burman Area 23(1). 1–251. Ostapirat, Weera. 2005. Kra-Dai and Austronesian: Notes on phonological correspondences and vocabulary distribution. In Laurent Sagart, Roger Blench & Alicia Sanchez-Mazas (eds.), The peopling of East Asia: Putting together archaeology, linguistics and genetics, 107–131. 
London & New York: RoutledgeCurzon. Ostapirat, Weera. 2008. The Hlai language. In Anthony V. N. Diller, Jerold A. Edmondson & Luo Yongxian (eds.), The Tai-Kadai languages, 623–652. London & New York: Routledge. Ouyang, Jueya [欧阳觉亚]. 1998. Studies in Cun language [村语研究]. Shanghai: Shanghai Far-East Press. (In Chinese). Ouyang, Jueya & Yiqing Zheng [欧阳觉亚, & 郑贻青]. 1983. A survey research in Hlai languages [黎语调查研究]. Beijing: China Social Sciences Press. (In Chinese). Peyasantiwong, Patcharin. 1981. A study of final particles in conversational Thai. Ann Arbor: University of Michigan PhD dissertation. Pittayaporn, Pittayawat. 2014. Layers of Chinese loanwords in Proto-Southwestern Tai. MANUSYA Journal of Humanities, Special Issue 20. 47–68. Pittayaporn, Pittayawat. 2015. Typologizing sesquisyllabicity: The role of structural analysis in the study of linguistic diversity in Mainland Southeast Asia. In Nick J. Enfield & Bernard Comrie (eds.), The languages of Mainland Southeast Asia: The state of the art, 500–528. Berlin & Boston: Walter de Gruyter. Pittayaporn, Pittayawat & James Kirby. 2017. Laryngeal contrasts in the Tai dialect of Cao Bằng. Journal of the International Association of Phonetics 47(1). 65–85. Pittayaporn, Pittayawat & Chulanon Pirachula. 2012. Syntactically naughty? Prosody of final particles in Thai. In Tadao Miyamoto, Satoshi Uehara & Kingkarn Thepkanjana (eds.), Typological studies on languages in Thailand and Japan, 13–28. Sendai, Japan: Tohoku University Press. Post, Mark. 2008. Adjectives in Thai: Implications for a functionalist typology of word classes. Linguistic Typology 12(3). 339–381. Prasithrathsint, Amara. 2000. Adjectives as verbs in Thai. Linguistic Typology 4(2). 251–271.




Prasithrathsint, Amara [อมรา ประสิทธิรัฐสินธุ์]. 2010. Parts of speech in Thai: A syntactic analysis [ชนิดของคำ�ในภาษาไทย: การวิเคราะห์ทางวากยสัมพันธ์]. Bangkok: Chulalongkorn University Printing House. (In Thai). Ross, Peter A. 1996. Dao Ngan Tay: A B-language in Vietnam. Mon-Khmer Studies 25. 133–139. Sagart, Laurent. 2004. The higher phylogeny of Austronesian and the position of Tai-Kadai. Oceanic Linguistics 43(2). 411–444. Sagart, Laurent. 2005. Tai-Kadai as a subgroup of Austronesian. In Laurent Sagart, Roger Blench & Alicia Sanchez-Mazas (eds.), The peopling of East Asia: Putting together archaeology, linguistics and genetics, 177–181. London & New York: RoutledgeCurzon. Saul, Janice E. & Nancy F. Wilson. 1980. Nung grammar. Dallas: Summer Institute of Linguistics, University of Texas at Arlington. Snyder, Donna. 1998. Folk wisdom in Bouyei proverbs and songs. In Somsonge Burusphat (ed.), The International Conference on Tai Studies, Bangkok, Thailand, 61–88. Institute of Language and Culture for Rural Development, Mahidol University. Snyder, Will C. & Tianqiao Lu. 1997. Wuming Zhuang tone sandhi: A phonological, syntactic, and lexical investigation. In Jerold A. Edmondson & David B. Solnit (eds.), Comparative Kadai: The Tai branch, 107–137. Dallas: The Summer Institute of Linguistics and the University of Texas at Arlington. Srithawong, Suparat, Metawee Srikummool, Pittayawat Pittayaporn, Silvia Ghirotto, Panuwan Chantawannakul, Jie Sun, Arthur Eisenberg, Ranajit Chakraborty & Wibhu Kutanan. 2015. Genetic and linguistic correlation of the Kra-Dai-speaking groups in Thailand. Journal of Human Genetics 60(7). 371–380. Suthiwan, Titima & Uri Tadmor. 2009. Loanwords in Thai. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages: A comparative handbook, 599–616. Berlin: Mouton de Gruyter. Thurgood, Graham. 1988. Notes on the reconstruction of Proto-Kam-Sui. In Jerold A. Edmondson & David B. 
Solnit (eds.), Comparative Kadai: Linguistic studies beyond Tai, 179–218. Dallas: Summer Institute of Linguistics and the University of Texas at Arlington. Toan, Vuong. 1992. The Tay-Nung language in contact with the Vietnamese. In Pan-Asiatic Linguistics: The Third International Symposium on Language and Linguistics, 909–919. Bangkok: Chulalongkorn University. Varasarin, Uraisri. 1984. Les éléments khmers dans la formation de la langue siamoise. Paris: Société d’études linguistiques et anthropologiques de France. Wang, Jun & Guoqiao Zheng [王均, & 郑国乔]. 1980. A brief record in Mulam language [仫佬语简 志]. Beijing: Publishing House of Minority Nationalities. (In Chinese). Wang, Stephen S. 1966. Phonology of Chinese loanwords in a northern Tai dialect. Seattle: University of Washington PhD dissertation. Wimonkasem, Kannika [กรรณิการ์ วิมลเกษม]. 2012. Northern Thai language [ภาษาไทยถิ่นเหนือ]. Nonthaburi: Department of Eastern Languages, Faculty of Archeology, Silpakorn University. (In Thai). Wulff, K. 1934. Chinesisch und Tai: Sprachvergleichende Untersuchungen. København: Levin und Munksgaard, Ejnar Munksgaard. Xing, Gongwan [邢公畹]. 1999. A handbook of comparative Sino-Tai [汉台语比较手册]. Beijing: Commercial Press. (In Chinese). Yang, Tongyin. [杨通银]. 2000. A study in Mak language [莫语研究]. Beijing: China Minzu University Press. (In Chinese). Yip, Moira. 2002. Tone. Cambridge: Cambridge University Press. Zec, Draga. 2007. The syllable. In Paul de Lacy (ed.), The Cambridge handbook of phonology, 161–194. Cambridge: Cambridge University Press.


Zhang, Junru. 1990. The Pubiao language. Kadai 2. 23–34. Zhang, Jimin [张済民]. 1993. A study in Gelao language [仡佬语研究]. Guiyang: Guizhou Nationality Publishing House. (In Chinese). Zhang, Junru [张均如]. 1980. A brief record in Sui language [水语简志]. Beijing: Publishing House of Minority Nationalities. (In Chinese). Zhang, Junru, Min Liang, Jueya Ouyang, Yiqing Zheng, Xulian Li & Jianyou Xie [张均如, 梁敏, 欧阳觉亚, 郑贻青, 李旭练, & 谢建猷]. (1999). A study in Zhuang dialects [壮语方言研究]. Chengdu: Sichuan People’s Publishing House. (In Chinese). Zhang, Junru, Jueya Ouyang, Yiqing Zheng, Xulian Li & Jianyou Xie [张君如, 欧阳觉呀, 郑贻青, 李旭练, & 谢建猷]. 1999. Zhuang dialect studies [壮语方言研究]. Chengdu: Sichuan Nationalities Press. Zheng, Guoqiao. 1988. The influences of Han on the Mulam language. In Jerold A. Edmondson & David B. Solnit (eds.), Comparative Kadai: Linguistic studies beyond Tai, 167–178. Dallas: The Summer Institute of Linguistics and the University of Texas at Arlington. Zheng, Yiqing [郑贻青]. 1996. A study in Jingxi Zhuang language [靖西壮语研究]. Beijing: China Social Sciences Press. (In Chinese). Zhou, Yaowen & Meizhen Luo [周耀文, & 罗美珍]. 2001. A study in Tai dialects [傣语方言研究]. Beijing: Publishing House of Minority Nationalities. (In Chinese).

Mark J. Alves

22 Typological profile of Vietic

22.1 Introduction

This paper provides a synchronic typological overview of the Vietic languages, with only brief, broad comments on historical linguistic matters insofar as they have been factors in the typological range. The Vietic branch of Austroasiatic is a powerful example of the typological transition from an Austroasiatic to a Sinitic typological template. At one extreme, Vietnamese has essentially a CVC syllable template with /w/ as the only medial in onset clusters, a monosyllabic structure in lexical roots, a complex tone system, and a robust two-syllable word-formation pattern, all paralleling those of Chinese languages. At the other extreme, the archaic Vietic languages have many polysyllabic morphemes with sesquisyllabic word structure, simple tone systems or only register/phonation systems, and word-formation strategies with presyllables and infixes (though many of these are fossilized). In terms of language contact, Vietnamese has correspondingly had the most extensive contact with and linguistic influence from Chinese, while the archaic languages have had much less. However, the archaic languages have borrowed a notable quantity of both content and functional vocabulary from Vietnamese in Vietnam and from Thai or Lao in northern Thailand, and some of these words are of Chinese origin.

These divergent Vietic languages are not without their similarities. There are many shared features, including both retentions from earlier stages of Vietic and features reflecting Southeast Asian regional typological tendencies. They share palatal stops and nasals, a mid or back unrounded vowel, three primary diphthongs /iə, ɨə, uə/, Austroasiatic-style reduplication with alternating segments (i.e. alliteration, rhyming, and chiming), and a solid core of Austroasiatic etyma (e.g. basic numbers, body parts, etc.), among other retentions.
As for syntax, available data suggests very strong regional typological similarity in both clauses and noun phrases. The sub-branching of Vietic based on historical linguistic methodology is not yet agreed upon (cf. a historical overview of previous hypotheses by Sidwell [2009: 140–147] and the discussion of classification in chapter 11 of this volume). Thus, for this study, the main groups in Vietic for typological comparison are based primarily on morphophonological typology, parallel to the hypothesized branching of Sidwell (2015). These include (a) Viet-Muong languages, (b) Pong-Toum languages, and (c) archaic languages. The Pong-Toum and archaic languages can also be termed “Southern Vietic”, referring broadly to their shared historical proximity in contrast with the Viet-Muong languages to the north. As indicated, Viet-Muong languages are at one end of the spectrum (e.g. monosyllabic, relatively simple syllable structure, and complex tone systems), while archaic languages are at the other (e.g. polysyllabic, complex syllable structures, and limited or no tone systems), and Pong-Toum languages are somewhere in between (e.g. mostly monosyllabic, somewhat complex syllable structure, complex tone systems).

https://doi.org/10.1515/9783110558142-022

Other than Vietnamese, the available published data on Vietic languages is rather limited. There are phonological descriptions of all the languages in this study, but overall linguistic descriptions covering morphosyntax are available only for Ruc, May, and So Thavung, not for Cuoi, Pong, or varieties of Muong. The key sources used throughout this article include those in Table 1. In some cases, additional phonological and lexical data comes from the Mon-Khmer Etymological Dictionary.

Tab. 1: Language descriptions used in this study.

| Groups | Language | Sources |
| --- | --- | --- |
| Viet-Muong | Vietnamese | Thompson (1987); Brunelle (2015) |
| Viet-Muong | Muong | Nguyen V. K. et al. (2002); Nguyen V. T. (2005) |
| Pong-Toum | Cuoi | Ferlus (1994); Nguyễn H. H. (2009); Nguyen H. H. & Nguyen V. L. (2019) |
| Pong-Toum | Poong | Nguyen T. L. (1992) |
| Archaic languages | Ruc | Nguyen V. L. (1993); Solntsev et al. (2001) |
| Archaic languages | May | Babaev and Samarina (2018: 38) |
| Archaic languages | Arem | Ferlus (2014) |
| Archaic languages | Thavung | Premsrirat (1996); Srisakorn (2008) |
| Archaic languages | Maleng Bro | Ferlus (1997) |
| Archaic languages | Kri | Enfield and Diffloth (2009) |

22.2 Phonology

In various ways, the consonant and vowel phoneme inventories of Vietic languages are comparable, for example in having similar numbers of initial and final consonants and registral features on main-syllable vowels. However, the overall prosodic word, phonotactics, and suprasegmental features of Vietic languages range from a typical Austroasiatic typology (e.g. sesquisyllabic words, some affixation, registral but not tonal) to one resembling Sinitic or Tai languages (e.g. largely isolating morphology, complex tone systems, CVC or CCVC syllables).

The typology of Vietic morphophonology appears to have converged with Sinitic through language contact (see “Linguistic influence of Chinese in Southeast Asia”, chapter 27 in this volume). While initial Vietic-Sinitic contact occurred at the end of the Old Chinese period, when pre-syllabic material was more complex in Sinitic, the time of most intense contact with Sinitic was during Early Middle Chinese in the early to mid-1st millennium CE, when Sinitic had lost its presyllabic material. In this situation, combined with significant amounts of bilingualism (cf. Phan 2013), many monosyllabic words were borrowed into Vietic. It thus appears that this intense language contact ultimately led to the north-south typological distinction in Vietic and the present typological range. In the following subsections, the typological range of Vietic consonants, vowels, and tonal and registral systems is summarized.

22.2.1 Consonants

Most Vietic languages have modest segmental inventories, as shown in Table 2. In Table 2, numbers of consonants in parentheses reflect sounds which occur only in loanwords or have a marginal presence in a language. Only a small subset of the total inventories (typically excluding voiced and aspirated stops or affricates) can occur in word-final position. The following is a summary of the sound classes.

– Stops and nasals: All sub-branches of Vietic have at least a four-way labial-alveolar-palatal-velar distinction among both nasals and voiceless stops. These are the only stops consistently permitted in word-final position (i.e. not affricates, retroflexes, or voiced stops).
– Retroflexes: Retroflex initials /ʈ/ and /ʂ/ are seen in multiple sub-branches in Vietic, including Viet-Muong (Vietnamese, some varieties of Muong), Pong-Toum (Cuoi), and archaic languages (May, Ruc, Kri). Only southern and central Vietnamese have /ɽ/: it is not listed in the inventories of other Vietic languages in this study. All Vietic languages which have any retroflex sounds have the voiceless retroflex stop /ʈ/, and most have the voiceless retroflex fricative /ʂ/, but there is no observable pattern, and some have no retroflexes. In Vietic, retroflex consonants occur only as initials in main syllables. One exception is that the liquid /ɽ/ appears word-finally in May.
– Aspirated consonants: All sub-branches have voiceless aspirated consonants, but they vary in number. Vietnamese has only aspirated /tʰ/ (as earlier *pʰ merged with /f/); Cuoi has /tʰ/ and /kʰ/; May has /pʰ/, /tʰ/ and /kʰ/; and Kri has /pʰ/, /tʰ/, /ʈʰ/ and /kʰ/. Thus, only the aspirated voiceless alveolar stop is seen in all these Vietic languages. Aspirated stops occur only as initials in main syllables.
– Voiced bilabial fricative: While *β was a part of Middle Vietnamese which merged with Vietnamese /v/ (e.g. Gregerson 1969: 150–151), in modern Vietic, /β/ is retained in some varieties of Muong, Kri, and May. It is not a common phoneme among languages in the region.
– Pre-glottalized/imploded stops: Imploded stops appear in several different Vietic languages in all three groups. Those with them have /ɓ/ and /ɗ/, with palatal /ʄ/ in only a couple of languages. They do not co-occur in phonemic systems with non-imploded voiced stops.
– Velar fricatives: While Vietnamese has both voiced and voiceless velar fricatives, voiced /ɣ/ is seen in varieties of Muong and some Southern Vietic languages. The voiced velar fricative has only a marginal presence in Cuoi and May. Such sounds appear only in initial position in the languages.
– Liquids and glides: All varieties of Vietic have multiple liquids, including laterals and rhotics, and the glides /w/ and /j/, all of which can occur word-initially. Vietnamese and Thavung are the only languages that do not allow final liquids /r/ or /l/.

Tab. 2: Numbers and features of consonants.

| Features | Viet | Muong varieties | Cuoi | Poong | Ruc | May | Arem | Thavung | Kri | Maleng Bro |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No. of initials | 22 (24) | 19 to 25 | 21 | ? | 23 | 23 (25) | 23 (25) | 20 | 24 | 17 |
| No. of finals | 10 | 11 | 11 | 13 | 14 | 14 (17) | 12 | 12 | 13 | 12 |
| Retroflex phonemes | ɽ ʂ ʈ (Central and Southern) | ʂ ʈ ʐ (Region 4 only) | none | none | ʂ ʈ | ʂ ʈ ɽ | none | none | ʈ ʈʰ | none |
| Aspirated stops | tʰ | pʰ tʰ kʰ kʷʰ | pʰ tʰ kʰ | pʰ kʰ | pʰ tʰ kʰ | pʰ tʰ kʰ | pʰ tʰ kʰ | pʰ tʰ kʰ | pʰ tʰ ʈʰ kʰ | none |
| /β/ | no | yes (Regions 1, 3, 4, 5, 6) | no | no | yes | yes | no | no | no | no |
| Imploded stops | ɓ ɗ | none | ɓ ɗ | none | none | ɓ ɗ ʄ | none | none | ɓ ɗ ʄ | ɓ ɗ |
| Velar fricatives | x ɣ (Northern) | ɣ (Regions 1, 3, 5) | (ɣ) | none | none | (ɣ) | none | none | ɣ | none |
| Final fricatives | none | none | none | (-h) | -s, -h | -h | -h | -s, -h | none | none |
| Final liquids | none | -l (in 15 dialects) | -l | -l | -l -r | -l -ɽ | -l | none | -r -l | -r -l |
| Final glottal stop | none | none | none | yes | no | yes | no | yes | no | yes |

Note on varieties of Muong: The consonant phoneme inventories of the 30 dialects of Muong (divided into Regions 1 to 6) described by Nguyen V. T. (2005) range from 19 to 25 phoneme categories, with most around 23 (Nguyen V. T. 2005: 71–73). They vary in specific types of consonant codas but generally maintain 11 final consonants among the varieties (Nguyen V. T. 2005: 77–78). Fifteen of the 30 varieties have final /-l/, while 8 instead have final /-n/, and 6 have /-ɯ/ (Nguyen V. T. 2005: 78). A sample inventory of phonemes in varieties of Muong spoken in Region 5 is shown in Table 3.

Tab. 3: Consonants in varieties of Muong in Region 5 (Nguyen V. T. 2005: 72).

| | Labial | Alveolar | Palatal | Velar | Labialized velar |
| --- | --- | --- | --- | --- | --- |
| Nasals | m | n | ɲ | ŋ | |
| Voiceless stops | | t | c | k | kʷ |
| Voiced stops | b | d | | | |
| Aspirated stops | (pʰ) | (tʰ) | | kʰ | kʰʷ |
| Fricatives | β (~v) | s z | | (ɣ) | |
| Liquids | | l r | | | |
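Generalizations of this kind can be checked mechanically against Table 2. The following is a minimal sketch, assuming the retroflex and final-liquid rows of Table 2 are transcribed by hand into Python dictionaries; the names `RETROFLEXES`, `FINAL_LIQUIDS`, `languages_lacking`, and `shared_by_all_that_have_any` are illustrative and not part of any published dataset.

```python
# Two rows of Table 2, transcribed by hand; an empty tuple stands for "none".
# All names here are hypothetical helpers for illustration only.
RETROFLEXES = {
    "Vietnamese": ("ɽ", "ʂ", "ʈ"),  # Central and Southern varieties
    "Muong": ("ʂ", "ʈ", "ʐ"),       # Region 4 only
    "Cuoi": (),
    "Poong": (),
    "Ruc": ("ʂ", "ʈ"),
    "May": ("ʂ", "ʈ", "ɽ"),
    "Arem": (),
    "Thavung": (),
    "Kri": ("ʈ", "ʈʰ"),
    "Maleng Bro": (),
}

FINAL_LIQUIDS = {
    "Vietnamese": (),
    "Muong": ("l",),                # in 15 of the 30 dialects
    "Cuoi": ("l",),
    "Poong": ("l",),
    "Ruc": ("l", "r"),
    "May": ("l", "ɽ"),
    "Arem": ("l",),
    "Thavung": (),
    "Kri": ("r", "l"),
    "Maleng Bro": ("r", "l"),
}


def languages_lacking(row):
    """Return the languages whose cell in this row is empty ("none")."""
    return [language for language, sounds in row.items() if not sounds]


def shared_by_all_that_have_any(row, sound):
    """True if every language with a non-empty cell includes `sound`."""
    return all(sound in sounds for sounds in row.values() if sounds)


print(languages_lacking(FINAL_LIQUIDS))
print(shared_by_all_that_have_any(RETROFLEXES, "ʈ"))
```

With the cells as transcribed here, the first call lists Vietnamese and Thavung as the only languages without final liquids, and the second confirms that every language with any retroflex has /ʈ/. The sketch only restates the table, so it inherits any uncertainty in the underlying descriptions.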

22.2.2 Vowels

The sizes of vowel inventories among Vietic languages vary tremendously. Differences in the extent to which vowel length, and sometimes registral features on main-syllable vowels, is distinctive play a major role in the inventory sizes. Vietic languages typically have a core set of nine vowels, including three degrees of height and backness. However, the number of vowels distinguished by length ranges from none (Cuoi), to only mid-vowels /a/ and /ɤ/ (Vietnamese and varieties of Muong), to all basic vowels (Ruc, May, and Arem). The inventories with both length and registral features, such as a clear versus breathy distinction, are among the largest.

All vowels in the inventories can occur in main syllables, regardless of the sizes of the vowel systems. In contrast, in the presyllables of those languages with them, only a neutral vowel (e.g. /a/ or schwa) or a few vowels (e.g. /i/, /u/, /a/) may appear. There is no vowel length distinction in this unstressed position. Sizes of the vowel inventories are summarized in Table 4.

Tab. 4: Numbers of vowels and length distinctions.

| Group | Language | Number | Length distinction |
| --- | --- | --- | --- |
| Viet-Muong | Vietnamese | 10 or 11 | limited |
| Viet-Muong | Muong | 11 | limited |
| Pong-Toum | Cuoi | 10 | no |
| Archaic languages | Ruc | 18 | yes |
| Archaic languages | May | 18 | yes |
| Archaic languages | Thavung | 10 | no |
| Archaic languages | Arem | 41 (46) | yes (+ register) |
| Archaic languages | Kri | 18 | no (+ register) |
| Archaic languages | Maleng Bro | 30 | yes (+ register) |
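The way length and register multiply a core vowel set can be made concrete with a small calculation. This is a schematic sketch only: it assumes a fully symmetrical system in which every one of the nine core qualities takes part in every contrast, and the labels for height, backness, length, and register are illustrative choices rather than the analysis of any particular source.

```python
from itertools import product

# Assumed labels for the nine-vowel core (three heights x three backness values).
HEIGHTS = ("high", "mid", "low")
BACKNESS = ("front", "central", "back")


def inventory(length=False, register=False):
    """Enumerate a fully symmetrical vowel system (an idealization)."""
    lengths = ("short", "long") if length else ("short",)
    registers = ("clear", "breathy") if register else ("clear",)
    return list(product(HEIGHTS, BACKNESS, lengths, registers))


print(len(inventory()))                             # core qualities only
print(len(inventory(length=True)))                  # core x length
print(len(inventory(length=True, register=True)))   # core x length x register
```

Under these assumptions the three calls give 9, 18, and 36 vowels. The middle figure coincides with the 18 vowels listed for Ruc and May in Table 4, while attested systems need not be symmetrical: Arem, for example, has more than nine basic qualities in its clear series, so its count exceeds what this idealization predicts.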

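Regarding diphthongs, all available descriptions show at minimum three, starting with high vowels of three degrees of backness and ending in a central vowel: /iə/, /ɯə/, and /uə/. There can be some assimilation, such as /uə/ realized as [uo], with both backing and rounding. Table 5 illustrates this with Arem, which has both clear and breathy diphthongs.

Tab. 5: Vowels in the Arem language (Ferlus 2014: 5).

Breathy vowels

| Front short | Front long | Central short | Central long | Back short | Back long |
| --- | --- | --- | --- | --- | --- |
| ì | ìː | ɨ̀ | ɨ̀ː | ù | ùː |
| è | (èː) | (ə̀) | ə̀ː | ò | òː |
| | | ɐ̀ | ɐ̀ː | | |

Breathy diphthongs: ìe, ɨ̀ə, ùo

Clear vowels

| Front short | Front long | Central short | Central long | Back short | Back long |
| --- | --- | --- | --- | --- | --- |
| i | (iː) | ɨ | ɨː | u | uː |
| ɪ | ɪː | ʉ | ʉː | ʊ | ʊː |
| (e) | eː | (ɐ) | ɐː | o | oː |
| ɛ | ɛː | ʌ | ʌː | ɔ | ɔː |
| æ | æː | a | aː | ɑ | ɑː |

Clear diphthongs: ie, ɨə, uo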

22.2.3 Tones and phonation

Suprasegmental features include both phonation (i.e. glottalization, breathiness, tension, and creakiness) and tones, and often a blend of these two phenomena, but to varying degrees among the Vietic languages, as shown in Table 6. The varieties more impacted by Sinitic, especially the Viet-Muong languages, have more complex tone systems, while the less Sinicized archaic languages have simpler tone systems, and in some cases only phonation systems.

The tone systems of all the languages appear not to have tone sandhi (i.e. phonological rules by which the tones of words change in certain phonological contexts), nor do the tones in Vietic appear to play significant roles in word-formation and morphology. There are two minor exceptions in Vietnamese. First, tones change in lexical compounds for numbers. For example, the original Vietnamese numbers hai ‘two’, mười ‘ten’, and một ‘one’ change in the compound hai mươi mốt ‘twenty-one’, in which both ‘ten’ and ‘one’ have changed tones. Second, the hỏi tone (a mid-dipping rising tone) is used in referential terms derived from kinship terms when they have 3rd person reference.

Tab. 6: Suprasegmental features in Vietic.

| Branch | Language | Features |
| --- | --- | --- |
| Viet-Muong | Vietnamese | 6- or 5-way contour and phonation |
| Viet-Muong | Muong | 5-way contour and phonation |
| Pong-Toum | Cuoi | 7-way or 5-way contour and phonation |
| Pong-Toum | Poong | 4-way registral height and phonation |
| Archaic languages | Ruc, May, Arem | 4-way contour and phonation |
| Archaic languages | Thavung, Maleng Bro | 4-way pitch and phonation |
| Archaic languages | Kri | 2-way register and phonation |

One factor in the number of tones among Vietic languages is the presence or loss of final fricatives, as those sounds have historically corresponded directly with the development of certain tone categories. Vietnamese, Muong, and Cuoi, all of which lack final fricatives, have the largest numbers of tones. Most of the archaic languages have final fricatives, either /-h/ or less frequently /-s/, and have at most four tones or even simpler systems. Thus, there does appear to be a tendency: the lack of final fricatives corresponds to more complex tone systems (the result of a historical development), while Vietic languages with both final fricatives and tones have fewer tones. However, Kri and Maleng Bro have lost final fricatives without gaining tone categories, a situation more often seen in other Austroasiatic languages.

The Cuoi language has either 7 or 5 tones (cf. Table 7), with 7 tones if tones in syllables with final -p/-t/-k are treated as phonologically distinct. Tones in the latter category do have some overlap with other tone categories, but they nevertheless show measurable phonetic distinctions (Nguyen and Nguyen 2019).

Tab. 7: The tone system of Cuoi (Nguyen and Nguyen 2019: lxi).

| Toneme (T) | Phonological traits |
| --- | --- |
| T. 1 (A1) | high rising |
| T. 2 (A2a, A2b) | falling |
| T. 3 (B1) | low rising |
| T. 4 (B2, C2) | falling + ʔ |
| T. 5 (C1) | level + ʔ |
| T. 6 (D1) | rising + voiceless stop (-p, -t, -k) |
| T. 7 (D2) | falling + voiceless stop (-p, -t, -k) |
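The tendency relating final fricatives to the size of the tone system can be tabulated from the figures already given. The sketch below is hedged in several ways: it takes the final-fricative row of Table 2 and the largest n-way figure per language from Table 6, counts the marginal /-h/ of Poong as a final fricative, and treats register contrasts on a par with tones; the variable names are illustrative.

```python
# (language, has final fricatives per Table 2, largest n-way contrast per Table 6)
# Poong's marginal (-h) is counted as a final fricative here (an assumption).
DATA = [
    ("Vietnamese", False, 6),
    ("Muong", False, 5),
    ("Cuoi", False, 7),
    ("Poong", True, 4),
    ("Ruc", True, 4),
    ("May", True, 4),
    ("Arem", True, 4),
    ("Thavung", True, 4),
    ("Maleng Bro", False, 4),
    ("Kri", False, 2),
]

# Largest system found among languages that keep final fricatives.
ceiling_with_fricatives = max(n for _, fricatives, n in DATA if fricatives)

# Languages without final fricatives whose systems are nevertheless no larger.
exceptions = [
    language
    for language, fricatives, n in DATA
    if not fricatives and n <= ceiling_with_fricatives
]

print(ceiling_with_fricatives)
print(exceptions)
```

On these figures, no language with final fricatives exceeds a four-way system, and the only languages that lack final fricatives without exceeding that ceiling are Maleng Bro and Kri, in line with the exceptions named in the text.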


22.2.4 The prosodic word and phonotactics

While no variety of Vietic has conclusive evidence of final consonant clusters, syllable-initial consonant clusters are found in all described languages. In archaic languages and Pong-Toum, these include not only obstruent-liquid combinations but also those with two obstruents that do not follow the sonority hierarchy. In such cases, it is not necessarily clear how to distinguish presyllables and clusters, as discussed below. In Viet-Muong, Muong dialects have a variety of the type with voiceless stops and /-l-/ medials (and less often, /-r-/), while the only clusters in Vietnamese involve obstruents plus a [-w-] medial, as noted in Table 9. Despite being polysyllabic, the archaic language So Thavung is the only Vietic language shown to lack clusters (Srisakorn 2009).

It is in the area of the prosodic word and the phonotactic restrictions on sounds in those words that the full range from original Austroasiatic to Sinicized typology appears, as summarized in Table 8. Viet-Muong languages all have only monosyllabic prosodic words (excluding compounds) and entirely lack presyllabic material, while all archaic languages have numerous bisyllabic prosodic words, all with unstressed presyllables and stressed main syllables. While the Pong-Toum language Cuoi is monosyllabic, the Pong language is a true transition language, having a small number of genuine polysyllabic words.

Tab. 8: Typology of word structure among the sub-branches of Vietic.

| Branch | Features |
| --- | --- |
| Viet-Muong | monosyllabic prosodic words |
| Pong-Toum | monosyllabic prosodic words, small numbers of sesquisyllabic words in Pong |
| Archaic languages | both monosyllabic and sesquisyllabic prosodic words |

Table 9 shows approximate templates of the prosodic words among Vietic languages. The prosodic word template in archaic languages is comparable to those of neighboring Austroasiatic Katuic languages.

Tab. 9: Prosodic word templates of Vietic languages.

| Branch | Templates |
| --- | --- |
| Viet-Muong | Vietnamese: C1WV1C2; Muong: C1LV1C2 |
| Pong-Toum | Cuoi: C1(C3)V1C2 |
| Archaic languages | C4V2N.C1LV1C2 |

C1 = all available consonants; C2 = only final consonants of main vowels; C3 = variable consonants in the inventory (unknown extent); L = liquids /r/ and /l/; W = only /-w-/ as a medial; N = nasals; V1 = all available vowels; V2 = presyllable vowels (often i/u/a)
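The templates in Table 9 can be read as patterns over abstract segment classes. The following is a rough sketch of that reading using regular expressions over skeleton strings, in which C stands for any consonant and L, W, N, and V follow the legend above. Several points are assumptions made for the illustration rather than claims of the table: medials, codas, and the presyllable are treated as optional, the presyllable nasal is optional, and the second onset slot in Cuoi is allowed to be C, L, or W. The names `TEMPLATES` and `fits` are hypothetical.

```python
import re

# Skeleton alphabet: C = consonant, L = liquid, W = /w/, N = nasal, V = vowel,
# "." = boundary between presyllable and main syllable.
# Optionality of medials, codas, and presyllables is an assumption of this sketch.
TEMPLATES = {
    "Vietnamese": r"CW?VC?",
    "Muong": r"CL?VC?",
    "Cuoi": r"C[CLW]?VC?",
    "Archaic": r"(CVN?\.)?CL?VC?",
}


def fits(group, skeleton):
    """Check whether a skeleton string matches the template for a group."""
    return re.fullmatch(TEMPLATES[group], skeleton) is not None


print(fits("Vietnamese", "CWVC"))     # obstruent + /w/ medial
print(fits("Vietnamese", "CLVC"))     # liquid medial
print(fits("Muong", "CLVC"))          # stop + liquid cluster
print(fits("Archaic", "CVN.CLVC"))    # sesquisyllabic word
print(fits("Muong", "CVN.CLVC"))      # presyllable in a monosyllabic language
```

Working over skeletons rather than transcribed words keeps the example free of invented lexical data. On this encoding, a liquid medial is rejected for Vietnamese but accepted for Muong, and only the archaic template admits a presyllable.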




Note: Vietnamese (and probably Muong) does not have clearly limited prosodic words (Schiering et al. 2010), and thus the characterization in Table 6 is only of stand-alone syllables. This issue in Vietnamese corresponds to the ambiguity of bisyllabic “pseudo-compounds”, many of which consist of Chinese loanwords and loanmorphs and which straddle the features of affixes and words, as noted in 22.3.1. Presyllables and sesquisyllabic words: While trisyllabic phonological words are extremely rare in archaic languages (a small number in Thavung), bisyllabic words are quite common. According to Ferlus (2014: 2), in Ruc, Sach, Thavung, and Maleng Bro, 35 % to 40 % of the words are sesquisyllabic, and in Arem, 55 % to 60 % are. Such bisyllabic words have iambic stress patterns. The second syllable is often termed the “major syllable”, while the unstressed presyllable is the “minor syllable”. The main syllable can contain all consonants (in initial position), vowels, phonation features, and tones in those varieties with tone. In contrast, minor syllables allow only a subset of the phoneme inventory, lack phonation features, and, being unstressed, lack lexically distinctive tone. Initial consonant-clusters: While Vietnamese has at most initial clusters with /-w-/ medials, but otherwise only single-consonant initials, Ruc has 18 initial clusters with /w/, /l/, and /r/. The initial clusters in Southern Vietic languages generally only allow medial /w/, /l/, and /r/ and with voiceless stop onsets (/p, t, c, k/), but also some nasal onsets (e.  g. /m/, /ŋ/, etc.). Nguyen V. T. (2005: 68–73) notes seven different initial clusters (tl, kl, pl, bl, ml, hr, dr) found spread among 30 Muong dialects, but in Regions 1 to 6, the numbers of clusters range from a maximum of five clusters to just one, /tl/ (a somewhat uncommon cluster in the region, perhaps due to the combination of two alveolar sounds). 
Medial /-w-/ is common in Vietnamese largely due to Chinese loanwords, but this medial is otherwise relatively rare among Vietic languages (Nguyễn T. C. 1995: 221–223).

Codas: Codas in all Vietic languages include (a) final nasals /m/, /n/, /ɲ/, and /ŋ/, (b) final voiceless stops /p/, /t/, /c/, and /k/, and (c) offglides /w/ and /j/. Additional sounds depend on the inventories of the respective languages. These include (a) final fricatives /s/ and /h/, (b) final liquids /l/ and /r/, and (c) final glottal stop /ʔ/. The features are shown in Table 11. It is precisely in Viet-Muong and Pong-Toum that the loss of final fricatives has occurred, with consequences for the development of the tone systems of those languages (see 22.2.3). As for final liquids, while most Vietic languages have them, Vietnamese has lost them entirely, and in cognates of words that have final /-r/ or /-l/ in other languages, Vietnamese has /-j/, nothing after front vowels, or, in rare cases, final /-n/ (Alves 2017).

Note on coda clusters: It appears that Vietic may have no coda clusters. While it has been reported that Ruc has final clusters with -lh/-rh/-jh (Nguyen V. L. 1993; Solntsev et al. 2001), it is unclear whether these are clusters or instead devoicing of sonorants

Tab. 10: Features of presyllabic material in Vietic.

Type | Single consonant initials | Consonant clusters – stop-sonorant combinations | Presyllables (minor syllables) | Presyllable initial consonants | Vowels in presyllables | Presyllable codas
Viet | yes | 1 (-w-) | none | NA | NA | NA
Muong | yes | 5 to 1 (-l-, -r-) | none | NA | NA | NA
Cuoi | yes | 10 (-l-, -w-) | none | NA | NA | NA
Poong | yes | 9 (-l-, -r-) | few | p-, k-, m-, s- | no |
Ruc | yes | 18 (-w-, -l-, -r-) | many | p-, t-, c-, k-, b-, m-, l-, r-, h- | i, a, u, ɯ | nasals and -r
May | yes | 11 (-l-, -ɽ-) | many | p-, t-, c-, k-, ʔ-, m-, l-, ɕ- | i, a, u | nasals
Arem | yes | 9 (-r-, -l-) | many | p-, t-, c-, k-, ʔ-, l- | i, a, u | -m
Thavung | yes | none | many | p-, ph-, t-, th-, c-, k-, ʔ-, b-, m-, l- | a | nasals
Kri | yes | several? (-r-, -l-) | many | p-, t-, c-, k-, ʔ-, s- | i, a, u | none
Maleng Bro | yes | several? (-r-, -l-) | many | t-, c-, k-, b-, m-, s- | i, a, u | none

Tab. 11: Final consonants in Vietic.

Type | Final fricatives | Final l/r | Final glottal stop
Viet | none | none | none
Muong | none | -l (in 15 dialects) | none
Cuoi | none | -l | none
Poong | (-h) | -l | yes
Ruc | -s, -h | -l, -r | no
May | -h | -l, -ɽ | yes
Arem | -h | -l | no
Thavung | -s, -h | none | yes
Kri | none | -r, -l | no
Maleng Bro | none | -r, -l | yes

478   Mark J. Alves




in word-final position. Enfield and Diffloth (2009: 26) argue that similar segments in Kri are not final clusters, but rather represent non-segmental phenomena.

22.3 Morphology

While monomorphemic Viet-Muong and Pong-Toum languages can only employ compounding and reduplication, archaic languages can also use derivational affixation, including both prefixes and infixes, though suffixes are entirely lacking. The extent of productivity of such derivational affixation is not entirely clear, but in general, it appears limited, with much of it fossilized.

22.3.1 Compounding

Both single-headed and dual-headed compounds are seen in all the languages, similar to those seen in other Austroasiatic languages (cf. Alves 2014: 540–543). They include combinations of noun-plus-verb, noun-plus-noun, and verb-plus-noun. In these compounds, the verbs may be either active or stative, which overlap syntactically in various ways (e.g. similar means of negation, the ability to stand alone as predicates, etc.).

Tab. 12: Examples of compounding in Vietic.

Branch | Language | Examples
Viet-Muong | Vietnamese | V-N: tốt bụng (good – stomach) ‘to be good-hearted’; N-V: tàu bay (boat – to fly) ‘airplane’; N-N: chó con (dog – child) ‘puppy’
Viet-Muong | Muong | V-V: là cao (to do – tall) ‘to be proud/conceited’; V-V: khể tổi (speak – false) ‘to lie’; N-N: nhà pếp (house – cooking fire) ‘kitchen’
Archaic languages | Ruc | N-V: ɲa:2 ɉo:n1 (house – tall) ‘house on stilts’; V-N: pɨə2 kudəl1 (suitable – stomach) ‘to be satisfied’; V-V: ti2 luh1 (go – exit) ‘to exit’
Archaic languages | Thavung | V-N: lɔn1 ɲim1 (enter – heart) ‘to understand’; V-V: ʔatoŋ1 hat3 (run – compete) ‘to race’; N-V: kon1 ʔih1 (person – big) ‘leader’

Note on usage of the references: From this section on, all lexical and sentential examples come from the various sources noted in Table 1. Sentence samples are cited with page numbers, while single words from dictionaries or wordlists are, of course, listed alphabetically, so pages are not indicated. All Vietnamese examples were selected or generated by the author, with frequency of usage checked via internet resources. IPA transcriptions among the sources vary in nature and are kept as in the original sources. Vietnamese and Muong are presented in original orthography. Ruc data from Nguyen V. L. (1993) has been converted to IPA from the publication’s Vietnamese-based transcription.

Dual-headed compounds often utilize two words in a semantic domain to refer generally to the entire domain, as in Table 13.

Tab. 13: Dual-headed compounds with generalized meanings.

Language | ‘clothing’ (pants – shirt) | ‘parents’ (father – mother)
Vietnamese | quần áo | bố mẹ
Muong | quần ảo | pổ cảy
May | ku̯ɤ̆n² ˈʔau̯³ | pɯ⁴ mɛ̤⁴
Ruc | NA | meɛ4 pɨ:4 (mother-father)
So Thavung | soŋ1 ʔaw2 | NA

Chinese-style compounds: What distinguishes Viet-Muong languages from those in other sub-branches is the intake not only of many Chinese loanwords but also of Chinese-style word formation. In this pattern, basic Chinese monosyllabic morphs form a kind of “pseudo-compounding” in which bisyllabic words are formed from what were originally two words (i.e. free morphemes), but in which (a) both syllables have roughly the same amount of phonetic stress (in contrast with the major-minor syllable distinction of archaic languages), (b) the semantics of the original words have been somewhat generalized, and (c) the morphs function in a sense as “prefixes” and “suffixes”. This phenomenon is likely related to the lack of clear phonologically derived prosodic words, as noted in 22.2.4. In the Sino-Vietnamese and Sino-Muong bisyllabic words in Table 14, the individual syllables generally cannot stand alone as words in those languages. The first group of words shares a morph meaning ‘things’ (Vietnamese vật, Chinese 物 wù), while the shared morph in the second group (Vietnamese giải, Chinese 解 jiě) has a broader meaning of ‘explaining’ and ‘resolving’. Vietnamese has many thousands of such bisyllabic words, and varieties of Muong also have large numbers (though statistical studies are not available on this matter). In archaic Vietic languages, Chinese loanwords constitute very small portions of the overall lexicons, so these kinds of words, if present, appear to be recent loanwords via Vietnamese.




Tab. 14: Chinese-style bisyllabic compounds in Vietnamese and Muong.

Gloss | Vietnamese | Muong | Chinese
material things | hiện vật | hiện vât | 事物 shí wù
plants | thực vật | thưc vât | 植物 zhí wù
monster | quái vật | quải vât | 怪物 guài wù
domestic animals | súc vật | xúc vât | 畜物 chù wù
explain | biện giải | biện dái | 辯解 biàn jiě
comfort | khuyến giải | khuyên dái | 劝解 quàn jiě
interpret | lý giải | lỷ dái | 理解 lǐ jiě
mediate | hoà giải | wà dái | 和解 hé jiě
resolve | giải quyết | dái quyết | 解决 jiě jué

22.3.2 Affixation

True affixation (as evidenced by phonological reduction) is limited to the polysyllabic archaic languages. The Poong language of Pong-Toum has some presyllables, but available lexical data shows very few instances of these, and so it is unclear whether it has genuine prefixes. Even among archaic languages, despite the high numbers of words with presyllables, affixes are hard to locate, and only a few studies offer ample data with affixation. In archaic languages, some prefixes and infixes are etymologically related to, or comparable in function to, those of other Austroasiatic languages, such as the causative prefix on verbs and nominalizing infixes. Much of this morphology is fossilized, nonproductive material. Examples of these are shown in Table 15.

Tab. 15: Samples of affixes in archaic languages.

Language | Category | Examples
Ruc | Causative prefix [pa-] | (a) ku:n4 ‘afraid’ > paku:n4 ‘threaten’; (b) tɨŋ4 ‘to stand (intransitive)’ > patɨŋ4 ‘to stand (something) up’; (c) ce:t4 ‘to die’ > pece:t3 ‘to kill’; (d) rɨmɛk3 ‘cool’ > parɨmɛk3 ‘to cool (something)’ (Nguyen V. L. 1993: 77)
Ruc | Causative resultative prefix [pa-] | (a) ɉah1 ‘to tear’ > pajah1 ‘torn’; (b) peh1 ‘to break’ > papeh1 ‘broken’ (Nguyen V. L. 1993: 80)
Ruc | Nominalizing infix | (a) thu:t3 ‘to insert’ > tanu:t3 ‘a cork’; (b) səp3 ‘to cover’ > sanəp3 ‘blanket’ (Nguyen V. L. 1993: 83)
May | Transitivizing prefix on directional verbs [pa-] | (a) leŋ¹ ‘rise’ > pa-leŋ¹ ‘to raise’; (b) tɔ̤ŋ⁴ ‘go down’ > pa-ʈɔ̤ŋ⁴ ‘to lower’; (c) lɔ̤n² ‘enter’ > pa-lɔn¹ ‘to deposit’; (d) loh¹ ‘leave’ > pa-loh¹ ‘to get out’; (e) βi² ‘return (intransitive)’, ‘leave’ > pa-βi² ‘to return (transitive) / bring / carry’ (Babaev and Samarina 2018: 177)
May | Nominalizing prefixes (various) | (a) tɯ̆ŋ³ ‘stand’ > ʔatɯ̆ŋ³ ‘wall’; (b) pi¹ ‘worn behind his back’ > cəɴpi¹ ‘fin’; (c) ɕăt³ ‘tie’ > kuɕăt³ ‘knot’; (d) pah¹ ‘clap’ > tapah¹ ‘slap in the face’ (Babaev and Samarina 2018: 104)
Maleng Bro | Nominalizing infix | (a) tęʔ ‘to urinate’ > trnęʔ ‘urine’; (b) tkat ‘to roast’ > trkat ‘roasting sticks’ (Ferlus 1997: 62)

22.3.3 Reduplication

Reduplication in Vietic is of two types: full reduplication, and partial reduplication in which templates of prosodic words (generally monosyllabic words, but also bisyllabic words) are copied but some segments alternate. Full reduplication functions in a variety of ways, including (a) attenuating adjectives (e.g. Ruc ɟa:l1 ɟa:l1 ‘a little long’), (b) generalizing meanings (e.g. Vietnamese xanh xanh ‘bluish/greenish’), and (c) creating nouns (e.g. Poong bolboːl ‘lizard’), among other semantic functions. Partial reduplication results in alliteration (copying the initial but alternating the rhyme), rhyming (copying the rhyme but alternating the initial), and chiming (copying the initial and final but alternating the main vowel). While reduplicants with alternation can sometimes be shown to be derived from a single lexeme, in many cases there is no evident source. This “reduplication with alternation” is a pattern seen throughout Austroasiatic. Examples of these various types in Vietnamese and Muong are shown in Table 16. Additional examples from archaic languages are shown in Table 17.

Tab. 16: Examples of reduplication with alternating material in Vietnamese and Muong.

Category | Gloss | Vietnamese | Muong
Full reduplication | average; not slim, not fat | nhàng nhàng | nhàng nhàng
Rhyming | splutter | lúng búng | lủng pủng
Alliteration | fast, do quickly and easily, agile | nhanh nhẹ, nhanh nhẹn, lanh lẹ, lanh lẹn | nhanh nhẹl
Chiming | emaciated | hom hem | hom hem
Final consonant alternation | sticky | nhơn nhớt | NA




Tab. 17: Examples of reduplicants with alternating segments in archaic Vietic languages.

Language | Examples
Ruc | (a) lɛɲ1 lɤaw4 ‘agile’; (b) kərβeːŋ1 kərβiːt2 ‘to hang about’; (c) tiːŋ1 βiːŋ2 ‘spider’; (d) toːŋ1 moːŋ1 (Nguyen V. L. 1993: 89)
May | (a) pʰɽɛk⁴ pʰɽɛ⁴ ‘pure’; (b) ɕɤ² ɕai̯³ ‘in abundance’; (c) kaβɛŋ¹ kaβɔŋ³ ‘tortuous’; (d) laŋi⁰ laŋɤ¹ ‘extravagant’; (e) ʔuβaŋ³ ʔuβac³ ‘light’; (f) ɴtah¹ ɴtai̯¹ ‘tasteless’ (Babaev and Samarina 2018: 101–102)
So Thavung | (a) balɔ́k balɛ́k ‘bad people’; (b) kacáɁ kacâːj ‘to be scattered’; (c) thamít thamîan ‘in the old days, the very first time’ (Premsrirat 2000)

The degree of productivity of reduplication has not been fully explored in all Vietic languages. Reduplication with alternating segments cannot be considered productive in the way that affixation is, in which, for example, certain prefixes may appear freely on any verb, nor in the way of partial reduplication in which a presyllable can be added based on the material of any base form. Nevertheless, the pattern is well represented in the lexicon: a Vietnamese dictionary of such words (Viện Ngôn Ngữ Học 1995) contains a few thousand reduplicants with alternation, and many more forms from regional dialects are not listed in it. Until quantitative studies of this word-formation process in other Vietic languages are undertaken, its degree of productivity in Vietic will remain unknown.
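The categories of reduplication described in this section can be made concrete with a small sketch. The Python below is illustrative only and is not drawn from any of the cited sources. It assumes that each syllable has already been segmented by hand into onset, vowel, and coda, and it ignores tone. The names `Syllable` and `classify_pair` are hypothetical, and the segmentations of the Vietnamese forms from Table 16 are simplifying assumptions.

```python
from typing import NamedTuple


class Syllable(NamedTuple):
    """A hand-segmented syllable; tone is deliberately ignored (assumption)."""
    onset: str
    vowel: str
    coda: str


def classify_pair(base: Syllable, copy: Syllable) -> str:
    """Label a two-syllable reduplicant with the category names used in Table 16."""
    same_onset = base.onset == copy.onset
    same_vowel = base.vowel == copy.vowel
    same_coda = base.coda == copy.coda

    if same_onset and same_vowel and same_coda:
        return "full reduplication"
    if same_onset and same_coda:
        return "chiming"  # initial and final copied, main vowel alternates
    if same_onset and same_vowel:
        return "final consonant alternation"
    if same_vowel and same_coda:
        return "rhyming"  # rhyme copied, initial alternates
    if same_onset:
        return "alliteration"  # initial copied, whole rhyme alternates
    return "unclassified"


# Hand-segmented Vietnamese forms from Table 16 (tone marks stripped).
examples = {
    "nhàng nhàng": (Syllable("nh", "a", "ng"), Syllable("nh", "a", "ng")),
    "lúng búng": (Syllable("l", "u", "ng"), Syllable("b", "u", "ng")),
    "nhanh nhẹn": (Syllable("nh", "a", "nh"), Syllable("nh", "e", "n")),
    "hom hem": (Syllable("h", "o", "m"), Syllable("h", "e", "m")),
    "nhơn nhớt": (Syllable("nh", "ơ", "n"), Syllable("nh", "ơ", "t")),
}

for form, (base, copy) in examples.items():
    print(form, "->", classify_pair(base, copy))
```

Chiming is tested before alliteration because, under the definitions given above, a chiming pair also copies the initial. A check of this kind only labels the surface pattern of a pair; it says nothing about whether the pattern is productive.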

22.4 Syntax

Few published descriptions of syntax in Vietic languages are available, and fewer still are written in English. While Vietnamese syntax has been analyzed extensively, there are no descriptions of Muong syntax (though many sentences are in the dictionary referred to in this article). As for Southern Vietic languages, the syntax of Ruc has been described in two books (Nguyen V. L. 1993 in Vietnamese and Solntsev et al. 2001 in Russian), the syntax of May in one book (Babaev and Samarina 2018 in Russian), So Thavung syntax in a dissertation (Srisakorn 2008), and Kri syntax briefly and only partly in one article (Enfield and Diffloth 2009). There are currently no available published descriptions of syntax in Pong-Toum languages (though some unpublished materials in Vietnamese have been shared with this author). Despite these limitations, it is possible to make preliminary observations about the typological tendencies of Vietic languages. As Table 18 shows, clause structure throughout the described Vietic languages is quite consistent: all show SVO structure and pragmatically conditioned topic-comment patterns. This is consistent with most other languages in the region (i.e. other Austroasiatic languages, Hmong-Mien, Tai, Chamic, and Sinitic).


Noun phrase structure, on the other hand, varies somewhat, with three observable patterns, as characterized in Table 18. In all Vietic languages, modifiers (including adjectives/stative verbs, relative clauses, possessives, and demonstratives) follow nouns. However, number-plus-unit-word combinations occur before nouns in some Vietic languages and after them in others. In general, available data shows a consistent pattern among the Vietic languages spoken inside Vietnam, across all three branches of Vietic: Viet-Muong, Pong-Toum, and some archaic languages. In contrast, the archaic language So Thavung in Thailand follows the Thai/Lao pattern, while the archaic language Kri in Laos is reported to allow varying orders.

Tab. 18: Word order in Vietic languages.

Branch | Clauses | Topic-comment | Noun phrases
Viet-Muong | SVO | yes | number – unit – noun – modifiers
Pong-Toum | SVO | yes | number – unit – noun – modifiers
Archaic languages | SVO | yes | 1. number – unit – noun – modifiers (Ruc and May); 2. noun – modifiers – number – unit (Thavung in Thailand); 3. either order (Kri)

The following subsections review various syntactic aspects including clause structure, questions, noun phrase structure, and locational and directional marking.

22.4.1 Clause structure

This section reviews the organization of clausal actors and verbs, including discussion of topic-comment sentences, passive and mediopassive sentences, and constructions with stative verbs followed by body parts, often with an afflictive sense. To start, all available data in Vietic languages shows verb-medial clauses.

Muong (Nguyen V. K. et al. 2002: 380)
Nả lệ cải câl nả phang con chim mé chăng ản
3s take CLF stick 3s hit.with.stick CLF bird but no ABL
‘He picked up a stick to hit at a bird but wasn’t able to.’

So Thavung (Srisakorn 2008: 103)
ʔɔŋ1 ʔakɛ2 ka2 nɨ1 calɔk1
dad seek fish at canal
‘The father searches for fish in the canal.’




This SVO order is seen as well in both main and dependent clauses. In the Vietnamese sample sentence, the word ‘miss (n.)’ is a socially conditioned referent for which 1st/2nd/3rd-person reference depends on the spoken context.

Vietnamese
nếu cô muốn ăn gì thì cứ nói
if miss(n.) want eat what then just say
‘If you (miss) would like to eat something, go ahead and say so.’

Ruc (Nguyen V. L. 1993: 109)
khi:1 ho:1 loɔn2 ɲa:2 han3 la:1 ɲɤəp4
when 1s enter house 3s COMPL sleep
‘When I entered the house, he was already asleep.’

Topic-comment patterns: In topicalized constructions, a variety of semantic roles can occur, though careful studies of this range are limited to Vietnamese. Cao (1992) provides an overview of the range of semantic roles of topicalized elements in the Vietnamese sentence, including any actants, parts of actants, and locations, among other roles. Despite limited studies of topicalization in other Vietic languages, available data also offers instances of a variety of semantic roles in the topic position. These can occur with or without lexical markers between the topicalized element and the main comment/predicate.

Vietnamese (Cao 1992: 146)
Sơn thì tay bị gãy
NAME TPC hand PASS broken
‘Sơn has a broken arm.’

So Thavung (Srisakorn 2008: 75)
ʔali1 lɛh1 khɔŋ1 ɪnə2
this be of mother
‘This is my mother’s (thing).’

May (Babaev and Samarina 2018: 227)
ʔai̯¹ lɛ² hat³ tu³ na³
who TPC sing place DIST
‘Who is it singing over there?’

Passivization: One pattern that overlaps with topic-comment constructions is the mediopassive construction, in which, as in topicalized constructions, the first noun has the role of Patient. These mediopassive sentences typically have inanimate subjects and lack the lexical marking seen in topic-comment constructions.


Vietnamese
sách đã bán hết rồi
book PAST sell completely COMPL
‘The book has already been sold out.’

Ruc (Nguyen V. L. 1993: 117)
ɲaː2 poɔ4 hoː1 mɨːn2 ruːj2
house uncle 1s make COMPL
‘My uncle’s house has been built already.’

Vietic languages also have passive constructions that are lexically marked. They differ somewhat from European-style passives in that (a) Agents are not marked by prepositions and (b) these can be viewed as biclausal sentences in which the lower clause is semantically experienced (e.g. ‘He was bitten by a dog’ can be interpreted as ‘He experienced a dog biting [him]’). The words used to mark the passive voice in these constructions are typically borrowed words or stem from calqued patterns. In Vietnamese, passive markers from Chinese have developed differing semantics: bị expresses afflictive passive (Chinese 被 bèi), được encodes passive with a positive sense (Chinese 得 dé, standard Sino-Vietnamese đắc), and do has a neutral sense (Chinese 由 yóu). The Vietnamese passive marker bị shares structural patterns with those in varieties of Chinese: it can occur with or without an agent in the lower clause.

Vietnamese
cậu bé bị chó cắn
fellow young PASS dog bite
‘A young boy was bitten by a dog.’

Vietnamese
cậu bé bị cắn
fellow young PASS bite
‘A young boy was bitten.’

This Sino-Vietnamese lexeme bị has been borrowed by both Ruc and May. In the following May example, an information question word is in the in-situ position after the passive marker.

May (Babaev and Samarina 2018: 227)
hăn³ ɓi⁴ ʔai̯¹ [ma²] pɯ̆p³ ʔina³
3s PASS who REL hit DIST
‘Who was he hit by like that?’

Both Vietnamese and Muong have a shared passive-marking lexeme which is homophonous with the senses ‘must’ and ‘correct’ (Vietnamese phải and Muong phái). A similar grammaticalization pattern is seen as well in So Thavung (and also Thai thuùk and Khmer trəw), each with its own etymologically distinct lexeme.




Muong (Nguyen V. K. et al. 2002: 74)
Nả phái môch lát chẻm kel
3s PASS one instant chop neck
‘He received a blow to the neck.’

So Thavung (Srisakorn 2008: 73)
kunʔit2 cɔh1 luk1 kat2
child correct snake bite
‘A child was bitten by a snake.’

Afflictive stative verbs: One pattern seen in Vietic (and more broadly in the region) is a construction in which a stative verb is followed by a noun referring to a body part. As stative verbs, while taking oblique objects, they can still take intensifiers. These often express a sense of affliction, though the pattern can be semantically extended to positive senses and other semantic functions.

Ruc (Nguyen V. L. 1993: 119)
han3 so:t3 ku.loɔk4
3s hurt head
‘He has a headache.’

Vietnamese
bác sĩ ấy tốt bụng lắm
doctor DIST good stomach very
‘The doctor is very nice/good-hearted.’

22.4.2 Questions

The data suggests substantial consistency in both polar and information questions. Question words in information questions tend to remain in situ. Question words for objects most often follow verbs even though these are topic-comment languages in which objects can, in other cases, be readily fronted. This is significant in that a wide variety of semantic categories (e.g. objects, locations, possessives, etc.) can otherwise be fronted. The in-situ pattern holds whether the question word is the sole element in a noun phrase or a modifier, as in the Muong example.

Muong (Nguyen V. K. et al. 2002: 171)
da hảo hạng nò
2s want evidence which
‘What evidence do you want?’


So Thavung (Srisakorn 2008: 124)
ɲaphɔ1 han1 cua1 ʔan1 ʔahə1
monk ask novice eat what
‘The monk asked, “What did you eat?”’

Polar questions are often marked by sentence-final particles. Some of these interrogative particles are phonologically simple syllables (e.g. Vietnamese and Muong à, Ruc ʔɛ1, and May ʔɛ³) with no evident etymological source, while some are derived from words meaning ‘no/not’ (e.g. Vietnamese không, Muong chăng, and May βăŋ³).

Muong (Nguyen V. K. et al. 2002: 427)
Đồ ớ pổl pay tá chớ ni à
thing play PL child leave LOC here Q
‘Did the children leave those toys here?’

May (Babaev and Samarina 2018: 213)
lɔŋ¹ tapɛh¹ kɔ³ camɤ³ ʔăn¹ ʔɛ³ βăŋ³
inside kitchen exist what eat Q not
‘In the kitchen, is there anything to eat?’

22.4.3 Noun phrase structure

As shown in Table 18, in all available descriptions of Vietic, modifiers, possessives, and determiners follow head nouns. All descriptions include the use of classifiers, but, as also noted in Table 18, the position of the quantity-plus-measure/classifier units varies according to geographic region. In all described Vietic languages in Vietnam (Vietnamese, varieties of Muong, archaic Vietic languages and the Pong-Toum language Cuoi), quantity-unit constituents precede head nouns; in Thavung in Thailand, they follow nouns; and in Kri in Laos, both orders are attested.

Vietnamese
hai quả chuối tươi này
two CLF banana fresh PROX
‘These two fresh bananas’

Ruc (Nguyen V. L. 1993: 107)
ha:l1 kɔn1 mɛ:w2 pati:1
two CLF cat black
‘Two black cats’

Kri (Enfield and Diffloth 2009: 61)
ha:r loŋʔ kade:ʔ / kade:ʔ ha:r loŋʔ
two CLF child / child two CLF
‘Two children’




So Thavung (Srisakorn 2008: 89)
ʔuh1 ʔit2 laŋ1 ʔali1
house small CLF PROX
‘this small house’

Classifiers in Vietic languages have the ability to function as heads of noun phrases. In these cases, classifiers can have referential, pronoun-like functions, and in Vietnamese, the generic classifier cái has developed a function as a definite article.

Vietnamese
cái này là cái gì
CLF PROX be CLF what
‘What is this?’

Kri (Enfield and Diffloth 2009: 60)
loŋʔ na:ʔ
CLF dem.external
‘That one’

In Vietnamese, many words are themselves count nouns and do not require classifiers to occur in quantified noun phrases. The two major categories are human nouns (e.g. một cô (one – aunt) ‘one aunt’, hai bác sĩ ‘two doctors’, etc.) and Chinese loanwords borrowed largely in the 20th century (e.g. một chính phủ ‘one government’, hai hóa chất ‘two chemicals’, etc.). Among other Vietic languages, it is not yet clear to what extent classifiers are required and whether certain classes of nouns are count nouns.
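The two orderings of quantity-unit constituents described in this section can be sketched as a pair of slot templates. The Python below is only an illustration under stated assumptions: the slot orders are those listed in Table 18, while the labels `quantity_first` and `noun_first` and the function `linearize_np` are hypothetical names introduced here. The sketch covers only quantified noun phrases and does not model classifiers used as heads or nouns that occur without classifiers.

```python
from typing import Dict, List, Sequence

# Slot orders taken from Table 18; the dictionary keys are hypothetical labels.
NP_ORDERS: Dict[str, List[str]] = {
    "quantity_first": ["number", "unit", "noun", "modifiers"],
    "noun_first": ["noun", "modifiers", "number", "unit"],
}


def linearize_np(order: str, number: str, unit: str, noun: str,
                 modifiers: Sequence[str] = ()) -> str:
    """Arrange the parts of a quantified noun phrase in the requested slot order."""
    slots = {
        "number": [number],
        "unit": [unit],
        "noun": [noun],
        "modifiers": list(modifiers),
    }
    words: List[str] = []
    for slot in NP_ORDERS[order]:
        words.extend(slots[slot])
    return " ".join(words)


# Vietnamese: the quantity-unit constituent precedes the noun, modifiers follow it.
print(linearize_np("quantity_first", "hai", "quả", "chuối", ["tươi", "này"]))
# Kri: both orders are attested.
print(linearize_np("quantity_first", "ha:r", "loŋʔ", "kade:ʔ"))
print(linearize_np("noun_first", "ha:r", "loŋʔ", "kade:ʔ"))
```

With the word forms from the examples above, the first call yields the Vietnamese phrase hai quả chuối tươi này, and the two Kri calls yield the two attested orders.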

22.4.4 Locational and directional marking

In Vietic languages, location and direction can be indicated by prepositions or by grammaticalized verbs, which resemble what has been described in the literature as serial verb constructions (e.g. Li and Thompson 1981). Directional verbs such as ‘enter’, ‘exit’, ‘ascend’, ‘descend’, and others follow movement verbs to indicate the direction of the main verb action.

Vietnamese
mất công quá đi lên đi xuống
lose effort very go ascend go descend
‘It takes too much work to go up and down.’

Ruc (Nguyen V. L. 1993: 131)
mi:ŋ2 βaŋ3 ti:2 loɔn2 bru:3
3p NEG go enter forest
‘We’re not going into the forest.’


Such directional words can also develop additional semantic functions.

Muong (Nguyen V. K. et al. 2002: 40)
môch nước chia tha từ bỗ
one country divide exit many ministry
‘A country is divided into many ministries.’

22.4.5 Other grammatical categories

22.4.5.1 Tense/aspect, modality and negation

Vietic verb phrases tend to have negation first and then a position for a variety of preverbal modifiers: words indicating aspect/tense (completion, past, future), modality (ability, obligation), and other categories (e.g. sequential ‘then’, additive ‘also’, etc.). A smaller number of markers of aspect, ability, and negation can occur in postverbal position. The position of these words in clauses is generally lexically specified.

Vietnamese
Thế mà anh ấy chưa ăn xong
yet/but elder.brother DIST not.yet eat finish
‘Yet he hasn’t finished eating.’

Ruc (Nguyen V. L. 1993: 108)
khi:1 ho:1 loɔn2 ɲa:2 han3 la:1 ɲɤəp4
when 1s enter house 3s already sleep
‘When I entered, he was already sleeping.’

So Thavung (Srisakorn 2008: 81)
pʌm2 lot1 ʔəʔ1 nɨ:1 ʔan1 ʔata3
he then speak who eat duck
‘He then asked who ate the duck.’

May (Babaev and Samarina 2018: 236)
ma̤ŋ⁴ ti² ʔatɯ̆ŋ⁴ ʔaho¹ kuŋ⁴ ti²
2s go hunt 1s also go
‘You’re going hunting, I’m going too.’




Negation: Negation in Vietic is most often preverbal. In Vietnamese, the negator không is preverbal (while in sentence-final position, it is an interrogative particle), while đâu (homophonous with the word meaning ‘where’) can be used for negation in sentence-final position with slight emphasis, sometimes with implied irritation on the part of the speaker. Only preverbal không can be used as a stand-alone word in response to polar questions, though whether such is the case in other Vietic languages is not always clear from available descriptions. In Vietnamese, these preverbal and post-verbal negators can co-occur. It is unclear whether sentence-final negation occurs in other Vietic languages.

Vietnamese
tôi không biết đâu
1s no know where
‘I just don’t know!’

Prohibitive negators similarly occur in preverbal position. They generally require following verbs and cannot stand alone in utterances. Subjects can be dropped in such sentences, but subjects do occur before prohibitive negators in available data.

Muong (Nguyen V. K. et al. 2002: 224)
da chở hay khể xẩu cho pậu
2s PROH often speak bad give people
‘Don’t speak ill of people.’

May (Babaev and Samarina 2018: 130)
ʔatɤ¹ ahăn³ ɓăi̯¹ nɔi̯³ ʔapa¹ hɛ¹
tell 3s PROH speak 3p SFP
‘Tell him not to speak with them.’

Abilitive modality: Verbs that express abilitive modality most often occur after main verbs but before objects. In Vietnamese, these verbs can flexibly occur before or after a main verb. They can occur alone as responses to polar questions. They are frequently homophonous with words meaning ‘to achieve/get’, a common grammaticalization path among languages in the region.

Vietnamese
tôi không mua được sách / tôi không được mua sách
1s no buy get book / 1s no get buy book
‘I can’t buy a book.’

So Thavung (Srisakorn 2008: 87)
kan1 ti1 tin1
1s go get
‘I can go.’


Aspectual/tense markers: Aspectual and tense markers can occur both before and after verbs, though their positions are generally lexically specified. For example, in Vietnamese, đang ‘in progress’, sẽ ‘will’, and đã ‘already’ all strictly precede verbs, while rồi ‘already’ follows verbs.

Vietnamese
Minh đã về nhà rồi
NAME already return.to house COMPL
‘Minh has already gotten home.’

Muong (Nguyen V. K. et al. 2002: 47)
da cách ngịa dòng đỉ ho tường rồi
2s explain as thus 1s understand COMPL
‘When you explain it that way, I understand it clearly.’

Markers of progression, recent past (i.e. ‘have just X-ed’), or immediate future (i.e. ‘about to’) tend to be in the preverbal slot.

So Thavung (Srisakorn 2008: 75)
kunʔit2 thuam1 thajon1 khaʔəj2
children PROG hop play
‘The children are hopping for fun.’

Muong (Nguyen V. K. et al. 2002: 49)
rà wài chợ mởi cỏ tlận cãi lỗn
outside market just have CLF quarrel each other
‘There has just been a quarrel at the market.’

Ruc (Solntsev, Solntseva and Samarina 2001: 414)
ho1 ɕăp4 ʔăn1
1s about.to eat
‘I’m about to eat.’

22.4.5.2 Comparison and intensification

Available data shows that comparison structures in Vietic languages are of the “surpass” type, as in other languages in the region (cf. Ansaldo 2010). In such constructions, comparative or equative particles are used.

Vietnamese
Minh mạnh hơn Dũng nhưng Lộc mạnh như Minh
NAME strong CMP NAME but NAME strong as NAME
‘Minh is stronger than Dũng, but Lộc is as strong as Minh.’

Ruc (Solntsev et al. 2001: 523)
kəʌ̆i̯2 ni1 ɟoɔn2 hɤn1 kəʌ̆i̯2 hĕh1
CLF PROX high/tall COMPAR CLF PROX
‘This one is taller than that one.’

So Thavung has evidently borrowed the Lao word for ‘more than’ (from Tai *kwaB, itself a likely Chinese loanword, 過 guò ‘to pass over/exceed’, which developed new grammatical features). Thus, comparative constructions in Vietic appear to be yet another instance of typological convergence through general language contact as well as lexical borrowing.

So Thavung (Srisakorn 2008: 111)
ʔuh1 kan1 ʔit2 kua1 ʔuh1 thow1
house 1s small than house 2s
‘My house is smaller than yours.’

The position of intensifiers shows more variation without clear geographic patterns. Available descriptions show instances of intensifiers after adjectives/stative verbs, as shown in Table 19. Thus, post-posed intensifiers, which structurally parallel comparative constructions, are a common feature. However, there are also pre-posed intensifiers. While Vietnamese lắm follows adjectives, rất precedes them. Also, the Vietnamese intensifier quá (which comes from Chinese 過 guò ‘to pass over/exceed’, as in Tai, but which occurs only as an intensifier in Vietnamese) typically follows adjectives, but it can also precede them for emphasis. The Vietnamese intensifier thật ‘truly’ is a Chinese loanword, 實 shí ‘true’ (cf. Chinese 實在 shízài ‘truly’), and it is pre-posed in Vietnamese as in Chinese. Both Muong and Ruc have an intensifier resembling Vietnamese quá, occurring after adjectives in those languages, and Muong also has an intensifier like Vietnamese thật ‘truly’, which also appears before adjectives. Ruc has one native intensifier which precedes adjectives.

Tab. 19: Positions of intensifiers in Vietic languages.

Language | Examples | Source
Vietnamese | tốt lắm (good INT) ‘very good’; tốt quá (good INT) ‘very good’; rất tốt (INT good) ‘very good’; quá tốt (INT good) ‘very good’; thật tốt (INT good) ‘truly good’ |
Muong | hay lẳm (interesting INT) ‘very interesting’; chả quả (cold INT) ‘very cold’; thât thẳn (INT hard/strong) ‘really hard/strong’ | (Nguyen V. K. et al. 2002: 30, 100, 70)
Ruc | ɟoɔn2 ku̯a4 (tall INT) ‘very tall’; daj3 lɯ:p4 (INT sharp) ‘very sharp’ | (Solntsev et al. 2001: 386)
So Thavung | phadɔŋ1 thɛ1 (hot INT) ‘very hot’; salaŋ1 dɔ1 (cold INT) ‘very cold’ | (Srisakorn 2008: 75)
May | tapui̯¹ nak⁴ (pleased INT) ‘very pleased’ | (Babaev and Samarina 2018: 181)
Kri | qjồồn tàn (tall true) ‘really tall’ | (Enfield and Diffloth 2009: 51)


22.4.5.3 Pronouns and referential systems

Descriptions of pronoun systems are available for several Vietic languages. The descriptions show recurring features, including a dual category and inclusive-exclusive distinctions. The numbers of slots in the pronoun systems vary considerably. In some cases, available descriptions do not provide detailed information, and in such cases the features in Table 20 are marked with question marks.

Tab. 20: Pronoun features among Vietic languages.

Category | Vietnamese | Muong | Ruc | May | Kri | So Thavung
No. of pronoun slots | several | several | a dozen | NA | 15 | a dozen
Inclusive-exclusive distinction | yes | no | yes | yes | yes | no
Dual category | no | no | yes | yes | yes | no
Politeness distinction | yes | no? | no? | no? | yes | yes
Pronoun-dominant | no | yes? | yes | yes | yes | yes

One key feature of Vietnamese distinguishing it from other Vietic languages is that it is not pronoun-dominant. The original Vietnamese system of pronouns in daily usage has largely been supplanted by kinship terms in a complex system of terms of address and reference with full floating pronoun reference (e.  g. ‘elder brother’ may function as 1st person when used by the addressor but as 2nd person by the addressee), a situation seen also in Khmer. Social and professional titles can also have 1st, 2nd, and 3rd person pronoun reference, though these are statistically secondary in daily usage compared to kinship-based referential terms. Most true pronouns in Vietnamese  – which are Vietic etyma – are considered rude and/or intimate, while a restricted set of kinship terms have grammaticalized with de facto pronoun status. Partly due to this tremendous change in pronoun usage and the system, the exact number of true pronouns in modern Vietnamese is debatable: the pronoun system has been both reduced in some ways and amplified in others through innovation and borrowing from Chinese (cf. Alves 2017). Overall referential systems of Vietic languages other than Vietnamese have been little studied. While the language descriptions of Southern Vietic languages do not provide details about the sociolinguistic usage of referential systems, they present the pronoun systems and offer sentences primarily using pronouns, with few instances of socially conditioned referential terms. However, as noted in Kri, the small size of the community, in which most community members are related, results in a complex mix of pronouns and kinship terms (Enfield and Diffloth 2009: 56–57). Thus, while data on the southern Vietic languages shows pronoun-dominant tendencies, the sociocultural systems appear to interact with pronouns in a more complex way.


22.5 Note on Chinese loanwords

While careful statistical study would be needed to clarify the etymological profiles of all the languages (Vietnamese being the best studied; cf. Alves 2009), some preliminary observations of Chinese loanwords in the lexical data suggest the following. While Chinese loanwords are seen in data on all Vietic languages, the quantities vary according to sub-branch in ways corresponding to typological profile. Vietnamese has the largest quantity of Chinese loanwords, followed by Muong, then the Pong-Toum languages, and lastly the archaic languages. The larger the number of Chinese loanwords in a sub-branch, the closer that sub-branch is typologically to Sinitic. This is a matter that needs to be explored in future historical linguistic research, ideally with additional lexical data from Southern Vietic languages.

22.6 Conclusion

Vietic is a truly transitional group of languages, both geographically and typologically, connecting southern China to the Austroasiatic languages of Mainland Southeast Asia. The significant typological range, especially in the domain of morphophonology, is indicative of the intense historical sociocultural changes in Northern Vietic. Archaic linguistic features correspond to a more archaic set of sociocultural/political practices among the southern Vietic languages. Conversely, the modified linguistic features of Viet-Muong languages are precisely those which mark them as “Sinified”, along with various Chinese-style cultural features. The data underlying the modern typology is thus also informative about the socio-cultural history of the region.

As usual, as more details are revealed and more analyses offered, more questions can be raised, and the quest for data continues. While the amount of data available today is considerably greater than just a few decades ago, a majority of the Vietic languages have been studied little (with gaps especially in syntactic data), and some not at all. Thus, Vietic languages, especially the small and increasingly endangered ones, are in tremendous need of additional data-gathering and analysis. A number of questions deserve focused research and documentation, including at least the following areas:

– Acoustic phonetic studies, especially of phonation, tones, and final laryngeal sounds (fricatives and glottal stops)
– The degree of productivity of alternating reduplication
– Naturalistic data of clause and noun-phrase structure to test the full range of variation, not only high-frequency patterns
– The degree of language contact with, variously, Sinitic languages, Tai languages, and Austroasiatic languages (including Vietnamese) through loanwords as well as structural linguistic elements




This certainly is not an exhaustive list, and hopefully, those reading this will raise more questions and find ways to continue research of the Vietic languages.

References

Alves, Mark J. 2003. Ruc and other minor Vietic languages: Linguistic strands between Vietnamese and the rest of the Mon-Khmer language family. In Karen L. Adams et al. (eds.), Papers from the Seventh Annual Meeting of the Southeast Asian Linguistics Society, Tempe, Arizona, 3–19. Tempe: Arizona State University, Program for Southeast Asian Studies.
Alves, Mark J. 2009. Loanwords in Vietnamese. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world's languages: A comparative handbook, 617–637. Berlin: Walter de Gruyter.
Alves, Mark J. 2014. Mon-Khmer. In Rochelle Lieber & Pavol Stekauer (eds.), The Oxford handbook of derivational morphology, 520–544. Oxford: Oxford University Press.
Alves, Mark J. 2017. Does Vietnamese have evidence for OC *-r? Cahiers de Linguistique Asie Orientale 46. 151–173.
Ansaldo, Umberto. 2010. Surpass comparatives in Sinitic and beyond: Typology and grammaticalization. Linguistics 48(4). 919–950.
Babaev, Kirill V. & Irina V. Samarina. 2018. Materials of the Russian-Vietnamese linguistic expedition: May language. Moscow: Yask Publishing House. (In Russian)
Cao, Xuân Hạo. 1992. Some preliminaries to the syntactic analysis of the Vietnamese sentence. The Mon-Khmer Studies Journal 20. 137–152.
Enfield, Nick J. & Gérard Diffloth. 2009. Phonology and sketch grammar of Kri, a Vietic language of Laos. Cahiers de Linguistique Asie Orientale 38(1). 3–69.
Ferlus, Michel. 1994. Quelques particularités du cuôi cham, une langue viet-muong du Nghê-an (Vietnam). Neuvièmes Journées de Linguistique de l’Asie Orientale. 1–4.
Ferlus, Michel. 1997. Le Maleng Brô et le Vietnamien. Mon-Khmer Studies Journal 27. 55–66.
Ferlus, Michel. 2014. Arem, a Vietic language. Mon-Khmer Studies 43. 1–15.
Li, Charles N. & Sandra A. Thompson. 1981. Mandarin Chinese: A functional reference grammar. Berkeley: UC Press.
Mon-Khmer etymological dictionary. SEAlang Mon-Khmer Languages Project. Web. http://sealang.net/monkhmer/dictionary/ (last accessed 15 October 2019).
Nguyễn, Ðình Hòa. 1957. Classifiers in Vietnamese. Word 13(1). 124–152.
Nguyễn, Hữu Hoành. 2009. Hệ thống ngữ âm tiếng Cuối [The phonological system of the Cuoi language]. In Tạ Văn Thông (ed.), Tìm Hiểu Ngôn Ngữ Các Dân Tộc [Understanding the languages of ethnicities], 9–23. Hanoi: Nhà Xuất Bản Khoa Học Xã Hội.
Nguyen, Huu Hoanh & Nguyen Van Loi. 2019. Tones in the Cuoi language of Tan Ki district in Nghe An Province, Vietnam [1]. The Journal of the Southeast Asian Linguistics Society 12(1). ivii–ixvi.
Nguyen, Tuong Lai. 1992. Poong language – The first contact of languages between Viet and Thai. In The Third International Symposium on Language and Linguistics, Bangkok, Thailand, 98–107. Bangkok: Chulalongkorn University.
Nguyễn, Tài Cẩn. 1995. Giáo trình lịch sử ngữ âm tiếng Việt (sơ thảo) [Textbook on Vietnamese phonological history (preliminary)]. Vietnam: Nhà Xuất Bản Giáo Dục.
Nguyễn, Văn Khang, Bùi Chi & Hoàng Văn Hành. 2002. Từ điển Mường-Việt [A Mường-Vietnamese dictionary]. Hanoi: Nhà Xuất Bản Văn Hoá Dân Tộc.
Nguyễn, Văn Lợi. 1993. Tiếng Rục [The Rục language]. Hà Nội: Nhà Xuất Bản Khoa Học Xã Hội.
Nguyễn, Văn Tài. 2005. Ngữ âm tiếng Mường qua các phương ngôn. Hanoi: Nhà Xuất Bản Từ Điển Bách Khoa.


Premsrirat, Suwilai. 1996. Phonological characteristics of So (Thavung), a Vietic language of Thailand. Mon-Khmer Studies 26. 161–178.
Premsrirat, Suwilai. 2000. So (Thavung) preliminary dictionary. Salaya & Melbourne: Institute of Language and Culture for Rural Development, Mahidol University and the University of Melbourne.
Schiering, René, Balthasar Bickel & Kristine A. Hildebrandt. 2010. The prosodic word is not universal, but emergent. Linguistics 46. 657–709.
Sidwell, Paul. 2009. Classifying the Austroasiatic languages: History and state of the art. Lincom Europa.
Sidwell, Paul. 2015. Austroasiatic classification. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 144–220. Leiden: Brill.
Solntsev, V. M., N. V. Solntseva & I. V. Samarina. 2001. Phonetics and phonology. Field materials: Vocabulary and grammar. In Materials of the Russian-Vietnamese linguistic expedition from 1986: Ruc language. Moscow: Oriental Literature Publishing Firm, Russian Academy of Sciences. (In Russian)
Srisakorn, Preedaporn. 2008. So (Thavung) grammar. Thailand: Mahidol University dissertation.
Viện Ngôn Ngữ Học [The institute of linguistics]. 1995. Từ điển từ láy tiếng Việt [Dictionary of Vietnamese reduplicative words]. Hà Nội: Trung Tâm Khoa Học Xã Hội và Nhân Quốc Gia, Viện Ngôn Ngữ Học.

Paul Sidwell

23 Northern Austroasiatic languages of MSEA

23.1 Introduction

For the purposes of this chapter, the Northern Austroasiatic (NAA) branches are identified as Palaungic, Khmuic, Mang and Pakanic; they are mostly spoken in northern Laos, Thailand, Shan State of Myanmar, and neighboring areas of China and Vietnam (see Map 1). It is possible that these branches do form an AA sub-family (which includes the Khasian languages of Northeast India), although the phylogeny of AA remains an ongoing and difficult problem (see chapter 11 this volume). Whatever the real history of AA, it is striking that all four branches discussed here – with the exception of a subset of Khmuic lects – share a first person singular pronoun ‘I’ form that reflects *ʔɔːʔ, rather than *ʔaɲ which is commonly found in the rest of the family.

Map 1: Approximate distribution of Austroasiatic branches discussed in this chapter.

https://doi.org/10.1515/9783110558142-023

It is significant that NAA languages are spoken by communities that are separated by mountainous terrain, rivers, and international borders. Speakers are often minorities in their home area and are multilingual, and the national and other local languages that they are exposed to are consistently tonal and highly isolating. For example, in Shan State the Danau (Palaungic) mix with Burmese, Pa’O Karen, Shan, and Pale (Southern Palaung), while in northern Vietnam the Khang (Palaungic) are exposed to Vietnamese, Tai Dam, Lü, and Iu Mien. This mosaic of areal influences has contributed to diversification in phonology and morpho-syntax at the micro-level while also favoring a drift towards tonality, syllable reduction, and simplification in affixal morphology. This trend is particularly evident among the smaller languages, which are understandably more subject to external pressures. By contrast, Palaung and Khmu varieties, which have populations in the hundreds of thousands, have significantly resisted the areal drift towards syllable reduction and more analytical grammar. At the same time there are evident exceptions: Wa, with a population of over half a million in Myanmar (Watkins 2019: 432), is almost completely monosyllabic and isolating.

While the typological information here is sourced from published grammars, sketches, and dissertations, the relevant language documentation is limited and often written in different descriptive and analytical frameworks, which complicates the identification of common categories and structures. The available grammars span a period from the 1920s to the present, are written in various languages by writers with various language backgrounds, and are prepared on the basis of data collections that lack common programmatic principles. The problem is not great in respect of phonology, as basic notions such as segment, syllable, and tone are largely universal in linguistics, but the same cannot be said of morpho-syntax. Common terms are used with disparate meanings, and there is no core set of categories that are regarded as necessary in a grammatical sketch.
The approach taken in this chapter is conditioned by these limitations; I focus on a set of criterion languages for which there are multiple and/or extensive descriptions, and for which the reanalysis and relabeling needed to facilitate comparison are only moderate. The criterion languages are Golden Palaung (Mak 2012), Dara’ang Palaung (Deepadung et al. 2014), Kammu Yuan (Svantesson 1983; Svantesson and Holmer 2014), Khmu Cuang (Premsrirat 2002), Wa (Seng Mai 2012; Watkins 2019), Pray (Malapol 1989), Bugan (Li and Luo 2014), plus other languages and works as could be manageably utilized. One NAA branch, represented by just the Mang language, is regrettably absent from the discussion of morpho-syntax: the available sources for Mang are written in Vietnamese and Chinese and focus primarily on phonology and lexicon.

23.2 Phonology

The NAA languages exhibit a wide range of phonological typologies, from conservative lects that maintain segments and phonological patterns substantially unchanged from proto-AA, to innovators that show complete reduction to monosyllables, complete loss of various segments and/or segmental collocations, and the emergence of complex contour tones. Throughout NAA there remains the general AA preference for iambic structure, with its tendency to simplify the left edge of (sesqui)syllables, yet mergers among codas and among nuclei also occur (see Sidwell [2015] for a typology of phonological change in AA languages).

NAA onset inventories can be quite large; although a contrast between plain and implosive voiced stops is relatively rare, devoiced and/or preglottalized sonorants are common, and these increase the number of consonant series. In some cases there has been a general reversal of voicing values in obstruent onsets (as in Dara’ang Palaung), such that voicing predominates among stops in that position. Aspirated stops are also common syllable onsets, while they are often missing in other AA groups. In the cases of Angkuic (Palaungic) and Mal-Pray (Khmuic), aspiration is present due to a Germanic shift among voiceless stops, and languages also borrow words with aspirated stops from other languages such as Shan and Lao.

In terms of suprasegmental phonology, contour tones are rather common, although they tend to occur in a complex with breathy and/or creaky phonation, and sources vary in whether and how they include phonation in their descriptions of tone. The most tonal NAA languages are spoken in China or nearby areas of Indochina and Myanmar, and language contact is often apparent. Syllable nuclei may also contrast length, continuing the historical proto-AA pattern, although NAA languages under strong Burmese, Shan and/or Chinese influence in particular have lost or are losing the length contrast, while closely related varieties may continue to contrast vowel length robustly. Thus there is no particular NAA phonological type, but various tendencies do manifest, especially under areal pressures common to southern China and neighboring areas.

23.2.1 Word and syllable structure

The more conservative NAA languages continue the sesquisyllabic word template; phonological words are monosyllables or iambic disyllables, and longer strings of syllables may arise by affixation. More innovative languages have reduced all words and morphemes to monosyllables, although phrasal constructions and cliticization also condition the re-emergence of disyllables among the innovators. Palaungic and Khmuic both span this diversity, while Mang and Pakanic are both radically innovative, having reduced all morphemes to C(C)V(C) or CV(C). Among the latter, compounded strings of up to four morphemes are attested, although depending on the sources it can be difficult to distinguish phrasal constructions from compounds, or to tell whether that distinction is meaningful. Palaung and Riang of Shan State (as recorded by Shorto 1963) are robustly sesquisyllabic, and they also permit a modest number of polysyllables (e.g. Palaung rəkətaʔ ‘loom’). Simplifying Shorto’s analysis by treating aspirates as unitary segments, the dominant word structure of both Palaung and Riang can be represented as follows:


pre-σ           main-σ
(C₁ə.(C₂)).     ˈCi(Cm)V(Cf)

Additionally, Riang contrasts long and short vowels in closed syllables, and a high versus a falling tone, the latter reflecting the historical voicing distinction in onsets. Clustered onsets are formed with a medial (Cm) r or l, which can follow obstruents (Ci). A similar word template applies to other Palaungic languages such as Lawa and Lameet, and to Khmuic languages such as Khmu/Kammu and Mlabri.

Many NAA languages have a simpler preferred structure, with presyllable and onset-cluster reduction being common. At the extreme end of the reduction cline are languages such as Bumang (Palaungic) and Bugan (Pakanic), both monosyllabic languages with six contour tones. According to Dao (2007), Bumang permits only (Ci)V(Cf) main syllables, with no consonant clusters or presyllables. Even onsets are optional, so the minimal word shape in Bumang is a single vowel or syllabic nasal. Compensating for this structural limitation, compound words of up to four syllables are documented (e.g. kɯi⁵¹pɯn⁵¹luŋ⁵⁵sɔ²⁴ ‘weasel’, du³³tsɔp²¹ɯ⁵⁵ɯ⁵⁵ ‘momentarily’, etc.). Bugan (Li and Luo 2014) is similar, although nasals do not appear to function as syllable nuclei.

In Palaung and Riang, as in all NAA languages, there is no voicing or aspiration contrast permitted among codas, such that Cf can only be one of the voiceless stops, a voiced sonorant, or zero. The zero coda is not universally available, and numerous Palaungic and Khmuic languages do not permit open syllables in native vocabulary or open-class lexicon. The languages which have become tonal have typically lost weak codas such as glottal stop, and this gives rise to open main syllables. Additionally, some fairly conservative languages, such as the Palaung lects, have also lost glottal codas in a chain shift that saw lenition of velar codas (compare Palaung huʔ, Riang huk¹ ‘hair’; Palaung bri, Riang priʔ² ‘forest’).
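As a rough illustration of the Palaung and Riang word template given earlier in this section, the slots can be written as a regular expression. This is only a sketch under stated assumptions: the segment classes below are small illustrative sets rather than the actual inventories, aspirates are left out, the presyllable nucleus is always written ə, the restriction of medials to obstruent onsets is not enforced, and a full stop is used to mark the boundary between presyllable and main syllable. The function name `parse_word` is hypothetical.

```python
import re

# Illustrative segment classes only (assumptions for this sketch); they are
# not the full Palaung or Riang inventories, and aspirates are left out.
ONSETS = "ptckʔbdɟgmnɲŋshrlwj"
CODAS = "ptckʔmnɲŋhrlwj"
VOWELS = "ieɛɨəauoɔ"

# Optional presyllable: C1 + ə + optional coda (r, l, n), then a syllable
# boundary ".", then the main syllable Ci(Cm)V(Cf) with medial r or l.
TEMPLATE = re.compile(
    rf"^(?:(?P<pre_onset>[{ONSETS}])ə(?P<pre_coda>[rln])?\.)?"
    rf"(?P<onset>[{ONSETS}])(?P<medial>[rl])?"
    rf"(?P<nucleus>[{VOWELS}])(?P<coda>[{CODAS}])?$"
)


def parse_word(word):
    """Return the template slots filled by `word`, or None if it does not fit."""
    match = TEMPLATE.match(word)
    return match.groupdict() if match else None


print(parse_word("bri"))      # Palaung 'forest' (form cited in the text)
print(parse_word("huʔ"))      # Palaung 'hair' (form cited in the text)
print(parse_word("kər.taʔ"))  # made-up sesquisyllable, for illustration only
print(parse_word("a"))        # onsetless form: rejected, main onset is obligatory
```

A form that parses has each template slot either filled or empty, which makes it easy to see, for instance, that the Bumang (Ci)V(Cf) shape would need a looser pattern with an optional main onset.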
In the presyllable, the segments available for onsets are restricted to just six obstruents in Palaung and four in Riang, and to only three continuants in either. Presyllable codas are limited to just r, l or n (and the nasal may assimilate to the place of the following Ci). The nucleus of the presyllable is either an epenthetic vowel or the coda is syllabified. These restrictions reflect the inherently weaker status of the presyllable, which carries neither stress nor tone.

Other NAA languages allow for more complexity and/or a different range of contrasts. The presyllables of Lameet (Charoenma 1980) have no codas but allow for at least 17 consonants, including an aspirated series. Buxing (Gao 2004) contrasts nasals n, m, ŋ as codas in presyllables, plus presyllable vowels ɤ, ɿ, i, u in limited contexts, such that at least 27 different presyllables are attested. Mlabri (Rischel 1995) contrasts voiced, plain, and aspirated presyllable onsets, which in combination with nasal and liquid codas yield a large inventory.

Some NAA languages have a very restricted set of presyllables or none at all. Wa (Watkins 2013) is described as having only one presyllable, derived from an /s-/ prefix that surfaces as [si]. Presyllables also tend to be lost altogether under Chinese influence (as happened in Vietnamese), and this is seen clearly in the Pakanic lects Bolyu and Bugan, and in Palaungic languages such as Bumang. In both Bolyu (Edmondson 1995) and Bugan (Li Jinfang 1996) there are no presyllables in lexical roots, but there are many phrasal and cliticized forms¹ among derived lexemes. A similar pattern is seen in Mang (Nguyễn Văn Lợi et al. 2008; Gao 2001, 2003), in which there are no presyllables in lexical roots, although cliticizing class markers to nouns is common.

23.2.2 Phoneme inventories and phonotactics

23.2.2.1 Consonants

The consonant inventories of NAA languages often vary following areal patterns rather than phylogenetic groupings, although patterns based on common inheritance are evident. Conservative main syllable onset inventories include three oral stop series (voiced, plain, aspirated), plus nasal and approximant series, all falling across four places of articulation: labial (~labio-dental), apical, palatal (~alveo-palatal) and velar. Innovative NAA languages may have only two obstruent series, having reorganized their voicing and/or aspiration features. The innovative NAA languages tend to have a velar or uvular approximant, or they may have a distinct uvular place of articulation utilized for obstruents, nasals, and/or approximants. The uvular articulations tend to have arisen from fusion of onset clusters, so they are more common in languages that now favor CiV(Cf) main syllables. Additionally, devoiced nasals and approximants are common onsets. Many published sources treat these as clusters, yet by treating devoicing/aspiration as a segmental feature one can achieve a more unified account of syllable structure overall.

The more innovative NAA languages may also contrast apical versus laminal affricates instead of a single palatal or alveo-palatal place of articulation, and this contrast may extend into the fricatives (perhaps due to Iu Mien influence). AA languages historically have only one oral fricative [s~ɕ], while innovative NAA languages do vary from this pattern. A labio-dental f is commonly present due to loanwords (being a common sound in Lao, Lü, Shan, and Vietnamese), and Burmese influence in particular results in the presence of a dental fricative θ and/or an aspirated apical sʰ. Northern Vietnamese influence is seen in some of the languages replacing their palatal glide with a voiced laminal fricative ʑ, and a ʑ:ʒ contrast is recorded in some cases.

Typically conservative segmental inventories are reflected in Palaung and Khmu, as shown in Tables 1 and 2.

1 Li and Luo (2014) list various “prefixes” for Bugan, but this is used by the authors as a catch-all term for cliticized class markers (such as for body parts, animals, plants, etc.).


Tab. 1: Palaung Namhsan consonant inventory (based on Shorto 1960, 1963); segments in round brackets occur in loans.

Onsets (by place of articulation): p pʰ b m m̥ v[~β] ɸ[~f] | t tʰ d n n̥ l l̥ r r̥ s[~ɕ] | c ɟ ɲ ɲ̊ j j̊ | k kʰ g ŋ ŋ̊ | ʔ h
Codas (by place of articulation): p m w | t n r r̥ | c ɲ j | (k) ŋ | ʔ h

Tab. 2: Khmu Cuang consonant inventory (Premsrirat 2002); segments in round brackets occur in loans.

Onsets (by place of articulation): p pʰ b m m̥ ˀm w w̥ (f) | t tʰ d n n̥ ˀn l l̥ r r̥ s | c cʰ ɟ ɲ ɲ̊ ˀɲ j j̊ ˀj | k kʰ g ŋ ŋ̊ | ʔ h
Codas (by place of articulation): p m w | t n l r | c ɲ j ç | k ŋ | ʔ h

As mentioned above, a Germanic shift happened in both the Angkuic sub-branch of Palaungic, and the Mal-Pray Khmuic languages, the former spoken in Yunnan while the latter straddles the Thai-Lao borderland. As seen in Tables 3 and 4, Hu and Mal show parallelism in their consonants: both lost their voiced stop series and aspirated their plain stops, both have devoiced nasal onsets, both developed prenasalized obstruents, and both retain clusters with medial liquids. Yet they belong to different branches and are located in very different areal contexts, beyond mutual contact.
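The chain-shift character of this change can be sketched in a few lines of code. The sketch assumes, by analogy with the Germanic consonant shift that the label alludes to, that the voiced stops were lost by devoicing while the plain voiceless stops became aspirated; only labial, apical and velar stops are modelled, the input forms are made up, and the names `CHAIN_SHIFT` and `shift_onset` are hypothetical. The point it illustrates is that the two changes must apply simultaneously: applied one after the other, old voiced stops would wrongly end up aspirated as well.

```python
# Assumed correspondences for illustration (labial, apical, velar only).
# Voiced stops devoice; plain voiceless stops aspirate. Both rules read the
# ORIGINAL onset, so the output of b > p is not fed into p > pʰ.
CHAIN_SHIFT = {
    "p": "pʰ", "t": "tʰ", "k": "kʰ",
    "b": "p", "d": "t", "g": "k",
}


def shift_onset(form):
    """Apply the assumed shift to the initial consonant of a made-up form."""
    if not form:
        return form
    if form[1:2] == "ʰ":
        # Already aspirated onsets are left unchanged in this sketch.
        return form
    return CHAIN_SHIFT.get(form[0], form[0]) + form[1:]


for proto in ["ba", "pa", "ga", "ka", "ma"]:
    print(proto, ">", shift_onset(proto))
```

Using a single lookup table keyed on the original segment is one simple way to guarantee simultaneous application; a sequence of string replacements would need careful ordering to give the same result.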




Tab. 3: Hu (Angkuic) consonant inventory (Svantesson 1991). Onsets p pʰ mp mpʰ m m̥ w

  Codas t tʰ nt ntʰ n n̥ l θ nθ

s

c

k kʰ

ns ɲ

ŋkʰ ŋ ŋ̊

j

x

p m w

t n l

c ɲ j

k ŋ

ʔ ʁ

ʁ

Tab. 4: Mal (Khmuic) consonant inventory (Filbeck 1978), bracketed segments occur in loans. Onsets p pʰ mp mpʰ (b) m m̥ w w̥

  Codas t tʰ nt ntʰ (d) n n̥ l l̥

c ɲc ns

s

ɲ ɲ̊ j j̊

k kʰ ŋk ŋkʰ

ʔ

p m w

t n l

c ɲ j j̊

k ŋ ɰ

ʔ

ŋ ŋ̊ h

The languages that underwent more dramatic syllable simplification, such as Bolyu and Mang, have developed new oppositions among onsets while also reducing the range of possible codas; see Tables 5 and 6.

Tab. 5: Bolyu consonants (Edmondson 1995).

Onsets (by place of articulation): p pʰ mb m w v | t tʰ nd n l ɬ | ts tsʰ s | ʨ ʨʰ ɲ j ɕ | k kʰ ŋ ɣ | q qʰ | ʔ
Codas (by place of articulation): p m w | t n | j | k ŋ


Tab. 6: Mang consonant inventory (Nguyễn Văn Lợi et al. 2008), bracketed segments occur in loans. Onsets

  Codas

p ɓ m

t ɗ n

c

k

ɲ

v

r, l θ, s

ʑ

ŋ ŋ̊

ʔ

p m (w)

t n l

j

k ŋ

h

Broadly speaking, the coda consonants are more stable across NAA, although that is partly inherent in having a smaller set of oppositions compared to onsets.

23.2.2.2 Vowels

All of the NAA languages have at least three degrees of backness and height in their main syllable nuclei, although not all nine possible spaces may be filled, and length may or may not be contrasted. Additional features such as nasalization and rounding of front vowels are used in some languages, but these are not commonly reported in the sources. Long and short vowels are contrasted generally in Khmuic languages, in Mang, and within Pakanic in Bolyu but not in Bugan. In Palaungic the vowel length contrast is absent in various languages and sub-groups in something of a patchwork: length is lost in most Palaung-Riang lects, yet it exists for central vowels in Dara’ang Palaung, and is robust in Pale spoken in Yunnan. In Angkuic, length is contrastive in Muak-Saak but lost in U and Hu; it is lost generally in Wa-Lawa but robustly retained in Lameet-Lua lects; and in the Bit-Khang sub-group length is retained in Bit and Buxing but lost in other lects. It is tempting to suggest that the influence of Burmese, Shan, Lü and other languages with no or limited quantity distinctions is to blame, but exceptions are so common that it remains an open question.

Diphthongs within closed syllables are another point of differentiation across NAA; they are robustly attested in Khmuic and Mang, and in Pakanic they exist in Bugan but are absent from Bolyu. Within Palaungic the distribution is mixed, but there is a clear tendency for them to be either lacking or present only subject to strong phonotactic restrictions. Some representative vowel inventories are shown in Tables 7 to 13.




Tab. 7: Vowels of Dara’ang Palaung (Deepadung et al. 2014).

Monophthongs:
i   ɨ   u
e   ə   o
ɛ   a   ɔ
Short: ə̆ ă
Diphthongs: ia ua, ei ou, ai au

The length contrast for /a, ə/ is restricted to closed syllables. The diphthongs /ei, ai, ou, au/ only occur before /ʔ, h/. Additionally, /i, e, ə, o/ tend to be diphthongized in open syllables.

Tab. 8: Vowels of Lameet Lampang (Charoenma 1982).

Short:
i   ɯ   u
ɪ   ə   ʊ
e   ʌ   o
ɛ   a   ɔ
Long:
iː   ɯː   uː
ɪː   əː   ʊː
eː   ʌː   oː
ɛː   aː   ɔː
Diphthongs: (ia) (ɯa) (ua)

Lameet is unusual in having four degrees of height in monophthongs. The diphthongs are infrequent in the lexicon; Charoenma (1982: 40) states that ia is only present in Northern Thai loans, and finds only three examples of ɯa, ua in “native words”, although it is likely that these are loans from Khmu.

Tab. 9: Vowels of U (Angkuic) (Svantesson 1988).

i   ɨ   u
e   ə   o
ɛ   a   ɔ

The U vowel inventory is simple, with no length contrast, and no diphthongs in closed syllables. The historical velar nasal coda is realized phonetically as [ɑ̃] and consequently rhymes iɑ̃, ɛɑ̃, ɨɑ̃, aɑ̃, uɑ̃, ɔɑ̃ are attested.

Tab. 10: Vowels of Bumang (Dao 2007).

Monophthongs:
i   ɯ      u
e   ə      o
ɛ   ă, a   ɔ
Diphthongs: ea ɛa ɯa ua


Bumang only contrasts length in /a/ in closed syllables. The diphthongs occur only in open syllables or before glides j, w in native vocabulary.

Tab. 11: Vowels of Khmu Cuang (Premsrirat 2002).

Short:
i   ɨ   u
e   ə   o
ɛ   a   ɔ
Long:
iː   ɨː   uː
eː   əː   oː
ɛː   aː   ɔː
Also long: ʌː
Diphthongs: iə ɨə uə

According to Premsrirat, all Khmu dialects have “similar vowel phonemes” (Premsrirat 2002: xli). The length contrast is robust and the diphthongs are well represented in native vocabulary. The ʌː vowel is only marginally attested, occurring in only 12 out of approximately 3,800 entries in Premsrirat’s dictionary. Other Khmuic languages show only minor differences from Khmu; for example, Malapol (1989) specifies a vowel inventory for Pray that is effectively the same as that of Khmu but lacking the ʌː member. By contrast, Mlabri (Rischel 1995) has a robust ʌ, ʌː contrast, and diphthongs io, iʌ, iu, uɛ̃ are recorded in addition to those shared with Khmu, although their phonemic status is open to challenge (i.e. are they allophones of iə, uə?). Vocalism in the Pramic lects is not well documented, but it appears that in addition to the normal Khmuic vowels there is a tendency towards a diversity of diphthongal nuclei with h, ʔ codas.

Tab. 12: Vowels of Bugan (Li and Luo 2014).

Monophthongs:
i   y   ɯ   u
e       ə   o
ɛ       a   ɔ
Diphthongs: ie ia ua io

There is no length contrast in Bugan, and extensive mergers and losses among codas have resulted in many open syllables and vowel-coda restrictions. According to Li and Luo (2014), i, u, ɯ also function as codas (j, w, ɰ?) following diphthongs. Bugan and Bolyu have similar vowel inventories, including a front rounded y, although length is only contrastive in Bolyu.

Tab. 13: Vowels of Mang (Nguyễn Văn Lợi et al. 2008).

Long (unmarked):
i   y   ɨ   u
e   ø   ɤ   o
ɛ   œ   a   ɔ
Short:
ĭ       ɨ̆   ŭ
ĕ       ɤ̆   ŏ
ɛ̆       ă   ɔ̆
Diphthongs: iə ɯə uə




Mang is described with three front rounded vowels y, ø, œ, which is unusual for NAA. Length is contrastive, with Nguyễn Văn Lợi et al. treating the long vowels as unmarked, and there is no length marking of the front rounded vowels. Of the diphthongs, iə, ɯə only occur in closed syllables, while uə is frequent in open syllables.

23.2.3 Suprasegmentals

Among the NAA languages, contour tones and phonation types (breathy, creaky) are frequently utilized, yielding such a diversity that it is difficult to make strong generalizations. Broadly, we can distinguish several types of tone and register systems that occur among NAA languages:

1. Contour tone systems, which may also include a dimension of phonation;
2. Simple tone systems with two basic settings (e.g. high/low, rise/fall), which may include additional or more peripheral tone values (e.g. from borrowings);
3. Register languages primarily contrasting creaky and/or breathy voice.

In a majority of cases the type 2 and 3 systems have a common origin in the voicing setting of syllable onsets, and it is possible to find both types within sub-groups or among dialects of the same language.

Perhaps the most tonal NAA language is Bugan, unsurprisingly given the areal context. According to Hsiu, “Bugan is in contact with Miao (Hmong), Sha Zhuang (Northern Tai), Nong Zhuang (Central Tai), various Southeastern Loloish (Ngwi) languages and Southwestern Mandarin” (Hsiu 2016: 11). Following the description of Li and Luo (2014), Bugan has six tones in unchecked syllables (five contours plus a neutral tone in prefixes/clitics), three tones in checked syllables, plus a tense/lax register in which the tensed vowels are articulated with a somewhat lowered and backed tongue position. This yields ten contrastive tone-register combinations (Table 14).

Tab. 14: Tones of Bugan (Li and Luo 2014).
Tone

unchecked syllables

checked syllables

55 high level 33 mid level 35 high rising 13 low rising 31 low falling 0 neutral tone

x x x x x x

x x

x


Mang is described by Nguyễn Văn Lợi et al. (2008) as having five tones in sonorant-final syllables and another two in checked syllables. Tones for three lects are described and the values for each are somewhat different, but they do show equivalent distributions of contours, and creak in several tones. The tones of Nậm Pi Mang are given in Table 15 for illustration.

Tab. 15: Tones of Nậm Pi Mang (Nguyễn Văn Lợi et al. 2008).

Tone number   Contour   Context
1             22        unchecked syllables
2             14ˀ       unchecked syllables
3             42ˀ       unchecked syllables
4             3203      unchecked syllables
5             55ˀ       unchecked syllables
6             35        checked syllables
7             213       checked syllables

The Angkuic (Palaungic) languages generally have two to four tones. Hu (Svantesson 1991) has a high/low tone contrast, which phonologized out of the historical short/long vowel distinction, based on the tendency for pitch to fall over longer syllable durations. The closely related U language (Svantesson 1988) has four tones (high, low, rising, falling) due to interaction with multiple segmental features. Muak Sa’ak (Hall 2010) has an intermediate system with three tones plus one borrowed from Tai Lü. The Muak Sa’ak tones are given in Table 16.

Tab. 16: Tones of Muak Sa’ak (Hall 2010).
Tone number

Contour

Context

1 2 2 3

low high allotone falling allotone falling

checked, long vowels checked, short vowels unchecked (loan words) unchecked

Riang lects described by Luce (1965) have high versus falling tone, the latter correlating with historically voiced onsets. The Riang² spoken in Yunnan described by Dai and Liu (1997) has three tones: high-level, falling, and falling-rising, with creaky phonation in the falling-rising tone. Two-tone systems originating in onset voicing distinctions are quite common; for example, Lua of Wiang Papao (Charoenma 1982), also known as Khamet, has a two-tone system as described in Table 17.

2 Dai and Liu call the lect “the Guangka subdialect of De’ang” (Dai and Liu 1997: 91).

Tab. 17: Tones of Lua Wiang Papao (Charoenma 1982).
Tone       Contour                  Phonation
unmarked   low-rising ~ mid-level   modal
falling    mid-level-falling        sometimes breathy

Like Riang, the Lua falling tone follows historically voiced onsets. The phonation is described as “sometimes breathy”, with salience given to pitch over phonation; strong parallels are also seen in Khmuic. Premsrirat (2002) documents seven Khmu lects, divided into eastern and western groups on linguistic features. The eastern Khmu lects, such as Khmu Cuang, are non-tonal, while the western lects variously have a modal-breathy contrast, a high-low tone contrast, or a combination in which low tone is pronounced with a weakly breathy phonation. The western lects have also devoiced the historically voiced stop onsets, and have merged the devoiced and regular sonorants into a single series that occurs in either register/tone.

While tones are reported in other Khmuic languages, the mechanisms of tonogenesis vary. L-Thongkum and Intajamornrak (2008) confirm reports of a two-tone (falling vs. rising) system in Mal (Nan Province, Thailand). It is apparent that the falling tone predominates in native vocabulary while rising tone is assigned to Thai loans, with a proportion of the latter also getting falling tone. Although the Pramic sub-group of Khmuic (in the east of Northern Laos and neighboring Vietnam) is poorly documented, there are indications of potentially complex tone systems. Bùi Khánh Thế (2000) describes Phong as having three tones (rising, level, falling) without further detail. Premsrirat (pers. comm.) records Iduh (Xiangkhouang Province, Laos) as having five tone contours in combination with three phonation types, as in Table 18.

Tab. 18: Tone and phonation combinations in Iduh (Premsrirat pers. comm.).
Tone contour     Phonation
high             clear
mid              clear, breathy
low              clear
rising-falling   clear, clear+creaky coda
falling-rising   clear, breathy, creaky


As this brief survey indicates, the tonal diversity of NAA extends from conservative toneless languages to complex systems, the latter combining up to six contours and multiple phonation types. No single explanatory principle accounts for these developments, although it is clear that the tonal typologies are highly correlated with areal context, cutting across genetic groupings and political borders.

23.3 Word formation

23.3.1 Compounding

Compounding appears to be productive in all NAA languages, especially for deriving nominals such as kin terms, body parts, geographical forms, and animal and plant names by juxtaposing nouns, either in head-modifier relations or in simple coordination. Some examples of noun-noun compounds are given in Table 19.

Tab. 19: Selected NAA noun-noun compounds.
Language         Form            Literal gloss                      Translation
Kammu Yuan       c.ʔaːŋ plùʔ     ‘bone thigh’                       ‘thigh bone’
Kammu Yuan       tuːt crìʔ       ‘tree/plant banyan’                ‘banyan tree’
Bugan            mau³³-na³³      ‘younger brother-younger sister’   ‘sibling’
Bugan            da³⁵-ta̱ɯ̱³⁵      ‘water field’                      ‘rice field’
Golden Palaung   lɔ ʔom          ‘valley water’                     ‘stream’
Golden Palaung   ʔom bɯŋ         ‘water hole’                       ‘well’

We also find combinations of noun and verb forming nominal compounds, as in Table 20.

Tab. 20: Selected NAA noun-verb compounds.
Language   Form                Literal gloss      Translation
Pray       kʰaː ɲɕəːl          ‘fish bake’        ‘baked fish’
Pray       kʰram pəl           ‘people die’       ‘dead body’
Bugan      da¹³(³⁵)-na̱i̱⁵⁵      ‘water jump’       ‘wave’
Bugan      pə⁰qou⁵⁵-lu̱ŋ³³      ‘sky make-noise’   ‘thunder’

In Palaungic and Pakanic it is common to combine a verb and an object noun to derive a verb whose meaning is related to the parts directly or figuratively, often calquing compounds from other languages (examples in Table 21). This does not seem to be reported for Khmuic languages, but is not ruled out.

Tab. 21: Selected NAA verb-noun(object) compounds.
Language         Form               Literal gloss     Translation
Golden Palaung   hwij rimi          ‘finish family’   ‘to get married’
Golden Palaung   hɔm kuŋ            ‘eat country’     ‘to govern’
Bugan            ʦɔ̱³¹-ɕɔ̱³⁵          ‘hunt game’       ‘go hunting’
Bugan            bi³³(³⁵)-ma̱n⁵⁵     ‘buy wife’        ‘to marry (woman)’

23.3.2 Reduplication

Simple reduplication is productive throughout NAA, especially in expressive/adverbial vocabulary, in which it is used to code augmentation and distributed meanings, and can include a distinctive pitch contour (frequently higher pitch on the first iteration). It is also common to reduplicate verbs to indicate sequential action or to emphasize an action leading to a result. The following examples from Pray are indicative (Table 22).

Tab. 22: Pray reduplication.
Function                    Form              Translation (literal gloss)
Augmentation                ɕiw ɕiw           ‘very sweet’ (sweet sweet)
Consecutive action/result   pɔŋ pɔŋ           ‘eat then do something’ (eat eat)
Consecutive action/result   ʔuəj ʔuəj         ‘lie down then you will sleep’ (lie down lie down)
Verbal reduplication        leŋ wal leŋ rəl   ‘playing’ (run come run go)
Verbal reduplication        wət toʔ wət wal   ‘quickly return’ (quick come quick return)

Mak (2012), discussing Golden Palaung, provides examples of adverbial reduplicates with augmented, emphatic and distributed meanings. They are mostly reduplicated monosyllables but there are also phrasal reduplications (both AABB, ABAB patterns); some examples are given in Table 23.

Tab. 23: Golden Palaung reduplication.
Form                Meaning
roh roh             ‘very red’
l̥ɤ l̥ɤ               ‘in excess (time)’
ɲ̥oh ɲ̥oh             ‘very much’
lut lij             ‘exceedingly’
li li               ‘exactly; well’
ʔi lɤʔ ʔi lɛ        ‘unexpectedly’
ʔɔp ʔɔp             ‘small’
se se               ‘always’
ləŋ kəŋ             ‘log’
kʌp kʌp             ‘turtle’
ɟin ɟin             ‘sound of a vehicle’
ʔu siŋi ʔu siŋi     ‘every day’ < ‘one day one day’
kət kət mij mij     ‘fever’ < ‘cold cold hot hot’
grij lut grij li    ‘gossip’ < ‘tell bad tell good’
to poj to pij       ‘naked’ < ‘body tender body wash’
liŋ hʌʔ liŋ leh     ‘wander’ < ‘go-around up go-around down’

Reduplication can be phonologically complete or partial, and may involve additional morphology. This is well documented for Kammu Yuan, especially in the derivation of expressives, where it is utilized in conjunction with various affixes. The examples in Table 24 show reduplication of the root krop ‘clanking sound’.

Tab. 24: Kammu Yuan reduplication (Svantesson and Holmer 2014: 968).
Form               Meaning
krop               ‘clanking sound heard once’
krop~krop          ‘clanking sounds heard many times (in one place)’
krop-rŋ̀.krop       ‘clanking sounds heard many times (in many places)’
c.krop~c.krop      ‘clanking sounds heard at intervals’
sl.krup-sl.krop    ‘clanking sounds heard here and there’

Kammu Yuan reduplication strategies follow a range of phonological and morphological patterns, which are typologized in the examples from Svantesson (1983) in Table 25.

Tab. 25: Kammu Yuan reduplication types.
Type                  Form          Meaning
Total reduplication   caŋ caŋ       ‘kind of grasshopper’
Onset changing        cɔːn kɔ̀ːn     ‘platform’
Peak changing         lùːj làːj     ‘flying ant’
Coda changing         cɔ̀ːm cɔ̀ʔ      ‘locust’
Rhyme changing        làk lɛ̀ːŋ      ‘basil’
Peak preserving       kùːk truːl    ‘peaceful dove’
Coda preserving       ses wɛ̀ːs      ‘kind of cicada’
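The typology in Table 25 can be read as a statement of which parts of the syllable (onset, peak, coda) the two halves of a form share. The following Python sketch makes that reading explicit. It is an illustration only: the `Syllable` type, the function name and the hand segmentation of the forms are assumptions introduced here, not part of Svantesson's (1983) analysis, and tone and vowel length are deliberately left out of the comparison.

```python
from typing import NamedTuple


class Syllable(NamedTuple):
    """A hand-segmented syllable (hypothetical helper type).

    Tone marks and vowel length are assumed to be stripped beforehand.
    """
    onset: str
    peak: str
    coda: str


# Keys record whether (onset, peak, coda) are identical in both halves.
_LABELS = {
    (True, True, True): "total reduplication",
    (False, True, True): "onset changing",
    (True, False, True): "peak changing",
    (True, True, False): "coda changing",
    (True, False, False): "rhyme changing",
    (False, True, False): "peak preserving",
    (False, False, True): "coda preserving",
}


def reduplication_type(first: Syllable, second: Syllable) -> str:
    """Name the Table 25 type for a pair of pre-segmented syllables."""
    shared = (
        first.onset == second.onset,
        first.peak == second.peak,
        first.coda == second.coda,
    )
    # A pair sharing nothing is not one of the seven listed types.
    return _LABELS.get(shared, "no shared material")


if __name__ == "__main__":
    # lùːj làːj 'flying ant', segmented by hand without tone or length.
    print(reduplication_type(Syllable("l", "u", "j"), Syllable("l", "a", "j")))
```

On this segmentation each form in Table 25 falls under the label the table gives it; a different treatment of tone or length would change some of the outcomes, which is why those choices are flagged as assumptions.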

The reduplication patterns of Kammu Yuan are probably also indicative of those in the less thoroughly documented Khmu lects.




23.3.3 Derivation

23.3.3.1 Nominal derivation

Two strategies are followed to derive nouns from verbs or other nouns: morphological (infixation, prefixation) and configurational (such as collocating a word meaning ‘thing’ or similar). The morphological nominalization is quite ancient, while the configurational is more recent, often calqued from other languages. Neither Mang nor Pakanic shows indications of morphological nominalization.

Svantesson (1983) describes Kammu Yuan derivation of nouns from verbs (and occasionally from other nouns) with infixes ⟨n⟩, ⟨mn⟩ and ⟨rn⟩, the latter sometimes as a prefix. Additionally, the Lao nominalizers have been borrowed: kwàːm, kaːn, kʰɔːŋ, səŋ; səŋ is well integrated into Kammu Yuan, being frequently manifest as a prefix sŋ́-. See Table 26.

Tab. 26: Kammu-Yuan nominal derivations.
Affix    Base                 Derivative
⟨n⟩      koh ‘to cut’         knoh ‘cutting-board’
⟨n⟩      tiʔ ‘hand’           tniʔ ‘dexterity; stroke’
⟨mn⟩     kal ‘to measure’     kmnàl ‘measuring-rod’
⟨rn⟩     hɔːm ‘to tie’        hrnɔːm ‘bamboo strip; bundle’
rn       lùh ‘to pound’       rǹlùh ‘pestle’
rn       kɔːm ‘child’         krnɔ̀ːm ‘womb’
sŋ-      jɨ̀m ‘red’            sŋjɨ̀m ‘red things’
səŋ      cŋaːr ‘green’        səŋ cŋaːr ‘green thing’
kwàːm    haːn ‘to die’        kwàːm haːn ‘death’
kaːn     kàː ‘to trade’       kaːn kàː ‘business’
kʰɔːŋ    kʰɔːŋ ‘thing’        kʰɔːŋ nak sɛːŋ ‘thing charged with magic’

Other Khmuic languages have only moderate or no morphological nominalization; nominalizing infixes are attested in Mlabri (Rischel 2007; Bätcher 2014) in a few words, although these may have been borrowed from Khmu. In Pray and Mal, according to Filbeck (1978), “[t]he great majority of words are monosyllabic, and disyllabic words contain no hint of previous morphological construction” (Filbeck 1978: 28). Additionally, no specific indications of other nominalization strategies in Pray or Mal are apparent in the literature.

In Palaungic languages nominalization is rather limited. In Wa (Seng Mai 2012) nominalization is configurational, employing kɹaʔ for action/state nominalization and ʧao for agentive nominalization, e.g. kɹaʔ gau ‘the teaching’, ʧao gau ‘teacher’. In the closely related Plang language (Lewis 2008) there is a general nominalizer ku, e.g. ku la ‘word’ < la ‘to say’. Both Dara’ang Palaung and Golden Palaung have indications of nominalizing prefixes, although it is not clear how productive they are. Some examples are provided in Table 27.

Tab. 27: Palaung nominal derivations.
Language           Prefix   Base                      Derivative
Golden Palaung     ri-      gwij ‘dwell’              rigwij ‘dwelling place’
Golden Palaung     ri-      kʰrɛ ‘to protect’         rikʰrɛ ‘thing used for protection’
Golden Palaung     pʌn-     proh ‘to announce’        pʌnproh ‘announcement’ (v. > n.)
Dara’ang Palaung   N-       bih ‘to sweep’            mbih ‘broom’
Dara’ang Palaung   N-       louɁ ‘to hack with hoe’   nlouɁ ‘hoe’

Bugan is described as having three nominalizers, la⁴⁴, lɛ⁵⁵ and ni⁴⁴; they are placed post-verbally and derive nouns from active and stative verbs. Their etymologies are obscure, although it is notable that ni⁴⁴ resembles the proximal demonstrative ni³³.

Bugan (Li and Luo 2014: 1058)
(1) a. ʦu³¹ la⁴⁴/ni⁴⁴
       eat nml
       ‘things to eat, food’
    b. tʰo³¹/ʦei⁵⁵ lɛ⁵⁵
       big/small nml
       ‘big/small thing(s)’
    c. maŋ⁴⁴ ni⁴⁴
       purple nml
       ‘purple thing(s)’

23.3.3.2 Verbal derivation

The most common patterns of verbal derivation involve prefixing to increase or decrease the transitivity of verbs, or less frequently to derive transitive or intransitive~stative verbs from nouns. Increasing transitivity typically yields causative verbs, while detransitivization typically yields statives. These morphological strategies are found across AA with varying degrees of productivity and are clearly archaic, but are giving way to periphrastic constructions in many sub-groups. Khmu/Kammu in particular robustly retains morphological verbal derivation, apparently more so than other NAA groups. Among the monosyllabic Palaungic languages, Mang, and Pakanic, no morphological derivation of verbs is evident.

Svantesson (1983) discusses the Kammu Yuan morphological causative in extensive detail. Prefixes p-, pN- and infix ⟨m⟩ are applied mostly to intransitive verbs, some transitives, and some nouns, to derive causative verbs. The choice of allomorph is partly phonological (p- attaches to CVC syllables, ⟨m⟩ attaches to CCVC, while pN- occurs on both types). In some cases the coda of pN- may assimilate to the place, or even the place and manner, of the main syllable coda. Examples follow in Table 28.



Tab. 28: Kammu Yuan morphological causatives (Svantesson 1983).
Base form                    Derived form
haːn ‘to die’                phaːn ‘to kill’
poːl ‘to roll’               pnpoːl ‘to roll’ (tr.)
klɔːk ‘white’                pǹklɔːk ‘to whiten’
pàːɲ ‘drunk’                 pɲ̀pàːɲ ‘to make drunk’
mɔːŋ ‘sad’                   pŋ̀mɔːŋ ‘to sadden’ (tr.)
lɔ̀ːc ‘to forget’             pclɔ̀ːc ‘to make sb. forget’
tlùːj ‘to hang’ (intr.)      tmlùːj ‘to hang’ (tr.)
kses ‘to fall’               km̀ses ‘to drop’
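The partly phonological choice of causative allomorph described above can be sketched as a small Python function. This is a hedged illustration, not Svantesson's (1983) analysis: the function name and the pre-segmented input are assumptions made here, the infix is written as m because that is the segment inserted in the Table 28 forms derived from tlùːj and kses, and the sketch ignores both the tone marks on the derived forms and the assimilation of the nasal of pN-.

```python
from typing import List

# Assumption: the infix is read off the Table 28 forms for tlùːj and kses.
INFIX = "m"


def causative_candidates(onset: List[str], rhyme: str) -> List[str]:
    """List candidate causative shapes for a monosyllabic base.

    `onset` holds the onset consonants (one for CVC, two for CCVC) and
    `rhyme` the rest of the syllable; both are segmented by hand.
    """
    base = "".join(onset) + rhyme
    if len(onset) == 1:
        # CVC base: prefix p-, or the nasal prefix (unassimilated here).
        return ["p" + base, "pn" + base]
    if len(onset) == 2:
        # CCVC base: infix after the first consonant, or the nasal prefix.
        return [onset[0] + INFIX + onset[1] + rhyme, "pn" + base]
    raise ValueError("the stated rule covers only CVC and CCVC bases")


if __name__ == "__main__":
    print(causative_candidates(["h"], "aːn"))       # base haːn 'to die'
    print(causative_candidates(["k", "s"], "es"))   # base kses 'to fall'
```

The function only enumerates the shapes the phonological condition allows; which candidate a given verb actually takes is a lexical matter that the source does not reduce to a rule.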

Kammu Yuan is described as having a morphological resultative/passive with prefixes hN-, si-, ti-; these are effectively detransitivizing prefixes yielding statives from transitive bases.

Other Khmuic languages show only modest traces of verbal derivation. Filbeck (1978: 28) finds just a handful of words in Mal and Pray that suggest traces of *p- causative in the form of a homorganic nasal, e.g. mpləp ‘to immerse’ < pləp ‘to sink’, mpəl ‘to kill’ < pəl ‘to die’, ntʰeh ‘to put to sleep’ < tʰeh ‘to sleep’. The productive causative formation in Mal and Pray lects is a syntactic formation with ‘give’ (see 23.5.2).

Some productive verbal morphology is noted for Palaungic languages. In Golden Palaung the causative prefix p-, pʌn- turns any open class lexeme into a causative verb, and is characterized as “quite productive” by Mak (2012: 60).

Base form              Derived form
jəm ‘die’              pjəm ‘to kill’
diŋ ‘great’            pʌndiŋ ‘to make smth. great’
kʰɔ ‘hard’             pʌnkʰɔ ‘to harden’ (tr.)
kwət ‘load/burden’     pʌnkwət ‘to load’

Mak (2012) also describes kʌr- and kʌn- verbal prefixes; both derive verbs whose meanings in some way extend the meaning of the base lexeme. The derivatives of kʌn- include reciprocity in their scope, e.g. n̥er ‘similar’ > kʌr.n̥er ‘similar to each other’.


23.4 Clause structure

23.4.1 Simple clauses

23.4.1.1 Intransitive clauses

Among the NAA languages word order in intransitive clauses is dominantly SV, although the subject may be elided and the only necessary component is the verbal predicate (active or stative).

Kammu Yuan (Svantesson and Holmer 2014: 970)
(2) a. ròk təːm
       toad sing
       ‘The toad sang/is singing/sings’
    b. tuːt kcɔk kì plìa
       plant banana prox beautiful
       ‘The banana plant is beautiful’

Pray (Malapol 1989: 105)
(3) a. poːk klaːŋ
       tiger roar
       ‘A tiger roared’
    b. kʰwan jɛt ɲeːm
       child little cry
       ‘A little child cries’

Bugan (Li and Luo 2014: 1045)
(4) a. ʦioŋ³³ ŋgɔ³⁵
       deer step on
       ‘the deer stepped on’
    b. hɔŋ³³ zuŋ⁵⁵ ʦioŋ³³ ʦan³³
       footprint foot deer smelly
       ‘Footprints of deer are smelly’

Golden Palaung (Mak 2012: 12)
(5) a. bi jum doj doj
       people laugh all
       ‘The people all laughed’
    b. gir veŋ ti hɔ
       3d go.back dir palace
       ‘They went back to the palace.’

Intransitive clauses with VS order are also attested, with examples documented in Wa and Dara’ang Palaung, and it may be that VS clauses are more common in speech than is evident from the language documentation. It is not clear whether the difference in ordering is meaningful.

Wa (Mak 2012: 12)
(6) hoik hu ai kʰun ka dəʔ kəŋ nuʔ
    finish go N appl in field past.near
    ‘Ai Khun went to (his) field already.’

Dara’ang Palaung (Deepadung et al. 2014: 1101)
(7) Ɂăw jic maso
    neg get.up dog
    ‘The dog did not get up.’




23.4.1.2 Transitive clauses

Transitive clauses among NAA languages are commonly ordered AVP, although VAP is also noted. In the following example from Wa we see two conjoined clauses; V precedes the agent in the first clause, while A is elided in the second because it is shared with the first.

Wa (Seng Mai 2012: 23)
(8) ja̤m giah taʔ nap makmuŋ lwe tiʔ giah tiaʔ tiʔ
    when slice uncle n mango do.accidentally V.chain slice hand poss
    ‘When uncle Nap sliced the mango, he cut his fingers accidentally.’

Additionally, patients may be fronted for contrast or emphasis. In the following example from Kammu Yuan, the first two clauses are AVP, while in the third the patient (‘dog’) is fronted for focus while the agent (‘people’) remains in the archetypal immediate preverbal position. In the Bugan example the P (‘bananas’) is topicalized in the interrogative context.

Kammu Yuan (Svantesson and Holmer 2014: 980)
(9) kʰɗíʔ prìaŋ phaːn sɨaŋ, prìaŋ pə̀ʔ ʔàh sɨaŋ, sɔʔ prìaŋ pəʔ phaːn
    now people kill pig, people eat meat pig, dog people neg kill
    ‘Nowadays people slaughter pigs and eat pork, but dogs they don’t slaughter.’

Bugan (Li and Luo 2014: 1059)
(10) mʦe³³ ʦo³³ kai³³ ma³³ mʦe⁵⁵(³¹)
     banana have or neg.have
     ‘(Do you) have any bananas?’

Only AVP ordered clauses are noted in the Pray data, as in the following examples.

Pray (Malapol 1989: 104)
(11) a. meː pɔŋ caʔ
        mother eat rice
        ‘Mother is eating rice’
     b. nam pat ʔɔːk
        3s carry water
        ‘He carries water’

23.4.1.3 Ditransitive clauses

Ditransitive constructions have three arguments (Agent, Goal, Theme); the archetypal ditransitive clauses are formed with ‘give’, plus other verbs that involve a transfer of possession, knowledge, perception etc. (e.g. ‘lend’, ‘carry’, ‘show’, ‘explain’ etc.).


The lexical ‘give’ form also often grammaticalizes as a general marker of benefactive or goal, and consequently the preferred analysis can be ambiguous. The word orders noted in the NAA languages include AVGT and AVTG, and the theme is often but not always marked with an oblique preposition, especially if the order is AVTG, apparently the favored strategy in Palaungic. In the Kammu Yuan data we see AVGT without oblique marking and AVTG with oblique marking.

Wa (Seng Mai 2012: 66)
(12) ʔəuʔ tɔʔ pʰuk lai ka ai ka tiʔ pʰuk
     1s give book appl n one clf
     ‘I gave a book to Ai Kar.’

Golden Palaung (Mak 2012: 116)
(13) ʔɔ di dɛh ki.kʰrir ʔɯ ti mi
     1s will give goldfish prox dir 2s
     ‘I will give this goldfish to you.’

Kammu Yuan (Svantesson and Holmer 2014: 974)
(14) a. ʔòʔ cə̀ ʔùːn mè tèʔ ŋɔʔ smlàh
        1s irr give 2s.m get rice seed
        ‘He lent me a lot of money.’
     b. ʔòʔ ʔùːn kmuːl jʌ̀ʔ kə̀
        1s give money ins 3s.m
        ‘I handed money to him.’

Pray (Malapol 2012: 122)
(15) ɲjaj ʔon caʔ kʰuan
     elder.sister give rice younger.sister
     ‘Elder sister gave younger sister (some) rice.’

Bugan (Li and Luo 2014: 1059)
(16) ʔi³¹ nda⁴⁴ ti⁵⁵ ʔɔ³¹ fu³¹ da²⁴ kaŋ³¹ nau³¹
     3s lend give 1s money very many
     ‘He lent me a lot of money.’

23.4.1.4 Multi-verb predicates / serial verb constructions

NAA languages make moderate use of Multi-Verb Predicates (MVPs), also known as Serial Verb Constructions; these occur where verbs are concatenated to denote complex events. One may distinguish MVPs from verbal compounds, particularly when it is apparent that the latter have become lexicalized, although this may not be clear when data is limited. MVPs are especially utilized to add directional or temporal meanings. For example, in the following Kammu Yuan sentence we see ‘run, return, arrive’ in sequence; any one or two of these would have been sufficient for a complete sentence, but the sequence of verbs highlighting aspects of the event in temporal order has greater descriptive impact.

Kammu Yuan (Svantesson and Holmer 2014: 973)
(17) ɲàːr tàr kàːj rɔ̀t tà kàːŋ tè
     pn run return arrive loc house refl
     ‘Ñaar ran back to his house.’

Two Golden Palaung MVP examples follow. In the first, leh dɛh ‘descend give’ are clearly separate but sequential events, and we could analyze these as two clauses sharing the same A. Compare this to the next example, in which ɟɔm leh ‘follow descend’ are apparently not separate but code a single action which is occurring in a downward manner, and this collocation may potentially be lexicalized. Formally both sentences have the same structure but the semantics of predication differ, and depending on one’s theoretical perspective and what more extensive texts may reveal, both, one, or neither may be recognized as MVPs.

Golden Palaung (Mak 2012: 128)
(18) kʰun.pʰi leh dɛh ʔʌn hɔm ple.bri sin
     spirit descend give 3s eat mango ripe/cooked
     ‘The spirit came down to give ripe mangos for her to eat.’

Golden Palaung (Mak 2012: 128)
(19) niŋ ɟɔm leh de men ta lɔ
     princess follow descend self look dir valley
     ‘The princess followed down to look in the valley.’

23.4.2 Subordinate clauses

The discussion of subordinate clauses in NAA languages can be somewhat problematic: they often lack any overt marker and may be difficult to distinguish from coordinated or conjoined independent clauses, which leads to under-specification, and analyses may be regarded as arbitrary. Also, given the readiness to elide known or recoverable arguments, it may be challenging to distinguish a single clause incorporating an MVP from a combination of matrix and subordinate or conjoined clause describing a complex event. In relative and complement clauses there is often an overt subordinator, such as a pronoun, preposition, or partly bleached verb or noun, in which case the categorization may be straightforward. When overt subordinators are not used, the approach favored here avoids identifying MVPs when it appears sufficient to recognize subordinate or conjoined clauses.


23.4.2.1 Relative clauses

Relativization is common, and relative clauses may be introduced with or without a relative pronoun. Both patterns are found within Khmuic and Palaungic, but data is lacking to usefully discuss relativization in Mang or Pakanic. Kammu Yuan utilizes a relative pronoun kə̀m or nàm to introduce relative clauses:

Kammu Yuan (Svantesson and Holmer 2014: 976)
(20) taʔ kə̀m jɔ̀h téc pleʔ kók niʔ
     old.man rel go sell fruit hogplum medl
     ‘the old man who went to sell hogplums’

It is notable that in Wa the word order within relative clauses is strictly verb-initial: Wa introduces relative clauses with pə, and the verb always follows immediately after pə:

Wa (Seng Mai 2012: 118)
(21) səɹamaʔ pə moh ʔəuʔ hoik ʔiŋ laʃio
     teacher rel love 1S finish return N
     ‘The teacher who I love already went back to Lahsio.’

In both Dara’ang Palaung and Pray relative clauses are introduced without overt marking; they immediately follow the arguments they modify.

Dara’ang Palaung (Deepadung et al. 2014: 1080)
(22) Ɂo nap Ɂikăt tuc dɨ dɔj
     1sg know elder sit goal there
     ‘I know the elder (who) sits there.’

Pray (Malapol 1989: 65)
(23) ɕwaʔ ʔah mbəl ŋiʔ ɲɕen ʔaj mi lɔʔ
     dog 3pl kill yesterday not not good
     ‘The dog which they killed yesterday was not good’




23.4.2.2 Complement clauses

Complement clauses often involve verbs of communication and cognition (i.e. ‘to say that …’, ‘to think that …’, etc.). The complement is typically unmarked and directly follows the matrix clause, or a verb such as ‘say/tell’ may be semantically bleached and follow the matrix clause verb. Multiple complement clauses may be embedded in a single sentence.

Kammu Yuan (Svantesson and Holmer 2014: 978)
(24) km̀muʔ trkə̀t sah lòːk kì ʔah saːm cèn
     Kammu believe say world prox exist three level
     ‘The Kammu believe that this world has three levels,’

Dara’ang Palaung (Deepadung et al. 2014: 1080)
(25) daraɁaŋ ɲɨm tauh boŋ Ɂakja kɨn-kɨn dɨ pən dɨ houɁ məŋdεŋ
     Dara’ang believe tell make merit much-red goal get goal ascend heaven
     ‘Dara’ang people believe (that) making lots of merit will make able to go to heaven.’

Golden Palaung (Mak 2012: 133)
(26) ʔʌn nəp pwət mi de mɤh jipʰij
     3s know right.away mother self be ogress
     ‘He knew right away that his mother was an ogress.’

Seng Mai (2012: 111–112) finds that specific verbs in Wa subordinate clauses take the initial position, and her examples suggest that it is specifically those of communication and cognition. Consider the following example with ‘to see’:

Wa (Seng Mai 2012: 112)
(27) ʔəuʔ tom jao̤ʔ ʧɔk nɔh ɹɔm hia
     1s purp see scoop 3s honey
     ‘I saw that he was scooping honey.’

23.4.3 Clause chaining and coordination

Clauses that are grammatically independent but express related/sequential events are often chained without an overt linker, although there may be an intonational pause between them. As with subordinate clauses, it is problematic to distinguish chained events and MVPs, and no attempt to do so is made here. Similarly, conditional or contrastive events may be coordinated with or without linkers, and when linkers are used it is clear that they are often borrowings. Khmu Cuang provides examples of clauses chained with several different linkers, and most or all of these conjunctions appear to be loans.

Khmu Cuang clause linkers (Premsrirat 2002: ixi–ixiv).
Linker   Use                                          Note
taŋ      determination/possibility/obvious result     cf. Lao tȃŋ ‘set up, start, establish’
le       action normally follows another              cf. Lao lɛ̄ ‘and’
gaj      unnatural or unexpected action               cf. Lao kʰāj ‘wish/desire’
kɔ       action naturally follows                     cf. Lao kɔ̄ː ‘should/ought to’
ci~ce    result of action will occur                  cf. Lao sī ‘will/shall’
jɔʔ      indicates cause and effect

Khmu Cuang (Premsrirat 2002: ixvi)
(28) baː ci jɔh baː ci jat paʔ baːr nɛːw niʔ
     2s will go 2s will stay both two way prox
     ‘Will you stay or will you go? There are both ways.’

Similarly to Khmu Cuang, Kammu Yuan often utilizes conjunctions, such as lə̀ ‘then’ and wàːj niʔ ‘after that’ (both Tai loans), to link sequential actions, although no linker is obligatory, as in the following:

Kammu Yuan (Svantesson and Holmer 2014: 979)
(29) mè ʔuːn ʔoʔ teʔ sɨ́p màn, ʔoʔ cə̀ ʔə̀h
     2sg.m give 1sg get ten piastre, 1sg irr do
     ‘If you give me ten piastres, I will do it.’

In Pray independent clauses can be chained without overt linkers; however, in the case of a temporal sequence in which one action follows the completion of another, the first clause may be introduced with lɛːw (from Lao lɛ̑ːw ‘already’) and the second clause can optionally begin with ɕaŋ ‘then’ (cf. Palaung saŋ ‘about [to do something]’). Otherwise, the clauses can be linked with lot ‘so/therefore’.

Pray (Malapol 1989: 158)
(30) lɛːw ɕaː ɕaŋ ʔɛː lɛʔ tʰaːr mpəːn
     finish rice then 1pl find rope mat
     ‘After finishing working the field, then we go to collect rope for (weaving) mats.’

Pray (Malapol 1989: 148)
(31) waːŋ tʰɛʔ ɗok leh miaʔ lot ʔah ɕih ɕaː ɗok
     year prox many come rain so/therefore 3p plant rice many
     ‘This year it rained so much they planted much rice.’

In Wa the conjunction mai ‘and’ is used to join two independent clauses, while kɛʔ allows joining two noun phrases or pronouns. Conjunctions are also attested in Bugan.

Wa (Seng Mai 2012: 110)
(32) ʔəuʔ tom kaoh ʤau ʤau mai hu dəuŋ noŋ
     1s purp wake.up early early and go in forest
     ‘I woke up very early and went to the forest.’

Bugan (Li and Luo 2014: 1059)
(33) ʔi³¹ en³¹ bi⁴⁴ kau³¹ qa⁵⁵ tha³¹ nda²⁴ ken³¹ ȵou⁵⁵
     3sg wash finish cloth then carry water return home
     ‘Having finished her washing, she carried water back home.’

23.4.3.1 Conditional and contrastive coordination

Conditional clauses may or may not require a conjunction; if a conjunction is used it is typically at the beginning of the first clause and is often a loan word. This suggests that the use of conditional conjunctions is largely calqued and may be historically recent. The following examples from Palaung and Pray show borrowing of a Tai conditional (cf. Lao kʰán ‘if’).

Golden Palaung (Mak 2012: 62)
(34) ʔur̥ ʔɛ kʌn hɤh ʔʌn, ʔʌn jəm de
     odour 1p.incl if exhale 3s, 3s die self
     ‘If our breath exhales to him, he will die.’

Pray (Malapol 1989: 163)
(35) kan ʔan ʔaj mi ʔɛŋ kaːn ɕiwaʔ loʔ ʔɛw ʔuʔ tik tik, kəj miː ɕiwaʔ pɔŋ
     if 1s not not do work what visit visit cont particle, not have what eat
     ‘If we do not work (but just) always got the visit, we will have nothing to eat.’

Wa conditionals are made with the conjunction pʰan ‘if’:

Wa (Seng Mai 2012: 116)
(36) pʰan nɔh hwet ʤau ʤau nuʔ nɔh tɤ jao̤ʔ lai ʔin
     if 3s come early early near.past 3s will.certain see letter prox
     ‘If he had come earlier, he would have seen the letter.’

Kammu Yuan requires no conjunction for conditional or contrastive coordination; see (29). Contrastive coordination may be coded with (Golden Palaung, Bugan) or without (Khmu Cuang) various linkers.


Golden Palaung (Mak 2012: 54)
(37) veŋ mi kʌnmɤh jɛ ki veŋ ci
     go 2s contr 1p.excl neg go pol
     ‘You go. In case of us, we don’t go.’

Bugan (Li and Luo 2014: 1059)
(38) mʦe³³ ʦo³³ kai³³ ma³³ mʦe⁵⁵(³¹)
     banana have or neg.have
     ‘(Do you) have any bananas?’

Khmu Cuang (Premsrirat 2002: ixvi)
(39) baː ci jɔh baː ci jat paʔ baːr nɛːw niʔ
     2s will go 2s will stay both two way prox
     ‘Will you stay or will you go? There are both ways.’

23.4.4 Questions

Both polar questions and content questions are discussed. Polar questions in NAA languages are typically formed by adding particles to declarative clauses; the polar particles take various positions depending on the language (although those under strong Tai influence favor sentence-final position). Content questions in all NAA languages are formed with an interrogative pronoun taking the place of the corresponding argument in a declarative clause. In some cases it is normal to add a final particle to a content question, especially one that signals the questioner’s expectation of agreement or positive/negative response, and these particles can grammaticalize into interrogatives. It is also common to calque interrogative strategies from other languages and/or to borrow the forms (as in the Pray examples (41, 42) below). Intonation is also known to play an important role in interrogation, although the published sources vary in the extent to which it is discussed.

Kammu Yuan follows the typical pattern in respect of content question words; polar questions are formed with a clause-initial particle.

Kammu Yuan (Svantesson et al. 2014: xxiii) polar question
(40) bec nàː mə̀h kʰuː
     q 3s.fem be teacher
     ‘Is she a teacher?’

Pray polar interrogative clauses are formed with a range of final particles kaː, ʔəʔ, bɔʔ (which appear to reflect Lao/Northern Thai influence) plus ʔan, ʔaj, which are accompanied by distinctive rising and falling intonation, respectively (Malapol does not explain the significance). The Pray interrogatives used in content questions are also influenced by Lao/Northern Thai borrowings (e.g. ɕi.waʔ ‘what’ cf. wā interrogative particle).




Pray (Malapol 1989: 129)
(41) mah pen Ɂah.praj kʰɯː noj kaː
     you be Pray same each.other fp
     ‘Are you a Pray like us?’

Pray (Malapol 1989: 170)
(42) Ɂaw.meː Ɂɛŋ kaːn ɕi.waʔ
     parents do work q
     ‘What do your parents do?’

Golden Palaung forms polar questions in several ways: (i) by pronouncing a declarative with rising intonation (responding with falling intonation); (ii) with the particle kɔ immediately after the main verb, plus optional paj before the verb; (iii) with the negator ki following the verb, plus the positive counter-option.

Golden Palaung (Mak 2012: 33–34)
(43) a. pʌnlut mi daŋ ɲ̊oh paj nəp kɔ
        sin 2s big/great truly y/n know y/n
        ‘Do you know that you sin greatly?’
     b. mi veŋ ki veŋ
        2s go/come neg go/come
        ‘Do you go or not?’

Bugan evidences a neat parallel to the Golden Palaung pattern (44).

Bugan (Li and Luo 2014: 1060)
(44) ʔi³¹ no³¹ mə⁵⁵ no³¹ ni⁵⁵
     3s come neg come q
     ‘Is he coming?’

In Wa polar questions are formed with a rising intonation plus the optional final particle lɛ. Adding the negator ʔaŋ and the copula mɔ̤h makes a tag question:

Wa (Seng Mai 2012: 104)
(45) hu ma̤iʔ Ɂaŋ mɔ̤h lɛ
     go 2s neg be q
     ‘Will you go, won’t you?’

23.4.5 Imperatives
Among the NAA languages, imperatives/commands typically consist of a predicate with or without an imperative particle (which may be a grammaticalized verb); additionally or alternatively, polite or down-toning particles may be used, and the addressee is typically not overtly expressed. The simplest form of imperative clause is a bare verb.

In Wa, negative imperatives are formed with bɔ (a borrowing of the regular Lao negative) fronting the clause. The polite marker ʧa is added before the verb to soften commands.

Wa (Seng Mai 2012: 74)
(46) a. bɔ siblɯ̤h sədaʔ soʔ nɔh
        proh pull tail dog 3s
        ‘Don’t pull his dog’s tail’
     b. ʧa ŋhɛt ku kəuʔ
        pol listen every clf
        ‘Please everyone listen (to me).’

Golden Palaung commonly employs unmarked imperatives, although the verb may be fronted for extra force. Additionally, one can soften a request or command by opening the clause with dɛh (from the lexical verb ‘give’), and/or make it polite with the clause-final particle ci. Both strategies are illustrated in (47).

Golden Palaung (Mak 2012: 146, 35)
(47) a. va ʔit mi ta carɔp
        come sleep 2s dir rest-house
        ‘You come sleep at the rest house!’
     b. dɛh ʔɔ tiŋ l̥ɛ mi ci
        imp 1s load cart 2s pol
        ‘May I load your carts please?’

In Kammu Yuan imperatives an overt second person subject is often expressed. One can add an optional clause-final hortative ʔəːm, although unmarked imperatives are common. Negative imperatives are formed with the prohibitive taʔ before the main verb.

For Bugan, two imperative final particles are documented, la0 and no³¹.

Bugan (Li and Luo 2014: 1060)
(48) a. wi³¹ na⁵⁵ la0
        we go imp
        ‘Let’s go.’
     b. mɯ³¹ na⁵⁵ ha⁵ no³¹
        2s go first imp
        ‘You go first.’

23.4.6 Syntax and pragmatics
As is common in the MSEA Linguistic Area, the configuration of constituents is not strictly rigid but is subject to pragmatic reordering or deletion. Topic-comment structure is preferred, and fronting of topicalized constituents occurs throughout NAA languages without the need for special topic-marking morphemes, grammatical relations being retrievable from context. Topicality may nonetheless be marked by intonation and/or interjections; for example, Seng Mai (2012: 121) discusses Wa speakers sometimes marking topics with the interjections kɛʔnɛ and hɤ, which she glosses as ‘uhm’. Malapol (1989) reports that in Pray any participant may be fronted: “[a] pause or intonation break usually occurs after the element that is topicalized” (Malapol 1989: 143).

Throughout NAA, previously mentioned or known core arguments may be deleted in speech, and eliding known referents serves in part to make new information more prominent. However, as Watkins (2019) points out for Wa, “definite arguments in Wa tend to be pronominalized and retained, rather than being elided altogether” (Watkins 2019: 459). Broadly speaking, 1st person subjects/agents are readily assumed pragmatically, and there are times when speakers wish to express deference by minimizing self-reference. Also, the use of a transitive verb implies a patient, and known information combined with verbal semantics often renders the overt utterance of the word unnecessary for communicative purposes.

Topicalized constituents:

Pray (Malapol 1989: 143)
(49) ɕalɔh tʰɯh ʔih ʔɛŋ ɕɛː
     mountain high 1p do field
     ‘(On those) high mountains we make (our) fields.’

Golden Palaung (Mak 2012: 148)
(50) kunma ʔɔ ka jɤ
     parents 1s neg possess
     ‘I don’t have parents.’

Kammu Yuan (Svantesson and Holmer 2014: 980)
(51) laʔ kə̀m ʔàh ˀjiak siːm taʔ tèʔ ʔəːm
     leaf rel exist shit bird proh get hort
     ‘Don’t take those leaves that have bird shit on them!’

Deleted constituent:

Golden Palaung (Mak 2012: 145)
(52) – rɛʔ de ŋɤp gɤ pʌn ɲi kʰun.hɔ.kʰəm
     (queen) watch self look only rel do king
     ‘(The queen) only watched what the King did.’

23.5 Grammatical categories
23.5.1 TAM
23.5.1.1 Tense and aspect
The Khmuic and Palaungic languages are broadly consistent in that they encode aspectual meanings with coverbs, eschewing direct use of grammatical tense, while the limited documentation makes this less clear in the case of Bugan. Generally in Khmuic and Palaungic there is a perfect/completive preverb, often derived from the lexical verb *hoːc ‘finish’, which has a robust AA etymology, although in some cases (such as Pray) this has been dropped in favor of a borrowed term along with its syntax. Other aspectual categories, such as continuative/durative, inchoative, irrealis, variously occur but are more likely to be coded with loan words or otherwise more recently formed constructions in Pray.

Kammu Yuan has several aspect markers in addition to hoːc ‘to finish’, which is used preverbally and/or clause-finally. Irrealis mood cə̀ (an old Tai loan) is essentially obligatory for future events and takes a preverbal position, and there is a special negative-perfect preverbal praʔ ‘not yet’. Habitual aspect is signaled with the verb kùʔ used with the sense of ‘used to’.

Khmu Cuang is described a little differently from Kammu, but the discrepancies may be more descriptive than real. Irrealis/future is signaled with ci~ce; this is cognate with Kammu Yuan cə̀ but looks like a later borrowing. In Premsrirat’s account, ci~ce indicates that the action will occur as a consequence of a stated condition/action, as in (53). Two Khmu Cuang perfect markers are mentioned, hi and hoːc, but no cognate of praʔ is recorded. Preverb gəːj signals ‘used to’ (cf. Lao kʰə́ːj ‘ever, accustomed to’, used to mark habitual/ongoing action).

Khmu Cuang (Premsrirat 2002: lxiii)
(53) haː sroʔ rɛːŋ kʰaːw briəŋ ce mec
     proh speak loud otherwise other will hear
     ‘Don’t speak loudly, or others will hear.’

Pray is described as having two aspect markers, ʔuʔ continuative and lɛːw perfective, directly comparable to Lao jūː ‘live/yet/still’ and lɛ̑ːw ‘already/past tense’. Both Pray aspect markers are placed post-verbally.

Pray (Malapol 1989: 86–87)
(54) a. nam pɔŋ caʔ ʔuʔ
        3s eat rice cont
        ‘He is eating.’
     b. mah ʔəm nɔːk lɛːw
        2s bathe water perf
        ‘Did you already take a bath?’

Golden Palaung parallels Pray somewhat; perfective and durative aspects are marked with the grammatical use of two lexical verbs as preverbs: hwaj ‘finish’, used to mark perfective, and dʌʔ ‘stop/remain’, used to signal durative. Additional aspectual preverbs are di ‘will’ for impending/intended action and jɤ ‘begin to’ for inchoative.

Golden Palaung (Mak 2012: 30)
(55) gir pju de rimi hwaj bɤn pur sinʌm jʌʔ
     3d make self family finish get seven year sure
     ‘They have been married for seven years already.’




Five aspectual preverbs are listed for Wa: hoik completive (cognate with Khmu hoːc), kɔn durative, dɔ̤k present perfect (usually in combination with hoik), jao̤k inceptive, and saʔ, which marks that the speaker has experience of performing the action and is thus inherently realis.

Wa (Seng Mai 2012: 83–84)
(56) a. kaŋkwɛ ʔan jao̤k tiʔ to
        rabbit prox incept V.chain run
        ‘The rabbit began to run.’
     b. nɔh saʔ sum ɲɛ̤ʔ
        3s exp build house
        ‘He has built a house.’

Two aspectual forms are evident in Bugan; the perfect bi⁴⁴ ‘finish’ follows the main verb iconically, while the durative is marked with the pre-verb sai³³ and the optional final particle naŋ³¹.

Bugan (Li and Luo 2014: 1059)
(57) ʔi³¹ en³¹ bi⁴⁴ kau³¹ qa⁵⁵ tha³¹ nda²⁴ ken³¹ ȵou⁵⁵
     3sg wash finish cloth then carry water return home
     ‘Having finished her washing, she carried water back home.’

Bugan (Li and Luo 2014: 1054)
(58) li⁵⁵ sai³³ ʦo̱u̱³¹ ʦiu⁵⁵ naŋ³¹
     ox dur eat grass dur
     ‘The ox is eating grass.’

23.5.1.2 Modality
Much like aspect, modality (communicating the speaker’s ability, intentions, belief, etc.) is generally encoded with coverbs, mostly preverbal, although postverbal use also occurs. In each NAA language there is generally a set of modal preverbs with meanings such as ‘to want to’, ‘to intend to’, ‘must’ and so forth. Additionally, there is a common pattern – also found more widely in MSEA languages – for ‘get/obtain’ to be used as a coverb to indicate ability/success/permission. The grammaticalized meaning of the coverb may be rather distant from its lexical source.

Golden Palaung (Mak 2012: 62)
(59) ʔise ki bɤn kivɛʔ
     anyone neg get play
     ‘No one is allowed to play.’

Wa (Seng Mai 2012: 226)
(60) ai ka pon to pʰai kʰaiŋ ai kʰun
     pn get run quick than pn
     ‘Ai Kar can run faster than Ai Khun.’


Kammu Yuan (Svantesson and Holmer 2014: 979)
(61) jàːm jʌ̀ tè cə̀ pəʔ pɨ̀an pə̀ʔ màh mɨ̀ s.kiʔ
     cry ins refl irr neg get eat rice day today
     ‘(I) am crying because I will not be able to eat anything today.’

Pray (Malapol 1989: 78)
(62) ʔəɲ kuɲ pɔŋ pʰlɛʔ kʰam ɕiw
     1s get eat tamarind sweet
     ‘I was able to eat sweet tamarind.’

Seng Mai (2012) describes Wa as having a purposive mood, “the eventuality is true/not true for a reason” (Seng Mai 2012: 84), signaled with pre-verb tom; it occurs frequently in the textual examples in that source.

Wa (Seng Mai 2012: 112)
(63) ʔəuʔ tom jao̤ʔ ʧɔk nɔh ɹɔm hia
     1s purp see scoop 3s honey
     ‘I saw that he was scooping honey.’

23.5.2 Causative and passive constructions
NAA languages show two basic strategies for forming causatives: (i) morphological, with prefixes deriving causative verbs, and (ii) syntactic. The first type is already discussed under Derivation (23.3.3), so we focus on the periphrastic type here. Syntactic causatives are not discussed in the Palaung or Khmu/Kammu sources, although it would not be surprising to find periphrases calqued after Lao, Shan, etc. We find syntactic causative constructions described only for Wa, Pray/Mal, and Bugan.

In Wa, causative constructions are formed with kɛ̤h ‘cause’ (< ‘be born’), tɔʔ ‘give’, vɛʔ ‘bring’, and mʰaiŋ ‘command’ functioning as causatives that take complement clauses (paralleling the syntax of the Lao hàj and Shan haɯ3 causative constructions).

Wa (Seng Mai 2012: 98)
(64) kok kɔn ɲɔm kɛ̤h kloŋ ʧʰaʔ maʔ hwet
     call child cause cup tea broken come
     ‘Call the child who broke the glass to come.’

For Pray only syntactic causatives are described.

Pray (Filbeck 1978: 29)
(65) ʔaŋ nam toʔ
     give 3s come
     ‘Cause him to come’




In Bugan ŋgɔ̱³⁵ ‘to drive out’ is used as a preverb, combining with main verbs to form causatives; e.g. ŋgɔ̱³⁵xo̱u̱³⁵ ‘to make/force sb. to ride (a horse)’ (Li and Luo 2014: 1053).

Various passive-like constructions occur in which the affectedness of the subject/patient is emphasized. Golden Palaung has a preverb bʌp, glossed ‘involuntary’ by Mak (2012), which emphasizes the affectedness of the subject in intransitive clauses. In a transitive clause bʌp emphasizes the obligation for the agent to perform the action.

Golden Palaung (Mak 2012: 8)
(66) ʔʌn bʌp lʌr̥
     3s involuntary hit
     ‘He was forced to receive hitting’

In Wa there are two distinct constructions: an intransitive clause with kʰam ‘suffer’ or jao̤ʔ ‘forced to’ and the agent expressed in a relative clause, and a V-initial transitive clause with dummy subject kiʔ ‘they’ or pui ‘people’.

Wa (Seng Mai 2012: 95–96)
(67) a. ai kʰun kʰam kɹaʔ pə giɛ̤t soʔ
        n suffer nml rel bite dog
        ‘Ai Khun was bitten by a dog.’
     b. hoik sum kiʔ/pui ɲɛ̤ʔ tin
        perf build 3pl/people house here
        ‘The house is built here.’

In Bugan the verb tɕo³¹ ‘to hit’ has grammaticalized as a passive marker, effectively marking a ‘by’ phrase:

Bugan (Li and Luo 2014: 1059)
(68) ʔi³¹ tɕo³¹ pi³¹kua⁴⁴ naŋ²⁴ ʔa⁴⁴
     3s hit wasp sting disc
     ‘He got stung by wasps.’

Kammu Yuan creates morphological passives with prefix hn- (e.g. caːk ‘tear’ > hncaːk ‘torn’), but it occurs with rather few verbs. A periphrastic passive with tʰuːk ‘receive/undergo’ (cf. Lao tʰɨ̀ːk ‘receive/undergo/suffer’) is also available.

Kammu Yuan (Svantesson et al. 2014: 355)
(69) kə̀ː tʰuːk tòːt
     3s receive/undergo capture
     ‘He was captured’


23.5.3 Negation
NAA languages show similar patterns in negation: negative auxiliaries precede the verb and negate the predicate. The same form may also be used to negate the whole proposition, e.g. as an adverb preceding the clause or as an exclamation on its own (either declarative or imperative). Some languages use different forms for negative imperatives (prohibitives) or for negative copulars (‘not be/are no’). It is also normal to have specific negative temporal auxiliaries ‘not yet’, ‘never’, etc. Bipartite negation has not been noted.

In Golden Palaung declaratives are negated with preverbal ka, while prohibitives are formed with maj (cf. Thai mâj ‘not’). There are also negative auxiliaries such as ɲ̊əm ‘not yet’.

Golden Palaung (Mak 2012: 109)
(70) a. maj hʌʔ tʰap rakʌrvɤj
        proh go.up level area.above
        ‘Don’t go upstairs.’
     b. saŋi din japʰaj ka gwaj
        day dist ogress neg dwell
        ‘That day, the ogress was not there.’

Wa has two pre-verbal negators, ʔaŋ for declaratives and bɔ (cf. Lao bɔ̄ː ‘not’) for prohibitives. They precede any preverbs, the verb complex or even the S/A. The negators are used in conjunction with negative auxiliaries such as ɲaŋ ‘not yet’ (cf. Lao ɲáŋ ‘still, not yet’) or lai ‘not anymore’. Strikingly, VPA order, which is otherwise available in Wa, is not used with negation.

Wa (Seng Mai 2012: 73–74)
(71) a. bɔ siblɯ̤h sədaʔ soʔ nɔh
        neg.imp pull tail dog 3s
        ‘Don’t pull his dog’s tail!’
     b. ʔaŋ pliʔ ʔin ɲaŋ tom
        neg fruit prox not.yet ripe
        ‘This fruit is not ripe yet.’

The Kammu Yuan negator pəʔ ‘not’ is used preverbally, or adverbially to negate the whole clause. Additionally, there are negative auxiliaries praʔ for negated perfect, pʰɔn ‘never’, and a negative emphatic plɔʔ. The prohibitive is taʔ.

Kammu Yuan (Svantesson and Holmer 2014: 971)
(72) a. màt ʔòʔ ʔès, plɔʔ kùːɲ məh
        eye 1s swell neg.emph see what
        ‘My eyes were swollen and I could not see anything at all’
     b. kə̀ pəʔ nə̀ːŋ ŋɔ̀r
        3s neg know road
        ‘He did not know the road.’




Pray pre-verbal negators ʔaj and kəj are used with declaratives, and either may combine with mi to increase its force (it is not clear whether there is a semantic difference between ʔaj and kəj; note that the regular negator in Mlabri is ki~kəki, cognate with Pray kəj). The Pray prohibitive is ʔam ‘do not’.

The Bugan pre-verbal negator form is mə⁵⁵~mɯ⁵⁵, and this combines with sa̲ŋ⁵⁵ ‘be’ to form an analytical negative copular/existential.

Bugan (Li and Luo 2014: 1060, 1044)
(73) a. ʔi³¹ mə⁵⁵ no̲³¹ ma¹³
        3s neg come asrt
        ‘He is not coming here.’
     b. ʔɔ³¹ mə⁵⁵ sa̲ŋ⁵⁵ piau¹³ pə⁵⁵se³³
        1s neg be person Guangnan
        ‘I am not a native of Guangnan.’

23.5.4 Pronominals
23.5.4.1 Personal pronouns
It is in the personal pronouns that we find a single feature unifying the NAA languages: all four branches share a 1st person singular form that can be reconstructed *ʔɔːʔ, in contrast to *ʔaɲ, which is found through much of AA and in a subset of Khmuic languages. In common with much of AA, NAA personal pronouns tend to distinguish singular, dual, and plural forms, although the dual category is vulnerable to loss or renewal with analytical forms. Additionally, there may be inclusive-exclusive dimensions, and gender or status distinctions, especially in 2nd and 3rd person forms. In the details, personal pronoun forms and structures can vary significantly even between closely related languages. In this respect, compare Khmu Cuang and Pray in Tables 29 and 30.

Tab. 29: Khmu Cuang pronouns (Premsrirat 2002).
            sg     du      pl
1           ʔoʔ    ʔaʔ     ʔiʔ
2 female    baː    sbaː    bɔː
2 male      meː    sbaː    bɔː
3 female    naː    snaː    nɔː
3 male      gəː    snaː    nɔː


Tab. 30: Pray pronouns (Malapol 1989).
          sg     du      pl
1 excl    ʔəɲ    jəʔ     ʔɛː
1 incl    ʔəɲ    ʔaː     ʔih
2         mah    paː     pɛː
3 +HUM    nam    paːm    ʔah
3 -HUM    jeh    paːm    ʔah

Although related within the same branch, Khmu and Pray hold few forms in common: only Pray makes an inclusive/exclusive distinction, Khmu has gendered singular forms, and Pray distinguishes ±human in the third person singular. Palaungic languages tend to maintain the inclusive/exclusive distinction in 1DU and 1PL forms, as seen below in Palaung and Wa, although this is lost in U. Otherwise the Palaungic languages are consistent in terms of their categories and forms. An exception is Eastern Lawa, with a reduced pronoun inventory: dual forms are lost entirely and plural forms have been renewed analytically by borrowing Thai mùː ‘group’. (See Tables 31–34.)

Tab. 31: Palaung pronouns (Mak 2012).
          sg     du     pl
1 excl    ʔɔ     jar    jɛ
1 incl    ʔɔ     ʔaj    ʔɛ
2         mi     par    pɛ
3         ʔʌn    gar    gɛ

Tab. 32: Wa pronouns (Seng Mai 2012).
          sg      du     pl
1 excl    ʔəuʔ    jaʔ    ji̤ʔ
1 incl    ʔəuʔ    ʔaʔ    ʔeʔ
2         mai̤ʔ    paʔ    peʔ
3         nɔh     kɛʔ    kiʔ




Tab. 33: U pronouns (Svantesson 1988).
     sg     du      pl
1    ʔò     ʔǎj     ʔè
2    mî     pʰǎj    pʰé
3    ʔə̀t    kǎj     ké

Tab. 34: Eastern Lawa pronouns (Block 2013).
              sg              pl
1             ʔaj             mu ʔɛ
2 informal    aʔ              mu aʔ
2 formal      maʔ             mu maʔ
3             uj tʰɔ ~ keʔ    mu tʰɔ

In Bugan we see that the dual category is maintained, but the forms have been renewed analytically by combining bi̱ɔ̱³¹ ‘two’ with the plural forms (Table 35).

Tab. 35: Bugan pronouns (Li and Luo 2014).
          sg      du             pl
1 excl    ʔɔ³¹    wi³¹ bi̱ɔ̱³¹     pɛ³¹
1 incl    ʔɔ³¹    wi³¹ bi̱ɔ̱³¹     wi³¹
2         mɯ³¹    mi³¹ bi̱ɔ̱³¹     mi³¹
3         ʔi³¹    hɛ³¹ bi̱ɔ̱³¹     hɛ³¹

23.5.4.2 Reflexives and reciprocals
The most common strategy for reflexives and reciprocals is analytical, using special pronouns or combinations of pronouns. For example, Kammu Yuan has a reflexive tèː ‘self’ and reciprocal jɔ̀ʔ ~ jɔ̀ʔ tèː ‘each other’, which come at the end of the clause, e.g.:

Kammu Yuan (Svantesson et al. 2014: 7)
(74) kmmuʔ kràːn pʰaːn tèː
     man lazy kill refl
     ‘a lazy man kills himself’


Cognates of Kammu tèː ‘self’ with related meanings/functions are found throughout Khmuic and Palaungic. In Wa (Seng Mai 2012) the tiʔ morpheme combines with ʧao to mark reflexives and with pao̤ʔ to mark reciprocals:

Wa (Seng Mai 2012: 97)
(75) nɔh sɔm ʧao tiʔ
     3S eat.rice oneself refl
     ‘He ate by himself.’

Wa (Seng Mai 2012: 98)
(76) jɛ̤ʔ ai.ka pə tɔk pao̤ʔ tiʔ
     1dl.excl pn rel beat recp
     ‘Ai Kar and I beat each other.’

In Milne’s (1921) grammar of Palaung we find dē indicating his/himself:

Palaung (Milne 1921: 29, 82)
(77) a. ạ̄n k̔lă tō prīm dē
        3S cut body old refl
        ‘He cut himself.’
     b. ạ̄n rāt grū mā dē
        3S steal thing mother refl
        ‘He stole his mother’s things.’

Li and Luo (2014) indicate that Bugan has a reflexive pronoun form mbaŋ⁴⁴ mbi⁴⁴ ‘self, one’s own’, although no examples of usage are given.

There are also morphological strategies to signal reciprocity. Golden Palaung has a kʌr- verbal prefix, explained by Mak: “[t]he most productive way is to add reciprocity to a transitive verb to form an intransitive verb” (Mak 2012: 73). For example, rək ‘love’ > kʌr.rək ‘love each other’.

23.5.4.3 Demonstratives
NAA demonstratives are often simple systems that distinguish only two or three degrees of proximity, while some are more elaborated (see Table 36). The latter include Khmu, in which the more distal grades include specification for ‘higher’, ‘lower’ and ‘beside’ (same level as ego). The basic demonstratives may overlap with the third person personal forms, and in Wa and Palaung lects personal pronoun forms combine with demonstratives to form plurals ‘these, those’. Within the noun phrase demonstratives always fall at the right edge, with the order N-Adj-DEM observed in all languages examined for this chapter.




Tab. 36: Selected NAA demonstrative pronouns.
Columns: Khmu Cuang, Pray, Mlabri, Golden Palaung, Dara’ang, Danau, Wa, Eastern Lawa, Muak Sa’ak, U, Bolyu, Bugan
Rows: prox, medl, medl.higher, dist, dist.beside, medl.lower

giː naːj ʔnɨŋ hoʔ suʔ

tɛh toːn

gʌh

ntoːn ɲʌʔ

ʔɯ nan

ni

nì

ʔin hej hɔ

ni2

ní ni⁵⁵

ni³³

taj, din

dɔj

nɐ̄ ɔ

ʔan tʰɔ

jɛn2

nɛ̀ ndɯ⁵⁵ ki³³

23.5.5 Number and counting
Generally there is no plural category in the grammar of NAA languages, but special plural forms and plural morphemes are used variously with pronouns (see 23.5.4.1), and in some Palaungic languages with demonstratives and nouns in certain contexts. The Wa 3rd person dual and plural pronouns kɛʔ, kiʔ combine with demonstratives to create dual and plural forms: ʔinkiʔ ‘these’, ʔankiʔ ‘those’. Wa also has a special plural copular verb nɛ ‘exist many’, which allows for overt expression of plurality, although it is not used with classifiers/counting.

Wa (Seng Mai 2012: 35)
(78) nɛ kʰaoʔ dəuʔ pansan
     exist.many tree in NPROP
     ‘There are many trees in Pan San’

Similarly to Wa, Golden Palaung uses the dual and plural 3rd person pronoun forms gar, gɛ as dual and plural markers that can follow nouns to emphasize/clarify their number. This is described by Mak (2012), although without textual examples.

23.5.5.1 Numerals
All of the NAA languages have decimal counting systems with combining forms for higher numerals, typical of MSEA generally. The languages vary considerably in the extent to which they use indigenous numeral forms or have replaced these with loans, and the use of loans for numbers above ten is common. Tai (Lao, Lü, Shan, etc.) numeral forms in particular are borrowed frequently, and Muak Sa’ak, for example, has replaced all indigenous numerals with loans. In some cases, borrowed numerals are used every day, such as in market speech, while indigenous forms remain in rituals, songs, and other archaic genres. Numerals 1–10 for several NAA languages are given in Table 37.

Tab. 37: Selected NAA cardinal numerals. Shading indicates Tai loans.
            Pray    Kammu Yuan    Muak Sa’ak    Wa         Golden Palaung    Mang         Bugan
one         miː     mòːj          ʔak²          tiʔ        ʔu                măk⁶         bɔ⁵⁵
two         piaʔ    pàːr          sɔːŋ³         ɹa         ʔir               ʑɯəi²        biɔ³¹
three       pʰɛʔ    peʔ           saːm³         lwe        ʔwij              pe³          mʦe³¹
four        ɕiː     siː           siː¹          pon        pʰon              pun²         pɑu³³
five        haː     haː           haː²          pʰwan      pʰən              hăn²         mi³³
six         hok     rok           rɔk²          lia̤h       tɔr               ʑɔ̆m²         pi̱o̱³³
seven       cet     cet           cɛt²          ʔəlia̤h     pur               tăm¹ py³     po̱u̱³¹
eight       pɛːt    pɛːt          piat¹         sədaiʔ     ti                tăm¹ ham²    sɑ̃³³
nine        kaw     kaw           kaːw²         dim        tim               tăm¹ θin²    ɕi³³
ten         ɕip     sɨp           sip²          kau        ʔukɤr             ʑi³ mɛ⁴      mɑ̃³¹
hundred     rɔːj    rɔ̀ːj          ruaj²         jɛ̤h        pʌr.jih           ran⁵~ʑan⁵    ʑu³¹
thousand    pʰan    mɨːn          pan³          ɹeiŋ       hrɛŋ              păn⁵         thiaŋ¹³

Ordinals are hardly discussed in the literature on NAA languages, but some observations can be made. Golden Palaung (Mak 2012: 78) ordinals are formed with ɟuh ‘begin to’, thus ɟuh ʔu ‘first’, ɟuh ʔar ‘second’, etc., and similarly in Kammu Yuan with tʰiː (< Lao tʰíː ‘occasion, rank’, ordinal marker). In Wa (Seng Mai 2012: 55) the ordinal is formed by inverting the order of number and classifier, e.g. bɔg loe (clf three) ‘the third time’.

23.5.5.2 Numeral classifiers
The use of numeral classifiers is normal in NAA languages, with the general word order in the classifier phrase N-NUM-CLF, as in the following example from Golden Palaung.

Golden Palaung (Mak 2012: 93)
(79) ʔʌn ʔu ku bʌp de rʌr kin ʔir nɛ
     3S one clf involuntary SELF do work two clf
     ‘He must work two jobs by himself’

Individual languages vary in terms of when a classifier is obligatory; for example, Malapol (1989) advises that in Pray the numeral is optional if the quantity is ‘one’. For Kammu Yuan, Svantesson et al. (2014) state that classifiers must be used with numerals, and more than 40 classifiers are listed, with some nouns functioning as their own classifiers (e.g. kuŋ pàːr kuŋ ‘two villages’). For Pray, Malapol (1989) notes that in the case of compound nouns, the first element of the compound functions as the classifier, e.g.:

Pray (Malapol 1989: 56)
(80) ɕwaʔ miː ntʰal-lɔl miː ntʰal
     dog have tail-bottom one clf
     ‘a dog has one tail’

23.5.6 Case and adpositions
Throughout NAA languages case is not marked morphologically; relations between core arguments are typically indicated only by word order and semantics, while prepositions are used to mark oblique arguments, encoding categories such as location, source, goal, instrument, etc. The languages vary widely in the richness of their prepositions and the work done by prepositional phrases, but the sources also vary in their descriptive and analytical frameworks, such that it is problematic to identify and compare prepositions, and some license must be taken to interpret the literature. For example, movement towards or away from a location/actor may be indicated by adding a verb ‘to arrive’ or ‘to depart’ after the verb of motion, and it can be arguable whether the second verb is a coverb within the verb complex or a preposition within a prepositional phrase. In any case, it is clear that some NAA prepositions do derive from verbs in this manner; borrowing is another source of prepositional forms. Kammu Yuan seems to have few prepositions, with only two described (Svantesson and Holmer 2014), and this is not a rare case in AA more widely.³

It is also common for NAA languages to have a general preposition with a broad range of meanings which can be interchanged with other, more specific functors, and in the published descriptions there is no consistent way of glossing these. Seng Mai (2012) describes more than a dozen prepositions for Wa (including some forms that also function as conjunctions or adverbs). These cover a wide range of locational and directional meanings, yet most of them can be substituted with ka, which is characterized as an applicative (appl) as it can also introduce a prepositional phrase. Compare:

Wa (Seng Mai 2012: 150)
(81) mwɛ tɔk lig ka sədaʔ tiʔ
     cow beat pig appl tail poss
     ‘The cow hit the pig with his tail.’

³ Consider that Nicobarese languages have only two or three prepositions (Sidwell 2020).


Wa (Seng Mai 2012: 22)
(82) hoik hu ai.kʰun ka dəʔ kəŋ nuʔ
     compl go n appl in field past.near
     ‘Ai Khun went to his field already.’

Both Golden Palaung and Dara’ang Palaung are described as having six prepositions (Mak 2012 calls them “referential nouns”) which have largely coincident functions. Those prepositions have very different forms except for one, which reflects a proto-AA linker *ta. The Palaung reflexes of *ta are highly polyfunctional, including directional, locative, recipient, and beneficiary meanings, and reflexes with similar polyfunctionality are found through NAA languages (see Table 38).

Tab. 38: NAA reflexes of proto-AA linker *ta.
Language            Form and gloss                 Note
Golden Palaung      ta dir ‘into, on, to’          highly polyfunctional
Dara’ang Palaung    dɨ goal ‘at, to, toward’       highly polyfunctional
Plang               ta ‘by, to’                    instrumental, beneficiary
Kammu Yuan          tàː ‘in/on/at/by/to/from’      highly polyfunctional
Pray                tak ‘at, on’                   primarily locative
Bugan               tʰɛ³¹ ‘to, until’              temporal
Bolyu               tɔ² ‘by’                       passive

In addition to the use of ta in Golden Palaung as an oblique preposition (generally marked as a directional by Mak 2012), there are sentences in which it appears to be marking patients in main clauses and relative clauses (examples [83], [84]).

Golden Palaung (Mak 2012: 78)
(83) gɛ mʌn dʌʔ nwər̥ hi pɛt ta kiɲɔm gɛ
     3p oneself remain worry forsake dir child/youth 3p
     ‘They themselves worried the children.’

Golden Palaung (Mak 2012: 132)
(84) ʔɔ ka nəp ha ʔun ʔɔ ta ʔʌn
     1s neg know place keep 1s dir 3s
     ‘I can’t remember where I keep it.’

23.5.7 Final particles
Sentence particles occupy a peripheral – typically final – position. Final particles can be hard to categorize, and grammars and sketches vary in the extent to which they attempt to document them. Watkins’ (2019: 466) description of Wa lists some 14 final particles, which are categorized variously as communicating emphasis, supposition, suggestion, confirmation, and declaration, and this would appear to be fairly representative of NAA; some examples follow.

Wa (Watkins 2019: 466)
(85) a. jʰak hɤi
        look emph
        ‘Look!’
     b. kɛ̤t saɯʔ ŋɛ̤h
        very hurt emph
        ‘It really hurts!’

Pray (Malapol 1989: 156)
(86) a. mah ʔaŋ waʔ ʔəʔ
        2s do q fp
        ‘What are you doing?’
     b. ʔaw, ʔiː pʰɔk ʔəɲ teh
        father, 3s bite 1s fp
        ‘Father, it bit me!’

Kammu Yuan (Svantesson and Holmer 2014: 973)
(87) pàː jɔ̀h sɔk pàː tèʔ klèʔ ʔəːm
     2s.f go seek 2s.f get husband hort
     ‘Go find yourself a husband.’

Bugan (Li and Luo 2014: 1060)
(88) a. wi³¹ na⁵⁵ la0
        1p.inc go imp
        ‘Let’s go.’
     b. mɯ³¹ na⁵⁵ ha⁵ no³¹
        2s go first imp
        ‘You go first.’

23.6 Conclusion
This typological review of NAA has focused primarily on five languages: Palaung, Wa, Khmu/Kammu, Pray, and Bugan. This allows us to compare across a comprehensive set of linguistic categories, and reveals a linguistic ecology of correlations between levels of linguistic structure and areal context. Archaism in phonology and morphology, especially evident in Khmu/Kammu, favors complexity in sesquisyllables and rich use of prefixing and infixing in derivation of both prosaic and expressive lexicon. Other NAA languages, particularly the smaller ones under strong influence of monosyllabic and highly configurational but unrelated neighbors, have restructured radically to converge on their neighbors, developing complex contour tone systems and calquing common syntactic patterns. Mang and Pakanic are very much of the latter type, as are some Palaungic languages such as U, Bumang, and Khang.

However, while the broad patterns of correlation are there, they do not manifest as a uniform cline. A striking example is Wa, in which the lexicon has reduced almost completely to monosyllables, with only a single productive prefix remaining. While this suggests strong areal convergence with Southwestern Mandarin and other languages of Shan State and Yunnan where Wa is spoken, verb-initial word order is strongly reflected in subordinate clauses and variably in matrix clauses. This runs counter to the otherwise strong areal preference for verb-medial order, hinting potentially at a syntactic relic.

Notwithstanding the NAA preference for verb-medial word order, variation in word order occurs for pragmatic and stylistic reasons, particularly fronting of topicalized arguments, consistent with the preference for topic-comment structure. Additionally, elision of known or retrievable arguments is noted, although it is not strongly exemplified in the documentation.

Complex sentences are composed of multiple clauses that are joined or embedded with or without linkers. While Khmu, for example, shows upwards of a dozen conjunctions, it is not difficult to find examples of clauses joined asyndetically, and this pattern is found more or less throughout NAA. It is difficult to know how to interpret this, but it may be that to some extent the use of clause linkers is an artifact of elicitation and/or contact, and may not have been strong historically.

Grammatical roles of core arguments are primarily signaled by word order and lexical semantics, and pragmatic reordering may be accompanied by additional cues such as characteristic intonation, pauses, and/or particles. Oblique arguments are generally marked with prepositions, and it is common for NAA languages to possess both a rich inventory of prepositions and a general preposition that can be used across a wide range of meanings, with contextual and lexical-semantic cues permitting information recovery.

NAA pronouns often show singular, dual, and plural forms, at least in the first and second person, and an inclusive versus exclusive distinction in the first person. In the case of Khmu/Kammu there are also gender distinctions in second and third person, and Pray has a ±human distinction in the third person singular, although otherwise NAA languages do not tend to distinguish gender or animacy in personal pronouns.

Demonstratives often show only two degrees of distance, some distinguish three, and Khmu/Kammu stands out for additionally indexing things above, below, or on the same level as ego, apparently reflecting the hillside swiddening lifestyle of the people.

Counting in NAA is universally with decimal numerals, and some of the languages (such as Muak Sa’ak) have replaced some or all of their indigenous numerals with borrowed forms. Following the areal pattern, numeral classifiers are common, and more than 40 different classifiers have been recorded for individual languages. The order within the classifier phrase is apparently always N-NUM-CLF; lexically, some nouns can be their own classifier and some are used without classifiers.

Throughout NAA, final particles expressing speaker attitude and/or expectation are commonly used, and their appropriate use is an index of one’s proficiency in everyday use of language. Particles are not easy to define, and one could take a broad view and include many adverbials and functors such as negatives and interrogatives among them. The use of final particles is areally typical and their forms diverse, although they are most commonly open or closed monosyllables without onset clusters.





Paul Sidwell

24 Eastern Austroasiatic languages

24.1 Introduction

The Eastern Austroasiatic (EAA) languages are defined here as the Monic, Pearic, Khmeric, Bahnaric, and Katuic branches. EAA is not a phylogenetic group but a geographical area that roughly corresponds to the historical reaches of Angkor, Champa, and Sukhothai. As shown in Map 1, EAA languages are spoken on the Indo-Chinese peninsula south of a line drawn roughly from Yangon in Myanmar to Quáng Đông in Vietnam, and stop before the Isthmus of Kra. Their historical center of gravity is the lower Mekong region, with the Mon migrating from the Chao Phraya valley through mountain passes to the Bay of Bengal from sometime in the 1st millennium CE (Indrawooth 2011). The Vietic languages, especially the central and southern forms of Vietnamese, are also spoken in this region, but as colonists who spread southward throughout the 2nd millennium CE, they have a very different language history (Vietic typology is discussed separately by Alves, chapter 22 this volume).

Map 1: Approximate distribution of Austroasiatic branches discussed in this chapter.

https://doi.org/10.1515/9783110558142-021

The approach taken in this chapter is straightforwardly synchronic with only marginal reference to historical linguistic explanation for similarities and differences. In very broad terms, one can characterize EAA languages as strongly favoring several structural traits in phonology and morpho-syntax:

– iambic word structure and speech rhythm;
– monophthong inventories with three degrees of height and backness;
– verb-medial word order;
– configurational morphosyntax with rather limited affixal morphology.

These tendencies are broadly consistent with the rest of the Mainland Southeast Asian Linguistic Area (MSEA, see “Introduction”, this volume). The EAA languages ultimately share a common ancestry in proto-AA, and thus share many linguistic features by inheritance, although this belongs to a very deep past, and also encompasses their divergence into distinct branches. At the same time, it is clear that many convergent developments in the last two millennia occurred due to mutual contacts, and especially influence by Thai and/or Lao in more recent centuries. To a lesser extent there is some Chinese and Austronesian (Malay, Chamic) influence in EAA, although this is essentially restricted to lexical borrowing and has not apparently conditioned notable structural changes.

24.2 Phonology

The EAA languages are remarkable in the extent to which they share phonotactic structures and segmental inventories. Generally they follow an iambic sesquisyllabic word template (see Thomas 1992, Pittayaporn 2015 on sesquisyllabicity), with a rich set of main syllable onsets that frequently includes voiced implosives, devoiced and/or preglottalized sonorants, and large vowel inventories that include contrastive length and diphthonged nuclei in closed syllables. In terms of suprasegmental phonology, contour tones are rare, but we do find phonological contrasts of creakiness, breathiness, and modal voice in combinations that yield up to four distinct voice types or “registers”. These registers may be associated with distinct pitch contours, and in some cases have been described as tonal systems or may be in transition to contour tone systems.




24.2.1 Word and syllable structure

Throughout EAA the phonological word is basically monosyllabic or disyllabic (sesquisyllabic), the latter being highly restricted in structural possibilities. Longer forms may arise by prefixation (e.g. Bahnar ɟəgənaːm ‘to have threatened’) but these are uncommon. Both monosyllables and sesquisyllables may be historical roots, although frequently sesquisyllables have been created by affixation of monosyllables. Additionally, due to compounding and morphological derivation (which can include reduplication) and lexical borrowing, a proportion of polysyllabic words are tolerated. Specific phonotactic rules governing sesquisyllables vary from language to language, but there is a consistent pattern that suggests an approximation of a historical prototype, which is exemplified below with Khmer. Following Huffman (1967), Jenner and Sidwell (2010) and others, one can analyze the phonological word from Old to Modern Khmer as follows:

– Monosyllables: canonically C(R)V(C) where R is liquid or glide, and a coda is optional, e.g. lɔː ‘to test/try’, dam ‘to plant’, slək ‘leaf’.
– Subdisyllables: words with initial consonant sequences that tend to be pronounced with a transitional vowel (voiced or as a voiceless [h]) and may be ambiguous as to whether they take one or two beats in the speech rhythm, e.g. tʰmɑː ‘stone’, rɔteh ‘cart’.
– Disyllables: these take presyllables of the shapes CRV- and CVN-, e.g. crɑmoh ‘nose’, ʔɑndaət ‘tongue’, and are unambiguously pronounced as disyllabic iambs.

Specifically, in Khmer the presyllable vowel varies between ɑ~ɔ depending on the register series, while in most of EAA presyllable vowels are neutral (merely prosodic), yet in Katuic there is a tendency for u and i in some presyllables, although their phonological status is not always clear. We see Khmer-like sesquisyllables in Pearic and Katuic, and it is evident that codas within presyllables may reflect nasal or rhotic infixes. For example, in Ta’oi (Luang-Thongkum 2001) presyllables show a range of phonological forms, e.g. har.mianʔ ‘turmeric’, haŋ.ʔaːm ‘yawn’, tra.ŋɯl ‘stump’, taɟ.rɯm ‘to wrestle’, taŋ.kɔːj ‘horn’, etc. Yet in Katu (An Diem variety described by Costello 1971) the only presyllables documented are simple consonants h-, p-, b-, c-, ɟ-, t-, d-, k-, g-, l-, m-, r-, s-, j-, w-, ʔ- with transitional vowels (mostly written a but lacking phonological value), apparently indicating that infixation has lost productivity. In colloquial Mon presyllables are limited to only k-, t-, p-, m-, ʔ-, h- with transitional ə.¹ Furthermore, in some spoken southern varieties, the historical stop presyllables are all reduced to simple hə- or ʔə-, so that there are only three possibilities, e.g. həket ‘red’, həmic ‘mosquito’, həplṳʔ ‘betel leaf’, ʔərao ‘six’ (in northern varieties kərao), məcum ‘pair’, məki̤ ‘woman’. At the extreme end of presyllable reduction Nyaheun (West Bahnaric) has restructured all phonological words to C(R)V(C) monosyllables, eliminating all sesquisyllables.

¹ Other presyllable onsets such as s-, c-, cʰ-, etc. occur in a few words, often loanwords.
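The Khmer word template described above lends itself to a small illustration. The Python sketch below is only a toy under stated assumptions: words are supplied as hand-coded, dot-separated skeletons (C = consonant, R = liquid or glide, N = nasal, V = nucleus, with length and diphthongs counted as a single V), the main syllable of a disyllable is assumed to follow the same C(R)V(C) template as a monosyllable, and the function name classify_skeleton is hypothetical. Subdisyllables are deliberately left unclassified, since their syllable count is described as ambiguous.

```python
import re

# Assumed skeleton notation: C = consonant, R = liquid or glide,
# N = nasal, V = nucleus (long vowels and diphthongs count as one V).
MAIN_SYLLABLE = re.compile(r"CR?V[CRN]?")   # C(R)V(C), any consonant type as coda
PRESYLLABLE = re.compile(r"CRV|CVN")        # the two presyllable shapes CRV- and CVN-


def classify_skeleton(skeleton: str) -> str:
    """Classify a dot-separated skeleton such as 'CRV.CVC' (hypothetical helper)."""
    parts = skeleton.split(".")
    if len(parts) == 1 and MAIN_SYLLABLE.fullmatch(parts[0]):
        return "monosyllable"
    if (len(parts) == 2
            and PRESYLLABLE.fullmatch(parts[0])
            and MAIN_SYLLABLE.fullmatch(parts[1])):
        return "disyllable"
    # Subdisyllables and anything else are not decided by this toy.
    return "unclassified"
```

On this coding, slək ‘leaf’ (CRVC) comes out as a monosyllable, while crɑmoh ‘nose’ (CRV.CVC) and ʔɑndaət ‘tongue’ (CVN.CVC) come out as disyllables.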

24.2.2 Phoneme inventories and phonotactics

24.2.2.1 Consonants

The consonant inventories of EAA languages follow an archetypal pattern, with variation mostly reflecting mergers and shifts in manners of articulation. The basic pattern finds the maximal inventory in main syllable onsets, with a full series of voiced and voiceless stops plus two (or occasionally three) implosives; sonorant onsets are frequent, and may include devoiced or preglottalized members. Typically there are only two onset fricatives, both voiceless, one marked for place (usually recorded as s but tends to be laminal), and one unmarked for place (usually written h). In main syllable codas the inventory typically corresponds to the onsets but lacking any contrasts of voicing or other glottal settings, and stopped codas are characteristically unreleased. An example prototypical EAA consonant inventory is reflected in Katu (An Diem dialect, Costello 1971). While Katu appears to have a series of aspirated stops, these are relatively infrequent in the lexicon, and are largely restricted to Lao borrowings. A marked feature of the Katu consonants is the palatal implosive ʄ; it is relatively common and occurs in all kinds of lexicon (as opposed to, for example, in Bahnar where it is restricted to personal names and expressive lexicon), otherwise it is rare in EAA languages. See Table 1.

Tab. 1: Katu consonants (Costello 1971: n.p.).

Main syllable onsets
  p  pʰ  b  ɓ  m  w
  t  tʰ  d  ɗ  n  s  l  r
  ʧ  ɟ  ʄ~ʔj  ɲ  j
  k  kʰ  g  ŋ
  ʔ  h

Main syllable codas
  p  m  w
  t  n  l  r
  c  ɲ  ç  j
  k  ŋ
  ʔ  h

Maximal complexity in the consonant inventories is found in languages that make use of devoiced and preglottalized sonorants, such as North Bahnaric languages like Sedang, Rengao, Bahnar, Jeh, Halang, etc. Among published sources, which are often grounded in phonemic theory, such complex onsets have often been treated as clusters, but such analyses cannot be sustained because the supposed clusters cannot be split by infixes, and thus are unit segments at an underlying level. Table 2 lists the inventory of onsets for Sedang, which is essentially the maximal inventory of onsets found among EAA.

Tab. 2: Sedang consonants (Smith and Sidwell 2014: 792).

Main syllable onsets
  p  b  ɓ  m  ˀm  m̥  w  ˀw  w̥
  t  d  ɗ  n  ˀn  n̥  s  ʂ  l  r  ˀl  ˀr  l̥  r̥
  c  ɟ  ʄ  ɲ  ˀɲ  ɲ̊  j  ˀj  j̊
  k  g  ŋ  ˀŋ  ŋ̊
  ʔ  h

Main syllable codas
  p  m  w
  t  n  l  r
  c  ɲ  j
  k  ŋ
  ʔ  h

Many EAA languages have velar and palatal gaps in the voiced stop series, leaving only b and d, which may alternate with ɓ, ɗ. The same languages frequently have a strong aspiration contrast, and it is apparent that Thai/Lao borrowings play a role in introducing and reinforcing aspiration as a phonological contrast. Such a pattern is strongly represented in the Pearic languages, such as Chong, whose western dialects are apparently becoming tonal under Thai influence, assuming they do not become extinct due to language shift. Strikingly, b and d onsets in Chong derive primarily from two sources: Thai and Khmer loans, and stops that emerged from earlier *ʔm, *ʔn clusters (compare Chong boːt ‘younger sibling’, dak ‘he’, with Samre mɔːt ‘younger sibling’, nak ‘he/she’). See Table 3.

Tab. 3: Chong consonants (Premsrirat and Rojanakul 2014: 606).

Main syllable onsets
  p  pʰ  b  m  (f)  w
  t  tʰ  d  n  s  l  r
  c  cʰ  ɲ  j
  k  kʰ  ŋ
  ʔ  h

Main syllable codas
  p  m  w
  t  n  s  l  r
  c  ɲ  j
  k  ŋ
  ʔ  h


Khmer is somewhat odd in that it has the velar-palatal gap yet lacks a robust aspirated stop series. Contrastive aspiration is found in loan vocabulary (especially Pali loans) and phonetic aspirates occur in onset clusters; additionally there are Ch- onsets in a modest amount of native vocabulary (e.g. khae ‘moon’, phuːm ‘village’, thom ‘large’) but in the latter case these onsets can be split by infixation, so they are clearly phonological clusters. Consequently, Khmer can be regarded as lacking an aspirated series in its native phonology. See Table 4.

Tab. 4: Khmer consonants (bracketed segments in loan words only) (Bisang 2014: 678).

Main syllable onsets
  p  b  m  (f)  v
  t  d  n  s  l  r
  c  ɲ  j
  k  ŋ
  ʔ  h

Main syllable codas
  p  m  w
  t  n  l
  c  ɲ  j
  k  ŋ
  ʔ  h

Word-finally no EAA language tolerates a voicing contrast, so the maximal set of main syllable codas consists of one series of stops plus the full set of continuants. Some languages, such as Halang and Jeh (North Bahnaric), have eliminated the palatal place of articulation in codas, merging them with the apicals and velars, but this is uncommon and palatal codas are robustly found throughout EAA. Stop codas are typically unreleased, having coincident glottal closure (i.e. /-p/ = [-pˀ] ~ [-p͡ʔ]) although this is rarely indicated in descriptive sources. One consequence of the coincident glottal closure is that the coda stops can split phonologically, conditioned by the emergence of a glottalized or tense register. We see this, for example, in Katuic languages such as Ong and Katang, which have post-stopped sonorants (written nʔ~nt, mʔ~mp etc.) in addition to regular stop codas and plain nasal codas; this phenomenon is discussed separately at 24.2.3. Most EAA languages contrast two fricative codas, a placeless h which is realized as a devoiced vowel at the end of the syllable, and a fricative marked for place, which varies between an alveolar and a palatal contextually. The transcription of this coda segment has been a matter of some confusion; it has variously been written s, x, ɕ, ç, ih, yh, ẙ; these attempts to represent the same segment are often normalized to s. To the unaccustomed ear the most acoustically salient aspect of this coda is the vocalic transition that approximates e~ɪ, motivating the ih~yh transcriptions. Some languages, such as Khmer and Mon, have merged these fricatives to h, others such as Sedang have merged the place-marked fricative to j, and some have an allophonic alternation between ç after front vowels and h elsewhere. There is also an occasional tendency to merge codas r and l; for example Samre (Pearic) merges these to r generally. On the other hand, Khmer has dropped coda r completely while maintaining coda l robustly. Mon has dropped both -r and -l, merging them with -w before losing the coda completely. The influence of Lao on West Bahnaric lects is especially strong, and this writer has noted a tendency for speakers to merge codas r and l to n following the Lao pattern.

24.2.2.2 Vowels

EAA languages are much more diverse in their vowel inventories than in their consonants, perhaps evidencing the largest vowel inventories among the world’s languages. The canonical EAA main syllable vowel inventory is typically Southeast Asian, with three degrees of openness and backness, a distinction between short and long nuclei, and there may be one or more diphthonged nuclei. At this basic level they approximate the Thai and Lao inventories, such as in Old Khmer (Table 5) or Chong (Table 6).

Tab. 5: Old Khmer vowels (Sidwell 2014: 649).

  ɪ        ʊ        iː    ɯː    uː
  e    ɤ   o        eː    ɤː    oː
  ɛ    a   ɔ        ɛː    aː    ɔː
                    iːə         uːə

Tab. 6: Chong vowels (Luang-Thongkum 1991: 144).

  i    ɯ   u        iː    ɯː    uː
  e    ɤ   o        eː    ɤː    oː
  ɛ    a   ɔ        ɛː    aː    ɔː
                    iə    ɯə    uə

Several phonological processes can act to vary the number of vowel contrasts. One of these is neutralization of the length contrast; either by merger of long and short nuclei, or long nuclei may diphthongize so that only a contrast of timbre remains. An example of the latter is seen in Sedang with eight monophthongs and 11 diphthongs, essentially reflecting the earlier short:long distinction (Table 7).

Tab. 7: Sedang vowels (Smith 1979: 33–35).

  i        u        …    …    io
  e    ə   o        …    …    eo
  ɛ    a   ɔ        …    …    uo
                    iɪ   iə


A similar vowel inventory, lacking a length distinction but marked by a rough balance between monophthongs and diphthongs, is attested for modern spoken Mon, reflecting a complex combination of mergers and diphthongizations (Table 8).

Tab. 8: Modern Mon vowels (Jenny 2014: 558).

  i    ɤ   u        iə         uə
  e    ə   o        eə         oə
  ɛ    a   ɔ        ɛə         ɔə
       a   ɒ                   ɒə   ao

Northern (or Surin) Khmer, somewhat unusually, has a large inventory of monophthongs, with five degrees of openness in addition to the three diphthongs also found in Standard Khmer (Table 9).

Tab. 9: Northern Khmer vowels (Chantrupanth and Phromjakgarin 1978: xii).

  i    ɯ   u        …     ɯː    …
  ɪ    ɤ   ʊ        ɪː    ɤː    ʊː
  e    ə   o        …     ɤː    …
  ɛ    ʌ   ɔ        ɛː    ʌː    ɔː
  a        ɑ        …           ɑː

  ia   ɯa  ua

24.2.3 Suprasegmentals: registers/phonation types/nasalization

Many EAA languages have phonologically relevant register distinctions in the form of breathy and/or creaky phonation, potentially more than doubling the number of contrastive nuclei. Additionally, although less commonly, nasalization can also be contrastive. Lexical tone, realized as contrasting pitch contours, does also exist in EAA, although not independently from register, e.g. in the Pearic languages of Thailand such as Kasong (Thongkham 2003) and Samre (Ploykaew 2001). Contrastive registers of this type are relatively uncommon worldwide, yet are prominent in the AA languages of MSEA, correlating with the tendencies inherent in sesquisyllabic iambs: with more complex onsets there is more opportunity for asymmetries in glottal tension over the duration of syllable articulation. These can become perceptually salient at the expense of other features and consequently phonologize (see, for example, Wayland and Jongman [2002], Thurgood [2002], for articulatory accounts of registrogenesis).




Across EAA, register systems have been identified in all branches (Monic, Khmer, Pearic, Katuic, Bahnaric) although they vary in character and origins. The best understood pattern is the splitting of nuclei into higher and lower register variants, according to the voicing status of the syllable onset. This was elegantly analyzed by Huffman (1985a) and is recognized as an important principle explaining the diversity of EAA vowel inventories. Vowels in the first register (following voiceless onsets) tend to have modal or even creaky phonation, while vowels in the second register (following voiced onsets) tend to have breathy phonation. Additionally, there is often an effect of lowering or raising vowel onsets differently in each register, which can split the vocalic system, multiplying the number of distinct nuclei. In all EAA groups there are languages which have undergone this kind of onset-conditioned register formation, the best known being the breathy registers of Mon and Khmer. Additionally, North Bahnaric languages have breathy registers, but oddly did not undergo devoicing, and the explanation for this is still obscure. A good example of a large inventory of syllable nuclei is the Katuic language Bru (as described in Luang-Thongkum 1979). Including contrasts of timbre, phonation, and nasalization, some 70 distinct nuclei can be counted, although not all theoretically possible combinations are observed and some occur only very infrequently (Table 10).

Tab. 10: Bru vowels and registers (Luang-Thongkum 1979: 229–230).

modal
  short:       i  ɯ  u   e  ɤ  o   ɛ  ʌ  ɔ   a  ɒ
  long:        ɯː   ɤː   ɛː  ʌː  ɔː   ɒː   […]

breathy
  short:       ɯ̤   ɤ̤   ɛ̤  ʌ̤  ɔ̤   ɒ̤   […]
  long:        i̤ː  ɯ̤ː  ṳː   e̤ː  ɤ̤ː  o̤ː   ɛ̤ː  ɔ̤ː   a̤ː  ɒ̤ː
  diphthongs:  i̤ə  i̤a  ɯ̤ə  ṳə  ṳa

nasalized
  short:       ɯ̃  ɛ̃  ã  ɔ̃  ɒ̃
  long:        ĩː  ũː  ɛ̃ː  ʌ̃ː  ɔ̃ː  ãː  ɒ̃ː   […]
  diphthongs:  ṳ̃ə  ṳ̃a
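The onset-conditioned register split described before Table 10 can be illustrated with a toy function. This is a minimal Python sketch, not a model of any particular language: the segment sets are illustrative rather than a full inventory, the treatment of plain sonorants as patterning with voiced stops is an assumption, the devoicing of voiced stops is assumed to accompany the split (the text notes North Bahnaric as an exception), and the name split_register is hypothetical.

```python
from typing import Tuple

# Illustrative toy segment classes (assumptions, not a full EAA inventory).
VOICELESS_ONSETS = {"p", "t", "c", "k", "ʔ", "s", "h"}
DEVOICING = {"b": "p", "d": "t", "ɟ": "c", "g": "k"}   # voiced stop -> voiceless stop
SONORANT_ONSETS = {"m", "n", "ɲ", "ŋ", "l", "r", "w", "j"}


def split_register(onset: str, rhyme: str) -> Tuple[str, str, str]:
    """Return (onset, rhyme, register) after a toy onset-conditioned split."""
    if onset in VOICELESS_ONSETS:
        # Voiceless onset: first register (modal or even creaky phonation).
        return onset, rhyme, "first"
    if onset in DEVOICING:
        # Voiced stop: second (breathy) register; the stop itself devoices,
        # so the old voicing contrast now lives on the vowel.
        return DEVOICING[onset], rhyme, "second"
    if onset in SONORANT_ONSETS:
        # Assumption: plain sonorants count as voiced and stay unchanged.
        return onset, rhyme, "second"
    raise ValueError(f"onset not covered by this toy sketch: {onset!r}")
```

Under these assumptions two hypothetical syllables that once differed only in onset voicing, such as paː and baː, end up with the same onset and differ only in register.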

Also within Katuic, various languages of the Ta’oi sub-group in Laos, such as Ong (described by Ferlus 1974) and closely related Ir (Diffloth 1989), have post-glottalized sonorant codas mˀ, nˀ, jˀ in addition to regular stop and continuant codas. We know that the post-glottalized codas correspond to oral stops in other AA languages (compare Ong kamˀ with Katu kap ‘to bite’; Ong kahuajˀ with Katu kahuac ‘to whistle’), and that rhymes with sonorant codas also occur with creaky vowels (compare Ong pua̰n with Katu puan ‘four’; Ong tkɔ̰ːl with Katu takɑːl ‘eight’). Thus we can treat the creaky vowels as reflecting a tense register in opposition to modal rhymes with simple codas (Table 11).


Tab. 11: Ong registers across rhyme types (Ferlus 1974: 118).

modal    Vp    Vt    Vc    Vk   V    Vs    Vh    Vm    Vn    Vɲ    Vŋ    Vw    Vr    Vl    Vj
creaky   Vmˀ   Vnˀ   Vjˀ   Vˀ   V̰    V̰s    V̰h    V̰m    V̰n    V̰ɲ    V̰ŋ    V̰w    V̰r    V̰l    V̰j
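The pairing of modal and creaky rhymes in Table 11 amounts to a simple mapping, which the following Python sketch spells out. It operates only on the schematic rhyme labels used in the table (a literal "V" plus an optional coda), not on real Ong word forms, and the name creaky_counterpart is hypothetical.

```python
GLOTTAL = "\u02c0"   # modifier letter glottal stop, as written after m, n, j in Table 11
CREAKY = "\u0330"    # combining tilde below, marking creaky voice on the vowel

# Modal stop codas and their creaky-register counterparts, read off Table 11.
STOP_CODAS = {
    "p": "m" + GLOTTAL,
    "t": "n" + GLOTTAL,
    "c": "j" + GLOTTAL,
    "k": GLOTTAL,
}


def creaky_counterpart(modal_rhyme: str) -> str:
    """Map a schematic modal rhyme ('Vp', 'Vn', 'V', ...) to its creaky counterpart."""
    if not modal_rhyme.startswith("V"):
        raise ValueError("expected a schematic rhyme beginning with 'V'")
    coda = modal_rhyme[1:]
    if coda in STOP_CODAS:
        # Stop codas surface as post-glottalized sonorants (or bare glottalization for k).
        return "V" + STOP_CODAS[coda]
    # Open syllables and continuant codas keep the coda; the vowel is creaky.
    return "V" + CREAKY + coda
```

The two branches correspond to the two halves of the table: the first four columns, where the stop coda itself changes, and the remaining columns, where only the vowel phonation differs.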

Creaky phonation tends to occur towards the right edge of syllables, while breathy voice tends to fall on the left edge of syllables. This opens the opportunity for languages to combine breathy and creaky contrasts to multiply registers, and such systems do occur in Katuic (although they are not well documented) and dominate in Pearic, the latter being well described. Within Pearic this phenomenon has been especially well studied among the Chong lects spoken in Thailand and Western Cambodia (e.g. Huffman 1985b; Luang-Thongkum 1991; Ungsitipoonporn 2001). The findings indicate that up to four registers are utilized, and each has distinct correlates in terms of pitch contours and vowel timbre, and this also leads to considerable notational variation in the descriptive literature (Table 12).

Tab. 12: Chong registers according to Ungsitipoonporn (2001).

R1 [CVC]     clear-modal         mid-level and high rising pitch    more open or ongliding vowel
R2 [CVˀC]    clear-creaky        high-rising-falling pitch          more open vowels; only closed syllables
R3 [CV̤C]     breathy/murmured    low falling pitch                  vowels higher than R1
R4 [CV̤ˀC]    breathy-creaky      high falling pitch                 raised vowel

In summary, we can say that EAA languages display typically Austroasiatic phonological structures (iambic sesquisyllables and large monophthong inventories), plus a range of complicating variation that often relates to neutralization of length contrast and the emergence of breathy and creaky registers. Language contact, especially lexical borrowing, has added new sounds or contrasts and/or reinforced marginal contrasts on top, further contributing to complexity.




24.3 Word formation

24.3.1 Compounding

Various types of compounds are found in EAA languages, essentially as a word-formation strategy. The basic pattern is that two (or more) open class lexemes are combined to function as a unit, distinguishable from phrases by stress pattern and semantics. Compounds that juxtapose two nouns often show a generalized meaning reflected by their parts, and can be highly conventionalized. These include, for example, kin-terms or words for categories of goods or animals (Table 13).

Tab. 13: Noun-noun compounds.

                              Literal gloss        Translation
Khmer     ʔoːpùk-mdɐːj        ‘father-mother’      ‘parents’
Kơho      mɛʔ-baːp            ‘mother-father’      ‘parents’
Mon       mìʔ-mɛ̀ʔ             ‘mother-father’      ‘parents’
Pacoh     ʔaʔi-ʔaʔam          ‘mother-father’      ‘parents’
Sedang    now-pa              ‘mother-father’      ‘parents’
Samre     cʰanɯnA-kluəŋB      ‘wife-husband’       ‘married couple’
Samre     kluəŋB-cʰanɯnA      ‘husband-wife’       ‘married couple’

Note that in Samre the historical order was ‘wife-husband’ but under Thai influence the order ‘husband-wife’ is now reported as acceptable (Ploykaew 2001: 116).

Another type of noun-noun compound is endocentric: one part modifies another, typically the first element is the head; such modifying compounds are quite common for the area (Table 14). We can also regard specifying compounds (in which the second element restricts the meaning of the first) as a sub-type of endocentric compounds, since it is not always clear that these can be usefully distinguished.

Tab. 14: Endocentric compounds.

                           Literal gloss                         Translation
Chong     nāːj-tɔŋ         ‘owner-house’                         ‘head of the family’
Khmer     nɛək-srɐe        ‘man-rice.field’                      ‘farmer’
Khmer     nɛək-dɑmnaə      ‘man-trip’                            ‘traveler’
Katu      kabuh-hanua      ‘lineage-longtime (5~100 years)’      ‘ancestor’
Samre     siːwB-liəkA      ‘curry-chicken’                       ‘chicken curry’
Mon       ɗɤŋ-sem          ‘town-Thai’                           ‘Thailand’
Stieng    cʰej-keːm        ‘string-wire’                         ‘wire’
Pacoh     tərhaw-jṵən      ‘medicine-Vietnam’                    ‘Vietnamese medicine’


Noun-verb compounding occurs, apparently following a pattern common in Thai and Lao (e.g. Lao kʰɔ̆ːŋ kin ‘thing-eat’ > ‘food’). It is not clear from the available documentation how frequent this is, but it does appear to be an emergent characteristic of this language area (Table 15).

Tab. 15: Noun-verb compounds.

                                 Literal gloss                Translation
Chong          tʰa̤ːk cʰaː        ‘water-eat’                  ‘drinking water’
Stieng         daːk-Ɂoːn         ‘water-drink’                ‘drinking water’
Khmer          nɛːək-cap-trɤj    ‘man-catch-fish’             ‘fisherman’
Surin Khmer    koːn-hɯŋ          ‘child-buzzing/ringing’      ‘a child’s game’
Kuy            taːɁ-cʰoːh        ‘iron-to plane’              ‘carpenter’s plane’

Verb-verb compounds also build the lexical stock, although sometimes there is no clear difference of meaning between the elements or the resultant compound, or at least this is difficult to determine from the descriptive materials available (such as in the Samre example in Table 16). Verbal compounds are not always clearly distinguishable from multi-verb predicates or serial verb constructions; Table 16 provides some apparent examples.

Tab. 16: Verb-verb compounds.

                              Literal gloss                          Translation
Mon         sɒh-ràn           ‘sell-buy’                             ‘to trade’
Stieng      soːŋ-saː          ‘eat-rice eat-other-than-rice’         ‘to eat (in general)’
Samre       kroːkB-tʰaːrA     ‘rise-stand’                           ‘to stand up’
Nyah Kur    ciət-priəp        ‘take-compare’                         ‘to take advantage of’*

* Apparently a calque of Thai ʔaw-prìap ‘to take advantage of’.




24.3.2 Reduplication

Another common type of word formation found in EAA languages is reduplication (see Sidwell 2013 for a broad discussion). We find both full and partial reduplication, with or without phonetic change in the reduplicated part. Functionally, reduplication is used especially for adverbial expressions, for intensification, plurality, and for distributed and indefinite meanings. Detailed profiles of the range of reduplication phenomena are offered for Bahnar by Banker (1964), for Pacoh by Watson (1966), and for M’nong by Dinh Le Thu (2007). Full reduplication without alternation is attested in all EAA languages, often indicating intensification and/or plurality, or continuous or sequential action (Table 17).

Tab. 17: Intensifying/pluralizing reduplication.

Bunong    kwɔŋ          ‘big’           kwɔŋ-kwɔŋ            ‘really big’
Kui       plɜm          ‘fat’           plɜm-plɜm            ‘very fat’
Chong     ŋa̤ːˀj         ‘far’           ŋa̤ːˀj ŋa̤ːˀj          ‘very far’
Samre     kʰiːnA        ‘child’         kʰiːnA kʰiːnA        ‘children’
Khmer     cʰkae tʰom    ‘large dog’     cʰkae tʰom tʰom      ‘very large dog/large dogs’
Bahnar    bat           ‘to know’       bat-bat ~ bəbat      ‘to know a lot’
Bahnar    cɔʔ           ‘to tie up’     cɔʔ-cɔʔ              ‘to tie up then do something else’
Stieng    han           ‘to go/walk’    han-han-han          ‘to go on and on’

Some descriptive reduplication is euphonic in style, reduplicating only part of the lexical base; patterns vary considerably, and some selected examples are given in Table 18.

Tab. 18: Euphonic reduplication.

vowel alternation
  Samre     saləʔA-salaʔA          ‘ignorantly’
  Stieng    khuc-khaːc             ‘destroyed’
  Pacoh     puːc-paːc              ‘to flutter’

rhyme alternation
  Bahnar    kəlɛːŋ-kəlaj           ‘not worth looking at but looking anyway’
  Samre     kamprahA-kamprɛːŋC     ‘tossing and turning’
  Jeh       ʔajaw-ʔajɛh            ‘to pity’

onset alternation
  Pacoh     to̰k-vo̰k                ‘endless amount’
  Halang    ce̤ːw we̤ːw              ‘slashing movement of elephant tusk’
  M’nong    riːk-biːk              ‘to swarm, teem’


In many cases the full reduplicates occur without the base form attested as an independent word, typically in expressive vocabulary; phonological alternations may also occur (Table 19).

Tab. 19: Reduplication without independent base forms.

Pacoh     krɤɲ krɤɲ      ‘of pounding noise when person is sleeping’
Bahnar    glak-glak      ‘of heavy laughter’
Bahnar    kruk-kruk      ‘sound of adult or animal drinking’
Samre     kûəj kûəj      ‘sluggish’
Kui       leːp-laːp      ‘butterfly’
Kơho      cɔk-cɛk        ‘to gossip’
Katu      breh-brel      ‘many colours’

24.3.3 Derivation

EAA languages historically have a system of derivational prefixes and infixes. Reflexes of these remain productive in various languages, and often survive as fossilized forms in languages that have reduced or lost affixation altogether. Old Mon had a full set of derivational affixes, but the derived forms surviving in modern Mon have frequently merged to yield an underspecified hə- with some remaining productivity in the spoken language (Jenny 2014). It is no coincidence that as some languages have greatly reduced phonological complexity (some eliminating sesquisyllables completely), the scope of use of affixes also declines. See Table 20.

Tab. 20: Development of derivation in Mon.

Old Mon     Gloss         Process    Modern Mon
gloṅ        ‘be many’     base       klɒ̀ɲ
guṁloṅ      ‘many’        attr       həlɒ̀ɲ
girloṅ      ‘quantity’    nml        həlɒ̀ɲ
guloṅ       ‘increase’    caus       həlɒ̀ɲ

Derivational strategies are also readily borrowed; for example, various EAA languages have borrowed the Thai nominalizer (kʰwaːm ‘matter/state’) or have calqued the same construction. For example, we find Chong kʰwāːm-pʰoʔ ‘matter-dream’ and Bunong nau-raw ‘matter-worry’ (‘thing-worry’).




24.3.3.1 Nominal derivation

Old Khmer attests a range of nominal derivation by affixation, with prefixes and infixes applied to nominal as well as verbal roots (Table 21).

Tab. 21: Old Khmer nominalizing affixes.
pN-   jāv ‘to barter’   >   pamjāv ‘bartered goods’
‘N-   ruṅ ‘to be big, mature’   >   ʼaṃruṅ ‘size, extent, area’
-n-   pvas ‘to enter holy orders’   >   phnvas ‘holy orders’
-m-   pre ‘to use’   >   pamre ‘servant’
-mn-   jvan ‘to offer’   >   jaṃnvan ‘offering’
-p-   car ‘to plant in a row’   >   c(h)par ‘flower garden, plot’

Modern Khmer has retained some Old Khmer affixes, as well as innovating some new affixal forms (Table 22).

Tab. 22: Khmer nominalizing affixes.
k-:   bɐŋ ‘to screen, shade/cover sth.’   >   k-bɐŋ ‘screen, movable curtain’
s-:   pɪ̀ːən ‘pass over, traverse’   >   s-pɪ̀ːən ‘bridge’
m-:   hoːp ‘eat’   >   m-hoːp ‘food’
N-:   baos ‘to sweep’   >   ʔɔm-baos ‘brush, n.’
-b-:   rɔːəm ‘to dance’   >   rəbɐm ‘dance, n.’
-m-:   sòːm ‘ask’   >   smòːm ‘beggar (so. who asks)’
-n-:   kɪ̀ːəp ‘squeeze, apply pincers’   >   khnɪ̀ːəp ‘pincers’
-vmn-:   dam ‘plant, v.’   >   dɔmnɐm ‘plant, n.’
bvN-:   tùk ‘put away, keep’   >   bɔntùk ‘cargo, load’
kvN-:   cas ‘old’   >   kɔɲcɐs ‘old man [derogative]’
svN-:   bɔ̀ːk ‘peel, strip of bark or skin’   >   sɔmbɔ̀ːk ‘shell, husk, bark, skin’

Khmer is somewhat atypical, as the diversity of derivational affixes in EAA languages is typically smaller and is less productive, and where EAA languages have a range of nominalizing affixes they tend to resemble a subset of the Khmer affixes. The -n- infix is particularly ancient and persistent, with reflexes sometimes showing assimilatory changes (Table 23).


Tab. 23: Examples of nominalizing -n- infix in EAA languages.
Pacoh   katɨp ‘to cork’   kəntɨp ‘cork’
Pacoh   tapaʔ ‘to make fish sauce’   təmpaʔ ‘fish sauce’
Pacoh   kar ‘to drill a hole’   kanaːr ‘hole-driller’
Sedang   ciə ‘to dig’   həniə ‘shove-hoe’
Sedang   soə̰ŋ ‘to divide’   hənoə̰ŋ ‘problem’
Sedang   pa̰n ‘to raise’   məna̰n ‘domestic animals’
Kơho   pat ‘to knead, squeeze’   pənat ‘something kneaded (clay, dough, etc.)’
Kơho   sɛ ‘to turn, detour’   sənɛ ‘place where detour begins or ends’
Kơho   blɔ ‘to wear in the ear’   bənɔ ‘earring’

Some EAA languages use prefixes to derive nouns. In Mon, for example, ʔi- is associated with female personal names and kinship terms, ʔə- is associated with male kinship terms, and the common prefix sɛ̀k- (not found as an independent word) is used to make nouns meaning ‘something to V’ (sɛ̀k-ʔa ‘a place to go’, sɛ̀k-ciəʔ ‘something to eat’).

24.3.3.2 Verbal derivation

The most widespread type of verbal derivation in AA languages is the labial causative prefix, with reflexes in all branches, often in fossilized forms. The realization of the prefix varies somewhat (pə-, pN-, pr-, and mə-) and motivations for the choice of forms are not always clear (Table 24).

Tab. 24: Causative prefix examples.
Khmer   kɐət ‘be born, arise, happen’   >   prəkɐət ‘cause, bring about’
Khmer   dac ‘break, be torn apart’   >   phdac ‘break, separate, tr.’
Chong   hoːc ‘die’   >   mahoːc ‘kill’
Bahnar   ɟiː ‘to hurt’   >   pəɟiː ‘cause to hurt’
Sedang   loj ‘to abandon’   >   pəloj ‘cause to abandon’
Pacoh   sər ‘go up’   >   pasər ‘raise’
Kui   boːl ‘drunk’   >   pəmboːl ‘to intoxicate’
Kơho   sɔŋ ‘straight’   >   bəsɔŋ ‘straighten’
Mon   ceh ‘descend’   >   phjeh ‘take down’
Nyahkur   tun ‘rise’   >   pətun ‘arm a spring-trap’

We also find an -am- causative infix in Old Khmer (e.g. slāp ‘to die’, saṃlāp ‘cause to die’), and a -u- causative infix in Old Mon (e.g. gloṅ ‘be many/much’, guloṅ ‘cause to grow’), both of which are arguably reflexes of the labial prefix metathesized into initial clusters. Other prefixes with causative function also occur in different languages. For example, Banker (1964) reports tə- and ʔə- allomorphs of the causative in Bahnar.




Costello reports that “/pa/ is the most common causative in the Katu, but /pi/, /ka/, and /ta/ also occur” (Costello 1998: 33). Similar variation is found in other EAA languages.

Another widespread verbal affix in EAA is the tə- reciprocal marker, which is also reported as a reflexive and/or passive marker, indicating subject as undergoer. Note that the form in Katuic is *tər-, with regular loss of /t/ yielding rə-, and is reduced to hə- in Mon (Table 25).

Tab. 25: Reciprocal prefix examples.
Sedang   cuə̰ ‘to obey’   >   təcuə̰ ‘to obey (each other)’
Bahnar   cum ‘to kiss’   >   təcum ‘to kiss (each other)’
Pacoh   pɛːŋ ‘to beat’   >   tərpɛːŋ ‘to beat (each other)’
Bru   tɒːt ‘to peck’   >   rətɒːt ‘to peck (each other)’

Across EAA a variety of other verbal affixes are described, although only the causative and reciprocal discussed above are clearly old. Individual languages and subgroups have independently innovated specific forms/functions. For example, Gradin (1976) describes for Jeh: ra- frequentative/intensifying, ʔa- resultative; Costello (1966) for Katu: ha- causative passive, ta- adjectivizer, ta- involuntary, ka- purposive; Watson (1966) for Pacoh: ti- resultative state, ca- completative; and considerably more diversity could be compiled.

Other types of derivation are also found. For example, Chong and Kơho use prefixes to derive locative adverbials (e.g. Chong dɨŋ ‘on’, padɨŋ ‘above’; Kơho ɗaŋ ‘up, top’, hə-ɗaŋ ‘on top of’). In Mon, prefixes are attached to the demonstrative and interrogative stems to derive pronouns and adverbs, originating from old nominal prefixes or compounds. Other kinds of derivational affixing occur, but their significance is mostly local, and more work is required to determine whether more widely distributed and historically old affixes can be identified.

24.3.4 Inflectional morphology

Inflectional morphology is largely absent in EAA languages, grammatical relations being generally encoded analytically. Nonetheless, case relations are marked on personal pronouns in some Katuic languages. Solntseva (1996) reports on Taoih spoken in Vietnam,² describing specific case-marking prefixes as follows: ʔa- dative, ʔŋ- genitive, ʔi- locative. Alves (2006) reports two case-marked pronouns for Pacoh: ʔa- dative and ʔN- possessive, clearly corresponding to Solntseva’s similar categories. There have been suggestions that these case-marking forms may reflect older AA structures (Anderson 2004), but the origins of these case-marking prefixes remain obscure.

² It is not quite clear which Katuic language Solntseva is actually describing, and it may be a local designation for a lect that might be regarded as a variety of Pacoh.

24.4 Clause structure

24.4.1 Simple clauses

24.4.1.1 Intransitive clauses

Intransitive clauses in EAA languages strongly prefer SV (subject-verb) word order; the verb may or may not be accompanied by tense-aspect-modality markers, directionals, or other modifiers. This word order is already attested in the earliest documents of Old Mon and Old Khmer, and it seems that it came to dominate in the Southeast Asian linguistic area in Indo-China at least two millennia ago, although there are suggestive traces of older verb-initial structures (Jenny 2015). In some languages the normal order of constituents may be inverted; in Mon this is possible mostly with existential or presentational verbs. The two word orders in Mon have different information structural functions, in line with the general topic-comment structure of the language.

Mon (Jenny 2014: 584)
(1) pɤŋ seh mɔ̀ŋ ɲìʔ thɔ̀ raʔ
    cooked.rice remain stay little only foc
    ‘There’s a little bit of rice left.’

Mon (Jenny 2014: 584)
(2) seh mɔ̀ŋ chaʔ pɤŋ, hwaʔ ʔɒt ʔa jaʔ
    left stay excl cooked.rice curry all go nsit
    ‘There’s only rice left, the curry is gone.’

In Katu intransitive clauses there is variation in the order of S and V, with VS commonly found in the published data. Examples (3a)–(3c) demonstrate this variability in order.

(Costello and Sulavan 1993: 460)
(3) a. bət pɛː cʰet
       all 2p die
       ‘All of you will die.’

(Costello and Sulavan 1993: 39)
(3) b. ʔəj plɛŋ pavaːç maŋiː, cʰet bət manɨç, kah
       already sky cause long.ago, die all person, no
       manɨç, ʄəʔ
       person, more
       ‘The spirit of the sky caused the end of everything, all people died, there were no people any more.’

(Costello and Sulavan 1993: 39)
(3) c. ʔəj ɗok cʰet ʔakɔɲ, cʰet krɑːl; ŋaːj duː teːŋ harɛː
       already now die father, die starve; anyone who work field
       ‘The father is already dead, we will die of hunger; who will work the fields?’

24.4.1.2 Transitive clauses

The most frequent word order in EAA transitive expressions is AVP (Agent Verb Patient), consistent with SV found in intransitive clauses. Deviations from verb-medial order do occur for pragmatic reasons (see discussion below under “Syntax and pragmatics”), as does the omission of contextually retrievable arguments and other constituents. AVP is found from our earliest inscriptional texts through to the present day. This general situation is broadly consistent with word order patterns across the MSEA linguistic area.

Old Khmer (Jenner and Sidwell 2010: 43)
(4) paṃnvas cya slā
    monks eat areca
    ‘The monks were chewing areca nut.’

Khmer (Jacob 1968: 59)
(5) mdɐːj thvɤ̀ː mhoːp
    mother make food
    ‘Mother is making the meal.’

24.4.1.3 Ditransitive clauses

Ditransitive clauses have three arguments: Agent, Theme, and Goal/Recipient. The most frequent word orders in EAA are AVGT and AVTG. Typically there is a limited set of ditransitive predicates; the archetypal ditransitive verb is ‘give’, although other benefactive verbs are also used, such as ‘to feed’ (essentially lexicalizing the giving of specific items or states). AVGT order is favored, for example, in Katuic, Bahnaric, and Mon, and those languages require no overt marker of grammatical role, so that word order and semantics alone determine the grammatical relations.


Kui (Bos and Sidwell 2014: 844)
(6) a. kruːpɛ̤ːt ʔɒːn beʔ ʔiː krəncaɲ muŋ
       doctor give person ill malaria mosquito.net
       ‘The doctor is giving the malaria patients mosquito nets.’

Bahnar (Banker 1965: 28)
(6) b. ɲoːn ʔathaj ʔan kəː sɨː məh
       1p.excl must give prt 3S rice
       ‘We must give rice to him.’

In Khmer and Pearic, the order is AVTG, with no overt marker of G or T.

Khmer (Bisang 2014: 704)
(7) khɲom ʔɐoj luj kòət
    1sg give money 3s
    ‘I give him money.’

Chong (Premsrirat and Rojanakul 2014: 612)
(8) siŋ pɔ̂ːn kʰâːw boːt
    pn feed cooked.rice younger.sibling
    ‘Sing feeds rice to (his) younger brother.’

Some EAA languages have overt marking of recipients/goals. Mon uses kɒ, which is homonymous with ‘to give’ although it continues Old Mon (and Literary Mon) ku dative. Apart from the goal of ‘to give’, all G arguments can (and usually do) take this marker (Jenny 2005: 207).

Mon (Jenny 2005: 70)
(9) ɗɛh kɒ ʔəca lòc mùə
    3 give teacher book one
    ‘He gave the teacher a book.’

Pacoh favors AVTG and has a special set of (prefixed) case-marked pronouns, so pronominal recipients/goals are marked, and this permits variability in word order (particularly in imperatives). Also, the appropriate dative-marked pronoun can be used as a dative marker:

Pacoh (Alves 2006: 99)
(10) jṵən patḛːc ʔalik ʔadɔː pako̰h
     Vietnamese sell pig dat Pacoh
     ‘The Vietnamese sell pigs to the Pacoh.’

Stieng allows variable order, with G marked with dah in AVTG clauses.




Stieng (Miller 1976: 10)
(11) a. buː ʔaːn kərjaː dah juon
        3s give medicine dat Vietnamese
        ‘He gave medicine to Vietnamese’
     b. buː ʔaːn juon kərjaː
        3s give Vietnamese medicine
        ‘He gave the Vietnamese medicine’

Old Khmer favored AVTG, with G marked by the oblique marker ta.

Old Khmer (Jenner and Sidwell 2010: 69)
(12) … kamrateṅ kaṃtvan ‘añ viṅ oy prasāda bhūmi sratāc
     … high.lord maternal 1s again give favour land pn
     ṛdval ta loñ vasudeva
     pn obl title pn
     ‘… My High Lord in the maternal line again give the tracts in Sratāc [and] Ṛdval as a royal grant to the loñ Vāsudeva …’

Bunong, a Bahnaric language under Khmer influence, also marks G with ta (analyzed as a locative preposition by Butler 2014).

Bunong (Butler 2014: 738)
(13) ŋit ən tap ta ɲcʰot
     pn give egg loc pn
     ‘Ngit gave an egg/eggs to Nchot.’

Closely related to Bunong, Kơho marks G post-positionally with ʔin (which may be compared etymologically to Chong (Pearic) ʔiːn ‘get/achieve’).

Kơho (Olsen 2014: 777)
(14) kʰaj ʔaj tərnɒːm rəpuː ʔin
     3sg give rice-wine buffalo dat
     ‘S/he gave rice wine to the water buffalo.’

24.4.2 Subordinate clauses

In many EAA languages, subordinate clauses cannot readily be distinguished from coordinate or juxtaposed independent clauses, as often no overt marker is used. The subordinators most commonly used fall into three groups: (i) grammatical function words, such as prepositions, (ii) partly desemanticized verbs, and (iii) desemanticized nouns. Desemanticized verbs are frequently used as complementizers, while verbs or nouns with a wide range of lexical meanings are used to introduce adverbial clauses.


24.4.2.1 Relativization

Relative clauses generally follow the head noun they modify, in line with the usual position of postposed modifiers. The relative clause may be introduced with or without a relativizer. The following are examples from various EAA languages that show the use of relativizers.

Bunong (Bequette 2008: 49)
(15) paŋ iː cjab ntro̤ːk ŋkwɔŋ ri rəlac
     3s rel graze cow male dist argue
     ‘That one who grazed a bull argued …’

Khmer (Bisang 2014: 694)
(16) mənùh dɐel khɲom nɪ̀jɪ̀ːəj
     human rel 1s talk
     ‘the man about whom I talk’

The Pacoh relativizer ʔən is the same morpheme as ʔən 3S personal pronoun.

Pacoh (Alves 2006: 71)
(17) viː klɨŋ tikuəj ʔən ləjʔ hoːj toːŋ kaːŋ pako̰h
     exist many human rel neg able speak language Pacoh
     ‘There are many people who can’t speak the Pacoh language.’

Note that the Kui relativizer ləm only introduces relative clauses that relativize past events.

Kui (Bos and Sidwell 2014: 850)
(18) ʔaːw niː ləm naːw hɛːk bəːn
     shirt prox rel 3S tear get
     ‘This shirt that he managed to tear up’

Old Mon uses a relativizer ma ~ mun to introduce attributive clauses, while in modern Mon it has fallen out of use and the whole relative clause follows the modified head noun without an overt relativizer. Strikingly, Bahnar may use a cognate relativizer (maʔ).

Old Mon (Shorto 1971: 123)
(19) mirmas guṇ ma smiṅ ʔiñcim jirku
     remember favour rel king sustain body
     ‘He remembered the favors with which the king had sustained him.’

Bahnar (Banker 1965: 39)
(20) sɛc sɨː luʔ maʔ ʔɨh kəː gan ʔləŋ nɔh
     meat 3s pl rel neg prt very good that
     ‘the meat which is not very good’




In Chong the relative clause can be introduced with or without tʰîː, a borrowing of Thai tʰîː ‘that/which’.

Chong (Premsrirat and Rojanakul 2014: 621)
(21) wo̤k tʰîː can ʔɨt cḛːn prəː ʔiːn ʔih jɔʔ
     cloth rel pn give come use get neg sp
     ‘(I) cannot use the cloth that Jan gave (me).’

24.4.2.2 Complementation

Complement clauses occur frequently as clausal arguments of verbs of saying, perception, and cognition (‘think that …’, ‘say that …’ etc.). Often complement clauses are unmarked and directly adjoined to the matrix clause. Similarly, when embedding questions, no complementizer may be necessary since the question is apparent. Unmarked complement clauses can readily be found in EAA languages, as in the following examples.

Pacoh (Alves 2006: 112)
(22) kɨː ɲo̰ŋ ʔaʔɛːm ʔɔː lɨː
     1s see younger.person nice very
     ‘I see that you are very nice’

Sedang (Smith and Sidwell 2014: 807)
(23) ʔɛ khḛn ʔoh tu wa ʔa̰
     2s say not not want 1s
     ‘You said (that you) don’t want me’

Bahnar (Banker et al. 1973: 17)
(24) ʔiɲ kʰaːn ʔiɲ biʔ.siʔ bəŋaːj.naŋ.pəgaːŋ
     1s say 1s not nurse
     ‘I answered that I wasn’t a nurse.’

Overt complementizers may be optionally used in some EAA languages. For example, in Bunong (Bequette 2008) lah and iː can be used to introduce complement clauses; similarly, in Chong (Premsrirat and Rojanakul 2014) wâː (borrowed from the Thai for ‘say’) and lok are documented as complementizers, although they may be absent in speech. In Khmer, Mon, Kui, and other languages, the verb ‘say’ is grammaticalized as a complementizer, in addition to simply joining complements to the matrix.

Khmer (Jacob 1968: 100)
(25) krùː prɐp khɲom thɐː kmeːŋ nùh prəlɔːŋ cɔːəp.
     teacher tell 1sg say boy dist sit.for.examination pass
     ‘The teacher told me that the boy will pass the examination.’


Kui (Bos and Sidwell 2014: 851)
(26) kəː kət paːj daːʔ.kʰiəl haj sɔt me̤ːn.te̤ːn
     1S think say honey 1incl pure intens
     ‘I think that my honey is pure indeed.’

Literary Mon (Jenny 2005: 96)
(27) theṅ gaḥ yay gna-kyāk ḍāṁ-ḍāṁ pra-pra.
     think say illness queen real-RDP true-RDP
     ‘He thought that the illness of the queen was real.’

An alternative construction is to have the complement clause precede the matrix. This is also a frequent construction in Mon, where the fronted complement clause is overtly marked as topical or non-predicative by the medial demonstrative.

Mon (Jenny 2014: 571)
(28) kɤ̀ʔ pɔŋ.phak ʔa mùə.han kom.kaoʔ kao plàj
     get associate go together with older.brother young.man
     kɔ̀h tɔ̀h hùʔ màn raʔ
     medl be neg win foc
     ‘It is impossible that I will stay together with you, my brother.’

24.4.3 Clause chaining and coordination

Throughout EAA, clauses expressing connected events may be chained with or without an overt linker. Furthermore, a distinction between chained events and serial verb constructions may not be obvious, as paratactic chaining may be used when there is no change of subject from one clause to the next and the events occur in consecutive order. On the other hand, particularly when contrastive or conditional meanings are being expressed, conjunctions are common, and many of them are evidently borrowed. Ploykaew (2001: 172) lists eight Thai conjunctions used “a lot” in Samre speech, so it appears that such clause-peripheral elements are readily borrowable.

24.4.3.1 Simple coordination

In examples (29)–(34) clauses are chained without a linker.

Old Khmer (Jenner and Sidwell 2010: 45)
(29) dep dau prāp cāmpa cāp phsok śata vyar jvan
     next go.forth defeat Cham catch prisoner hundred two take
     ta kamrateṅ jagat vrai.lvac
     link High.Lord 3.pol placename
     ‘Later [he] went forth to subdue the Cham, took two hundred prisoners of war, and offered [them] up to the High Lord of the World at Vrai Lvac’




Bahnar (Banker 1965: 11)
(30) bahnaːr saː ləm miɲ nar ɓaːr ʔmaŋ pʰɔʔ peːŋ ʔmaŋ
     bahnar eat in one day two time sometimes three time
     ‘Bahnar eat in one day two times sometimes three times.’

Samre (Ploykaew 2001: 180)
(31) patakaːC tɔŋA miɲA cʰuəkA kanuətB kʰuːɲA ciːtB mpʰiːtC
     front house mother pound rice father to.point bamboo.strips
     ‘In front of the house, mother pounds rice, father points bamboo strips.’

Bunong (Philips 1973: 132)
(32) ɲup paŋ choŋ klup təm ntru͍ŋ
     catch 3s carry throw loc pen
     ‘Catch it, carry it over and throw it into the pen.’

Stieng (Miller 1976: 19)
(33) buː pəːn tiew taːŋ lɛʔ ʔaː pɔh nej
     3p strike burn destroy all at village there
     ‘They attacked, burned and destroyed all the village there.’

Pacoh (Alves 2014: 892)
(34) pḛːh kənnoh koh sḛːc buəjʔ
     take cutter cut meat fish
     ‘Take the cutter and cut the meat of the fish.’

Pacoh can use coordinating conjunctions (homophonous with personal pronoun forms). Watson (1964) found that this was specifically for coordinating human nouns, while a later data collection (1997) found ʔaɲaː ‘and’ (< 3D-P general pronoun) used as a general coordinating conjunction.

Pacoh (Alves 2006: 53)
(35) kɨː ʔiɲ pləj baːr lam ʔaciːw ʔaɲaː mo̰ːj lam kuək
     1S want buy two unit knife and one unit hoe
     ‘I want to buy two knives and a hoe.’

In spoken Mon clauses can be chained together, normally linked by toə ‘finish’, toə-teh ‘and then’:

Mon (Jenny 2014: 568)
(36) kraoh pèh həlɒ̀c kətɒ nù kəmot toə kəliəŋ.cao klɤŋ kəpac
     male 2 shake rise abl fire finish return come side
     hɒəʔ pèh raʔ
     house 2 foc
     ‘Your husband got up from the fire and came back to your house.’

In Kơho the instrumental marker mə can be used with a coordinating function.


Kơho (Olsen 2014: 765)
(37) kʰaj ŋgum kɔj mə ŋkʰjaŋ pʰɛ
     3s winnow paddy ins separate uncooked.rice
     ‘She winnows the paddy and [she] separates the rice.’

24.4.3.2 Conditional and contrastive coordination

Where conditional or contrastive meanings are intended it is normal to use an appropriate conjunction. It is notable that these conjunctions are often transparently borrowed from higher status languages (in the examples below Pacoh maː, Sedang təma, Stieng məːʔ may be borrowings of Vietnamese mà ‘but’); in other cases their etymology can be obscure.

Bahnar (Banker et al. 1973: 18)
(38) ʔiːɲ ʔan kəː baʔ ʔiːɲ ʔɛːt naŋ ham daʔ.ʔjaʔ
     1s give link father 1s drink if with get.better
     ‘I’ll give (medicine) to my father to take to see if he will get better.’

Stieng (Miller 1976: 19)
(39) məːʔ ɓɨːn gɛh prak, ɓɨːn gɛh piː ʔɛːn ʔaː soːŋ
     if not have money, not have thing what in.order.to eat
     ‘If I didn’t have money, I wouldn’t have anything to eat.’

Old Mon (Jenny 2005: 21)
(40) yal kcit sak ñaḥ ma yām.
     cond die neg:exist person rel weep
     ‘If/when they die, there is no one to weep.’

Samre conditional clauses may be introduced with zero or tʰaːB (borrowed from Thai tʰâː ‘if’):

Samre (Ploykaew 2001: 191)
(41) tʰaːB nakB kohA wiɲB ciːwA kʰɹaːA naːcB nakB naːB nɔːŋB
     if 3S neg lost go way other 3S should will
     jipA kleːA kuəjC kukA həːjC
     come reach long long nsit
     ‘If he has not lost the way he should have come long ago.’

In Laven conditionals are introduced with kʰan (from Lao kʰán ‘if’).

Laven (Jacq 2001: 342)
(42) kʰan ʔlɔːŋ ʔsoʔ ktɨəŋ kleh tɨəʔ htʌw
     if wood rotten bone fall down hole
     ‘If the wood is rotten the bones fall into the hole.’




The Kui conditional bɜː is a borrowing of Khmer baə ‘if’.

Kui (Bos and Sidwell 2014: 853)
(43) prənɒː bɜː mia cənap lic tuɒn həːj krənaː
     tomorrow if rain strong flood again nsit road
     ‘If it rains hard tomorrow the road will be flooded again.’

In Pacoh there are two conditional conjunctions nam, lah ‘if’; the if-clause can occur before or after the main clause. Strikingly, nam resembles Lao nám ‘to accompany; with/because of’.

Pacoh (Alves 2006: 88)
(44) ʔitaʔ pɛː kərlɔːh nam pṵən naʔ tikuəj
     make three floor.layer if four unit person
     ‘Three karloh can be made if there are four people.’

It is quite common for EAA languages to have several different coordinators that signal contrastive meanings, similar to English ‘but, except, even though, etc.’. Three contrastive conjunctions, suə̰, mɛ, təma ‘but, except’, are recorded for Sedang.

Sedang (Smith and Sidwell 2014: 806)
(45) nah kla̰ hiə̰ŋ ka rɔ waj təma ʔoh ta ca
     previously tiger nsit eat cow 3P but neg neg get
     ‘Previously a tiger had killed their cow but (they) didn’t catch (it).’

Similarly, for Bahnar several are listed: səmaː, maː lɛj, raʔ ‘but’, e.g.:

Bahnar (Banker 1965: 21)
(46) brɛː sɨː waʔ brɔk raʔ tik ɓət trət ʔniː kɔʔ sɨː kual
     pl 3s want return but loc place swamp prox dog 3s bark
     kuh
     bow.wow
     ‘They were returning but at the place where the swamp is their dog barked “bow wow”.’

In some cases only one contrastive conjunction is mentioned for a given EAA language.

Pacoh (Alves 2006: 53)
(47) hoːm ŋaːj taʔ maː kɨː ləjʔ cɔːm taʔ
     see 3S make but 1S neg know make
     ‘(I’ve) seen them make them, but I don’t know how to.’

Pacoh (Alves 2006: 88)
(48) maj poːk ʔabɨːʃ kɨː ləjʔ poːk
     2s go otherwise 1S neg go
     ‘Go, otherwise I won’t go.’


Samre speakers have borrowed Thai tɛ̀ː ‘but/only’:

Samre (Ploykaew 2001: 173)
(49) nakB licA nɔːŋB jipA tɛːB kohA jipA
     3S say will come but neg come
     ‘He said (he) would come, but (he) did not come.’

Laven interchangeably uses a native form bat (resembling Mon bɒt ‘as much as’) and the Lao tɛː for ‘but’:

Laven (Jacq 2001: 335)
(50) kmɔː hʌːc kəːt kaː ɗɨʔ (tɔː) … bat kmɔː nɛː ʔih kəːt
     year finish have fish many (animal) but year prox neg have
     ‘Last year there were many fish, but this year there are none.’

Khmer has various strategies for contrasting clauses; in the following example aena: ‘where’ functions to signal ‘instead of’ or ‘however’:

Khmer (Haiman 2011: 251)
(51) ckae nwng via mwn me:n nev sngiam aena: via kheu:nj
     dog the 3 not really be.at quiet where 3 see
     krabej
     water.buffalo
     ‘Instead of staying quiet, the dog stared at the water buffaloes.’

24.4.4 Questions

Interrogative expressions of two types are discussed: polar questions and content questions. Polar questions in EAA languages are typically formed in two ways: (i) with a distinct intonation rise or fall at the end of a declarative statement (marked with ↑, ↓ accordingly), and/or (ii) by the addition of a sentence-final polar question particle (pq), for which a negator may do duty (much like Lao bɔ̄ː). Paralleling Lao, the strategy of adding the regular pre-verbal negator form to the end of the clause to form a polar question is common across EAA. Similarly, if a positive response is anticipated or demanded, a positive particle is used. Additionally, there are various (apparently) indigenous sentence-final question particles, and these can even be combined with borrowed particles. In Chong there are four polar question particles ʔih doː, ʔih, rɨː taj, ʔih daːj or hɔʔ hɛː, which may be interchanged depending on the context, and it is clear that the rɨː and daːj morphemes are Thai loans. In Pacoh the particles ləjʔ and bɨʃ are used to elicit affirmative answers and negative answers, respectively; the former somewhat parallels the use of nɔ̄ in Lao, which expects acquiescence to requests.




Pacoh (Alves 2006: 86)
(52) joːl dɔ̰ːj ləjʔ
     still-exist cooked-rice neg
     ‘Is there still some rice left?’

Mon has a specific polar question particle ha.

Mon (Jenny 2005: 16)
(53) pèh ʔa kɒm ha
     2 go too pq
     ‘Are you going along?’

Laven has the polar question particle lɔh, possibly influenced by Thai /rʉ́/ (with r > l):

Laven (Jacq 2001: 350)
(54) saw ʔih naʔ kəːt kuan lɔh
     2S NEG yet have child pq
     ‘Aren’t you going to have children?’

Kơho (Olsen 2014: 763)
(55) me lɔːt draːʔ səl
     2sg.m go market pq
     ‘Are you going to market?’

Clause-initial interrogative particles also occur, although less commonly. In the case of Katu example (56) this includes the widespread AA negator kah, hinting at an older Austroasiatic strategy (although the line between negation and interrogation in such examples can be unclear). A clearer example is found in Sedang (57).

Katu (Costello and Sulavan 1993: 54)
(56) kah maj lʌj vəːk kah tapaːŋ tʌj manɨjh
     neg 2S see monkey neg palm hand person
     ‘Haven’t you seen the monkey, isn’t it the hand of a person?’

Sedang (Smith and Sidwell 2014: 821)
(57) ʔahom ʔo̰j ʔa̰j drow neo̰
     pq remain exist wine more
     ‘Is there still more wine?’

In both Mon and Khmer declarative statements can be turned into questions with just a rising pitch at the end of the clause. In Kui a polar question can be asked with a rising intonation clause-finally, or with a falling final pitch if the topic is fronted.


Kui (Bos and Sidwell 2014: 846)
(58) a. niː ʔaːw maj ↑
        prox shirt 2
        ‘Is this your shirt?’
     b. ʔaːw maj niː ↓
        shirt 2 prox
        ‘Is this your shirt?’

Khmer can also form questions from declaratives with both rising and falling intonation contours. According to Bisang, “questions can have rising contour if the speaker is respectful but falling contour is also possible if the status of the speaker is equal or higher” (Bisang 2014: 681).

Content questions

Content (or Wh-) questions are typically formed with interrogative pronouns; we could regard them as adverbs, but as they effectively take the place of a nominal in a declarative clause, a pronominal analysis is preferred here. The position of the question word may vary, i.e. in-situ vs. fronted, as it may for regular arguments for pragmatic reasons. The forms of interrogative pronouns are often transparently built on interrogative bases or, as in the case of Mon (mùʔ ‘what’, ʔəlɒ ‘where’, ɲèh-kɔ̀h ‘who’, chəlɔ̀ʔ ‘when’, etc.), are not composed from common component morphemes. Jacq (2001) describes Laven combining the following forms: ʔjaŋ~taŋ ‘Q’, ʔjuo, ʔuo ‘thing’, kəːt ‘have/exist’, duːɲ ‘long time’, hmɔh ‘to name’. The last of these, while a lexical verb, also functions as an interrogative base when the question is seeking information that is unknown to the speaker.

Laven (Jacq 2001: 162)
(59) taŋ kəːt saw ɲɨəm
     q have 2S cry
     ‘Why are you crying’

Laven (Jacq 2001: 162)
(60) ʔjaŋ hmɔh sɨː saw
     q to.name name 2S
     ‘What is your name’

Laven (Jacq 2001: 163)
(61) hmɔh ʔjuo trie han bɨːm wiak
     q thing wife 3S do work
     ‘What (kind) of work does his wife do?’

Temporal interrogatives can vary in placement iconically; in Kui a prospective interrogative occurs at the end of the sentence, whereas the retrospective occurs initially.




Kui (Bos and Sidwell 2014: 847)
(62) ciə pʰsaːr pɒ̤h.nia
     go market when
     ‘When do [you] go to town?’

Kui (Bos and Sidwell 2014: 847)
(63) təh.naː kɒːn maj n̩druh sɛːŋ rɜː dɔŋ
     when child 2 fall descend from house
     ‘When did your child fall from the house?’

Strikingly, Mon has the opposite strategy to Kui: clause-initial cʰəlɔʔ is future ‘when’, while clause-final is past ‘when’. In high style or polite spoken Mon, the “relative questional particle” rao is used sentence-finally, in addition to the interrogative mùʔ, which can appear fronted or in the expected position of a P argument after the verb.

Mon (Jenny 2005: 18)
(64) mùʔ ɗɛh hɒm rao
     what 3 speak qrel
     ‘What did he say?’

Mon (Jenny 2014: 567)
(65) ʔəwao hnòk ʔɒt kɔ̀h klon mɔ̀ŋ mùʔ rao.
     older.brother big all medl do stay what qrel
     ‘What does your oldest brother do?’

Sentence-final content interrogatives are also common, as in Chong (66).

Chong (Premsrirat and Rojanakul 2014: 616)
(66) cʰɔː tap tōŋ nih
     dog bite loc where
     ‘Where did the dog bite you?’

24.4.5 Imperatives

Among EAA languages a common simple imperative expression is a verb with or without the addressee overtly expressed. Additionally, particles may be added: special imperative markers (such as the verb ‘to go’ grammaticalized in that function), polite or downtoning particles, and/or particles that add force/emphasis.

Bahnar (Banker 1965: 47)
(67) ʔɛː nɛʔ kəː hliː hɔː
     2S don’t link afraid imp-emph
     ‘Don’t be afraid!’


Pacoh (Alves 2006: 90)
(68) kɨː papiː ɟo̰ːn ʔamaj, kammaŋ ʔaw
     1S speak for 2s.dat listen imp
     ‘When I talk to you, listen’

Samre (Ploykaew 2001: 199)
(69) ciːwA nɔːŋB.saːA tʰəʔB
     go together imp
     ‘Go together!’

Khmer (Bisang 1992: 439)
(70) soːm tɐː ʔɔɲcɤ̀ːɲ tɤːu ʔɔːkùn kèː coh
     ask grandfather invite go thank 3sg imp
     ‘Please, you go and thank him.’

Nyah Kur is described as having a number of imperative particles indexing a ranking of speaker sentiments: duh, rɯːtna ‘positive strong command’, tʰəːt, kadɔw ‘positive mild imperative’, meʔ ‘positive mild command’, cʰəːl~cʰəːn, ɲim ‘positive polite imperative’.

Nyah Kur (Memanas 1979: 157)
(71) a. ʔar kadɔw
        go imp
        ‘Go (if you want to)’
     b. ʔar toːk daːk duh
        go draw water imp
        ‘Go to draw water!’
     c. kraːs hĩːʔ kul cʰəːl
        broom house give imp
        ‘Please broom (my) house’

Negative imperatives or prohibitives may be expressed differently from negative statements by employing a prohibitive marker (see discussion under 24.5.3).

24.4.6 Syntax and pragmatics

EAA languages frequently manifest pragmatically determined sentence structure. Constituents can be more or less freely fronted for pragmatic effect, and known arguments (such as those previously mentioned) may be omitted, as in other languages of the linguistic area, such as Thai and Lao, although these tendencies clearly precede the formation of the language area. Consider the example from Bunong (72).

Bunong (Bequette 2008: 42)
(72) Ø ʔɵh Ø rəŋ lɛʔ hɵːj
     reply sufficient all nsit
     ‘(He/they) replied (to Rabbit) “All here already”.’



Eastern Austroasiatic languages 


Topicalization is frequent in the Old Khmer inscriptions, with NPs optionally introduced by the prepositions ri ~ ri e, e, nau ~ nau ru. Given that the default position for agents is clause initial, topicalized NPs are typically other arguments brought to the sentence head. A topicalized subject can be followed by an anaphoric pronoun, or a genitive NP; see Old Khmer (73), with a genitive topic.

Old Khmer (Jenner and Sidwell 2010: 49)
(73) nau ta yokk neḥ ta roḥ neḥ ti pre kāp thpvaṅ
     top link take prox link manner this link order chop head
     ‘Of those who take these aforesaid – [the executioner] shall be ordered to cut off [their] heads’

Similar constructions are found in contemporary EAA languages, with or without overt marking of the topic.

Khmer (Haiman 2011: 211)
(74) phtɛːəh thmɤj nɪ̀h kèː khɐn cɪ̀ːə bɤj bɔntùp.
     house new prox 3pl divide be three room
     ‘This new house, they divided into three rooms.’

Mon (Jenny 2005: 230)
(75) kon pèh ʔuə pʰjiəʔ hùʔ màn pùh
     son 2 1s caus:eat neg to.win neg
     ‘I am not able (don’t have enough food) to feed your children.’

Pacoh (Alves 2006: 98)
(76) limɔː pəlluk ʃaːc ʔnnɛh kɨː ɟo̰ːn ʔadɔː tʰəj-jaw kɨː
     several unit book dem 1s give dat teacher 1s
     ‘As for these several books, I’m giving (them) to my teacher.’

In various EAA languages, as in the following example from Mon, when the A argument is fronted, it can be resumptively expressed by a pronoun in situ.

Mon (Jenny 2014: 565)
(77) ʔəpa ʔuə (kɔ̀h), ʔərɛ̀k (ɲèh) hùʔ sɤŋ, ɓɔk (ɲèh) hùʔ sɤŋ.
     father 1sg medl liquor person neg drink cigar person neg drink
     ‘My father, he doesn’t drink, nor does he smoke.’


24.5 Grammatical categories

24.5.1 TAM and directionals

24.5.1.1 Tense and aspect

The expression of tense and aspect in EAA languages varies considerably and we can only touch upon some of the diversity here. Grammatical tense is often not directly coded in EAA languages; although some do express a distinction of future versus non-future grammatically, it is not always clear if the “future” marker is better characterized as a marking of irrealis or probability, and therefore of modality rather than of tense. It is also common for a verb (often lexically ‘to finish’, but others are used, such as ‘get’ in Khmer) to be grammaticalized as a completive aspect, past tense, or new situation (NSIT) marker. Modals can also often be combined to communicate more subtle tense-aspect meanings, although very frequently there are no overt markers of tense or aspect and the meaning is understood from context:

Laven (Jacq 2001: 289)
(78) saːw maː prian ʔaːj dɛʔ
     2s fut teach 1s emph
     ‘You will teach me, won’t you?’

Samre (Ploykaew 2001: 129)
(79) maluəŋB ʔanA kamlaŋA nɔːŋB penA kʰruːB
     man prox prog fut be teacher
     ‘This man is going to be a teacher.’

In the Bahnar sentence (80) the future is implied by the sequence of events indicated by ‘then’ while in (81) the obligation indicated by ‘must’ is clearly irrealis-future.

Bahnar (Banker 1965: 28)
(80) … lɛːj ʔiɲ saː ʔɛː hɔː
     … then 1s eat 2s emph
     ‘… then I will eat you.’

Bahnar (Banker 1965: 28)
(81) … ɲoːn ʔatʰaj ʔan kɨː sɨː məh …
     … 1pl.excl must give link 3s rice …
     ‘… we must give rice to him.’

Kơho has two pre-verbal auxiliaries, overtly marking whether action is completed (nɛh) or not (rəp). For sentences without these the intended tense/aspect is understood pragmatically, and/or a temporal adverb is used. If rəp is used the action can be ongoing or going to happen.



Kơho (Olsen 2014: 784)
(82) nam daʔ kʰaj rəp lɔːt tam daːlaːc
     year next 3sg non-compl go loc Dalat
     ‘Next year, I will go to Dalat’

In Khmer the verb baːn ‘to get (come to have)’ used preverbally can have three functions: ability/permission, past, or truth against the presupposition otherwise (Bisang 2011).

Khmer (Bisang 2014: 697)
(83) khɲom baːn tɤ̀u pʰsaː
     1s get go market
     a. ‘I was able/allowed to go to the market.’ (ability/permission)
     b. ‘I went to the market.’ (past)
     c. ‘I did go to the market.’ (against the presupposition that I didn’t.)

Khmer also has a clause final morpheme haəj, which Bisang characterizes as signaling “completion of an action, change of state (perfect), beginning of an action” (Bisang 2014: 690) (closely paralleling the syntax of Thai lɛ́ːw ‘finished; already; afterwards’). The same marker is widely borrowed among EAA languages (most often as həːj, with similar syntax and semantics). The range of meanings/uses suggests a unitary semantic analysis of “new situation” (nsit), an aspectual category widespread in Southeast Asia which expresses that a situation has been established after a change of state. In negated expressions, the reading is regularly ‘not anymore’. It is not always evident from the descriptive literature whether a marker glossed as past or perfect(ive) is actually an nsit marker, but it is clearly a widespread feature of the MSEA linguistic area.

Kui (Bos and Sidwell 2014: 872)
(84) haj pac pəŋko̤ːl rəmbɒːŋ bəːn həːj
     1incl chop pole fence get nsit
     ‘I have logged fence poles already.’

Samre (Ploykaew 2001: 129)
(85) maluəŋB tenB kəːjB penA kʰruːB kuəjC həːjC
     man prox prog be teacher long.time nsit
     ‘This man is going to be a teacher.’

Bunong (Bequette 2008: 117)
(86) kə̤p ntrə̤t maj naj həːj
     1s startle 2s.masc medl nsit
     ‘I was startled by you there.’


Stieng (Miller 1976: 43)
(87) hej saː pieŋ hɨːj
     1s eat rice nsit
     ‘I’ve eaten.’

Other EAA languages have a clause final nsit marker from various lexical sources, such as Mon jaʔ (ultimately from a prefixed form of das ‘to be’).

24.5.1.2 Modality and related functions

Like aspect, modality is mostly expressed in EAA languages with secondary verbs; for example, a main verb may combine with a grammaticalized verb ‘get/have’ to express possibility/ability. This pattern is widespread throughout Southeast Asia and beyond (Enfield 2003). In examples (88)–(89), the grammaticalized ‘get’ is either preverbal or postverbal.

Laven (Jacq 2001: 307)
(88) caː ʔʌːp krʌːŋ bic klɔː kraʔ
     eat cooked.rice polished.rice get husband old
     ‘(If you) eat polished rice you will get to be an old man.’

Kui (Bos and Sidwell 2014: 876)
(89) təŋajː niː kəː mɜʔ bəːn ciə kraʔ
     day prox 1s neg get go emph
     ‘Today I cannot go.’

In the Mon examples (90)–(91), kɤ̀ʔ ‘get’ occurs in two different positions, with different readings (in addition to the preverbal kɤ̀ʔ with functions similar to the ones described above for Khmer baːn).

Mon (Jenny 2014: 585)
(90) ɗɛh rɔ̀p kɤ̀ʔ kaʔ
     3 catch get fish
     ‘He caught a fish.’ (‘He succeeded in catching a fish.’)

Mon (Jenny 2014: 585)
(91) ɗɛh rɔ̀p kaʔ kɤ̀ʔ
     3 catch fish get
     ‘He can catch fish.’ (‘It is possible/allowed for him to catch fish.’)

Combinations of preverbal and clause final modals are used to produce various nuanced meanings. In the following Samre example we see preverbal naːB ‘likely’ (from lexical ‘to dare’) and clause final ʔiːnA ‘able’ (from lexical ‘to get, to have’) to code ‘will be likely’:



Samre (Ploykaew 2001: 129)
(92) maluəŋB tenB naːB nɔːŋB penA kʰruːB ʔiːnA
     man prox dare will be teacher have
     ‘That man is likely to have the ability to be a teacher’

24.5.1.3 Directionals

Secondary verbs with lexical meanings ‘come’, ‘go’, ‘arrive’ etc. frequently function in various EAA languages as directionals with spatial and temporal meaning such as ‘toward’, ‘from’ and so forth. This appears to be a regional grammaticalization tendency arising from multi-verb predication (Bisang 1992).

Samre (Ploykaew 2001: 81)
(93) kʰuːɲA suːnB kʰiːnB ciːwA roːŋriənC
     father bring child go school
     ‘Father took (his) child to school.’

Khmer (Bisang 1992: 442)
(94) kɔːət prɐp tɤːu kmeːŋ ʔɐoj jɔ̀ːk tɤːu cùːn cɔmpùəh nɛːək-srɤj bɔndoːl.
     3sg tell go boy give take go give to/toward person-woman pn
     ‘He told the boy to take [the parcel] and deliver it to Miss Bondol.’

In the Mon example (95), ʔa ‘go’ indicates the direction of gaze by the (elided) actors who are stationary once they have reached their view point:

Mon (Jenny 2005: 105)
(95) krìp tɒn rɔ̀ŋ ʔa pʰɛ̀ə kɔʔ-kjac tɤʔ ɲàt həʔɒt
     run up look go temple Kawkyaik dist see all
     ‘(We) ran up and looked down towards the temple at Kaw Kyaik over there, (we) could see everything…’

Other secondary verbs can be used to express notions such as non-volitionality, non-intentionality, dissatisfaction with impending or past events, and so forth. In the Mon example (96), the verb tɛ̀h ‘hit, touch, come into contact with’ is used to indicate ‘by coincidence’.

Mon (Jenny 2014: 570)
(96) kɤ̀ʔ tɛ̀h ɓɛ̀ʔ kon ŋèə həkaoʔ klàj kɔ̀h.
     get hit ref offspring frog body seek medl
     ‘He got the little frog he was looking for (by coincidence).’


24.5.2 Causative and passive constructions

In addition to the morphological causatives, many EAA languages form causatives periphrastically. Frequently this is done with an auxiliary verb with the lexical meaning ‘give’ (or ‘take’; the semantics of these can frequently align); the structure commonly found is ‘CAUSER GIVE CAUSEE V’, with the causee as subject of its own predicate. This is even attested in Old Khmer, so it is apparently an old feature among EAA languages.

Old Khmer (Jenner and Sidwell 2010: 57)
(97) oy śapata ‘anak ta sruk pvan
     give swear person link village four
     ‘to administer the oath to men of four villages’ (‘make them swear to the men of the four villages’)

Laven (Jacq 2001: 220)
(98) ʔmɛː kriaŋ cɔk ceːm pros ʔam par
     person wiseman take bird release give to.fly
     ‘The wiseman releases the bird so it can fly.’

Samre (Ploykaew 2001: 99)
(99) ʔajC cɛʔA tɔːB ʔamC ʔiɲA cɯtA
     Mr. pn make take 1s angry
     ‘Mr. Cae makes me angry.’

Few EAA languages actually have special passive constructions, and it is often not formally possible to distinguish a passive from topicalization, although unmarked verb forms often receive a passive reading contextually, as in Pacoh (100).

Pacoh (Alves 2006: 92)
(100) siŋ ʔn.nɛh taʔ cuət ʔabil
      trap prox make catch mouse
      ‘This spring trap is made to catch mice.’

Somewhat unusually, Kơho has a prefix gə- marking verbs as passive.

Kơho (Olsen 2014: 756)
(101) mpoːŋ gə-paːʔ mə caːl
      door pass-open ins wind
      ‘The door was opened by the wind.’

Old Khmer and Old Mon have particles marking the passive. In Old Khmer, the marker ti is of unknown origin (although it resembles Malay di ‘patient focus’); the agent in Old Khmer is not demoted to an oblique role but appears in its normal preverbal position.




Old Khmer (Jenner and Sidwell 2010: 15)
(102) sruk sre ta ti mratāñ oy ta vraḥ
      village field link pass lord give link holy
      ‘village [and] ricefields which were given by the lord to the divinity’

Old Mon ñin indicates a passive reading; it is historically from a verb ‘receive, accept’ (Shorto 1971: 133). Overt agents are lacking among the Old Mon passives.

Old Mon (Shorto 1971: 132)
(103) dinṅal thar ma ñin cincon na rat
      mirror gold rel pass decorate instr gem
      ‘a golden mirror set with gems’

Samre has a specific passive marker tuənB (evidently a borrowing of Khmer doːn ‘hit’, with regular phonological change) used when the suffered action is unavoidable.

Samre (Ploykaew 2001: 103)
(104) kʰaniːwC tuənB wəjB
      child pass hit
      ‘A child was hit’

Similarly, in Mon and Nyah Kur the lexical verb ‘to hit’ is used to mark passive voice (particularly an adversative passive):

Mon (Jenny 2005: 106)
(105) tɛ̀h nìʔmòn ɗɔə mèsəlì tɤʔ
      hit invite loc Mesali dist
      ‘(We) were invited to Mesali (although we are not happy about it).’

Nyah Kur (Memanas 1979: 63)
(106) ɲin tʰah cʰur kɯt
      3s pass dog bite
      ‘He was bitten by a dog.’

24.5.3 Negation

Negation in EAA languages shows several recurring patterns; there are negative auxiliaries that precede the verb and negate the predicate, and the same form may be used to negate the whole proposition (as is the pattern in Thai and Lao). However, some languages require the use of a different form to negate the proposition or to negate a non-verbal predicate (e.g. a negative copula). Additionally, there are various negative particles and adverbials that are frequent in conversation (such as prohibitives). Pacoh provides a neat contrast between negation of verbal and non-verbal predicates:


Pacoh (Alves 2006: 41)
(107) kɨː ləjʔ cɔːm taʔ tumiəŋ
      1S neg know make crossbow
      ‘I don’t know how to make a crossbow’

Pacoh (Alves 2006: 41)
(108) kɨː ʔih tikuəj jṵən
      1S neg person Vietnamese
      ‘I’m not Vietnamese’

The same ʔih negator is used as the regular pre-verbal negator in various Bahnaric and Pearic languages as well, so it may be quite old. We see the use of ʔih in Laven, while non-verbal predicates and whole propositions are negated with ʔʌːn.

Laven (Jacq 2001: 279)
(109) hʌːj pnam ʔaj ʔih kəːt ʔʌːp caː
      LOC village 1S neg have rice eat
      ‘In my village there isn’t any rice to eat’

Laven (Jacq 2001: 287)
(110) ʔʌːn han luːj lɔh
      neg 3S angry pq
      ‘He wasn’t angry?’

Laven (Jacq 2001: 286)
(111) ʔʌːn, … ʔaj pʰɛʔ
      neg 1S sated
      ‘No, … I’m full’ (in response to question ‘would you like to eat?’)

In Chong, Premsrirat and Rojanakul (2014) report that ʔih ‘not’ can occur both pre- and post-verbally. Note that, patterning after the syntax of Thai mâj ‘not’, it also functions as an interrogative in Chong.

Chong (Premsrirat and Rojanakul 2014: 635)
(112) pʰəj ʔih hoːc ʔih
      3 NEG die neg
      ‘It does not die.’

Chong (Premsrirat and Rojanakul 2014: 618)
(113) cʰan ceːw wə̤t mɛʔ tɛ̀ː mɛʔ kɨj ʔih
      1sg go meet grandmother but grandmother stay neg
      ‘I went to meet my mother but she was not there.’

Nonverbal predicates in Mon are negated by the postposed form hùʔ siəŋ ‘it is not (the case that)’, with hùʔ, the general negator in Mon, and siəŋ ‘be so’ (which only occurs in negative and interrogative contexts).



Mon (Jenny 2014: 592)
(114) həkaoʔ nɔʔ kɔ̀h ʔəca kəsao lɛ hùʔ siəŋ
      body prox medl teacher nml.write add neg be.so
      ‘She (knew that she) was no writer.’

It is also common for EAA languages to employ emphatic negative particles, on their own or in addition to verbal negators in speech for emphasis.

Mon (Jenny 2005: 54)
(115) ŋuə nɔʔ ʔuə hùʔ tɒn phɛ̀ə pùh
      day this 1s neg move.up school neg
      ‘I am not going to school today.’

Prohibitives are widely used, typically in the initial position. They often have a different form to the regular pre-verbal negator, as in the Khmer example (116).

Khmer (Huffman 1970: 174)
(116) kom dak mteːh craən peːk nɤh
      proh put chili much too.much excl
      ‘Don’t put too much chili on it!’

It is also notable that prohibitives often appear to be borrowings from more dominant languages; in (117) we see another Chong use of the Thai mâj ‘not’ (which is not a prohibitive in Thai).

Chong (Premsrirat and Rojanakul 2014: 614)
(117) ma̰ːj cʰaː kʰɔ̄ːŋ puk lɔː
      proh eat thing rotten disc
      ‘Do not eat the rotten food!’

24.5.4 Demonstratives

EAA demonstratives vary from very simple to elaborate multi-dimensional inventories. The simplest, such as Khmer, formally distinguish only two degrees of proximity. More commonly there are forms for three degrees of proximity, and examples are given in Table 26 from Pearic, Monic, Bahnaric, and Katuic. Within Bahnaric and Katuic, additionally, more elaborate systems are utilized: Laven distinguishes five degrees of proximity (including one of visibility) and although Pacoh only codes four degrees of proximity, the more distal grades include specification for ‘higher’, ‘lower’ and ‘beside’ (same level) as ego. Within the noun phrase the demonstrative always follows the noun.


Tab. 26: EAA demonstratives. Forms are listed for each language from most proximal to most distal; the grades distinguished are prox, medl, medl.dist, dist, and nvis.

Khmer     nɪ̀h, nùh
Samre     ʔanA, tenB, tihB
Mon       nɔʔ, kɔ̀h, tɤʔ
Kui       niː, to̤ʔ, tɨh
Bahnar    ʔnuː, ʔnɔh, ʔniː
Sedang    ko̰, mɛ, ta̰
Laven     ʔɲeʔ, neʔ, nɛ, ʔɛʔ, ʔɛː
Pacoh     ʔn.nɛh, ʔŋ.koh; higher: ʔn.tih, ʔn.tih; lower: ʔn.to̰h, ʔn.to̰h; beside: ʔn.trah, ʔn.trah

24.5.5 Pronominal systems

24.5.5.1 Personal pronouns

The personal pronoun inventories in EAA languages are mostly purely grammatical pronouns (not repurposed kin terms), and often include dual forms and a limited inclusive/exclusive distinction. At an extreme is Khmer; etymological pronouns are mostly lost, and speakers employ an elaborate inventory that distinguishes levels of respect and includes specific terms for addressing or referring to clergy. The existence of respect levels is due to the special role of Buddhist and Hindu institutions in Khmer society and royalty, going back to Angkorian times. More commonly, if status distinctions are coded in pronouns, the pattern is like that of Bahnar, which has a formal/informal distinction in 2nd and 3rd person forms; otherwise forms are purely grammatical (Table 27).

Tab. 27: Bahnar pronouns (Banker 1965: 9).

              sg     du          pl
1 excl        ʔiɲ    ɲiː         ɲoːn
1 incl               baː         bən
2 informal    ʔɛː    ʔmih        jɛm
2 formal      ʔih    ʔmih        jɛm
3 informal    hap    brɛː hap    luʔ hap
3 formal      sɨː    brɛː sɨː    luʔ sɨː

Closely related to Bahnar, Sedang personal pronouns lack any coding of status differences; this is a common pattern among highland communities (Table 28).




Tab. 28: Sedang pronouns (Smith and Sidwell 2014: 813).

          sg     du      pl
1 excl    ʔa̰     ma̰      ŋin
1 incl           pa̰      pin
2         ʔɛh    pɔ̰      pɔ̰
3         ga̰     prḛj    waj

We also find pronoun inventories in Katuic that are structurally similar to Sedang (and other North Bahnaric languages). See Table 29.

Tab. 29: Katu pronouns (Wallace 1966: 56).

          sg      du      pl
1 excl    kuː     jɨa     jiː
1 incl            ɲaːŋ    hɛː
2         maːj    ɲɨa     pɛː
3         ɗɔʔ     ɲiː     piː

Unusually for EAA, Mon has a reduced personal pronoun inventory: only the first person distinguishes between singular and plural, while in the 2nd and 3rd persons a secondary distinction of politeness is found. Plurality in 2nd and 3rd persons is expressed by the postposed plural marker tɔʔ. Especially in the 2nd person, kinship and social terms are generally used instead of the pronouns. See Table 30.

Tab. 30: Mon pronouns (Jenny 2014: 579).

1sg    ʔuə    (ʔuə ɗoc when speaking with monks; lit. ‘I, servant’)
1pl    poj
2      pèh    (informal ɓɛ̀ʔ;* formal mənɛ̀h)
3      ɗɛh    (honorific ɲèh, literally ‘person’)

* The intimate form ɓɛ̀ʔ is considered rude by most educated speakers.

24.5.5.2 Reflexives and logophorics

A number of EAA languages have special reflexive pronouns that are used to refer to the subject of the same clause. They appear in different functions, including object and possessive, as seen in example (118) from Mon, where the noun həkaoʔ ‘body’ takes the function of a reflexive pronoun, and similarly in Stieng (119).


Mon (Jenny 2014: 571)
(118) ɲèh thiəŋ.həjaʔ kɛ̀h rɔ̀ə həkaoʔ khjɒt ʔa jaʔ
      3h think say friend body die go nsit
      ‘He thought that his friend was dead.’

Stieng (Miller 1976: 56)
(119) ʔaː paːt cʰac nəːm ʔoː pɛh
      suffer cut body POSS with knife
      ‘I accidentally cut myself with the knife.’

In Nyaheun (and some other West Bahnaric lects), speakers employ logophoric hɨː to index the subject after its first overt mention, until there is a change of subject or topic:

Nyaheun (Sidwell: fieldnotes)
(120) jɛː hɨː hʔɔː ʔmɛː ɓʌːn hɨː
      accidentally logoph glimpse 3PL friend logoph
      ‘(he) just caught a glimpse of his friend’

24.5.6 Counting

24.5.6.1 Numerals

All EAA languages have decimal counting systems with numeral forms for numbers one to ten, and higher numbers formed by combination, in much the same manner as Thai and other national languages. Terms for 100, 1,000 etc. are frequently loan words, and it is common to abandon local counting words in favor of national language terms. Khmer is odd in having combinatorial terms for six through nine (these forms have been borrowed into South Bahnaric) while other EAA languages have discrete unanalyzable forms up to ten:

Khmer numerals 1–10
muəj ‘one’       prammuəj ‘six’
piː ‘two’        prampiː ‘seven’
baj ‘three’      prambaj ‘eight’
buən ‘four’      prambuən ‘nine’
pram ‘five’      dɑp ‘ten’

Among Katuic and Bahnaric languages there is a special system of derived forms for counting days, months, years, etc., into the past and future. Prefixes replace the onsets of cardinal numerals and the unit of time is preposed; see discussion in Thomas (1976), Jacq (2001), and Alves (2006). In the Laven system of “Temporal Numerals” described by Jacq (2001: 272–273), the onsets of the numerals are replaced with br- for past and dr- (or gr-; some informants were ambivalent) for future. Additionally, there have been shifts in the numerical values of the forms so that the base form does not indicate the derived value. A partial table of possible Laven forms is given as Table 31.

Tab. 31: Laven temporal numerals.

Base             Yesterdays                     Future years                     Days hence
pɛː ‘three’      tŋaj drɛː ‘three days ago’     kmɔː brɛː ‘three years hence’    tŋaj brɛː ‘in three days time’
puan ‘four’      tŋaj druan ‘four days ago’     kmɔː bruan ‘four years hence’    tŋaj bruan ‘in four days time’
sʌːŋ ‘five’      tŋaj drʌːŋ ‘two days ago’      kmɔː brʌːŋ ‘two years hence’     tŋaj brʌːŋ ‘in two days time’
cit ‘ten’        tŋaj drit ‘five days ago’      kmɔː drit ‘five years hence’     tŋaj brit ‘in five days time’
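The two numeral patterns described in this subsection can be sketched computationally. The following Python sketch is illustrative only: the function names, the vowel set used to locate the onset, and the choice to pass the prefix in as a parameter are assumptions made here, not part of the cited descriptions. It builds the Khmer numerals for six through nine as pram ‘five’ plus a lower numeral, and it swaps the onset of a Laven cardinal for a temporal prefix. It does not model the shifts in numerical value noted for Laven, nor does it decide which prefix signals past or future.

```python
# Khmer numerals one to five and ten, as listed in this subsection.
KHMER_BASE = {1: "muəj", 2: "piː", 3: "baj", 4: "buən", 5: "pram"}
KHMER_TEN = "dɑp"


def khmer_numeral(n: int) -> str:
    """Return the Khmer numeral for 1-10; six to nine are 'five' + 1-4."""
    if not 1 <= n <= 10:
        raise ValueError("only 1-10 are covered in this sketch")
    if n == 10:
        return KHMER_TEN
    if n <= 5:
        return KHMER_BASE[n]
    return KHMER_BASE[5] + KHMER_BASE[n - 5]


# Assumed vowel symbols; sufficient for the Laven bases cited in Table 31.
LAVEN_VOWELS = set("aeiouɛʌɔəɨ")


def replace_onset(base: str, prefix: str) -> str:
    """Swap the initial consonant(s) of a Laven numeral for a prefix."""
    for i, ch in enumerate(base):
        if ch in LAVEN_VOWELS:
            return prefix + base[i:]
    raise ValueError(f"no vowel found in {base!r}")


if __name__ == "__main__":
    print([khmer_numeral(n) for n in range(6, 10)])
    for base in ("pɛː", "puan", "sʌːŋ", "cit"):
        print(base, replace_onset(base, "dr"), replace_onset(base, "br"))
```

Applied to the four Laven bases in Table 31, the onset replacement yields the dr- and br- forms shown there (for example drɛː and bruan); the time unit (tŋaj, kmɔː) would simply be placed before the result.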

24.5.6.2 Numeral classifiers

It is argued that the use of numeral classifiers in Austroasiatic languages is not native to the family but borrowed from neighboring languages (Jones 1970; Adams 1989, 1991). Classifiers are often used only sparingly or preferentially in formal contexts. Bisang states in relation to Khmer, “[t]he status of the classifier is relatively weak, i.e., it is not obligatory except for humans” (Bisang 2014: 702). In terms of word order, there is a broad east-west division between noun-final and noun-initial in the classifier phrase, although in the western area there is more variation:
– Bahnaric and Katuic favor num-clf-n, arguably reinforced by the same order in Vietnamese;
– Khmer, Pearic, and Nyah Kur (Monic) favor n-num-clf, following the dominant pattern in Thai and Lao;
– Old Khmer shows only very few examples of numeral classifiers, but those examples we have follow the order n-clf-num;
– Mon uses classifiers only very sparingly; the noun həkaoʔ ‘body’ is semi-obligatory in counting monks, while with other nouns no classifier usually occurs, and the word order differs for common nouns (n-num) and measures (num-n). The generic classifier mɛ̀ʔ ‘seed’ sometimes occurs with a wide range of nouns, such as cars, houses, phones, fruit, etc., in the pattern n-num-clf.

Some examples of classifiers in use:


Old Khmer (Jenner and Sidwell 2010: 29)
(121) thvāy jā rājadharmma sre ‘anle prāṃ braḥ
      serve divine pn field clf five offer
      ‘(he) offered up five ricefields’

Chong (Premsrirat and Rojanakul 2014: 628)
(122) me̤ːˀw mo̤ːˀj tūə pʰa̰ːˀj.seː baːt
      fish one clf twenty baht
      ‘A fish costs twenty baht.’

Laven (Jacq 2001: 245)
(123) ʔaːj cɔk kuan cet.sʌːŋ raː
      1S take child 15 clf
      ‘I have fifteen children’

Nyah Kur (Memanas 1979: 206)
(124) priəŋ pan tuh
      buffalo four clf
      ‘four buffalo’

Bahnar (Banker 1965: 10)
(125) miɲ poːm kiɛk
      one clf tiger
      ‘one tiger’

Katu (Costello 1969: 26)
(126) bəːr panɔŋ ɁanuɁ
      two clf dog
      ‘two dogs’
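The ordering patterns for classifier phrases can be made concrete with a small lookup. In this Python sketch the dictionary `CLF_ORDER` and the function `classifier_phrase` are hypothetical names introduced for illustration, and each language is assigned the single order seen in its example. That is a simplification: the text describes tendencies rather than fixed rules, and Laven (123), for instance, is noun-initial although Bahnaric languages generally favor num-clf-n.

```python
# Order of noun (n), numeral (num) and classifier (clf), following the
# numbered examples in this subsection. One order per language is an
# assumption of this sketch; real usage varies.
CLF_ORDER = {
    "Bahnar": ("num", "clf", "n"),     # example (125)
    "Katu": ("num", "clf", "n"),       # example (126)
    "Chong": ("n", "num", "clf"),      # example (122)
    "Nyah Kur": ("n", "num", "clf"),   # example (124)
    "Old Khmer": ("n", "clf", "num"),  # example (121)
}


def classifier_phrase(language: str, n: str, num: str, clf: str) -> str:
    """Linearize a classifier phrase in the order recorded for a language."""
    slots = {"n": n, "num": num, "clf": clf}
    return " ".join(slots[slot] for slot in CLF_ORDER[language])


if __name__ == "__main__":
    print(classifier_phrase("Bahnar", n="kiɛk", num="miɲ", clf="poːm"))
    print(classifier_phrase("Nyah Kur", n="priəŋ", num="pan", clf="tuh"))
```

Keeping the order as data rather than as separate functions makes it easy to add a language or to record more than one attested order later.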

24.5.7 Case and adpositions

Generally, EAA languages do not code case morphologically (with the marginal exception of case marked pronouns in Pacoh). S/A and P relations (with a wide range of functions) are typically marked only by word order and pragmatically (especially when recoverable arguments are elided), although occasionally with some verbs a direct object may take an oblique or locative preposition (as in Mon; Jenny 2005: 92). Oblique objects (recipients, locations, etc.) frequently take prepositions and, more rarely, postpositions. In relation to locative and directional meanings, a typical EAA pattern is well represented by Bahnar: there is a general locative and a set of specific locatives (‘within’, ‘above’, etc.), plus allative and ablative prepositions, and prepositions can combine. Often there are transparent lexical origins for prepositions, and prepositions are also often borrowed. See Table 32.




Tab. 32: Bahnar prepositions (compiled from Banker 1965).

təː ‘at/in/to’
kəpaːl ‘above’ < Khmer kbaːl ‘head’
ɗəŋ ‘from’
tik ‘until’ < ‘stop before destination’
rəŋ ‘behind’
ʔalaː ‘under’ < Cham ʔala ‘under’
ləm~lam ‘in/within’ < Cham (da)lam ‘in/inside’
tuːr ‘in/within’

Bahnar (Banker 1965: 20)
(127) ʔɛː ŋɔːj naŋ təː kəpaːl wɛc ɟriː tiː
      2S look.up look loc above top.of.tree above dem.dist.above
      ‘Look up in the top of that banyan there.’

Bahnar (Banker 1965: 20)
(128) ʔiɲ jaːk ɗəŋ rəŋ ʔiɲ
      1s walk from behind 2s
      ‘I will follow you’

According to Jenny (2005), the most common object marking prepositions in Spoken Mon are kɒ ‘to, for, with, by’, the ablative nù ‘from’ and the locative ɗɔə ‘in, at’, with kɒ having “gained something like universal status as oblique marker” (Jenny 2005: 89–90).

Mon (Jenny 2005: 93)
(129) ɗɛh ràn kwaɲ kɒ kon
      3 buy sweets obl child
      ‘He bought sweets for (his) child’

Mon (Jenny 2005: 94)
(130) ɗɛh nùm ɗɔə hɒəʔ
      3 exist loc house
      ‘He is in the house’

Mon (Jenny 2005: 95)
(131) kɔ̀ŋ klɤŋ nù ʔuə
      dare come abl 1s
      ‘(You) dare to come (to Thailand) more than I (would have dared)’

In Nyaheun the morpheme ɗiː is a locative as a preposition, while post-positionally it is a benefactive.

Nyaheun (Sidwell: fieldnotes)
(132) ʔɐm gɔːŋ lɔʔ.tɨaŋ ɗiː ɗuo
      give gong orphan ben exclam
      ‘(She) gave the gong to the orphan!’


24.5.8 Clause and sentence particles

Sentence particles fulfill a wide range of functions, often conveying speakers’ attitude towards the event, expressing illocutionary force, or structuring the information flow. Yet many times sentence final particles are hard to classify or describe, and more extensively annotated texts are needed to deal with this issue. Some grammars do attempt to compile exhaustive lists; Memanas (1979) lists some 46 sentence final particles for Nyah Kur. The examples (133)–(137) illustrate some uses of sentence final particles across a range of EAA languages.

Old Mon (Jenny 2005: 140)
(133) smiṅ dewatau kuṁ rmiṅ da!
      king god 2s hear foc
      ‘Hear, king of gods!’

Nyaheun (Sidwell: fieldnotes)
(134) klɐːm ʔɨ.ʔɛː ŋuoɲ giet ɗuː.ɗaː!
      angry 3s want kill emph
      ‘The angry guy wanted to kill (the monster).’

Nyah Kur (Memanas 1979: 229)
(135) caːʔ ʔəːj laː
      eat ptcl ptcl
      ‘Eat (if you want to)’

Khmer (Huffman 1970: 174)
(136) kom dak mteːh craən peːk nɤh.
      proh put chili much too.much excl
      ‘Don’t put too much chili on it.’

Samre (Ploykaew 2001: 174)
(137) nakB sɔːŋC huəpA kɔʔB huəpA ciːwA duːA
      3s need eat then eat go ptcl
      ‘He needs to eat so let him eat (it).’

24.6 Conclusion

In this survey of EAA languages we have touched upon features that fall into three broad categories: inherited characteristics, local innovations, and convergences with MSEA areal tendencies. The distinctions between these categories are not clear cut, and are often interdependent; and a purely typological account cannot untangle the




web of interactions and changes that yielded the present commonalities and diversities among EAA languages. In phonology we note the prevalence of sesquisyllabic word structures and large vowel inventories – often extended with diphthongs and register distinctions, while contour tones are infrequent and transparently recent. The emergence of complex vocalism is well understood, and we recognize a common prototype that is largely preserved intact in conservative languages such as Bahnar and Katu. The supposed areal tendency to monosyllables is quite well attested in particular languages, such as Nyaheun or Mon, yet sesquisyllables are robustly maintained across EAA. The languages are shown to strongly prefer verb-medial word order in isolated contexts, while permitting variation in ordering for pragmatic and stylistic reasons, and omission of retrievable arguments is permitted. Multiple clauses form complex sentences, embedded, conjoined with linkers, or asyndetically, with indications that overt use of linkers in clause chaining may be historically late in EAA. In clauses with multiple arguments there are indications of some historical depth to the use of one or more linkers with a wide range of functions such as “directive, goal oriented, A relates to B” or similar oblique relations. Otherwise, relations are most usually expressed by grammaticalized lexemes. In terms of morphology, while a small number of historical derivational affixes persist, a broad shift towards analytical derivation, influenced particularly by Thai and Lao, is evident. Thus, while EAA strongly follows areal preferences in word order, there is variation at the phrase level, and strong indications of historical convergence in word order typology. The personal pronouns tend to show a three-way distinction in number (singular, dual, and plural), at least in the first and second person, and an inclusive versus exclusive distinction in the non-singular first person pronouns. 
The EAA personal pronouns appear to show archaic systems among the smaller languages, and a range of independent developments among the more important languages, and it is not clear how to characterize these as areal tendencies. Similar observations apply to the demonstratives, which most frequently show three degrees of distance but may extend the range of distinctions in some languages. Numeral classifiers are also a commonly cited aspect of the language area, yet they appear to play a largely marginal role in EAA languages; usage varies considerably and in many cases they are clearly borrowed, or are restricted to a small set, and are frequently optional even when available. Additionally, there is some variation in the internal ordering of classifier phrases. Thus, in terms of features often cited as characteristic of the SEA linguistic area (see “Introduction”, this volume, for a fuller listing), their reflection in EAA is rather mixed, and given that much of the linguistic diversity within MSEA is accounted for by AA languages, this poses some challenges to the idea of what makes a coherent linguistic area. Certainly the data examined here suggest that the sharing of areal features may not be historically old, but reflect a fairly recent emergent situation.


References

Alves, Mark. 2006. A grammar of Pacoh: A Mon-Khmer language of the central highlands of Vietnam. Canberra: Pacific Linguistics.
Alves, Mark. 2014. Pacoh. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 881–906. Leiden & Boston: Brill.
Anderson, Gregory D. S. 2004. Advances in Proto-Munda reconstruction. Mon-Khmer Studies Journal 34. 159–184.
Banker, Elizabeth M. 1964. Bahnar affixation. Mon-Khmer Studies 1. 99–117.
Banker, Elizabeth M., Sip & Mơ. 1973. Bahnar language lessons (Pleiku Province) (Trilingual Language Lessons 20). Manila: Summer Institute of Linguistics.
Banker, John E. 1965. Bahnar word classes. Hartford, CT: The Hartford Seminary Foundation MA thesis.
Bequette, Rebecca Lee Elaine. 2008. Participant reference, deixis, and anaphora in Bunong narrative discourse. Duncanville, TX: Graduate Institute of Applied Linguistics MA thesis.
Bisang, Walter. 1992. Das Verb im Chinesischen, Hmong, Vietnamesischen, Thai und Khmer: vergleichende Grammatik im Rahmen der Verbserialisierung, der Grammatikalisierung und der Attraktorpositionen. Tübingen: Gunter Narr.
Bisang, Walter. 2014. Modern Khmer. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 677–716. Leiden & Boston: Brill.
Bos, Kees Jan & Paul Sidwell. 2014. Kui Ntua. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 837–880. Leiden & Boston: Brill.
Butler, Becky. 2014. Bunong. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 719–745. Leiden & Boston: Brill.
Chantrupanth, Dhanan & Chatchai Phromjakgarin. 1978. Khmer (Surin)-Thai-English dictionary. Bangkok: Chulalongkorn University Language Institute.
Costello, Nancy A. 1966. Affixes in Katu. Mon-Khmer Studies 1. 63–86.
Costello, Nancy A. 1969. The Katu noun phrase. Mon-Khmer Studies 3. 21–35.
Costello, Nancy A. 1971. Ngữ-vựng Katu: Katu vocabulary (Vietnam Montagnard Language Series 5). Saigon: Department of Education.
Costello, Nancy A. & Khamluan Sulavan. 1993. Katu folktales and society, Katu-Lao-English. Vientiane: Ministry of Information and Culture.
Costello, Nancy. 1998. Affixes in Katu of the Lao PDR. Mon-Khmer Studies 28. 31–42.
Diffloth, Gérard. 1989. Proto-Austroasiatic creaky voice. Mon-Khmer Studies 15. 139–154.
Dinh Le Thu. 2007. Reduplication in the M’nong language. In Mark Alves, Paul Sidwell & David Gil (eds.), SEALS VIII Papers from the 8th Annual Meeting of the Southeast Asian Linguistics Society 1998, 57–65. Canberra: Pacific Linguistics.
Ferlus, Michel. 1974. La langue Ong, mutations consonantiques et transphonologisations. Asie du Sud-Est et Monde Insulindien 5(1). 113–121.
Gradin, Dwight. 1976. Word affixation in Jeh. Mon-Khmer Studies 5. 25–42.
Haiman, John. 2011. Cambodian Khmer. Amsterdam & Philadelphia: John Benjamins.
Huffman, Franklin E. 1967. An outline of Cambodian grammar. New York: Cornell University PhD dissertation.
Huffman, Franklin E. 1970. Modern Spoken Cambodian. New Haven & London: Yale University Press.
Huffman, Franklin E. 1985a. Vowel permutations in Austroasiatic languages. Linguistics of the Sino-Tibetan Area: The state of the art (Pacific Linguistics Series C, 87), 141–145. Canberra: Australian National University.



Eastern Austroasiatic languages 

 597

Huffman, Franklin E. 1985b. The phonology of Chong, a Mon-Khmer language of Thailand. In Surya Ratanakul, David Thomas & Suwilai Premsrirat (eds.), Southeast Asian linguistic studies presented to André-G. Haudricourt, 355–388. Bangkok: Mahidol University. Indrawooth, Phasook. 2011. Dvāravati and Śri-Ksetra: A cultural relation. In Patrick McCormick, Mathias Jenny & Chris Baker (eds.), The Mon over two millennia: Monuments, manuscripts, movements, 61–92. Bangkok: Institute of Asian Studies, Chulalongkorn University. Jacob, Judith M. 1968. Introduction to Cambodian. London: Oxford University Press. Jacq, Pascale. 2001. A description of Jruq (Loven): A Mon-Khmer language of the Lao PDR. Canberra: Australian National University MA thesis. Jenner, Philip & Paul Sidwell. 2010. Old Khmer grammar. Canberra: Pacific Linguistics. Jenny, Mathias. 2005. The verb system of Mon. Zurich: Universität Zürich. Jenny, Mathias. 2014. Modern Mon. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 553–600. Leiden & Boston: Brill. Jenny, Mathias. 2015. Syntactic diversity and change in Austroasiatic languages. In Carlotta Viti (ed.), Perspectives on historical syntax, 317–340. Amsterdam: John Benjamins. Luang-Thongkum, Theraphan. 1979. The distribution of the sounds of Bruu. Mon-Khmer Studies 8. 221–293. Luang-Thongkum, Theraphan. 1991. An instrumental study of Chong register. In Jeremy H. C. S. Davidson (ed.), Austroasiatic languages: Essays in honour of H. L. Shorto, 141–160. London: School of Oriental and African Studies, University of London. Luang-Thongkum, Theraphan. 2001. Languages of the tribes in Xekong Province Southern Laos. Bangkok: Chulalongkorn University Press. Memanas, Payau. 1979. A description of Chaobon (ɲahkur): An Austroasiatic language in Thailand. Thailand: Mahidol University MA thesis. Miller, Vera Grace. 1976. An overview of Stiêng grammar. Grand Forks, ND: University of North Dakota MA thesis. Olsen, Neil Hayes. 2014. Kơho-Sre. 
In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 746–788. Leiden & Boston: Brill. Phillips, Richard L. 1973. A Mnong pedagogical grammar: The verb phrase and constructions with two or more verbs. Mon-Khmer Studies 4. 129–138. Pittayaporn, Pittayawat. 2015. Typologizing sesquisyllabicity: The role of structural analysis in the study of linguistic diversity in Mainland Southeast Asia. In N. J. Enfield & Bernard Comrie (eds.), Mainland Southeast Asian languages, 500–528. Berlin & New York: Mouton de Gruyter. Ploykaew, Pornsawan. 2001. The phonology of Samre. Mon-Khmer Studies 31. 15–27. Premsrirat, Suwilai & Nattamon Rojanakul. 2014. Chong. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 603–640. Leiden & Boston: Brill. Shorto, Harry L. 1971. A dictionary of the Mon inscriptions from the sixth to the sixteenth centuries. London: Oxford University Press. Sidwell, Paul. 2014. Old Khmer. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 643–676. Leiden & Boston: Brill. Smith, Kenneth & Paul Sidwell. 2014. Sedang. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 789–833. Leiden & Boston: Brill. Smith, Kenneth D. 1979. Sedang grammar. Canberra: Pacific Linguistics. Solntseva, V. Nina. 1996. Case-marked pronouns in the Taoih language. Mon-Khmer Studies 26. 33–36. Thomas, David. 1992. On sesquisyllabic structure. Mon-Khmer Studies 21. 206–210. Thongkham, Noppawan. 2003. The phonology of Kasong at Khlong Saeng Village, Danchumphon Sub-District, Bo Rai District, Trat Province. Thailand: Mahidol University MA thesis.

598 

 Paul Sidwell

Thurgood, Graham. 2002. Vietnamese and tonogenesis: Revising the model and the analysis. Diachronica 19(2). 333–363. Ungsitipoonporn, Siripen. 2001. A phonological comparison between Khlong Phlu Chong and Wangkraphrae Chong. Thailand: Institute of Language and Culture for Rural Development, Mahidol University MA thesis. Wallace, J. M. 1966. Katu personal pronouns. Mon-Khmer Studies 2. 55–63. Watson, Richard L. 1966. Reduplication Pacoh. Hartfort, CT: Hartford Seminary Foundation MA thesis. Wayland, Ratree & Allard Jongman. 2002. Registronesis in Khmer: A phonetic account. Mon Khmer Studies 32. 101–115.

Mathias Jenny

25 The national languages of MSEA: Burmese, Thai, Lao, Khmer, Vietnamese

25.1 Introduction

The national languages of present-day Mainland Southeast Asia (MSEA) share a number of typological and sociolinguistic features that set them apart from other vernaculars of the region.1 The similarities in structure, both grammatical and lexical, of the national languages can be traced back to several factors, most of them in the domains of politics, religion, and commerce, rather than linguistics per se. In this chapter, we present an overview of the profiles of Burmese/Myanmar, Thai, Lao, Khmer, and Vietnamese, and give a short account of their development.

The national languages described in this chapter belong to three different language families, namely Tibeto-Burman (Burmese), Tai-Kadai (Thai, Lao), and Austroasiatic (Khmer, Vietnamese). Three of them, Burmese, Thai, and Vietnamese, constitute the largest speaker groups of their respective families, making them the most important representatives of Tibeto-Burman (TB), Tai-Kadai (TK), and Austroasiatic (AA), respectively, in terms of population. At the same time, their typological and lexical profiles make them very atypical members of their families. In fact, Thai shares many more features with Khmer than it does with most smaller Tai-Kadai languages. Vietnamese is typologically so different from its AA relatives that it was for a long time regarded as a Tai or Sinitic variety, before its affiliation with AA was shown by lexical correspondences (see Alves 2006). When describing SEA as a linguistic area, reference is most often made to the national languages, sometimes almost exclusively, neglecting the fact that they are in many respects very poor representatives of the area as well, although they clearly dominate the region in terms of political and cultural power.
On the socio-political side, all national languages share the fact that they are used as state languages of multilingual nations and therefore have large numbers of second-language speakers. They are heavily standardized and used in all domains of life, which makes them source languages of loanwords into local vernaculars, especially in the domains beyond village life and local culture. The national languages of SEA are good examples of “spread zone” languages (Nichols 1992), representing “punctuations” which result in broad “equilibrium” (Dixon 1997). All national languages of SEA are characterized by heavy external influence, which is seen in the large portions of Indic vocabulary in Burmese, Thai, Lao and Khmer, and Sinitic loanwords and structures in Vietnamese. Together with extensive internal exchange among the national languages over many centuries, this has led to a far-reaching convergence in many respects.

Typically, the national languages in SEA exist in diglossic environments, with a literary variety being used in formal contexts, while the means of communication in everyday life is a more or less distinct colloquial variety. This scenario was already apparent in Cambodia in the 13th century (Zhou 1993) and was firmly established in mid-19th century Siam. Pallegoix gives sample texts in different sociolects of Thai at the time (Pallegoix 1850). These different varieties of the national languages directly reflect the hierarchical structure of the societies, documented since the earliest times of the rising kingdoms (see e.g. Vickery 1998). The similarities among the national languages of SEA are more evident in the respective literary styles, and the colloquial languages are influenced by these in varying degrees (Diller 1993, 2006; Jenny and Hnin Tun 2016).

This chapter does not suggest that the national literary languages of MSEA are typologically radically different from the colloquial and non-literary vernaculars. In fact, they share many features that are distinctly Southeast Asian. But on top of the widespread SEAn areal characteristics, the national languages share a set of features that can be more or less directly attributed to their history and function as languages of state, commerce, and literature, as well as their need to be functional in all domains of language use, ranging from informal conversations to legal and academic texts.

1 This chapter is partly based on research financed by the Swiss National Science Foundation project nos. 100012_150136 “The Greater Burma Zone” and 100015_176264 “The Development of Verb-Initial Structures Cross-linguistically: Insights from Austroasiatic”.

https://doi.org/10.1515/9783110558142-025

25.2 Development

With the rise of kingdoms encompassing much larger territories than the traditional Southeast Asian city-states (often referred to in the western literature by the Thai/Lao term as mueang), political and commercial power became more centralized and expansive (Higham 2002, 2004; Stark 2004). This centralization went together with religious and cultural hegemony. While in the early stages the Indic languages Sanskrit and Pali were used in stone inscriptions and presumably state affairs (no written documents apart from inscriptions on stone, metal, and pottery have survived from the early Southeast Asian civilizations), these were later replaced by indigenous languages. The first local languages to appear in writing were Pyu, Mon, Khmer and Cham in the second half of the first millennium, with Burmese, Thai and Vietnamese following in the early second millennium. Pyu has long since disappeared, while Mon and Cham have lost their status as languages of big political entities, although not without leaving traces in the newly dominant languages (e.g. Jenny 2013). From their earliest appearance on the historical stage, Burmese, Thai and Khmer show profound Indic influence in their lexicon. This was already the case in Pyu, Mon, and Cham, all of which were written in Indic scripts. It is not clear what role Sanskrit and Pali played in the structural development of the state languages of SEA, though there is some evidence (and claims) that direct translations from Pali helped shape the grammar of Burmese (Okell 1965; Yanson 2002, 2005).

During the second half of the second millennium, language standardization and planning were mostly done in terms of rules of poetry, largely based on Indic models at least in the western part of the area, while Vietnamese followed Chinese traditions (see Nguyen et al. 2018 for an overview). Examples of this Indian-style “linguistic” treatise are found in different places throughout SEA, first as stone inscriptions, from later periods also as palm leaf or mulberry paper manuscripts (Diller 2001). Often these texts relate to Buddhist traditions and explanations of word meanings (Veidlinger 2006). The Cindamani, a textbook of Siamese metrics and poetry and one of the first efforts at standardization of the language, probably emerged from the 17th-century Siamese court of Ayudhya.

In the mid-19th century, the later King Rama IV (King Mongkut) was educated in Thai monastic style, with Pali as a main subject, but was also exposed to Western knowledge, especially European languages, including Latin. It was partly through Mongkut’s initiative that the Thai (Siamese) language was modernized and standardized according to western principles. Pallegoix’s Grammatica linguæ Thai, written in Latin, was published in Bangkok in 1850. It marks the beginning of modern grammatography in Thailand and coincides with the political pressure to establish Siam as an independent nation, culturally equivalent to the colonial powers dividing up SEA. Apart from securing the status as a “civilized” language with a “real” grammar in the eyes of the western colonialists, Siamese also acquired emblematic function as language of the nation state, which was struggling to assert its geographical and cultural independence (Winichakul 1994). Through this standardization, Thai was also westernized structurally to some extent.
Pallegoix’s grammar gives complete paradigms of verbal and nominal inflections, present in Latin but not in Thai (or any other Southeast Asian language).2 The grammatica also includes non-western topics, such as tones, metrics, and the different social levels of speech, making it a rather complete descriptive account of Thai. Initially the westernized structures were probably limited to learned and court language, but with increased education for the people in the 20th century and more widespread literacy throughout the population, many of these structures spread to the everyday language, with the result that Thai today shares many features with European languages. The spread of non-Thai features such as a quasi-passive form and a continuous aspect is well documented since the beginning of the 20th century, when non-religious texts such as novels were first translated from English into Thai and later written in Thai, but still modeled on the European style. This new literary genre introduced the western-style Thai language to a broad audience.

2 The verbal paradigm does include tense, mode, and voice distinctions, but no person marking. The nominal paradigm lists the six cases of Latin (nom, gen, dat, acc, voc, abl), in Thai mostly rendered by prepositions.

The standardization process that started in Thai also affected closely related Lao, which shares much of its history and culture with Thai and became a national language only in 1975 with the establishment of the Lao PDR (Enfield 1999, 2007; Simpson and Thammasathien 2007). While Thai and Lao share the majority of their linguistic material, they differ markedly in their standardized written forms as well as in the codification of social and cultural hierarchies (Enfield 1999).

While Burmese was exposed to multiple influences in terms of lexical borrowing (Jenny 2017), the main influence in shaping the standard language probably came from Pali, most prominently in the form of word-by-word translations, the so-called nissaya (Okell 1965). These Pali-Burmese bilingual texts, which first seem to appear in the mid-15th century, follow the Pali original, adding a direct Burmese translation after every word or phrase, very much in the style of the medieval European glosses. This arrangement, together with the consistent use of Burmese postpositional markers on verbs and nouns to represent Pali inflectional endings, led to these Burmese markers being reanalyzed as equivalents of the Pali forms, in many cases acquiring grammatical meanings and functions that they did not inherently have, but which were compatible with their indigenous use. While these particles are for the most part optional or pragmatically triggered in native Burmese, the Pali endings are an integral part of the morphological profile of the language. With the reanalysis of the Burmese markers, they came to be seen as more obligatory in some literary genres. From there, they spread to the general literary and, to a lesser extent, colloquial styles. Similar to the case of Thai, Burmese was thus restructured according to a foreign model, in this case Pali, which is characterized by a morphological structure unknown in Southeast Asian languages.
Later English influence was restricted mostly to lexical borrowing, with little evidence for structural convergence.

The Khmer kingdom of Angkor shifted between Hinduism and different schools of Buddhism, which exposed it to both Sanskrit and Pali influence (Coedès 1964; Stark 2004). Both have left their footprints in Khmer. After the collapse of Angkor and the rise of its successor, Siamese Ayudhya, there was much exchange between Siamese and Khmer language and culture, which lasted until the beginning of the French colonial time. The arrival of the French brought in western ideas, including ideas about language description and standardization, but also led to opposition against these imported ideas. Terminology for new technologies was preferentially coined on Indic bases or taken from Thai, rather than borrowed from French (Heder 2007).

The situation in Vietnamese is radically different from the other SEAn national languages, as the main external influence in the early history here was from China, rather than India. Vietnamese adopted Chinese characters, first to write Chinese in the first millennium, and then, in adapted form, to write Vietnamese in the second millennium. This led to an early literary diglossia, which was not abolished with the introduction of the Chinese-based Vietnamese script, as the literary language remained distinctly different from the spoken vernacular (Nguyen et al. 2018). At the same time, the language also acquired a large number of Chinese words and structures (see Alves, chapter 27 and 2009; Minh-Hằng and O’Harrow 2007). In the early 20th century, many thousands of two-syllable Chinese-based compounds spread through Vietnamese (as well as Korean and Japanese). These were chosen rather than loanwords from French, apart from a few Western cultural concepts (e.g. butter, western-style shirts). Later in the 20th century, there were efforts to replace the Chinese terms by indigenous Vietnamese elements, also changing the Chinese-style modifier-head order to the Vietnamese head-modifier pattern (Nguyen et al. 2018). The phonology of Vietnamese was restructured to approach Chinese, especially in its syllable and word structure. The inherited Austroasiatic morphology was completely lost, and syntactic patterns largely harmonized with Chinese (which probably was very similar to Vietnamese in this respect; see Alves, this volume). In the 19th century, Vietnamese, like Khmer, faced the colonial threat from western nations (see Defrancis 1978). Interestingly, it is during this period that Vietnamese apparently introduced a number of grammatical elements that seem to imitate western linguistic usage, such as a possessive marker and passive forms (Washizawa 2019).

In all cases of language reform and standardization in SEA in the late 19th and early 20th century, one important factor seems to have been the determination to make the languages more “civilized”, less “illogical, vague, and primitive”, opposing and correcting the widespread (mis)conceptions of SEAn languages held by westerners (Diller 1993). Language became an important symbol of national identity in all SEAn countries by the late 19th century at the latest.
The development in the individual countries differs, but the outcome was in all cases a heavily standardized, partly westernized variety of an indigenous SEAn language, diverging from its smaller relatives that lacked state power (Watkins 2007 on Burmese/Myanmar; Heder 2007 on Khmer; Simpson and Thammasathien 2007 on Thai/Lao; Minh-Hằng and O’Harrow 2007 on Vietnamese).

Apart from the above-described developments due to external influence, all SEAn national languages share the status as languages with large numbers of L2 speakers. This is not surprising, given the fact that they are the official languages of multiethnic and multilingual countries, and are used in international communication and trade. Languages with many L2 speakers tend towards levelling of structures and lexicon, losing irregular formulations and overly specific vocabulary. This is probably due to the sometimes imperfect mastery of the language by L2 speakers, which leads to simplification in the sense of more grammatical regularity and broader lexical semantics, a phenomenon that has also been discussed under the topic of pidginization, although the mechanisms at play are probably rather different, and none of the SEAn national languages shows clear traits of a pidgin or creole (Michaelis et al. 2013).

The following sections give brief descriptions of the main features of the SEAn national languages, keeping in mind that there are at least as many differences as there are commonalities. The aim of this chapter is not to provide full grammatical descriptions of the languages discussed here – there is no lack of relevant material available in print in the form of comprehensive descriptive grammars and studies presenting a wide range of linguistic topics in the individual languages. Rather, the focus is on the shared features due to similar paths of extra-linguistic developments.

25.3 Phonology

In terms of phonology, the languages under discussion here seem to have undergone little restructuring that can be attributed to their status as national languages or languages with substantial numbers of L2 speakers. There is no marked simplification in the systems, and the segmental inventories and phonotactics vary greatly from one language to another. All but Khmer exhibit lexical tone, realized as a combination of pitch, contour, and phonation, and all but Burmese have retained the coda consonants present in their respective families (and in most languages of SEA). The large numbers of L2 speakers obviously did not lead to any simplification in the domain of phonology in the national languages, but rather to a general convergence within the SEAn area. On the other hand, cross-family influence spreading from the national languages to minority languages belonging to other families is common. Austroasiatic languages spoken in Thailand, for example, tend to simplify the codas r/l/n, merging them into /n/, as is the case in Thai.

Some marginal sounds have been introduced through recent western influence, such as Vietnamese /p/ in the onset, primarily from French loanwords (e.g. pin ‘battery’ from French la pile; see Barker 1969), and Thai initial /ʃ/, which in some urban genres tends to replace the indigenous /cʰ~ʨʰ/ and which, together with the American English-inspired pronunciation of /r/, leads to a distinct “radio Thai” style. Similarly, Thai final /s/ first occurred in some English loans and later spread to a few colloquial words, such as man(s) ‘fun, great’.3 In Burmese, English pronunciations similarly made their way into the spoken language through loanwords, but they do not generally seem to have spread beyond borrowed vocabulary.
In one interesting case, this partial introduction of a foreign sound has led to a new phonemic distinction between /f/ and /pʰ/, at least for the speakers who pronounce ‘telephone’ as [fòuɴ] and thus make it distinct from [pʰòuɴ] ‘merit, religious power’. For most native speakers, the two forms are homophonous, though.

3 At least the spelling in modern Thai suggests the final /s/, though it is not usually pronounced as such by most speakers in normal speech. The spelling variant can also be seen as a way to distinguish the word from the homophones ‘potato’ and ‘oil, fat’.




25.4 Word classes and word formation

In the absence of inflectional morphology, word classes in SEAn languages are not always easy to define. Only the syntactic behavior of lexemes can be used to categorize parts of speech in many cases. Based on Indic and European models, official grammars of SEAn languages do describe verbs, nouns, adjectives, adverbs, adpositions, etc., but the criteria for assigning lexemes to these classes are not always transparent. Burmese, which shows a richer morphological inventory than the languages further east, does indeed distinguish between verbs and nouns based on their co-occurrence with certain affixes, such as status markers -tɛ/dɛ ‘non-future’, -mɛ ‘future’, -pi/bi ‘new situation’, and -pʰù/bù ‘negative’, and the negator mə- only combining with verbs, and case markers usually only combining with nominals (Jenny and Hnin Tun 2016). Other postpositional markers can be attached to verbal as well as nominal expressions, like -tàiɴ/dàiɴ ‘each, every; every time that’. There is a tendency towards more regulation and standardization in terms of parts of speech also in the other national languages, resulting in a more western-like picture.

Derivational morphology was present in earlier stages of Tibeto-Burman (Burmese) and Austroasiatic (Khmer, Vietnamese) languages, but there are no traces of it in Tai-Kadai (Thai, Lao) at any stage. These morphological processes are lost in the modern languages, surviving at best as lexicalized forms. New periphrastic processes developed, covering a wide range of functions, and in some cases becoming themselves morphologized. One important derivational pattern is the possibility to form nouns from verbs or whole clauses, a feature widely used in formal styles. In colloquial speech, verbal and other underived forms are more common. Burmese has several nominalizing suffixes, depending on the semantics of the verb and/or the required meaning of the whole nominalization.
Thai, Lao, and Khmer productively use the Sanskrit/Pali form kāra ‘deed’ as a pseudo-prefix (technically speaking it is rather a nominal head to which a verbal or clausal attribute is added). In Thai and Lao it appears as kaːn, in Khmer as kaː. This is similar in function and meaning to the Vietnamese phrasal/clausal nominalizing morpheme việc ‘work, deed’ (Washizawa 2019), among other, more abstract nominalizers, such as sự and cuộc (see Nguyen 1997: section 7.5, 168–170). The following sentences illustrate the use of nominalized clauses in Burmese (1) and Khmer (2).

(1) Burmese (Jenny and Hnin Tun 2016)
θaɴdwɛ̀-go ʔəbɛ-ɟáuɴ θèiɴ-ju-ɟìɴ mə-pjú-gɛ́-bí-ðə-nì.
Sandoway-obj inter-because to.keep-to.take-nmlz neg-to.do-displ-emph-nfut-cq
‘Why didn’t you take Sandoway?’

(2) Khmer (Bisang 2015)
phìːəsaː cìːe kaː sɔmdaeŋ cɤt kùmnɯ̀ːt krùp jaːŋ.
language to.be nml to.express heart thought all kind
‘Language is the expression of all emotions and ideas.’

25.5 Phrase and clause structure

25.5.1 Noun phrase

Unlike the colloquial and less standardized varieties of SEA, the national literary languages make ample use of overt grammatical markers, both in the noun phrase and verb phrase, as well as in clause linkage. Pallegoix’s grammatica (1850) gives rather complete lists of declensions introduced to Thai. Examples (3) and (4) illustrate “case-marked” nouns in early modern Thai.

(3) Early modern Thai (Pallegoix 1850)
kʰǎw tʰam ráːj kɛ̀ː raw.
3.hum to.do be.cruel obl 1p
‘They hurt us.’

(4) Early modern Thai (Pallegoix 1850)
dâj ráp ŋɤn tɛ̀ː kʰǔn.lǔəŋ.
to.get to.receive silver abl king
‘(He) received money from the king.’

In Literary Burmese, case marking is even more developed and used rather consistently, including markers also for Subject and Direct object, besides Indirect object and local notions such as Ablative and Locative. Colloquial Burmese uses the case markers found in the literary style, though less consistently, especially for Subjects and Direct Objects (Jenny and Hnin Tun 2013, 2016). The following examples (5) and (6) illustrate the case markers of Burmese.

(5) Literary Burmese (Jenny and Hnin Tun 2016)
ʔəme-ði ʔi sa.ʔouʔ-ko θà-ʔà pè-ði.
mother-sbj this book-dir.o son-ind.o to.give-nfut
‘The mother gave this book to the son.’

(6) Colloquial Burmese
ʔəme(-gá) di sa.ʔouʔ θà-go pè-dɛ.
mother(-sbj) this book son-obj to.give-nfut
‘The mother gave this book to the son.’




Further east, Khmer and Vietnamese do not appear to have anything like case markers, but the possessive relation is optionally marked by a noun meaning ‘thing, property’ in Thai, Lao, Khmer, and Vietnamese, but not in Burmese. The function is often extended, as with European genitives, to cover non-possessive relations as well, as seen in (7). In some contexts, the possessive marker in Vietnamese seems to be obligatory, as in (8). The same is also true for Thai, where the possessive must be overtly marked after complex NPs, as in (9).

(7) Vietnamese (Washizawa 2019)
sinh.viên của trường này.
student poss school prox
‘students of (in) this school’

(8) Vietnamese (Washizawa 2019)
những tấm ảnh cũ của tôi
pl clf picture be.old poss 1s
‘my old pictures’

(9) Thai
rót màj *(kʰɔ̌ːŋ) kʰǎw.
car be.new poss 3.hum
‘His new car.’

25.5.2 Verb phrase

Verbal expressions are similarly formalized in literary varieties of the national languages, aiming to achieve one-to-one correspondence sets with European and Indic languages. This often leads to contradictions with the inherited categories, which cannot readily be equated to Indo-European tenses and aspects. Pallegoix (1850) gives full verbal tense paradigms for Thai, artificially constructed based on Latin models. The latinized meanings assigned to these constructions diverge from the actual vernacular meaning in many cases. Table 1 illustrates the discrepancy between the “reformed” Thai values and the indigenous reading of the same forms.

Tab. 1: Thai verbal paradigm of rák ‘to love’ (partial).

Thai | Pallegoix 1850 | Present-day vernacular reading
kʰâː rák; kʰâː rák jùː | tempus præsens | neutral, default (any tense); continuous
mɯ̂ə.nán kʰâː rák; kʰâː rák jùː | tempus imperfectum | ‘then I (did, will) love’; continuous
kʰâː dâj rák | tempus perfectum | ‘I get to/can/will get to love’
mɯ̂ə.nán kʰâː dâj rák | plusquam perfectum | ‘then I did/could/can/will love’
kʰâː càʔ rák | tempus futurum | ‘I will/would/might love’
kʰâː càʔ dâj rák | futurum præteritum | ‘I will/would/might get to love’


Later Thai grammars like the 1891 wajjakɔːn thaj (Thai Grammar) by the Education Department and Phaya Upakit’s still highly influential làk pʰaːsǎː thaj (Principles of the Thai Language), first published 1919 in Bangkok, further formalized and standardized the paradigms. From these normative grammars, the reformed patterns and usages found their way at least partly also into the colloquial varieties. Similar grammatical formalization and standardization of the verbal expressions is also seen in Burmese, to a lesser degree in Khmer and Vietnamese. The grammar section of Van Tan’s English-Vietnamese dictionary (Nguyên Vản Tạo 1950) gives a list of Vietnamese correspondences for English “subjunctive” verb forms (‘should have, would have, might have, etc.’), but it is not clear to what extent these forms have found their way into the vernacular.

25.5.3 Clause structure and clause linkage

The clause in colloquial and non-literary Southeast Asian languages typically exhibits a rather large degree of freedom in terms of movement and omission of arguments. Elements can be fronted for pragmatic reasons and known or contextually retrievable referents are not usually mentioned in the linguistic expression, unless there is a pragmatic reason to do so. In the literary varieties, the occurrence of overt arguments, be it in the form of pronouns or full NPs, is much more common, even to the extent that a dummy subject can be used where no referent is present. This is the case for example in meteorological expressions, as in Thai man rɔ́ːn ‘it is hot’, where the third person non-human (or low honorific) pronoun man ‘it’ does not refer to any obvious entity.

When linking events, the preferred strategy in colloquial varieties usually is mere juxtaposition, or the use of a general linker. The exact meaning and relation of the linked clauses depends on the semantics of each and on pragmatic factors. An example of unmarked juxtaposition with no overt argument in Khmer is given in example (10).

(10) Khmer (Bisang 2015)
tɤ̀ːp stùh tɤ̀u deɲ cap jɔ̀ːk mɔ̀ːk ʔaop.
then to.jump.up to.go to.pursue to.catch to.take to.come to.hug
‘[She] then jumped up, caught [the duckling] and hugged [it].’

It is evident that much context, both linguistic and extra-linguistic, is needed to make sense of an utterance like the one in (10). In the Burmese example (11), much more content and many more relations are encoded linguistically, making the expression less open to different interpretations.



(11) Literary Burmese (Jenny and Hnin Tun 2016)
təjouʔ-midija-dwe-gá we.baɴ-pì.nauʔ gɛlɛʔsi-pʰòuɴ-dwe-go sʰaɴsʰàuɴ-gá
China-media-pl-sbj to.criticize-seq Galaxy-phone-pl-obj Samsung-sbj
ʔəkʰá-mɛ́ pjiɴ-pè-mji.
fee-without to.repair-to.give-fut
‘After being criticized by the Chinese media, Samsung will repair Galaxy telephones free of charge.’

Causal, conditional, and concessive relations in clause linkage in Thai can be expressed either by the general topic-comment linker kɔ̂ or by adding explicit subordinators. The former is favored in the colloquial style, the latter in formal contexts. Examples (12) to (16) illustrate subordination in Thai without and with explicit marking.

(12) Thai (colloquial)
wâːŋ kɔ̂ paj.
be.free tcl to.go
‘If I’m free, I will go.’, ‘I am free, so I will go.’, ‘I will go because I’m free.’, etc.4

(13) Thai (colloquial)
mâj wâːŋ kɔ̂ paj.
not be.free tcl to.go
‘Even if I’m not free I will go.’, ‘If I’m not free, I will go.’, etc.

(13) Thai (more formal)
tʰâː pʰǒm wâːŋ pʰǒm kɔ̂ càʔ paj.
if 1sm be.free 1sm tcl irr to.go
‘If I’m free, I will go.’

(14) Thai (more formal)
mɛ́ː pʰǒm mâj wâːŋ pʰǒm kɔ̂ càʔ paj.
though 1sm not be.free 1sm tcl irr to.go
‘Even though I’m not free, I will go.’

(15) Thai (more formal)
pʰrɔ́ʔ pʰǒm wâːŋ pʰǒm kɔ̂ càʔ paj.
because 1sm be.free 1sm tcl irr to.go
‘Because I’m free, I will go.’

(16) Thai (more formal)
pʰǒm wâːŋ pʰǒm cɯŋ càʔ paj.
1sm be.free 1sm cons irr to.go
‘I’m free, therefore I will go.’

4 All person and time references are possible.


 Mathias Jenny

Similarly, Vietnamese allows overt or implicit clause/event linkage, as seen in examples (17) and (18).

(17) Vietnamese (Brunelle 2015)
Dũng đi mua báo đem về cho ông.
pn to.go to.buy newspaper to.bring to.return to.give grandfather
‘Dũng goes to buy the newspaper and bring it back for her grandfather.’

(18) Vietnamese (Brunelle 2015)
Duy ngủ không sâu vì hàng.xóm đang xây nhà.
pn to.sleep not be.deep because neighbor prog to.build house
‘Duy does not sleep well because the neighbors are building a house.’

Besides the specific clausal connectors, Vietnamese also has the general linker thì, which covers a similar range of functions as Thai kɔ̂ (Cao Xuân Hạo 1992).

The national languages differ from the colloquial varieties of SEA mainly in their explicit marking of phrasal and clausal relations, leaving less room for interpretation. At the same time, hardly any categories are obligatorily marked even in formal texts. There is in no case any person agreement on the verb, and core case marking is very rare, Burmese case markers being an exception in SEA. The phrase and clause structure of the SEAn national languages reflects the effort to replicate more prestigious languages, mostly of the Indo-European group (Pali, Sanskrit, English, French), without going all the way to metatypy. The intensive contact among the kingdoms and states across MSEA also led to large-scale convergence, resulting in many cases in one-to-one intertranslatability of unrelated languages. This syntactic convergence is easily illustrated by comparing parallel Thai and Khmer sentences, as in (19) and (20) (see Huffman 1973). Example (19) shows that this isomorphism already existed (or started) in the Old Khmer period, suggesting that Thai converged towards Khmer, rather than the other way round.

(19) Old Khmer (adapted from Jenner and Sidwell 2010; Thai translation MJ)
OK  oy       vraḥ   dakṣiṇā         bhūmi     sratāc        ṛdval
Th  hâj      pʰráʔ  râːtcʰətʰaːn    pʰɛ̀n.din  sàʔtàːt  lɛ́ʔ  rɯ́ʔtʰuən
    to.give  holy   royal.donation  land      pn       and  pn

OK  nu    sarvva.dravya  ʔval      ta       mratāñ  śrī.prathivinarendra.
Th  kàp   sáp.sǐn        tʰáŋ.mòt  kɛ̀ː      câːw    sǐː.pətʰíʔwiːnəren.
    with  asset          entire    lnk/dat  lord    pn
‘As royal honorarium [he] gave the lord Śrī Pṛthivīnarendra tracts of land belonging to Sratāc [and] Ṛdval and all manner of costly things.’




(20) Modern Khmer (Haiman 2011; Thai translation MJ)
Kh  lo:k   ba:n    aoj      (luj)  knjom  ja:ng.tec  pram  rial.5
Th  pʰráʔ  dâj     hâj      (ŋɤn)  pʰǒm   jàːŋ.nɔ́ːj  hâː   rǐən.
    monk   to.get  to.give  money  1s     at.least   five  riel
‘The monk gave me at least five riels.’

As extensive as the similarities between Thai and Khmer clause structures are, there are also marked differences. While the use of classifiers with numerals is obligatory in Thai (as it is in Vietnamese and Burmese), they can frequently be dropped in Khmer. Khmer does have a rather large inventory of classifiers, partly modeled on the Thai usage, but their use is much less consistent than in Thai (Bisang 2015). No classifiers are used in Thai and Vietnamese when the noun is a measure word like ‘day’, ‘glass’, ‘mile’, which itself functions as a sort of classifier of a non-expressed abstract notion, such as ‘time’, ‘distance’, etc., and in some fixed expressions, such as brand names (Thai sǎːm mɛ̂ː-kʰruə ‘three cooks’) and political slogans (Thai nɯ̀ŋ tambon nɯ̀ŋ pʰəlìttəpʰan ‘One district – one product, OTOP’). In Vietnamese, additional expressions that do not require classifiers are Chinese loanwords (‘one university’, ‘one government’, etc.). These expressions are highly contextual and idiomatic in Thai and Vietnamese, and the pattern is not generally used productively in daily use.

25.6 Lexicon

The lexicon of the national languages reflects their role as state languages especially in two domains, namely the pronoun systems and neologisms. Both will be briefly outlined in the following subsections.

25.6.1 Pronoun systems

In SEA, two principal types of pronoun systems are found, namely grammatical systems and social-hierarchical systems. The former, more archaic systems are found principally in peripheral vernaculars, the latter in centralized state languages (Müller and Weymuth 2017). The inherited pronominal paradigms typically distinguish three persons and two or three numbers, in many cases also between inclusive and exclusive in the first person non-singular forms. In all national languages of SEA, these paradigms have been replaced by rather large sets of forms used as pronouns that indicate social status and relations rather

5 Haiman’s practical orthography of Khmer, which deviates greatly from more common transcriptions of the language, has been retained unchanged here.


than grammatical features (Goddard 2005). The old pronouns are often retained, usually with low (intimate or rude) value, especially in the first and second person. This is seen for example in Thai kuː ‘I’ and mɯŋ ‘you’, both going back to proto-Tai etymons (*kawA/kuːA and *maɰA/mɯŋA, respectively, see Pittayaporn 2009) and having acquired contemptuous connotations (Diller 2001). The same holds for the inherited Burmese pronouns ŋa ‘I’ and niɴ ‘you’ (from proto-TB *ŋa/ŋay and *naŋ/na, respectively, Matisoff 2003), the former being considered informal and intimate, the latter used in informal contexts where at least one female is involved. In Vietnamese, at least the informal second person pronoun mày goes back to a proto-Austroasiatic etymon *mi(i)ʔ/mi(i)h, while Chinese loanwords make up a great part of the more recent hierarchical system (see Alves 2017).

Typically, in the social-hierarchical pronoun systems, kinship, social, and professional terms take over the function of pronouns, in many cases with stable reference in a speech situation. In some contexts, especially among younger speakers, personal names or nicknames can take the place of pronouns. If a mother speaks to her child, she would use ‘mother’ for self-reference and ‘child’ for addressee-reference, while the child would use ‘child’ for self-reference and ‘mother’ for addressee-reference. There is no shifting of reference, as is the case with grammatical (deictic) pronouns such as ‘I’ and ‘you’. While there are many differences in the details, the national languages show a great extent of uniformity in their pronoun systems, which are often rather open classes, expressing social relations in great detail in many cases. One shared development is the use of forms (originally) meaning ‘servant, slave’ for first and ‘lord, master’ for second person.
This is reflected for example in Thai kʰâː ‘servant; I’, Lao kʰɔ̀j ‘I’ (< kʰàː nɔ̂j ‘little servant’),6 Burmese couʔ (< cuɴnouʔ ‘little servant’), Khmer kʰɲom ‘servant; I’ and Vietnamese tôi for the first person, and in Thai câːw ‘lord, prince; you’ and Burmese mìɴ ‘lord, prince; you’ for the second person. In some cases, as in Thai and Burmese, these forms have become informal or intimate, being replaced by more formal innovations, such as Thai pʰǒm ‘head hair; I (male)’/dìʔcʰǎn ‘I (female)’ (of unknown origin) and kʰun ‘you’ (from Pali guṇa ‘quality, favor’). Burmese has cənɔ ‘royal servant; I (male)’ and kʰəmjà ‘you (male speaker)’/ɕiɴ ‘you (female speaker)’, the latter two (probably) from ʔəkʰin-pʰəjà ‘beloved-lord’ and ʔəɕiɴ ‘lord, master’, respectively. In the colloquial varieties of the national languages, a reverse development towards socially neutral (and reciprocal) pronouns can be observed. As the indigenous systems do not readily allow for pronominal reference disconnected from social status, notably on the polite levels, recourse is taken to loanwords. In urban colloquial Thai, the English forms ʔaj and juː are frequently used, especially in interactions with

6 Lao kʰɔ̀j became the general pronoun for the first person, independent of the social standing of the speaker or addressee, in the process of leveling social hierarchies in society and language under the socialist Pathet Lao regime.




foreigners who do not easily fit in the Thai social hierarchy. Among ethnic Chinese in Thailand, the pronouns ʔúəʔ ‘I’ and lɯ́ː ‘you’ are commonly used, also by speakers who do not otherwise speak Chinese.

25.6.2 Neologisms

Terminology for new technologies can be introduced to a language in different ways. Unlike non-official languages, national languages must find some way to label newly introduced concepts in order to remain fully functional in all domains. The main processes through which neologisms can enter a language are (i) as direct loans (lexical borrowing), (ii) as loan translations (calquing), or (iii) by making up new terms from existing material. All three processes are found in the national languages of MSEA, to different extents in different languages. Importantly, the non-official languages most frequently borrow the terms created by the national languages, independent of the process applied in these in the first place.

i. Lexical borrowing

Lexical borrowing is widespread among the languages of MSEA, the major source languages of terminology for new technology being Chinese and English. Neologisms in Vietnamese are mostly direct loans from Chinese (see Alves on Sinitic influence in MSEA, this volume) or two-syllable terms made up of two Sino-Vietnamese morphs, such as nhà máy ‘factory < house-machine’ and bệnh viện ‘hospital < sick-institute’ respectively. The former contains the Vietnamese lexemes nhà and máy in the indigenous order head-modifier, the latter is a direct loan from Chinese bìngyuàn ‘disease-institute’, reflecting the Chinese order of elements in the compound (see Nguyễn et al. 2018; Alves 2019). French loanwords are fewer and more recent, such as ăng-ten ‘aerial’ (F antenne), ô tô buýt ‘bus’ (F autobus), xăng ~ ét-xăng ‘gasoline’ (F essence), phim ‘movie’ (F film), and xích lô ‘cyclo’, the last also used in Khmer as sikloː. ‘Ice cream’ is kem in Vietnamese, from French crème, which appears in Lao as kəlɛ́ːm.7 Generally, loanwords are adapted to the Vietnamese phonology, so that they do not appear as foreign elements in the language (see Scholvin and Meinschäfer 2017).

7 Thai uses the English loan ʔajsəkriːm, colloquial ʔajtim.

In the western parts of MSEA, Indic languages and English have played the major role of source languages for loanwords. Older layers of English words in Thai were replaced by more indigenous-looking material in the first half of the 20th century, like poːlít ‘police’, which was replaced by the Khmer form tamrùət, and meː ‘mail’, which gave way to the Sanskrit neologism prajsəniː. At the same time, the Royal Thai Institute coined many words for newly introduced technologies, mostly based on Indic roots (Diller 1993; see sub-section ii). Lao took over the same set of coined Indic and Khmer neologisms from Thai, adapting them to Lao phonology. Later English loans are found in Thai (and Lao), covering domains as diverse as tʰéknoːloːjîː ‘technology’, kʰɔmpiwtɤ̂ː ‘computer’, kʰéːk ‘cake’, pʰàp ‘pub’, líp-sətìk ‘lipstick’, daːwlòːt ‘download’, wiːdiːʔoː ‘video’, fɛːcʰân-cʰoː ‘fashion show’, and many more. In some cases, English terms colloquially replace the more formal Indic words, like tʰiːwiː ‘TV’ for tʰoːrətʰát. Many loanwords in Thai, although adapted to the indigenous phonology, still feel like foreign elements, be it because of their polysyllabic structure or due to uncommon tone-rhyme combinations.

In Burmese, English loanwords started being used in great numbers with the beginning of British rule in the late 19th century and have continued to be used to the present, although there was a period when the government of independent Burma promoted indigenous terminology (see sub-section iii below). English terms in Burmese are in some cases easily detectable as loanwords due to uncommon sounds or sound combinations, as in redijo ‘radio’, which is pronounced with the unusual rhotic /r/, or due to their opaque polysyllabic structure, like hotɛ ‘hotel’, tɛlipʰòuɴ ‘telephone’, dimokʰəresi ‘democracy’ and kuɴpjuta ‘computer’, all also without the usual voicing of intervocalic consonants within a word. Others are well integrated in the sound system of Burmese, like meiʔkaʔ ‘make-up’ and sʰaiʔ-kà ‘bicycle rickshaw’ (E side-car). As in Thai, some old English loans have been replaced by more indigenous terms, such as baiɴsəkouʔ ‘movie’ (E bioscope), which was replaced by jouʔ-ɕiɴ ‘live picture’.

A number of English loans have entered other languages of Myanmar through the medium of Burmese, as can be seen in the form of Mon kɔmpjuta ‘computer’ (rather than the expected *kʰɔmpʰjutɤ if the loan had been from English directly) and pàiŋsəkɤk ‘movie’ (not *ɓajʔosəkop or similar, which would be the likely outcome of a direct English loan). The case of Mon and Burmese is a nice example of shifting status: while Pali words entered Burmese through Mon in the 11th century, when Mon was the language of state and literature, today Burmese has the role of the high prestige language and acts as intermediary for English loans in Mon (see Jenny 2013).

ii. Loan translations

Loan translations are made up of indigenous material or words already present in the language as earlier borrowings, replicating the compound semantics of a foreign expression. Consisting of native or nativized vocabulary, lexical calques are not perceived as foreign elements in a language. In Thai and Khmer, the most common way to calque new terminology is by using Sanskritizing forms. These show profound knowledge of Sanskrit by the institutions coining the words, although they are in most cases not identical to the Sanskrit neologisms used in India. Typical examples are Thai tʰoːráʔsàp ‘telephone’, from Sanskrit dūra ‘distant’ and śabda ‘sound, word’, and tʰoːrətʰát ‘television’, based on the same Sanskrit prefix combined with dṛṣṇa ‘vision’. The same prefix is also used in tʰoːráʔlêːk ‘telegram’ (with lekha ‘writing’) and tʰoːráʔcìt ‘telepathy’ (with citta ‘mind, intention’). Similarly, Sanskrit (and Pali) words can receive extended semantics matching the English usage, like sətʰǎːniː ‘place’ > ‘(railway, radio) station’ in Thai. The same expressions are also used in Khmer and Lao, with Thai being the likely source of the calques in the latter. Native words are used in calques like Burmese le-zeiʔ ‘airport’ (‘wind/air-port’), and Thai tʰǔŋ-nɔːn, Vietnamese túi ngủ and Khmer kəboːp deːk ‘sleeping bag’ (all ‘bag-sleep’).

iii. New terms made from indigenous material

The third source of new terms is coinage from existing native (or earlier borrowed) material. These newly coined terms are usually rather transparent in their composition. Examples are Vietnamese máy tính ‘machine-calculate’ and Thai kʰrɯ̂əŋ-kʰít-lêːk ‘machine-think-number’ for ‘calculator’. Burmese in many cases uses coined lexemes for imported technology, such as jouʔ-mjiɴ-θan-cà ‘television’, literally ‘picture-see-sound-hear’, and jouʔ-ɕiɴ ‘movie’, literally ‘picture-live’. For the latter, Vietnamese uses the French loanword phim (film), and Khmer and Thai use the Pali-Sanskrit compound bhāb-yantra ‘picture-moving’.

The common SEAn lexeme roŋ (and similar forms, probably from Malay ruang) meaning ‘hall, big building’ is used in numerous neologisms across the area. ‘Factory’ is roːŋ-ŋaːn in Thai (‘hall-work’), sɛʔ-jouɴ in Burmese (‘machine-hall’), and ròːŋ-cak in Khmer (‘hall-machine’); ‘hospital’ is roːŋ-pʰəjaːbaːn in Thai (‘hall-medical care’), hóːŋ-mɔ̌ in Lao (‘hall-doctor’), and sʰè-jouɴ in Burmese (‘medicine-hall’).

The word for ‘train, railway’ across MSEA is made up of the elements ‘car, vehicle’ and ‘fire’. The former is an Indic loanword in Burmese, Thai, Lao, and Khmer (ratha), and a Chinese loan in Vietnamese (xa/xe, Mandarin chē). The word for ‘fire’ is an indigenous lexeme in all cases. Thai and Lao have rót-faj, Khmer rɔtèh-pʰlɤ̀ːŋ,8 and Burmese mì-jətʰà; in Vietnamese the form is xe lửa.

8 The Khmer form rɔtèh is probably from an early non-Sanskrit Indic form of the etymon ratha (cf. also Khmer rʊ̀ət, which derives from the Sanskrit/Pali form).


25.7 Language use

25.7.1 Greeting and thanking

A common traditional way of greeting across SEA is by asking a question such as ‘have you eaten?’ or ‘where are you going?’ (see Siebenhütter, chapter 30). From the mid-1930s, more formal expressions of greeting were introduced as part of the government policy at the time to modernize (and nationalize) the societies of SEA. Thailand and Cambodia both made use of the Sanskrit svasti ‘well-being’, which also occurs much earlier as a formulaic beginning of inscriptions in Old Khmer and Thai (Diller 2001). The nativized forms are səwàtdiː in Thai and suəsdɤj in Khmer. Lao further nativized the formula to səbaːj-diː, probably based on the fact that it has a transparent meaning in Lao (‘be well, comfortable’) and sounds vaguely similar to the Sanskrit-Thai expression. It has the added advantage that it is different from Thai (although the same expression exists in Thai), making it sound more Lao. At about the same time, newly independent Burma introduced the greeting miɴgəla-ba, based on Pali/Sanskrit maṅgala ‘auspiciousness’. While səwàtdiː in Thai has found its way into the colloquial language (frequently in the shortened form wàtdiː), in Burmese miɴgəla-ba is rarely used in non-formal contexts (see Jenny and Hnin Tun 2016). These general greeting expressions can be used (if they are used at all) at any time of the day, both when meeting and parting. More specific expressions imitating English formulas like ‘good morning’ and ‘good night’ were coined in Thai, namely ʔərun-səwàt and raːtriː-səwàt, respectively, based on Sanskrit aruṇa ‘dawn, morning’ and rātri ‘night’, together with svasti seen above.

Like formal greetings, expressions of thanking are not widely used traditionally in many SEAn societies. The national languages introduced conventionalized formulas, probably as an attempt to replicate European politeness and behavior. Thai kʰɔ̀ːp-kʰun and its more intimate counterpart kʰɔ̀ːp-caj are combinations of a now obsolete verb kʰɔ̀ːp ‘pay back, reciprocate’ with kʰun from Pali guṇa ‘quality, favor’ and caj ‘heart’, respectively. Khmer has ʔɒː kùn, from ʔɒː ‘be happy, enjoy’ and Pali guṇa. Vietnamese uses the two-syllable Chinese loan expression cảm ơn (Mandarin gǎn’ēn ‘thank, grateful’). The common expression of thanking in Burmese is cèzù tiɴ-dɛ, literally ‘impose kindness’. All these formulaic expressions are rather transparent in their composition (even if the verb used in the Thai expression is no longer in use), which is not surprising given their recent origin.

25.7.2 Politeness conventions

Apart from undergoing grammatical modernization and standardization, the national languages also developed politeness strategies deemed important to their function as languages of state. Politeness can be expressed by the use of appropriate pronouns




(see 25.6.1 above) and lexical items (Goddard 2005). Socially stratified lexicon is seen in all national languages of SEA. This commonly covers all semantic domains, including basic daily activities, such as ‘eat’, which in Thai is dɛ̀ːk, kin, tʰaːn, ráp prətʰaːn, in ascending order of politeness.9 In addition to the lexically coded politeness level, SEAn languages commonly have dedicated ‘politeness words’ that occupy a much more prominent position in the national languages than in peripheral varieties. These politeness particles may occur alone, as a short acknowledgement that one is listening, or at the end of any utterance to render it more polite. In many cases, adequate kinship or social terms may be used as politeness particles, which may be fully conventionalized. Often, the choice of the conventionalized politeness particle depends on the gender of the speaker, rather than the addressee. In Khmer, female speakers use caːh or a similar form (of unknown origin), while men use baːt, originally from ‘foot’ (Pali pāda). In Thai, the particle used by male speakers is kʰráp, shortened from kʰɔ̌ː ráp ‘beg to accept’, while female speakers use kʰâʔ or kʰáʔ, probably shortened from kʰâː ‘servant’. In Burmese, the corresponding particles are kʰəmja and ɕiɴ, for male and female speakers respectively. Both originate in the formal second person pronouns, the former with a tone change from heavy to neutral. Vietnamese has the acknowledgement particles vâng in the north and dạ in the south, and the distinct sentence-final politeness particle ạ, all without gender distinction. Lao does not use politeness particles, owing to the ‘democratization’ of the language engineered by the Pathet Lao nationalists. However, many people are exposed to Thai TV and other media and will often use Thai kʰráp/kʰâʔ added to Lao sentences, especially when communicating with people from Isaan (Lao-speaking Northeastern Thailand).

25.8 Summary: language convergence among the national languages

The national languages of MSEA as they appear today are the result of different factors, both linguistic and extra-linguistic. Their genetic inheritance from three different language families, namely Austroasiatic (Khmer, Vietnamese), Tai-Kadai (Thai, Lao) and Sino-Tibetan (Burmese), is still evident both in lexicon and in grammatical and phonological structure. Vietnamese, while retaining a clear Austroasiatic core, has approached Chinese in many respects, especially in terms of morphology and the large number of Chinese-style compounds. The inherited stock has been overlain by foreign layers also in the other languages. The Tai languages Thai and Lao contain a large

9 The list is far from being exhaustive, as numerous other forms of the concept ‘eat’ exist in Thai for use in specific contexts.


portion of early Chinese lexical loans, many going back to the proto-Tai ancestor language, as well as full-fledged Chinese-style tone systems (Pittayaporn 2009). More recent are Indic elements which entered MSEA together with Indic culture and religious beliefs from the first half of the first millennium. While the Indic influence was not principally restricted to the national languages, it was the languages of state and literature that were first and most profoundly affected.

Apart from the shared foreign influence (except for Vietnamese), the national languages are the ones that have been in long-standing contact, both through peaceful and aggressive interactions of the kingdoms and states. This contact is most clearly seen in the structural convergence of Thai and Khmer, which is not restricted to the language, but also includes many aspects of culture, especially court etiquette (Huffman 1973). Thai and Lao, being closely related varieties of the same subgroup of Tai-Kadai, shared many features from the beginning and are to a large extent mutually intelligible. Rather than converging, there may be attempts to diverge in order to maintain national identity. This divergence is mostly restricted to the lexicon without affecting the overall structure of the languages, but Lao has been redesigned as a democratic (socialist) language, abandoning most or all of the social-hierarchical distinctions which are very prominent in Thai. The combination of differences in lexicon and, especially, discrepancies in the use of formal societal registers can lead to strong interference, which is often resolved by Lao speakers switching completely to Thai in certain situations, retaining only Lao sound patterns.

Playing an important role as emblems of national identity (and pride), the national languages, much more than local vernaculars, have been subject to language planning and engineering.
From the mid-19th century to the present, the respective governments made efforts to standardize and modernize the official languages, making them adequate means of communication in a globalized society. Unlike local varieties, the national languages not only had to serve as strong symbols of independent political entities (or entities that strived for independence) in the face of colonial threats, they also had to be equipped with the grammatical and lexical means to serve as languages of broader communication in all domains. While all of them followed different paths in achieving this, there is a great extent of common developments and convergence that makes Burmese, Thai, Lao, Khmer, and Vietnamese stand out as a group in MSEA due to their common roles and functions in their domains. With their strong influence in their respective countries, they also induce changes in the local varieties. The fact that MSEA is seen as a prime example of a linguistic area (Matisoff 2019) is at least partly due to the fact that most descriptions are based on the national languages, but with more material becoming available on other languages of the area, we may assume that the uniformity of the MSEA area is the result also of features spreading from the national languages to peripheral languages throughout the region, making them more similar to the dominant varieties.




References

Alves, Mark J. 2021a. Typological profile of Vietic. (this volume, chapter 22)
Alves, Mark J. 2021b. Linguistic influence of Chinese in Southeast Asia. (this volume, chapter 27)
Alves, Mark J. 2006. Linguistic research on the origins of the Vietnamese language: An overview. Journal of Vietnamese Studies 1(1/2). 104–130.
Alves, Mark J. 2009. Loanwords in Vietnamese. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages: A comparative handbook, 617–637. Berlin & Boston: Mouton de Gruyter.
Alves, Mark J. 2017. Chinese loanwords in Vietnamese pronouns and terms of address and reference. In Lan Zhang (ed.), Proceedings of the 29th North American Conference on Chinese Linguistics (NACCL-29), vol. 1, 286–303. Memphis: University of Memphis.
Alves, Mark J. 2019. Morphology in Austroasiatic languages. Oxford Encyclopedia of Linguistics. Oxford: Oxford University Press. DOI: 10.1093/acrefore/9780199384655.013.532
Barker, Milton E. 1969. The phonological adaptation of French loanwords in Vietnamese. http://sealang.net/sala/archives/pdf8/barker1969phonological.pdf (accessed 31 December 2020).
Bisang, Walter. 2015. Modern Khmer. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 677–716. Leiden & Boston: Brill.
Brunelle, Marc. 2015. Vietnamese (Tiếng Việt). In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 909–953. Leiden & Boston: Brill.
Cao, Xuân Hạo. 1992. Some preliminaries to the syntactic analysis of the Vietnamese sentence. The Mon-Khmer Studies Journal 20. 137–152.
Coedès, Georges. 1964. Les États hindouisés d’Indochine et d’Indonésie. Paris: Editions E. De Boccard.
DeFrancis, John. 1978. Colonialism and language policy in Vietnam. Berlin: Mouton.
Diller, Anthony. 1993. Diglossic grammaticality in Thai. In William A. Foley (ed.), The role of theory in language description, 393–420. Berlin & New York: Mouton de Gruyter.
Diller, Anthony. 2001. Grammar and grammaticality in Thai. In Hannes Kniffka (ed.), Indigenous grammars across cultures, 219–244. Frankfurt am Main: Peter Lang.
Diller, Anthony. 2006. Polylectal grammar and Royal Thai. In Felix K. Ameka, Alan Dench & Nicholas Evans (eds.), Catching language. The standing challenge of grammar writing, 565–608. Berlin: Mouton de Gruyter.
Dixon, Robert M. W. 1997. The rise and fall of languages. Cambridge: Cambridge University Press.
Enfield, Nick J. 1999. Lao as a national language. In Grant Evans (ed.), Laos: Culture and society, 258–290. Chiang Mai, Thailand: Silkworm Books.
Enfield, Nick J. 2007. A grammar of Lao. Berlin & New York: Mouton de Gruyter.
Enfield, Nick J. & Bernard Comrie. 2015. Mainland Southeast Asian languages. State of the art and new directions. In Nick J. Enfield & Bernard Comrie (eds.), The languages of Mainland Southeast Asia. The state of the art, 1–27. Berlin & Boston: Mouton de Gruyter.
Goddard, Cliff. 2005. The languages of East and Southeast Asia: An introduction. Oxford: Oxford University Press.
Haiman, John. 2011. Cambodian: Khmer. Amsterdam & Philadelphia: John Benjamins.
Heder, Steve. 2007. Cambodia. In Andrew Simpson (ed.), Language and national identity in Asia, 288–311. Oxford: Oxford University Press.
Higham, Charles. 2002. Early cultures of Mainland Southeast Asia. Bangkok: River Books.
Higham, Charles. 2004. Mainland Southeast Asia from the neolithic to the iron age. In Ian Glover & Peter Bellwood (eds.), Southeast Asia from pre-history to history, 41–67. Oxfordshire: Routledge Curzon.


Huffman, Franklin E. 1973. Thai and Cambodian – A case of syntactic borrowing? Journal of the American Oriental Society 93(4). 488–509.
Jenner, Philip N. & Paul Sidwell. 2010. Old Khmer grammar. Canberra: Pacific Linguistics.
Jenny, Mathias. 2013. The Mon language: Recipient and donor between Burmese and Thai. Journal of Language and Culture 31(2). 5–33.
Jenny, Mathias. 2017. Foreign influence in the Burmese language. In Ampika Rattanapitak (ed.), A collection of papers on Myanmar language and literature, 1–34. Chiang Mai: Myanmar Center.
Jenny, Mathias & San San Hnin Tun. 2013. Differential subject marking without ergativity. The case of colloquial Burmese. Studies in Language 37(4). 693–735.
Jenny, Mathias & San San Hnin Tun. 2016. Burmese. A comprehensive grammar. London & New York: Routledge.
Matisoff, James A. 2003. Handbook of Proto-Tibeto-Burman: System and philosophy of Sino-Tibetan reconstruction. Berkeley: University of California Press.
Matisoff, James A. 2019. Preface. In Alice Vittrant & Justin Watkins (eds.), The Mainland Southeast Asia linguistic area, v–xvi. Berlin & Boston: Mouton de Gruyter.
Michaelis, Susanne Maria, Philippe Maurer, Martin Haspelmath & Magnus Huber (eds.). 2013. Atlas of Pidgin and Creole language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. http://apics-online.info (accessed 14 November 2019).
Minh-Hằng, Lê & Stephen O’Harrow. 2007. Vietnam. In Andrew Simpson (ed.), Language and national identity in Asia, 415–441. Oxford: Oxford University Press.
Müller, André & Rachel Weymuth. 2017. How society shapes language: Personal pronouns in the Greater Burma Zone. Asiatische Studien/Études Asiatiques 71(1). 409–432.
Nichols, Johanna. 1992. Linguistic diversity in space and time. Chicago & London: The University of Chicago Press.
Nguyễn, Ðình-Hoà. 1997. Vietnamese. Oxford: Oxford University Press.
Nguyễn, Ðình-Hoà, Mark J. Alves & Hồng Cổn Nguyễn. 2018. Vietnamese. In Bernard Comrie (ed.), The world’s major languages, 3rd edn., 696–712. Oxon: Routledge.
Nguyên, Vản Tạo. 1950. Tân Vân: Tân Dại Tự Điền Anh-Việt [Tan Van’s newly revised English-Vietnamese dictionary]. Hanoi: Tan Van.
Okell, John. 1965. Nissaya Burmese. Lingua 15. 186–227.
Pallegoix, D. J. Bapt. 1850. Grammatica linguæ Thai. Bangkok: Collegium Assumptionis.
Pittayaporn, Pittawat. 2009. The phonology of proto-Tai. Ithaca, NY: Cornell University dissertation.
Scholvin, Vera & Judith Meinschäfer. 2017. The integration of French loanwords into Vietnamese: A corpus-based analysis of tonal, syllabic and segmental aspects. In Hiram Ring & Felix Rau (eds.), JSEALS Special Publication No. 3: Papers from the Seventh International Conference on Austroasiatic Linguistics, 157–173. Honolulu: University of Hawai’i Press.
Simpson, Andrew & Noi Thammasathien. 2007. Thailand and Laos. In Andrew Simpson (ed.), Language and national identity in Asia, 391–414. Oxford: Oxford University Press.
Stark, Miriam T. 2004. Pre-Angkorian and Angkorian Cambodia. In Ian Glover & Peter Bellwood (eds.), Southeast Asia from pre-history to history, 89–119. Oxfordshire: Routledge Curzon.
Veidlinger, Daniel M. 2006. Spreading the Dhamma. Writing, orality, and textual transmission in Buddhist northern Thailand. Chiang Mai: Silkworm.
Vickery, Michael. 1998. Society, economics, and politics in pre-Angkor Cambodia. Tokyo: The Centre for East Asian Studies for Unesco, The Toyo Bunko.
Washizawa, Takuya. 2019. Process of grammaticalization and its role in the formation of the written language. Paper presented at the 8th International Conference on Austroasiatic Linguistics (ICAAL 8) in Chiang Mai, Thailand.
Watkins, Justin. 2007. Burma/Myanmar. In Andrew Simpson (ed.), Language and national identity in Asia, 263–287. Oxford: Oxford University Press.



The national languages of MSEA: Burmese, Thai, Lao, Khmer, Vietnamese 


Winichakul, Thongchai. 1994. Siam mapped. A history of the geo-body of a nation. Honolulu: University of Hawaii Press. Yanson, Rudolf A. 2002. On Pali-Burmese interference. In Christopher Beckwith (ed.), Medieval Tibeto-Burman languages, 39–57. Leiden, Boston & Köln: Brill. Yanson, Rudolf A. 2005. Tense in Burmese: A diachronic account. In Justin Watkins (ed.), Studies in Burmese linguistics, 221–240. Canberra: Pacific Linguistics. Zhou Daguan. 1993. The customs of Cambodia. Translated into English from the French version of Paul Pelliot by J. Gilman D’Arcy Paul. Bangkok: Siam Society.

Tom Hoogervorst

26 South Asian influence on the languages of Southeast Asia

26.1 Introduction

The Indian subcontinent and the Southeast Asian mainland have for millennia seen the movement of peoples, cultures, and languages. The presence in both regions of Austroasiatic, Tibeto-Burman, and Tai-Kadai languages testifies to an intertwined past (van Driem 2001). This chapter deals specifically with the influence of Indo-Aryan and, to a lesser extent, Dravidian languages on the linguistic landscape of Mainland Southeast Asia, making comparisons wherever relevant with Maritime Southeast Asia. I use the shorthand “South Asian” to refer collectively to the Indo-Aryan and Dravidian language families, and “Indic” to refer specifically to Sanskrit (henceforth Sk.) and Pali (henceforth Pa.). In view of its historical focus, this chapter foregrounds languages with long written traditions; from the Mon-Khmer family: Mon, Khmer and Vietnamese; from the Tai-Kadai family: Thai and Lao; from the Tibeto-Burman family: Burmese; and from the Austronesian family: Cham and Malay. While the small-scale communities of Southeast Asia’s highlands, coasts, and other non-urban areas also exhibit some Indic loanwords, these were chiefly transmitted through neighboring languages rather than directly from South Asia. Several non-Buddhist Tai communities, for example, have almost no Indic loanwords in their basilects (Smalley 1994: 183).

An extensive literature exists on early Indo-Aryan borrowings in Austroasiatic languages and – even more so – on tentative borrowings in the opposite direction (Kuiper 1948; Mayrhofer 1956–1980; Benedict 1975; Lévy et al. 1975). As ground-breaking as these publications were, later advances in Austroasiatic linguistics have rendered many of their postulations untenable (Kuiper 1991; Witzel 2009; Osada 2009). 
A number of similar-looking protoforms, however, indeed appear to be of considerable antiquity in Indo-Aryan and Mon-Khmer languages alike (Table 1), suggesting a scenario of horizontal borrowing. The words for ‘plough’, ‘pepper(corn)’, and ‘peacock’ have furthermore been connected to Dravidian etyma (Southworth 2005; Krishnamurti 2009).

Tab. 1: Proto-level Indo-Aryan and Mon-Khmer similarities (based on Turner 1966 and Shorto 2006).

| Indo-Aryan | Mon-Khmer | Meaning |
|---|---|---|
| *alā́bu | *[r]baw | ‘a gourd’ |
| *karpā́sa | *kpaas | ‘cotton’ |
| *lā́ṅgala | *lngal, *ŋgal | ‘plough’ |
| *marīca | *mrəc | ‘pepper(corn)’ |
| *mayū́ra | *mraik | ‘peacock’ |
| *śárkarā | *skɔɔr | ‘(candied) sugar’ |

The existing scholarly literature on South Asian influence on the languages of Mainland Southeast Asia is rather unbalanced. The majority of publications single out one Southeast Asian language, e.g. Burmese (Barbe 1879; Houghton 1893; Hla Pe 1960; Waxman and Soe 2014), Khmer (Martini 1954; Jacob 1977; Ménétrier 1985; Woznica 2010; Chhom 2016), or Thai (Gedney 1947; Navawongs 1975; Sarapadnuke 1975; Miyamoto 1992; Pengpala 1998). With the exception of one paper on the phonological integration of Indic and other loanwords in Khmer, Thai, Mon, and Burmese (Henderson 1951), little has been done across present-day national borders. As yet, no intraregional monograph of the magnitude of Gonda’s Sanskrit in Indonesia – which despite its modest title encompasses South Asian linguistic influence across Maritime Southeast Asia – has been published on the languages of Mainland Southeast Asia. One recent publication focuses on Sanskrit loanwords in Indonesian, Lao, Malay, and Thai, providing four language-specific lists of tentative borrowings (Shastri 2005). Given the proximity of Indonesian and Malay as well as Lao and Thai, the volume contains a considerable degree of overlap that might have been prevented by an organization based on etyma rather than recipient languages.

As will be demonstrated in this chapter’s first section, the language ecology of Mainland Southeast Asia has been influenced most thoroughly by Sanskrit and Pali. This is evident in the region’s earliest epigraphy, dating from the mid-first millennium CE. The Indic languages adopted by local elites enjoyed more sociolinguistic prestige than the indigenous Southeast Asian languages, such as Mon, Khmer, Burmese, Thai, Cham, and Malay. As the second section shows, contacts of a more vernacular nature led to the adoption of Middle Indo-Aryan (Prakrit), New Indo-Aryan, and Tamil loanwords. Judging from their phonological characteristics, a number of loanwords were borrowed via other Southeast Asian languages rather than directly. From late-colonial times, as is discussed in the third section, Sanskrit and Pali inspired the coinage of neologisms to denote novel concepts in Southeast Asia’s national languages. Most Southeast Asian languages have Indic-derived scripts (chapter 36, this volume), in which Sanskrit and Pali loanwords tend to be spelled according to their original orthography. Throughout this chapter I also indicate their colloquial pronunciations, since these cannot always be determined from the spelling alone yet are important to understand processes of phonological integration and secondary borrowing.

https://doi.org/10.1515/9783110558142-026
Indic loanwords are often borrowed in their original orthography and then syllabified according to indigenous phonological patterns, including tones and registers, where applicable.




26.2 Classical influence

Several developments in the realms of metallurgy, agriculture, urbanization, and commerce seem to have taken place concurrently across the densely populated parts of South and Southeast Asia (Bellina 2007; Fuller et al. 2015; Castillo et al. 2016; Pryce 2016). The archaeological record points to regular long-distance networks across the Bay of Bengal by the first centuries BCE (Glover and Bellina 2011), and a number of widely adopted and early attested Indo-Aryan loanwords must have entered Southeast Asia during this period of proto-historical contact (Table 2).

Tab. 2: Early Indo-Aryan loans in Southeast Asia.

| Indo-Aryan | Meaning | Attestations in early Southeast Asian epigraphy |
|---|---|---|
| Sk. aśva, Pa. assa | ‘horse’ | Old Cham aśva, Old Khmer ‘seḥ, Old Javanese aśva |
| Sk., Pa. jāla | ‘casting net’ | Old Mon jā, Old Khmer jala, Old Javanese jāla |
| Sk., Pa. kambala | ‘wool, woolen cloth’ | Old Mon kambar, Old Khmer kamval, Old Javanese kambala |
| Sk., Pa. ratha | ‘cart’ | Old Mon ratha, Old Khmer ratha, Old Javanese ratha |
| Sk., Pa. yava | ‘barley, grain’ | Old Khmer yava, Old Javanese yava |
| Sk., Pa. yuga | ‘yoke’ | Old Khmer yuga, Old Javanese yuga |

By the first centuries CE, these interactions eventually led to the adoption in most of Southeast Asia’s political and commercial centers of South Asian religions, including Shaivism, Vaishnavism, Mahāyāna Buddhism and Theravāda Buddhism. As a consequence, two of India’s major literary languages entered Southeast Asia: Sanskrit and Pali. These Indo-Aryan languages are closely related, with Sanskrit representing a more archaic phonological stage of language development. As a literized variety of Old Indo-Aryan, Sanskrit was the sacred language of numerous communities eventually labelled as Hindus. While its study is often claimed to have been restricted to the Brahman caste, distinct varieties used by Buddhists also came into being and spread to Southeast Asia. Pali was adopted as the preferred language of Theravāda Buddhists, who studied it in monastic contexts. In fact, the vast majority of early Pali epigraphy comes from Mainland Southeast Asia rather than India or Sri Lanka, and texts produced in Burma have been crucial for the study of this language regionally and globally. Neither Pali nor Sanskrit ever served as a medium for everyday non-religious communication; both were predominantly written languages in South and Southeast Asia alike. Epigraphy provides the earliest solid evidence for Indic influence on Southeast Asia’s linguistic landscapes (see chapter 35 in this volume). As none of the earliest inscriptions contain dates, paleographic arguments are crucial to determine their antiquity.


A small number of clay objects excavated in lower-central Burma in a 2nd century CE archaeological context – inscribed with the Prakrit name Saṁghasiri – constitute the earliest known evidence for Indic writing in Southeast Asia (Aung Thaw 1968: 51; Griffiths and Lammerts 2015: 989). The oldest inscriptions in a Southeast Asian language are claimed to be in Cham and have been provisionally dated to the late 4th century CE (Marrison 1975; Golzio 2004). Several Indic loanwords entered the Cham language presumably around the same time (cf. Headley 1976; Thurgood 1999). Sanskrit inscriptions start to appear in parts of present-day Vietnam and Indonesia (eastern Borneo and western Java) by the late 4th to early 5th century CE. In the Khmer-speaking area, the first Sanskrit inscription (the Devānīka Inscription) has been dated to the mid-5th century (Cœdès 1956). From that time, Sanskrit came to serve as the chief language of political expression and literary aesthetics among most – but not all – outward-looking Southeast Asian elites (Pollock 2006). Both Sanskrit and Pali feature in inscriptions from Arakan from the 5th century (Frasch 2018; Htin and Leider 2018). The so-called Maunggan Gold Plates inscribed in Pali and dated to the 5th or 6th century hail from Śrīkṣetra (near modern Pyay in Burma) along the Irrawaddy River, while a number of short Pali inscriptions emerge around the same time in the Mon-speaking Dvāravatī kingdom in present-day Thailand (Griffiths and Lammerts 2015). The first known Mon inscription has been dated to the 6th century (Shorto 1971; Bauer 2018). Inscriptions in the Khmer language date from the early 7th century, at roughly the same time the Khmers adopted Sanskrit in a chiefly Buddhist context (Bhattacharya 1964; Chhom 2016). The same century also saw the emergence of epigraphic traditions in Malay (Mahdi 2005; Griffiths 2018) and Pyu – a now extinct Tibeto-Burman language in Burma (Griffiths et al. 2017). From the beginning of the 9th century, Javanese also became a language of epigraphy. It was followed by Burmese in the late 11th century and Thai from the late 13th century. There is no evidence for early Indic-influenced writing or lexical borrowing in Vietnamese.

While several religious terms found entrance across Southeast Asia (cf. Table 3), the precise degree of “Indianization” differed across time and place. The Pyu language, for example, proved remarkably impenetrable to Indic loans (Griffiths et al. 2017). Indic-inspired but locally authored literature also came in different shapes. The premodern Burmese wrote innumerable works in Pali (Bode 1966), while the Thai, Javanese, and Malays produced localized versions of epic tales. After Theravāda Buddhism became the dominant religion in the Burmese kingdom of Pagan around the 11th century, contacts between Burmese and Sri Lankan Buddhists grew stronger than ever. As a consequence, Pali gained more status in Southeast Asia, especially after Cambodia and Thailand also institutionalized Theravāda Buddhism as the state religion by the 13th century. From this time, Pali served as the chief language of written communication between Sri Lankan and various Southeast Asian communities (Frasch 2017). In Khmer, several religious loanwords from Sanskrit were replaced by their Pali equivalents or co-existed with them as doublets. The online Dictionary of Old Khmer lists 3,367 Sanskrit loanwords and only 74 from Pali, yet the SEAlang Library Khmer Dictionary contains 3,228 Sanskrit and 5,290 Pali loans in modern usage. The hegemony of Pali did not extend to Maritime Southeast Asia, where Islam held sway from premodern times (except in Hindu-majority Bali). In the Cham-speaking areas, where Islam likewise became the dominant religion, many Sanskrit words were pushed into the shamanistic register of the language (Cabaton 1925).

Tab. 3: Religious concepts.

| Indic | Meaning | Khmer | Thai | Mon | Burmese | Balinese |
|---|---|---|---|---|---|---|
| Sk. cakra, Pa. cakka | ‘wheel, circle’ | cakra (ចក្រ) /caʔ/ | cakra (จักร) /càk/ | cak (စက်) /cɛk/ | cakra (စကြာ) /seʔʧa/ | cakra /cakrə/ |
| Sk. doṣa, Pa. dosa | ‘fault, flaw, sin’ | dosa (ទោស) /toːh/ | doṣa (โทษ) /tʰôːt/ | dus (ဒုဟ်) /tùh/ | dosa (ဒေါသ) /dɔ́ða̰/ | dosa /dosə/ |
| Sk. karma, Pa. kamma | ‘actions, fate, deeds’ | kamma (កម្ម) /kam/ | karma (กรรม) /kam/ | kaṁ (ကံ) /kɔm/ | kaṁ (ကံ) /kã/, krammā (ကြမ္မာ) /ʧəma/ | karma /karmə/ |
| Sk. mantra, Pa. manta | ‘sacred words, incantation, spell’ | manta (មន្ត) /mʊən/ | mant (มนต์) /mon/ | pa-man (ပမန်) /kəmɔn/ | manḥ (မန်း) /mã/ | mantra /mantrə/ |
| Sk. ṛṣi, Pa. isi | ‘hermit, sacred man’ | isi (ឥសិ) /ʔaisai/ | ṝṣī (ฤๅษี) /rɯːsǐː/ | risi (ရိသိ) /rəsɔeˀ/ | rase. (ရသေ့) /jəθḛi/ | reṣi /rəsi/ |
| Sk., Pa. sukha | ‘peace, happiness, pleasure’ | sukha (សុខ) /soʔ/ | sukha (สุข) /sùk/ | sukha (သုခ) /saoˀkʰaˀ/, suik (သိုက်) /sak/ | sukha (သုခ) /θṵkʰa̰/ | suka /sukə/ |

The large-scale adoption of Indic languages and loanwords brought about a social, political, and cultural transformation in Southeast Asia’s early centers of power. Sanskrit in particular served to legitimize and spiritually protect those in power, especially through the agency of Brahmans (Bronkhorst 2011). Several of Southeast Asia’s successful dynasties – such as Campā, Kambuja, Śailendra, and Śrī Vijaya – had Sanskrit names. Added to these were innumerable Sanskrit-derived toponyms, such as Maṅgalapūrī (មង្គលបូរី) /mʊəŋkʊəl ɓorɤj/ in Khmer, Ayudhayā (อยุธยา) /ʔajuttʰajaː/ and Sukhōdaya (สุโขทัย) /sukʰoːtʰai/ in Thai, and Dvāravatī in Old Mon. Aside from religion, lexical borrowing from Sanskrit and Pali also took place in the domains of culture, politics, science, law, and philosophy (Table 4).


Tab. 4: Non-religious loanwords.

| Indic | Meaning | Khmer | Thai | Burmese | Balinese |
|---|---|---|---|---|---|
| Sk., Pa. jaya | ‘victory’ | jaya (ជ័យ) /cej/ | jaya (ชัย) /cʰai/ | jeyya (ဇေယျ) /zeija̰/ | jaya /ɟajə/ |
| Sk., Pa. kāla | ‘time’ | kāla (កាល) /kaːl/ | kāla (กาล) /kaːn/ | kāla (ကာလ) /kala̰/ | kala /kalə/ |
| Sk. kāvya, Pa. kabba | ‘poetry, poem’ | kābya (កាព្យ) /kaːp/ | kāby (กาพย์) /kàːp/ | kabyā (ကဗျာ) /gəbja/ | kavi /kawi/ |
| Sk., Pa. rājā | ‘king’ | rāja (រាជ) /riəc/ | rāja (ราช) /râːt/ | rāja (ရာဇ) /jaza̰/ | raja /raɟə/ |
| Sk. sākṣī, Pa. sakkhi | ‘witness’ | sākṣī (សាក្សី) /saʔsɤj/ | sakkhī (สักขี) /sàkkʰǐː/ | sakse (သက်သေ) /θeʔθei/ | saksi |
| Sk. śāstra, Pa. sattha | ‘teaching’ | sāstra (សាស្ត្រ) /saːh/ | śāstr (ศาสตร์) /sàːt/ | sāttra (သျှတ္တရ) /ʃaʔtəra̰/ | sastra /sastrə/ |
| Sk. sattva, Pa. satta | ‘creature’ | satta (សត្ត) /sat/ | satv (สัตว์) /sàt/ | sattavā (သတ္တဝါ) /ðədəwa/ | satva /satwə/ |
| Sk. vaṁśa, Pa. vaṁsa | ‘lineage, race, family’ | baṅṣa (ពង្ស) /pʊəŋ/ | vaṅś (วงศ์) /woŋ/ | vaṁsa (ဝံသ) /wũθa̰/, vaṅ (ဝင်) /wĩ/ | baṅsa /baŋsə/ |
| Sk. varṣa, Pa. vassa | ‘year, rain’ | vassā (វស្សា) /woəhsaː/ | barṣā (พรรษา) /pʰansǎː/ | vā (ဝါ) /wa/ | varṣa /warsə/ |

Indic languages generally coexisted with Southeast Asia’s local languages in a context of diglossia, in which the former enjoyed greater prestige without necessarily being understood by the majority (Hunter 2011). Although nothing points to a widespread tradition of bilingualism between Indic and local languages, the high status of the former did give rise to Indianized registers of (previously) vernacular Southeast Asian languages. Throughout Mainland Southeast Asia, the first non-epigraphic use of local languages consisted of commentaries on Pali texts (nissaya) and glossaries of Sanskrit or Pali words (Okell 1965; Kasevic 2000; McDaniel 2008). Indic influence was manifest in the grammar, poetic meter, and of course in the writing system of these idioms. In addition, a number of “luxury loans” – lexical items for which local equivalents were readily available – entered Southeast Asia’s Indianized languages (Table 5).




Tab. 5: Luxury loans.

| Indic | Meaning | Khmer | Thai | Burmese | Malay |
|---|---|---|---|---|---|
| Sk. candra, Pa. canda | ‘moon’ | candra (ចន្ទ្រ) /can/ | candr (จันทร์) /can/ | candā (စန္ဒာ) /sà̃dà/ | cendera /cəndəra/ |
| Sk. manuṣya, Pa. manussa | ‘person, man’ | manussa (មនុស្ស) /mɔnʊh/ | manuṣy (มนุษย์) /mənút/ | manussa (မနုဿ) /mənoʊʔθa̰/ | manusia |
| Sk. mitra, Pa. mitta | ‘friend’ | mitta (មិត្ត) /mɯt/ | mitra (มิตร) /mít/ | mit (မိတ်) /meiʔ/ | mitra |
| Sk. prathama, Pa. paṭhama | ‘first, primary’ | paṭhama (បឋម) /ɓaʔtʰɑm/ | prathama (ประถม) /pràʔtʰǒm/, paṭhama (ปฐม) /pətʰǒm/ | pathama (ပထမ) /pətʰəma̰/ | pertama /pərtama/ |
| Sk., Pa. rūpa | ‘form, image’ | rūpa (រូប) /ruːp/ | rūpa (รูป) /rûːp/ | rup (ရုပ်) /joʊʔ/ | rupa |
| Sk. samudra, Pa. samudda | ‘sea, ocean’ | samudra (សមុទ្រ) /samot/ | samudra (สมุทร) /səmùt/ | samuddrā (သမုဒ္ဒရာ) /θəmoʊʔdəja/ | samudra |
| Sk. viṣa, Pa. visa | ‘venom, poison’ | bisa (ពិស) /pɯh/ | biṣa (พิษ) /pʰít/ | | bisa |

Numerals form a special category of luxury loans. For 1–10, some Southeast Asian languages exhibit an additional set of numbers restricted to the high linguistic registers (Table 6). In addition to these luxury borrowings, Sk. śūnya, Pa. suñña ‘zero’ has been adopted into Khmer as sūnya (សូន្យ) /soːn/, Thai śūny (ศูนย์) /sǔːn/, and Burmese suñña (သုည) /θoʊ̃ɲa̰/; Sk. lakṣa, Pa. lakkha ‘hundred thousand’ into Cham as lakśā, Khmer lakkha (លក្ខ) /leəʔ/, and Thai hlaka (หลัก) /làk/; and Sk., Pa. koṭi ‘ten million’ into Khmer as koṭi (កោដិ) /kaot/, Thai koṭi (โกฏิ) /kòːt/, and Burmese kuṭe (ကုဋေ) /gədei/. In Cham, kottik denotes a generic large number. The numerical value has also shifted in Malay, which displays laksa ‘ten thousand’, keti /kəti/ ‘hundred thousand’, and juta /ɟuta/ ‘million’ (Sk. ayuta ‘ten thousand’).


Tab. 6: Indic-derived numerals.

| Number | Indic (cardinal) | Formal Indonesian Malay (cardinal) | Formal Khmer (cardinal) | Pali (ordinal) | Burmese (ordinal) |
|---|---|---|---|---|---|
| 1 | Sk., Pa. eka | eka | eka (ឯក) /ʔaek/ | paṭhama | pathama (ပထမ) /pətʰəma̰/ |
| 2 | Sk., Pa. dvi | dwi | dvi (ទ្វី) /twiː/ | dutiya | dutiya (ဒုတိယ) /dṵtḭja̰/ |
| 3 | Sk. tri | tri | trī (ត្រី) /trai/ | tatiya | tatiya (တတိယ) /taʔtḭya̰/ |
| 4 | Sk. catur, Pa. catu | catur | catu (ចតុ) /caʔtoʔ/ | catuttha | catuttha (စတုတ္ထ) /zədoʊʔtʰa̰/ |
| 5 | Sk., Pa. pañca | panca | pañca (បញ្ច) /paɲcaʔ/ | pañcama | pañcama (ပဉ္စမ) /pjĩsəma̰/ |
| 6 | Sk. ṣaḍ-, Pa. cha | sad | cha (ឆ) /cʰɔː/ | chaṭṭhama | chaṭṭhama (ဆဌမ) /sʰaʔtʰəma̰/ |
| 7 | Sk. sapta, Pa. satta | sapta | satta (សត្ត) /sat/ | satthama | satthama (သတ္တမ) /θaʔtama̰/ |
| 8 | Sk. aṣṭā | asta | astā (អស្តា) /ʔasɗaː/ | aṭṭhama | aṭṭhama (အဌမ) /ʔaʔtʰama̰/ |
| 9 | Sk., Pa. nava | nawa | nava (នព) /nɔːp/ | navama | navama (နဝမ) /nəwəma̰/ |
| 10 | Sk. daśa, Pa. dasa | dasa | dasa (ទស) /tʊəh/ | dasama | dasama (ဒသမ) /da̰θəma̰/ |

Given the antiquity of borrowing, many Indic loanwords in Southeast Asian languages have become phonologically unrecognizable as such. For example, Sk. dravya ‘property, wealth’ is spelled identically in Mon and Burmese as drap (ဒြပ်), yet its modern pronunciations are /krɔ̀p/ and /dəraʔ/ respectively. The word is spelled as drabya (ទ្រព្យ) in Khmer and draby (ทรัพย์) in Thai, displaying the modern pronunciations /troap/ and /sáp/ respectively. In Javanese it has become duwe (from an earlier drəwe), the default word for ‘to possess’. Such phonological details can often shed light on the trajectory of loanwords. Examples of Mon-Burmese language contact involving Indic loanwords are given in Jenny (2012: 12–13); the Burmese word pūjō (ပူဇော်) /puzɔ/ ‘worship’ – ultimately from Sk., Pa. pūjā – must have been borrowed through Old Mon, in which the addition of /w/ after a long vowel is common (pujāw). Mon intermediacy is also likely for Burmese loanwords in which the word-final short vowel was dropped, such as buil (ဗိုလ်) /bo/ ‘strength’ and puid (ပိုဒ်) /paiʔ/ ‘stanza’, ultimately going back to Sk., Pa. bala and Sk., Pa. pada. Conversely, as the author shows, Mon yathā (ယထာ) /jətʰa/ ‘train’ must have entered the language through Burmese rathā: (ရထား) /jətʰá/ ‘carriage’ on account of its bisyllabicity and the palatal glide, which would have been unexpected had the word been directly adopted from Sk., Pa. ratha. In Cham, a similar process of short vowel dropping in Indic loanwords might be explained as a result of Khmer intermediacy, e.g. kal ‘time’ from Khmer kāla (កាល) /kaːl/ (Sk., Pa. kāla), lok ‘world’ from loka (លោក) /loːʔ/ (Sk., Pa. loka), and niak ‘snake’ from nāga (នាគ) /niəʔ/ (Sk., Pa. nāga). Much remains to be examined with regard to the phonological integration of loanwords in Southeast Asian languages from a comparative perspective, but see Henderson (1951).

Equally important are semantic shifts in Indic loans, which have received scholarly attention for Burmese (Houtman 1990: 10–11; Waxman and Soe 2014) and are equally common in other Southeast Asian languages. Consider for example Sk., Pa. bīja ‘grain, seed, semen’, which in Khmer obtained additional connotations of ‘race, species, family’ (būja (ពូជ) /puːc/), in Thai of ‘plants, vegetation’ (bīja (พืช) /pʰɯ̂ːt/), and in Burmese of ‘inherent character’ and ‘female genitals’ (bīja (ဗီဇ) /biza̰/). Scholarly failure to appreciate these novel vernacular meanings has been identified as “the Pali trap” (Houtman 1990: 21). By way of another example, the notoriously untranslatable concept of Sk. dharma, Pa. dhamma came to denote the ‘Buddhist teachings’ and ‘righteousness’ more generally in Khmer (dhamma (ធម្ម) /tʰoam/) and Thai (dharma (ธรรม) /tʰam/), while it came to refer chiefly to the ‘law’ (both religious and mundane) in Mon (dhav (ဓဝ်) /tʰɔ̀/) and Burmese (tarāḥ (တရား) /təjá/), and to ‘alms’ or ‘charity’ in Malay (darma /dərma/). Unlike modern Indian languages, none of the Southeast Asian languages use dharma primarily in the meaning of ‘religion’. For that concept, Sk. āgama ‘doctrine’ is used in Cham (āgama) and Malay (agama), Sk. śāsana, Pa. sāsana ‘teaching, doctrine’ in Khmer (sāsanā (សាសនា) /saːsnaː/), Thai (śāsanā (ศาสนา) /sàːtsanǎː/), and Burmese (sāsanā (သာသနာ) /θaðəna/), and Pa. bhāsā ‘language’ also in Burmese and Mon (bhāsā (ဘာသာ) /baða/ and /pʰɛ̀əsa/, respectively) but in the more specific meaning of professed denomination (Perrière 2017). Along similar lines, for the concept of a ‘nation’ or ‘race’, Sk., Pa. jāti ‘birth, caste, lineage’ is used in Khmer (jāti (ជាតិ) /ciət/) and Thai (jāti (ชาติ) /cʰâːt/), Sk. vaṁśa ‘lineage, race’ features in Malay (baŋsa) and Cham (baṅśā, baṅsā), while Burmese displays the inherited word amyuiḥ (အမျိုး) /ʔəmjó/, literally ‘kind, type’.


26.3 Vernacular languages

To understand the full scope of South Asian linguistic influence on Southeast Asian languages, one must also take into account secondary borrowings (i.e. through a third language) and influence from vernacular languages. As regards the former, both Burmese and Thai have borrowed heavily from the Old Mon language (Hla Pe 1960; Bradley 1980; Jenny 2012), which was spoken throughout much of Southeast Asia’s lowland areas before the arrival of Tibeto-Burman and Tai-Kadai speech communities from the north. Mon continued to be a language of prestige and epigraphy into the 16th century. Along the same lines, Khmer was studied alongside Sanskrit and Pali by Thai-speaking elites down to the 19th century and has profoundly influenced the Thai vocabulary (Nacaskul 1962; Varasarin 1984). Neither Thai, Khmer, nor Burmese seem to have adopted loanwords from Sinhala, reflecting the scholarly rather than vernacular nature of contact between Sri Lanka and Southeast Asia.

The orthography of Indic loanwords often contains clues to the directionality of borrowing. For example, Thai tāla (ตาล) /taːn/ ‘sugar palm’ corresponds to Sk., Pa. tāla, yet ɗārā (ดารา) /daːraː/ ‘star’ reflects borrowing through Khmer tārā (តារា) /ɗaːraː/ rather than directly from Sk., Pa. tārā. Likewise, Thai ɓañjī (บัญชี) /bancʰiː/ ‘list, catalogue’ reflects Khmer pañjī (បញ្ជី) /ɓɑɲciː/ ‘register, account-book, list’ rather than its ultimate precursor Sk. pañjī ‘almanac, calendar, register’. Some Sanskrit loanwords were borrowed twice into Thai, once through Khmer and once through Malay, yielding a number of lexical doublets (Table 7). For more recent borrowings, vernacular pronunciation often outweighs etymologically informed orthographies. For example, Burmese kanut (ကနုတ်) /kənoʊʔ/, the name of a traditional art design, was adopted from Thai kanaka (กนก) /kənòk/ in the same meaning; this Thai etymon ultimately reflects Sk., Pa. kanaka ‘gold’, yet displays phonological influence from Old Khmer (kᵊnɔːk) that is not reflected orthographically.

Tab. 7: Sanskrit-derived indirect borrowings in Thai (based on Krauße 2013: 58).

| Sanskrit | Through Khmer | Through Malay |
|---|---|---|
| candana (‘sandalwood’) | candan (ចន្ទន៍) /can/ > candan (จันทน์) /can/ | cendana /cəndana/ > cindāhnā (จินดาหนา) /cindaːnǎː/ |
| jīva (‘living, life’) | jība (ជីព) /ciːp/ > jība (ชีพ) /cʰîːp/ | jiwa /ɟiwa/ (‘soul, life, sweetheart’) > yihvā (ยิหวา) /jíʔwǎː/ (‘life, sweetheart’) |
| pati (‘lord, master, husband’) | ptī (ប្តី) /pɗɤj/ (‘husband’) > ɓaɗī (บดี) /bɔdiː/ (‘boss’) | pati (‘high officer’) > pātī (ปาตี) /paːtiː/ (‘important person’) |
| velā (‘time, hour of death’) | velā (វេលា) /weːliə/ (‘time’) > velā (เวลา) /weːlaː/ (‘time’) | bela (‘self-immolation’) > ɓǣhlā (แบหลา) /bɛːlǎː/ (‘suicide’) |




Unlike Southeast Asia’s other major languages, there is little evidence for direct borrowing from Indic languages into Vietnamese. However, several terms related to Buddhism entered that language through Middle Chinese (Table 8). The source language of these borrowings into Chinese was neither Sanskrit nor Pali, but a relatively conservative Middle-Indo-Aryan dialect used in Buddhist circles (cf. Coblin 1981; Karashima 1992; Levman 2018).

Tab. 8: Indic loanwords in Vietnamese.

| Indic | Meaning | Middle Chinese | Vietnamese |
|---|---|---|---|
| Sk., Pa. buddha | ‘enlightened’ | *but̚ da (佛陀) | phật đà |
| Sk. gandharva, Pa. gandhabba | ‘a heavenly being’ | *kan tʰat̚ bwa (乾闥婆) | càn thát bà |
| Sk., Pa. maṇḍala | ‘a cosmic symbol’ | *muɑn da la (曼陀羅) | mạn đà la |
| Sk. nirvāṇa, Pa. nibbāna | ‘nirvana’ | *nɛt̚ bwan (涅槃) | niết bàn |
| Sk., Pa. saṅgha | ‘(the Buddhist) community’ | *səŋ gɨa (僧伽) | tăng già |
| Pa. phalika | ‘crystal’ | *pʰwa liə̆ (玻璃) | pha lê |
| Pa. thūpa | ‘stupa, shrine, tower’ | *tʰap̚ (塔) | tháp |
| Pa. veḷuriya | ‘a precious pearl’ | *luw liə̆ (琉璃) | lưu li |
| Sk., Pa. yoga | ‘yoga’ | *juə̆ gɨa (瑜伽) | du già |

Not much is known about vernacular South Asian languages in ancient contact with Mainland Southeast Asia. The earliest known examples of Prakrit (Middle Indo-Aryan) influence in the region are the aforementioned 2nd-century CE clay objects from lower-central Burma. Prakrit-inscribed potsherds have also been identified in an early 1st millennium CE context in Sembiran, northern Bali (Ardika 1994: 141) and Khao Sam Keao, southern Thailand (Bellina and Silapanth 2006: 281). Further evidence can be deduced from loanwords. Some early examples of lexical borrowing can be found in Old Mon epigraphy; kayat ‘clerk, scribe’ seems to go back to *kāyattha ‘writer caste’ (Sk. kāyastha) and tr(u)k ‘Mongol’ to *turukka (Sk. turuṣka ‘Turk’). Other Prakritisms have been identified in Old Khmer inscriptions (Bhattacharya 1964; Pou 1986; Chhom 2016). A vernacular reflex of Sk. kartari, Pa. kattari ‘scissors’ was borrowed into Khmer as kantrai (កន្ត្រៃ) /kɑntrai/, Cham katrei, and Thai kartrai (กรรไตร) /kantrai/ or kankrai (กันไกร) /kankrai/; the unexpected word-final diphthong may have emerged under the influence of proto-Mon-Khmer *ray, *raay ‘to cut’ (cf. Headley 1976: 462). These examples notwithstanding, it appears that Middle Indo-Aryan influence has been more modest on the languages of Mainland Southeast Asia than on those of Maritime Southeast Asia (Hoogervorst 2017). Table 9, which is based on the work of Pou (1986) and Chhom (2016), lists a number of Middle Indo-Aryan loans attested in Old Khmer inscriptions that cannot be explained as Pali on phonological grounds.


Tab. 9: Middle Indo-Aryan loanwords in Khmer.

| Sanskrit | Middle Indo-Aryan | Old Khmer | Modern Khmer |
|---|---|---|---|
| amla (‘sour’) | *ambila | aṃvil, amvil (‘tamarind’) | ampil (អម្ពិល) /ʔɑmpɯl/ (‘sour, tamarind’) |
| asarūpa (‘not beautiful’) | *asarūva | asaru, asarū, assarū (‘bad’) | āsrūv (អាស្រូវ) /ʔaasrəv/ (‘bad’) |
| | *boll (‘to speak’) | vol, bol (‘to tell, to make known’) | bola (ពោល) /poːl/ (‘to tell, to say’) |
| gōpāla (‘cowherd’) | *gōvāla | gvāl | gvāla (ឃ្វាល) /kʰwiəl/ (‘to guard, to tend [animals]’) |
| kaṭāha (‘frying pan’) | *kaḍāha | kadāha | khdaḥ (ខ្ទះ) /kʰteəh/ |
| | *kaṭṭora (‘cup’) | kathor (‘a vessel’) | kanthora (កន្ថោរ) /kɑntʰao/ (‘spittoon’) |
| pratigraha (‘spittoon’) | *paḍiggaha | padigaḥ | |
| rūpa (‘form’) | *rūva | ru, ruva, rū, rūv, rūva (‘way, manner, mode’) | rū (រូ) /ruː/ (‘similar to’) |

Middle Indo-Aryan loanwords of the above type are not to be confused with more recent borrowings from the Hindi/Hindustani dialect continuum or a closely related New Indo-Aryan language. The relatively late acquisition of such loans through spoken language is obvious from their spelling, which matches the colloquial pronunciations rather than literary forms of the etyma (Table 10). Perhaps the most famous New Indo-Aryan loan in the Burmese language is luṁhkyaññ (လုံချည်) /loʊ̃ʤi/, Burma’s traditional wrap-around cloth (“longyi”), from the generic North Indian and ultimately Persian word luṅgī ‘a colored cloth’.



South Asian influence on the languages of Southeast Asia 


Tab. 10: New Indo-Aryan loanwords.

| New Indo-Aryan | Meaning | Khmer | Thai | Burmese | Malay |
|---|---|---|---|---|---|
| bīṛī | ‘a thin cigarette’ | pārī (បារី) /ɓaːrɤj/ | ɓuhrī (บุหรี่) /bùʔrìː/ | bhīdī (ဘီဒီ) /bidi/ | |
| gāñjā | ‘hemp, marijuana’ | kañjā (កញ្ឆា) /kɑɲcʰaː/ | kañjā (กัญชา) /kancʰaː/ | gangyā (ဂန်ဂျာ) /gãʤa/ | ganja /gaɲɟa/ |
| gulī | ‘pill, small round object’ | ghlī (ឃ្លី) /kliː/ (‘rubber ball, billiard ball’) | | gūlī (ဂူလီ) /guli/ (‘polo ball’) | |
| koṛī | ‘a score (20 pieces)’ | kāḷī (កាឡី) /kaːlɤj/ | kulī (กุลี) /kùʔliː/ | | kodi |
| kuñci, kuñji | ‘lock; key’ | kuñcae (កុញ្ចែ) /koɲcae/ | kuñcǣ (กุญแจ) /kunjɛː/ | | kunci |
| roṭī | ‘flatbread’ | rūdī (រូទី) /roːtiː/ | rōtī (โรตี) /roːtiː/ | ruitī (ရိုတီ) /joti/ | roti |

Numerous additional Hindi/Hindustani loans found their way into Malaya and Burma under British colonialism, some of which ultimately derive from European or Persian sources (Table 11).

Tab. 11: Hindustani loanwords in Burmese and Malay.

| Hindustani | Meaning | Burmese | Malay |
|---|---|---|---|
| cābuk | ‘whip’ | kyāpvat (ကျာပွတ်) /ʧabuʔ/ | cabuk |
| capātī | ‘a dry flatbread’ | hkyapātī (ချပါတီ) /ʧʰəpati/ | capati |
| dhobī | ‘washerman’ | duibhī (ဒိုဘီ) /dobi/ | dobi |
| golī | ‘marble’ | golī (ဂေါ်လီ) /gɔli/ | guli |
| parāṭhā | ‘a moist flatbread’ | palātā (ပလာတာ) /pəlata/ | perata /pərata/ |
| sūjī | ‘semolina’ | rvhekhyī (ရွှေချီ) /ʃweiʤi/ | suji /suɟi/ |
| ṭaṅkī | ‘tank (container)’ | tuiṅkī (တိုင်ကီ) /taĩki/ | tengki /tɛŋki/ |

Tamil would be another logical source to look for vernacular South Asian influence on the languages of Mainland Southeast Asia. Scholars have occasionally stressed the necessity to do so (Jembunathan 1929; Jenner 1970), yet I know of no publications dedicated to this topic. The ancient Tamil presence in Mainland Southeast Asia is confirmed by various inscriptions pointing to the presence of South Indian merchant guilds (Wisseman Christie 1998; Francis 2008–2009; Karashima and Subbarayalu 2009). Examples of Tamil loanwords in Old Khmer are kaṭṭi, kaṭṭī ‘unit of weight’ (Tamil kaṭṭi) and vannāra ‘unidentified slave function’ (Tamil vaṇṇārra- ‘washerman’) (Hoogervorst 2015). In a conference paper, Saveros Pou (1987) identifies some other tentative loans, yet also demonstrates that the overwhelming majority of Tamil borrowings entered the Khmer language in modern times. A widespread Tamil loanword in Southeast Asia not attested epigraphically is kappal ‘ship’, adopted as Khmer kapāl (កប៉ាល់) /kəppal/, Thai kāṁpan (กำปั่น) /kampan/, Cham kapal, and Malay kapal. The Tamil word tārai ‘trumpet’ – borrowed into Khmer as trae (ត្រែ) /trae/ and Thai trǣ (แตร) /trɛː/ – is unattested in Old Khmer, but can be found in Old Javanese as tarai. Some additional Tamil loans can be seen in Table 12.

Tab. 12: Tamil loanwords in Southeast Asia.

| Tamil | Khmer | Thai | Malay |
|---|---|---|---|
| kirāmpu (‘cloves’) | krāṁpū (ក្លាំពូ) /klampuu/ | kānblū (กานพลู) /kaːnpʰluː/ | |
| mālai (‘garland’) | mālaya (មាល័យ) /mieley/ | mālaya (มาลัย) /maːlai/ | malai |
| muṟi (‘piece of cloth, rough cloth’) | | mōrī (โมรี) /moːriː/ (‘a foreign cloth, kind of silk’) | muri (‘plain white linen or cotton fabric’) |
| tippali (‘long pepper’) | ṭīplī (ដីប្លី) /dəypliː/ | ɗīplī (ดีปลี) /diːpliː/ | |
| vayiram, vairam (‘diamond’) | | bairāṁ (ไพรำ) /pʰairam/ (‘gem, jewel, precious stone’) | beram, biram |

Like the Hindustani loanwords mentioned previously, Tamil borrowings in Burmese presumably date from the colonial period, when numerous people from southern India moved to Burma. Their spelling is likewise based on the colloquial pronunciation rather than Indic orthography (Table 13).

Tab. 13: Tamil loanwords in Burmese.

| Tamil | Meaning | Burmese |
|---|---|---|
| appam | ‘round cake’ | āpuṁ (အာပုံ) /ʔapoʊ̃/ |
| ceṭṭi | ‘a caste name’ | hkyactīḥ (ချစ်တီး) /ʧʰiʔtí/ |
| paḷḷi | ‘mosque’ | balī (ဗလီ) /bəli/ |
| tōcai | ‘rice flour pancake’ | tuihre (တိုရှေ) /toʃei/ |
| vīcai | ‘a unit of weight’ | vissā (ပိဿာ) /peiʔθa/ |




26.4 Language planning

Sanskrit and Pali have long facilitated lexical enrichment in Southeast Asian languages. From colonial times, the region’s literary languages – including Thai, although Thailand was never formally colonized – coined Indic-inspired neologisms to designate European novelties. This linguistic phenomenon has been described as hyper-Sanskrit and hyper-Pali (cf. McDaniel 2008: 187). Clear parallels existed elsewhere in Asia. The wasei-kango (和製漢語), for example, were words coined in Japan from Chinese morphemes to designate concepts borrowed from the West. In Java during the 1920s, a group of scholars unsuccessfully attempted to introduce such compounds as rata maruta ‘bicycle’ (Sk. ratha ‘cart’ + Sk. maruta ‘wind’) and rata pawaka ‘train’ (Sk. ratha ‘cart’ + Sk. pāvaka ‘fire’) at the expense of their Dutch-derived equivalents (S. S. 1933: 109). One example that has found some acceptance is palwa udara ‘airplane’ (Sk. plava ‘ship’ + Sk. udāra ‘high’).

At the same time, some of the language policies in colonial-era Southeast Asia aimed to diminish the omnipresent Indic influence (see also chapter 37). The Lao orthography, for example, was simplified under pressure from French administrators to make it less similar to Thai (Djité 2011: 33–35). As a result, Indic loanwords in Lao are spelled closer to their actual phonology compared to Thai and other Southeast Asian languages. Similar debates on the need to simplify conservative orthographies were held in Cambodia in the 1940s, while the benefits of reducing the Sanskrit and Pali element in the Khmer vocabulary to make way for “authentic” words were discussed in the 1960s (Forest 2008: 29). Nevertheless, the influence of Sanskrit and Pali has remained productive and prolific. These processes have received little scholarly attention of a comparative nature. Only one analysis is known to me of neologisms in Khmer (Jacob 1986).
Similar processes of (partly) Indic-inspired word formation took place in Thai, Lao, and Burmese. For example, Burmese exhibits cakrūp (စက်ရုပ်) /seʔ joʊʔ/ ‘robot’, which is a compound of ‘machine’ (Pa. cakka ‘wheel’) and ‘physical form’ (Sk., Pa. rūpa ‘form’). Thai exhibits ratha cakra yāna (รถจักรยาน) /rót càkrə-jaːn/ ‘bicycle’, consisting of Sk., Pa. ratha ‘cart’, Sk. cakra ‘wheel’, and Sk., Pa. yāna ‘vehicle’. While Thai, Lao, and Khmer often display similar neologisms, Burmese language planners appear to have operated in relative isolation (Table 14). None of the Mainland Southeast Asian coinages were adopted in Indonesia (or vice versa), where early-independence language planners likewise coined Sanskrit-inspired neologisms (Gonda 1973: 626–634).


Tab. 14: Indic-inspired neologisms.

| English | Lao | Thai | Khmer | Burmese |
|---|---|---|---|---|
| art | sinlapa (ສິນລະປະ) /sinlapa/ | śilpa (ศิลป) /sǐnləpàʔ/ | silpa (សិល្ប) /sɤl/ | anupaññā (အနုပညာ) /ʔənṵ pjĩɲa/ |
| (etymology) | Sk. śilpa ‘artistic work’ | Sk. śilpa ‘artistic work’ | Sk. śilpa ‘artistic work’ | Pa. aṇu ‘fine, subtle’ + Pa. paññā ‘knowledge’ |
| democracy | pajādhipatai (ປະຊາທິປະໄຕ) /pasaːtʰipatai/ | prajādhipatay (ประชาธิปัตย์) /pràʔcʰaːtʰíppətai/ | prajādhipateyya (ប្រជាធិបតេយ្យ) /prɑciətʰəpətai/ | |
| (etymology) | Sk. prajā ‘people’ + Pa. ādhipateyya ‘self-governance’ | Sk. prajā ‘people’ + Pa. ādhipateyya ‘self-governance’ | Sk. prajā ‘people’ + Pa. ādhipateyya ‘self-governance’ | |
| gender | bhed (ເພດ) /pʰeːt/ | beś (เพศ) /pʰêːt/ | bhed (ភេទ) /pʰeːt/ | liṅ (လိင်) /leĩ/ |
| (etymology) | Sk., Pa. bheda ‘division, kind’ | Sk., Pa. bheda ‘division, kind’ (Thai form possibly influenced by Sk. veṣa ‘dress, appearance’) | Sk., Pa. bheda ‘division, kind’ | Sk., Pa. liṅga ‘gender’ |
| geography | bhūmsād (ພູມສາດ) /pʰuːm saːt/ | bhūmiśāstr (ภูมิศาสตร์) /pʰuːmíʔsàːt/ | bhūmisāstra (ភូមិសាស្ត្រ) /pʰuːməsaːh/ | pathavīvaṅ (ပထဝီဝင်) /pətʰəwì wĩ/ |
| (etymology) | Sk., Pa. bhūmi ‘earth’ + Sk. śāstra ‘teaching’ | Sk., Pa. bhūmi ‘earth’ + Sk. śāstra ‘teaching’ | Sk., Pa. bhūmi ‘earth’ + Sk. śāstra ‘teaching’ | Pa. paṭhavī ‘earth’ + Pa. vaṁsa ‘tradition’ |
| materialism | vadthu niyom (ວັດຖຸນິຍົມ) /wattʰu niɲom/ | vatthuniyam (วัตถุนิยม) /wáttʰùʔníʔjom/ | sambhāraḥniyam (សម្ភារៈនិយម) /sɑmpʰiəreaʔ niʔyʊm/ | rūpvāda (ရုပ်ဝါဒ) /joʊʔ wada̰/ |
| (etymology) | Pa. vatthu ‘property’ + Sk., Pa. niyama ‘limitation’ | Pa. vatthu ‘property’ + Sk., Pa. niyama ‘limitation’ | Sk. sambhāra ‘materials’ + Sk., Pa. niyama ‘limitation’ | Sk., Pa. rūpa ‘appearance’ + Sk. vāda ‘-ism’ |
| PhD graduate | pandid (ບັນດິດ) /bandit/ | paṇḍit (บัณฑิต) /bandìt/ | paṇḍit (បណ្ឌិត) /ɓɑndɯt/ | pāragū (ပါရဂူ) /parəgu/ |
| (etymology) | Sk., Pa. paṇḍita ‘learned man’ | Sk., Pa. paṇḍita ‘learned man’ | Sk., Pa. paṇḍita ‘learned man’ | Pa. pāragū ‘accomplished’ |
| philosophy | padsayā (ປັດຊະຍາ) /patsaɲaː/ | prajñā (ปรัชญา) /pràtjaː/ | dassanavijjā (ទស្សនវិជ្ជា) /tʊəhsənaʔ wicciə/ | abhidhammā (အဘိဓမ္မာ) /ʔəbḭdəma/ |
| (etymology) | Sk. pratyaya ‘ascertainment, understanding’ | Sk. pratyaya ‘ascertainment, understanding’ | Pa. dassana ‘seeing’ + Pa. vijjā ‘science’ | Pa. abhidhamma ‘theory of the doctrine’ |
| psychology | cidta vithayā (ຈິດຕະວິທະຍາ) /citta witʰaɲaː/ | citavidyā (จิตวิทยา) /cìttəwíttʰəjaː/ | cittavijjā (ចិត្តវិជ្ជា) /cɤt wicciə/ | citpaññā (စိတ်ပညာ) /seiʔ pjĩɲa/ |
| (etymology) | Sk., Pa. citta ‘thinking, the mind’ + Sk. vidyā ‘science’ | Sk., Pa. citta ‘thinking, the mind’ + Sk. vidyā ‘science’ | Sk., Pa. citta ‘thinking, the mind’ + Pa. vijjā ‘science’ | Sk., Pa. citta ‘thinking, the mind’ + Pa. paññā ‘knowledge’ |
| science | vithayā sād (ວິທະຍາສາດ) /witʰaɲaː saːt/ | vidyāśāstr (วิทยาศาสตร์) /wíttʰəjaːsàːt/ | vijjāsāstra (វិទ្យាសាស្ត្រ) /wicciə saːh/ | sippaṁ (သိပ္ပံ) /θeiʔpã/ |
| (etymology) | Sk. vidyā ‘science’ + Sk. śāstra ‘teaching’ | Sk. vidyā ‘science’ + Sk. śāstra ‘teaching’ | Pa. vijjā ‘science’ + Sk. śāstra ‘teaching’ | Pa. sippa ‘branch of knowledge’ |
| technology | vijākān (ວິຊາການ) /wisaːkaːn/ | | paccekavijjā (បច្ចេកវិជ្ជា) /paccaiʔ wicciə/ | nañḥpaññā (နည်းပညာ) /ní pjĩɲa/ |
| (etymology) | Sk. vidyākara ‘doing science’ | | Pa. pacceka ‘by itself’ + Pa. vijjā ‘science’ | Pa. naya ‘method’ + Pa. paññā ‘knowledge’ |
| university | vithayālai (ວິທະຍາໄລ) /witʰaɲaːlai/ | mahāvidyālaya (มหาวิทยาลัย) /mahǎː wíttʰəjaːlai/ | sākalavidyālaya (សាកលវិទ្យាល័យ) /saːkɑl wicciəlai/ | takkasuil (တက္ကသိုလ်) /teʔkəθo/ |
| (etymology) | Sk. vidyālaya ‘abode of learning’ | Sk., Pa. mahā ‘great’ + Sk. vidyālaya ‘abode of learning’ | Sk., Pa. sakala ‘all’ + Sk. vidyālaya ‘abode of learning’ | Pa. takkasilā ‘a historical center of education’ |


The use of Sanskrit to create novel vocabulary is equally common in South Asia, in particular India, Nepal, and Sri Lanka. Here, again, no comparative research is known to me on different national or regional language policies and possible transnational networks between language engineers. A relatively small number of neo-Sanskritisms has been adopted in South and Southeast Asia alike. Sk. dūradarśana, Pa. dūradassana ‘seeing far’ provided a viable Indic alternative for ‘television’, e.g. Hindi dūrdarśan, Khmer dūradassan (ទូរទស្សន៍) /tuːrəʔtʊəh/, and Thai dōradaśan (โทรทัศน์) /tʰoːrətʰát/. However, Sinhala displays the Sanskritism rūpavāhiniya (Sk. rūpa ‘form, appearance’ + Sk. vāhinī ‘channel’) and Burmese has rup myaṅ saṁ kyā: (ရုပ်မြင်သံကြား) /joʊʔ mjĩ θã ʤa/ ‘see pictures, hear sounds’, the first element of which also goes back to Sk., Pa. rūpa. For ‘station’, Sk. sthānīya ‘local’ has been adopted in Khmer (sthānīy (ស្ថានីយ) /stʰaːniː/) and Thai (sathānī (สถานี) /sətʰǎːniː/), while the etymologically related form Sk. sthāna ‘standing, place’ features in this meaning in Hindi (sthān) and Sinhala (sthāna-ya).

In most cases, however, the neo-Sanskritisms of South and Southeast Asia are different. The Sanskrit-inspired word for ‘telephone’ in Khmer (dūrasabda (ទូរស័ព្ទ) /tuːrəsap/) and Thai (dōraśabd (โทรศัพท์) /tʰoːrəsàp/) consists of Sk. dūra ‘far’ + Sk. śabda ‘sound’, while most Indian languages exhibit dūravāṇi (Sk. dūra ‘far’ + Sk. vāṇi ‘sound’) as a not-too-popular alternative to the direct English loan. Common Sanskrit-derived words in India for ‘art’ (Sk. kalā ‘any practical art’), ‘airplane’ (Sk. vimāna ‘mythological flying machine’), ‘democracy’ (Sk. loka ‘people’ + tantra ‘system, rule’), ‘geography’ (Sk. bhūgola ‘globe’), ‘materialism’ (Sk. bhautika ‘material’ + vāda ‘-ism’), ‘philosophy’ (Sk. darśana ‘seeing, philosophical system’), ‘psychology’ (Sk. mano ‘mind’ + vijñāna ‘understanding’), ‘science’ (Sk. vijñāna ‘understanding’), and ‘university’ (Sk. viśva ‘universal’ + vidyālaya ‘abode for learning’) are not used in these meanings in Southeast Asia.

The word order, which is noun + adjective in Khmer and Thai, also influences Indic-inspired neologisms. For example, the neologism ‘spaceship’ in Khmer (yāna avakāsa (យានអវកាស) /jiən ʔawəkaːh/) and Thai (yāna avakāśa (ยานอวกาศ) /jaːn ʔəwəkàːt/) consists of Pa. yāna ‘vehicle’ + Sk. avakāśa, Pa. avakāsa ‘space’. From an Indo-Aryan grammatical point of view, the expected word order would be reversed, and Burmese indeed exhibits ākāsa yāñ (အာကာသယာဉ်) /ʔakaθa̰ yĩ/ (Pa. ākāsa ‘sky’ + Pa. yāna ‘vehicle’), while Hindi has antarikṣ yān (Sk. antarikṣa ‘atmosphere’ + Sk. yāna ‘vehicle’). Along the same lines, Khmer manussa yanta (មនុស្សយន្ត) /mɔnʊh jʊən/ ‘robot’ (Pa. manussa ‘human’ + yanta ‘machine’) displays the opposite word order of its Hindi equivalent yantramānav (Sk. yantra ‘machine’ + Sk. mānava ‘human’).

Equally common are compound neologisms consisting of an Indic and a non-Indic element. This phenomenon has been described for Burmese (Okell 1969: 53–54; Wheatley and San 1999: 65; Waxman and Soe 2014: 263–264) and is also seen in other Southeast Asian languages. Khmer yanta hoḥ (យន្តហោះ) /jʊən hɑh/ ‘airplane’ consists of ‘machine’ (Pa. yanta) and ‘to fly’, while Thai hun yant (หุ่นยนต์) /hun jon/ ‘robot’ combines ‘puppet’ and ‘machine’ (Pa. yanta). Several Southeast Asian languages combine Sk., Pa. ratha ‘cart’ with the indigenous word for ‘fire’ to designate the late-colonial concept of a steam train, e.g. Khmer ratha ploeṅ (រថភ្លើង) /rʊət plɤːŋ/, Thai ratha fai (รถไฟ) /rót fai/, Lao rod fai (ລົດໄຟ) /lot fai/, and Burmese mī: rathā: (မီးရထား) /mí jətʰá/.

26.5 Concluding remarks

Influence from South Asia on the languages of Southeast Asia can be seen from – and in fact gave rise to – the region’s earliest epigraphy. Across Southeast Asia’s ancient political centers, Sanskrit and Pali loanwords were adopted along with religious, scholarly, and cultural ideas from the subcontinent. Such borrowings typically followed the Indic orthography, even though their pronunciations were generally adjusted to the recipient phonological system. A closer historico-phonological analysis often reveals patterns of indirect borrowing, for example through Mon into Burmese or through Khmer into Thai. Pali established itself as the dominant language of religious expression from the 11th century, yet was never able to fully supersede Sanskrit in most places.

The languages of Southeast Asia show some evidence of vernacular influence from Middle Indo-Aryan languages (Prakrit), Tamil, and in later times New Indo-Aryan languages such as Hindustani. Many of these loanwords were spelled on the basis of their pronunciation at the time of adoption rather than their “correct” Indic orthography, providing valuable insights into the phonological development of the donor as well as recipient languages. Such complex trajectories of borrowing occasionally yielded lexical doublets, as has been shown for Thai and is equally true for other Southeast Asian languages. In Vietnamese, Southeast Asia’s only major language with no history of direct contact with South Asian languages, Indic influence chiefly entered through Middle Chinese.

Both Sanskrit and Pali can easily be used to form neologisms, a feature amply exploited by the language engineers of 20th-century Burma, Thailand, Cambodia, and Laos. Here, we see that Khmer, Thai, and Lao tend to exhibit many of the same neologisms, presumably following the initiatives of Thai language planners. Their Burmese colleagues seem to have coined their new terms in isolation (and with a preference for Pali rather than Sanskrit). Neither group was substantially influenced by what language engineers in India, Sri Lanka, or Indonesia were doing at roughly the same time. To this day, South Asian influence on Southeast Asia remains an ongoing process in all countries except Vietnam, particularly in the stylistic registers of religion, literature, and scholarship.

In addition to language planning policies, several shared developments have come to the fore that would benefit from more comparative future research. Southeast Asia’s complex trajectories of lexical borrowing reflect the nature of inter-ethnic contact over the millennia. It would be interesting to investigate in more detail the spread and historicity of semantic shifts. To give just one example, Sk. śilpa, Pa. sippa has come to refer to the arts in Lao, Thai, and Khmer, yet to the sciences in Burmese (and was never adopted in the languages of Maritime Southeast Asia). At this moment, much scholarship remains either country-specific or language-specific. The logical next challenge will be to identify intraregional and transregional patterns and look for connections that go beyond present-day borders.

Acknowledgements: I am indebted to Kunthea Chhom for her valuable insights on the Khmer data.

Appendix: Languages referred to

| Language | Source |
|---|---|
| Balinese | Barber (1979) |
| Burmese | Myanmar (1996) |
| Cham | Aymonier and Cabaton (1906) |
| Hindi/Hindustani | McGregor (1993) |
| Khmer | Headley et al. (1997) |
| Malay | Wilkinson (1932) |
| Middle Chinese | Pulleyblank (1991) |
| Middle Indo-Aryan | Turner (1966) |
| Mon | Shorto (1962) |
| Old Cham | Golzio (2004) |
| Old Javanese | Zoetmulder (1982) |
| Old Khmer | Jenner (2009) |
| Old Mon | Shorto (1971) |
| Pali | Rhys Davids and Stede (1966) |
| Proto-Indo-Aryan | Turner (1966) |
| Proto-Mon-Khmer | Shorto (2006) |
| Sanskrit | Monier-Williams (1899) |
| Tamil | Tamil (1924–1936) |
| Thai | Photjananukrom (1999) |

References

Ardika, I. Wayan. 1994. Early evidence of Indian contact with Bali. In Pierre-Yves Manguin (ed.), Southeast Asian Archaeology 1994: Proceedings of the 5th International Conference of the European Association of Southeast Asian Archaeologists. Paris, 24th–28th October 1994. Volume 1, 139–145. Hull: University of Hull.

Aung Thaw. 1968. Report on the excavations at Beikthano. Rangoon: Revolutionary Government of the Union of Burma, Ministry of Union Culture.

Aymonier, Étienne & Antoine Cabaton. 1906. Dictionnaire Čam-Français. Paris: Ernest Leroux.




Barbe, Henry Lewis St. 1879. Pali derivations in Burmese. Journal of the Asiatic Society of Bengal 48(4). 253–258.

Barber, Charles Clyde. 1979. Dictionary of Balinese – English (Aberdeen University Library Occasional Publications No. 2). Aberdeen: University of Aberdeen.

Bauer, Christian. 2018. The Mon inscriptions of Thailand, Laos and Burma. In Daniel Perret (ed.), Writing for eternity: A survey of epigraphy in Southeast Asia, 135–149. Paris: l’École française d’Extrême-Orient.

Bellina, Bérénice. 2007. Cultural exchange between India and Southeast Asia: Production and distribution of hard stone ornaments, VIc. BC–VIc. AD. Paris: Editions de la Maison des sciences de l’homme.

Bellina, Bérénice & Praon Silapanth. 2006. Weaving cultural identities on trans-Asiatic networks: Upper Thai-Malay Peninsula – An early socio-political landscape. Bulletin de l’École française d’Extrême-Orient 93. 257–293.

Benedict, Paul. 1975. Austro-Thai language and culture with a glossary of roots. New Haven, CT: Hraf Press.

Bhattacharya, Kamaleswar. 1964. Recherches sur le vocabulaire des inscriptions sanskrites du Cambodge. Bulletin de l’École française d’Extrême-Orient 52(1). 1–72.

Bode, Mabel Haynes. 1966. The Pali literature of Burma. London: The Royal Asiatic Society of Great Britain and Ireland.

Bradley, David. 1980. Phonological convergence between languages in contact: Mon-Khmer structural borrowing in Burmese. Berkeley Linguistic Society 6. 259–267.

Bronkhorst, Johannes. 2011. The spread of Sanskrit in Southeast Asia. In Pierre-Yves Manguin, A. Mani & Geoff Wade (eds.), Early interactions between South and Southeast Asia: Reflections on cross-cultural exchange, 263–275. Singapore: Institute of Southeast Asian Studies; New Delhi: Manohar.

Cabaton, Antoine. 1925. À propos d’une langue spéciale de l’Indochine. Études Asiatiques 1. 103–123.

Castillo, Cristina Cobo, Bérénice Bellina & Dorian Q. Fuller. 2016. Rice, beans and trade crops on the early maritime silk route in Southeast Asia. Antiquity 90(353). 1255–1269.

Chhom, Kunthea. 2016. Le rôle du sanskrit dans le développement de la langue khmère: Une étude épigraphique du VIe au XIVe siècle. Paris: Inalco PhD thesis.

Coblin, W. South. 1981. Notes on the dialect of the Han Buddhist transcriptions. In Proceedings of the International Conference on Sinology: Section on linguistics and paleography, 121–183. Taipei: Academia Sinica.

Cœdès, George. 1956. Nouvelles données sur les origines du royaume Khmèr: La stèle de Văt Luong Kău près de Văt P’hu. Bulletin de l’École française d’Extrême-Orient 48. 209–220.

Djité, Paulin G. 2011. The language difference: Language and development in the Greater Mekong sub-region. Bristol, Buffalo & Toronto: Multilingual Matters.

Forest, Alain. 2008. Buddhism and reform: Imposed reforms and popular aspirations. Some historical notes to aid reflection. In Alexandra Kent & David Chandler (eds.), People of virtue: Reconfiguring religion, power and moral order in Cambodia today, 16–34. Copenhagen: NIAS Press.

Francis, Emmanuel. 2008–2009. Une inscription tamoule inédite au musée d’histoire du Vietnam de Hô Chi Minh-Ville. Bulletin de l’École française d’Extrême-Orient 95/96. 409–423.

Frasch, Tilman. 2017. A Pāli cosmopolis? Sri Lanka and the Theravāda Buddhist ecumene, c. 500–1500. In Zoltán Biedermann & Alan Strathern (eds.), Sri Lanka at the crossroads of history, 66–76. London: UCL Press.

Frasch, Tilman. 2018. Myanmar epigraphy – Current state and future tasks. In Daniel Perret (ed.), Writing for eternity: A survey of epigraphy in Southeast Asia, 48–71. Paris: l’École française d’Extrême-Orient.


Fuller, Dorian Q., Nicole Boivin, Cristina Cobo Castillo, Tom Hoogervorst & Robin G. Allaby. 2015. The archaeobiology of Indian Ocean translocations: Current outlines of cultural exchanges by proto-historic seafarers. In Sila Tripati (ed.), Maritime contacts of the past: Deciphering connections amongst communities, 1–23. New Delhi: Delta Book World.

Gedney, William J. 1947. Indic loanwords in spoken Thai. New Haven, CT: Yale University PhD thesis.

Glover, Ian C. & Bérénice Bellina. 2011. Ban Don Ta Phet and Khao Sam Kaeo: The earliest Indian contacts re-assessed. In Pierre-Yves Manguin, A. Mani & Geoff Wade (eds.), Early interactions between South and Southeast Asia: Reflections on cross-cultural exchange, 17–45. Singapore: Institute of Southeast Asian Studies; New Delhi: Manohar.

Golzio, Karl-Heinz (ed.). 2004. Inscriptions of Campā based on the editions and translations of Abel Bergaigne, Étienne Aymonier, Louis Finot, Édouard Huber and other French scholars and of the work of R. C. Majumdar: Newly presented, with minor corrections of texts and translations, together with calculations of given dates. Aachen: Shaker Verlag.

Gonda, Jan. 1973. Sanskrit in Indonesia, 2nd edn. New Delhi: International Academy of Indian Culture.

Griffiths, Arlo & D. Christian Lammerts. 2015. Epigraphy: Southeast Asia. In Jonathan A. Silk, Oskar von Hinüber & Vincent Eltschinger (eds.), Brill’s encyclopedia of Buddhism. Volume one: Literature and languages, 988–1009. Leiden: Brill.

Griffiths, Arlo. 2018. The corpus of inscriptions in Old Malay language. In Daniel Perret (ed.), Writing for eternity: A survey of epigraphy in Southeast Asia, 275–283. Paris: l’École française d’Extrême-Orient.

Griffiths, Arlo, Bob Hudson, Marc Miyake & Julian K. Wheatley. 2017. Studies in Pyu epigraphy, I: State of the field, edition and analysis of the Kan Wet Khaung Gon inscription, and inventory of the corpus. Bulletin de l’École française d’Extrême-Orient 103. 43–205.

Headley, Robert K. 1976. Some sources of Chamic vocabulary. Oceanic Linguistics Special Publications 13. 453–476.

Headley, Robert K., Rath Chim & Ok Soeum. 1997. Cambodian-English dictionary. Kensington: Dunwoody Press. http://sealang.net/ (last accessed April 2018).

Henderson, Eugénie J. A. 1951. The phonology of loanwords in some South East Asian languages. Transactions of the Philological Society 1951. 131–158.

Hla Pe. 1960. Some adapted Pali loan words in Burmese. Burma Research Society; Fiftieth Anniversary Publication No. 1. 71–100.

Hoogervorst, Tom G. 2015. Detecting pre-modern lexical influence from South India in Maritime Southeast Asia. Archipel 89. 63–93.

Hoogervorst, Tom G. 2017. The role of “Prakrit” in Maritime Southeast Asia through 101 etymologies. In Andrea Acri, Roger Blench & Alexandra Landmann (eds.), Spirits and ships: Cultural transfers in early Monsoon Asia, 375–440. Singapore: ISEAS – Yusof Ishak Institute.

Houghton, Bernard. 1893. Sanskrit words in the Burmese language. The Indian Antiquary 22. 24–27.

Houtman, Gustaaf. 1990. Traditions of Buddhist practice in Burma. London: School of Oriental and African Studies, London University PhD thesis.

Htin, Kyaw Minn & Jacques P. Leider. 2018. The epigraphic archive of Arakan/Rakhine State (Myanmar): A survey. In Daniel Perret (ed.), Writing for eternity: A survey of epigraphy in Southeast Asia, 73–85. Paris: l’École française d’Extrême-Orient.

Hunter, Thomas. 2011. Exploring the role of language in early state formation of Southeast Asia. Nalanda-Sriwijaya Centre Working Paper Series 7.

Jacob, Judith M. 1977. Sanskrit loanwords in pre-Angkor Khmer. Mon-Khmer Studies 6. 151–168.

Jacob, Judith M. 1986. The deliberate use of foreign vocabulary by the Khmer: Changing fashions, methods and sources. In Mark Hobart & Robert H. Taylor (eds.), Context, meaning and power in Southeast Asia, 115–129. Ithaca, NY: Southeast Asia Program, Cornell University.




Jembunathan, S. 1929. A prolegomenon to the study of Burmese etymology. The Journal of Oriental Research Madras 3. 135–139.

Jenner, Philip N. 1970. Tamil studies and Mon-Khmer linguistics. In International Association of Tamil Research: Proceedings of the Third International Conference Seminar, 207–211. Pondicherry: Institut Français d’Indologie.

Jenner, Philip N. 2009. Dictionary of pre-Angkorian Khmer and dictionary of Angkorian Khmer. Canberra: Pacific Linguistics Series 597. http://sealang.net/ (last accessed April 2018).

Jenny, Mathias. 2012. The Mon language: Recipient and donor between Burmese and Thai. Journal of Language and Culture 31(2). 5–33.

Karashima, Noboru & Y. Subbarayalu. 2009. Ancient and medieval Tamil and Sanskrit inscriptions relating to Southeast Asia and China. In Hermann Kulke, K. Kesavapany & Vijay Sakhuja (eds.), Nagapattinam to Suvarnadwipa: Reflections on the Chola naval expeditions to Southeast Asia, 271–291. Singapore: Institute of Southeast Asian Studies.

Karashima, Seishi. 1992. The textual study of the Chinese versions of the Saddharmapuṇḍarīkasūtra in the light of the Sanskrit and Tibetan versions. Tokyo: The Sankibo Press.

Kasevic, Vadim B. 2000. Indian influence on the linguistic tradition of Burma. In Sylvain Auroux, Ernst Frideryk Konrad Koerner, Hans-Josef Niederehe & Kees Versteegh (eds.), History of the language sciences, 182–185. Berlin & New York: Walter de Gruyter.

Krauße, Daniel. 2013. Language contact: Malay influence on Thai. Frankfurt am Main: Goethe University BA thesis.

Krishnamurti, Bhadriraju. 2009. The Dravidian languages. Cambridge: Cambridge University Press.

Kuiper, Franciscus Bernardus Jacobus. 1948. Proto-Munda words in Sanskrit. Amsterdam: Noord-Hollandsche Uitgevers Maatschappij.

Kuiper, Franciscus Bernardus Jacobus. 1991. Aryans in the Rigveda. Amsterdam & Atlanta: Rodopi.

Levman, Bryan. 2018. The transmission of the Buddhadharma from India to China: An examination of Kumārajīva’s transliteration of the Dhāraṇīs of the Saddharmapuṇḍarīkasūtra. In Ann Heirmann, Carmen Meinert & Christoph Anderl (eds.), Buddhist encounters and identities across East Asia, 137–195. Leiden: Brill.

Lévy, Sylvain, Jean Przyluski & Jules Bloch (eds.). 1975. Pre-Aryan and pre-Dravidian in India. Translated by Prabodh Chandra Bagchi. Calcutta: University of Calcutta.

Mahdi, Waruno. 2005. Old Malay. In Alexander Adelaar & Nikolaus P. Himmelmann (eds.), The Austronesian languages of Asia and Madagascar, 182–201. London & New York: Routledge.

Marrison, George E. 1975. The early Cham language, and its relationship to Malay. Journal of the Malaysian Branch of the Royal Asiatic Society 48(2). 52–59.

Martini, François. 1954. De la réduction des mots sanskrits passés en cambodgien. Bulletin de la Société de Linguistique de Paris 50(1). 244–261.

Mayrhofer, Manfred. 1956–1980. Kurzgefaßtes etymologisches Wörterbuch des Altindischen, 4 vols. Heidelberg: Carl Winter.

McDaniel, Justin Thomas. 2008. Gathering leaves & lifting words: Histories of Buddhist monastic education in Laos and Thailand. Seattle & London: University of Washington Press.

McGregor, Ronald Stuart. 1993. The Oxford Hindi-English dictionary. Oxford & New York: Oxford University Press.

Ménétrier, Ernest. 1985. Le vocabulaire Cambodgien dans ses rapports avec le sanscrit et le pali. Paris: Centre de Documentation et de Recherche sur la Civilisation Khmère.

Miyamoto, Tadao. 1992. Truncation of Sanskrit and Pali loanwords in Thai. In Pan-Asiatic linguistics: Proceedings of the Third International Symposium on Language and Linguistics. Volume II, 869–882. Bangkok: Chulalongkorn University.

Monier-Williams, Monier. 1899. A Sanskṛit-English dictionary: Etymologically and philologically arranged with special reference to cognate Indo-European languages. Oxford: Clarendon Press.


Myanmar. 1993. Myanmar-English dictionary. Kensington: Dunwoody Press. Republication. http://sealang.net/ (last accessed April 2018).

Nacaskul, Karnchana. 1962. A study of cognate words in Thai and Cambodian. London: University of London MA thesis.

Navawongs, M. L. Chirayu. 1975. Sanskrit and Thailand. In V. Raghavan (ed.), Proceedings of the First International Sanskrit Conference, 347–352. New Delhi: Ministry of Education and Social Welfare.

Okell, John. 1965. Nissaya Burmese: A case of systematic adaptation to a foreign grammar and syntax. Lingua 15. 186–227.

Okell, John. 1969. A reference grammar of colloquial Burmese. London: Oxford University Press.

Osada, Toshiki. 2009. How many Proto-Munda words in Sanskrit? With special reference to agricultural vocabulary. In Toshiki Osada (ed.), Linguistics, archaeology and human past in South Asia, 127–146. New Delhi: Manohar.

Pengpala, Pathana. 1998. The change in meaning of Pali and Sanskrit words used in Thai. In Udom Warotamasikkhadit & Thanyarat Panakul (eds.), Papers from the Fourth Annual Meeting of the Southeast Asian Linguistics Society, 165–176. Tempe, AZ: Arizona State University, Program for Southeast Asian Studies.

Perrière, Bénédicte Brac de la. 2017. About Buddhist Burma: Thathana, or “religion” as social space. In Michel Picard (ed.), The appropriation of religion in Southeast Asia and beyond, 39–66. New York: Palgrave Macmillan.

Photjananukrom Chabap Ratchabandittayasathan. 1999. Royal Institute Thai Dictionary. Krungthep: Nanmi Book Publishing. http://www.royin.go.th/dictionary/ (last accessed April 2018).

Pollock, Sheldon. 2006. The language of the gods in the world of men: Sanskrit, culture, and power in premodern India. Berkeley: University of California Press.

Pou, Saveros. 1986. Prākrit loan-words in Old Khmer. Ṛtam 16–18. 259–267.

Pou, Saveros. 1987. Dravidian loanwords in Khmer. Presented at the Sixth International Conference of Tamil Studies, November 1987, Kuala Lumpur, Malaysia.

Pryce, Thomas Olivier. 2014. Metallurgy in Southeast Asia. In Helaine Selin (ed.), Encyclopaedia of the history of science, technology, and medicine in non-Western cultures, 3144–3159. Dordrecht: Springer.

Pulleyblank, Edwin G. 1991. Lexicon of reconstructed pronunciation of early Middle Chinese, late Middle Chinese, and early Mandarin. Vancouver: UBC Press.

Rhys Davids, Thomas William & William Stede (eds.). 1966. The Pali Text Society’s Pali-English dictionary. London: Luzac. [Reprint].

S. S. 1933. Pengaroeh bahasa asing kepada bahasa-bahasa Indonesia. II. Poedjangga Baroe 1. 109–111.

Sarapadnuke, Chamlong. 1975. Sanskrit words in the Thai language. In Venkataraman Raghavan (ed.), Proceedings of the First International Sanskrit Conference, 353–361. New Delhi: Ministry of Education and Social Welfare.

Shastri, Satya Vrat (ed.). 2005. Sanskrit words in South-east Asian languages. Mumbai: Somaiya Publications.

Shorto, Harry L. 1962. Dictionary of modern spoken Mon. Oxford: Oxford University Press. http://sealang.net/ (last accessed April 2018).

Shorto, Harry L. 1971. A dictionary of the Mon inscriptions from the sixth to the sixteenth centuries. London: Oxford University Press.

Shorto, Harry L. 2006. A Mon-Khmer comparative dictionary. Main editor: Paul Sidwell, assisting editors: Doug Cooper & Christian Bauer. Canberra: Pacific Linguistics Series 579.

Southworth, Franklin C. 2005. Linguistic archaeology of South Asia. London & New York: RoutledgeCurzon.





Mark J. Alves

27 Linguistic influence of Chinese in Southeast Asia

27.1 Introduction

The Sinitic branch of Sino-Tibetan has been spreading to the south of its original Yellow River and central plains homeland for three millennia. Southern China, which is today solidly “Chinese” linguistic and cultural territory, is the starting point of language contact and linguistic influence, which later extended into both Mainland Southeast Asia (MSEA hereafter) and Insular Southeast Asia (ISEA). This connection between the languages of the general region makes it necessary to discuss Southern China together with Greater Southeast Asia.

A broad characterization of the historical linguistic situation is as follows. (a) Sinitic entirely replaced languages previously spoken in large portions of modern-day Southern China. (b) Widespread typological convergence occurred in which Sinitic itself was morphophonologically restructured along with Kradai, Hmong-Mien, and Vietic. (c) Lexical borrowing from Chinese was most intense inside that core convergence area (sometimes referred to as the ‘Sinosphere’ [cf. Matisoff 1990]) in the early period. In MSEA and ISEA, later waves of Chinese immigrants, as well as Tai and Vietic languages, further spread Chinese loanwords in those regions, often with an impact on semantic domains in the recipient languages.

The remainder of Section 27.1 summarizes the history of the spread of Sinitic (27.1.1), the types of language contact with Sinitic (27.1.2), and the degrees of linguistic influence (27.1.3). Section 27.2 covers the types of linguistic influence inside and outside the core convergence area, including the lexical influence on semantic domains and various types of structural change.

A note on the terms “Sinitic” and “Chinese”: The term “Chinese” is problematic as it can be used with multiple meanings. In this paper, “Chinese” is used to refer broadly to (a) Chinese culture and ethnicity and (b) varieties of Sinitic after sub-branching. The term “Sinitic”, on the other hand, is used to refer to (a) the early stages of Chinese languages (what is often called “Old Chinese”) or (b) the periods of early language contact in Southern China in the first half of the 1st millennium CE (into “Early Middle Chinese”), generally prior to sub-branching.

https://doi.org/10.1515/9783110558142-027


27.1.1 A brief history of the spread of Sinitic: multiple waves and indirect influence

Historical records and archaeological data allow for a reasonably clear overview of the periods of Sinitic southward migration. The main periods are broadly characterized in Table 1, which shows the general destinations of these migrations. Each period is summarized below.

Tab. 1: Periods of expansion by speakers of Sinitic languages.

| Period | Destinations of Sinitic migrations |
| --- | --- |
| Pre-Qin Dynasty (mid- to late 1st millennium BCE) | Early expansion into southern China |
| Han Dynasty (206 BCE to 220 CE) | Initial widespread settlement of Sinitic groups in Southern China and northern Vietnam |
| Post-Han 1st millennium CE | Establishment of modern range of Sinitic speech communities |
| Early to mid-2nd millennium CE | Settlement of Chinese speakers in MSEA and ISEA, migration of Tai speakers into MSEA |
| 1800s to 1900s | Migrations of Chinese throughout MSEA and ISEA |

– The Pre-Qin Period: The language contact situation as Sinitic-speaking groups moved south in the mid-1st millennium BCE must have had an impact on the various contact languages involved, presumably Hmong-Mien, Kradai, and, speculatively, Austronesian, or whatever languages were spoken in the southeastern portions of China. However, the sociocultural histories of all of these language groups, and their relationships with polities in the 1st millennium BCE such as the Chu and Yue, are vague. That Sinitic languages replaced earlier languages in southern China is clearly the case, but the amount of language contact and the direction of linguistic influence in this early period are uncertain. Regardless, textual evidence shows that, at the time of this early expansion, Sinitic was already an SVO language but lacked a full system of classifiers, which became more developed only towards the end of the 1st millennium CE (e.g. Aldridge 2016; Peyraube 1996). It may have had presyllables, initial and final consonant clusters, and some residual prefixes and suffixes (e.g. Baxter and Sagart 2014), and it lacked lexical tone systems (e.g. Haudricourt 1954).

– The Han Dynasty: During the Han Dynasty, the full expansion of Sinitic into modern-day Southern China took place, thus beginning the period of typological convergence of Sinitic, Kradai, Hmong-Mien, and Vietic (cf. 27.2.2). In the early Han Dynasty, 1.9 million people moved from the central plains to various parts of southern China and northern Vietnam (LaPolla 2001: 229). Few details of how many went when and where are available. In one documented instance, eight thousand soldiers from northern China were sent to Vietnam with the likely intent of settling, and marriage with locals also appears likely (Taylor 1983: 49). However, while the final effects on the linguistic landscape are clear, many questions remain.

– Post-Han 1st millennium CE: With the permanent settlement of Sinitic groups in southern China amidst other language groups, the end of the Han effectively marked the end of the stage of Old Chinese and the beginning of Middle Chinese and eventually the differentiation of varieties of Sinitic. In the early 300s, another million Chinese moved south to escape political turmoil (Gernet 1996: 182). However, despite these numbers, many of the existing groups in southern China, in what was likely territory of Tai groups, often maintained their own cultural practices even while local leaders utilized Chinese cultural practices to gain political status (Churchman 2016). Further south, this was also a period of state and empire formation in MSEA, such as the Funan, Chenla, and Champa kingdoms, and Chinese records show initial contact with these emerging kingdoms as well as Chinese involvement in maritime trade in ISEA.

– Early to mid-2nd millennium CE: In this period, the maritime expansion of Chinese presence led to settlements of Chinese speakers in ISEA as well as coastal areas of MSEA (e.g. Wu 2009). Another significant matter was the spread of Chinese linguistic elements in MSEA indirectly through both Tai languages and Vietnamese, notably impacting some Austroasiatic languages in the regions with Tai populations (cf. 27.2.1).

– 1800s to 1900s: In this part of the European colonial era, millions of Chinese laborers and merchants migrated throughout both MSEA and ISEA. They mostly spoke southern varieties of Chinese. While these groups shared Chinese in their local communities, they also borrowed local words, and there have been instances of language loss among some ethnic Chinese groups in the region.
A note on varieties of Chinese: Another aspect to consider is which varieties of Chinese were in contact with which Southeast Asian languages. It is possible to determine where main varieties, such as Cantonese, Hokkien/Fujian, Hakka/Kejia, and Teochow/Chaozhou Chinese, have been spoken in MSEA and ISEA over the last two centuries, though these have generally left only lexical imprints. Prior to that, it is difficult to determine with precision what kinds of Chinese were spoken, for example, which type or types may have been spoken in Cambodia in the 13th century when the Chinese observer Zhou Daguan wrote of Chinese communities there. The dialect situation is even less clear in the 1st millennium when Sinitic varieties were still in the process of differentiating. Indeed, varieties of Sinitic have themselves become distinct due to contact with other groups, such as Yue Chinese, which has been noted for a number of linguistic features shared with Tai but that distinguish it from varieties of Chinese to the north (cf. Matthews 2006).


27.1.2 Types of language contact and the means of linguistic influence

Throughout the history of Sinitic expansion, the types of language contact have naturally varied in different regions and at different times. Table 2 provides a broad overview of the general kinds of contact and representative regions for each. Beyond the historical descriptions noted in 27.1.1, the nature of language contact in regions where Sinitic speakers moved in early history can only be inferred by noting what the modern speech communities are like and how much linguistic impact Sinitic has had on other languages.

Tab. 2: Primary types of language contact.

| Types of sociocultural contact | Examples |
| --- | --- |
| Trade | All regions (Southern China, MSEA, ISEA) |
| Administration | Southern China, northern Vietnam |
| Widespread communities | Southern China |
| Settlements inside non-Chinese majority communities | Cambodia, Indonesia, the Philippines, northern and southern Vietnam |

Some language groups have experienced different types of contact over time. For example, in northern Vietnam, early records indicate the Han Dynasty establishment of commanderies in Vietnam. As noted in 27.1.1, after more than a century of administration, thousands of Chinese soldiers were brought into the region to settle. This is also the period in which archaeological records show clear evidence of Chinese presence, such as Han-style tombs and other Chinese accoutrements. While no records demonstrate specifically how many Sinitic speakers settled in northern Vietnam in subsequent centuries of Chinese administration, the number and variety of early Chinese loanwords from the early part of the 1st millennium CE strongly suggest significant numbers of Chinese settlers. After Vietnamese independence from China, sociocultural contact continued in various ways (e.g. trade, education, political tribute). In the late 1800s to mid-1900s, large numbers of Chinese from parts of Southern China settled in Vietnam, especially in Hanoi and Saigon. The linguistic consequences of these multiple periods of contact over more than two millennia are seen in both loanwords and typological convergence.

Hmong-Mien and Tai-Kadai have also experienced multiple periods of contact with Sinitic, in ways similar to the case of Vietic, likewise resulting in both massive lexical borrowing and restructured morphophonology. In contrast, in areas outside of this core region, in Cambodia, Indonesia, Malaysia, and the Philippines, contact took place largely through trade and localized settlements, and these languages have maintained their overall linguistic typology.




27.1.3 Differing degrees of influence: structural versus lexical

Linguistic influence of Sinitic can be divided into three main types, listed in Table 3 with their associated geographic scope. Complete linguistic replacement occurred in most parts of Southern China, with remaining geographic islands of non-Sinitic languages, including Tai-Kadai, Hmong-Mien, Tibeto-Burman, and Austroasiatic, close to MSEA. During that period of replacement stretching over several centuries, structural convergence encompassed Sinitic itself as well as most non-Chinese groups in Southern China, with less structural impact on Tibeto-Burman and Austroasiatic languages, which have somewhat more distinct morphophonological systems. Lexical borrowing from Sinitic/Chinese is seen in all instances of language contact, with the largest numbers of loanwords inside the region of structural convergence and smaller numbers outside that zone.

Tab. 3: Types of linguistic influence of Sinitic/Chinese.

| Type of influence | Affected regions (language groups) |
| --- | --- |
| Linguistic replacement | Southern China (Uncertain) |
| Structural convergence | China and MSEA (Sinitic, Kradai, Hmong-Mien, and Vietic) |
| Lexical borrowing | Southern China, MSEA, ISEA (Kradai, Hmong-Mien, Vietic, Austroasiatic, Austronesian, Tibeto-Burman) |

Though numerous factors and scenarios could certainly be considered, some primary probable sociolinguistic situations for the types of linguistic influence include the following.

– Lexical borrowing from Sinitic was stimulated by (a) the prominent sociolinguistic status of Sinitic speech communities and (b) at least some bilingualism in communities, whether through settlement, trade, administration, or any/all of these.

– In contrast, structural convergence of the languages in the region is associated with (a) long-term widespread bilingualism (e.g. over several centuries), (b) socioculturally equal bilingualism in at least some places and in some periods (e.g. Sinitic and Tai-Kadai speakers speaking both languages in their communities in the earlier period of contact), and (c) imperfect language acquisition leading to loss of consonant clusters, syllables, and morphology.

It is with these general scenarios in mind that we can consider the types of linguistic influence that Sinitic has had in southern China and Southeast Asia.


27.2 Types of linguistic influence

As noted in Table 3, there is a primary difference between the languages that participated in the massive typological convergence in southern China and northern Vietnam and those on the periphery farther south and west in MSEA and east into ISEA. Table 4 provides a summary of the types of linguistic influence and the aspects of these. All are summarized in the following subsections.

Tab. 4: Types of linguistic influence.

| Types | Aspects |
| --- | --- |
| Loanwords and impact on semantic domains | Trade items; cultural practices and concepts; function words (in the convergence area) |
| Phonology | Morphophonological restructuring; development of tones |
| Syntax | Classifiers and noun phrase structure; grammatical structures and/or functions through grammatical loanwords |
| Socio-pragmatics | Kinship systems; referential terms |

27.2.1 Lexical borrowing: semantic and cultural domains

Loanwords are a highly visible type of linguistic influence that reflects sociocultural contact, exchange, and influence. Research on the history of loanword exchange between Sinitic languages and surrounding languages shows a very strong tendency towards borrowing from Sinitic/Chinese rather than the reverse. In one study (Wiebusch and Tadmor 2009: 581), only 25 of 2,000 words (a little over 1%) that were studied in Mandarin Chinese were identified as loanwords, though varieties of Chinese in MSEA and ISEA do, of course, have loanwords from local languages. In contrast, in comparable studies, in Vietnamese, some 415 of 1,477 words, or 28%, were considered Chinese (Alves 2009: 619–622), and in White Hmong, Chinese loanwords constituted about 15% of 1,292 words (Ratliff 2009: 645–647). Studies on both early and more recent Chinese loanwords in all the languages discussed here are widely available (e.g. Pou and Jenner [1973] on Cambodian with notes on Southeast Asian languages, Alves [2009] and [2017a] on Vietnamese and [2017b] on other Southeast Asian languages, and Pittayaporn [2014] on Southwestern Tai), and in studies of minority languages in China, it is common to see notes on Chinese loanwords.

A sense of the time depth of Sinitic loanwords in the languages in and near southern China comes from the numbers of loans seen at the proto-language level. A review of lists of lexical reconstructions shows the following: (a) in proto-Vietic, at least 50 of the 1,200 reconstructed words (from the Mon-Khmer Etymological Dictionary) are probable Sinitic loanwords; (b) in Hmong-Mien, of the over 800 reconstructed forms, about 100 are noted to have comparable Chinese vocabulary as potential sources (Ratliff 2010); and (c) in Tai reconstructions (Li 1977; Pittayaporn 2009, 2014) and comparative data (Hudak 2009), over 150 probable Sinitic loanwords can be identified. The reconstructions of Proto-Hmong-Mien and Proto-Tai date roughly to this period of contact with Sinitic, while in Vietic, the reconstructions are not proto-language forms as Proto-Vietic is much older. Details of the timing of these loanwords and when they spread widely enough to be part of the proto-languages will require more study.

Table 5 lists widespread semantic domains of Sinitic/Chinese loanwords that can be seen in multiple language families in multiple regions of both MSEA and ISEA. The main languages in Table 5 include Thai, White Hmong, and Vietnamese, though many other Tai and Hmong-Mien languages have also borrowed many of these and other Chinese words. This table contains a small sampling (see Alves [2017b] for additional lists of Chinese loanwords in the region). The timings of the loanwords vary and cannot be dealt with in detail here. Which words were borrowed necessarily varies greatly according to the intensity, length, and sociocultural nature of the contact, but there are recurring categories, including trade items (tools, clothing, containers, etc.); items and practices related to cuisine; kinship terms; and terms related to literacy. In some cases, the loanwords are not widely integrated into the culture but are restricted in usage to resident Chinese communities (e.g. the use of Chinese kinship terms), such as those in Cambodia and Indonesia.
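The proportions cited above can be recomputed directly from the reported counts. The following Python sketch is only an illustration: the dictionary keys and the helper name `share` are labels introduced here, and the proto-language counts ("at least 50", "about 100" of "over 800") are approximate in the source, so the resulting percentages are rough guides rather than exact figures.

```python
# Counts as reported in the text: (loanwords, words examined).
# Keys are illustrative labels, not terms from the cited studies.
samples = {
    "Mandarin": (25, 2000),           # loanwords from any source
    "Vietnamese": (415, 1477),        # Chinese loanwords
    "Proto-Vietic": (50, 1200),       # "at least 50" probable Sinitic loans
    "Proto-Hmong-Mien": (100, 800),   # "about 100" of "over 800" forms
}


def share(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


shares = {name: share(count, total) for name, (count, total) in samples.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")

# White Hmong is reported only as a percentage (about 15% of 1,292 words),
# so the absolute number of loanwords can only be estimated.
white_hmong_estimate = round(0.15 * 1292)
print(f"White Hmong: roughly {white_hmong_estimate} words")
```

Because the Tai figure ("over 150") is given without a total, it is left out of the calculation.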
Not surprisingly, nouns are the most common loanwords in all areas, whereas the languages in southern China or northern Vietnam have also borrowed many verbs and function words, as described below. Many words borrowed originally into Tai and Vietnamese were subsequently borrowed into neighboring Austroasiatic languages. For instance, the Chinese word for ‘hat’ has distinct forms in Tai languages and Vietnamese, with clear borrowings of each in various groups. Another semantic domain of Sinitic loanwords, primarily in Hmong-Mien, Tai, and Vietic, is terms for metals (cf. Alves 2015b), though the Sinitic word for ‘bronze/copper’ again appears to have been passed on by Vietnamese and Tai languages to neighboring Austroasiatic languages.

The following are the sources for Table 5: (a) LH (Late Han) and MC (Middle Chinese) forms are from Schuessler (2010); the Late Han period, at the beginning of the 1st millennium CE, is when the major period of contact occurred. Where those reconstructions are lacking, forms marked OC (Old Chinese) and MC (Middle Chinese) in the cells are the reconstructions of Baxter and Sagart (2014); (b) the Austroasiatic data come from the SEALang Mon-Khmer Etymological Dictionary; (c) the Thai and Khmer data come from the SEALang dictionaries of those respective languages; (d) the Hmong-Mien data are from Ratliff (2010).


Tab. 5: Recurring semantic domains of Sinitic/Chinese loanwords seen throughout MSEA and ISEA.

| Domain | Examples | Late Han | MC | Samples |
| --- | --- | --- | --- | --- |
| Trade | hat (帽 mào) | *mouC; OC *mˤaw(k)-s | mâuC | Thai mùak; Vietnamese mũ; Mnong (Bahnaric) mu; Katu (Katuic) mu; Khmer (Khmeric) mùːək; Tonga (Aslian) muak; Sedang (Bahnaric) muək; Bru (Katuic) muək; Chong (Pearic) muak; Wa (Palaungic) muk; Mon (Monic) hamok; Mang (Mangic) mɨək⁷; T’in (Khmuic) muək |
| | skirt/pants (裙 qún) | *gun | gjuən | Proto-Mien *gjunA ‘skirt’; Vietnamese quần ‘pants’ |
| | pants (褲 kù) | khuaC | khwaC | Khmer kʰao ‘pants’; Vietnamese khố ‘loincloth’ |
| | knife (刀 dāo) | *tɑu | tâu | Vietnamese dao; Khmer daaw ‘sword’; Stieng (Bahnaric) daaw; Bru (Katuic) daaw; Nyah Kur (Monic) buun taaw; Chong taːw ‘sabre’; Central Tai |
| | sickle (鐮 lián) | *liam | ljäm | Proto-Hmong-Mien *ljim; Vietnamese liềm |
| | bag (包 bāo) | *pɔu | pau | Vietnamese bao; Khmer baaw; Sre (Bahnaric) ɓaːw; Kui (Katuic) baw ‘burlap bag’; Surin Khmer (Khmeric) baw ‘a jute bag’; Bolyu (Mangic) thi³¹paːu⁵³ ‘bag’; Nyah Kur (Monic) kǝpáw ‘old-fashioned cloth bag worn around the waist for holding food when travelling’ |
| | bag (袋 dài) | OC *Cə.lˤək-s | MC dojH | Thai tʰây ‘long purse tied around the waist’; Khmer tey ‘kind of cloth bag’; Proto-Mien *diC; Bru (Katuic) ta̤j; Khmu (Khmuic) daj; Nyah Kur (Monic) thàj ‘old-fashioned cloth bag worn around the waist for holding food when travelling’ |
| | box (盒 hé) | *gəp | ɣập | Thai àp ‘small box’; Vietnamese hộp; Khmer həp; Pacoh (Katuic) hṵːp; Alak (Bahnaric) ʔuup; T’in (Khmuic) kap; Nyah Kur (Monic) kap |
| Cuisine | beans (豆 dòu) | *doC | dəuC | Thai thùa; White Hmong taw2̰1; Vietnamese đậu; (compounds with ‘bean’ such as Indonesian tahu ‘bean curd’, Tagalog tausi ‘black fermented beans’, etc.) |
| | noodles (麵 miàn, cf. Taiwanese mī) | NA | NA | Thai mìi; White Hmong mi52; Vietnamese mì; Khmer mii; Indonesian mi |
| | tea (茶 chá) | *d͎a | d͎a | Thai chaa; Vietnamese trà & chè; Khmer tae; Indonesian teh; Semelai (Aslian) teh; Cua (Bahnaric) cɛː; Surin Khmer (Khmeric) tɛː; War (Khasic) cʰa; Pacoh (Katuic) traː ‘black tea’ & cɛː ‘green tea’; Khsing-Mul (Khmuic) ce:; Nancowry (Nicobaric) ca |
| | pastry (餅 bǐng) | OC *peŋʔ | pjiengX | Thai pæ̂æŋ; Vietnamese bánh; Bahnar (Bahnaric) ɓaŋ ‘bread’; Pacoh (Katuic) pɛːŋ |
| Kinship | elder sister (姐 jiě) | *tsiB | tsiB | Thai cée; Vietnamese chị; Khmer cae; Tagalog ate |
| | uncle (FOB) (伯 bó) | *pak | pɐk | Thai pɛ́ʔ ‘old Chinese man; similar to “uncle”’; Khmer peʔ ‘old man’ |
| Literacy | ink (墨 mò) | *mək | mək | Thai mʉ̀k; Vietnamese mực; Khmer mɨk; Indonesian (Jakarta) bak |
| | book (書 shū) | *śɑ | śjwo | Thai sʉ̌ʉ; Vietnamese thư ‘letters’; Khmer siǝw pʰɨw (Cantonese 書簿 syu1 bou2) |
| Metals | bronze/copper (銅 tóng) | *doŋ | duŋ | Thai tʰɔɔŋ; White Hmong tooj; Vietnamese đồng; Riang (Palaungic) tɔŋ² ‘bronze’; Phong (Khmuic) tʰɔːŋ ‘bronze’; Tai Hat (Khmuic) dɔːŋ cŋar ‘copper’; Pacoh (Katuic) ɗṵːŋ ‘bronze’; Nyah Kur (Monic) thɔŋ phleeɲ ‘copper’ |
| | silver (銀 yín) | *ŋɨn | njen | Thai ŋǝn; White Hmong ɲiə52; Vietnamese ngân ‘silver (only in lexical compounds)’ |
| | silver/white (白 bái) | *bak; OC *brâk | bɐk | Khmer prak; Vietnamese bạch ‘silver’; Indonesian perak ‘silver’ |
| Others | company (公司 gōng sī) | (modern) | (modern) | Thai koŋˈsǐi; Vietnamese công ty; Indonesian kongsi |
| | playing cards (牌 pái) | *bɛ | baɨ | Thai phai; White Hmong phai55; Vietnamese bài; Khmer biə; Bru (Katuic) phaj; Nyah Kur (Monic) phàj; Palaung (Palaungic) phaj |

A sample of this range of borrowing can be seen in Cambodian. Pou and Jenner (1973) note that ancient Khmer texts do not mention China directly, but rather objects associated with China. Pou and Jenner collected 300 Cambodian words that they propose are Chinese loanwords and provide a statistical breakdown of the semantic domains, as shown in Table 6. Whether or not all 300 items are indeed Chinese loanwords, this author’s view is that enough of them are (based on consistency of phonology, lexical semantics, semantic domains, and similar loanwords in other Southeast Asian languages) to consider the statistics a useful example. They demonstrate the lexical impact on cultural domains, again with trade and cuisine topping the list, but also with borrowing in religion, entertainment, and kinship terms.


Tab. 6: Chinese loanwords in Cambodian (Pou and Jenner 1973: 2).

| Semantic domain | Percentage |
| --- | --- |
| Commerce and navigation | 22% |
| Food and articles of use | 21% |
| Religious terms | 8% |
| Gambling and theater | 6% |
| Kinship terms | 5% |
| Arts and crafts | 4% |
| Administrative and legal terms | 2% |
| Miscellaneous verbs | 8% |
| Miscellaneous nouns | 7.5% |

On the issue of kinship terms, in some cases these words have had a secondary impact on referential systems in recipient languages. In Vietnamese, the true pronoun system has become secondary to a socially stratified system based in large part on kinship terms, to which Sinitic has been a major contributor (Alves 2017c). As for other referential terms, in Betawi Malay of Jakarta, the 1st and 2nd person pronouns gua and lu are both from Hokkien Chinese, and while they have not replaced native words, they have penetrated deeply into informal speech (Djenar et al. 2018: 37).

An interesting sociopragmatic grammaticalization pattern is the derivation of a polite 1st person pronoun from a word meaning ‘slave’ or ‘servant’, as was the case in Archaic Chinese (僕 pú ‘servant [male]’ and 婢 bì ‘servant [female]’), as well as modern Vietnamese (tôi), Khmer (kɲom), Thai (khâa), and Burmese (cənɔ [1st person masculine] from ‘royal servant’ and cəmá [1st person feminine] from ‘female servant’). These forms have entirely different etymologies and so cannot be the result of lexical borrowing; the pattern is instead a shared regional development. However, whether Chinese contributed to this phenomenon directly or it is a broader regional phenomenon cannot be determined at this point, as it is seen in both the Sinitic-influenced and Indic-influenced major languages of MSEA (cf. Müller and Weymuth 2017).

While content words have been borrowed widely throughout Greater Southeast Asia, grammatical vocabulary has been borrowed only within the typological convergence zone. Both Proto-Tai and Early Sino-Vietnamese show sizeable numbers of grammatical words borrowed during the early 1st millennium CE (cf. Alves 2007a, 2007b, and 2015a for specific words), as shown in Table 7. Not included in this table are additional grammatical words borrowed into both Vietnamese and various Tai languages from various types of Chinese in later periods. Some of these words have also been borrowed into the conservative southern Vietic languages in Vietnam, such as Ruc and May, though these appear to be relatively recent Sino-Vietnamese borrowings, having very similar segments to the modern Vietnamese forms. The borrowing of classifiers in particular is relevant to the question of the development of the classifier phrase in the languages in the region, as noted in Section 27.2.2.3.




Tab. 7: Function words from early Sinitic contact with Tai and Vietic.

| Languages | Categories |
| --- | --- |
| Tai | Numbers (2–100, 10,000); classifiers; prepositions; comparative words; generalized quantity expressions; time and aspectual words |
| Vietnamese | Numbers (10,000, some numbers but with very limited usage); classifiers; prepositions; modal verbs; connective words; kinship terms |

A final point to consider is that words that are borrowed can also be lost. Early Vietnamese Nôm writings from the mid-second millennium have words that are no longer in use, such as the modal verb tua ‘should’ from Chinese 需 xū ‘should’, among other early Sinitic loanwords. Since many words in Tai, Hmong-Mien, and Vietic were borrowed from Sinitic as early as the early 1st millennium CE, other items were undoubtedly lost over the intervening centuries, meaning the full spectrum of lexical impact cannot be known.

27.2.2 Typological convergence

The linguistic features of (a) monosyllabic prosodic words, (b) CVC or CCVC syllables, (c) complex tone systems, (d) SVO clause structure, (e) noun classifier systems, and (f) systems of sentence-final particles are shared by modern languages of Sinitic, Hmong-Mien, Tai, and the Viet-Muong branch of Vietic. It is not the case that Sinitic led other language groups to adopt Sinitic structures; rather, all groups became similar by losing features, notably presyllables and affixes, through a process of interrupted, imperfect acquisition (McWhorter 2007), and ultimately converged typologically (DeLancey 2011, 2013). This was simultaneously a period leading up to the emergence of tones among the language groups (Ratliff 2010: 183–192). Table 8 provides a broad characterization of the typology of the four language groups at the time of the Han Dynasty versus the languages today. There are variation and some exceptions to these descriptions (27.2.2.1), but they characterize the overall changes between those periods. Additional points are made in the following subsections.


Tab. 8: Linguistic typology in the region 2,000 BP versus today.

| Aspects | 0 CE | Modern era |
| --- | --- | --- |
| Tones | None | Complex systems |
| Syllable structure | More complex | Simpler |
| Word forms | Including lexemes with presyllables or presyllabic material | Predominantly monosyllabic roots |
| Affixation | Richer affixation | Little to no affixation |
| Classifiers | Few or no classifiers (only measure words) | Complex systems, required usage |
| Clause structure | SVO | SVO |

27.2.2.1 Syllable and word structure

The most significant impact of the expansion of Sinitic is the change in the structure of the prosodic word. The proto-languages of the modern language groups in the convergence zone have been reconstructed with (a) more complex prosodic words, (b) presyllables or presyllabic material, and (c) affixes (in Sinitic, Vietic, and Hmong-Mien). The extent to which these were reduced varies among and within the language groups. For example, various Hmong-Mien languages still retain pre-nasals, and in Viet-Muong, some Muong dialects have several initial clusters. Nevertheless, a comparison of reconstructed forms and modern reflexes shows the extent of the reduction, as seen in Table 9. The reduction from polysyllabicity to monosyllabicity, the loss of affixation, and the less complex syllable structure can be hypothesized to be the result of widespread bilingualism/multilingualism and imperfect acquisition, resulting in the simplification of these features, as well as a natural tendency towards reduction over many centuries.

Tab. 9: Comparison of proto-language forms and reflexes in modern languages.

| Proto-languages | Forms | Modern languages | Forms |
| --- | --- | --- | --- |
| Old Chinese (Baxter and Sagart 2014) | *mə-[k]ˤr[a]wʔ ‘shears’ | Mandarin | 鉸 jiǎo ‘shears’ |
| | *s.kˤraw-s ‘teaching/education’ | | 教 jiào ‘teaching’ |
| | *C.mˤə[t]-s ‘younger sister’ | | 妹 mèi ‘younger sister’ |
| | *pʰr[ə]mʔ ‘kind, class’ | | 品 pǐn ‘kind, rank’ |
| Tai (Pittayaporn 2009) | *k.tɯ:nB 700 ‘to wake up’ | Thai | ตื่น tʉ̀ʉn ‘to wake up’ |
| | *C̥.dwi:tD 246 ‘sunshine’ | | แดด dɛ̀ɛt ‘sunlight’ |
| | *s.ʔwɤ:jA 259 ‘steam, vapor’ | | ไอ ay ‘vapor; smell’ |
| | *p.ta:jA 704 ‘to die’ | | ตาย taay ‘to die’ |
| Hmong-Mien (Ratliff 2010) | *N-cuŋ ‘earthworm’ | White Hmong | cab [ca] ‘earthworm’ |
| | *ɢraŋA ‘span (arm)’ | | daj [daj] ‘armspan’ |
| | *khju̯ɛt ‘itch(y)/scratch(y)’ | | khaus [khaw] ‘itchy’ |
| Vietic (Mon-Khmer Etymological Dictionary) | *m-ri:w ‘axe’ | Vietnamese | rìu ‘axe’ |
| | *t-koːlʔ ‘rice mortar’ | | cối ‘rice mortar’ |
| | *g-raːŋ ‘winnow’ | | sàng ‘winnow’ |
| | *p-səɲʔ ‘snake’ | | rắn ‘snake’ |

The groups can be broadly characterized as follows.

– Sinitic: Proto-Sinitic had at least presyllabic material and full presyllables, some of which were derivational prefixes (cf. Baxter and Sagart 2014). Moreover, the derivational suffix *-s, which derived denominal verbs, is reconstructed syllable-finally and contributed to word-final consonant clusters (ibid.). Today, the whole language group shows monosyllabic morphemes, very limited initial consonant clusters with only [-w-] and [-j-] as medials, and no final consonant clusters. As for finals, voiceless stops /p, t, k/ and nasals /m, n, ŋ/ are common in the south, but moving northward, final stops are lost (e.g. Shanghainese with only a final glottal stop and Mandarin with no stops), and final nasals decrease in number (Zee 1985). What is less clear in the literature is the stages and timing of the transition from the complex syllables of Old Chinese to the CVC syllables of Middle Chinese.

– Tai: While modern Tai languages have CCVC as a syllable template, generally with medials such as [-l-], [-r-], and [-w-], Proto-Tai has been reconstructed with a larger inventory of initial consonant clusters, additional presyllabic material, and clusters in which the second consonant is not sonorant (Pittayaporn 2009). There are, however, no reconstructed affixes. In word-final position, Tai languages frequently retain voiceless stops /p, t, k/, nasals /m, n, ŋ/, and glides /j, w/.

– Hmong-Mien: Proto-Hmong-Mien has been reconstructed with nasal presyllables and some complex initials bordering on sesquisyllabic structures (cf. Ratliff 2010: 12–18). The presyllabic nasals have been retained in some Hmong-Mien languages, but there has been a tendency towards their loss in others. Syllable-initial complex clusters have been reduced substantially. As for finals, some languages have experienced substantial reduction, including the loss of both stops and nasals, especially in the Hmongic branch.

– Vietic: Proto-Vietic was polysyllabic, though likely sesquisyllabic with an iambic stress pattern, as in the modern conservative southern Vietic languages. Some presyllables were prefixes or the result of infixes, which again are attested in southern Vietic. On main stressed syllables, a range of initial clusters occurred, as did final voiceless stops /p, t, c, k/ and nasals /m, n, ŋ, ɲ/. Final clusters have not been reconstructed (final sonorant-glottal stop combinations likely represent glottalization on the syllable rhyme rather than a cluster). Initial clusters have been retained among Vietic languages, with many initial clusters in the southern Vietic languages but fewer in varieties of Muong, the languages most closely related to Vietnamese. The most reduced structure is seen in Vietnamese, which retains only consonant-plus-/w/ initials. However, textual evidence strongly suggests that Vietnamese retained more complex clusters into the 1700s to 1800s (Vu 2019), and texts from the 1200s to 1300s show that archaic Vietnamese likely retained some presyllables (Shimizu et al. 2015). Thus, the process from polysyllabicity to monosyllabicity took many centuries, and the modern Vietnamese CVC structure is relatively recent in the overall process of reduction.

Overall, while there is a range of phonological changes among the languages, the region clearly shows substantial reduction of presyllabic material, initial clusters, and the number of finals. It is the loss of these finals that corresponds to the development of tone categories, as discussed in 27.2.2.2.

27.2.2.2 Tones

Lexical tone is a particularly significant feature in the convergence region. Today, all four language groups in question are tonal, but the proto-languages of all four groups – Sinitic, Hmong-Mien, Tai, and Vietic – have been reconstructed as nontonal languages. Instead of tones, they have been reconstructed with now non-existent final consonants, such as a final voiceless glottal stop and final voiceless fricatives (e.g. *-s or *-h) or related laryngeal features (e.g. glottalization or breathiness), which were conditioning factors in the emergence of tones. In Tai and Hmong-Mien, the practice is to reconstruct only categories: A, B, C, and D. Tone A tends to be manifested as a level tone; tones B and C as contour tones in syllables with either final glottal stops or fricatives or some glottalization or breathiness; and tone D as either level or contour, but always occurring in syllables with final voiceless obstruents -p/-t/-k, even when such finals have been lost in modern languages. Regardless of the reconstructions, patterns of four major tone categories can be consistently identified through early Sinitic loanwords from the pre-tone period in all four language groups, as in Table 10.




Tab. 10: Four tone categories exemplified in early Sinitic loanwords.

Tones | A | B | C | D
Chinese words | 筋 jīn ‘sinew, tendon’ | 染 rǎn ‘to dye’ | 袋 dài ‘bag’ | 熟 shú ‘ripe; familiar’
Old Chinese (Baxter and Sagart 2014) | *C.[k]ə[n] | *C.n[a]mʔ | *Cə.lˤək-s | *[d]uk
Hmong-Mien (Ratliff 2010) | *kʷjanA ‘sinew’ | *ɲumC | *diC (Proto-Mien) | *dju̯okD
Tai (Pittayaporn 2009) | *ˀjenA ‘sinew’ | *ɲwu:mC | **daiB | *sukD
Vietnamese | gân (cân) | nhuộm (nhiễm) | đãy (đại) | thuộc (thục)

Note: The representative Vietnamese examples are Sinitic loanwords borrowed in the earlier period, while the later tone-era words listed in Sino-Vietnamese dictionaries of Chinese character readings are provided in parentheses.

Such patterns are robust across the large numbers of Sinitic loanwords in these languages (e.g. hundreds of Early Sino-Vietnamese loans, over one hundred in Proto-Tai, and over 300 of 1,000 Proto-Hmong-Mien items with comparable Sinitic forms). For the tones in these languages to have maintained such regularity, the Sinitic loanwords must have been borrowed before the loss of the corresponding final glottal stops and fricatives and before the full emergence of tones. Thus, as hypothesized by Ratliff (2010: 191), the most likely scenario is that tonogenesis (cf. Matisoff 1973) occurred not through borrowing but through a simultaneous process during the period of large-scale, intense language contact that began with the Han Dynasty expansion and continued in subsequent centuries.
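The conditioning of tone categories by lost finals can be made concrete with a small sketch. The Python function below is illustrative only. It assumes the pattern seen in the Old Chinese row of Table 10, in which a final glottal stop conditions category B, a final *-s or *-h conditions category C, a final voiceless stop conditions category D, and all other syllables fall into category A. The function name `tone_category` and the way bracket, dot, and hyphen notation is stripped are assumptions made for this example and are not part of any of the cited reconstructions; the assignment of B and C also differs across language groups, as the Tai and Hmong-Mien rows of Table 10 show.

```python
import re


def tone_category(reconstruction: str) -> str:
    """Return a tone category (A-D) for a pre-tonal reconstructed form.

    Assumption for this sketch: the category is decided by the last
    segment of the form alone, following the Old Chinese row of Table 10.
    """
    # Remove the asterisk and the notation marking uncertain segments
    # and morpheme or syllable boundaries: * [ ] - .
    form = re.sub(r"[\*\[\]\-\.]", "", reconstruction)
    if not form:
        raise ValueError("empty reconstruction")
    last = form[-1]
    if last in ("s", "h"):   # final fricative
        return "C"
    if last == "ʔ":          # final glottal stop
        return "B"
    if last in ("p", "t", "k"):  # final voiceless stop
        return "D"
    return "A"               # vowel, glide, or nasal final


if __name__ == "__main__":
    for form in ["*C.[k]ə[n]", "*C.n[a]mʔ", "*Cə.lˤək-s", "*[d]uk"]:
        print(form, tone_category(form))
```

Applied to the four Old Chinese forms in Table 10, the sketch assigns them to categories A, B, C, and D respectively, which matches the column under which each form is listed.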

27.2.2.3 Syntax: clause and noun phrase structure

As noted, all groups in the convergence area have SVO clause structure (with information-structure-related variation, such as fronting and other topic-comment patterns), and all have complex systems of nominal classifiers. Given the absolute consistency of SVO patterns among the groups, and the lack of clear historical linguistic data on other patterns in the past, it is not possible to determine whether SVO order is part of the convergence. That is, the question is whether all languages encountered each other already having SVO word order or whether the change to SVO followed language contact, and if so, which language group(s) had what pattern. In contrast, variation in noun phrase structure and what is understood about the emergence of classifiers, in addition to the borrowing of some classifiers, suggest that changes did indeed occur due to Sinitic migration. Table 12 presents a list of the key aspects and brief notes on each aspect. The aspects are discussed in the remainder of this section.


Clause structure

As noted, textual data from Sinitic show a predominantly SVO pattern three thousand years ago (Aldridge 2016). However, the widespread verb-final pattern elsewhere in Sino-Tibetan suggests that SOV was the original dominant pattern (DeLancey 2013). If this was the case, it remains unanswered how and when Sinitic developed a verb-medial pattern at such an early stage. Another question, with little data available to answer it, is what patterns existed in the other languages in question. Ancient writings in Vietnamese and Thai date back only several centuries, and these show only SVO patterns. However, if Tai-Kadai is closely related to the verb-initial Austronesian, and if Austroasiatic too was verb-initial (cf. Jenny 2015), then it is possible that contact with Sinitic, with its verb-medial structure, played a role in the changes in those language groups. This, however, is unknowable on the basis of currently available data.

Other relevant aspects of clauses include questions, passive voice, and words playing roles in clause structure, such as prepositions and comparatives. While the Chinese-style A-not-A questions have not become part of other language groups, sentence-final particles are shared by all. However, there is currently insufficient data to confirm or refute claims of Sinitic influence on such sentence-final particles. Passive voice in Vietnamese specifically has been influenced by contact with Chinese (Alves 2020: 58–59). In Tai, function words that affect syntactic patterns and have Sinitic counterparts include the comparative marker meaning ‘more than’ (Proto-Tai *kwaB, grammaticalized from the verb meaning ‘to exceed’), which is shared by the Yue branch of Sinitic (Chinese 過 guò ‘to pass’; LH kuaiC; MC kuâC; Cantonese gwɔ33 ‘to pass; more than’; also note Sapuan [Bahnaric] kua ‘more than’ and Khmu [Khmuic] kʰwəːj ‘more than’); the comitative meaning ‘with’ (Proto-Tai *kapD; Chinese 及 jí ‘and’; LH *gɨp; MC gjəp); and the marker of noun clause complements after verbs of speaking, meaning ‘that’ (Proto-Tai *waB, derived from the verb meaning ‘to say’; Chinese 話 huà ‘speech/what is said’ [cf. Cantonese wa:24 ‘to say’]; LH *guas; MC ɣwaiC) (cf. Alves 2015a).

Noun phrase structure and classifiers

Modifiers and determiners in Hmong-Mien, Tai, and Vietic are fairly consistently in post-nominal position, in contrast to Sinitic noun phrase structure, in which all such elements precede nouns. However, regarding the position of quantity terms and especially classifiers, there does appear to be some influence of Sinitic that has spread throughout Southern China and into Vietnam. Sinitic, Tai, Hmong-Mien, and Vietic are all classifier languages: they have fully grammaticalized classifiers which function semantico-syntactically in ways different from general measure words. The shared tendencies include the following: (a) these words tend to be obligatory even with nouns that are units (e.g. humans, animals, plants, implements, buildings, etc.); (b) they co-occur with specific semantic classes (e.g. objects of certain shapes, animate versus inanimate items, humans versus animals, etc.); and (c) they can have additional syntactic functions, such as being able to mark definiteness or to be nominalized and serve as noun heads in noun phrases.

Among all the language groups, numbers and other quantity expressions precede classifiers, but the position of the number-plus-classifier unit varies relative to the head noun in a noun phrase, occurring either before or after the noun. As Jones (1970) noted, in MSEA, these two types of noun phrase structure are spread through two general regions. Classifiers in pre-nominal position are seen largely among languages in China and much of Vietnam, while those in post-nominal position are widespread in other parts of MSEA. This is the case, for example, in the Kam-Tai languages, in which only the Tai languages of MSEA have post-nominal classifiers (Gerner 2006). In Table 11, Thai is the only language with post-nominal classifiers, as is the case in other Southwestern Tai languages (e.g. Lao and Shan), Austroasiatic languages (e.g. Cambodian, Khmu, Palaung, So Thavung, etc.), and Tibeto-Burman languages (e.g. Burmese, Lolo, Lahu, etc.).

Tab. 11: Examples of noun phrases with classifiers.

Group | Language | Example
Sinitic | Mandarin | sān zhāng zhǐ (three CLF paper) ‘three pieces of paper’
 | Cantonese | yat1 jek3 gau2 (one CLF dog) ‘one dog’
Hmong-Mien | Iu Mien | puə tâw mien (three CLF person) ‘three people’ (Court 1987: 145)
 | White Hmong | yim tus menyuam (eight CLF child) ‘eight children’ (Clark 1989: 183)
Kam-Tai | Northern Kam | i45 jiu22 ȵa45 (one CLF river) ‘one river’ (Gerner 2006: 243)
 | Thai | mǎa sǒng tua (dog two CLF) ‘two dogs’
Vietic | Vietnamese | ba quả táo (three CLF apple) ‘three apples’
 | Muong | chầng nò nóc nhà (how many CLF house) ‘How many houses?’ (Nguyễn et al. 2002: 549)
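The two noun phrase orders described by Jones (1970) can be summarized as one linearization rule with a single positional parameter. The Python sketch below is a minimal illustration under stated assumptions: the class `NounPhrase`, the function `linearize`, and the labels `"pre-nominal"` and `"post-nominal"` are hypothetical names introduced here, and the sketch covers only the numeral, classifier, and head noun, leaving modifiers and determiners aside.

```python
from dataclasses import dataclass


@dataclass
class NounPhrase:
    """A quantified noun phrase reduced to its three relevant parts."""
    numeral: str
    classifier: str
    noun: str


def linearize(phrase: NounPhrase, classifier_position: str) -> str:
    """Order the parts of a noun phrase for a given classifier position."""
    # In all the groups discussed, the numeral precedes the classifier.
    unit = f"{phrase.numeral} {phrase.classifier}"
    if classifier_position == "pre-nominal":
        # Pattern of languages in China and much of Vietnam.
        return f"{unit} {phrase.noun}"
    if classifier_position == "post-nominal":
        # Pattern widespread in other parts of MSEA.
        return f"{phrase.noun} {unit}"
    raise ValueError(f"unknown classifier position: {classifier_position}")


if __name__ == "__main__":
    # Glosses from Table 11: Mandarin-type order, then Thai-type order.
    print(linearize(NounPhrase("three", "CLF", "paper"), "pre-nominal"))
    print(linearize(NounPhrase("two", "CLF", "dog"), "post-nominal"))
```

With the glosses of Table 11, the first call yields the Mandarin order (three CLF paper) and the second the Thai order (dog two CLF); the numeral-plus-classifier unit itself is never reordered.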


Thus, the geographic distribution of the patterns of classifiers in noun phrases is generally agreed to be a case of contact-induced convergence, but the question is where the development began and whether a direction of influence can be identified. It has been hypothesized that Chinese was the source of the development of classifier units in both Hmong-Mien (Ratliff 1992) and Vietnamese (Alves 2001). Another factor is that in the Vietic branch of Austroasiatic, the more likely early noun phrase structure was strictly right-branching, with quantity-classifier units following nouns, as they did in Old Khmer texts (Jenner and Sidwell 2009: 28–29). It has been suggested that the borrowing of the Chinese numeral system into Tai (cf. 27.2.1) and the borrowing of classifiers happened together (Morev 2000: 81). Interestingly, however, Vietnamese has retained all native numbers while still borrowing Sinitic classifiers and measure words and developing a pre-nominal classifier/measure pattern. Further support for a Sinitic origin comes from the lengthy article of Her and Li (n.d.), who hypothesize that Sinitic, rather than Tai-Kadai, was the probable source of classifiers in the region, considering the ancient Chinese writings with classifiers. Not surprisingly, some classifiers were borrowed early from Sinitic into Tai. Manomaivibool (1975: 345) has posited the borrowing of the Chinese classifier 頭 tóu (with the core meaning ‘head’, LH *do, MC dəu) as Proto-Tai *tue for animals, and DeLancey (1985) has noted the likely borrowing of the Chinese classifier referring roughly to thin pieces or slices, 片 piàn (OC *pʰˤe[n]-s), as Proto-Tai *phɛnB. To this, we can add the generic classifier 個 gè (LH kɑiC), borrowed as Proto-Tai *kaiB, which also appears as the default classifier in Vietnamese, cái. One other point of note is that, regardless of the position of classifiers, throughout Kam-Tai and Vietic, determiners follow nouns, unlike the noun-phrase-initial position of determiners throughout Sinitic. 
That these languages are largely right-branching provides further support for the supposition that the pre-nominal position of classifiers in noun phrases was the result of contact with Sinitic. Finally, if this direction of impact is the case – that Sinitic was the source of the grammaticalization of classifiers and of their position in noun phrases – then Sinitic is the ultimate source, via Vietnamese, of this same pattern in other Vietic languages in Vietnam (Ruc and May), as well as in the Bahnaric and Katuic branches of Austroasiatic in Vietnam.




27.3 Conclusion and future directions

Clearly, Sinitic has left a tremendous lexical imprint on other languages in southern China and into MSEA and ISEA. Words for trade, cuisine, and aspects of Chinese culture have spread quite widely. Recurring semantic domains of such loanwords reveal common types of sociocultural contact throughout the region. In southern China and northern Vietnam specifically, the substantial range of the types of loanwords, especially grammatical vocabulary, and the extensive time depth demonstrate the intensity of the language contact over time. In the convergence zone, even the Chinese script came to be employed by Vietnamese (the Chữ Nôm script) and some Tai languages (e.g. the Tày and Nùng languages of northern Vietnam), incorporating a mixture of both Chinese characters and characters representing non-Sinitic native words in those respective languages.

The matter of morphophonological and syntactic change and convergence in the region – and the direction of influence – is more complex. Nevertheless, structural change of morphosyntax through extensive bilingualism is a documented phenomenon, and comparative analytical frameworks can provide insight. In Donohue’s (2013) model of scenarios of language contact and their impact, when intruding speech communities are populous and socio-politically dominant, the effect tends to be loss of the local languages. Such was eventually the case throughout southeastern China at some point beginning in the Han Dynasty and continuing with multiple massive expansions of Sinitic speakers into Southern China in the first several centuries CE. However, prior to that, Sinitic speakers would not have been numerically dominant throughout the region and were not necessarily all socioculturally dominant in the new communities. In that case, in Donohue’s model, both phonological and morphosyntactic impacts on Sinitic were likely to occur. However, rather than adopting structures of other languages, it appears that all groups, over centuries, underwent morphophonological simplification. By the time of the eventual disappearance of previously dominant speech communities, Sinitic would have already reached its transformed Middle Chinese typology.

In southern China and into northern Vietnam, those languages which were not completely replaced have also undergone morphophonological restructuring. Insofar as grammatical vocabulary is associated with syntactic structure, it can be said that aspects of Tai and northern Vietic syntactic structure have also been affected in this way. Grammaticalization of classifiers likely originated in Sinitic, but even if grammaticalization of classifiers occurred simultaneously in Tai, Sinitic still facilitated the spread of such lexemes and their semantico-syntactic properties, as it did for other categories of loanwords. Both Tai and Viet-Muong have borrowed measure words and classifiers from Chinese, coinciding with a restructuring of originally right-branching noun phrases. Again, to a large extent, structural changes among all the language groups, including Sinitic itself, were initiated by the spread of Sinitic.


Tab. 12: Overview of Sinitic influence on syntax in the convergence area.

Aspect | Notes
SVO | Regional shared pattern, but unclear source
Sentence-final particles | Regional shared traits (e.g. modal functions, indicating the interrogative, politeness, etc.), but unclear source
Questions | Sinitic A-not-A pattern did not spread; unclear source of polar sentence-final particles
Position of classifiers in NPs | Impact in Tai in China (not MSEA), Hmong-Mien, and Vietic (also other Vietic, Katuic and Bahnaric in Vietnam via Vietnamese)
Position of modifiers and determiners in NPs | Little to no impact on non-Sinitic languages
Grammatical lexemes | Lexical borrowing in Tai and Viet (e.g. clause-connectors, comparative words, locative words, the aspect marker in Tai, passive markers in Vietnamese [other Vietic languages via Vietnamese], etc.)

This cursory coverage of the linguistic influence of Chinese in MSEA and ISEA provides some hypotheses about the lexical borrowing and typological convergence stimulated by migrations. Semantic domains of loanwords and changes in linguistic typology are indicative of the sociocultural circumstances and timing of the contact and thus can be useful in ethnohistorical investigation. However, there remain many unanswered questions in the details and in larger matters of how and when these changes happened, what the multilingual situations were that led to the changes and borrowing, and why certain types of borrowing and convergence happened repeatedly in regions quite distant from each other.

References

Aldridge, Edith. 2016. Old Chinese syntax: Basic word order. In Rint Sybesma (ed.), Encyclopedia of Chinese language and linguistics. Brill Online.
Alves, Mark J. 2001. What’s so Chinese about Vietnamese? In Graham W. Thurgood (ed.), Papers from the Ninth Annual Meeting of the Southeast Asian Linguistics Society, 221–242. Tempe, AZ: Arizona State University, Program for Southeast Asian Studies.
Alves, Mark J. 2007a. Categories of grammatical Sino-Vietnamese vocabulary. Mon-Khmer Studies 37. 217–229.
Alves, Mark J. 2007b. Sino-Vietnamese grammatical borrowing: An overview. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 343–362. Berlin & New York: Mouton de Gruyter.
Alves, Mark J. 2009. Loanwords in Vietnamese. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages: A comparative handbook, 617–637. Berlin & New York: Mouton de Gruyter.




Alves, Mark J. 2015a. Grammatical Sino-Tai vocabulary and implications for ancient Sino-Tai sociolinguistic contact. Presentation given at ICSTLL 48, University of California, Santa Barbara, 21–23 August.
Alves, Mark J. 2015b. Historical notes on words for knives, swords, and other metal implements in Early Southern China and Mainland Southeast Asia. Mon-Khmer Studies 44. 39–56.
Alves, Mark J. 2017a. Chinese loanwords in the languages of Southeast Asia. In Rint Sybesma (ed.), Encyclopedia of Chinese language and linguistics, 572–585. Boston: Brill.
Alves, Mark J. 2017b. Chinese loanwords in Vietnamese. In Rint Sybesma (ed.), Encyclopedia of Chinese language and linguistics, 585–592. Boston: Brill.
Alves, Mark J. 2017c. Chinese loanwords in Vietnamese pronouns and terms of address and reference. In Proceedings of the 29th North American Conference on Chinese Linguistics, vol. 1, 286–303. Columbus, OH: NACCL Proceedings Online, The Ohio State University.
Alves, Mark J. 2020. Initial steps in reconstructing Proto-Vietic syntax. In Mathias Jenny, Paul Sidwell & Mark Alves (eds.), Austroasiatic syntax in areal and diachronic perspective, 46–81. Boston: Brill.
Bauer, Robert S. 1996. Identifying the Tai substratum in Cantonese. In Proceedings of the Fourth International Symposium on Languages and Linguistics, 1806–1844. Salaya, Thailand: Institute of Language and Culture for Rural Development, Mahidol University.
Baxter, William H. & Laurent Sagart. 2014. Baxter-Sagart Old Chinese reconstruction, version 1.1 (20 September 2014). http://ocbaxtersagart.lsait.lsa.umich.edu/BaxterSagartOCbyMandarinMC2014-09-20.pdf (accessed 2 February 2015).
Churchman, Catherine. 2016. The people between the rivers: The rise and fall of a Bronze Drum Culture, 200–750 CE. New York: Rowman & Littlefield Publishers.
Clark, Marybeth. 1989. Hmong and areal South-East Asia. In David Bradley (ed.), Papers in Southeast Asian Linguistics No. 11: Southeast Asian syntax, 175–230. Canberra: Pacific Linguistics, the Australian National University.
Court, Christopher. 1987. Some classes of classifier in Iu Mien (Yao). Linguistics of the Tibeto-Burman Area 10(2). 144–150.
DeLancey, Scott. 1985. Etymological notes on Tibeto-Burman case particles. Linguistics of the Tibeto-Burman Area 8(1). 59–77.
DeLancey, Scott. 2011. On the origins of Sinitic. In Zhuo Jing-Schmidt (ed.), Proceedings of the 23rd North American Conference on Chinese Linguistics (NACCL-23), vol. 1, 51–64. Eugene, OR: University of Oregon.
DeLancey, Scott. 2013. The origins of Sinitic. In Zhuo Jing-Schmidt (ed.), Increased empiricism: Recent advances in Chinese linguistics, 73–100. Amsterdam & Philadelphia: John Benjamins.
Djenar, Dwi Noverini, Michael C. Ewing & Howard Manns. 2018. Style and intersubjectivity in youth interaction. Berlin & New York: Mouton de Gruyter.
Donohue, Mark. 2013. Who inherits what, when? Contact, substrates and superimposition zones. In Balthasar Bickel, Lenore A. Grenoble, David A. Peterson & Alan Timberlake (eds.), Language typology and historical contingency (Typological Studies in Language 104), 219–240. Amsterdam & Philadelphia: John Benjamins.
Gerner, Matthias. 2006. Noun classifiers in Kam and Chinese Kam-Tai languages: Their morphosyntax, semantics, and history. Journal of Chinese Linguistics 34(2). 237–305.
Gernet, Jacques. 1996. A history of Chinese civilization. Cambridge: Cambridge University Press.
Haudricourt, André G. 1954. Comment reconstruire le Chinois Archaïque. Word 10(2/3). 351–364.
Her, One-Soon & Bing-Tsiong Li. n.d. A single origin of numeral classifiers in Asia and Pacific: A hypothesis. Unpublished manuscript.
Hudak, Thomas John. 2009. William J. Gedney’s Comparative Tai source book. Honolulu: University of Hawaii Press.


Jenner, Philip N. & Paul Sidwell. 2009. Old Khmer grammar. Canberra: The Australian National University.
Jenny, Mathias. 2015. Syntactic diversity and change in Austroasiatic languages. In Carlotta Viti (ed.), Perspectives on historical syntax, 317–340. Amsterdam & Philadelphia: John Benjamins.
Jones, Robert B. 1970. Classifier constructions in Southeast Asia. Journal of the American Oriental Society 90(1). 1–12.
LaPolla, Randy. 2001. The role of migration and language contact in the development of the Sino-Tibetan language family. In Alexandra Y. Aikhenvald & Robert M. W. Dixon (eds.), Areal diffusion and genetic inheritance, 225–254. Oxford: Oxford University Press.
Li, Fang-Kuei. 1977. A handbook of Comparative Tai (Oceanic Linguistics Special Publications 15). Honolulu: The University of Hawaii Press.
Manomaivibool, Prapin. 1975. A study of Sino-Thai lexical correspondences. Seattle: University of Washington dissertation.
Matisoff, James A. 1973. Tonogenesis in Southeast Asia. In Larry M. Hyman (ed.), Consonant types and tone (Southern California Occasional Papers in Linguistics 1), 71–95. Los Angeles: University of Southern California.
Matisoff, James A. 1990. On megalocomparison. Language 66(1). 106–120.
Matthews, Stephen. 2006. Cantonese grammar in areal perspective. In Alexandra Y. Aikhenvald & Robert M. W. Dixon (eds.), Grammars in contact: A cross-linguistic typology, 220–236. Oxford: Oxford University Press.
McWhorter, John. 2007. Language interrupted: Signs of non-native acquisition in standard language grammars. Oxford: Oxford University Press.
Morev, Lev N. 2000. Some afterthoughts on classifiers in the Tai languages. The Mon-Khmer Studies Journal 30. 75–82.
Müller, André & Rachel Weymuth. 2017. How society shapes language: Personal pronouns in the greater Burma zone. Asia 71(1). 409–432.
Nguyễn, Văn Khang, Bùi Chi & Hoàng Văn Hành. 2002. Từ điển Mường-Việt [A Mường-Vietnamese dictionary]. Hà Nội: Nhà Xuất Bản Văn Hoá Dân Tộc.
Peyraube, Alain. 1996. Recent issues in Chinese historical syntax. In C.-T. James Huang & Audrey Li Yen Hui (eds.), New horizons in Chinese linguistics, 161–213. Dordrecht: Kluwer.
Pittayaporn, Pittayawat. 2009. The phonology of Proto-Tai. Ithaca, NY: Cornell University PhD dissertation.
Pittayaporn, Pittayawat. 2014. Layers of Chinese loanwords in Proto-Southwestern Tai as evidence for the dating of the spread of Southwestern Tai. MANUSYA: Journal of Humanities, Special Issue 20. 47–68.
Pou, Saveros & Philip N. Jenner. 1973. Some Chinese loanwords in Khmer. Journal of Oriental Studies 6(1). 1–90.
Ratliff, Martha. 1992. Meaningful tone: A study of tonal morphology in compounds, form classes, and expressive phrases in White Hmong (Special Report No. 27). DeKalb, IL: Center for Southeast Asian Studies, Northern Illinois University.
Ratliff, Martha. 2009. Loanwords in White-Hmong. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages: A comparative handbook, 638–658. Berlin & New York: Mouton de Gruyter.
Ratliff, Martha. 2010. Hmong-Mien language history. Canberra: Pacific Linguistics.
Schuessler, Axel. 2010. ABC etymological dictionary of Old Chinese. Honolulu: University of Hawai’i Press.
SEALang Mon-Khmer etymological dictionary. http://www.sealang.net/monkhmer/dictionary/ (last accessed 6 November 2019).




Shimizu, Masaaki, Lê Thị Liên & Shiro Momoki. 2005. A trace of disyllabicity of Vietnamese in the 14th century: Chữ Nôm characters contained in the inscription of Hộ Thành Mountain. Kobe City University of Foreign Studies 64. 17–49.
Suthiwan, Titima & Uri Tadmor. 2009. Loanwords in Thai. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages: A comparative handbook, 599–616. Berlin & New York: Mouton de Gruyter.
Taylor, Keith W. 1983. The birth of Vietnam. Berkeley: University of California Press.
Vu, Duc Nghieu. 2019. Vietnamese initial consonant clusters in Quốc Ngữ documents from the 17th to early 19th centuries. Journal of the Southeast Asian Linguistics Society 12(1). 143–162.
Wiebusch, Thekla & Uri Tadmor. 2009. Loanwords in Mandarin Chinese. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages: A comparative handbook, 575–598. Berlin & New York: Mouton de Gruyter.
Wu, Xiao An. 2009. China meets Southeast Asia: A long-term historical review. In Ho Khai Leong (ed.), Connecting and distancing: Southeast Asia and China, 3–30. Singapore: Institute of Southeast Asian Studies.
Zee, Eric. 1985. Sound change in syllable final nasal consonants in Chinese. Journal of Chinese Linguistics 13(2). 291–330.

Graham Thurgood

28 The influence of contact between Austroasiatic and Austronesian

28.1 Introduction

This chapter focuses on evidence of contact between the mainland Austronesian [AN] languages, specifically Proto-Chamic [PC], and Austroasiatic [AA], specifically Mon-Khmer [MK]. The mainland AN languages are the Chamic languages. While Acehnese is now part of insular Southeast Asia, historical reconstruction shows Acehnese to be Chamic. The other mainland AN language is Moken-Moklen, spoken by the so-called sea gypsies. Early scholars who claimed that Moken-Moklen was Chamic were misled by a failure to distinguish between typological similarities and genetic correspondences.

28.2 The geographical and historical settings

Proto-Chamic is the product of AN traders coming into contact with MK speakers. The Chamic languages themselves are the remnants of Champa – a group of loosely affiliated political entities that functioned as part of Austronesian trade patterns (Hall 1955, 1981, 1985; Higham 1989, 2002). The trade routes, of course, reflect geography (see Map 1). The northernmost point, Guangzhou (Canton City), falls outside of Champa but is certainly part of the trading patterns. Further south is Hainan Island, which was the northernmost part of Champa (see Thurgood et al. 2014); Hainan Cham [=Tsat], the language spoken near Sanya, is now under Chinese influence and will disappear within a generation or so. The bulk of the remaining languages were in coastal, highland, and western Vietnam. See Map 2 for these. As a result of the Vietnamese movement down along the coast, much of the dialect continuum was broken as the Chamic speakers either assimilated or moved elsewhere. Some moved into the highlands of central and southern Vietnam (Roglai, Jarai, Rade, Haroi, Eastern Cham, Chru); some went as far as Cambodia (Western Cham). Others went to the trading posts on Hainan, particularly around 982 and 986–987, when the northern capital fell. Others moved to Aceh. As for the Acehnese, undoubtedly some were already at Banda Aceh, but the large group that moved there after the fall of the southern capital made the Acehnese the dominant group. Those remaining in Vietnam now came into contact primarily with Austroasiatic languages. Yet another group is found in Kelantan, along the east coast of modern Malaysia; that they were Chamic is made obvious by still existing place names. Funan is listed on the map, and it must have been part of the trade routes at least early in Chamic history, but little is known about the languages of Funan.

https://doi.org/10.1515/9783110558142-028

 Graham Thurgood

Map 1: The wider distribution of Chamic. [Map labels relevant to the text: Guangzhou (Canton City); Hainan, with Hainan Cham (c. 986–988); Champa, with Indrapura (982) and Vijaya (1471); Funan; Kelantan; Aceh (c. 1469–1470), on Sumatra.]

Map 2, from Gregerson and Thomas (1980: xi), shows the Mon-Khmer and the Chamic languages of Vietnam. It is old and, in at least some places, limited. The placement of Hrê (MK) and Haroi, for instance, is not fully compatible with reports of Hrê and Haroi being spoken in the same villages. It is also quite likely that substantive changes have occurred in the last 40 years.

28.3 Restructuring Proto-Chamic under contact

Under contact with the MK languages, PC adopted the MK canonical word shape and restructured its vowel system, with several languages developing register systems under contact, and with Hainan Cham subsequently going on to develop a tone system (Thurgood et al. 2014; Thurgood 1999, 2020b). In addition to these major restructurings, there are countless smaller pieces of evidence for MK contact: the borrowing of MK morphology, the presence of borrowed core vocabulary, and so on.



Map 2: Mon-Khmer and the Chamic languages of Vietnam. [Map labels. Cities: Quang-Tri, Hue, Da Nang, Quang Ngai, Kontum, Pleiku, Qui Nhon, Tuy Hoa, Nha Trang, Dalat, Phan Rang, Saigon. Chamic: Jarai, Haroi, Rade, N Roglai, S Roglai, E Cham, Chru. Mon-Khmer: Bru, Pacoh, Phuong, Katu, Takua, Cua, Jeh, Duan, Katua, Kayong, Sedang, Rengao, Halang, Hre, Todrah, Monom, Bahnar, E Mnong, C Mnong, Stieng, Koho, Chrau, Jro.]

28.3.1 Restructuring the canonical word

The Austronesian speakers who arrived on the coast of the Southeast Asian mainland spoke a language with a canonical root that was disyllabic with penultimate stress (except when the first syllable vowel was a schwa). Under contact with sesquisyllabic, finally stressed MK languages, PC likewise became sesquisyllabic and finally stressed. After the break-up of PC, some languages, such as Roglai, Rade, and Jarai, in contact with typologically similar languages, largely retained the four-way contrast in the unstressed presyllable but increased the distinctions in the stressed main syllable. This presyllable–main syllable pattern is still essentially in place in the mainland Chamic languages and in Acehnese. Hainan Cham alone has gone on to become fully monosyllabic (see Thurgood 1999), although some movement toward monosyllabicity can be seen elsewhere, e.g. in Eastern Cham (in contact with Vietnamese, among other things). Some of the new main-syllable vowels developed out of splits of inherited Proto-Malayo-Polynesian (PMP) vowels, but the bulk of the forms with new vowels are found in pre-Chamic borrowings from MK. Thus, the main vowels of PC include two readily discernible historical layers: those vowels inherited from PMP, which form the core of the basic vowel system, and those vowels which primarily reflect MK influence and overwhelmingly occur in pre-Chamic MK borrowings.

28.3.2 The evolution of the PC vowel system

The Malayo-Polynesian language that first came into contact with MK had four basic vowels, *-a, *-i, *-u, *-e ([-ə]), as well as three final diphthongs, *-ay, *-uy, and *-aw; the four vowels occurred in both syllables of the disyllabic forms, while the diphthongs were restricted to the final syllable. The atonic first syllable in PC still retains four vowels, but the vowels of the stressed final syllable have proliferated. The four-way distinction in the first syllable is well preserved only in Acehnese, with the remaining languages showing various degrees of reduction in the onset syllable; Hainan Cham roots have been reduced completely and are now monosyllabic. The inherited PC main syllable vowels, in contrast, have restructured, complicating the vowel system. The inherited developments are shown in Figure 1; borrowed MK vowels are added in Figure 2.



PMP second syllable vowels > PC main syllable vowels
*i > *-i-; *-ɪi̯ > *-əy
*u > *-u-; *-ᴜu̯ > *-ɔw
*-ə- <e> > *-a-
*a > *a (short) / *-a:-
*-ay > *-ay
*-uy > *-uy
*-aw > *-aw

Fig. 1: PMP second syllable vowels > PC main syllable vowels.

In the transition from PMP to PC, the PMP high vowels split, becoming diphthongs in final position (at least when lengthened by stress) but remaining unchanged in closed syllables. The PMP schwa became PC *a; the PMP *a split, remaining short *a in some contexts and becoming PC *-a:- in others. This introduced a vowel length distinction. Figure 2 shows the inherited MP vowels together with the borrowed MK vowels.

[Vowel chart; the forms shown are: *-i-, *-i; *-u-, *-u:-, *-u; *-ia, *-ua; *-uəy, *-uay; *-əy, *-əw, *ɛ; *-ɔ, *-ɔ-, *-ɔ:-; *-a, *-a-, *-a:-; *-ay, *-uy, *-aw.]

Fig. 2: PC main syllable vowels, inherited and incorporated.
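The inherited developments summarized in Figure 1 (high vowels diphthongizing in final position, schwa merging with *a, inherited diphthongs unchanged) can be written as a small lookup. This is only a sketch: the function name, its arguments, and the treatment of *a (whose conditioning is not spelled out here) are illustrative assumptions, not part of the reconstruction itself.

```python
# Illustrative sketch only: names and argument layout are hypothetical.
def pc_reflex(pmp_vowel: str, in_final_position: bool) -> str:
    """Return the PC main-syllable reflex of a PMP second-syllable vowel."""
    if pmp_vowel == "i":
        # High vowels diphthongize in final position, stay put in closed syllables.
        return "əy" if in_final_position else "i"
    if pmp_vowel == "u":
        return "ɔw" if in_final_position else "u"
    if pmp_vowel == "ə":
        # PMP schwa becomes PC *a.
        return "a"
    if pmp_vowel == "a":
        # *a splits into short *a and long *a:; the conditioning is not modelled.
        return "a or a:"
    if pmp_vowel in ("ay", "uy", "aw"):
        # Inherited final diphthongs are retained.
        return pmp_vowel
    raise ValueError(f"not a PMP second-syllable vowel: {pmp_vowel!r}")
```

For example, `pc_reflex("i", True)` gives the diphthongized reflex, while `pc_reflex("i", False)` leaves the vowel unchanged.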

The addition of incorporated MK forms further complicates the vowel system, adding new phonemes: the MK diphthongs *-ia, *-ua, *-uay, and *-uəy; long and short *ɔ in both open and closed syllables; a new *ə; and *ɛ. As is also obvious elsewhere in the data, the evidence indicates that PC evolved in a context involving bilingual children who achieved native-like competence in both the mother’s language (MK) and the father’s language (AN).

28.3.3 Register and tone

One of the more striking innovations is the historical development of new register systems, which in the case of Hainan Cham became a tone system. As is typical for register systems, breathy voice occurs following the older voiced occlusives, and the vowels in that register develop allophonically lower pitch and/or higher vowel qualities than those of the contrasting modal register.¹ From this starting point various things can happen: the register distinction can be lost, phonemicizing the two vowel sets; or the phonation and vowel differences can be lost, leaving phonemicized tone distinctions (see Thurgood 1999 for detailed examples).

28.3.3.1 Western Cham

Figure 3 shows the Western Cham vowel system split into two sets on the basis of a two-way register contrast.

[Vowel chart with two panels, "Modal register vowels" and "Breathy register vowels"; the symbols shown are i, ɨ, u, e, ə, ə̞, ʌ, o, ou, æ, a, ɔ.]

Fig. 3: Vowel registers in Western Cham (from Edmondson and Gregerson 1993: 67).

28.3.3.2 Haroi

Haroi has what Huffman (1976) terms a restructured register system (Burnham 1976; Thurgood 1996), in which the registers split the vowels and then disappear. Probably under the influence of Hrê, but certainly due to phonation differences, Haroi experienced widespread vowel splitting (with some subsequent realignment). The PC voiceless obstruents led to tense voice, which lowered some monophthongs. The PC voiced obstruents led to breathy voice, which raised monophthongs. Phonation differences led to on-glides developing on certain monophthongs, while others raised their onsets. As a result, instead of the nine- or ten-vowel system characteristic of most Chamic languages (cf. the 11 vowels of Vietnamese), Haroi has a plethora of vowels (Tegenfeldt-Mundhenk and Goschnick 1977: 1): 11 simple vowels, each both long and short; 17 diphthongs and triphthongs; and 10 rarely occurring nasalized vowels.

1 I suspect that distinctions in both the pitch and the vowel registers occur, but such register systems are reported at times as having no pitch distinctions.

Some are borrowed, but most result from phonation-driven splitting. The details are tedious, but relatively straightforward (cf. Figure 4 for an overview).

PC initial classes > Voice quality > Effects on vowels
PC voiceless obstruents > tense voice > high vowels lower
PC voiced obstruents > breathy voice > low and mid vowels rise
all other PC initials > modal voice > no effect
Result: proliferation of vowels

Fig. 4: Restructured register and Haroi vowel splitting.
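The three-way classification in Figure 4 amounts to a lookup from the class of the PC initial to a voice quality and a vowel effect. The sketch below restates it in Python; the dictionary layout, key strings and function name are illustrative assumptions, and the fallback to the modal row simply mirrors the figure's "all other PC initials".

```python
# Sketch of the classification in Figure 4; labels and names are illustrative.
HAROI_REGISTER = {
    "voiceless obstruent": ("tense voice", "high vowels lower"),
    "voiced obstruent": ("breathy voice", "low and mid vowels rise"),
    "other": ("modal voice", "no effect"),
}


def haroi_outcome(initial_class: str) -> tuple:
    """Return (voice quality, effect on vowels) for a PC initial class."""
    # Any class not listed explicitly falls under "all other PC initials".
    return HAROI_REGISTER.get(initial_class, HAROI_REGISTER["other"])
```

Calling `haroi_outcome("sonorant")` therefore returns the modal row, since sonorants are not one of the two obstruent classes that trigger splitting.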

Specific vowel splitting patterns are found in Figure 5. For further historical details, see Thurgood (1999: 197–213).

Vowel classes: high vowels (and *-əŋ > *-ɨŋ); centering diphthongs; mid *ɛ, *ə, *ɔ (and *-əy); low vowels.
After voiceless obstruents > tense voice: high vowels: (onset) lowered; > -əŋ. Centering diphthongs: *ua > *oa; *ia > *ea. *-əy > *-ɔ̆i. Otherwise unchanged.
After glottalized obstruents, voiced aspirates and sonorants: unchanged.
After voiced obstruents > breathy register: high vowels: unchanged. Centering diphthongs: raised and backed: **-ia- > -ɨa-; **-ua- > -ua; -ʌ- /___m, -ʔ. Mid vowels: raised: ɪ; ɨ; ʌ; -ɨi [fronted]. Low vowels: developed -ɨ- onset.

Fig. 5: Consonant types, vowel classes, and vowel splitting.

Disyllabic words often involve phonation spreading. With the exclusion of cases when the initial of the presyllable is *s or *h, if the main syllable begins with a sonorant, it is the initial of the presyllable, not the initial of the main syllable, that determines the register of the main syllable vowel (Burnham 1976; Lee 1977: 89). When the pretonic syllable begins with a voiceless obstruent other than *s or *h and the main syllable begins with a sonorant, the main syllable follows the vowel splitting patterns associated with voiceless obstruent phonation. In contrast to the sonorants, main-syllable initial obstruents completely block spreading.
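The spreading rule just described can be stated as a short decision procedure. The following is a minimal sketch under stated assumptions: the function and constant names are hypothetical, initials are passed without the asterisk, and words whose presyllable begins with *s or *h are assumed to fall back on the main-syllable initial, which the text implies but does not state outright.

```python
from typing import Optional

# Presyllable initials excluded from spreading (*s and *h in the text).
NON_SPREADING_INITIALS = {"s", "h"}


def register_determiner(presyllable_initial: Optional[str],
                        main_initial_is_sonorant: bool) -> str:
    """Say which initial sets the register of the main-syllable vowel."""
    spreads = (
        presyllable_initial is not None                      # word has a presyllable
        and presyllable_initial not in NON_SPREADING_INITIALS
        and main_initial_is_sonorant                         # obstruents block spreading
    )
    return "presyllable" if spreads else "main syllable"
```

A form with presyllable initial `p` and a sonorant-initial main syllable thus takes its register from the presyllable, whereas any obstruent-initial main syllable keeps its own.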

28.3.3.3 Incipient Eastern Cham tonal system

Eastern Cham has a quasi-registral, incipiently tonal system, which has evolved under MK influence (Thurgood 1993; Phu, Edmondson, and Gregerson 1992). More information is still needed about the Eastern Cham reflexes of PC forms that ended in *-h. Blood (1967) reported such forms as having allophonic but noticeable extra high pitch.

28.3.3.4 Hainan Cham

Hainan Cham [Tsat] has a fully developed tone system (Zheng 1997; Haudricourt 1984; Benedict 1984; Ni 1990a, 1990b; Thurgood 1999).

Initial classes > Resulting registers > Finals > Tones
PC initials (except voiced obstruents) > modal voiced, high series > *-h: 55; glottal stop: 24ʔ; *voiced finals: 33
PC voiced obstruents > breathy voiced, low series > *-h: 55; stop, glottal stop: 42ʔ; *voiced finals: 11
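Read as a table, the correspondences above pair each register series with three classes of final and yield six tones. A minimal sketch of that reading follows; the key strings and function name are illustrative assumptions, and the pairing of finals with tones follows the order in which they are listed above.

```python
# Sketch of the Hainan Cham correspondences; keys and names are illustrative.
TSAT_TONES = {
    ("high", "*-h"): "55",
    ("high", "glottal stop"): "24ʔ",
    ("high", "voiced final"): "33",
    ("low", "*-h"): "55",
    ("low", "glottal stop"): "42ʔ",
    ("low", "voiced final"): "11",
}


def tsat_tone(initial_is_voiced_obstruent: bool, final: str) -> str:
    """Return the tone for a PC initial class and final class."""
    # PC voiced obstruents give the breathy, low series; all others the high series.
    series = "low" if initial_is_voiced_obstruent else "high"
    return TSAT_TONES[(series, final)]
```

Note that forms in *-h end up with the same tone in both series, so the register distinction surfaces only with the other two classes of final.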

For each of these the paths of development are relatively transparent, with the difference in endpoints largely attributable to differences in the structural characteristics with which they were in contact. Although more work on the clarification of the modern systems would be useful, the overview seems trustworthy.

28.4 Morphology

Various pieces of MK morphology reconstruct to PC, and various additional items have been borrowed into individual Chamic languages after the breakup of PC. Pronouns borrowed from MK are found in the post-breakup Chamic Highlands languages: #ɓiŋ in ‘we’ and #ih ‘you; thou’. The reciprocal -kʰawʔ⁴³ ‘mutual, reciprocal (recp)’ < PC #*gəp ‘group’ is a PC-level incorporation from MK. Negative imperatives from MK sources exist, namely #*bɛʔ, which reconstructs to PC, and #*juəy, which is perhaps that old and is ultimately MK (see Lee 1996). Three of the negation markers reconstruct to PC; beyond this, little is clear (Lee 1996). The origin of the morpheme #*ʔɔh ‘not, no; negative’ is probably MK; interestingly, this form is found largely with sentence negation, whereas the other forms occur with preverbal negation.



28.5 The lexicon

In Thurgood (1999) there are 290 etymologies of Austronesian origin, and 192 etymologies of MK origin. The numbers are only part of the story. Much of the MK material is supposedly hard-to-borrow core vocabulary. Of the 60 body-part terms in the PC database, 23 are MK borrowings, including words for yawn, penis, finger, waist, loins, buttocks, cheek, jaw, meat, lungs, dead skin, neck, vomit, excrement, heel, head, lips, gums, stomach, large intestine, back, right (side), left (side), and so on. Incidentally, this pattern of borrowing is associated here and elsewhere with childhood bilingualism (Thurgood 2020a). Here the bilingualism involves male AN traders coming into contact with female MK speakers.
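To put the counts just cited in proportion, the short calculation below works out the MK share of the two sets of etymologies and of the body-part terms. It is only an arithmetic sketch: the variable names are illustrative, and the first share is computed over the 290 + 192 etymologies mentioned here, not over the whole PC lexicon.

```python
# Counts as cited from Thurgood (1999); variable names are illustrative.
an_etymologies = 290
mk_etymologies = 192
body_part_terms = 60
body_part_mk_borrowings = 23

# MK share among the AN and MK etymologies taken together.
mk_share = mk_etymologies / (an_etymologies + mk_etymologies)

# MK share among the body-part terms in the PC database.
body_part_share = body_part_mk_borrowings / body_part_terms

print(f"MK etymologies: {mk_share:.1%}")
print(f"MK body-part terms: {body_part_share:.1%}")
```

On these figures, MK material makes up a little under two fifths of both sets, so the body-part vocabulary is borrowed at roughly the same rate as the lexicon overall.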

28.6 From Chamic into MK

The presence of a largely dependable reconstruction of PC has put the focus in this chapter on MK influence on Chamic, but without question there is also Chamic influence on MK languages. Specialists on these languages are well aware of this. For instance, David Blood (p.c.) notes that there is evidence of Chamic interaction with MK for Chrau, Mnong, and Hrê, but suggests that various individual languages are largely free of Chamic influence: Koho, Stieng, Rengao, Jeh, and West Bahnaric. Diffloth (p.c.) noted that Katu has come under Chamic influence, as has Hrê (< Haroi). However, nothing so far has been as dramatic as the restructuring of AN under the influence of MK.

References

Benedict, Paul K. 1984. Austro-Tai parallel: A tonal Cham colony on Hainan. Computational Analyses of Asian & African Languages 22. 83–86.
Blood, David L. 1967. Phonological units in Cham. Anthropological Linguistics 9(8). 15–32.
Burnham, E. C. 1976. The place of Haroi in the Chamic languages. Arlington, TX: University of Texas MA thesis.
Edmondson, Jerold A. & Kenneth J. Gregerson. 1993. Western Cham as a register language. In Jerry Edmondson & Ken Gregerson (eds.), Tonality in Austronesian languages (Oceanic Linguistics Special Publication 24), 61–74. Honolulu: University of Hawaii Press.
Gregerson, Marilyn & Dorothy Thomas. 1980. Notes from Indochina on ethnic minority cultures. Dallas, TX: Summer Institute of Linguistics Museum of Anthropology.
Hall, Daniel George Edward. 1955. A history of South-East Asia, 1st edn. New York: St. Martin’s Press.
Hall, Daniel George Edward. 1981. A history of South-East Asia, 4th edn. New York: St. Martin’s Press.
Hall, Kenneth R. 1985. Maritime trade and state development in early Southeast Asia. Honolulu: University of Hawaii Press.
Haudricourt, André-G. 1984. Tones of some languages in Hainan. Minzu Yuwen 4. 17–25. [Also published as “La tonologie des langues de Hai-nan” in Bulletin de la Société de Linguistique de Paris 79(1). 385–394].
Higham, Charles. 1989. The archaeology of mainland Southeast Asia: From 10,000 BC to the fall of Angkor. Cambridge: Cambridge University Press.
Higham, Charles. 2002. Early cultures of mainland Southeast Asia. Bangkok: River Books.
Huffman, Franklin E. 1976. The register problem in fifteen Mon-Khmer languages. In Philip N. Jenner, Laurence C. Thompson & Stanley Starosta (eds.), Austroasiatic studies (Oceanic Linguistics Special Publication 13), vol. 1, 575–590. Honolulu: University of Hawaii Press.
Lee, Ernest Wilson. 1977. Devoicing, aspiration, and vowel split in Haroi: Evidence for register (contrastive tongue-root position). In David Thomas, Ernest W. Lee & Nguyen Dang Liem (eds.), Papers in South East Asian Linguistics No. 4: Chamic Studies (Pacific Linguistics Series A, No. 48), 87–104. Canberra: Pacific Linguistics.
Lee, Ernest Wilson. 1996. Bipartite negatives in Chamic. Mon-Khmer Studies 26. 291–317.
Ni, Dabai. 1990a. The origins of the tones of the Kam-Tai languages. Manuscript.
Ni, Dabai. 1990b. The Sanya (= Utsat) language of Hainan island: A living specimen of a linguistic typological shift. Manuscript.
Phu, Van Han, Jerold Edmondson & Kenneth Gregerson. 1992. Eastern Cham as a tone language. Mon-Khmer Studies 20. 31–44.
Tegenfeldt-Mundhenk, Alice & Hella Goschnick. 1977. Haroi phonemes. In David Thomas, Ernest W. Lee & Nguyen Dang Liem (eds.), Papers in South East Asian Linguistics No. 4: Chamic studies (Pacific Linguistics Series A, No. 48), 1–15. Canberra: Pacific Linguistics.
Thurgood, Graham. 1993. Eastern Cham and Utsat: Tonogenetic themes and variants. In Jerry Edmondson & Ken Gregerson (eds.), Tonality in Austronesian languages (Oceanic Linguistics Special Publication 24), 91–106. Honolulu: University of Hawaii Press.
Thurgood, Graham. 1996. Language contact and the directionality of internal “drift”: The development of tones and registers in Chamic. Language 71(1). 1–31.
Thurgood, Graham. 1999. From Ancient Cham to modern dialects: Two thousand years of language contact and change. With an appendix of Chamic reconstructions and loanwords (Oceanic Linguistics Special Publications 28). Honolulu: University of Hawai’i Press.
Thurgood, Graham. 2020a. Sociolinguistic, sociological, and sociocultural approaches to contact-induced language change. Identifying Chamic child bilingualism in contact-based language change. In Anthony Grant (ed.), Oxford handbook of language change, 173–192. Oxford: Oxford University Press.
Thurgood, Graham. 2020b. Tonogenesis: Atonal to registral to tonal. In Brian Joseph & Barbara Vance (eds.), Handbook of historical linguistics, vol. II. Hoboken, NJ: Wiley Blackwell.
Thurgood, Graham, Ela Thurgood & Li Fengxiang. 2014. A grammatical sketch of Hainan Cham: History, contact, and phonology (Pacific Linguistics 643). Berlin: Mouton de Gruyter.
Zheng, Yiqing. 1997. Huihui Yu Yanjiu [A study of Cham]. Shanghai: Yuandong Chuban She [Shanghai: Far East Publishers].

Marc Brunelle and Tạ Thành Tấn

29 Register in languages of Mainland Southeast Asia: the state of the art

29.1 Introduction

This chapter is an overview of what is known about Southeast Asian register, a type of phonological contrast common in Austroasiatic languages and attested in several Austronesian languages of Southeast Asia (Chamic, Javanese). Register normally arises through the neutralization of onset voicing and is realized through multiple phonetic exponents, like voice quality, pitch and vowel quality. In the following pages, we review the investigation of register since the late 19th century, opting for a chronological approach that contextualizes developments in the field. The chapter is divided into five sections. In 29.2, we first define register and give an overview of its acoustic properties. In 29.3, we briefly look at the discovery of register in Mon and Khmer by Western scholars in the late 19th and early 20th centuries, and give an overview of the initial wave of modern research on the topic (1950s–1960s), during which the concept was defined and first applied to lesser-described Austroasiatic languages. In 29.4, we compare models of registrogenesis,¹ i.e. the diachronic development of register, that were mostly developed in the 1970s. In 29.5, we discuss advances in the instrumental investigation of register since the 1980s. Finally, in 29.6, we highlight a few important questions that have not yet been answered, and lay out a few possible avenues for further research.

29.2 What is register?

The term register has multiple uses in linguistics. It can refer, among other things, to a sociolinguistic repertoire, a portion of the pitch range, or a phonological feature related to tone height. In this chapter, we discuss Southeast Asian register, a binary contrast common in Austroasiatic and some branches of Austronesian that initially derives from the neutralization of onset voicing and its transphonologization² onto following vowels (see 29.4.3 for other possible sources of register). As sketched out in Table 1, it is typically realized as a combination of vocalic properties such as voice quality, vowel quality, pitch and duration, but can be accompanied by aspiration modulations in onsets or redundant voicing. The high register, which stems from voiceless stops, usually has a modal voice and a high pitch, while the low register, which originates in voiced stops, has a breathy or lax voice quality and a low pitch. High register vowels are also typically more open than their low register counterparts (at least in their initial portion).

Tab. 1: Typical properties of register (adapted from Brunelle and Kirby 2016).

High register (also head, tense, clear or first register) (< voiceless stops) | Low register (also chest, lax, breathy or second register) (< voiced stops)
Higher pitch | Lower pitch
Tense / modal voice | Lax / breathy voice
More open vowels (esp. at their beginning) | More close vowels (esp. at their beginning)
More peripheral vowels | More centralized vowels
Shorter vowels | Longer vowels
Shorter VOT | Longer VOT

1 The term “registrogenesis” (French registrogénèse) seems to have been coined by Diffloth (1982), based on “tonogenesis”, which was itself innovated by Matisoff (1970, 1973).
2 The concept of “transphonologization”, the transfer of a phonological contrast from one phonetic property to another, was developed by Haudricourt in the 1960s–1970s and formally defined in Hagège and Haudricourt (1978).

https://doi.org/10.1515/9783110558142-029

Register languages exhibit significant diversity: they do not necessarily make use of all the properties listed in Table 1, and the salience of each property varies across languages. An illustration of the register contrast with a Chrau (Bahnaric) minimal pair is given in Figure 1. The main syllable of the word /ti/ ‘hand, arm’ is realized with a high register, while the main syllable of /t̥i/ ‘in order to’ is realized with a low register (the IPA subscript for devoicing is used on the onset to mark the low register). The vowel of /ti/, on the left, has a dramatic onglide; it starts with a higher F1 and lower F2 than the syllable /t̥i/, making its phonetic realization close to [tei]. The low register syllable /t̥i/, on the right, starts with a lower f0, and has a weaker relative intensity in upper frequencies, which indicates a laxer/breathier voice quality. It also has a longer positive VOT than its high register counterpart. In many tone languages of East and Southeast Asia, the neutralization of onset voicing led to a doubling of the number of tones, following the classic scenario established in Haudricourt (1954). This happened repeatedly in Sino-Tibetan, Tai-Kadai and Hmong-Mien, and even in Austroasiatic (Vietnamese, Rục) and Austronesian (Tsat). We only address tone splitting in passing in this chapter, but note that it is related to register and that it also left traces on the voice quality and vowel quality associated with some tones in many of these languages.
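The tone-doubling scenario mentioned above can be illustrated with a toy inventory. In the sketch below, the tone labels A, B, C and the suffixes 1 (former voiceless onset) and 2 (former voiced onset) are illustrative labelling assumptions, not data from any particular language; the point is only that once onset voicing is neutralized, each earlier tone surfaces as two.

```python
# Toy illustration of tone splitting; labels are hypothetical.
def split_tone(tone: str, onset_was_voiced: bool) -> str:
    """Return the post-split tone category of a syllable."""
    # The lost voicing contrast survives as a high (1) vs low (2) series.
    return f"{tone}{2 if onset_was_voiced else 1}"


old_tones = ["A", "B", "C"]
new_tones = sorted(
    {split_tone(t, voiced) for t in old_tones for voiced in (False, True)}
)
print(new_tones)
```

Three earlier tones yield six categories here, one pair per original tone.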



Fig. 1: Spectrograms of the minimal pair /ti/ ‘hand, arm’ (high register) and /t̥i/ ‘in order to’ (low register) produced by an older male speaker of Chrau. The high register vowel begins with an onglide marked by a high F1 (lower red arrow) and a low F2 (upper red arrow). The low register vowel starts with a lower f0 (red line), and its weaker relative intensity in higher formants is a sign of laxness (most clearly seen at the upper red arrow indicating F2). The VOT of the low register onset is also longer than its high register counterpart.

29.3 The discovery of register

In the late 19th century, Western scholars became aware that the Indic scripts used to write Mon and Khmer did not use the graphemes denoting voiced and voiceless consonants in Indic languages to mark voicing, but rather to distinguish two series of vowels (Janneau 1869; Aymonier 1874; Haswell 1874). Although they had effectively discovered register, early authors largely described the two series of vowels in terms of vowel quality and overlooked other properties, despite some impressionistic passages suggesting a limited awareness of other registral properties. To our knowledge, the first author who clearly described register as involving more than vowel quality is Blagden (1910), who wrote about Mon:

Cette division des consonnes en deux séries est le point capital de l’orthographe et de la phonétique de la langue. En effet, les consonnes dites sonores g, gh, j, jh, ḍh […], d, dh, b, bh se prononcent quant au son actuel comme des sourdes, k, kh, etc. Mais leur énonciation est accompagnée par une action de la glotte qui les distingue assez nettement des consonnes de la première série et qui donne à la voyelle qui suit une modification profonde, difficile parfois à décrire, mais qui me semble en certains cas avoir une qualité plutôt gutturale, tenant de la cavité postérieure de la bouche. (Blagden 1910: 479) ‘This division of consonants into two series is the main point of the spelling and phonetics of the language. Indeed, the so-called voiced consonants g, gh, j, jh, dh […], d, dh, b, bh are currently pronounced as voiceless k, kh, etc. But their production is accompanied by a glottal action that distinguishes them fairly clearly from the consonants of the first series and results in a deep modification of the following vowel, which is difficult to describe but seems to me in some cases to have a rather guttural quality, coming from the posterior cavity of the mouth.’ (our translation)

There were no adequate technical terms to talk about voice quality in 1910, but Blagden’s reference to a “glottal action” and to “gutturalness” suggests that he was perceiving the phonation contrast. Colonial scholars tacitly assumed that the two series derived from the neutralization of an earlier voicing contrast (Finot 1902; Schmidt 1905). In fact, it was even noted that the Indic graphemes indicating voiced stops were still occasionally voiced in Mon (Haswell 1874; Blagden 1910; Shorto 1962): … dans certains dialectes au moins, le talain a encore conservé quelques traces de l’ancienne valeur de ses lettres sonores. En effet, j’ai noté ce détail […] dans la prononciation d’un jeune homme talain originaire de la région de Maulmain que j’ai connu autrefois à Londres. (Blagden 1910: 480) ‘in at least some dialects, Mon has preserved some traces of the former value of its voiced letters. Indeed, I have noted that detail […] in the pronunciation of a young Mon man from the Moulmein region that I once knew in London.’ (our translation).

That the development of register stems from an original voicing contrast has been accepted since. Only Maspero (1915: 112–118) remained skeptical, deeming it unlikely that Mon and Khmer independently devoiced their onset stops after adopting Indic scripts. Comparing Mon and Khmer with the limited evidence he had from related languages, he raised the possibility that vowels have long been the primary contrastive property and that voicing is only a secondary attribute. It was only in the 1950s that the concept of register as a phonological contrast associated with a bundle of phonetic properties emerged. According to Jenner (1974), the term register was coined by Shorto with reference to Mon, but it was first used in print by Henderson (1951), and was only explicitly defined in a phonological sketch of Khmer published the following year:

The characteristics of the first register are a “normal” or “head” voice quality, usually accompanied by relatively high pitch. The characteristics of the second register are a deep rather breathy or “sepulchral” voice, pronounced with lowering of the larynx, and frequently accompanied by



a certain dilation of the nostrils. Pitch is usually lower than that of the first register in similar contexts. The register of a syllable is closely bound up with the vowel nucleus of that syllable, the two being mutually interdependent in a way that will be shown hereafter. (Henderson 1952: 151)

Paradoxically, just like standard Khmer, the register system of the Kompong Chhnang dialect studied by Henderson does not preserve f0 and voice quality distinctions: it appears that Henderson’s consultant was using an affected archaic pronunciation (Wayland and Jongman 2002: 10–11). However, this register contrast is alive and well in more conservative Khmer varieties spoken in the Cardamom mountains and the province of Chanthaburi (Martin 1975; Wayland and Jongman 2003), and the explicit description of register by Henderson fostered research on register in other languages. In the 1960s and early 1970s, it was documented in a number of minority Austroasiatic languages of Mainland Southeast Asia, until war and political upheaval restricted access to the field (Philips 1962: cited in Gregerson 1976; Cooper and Cooper 1965; Gradin 1966; Miller 1967; Ferlus 1971; Smith 1972; Gregerson and Smith 1973; Huffman 1976; Smalley 1976; Ferlus 1979b; Diffloth 1980).

29.4 Registrogenesis, or the diachronic development of register

In this section, we review the different diachronic scenarios that have been proposed to explain the development of register contrasts. We first discuss the formation of register triggered by the neutralization of a voicing contrast in onset stops, which is by far the most common pattern (29.4.1). We then review claims that register can develop out of a voicing contrast in sonorants, and that contrastive register can be generalized to syllables with originally non-contrastive onsets (29.4.2). We finally discuss the development of register systems from vowel quality contrasts and mention phonetic factors that can force register realignments (29.4.3).

29.4.1 Voicing in onset stops and registrogenesis

Despite their obvious diachronic relation, there was little discussion of the nature of the connection between obstruent voicing and register before the 1960s. The reason appears to be a lack of understanding of the phonetic relation between voicing and the phonetic correlates of register. The connection between voicing and pitch was first observed in the late 19th century (Rousselot 1901–1908: 888), and the diachronic role of voicing in tonogenesis was recognized in the early 20th century (Maspero 1912: 89, fn. 1), but it is only with the rapid development of experimental phonetics after World War II that the association between voicing, on the one hand,

and vowel quality, on the other, became apparent (House and Fairbanks 1953; Stevens and House 1956). The discovery of register systems in many Austroasiatic languages and the growth of phonetics in the 1960s led several authors to propose models accounting for register formation. In 1965, Haudricourt laid out the first typology of the consonantal changes involved in registrogenesis (Haudricourt 1965). According to him, the devoicing of onset stops in Southeast Asian languages can lead to three outcomes: a Germanic mutation, in which voiceless stops are aspirated and voiced stops devoiced, a Mon-Khmer mutation, in which voiceless stops remain voiceless, and voiced stops are devoiced and take on a breathy voice that can be reinterpreted as aspiration, and a Far Eastern mutation, in which devoicing triggers a doubling of the number of tones in languages that were already tonal. The articulatory mechanisms invoked by Haudricourt to account for these changes were not very explicit, but all involved “laryngo-buccal tension”. Haudricourt explained the Germanic mutation by an increased articulatory tension resulting in a greater air pressure “escap[ing] at the burst” (Haudricourt 1965: 171), and the Mon-Khmer mutation by a lax larynx “let[ting] a breath pass” and the “laxing of the oral muscles produc[ing] a lax vowel quality” (Haudricourt 1965: 172). He finally saw the Far Eastern mutation as a process similar to the Mon-Khmer mutation, but affecting the “pitch of vowels” rather than their voice and vowel quality as it occurs in languages already tonal (Haudricourt 1965: 172). Although tone splitting is still a cornerstone of tonogenetic models, we now know that it is often accompanied by voice quality and vowel quality modulations (Hayes 1984; Nguyễn 1993; Premsrirat 1996; Ferlus 1998). 
We also know that binary tone contrasts exclusively based on pitch can develop from voicing in previously atonal languages, a possibility that Haudricourt had overlooked (Danaw in Luce 1965; Khmu in Svantesson and House 2006; Central Malagasy in Howe 2017; Afrikaans in Coetzee et al. 2018; Riang in Hall 2018). Gregerson (1976) proposed a model more explicitly grounded in articulatory phonetics than Haudricourt’s, putting forward the tongue root as the main articulator responsible for register and registrogenesis. He hypothesized that the connection between voicing and the narrower vowel aperture in the low register is that they are both associated with an advanced tongue root (i.  e. an expanded pharynx). While this articulatory mechanism could explain vowel quality differences between registers, an X-ray investigation of the articulation of register conducted on a Nyah Kur speaker did not find evidence for it (L.Thongkum 1988). A tongue-root account also seems problematic for other reasons. An expansion of the pharynx does favour voicing (Bell-Berti and Hirose 1975; Westbury 1983; Ahn 2018), but it does not in itself explain the lower pitch and breathier phonation normally associated with the low register. A careful reading of Gregerson’s paper reveals that he did not exclude the possibility that other articulations, like larynx lowering, accompany tongue-root movement in the production of the register contrast, but he saw these mechanisms as secondary. In a paper published the same year, Huffman took a different approach and tried to establish stages of register development, without addressing the question of their



Register in languages of Mainland Southeast Asia: the state of the art 


phonetic motivation (Huffman 1976). Questioning the existence of the Germanic mutation in Austroasiatic, he proposed that register languages can be grouped into four types, corresponding to diachronic stages, assuming like other authors that registrogenesis is the instantiation of a functional pressure to preserve contrast (Haudricourt 1965; Hyman 1976). According to Huffman, languages start at a conservative stage, in which a voicing contrast in onsets perturbs vowel quality in predictable but limited ways. During the subsequent transitional stage, the voicing contrast is preserved, but in a reorganized manner: voiced stops devoice and develop a slight aspiration, while vowel perturbations become more salient (for reasons that are not explicit). At the following stage, register proper, vowel differences become contrastive and the weak aspiration in former voiced stops becomes sub-phonemic (note that in Huffman’s model, register first becomes fully contrastive after sonorants). Finally, at the restructured stage, remnants of the original voicing contrast are entirely neutralized and vowel quality becomes the only contrastive cue. This is what happened in Standard Khmer, in which most original vowels have split into two reflexes based on the original voicing of their onset, but register-conditioned pitch and voice quality differences are lost. Huffman’s classification came with a tacit assumption that any language that reaches a given stage must have gone through all preceding stages, but no teleological claim that languages must go through the full cycle. However, his model did not explicitly address the role of voice quality and pitch in register systems: it is unclear if they were treated as a mere consequence of the weak aspiration that develops after voiced stops or as intrinsically associated with vowel quality.
Huffman’s focus on vowel quality also means that his classification overlooked the diverse combinations of register cues that are attested in Austroasiatic, but in a later paper, he did acknowledge Haudricourt’s Far Eastern mutation and proposed to add to his scenario a fifth category, which he named tonal, for languages in which onset devoicing doubles the number of pre-existing tones (Huffman 1985). In 1979, Ferlus also proposed a cross-linguistic overview of register systems, but focused on the explicit goals of providing a more exhaustive picture of their diversity and of proposing a better account of the phonetic mechanisms underlying their development. He first showed that despite a common historical source – onset voicing – languages can develop a wide array of register systems combining pitch, voice quality and vowel quality in diverse ways. He also reiterated the existence of languages with the Germanic mutation, like Phay (Khmuic) and more controversially proposed that minor adjustments to Haudricourt’s Mon-Khmer mutation are sufficient to explain the glottalized phonation associated with the high register in Katuic, Pearic and North Bahnaric. Ferlus articulated his phonetic explanation around principles of articulatory economy and contrast preservation. These principles were sometimes invoked ad hoc (Ferlus 1979a: 48–49), but he nonetheless pointed to a few important articulatory mechanisms. In his model, the register differences in pitch and vowel quality


 Marc Brunelle and Tạ Thành Tấn

are rooted in the larynx lowering and pharyngeal expansion used to try to preserve closure voicing (Ferlus 1979a: 58–59). He also proposed that the glottal constriction sometimes found in the high register is due to an increase in the tension of the vocal folds to preserve its contrastive distance with the breathy low register (Ferlus 1979a: 56; 60–61). Ferlus (1979a) also made another important typological contribution. It had already been noticed that open and close vowels undergo different types of diphthongization in the two registers in Khmer (Jenner 1974; Huffman 1976), but Ferlus was able to extend this generalization to other register languages. The distribution of interest is illustrated in Table 2 with Bru data. In the low register, open vowels tend to develop falling on-glides (that have then evolved into off-gliding vowels in Bru), while in the high register, it is close vowels that tend to diphthongize, acquiring rising on-glides. As noted by Huffman (1985), phonetically mid vowels behave variably in this respect, patterning with either the close or open vowels, depending on the language (contrast *ɛː and *ɔː in Table 2). Asymmetrical patterns of diphthongization in the two registers have been attributed to the vertical movement of the larynx during the production of voiced stops (Ferlus 1979a; Thurgood 2002) and to longer formant transitions after lax stops (Wayland and Jongman 2002).

Tab. 2: Vowels of Bru, Ubol province, Thailand. Adapted from Huffman (1985). On-glides are given in parentheses.

Proto-vowel    High register    Low register
*iː            (e)iː            iː
*eː            (ɛ)eː            eː
*ɛː            (a)ɛː            ɛː
*ɨː            (ɤ)ɨː            ɨː
*ɤː            (ə)ɤː            ɤː
*aː            aː               (i)aː > iːa
*uː            (o)uː            uː
*oː            (ɔ)oː            oː
*ɔː            ɔː               (u)aː > uːa

29.4.2 Registrogenesis and other types of onsets

The models reviewed in the previous section mostly focus on registrogenesis triggered by the devoicing of onset stops. However, it has been proposed that register contrasts can also develop on syllables headed by sonorants. Huffman (1976), for instance, assumed that Proto-Austroasiatic had a voicing contrast in sonorants (which he terms “continuants”) and that it is in this environment that register had first developed in the languages constituting his sample. This scenario is problematic as no voicing contrast is reconstructed in sonorants in Proto-Austroasiatic (Shorto 2006; Sidwell and Rau 2015), as we know of no case of registrogenesis from onset sonorants, and as there is no obvious phonetic mechanism by which a neutralization of voicing in sonorants could lead to registrogenesis (cf. Maddieson 1984; L. Thongkum 1992 for




similar problems in tonogenetic models). That said, since we know that tones can emerge through the neutralization of a voicing contrast in onset sonorants (L. Thongkum 1992; Hyslop 2009; Pittayaporn and Kirby 2017), registrogenesis from sonorant onsets cannot be entirely ruled out. A better documented scenario for the extension of the register contrast to syllables headed by sonorants is the “register spreading” process that is fossilized in Khmer and Madurese and documented or reconstructed in some Chamic languages (Friberg and Hor 1977; Lee 1977; Thurgood 1999; Brunelle 2005b, 2009b; Kirby 2020; Misnadin and Kirby 2020). In many Austroasiatic and Austronesian languages, canonical words are sesquisyllabic, consisting of a stressed main syllable preceded by an unstressed presyllable with a restricted segmental inventory (Thomas 1992). In a number of these languages, the register of the presyllable spreads onto the vowel of the main syllable if it is headed by a sonorant. At an early stage, the register of the main syllable is still predictable, but if the presyllable is dropped by a process of monosyllabicization, a register contrast develops in sonorant-initial syllables. For example, in Table 3, the presyllable of Proto-Chamic *danaw developed a low register as onset stops devoiced. This low register then spread through the nasal onset of the main syllable onto the main vowel, while the monosyllable *naw kept its default high register. The presyllable of [t̥an̥aw] was then dropped in colloquial modern Eastern Cham, forming a minimal pair contrasting in register.

Tab. 3: Register contrast in words with nasal onsets resulting from register spreading and monosyllabicization in Eastern Cham (adapted from Brunelle 2005: 123).

Proto-Chamic      Modern Cham            Colloquial Modern Eastern Cham    Gloss
*naw           >  naw [naw]           >  naw                               ‘to go’
*danaw         >  t̥anaw [t̥an̥aw]       >  n̥aw                               ‘pond, lake’
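The derivation summarized in Table 3 can be sketched as a toy rule cascade. The Python sketch below is only an illustration under strong simplifying assumptions: the `Syllable` type, the function names, the tiny segment inventory and the hand-segmented inputs are all hypothetical, register is reduced to a binary label, and the default high register for sonorant-initial syllables follows the Eastern Cham pattern described above.

```python
from dataclasses import dataclass, replace
from typing import List

# Assumption: a deliberately tiny segment inventory, for illustration only.
DEVOICING = {"b": "p", "d": "t", "g": "k", "j": "c"}
SONORANTS = {"m", "n", "ŋ", "ɲ", "l", "r", "w", "y"}


@dataclass(frozen=True)
class Syllable:
    onset: str
    rhyme: str
    register: str = "high"  # assumed default, as in Eastern Cham


def devoice(word: List[Syllable]) -> List[Syllable]:
    """Voiced onset stops devoice and leave a low register on their syllable."""
    return [
        replace(s, onset=DEVOICING[s.onset], register="low")
        if s.onset in DEVOICING else s
        for s in word
    ]


def spread(word: List[Syllable]) -> List[Syllable]:
    """The presyllable's register spreads through a sonorant onset
    onto the main syllable."""
    if len(word) == 2 and word[1].onset in SONORANTS:
        pre, main = word
        return [pre, replace(main, register=pre.register)]
    return word


def monosyllabicize(word: List[Syllable]) -> List[Syllable]:
    """The presyllable is dropped; only the main (final) syllable survives."""
    return word[-1:]


def derive(word: List[Syllable]) -> Syllable:
    """Apply the three changes in their historical order."""
    return monosyllabicize(spread(devoice(word)))[0]


go = derive([Syllable("n", "aw")])                        # *naw
pond = derive([Syllable("d", "a"), Syllable("n", "aw")])  # *danaw
print(go)
print(pond)
```

The two outputs share the same onset and rhyme and differ only in their register label, which is the minimal pair described in the text. Reordering the rules, for instance dropping the presyllable before spreading, would remove the contrast in this sketch.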

Interestingly, default register assignment in sonorant-initial syllables is language-specific, or even dialect-specific. Cham dialects are a case in point: while sonorant-initial words pattern with the high register in Eastern Cham, as in the first row of Table 3, they pattern with the low register in Western Cham (Friberg and Hor 1977; Thurgood 1996). The word /naw/ ‘to go’ is phonetically realized as [naw] in Eastern Cham, but as [n̥aw] in Western Cham. Besides sonorants, most Southeast Asian languages also have series of obstruents that do not contrast in voicing, like aspirated stops, implosives and voiceless fricatives. They are typically described as patterning with the high register. However, the vowels that follow them are not necessarily identical to those found after plain stops. In Kuy, for instance, vowels following former implosives have a modal voice, like high


register vowels, but have the same diphthongization patterns as low register vowels (Diffloth 1982). Along the same lines, Chru vowels following fricatives and aspirates start with a high pitch like high register vowels, but a breathy/lax voice reminiscent of the low register (Brunelle et al. 2020). Sound changes in non-contrastive series can also lead to complex patterns that obscure their diachronic source. For instance, aspirates have developed contrastive register because of the adoption of Indic loanwords with voiced aspirates in Khmer, Cham and Mon, because of vowel syncope between a voiced stop and *h in several Chamic languages (Thurgood 1999), and because voiced stops became voiceless aspirates in Kuy (Diffloth 1982).

29.4.3 “Heretical” register and the reorganization of register systems

As register was described in a variety of Austroasiatic languages in the 1970s, apparent mismatches between registers and the reconstructed voicing of onsets started to be reported in North Bahnaric and Katuic (Smith 1972; Gregerson and Smith 1973; Ferlus 1979a). Diffloth (1982) demonstrated that vowel quality has caused the development of voice quality contrasts in the vowel inventory of Pacoh, a process he called “heretical registrogenesis”. This is illustrated in Figure 2. The original Proto-Katuic voicing contrast was neutralized without registrogenesis in Pacoh. Proto-Katuic had three contrastive vowel heights, but in Pacoh, the diphthong *ɯʌ and mid vowel *əː evolved into /eː/ and /oː/, causing former *eː and *oː to lower and to develop tenseness (“pharyngealization” in Watson et al. 1979) in order to avoid merger with existing /ɛː/ and /ɔː/. As a result, Pacoh now has a voice quality contrast between /ɛ̰ː~ɛː/ and /ɔ̰ː~ɔː/ that, according to Alves (2000), has extended to central and short vowels.

[Figure 2: *ɯʌ > [eː] and *əː > [oː]; *eː > [ɛ̰ː] and *oː > [ɔ̰ː]; *ɛː and *ɔː remain [ɛː] and [ɔː].]

Fig. 2: Register from vowel quality in Pacoh (adapted from Diffloth 1982).

The relation between vowel and voice quality was also recently invoked by Sidwell (2002) to account for the development of register in North Bahnaric. In these languages, close vowels tend to be associated with laxness or breathiness, while open vowels tend to be tense or creaky, which is similar to what is reconstructed in Pacoh, but the opposite of what was described in Western Cham (Friberg and Hor 1977; Diffloth 1982). More research is obviously needed, but it should be noted that heretical registrogenesis typically results in a voice quality contrast restricted to a small subset




of vowels, contrary to registrogenesis from voicing. Whether it should be treated as register or as a mere voice quality contrast in a subset of the vowel inventory remains an open question. Register systems that develop through typical transphonologization of onset voicing can also undergo a reorganization of their vowel systems because of interactions between vowel and voice quality, obscuring the original voicing-register correspondences (Diffloth 1982; Gehrmann 2015). In Bru, for example, the high register vowels *ia, *ua and *ii have taken on a breathy voice (Diffloth 1982). Register shifts can also be conditioned by other factors: vowel length and laryngeal codas can cause the reorganization of register systems, by forcing voice quality changes in specific rhymes. A simple example is the shift of low register *-Vh rhymes from lax to clear voice in the Didra dialect of Tơdrah (Gregerson and Smith 1973).

29.5 Experimental investigation of register and its phonetic properties

In this section, we first go over what is known about the associations between voicing and the vocalic cues of register and review the articulatory mechanisms that could explain them (29.5.1). We then provide the reader with an overview of experimental research on register since the 1980s (29.5.2).

29.5.1 The phonetic relation between voicing and register

It seems generally accepted that the vocalic properties of register all derive from secondary acoustic cues of voicing. A voicing contrast in onsets always involves a number of acoustic properties, such as f0 and spectral changes affecting voice quality and vowel quality. In “true-voicing” languages like French and Italian, where voiced obstruents systematically exhibit vocal fold vibrations during their closure/constriction, f0 is higher after voiceless obstruents (Kirby and Ladd 2016). This is also the case in “aspiration” languages like English where the voicing contrast is phonetically realized as a difference in positive voice onset time (Ohde 1984; Lisker 1986; Hanson 2009; Dmitrieva et al. 2015). There is also substantial acoustic and perceptual evidence that vowels are more close (i.e. have a lower F1) after voiced than after voiceless onsets (House and Fairbanks 1953; Stevens and House 1956; Stevens and Klatt 1974; Lisker 1975; Kingston and Diehl 1994; Hillenbrand et al. 2001; Esposito 2002; Kingston et al. 2008). Things are more complicated for voice quality. Research suggests that vowels tend to be breathier after voiceless stops in languages in which they are aspirated (Ní Chasaide and Gobl 1987; Löfqvist and McGowan 1992; Ní Chasaide and Gobl 1993). However,


there is no obvious effect of voicing on the voice quality of following vowels in languages in which voiceless stops are unaspirated like Italian and French (Ní Chasaide and Gobl 1993). As far as we know, the only language in which voicing is followed by a breathier phonation is Eastern Armenian (Seyfarth and Garellek 2018). The breathy voice quality found in the low register would thus be explained if the devoicing of voiced stops is always accompanied by some aspiration (Huffman 1976; Thurgood 2002; Wayland and Jongman 2002), but this explanation fails if aspiration is not a necessary correlate of early register. It is of course entirely possible that voicing in Mainland Southeast Asian languages is not phonetically identical to voicing in Indo-European languages, but in the absence of hard evidence, this is an open question. The secondary acoustic properties of voicing are usually attributed to articulatory accommodation produced by speakers to favour or hinder voicing. During obstruents, the presence of an oral closure raises supraglottal pressure and hinders the transglottal airflow required to set the vocal folds into vibration. As a consequence, articulatory maneuvers such as larynx lowering, pharyngeal expansion and nasal leakage, which lower supraglottal pressure, have been found to accompany the production of voiced obstruents (Bell-Berti and Hirose 1975; Bell 1975; Westbury 1983; Westbury and Keating 1986; Solé 2018; Ahn 2018). It has also been proposed that the vocal folds are stiffened during voiceless stops to prevent their vibration (Löfqvist et al. 1989). These ancillary articulations have coarticulatory impacts on several acoustic properties of neighbouring vowels. The clearest case is the documented effect of larynx height on f0. 
Lowering the larynx decreases f0 because it changes the vertical tension of the vocal folds and tilts the angle between the cricoid and the thyroid cartilages, which has an effect of shortening the vocal folds (Ohala 1972; Honda et al. 1999). Similarly, a strong tension of the vocal folds intended to prevent voicing could raise f0 (Halle and Stevens 1971; Löfqvist et al. 1989). Changes in vowel quality can also be attributed to pharyngeal expansion maneuvers. An advancement of the tongue root to expand the pharynx would mechanically result in a fronting/raising of the tongue that should affect the quality of the following vowel. A lowering of the larynx, on the other hand, would lengthen the posterior cavity, and therefore lower F1, yielding perceptually more close vowels (Ferlus 1979a; Thurgood 2000, 2002; Brunelle 2010). A point that has previously been overlooked however, is that the expansion of the pharynx should affect F1 in a more dramatic way in open than close vowels. The reason is that while F1 is largely a function of the length of the posterior cavity (ranging from the glottis to the oral constriction) in open vowels, it corresponds to the Helmholtz resonance in close vowels, a resonance whose frequency is less dramatically affected by the expansion of the posterior cavity as it depends not only on the volume of the posterior cavity, but also on the area of the oral constriction. This asymmetry could explain why falling onglides are only found in open vowels in the low register (but note that it does not explain why the weak rising onglide found in the high register is limited to close vowels). Another possible mechanism primarily intended to lower supraglottal pressure that could affect formants




is nasal leakage. Nasal leakage during a stop could introduce nasal resonances and antiresonances that should in theory affect the formant structure at the onset of the following vowel. To our knowledge, the exact effect and magnitude of such perturbations have not yet been studied. While the consequences of changes in pharynx size and larynx height on f0 and vowel quality are at least partly understood, their effect on voice quality is more speculative. It has been proposed that “voice quality may depend physiologically on tongue root position […] if the aryepiglottal ligament and membrane, which connect the tongue root to the arytenoid cartilages via the epiglottis, cause the arytenoids to slide forward slightly and/or rock slightly apart, slackening or separating the vocal folds enough to lax the voice, when the tongue’s root is advanced or its body raised” (Kingston et al. 1997: 1697). A lowered larynx may also increase the subglottal air pressure, causing a turbulent airflow through lax vocal folds (Ferlus 1979a). These mechanisms are not experimentally established, but they could both also account for the weak aspiration often found in low register stops. A totally different approach would be that the secondary cues of voicing are actively manipulated by speakers to achieve an auditory low frequency effect reinforcing the perception of voicing (Kingston and Diehl 1994; Kingston et al. 2008). As pointed out elsewhere, this auditory account is not incompatible with articulatory accounts (Kirby and Ladd 2016). An auditory explanation could also be adopted to explain the longer duration of vowels in the low register: a lengthening of vowels before voiced stops has been hypothesized to be a strategy to create the auditory impression of a shorter closure (Kluender et al. 1988). However, the existence of a similar auditory enhancement strategy between vowel length and onset voicing has, to our knowledge, not been tested.
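The asymmetry proposed in this section between open and close vowels can be made concrete with two textbook acoustic approximations, which are not taken from the chapter itself: the quarter-wavelength resonance of a uniform tube closed at one end, F = c / (4L), and the Helmholtz resonance, F = (c / 2π) · √(A / (V · l)). Since the first varies with 1/L and the second with 1/√V, lengthening the back cavity by the same proportion shifts the Helmholtz resonance by roughly half as much in relative terms. The Python sketch below works through one case. All dimensions are hypothetical round numbers chosen for illustration, not measurements, and real vocal tracts are of course not uniform tubes.

```python
import math

C = 350.0  # approximate speed of sound in warm, humid air (m/s)


def quarter_wave(length_m: float) -> float:
    """Lowest resonance of a uniform tube closed at one end: c / (4 L)."""
    return C / (4.0 * length_m)


def helmholtz(volume_m3: float, neck_area_m2: float, neck_length_m: float) -> float:
    """Helmholtz resonance: (c / 2*pi) * sqrt(A / (V * l))."""
    return C / (2.0 * math.pi) * math.sqrt(
        neck_area_m2 / (volume_m3 * neck_length_m)
    )


def relative_change(before: float, after: float) -> float:
    """Signed proportional change from `before` to `after`."""
    return (after - before) / before


# Hypothetical dimensions (assumptions, not measured values).
BACK_LENGTH = 0.09    # back cavity length, glottis to constriction (m)
LOWERING = 0.01       # extra length contributed by larynx lowering (m)
BACK_AREA = 7e-4      # cross-sectional area of the back cavity (m^2)
NECK_AREA = 0.3e-4    # area of the oral constriction in a close vowel (m^2)
NECK_LENGTH = 0.03    # length of the oral constriction (m)

# Open vowel: F1 approximated by the quarter-wave resonance of the back cavity.
open_before = quarter_wave(BACK_LENGTH)
open_after = quarter_wave(BACK_LENGTH + LOWERING)

# Close vowel: F1 approximated by the Helmholtz resonance.
close_before = helmholtz(BACK_AREA * BACK_LENGTH, NECK_AREA, NECK_LENGTH)
close_after = helmholtz(BACK_AREA * (BACK_LENGTH + LOWERING), NECK_AREA, NECK_LENGTH)

for label, before, after in [
    ("open vowel (quarter-wave)", open_before, open_after),
    ("close vowel (Helmholtz)", close_before, close_after),
]:
    print(f"{label}: {before:.0f} Hz -> {after:.0f} Hz "
          f"({100 * relative_change(before, after):.1f}%)")
```

Under these assumptions, the same amount of larynx lowering lowers F1 more in the open-vowel case than in the close-vowel case, in both absolute and relative terms. This is the direction of the asymmetry proposed above for falling on-glides in the low register.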

29.5.2 Previous phonetic studies of register

The first experimental study of a register system was an acoustic investigation of Bru vowels that focused exclusively on formant frequencies (Miller 1967). Systematic instrumental investigation of the multiple properties involved in register started in the 1980s, with Lee (1983), an investigation of the register system of Mon as spoken in Thailand. Lee found that f0 was the most acoustically prominent property of register, with vowel duration and F1 only playing a secondary role. Crucially, he did not find a systematic effect of voice quality, probably due to his pooling of unnormalized spectral slope measures from five speakers, and, as pointed out by Diffloth (1985), who had collected the data, to the fact that speakers did not form a homogeneous sample, but spoke a variety of dialects. Follow-up studies on a dialect spoken in Ban Nakhonchum, Thailand, subsequently showed that f0, voice quality and vowel duration are all involved in distinguishing Mon registers (L. Thongkum 1987a; Abramson et al. 2015). In the late 1980s, Luangthongkum looked at multiple acoustic properties of register in three other Austroasiatic languages of Thailand, Kuy, Nyah Kur and Chong


(L. Thongkum 1987b, 1991). F0 was found to be significantly higher in the high register in Nyah Kur and Kuy, F1 to be lower in the low register in Chong, and intensity to be higher in the high register in Kuy and Chong (L. Thongkum 1987b, 1991). Spectral measures indicated a surprisingly minor role of voice quality in the register contrast of these languages, but this was likely due to the computation of power spectra over entire vowels rather than the initial portion of the vowel where register-conditioned voice quality variations are typically strongest. In fact, finer-grained studies of phonation using spectral measures, inverse filtering and EGG later revealed significant differences in phonation in Chong (Edmondson 1996; DiCanio 2009). Phonation differences were also uncovered in more recent studies of Kuy, although their relative weakness and the presence of significant social stratification in the relative weight of f0 and phonation have led authors to suggest that the language is becoming tonal under Thai influence (Abramson et al. 2004; Lau 2019). The same tendency towards tonalization has been claimed for Khmu’ Rawk, a registral Khmu’ dialect that keeps traces of a voice quality contrast (Abramson et al. 2007) contrary to other Khmu’ dialects that either preserve voicing or are exclusively tonal (Svantesson and House 2006). The register systems of more Austroasiatic languages were instrumentally studied in the past 20 years. Wayland and Jongman (2001, 2003) showed that the Khmer dialect spoken in Chanthaburi, in Thailand, not only distinguishes register with vowel quality modulations, like Standard Khmer, but also has a breathier low register. Outside Thailand, studies of Wa (Watkins 2002) and Chrau (Tạ et al. 2019) showed that register languages can combine various properties (without necessarily having a primary one), and that there is significant interspeaker variation in the relative use of each property.
The existence of register contrasts in Austronesian languages (in Chamic, but also in languages of Java) which had been noted since the 1970s, was also instrumentally established at the end of the 20th century (Friberg and Hor 1977; Poedjosoedarmo 1986). It was first demonstrated that Western Cham has a register system based on vowel quality, pitch and voice quality (Friberg and Hor 1977; Headley 1991; Edmondson and Gregerson 1993; Brunelle 2009a, 2012). Eastern Cham, which had originally been described as a tone language (Blood 1967; Moussay 1971; Hoàng 1987; Phú et al. 1992; Thurgood 1996), was then shown to have a register system as voice quality and F1 play major roles in the patterns of contrast of its two “tones”, even if f0 is its primary phonetic property (Brunelle 2005b, 2006, 2012). The existence of a register system based on F1 and voice quality, but preserving residual voicing, has also recently been confirmed in Chru (Fuller 1977; Brunelle et al. 2019), and other Chamic languages have been described as registral, like Cát Gia Raglai, Southern Raglai and Western Jarai (Lee 1998; Tạ 2009; Jensen 2014). Still in Austronesian, research on the Javanese tense~lax stop contrast has shown that it is probably a form of register: Javanese lax stops have a longer VOT and are produced with a lowered larynx, and the vowels following them have a lower F1 and a laxer/breathier phonation (Fagan 1988; Hayward 1993; Hayward et al. 1994; Hayward 1995; Adisasmito-Smith 2004; Thurgood 2004; Brunelle 2010). It also seems likely that the complex onset-vowel interactions found in Madurese are a




form of restructured register originally akin to the Javanese tense~lax contrast (Cohn 1993; Cohn and Lockwood 1994; Misnadin et al. 2015; Misnadin and Kirby 2020). As mentioned at the beginning of the paper, there is also good evidence that the two-way tone splits reconstructed in Sino-Tibetan, Kra-Dai and Hmong-Mien (i.e. Haudricourt’s Far Eastern mutation) involved register-like developments. As these typically interacted with other laryngeal properties to form complex tone systems, they rarely result in bitonal contrasts looking like Southeast Asian register. Yet, languages like Tamang, Southern Yi and Drenjongke have simple tone systems in which voice quality and vowel quality modulations derived from an original voicing contrast are still transparent (Mazaudon and Michaud 2008; Kuang and Cui 2018; Lee et al. 2019). While early work essentially studied the production of register acoustically, other experimental techniques such as X-ray (L. Thongkum 1988), electroglottography (Watkins 2002; Mazaudon and Michaud 2008; Brunelle 2009a; Abramson et al. 2015) and laryngoscopy (Hayward et al. 1994; Hayward 1995; Brunelle 2010) have allowed us to get a glimpse of the articulation of register. Work produced in the past 20 years also often includes perception experiments (Kingston et al. 1997; Abramson et al. 2004; Brunelle 2006; Abramson et al. 2007; Brunelle 2012; Kuang and Cui 2018; Brunelle et al. 2020). Perceptual results are essential to our understanding of the relative salience of the acoustic properties of register, but it remains difficult to synthesize natural-sounding voice quality modulations, which could bias the results of some studies. There is also substantial evidence that the acoustic properties that distinguish registers are not entirely separable for listeners, which may make the interpretation of perceptual results difficult (Kingston et al. 1997; Brunelle 2012).

29.6 Pending issues

By way of conclusion, we would like to raise issues that, in our opinion, have not yet been answered or sufficiently addressed in the literature on register. Many remain unaddressed because of lack of data, technological limitations or local fieldwork conditions; recent perspectives and developments in phonology and phonetics have also brought new questions to the forefront. Some phonetic properties of register remain under-described. For instance, many early descriptions of register, acoustic or auditory, did not investigate the phonetic properties of onset consonants, thus possibly overlooking important cues. As it turns out, recent investigations of Chru and Chrau have shown that closure voicing is preserved in a surprisingly large proportion of former voiced stops, even in languages that have well-developed register systems (Brunelle et al. 2019; Tạ et al. 2019; Brunelle et al. 2020). A closer look at onsets would also allow a better substantiation of the claim that there is aspiration (or at least a long lag VOT) in devoiced stops, a central tenet of many diachronic models of register (Haudricourt 1965; Thurgood 2002; Wayland and


Jongman 2002). The timing of the realization of the properties of register in vowels is another issue for which we lack data in many languages, in large part because of the limitations of instrumental techniques used in early work. In most languages, the phonetic properties of register seem more distinct at the beginning of the vowel, but there are also cases of stable phonetic differences over the entire vowel. Comparing fine-grained phonetic timing with phonological alternations that provide evidence about the underlying representation of register as a feature of onsets or vowels (like word games, poetry and infixation) could allow us to better understand the unfolding of the transphonologization of voicing into register (Jenner 1969; Svantesson 1983; Schiller 1994; Brunelle 2005a). Further research in articulatory-acoustic modelling and in phonetic typology would also shed light on the distribution of some register properties, like the relation between vowel quality and voice quality. While close vowels appear to have conditioned a laxer/breathier voice quality in North Bahnaric and Pacoh heretical registrogenesis (Diffloth 1982; Sidwell 2002), the opposite relation has been described in Western Cham (Friberg and Hor 1977). Acoustic research has established that formants affect spectral balance (Iseli and Alwan 2004; Iseli et al. 2007), and both cross-linguistic measures and perception experiments suggest that close vowels are associated with a laxer/breathier phonation (Kingston et al. 1997; Lotto et al. 1997; Esposito et al. 2019). However, as the magnitude of these effects seems limited and as perceptual studies suggest that voice and vowel quality are not fully distinct at the perceptual level (Kingston et al. 
1997; Brunelle 2012), more research on the interaction between voice quality and vowel quality and the conditions and phonetic ranges in which it occurs could help us understand how heretical registrogenesis and register shifts conditioned by vowel quality take place. Another question that could be addressed typologically and acoustically is why the transphonologization of voicing often results in vowel systems in which open vowels have large falling onglides in the low register, while close vowels have weak rising onglides in the high register. In 29.5.1, we have proposed that this is due to the fact that the F1 of open vowels is largely a function of the length of the cavity extending from the glottis to the oral constriction, whereas the F1 of close vowels corresponds to the Helmholtz resonance, which depends on the volume of the back cavity and the area of the constriction. While this explains why falling onglides are more pronounced in open vowels in the low register, it is still unclear why rising onglides are only found in close vowels in the high register. Additional descriptive work about the role of F2 (which roughly corresponds to vowel backness/frontness) in register contrasts is also needed. While some studies find low register vowels to be more centralized (Henderson 1952; Shorto 1966; L. Thongkum 1987b; Watkins 2002), some find no significant effect of register on F2 (Edmondson and Gregerson 1993; Brunelle 2005b; Brunelle et al. 2020). The diachronic scenarios developed for registrogenesis in the 1970s–1980s also need to be revisited in light of phonetic and typological data collected since then. For




instance, as evidence accumulates that previously atonal languages can transphonologize voicing into tone without apparently going through an aspiration/voice quality stage (Khmu’ in Svantesson and House 2006; Malagasy in Howe 2017; Afrikaans in Coetzee et al. 2018), we should not only reconsider Haudricourt’s functional divide between the Mon-Khmer and Far Eastern mutations, but also the hypothesis that the devoicing of onset stops necessarily results in weak aspiration/breathiness (Haudricourt 1965; Huffman 1976; Thurgood 2002; Wayland and Jongman 2002). The limited evidence we have about spectral perturbations induced by prevoiced onset stops does not indicate that they condition aspiration or breathiness in following vowels (Löfqvist and McGowan 1992; Ní Chasaide and Gobl 1993). Moreover, despite plausible hypotheses (Ferlus 1979a; Kingston et al. 1997), no articulatory mechanism linking voicing and laxness/breathiness is established yet. Register should also be revisited in light of new questions that have arisen in phonetics and phonology in the past few years. A case in point is individual variation, which seems greater than previously assumed in the production and perception of register systems, and appears to play an instrumental role in their diachronic evolution. A recent study of Chru, for example, reveals important individual differences in both production and perception (Brunelle et al. 2020). In terms of production, all speakers have a register contrast primarily based on F1 on-gliding, but older men still voice the majority of their low register stops. Other phonetic properties like voice quality and pitch are also present, but are speaker-specific rather than socially structured. This variation is reflected in perception: F1 is the dominant perceptual cue for all listeners, but voicing is also important, even for listeners who do not produce it.
A better understanding of this type of variation could allow the development of more detailed models of sound change and the integration of insights from variationist sociolinguistics. A closer look at individual variation would also help us further our understanding of the possible role of contact in the evolution of register systems (Lau 2019), an area where, despite much spilled ink, we still only have coarse speculative evidence. As is the case for other laryngeal contrasts, our phonetic understanding of register has so far been constrained by our limited ability to investigate its articulation. The study of laryngeal articulation requires expensive non-portable equipment and the contribution of trained collaborators. Moreover, register languages are often spoken in relatively remote areas where well-equipped research facilities are unavailable. The development of better infrastructure in Southeast Asia and the greater availability of portable equipment, such as ultrasound technology, are likely to open opportunities for this type of research in the next few years.

Acknowledgments: We would like to thank James Kirby and Ryan Gehrmann for stimulating discussions on several issues addressed in the chapter. This work was partly funded by grants from the Social Sciences and Humanities Research Council of Canada [435-2017-0498] and the UK Arts and Humanities Research Council [AH/P014879/1].


 Marc Brunelle and Tạ Thành Tấn

References
Abramson, Arthur S., Mark K. Tiede & Therapan Luangthongkum. 2015. Voice register in Mon: Acoustics and electroglottography. Phonetica 72. 237–256.
Abramson, Arthur S., Therapan Luangthongkum & Patrick W. Nye. 2004. Voice register in Suai (Kuai): An analysis of perceptual and acoustic data. Phonetica 61. 147–171.
Abramson, Arthur S., Patrick W. Nye & Therapan Luangthongkum. 2007. Voice register in Khmu’: Experiments in production and perception. Phonetica 64. 80–104.
Adisasmito-Smith, Niken. 2004. Phonetic influences of Javanese on Indonesian. Ithaca, NY: Cornell University PhD.
Ahn, Suzy. 2018. The role of tongue position in laryngeal contrasts: An ultrasound study of English and Brazilian Portuguese. Journal of Phonetics 71. 451–467.
Alves, Mark J. 2000. A Pacoh analytic grammar. Manoa, HI: University of Hawai’i PhD.
Aymonier, Etienne. 1874. Dictionnaire français-cambodgien: précédé d’une notice sur le Cambodge et d’un aperçu de l’écriture et de la langue cambodgiennes. Paris: Challamel; Saigon: Imprimerie Nationale.
Bell-Berti, Fredericka & Hajime Hirose. 1975. Palatal activity in voicing distinctions: A simultaneous fiberoptic and electromyographic study. Journal of Phonetics 3. 69–74.
Bell-Berti, Fredericka. 1975. Control of pharyngeal cavity size for English voiced and voiceless stops. The Journal of the Acoustical Society of America 57. 456–461.
Blagden, Charles Otto. 1910. Quelques notions sur la phonétique du Talain et son évolution historique. Journal Asiatique 15. 477–505.
Blood, David L. 1967. Phonological units in Cham. Anthropological Linguistics 9. 15–32.
Brunelle, Marc. 2005a. Register and tone in Eastern Cham: Evidence from a word game. Mon-Khmer Studies 35. 121–132.
Brunelle, Marc. 2005b. Register in Eastern Cham: Phonological, phonetic and sociolinguistic approaches. Ithaca, NY: Cornell University PhD.
Brunelle, Marc. 2006. A phonetic study of Eastern Cham register. In Paul Sidwell & Anthony Grant (eds.), Chamic and beyond, 1–36. Canberra: Pacific Linguistics.
Brunelle, Marc. 2009a. Contact-induced change? Register in three Cham dialects. Journal of Southeast Asian Linguistics 2. 1–22.
Brunelle, Marc. 2009b. Diglossia and monosyllabization in Eastern Cham: A sociolinguistic study. In James Stanford & Dennis Preston (eds.), Variation in indigenous minority languages, 47–75. Amsterdam: John Benjamins.
Brunelle, Marc. 2010. The role of larynx height in the Javanese tense ~ lax stop contrast. In Rafael Mercado, Eric Potsdam & Lisa Travis (eds.), Austronesian contributions to linguistic theory: Selected Proceedings of AFLA, 7–24. Amsterdam & Philadelphia: John Benjamins.
Brunelle, Marc. 2012. Dialect experience and perceptual integrality in phonological registers: Fundamental frequency, voice quality and the first formant in Cham. Journal of the Acoustical Society of America 131(4). 3088–3102.
Brunelle, Marc & James Kirby. 2016. Tone and phonation in Southeast Asian languages. Language and Linguistics Compass 10. 191–207.
Brunelle, Marc, Thành Tấn Tạ, James Kirby & Lư Giang Đinh. 2020. Transphonologization of voicing in Chru: Studies in production and perception. Journal of Laboratory Phonology 11(1), 15. 1–33.
Brunelle, Marc, Thành Tấn Tạ, James Kirby & Đinh Lư Giang. 2019. Obstruent devoicing and registrogenesis in Chru. In Sasha Calhoun, Paola Escudero, Marija Tabain & Paul Warren (eds.), Proceedings of the 19th International Congress of Phonetic Sciences, 517–521. Melbourne: Australasian Speech Science and Technology Association Inc.




Coetzee, Andries W., Patrice Speeter Beddor, Kerby Shedden, Will Styler & Daan Wissing. 2018. Plosive voicing in Afrikaans: Differential cue weighting and tonogenesis. Journal of Phonetics 66. 185–216.
Cohn, Abigail C. & Katherine Lockwood. 1994. A phonetic description of Madurese and its phonological implications. Working Papers of the Cornell Phonetics Laboratory 9. 67–92.
Cohn, Abigail C. 1993. Consonant-vowel interactions in Madurese: The feature lowered larynx. Papers from the Regional Meeting of the Chicago Linguistic Society 29. 105–119.
Cooper, James & Nancy Cooper. 1965. Halang phonemes. Mon-Khmer Studies 2. 57–72.
DiCanio, Christian T. 2009. The phonetics of register in Takhian Thong Chong. Journal of the International Phonetic Association 39. 162–188.
Diffloth, Gérard. 1980. The Wa languages. Linguistics of the Tibeto-Burman Area III(5). 1–182.
Diffloth, Gérard. 1982. Registres, dévoisement, timbres vocaliques: leur histoire en katouique. Mon-Khmer Studies XI. 47–82.
Diffloth, Gérard. 1985. The registers of Mon vs. the spectrographist’s tones. UCLA Working Papers in Phonetics 60. 55–58.
Dmitrieva, Olga, Fernando Llanos, Amanda A. Shultz & Alexander L. Francis. 2015. Phonological status, not voice onset time, determines the acoustic realization of onset f0 as a secondary voicing cue in Spanish and English. Journal of Phonetics 49. 77–95.
Edmondson, Jerold A. 1996. Voice qualities and inverse filtering in Chong. Mon-Khmer Studies 26. 107–116.
Edmondson, Jerold & Kenneth Gregerson. 1993. Western Cham as a register language. In Jerold Edmondson & Kenneth Gregerson (eds.), Tonality in Austronesian languages, 61–74. Honolulu: University of Hawaii Press.
Esposito, Anna. 2002. On vowel height and consonantal voicing effects: Data from Italian. Phonetica 59. 197–231.
Esposito, Christina M., Morgan Sleeper & Kevin Schäfer. 2019. Examining the relationship between vowel quality and voice quality. Journal of the International Phonetic Association. 1–32. DOI: https://doi.org/10.1017/S0025100319000094.
Fagan, Joel L. 1988. Javanese intervocalic stop phonemes. Studies in Austronesian Linguistics 76. 173–202.
Ferlus, Michel. 1971. La langue souei: Mutations consonantiques et bipartition du système vocalique. Bulletin de la Société Linguistique de Paris 61. 378–388.
Ferlus, Michel. 1979a. Formation des registres et mutations consonantiques dans les langues Mon-Khmer. Mon-Khmer Studies VIII. 1–76.
Ferlus, Michel. 1979b. Lexique thavung-français. Cahiers de Linguistique-Asie Orientale 5. 71–94.
Ferlus, Michel. 1998. Les systèmes de tons dans les langues viet-muong. Diachronica 15. 1–27.
Finot, Louis. 1902. Notre transcription du cambodgien. Bulletin de l’École française d’Extrême-Orient 2. 1–15.
Friberg, Timothy & Kvoeu Hor. 1977. Register in Western Cham phonology. In David D. Thomas, Ernest W. Lee & Đăng Liêm Nguyễn (eds.), Papers in Southeast Asian Linguistics No. 4, 17–38. Sydney: Australian National University.
Fuller, Eugene. 1977. Chru phonemes. In David D. Thomas, Ernest W. Lee & Đăng Liêm Nguyễn (eds.), Papers in South East Asian Linguistics No. 4: Chamic studies, 105–124. Canberra: Pacific Linguistics.
Gehrmann, Ryan. 2015. Vowel height and register assignment in Katuic. Journal of the Southeast Asian Linguistics Society 8. 56–70.
Gradin, Dwight. 1966. Consonantal tone in Jeh phonemics. Mon-Khmer Studies 2. 41–53.


Gregerson, Kenneth. 1976. Tongue-root and register in Mon-Khmer. In Philip N. Jenner, Laurence Thompson & Stanley Starosta (eds.), Austroasiatic studies, 323–369. Honolulu: University Press of Hawaii.
Gregerson, Kenneth & Kenneth D. Smith. 1973. The development of Todrah register. Mon-Khmer Studies 4. 143–184.
Hagège, Claude & André-Georges Haudricourt. 1978. La phonologie panchronique. Paris: Presses universitaires de France.
Hall, Elizabeth. 2018. A phonological analysis of Riang Lang. Papers from the Seventh International Conference on Austroasiatic Linguistics. JSEALS Special Publication No. 3, 7, 78–86.
Halle, Morris & Kenneth Stevens. 1971. A note on laryngeal features. Quarterly Progress Report, 198–213. Cambridge, MA: Research Laboratory of Electronics, MIT.
Hanson, Helen M. 2009. Effects of obstruent consonants on fundamental frequency at vowel onset in English. The Journal of the Acoustical Society of America 125. 425–441.
Haswell, James M. 1874. Grammatical notes and vocabulary of the Peguan language. Rangoon: American Mission Press.
Haudricourt, André. 1954. De l’origine des tons en viêtnamien. Journal Asiatique 242. 69–82.
Haudricourt, André. 1965. Les mutations consonantiques et les occlusives initiales en mon-khmer. Bulletin de la Société Linguistique de Paris 60. 160–172.
Hayes, La Vaughn H. 1984. The register systems of Thavung. Mon-Khmer Studies 12. 91–122.
Hayward, Katrina. 1993. /p/ vs. /b/ in Javanese: Some preliminary data. Working Papers in Linguistics and Phonetics 3. 1–33.
Hayward, Katrina. 1995. /p/ vs. /b/ in Javanese: The role of the vocal folds. Working Papers in Linguistics and Phonetics 5. 1–11.
Hayward, Katrina, D. Grafield-Davies, B. J. Howard, J. Latif & Ray Allen. 1994. Javanese stop consonants: The role of the vocal folds. London: School of Oriental and African Studies.
Headley, Robert K. 1991. The phonology of Kompong Thom Cham. In Jeremy H. C. S. Davidson (ed.), Austroasiatic languages: Essays in honour of H. L. Shorto, 105–121. London: School of Oriental and African Studies.
Henderson, Eugénie. 1952. The main features of Cambodian pronunciation. Bulletin of the School of Oriental and African Studies 14. 453–476.
Henderson, Eugénie. 1951. The phonology of loanwords in some South-East Asian languages. Transactions of the Philological Society 50. 131–158.
Hillenbrand, James M., Michael J. Clark & Terrance M. Nearey. 2001. Effects of consonant environment on vowel formant patterns. The Journal of the Acoustical Society of America 109. 748–763.
Hoàng, Thị Châu. 1987. Hệ thống thanh điệu tiếng Chàm và các kí hiệu. Ngôn Ngữ 1/2. 31–35.
Honda, Kiyoshi, Hiroyuki Hirai, Shinobu Masaki & Yasuhiro Shimada. 1999. Role of vertical larynx movement and cervical lordosis in F0 control. Language and Speech 42. 401–411.
House, Arthur S. & Grant Fairbanks. 1953. The influence of consonant environment upon the secondary acoustical characteristics of vowels. The Journal of the Acoustical Society of America 25. 105–113.
Howe, Penelope Jane. 2017. Tonogenesis in Central dialects of Malagasy: Acoustic and perceptual evidence with implications for synchronic mechanisms of sound change. Houston: Rice University PhD.
Huffman, Franklin. 1976. The register problem in fifteen Mon-Khmer languages. In Philip N. Jenner, Laurence C. Thompson & Stanley Starosta (eds.), Austroasiatic studies, part 1 (Oceanic Linguistics Special Publication 13), 575–589. Honolulu: University of Hawai’i Press.
Huffman, Franklin E. 1985. Vowel permutations in Austroasiatic languages: Papers presented to Paul K. Benedict for his 71st birthday. In Graham Thurgood, James Matisoff & David Bradley (eds.),




Linguistics of the Sino-Tibetan Area: The state of the art, 141–145. Canberra: Pacific Linguistics Series C 87, Australian National University.
Hyman, Larry M. 1976. Phonologization. In Alphonse Juilland (ed.), Linguistic studies presented to Joseph H. Greenberg, 407–418. Saratoga: Anima Libri.
Hyslop, Gwendolyn. 2009. Kurtöp Tone: A tonogenetic case study. Lingua 119. 827–845.
Iseli, Markus & Abeer Alwan. 2004. An improved correction formula for the estimation of harmonic magnitudes and its application to open quotient estimation. Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP’04) 72. 669–672.
Iseli, Markus, Yen-Liang Shue & Abeer Alwan. 2007. Age, sex, and vowel dependencies of acoustic measures related to the voice source. The Journal of the Acoustical Society of America 121. 2283–2295.
Janneau, Gustave. 1869. Étude de l’alphabet cambodgien. Saigon. https://babel.hathitrust.org/cgi/pt?id=coo.31924023359460&view=1up&seq=8 (last accessed 6 January 2021).
Jenner, Philip N. 1969. Affixation in modern Khmer. Manoa: University of Hawaii PhD.
Jenner, Philip N. 1974. The development of registers in Standard Khmer. In Nguyễn Đăng Liêm (ed.), Southeast Asian Linguistic studies, 47–60. Canberra: Australian National University.
Jensen, Joshua M. 2014. Jarai clauses and noun phrases: Syntactic structures in an Austronesian language. Berlin & New York: Mouton de Gruyter.
Kingston, John, Randy Diehl, Cecilia Kirk & Wendy Castleman. 2008. On the internal perceptual structure of distinctive features: The [voice] contrast. Journal of Phonetics 36. 28–54.
Kingston, John & Randy L. Diehl. 1994. Phonetic knowledge. Language 70. 419–454.
Kingston, John, Neil A. Macmillan, Laura Walsh Dickey, Rachel Thorburn & Christine Bartels. 1997. Integrality in the perception of tongue root position and voice quality in vowels. Journal of the Acoustical Society of America 101. 1696–1709.
Kirby, James. 2020. Madurese. Journal of the International Phonetic Association 50. 109–126.
Kirby, James & D. Robert Ladd. 2016. Effects of obstruent voicing on vowel F0: Evidence from “true voicing” languages. The Journal of the Acoustical Society of America 140. 2400.
Kluender, Keith R., Randy L. Diehl & Beverly A. Wright. 1988. Vowel-length differences before voiced and voiceless consonants: An auditory explanation. Journal of Phonetics 16. 153–169.
Kuang, Jianjing & Aletheia Cui. 2018. Relative cue weighting in production and perception of an ongoing sound change in Southern Yi. Journal of Phonetics 71. 194–214.
L. Thongkum, Therapan. 1987a. Another look at the register distinction in Mon. UCLA Working Papers in Phonetics 67. 132–165.
L. Thongkum, Therapan. 1987b. Phonation types in Mon-Khmer languages. UCLA Working Papers in Phonetics 67. 29–48.
L. Thongkum, Therapan. 1988. Phonation types in Mon-Khmer Languages. In Osamu Fujimura (ed.), Vocal fold physiology: Vocal physiology, voice production, mechanisms and functions, 319–334. New York: Raven Press.
L. Thongkum, Therapan. 1991. An instrumental study of Chong register. In Jeremy H. C. S. Davidson (ed.), Austroasiatic languages: Essays in honour of H. L. Shorto, 141–160. London: School of Oriental and African Studies, University of London.
L. Thongkum, Therapan. 1992. The raising and lowering of pitch caused by a voicing distinction in sonorants (nasals and approximants): an epidemic disease in SEA languages. Paper presented to the Third International Symposium on Language and Linguistics, 8–10 January.
Lau, Raksit Tyler. 2019. Generational differences in phonation and tone in Kuy. Paper presented at the 8th International Conference on Austroasiatic Languages, Chiang Mai University, 29–31 August.
Lee, Ernest W. 1977. Devoicing, Aspiration, and vowel split in Haroi: Evidence for register (contrastive tongue-root position). In David D. Thomas, Ernest W. Lee & Nguyễn Đăng Liêm (eds.), Papers in Southeast Asian Linguistics no. 4, 87–104. Canberra: Australian National University.


Lee, Ernest W. 1998. The contribution of Cat Gia Roglai to Chamic. In David D. Thomas (ed.), Papers in Southeast Asian Linguistics no. 15: Further Chamic studies, 31–54. Canberra: Pacific Linguistics – Series A.
Lee, Seunghun J., Shigeto Kawahara, Céleste Guillemot & Tomoko Monou. 2019. Acoustics of the four-way laryngeal contrast in Drenjongke (Bhutia): Observations and implications. Journal of the Phonetic Society of Japan 23. 65–75.
Lee, Thomas. 1983. An acoustical study of the register distinction in Mon. UCLA Working Papers in Phonetics 57. 79–96.
Lisker, Leigh. 1975. Is it VOT or a first-formant transition detector? The Journal of the Acoustical Society of America 57. 1547–1551.
Lisker, Leigh. 1986. “Voicing” in English: A catalogue of acoustic features signaling /b/ versus /p/ in trochees. Language and Speech 29. 3–11.
Löfqvist, Anders & Richard S. McGowan. 1992. Influence of consonantal environment on voice source aerodynamics. Journal of Phonetics 20. 93–110.
Löfqvist, Anders, Thomas Baer, Nancy S. McGarr & Robin Seider Story. 1989. The cricothyroid muscle in voicing control. The Journal of the Acoustical Society of America 85. 1314–1321.
Lotto, Andrew J., Lori L. Holt & Keith R. Kluender. 1997. Effects of voice quality on perceived height of English vowels. Phonetica 54. 73–96.
Luce, Gordon H. 1965. Danaw, a dying Austroasiatic language. Lingua 14. 98–129.
Maddieson, Ian. 1984. The effects on F0 of a voicing distinction in sonorants and their implications for a theory of tonogenesis. Journal of Phonetics 12. 9–15.
Martin, Marie Alexandrine. 1975. Le dialecte cambodgien parlé à Tatey, massif des Cardamomes. Asie du Sud-Est et monde insulindien VI. 71–79.
Maspero, Georges. 1915. Grammaire de la langue khmère (cambodgien). Paris: Imprimerie nationale.
Maspero, Henri. 1912. Étude sur la phonétique historique de la langue annamite: les initiales. Bulletin de l’École française d’Extrême-Orient 12. 1–126.
Mazaudon, Martine & Alexis Michaud. 2008. Tonal contrasts and initial consonants: A case study of Tamang, a “missing link” in tonogenesis. Phonetica 65. 231–256.
Miller, John D. 1967. An acoustical study of Brou vowels. Phonetica 17. 149–177.
Misnadin, Misnadin & James Kirby. 2020. Acoustic correlates of plosive voicing in Madurese. The Journal of the Acoustical Society of America 147. 2779–2790.
Misnadin, Misnadin, James Kirby & Bert Remijsen. 2015. Temporal and spectral properties of Madurese stops. In the Scottish Consortium for ICPhS 2015 (ed.), Proceedings of the 18th International Congress of Phonetic Sciences, paper 789. Glasgow: University of Glasgow.
Moussay, Gérard. 1971. Dictionnaire cam-vietnamien-français. Phan Rang: Trung-tâm Văn hoá Chăm.
Nguyễn, Văn Lợi. 1993. Tiếng Rục. Hà Nội: Nhà xuất bản khoa học xã hội.
Ní Chasaide, Ailbhe & Christer Gobl. 1987. Cross language study of the effects of voiced/voiceless consonants on the vowel voice source characteristics. Journal of the Acoustical Society of America 82. S116.
Ní Chasaide, Ailbhe & Christer Gobl. 1993. Contextual variation of the vowel voice source as a function of adjacent consonants. Language and Speech 36. 303–330.
Ohala, John J. 1972. How is pitch lowered? Paper presented to the 83rd meeting of the Acoustical Society of America, Buffalo.
Ohde, Ralph N. 1984. Fundamental frequency as an acoustic correlate of stop consonant voicing. The Journal of the Acoustical Society of America 75. 224–230.
Philips, Richard. 1962. Voice register in Mon-Khmer languages. Manuscript.
Phú, Văn Hẳn, Jerold Edmondson & Kenneth Gregerson. 1992. Eastern Cham as a tone language. Mon-Khmer Studies 20. 31–43.




Pittayaporn, Pittayawat & James Kirby. 2017. Laryngeal contrasts in the Tai dialect of Cao Bằng. Journal of the International Phonetic Association 47(1). 65–85.
Poedjosoedarmo, Gloria. 1986. The symbolic significance of pharyngeal configuration in Javanese speech: Some preliminary notes. Nusa 25. 31–37.
Premsrirat, Suwilai. 1996. Phonological characteristics of So (Thavung), a Vietic language of Thailand. Mon-Khmer Studies 26. 161–178.
Rousselot, Jean-Pierre. 1901–1908. Principes de phonétique expérimentale, tome II. Paris & Leipzig: Welter.
Schiller, Eric. 1994. Khmer nominalizing and causativizing infixes. In Karen L. Adams & Thomas John Hudak (eds.), Papers from the Second Annual Meeting of the Southeast Asian Linguistics Society, 309–326. Tempe, AZ: Arizona State University, Program for Southeast Asian Studies.
Schmidt, Wilhelm. 1905. Grundzüge einer Lautlehre der Mon-Khmer-Sprachen. Wien: Denkschriften der Kaiserlichen Akademie der Wissenschaften.
Seyfarth, Scott & Marc Garellek. 2018. Plosive voicing acoustics and voice quality in Yerevan Armenian. Journal of Phonetics 71. 425–450.
Shorto, Harry L. 1962. A dictionary of Modern Spoken Mon. London: Oxford University Press.
Shorto, Harry. 1966. Mon vowel systems: A problem in phonological statement. In Charles E. Bazell, John C. Catford, Michael A. K. Halliday & Robert H. Robins (eds.), In memory of J. R. Firth, 398–409. London: Longmans.
Shorto, Harry. 2006. A Mon-Khmer comparative dictionary. Canberra: Pacific Linguistics.
Sidwell, Paul. 2002. Proto North Bahnaric: A reconstruction of phonology and lexicon. Manuscript.
Sidwell, Paul & Felix Rau. 2015. Austroasiatic comparative-historical reconstruction: An overview. The Handbook of Austroasiatic Languages 1. 221–363.
Smalley, William Allen. 1976. Phonemes and orthography: Language planning in ten minority languages of Thailand (Pacific Linguistics Series C). Canberra: Research School of Pacific Studies, Australian National University.
Smith, Kenneth. 1972. A phonological reconstruction of Proto-North-Bahnaric. Santa Ana, CA: Summer Institute of Linguistics.
Solé, Maria-Josep. 2018. Articulatory adjustments in initial voiced stops in Spanish, French and English. Journal of Phonetics 66. 217–241.
Stevens, Kenneth N. & Arthur S. House. 1956. Studies of formant transitions using a vocal tract analog. The Journal of the Acoustical Society of America 28. 578–585.
Stevens, Kenneth N. & Dennis H. Klatt. 1974. Role of formant transitions in the voiced-voiceless distinction for stops. The Journal of the Acoustical Society of America 55. 653–659.
Svantesson, Jan-Olof. 1983. Kammu phonology and morphology. Malmö: CWK Gleerup.
Svantesson, Jan-Olof & David House. 2006. Tone production, tone perception and Kammu tonogenesis. Phonology 23. 309–333.
Tạ, Thành Tấn, Marc Brunelle & Nguyễn Trần Quý. 2019. Chrau register and the transphonologization of voicing. In Sasha Calhoun, Paola Escudero, Marija Tabain & Paul Warren (eds.), Proceedings of the 19th International Congress of Phonetic Sciences, 2094–2098. Melbourne: Australasian Speech Science and Technology Association Inc.
Tạ, Văn Thông. 2009. Tiếng Ra glai ở các địa phương. In Tạ Văn Thông (ed.), Tìm hiểu ngôn ngữ các dân tộc ở Việt Nam, 222–245. Hà Nội: Nhà xuất bản khoa học xã hội.
Thomas, David. 1992. On sesquisyllabic structure. Mon-Khmer Studies 21. 207–210.
Thurgood, Ela. 2004. Phonation types in Javanese. Oceanic Linguistics 43. 277–295.
Thurgood, Graham. 1996. Language contact and the directionality of internal drift: The development of tones and registers in Chamic. Language 72. 1–31.
Thurgood, Graham. 1999. From Ancient Cham to modern dialects: Two thousand years of language contact and change. Honolulu: University of Hawai’i Press.


Thurgood, Graham. 2000. Voice quality differences and the origin of diphthongs. Proceedings of the Twenty-Sixth Annual Meeting of the Berkeley Linguistics Society, 295–303. DOI: https://doi.org/10.3765/bls.v26i1.1162.
Thurgood, Graham. 2002. Vietnamese and tonogenesis: Revising the model and the analysis. Diachronica 19. 333–363.
Watkins, Justin. 2002. The phonetics of Wa: Experimental phonetics, phonology, orthography and sociolinguistics. Canberra: Australian National University.
Watson, Richard, Saundra Watson & Cubuat. 1979. Pacoh dictionary: Pacoh-Vietnamese-English. Huntington Beach: SIL.
Wayland, Ratree & Allard Jongman. 2001. Chanthaburi Khmer vowels: Phonetic and phonemic analyses. Mon-Khmer Studies 31. 65–82.
Wayland, Ratree & Allard Jongman. 2002. Registrogenesis in Khmer: A phonetic account. Mon-Khmer Studies 32. 101–115.
Wayland, Ratree & Allard Jongman. 2003. Acoustic correlates of breathy and clear vowels: The case of Khmer. Journal of Phonetics 31. 181–201.
Westbury, John R. 1983. Enlargement of the supraglottal cavity and its relation to stop consonant voicing. Journal of the Acoustical Society of America 73. 1322–1336.
Westbury, John R. & Patricia A. Keating. 1986. On the naturalness of stop consonant voicing. Journal of Linguistics 22. 145–166.

Stefanie Siebenhütter

30 Contact and convergence in the semantics of MSEA

30.1 Introduction

30.1.1 MSEA as a “language area par excellence”

MSEA refers to central Southeast Asia (SEA),¹ a linguistic area where many languages, irrespective of their genetic affiliation, use the same concepts to encode basic domains such as space, time, and emotion. As laid out in the introduction of this volume, strong typological similarity across language and family boundaries (i.e. shared morpho-syntactic and phonological/phonetic features) has already been substantiated by previous research. However, MSEAn languages also share semantic features in both the lexicon and collocations/idioms. In this chapter, the term conceptual area describes a linguistic area that has emerged through parallelisms at the conceptual level, that is, the signifié (or semantic) level of linguistic signs. These parallelisms concern the encoding of specific concepts, such as spatial concepts (Siebenhütter 2018, 2019a, 2019b), and developed through long-standing and intensive internal and external language contact in MSEA.

30.1.2 Language families and genetic diversity in MSEA

Mainland Southeast Asia (MSEA) is home to several language families that have been exposed to more or less intensive mutual contact: Hmong-Mien, Austroasiatic, Sino-Tibetan, Austronesian, and Tai-Kadai. MSEA forms an ideal “Sprachbund”, or “linguistic area” (Stolz 2002: 260). It is crucial to distinguish genetically defined language families from a Sprachbund in every respect (cf. Stolz 2002: 260). The term Sprachbund designates an area of linguistic convergence, comprising

[…] languages that have a great similarity in syntactic terms, a similarity in the principles of morphological construction, and a large number of common cultural words, sometimes external similarities in the body of the sound systems, but no systematic sound correspondences, no agreement in the phonetic form of the morphological elements and no common elementary words […]. (Trubetzkoy 1930: 18, author’s translation)

¹ Mainland Southeast Asia.
https://doi.org/10.1515/9783110558142-030


Previous research has demonstrated the strong typological similarities among the languages of MSEA despite their genetic diversity (e.g. Bisang 1996, 2008; Comrie 2007; Enfield 2005b; Enfield and Comrie 2015; Matisoff 2019), and the MSEA Sprachbund is well established, with significant documented parallels on the morphological, phonetic, and structural levels (Bisang 1992; Comrie 2007; Enfield 2008; Enfield and Comrie 2015; Gil 2015). Continental Southeast Asia (SEA) is often described as a prime example of a linguistic area. According to Comrie, (M)SEA is one of the best, if not the best, examples of a linguistic area, i.e. as an area that is characterized on the one hand by internal homogeneity, and on the other hand clearly delimited from surrounding areas. (Comrie 2007: 18)

30.1.3 Demarcation of the MSEA: transition zones

Internal homogeneity is evident in that the languages of the area share a significant number of structural features, even though they belong to different language families. From a structural perspective, it is argued that within the Mainland Southeast Asian (MSEA) linguistic area (e.g. Matisoff 2003; Bisang 2006; Enfield 2005, 2011; Comrie 2007), some languages are said to be in the core of the language area, while others are said to be in the periphery. (De Sousa 2015: 356)

In this respect, the SEA area is demarcated from adjacent areas, although there are transition zones that can vary depending on the feature being considered (cf. Comrie 2007). A linguistic area is not necessarily coextensive with political borders or nation-state territories. The “margins of an area typically show certain special characteristics: weakening or absence of some of the convergent features, or mixed phenomena (such as both prepositions and postpositions); and thus statistical frequency becomes a consideration. These ‘transition zones’ themselves help define an area” (Bashir 2016: 244). For MSEA, such “transition zones” (Güldemann 2018) are found towards Austronesian languages in the southeast (Malay and Indonesian), Sinitic languages to the north (Cantonese, Mandarin, etc.), and Tibeto-Burman languages to the west (Burmese, Chin, Kachin). However, there is no agreement about the exact border of such a transition zone, and it will probably never be possible to define an exact borderline. Where the border falls depends on the structural features included in the definition and on questions such as: How many languages are needed? How many features should be included? Are all features equally relevant? What about the differences among the languages? It has been argued that the geographical borders of MSEA can be drawn more clearly towards the north (Comrie 2007). However, as in the south, MSEAn features thin out within Sinitic the further north one gets, which suggests that there is no clear northern edge of MSEA. Likewise, the border is fluid southward in the direction of Indonesia and Malaysia (Comrie 2007). For example, to




the west, Jenny (2015: 203–204) argues that the languages of Myanmar have not received the attention they deserve and that the territory from the Salween river up to and including Northeast India may be considered a transitional zone between South Asia and SEA, as many of the languages of Myanmar share strong connections with central SEA languages. However, the question is still under discussion:

The best known languages of Myanmar are thus superficially closer to the languages of South Asia, which lead Masica (1976: 183) to including Burmese as a (peripheral) member of the South Asian sprachbund, stating that there is “a profound hiatus between India and Southeast Asia beyond Burma”. In a recent study, Vittrant (2011) asks whether Burmese is linguistically part of (mainland) Southeast Asia. As Masica (1976), Vittrant (2011) looks only at Burmese, leaving aside the other languages spoken in present day Myanmar. Unlike Masica (1976), Vittrant (2011) concludes that Burmese shares enough features with Southeast Asian languages to be included in this linguistic area. (Jenny 2015: 155)

SEA languages might share very few linguistic features with South Asian languages (i.e. Indo-Aryan and Dravidian, with Tibeto-Burman and Munda rather peripheral) (Post 2015: 247–248). In the east, the ocean forms a natural border. Towards the north, De Sousa (2015) notes that, of the Sinitic languages, the southernmost are the most similar to the languages of MSEA. This is not surprising, as the regions that now form the southwestern part of China were originally inhabited by Tai-Kadai, Miao-Yao, and Austroasiatic speakers (Ansaldo 2010: 920). From a structural perspective, we can conclude that within the Mainland Southeast Asian (MSEA) linguistic area […], some languages are said to be in the core of the language area, while others are said to be in the periphery. (De Sousa 2015: 356)

Whether a language or language family is included in the MSEA linguistic area depends on the features included or excluded from the areal definition.

30.2 Contact-induced convergence in MSEA

30.2.1 Approaches to contact-induced convergence

Research on structural linguistic convergence among the languages of MSEA, and thus on the existence of a SEA language area, has evolved over time (Bisang 1991, 1992, 1996, 1999, 2004, 2008; Comrie 2007; Enfield 2003, 2005b, 2006, 2008, 2011a, 2011b, 2011c; Enfield and Diffloth 2009; among others). As Sidwell (2015: 52) states, quoting Enfield:

MSEA is a site of long-term contact between languages of several major language families. This contact has resulted in extensive parallels in linguistic structure, making MSEA an illustrative case study for areal linguistics. (Enfield 2005a: 198)


 Stefanie Siebenhütter

However, the MSEA languages also reveal strong sociocultural influences from speakers of non-SEA (particularly Indian and Sinitic) language families. In this context, Sidwell (2015: 52) points out that a purely areal–typological analysis might not give a complete picture of the features involved in areal linguistics and suggests including historical and other statistically significant analyses such as tonal distributions, as have been provided by Brunelle and Kirby (2015, 2017). Several kinds of convergence relevant to areal linguistics have been investigated both synchronically and diachronically, including “prosodic convergence” (Post 2011), “tonal convergence” (Brunelle and Kirby 2015, 2017), “phonetic convergence” (Sidwell 2015), and “syntactic convergence” (Enfield 2003). The approach in this chapter suggests another level of features that can add to the study of areal linguistics: more or less independent conceptual and semantic convergence of structural features, including a sociolinguistic perspective. As Bashir (2016: 244) points out:

Languages participating in a convergence area do not become typologically identical. Although at linguistic boundaries, often not a line but a mixed zone of much multilingualism, this may sometimes seem almost to be the case, more typically languages, especially unrelated ones, retain some typological differences and features of their genetic inheritance. (Bashir 2016: 244)

30.2.2 Contact semantics and its sociocultural and psychological dimensions

Contact semantics has two dimensions, a sociocultural and a psychological one (Ameka 1996). The psychological side attempts to “discover and describe the cognitive mechanisms and conceptual constructs which underpin the organization and access of semantic information within the mind of a bilingual or a second language learner” (Ameka 1996: 130–131). However, this chapter focuses primarily on the sociocultural side, which investigates group-defined conventionalizations of meaning structures, and the effect of contact between distinct cultural groups on culturally determined parameters of meaning. (Ameka 1996: 131)

Certainly, both dimensions play a role in research on language contact phenomena. For areal linguistics research, we focus for the moment on the sociocultural dimension, as it is more relevant in this context. There are several indications of language contact in MSEA over many centuries. The Mekong originating in China, flows through six countries having joined cultures and languages of the local populations for a long time. Trade via the river path through China, Myanmar, Vietnam, Laos, Thailand, and Cambodia allowed a lively exchange of Chinese merchants with the traders of Southeast Asia and gave wings to the exchange of goods within the Southeast Asian population. It is obvious that with this exchange, language contact was also inevitable. (Siebenhütter 2019b: 29)




Further, we find south-facing downstream migrations, mainly from China toward Thailand, Laos, and Vietnam (Siebenhütter 2019b). Political borders as we know them have existed for a relatively short time, since shortly before World War II, and mainland Southeast Asia as a whole displays a range of cross-cutting and overlapping cultural commonalities, and cannot be considered a single distinct “cultural area”. (Enfield 2003: 47)

Linguistic areas like Southeast Asia might be understood as an impenetrable maze of intertwined branches, in which one is likely to find slow “percolations” or “filtrations” of small groups of people in “migrations of population groups” (Aikhenvald and Dixon 2006: 6). For example, in the 19th century the Hmong immigrated from China to Laos and other SEA countries, moving through Laos to Thailand (White 2019). Language contact may have several possible semantic effects, for instance transfer of meaning (i.e. semantic borrowing), where meaning can be equated with concept. Through the intensive contact described above, presumably not only words but also concepts were borrowed; through trade, for example, one language may have adopted a formerly alien concept from a contact language. The scope of meaning may shrink or widen in the process of borrowing (semantic reduction or semantic broadening). In semantic reduction, a word in the donor language may have multiple related meanings, but in the borrowing language the meaning may be reduced to a single one (Ameka 1996: 135). Semantic broadening, or expansion, entails that a meaning has become less specific through loss of semantic components at the conceptual level (Ameka 1996: 135). What is relevant for identifying or verifying a linguistic area on the conceptual level is the convergence (of semantic scope) that can be found in genetically diverse languages in the designated area. There is still little research that deals with contact semantics, and Ameka’s (1996: 136) appraisal that contact semantics has not been pursued as an area of study in its own right is still valid. In the discussion of sociocultural factors regarding the definition of an MSEA area, the terms culture and knowledge must be defined. In addition, in order to examine the development of cultural identity in the SEA area, the connection between language and cognition also needs to be understood.
The various hypotheses of cultural linguistics are not yet sufficiently defined, and many points remain unclear (Palmer 1996). Nonetheless, it is commonly accepted that language and cognition are inextricably interrelated and that conceptualization (categorization, schematization,2 and metaphorization) is a (socio)culturally influenced phenomenon. Likewise, language and knowledge are interconnected and, through the acquisition of values and norms with language acquisition, are decisively shaped by their cultural environment. In order to understand these difficult phenomena of culture and language, an in-depth consideration of the social structures governing the lives of people – society in general and the group in particular – is necessary. This chapter highlights only the following: language cannot be seen as a conductive influence on culture and, in return, culture does not determine the structure of a language (cf. Janda 2006). Rather, the reason why these two phenomena cannot be considered separately is that culture and language develop a symbiotic relationship; in other words, language can be understood as part of culture.

2 In the process of perception, active schemes (equivalent to biological universals) should not be confused with learned categorizations. The perception of the speaker is selective.

30.2.3 The role of bi- and multilingualism in language contact

A large portion of the MSEA population is multilingual, having knowledge of two or more languages (Djité 2011: 11). The concentration of minority languages is comparatively high, and language contact is ubiquitous all over the MSEA area. Multilingualism is a pervasive fact among speakers of SEA minority languages such as Kui, Bru, and Hmong, especially among speakers older than 50 years of age (Siebenhütter 2019b; Tomioka 2019; White 2019). This is one of the main reasons why multilingualism plays a crucial role in areal linguistic research in MSEA. Most minority language speakers do not acquire other languages as adults but are rather socialized into them, especially in societies where more than one language is spoken on a daily basis and in a broad range of situations: at home, at the market, at school, at work, and so on. Sociocultural concepts are embedded in language, and the architecture of each language contains culture-specific properties that do not necessarily stop at a national border, but rather can be quite similar in the adjacent areas on both sides of the border, as people generally have contact with others in their geographical vicinity (Siebenhütter 2019b).

30.3 Types of contact-induced transfer and convergence in MSEA

30.3.1 Concept transfer and conceptualization transfer

Contact phenomena such as borrowing and interference have been widely researched throughout the languages of the world since Haugen (1950) and Weinreich (1953) (cf. Mathiot and Rissel 1996). As for MSEA, in this chapter, conceptual transfer is understood in the sense of Jarvis (2007: 53), who differentiates two types of conceptual transfer: concept transfer, which means “transfer arising from crosslinguistic differences in the conceptual categories stored in L2 users’ long-term memory” (Jarvis 2007: 53), and conceptualization transfer, which refers to “transfer arising from crosslinguistic differences in the ways L2 users process conceptual knowledge and form temporary representations in their working memory” (Jarvis 2007: 53). Pavlenko and Jarvis (2002) found that transfer can be bidirectional already after a short period of exposure to a foreign language:

variations in age of arrival and length of exposure did not significantly affect directionality or amount of transfer (…) This, in turn, suggests that L2 users who have been exposed to the L2 for 3 years or longer through intensive interaction in the target language context may start exhibiting bidirectional transfer effects in their two languages, not just L1 transfer in their use of the L2. (Pavlenko and Jarvis 2002: 209)

The intensity of interaction seems to be more relevant than the length of exposure (Pavlenko and Jarvis 2002: 209). There are empirical assumptions of the “conceptual transfer hypothesis” (Jarvis 2000, 2007, 2011; Jarvis and Pavlenko 2008).

30.3.2 The semantic and the conceptual level

The “Wörter und Sachen” approach, which was propounded as early as around 1900, dealt with the study of the reciprocal relationship between words and objects and the facts they describe (e.g. Jaberg and Jud 1928). For a successful combination, it is necessary to establish a mental link between a linguistic expression and the respective concept (see Jarvis and Pavlenko 2008; Siebenhütter 2016, 2018). In the Theory of Lexical Concepts and Cognitive Models (LCCM Theory, Evans 2009), the semantic representation encoded by the linguistic system is modeled by lexical concepts. Conceptual structure is that part of semantic representation which is encoded by the conceptual system. The three-level model of (cognitive) representation, which includes concepts, lemmas, and lexemes (Jarvis and Pavlenko 2008), allows us to differentiate between conceptual and semantic transfer. The level mainly relevant for the semantic transfer analyzed in this chapter is the level of concepts, described as “mental images, schemas, and scripts” (Jarvis and Pavlenko 2008: 83), which is related to the respective lexeme and lemma, but which comprises more than the general semantic knowledge of a lexeme. This additional content is the semantic knowledge that is formed and influenced by cultural and social factors. The semantic representation, which is described as the semantic dimension of lexical representations, consists of both semantic and conceptual structures.

30.3.3 Loanwords, shared lexemes, shared expressions and concepts

The emergence of conceptual borrowings is discussed in the literature (e.g. Enfield 2003; Matisoff 2006; Siebenhütter 2019b; Sybesma 2008). Since Trubetzkoy (1930), linguistic areas have traditionally been defined through the existence of a specific number of parallel structural features despite the prevalence of genetic diversity, that is, despite the languages in question belonging to different language families (Bisang 1991, 1992, 1996, 1999, 2004, 2008; Comrie 2007; Enfield 2003, 2005b, 2006, 2008, 2011a, 2011b, 2011c; Enfield and Diffloth 2009; Post 2015; Stolz 2002; Tosco 2008; etc.). A more recent approach aims to widen the areal definition using conceptual and semantic convergence (Jarvis 1998, 2000, 2007, 2011, 2016; Jarvis and Pavlenko 2008; Pavlenko and Jarvis 2002; Siebenhütter 2016, 2018, 2019a, 2019b). Convergence in the broader perspective suggested here can be divided into at least the following parts: loanwords, loan translations, shared concepts, and shared ways of expression. Shared lexicon due to language contact is a common form of overlap in linguistic areas, and there are several examples of borrowing in MSEA: for instance, loanwords resulting from language contact among MSEA languages, and shared loanwords from contact languages outside the MSEA geographical area, such as Pali, Sanskrit, European languages (English, Portuguese), and Chinese varieties, while Austronesian loanwords seem not to be very widespread in MSEA. Generally, we can thus distinguish MSEA external and MSEA internal language contact. However, it is sometimes not easy to separate the multiple influences that the MSEA internal and external languages had on each other and to trace vocabulary back to its origin.

30.3.3.1 Contact-induced convergence with languages outside of SEA

30.3.3.1.1 MSEA external loans

MSEA external loans throughout the MSEAn languages are mostly of Indian and Chinese origin. However, Persian loans are also common, in many cases ultimately of Arabic origin, often entering MSEA through South Asian intermediaries. Nowadays, more and more English can be found in MSEA languages, and it is currently by far the most important donor language in Thai (Suthiwan and Tadmor 2009: 611). Lao and Thai, which are often treated as a single source in the literature as they are closely related Tai varieties (Enfield 2002; White 2019), share numerous loanwords from language contact, for example with Khmer, which borrowed heavily from Indian languages through religious and other sociocultural influences. Varieties of Laotian Hmong exhibit large amounts of lexical material that has been adapted from contact languages, mostly from Lao, Thai, and Sinitic languages (Mortensen 2000; White 2019). Mortensen (2000: 59) assumes that more than 20 % of the vocabulary in Hmong is borrowed from Sinitic languages; for example, terms for metals and metalworking, a large number of words for crops and domestic animals, as well as words with grammatical functions such as aspectual markers and conjunctions (Mortensen 2000: 60). Tai languages in southern China and Vietnam have mainly borrowed from Chinese, while the languages of the southwestern branch of Tai in Thailand, Laos, and Myanmar borrowed mainly from Sanskrit and Pali (Kausen 2013: 931). Another example is the Chinese influence on Vietnamese. However, this influence is structurally superficial at all levels: lexicon, phonology, morphology and syntax (Alves 2001). Among these MSEA external influences we can further differentiate between cultural and core vocabulary loanwords. A distinctly Austroasiatic core vocabulary predominates in Vietnamese, and most of its loanwords have been semantically and/or syntactically adjusted to Southeast Asian typology (Alves 2001). The study of loanwords and language contact is especially interesting for languages whose past contact scenarios are not well known and where linguistic evidence helps us to better understand the kind and intensity of contact (Jenny 2017). However, parallelism throughout basic domains like time and space may be identified as shared metaphors or common borrowings from, for example, Sanskrit and Pali in the languages of MSEA. The main contact languages that left traces in Burmese are all linguistically unrelated to Burmese: Pali and Sanskrit, Mon, and English (Jenny 2017: 2). Like other MSEA languages, through long-term exposure to the influence of many different languages, Burmese appears today as a language with a very mixed lexicon, borrowed not only from outside the MSEA area (i.e. Hindi, Chinese, Malay, English, and Arabic) but also from other MSEA languages such as Thai, Shan, and Mon (Jenny 2017). Thai and Khmer also share a long period of historical contact resulting in mutual borrowing (Huffman 1986). The Tai kingdoms centered first at Sukhothai, later at Lopburi and Ayutthaya (13th to 18th century), gradually replaced the Angkor Empire (9th to 13th century) as the main political force in the area and took over the whole package of Khmer palace culture and language. Today the Royal Thai language is mostly made up of Khmer lexicon, interspersed with Sanskrit elements.
Sometimes it is difficult to determine the direction of borrowing, but it seems that the direction is from Sanskrit to Khmer to Thai, with later back-borrowing from Thai into Khmer (p.c. Jenny 2020). Certain criteria may help identify loanwords, such as comparing morphosyntactic structure, phonology, pronunciation, and even orthography when analyzing text material (cf. Huffman 1986). For example, if Pali and Sanskrit loanwords are eliminated from Thai, the language basically becomes monosyllabic, whereas Khmer has a pervasive pattern of derivation by prefixation and infixation (Huffman 1986). This evidence, however, is still not clear enough. More reliable evidence for identifying loanwords comes from shared sets which have a consistent derivational function in Khmer but not in Thai (Huffman 1986: 201). The best way to identify the direction of borrowing is to see in which family we find cognates of a given lexeme. Derivation also helps in diagnosis, as the morphology may be borrowed and applied to native lexemes or to loanwords from other languages. Thai sǐəŋ ‘sound, voice’ is probably a Chinese loan (shēng), but it shows Khmer-like derivation in sǎmniəŋ ‘accent’. Khmer loans can be identified, for example, through Khmer derivational sets of which Thai has the derived form but not the base; for example, Khmer tuk ‘to put, place’ > bantuk ‘to load’; Thai banthúk ‘to load’ (Huffman 1986: 202).


Most Buddhist religious terms in Thai (and other MSEAn languages) are of Indic origin (Suthiwan and Tadmor 2009: 602). The considerable Indianization that present-day MSEA has undergone is reflected in the national SEAn languages and in the minority languages as well. Matisoff (2001) gives examples of how “words have filtered all the way down from Sanskrit/Pali to humble minority languages like Shan, Lahu, or Phunoi, via the great regional literary languages Mon, Burmese, Khmer, and Thai (e.g. Sanskrit/Pali > Mon > Burmese > Shan > Lahu; or Sanskrit/Pali > Khmer > Thai > Phunoi)” (Matisoff 2001: 22): Sanskrit ācārya ‘teacher, sage’

> WB charā > Lahu šālā > Mon ʔəca > Khmer ʔaːcaː(r) > Thai ʔaːcaːn

Indic lexemes form an established part of MSEAn languages’ vocabularies and are therefore still in use in present-day Thai, for example “words for many everyday items, especially from the field of time, e.g. weːlaː ‘time’ (Sanskrit vēlā ‘time’), naːlíkaː ‘clock, watch’ (< Sanskrit nāḍika ‘measure of time’), ʔaːthít ‘week’ (< Sanskrit āditya ‘sun’), khànà ‘moment’ (< Pali khaṇa ‘moment’), pàtìthin ‘calendar’ (< Pali patidina ‘daily’), sàmăy ‘period’ (Sanskrit samaya ‘juncture, time, season’), as well as the names of all days of the week and months of the year” (Suthiwan and Tadmor 2009: 602). As the Tai homeland “was probably located in modern-day southern China, not far from the Vietnamese border” (Suthiwan and Tadmor 2009: 601), the Tais must have been under strong influence from the nearby numerically, politically, and technologically superior Chinese. These early borrowings from Chinese include “common vocabulary items such as fùn ‘dust’ (< *pjuən ‘dust’), sĭaŋ ‘sound, voice’ (< *sjang ‘sound’), and thùa ‘bean’ (< *deu ‘bean’)” (Suthiwan and Tadmor 2009: 601). The Chinese immigrants and traders had a “strong influence on various aspects of Tai life and culture, such as commerce, cuisine, and music” (Suthiwan and Tadmor 2009: 602). Similarly, in Vietnamese the “socio-historical conditions under which the Chinese became the dominant sociolinguistic influence” (Alves 2009: 621) are clearly visible. “Vietnamese borrowed hundreds of words that form a notable portion of Vietnamese vocabulary today” (Alves 2009: 621), including a semantically wide range of words that have become entirely nativized in Vietnamese. In Vietnamese, the semantic fields most influenced by Chinese are Social and political relations, Religion and belief, Warfare and hunting, and Law, among others (Alves 2009: 623).
Similarly, in Thai, the semantic fields with the highest borrowing rates are Religion and belief, Quantity, and Law, which all have a borrowing rate of over 40 % (Suthiwan and Tadmor 2009: 606). Another language that has been heavily influenced by Chinese over a 2,500-year period is Hmong, as seen for example in the White Hmong dialect of Laos (Ratliff 2009: 652). Loanwords in White Hmong stem mainly from Modern Chinese, alongside Old and Middle Chinese as well as Tibeto-Burman languages (Ratliff 2009: 647).




In Thai, loanwords taken from Sanskrit and Pali include nouns and verbs as well as content words and function words; the borrowing rate is higher for content words than for function words, and the proportion of loanwords among nouns is almost twice as high as among verbs (Suthiwan and Tadmor 2009: 605).

30.3.3.1.2 Shared lexemes (Wanderwörter)

Wanderwörter (‘wandering words’) play a rather special role among loanwords. By definition, they are loanwords that are widely spread among numerous languages and cultures, including languages far away from each other. The language contact situation in which wandering words usually spread is trade (Trask 2000: 366), although the scenarios and processes are often not well understood. Further, tracing wandering words back to their origins is generally difficult or not possible at all. As shared lexemes (Wanderwörter) we can consider lexemes that look similar and have similar meanings throughout the area, but for which no obvious origin or path of borrowing can be established, e.g. Thai pʰráʔ ‘monk, Buddha (statue, amulet)’, pʰəjaː ‘lord’, Burmese pʰəjà, Old Burmese purhā ‘Buddha’ (Jenny and McCormick 2015: 548), and Mon pəɲɛ̀ə ‘lord’. Another wandering word widely used in MSEA is Thai faràng, translatable as ‘white foreigner’ or ‘western foreigner’. The term is derived from Frank, which originally referred to the Germanic-speaking people in the region of today’s France (Kitisara 2010: 60). Equivalents of the term farang are found in many languages; for example, faranggi (Persian), farengi or farangi (Hindi), pirangi (Tamil), palangi (Samoan/Polynesian), franji/ifrangi (Turkish and Arabic). The Thai term faràng was borrowed from Muslim Persian and Indian traders during the Ayutthaya period, when it was used to refer to the first Europeans (Portuguese) to visit Siam (Kitisara 2010: 61). Subsequently, farang came to be used for other Europeans as well, and more recently for all Caucasians and the West in general, and as a category referring to things of Western origin; for example, man farang ‘potato’ (lit. ‘farang tuber’), mak farang ‘chewing gum’ (lit. ‘farang betel’), and so on (Kitisara 2010: 61).
Variants of the word are used throughout MSEA, as in Laos (falang5) and Cambodia (barang) (Harris 1986: 9–12; Kitisara 2010: 61; Thion 1993: 18–23). As further examples we can also include grammatical words like lɛ ‘and, also’ and kɔ ‘then, topic-comment-linker’. These appear in similar forms and functions across SEA from Burmese to Khmer/Lao, including Thai, Shan, Karenic, and perhaps also Chin. The point here is that there is no obvious source and direction of borrowing, and there may in each case actually be more than one lexeme involved, but they still converge in form and meaning, although they behave differently in terms of syntax in some cases. According to Matisoff, the “best study of technological and cultural Wanderwörter in SEA remains Benedict 1975” (Matisoff 2001: 18), though Benedict’s objective was to show genetic relationship, not cultural loanwords.


30.3.3.2 Contact-induced convergence among MSEA languages

30.3.3.2.1 MSEA internal loans

MSEA internal loans concern mostly the spread of lexical items and cultural concepts from the national languages to other languages in the area. MSEA internal loans also refer to loans from declining states to their successors, as in the case of Mon and Burmese, or Khmer and Thai. An example is Khmer vocabulary in Thai:

If we examine the texts of the time of Sukhothai (13th–15th century), we notice that, with the exception of a few rare examples (daṃnep ‘to begin with’, khbuṅ ‘vertex’) most of the words borrowed during this period, which are also regularly used words, are well integrated in the Siamese language and remain in use until today (Varasarin 1984: 93).

Among the MSEA internal influences we can again differentiate between cultural and core vocabulary loanwords. For example, before “the arrival of the Tai, the area known today as central Thailand was inhabited mostly by Mons, who spoke an Austroasiatic language unrelated to Tai” (Suthiwan and Tadmor 2009: 602). The Tai-Vietnamese contact in prehistory must have been of minor importance, and the overall number of Tai loanwords in Vietnamese is small (Alves 2009: 621). However, there may have been influence in other domains, like the tone system (p.c., Jenny 2020). There are Chamic loans, mainly nouns, verbs and adjectives, in Vietnamese (Alves 2009: 622). In White Hmong, loanwords originating from Lao as well as from Tibeto-Burman languages can be found (Ratliff 2009: 647). Although Asian prehistory is not fully understood, Hmong shows traces of a contact relationship with an old Tibeto-Burman donor, and a few basic words shared with members of Mon-Khmer, Austronesian and Tai-Kadai languages (Ratliff 2009: 652).
Table 1 gives an example of spatial notions in Khmer, Lao, Thai, and Vietnamese that can be translated nearly word for word from one language to another without needing additional spatial relators, other constructions, or lexemes. For example, Thai kɛ̂ːw jùː bon tóʔ ‘The cup is on the table’ can be translated word for word into Vietnamese cái cốc ở trên bàn or into Khmer peng nɨv lǝǝ tok. Lao is structurally and lexically the closest to Thai among the investigated languages. Still, the basic locator construction (blc) remains the same in all languages:

Tab. 1: Comparison of a basic spatial relation in MSEA languages (Siebenhütter 2019b: 288).

              Figure-object   Locative marker   Spatial relator   Ground-object
Khmer         peng            nɨv               lǝǝ               tok
Lao           cɔ̏ːk5           yūː1              tʰə́ŋ2             tóʔ5
Thai          kɛ̂ːw            jùː               bon               tóʔ
Vietnamese    cái cốc         ở                 trên              bàn
Gloss         cup             loc               on                table
Translation   ‘The cup is on the table.’




30.3.3.2.2 Cultural words

Another term regularly found in the literature on areal linguistics is cultural word. One of the earliest mentions of it is by Trubetzkoy (1930), who describes cultural words as terms that are familiar and commonplace in use and frequently encountered (emblematic) in a group of speakers, and through which the group demarcates itself from other groups that do not use those culture words. One basic assumption is that cultural imprinting influences our perception and that perception is therefore selective. We may ask: which cultural schemes are active in the process of perception of SEA native speakers (cultural schematization)? Regarding shared knowledge, certain “Kulturwörter” (Trubetzkoy 1930) have been identified which can be translated one to one among SEA languages, whereas they have to be translated in a circuitous and complicated way into languages that do not belong to MSEA. Thus, speakers of MSEA languages share a certain collective knowledge that speakers located outside of this linguistic area do not have; for example, the practice of reserving the seats on the roof of buses for men, mainly for cultural reasons such as women being considered unclean due to menstruation. Also, in some areas, men wear a traditional garment called the sarong (or lungi), which is again a cultural practice. Certain foods and plants, religious ceremonies and other associated activities and objects are also examples of collective cultural knowledge which may be encoded in the local language. For the purpose of defining an area, it is crucial to be aware that these items and activities are mainly known in MSEA and are little known or completely unknown in the adjacent areas. An example is Lao tȏm,3 a sticky rice cake, which is prepared in a similar manner throughout SEA. The fact that Lao–English dictionaries need several lines to describe the term tôm shows that it is not common in languages outside the SEA area.
In Khmer, this term can be translated with the single word kaatɑm; in Vietnamese, it can be translated with equal ease as bánh chưng. Another example is Thai kràtìp, a special container for sticky rice that is used mainly in northern Thailand, Laos, and Vietnam. More examples can be found in the area of traditional clothing, which is worn in a similar manner in all four countries. These cultural traits correspond to the tendency of languages in a linguistic area to ensure translatability among them (Heine and Kuteva 2005: 179–180; Siebenhütter 2019b).

3 A Laotian cake, either with sticky rice and bananas or with sugar, coconut and sticky rice, that is steamed in banana leaves.

30.3.3.2.3 Shared concepts

As shared concepts we can understand concrete (e.g. objects, activities) and abstract (spatial, temporal, semantic relations, etc.) concepts. Such concepts are mental representations, schemas, and scripts that are stored and organized into conceptual categories (Jarvis and Pavlenko 2008: 82–83). This section includes the lexicalization of cultural items, activities, etc. where there is no evident path of borrowing, as described in the examples above. Such shared categories and shared cultural representations are found in many languages throughout the SEAn linguistic area. Overlap can also occur at the phrase or notion level, as a convergence of the conceptual content of the whole meaning of an event that goes beyond purely structural overlaps such as syntactic convergence, as demonstrated in examples (1a) and (1b).

(1)

a. Lao    cɔ̏ːk5  yūː1  tʰə́ŋ2  tóʔ5
b. Thai   kɛ̂ːw   jùː   bon    tóʔ
          cup    loc   on     table
          ‘The cup is on the table.’

Enfield (2004) illustrates that events are often categorized culturally: events (i.e. man standing, man sitting in front of landscape) illustrated and shown to Australian and Laotian consultants are categorized differently, contingent on the cultural logic the consultants may see in an event. Further, the languages of SEA share the lack of mandatory linguistic categories. For example, there are no mandatory tense, aspect, and modality markers in Khmer and in the majority of East Asian and MSEA languages (Bisang 2008: 20). Moreover, the level of ambiguity and the amount of information that is included in an utterance depending on the context are similar. MSEA languages share a high degree of ambiguity and a less overtly explicit way of speaking. For example, (2)

Lao dék5 hên3 mǎː3 child see dog ‘The child sees the dog’ (Kelz 1984: 101)

Depending on the context, this can also mean “The child sees (the) dogs” or “(The) children see (the) dogs.” Similar examples can be found in many other MSEA languages. At the structural level, these languages share a radically isolating structure with monosyllabic morphemes and minimal or no inflectional morphology (e.g. Enfield 2009: 812; Matisoff 2006: 291). This means that utterances are highly context-dependent: comparatively inexplicit expressions leave room for interpretation, as demonstrated in the Lao example (3):

(3) tam3 khuaj2 taaj3
    crash.into buffalo die
    a. ‘(S/he) crashed into a buffalo and died.’
    b. ‘(S/he) crashed into a buffalo and (it) died.’
    c. ‘(S/he) crashed into a buffalo and (the car) died (i.e., stalled).’
    (Enfield 2009: 811)



Contact and convergence in the semantics of MSEA 


One approach is cultural conceptualization, understood as the way in which people across different cultural groups construe various aspects of the world and their experiences (Sharifian 2011, 2015). A similar approach is that of cultural scripts (“written in the metalanguage of semantic primes” [Goddard and Ye 2015: 71]), which represent cultural norms that exist at different levels of generality and can be related to various aspects of speaking, thinking, feeling, and action (Goddard and Ye 2015: 71; cf. semantic primes by Wierzbicka 1996; cultural scripts by Wierzbicka 2015). Another, relatively new approach is what Enfield (2004) calls ethnosyntax, that is, the linguistic encoding of culturally conventionalized events that are grounded in shared cultural representations/conceptualizations. In Lao and numerous other East and Southeast Asian languages, events can be expressed as serial verb constructions (see example [4]), and such event typicality is a cultural phenomenon which can be accounted for and described in terms of

    cultural representations, typifications which are carried, assumed-to-be-carried, and assumed-to-be-assumed-to-be-carried by all members of a given group. (Enfield 2004: 233)

Enfield (2004: 254) explains that there is a kind of cultural logic, such that

    the interpretation of semiotic (“mediating artefactual”) structures […] entails personal search and retrieval of cultural representations which facilitate the best, most likely, and most ‘logical’ solutions. (Enfield 2004: 254)

Accordingly, speakers employ markedness in order to activate non-default cultural representations in the minds of interlocutors, ensuring convergent culturally logical solutions. (Enfield 2004: 254)

Enfield argues for revising research methodology in the light of “heuristics provided by cultural typifications, by which we assess the typicality or plausibility of situations predicated” (Enfield 2004: 240), and not only with respect to verb serialization as in examples (4a) and (4b).

(4) a. laaw2 paj3 talaat5 sùù4 khùang1 maa2
       3.SG go market buy stuff come
       ‘She/he has come (here) from going and buying stuff at the market.’ (or: ‘She/he has been to the market and bought stuff.’) (Enfield 2004: 240)
    b. laaw2 paj3 talaat5 lèka0 sùù4 khùang1 lèka0 maa2
       3.SG go market C.LNK buy stuff C.LNK come
       ‘She/he went to the market, and then bought stuff, and then came (here).’

Culturally loaded concepts may overlap among cultural groups with similar cultural structures; it is therefore expected that members of different cultural and linguistic communities pay attention to different aspects of knowledge, which might be understood as a different kind of “culturally profiled knowledge system” (Siebenhütter 2019b: 48). In example (4), the difference between (4a) and (4b) is that (4a) can be understood as a single conceptual event with “component-events”, while (4b) is rather interpreted as “each separate conceptual events in themselves” (Enfield 2004: 240). According to Enfield, the “status of possible combinations of conceptual sub-components in complex expressions as more or less ‘normal’ (as culturally defined) affects the accessibility of such combinations to certain productive processes” (Enfield 2004: 255). “Conventionalized/idiomatic meaning” is contingent upon “cultural typifications” (Enfield 2004: 256) and is therefore helpful in defining areas in which speakers share a wide range of sociocultural practices, including linguistic behavior and specific language use. Knowledge (e.g. emotive knowledge, action knowledge, encyclopedic knowledge, episodic knowledge, etc.) plays a crucial role in the way speakers perceive the world. The respective knowledge is said to be directly related to cultural conceptualization. A further question is which categories speakers of MSEA languages use to encode conceptual notions. Shared (semantic) concepts, i.e. an overlap in the conceptual organization of two or more languages, are illustrated in (5) and (6). The languages of MSEA differ considerably in the degree to which animacy plays a role in verbalizing spatial notions. While Indonesian treats animate (or body-part) and inanimate objects in the same way, as in examples (5a) and (5b), in Lao the person is treated like the figure object and needs to be in the figure position (example [6a]); a construction such as the one shown in example (6b) would sound unnatural to native speakers, while (6c), with an inanimate cup, would be acceptable.

(5) Indonesian (Siebenhütter 2019b)
    a. Sepatu di kaki
       shoe loc foot
       ‘The shoe is on the foot.’
    b. Di atas meja
       loc on table
       ‘On the table’

(6) Lao (Siebenhütter 2019b)
    a. kʰoń2 sāy5 kə̏ːp2
       person wear shoe
       ‘The person wears shoes.’
    b. * kə̏ːp yūː1 tʰə́ŋ2 kʰoń2
       ‘Shoe on person.’
    c. cɔ̏ːk5 yūː1 tʰə́ŋ2 tóʔ5
       cup loc on table
       ‘The cup is on the table.’




When considering example (5), we have to keep in mind the basic locative construction (BLC) (Levinson and Wilkins 2006), the construction most commonly used to verbalize a spatial notion. An account of a linguistic area at the conceptual level that considers this phenomenon needs to analyze the likelihood that a language makes use of constructions deviating from the BLC. Such non-BLC constructions are most likely to occur if the Figure–Ground organization deviates from the expected or most natural form, or if the notion is unusual or atypical in some respect, for instance when it involves negative space such as a hole (see Siebenhütter 2019b). Other shared complex concepts of interest are “compounds which have been calqued throughout SEA, but which have so far not been attested elsewhere” (Matisoff 2001: 19):

PIG + CRAZY/ILLNESS ---> EPILEPSY (Khmer, Mon, Thai, Indonesian/Malay, Burmese)
FLY + SHIT ---> FRECKLE/MOLE (Khmer, Mon, Thai, Indonesian)
EYE + X ---> ANKLEBONE: EYE + FOOT (Indonesian, Burmese, Lahu); EYE + COW (Khmer); EYE + ELEPHANT (Mon); EYE + FISH (Vietnamese)
TOOTH + BUG ---> DENTAL CARIES (Khmer, Vietnamese, White Hmong, Thai (literally ‘bug eats tooth’), Jingpho, Burmese, Chinese)
STAR + SHIT ---> METEOR (Hmong, Lahu)

Matisoff (2001) further lists metaphorical extensions “clearly to be considered a SE Asian areal semantic feature, though it is certainly to be found elsewhere” (Matisoff 2001: 19): MOTHER/CHILD to AUGMENTATIVE/DIMINUTIVE. He continues that “so far MOTHER ---> LOCK vs. CHILD ---> KEY has not been observed outside of SE Asia” (Matisoff 2001: 19) and gives a wealth of data from SEA languages such as Indonesian, Cham, White Hmong, Mien, Vietnamese, Thai, Lahu, Chinese, Karen, and Jingpho, e.g. (Matisoff 2001: 19):

Indonesian ibu ‘mother’ / anak ‘child’
    ibu kota ‘capital city’, ibu roti ‘yeast’ (kota ‘city’, roti ‘bread’);
    ibu djari ~ ibu tangan ‘thumb’, ibu kaki ‘big toe’ (djari ‘finger’, tangan ‘hand’, kaki ‘foot’);
    ibu panah ‘bow’ / anak panah ‘arrow’;
    ibu kuntji ‘skeleton key; lock’ / anak kuntji ‘key’

Vietnamese cái- ‘mother’ / con- ‘child’
    ngón tay cái ‘thumb’, ngón chân cái ‘big toe’ (ngón ‘digit’, tay ‘hand’, chân ‘foot’)


Thai mɛ̑ɛ ‘mother’ / lûuk ‘child’
    hŭa-mɛ̑ɛ-myy ‘thumb’, hŭa-mɛ̑ɛ-tiin ‘big toe’ (hŭa ‘head’, myy ‘hand’, tiin ‘foot’)

Shared semantics at the word level describes a shared semantic scope (or semantic space; Ameka 1996) of linguistic symbols (i.e. lexemes). Overlap is usually found at the level of the known basic conceptual domains, for example motion, time, number, emotions, gender, and space (Jarvis and Pavlenko 2008: 122–124). Such concepts are defined as being central to categorization and conceptualization (Evans 2009). Examples for the category of emotions are found in Matisoff (1986). He refers to psycho-collocations as

    a polymorphemic expression referring as a whole to a mental process, quality, or state, one of whose constituents is (…) a noun with explicit psychological reference (translatable by English words like heart, mind, spirit, soul, temper, nature, disposition, mood). (Matisoff 1986: 9)

Accordingly, compound and elaborated psycho-nouns such as Thai TWO – MIND – TWO – HEART for ‘hesitate’ (Matisoff 1986: 42) are an East and Southeast Asian areal feature (Matisoff 1986: 40). While a number of collocations in SEA languages certainly “cannot be localized to a particular source-culture” and rather “seem to reflect universal human thought processes”, Matisoff (1986: 48) identifies typical SEA psycho-metaphor combinations that cannot be found in Western languages. For example, it “is widespread” (in the Tibeto-Burman languages Lahu, Burmese, and Jingpho at any rate) “for the notion of DIE to be extended to mean SETTLED/STABILIZED, even going so far as to acquire the positive sense of SATISFIED/SERENE” (Matisoff 1986: 48). For MSEA, the conceptual area has been approached through the domain of space (Siebenhütter 2016, 2019b). Example (7) shows overlapping semantic scope in Vietnamese, Khmer, Lao, and Indonesian for the concept on/above/over. Khmer lǝǝ covers more spatial notions than the equivalent terms from Lao, Indonesian, and Vietnamese. With respect to this specific spatial notion, Vietnamese, Lao, and Indonesian are found to form a more consistent area in terms of semantic scope overlap than Khmer.

(7)
                       Vietnamese   Lao      Indonesian   Khmer
‘Cup on table.’        trên         tʰə́ŋ2    atas         lǝǝ
‘Lamp over table.’     trên         tʰə́ŋ2    atas         lǝǝ
‘Stamp on letter.’     trên         tʰə́ŋ2    atas         lǝǝ
‘Ring on finger.’      đeo          sāy5     me-lingkar   lǝǝ
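The comparison of semantic scope in example (7) can be made explicit with a small computation. The following Python sketch is an illustration only and not the procedure used in Siebenhütter (2016, 2019b): it treats each language as a mapping from the four scenes to terms and measures, for every pair of languages, the share of scene pairs that both languages either group under one term or keep apart. The function names, the agreement measure, and the Khmer value for ‘ring on finger’ (assumed here to be lǝǝ, following the statement that lǝǝ covers more spatial notions) are assumptions made for this sketch.

```python
from itertools import combinations

# Scenes of example (7), in table order.
SCENES = ["cup on table", "lamp over table", "stamp on letter", "ring on finger"]

# Term used per scene, read off example (7).
# ASSUMPTION: the Khmer term for 'ring on finger' is taken to be lǝǝ,
# based on the statement that lǝǝ covers more spatial notions.
TERMS = {
    "Vietnamese": ["trên", "trên", "trên", "đeo"],
    "Lao": ["tʰə́ŋ2", "tʰə́ŋ2", "tʰə́ŋ2", "sāy5"],
    "Indonesian": ["atas", "atas", "atas", "me-lingkar"],
    "Khmer": ["lǝǝ", "lǝǝ", "lǝǝ", "lǝǝ"],
}


def colexified_pairs(terms):
    """Return the set of scene-index pairs that share one term."""
    return {
        (i, j)
        for i, j in combinations(range(len(terms)), 2)
        if terms[i] == terms[j]
    }


def scope_agreement(terms_a, terms_b):
    """Share of scene pairs that both languages treat alike.

    A pair counts as agreement if both languages use one term for the
    two scenes, or if both languages use different terms for them.
    (Hypothetical measure chosen for this sketch.)
    """
    pairs = list(combinations(range(len(terms_a)), 2))
    merged_a = colexified_pairs(terms_a)
    merged_b = colexified_pairs(terms_b)
    agreeing = sum((p in merged_a) == (p in merged_b) for p in pairs)
    return agreeing / len(pairs)


if __name__ == "__main__":
    for lang_a, lang_b in combinations(TERMS, 2):
        score = scope_agreement(TERMS[lang_a], TERMS[lang_b])
        print(f"{lang_a} / {lang_b}: {score:.2f}")
```

Under these assumptions, Vietnamese, Lao, and Indonesian agree fully with one another, while each agrees with Khmer on only half of the scene pairs, which mirrors the grouping described for example (7).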

However, we cannot use a single domain alone for the definition of a conceptual area. Rather, if we can find this tendency of semantic scope overlap throughout a range of domains (i.e. motion, time, number, gender, and space), the semantic level can provide a reliable source for the definition of a conceptual area. Example (7) would suggest a conceptual area that includes Indonesian but excludes Khmer. The situation for Thai would look different again, as Thai has a separate word for the notion ‘over’, nɯ̌ə. This is of course true only for the given example, which illustrates the semantic scope of the spatial notion on/over for only a small selection of languages. For defining a linguistic area solely at the level of conceptual representation, more of the basic conceptual domains need to show overlap. However, it is more likely that the conceptual or semantic level will serve as one feature among others, that is, morphological, syntactic, and phonological ones, for defining a linguistic area. The importance of combining several features for defining a linguistic area was demonstrated in Siebenhütter (2018). If we consider only one lexical representation, for example the one represented by English “wear” as in “wear a hat”, “wear a shoe”, “wear a ring”, and “wear a necklace,” then it turns out that the central Khmer language shares less with Thai, Lao, and Vietnamese and might, therefore, be excluded from the MSEA area, whereas Indonesian, a language generally seen as peripheral to the area, would turn out to be a better match.4 A definition at the conceptual or semantic level should ideally be seen as an additional feature. In this context, Matisoff (2001) also mentions “‘negative areal semantic features’, i.e. associations that are well exemplified elsewhere, but never seem to turn up in SE Asian languages”, such as the “Indo-European association FINGER GRASP (cf. German fangen)” that “has not been attested in any language family of SE Asia”, and the “conceptual opposition between HEART and REASON (cf. Pascal’s aphorism, Le coeur a ses raisons, que la raison ne connaît point) seems quite foreign to East and SE Asia” (Matisoff 2001: 21).
Using the example of ‘give’ and ‘get’ constructions, Jenny (2015) presents structural evidence of several centuries of intensive language contact, which points to similarities between Tibeto-Burman languages and the core languages of SEA. It is not surprising that SEA languages share terms for typical local flora and fauna: “a number of feared, spectacular, or ubiquitous species (e.g. bird of prey, tiger, elephant, crocodile, rabbit) have diffused widely through the area” (Matisoff 2001: 18). Accordingly, it can be demonstrated in several cases, for example, that Mon-Khmer is the source of animal-name loans in Tibeto-Burman (Matisoff 2001: 18, citing Shorto 1973). As “another important category of areally diffused vocabulary”, Matisoff (2001: 18) mentions technological terms, such as a root for ‘crossbow’ “that is found in Chinese, Karen, Hmong-Mien, Tai and Mon-Khmer” (Matisoff 2001: 18).

4 However, this might also be due to the fact that national languages have more in common with English than with less standardized MSEAn languages, because there is a general tendency to simplify the vocabulary and get rid of such distinctions in languages with many L2 speakers.


30.3.3.2.4 Shared ways of expression

There are further shared ways of expression throughout the SEA area. These can include elaborate expressions, ideophones, ways of greeting, thanking, etc. Gil (2015: 280–282) describes “Where are you going?” as a conventionalized form of greeting commonly used in the SEAn area. Directional conventionalized greetings as in example (8) “with ‘where’ are the rule” (Gil 2015: 281) in SEAn languages.

(8) a. Vietnamese
       Đi đâu?
       go where
       ‘Where are you going?’
    b. Jakarta Indonesian
       Mau ke mana?
       want to where
       ‘Where are you going?’
    (both: Gil 2015: 281)

Another form of greeting often heard, at least in East Asia, is ‘Have you eaten (rice) yet?’, ‘Did you eat well?’, or ‘What did you eat?’ (Wang 2019); in Vietnamese, for example, one might say Ăn cơm chưa? ‘Have you eaten yet?’ as a friendly greeting. Another example of a shared expression is Thai kreːŋ-caj ‘not want to impose on someone’, ‘be afraid of offending (someone)’, as in examples (9), which has equivalents in many SEAn languages. In Thai the expression consists of kreːŋ (‘fear’; ‘be afraid of’; ‘be in awe of’) and caj (‘heart’; ‘mind’; ‘spirit’). The equivalent expression is ʔà-na-dɛ in Burmese and hɲa-cɒt in Mon, to mention only some languages of SEA. This is a typical example of an area-specific shared conceptualization that is linguistically encoded in the languages of the area, in each case with independent language-internal means.

(9) a. mâi tɔ̂ŋ kreːŋ-caj náʔ.    ‘Don’t be humble.’
    b. mâi kreːŋ-caj              ‘have no respect; be inconsiderate’
    c. kʰwaːm-kreːŋ-caj           ‘considerateness; thoughtfulness’

Also shared is the grammatical use of a verb meaning ‘to be finished’ to express that something is the case now but was not before, a kind of New Situation marker (NSIT). Examples are Thai lɛ́ːw, as in examples (10), Lao lɛ̑ːw (‘ready, already, finished, done’), Khmer haəj, and Vietnamese rồi (lâu rồi không gặp ‘long time no see’).

(10) a. lɛ́ːw kɤ̀ːt ʔəraj kʰɯ̂n    ‘So what happened?’
     b. lɛ́ːw tɔ̂ŋ.kaːn ʔəraj     ‘Then what do you want?’

In SEA one can find many shared expressions that combine two or more concepts, like Thai rót-faj (‘car fire’) for ‘railway’. Areal equivalents are rʊət-pləəŋ in Khmer and xe-lửa in Vietnamese. Another example of a shared expression is Thai nám-kʰɛ̌ŋ ‘water solid’ for ‘ice’, in Khmer tʰuŋ-kɑ̊k. Such examples show how many languages in the area express a concept in a similar or identical way, but always with indigenous material.

30.4 Summary and outlook

This chapter has explained language convergence from an areal perspective, with a focus on conceptual, lexical, and semantic convergence in the region of MSEA. The examples show that MSEA is a linguistic area where many languages, irrespective of their genetic kinship, exhibit the same conceptualizations in the encoding of basic domains such as space, time, and emotions. It was explained how shared cultural categories may help to define linguistic areas via cultural representations. Areal linguistic research has produced major investigations over the last few decades. One of the future research needs, although not just in areal linguistics, is the more extensive and detailed documentation and study of minority languages, as the study of undescribed small languages can give us new insights into the make-up of linguistic areas. There is (still) a rich variety of minority languages within a relatively small geographical area. In MSEA, as in other areas around the world, there is a need to fill in the gaps in language research by bringing the hundreds of little-studied or completely unstudied minority languages into the fold of language research (Haspelmath 2009: 162). The distinctive linguistic diversity of the MSEA area has been studied in detail in only a few places. There is little knowledge about the majority of MSEA and its diversified languages. Among the major languages of MSEA, Vietnamese is high on the list of languages to be studied in further detail, as existing descriptions and analyses pertaining to these two languages do not meet all criteria needed for basic linguistic description and documentation. Although there are, overall, more published grammars of minority languages than in the past, some of these have been published in SEA languages but not in English or other major world languages. These are, therefore, not easily accessible to researchers from around the world (personal communication: Alves 2016).

Without high-quality material on the major languages of MSEA, further investigation of the numerous existing varieties and dialects is difficult. Another desideratum of research, in areal linguistics and beyond, is the study of multilingual speakers in the SEA area. Speakers of minority languages, who are often multilingual in more than three or four languages right from early childhood, are an example of the creative potential of human cognition. Therefore, multilingualism is another worthwhile focus of future research in specific regions of MSEA.

Acknowledgements: For valuable comments on this chapter, I want to thank Mark Alves, Mathias Jenny, and anonymous reviewers.


References

Aikhenvald, Alexandra Y. & Robert M. W. Dixon. 2006. Introduction. In Alexandra Y. Aikhenvald & Robert M. W. Dixon (eds.), Areal diffusion and genetic inheritance, vol. 1, 1–26. Oxford: Oxford University Press.
Alves, Mark J. 2001. What’s so Chinese about Vietnamese? Manoa: University of Hawaii. http://sealang.net/sala/archives/pdf4/alves2001what.pdf (accessed 7 January 2021).
Alves, Mark J. 2009. Loanwords in Vietnamese. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages. A comparative handbook, 617–637. Berlin: Mouton de Gruyter.
Ameka, Felix K. 1996. Semantics (Chapter 17). In H. Goebl (ed.), Contact linguistics: An international handbook of contemporary research, 130–138. Berlin: Walter de Gruyter.
Ansaldo, Umberto. 2010. Surpass comparatives in Sinitic and beyond: Typology and grammaticalization. Linguistics: An Interdisciplinary Journal of the Language Sciences (Linguistics) 48(4). 919–950.
Bashir, Elena. 2016. Contact and convergence. In Hans Henrich Hock & Elena Bashir (eds.), The languages and linguistics of South Asia: A comprehensive guide, 241–374. Berlin & Boston: Mouton de Gruyter.
Benedict, Paul K. 1975. Austro-Thai language and culture, with a glossary of roots. New Haven: Human Relations Area Files Press.
Bisang, Walter. 1991. Verb serialization, grammaticalization and attractor positions in Chinese, Hmong, Vietnamese, Thai and Khmer. In Hansjakob Seiler & Waldfried Premper (eds.), Partizipation. Das sprachliche Erfassen von Sachverhalten (Language Universals Series 6), 509–562. Tübingen: Narr.
Bisang, Walter. 1992. Das Verb im Chinesischen, Hmong, Vietnamesischen, Thai und Khmer: Vergleichende Grammatik im Rahmen der Verbserialisierung, der Grammatikalisierung und der Attraktorpositionen (Language Universals Series 7). Tübingen: Narr.
Bisang, Walter. 1996. Areal typology and grammaticalization: Processes of grammaticalization based on nouns and verbs in East and Mainland South East Asian languages. Studies in Language: International Journal Sponsored by the Foundation ‘Foundations of Language’ (SLang) 20(3). 519–597.
Bisang, Walter. 1999. Classifiers in East and Southeast Asian languages: Counting and beyond. In Jadranka Gvozdanovic (ed.), Numeral types and changes worldwide, 113–185. Berlin: Mouton de Gruyter.
Bisang, Walter. 2004. Grammaticalization without coevolution of form and meaning: The case of tense-aspect-modality in East and Mainland Southeast Asia. In Walter Bisang, Nikolaus P. Himmelmann & Bjorn Wiemer (eds.), What makes grammaticalization? A look from its fringes and its components (Trends in Linguistics: Studies and Monographs [TiLSaM] 158), 109–138. Berlin: Mouton de Gruyter.
Bisang, Walter. 2008. Grammaticalization and the areal factor: The perspective of East and Mainland Southeast Asian languages. In Maria Jose Lopez-Couso, Elena Seoane & Teresa Fanego (eds.), Rethinking grammaticalization: New perspectives, 15–35. Amsterdam: Benjamins.
Brunelle, Marc & James Kirby. 2015. Re-assessing tonal diversity and geographical convergence in Mainland Southeast Asia. In Nicholas J. Enfield & Bernard Comrie (eds.), Languages of Mainland Southeast Asia: The state of the art (Pacific Linguistics [PL] 649), 82–110. Berlin & Boston: Mouton de Gruyter.
Brunelle, Marc & James Kirby. 2017. Southeast Asian tone in areal perspective. In Raymond Hickey (ed.), The Cambridge handbook of areal linguistics (Cambridge Handbooks in Language and Linguistics), 703–731. Cambridge: Cambridge University Press.




Comrie, Bernard. 2007. Areal typology of mainland Southeast Asia: What we learn from the WALS maps. In Pranee Kullavanijaya (ed.), Trends in Thai linguistics (=MANUSYA special issue 13), 18–47. Bangkok: Chulalongkorn University.
De Sousa, Hilário. 2015. The far southern Sinitic languages as part of Mainland Southeast Asia. In Nicholas J. Enfield & Bernard Comrie (eds.), Languages of Mainland Southeast Asia: The state of the art (Pacific Linguistics [PL] 649), 356–440. Berlin & Boston: Mouton de Gruyter.
Djité, Paulin G. 2011. The language difference: Language and development in the Greater Mekong Sub-Region (Multilingual Matters 144). Bristol: Multilingual Matters.
Enfield, Nicholas J. 2002. How to define “Lao”, “Thai”, and “Isan” language? A view from linguistic science. Thai Culture 7. 62–67.
Enfield, Nicholas J. 2003. Linguistic epidemiology: Semantics and grammar of language contact in Mainland Southeast Asia. London: Routledge Curzon.
Enfield, Nicholas J. 2004. Cultural logic and syntactic productivity: Associated posture constructions in Lao. In Nicholas J. Enfield (ed.), Ethnosyntax. Explorations in grammar and culture (Oxford Linguistics), 231–258. Oxford: Oxford University Press.
Enfield, Nicholas J. 2005a. The body as a cognitive artifact in kinship representations: Hand gesture diagrams by speakers of Lao. Current Anthropology 46(1). 51–81.
Enfield, Nicholas J. 2005b. Areal linguistics and Mainland Southeast Asia. Annual Review of Anthropology 34. 181–206.
Enfield, Nicholas J. 2006. On genetic and areal linguistics in Mainland South-East Asia: Parallel polyfunctionality of “acquire”. Problems in comparative linguistics. In Alexandra Y. Aikhenvald & Robert M. W. Dixon (eds.), Areal diffusion and genetic inheritance, 255–290. Oxford: Oxford University Press.
Enfield, Nicholas J. 2007. A grammar of Lao (Mouton Grammar Library 38). Berlin: Mouton de Gruyter.
Enfield, Nicholas J. 2008. Transmission biases in linguistic epidemiology. Journal of Language Contact (2)1. 299–310.
Enfield, Nicholas J. 2009. “Case relations” in Lao, a radically isolating language. In Andrej L. Malchukov & Andrew Spencer (eds.), The Oxford handbook of case, 808–819. Oxford: Oxford University Press.
Enfield, Nicholas J. 2011a. Dynamics of human diversity: The case of mainland Southeast Asia. Canberra: Pacific Linguistics.
Enfield, Nicholas J. 2011b. Linguistic diversity in mainland Southeast Asia. In Nicholas J. Enfield (ed.), Dynamics of human diversity: The case of Mainland Southeast Asia, 63–80. Canberra: Pacific Linguistics.
Enfield, Nicholas J. 2011c. Taste in two tongues: A Southeast Asian study of semantic convergence. The Senses & Society 6(1). 30–37.
Enfield, Nicholas J. & Bernard Comrie. 2015. Mainland Southeast Asian languages: State of the art and new directions. In Nicholas J. Enfield & Bernard Comrie (eds.), Languages of Mainland Southeast Asia: The state of the art (Pacific Linguistics [PL] 649), 1–27. Berlin & Boston: Mouton de Gruyter.
Enfield, Nicholas J. & Gérard Diffloth. 2009. Phonology and sketch grammar of Kri, a Vietic language of Laos. Cahiers de Linguistique – Asie Orientale (CLAO) 38(1). 3–69.
Evans, V. (ed.). 2009. How words mean: Lexical concepts, cognitive models, and meaning construction (Oxford Linguistics). Oxford: Oxford University Press.
Gil, David. 2015. The Mekong-Mamberamo linguistic area. In Nicholas J. Enfield & Bernard Comrie (eds.), Languages of Mainland Southeast Asia: The state of the art (Pacific Linguistics [PL] 649), 266–355. Berlin & Boston: Mouton de Gruyter.


Goddard, Cliff & Zhengdao Ye. 2015. Ethnopragmatics. In Farzad Sharifian (ed.), The Routledge handbook of language and culture (Routledge Handbooks in Linguistics), 66–83. London: Routledge.
Güldemann, Tom. 2018. Language contact and areal linguistics in Africa. In Tom Güldemann (ed.), The languages and linguistics of Africa, 445–545. Berlin & Boston: Mouton de Gruyter.
Haugen, Einar. 1950. The analysis of linguistic borrowing. Language 26. 210–231.
Harris, Jimmy. 1986. The Persian connection: Four loanwords in Siamese. Pasaa 16(1 June). 9–12.
Haspelmath, Martin. 2009. Welche Fragen können wir mit herkömmlichen Daten beantworten? Zeitschrift für Sprachwissenschaft 28(1). 157–162.
Heine, Bernd & Tania Kuteva. 2005. Language contact and grammatical change (Cambridge Approaches to Language Contact). Cambridge: Cambridge University Press.
Huffman, Franklin E. 1986. Khmer loanwords in Thai. Sealang Archives. http://sealang.net/sala/archives/pdf8/huffman1986khmer.pdf (accessed 7 January 2021).
Jaberg, Karl & Jakob Jud. 1928. Der Sprachatlas als Forschungsinstrument. Kritische Grundlegung und Einführung in den Sprach- und Sachatlas Italiens und der Südschweiz. Halle (Saale): Niemeyer.
Janda, Laura A. 2006. From cognitive linguistics to cultural linguistics. Slovo a smysl (Word and Sense) 8. 48–68.
Jarvis, Scott. 1998. Conceptual transfer in the interlingual lexicon. Bloomington, IN: IULC Publications.
Jarvis, Scott. 2000. Methodological rigor in the study of transfer: Identifying L1 influence in the interlanguage lexicon. Language Learning 50. 245–309.
Jarvis, Scott. 2007. Theoretical and methodological issues in the investigation of conceptual transfer. Vigo International Journal of Applied Linguistics (VIAL) 4. 43–71.
Jarvis, Scott. 2011. Conceptual transfer: Crosslinguistic effects in categorization and construal. Bilingualism: Language and Cognition 14. 1–8.
Jarvis, Scott. 2016. Clarifying the scope of conceptual transfer. Language Learning 66(3). 608–635.
Jarvis, Scott & Aneta Pavlenko. 2008. Crosslinguistic influence in language and cognition. New York: Routledge.
Jenny, Mathias. 2015. The far west of Southeast Asia. “Give” and “get” in the languages of Myanmar. In Nicholas J. Enfield & Bernard Comrie (eds.), Languages of Mainland Southeast Asia: The state of the art (Pacific Linguistics [PL] 649), 155–208. Berlin & Boston: Mouton de Gruyter.
Jenny, Mathias. 2017. Foreign influence in the Burmese language. In Ampika Rattanapitak (ed.), A collection of papers on Myanmar language and literature, 1–34. Chiang Mai, Thailand: Myanmar Center, Chiang Mai University.
Jenny, Mathias & Patrick McCormick. 2015. Old Mon. In Paul Sidwell & Mathias Jenny (eds.), The handbook of Austroasiatic languages, vol. 1, 519–552. Leiden & Boston: Brill.
Kausen, Ernst. 2013. Die Sprachfamilien der Welt. Teil 1: Europa und Asien. Hamburg: Buske.
Kelz, Heinrich P. 1984. Typologische Verschiedenheit der Sprachen und daraus resultierende Lernschwierigkeiten: Dargestellt am Beispiel der sprachlichen Integration von Flüchtlingen aus Südostasien. In Els Oksaar (ed.), Spracherwerb – Sprachkontakt – Sprachkonflikt (Grundlagen der Kommunikation/Foundations of Communication and Cognition), 92–106. Berlin & New York: Walter de Gruyter.
Levinson, Stephen C. & David Wilkins. 2006. Grammars of space. Explorations in cognitive diversity. Cambridge & New York: Cambridge University Press.
Masica, Colin P. 1976. Defining a linguistic area. Chicago: University of Chicago Press.
Mathiot, Madeleine & Dorothy Rissel. 1996. Lexicon and word formation. In Hans Goebl (ed.), Contact linguistics. An international handbook of contemporary research, 124–130. Berlin: Walter de Gruyter.




Matisoff, James A. 1986. Hearts and minds in South-East Asian languages and English: An essay in the comparative lexical semantics of psycho-collocations. Cahiers de Linguistique – Asie Orientale 15(1). 5–57.
Matisoff, James A. 2001. Is there such a thing as areal semantics and if so, can we distinguish between plausible and implausible semantic change/associations in the Southeast Asian linguistic area? Paper presented at the Summer Institute of the Linguistic Society of America, University of California at Santa Barbara, July.
Matisoff, James A. 2006. Genetic versus contact relationship: Prosodic diffusibility in South-East Asian languages. Problems in comparative linguistics. In Alexandra Y. Aikhenvald & Robert M. W. Dixon (eds.), Areal diffusion and genetic inheritance, 291–327. Oxford: Oxford University Press.
Mortensen, David. 2000. Sinitic loanwords in two Hmong dialects of Southeast Asia. Logan, UT: Utah State University honors thesis.
Palmer, Gary B. 1996. Toward a theory of cultural linguistics. Austin: University of Texas Press.
Pavlenko, Aneta & Scott Jarvis. 2002. Bidirectional transfer. Applied Linguistics 23. 190–214.
Post, Mark W. 2011. Prosody and typological drift in Austroasiatic and Tibeto-Burman: Against “Sinosphere” and “Indosphere”. In Sophana Srichampa, Paul Sidwell & Kenneth Gregerson (eds.), Austroasiatic studies: Papers from ICAAL4. Mon-Khmer Studies Journal special issue no. 3, 198–221. Dallas, Salaya & Canberra: SIL International, Mahidol University & Pacific Linguistics.
Post, Mark W. 2015. Morphosyntactic reconstruction in an areal-historical context: A pre-historical relationship between North East India and Mainland Southeast Asia. In Nicholas J. Enfield & Bernard Comrie (eds.), Languages of Mainland Southeast Asia: The state of the art (Pacific Linguistics [PL] 649), 209–265. Berlin & Boston: Mouton de Gruyter.
Ratliff, Martha. 2009. Loanwords in White Hmong. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages. A comparative handbook, 638–658. Berlin: Mouton de Gruyter.
Sharifian, Farzad. 2011. Cultural conceptualisations and language: Theoretical framework and applications (Cognitive Linguistic Studies in Cultural Contexts 1). Amsterdam: Benjamins.
Sharifian, Farzad. 2015. Cultural linguistics. In Farzad Sharifian (ed.), The Routledge handbook of language and culture (Routledge Handbooks in Linguistics), 473–492. London: Routledge.
Shorto, Harry L. 1973. Mon-Khmer contact words in Sino-Tibetan. Paper presented at First International Conference on Austroasiatic Linguistics, Honolulu.
Sidwell, Paul. 2015. Local drift and areal convergence in the restructuring of Mainland Southeast Asian languages. In Nicholas J. Enfield & Bernard Comrie (eds.), Languages of Mainland Southeast Asia: The state of the art (Pacific Linguistics [PL] 649), 51–81. Berlin & Boston: Mouton de Gruyter.
Siebenhütter, Stefanie. 2016. Transkategoriale Variationen im Vietnamesischen. In Daniel Holl, Patrizia Noel Aziz Hanna, Barbara Sonnenhauser & Caroline Trautmann (eds.), Variation und Typologie (Diskussionsforum Linguistik in Bayern / Bavarian Working Papers in Linguistics 5), 29–42. Munich & Bamberg: Ludwig Maximilians Universität & Otto-Friedrich-Universität. DOI: 10.5282/ubm/epub.28150.
Siebenhütter, Stefanie. 2018. Study of linguistic areas: Evidence from cultural words, semantic maps, and spatial reference in Southeast Asia. In Stan D. Brunn & Roland Kehrein (eds.), Handbook of the changing world language map. Cham: Springer. https://link.springer.com/referenceworkentry/10.1007%2F978-3-319-73400-2_83-1 (accessed 30 October 2019).
Siebenhütter, Stefanie. 2019a. Sociocultural influences on linguistic geography: Religion and language in Southeast Asia. In Stan D. Brunn & Roland Kehrein (eds.), Handbook of the changing world language map. Cham: Springer. DOI: 10.1007/978-3-319-73400-2_84-1.

732 

 Stefanie Siebenhütter

Siebenhütter, Stefanie. 2019b. Conceptual transfer as an areal factor. Spatial conceptualizations in Mainland Southeast Asia (Pacific Linguistics [PL] 656). Berlin & Boston: Mouton de Gruyter. Stolz, Thomas. 2002. No Sprachbund beyond this line! On the age-old discussion of how to define a linguistic area. In Paolo Ramat & Thomas Stolz (eds.), Mediterranean languages, 259–281. Papers from the MEDTYP Workshop, Tirrenia, June 2000. Bochum: Universitätsverlag. Suthiwan, Titima & Uri Tadmor. 2009. 23 loanwords in Thai. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages. A comparative handbook, 599–616. Berlin: Mouton de Gruyter. Sybesma, Rint. 2008. Zhuang: A Tai language with some Sinitic characteristics: Post-verbal “can” in Zhuang, Cantonese, Vietnamese and Lao. In Pieter Muysken (ed.), From linguistic areas to areal linguistics (Studies in Language Companion 90), 221–274. Amsterdam: Benjamins. Thion, Serge. 1993. On some Cambodian words. Australian National University Thai-Yunnan Project Newsletter, 18–23. Canberra: Research School of Pacific Studies, ANU, no. 20, March. Tomioka, Yutaka. 2019. Linking non-linguistic phenomena to sociolinguistic phenomena: A case study of language shift in a Bru community in Northeastern Thailand. Presentation at 29th Southeast Asian Linguistic Society (SEALS), Tokyo, Japan, 27–29 May. Tosco, Mauro. 2008. What to do when you are unhappy with language areas but you do not want to quit. Journal of Language Contact – Thema 2. 112–123. Trask, Robert Lawrence (ed.). 2000. The dictionary of historical and comparative linguistics. Edinburgh: Edinburgh University Press. Trubetzkoy, Nikolai S. 1930. Proposition 16. Über den Sprachbund. In Actes du premier congrès international de linguistes à la Haye du 10–15 avril 1928, 17–18. Leiden: Sijthoff. Varasarin, Uraisi. 1984. Les éléments khmers dans la formation de la langue siamoise (Langues et civilisations de l’Asie du Sud-Est et du monde insulindien 15). 
Paris: SELAF. Vittrant, Alice. 2011. Aire linguistique Asie du Sud-Est continentale: le birman en fait-il partie? Moussons 16(1). 7–38. Wang, Guirong. 2019. How to guide the students to the multi-cultural environment in the colleges and universities English teaching. In Proceedings of the 4th International Conference on Humanities Science, Management and Education Technology (HSMET 2019), 539–542. Singapore, 21–23 June. Paris: Atlantis Press. Weinreich, Uriel. 1953. Languages in contact, findings and problems. New York: Linguistic Circle of New York. White, Nathan. 2019. Navigating languages: Multilateral mixing in the Hmong Diaspora. Talk given at the Johannes Gutenberg University Mainz, 23 September. Wierzbicka, Anna. 2015. Language and cultural scripts. In Farzad Sharifian (ed.), The Routledge handbook of language and culture (Routledge Handbooks in Linguistics), 339–356. London: Routledge.

Alice Vittrant and Marc Allassonnière-Tang

31 Classifiers in Southeast Asian languages

31.1 Introduction

Classifiers are one type of nominal classification system that helps speakers to identify discourse referents. They are commonly found in Southeast Asian languages, which motivates the geographical focus of this chapter. Given the semantic as well as the morphosyntactic overlap between the various systems, classifier devices are first presented in the context of all systems of nominal classification. Then, the analysis focuses on the different constructional subtypes of classifiers and discusses their origin along with how they are used by speakers in discourse.

31.2 Classifiers as a type of nominal classification systems

Nominal classification systems are linguistically interesting due to the interaction of their lexical and pragmatic behaviour with cognitive and cultural functions (Contini-Morava and Kilarski 2013; Craig 1986: 8; Denny 1976: 125; Kemmerer 2017a, 2017b; Kilarski 2014; among others). The two most common types of nominal classification systems in languages of the world are classifiers and noun classes (Aikhenvald 2000: 2). On the one hand, classifiers are independent morphemes or affixes that categorize nouns according to the inherent features of their referents based on criteria such as shape, consistency and animacy (Allan 1977; Grinevald 2000: 71). On the other hand, noun classes, also named (grammatical) gender,¹ refer to classes of nouns that are reflected through grammatical agreement (Corbett 1991). Examples include the masculine/feminine distinction in French and the noun class systems of Bantu languages and, more generally, of Niger-Congo languages.

1 Grammatical gender and noun classes are used interchangeably in the literature. The tradition is to call “grammatical gender” those systems which rely on the feature of biological sex, e.g. the masculine/feminine/neuter distinction in Indo-European languages, while “noun classes” commonly refers to systems with a larger number of classes, e.g. Bantu has from a dozen to twenty morphological classes, which include categories such as humans, plants, abstract nouns, among others (Grinevald 2000: 57).

https://doi.org/10.1515/9783110558142-031


31.2.1 Different nominal classification systems

The major difference between classifier and gender/noun class systems is their level of grammaticalization or grammatical behaviour (Dixon 1986: 105–106). Compare, as an example, French, which has grammatical gender, with Burmese, which has classifiers, in (1). ‘Car’ belongs to the feminine gender in French. Thus, the article, adjective, and verb show gender agreement, glossed f (“feminine”). Burmese does not have grammatical gender, so there is no agreement on the elements of the clause. Instead, a classifier highlights an inherent semantic feature of the noun: the noun ka3 ‘car’ is followed by the classifier si2 (clf:machine), which indicates that the referent denoted by ka3 is considered to be a machine.

(1)



The difference in terms of agreement between grammatical gender and classifiers²
a. un-e grand-e voiture est venu-e ce matin
   one-f big-f car come.past.accomp dem morning
   ‘A big car came this morning’ (French)
b. di2 mənɛʔ ka3 ci3 Tə =Si3 yaɔʔ la2 =Tɛ2
   dem morning car big one clf:machine arrive come real
   ‘A big car came this morning’ (Burmese)

In terms of geographical distribution (Figure 1), classifier languages are mostly found in Asia, spreading eastward and westward toward the Americas and Europe (Gil 2013), whereas gender/noun class systems are generally found in languages of Africa, Europe, South Asia, Australia, Oceania, and sporadically attested in the Pacific and the Americas (Corbett 1991: 2, 2013). We may observe a complementary-like areal distribution, as classifier languages are mostly located in Asia (East Asia and Southeast Asia) while gender/noun class languages are concentrated in Europe and Africa. The concentration of classifiers in Southeast Asia further motivates an in-depth analysis for this region, stretching from the easternmost fringes of India in the west to China in the east, encompassing the peninsular Southeast Asian states of Burma, Thailand, Laos, Cambodia and Vietnam, as well as peninsular Malaysia (Vittrant and Watkins 2019). This area covers five different language families (Austroasiatic/Mon-Khmer, Tai-Kadai, Hmong-Mien, Sino-Tibetan and Austronesian).

2 Examples for which no reference is given are the authors’ own. We would like to thank Nichuta Bunkham for providing us with the Thai examples.

Fig. 1: A simplified overview of classifier and gender in the world (Corbett 2013; Gil 2013).


31.2.2 Distinguishing the systems

Earlier literature distinguished classifiers and gender³ (or noun class) systems as two distinct categories by applying criteria such as the size of the inventory, the presence of overtly marked agreement (Aikhenvald 2000: 6; Dixon 1982: 213–217), or assignment principles (Contini-Morava and Kilarski 2013: 266–267). For instance, the French gender system only distinguishes between the two categories of masculine and feminine, whereas Japanese uses more than 200 numeral classifiers⁴ (Downing 1986: 346). As another example, gender assignment is generally considered more rigid than classifier assignment: a noun is commonly affiliated to only one gender, while a noun can commonly be used with several different classifiers. Recent approaches converge in viewing both systems as different points on the same lexical-grammatical continuum (Grinevald 2000; Corbett and Fedden 2016), which is represented in Figure 2.⁵

Lexical   [Measure terms]   [Class terms]   [Classifiers]   [Noun classes]   Grammatical

Fig. 2: The grammatical continuum of nominal classification systems (adapted from Grinevald 1999: 110).

Pure lexical devices such as measure terms and class terms are positioned at the lexical extreme of the continuum. First, measure terms refer to quantifying expressions that involve nouns, for instance, three cups of coffee. Such constructions are not considered to be classifiers in English (Kilarski 2013: 35), but rather nouns, since measure terms can take plural morphology and require the preposition “of”. It may be argued that the lack of plural morphology is language specific, but even in genuine classifier languages that do have morphosyntactic plural markers, e.g. Hungarian (Csirmaz and Dekany 2010: 13) and Armenian (Borer 2005: 94–95), classifiers do not take such plural marking. Second, class terms refer to nouns that are productively used either in derivation or compounding to express different related meanings (DeLancey 1986: 439; Grinevald 2000: 59, 64–65). For instance, in Lao (Tai-Kadai), /mè0/, which comes from /mèè1/ ‘mother’, and /naj0/, which comes from /naaj2/ ‘boss, lord’, are productively used to derive terms for human occupations such as nun (2a), cook (2b), interpreter (2c), and military officer (2d).

3 Henceforth, we will use gender and classifier as umbrella terms. “Gender” refers to the most grammaticalized systems, i.e. gender and noun class. “Classifier” covers all types of classifier systems.
4 Although dictionaries may include as many as 200 to 300 different classifiers, the inventory actually used by speakers is more limited (30 to 80 items) (Downing 1986: 346).
5 It is important to point out that gender and classifier systems are not mutually exclusive, even though it is rare to find languages with both types of systems. For instance, Nepali has two gender systems and one classifier system co-occurring simultaneously (Allassonnière-Tang and Kilarski 2020).

(2)

The use of class terms in Lao, Tai-Kadai (Enfield 2004: 136)
a. mè0-khaaw3
   ct-white
   ‘nun’
b. mè0-khua2
   ct-cooking
   ‘cook’ (f.)
c. naj0-caang4
   ct-language
   ‘interpreter’
d. naj0-phasaa3
   ct-soldier
   ‘military officer’

Class terms are quite common in Southeast Asian languages (see Bon [2012] for Stieng, DeLancey [1986] for Tai languages and Vittrant [2005] for Burmese, among others). While they are sometimes confused with classifiers, they should be distinguished from them (DeLancey 1986: 440–443), as they do not fulfil the same function: class terms are purely a device of noun derivation, whereas classifiers fulfil grammatical functions (see also 31.5). Moreover, the two can generally be distinguished syntactically, and the categorizations coded by class terms and classifiers need not coincide.

On the other hand, the grammatical extreme of the continuum is represented by gender systems such as the masculine/feminine distinction in French. Classifier systems are found in the middle of the continuum as they are more grammaticalized than measure terms or class terms but less grammaticalized than gender: they “agree” semantically with the referent of the noun (by highlighting a special shade of meaning) but are not marked on other elements in the phrase.⁶

To sum up, gender and classifiers both classify nouns of the lexicon, but the two kinds of systems have different morphosyntactic and semantic features and behaviours. In our description of classifiers in Southeast Asia, we cover their origin, their semantic and morphosyntactic patterns, along with their semantic and discourse functions.

31.3 Defining classifiers

Even though the term “classifiers” is well established in the relevant literature (Aikhenvald 2000: 30; Bisang 1999: 113; Dixon 1986: 105; Grinevald 2000: 61, 2015: 811, among others), they still “go by an exasperating variety of names” (Blust 2009: 292) within nominal classification typologies and language descriptions, e.g. “classifiers, quantifiers” (Adams 1989), “measure” or “quantitative” words (liang4ci2) (Li 1924), “company words” (Liu 1965), “specificatifs” (Nguyễn 1957: 124; Rygaloff 1973: 67), “specifiers” (Huffman 1970), “projectives” (Hurd 1977), “numeratives, numerical determinatives” (Chao 1968), among others.⁷ Nevertheless, this is not as alarming as it sounds, since a detailed reading of the sources shows that similar definitions are frequently used under different names. Roughly speaking, two usages of the term “classifier” are found in the literature:
(i) classifier within a wide-scope approach refers to a variety of systems (Aikhenvald 2000);
(ii) classifier within a narrower approach refers only to those systems whose main function is to make count nouns enumerable by individualisation (Bisang 1999; Grinevald 2000).
This second approach, adopted here, allows us to distinguish classifiers from other types of nominal classification, while also recognizing subtypes among them. In the following subsections, we describe the main categories and subtypes of classifiers.

6 Grammaticalization processes do not imply that only one type of system can be found in the same language (Fedden and Corbett 2017).

31.3.1 Sortal and mensural classifiers

Classifier systems may first be divided into two main categories based on different semantic (and sometimes syntactic) behaviours (Peyraube and Wiebusch 1993: 52–53). First, sortal classifiers highlight or single out some inherent features of the referent denoted by the noun. They may also make explicit information about a given referent that the noun itself leaves unspecified, and they fulfil several semantic and discourse functions (see 31.5 for further details). Second, mensural classifiers are used for measuring mass nouns and count nouns according to their physical properties (Craig 1992: 279; Bisang 1999: 121; Aikhenvald 2000: 115). Unlike sortal classifiers, which may appear semantically redundant, mensural classifiers contribute semantically to the noun phrase by providing information about quantity. For instance, in example (3a) from Vietnamese, the noun ‘fish’ is used with the sortal classifier con, which highlights its (non-human) animacy. In (3b), the mensural classifier cân adds new information about quantity. Removing the mensural classifier in (3b) would result in a different meaning of the noun phrase, while removing the sortal classifier in (3a) would not result in a loss of semantic content.

(3)

Sortal and mensural classifiers in Vietnamese (Löbel 2000: 261)
a. một con cá
   one clf:animal fish
   ‘a fish’
b. một cân cá
   one mens:pound fish
   ‘a pound of fish’

7 The terminological confusion and proliferation is also mentioned by Grinevald (2000: 54) and, for sign languages, by Schembri (2003: 4). See also Yang-Drocourt (2004: 6) for a list of terms found in the Chinese literature since the 18th century.




Mensural classifiers are often compared to measure terms because both provide information about quantity, and the two are easily confused because their semantic functions are similar. However, mensural classifiers and measure terms should be differentiated with regard to their morphosyntactic behaviour (Her 2012: 1682). For instance, measure terms in English are nouns (i.e. pure lexical items), since they can take plural morphology and require the preposition ‘of’ when combining with another noun. In a classifier language like Vietnamese, mensural classifiers do not take plural marking (if present in the language), and they behave syntactically like sortal classifiers, occurring in a relatively tight relationship with the noun that is not mediated by an adposition. In most languages they should be considered a part-of-speech category distinct from nouns.⁸
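The criteria just described can be summarised as a small decision procedure. The following Python sketch is an illustration only: the feature names (`takes_plural`, `needs_adposition`, `adds_quantity`) and the function `classify_quantifying_element` are hypothetical labels introduced here, and the feature values are read off the English and Vietnamese examples discussed above. Individual languages may well require further distributional tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantifyingElement:
    """Hypothetical feature bundle for an element used in counting or measuring."""
    form: str
    takes_plural: bool      # can it bear plural morphology?
    needs_adposition: bool  # is it linked to the noun by an adposition (e.g. English 'of')?
    adds_quantity: bool     # does it contribute new information about quantity?


def classify_quantifying_element(e: QuantifyingElement) -> str:
    # Noun-like behaviour (plural marking or a mediating adposition)
    # points to a measure term, i.e. a purely lexical item.
    if e.takes_plural or e.needs_adposition:
        return "measure term"
    # Otherwise the element patterns as a classifier; the semantic
    # criterion then separates the mensural from the sortal category.
    return "mensural classifier" if e.adds_quantity else "sortal classifier"


examples = [
    QuantifyingElement("cups (English)", True, True, True),
    QuantifyingElement("cân (Vietnamese)", False, False, True),
    QuantifyingElement("con (Vietnamese)", False, False, False),
]
labels = {e.form: classify_quantifying_element(e) for e in examples}
```

The morphosyntactic test is applied first because, on the account above, it is what separates nouns from classifiers; the semantic test only applies within the classifier category.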

31.3.2 Common constructional subtypes of classifiers

Besides the distinction generally made between sortal and mensural classifiers, several subtypes of classifiers can be further distinguished on the basis of their morphosyntactic properties and discourse functions. We follow the most common typologies and identify six major constructions based on their classifier locus (Aikhenvald 2000; Grinevald 1999, 2000): numeral classifiers, noun classifiers, genitive classifiers, deictic classifiers, verbal classifiers and locative classifiers (Grinevald 2000: 62–68; Seifart 2010: 721). As indicated by their names, these constructional subtypes of classifiers are differentiated based on the linguistic construction in which they are found, i.e. their distribution in the clause. Across these subtypes, the classifiers can be either bound or independent morphemes, which usually depends on the linguistic structure of each individual language. The sortal/mensural distinction cuts across these subtypes, since the former is based on semantics while the latter are identified based on morphosyntactic properties. That is to say, sortal and mensural classifiers are found in most classifier subtypes. The first four subtypes, i.e. numeral classifiers, noun classifiers, genitive classifiers, and deictic classifiers, are commonly found in Southeast Asian languages. Thus, the description will focus on those subtypes.
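As a compact overview, the six constructional subtypes can be represented as an enumeration keyed by classifier locus. This is a minimal sketch: the class name `ClassifierSubtype`, the short locus descriptions, and the flag `common_in_sea` are shorthand introduced here for illustration and paraphrase the subsections that follow.

```python
from enum import Enum


class ClassifierSubtype(Enum):
    # value = (locus of the classifier,
    #          described in this chapter as common in Southeast Asian languages?)
    NUMERAL = ("numeral or quantifier construction", True)
    NOUN = ("noun phrase, independent of quantification", True)
    GENITIVE = ("possessive construction", True)
    DEICTIC = ("demonstrative or deictic element", True)
    VERBAL = ("verb", False)
    LOCATIVE = ("locative noun phrase", False)

    @property
    def locus(self) -> str:
        return self.value[0]

    @property
    def common_in_sea(self) -> bool:
        return self.value[1]


# Subtypes on which the description focuses.
common = [s.name for s in ClassifierSubtype if s.common_in_sea]
```

The sortal/mensural distinction is deliberately left out of the enumeration, since it is a semantic axis that cuts across these construction-based subtypes.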

31.3.3 Numeral classifier

First, numeral classifiers occur in numeral constructions and quantification expressions (see [6]). They are consistently tightly linked with the numeral (or the quantifier) (Allan 1977: 288; Tang and Her 2019). Within such constructions, their main function is to mark whether the following noun is countable. In other words, they individualize the items denoted by the noun before these can occur with a quantifying element.

8 The difficulties of delimiting nouns and classifiers are discussed for Vietnamese by Löbel (2000: 263–64), who claims that in an isolating language, the analysis must be primarily based on distributional criteria.

(4)

Classifiers in numeral construction in Hmong Leng (Hmong-Mien, Laos), from Mortensen (2019: 625)
a. ob tug dlev
   two clf:animal dog
   ‘two dogs’
b. *ob dlev
   two dog

(5)

Classifiers in numeral construction in Chontal (Mayan, Mexico), from Suárez (1983: 87)
a. un- ts’it tʃəb
   one clf:long candle
   ‘one candle’
b. un- ʃim kəkəw
   one clf:grain cocoa bean
   ‘one cocoa bean’
c. un- kʔe pop
   one clf:flat sleeping mat
   ‘one sleeping mat’
d. un- tek teʔ
   one clf:plant tree
   ‘one tree’

(6)

Classifiers in quantification expressions in Thai (Tai-Kadai, Thailand) (Jenny 2019: 579)
a. mǎa baaŋ tuə
   dog some clf
   ‘some dogs’
b. mǎa lǎay tuə
   dog many clf
   ‘many dogs’
c. mǎa kìi tuə
   dog how.many clf
   ‘how many dogs’

Numeral classifiers are mostly found in languages of East and Southeast Asia, parts of Oceania and in Mesoamerica (Gil 2013). Hmong Leng, Thai and Chontal (Mayan) are examples of languages with numeral classifiers. As shown in (4a), the construction in Hmong Leng is grammatical if a classifier matching an inherent property of the referent is used. In this case, the classifier for animals matches the animal feature of the dog. However, the absence of a numeral classifier in a numeral construction generally⁹ results in ungrammaticality (4b), even if the intended meaning is transparent.

9 In a few context-specific situations, the classifier may be dropped. See Vittrant and Mouton (forthcoming) for examples and details.
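The contrast between (4a) and (4b) can be sketched as a toy well-formedness check. The lexicon below contains only the forms given in example (4); the names `SORTAL_CLASSIFIER`, `NUMERALS` and `is_well_formed` are hypothetical, and the sketch assumes the numeral–classifier–noun order of Hmong Leng. It deliberately ignores the context-specific classifier drop mentioned in the footnote, as well as languages such as Thai in which the noun comes first (6).

```python
# Toy lexicon restricted to the forms attested in example (4).
SORTAL_CLASSIFIER = {"dlev": "tug"}  # 'dog' -> clf:animal
NUMERALS = {"ob"}                    # 'two'


def is_well_formed(phrase: str) -> bool:
    """Accept only NUMERAL CLASSIFIER NOUN with a classifier matching the noun."""
    words = phrase.split()
    if len(words) != 3:
        # A bare NUMERAL NOUN sequence lacks the individualizing classifier.
        return False
    numeral, classifier, noun = words
    return numeral in NUMERALS and SORTAL_CLASSIFIER.get(noun) == classifier


grammatical = is_well_formed("ob tug dlev")  # example (4a)
ungrammatical = is_well_formed("ob dlev")    # example (4b)
```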




31.3.4 Noun classifier

Second, noun classifiers occur next to the noun or within the boundaries of the noun phrase, independently of the operation of quantification. Generally, they fulfil a determiner function, giving information on the specificity or definiteness of the referent. Noun classifiers are well attested in Australian and Mesoamerican languages (Aikhenvald 2000: 149–171) but are also common in Southeast Asian languages. The following two examples illustrate noun classifiers in Zhuang (7) and Jakaltek (8). In both examples, the classifier modifies the noun (without the occurrence of numerals) and marks (in)definiteness or specificity.

(7)

Noun classifiers in Zhuang (Tai-Kadai, China) – (Qin 2007: 173)
tu2 mou1 kɯn1 bou3 im5
clf:animal pig eat not enough
‘The pig is not full …’

(8)

Noun classifiers in Jakaltek (Mayan, Guatemala) – from Craig (1986: 264)
xil ix ix hune7 hin no7 txitam tu7
see.past clf:female woman one poss.1sg clf:animal pig dem.distal
‘The woman saw that one pig of mine’

As illustrated by both examples, the operation of quantification is not a necessary condition for the use of noun classifiers. In (7), the classifier for animals is used with the noun ‘pig’ without the occurrence of a numeral. Likewise, in (8), the classifier for females is used with the noun ‘woman’ without the occurrence of a numeral.

31.3.5 Genitive classifier

Third, genitive classifiers – also known as possessive, attributive, or relational classifiers (Lichtenberk 1983) – occur in possessive constructions and categorize the relation between the referents of the possessor and the possessed. Genitive classifiers are a characteristic of many Oceanic languages and are also sporadically attested in South American languages. They are also found in Southeast Asian languages. Examples of genitive classifiers in White Hmong (Thailand) are shown in (9). In (9a), the classifier for instruments is used to highlight the instrument feature of the sword, while in (9b), the classifier for living beings highlights the animacy of ‘uncle’.

(9)

Genitive classifier in White Hmong (Hmong-Mien, Thailand) – from Bisang (1988: 108, 115)
a. nws rab riam ntaj
   he clf:inst sword
   ‘his sword’
b. nws tus txiv ntxaw
   he clf:animate uncle
   ‘his uncle’

Examples from Iaai (New Caledonia) are shown in (10). In those examples, the classifiers are affixed to the possessive pronouns and highlight an inherent feature of the possessed items in a similar way as numeral classifiers and noun classifiers highlight an inherent property of the referent. For instance, in (10a), the classifier in ö-k indicates that the fish is considered as food rather than as an item to sell (10b).

(10)

Genitive classifiers in Iaai (Austronesian, New Caledonia) – from Dotte (2017: 345–346)
a. ö-k wââ
   clf:food-poss.1sg fish
   ‘my fish (to eat)’
b. anyi-k wââ
   clf:gen-poss.1sg fish
   ‘my fish (to sell)’

As shown by both examples, genitive classifiers differ from numeral and noun classifiers based on their distribution in clauses. While numeral and noun classifiers occur in numeral construction or within the noun phrase, genitive classifiers specifically occur in possessive constructions.

31.3.6 Deictic classifier

Fourth, deictic classifiers appear with demonstratives or deictic elements, classifying the item to which the deictic element refers (Bisang 2002: 294). This type of classifier is primarily described in American and African languages but is also found in Southeast Asian languages. As an example from Southeast Asian languages, Lao uses classifiers in demonstrative constructions. In (11a), for instance, the deictic classifier highlights an inherent feature (being animate) of the referent (fish), whereas in (11b) it indicates the nature (cloth) of the Lao skirt.

(11)

Deictic classifiers in Lao (Tai-Kadai, Laos) – from Enfield (2004: 129–130)
a. kuu3 si0 kin3 paa3 to0 nii4
   1sg.nonp irr eat fish clf:anim dem.gen
   ‘I’m going to eat this fish’
b. khòòj5 mak1 sin5 phùùn3 nii4
   1sg.p like lao.skirt clf:cloth dem.gen
   ‘I like this skirt.’




In Goemai, however, deictic classifiers encode information about the orientation/posture of the referent. As seen in (12), the classifier for standing objects indicates that the referent is an orange tree rather than an orange fruit.

(12)

Deictic classifiers in Goemai (Afro-Asiatic, Hellwig 2003: 252)
Goe-n-d’yem-nnoe a lemu goe-rok
nmlz(sg)-advz-clf:stand(sg)-dem:prox foc orange nmlz(sg)-become.sweet
‘This standing one is a sweet orange (tree).’

While deictic classifiers differ from numeral classifiers, noun classifiers, and genitive classifiers in terms of distribution, the examples from all subtypes show that they fulfill similar functions for classifying the referent of the noun. Further details about those functions are developed in 31.5.

31.3.7 Verbal classifier

Fifth, verbal classifiers are found on verbs, on which they cross-reference an argument. In most cases, this is the subject of an intransitive verb, but it can also be the object of a transitive verb (Aikhenvald 2000: 149; Seifart 2010: 723). This subtype of classifier is primarily found in North American languages and is also commonly found in sign languages (Grinevald 2015: 814; Meir and Sandler 2007: 107–120; Tumtavitikul, Niwatapant, and Dill 2009; Bakken Jepsen et al. 2015). Verbal classifiers are rare in Asian languages.¹⁰ Figure 3 shows a signer of Hong Kong Sign Language (HKSL) using the flat-vertical-handshape as a verbal classifier for a car which has been introduced previously in the discourse. The handshape is combined with the hand motion corresponding to the verb ‘to move’.

CL: A-VEHICLE-MOVE-TOWARD-A-TREE
Fig. 3: HKSL constructions using entity classifier (from Bauer 2014: 190).

As noted by Bauer (2014), classifiers in sign languages have semantic characteristics similar to those recorded for spoken languages:

These handshape constructions may represent various entities according to their perceptible characteristics such as its shape, size, structure, consistency, position and/or animacy. Sign language classifiers best lend themselves to a comparison with only two subtypes of verbal classifiers in spoken languages […]: the classifying verbal affixes and the incorporated verbal classifiers (Sandler and Lillo-Martin 2006; Zwitserlood 2003). Similar to sign language classifiers, classifying verbal affixes […] are bound classifying morphemes adjacent to verbs and cannot occur separately. (Bauer 2014: 189)

10 Kuki-Chin languages such as Khumi could be analysed as having verbal classifier devices according to Peterson (2008).

Klamath is an example of a spoken language using verbal classifiers. As shown in (13), the use of different classifiers on the verb can highlight the shape (round or long wielded radially) of the object used for the action. (13)

Verbal classifiers from Klamath (Penutian, DeLancey 2000: 18)
a. m-p’ak’a
   clf:round.ins-break.to.pieces
   ‘break to pieces with a round instrument’
b. w-p’ak’a
   clf:long-break.to.pieces
   ‘break to pieces with a long instrument wielded radially’

It is important to note that the term “verbal classifier” may be used with a different meaning in the literature. The term is also currently used in Asian studies to refer to morphemes that quantify the number of times an action occurs (Court 1986: 165; Goral 1979: 16; He 2001; Lam and Vinet 2005). This system of event classification seems to be limited to languages which also have numeral classifiers (Bisang 2018). See for instance example (14) from Thai, where the numeral classifier expression modifies the event.¹¹

(14)

“Verbal” classifier in Thai (Tai-Kadai, Thailand)
a. mǎː kàt tɕʰǎn nɯ̀ŋ kʰráŋ/pʰlɛ̌ː
   dog beat 1sg one clf:occurrence/injury
   ‘The dog beat me once.’
b. tɕʰǎn rîək tʰɤː sɔ̌ːŋ kʰráŋ
   1sg call 2sg two clf:occurrence
   ‘I called you twice.’

11 Nguyễn (1957: 128–129) and Goral (1979: 9) also give examples of verbal classifiers used to specify the number of times the action is performed.




In constructions such as in (14), the classifier adds information on the number of occurrences of the event described by the verbal phrase. In the current chapter, we will not deal with these morphemes as classifiers since they quantify actions rather than referents.

31.3.8 Locative classifier

Sixth, locative classifiers occur in locative noun phrases. Their choice usually depends on the semantic features of the argument of a locative adposition (Allan 1977: 287; Aikhenvald 2000: 3; Grinevald 2015: 812). This subtype of classifier is rather rare and is mostly found in South American and Carib languages. Dâw is an example of a locative classifier language. As seen in (15a), the locative classifier for hollow objects is used to refer to a canoe, while the locative classifier for liquids is used to refer to a river in (15b).

(15)

Locative classifiers from Dâw (Nadahup, Colombia/Brazil) – adapted from Aikhenvald (2000: 174)
a. xoo-kεd
   canoe-clf:in.hollow
   ‘in a canoe’
b. nââx-pis-mĩ’
   water-small-clf:in.liquid
   ‘in a small river’

In summary, classifiers may occur in different constructions. This diversity contributes to the difficulty of counting how many classifier systems are found in a language (Bisang 2002: 294; Fedden and Corbett 2017). Moreover, the polyfunctionality of forms in Southeast Asian languages adds to this difficulty (DeLancey 1986: 438; Enfield 2004). A form may have several meanings that belong to different parts of speech and have different functions. Nouns may function either as classifiers or as classifying elements in noun compounds. As an example, in Table 1, classifying forms in Thai may be distributed along “a continuum from pure noun to pure classifier” (DeLancey 1986: 439). For instance, the noun ‘pineapple’ may only be used as a noun, while ‘answer’ can be used as a noun or a classifier, and ‘person’ can be a noun, a class term, or a classifier.

Tab. 1: Continuum from noun to classifier in Thai (Tai-Kadai, Thailand) from DeLancey (1986: 439).

Form in Thai   Gloss                             Noun   Class term   Classifier
sàpparót       ‘pineapple’                       +      –            –
khamtɔ̀ɔp       ‘answer’                          +      –            +
ŋuu            ‘snake’                           +      +            –
ráan           ‘shop’                            +      +            +
khon           ‘person’                          +      +            +
duaŋ           ‘round object’                    –      +            +
lam            ‘long object’                     –      +            +
lêm            ‘CLF for blades, books, etc.’     –      –            +

Similarly, in Lahu (16), the form [ɕi11] is shared by the classifier for fruits and the word for jackfruit, where it is a part of the noun (i.e. a class term). The use of the same form may lead to various interpretations depending on its position in the noun phrase.

(16)

The polyfunctionality of forms in Lahu (Tibeto-Burman, Thailand) – from Matisoff (1973: 91)
a. nu53=fɨ35-qo11=ɕi11 ni53 ɕi11
   jack.tree(cow=stomach=ct:fruit) two clf:fruit
   ‘two fruits from the jack tree’
b. a35-ci33-ku33=ɕi11 khɔʔ21 ɕi11
   pomegranate=ct:fruit six clf:fruit
   ‘six pomegranates’

An example from Vietnamese in (17) also illustrates the polyfunctionality of forms: nhà ‘house’ may be used as a countable noun (17a), as a classifier (17b), as part of a compound, i.e. as a class term (17c), or as a noun modifier (17d). (17)

     

The polyfunctionality of forms in Vietnamese (Austroasiatic, Vietnam) – from Goral (1979: 12)
a. một cái nhà
   one clf N:house
   ‘one (a) house’ = N
b. một nhà sách
   one clf book
   ‘a house (full) of book(s)’ = mensural.CLF
c. nhà hát
   ct/N:house N:sing
   ‘cinema, theater’ = class term
d. con thú nhà
   clf N:animal N:house
   ‘the domestic animal’ = modifier

In this section, we presented the major categories and constructional subtypes of classifiers. Two axes are defined. First, based on the semantic information conveyed, two transversal categories of classifiers are found: sortal classifiers and mensural classifiers. The latter provides new information of quantity while the former highlights an inherent and often unspecified feature of the referent. Second, based on their morphosyntactic features, and more specifically the constructions in which they occur, classifiers can be further divided into several constructional subtypes: numeral classifiers, noun classifiers, genitive classifiers, verbal classifiers, deictic classifiers, and locative classifiers. The first four of these subtypes are common in Southeast Asian languages. Finally, it is important to point out again that even though we distinguish subtypes of classifiers based on their formal properties (morpho-syntactic context), they mostly converge in terms of semantics and functions.12 Further details about common characteristics are explained in 31.4 and 31.5.

31.4 Common characteristics of classifiers

In this section, first, we describe how classifiers commonly emerge. Second, we describe the semantic features generally encoded in classifier systems. Finally, we compare the morphosyntactic behaviour of the different constructional subtypes.

31.4.1 The origin of classifiers

Although classification systems may follow various paths of evolution (internal development or contact-induced change), they generally have a clear lexical origin (Grinevald 2015: 815). Cross-linguistically, classifiers mostly originate from nouns13 (Jones 1970: 4; Erbaugh 1986: 399; Bisang 1999; Aikhenvald 2000: 103). Classifier systems generally emerge from two different contexts. On the one hand, they may start from “the context of counting individual items which are of particular cultural importance” (Bisang 1999: 158). This type of development is predominant in languages such as Chinese and Japanese. On the other hand, classifier systems may also evolve from a taxonomic or meronymic compounding process. Evolutions of this type are documented for Tai languages (Barz and Diller 1985: 178; DeLancey 1986: 445), Vietnamese and Hmong (Bisang 1992: 4, 1999: 166). Both types of developments converge in the sense that they are both noun reanalysis processes. The two development paths are illustrated respectively in Chinese and Thai. The development of classifier structures in Sinitic languages can be traced from Archaic Chinese (500 BCE–200 BCE) to Modern Chinese (1900–present).14 The following gives an overview of the development of classifier structures in Mandarin.15

12 Some linguists have tried to match morphosyntactic classifier types and semantic criteria (cf. Croft 1994; Dixon 1982; Grinevald 2000: 78–79). However, this pairing is not very relevant for Southeast Asian languages, which are often multiple classifier languages, i.e. “languages in which one and the same or almost the same morpheme can be used for different types of classifiers” (Bisang 2002: 298).
13 A few cases of classifiers of verbal origin have also been documented. By way of illustration, locative classifiers in Goemai are grammaticalized from postural verbs such as ‘sit’ or ‘stand’. Peyraube (1998: 55) also mentions a rare verbal origin for classifiers in Mandarin Chinese, e.g. zhang, ‘to stretch (a bow)’. See also Suriya (1988: 110) and Jenny and Hnin Tun (2016: 73) for verb-derived classifiers in Sgaw Karen and Burmese, respectively.
14 For more details please refer to Peyraube (1998), Yang-Drocourt (2004), Wu et al. (2006), Jiang (2006), and Her (2017).
15 The Mandarin classifiers have cognates in other Sinitic dialects; the usage is extremely similar (slight variation in lexical choice) when it is not identical (Erbaugh 1986: 404).

Classifiers were extremely rare in the earliest periods of Chinese. According to Peyraube (1998: 39), they appear around the second century BCE (Archaic Chinese) and spread during the Middle Chinese period (201–1000). They were used mainly to specify (and make more prominent in discourse) concrete and countable items. Thus, mensural classifiers were first used due to their quantifying nature, which is also a general assumption on classifier development.16 Mensural classifiers were more commonly found during this period than sortal classifiers, the latter being mostly used as echo classifiers17 (Jiang 2006: 106). An example of an echo classifier in Archaic Chinese is shown in (18), in which the nouns ‘ox’ and ‘goat’ are repeated after the numerals. This type of construction is considered to be a “prelude” to numeral classifier constructions (Aikhenvald 2000: 103; Erbaugh 1986: 401; Her 2017: 41; Jiang 2006: 106). The next step is the use of the same classifier-like noun for different preceding nouns.

(18) Echo classifiers in Archaic Chinese (Sinitic, China)
fu2 niu2 san1bai3 wu3shi2 wu3 niu2 yang2 er4shi2 ba1 yang2
capture ox 300 50 five ox goat twenty eight goat
‘captured 355 oxen and 28 goats’ (Zhang 2001: 445)

In Middle Chinese (201–1000), the preluding structures became genuine numeral classifier structures, with 70 % of the quantified expressions containing a classifier (Peyraube and Wiebusch 1993: 59) and an inventory of more than fifty sortal classifiers based on semantic features. The increase in the use of classifiers may have been reinforced by contact with Tai languages (Erbaugh 1986; Jones 1970; Peyraube and Wiebusch 1993). At that time, the word order also changed to Num-CLF-N, and structures such as Num-N or N-Num (found in Archaic Chinese) are not frequent in texts, which leads to the assumption that classifiers became obligatory (Liu 1965).
In Pre-Modern Chinese (1001–1900), a mature system close to the classifier system in Modern Chinese is in place (Yang-Drocourt 2004).

Beside the cross-linguistically well-attested origin of classifiers described above for Chinese, i.e. measure constructions, another source of classifier devices has to be sought in compounding processes. We refer here to taxonomic (and meronymic) compounding, where a class term indicates the higher level of abstraction (the taxon or the category) and the other part of the compound a specific type (the sub-category or further determination).18

16 See also Erbaugh (1986: 426) on parallels between historical development and acquisition of classifiers in Mandarin Chinese.
17 Echo classifier, also known as “repeater” (Hla Pe 1965: 180–182; Goral 1979: 16), means that the noun is repeated and used as a counting unit in numeral constructions. Its function is to itemize the referent: tə ʔɛiN2 (house one clf: repeater) ‘one house’.
18 The elements of the compound might also refer to a part-whole relation (meronymy). See Bisang (1999: 170–174) for illustration.

Broadly speaking, this process starts with nouns with simultaneous (or consecutive) class-term uses; these forms then act as classifiers in some semi-lexicalized constructions, and gradually expand their classifying function, perhaps abandoning some of their nominal uses. This evolution is favoured in languages which exhibit Num-CLF-N order and allow noun classifiers, the class term appearing potentially in the same slot as the classifier with respect to the noun, as in Vietnamese (Löbel 2000) or Nung (DeLancey 1986: 442). Vietnamese example (19) illustrates that one and the same morpheme may function as a classified noun (19a) or a classifier (19b). The sentence in (19e) shows taxonomic compounds with the general term (or taxon) preceding the determination one. Thus, in (19c)–(19d), whether we face a CLF-N sequence or a lexicalized compound with a (generic) class term is controversial given their similar position in the sequence; the interpretation relies on the speech context. (19)

Taxonomic relations in Vietnamese (Austroasiatic, Vietnam) – adapted from Löbel (2000: 271–73) and Do-Hurinville (2013: 251)
a. hai cái cây
   2 clf:inanimate plant
   ‘two trees/plants’
b. hai cây rau
   2 clf:plant vegetable
   ‘two vegetables’
c. – cây rau
   clf/ct vegetable
   ‘a/the vegetable (sub-class of plant)’
d. – rau cần
   clf/ct celery
   ‘a celery (sub-class of vegetable)’
e. chó thì có chó đực, chó cái, chó con
   dog top have [dog male] [dog female] [dog child]
   Ntaxon Ndetermination
   ‘As for the “dog” species, there are dogs, bitches and puppies.’

In summary, the diachronic process by which nouns become true classifiers passes through different stages, which are reflected in the various situations and nominal classification devices found in Southeast Asian languages. The lexical origin of classifiers is now commonly accepted: classification markers generally emerge from nouns by reanalysis of particular nominal structures.

31.4.2 The semantics of classifiers

31.4.2.1 Common semantic features

Classifiers generally converge in terms of semantic features. That is to say, while the size of classifier inventories varies cross-linguistically (Tang 2004), a set of features is generally found in most inventories. The most common features relate to humans, animals, shape, and plants (Adams and Conklin 1973: 2–3; Allan 1977: 297). In terms of shape, the most commonly identified shapes are long and round (Croft 1994: 153). These shared features are motivated by cognitive principles and expected from a neuroscientific point of view, since they have been “one of the strongest determinants of the organization of object concepts in the brain” (Kemmerer 2017a: 406). In other words, these features are cognitively salient. The distinction between humans, animals, and objects relates to the differentiation between humans and other entities of the environment, while the long and round shapes are salient from a cognitive point of view due to their matching with human masculine and feminine secondary sexual characteristics (Kemmerer 2017a: 408). Examples from Thai are shown in (20) with classifiers for humans (20a), animals (20b), long shape (20c), and round shape (20d).

(20) Examples of main semantic features for classifiers in Thai (Tai-Kadai, Thailand)
a. phɯ̂ən sìp khon
   friend ten clf:hum
   ‘ten friends’
b. pla: hâ: tuə
   fish five clf:animal
   ‘five fish’
c. sǎw.thoŋ sǎ:m tôn
   flagpole three clf:long.vertical
   ‘three flagpoles’
d. sôm sì: lû:k
   orange four clf:round
   ‘four oranges’

While the features of humans, animals, and long/round shape are motivated by cognitive principles and mostly shared cross-linguistically, it is also common that classifier languages develop classifiers specific to their own sociolinguistic context (Croft 1994: 153). For instance, in Burmese, sacred objects and concepts related to Buddhism are counted with a particular classifier as shown in (21). Other examples are found in technological evolution (34) and social contexts (22) in Thai. (21)

Classifier and sociolinguistic context in Burmese (Tibeto-Burman, Myanmar) – from Hla Pe (1965)
a. guN3.To2 ko3 =Pa3
   attribute nine =clf:sacred obj
   ‘The nine attributes [of the three gems: the Buddha, the Law, the Sangha]’
b. θi2la1 ŋa3 =Pa3
   precept five =clf:sacred obj
   ‘The five Precepts’
c. PhoN3.Ci3 Tə =Pa3
   monk one =clf:sacred obj
   ‘one monk’



(22) Classifier and social context in Thai (Tai-Kadai, Thailand)
kʰrɯ̂əŋ.dɯ̀:m nɯ̀ŋ dríŋ (Eng. drink)
drink one clf:shot
‘A drink shot (of alcoholic beverage)’

The implicational hierarchy of semantic distinctions proposed by Bisang (1999: 125) highlights social status as an important feature at work in classifier systems. Social status is indeed very important in the highly structured Burmese, Thai, Khmer and Vietnamese societies, and this feature is reflected in the languages (DeLancey 1986: 449).

31.4.2.2 Unique classifier and repeater

In other cases, classifiers may be used for one single specific item. This “unique” classifier is also labelled as an “echo-classifier” or “repeater”, although the two terms should be distinguished.19 By way of illustration, Kathmandu Newar has a classifier that is only used for ‘hole’ (23a) and one that is only used for ‘letter’ (23b). (23)

Examples of unique classifiers in Kathmandu Newar (Tibeto-Burman, Nepal) – from Kiryu (2009)
a. gā: cha- gā:
   hole one clf:echo
   ‘one large hole’
b. pau cha- pau
   letter one clf:echo
   ‘one letter’

(24) Examples of repeater classifiers in Burmese (Tibeto-Burman, Myanmar)
a. ʔɛiN2 θoN3 =ʔɛiN2
   house three clf:house
   ‘Three houses’
b. taiʔ.ʔɛiN2 θoN3 =ʔɛiN2
   brick_building.house three clf:house
   ‘Three masonry houses’
c. tɔ3.ywa2 θoN3 =ywa2
   forest.village three clf:village
   ‘Three rural (forest) villages’

In both cases, the same form appears once as noun head and once as classifier. In the Burmese examples (24b, c), the classifier, which also appears as a part of the counted compound noun, is referred to as a “semi-repeater”.

19 Grinevald (2015: 817) distinguishes “unique classifier” from “repeater”: the “unique classifier” refers to a classificatory function (a class that contains only one item), whereas the repeater refers to the form of the classifier. In Southeast Asian languages, unique classifiers are often repeaters.


31.4.2.3 General classifier

Classifier languages also commonly develop a so-called “general classifier”. Such a classifier is typically desemanticized and used in a generic way with most lexical items in a language. As an example, in Southern Min, the classifier e5 has a default function in addition to its use with nouns for humans. (25)

Examples of general classifiers in Southern Min (Sinitic, China) – from Chappell (2019: 201–202)
a. chit4 -e5 lang5
   dem -clf:hum person
   ‘This person’
b. chit8 -e5 toa7 chhiu-kha1
   one -clf:general big tree.foot
   ‘a tree with a large base …’

In Sui, the general classifier lam1 is originally the classifying element for fruit; it has undergone a metaphorical extension from spherical meaning to a larger number of entities. Hlai, another Tai-Kadai language, has a general classifier hom53 used with a large range of inanimate items, from small items such as fruits or grains to bulky objects such as ‘mountain’, also encompassing newly introduced items such as computer, guitar or university course (Burusphat 2007: 136–137). Table 2 shows that a noun like ‘tofu’ can be referred to with a specific classifier for lump-shaped items or with the general classifier. Notice that the classifiers used interchangeably with the general classifier are mostly shape-based classifiers. The choice of different classifiers can fulfil various functions in discourse, which are further explained in 31.5.

Tab. 2: Example of general and specific classifier in Hlai (Tai-Kadai, Hainan) from Burusphat (2007: 201–203).

hom53 thun53 van11 ka11

clf: general clf: lump-shape clf: sheet-like clf: stick-like

da:u55hu55 bo55tua11 ‘tofu’ ‘newspaper’

tsi:u5phi:n11 tshai53koŋ55 ‘photograph’ ‘coffin’

ka11tsɯ55hjau53 ‘pod’

x x

x

x

x

x

x

x

x

x

As a summary, the inventory of classifiers in a language is generally guided by two main principles. On the one hand, concepts such as humanness, animacy, utility and specific shapes are more salient cognitively. Thus, classifiers related to these concepts are more likely to be shared cross-linguistically. On the other hand, languages also typically develop ‘customized’ classifiers based on their respective cultural contexts.




31.4.3 The morphosyntax of classifiers

31.4.3.1 Free or bound forms

Classifiers can occur as independent morphemes or affixes. As shown in (26), classifiers in Thai occur as independent morphemes, while classifiers in Dolakha Newar (27) are affixed to the numeral. (26)

Free classifiers in Thai (Tai-Kadai, Thailand)
klûəj lû:k ní: sùk lɛ́:w
banana clf:3d dem ripe accomp
‘This banana is (already) ripe.’

(27) Affixed classifiers in Dolakha Newar (Tibeto-Burman, Nepal) – from Genetti (2007: 265)
thi-gur des=ki thi-ma misāmi da-u
one-clf:gen country=loc one-clf:animate woman exist-3pa
‘In a country, there lived a woman.’

Whether classifiers are independent or bound morphemes depends on the morphosyntactic parameters of each language. Thus, this criterion is generally not used to distinguish between further types of classifiers.

31.4.4 Word order

Classifier subtypes have been distinguished in terms of their morphosyntactic loci, i.e. the element (or construction) with which they are syntactically linked (numerals, nouns, demonstratives, possessives, locatives, and verbs). However, in all these morphosyntactic contexts, the slot filled by the classifier depends mainly on the syntactic parameters of the language, more specifically (usually) the head-modifier order (Simpson 2005: 806). For instance, Lahu is a (S)OV language. In Lahu (28), the classifier thus appears after the noun it classifies (28b), which matches the expected modifier-head order20 in the language (28a).

20 Regarding the elements of a numeral classifier construction, i.e. noun, numeral, classifier, the issue of what element is the head is still under discussion (Gil 2013). What is however uncontroversial is that (i) the classifier forms a constituent with the numeral, as confirmed by the unattested orders [*CLF N Num] and [*Num N CLF], and (ii) the numeral is modifying the classifier. As for the noun, it may be syntactically tightly linked or separated from the [CLF-Num] sequence (see Thai example [40]). Whether the quantifying phrase [CLF-Num] actually modifies the noun is often unclear and may depend on the languages themselves and the theoretical approaches adopted. In the literature on classifiers in Southeast Asian languages, it is generally admitted that the classifier is the head of the construction (Simpson 2005: 806).

On the other hand, Kilivila is both an SVO and VOS language (29a). In Kilivila, the classifier thus precedes the numeral, as illustrated in (29b). However, unexpected order regarding the assumed general headedness of the languages is attested (see Thai) and might be explained by language contacts (Alves 2001: 234–235; Vittrant and Mouton forthcoming). (28)

Word order between the noun and the classifier in Lahu (Tibeto-Burman, Thailand) – from Matisoff (1973: 87, 305)
a. yɔ̂ yɜ̀ te chɜ̂ ve
   3sg house make.Vh prog Vv
   S O V
   ‘He is building a house.’
b. ánithâ vàʔ tê khɛ šɨ -e ò
   yesterday pig one clf:animal die-dir accomp
   Modifier head
   ‘A pig died yesterday.’

(29) Word order between the noun and the classifier in Kilivila (Austronesian, Papua New Guinea) – from Senft (1986: 109)
a. ku-pola budubadu yena yokwa
   2sg-fish many fish 2sg
   V O S
   ‘You caught many fish (unmarked).’
b. na-tala yena
   clf:animal-one fish
   head Modifier
   ‘one fish’ (Kilivila, Austronesian, Senft 2000: 18–21)

When enlarging the frame and considering the order of numerals, nouns, and classifiers, six possibilities are expected mathematically (30). Of these six possible orders, two are not attested in the languages of the world: [CLF N Num] and [Num N CLF]. Interestingly, in both unattested patterns the noun intervenes between the numeral and the classifier (Jones 1970: 6; Greenberg 1974: 31). In other words, the classifiers and the numerals seem to form a tight syntactic unit that is not easily separated, as noticed in early studies on Southeast Asian classifiers (Bisang 1999; Vittrant 2005: 132). This phenomenon is also explained by theories combining syntax and mathematics. For an extended discussion, please refer to Her (2017) and Her et al. (2019).

(30) Possible word orders of numerals, classifiers, and nouns in languages of the world (Her et al. 2019: 423–424)
a. √ [(Num CLF) N]   Many languages, e.g. Mandarin (Sino-Tibetan), Vietnamese (Austroasiatic), Hungarian (Uralic)
b. √ [N (Num CLF)]   Many languages, e.g. Thai (Tai-Kadai), Burmese (Tibeto-Burman), Korean (Isolate)
c. √ [(CLF Num) N]   Few languages, e.g. Ibibio (Niger-Congo)
d. √ [N (CLF Num)]   Few languages, e.g. Jingpho (Tibeto-Burman)21
e. * [CLF N Num]     No languages attested
f. * [Num N CLF]     No languages attested

The first two orders are by far the most relevant for Southeast Asian languages. Jones (1970: 3), Barz and Diller (1985: 177) and Bisang (1999: 118) point out that word order follows an areal pattern: generally, in northern languages (upper China and the like), the classifier precedes the noun, whereas in southern languages (roughly the Indochinese peninsula), the classifier follows the counted noun. With regard to the relation between classifiers and the other kinds of elements they appear with (noun, genitive morpheme, deictic morpheme, verb), they are generally tightly adjacent (free or bound morpheme, prefixed or suffixed) while conforming to the typological profile of the language. Thus, the order may vary across languages. As an example, in Nelemwa, genitive classifiers are bound to the possessive marker and are in initial position (31). On the other hand, in Weining Ahmao (32), the genitive classifier is found between the possessor and the possessee.

(31) Word order for genitive classifiers in Nelemwa (Oceanic, New Caledonia) – from Bril (2002: 365)
ââ-ny ciic
clf:plant-poss.1sg tree
‘my tree’

(32) Word order for genitive classifiers in Weining Ahmao (Hmong-Mien, China) – from Gerner and Bisang (2008: 728)
ku55 lai55 ŋgɦa55 ɲi55
1sg clf house dem.prox
‘my house’

31.4.5 Optionality and obligatoriness

Lastly, the compulsory nature of the classifier varies across languages. For instance, classifiers are obligatory with numerals in Burmese22 while they are optional in Malay, as shown in (33) (Goddard 2005: 96; Nomoto and Soh 2019). In Vietnamese noun phrases, they are obligatory only if humans are counted (Bisang 1996: 116). This variance of obligatoriness is language-specific and extremely context-specific and is not extensively discussed in the current chapter.23

(33) Malay (Austronesian, Malaysia) – from Nomoto and Soh (2019: 490)
a. tiga (buah) majalah
   3 clf:bulky.item magazine
   ‘Three magazines’
b. tiga (orang) guru
   3 clf:human teacher
   ‘Three teachers’

21 This “N-CLF-Num” order is also attested in Thai with the numeral ‘one’ to express indefinites, whereas the regular word order “N-one-CLF” occurs in the context of counting.
22 Higher and round numbers such as ten or hundred may function as classifiers themselves in Burmese and other languages. They are similar in function to measure words such as French ‘dizaine’ (ten items) or English ‘dozen’ (twelve items). See Jenny and Hnin Tun (2016: 76) and Vittrant and Mouton (forthcoming).

All the examples in this section show that both the morphosyntactic environment (loci) and the typological profile of the language are involved in the morphosyntax of the classifiers. In the next section, we discuss the functions of classifiers.
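The word-order generalization in (30) can also be checked mechanically. The following Python sketch is an illustration added for this purpose and is not part of the analyses cited above: it enumerates the six logically possible orders of Num, CLF and N and separates them according to whether the numeral and the classifier are adjacent. The names `ELEMENTS` and `num_clf_adjacent` are hypothetical labels chosen for the illustration.

```python
from itertools import permutations

# The three elements of a numeral classifier construction (labels as in (30)).
ELEMENTS = ("Num", "CLF", "N")


def num_clf_adjacent(order):
    """Return True when the numeral and the classifier are next to each other."""
    return abs(order.index("Num") - order.index("CLF")) == 1


# All logically possible linear orders of the three elements.
all_orders = list(permutations(ELEMENTS))

# Orders in which the noun does not intervene between Num and CLF.
attested = [order for order in all_orders if num_clf_adjacent(order)]

# Orders in which the noun separates Num from CLF.
unattested = [order for order in all_orders if not num_clf_adjacent(order)]

print("possible orders:", len(all_orders))
for order in attested:
    print("Num and CLF adjacent:", " ".join(order))
for order in unattested:
    print("N intervenes:", " ".join(order))
```

Under this adjacency criterion, the two orders that are filtered out are exactly the patterns listed as unattested in (30e) and (30f), while the four remaining orders correspond to (30a)–(30d).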

31.5 Functions of classifiers

Among the major frameworks for identifying functions of classifiers that are found in the literature, Contini-Morava and Kilarski (2013) present an important review of works that address functions of nominal classification (see also Bisang 1993; Craig 1986, 1992; Löbel 2000; among others). They identify the main functions of nominal classification systems, including grammatical gender/noun classes, and various types of classifiers. In the following paragraphs, we restrict our presentation to functions relevant for classifier systems in Southeast Asian languages, drawing on the typology of Bisang (1993, 1999), adapted here to Southeast Asian languages. In the latter, several hierarchical stages are defined amongst the functions of classifiers, ranging from classification to identification and individuation. Each stage of this hierarchy is relevant for a specific context, which motivates the occurrence of different constructional subtypes of classifiers. For instance, the function of individuation is relevant in a counting context, which motivates the use of numeral classifiers. As another example, the function of identification is relevant in any referentialization process that implies the use of a classifier (noun classifier, genitive classifier, deictic classifier).

classification  ⇒  identification       ⇒  individuation
                       ⇓                        ⇓
                   referentialization       counting
                   (anaphora, deixis, definiteness and reference, topic continuity)

Fig. 4: Functions of classifier uses – adapted from Bisang (2002: 304).

23 For a discussion on optional classifier use in obligatory classifier languages, see Nomoto (2013: 15).




While the terms of the two frameworks partially diverge, their essence and underlying principles are consistent. In the current chapter, we summarize the two frameworks and broadly distinguish two main categories of functions for classifiers apart from classification: semantic functions and discourse functions.
Semantic functions refer to:
− differentiation of referents by coercing the meaning of an under-specified lexical item or pointing to one rather than another interpretation;
− individualization allowing nouns to be conceived as countable objects.24
On the other hand, we conceive discourse functions as:
− reference identification (anaphora, deixis, disambiguation);
− reference management (expression of prominence or specificity);
− re-presentation of referents (expression of a specific attitude of the speaker toward the referent).

31.5.1 Semantic functions

31.5.1.1 Differentiation of referents

Classifiers can first be used for differentiating referents. That is to say, classifiers can be used to provide a more subtle differentiation of existing lexical items with respect to features such as sex, animacy and/or physical properties (Aikhenvald 2000: 392; Huang and Ahrens 2003). A semantically neutral stem may point to various referents depending on the classifier that is used with the stem (Contini-Morava and Kilarski 2013: 272). As an example, in Thai (34), the use of the classifier for machines indicates that the phone is being referred to as a device (34a), while the classifier for long-shaped objects indicates that the phone is being referred to as a phone call (34b). In both examples, the form of the noun is exactly the same and the use of different classifiers is the key for identifying different referents. Example (35) is a parallel example in Cantonese, where the mouth is classified as a body part in (35a) and as an instrument for speaking in (35b).

(34) Differentiation of referents by numeral classifiers in Thai (Tai-Kadai, Thailand) – from Vittrant and Mouton (forthcoming)
a. tho:ráʔsàp nɯ̀ŋ khrɯ̂əŋ
   phone one clf:machine
   ‘a phone’
b. tho:ráʔsàp nɯ̀ŋ sǎ:j
   phone one clf:long
   ‘a phone call’

24 The lexical function of expansion of the lexicon is not discussed extensively since it is less common for classifiers and it is theoretically difficult to infer “whether we are dealing with the same nominal stem or a different stem that is similar in form and meaning” (Contini-Morava and Kilarski 2013: 270–272), as demonstrated by languages such as Lahu in (10).

(35) Differentiation of referents by numeral classifiers in Cantonese (Sinitic, China) – from H. de Sousa (p.c.)
a. keuih5 go3 hau2 hou2 chau3
   3sg clf:general mouth very stinky
   ‘His/her mouth is really stinky’
b. keuih5 ba2 hau2 hou2 chau3
   3sg clf:instr mouth very stinky
   ‘His/her speech is very rude, demeaning’

An even more extreme example is shown with Burmese (36). The noun /chɛri/ ‘cherry’ does not undergo changes of form. However, the referent varies according to the classifier that follows the noun. (36)

Differentiation of referents by numeral classifiers in Burmese (Tibeto-Burman, Myanmar) – adapted from Jenny and Hnin Tun (2016: 75)
a. chɛ2ri2 tə -loN3
   cherry one clf:3d
   ‘one cherry’ (fruit)
b. chɛ2ri2 Tə -PiN2
   cherry one clf:growing.item
   ‘one cherry tree’

In (36a), the classifier for three-dimensional items indicates that the referent is a fruit, while in (36b), by using the classifier for growing items, the speaker points to the cherry as a plant. As mentioned at the beginning of the paragraph, examples of this type are found but are generally marginal in languages and restricted to domains that rely on taxonomy, such as wildlife and flora.25

25 Notice however that the taxonomy in these domains is often first rendered by class terms; classifiers then just emphasize a distinction (between taxon and species) already present in the nominal compound.

31.5.1.2 Individualization and counting

The second most important semantic function of classifiers relates to the count/mass distinction (Contini-Morava and Kilarski 2013: 276). Count nouns are perceived as semantically bounded entities that can be individuated and counted, while mass nouns denote things whose parts are not considered as discrete units (Bisang 1999: 120; Delahunty and Garvey 2010: 156). This distinction is mirrored through language (Chierchia 1998, 2010; Doetjes 2012; Gillon 1999; Quine 1960), as our brain “differentiates between count and mass nouns not only at the syntactic level but also at the semantic level” (Chiarelli et al. 2011: 1). See Tang and Her (2019) for a theoretical and quantitative analysis on the subject matter. This function is generally referred to as “individualization” (Bisang 1999: 120) or “unitizing” (Enfield 2004: 132). In classifier languages, count nouns use sortal classifiers in contexts of enumeration/quantification and mensural classifiers in contexts of measure, whereas mass nouns must rely on mensural classifiers. As demonstrated in (37), semantically unbounded mass nouns such as ‘water’ cannot take sortal classifiers (37a) but can only be quantified with mensural classifiers (37b). (37)

Individuation by noun classifiers in Vietnamese (Austroasiatic, Vietnam)
a. *ba cái nước
   three clf:gen water
   *‘three water’
b. ba chai nước
   three mens:bottle water
   ‘three bottles of water’

By further analysing how languages fulfil the function of individuation, previous typological studies found that classifiers and grammatical plural markers are in near-complementary distribution cross-linguistically. Different hypotheses have been developed to explain this observation (Ghomeshi and Massam 2012: 2). First, a typological approach suggests that classifier languages, unlike plural-marking languages, either do not make the mass-count distinction or make it only semantically, not syntactically, and therefore do not allow nouns to be quantified by numerals directly without classifiers (Allan 1977; Bale and Coon 2014; Chierchia 1998; Hansen 1983; Krifka 1995; Link 1998; Zhang 2012). On this view, nouns in classifier languages are all mass nouns or transnumeral nouns, i.e. nouns that are not specified for number in the lexicon. This functional approach, based on the transnumerality of nouns in classifier languages, is advocated by Bisang (1999, 2002). On the other hand, a universalist approach claims that sortal classifiers and plural markers are unified under one grammatical category (Borer 2005; Borer and Ouwayda 2010; Cowper and Hall 2012; Doetjes 2012; Greenberg 1990; Her 2012; Mathieu 2012; Nomoto 2013; Sanches and Slobin 1973; T’sou 1976). Under this hypothesis, the mass-count distinction is recognized in both types of languages, and the use of a sortal classifier is analogous to that of a plural marker (Aikhenvald 2000; Borer and Ouwayda 2010; Cheng and Sybesma 1998; Jenks 2017; Yi 2011).26 Whatever the motivation, number is usually not expressed in numeral classifier constructions (Aikhenvald 2000: 249). However, and against the claim that languages with numeral classifiers lack compulsory number marking (Greenberg 1990: 188), some languages have been found with obligatory number marking alongside numeral classifiers (Gerner and Bisang 2008; Bisang 2012).

In summary, classifiers generally fulfil two semantic functions. First, classifiers can be used to differentiate referents of the same noun form: by using different classifiers, a noun form may be linked to specific referents. Second, classifiers fulfil the function of individuating referents: the different uses of sortal and mensural classifiers pinpoint the countability of the referent. In the following sub-section, we explain how classifiers can have an impact within discourse.

26 For a discussion of the compatibility, development, and motivation of numeral classifier systems and obligatory plural marking in one and the same language, see Bisang (2012).


 Alice Vittrant and Marc Allassonnière-Tang

31.5.2 Discourse functions

Classifiers also help to reach an equilibrium between economy of production on the speaker’s side and ease of comprehension on the listener’s side (Contini-Morava and Kilarski 2013; Grinevald 2000: 294). In the current chapter, the discourse functions of classifiers are grouped into three major types: reference identification, reference management, and re-presentation.

31.5.2.1 Reference identification

With regard to reference identification, classifiers can provide clues for tracing a referent previously mentioned in discourse without repeating the noun several times. This typically happens when the context and the classifier provide sufficient information to interpret the referent. For instance, in (38) the speaker may refer to ‘books’ with the classifier for volumes in (38b) instead of repeating the noun itself, which shows the classifier fulfilling an anaphoric function. A similar function is fulfilled by the classifiers in the Vietnamese examples in (39).

(38) Anaphora in Burmese (Tibeto-Burman, Myanmar)
a. sa2.ʔoʔ bɛ2n̥ə-ʔoʔ wɛ2 =mə =lɛ3
   book how.many-clf:volume buy q.irr q
   ‘How many books will you buy?’
b. θoN3 ʔoʔ laɔʔ
   three clf:volume about
   ‘About three (volume-shaped objects)’

(39) Anaphora in Vietnamese (Austroasiatic, Vietnam) – from Nguyễn (1957: 139)
a. Tôi có ba con mèo, hai con trắng, một con đen
   1sg have 3 clf:animate cat 2 clf white 1 clf black
   ‘I have three cats, two white (and) one black.’
b. Tôi có ba quyển sách, một quyển mỏng, một quyển dày
   1sg have 3 clf book 1 clf thin 1 clf thick
   ‘I have three books, one thin (and) one thick.’

Classifiers can also help to distinguish between multiple referents introduced in the preceding discourse. If the speaker does not want to repeat the same noun several times, she/he can use specific classifiers to narrow down the possible interpretations. This disambiguation function is shown in example (40), where the apples are referred to via the classifier for round objects in (40b). The listener can thus infer that the speaker has given the neighbours two apples and not two sugarcanes.




(40) Referent identification and disambiguation in Thai (Tai-Kadai, Thailand)
a. tɕʰǎn sɯ́ː ʔɔ̂j (maː) sǎːm lam kàp ʔɛ́ppɤ̂n sǎːm lûːk
   1sg buy sugarcane come three clf:stick and apple three clf:round
   ‘I buy/bought three sugarcanes and three apples’
b. tɕʰǎn hâj pʰɯ̂ənbâːn (paj) sɔ̌ːŋ lûːk
   1sg give neighbour go two clf:round
   ‘I give/gave two (apples) to the neighbours’
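The disambiguation step in (40) can be pictured as a lookup over the referents already introduced in discourse. The following Python sketch is only an illustration under stated assumptions: the `Referent` record, the `DISCOURSE` list, and the `resolve` function are hypothetical names, and matching on the classifier alone deliberately ignores context, which the text says also contributes to interpretation.

```python
from typing import List, NamedTuple


class Referent(NamedTuple):
    noun: str        # Thai noun as transcribed in (40a)
    gloss: str       # English gloss
    classifier: str  # classifier the noun was introduced with


# Referents introduced in (40a), paired with the classifiers used there.
DISCOURSE = [
    Referent("ʔɔ̂j", "sugarcane", "lam"),
    Referent("ʔɛ́ppɤ̂n", "apple", "lûːk"),
]


def resolve(classifier: str, discourse: List[Referent]) -> List[Referent]:
    """Return the earlier referents compatible with a bare numeral + classifier phrase."""
    return [r for r in discourse if r.classifier == classifier]


# (40b) uses 'two lûːk' without a noun; only one earlier referent matches.
candidates = resolve("lûːk", DISCOURSE)
print([r.gloss for r in candidates])
```

If two earlier referents shared a classifier, the function would return both, which corresponds to a case where the classifier alone does not disambiguate.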

31.5.2.2 Reference management and classifiers

The second discourse function of classifiers is reference management. Classifiers can be used to increase the prominence of nouns and highlight them in discourse, e.g. to introduce a new referent or to specify one (Bisang 1999: 150; Contini-Morava and Kilarski 2013: 284). Used to identify a referent or to indicate its specificity, the noun-classifier device (or bare classifier construction) is a first means of reference management. The CLF-N sequence may convey an indefinite or a definite reading depending on the language and on the position of the classifier phrase. For instance, comparing Sinitic dialects, Li and Bisang (2012: 336) show that the CLF-N sequence in post-verbal position conveys an indefinite reading in Mandarin Chinese (41a), but a definite one in the Wu dialect (41b). In preverbal position, however, the classifier construction conveys a definite reading in the Sinitic dialects that allow this syntactic position (i.e. Cantonese and Wu, but not Mandarin). This may be related to information structure: the sentence-initial position is associated with topics in these topic-prominent languages (41d), and topic NPs are preferably interpreted as definite.

(41) Noun-Classifier and reference management in Sinitic dialects – from Li and Bisang (2012: 336–337)
a. Ø laoban mai le liang che
   (*clf) boss buy pfv clf car
   ‘(The) boss bought a car.’ (Mandarin)
b. kɤ lɔpan ma lə bu tsʰo
   clf boss buy pfv clf car
   ‘The boss bought a car.’ (Wu)
c. go louban maai zo ga ce
   clf boss buy pfv clf car
   ‘The boss bought a/the car.’ (Cantonese)
d. cheung fo a, houchoi di siufongyun lai dak hapsi
   clf fire top fortunately clf fire.brigade come adv fast
   ‘As for the fire, fortunately, the fire brigade came fast.’ (Cantonese)


With regard to introducing new referents, two typical cases are found. First, a new referent can be projected to the foreground of discourse by using a numeral + classifier construction (e.g. with the numeral ‘one’) in a presentative sentence (Hopper 1986: 312–313; Li 2000: 1121–1122; Erbaugh 2002: 52). As shown in (42), the new referent ‘ship’ is introduced by a numeral + classifier construction. Likewise, in (43), the new referent ‘tiger’ is introduced by the numeral ‘one’ plus the classifier tug.

(42) Classifier in presentative sentence in Written Malay (Austronesian, Malaysia) – adapted from Hopper (1986: 310)
   Maka pada suatu pagi kelihatan-lah sa-buah kapal rendah
   then on one.clf morning was.seen one-clf:inanim ship low
   ‘Then one morning, a low ship was sighted.’

(43) Classifier in presentative sentence in White Hmong (Hmong-Mien, Laos/Australia) – from Jarkey (2015: 44)
   thaum ub muaj ib tug tsov
   time yonder exist one clf tiger
   ‘Once upon a time, there was a tiger.’

Second, new referents are typically introduced with specific classifiers, which then give way to general classifiers once the referent is no longer as prominent in discourse, as noticed by Erbaugh (1993) for Mandarin Chinese:

    Special classifiers typically marked first mention of a new item. They appeared with indefinites rather than definites, and with near reference than far. Once reference is established, subsequent mentions take the general classifier or constructions where no classifier is required. (Erbaugh 1993: 408)

Sortal classifiers are likely to appear with a new topic, especially with unfamiliar or distant objects that are not physically present. By contrast, an object located in the same room as the hearer would often be designated with the general classifier. For instance, in Mandarin discourse, sortal classifiers are often (72%) associated with first mention (Erbaugh 2002: 43–44). As shown in (44), the referent located in the same room as the hearer is referred to with the general classifier in (44b), whereas the distant referent (a bike) is referred to with the specific classifier for vehicles in (44a).

(44) Example of prominence with classifiers in Mandarin (Sinitic, China)
a. na4 liang4 zi4xing2che1
   dem clf:vehicle bike
   ‘that bike (viewed out the window)’
b. na4 ge0 zi4xing2che1
   dem clf:gen bike
   ‘that bike (parked in the living room)’

Classifiers can also be used to mark the definiteness and specificity of a referent (Li and Bisang 2012). This function is more specific to Southeast Asian languages. As shown in (45), a change of word order within the classifier construction provides information on definiteness and specificity. In (45a), the order N-CLF-NUM yields an indefinite but specific reading, in the sense that the speaker mentions a specific dog rather than a random dog. In (45b), by contrast, the order N-NUM-CLF indicates that the speaker refers to an indefinite and unspecific dog, i.e. a random dog. Finally, in (45c), adding a demonstrative to the sequence N-CLF shows that the dog is referred to in a definite manner.

(45) Classifiers conveying definiteness and specificity in Thai (Tai-Kadai, Thailand)
a. cʰǎn hěn mǎː tuə nɯ̀ŋ
   1sg see dog clf:animal one
   ‘I saw/see a dog’ (indefinite, specific)
b. cʰǎn hěn mǎː nɯ̀ŋ tuə
   1sg see dog one clf:animal
   ‘I saw/see one dog’ (indefinite, unspecific)
c. cʰǎn hěn mǎː tuə níː/nán
   1sg see dog clf:animal dem:prox/dist
   ‘I saw this/that dog.’ (definite)
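The three patterns in (45) amount to a mapping from the order of elements in the noun phrase to a reading. A minimal Python sketch follows, assuming simplified category labels (N, CLF, NUM, DEM) and a hypothetical function name `reading`; it covers only the Thai patterns shown in (45) and makes no claim about other orders or other languages.

```python
from typing import Optional, Sequence

# Category sequences from (45) mapped to the readings given in the text.
# The labels and the function name are illustrative assumptions.
READINGS = {
    ("N", "CLF", "NUM"): "indefinite, specific",    # (45a) mǎː tuə nɯ̀ŋ
    ("N", "NUM", "CLF"): "indefinite, unspecific",  # (45b) mǎː nɯ̀ŋ tuə
    ("N", "CLF", "DEM"): "definite",                # (45c) mǎː tuə níː/nán
}


def reading(order: Sequence[str]) -> Optional[str]:
    """Return the reading for a category sequence, or None if (45) does not cover it."""
    return READINGS.get(tuple(order))


print(reading(["N", "CLF", "NUM"]))
print(reading(["N", "NUM", "CLF"]))
print(reading(["N", "CLF", "DEM"]))
```

Returning `None` for unlisted orders keeps the sketch from asserting anything about constructions that the examples do not illustrate.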

It is important to point out that using classifiers to convey definiteness and specificity is not rare in Southeast Asian languages, but it is not a feature shared by all of them. For instance, classifiers in Burmese occur mostly in numeral constructions that are indefinite. They are not used with demonstratives (deictic-classifier), or to indicate the singulative, specificity or definiteness of the noun (noun-classifier), as they are in Hmong (Gerner and Bisang 2010) or Vietnamese. On the other hand, languages such as Burmese (Vittrant 2005: 136), Lahu (Matisoff 1973: 93), Mi Aizhai Miao (Mien) or Cantonese form indefinite expressions such as ‘someone’, ‘something’ or ‘none’, ‘no one’, ‘nothing’ by reduplicating a classifier, sometimes associated with the numeral ‘one’ or another morpheme (Enfield 2019: 155). As shown in (46), the classifier for books is reduplicated to convey the meaning ‘any book’ in (46a). In (46b), the classifier for humans is reduplicated to convey the meaning ‘anybody’, whereas in (46c) the general classifier is combined with the particle /m̥a1/ ‘only’ to express the negative indefinite ‘nothing’. In Cantonese too, the classifier is reduplicated, here to convey the meaning ‘everyone’, as shown in (47).


(46) Classifiers used in indefinite expressions in Burmese – adapted from Vittrant (2005: 136)
a. tə-ʔoʔ-ʔoʔ yu2 =Pa2
   one-clf:book-red take pol
   ‘take any (book)’
b. tə-yaɔʔ-yaɔʔ phye2 naiN2 =mə =la3
   one-clf:human-red answer can irr q
   ‘Could someone answer (this question)?’
c. taɔN3 =Ta2 tə-khu1-m̥a1 mə= pe3 =Phu3
   ask nmlz.real one-clf:gen-only neg give neg
   ‘He gives nothing of what has been asked.’

(47) Classifiers used in indefinite expressions in Cantonese – adapted from Matthews and Yip (1994: 96)
   Go-go (yàhn) dōu séung máaih láu
   clf-clf (people) all want buy flat
   ‘Everyone wants to buy a flat.’

31.5.2.3 Pragmatic motivation in classifier choice

The third discourse function of classifiers is to express the speaker’s subjective attitude (Contini-Morava and Kilarski 2013: 277). When a noun accepts different classifiers, the choice is meaningful: it focuses on a specific property of the referent or is determined by register. The choice of a classifier is therefore often pragmatically motivated (Burling 1965). In other words, different classifiers are used to reflect different attitudes of the speaker toward the same referent. They can be used in discourse to convey a change of attitude or perspective toward the referent by ascribing or highlighting different properties. As shown in (48), the speaker can use the classifier for humans to show respect toward the referents (48a), or use the classifier for animals to refer to human referents (48b) and thereby convey a lack of respect toward them.

(48) Classifiers expressing the subjective attitude of the speaker in Burmese
a. bɛ2n̥ə-yaɔʔ la2 =lɛ3
   how.many-clf:human come q
   ‘How many people (respectful) came?’
b. bɛ2n̥ə-kaɔN2 la2 =lɛ3
   how.many-clf:animal come q
   ‘How many people (disrespectful) came?’

From a “European language” point of view, this function is generally fulfilled at the lexical level, since the expression of affective meanings or stylistic nuances is often realized by different lexical items in European languages. For instance, the contrast in (48) would be conveyed by using different lexical items in French, cf. (49a) vs. (49b).

(49) Lexical items expressing the subjective attitude of the speaker in French
a. Combien de personnes sont venues?
   how.many of people be.pl come.pl.f
   ‘How many people (respectful) came?’
b. Combien de types sont venus?
   how.many of people.disrespectful be.pl come.pl.m
   ‘How many people (disrespectful) came?’

As in Burmese, the choice of the numeral classifier in Thai or Mien may be a matter of expressing politeness. In (50), the classifier tau35, used for people and animals, may be replaced by la:n53 when referring to a respected person (Liu 2012, cited by Enfield 2019: 155).

(50) Classifiers expressing the subjective attitude of the speaker (politeness) in Mien (Hmong-Mien, Thailand) – from Liu (2012: 98)
a. jet31 tau35 sjaʔ55tɔn33
   one clf:anim woman
   ‘a woman’
b. jet31 la:n53 khɛʔ55mjen53
   one clf:anim guest person
   ‘a guest’

In summary, classifiers can fulfil various discourse functions, ranging from reference identification and reference management to re-presentation. It is important to point out that these functions may be fulfilled by classifiers but are not necessarily fulfilled by classifiers alone. In other words, classifiers do not obligatorily fulfil these functions, and other elements in a “classifier” language may also fulfil them depending on the context. See Allassonnière-Tang and Kilarski (2020) for a more detailed discussion of the complex interaction of functions across different nominal classification systems in the same language.

31.6 Conclusion

Classifiers as nominal classification systems share a common core in terms of semantics and morphosyntax. However, they also display large variation depending on the sociolinguistic context of individual languages. While this diversity makes classifiers extremely relevant for typological studies, it also results in divergence within the literature describing classifiers. Different definitions and/or terms may be used interchangeably, which highlights the importance of carefully examining the language examples provided in the literature. This diversity is especially found in Southeast Asia, which is a hotbed of classifier languages. As demonstrated in the current chapter, even though Southeast Asian languages are commonly described as typical numeral classifier languages, they are often multiple-classifier languages, with differences in terms of constructions, meanings, and functions. This convergence and divergence of classifier systems cross-linguistically thus represents an extremely interesting research ground for cross-disciplinary studies involving all fields of linguistics, cognitive studies, and neuroscience.

References

Adams, Karen Lee. 1989. Systems of numeral classification in the Mon-Khmer, Nicobarese and Aslian subfamilies of Austroasiatic. Canberra: Pacific Linguistics.
Aikhenvald, Alexandra Y. 2000. Classifiers: A typology of noun categorization devices. Oxford: Oxford University Press.
Allan, Keith. 1977. Classifiers. Language 53(2). 285–311.
Allassonnière-Tang, Marc & Marcin Kilarski. 2020. Functions of gender and numeral classifiers in Nepali. Poznań Studies in Contemporary Linguistics 56(1). 113–168.
Alves, Marc. 2001. What’s so Chinese about Vietnamese? In Graham W. Thurgood (ed.), Papers from the Ninth Annual Meeting of the Southeast Asian Linguistics Society, 221–242. Tempe, AZ: Arizona State University, Program for Southeast Asian Studies.
Bakken Jepsen, Julie, Goedele De Clerck, Sam Lutalo-Kiingi & William B. McGregor (eds.). 2015. Sign languages of the world. Berlin: Mouton de Gruyter.
Bale, Alan & Jessica Coon. 2014. Classifiers are for numerals, not for nouns: Consequences for the mass/count distinction. Linguistic Inquiry 45(4). 695–707.
Barz, Richard & Anthony Diller. 1985. Classifiers and standardisation: Some south and southeast Asian comparisons. Paper in SEA Linguistics 9. 155–184.
Bauer, Anastasia. 2014. The use of signing space in a shared sign language of Australia. Berlin: Mouton de Gruyter.
Bisang, Walter. 1988. Hmong Texte. Eine Auswahl mit Interlinearübersetzung aus Jean Mottin, Contes et Légendes Hmong Blanc [Bangkok, Don Bosco 1980]. Arbeiten des Seminares für allgemeine Sprachwissenschaft der Universität Zürich Nr. 8.
Bisang, Walter. 1993. Classifiers, quantifiers and class nouns in Hmong. Studies in Language 17(1). 1–51.
Bisang, Walter. 1999. Classifiers in East and Southeast Asian languages: Counting and beyond. In Jadranka Gvozdanović (ed.), Numeral types and changes worldwide, 113–186. Berlin & New York: Mouton de Gruyter.
Bisang, Walter. 2002. Classification and the evolution of grammatical structures: A universal perspective. STUF – Language Typology and Universals 55(3). 289–308.
Bisang, Walter. 2012. Numeral classifiers with plural marking. A challenge to Greenberg. In Xu Dan (ed.), Plurality and classifiers across languages in China, 23–42. Berlin: Mouton de Gruyter.
Bisang, Walter. 2018. Nominal and verbal classification. In William McGregor & Søren Wichmann (eds.), The diachrony of classification systems, 241–281. Amsterdam: John Benjamins.
Blust, Robert. 2009. The Austronesian languages. Canberra: Pacific Linguistics.




Bon, Noëllie. 2012. Les Classificateurs Numéraux du Stieng du Cambodge. Lidil 46. 45–77.
Borer, Hagit. 2005. Structuring sense, part I. Oxford: Oxford University Press.
Borer, Hagit & Sarah Ouwayda. 2010. Men and their apples: Dividing plural and agreement plural. Beijing: Beijing Language and Culture University.
Bril, Isabelle. 2002. Le Nêlêmwa (Nouvelle-Calédonie): Analyse Syntaxique et Sémantique. Paris: Peeters Pub & Booksellers.
Burling, Robbins. 1965. How to choose a Burmese Numeral Classifier. In Melford E. Spiro (ed.), Context and meaning in cultural anthropology, 243–264. New York: The Free Press.
Burusphat, Somsonge. 2007. A comparison of general classifiers in Tai-Kadai languages. The Mon-Khmer Studies Journal 37. 129–153.
Chao, Yuenren. 1968. A grammar of Spoken Chinese. Berkeley: University of California Press.
Chappell, Hilary. 2019. Southern Min. In Alice Vittrant & Justin Watkins (eds.), The Mainland Southeast Asia Linguistic Area, 176–233. Berlin: Mouton de Gruyter.
Cheng, Lisa L. S. & Rint Sybesma. 1998. Yi-Wan Tang, Yi-Ge Tang: Classifiers and massifiers. Journal of Chinese Studies 28(3). 385–412.
Chiarelli, Valentina, Radouane El Yagoubi, Sara Mondini, Patrizia Bisiacchi & Carlo Semenza. 2011. The syntactic and semantic processing of mass and count nouns: An ERP study. PLoS ONE 6(10). 1–15.
Chierchia, Gennaro. 1998. Plurality of mass nouns and the notion of semantic parameter. In Susan Rothstein (ed.), Events and grammar, 53–104. Dordrecht: Kluwer.
Chierchia, Gennaro. 2010. Mass nouns, vagueness and semantic variation. Synthese 174(1). 99–149.
Contini-Morava, Ellen & Marcin Kilarski. 2013. Functions of nominal classification. Language Sciences 40. 263–299.
Corbett, Greville G. 1991. Gender. Cambridge: Cambridge University Press.
Corbett, Greville G. 2013. Number of genders. In Matthew S. Dryer & Martin Haspelmath (eds.), The world atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. http://wals.info/chapter/30 (accessed 6 January 2021).
Corbett, Greville G. & Sebastian Fedden. 2016. Canonical gender. Journal of Linguistics 52(3). 495–531.
Court, Christopher. 1986. Fundamentals of Iu Mien (Yao) grammar. Berkeley: University of California PhD dissertation.
Cowper, Elisabeth & Daniel Currie Hall. 2012. Aspects of individuation. In Diane Massam (ed.), Count and mass across languages, 27–53. Oxford: Oxford University Press.
Craig, Colette (ed.). 1986. Noun classes and categorization. Amsterdam: John Benjamins.
Craig, Colette. 1992. Classifiers in functional perspective. In Michael Fortescue, Peter Harder & Lars Kristoffersen (eds.), Layered structure and reference in a functional perspective – Papers from the Functional Grammar Conference, Copenhagen 1990, 277–301. Amsterdam: John Benjamins.
Croft, William. 1994. Semantic universals in classifier systems. WORD 45(2). 145–171.
Csirmaz, Aniko & Eva Dekany. 2010. Hungarian classifiers. Rome: Roma Tre University.
Delahunty, Gerald P. & James J. Garvey. 2010. The English language: From sound to sense. West Lafayette, IN: Parlor Press.
DeLancey, Scott. 1986. Toward a history of Tai classifier systems. In Colette G. Craig (ed.), Typological studies in language, vol. 7, 437–452. Amsterdam: John Benjamins.
DeLancey, Scott. 2000. Argument structure of Klamath bipartite stems. Annual Meeting of the Berkeley Linguistics Society 26(2). 15–25.
Denny, Peter. 1976. What are noun classifiers good for? In Papers from the 12th Regional Meeting of the Chicago Linguistic Society, 122–132. Chicago: Chicago Linguistic Society.


Dixon, Robert M. W. 1982. Noun classifiers and noun classes. In Robert M. W. Dixon (ed.), Where have all the adjectives gone? And other essays in semantics and syntax, 211–233. Berlin: Mouton de Gruyter.
Dixon, Robert M. W. 1986. Noun class and noun classification. In C. Craig (ed.), Noun classes and categorization, 105–112. Amsterdam: John Benjamins.
Do-Hurinville, Danh Thành. 2013. Complexité dans une langue isolante: exemple du vietnamien. Nouvelles perspectives en sciences sociales: Revue internationale de systémique complexe et d’études relationnelles 9(1). 239–267.
Doetjes, Jenny. 2012. Count/Mass Distinctions across Languages. In Claudia Maienborn, Klaus von Heusinger & Paul Portner (eds.), Semantics: An international handbook of natural language meaning, part III, 2559–2580. Berlin: Mouton de Gruyter.
Dotte, Anne-Laure. 2017. Dynamism and change in the possessive classifier system of Iaai. Oceanic Linguistics 56(2). 339–363.
Downing, Pamela A. 1986. The Anaphoric Use of Classifiers in Japanese. In Colette G. Craig (ed.), Typological studies in language, vol. 7, 345. Amsterdam: John Benjamins.
Enfield, Nick J. 2004. Nominal classification in Lao: A sketch. STUF – Language Typology and Universals 57(2/3). 117–143.
Erbaugh, Mary S. 1986. Taking stock: The development of Chinese noun classifiers historically and in young children. In Colette Grinevald (ed.), Noun classes and categorization: Proceedings of a symposium on categorization and noun classification, 399–436. Amsterdam: John Benjamins.
Erbaugh, Mary S. 2002. Classifiers are for specification: Complementary functions for sortal and general classifiers in Cantonese and Mandarin. CLAO 31(1). 33–69.
Fedden, Sebastian & Greville G. Corbett. 2017. Gender and classifiers in concurrent systems: Refining the typology of nominal classification. Glossa: A Journal of General Linguistics 2(1). 1–47.
Genetti, Carol. 2007. A grammar of Dolakha Newar. Berlin: Mouton de Gruyter.
Gerner, Matthias & Walter Bisang. 2008. Inflectional speaker-role classifiers in Weining Ahmao. Journal of Pragmatics 40(4). 719–732.
Gerner, Matthias & Walter Bisang. 2010. Social-deixis in Weining Ahmao. In Jan Wohlgemuth & Michael Cysouw (eds.), Rara and Rarissima, 75–94. Berlin: Mouton de Gruyter.
Ghomeshi, Jila & Diane Massam. 2012. The mass count distinction: Issues and perspectives. In Diane Massam (ed.), Count and mass across languages, 1–8. Oxford: Oxford University Press.
Gil, David. 2013. Numeral classifiers. In Matthew S. Dryer & Martin Haspelmath (eds.), The world atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. http://wals.info/chapter/55 (accessed 6 January 2021).
Gillon, Brendan S. 1999. The Lexical Semantics of English Count and Mass Nouns. In Evelyne Viegas (ed.), Breadth and depth of semantic lexicons, 19–37. Dordrecht: Springer.
Goddard, Cliff. 2005. The languages of East and Southeast Asia: An introduction. Oxford: Oxford University Press.
Goral, Donald R. 1978. Numerical classifier systems: A Southeast Asian cross-linguistic analysis. Linguistics of the Tibeto-Burman Area 4(1). 1–72.
Greenberg, Joseph H. 1990. Generalizations about Numeral Systems. In Keith Denning & Suzanne Kemmer (eds.), On language: Selected writings of Joseph H. Greenberg, 271–309. Stanford: Stanford University Press.
Grinevald, Colette. 1999. Typologie des systèmes de classification nominale. Faits de langues 7(14). 101–122.
Grinevald, Colette. 2000. A morphosyntactic typology of classifiers. In Gunter Senft (ed.), Systems of nominal classification, 50–92. Cambridge: Cambridge University Press.




Grinevald, Colette. 2015. Linguistics of classifiers. In James D. Wright (ed.), International Encyclopedia of the Social & Behavioral Sciences, 811–818. Oxford: Elsevier.
Hansen, Chad. 1983. Language and logic in Ancient China. Ann Arbor: University of Michigan Press.
He, Jie. 2001. Xiandai Hanyu Liangci Yanjiu [Studies on classifiers in Modern Chinese]. Beijing: Nationalities Publishing House.
Hellwig, Birgit. 2003. The grammatical coding of postural semantics in Goemai (a West Chadic language of Nigeria). Nijmegen: Radboud University PhD dissertation.
Her, One-Soon. 2012. Distinguishing classifiers and measure words: A mathematical perspective and implications. Lingua 122(14). 1668–1691.
Her, One-Soon. 2017. Structure of numerals and classifiers in Chinese: Historical and typological perspectives and cross-linguistic implications. Language and Linguistics 18(1). 26–71.
Her, One-Soon, Marc Tang & Bing-Tsiong Li. 2019. Word order of numeral classifiers and numeral bases. STUF – Language Typology and Universals 72(3). 421–452.
Hla Pe. 1965. A re-examination of Burmese Classifiers. Lingua 15. 163–185.
Hopper, Paul J. 1986. Some discourse functions of classifiers in Malay. In Colette G. Craig (ed.), Typological studies in language, vol. 7, 309–325. Amsterdam: John Benjamins.
Huang, Chu-Ren & Kathleen Ahrens. 2003. Individuals, kinds and events: Classifier coercion of nouns. Language Sciences 25(4). 353–373.
Huffman, Franklin. 1970. Modern Spoken Cambodian. New Haven: Yale University Press.
Hurd, Conrad. 1977. Nasioi projectives. Oceanic Linguistics 16(2). 111.
Jenks, Peter. 2017. Numeral classifiers compete with number marking: Evidence from Dafing. Presentation at the Linguistic Society of America Annual Meeting, Austin, TX.
Jenny, Mathias. 2019. Thai. In Alice Vittrant & Justin Watkins (eds.), The Mainland Southeast Asia Linguistic Area, 559–608. Berlin: Mouton de Gruyter.
Jenny, Mathias & San San Hnin Tun. 2016. Burmese, a comprehensive grammar. New York: Routledge.
Jiang, Ying. 2006. Hanzang Yuxi Mingliangci Yanjiu [A study of nominal classifiers in Sino-Tibetan]. Beijing: The Central University of Nationalities PhD dissertation.
Jones, Robert B. 1970. Classifier constructions in Southeast Asia. Journal of the American Oriental Society 90(1). 1.
Kemmerer, David. 2017a. Categories of object concepts across languages and brains: The relevance of nominal classification systems to cognitive neuroscience. Language, Cognition and Neuroscience 32(4). 401–424.
Kemmerer, David. 2017b. Some issues involving the relevance of nominal classification systems to cognitive neuroscience: Response to commentators. Language, Cognition and Neuroscience 32(4). 447–456.
Kilarski, Marcin. 2013. Nominal classification: A history of its study from the Classical Period to the present. Amsterdam: John Benjamins.
Kilarski, Marcin. 2014. The place of classifiers in the history of linguistics. Historiographia Linguistica 41(1). 33–79.
Krifka, Manfred. 1995. Common nouns: A contrastive analysis of Chinese and English. In Gregory N. Carlson & Francis J. Pelletier (eds.), The generic book, 398–411. Chicago: University of Chicago Press.
Lam, Sylvie & Marie-Thérèse Vinet. 2005. Classifieurs nominaux et verbaux en chinois mandarin. Actes du Congrès annuel de l’Association canadienne de linguistique 2005. 1–11.
Li, Jinxi. 1924. Xinzhu Guoyu Wenfa [New Chinese Grammar]. Shanghai: The Commercial Press.
Li, Wendan. 2000. The pragmatic function of numeral classifiers in Mandarin Chinese. Journal of Pragmatics 32. 1113–1133.
Li, XuPing & Walter Bisang. 2012. Classifiers in Sinitic languages: From individuation to definiteness-marking. Lingua 122. 335–355.


Lichtenberk, Frantisek. 1983. A grammar of Manam. Honolulu: University of Hawaii Press.
Link, Godehard. 1998. Algebraic semantics in language and philosophy. Stanford: CSLI.
Liu, Shiru. 1965. Wei-Jin Nanbeichao Liangci Yanjiu [A study on classifiers in the Wei-Jin and in the Nanbeichao Periods]. Beijing: Zhonghua shuju chuban.
Liu, Yulan (Thanyalak Saeliao). 2012. Thailand Mien Reference Grammar. Beijing: Minzu University of China PhD thesis.
Löbel, Elisabeth. 2000. Classifiers versus genders and noun classes: A case study in Vietnamese. In Barbara Unterbeck, Matti Rissanen, Terttu Nevalainen & Mirja Saari (eds.), Gender in grammar and cognition, 259–320. Berlin & New York: Mouton de Gruyter.
Mathieu, Eric. 2012. On the mass-count distinction in Ojibwe. In Diane Massam (ed.), Count and mass across languages, 172–198. Oxford: Oxford University Press.
Matisoff, James A. 1973. A grammar of Lahu. Berkeley: University of California Press.
Matthews, Stephen & Virginia Yip. 1994. Cantonese, a comprehensive grammar. New York: Routledge.
Meir, Irit & Wendy Sandler. 2007. A language in space: The story of Israeli Sign Language. New York: Lawrence Erlbaum Associates.
Mortensen, David. 2019. Hmong (Mong Leng). In Alice Vittrant & Justin Watkins (eds.), The Mainland Southeast Asia Linguistic Area, 609–652. Berlin: Mouton de Gruyter.
Nguyễn, Đình Hoà. 1957. Classifiers in Vietnamese. WORD 13(1). 124–152.
Nomoto, Hiroki. 2013. Number in classifier languages. Minneapolis: University of Minnesota PhD dissertation.
Nomoto, Hiroki & Hooi Ling Soh. 2019. Malay. In Alice Vittrant & Justin Watkins (eds.), The Mainland Southeast Asia Linguistic Area, 475–522. Berlin: Mouton de Gruyter.
Peterson, David A. 2008. Bangladesh Khumi verbal classifiers and kuki-chin “chiming”. LTBA 31(1). 109–138.
Peyraube, Alain. 1998. On the History of Classifiers in Archaic and Medieval Chinese. In Benjamin K. T’sou (ed.), Studia linguistica serica, 131–145. Hong Kong: City University of Hong Kong.
Peyraube, Alain & Thekla Wiebusch. 1993. Le rôle des classificateurs nominaux en chinois et leur évolution historique: un cas de changement cyclique. Faits de langues 1(2). 51–61.
Quine, Willard Van Orman. 1960. Word and object. Cambridge: MIT Press.
Rygaloff, Alexis. 1973. Grammaire Élémentaire du Chinois. Paris: PUF.
Sanches, Mary & Linda Slobin. 1973. Numeral classifiers and plural marking: An implicational universal. Working Papers in Language Universals 11. 1–22.
Schembri, Adam. 2003. Perspectives on classifier constructions in sign languages. In Karen Emmorey & Adam Schembri (eds.), Perspectives on classifier constructions in sign language, 3–34. Mahwah, NJ: Lawrence Erlbaum Associates.
Seifart, Frank. 2010. Nominal classification. Language and Linguistics Compass 4(8). 719–736.
Senft, Gunter. 1986. Kilivila: The language of the Trobriand Islanders. Berlin & New York: Mouton de Gruyter.
Senft, Gunter. 2000. Systems of nominal classification. Cambridge: Cambridge University Press.
Simpson, Andrew. 2005. Classifiers and DP structure in Southeast Asia. In Guglielmo Cinque & Richard Kayne (eds.), The Oxford handbook of comparative syntax, 806–838. New York: Oxford University Press.
Suárez, Jorge A. 1983. The Mesoamerican Indian Languages. (Cambridge Language Surveys.) Cambridge: Cambridge University Press.
Tang, Chih-Chen Jane. 2004. Two types of classifier languages: A typological study of classification markers in Paiwan noun phrases. Language and Linguistics 5(2). 377–407.
Tang, Marc & One-Soon Her. 2019. Insights on the Greenberg-Sanches-Slobin generalization: Quantitative typological data on classifiers and plural markers. Folia Linguistica 53(2). 297–331.



Classifiers in Southeast Asian languages 


T’sou, Benjamin K. 1976. The structure of nominal classifier systems. Oceanic Linguistics Special Publications 13. 1215–1247.
Tumtavitikul, Apiluck, Chirapa Niwatapant & Philipp Dill. 2009. Classifiers in Thai Sign Language. SKASE Journal of Theoretical Linguistics 9. 27–44.
Vittrant, Alice. 2005. Classifier systems and noun categorization devices in Burmese. Proceedings of the Twenty-Eighth Annual Meeting of the Berkeley Linguistics Society: Special Session on Tibeto-Burman and Southeast Asian Linguistics 28. 129–148. DOI: https://doi.org/10.3765/bls.v28i2.1032.
Vittrant, Alice & Justin Watkins. 2019. The Mainland Southeast Asia Linguistic Area (Trends in Linguistics. Studies and Monographs [TiLSM] 314). Berlin: Mouton de Gruyter.
Vittrant, Alice & Léa Mouton. Forthcoming. Systèmes de classification nominale en Asie du Sud-Est: les fonctions des classificateurs. Faits de Langue 51(1).
Wu, Fuxiang, Shengli Feng & C. T. James Huang. 2006. Hanyu Shu+liang+ming Geshi de Laiyuan [On the origin of the construction of Numeral+classifier+noun in Chinese]. Zhongguo Yuwen [Studies of the Chinese language] 5. 387–400.
Qin, Xiaohang. 2007. Concurrent functions of Hawyiengz Zhuang classifiers. Mon-Khmer Studies 37. 165–178.
Yang-Drocourt, Zhitang. 2004. Evolution syntaxique du classificateur en chinois du XIIIe siècle av. J.-C. au XVIIe siècle. Paris: École des Hautes Études en Sciences Sociales, Centre de Recherches Linguistiques sur l’Asie Orientale.
Yi, Byeong Uk. 2011. What is a numeral classifier? Philosophical Analysis 23. 195–258.
Zhang, Niina Ning. 2012. Countability and numeral classifiers in Mandarin. In Diane Massam (ed.), Count and mass across languages, 220–237. Oxford: Oxford University Press.
Zhang, Yachu. 2001. Yin-Zhou Jinwen Jicheng Yinde [Index to collection of inscriptions of the Yin-Zhou Period]. Beijing: Zhonghua Book Company.

Walter Bisang

32 Grammaticalization in Mainland Southeast Asian languages

https://doi.org/10.1515/9783110558142-032

32.1 Introduction

Grammaticalization is a widespread phenomenon in Mainland Southeast Asian languages. Compared to other languages, grammaticalization in this area is characterized by the high importance of discourse and pragmatic inference, even with markers that express grammatical functions generally associated with high degrees of grammaticalization. This can be seen from their non-obligatoriness and their multifunctionality. In both cases, the relevant grammatical information is left to pragmatic inference. After a general introduction to grammaticalization, this will be illustrated with exemplary case studies on (i) the use of kinship terms as pronouns in Vietnamese (32.3), (ii) numeral classifiers employed for the expression of definiteness as well as indefiniteness in Vietnamese (32.4) and (iii) the development of various grammatical markers derived from the application of reanalysis and analogy on strings of juxtaposed verbs, often discussed in terms of serial verb constructions (32.5). The presentation of these case studies additionally reveals two corollaries of pragmatic inference: (i) The retention of earlier meanings blocks layering and supports the accumulation of meaning on individual linguistic signs. (ii) The emergence of specific syntactic patterns (attractor positions) potentially operates against the gradualness of grammaticalization. In the conclusion, these findings will be briefly related to linguistic complexity.

32.2 Grammaticalization and its specifics in Mainland Southeast Asian languages

32.2.1 General remarks on grammaticalization

Grammaticalization covers that part of language change which is concerned with grammar. Since its introduction into linguistics by Meillet (1912), this term has been used to describe the development of lexical items into markers of grammatical categories. Later definitions also include processes in which a linguistic item which is already expressing a grammatical function takes on yet another, further grammaticalized function (Kuryłowicz 1965 [1975: 52]; Bybee et al. 1994: 4; Hopper and Traugott 2003: 1). Prototypical examples of grammaticalization are verbs with the meaning of ‘want’ developing into future markers, or nouns denoting body-parts becoming adpositions for marking orientation in space and, at a later stage, in time. Processes like


these proceed along different diachronic stages which are called pathways or clines. In the course of time, research on grammaticalization has brought to light an impressive number of pathways, as can be seen from the World Lexicon of Grammaticalization by Heine and Kuteva (2002) and its second and revised edition published by Kuteva et al. (2019). A prominent example is the following pathway from Givón (1979: 209):

(1) Discourse > Syntax > Morphology > Morphophonemics > Zero.

This pathway describes how a linguistic unit that used to have a function in discourse (e.g. a topic) at a certain period of time acquires a syntactic function (e.g. subject) at a later stage and then becomes part of morphology and morphophonemics until it may ultimately end up as a zero-marker in a binary opposition pair. Pathways of this type are claimed to have several cross-linguistically valid properties. They are cyclic in the sense that they are realized in stages, whose sequence follows universal clines and is unidirectional in the sense that it is not reversible. Thus, a syntactic unit in (1) can become a morphological item while the reverse development is claimed to be rare if not impossible (Bybee et al. 1994: 9–22). While there clearly are counterexamples to unidirectionality, their number is actually rather small, as one can see from Norde (2009) on degrammaticalization.

32.2.2 Grammaticalization in Mainland Southeast Asian languages

Even though grammaticalization is generally described as a phenomenon that is cross-linguistically homogeneous, a closer look at these properties reveals that an impressive number of individual processes of grammaticalization in rather divergent domains of grammar are realized in similar ways which together constitute areal characteristics in Mainland Southeast Asian languages as well as in Sinitic. Since this chapter is on Mainland Southeast Asian languages (henceforth SEA languages for short), the focus of this presentation lies on Tai-Kadai/Kra-Dai, Austroasiatic and Hmong-Mien¹ (on general publications of this area, cf. Bisang 2004, 2011, 2020a; Enfield 2003, 2005; Enfield and Comrie 2015; Vittrant and Watkins 2019). On the whole, there are three areal characteristics of grammaticalization. They are briefly introduced in this section and will be extensively discussed with examples in 32.3 to 32.5.

The first property of grammaticalization is about the prominence of discourse and pragmatic inference. As will become evident in the course of this chapter, discourse and pragmatic inference are of fundamental importance for SEA languages in general

1 The fourth family marginally involved is Austronesian with its Chamic branch, spoken in Cambodia, Vietnam and on Hainan island. There will be no data on languages from this branch in this chapter.




and they are reflected in the lack of obligatoriness and the multifunctionality of grammaticalized markers (cf. 32.2.3 for more on this topic). In addition, the prominence of pragmatic inference in SEA languages sheds new light on the coevolution of meaning and form, i.e. the question of the extent to which change in meaning affects change in form (cf. Bisang 2004, 2011). While some approaches argue that change in meaning universally comes with change in form, there are other approaches arguing against this view. More recently, this debate has been summarized by Heine (2018) in terms of the Parallel Reduction Hypothesis and the Meaning First Hypothesis. The first hypothesis is reflected in classical research as in Lehmann’s (2002 [1982]) assumption that the autonomy of the linguistic sign gets reduced with increasing grammaticalization or in the work of Bybee et al. (1994: 20), which starts out from the idea “that the development of grammatical material is characterized by the dynamic coevolution of meaning and form”. The second hypothesis claims that “[s]emantic change is primary in grammaticalization and precedes form change (i.e. morphosyntactic and phonological change) in time” (Heine 2018: 20). This perspective is prominently represented by Heine et al. (1991) and Heine (2002) as well as by the Invited Inference Theory of Semantic Change as presented by Traugott (2002) and Traugott and Dasher (2002). As will be shown in this chapter, the languages of SEA (as well as Sinitic) generally support the Meaning First Hypothesis.² It is particularly remarkable how even linguistic signs with functions generally associated with high degrees of grammaticalization only show limited reduction on the form side, if at all³ (Bisang 2004, 2008, 2011).

From the perspective of the Meaning First Hypothesis, pragmatic inference is the driving force which initiates processes of grammaticalization. It prominently operates through metonymic and metaphoric inferences (Hopper and Traugott 2003: 81–93), which activate reanalysis and analogy (Hopper and Traugott 2003: 50–70), respectively. Metonymic inference crucially depends on the order of linguistic elements and their interpretation in such a morphosyntactic environment. Given the right context, it induces the morphosyntactic reanalysis of a string of linguistic units. For the sake of illustration, let’s look at the following slightly idealized example from Khmer:

2 From a more general perspective, Bisang and Malchukov (2020) show on the basis of a systematic statistical analysis of cross-linguistic data from 29 individual languages, language families or geographic areas that the two hypotheses are too coarse-grained and need further specification in terms of individual parameters and their interactions. The parameters selected for that study are based on a slightly modified version of Lehmann (2002 [1982]) and two additional parameters (decategorization and allomorphy). On the whole, grammaticalization was measured in terms of the following eight parameters: semantic integrity, phonetic reduction, paradigmaticity, bondedness, paradigmatic variability, syntagmatic variability, decategorization and allomorphy.

3 A good example is the Thai perfective marker lɛ́ɛw, which did not undergo any phonological reduction even though it has become almost obsolete in its function of a full verb.

(2) Khmer (Bisang 2015a: 139)
    ʔoːpùk  sɔŋ    phtɛ̀əh  ʔaoj  koːn   nɤ̀u.
    father  build  house   give  child  live/stay
    a. Sequence of three verbs: ‘Father builds a house, gives [it] to [his] children and they live [in it]’
    b. Preposition: ‘Father builds a house for [his] children and [they] live [in it].’

At an initial stage, each of the three verbs in this example can be treated as an individual verb which is iconically concatenated in a sequence of events as in [[NP V1-build N] [V2-give NP] [V3-live.in]], translated as in (2a). As we will see later in 32.5.3, there are different options for reanalysis in (2). For the sake of illustration, only one is picked out here, in which the verb ʔaoj ‘give’ is interpreted as a benefactive preposition. As a consequence, there remain only two clauses as shown by the bracketing in the following syntactic analysis: [[NP V1-build NP P-for NP] [V3-live.in]]. Here, the first clause is headed by the verb sɔŋ ‘build’ with its prepositional adjunct, the second one is headed by nɤ̀u ‘live at’.

Metaphoric inference is based on “understanding and experiencing one kind of thing in terms of another” (Hopper and Traugott 2003: 84) and follows principles of analogy across conceptual boundaries. Since metaphoric and metonymic inference “are complementary, not mutually exclusive” (Hopper and Traugott 2003: 93), the interpretation of the ‘give’-verb as a preposition can be modelled by way of analogy with other verbs used in prepositional function (e.g. nɤ̀u ‘be at’ as a locative marker or dɔl ‘arrive’ as a marker of the target ‘as far as, to’) or by reanalysis as shown above. Both types of pragmatic inference play a key role in the discussion of serial verb constructions (cf. 32.5).

The second property of grammaticalization in SEA languages is about the retention of earlier meaning. As is very frequently observed in processes of grammaticalization, the development of a new meaning B from an earlier meaning A is characterized by an overlapping period in which a linguistic item has both meanings before it loses its initial meaning (Heine et al. 1991; Hopper 1991). This process, which is called layering by Hopper (1991), is often represented by a pathway of the following type: A > {B/A} > B. While such pathways are cross-linguistically common, they are by no means universal. As was pointed out more recently by Xing (2013, 2015), grammaticalization in Chinese is rather characterized by the “accretion of more meaning over time” (Xing 2015: 595) in the sense that a linguistic item develops from A to B and then from B to C without losing its former meanings, creating a pathway of the type A > A,B > A,B,C, etc. The existence of such pathways is to be expected in languages in which pragmatic inference remains important even at stages in which a given marker expresses functions associated with prototypical grammatical domains like definiteness/indefiniteness or tense-aspect-mood (TAM) (Bisang 2020b).

The third and last property to be discussed here is gradualness (Traugott and Trousdale 2010). While gradualness is clearly an important property of grammaticalization, there are instances in which a lexical item can take on a new function within a




relatively short period of time. The conditions for such rapid instances of change crucially depend on the existence of specific syntactic positions associated with specific grammatical domains (e.g. TAM) which can attract new lexical items for expressing a specific value of that domain (cf. attractor positions in 32.5.4). The mechanism that supports such flexibility of lexical items to be interpreted in the light of a new grammatical function in a given morphosyntactic environment is again metonymic and metaphoric inference.
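The contrast drawn in this section between layering (A > {B/A} > B) and accretion (A > A,B > A,B,C) can be made concrete with a small sketch. The Python fragment below is purely illustrative and is not part of any of the cited analyses: the function names `layering_pathway` and `accretion_pathway` are hypothetical, and meanings are reduced to unanalysed labels. It merely enumerates the stages that each pathway type predicts for a given sequence of meanings.

```python
from typing import List, Set


def layering_pathway(meanings: List[str]) -> List[Set[str]]:
    """Stages of a layering pathway: A > {B/A} > B (the earlier meaning is lost)."""
    if not meanings:
        return []
    stages = [{meanings[0]}]
    for old, new in zip(meanings, meanings[1:]):
        stages.append({old, new})  # overlapping period with both meanings
        stages.append({new})       # the initial meaning has been lost
    return stages


def accretion_pathway(meanings: List[str]) -> List[Set[str]]:
    """Stages of an accretion pathway: A > A,B > A,B,C (all meanings are retained)."""
    stages = []
    retained: Set[str] = set()
    for meaning in meanings:
        retained = retained | {meaning}  # add the new meaning, keep the old ones
        stages.append(set(retained))
    return stages
```

Under these assumptions, the input A, B, C yields five stages for layering (A, then A and B, then B, then B and C, then C) but three cumulative stages for accretion (A, then A and B, then A, B and C).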

32.2.3 On pragmatic inference and its prominence in Southeast Asian languages

The prominence of pragmatic inference, seen in cross-linguistic comparison, is a general property of the grammars of East and Mainland Southeast Asian languages (Huang 1994; LaPolla 2003) and it is also reflected in their products of grammaticalization (Bisang 2004, 2008, 2011). For that reason, this first prominent characteristic of grammaticalization in SEA languages is given some more extensive attention in this section. In Huang’s (1994) analysis of anaphora, pragmatic inference affects the division of labour between pragmatics and syntax. As he points out, “[t]here seems to exist a class of language (such as Chinese, Japanese and Korean) where pragmatics appears to play a central role which in familiar European languages (such as English, French and German) is alleged to be played by grammar” (Huang 1994: xiv). By focusing on the interaction of grammar and pragmatic inference, the present chapter takes a somewhat different perspective. It is interested in the extent to which information which is part of the grammatical inventory of a language can actually be subject to pragmatic inference depending on assumptions of the presence of mutually shared information between speaker and hearer in a given discourse scenario. In such a context, the importance of pragmatic inference generally manifests itself at least in the following two properties (Bisang 2004, 2008, 2011, 2020b):

1. Lack of obligatoriness: The grammar of a language does not force the speaker to overtly express grammatical categories that are part of its grammatical inventory in a given syntactic environment.⁴
2. Multifunctionality: One and the same grammatical marker expresses more than one grammatical function, depending on extra-linguistic context or intra-linguistic/constructional context. This lack of clarity also detracts from explicitness.

4 Notice that obligatoriness is defined in terms of Lehmann’s (2002) concept of the autonomy of the linguistic sign. The expression of a given grammatical category is obligatory if the grammar of a language forces its speaker to select one of the values available from the set of markers for expressing that category in a given morphosyntactic environment.


Since these properties are first of all properties of individual grammatical markers and constructions, overall prominence of pragmatic inference presupposes the occurrence of these properties in a wide variety of different grammatical domains in a language. And this is exactly what can be observed in SEA languages and in Sinitic (Bisang 2004, 2008). To illustrate this, let’s look briefly at some examples.

A straightforward example of lack of obligatoriness is radical pro-drop, i.e. the omission of verbal arguments in an independent clause without concomitant agreement marking on the verb (Rizzi 1986; Neeleman and Szendrői 2007 and many others). This phenomenon is well-known in SEA languages (and in Sinitic; Bisang 2020a). The following two examples are of particular interest because they show specific effects of pragmatic inference in radical pro-drop. The first example in (3) from Khmer additionally shows that pro-drop does not necessarily entail topic continuity. The story is about a sequence of events marked by square brackets, each of them with no overt subject, as is indicated by ø. While the unmarked subject of the first three bracketed events is the husband, the last unexpressed subject is the lover in the pitcher. This change of topic can only be inferred from context and general knowledge about the consequences of boiling water that is poured over somebody:

(3) Khmer (Bisang 1992: 7–8, 436)
    pdɤj₁    kɔː   [kraok  laəŋ]    [ø₁  daə   tɤ̀u]  [ø₁  lɤ̀ːk  tɯ̀k    mùːəj
    husband  then  get.up  move.up       walk  go         lift  water  one
    khtɛ̀əh  nùh]  [ø₁  jɔ̀ːk  tɤ̀u  sraoc  lɤ̀ː  saːhaːj₂  nɤ̀u  knoŋ
    bucket  dem        take  go   pour   on   lover     at   inside
    pɪ̀ːəŋ    nùh]  [ø₂  slap  tɤ̀u].
    pitcher  dem        die   go
    ‘The husband₁ got up, ø₁ went away, ø₁ raised the one bucket of [boiling] water and ø₁ poured it over the lover₂ [of his wife] in the pitcher and ø₂ died.’
    [Context: A wife hid her lover in a pitcher. Her husband comes home unexpectedly and finds him there.]

Example (4) from Hmong illustrates how individual referents like Nga Shua and Pa Tyai can be combined to form joint subjects in a pro-drop situation:⁵

(4) Hmong (Mottin 1980: 98)
    ces   tus  yawm yij₁       nkɑj suav  thiaj  mus  nrog  paj cai₂
    then  clf  brother.in.law  Nga Shua   also   go   with  Pa Tyai
    ø₁  pw     ib   hmos,  ø₁₊₂  tuɑ   ib   tug  qaib  sam        ø₁₊₂  noj.
        sleep  one  night        kill  one  clf  cock  castrated        eat
    ø₁₊₂  noj  tas,   nws₁  ua    tsaugh,  nws₁  yuav  rov qab.
          eat  compl  3s    make  thank    3s    want  go.back
    ‘Then, [his] brother-in-law Nga Shua₁ also went with Pa Tyai₂ and ø₁ spent a night [with him]. [They₁₊₂] killed a castrated cock and ø₁₊₂ ate it. When [they₁₊₂] finished eating [it], he₁ thanked [him₂] and wanted to go back.’

5 Notice that Hmong has a set of dual pronouns: wb ‘1st person dual’, neb ‘2nd person dual’, nkawd ‘3rd person dual’.

Another look at examples (3) and (4) shows that information on the identity of arguments is not the only grammatical information that often remains unexpressed. Thus, there is not a single marker indicating tense-aspect in example (3) and only one instance of the completive aspect marker tas ‘finish’ in (4) from Hmong. A good example of non-obligatoriness are perfective markers like Khmer haəj (Gorgoniev 1966: 154–159; Haiman 2011a: 279) or Thai lɛ́ɛw (Boonyapatipark 1983; Soogkasem 1990; Kullavanijaya and Bisang 2003). While the perfective form is obligatory in sequences of events as presented in (3) in languages with a prototypical perfective/imperfective opposition, the use of the perfective marker depends on discourse in Khmer and Thai.

Radical pro-drop also has its effects on relative-clause formation. In (5) from Thai, the relative clause is introduced by the marker thîi. As in many other SEA languages, the head noun is not represented overtly within it (for much more, cf. Kullavanijaya 2004). If both arguments remain unexpressed in relative clauses with transitive verbs, this creates potential ambiguities which can only be resolved by pragmatic inference. Thus, the construction in (5a) is ambiguous, since it can be either interpreted in terms of subject coreference (‘the man who eats’) or in terms of object coreference (‘the man whom X eats’). The interpretation given in the translation of (5a) is not the result of grammatical rules but rather the result of pragmatic inference from world knowledge (cf. Iida [1996] on Japanese or Bisang [2014: 132] on Chinese for similar problems). Similarly, the interpretations in terms of object coreference in (5b) and locative coreference in (5c) are due to pragmatics.

(5) Thai
    a. Subject coreference:
       khon  [thîi  kin]
       man   rel    eat
       ‘the man who eats.’
    b. Object coreference:
       məmûəŋ  [thîi  kin]
       mango   rel    eat
       ‘the mango which is eaten/which X eats’
    c. Locative coreference:
       ráan-ʔaahǎan  [thîi  kin]
       Restaurant    rel    eat
       ‘the restaurant where X is eating’


Multifunctionality comes in two different forms. In its more extreme form, one and the same marker in the same syntactic position (e.g. preverbal) can express different functions. Thus, its concrete meaning in an utterance can only be derived from extra-linguistic context. In the second form of multifunctionality, the functional interpretation of a given marker depends on the intra-linguistic or constructional context in which it occurs. The first case can be illustrated by the grammatical functions of ‘come to have’-verbs as they are frequently found in Tai-Kadai, Austroasiatic, Hmong-Mien and Sinitic (Bisang 1996; Enfield 2003). Typical instances of such verbs are Thai dâj, Khmer baːn, Vietnamese được and Hmong tau. In the preverbal position, ‘come to have’-verbs can have no less than four different meanings, depending on contexts like [±desired] (6a)–(6b), truth in the context of reaction against a wrong presupposition (6c) or assumptions on the position on the time axis (6d):⁶

(6) Possible inferences of ‘come to have’-verbs in SEA and Sinitic languages (revised version of Bisang 2004: 119):
    a. The event E is [+desired]:
       +> modal interpretation: ‘can’ (potential meaning: ability or permission)
    b. The event E is [-desired]:
       +> modal interpretation: ‘must, have to’ (obligation)
    c. If E stands for a wrong presupposition:
       +> truth, factuality
    d. In order for X to come to have E, E must have taken place:
       +> Past (E) (particularly if E is negated)

The following constructed example illustrates the multifunctionality of the Khmer verb baːn ‘come to have’ in its three preverbal functions of ability, truth/factuality and past (the deontic function is rare if not impossible in Khmer; for authentic examples, cf. Enfield 2003 and Bisang 2004: 120–121):

(7) kɔ̀ət  baːn          tɤ̀u  phsaː.
    3s    come.to.have  go   market
    a. ‘He was able to go to the market.’ (ability; cf. [6a])
    b. ‘He did go to the market.’ (factuality of E against a wrong presupposition; [6c])
    c. ‘He went to the market.’ (past, cf. [6d])
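The inferences listed in (6), as applied to (7), can be restated as a simple decision procedure. The following Python sketch is an illustration under strong simplifying assumptions: the class `Context`, its three attributes and the function `infer_readings` are hypothetical names introduced here, and real discourse contexts are of course not reducible to three flags. The optional parameter `allow_obligation` is likewise an assumption; it is meant to reflect the remark that the deontic reading is rare if not impossible in Khmer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Context:
    """Hypothetical summary of the contextual assumptions listed in (6)."""
    desired: Optional[bool] = None         # (6a)/(6b): is the event E [+desired] or [-desired]?
    corrects_presupposition: bool = False  # (6c): E reacts against a wrong presupposition
    located_in_past: bool = False          # (6d): E is assumed to have taken place


def infer_readings(ctx: Context, allow_obligation: bool = True) -> List[str]:
    """Return the readings of a preverbal 'come to have'-verb that ctx licenses."""
    readings = []
    if ctx.desired is True:
        readings.append("ability/permission")  # (6a)
    if ctx.desired is False and allow_obligation:
        readings.append("obligation")          # (6b)
    if ctx.corrects_presupposition:
        readings.append("truth/factuality")    # (6c)
    if ctx.located_in_past:
        readings.append("past")                # (6d)
    return readings
```

In this sketch, the three translations of (7) correspond to three different `Context` values for one and the same surface string; a context that sets none of the flags yields an empty list, i.e. no reading is singled out.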

The second type of multifunctionality can be illustrated by various verbs which are used in different grammatical functions depending on their syntactic position within specific constructions. We will discuss the multifunctionality of ‘give’-verbs and ‘finish’-verbs in 32.5.3.

6 +> stands for “pragmatically inferred”.




Taking together the discussion in 32.2.2 and 32.2.3, grammaticalization in SEA languages is characterized by (i) the prominence of discourse and pragmatic inference, which manifests itself in (a) lack of obligatoriness and (b) multifunctionality, (ii) retention of earlier meanings, i.e. the accretion of meaning rather than layering, and (iii) certain instances of non-gradual grammaticalization. Sections 32.3 to 32.5 show how these properties interact in three domains of grammar. Section 32.3 is on kinship terms in pronominal function and illustrates characteristics (i) and (ii). The same characteristics also matter in the case of Vietnamese numeral classifiers expressing definiteness as well as indefiniteness as described in 32.4. Here, the relevance of pragmatics even manifests itself in meaning because the definiteness associated with classifiers is pragmatic definiteness rather than semantic definiteness (cf. Löbner 1985). Finally, 32.5 is on the grammaticalization of individual verbs out of the unmarked concatenations of verbs. After raising the issue of serial verb constructions in SEA languages, this section covers all three properties of grammaticalization as well as the way in which reanalysis and analogy support alternative analyses of surface strings. The conclusion in 32.6 presents a short summary and an outline of how SEA languages with their specific grammaticalization properties fit into the wider discussion of linguistic complexity.

32.3 Vietnamese kinship terms in pronominal function

SEA languages are known for their extensive sets of pronouns based on differences in social status and for their large number of nouns, kinship terms and others, which can also be used as personal pronouns (e.g. Cooke 1968 on Thai, Burmese and Vietnamese). Integrating these forms into clear-cut paradigms as we know them from many Indo-European languages is often difficult or even impossible because their concrete meaning depends on context and pragmatic inference (e.g. kinship terms are person-neutral in their pronominal function, cf. below). As will be shown, the politeness interpretation of pronouns depends on the social setting in which they are used. Of particular interest from the perspective of multifunctionality are kinship terms because their semantics do not convey information on person. Thus, person basically has to be inferred from social context.

Many SEA languages have a set of words which are specialized for expressing a specific person value, often combined with additional information on the social status of the referent they denote. In Vietnamese, the most common ones are tôi ‘1st singular, formal’, tao ‘1st singular, informal’, mày ‘2nd singular, informal’⁷, nó ‘3rd singular, informal’, hắn and y ‘3rd singular, formal’, họ ‘3rd plural, formal’, chúng nó ‘3rd plural, informal’ (for different arrangements of these and other forms in paradigms, cf. Tuc 2003: 120; Le 2011: 89; Huynh and Yoon 2019: 95). Some of these pronouns are clearly the result of grammaticalization. Thus, tôi ‘1st singular, formal’ is also a noun with the meaning of ‘subject, slave’, as in many other East and Mainland Southeast Asian languages. Its pronominal function is the result of a process of grammaticalization which starts out from a bridging context in which the speaker refers to him/herself as a slave for expressing deference.

The interpretation of the above pronouns in terms of politeness and social status is not as straightforward as their definition in terms of formal vs. informal suggests. As Tuc (2003) points out, the actual relation between the speaker who produces an utterance and the referent denoted by the pronoun strongly depends on context: “the usage of such Vietnamese personal pronouns as tao ‘I’ and mày ‘you’ may pragmatically presuppose either the underlying incongruence and hostility, or reinforce solidarity and stability between speakers” (Tuc 2003: 121). If mày is used among peers it indicates group membership and solidarity, if it is used in a situation of social distance it may well induce hostility.

In addition to these pronouns, Vietnamese uses an impressive number of kinship terms in pronominal function (and even terms of social position, titles, etc.). Some kinship terms are for general use beyond the context of family membership (8a), while others are family specific, i.e. they can only be used among family members (8b):

(8) Vietnamese kinship terms (Huynh and Yoon 2019: 94):
    a. for general use:
       anh ‘elder brother’
       chị ‘elder sister’
       em ‘younger sibling’
       cháu ‘grandchild, nephew, niece’
       bác ‘uncle, father’s elder brother/sister (senior referent)’
       con ‘child (son or daughter)’
       chú ‘uncle, father’s younger brother (junior uncle)’
       cô ‘auntie, father’s sister (either senior or junior)’
    b. family specific terms:
       ba, bố, cha, tia ‘father’ (the terms are dialectal variants)
       má/mẹ ‘mother’ (the terms are dialectal variants)
       ông bác ‘a parent’s bác’
       ông chú ‘father’s chú’
       bá cô ‘father’s cô’
       ông nội ‘paternal grandfather’

7 There is no specific pronominal form for ‘2nd person singular, formal’, even though there are many kinship terms that can be used in that function. Sometimes, paradigm tables mention the reflexive pronoun mình for ‘2nd person singular, formal’, but that form can be used for any person, depending on the person of the subject.




The use of kinship terms in pronominal function is very common. In the following example, a younger female seller addresses her elder female customer with the kinship term chị ‘elder sister’ in (9a) to pay respect to her as a person who is older than herself, while she refers to herself as em ‘younger sibling’ in (9b):

(9) Vietnamese (Le 2011: 154)
    a. Chị           lựa     áo      dây     hôn?
       elder.sister  choose  jacket  stripe  Q
       ‘Do you like striped jackets?’
    b. Em               lấi   ra        cho  chị           lựa.
       younger.sibling  take  come.out  for  elder.sister  choose
       ‘I’ll take them out for you to choose.’

The use of kinship terms is associated with politeness. Simultaneously, it creates a certain familiarity between speaker and hearer. In contrast, using a formal pronoun in the same situation would enlarge the distance between speaker and hearer. In the following example, the familiarity that is expressed by the use of cháu ‘nephew, niece’ and cô ‘auntie’ in (10a) would be destroyed if cháu were replaced by tôi ‘1st singular, formal’ as in (10b), which is pragmatically odd:

(10) Vietnamese (Huynh and Yoon 2019: 97)
    a. Cháu          đã   ăn   bánh   ngọt   cô      cho   cháu.
       nephew/niece  pst  eat  bread  sweet  auntie  give  nephew/niece
       ‘I (who you are superior to, who I show respect to) ate the cake that you (who is my superior and who I show respect to) gave me.’
    b. Cháu          đã   ăn   bánh   ngọt   cô      cho   #tôi.
       nephew/niece  pst  eat  bread  sweet  auntie  give  I
       ‘I ate the cake that you gave me.’

Kinship terms differ from the specific pronouns discussed at the beginning of this section by their indeterminacy with regard to person. Thus, assigning the right person interpretation to a given kinship term depends on pragmatic inference based on the concrete speech situation. This is excellently illustrated by the following example from Tuc (2003) on the kinship terms con ‘child’ and má ‘mother’. Interpretation (11a) arises in a situation in which a mother is talking to her child. Thus, con gets the meaning of ‘you’ and má gets the meaning of ‘I’. If the same sentence is used by a father addressing his child as in (11b), con means ‘you’ again, while má is either used with its lexical meaning of ‘mother’ or in the pronominal function of ‘3rd person singular feminine’. If the same sentence is uttered by a child to her/his father as in (11c), con means ‘I’ and má means ‘mother’ or ‘she’. Finally, if the father talks to his wife about their child as in (11d), con is either interpreted as a noun ‘child’ or as a 3rd person pronoun, while má is treated as a 2nd person pronoun:


(11)

 Walter Bisang

Vietnamese (Tuc 2003: 113–114, adapted from Thompson 1987 [1965]: 293)
Tại sao con không nói cho má ích lợi của trầu cau?
why child neg say to mother usefulness poss betel
a. [mother to child]: ‘Why did you not talk about useful aspects of betel chewing to me?’
b. [father to child]: ‘Why did you not talk about useful aspects of betel chewing to mother/her?’
c. [child to father]: ‘Why did I not talk about useful aspects of betel chewing to mother/her?’
d. [father to mother]: ‘Why did the child/he/she not talk about useful aspects of betel chewing to you?’
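The four readings of (11) follow one simple inference: a kinship term is read as 1st person if it denotes the speaker, as 2nd person if it denotes the addressee, and as 3rd person (or as a plain noun) otherwise. The following Python sketch is only an illustration of that reasoning. The function name `person_of` and the role labels are assumptions made for this example, and the sketch deliberately ignores further factors such as relative age or social distance.

```python
# Hypothetical helper illustrating the pragmatic inference behind example (11).
def person_of(term_role, speaker, addressee):
    """Return the person value of a kinship term used pronominally.

    term_role: kin role denoted by the term ('child' for con, 'mother' for má)
    speaker, addressee: kin roles of the two speech act participants
    """
    if term_role == speaker:
        return "1st"
    if term_role == addressee:
        return "2nd"
    # Neither participant matches: 3rd person pronoun or lexical noun reading.
    return "3rd"

# The four speech situations of (11) as (speaker, addressee) pairs.
SITUATIONS = {
    "a": ("mother", "child"),
    "b": ("father", "child"),
    "c": ("child", "father"),
    "d": ("father", "mother"),
}

readings = {
    key: {
        "con": person_of("child", speaker, addressee),
        "má": person_of("mother", speaker, addressee),
    }
    for key, (speaker, addressee) in SITUATIONS.items()
}

for key, reading in readings.items():
    print(key, reading)
```

Under these assumptions, the same string of words receives a different person assignment in each situation, which is the point made by the example: the term itself is indeterminate, and the speech situation supplies the person value.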

Taking into consideration that pronominal elements of Vietnamese are not obligatory (cf. the above remarks on radical pro-drop), pragmatic inference is relevant at the level of lack of obligatoriness as well as at the level of multifunctionality in the case of kinship terms and the expression of person.

32.4 Numeral classifiers and the expression of definiteness and indefiniteness

In many East and Mainland Southeast Asian languages, classifiers are clearly multifunctional inasmuch as they do not only express individuation (Greenberg 1992; Bisang 1999) or atomization (Chierchia 1998) but also definiteness and/or indefiniteness.8 This second function is mainly discussed in more recent literature (Hundius and Kölver 1983 on Thai; Cheng and Sybesma 1999 on Sinitic; Bisang 1999; Wu and Bodomo 2009; Jenks 2011, 2015; Simpson et al. 2011; Simpson 2017; Li and Bisang 2012; Bisang and Wu 2017; etc.). Of particular interest for the discussion in this context are bare classifier constructions [CLF N] in Southeast Asian languages whose (in)definiteness interpretation depends heavily on discourse and pragmatic inference. Vietnamese has such a construction. Its classifier shows multifunctionality because it functions like a variable which can be interpreted as definite or as indefinite (Nguyen 2004; Trinh 2011)9, even though there is some preference for definiteness (Bisang and Quang 2020; Quang forthcoming). At the same time, pragmatic inference also matters at the level of obligatoriness because classifier use is not compulsory. Moreover, the definiteness expressed by the classifier is pragmatic rather than semantic. With these properties, classifiers in [CLF N] clearly differ from articles in Indo-European languages by their multifunctionality, their lack of obligatoriness and the type of definiteness they are associated with. This will be discussed in some more detail in the remainder of this section, which is based on an extensive study by Quang (forthcoming; also cf. Bisang and Quang 2020).

While there are various Sinitic languages in which the (in)definiteness interpretation of the classifier depends on the position of [CLF N] relative to the verb (for a typological survey, cf. Wang 2013), this is not the case in Vietnamese. Here, the classifier can express both functions irrespective of word order. Thus, con bò [CLF cow] in the preverbal position of (12a) can be definite as well as indefinite. The same applies to cuốn sách [CLF book] in the postverbal position of (12b):

(12) Vietnamese (Nguyen 2004; also cf. Bisang and Quang 2020: 16)
a. Con bò ăn lúa kìa!
   clf cow eat paddy sfp
   ‘Look! A/the cow is eating your paddy!’
b. Mang cuốn sách ra đây!
   bring clf book out here
   ‘Get a/the book out here!’

8 Notice that classifiers do not function like articles in various respects to be further discussed in this section (lack of obligatoriness, pragmatic definiteness, use in information structure, etc.). Since there are no languages from this large area in which classifiers are only used in the context of (in)definiteness without also expressing individuation or atomization, it is reasonable to assume that that function is a historically later development. As for Sinitic, this statement is clearly supported by the texts available from earlier stages of Chinese. The problem of whether and to what extent this scenario fits into general notions of secondary grammaticalization is discussed in Bisang (2015a).

9 In other languages, the classifier in [CLF N] only expresses definiteness (e.g. Hmong, Bisang 1993) or indefiniteness (Northern Kam, Gerner 2006; Mandarin Chinese).

Even though classifiers can be used in the context of definiteness and indefiniteness, they are not obligatory. This can be seen from examples (13a) and (13b), which only differ from (12a) and (12b) by the absence of the classifier. The functional difference between (12) and (13) is that the nouns in question can only be singular in (12), while they can be singular or plural in (13):

(13) Vietnamese (Nguyen 2004; also cf. Bisang and Quang 2020: 17)
a. Bò ăn lúa kìa!
   cow eat paddy sfp
   ‘Look! A/the cow(s) is/are eating your paddy!’
b. Mang sách ra đây!
   bring book out here
   ‘Get a/the book(s) out here!’

To get a better understanding of the use and the functions of classifiers in Vietnamese, Quang (forthcoming; also cf. Bisang and Quang 2020) draws on a corpus of 30 written and 30 oral texts produced by native speakers of Vietnamese who reported on the contents of two silent films.10 Even though it was expected that there may be some differences between the written and the oral corpus, no significant differences showed up in the experiment. For that reason, that difference will not be further discussed here.

The presence of a numeral classifier in the context of (in)definiteness depends on the type of concept expressed by the noun and on animacy. The majority of nouns taking a classifier are sortal nouns, which are defined in Löbner (1985, 2011) by the combination of the two features of [– relational] and [– unique]. Relational nouns are characterized by an additional relational argument as in daughter [of someone] in contrast to non-relational nouns like girl with no additional argument position. Unique nouns denote uniquely determined concepts in a given context (e.g. the moon, the sky, the king, the president). Given that sortal nouns are neither relational nor unique, they typically denote concepts like man, cat, tree, stone, river, etc. If one compares the occurrence of all tokens of nouns attested in both corpora, one can see that out of 4,223 sortal nouns (i.e. 2,309 + 1,914), 2,309 or 54.7 % occur with a classifier. For the other three possible combinations, that percentage is much lower. It is 4 out of 165 (i.e. 2 + 83 + 1 + 56 + 1 + 22) nouns or 2.4 %. See Table 1.

Tab. 1: Token frequency of CLF with [± relational] and [± unique] nouns in the written and the oral corpus (from Bisang and Quang 2020: 24).

           [+ relational]                                    [– relational]
           Functional: [+ unique]   Relational: [– unique]   Individual: [+ unique]   Sortal: [– unique]
           [CLF N]     [N]          [CLF N]     [N]          [CLF N]     [N]          [CLF N]     [N]
Written    2           76           0           48           0           2            1,567       1,563
Oral       0           7            1           8            1           20           742         351
Total      2           83           1           56           1           22           2,309       1,914

The importance of the other semantic criterion of animacy can be seen from Table 2. Out of a total of 1,698 tokens of animate nouns, no less than 1,571 or 92.5 % take a classifier, while only 27.6 % of the inanimate nouns (742 out of 2,690 nouns) occur in the bare classifier construction [CLF N].

10 The experiments were carried out by Kim Ngoc Quang in 2016 in Ho Chi Minh City. The film for the written corpus is titled “Cook Papa, Cook” and is available on YouTube under https://www.youtube.com/watch?v=OITJxh51z3Q (accessed 15 January 2021). The oral corpus is based on the Pear Story (Chafe 1980) and is accessible on YouTube under https://www.youtube.com/watch?v=bRNSTxTpG7U (accessed 15 January 2021). The total length of the written corpus is 31,663 words, while the oral corpus consists of 17,777 words.



Tab. 2: Token frequency of [±animate] nouns and their interpretation as definite and indefinite in the written and the oral corpus (from Bisang and Quang 2020: 21).

Construction   [+ animate]                [– animate]
[CLF N]        1,571 instances (92.5 %)   742 instances (27.6 %)
[N]            127 instances (7.5 %)      1,948 instances (72.4 %)
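The percentages reported for Tables 1 and 2 can be recomputed from the token counts. The following Python sketch is offered only as an arithmetic check; the dictionary layout and the helper name `clf_share` are assumptions made for this illustration, while the counts themselves are copied from the two tables.

```python
# Token counts from Tables 1 and 2, stored as
# (tokens with classifier [CLF N], bare tokens [N]), written + oral combined.
TABLE_1 = {
    "functional": (2, 83),     # [+ relational], [+ unique]
    "relational": (1, 56),     # [+ relational], [- unique]
    "individual": (1, 22),     # [- relational], [+ unique]
    "sortal": (2309, 1914),    # [- relational], [- unique]
}
TABLE_2 = {
    "animate": (1571, 127),
    "inanimate": (742, 1948),
}

def clf_share(clf, bare):
    """Percentage of tokens occurring with a classifier, to one decimal."""
    return round(100 * clf / (clf + bare), 1)

# Sortal nouns versus the three other noun types taken together.
sortal_share = clf_share(*TABLE_1["sortal"])
other_clf = sum(clf for kind, (clf, _) in TABLE_1.items() if kind != "sortal")
other_bare = sum(bare for kind, (_, bare) in TABLE_1.items() if kind != "sortal")
other_share = clf_share(other_clf, other_bare)

animate_share = clf_share(*TABLE_2["animate"])
inanimate_share = clf_share(*TABLE_2["inanimate"])

# Both tables classify the same tokens, so the [CLF N] totals should agree.
total_clf_table_1 = sum(clf for clf, _ in TABLE_1.values())
total_clf_table_2 = sum(clf for clf, _ in TABLE_2.values())

print(sortal_share, other_share, animate_share, inanimate_share)
print(total_clf_table_1, total_clf_table_2)
```

The last two totals serve as a consistency check between the two tables, since every classified token is counted once by noun type and once by animacy.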

The exceptions to these semantic rules are based on information structure and discourse. Thus, discourse/information structure dominates semantics (discourse/information structure > semantics). To give an example, one typically finds [CLF N] in contrastive topics. In fact, each of the 84 instances of contrastive topics in the corpus occurs with a classifier (Bisang and Quang 2020: 34). In the following example, the actions of the two animate nouns chồng ‘husband’ and vợ ‘wife’ with their features of [+ relational] and [+ unique] are contrasted by the conjunction nhưng ‘but’. In such contrastive situations, often marked by nhưng ‘but’ or còn ‘while, whereas’, the agent nouns in the subject position generally take the classifier:

(14) Vietnamese (Bisang and Quang 2020: 35; Written text 26, sentence 36)
Thấy thái độ của vợ-mình, ông chồng điên-máu-lên và
see attitude poss wife-refl clf husband get.crazy and
bắt ép ăn, nhưng bà vợ vẫn không ăn.
force eat but clf wife still neg eat
‘Seeing the behaviour of his wife, the husband got crazy and [tried to] force her to eat, but [his] wife still did not eat.’

Other contexts for the classifier in its definite interpretation are contrastive focus and focus particles like chỉ còn ‘only’ and mỗi ‘only’. Typical contexts of indefinite interpretation are thetic statements, verbs of existence (có ‘there is’) or predications which denote the appearance of an object (e.g. lòi ra ‘come out to light, appear’) (for more, cf. Bisang and Quang 2020; Quang forthcoming).

The motivation for the (in)definite interpretation of the classifier is related to the extent to which a nominal concept is assumed to be activated in the hearer’s mind by the speaker (Lambrecht 1994). Thus, we find the [CLF N] construction in anaphoric contexts and in the context of information structure. As Li and Bisang (2012) argue, the definiteness interpretation of the classifier in preverbal [CLF N] constructions of many Sinitic languages is due to the fact that topics are preverbal and must be activated. In turn, postverbal [CLF N] constructions can have both functions in many Sinitic languages because that position goes with focus, which can be either inactivated or activated, depending on context. In our Vietnamese examples, the nouns in the contrastive focus position are all activated. The definiteness of the nouns following the two focus particles with the meaning of ‘only’ follows from the inclusion or exclusion of activated alternatives expressed by these markers. Analogously, the indefiniteness interpretation as it is found in thetic statements and the predications mentioned above is motivated by the lack of activation associated with the nouns occurring in these constructions.

Taking these observations together, the definiteness or indefiniteness interpretation of classifiers in [CLF N] complies with the assumption that grammaticalization starts out from pragmatic inference within certain bridging contexts (cf. the Meaning First Hypothesis in 32.2.2). In the case of (in)definiteness, these contexts are generally provided either by discourse or information structure.

As pointed out above, the relevance of pragmatic inference does not stop at non-obligatoriness and multifunctionality. It is also reflected in the meaning of the definiteness expressed by the classifier. The observation by Li and Bisang (2012) that classifiers mark pragmatic rather than semantic definiteness in Cantonese and the Wu language of Fuyang also holds for Vietnamese. The distinction between the two types of definiteness goes back to Löbner (1985)11. In languages like English, definite articles are obligatory with unique nouns (cf. *Sky is blue vs. The sky is blue). Since it is the semantics of these nouns which determines the use of the definite article, this type of definiteness is called semantic definiteness. In contrast, pragmatic definiteness is defined in terms of the identifiability or the familiarity of a concept in a given context. In Vietnamese as well as in the Sinitic languages described by Li and Bisang (2012), unique nouns like sky do not necessarily take a classifier:

(15) Vietnamese (Bisang and Quang 2020: 33)
a. Trời đầy sao.
   sky be.full star
   ‘The sky is full of stars.’ / ‘There are many stars in the sky.’
b. Bầu trời đêm nay đầy sao.
   clf sky tonight be.full star
   ‘The sky is full of stars tonight.’

The sentence in (15a) is a general statement about the fact that one can see many stars in the sky. Using a classifier with unique nouns in such a context is odd. Classifiers can only be used in contexts in which the speaker refers to a given unique concept as it is relevant and identifiable in a specific situation. Thus, (15b) is grammatical in a context in which the speaker talks about the sky as it currently matters and as it can be identified by the speaker and the hearer in a shared temporal (đêm nay ‘tonight’) or spatial (the place where the speech act participants are) environment. To summarize, classifiers used in the context of (in)definiteness excellently illustrate the relevance of pragmatic inference. They are not obligatory, they are multifunctional and they mark definiteness in terms of identifiability.

11 A similar distinction has been introduced more recently by Schwarz (2009, 2013). Using his terminology, Vietnamese classifiers express anaphoric or “strong” definiteness rather than unique or “weak” definiteness.




32.5 Serial verb constructions and grammaticalization

32.5.1 On the definition of serial verb constructions

Serial verb constructions (SVCs) are found in various parts of the world and in many families, prominently in West Africa, Oceanic, various families/areas in South America and East and Mainland Southeast Asia. Their most general and all-encompassing definition from Durie (1997) treats SVCs as sequences “of two or more verbs which in various (rather strong) senses, together act like a single verb” (Durie 1997: 289–290). Based on general definitions like these (for a similar one, cf. Crowley 2002: 10), a number of individual criteria for distinguishing SVCs from non-SVCs are discussed. Aikhenvald (2006: 4–21) presents the following formal and semantic criteria in her cross-linguistic analysis:

– Single predicate: the SVC “occupies one functional slot in a clause” (Aikhenvald 2006: 4);
– One event: the verbs of an SVC refer to a single event;
– Monoclausality: the verbs of an SVC form a single clause, i.e. (i) they do not allow markers of syntactic dependency between them and (ii) they are not involved in coordination, complementation, adverbial subordination and sequentialization (the temporal sequence of events);
– Intonational properties: SVCs have the intonational properties of a monoverbal clause;
– Shared grammatical categories: e.g. TAM, negation;
– Shared arguments: prototypically, SVCs share at least one argument, e.g. subjects or the second argument of the first verb with the first argument of the second verb.

To start with an example from a language outside of Southeast Asia, (16) from Ewe (Niger-Congo: Kwa: Left Bank: Gbe) illustrates the SVC properties of argument and tense sharing. Both verbs share the same subject (me ‘I’) and the future marker a can occur only once in front of the first verb (cf. [16a] vs. [16b]). Moreover, there is no marker of syntactic dependency between the two verbs (monoclausality). In addition, one may argue that the two events of hitting and breaking together form one single event.12

(16) Ewe (Collins 1997: 463)
a. Me fo kaɖɛɡbɛ ɡba
   1s hit lamp break
   ‘I hit the lamp and broke it.’
b. Me a fo kaɖɛɡbɛ (*a) ɡba.
   1s fut hit lamp fut break
   ‘I will hit the lamp and break it.’

12 On the problem or even the impossibility of finding cross-linguistically reliable criteria for “single event”, cf. Bisang (2009a: 804–805). The criterion of the Macro-Event Property (MEP) as discussed by Bohnemeyer et al. (2007) is too specific for dealing with culture-related criteria based on component verbs which are perceived as forming a coherent overall event in a speech community. This issue is further addressed in the discussion of examples (17) and (18) below.

32.5.2 Concatenation of verbs and the existence of serial verb constructions in SEA languages

SEA languages are well-known for complex multiverb constructions13 as in the following two examples (also cf. [3] on Khmer):14

(17) Thai (Diller 2006: 160, 173)
ʔɔ̀ɔk paj sɯ́ɯ klàp maa.
exit go buy return come
‘[S/he] left for buying [something] and came back.’

(18) Khmer (Bisang 1992: 7, 435)
tɤ̀ːp stùh tɤ̀u deɲ cap jɔ̀ːk mɔ̀ːk ʔaop.
then jump.up go pursue catch take come embrace
‘Then [she] jumped up, caught [the duckling] and took it into her arms.’

A closer look at these examples shows that some of the verbs are grammaticalized into specific functions. In (17) from Thai, paj ‘go’ and maa ‘come’ indicate the movement of the actor (or sometimes the theme, if other verbal arguments are involved) relative to a given deictic centre. Grammaticalized verbs indicating directionality are called “directional verbs” (VD) (Li and Thompson 1981). Thus, ʔɔ̀ɔk paj [exit go] indicates that the movement out of something is directed away from a given deictic centre, while klàp maa [return come] means that the event of returning is oriented towards it. The verb sɯ́ɯ ‘buy’ can be analysed either as being in the relation of a temporal sequence or a purpose to the event of ʔɔ̀ɔk paj [exit go] in the sense of ‘go out and buy [something] / go out to buy [something]’. On the whole, the complex multiverb construction in (17) consists of three components. In a similar way, the seven verbs in example (18) from Khmer can also be divided into three components. The first consists of the verb stùh ‘jump up’ plus the directional verb tɤ̀u ‘go’. The second is a resultative construction (for a definition, cf. [33] plus discussion). Finally, the last three verbs form a construction in which the main verb ʔaop ‘embrace, hug’ is preceded by the construction [jɔ̀ːk ‘take’-N-‘directional verb’]. This type of construction is employed in several SEA languages to introduce a concept expressed by N into discourse.15 In the context of (18), the zero element in the N position is a duckling which was supposed to be lost and is now happily welcomed back by the lady protagonist of the story.

Even though the two examples in (17) and (18) are characterized by the unmarked juxtaposition of a sequence of verbs and share the same subject, it is by no means straightforward to describe them in terms of “one event”. As Diller (2006: 160) points out, (17) can be seen as “a cohesive action sequence”, whose cohesion is often iconically motivated by cultural factors in which frequently occurring sequences of events are juxtaposed. But this does not make (17) an instance of “one event”, particularly if the definition is ultimately based on culture-specific criteria rather than cross-linguistically applicable criteria (cf. footnote 12). In other examples like (18) from Khmer, it is hard to see how the actions of jumping up, catching and hugging can be seen as frequently co-occurring events. Here, the only cohesion that motivates their unmarked juxtaposition lies in the immediacy with which one event follows the next one (temporal sequentiality). Another semantic relation between two juxtaposed events is purpose if the first verb denotes directed movement as in the following example from Khmer (also cf. sɯ́ɯ ‘buy’ in [17] from Thai):

(19) Khmer (Bisang 1992: 398)
kɔ̀ət tɤ̀u phtɛ̀əh mɯ̀t-sɔmlaɲ mɤ̀ːl tɪ̀ːvɪ̀ː.
3s go house friend watch TV
‘He goes to the house of a friend to watch TV.’

13 For a survey on serial verb constructions in SEA languages, cf. Clark (1978, 1992) and Bisang (1992, 1996). There are also many informative descriptions on individual languages, among them: Jarkey (2015) on Hmong; Paillard (2010, 2013) on Khmer; Needleman (1973), Thepkanchana (1986), and Diller (2016) on Thai.

14 Notice that Thai and Khmer are radical pro-drop languages (cf. 32.2.3). Since all relevant nominal concepts are assumed to be activated in discourse, what we see in (17) and (18) is just a string of verbs.

Event sequences with no overt relation marking are one of the aspects often discussed in the context of SEA serial verb constructions. Another aspect is concerned with those instances in which one or more verbs are clearly grammaticalized as in (20) from Thai. In (20a), the verb jùu ‘to be at’ is used as an aspect marker expressing continuity through time or along time without reference to boundaries (Kullavanijaya and Bisang 2003). Example (20b) illustrates the use of directional verbs (also cf. stùh tɤ̀u [jump.up go] in [18] on Khmer). Finally, the verb hâj ‘give’ in (20c) functions like a preposition marking the recipient (or the benefactive in other cases). Verbs in prepositional function are often called coverbs (COV) in the literature.

15 The same construction is also found in (3). In the sequence of jɔ̀ːk ø tɤ̀u sraoc [take ø go pour], the first two verbs indicate how the subject (i.e. the husband) took the bucket and poured it [over the lover in the pitcher].


(20) Thai
a. A verb in the function of a tense-aspect marker (Boonyapatipark 1983: 100)
   kháw khuj jùu.
   3s chat be.in
   ‘He is chatting.’
b. Verbs in the function of a directional marker
   sudaa wîŋ khɯ̂n paj.
   Sudaa run move.up go
   ‘Sudaa runs up and away from the speaker.’
c. A verb in the function of a preposition
   kháw sòŋ còtmǎaj hâj phɯ̂ən.
   3s send letter give friend
   ‘He sends a letter to [his] friend.’

In Bisang (1992, 1996), I introduced the term “verb serialization in a narrow sense” for those constructions which partly consist of grammaticalized verbs as in (20), while I used the term “verb serialization in a broad sense” for constructions with no grammaticalized verbs. As can be seen from (17) and (18), the two types of verb serialization can be combined in one and the same utterance. In (17) from Thai, the first and the third components are serial verb constructions in a narrow sense. The first component and the verb sɯ́ɯ ‘buy’ form a serial verb construction in a broad sense, which in combination with the third component develops into an even larger serial verb construction in the broad sense. In the Khmer example in (18), all three components of the overall serial verb construction in a broad sense are serial verb constructions in a narrow sense.

A closer look at serial verb constructions in a broad sense and in a narrow sense shows that neither of them necessarily qualifies as an instance of a serial verb construction. In the case of verb serialization in a narrow sense, the grammaticalized verb no longer has the full set of verbal properties. Thus, Thai verbs like jùu ‘be at’, paj ‘go’ or hâj ‘give’ no longer have their full verb meaning in their grammaticalized status and they cannot take on various grammatical markers independently (e.g. TAM markers, negation, depending on the specific verb and its degree of grammaticalization). In addition, it is often anything but clear to what extent they have still preserved their own argument structure. For such reasons, it is hard to describe the examples given in (20) as clear instances of serial verb constructions, since they basically consist of a single verb plus a grammatical marker whose form is identical to that in its main verb function, sometimes with some reductions in tonal or segmental substance.

In the case of serial verb constructions in a broad sense, the problems are of a different nature. Here, it is hard to see in what way the sequences of verbs still have the properties of “single predicate” and “one event”. Even the absence of overt syntactic dependency markers in surface strings of verbs is by no means a reliable indicator of monoclausality because the unmarked juxtaposition of verbs can be fully compatible with coordination, complementation and other non-serialising structures (cf. Paul [2008] on this topic in Chinese16).

Given this situation, it does not come as a surprise that assumptions on the existence of serial verb constructions in SEA languages radically diverge. To give an example, Wilawan (1993) denies their existence in Thai and Khmer, arguing that there are finite/non-finite distinctions in the verbs of these languages which need to be considered, while Iwasaki (1989) takes any type of verb concatenation as an instance of a serial verb construction.

32.5.3 Concatenation of verbs and products of grammaticalization

For the purpose of this chapter on grammaticalization, the question of whether and to what extent there are serial verb constructions in SEA languages is of secondary importance. In fact, this has been the topic of numerous studies as discussed above. What is crucial is the observation that the grammars of SEA languages generally allow the unmarked concatenation of verbs to more extensive multiverb constructions with different internal structures, which may, under specific context situations with particular verbs, serve as bridges for the development of new grammatical markers and new constructions. Thus, the rich options of forming strings of verbs in combination with pragmatic inference are an important driving force of grammaticalization processes. For illustrating this, let’s look again at (2) on the Khmer verb ʔaoj ‘give’, repeated here as (21):

(21) Khmer: Different analyses of a single string of verbs (also cf. Bisang 2015a: 139)
ʔoːpùk sɔŋ phtɛ̀əh ʔaoj koːn nɤ̀u.
father build house give child live/stay
a. Coverb: ‘Father builds a house for his children and they live [in it].’
b. Causative verb: ‘Father builds a house for making his children live [in it].’
c. Adverbial subordinator: ‘Father builds a house with the purpose that his children live [in it].’

16 Paul (2008) shows that the four types of verb serialization in Chinese presented by Li and Thompson (1981) have rather divergent syntactic properties: (i) the juxtaposition of two or more separate events (Li and Thompson 1981: 595), (ii) “one verb phrase or clause is the subject or direct object of another verb” (Li and Thompson 1981: 598), (iii) the pivot construction in which the object of the first verb is the subject of the second verb (Li and Thompson 1981: 607) and (iv) the descriptive clause construction in which the object of a transitive first verb is described by a following clause (Li and Thompson 1981: 611) as in Tā chǎo-le yī gè cài tèbié hǎo chī [he cook-PFV one CLF dish particularly good eat] ‘He has prepared a dish which is particularly delicious [lit.: “good to eat”]’. None of these types meet the conditions of monoclausality.


In the interpretation of (21a), the ‘give’-verb is analysed as a preposition/coverb (cf. [2] above). In the next interpretation (21b), the ‘give’-verb is interpreted as a causative verb marking koːn ‘child’ as a causee followed by the verb nɤ̀u ‘live/stay’. Finally, the ‘give’-verb can be analysed as a conjunction expressing a purpose relation between the main clause formed by sɔŋ ‘build’ and the subordinate clause with nɤ̀u ‘live/stay’ (21c).

To start the presentation of different products of grammaticalization emerging from verb concatenation, let’s look again at the grammaticalized verbs in example (20) in their functions of a TAM marker (20a), a directional verb (20b) and a preposition/coverb (20c). For each of these functions, we find ample evidence in many SEA languages and, what is even more remarkable, they follow fixed word order rules with some differences across different SEA languages. For that reason, they are treated as a coherent linguistic unit, which is called “serial unit” as a subset of serial verb constructions in a narrow sense in Bisang (1996). Table 3 presents some data on the order of the individual types of markers relative to the ungrammaticalized main verb (V):

Tab. 3: Word order within the serial unit (for a similar list, cf. Bisang 2006: 591).

Language      Type of markers in their order relative to the main verb
Hmong         TAM   (COV)   V-TAM   VD   COV   TAM
Vietnamese    TAM   (COV)   V       VD   COV   TAM
Thai          TAM           V       VD   COV   TAM
Khmer         TAM           V       VD   COV   TAM

In SEA languages, TAM markers generally occur either at the left or the right periphery of the serial unit. In Hmong, there is one TAM marker, i.e. tau ‘come to have’, which immediately follows the main verb in its default order.17 In many languages, the ‘come to have’-verb is a good example of a TAM marker which can occur at the very beginning and at the very end of the serial unit (e.g. Khmer, Thai and Vietnamese). The following list presents a few preverbal and postverbal TAM markers:

(22) List of some TAM markers
Hmong: Preverbal: yuav ‘want / future’; immediately after V: tau ‘come to have’; final: tas ‘finish’ (cf. example [4]).
Vietnamese: Preverbal: mới ‘new / immediate future’, thường ‘frequent, usual, average / habitual’; postverbal: rồi ‘finish / completive’, xong ‘finish / completive’.
Thai: Preverbal: khɤɤj kàp ‘be used to/be familiar with’ / khɤɤj ‘experiential aspect’, tɔ̂ŋ ‘touch’ / ‘must’; postverbal: sèt ‘finish [for the moment] / telic’, còp ‘finish [for good] / telic’, jùu ‘be at / continuous’ (cf. example [20a]).
Khmer: Preverbal: nɤ̀u ‘be at, live / continuous’, thlɔ̀əp ‘be used to / habitual’; postverbal: rùːəc ‘finish, escape, run away / telic’, srec ‘finish [for the moment] / telic’, cɔ̀p ‘finish [for good] / telic’.

17 Notice that the default position for tau ‘come to have’ is immediately after the verb. In marked cases, it can also occur after an object (V O tau) (Enfield 2003: 194–195). In Sinitic languages, TAM markers occur more prominently in the position right after the main verb. Good examples are the Mandarin Chinese TAM suffixes -le ‘perfective’, -guo ‘experiential’ and -zhe ‘durative’.

Prepositions/coverbs typically occur after the verb between the directional verb and the TAM marker. Some coverbs in Hmong (e.g. xuas ‘use / instrumental’, nrog ‘to accompany / with [comitative]’) and in Vietnamese (e.g. dùng ‘use / instrumental’) take the preverbal position, most likely due to influence from Sinitic, in which the preverbal position is the default position. Two coverbs from each language are given in the following list:

(23) List of coverbs
Hmong: rau ‘put, place / dative, benefactive, towards’, txog ‘arrive / to target, until’.
Vietnamese: cho ‘give / dative, benefactive’, đến ‘be at / locative’.
Thai: hâj ‘give / dative, benefactive’, thɯ̆ŋ ‘arrive / to GOAL, about’.
Khmer: ʔaoj ‘give / dative, benefactive’, dɔl ‘arrive / to GOAL’.

Directional verbs can be divided into deictic verbs and path verbs (Lamarre [2008] on Chinese). As indicated above, deictic verbs are oriented towards a deictic centre. Many SEA languages have two verbs of this type, among them Thai (paj ‘go’, maa ‘come’) and Khmer (tɤ̀u ‘go’ and mɔ̀ːk ‘come’).18 In Hmong, there are three deictic verbs, i.e. los ‘movement to a stable centre of reference’, tuaj ‘movement to a temporary centre’ and mus ‘movement away from a centre’. Path verbs generally denote upward/downward movement or movement into/out of something plus some other paths depending on the language. In the case of Thai and Khmer, there are the following four verbs for expressing path:

(24)

Path verbs in Thai and Khmer
‘ascend’: Thai: khɯ̂n; Khmer: laəŋ
‘descend’: Thai: loŋ; Khmer: coh
‘enter’: Thai: khâw; Khmer: coːl
‘exit’: Thai: ʔɔ̀ɔk; Khmer: ceɲ

18 Also cf. the terminology in Kuteva et al. (2019: 25, 33): ANDATIVE for ‘go’-verbs and VENITIVE for ‘come’-verbs.

Walter Bisang

While Vietnamese has only one slot for directional verbs, Thai has two and Khmer has three. In Thai, path verbs precede deictic verbs. In Khmer, the path verbs are split into the two subsets of ascend/descend and enter/exit, whereby the former precedes the latter. The following example from Khmer illustrates a sequence in which each of the three positions is filled by a verb from one of the three subsets:

(25)

Khmer
kɔ̀ət lòːt coh ceɲ mɔ̀ːk.
3s jump descend exit come
‘He jumps down out [of something] towards a given deictic centre.’
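The slot ordering described for Thai and Khmer directional verbs can be made concrete with a small sketch. The Python below is only an illustration under simplifying assumptions: verbs are identified by their English glosses rather than by transcription, each slot is assumed to hold at most one verb, and the names DIRECTIONAL_SLOTS, slot_index and is_well_ordered are hypothetical helpers introduced here, not part of any published analysis.

```python
# Directional-verb slots per language, in linear order (leftmost slot first).
# Verbs are keyed by English gloss; this is an illustrative simplification.
DIRECTIONAL_SLOTS = {
    "Thai": [
        {"ascend", "descend", "enter", "exit"},  # slot 1: path verbs
        {"go", "come"},                          # slot 2: deictic verbs
    ],
    "Khmer": [
        {"ascend", "descend"},                   # slot 1: vertical path
        {"enter", "exit"},                       # slot 2: in/out path
        {"go", "come"},                          # slot 3: deictic verbs
    ],
}


def slot_index(language, gloss):
    """Return the position of the slot that the glossed verb belongs to."""
    for index, slot in enumerate(DIRECTIONAL_SLOTS[language]):
        if gloss in slot:
            return index
    raise ValueError(f"{gloss!r} is not a directional verb in this sketch")


def is_well_ordered(language, glosses):
    """Check that the verbs fill strictly later slots from left to right.

    Assumption: each slot holds at most one verb, so two verbs from the
    same slot are rejected as well.
    """
    indices = [slot_index(language, gloss) for gloss in glosses]
    return all(left < right for left, right in zip(indices, indices[1:]))


# The directional verbs of example (25): 'descend', 'exit', 'come'.
print(is_well_ordered("Khmer", ["descend", "exit", "come"]))  # True
print(is_well_ordered("Khmer", ["exit", "descend", "come"]))  # False
print(is_well_ordered("Thai", ["descend", "come"]))           # True
```

Keying the slots by gloss keeps the sketch independent of any particular transcription; a fuller model would also have to state which slots may remain empty in actual usage.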

The following two examples illustrate the combination of different elements of the serial unit. In (26) from Hmong, the serial unit consists of a preposition/coverb and a directional verb. Combinations of two elements are quite frequent, in any possible pairing. The Khmer example in (27) with all three elements is a constructed example which is grammatical but pragmatically unusual, given the strong tendency to provide only the contextually necessary grammatical information:

(26)

Hmong (Bisang 1992: 70)
koj los [nrog kuv caij tus nees no mus].
2s come cov.with 1s ride clf horse dem vd.go
‘Come and ride with me on this horse [away from referential centre].’

(27)

Khmer
kɔ̀ət baːn jɔ̀ːk ʔɤjvan coh ceɲ mɔ̀ːk ʔaoj khɲom.
3s pst take luggage vd.go.down vd.go.out vd.come cov.give 1s
‘He took the luggage down and out for me.’ [The speaker was waiting in the hotel lobby for someone who took his luggage out of the room and brought it down to him.]

The presentation so far describes the overall systematicity found in serial units. A closer look at individual languages reveals space for language-specific features within the overall pattern. This is briefly illustrated by two examples. The first one is about the expression of benefactives in Thai as described by Jenny (2010). Thai has several markers for benefactive, among them the verb hâj ‘give’ as mentioned above and the marker phɯ̂ə, which is mainly used as a marker of benefactive or purpose with no corresponding lexical meaning in Thai. The crucial difference between the two markers depends on the beneficiary's awareness of the event. If the beneficiary is directly involved in the event as recipient or experiencer (direct beneficiary), hâj is used (28a). In contrast, phɯ̂ə is employed if the beneficiary is unaware of the event (indirect beneficiary) (28b). Thus, the child must be present and listening while the mother sings in (28a), whereas the child can also be absent in the case of (28b), e.g. if the mother sings in the streets to collect money for her child:



(28)




Thai (Jenny 2010: 389–390)
a. mɛ̂ɛ rɔ́ɔŋ phleeŋ hâj lûuk.
   mother sing song cov.give child
   ‘Mother is singing a song to her child.’
b. mɛ̂ɛ rɔ́ɔŋ phleeŋ phɯ̂ə lûuk.
   mother sing song for child
   ‘Mother is singing a song for her child.’

The second example is about Mon (Jenny 2014) and transitivity agreement in the selection of directional verbs from a set of basic intransitive forms and a set of derived transitive/causative forms. The transitive forms of the directional verbs must be selected with semantically transitive verbs, and either the patient argument has to be foregrounded or the predicate has to describe an induced movement. Thus, if the main verb is intransitive as in (29a), the intransitive verbs are used, while the transitive forms have to be selected with the transitive verb in (29b):

(29)



Mon (Jenny 2014: 63–64)
a. With intransitive main verb:
   kon.ŋàc kwac cao ʔa phɛ̀ə.
   child walk return go school
   ‘The child walked back to school.’
b. With transitive main verb:
   rɔ̀ə kok phjao na hɒəʔ.
   friend call caus.return caus.go home
   ‘The friend brought [her] back home.’

As the discussion of (21) showed, there are other types of grammaticalization emerging from strings of juxtaposed verbs. Good examples are causative verbs and conjunctions. This is illustrated by the ‘give’-verb and the ‘finish’-verb. In addition to their function as benefactive prepositions/coverbs, ‘give’-verbs (e.g. Vietnamese cho, Thai hâj or Khmer ʔaoj) are also used as causative verbs and conjunctions expressing purpose or manner. Each function is associated with different positions within different constructions. The functional range of Thai hâj ‘give’ is described in detail by various linguists, among them Song (1997), Rangkupan (2007), Yap and Iwasaki (2007) and Thepkanjana and Uehara (2008). The functions of Khmer ʔaoj ‘give’ are discussed in Bisang (1992, 2015a: 137–141). The following examples show ʔaoj ‘give’ in its functions of a causative verb (30), a conjunction expressing purpose (31) and manner (32):

(30)

Khmer: Causative verb (Bisang 1992: 440)
baə ʔaoj khɲom tɤ̀u tɛ̀ək.tɔ̀ːŋ, khɲom tɤ̀u.
if caus I go be.involved.with I go
‘If you let me get involved with [him], I’ll go [there].’


(31)

Khmer: Conjunction marking purpose (Jacob 1968: 141)
khɲom khɔm thvɤ̀ː-kaː ʔaoj ʔoːpùk khɲom sɔpbaːj-cɤt.
1s hard work purp father 1s be.comfortable-heart
‘I work hard to make my father happy/for my father to be happy.’

(32)

Khmer: Manner (Jacob 1968: 141)
cau rùət tɤ̀u saːlaːrɪ̀ən ʔaoj rəhah.
boy run go school mann quick
‘The little boy runs quickly to school.’

Example (31) is of particular interest because the interpretation of ʔaoj ‘give’ as a purpose marker is not the only possible analysis. It could also be understood as a causative marker, as reflected in the following translation: ‘I work hard and make father happy (by doing this)’. Cases like these are good indicators that the conjunctional function of ʔaoj ‘give’ is a further development from its causative function (Bisang 2015a).

The grammaticalization of ‘finish’-verbs starts from the resultative construction,19 which provides a basis for developing postverbal TAM markers and, at a later stage, conjunctions. The resultative construction consists of two verbs (V), of which the first (V1) carries the basic meaning of an event, while the second (V2) denotes the result of V1. Since the combination of V1 and V2 is not productive and the number of verbs is limited, resultative constructions side with lexicalization rather than with grammaticalization.20 The following list presents some examples from Thai (for a list of resultative constructions, cf. Noss 1964: 127–132; Bisang 1992: 338–339; also cf. the Khmer example in [18]: deɲ cap [pursue catch] ‘catch successfully’):21

(33)

Resultative constructions in Thai
mɔɔŋ ‘look’ – hěn ‘see’: mɔɔŋ hěn ‘be able to see’
jiŋ ‘shoot’ – thùuk ‘to hit (as a target)’: jiŋ thùuk ‘to shoot and hit the target’
lâj ‘chase’ – càp ‘catch’: lâj càp ‘to chase and catch sth.’ (cf. [18])
nɔɔn ‘lie, recline’ – làp ‘close the eyes/sleep’: nɔɔn làp ‘sleep, be asleep’
khít ‘think’ – tòk ‘fall’: khít tòk ‘to have solved (a thinking problem), have gotten over (a live problem)’

19 Notice that the resultative construction is the starting point for many processes of grammaticalization, among them the use of postverbal ‘come to have’-verbs as potential or abilitative modal markers.
20 In Mandarin Chinese, the number of V2 is much larger. Xu et al. (2008) list no fewer than 270 V2 and 476 V1 which can be constructed with at least one V2. On the diachronic development of the resultative construction in Chinese, cf. Xu (2006: 146–188).
21 Notice that the negation can only take the position between V1 and V2 and that the overall construction gets the meaning of inability: kháw nɔɔn mâj làp [he lie NEG sleep] ‘he cannot sleep’.




The grammaticalization of verbs with the meaning of ‘finish, complete’ to markers of completion starts out from resultative constructions. At a next stage, they extend the number of V1 with which they combine until they develop into clause-final aspect markers. Some of these clause-final aspect markers are reanalysed in turn as clause-initial conjunctions of the next clause with the meaning ‘then, and then, after that’. In (34a) from Vietnamese, rồi ‘finish’ is a completive marker. In (34b) it is a clause-initial conjunction:

(34)

Vietnamese22
a. Completive
   Bố tôi ăn sáng rồi, bây giờ đi làm.
   father 1s eat morning compl now go work
   ‘My father has finished breakfast and goes to work now.’
b. Conjunction
   Bố tôi ăn sáng xong, rồi sẽ đi làm.
   father 1s eat morning finish then fut go work
   ‘My father is eating breakfast, then/after that he will go to work.’

The perfective markers of Thai (lɛ́ɛw; cf. Kullavanijaya and Bisang 2003) and Khmer (haəj) have undergone a similar process of grammaticalization, even though their full verb status is questionable at best. Thai lɛ́ɛw is the borrowed form of the Chinese verb liǎo ‘finish’, which developed into the perfective marker -le. Khmer haəj is only a perfective marker. Its original verbal status can be reconstructed from the prefixed causative form bɔŋ-haəj ‘finish [transitive]’.

32.5.4 Theoretical assessment of the grammaticalization phenomena discussed in this section

Processes of reanalysis like the one presented in (21) show their effects in lexicalization and in grammaticalization, whose difference is described as follows by Himmelmann:22

The essential difference between grammaticization and lexicalization pertains to lexical generality. In lexicalization a specific string of items is conventionalized. In grammaticization the process of conventionalization applies to an expression pattern consisting of at least one fixed item (the grammaticizing element which becomes the increasingly general construction marker) and a growing class of items which enter into this construction. (Himmelmann 2004: 38)

22 Notice that Himmelmann (2004) uses the term “grammaticization” instead of “grammaticalization”. Both terms mean the same.


Seen from the perspective of an individual verb and the potential host verbs with which it can be combined, lexicalization leads to the reduction of host verbs. At the end, an individual verb cooccurs with one single host verb to produce a specific non-compositional lexical meaning. In contrast, grammaticalization leads to the reduction of lexically motivated cooccurrence restrictions, i.e. it extends the number of potential hosts. The more grammaticalized an individual verb is, the more semantically general it becomes. As a consequence, potential lexical incompatibilities with other verbs get reduced up to the extent that a given grammaticalized verb can eventually cooccur with all verbs of a language (cf. Bybee [1985] on relevance vs. generality). Thus, the two scenarios of grammaticalization and lexicalization can be described by the following bifurcation:

Productive serialization of verbs
→ Grammaticalization: The grammaticalized V can be combined with an increasing number of other V.
→ Lexicalization: The sets of verbs for expressing a specific meaning tend to maximal restriction.

Fig. 1: From serialization to grammaticalization or lexicalization.
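The bifurcation in Figure 1 can be paraphrased as a toy classification. The following Python sketch assumes, purely for illustration, that one has counts of the distinct host verbs with which a given verb is attested at successive stages; the function name classify_trajectory and the counts in the usage lines are hypothetical and do not come from any corpus.

```python
def classify_trajectory(host_counts):
    """Label a verb's development from its number of host verbs over time.

    host_counts: chronologically ordered counts of distinct host verbs
    (hypothetical data). Growth of the host class is read as
    grammaticalization, shrinkage as lexicalization.
    """
    if len(host_counts) < 2:
        raise ValueError("need at least two stages to compare")
    first, last = host_counts[0], host_counts[-1]
    if last > first:
        return "grammaticalization"
    if last < first:
        return "lexicalization"
    return "indeterminate"


print(classify_trajectory([3, 10, 40]))  # grammaticalization
print(classify_trajectory([12, 4, 1]))   # lexicalization
```

Comparing only the first and last stage is a deliberate simplification; it captures the direction of change in the host class, which is the criterion the figure relies on, and nothing more.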

This scenario comes close to Aikhenvald’s (2006: 21–37) distinction between symmetrical and asymmetrical serial verb constructions, in which the former is characterized by the absence of any restrictions on the set of possible component verbs (open class set), while the latter is characterized by at least one verbal position which only accepts elements from a restricted or closed set of verbs. Based on this distinction, Aikhenvald (2006) further argues that “asymmetrical serial verb constructions tend to undergo grammaticalization”, while “symmetrical serial verb constructions tend to become lexicalized and develop idiomatic meanings” (Aikhenvald 2006: 30). The problem with Aikhenvald’s (2006) scenario from the perspective of grammaticalization is that it projects the final results into the initial stage that precedes grammaticalization (Bisang 2009a: 807–810). Once a verb is grammaticalized into an element of a serial unit, it clearly belongs to a closed set, but the string of verbs out of which it developed was symmetrical.23 Thus, grammaticalization and lexicalization start out from the same basis of symmetrical strings of verbs as illustrated in Figure 1. In the case of grammaticalization, the grammaticalized verb becomes a member of a restricted set of markers (e.g. a TAM marker, COV or VD), while all members undergo restrictions in the case of lexicalization.

23 There is clear evidence of that from historical texts in Chinese. Many components of the serial unit simply did not exist some 2,500 years earlier in Classical Chinese (cf. Bisang 2017 for a summary).




Once a verb has become a fully grammaticalized marker, the next question to ask is whether it is still a verb. As was pointed out in 32.5.2, this is questionable. Taking up a very important observation from Paillard (2013: 12), one has to look at the meaning of a verb in different contexts. As a consequence, a verb loses some of its meaning as a full verb if it occurs in the syntactic position of a TAM marker, a directional verb or a coverb, with corresponding additional effects on its properties in terms of argument structure and the ability to combine with its own TAM markers. Moreover, the relative order of the different types of markers within the serial unit is semantically motivated in terms of relevance and generality (Bybee 1985). Since directional verbs and coverbs/adpositions are lexically more relevant to the main verb, the set of full verbs they can combine with is more restricted and they are iconically closer to the verb than the semantically more general (less relevant) TAM markers. Thus, the more semantically general a marker is, the further it occurs from the verb, towards the periphery of the serial unit. Given the two observations that verbs in SEA languages have different meanings in different syntactic environments and that certain iconically ordered positions are associated with specific grammatical categories, the positions of TAM, COV and VD are called “attractor positions” in Bisang (1992, 1996). They attract lexical items to be grammaticalized by analogy into the function they are associated with, and they simultaneously enhance the syntactic reanalysis of a given string of verbs. The serial unit with its attractor positions shows all three properties of grammaticalization specific to SEA languages (cf. 32.2.2). Pragmatically, it provides a syntactic pattern for grammaticalizing new verbs by integrating them into a closed set of markers in situations of bridging.
The verbs acquire their function-related meaning depending on the syntactic environment or position in which they occur, but they do not lose their lexical meaning outside of that environment (cf. above, Paillard 2013). This fits with retention of earlier meaning and the development of pathways of the type A > A,B > A,B,C, etc. Finally, the existence of attractor positions minimizes gradualness by enhancing the reanalysis of a verb and its assignment to a set of markers of the same category by analogy in a relatively short period of time. This can be seen in languages with rich historical data as in the case of using different ‘give’-verbs for expressing datives/benefactives across time in Chinese.24 Other evidence comes from different verbs used for expressing similar functions in comparable syntactic positions across different varieties of a language (cf. Bisang 1996 on areality).

24 Peyraube’s (1988) excellent work on the diachrony of benefactive markers mentions the following ‘give’-verbs used for that purpose: 與 yǔ ‘give’ in classical Chinese (5th to 3rd centuries BC) and later texts emulating that language, the verb 饋 kuì with its original meaning of ‘to make a present of food, to offer in sacrifice’ between 1250 and 1400 AD and finally the verb 給 gěi ‘give’ from the Qing dynasty (1644–1911) up to now.


32.6 Conclusion and the wider perspective

This chapter has illustrated the high importance of pragmatics in grammaticalization processes of SEA languages with a range of phenomena from different domains of grammar, i.e. the grammaticalization of pronominals and their properties in terms of person (32.3), the association of the classifier in the bare classifier construction [CLF N] with definiteness (identifiability) and indefiniteness (32.4) and the development of various grammatical markers derived from the application of reanalysis and analogy on strings of juxtaposed verbs (32.5). In each case, the products of grammaticalization retained their previous functions. Kinship terms are still widely used in exactly that nominal function, but they are also employed in pronominal function. The classifiers in [CLF N] still operate as numeral classifiers for individuating or atomizing nominal concepts in the context of counting, and the grammatical markers emerging from verbal strings have preserved their function as full verbs, at least in many cases. Thus, the ‘give’-verbs are used as dative/benefactive markers, causative markers and conjunctions (e.g. in Khmer), while ‘finish’-verbs develop from their resultative function into tense-aspect markers and, at a later stage, into clause-initial conjunctions (e.g. in Vietnamese). Finally, the association of certain grammatical functions with certain syntactic positions relative to each other (TAM, COV, VD) led to a semantically motivated pattern of attractor positions with their potential to reduce gradualness. The assessment of the role of pragmatic inference in its interaction with the form side of morphosyntax as outlined above can be seen in the broader context of successful communication, which crucially depends on providing the right amount of information in a given speech situation.
Taking up Zipf’s (1949) ecological account, human speech behaviour is situated between the speaker’s economy, which aims at minimal linguistic articulation, and the hearer’s economy, which prefers maximal explicitness (also cf. Martinet 1962; Horn 1984; Levinson’s [2000] articulatory bottleneck, and other variations on the theme). In linguistic typology, these two perspectives are seen as two competing motivations with their effects on the grammars of individual languages (Haiman 2011b). One motivation is explicitness (“iconicity” in Haiman 2011b), the other one is economy. The joint competitive interaction of these two motivations is reflected in two different types of complexity, depending on which of them is stronger (Bisang 2009, 2014, 2015b). If explicitness wins, the grammar of a language forces the speaker to encode a certain grammatical category by a marker from its categorial inventory in a given situation (e.g. tense or evidentiality in a finite clause, number in count nouns). This type of complexity is called “overt complexity” (Bisang 2009b, 2015b). It is this type which is frequently used in typology for cross-linguistic comparison of complexity (McWhorter 2001; Miestamo 2008; for a survey of different approaches to complexity, cf. Sinnemäki 2011). The economy-side of complexity favours a grammar which does not force the speaker to encode a grammatical category by one of its markers if the relevant information can be pragmatically inferred by the hearer. This type of complexity is called “hidden complexity”. Moreover, grammatical markers express a single value of a grammatical category for the sake of clarity in overt complexity (e.g. past, direct evidence or plural), while they are allowed to have more than one grammatical function in hidden complexity in the sense that one and the same marker can be associated with values from more than one grammatical domain (e.g. adposition/case marker, causative marker, conjunction) or more than one subcategory of a single grammatical domain (e.g. definite and indefinite). Here again, the relevant information must be inferred from context.

From the perspective of how much grammatical information is allowed to be left to pragmatic inference by the grammar of a language, the two competing motivations with their effects are independent of grammaticalization. Good examples of independence are relative clause structures with the option of multiple coreference interpretations as in (5) or the unmarked juxtaposition of verbs which might be in different syntactic relations to each other (e.g. coordination, temporal sequence, adverbial subordination, complementation; cf. 32.5.2; Paul 2008).25 In spite of this, pragmatic inference also shows its strong impact on the way in which grammaticalization is realized in SEA languages. In fact, the properties of lack of obligatoriness and multifunctionality (32.2.3) exactly correspond to hidden complexity as it is motivated by economy. If these properties are recurrent in the grammar of a language across a large number of different grammatical domains, one can argue that this language is characterized by a comparatively high degree of hidden complexity in contrast to languages whose grammatical markers tend to be obligatory and/or monofunctional across many grammatical domains.
The general dominance of pragmatic inference, even in the case of markers representing categories which are associated with high degrees of grammaticalization from a cross-linguistic perspective, does not prevent grammaticalization from developing in the direction of overt complexity. The general observation is only that this does not happen very frequently. A good example is verb agreement marking in Semelai (Austroasiatic: Aslian). Even though the language clearly has verb agreement (Kruspe 2004: 171), its markers must be the result of a rather recent development, given their status as proclitics and their formal similarities to the pronominal system. Moreover, they still show some reminiscences of discourse, since they are only used with individuated transitive events, i.e. they are ungrammatical with generic argument nouns. It is for examples like these that I tentatively suggested the notion of a SEA type of incipient morphology (Bisang 2020a: 61).

25 Notice that hidden complexity is a matter of structural options provided by the grammar. It is not primarily based on criteria of frequency. Even if structures like (5) may be relatively rare, the mere fact that the grammar allows them and that other languages are forced to give more information is enough.


Explanations for why pragmatic inference is comparatively dominant across several linguistic families of SEA, in grammar in general and in grammaticalization in particular, need a lot more work that integrates language-internal factors as well as multiple situations of linguistic contact (Bisang 2011; Ansaldo et al. 2018).

References

Aikhenvald, Alexandra Y. 2006. Serial verb constructions in typological perspective. In Alexandra Y. Aikhenvald & R. M. W. Dixon (eds.), Serial verb constructions. A cross-linguistic typology, 1–68. Oxford: Oxford University Press.
Ansaldo, Umberto, Walter Bisang & Pui Yiu Szeto. 2018. Grammaticalization in isolating languages and the notion of complexity. In Heiko Narrog & Bernd Heine (eds.), Grammaticalization from a typological perspective, 219–234. Oxford: Oxford University Press.
Bisang, Walter. 1992. Das Verb im Chinesischen, Hmong, Vietnamesischen, Thai und Khmer. Vergleichende Grammatik im Rahmen der Verbserialisierung, der Grammatikalisierung und der Attraktorpositionen. Tübingen: Narr.
Bisang, Walter. 1993. Classifiers, quantifiers and class nouns in Hmong. Studies in Language 17(1). 1–51.
Bisang, Walter. 1996. Areal typology and grammaticalization: Processes of grammaticalization based on nouns and verbs in East and Mainland South East Asian languages. Studies in Language 20(3). 519–597.
Bisang, Walter. 1999. Classifiers in East and Southeast Asian languages: Counting and beyond. In Jadranka Gvozdanovic (ed.), Numeral types and changes worldwide, 113–185. Berlin: Mouton de Gruyter.
Bisang, Walter. 2004. Grammaticalization without coevolution of form and meaning: The case of tense-aspect-modality in East and Mainland Southeast Asia. In Walter Bisang, Nikolaus P. Himmelmann & Björn Wiemer (eds.), What makes grammaticalization? – A look from its fringes and its components, 109–138. Berlin: Mouton de Gruyter.
Bisang, Walter. 2006. South East Asia as a linguistic area. In Keith Brown (ed.), Encyclopedia of language and linguistics, vol. 11, 587–595. Oxford: Elsevier.
Bisang, Walter. 2008. Grammaticalization and the areal factor: The perspective of East and Mainland Southeast Asian languages. In María José López-Couso & Elena Seoane (eds.), Rethinking grammaticalization. New perspectives, 15–35. Amsterdam & Philadelphia: John Benjamins.
Bisang, Walter. 2009a. Serial verb constructions. Language and Linguistics Compass 3. 792–814. DOI: 10.1111/j.1749-818X.2009.00128.x.
Bisang, Walter. 2009b. On the evolution of complexity – Sometimes less is more in East and Mainland Southeast Asia. In Geoffrey Sampson, David Gil & Peter Trudgill (eds.), Language complexity as an evolving variable, 34–49. Oxford: Oxford University Press.
Bisang, Walter. 2011. Grammaticalization and typology. In Heiko Narrog & Bernd Heine (eds.), Handbook of grammaticalization, 105–117. Oxford: Oxford University Press.
Bisang, Walter. 2014. Overt and hidden complexity – Two types of complexity and their implications. Poznan Studies in Contemporary Linguistics 50(2). 127–143.
Bisang, Walter. 2015a. Problems with primary vs. secondary grammaticalization: The case of East and Mainland Southeast Asian languages. Language Sciences 47. 132–147.
Bisang, Walter. 2015b. Hidden complexity – The neglected side of complexity and its consequences. Linguistics Vanguard 1(1). 177–187.




Bisang, Walter. 2017. Grammaticalization. Oxford Research Encyclopedia, Linguistics. Online publication. DOI: 10.1093/acrefore/9780199384655.013.103.
Bisang, Walter. 2020a. Radical analyticity and radical pro-drop scenarios of diachronic change in East and Mainland Southeast Asia, West Africa and Pidgin and Creoles. Journal of Asian Languages and Linguistics 1. 34–70.
Bisang, Walter. 2020b. Grammaticalization in Chinese – A cross-linguistic perspective. In Janet Xing (ed.), A typological approach to grammaticalization and lexicalization: East meets west, 17–54. Berlin: Mouton de Gruyter.
Bisang, Walter & Andrej Malchukov. 2020. Grammaticalization scenarios: Cross-linguistic variation and universal tendencies. Vol. 1, Grammaticalization scenarios from Europe and Asia; Vol. 2, Grammaticalization scenarios from Africa, the Americas and the Pacific (Comparative Handbooks of Linguistics). Berlin: Mouton de Gruyter.
Bisang, Walter & Kim Ngoc Quang. 2020. (In)definiteness and classifiers in Vietnamese. In Katalin Balogh, Anja Latrouite & Robert R. Van Valin (eds.), Nominal anchoring. Specificity, definiteness and article systems across languages, 15–49. Berlin: Language Science Press.
Bisang, Walter & Yicheng Wu. 2017. Numeral classifiers in East Asia. Linguistics 55. 257–264.
Bohnemeyer, Jürgen, Nicholas J. Enfield, James Essegbey, Iraide Ibarretxe-Antuñano, Sotaro Kita, Friederike Lüpke & Felix K. Ameka. 2007. Principles of event segmentation in language: The case of motion events. Language 83. 495–532.
Boonyapatipark, Tasanalai. 1983. A study of aspect in Thai. London: University of London PhD dissertation.
Bybee, Joan L. 1985. Morphology. A study of the relation between meaning and form. Amsterdam & Philadelphia: John Benjamins.
Bybee, Joan L., Revere Perkins & William Pagliuca. 1994. The evolution of grammar. Tense, aspect and modality in the languages of the world. Chicago & London: The University of Chicago Press.
Cheng, Lai-Shen Lisa & Rint Sybesma. 1999. Bare and not-so-bare nouns and the structure of NP. Linguistic Inquiry 30. 509–542.
Chierchia, Gennaro. 1998. Reference to kinds across languages. Natural Language Semantics 6. 339–405.
Clark, Marybeth. 1978. Coverb and cases in Vietnamese (Pacific Linguistics B-48). Canberra: Australian National University.
Clark, Marybeth. 1992. Serialisation in Mainland Southeast Asia. In Amara Prasithrathsint (ed.), Pan-Asiatic linguistics: Proceedings of the Third International Symposium on Language and Linguistics, vol. 1, 145–159. Bangkok: Chulalongkorn University.
Collins, Chris. 1997. Argument sharing in serial verb constructions. Linguistic Inquiry 28. 461–497.
Cooke, Joseph Robinson. 1968. Pronominal reference in Thai, Burmese and Vietnamese. Berkeley, CA: University of California Press.
Crowley, Terry. 2002. Serial verbs in Oceanic. A descriptive typology. Oxford: Oxford University Press.
Diller, Anthony V. N. 2006. Thai serial verbs: Cohesion and culture. In Alexandra Y. Aikhenvald & R. M. W. Dixon (eds.), Serial verb constructions. A cross-linguistic typology, 160–177. Oxford: Oxford University Press.
Durie, Mark. 1997. Grammatical structures in verb serialization. In Alex Alsina, Joan Bresnan & Peter Sells (eds.), Complex predicates, 289–354. Stanford, CA: Center for the Study of Language and Information (CSLI).
Enfield, Nicholas J. 2003. Linguistic epidemiology. Semantics and grammar of language contact in Mainland Southeast Asia. London & New York: RoutledgeCurzon.


Enfield, Nicholas J. 2005. Areal linguistics and Mainland Southeast Asia. Annual Review of Anthropology 34. 181–206.
Enfield, Nicholas J. & Bernard Comrie (eds.). 2015. Languages of Mainland Southeast Asia. The state of the art. Berlin: Mouton de Gruyter.
Gerner, Matthias. 2006. Noun classes in Kam and Chinese Kam-Tai languages: Their morphosyntax, semantics and history. Journal of Chinese Linguistics 34. 237–305.
Givón, Talmy. 1979. On understanding grammar. New York, San Francisco & London: Academic Press.
Gorgoniev, Ju A. 1966. Kategorija glagola v sovremennom kxmerskom jazyke [The category of the verb in Modern Khmer]. Moskva: Izd. vostočnoj literatury.
Greenberg, Joseph H. 1972. Numerical classifiers and substantival number: Problems in the genesis of a linguistic type. Working Papers on Language Universals 9, 1–39. Stanford, CA: Department of Linguistics, Stanford University.
Haiman, John. 2011a. Cambodian. Khmer. Amsterdam & Philadelphia: John Benjamins.
Haiman, John. 2011b. Competing motivations. In Jae Jung Song (ed.), The Oxford handbook of linguistic typology, 148–165. Oxford: Oxford University Press.
Heine, Bernd. 2002. On the role of context in grammaticalization. In Ilse Wischer & Gabriele Diewald (eds.), New reflections on grammaticalization, 83–101. Amsterdam & Philadelphia: John Benjamins.
Heine, Bernd. 2018. Grammaticalization in Africa. Two contrasting hypotheses. In Heiko Narrog & Bernd Heine (eds.), Grammaticalization from a typological perspective, 16–34. Oxford: Oxford University Press.
Heine, Bernd & Tanja Kuteva. 2002. World lexicon of grammaticalization. Cambridge: Cambridge University Press.
Heine, Bernd, Ulrike Claudi & Friederike Hünnemeyer. 1991. Grammaticalization. A conceptual framework. Chicago & London: The University of Chicago Press.
Himmelmann, Nikolaus P. 2004. Lexicalization and grammaticization: Opposite or orthogonal. In Walter Bisang, Nikolaus P. Himmelmann & Björn Wiemer (eds.), What makes grammaticalization? – A look from its fringes and its components, 21–42. Berlin: Mouton de Gruyter.
Hopper, Paul J. 1991. On some principles of grammaticization. In Elizabeth C. Traugott & Bernd Heine (eds.), Approaches to grammaticalization, vol. I, 17–36. Amsterdam & Philadelphia: John Benjamins.
Hopper, Paul J. & Elizabeth C. Traugott. 2003. Grammaticalization, 2nd edn. Cambridge: Cambridge University Press.
Horn, Laurence R. 1984. Towards a new taxonomy of pragmatic inference: Q-based and R-based implicature. In Deborah Schiffrin (ed.), Meaning, form, and use in context: Linguistic applications, 11–42. Washington, DC: Georgetown University Press.
Huang, Yan. 1994. The syntax and pragmatics of anaphora. Cambridge: Cambridge University Press.
Hundius, Harald & Ulrike Kölver. 1983. Syntax and semantics of numeral classifiers in Thai. Studies in Language 7. 165–214.
Huynh, Juliet & Suwon Yoon. 2019. The compatibility between expressive elements: Kinship terms, pronouns, and racial slurs in Vietnamese. Journal of the Southeast Asian Linguistics Society 12. 91–111.
Iida, Masayo. 1996. Context and binding in Japanese. Stanford, CA: CSLI Publications, Center for the Study of Language and Information.
Iwasaki, Shoichi. 1989. Clausehood and verb serialization in Thai narratives. Phasa lae Phasasat 7. 84–130.
Jacob, Judith M. 1968. Introduction to Cambodian. London: Oxford University Press.



Grammaticalization in Mainland Southeast Asian languages 


Jarkey, Nerida. 2015. Serial verbs in White Hmong. Leiden: Brill. Jenks, Peter. 2011. The hidden structure of Thai noun phrase. Cambridge, MA: Harvard University PhD dissertation. Jenks, Peter. 2015. Two kinds of definiteness in numeral classifier languages. Proceedings of SALT 25. 103–124. Jenny, Mathias. 2010. Benefactive strategies in Thai. In Fernando Zuñiga & Seppo Kittilä (eds.), Benefactives and malefactives. Case studies and typological perspectives, 377–392. Amsterdam & Philadelphia: John Benjamins. Jenny, Mathias. 2014. Transitivity and affectedness in Mon. Mon-Khmer Studies 43. 57–71. Kruspe, Nicole. 2004. A grammar of Semelai. Cambridge: Cambridge University Press. Kullavanijaya, Pranee & Walter Bisang. 2003. Another look at aspect in Thai. MANUSYA Journal of the Humanities (Special Issue No. 6). 43–56. Kullavanijaya, Pranee. 2004. A historical study of /thii/ in Thai. In Anthony V. N. Diller, Jerold A. Edmondson & Yongxian Luo (eds.), The Tai-Kadai languages, 461–483. Oxford: Routledge. Kuryłowicz, Jerzy. 1965. The evolution of grammatical categories. Diogenes 13(51). 55–71. [Reprinted 1975: Esquisses linguistiques II, 38–54. Munich: Fink]. Kuteva, Tania, Bernd Heine, Bo Hong, Haiping Long, Heiko Narrog & Seongha Rhee. 2019. World lexicon of grammaticalization, 2nd revised edn. Cambridge: Cambridge University Press. Lamarre, Christine. 2008. The linguistic categorization of deictic direction in Chinese: With reference to Japanese. In Dan Xu (ed.), Space in languages of China: Cross-linguistic, synchronic and diachronic perspectives, 69–97. Dordrecht: Springer. Lambrecht, Knut. 1994. Information structure and sentence form. Topic, focus, and the mental representation of discourse referents. Cambridge: Cambridge University Press. LaPolla, Randy. 2003. Why languages differ: Variation in the conventionalisation of constraints on inference. 
In David Bradley, Randy LaPolla, Boyd Michailovsky & Graham Thurgood (eds.), Language variation: Papers on variation and change in the Sinosphere and in the Indosphere in honour of James A. Matisoff, 113–144. Canberra: Pacific Linguistics 555. Le, Phuc Thien. 2011. Transnational variation in linguistic patterns in Vietnamese: Australia and Vietnam. Melbourne: University of Victoria PhD dissertation. http://vuir.vu.edu.au/17945/ (last accessed 28 July 2020). Lehmann, Christian. 2002 [1995]. Thoughts on grammaticalization, 2nd revised edn. https://www.christianlehmann.eu/?open=schriftenverzeichnis (last accessed 12 August 2020). Levinson, Stephen C. 2000. Presumptive meanings. The theory of generalized conversational implicatures. Cambridge, MA: The MIT Press. Li, Charles N. & Sandra A. Thompson. 1981. Mandarin Chinese. A functional reference grammar. Berkeley, Los Angeles & London: University of California Press. Li, Xuping & Walter Bisang. 2012. Classifiers in Sinitic languages: From individuation to definiteness-marking. Lingua 122. 335–355. Löbner, Sebastian. 1985. Definiteness. Journal of Semantics 4. 279–326. Löbner, Sebastian. 2011. Concept types and determination. Journal of Semantics 28. 279–333. Martinet, André. 1962. A functional view of language. Oxford: Clarendon Press. McWhorter, John H. 2001. The world’s simplest grammars are creole grammars. Linguistic Typology 5. 125–166. Meillet, Antoine. 1912. L’évolution des formes grammaticales. Scientia (Rivista di Scienza) 12(26). 384–400. Miestamo, Matti. 2008. Grammatical complexity in a cross-linguistic perspective. In Matti Miestamo, Kaius Sinnemäki & Fred Karlsson (eds.), Language complexity: Typology, contact, change, 23–41. Amsterdam & Philadelphia: John Benjamins. Mottin, Jean. 1980. Contes et légendes Hmong Blanc. Bangkok: Don Bosco Press.


Needleman, Rosa M. 1973. Tai verbal structures and some implications for current linguistic theory. Los Angeles: University of California PhD dissertation. Neeleman, Ad & Kriszta Szendrői. 2007. Radical pro drop and the morphology of pronouns. Linguistic Inquiry 38. 671–714. Nguyen, Tuong Hung 2004. The structure of the Vietnamese noun phrase. Boston: Boston University dissertation. Norde, Muriel. 2009. Degrammaticalization. Oxford: Oxford University Press. Noss, Richard B. 1964. Thai reference grammar. Washington, DC: Foreign Service Institute. Paillard, Denis. 2010. La notion de prédicat complexe Préfixation – Particules Verbales – Constructions verbales en série. Les cahiers de faits de langues 2. 197–228. Paillard, Denis. 2013. Les constructions verbales en série en khmer contemporain. Faits des langues 41. 11–39. Paul, Waltraud. 2008. The serial verb construction in Chinese: A tenacious myth and a Gordian knot. The Linguistic Review 25. 367–411. Peyraube, Alain. 1988. Syntaxe diachronique du chinois. Évolution des constructions datives du XIVe siècle av. J.-C. au XVIIIe siècle. Paris: Collège de France, Institut des hautes études chinoises. Quang, Kim Ngoc. Forthcoming. Vietnamese classifiers and (in)definiteness: A text-based analysis. Mainz: University of Mainz PhD dissertation. Rangkupan, Suda. 2007. The syntax and semantics of GIVE-complex constructions in Thai. Language and Linguistics 8(1). 193–234. Rizzi, Luigi. 1986. Null objects in Italian and the theory of pro. Linguistic Inquiry 17. 501–557. Schwarz, Florian. 2009. Two types of definites in natural language. Amherst, MA: University of Massachusetts dissertation. Schwarz, Florian. 2013. Two types of definites cross-linguistically. Language and Linguistics Compass 7. 534–558. Simpson, Andrew. 2017. Bare classifier/noun alternations in the Jinyun (Wu) variety of Chinese and the encoding of definiteness. Linguistics 55(2). 305–331. Simpson, Andrew, Hooi Ling Soh & Hiroki Nomoto. 2011. 
Bare classifiers and definiteness: A cross-linguistic investigation. Studies in Language 35. 168–193. Sinnemäki, Kaius. 2011. Language universals and linguistic complexity. Helsinki: University of Helsinki, Department of Modern Languages. Song, Jae Jung. 1997. On the development of MANNER from GIVE. In John Newman (ed.), The linguistics of giving, 327–348. Amsterdam & Philadelphia: John Benjamins. Sookgasem, Prapa. 1990. Morphology, syntax, and semantics of auxiliaries in Thai. Tucson: University of Arizona doctoral dissertation. Thepkanchana, Kingkarn. 1986. Serial verb constructions in Thai. Ann Arbor: University of Michigan PhD dissertation. Thepkanjana, Kingkarn & Satoshi Uehara. 2008. The verb of giving in Thai and Mandarin Chinese as a case of polysemy: A comparative study. Language Science 30. 621–651. Thompson, Laurence C. 1987 [1965]. A Vietnamese reference grammar. Honolulu: University of Hawaii Press. Traugott, Elizabeth C. 2002. From etymology to historical pragmatics. In Donka Minkova & Robert Stockwell (eds.), Studying the history of the English language: Millennial perspectives, 19–49. Berlin: Mouton de Gruyter. Traugott, Elizabeth C. & Richard B. Dasher. 2002. Regularity in semantic change. Cambridge: Cambridge University Press. Traugott, Elizabeth C. & Graeme Trousdale. 2010. Gradience, gradualness and grammaticalization. Amsterdam & Philadelphia: John Benjamins.




Trinh, Tue. 2011. Nominal reference in two classifier languages. Sinn und Bedeutung 15. 629–644. Tuc, Ho-Dac. 2003. Vietnamese-English bilingualism. Patterns of code-switching. London & New York: RoutledgeCurzon. Vittrant, Alice & Justin Watkins (eds.). 2019. The Mainland Southeast Asian Linguistic Area. Berlin: Mouton de Gruyter. Wang, Jian / 王健. 2013. 类型学视野下的汉语方言“量名”结构研究 Lèixíngxué shìyě xià de hànyǔ fāngyán “liàng míng” jiégòu yánjiū [Bare classifier phrases in Sinitic languages: A typological perspective]. Language Sciences 12(4). 383–393. Wilawan, Supriya. 1993. A reanalysis of so-called serial verb constructions in Thai, Khmer, Mandarin Chinese and Yoruba. Honolulu: University of Hawaii PhD dissertation. Wu, Yicheng & Adams Bodomo. 2009. Classifiers ≠ determiners. Linguistic Inquiry 40. 487–503. Xing, Janet Z. 2013. Semantic reanalysis in grammaticalization in Chinese. In Zhuo Jing-Schmidt (ed.), Increased empiricism: New advances in Chinese linguistics, 223–246. Amsterdam & Philadelphia: John Benjamins. Xing, Janet Z. 2015. A comparative study of semantic change in grammaticalization and lexicalization in Chinese and Germanic languages. Studies in Language 39(3). 594–634. Xu, Dan, Chong Qi, Shuang Xu & Fabienne Marc. 2008. Les résultatifs du chinois contemporain. Dictionnaire pratique. Paris: L’Asiathèque – maison des langues du monde. Xu, Dan. 2006. Typological change in Chinese syntax. Oxford: Oxford University Press. Yap, Foong Ha & Shoichi Iwasaki. 2007. The emergence of “GIVE” passives in East and Southeast Asian languages. In Mark Alves, Paul Sidwell & David Gil (eds.), SEALS VIII: Papers from the Eighth Annual Meeting of the Southeast Asian Linguistics Society, 193–208. Canberra: Pacific Linguistics. Zipf, George K. 1949. Human behavior and the principle of least effort. Cambridge, MA: Addison-Wesley.

Jeffrey P. Williams

33 Expressives in languages of Mainland Southeast Asia

33.1 Introduction

Expressives are shape-shifting forms whose functions cross-cut grammatical categories and classes but, in general, serve to allow the speaker to provide meta-commentary on an argument in the discourse. As the name implies, expressives allow speakers to ‘express’ an opinion, an attitude, a perception, or other psychological state regarding a topic in a situated discourse. As Tufvesson states, “[e]xpressives typically package multiple aspects of a sensory event into a single word” (Tufvesson 2011: 88). The languages of Mainland Southeast Asia evidence an extraordinary array of elaborate grammatical resources, such as echo words, phonaesthetic words, chameleon affixes, chiming derivatives, onomatopoeic forms, ideophones, and expressives. Speakers of these languages transform lexical resources in order to express and convey emotions, senses, conditions, and perceptions that enrich both everyday and ritual discourse. Jacob (1966) already commented on the pervasiveness of phonaesthetic reduplicative compounds in the ‘plain language’ of Khmer; that is, those varieties used in everyday discourse and not restricted to specialized or ritual genres. As Jacob points out, the Khmer people are keenly attentive to how individuals move their bodies and/or limbs to perform tasks, and speakers have an abundance of forms from which to choose to make meta-commentary on these actions (Williams 2013). This is hardly restricted to the Khmers: the ubiquity of expressives in the languages of Mainland Southeast Asia is well known. Even the casual observer of the languages of the region is struck by the pervasiveness of the discussion of expressives in reference grammars and other linguistic materials. The whole concept of a grammatical class of expressives was articulated by Gérard Diffloth nearly 50 years ago (1972) in reference to Mainland Southeast Asian languages.
The purpose of this chapter is to provide an overview of expressives and their kin in the languages of the region. Given the almost epidemic presence of these forms in the languages of Mainland Southeast Asia, it will not be possible to cover every language; instead, I will concentrate on representative languages where there is a solid understanding of the structures and functions that define expressives.

https://doi.org/10.1515/9783110558142-033


33.2 Defining expressives and expressivity

So exactly what are expressives? While this should be an easy question given it is the title and topic of this chapter, it remains somewhat difficult to answer. The linguist is tempted to say, “I know an expressive when I see/hear one, but I can’t really tell you what they are for certain.” As Dingemanse (2011) contends for ideophones in Siwu (a West African language), they are conspicuous words. The example below, provided by a native speaker of Jarai (Chamic), provides an illustration of the complexities of expressives in discourse.1

(1)

sang anai tơ-treh truai ƀu hơmâo mơnuih olô amăng sang ȏh
house this pfx-silent echo neg have people stay.in echo house emph
‘This house is very quiet because no one lives here.’

In this example, which derives from naturally occurring discourse, there are two examples of expressives in the form of echo words. In the first occurrence, the echoant follows the root, while in the second, the echoant precedes the root. In Jarai, this variation is common and is not governed lexically or grammatically; instead, it is up to the speaker’s discretion as to where to place the echoant vis-à-vis the root. In some reference grammars, the term “expressive” is not used and the grammatical descriptions refer only to formal processes such as reduplication, while the forms and glosses themselves clearly indicate expressivity.2 In Thompson’s classic grammar of Vietnamese, he refers to all such forms as “derivatives” (Thompson 1965: 139–178), and the term “expressives” does not appear in either the index or the body of the work.3 In Migliazza’s (2005) account of expressives in So, a Katuic (Austro-Asiatic) language spoken in Northeast Thailand and Southern Laos, he uses the term ‘expressives’ in the broader sense, encompassing onomatopoeic forms as well as reduplications and other sound symbolic formations (cf. Sidwell 2013). I adopt a similar stance in this piece, collecting together the variety of forms which function in the capacity of expressivity.

1 As can be seen in the Jarai example, the echoant can appear either to the left or right of the stem or root to which it attaches. Such a pattern is reminiscent of what is also characteristic of Khmer and many other Austro-Asiatic languages of Mainland Southeast Asia. These similarities in both form and function indicate a strong history of contact between Austronesian speakers and Austro-Asiatic speakers in the linguistic prehistory/history of the region.
2 Alves’ (2006) brief reference grammar of the Mon-Khmer language Pacoh is one such example.
3 This is certainly due to the fact that Diffloth’s defining work on expressives did not appear in print until 1972.




In Temiar (Aslian, Austro-Asiatic), Benjamin points out that there are different kinds of expressive formations (2013). For instance, there is a process of incopyfixation which creates an imperfective formation from the verb ciib as shown below. (2)

a. ciːb ‘walk’
b. cɛbciːb ‘to have not completed walking’

Temiar also has a full reduplicated pattern for the same verb as shown below for ciib. It is clear that the two processes and derivative forms have different meanings although they both contain an iconic element of temporality. (3)

a. ciːb ‘walk’
b. ciːb ciːb ‘to keep on walking’

Ideophones are often considered to be synonymous with expressives – the main difference being terminological. The term “expressive” is used to refer to these linguistic phenomena that are part of the Southeast Asian linguistic area, while the term “ideophone” is typically used in the grammars of African languages. This distinction is arbitrary at best and creates a false impression that somehow the two entities are different when in fact they are similar in both form and function.4 The elusiveness of the definition lies in the tension between formal and functional explanations of linguistic phenomena. Taking a formalist tack, expressives cannot be reduced to a single process that generates a singular form. The structuralist principle of “one form: one meaning” (Kuryłowicz 1949) does not categorically dominate with expressives. From a functional perspective, there are a number of semantic categories that can be expressed through these morphosyntactic processes: duration, size, number, vagueness, iteration, and generality, to name a few. As for ideophones, expressives, reduplications, echo-words and the like, the plurality of forms and functions can only be accounted for through a metric of ‘from many, one’.5 In many cases, in the languages of Mainland Southeast Asia, a root or stem may have more than one expressive form, indicating a change in descriptive perspective. The following examples from Semai (Aslian, Austro-Asiatic) illustrate this condition.

(4)

ghu:p ‘acrid odor; neutral’
gho:p ‘acrid odor; intense’
ghɒ:p ‘acrid odor; very intense’
From Tufvesson (2011: 89)

As Benjamin (2013) discusses for Temiar (Aslian, Austro-Asiatic), the syllable poŋ culturally represents the sound of a machete on bamboo. The formation of poŋ papoŋ to indicate repeated chopping illustrates Tufvesson’s point that “through this type of form–meaning mapping, gradient relationships in the perceptual world receive gradient linguistic representations” (Tufvesson 2011: 86).

4 Some works do not preference one form or the other, instead using both as in Enfield (2019).
5 These are also referred to as rhyming reduplications.

33.3 Accounting for expressivity

From a perspective that brings both grammatical and socio-cultural considerations to bear on the issue, the following list of features both delineates and defines expressivity in the world’s languages: (i) iconicity; (ii) complexity; (iii) eloquence; and (iv) context of use. I will outline these key concepts and how they function to shape expressive morphology in the languages of Mainland Southeast Asia, where they are evidenced as key grammatical resources.

33.3.1 Iconicity

As Diffloth (1976) claimed some time back now, iconicity is the raison d’être of expressives in the languages where they are found. Certainly there is an element of iconicity to expressive morphology, but we need to employ a more refined notion of iconicity than is usually invoked in linguistic description. While Brunelle and Xuyến (2013) analyse examples as in (5) as a kind of onomatopoeic iconicity, I would see these more as examples of culturally conventionalized iconicity (see below), since there is a cultural convention that equates the velar nasal onset with quiet or discreet chewing, the dental onset with loud chewing, and the affricated onset with very loud chewing. The cline of loudness moves from the back to the front of the mouth through each point of articulation, although the move from ‘loud’ to ‘very loud’ is more nuanced in relation to Vietnamese culturally conventionalized iconicity.

(5)

nhóp nhép [ɲɔp ɲɛp] ‘of chewing discreetly’
tóp tép [tɔp tɛp] ‘of chewing loudly’
chóp chép [cɔp cɛp] ‘of chewing very loudly’
From Brunelle and Xuyến (2013)

Iconicity can be defined as a relationship of conventionalized similarity between a form, linguistic in this case, and its meaning. Briefly, an iconic sign resembles its meaning in some socio-cultural or cognitive manner. Iconicity stands in opposition to arbitrariness; in grammar, the principle of arbitrariness has governed most linguistic description and theory for decades. Expanding on this basic conceptualization of iconicity, Gérard Diffloth concludes his study of iconicity in Bahnar (Mon-Khmer, Cambodia/Vietnam) by saying:




Iconicity belongs to a different semiotic domain than the one usually described in our grammars. As far as expressives are concerned, the phonic and the meaning elements must be described in terms of certain elementary sensations. Iconicity consists here in exploiting similarities between the sensations of speech and other kinds of sensation. This kind of synesthesia must be described in a distinct component of grammar, the esthetic component, which is distinct but not isolated, as it somehow must be plugged into the conventional components which have received much of the attention of theoreticians so far. (Diffloth 1994: 113)

What is key to iconicity in expressive morphology is the fact that the type of iconicity that is evidenced is partly conventionalized, partly emblematic. To explain this statement further, I use the term ‘conventionalized’ to refer to the socio-cultural competency within which the linguistic form is embedded. It is the breadth of socio-cultural knowledge that is requisite for a meaningful understanding of any linguistic representation in discourse. As we can observe in example (6) from Jarai (Chamic), socio-cultural knowledge is crucial to an understanding of the expressive formation.

(6)



a. rơnguă
   sad
   ‘sad’
b. rơnguă rơnguăn
   sad echo
   ‘a feeling of melancholy that is often brought about by hearing the call of specific birds, such as doves’

In this example, the echoant has no existence as an independent form in Jarai. It only exists in collocation with the root ‘sad.’ Such usage can only be truly understood if one has the cultural knowledge to situate the grammatical resources into a larger socio-aesthetic context. Formally, this is typically the case with echo formations in the grammars of the languages of Mainland Southeast Asia. “Emblematic” here refers to the standard notion of iconicity that is employed in linguistic discourse on the topic. This is the received notion that pervades most of the conceptualization of iconicity: that there is some direct relationship between the expressor and the expressed. While there is a considerable body of learned opinion on the subject, none of it is able to elegantly explain the motivation – the glue which binds – behind the expressor and the expressed concept. In the explanation put forth here, that glue is socio-cultural knowledge – or what I (Williams 2020) have referred to as conventionalized iconicity – which binds the two together. Conventionalized iconicity is based on symbolic experience which becomes part of the socio-grammatical architecture, allowing the native speaker to navigate the sociolinguistic landscape. Without this knowledge, a speaker cannot produce these utterances and a listener cannot interpret them.


33.3.2 Complexity

Sapir suggested the principle of complexity in reduplication a century ago. He stated that:

The process is generally employed, with self-evident symbolism, to indicate such concepts as distribution, plurality, repetition, customary activity, increase of size, added intensity, continuance. (Sapir 1921: 76)

What is clear about the iconicity of expressive morphology is that it conforms to the principle of quantity, as pointed out by Givón (1985). The quantity principle in morphology holds that conceptual complexity corresponds to formal complexity. Whether we talk about reduplication, echo-word formation, or other expressive forms, there is a formal complexity that directly corresponds to the conceptual complexity that is conveyed through these forms. Downing and Stiebels have chosen to represent this relationship formally as in (7). (7)

M1 (σ)      --------- F1 /X/
M2 (σ ∘ α)  --------- F2 /X + z/

As they explain: If two forms F1 and F2 differ in terms of extra (supra-)segmental material z and two semantic representations M1 and M2 differ in terms of an extra meaning component α, then M1 should be assigned to F1 and M2 to F2. (Downing and Stiebels 2012: 392)

This principle of expressive morphology is pervasively articulated in languages of Mainland Southeast Asia. Expressives evidence the addition of formal content, either through pure addition as in reduplication or through replacive addition in the case of echo words, or what are called “chameleon affixes” in the literature, the latter seen in the examples from Vietnamese below (9a)–(9c). (9)

a. sạch ‘to be clean’      sạch sạch ‘to be rather clean’
b. đen ‘to be black’       đen đẻn ‘to be rather black’
c. rõ ‘to be clear’        rõ rệt ‘to be very clear’
From Thompson (1965)
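The quantity principle formalized in (7) can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration and is not part of Downing and Stiebels' or Thompson's analyses: formal complexity is approximated crudely by counting orthographic syllables, meanings are represented as sets of meaning components, and the component labels and function names are hypothetical. The data are the Vietnamese pair in (9c).

```python
# Illustrative sketch only: the representations and names below are assumptions.

def formal_size(form: str) -> int:
    """Crude proxy for formal complexity: number of space-separated syllables."""
    return len(form.split())


def assign_by_quantity(forms, meanings):
    """Pair the simpler form with the simpler meaning and the more complex
    form with the more complex meaning (F1 with M1, F2 with M2)."""
    f1, f2 = sorted(forms, key=formal_size)
    m1, m2 = sorted(meanings, key=len)
    if formal_size(f1) == formal_size(f2) or len(m1) == len(m2):
        raise ValueError("a difference in both form and meaning is required")
    return {f1: m1, f2: m2}


BASE = "rõ"            # 'to be clear', from (9c)
DERIVED = "rõ rệt"     # 'to be very clear', from (9c)
M_PLAIN = frozenset({"clear"})                  # stands in for M1
M_INTENSE = frozenset({"clear", "intensity"})   # M1 plus an extra component

# The input order does not matter; the pairing is driven by relative size.
pairing = assign_by_quantity([DERIVED, BASE], [M_INTENSE, M_PLAIN])
for form, meaning in pairing.items():
    print(form, sorted(meaning))
```

The sketch compares sizes only. It does not check that the larger form actually contains the smaller one, or that the richer meaning includes the simpler one, both of which the formulation in (7) presupposes.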




33.3.3 Eloquence

Eloquence is signaled by a variety of grammatical features, and this characteristic of spoken language is conspicuously manifested in the languages of Mainland Southeast Asia.6 The feature of eloquence echoes through the descriptions of expressives in the grammars of the languages of Mainland Southeast Asia. Its evaluation is tied to a speaker’s command of expressive forms. In many cases, the feature of eloquence is considered part of the poetic, or as Hudak (2008) has termed it, the aesthetic function of language. Hudak makes the strong claim that “[s]ound, and more precisely pleasing sound, is the common overriding aesthetic that pervades all of the Tai languages” (Hudak 2008: 404). This aesthetic is manipulated by the competent, eloquent speaker to create poetic utterances, not only in the Tai languages but throughout the languages of the region.7 Haiman (2013) has made the argument that expressives are “decorative” in function. His claim is that such resources are non-referential, and instead contribute to the “elegance” of an utterance. I would contend that the elegance that Haiman rightly points out for Khmer expressive constructions resides in the eloquence of the speaker – not in the utterance per se.

33.3.4 Contexts of use

The importance of social and cultural factors in the use of expressives in the languages of Southeast Asia cannot be overstated. In general, what we find is that the use of expressive morphological devices is more common in naturally occurring discourse where the context is more vernacular. We also find that context of use relates to the previously discussed characteristic of eloquence. Eloquence, defined as a personal verbal charisma, is signaled through command of a genre as well as the grammatical embellishment of that genre, often through the use of expressivity. Tai Ahom (Tai-Kadai) is an extinct language (Morey 2020) whose legacy lives on through a body of written texts. In his important contribution to the analysis of poetic devices found in a Tai Ahom manuscript, Ming Mvng Lung Phai, Morey (2020) draws attention to a feature termed a “waist” rhyme, where the last word of a line rhymes with a syllable in the middle of the following line, which is metaphorically the waist.8 As Morey makes clear, waist rhyming is very common in the ethnopoetics of Thailand’s linguistic groups; and beyond this, it is common in many of the languages of Mainland Southeast Asia. Modi and Post (2020) point out specific methodological constraints which disfavor the use of expressives. Modi and Post go on to state that it is exactly the context of formal linguistic elicitation in which speakers avoid the use of expressives, due to the unnaturalness of the linguistic elicitation session in most cultures. In my work with speakers of Jarai, I have found that competent speakers make use of expressives in spontaneous discourse where the social interactions are relaxed and the topics of discourse are the most vernacular. In fact, Jarai speakers who are also fluent in Vietnamese are able to construct expressives that “code switch” as shown in example (11) below.

6 Eloquence also functions as a partial marker of the feature of “charisma” as developed in the work of Weber (1922).
7 In my work with Jarai speakers, I have encountered evaluations of eloquence as being key in the use of expressives.
8 Waist rhymes are very characteristic in verbally aesthetic genres in the languages of Mainland Southeast Asia (cf. Williams 2013).

33.4 Expressive forms

In the following sections, I will briefly define and exemplify the most characteristic types of expressive forms in the Mainland Southeast Asian languages; namely,
1. ideophones
2. reduplications
3. echo-words
4. onomatopoeia

33.4.1 Ideophones

Ideophones are often taken as exemplary of the class of expressives, and as mentioned previously, the terms are even used interchangeably by some. In the linguistic literature on the Mainland Southeast Asian region, the term “ideophone” is only used infrequently; not in any way mimicking the usage we find in the grammatical descriptions of African languages, or the languages spoken in East Asia, especially Japanese and Korean. According to one of the earliest definitions of the term (Doke 1935), ideophones are broadly defined as sound-symbolic words which function to “give a vivid representation of an idea in sound” (Doke 1935: 118). In a nuanced definition provided by anthropologist Evans-Pritchard (1962), he states:

It is not a simple matter to define an ideophone, but for the purpose of this note it may broadly be taken to mean, perhaps rather crudely, words the sounds of which seem to be highly expressive of what is referred to, to be just the right sounds to bring to the mind the idea they invoke. (Evans-Pritchard 1962: 143)




Like other forms of expressive morphosyntax, ideophones enliven discourse and often signal the skill of the speaker while testing the competency of the listener. From a formal perspective, ideophones involve copying, or partial copying, of some root or stem. In some cases, they are onomatopoeic forms, although some linguists maintain a distinction between the two. The treatment of ideophones is a growing area of interest in linguistic description and theory (cf. Williams forthcoming).

33.4.2 Reduplications

Following Inkelas and Downing we can say that “reduplication involves the doubling of some component of a morphological base for some morphological purpose” (Inkelas and Downing 2015: 502). What sets reduplicative processes apart from other morphological processes is the fact that they are contingent on the shape of the word, stem or root form to provide the shape of the reduplicative affix. Simply put, reduplication involves the copying of some part of a morphological base for a specified semantic goal. Functionally, reduplication is typically viewed as expressing a core purpose of size/number and intensity. Chrau (Bahnaric), for instance, utilizes complete reduplication as a means to indicate expressivity, as examples (8) through (10) show.

(8) khūch khūch ‘many fish’
(9) hul hul ‘sitting quietly’
(10) laq laq ‘sitting still, or sick’
From Thomas (1971: 155)

Reduplication plays an important role in Vietnamese (Austroasiatic). Vũ (1991) attests that around 4,000 stative and non-stative verbs in the language can undergo some form of reduplication. Like many other languages scattered across the globe which evidence reduplication, the languages of the Mainland Southeast Asian region exhibit several different types of reduplicative morphology. Typically, linguists group these into (i) total, or complete, reduplication and (ii) partial reduplication. Total reduplication copies the entire morphological base form, as shown in example (8). Total or complete reduplication can convey a wide range of meaning categories in the Southeast Asian sociolinguistic landscape. In Bahnar (Bahnaric), for example, the word for ‘answer’ is drâng. Like many languages in the Mainland Southeast Asian linguistic area, Bahnar also has a causative prefix, pơ-. The stem formation pơ-drâng (‘cause to answer’) can undergo complete reduplication, deriving pơ-drâng drâng. This reduplication does not mean what one might expect, something like ‘to cause to answer repeatedly’; instead, it means ‘to cause to answer and then change to do something else immediately’ (Banker 1964).


Partial reduplication copies only a portion of the morphological base, typically characterized as a phonological or prosodic constituent. It is important to distinguish partial reduplication – which is not replacive – from echo word formation, which is typically replacive.

33.4.3 Echo morphology

The concept of echo morphology (echo-word formation) was first introduced into the literature of linguistics in 1938 by Murray Emeneau in reference to what was present throughout the South Asian linguistic area. Nowadays, there are several descriptors used to cover the range of phenomena we find in the world’s languages, with some authors referring to these forms as rhyming reduplication, overwriting, fixed segment reduplication, and chameleon affixation. Echo word formation is an example of reduplication with prosodic replacement in the copy. Replacement can involve any prosodic portion of the copy, including an onset, nucleus, coda, tonal pattern, and register. The copy with prosodic replacement is referred to as the echoant. In example (11) from Jarai, we can see a borrowed Vietnamese echo form djik-djăk⁹ which is used in conjunction with an indigenous Jarai compound to create a type of four-syllable expression, djik-djăk khăk-kơčah, meaning ‘to intensely criticize a person’. While Jarai does not distinguish gender in third person pronouns, native speakers have told me that djik-djăk khăk-kơčah is normally used with women actors, and typically mothers-in-law.

(11)
    Nư djik-djăk khăk-kơčah kơ kâo
    3 criticize hack-spit dat 1
    ‘S/he is very critical of me.’
    Williams (fieldnotes)

A diagnostic feature of echo formation is that the replacive morpheme directly impacts the meaning of the echo collocation. There can be, and is, a pattern of difference in echoants. For echo word formation where roots and other non-meaningful morphs have multiple collocative allomorphs, I would suggest that a common term “echoant” be used to categorize them, on a par with the use of the term “reduplicants”. The choice of echoant results in a difference in meaning, as shown in the Jarai examples in (12) below.

(12)
    a. anět ‘small’
    b. anět aneo ‘very small’
    c. anět anot ‘extremely small’
    Williams (fieldnotes)
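The difference between total reduplication and replacive echo formation can be made concrete in a short Python sketch. It is an illustration only, not an analysis of Chrau or Jarai: the function names are hypothetical, forms are treated as flat strings, and the portion of the copy that is overwritten is supplied by hand rather than derived from prosodic structure.

```python
def total_reduplication(base: str) -> str:
    """Total reduplication: copy the entire base, e.g. hul -> hul hul."""
    return f"{base} {base}"


def echo_formation(base: str, kept: int, replacement: str) -> str:
    """Echo formation: copy the base, then overwrite part of the copy.

    `kept` is the number of leading characters of the base retained in the
    echoant; `replacement` is the material that overwrites the remainder.
    Both values are supplied by hand (an assumption of this sketch).
    """
    echoant = base[:kept] + replacement
    return f"{base} {echoant}"


print(total_reduplication("hul"))       # Chrau, example (9)
print(echo_formation("anět", 2, "eo"))  # Jarai, example (12b)
print(echo_formation("anět", 2, "ot"))  # Jarai, example (12c)
```

The two echo calls differ only in the replacement material, which mirrors the point that the choice of echoant is what carries the difference in meaning between (12b) and (12c).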

9 From the Vietnamese root khạc [xaːk] ‘to spit’.




In Wa (Palaungic), aesthetically motivated elements are added which contribute nothing new semantically and have no identifiable syntactic or semantic function, but which make an expressive contribution, as Watkins (2013) points out and as is illustrated in (13).

(13)
    kra̤ɯŋ kʰrai
    clothes + clothes
    ‘clothing, things, goods, possessions’

The additional element is a synonym or near-synonym, and the effect of adding it makes the expression sound and feel richer: in this case there is no systematic semantic intensification or augmentation of the unadorned monosyllabic morpheme.

In a detailed and insightful study of Thai classical singing, Swangviboonpong (2004) discusses the Thai term เอื้อน /ʔɯ̂ən/. In most dictionaries, its definition includes ‘the speaking of a word’, ‘a pronouncement’, or ‘speech in a pleasing voice’. However, in the vocabulary of Thai classical music, the same word refers to the ‘wordless vocalizations’ that are positioned between sung words (Swangviboonpong 2004: 32).¹⁰ They are meaningless in terms of the text but are instead aesthetically charged. The formal relationship between ʔɯ̂ən in Thai music and echo morphology in Thai grammar is striking. In both ʔɯ̂ən and in derisive/approximative echo words the Thai vowels ɯ and ǝ are prominent as rhymes.¹¹ In Thai, either the replacement vowel is dictated by the vowel of the root, or ɯ or ǝ is used as a kind of default replacement.

DiCanio (2005) and other grammarians of Khmer have referred to echo words in that language as “alliteration”. As Jacob (1979) points out, in Khmer it is normally the case that repetition is used in more descriptive speech or as a poetic device, although it is found in both everyday and ritualized genres.

33.4.4 Onomatopoeia

Onomatopoeia is traditionally defined in linguistic terms as the naming of a thing or action by a vocal imitation of the sound associated with that thing or action. The sounds produced by animals are often given as examples of onomatopoeia. Some scholars apply the term even more broadly to the use of words whose sound suggests the meaning of that word. Broader still are the expressives which refer to an action through an iconic image, as in (14) from Jarai (Chamic).

(14)
    bru ‘scattered as birds separating from the flock’

10 The ability to use ʔɯ̂ən is an important indicator of musical competency in Thai classical singing and organized sǎtam chátn competitions that measure this ability (Swangviboonpong 2003).
11 According to Swangviboonpong (2003: 36) in ʔɯ̂ən [ǝ:] is most commonly used and can have the following variants: ǝ:, hǝ:, ŋǝ:, ǝ:ŋ, and ŋǝ:j.


The languages of Mainland Southeast Asia vary in their treatment of onomatopoeia as a type of expressive. According to Brunelle and Xuyến (2013), Vietnamese onomatopoeic forms belong to the class of ideophones as shown in (15) below:

(15)
    oàm oạp    [wa:m wa:p]    ‘of waves breaking on the shore’
    hi hi      [hi hi]        ‘of high-pitched laughter’
    phì phò    [fi fɔ]        ‘of panting’

33.5 Conclusion

This brief excursion into the world of expressivity in the languages of Mainland Southeast Asia is best viewed as an introduction. The complexities of these processes in the region are far beyond the scope of the work or the state of the art at this time. Expressivity has not been the topic of a singular work in the linguistics literature or in the literature on the grammars of Mainland Southeast Asian languages.¹² The answers to many of the questions regarding expressivity in grammar lie in more complete collaborative grammatical descriptions involving native speakers and linguists, such as are found in Williams (2013). We remain in need of an overarching theory of how such data can be incorporated into existing models of human cognition and its linguistic exponence.

12 Such a volume is in production now. See Williams (2020) in the reference list which follows.

References

Alves, Mark. 2006. A grammar of Pacoh. Canberra: Pacific Linguistics, Australian National University.
Bankar, Elizabeth M. 1964. Bahnar reduplication. Mon-Khmer Studies I. 119–134.
Benjamin, Geoffrey. 2013. Aesthetic elements in Temiar grammar. In Jeffrey P. Williams (ed.), The aesthetics of grammar: Sound and meaning in the languages of Mainland Southeast Asia, 36–60. Cambridge: Cambridge University Press.
Brunelle, Marc & Lê Thị Xuyến. 2013. Why is sound symbolism so common in Vietnamese? In Jeffrey P. Williams (ed.), The aesthetics of grammar: Sound and meaning in the languages of Mainland Southeast Asia, 83–98. Cambridge: Cambridge University Press.
DiCanio, Christian. 2005. Expressive alliteration in Mon and Khmer (UC Berkeley Phonology Lab Annual Report). Berkeley: University of California.
Diffloth, Gérard. 1972. Notes on expressive meanings. Papers from the Eighth Regional Meeting of the Chicago Linguistics Society, 440–447. Chicago: Chicago Linguistics Society.
Diffloth, Gérard. 1976. Expressives in Semai (Oceanic Linguistics Special Publications 13), 249–264. Honolulu: University of Hawaii Press.
Diffloth, Gérard. 1994. “i: big, a: small.” In Leanne Hinton, Johanna Nichols & John Ohala (eds.), Sound symbolism, 107–114. New York: Cambridge University Press.




Dingemanse, Mark. 2011. Ideophones and the aesthetics of everyday language in a West-African society. Senses and Society 6. 77–85.
Doke, Clement M. 1935. Bantu linguistic terminology. London: Longmans, Green, and Co.
Emeneau, Murray. 1938. An echo-word motif in Dravidian folktales. Journal of the American Oriental Society 58. 553–570.
Evans-Pritchard, Edward Evan. 1962. Ideophones in Zande. Sudan Notes and Records 34. 143–146.
Givón, Talmy. 1985. Iconicity, isomorphism, and non-arbitrary coding in syntax. In John Haiman (ed.), Iconicity in syntax: Proceedings of a symposium on iconicity in syntax, Stanford, June 24–26, 1983, 187–220. Amsterdam: John Benjamins.
Haiman, John. 2013. Decorative morphology in Khmer. In Jeffrey P. Williams (ed.), The aesthetics of grammar: Sound and meaning in the languages of Mainland Southeast Asia, 61–82. Cambridge: Cambridge University Press.
Hudak, Thomas John. 2008. Tai aesthetics. In Anthony V. N. Diller, Jerold A. Edmondson & Yongxian Luo (eds.), The Tai-Kadai languages, 404–414. London: Routledge.
Inkelas, Sharon & Laura Downing. 2015. What is reduplication? Typology and analysis part 1/2: The typology of reduplication. Language and Linguistics Compass 9(12). 502–515.
Jacob, Judith M. 1966. Some features of Khmer versification. In Charles Ernest Bazell, John C. Catford, Michael Alexander K. Halliday & Robert H. Robins (eds.), In memory of J. R. Firth, 227–241. London: Longmans Green and Co.
Jacob, Judith M. 1979. Observations on the uses of reduplication as a poetic device in Khmer. In Theraphan L. Thongkum et al. (eds.), Studies in Tai and Mon-Khmer phonetics and phonology in honour of Eugénie J. A. Henderson, 111–130. Bangkok: Chulalongkorn University Press.
Kuryłowicz, Jerzy. 1949. La notion de l’isomorphisme. Travaux du Cercle de Linguistique de Copenhague 5. 48–60.
Migliazza, Bria. 2005. Some expressives in So. Ethnorêma 1. 1–18.
Modi, Yankee & Mark W. Post. 2020. The functional value of formal exuberance: Expressive intensification in Adi and Milang. In Jeffrey P. Williams (ed.), Expressive morphology in the languages of South Asia, 187–212. London: Routledge.
Morey, Stephen. 2020. A study of the poetics of Tai Ahom. In Jeffrey P. Williams (ed.), Expressive morphology in the languages of South Asia, 215–230. London: Routledge.
Sapir, Edward. 1921. Language. New York: Harcourt, Brace.
Shorto, Harry. 2006. A Mon-Khmer comparative dictionary (Pacific Linguistics 579). Main editor: Paul Sidwell. Canberra: The Australian National University.
Sidwell, Paul. 2013. Expressives in Austroasiatic. In Jeffrey P. Williams (ed.), The aesthetics of grammar: Sound and meaning in the languages of Mainland Southeast Asia, 17–35. Cambridge: Cambridge University Press.
Swangviboonpong, Dusadee. 2004. Thai classical singing: Its history, musical characteristics and transmission. Hampshire, England: Routledge.
Thomas, David D. 1971. Chrau grammar (Oceanic Linguistics Special Publication 7). Honolulu: University of Hawaii Press.
Thompson, Laurence C. 1965. A Vietnamese grammar. Seattle: University of Washington Press.
Tufvesson, Sylvia. 2011. Analogy-making in the Semai sensory world. The Senses and Society 6. 86–95.
Vuori, Vesa-Jussi. 2000. Repetitive structures in the languages of East and South-east Asia (Studia Orientalia 88). Helsinki: Finnish Oriental Society.
Watkins, Justin. 2013. Grammatical aesthetics in Wa. In Jeffrey P. Williams (ed.), The aesthetics of grammar: Sound and meaning in the languages of Mainland Southeast Asia, 99–117. Cambridge: Cambridge University Press.
Weber, Max. 1978 [1922]. Economy and society. Berkeley: University of California Press.


Williams, Jeffrey P. (ed.). 2013. The aesthetics of grammar: Sound and meaning in the languages of Mainland Southeast Asia. Cambridge: Cambridge University Press.
Williams, Jeffrey P. 2020. Expressivity. Key topics in syntax. Cambridge: Cambridge University Press.
Williams, Jeffrey P. & Lap Siu. 2013. Jarai echo word morphology. In Jeffrey P. Williams (ed.), The aesthetics of grammar: Sound and meaning in the languages of Mainland Southeast Asia, 191–206. Cambridge: Cambridge University Press.

Mathias Jenny

34 Pragmatics and syntax in the languages of MSEA

https://doi.org/10.1515/9783110558142-034

34.1 Introduction

Mainland Southeast Asian languages are well known for their low use of strict grammatical marking and high context-dependence, leaving much of the communicative workload to the listener’s inference (Gil 2009; Bisang 2009; Enfield 2011). Being traditionally described as isolating languages, MSEAn languages do not exhibit extensive inventories of purely grammatical morphemes that are used obligatorily to express notions like tense, aspect, case relations, number, and others. If these categories are to be specified, more concrete lexical morphemes, e.g. adverbs and secondary verbs, are used. On the other hand, MSEAn languages typically have rich sets of pragmatic particles, often sentence or clause final, that cannot be assigned clear grammatical functions, but rather express notions on a meta level, such as the speaker’s attitude towards the situation described, the purpose of the utterance, the social relationship of the interlocutors, and more. Some languages of Myanmar, especially of the Tibeto-Burman (TB) family, tend to be more grammatically explicit, approaching South Asian typological profiles. The use of purely grammatical marking is often less consistent in spoken vernaculars than in standardized varieties, though, making the colloquial TB languages more MSEA-like. Traditional grammatical descriptions usually treat pragmatic features peripherally at best, focusing on more formal morphosyntactic topics. More recent grammars of SEAn languages tend to dedicate more space to pragmatic aspects (e.g. Iwasaki and Ingkaphirom [2005] for Thai; Jenny and Hnin Tun [2016] for Burmese). The importance of pragmatics in the grammars of MSEAn languages is reflected in a recent collection of grammar sketches of MSEA, which includes a section on pragmatics and discourse for each individual language (Vittrant and Watkins 2019a).

This chapter aims at giving a brief overview of the pragmatic functions and context-dependent structures in MSEAn languages. Context dependence is understood not only in the linguistic sense but also includes the extra-linguistic situational context of an utterance. Most examples given in this chapter are from three languages belonging to the three main language families of MSEA, namely Mon (Austroasiatic), Thai (Tai-Kadai), and Burmese (Sino-Tibetan). Most of the statements that can be made about these three languages are also valid for the majority of other languages of the area. If anything, the picture of pragmatics-based syntax found in less standardized, non-literary vernaculars is likely to be more prominent than in the literary languages. Not all processes described in this chapter are equally common in all languages of MSEA, and many are not restricted to the area. Word order variation for pragmatic reasons, for example, is common in the world’s languages and, besides prosodic variation, may be the most common means to express information structural functions (see Lambrecht 1994).

34.2 Basic syntactic structures and pragmatic variation

34.2.1 Word order variations

The two main basic word order types found in MSEAn languages are SV/AVP in the central and eastern parts, and SV/APV in the west. The latter is found in all Tibeto-Burman languages except for Karenic, the former in languages belonging to the non-TB families of the area. While these are given as basic word orders (Dryer 2013), a number of communication-related factors can interfere with the arrangement of elements in a clause. Although many SEAn languages exhibit great flexibility in their syntax, the possibilities of variation are not unrestricted but depend on more or less strict contextual rules. The details of these rules may vary for different languages, but the overall picture remains similar. Importantly, although fronting and right-dislocation of arguments are possible, the SV/AVP order is still grammatically relevant, as it determines the grammatical relations of Subject and Object in the non-verb-final languages of MSEA. This means that the object can be fronted to PAV order, but *PVA, with fronted P and right-dislocated A, is not commonly found. While pragmatic variation is important, pragmatics does not override syntax in the verb-medial MSEAn languages. In verb-final languages the arrangement of arguments is more pragmatically based, with topical arguments preceding focal arguments.

34.2.1.1 Movement of constituents

Determining the basic word order in any language is not trivial. What is described as ‘unmarked word order’ in a grammar may not be the most frequent pattern, and different clause types may exhibit different word orders or different preferences among them (see Enfield 2011). SV/AVP is generally considered to be the underlying order in central MSEAn languages (east of Myanmar), with other orders derived by pragmatic movements such as topic fronting. Alternatively, one might take a Topic-Comment arrangement as basic in MSEAn languages, including the verb-final languages of the western part (Vittrant and Watkins 2019b). As S/A frequently coincides with Topic, in many cases the outcome is identical, but the latter perspective denies underlying syntactic rules based on grammatical relations, with pragmatics being the sole (or at least most important) factor involved in clause structure. The assumption is that there are no pragmatically neutral contexts, all utterances being necessarily part of a communicative event and encoding a concrete pragmatic function. Even in constructed examples, native speakers will always have a concrete context in mind and construe the utterance according to this speech situation. On the other hand, subordinate clauses are often non-assertive, presupposed, backgrounded, and as such not accessible to pragmatic variation, suggesting that they exhibit what may be called the basic word order. Indeed, in most cases subordinate clauses in Mon and Thai only allow SV/AVP order. We may therefore take SV/AVP as underlying basic word order, with variations such as PAV and VS triggered by communicative factors. In the verb-final languages, the situation is similar, with APV and PAV occurring in main clauses, but only APV in most subordinate clauses. This suggests that AVP and APV are the basic word orders, with possible fronting of P in independent clauses. It should be kept in mind, though, that different word orders may apply in subordinate clauses by grammatical rules of individual languages. A case in point are Palaungic varieties (Austroasiatic, Myanmar), where independent clauses have AVP order (with possible pragmatic variation), while most subordinate clauses only allow VAP. This may reflect an idiosyncratic historical development rather than a basic word order that is available only in subordinate clauses (Jenny 2020). The following examples illustrate P-fronting in Mon (1), Thai (2), and Burmese (3).

(1) Mon
    kwah kjac hnòk kɔ̀h tak kɒ ɲìʔ.
    pupil holy be.big top to.beat to.give be.little
    ‘Hit your pupils for me, Reverend.’ (Jenny 2005)

(2) Tha
    nǎŋsɯ̌ː lêm níː (pʰǒm) jaŋ mâj dâj ʔàːn.
    book clf prox 1sm yet not to.get to.read
    ‘I haven’t read this book yet.’¹

(3) Bur
    ʔɛ̀.di lu (ŋa) twé-bù-dɛ tʰiɴ-dɛ.
    medl person 1s to.find-exper-nfut to.think-nfut
    ‘I think I’ve seen that guy before.’

Another possibility of movement is to place arguments outside the clause, either in the left or right periphery, as fronted topic and afterthought, respectively. Unlike fronted P, arguments displaced in this way are syntactically outside the clause and can be expressed within the clause by a resumptive (or “presumptive”) pronoun. Displacement to an extra-clausal position is not restricted to main clauses. Examples (4)–(6) illustrate extra-clausal arguments.

1 Examples without indication of a source are from the author’s own corpora, based on fieldwork in Thailand and Myanmar.


(4) Mon
    (ɲɛ̀h-tɔʔ) hùʔ tɛm, mùʔ hùʔ tɛm, krɤk.
    person-pl neg to.know what neg to.know Chinese
    ‘They didn’t know, they didn’t know anything, the Chinese.’ (Jenny 2015)

(5) Tha
    (man) mâj rúː ʔəraj, pʰɯ̂ən raw.
    3l not to.know what friend 1p
    ‘He doesn’t know anything, that friend of mine.’

(6) Bur
    (θu) ba-hmá mə-θí-bù, ʔɛ́-kauɴ.
    3 what-restr neg-to.know-neg medl.dep-clf
    ‘He doesn’t know anything, that guy.’

Two-step fronting, that is, fronting of the object (P) and the subject (A) to yield APV, is rare and usually indicated prosodically by a clear intonation break or overtly marked by a focus or topic marker, as in (7)–(8).

(7) Mon
    ʔəpa ʔuə kɔ̀h ʔərɛ̀k (ɲɛ̀h) lɛ hùʔ sɤŋ, ɓɔk lɛ hùʔ sɤŋ.
    father 1s top alcohol 3 add neg to.drink cigar add neg to.drink
    ‘My father, he doesn’t drink nor smoke.’ (Jenny 2005)

(8) Tha
    pʰôː raw nɪ̂ə lâw kɔ̂ mâj kin, bùʔrìː kɔ̂ mâj sùːp.
    father 1p top alcohol tcl not to.consume cigarette tcl not to.smoke
    ‘My father, he doesn’t drink nor smoke.’

In verb-final Burmese, more than one constituent may appear in the postverbal extraclausal position, as seen in (9), where both the subject and the object are moved to the right, the former optionally being expressed in the clause by a pronoun (Jenny and Hnin Tun 2016).

(9) Bur
    (θu) ʔəmjɛ̀.dàɴ pʰaʔ-ne-dɛ, θədìɴ-za, ŋá ʔəpʰe-gá.
    3 always to.read-stay-nfut news-text 1s.dep father-sbj
    ‘The newspapers, he always reads, my father.’

34.2.1.2 Interrogative fronting

Interrogatives (pronouns and adverbials) in MSEAn languages typically occur in situ, that is, they occupy the position in the clause where the constituent asked about would occur. Alternatively, interrogatives may appear in a (high) focal position, such as clause initial or immediately preverbal (in verb-final clauses). Thai allows fronting to clause-initial position only of adverbial interrogatives, in some cases with a difference in meaning from in situ position. The causal interrogative in Thai, tʰammaj ‘why’, occurs clause-initially when asking about a reason why something happens, and clause-finally when asking about the purpose of an event. This reflects the origin of the interrogative from tʰam-raj ‘do what’. Negated causal interrogative contexts can only be interpreted as asking about a cause or reason, not a purpose, so ‘why not’ can only occur in clause-initial position. The position is based on semantic, rather than pragmatic factors, as illustrated in examples (10a)–(10b).

(10) Tha
    a. kʰun maː tʰîː.nîː tʰammaj.
       2h to.come here why
       ‘What did you come here for?’
    b. mɯ̂ə.waːn tʰammaj mâj maː.
       yesterday why not to.come
       ‘Why didn’t you come yesterday?’
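The positional contrast described for tʰammaj can be restated as a small decision rule. The Python sketch below is only a summary of the generalization given in the text; the function name and the labels "initial" and "final" are assumptions introduced for illustration, and no claim is made about how speakers or parsers actually compute the reading.

```python
def thammaj_reading(position: str, negated: bool = False) -> str:
    """Return the reading of Thai tʰammaj 'why' for a given clause position.

    position: "initial" (before the predicate) or "final" (clause-final).
    negated:  True if the clause is negated ('why not').
    """
    if position == "initial":
        # Clause-initial tʰammaj asks about a cause or reason.
        return "reason"
    if position == "final":
        if negated:
            # 'Why not' can only ask about a reason, so it cannot be final.
            raise ValueError("negated 'why' questions are clause-initial only")
        # Clause-final tʰammaj asks about the purpose of an event.
        return "purpose"
    raise ValueError(f"unknown position: {position!r}")


print(thammaj_reading("final"))                  # as in (10a)
print(thammaj_reading("initial", negated=True))  # as in (10b)
```

Raising an error for the negated clause-final case is a design choice of the sketch: it marks the combination as excluded rather than assigning it a reading.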

If interrogative P or G arguments in Thai are fronted, a cleft construction is used, as in (11a)–(11b).

(11) Tha
    a. kʰun cʰɔ̂ːp ʔan nǎj.
       2h to.like clf which
       ‘Which one do you like?’
    b. ʔan nǎj tʰîː kʰun cʰɔ̂ːp.
       clf which rel 2h to.like
       ‘Which one is the one you like?’

In Burmese, interrogative pronouns and adverbials usually occur in the immediate preverbal slot, which is considered the main focus position. This reflects the focal function of content question words, as in (12). Adverbial and pronominal interrogatives may be fronted to clause-initial position for special emphasis, as seen in (13).

(12) Bur
    ʔɛ̀.di ʔəcàuɴ bɛ.ðú-go pjɔ̀-já-hma lɛ̀.
    medl matter who.dep-obj to.speak-get-nfut.nml cq
    ‘Who do I have to tell about this?’ (Jenny and Hnin Tun 2016)

(13) Bur
    ba-pʰjiʔ-ló ʔəmá mə-la-da lɛ̀.
    what-be-sub older.sister neg-to.come-nfut.nml cq
    ‘Why didn’t you come, sister?’

Mon allows interrogatives either in situ, fronted in clause-initial position, or both at the same time (14)–(15), in the case of mùʔ ‘what’ also reduplicated when occurring in situ (Jenny 2011, 2015). In the colloquial language, clause-initial interrogatives seem to be preferred. While the preference for preverbal (in Mon clause-initial) interrogatives may be indirectly influenced by Burmese, the structure in Mon goes back to Old Mon, where cleft constructions with interrogatives are common. With the drop of the relative marker in modern spoken Mon, the cleft could easily be reanalyzed as fronted interrogative (Jenny 2011).


(14) Mon
    ʔaŋkəlòc kɔ̀h mùʔ ɗɒc klɤŋ mùʔ.
    English top what to.ride to.come what
    ‘What did the English ride when they came here?’ (Jenny 2015)

(15) Mon
    paʔ mùʔ~mùʔ rao, ɗɔə càt kɔ̀h.
    to.do what~red cq loc performance medl
    ‘What do you do in that theater?’ (Jenny 2015)

At least in one case, the position of the interrogative adverbial in Mon is based on semantic, rather than pragmatic factors, as in Thai. In Mon it is the temporal interrogative chəlɔʔ ‘when’ which occurs clause-initially when asking about future events, clause-finally when asking about past events.

34.2.1.3 Argument drop

MSEAn languages can be described as radical argument-dropping (pro-drop, zero anaphora) languages. Known or retrievable arguments, either from the linguistic context or the extralinguistic speech situation, generally may or must be omitted, irrespective of the syntactic or semantic role and referential properties of the argument.² Pronominal anaphora is possible, but not obligatory in most contexts. In some languages, arguments can also be dropped when there is no topic continuity, that is, when the dropped argument is not coreferential with the nearest antecedent (16), and arguments may be dropped if they are not coreferential with each other or any overt argument in the context, as in (17). Here the first gap refers to the first person plural, the second gap to third person plural, and neither is coreferential with the only overt argument in the clause. This general tendency to drop arguments is one of the main factors that contribute to the heavy conjectural and context-dependent profile of MSEAn languages.

(16) Mon
    həmɛ̀əᵢ tɒn klɤŋ Øⱼ phɔc ɗɛhᵢ rɔ̀p Øⱼ kɤ̀ʔ.
    Bama move.up to.come to.fear 3 to.catch to.get
    ‘The Burmese came up and (we) were afraid (they) would catch (us).’ (Jenny 2005)

(17) Mon
    ɓaŋ.kja kɔ̀h tao Øᵢ phɔc Øⱼ khjɒt.
    airplane medl to.burn to.fear to.die
    ‘That airplane caught fire and (we) were afraid (they) would die.’ (Jenny 2015)

2 Examples of dropped arguments are notoriously hard to find in (short) grammatical descriptions of many MSEAn languages, as grammars focus on demonstrating where elements are placed in a sentence.




In the following example (18), the omitted argument is interpreted as first person only based on the context. Reference to the second or third person is just as likely in a different context.

(18) Tha
    kʰǐən wáj diː kwàː, Ø càʔ dâj mâj lɯːm.
    to.write to.keep be.good more irr to.get not to.forget
    ‘It’s better to write (it) down so that (I) don’t forget (it).’ (Iwasaki and Ingkaphirom 2005)

The translation of (19) is appropriate in a context where someone asks about where a friend was, whether he was coming.

(19) Bur
    Øᵢ Øⱼ la-kʰàiɴ-lɛ̀ Øⱼ mə-la-bù.
    to.come-order-add neg-to.come-neg
    ‘(I) told (him) to come but (he) wouldn’t show up.’

Although argument drop is widespread in MSEAn languages, there are restrictions to it. In many languages, argument drop is possible only if the omitted referent is retrievable from the context. Non-referential arguments must be expressed either by a generic noun or an appropriate pronoun, as in (20) and (21).

(20) Tha
    miː kʰon maː kʰəmoːj dɔ̀ːk.máːj nâː bâːn.
    to.have person to.come to.steal flower face house
    ‘Someone stole the flowers in front of the house.’

(21) Mon
    mənìh prèə kɔ̀h ɲɛ̀h kok.khao ciəʔ mìʔ.lìm.cənaj raʔ
    human female medl/top person to.call eat pn foc
    ‘The woman was called Mi Lim Canay.’ (Jenny 2015)

34.2.2 Information structural organization

Apart from word order variation for pragmatic reasons, MSEAn languages also have other means to organize information structure in a sentence. In the following subsections, the most prominent of these are presented, without any claims to completeness.

34.2.2.1 Topic prominence

Topic prominence in MSEAn languages is defined as a type of sentence organization in which “a topical nominal appears in initial position, external to the clause that follows but semantically connected in that it sets the scope of what is to come” (Enfield 2005: 189–190). The topic is syntactically disconnected from the clause, only connected to it by semantic association. This topic-comment structure is illustrated by expressions like (22), where the locative ‘in Thailand’ is not marked as such.

(22) Tha
    mɯəŋ-tʰaj ʔaːkàːt rɔ́ːn mâːk.
    land-Thai weather be.hot be.much
    ‘The weather in Thailand is very hot.’

Though the typical position of the topical constituent is clause-initial (or pre-clausal), it may also appear after the clause as an afterthought. In addition to nominals, as described by Enfield (2005), the topic may be any other constituent of the clause, including a verb or adverbial, or a subordinate clause. The topic may or may not be overtly marked as such (see 34.3.1 below), and it may appear resumptively within the clause as seen in (23), where the verb sà ‘to eat’ is obligatorily present within the clause, and (24), where the subject may or may not appear as resumptive pronoun in the clause.

(23) Bur
    sà-dɔ́ mə-sà-naiɴ-dɔ́-bù.
    to.eat-contr neg-to.eat-capable-contr-neg
    ‘As for eating, I’m done.’

(24) Tha
    (man) mâj dâj rɯ̂əŋ, nǎŋ rɯ̂əŋ nîə
    3l not to.get matter movie clf prox.top
    ‘This movie is really nonsense.’

34.2.2.2 Double SBJ

Clauses that have two NPs in the preverbal subject position, as described for Mandarin Chinese, are common in MSEAn languages. Double subjects are defined as a “subset of topic-comment sentences in which there happens to be a particular semantic relationship between the topic and the subject, which we may call part-whole” (Li and Thompson 1981: 93; Chappell 1996). These constructions do not involve a fronted P, but rather a topical NP and an NP in subject position that belongs to the comment part of the utterance. In this way, double subject constructions are a subcategory of general topic constructions as described in 34.2.2.1. Both preverbal NPs may exhibit subject properties, as can be seen in Burmese, where either may take the subject marker -ká/-gá, depending on which subject is contrasted with another (possible) one, as illustrated in (25)–(26). On the other hand, only the clause-internal subject may trigger plural marking on the verb, not the preclausal topic NP.

(25) Bur
    ʔəko-gá ʔəpjɔ̀ kàuɴ-dɛ.
    older.brother-sbj speech be.good-nfut
    ‘Your speech is good.’ (Jenny and Hnin Tun 2016)

(26) Bur
    ʔəkó ʔəpjɔ̀-gá kàuɴ-dɛ, ʔəlouʔ-ká mə-kàuɴ-bù
    older.brother speech-sbj be.good-nfut work-sbj neg-be.good-neg
    ‘Your speech is good, brother, but your acts are not.’

Double subject constructions are common in Thai, often being translation equivalents of possessive constructions in European languages, and the topic NP can be seen as fronted possessor, as in (27).

(27) Tha
    pʰɯ̂ən kɛː bâːn jàj mâːk náʔ.
    friend 2l house be.big be.much emph
    ‘Your friend’s house is really huge.’

They are especially common in, but not restricted to, body part expressions and sensations, also known as psycho-collocations, as in (28) (Matisoff 1986; Iwasaki and Ingkaphirom 2005).

(28) Tha
    kʰon níː taː bɔ̀ːt, sùən kʰon nán hǔː nùək.
    person prox eye be.blind part person medl ear deaf
    ‘This guy is blind, that one is deaf.’

In languages that lack a transitive possessive verb ‘to have’, double subject constructions may be used to express predicative possession, as seen in the Mon examples (29) and (30).

(29) Mon
    rɔ̀ə ʔuə kà nùm ɓa.
    friend 1s car to.exist two
    ‘My friend has two cars.’

(30) Mon
    ʔəpa ɗɛh jəmùʔ kjɛ.làj, ʔəmè ɗɛh jəmùʔ tan.ɲòʔ
    father 3 name pn mother 3 name pn
    ‘Father’s name is Kyae Lai, mother’s name is Than Nyunt.’ (Jenny 2015)

34.2.2.3 V-S besides S-V

In intransitive clauses, the normal word order is SV in all MSEAn languages. The inverse order is found in Thai and Mon, but hardly ever in Burmese, which is generally verb-final, in cases where the verb appears as topic and the subject is the comment. This is regularly the case in presentational and existential constructions, following worldwide tendencies, as seen in (31) and (33). Other verbs that tend to trigger subject-verb inversion include ‘to be left’, ‘to lack’, ‘to be used up’, all of which may be seen as existentials in a broad sense, as illustrated in (32a)–(32b) and (34a)–(34b).

(31) Mon
nùm mɔ̀ŋ chaʔ pɤŋ, hwaʔ ʔɒt ʔa jaʔ.
to.exist to.stay excl cooked.rice curry be.all to.go nsit
‘There’s only rice, the curry is used up.’ (Jenny 2019)


 Mathias Jenny

(32) Mon
a. seh mɔ̀ŋ chaʔ ʔuə.
   be.left to.stay excl 1s
   ‘I’m the only one left.’ (Jenny 2019)
b. ʔuə seh mɔ̀ŋ phɤh.
   1s be.left to.stay still
   ‘I’m still here.’ (Jenny 2019)

(33) Tha
miː tɛ̀ː kʰâːw, kɛːŋ mòt lɛ́ːw.
to.have only rice curry be.all nsit
‘There’s only rice, the curry is used up.’

(34) Tha
a. lɯ̌ə tɛ̀ː pʰǒm kʰon diəw.
   be.left only 1sm clf single
   ‘I’m the only one left.’
b. pʰǒm jaŋ lɯ̌ə jùː
   1sm still be.left to.stay
   ‘I’m still here.’

Verb-subject inversion is also common with quantified subjects (see 34.2.2.4).

34.2.2.4 Right drift of new, specific information

The basic clause arrangement in MSEAn languages tends to proceed from general, known information towards more specific, new information. This can be seen as an overall topic-comment arrangement, combined with a funnel-like structure, with the most specific, newest piece of information being placed at the end of the utterance. This arrangement is evident from the position of adverbials, but also of nominal expressions, especially quantifiers, which frequently appear at the end of the clause, separated from the NP they refer to. In Thai, the obligatory use of a numeral classifier usually unambiguously indicates which NP the quantifier phrase refers to. In example (35), the first constituents set the scene, with information getting more specific towards the end of the sentence. The quantifier ‘two persons’ at the very end refers back to and specifies the noun ‘friend’ near the beginning of the sentence. Unlike afterthoughts, right-shifted elements providing specific information are not separated from the clause by an intonational break, although the distinction may be subtle (and spurious) in many cases.

(35) Tha
mɯ̂ə.waːn pʰɯ̂ən maː kʰuj rɯ̂əŋ ŋaːn tʰîː bâːn sɔ̌ːŋ kʰon.
yesterday friend to.come to.talk matter work place house two clf
‘Yesterday, two friends of mine came to talk about business.’




The right drift of specific information also leads to subject-verb inversion if the subject is a quantifier, as in (36) and (37). In these cases, the most likely explanation is that a preverbal subject NP is dropped and only the more relevant quantifier phrase is retained, in its usual clause-final position.

(36) Mon
(rɔ̀ə tɔʔ kɔ̀h) khjɒt sɒm həʔɒt raʔ.
friend pl medl/top to.die incl all foc
‘All of them died.’

(37) Tha
róp kan kʰráŋ nán taːj paj lǎːj kʰon.
to.fight recp time medl to.die to.go be.numerous clf
‘During that fight, many people died.’

34.2.3 Pragmatic argument marking

Case marking is not a common feature of MSEAn languages, but grammatical morphemes marking grammatical and/or semantic relations are found in Burmese and some other languages in the western parts of MSEA. While Literary Burmese has a set of postpositional markers, namely -θi/ði for subject (nominative), -ko/go for direct object (accusative) and -ʔà for indirect object (dative), which are used more or less consistently, as illustrated in (38), colloquial Burmese is more flexible in the application of the two main case suffixes, -ká/gá for (contrastive) subjects and -ko/go for primary objects. The more isolating languages east of Myanmar do not exhibit any overt core case marking, but there are syntactic means to highlight objects (P) in some contexts (Enfield 2011).

(38) Bur
sʰəja-ði càuɴ.ðà-ʔà sa.ʔouʔ tə-ʔouʔ-ko pè-ði.
teacher-nom student-dat book one-clf-acc to.give-nfut
‘The teacher gave the student a book.’

34.2.3.1 Differential Object Marking (DOM)

Colloquial Burmese exhibits Differential Object Marking (DOM), as overt marking of objects depends on semantic and pragmatic factors (Sawada 1995; Jenny and Hnin Tun 2016). The marker -ko/go is regularly used with human referents in patient (P, direct object) or recipient (R, indirect object) roles (39)–(40). These may be expressed by pronouns, proper names, or social or kinship terms, with considerable overlap among these categories.

(39) Bur
ʔəmá-go twé-da là.
older.sister-obj to.find-nfut.nml pq
‘Did you see your sister?’


(40) Bur
cənɔ́-go tə-kʰwɛʔ-lauʔ pè-ba.
1sm.dep-obj one-glass-as.much.as to.give-import
‘Give me just one glass (of water).’

Human referents as objects remain unmarked if they are not specific, as seen in the contrast between (41a) and (41b), both from Jenny and Hnin Tun (2016).

(41) Bur
a. θu mèiɴ.má tə-jauʔ-ko ɕa-ne-dɛ.
   3 woman one-clf-obj to.seek-stay-nfut
   ‘He is looking for a woman (a specific one).’
b. θu mèiɴ.má tə-jauʔ ɕa-ne-dɛ.
   3 woman one-clf to.seek-stay-nfut
   ‘He is looking for a woman (any one would do).’

Non-human referents, including inanimate objects, can be marked overtly if they are salient in the discourse in some way. This includes topical (42), definite (43), as well as specific objects (44). Recipients are almost always marked by -ko/go, which suggests syntactic marking, but as recipients are human or high animate in most real-world contexts, the reason for the overt marking may be semantic as well.

(42) Bur
[How much is that book?]
di sa.ʔouʔ-ko mə-jàuɴ-bù.
prox book-obj neg-to.sell-neg
‘This book is not for sale.’

(43) Bur
cənɔ ʔɛ̀.di sa.ʔouʔ-ko pʰaʔ-cʰiɴ-dɛ
1sm medl book-obj to.read-des-nfut
‘I would like to read that book.’

(44) Bur
kà də-zì-go cənɔ-dó ʔəkouɴ.lòuɴ hŋà-dɛ.
car one-clf-obj 1sm-pl all to.rent-nfut
‘We all rented a car.’

In literary Burmese, the markers for direct and indirect object may cooccur in the same (ditransitive) clause, but only one argument in a clause can be marked as object by -ko/go in colloquial Burmese.3 This potentially leads to a conflict of determining factors (and potential misunderstandings, which may be exploited humorously), especially when both the theme (T) and recipient (R) in a ditransitive expression are human. In natural speech, such conflicts hardly occur, though, as it is uncommon for a clause to have more than two overt arguments.

3 The presence of the object marker ko/go does not preclude an allative marked with the homonymous marker. This suggests that the two markers, though etymologically identical, are syntactically different in Burmese.
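
The tendencies described above for the colloquial Burmese object marker -ko/go can be summarized, very roughly, as a small decision procedure. The following Python sketch is only an illustration under simplifying assumptions: the class `ObjectNP`, its feature names, the function `ko_marking`, and the three output labels are hypothetical and not part of the cited descriptions, and the restriction to one -ko/go-marked argument per clause is not modeled.

```python
from dataclasses import dataclass


@dataclass
class ObjectNP:
    """Hypothetical feature bundle for an object argument (illustration only)."""
    role: str               # "P" (patient), "T" (theme) or "R" (recipient)
    human: bool
    specific: bool = False
    topical: bool = False
    definite: bool = False


def ko_marking(arg: ObjectNP) -> str:
    """Rough likelihood of overt -ko/go: 'expected', 'possible' or 'unlikely'."""
    if arg.role == "R":
        # Recipients are almost always marked.
        return "expected"
    if arg.human:
        # Human objects are regularly marked, but stay unmarked if non-specific.
        return "expected" if arg.specific else "unlikely"
    # Non-human objects can be marked when salient in the discourse.
    if arg.topical or arg.definite or arg.specific:
        return "possible"
    return "unlikely"


# The feature values below are this sketch's reading of examples (40)-(42).
print(ko_marking(ObjectNP("R", human=True, specific=True)))    # (40) 'me'
print(ko_marking(ObjectNP("P", human=True, specific=True)))    # (41a)
print(ko_marking(ObjectNP("P", human=True, specific=False)))   # (41b)
print(ko_marking(ObjectNP("P", human=False, topical=True)))    # (42)
```

The sketch returns graded labels rather than a yes/no answer because the marking of non-human objects is described as a possibility conditioned by discourse salience, not as a categorical rule.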




Objects in Thai and Lao can be foregrounded in complex verbal predicate expressions of the pattern [A take P V] for the basic structure [A V P]. This fronting strategy is available only to direct objects (P and T), not recipients (R), and the fronted object must be physically or metaphorically handleable, as seen in (45). This construction is especially common in ditransitive expressions denoting events of transfer, apparently to avoid more than two overt arguments with one verbal predicate. The pattern [A give T R] is replaced by [A take T give R], as in (46).

(45) Tha
man ʔaw ŋɤn paj cʰáj mòt lɤːj.
3l to.take silver/money to.go to.use be.all emph
‘He (took and) used up all the money.’

(46) Tha
lûːk ʔaw pʰǒnləmáːj maː hâj mɛ̂ː.
offspring to.take fruit to.come to.give mother
‘The child brought her mother fruit.’

34.2.3.2 DSM

Ergative marking is common in several Tibeto-Burman languages (LaPolla 1995). The use of the ergative marker can easily be extended to cover agent-like or foregrounded arguments of intransitive expressions, resulting in Differential Agent Marking (DAM). Burmese is different from the more western TB languages in that it does not show any ergative-like features, neither in its syntax nor in its morphology. Subjects (A and S arguments) are regularly marked as such by the marker -θi/ði in the literary language. In colloquial Burmese, the marker -ká/gá optionally marks S/A, depending on various pragmatic factors. The main function of the Differential Subject Marker (DSM) is to foreground or contrast the subject with another possible referent or other possible referents, which may or may not be present in the linguistic context (Sawada 1995; Jenny and Hnin Tun 2013, 2016). Examples (47) and (48) illustrate the subject marker in colloquial Burmese.

(47) Bur
θu-gá mə-θwà-jiɴ cənɔ-gá-lɛ̀ mə-θwà-bù.
3-sbj neg-to.go-if 1sm-sbj-add neg-to.go-neg
‘If he isn’t going, I’m not going either.’

(48) Bur
ʔəbá pjɔ̀-ðwà-bí sʰo-dɔ́, cənɔ-gá mə-tu-da-bɛ̀ pjɔ̀-laiʔ-mɛ.
father to.speak-seq to.say-contr 1sm-sbj neg-be.same-nfut.nml-excl to.speak-follow-nfut
‘Since father has already talked about it, I will only talk about other things.’ (Jenny and Hnin Tun 2013)


The same marker ká/gá is also used to mark ablative relations (‘from’) and past temporal adverbials. The colloquial Burmese ablative marker also appears in elaborate forms such as ká/gá-ne, ká/gá-ne-pì-dɔ́ ‘from being at’, and pʰɛʔ-ká ‘from the side of’, as seen in (49). The same constructions are also available for DSM, suggesting that the connection between Subject and Ablative marking is synchronically transparent in Burmese.4 These more transparent expressions have more recently also spread to some genres of Mon, especially modern prose, which is heavily influenced by Burmese (Jenny and Hnin Tun 2013). (50) shows the use of Burmese-style subject marking by kəpac ‘side’ in modern Mon.

(49) Bur
θin.dàɴ-gá cənɔ-dó-pʰɛʔ-ká.ne pè-já-mɛ.
class-fee 1sm-pl-side-abl to.give-get-fut
‘We (are the ones who) will have to pay the class fee.’ (Jenny and Hnin Tun 2013)

(50) Mon
kəpac wətɒə mùʔ tɔ̀h mɔ̀ŋ lɛ hùʔ həca klɤŋ pùh.
side pn what to.be to.stay add.top neg to.consider to.come neg
‘He didn’t think about how Wati was doing at all.’ (Jenny 2019)

34.3 Overt marking of information structure

In general, MSEAn languages do not mark topical and focal constituents by intonation, but rather use syntactic means.5 Apart from syntactic variation at the clause level, many MSEAn languages also have dedicated markers for pragmatic functions, often, but not necessarily, overlapping with deictics. The main use of overt pragmatic markers is related to information structure, as illustrated in the following subsections. While maintaining the traditional dichotomy of topic vs. focus here for ease of presentation, it must be kept in mind that the two notions do not necessarily exclude each other, depending on the definition of topic and focus (Lambrecht 1994; Krifka 2008). In many cases a marker cannot be easily assigned to either topic or focus function, even if described traditionally as being one of these. In many grammatical descriptions, pragmatic markers appear under different labels, such as “emphatic” or “contrastive” markers, without assigning them to information structural functions.

4 The fact that both may appear in the same clause shows that they are syntactically different, though.
5 Mon and other non-tonal languages are generally freer in the application of intonation for pragmatic expression.




34.3.1 Topic markers

Topic markers are often derived functions of demonstratives, making use of their anaphoric value. This is true for Mon, where the medial demonstrative kɔ̀h is widely used to mark topical elements. The topic marker kɔ̀h can co-occur with the proximal or medial demonstratives, nɔʔ and kɔ̀h respectively, resulting in constructions of the form [N nɔʔ kɔ̀h] and [N kɔ̀h kɔ̀h], where the first element is the demonstrative and the second the topic marker, which receives the main stress. In extended function, kɔ̀h marks an element as referential, non-predicative. In (51), which was uttered as an answer to the question ‘How did the British come to your village’, the first occurrence of kɔ̀h marks the clause ɗɒc klɤŋ la ‘come riding on donkeys’ as topic, while the second occurrence has scope only over the noun la ‘donkey’.

(51) Mon
ʔe, ɗɒc klɤŋ la. ɗɒc klɤŋ la kɔ̀h toə teh la kɔ̀h tʰɒʔ-tʰɒʔ həʔɒt.
filler to.ride to.come donkey to.ride to.come donkey top be.finished top donkey top to.discard-discard all
‘Well, they came riding on donkeys. They came riding on donkeys, and then they just got rid of those donkeys.’ (Jenny 2015)

The marker kɔ̀h can be attached to any kind of constituent, including whole clauses, turning them into quasi-nominals. In this case, they are interpreted as subordinate, often, but not exclusively, complement or attributive clauses (52).

(52) Mon
ɗɔə kon.kemɔʔ mɔ̀n.sɒc mùʔ nùm mɔ̀ŋ kɔ̀h məkɤ̀ʔ tɛm teh
loc heart pn what to.exist to.stay top want.to to.know top
‘If you want to know what’s in Mon Saik’s heart …’ (Jenny 2015)

Another topic marker used in spoken Mon is teh, which also occurs clause-finally to give a conditional reading, as in (52) above (see also Jenny 2011). This marker of unknown origin is not found in the literary language apart from very recent texts. The exact functional difference between teh and kɔ̀h as post-nominal topic markers is not clear, but teh may be more contrastive than kɔ̀h (53a)–(53b).

(53) Mon
a. ʔuə kɔ̀h klon hùʔ màn pùh.
   1s top to.make neg to.win neg
   ‘As for me, I can’t do it.’
b. ʔuə teh klon hùʔ màn pùh.
   1s top to.make neg to.win neg
   ‘If it were me, I couldn’t do it.’


The marker rao occurs after a constituent to mark it as alternating or negative topic (‘as for X, on the other hand; and what about X?’) and in clause-final position in content questions, as illustrated in examples (54) and (55). The additive topic marker lɛ, a loan from Burmese lɛ̀, is the translation equivalent of ‘also, even, though’. It always occurs after the topical constituent or clause, as in (56).

(54) Mon
ʔuə ʔa pɤ̀ pàɲsəkɤk. pèh rao.
1s to.go to.watch movie 2s top
‘I’m going to the movies. What about you?’

(55) Mon
lèə rao lèə kɒ ɲɛ̀h.
to.tell top to.tell to.give person
‘You didn’t even tell him about it.’ (Jenny 2005)

(56) Mon
kəlon lɛ hùʔ klon pɤŋ lɛ hùʔ ciəʔ.
work add neg to.make cooked.rice add neg to.eat
‘He doesn’t work nor eat.’ (Jenny 2005)

Thai uses the demonstratives níː/nîː ‘proximal’ and nán/nân ‘medial’ for anaphoric and contrastive topics. The choice of the proximal or medial demonstrative depends not exclusively on the spatial or textual distance, but also on the association or dissociation of the speaker with the referent. As in Mon, the presence of a demonstrative does not exclude the use of a topic marker. If both demonstrative and topic markers are present, they do not have to harmonize in their distance values, that is, the proximal demonstrative can be combined with the medial topic marker, the former expressing spatial, the latter emotional distance. When combined with a demonstrative, the topic marker usually takes the falling tone, nîː (or nîə) and nân,6 distinguishing it from the demonstrative in the high tone, níː and nán (57)–(58).

(57) Tha
pʰaːsǎː ʔaŋkrìt nîə wan wan nɯ̀ŋ cʰáj bɔ̀j kʰɛ̂ː.nǎj.
language English top.prox day day one to.use often how.much
‘The English language, how often do you use it in a day?’ (Iwasaki and Ingkaphirom 2005)

(58) Tha
rót kʰan níː nîə man rɛːŋ diː náʔ
car clf prox top.prox 3l be.strong be.good emph
‘This car is really powerful, isn’t it.’

6 The falling tone on demonstratives in Thai also occurs on pronominal demonstratives: nîː ‘this one’, nân ‘that one’.




Contrastive topics in Thai can be overtly marked by the noun sùən ‘part’, as in (59).

(59) Tha
pʰîː pʰǒm pen mɔ̌ː, sùən nɔ́ːŋ pen kʰruː.
older.sibling 1sm to.be doctor part younger.sibling to.be teacher
‘My older sister is a doctor, my younger brother, on the other hand, is a teacher.’

Interrogative topics in Thai take the form [lɛ́ːw X lâʔ] or [X lâʔ]7 if a previous question is asked about a further referent, ‘and what about X?’, as in example (60).

(60) Tha
pʰǒm mâj cʰɔ̂ːp ʔaːhǎːn tʰəleː. lɛ́ːw kʰun lâʔ.
1sm not to.like food sea nsit 2 top.inter
‘I don’t like seafood. Do you?’

Unlike Mon and Thai, Burmese rarely makes use of demonstratives to mark topics. The medial demonstrative ʔɛ̀, ʔɛ̀.di occurs after anaphoric topics, but more common are the contrastive marker tɔ́/dɔ́ (61a) and (62a), and the additive marker lɛ̀ (61b) and (62b) in topicalizing function. Both markers can be attached to any kind of constituent, including clauses. When the contrastive marker occurs with subjects, the DSM marker ká/gá is usually present.

(61) Bur
a. mìɴ mə-caiʔ-pʰù, ŋa-gá-dɔ́ ʔəjàɴ caiʔ-tɛ.
   2fam neg-to.like-neg 1s-sbj-contr very to.like-nfut
   ‘You don’t like it, but I like it very much.’
b. mìɴ θwà-jiɴ ŋa-lɛ̀ θwà-mɛ.
   2fam to.go-if 1s-add to.go-fut
   ‘If you go, I’ll go too.’

(62) Bur
a. mò-gá di-né-dɔ́ mə-jwa-bù.
   sky-sbj prox-day-contr neg-to.rain-neg
   ‘Today it’s not raining (it was yesterday).’
b. mò-gá di-né-lɛ̀ mə-jwa-bù.
   sky-sbj prox-day-add neg-to.rain-neg
   ‘Today it’s not raining (it wasn’t yesterday either).’

Interrogative topics in Burmese can be marked by kɔ̀/gɔ̀ (or the variant jɔ̀) if a previous question is asked about a further referent, ‘and what about X?’, illustrated in (63) and (64).

7 The particle lâʔ is a weak form of both the nsit marker lɛ́ːw and the discourse particle lâw ‘as for, speaking of’ (from the verb lâw ‘to tell a story’).

(63) Bur
ŋa-gá ʔəlouʔ louʔ-ne-dɛ. mìɴ-gɔ̀.
1s-sbj work to.do-stay-nfut 2fam-top.inter
‘I am working. What about you?’

(64) Bur
θu wɛʔ-θà mə-sà-bù. cɛʔ-θà-gɔ̀
3 pig-meat neg-to.eat-neg chicken-meat-top.inter
‘He doesn’t eat pork.’ ‘What about chicken?’

This marker kɔ̀ resembles the Topic-Comment Linker common in Thai and many other MSEAn languages, as illustrated in 34.3.3, though the connection is obscure.

34.3.2 Focus markers

The focus marker raʔ in Mon originates in a weak form of the equational copula in Old Mon, tɔ̀h in spoken Mon. This etymology suggests that the marker goes back to a cleft construction of the form ‘it is X that’ (Jenny 2006). Similar to the extended use of the demonstrative-topic marker kɔ̀h, the focus marker raʔ is applied in a broader function to mark an element as predicative, irrespective of its syntactic class. The topic-comment organization of the clause is typically expressed by the pattern [X kɔ̀h Y raʔ], where X and Y can be any kind of constituent, as in (65) and (66). This pattern naturally leads to raʔ frequently occurring in clause-final position, where it can be reanalyzed as SFP corresponding to the Burmese non-future marker tɛ/dɛ.

(65) Mon
hmoɲ ʔələwìʔ kɔ̀h nùm kɒ krɒə.cɔ̀h hnòk tao pùə.mə.lòn raʔ.
king pn medl/top to.exist obl glory be.big to.stay exceedingly foc
‘King Alawi was of great glory.’ (Jenny 2015)

(66) Mon
pjɤ̀ pɤŋ kɔ̀h ʔuə raʔ.
be.hungry cooked.rice top 1s foc
‘The one who is hungry is me.’

Burmese uses the exclusive focus marker pɛ̀/bɛ̀ ‘only, merely’ for verbal, adverbial, and nominal constituents (67)–(68).

(67) Bur
ʔəkʰú-bɛ̀ zè jauʔ-la-dɛ.
now-excl market to.arrive-come-nfut
‘Just now I arrived at the market.’

(68) Bur
sʰəja-bɛ̀ pjɔ̀-tʰà-dɛ.
teacher-excl to.speak-keep-nfut
‘It’s the teacher who said it.’




The exclusive marker pɛ̀/bɛ̀ is frequently used to mark a non-verbal constituent as predicative, as in (69).

(69) Bur
θwà-ɟiɴ-dɛ́ lu-gá cənɔ-bɛ̀.
to.go-des-nfut.dep person-sbj 1sm-excl
‘The one who wants to go is me.’

Clausal and some adverbial constituents may also take the restrictive marker hmá ‘only if, only when, not unless’ (Jenny and Hnin Tun 2016). Although pɛ̀/bɛ̀ and hmá cover similar semantics, their functional range is quite different, and they are not in all cases interchangeable. In the function as clausal focus marker, hmá has also been borrowed into Mon as hmaʔ, as in (70a)–(70b).

(70)
a. Bur
   ʔəlouʔ pì-hmá tʰəmìɴ sà-mɛ.
   work be.finished-restr cooked.rice to.eat-fut
   ‘I will only eat once I’ve finished my work.’
b. Mon
   kəlon toə hmaʔ ciəʔ pɤŋ.
   work be.finished restr to.eat cooked.rice
   ‘I will only eat once I’ve finished my work.’

In Thai, focus can be indicated by the postposed marker lɛ̀ʔ, usually added to a fronted or floating constituent (71). Often, the topical part of the utterance is adjoined to the focus in a relative construction of the pattern [X lɛ̀ʔ rel Y], literally ‘it’s X that Y’, as in (72).

(71) Tha
[‘Who is going to do the dishes?’]
tʰɤː níː lɛ̀ʔ
2fam prox foc
‘You are.’

(72) Tha
nǎŋ.sɯ̌ː lêm níː lɛ̀ʔ (tʰîː) hǎː maː tâŋ naːn.
book clf prox foc rel to.seek to.come all.of be.long.time
‘This is the book I’ve been looking for all that time.’


34.3.3 Topic-comment-linkers

Thai has a very frequent particle kɔ̂,8 which covers a wide range of functions (or translation possibilities). It appears alone, before, after, or between constituents of any type and indicates some sort of semantic connection between the preceding and following chunks of speech. Iwasaki and Ingkaphirom (2005) describe the “five major functions” of kɔ̂ as “a nominal linker (‘also’), a clause linker (‘so’), as discourse marker (‘and then’), response marker (‘Well …’), and a marker of criticism and disappointment”. Many MSEAn languages have particles with similar, but not necessarily completely coextensive ranges of functions, described as “sequential indicator” by Burusphat (2008). Burusphat (2008) suggests a Khmer origin of this marker, though its occurrence in Shan and other Tai languages beyond the sphere of Khmer influence makes this etymology questionable. It is evident that the form kɔ̂ has been borrowed from Thai into several languages in MSEA, including unrelated Khmu (kɔɔ) and possibly Moken (kaʔ), and the connection with look-alikes like Burmese kɔ̀ ‘and what about X’ is unclear. There may well be pull-and-push interference among the languages of MSEA, with diachronically distinct morphemes converging and diverging in different contact scenarios. The full pattern [X kɔ̂ Y] expresses a logical connection between X and Y, where X is topical and Y the comment made about X. The exact nature of the connection is based on the semantics of X and Y, as well as on the situational context. While ‘also’ is a possible and frequent translation in the case of X=NP, as in (73) and (74), it is not the only possibility. In some contexts ‘on the other hand’, ‘but’, or others are adequate.

(73) Tha
ʔaːcaːn kɔ̂ bɔ̀ːk náʔ wâː pʰrûŋ.níː càʔ sɔ̀ːp.
teacher tcl to.tell emph to.say tomorrow irr take.exam
‘The teacher also said that we’ll have an exam tomorrow.’

(74) Tha
pʰaːsǎː ciːn kɔ̂ kʰɤːj riən.
language Chinese tcl be.used.to to.learn
‘I have also studied Chinese at some point.’

When linking clauses, kɔ̂ marks any logical connection, causality, conditionality and contradiction being only some possible interpretations, as in (75a)–(75b). The overall function of kɔ̂ is best termed Topic-Comment Linker (TCL), with the concrete interpretation varying according to the context.

8 The short vowel without final glottal stop in an open syllable is phonologically unexpected in Thai, as is the spelling in Standard and Northern Thai as and , respectively, with no written vowel in an open full syllable.



(75) Tha
a. mâj miː ŋɤn kɔ̂ maː dâj.
   not to.have silver/money tcl to.come to.get
   ‘If you don’t have any money, you can still come.’
b. miː ŋɤn kɔ̂ maː dâj.
   to.have silver/money tcl to.come to.get
   ‘If you have money you can come.’

The presence of a floating (preclausal) topic does not interfere with the function of kɔ̂ as TCL within the clause, as seen in (76).

(76) Tha
ráːn níː nîə kin kɔ̂ ʔərɔ̀j raːkʰaː kɔ̂ mâj pʰɛːŋ.
shop prox top.prox to.eat tcl be.delicious price tcl not be.expensive
‘This restaurant is good and not expensive.’

That kɔ̂ functions as a Topic-Comment Linker is also shown by the fact that the scope of the topic shifts with the position of kɔ̂, which can occur before or after the subject. In the former case, the subject is part of the comment, in the latter case it is (part of) the topic. This difference is illustrated in (77a)–(77b).

(77) Tha
a. kɔ̂ pôː kʰǎw riən pʰaːsǎː jîːpùn jùː.
   tcl pn 3hum to.learn language Japan to.stay
   ‘Well, (that’s because) Poo is studying Japanese.’
b. pôː kʰǎw kɔ̂ riən pʰaːsǎː jîːpùn jùː.
   pn 3hum tcl to.learn language Japan to.stay
   ‘Poo is also studying Japanese.’ (Iwasaki and Ingkaphirom 2005)

Taking the full pattern of kɔ̂ as starting point, and given the possibility and tendency to drop known elements in an utterance, the interpretation of the reduced patterns [Ø kɔ̂ Y], [X kɔ̂ Ø], and [Ø kɔ̂ Ø]9 is rather transparent, but highly context dependent, as seen in (78)–(80). A construction like kɔ̂ diː ‘tcl be.good’ is understood as ‘what has been suggested or said is (also, still) good’. The function of kɔ̂ is to overtly link the comment ‘is good’ to an unexpressed topic, which is available to the speaker and listener in the given speech situation. If the topic is present, but the comment is omitted, as in wan.níː kɔ̂ː ‘today tcl’, the effect is one of suspense, expecting a comment to follow or be omitted because it is obvious or unspeakable: ‘Now for today …’. Occurring on its own, kɔ̂ː takes up a topic in the context and indicates that a comment is to follow. This is often used in conversation as a hesitation marker, either when pausing a narrative or as a suspended answer to a question.

9 When occurring in final position, kɔ̂ is regularly lengthened to kɔ̂ː; as a stand-alone form, the lengthening may be extreme: [kɔ̂ːːːː].


(78) Tha
[‘She is not coming for dinner tonight.’]
kɔ̂ mâj hěn pen raj
tcl not to.see to.be what
‘Well, that’s fine with me.’

(79) Tha
[‘I am going to Japan again this year.’]
kʰun kɔ̂ː
2 tcl
‘What can I say about you …?’

(80) Tha
[‘What do you think about the present government?’]
kɔ̂ːːːː
tcl
‘Well …’

34.4 Speaker’s perspective

Many MSEAn languages have large inventories of means to express the speaker’s perspective and attitude towards the event and its participants denoted by an utterance. Apart from varying the presentation of information in terms of foregrounding and backgrounding, omitting and moving arguments, the choice of aspectuals/directionals and other secondary verbs can add an evaluative force to an utterance without changing its truth value or semantic content. Similarly, the choice of personal pronouns (and other lexical categories in some languages) and of sentence and phrase particles indicates the attitude of the speaker, rather than the events described by the utterance.

34.4.1 “Differential TAM marking”

TAM and directional marking in MSEAn languages is mostly optional, and the categories overlap to some extent and include connotations that are pragmatic rather than semantic. Very often the aspectual values do not match traditional categories such as “perfective” and “imperfective”, but rather crosscut these in different ways. One case in point is the use of the motion verbs maː ‘to come’ and paj ‘to go’ in Thai. The former describes a movement towards the center of interest, the latter a movement away from the center of interest (Jenny 2001). This movement can be in the spatial, temporal, or emotional domain. Based on this function, an expression like Thai pʰûːt paj ‘speak go’ can be interpreted as perfective (‘has spoken’) or imperfective (‘goes on speaking’). Especially the emotional function is used to express a speaker’s perspective on the event and/or its participants, as seen in the Thai example (81a)–(81b). Generally, the notion of ‘movement away from the center of interest’ indicates that the event or its result are no longer of interest to the speaker, while ‘movement towards the center of interest’ shows that the speaker is involved in some way in the event or its result. The expression pʰûːt paj ‘speak go’ can, in a given speech situation, also be interpreted as ‘just talk’, implying ‘I’m not interested, not listening’, while pʰûːt maː ‘speak come’ can be understood as ‘tell me’ (‘I’m interested, I’m listening’), besides the temporal/aspectual reading as ‘they/you/I (have) said’.

(81) Tha
a. pʰɯ̂ən kin kʰâːw maː sǎːm caːn.
   friend to.eat rice to.come three plate
   ‘My friend ate three plates of rice (he must be very full now).’
b. pʰɯ̂ən kin kʰâːw paj sǎːm caːn.
   friend to.eat rice to.go three plate
   ‘My friend ate three plates of rice (the rice is gone now).’

Similar means are available in Burmese and Mon, though not necessarily with the same lexical or morphological material, as in (82a)–(82c) and (83a)–(83b).

(82) Bur
a. θəŋɛɟìɴ tʰəmìɴ θòuɴ bəgaɴ sà-gɛ́-dɛ.
   friend cooked.rice three plate to.eat-displ-nfut
   ‘My friend ate three plates of rice.’
b. θəŋɛɟìɴ tʰəmìɴ θòuɴ bəgaɴ sà-tʰà-dɛ.
   friend cooked.rice three plate to.eat-keep-nfut
   ‘My friend ate three plates of rice (he must be very full now).’
c. θəŋɛɟìɴ tʰəmìɴ θòuɴ bəgaɴ sà-laiʔ-tɛ.
   friend cooked.rice three plate to.eat-follow-nfut
   ‘My friend ate three plates of rice (the rice is gone, it’s over).’

(83) Mon
a. rɔ̀ə ciəʔ lɔ̀ pɤŋ pɒəʔ pəŋan.
   friend to.eat to.keep cooked.rice three plate
   ‘My friend ate three plates of rice (he must be very full now).’
b. rɔ̀ə ciəʔ tʰɒʔ pɤŋ pɒəʔ pəŋan.
   friend to.eat to.discard cooked.rice three plate
   ‘My friend ate three plates of rice (the rice is gone, it’s over).’

As TAM marking is optional (apart from obligatory tense/status marking in Burmese), overt marking usually has expressive force. This is especially evident in the use of the New Situation (NSIT) marker, lɛ́ːw in Thai and pi/bi in Burmese (Iwasaki and Ingkaphirom 2005; Noss 1964; Okell 1969; Jenny 2001; Jenny and Hnin Tun 2016). Often described as ‘perfective’, ‘perfect’ or translated as ‘already’, these markers convey the meaning of ‘situation has occurred, a crucial point of a development has been reached or passed’. The general insensitivity to tense distinctions of MSEAn languages also allows imminent or inevitable future readings of nsit in appropriate contexts.10

10 This does not hold for Burmese, which has a grammatical future/non-future distinction.


While the NSIT marker in MSEAn languages often originates in a lexical verb meaning ‘to be finished’, as is the case in Thai and Burmese, Mon uses a more recent form jaʔ, literary Mon , which is (semantically opaquely) composed of the prefix ʔiʔ-, usually associated with nominal forms, and the focus marker raʔ (Jenny 2005).11 The use of the NSIT marker marks the event as presupposed, the principal new information/assertion in the utterance being that the (expected) event or stage in a development has occurred or been reached. Relevant examples are given in (84)–(92).

(84) Bur
mobàiɴ-lɛʔ-kaiɴ-pʰòuɴ-hma bəma-lo pʰaʔ-naiɴ-bi.
mobile-hand-to.hold-phone-loc Burmese-sim to.read-capable-nsit
‘You can now read in Burmese on mobile phones.’ (Jenny and Hnin Tun 2016)

(85) Bur
ɕa-ne-dɛ́ sa.ʔouʔ twé-bi là.
to.seek-stay-nfut.dep book to.find-nsit pq
‘Have you found the book you were looking for?’

(86) Bur
ʔəθɛʔ θòuɴ-zɛ́-hnə-hniʔ ɕí-bi.
age three-ten.dep-two-year to.exist-nsit
‘(He) is 32 years old (so far).’ (Okell 1969)

(87) Tha
riən maː tâŋ naːn, tɔːn.níː pʰûːt ʔaŋkrìt pen lɛ́ːw.
to.learn to.come all.of be.long.time now to.speak English to.be nsit
‘He has been learning English for a long time, now he can speak it.’

(88) Tha
mâj hǐw, kin kʰâːw maː lɛ́ːw.
not be.hungry to.eat rice to.come nsit
‘I’m not hungry, I have already eaten.’

(89) Tha
ʔaːjúʔ sǎːm-sìp-sɔ̌ːŋ piː lɛ́ːw.
age three-ten-two year nsit
‘(He) is 32 years old (so far).’

(90) Mon
pjùʔ ʔɒt jaʔ, nùm mɔ̀ŋ phɤh.
be.old be.all nsit to.exist to.stay still
‘They have all grown old, the ones that are still around’ (Jenny 2005)

(91) Mon
pɒəʔ-klɔm-cɔh kɔ̀h poj cɒp klɤŋ kɔʔ.ɗot jaʔ.
three-hundred-ten medl/top 1p to.arrive to.come pn nsit
‘By 1310 (1949) we had already arrived at Kaw Dot.’ (Jenny 2005)

11 Thai lɛ́ːw is an old loan from Chinese liǎo and overlaps in its scope of functions to a great extent with its grammaticalized form, the phrasal and clausal marker le (Li and Thompson 1981).



Pragmatics and syntax in the languages of MSEA

(92) Mon
kɤ̀ʔ ʔəjɤk pɒəʔ-coh-ɓa hnam jaʔ.
to.get age three-ten-two year nsit
‘(He) is 32 years old (so far).’

34.4.2 Speaker’s attitude

Besides the choice of secondary verbs and other TAM/DIR marking devices, several MSEAn languages have lexical means to express the speaker’s attitude towards the event and its participants. The central languages in particular, the languages of states and kingdoms, have large inventories of pronouns or lexemes with pronominal functions (Iwasaki and Ingkaphirom 2005; Müller and Weymuth 2017). The choice of pronominal form, for all persons, is used to express contempt, respect, affection, etc., as well as social relationships between the referents. Table 1 gives a sample of socially based first person pronouns in Thai and Burmese. The inventories of the two languages do not match in all forms, though the underlying systems are similar.

Tab. 1: First person pronouns (not exhaustive).

Thai           Burmese    Social meaning
dìʔcʰǎn        cəmá       female; general, polite
pʰǒm           cənɔ       male; general, polite
cʰǎn           ŋa         familiar, intimate
–              couʔ       informal, rural
kʰáw           –          informal, intimate (esp. children)
kuː            –          rude
kʰâːpʰəcâːw    cənouʔ     formal
–              dəbɛ́dɔ     addressing clergy
kʰâː           –          informal
raw            dó         plural; intimate, informal
ʔúə            –          informal, Chinese

Similar effects can be attained by the choice of special lexicon for different layers of society. This is especially formalized in Thai, where the High-Low distinction is considered an integral part of the language (Diller 1993). Sentence particles are employed to express speaker-centered notions such as displeasure (93) and respect (94), and they can mark an utterance as a rhetorical question (95), among many other functions (see Iwasaki and Ingkaphirom [2005] for a summary of pragmatic particles in Thai).

(93) Bur
ba-dwe louʔ-ne-da-lɛ̀ kwá.
what-pl to.do-stay-nfut.nml-qc intens
‘What the heck are you guys doing?’

(94) Tha
paj nǎj maː kʰráp.
to.go inter to.come pol
‘Where have you been, Sir?’

(95) Mon
pèh hɒm ʔərè sem lèp seʔ.12
2 to.speak language Thai be.skilled tag
‘You speak Thai, right?’ (Jenny 2015)

Common notions expressed by sentence particles include putting special emphasis on the content, as in (96)–(98), or seeking consent from the addressee, as in (99)–(104). The latter particle (Consent Seeking Particle, csp) can also be used on its own in the three sample languages, expressing an insistent request for consent.

(96) Tha
krəcòk tɛ̀ːk mòt lɤːj.
glass.panel to.break be.all emph
‘All the windows were completely shattered.’ (Iwasaki and Ingkaphirom 2005)

(97) Bur
ʔɛ̀.di ʔəcʰeiɴ-hma hmauɴ-ne-bi le
medl time-loc be.dark-stay-nsit emph
‘It was dark by that time, you see.’ (Okell 1969)

(98) Mon
ɓɛ̀ʔ həmɛ̀ə kɔ̀h krìp ʔa lè.
ref Burmese medl to.run to.go emph
‘Those Burmese all ran away.’

(99) Tha
wan-níː nǎːw náʔ.
day-prox be.cold csp
‘It’s cold today, isn’t it?’ (Iwasaki and Ingkaphirom 2005)

(100) Tha
paj kin kʰâːw kan náʔ.
to.go to.eat rice rec csp
‘Let’s go to eat, shall we?’

(101) Bur
di-né jaði.ʔúdú kàuɴ-dɛ nɔ.
prox weather be.good-nfut csp
‘The weather is nice today, don’t you think?’

(102) Bur
tʰəmìɴ θwà sà-mɛ nɔ.
cooked.rice to.go to.eat-fut csp
‘Let’s go to eat, shall we?’

12 seʔ is the colloquial form of siəŋ ‘be so’ when functioning as assumptive/tag question marker.




(103) Mon
həʔɛh, ɗɛh kwɤ̀ʔ pɔn nɛm, ʔuə lèə kɒ ɲìʔ nah.
no 3 neg.get to.shoot yet 1s to.tell to.give be.little csp
‘No, they hadn’t started the shooting yet, let me tell you, ok?’

(104) Mon
ʔa ciəʔ pɤŋ nah.
to.go to.eat cooked.rice csp
‘Let’s go to eat, shall we?’

Thai has a number of sentence particles affirming a situation/event or contradicting what has been assumed. The former is marked by sìʔ~siː~síʔ (105), the latter by rɔ̀ːk (106).

(105) Tha
[‘Do you like hamburgers?’]
cʰɔ̂ːp sìʔ
to.like affirm
‘Of course I do!’

(106) Tha
[‘I guess you don’t eat spicy food.’]
cʰɔ̂ːp rɔ̀ːk.
to.like contradict
‘I do like (spicy food)!’

Burmese adds the sentence particle pɔ́ to indicate that the speech event covers common knowledge or a matter of fact, as in (107) and (108).

(107) Bur
[‘I don’t have any money left.’]
bija mə-θauʔ-tɔ́-bù pɔ́.
beer neg-to.drink-contr-neg insist
‘So you don’t drink any beer anymore,’

(108) Bur
[‘Do you like tea leaf salad?’]
caiʔ-ta pɔ́.
to.like-nfut.nml insist
‘Of course I do (like it).’

Another domain where a speaker’s attitude can be expressed is in the subjective evaluation of an event. The semantic content is the same whether one says ‘he gave me ten dollars’ or ‘he gave me only ten dollars’, but the former expression is neutral, while the latter implies that the speaker is not (completely) satisfied with the amount. Both Thai and Burmese have particles expressing not only a subjectively small amount, but also a subjectively large amount, as illustrated in (109)–(111). In Mon, the restrictive marker chaʔ ‘only’ is used together with the emphatic phrase thɔ̀-raʔ ‘alone, merely’ (lit. ‘straight foc’) to indicate subjectivity for small amounts. Large amounts are subjectively expressed by intonation (stress on the quantifier/numeral) and the emphatic phrase marker mùə-tɔn ‘at once’. This is obviously less grammaticalized than the constructions in Thai and Burmese.

(109) Tha
a. dâj ŋɤn maː kʰɛ̂ː sɔ̌ːŋ pʰan bàːt.
   to.get silver/money to.come only two thousand Baht
   ‘I got only two thousand Baht.’
b. dâj ŋɤn maː tâŋ sɔ̌ːŋ pʰan bàːt
   to.get silver/money to.come all.of two thousand Baht
   ‘I got as much as two thousand Baht!’

(110) Bur
a. paiʔsʰaɴ hnə-tʰauɴ-bɛ̀ já-gɛ́-dɛ.
   money two-thousand-excl to.get-displ-nfut
   ‘I got only two thousand Kyat.’
b. paiʔsʰaɴ hnə-tʰauɴ-dauɴ já-gɛ́-dɛ.
   money two-thousand-all.of to.get-displ-nfut
   ‘I got as much as two thousand Kyat!’

(111) Mon
a. kɤ̀ʔ-nɛ̀ŋ hlɔə chaʔ ɓa ŋìm thɔ̀.raʔ.
   to.get-caus.come copper/money only two thousand merely
   ‘I got only two thousand Kyat.’
b. kɤ̀ʔ-nɛ̀ŋ hlɔə ɓá ŋìm mùə.tɔn.
   to.get-caus.come copper/money two thousand at.once
   ‘I got as much as two thousand Kyat!’

34.5 Conclusion

MSEAn languages exhibit a great range of conventionalized processes to express pragmatic features, adding subjective and individual perspectives to an utterance without changing its semantic content. The means involved in pragmatic expressions in MSEA differ widely in individual languages, but the overall picture shows a number of commonalities. Similar functions can be expressed in sometimes similar ways, with partly overlapping ranges of functions. While the possibilities of pragmatic expressions available to MSEAn languages are in most cases not unique to them, the combination and conventionalization of processes, as well as the range of their application, mark MSEA as an area of special interest. Only a few processes related to pragmatic expression could be presented in this chapter, which nevertheless hopes to serve as a starting point for further investigation that includes more languages and features. Ideally, more space will be given to pragmatic features in language descriptions and full grammars.




References

Bisang, Walter. 2009. On the coevolution of complexity: Sometimes less is more in East and mainland Southeast Asia. In Geoffrey Sampson, David Gil & Peter Trudgill (eds.), Language complexity as an evolving variable, 34–49. Oxford: Oxford University Press.
Burusphat, Somsonge. 2008. An etymological speculation on the sequential indicator kɔɔ3 in Thai narrative. In Anthony V. N. Diller, Jerold A. Edmondson & Yongxian Luo (eds.), The Tai-Kadai languages, 431–444. London & New York: Routledge.
Chappell, Hilary. 1996. Inalienability and the personal domain in Mandarin Chinese discourse. In Hilary Chappell & William McGregor (eds.), The grammar of inalienability. A typological perspective on body part terms and the whole-part relation, 465–528. Berlin & New York: Mouton de Gruyter.
Diller, Anthony. 1993. Diglossic grammaticality in Thai. In William A. Foley (ed.), The role of theory in language description, 393–420. Berlin & New York: Mouton de Gruyter.
Dryer, Matthew S. 2013. Order of subject, object and verb. In Matthew S. Dryer & Martin Haspelmath (eds.), The world atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. http://wals.info/chapter/81 (accessed 9 January 2021).
Enfield, N. J. 2005. Areal linguistics and Mainland Southeast Asia. Annual Review of Anthropology 34. 181–206.
Enfield, N. J. 2011. Case relations in Lao, a radically isolating language. In Andrej Malchukov & Andrew Spencer (eds.), The Oxford handbook of case, 807–819. Oxford: Oxford University Press.
Gil, David. 2009. How much grammar does it take to sail a boat? In Geoffrey Sampson, David Gil & Peter Trudgill (eds.), Language complexity as an evolving variable, 19–33. Oxford: Oxford University Press.
Huang, Yan. 2000. Anaphora. A cross-linguistic study. Oxford: Oxford University Press.
Iwasaki, Shoichi & Preeya Ingkaphirom. 2005. A reference grammar of Thai. Cambridge: Cambridge University Press.
Jenny, Mathias. 2001. The aspect system of Thai. In Karen H. Ebert & Fernando Zúñiga (eds.), Aktionsart and Aspectotemporality in non-European languages, 97–140. Zurich: ASAS.
Jenny, Mathias. 2005. The verb system of Mon. Zurich: ASAS.
Jenny, Mathias. 2006. Mon ra’ and nong: Assertive particles? Journal of Mon-Khmer Studies 36. 21–38.
Jenny, Mathias. 2011. Burmese in Mon syntax: External influence and internal development. In Sophana Srichampa, Paul Sidwell & Kenneth Gregerson (eds.), Austroasiatic studies: Papers from ICAAL 4, 48–64. Dallas, Salaya & Canberra: SIL International, Mahidol University, Pacific Linguistics.
Jenny, Mathias. 2015. Modern Mon. In Mathias Jenny & Paul Sidwell (eds.), The handbook of Austroasiatic languages, 552–600. Leiden & Boston: Brill.
Jenny, Mathias. 2019. Grammatical relations in Mon. Syntactic tests in an isolating language. In Alena Witzlack-Makarevich & Balthasar Bickel (eds.), Argument selectors: A new perspective on grammatical relations, 107–129. Amsterdam: John Benjamins.
Jenny, Mathias. 2020. Verb-initial structures in Austroasiatic languages. In Mathias Jenny, Paul Sidwell & Mark Alves (eds.), Austroasiatic syntax in areal and diachronic perspective, 21–45. Leiden & Boston: Brill.
Jenny, Mathias & San San Hnin Tun. 2013. Differential subject marking without ergativity: The case of colloquial Burmese. Studies in Language 37(4). 693–735.
Jenny, Mathias & San San Hnin Tun. 2016. Burmese. A comprehensive grammar. London & New York: Routledge.
Krifka, Manfred. 2008. Basic notions of information structure. Acta Linguistica Hungarica 55(3/4). 243–276.
Lambrecht, Knud. 1994. Information structure and sentence form. Topic, focus and the representations of discourse referents. Cambridge: Cambridge University Press.
LaPolla, Randy. 1995. “Ergative” marking in Tibeto-Burman. In Yoshio Nishi, James A. Matisoff & Yasuhiko Nagano (eds.), New horizons in Tibeto-Burman morphosyntax (Senri Ethnological Studies 41), 189–228. Osaka: National Museum of Ethnology.
Li, Charles N. & Sandra A. Thompson. 1981. Mandarin Chinese. A functional reference grammar. Berkeley, Los Angeles & London: University of California Press.
Matisoff, James A. 1986. Hearts and minds in Southeast Asian languages and English: An essay in the comparative lexical semantics of psycho-collocation. Cahiers de Linguistique d’Asie Orientale 15(1). 5–57.
Müller, André & Rachel Weymuth. 2017. How society shapes language: Personal pronouns in the Greater Burma Zone. Asiatische Studien/Études Asiatiques 71(1). 409–432.
Noss, Richard B. 1964. Thai reference grammar. Washington, DC: Foreign Service Institute.
Okell, John. 1969. A reference grammar of colloquial Burmese. London: Oxford University Press.
Sawada, Hideo. 1995. On the usages and functions of particles -kou_/-ka. in colloquial Burmese. In Yoshio Nishi, James A. Matisoff & Yasuhiko Nagano (eds.), New horizons in Tibeto-Burman morphosyntax (Senri Ethnological Studies 41), 154–187. Osaka: National Museum of Ethnology.
Vittrant, Alice & Justin Watkins (eds.). 2019a. The Mainland Southeast Asia linguistic area. Berlin & Boston: Mouton de Gruyter.
Vittrant, Alice & Justin Watkins. 2019b. Appendix: Guidelines for writing a Southeast Asian language description. In Alice Vittrant & Justin Watkins (eds.), The Mainland Southeast Asia linguistic area, 653–686. Berlin & Boston: Mouton de Gruyter.

Paul Sidwell and Mathias Jenny

35 MSEA epigraphy

35.1 Introduction

The earliest evidence of writing in MSEA consists of inscriptions on stone, a material sufficiently durable to survive into present times, unlike paper, palm leaves, or other organic materials. In total there are some thousands of premodern inscriptions, in various regional languages, in addition to Sanskrit and Pali texts (these latter being mainly religious in nature). This epigraphic heritage is tremendously important to regional archaeology, history, and linguistics. Since the 19th century, the study of MSEA inscriptions has facilitated a limited yet priceless direct access to the lexicon, phonology, and grammar of languages spoken in the region in premodern times, reaching back 1,400 years or more. The linguistic significance has many aspects. For some languages (including regionally influential languages) we have information on language change, both internally and externally driven; we have data that inform and even confirm comparative reconstruction; and we have typological data that allow us to make inferences about the MSEA linguistic area over more than a millennium. Broadly speaking, inscriptions in Sanskrit begin to appear from around the middle of the first millennium, while those in local vernaculars, or mixed/parallel Sanskrit and vernacular, appear from the late 6th and early 7th centuries in the cases of Old Mon and Old Khmer. There is the anomaly of the Đông Yên Châu inscription in an old form of Cham, found near Trà Kiệu in Vietnam, and generally dated to the mid-4th century (Coedès 1939; Marrison 1975); otherwise Cham inscriptions are not clearly attested until somewhat later. There was also use of Chinese writing in northern Vietnam from the beginning of the 1st millennium (see Salmon 2018), but that practice belongs to the Chinese cultural and political tradition and is not discussed further in this chapter.
Beyond the region of northern Vietnam under Chinese control, all MSEA civilizations adopted versions of the Brahmi script of India, most using Late Southern Brahmi script, while there was also some use of Late Northern Brahmi (Griffiths and Lammerts 2015). This reflects the fact that writing was not received into MSEA in one historical event, but in multiple waves. It is generally held that the emerging 1st millennium polities of the region looked to South Asia and Hindu civilization for ready models of cultural authority, governance, and religion. This was a tactical approach, as following Chinese templates would have invited subordination to the very proximal and strong Chinese state, while Indian states of the time were relatively weak and unable to project significant force across the Bay of Bengal and into Indochina. At the same time the coastal silk-route and various overland trails linked the regions, facilitating cultural and economic transmission. In this way, 1st millennium local elites could emulate Hindu (and later Buddhist) religion and statecraft, while maintaining effective autonomy of their realms (Coedès 1968; Wolters 1999; Stark 2015; etc.).

https://doi.org/10.1515/9783110558142-035

The adaptation of Brahmi script for writing MSEA languages had various consequences, including complications in the subsequent decipherment of inscriptions. Following Indic phonology, the script has more signs or akṣara than required for the phonemes of most MSEA languages, but there is a mismatch with MSEA areal phonology. Brahmi script has few vowel signs, but many consonants, including voiced aspirates and retroflex stops that are rare or not used in MSEA languages. Also, consonantal akṣara have syllabic readings with inherent vowels, and so have different readings depending on context (i.e. in isolation, in clusters, syllable finally, etc.). Given the primacy of Sanskrit and Pali, the ancient scribes were aware of the orthographic conventions for those languages, and a mix of adapted conventions and local innovations marked the adoption of Brahmi regionally (anyone familiar with, for example, modern Thai script will already have a sense of the range of ambiguities and redundancies this can entail).

Epigraphic practice is to transcribe inscriptions into Roman script following a version of conventional Brahmi transcription, and this then requires specialist knowledge of both the language and the orthographic conventions to achieve a proper reading. For example, Old Khmer spelling variants of ‘two’ /biːr/ include bira and biyara. The former is straightforward; i has a default long vowel reading before coda r, and ra is the syllabic reading which is assumed to have no vowel word finally.1 The latter variant has the additional ya, which may suggest another syllable, but the scribes were actually adding a /j/ to the nucleus to make it clear that the vowel is long. The shortage of Brahmi vowel akṣara led to many ambiguous or alternate spellings, and some local innovations.
For another example, Old Mon had a central vowel, assumed to be /ɤ/ (written ø by Shorto [1971] and elsewhere), which lacks an equivalent in Sanskrit. Scribes variously spelled this sound with a, i, u, e, o or, later, the digraph ui (or iu); the correct assignment of /ɤ/ is achieved by noting variation in tokens of the same word, or by reference to modern forms. Correct transcription, transliteration and interpretation of inscriptions can be very problematic, and many epigraphs are disputed or remain untranslated. Complicating matters, inscriptions are often weathered or broken or were indistinctly inscribed to begin with, or there may be insufficient context to identify the language, among other issues.

Before and during the use of stone and other durable materials for writing, it is certain that writing on palm leaf and other organic materials was used in MSEA, especially for religious scriptures, in many ways paralleling practices that continue into modern times. The beginning of widespread erection of sandstone stelae correlates with a phase of social and economic transition going on in MSEA in the mid-1st millennium, as local institutions and power structures developed. Stelae, along with inscriptions on monuments and buildings, were public documents, erected to announce and commemorate acts such as donations to temples, taxes to the sovereign, etc., and these legitimized social status, title to land, inheritances, and so forth. Many such stelae present lists of names, presumably of slaves or corvée labor, as well as goods. One consequence of the very specific functions of inscriptions is that the preserved lexicon is limited in scope, and clearly not a good representative of everyday speech (for example, nowhere in the Old Khmer corpus do we find 2nd person pronouns), yet it remains invaluable evidence of the earlier stages of the languages. In the case of the Pyu language of Upper Burma, which became extinct as a vernacular in the 13th century, epigraphy is our only witness.

1 Brahmi and other Indic scripts do have a ‘vowel killer’ (virāma) indicating that a consonant is to be read without the inherent vowel as coda. This ‘vowel killer’ is consistently used in the western SEA orthographies (Mon, Burmese, Shan), but omitted in Khmer and scripts derived from it, including Thai, Lanna, and Khün. This omission leads to ambiguity in reading in some cases in these languages.

The study of epigraphy in MSEA began in the 19th century as an activity of European colonialism, out of which emerged important corpora of rubbings, transcriptions and translations, such as Coedès (1937–1966) Inscriptions du Cambodge, Duroiselle (ed. 1919–1936) Epigraphia Birmanica, and others. The post-colonial era saw the establishment of epigraphic studies more strongly among local scholars, and in recent decades there have been projects in all MSEA nations devoted to their epigraphic heritage, often in cooperation with European institutions such as the École française d’Extrême-Orient (EFEO), which has many research centers, libraries, and joint projects in the MSEA region (see Perret ed. [2018] for a good overview including high quality photos of inscriptions).

35.2 Epigraphy by language

35.2.1 Mon

Old Mon inscriptions are widely dispersed geographically in MSEA, from southern to northern Thailand and Laos in the east, and from Lower to Upper Burma in the west. The wide range of topics touched on in some of the longer Old and Middle Mon inscriptions makes them important sources for research on the religious, cultural, and political life at different periods in Mon history. It is evident that a version of Old Mon was the vernacular language of Dvāravatī, the name given to a society of city states that flourished in central Thailand in the second half of the 1st millennium, and was eventually absorbed into Angkor, Sukhothai and Ayutthaya (Saraya 1999). As Mon society faced pressure from the Khmer and Siamese, there was significant migration from the 11th century onward into Lower Burma and into Bagan, where the Mon played an important role in Burman society, and were for centuries the dominant culture in much of Lower Burma (Luce 1953).

Inscriptions in Old Mon first appear in the 6th century in the Chao Phraya Basin, and with some gaps there is an almost continuous tradition of written Mon that extends all the way to the present. (An example of an early Old Mon inscription, with transliteration and translation, is given in Figure 1.) The language is conventionally periodized as follows: Old Mon until the 13th century, Middle Mon until the 18th century, and Modern Mon subsequently. Old Mon is known only epigraphically, and many of the inscriptions, especially of the earliest period, are quite short or incomplete, being inscribed on votive tablets, terracotta seals, Buddha images, and other small objects. Fortunately, there are some lengthy donative and other inscriptions from the 11th and 12th centuries found in Bagan. These have proved very important for our knowledge of the Old Mon language, although Burmese influence cannot be excluded, as these inscriptions were written in Bagan, where Mon probably was not the language of a large portion of the population. Middle Mon inscriptions are mostly from Lower Burma, and they show a marked change in the phonology from Old Mon. Presyllables are reduced, final consonants dropped or simplified (e.g. -r > -Ø, -l > -w, -s > -h), and the morphological system had largely broken down. Increasing Burmese influence is seen in the lexicon and syntactic structure of Middle Mon.
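The final-consonant correspondences cited above for the transition from Old to Middle Mon (-r > -Ø, -l > -w, -s > -h) can be read as simple rewrite rules on transliterated forms. The following Python sketch is purely illustrative: the function name apply_final_changes and the sample syllables are hypothetical placeholders rather than attested Mon words, and the rule table covers only the three correspondences listed in the text, not presyllable reduction or any other change.

```python
# Final-consonant correspondences cited in the text (Old Mon > Middle Mon).
# An empty string stands for loss of the consonant (-r > -Ø).
FINAL_CHANGES = {"r": "", "l": "w", "s": "h"}


def apply_final_changes(form: str) -> str:
    """Rewrite the word-final consonant of a transliterated form.

    Forms whose final segment is not covered by FINAL_CHANGES are
    returned unchanged.
    """
    if form and form[-1] in FINAL_CHANGES:
        return form[:-1] + FINAL_CHANGES[form[-1]]
    return form


# Schematic placeholder syllables, NOT attested Old Mon words.
for old_form in ["tar", "tal", "tas", "tan"]:
    print(old_form, ">", apply_final_changes(old_form))
```

A lookup table keeps each correspondence visible as data, so further rules could be added without changing the function; real philological work would of course operate on phonological reconstructions rather than raw transliterations.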

Fig. 1: Wat Phorang inscription, 6th century (after Prapassorn 1999).

The most important of the Mon inscriptions, according to Duroiselle, is the Great Inscription of the Shwezigon Pagoda (11th century), which presents a vast range of topics, including narrative accounts and prophecies, and dialogues between the Buddha and his disciple Ananda, making it a treasure trove of linguistic material. The longest Old Mon inscription, called the Tharaba Gate Inscription, dated tentatively to the beginning of the 12th century, gives a rather detailed account of the building of the palace in Bagan, with elaborate descriptions of ceremonials.

Middle Mon inscriptions are abundant in Lower Burma, the Mon-Pali bilingual Kalyani inscriptions commissioned by king Dhammazedi of Pegu (Bago) being the longest and most important of these. The first three of ten stone slabs are inscribed in Pali, the remaining seven in Middle Mon. They give a detailed account of religious and political relations between Pegu and Sri Lanka at the time, with the Pali text providing a summary of the Mon version. Editions in transliteration, both Romanized and in modern Mon script, as well as translations in different languages are available (e.g. Epigraphia Birmanica vol. 3; Taw Sein Ko 1892).

Editions of Old Mon and Middle Mon inscriptions are available in Roman transliteration with English translations in the several volumes of the Epigraphia Birmanica (Duroiselle ed. 1919–1936), and the Mun Kyauksa Paunggyoke (U Chit Thein 1965) gives transliteration in Modern Mon script with Burmese translations of the majority of Mon inscriptions found in Burma/Myanmar. Other editions of Mon inscriptions were published in Burma at different times, including translations into Modern Mon, but hardly any critical reassessment of the reading of the original stones was attempted. Mon inscriptions found in Thailand have been edited and published variously in local museum publications, mostly in Thai only (e.g. Hariphunchai/Lamphun, Nakhon Pathom, etc.), as well as in the Prachum Silacharuek Siam in Thai and French (Coedès 1929) and in the Bulletin de l’École Française d’Extrême Orient in French (Halliday 1930). An immensely useful resource for the study of Old and Middle Mon is Shorto’s (1971) Dictionary of Mon Inscriptions, which also served as the basis for Nai Tun Way’s recently published dictionary (2012) giving all entries in both Romanized form and in Old Mon script.

The study of Mon epigraphy has played an important role in Austroasiatic and Mon language history and reconstruction, as well as in the study of the history of Burma and Thailand. Several resources are available to researchers in the field, even if most of them are rather dated.
Blagden (1909) gives a first interpretation of the Mon face of the quadrilingual Myazedi inscription, the Rosetta Stone of Burma (see also below on Pyu and Burmese). His understanding of Old Mon is still preliminary, but many of his interpretations are still valid today and he is rightly seen as the “father of Mon studies”, both by western and Mon scholars. In part I of volume I of the Epigraphia Birmanica, Duroiselle (1919) offers a full analysis of all four faces of the Myazedi inscription, with comparative vocabularies for Old Mon, Pyu, and Old Burmese. Part II gives a general introduction to the Mon people and their history, as well as their significance in Burma and the influence of Burmese on Mon. This can be seen as the first reasonably comprehensive publication on different aspects of Old Mon, including spelling conventions and phonology, Indic loanwords, the history of Mon epigraphic studies and a chronology of Mon inscriptions. Shorto explores “some of the peculiar linguistic problems involved in the interpretation of epigraphic texts” (Shorto 1956: 344) and their relevance for linguistic studies, based on Old Mon inscriptions. His Dictionary of Mon Inscriptions (Shorto 1971) includes a short sketch of Old and Middle Mon phonology and morphology, and the entries give generous and fully referenced examples taken from the inscriptions; it is still the state of the art in Old Mon studies.


Luce and Ba Shin (1961) give a brief summary of Mon stone inscriptions and present Old Mon orthography in a chart comparing Lower Siam (Nakhon Pathom, Lopburi), Lower Burma (Thaton), Central Burma (Bagan), and Upper Siam (Lamphun), before moving on to ink and terra cotta inscriptions found at Kubyaukgyi in Myinkaba, Bagan. Guillon (e.g. 1974, 1977) and Bauer (e.g. 1991a, 1991b) continued the study of Mon inscriptions from both Burma/Myanmar and Thailand. Guillon (1985) presents the little-known terra cotta plaque inscriptions of Mara’s army on the Ananda Pagoda in Bagan, complementing Shorto’s (1966) study on the devatā plaques. Bauer suggested identifying several Old Mon dialects based on spelling differences in the inscriptions but has not published on this topic. It therefore remains unclear whether inscriptional evidence can be adduced to postulate dialectal differences in Old Mon, apart from the clearly distinct variety in the Lamphun inscriptions of northern Thailand. The latter exhibit a number of peculiarities, most prominently the spelling of final consonants by doubling, rather than by using the vowel killer as other Mon texts do. This use of doubled final consonants also appears in the earliest Thai inscriptions after the short vowel /a/. A concise overview of Mon epigraphy and its study is given in Bauer (2018). In Burma itself, Nai Pan Hla was an active researcher in Mon (as well as Burmese and Pyu, see Nai Pan Hla 2011), including epigraphy, but his publications have not become widely available outside Myanmar. His 1976 publication compares epigraphic Mon to the modern language and provides a summary of Mon history and the history of Mon studies, together with comparative word lists of Old, Middle and Modern Mon (Nai Pan Hla 1976). Old Mon epigraphic sources have also played an important part in comparative reconstructions, and in the 1980s two reconstructions of proto-Monic were published (Ferlus 1983; Diffloth 1984).
Old and Middle Mon, plus Sanskrit and Pali, were principally relied upon by Ferlus (1983) to reconstruct proto-Mon phonology using philological methods to model phonological values of epigraphic texts. At about the same time Diffloth (1984) used Old Mon, together with present-day Mon and Nyahkur dialects, to reconstruct proto-Monic lexicon and phonology. Additionally, Old Mon forms made a crucial contribution to Shorto’s (2006) proto-Mon-Khmer reconstruction; 495 Old Mon forms are cited and Old Mon phonological values are the basis of Shorto’s proto-Mon-Khmer phonology (see the correspondence tables in Shorto 2006: 5–6). In spite of the importance of Mon epigraphic studies in linguistics and history, most sources have long been out of print and are not easily available. This is one of the reasons why research in Mon epigraphy has been more or less dormant in recent decades, with hardly any new findings and insights since the 1970s.




35.2.2 Khmer

The epigraphic heritage of Cambodia is the most extensive and best studied in MSEA, with more than 2,000 inscriptions known and around 1,300 inventoried, transcribed, translated, and available for study in publications and libraries. Epigraphs are predominantly in the Khmer and Sanskrit languages, in roughly equal proportion, including many stelae with texts in both languages. Of the two, Sanskrit was used mainly for religious functions and Khmer for more prosaic texts. Despite the inherent limits of epigraphic corpora, the sheer extent of the inscriptions has facilitated the publication of substantial dictionaries of Old Khmer (Pou 1992; Jenner 2009a, 2009b), studies of Old Khmer grammar (Seam 1992; Pou 1996; Khin Sok 2007; Jenner and Sidwell 2010; etc.), and reconstructions of language and social history (Ferlus 1992; Vickery 1998; and others). An example of Old Khmer, with transliteration and translation, is given in Figure 2.

1038 śaka gi nu vraḥ kamrateṅ añ vnaṃ rhek ta jā ta pasvi rājakāryya ta pvas anau vraḥ tapovana kalyānāśrama jvan khñuṃ bhūmi ta kamrateṅ jagat śrīsūryyaparvvata thve caṃnāṃ saṃkrā nta raṅko je mvāy nā vraḥ rājadharmma ta thve caṃnāṃ sṭhi raloñ saṃ loñ saṃ sot loñ

‘In [the year] śaka 1038, the Vrah Kamrateng An Vnam Rhet, ascetic in the service of the king, who had entered the religion in the hermitage Kalyanasrama, offered servants and land to the Kamrateng Jagat Sri Suryaparvata, established for the New Year a foundation of a basket of rice and performed the holy rajadharma ceremony.’

Fig. 2: Old Khmer inscription K 32, 10th–11th century (translation after Coedès, IC vol. 2).

The Khmer language is periodized into Old Khmer until the 14th century, Middle Khmer until 1800, and Modern Khmer to the present. Old Khmer is divided into pre-Angkorian (pre-800 CE) and Angkorian (post-800 CE), with each of these phases marked by distinct lexical and phonological differences (Ferlus 1992). Khmer-language inscriptions begin around 600 CE and peter out in the 15th century with the collapse of Angkor (circa 1431); there followed a dark period of over a century in which no Khmer writings are attested.


 Paul Sidwell and Mathias Jenny

The first inscriptions appear in the pre-Angkorian period, during which the Khmer state was centered on the lower Mekong. In the early 800s the capital was moved away from the Mekong to the region in which the great temple complex of Angkor would be established (near modern-day Siem Reap), and over the following six centuries a powerful centralized state was built that came to dominate much of Indo-China. The highly organized nature of the Khmer state and economy left a significant regional linguistic legacy: within Kambujā the vernacular Khmer language was leveled and showed no significant internal diversity until the post-Angkor period and the emergence of dialects in Middle Khmer, while other linguistic groups were largely assimilated, with effectively only the Pearic and Kui Austroasiatic groups surviving as small minorities. Other language groups, such as Chams and Bahnarics, spread into the territory subsequent to the fall of Angkor. Khmer loan words came to permeate the region and left an especially strong impression on Modern Thai (see Khanittanan 2004) as well as on the many neighboring Austroasiatic groups.

The earliest Cambodia inscriptions are purely Sanskrit and begin in the 5th century; they are poems placed at the doors of temples that offer praise to the deities of the temple and may include the names of illustrious ancestors of the person making the tribute, typically a royal or a nobleman. Inscriptions in the Khmer language begin in the 7th century, and many bear horoscopes that allow us to date their production accurately. According to Jacques (2002: 41) the first dated Khmer-language epigraph, found at Angkor Borei, can be dated to 9:52 pm, Friday the 21st of January 612.² Epigraphs in Old Khmer are prosaic; those on stelae are effectively deeds, often consisting of inventories of goods, parcels of land and place names, and names of “the god’s slaves”, frequently with dates and values specified.
Other epigraphs, such as those inscribed into temple lintels and jambs, similar to the Sanskrit epigraphs, honor deities, and are more common later in the Old Khmer period. Fortunately there are some royal chronicles and other texts which give us some insight into the use and grammar of the language, but it is tantalizingly fleeting and not particularly colloquial. As Jenner and Sidwell (2010: 1) explain, the Old Khmer texts are “couched in legalistic form and employing a chancery idiom cultivated by a small, educated elite”.

The epigraphical legacy of Khmer extends beyond Cambodia and into large areas of Thailand, Laos, and Vietnam. More than 200 Khmer inscriptions found in Thailand, where they are referred to as Thai Khom, are preserved in the National Library and National Museum of Bangkok. A few inscriptions found in Vietnam are held in museums in Hanoi and Ho Chi Minh City, and in Laos the movable inscriptions from the Wat Phu site are displayed at the museum of Paksé.

The inventory of Cambodia inscriptions was established by Coedès (1937–1966), who published approximately 1,000 inscriptions and assigned a K. number to each. The inventory was subsequently carried on by Jacques (1971) and Gerschheimer (2003–

2 The same stela, K.557/600, has had other dates ascribed, e.g. Zakharov (2019: 66) asserts “from 611”.




2004), with Gerschheimer managing the project going forward as the Corpus of Khmer inscriptions (CIK) project.³ The work is published in the new EFEO series Manuel d’épigraphie du Cambodge, the first of which appeared in 2007 (Ishizawa et al.), and by 2017 the inventory had been extended to K.1360. Other volumes of Old Khmer inscriptions have been published, notably by Pou (1989, 2001; Pou and Mikaelian 2011), as have collections of rubbings (Ang 2013; Somporn and McCarthy 2012) and an inventory of inscription dates (Billard and Eade 2006). There are also extensive searchable online corpora maintained by the Center for Research in Computational Linguistics (Bangkok);⁴ see also Estève (2010–2011) on digitization and curation of Khmer inscriptions in the digital age. In addition to stelae and architectural epigraphs, there is a growing corpus of short donative inscriptions on ritual utensils in bronze (e.g. Estève and Vincent 2010; Griffiths and Vincent 2014).

There are also more than 100 inscriptions in Middle Khmer, variously published and forming the basis of Jenner’s (2011) dictionary. There is also a significant number of Middle Khmer inscriptions engraved on gold plates, issued by Ayutthayan (Siamese) kings in the 15th and 16th centuries, found in the Tenasserim region of what is now peninsular Burma, as well as at Ayutthaya-period sites in Thailand (Vickery 1973). The roughly 400-year Middle Khmer period begins before the fall of Angkor, and is marked by significant changes in pronunciation and phonology, which are revealed substantially in the inscriptional record. The orthography became more variable, with frequent confusion of old voiced stops with their voiceless counterparts, indicating that the voiced stops were devoiced while the vowels began to fall into the two complementary registers of later Khmer.
For example:

Forms such as gusala for kusala, jībara for cībara, sāthā for saddhā, and proma for broma show that contrast was being lost as the old voiced stops underwent devoicing. Certain vowel substitutions are to be seen as attempts to represent new values in the developing registers. Forms such as semā for sīmā and sarenna ~ srena ~ sūrena for surinda show the development of low-register /eː ~ e/, though kīrti has not yet reached kerti. (Jenner 2011: xi)

In terms of comparative reconstruction, Old Khmer forms play a role in Shorto’s (2006) proto-Mon-Khmer, with 155 lexical items cited. Given the relatively greater extent of inscriptional Khmer compared to other inscriptional languages, this comparatively small number might seem incongruous, but there are good reasons. Old Khmer had already undergone significant sound changes by the time it was being written down; also, the lexical register of Old Khmer strongly preferred Indic terms over native forms. However, the latter point proves very useful in terms of reconstructing Old Khmer phonological values, as done by Ferlus (1992).

3 See https://cik.efeo.fr/ (accessed 10 January 2021). Additionally there is a photo library of epigraphic material (http://collection.efeo.fr/ws/web/app/report/index; accessed 10 January 2021), but copyright permissions must be obtained to gain access.

4 http://sealang.net/oldkhmer/, http://sealang.net/classic/khmer/ (accessed 10 January 2021).


35.2.3 Cham

The history of Cham inscriptions spans from about the 5th to the 15th century. The Đông Yên Châu inscription, mentioned above, is generally dated to the mid-4th century and is often heralded as the earliest epigraphy in SEA, predating Mon, Khmer, Malay and others by centuries; however, that inscription is dated only by epigraphy, and its priority is open to challenge.

The Cham people are closely related to ancient Malays, being descendants of settlers who apparently began migrating from Kalimantan to settle the Indo-Chinese coast during the 1st millennium BCE (Higham 2014; Thurgood 1999). They developed a network of Indianized city states along the central coastline of Vietnam, roughly from present-day Da Nang to Phan Rang, from the 2nd century AD, reaching a peak in the 10th century, after which rivalry with Angkor and Vietnam saw Champa in a long and intermittent retreat, which ended in 1832 with the absorption of a remnant Champa state around Phan Rang into Vietnam. The decline of Champa was also associated with a millennium of diaspora, with migrations to Hainan, Cambodia, Sumatra, and a substantial Chamic presence in the highlands of central Vietnam. Historically the dominant religion in Champa was the Hindu cult of Śaivism (later substantially replaced by Islam, although the 2009 Vietnam census still counted some 56,427 Hindu Chams), and the Cham epigraphic legacy consists substantially of Śaivist texts in Sanskrit, plus some mixed and Cham-language texts.

Champa inscriptions are not particularly numerous: today somewhat more than 400 are known, and fewer than 300 of them have been inventoried. Given the dominance of Sanskrit, and the moderate number of epigraphs in the Cham language, the inscriptional materials have played no effective role in the comparative reconstruction of proto-Chamic (e.g. Thurgood 1999). An inventory of Champa inscriptions was initiated by Coedès in the early 20th century, assigning C numbers (in the fashion of Khmer K numbers).
The 1908 edition compiled 118 entries, extended to 170 in the 1923 edition (Coedès and Parmentier 1923). Supplements over the following two decades brought the number to 200, but subsequently the inventory was neglected, which is unsurprising given the tumult in Vietnam from the 1960s. The inventory was resumed at EFEO in cooperation with Vietnamese scholars, in a project headed by Arlo Griffiths (Corpus of the Inscriptions of Campā [CIC]). The first publication in that series (Griffiths et al. 2008–2009) covers 240 C-numbered inscriptions; additionally, the project maintains a website⁵ which, at the time of writing, makes available 50+ inscriptions with images, transcriptions and translations, metadata and commentaries. Additionally, most Champa inscriptions published before 2004 are available in a single aggregation published in French and English versions

5 https://isaw.nyu.edu/publications/inscriptions/campa (accessed 10 January 2021).




(Golzio ed. 2004), and other relevant resources/collections are listed in the CIC Bibliography.⁶

35.2.4 Siamese/Lao

The epigraphic record of Thai begins with the four-faced Ramkhamhaeng stele (labeled Inscription I) of Sukhothai, dated to 1292. Earlier inscriptions found in Siam/Thailand were written in Khmer (or Khom, as the language is traditionally called in Thai), Sanskrit, or Pali. Inscription I was discovered in Sukhothai in 1833 by the future king Rama IV (Mongkut) and first translated into English and published by Bradley (1909). The Ramkhamhaeng stele gives rare insight into life in Sukhothai in the 13th century, something quite uncommon in Southeast Asian kingdoms of the time. While other inscriptions are more concerned with religious and royal ceremonies and donations, Inscription I gives a detailed account of different aspects of Sukhothai, including religion, trade, warfare, tax policy, and a personal description of the king’s family. Interestingly, the inscription also describes the invention of Thai writing by king Ramkhamhaeng and dates it to AD 1283.

The language of the inscription is very similar to modern Thai, with some archaic and dialectal traits, but the script itself differs not only from modern Thai, but also from all other Indic scripts used in South and Southeast Asia. Most strikingly, Inscription I places the symbols for vowels on the same line as the consonants, while in traditional Indic scripts vowels appear as diacritics before, after, above, or below the consonant to which they belong. This manner of writing is not found in any other inscription in Thai or another language of the area. The other innovation is the consistent marking of three tones, in line with the state supposed to have prevailed at the time. Tone markers do appear in later Thai inscriptions, but not in a consistent way until many centuries later. These innovations, together with the unusual contents, have led scholars, both Thai and western, to doubt the authenticity of Inscription I.
In the late 1980s, the “Ramkhamhaeng Controversy” led to numerous publications both in favor of and against the view that the stele is indeed the product of the 13th century. The advocates of a later date point to the discovery of the inscription in the mid-19th century, when Siam was creating its national identity and historical narrative to fight off the threat of colonialism, with the British in the west (Burma) and the French in the east (Indo-China). The discovery by the future king Rama IV more or less coincides with his interest in European languages and the publication of a Thai grammar in Latin by one of his mentors, the French missionary Pallegoix. The theory is that the inscription was commissioned by the future king in order to create a national narrative of a civilized modern state with a long-documented history. Other factors speak for

6 https://isaw.nyu.edu/publications/inscriptions/campa/bibliography (accessed 10 January 2021).


the authenticity of the inscription: the archaic character of the language and orthography, especially the consistent use of the long-obsolete letter , which is difficult to explain unless the text was actually written by a speaker of Thai before the sounds /x/ and /kʰ/ merged, which happened long before the 19th century (or by a linguist with knowledge of reconstructed states of Thai). The controversy has never been definitely settled, both sides having good arguments for their respective points of view (Chamberlain ed. 1991; Wongthes ed. 2003).

Inscription I has been very much the center of attention of research in Thai epigraphy, but there are numerous other, uncontroversial inscriptions as well. The Prime Minister’s Office in Bangkok has supervised the ongoing edition of inscriptions in modern script in the several volumes of the Prachum Silacharuek (Collection of Inscriptions) since the beginning of the 20th century. The volumes cover inscriptions found in all regions of Thailand in different languages. A more accessible collection was published by the Siam Society from 1968 to 1979 and later consolidated in a single volume (Prasert and Griswold 1992). This 821-page volume gives transcripts in modern Thai letters of some 40 inscriptions in Thai, Mon, and Pali, together with historical descriptions and English translations. Photos of the inscriptions are added, though in most cases their quality does not allow a direct reading. The oldest inscription apart from the Ramkhamhaeng stele is Inscription II, the Wat Srichum inscription (early 14th century), which gives a lengthy account of a monk from Sukhothai traveling to Sri Lanka and, upon his return, implementing the Sri Lankan Buddhist way of life (Prasert and Griswold 1992).
Several editions of inscriptions of Thailand have been published since 1924 (Coedès 1924, 1929), usually including inscriptions found in present-day Thailand in any language, transcribed in modern Thai, in some cases with translations (French, English, Thai) and photos. Griswold and Prasert (1992) give a number of Thai inscriptions dating between 1292 and 1563 with transcriptions in modern Thai, English translations and historical analyses. Skilling (ed. 2008) offers a comprehensive account of the Jataka inscriptions at Wat Si Chum in Sukhothai, linking them to the Jataka plaques at the Ananda Pagoda in Pagan. The best online resource is the database of the Princess Maha Chakri Sirindhorn Anthropology Centre in Bangkok, which offers several hundred inscriptions of Thailand, including many in languages other than Thai (accessible and searchable in Thai and English online).⁷ For northern Thai stone inscriptions and manuscripts, Buchmann’s (2011, 2012, 2018, 2020) ongoing work offers catalogues and descriptions, including a grammatical analysis and vocabulary.

The Thai epigraphic record is of limited linguistic value for the reconstruction of proto-Tai. This is mainly due to the fact that modern Thai orthography remains mostly unchanged from the orthography used in older texts, representing a pre-devoiced

7 https://db.sac.or.th/inscriptions/ (accessed 23 June 2021).




stage of the language.⁸ The inconsistencies in spelling found in later inscriptions point to the time of the ongoing sound changes, though much more research needs to be done in this field. In a few cases inscriptional Thai does retain distinctions that are lost in modern Thai, namely /x/ vs. /kʰ/ and /ɣ/ vs. /g/.

The earliest known inscription in Lao is dated to 1494 (Lorrillard 2018). The Lan Xang kingdom had close connections with northern Thai Lanna as well as Sukhothai, and the three languages are very closely related. It is not clear how independently Lao epigraphy developed in this closely knit and interacting Tai environment. Despite the importance of Lan Xang in the middle Mekong region, Laos has not received the attention from international historical and epigraphic research that it deserves, making the sources difficult to access and assess. Lorrillard (2018) gives a concise overview with good-quality photographs of selected inscriptions from Laos, but no transcripts or translations of these.

Several scholarly articles focus on Thai/Siamese epigraphic sources and their analysis, but there is no comprehensive description or dictionary of inscriptional Thai, making the older stages of the language more difficult to access than is the case with Mon and Khmer. As the authenticity of Inscription I is not definitely established, it cannot be used as a source of early Thai history with any certainty, though it is the text which would give us by far the most complete insight. Other inscriptions deal mostly with religious affairs such as donations, the building of pagodas and the erection of Buddha images.

35.2.5 Burmese

The earliest Burmese-language epigraph is conventionally regarded as the Burmese face of the quadrilingual Myinkaba Kubyaukgyi inscription of Rājakumāra (also known as the Myazedi inscription, circa 1113); 2013 saw the discovery of the quadrilingual Suwlumin inscription in the Mandalay region, provisionally dated to 1079, although 1054 has been claimed (Bee Htaw Monzel 2014). The Burmese inscriptional record continues uninterruptedly from the 12th century to modern times. It appears that, in some cases at least, old inscriptions were copied in later times. U Than Tun (1998) presents an early 13th century inscription written in Mon and Pali, with a Burmese text added almost a century later, which was unearthed in 1996. This finding turned out to be the original of an 18th century inscription that was published in 1897.

The Burmese writing system has traditionally been taken to be based on Mon, which served as the literary language at Bagan during the 11th and 12th centuries. This view was recently challenged by Aung Thwin (2005), who claims that there was no

8 At some point between the 14th and 16th century, all voiced plosives became unvoiced in Thai (and most other languages of MSEA). This devoicing is not represented in modern Thai orthography.


Mon polity in Lower Burma from which the Burmese could have inherited the script (among other cultural concepts). Rather, it was the Mon who borrowed the writing system from the Burmese. Aung Thwin’s hypothesis, while partly based on solid evidence, has been rejected by most scholars in the field.

Burmese inscriptions are relatively numerous across Upper, Central and Southwestern Burma, although the history of Burmese epigraphic study is somewhat complex. It began mainly with the efforts of European scholars such as Duroiselle, Blagden and Luce in the first half of the 20th century and was disrupted by WWII; although there was an enthusiastic resumption of work, marked by cooperation between Burmese and European scholars, the expulsion of foreigners and isolation of Burma from 1962 did much to take international attention away from the field. The posthumous publication of Luce’s (1985) two-volume Phases of Pre-Pagan Burma reflects the height of that mid-20th century period of fruitful cooperation which is accessible in English.⁹ Good summaries of the history of Burmese epigraphic studies are given by U Tin Htway (2001), Griffiths et al. (2017), and Frasch (2018).

Early lists of inscriptions include Tun Nyein et al. (1899) and Duroiselle (1921), and over 600 early stone inscriptions from Burma are reproduced in Inscriptions of Burma (Luce and Pe Maung Tin eds. 1934–1956). While that number includes epigraphs in various languages, the majority are from central Burma and are dated up to the mid-14th century. More were published in the 1960 and 1963 issues of the Bulletin of the Burma Historical Commission, and subsequent efforts by Burmese scholars yielded seven volumes covering some 900 inscriptions spanning the mid-11th to the late 18th centuries (e.g. Nyein Maung et al. 1972–2013), with substantial materials remaining unpublished.
The seven volumes produced so far by the Burmese government since 1972 present hand-written modern Burmese transcripts of many, but by no means all, Burmese inscriptions in chronological order. In late 2020, a group of Burmese and western scholars published a full archive of Old Burmese inscriptions, together with transliterations, making this important resource easily accessible for the first time.¹⁰ See also Frasch (2005) for a list of translated Old Burmese inscriptions up to that date. There is also active work on Burmese inscriptions by Japanese scholars, notably Hideo Sawada (Tokyo University of Foreign Studies), whose efforts include an online inventory of stone and ink inscriptions with images and metadata.¹¹ Ink-gloss epigraphs also form a significant part of the Burmese epigraphic legacy, mostly donatives and horoscopes on temple walls (U Ba Shin 1964) and captions on plaques and murals dating from the 12th century up to the modern era (U Ba Shin 1962; Munier and Myint Aung 2007).

9 Luce’s papers are held at the National Library of Australia, and a digital archive is accessible at: http://sealang.net/archives/luce/ (accessed 10 January 2021).

10 U Nyein Maung; Lewis-Wong, Jennifer; Khin Khin Zaw; McCormick, Patrick; Hill, Nathan (https://zenodo.org/record/4321314, accessed 23 February 2021).

11 http://www.aa.tufs.ac.jp/~sawadah/ODSEAS/burmcont-e.html (accessed 10 January 2021).




Based on Old Burmese inscriptions, Toru (2005) gives a concise account of Pagan period Burmese phonology and grammatical structure. This is probably the single most accessible and comprehensive source of Old Burmese for linguists working in the area. The epigraphic evidence of Old Burmese importantly clarifies the early history of cluster simplification in Burmese phonology, and also confirms substantial continuity in the grammar of the language through the 2nd millennium. Furthermore, the rendering of what in modern Burmese are creaky (á) and heavy (à) tones with final glottal stop and , respectively, adds support to the hypothesis of laryngeal origin of modern Burmese tones. In terms of informing comparative reconstruction of Tibeto-Burman, epigraphic Burmese has played only a marginal role compared to synchronic comparison.

Besides textbooks and papers on the development of the Burmese script, a number of Pagan era inscriptions have been published and analyzed in Burmese, and a short dictionary of Pagan period Burmese was published in 2001 by U Nyunt Han. The latter lists lexical entries in modern Burmese script but Old Burmese orthography, with (modern Burmese) pronunciation and short examples. In addition to their linguistic importance, Old Burmese inscriptions are also invaluable sources for the history of Burma, complementing foreign (mostly Chinese) accounts (e.g. Luce 1959a, 1959b, 1970), and religious and palace life in the Pagan kingdom (e.g. Frasch 1998).

The formerly independent kingdom of Arakan (present-day Rakhine State in Myanmar) is home to a distinct archaic variety of Burmese, Arakanese. The earliest inscriptions in Arakanese go back to the 14th century and are written in Burmese script of the time. This brings them close to the Burmese-Mon cultural area, which is markedly distinct from the earlier Sanskrit-Pali inscriptions of Arakan.
These are written in a northern Brahmi-type script, while the Mon-Burmese script goes back to a southern Brahmi variety (Kyaw Minn Htin and Leider 2018). Arakanese and other non-standard Burmese epigraphy is greatly under-studied, although much historical and linguistic insight could certainly be gained from more extended research in this field.

35.2.6 Pyu

Pyu is the name given to a language and culture of city states that thrived in Upper Burma through the 1st millennium, peaking in the 9th century before the ascendancy of the Burmans. Pyu remained as a vernacular language at least into the 13th century before being completely absorbed into Burman society. Pyu is regarded as a Tibeto-Burman language, but the classification remains unclear, and this itself hinders decipherment. Another aspect of the problem is the low proportion of Indic words in the inscriptions, and a lack of long parallel texts. Consequently, only a small number of inscriptions have been deciphered (Blagden 1913–1914; Luce 1985; Krech 2012) and


there have been important advances in recent years: Griffiths et al. (2017) provide a summary of Pyu studies, and one may note also the studies by Miyake on Pyu phonology (Miyake 2018) and grammar (Miyake 2019).

Archaeology and radiocarbon dating suggest that urban centers of Upper Burma such as Halin, Beikthano, and Sriksetra were established by the 2nd century CE, although the dating methods are imprecise and exact dates are unknown (Aung Thaw 1968: 62; Moore 2004: 1–2). By the mid-1st millennium, “[a] cultural change is indicated by new modes of expression of political dominance through royal sponsorship of religious buildings built in or near the walled cities” (Griffiths et al. 2017: 51), and inscriptions begin to appear which apparently reflect the Pyu language. It is apparent that Pyu was effectively absent from Lower Burma, and was not in significant contact with the Mon epigraphic tradition.

Early work on Pyu is represented by the inventory of inscriptions by Duroiselle (1921) and Blagden’s (1913–1914, 1914, 1917, 1919) decipherment of the Myazedi pillars from Bagan (faces in Pyu, Old Mon, Old Burmese, and Pali) as well as some other Pyu inscriptions. Other notable contributions include Shafer’s (1943) reworking of Blagden’s analyses, which extended the Pyu lexicon to over 150 words, and Luce’s (1985) aggregation of numerous works on Pyu, in addition to some reanalyses, which has been an important foundation for more recent work. Subsequent scholarly attention to Pyu did not formally extend Duroiselle’s numbered inventory, a task which was only reestablished in 2012 under the auspices of the EFEO. The Paris-based team, headed by Arlo Griffiths, actively pursues a project on Pyu, which maintains an online corpus of inscriptions, including images, transcriptions and extensive metadata and bibliography.¹² At the time of writing there were 161 items in the corpus, which includes some texts in languages other than Pyu, particularly Sanskrit and Pali.
Most Pyu epigraphs are short, and are found on a variety of materials and objects; only two large stelae, of the kind otherwise common in MSEA epigraphy, have been found. Dozens of the inventoried inscriptions are no more than number signs impressed into bricks found at Pyu sites, so their linguistic significance is limited.

One of the significant findings of the EFEO project (Griffiths et al. 2017) has been to overturn the notion of earlier scholars that Pyu lacked closed syllables. It is apparent that there were variants of Pyu script, including an aberrant form that omitted final consonants. Some early studies, such as Blagden’s work on the Myazedi pillars, happened to rely on inscriptions of this type, leaving Blagden and his successors with skewed views about Pyu phonology. Luce, for example, misread what are now recognized as final consonants as unexplained inserted characters.

Within Burmese scholarship there has also been a tradition of Pyu studies, pursued by U Mya, Tha Myat, and others. Writing in Burmese in the 1950s and 1960s,

12 http://hisoma.huma-num.fr/exist/apps/pyu/index2.html (accessed 10 January 2021).




their work received little international attention but was accessed for Luce’s aggregation. This includes U Mya’s (1961) Votive Tablets of Burma, and U Tha Myat’s (1963, 2011) Pyu Reader. More recent Burmese studies include Aung Thein’s (2005) study of Pyu inscriptions and Khin Maung Than’s (2013) Pyu dictionary, although such later works do heavily recapitulate earlier scholarship.

Much of the contemporary work of the EFEO project personnel focusses on the transliteration and interpretation of the Pyu inscriptions. Griffiths et al. (2017) discuss at some length their identification of various akṣara and diacritics, and their (re)interpretations. This includes recognition of various sub-linear graphs and akṣara as final consonants, marks which earlier scholars such as Blagden regarded as decorative or extraneous (see, e.g. Griffiths et al. [2017: 87] for an example of a single line of Pyu text interpreted as two separate lines in earlier studies). There has been a proposal for a Unicode version of Pyu script since 2010, and at the time of writing it remains on the Unicode Consortium’s roadmap.¹³

35.2.7 Sanskrit/Pali

The earliest inscriptions found in MSEA are written in Sanskrit and Pali, in Brahmi-derived letters. The Indic language inscriptions go back to the first half of the 1st millennium, predating the earliest vernacular writing by at least 200 years. Sanskrit is generally associated with Hinduism (mostly Shaivism in SEA) and Mahayana Buddhism, while Pali is the language of the Theravada school of Buddhism. At present, all of MSEA apart from Vietnam and the southernmost provinces of Thailand is mainly Theravada Buddhist, with Pali as the language of religion. Historically there is a regional opposition between early Arakan, Srikshetra (Pyu), Cambodia, and early Siam (Sukhothai, Ayudhya), which favor Sanskrit, and the mainly Pali areas of the Mon, Burmese, and northern Thai (Lanna). Even today, written Thai retains the complete set of Sanskrit letters, including , , and , all pronounced [s] in Thai. Burmese and Mon, on the other hand, represent Pali phonology by merging the three sibilants in one letter, namely (pronounced [s] in Mon and [θ] in Burmese).

With the earliest appearances of local languages in writing, inscriptions are frequently bi- or multilingual, often beginning with a verse in Pali or Sanskrit before continuing in Mon, Khmer, Thai, etc. In Burmese and Mon areas, a kind of glossed style appears in several inscriptions, giving a direct translation of the Pali phrase in the vernacular. This style continues to the present day in Mon and Burmese (nissaya, see Okell 1965). It is not clear to what extent Pali phrase structure and morphology have influenced the local languages, but there have been claims that some features of Burmese go back to Pali models (e.g. Yanson 2002).

13 https://www.unicode.org/L2/L2010/10295-pyu-chart.pdf (accessed 10 January 2021).


Pali writing continued in northern Thailand, where not only Buddhist texts were composed in this language, such as commentaries on the Pitaka, but also historical texts such as the Cāmadevīvaṃsa (the history of Chamdevi, early 15th century) and the Jinakālamālīpakaraṇaṃ (The sheaf of garlands of the epoch of emperors, 16th century), based on Sri Lankan Buddhist traditional chronicles but locally adapted (Swearer and Premchit 1998).

The cultural importance of Indic traditions in MSEA can hardly be overestimated. Apart from serving as a model for MSEAn writing, Sanskrit and Pali are also the major source of lexical borrowing in the region in historical and modern times. Furthermore, Indic poetic patterns and prose style have deeply influenced the literary languages of MSEA since the earliest times. Concise overviews of Indic epigraphy in Southeast Asia are given by Bronkhorst (2011) and Griffiths (2014). See also Hoogervorst (this volume) on the South Asian historical influence on MSEA.

35.3 Conclusion

The inscriptional record sets in at different times in different parts of MSEA. Apart from Vietnamese, it was in every case driven by Indic cultural influence, with Sanskrit and Pali appearing as the earliest written languages before the local vernaculars set in. The epigraphic corpus of MSEA is as varied as the history of its research, by both Asian and Western scholars. Publications of inscription corpora are not easily accessible outside of the institutions that hold them, and the quality of the available transcriptions and interpretations is in many cases not reliable. However, ongoing projects, including those using new imaging tools, are producing an emerging stream of newly published inscriptions, together with republication and reanalysis of inscriptions previously known only from rubbings and other incomplete images. A sense of this can be gained from the recent renewal of Pyu studies and the advances it has brought about. Additionally, epigraphy in MSEA plays an important role in assessing the socio-political, religious, and linguistic history of the region, informing multiple disciplines and catalyzing interdisciplinary scholarly cooperation.

References

Ang Choulean. 2013. Inscriptions of Angkor Wat: Ancient, middle and modern periods. Phnom Penh: Yosothor.
Aung Thaw. 1968. Report on the excavations at Beikthano. Rangoon: Ministry of Union Culture.
Aung Thein. 2005. pyū nhaṅ. pyū kyok cā myāḥ sui. ma hut pyū khet buddhā sāsanā [The Pyu and Pyu inscriptions, or the Buddhist religion at the time of the Pyu]. Insein: Zin Yadana Publishing House.



MSEA epigraphy 

 873

Aung Thwin, Michael. 2005. The mists of Ramañña. The legend that was Lower Burma. Honolulu: University of Hawai’i Press.
Bauer, Christian. 1991a. Notes on Mon epigraphy. Journal of the Siam Society 79(1). 31–83.
Bauer, Christian. 1991b. Notes on Mon epigraphy II. Journal of the Siam Society 79(2). 61–79.
Bauer, Christian. 2018. The Mon inscriptions of Thailand, Laos and Burma. In Daniel Perret (ed.), Writing for eternity. A survey of epigraphy in Southeast Asia, 135–149. Paris: École française d’Extrême-Orient.
Bee Htaw Monzel, Nai. 2014. Myittha Slab Inscription of King Sawlu (Bajrâbharaṇadeva). https://www.academia.edu/30201343/Myittha_Slab_Inscription_of_King_Sawlu_Bajrâbharaṇadeva (accessed 10 January 2021).
Billard, Roger & J. Chris Eade. 2006. Dates des inscriptions du pays khmer. Bulletin de l’École française d’Extrême-Orient 93. 395–428.
Blagden, Charles Otto. 1909. The Talaing inscription of the Myazedi pagoda at Pagan, with a few remarks on the other versions. Journal of the Royal Asiatic Society of Great Britain and Ireland. 1017–1052.
Blagden, Charles Otto. 1913–1914. The “Pyu” inscriptions. Epigraphia Indica 12. 127–132.
Blagden, Charles Otto. 1914. The Myazedi inscriptions. Journal of the Royal Asiatic Society of Great Britain and Ireland. 1063–1069.
Blagden, Charles Otto. 1917. The “Pyu” inscriptions. Journal of the Burma Research Society 7(1). 37–44.
Blagden, Charles Otto. 1919. The Pyu face of the Myazedi inscription at Pagan. Epigraphia Birmanica 1. 59–68.
Bronkhorst, Johannes. 2011. The spread of Sanskrit in Southeast Asia. In Pierre-Yves Manguin, A. Mani & Geoff Wade (eds.), Early interactions between South and Southeast Asia. Reflections on cross-cultural exchange, 264–275. Singapore: ISEAS Publishing.
Bradley, Cornelius Beach. 1909. The oldest known writing in Siamese. The inscription of Phra Ram Kamhaeng of Sukhothai 1293 AD. Bangkok: The Siam Society.
Buchmann, Marek. 2011. Northern Thai stone inscriptions (14th–17th centuries): Glossary (Abhandlungen für die Kunde des Morgenlandes 73-1), English & Thai edn. Wiesbaden: Harrassowitz Verlag.
Buchmann, Marek. 2012. Northern Thai stone inscriptions (14th–17th centuries): Catalogue (Abhandlungen für die Kunde des Morgenlandes 73-2). Wiesbaden: Harrassowitz Verlag.
Buchmann, Marek. 2020. Inscriptions from Northern Thailand in Dhamma script: Texts and translations. Glossary and indices (Asien und Afrika-Studien der Humboldt-Universität zu Berlin 53). Wiesbaden: Harrassowitz Verlag.
Chamberlain, James R. 1991. The Ram Khamhaeng controversy. Collected papers. Bangkok: The Siam Society.
Coedès, George & Henri Parmentier. 1923. Listes générales des inscriptions et des monuments du Champa et du Cambodge. Hanoi: École française d’Extrême-Orient.
Coedès, George. 1924. Recueil des inscriptions du Siam. Première partie: Inscriptions de Sukhodaya. Bangkok: Bangkok Times Press.
Coedès, George. 1937–1966. Inscriptions du Cambodge (École française d’Extrême-Orient collections de textes et Documents sur l’Indochine), 8 vols. Hanoi: Imprimerie d’Extrême-Orient.
Coedès, Georges. 1929. Recueil des Inscriptions du Siam, Deuxième partie: Inscriptions de Dvâravatï, de Çrïvijaya et de Lâvo. Bangkok: Bangkok Times Press.
Coedès, Georges. 1939. La plus ancienne inscription en langue cham. In Sumitra M. Katre & Parshuram K. Gode (eds.), A volume of Eastern and Indian studies presented to Professor F. W. Thomas, C.I.E.: on his 72nd birthday, 21st March 1939 (New Indian Antiquary Extra Series 1), 46–49. Bombay: Karnatak Publishing House.
Coedès, George. 1968. The Indianized states of Southeast Asia, W. F. Vella (ed.), S. B. Cowing (trans.). Honolulu: University of Hawaii Press.
Diffloth, Gérard. 1984. The Dvaravati Old Mon language and Nyah Kur. Bangkok: Chulalongkorn University Printing House.
Duroiselle, Charles (ed.). 1919–1936. Epigraphia Birmanica. Rangoon: Government Printing.
Duroiselle, Charles. 1921. A list of inscriptions found in Burma. Rangoon: Government Printing.
Estève, Julia & Brice Vincent. 2010. L’about inscrit du musée national du Cambodge (K. 943): nouveaux éléments sur le bouddhisme tantrique à l’époque angkorienne. Arts Asiatiques 60. 133–158.
Estève, Julia. 2010–2011. New approaches to old texts: Cambodian inscriptions in the digital age; Une conférence internationale organisée par le Corpus des inscriptions khmères, l’APSARA et l’Université de Sydney. Bulletin de l’École française d’Extrême-Orient 97/98. 403–405.
Ferlus, Michel. 1983. Essai de phonétique historique du môn. Mon-Khmer Studies 12. 1–90.
Ferlus, Michel. 1992. Essai de phonétique historique du khmer (Du milieu du premier millénaire de notre ère à l’époque actuelle). Mon-Khmer Studies 21. 57–89.
Frasch, Tilman. 1998. King Nadangmya’s great gift. In Pierre Pichard & François Robine (eds.), Études birmanes en hommage à Denise Bernot, 27–35. Paris: École française d’Extrême-Orient.
Frasch, Tilman. 2005. Inscriptions of Bagan, edited and translated. In Myanmar Historical Commission Golden Jubilee Volume, 134–148. Yangon: Yangon University Press.
Gerschheimer, Gerdi. 2003–2004. Le corpus des inscriptions khmères. Bulletin de l’École française d’Extrême-Orient 90/91. 478–482.
Golzio, Karl-Heinz (ed.). 2004. Inscriptions of Campā: Based on the editions and translations of Abel Bergaigne, Etienne Aymonier, Louis Finot, Edouard Huber and other French scholars and of the work of R. C. Majumdar; newly presented, with minor corrections of texts and translations, together with calculations of given dates. Aachen: Shaker Verlag.
Griffiths, Arlo & Brice Vincent. 2014. Un vase khmer inscrit de la fin du XIe siècle (K. 1296). Arts Asiatiques 69. 115–128.
Griffiths, Arlo & D. Christian Lammerts. 2015. Epigraphy: Southeast Asia. In Jonathan A. Silk, Oskar von Hinüber & Vincent Eltschinger (eds.), Encyclopedia of Buddhism, 988–1009. Leiden: Brill.
Griffiths, Arlo, Bob Hudson, Marc Miyake & Julian K. Wheatley. 2017. Studies in Pyu epigraphy, I: State of the field, edition and analysis of the Kan Wet Khaung mound inscription, and inventory of the corpus. Bulletin de l’École française d’Extrême-Orient 103. 43–205.
Griffiths, Arlo. 2014. Early Indic inscriptions of Southeast Asia. In John Guy (ed.), Lost kingdoms. Hindu-Buddhist sculpture of early Southeast Asia, 53–57. New Haven & London: Yale University Press.
Griffiths, Arlo, Amandine Lepoutre, William A. Southworth & Thành Phần. 2008–2009. Études du corpus des inscriptions du Campā III. Épigraphie du Campā 2009–2010: prospection sur le terrain, production d’estampages, supplément à l’inventaire. Bulletin de l’École française d’Extrême-Orient 95/96. 435–497.
Griswold, Alexander B. & Prasert na Nagara. 1992. Epigraphic and historical studies, nos. 1–24 published in the Journal of the Siam Society from 1968–1979. Bangkok: Historical Society.
Guillon, Emmanuel. 1974. Recherches sur quelques inscriptions môn. Bulletin de l’École française d’Extrême-Orient 61. 339–348.
Guillon, Emmanuel. 1977. Recherches sur quelques inscriptions mônes. Bulletin de l’École française d’Extrême-Orient 64. 83–114.
Guillon, Emmanuel. 1985. L’armée de Māra au pied de l’Ānanda (Pagán-Birmanie). Paris: Editions Recherche sur les Civilisations.




Higham, Charles. 2014. Early Mainland Southeast Asia. Bangkok: River Books Co.
Halliday, Robert. 1930. Les inscriptions môn du Siam, éditées et traduites. Bulletin de l’École française d’Extrême-Orient 30. 81–105.
Ishizawa, Yoshiaki, Claude Jacques, Khin Sok, Uraisi Varasarin, Michael Vickery & Tatsurō Yamamoto (eds.). 2007. Manuel d’épigraphie du Cambodge, vol. 1. Paris: École française d’Extrême-Orient.
Jacques, Claude. 1971. Supplément au tome VIII des inscriptions du Cambodge. Bulletin de l’École française d’Extrême-Orient 58. 177–195.
Jacques, Claude. 2002. Khmer epigraphy. Museum International 54(1/2). 37–43.
Jenner, Philip & Paul Sidwell. 2010. Old Khmer grammar. Canberra: Pacific Linguistics.
Jenner, Philip. 2011. A dictionary of Middle Khmer. Canberra: Pacific Linguistics.
Jenner, Philip. 2009a. A dictionary of pre-Angkorian Khmer. Canberra: Pacific Linguistics.
Jenner, Philip. 2009b. A dictionary of Angkorian Khmer. Canberra: Pacific Linguistics.
Khin Sok. 2007. Quelques principes de grammaire khmère ancienne. In Yoshiaki Ishizawa, Claude Jacques & Khin Sok (eds.), Manuel d’épigraphie du Cambodge, vol. 1, 11–22. Paris: École française d’Extrême-Orient.
Khanittanan, Wilaiwan. 2004. Khmero-Thai: The great change in the history of the Thai language of the Chao Phraya Basin. In Somsong Burusphat (ed.), Papers from the Eleventh Annual Meeting of the Southeast Asian Linguistics Society, 375–391. Tempe, AZ: Arizona State University, Program for Southeast Asian Studies.
Khin Maung Than. 2013. rheḥ hoṅḥ pyū-mran mā akkharā cā pe samuiṅḥ nhaṅ. pyū abhidhān: pyū akkharā asaṁ thvak Roman-English-mran mā [A history of the Ancient Pyu and Myanmar scripts and a Pyu dictionary that includes Pyu writing, pronunciation in Romanization, and English and Burmese glosses]. Sagaing: sutesana nhaṅ. kyamḥ pru ṭhāna [Research and Publication Department] / Sītagū International Buddhist Academy.
Krech, Uwe. 2012. A preliminary reassessment of the Pyu faces of the Myazedi inscriptions at Pagan. In Nathan W. Hill (ed.), Medieval Tibeto-Burman languages, 121–169. Leiden: Brill.
Kyaw Minn Htin & Jacques Leider. 2018. The epigraphic archive of Arakan/Rakhine State (Myanmar): A survey. In Daniel Perret (ed.), Writing for eternity: A survey of epigraphy in Southeast Asia, 73–85. Paris: École française d’Extrême-Orient.
Lorrillard, Michel. 2018. Research on the inscriptions in Laos: Current situation and perspectives. In Daniel Perret (ed.), Writing for eternity. A survey of epigraphy in Southeast Asia, 87–107. Paris: École française d’Extrême-Orient.
Luce, Gordon H. & Pe Maung Tin (eds.). 1934–1956. Inscriptions of Burma. Oxford: Oxford University Press.
Luce, Gordon H. 1985. Phases of pre-Pagan Burma: Languages and history, 2 vols. Oxford & New York: Oxford University Press.
Luce, Gordon H. 1953. Mons of the Pagan dynasty. Journal of the Burma Research Society 36(1). 1–19.
Luce, Gordon H. 1959a. Geography of Burma under the Pagan dynasty. Journal of the Burma Research Society XLII(i). 52–74.
Luce, Gordon H. 1959b. Note on the peoples of Burma in the 12th–13th century AD. Journal of the Burma Research Society XLII(i). 32–51.
Luce, Gordon H. 1970. Aspects of Pagán history – Later period. Bangkok: Siam Society.
Luce, Gordon H. & Bohmu Ba Shin. 1961. Pagan Myinkaba Kubyauk-gyi Temple of Rājākumār (1113 AD) and the Old Mon writings on its walls. In Bulletin of the Burma Historical Commission, vol. II, 217–416. Rangoon: Burma Historical Commission.
Marrison, Geoffrey E. 1975. The early Cham language, and its relationship to Malay. Journal of the Malaysian Branch of the Royal Asiatic Society 48(2). 52–59.


Miyake, Marc. 2018. Studies in Pyu phonology, II: Rhymes. Bulletin of Chinese Linguistics 11(1/2). 37–76.
Miyake, Marc. 2019. A first look at Pyu grammar. Linguistics of the Tibeto-Burman Area 42(2). 150–221.
Moore, Elizabeth. 2004. Interpreting Pyu material culture: Royal chronologies and finger-marked bricks. Myanmar Historical Research Journal 13. 1–57.
Munier, Christopher & U Myint Aung. 2007. Burmese Buddhist murals, vol. 1. Epigraphic corpus of the Powin Taung caves. Bangkok: White Lotus.
Nai Pan Hla. 1976. A comparative study of Old Mon epigraphy and Modern Mon. Oceanic Linguistics Special Publications 13, 891–918.
Nai Pan Hla. 2011. Archaeological aspects of Pyu, Mon, Myanmar. Yangon: Loka Ahlinn.
Nyein Maung et al. 1972–2013. rheḥ hoṅḥ mran mā kyok cā myāḥ [Ancient Burmese inscriptions], 6 vols. Yangon: Department of Archaeology.
Nai Tun Way. 2012. abhidhān lik tma’ man. A dictionary of the Mon inscription. Bangkok: Tech Promotion.
Okell, John. 1965. Nissaya Burmese. A case of systematic adaptation to a foreign grammar and syntax. Lingua 15. 186–227.
Perret, Daniel (ed.). 2018. Writing for eternity. A survey of epigraphy in Southeast Asia. Paris: École française d’Extrême-Orient.
Pou, Saveros. 1989. Nouvelles inscriptions du Cambodge I. Paris: École française d’Extrême-Orient.
Pou, Saveros. 2001. Nouvelles inscriptions du Cambodge II & III. Paris: École française d’Extrême-Orient.
Pou, Saveros & Grégory Mikaelian. 2011. Nouvelles inscriptions du Cambodge IV. Paris: L’Harmattan.
Pou, Saveros. 1992. Dictionnaire vieux khmer-français-anglais: An Old Khmer-French-English dictionary. Paris: Centre de documentation et de recherche sur la civilisation khmère.
Pou, Saveros. 1996. Les termes grammaticaux du vieux khmer (VIe–XIVe siècle). Bulletin de l’École française d’Extrême-Orient 83. 21–34.
Prapassorn, Posrithong (ed.). 1999. Phrapathom Chedi National Museum. Bangkok: Fine Arts Department.
Salmon, Claudine. 2018. Chinese epigraphic studies in Southeast Asia – An overview. In Daniel Perret (ed.), Writing for eternity. A survey of epigraphy in Southeast Asia, 287–321. Paris: École française d’Extrême-Orient.
Saraya, Dhida. 1999. (Sri) Dvaravati. The initial phase of Siam’s history. Bangkok: Muang Boran Publishing.
Seam, Long. 1992. Quelques traits grammaticaux caractéristiques de l’ancien khmer. Mon-Khmer Studies 20. 19–30.
Shafer, Robert. 1943. Further analysis of the Pyu inscriptions. Harvard Journal of Asiatic Studies 7(4). 313–366.
Shorto, Harry L. 1956. Notes on Mon epigraphy. Bulletin of the School of Oriental and African Studies 18(2). 344–352.
Shorto, Harry L. 1966. The devatā plaques of the Ananda basement. In Essays offered to G. H. Luce by his colleagues and friends in honour of his seventy-fifth birthday. Vol. 2: Papers on Asian Art and Archaeology, 156–165. Ascona: Artibus Asiae. Supplementum 23.
Shorto, Harry L. 1971. Dictionary of Mon inscriptions from the 6th to the 16th century. Oxford: Oxford University Press.
Shorto, Harry L. 2006. A Mon-Khmer comparative dictionary. Canberra: Pacific Linguistics.
Skilling, Peter (ed.). 2008. Past lives of the Buddha. Wat Si Chum. Art, architecture and inscriptions. Bangkok: River Books.




Somporn, Thitsadee & Robert McCarthy. 2012. Estampage of Bayon inscriptions. Phnom Penh: Digital Advertising.
Stark, Miriam. 2015. Southeast Asian urbanism: From early city to Classical state. In Norman Yoffee (ed.), Early cities in comparative perspective, 4000 BCE–1200 CE, 74–93. Cambridge: Cambridge University Press.
Swearer, Donald K. & Sommai Premchit. 1998. The legend of Queen Cāma. New York: State University of New York Press.
Taw Sein Ko. 1892. The Kalyānī Inscriptions erected by King Dhammacetī at Pegu in 1476 AD. Text and translation. Rangoon: Government Printing.
Thurgood, Graham. 1999. From Ancient Cham to modern dialects: Two thousand years of language contact and change (Oceanic Linguistics Special Publications 28). Honolulu: University of Hawaii Press.
Tun Nyein, Emanuel Forchhammer & Taw Sein Ko. 1899. Inscriptions of Pagan, Pinya and Ava: Translation, with notes. Rangoon: Archaeological Survey of India.
Toru, Ohno. 2005. The structure of Pagan period Burmese. In Justin Watkins (ed.), Studies in Burmese linguistics, 241–305. Canberra: Pacific Linguistics.
U Ba Shin. 1962. The Lokahteikpan. Early Burmese culture in a Pagan temple. Rangoon: Burma Historical Commission.
U Ba Shin. 1964. Pagan-min-sa-su Thutesana Leik-ngan [Manual of ink inscriptions from Pagan]. Yangon: Burma Historical Commission.
U Chit Thein. 1965. She haung Mun kyauksa paungyoke [Collection of Old Mon inscriptions]. Rangoon: Historical Research Department.
U Mya. 1961. Votive tablets of Burma, 2 vols. Yangon: Department of Archaeology.
U Nyunt Han. 2001. Bagan khit myanma kyauksa abidan [Dictionary of Pagan period Burmese stone inscriptions]. Yangon: Sa Pe Biman.
U Tha Myat. 1963. pyū phat cā: pyū akkharā samuiṅḥ [Pyu reader: A history of the Pyu alphabet]. Yangon: uḥ lha daṅ [U Hla Din].
U Tha Myat. 2011. pyū phat cā: pyū akkharā samuiṅḥ [Pyu reader: A history of the Pyu alphabet]. Yangon: kaṁ. kau wat raññ cā pe [Kant Kaw Wut Yee Publishing House].
U Than Tun. 1998. An original inscription dated 10 September 1223 that king Badon copied on 27 October 1785. In Pierre Pichard & François Robine (eds.), Études birmanes en hommage à Denise Bernot, 37–55. Paris: École française d’Extrême-Orient.
U Tin Htway. 2001. Burmese epigraphy: G. H. Luce’s legacy yet to be unearthed. Aséanie, Sciences humaines en Asie du Sud-Est 7. 35–57.
Vickery, Michael. 1973. The Khmer inscriptions of Tenasserim: A reinterpretation. Journal of the Siam Society 61(1). 51–70.
Vickery, Michael. 1998. Society, economics and politics in Pre-Angkor Cambodia: The 7th–8th centuries. Tokyo: The Centre for East Asian Cultural Studies for Unesco, The Toyo Bunko.
Wolters, Oliver W. 1999. History, culture, and region in Southeast Asian perspectives (Southeast Asian Program Publications 26), revised edn. Ithaca, NY: Cornell University.
Wongthes, Sujit (ed.). 2003. Suek silacharuek [The fight over the inscription]. Bangkok: Art and Culture.
Yanson, Rudolf. 2002. On Pali-Burmese interference. In Christopher I. Beckwith (ed.), Medieval Tibeto-Burman languages, 39–57. Leiden, Boston & Köln: Brill.
Zakharov, Anton O. 2019. The earliest dated Cambodian inscription K. 557/600 from Angkor Borei, Cambodia: An English translation and commentary. Vostok (Oriens) 1. 66–80.

Mathias Jenny

36 Writing systems of MSEA

https://doi.org/10.1515/9783110558142-036

36.1 Introduction

Writing was introduced to MSEA from two directions some 2,000 years ago, namely from the west through travelers and merchants from South Asia, and from the north and north-east through Chinese merchants and officials. While Chinese writing remained confined to present-day Vietnam, Indic scripts spread into all parts of MSEA and became the preeminent type of writing in the area, developing into several typically MSEAn subtypes. Much later, Arab merchants introduced the Arabic consonantal alphabet (abjad) mainly to insular and peninsular regions with substantial Muslim populations, where Indic and Arabic-style systems coexist to this day, the latter now mostly replaced by the Latin script. The Latin script is the most recent arrival in MSEA, introduced from the early 17th century by European missionaries. In Vietnam, Jesuit missionaries started replacing the traditional Sino-Vietnamese script (Chữ Hán) with the Latin-based orthography (Chữ Quốc Ngữ), making Vietnamese the only national language in MSEA that does not use an Indic writing system. Other local languages have been written in Latin characters since the 18th century, such as different varieties of the Kachin ethno-linguistic group and several Chin languages in Myanmar. More peripherally, a few writing systems were invented in MSEA for special use. These will briefly be presented in section 36.5.

In MSEA, much more than in Europe, script and language are intimately connected with cultural, religious, and ethnic identity. While traditionally a language that gained some local or regional status would develop a script of its own, today political and religious factors are more important when an unwritten language is first committed to writing. In some cases, centralized political systems may also force a national standard on local varieties formerly written in their own scripts, as can be seen in Thailand, where the traditional Lanna script has been all but superseded by the Standard Thai script, only to resurface more recently as a cultural item on some street signs and signboards.

The landscape of Southeast Asian writing systems is extremely diverse and offers interesting insights into the development and indigenization of scripts, adapting foreign models to local vernaculars. While a number of studies are available on individual scripts and their development (e.g. U Tha Myat n.d. for Mon-Burmese; Sai Kam Mong 2004 for Shan; Danvivathana 1978 for Thai; Hundius 1990 for Lanna), there exists no comprehensive or comparative overview of the area as a whole. This chapter is a first attempt at a synthesis of the different writing systems of MSEA, focusing mainly on the Indic scripts as indigenous developments of imported models, for which a typological classification is proposed here. No attempt is made to present complete inventories of the scripts and orthographies. The aim is rather to offer a typological overview. Detailed descriptions of individual writing systems and their functioning and development can be found in the literature. A quick overview of different writing systems worldwide, including numerous ones from MSEA, can be found at the Omniglot website.1

36.2 The history of writing in MSEA

36.2.1 First documents

While Chinese script first appears in the 2nd century BC in the easternmost part of MSEA, the first Indic inscriptions appear in present-day southern Myanmar, the Chao Phraya Basin (present-day Thailand), and Champa (present-day southern Vietnam) from around the 4th century (Coedès 1968; Thurgood 1999). The earliest documents in Indic scripts found in MSEA are written in Sanskrit and Pali, with vernacular languages appearing only around the 6th century (Mon and Khmer), complementing rather than replacing the classical Indian languages (see also Sidwell and Jenny, this volume, chapter 35).

36.2.2 Development and spread of writing in MSEA

As mentioned above, it was mainly the Indic scripts that found their way into and spread throughout MSEA. Chinese and Arabic writing were confined from the beginning to the periphery (north-east and south) of the region and to certain ethnolinguistic and religious groups (see Goddard 2005). Similarly, the Latin script was not in widespread use and did not succeed in replacing Indic writing systems. Only in the second half of the 20th century did efforts to introduce the Latin alphabet for hitherto unwritten languages outside the missionaries’ domain meet with some success, in China (e.g. Wa, Zhuang) and among non-national languages in the Vietnamese sphere of influence (e.g. Cham).

A first wave of Indian influence in MSEA seems to have brought Hindu culture to the region, later replaced by increasing Pali-Buddhist influence from Sri Lanka (Hartmann 1986). Although Indian-style kingdoms with Indian-style laws were introduced to MSEA over many centuries, there is no evidence of any South Asian language actually being used as a spoken vernacular in the area at any time. Sanskrit and Pali were apparently restricted in function to serving as literary languages of administration and religion.

When the scripts were used to write indigenous languages with phonological systems that diverge greatly from those of Sanskrit and Pali, several adaptations and innovations had to be introduced. These innovations led to diverging systems in different regions of MSEA, which will be presented in some detail in section 36.3.

1 https://www.omniglot.com/writing/index.htm (accessed 2 January 2021).

36.2.3 Language, writing, and religion

The choice of a writing system for any language depends on a number of factors, linguistic and non-linguistic. The latter are probably more important than the former in most cases, as socio-cultural, political, and religious preferences or pressures usually outweigh linguistic adequacy. A language may change its preferred writing system over time, reflecting the prevalence of religious, political, or purely practical factors. One case in point is Malay, which used an Indic script in its earliest inscriptions (7th century) and during the classical period (from the 14th century) was written in Arabic letters (Jawi script). The Arabic abjad, which in traditional Arabic usage basically represents only consonants, was rather inadequate to the phonology of Malay and other Austronesian languages and was replaced with the Latin alphabet (Rumi script) in the 20th century. The Malay varieties spoken in southern Thailand still use the Jawi script as part of their Muslim culture, distinguishing them from the Buddhist majority in the country, which uses the Thai script, as well as making a point of being politically independent from Malaysia. The colloquial use in Thai of jaːwiː (Jawi) equates the script with the language, referring both to the Arabic script and to the local Malay variety.

One major driver of the spread of writing is undoubtedly religion. As mentioned above, Muslim merchants from the Middle East brought with them not only Islam but also the Arabic script to SEA, and European missionaries tried, more or less successfully, to introduce the Latin alphabet together with Christianity. Much earlier, it was Indian travelers who brought Hindu worship and Buddhist practice to MSEA, together with South Indian scripts. Unlike Sanskrit, which was traditionally written in the Northern Indic Devanagari script, Pali, as the language of the Theravada Buddhism common in MSEA, was not confined to a single script, but rather appeared in the respective local alphabet of each region.2 As Theravada Buddhism was brought to MSEA from South India/Sri Lanka, it was the writing systems of these regions that formed the basis for indigenous developments. Initially, canonical Buddhist texts were written on palm leaves and metal sheets, which were kept at local temples and monasteries for preaching and teaching. With this monastic education, reading and writing spread throughout the Buddhist regions, which mostly covered the lowlands of MSEA. In a monograph-length study, Veidlinger (2006) presents the transmission of textual and oral Buddhism in northern Thailand, exemplifying the shared paths of religion and writing. This study can be taken as representative for many or most Buddhist societies in early MSEA.

The activities of Christian missionaries in MSEA from the 16th and 17th centuries resulted in a number of (mostly highland) communities converting to Christianity of different denominations. As their languages in most cases had no written form, the Latin alphabet was applied and is still the received standard in several groups throughout the region, though to a greater extent in Myanmar, Laos and Vietnam than in Thailand and Cambodia. In the case of Vietnam, the reason was evidently that the Latin alphabet had already been established for the national language, while in Myanmar and Laos there has been less (Buddhist) centralization than in Thailand and Cambodia.

2 In this chapter, I use “alphabet” in the colloquial, broad sense of “set of letters used in a writing system”, rather than in the more restrictive, linguistic sense of “set of letters representing individual consonants and vowels”.

36.3 Indic writing systems

Most literary languages of MSEA make use of abugida-type syllabaries derived from Southern Indian Brahmi-Pallava scripts (U Tha Myat n.d.). Ultimately derived from Phoenician, the Brahmi scripts are of the abugida or “vowel-incorporating” type (Coulmas 2003). Unlike the Semitic abjads, which represent only consonants, and true alphabets, which represent each sound with one letter (ideally, at least), abugida scripts are transparent, compositional syllabic scripts. Each character (akṣara) represents a consonant with an inherent vowel, usually /a/, and diacritics are added for other vowels.3 These diacritics appear before, above, under, or after the consonant they belong to and cannot occur on their own. Special symbols, which do not combine with consonants, are used for syllable-initial vowels. Consonant clusters and consonants in syllable-final position need special marking to suppress the inherent vowel. Indic scripts use combining forms of consonants (ligatures) in clusters and a special diacritic (virāma, ‘vowel killer’) for final consonants.

The consonants are arranged in groups (varga) of five, starting with the velars (ka-varga) and ending with the labials (pa-varga), with each group consisting of a plain stop, a voiceless aspirated stop, a voiced stop, a voiced aspirated stop, and a nasal. The remaining letters, consisting of approximants and fricatives, are outside the groups (avarga, ‘groupless’). This order of consonants, illustrated in Table 1, is retained in all MSEAn Indic scripts, with some variation in the more progressive systems.

3 In MSEA Indic scripts, as in the Sinhala alphabet, ⟨e⟩ precedes the consonant, ⟨i⟩ and ⟨ī⟩ are placed above, ⟨u⟩ and ⟨ū⟩ beneath, and ⟨ā⟩ after the consonant. The combination ⟨e⟩ + ⟨ā⟩ is /o/.




Tab. 1: Indic akṣaras with approximate phonetic values.

ka [k]     kha [kʰ]     ga [g]     gha [gɦ]     ṅa [ŋ]
ca [c]     cha [cʰ]     ja [ɟ]     jha [ɟɦ]     ña [ɲ]
ṭa [ʈ]     ṭha [ʈʰ]     ḍa [ɖ]     ḍha [ɖɦ]     ṇa [ɳ]
ta [t]     tha [tʰ]     da [d]     dha [dɦ]     na [n]
pa [p]     pha [pʰ]     ba [b]     bha [bɦ]     ma [m]
ya [j]     ra [r]       la [l]     va [w]
śa [ç]     ṣa [ʃ]       sa [s]     ha [h]
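The regularity of the varga arrangement can be made explicit with a small sketch. The following Python fragment is only an illustration, not part of the chapter's own apparatus: the romanized akṣaras are those of Table 1, while the place labels "palatal", "retroflex" and "dental" and the function name `classify` are assumptions added here for readability (the text itself names only the velar and labial groups).

```python
# Illustrative sketch: position in the traditional order determines
# both the varga (place of articulation) and the series (manner).
# The labels below are assumed names, not terminology from the chapter.
PLACES = ["velar", "palatal", "retroflex", "dental", "labial"]
MANNERS = [
    "plain stop",
    "voiceless aspirated stop",
    "voiced stop",
    "voiced aspirated stop",
    "nasal",
]

# The 25 grouped consonants in traditional order (five vargas of five).
GROUPED = (
    "ka kha ga gha ṅa ca cha ja jha ña ṭa ṭha ḍa ḍha ṇa "
    "ta tha da dha na pa pha ba bha ma"
).split()

# Approximants and fricatives stand outside the groups (avarga).
UNGROUPED = "ya ra la va śa ṣa sa ha".split()


def classify(akshara):
    """Return (place, manner) for a grouped consonant, or ('avarga', None)."""
    if akshara in UNGROUPED:
        return ("avarga", None)
    index = GROUPED.index(akshara)  # raises ValueError for unknown input
    return (PLACES[index // 5], MANNERS[index % 5])


if __name__ == "__main__":
    # Print the grid row by row, mirroring the layout of Table 1.
    for row in range(5):
        print(PLACES[row].ljust(10), " ".join(GROUPED[row * 5:row * 5 + 5]))
    print("avarga".ljust(10), " ".join(UNGROUPED))
```

Because the order is fixed, a letter's index alone is enough to recover its phonetic class, which is one reason the arrangement could be carried over into each MSEAn script largely unchanged.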

Combinations of a single initial consonant with non-default vowels (other than the inherent a) are in Devanagari गा gā, गि gi, गी gī, गु gu, गू gū, गे ge, गै gai, गो go, गौ gau. The example ‘village’, Sanskrit ग्राम grāma, illustrates the use of combined consonants: ग g + र r → ग्र gr, ग्रा grā, म ma. The subscript diacritic virāma ( ्) marks a final consonant without inherent vowel, as in the accusative case of ‘village’, ग्रामम् grāmam. The anusvāra ( ं) marks vowel nasalization in Sanskrit and is usually pronounced as syllable-final /ŋ/ in Pali, and the visarga (ः) marks devoicing of the vowel or final /h/ in Sanskrit.

This writing system is well adapted to the phonological structure of Sanskrit and other Indo-Aryan languages, but its use with Southeast Asian vernaculars required a number of adaptations. These adaptations were made differently in different regions, leading to three main types of abugida-based writing systems found in MSEA today, which in what follows will be called “western (Mon) type”, “eastern (Khmer) type”, and “northern (Tai Tham) type”. These three types and their subtypes are presented in the following sections.

Apart from a number of differences, the MSEAn Indic scripts share several inherited similarities, among which the lack of word spacing is especially striking for Western learners of the languages (or scripts). A space is generally only inserted between bigger chunks (phrases or even clauses) or when an intonation break is to be indicated. This usage contrasts with modern Vietnamese orthography, where a space separates each syllable, irrespective of word boundaries.
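The compositional behavior described above can be reproduced with Unicode code points. The sketch below is an illustration under stated assumptions: it relies on the Unicode encoding of Devanagari (an encoding convention, not something the chapter discusses), and the helper name `conjunct` and the dictionary `VOWEL_SIGNS` are hypothetical names introduced here.

```python
import unicodedata

# Devanagari code points (Unicode block U+0900..U+097F).
GA, RA, MA = "\u0917", "\u0930", "\u092E"
VIRAMA = "\u094D"    # suppresses the inherent vowel
ANUSVARA = "\u0902"
VISARGA = "\u0903"

# Dependent vowel signs, keyed by romanization. In Unicode they are
# stored after the consonant even when displayed before it (as with i).
VOWEL_SIGNS = {
    "ā": "\u093E", "i": "\u093F", "ī": "\u0940",
    "u": "\u0941", "ū": "\u0942", "e": "\u0947",
    "ai": "\u0948", "o": "\u094B", "au": "\u094C",
}


def conjunct(*consonants):
    """Join consonants with virāma so that only the last keeps its vowel."""
    return VIRAMA.join(consonants)


# The series gā, gi, gī, ... built from one consonant plus a vowel sign.
syllables = {"g" + name: GA + sign for name, sign in VOWEL_SIGNS.items()}

# grāma 'village': cluster g+r, vowel sign ā, then ma with inherent a.
grama = conjunct(GA, RA) + VOWEL_SIGNS["ā"] + MA
# grāmam (accusative): final m carries a virāma and so has no vowel.
gramam = grama + MA + VIRAMA

if __name__ == "__main__":
    print(" ".join(syllables.values()))
    for word in (grama, gramam):
        print(word, [unicodedata.name(ch) for ch in word])
```

Listing the character names makes visible that the conjunct is not a separate letter but the sequence consonant, virāma, consonant; how that sequence is drawn as a ligature is left to the font.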

36.3.1 The western type (Mon type)

The western (Mon) type scripts are mainly based on Pali orthography; consequently they lack the Sanskrit sibilants ⟨ś⟩ and ⟨ṣ⟩, which merged with /s/ in Pali (and phonetically in all MSEA languages, also in loanwords from Sanskrit), as well as letters for the vocalic sonorants in common use. Both the traditional and progressive western type scripts share a number of features, some of which are inherited from their Indic ancestors, while others are innovations of MSEAn usage. As in South Asian scripts, a short default vowel is always inherent in an akṣara, usually /a/ or similar.

884 

 Mathias Jenny

Consonant clusters in syllable-initial position are written in Mon type scripts with the first consonant in its full form, the second (and third, if present) with a reduced ligature form. This avoids potential ambiguity between words beginning, for example, with /kr-/ and /kar-/, the latter having both consonants in their full forms. A syllable-final consonant is marked as such by the virāma, called ‘killer’ in Mon (həcɒt) and Burmese (ʔəθaʔ). As in Sanskrit, the visarga is used to mark final /h/ in some contexts, while in others the full ⟨h⟩ with the ‘killer’ is used. Independent vowels appear only in some contexts; in modern Mon and Burmese they are restricted to a limited set of words. More common is the marking of the initial glottal stop by ⟨a⟩, which was reinterpreted as /ʔ/ in MSEAn scripts and counts as a regular consonant.

One innovative vowel combination appears in the earliest Burmese inscriptions (but only very rarely in Old Mon), namely ⟨i⟩ + ⟨u⟩. As the former is placed above the consonant and the latter below it, the order in Romanization has been the topic of some dispute among western scholars. The question whether the transliteration should be ⟨ui⟩ or ⟨iu⟩ obviously cannot be decided on the written form, but traditionally ⟨ui⟩ is the preferred rendering in publications.4 The linguistic evidence suggests that neither is adequate, just as ⟨eā⟩ would not be an adequate transliteration of ⟨o⟩, although it is composed of these two forms. Rather, it is very likely that the combination ⟨ui⟩ or ⟨iu⟩ stands for a central vowel (ɤ or) in Old Burmese and Middle Mon which is not present in Sanskrit and Pali, and not for a diphthong like /ui/ or /iu/ (Dempsey 2001).

A prominent superficial characteristic of the western type scripts is their round basic letter shapes. According to traditional belief, the writing material (palm leaves inscribed with a sharp metal stylus) did not favor straight lines, which might cause the leaves to break, with the result that the more square epigraphic characters developed into rounded shapes.

36.3.1.1 Traditional systems (Mon, Burmese)

The traditional western type scripts use a largely etymological orthography, not fully representing sound changes that occurred in the languages after the introduction of the script. Indic loanwords, especially the predominant Pali vocabulary, are commonly written as they are in Pali but pronounced according to indigenous phonology. Modern Mon orthography is not very different from Middle Mon, and modern Burmese retains Old Burmese spellings, though the spoken languages diverge in many respects.

The western type is best represented by Mon, which exhibits the full set of Pali consonants, plus two additional letters for implosive /ɓ/,5 as well as the Indic short vowel ⟨a⟩, which, as in other writing systems of MSEA, has been reassigned the value of /ʔ/ or vowel support in initial position. The original order of consonants is retained, with the additional characters appearing at the end of the list. The originally voiced consonants are today pronounced as voiceless, inducing the heavy register on the syllable. The only exception is ⟨ḍ⟩, which is used to represent /ɗ/. Basic sonorants are voiced and induce the heavy register on the syllable, except ⟨ṇ⟩ and ⟨ḷ⟩, which mark /n/ and /l/ with light register syllables, respectively. Table 2 gives the Mon consonant chart in Mon script (column 1), the Indic value (column 2), and the modern Mon pronunciation (column 3). According to the consonant class, the inherent vowel in modern Mon is either /a/ or /ɛ/, for light and heavy register consonants, respectively.

Tab. 2: Mon consonants.

1 2 3 | 1 2 3 | 1 2 3 | 1 2 3 | 1 2 3
က ka kaʔ | ခ kha kʰaʔ | ဂ ga kɛ̤ʔ | ဃ gha kʰɛ̤ʔ | ၚ ṅa ŋɛ̤ʔ
စ ca caʔ | ဆ cha cʰaʔ | ဇ ja cɛ̤ʔ | ဇျှ6 jha cʰɛ̤ʔ | ည ña ɲɛ̤ʔ
ဋ ṭa taʔ | ဌ ṭha tʰaʔ | ဍ ḍa ɗaʔ | ဎ ḍha tʰɛ̤ʔ | ဏ ṇa naʔ
တ ta taʔ | ထ tha tʰaʔ | ဒ da tɛ̤ʔ | ဓ dha tʰɛ̤ʔ | န na nɛ̤ʔ
ပ pa paʔ | ဖ pha pʰaʔ | ဗ ba pɛ̤ʔ | ဘ bha pʰɛ̤ʔ | မ ma mɛ̤ʔ
ယ ya jɛ̤ʔ | ရ ra rɛ̤ʔ | လ la lɛ̤ʔ | ဝ va wɛ̤ʔ | သ sa saʔ
ဟ ha haʔ | ဠ ḷa laʔ | ၜ ḅa ɓaʔ | အ a ʔaʔ | ၝ mba ɓɛ̤ʔ

4 The actual order of writing in Burmese is i+u, both in handwriting and Unicode-based computer typing.

5 The light register is derived from with a dot added to its center ၜ, while the heavy register form is originally a combination of + မ္ဗဗ, which in modern Mon script appears as one letter ၝ.
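The relation between consonant class and inherent vowel described above can be sketched as a small lookup. The dictionary covers only a few rows of Table 2, and the names `MON_CONSONANTS` and `citation_form` are hypothetical; the sketch assumes, following the table, that every citation form is onset plus inherent vowel plus final /ʔ/.

```python
LIGHT, HEAVY = "light", "heavy"

# Indic value -> (modern Mon onset, register), copied from a few rows of Table 2.
MON_CONSONANTS = {
    "ka": ("k", LIGHT), "kha": ("k\u02B0", LIGHT),
    "ga": ("k", HEAVY), "gha": ("k\u02B0", HEAVY), "ṅa": ("ŋ", HEAVY),
    "ḍa": ("ɗ", LIGHT),                  # voiced in Indic, light register in Mon
    "ṇa": ("n", LIGHT), "na": ("n", HEAVY),
    "ḷa": ("l", LIGHT), "la": ("l", HEAVY),
}

# Inherent vowel per register: plain a versus breathy ɛ (ɛ + U+0324).
INHERENT_VOWEL = {LIGHT: "a", HEAVY: "\u025B\u0324"}

def citation_form(indic):
    """Return the modern Mon citation form for an Indic consonant value."""
    onset, register = MON_CONSONANTS[indic]
    return onset + INHERENT_VOWEL[register] + "\u0294"  # final glottal stop

for indic in ("ka", "ga", "na", "ṇa"):
    print(indic, "->", citation_form(indic))
```

The pairs ga/ka and na/ṇa show the point of the table: the onsets are pronounced alike today, and the old distinction survives only in the register and vowel.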

Unlike other MSEAn Indic scripts, Mon does not assign conventionalized names to the consonants, but vowels have specific names used for spelling out words. Some consonants have special combining forms, namely , , , , ⟨r⟩ and , while others appear reduced in size as subscripts. Word-internal consonant clusters are usually represented by stacked consonants in Mon (and in Burmese). Orthographic clusters of the type usually represent aspirated/voiceless sonorants. Several consonant clusters and vowel-final combinations (rhymes) have a special pronunciation: is pronounced as /h/ in most contexts, represents /s/ in heavy register syllables, is /kr-/, is /kəl-/, is pronounced as /iəŋ/ or /ɔɲ/, as /it/ or /ɛt/, among many other regular and irregular orthography-pronunciation correspondences. There remains a good amount of ambiguity in the conservative Mon orthography, including irregular register marking (see Jenny 2015 for more details).

The phonological status of the Indic aspirated voiced stops in Old Mon is not quite clear. As real voiced aspirates are unlikely on typological grounds, they are probably best analyzed as close clusters /gh/, /jh/, etc., which became real aspirates in Modern Mon (kʰ, cʰ, etc.), but their original voicing led to the heavy register. This is in accordance with voiceless aspirated stops, which in Old Mon must be analyzed as clusters, allowing infixes to intervene between the stop and h.

Example (1) shows Mon writing, transliteration, and pronunciation. A unique feature of modern Mon script is the repetition of the final consonant with the virāma to indicate reduplication of the word, as seen in the second word, hɛt-hɛt ‘to be quiet’. The same word also shows the reading /h/ for , and /ɛt/ for . Unlike Burmese, the vowel always requires a final consonant, with serving as default if no other consonant is present, as seen in /kɒ/ ‘to give, let’.

6 In Burmese, this letter is written မ္.

(1)

Mon script ် တ် မံံင်သ္ၚိ ိ� တ် ် ဏး ် း ဂွံံ�လဴကဵု ဴ ပုံ ု� ံ� ဂွံံ�ကလင််။ ṇaḥ gwaʔ lau kɯw puṁ gwaʔ kalaṅ>

low

/pʰàːk/ low

/pàːk/ falling

/pʰâːk/

Thai differs from Khmer also in the use of the visarga (ḥ), which does not mark final /h/, but final /ʔ/ after some vowels, and the rhyme /aʔ/ if no written vowel is present. Having lost the ligatures representing consonant clusters, Thai orthography shows some ambiguity when there are more than two consonants in initial position. The words written and are pronounced /pʰəlaŋ/ ‘strength, power’ and /pʰláŋ/ ‘make a mistake’, respectively. The inherent vowel is /ɔː/, /ʔə~aʔ/, or /o/, depending on the context; short /a/ in closed syllables is written with an innovative symbol ◌ั, derived from the visarga (ะ). Other innovative vowel signs in Thai are ◌ึ and ◌ื (for /ɯ/ and /ɯː/, respectively), derived from ◌ิ; แ, derived from เ by duplication; and the use of อ as vowel /ɔ/. Thai retains foreign spellings also in final consonants, although they are adapted in pronunciation to the possible codas in Thai, namely k, t, p, ʔ, ŋ, n, m, j and w. The word ‘English’ is spelled , but pronounced /ʔaŋkrìt/, similarly ‘French’ is written and pronounced /fəràŋsèːt/.

Table 6 shows the full set of Thai consonants with their Indic or assumed Old Thai values (column 2) and modern pronunciations (column 3). Note that the mid and low consonants in the open syllables (citation form in the alphabet) have mid-level tone, while the high consonants have rising tone. The original voiceless stops /t/ and /p/ represent the originally implosive /ɗ/ and /ɓ/ respectively, as in Khmer. These are now pronounced as voiced stops in standard Thai. Their voiceless counterparts are written with derived forms, avoiding the ambiguity of traditional Khmer, where ⟨p⟩ stands for /ɓ/ or /p/, depending on the context. The letter ฮ ⟨ɦ⟩ was introduced more recently to write loanwords (mostly Chinese and English) requiring the low series of tones. It was presumably never actually pronounced as voiced /ɦ/ in Thai. Both ฃ ⟨x⟩ and ฅ ⟨ɣ⟩ are obsolete, but still occur in the official alphabet as taught in schools in Thailand.
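The neutralization of written final consonants to the small set of Thai codas can be sketched as a lookup table. The grouping of letters below is not given in this chapter; it follows the standard reading rules for Thai finals as an assumption, and the names `FINAL_CLASSES` and `coda_of` are hypothetical. The coda /ʔ/ is left out because it does not correspond to a written final consonant. The two test words are the usual Thai spellings of ‘English’ and ‘French’.

```python
# Written final consonant letters grouped by the coda they are read as
# (assumed standard Thai reading rules, not taken from the chapter).
FINAL_CLASSES = {
    "k": "กขคฆ",
    "t": "จชซฎฏฐฑฒดตถทธศษส",
    "p": "บปพฟภ",
    "n": "ญณนรลฬ",
    "ŋ": "ง",
    "m": "ม",
    "j": "ย",
    "w": "ว",
}

# Invert the grouping into a letter -> coda lookup.
CODA = {letter: coda
        for coda, letters in FINAL_CLASSES.items()
        for letter in letters}

def coda_of(word):
    """Return the pronounced coda for the last written letter of a word."""
    return CODA[word[-1]]

# Both loanwords end in a sibilant letter in writing but in /t/ in speech.
for word in ("อังกฤษ", "ฝรั่งเศส"):
    print(word, "->", coda_of(word))
```

The sketch only inspects the last letter, so it does not handle silenced finals or words ending in a vowel sign.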


Tab. 6: Thai consonants (1 = Thai letter, 2 = Indic or assumed Old Thai value, 3 = modern pronunciation; – = no letter in this position).

1 2 3 | 1 2 3 | 1 2 3 | 1 2 3 | 1 2 3 | 1 2 3 | 1 2 3 | 1 2 3
ก k kɔː | – | ข kh kʰɔ̌ː | ฃ x kʰɔ̌ː | ค g kʰɔː | ฅ ɣ kʰɔː | ฆ gh kʰɔː | ง ṅ ŋɔː
จ c cɔː | – | ฉ ch cʰɔ̌ː | – | ช j cʰɔː | ซ z sɔː | ฌ jh cʰɔː | ญ ñ jɔː
ฎ ṭ dɔː | ฏ ṭ̂ tɔː | ฐ ṭh tʰɔ̌ː | – | ฑ ḍ tʰɔː | – | ฒ ḍh tʰɔː | ณ ṇ nɔː
ด t dɔː | ต t̂ tɔː | ถ th tʰɔ̌ː | – | ท d tʰɔː | – | ธ dh tʰɔː | น n nɔː
บ p bɔː | ป p̂ pɔː | ผ ph pʰɔ̌ː | ฝ f fɔ̌ː | พ b pʰɔː | ฟ v fɔː | ภ bh pʰɔː | ม m mɔː
ย y jɔː | ร r rɔː | ล l lɔː | ว w wɔː | ศ ś sɔ̌ː | ษ ṣ sɔ̌ː | ส s sɔ̌ː | ห h hɔ̌ː
ฬ ḷ lɔː | อ a ʔɔː | ฮ ɦ hɔː

Thai consonants have conventional names used in spelling out in order to avoid ambiguity. The names assigned to the consonants are usually words beginning with the respective letter, except for the obsolete and , the names of which are kʰɔ̌ ː kʰùət ‘bottle KH’ and kʰɔː kʰon ‘man KH’, written with and , respectively. There is no word in Thai that begins with , so the name of letter uses a word with in medial position: lɔː cùʔlaː ‘kite L’. Example (6) illustrates Thai orthography and pronunciation. A variant of the numeral 2 is used to indicate reduplication or repetition of the preceding word(s). The spelling is used for the diphthong /uə/, as seen in the last word of example (6). (6)

Thai orthography
ที่บ้านเขามีหมาตัวใหญ่ๆ สองตัว
/tʰîː bâːn kʰǎw miː mǎː tuə jàj-jàj sɔ̌ːŋ tuə/
‘There are two rather big dogs at his house.’

Complete descriptions of the Thai writing system are available in numerous publications and online resources, including the first complete Thai grammar in a western language, Pallegoix’s Grammatica linguae Thai (1850), which gives a detailed account of the orthography, and the more modern study by Danvivathana (1987).10

Apart from standard Thai, the Thai script is also used for local languages in Thailand, in some cases replacing traditional writing systems, in others introducing literacy in a language for the first time. Non-standard orthographic adaptations were made where necessary in order to adequately represent the sounds of the target language. While the Thai system is not quite as flexible as the Latin alphabetic systems, it is nevertheless adaptive enough to handle most requirements more or less easily, even if computer-based printing of non-standard combinations is not always possible. Several literacy initiatives have been undertaken by Mahidol University in Thailand in the last decades, with varying degrees of success in establishing a written standard for local languages such as Karen, Nyahkur, and Surin Khmer, among others.

The Lao script is a further development of the Thai alphabet with a number of simplifications introduced in the mid-20th century. The Indic voiced aspirated stops were dropped, as their pronunciation presumably never differed from the non-aspirated voiced stops in Lao (and Thai). Likewise, Lao does not use the Sanskrit sibilants ⟨ś⟩ and ⟨ṣ⟩. Final consonants in loanwords, which in Thai are retained according to the original spelling in the source language, are neutralized in Lao orthography according to their pronunciation. Lao orthography represents all vowels overtly, making the script less ambiguous and more alphabet-like than Thai, although the vowel signs remain subordinate to the consonants. In spite of these reforms, Lao retains the consonant classes like Thai, marking tones according to their categories, rather than their pronunciation (a tone marker does not always represent the same tone, depending on the initial consonant). Furthermore, the Lao script is more conservative than Thai in retaining a few combined consonants as ligatures, where Thai has juxtaposed single consonants. This is the case especially for originally voiceless aspirated sonorants, which today are used for sonorants in syllables with high series tones.

The Lao script has been adapted for local languages in Laos, such as Katu and Khmu. Also, several varieties of a script similar to Lao, known as Tai Viet script, are used by Tai groups in Vietnam, such as the Tai Dam, White Tai, and others (Hartmann 1986b). The consonant classes are retained, and vowels are consistently marked as they are in Lao. Traditionally, no tones were indicated in the script, but recent reforms introduced Lao-style tone markers, as seen in Figure 1 (from Don et al. 1989).

10 A good online resource is offered by Omniglot: https://omniglot.com/writing/thai.htm (accessed 2 January 2021).

Fig. 1: Tai Dam script.


36.3.3 The northern type (Tai Tham type)

The northern type or Tai Tham scripts are similar to the eastern style scripts in many respects, but their appearance with round shaped letters is more similar to the western type. Like the eastern type scripts, Tai Tham scripts do not use the virāma to mark final consonants, and they use Thai style tone markers and vowel symbols, including and and the combinations and for /ɤ/. The latter combination represents original /ɯə/, which merged with /ɤ/ in Khuen, the Tai variety spoken in Eastern Shan State. Ligatures are used for initial consonant clusters, and word-internal consonant clusters (including geminates) are written as stacked consonants, as in Mon and Burmese. Tai Tham scripts are generally conservative, retaining old voicing distinctions in the script, resulting in three (in some varieties two) consonant groups, marking different tones in combination with the tone markers.

The status of the Indic voiced aspirated stops in Old Tai Tham scripts is unclear. As in Mon and Khmer, they are today pronounced as voiceless aspirated stops with low series tones, indicating original voiced initials. This would suggest a pre-devoicing pronunciation as /gʰ/, /jʰ/, etc., but this is typologically unlikely. Unlike in Mon and Khmer, the voiced aspirates can hardly be explained as clusters /g+h/, /j+h/, etc., as there is no phonological or other evidence for such an analysis of either voiced or voiceless aspirated stops in the old Tai languages.

36.3.3.1 Khuen, Lanna, Lao Tham

The traditional Tai Tham scripts are used for the closely related languages of Lanna (Kam Mueang), Khuen of eastern Shan State, and Lue in Xishuangbanna in Yunnan, as well as in monastic Lao texts. While the use of the Tai Tham script has been decreasing for many decades in Lanna (northern Thailand) and Laos, and the Lue script has been reformed in China (see 36.3.3.2), it is still in use in the area of Kyaingtong (Kengtung) in Myanmar. The shape of the letters varies to some extent in the different languages using Tai Tham, but this is a “font” issue rather than a systematic difference. There is one Unicode block for Tai Tham, and a few fonts are available for each variety.

Depending on the phonology of the language variety written in Tai Tham, consonants are grouped into two or three classes, corresponding to the Thai high, mid, and low consonants, continuing original voiceless, glottalized, and voiced phonemes. Sonorants are always voiced in the present languages, but the orthography of , , etc. suggests that voiceless aspirated sonorants existed at an earlier stage, as in Thai. Today, these spellings are used to indicate sonorant onsets in high series tones.

One special feature of Tai Tham scripts not found in other writing systems of MSEA is the rendering of final consonants as subscript. If there is a subscript vowel present in the same syllable, the final consonant appears in its full form on the main line without any marking. Since subscript consonants can also be used to write consonant clusters, ambiguity arises in some cases, as seen in the near-homographs hǐn ‘stone’ and nǐː ‘to flee’ as and , respectively.11 In the former case, the orthographic syllable structure is CVC, in the latter CCV.

Example (7), from a Khuen manuscript from Kengtung, illustrates some features of Tai Tham script. The first word is a common fusion of two lexemes ‘name, be called’ and ‘to say’. Stacked consonants can be seen in the geminates of the name , and the word shows the subscript final . The last word is another fused form, combining and the vowel . Tone marking is rather inconsistent in manuscripts, but regular in printed texts.

(7)

Khuen script

/cɯ̀ː wàː náːŋ kǎnnəkəwâttiʔ téːwíː wàː ʔǎn kɔ̀ míː lɛ́ː/
‘there was also one by the name Lady Kannakavatti Devi.’

Not much literature is available on Tai Tham scripts, although good overviews can be found in Hundius (1990) and Owen (2017). Apart from Tai languages, Tai Tham script is also used for southern Palaung (Rucing), which adopted the script from Khuen at Kengtung in eastern Shan State. Its use in Palaung communities is not widespread, though, and the literacy among speakers is very low. Presently, the Rucing Tai Tham script is in competition with the unified Ta’ang script which was designed based on northern Palaung scripts (see 36.3.1.3).

36.3.3.2 Reformed Lue

The Tai Lue of Xishuangbanna in Yunnan traditionally used the Tai Tham script, with basically identical orthography to the one used in Khuen and Lanna (Jagacinski 1986). In the 1950s, a writing reform for Lue was introduced in China, simplifying the script to some extent. The most important innovations involved placing all vowel signs and tone markers on the same line as the consonants, but the vowels still appear before or after the consonant (or both), so the linearity of the arrangement of letters is not complete. Also, the two consonant classes originating in long-lost voicing distinctions are retained, which means that the tone marking remains historical, rather than phonemic, as is the case in Thai, Lao and Tai Tham. Some ligatures appear in initial clusters, with a full sized first consonant and a subscript second consonant. The combinations of ⟨h⟩ and sonorant are used to mark high series tones with sonorant initials, identical to the other traditional systems. The reformed script appears in recent printed publications and online media, but it is not generally used by all speakers, as many still prefer the traditional Tai Tham script (Hanna 2012).12

11 As vowel length is not phonemic in most cases, ‘stone’ is frequently written as in manuscripts, making the two words perfect homographs.

36.4 Latin writing

Latin script-based orthographies were introduced to MSEA from the 16th century onwards, but were not particularly successful in replacing established local scripts, just as Christianity was not successful in replacing Buddhism (and Islam). Roman scripts could gain a strong standing mainly in previously illiterate societies, and in Vietnam, which obviously saw practical advantages in replacing the (equally imported) Chinese script by the more flexible Roman alphabet.

36.4.1 Christianization and alphabetization

Christian Nestorian missionaries first arrived in Annam as early as the 10th or 11th century, but it was only in the early 17th century that the Italian and Portuguese Jesuit missionaries started designing an orthography for Vietnamese, which was completed by the middle of the century by Alexandre de Rhodes. The Vietnamese Chữ Quốc Ngữ (National script) consists of Latin letters and digraphs (plus Đ đ for the implosive /ɗ/) to represent the sounds of Vietnamese, in some cases with non-standard use of Roman letters, such as for /j/ and for /s/. The vowels can be combined with one or two diacritics indicating different vowel qualities and tones. The compilation of a dictionary as well as translations into Vietnamese of Christian texts (but not the whole Bible at that time) facilitated the establishment of the religion and the script in present-day Vietnam, though the latter has been much more successful in the long term. Example (8) illustrates the Vietnamese orthography and its relation to the pronunciation in the standard northern dialect (from Brunelle 2015).

(8) Vietnamese script
Duy ngủ không sâu vì hàng xóm đang xây nhà.
/zɥiA1 ŋuC1 xowŋ͡mA1 sɔ̆wA1 viA2 haŋA2.sɔmB1 ɗaŋA1 sɛ̆jA1 ɲaA2/
‘Duy does not sleep well because the neighbors are building a house.’

12 A good online resource for New Tai Lue script can be found at https://www.webonary.org/dailu/?lang=en and https://omniglot.com/writing/tailue.htm (accessed 2 January 2021).

The Vietnamese script has been adapted for use with several other minority languages in Vietnam, including Cham, which has all but lost its original Indic script, and Nùng and Tày, which, unlike their close relatives Tai Dam and White Tai, do not use the Tai Viet script. In many cases additional letters and combinations had to be introduced to adequately render all phonemic distinctions in non-Vietnamese languages.

Other communities in MSEA were more open to Christianity and the Latin script brought by the missionaries. Among these are societies far away from the politico-cultural centers, such as the Kachin and Chin in Myanmar (see Kurabe and Imamura 2016 for Kachin). Without a Buddhist background, the conversion to Christianity was met with fewer obstacles than in the established lowland societies, and, as these communities were previously illiterate, they accepted the introduction of the Latin alphabet without resistance. There are legends of old, long-lost scripts among many groups, but these have not impeded the acceptance of Latin-based orthographies (Scott 2009; Kelly 2018).

In most cases, the Latin scripts make use only of standard characters used in English, disregarding vowel quality distinctions or using digraphs to represent them, and not marking tones at all. Limitations of English typewriters were obviously a factor in designing the orthographies, as the newly literate languages should be easily writeable on a standard typewriter. This is seen in Jinghpaw as well as most Chin varieties, where there is a reluctance to introduce diacritics to mark finer phonemic distinctions in order not to change the traditional orthography of the Bible. As Latin scripts were introduced to different communities independently, there are differences in usage among these orthographies. While in Chinic languages, aspirate stops are written as , , , and the final glottal stop is marked by , Jinghpaw uses , , and for aspirates and leaves the glottal stop unmarked. In other cases, like Rawang (a Tibeto-Burman language of the Kachinic group), tones are marked consistently, and additional vowel signs are used. In Rawang, the letter represents /ɯ/, while is the central vowel /ə/, and marks the glottal stop /ʔ/ in syllable-final position. As the case of Karen varieties shows, the conversion to Christianity does not necessarily go together with the introduction of the Latin alphabet, but missionaries could choose to adapt local scripts for the newly literate languages.

36.4.2 Post-missionary Latin scripts

In the second half of the 20th century, Roman scripts were introduced especially in China for several minority languages independent of Christian missionary activity. Like Pinyin, which is seen as a second standard in China, Latin-based orthographies are seen as pragmatic solutions, which are easy to implement. In most cases, only standard letters were used, but idiosyncratic combinations and the use of consonants to indicate non-standard vowels and tones can make these scripts quite opaque to the outsider. Such neo-Latin scripts are used for example in Hmong, Zhuang, and Wa. Example (9) illustrates the tone marking in Hmong, and (10) gives the Official Wa (Paraok) orthography, the PRC Wa orthography, and the standard pronunciation (Watkins 2013, which see for a complete account of Wa writing and orthography systems).

(9) Hmong tone marking
Orthography     pob    po      pos    poj     pov    pom   pog
Pronunciation   pɔ́     pɔ      pɔ̀     pɔ̂      pɔ̌     pɔ̤̀    pɔ̤̂
Gloss           ball   spleen  thorn  female  throw  see   grandmother

(10) Wa orthography and pronunciation
Official Wa: Lai pawd aux, keem: hoit jhak maix nawh?
PRC Wa: Lāi bōd ex, geem hoig nqag maix noh?
Pronunciation: la̤i bɔ̤t ʔɤʔ kɯm hɔc dʑʰak maiʔ nɔh
Translation: ‘Have you read my letter yet?’
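The use of syllable-final consonant letters as tone markers in the Hmong orthography of example (9) can be sketched as follows. The table contains only the seven forms shown in that example, with the diacritics as printed there; the function names are hypothetical, and the sketch assumes that a final b, s, j, v, m or g is always a tone letter.

```python
# Tone letter -> combining diacritics, as printed in example (9).
TONE_DIACRITICS = {
    "b": "\u0301",         # acute
    "":  "",               # no tone letter, no diacritic
    "s": "\u0300",         # grave
    "j": "\u0302",         # circumflex
    "v": "\u030C",         # caron
    "m": "\u0324\u0300",   # breathy mark + grave
    "g": "\u0324\u0302",   # breathy mark + circumflex
}

def split_tone(syllable):
    """Separate a final tone letter (if any) from the rest of the syllable."""
    last = syllable[-1]
    if last in "bsjvmg":
        return syllable[:-1], last
    return syllable, ""

def mark_tone(ipa_base, tone_letter):
    """Attach the diacritics for a tone letter to a vowel-final IPA base."""
    return ipa_base + TONE_DIACRITICS[tone_letter]

for word in ["pob", "po", "pos", "poj", "pov", "pom", "pog"]:
    base, tone = split_tone(word)
    # In example (9) the written base po corresponds to IPA pɔ.
    print(word, base, repr(tone), mark_tone("p\u0254", tone))
```

Because the tone letter occupies the position of a final consonant, the orthography needs no diacritics at all, which fits the typewriter constraint mentioned for other Latin-based systems.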

36.5 Other scripts

36.5.1 Pollard script

A novel syllabic script was introduced by the Methodist missionary Samuel Pollard for Hmong and Lipo/Eastern Lisu (“Pollard script”) in the 20th century, but it did not gain popularity beyond their respective communities and is losing ground to more widespread systems, especially outside of China (Enwall 1994). The Pollard script is an abugida-type syllabic script, using Latin letters as visual basis, but assigning new values to some letters and inventing new shapes resembling Latin capital letters. The inventory of initials consists of 23 basic signs plus 12 modified characters. Vowels and finals are added in smaller size, and tones are indicated by dashes on different levels above or to the right of the syllable. The apostrophe is added to initial consonants to mark a voiceless initial: T /d/, T’ /t/, C /n/, ’C /hn/, etc. Prenasalized initial stops are written as two consonants: CT /nd/, CT’ /nt/, etc.

36.5.2 Secret scripts

Secret scripts for special insider communication were invented independently at different places in MSEA at different times (Kelly 2018). Naturally, these writing systems were of limited use and generally disappeared after a change in situation made them redundant. In some cases, secret scripts go back to legendary origins or reflect a community’s “lost script” (see Scott 2009).

One example of an invented secret script is the northern Thai Yuttasara, which is based on the Lanna Tai Tham script, but combines the letters with numbers in an intricate and innovative way. The final outcome of composed syllables vaguely resembles cursive Chinese characters. Not much is known about the origin and spread of the Yuttasara, and the only publication on the subject (in Thai) is written as a textbook introducing the system, rather than providing background information (Thamthi 2001). Figure 2 illustrates Yuttasara (Thamthi 2001) as written in a mulberry paper manuscript.

Fig. 2: Yuttasara.

Another intriguing example of a secret script is the “Khom” script, reportedly devised and introduced by the leader of the anti-French rebellion in Laos, Ong Kommadam, in the early 20th century (Sidwell 2008). The script ceased to be used with the end of the rebellion in 1936, and only a few old people remember it today. The Khom script shows a number of ingenious and fascinating linguistic features not found in any other writing system of the area. The inventory consists of more than 300 characters, which are composed of an initial single consonant or cluster, the latter not derivable from the former in any way, and a rhyme of equally opaque form. This system of onset-rhyme combination is unique in MSEA. Superficially some characters resemble MSEAn Indic scripts, but most appear to be novel inventions not based on any recognizable model. The inventory of characters seems to cover all possible syllables in every language of the area. On the other hand, tones are apparently not marked, as they are not part of the phonology of most Austroasiatic languages, for which the script was probably designed. Figures 3 and 4 illustrate some characters for initials and rhymes (from Sidwell 2008).


Fig. 3: Khom initials.

Fig. 4: Khom rhymes.

36.6 Literacy

Reading and writing in MSEA today are part of general education in all countries, and large parts of the population have at least basic alphabetic skills. Until not very long ago, access to reading material was restricted to community or school libraries, and the publication of written material was costly and for practical reasons available only to certain groups in society.




36.6.1 Monastic and popular literacy

Traditionally, basic education was provided for boys who entered novicehood for some time, or who went to study at local monasteries. Reading and reciting Pali texts was part of this monastic education, which led to large portions of the male population having some knowledge of the local script, at least in theory. Texts written in local languages were mainly religious treatises, with literary activities taking place almost exclusively in the palace context. Reading for personal enjoyment is not part of traditional MSEAn culture and was only introduced with more widespread secular schools in the early 20th century. Even today, reading books for fun is not a widespread pastime in many parts of MSEA, although all countries in the region have a substantial publication activity, including original novels and poetry (both traditional and modern), as well as translations of a large number of international books of all genres.

36.6.2 Digital literacy

The biggest leap in everyday literacy, or in actively using written language in daily life, has come to MSEA, as in many other parts of the world, with the arrival of digital media and online resources. Unicode encoding has been standardized for most scripts, including non-national languages of MSEA, so that within a short time a vast source of reading material became available to potential readers. Mobile phones and internet access, the basic prerequisites for digital literacy, are ubiquitous, and people in urban as well as in rural areas in MSEA make extensive use of them.
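What Unicode standardization means in practice can be sketched with a small script detector. The block ranges below are taken from the Unicode Standard, not from this chapter, and cover only a selection of the scripts discussed here; the names `BLOCKS`, `block_of` and `scripts_in` are hypothetical.

```python
# (first code point, last code point, block name), from the Unicode Standard.
BLOCKS = [
    (0x0900, 0x097F, "Devanagari"),
    (0x0E00, 0x0E7F, "Thai"),
    (0x0E80, 0x0EFF, "Lao"),
    (0x1000, 0x109F, "Myanmar"),      # includes the Mon letters of Table 2
    (0x1780, 0x17FF, "Khmer"),
    (0x1980, 0x19DF, "New Tai Lue"),
    (0x1A20, 0x1AAF, "Tai Tham"),
    (0xAA80, 0xAADF, "Tai Viet"),
    (0x16F00, 0x16F9F, "Miao"),       # the Pollard script
]

def block_of(char):
    """Return the block name for a character, or None if it is not listed."""
    cp = ord(char)
    for first, last, name in BLOCKS:
        if first <= cp <= last:
            return name
    return None

def scripts_in(text):
    """Return the set of listed blocks that occur in a text."""
    return {name for name in map(block_of, text) if name is not None}

print(scripts_in("ที่บ้านเขามีหมาตัวใหญ่ๆ สองตัว"))
print(block_of("\u105C"))   # a Mon-specific letter in the Myanmar block
```

A check of this kind is a common first step when sorting mixed-script text, for instance social media posts, by writing system.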

36.7 Conclusion and outlook

The writing systems found in MSEA show a wide range of conventions and paths of historical development. While the Latin-based scripts were introduced independently at different times in different places over the last few centuries, the Indic scripts have been in use in the region for well over a thousand years. Coming from the same origin, they developed into distinct types, diverging from their origins and from each other in different cultural areas. These writing systems are important witnesses of linguistic developments in MSEA. While some scripts have undergone reforms to approach a more phonological ideal, the traditional etymological orthographies have proven very resistant, resulting in great discrepancy between the written form and the spoken language in many cases.

With digital technology available even in the remotest corners of the region in the form of mobile phones and the internet, some formerly neglected scripts are seeing a revival owing to the low-cost possibilities of online publication and distribution of texts. This goes together with an increased awareness of local culture and languages, and at the same time with a democratization of writing, as spreading one’s ideas through writing is no longer restricted to a select few, but accessible to everyone.

The writing systems in MSEA are regarded by language communities as an integral part of their cultural identity, making any significant change difficult, if not impossible, even if a reformed orthography might be beneficial for practical reasons. The cultural or religious attachment to a script in its received form is a strong characteristic of MSEAn societies. This is an important factor in the introduction of a script to a hitherto unwritten language, as a certain writing system may be rejected due to its connection with a culture felt to be “alien” by the community.

The digital revolution in communication makes it seem likely that more local languages will come up with their own writing systems in MSEA and elsewhere, in some cases through government-sponsored initiatives, as in Thailand, where Mahidol University has been active in the field for decades, in others initiated by language communities, like the Palaung in Myanmar. At the same time standard national languages are developing popular spellings, as can be observed in social media chats across the region. These non-standard spellings (or “sub-standard”, to the guardians of “correct” language use), while still emblematic for specific genres and users, may become more common with informal writing gaining ground over formal publications in the daily lives of many.


Kimmo Kosonen and Kirk Person

37 Language policy and planning in Mainland Southeast Asia 37.1 Introduction The five countries of Mainland Southeast Asia (MSEA) share many similarities in terms of geography, history, culture, religion, and language. Nonetheless, their language policies are quite different from each other. The contrasts are particularly evident in how each nation has chosen to address the issue of linguistic diversity in their language policies as well as how language planning is coordinated and carried out. As the other chapters of this volume show, MSEA is linguistically and culturally highly diverse. All five nations – Cambodia, Laos, Myanmar, Thailand, and Vietnam – have their dominant ethnolinguistic groups – Khmer, Lao, Bamar/Burmese, Thai, and Kinh/Vietnamese, respectively. The largest language in each nation also represents the largest and the most prominent ethnolinguistic group. In all cases this language has been designated the national/official language. Yet, all countries have a substantial number of other languages. Some details of different languages are given in other chapters of this volume. This chapter introduces language policy and planning in MSEA, focusing on how the five countries have chosen to include or exclude non-dominant languages (NDLs) in their national education systems.

37.2 Key concepts and terminology Language planning – in its simplest form – has traditionally been divided into two parts: status planning and corpus planning (Cooper 1989; Deumert 2001; Kaplan and Baldauf 1997). In the 1980s, Cooper (1989) added acquisition planning to this definition. Acquisition planning is also known as language-in-education policy or language education policy (Kaplan and Baldauf 1997). It basically refers to the processes through which different languages are used, taught, and learned in education systems. Status planning refers, for the most part, to language policy and includes, for example, the determination of status and domains of use for different languages, and decision-making about which languages are used for official and educational purposes in a given society. Kaplan and Baldauf (2003), and more recently Kirkpatrick and Liddicoat (2019), Kosonen (2017a, 2017b), and Sercombe and Tupas (2014) provide overviews on MSEA language-in-education policies.
Corpus planning, on the other hand, refers to the linguistic aspects of a particular language, and is thus directly relevant to the development of non-dominant languages, or any languages for that matter. Corpus planning can be further divided into several components: codification, standardization, and elaboration. Codification refers to the development of orthographies, i.e. writing systems. Standardization refers to spelling and the standard usage of the writing system among various varieties of a language, while elaboration relates to the development of new terminology and styles.
This chapter uses the term ethnolinguistic minority to refer to a group of people who: (a) share a culture and/or ethnicity and/or language that distinguishes them from other groups of people; and (b) are either fewer in terms of number or less prestigious in terms of power than the predominant group(s) in the given state (Kosonen 2010). The term non-dominant language (NDL) is used to refer to the languages or language varieties spoken in a given place that are not considered the most prominent in terms of number, prestige or official use by the government and/or the education system (Kosonen 2010). The term first language or L1 refers to a language a person speaks as a mother tongue, vernacular, native language or home language. It should be noted that bi- or multilingual people may consider several languages their first languages. The first language is seen here as a language that a speaker: (a) has learnt first; (b) identifies with; (c) knows best; or (d) uses most (Skutnabb-Kangas 2000; UNESCO 2003).

https://doi.org/10.1515/9783110558142-037

37.3 Language policy and planning by country This section discusses the details of language policy and planning (LPP) in each of the five MSEA countries. The discussion of each national case includes: (i) national context, (ii) status of different languages, (iii) official/national languages, (iv) role of “other”, particularly non-dominant, languages, and (v) acquisition planning, i.e. language-in-education policy and practice. The Lao and Vietnamese sections are briefer than the other three, as those two countries have seen fewer LPP developments in recent years.

37.3.1 Cambodia Cambodia is the least linguistically diverse nation in MSEA (Kirkpatrick and Liddicoat 2019; Kosonen 2017a). Native Khmer speakers comprise some 90 % of the population of 16 million, while an additional 26 languages are spoken by ethnolinguistic minority communities (Eberhard et al. 2020; UNESCAP 2019). The populations of most ethnolinguistic communities are small, apart from Cham, Chinese, and Vietnamese, who number in the hundreds of thousands (Eberhard et al. 2020; Kosonen 2019).




The Constitution of 1993 enshrines Khmer as the official language. Other languages are not given any status in the constitution or other major legislation. Khmer is the exclusive language of government administration. Until the late 1990s, the medium of instruction at all levels of education was Khmer, though some schools taught Chinese and Vietnamese as subjects of study (Kosonen 2019). The Royal Academy of Cambodia’s Institute of National Language [sic] and the more recently created National Council for Khmer Language are active in standardizing and promoting Khmer, thus performing corpus planning functions (Rinith 2019). In the early 1990s, international development agencies piloted L1-based bilingual education programs for adults and children in non-formal community learning centers. These were later expanded and adapted for use in formal government schools. Projects supported by CARE International and International Cooperation for Cambodia (ICC) in Ratanakiri and Mondulkiri provinces were particularly influential and served as models for further initiatives. Representatives from the Ministry of Education, Youth, and Sport (MoEYS), scholars from the Royal Academy of Cambodia, local and international NGO staff, as well as members of ethnolinguistic minority communities cooperated to develop and secure official approval of Khmer-based orthographies for several minority languages, including Brao, Bunong, Kavet, Krung, Tampuan, and Kui (Bos, Bos, and Page 2008; Jordi 2015). Expansion of L1-based multilingual education (MLE) to all five northeastern provinces followed, along with some action towards increasing the number of languages with MLE programs (Ball and Smith 2019; CARE International Cambodia 2004; Kosonen 2005, 2009, 2013, 2019; Sun 2009; Thomas 2002). The success of these programs influenced language policy development. Prior to 2007, there was no explicit policy on the use of NDLs in education. 
This changed with the Education Law of 2007, which gave authorities the right to choose the language(s) of instruction for “Khmer learners of minority Khmer origin” (Kingdom of Cambodia 2007; Kosonen 2010). The law nevertheless failed to address large NDLs like Cham, Chinese, Vietnamese, and Lao, which many Cambodians consider “immigrant” languages as opposed to indigenous “Khmer Lue” minority tongues (Ball and Smith 2019; Kosonen 2013, 2019). Additional policy support followed. In 2010, MoEYS released “Guidelines on implementation of bilingual education programs for indigenous children in highland provinces” (Frewer 2014; Kosonen 2013, 2019). In 2013, the Bilingual Education Decree further strengthened the position of NDLs in education. A series of consultations with multiple stakeholders resulted in 2015’s “Multilingual Education National Action Plan” (MENAP). MENAP was a detailed four-year plan for MLE implementation which increased the role of the government in MLE delivery. At the four-year mark, a UNICEF evaluation found that while MENAP brought positive results in some communities, it fell somewhat short in its goal to strengthen Cambodia’s overall MLE model (Ball and Smith 2019). The UNICEF report also highlights challenges encountered in expanding MLE to new languages, including Jarai, Kui, and Cham (Ball and Smith 2019; Kosonen 2019).


As a result of these efforts, Cambodia today boasts the strongest NDL-supportive policy framework in MSEA, and is the only MSEA country where MLE programs initiated by international organizations have been successfully adopted by and scaled up through the formal government education system. Impetus for these positive developments has come through long-term interaction between international development agencies, government officials, and NDL communities. The Cambodia experience thus reflects Kosonen and Benson’s (2021) contention that the effective inclusion of NDLs in formal education requires support from the top (policy level), bottom (community) and the side (academics, non-governmental organizations, international development agencies, etc.).

37.3.2 Thailand An estimated 73 languages are spoken in the Kingdom of Thailand (Eberhard et al. 2020). The populations of some ethnolinguistic communities, such as Lao-Isan, Kammeuang, Pak Tai, Patani Malay, and Northern Khmer, are in the millions (Eberhard et al. 2020). Available data on language populations are outdated, particularly for larger languages. Standard Thai (based on Central Thai as spoken in Bangkok) is the de facto official and national language. It is estimated that roughly half of the Thai population of nearly 70 million speak Standard or Central Thai as their first language (Eberhard et al. 2020; Kosonen 2013, 2017a; Kosonen and Person 2014; UNESCAP 2019). Standard Thai is widely spoken as a second language throughout the country, but no data are available on the language proficiency of second language Thai speakers. Thai linguist Suwilai Premsrirat points out that Thailand is a linguistically diverse country that sees itself as essentially monolingual (Premsrirat 2008 cited in Person 2014). So strong is this perception that the four most recent Thai constitutions (Thailand has had 17 since the abolition of the absolute monarchy in 1932) do not mention language – unlike the constitutions of the four other MSEA countries discussed in this chapter. State Mandate no. 9, promulgated by the nationalist Prime Minister Phlaek Phibonsongram in 1940, declared Thai to be the official language, stating further “Thai people must extol, honor and respect the Thai language, and must feel honored to speak it” (Royal Thai Government 1940; Warotamasikkhadit and Person 2011). Subsequent government formulations of national identity (often translated as ‘Thainess’) frequently mention language (Reynolds 2002). The Royal Society of Thailand (known as the Royal Society of Siam from 1926 to 1933 and the Royal Institute from 1934 to 2015) is the government agency responsible for corpus planning, i.e. codification, standardization and elaboration for the Thai language. 
This includes spelling standardization (as reflected in the authoritative dictionary) and the coining of new technical terms (often first revealed in smaller discipline-specific dictionaries). The Royal Society promotes proper Thai usage through cooperation with the Ministry of Education, as well as through the mass media (radio,




television, newspapers) and, increasingly, digital media (including standalone smartphone apps). The Royal Society reports directly to the Prime Minister. Thailand’s NDLs are primarily found along its borders, with a scattering of smaller NDLs in the Central region. Some of these NDLs boast large populations with high language vitality (Patani Malay, Akha, Sgaw Karen, etc.) while many smaller ones are severely endangered or moribund (Bisu, Mpi, Nyahkur, etc.) (Premsrirat and Person 2018). Some have long literary traditions utilizing traditional scripts (Mon, Patani Malay, Northern Thai/Lanna) while the Roman and Thai-based orthographies of other NDLs are more recent in origin, developed through the work of Christian missionaries or Thai academics (Premsrirat and Person 2018; Smalley 1994). Students from ethnolinguistic minority backgrounds typically lag far behind their Thai peers in school performance. The Ministry of Education does not disaggregate statistics on the basis of student ethnicity or first language, making it challenging to determine whether the disadvantage is due to remote location, language, or a combination of both. Nevertheless, the 2016 Multiple Indicator Cluster Survey, conducted by UNICEF Thailand and the National Statistical Office, found that nearly one-third of ethnolinguistic minority youth (aged 15–24) were illiterate, compared to less than 2 % of youth nationwide (National Statistical Office and UNICEF 2016). Academics, both Thai and international, have played a significant role in corpus planning and the educational use of various NDLs. As a result, several NDLs are currently used in L1-based education pilot projects run by academic institutions and non-governmental actors in partnership with the Ministry of Education (Draper 2019; Kosonen 2013, 2017b; Kosonen and Person 2014; Premsrirat and Person 2018; UNICEF 2018). 
In the late 1990s, linguists from Mahidol University began cooperating with the Chong community on a language revitalization project which came to include the teaching of Chong in a few local schools. Other ethnolinguistic communities heard about the Chong project and requested similar assistance. The exact nature of the revitalization projects differed according to context, with the minority language being taught in community learning centers (similar to the “language nest” approach in New Zealand) or as a subject in local schools. Mahidol’s language revival work has thus far engaged 27 languages, many of them severely endangered (Premsrirat and Hirsh 2018). The 2004 upsurge in violence in the Deep South, which included separatist attacks on Thai schools (perceived by Patani Malay-speaking Muslim separatists as cultural assimilation tools of the Thai Buddhist state), led Mahidol University linguists to work with Ministry of Education officials, UNICEF Thailand, the Thailand Research Fund and Patani Malay communities to develop the Patani Malay-Thai Mother Multilingual Education (PMT-MLE) program for grades K-6 (Joll 2013; Kosonen and Person 2014; UNICEF 2018). Yala Rajabhat University later partnered with Mahidol to develop undergraduate MTB-MLE courses and MTB-MLE student internships – the first Asian university to offer such opportunities to undergraduates. International recognition has


been forthcoming, with Mahidol University receiving the 2016 UNESCO King Sejong Award for Literacy, and Yala Rajabhat University the 2017 UNESCO Wenhui Award for Innovations in the Professional Development of Teachers (honorable commendation). PMT-MLE succeeded in dramatically improving student school performance in the midst of a conflict zone, and has figured prominently in Thailand’s language policy and education reform dialogue (UNICEF 2018). The Foundation for Applied Linguistics (FAL), a Thai civil society organization, developed similarly structured MTB-MLE programs in the north and west in a number of languages, including Mon, Sgaw Karen, Pwo Karen, and (most successfully) Hmong (PCF and FAL 2019; Person 2019). FAL likewise won honorable commendation for the 2013 UNESCO Wenhui Award for Educational Innovation for Cultural Expression. The work of FAL and its principal international partner Pestalozzi Children’s Foundation resulted in the formation of the Highland School Directors Association, which advocates on the local and national level for the unique needs of ethnic minority children. Thailand’s first comprehensive National Language Policy (NLP), developed by the Royal Institute and signed by two successive prime ministers in 2010 and 2012, is supportive of language revitalization and multilingual education programs in NDLs (Draper 2019; UNICEF 2018). The NLP deals with a range of language issues, but many key points relate to non-dominant languages and their use in education. The rationale given by the NLP goes beyond merely justifying the use of local languages as a way to teach the national language more effectively. The NLP also calls for the use of learners’ first languages as the basis for cognitive development. Political instability in the past decade has hindered the operationalization of the language policy, prompting the Royal Society to submit a more detailed NLP implementation plan to the Office of the Prime Minister in early 2020. 
As an outgrowth of its NLP work, which involved multiple meetings with NDL representatives, the Royal Society has expanded its corpus planning and standardization efforts to include NDLs. Committees composed of Royal Society fellows and academics from several Thai universities have worked with NDL community members to grant official recognition to ten Thai-based NDL orthographies. The Ministry of Culture has recognized several of these orthographies as “Cultural Treasures of the Nation” (Person 2018). Since 2005, periodic policy statements issued by the Ministry of Education and the National Security Council have called for mother tongue-based “bilingual education” in schools in the restive Deep South. There has also been increasing policy openness to NDL inclusion in schools elsewhere in the country. Chiang Mai is one of seven provinces granted educational autonomy through the 2018 “Education Sandbox” policy, and provincial authorities have worked with FAL and PCF to launch three types of language-related interventions: “Full” MTB-MLE (grades K1–3) and “Partial” MTB-MLE (grades K1–K2) in linguistically homogeneous schools, and a “Thai as a Second Language” curriculum for linguistically heterogeneous classrooms (Office of the Basic Education Commission 2018; Tienmee 2020).




These efforts are supported by the National Education Plan 2017–2036, which calls for an increase in “special areas that are teaching and learning by integrating the curriculum in accordance with local languages, cultures, and local society” as well as encouragement for “people in all ages in special areas to read and write Thai and local languages …” (Office of the Education Council 2017). Similarly, the Policy and Focus of the Ministry of Education, Fiscal Year 2020 called for primary schools to help students learn “local languages (mother tongues) for communication” (MoE 2019). A new government agency, the Equitable Education Fund, has instituted a “homegrown teacher” program through which high-achieving secondary school students from remote areas will receive scholarships to study education in regional universities with a guaranteed teaching position awaiting them in their home village – which could lead to a dramatic increase in the number of NDL-speaking government teachers (Equitable Education Fund 2018). Thailand thus presents a clear example of how the interplay of various stakeholders – community members, education officials, academics, civil society, United Nations agencies, etc. – has informed policy and practice, illustrating the “from above/below/the side” formulation mentioned previously (Kosonen and Benson 2021).

37.3.3 Myanmar An estimated 120 languages are spoken in the Union of Myanmar (Eberhard et al. 2020). The majority of the nation’s 54 million citizens are ethnic Bamar (also called Burmese or Burman), and while population figures per ethnicity gleaned from the 2014 national census have never been released, an estimated one-third of the population are of other ethnicities (Aung 2018; Aye and Sercombe 2014; UNESCAP 2019). Armed conflict between the national government and some ethnolinguistic minority groups has been a near-constant reality since Myanmar gained its independence from the United Kingdom in 1948. Eight of the 15 largest armed ethnic groups signed the Nationwide Ceasefire Agreement (2015), although tensions and sporadic fighting continue in those and other minority areas (South et al. 2018). The 2008 Constitution specifies Myanmar (Burmese) as the official language.1 The government administration uses Myanmar almost exclusively. Myanmar is also the main language of instruction in government schools, although English is theoretically the main medium of instruction in upper secondary, vocational, and university classes (British Academy and École française d’Extrême-Orient 2015). The Myanmar Language Commission (MLC) (formerly the Burmese Language Commission) is the government body charged with Myanmar language corpus plan-

1 Myanmar is the official name of both the country and the language, although the latter is often called Burmese.

914 

 Kimmo Kosonen and Kirk Person

ning, although its work has related primarily to standardization and codification but not elaboration (McCormick 2019). The MLC has maintained a long-standing Myanmar language dictionary project, which includes inputting the 2008 dictionary into electronic form, integrating an immense collection of paper lexicon cards from a SOAS Burmese dictionary project initiated in the 1950s. In recent years, the MLC has encouraged digital resources to be developed in Myanmar Unicode, rather than the widely used but technically problematic Zawgyi system (Hotchkiss 2016). MLC administrators have stated their intention to produce dictionaries in a half-dozen NDLs. The national government has long struggled to provide education in rural, lowland ethnic Bamar areas, much less the more remote ethnolinguistic minority regions. This is complicated by the language situation: it is estimated that some 30 % of children do not speak Myanmar upon entry to formal education (Aye and Sercombe, 2014; Kirkpatrick 2012; Kosonen 2017a; Martin 2011). A myriad of “ethnic education providers” (EEPs), often linked to ethnic armed organizations (EAOs), along with church- or temple-based schooling have emerged, offering some form of basic education in the local languages, Burmese, English and, in some cases, Mandarin Chinese or other languages of regional importance (McCormick 2019; South and Lall 2016). Quality and quantity of teaching materials vary greatly between EEPs, ranging from full K–10 curricula used by some of the larger EEPs to a ten-page “writing guide” from which teachers from a smaller EEP are expected to generate several years’ worth of lessons. Refugees living in camps on the Thai side of the border have benefited from teacher training and curriculum enhancement programs developed in cooperation with international development agencies. NDL corpus planning activities are thus being conducted almost exclusively by non-governmental actors. 
The 2008 constitution and promised social and economic reforms garnered significant international attention, with UNICEF, the European Union, and other donor agencies dramatically increasing their educational aid. These international actors saw government recognition of the EEPs and language-in-education policy reform as a peacemaking tool (Lall and South 2018). UNICEF’s Language and Social Cohesion (LESC) initiative (2014–2016) facilitated a series of discussions involving local government officials as well as hundreds of educators from various ethnolinguistic minorities (Lo Bianco 2016). The initiative aimed at finding common ground for cooperation, culminating in an unprecedented international conference on language policy and peacebuilding in Mandalay in February 2016. The conference stressed the need to build not only a national language policy, but also state and district level language-in-education agreements. These international efforts ran parallel to, but were not necessarily integrated with, government policy developments. The National Education Law of 2014 acknowledged Myanmar’s linguistic diversity, while stipulating that Myanmar and English were to be the main languages of instruction with ethnolinguistic minority languages used alongside Myanmar “if there is a need […] at the basic level” (Government of the Republic of the Union of Myanmar 2014: section 43). The Law was greeted with skep-



Language policy and planning in Mainland Southeast Asia 

 915

ticism in some corners, prompting student protests (The Irrawaddy 2015). Proposed amendments regarding ethnic language education were considered but rejected by parliament in 2015. LESC’s policy recommendations met a similar fate in 2016. Nonetheless, international interest in language-in-education issues has persisted. The Myanmar Education Consortium (MEC), for example, exists as a “multi-donor pooled fund” to work with “complementary education systems” to “provide the hardest-to-reach children with good quality, accredited education and contribute to a coherent, inclusive national system” (MEC 2016). MEC seeks to strengthen EEPs’ organizational and educational delivery capacities, with eventual government recognition of at least some of EEPs as legitimate education providers along the lines of accredited private schools a key advocacy goal. UNICEF, UNESCO, the World Bank, and the European Union, among others, continue to press for more inclusive language-in-education policies. Meanwhile, the Ministry of Education has moved from allowing NDLs to be taught after hours in government school buildings (with nominal financial compensation), to instruction during school hours. Some 20 % of class time (equivalent to one hour per day) can now be allocated to the minority language/culture teaching (South 2020). Under the Basic Education Law of 2019, state-level language, literature and culture committees are charged with determining what will be included in such classes, raising concerns of continued exclusion for smaller languages in favor of the dominant ethnic groups (Government of the Republic of the Union of Myanmar 2019). Nonetheless, these developments reflect the government’s acknowledgement of linguistic realities and a degree of willingness to develop appropriate policy responses. 
Once again, impetus for language policy reform is coming from below (local organizations), from above (local and national government), and from the side (international development agencies), albeit with little input from Myanmar academics (Kosonen and Benson 2021).

37.3.4 Vietnam

It is estimated that 109 languages are spoken in the Socialist Republic of Vietnam (Eberhard et al. 2020). Following ideological precedents established by the Soviet Union and the People’s Republic of China, the Vietnamese government officially recognizes only 54 “nationalities”. About 86 % of the population of 97 million are Vietnamese-speaking Kinh, while the rest of the population comprises various ethnolinguistic communities, some of which number in the hundreds of thousands (Benson and Kosonen 2012; Kosonen 2004, 2013, 2017a, 2017b; Nguyen and Nguyen 2019; Phan, Vu, and Bao 2014; UNESCAP 2019). The Constitution of 2013 declares Vietnamese to be the national and official language. Since 1968, the Institute of Linguistics, a unit of the Vietnam Academy of Social Sciences, has been charged with status and corpus planning for Vietnamese and non-dominant languages (Kosonen 2004). This includes “researching and proposing policy related to language issues” and “building written languages for ethnic minorities” (Vietnam Academy of Social Sciences n.d.). Institute researchers have written ethnographies, compiled dictionaries, and conducted language surveys, producing scholarly works in Vietnamese and English. The Vietnam Education Law of 2005 declares Vietnamese the official language of instruction in all educational institutions of the country. Some policy documents support the use of NDLs in education (even as media of instruction), while others appear non-supportive (Benson and Kosonen 2012; Kosonen 2004, 2013, 2017a, 2017b; Nguyen and Nguyen 2019). This is not unusual in countries following Soviet-influenced political ideology, where policies are often focused on ideologies and principles rather than implementation. In practice, Vietnamese remains the dominant classroom language (Benson and Kosonen 2012; Kosonen 2004, 2013, 2017a, 2017b; Nguyen and Nguyen 2019; Phan, Vu, and Bao 2014). Nonetheless, some NDLs are taught as subjects in some minority regions (Nguyen and Nguyen 2019). A few MLE pilot programs initiated by international development agencies roughly a decade ago generated positive results in terms of student achievement, and there is evidence of teachers continuing to use MLE materials and methods even after external project support ended (Kosonen 2013, 2017b; Nguyen and Nguyen 2019; Phan, Vu, and Bao 2014). Still, the emphasis on using the Vietnamese language to strengthen the national identity, combined with the push for more English, attracts greater government attention than education issues in ethnolinguistic minority communities. As Nguyen and Nguyen explain:

The weight attached to the importance of Vietnamese in education policy might negatively affect the status of minority languages […]. More balanced language-in-education policies are thus needed so that Vietnamese will not thrive at the expense of less dominant languages. (Nguyen and Nguyen 2019: 196)

37.3.5 Laos

Despite a total population of only 7 million people, the Lao People’s Democratic Republic is home to some 86 languages (Eberhard et al. 2020; UNESCAP 2019). Various sources disagree on the exact number of languages or ethnolinguistic groups in Laos, and the government officially recognizes only 49 ethnic groups (Meyers 2019). Like Vietnam, Laos follows a Soviet-influenced classification of ethnic groups which does not necessarily correspond to the languages people speak (Benson and Kosonen 2012; Kosonen 2005, 2010, 2017a). According to the Constitution of 2015, Lao is the official language and the Lao script the official script. The Constitution and other policy documents, however, provide some support to “ethnic groups” and “ethnic group areas”, reflecting the dominant
political ideology, but there are no references to the use of non-dominant languages in society or education (Djité 2011; Kosonen 2010; Meyers 2019). Djité summarizes the situation well when he argues that “[t]he wide disparity in education along ethno-linguistic lines is generally acknowledged; and yet, language issues seem to be mostly ignored in this country” (Djité 2011: 38). Efforts to standardize the Lao language date back to colonial period collaboration between the Buddhist Academic Council and the École française d’Extrême-Orient (Enfield 1999; Meyers 2019). The post-1975 government rapidly implemented spelling reform, based on previous work by Lao grammarians. In 2002 the Linguistic Research Institute was established under the Ministry of Information and Culture, conducting ethnographic and linguistic survey research during its brief institutional lifetime (Enfield 2007). The Ministry of Education is the current de facto language standardization body by virtue of its control over the curriculum development process (Chamberlain 2020). The Education Law of 2007 established Lao as the language of education. This is usually interpreted to mean that Lao should remain the sole language of education, especially in terms of written materials. Thus, non-dominant languages are not officially used in education. In some areas, however, teachers use local community languages informally to help minority pupils understand the lesson contents (Djité 2011; Kosonen 2010). Meyers notes that there is increasing “openness to use spoken ethnic languages to help students understand the meaning of Lao words in the early grades” (Meyers 2019: 211). Yet, it is evident that the government focus is on developing educational strategies and materials to help minority children learn spoken and written Lao as quickly as possible. 
Several policy documents, such as the seventh National Socio-Economic Development Plan 2010–2015 and the National Inclusive Education Policy of 2010 and the related Action Plan of 2011 emphasize teaching of Lao to minority populations, even at the preschool level (Meyers 2019). The Lao government seems reluctant to move forward on MLE, despite advocacy efforts by nongovernmental, multilateral, and donor agencies. Available educational statistics show that the enrolment, retention, and achievement rates of ethnolinguistic minority children are lower than the national average. The fact that at least half and possibly more of the Lao population do not speak Lao as their first language is a major challenge for the education system, even though it is rarely admitted as such by government authorities (Benson and Kosonen 2012; Cincotta-Segi 2014; Faming 2012; Kosonen 2005, 2009, 2010, 2017a; Meyers 2019). The large proportion of non-Lao speakers is likely a key reason why language issues are so highly political; as Djité notes: “[l]anguage policy in the Lao PDR is […] overshadowed by fundamental issues of infrastructure and politics of language” (Djité 2011: 42). Perhaps the more pressing challenge faced by the Lao government is an existential threat to the Lao language. The prominent role of the national languages of the other MSEA countries is self-evident; those languages are in no way endangered. This is not the case for Lao, as Meyers points out:


The similarities between spoken and written Lao and Thai present new challenges as television, social media and magazines from Thailand continue to dominate Lao. Preservation of the Lao language for future generations is likely to become a greater priority to national leaders than the preservation of ethnic languages. (Meyers 2019: 213)

37.4 Issues and challenges

Language policies throughout MSEA reflect the ideologies and priorities of the respective governments (Kirkpatrick and Liddicoat 2019; Kosonen 2017a, 2017b). Throughout the region, national and official languages are prioritized. Of other languages, only English, and secondarily, Mandarin Chinese, receive major attention from government agencies. Government interest in non-dominant languages varies from country to country. A major region-wide challenge is governments’ ethnolinguistic classification, which does not always correspond to the languages people actually speak. Vietnam officially recognizes only 54 ethnic groups or nationalities (Vietnamese Kinh being one), while twice that many languages are spoken in the country (Eberhard et al. 2020; Kosonen 2004, 2013; Nguyen and Nguyen 2019). In Myanmar, ethnic groups such as Karen, Kachin, and Chin (each with states named after the group) are considered by many as homogeneous communities, though major linguistic diversity exists within each designated ethnicity (McCormick 2019). Internal migration, urbanization, and the consequent increase in multilingualism and language shift exacerbate confusion as to who speaks which language. Unfortunately, it is disadvantaged ethnolinguistic minority children who are most likely to suffer the effects of misclassification when language-of-instruction decisions are made. Besides ethnolinguistic classifications, orthographies (or the perceived lack thereof) can also be problematic across MSEA. Government officials throughout the region often mistakenly claim that non-dominant languages in their countries cannot be used in education because they have never been written. The governments of Cambodia and Thailand prefer NDL orthographies be based on the script of the national language, frowning on traditional and Roman-based orthographies – even if they have been widely used for decades (Benson and Kosonen 2012; Kosonen 2013; Person 2009a).
This complicates matters for cross-border groups such as the Tai Dam, who employ their traditional Tai Viet script alongside Romanized (reflecting Vietnamese) and Lao-based orthographies (Vitrano-Wilson 2018). With the exception of Laos, all MSEA countries have language policies which support linguistic diversity – at least in theory. Nonetheless, these policies are not fully implemented, and can become problematic. Thailand’s Ministry of Education has generous policy provisions for the use of NDLs in government schools, but does not promote operationalization; that is left to non-state actors working in conjunction with local officials (Draper 2019; Kosonen 2019). Vietnam has the widest gap
(among Southeast Asian nations) between the written policy and actual practice, due to internally conflicting policies which make it difficult for any actor to know what is and what is not allowed (Kirkpatrick and Liddicoat 2019; Kosonen 2004, 2013; 2017b; Nguyen and Nguyen 2019). Cambodia’s MLE policy applies only to language groups perceived to be “indigenous,” while Myanmar’s policy of allowing state-level officials to determine the languages to be taught during school hours has created a situation where larger NDLs are marginalizing smaller NDLs. Pluralistic language policies are often opposed by government figures and parents alike, out of a lack of understanding of the role of language in children’s cognitive development. The most common misconception is that simply introducing an unknown language, such as the official language or English, to children as early as possible, increases and accelerates the learning of that language, an idea which has been thoroughly repudiated by global research (Benson 2019; Benson and Kosonen 2012; Collier and Thomas 2019). There is also confusion about the distinction between learning about a language versus learning through a language as a medium of instruction (Heugh 2020). Additionally, there are contested understandings of bilingual and/or multilingual education. Some feel that mere oral use of an NDL at school is sufficient, and literacy in the learners’ first or home language is unnecessary. A clear example of such thinking can be found in southern Thailand, where an oral-only kindergarten “bilingual” program was found to actually lower student performance at the same time that a true multilingual education program with a strong focus on L1 literacy plus L1-based learning of academic content produced dramatically higher results (UNICEF 2018). 
In one area of Myanmar, local development workers claimed a school utilized multilingual education because the children spoke their mother tongue during recess (but not during class) (MEC 2017). Assimilation of minority populations to dominant languages and cultures can be widely observed throughout MSEA (Sercombe and Tupas 2014). National education systems play an important role in such assimilation. In some places, such as Laos and Thailand, the assimilation is quite subtle, and children are expected to operate in the respective national language in school regardless of their home or first language (Draper 2019; Kosonen 2010, 2017a, 2017b; Meyers 2019). Myanmar’s recent push of Burmese-medium schooling into areas with pre-existing “ethnic education providers” (EEP), coupled with a resistance to accepting the academic qualifications of children schooled by EEPs, can be viewed as an attempt to impose Burmese language and culture on other ethnolinguistic groups (McCormick 2019). The most explicit form of assimilation of linguistic minorities can be found in Vietnam, where Vietnamese-medium preschools and boarding schools in minority areas are used to “strengthen” ethnolinguistic minority students’ Vietnamese skills to the exclusion of NDLs (Benson and Kosonen 2012; Kosonen 2004, 2013, 2017b). Similar language submersion situations can be found in boarding schools in Laos (Cincotta-Segi 2014; Faming 2012), though on a smaller scale than in Vietnam. Similar past practices in North America
and Australia are now viewed as linguistic and cultural genocide, the devastating effects of which continue to haunt generations of indigenous people.

37.5 Conclusion

For all their individual differences, the five countries of Mainland Southeast Asia share a strong policy commitment to their respective national languages, and an uneven and sometimes contradictory policy position vis-à-vis non-dominant languages. For all MSEA countries but Laos, regulations and practices which facilitate cultural and linguistic assimilation exist side-by-side with at least nominal commitments to pluralistic language policies. Bolstered by successful pilot programs, community, international, and scholarly support for the inclusion of NDLs in government schools is increasing – and gaining support from some education officials. The situation is nonetheless quite fluid, as MSEA countries frame language policy discussions in terms of national development agendas.

References

Aung, San Yamin. 2018. Still no date for release of census findings on ethnic populations. The Irrawaddy, 21 February. https://www.irrawaddy.com/news/burma/still-no-date-releasecensus-findings-ethnic-populations.html (accessed 7 January 2021).
Aye, Khin Khin & Peter Sercombe. 2014. Language, education and nation-building in Myanmar. In Peter Sercombe & Ruanni Tupas (eds.), Language, education and nation-building: Assimilation and shift in Southeast Asia, 148–164. Basingstoke: Palgrave Macmillan.
Ball, Jessica & Mariam Smith. 2019. Independent evaluation of the multilingual education national action plan in Cambodia, July 2018 – February 2019. Phnom Penh: UNICEF. https://www.unicef.org/cambodia/reports/evaluation-multilingual-education-national-action-plan-cambodia (accessed 7 January 2021).
Benson, Carol & Kimmo Kosonen. 2012. A critical comparison of language-in-education policy and practice in four Southeast Asian countries and Ethiopia. In Kathleen Heugh & Tove Skutnabb-Kangas (eds.), Multilingual education and sustainable diversity work: From periphery to center, 111–137. New York: Routledge.
Benson, Carol. 2019. L1-based multilingual education in the Asia and Pacific region and beyond: Where are we, and where do we need to go? In Andy Kirkpatrick & Anthony J. Liddicoat (eds.), The Routledge handbook of language education policy in Asia, 14–28. Abingdon: Routledge.
Bos, Kees Jan, Mirjam Bos & Christina Page. 2008. Community-based orthography development: Experiences from the Kuy. Unpublished manuscript. https://www.seameo.org/_ld2008/doucments/Presentation_document/MicrosoftWord_CommunityBasedOrthographyDevelopment_Experiences_fromthe_Kuy_with_edits.pdf (accessed 7 January 2021).
British Academy & École française d’Extrême-Orient. 2015. Language choice in higher education: Challenges and opportunities. London: The British Academy. https://www.thebritishacademy.ac.uk/publications/language-choice-higher-education-challenges-and-opportunities/ (accessed 7 January 2021).
CARE International Cambodia. 2004. Cambodia: Highland children’s education project (HCEP), Ratanakiri Province. In Linda King & Sabine Schielmann (eds.), The challenge of indigenous education: Practice and perspectives, 113–122. Paris: UNESCO.
Chamberlain, James. 2020. Interview with Kirk R. Person, 23 September.
Cincotta-Segi, Angela. 2014. Language/ing in education. In Peter Sercombe & Ruanni Tupas (eds.), Language, education and nation-building: Assimilation and shift in Southeast Asia, 106–130. Basingstoke: Palgrave Macmillan.
Collier, Virginia P. & Wayne P. Thomas. 2019. Literacy leadership brief: The role of bilingualism in improving literacy achievement. International Literacy Association. https://literacyworldwide.org/docs/default-source/where-we-stand/ila-role-bilingualism-improving-literacy-achievement.pdf (accessed 7 January 2021).
Djité, Paulin G. 2011. The language difference: Language and development in the greater Mekong sub-region. Bristol: Multilingual Matters.
Deumert, Ana. 2001. Language planning: Models. In Rajend Mesthrie (ed.), Concise encyclopedia of sociolinguistics, 644–647. Amsterdam: Elsevier.
Draper, John. 2019. Language education policy in Thailand. In Andy Kirkpatrick & Anthony J. Liddicoat (eds.), The Routledge handbook of language education policy in Asia, 229–242. Abingdon: Routledge.
Eberhard, David M., Gary F. Simons & Charles D. Fennig (eds.). 2020. Ethnologue: Languages of the world, 23rd edn. Dallas, TX: SIL International. http://www.ethnologue.com (accessed 7 January 2021).
Enfield, Nick J. 2007. A grammar of Lao. Berlin & New York: Mouton de Gruyter. http://nickenfield.org/books/a-grammar-of-lao/ (accessed 7 January 2021).
Enfield, Nick J. 1999. Lao as a national language. In Grant Evans (ed.), Laos: Culture and society, 258–290. Chiang Mai: Silkworm Books.
Equitable Education Fund. 2018. EEF – Equitable Education Fund. https://www.eef.or.th/en/eef/ (accessed 7 January 2021).
Faming, Manynooch. 2012. Boarding schools for ethnic minorities in Laos. In James A. Banks (ed.), Encyclopaedia of diversity in education, 254–258. London: Sage.
Frewer, Timothy. 2014. Diversity and “development”: The challenges of education in Cambodia. In Peter Sercombe & Ruanni Tupas (eds.), Language, education and nation-building: Assimilation and shift in Southeast Asia, 45–67. Basingstoke: Palgrave Macmillan.
Government of the Republic of the Union of Myanmar. 2016. National education strategic plan 2016–2021. https://www.britishcouncil.org/sites/default/files/myanmar_national_education_strategic_plan_2016-21.pdf (accessed 7 January 2021).
Government of the Republic of the Union of Myanmar. 2014. National education law. https://www.burmalibrary.org/docs20/2014-09-30-National_Education_Law-41-en.pdf (accessed 7 January 2021).
Government of the Republic of the Union of Myanmar. 2019. Basic education law. Unpublished translation.
Heugh, Kathleen. 2020. Approaches to language in education for migrants and refugees in the Asia-Pacific region. Bangkok: UNESCO Asia and Pacific Regional Bureau for Education. https://unesdoc.unesco.org/ark:/48223/pf0000373660 (accessed 7 January 2021).
Hotchkiss, Griffin. 2016. Battle of the fonts. Frontier Myanmar, 23 March. https://www.frontiermyanmar.net/en/battle-of-the-fonts/ (accessed 7 January 2021).
Joll, Christopher. 2013. Language loyalty and loss in Malay South Thailand: From ethno-religious rebellion to ethno-linguistic angst? Presented at the Asia-Pacific Research Conference: Engaging violent conflicts in Asia-Pacific with nonviolent alternatives, Imperial Queen’s Park Hotel, Bangkok, Thailand, 15 November. https://www.researchgate.net/publication/280564560_Language_loyalty_and_loss_in_Malay_South_Thailand_-_From_Ethno-religious_rebellion_to_ethno-linguistic_angst (accessed 7 January 2021).
Jordi, Jacqueline. 2015. Brao Ombaa writing system. Unpublished manuscript. https://www.sil.org/system/files/reapdata/17/02/58/1702584822541162218862094662930108389/brao_orthography_statement_final__1_.pdf (accessed 7 January 2021).
Kaplan, Robert B. & Richard B. Baldauf. 1997. Language planning: From practice to theory. Clevedon, UK: Multilingual Matters.
Kaplan, Robert B. & Richard B. Baldauf. 2003. Language and language-in-education planning in the Pacific Basin. Dordrecht: Kluwer.
Kingdom of Cambodia. 2007. Law on education. http://www.unesco.org/education/edurights/media/docs/9cb1ef01a5bcba1dd396832969c31342aacf87bb.pdf (accessed 7 January 2021).
Kirkpatrick, Andy. 2012. English in ASEAN: Implications for regional multilingualism. Journal of Multilingual and Multicultural Development 33(4). 331–344.
Kirkpatrick, Andy & Anthony J. Liddicoat (eds.). 2019. The Routledge handbook of language education policy in Asia. Abingdon: Routledge.
Kosonen, Kimmo. 2004. Language in education policy and practice in Vietnam. Hanoi: UNICEF.
Kosonen, Kimmo. 2005. Vernaculars in literacy and basic education in Cambodia, Laos and Thailand. Current Issues in Language Planning 6(2). 122–142. DOI: 10.1080/14664200508668277.
Kosonen, Kimmo. 2008. Literacy in local languages in Thailand: Language maintenance in a globalised world. International Journal of Bilingual Education and Bilingualism 11(2). 170–188. DOI: 10.2167/beb492.0.
Kosonen, Kimmo. 2009. Language-in-education policies in Southeast Asia: An overview. In Kimmo Kosonen & Catherine Young (eds.), Mother tongue as bridge language of instruction: Policies and experiences in Southeast Asia, 22–43. Bangkok: SEAMEO.
Kosonen, Kimmo. 2010. Ethnolinguistic minorities and non-dominant languages in mainland Southeast Asian language-in-education policies. In Macleans A. Geo-Jaja & Suzanne Majhanovich (eds.), Education, language, and economics: Growing national and global dilemmas, 73–88. Rotterdam, Boston & Taipei: Sense Publishers.
Kosonen, Kimmo. 2013. The use of non-dominant languages in education in Cambodia, Thailand and Vietnam: Two steps forward, one step back. In Carol Benson & Kimmo Kosonen (eds.), Language issues in comparative education, 39–58. Rotterdam: Sense Publishers.
Kosonen, Kimmo. 2017a. Language of instruction in Southeast Asia. Paper commissioned for the 2017/8 Global Education Monitoring Report, Accountability in education: Meeting our commitments. Paris: UNESCO. http://unesdoc.unesco.org/images/0025/002595/259576e.pdf (accessed 7 January 2021).
Kosonen, Kimmo. 2017b. Language policy and education in Southeast Asia. In Teresa McCarty & Stephen May (eds.), Language policy and political issues in education. Vol. 1 of Stephen May (ed.), Encyclopedia of language and education, 3rd edn., 477–490. New York: Springer. DOI: 10.1007/978-3-319-02320-5_35-1.
Kosonen, Kimmo. 2019. Language education policy in Cambodia. In Andy Kirkpatrick & Anthony J. Liddicoat (eds.), The Routledge handbook of language education policy in Asia, 216–228. Abingdon: Routledge.
Kosonen, Kimmo & Carol Benson. 2021. Bringing non-dominant languages into education systems: Change from above, from below, from the side – or a combination? In Carol Benson & Kimmo Kosonen (eds.), Language issues in comparative education II: Policy and practice in multilingual education based on non-dominant languages, 25–56. Leiden & Boston: Brill.
Kosonen, Kimmo & Kirk R. Person. 2014. Languages, identities and education in Thailand. In Peter Sercombe & Ruanni Tupas (eds.), Language, education and nation-building: Assimilation and shift in Southeast Asia, 200–231. Basingstoke: Palgrave Macmillan.
Lall, Marie & Ashley South. 2018. Power dynamics of language and education policy in Myanmar’s contested transition. Comparative Education Review 62(4). 482–502.
Lo Bianco, Joseph. 2016. Language Education and Social Cohesion (LESC) initiative: Myanmar country report. Bangkok: UNICEF East Asia and Pacific Regional Office.
Martin, Richard. 2011. Education in Myanmar: Opportunity for limited engagement. In Colin Brock & Lorraine Pe Symaco (eds.), Education in South-East Asia, 121–137. Oxford: Symposium Books.
McCormick, Patrick. 2019. Language policy in Myanmar. In Andy Kirkpatrick & Anthony J. Liddicoat (eds.), The Routledge handbook of language education policy in Asia, 243–256. Abingdon: Routledge.
Meyers, Cliff. 2019. Lao language policy. In Andy Kirkpatrick & Anthony J. Liddicoat (eds.), The Routledge handbook of language education policy in Asia, 202–215. Abingdon: Routledge.
Ministry of Education (MoE). 2019. Policy and focus of the Ministry of Education, fiscal year 2020. Bangkok: Ministry of Education, Thailand. https://www.moe.go.th/moe/upload/news19/FileUpload/54715-6694.pdf (accessed 7 January 2021).
Myanmar Education Consortium. 2016. MEC. 5 May. https://myanmareducationconsortium.org/about/donors/ (accessed 7 January 2021).
Myanmar Education Consortium. 2017. Multilingual education field report. Unpublished manuscript.
National Statistical Office & UNICEF. 2016. Thailand multiple indicator cluster survey 2015–2016. Bangkok: National Statistical Office and UNICEF Thailand Country Office. https://www.unicef.org/thailand/Thailand_MICS_Full_Report_EN.pdf (accessed 7 January 2021).
National Institute of Educational Testing Service (NIETS). http://www.serviceapp.niets.or.th/onetmap/ (accessed 7 January 2021).
Nguyen, Xuan Nhat Chi Mai & Van Huy Nguyen. 2019. Language education policy in Vietnam. In Andy Kirkpatrick & Anthony J. Liddicoat (eds.), The Routledge handbook of language education policy in Asia, 185–201. Abingdon: Routledge.
Office of the Basic Education Commission. 2018. Education sandbox. https://www.edusandbox.com/home (accessed 7 January 2021).
Office of the Education Council. 2017. The national scheme of education B.E. 2560–2579 (2017–2036). Bangkok: Office of the Education Council, Ministry of Education. http://www.onec.go.th/index.php/book/BookView/1540 (accessed 7 January 2021).
Pestalozzi Children’s Foundation and Foundation for Applied Linguistics. 2019. New dawn over the mountains: Improving access and equity through mother tongue-based multilingual education for Thailand’s ethnic children. Chiang Mai: Pestalozzi Children’s Foundation and Foundation for Applied Linguistics. https://www.pestalozzi.ch/sites/pestalozzi.ch/files/documents/downloads/new_dawn_over_the_mountains_mtb-mle_in_thailand_-_pestalozzi_childrens_foundation.pdf (accessed 7 January 2021).
Person, Kirk R. 2009a. Heritage scripts, technical transcriptions, and practical orthographies: A middle path towards educational excellence and cultural preservation for Thailand’s ethnic minority languages. In Proceedings from the International Conference on National Language Policy: Language Diversity for National Unity, 189–200. Bangkok: Royal Institute of Thailand.
Person, Kirk R. 2009b. Ethnic minority people and the United Nations Millennium Development Goals: Global trends in language rights and mother-tongue first multilingual education. Festschrift in linguistics, applied linguistics, language and literature in honor of Professor Udom Warotamasikkhadit’s 75th birthday. Bangkok: Suan Sunandha Rajabhat University.
Person, Kirk R. 2018. Reflections on two decades of Bisu language revitalization. In David Hirsh & Suwilai Premsrirat (eds.), Language revitalization: Insights from Thailand. Bern: Peter Lang.
Person, Kirk R. Forthcoming. The Bangkok statement on language and inclusion: a rose by any other name? In Tove Skutnabb-Kangas & Robert Phillipson (eds.), Handbook of linguistic human rights. Oxford: Wiley-Blackwell.
Phan, Le Ha, Vu Hai Ha & Bao Dat. 2014. Language policies in modern-day Vietnam. In Peter Sercombe & Ruanni Tupas (eds.), Language, education and nation-building: Assimilation and shift in Southeast Asia, 232–244. Basingstoke: Palgrave Macmillan.
Premsrirat, Suwilai. 2008. Award acceptance speech. Comité International Permanent des Linguistes (The Permanent International Committee of Linguists). 18th International Congress of Linguists, Korea University, Seoul, Korea. http://www.ciplnet.com/data/premsrirat.html (accessed 29 November 2011).
Premsrirat, Suwilai & Kirk R. Person. 2018. Education in Thailand’s ethnic languages: Reflections on a decade of mother tongue based multilingual education policy and practice. In Gerald W. Fry (ed.), Education in Thailand: An old elephant in search of a new mahout, 393–408. Singapore: Springer.
Reynolds, Craig J. 2002. National identity and its defenders: Thailand today, revised edn. Chiang Mai, Thailand: Silkworm Books.
Rinith, Taing. 2019. Keeping the Khmer language up to date. Khmer Times, 4 July. https://www.khmertimeskh.com/621184/keeping-khmer-language-up-to-date/ (accessed 7 January 2021).
Royal Thai Government. 1940. State mandate #9. The Royal Gazette 57 (June). 151.
Sercombe, Peter & Ruanni Tupas (eds.). 2014. Language, education and nation-building: Assimilation and shift in Southeast Asia. Basingstoke: Palgrave Macmillan.
Slodkowski, Antoni. 2015. Myanmar signs ceasefire with eight armed groups. Reuters, 15 October. https://www.reuters.com/article/us-myanmar-politics-idUSKCN0S82MR20151015 (accessed 7 January 2021).
Smalley, William A. 1994. Linguistic diversity and national unity: Language ecology in Thailand. Chicago: University of Chicago Press.
South, Ashley. 2020. Interview with Kirk R. Person, 25 August.
South, Ashley & Marie Lall. 2016. Schooling and conflict: Ethnic education and mother tongue-based teaching in Myanmar. Yangon: USAID and Asia Foundation. https://asiafoundation.org/resources/pdfs/SchoolingConflictENG.pdf (accessed 7 January 2021).
South, Ashley, Tim Schroeder, Kim Jolliffe, Susanne Kempel, Axel Schroeder & Naw Wah Shee Mu. 2018. Between ceasefires and federalism: Exploring interim arrangements in the Myanmar peace process. Yangon: Covenant Consult Ltd.
Sun, Neou. 2009. Education policies for ethnic minorities in Cambodia. In Kimmo Kosonen & Catherine Young (eds.), Mother tongue as bridge language of instruction: Policies and experiences in Southeast Asia, 62–68. Bangkok: SEAMEO.
The Irrawaddy. 2015. Timeline of Myanmar student protests. 10 March. https://www.irrawaddy.com/news/burma/timeline-of-student-protests-against-education-law.html (accessed 7 January 2021).
Thomas, Anne. 2002. Bilingual community-based education in the Cambodian highlands: A successful approach for enabling access to education by indigenous peoples. Journal of Southeast Asian Education 3(1). 26–58.
Tienmee, Wanna. 2020. Interview with Kirk R. Person, 15 September.
UNESCAP. 2019. Population and development indicators for Asia and the Pacific, 2019. https://www.unescap.org/sites/default/files/Population%20Data%20Sheet%202019.pdf (accessed 7 January 2021).
UNESCO. 2015. MLE mapping data. Bangkok: Asia Multilingual Education Working Group. https://asiapacificmle.net/data-mapping (accessed 7 January 2021).
UNICEF. 2012. Lao Cai primary classroom language profile. Hanoi: UNICEF, Lao Cai DOET, MOET and SIL International. http://www.unicef.org/vietnam/Lao_Cai_mapping_profile_set.pdf (accessed 11 January 2017).
UNICEF. 2018. Bridge to a brighter Tomorrow: The Patani Malay-Thai multilingual education programme. Bangkok: UNICEF Thailand. https://www.unicef.org/thailand/reports/bridgebrighter-tomorrow (accessed 5 May 2020). Vitrano-Wilson, Seth. 2018. Tai Dam orthographies: Multigraphia, mismatching tones, and mutual borrowing of tone marking devices among three scripts. Written Language & Literacy 21(2). 198–237. Vietnam Academy of Social Sciences.  n.d. http://en.vass.gov.vn/noidung/gioithieu/cocautochuc/ Pages/thong-tin-don-vi.aspx?ItemID=152&PostID=56 (accessed 7 January 2021). Warotamasikkhadit, Udom & Kirk R. Person. 2011. Development of the national language policy (2006–2010). The Journal of the Royal Institute of Thailand III. 29–44. http://www.royin.go.th/ royin2014/upload/246/FileUpload/2523_4254.pdf (accessed 7 January 2021). World Bank. 2016. Thailand country overview. http://www.worldbank.org/en/country/thailand/ overview (accessed 7 January 2021).

Andrew Simpson

38 Language and the building of nations in Southeast Asia

38.1 Introduction

The major states of Southeast Asia, with the single exception of Thailand, all achieved independence during the 20th century following extended periods of foreign colonial domination and were faced with the significant challenge of how to develop successful new nations from populations that were typically very complex in their ethno-linguistic makeup. Language issues have played an important role in the process of nation-building in Southeast Asia, as elsewhere in the world, and the different decisions made by political leaderships with regard to post-colonial language planning have resulted in a broad range of outcomes and different measures of success, promoting both national and official languages by means of either heavily monolingual or alternatively multilingual policies. This chapter describes the linguistic situations that have evolved in countries in Southeast Asia as governments have confronted the needs and demands of their largely heterogeneous populations and the pressures which arise when multiple languages compete with each other inside a single political territory. The chapter first provides an overview of the general relation of language to the construction of national identity and the governance of modern, multilingual states, and then presents language profiles of individual countries in Southeast Asia, focusing on the relation between majority and minority languages and ethnic groups, and how state language policies have attempted to address political, cultural and economic problems specifically linked to language issues. These studies also highlight broader, general lessons that can be learned for language planning from the particular experiences of Southeast Asian states, as different approaches have been experimented with and implemented with either beneficial or negative results.

https://doi.org/10.1515/9783110558142-038

38.2 The role of language planning in the construction of new nations

The ability of new, multi-ethnic states to prosper and avoid inter-ethnic conflict is significantly enhanced when equal socio-economic and political opportunities are offered to all groups present in a mixed population. The long-term success of nation states around the world is also typically increased if the citizens of a state come to feel connected with each other at the national level, developing feelings of loyalty both to their country and to other members of its population with a sense of collective, national identity. Language and language planning may often play an important role in such a process, in three general ways. First, economic progress is greatly assisted when a shared means of communication is made available in multilingual populations – knowledge of a language (or languages) that can be used by all in trade, education, and government administration. Second, the socio-political stability of ethnically mixed states requires the development and practice of language policies which are perceived as fair toward all groups and not offering unequal advantages to a particular sub-section of the population. Third, the regular use of a common language by all members of a population, at least some of the time, has the potential to serve as a strong psychological symbol of belonging to a single unified nation with shared interests and goals, stimulating positive feelings of a special connection with other co-members of the state.

Identifying what kind of language and language policies can best facilitate the development of newly independent multi-ethnic states is often very challenging, due to the complex mixture of peoples, cultures and languages that may be present in territories which were previously established as colonies by Western powers, or which alternatively arose from patterns of migration occurring over longer periods of time. In Southeast Asia, there are states with extremely heterogeneous populations, such as Indonesia and the Philippines, where very many different ethno-linguistic groups cohabit a single national territory and hundreds of languages are claimed to be spoken. There are also states where one ethnic group constitutes a very sizeable majority, such as Thailand, Vietnam and Burma, but many other minorities are also present.
How to shape effective national language policies in such states has not always been straightforward, and in various cases has been further complicated by the “hangover” presence of an ex-colonial language in use in many formal domains of life – for example, English in the Philippines, Malaysia, Singapore and Burma/Myanmar, retained and periodically advanced for its pragmatic and international value.

The types of language policy that have been implemented in Southeast Asian countries can be characterized in terms of a distinction between single language/unilingual and multilingual models of language planning, and the promotion of languages with different roles, as national and/or official languages. A strong influence on Asian countries in their development of language policy has been the perceived wisdom from Western countries that successful nations elevate a single language into a dominant, fully national role, pursuing a “one nation, one language” ideal in which the inhabitants of a nation are bonded together by being speakers of a single common language. Such thinking has led many countries in Southeast Asia to attempt to promote the learning and speaking of a single, heavily privileged language, as for example in Thailand and Vietnam, where national unity and strength have regularly been linked to citizens’ civic duty to become speakers of Thai and Vietnamese. The unilingual/single language approach to language planning at the national level found in much of Southeast Asia (and the world in general) contrasts with attempts to foster high-level multilingualism and the simultaneous promotion of multiple languages in important roles, as in Singapore, where four languages are given equal status and rights in all government-regulated activities of daily life.

An additional important twist on the single language versus multilingual approaches to language planning concerns governments’ designation of languages as having either national language or official language status, or sometimes both such statuses. An official state language is a language that is prescribed for official use in various areas of life such as education, government administration, courts of law, public transportation etc. The citizens of a state have the legal right and are also required to use an official language in such domains, and official languages consequently have an essentially pragmatic function, to help speakers negotiate their daily lives at the national level with a form of speech that is known and understood by others in a state. Economic efficiency and the smooth running of government business both benefit from the nationwide utilization of official languages, which facilitate formal communication between people who may be native speakers of quite different languages. A national language, by way of contrast, is a language that has a primarily symbolic function, like a national flag or anthem, used to unify the citizens of a nation and instill feelings of group identity. A national language need not be sanctioned for use in formal domains of life or be required in formal interactions. Rather, its intended purpose is to encourage feelings of nationhood through being distinctive and setting its speakers off from other neighboring populations. In some instances, a single language may be able to serve both official and national language roles, as for example in Japan and Korea, where Japanese and Korean can be referred to as “national-official languages”.
However, in other cases, countries establish separate official and national languages, for a variety of reasons, as we will see in the chapter’s discussion of the linguistic situation in the Philippines, Malaysia and Singapore.

In order to bring into practice whatever language policy is felt to be best suited to a country, governments regularly engage in hands-on language planning, a process which has several different stages and objectives. Status planning involves the decision to give certain special roles to one or more languages – the selection of languages for national or official language status. This decision-making process is critically important, especially in multilingual populations, where the promotion of one language over others can have major consequences for inter-ethnic relations. Following the selection step come various activities of corpus planning. In many cases, the decision to upgrade a language to national or official language status will require standardization of the language – agreement on which words are to be recognized as comprising the standard language, compiled into dictionaries, and the creation of grammatical descriptions of the language, indicating which grammatical rules are considered standard forms, to be taught to new speakers and also encouraged among existing speakers. In the case of new official languages, it will also often be necessary for linguists to help expand the vocabulary of the language so that it can be effectively used in all formal domains of life, such as higher education, government administration, scientific discussion, and legal documentation. Once sufficient standardization and vocabulary development has been achieved, knowledge of new national and official languages needs to be spread among the population of a state, typically by means of mass education and heavy use in public media – television, radio and literature. Finally, governments may also need to work on convincing their citizens of the benefits of adopting use of new national and official languages, so that they will actually speak these languages with enthusiasm and commitment – winning psychological acceptance for the promoted language forms.

When states attempt to manipulate the language habits of their populations and impose language policies of different types, the success of such initiatives can potentially be measured in two broad ways. A major goal of many countries is to cultivate a strong national identity among their people, which will help nurture feelings of loyalty to the nation and stimulate cooperation in national endeavors. A second important aim of language planning, in multi-ethnic states in particular, is to craft a policy that will help maintain peace and stability among the different groups and not cause linguistic grievances which could become catalysts for general rejection of the state or lead to conflict between different language groups. In the set of case studies of Southeast Asian countries which make up the rest of the chapter, we will see how these goals have been approached in different ways and with varying degrees of success, partly as a result of decisions made by the political leaderships of countries in the region, and partly due to the nature of the populations present in individual states at the time when national language planning needed to be effected.
We will begin with two cases which are widely recognized as having achieved the two goals noted above of stimulating the growth of a strong national identity while minimizing ethnic discord due to language-related reasons: Thailand and Vietnam. These two countries are similar in their population make-up, with large majorities from one ethnic group living alongside many smaller minority groups. However, their routes to the spread of highly effective national-official languages have been quite different, in one instance being a well-planned defense of the nation faced with the threat of Western encroachment, in the other being linked to the struggle against colonial domination and civil war.

38.3 Thailand – nationalism and modernization as a mechanism of self-defense

In the 19th century, the area that would become modern Thailand lay at the center of a much larger Siamese empire which incorporated much ethno-linguistic diversity and no commonly shared identity. Politically, the empire was constructed upon a network of local allegiances to regionally powerful rulers and little connection was felt between peoples who lived in different parts of the empire. As Western powers increasingly
penetrated Southeast Asia during this time, the integrity of the Siamese empire came under threat, with Britain and France taking control of more and more territory to the west, south and east of the empire. The Siamese monarchy realized that steps needed to be taken to ensure that Siam itself would not be overrun by either Britain or France and made into a colonial possession as had occurred in Burma, the Malay peninsula and Indo-China (Laos, Vietnam and Cambodia). King Chulalongkorn set about effecting the rapid modernization of Siam in a way that would present the image to the outside world of a stable modern country that Britain and France could successfully conduct business with without the need to subjugate it militarily. During the course of this modernization process, Siam actually lost half of the territory that had comprised the Siamese empire and was transformed from a vast, sprawling empire constructed on regional power relations to a smaller nation-state with a centralized bureaucracy. As the country managed to survive any foreign encroachment of its newly reconfigured borders and retained its independence, unlike all other countries and kingdoms of Southeast Asia, the idea of a Thai nation was promoted, vigorously, for the first time, with policies that were intended to coalesce the mixed population as a united, (largely) uniform nation with a common national culture. A major component of the drive to develop a strong, new national identity in the first half of the 20th century was the promotion of a standardized form of Thai, modeled on the speech of the center of the country, as the national language. In 1905 a grammatical description of standard Thai was completed, “Principles of the Thai Language”, and used as a model for all language textbooks teaching Thai in compulsory mass education introduced throughout the state, and in many new publications made available in Thai. 
Presented as the national language, standard central Thai also quickly came to be used as the dominant medium of instruction in schools, which now emphasized the teaching of a common Thai history and culture. In the 1930s in particular, heavy nationalist propaganda orchestrated by the political leadership of the country disseminated the myth of a single Thai people with a long, shared history. The name of the country was changed, in a very symbolic gesture, from “Siam” to “Thailand”, and its population were referred to as “Thais” rather than “Siamese”, in an attempt to reinforce the notion of an ethnically uniform race and nation speaking a single language, Thai. Terms used to refer to parts of the population in ways that diverged from this homogenous ideal were discontinued, with the result that those living in the northeast of Thailand were no longer permitted to be referred to as “Lao” (as had previously been the custom) and had to be called “Thai”, one member of the monarchy, Prince Damrong, insisting that “we know they are Thai, not Lao” (Keyes 2003). Other non-linguistic symbols reinforced the widespread pressure to adopt and revere Thai national identity, such as a new national flag and national anthem, very regularly seen and heard in daily life in Thailand through until the present. The second half of the 20th century saw the continued strengthening of Thai national identity, bolstered by further modernization and a significant economic boom in the 1960s. Leaders of the state highlighted the fact that Thailand had been

able to maintain its independence through the 20th century while other countries in southeast Asia had all been overrun and colonized by Western powers. This helped embed the feeling of truly belonging to a successful modern nation among the population of the country, as Thailand seemed to be making progress like other modern states in Europe and northeast Asia. During this time, standard Thai firmly established its position as one of the strongest symbols of the shared national identity, with more than 90 % of the population being able to speak the language and communicate effectively with each other. Currently, there is a stable co-existence of standard Thai with a broad range of other languages and distinctive, regional forms of Thai, in a relation of complementary distribution. Standard Thai is used in all formal domains, including education, government administration, legal matters, much business, and in interactions in banks, on public transport and in higher end stores throughout the country, while regional forms of Thai and other minority languages are heard outside the center of the country in informal interactions. It has widely been observed that the rise of standard Thai and its total dominance in national and official language roles (stimulating feelings of national identity and fulfilling all language needs in formal areas of life) never encountered obvious resistance from the public and was brought about very effectively by those leading the country during the 20th century. Five primary reasons for this striking success of the state’s national language planning policy are identified in Simpson and Thammasathien (2007a). First, the promotion of standard Thai as the national language was not accompanied by any attempt to fully suppress other languages. 
Speakers of other varieties were permitted to continue to make use of these forms of language in informal domains of life, although required, as a national duty, to learn standard Thai for national- and formal-level interactions. Second, the first, home language of 90 % of the population is some form of Thai, hence speakers of regional forms of Thai perceive that their home varieties are related to the national language  – standard Thai is therefore not a foreign imposition. Third, the general promotion of Thai national identity has been considerably successful and brought about a very perceptible sense of national pride and loyalty among the population of the country, and standard Thai is part of this accepted national identity. Fourth, through its own efforts at self-defense, Thailand was spared the complications which frequently arise when a foreign language comes to be used during extended periods of foreign colonial domination (as with English in the Philippines, Singapore and Malaysia). Fifth, the ethnically non-Thai 10 % of the population have seen that there are pragmatic incentives for accepting the nationwide dominance of standard Thai. It enables access to educational, economic and advancement opportunities that might otherwise not be available in the absence of a nationally shared language. One residual challenge to the very broad acceptance of standard Thai and Thai national identity still remains, however, in the deep south of the country, where a Malay-speaking Muslim population lives in four provinces adjacent to the border with Malaysia. This area previously belonged to independent Malay states and was only



incorporated into Siam in the 19th century. As its population is ethnically Malay, has traditionally spoken Malay rather than any form of Thai, and is Muslim rather than Buddhist, the people in the southern borderlands area feel they have more in common with the inhabitants of Malaysia than the rest of Thailand and would like to preserve Malay language and culture and transmit this further to rising generations. Despite such wishes, the use of Malay in schools in the area has been heavily controlled by the Thai government and education through standard Thai has been largely imposed as elsewhere in the country. This has created certain resentment among much of the local population, who indicate that they feel discriminated against on the basis of their language and religion. Notwithstanding the case of the provinces bordering Malaysia, the general picture is one of striking conformity to the national language planning initiative which has been vigorously promoted by the state since the early 20th century as part of its continued efforts at nation-building. Modern Thailand stands out in southeast Asia as a country which has achieved its goal of constructing a strong national identity, and has done this in considerable measure through the successful development and dispersion of an indigenous national-official language which is perceived to be prestigious and a positive linguistic symbol of the nation, and also serves as a practical resource offering clear advantages to its speakers in everyday life.

38.4 Vietnam – national language and the role of writing systems in identity formation

The language situation in present-day Vietnam resembles that in Thailand in a very clear way. As in Thailand, there is a successful, widespread national-official language, Vietnamese, which is used in all formal domains of life – higher education, government administration, scientific research, legal matters, the creation of literature, as well as dominating print and visual media – and the same language functions well as a strong marker of national identity, distinguishing its speakers in a positive way from populations in other countries. Vietnam also has a population distribution which is similar to that of Thailand, with approximately 90 % of its citizens sharing the same ethnic background (Kinh Vietnamese) and 10 % being made up of many smaller minority groups. What distinguishes Vietnam from Thailand in an interesting way is how its national-official language has achieved its current highly developed position, the political and military struggles which have constantly interacted with language and the development of Vietnamese, and the role that forms of writing/orthography have played in the evolution of the national language.

From 111 BCE until 939 CE, the area of modern-day north and north central Vietnam was ruled over by Chinese forces, following an initial invasion during the Han Dynasty, and this foreign control embedded classical Chinese as the language of
administration and the only form of written communication. “Sino-Vietnamese” then emerged as a localized written form of classical Chinese, differing from the latter predominantly in the way it was pronounced. Following the expulsion of Chinese rulers in the 10th century, Sino-Vietnamese continued on as the common form of writing, being the only way that official acts of administration were recorded, and dominating the creation of literature. An adaption of Chinese characters to transcribe actual spoken Vietnamese was initiated in the 11th century, but “chữ nôm” never achieved prestige and all high-level writing remained in Sino-Vietnamese, which even served as a vehicle to express ideas of Vietnamese national identity until the late 19th century (Lê and O’Harrow 2007). A third system of writing known as quốc ngữ was developed in the 17th century by Jesuit missionaries, as a means to represent spoken Vietnamese using the Roman alphabet supplemented with certain diacritics. The creation of quốc ngữ added to the complexity of written forms available in Vietnam, and was an orthography that was very easy to learn and use, in comparison with Sino-Vietnamese and chữ nôm, which both utilized large numbers of Chinese characters. However, despite its much greater simplicity, use of quốc ngữ did not spread beyond the Catholic population in Vietnam for two more centuries, when the country came under new foreign domination, subjugated gradually by France. As the French established their rule over north, central and southern Vietnam, they saw that opposition to French rule was led by members of the Vietnamese intellectual elite who commanded knowledge of Sino-Vietnamese, and this Sinitic written form of language was used as a center-piece of national identity representing Vietnamese traditions and new anti-colonial sentiments. 
Because of this connection between Sino-Vietnamese and resistance to French rule, the French decided to promote the use of quốc ngữ in local government in place of Sino-Vietnamese, as a way to undermine the influence of the traditional Vietnamese elite. With the same goal in mind, publications in quốc ngữ were also significantly increased under French rule in the late 19th century, in a sustained attempt to weaken the symbolic power of Sino-Vietnamese and replace it with a Western-sourced Romanized form of writing. While quốc ngữ was initially perceived to be the orthography of the enemy and associated with colonial domination, in the early 20th century attitudes held by those opposed to French colonial rule changed in an interesting way. It was realized that because quốc ngữ was a system that was easy to master and represented spoken Vietnamese not classical Chinese, it actually offered an excellent means to spread messages of resistance to French rule among the masses who had no knowledge of Sino-Vietnamese. Intellectuals hoping to reach a wide audience with nationalist messages thus all switched from the use of Sino-Vietnamese to the use of quốc ngữ during the 1920s and 1930s, and a large new body of work written in quốc ngữ came into creation. This included not only political tracts, but also works of literature and translations of classical texts originally written in Sino-Vietnamese, dramatically increasing the prestige which quốc ngữ was felt to have. Lê and O’Harrow (2007) discuss the remarkable “conflict of the scripts” which played out in the late 19th and early 20th centuries in



Vietnam, noting that at one point there were four different writing systems available for use: Sino-Vietnamese, chữ nôm, quốc ngữ, and Romanized French. To begin with, Sino-Vietnamese and chữ nôm were linked with opposition to the French and use of quốc ngữ represented collaboration. However, later on, the pragmatic value of quốc ngữ was appreciated by the nationalists, who realized that it represented the most practical means to disseminate anti-colonial propaganda and help modernization. With the adoption and extended use of quốc ngữ by the independence movement, its initial negative symbolic value was first replaced with appreciation of its pragmatic value for the spread of nationalist ideas, and then quốc ngữ ironically came to acquire a strong new positive symbolic value, associated with resistance to the French and a commitment to modernize the country. Lê and O’Harrow (2007) note that:

If there is any general lesson to be derived from the French period, it is perhaps that symbolic values associated with language can undergo considerable change even in relatively short periods of time such as the span of one generation. A “foreign” language system such as the French-developed (and promoted) Romanization of vernacular Vietnamese in quoc ngu came to be “nativized” in the minds of speakers over time through increased association with domestic, national use, to the point of becoming an important new icon of national identity and losing earlier negative associations of foreign origin. (Lê and O’Harrow 2007: 429)

When independence from the French came about in the second half of the 20th century, after much internal conflict, extensive corpus planning activities were carried out to further develop Vietnamese, written in quốc ngữ, as the national language. Literacy campaigns were initiated to spread knowledge of written Vietnamese to all the population, and every ethnic group in the country was told it had a social responsibility to learn Vietnamese (Vasavakul 2003). The standardization of Vietnamese was assisted by the founding of a new Institute of Linguistics, and intensive work was carried out on expanding the vocabulary of Vietnamese so that it could be effectively used in all domains of life, including high-level academic, political and scientific discussion. As part of this development of the vocabulary of modern Vietnamese, attempts were made to restore the “purity” of the national language by eliminating words which had been borrowed from foreign sources in earlier times (principally Chinese), and avoiding the adoption of words from other languages in the expansion of new technical vocabulary for Vietnamese. As a result of the post-independence promotion of standardized spoken and written forms of the language, Vietnamese has become a highly successful national-official language, just like standard Thai in Thailand. It is now the principal medium of instruction in all schools and institutes of higher education and is used throughout the country in all formal and informal domains of life and all modes of interaction. Symbolically, it binds the Vietnamese population together very effectively and has become a major component of national identity. Finally, no foreign language competes with Vietnamese as a lingering colonial legacy, French having all but disappeared from Vietnam, and Vietnamese is spoken confidently and with pride by all

levels of society, felt to be a prestigious language able to convey complex information and create fine literature as well as any other language.

38.5 Indonesia: a successful official language paired with stable multilingualism

While Thailand and Vietnam have populations in which a single ethnic group comprises a very large majority of the total in the state and this has greatly assisted the selection and development of a national-official language, Indonesia is a country with a much more mixed population, with hundreds of languages being spoken by a large number of ethnic groups, none of which constitutes a clear majority of the population. In such an ethno-linguistically mixed state, language has the potential to be very divisive and lead to inter-ethnic competition and possible conflict. However, Indonesia has been remarkably successful in its post-independence management of language issues and the use of language to develop a modernized, largely unified state, and presents a good lesson to other countries of how official language planning in a very heterogeneous population can in fact succeed very well if treated with sufficient care and attention.

Two aspects of Indonesia’s engagement in language planning have been particularly important for its sustained success. First, the nationalist leadership of Indonesia made an excellent choice in the selection of a language to be developed as the country’s new official language. Second, implementation of the spread of “Indonesian” throughout the nation was wisely handled with much concern for the population’s continued attachment to other local languages.

Concerning the selection issue, pre-independence nationalist groups agreed that it would be very useful for a future, independent Indonesia to have a single, widely known official language. The critical question was how to choose a language that could be promoted in this way without causing any major dissatisfaction among the population. Dutch, the language of the colonial rulers of Indonesia, was never considered as a possible official language choice, due to negative associations with Dutch rule.
The language of the largest ethnic group in Indonesia, Javanese, was also rejected, because the promotion of Javanese as Indonesia’s common official language would have given unfair advantages to the Javanese group and most probably caused much discontent among other sections of the population. Javanese is also linguistically a complex language to learn, requiring the mastery of multiple, different speech levels for use in different social contexts, and might not have been easy to spread as a language among other groups in the country. The decision was instead taken to select and promote (following the achievement of independence) a form of Malay that had come to be used in trading interactions by speakers of different languages in much of the country, renaming this variety “Indonesian” (Bahasa Indonesia). The choice of this variety for promotion as the nation’s future official language made good sense for



Language and the building of nations in Southeast Asia 


many reasons. First, because it was primarily used as a trading lingua franca, it was perceived to be an ethnically neutral language, not giving special advantages to any already powerful group, and this helped people readily accept Indonesian as a useful link language when it was developed as the official language of the state. At the time of its selection as future official language, Malay/Indonesian was only spoken as a first language by a relatively small and economically insignificant group on Sumatra, not by any dominant majority. Second, some basic Malay/Indonesian had already been taught in schools in different parts of the country before independence, and it had come to be used in various newspapers and popular works of fiction. Third, Malay/Indonesian is an Austronesian language and there are similarities in its vocabulary and grammatical structure to Indonesia’s many other Austronesian languages. It could therefore be learned without great difficulty by the general population and was felt to be broadly representative of the linguistic identity of the country. Finally, Indonesian was frequently used by the nationalists from the 1930s onward, and so it acquired positive prestige from its close association with the independence movement. Independence was ultimately not achieved until 1949. However, between 1942 and 1945, the development of Indonesian as an official-like language was assisted by the replacement of Dutch with Indonesian during the Japanese occupation of the country, requiring Indonesian to be used in a range of situations it had not previously been used in and a sudden, necessary growth in its vocabulary. Following independence, there was a continued, massive development of technical vocabulary and the creation of a grammatical description of Indonesian, establishing a standard model of the language that could be used in teaching Indonesian throughout the country. Mass education then spread knowledge of Indonesian very widely. 
Importantly, this implementation of Indonesian as new official language of the state was effected in a gradual way without any attempt by the government to suppress the use of other local languages in informal domains. The result of this very tolerant process of promotion is nationwide bilingualism. Indonesian is used by everyone in the population in formal areas of life – in government administration, higher levels of education, inter-regional commerce, legal matters and to access science and technology – while regional languages are regularly used in the home and in other casual interactions with friends and family. This combination of Indonesian as nationwide official lingua franca with local languages used as informal means of communication seems to work very well, and language issues have not been the causes of ethnic conflict in Indonesia’s very mixed population. Although Indonesian is technically classified as the official language of the country, it has frequently been noted that Indonesian performs more than just purely utilitarian functions, and its use over time has helped stimulate the development of Indonesian national identity, hence it additionally serves in national language functions. The language facilitates communication between different ethnolinguistic groups and is a clearly unifying feature of the population, for Bertrand “the strong-


 Andrew Simpson

est symbol of national unity” (Bertrand 2003: 279), and “the primary shared component of the country’s emerging national identity” (Simpson 2007b: 334) encoding an all-Indonesian identity. The language planning policies of the Indonesian government since independence have therefore been very successful, and show that it is in fact possible to develop a single indigenous language as an official (or national-official) language in an ethnically very mixed country, if this is carried out with careful toleration for other languages spoken in a population. Perhaps the most important lesson to come from Indonesia’s post-independence language program is that the continued use and even encouragement of local minority languages alongside the development of a nationwide official language does not pose a threat to the successful promotion of the latter, as official and local languages may be used for different functions which are not in competition with each other but instead serve as distinct assets enriching a population’s linguistic repertoire.

38.6 Singapore – official linguistic pluralism

A consideration of post-independence Indonesia demonstrates how a single language policy promoting one official language can be successful even in a heavily mixed population, if implemented well, with no attempted suppression of other home languages. Singapore, by way of contrast, is a good example of an ethnically mixed state which has striven to effect a pluralist, multilingual official language policy at the national level, and made such an ambitious policy succeed for half a century already. Under British colonial rule from 1824 to 1958, Singapore developed a complex population, principally made up of Chinese, Malays, and South Asians. When self-government was granted in 1958, the new political leaders of the state faced the challenge of how to unify the mixed population as an independent nation. No feelings of trans-ethnic, collective identity had been nurtured under the British (quite the opposite, in fact), and the natural historical means to build a common national identity were not present, as Singapore had no long history with co-participation of the three major ethnic groups in struggles to defend and improve the state. In an attempt to begin to bind the population together, the new leaders of Singapore decided to focus on the future and stressed joint economic growth and the protection of equal rights as goals for the development of the state and its population. They set about promoting cultural and linguistic pluralism and the growth of a new Singaporean identity founded on respect for broad, traditional Asian values. The result has been a determined program of language planning sustained over many decades, with regular attempts to guide and sometimes redirect the common language practices of the population in the interests of the state and the maintenance of harmonious relations among the population.
With regard to state language policy and the question of what language might be privileged with the role of official language of Singapore, rather than selecting a single



language for such a status, the decision was taken to promote four official languages in a fully equal manner: Mandarin Chinese, Malay, Tamil and English. The first three languages provided official linguistic representation for the three major ethnic groups, and English was added as a fourth official language for its international, utilitarian value and as recognition of its important use as a language of interethnic communication. Additionally, Malay was given the role of national language, for political reasons, as Singapore’s larger neighbors to the north and south were both Malay-speaking states (Malaysia and Indonesia). Malay’s status as national language has, however, largely been symbolic and gives it no dominant role in daily life. In all major areas of formal life in Singapore, such as schooling, government administration, legal matters and media air time, the four official languages have been guaranteed equal treatment, and are very widely used. A key component of the Singaporean government’s attempts to integrate the population and remove barriers to communication between the different ethnic groups has been an evolving program of mandatory bilingual education. Initially, students entering school were required to select two of the four official languages as mediums of education. One language was used in 60 % of a student’s classes (the “L1”), and the other in the remaining 40 % (the “L2”). The government hoped that students would learn the languages of the other primary ethnic groups in Singapore, and in doing so increase their cultural knowledge of others in the population. However, it turned out that, for pragmatic reasons, most students selected a combination of English and the official language identified with their own ethnic group rather than the language of another group. 
As a result of this massive convergence on selection of English as the 60 % “first language” of education, the school system was fully reorganized, converting schools that had previously focused on teaching with Mandarin Chinese, Malay and Tamil as the L1 into L1 English schools. All schools became uniform in their structure, using English as the L1 and offering the other three official languages as L2. While the initial hoped-for cross-cultural bilingualism did not arise from bilingual education, and Chinese students learned through English and Chinese not Malay or Tamil, and Malay students took English and Malay, not one of the other two official languages, there nevertheless was a very positive side-effect of the restructuring of schools in Singapore. As the older Chinese, Malay and Tamil schools were merged into L1 English schools, students from Singapore’s different ethnic groups all began to attend the same schools and mixed with each other much more than in previous generations, improving their understanding of their neighbors from other ethnic backgrounds. The government’s regular involvement in aspects of language planning has also aimed at improvements in the speaking of two of the four official languages, with broad campaigns targeting adults as well as younger people. The “Speak Mandarin Campaign” asked speakers of different varieties of Chinese, such as Cantonese and Hokkien, to switch to using Mandarin both at work and in the home, in order to improve cross-generational knowledge of Mandarin Chinese and bring together the

Chinese population with a single form of Chinese known/spoken by all. Mandarin classes were offered free of charge to adults and the government began to require the use of Mandarin in workplace interactions between ethnically Chinese Singaporeans, rather than Cantonese or Hokkien. Over a period of years, the Speak Mandarin Campaign did indeed considerably improve the Chinese population’s proficiency in Mandarin and led to Mandarin becoming established as the common form of communication among Chinese from different dialect backgrounds. A second major campaign was focused on English and was a response to worries on behalf of the government that the use of colloquial Singapore English or “Singlish” was negatively impacting people’s abilities to speak standard English. Singlish is a combination of English, Malay and Chinese vocabulary and grammar and quite distinct from standard forms of English, though a majority of the words used in Singlish are in fact easily recognizable English words. When Singlish came to be used frequently in popular television shows during the 1990s, the government imposed a ban on Singlish in television and radio, voicing concern that Singapore’s ability to prosper as an international center of commerce depended on its use of standard English and that this was threatened by a decline into the vernacular forms of Singlish, not easily comprehensible to non-Singaporeans. In 2000, the “Speak Good English Movement” was then a further step to direct people away from Singlish and toward the “better” standards of international English. In reality, most Singaporeans seem to be able to switch between Singlish and standard (Singaporean) English according to the situation and there does not seem to be any obvious decline in abilities in the latter. 
Ironically, as a language form binding the mixed population together, it is actually Singlish which functions as the most obvious informal symbol of a race-neutral, general Singaporean identity, yet Singlish is felt to be much too informal for any official promotion in such an integrative role. Viewed overall, language planning instituted by the Singaporean leadership in the form of official multilingualism can be said to have stimulated the growth of a unifying national identity based on multiculturalism with equal linguistic rights for each of the three major ethnic groups in the population and the official language linked to each group. Linguistic pluralism has helped create conditions of social stability and a beneficial foundation for the development of a new Singaporean identity which references properties of all three major ethnic groups in an inclusive way, and emphasizes inter-ethnic cooperation and the celebration of cultural diversity. While the maintenance of multiple official languages requires both money and constant attention to preserve genuine equality, Singapore continues to show that such a policy is both possible and can be very successful, helping decrease the likelihood that language issues will become causes of conflict between different ethnic groups, and increase the potential for different groups to bond together as a single multicultural nation (Simpson 2007c).



38.7 The Philippines – difficulties in promoting the acceptance of a national language

The ethnolinguistic composition of the Philippines is similar to that of Indonesia in many ways – the country has a large population made up of many languages and ethnic groups (perhaps over 150 languages), spread over an extensive archipelago of islands. The area of the Philippines was ruled over first by the Spanish, from the 16th century to the end of the 19th century, and then by the USA in the first half of the 20th century, until independence was achieved in 1946. While 300 years of Spanish presence in the Philippines resulted in little knowledge of Spanish being established among the population, the US government saw the spread of education and the English language as a major priority, and by 1939 there came to be more (L2) speakers of English than any single Filipino language. At the time of independence, however, the new leadership of the country felt that the promotion of an indigenous national language was critical for the building of a unified national identity to help bind the mixed population together, and it set about the selection and development of such a language, first called Pilipino, later renamed Filipino. Several decades after this process had been initiated, however, the director of the Philippines Institute of National Language admitted that the national language had still not been accepted by the general population of the country and that Pilipino remained “a language in search of a people (or a nation)” (Gonzales 2007: 360). Currently, widespread success continues to elude the establishment of Filipino as a truly national language, and national identity has not been strengthened by language planning in the Philippines, unlike the situation in Indonesia and Thailand.
The reasons for this comparative lack of success in the Philippines’ national language program relate to both status and corpus planning issues, and the lingering interfering presence of the ex-colonial language English. First of all, the selection of a specific language form to be promoted as new national language was not handled well. The leadership of the independent Philippines took the decision to make use of a slightly adjusted form of Tagalog, the language of the largest ethnic group in the country (12 million people at the time of independence), as the country’s national language. Tagalog in the guise of national language was subsequently renamed as Pilipino in 1959 and then Filipino in 1973. This choice of a thinly disguised Tagalog as national language caused much discontent among other large ethnic groups in the Philippines such as the Cebuano (10 million) and the Ilocano (5 million), and was generally viewed as the Tagalog speaking leadership giving members of its own ethnic group unfair advantages in the future development of the country. Because of these feelings of resentment at the selection of the language of much of the ruling elite as national language, and the perception that Tagalog-speakers from the north of the Philippines would benefit heavily from the spread of Pilipino/ Filipino as national and also official language, speakers of other Filipino languages

did not take up the learning and use of Pilipino/Filipino with any great enthusiasm. Second, it has been widely acknowledged that the government has failed to develop Pilipino/Filipino well and has not provided it with the linguistic resources necessary to serve national and official language functions in an effective way. There has been no satisfactory standardization of the language, insufficient development of its formal vocabulary, lack of support for its effective spread in education, and a general failure to win prestige and respect for the language through the creation of literature and a linking with other forms of high culture and scholarly learning. These selection and implementation issues have contributed greatly to the ambivalent, lackluster attitude towards the national language that prevails to the present in much of the country, and its lack of success as a language uniting the nation and stimulating strongly positive feelings of national identity. The continued presence and attraction of English in the Philippines is also a factor which has affected the uptake of Filipino by the general population. English is considered to be extremely important for the access it provides to higher-paid jobs in the Philippines and the possibility of working overseas in various service occupations requiring a knowledge of English. The continued, common “clamor for more English” leads Gonzales (2007) to note that “the Filipino’s first priority in language-learning for life is English, not Filipino” (Gonzales 2007: 370). For the government too, the development of English skills among the population is financially very important, as the foreign revenue which the country receives from Filipinos working overseas is more than from any other “export” from the Philippines and critical for the economy. 
In order to achieve the goal of spreading knowledge of both the national language and English, bilingual education was established in schools in 1974 with English and Filipino being used as mediums of instruction to teach different subjects. However, rather than producing competent bilinguals, it is frequently complained that standards of English have dropped and that Filipino is also not being learned well. Many observers have blamed the program of bilingual education for this perceived failure to acquire either English or the national language in an academically proficient way. Yet such an assessment has also been challenged, and a 1986 study reported that well-run schools did a good job in teaching both languages, whereas poorly-run schools performed unsatisfactorily in their delivery of bilingual education. Gonzales (2007) concludes that the key factor in language teaching success has been economic, with “the quality of teaching higher in more affluent schools being higher due to the presence of more competent teachers” (Gonzales 2007: 369). It seems that the desire to spread advanced knowledge of both English and Filipino throughout the country often leads to pressure on the delivery of education in two languages which only the better supported schools are able to handle well, and the general attraction of English to the Filipino population can both be financially beneficial at times, but also serve to hinder academic progress due to problems in the actual implementation of bilingual education. The formal status of Filipino and English since the proclamation of the 1987 constitution is that Filipino is the national language of the Philippines and both Filipino



and English are the country’s official languages. The on-the-ground reality of everyday language use is that English is used in higher, formal domains of life, Filipino is spoken as an informal link language between people from different language groups and heard in national media, while regional languages dominate local informal interactions. There is consequently a hierarchy of languages (Hau and Tinio 2003) with English privileged as the language of political, economic and intellectual power and opportunity, above Filipino, which is regarded as a national lingua franca, useful for informal interethnic communication rather than as an expression of national identity, and regional languages which are valued for the ways they express personal and local identity. The national language has become a purely functional, pragmatic means of communication in contexts where neither English is appropriate nor local languages can be understood by all speakers, and the opportunity for Filipino to bind the population enthusiastically together as a single unified people with a shared national identity has unfortunately not been realized. The success of national languages depends on careful selection, development, promotion in education, and the winning of psychological acceptance as a prestigious symbolic representation of the nation, and in the case of Filipino, these conditions for success have not been effectively satisfied so far.

38.8 Laos – economic, geographical and population challenges for national language planning

Laos is another country in Southeast Asia which has not seen much success with the use of a national language to stimulate feelings of unity and belonging to a single people. The causes of this lack of success partially overlap with those described for the Philippines – economic underdevelopment leading to a lack of available financial support for mass education and language programs, insufficient standardization of a (potential) national language, and negative attitudes among some groups to the privileging of one particular language in a national role. Other challenges facing Lao language planning have been the geography of the country and a lack of infrastructure connecting different parts of the country, and complications caused by issues relating to adjacent Thailand and the Lao population living there. The six million population of Laos is made up of 65 % who are ethno-linguistically Lao, related to the Thais, 25 % who speak Mon-Khmer languages, and 10 % who are speakers of Sino-Tibetan languages. These different groups are distributed over a large area dominated by mountains and forests with few major roads and railways, making travel and commerce less easy than in many other Southeast Asian countries. Before the 20th century, there were several different kingdoms in the territory of modern-day Laos and no unity or shared identity among the groups living in the area. The first attempts to forge a national identity for the “Lao” people were actually made by the French as a defensive measure against Thailand. During the ultranationalist period of the 1930s in Thailand, claims were made by the Thai government that the Lao were related to the Thais and so should be absorbed into a growing Thai nation. The French, who had occupied a significant portion of modern Laos, strove to counter this expansionist move with their own local nationalist program designed to create and stimulate a separate Lao national identity, and make the population of the Lao regions feel distinct from their neighbors in Thailand. The way that this was done was to present the language and culture of the ethnically Lao group as the all-encompassing national identity, despite the fact that the languages and cultures of the Mon-Khmer and Sino-Tibetan groups are significantly different from those of the Lao. Later on, in the 1970s, there was a new initiative to build unity and feelings of national identity among the population of the country. However, this resulted in use of the term “Lao” also for the non-Lao groups, labelling the Mon-Khmer as “Midland Lao” and the Sino-Tibetan speakers as “Upland Lao”. The attitude of the government therefore seems to have been that the identity of the entire population and the country should be centered around Lao ethnicity, which has not been welcomed by members of the non-Lao groups. The general results of the government’s attempts to stimulate growth of a Lao national identity since the 1970s have been rather weak. With regard to national language planning, no effective steps have been taken to establish and spread knowledge of a well-standardized national language. The form of Lao associated with the capital Vientiane may be widely understood in the country and a proxy-national language, but it has not been made the language of education and is not the only form of Lao to be used in formal acts of communication such as government administration, public announcements and religious activities.
The areas of life which are typically utilized to build up familiarity with and use of a new national or official language are therefore not being exploited in Laos to strengthen the status of any single variety of language, and although Vientiane Lao is often heard, it is not perceived to have the strongly unifying power of national languages elsewhere, for example Thai and Vietnamese. In contrast to spoken Lao, the way that Lao is written is in fact uniform all over the country, making use of a script which is unique to Lao. However, literacy levels are generally low in Laos and so the potential for written Lao to serve as a symbol of unity in the Lao population is not being realized. A second challenge to attempts to build a strongly unified Lao nation arises from the geographical features dominating the country and the difficulties for internal communication and travel caused by the presence of mountains and forests in most of the country, which limit regular, integrative interaction between people from different parts of Laos. An additional obstacle to the development of Laotian national identity is the odd composition of its population, created as an artificial grouping by the French colonial expansion in Indo-China. The heterogeneous mixture of the Lao, Mon-Khmer and Sino-Tibetan components of the population makes it difficult to identify any common cultural or linguistic symbols that could be used to promote a sense of shared national consciousness. Furthermore,



the majority of the Lao people, approximately 80 %, actually live in northeast Thailand, not in Laos, due to the international borders formed by Siam and the French. Such a separation of the Lao into two territorial states makes the potential imagining of a truly “Lao nation” just within Laos considerably more complicated. Finally, the use of television programming in Laos to spread a national form of Lao language and identity is regularly hindered by the accessibility of Thai television programs, which are better financed and produced and frequently attract more viewers than programs produced by the Lao government, promoting knowledge of Thai rather than a semi-standardized form of Lao. For all these reasons, it is not surprising to find that the use of language to help unify a nation has not been successful in Laos and it is likely that the current situation of a weakly-linked population will continue on into the future, unless there is significant economic growth and the financial means to invest much more in physical and linguistic communication in the country (Keyes 2003; Simpson and Thammasathien 2007a).

38.9 Burma/Myanmar – language planning and political goals

Watkins (2007: 263) observes that two major struggles and tensions have characterized the sociolinguistic situation in Burma/Myanmar during the 20th century. The first of these has been “a nationalist drive […] to establish, maintain and develop an independent state free of colonial and other foreign influence, coalescing an essentially Burmese national identity at the centre and heart of the country” (Watkins 2007: 263). The second tension has been the developing relation of the majority Burman ethnic group to the range of other minority groups which constitute a third of the population, and how the latter might be integrated in a single nation together with the majority Burmans. To some extent, challenges facing the development of an all-encompassing national identity in Burma/Myanmar resemble those experienced in Laos – the proportions of majority to minority groups are similar in both countries, and in both countries there have been economic challenges holding back the successful promotion of a national identity which is heavily anchored to the language and culture of the ethnic majority. In Burma/Myanmar there have also been additional political complications which have hampered the success of language planning and popular enthusiasm for “national” Myanmar culture. Early moves to make use of language and culture in Burma/Myanmar as means to unify the population in a struggle against outsiders were prominent in the 1930s, in the nationalist anti-colonial movement, which campaigned against the use of English, as a foreign imposition by the British colonial government, and exhorted the central majority to be proud of Burmese language and cultural traditions. It was argued that there was a need for Burmese to be asserted as the national language and for other components of Burman culture to be stressed as symbols of resistance to continued British rule and an affirmation of the desire to form an independent nation. When independence was attained in 1948, the new leadership faced the problem common in new multi-ethnic states of how to unify the mixed population and coalesce, in some way, the many different groups which had previously not had any strong connections with each other, or feelings of loyalty to a single polity. The basic approach adopted by the government was to promote the language and culture of the Burman majority as representations of the entire nation; however, many non-Burman groups in the border areas were against the attempt to brand the whole country with linguistic, cultural and political “Burmanization” (Callahan 2003). Nevertheless, the push to spread Burmese, the language of the Burman majority, continued on in the decades following independence, with increased literacy campaigns in the 1960s and the development of a standard form of Burmese, whose learning the government thought would be able to convey its political message and convert the population into effective socialists. Later on, in the 1980s, politics and the struggle for political control of the country triggered further initiatives relating to the promotion of language and culture. The military junta which had taken power in 1988, following widespread demonstrations caused by the near-collapse of the economy, became concerned that the political opposition, the National League for Democracy/NLD, might succeed in gaining anti-government support from the many minority groups for an end to military rule. In order to block the formation of alliances between the NLD and non-Burman minorities, the military government made attempts to keep the minority groups fragmented and disconnected from each other and the NLD through encouraging the use (and in some cases revitalization) of their distinct languages and different cultural practices.
It was thought that if communication between the minority groups and the NLD could be hampered by linguistic differences, this would help maintain useful divisions between these groups. Callahan (2003: 144) notes that, quite paradoxically, this stimulation of ethnolinguistic differences by the military government came at a time when it was also involved in a politically-driven “cultural homogenization program […] designed to erase differences among the peoples of Burma” (emphases added), and was making claims that the peoples of Burma were all closely connected in a single ethnicity. In 1989, the State Law and Order Restoration Council (SLORC) established by the army took the view that the anti-government demonstrations of 1988 had been caused by a lack of unity in the country, and that a re-imaging of the nation was required to “remind” the population of its close ethnic and historical links to each other. The name of the country was changed from “Burma” to “Myanmar” and the innovative (but ungrounded) claim was made and widely disseminated that all groups in the country were descended from a single ancient race called the “Myanmars”, hence all the indigenous inhabitants of modern-day Myanmar were ethnically related as one people. The Burmese language was renamed “Myanmar”, and all uses of the term “Burma” were removed from books, signs and public records.

Language and the building of nations in Southeast Asia

The military government suggested that the switch from the terms Burma/Burmese to Myanmar for reference to the country, the people and the language would benefit the country and make the non-Burman minorities feel more included in the nation, due to use of a broader national term rather than terms related to the Burman majority. However, as pointed out in Callahan (2003) and Watkins (2007), this shift in terminology did nothing to change the fact that it was still the language of the Burman majority group that was being presented as the national language, and the switch to use of a different name for the language appeared vacuous to many and a continued show of dominance of the Burmans over the minority population. The national language situation in Burma/Myanmar is thus quite akin to that in Laos and the Philippines, where the language of the largest group in the country has been promoted as the national language in a situation where at least one third of the population are not speakers of this language, and it is felt that unfair advantages and symbolic power are transferred to native speakers by such an advancement of the language of the majority/largest group. As in the Philippines, the name change applied to the national language has failed to change perceptions of its ethnolinguistic bias and even increased negative feelings towards its nationwide promotion. In assessing the effects of language planning policies in Burma and attempts by the post-independence leadership to create greater unity in the population, Watkins (2007) argues that the promotion of Burmese/Myanmar as a national language has not stimulated the growth of a strongly unifying national identity, and that efforts to establish such a binding identity among the population have been hampered by three non-linguistic factors, compounding the difficulty of developing a genuinely inclusive approach to national/official language issues. 
First, the borders of the country were created during the colonial period and resulted in a very mixed population with no shared ethnicity or history being grouped together in a single territory, as in many other ex-colonial states. Second, Watkins notes that the economy of the country has long been heavily depressed and that such a situation has an important negative effect on the development of positive feelings of pride and hope in the nation and its future. Finally, it is pointed out that “the attempted promotion of a Myanmar national identity […] is strongly associated with the military government” (Watkins 2007: 286) and this association may cause a negative reaction toward its promotion in much of the population who are sympathetic to the political opposition. It will be interesting to see if attitudes toward language and national identity may perhaps undergo change in Burma/Myanmar now that a new, democratically elected leadership has replaced the military government, if foreign investment can also be attracted into the country to improve the condition of the economy and ethnic stability can additionally be achieved.

38.10 Cambodia – the development of a natural national language held back by upheaval

Cambodia is a state in which national language planning efforts have been severely impacted by internal conflict, regime change and a poor economy. While there have been periodic attempts to build a national identity during the course of the 20th century, these have regularly faltered due to lack of sufficient resources and the effects of political and civil instability, and while the country has a population which overwhelmingly comes from one ethnic group – at least 90 % are ethnically Khmer – programs of “Khmerization” to promote strongly positive feelings of belonging to a single re-emerging nation have not been greatly successful, despite the availability of potential symbols of nationhood, such as widely shared language (Khmer) and religion (Theravada Buddhism). The first attempts to develop a new national consciousness in modern times began in the 1920s and 1930s under French rule, when Cambodian intellectuals and French colonial administrators jointly promoted the idea of a Khmer/Cambodian nation which would re-kindle the previous glory of the Angkor period (9th–14th century) and project it in an even greater way as a modern nation. The spreading of education, creation of printed materials in Khmer, and the development of Khmer as a national language were all seen as extremely important elements in this process of national restoration and improvement, and would distinguish Cambodia significantly from its neighbors, Vietnam, Laos and Thailand, although French remained the official language of government matters (Heder 2007). Later on, as independence came to Cambodia in 1953, the nationalist momentum of the 1930s was lost, as Prince Norodom Sihanouk became the leader of the country. 
Sihanouk had no great interest in promoting the status of Khmer further in formal domains and French was retained as the language of administration, higher education and politics through until the end of the 1960s. The immediate post-independence experience of Cambodia was therefore different from other newly independent nations in Southeast Asia such as Vietnam and Indonesia which quickly moved to develop an indigenous language for use in the formal domains of life, as a replacement for ex-colonial languages. When Sihanouk was eventually overthrown by Lon Nol in 1970, the direction of official and national language policy changed from the maintenance of French to support for greater roles for Khmer, and during the Khmer Republic (1970–1975) the state helped spread new writings in Khmer and disseminated nationalist propaganda in praise of traditional Khmer culture and the greatness of the Khmer race. As the Khmer Republic fell, the regime of Democratic Kampuchea (1975–1978) took its place, led by Pol Pot, ushering in three murderous years of the persecution of all those deemed against the establishment of a perfect Marxist Communist state aimed at restoring Cambodia’s glory. During this period of internal genocide causing the deaths of over 20 % of the population, only the speaking of Khmer was sanctioned by those
in control of the country, and use of other foreign languages could lead to a person’s execution. Ultimately, Pol Pot’s dictatorship collapsed after only a few years when Vietnamese forces invaded Cambodia, and as peace was restored, the People’s Republic of Kampuchea (1979–1991) began a new systematic development of the Khmer language, spreading its use in government administration and as the language of education. Publications in Khmer also dominated from this time on. However, despite such positive moves to promote knowledge and the use of Khmer as a unifying national language, implementation issues plagued the success of linguistic Khmerization and limited economic resources held back the spread of mass education, so that much of the population remained illiterate and unable to take advantage of newspapers, books and magazines printed in Khmer, and at the beginning of the 21st century, only a third of the adult population can effectively read and write the language. Cambodia therefore remains well behind most other Asian countries in the achievement of widespread literacy, as well as in young people’s attendance of schools. While there has consequently been some progress toward the development of Khmer as a national language also used in official language roles and in education, and Khmer is unquestionably the language spoken in some form by just about all of the population in Cambodia, the general effect of national language policies and initiatives has been weak, due to the dual problems of semi-constant political upheaval and lack of financial backing for government attempts to improve education and access to knowledge through Khmer. 
As a result, the use of language planning to help stimulate feelings of national unity has been more limited in what it has achieved in Cambodia than in neighboring Thailand and Vietnam, and, as noted by Heder (2007) “after a series of at best weak and at worst catastrophically self-destructive regimes since the 19th Century – late classical, colonial, royalist, republican, Communist and liberal democratic – Cambodia still lacks an effective modern state and a self-sustaining national identity” (Heder 2007: 288).

38.11 Malaysia – using state language policy to protect a challenged majority

In Malaysia, the government has played a major role in language planning since attaining independence from British rule in 1957, and the dominating theme of these activities, in the eyes of the national leadership, has been the use of language policy to help create a balance in socio-economic opportunities among the mixed population, as well as stimulate a unified national identity and create a prosperous, modern state. The population of the country consists of three major groups. The largest group, making up 69 % of the country’s total, is officially referred to as the Bumiputra or indigenous population, and consists of Malays, 55 % of the population of the country, defined in the constitution as those who speak Malay, practice Islam and maintain
Malay culture, and indigenous non-Malays, mostly living in Sarawak and Sabah on the island of Borneo, comprising 14 % of Malaysia’s population and speaking Austronesian languages such as Iban, Dusun and Kadazan. The other two large ethnic groups are the Chinese, 24 %, speaking a range of varieties of Chinese, and South Asians, 7 %, mostly speakers of Tamil. The ethnic mix and the proportions of the different groups to each other closely resemble the population situation in Singapore. As in Singapore, prior to independence, there was little integration of the different groups, and the different groups maintained their own schools where Malay, Chinese and Tamil were used as mediums of instruction. With the achievement of independence in 1957, an important worry among the majority Malay population was that the economically much stronger Chinese and Indian communities might come to control Malaysia if nothing were done to provide special protection and equalizing opportunities for the Malays and other Bumiputra. In the area of language, Malay politicians argued that Malay should be given a privileged position following independence as the country’s single national language, but other, non-Malays worried that this would significantly disadvantage them, as many Chinese and Indians only had a very basic proficiency in Malay. They therefore suggested that a multilingual policy be adopted instead, with four official languages: Malay, Mandarin Chinese, Tamil, and English. Ultimately, the Malay majority was successful in its bid to establish a special position for Malay, and Malay was made into the country’s single national language, a role it still maintains. 
Malay was also given the status of official language of the state, with English being recognized as a second official language for a restricted period of ten years to help with the transition of the country in the immediate post-independence years, after which it was anticipated that Malay would become the only official language of Malaysia. This language situation indeed continued until 1967, when the temporary period of English as a second official language came to an end and English formally lost this role. However, the National Language Act of 1967 which re-affirmed Malay as Malaysia’s official language also permitted “the continued use of English at the discretion of state and federal officials, as well as for the use of Mandarin and Tamil (and other Indian languages) in all unofficial matters” (Ganguly 2003: 248). In 1969, Malaysia experienced disturbing inter-ethnic violence due to discontent at the results of national elections. An investigation of the causes of the conflict carried out by the government’s new National Operations Council attributed these, in large part, to the continuing socio-economic differences between the Malay and Chinese and Indian communities and the lagging behind of the Malays and other Bumiputra. The policy which the government subsequently adopted as an attempted solution to even out the distribution of wealth among the population was the introduction of increased privileges for the Bumiputra, including cheaper housing, priority in applications for government employment and licenses for business and commerce. Additionally, in education, the language of instruction in post-primary government schools came to be Malay, with a phasing out of English as medium of education from 1970 onward.



The National Operations Council also presided over a change in the name of the national language designed to be more inclusive in nature. A switch was promoted from use of the term Bahasa Melayu (language of the Malays) to the new designation bahasa Malaysia (language of Malaysia). It was hoped that this name change would deflect criticism from non-Malays that the national language was the language of one particular ethnic group and not genuinely representative of the mixed population. However, as with the strategic renaming of Burmese as Myanmar, and Tagalog as Filipino, in attempts to increase the general acceptance of a national language associated with one ethnic group, the recasting of Malay as Malaysian has not obviously changed perceptions among non-Malays that “Malaysian” is primarily a symbol of the Malay group and not the entire population. The embedding of Malay as the national language and also the promoted language of everyday use among the population has continued over the years, making it heavily salient in daily life in Malaysia and a language that is regularly heard both in informal and formal acts of communication. Due to the widespread introduction of Malay-medium teaching in government schools since the 1970s, there is a good knowledge of Malay among all generations who have passed through the education system since that time, although Chinese and Indian Malaysians also use Chinese and Tamil in casual communication with other members of their ethnic groups. While a high level of proficiency in Malay thus became much more common among ethnically non-Malay parts of the population from the 1980s onward, the linguistic advancement of Malay was accompanied by a decrease in rising generations’ abilities in English, with the result that new university graduates found it increasingly difficult to find employment in multinational firms doing business in Malaysia. 
Reacting to this unanticipated development, viewed as unwelcome for its effects on the Malaysian economy, the prime minister and leader of the country Dr. Mahathir decided on an important change to language policy in education and announced that, from 2003 onward, all government schools would use both Malay and English as mediums of instruction, with English being required for the teaching of science and mathematics. Bilingual education was therefore introduced throughout the country and the pragmatic usefulness of English was stressed by Mahathir, who argued that knowledge of modern science, technology and business strategy was only easily accessible through a knowledge of English, and that the learning of English was consequently still vital for the progress of Malaysia as a successful, modern state. English was subsequently given the status of “second most important language” of the country (now shortened to “second language”) and continues to have a significant presence in various formal and official areas of life and in business in Malaysia, required in many government documents (complementing the use of Malay), and widely used in the financial sector, engineering, medicine, scholarly discussion and private business, among other domains (Omar 2007). Public language use and learning have thus been directed and re-directed by the leadership of the country multiple times since the arrival of independence, and state
language planning has attempted to achieve both nation-building and utilitarian goals, with the nationwide promotion of Malay aimed at protecting the originally challenged position of the Malay majority and simultaneously stimulating more national unity among the different parts of the population, and English made use of to help Malaysia’s competitiveness in global markets and its growth as a modern nation. As a result of such language policies, Malay has certainly become a very important marker of national identity for the Malays, though it is less clearly so for the non-Malays. However, the additional sideline presence of English and the continued toleration of other languages in informal domains has helped Malaysia and its mixed population remain stable since the 1970s, with language issues not becoming the cause of any serious conflict in the country.

38.12 Summing up – common themes and lessons to be learned

This chapter has attempted to give a sense of the different trajectories taken by countries in Southeast Asia in the development of national and official language policies as part of general nation-building initiatives. In describing some of the principal factors that have affected governmental language planning and its outcomes in Southeast Asia, a number of themes have recurred, offering lessons for future attempts to direct the language habits of national populations in ways that will benefit both individuals and the formation of stable, successful nations. In this closing section, five of these themes will be returned to, highlighting their importance for state-led language planning. Several of the studies reported here emphasize how important the selection of national and official languages is for the subsequent success of governmental language policies, and how choosing the “right” language for the roles of national and official language(s) is essential in order to avoid negative reactions in ethno-linguistically mixed populations. In the Philippines, a poor choice of national language, Tagalog tagged as Pilipino, resulted in a largely failed national language program and significant apathy towards the learning and use of the national language, whereas in Indonesia, with a similar very mixed population, the choice of a smaller language not associated with any powerful group has led to the very successful establishment of a single official language for the entire country. In other cases where the language of the large majority group is promoted as the national language of a country, we find that this may be accepted by minority groups if the latter can also attain benefits through knowledge of the language and participation in the national economy, as in Thailand. 
However, where minorities are not made to feel fully equal partners in the development and prosperity of a state, as perhaps in Malaysia with its privileging of the Malay group, the selection and promotion of the majority language as single national/official language may not serve to unify a population well.



A second recurrent theme we have seen is the observation that simply giving an existing language a new name as a national language does not result in any significant transformation, and may often be resented by groups who see such renaming as an attempt to promote the language of a dominant group in an underhand way, by means of a linguistic disguise. This occurred with the renaming of Tagalog as Pilipino, Burmese as Myanmar, and also Malay as Malaysian. However, it can be added that the adoption of a new name for an existing language or country is not necessarily bound to trigger a negative reaction from the public and would seem to depend on such a process being accompanied by other actions that enhance the situation of speakers, their opportunities in life and/or their self-esteem, which has happened successfully with Malay being recast (and then strongly developed) as Indonesian, and the renaming of Siam as Thailand during the country’s reorganization as a modern nation. A third issue which has surfaced multiple times in the chapter is the importance of corpus planning in national language programs and the need for sufficient linguistic development of languages promoted in national and official language roles. A lack of standardization and expansion of vocabulary in technical, commercial and academic fields has clearly held back the success of national/official languages in Cambodia, Laos, and the Philippines, whereas the concerted development of these resources has made standard Thai, Indonesian and Vietnamese into languages that can be used in all formal (and informal) domains of life, in government business, higher education, legal matters and commerce, without the need for any ancillary official language such as English, French or Spanish. 
Aspects of the learning of new national and official languages have also affected how well such languages come to function in different countries, especially in situations where high level bilingualism is targeted, as in Malaysia, Singapore, and the Philippines. The observation made in the Philippines is that the attempt to use both English and Filipino as mediums of instruction in higher education has placed too high a learning burden on students whose home language is often a third language, and this has caused standards of English and the national language to be worryingly low in many instances. Singapore, by way of contrast, has managed to develop advanced bilingualism in much of its population, and so the critical issue may be the quality of bilingual education that a state can provide, which may in turn depend on resources that are available, such as well-trained teachers and appropriate teaching materials. However, even Singapore has experienced concerns about the achievement of bilingualism in its schools, and the Goh Report in 1978 noted that the bilingual education program was not producing the high results hoped for, leading to a lowering of targets for certain students who showed difficulties in second language learning. Finally, a major non-linguistic factor which has frequently been seen to impinge on the success of national and official language planning and its potential role in nation-building is the strength of a country’s economy. A strong economy will help governments assign important financial support to corpus planning activities and
education, both important for the development and spread of new national/official languages, and a strong economy may also engender feelings of pride in national success and bolster the growth of national identity, as, for example, in Singapore. Where a country experiences extended economic difficulties, there will be less support available for national language and education programs, and psychological pressure on the stimulation and maintenance of a strong national identity. This has been the case in Burma, Laos, Cambodia, and the Philippines, where the infrastructure to develop and spread national language throughout the population has been lacking, when compared with other Southeast Asian countries, and general feelings of pride in national success and achievements are weaker than in other states. “If the nation thrives, so will its language” (Hidup Bangsa, Hidup Bahasa) is a belief attributed to Malaysia’s leader Dr. Mahathir in Omar (2007: 356), and such an expression underlines the frequent connection which may exist between economic buoyancy and popular attachment to a promoted national language. All over the world, language planning and the spread of a shared language across a population may often be major components in the successful building of new nations, and used to bind different ethnic groups together as a single people at the national level. Yet there are many difficulties to overcome in such a process, in the selection, development, implementation/spread of national languages and in winning their acceptance. 
A study of the states of Southeast Asia shows how these steps in national and official language planning are constrained by a range of practical and psychological factors and their interaction with each other, and that there is considerable variation in the ways that countries are able to overcome demographic, linguistic, and economic challenges to establish viable and enabling language policies that will lead their mixed populations toward unity, peace and prosperity. Hopefully, as more insight is gathered about past attempts at national language planning among complex populations, future governments will be able to manage statewide language issues with more consistent success.

References

Bertrand, Jacques. 2003. Language policy and the promotion of national identity in Indonesia. In Michael Brown & Šumit Ganguly (eds.), Fighting words: Language policy and ethnic relations in Asia, 263–290. Boston: MIT Press.
Callahan, Mary. 2003. Language policy in modern Burma. In Michael Brown & Šumit Ganguly (eds.), Fighting words: Language policy and ethnic relations in Asia, 143–175. Boston: MIT Press.
Ganguly, Šumit. 2003. The politics of language policies in Malaysia and Singapore. In Michael Brown & Šumit Ganguly (eds.), Fighting words: Language policy and ethnic relations in Asia, 239–261. Boston: MIT Press.
Gonzales, Andrew. 2007. The Philippines. In Andrew Simpson (ed.), Language and national identity in Asia, 360–374. Oxford: Oxford University Press.
Hau, Caroline & Victoria Tinio. 2003. Language policy and ethnic relations in the Philippines. In Michael Brown & Šumit Ganguly (eds.), Fighting words: Language policy and ethnic relations in Asia, 319–348. Boston: MIT Press.
Heder, Steve. 2007. Cambodia. In Andrew Simpson (ed.), Language and national identity in Asia, 288–311. Oxford: Oxford University Press.
Keyes, Charles. 2003. The politics of language in Thailand and Laos. In Michael Brown & Šumit Ganguly (eds.), Fighting words: Language policy and ethnic relations in Asia, 177–210. Boston: MIT Press.
Lê, Minh-Hằng & Stephen O’Harrow. 2007. Vietnam. In Andrew Simpson (ed.), Language and national identity in Asia, 415–441. Oxford: Oxford University Press.
Omar, Asmah Haji. 2007. Malaysia and Brunei. In Andrew Simpson (ed.), Language and national identity in Asia, 337–359. Oxford: Oxford University Press.
Simpson, Andrew & Noi Thammasathien. 2007a. Thailand and Laos. In Andrew Simpson (ed.), Language and national identity in Asia, 391–414. Oxford: Oxford University Press.
Simpson, Andrew. 2007b. Indonesia. In Andrew Simpson (ed.), Language and national identity in Asia, 312–336. Oxford: Oxford University Press.
Simpson, Andrew. 2007c. Singapore. In Andrew Simpson (ed.), Language and national identity in Asia, 391–414. Oxford: Oxford University Press.
Vasavakul, Thaveeporn. 2003. Language policy and ethnic relations in Vietnam. In Michael Brown & Šumit Ganguly (eds.), Fighting words: Language policy and ethnic relations in Asia, 211–238. Boston: MIT Press.
Watkins, Justin. 2007. Burma/Myanmar. In Andrew Simpson (ed.), Language and national identity in Asia, 263–287. Oxford: Oxford University Press.

Subject index ablative 424, 592–593, 606, 838 absolutive 268, 390, 424 accusative 424, 835, 883 additive marker 490, 840–841 adjectives 346, 376, 378, 417, 422, 424, 428, 443, 444, 482, 484, 493, 605; see also property concepts; stative verbs adpositions 351–353, 363, 364, 541, 592, 605, 739, 745, 801, 803; see also postpositions; prepositions adverbializers 346, 384 agentive nouns 267–268, 515 agentivity markers 424, 448 agriculture (early) 34  fn, 45  fn; see also cereal agriculture allative 316, 424, 592, 836 alliteration 469, 482, 821–822 ambitransitive verbs 6, 422 anaphoric function 316–317, 579, 756  fn, 777, 787–788, 830, 839–841 andative/venitive 421, 795 animacy 256, 257, 310, 312, 326, 355, 390, 420, 425, 448, 485, 665, 722, 733, 741, 744, 752, 757, 786–787, 836 anticausative 114, 363 applicative 362–363, 379, 392, 421, 541 archeology 23  fn, 34  fn areal convergence; see language contact argument drop 778, 779, 784, 790, 830–831 assimilation (cultural/linguistic) 673, 862, 911, 919–920 assimilation (phonological) 306, 474, 502, 516, 561 associative marker 392, 451, 458 attributive phrase/clause 458, 459, 568, 839 augmentative 288, 723 Austric hypothesis 62, 262  fn Austro-Tai hypothesis 103–105, 142, 248, 264–265 auxiliary verbs 111, 426–428, 443, 534, 580, 584 basic constituent order 7, 124, 268, 277, 292  fn, 296, 315, 339, 388, 448, 483, 826, 423, 827 benefactive 320, 362, 392, 421, 426, 520, 565, 593, 776, 791, 795–797, 801–802 https://doi.org/10.1515/9783110558142-039

breathy voice 73, 164, 188, 195, 241, 284–286, 306, 343, 413, 436, 440, 473–474, 501, 509, 511, 554–556, 678–680, 684–694, 891 Burma (country); see Myanmar Cambodia –  linguistic studies 14 –  nation building 908  fn –  national language 559  fn causative 113, 267, 318, 322–324, 359, 362, 386, 391–392, 414, 426, 455, 481, 516–517, 532–533, 562–563, 584, 793  fn, 802–803, 819 cereal agriculture –  origins and spread in MSEA 34  fn, 45  fn –  pre-cereal subsistence 53 Champa inscriptions 865; see Cham epigraphy class terms 736–737, 745–749, 758 classification (of languages) –  Chinese official approach 121 –  internal classification; see individual language families –  of Austroasiatic languages (in MSEA) 179  fn –  of Hmong-Mien languages 247  fn –  of Sino-Tibetan languages 115, 207  fn –  of Tai-Kadai/Kra-Dai languages 225  fn –  of Tibeto-Burman languages 115, 207  fn –  of Trans-Himalayan languages 115, 207  fn –  sub-family proposals; see individual language families classifiers 6, 255–257, 279, 281, 288, 347, 351, 354, 403, 416, 419, 420, 423, 424, 428, 443, 447, 451–452, 488–489, 539, 504, 541, 591, 595, 611, 650, 654, 658–668, 733  fn, 773, 781, 784  fn, 802 –  and agreement 733, 736 –  and politeness 764  fn –  and word order 753  fn –  in SEA languages 733  fn –  mensural classifiers 738–739, 746, 748, 759 –  numeral classifiers 6, 256–257, 347, 354, 403, 420, 428, 540, 591, 595, 739–748, 753, 756–759, 765–766, 773, 784  fn, 802 –  origin of 747  fn –  sortal classifiers 311, 420, 738–739, 746, 748, 758–759, 762, 786


–  vs. other nominal classification systems 734  fn clear voice 693 comitative 318, 362–363, 392, 424, 664, 795 completive 317, 426–427, 530–531, 580, 779, 794, 799 complexity (types of) 802–804 –  hidden complexity 803–804 –  overt complexity 802–803 compounds, calqued 723 compounds, verbal (vs. multi-verb predicates) 520, 558 consent seeking particle 850 consonant clusters 479, 650, 653, 661–662 –  and writing systems 882–897 constituent order; see word order contact; see language contact context-dependence and ambiguity 720 continuous (aspect) 352, 426, 601, 607, 795 converbs 114 convergence; see language contact copula 295, 318, 333, 351, 417, 422, 428, 449, 527, 534–535, 539, 585, 842 coreference (subject/object) 779, 803 coverbs 529, 531, 541, 791, 293  fn, 801 creaky voice 68, 73–74, 164, 195, 284–286, 289, 304, 306–307, 323, 403, 405, 413, 428, 501, 509–511, 554–556, 692, 869, 886–887 –  dependent/induced creaky tone 307 cultural words; see Kulturwörter

dative 563, 566, 795, 801–802, 835 definite article 489 definiteness/indefiniteness 256, 278–279, 416–417, 419, 452, 489, 528, 559, 665, 741, 756, 761–764, 773, 776, 781, 784  fn, 802–803, 836 –  and numeral classifiers 784  fn deictic verbs 495 deictics 6, 309, 313, 316–317, 417, 812, 838 –  deictic classifiers 739, 742, 746, 756, 763 demonstratives 250, 268, 290, 354–355, 376, 383, 385, 388, 399, 417–418, 424, 428, 444–445, 452, 457–458, 484, 516, 538–539, 563, 570, 587–588, 595, 742, 753, 763, 839–842 dependent-head order 289, 290, 296, 388, 452, 458–459

determiners 444, 452, 488, 664, 666, 668 digital literacy (in MSEA) 903 diminutive 288, 723 diphthong 4, 101–102, 164, 196, 200, 202, 232, 303, 305, 340–342, 375, 412, 428, 438–439, 469, 474, 506, 507–509, 548, 553–554, 595, 633, 676–677, 679, 690, 692, 706, 884, 894 diphthongization 164, 196, 553–554, 690, 692 directional markers 116, 118, 312, 316, 381, 421, 481, 489–490, 520, 541–542, 580  fn, 789  fn, 846 directional verbs; see directional markers dispersal history (of MSEA language families) 8  fn, 33  fn; see also individual language families distant phylogenetic relations 261  fn disyllabic structure 3, 100, 103–105, 313, 345–346, 348, 410–411, 446, 501, 515, 549, 676, 679 disyllabic words 345–346, 348, 411, 446, 515, 679 ditransitive constructions 295, 314, 318, 391, 454, 519, 565, 836–837 DNA analysis; see Genomics domestication of animals 35 domestication of millet; see cereal agriculture domestication of rice; see cereal agriculture dual 174, 289, 310, 382, 418–419, 428, 495, 535–539, 588, 595, 778 durative 280, 288, 530–531, 794

echo words 811–812, 816, 820–821 École française d’Extrême-Orient (EFEO) 150–157, 857  fn economy principle 689, 760, 802–803 echo words/morphology 811–812, 816, 820–821 elaborate expressions 726 emphatic particle 445, 534, 587, 838, 851 emphatic pronouns 352, 353 epenthetic vowel 502 epigraphy 855  fn –  Burmese 867  fn –  Cham 864  fn –  earliest evidence of writing 855 –  Khmer 861  fn –  Mon 857  fn –  Pyu 869  fn



–  Sanskrit/Pali 871  fn –  Siamese/Lao 865  fn –  study of epigraphy 857 epistemic markers 312, 314, 329–330, 461 ergative 268, 386–387, 390, 424, 837 ethnolinguistic communities 50, 154, 167, 403, 429, 880, 907  fn, 937, 941, 946–947 euphonic reduplication 559 Eurocentricity 4, 17; see also Latin grammar model evidential markers 312, 314, 327, 329–330, 802 experiential 426, 794 explicitness (vs. economy) 777, 802 expressives 62, 309, 312–313, 315, 222, 385, 513–514, 811  fn; see also ideophones, echo words; onomatopoeia –  and eloquence 817 –  and iconicity 814  fn –  and the principle of complexity 816 –  contexts of use 817 –  in the languages of MSEA 811  fn extra-linguistic context 608, 777, 780, 825 family trees; see classification; phylogenetics; individual language families ‘finish’–verb 317, 530–531, 571, 779, 794, 797–799, 802 France; see French French linguistics (in MSEA) 149  fn –  epigraphy 69, 154, 857, 863–864, 870–871 –  Institut national des langues et cultures orientales (INALCO) 16, 157 –  structuralist principles 150 –  École française d’Extrême-Orient (EFEO) 150–157, 857  fn frequentative 563 fronting (word order) 293, 315, 329, 398, 459, 487, 519, 528, 570, 575–579, 608, 663, 826  fn, 827–828, 833, 837, 843 gender 388, 409, 417–418, 428, 435–536, 617, 638, 724, 733, 735–737, 756, 820; see also masculine/feminine distinction genitive 326, 386–387, 414, 423–424, 579, 607, 739, 741–743, 746, 755–756 Genomics 26, 28–29, 36  fn ‘get’–verb 580, 582, 725 ‘give’–verb 295, 320, 391, 517, 519–520, 528, 532, 565, 584, 725, 776, 780, 789  fn


glides 304, 306, 374–375, 435–437, 439, 473, 503, 508, 549, 631, 661 glottalization 344, 375–376, 440, 474, 552, 662, 679, 689, 892, 896 goal (semantic relation) 295, 310, 519–520, 541–542, 565–566, 595, 795 grain-based agriculture; see cereal agriculture grammaticalization 773  fn –  and pragmatic inference 777  fn –  and serial verb constructions 789  fn –  classifiers and definiteness 784  fn –  ‘finish’–verb 317, 530–531, 571, 779, 794, 797–799, 802 –  ‘get’–verb 580, 582, 725 –  ‘give’–verb 295, 320, 391, 517, 519–520, 528, 532, 565, 584, 725, 776, 780, 789  fn –  of kinship terms 781  fn greeting, forms of 616  fn, 726 head-dependent order 289, 290, 296, 448, 451–452, 484, 488  fn hesitation marker 645 hierarchical person marking 421–422 history of Hmong-Mien studies 139  fn history of MSEA Austroasiatic studies 61  fn –  reference materials and archives 77  fn history of Tai-Kadai studies 93  fn history of Tibeto-Burman studies 111  fn history of Trans-Himalayan studies 111  fn homelands; see dispersal; settlement of MSEA hortative 329, 528 human referent 255–257, 288, 310, 312, 332, 390, 489, 571, 591, 608, 664–665, 738, 750, 752, 755, 835–836 iambic stress pattern 126, 255, 410, 411, 477, 501, 548–549, 554, 556, 661 ICAAL; see International Conference on Austroasiatic Linguistics identity and language; see language planning; nation building; national languages ideophones 6, 309, 313, 385, 187, 326, 811  fn; see also expressives imperative 312, 315, 329–333, 356, 389, 397, 445–446, 450, 527–528, 534, 566, 577–578, 680 imperfective 317, 449–450, 779, 813, 846 implosive 113, 341, 344, 412, 435–436, 440, 471–472, 501, 548, 550, 691, 884, 893, 898


incipient morphology 803 indefiniteness; see definiteness/indefiniteness Indic writing systems –  The eastern/Khmer type 890  fn –  The western/Mon type 883  fn indicative (mood) 389 infixes 5, 263, 267–268, 270–271, 386, 447, 469, 479, 481, 515–516, 549, 551–552, 560–562, 661, 698, 715, 886 inflectional morphology –  Eastern Austroasiatic languages 563 –  lack of 4, 6, 111, 443, 446, 448, 451, 563, 601, 605, 720 –  Pali influence on Burmese 602 –  Trans-Himalayan 112 information structure (overt marking)  838  fn –  focus markers 842  fn –  topic markers 839  fn Institut national des langues et cultures orientales (INALCO) 16, 157 instrumental 318–319, 326, 363, 392, 424, 542, 571, 795 internal classification; see individual language families internal drift 1 International Conference on Austroasiatic Linguistics (ICAAL) 16, 61, 76 irrealis markers 307, 346, 351–352, 421, 530, 580 isolating (morphologically) 62, 114, 255, 277, 288, 296, 347, 433–434, 443, 446, 448, 462, 470, 499–500, 720, 739, 825 Khom script 862, 865, 901 kinship terms 314–315, 407–408, 414, 475, 495, 562, 655, 658–659, 781  fn, 835 –  in pronominal function 781  fn Kol-Anam family 63 Kulturwörter 719 labile verbs; see ambitransitive language contact; see also individual language families; Indo-Aryan influence, Dravidian influence, Sanskrit/Pali influence –  and semantics 707  fn –  between Austroasiatic and Austronesian 673  fn –  in MSEA 1  fn

–  with Chinese 649  fn –  with South Asia 623  fn language isolates 1 language planning 907  fn; see also nation building; national languages –  challenges 918–920 –  in Cambodia 908  fn –  in Laos 916  fn –  in Myanmar 913  fn –  in Thailand 910  fn –  in Vietnam 915  fn –  key concepts 907–908 language policy; see language planning Laos –  linguistic studies 13 –  nation building 916  fn –  national language 559  fn Latin grammar model, influence of 94, 150–152, 601, 607 Latin writing 898  fn layering (in grammaticalization) 773, 781 lenition 196, 502 linguistic area/Sprachbund 1, 4, 707–709 linguistics (MSEAn); see also history of studies; individual countries –  in Australia 15 –  in Europe 16 –  in Japan 16 –  in the USA 14 –  today 11  fn literacy (in MSEA) 902  fn macro-families 248, 261  fn Mainland Southeast Asia (MSEA) 1  fn –  and cereal agriculture 45  fn –  and the Thai model 8, 93 –  as a linguistic area 1–8 –  diagnostic linguistic features 5–6 –  early linguistic landscape 8  fn –  language dispersal history 1–2, 8  fn, 33  fn –  language families 1, 33  fn –  Neolithic occupation 21  fn –  pragmatics 825  fn –  semantics 707  fn –  syntax 825  fn manner adverbs/adverbials 309, 312, 315, 384, 444 manner conjunctions 797–798



masculine/feminine distinction 733, 736–737, 750; see also gender Meaning First Hypothesis 775, 788 mensural classifiers 738–739, 746, 748, 759 metaphoric inference 776–777 metonymic inference 775–776 middle (voice) 319, 363, 379–380, 421, 485 millet farming; see cereal agriculture minor syllables 345, 410–411, 477–478, 480 modal voice 73, 195, 285, 306, 511, 548–556, 678  fn, 684, 691 modals/modality (TAM) 317–318, 322–323, 421, 448, 456, 461, 490, 491, 531, 580, 582, 659, 720, 780 Mon-Anam family 10, 63–64, 262 Mon-Khmer Studies (MKS) 75, 77, 163, 165 monosyllabism 255  fn, 278, 309, 340  fn, 433, 435, 441  fn, 469–471, 659–660 –  influence of Chinese 659 multifunctionality (of grammatical markers) 446, 745–746, 775–803 Myanmar/Burma –  linguistic studies 13 –  nation building 913  fn –  national language 559  fn Myazedi inscription 121, 859, 867, 870 Myinkaba Kubyaukgyi inscription; see Myazedi inscription nasalization 277, 305, 340, 436–437, 506, 554–555, 679, 883, 886 nation building 927  fn; see also language planning; national languages –  Cambodia 908  fn –  language planning 907  fn –  Laos 916  fn –  Myanmar 913  fn –  Thailand 910  fn –  Vietnam 915  fn national identity; see language planning; nation building; national languages national languages 599  fn; see also language planning; nation building –  and language planning 907  fn –  and nation building 927  fn –  typological profile 599  fn –  vs. non-dominant languages 907  fn


negation 295, 309, 312–323, 228–229, 356–357, 397, 414, 419, 421–422, 449–450, 490–491, 526–528, 534–535, 575, 585–587, 680 negative imperative; see prohibitive Neolithic period 8  fn, 21  fn; see also cereal agriculture; dispersal history; genomics –  archeological evidence 23  fn, 34  fn –  biological evidence 26, 34  fn new situation (NSIT) 580–582, 605, 726, 847–848 nominal classification systems 734  fn nominalizers 267, 310, 313–314, 316, 319, 325–326, 350, 379, 387, 419, 424, 515–516, 560, 605 nominative 268, 424, 835 non-dominant languages (NDLs) 907  fn non-obligatoriness; see obligatoriness non-volitionality 359, 583 noun classes 733, 736, 745, 756 numeral classifiers 6, 256–257, 347, 354, 403, 420, 428, 540, 591, 595, 739–748, 753, 756–759, 765–766, 773, 784  fn, 802 –  and definiteness 784  fn obligatoriness/non-obligatoriness 720, 755–756, 773  fn –  of classifiers 755–756 –  of linguistic categories 720 oblique 455, 487, 520, 541–542, 567, 584, 592–595 Old Khmer epigraphy 861  fn Old Mon epigraphy 857  fn on-glides 678, 690 onomatopoetic expressions 312, 811–812, 818–819, 821 onset clusters 100, 102, 192, 239, 255, 202, 303–304, 374, 442, 469, 476–477, 502–504, 551–552, 562, 650, 660, 883 ordinal numbers 416, 540, 630 orthography; see writing system Parallel Reduction Hypothesis 775 partial reduplication 309, 385, 387, 416, 446, 482–483, 414, 559, 820 participles 379 passive constructions 362, 426, 455, 484–486, 517, 532–533, 542, 563, 584–585, 601, 603, 664, 668


path verbs; see directional markers perfect 456, 530–531, 534, 581, 847 perfective 352, 456, 530, 775, 779, 794, 799, 846–847 phonation 6, 65, 68, 141, 164, 284–286, 306–307, 243, 403, 411, 413, 428, 469, 474–475, 477, 496, 501, 509–512, 554–556, 604, 678–679, 683  fn; see also individual language families; register –  breathy voice 73, 164, 188, 195, 241, 284–286, 306, 343, 413, 436, 440, 473–474, 501, 509, 511, 554–556, 678–680, 684–694, 891 –  clear voice 693 –  creaky voice 68, 73–74, 164, 195, 284–286, 289, 304, 306–307, 323, 403, 405, 413, 428, 501, 509–511, 554–556, 692, 869, 886–887 –  modal voice 73, 195, 285, 306, 511, 548–556, 678  fn, 684, 691 –  state of the art 683  fn phonological word 477, 549, 501, 550 phylogenetics 3, 17, 34 –  computational/quantitative 182–183, 191–200, 210, 240, 244 –  distant phylogenetic relations 261  fn plural 300, 310, 315–316, 321, 354, 416, 539, 559, 736, 739, 759, 766, 785, 803, 813, 816, 832 politeness 6, 309–310, 314, 332–333, 418, 462, 495, 528, 578, 589, 612, 616–617, 658, 668, 765, 781, 783, 849; see also kinship terms –  and classifier choice 764 –  and kinship terms in pronominal function 781  fn –  in MSEAn national languages 616  fn Pollard script 900 polyfunctionality; see multifunctionality polysyllabic structure 442, 469, 476, 501, 549, 614, 660–662 population history of SEA 33  fn possessive constructions 280, 307, 315, 326, 352–354, 376–377, 384–385, 388, 418, 423, 451–452, 484, 487–488, 564, 589, 603, 607, 741–742, 753, 755, 833 post-stopped sonorants 552 postglottalization 555

postpositions 318–319, 323, 326, 328, 592, 602, 605, 708, 835; see also adpositions pragmatic argument marking 835  fn pragmatic inference 777  fn –  and numeral classifiers 784  fn pragmatic variation (of word order) 826  fn pre-cereal subsistence 53 preglottalization 119–120, 252, 341, 399, 413–414, 436, 440, 501, 548, 550 prehistory 8  fn, 21  fn, 33  fn prenasalization 117, 250, 255, 285, 303, 435–436, 440, 504, 900 prepositions 443–445, 448, 451–453, 489, 520–521, 541–542, 567, 579, 592–593, 601, 659, 664, 708, 776, 791, 794–797; see also adpositions presyllables 5, 278–280, 447, 469, 473, 476–478, 481, 483, 502–503, 549–550, 650, 659–662, 676, 679, 691, 858, 886 pro-drop; see argument drop prohibitive 309, 312, 314, 356, 397, 450, 491, 528, 534–535, 578, 585, 587 property concepts 417, 422, 443, 447, 454; see also adjectives; stative verbs proto-Austric 264, 269 proto-East-Asian 266 Pyu epigraphy 869  fn quantifiers 526, 311–312, 315, 376–377, 423, 445, 451–452, 739, 834–835, 851 Ramkhamhaeng Controversy 865; see Thai epigraphy realis markers 307–308, 313, 320, 325–326, 345, 421–422 recipient (semantic relation) 95, 320, 350, 542, 565–566, 592, 791, 835–837 reciprocal 321, 379–380, 537–538, 563, 612, 680 reduplication 309, 312, 315, 347–349, 385, 387, 415–417, 446–448, 469, 482, 513, 559, 812–813, 816, 818  fn, 886, 894 –  in writing systems 886, 894 –  partial reduplication 309, 385, 387, 416, 446, 482–483, 414, 559, 820 reflexive 319–320, 363, 379, 386, 421, 537–538, 563, 589, 781



register 4, 65, 68, 71, 113, 188, 195, 250, 469, 474, 475, 509, 511, 579, 549, 552–555, 595, 674, 678  fn, 683  fn, 820, 863, 884–891; see also phonation –  articulatory mechanisms 693  fn –  development 554, 683, 687  fn, 698 –  reorganization 692  fn –  state of the art 683  fn –  typical properties 684 registrogenesis 554, 683, 687  fn, 698 request particle 328, 461, 528, 574 resultative 381, 454, 481, 517, 563, 791, 798–799, 802 resumptive pronouns 355, 579, 827, 832 retroflex 239–240, 277, 305, 471–472, 856, 891 rice farming; see cereal agriculture RWAAI (The Repository and Workspace for Austroasiatic Intangible Heritage) 16, 79 script; see writing systems sea, spread by (of Munda languages) 39 secondary verbs 454, 582–583, 825, 846, 894 secret scripts 900  fn semantics (and language contact in MSEA) 707  fn; see also grammaticalization –  ambiguity and context-dependence 720 –  cultural words 719 –  greetings 616  fn, 726 –  ideophones 726 –  of classifiers 749  fn –  of spatial relations 718 –  shared concepts 719  fn –  shared lexemes 717 semivowels 343, 410, 412 sentence-final particles 6, 295, 308, 310, 328–329, 395, 460  fn, 488, 491, 526, 542  fn, 574, 577, 594, 617, 659, 664, 668 sequential events 321, 324, 453, 490, 513, 521, 523, 524, 559, 789, 791, 844 serial verb constructions; see also individual typological profiles –  and grammaticalization 789  fn –  and shared concepts 721 sesquisyllabic structure 1, 7, 102–103, 214, 219, 255, 277, 278, 280, 306, 345–346, 403, 410–411, 428, 442, 447, 469–470, 476–477, 501, 543, 548–550, 554, 556, 560, 595, 661, 676, 691


settlement of MSEA 26 –  migratory routes 29 shared concepts 719  fn shared lexemes; see Wanderwörter SIL International (in MSEA) 163  fn –  and universities in Thailand 167  fn –  education programs in Cambodia 166 –  education programs in Thailand 167  fn –  education programs in Vietnam 164 –  linguistic contributions in Laos 172 –  linguistic contributions in Myanmar 173 Sino-Tai hypothesis 104, 225 Sinosphere 213, 252, 254, 649 sortal classifiers 311, 420, 738–739, 746, 748, 758–759, 762, 786 sound symbolism 62, 812; see also expressives Southeast Asian Linguistics Society 12, 15, 16, 77 spatial relations (in MSEA languages) 718 speaker’s perspective 846  fn –  differential TAM marking 846  fn –  speaker’s attitude 849  fn spelling; see writing system Sprachbund/linguistic area 1, 4, 707–709 spread by sea (of Munda languages) 39 stative verbs 312, 314–315, 346, 349, 378, 443–444, 447, 454, 484, 487, 493, 516; see also adjectives; property concepts sticky cereal zone 51  fn stone age; see Neolithic period sub-family proposals; see individual language families Summer Institute of Linguistics; see SIL International switch reference 395 syllabification 502, 624 tabooing 184, 188 Thai Khom 862; see Khmer epigraphy Thailand/Siam –  linguistic studies 12 –  nation building 910  fn –  national language 559  fn –  universities 167  fn theme (semantic relation) 295, 350, 519–520, 565, 790, 836 tone sandhi 256–257, 277, 286  fn, 291, 314, 375, 440–441, 474 tones in grammar 422  fn


tonogenesis 7, 73, 74, 100, 252, 663, 683, 687 topic continuity 756, 778, 830 topic-comment structure 6, 459, 484–485, 487, 528, 564, 609, 623, 717, 826, 832–834, 842, 844–845 topic-comment-linkers 844 topicalization 315, 351–353, 364, 398, 485, 519, 528–529, 579, 584, 841 triphthong 102, 439, 679 trisyllabic words 477 Turanian 3, 262 Two Layer Hypothesis 26, 29 typology –  of Burmic languages 299  fn –  of Eastern Austroasiatic languages 547  fn –  of Hmong-Mien languages 277  fn –  of Kachin languages 403  fn –  of Karenic languages 337  fn –  of Kra-Dai languages 433  fn –  of Kuki-Chin languages 369  fn –  of Northern Austroasiatic languages of MSEA 499  fn –  of the national languages of MSEA 599  fn –  of Vietic 469  fn Unicode encoding 172, 871, 884, 896, 903, 914 venitive/andative 421, 795 verb agreement marking 113–114, 116, 123, 377–378, 386, 388–389, 310, 803 verb-initiality 268–269, 292–293, 350, 518, 564, 664, 827 –  subject-verb inversion 833 verbal nouns 379, 387 Vietnam –  linguistic studies 13 –  nation building 915  fn –  national language 559  fn voiceless nasals 125, 304, 374

voiceless onsets 113, 555, 693 voiceless sonorants 11, 250, 252, 255, 282, 303–304, 323, 345, 405, 412, 428, 885 volitionality 359, 583 vowel harmony 345, 346 vowel length 102, 250, 277, 375–376, 385, 411, 437–439, 441, 473, 501, 506, 677, 693, 695, 892, 897 vowel shift 65 VS constituent order; see verb-initiality Wanderwörter 717 word order –  basic constituent order 7, 124, 268, 277, 292  fn, 296, 315, 339, 388, 448, 483, 826, 423, 827 –  dependent-head 289, 290, 296, 388, 452, 458–459 –  head-dependent 289, 290, 296, 448, 451–452, 484, 488  fn –  pragmatic variation 826  fn word spacing (in writing) 883 word tabooing 184, 188 writing systems 879  fn; see also individual language entries –  and literacy in MSEA 902  fn –  Indic writing systems 882  fn –  Khom script 901 –  Latin writing 898  fn –  Pollard script 900 –  secret scripts 900  fn –  Unicode encoding 172, 871, 884, 896, 903, 914 –  word spacing 883 –  Yuttasara script 900 Yuttasara script 900–901 zero anaphora 6, 323, 326, 330, 428, 830

Language index (page numbers in italics refer to language examples)

A-Hmao 288–289 Ahom 98 –  Ahom Buranji 98 –  Ahom script 887 Aiton 440 Akha 317, 227, 332 Archaic Chinese 748 Arem 472–475, 478 Aslian 66; see also Austroasiatic –  convergence with Malay 3 –  internal classification 183 –  Nico-Monic 200 Austroasiatic 1–2, 547  fn, 499  fn; see also individual languages –  Austric hypothesis 62, 262  fn –  bibliographies 61  fn –  classification 179  fn –  contact with Austronesian 673  fn –  homeland and dispersal history 10, 37  fn –  internal classification 197  fn –  linguistic studies 61  fn –  relation of Munda to Mon-Khmer 64–65, 179 –  Sino-Vietnamese contact 74, 469 –  typological profile of Eastern Austroasiatic 547  fn –  typological profile of Northern Austroasiatic 499  fn Austronesian 1–2; see also Chamic –  and Sinitic expansion 39 –  Austric hypothesis 62, 262  fn –  classification 262 –  influence on Austroasiatic 673  fn –  Malayo-Chamic migration 40 Bahnar 199, 559–593 Bahnaric 67 –  internal classification 184 Biao 438, 452, 458 Biao Min 286 Bodo-Garo 125 Bolyu 505, 539, 542 Bru 555, 563, 690 Bugan 508, 512–543 Bumang 507–509 https://doi.org/10.1515/9783110558142-040

Bunong 559, 567–581 Burmese 303–333, 408, 427, 412, 427, 550–551, 558, 560, 564, 605–606, 609, 627–639, 723, 734, 750, 758, 760, 764, 827–852, 886 –  and South Asian influence 8 –  and the MSEA linguistic area 8 –  as national language 559  fn –  epigraphy 867  fn Burmish/Burmic 119, 299  fn; see also individual languages –  induced creaky tone 307 –  kinship terms  331 –  monosyllabism 309 –  politeness 309, 332–333 –  sentence-final epistemic/evidential markers 329–330 –  tones and phonation contrasts 306  fn –  typological profile 299  fn Burmo-Qiangic 116 Cantonese 758, 761, 764 Cham 864; see Chamic Cham, Eastern; see Chamic 691 Cham, Western; see Chamic 678 Chamic –  anomalous geographical distribution 9 –  as MSEAn type 3 –  Cham epigraphy 855  fn –  influence on Austroasiatic 673  fn Chinese (influence on MSEAn languages)  623  fn –  lexical borrowing 654  fn –  the spread of Sinitic 650  fn –  typological convergence 659  fn Chong 551–592 Chontal (Mayan, Mexico) 740 Chrau 819 Cuoi 472–478 Daai 372, 374, 384, 386–397 Daic; see Kra-Dai Danau 539


Dara’ang Palaung 507, 516–523, 539, 542 Dâw (Nadahup, Colombia/Brasil) 745 Déhóng Dai 449–452 Dravidian (influence on MSEAn languages) 623–624, 635, 636, 641, 709 Eastern Lawa 536–537 Ewe (Niger-Congo) 790 Falam Chin 392–398 French 734, 765 Geba 360, 363 Goemai (Afro-Asiatic) 743 Golden Palaung 512–542 Green Mong 287, 290–291 Hainan Cham 680 Hakha Lai 371–397 Haroi 679 Hlai 752 Hmong 740, 778–779, 794–796, 900 Hmong-Mien 1–2, 277  fn; see also individual languages –  bibliographies 139  fn –  classification 142, 247  fn –  classifiers 256  fn –  homeland 248  fn –  internal classification 250  fn –  linguistics 139  fn –  monosyllabism 255  fn, 278 –  presyllables 278  fn –  proto-language reconstruction 142 –  tones and phonation contrasts 252  fn, 284  fn –  typological profile 277  fn Hmongic; see also Hmong-Mien –  internal classification 250  fn Hmu 279, 285, 290–296 Hong-Kong Sign Language 743 Hu 505 Hyow 372, 374, 382–383, 387–398 Iaai (Austronesian, New Caledonia) 742 Iduh 511 Indo-Aryan (influence on MSEAn languages) 623  fn Indonesian 722–726 Isaan Thai; see Lao Iu Mien 280–295, 665

Jakaltek (Mayan, Guatemala) 741 Jarai 812, 815, 820–821 Jingpho 126; 408–427, 723; see also Kachin K’Chò 392 Kachin languages 403  fn –  contact situation and membership 404  fn –  height-based demonstratives 417 –  hierarchical person marking 421–422 –  kinship terms 408 –  tones and phonation contrasts 411  fn –  tones in grammar 422  fn –  typological profile 403  fn Kadai languages; see also Kra-Dai –  classification 225–226 Kam-Sui 103, 225, 233 Kam-Tai 225–226, 231 Kammu Yuan 512–543 Karen/Karenic 124, 337  fn –  anomalous geographical distribution 9 –  Karen script 888 –  monosyllabism 340  fn –  SVO syntax 124, 350 –  tones and phonation contrasts 343  fn –  typological profile 337  fn Katu 550–592 Katuic 68 –  dialect chain 186 –  internal classification 186 Kayah Li 361, 363 Kayan 361 Kayan Lahta 355 Khamti 448–449 Khmer 69; 189–190, 199, 269–270, 552–594, 606, 608, 611, 627–639, 656–557, 718, 723–724, 776–798, 891 –  as national language 559  fn –  epigraphy 861  fn –  internal classification 188 –  Khmer script 890 –  regional dialects 188 Khmer, Northern 554 Khmu Cuang 504, 508, 524–539 Khmuic 70 –  dialect chain 189 –  internal classification 189 Khuen 897 Khumi 371, 387 Kilivila (Austronesian, Papua New Guinea) 754



Klamath (Penutian) 744 Kơho 557–584 Kra languages 103, 225  fn; see also Kra-Dai Kra-Dai 1–2, 433  fn; see also individual languages –  and language contact 343 –  bibliographies 94  fn –  classification 225  fn –  final particles 460  fn –  homeland and dispersal history 38 –  linguistics 93  fn –  monosyllabism 433, 435, 441  fn –  proto-Austric 264  fn –  shift to dependent-head word order 452, 458–459 –  tones 439  fn –  typological profile 433  fn Kri 472–495 Kui 559–588 Kuki-Chin 123, 369  fn –  ergative-absolutive marking 390–391 –  internal classification 371–372 –  tones 375 –  typological profile 369  fn –  verbal form alternations 370–371, 379–380 Lacid 409–426 Lahu 318–332, 723, 746, 754 Lameet Lampang 507 Lángjià DaiBuyang 449 Lao/Isaan 96, 599  fn; 638–639, 718–722, 737, 742 –  as national language 559  fn –  epigraphy 865  fn –  Lao script 895 Laven 572–592 Leqi 410–427 Lhaovo 408–427 Lisu 317–332, 409 Lisu, Central 330 Lisu, Northern 330 Lisu, Southern 227, 330 Loloish languages 121 Long-haired Lachi 451–461 Lóngmíng Zhuang 435, 438, 741 Lua Wiang Papao 511 Mal 505 Malay 756, 762


Maleng Bro 472, 475, 478, 482 Mandarin 761–763 Mang 506, 508, 510, 540 Mang/Pakanic 71, 191 Maonan 454–460 Mara 377 May 472–495 Miao-Yao; see Hmong-Mien Mien 765 Mienic; see also Hmong-Mien –  internal classification 250  fn Mindat K’Cho 371, 377, 387 Mizo 374, 378, 383, 387, 392–393, 396–397 Mlabri 539 Mon 192–193, 200, 554–593, 627, 716, 723, 797, 827–852, 885–886 –  epigraphy 857  fn –  Mon script 884  fn Mon-Khmer; see Austroasiatic Monic 70 –  internal classification 192  fn –  Nico-Monic 200 Muak Sa’ak 510, 539–540 Mun 284–286  Muong 472–495, 665 Myanmar (language); see Burmese Nậm Pi Mang 510 Nelemwa (Oceanic, New Caledonia) 755 Newar 751, 753 Ngochang 408 Nicobarese 37, 181–182, 200, 265–268, 270 –  Nico-Monic 200 Nung Fan Sling 443–448, 453–461 Nyah Kur 558–594 Nyaheun 590–594 Old Khmer 189, 553–592, 610, 625, 634, 861 Old Mon 192, 560, 568, 572, 585, 594, 625 Pa-O 357 Pacoh 557–588 Paha 449, 453–461 Paite 396–397 Palaung (Rumai) 889 Palaung Namhsan 504 Palaung/Palaungic 72 –  internal classification 194 –  Palaung script 888


Palaychi 363 Pearic 73 –  dialect chain 196 –  internal classification 195 Poong 472, 488, 491, 494 Pray 512, 513, 518–543 Pwo Karen, Eastern 341–363 Rawang 408–427 Royal Thai language 715 Ruc 472–495 Saek 442 Samre 557–594 Sanskrit/Pali –  epigraphy 871  fn –  influence on MSEAn languages 600, 624, 625  fn, 871–872 Sedang 551–589 Semai 813 Sgaw Karen 341–363 Shan 96; 888 –  Shan script 887 Siamese; see Thai Sinitic classification 207–208, 219  fn Sino-Tibetan; see Trans-Himalayan Sizang 387–396 So Thavung 472–495 Southern Min 752 Stieng 557–590 Sui 437 Surin Khmer 189, 558, 656 Tai languages 225, 238  fn; see also Kra-Dai Tai-Kadai; see Kra-Dai Tamil; see Dravidian Tedim 372, 374, 392 Temiar 813 Thai 94, 599  fn; 443, 449–450, 461–462, 606–611, 627–639, 656–557, 660, 665, 718–726, 740, 744–745, 750–753, 757, 761, 763, 779, 790–798, 827–852, 893–894 –  as national language 559  fn –  as representative of MSEA linguistic area 8, 93 –  influence of Latin grammar 94, 150–152, 601, 607 –  Royal Thai language 715

–  Siamese/Lao epigraphy 865  fn –  Thai script 891  fn Thai Noi; see Lao Thai, Northern 449–450, 462 Thavung 472–495 Tiānděng Zhuang 452, 459 Tibeto-Burman classification 207  fn Tiddim 392 Trans-Himalayan 1–2; see also individual languages –  bibliographies 111 –  classification 207  fn –  homeland and dispersal history 39 –  linguistics 111  fn –  Trümmersprachen 127 U 537, 539 Vietic 74, 469  fn; see also language contact with Chinese –  internal classification 196 –  monosyllabism 469–471 –  tones and phonation contrasts 474  fn –  typological profile 469  fn Viet-Muong; see Vietic Vietnamese 599  fn; 472–495, 607, 610, 633, 656–668, 718, 723–726, 738, 746, 749, 759–760, 782–788, 794–799, 814, 816, 822, 898 –  as national language 559  fn –  Sino-Vietnamese contact 74 –  Vietnamese script 898  fn Wa 821, 900 Wàngmó Bouyei 453–456 Weining Ahmao 755 White Hmong 289, 661, 665, 742, 762 Wu 761 Wǔmínɡ Zhuang 441 Xong 278–283, 290–296 Yang Zhuang 449–450 Zaiwa 408–427 Zhānɡlǔ Kam 440 Zhānɡlǔ Kam 450 Zhuang, Standard 457 Zoulei 458–460