The Languages and Linguistics of South Asia: A Comprehensive Guide 9783110423303, 9783110427158


English Pages 927 [928] Year 2016



The World of Linguistics (WOL), Volume 7
Editor: Hans Henrich Hock

The Languages and Linguistics of South Asia: A Comprehensive Guide
Edited by Hans Henrich Hock and Elena Bashir

De Gruyter Mouton

ISBN 978-3-11-042715-8
e-ISBN (PDF) 978-3-11-042330-3
e-ISBN (EPUB) 978-3-11-042338-9

Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.

© 2016 Walter de Gruyter GmbH, Berlin/Boston
Cover image: Chogolisa, Karakorum/Uwe Steffens/ullstein bild
Typesetting: Dörlemann Satz GmbH & Co. KG, Lemförde
Printing and binding: CPI books GmbH, Leck
♾ Printed on acid-free paper
Printed in Germany
www.degruyter.com

Acknowledgments

This volume has greatly benefited from the assistance and advice of many colleagues and institutions. In the early phases of our project we were lucky to receive advice from a broad range of specialists in the field, especially Peri Bhaskararao, Agnes Korn, and K. V. Subbarao. As the project grew and we were facing problems with contributors begging off, we were lucky that E. Annamalai, Tej Bhatia, Alice Davison, Suresh Kolichala, Benjamin Slade, and Ian Smith jumped into the breaches and, often at short notice, made major contributions to the volume. They and the other contributors are the reason that this volume finally came about; we owe an immense amount of gratitude to all of them. Suresh Kolichala further contributed by creating the general map of South Asian languages and the maps of Indo-Aryan and Dravidian languages.

We also express our gratitude to the staff at Mouton and de Gruyter’s production department who have been more than gracious, patient, and supportive, especially in the final stages of the project.

Table of contents

Acknowledgments
Introduction · Hans Henrich Hock and Elena Bashir

1 The languages, their histories, and their genetic classification · edited by Hans Henrich Hock
1.1 Introduction · Hans Henrich Hock
1.2 Indo-Iranian · Hans Henrich Hock
1.3 Indo-Aryan · edited by Hans Henrich Hock
1.3.1 Old and Middle Indo-Aryan · Hans Henrich Hock
1.3.2 Modern Indo-Aryan · James W. Gair
1.4 Iranian · Agnes Korn
1.5 Nûristânî · Richard F. Strand
1.6 Dravidian · Suresh Kolichala
1.7 Austroasiatic languages of South Asia · Gregory D. S. Anderson
1.8 The Tibeto-Burman languages of South Asia · Carol Genetti
1.9 Daic or Tai languages of South Asia · Hans Henrich Hock
1.10 Language isolates
1.10.1 Andaman languages · Anvita Abbi
1.10.2 The Burushaski language · Étienne Tiffou
1.10.3 Kusunda · Hans Henrich Hock
1.10.4 Nihali · Norman Zide

2 Contact and convergence · edited by Elena Bashir
2.1 Introduction · Elena Bashir
2.2 Overall South Asia · Colin Masica
2.3 Ancient contact, convergence, substratum influence · Hans Henrich Hock and Franklin C. Southworth
2.3.1 Introduction · Hans Henrich Hock
2.3.2 Lexical evidence · Franklin C. Southworth
2.3.3 Structural features and geographical evidence · Hans Henrich Hock
2.3.4 Post-Vedic contact linguistics · Hans Henrich Hock
2.4 The Northwest · edited by Elena Bashir
2.4.1 Pre-1947 convergences · Elena Bashir
2.4.1.1 Pamir-Hindukush-Karakoram-Kohistan-Kashmir region · Elena Bashir
2.4.1.2 Baluchistan · Elena Bashir
2.4.2 Post-1947 convergence in Pakistan and Afghanistan · edited by Elena Bashir
2.4.2.1 Recent convergence and divergence in Pakistan · Elena Bashir
2.4.2.2 Recent developments in Afghanistan · Lutz Rzehak
2.5 Contact and convergence in the Northeast · Shobhana Chelliah and Nicholas Lester
2.6 Other contact, regional and local · Hans Henrich Hock
2.7 English and South Asian languages · Hans Henrich Hock

3 Phonetics and phonology · edited by Hans Henrich Hock
3.1 Introduction · Hans Henrich Hock
3.2 Phonetics · Peri Bhaskararao
3.3 Phonology and phrasal prosody · Hans Henrich Hock

4 Morphology · edited by Hans Henrich Hock with contributions by Elena Bashir and K. V. Subbarao
4.1 Introduction · Hans Henrich Hock
4.2 Coverage · Hans Henrich Hock
4.3 Typological issues · Hans Henrich Hock
4.4 Theoretical issues · Hans Henrich Hock
4.5 Morphosyntactic issues · edited by Hans Henrich Hock
4.5.1 Agent marking · Elena Bashir
4.5.2 Object marking · K. V. Subbarao
4.5.3 Agreement marking · Hans Henrich Hock

5 Syntax and semantics · edited by Hans Henrich Hock
5.1 Introduction · Hans Henrich Hock
5.2 Formal syntax · edited by Hans Henrich Hock
5.2.1 An overview of generative syntactic work and reference resources in South Asian languages · Alice Davison
5.2.2 Minimalist approaches to South Asian syntax · Rajesh Bhatt
5.2.3 Generative approaches to Pashto syntax · Taylor Roberts
5.3 Cognitive Linguistics · Bhuvana Narasimhan
5.4 Morphosyntactic typology · edited by Hans Henrich Hock
5.4.1 Oblique Experiencers and Oblique Subjects · Hans Henrich Hock
5.4.2 Complex Verbs · edited by Hans Henrich Hock
5.4.2.1 Introduction · Hans Henrich Hock
5.4.2.2 Expanded verbs in Dravidian · E. Annamalai
5.4.2.3 Compound verbs in Indo-Aryan · Benjamin Slade
5.4.3 Finite and nonfinite subordination · Hans Henrich Hock
5.5 Morphosemantic typology: Evidentiality · edited by Elena Bashir
5.5.1 Evidentiality and mirativity in Iranian, Nuristani, Indo-Aryan, Burushaski, and Dravidian · Elena Bashir
5.5.2 Evidentiality and Mirativity in Tibeto-Burman · Scott DeLancey

6 Sociolinguistics · edited by Elena Bashir
6.1 Introduction · Elena Bashir
6.2 Language endangerment and documentation · edited by Elena Bashir
6.2.1 The situation in India and adjacent areas · Anvita Abbi with input from Carol Genetti and Gregory D. S. Anderson
6.2.2 Pakistan and Afghanistan · Elena Bashir
6.3 Language policy and planning in South Asia · Harold F. Schiffman
6.4 Diglossia · edited by Elena Bashir
6.4.1 Diglossia in Bangla · Probal Dasgupta
6.4.2 Diglossia in Dravidian languages · E. Annamalai
6.5 South Asian pidgins and creoles · Ian R. Smith
6.6 South Asian languages in diaspora · Tej K. Bhatia

7 Indigenous South Asian grammatical traditions · edited by Hans Henrich Hock
7.1 Introduction · Hans Henrich Hock
7.2 Indo-Aryan grammatical traditions (Sanskrit and Prakrit) · Hans Henrich Hock
7.3 Tamil and Dravidian grammatical traditions · E. Annamalai

8 Applications of modern technology to South Asian languages · edited by Elena Bashir
8.1 Introduction · Elena Bashir
8.2 Localization · Elena Bashir
8.3 Language and linguistic resources · edited by Elena Bashir
8.3.1 Corpus and lexical resources · Elena Bashir
8.3.1.1 Early work · Elena Bashir
8.3.1.2 India · Niladri Sekhar Dash and Amba Kulkarni
8.3.1.2.1 History and methodologies · Niladri Sekhar Dash
8.3.1.2.2 Sanskrit · Amba Kulkarni
8.3.1.3 Nepal · Yogendra P. Yadava
8.3.1.4 Pakistan · Elena Bashir
8.3.1.5 Bangladesh · Elena Bashir
8.3.2 Treebanking – Hindi/Urdu · Rajesh Bhatt
8.4 Applications · Elena Bashir

9 Writing systems · edited by Elena Bashir
9.1 Introduction · Elena Bashir
9.2 General historical and analytical · Stefan Baums
9.3 Recent script-related research · Stefan Baums
9.4 Perso-Arabic adaptations for South Asian languages · Elena Bashir
9.5 New research areas and desiderata · Elena Bashir

10 Sources and Resources · Hans Henrich Hock and Elena Bashir
Language index
Subject index

Introduction

With nearly a quarter of the world’s population, members of at least five major language families plus several putative language isolates, and around 700 different languages, South Asia is a fascinating arena for linguistic investigations, whether comparative-historical linguistics, studies of language contact and multilingualism, or general linguistic theory. In addition, it offers a great variety of indigenous writing systems that pose interesting challenges to theories of writing, as well as two major indigenous traditions of phonetic and grammatical analysis, of which the Sanskrit tradition has had a tremendous influence on general phonetics and linguistic analysis.

Recent publications provide detailed information on individual language families of South Asia: Steever (ed.) 1998 on Dravidian, Cardona & Jain (eds.) 2003 on Indo-Aryan, Anderson (ed.) 2008 on Munda, and relevant sections in Windfuhr (ed.) 2009 on Iranian and in Thurgood & LaPolla (eds.) 2003 on Tibeto-Burman. However, there has been no comprehensive survey of all of the South Asian languages and linguistic work on them since Current trends in linguistics 5 (Sebeok et al., eds. 1969).

The present volume is intended to provide such an updated comprehensive survey. At the same time, it differs considerably from Current trends on a number of counts, reflecting changes in research paradigms and methodologies. Important in this regard is a much greater focus on issues of language contact and convergence, reflecting the impact of Emeneau’s publications on “India as a Linguistic Area” (see especially Emeneau 1980) and Ramanujan & Masica’s (1969) and Masica’s (1976) work on the geographical distribution of South Asian convergence features. Linguistic theory and analysis have changed dramatically since the time of Current trends. Sociolinguistic approaches, too, have developed deeper insights into such issues as code switching and code mixing, diglossia, and South Asian languages in the diaspora. Field research on minority and endangered languages has greatly expanded (although still too many languages are in danger of passing out of existence without proper documentation).

The volume is organized thematically, with contributions on different subareas by specialists in the area and in some cases by the editors themselves. Chapter 1 covers the languages, their histories, and their genetic classification. Chapter 2 deals with contact and convergence. Chapters 3, 4, and 5 focus on phonetics/phonology, morphology, and syntax, respectively. Chapter 6 covers sociolinguistics. Chapter 7 presents an overview of indigenous South Asian grammatical traditions. Chapter 8 deals with the burgeoning field of applications of modern technology to South Asian languages. Chapter 9 covers South Asian writing systems.


The volume concludes with an appendix which gives a classified listing of major sources and resources. The Appendix is a special feature of this volume, intended to provide an even more comprehensive overview of sources and resources than what is contained in the “Bibliographical references” to each chapter (which also themselves include some publications not referred to in the respective chapter). While some important articles are included among the publications listed in the Appendix, the major focus is on edited volumes, monographs, and other monograph-length works. Specifically, the Appendix lists journals and periodicals; bibliographies; corpora, digital texts, and other online materials; online dictionaries; publications on language endangerment and language preservation; general linguistic surveys; and descriptions and handbooks on language families and individual languages. We hope that the information in the Appendix can be put online after publication of this volume, with provisions for subsequent online additions and updates.

Work toward this volume started in 2007.[1] The fact that it has taken so long is partly attributable to the usual problems encountered when pursuing a project like this: finding knowledgeable colleagues who are willing to contribute, making sure that they actually do contribute, and making alternative plans when things go wrong. In part, however, the long time that it has taken toward completion is a natural consequence of the complexity of the South Asian linguistic scene. The resulting product reflects this complexity. Different languages, different geographical areas present different linguistic issues as well as different linguistic approaches. Thus, genetic subgrouping is a major issue in Modern Indo-Aryan, Iranian, and Tibeto-Burman, while even basic description and attempts at language preservation are paramount for the Andamanese languages, Kusunda, or Nihali.

[1] Karumuri V. Subbarao was actively involved in the early planning stages but was not able to continue during the later editing stages.

There are important differences in the transcription conventions employed by our individual authors. For languages with long and rich historical documentation, an “indological” system developed in the late 19th century is employed, and this system tends to be extended to many modern languages. For underdescribed and usually endangered languages, the IPA phonetic system tends to be employed, as in the examples from the Munda languages Remo and Ho (1.7.2). And many syntacticians use a system going back to early typewriter conventions, with double vowels indicating long vowels and upper-case letters indicating retroflex consonants. A guide to these different transcription systems is provided at the end of this introduction.

One regret is that we have not been able to get detailed coverage on psycholinguistic work on South Asian languages. Some discussion of psycholinguistics is found in Section 5.3 in the larger context of cognitive linguistics. Another area that, to our regret, could not be covered in detail is the newly emerging and rapidly developing work on South Asian Sign Languages. Work has progressed to the extent that there are now even publications on various dialects of Indian Sign Language, and the methods of computational linguists have also begun to be applied to sign languages. Since the medium of sign language is visual, there are a large number of resources (too numerous to list in this volume) in the form of online videos, which can be located and accessed through simple online searches. References to major published work on South Asian Sign Languages are included in the Appendix.

Finally, we would like to recognize scholars of South Asian linguistics whose recent passing has been a great loss to our field, among them Hermann Berger, Murray B. Emeneau, Yamuna Kachru, Ashok Kelkar, Bhadriraju Krishnamurti, B. Lakshmi Bai, Manfred Mayrhofer, Michael Noonan, Subhadra Kumar Sen, Rajendra Singh, V. I. Subramoniam, Manindra K. Verma, David Watters, and Kamil Zvelebil. Their dedication and scholarship will continue to inspire.

Phonetic and phonological transcription

As noted earlier, various transcription systems are used in South Asian linguistics. Authors engaged in field work tend to use the IPA system (see https://www.internationalphoneticassociation.org/content/full-ipa-chart). Many syntacticians employ a system going back to early typewriter conventions, with double vowels indicating long vowels and upper-case letters indicating retroflex consonants. A number of Eastern Middle Iranian languages are written in offshoots of Aramaic script, an “abjad” system that only contains consonant symbols (hence such transcriptions as Sogdian yγwsty ‘is taught, learns’). The most widespread system is an “indological” one developed in the late 19th century. Even within this system there is some variation.

The following charts present an overview of symbols that are employed and the phonetic values that they represent. The charts have been amplified by symbols used in citing Proto-Indo-European antecedents of Sanskrit forms. (Some contributions to this volume use additional symbols whose values are explained in context.)

Consonants

| | Labial | Dental | Alveolar | Retroflex | Palatal | Prevelar (“Palatal”) | Velar | Labiovelar | Uvular | Glottal |
|---|---|---|---|---|---|---|---|---|---|---|
| Stop, vl. | p | t | ṯ | ṭ | c | ḱ | k | kw, ku̯ | q | ʔ |
| Stop, vl. asp. | ph | th | ṯh | ṭh | ch | | kh | | | |
| Stop, vd. | b | d | ḏ | ḍ | j | ǵ | g | gw, gu̯ | G | |
| Stop, vd. asp. | bh | dh | ḏh | ḍh | jh | ǵh | gh | gwh, gu̯h | | |
| Affricate, vl. | | ts, ċ | ṯs̱ | ṭṣ, c̣ | č | | | | | |
| Affricate, vl. asp. | | tsh, ċh | ṯs̱h | ṭṣh, c̣h | čh | | | | | |
| Affricate, vd. | | dz, J̇ | ḏẕ | ḍẓ, J̣ | ǰ | | | | | |
| Affricate, vd. asp. | | dzh | ḏẕh | ḍẓh | ǰh | | | | | |
| Fricative, vl. | f | θ | | | | | x | | χ | h/ḥ |
| Fricative, vd. | v | ð, δ | | | | | ɣ | | ʁ | ɦ/h |
| Sibilant, vl. | | s | | ṣ | š/ś | | | | | |
| Sibilant, vd. | | z | | ẓ | ž | | | | | |
| Nasal | m | n | | ṇ | ñ | | ṅ | | | |
| Lateral | | l | | ḷ | | | ɫ | | | |
| Rhotic | | r | | ṛ | | | | | | |
| Approximant | β | | | r̤, ẓ | | | | | | |
| Glide/semivowel | (w, u̯) | | | | (y, i̯) | | | w | | |

Notes on the consonant chart:

Prevelar (“palatal”) and labiovelar series: Employed in Indo-European linguistics.
ḥ: ḥ (“visarga”) is used for Sanskrit.
v: The phonetic value of symbols transliterated as v may vary between [v] and [β].
δ: δ is used in Iranian linguistics.
h: h is used for voiced [ɦ] in Sanskrit and other Indo-Aryan languages.
ś: ś is conventionally used for Sanskrit and by some authors also for Modern Indo-Aryan.
ẓ: In Dravidian linguistics ẓ is often used to indicate the retroflex approximant r̤.
ḷ: In traditional transcription, ḷ designates both a retroflex non-syllabic lateral and a dental syllabic lateral, disambiguated by context. Recent, especially Indo-Europeanist, publications may use l̥ for the syllabic lateral.
ɫ: Velarized ɫ is found in Kalasha, Khowar, and Palula.
ṛ: ṛ also is used for the retroflex flap of languages like Hindi. In Sanskrit, it indicates a syllabic rhotic; recent, especially Indo-Europeanist, publications may use r̥ for the syllabic rhotic.
r̤: Dravidian linguists tend to use the symbol ẓ.

Additional symbols:

ṁ (or ṃ), “anusvāra”: a segment-length nasal transition; in Middle and Modern Indo-Aryan, it indicates nasalization of the preceding vowel.
m̐, “anunāsika”: (roughly) a variant of anusvāra. For detailed discussion on anusvāra and anunāsika see Cardona 2013.
ḷh: aspirated retroflex lateral.

Vowels and syllabic sonorants

| | Front | Front round | Central | Back | Back unround |
|---|---|---|---|---|---|
| High | i | ü | ɨ, ï | u | ʉ |
| (Upper) Mid | e | ö | ǝ | o | |
| Lower Mid | ɛ, ai | | | ɔ, au, O | |
| Low | æ | | a | | |

Notes on the vowel chart:

ü: In Toda the symbol seems to indicate a centralized vowel.
ö: In Toda the symbol seems to indicate a centralized vowel.
ai: In Standard Hindi, ai designates a long [ɛ:], except in the combination aiy which is pronounced [ayy].
au, O: In Standard Hindi, au designates a long [ɔ:], except in the combination auv which is pronounced [aww]. Bangla ɔ is often transcribed as O.
a: In most of Indo-Aryan, the short counterpart of long ā is centralized to a schwa vowel.

Vowel length: Long vowels (and long syllabic sonorants) are marked by a macron, as in ī, ā, ū. In Sanskrit and some Modern Indo-Aryan languages which, like Sanskrit, do not have short e and o, the macron is omitted (hence e, o = ē, ō); this transcription is also used for Middle Indo-Aryan which does have a length contrast in the mid vowels. However, practice varies; some scholars do, for the sake of clarity, use the macron to indicate length with /ē/ and /ō/.

Short vowels: ĕ, ŏ are used for Middle Indo-Aryan (and Brahui) to distinguish the short vowels from the corresponding (unmarked) long vowels. Elsewhere short vowels are normally left unmarked.

Syllabic sonorants: In the traditional indological system, the syllabic rhotic and lateral are transcribed as ṛ and ḷ respectively. In Indo-European linguistics, syllabic sonorants are marked by a subscript ring, as in m̥, n̥, r̥, l̥. There is now a tendency to use the latter two symbols also for Sanskrit.


Nasalization: For modern languages, nasalization is commonly indicated by a tilde above the vowel symbol, e.g. õ. An alternative follows the Sanskritist tradition of marking nasalization by ṁ following the vowel (see above under anusvāra).

Tone: In some of the Munda examples in this volume, low tone is marked with a grave accent over the concerned vowel, e.g. Korku bulù ‘thigh’, and rising tone by the acute accent, as in Kharia [rɔ.chɔ́ʔb̚m] ‘side’.
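The typewriter-based convention mentioned above (double vowels for long vowels, upper-case letters for retroflex consonants) lends itself to a mechanical conversion into the indological system. The following Python sketch is illustrative only. The mapping tables and the function name `typewriter_to_indological` are hypothetical and not taken from this volume, and individual authors' practice may well differ from what is assumed here.

```python
import re
import unicodedata

# Assumed mapping (not from the volume): doubled vowels -> macron vowels.
LONG_VOWELS = {"aa": "ā", "ii": "ī", "uu": "ū", "ee": "ē", "oo": "ō"}

# Assumed mapping (not from the volume): upper-case letters -> underdotted
# retroflex symbols of the indological system.
RETROFLEX = {"T": "ṭ", "D": "ḍ", "N": "ṇ", "L": "ḷ", "S": "ṣ", "R": "ṛ"}


def typewriter_to_indological(text: str) -> str:
    """Convert a typewriter-style transcription to indological symbols."""
    # Step 1: replace each doubled vowel with its macron counterpart.
    text = re.sub(r"aa|ii|uu|ee|oo", lambda m: LONG_VOWELS[m.group(0)], text)
    # Step 2: replace upper-case retroflex letters one character at a time.
    text = "".join(RETROFLEX.get(ch, ch) for ch in text)
    # Normalize so that base letter + diacritic are stored as one code point.
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    for word in ["baaT", "laRkii", "kitaab"]:
        print(word, "->", typewriter_to_indological(word))
```

Under these assumptions, an input such as baaT comes out as bāṭ. Note that the sketch treats every upper-case T, D, N, L, S, R as retroflex, so capitalized proper names or sentence-initial capitals in running text would need separate handling.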

Bibliographical references

Anderson, Gregory D. S. (ed.). 2008. The Munda languages. Oxford/New York: Routledge.
Cardona, George. 2013. Development of nasals in early Indo-Aryan: anunāsika and anusvāra. Tokyo University Linguistic Papers 33: 3–81.
Cardona, George, and Dhanesh Jain (eds.). 2003. The Indo-Aryan languages. London/New York: Routledge.
Devy, Ganesh N. 2014. Indian sign languages. (People’s Linguistic Survey of India, 38.) New Delhi: Orient Black Swan.
Emeneau, Murray B. 1980. Language and linguistic area. Essays selected by A. S. Dil. Stanford, CA: Stanford University Press.
Masica, Colin P. 1976. Defining a linguistic area: South Asia. Chicago/London: University of Chicago Press.
Ramanujan, A. K., and Colin P. Masica. 1969. Toward a phonological typology of the Indian linguistic area. In: Sebeok, Emeneau and Ferguson (eds.), 543–577.
Sebeok, Thomas A., Murray B. Emeneau, and Charles A. Ferguson (eds.). 1969. Current trends in linguistics, 5: Linguistics in South Asia. The Hague: Mouton.
Steever, Sanford B. (ed.). 1998. The Dravidian languages. London/New York: Routledge.
Thurgood, Graham, and Randy J. LaPolla (eds.). 2003. The Sino-Tibetan languages. London/New York: Routledge.
Windfuhr, Gernot L. (ed.). 2009. The Iranian languages. London/New York: Routledge.


South Asian language families (map produced by Suresh Kolichala, 2015)


1 The languages, their histories, and their genetic classification
Edited by Hans Henrich Hock

1.1 Introduction
By Hans Henrich Hock

South Asia is home to a great number and variety of languages. Some estimates put the total number of languages at about 685;1 but as in many other parts of the world, distinctions between language and dialect are difficult to make and tend to depend more on political, social, and cultural criteria (such as literary history, use in written form, or recognition in national or state constitutions) than on purely linguistic ones. Four language families are commonly recognized as being present — Austro-Asiatic, Dravidian, Indo-European/Indo-Iranian,2 and Tibeto-Burman (e.g. Emeneau 1980b: 31–32, Subbarao 2012: 1), with or without specific mentioning of the major subfamilies of Austro-Asiatic (Munda, Nicobarese, Khasi) or of Indo-Iranian (Indo-Aryan, Iranian, Nuristani). In addition there are also members of the Daic or Tai family, and several putative isolates, especially Burushaski, but also Kusunda, Nihali, and the Andamanese languages. Further, Persian, English, Portuguese, and Malay have contributed to the complex linguistic mosaic that is South Asia. While the major classifications are certain, subclassification is a perennial problem. This is certainly true for the Modern Indo-Aryan languages (see 1.3.2.4– 1.3.2.6), Munda (1.7.2), and the Tibeto-Burman languages of South Asia (1.8.1 and 1.8.4); it is probably true as well for Old Indo-Aryan (1.3.1.4) and for Iranian (1.4.2.1–2). Even for Dravidian, where Krishnamurti’s classification (2003) is widely accepted, alternative classifications have been proposed (1.6.2). In many 1

2

Based on figures in Ethnologue (http://www.ethnologue.com/country/, accessed 9 December 2013) for Afghanistan, Bangla Desh, India, Maldives, Nepal, Pakistan, and Sri Lanka, making some adjustments for “shared” languages. Zoller (1988, 1989, 1993) suggests that Bangani (Uttarakhand, India) contains an archaic layer of words indicating affiliation with western Indo-European languages. Van Driem and Sharma’s (1996) questioning of Zoller’s data led to a controversy on the internet; see http://www-personal.umich.edu/~pehook/bangani.html with http://www. himalayanlanguages.org/language_studies/bangani, both accessed 11 December 2013. Abbi (1997) confirms the accuracy of Zoller’s data. As noted by Hock (1997a), ‘the evidence … is highly suggestive; but a larger amount of words of the same type would certainly be helpful to allay worries that we might be dealing with chance similarities.’ The issue deserves fuller investigation.


Hans Henrich Hock

cases, geographical clustering is more easily discernable than distinct branchings in terms of exclusively shared common innovations. It appears thus as if the extensive bi- or multilingual contact between the different major language families, widely recognized in terms of the notion “India as a Linguistic Area” (Emeneau 1980a and Chapter 2 below), is also characteristic of intra-language-family relations. Put differently, as far as South Asia is concerned there is no clear line of demarcation between language contact and dialect contact. In addition to these issues, the contributions to the remainder of this chapter address the linguistic history of the various South Asian language families and in many cases, salient aspects of their grammatical structure as well. Given that, in spite of centuries or even millennia of contact, different families and subfamilies still retain substantial differences, it should not be surprising that individual sections differ in their coverage. This is especially true for discussions of linguistic history, since there are highly different chronological attestations and many languages, especially the “tribal” ones, begin to be recorded only in the 19th century (Hock 2000). The distinction between “tribal” and other languages is an important one in South Asia, intimately connected with the issues addressed in the first paragraph of this section — political, social, and cultural criteria, to which must be added “power”. Tribal societies traditionally exist outside the political, social, and cultural mainstream, are marginalized — both geographically and socially, and have no long-standing tradition of written literature. Tribal languages therefore tend to be underdescribed. Moreover, many are highly endangered, although a few (mainly those with large numbers of speakers and some political “clout”) are now officially recognized in the Constitution of India and hence may be reinvigorated. 
For a good overview of the status and documentation of Indian tribal languages and societies see the contributions to Abbi (ed.) 1997. In the 19th and early 20th centuries, much of the work on tribal languages was conducted by missionaries; but recent developments have made missionary activities controversial in most of South Asia. A number of different projects have been initiated under the aegis of foundations such as Documenting Endangered Languages (DEL) of the US National Science Foundation and National Endowment for the Humanities (http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=12816), Dokumentation bedrohter Sprachen (DoBeS) of the German Volkswagen Foundation (http://dobes.mpi.nl), and the Hans Rausing Endangered Languages Documentation Programme (HRELP) at SOAS, London (http://www.hrelp.org/). Within India, the Bhasha Trust, an organization for the ‘study, documentation, and conservation of marginal languages’ has been established under the direction of Ganesh Devy (http://www.bhasharesearch.org), which also sponsors the publication series “People’s Linguistic Survey of India” (http://peopleslinguisticsurvey.org).

The languages, their histories, and their genetic classification

1.2. Indo-Iranian3
By Hans Henrich Hock

The earliest attested forms of Indo-Iranian are Sanskrit, especially Vedic Sanskrit (with the Rig Veda [RV] being the oldest), and on the Iranian side, Avestan (the sacred language of Zoroastrianism) and Old Persian (used in the inscriptions of the Persian emperors).

1.2.1. Indo-Iranian as a subgroup of Indo-European

Of the various proposed subfamilies of Indo-European, Indo-Iranian is the best established, defined by common innovations that clearly distinguish it from the other members of the family. Still, some major developments of Indo-Iranian are shared, to different degrees, by neighboring Slavic and Baltic. The following is a brief summary of the most important developments, largely focusing on phonology.

1.2.1.1. Changes shared with Slavic and Baltic4
— “RUKI”: the change of Proto-Indo-European (PIE) *s to *š after r, u, i and their syllabic or non-syllabic counterparts, as well as after (labio)velars; see e.g. (1). In Slavic and Baltic, located between the RUKI area and languages not participating in the development, the change peters out.
— “Satem Assibilation”: the change of PIE palatovelar stops *ḱ, ǵ, ǵh to affricates *ć, ȷ́, ȷ́h, which may further change to sibilants; (2). This change, too, peters out in Slavic and Baltic.
— Delabialization of labiovelars and merger with plain velars; (3). This change also has parallels in Slavic and Baltic.

(1) PIE *pis- > Avest. piš- ‘grind’
    PIE *mūso- > Avest. mūša ‘mouse’

(2) PIE *ḱm̥tom > IIr. *ćatam > Avest. satǝm, OPers. θatam ‘100’

(3) PIE *kʷos > IIr. *kas > Avest. kō ‘who’

3 Transcription of the palatal stops varies in Indo-Iranian, with preferred in Indo-Aryan, and or in Old Iranian; there are also transcriptions of the type . To avoid confusion, in this section the transcription č, ǰ is used to indicate palatal stops, and ć, ȷ́ for palatal affricates.
4 An excellent summary of Indo-Iranian historical phonology is Mayrhofer 1989; for Iranian see Schmitt 1989.


1.2.1.2. Indo-Iranian changes
Certain changes mark Indo-Iranian as a well-established, separate group of Indo-European. These include the palatalization of velars (4), and a subsequent merger of non-high to low vowels, which makes palatalization unpredictable; see the combined scenario in (5).

(4) PIE *gʷiHwo- > *gīwo > Skt. ǰīva- ‘alive, living (being)’

(5)                   *kʷe ‘and’   *kʷos ‘who’
    Delabialization   *ke          *kos
    Palatalization    *če          -----
    Vowel merger      *ča          *kas
    Sanskrit          ča           kaḥ
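The relative chronology in (5) can be sketched as an ordered list of rewrite rules applied one after another. The following is a minimal illustration only: the regular expressions are simplifying assumptions introduced here (they are not the author's formulations), the function name `derive` is hypothetical, and later Sanskrit-specific developments such as final -s > ḥ are not modeled.

```python
import re

# Ordered sound changes from (5), reduced to plain string rewrites.
# The patterns are illustrative assumptions, not a full statement of the changes.
RULES = [
    ("Delabialization", r"kʷ", "k"),         # labiovelar merges with plain velar
    ("Palatalization", r"k(?=[ei])", "č"),   # velar becomes palatal before a front vowel
    ("Vowel merger", r"[eo]", "a"),          # non-high vowels merge as a
]

def derive(form, rules=RULES):
    """Apply the rules in order and return every stage as (label, form)."""
    stages = [("PIE", form)]
    for name, pattern, replacement in rules:
        form = re.sub(pattern, replacement, form)
        stages.append((name, form))
    return stages

for start in ("kʷe", "kʷos"):
    print(" > ".join(form for _, form in derive(start)))

# Counterfactual ordering: vowel merger before palatalization.
reordered = [RULES[0], RULES[2], RULES[1]]
print(" > ".join(form for _, form in derive("kʷe", reordered)))
```

Under these assumptions the first two derivations reproduce the columns of (5). The reordered run ends in ka rather than ča, which illustrates why the vowel merger must follow palatalization and why, once the conditioning front vowel has merged, the palatal can no longer be predicted from its environment.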

Further, the affricates resulting from Satem Assibilation change to š before obstruent, merging with the outcome of RUKI as in (6), and voiceless stops + laryngeal change to voiceless aspirates, as in (7).

(6) PIE *oḱtō > *aćtā > Avest. aštā ‘eight’

(7) PIE *pontHe- > Skt. panthā ‘road, path’

Indo-Iranian further shares a number of morphological and lexical innovations. Shared morphological innovations include an a-stem genitive plural in -ānām (for expected -ām); Cardona 2003a. Lexical innovations include *ǵhes-to ‘hand’ (Skt. hasta, Avest. zasta, OPers. dasta) vs. *ǵhes-r- in Greek and Hittite, as well as words like Skt. khara, Avest. xara, Vâsivari Nuristani korū́ ‘donkey’ that cannot be traced to PIE and have been argued to be borrowings from a Central Asian substrate; for the latter see Lubotsky 2001.

1.2.2. Subgrouping of Indo-Iranian

Although the name suggests a division into two subgroups — Indo-Aryan5 and Iranian — a third branch has often been proposed. Grierson (2003: 100) recognized Dardic, including Nuristani,6 as a third branch. Morgenstierne showed Dardic to be Indo-Aryan and argued for Nuristani as a third branch (1973). His view is shared by most current Nuristani scholars; see Section 1.5. Mayrhofer (1983) argues for Iranian, Cardona (2003a) for Indo-Aryan affiliation. A careful discussion of all the different options is found in Degener 2002.

5 The term “Indic” is often used by Indo-Europeanists and western scholars. South Asian linguists prefer “Indo-Aryan” and use “Indic” for all languages of the Subcontinent.
6 “Kafiri” in early publications.


The difference between Indo-Aryan and Iranian is well established by a large number of developments that are limited to one or the other branch, including the following.

1.2.2.1. Iranian
Linguistic changes distinguishing Iranian include the change of *s to h (unless followed by obstruent or n) as in (8); deaspiration of voiced aspirate stops (9); fricativization of voiceless stops before consonants and of PIIr. voiceless aspirates (10); and dentalization of PIIr. palatal affricates, with subsequent simplification to sibilant/fricative or stop (11).7

(8) PIE *septm̥ > *sapta > Avest. hapta, Mod. Pers. haft

(9) *bhrāter > Avest. brātar- ‘brother’

(10) PIE *treyes > Avest. θrayō ‘3’
     PIE *pn̥tHe- > *patha- > Av. paθō ‘road (genitive)’

(11) PIE *ḱens- > ćans- > *tsanh- > Avest. saŋh-, OP θanh- ‘proclaim’
     *ǵhesto > *ȷ́(h)asta > *dzasta > Av. zasta, OP dasta ‘hand’

1.2.2.2. Indo-Aryan
Indo-Aryan differs from Iranian by deaffrication of PIIr. *ć to ś (12) and the merger of *ȷ́(h) with *ǰ(h) and subsequent debuccalization of *ǰh to h (13).

(12) PIE *ḱens- > *ćans- > śaṁs- ‘proclaim, praise’

(13)                      *ǵenH-      *gʷīwo    *gʷhen-   *ǵhesto
     Satem-assibilation   *ȷ́en(H)-                        *ȷ́hesto
     Delabialization                  *gīwo     *ghen-
     Palatalization                   *ǰīwo     *ǰhen-
     Vowel merger         *ȷ́an(H)-    *ǰīwa     *ǰhan-    *ȷ́hasta
     Palatal merger       *ǰan-       *ǰīwa     *ǰhan-    *ǰhasta
     ȷ́h > h                                     han-      hasta
     Sanskrit             ǰan-        ǰīva      han-      hasta
                          ‘be born’   ‘alive’   ‘slay’    ‘hand’

An important further change is that of IIr. *š to retroflex *ṣ (similarly for the voiced counterpart) and the subsequent assimilation of a following dental to retroflex, as in (14). (See also 1.3.1.5.1.1.)

7 The Old Persian syllabary symbols transliterated as stops might have been fricatives or had fricative allophones.


(14) PIE *pis-to > *piš-ta > *piṣ-ta > piṣ-ṭa ‘ground’
     PIE *oḱtō > *aćtā > *aštā > *aṣtā > Vedic aṣṭā ‘eight’

1.2.2.3. Nuristani
Arguments for Nuristani as a separate, third branch are based on the following developments.8
— Unlike Indo-Aryan and Iranian, Nuristani did not have RUKI after u; see (15).
— Unlike Indo-Aryan and Iranian, Nuristani preserves the dental affricate outcome of PIE palatovelars; (16).
— Like Iranian and some of Dardic, Nuristani deaspirates voiced aspirates; but unlike these, it also deaspirates the VOICELESS ones; (17).

(15) PIE *mūso > Skt. mūṣa vs. Kâmkatavari mū̃sǝ ‘mouse’
     PIE *deuseh2 > Skt. doṣā ‘night’ vs. Kâmkatavari dus ‘yesterday’

(16) PIE *deḱm̥ > *daća > Av. dasa, Skt. daśa vs. Kâmkatavari duċ [-ts] ‘ten’

(17) PIIr. *khara > YAvest. xara, Skt. khara vs. Kâmkatavari kur, Vâsivari korū́ ‘donkey’

The most widely mentioned feature is the non-application of RUKI after u-vowels, which has elicited numerous attempts at phonological explanation (e.g. Longerich 1998, Hamann 2003, with earlier literature). A recent dissenting view is Cathcart 2011. Noting the absence of RUKI effects in contexts other than after u, such as Ashkun wīs (Skt. viṣ ‘poison’), Cathcart proposes that presence or absence of RUKI may result from special developments and that RUKI may have taken place across the board. A major difficulty is the lack of earlier historical stages and the limited amount of relevant attested data. Some Nuristani languages seem to have RUKI after u-vowel (18a); others don’t (18b); note also the yet different outcome in (18c). Do these differences result from different phonological changes (e.g. palatalization in müšt)? Or from borrowing (e.g. yūṣṭ)? Is it likely that body-part words are borrowed? And how can we be sure that borrowing accounts are not simply ways of removing counterexamples to the claimed nonapplication of RUKI after u?

(18) a. Kâtavari muṣṭ ‘fist’, yūṣṭ ‘lip’9 like Skt. muṣṭi, oṣṭha
     b. Ashkun must ‘fist’
     c. Vâsivari müšt ‘fist’

8 For consistency, the transcriptions follow Turner 1962–1969 (who in most cases follows Morgenstierne). Strand’s observations are more up-to-date; but some data are not included in his published and online materials.
9 Strand has š in Kâtavari míšt, but ṣ in íṣṭ (p.c. 2011).


A related issue is the fate of PIE *ḱ before obstruent (see (6) above). In the only attested example, several Nuristani languages have ṣ followed by ṭ: *oḱtō > Kâmkatavari uṣṭ ‘8’; but Prasun āstë. Again, are the retroflex forms Indo-Aryan borrowings?10 Does Prasun āstë indicate that Nuristani did not participate in the change in (6) (another feature distinguishing it from Indo-Aryan and Iranian)? Or did special internal developments obscure the operation of the change in (6), as well as RUKI? To some extent, similar concerns hold for (16) and (17). However, these seem to be more solidly established. There is thus some evidence for considering Nuristani a third branch of Indo-Iranian; but further research is needed.11

1.2.3. The subgrouping of Iranian12

A number of different subgroupings have been offered. Windfuhr (2009) proposes a division of Old Iranian into Southwest (Old Persian), Northwest (Median), Central (Avestan), and Northeast (Scythian/Saka), and a classification of Middle Iranian into West and East Iranian. The latter area and its modern descendants are the most relevant for this volume. Among Eastern Middle Iranian languages, Khotan Saka is most important. Modern East Iranian languages include Pashto, the Pamir languages, Parachi, and Ormuṛi. In addition, Baloch, sometimes classified as northwestern, is spoken in South Asia. The position of Avestan and its relation to Middle and Modern Iranian languages has not been settled. Windfuhr groups it as Central, but Schmitt (1989: 28) considers classification problematic, while mentioning affinities with East Iranian during some phases of its history. Further, the two major stages of the language, Gātha and Younger Avestan, may belong to different dialect areas, with Younger Avestan showing greater affinities to Old Persian (Skjærvø 2009b: 44).

10 For an alternative see 1.2.4 below.
11 The apparent retention of a laryngeal reflex in Nuristani reflexes of *dhughH-ter- ‘daughter’ (e.g. Prasun lüšt < *dužit-) might suggest closer affiliation with Indo-Aryan (Skt. duhitar) rather than Iranian (Av. duγδar-); but Mayrhofer 1983 suggests that laryngeal developments were late and could have applied independently in the Indo-Iranian subfamilies.
12 See section 1.4 for fuller discussion.

1.2.4. Lateral relationships, including dialect and language contact

The issue of subgrouping addressed in the preceding section is crucially informed by the Tree model of language relationship, which operates with the notion of branchings of different languages from a common ancestor, through divergent linguistic innovations. Since J. Schmidt (1872) it has been known that this approach needs to be supplemented by a Wave model, which acknowledges that innovations can cut across the branchings established by the Tree model. The approach has recently been reinvigorated by Garrett (1999).

One feature posing difficulties for the Tree model is the change of s to h (see (8) above), a defining innovation of Iranian, which under this approach would have to predate the Proto-Iranian ancestor. However, borrowings into Elamite show that the change took place much later, no earlier than the 8th or 7th century BC (Mayrhofer 1989: 7). The change thus must have diffused across the Iranian languages after their diversification.

Another feature generally considered characteristic of Iranian is the dentalization of ć, ȷ́ to ts, dz; see (11). However, the change is also found in Nuristani (16); it is only the subsequent change to s, z or θ, d that distinguishes Iranian from Nuristani, but that change plays out differently in different Iranian languages and cannot be postulated for Proto-Iranian. While the clusters ćw, ȷ́w yield outcomes with dentals in most of Iranian, in Eastern Middle Iranian Khotan Saka, as well as in the modern Pamir language Wakhi the outcome is PALATAL š, ž (Emmerick 1989: 216): *eḱwo > *aćwa > Av. aspa, OP asa vs. Kh. Saka aśśa [š], Skt. aśva ‘horse’. This suggests that dentalization was not pan-Iranian. Moreover, the fact that the palatal outcome is found in Eastern Iranian, the area closest to Indo-Aryan with its palatal reflexes of ć, ȷ́(h) (12)/(13) may suggest a common development; in which case the division Iranian : Nuristani : Indo-Aryan is not as straightforward as the discussion in 1.2.3 suggests.

In several East Iranian languages the sibilant š outcome of RUKI or of (6) above becomes retroflex ṣ with assimilation of a following t: *oḱtō > *aćta > *ašta > Kh. Saka haṣṭa ‘8’ (Emmerick 1989: 215, Skjærvø 1989: 377).
Nuristani, too, may have such developments (see 1.2.3 above). Retroflexion of š is commonly considered a defining feature of Indo-Aryan. Its appearance in languages on the western border of Indo-Aryan has been attributed to Indo-Aryan, or even Dravidian language contact (e.g. Kieffer 1989: 451–452). Another view postulates regional convergence, with Burushaski holding an important position (Payne 1989: 423). For sibilant retroflexion, a contact explanation may be appropriate. However, this is not the only retroflexion source in the area. Another source consists of clusters of r and dental obstruents (Emmerick 1989: 215, Skjærvø 1989: 377; Strand 2012; Turner 1927). In Indo-Aryan the development is generally considered Prakritic (but see 2.3.4.2); on the Iranian side it is already Avestan, with rt > ṣ̌13 (contrasting with two other sibilants, transcribed as š and palatalized š́ ): *ṛta > aṣ̌a ‘truth’.

13 The change is generally considered limited by accent (Kellens 1989: 43).


Here again innovations cut across branchings of the Tree model, but much earlier than sibilant retroflexion. Perhaps a contact explanation would also apply here. But given the palatal outcome of *ćw, ȷ́w on the eastern periphery of Iranian, close to Indo-Aryan, it is possible to entertain an alternative, Wave-model account: the spread of features within an early continuum of Indo-Aryan, Iranian, and Nuristani varieties of Indo-Iranian.

1.2.5. “Mitanni”

Lexical items in documents from the ancient Near East (about 15th century BC) show that an Indo-Iranian group (the “Mitanni”14) had migrated to the area. Several features are taken to indicate specifically Indo-Aryan origin (Mayrhofer 1974; Masica 1991: 35–37 with references): (1) Some theonyms are compatible with the Vedic pantheon but not the Iranian one (Aruna = Varuṇa); (2) *s is preserved in words like satta ‘7’ (Skt. sapta) while Iranian has h (Av. hapta); (3) the numeral aika ‘one’ agrees with Skt. eka < *aika, not Av. aēva, OP aiwa. However, the theonyms may reflect a stage prior to the “Zoroastrian revolution”, which eliminated most of the old Indo-Iranian Gods. As noted in 1.2.4, the change of s > h postdates Proto-Iranian, with s preserved into the 8th/7th century BC. In fact, Mitanni also appears to predate the earliest Indo-Aryan, retaining a palatal obstruent in wašanašaya ‘of the chariot’ (vs. Ved. vāhanasya with ȷ́h > h). Further, IAr. *aika beside Iran. *aiwa suggests coexistence of both forms in Proto-Indo-Iranian, with the competition resolved in favor of *aiwa in Iranian, but Indo-Aryan selecting *aika as numeral and relegating *aiwa to particle status (evá ‘only; indeed’; Mayrhofer 1986–2001, s.v. evá). The form aika, thus, may go back to an early period, before Indo-Aryan and Iranian made their different choices. See Hock 1999, fn. 3 on this entire issue.15 The question whether Mitanni is Indo-Aryan or reflects an early variety of (Proto-)Indo-Iranian, thus, cannot be resolved at this point.

1.2.6. Resources

In addition to 1.2.1, 1.4, and 1.5 of this volume, the following recent publications provide helpful information: Mayrhofer 1989 for Indo-Iranian; Masica 1991, Cardona 2003a, and Cardona & Jain (ed.) 2003 for Indo-Aryan; Bashir 2003 for Dardic; Schmitt (ed.) 1989 and Windfuhr (ed.) 2009 for Iranian; Kellens 1989, M. Hale 2008 for Avestan; Strand 1997–present and 2010 for Nuristani.

14 The term Mitanni actually refers to speakers of a non-Indo-European language, whose rulers were Indo-Iranian.
15 Parpola (2002: 74) rejects Hock’s argument, but without linguistic counterevidence.


1.3. Indo-Aryan
Edited by Hans Henrich Hock

1.3.1. Old and Middle Indo-Aryan
By Hans Henrich Hock

1.3.1.1. Structural sketch and major trends
This section presents a brief outline of the structure of Old and Middle Indo-Aryan (OIAr. and MIAr.) and of major historical trends. Some additional phenomena are discussed in 1.3.1.5. Except for the Vedic pitch accent (see 1.3.1.5.1.1), the phonological inventory of Indo-Aryan (Table 1.1) remains remarkably unchanged. There are however major changes in phonological distribution and syllable structure (see 1.3.1.2 and 1.3.1.4.2).

Table 1.1: The segmental phonemes of Old Indo-Aryan16

                       VELAR/GLOTTAL   PALATAL   RETROFLEX   DENTAL   LABIAL
Stops        vl.       k               c         ṭ           t        p
             vl.asp.   kh              ch        ṭh          th       ph
             vd.       g               j         ḍ           d        b
             vd.asp.   gh              jh        ḍh          dh       bh
             nas.      ṅ               ñ         ṇ           n        m
Fricatives   vl.       (ḥ)             ś         ṣ           s
             vd.       h
Semivowels                             y         r           l        v
Vowels                 a ā             i ī       ṛ ṝ         ḷ (ḹ)    u ū
Diphthongs                             e [ē]                          o [ō]
                                       ai                             au

Additional elements:
ṁ (or ṃ), “anusvāra”: a segment-length nasal transition; in Middle Indo-Aryan and later, it may indicate nasalization
m̐, “anunāsika”: (roughly) a variant of anusvāra17
ḷ(h): (aspirated) retroflex lateral18
ĕ, ŏ: these occur in Middle Indo-Aryan

16 This classification reflects the insights and views of the Sanskrit phonetic tradition, including the grouping together of VELAR and GLOTTAL and of a-vowels as “glottal”, the classification of nasals as stops, and the characterization of alveolar r-sounds as retroflex.
17 For detailed discussion on anusvāra and anunāsika see Cardona 2013.

Morphology is less “stable” and undergoes increasing attrition, which in the nouns tends to be compensated for through introduction of (new) adpositions. Sanskrit presents the richest system, with three numbers (singular : dual : plural), reduced to two in MIAr. Nominal and (demonstrative) pronominal inflection distinguishes three genders (masculine : feminine : neuter) and seven19 cases. However, even at the earliest stage, there is extensive syncretism; see Table 1.2 and note that outside the pronouns and nominal a-stems, ablative and genitive singular are not distinct (hence the broken lines around ablative and genitive singular). Late and post-Vedic begin to lose further distinctions, especially dative : genitive or dative : locative. Phonological change accelerates case syncretism, especially in late MIAr., which approximates the common Modern IAr. (Mod. IAr.) system of nominative : oblique.

Table 1.2: Old Indo-Aryan case system, illustrated by the a-stem deva ‘God’

               Singular   Dual        Plural
Nominative     devaḥ      devau       devāḥ
Accusative     devam      devau       devān
Instrumental   devena     devābhyām   devaiḥ (early Vedic also devebhiḥ)
Dative         devāya     devābhyām   devebhyaḥ
Ablative       devāt      devābhyām   devebhyaḥ
Genitive       devasya    devayoḥ     devānām
Locative       deve       devayoḥ     deveṣu

Pronouns and verbs come in three persons (first : second : third20). Verb inflection distinguishes a present from three past tenses (imperfect : aorist : perfect; see below for the ta-participle) and a future (a recent innovation in Vedic). The oldest modal system consists of indicative, subjunctive, optative, and imperative, with the subjunctive dropping out by post-Vedic.21 Sanskrit makes a distinction between “active” and “middle” voice, which fades out in MIAr. The passive is distinct from the “middle” only in the present/imperfect and in the third singular aorist. In addition, there are “derived” inflections (causative, desiderative, and intensive), as well as a number of non-finite forms (infinitive, verbal nouns, gerundives, converbs, and a range of participles).

18 In traditional transcription, ḷ designates both a retroflex non-syllabic and a dental syllabic lateral, disambiguated by context. Recent, especially Indo-Europeanist, publications may use l̥ for the syllabic lateral and similarly r̥ for the syllabic rhotic.
19 The vocative, distinct from the nominative only in the singular of certain nominal inflections, is considered a variant of the nominative in traditional Sanskrit grammar.
20 Except for certain clitics, third-person pronouns do not exist, demonstratives taking their place.
21 Early Vedic also has an “injunctive”, non-modal traces of which remain in Epic Sanskrit; and a conditional develops in the history of Vedic.

Uninflected elements include adverbs, adpositions, and “particles”. In Vedic, the distinction between adpositions, particles, and verbal prefixes is not fully settled, but univerbation of prefix-verb combinations and other specializations lead to greater distinctiveness. In MIAr., prefix-verb structures tend to become noncompositional. By late Vedic, new adpositions develop from nominal case forms (e.g. Skt. arthe ‘for the purpose of’) or verbal structures (e.g. Skt. kṛte ‘for the sake of’). Many particles (e.g. ca ‘and’) are clitics, and there are also pronominal clitics (e.g. enam ‘him’).

Unlike the morphology, the overall syntax of Indo-Aryan remains relatively stable. The unmarked order is SOV; adpositions generally tend to be postpositive (a major exception is ā ‘to; from’); demonstratives, adjectives, and genitives precede their head. However, there is a great amount of variation, not just in phrase order, but even in word order. Toward late MIAr., word order freedom fades out. (A history of word and phrase order freedom remains a desideratum.) Passive constructions can in principle involve any verb (including ‘be’). There is also a causative and, developing in the history of Sanskrit, a “double causative”. Subordination is marked by non-finite structures (involving infinitives, gerundives, converbs, or participles), or finite relative-correlative constructions of the type (19). An additional device is the use of quotative marking, as in (20).

(19) [tvaṁ taṁ … bādhasva …]CC
     you.NOM.SG that.ACC.SG.M bind.IMP.2SG
     [… yo no jighāṁsati]RC
     who.NOM.SG.M we.OBL.CLIT slay.DESID.PRS.3SG
     ‘You … tie down that (evil-doer) who … tries to slay us.’ (Rig Veda 6.16.32)

(20) nakir vaktā ‘na dād’ iti
     nobody.NOM.SG.M say.NOM.SG.FUT NEG give.SUBJ.3SG QUOT
     ‘Nobody will say, “He shall not give.”’ (Rig Veda 8.32.15c)

Two phenomena at the interface of morphology and syntax are of special interest. One is the use of the ta-participle (originally a perfective past participle) as general past tense, as in (21), rivaling and by late MIAr. replacing the old finite past tenses.22 This is the source of the Mod. IAr. perfective past. Debate continues as to whether the Sanskrit and MIAr. transitive construction (21a) is “passive”, reinterpreted as ergative, or whether it was always ergative. (See e.g. Srivastava 1970, S. Anderson 1977, Dixon 1994, Harris & Campbell 1995, Deo & Sharma 2006 vs. Klaiman 1978, Hock 1986, Butt 2006.) The “passive” hypothesis ignores the non-passive intransitive construction (21b).

(21) a. tena pustakaṁ paṭhi-ta-m
        that.INS.SG.M book.NOM.SG.N read-ta.PTCP-NOM.SG.N
        ‘He read the book.’ (Lit. ‘By him book (is/was) read.’)
     b. sa ga-ta-ḥ
        that.NOM.SG.M go-ta.PTCP-NOM.SG.M
        ‘He went.’ (Lit. ‘He (is/was) gone.’)

The other phenomenon is an (optional) periphrastic construction indicating continued or repeated action. It involves a converb or present participle plus a helping verb (‘go’, ‘sit’, ‘stand’, later also ‘be’); see e.g. (22). Some consider this construction the antecedent of Mod. IAr. compound verb constructions, but the latter typically have telic or perfective, rather than continued-action function. (See e.g. Butt 2003 vs. Slade 2013.) By late MIAr., one structure (present participle plus ‘be’) is grammaticalized as progressive (R. A. Singh 1980: 138).

(22) ime … te vayam … ye
     this.NOM.PL.M that.NOM.PL.M we.NOM.PL who.NOM.PL.M
     tvā +23 ārabhya carāmasi …
     you.ACC.SG.CLIT hold.on.CVB go.PRS.1PL
     ‘We here … are the ones who keep holding on to you …’ (Rig Veda 1.57.4)

22 Traces survive in a few modern northwestern Indo-Aryan languages.

1.3.1.2. Chronological classification
The conventional division is into Old Indo-Aryan and Middle Indo-Aryan. The division reflects major phonological changes which include consonant-cluster assimilations (23a), intervocalic weakening of single stops (23b), change of syllabic ṛ to vowel (23b), sibilant merger (23c), and the “Two-Mora Conspiracy” (conversion of trimoraic syllables to bimoraic ones) (23d). Languages on the western and especially northwestern periphery do not participate in all of these changes (see 1.3.1.4.2).

23 A plus sign is used to indicate that sandhi has been undone for greater clarity.


(23)    Old Indo-Aryan               Middle Indo-Aryan    (later MIAr.)
     a. sapta ‘7’                    satta
        sakta ‘able’                 satta
        śapta ‘(ac)cursed’           satta
     b. kṛta ‘done; deed’            kita                 ki(y)a
     c. sapta ‘7’                    satta
        śapta ‘(ac)cursed’           satta
        teṣu ‘among those’           tesu
     d. rāj.ñaḥ ‘of a king’          rañ.ño
        (first syllable rāj: μμμ)    (first syllable rañ: μμ)
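The mora counts behind (23d) can be checked with a small sketch. It rests on assumptions that are not spelled out in the text: a short vowel counts one mora, a long vowel two, and each consonant after the vowel one, while onset consonants count nothing. Only the vowels a, i, u and their long counterparts are handled, and the function name `mora_count` is hypothetical.

```python
import unicodedata

# Assumed weights: short vowel = 1 mora, long vowel = 2, each coda consonant = 1.
# Only a/i/u and their long counterparts are covered in this sketch.
LONG_VOWELS = set("āīū")
SHORT_VOWELS = set("aiu")

def mora_count(syllable):
    """Count the moras of a single syllable given in romanized transcription."""
    syllable = unicodedata.normalize("NFC", syllable)  # keep ā, ñ etc. as single characters
    moras = 0
    seen_vowel = False
    for ch in syllable:
        if ch in LONG_VOWELS:
            moras += 2
            seen_vowel = True
        elif ch in SHORT_VOWELS:
            moras += 1
            seen_vowel = True
        elif seen_vowel:
            moras += 1  # consonant after the vowel, i.e. part of the coda
        # consonants before the vowel (the onset) add nothing
    return moras

print(mora_count("rāj"), mora_count("rañ"))
```

On these assumptions the Old Indo-Aryan syllable rāj comes out as trimoraic and the Middle Indo-Aryan syllable rañ as bimoraic, in line with the diagram in (23d).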

Old Indo-Aryan is further classified into Vedic (the Rig Veda representing the earliest stage) and “Epic” and/or “Classical” (better: post-Vedic) Sanskrit. For Middle Indo-Aryan, three major stages are recognized: Pali and the language of the Aśokan inscriptions; Prakrits; Apabhraṁśa. There are numerous problems with this classification. While “Epic”/“Classical” Sanskrit is classified as OIAr. because of its phonology and morphology, it is contemporary with MIAr. Although some evidence suggests Vedic as the ancestor of MIAr. (v. Hinüber 2001: 43), the relation is not direct, for MIAr. sometimes preserves older forms than Vedic, such as Pali idha ‘here’ vs. Ved. iha (v. Hinüber 2001: 41–42). Further, the existence of “Prakritisms” in earliest Vedic (see 1.3.1.3.1) might suggest that “Old” and “Middle” Indo-Aryan are descendants of a common ancestor. A comprehensive discussion of these and related issues is found in Emeneau 1996. Note further that, unlike the various forms of MIAr., Sanskrit continues in spoken use to the present, chiefly among Indian Sanskritists (Hock 1992). For further problems see 1.3.1.3.1 and 1.3.1.4.1.

1.3.1.3. Social interactions: Diglossia and Koiné

1.3.1.3.1. Diglossia

One of the most problematic issues is the relationship between Sanskrit and early MIAr. (“Prakrit”). A large number of Prakritic features are found as early as the Rig Veda, such as ṛ > V (24a), or the substitution of –m for –d in the nominative/accusative singular neuter of the interrogative pronoun (24b).

(24)    “Vedic”          “Prakritic”
     a. *kṛtavat         kitava        ‘gambler’
        vikṛta           vikaṭa        ‘misshapen, ugly’
     b. *ki-d/ka-d       ki-m          ‘what’


In some cases Sanskrit and Prakrit exhibit polarizing changes, as in the instrumental plural of a-stems, with resolution of early suffix variation in favor of -aiḥ in Post-Vedic, but -ehiṁ (< *-ebhiḥ) in Prakrit.

(25) Early Vedic       Post-Vedic   Prakrit
     -aiḥ / -ebhiḥ     -aiḥ         -ehiṁ

Classical Sanskrit drama exhibits remarkable variation between Sanskrit and various Prakrits, with Sanskrit generally reserved for male protagonists and Prakrit assigned to other males and (most) females — a division that mirrors (access to) education. Further, Sanskrit and Prakrit are treated as mutually intelligible. After an extensive survey of earlier literature (primary and secondary), Hock and Pandharipande (1976, 1978) conclude that the relationship between Sanskrit and Prakrit is best characterized as diglossia. Note however that diglossia here differs from classical cases in so far as both the H variety (Sanskrit) and the L variety (Prakrit) are used as spoken media. This view of Sanskrit and Prakrit coexisting diglossically as spoken languages conflicts with an early proposal of Franke (1902), recently revived in more nuanced form by Pollock (1996), that Sanskrit (in effect) died out and was revived during the Gupta period or through inscriptional use by Rudradāman. While there may have been a revival (or at least, reinvigoration) of Sanskrit in written use, the evidence of Sanskrit drama (going back to late BC) makes the hypothesis unlikely as regards spoken use. Moreover, Hock and Pandharipande (1976) point to references to literary use in Patañjali’s 2nd-century BC Mahābhāṣya as indicating literary continuity.24 Pollock dismisses these as “stray references”. However, he adds interesting arguments regarding the question of why Sanskrit suddenly appears in inscriptions, replacing the earlier Prakrit tradition. For an alternative perspective see Filliozat 1972. The diglossic coexistence of Sanskrit with various MIAr. (and Modern) vernaculars had strong effects on the lexicon of the vernaculars, in terms of borrowings at various stages, especially in religious and intellectual vocabulary, such as Pali vākya ‘speech, utterance’ (for expected *vakka/vāka) or brāhmaṇa ‘brahmin’ (for *bamhaṇa). Beside Sanskrit, several MIAr. 
languages seem to have coexisted diglossically with various vernaculars. These include Pali (especially in Sri Lanka), Ardhamāgadhī (among Jains), and Apabhraṁśa (used as literary language in much of northern India, coexisting with early forms of Mod. IAr.).

24 Jamison (2007) argues for even farther-reaching continuity, going back to the Rig Veda.

Hans Henrich Hock

1.3.1.3.2. Koiné developments

Pali and Ardhamāgadhī, sacred languages of Theravāda Buddhism and Jainism, have been considered koinés because of their composite linguistic nature, combining eastern with central and western features; and so has the chancery language of the eastern and central Aśokan inscriptions; v. Hinüber 2001: 48–49, 93–95, 98–100 (for Aśokan see also Oberlies 2003: 165). However, similar composite features are characteristic of many literary languages, including Sanskrit. True, Pali, Ardhamāgadhī, and to a lesser degree the Aśokan dialect were, like koinés, used over vast territories, overlying a great range of other varieties. But again, so are many other literary languages, including Sanskrit, which in fact became the link language of South Asia par excellence (Filliozat 1972). It remains to be seen to what extent the MIAr. varieties in question went beyond the “ordinary” composite nature of such link languages.

The case is stronger for considering the latest MIAr. stage, Apabhraṁśa, to have been a koiné. Pattanayak’s attempt to reconstruct the ancestor of the Mod. IAr. languages (1966) yielded structures remarkably similar to Apabhraṁśa; and Subhadra Sen claimed Apabhraṁśa to be the ancestor of Mod. IAr. (1973). Under this view, Apabhraṁśa, a product of massive dialect leveling, replaced the older regional varieties, just like the Greek Koiné in Greece. A potential problem is the well-known fact that eastern Apabhraṁśa, like eastern Mod. IAr., has palatal ś, contrasting with dental s in central and western varieties. In fact, Katre (1965, 1968) and Miranda (1978) argue against the koiné hypothesis, and Bubenik (2003: 209–210) notes further regional differences, such as the retention of clusters with r in western varieties of Apabhraṁśa, a retention with counterparts in modern western Indo-Aryan varieties, e.g. Guj. traṇ ‘three’ < *trīṇi. (See also 1.3.1.4.2.) Moreover, northwestern Indo-Aryan varieties, which did not enter into Pattanayak’s reconstruction, preserve an even greater number of consonant clusters, not found in Apabhraṁśa. The issue deserves further investigation.

1.3.1.4. Dialectology

Almost all of our information comes from literary texts, and as noted, the literary languages are composite. This holds true even for regionally defined Prakrits, such as Māgadhī and Mahārāṣṭrī (see 4.2). The possibility of determining dialectal differences within Old and Middle Indo-Aryan is therefore reduced. Nevertheless, some information can be abstracted from the available evidence.

The languages, their histories, and their genetic classification

1.3.1.4.1. Vedic dialectology

The Vedic tradition is characterized by different branches (or schools), and there are well-known differences between these; see e.g. Whitney 1892 on the use of past tenses in Brāhmaṇa-Prose narratives. In a series of articles, Witzel (1989, 1995) attempts to develop a broader picture, relating the different Vedic branches to different geographical areas. There are also differences between the language described in Pāṇini’s grammar and the roughly contemporary (late) Vedic tradition. These include the marking of goals of motion verbs, restrictions on agent-coreference in an infinitival construction, and causee-marking (Deshpande 1983, Hock 1981, 2012). Deshpande, followed by Hock, argues that these differences are best explained in terms of regional difference — Pāṇini came from the extreme northwest (near Gandhāra) and presumably spoke a variety of Sanskrit different from that of the textual mainstream (located in a more central area).25 While the PIE contrast r : l is neutralized in favor of r in most of the Rig Vedic lexicon, some words appear to have preserved l (e.g. loka ‘world’ < *lewko-). A long tradition assumes that there was a dialect of Old Indo-Aryan that preserved the PIE distinction r : l (e.g. Fortunatov 1881, Arnold 1893, Wackernagel 1896: 217, Parpola 2002: 50, Fortson 2004: 182). An alternative view considers such “l-forms” borrowings from eastern Indo-Aryan where the contrast r : l was neutralized in favor of l, as shown by eastern Aśokan inscriptions (1.3.1.4.2). In fact, even PIE r-forms may have l-outcomes, as in upala ‘upper grinding stone’ beside upari ‘above’ (PIE *uper-). Proponents of the “r : l dialect” hypothesis counter by attributing forms like upala to a change of r to l in “labial environment”; but Bartholomae (1896) shows that l-outcomes for original l also occur in “labial environment” (e.g. loka). 
Edgerton (1946: 17–19), Hock (1991: 137), and most recently Mayrhofer (2002), therefore conclude that there is no evidence for an r : l dialect.

25 Cardona (2002) argues that, pace Whitney (1892) and Witzel (1989), Pāṇini’s tense use is attested in the Aitareya Brāhmaṇa; Hock (2012) considers the interpretation of the evidence to be uncertain.

26 There is also a Paiśācī Prakrit, whose exact nature is less certain. No texts in this language are extant, and information derives only from characterizations in grammatical literature. V. Hinüber (2001: 109–112) argues for affiliation with Pali and Aśoka’s chancery language; but forms such as vatana (for Skt. vadana ‘face’), with devoicing rather than the usual voicing or loss of medial consonants, suggest some kind of hyperarchaism.

1.3.1.4.2. Middle Indo-Aryan dialectology

The division of literary Prakrits into Māgadhī (eastern), Śaurasenī (central), and Mahārāṣṭrī (western) is well established,26 and some of their differences correspond well with earlier and/or later regional differences in Indo-Aryan, especially
the palatal ś and the lateral l of Māgadhī vs. the dental s and the rhotic r of the other varieties; compare keśeśu : kesesu ‘by the hair’ (Skt. keśeṣu), lājā : rājā ‘king’. However, they share the composite nature of many literary languages; and Māgadhī and Śaurasenī Prakrit, confined to Sanskrit drama (1.3.1.3.1), tend to be generated by rule out of Sanskrit. Mahārāṣṭrī, however, enjoys broader currency, both in fine literature and in Jaina texts. For the early MIAr. religious standard languages Pali and Ardhamāgadhī there is ample evidence that they started out as eastern varieties — not surprising, since the Buddha and Mahāvīra hail from the east. Eastern features include the appearance of l-forms, such as antalikkha ‘sky’ (Skt. antarikṣa), or the nominative singular masculine a-stem ending -e vs. western -o. Both languages (especially Pali), however, underwent major reaffiliation with western MIAr. (v. Hinüber 2001, Oberlies 2003.) Further, note Gāndhārī, a northwestern Prakrit used in a Buddhist canon different from Pali, as well as in other texts, some of which (“Niya Prakrit”) come from present-day Xinjiang. The language shares many features with northwestern Aśokan (see below), but intervocalic stops tend to be weakened; and the canonical texts exhibit occasional forms that seem to be of Pali origin (Brough 1962). An interesting glimpse into MIAr. dialectology is (inadvertently) provided by the inscriptions of Emperor Aśoka. Three broad areas can be distinguished — eastern (the largest area, dominated by the Māgadhī chancery “koiné” of Aśoka’s court), western, and northwestern. These areas differ from each other in a number of features, which (making allowance for borrowings back and forth) correspond well with Mod. IAr. differences. One of these concerns the fate of r/ṛ + dental stop, as in artha > aṭṭha vs. attha. A difference between eastern retroflex and western dental is generally recognized (v. Hinüber 2001: 199, Oberlies 2003: 165). 
However, there is a TRIPLE geographic distinction — unlike the west, and like the east, the northwest offers retroflex outcomes (Turner 1926a with 1921, 1927; Hock 1996a). Certain clusters that are assimilated in the east remain in the west and especially the northwest. Thus Skt. asti ‘is’ appears as asti in both the west and the northwest vs. atthi in the east. (V. Hinüber (2001: 182) attributes western st to archaization.) Similarly consonant + r remains in the west and northwest, as in Skt. śuśrūṣā ‘obedience’ > W susrūsa, NW suśruṣa vs. E sussusā. (The western area exhibits some variation.) The preceding example further shows that the sibilant merger, characteristic of the east and west, is not found in the northwest. In fact, even the modern languages of the area preserve the triple contrast s : ṣ : ś. There is also suggestive evidence for the development of a palatal : retroflex affricate contrast čh : c̣h (v. Hinüber 2001: 109).27 Significantly, these are features of the Mod. IAr. languages of the area, as well as a number of non-Indo-Aryan languages (see 2.3.4.3).

27 Brough (1962: 72), however, does not commit to this interpretation.

A dialectal division between east and west not noted in earlier publications concerns the fate of h + R clusters. What is clear is that these clusters change to Rh in Middle Indo-Aryan, but the phonetic interpretation of Rh is less certain. Based on Pali metrical evidence, v. Hinüber argues for a reading RRh, i.e. geminate aspirate sonorant (2001: 187–189). While this interpretation is correct for (most of) Pali and no doubt also for western (and northwestern) Aśokan, the evidence of eastern Aśokan points to a CLUSTER R+h, as shown by the introduction of an oral epenthetic stop in baṁbhana/bābhana (< Skt. brāhmaṇa ‘brahmin’), a change not found in W b(r)ā̆mhaṇa (interpretable as b(r)ā̆mmhaṇa).28 Hock (2009) shows that this geographical difference is paralleled in Mod. IAr. regional variation and that similar differences are found in the normal outcomes of s + nasal clusters. The (optional) lack of vowel shortening in b(r)ā̆mhaṇa shows that the Two-Mora Conspiracy (1.3.1.2) does not always apply in western Aśokan. Note further W ātpa ‘obtained, reached’ (Skt. āpta), nāsti ‘there is not’ (Skt. nāsti), rāññā ‘by the king’ (Skt. rājñā), sūpātthaye ‘for the sake of food’ (Skt. sūpārthāya).29 V. Hinüber (2001: 117–118) questions the theory of long-vowel preservation in western Aśokan, but tentatively accepts it for Modern Sindhi, referring to Turner 1923. The latter, however, provides evidence for non-application of the Two-Mora Conspiracy in a much larger western area of Mod. IAr., including dialects of Panjabi (Turner 1967).

28 NW bramana may suggest yet a different development, but NW inscriptions do not regularly mark consonant gemination (or vowel length); v. Hinüber (2001: 188) points to Gāndhārī bramma as evidence for a stage with mmh.

29 The northwestern inscriptions do not mark vowel length and therefore do not provide relevant evidence.

1.3.1.5. Some further noteworthy features and developments

1.3.1.5.1. Sanskrit

The following sections focus on phonology and (morpho-)syntax. Morphology, while remarkable for its richness, offers fewer issues of general interest or controversy.

1.3.1.5.1.1. Phonology

The complex system of morphophonemic “SANDHI” rules is probably the most noteworthy. Not only do rules operate word-internally (“internal sandhi”) but also across word boundary (“external sandhi”). The result is a great amount of word-final (and some word-initial) variation. For instance, word-final /r/ can be realized in at least five different ways — r before voiced segments; ś, ṣ, s before voiceless
palatal, retroflex, or dental obstruent; ḥ elsewhere; plus degemination of rr with compensatory lengthening of preceding short vowel. Vedic adds two further variants — [φ] and [χ] — before voiceless labial and velar stop respectively. (Details in Emeneau 1958, Allen 1963.) External sandhi is de rigueur in extant written — and oral — texts; but Pāṇini characterizes it as optional. In modern spoken Sanskrit, external sandhi is ordinarily ignored, except in fixed collocations or combinations with clitics such as punaś ca ‘and again’ (/punar ca/).

Among the internal sandhi phenomena, “GRASSMANN’S LAW” (GL) and its relation to reduplication have elicited a large amount of discussion (surveyed in Collinge 1985: 47–61). Historically, GL was a constraint against more than one aspirated stop per word, with every aspirate but the last deaspirated. Combined with other developments, this resulted in alternations as in (26). Some have proposed a synchronic version of GL to account for these (26’) (e.g. S. Anderson 1970), but Sag presents a strong case for adopting Pāṇini’s account (1974), according to which there is only one aspirate, and the initial aspirate in (26b) results from “aspirate throwback” (see (26”)). (Cardona 1991 discusses important historical evidence.)

(26)     PIE            Sanskrit
     a.  bhe-bhowd-e    bu-bodh-a    ‘awoke’ (PERF.)
     b.  bhudh-s        bhut         ‘awareness’ (NOM.SG.)

(26’)    Underlying       Deaspiration etc.   Final loss   GL          Surface
     a.  /bhu-bhodh-a/    -----               -----        bu-bodh-a   bubodha
     b.  /bhudh-s/        bhut-s              bhut         -----       bhut

(26”)    Underlying     Final loss   Deasp. etc./Aspirate throwback   Surface
     a.  /bu-bodh-a/    -----        -----                            bubodha
     b.  /budh-s/       budh         bhut                             bhut
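The two synchronic accounts in (26’) and (26”) can be compared in a small toy derivation. The Python sketch below is an illustration only, under simplifying assumptions that are not part of the source: it uses an ASCII transcription, handles just three aspirates, and reduces each rule to what the two forms in (26) require. The function names (`derive_with_gl`, `derive_with_throwback`) are hypothetical, and the rules are not meant to reproduce Pāṇini’s, Anderson’s, or Sag’s actual formulations.

```python
# Toy comparison of the two synchronic analyses in (26') and (26'').
# Assumptions (not from the source): ASCII transcription, three aspirates only,
# and rules simplified to cover just the two forms in (26).

VOWELS = set("aeiou")
PLAIN = {"bh": "b", "dh": "d", "gh": "g"}      # aspirate -> plain voiced stop
VOICELESS = {"b": "p", "d": "t", "g": "k"}     # plain voiced -> voiceless stop


def segments(form):
    """Split a form into segments, treating bh/dh/gh as single units."""
    form = form.replace("-", "")
    out, i = [], 0
    while i < len(form):
        step = 2 if form[i:i + 2] in PLAIN else 1
        out.append(form[i:i + step])
        i += step
    return out


def final_loss(segs):
    """Drop a word-final s that follows a consonant."""
    if len(segs) >= 2 and segs[-1] == "s" and segs[-2] not in VOWELS:
        return segs[:-1]
    return segs


def derive_with_gl(underlying):
    """(26'): aspirates are underlying; GL removes all but the last."""
    segs = segments(underlying)
    # Deaspiration etc.: an aspirate before s becomes a plain voiceless stop.
    for i in range(len(segs) - 1):
        if segs[i] in PLAIN and segs[i + 1] == "s":
            segs[i] = VOICELESS[PLAIN[segs[i]]]
    segs = final_loss(segs)
    # GL: deaspirate every aspirate except the last one in the word.
    aspirated = [i for i, s in enumerate(segs) if s in PLAIN]
    for i in aspirated[:-1]:
        segs[i] = PLAIN[segs[i]]
    return "".join(segs)


def derive_with_throwback(underlying):
    """(26''): one underlying aspirate; word-final deaspiration throws it back."""
    segs = final_loss(segments(underlying))
    if segs and segs[-1] in PLAIN:
        segs[-1] = VOICELESS[PLAIN[segs[-1]]]
        # Aspirate throwback: aspirate the nearest preceding plain voiced stop.
        for i in range(len(segs) - 2, -1, -1):
            if segs[i] in VOICELESS:
                segs[i] += "h"
                break
    return "".join(segs)


for gl_input, tb_input in [("bhu-bhodh-a", "bu-bodh-a"), ("bhudh-s", "budh-s")]:
    print(derive_with_gl(gl_input), derive_with_throwback(tb_input))
```

Both functions yield bubodha and bhut for the (a) and (b) forms, so the two analyses differ only in their underlying representations and rule inventory, not in the surface forms they predict for these examples.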

VEDIC PITCH ACCENT is lexically and morphologically determined, rather than assigned by rule. Within the Vedic period it undergoes several interesting developments. From one perspective, Vedic forms such as agnínā ‘by Agni’ contain just one pitch element — the high pitch on the second syllable, marked by ´. In Vedic recitation, however, reflected in indigenous transcriptions, there are two further elements — an extra low pitch on the syllable preceding the high pitch of the accented syllable, yielding a LH melody, plus a falling pitch on the post-accent syllable; see (27a). The LH melody is reminiscent of the widespread LH contour of modern South Asian pitch accent systems. It is an open question whether there is a historical connection. Some Vedic traditions (e.g. the Rig Veda) show a further development: The post-accent syllable receives a high-falling contour [ˆ], whose initial pitch is higher than that of the preceding accented syllable, and the accented syllable is left unmarked in transcription (27b). In a subsequent stage, in a different Vedic branch, the higher pitch on the final syllable was reinterpreted as THE high-pitch accent (27c), and by the time of the late Vedic Śatapatha Brāhmaṇa, the LH melody was reasserted through introduction of low pitch on the preceding syllable (27d). What complicates matters is that the Śatapatha Brāhmaṇa only marks the low pitch (27e), giving the appearance either of a “marking reversal” with low pitch replacing high pitch or of an unusual marking convention for high pitch. Cardona’s (1993) study of the “Bhāṣika Sūtra” shows that neither interpretation is correct, and that the development proceeded along the lines of (27).

(27) a.  Earliest pattern       agnínā    [agnínā̀]
     b.  Rig Vedic              agninā̀    [agnínā̂]
     c.  Reinterpretation       agninā́    [agninā́]
     d.  Śatapatha Brāhmaṇa     agninā́    [agninā́]
     e.                         agninā

RETROFLEXION is a feature that has attracted widespread attention. One approach has been to attribute it to outside, usually Dravidian, influence (see 2.3.3.1); another has been to explain it through internal developments along the lines of (28), with change of palatal to retroflex sibilant, assimilation, and loss of some of the triggers for assimilation (see Hock 1996b for details and references).

(28) Stages: (PIE >) PIIr > pre-IAr I > pre-IAr II > Vedic
     a.  (*liǵh-to- >) *liždha- > *liẓdha- > *liẓḍha- > līḍha- ‘licked’
         cf. (*wiḱ-to- >) *wišta- > viṣṭa- = viṣṭa- ‘entered’
     b.  (*wiḱ-s >) *wišš > *wiṣṣ > *wiṭṣ > viṭ ‘people, clan’ (N sg.)
         cf. (*wiḱ-su >) *wiššu > *wiṣṣu > *wiṭṣu > vikṣu (id.) (L pl.)

The traditional assumption is that retroflexion results from prehistoric changes; but Deshpande claims that it is a relatively late Vedic phenomenon (1979). Pointing to the fact that (variable) external-sandhi retroflexion as in (29) is most common in the Rig Veda and dies out in late Vedic, Hock argues for the correctness of the view that the changes were prehistoric (1979).

(29) dhenuṣ ṭa indra … = /dhenus te …/
     ‘Your milch-cow, Indra …’ (Rig Veda 8.14.3)

There has also been disagreement on the “spontaneous retroflexion” of intervocalic n, as in (30). After surveying earlier accounts, Mayrhofer (1968) concludes that the most likely explanation is that the change is one of a number of Vedic developments that anticipate MIAr. developments. An alternative, suggested by Hock (1991), is that forms like these may be borrowings from Vedic Prakrits.

(30) Proto-Indo-Iranian   Avestan   Rig-Vedic Skt.
     *st(h)ūnā-           stūnā-    sthūṇā ‘pillar’
     *mani-               maini-    maṇí- ‘necklace’

The grammarians, and phonetic treatises following the grammarians, classify r as retroflex, a classification that motivates the placement of r in the phonetic chart in Table 1.1. Western linguists tend to accept this classification as phonetically accurate (e.g. Wackernagel 1896: 209). The Vedic phonetic treatises, however, characterize r as alveolar, and that is its pronunciation in all of Mod. IAr. The retroflex classification probably reflects the phonological fact that, like retroflex ṣ, r triggers retroflexion of n (under certain conditions); see Hock 2014. What seems to be relevant is that, like retroflex ṣ, alveolar r is postdental and hence triggers change of dental n to postdental ṇ.

1.3.1.5.1.2. (Morpho-)syntax

Several noteworthy phenomena and developments concern REFLEXIVIZATION. The earliest system, more or less intact in the Rig Veda, encodes genitival/possessive reflexives through a pronoun sva ‘one’s own’, but argument and other non-genitival reflexives by means of middle-voice verb inflection, as in yajate ‘sacrifices for himself’ vs. yajati ‘sacrifices (for somebody else)’.30 A non-possessive reflexive pronoun, based on nominal forms, is introduced in the Rig Veda (tanū ‘body’) but develops fully only later (based on ātmán ‘self’); see Hock 2006. The new pronominal reflexivization no doubt made middle-voice inflection redundant and thus was partly responsible for its functional attrition and its MIAr. demise (but see Delbrück 1888: 262–263).

Another element with impact on middle-voice inflection is the development of a stative-intransitive (SI) verb category, as in (31a). Ignoring the issue of root vs. suffix accent in Vedic (which fluctuates anyway), the formation is identical to that of the passive, and passive and SI seem to have a common origin (Gonda 1951: 98–101). However, as (31b) shows, the syntax differs — in (31a) the surface subject of the verb controls the converb gatvā, but in (31b) it is the underlying subject or “agent” (in Pāṇini’s sense) that does so (whether overtly present or not). The SI is interesting as the probable source of the Mod. IAr. “middle” type, as in Hindi kaṭnā ‘be cut (stative)’ vs. kāṭnā ‘cut’, kāṭā jānā ‘be cut (pass.)’.

(31) a.  tatra   gatvā    na    mucyase
         there   go.CVB   NEG   get.free.SI.PRS.2SG
         ‘Having gone there you do not get free.’
     b.  tatra   gatvā    (tena)          na    mucyase
         there   go.CVB   that.INS.SG.M   NEG   free.PRS.PASS.2SG
         ‘You are not freed by someone/him_i having gone there/who_i has gone there.’

30 In addition, there is an emphatic reflexive svayam.

As indicated in the preceding paragraph, converb (and similarly, reflexivization) CONTROL is exerted not by the surface, but by the underlying subject. In post-Rig Vedic, this control holds for all “P-oriented” constructions, whether passive, gerundive, or (transitive) ta-participle. (Hock 1986; for the converb see Pāṇini 3.4.21.) Control, therefore, offers a way to determine whether putative “OBLIQUE SUBJECTS” have subject properties. Other than the trivial case of the instrumental-marked “agents” of P-oriented constructions, the only category of oblique subjects that can be confidently established is that of “Possessor Subjects”, as in (32a). See Hock 1990, with evidence against Hook’s claim (1976, 1984), based on isolated examples like (32b), that Sanskrit also has “Dative Subjects”.

(32) a.  na    + asya          śrutvā     gṛhe             vaseta
         NEG     he.GEN.SG.M   hear.CVB   house.LOC.SG.N   dwell.OPT.3SG
         ‘(She) should not dwell in his_i house, he_i having heard the verses.’ (Atharva Veda 12.4.27)
     b.  śrutvā     tv   idam            upākhyānam
         hear.CVB   &    this.NOM.SG.N   story.NOM.SG.N
         anyan            na    rocate          [tasmai]
         other.NOM.SG.N   NEG   please.PRS.3SG  that.DAT.SG.M
         ‘(He_i) having heard this story, another (one) doesn’t please him_i/he_i does not like another (one).’ (Mahābhārata 1.2.236)

As noted in 1.3.1.1, example (4), Sanskrit has certain constructions that have been claimed to be ancestral to the modern COMPOUND VERB constructions but which differ aspectually. Following Tikkanen (1987), Hook believes he has found an ancestor that shares the telic aspect of the modern compound verb (1993); see (33a). While Hook’s light-verb interpretation of gatā is possible, the alternative literal reading (33b) is equally possible. Slade (2013) concludes that the evidence is inconclusive.

(33) tato   makṣikā        + uḍḍīya       gatā
     then   fly.NOM.SG.F     fly.up.CVB   go.ta.PTCP.NOM.SG.F
     a.  ‘Then the fly flew (up and) away.’
     b.  ‘Then the fly flew up and left.’ (Lit. ‘… having flown up left.’)
     (Pañcatantra 1.22)

QUOTATIVE marking (see 1.3.1.1, ex. (20)) undergoes a variety of changes, both in terms of its position (relatively free in the Rig Veda, usually post-citation later) and in terms of specific uses. The latter include Cause and Purpose marking, grammaticalized along the lines of (34). Hock (1982) is a pilot study of the historical development of the quotative; but detailed further research is required.

(34) phalāni          labhā         iti    kṣetraṁ          gacchāmi
     fruit.ACC.PL.N   get.IMP.1SG   QUOT   field.ACC.SG.N   go.PRS.1SG
     a.  ‘(Thinking) “Let me get fruit”, I go to the field.’
     b.  ‘I go to the field in order to get fruit.’

Instead of quotative marking (or no marking at all), Sanskrit more rarely uses relative-correlatives for embedding cited discourse or Purpose structures, as in (35).

(35) a.  kṣetraṁ          gacchāmi     yathā       phalāni          labhai
         field.ACC.SG.N   go.PRS.1SG   so (that)   fruit.ACC.PL.N   get.IMP.1SG
         ‘I go to the field in order to get fruit.’
     b.  tena            (tad)           kathitaṁ
         that.INS.SG.M   that.ACC.SG.N   say.ta.PTCP.NOM.SG.N
         yad              āgacchāmi
         which.NOM.SG.N   come.PRS.1SG
         ‘He said (that) “I am coming”.’

As in other languages with competing complementizer strategies (such as Bangla and Marathi), some varieties of Sanskrit offer a “blend” of the two constructions, as in (36). Bayer (2001) shows that such structures create interesting problems for generative syntax.

(36) sa              ṛtam            abravīt
     that.NOM.SG.M   oath.ACC.SG.N   say.IMPF.3SG
     yathā       sarvāsv        eva    samāvad   vasāni           + iti
     so (that)   all.LOC.PL.F   EMPH   equally   dwell.SUBJ.1SG     QUOT
     ‘He swore an oath “I will dwell among all (of them) equally”.’ (Maitrāyaṇi Saṁhitā 2.2.7)

Finally, it is worth mentioning that, as in many languages, the use of certain syntactic phenomena depends strongly on discourse or text-type. Gonda (1942) notes that subject personal pronouns (commonly omitted) are used frequently in dialogue. Jamison (1991) observes that in Vedic-Prose dialogues, deictics such as ayam ‘this (here)’ are used more frequently than simple demonstratives such as saḥ ‘that’. Hock (1997b) notes greater frequency of extraposition to the right in hymnal poetry and dramatic dialogue than in Vedic Prose and post-Vedic fable literature, both of which are less given to poetic flourishes. Wallace (1984) finds extraposition of subjects to be especially common in imperatival structures and explains it as politeness-conditioned downgrading. The interaction between discourse/text-type and the degree to which particular syntactic phenomena are utilized deserves further research.

1.3.1.5.2. Middle Indo-Aryan

1.3.1.5.2.1. Phonology

Beyond some traces in Pali, external sandhi drops out. As a result of widespread loss of intervocalic consonants, vowel hiatus increases dramatically, especially in Apabhraṁśa which offers structures such as aṇurā.i.u ‘attached (NOM.SG.M)’. A curious phenomenon, apparently related to the Two-Mora Conspiracy, is the fact that sequences of V̄C and V̌CC are treated as equivalent, such that old V̄C may change to V̌CC and vice versa; see e.g. (37).

(37) Sanskrit   Pali I   Pali II
     nīḍa       nīḷa     nĭḍḍa     ‘abode’
     hărtum     hātuṁ    hăttuṁ    ‘to hold’

Prakrit and Apabhraṁśa exhibit a strong tendency for merger of n and ṇ, and a redistribution such that n occurs in strong position (initially and in geminates) and ṇ in weak (intervocalic) position; a similar redistribution has affected l (l-, -ll- vs. -ḷ-); see Turner 1924: 219–222, and Masica 1991: 192–193, both of whom also discuss the repercussions of these changes in Mod. IAr.

1.3.1.5.2.2. (Morpho-)Syntax

Like the Sanskrit structure in (15), some MIAr. structures have been claimed as ancestors of Mod. IAr. compound-verb constructions; see e.g. (38a) where the use of adāsi is compared to the benefactive use of, say, Hindi denā in similar constructions (Hook 1993). As in Sanskrit, a literal interpretation is possible, too (38b). Slade (2013) considers the evidence inconclusive.

(38) so              tassā           saddaṁ           sutvā
     that.NOM.SG.M   that.GEN.SG.F   sound.ACC.SG.M   hear.CVB
     assamapadaṁ          ānetvā
     hermitage.ACC.SG.N   take.CVB
     aggiṁ           katvā      adāsi
     fire.ACC.SG.M   make.CVB   give.PST.3SG
     a.  ‘He heard her crying, took her to the hermitage, and made a fire (for her benefit).’
     b.  ‘He heard her crying, took her to the hermitage, made a fire and gave (it to her).’
     (Jātaka 1.296.10; cited from Hendriksen 1944: 134)

A comprehensive study of quotatives and related structures in MIAr. is still a desideratum. Meenakshi (1986) finds an Aśokan antecedent for the Mod. IAr. complementizer ki/ke in the kiṁti of structures such as (39), assuming the interpretation in (39a). This, however, does not explain the use of the form kiṁti, whose literal meaning is ‘what (unquote)’, nor the final quotative ti (< Skt. iti), nor the fact that structures of this sort always employ a modal verb form (optative). An alternative, literal reading (39b) is equally possible, under the assumption that ‘what’ is a rhetorical device of the type common in instructional discourse (establishing something like a “staged dialogue”). Most important, kiṁti fails to explain the phonology and geographical distribution of Modern ki/ke. The variation ki : ke [kē] is not explainable through derivation from kiṁti but follows naturally from different nativizations of Persian kĕ — given the absence of short ĕ in languages like Hindi-Urdu, the short ĭ captures the quantity of kĕ, the long ē the quality. And as P. Marlow (1997) shows, the maximal geographic distribution of ki/ke (and related forms) coincides roughly with the maximal extension of the Persian-dominant Mughal Empire. (There do not seem to be any post-Aśokan MIAr. reflexes of kiṁti.)

(39) tatta       icchitaviye        tupphehi
     therefore   desire.GERUNDIVE   you.INS.PL
     kiṁti   majjhaṁ                 paṭipādayemā       ti
     kiṁti   impartiality.ACC.SG.N   practice.OPT.1PL   QUOT
     a.  ‘Therefore it is to be desired by you that you practice impartiality.’
     b.  ‘Therefore it is to be desired by you. What? “May we practice impartiality.”’
     (Aśoka Kalinga Dh. 1)

An interesting counterpart to the syntactic “blend” in (39) is found in Jaina Mahārāṣṭrī, where cited discourse may be marked by initial jaha (< Skt. yathā), final ti (< iti), or both (40).31

(40) vinnaviyaṁ             jahā        deva            eehiṁ
     say.ta.PTCP.NOM.SG.N   so (that)   lord.VOC.SG.M   this.INS.PL.M
     savvo          vi     logo             viṭṭalio                    tti
     all.NOM.SG.M   EMPH   world.NOM.SG.M   dirty-up.ta.PTCP.NOM.SG.M   QUOT
     ‘… said “Lord, these have dirtied up the whole world.”’ (from Jacobi 1886: 2, line 5–6)

31 Another option is no marking at all.

Quotative marking persists into early Apabhraṁśa. For instance, Jacobi’s edition of the Sanatkumāracarita (1921) offers numerous examples of post-discourse tti (< iti), as well as an apparent (incipient) replacement ia (< evam ‘so, thus’?); see also S. K. Sen 1973: 31, 48, and elsewhere. This is in marked contrast to the earliest Mod. IAr. texts (other than western ones), which seem to lack quotative marking.

1.3.1.6. Resources

General: Altindische Grammatik (= Wackernagel 1896 (2nd ed. 1957), 1905, Debrunner & Wackernagel 1930, Debrunner 1954), Renou 1961, Whitney 1889; Bubenik 2003, Cardona 2003b, v. Hinüber 2001, Jamison 2008a, 2008b, Masica 1991, Oberlies 2001, 2003, S. Sen 1960, Tagare 1987, Vaidya 1941
History: Bloch 1965, Bubenik 1996, 1998, Burrow 1955, Edgerton 1946, Kobayashi 2004, Mansion 1931, Renou 1956, 1957
Sociolinguistics: Hock & Pandharipande 1976, 1978
Phonology: Wackernagel 1896 (2nd ed. 1957), Allen 1963, Emeneau 1958
Morphology: Wackernagel 1905, Debrunner 1954, Debrunner & Wackernagel 1930, Cardona 2007
Syntax: Deshpande & Hock 1991 (bibliography), Bubenik 1998, Delbrück 1888, Hendriksen 1944, Hock (ed.) 1991, S. Sen 1953, R. A. Singh 1980, Speijer 1886, 1896
Dictionaries: Böhtlingk & Roth 1855–1875, Davids & Stede 1931, Mayrhofer 1956–1976, 1986–2001, Monier-Williams n.d., Turner 1962–1969, Trenckner et al. 1924–1948

1.3.2. Modern Indo-Aryan
By James W. Gair

1.3.2.1. General introduction

The Modern Indo-Aryan languages are spoken primarily in the Indian subcontinent, but are also the majority languages on the nearby islands of Sri Lanka, the Republic of the Maldives, and Minicoy (a territory of India). On the subcontinent, they range from the Pakistan-Afghan border on the west, to the eastern border of Assam, including Bangladesh, and on the north, they extend to the lower ranges of the Himalayas, including Nepal. To the south, they are bordered by Dravidian languages, marked roughly by the northern borders of Karnataka and Andhra Pradesh/Telangana. (See Map 1.1 and the detailed maps in Breton 1997, especially chapters 7–9.) One important language, Urdu, is not associated with any specific locale, but is spoken throughout India, primarily by Muslims. It is also the chief official language, though not the majority language, of Pakistan, where it is widely spoken as a second language (Breton 1997: 74, plate 13).

Map 1.1: The Indo-Aryan languages of South Asia (produced by Suresh Kolichala, 2015)

There are some Indo-Aryan language islands in the Dravidian south. A notable one is Dakkhini Hindi-Urdu in Telangana/Andhra Pradesh, resulting from Muslim rule in the Hyderabad area in the 15th to 17th centuries. Other languages resulted from migration, often economic, such as Saurashtri (also called Sourashtra) in Tamil Nadu. Konkani, the official language of Goa, is also spoken in several dialects in Karnataka and northern Kerala. Vaagri Boli (among other names; Varma 1970) is spoken by several small groups, commonly seminomadic or peddlers, in Tamil Nadu, Karnataka, and Maharashtra.

Indo-Aryan languages account for nearly 80 percent of the multitudinous languages of the subcontinent as a whole, including Sri Lanka, Nepal, and the Maldive Islands, along with India, Pakistan, and Bangladesh (Breton 1997: 182). They also include fifteen of the twenty-two official state and national languages recognized in the Constitution of India, as well as the main official or majority languages of Pakistan, Nepal, and Bangladesh.

There are also some Indo-Aryan languages outside the subcontinent resulting from older population movement, including Romani (1.3.2.8). Recent, widely scattered outliers are mainly the result of the South Asian diaspora largely connected with the export of Indian labor in colonial times, some of which have developed specific varieties such as Fiji Hindi, and varieties of Caribbean Hindustani in Surinam, Guyana, and Trinidad and Tobago (Mesthrie 2007).

1.3.2.2. How many Indo-Aryan languages?

The question of how many Indo-Aryan languages there are is impossible to answer confidently, for a number of reasons. The distinction between language and dialect is notoriously difficult to draw in language studies, and the question cannot be answered even in the best of circumstances without making special, often arbitrary, assumptions or decisions as to the criteria that count, such as restricting the designation “language” to varieties that have literary traditions or official status. In the Indian context described, it becomes truly impossible. (In relation to Indo-Aryan, see the account in Masica 1991: 23–30, Shapiro & Schiffman 1981: 16–69.) First of all, there are for the most part no “natural” boundaries between languages and groups of speakers. While maps appear to show language boundaries clearly, the situation on the ground for most of the contiguous Indo-Aryan speaking area of South Asia is a continuum rather than a patchwork quilt.
Throughout the area there are local speech varieties such that a person walking eastward from Pakistan to Assam, or in any other direction, would find the speech of each village mutually intelligible with that of the next, but at some point in any segment would encounter a variety which was not mutually intelligible with that of the starting point. Superimposed on that picture are local or regional dialects, which may or may not be recognized and named. Superimposed on those, geographically, are still broader regional languages, and languages with literary traditions and/or official status for administration and education. The father of the Linguistic Survey of India (LSI), Sir George Abraham Grierson, was well aware of the problem, remarking in the LSI (1: 30–31, quoted in Singh and Manoharan 1993: 17):

The identification of the boundaries of a language, or even of the language itself, is not an easy matter. As a rule, unless they are separated by great ethnic differences, such as a range of mountains or a larger river, Indian languages gradually merge into one another and are not separated by hard and fast boundary lines.


James W. Gair

Furthermore, the situation regarding names and affiliations, especially in different sources, is a complex one. For example, the category of Bihari was created by Grierson for the LSI to include Maithili, Magahi, and Bhojpuri, a designation that persisted through the 1961 census, in which it included those three together with 31 other languages.32 In the 2001 census, however, Magahi and Bhojpuri were listed separately and incorporated under Hindi, while Maithili, a language with a literary tradition, now a constitutionally recognized language, was given separate status. Similarly, Grierson coined the term Rajasthani to include a number of languages/dialects in Rajasthan and nearby areas. The term has had a checkered history in terms of what it included and its proposed internal and external affiliations (Masica 1991: 441, 451–456). In the 1961 census Rajasthani included Marwari and Mewati, but in the 2001 census, these two were incorporated into Hindi along with the now much smaller Rajasthani, and the category of Bihari disappeared. In short, the whole question of names of specific languages or dialects in South Asia is a kaleidoscopic one. Language names may be defined by scholars or political entities, but where the information is “bottom-up”, i.e., by speakers naming their own languages, the results are subject to many factors, including region, caste, religion, or political affiliation. Languages often have alternate names, sometimes many, depending on the source. 
Names may also be a product of pressure from speakers’ movements for different status or autonomy, so that a given variety may have a daunting number of designations, whereas a single name may apply to multiple clearly distinct ones.33 Aside from the question of the lack of a clear agreement on language versus dialect, when we turn to current information using census data both within and across political entities, the method of collecting information differs from country to country, and even from time to time in a single census series, and data is not directly comparable with relation to such terms as “mother tongue” and “official”. Masica, who provides a clear brief account (1991: 30–31), correctly describes the nomenclature situation as a ‘boulder-strewn path over which one must pick one’s way carefully.’

32 To add further complexity, the 1981 census established a cutoff of 10,000 speakers for listing a language, and this applied not only to several of the languages under Rajasthani, but to numerous others which thus lost official existence as independent languages.

33 The list of named dialects and varieties in Masica 1991: 420–445 is instructive, and Ethnologue (www.ethnologue.com) lists 221 Indo-Aryan languages with their alternate names. Helpful lists may also be found in Breton 1997 and Cardona 1974.


1.3.2.3. Hindi-Urdu

Something must be said here about the modern standard form of Hindi (also known as Khaṛī Bolī),34 Urdu, and their relationship. At the spoken level, Hindi and Urdu are essentially the same language, sharing a grammar that differs only in relatively minor respects and having a high degree of mutual intelligibility; but they become more distinct in relation to formality or political/official status. The main differences are in lexicon and script, with Hindi written in the Devanagari script and Urdu in a Perso-Arabic one. Both contain many words, especially in basic vocabulary, that are regular Indo-Aryan derivatives, but Urdu draws its “higher” vocabulary, including official and technical terms, from Arabic and Persian, whereas Modern Standard Hindi makes use of Sanskritic borrowings in adding to the lexicon. The history is rather complex and not without controversy, but essentially both have their origin in a language developed in the region of Delhi, the capital during the several centuries of Muslim rule, and they have absorbed influences from several directions over that time. It has been claimed on the one hand that Urdu was developed by replacing Old Indo-Aryan forms with Perso-Arabic ones, and on the other that Hindi was created by stripping the language of such forms and replacing them with Sanskritic ones (Masica 1991: 29). Both positions are likely overstated, but both also have an element of truth. The labels that have been used for the different varieties, including “Hindi” and “Urdu” themselves, have varied over time, with one or the other being used for the same language (Masica 1991: 29–30, Y. Kachru 2007: 82). Complicating this is the term Hindustani, which was commonly used in the British period for Hindi and/or Urdu, but sometimes also came to refer to a common language combining features of both, and which was proposed by Gandhi and some others, unsuccessfully, as the Union language of independent India.
The 1950 Constitution of India specified Hindi in Devanagari script as the Union language. English was to remain an associate official language only until 1965, but ending its use proved not to be possible, primarily due to resistance from some regions and groups, largely in the Dravidian south, but also from speakers of some Indo-Aryan languages with long and proud literary traditions, such as Bengali (Bangla). Parliament passed acts in 1963 and again in 1967 allowing the use of English as a “subsidiary official language”, and that remains in force. Hindi is also the official language of the states of Haryana, Himachal Pradesh, Madhya Pradesh, Rajasthan, and Uttar Pradesh, as well as in the territory of Delhi and in the Andaman and Nicobar Islands. It is secondary with Gujarati in Gujarat, and with Maithili in Bihar, and is a shared official language in the recently added states of Chhattisgarh, Jharkhand, and Uttarakhand. It also is used in many parts of India as a lingua franca for inter-language communication. Thus Singh and Manoharan (1993: 23) report that in the People of India survey, of the communities reported to be using more than one language for intergroup communication, 66.4% had speakers who knew Hindi, which served as ‘a lingua franca for many communities, both tribals and non-tribals’, and that ‘the tribal communities of central, western and north-eastern regions, irrespective of their linguistic background, use Hindi as a means of intergroup communication while speaking with non-tribals’.

Urdu was made the national language of Pakistan, where it shares official status with English, but it was also given additional official status in India in Jammu and Kashmir, Bihar, Delhi, Uttar Pradesh, and Telangana (whose main official language is Dravidian Telugu).

The term Hindi has been used in a far wider sense than that of Modern Standard Hindi, especially in the Census of India, so as to cover a number of dialects and languages, primarily those of the area referred to by Breton as the “Hindi Belt” (1997, especially 72–73, Plate 12). The Census of 2001, for example, lists 49 named subvarieties under Hindi, plus “Others”. This is actually a far-ranging lot, including a number that would under any interpretation count as individual languages. The assignments of languages to this category have varied over time, and other languages have been incorporated. As mentioned earlier, Marwari and Mewati, along with a shrunken Rajasthani, were shifted to become separate entries under Hindi in the 2001 census, as were Magahi and Bhojpuri.

34 This term also has different uses, and is also applied to a dialect in northeastern Haryana, called “Vernacular Hindōstāni” by Grierson (Masica 1991: 9–10).

1.3.2.4. Grierson’s “Inner and Outer” hypothesis

There is no generally accepted scheme for classifying the modern Indo-Aryan languages into subfamilies, though there have been numerous attempts since the late 19th century, and there is quite general agreement on some subgroups, such as an eastern subfamily of Bengali, Assamese, and Oriya.
However, there is a complication even here since there are shared features with Marathi and Gujarati, which may define a larger grouping.

An ambitious and influential attempt at a classification of the IA languages in pre-partition India, i.e. the greater subcontinental India of the British Empire, was made by Grierson, initially in 1913–1919, in connection with his monumental, pathbreaking work, The Linguistic Survey of India (LSI, 1903–1928), and it has been under discussion, with varying degrees of acceptance, up until the present time. Grierson based his work in part on an 1880 attempt by Hoernle on a more limited set of languages, which proposed two major divisions: Northwestern and Southwestern. In the LSI, Grierson initially grouped the languages into Central, Northwestern, Southern, and Eastern sets, along with a further northwestern Dardic group (described below) that he considered to be outside Indo-Aryan. He then proposed a major three-way grouping into Inner, Mediate, and Outer subbranches, which has led to his work being referred to as an “inner-outer” hypothesis. This is shown in Figure 1.1, which represents a later 1931 revision by Grierson. The quotation marks around some names indicate sets proposed by Grierson that later received major alterations (see 1.3.2.5). Grierson’s knowledge of the languages of India was prodigious for the time, in part due to his pioneering work on the LSI, but much work since has led to challenges and revisions, especially as much data became available from the investigation of lesser-known languages, or in more depth on better known ones.

Figure 1.1: Grierson’s IAr. subclassification (1931 version) (From Masica 1991: 453, figure II.5)

1.3.2.5. Grierson’s names and identifications

The names of language groups used by Grierson as they appear in Figure 1.1 are to a large extent his constructs, to capture his classification of individual languages. Relating them to current languages as they appear on maps is no simple matter. Many of them have been challenged by inclusion or by rejection of languages, especially as new research was carried out on the ground. Even aside from the kaleidoscopic changes in names over time, and the varying status afforded to the languages and dialects, there is the basic problem of the identification of languages, owing in part to the continuum nature of the territory (see 1.3.2.4). Grierson used “Lahnda” to refer to a set of dialects west of Panjabi (now located in Pakistan) that he considered a separate language, sometimes called Western Panjabi, among other names, with several subgroups that have been variously defined (see Masica 1991: 435–456, Shackle 1979). “Dardic” was used to designate a number of northwestern languages in the mountainous area between Kashmir and Afghanistan, including, among others, Kashmiri and a “Kafiri” group. Though Grierson considered the group as a whole to be Indo-Iranian rather than Indo-Aryan, later scholarship (following Morgenstierne 1965) has come to consider it Indo-Aryan with the exception of the “Kafiri” group, now “Nuristani”, generally taken to be not Indo-Aryan but a separate branch of Indo-Iranian (but note Cardona 2003a: 22–25). For further discussion of Dardic and Nuristani see Masica 1991, Morgenstierne 1926, 1973, Strand 1973, 2001, Buddruss 1977a, Bashir 2003. (See also 1.2.2 and 1.4.)

Grierson’s scheme was challenged by another towering scholar in Indian linguistics, S. K. Chatterji, in his 1926 magnum opus The origin and development of the Bengali language, for which Grierson wrote a graceful and admiring preface while noting his disagreement with Chatterji’s criticisms. In an appendix to the introduction to that work, Chatterji presented a detailed analysis of Grierson’s criteria, phonological, morphological, and otherwise. Foreshadowing discussion by later scholars, Chatterji rejected the “outer” grouping, seeing no real evidence for it; and the validity of that group, though rejected by many subsequent scholars, remains a live issue up to the present. Chatterji proposed a geographically oriented scheme with six groups in addition to Dardic, instead of Grierson’s four. In the central position was Western Hindi, constituting a Midland group, while other languages in Grierson’s central group were otherwise assigned. Eastern Hindi became part of the Eastern group, and Marathi, with Konkani, formed a Southern group. (Grierson had put Sinhala with Marathi; but see 1.3.2.7 below.)
Rajasthani and Gujarati formed a new Southwestern group, separated from Grierson’s earlier Central group. Panjabi was classed with “Lahnda” and Sindhi to form an expanded Northwestern group, and a new Northern group included the Pahari languages and Nepali. There were a number of other changes and identifications within groups that went along with this general scheme, but they cannot be enumerated here.35

Grierson himself, while retaining his inner-outer view, made serious revisions, as reflected especially in his 1931–1933 publication On the Modern Indo-Aryan vernaculars. As in Chatterji’s scheme, Western Hindi took on a special status as the Midland Language, replacing the Central Group. It was surrounded on the west, north, and east by an “Intermediate” branch, consisting largely of the old Central Branch minus Western Hindi, but with the addition of the Pahari group and Nepali, which was also linked with Eastern Hindi. The Western and Eastern groups remained essentially unchanged. There were also some reassignments of languages and dialects, but the overall scheme was now simpler: a kind of concentric arrangement, or more accurately an inverted horseshoe, in which the Midland language was surrounded by the outer languages of the Northwestern, Southern, and Eastern branches. The outer chain was interrupted by Gujarati, and thus not completely continuous, a problem for Grierson, who explained it as the result of the intrusion of speakers from the midland into that area at some time in the past (1918: 56–58). Figure 1.1 reflects this scheme. The affiliation of Gujarati has indeed varied with different classifications. Thus Chatterji (1926) considered it to be Western Rajasthani, and Cardona (1974) put it into a Southwestern group with Marathi and Sinhala.

35 See the account in Masica 1991: 450–452, which is part of a larger summary of different subclassification schemes over time (his Appendix II: 446–463).

1.3.2.6. Difficulties with the classification — Summary

It is quite understandable that there is no single general classification of Indo-Aryan languages that has met with general acceptance, and that there are relatively few such subclassifications, given the problems inherent in the task. There are a staggering number of languages and dialects; and with some exceptions, despite the significant work done since Grierson, the necessary detailed research on many if not most of them is lacking once we go beyond the major ones. Despite the great amount of literary material going back for millennia, there are major gaps in the historical record of the Indian subcontinent, and during that time there have been many population movements that have gone unrecorded, bearing with them changes in the distribution of languages. In addition, and perhaps more important, given the general accessibility of regions one to another, there has been throughout much diffusion of features, often in an overlapping or crisscrossing manner across languages and dialects, as well as differential influences from without.
Furthermore, the linguistic landscape continuum referred to earlier included successive layers of varieties encompassing regional standards, lingua francas, official languages, and literary varieties, opening the possibility of vertical as well as horizontal diffusion. That is, even in a single geographic region, there could be changes progressing differently on each of these levels, as well as upward or downward, with, say, some features from a higher level differentially affecting subdialects in a particular region, or the reverse. This would all militate against finding sets of features that would determine clear, non-intersecting classifications. For example, isoglosses for a set of sound changes from Old and Middle Indo-Aryan into New Indo-Aryan that have been advanced for grouping the IA languages are shown in Figure 1.2; for details see Table 1.3, arranged in a grid that has major languages arrayed from left to right in an essentially east to northwest pattern but passing through central and south (non-mainland Sinhala is an outlier).


Figure 1.2: Geographical distribution of sound changes. (Numbers refer to the list of changes in Table 1.3.)

Grierson’s inner-outer hypothesis has, as a whole, been rejected by later scholars, although it can still be found with implied acceptance in some more popular accounts. Indeed, some scholars have doubted the possibility of achieving any definitive overall scheme, in light of the complexities just noted and the commonly opaque character of much South Asian linguistic history with regard to language groupings over time. Thus Masica (1991: 460) was led to remark that

We might therefore be well-advised to give up as vain the quest for a final and “correct” NIA taxonomy, which no amount of tinkering could achieve, and concentrate instead on working out the history of various features, letting such feature-specific historical groupings emerge as they may, with their overall non-coincidence as testimonial to the complexity of the situation.

This agnostic view was recently challenged by Southworth (esp. 2005), who argued for the general accuracy of Grierson’s conclusions on the basis of several kinds of both linguistic and historical evidence. As yet, there has been very little published discussion of this proposal, beyond the brief review by Kulikov (2007).


Table 1.3: Distribution of important sound changes in Modern Indo-Aryan

Languages (columns): As., Ben., Or, Hi., Guj., Mar., Punj, Ka., Sindh, Sinh.

Sound changes (rows):
1. VCC > VVC
2. y, j > j
3. V# > Ø
4. b, v > b
5. s, ṣ, ś > s
6. s, ṣ, ś > ś
7. n, ṇ > n
8. n, ṇ > ṇ
9. kṣ > (c)ch / s
10. NT > ND****
11. i, ī > i; u, ū > u
12. MIA l > NIA ḷ
13. Ch > C

[Cell marks (X, (X), ?) omitted: column alignment lost.]

Notes:
* Clusters were reduced (with retention of /Cr/) in Sindhi, but with original vowel length retained.
** CC was retained in Old Sinhala, with CC > C and compensatory lengthening post-2nd century (Karunatillake 2001: 15–16, 49). The resulting vowel length distinction was lost later, but left traces in long vowel umlaut.
*** Panjabi basically retains the b : v distinction, but also has v > b forms, probably from Central influence (Masica 1991: 202–203).
**** The n : ṇ distinction was retained in Sinhala, until lost ca. 8th century (Karunatillake 2001: 96–98).
**** Voicing of post-nasal stops. Also found in Pahari languages and Nepali, as well as Romani (1.3.2.8).
***** Old Sinhala shows /kk/ (orthographic ), with /s/ appearing after the 8th century, apparently from loan words (Karunatillake 2001: 28).
****** Complete loss of aspiration in Sinhala and Dhivehi. Some languages have partial loss, as in voiced aspirates, or in certain positions.

A single detailed survey of the history of the different Indo-Aryan classification schemes proposed over time is lacking, though one would be welcome, especially if it gathered the (sometimes not very explicit) criteria used by different scholars to support their conclusions. For the time being, an excellent brief account of the major attempts has been provided by Masica (1991) in an appendix (446–462) which I have drawn on heavily here.


1.3.2.7. Southern Insular Indo-Aryan36

Sinhala (Sinhalese), the majority language of Sri Lanka, and Dhivehi (Maldivian),37 the language of the Republic of the Maldives, together with Mahl (also Maliku Bas), the language of the island of Minicoy, clearly constitute a subfamily within Indo-Aryan. Mahl is essentially a dialect of Dhivehi, though spoken on Indian territory (Cain & Gair 2000: 1). Sinhala and Dhivehi, though not mutually intelligible, share a number of features at all structural levels linking them and distinguishing them from their northern relatives (Cain 2000, 2004, Cain & Gair 2000, Geiger 1919, Fritz 2002, Gair 1994, 2011). Sinhala is without doubt an Indo-Aryan language, though earlier scholars, such as Rask, had held it to be Dravidian (see Geiger 1938: vi–xiii, Hettiaratchi 1959: 33–45). The resemblances to Dravidian are the result of over two millennia of separation from its northern kin and influence from Dravidian languages, notably Tamil-Malayalam. Various scholars have placed it with Eastern, Western, Southern, and Southwestern IA (see Karunatillake 1977; Masica 1991, Appendix II: 446–463; De Silva 1979: 13–20). Wilhelm Geiger, a pioneer in Sinhala linguistics, remarked on the uncertainty of classification, noting that ‘it is extremely difficult and perhaps impossible to assign it to a definite place among the modern Indo-Aryan dialects’ (1935: xxiii). One reason for this, as he opined, is that more than one dialect has entered into the history of the language, but more importantly, Sinhala was well established on the island by the 3rd century BCE, as shown by inscriptions, in which it already showed an independent character. Thus it clearly arrived at some earlier time, before the changes like those in Table 1.3 that differentiated the mainland languages.
For example, it retained initial /y/, as opposed to the general /y/ > /j/ of other Middle Indo-Aryan, indicating early isolation. Much of the evidence adduced for western affiliation consists of retention rather than shared change, such as the retention of initial /v-/, which was proposed by Geiger (1938: xi) as a similarity to Marathi, Gujarati, and other western languages (see Karunatillake 1977). Also, thanks to the inscriptional and literary record and the work of scholars such as Wijeratne (1945–1957) and Karunatillake (2001), it is clear that some changes that might appear to be shared with mainland Indo-Aryan groups were in fact independent ones. For example, the simplification of geminates, with compensatory vowel lengthening (VCC > V̄C), dates from some time between the 2nd and the 4th centuries CE.38 The similar change on the mainland took place much later, perhaps the 9th to 10th centuries CE, and was in fact nonoperative in some areas (Turner 1985: 421, Bloch 1965: 91–92). A particularly striking example is the coalescence of /ṇ/ and /n/ into /n/, since the distinction was maintained until the 8th century CE (Karunatillake 2001: 96–98), well after any plausible date of separation from the mainland IA languages.

36 This term is derived from “Insular Indo-Aryan” as coined by Sonia Fritz (2002), with the addition of “Southern” to underscore the location of the languages.

37 The dh in Dhivehi does not indicate aspiration, but is a diacritic to distinguish dental from retroflex in the official transcription system. A similar convention is used in many roman-script renderings of Tamil.

38 Masica (1991: 459) states that Sinhala did not have vowel lengthening before CC > C, but somewhat indirect evidence shows that not to be the case (Karunatillake 2001: 50).

Although proposed affiliations of Sinhala have often been with western, southern, or southwestern Indo-Aryan (Masica 1991: 451–456), perhaps the strongest evidence suggests a non-western origin. A number of the inscriptions have nominative singular in -e, as in Magadhi, prose Ardhamagadhi, and eastern inscriptions. This was noted by Geiger (1935: xx), but there is strong indirect evidence as well, as revealed by Karunatillake (1977 passim, 2001: 67–73, 119–120). Vowel fronting (“umlaut”) of long vowels was triggered by a following /i/ (4th century), but there are also many forms in Sinhala exhibiting fronted vowels for which their OIA or MIA etymons lacked an /i/. A plausible explanation for many if not most of these instances is that fronting was indeed triggered by a following /i/, but that /i/ resulted from an independently attested merger of final /e/ with /i/. Thus /bæta/ ‘paddy’ from */bāti/ < */batte/ (OIA /bhaktam/, Pali /bhattam/; Karunatillake 2001: 70).39 This in turn indicates the presence of more final -e forms than are directly attested in the lithic record. It also gives evidence for the change VCC > V̄C referred to earlier, since VCC was subsequently treated like V̄C (Karunatillake 2001: 49–50). Recently, Hock (2009) argued for an eastern origin on the basis of treatment of s or h + nasal clusters.

The evidence for classification with Dhivehi (Maldivian) is strong. Most obvious is the complete loss of aspiration in both languages; e.g. Sinh. /dunna/ ‘bow’, Dhiv. /duni/ ‘arrow’, Sanskrit /dhanus/; Sinh. /an̆dura/, Dhiv. /an̆diri/ ‘dark’,
Sanskrit /andhakāra/. Given the general persistence of aspiration in Indo-Aryan (except for Nuristani if it is taken as Indo-Aryan), it is perhaps the single most distinguishing feature of Sinhala-Dhivehi phonology vis-à-vis other Indo-Aryan. Note also the development of prenasalized stops in both Sinhala and Dhivehi (as in Sinh. /an̆dura/), certainly not a common change in South Asia. There are no examples of early inscriptions for Dhivehi, and the relative order or simultaneity of settlement with Sinhala is disputed, as is the date of separation (Geiger 1919, De Silva 1970a, 1970b, Maloney 1978, Wijesundera et al. 1988, Cain & Gair 1995, Cain 1997). Geiger (1919: 99) proposed a tenth-century arrival of Sinhala on the Maldives; but evidence based on common and independent changes suggests that Geiger’s tenth-century CE split is too late, while a split before the 3rd or 4th century is too early. For example, the prenasalized stops in Sinhala developed between the 2nd and 3rd centuries CE, and if, as seems likely, this is a shared development in Dhivehi, the languages would not have separated before that. On the other hand, the 8th-century Sinhala coalescence of retroflex and dental nasals was not shared with Dhivehi, so that separation preceded that development. The most detailed treatment, Cain 2000, proposes that Dhivehi began diverging by the first century BCE, and that the existing high degree of similarity is attributable to continuing contact and Dravidian influence on both. One important aspect of the uncertainty here is that seemingly common changes differ in detail. For example, both languages underwent the umlaut of long vowels followed by /–i–/. This is certainly an unusual change in Indo-Aryan in general, and thus suggests a shared change around the 4th century when it occurred in Sinhala. However, as Cain pointed out (2000: 195–196), it affected all long back vowels in Sinhala, but only /ā/ in Dhivehi, and the result was /æ/ in Sinhala, but /e/ in Dhivehi.40 Despite these still unresolved problems, however, the subfamily status of Southern Insular languages remains solid.

Not unexpectedly, given their long isolation from other Indo-Aryan, both Sinhala and Dhivehi exhibit considerable influence from the neighboring Dravidian languages. Although Tamil is usually cited, “South Dravidian” or “Tamil-Malayalam” would be more appropriate, since much of the influence precedes the split of the latter. Thus Sinhala and Dhivehi have become members of a South-South Asia linguistic area, showing a number of features in common, especially in syntax and lexicon (Gair 1994, 2003, 2011, 2012). In syntax, the Dravidian influence is indeed considerable, appearing in numerous features by which Sinhala differs from its northern kin. Most striking is the development of a thoroughgoing left-branching nature, as in Southern Dravidian, along with the adoption of other features such as a special focus construction and a constituent-final question particle.

39 For a fuller account see Karunatillake 2001: 67–73.

1.3.2.8.
“Gypsy” languages — Romani

Romani (sometimes Romany) is the language of the Rom,41 the people commonly referred to as “Gypsies”, generally existing at the fringes of society, working as service providers, small craftsmen, or entertainers, and widely scattered through the Near East, the Balkans, Greece, western Europe, and beyond. The Indian origin of the people and their language was recognized as early as the late eighteenth century, and the language has been studied by an active succession of scholars since (Matras 2002: 2–4; Hancock 1988). A journal, the Journal of the Gypsy Lore Society, now Romani Studies, has been published since 1888, and there is an active center for Romani Studies at Manchester University with an important online presence.

40 The differences are supported by the description of vowel distribution in Fritz 2002: 26–27, who also suggests that there was a change /æ/ > /e/. This is discussed in Gair 2007: 367. The result of the umlaut of /ā/ was written in Sinhala inscriptions prior to the 7th century, but it was clearly not merged with preexisting /e/, but phonemically distinct as shown by later developments in which they were treated differently (Karunatillake 2001: 69).

41 Following the usage of Matras and other current scholars, “Rom” is used here as a collective term for the people. It may also refer to an individual, and then has a plural “Roma”, also used in reference to the people by other scholars.

In 1926, Sir Ralph Turner argued for an origin in the central group of Indo-Aryan languages, primarily on the basis of innovations shared and not shared, and his conclusion has been generally accepted since (Turner 1926b). Turner also argued for separation from the central group as early as the third century BCE, on the grounds that Romani failed to undergo some changes undergone by the central languages that already appeared in Aśokan inscriptions of that time, such as the assimilation in st and ṣṭ clusters: OIA mṛṣṭa, MIA miṭṭha, Hindi mīṭhā, Rom. mišto ‘good’ (Matras 2002: 32); OIA hasta ‘hand’, Pali hattha, Hindi hāth, Rom. v–ast (Matras 2002: 33). In any case, the speakers appear to have remained in the subcontinent for many centuries, and spent some time in the Northwest. Romani does in fact show some Northwestern features such as the preservation of the st and ṣṭ clusters. These similarities led scholars before Turner to suggest a northwestern origin, but his explanation for them was that the separation from the central group took place before those clusters changed there, so that the stay in the northwest (where those changes did not occur) simply allowed their continuance. Leaving the subcontinent about the turn of the first millennium, the Rom traveled to the Byzantine empire, coming in contact with Greek, and acquiring some important Greek features, such as derivational and inflectional morphology and default VO word order, making it “the only Indo-Aryan language that does not show obligatory Object-Verb order” (Matras 2010: 37).
They progressed into Persia, Armenia, and the Slavic areas, and into Europe, where their presence was first noted in the Balkans in the 14th century, arriving in northern and western Europe in the 15th century (Matras 2002: 1–2), and further dispersing widely beyond. This widely accepted scenario has been challenged, and there are even serious scholars holding for an origin outside India (discussion in Hancock 1988: 204– 206), leading Hancock to note that ‘The ultimate origin of the Gypsy population and the date or dates of their separation from the other Indic peoples has still not been fully resolved’ (1988: 195). Within Romani proper, there exist numerous dialects/languages. The “prevailing discourse” (Matras 2010: 55) lists Northern, Central, Balkan, and Vlax (Bakker & Matras 1997, Matras 2002, Elšík & Matras 2006), though some dialects remain outside the scheme (Matras 2010: 55).42

42 Ethnologue does not give Central as a separate group.

James W. Gair

Vlax, formed in Romania, became the most widely spoken (and most studied) subgroup, found generally throughout Europe but also with a significant presence of several hundred thousand in the Americas, notably in Colombia (Ethnologue Report 2009, under Romani, code [rmg]). The most widely spoken Vlax variety is Kalderash, in Romania and beyond. Another well-documented Vlax variety is Lovari, in Transylvania but also in Austria and elsewhere in Europe, including Scandinavia.

1.3.2.8.1. Lomavren, Domari, and other isolated “Gypsy” languages

The term “Romani” has both wide and narrow interpretations. For the major scholars Turner (1926b) and Sampson (1923), Romani had three divisions: European Romani, Domari (the language of the Dom, centered in Syria), and Lomavren (language of the Lom, which has been largely restructured towards Armenian, while retaining significant Rom lexicon). Despite their agreement on the early Central Indo-Aryan origin of all three languages, these scholars differed as to whether their speakers had left India as an undifferentiated group. Turner argued for a split within India, while Sampson (1923) concluded that they had split subsequently, in Iranian territory. More recently, Hancock (1995) concluded that the three groups traveled through Iran separately, and thus would have split while in India. The evidence is complex, and it has been proposed that even the early separation from Central Indo-Aryan was sequential. The debate is still alive (see Matras 2002: 46–48, Hancock 1988 for useful surveys). In any event, later scholars, such as Matras, have considered the three to be separate languages, though displaying cross influence, and restrict the use of the label Romani to the European varieties and their offshoots. There are other isolated Indo-Aryan languages existing outside the Indian subcontinent, in Central Asia, the near East, and Europe, that also are generally spoken by groups of itinerant artisans and entertainers and result from migrations of generally the same periods as those of the Rom. These include Parya, in Tajikistan (Payne 1997, Müller et al. 2010). Another outlier is Dumaki, in Hunza, Northern Pakistan (Lorimer 1939). Both of these have a Central Indian origin. Some, like the Jat in Afghanistan have names corresponding to Indian caste names. 
As Matras concludes (2010: 38), ‘The presence of these various groups outside of India confirms an overall phenomenon of emigration from India and the maintenance of caste-like identity, even after the breakaway from the actual caste-based system of the Indian subcontinent’. Domari and Lomavren, if not subsumed under the wider sense of Romani, can also be considered representative of such languages.

1.3.2.8.2. Para-Romani

There are, in addition to Lomavren mentioned earlier, a number of Rom communities that speak the local language but retain a Romani lexicon, perhaps with some fossilized grammatical features, for intergroup use, a phenomenon for which the term “Para-Romani” has been coined (Cortiade 1991, Bakker & Vandervoort 1991, Matras 1998). There are many examples, including Angloromani (UK), Erromintxela (Basque) and Caló (Spain), Scandoromani, and numerous others (Matras 2002: § 10.9, with examples, and Matras 2010 for Angloromani).

1.4. Iranian43

By Agnes Korn

1.4.1. The languages and research resources

Iranian (Ir.) languages have traditionally been classified chronologically in terms of three periods — Old, Middle, and New Iranian (based on external criteria such as important changes in political history and the cultural-religious field), and dialectally into West and East Iranian (see 1.4.2). While dichotomies like West vs. East Iranian have increasingly become blurred as new material and new interpretations have emerged, the divisions still give a general idea of the historical and cultural context and to some extent also about grammatical features. However, it is important to keep in mind that the terminology originates from a time when a number of Iranian languages were not known yet (including most Middle Iranian languages). Interacting with any classification is a considerable amount of continuity and overlap (parallel phenomena occurring at various points in different Iranian languages), and areal influences both within and beyond the group. For instance, Khotanese and Tumshuqese Saka (followed by Sogdian) are the most archaic Middle Iranian languages and preserve many Old Iranian features, but Khotanese joins Persian in showing innovative “peripheral” characteristics, and displays Indic features in its phonemic system (see 1.4.2.1, 1.4.2.2, 1.4.2.3). Another field of contrast is the discontinuity of transmission and the broadness of the Iranian sources as a whole. Persian is the only Iranian language attested in all three periods (OP, MP, NP; each with a somewhat different dialectal basis),

43 Special abbreviations: CLI = Schmitt (ed.) 1989; DIR = direct case; Ir. = Iranian; Khot. = Khotanese; MP = Middle Persian; NWIr. = North-West Iranian; NP = New Persian; OP = Old Persian; PIE = Proto-Indo-European; SWIr. = South-West Iranian.

while Avestan is only known from the Old Iranian period, and Khotanese and Tumshuqese Saka, Sogdian, Bactrian, and Parthian only from Middle Iranian (Khwarezmian is classified as Middle Iranian although most sources are from the 12th–13th centuries). The modern languages, apart from Persian, do not have an attested predecessor, but share features with one or the other earlier language. On the other hand, Iranian as a group spans some three thousand years of testimonies, and regions ranging from Anatolia far into China (as demonstrated by recent finds of Sogdian testimonies; de la Vaissière & Trombert 2005), and from the Caucasus to Southern Pakistan. Cultural, religious, and other environments have been similarly diverse, including orthodox and unorthodox varieties of Zoroastrianism, Buddhism, Christianity, Islam, etc. With that in view, the overall development of and relations within Iranian are best studied by looking at the whole group together. This contribution attempts to do so, while presenting a necessarily selective picture.44 The lines of investigation on the various Iranian languages and the availability of surveys are rather diverse (for earlier surveys see MacKenzie 1969 and Duchesne-Guillemin 1992, for a more recent one, Tremblay I–III, who focusses on works on Old Iranian and some Middle Iranian from the years 1989 to 2002 which he considers relevant for Indo-European studies). Windfuhr 2009 offers a collection of grammatical surveys (synchronic, with an emphasis on syntax and typology) of a number of Iranian languages, introduced by a chapter on “Dialectology and topics” (Windfuhr 2009a), but many Iranian languages are not represented. Emmerick & Macuch 2009 (bibliographically up-to-date up to ca. 2000, with some later works) surveys Old and Middle Iranian literature, offering chapters on Avestan, Old and Middle Iranian inscriptions, Pahlavi, Middle Iranian Manichean and Christian literature, Buddhist Sogdian, and Khotanese literature. 
To some extent, the Compendium Linguarum Iranicarum (CLI = Schmitt (ed.) 1989), which covers all Iranian languages, thus remains an authority for information on grammar (including diachronic aspects), with bibliographies that were up to date at the time, although the approach of the chapters varies.45 The same applies to the entries of the Encyclopædia Iranica (the online edition at http://www.iranicaonline.org suffers from a certain loss of special characters).

44 Major focus will be on languages of the South Asian area, but some other languages will also be mentioned to permit evaluation of possible areal influences.

45 References will be made to CLI plus page number(s), not to the individual articles.

1.4.1.1. Old Iranian

Old Iranian has chiefly been studied from the perspective of comparative philology, focussing on textual interpretation and etymology, leaving issues like syntax and text linguistics seriously underrepresented. Comprehensive survey articles present the work done (Tremblay II and Skjærvø 1997 on Avestan, Tremblay III: 2–7 on Old Persian). For Old Persian, recent findings have added a few more inscriptions (see Schmitt 2009 and Schweiger 1998), while current research on Avestan focusses on historical philology on the one hand and the history of its transmission on the other. In the latter field, Geldner’s edition (1889–1896) has been rendered outdated in terms of data (important manuscripts, chiefly from Iran, are available today that were not known to Geldner) as well as method (cf. Cantera 2010). Contemporary Avestan scholarship joins the general move in disciplines studying manuscripts to turn away from the attempt to establish “the original text”, approaching instead the manuscripts as documents in their own right to reveal elements of transmission that are important for the textual tradition as well as for the text itself. Thus, ‘a study of individual manuscripts’ is a ‘serious desideratum’ (Skjærvø 2009b: 45) for advances on the current knowledge of Avestan. Accordingly, while internet-based text databases started out by making available (and searchable) the printed text editions (e.g. many of the texts on http://titus.uni-frankfurt.de/indexd.htm?/texte/texte2.htm), technology is developing towards offering photos of manuscripts (e.g. in the Avestan Digital Archive at http://www.avesta-archive.com/ and in the TITUS database at http://titus.uni-frankfurt.de/didact/idg/iran/avest/avestmss.htm). Also available online are some teaching materials such as those by Skjærvø (http://www.fas.harvard.edu/~iranian/).

1.4.1.2. Middle Iranian

The most remarkable advances have been made in Middle Iranian.
Of prime importance is the discovery and edition of far more than 100 manuscripts in Bactrian — transforming the language, previously very poorly known, into a moderately well documented Iranian variety; see Sims-Williams 1997a for a report and 2000–2012 for editions of the texts (with grammatical survey and glossary). The interpretation of these texts has also shed light on the Kushan inscriptions and the Bactrian fragment in Manichean script (Sims-Williams 2008, 2009, 2012a). Evidently, studies on Bactrian grammar have only just started, but reveal linguistically interesting patterns (e.g. Gholami 2009, Sims-Williams 2011a, 2011b). The discovery of new texts has also augmented the corpus of Sogdian texts (Yoshida 2009b: 280–281), but many texts still await an (or, an up-to-date) edition. This is particularly true for the fragments in Brahmi script and for the “Ancient Letters” (cf. Sims-Williams 1996b and 2005, respectively), but editions of other


Middle Iranian material have progressed considerably in recent years. Work on the Middle Persian, Parthian, and Sogdian Manichean fragments in Berlin has resulted in several new editions; for a list of Berliner Turfantexte, see http://www.bbaw.de/bbaw/Forschung/Forschungsprojekte/turfanforschung/de/Publikationen. The now complete volumes containing transcriptions and translations of the ostraca from Nisa (Diakonoff & Livshits 2001) offer this material for study. As for Khwarezmian, an edition of a text in Arabic script (MacKenzie 1990) has been instrumental in furthering research on the language. These works have also yielded new dictionaries for the Manichean material (Durkin-Meisterernst 2004 for Manichean Middle Persian and Parthian, Sims-Williams & Durkin-Meisterernst 2012 for Sogdian and Bactrian, and de Blois & Sims-Williams 2006 for Persian etc.), as well as the onomastica by Lurje 2010 (Sogdian) and Sims-Williams 2010 (Bactrian). These works also contain comprehensive up-to-date bibliographies of text editions and secondary literature. Just as for Old Iranian, Middle Iranian manuscripts have been digitised and made available on the internet. Most prominent is the International Dunhuang Project, which organises the digital images of holdings from Central Asia in various libraries in Europe and beyond (http://idp.bl.uk). Photos of the fragments in the Berlin Academy of Sciences are also available at http://www.bbaw.de/forschung/turfanforschung/dta/. Much of the published text material is also available on TITUS at http://titus.uni-frankfurt.de/indexd.htm?/texte/texte.htm.

1.4.1.3. New Iranian

Research on contemporary Iranian languages has often embraced theoretical perspectives (e.g. on topics mentioned in Sections 1.4.3 and 1.4.4 below), mainly studying Persian, but also e.g. Ossetic.
The series International Conference on Iranian Linguistics aims to bridge the gap between traditional philology, historical linguistics, and theoretical approaches, so far with results published in Karimi, Samiian & Stilo (eds.) 2008 and Korn et al. (eds.) 2011. There is also considerable interest in Iranian minority languages, some of which is linked to recent UNESCO activities concerning cultural diversity and language vitality. Various foundations established funding schemes targeted at the documentation of “endangered languages”, often accompanied by a dedicated archive hosting the data and making them accessible. Among these are the program “Documenting Endangered Languages (DEL)” of the National Science Foundation (USA), the “Endangered Languages Documentation Programme (ELDP)” at SOAS, London (with its Endangered Languages Archive (ELAR) at http://elar.soas.ac.uk/), and the program “Dokumentation bedrohter Sprachen (DoBeS)” of the Volkswagen Foundation (Germany), archived at the Max Planck Institute for Psycholinguistics in Nijmegen (http://corpus1.mpi.nl). Accordingly, documentation of minority languages and their oral traditions has been provided, using modern database technology and often also including a typological perspective. However, many Iranian languages have “too many” speakers to fit into the often rigorous definitions of these funding schemes. As often pointed out by members of the language communities, in view of the absence of any official status, these languages may well be lost in a few generations despite the number of speakers, particularly if efforts to develop a standard language and a standard orthography prove unsuccessful. At the same time, there is in fact a considerable use of minority languages (often in somewhat spontaneous orthographies) in internet fora and online discussion groups. In Iran, research is done either under the title of gūyeš-šenāsī ‘dialectology’ (which exclusively refers to minority languages and dialects belonging to the Iranian language family), chiefly at the Persian Academy of Sciences (cf. Faridi 2001 for a list), or as MA and PhD theses in general linguistics at various universities, mostly written by native speakers (see Naseh 2001).

1.4.2. Isoglosses and historical phonology

1.4.2.1. Traditional classification and West Iranian dialectology

Genetically, Iranian has been divided into an East and a West subfamily, and to a certain extent also into Northern and Southern subgroups within these two, so that, for instance, Parthian, Talyshi, Southern Tati, Vafsi, Zazaki, Semnani, Gilaki, Kurdish (Kurmanji and Sorani), Balochi, etc., would be North-West Iranian (NWIr.), and Persian, Luri, etc. South-West Iranian (SWIr.). (For East Iranian, see 1.4.2.2.) Sound changes in particular, but to a certain extent also other features, have been used to assign Iranian languages to one of the branches. Isoglosses have been formulated as binary oppositions, checking whether or not a given language shows a certain feature. Despite major advances in knowledge about Iranian languages of all periods, the inventories of isoglosses used in dialectology and classification of Iranian languages have remained rather constant. Those for West Iranian tend to be based on the study by Tedesco (1921), whose aim was the description of the varieties found in the material from Turkestan (nowadays known as Manichean Middle Persian and Parthian). Paul (1998b, 2003) reformulates Tedesco’s features as a scale of “Northwesternness” or “Southwesternness”, on which a language is ‘more or less NW or SW’ (2003: 61), transforming the binary opposition into a scalar parameter. However, this does not change anything about the fact that only points which differentiate Middle Persian from Parthian are considered (while features in which Middle Persian and Parthian happen to agree are rarely included as isoglosses), and that the position of a language is measured in terms of its agreement with one pole, and its difference from the other.
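As a rough illustration of the scalar reading described above, the following Python sketch scores a variety by the share of isoglosses on which it agrees with the Parthian (“NW”) pole rather than the Middle Persian (“SW”) pole. The isogloss labels, the reflex codes, and the variety X are hypothetical placeholders, not Tedesco’s or Paul’s actual inventory, and the simple proportion is an assumption made here for illustration rather than Paul’s own measure.

```python
from typing import Dict

# Hypothetical isogloss inventory (placeholder labels, not an actual feature list).
# Each isogloss records the reflex found at the "NW" pole (Parthian) and at the
# "SW" pole (Middle Persian); by construction the two poles always differ.
ISOGLOSSES: Dict[str, Dict[str, str]] = {
    "iso_1": {"NW": "a", "SW": "b"},
    "iso_2": {"NW": "c", "SW": "d"},
    "iso_3": {"NW": "e", "SW": "f"},
    "iso_4": {"NW": "g", "SW": "h"},
}


def northwesternness(reflexes: Dict[str, str]) -> float:
    """Return the share of pole-matching isoglosses on which a variety is NW.

    Reflexes that match neither pole are ignored, which mirrors the limitation
    noted in the text: only the two poles define the scale.
    """
    nw = sum(1 for iso, r in reflexes.items() if ISOGLOSSES[iso]["NW"] == r)
    sw = sum(1 for iso, r in reflexes.items() if ISOGLOSSES[iso]["SW"] == r)
    if nw + sw == 0:
        raise ValueError("no reflex matches either pole")
    return nw / (nw + sw)


# Hypothetical variety X: NW reflexes on three isoglosses, SW reflex on one.
variety_x = {"iso_1": "a", "iso_2": "d", "iso_3": "e", "iso_4": "g"}
print(northwesternness(variety_x))
```

Under this toy scoring the two poles themselves come out as 1.0 and 0.0 by definition, which makes the circularity noted in the text easy to see: a feature shared by both poles cannot enter the inventory at all.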


Another blind spot in the discussion is the question of which isogloss is meaningful at all, i.e. whether it is a typologically marked feature, or whether agreement in a certain feature represents a shared archaism or a shared innovation, each with quite different implications (Korn 2003). Similarly, isoglosses are not on the same chronological level since some date back to Old Iranian, such as OP θ, d/δ vs. Avestan s, z (PIE *ḱ, *ǵ(h)) and OP ç vs. Avestan θr (PIE *tr), and others to later stages (e.g. Old Iranian postvocalic č > Middle Persian z, but č/ǰ in Parthian; Korn 2010). One recent work that tries to evaluate the markedness of isoglosses is the study by Tremblay (2005a), who questions the unity of Proto-Iranian. Typically Iranian changes that seem to postdate Proto-Iranian include the change of PIE *s to h (cf. also Schmitt 2000: 14–15, referring to Hintze 1998), the fricativisation of stops preceding a consonant (e.g. *pr > fr; see also W. Hock 2006), and the loss of aspiration (PIE *b/bh, *d/dh, *g(u̯)/g(u̯)h > Ir. b, d, g). The output of the PIE laryngeals likewise differs in some environments (CLI 7); and for PIE *-o/es (> *-ah), there are -ō and -ē (vel sim.) both in Indo-Aryan and Iranian. A controversial issue is whether PIE l may have been preserved (not changed to r) in some Iranian variety or varieties (CLI 10, Schwartz 2008, but cf. Mayrhofer 2002). While not all of Tremblay’s points appear equally convincing, Proto-Iranian as the putative predecessor of all Iranian languages does seem less well established than Proto-Indo-Iranian; probably, then, ‘the Ir. languages formed a sprachbund which converged to an extent that it became difficult to distinguish from a genetic family’ (Tremblay 2005a: 687). The Old Iranian varieties still seem to have been a continuum of dialects.
Despite attempts to establish Avestan as specifically East Iranian, it ‘shows few if any of the distinctive characteristics of the later Eastern Iranian languages’ (Sims-Williams 1996a: 649), and insofar as it does, these features may belong to textual transmission rather than to the language itself (CLI 28). Also, the features noted in 1.4.2.1 in fact do not oppose Old Persian to Avestan, but rather, a few distinctive features mark off Old Persian and probably the predecessor of Khotanese (for whose historical phonology see CLI 210–216) from the main body of Old Iranian, establishing them as peripheral varieties vs. central ones, the latter including Avestan (cf. CLI 27–28, Schmitt 2000: 18). Like Old Persian, but surely independently, Khotanese assimilates PIE *ḱu̯ , giving Saka śś (OP s), e.g. aśśa- ‘horse’ : OP asa- (cf. Tremblay 2005a: 677–679, 684–685; Kümmel 2007: 312–318, 352– 353, and passim, starting with p. 358; and Lipp 2009 for the development of the PIE palatals). Wakhi, a modern Iranian language of the Pamir and Northern Pakistan, shares this development (e.g. yaš ‘horse’), but cannot be a direct descendant of Khotanese (cf. Wendtland 2009: 174 and CLI 375 for more details). Recent research has also questioned the widely held assumption that Median (the putative source of the Iranian loanwords in Old Persian) is characterised by the (single) change of PIE *su̯ - (> Proto-Ir. *hu̯ -) to f-, supposedly seen in


farnah- ‘glory’ and its derivatives. But the f- in farnah- may be due to some specific phenomenon (Skjærvø 1983b, Lubotsky 2002: 191–195), so there is no distinctive characteristic of Median, which instead is likely to have been one of the central varieties, not showing the specific developments of Old Persian or Khotanese. This renders a designation of one or the other subgroup of West Iranian as “Median” (thus e.g. Borjian 2009) rather misleading. It also demonstrates that Median cannot be the predecessor of Kurdish (MacKenzie 1999.II: 675–676) or of Balochi: these languages share the Persian (“SWIr.”) features of PIE *tr (> OP ç) > s(s) vs. Median θr, and of PIE *ḱu̯ > OP s vs. Median sp (aspa- ‘horse’) (MacKenzie 1999.II: 675–676 pace Gershevitch 1992; Korn 2005b: 89–91 vs. Paul 1998b: 170), although they are classified as “NWIr.” in terms of their treatment of the PIE palatals (*ḱ > s etc., see above). The agreement of some languages with Persian for one or the other feature traditionally described as “SW Iranian” includes features from various periods. We are thus not faced with a steadily increasing Persian influence on neighbouring languages (pace the impression that might be suggested by the diagram in Paul 1998b: 170). Rather, Persian influence is likely to have had the form of waves, which could quite well be linked to the periods of political power of the Achaemenid, Sasanid, and later empires.

1.4.2.2. East Iranian and Middle and Modern Iranian dialectology

Phonological changes opposing the main body of Iranian to the periphery continue in Middle Iranian. In a development beginning in late Old Iranian, the voiced stops yield fricatives postvocalically (b, d, g > β, δ, γ).
East Iranian (important members being Khotanese and Tumshuqese Saka, Sogdian, Khwarezmian, Bactrian, Yaghnobi, Ossetic, Pashto, and the Pamir languages) has been said to show the change also in word-initial position, but this isogloss does not include the whole group: word-initial stops are preserved in Parachi and Ormuri; Ossetic has b-, d-; and Khotanese at least g- (discussion in Wendtland 2009: 175 and Kümmel 2007: 289–294, 441). In a parallel way, Old Iranian postvocalic voiceless stops change to voiced ones in West Iranian (p, t, k > b, d, g, cf. Korn 2005b: 323–327), but not in Balochi, while Khotanese also shows the change. Other alleged East Iranian isoglosses do not cover all of East Iranian either (Sims-Williams 1996a: 650, Wendtland 2009): the supposedly typically East Iranian change ft, xt > βδ, γδ does not take place in Yaghnobi and Ormuri, maybe only partially in Sogdian, and the outcome is t in Khotanese and Parachi. A second group of features does not delimit East Iranian, since it is also seen in West Iranian, viz. the preservation (or reversal to t) of Old Ir. θ, which also takes place in Balochi (Korn 2005b: 81, 326–328), but not in Khotanese and Bactrian (where intervocalic θ gives h as it does in Middle Persian and Parthian). The change of Old Ir. č, ǰ [ʧ, ʤ] to [ʦ, ʣ] (sometimes with further change to s, z, CLI 168) is seen e.g. in Khotanese


(CLI 213), Bactrian, and Pashto, but does not occur in Sogdian, Yaghnobi, Yidgha-Munji, and Parachi, while it is seen in some dialects of Fars (Salami 2004: 35–36) and of Zazaki. The ‘numerous vocabulary items’ still deemed to be ‘attested exclusively in Eastern Iranian’ by Sims-Williams (1996a: 651) have in the meantime also been found in one or the other West Iranian language (e.g. *abi-ar- ‘find’, *gari- ‘mountain’, *kuta/ī- ‘dog’). Indeed, as pointed out by Sims-Williams (1996a: 651), ‘if one reconstructs “Proto-Eastern Iranian” in such a way as to account for all the features of the group, the result proves to be identical to the “Common Iranian” reconstructible as the ancestor of the whole Iranian family’, and the absence of Proto-Eastern Iranian precludes East Iranian from being a genetic unity. Shared features are thus more likely to be due to areal phenomena and to ‘result from centuries of contiguity, during which features spread from one language of the group to another and neighbouring languages supported each other in the retention of shared features’. The same holds on a smaller scale: the Pamir languages (Wakhi, Yidgha, Munji, Yazghulami, Shughni, etc.) are likely to be a sprachbund as well (see Wendtland 2009 for a detailed account). At any rate, given the absence of a common ancestor, the subgrouping of East Iranian heavily depends on an author’s assessment of which isoglosses are more important than others, introducing a certain amount of circularity into the argument. One particularly unclear point is the status of Parachi and Ormuri, which have been held to be West Iranian by some (e.g. V. A. Efimov 2009), but are mostly considered East Iranian. If so, they might be said to constitute a “South-Eastern” group while the remaining East Iranian languages would be “North-Eastern”. Alternatively, “North-Eastern” could be limited to Ossetic and Yaghnobi, leaving the others as South East Iranian.

1.4.2.3. External contact

At various points in their history, Iranian varieties have adopted phonemic or morphological traits of neighbouring languages. For instance, several Iranian languages spoken within the Indic sphere have retroflex consonants (not only in loanwords), including Khotanese, Pashto, some other modern East Iranian languages, and Balochi. Ossetic has acquired a set of glottalised stops, which are typical for Caucasian languages (Thordarson 2009: 66–67). The loss of fricatives in Balochi (which have re-entered Eastern Balochi secondarily), and probably also in Khotanese (discussion in Kümmel 2007: 289–291, 441) may also mirror contact with Indic languages. The strengthening of sounds in word-initial position in Khotanese corresponds to a lenition in word-internal position, which ‘is stronger than in any other Middle Iranian language’ (Kümmel 2007: 290), e.g. *tanθra- > ttāra- ‘dark’, incidentally another parallel to Persian (cf. NP tārīk).

1.4.3. Morphology and change of morphological categories

1.4.3.1. Nominal and pronominal morphology

Nominal and verbal morphology underwent a major refashioning in the Middle Iranian period. Case syncretism (see Table 1.4) starts out in late Old Iranian and results in a one-case system (Type V) in later Middle Persian, later Bactrian texts, and some New Iranian languages. However, numerous New Iranian languages have preserved the two-case system (Type IV) and the use of the oblique case (OBL) for direct and indirect objects, possessor, and the agent in ergative constructions (see 1.4.3.3). Others add new case markers. For instance, the shift of the OBL marker to a neo-genitive and the introduction of a marker -ā for the other oblique functions may be an innovation common to Gilaki and Balochi (Korn 2005a). Shughni and Wakhi share a neo-dative whose marker derives from *arda- ‘side’; and the Ossetic case system has been adjusted to that of neighbouring Caucasian languages (CLI 382) by grammaticalising various postpositions, yielding nine cases (Weber 1980, and see Stilo 2009 for a survey).

Table 1.4: Case syncretism in Iranian (simplified)

Pronominal clitics are preserved in most Middle Iranian and New Iranian languages (exceptions include Zazaki and Kurmanji), and are widely used in all oblique functions. Notably, they are the only unambiguously OBL nominal forms in Iranian varieties that have lost all case distinctions (cf. also 1.4.3.3). East Iranian languages, which preserve many word-final syllables, often also preserve inflectional plural marking while West Iranian languages, characterised by loss of word-final syllables, show new agglutinative plurals. But again, both types are also found in the respective other group. The OBL.PL (deriving from *-ānām or *-abiš, see Table 1.4) is reinterpreted as PL suffix in various New Iranian languages. Elsewhere, an abstract or collective suffix is used as PL marker, e.g. -hā (< abstract suffix *-iyaθwa; Sims-Williams 2004: 539, quoting earlier accounts, cf. also CLI 155), and -t (< abstract-collective *-tā, Sims-Williams 1979: 337, 1990: 276), which is attested in Sogdian, Yaghnobi, Ossetic, and Yazghulami (Thordarson 2009: 116), while -išt is also found (Sogdian, Wakhi). In Pashto and other East Iranian varieties, -hā and/or -ān coexist with other PL markers (Thordarson 2009: 181, CLI 379, Wendtland 2009: 178–179). The Old Iranian dual is found only in isolated forms later on. In Sogdian, Khwarezmian, and possibly Pashto, it is generalised as a form following numerals (hence termed “numerative”; Sims-Williams 1979: 339–341). Similarly, the neuter tends to be lost already in Sogdian and Khotanese (Sims-Williams 1990: 275, Emmerick 2009: 384), and completely in the later languages. There is no trace of gender in attested Middle West Iranian, but masculine and feminine gender must have existed in the (unattested) predecessors of Kurdish (MacKenzie 1954), Zazaki, Semnani, etc. Gender is also preserved in many East Iranian languages, but Bactrian agrees with Sorani, Balochi, etc., in losing gender (Sims-Williams vol. 1: 40). As some categories are lost, principles of case assignment change as well. While syntactic function determines case in the Old Iranian and the more archaic Middle Iranian languages (e.g.
Avestan uta druuā̊ aspəm viste ‘the wicked one obtains the/a horse’, JamaspAsa 1982: § 82), case marking later on ‘depends on inherent semantic properties of the object (animacy, person) or its referential status (definite, indefinite, specific, non-specific)’ in most New Iranian languages (Bashir 2008: 52), i.e. objects receive case according to “Differential Object Marking” (Bossong 1985). For instance, definite (or “identified”, by other analyses) direct objects ([±animate]) are marked with the clitic -rā in New Persian (< OP rādiy ‘on account of’), while the generic noun is used for indefinite / unidentified objects; e.g. asb-Ø mībīnam ‘I see a horse/horses’ (unmarked for case and number) vs. asb-rā mībīnam ‘I see the (a specific) horse’ (cf. e.g. Paul 2008 for further details). In Ossetic (Thordarson 2009: 131–140), and apparently also in Yaghnobi and in some Western Iranian varieties, only animate identified direct objects are marked while other objects are unmarked. The same differentiation may be achieved by the use of adpositions (see e.g. Sims-Williams 2011a for Bactrian). Differences in case assignment depending on animacy etc. are also found for indirect objects and the ergative subject, so that one may speak of “Differential Argument Marking”.
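The New Persian pattern described above can be sketched as a small rule. The Python snippet below is a simplified illustration only: the function names and the boolean `identified` flag are assumptions introduced here, and the code reproduces just the two example clauses cited in the text rather than a full account of -rā.

```python
def mark_direct_object(noun: str, identified: bool) -> str:
    """Attach the clitic -rā to an identified (definite/specific) direct object.

    Unidentified objects stay as the bare generic noun, shown here with -Ø.
    This toy rule ignores the competing analyses of what triggers -rā.
    """
    return f"{noun}-rā" if identified else f"{noun}-Ø"


def i_see(noun: str, identified: bool) -> str:
    """Build the example clause 'I see X' with mībīnam 'I see'."""
    return f"{mark_direct_object(noun, identified)} mībīnam"


print(i_see("asb", identified=False))  # 'I see a horse/horses'
print(i_see("asb", identified=True))   # 'I see the (a specific) horse'
```

A fuller model would replace the single flag with the animacy and referential-status properties mentioned in the quotation from Bashir, since languages such as Ossetic mark only objects that are both animate and identified.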


1.4.3.2. Verbal morphology

As in the nominal system, some verbal categories are going out of use already in Young Avestan and in Old Persian (thus the aorist and perfect). A new system emerges, which is based on the dichotomy of a present stem (mostly deriving from the various Old Iranian present stem formations) vs. a past stem (from the verbal adjective in -ta-), resulting in some synchronically unpredictable allomorphy. Secondary past stems are also frequent; these replace opaque past stems (e.g. New Persian present stem rōy- ‘grow’ → past stem rōy-īd vs. older rust) and derive past stems from denominatives, borrowings, etc. Bactrian shares the use of *-āt- (a suffix resulting from metanalysis, cf. Durkin-Meisterernst 2000: 81–87) with Sogdian and a subgroup of NWIr., including Parthian, Zazaki, and Vafsi, while Khotanese has *-ita- (an inherited suffix common in causatives; cf. Korn 2009), as do Persian, Balochi, and Kurmanji; another suffix is -ist (Paul 2003: 67–70, Bartholomae 1915). Many Middle and New Iranian transitive present stems derive from causatives, while intransitives derive from intransitive presents and from stems with the “inchoative” suffix -s- (see Weber 1970). So there are numerous pairs such as Khotanese hamīh- ‘change (TR)’ (< *fra-maiθaya-) vs. hamäh- ‘change (INTR)’ (< *fra-miθa-, Emmerick Forthcoming: 5.3.1), Balochi sōč- (< causative *sauč-aya-) ‘burn (TR)’ vs. suč- ‘burn (INTR)’. Historically intransitive verbs may also be used in middle or passive meaning. This includes the inchoative (which ‘is typically seen as an action that happens all by itself’, Stilo 2004: 240), thus e.g. stems suffixed with -s- in Sogdian (e.g. yγwsty ‘is taught, learns’ vs. ywc- ‘teach’, Gershevitch 1954: § 826) and Khwarezmian (Durkin-Meisterernst 2009: 349–350). Like other inherited categories, the aspect opposition of the PIE present and aorist stems is subject to loss in late Old Iranian.
A new aspect opposition is found in various New Iranian languages, often matching aspectual systems in neighbouring languages. Various prefixes are used to mark the imperfective aspect: mī- in New Persian (generalised in the present tense in the contemporary language; Fritz 1982: 27–28), a- in Sorani and some Balochi varieties, etc. (Jeremiás 1993). Conversely, prefixing marks perfectivity in Pashto (wə-), which uses accent shift for already prefixed verbs (Fritz 1982: 27, CLI 395), and in Ossetic (various directional preverbs, probably on the model of Caucasian languages; Fritz 1982: 27, Thordarson 2009: 67). Several other Iranian languages employ a locatival periphrasis similar to the English progressive form, combining a verbal noun with the copula, as in various Balochi dialects (infinitive in the OBL or present participle) or in Yaghnobi (infinitive with the logical subject in the OBL ). The particle kām is used for the future in Sogdian, Khwarezmian (where it is suffixed to the inflected present tense), and Sistani (Paul 2003: 70) and other varieties, where it is a prefix. With the imperfect, kām marks a counterfactual in Sogdian (Yoshida 2009a: 286–288). It has been suggested that the prefix k-, which is used with certain present stems in Balochi, also derives from kām (Paul 2003: 70).

Agnes Korn

Several Old Iranian MOODS are preserved in Middle Iranian. This generally applies to the subjunctive, which also survives in Ossetic (thus also the optative, Lazard 1992), while other modal categories are expressed by novel formations. One way is the grammaticalisation of particles: the prefix ba/i- is widely used for the future and conditional or the subjunctive, e.g. in Pashto (CLI 395), New Persian (Jahani 2008), etc., sometimes also for a subjunctive past (as in Gilaki and Balochi, combining ba/i- with a suffix -ēn-). The Old Iranian middle VOICE is still functional in Khotanese and Sogdian, although only a few verbs are used both in the active and the middle (Sims-Williams 1994). The old middle endings are used as general endings for the Khwarezmian 3 SG and 3 PL and the Yaghnobi 3 PL (Tremblay 2002). The Old Iranian passive in -ya- survives in some verbal stems in Khotanese (Emmerick Forthcoming: 5.3.1, 6.1.3), Sogdian (Gershevitch 1954: § 540), and Middle Persian (Skjærvø 2009a: 220–221), as well as in some NW Iranian varieties. Other Iranian languages with morphological passives include Sorani (suffix -rV), Zazaki (-i/ey) and Eastern Balochi (-īǰ-, borrowed from Indo-Aryan, Bashir 2008: 61–64). In analytical passives, the most common auxiliary is *baw- ‘become, be’ (as for instance in Sogdian, Parthian, Middle Persian, and Balochi). Otherwise verbs of movement are employed, viz. *čyaw- ‘move forward’ in Christian Sogdian, Ossetic, Pashto, some Pamir languages (CLI 415), Persian (where the meaning has shifted to ‘become’), and Khotanese (besides other constructions, Emmerick Forthcoming: 6.1.3, 2009: 398), and hatın ‘come’ in Kurmanji.

1.4.3.3. Transitivity and ergativity

The category of transitivity becomes increasingly important as a result of the integration of the verbal adjective in -ta- into the verbal paradigm (see 1.4.3.2).
Intransitive verb forms of the perfect/past are composed of the verbal adjective and the copula in much of Middle and New Iranian while various strategies are found for the inflection of transitives. One of these is the selection of a transitive auxiliary as in Sogdian, which uses the verb ‘hold, have’ for this purpose (e.g. ’krtw-δ’rt ‘he did’, see Wendtland 2011 for details), effecting a new transitive inflection (in Khwarezmian apparently generalised to intransitives). Khotanese uses a specific participle formation with the copula for the transitive inflection in the past domain (Sims-Williams 1997b: 322–323, Tremblay 2005b). An ERGATIVE pattern arises by the combination of a transitive subject (agent) in the OBL with the past stem, to which the copula or the verbal endings are suffixed, agreeing with the object (patient) in person and number (in some languages also in gender). This applies e.g. to Bactrian, Parthian, Middle Persian, Pashto, Kurmanji, as well as some Sogdian examples (CLI 189, Yoshida 2009b: 302). In Balochi, some Kurdish varieties, and maybe in Yaghnobi, verbal agreement is limited to the marking of number for a 3 PL patient. Ergativity in Iranian thus conforms to the typological tendency observed by Trask (1979: 388) that if there is a tense/aspect split in an ergative system, it is the past tense/perfective aspect that shows ergativity while the present or imperfective domain patterns nominatively. (Since Iranian shows “morphological” or “surface” ergativity, it is the forms based on the past/perfect STEM, independent of their tense/aspect function, that show ergativity, including modal forms; on the other hand, forms based on the present stem may include past tenses, as e.g. in Sogdian, Yaghnobi, and Talyshi.) It is typologically noteworthy that many Iranian languages show (synthetic or analytic) passives (see 1.4.3.2) alongside ergative constructions. Owing to developments in the nominal and pronominal systems, and the use of Differential Argument Marking (see 1.4.3.1) in most New Iranian varieties, case marking patterns are rarely limited to “purely” ergative vs. nominative types and exhibit all theoretically possible types listed by Comrie (1978: 332), including the “double oblique” type with subject and object both in the OBL, the verb variously agreeing with the subject (as in Vafsi, cf. Stilo 2004: 243–244) or the object (as in Balochi), or with neither of them, as in Talyshi and the Pamir languages (cf. Lazard 2005 for a survey, Wendtland 2008 and Stump & Hippisley 2011 for Pamir constructions, and Scheucher 2006 and Haig 2008 generally). In fact, such “mixed” systems appear to be rather stable (Korn 2008). Pronominal clitics (cf. 1.4.3.1) are widely used in ergative and mixed patterns to index the agent, and enjoy considerable freedom as to which elements of the sentence they may be affixed to, the Wackernagel position being one option.
In languages that have lost case marking, they are frequently used in addition to an overt agent, and the tendency is that the position after the agent is blocked, so that the pronominal clitics are often affixed to the object and thus serve to identify the arguments (Korn 2008: 257–258, Stilo 2009: 714). In Persian, which has come full circle from NOM-ACC through ergative to a novel NOM-ACC system (and so have some other New Iranian varieties), the use of the pronominal clitics (goft-eš ‘s/he said’, raft-eš ‘s/he went’), indexing the subject in the past domain, is the only reflex of the ergative construction. Other Iranian languages have grammaticalised the pronominal clitics as verbal endings (thus in Sorani; Jügel 2009), or show a coalescence of verbal endings and pronominal clitics (as in some Pamir languages).

1.4.4. Summary and typological points

As discussed in the preceding sections, Iranian languages show various changes of relevance in grammatical categories. Iranian also displays both striking instances of continuity and of far-reaching innovations of grammatical systems, often, but surely not only, under the influence of neighbouring languages (within the Iranian group or beyond). The CONTINUITIES include e.g. the preservation of Proto-Iranian intervocalic voiceless stops in Balochi and in some East Iranian languages, the mobile accent in Pashto, plus numerous inherited lexical items. Morphologically, noteworthy archaisms include preserved pronominal clitics for the plural in Balochi and some other languages, the augment in Yaghnobi, causative vs. intransitive verbal stem pairs, and morphological passives. SYNTHETIC structures are stabilised in various ways: metanalysis yields an irrealis suffix -ōt- and a “precative” (or “optative middle”) suffix -ēt- in Sogdian (Yoshida 2009a: 282); inherited elements may be combined as in the irrealis (prefix be- plus -ēn- suffixed to the past stem) in Gilaki and Balochi, and adpositions and auxiliaries become modal, tense, or aspect clitics or affixes. On the other hand, the overall trend certainly is the substitution of synthetic by ANALYTICAL structures. Various periphrastic constructions arise, e.g. in the case system (see 1.4.3.1), for the passive voice, and also for aktionsart or modal categories (see also 1.4.3.2). Some of these constructions start out from patterns (perhaps idiomatic expressions) seen in Old Iranian. For instance, a periphrasis with ‘stand’ expressing durativity is seen already in Avestan examples such as hištaite dražimnō arǝduuī (Yt 5, 123) ‘Arǝduuī is wearing … (lit.: stands wearing)’ (Benveniste 1966: 48). ‘Stand’ functions as a progressive auxiliary, e.g. in Tajiki (xonda istoda-am ‘I am reading’; Jeremiás 1993: 102–105) and Buddhist Sogdian (e.g. wynʼm ʼštn ‘I am seeing’), and has been generalised for the present tense in Yaghnobi (wēnom-išt ‘I see’, Benveniste 1966: 47). It is used for the perfect in Middle Persian and Parthian (Skjærvø 2009a: 218–219, 231–232) and marks imperfectivity or durativity, e.g. in Wakhi (Bashir 2009: 836–840) and Yidgha (CLI 415).
A potential construction (‘be able to do’) composed of the verbal adjective in -ta- (later: past stem) plus a finite form of ‘do’ is found already in Old Persian; it is attested in Khwarezmian, Parthian, Khotanese, and Sogdian (Sims-Williams 2007), and it is still in use in contemporary varieties such as Munji, Yaghnobi, and Balochi. Several of these languages also have an intransitive/passive counterpart (‘cannot be done’) with the verb ‘be/become’. Essentially the same verbs which are employed as auxiliaries in periphrastic verbal constructions are also used as light verbs in so-called COMPLEX PREDICATES or light verb constructions; the most frequent ones are ‘do’ (NP kardan, Ossetic kænyn, Wakhi tsar-, etc.) for complex predicates expressing transitive, active, and other meanings, and ‘become’ (New Persian šodan, Ossetic uyn, Wakhi wots-, etc.) for intransitive, middle, etc. functions (see e.g. Ahadi 2001, Folli et al. 2005, Pantcheva 2009, and Korn 2013). Several languages add ‘hold’ (Ossetic dar-, Pashto larəl, Wakhi δïr-), ‘strike, hit’ (NP zadan, Wakhi di-), etc. to the first, and ‘eat’ (New Persian xwordan, Balochi war-) and others to the second group; e.g. New Persian gūl zadan ‘to cheat’, gūl xwordan ‘to be cheated’ (Thordarson 2009: 77–81, Bashir 2009: 833). In Zazaki, verbs of movement enter the second slot (amiyayış ‘to come’, şiyayış ‘to go’, cf. Paul 1998a: 100–101, 131–133), i.e. verbs that in other Iranian varieties are used to form the analytic passive (see 1.4.3.2). Some of the Iranian developments result in TYPOLOGICALLY noteworthy morphological patterns. For instance, there are modal and aktionsart constructions containing two finite verb forms, of the type New Persian mītavānam beravam ‘I am able to go (lit.: I can (indicative), I go (subjunctive)’, with the 1SG ending -am occurring twice) or dāram mīxwānam ‘I am presently reading (lit.: I hold/have, I am reading)’, and the Khotanese perfect system (see 1.4.3.3), whose modal forms consist of the (inflected) perfect plus the copula, e.g. auttä vätāya ‘it would have lasted’ (Emmerick 2009: 396). Sogdian shows a conditional formed from the inflected verb forms plus the particle x’t, originally a 3 SG optative of the copula, according to Yoshida (2009b: 280–281) a calque on Turkic ärsär (3 SG conditional of ‘be’). Yet other combinations with auxiliaries or particles end up with the verbal ending in the middle. Pointing in this direction is the New Persian future pattern xwāham raft ‘I will go’ (AUX.1SG go.PTCP). Even more remarkable is the suffixing of particles to inflected verb forms, as in the Khwarezmian and Sogdian future (see 1.4.3.2) or the present tense in Yaghnobi (see above). Similarly infix-like is the position allowed for pronominal clitics in some Iranian varieties, where elements may follow a preverb and precede a verbal stem, as in Sorani (e.g. da-m-dayt-ē ‘you give [it] to me’, with the adposition -ē ‘to’ suffixed to the verb, and its complement, the pronominal clitic -m-, between the present prefix and the verb) and in Vafsi (Stilo 2004: 239). Other clitics can occur in such positions as well (e.g. Yaghnobi na-k-tifarant ‘when they don’t give [it]’, with the subordinator -k- ‘when, if’ following the negation). Commensurate with changes in noun inflection (see 1.4.3.1), the NOUN PHRASE has developed in various ways. Many Middle Iranian and New Iranian languages show group inflection, i.e. the case marker is only used on the last of several coordinated nouns, as for instance in Sogdian (Sims-Williams 1982: 68), Balochi, and Gilaki.
New Persian, Kurdish, and Zazaki display a right-branching pattern, in which dependent elements are joined to the head noun by a clitic called eżāfe. These languages also employ prepositions, and place relative clauses after the head noun. The latter also applies to Iranian languages that are otherwise left-branching, with genitives and adjectives preceding the head noun, and a preference for postpositions (e.g. Ossetic and Gilaki). Conversely, some adjectives precede their head noun in otherwise right-branching Iranian languages. This “mixed typology” has been interpreted as the result of Iranian as a whole being ‘sandwiched between typical VO [right-branching] languages’ like Arabic ‘and typical OV [left-branching] languages (Turkic […], North Caucasian […], and Indic’ (Stilo 2005: 38). While it appears difficult to substantiate how such an overall position would shape the development of an individual Iranian variety, noun phrase structure indeed seems liable to be influenced by neighbouring languages. One could perhaps interpret the use of prepositions, postpositions, AND circumpositions (as e.g. in Pashto and Pamir languages) as a reflex of the older flexibility of adpositions (Thordarson 2009: 174–175), and individual Iranian languages would adjust choices depending on neighbouring languages. This becomes particularly clear when looking at diverging patterns of historically closely related languages such as differences

within the Iranian languages of Iranian Azerbaijan (Donald Stilo, p.c.), or the dialects of Balochi (Jahani & Korn 2009: 657).

1.5. Nûristânî⁴⁶
By Richard F. Strand

1.5.1. Introduction

The Nûristânî branch of the Indo-Irânian languages comprises five languages spoken by some 15 ethnic groups in the region on Afghânistân’s eastern frontier formerly called Kâfiristân, but known since 1896 as Nûristân. The five languages fall into two subgroups: The Northern group includes the markedly divergent languages Kâmkʹata-vari (“Kati”) and Vâsʹi-vari (“Prasun”), the former with major dialects Kâtʹa-vari and Kâmvʹiri. The Southern group comprises the less-divergent but still mutually unintelligible languages of peoples who call themselves Kalaṣʹa (the separate, Indo-Aryan-speaking Kalʹaṣa people of Chitral acquired their name from the Nûristânî Kalaṣa); it includes Âṣkuňu-Saňu-vîri (“Ashkun, Wâmâî”), Kalaṣa-alâ (“Waigali” with major dialects Väi-alâ and Nišei-alâ), and Tregâmî (native name uncertain). Details on the locations and ethnicities of speakers of the Nûristânî languages appear in Strand 1973, 1997–present [hereafter Website], 2001, 2010. Nûristânî speakers inhabit the southern slopes of the eastern Hindu Kush range, encompassing the province of Nûristân in Afghânistân and adjacent border areas in Afghânistân’s Laghmân and Badakhshân provinces and in Pâkistân’s Chitral District of Khyber Pakhtunkhwâ Province. Current population estimates (Government of Afghânistân, Central Statistics Office, 2005) indicate as many as 130,000 speakers of the five Nûristânî languages.

1.5.2. Phylogeny

The Nûristânî languages preserve archaic Indo-Irânian phonological traits, notably dental affricates derived from Indo-European palatal stops and retention of IE *s after *u, that have made their phylogeny the subject of curiosity and controversy since the early 1900s. Konow (1911) proposed grouping Kâmkata-vari (then known by its Khowàr name “Bashgali”) with the Irânian languages. Grierson (1919) proposed that the Nûristânî and “Dardic” languages form a third branch of the Indo-Irânian languages. Morgenstierne (1945, 1974) refined Grierson’s schema by recognizing the “Dardic” languages (with the possible exception of Dameḷi) as Indo-Aryan and leaving only the Nûristânî languages as a third branch. All subsequent field researchers in Nûristân have upheld Morgenstierne’s phylogenetic schema and confirmed a linguistic division between the Northern and Southern Nûristânî subgroups. Strand’s field research placed the position of Dameḷi clearly within Indo-Aryan rather than Nûristânî (Website: 1998–2002, 2001: 254). The current phylogenetic schema appears in Strand Website: 1999–2009, 2001, 2010, which updates Strand 1973 and Morgenstierne 1974, both of which have long superseded Grierson 1919. There has been controversy over whether the Nûristânî languages are more closely related to the Irânian or Indo-Aryan branches of Indo-Irânian. Some scholars have pointed to the preponderance of Middle Indo-Aryan sound changes in the Nûristânî languages as evidence for their being Indo-Aryan. But the earliest distinctly Nûristânî sound changes coincide with those of early Irânian. Such changes were produced by a strong fronting of the tongue coupled with strong anterior voicing, which precluded the posterior whispery-voicing found in the Indo-Aryan languages. Thus, in Irânian and Nûristânî PIE *bh, *dh, *gh, *gʷh lost their posterior whispery-voicing (“aspiration”) to merge with their anterior-voiced “non-aspirated” counterpart stops, and *ḱ and *ǵ were fronted to lamino-alveolar *č and *ǰ, which were subsequently “prognathized” to dental *ć and *ź, which remain in Nûristânî but which were deaffricated in Irânian. (“Prognathizing” is a slight jutting-out of the jaw with the tongue’s apex pressed behind the lower front teeth, which produces lamino-dental affricates among other sound changes; see Strand Website: 2002–2011, 2013.) The processes of prognathizing and deaffrication have recurred in the evolution of many languages of the region, including the Nûristânî language Âṣkuňu-Saňu-vîri. It is the relative chronology of sound changes that determines phylogeny. The Nûristânî languages were first on the margins of Indo-Irânian linguistic influence, then within the Irânian sphere, and only later within the Indo-Aryan sphere.
Thus, an Indo-Aryan origin of the Nûristânî languages must be ruled out. An account of Nûristânî linguistic evolution appears in Strand Website: 2008–2010, 2010. Historical records to bolster any proposed phylogeny of the Nûristânî languages are essentially lacking before 1800, but origin myths collected by Strand (Website: 1997–1999) contradict Morgenstierne’s (1945) speculation that the early Nûristânî languages preserve traits ‘going back to the language of tribes which split off from the main body of Aryans and penetrated into the Indian borderland before the invasion of the Indo-Aryans.’ According to their own stories, the Nûristânîs arrived in the region well after the Indo-Aryans, after fleeing the onslaught of Islâm from Khorasân to Kandahâr to Kâbul to Kâpisâ to Kâma in Afghânistân’s Nangarhâr Province, and they only entered present-day Nûristân in the aftermath of the Ghaznavid conquest of Nangarhâr ca. 1000 AD.

⁴⁶ Diacritic use in this contribution follows the author’s wishes.

1.5.3. Milestones in Nûristânî studies

The difficulty of access to Nûristân for political and security reasons has severely limited direct field studies of the Nûristânî languages. Except for a few foreign-sponsored scientific expeditions to the region, individual researchers were not allowed into Nûristân until the mid 1960s, and that window was only open for a decade. We must distinguish the research and analyses of primary linguistic field researchers, both foreign and native, from secondary (and often misguided) analyses made by scholars who have had no access to Nûristân or to speakers of Nûristânî languages. The studies of the latter rely totally on the quality of data provided by primary researchers. As we would expect, that quality has increased over the century since first European contact, and the quality of secondary analyses is dependent on whose data the secondary analyst cites. Generally, data gathered by field researchers after the mid 1960s supersede data presented in older writings. Primary linguistic field researchers in Nûristân include the following non-Nûristânîs, with their dates of research indicated in parentheses: Mullâh Najib (1809), Khân Sâhib Abdul Hakim Khân (1898, 1900), John Davidson (1898), Georg Morgenstierne (1924, 1929, 1949, 1964, 1970s), Wolfgang Lenz (1935), Georg Buddruss (1956, 1969, 1970), Aleksandr L. Grjunberg (1963, 1964, 1967–1968, 1970s), and Richard F. Strand (1967–1969, 1973–1974, 1984–1985, 1991–1992, 2003–2005). Kendall Decker’s brief contact with Nûristânî speakers (published 1992) added confusion to previous well-founded research. Native scholars who have published on Nûristânî include Ghulâmullâh (1966, 1968) on Kâmviri, Samiullâh Tâza (1995, 2000) on Kalaṣa-alâ, and Jan Mohammad (1991) on Western Kâta-vari. Secondary scholars whose analyses have been significant to Nûristânî linguistic studies include Konow (1911, 1913), Grierson (1919), Hamp (1968), Fussman (1972), Edelʹman (1983, 1999), Nelson (1986), Degener (1998), Reichert (1998), and Bashir (2010b).
I will not mention other ephemeral researchers whose analyses mostly suffer from being based on superseded data, except to note the untenable proposal by Sihler (1997), which has no basis in the phonetic reality of the Indo-Irânian world. I must also mention the poor quality of most secondary sources on the Nûristânî languages currently available on the Internet, most egregious of which are the online articles from Wikipedia.

1.5.3.1. 1800–1910: Imperial research

The Nûristânî peoples and languages first drew the attention of British imperial administrators in the early 19th century. The earliest account of the Kom Nûristânî appears in Appendix C of Mountstuart Elphinstone’s An account of the kingdom of Caubul (1815). Elphinstone sent an Afghan counterpart, “Moollah Nujeeb”, to reconnoiter “Caufiristaun”. The Mullâh not only succeeded in penetrating Kâfiristân as far as Kâmdesh (“Caumdaish”), he also produced a remarkably accurate but brief report on the customs of the pre-Islâmic Kom. Most of the native terms in his report are recognizable, and they constitute the first recorded Kâmviri words.

In the latter part of the 19th century short vocabulary lists of Kâmkata-vari and Kalaṣa-alâ were compiled by various administrators, missionaries, and travelers. John Davidson (1902) published the first grammar of “Bashgalī”. His material was drawn mostly from Eastern Kâta-vari speakers that he encountered in Chitral, but several Kâmviri elements also appear. Davidson was a linguistically untrained British army colonel, and his grammar must be judged accordingly. His work is admirable in that it includes some 1,700 sentences as data designed to aid future scholars.

1.5.3.2. 1910–1964: Early scholarly period

Sten Konow (1913) produced a dictionary from Davidson’s sentences, and based on Davidson’s data, he proposed a classification of the Indo-Irânian languages that grouped “Bashgalī” with the Irânian rather than the Indo-Aryan languages (1911). George Grierson’s grammatical sketch of “Bashgalī” for the Linguistic Survey of India (1919) is largely based on Davidson’s grammar, but his sketch also includes survey data from Kâmvʹiri recorded by Abdul Hakim Khân in 1898. The latter also recorded the first sketches (1900) of Kalaṣa-alâ (Väi-alâ dialect) and Vâsi-vari for the Linguistic Survey of India (Grierson 1919). Most notably, Konow’s son-in-law, Georg Morgenstierne, gathered data on all the Nûristânî languages on several occasions since his initial visit to Afghânistân in 1924. Much of his data appeared in his numerous publications (especially 1926, 1929, 1932, 1934, 1945, 1949, 1952, 1954, 1974), but a portion of his material remains archived and unpublished. Morgenstierne’s extensive research provides the enduring basis on which the phylogenetic position of the Nûristânî languages stands. Morgenstierne’s lexical materials were incorporated into Turner’s Comparative dictionary of the Indo-Aryan languages (1962–1969), and later scholars, notably Fussman (1972) and Nelson (1986), based their analytical works on Morgenstierne’s lexical data.
A few more Nûristânî words emerged from the research of German and Danish expeditions to the region, appearing primarily in the writings of Wolfgang Lenz (1939) and Lennart Edelberg (Edelberg & Jones 1979).

1.5.3.3. 1964 to present: The era of participant scholarship

During a trip to Afghânistân in 1964 Morgenstierne met Qâzi (Judge) Ghulâmullâh, a native Kâmviri speaker and a natural-born linguist, introduced by Afghânistân’s champion of linguistic studies, Dr. Rawân Farhâdi. Morgenstierne commissioned the Qâzi (through the Linguistic Institute of Kâbul University and the largesse of Gordon Wasson) to write a grammar and vocabulary of his language. His manuscript grammar (1966), in Persian, contains a detailed account of phonology, morphology, and syntax, with a wealth of examples written in a phonemic, Pashto-based alphabet which he developed. His grammar follows a traditional Arabic format which occasionally obscures the workings of an Indo-Irânian language, but overall his presentation is excellent. In 1968 he completed his Kâmviri lexicon, a translation of the Pashto dictionary published by the Pashto Ṭolana in Kâbul. Copies of Ghulâmullâh’s manuscripts reside, unpublished, in the library of Oslo University. Aleksandr Grjunberg gathered and analyzed a large corpus of recorded narrative data from Western Kâta-vari in his research during the 1960s, from which he published his Jazyk Kati [Kati language] in 1980. This was the first full-length published grammar of a Nûristânî language by a qualified linguist. The phonological data are sound, although somewhat unconventionally transcribed and analyzed, and the grammatical data are presented in a traditional format. Notable is his demonstration of aberrant 1st person subject marking in “Preterite I” verbs in the Western Kâta-vari subdialects of Kulʹem and Řâmgʹal (1980: 219–225), as opposed to more expected forms in the Ktʹivi subdialect (Strand Website: 1999a–present). The few problems noted include the mislabeling of what should have been SPEAKER-centric diagrams of directional prepositions as “subject”-centered (1980: 275, 279), and the labeling of the language as “Kati”, which is not a recognizable name in any Nûristânî language, following Morgenstierne’s misapprehension of a Persian nonce-word (Kâta plus the nesbati suffix -î) for the language’s native name. Grjunberg also contributed short descriptions (1999) of the Nûristânî languages to Edelʹman’s (1999) survey volume of regional languages. Georg Buddruss began studying Vâsi-vari as part of an expedition in 1956, and in 1970 he returned to the field to continue his research on that language. He also gathered considerable data on the Nišei-alâ dialect of Kalaṣa-alâ and a lesser amount on Saňu-vîri (published in 2006) and Tregâmî.
His Nišei-alâ data were organized and published by his student Almuth Degener as Die Sprache von Nisheygram im afghanischen Hindukusch (Degener 1998, reviewed by Strand 1999), the first full-length description of a Southern Nûristânî language, complete with lengthy lexicon. Buddruss has given us only a tantalizing glimpse of his Vâsi-vari findings (1977a, 1977b, 2005). Richard Strand’s initial linguistic fieldwork on Kâmviri spanned two years from 1967 to 1969, during which he resided in the Kom Nûristânî community of Kâmdesh and collaborated with Ghulâmullâh long enough to gain fluency in Kâmviri. He subsequently carried out further field research on Kâmviri and on other Nûristânî dialects and languages as well as neighboring Indo-Aryan languages. An increasing amount of his linguistic research has been available on his website (Strand 1997–present), as well as in other publications (1973, 1985, 1999, 2001, 2010). The Nûristânî scholars Samiullâh Tâza and Jan Mohammad have produced short studies on their native languages, Samiullâh on Kalaṣa-alâ phonology and oral literature (1995, 2000) and Jan Mohammad on causative constructions in Western Kâta-vari (1991). Notable contributions from the Nûristânî research of contemporary scholars include:

Phonology

Recognition of the roles of the following phonetic processes in the development of Indo-Irânian (Strand Website: 2002–2011, 2010):
(1) Anterior vs. posterior voicing: the voicing of the Nûristânî and Irânian languages is exclusively ANTERIOR, vs. the mixed POSTERIOR and ANTERIOR voicing for neighboring Indo-Aryan languages;
(2) “Prognathizing” (1.5.2), which produces changes such as č > ć, dv > b;
(3) Improved phonemic analyses of Kâta-vari (Kulʹem dialect: Grjunberg 1980, Ktʹivi dialect: Strand Website: 2011a), Kâmvʹiri (Strand Website: 1997–2007), and Kalaṣa-alâ (Samiullâh Tâza 1995, Degener 1998, Strand 1999), which supersede the phonetic transcriptions of predecessors and remove spuriously marked features such as length.

Grammar

The ubiquity of “directionality” in the Nûristânî languages has impressed all field researchers (Morgenstierne 1949: 229–231, Grjunberg 1980: 265–281, Buddruss 1977b, Strand 1997) and has inspired the spatial-visual cognitive approach to Nûristânî grammar that Strand has advocated since the early 1980s (1985, 1991, 1997–present, 1999). This approach has yielded grammatical insights that are applicable to other languages of the region; notably: There are two ways of looking at the image of discourse: 2-dimensionally (sideways) or 3-dimensionally (in perspective). Nouns with (at least) two case forms (certain classes of nouns including pronouns) imply three dimensions; nouns uninflected for case imply two dimensions. Three-dimensional images have a foreground and a background zone; uninflected nouns stand in the foreground, while case-marked nouns stand in the background (Strand Website: 2000–2002).
Looking backward into time with a RETROSPECTIVE PERSPECTIVE onto foreground and background zones accounts for split ergativity and case marking more insightfully than a traditional statement like “the subject is in the oblique case, and the object is in the direct case and indicated by the verbal ending”, which confuses the concepts of verbal subject, which always stands in the foreground, and oblique-case agent, which stands in the background (Strand 1999, Website: 1999b–present, Website: 2000–2002). Remote agency is the basis of “causative” verbs, along a distance scale of internal vs. surface vs. external vs. remote zones (Strand 1985, Website: 1999b–present).

The stative verb âsa- ‘is’ is a special type of motion verb that moves or projects nouns into the image of discourse, and the verb bu- ‘happen; become’ depicts nouns “popping up” in the discourse in different locations, directions, or guises (Strand 1985, Website: 1999b–present). Certain directional suffixes have probable ancient gestural precursors (Strand Website: 2011).

Lexicography

From Degener (1998), a lengthy lexicon of Nišei-alâ; from Strand (Website: 1999a–present), online phonemically transcribed lexicons of Kâmviri (5,500 native words), plus shorter lexicons of Kâta-vari (Ktʹivi dialect), Kalaṣa-alâ (Nišei-alâ dialect), and Saňu-vîri, in addition to the Indo-Aryan languages Khowàr, Aćharêtâʹ, Degano (Eastern Pashaî), and Bhaṭesa-zib. Strand’s Nûristânî etymological lexicon, initially with some 1,800 entries, appeared on his website in 2012.

Phylogeny

Details of phylogenetic classification (Morgenstierne 1974, Strand 1973, 2001, 2010, 2013, Buddruss 1977a), with a proper accounting of languages, dialects, and ethnic groups (along with supporting origin myths: Grjunberg 1980: 36–38, Degener 1998: 237–252, Strand Website: 1997–1999), a confirmation of the division between Northern and Southern (“Kalaṣa”, characterized by prognathizing) groups of Nûristânî languages (passim), and the exclusion of the Indo-Aryan language Dameḷi as a possible Nûristânî language (Strand Website: 1998–2002, 2001: 254).

1.5.4. Prospects for Nûristânî studies

At present more data are forthcoming from Strand on his website; his long-overdue Kâmviri grammar is nearing completion, and his grant-sponsored pilot project to visually depict the meaning of Kâmviri discourse in real time on a computer display has been completed. The bulk of Buddruss’s Vâsi-vari material is in preparation and awaits publication. Studies of the Nûristânî languages are still largely in the exploratory stage. Detailed grammatical descriptions built on solid phonemic grounds are needed for Âṣkuňu-Saňu-vîri and Tregâmi, along with more lexical data. Cognitive formats of grammatical description would be more informative than traditional linguistic paradigms in elucidating the strong sense of space that underlies the grammar of the Nûristânî languages. Transcriptions of local oral history are sorely lacking, and such knowledge is fast disappearing.

The languages, their histories, and their genetic classification

1.6. Dravidian Languages
By Suresh Kolichala

1.6.1. Introduction

The Dravidian language family comprises 27 languages, spoken by about 222 million⁴⁷ people across South Asia. Although at present these languages are spoken in southern India, in parts of eastern and central India, and in isolated pockets in western Pakistan, it is speculated (Krishnamurti 2003, Southworth 2005) that Dravidian once had a wider distribution. There are immigrant communities of Dravidian speakers around the world, including large populations in Sri Lanka, Singapore, and Malaysia.

1.6.1.1. Overview

The Dravidian languages are classified into South, South-Central, Central, and North subgroups (Krishnamurti 2003: 19). The four major literary languages (Kannada, Malayalam, Tamil, and Telugu) are recognized as scheduled languages by the constitution of India. They are the official languages of the states of Karnataka, Kerala, Tamil Nadu, and Telangana/Andhra Pradesh, respectively. Tamil is also an official language in Sri Lanka and Singapore. This chapter draws extensively from the works of Bhadriraju Krishnamurti, the most prolific scholar in Dravidian linguistics, supplemented by updated information and review of more recent research, including coverage of other views.⁴⁸

⁴⁷ Source: Ethnologue.com
⁴⁸ Special thanks are due to Hans Henrich Hock for his patient editing and valuable suggestions.

1.6.1.2. The history of the Dravidian languages

The question of when and whence the people who spoke Dravidian languages entered the Indian subcontinent cannot be satisfactorily answered based on the available archeological or linguistic evidence. There have been numerous attempts at proving external genetic connections, but none is particularly convincing. The Dravidian languages have been compared with Altaic by Menges (1977), Vacek (1983, 1987), and G. Starostin (2005); and with Uralic by Tyler (1968, 1990) and E. Marlow (1974). Interesting comparisons were also made with Sumerian by Fähnrich (1981) and Boisson (1987); with Japonic by Ohno (1980); with Kartvelian by Fähnrich & Sardshweladse (1965); with Afro-Asiatic by Blažek (2002); and with Mongolian by Uma Maheshwar Rao (2014). Blažek (2006) also attempted to look
for an Australian substratum in Dravidian. On the basis of perceived correspondences with Uralic, Altaic, and Indo-European, Dravidian has often been included in a larger macrofamily known as Nostratic (Illič-Svityč & Dybo 1971, Illič-Svityč 1984, Blažek 2002, Dolgopolsky 2008, Bomhard 2008). Since the discovery of the Indus Valley Civilization (IVC) the idea that the people of the Indus Valley spoke Dravidian has figured prominently. The presence of Brahui, a Dravidian language in the highlands of southwestern Pakistan, is often used as an important factor in identifying the IVC with Dravidian. However, some scholars argue that Brahuis migrated within the past millennium from central India (Elfenbein 1987, Krishnamurti 2003: 141). There have been several attempts (Heras 1953, Knorozov 1965, Parpola 1994, Fairservis 1992, Mahadevan 2002) to interpret the Indus seals as Dravidian, but none has gained wide acceptance (Possehl 2003, Farmer et al. 2004, and Chapter 9, this volume). Lack of any archaeological evidence of a southward migration from the Indus Valley area and the absence of any convincing Harappan artifacts in the south also create problems for theories identifying the IVC with Dravidian. McAlpin (1981) suggested that Dravidian is most closely related to Elamite and posited a single Elamo-Dravidian family. Despite tantalizing typological similarities and a few morphological parallels cited by McAlpin, scholars felt that ‘many of the rules formulated by McAlpin lack intrinsic phonetic/phonological motivation and appear ad hoc, invented to fit the proposed correspondences’ (Krishnamurti 1985; also Zvelebil 1990). In his latest attempt to revive the hypothesis of Elamo-Dravidian, now referred to as Zagrosian, McAlpin removes Brahui from Dravidian and brings it under the Elamite branch (McAlpin, Forthcoming). This proposal has not been vetted by the scholarly community but appears to suffer from the same problems as before. 
For example, he compares (as quoted in Southworth 2012) Elamite aš ‘cow’ with Brahui xarās ‘bull’ and presents PDr. *ā(y) ‘cow’ as a distant cognate. However, Brahui xarās can straightforwardly be derived from PDr. *kaṭac- with cognates in almost all Dravidian languages, e.g. Ta. kaṭāy, Ka. kaḍasu, Tu. gaḍasu, Nai. kaṛas, Ko. kaḍas, Kurux kaṛā.⁴⁹

Southworth (2005) and Fuller (2003, 2007) independently attempt to analyze linguistic data combined with archaeological and archaeobotanical evidence and suggest that the Southern Neolithic complex⁵⁰ is the most promising in terms of connecting an archaeological complex with Dravidian. Southworth also traces typical Dravidian place-name endings throughout Maharashtra and the Saurashtra peninsula, and a few in Sindh and Rajasthan (2005). Based on this analysis, he argues for a peninsular homeland for Proto-Dravidian, with a population expansion along the west coast of India. The practice of cross-cousin marriage, characteristically Dravidian, found in parts of Maharashtra and Gujarat also lends credence to the theory of a Dravidian substratum in these regions (Trautmann 1981).

⁴⁹ Similarly, McAlpin compares El. hidu with Brahui heṭ-. However, Brahui heṭ- ‘she-goat’ has parallel forms in Gandhari Prakrit heḍi (Burrow 1937: 28) and eastern Asokan Prakrit hiḍa/heṭa (K. R. Norman 1992: 85) as well.
⁵⁰ The Southern Neolithic archaeological complex of northern Karnataka and southwest Andhra Pradesh provides the earliest evidence of pastoralism and agriculture in Peninsular India, starting around 2800 BCE (Allchin & Allchin 1982, Bellwood 2009).

Recent population genetic studies suggest that most South Asian groups derive from two genetically divergent populations. Reich et al. (2009) have shown that the model of admixture between two ancestral populations referred to as Ancient South Indians (ASI) and Ancient North Indians (ANI) provides a good fit to genetic data from most modern Indian groups. Reich et al. find that the ANI component is genetically closer to Middle Easterners, Central Asians, and Europeans, and that its proportion is significantly higher among Indo-European speakers than among Dravidian speakers. In contrast, the authors state that the ASI component is restricted to South Asia and suggest the ancestral ASI population might have spoken a Dravidian language before mixing with the ANI. Another study (Metspalu et al. 2011), while validating the hypothesis of Reich et al., further suggests that both Indian ancestry components are older than the purported Indo-Aryan migration (3,500 BP). A recent study estimates ANI-ASI mixture to have occurred 1,900–4,200 years BP (Moorjani et al. 2013). However, it is important to exercise caution in inferring linguistic prehistory from genetics, given that the mechanisms of transmission of genes and languages are vastly different. Communities can abandon one language and adopt another, and such language shifts are not detectable genetically. For instance, Reich et al. indicate that the Onge (indigenous Andaman Islanders) form a clade with the ASI.
However, the Onge language is considered an isolate, perhaps a long lost sister of Proto-Austronesian, but unrelated to Dravidian (Blevins 2007; see also 1.10.1, this volume). Such linguistic-genetic mismatches emphasize the need for caution in correlating human genetics and linguistics.

It is widely acknowledged that South Asia is a Linguistic Area (Sprachbund), where different linguistic families have developed convergent structures as a result of long-standing bilingual contact (Emeneau 1980a, Masica 1976). Emeneau (1980a) suggested that more than a dozen loanwords detected in the Rigveda (1500 BCE) are Dravidian, but Witzel (1999a, 1999b) argues that the Dravidian loanwords started to enter the language only in the middle and late Rigvedic periods. The introduction of retroflex consonants in Sanskrit has also been attributed to contact with South Asian languages, possibly Dravidian. Whether features such as the extensive use of converbs and quotative iti⁵¹ in Vedic Sanskrit can be explained as a result of substratum or adstratum influence of Dravidian remains a source of scholarly debate. (See Emeneau 1980a, Kuiper 1991, Witzel 1999a, Witzel 1999b, Hock 2005a. Also see 2.3.)

⁵¹ For a counter-argument on the Dravidian origin of the iti construction, see e.g. Hock 2005b.

A good deal of recent comparative work on Munda points to a range of features shared between Dravidian and Munda. Besides lexical loans, phonological and structural influences at varying stages have been proposed. Anderson (2003) suggests that Munda features such as retroflexion, loss of the initial velar nasal (ṅ), loss of subject prefixes and object suffixes, inflection and selection of auxiliary verbs, dative subjects, and SOV word order can be attributed to Dravidian influence.

1.6.1.3. Languages and geographic distribution

The word drāviḍa and its variants⁵³ occur in Classical Sanskrit literature, Sinhala inscriptions, and early Buddhist and Jaina sources from the 3rd century BCE, where they mostly designate the Tamil country, Tamil people, and the Tamil language (Joseph 1989). However, the term was also employed ambiguously to refer to the peninsular languages (including Marathi and Gujarati) or the speakers of those languages, especially in the term pañcadrāviḍa (Deshpande 2010). Caldwell, who wrote the first comparative grammar of Dravidian (1856), adopted the name drāviḍa as a generic name for the whole family. There are twenty-seven Dravidian languages known at present. The following classification of these languages into four genetic subgroups is generally accepted.

South Dravidian
1. Tamil (Tamiẓ)⁵²  2. Malayāḷam  3. Iruḷa  4. Kuṟumba  5. Koḍagu  6. Toda  7. Kota  8. Baḍagu  9. Kannaḍa  10. Tuḷu  11. Koraga(?)

South Central Dravidian
12. Telugu  13. Goṇḍi  14. Koṇḍa  15. Kui  16. Kuwi  17. Pengo  18. Maṇḍa

Central Dravidian
19. Kolāmi  20. Naikri  21. Naiki  22. Parji  23. Ollari  24. Gadaba

North Dravidian
25. Kuṛux  26. Mālto  27. Brahūi

⁵² The language names are used without diacritics in the rest of this section.
⁵³ Variants of drāviḍa include drāmiḍa, dameḍa, and damiḷa.

Map 1.2: Distribution of the Dravidian languages (produced by Suresh Kolichala, January 2015)

1.6.2. Subgrouping in Dravidian

The traditional Comparative Method works best in deriving a family tree when a proto-language splits and the descendant languages lose contact. The task of subgrouping in Dravidian is complicated by the fact that throughout their history, the known Dravidian languages have been influenced by mutual contact. Earlier attempts at subgrouping include Bray 1909, Ramaswami Aiyar 1928, 1936, Tuttle 1940, Burrow & Bhattacharya 1953, Emeneau 1955, Krishnamurti 1961, Subrahmanyam 1971, and Southworth 1976. In the mid-1970s, Krishnamurti (1975, 1976, 1985: 220–223) found evidence to separate the Telugu-Kuwi subgroup (designated South-Central Dravidian) and presented the four-way subgrouping in Figure 1.3. While the above four major subfamilies are well-established on linguistic grounds, the phylogenetic tree in Figure 1.3 is probably less accurate.

Figure 1.3: Proto-Dravidian with four main branches

Figure 1.4: Proto-Dravidian with a common stage for South and South-Central Dravidian

Krishnamurti (2003) presents evidence to set up a common stage for South Dravidian and South-Central Dravidian (Figure 1.4). This is a revision of his earlier view of a common branch for South-Central and Central Dravidian. According to him, the
first branch to split off is North Dravidian, the second is Central Dravidian, and the last branch is Proto-South Dravidian, which further split into South Dravidian (South Dravidian I) and South-Central Dravidian (South Dravidian II). While his latest division, for which evidence is presented in Krishnamurti 2003: 492–501, is accepted by the majority of scholars, a few question whether the shared innovations presented for the common stage of the latter two branches might instead be due to diffusion (e.g. Subrahmanyam 2008). Krishnamurti (2003) also suggests that it is possible to set up an original binary division of Proto-Dravidian into Proto-North Dravidian and the rest. There is only slim evidence for a common stage of Proto-South and Proto-Central Dravidian, and the lack of clear shared innovations may suggest that these branches diverged quite rapidly. The North Dravidian group shows no such interaction, and thus it is probable that Proto-North Dravidian separated from the parent speech at a time when Central and South Dravidian were still at least in loose contact.
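
As a rough illustration, the branching order just described (following Krishnamurti 2003) and the four-subgroup language list of 1.6.1.3 can be written down as a small data structure. This is only a sketch: the Python layout, the helper names subgroup_of and path_to, and the label "Non-North Dravidian" for the unnamed intermediate node are assumptions made here, not part of the classification itself.

```python
from typing import List, Optional, Tuple

# The four subgroups and their member languages, as listed in 1.6.1.3
# (names without diacritics; Koraga's membership is marked uncertain there).
SUBGROUPS = {
    "South Dravidian": [
        "Tamil", "Malayalam", "Irula", "Kurumba", "Kodagu", "Toda",
        "Kota", "Badagu", "Kannada", "Tulu", "Koraga",
    ],
    "South-Central Dravidian": [
        "Telugu", "Gondi", "Konda", "Kui", "Kuwi", "Pengo", "Manda",
    ],
    "Central Dravidian": [
        "Kolami", "Naikri", "Naiki", "Parji", "Ollari", "Gadaba",
    ],
    "North Dravidian": ["Kurux", "Malto", "Brahui"],
}

Node = Tuple[str, list]

# Branching order: North splits off first, Central second, and
# Proto-South Dravidian then divides into South and South-Central.
# "Non-North Dravidian" is a label invented here for the unnamed node.
TREE: Node = (
    "Proto-Dravidian", [
        ("North Dravidian", []),
        ("Non-North Dravidian", [
            ("Central Dravidian", []),
            ("Proto-South Dravidian", [
                ("South Dravidian", []),
                ("South-Central Dravidian", []),
            ]),
        ]),
    ],
)


def subgroup_of(language: str) -> Optional[str]:
    """Return the subgroup a language is listed under, or None."""
    for subgroup, languages in SUBGROUPS.items():
        if language in languages:
            return subgroup
    return None


def path_to(label: str, node: Node = TREE) -> Optional[List[str]]:
    """Return the chain of nodes from the root down to `label`."""
    name, children = node
    if name == label:
        return [name]
    for child in children:
        below = path_to(label, child)
        if below is not None:
            return [name] + below
    return None


if __name__ == "__main__":
    # Total number of languages across the four subgroups.
    print(sum(len(members) for members in SUBGROUPS.values()))
    # Chain of splits leading to the subgroup that contains Telugu.
    print(" > ".join(path_to(subgroup_of("Telugu"))))
```

Keeping the language lists separate from the tree means that a competing proposal about the higher-level branching can be tried by swapping in a different TREE without touching the subgroup membership.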

Southworth (2005) adds insightful remarks while accepting the revised four-subgroup classification. He analyzes shared innovations that cross the boundaries of the major subgroups to investigate if a common stage of development can be set up for South and Central Dravidian. He concludes that although there is no basis for assuming a common stage of development, it is clear that at some stage in the past these two branches were in sufficiently close contact that some innovations could cross the boundaries between them. To explain some of the common innovations between Telugu and the Central Dravidian languages, he suggests that there was a period when Telugu was in contact with some of the Central Dravidian languages as well as with Tamil, and this state of affairs probably existed for some time.

Figure 1.5: Southworth’s subgrouping tree diagram with possible contact scenarios (From Southworth 2005: 235)

Fuller (2007: 429) attempts to correlate early ecological and agricultural data with linguistic evidence to suggest a Proto-Dravidian homeland somewhere in Peninsular India. He analyzes cognates of botanical words across the subgroups
of the Dravidian languages to arrive at a tentative timeline: ‘these data suggest that Proto-South Dravidian might be identified with the latest phase of the Southern Neolithic and the transition to the Megalithic period in South India, in the time horizon 1500–1300 BC, and certainly no earlier than 1800–1700 BC. Central Dravidian is likely to have diverged prior to this date (by ca. 2000 BC, before the introduction of wheat and barley), and North Dravidian even earlier.’

McAlpin (2003) attempts to prove that the North Dravidian hypothesis is untenable. He argues that many of the common features listed for Kurux-Malto and Brahui are not shared innovations but retentions. As mentioned in 1.6.1.2, McAlpin (Forthcoming) pushes this idea much farther and proposes Brahui as an intermediate link between Elamite and Dravidian. (For more discussion, see 1.6.4.5.)

In a recent publication P. S. Subrahmanyam (2008) reasserts the old claim of placing Telugu-Kuwi (Krishnamurti’s South-Central Dravidian) along with Kolami-Parji in the Central Dravidian subgroup. However, much of what he presents as new evidence appears to be a repetition of his earlier arguments (Subrahmanyam 1971), which were thoroughly analyzed and criticized by Southworth (1976), among others. None of the features listed as evidence for placing Telugu-Kuwi and Kolami-Parji in the same subgroup is likely to reflect shared innovation,⁵⁴ and therefore a common stage for these languages cannot be substantiated.

⁵⁴ None of the following features given by Subrahmanyam can be used as evidence for including Telugu-Kuwi in Central Dravidian:
1. PDr. *ya > a-: This archiphoneme ā̆ shows reflexes of e- in the oblique forms in NDr. and SDr., and there is no reason why we should assume that these are independent developments.
2. *ẓ > ḍ /ṛ: Inscriptional evidence indicates that the phoneme ẓ was retained until recent times. ẓ > ḍ /ṛ is also found in Kota-Toda and North Dravidian.
3. Loss of *n- in second person pronouns: Among the SCDr. only Telugu shows loss of word-initial nasals in 2nd-person pronouns; therefore this cannot be used for a common-stage argument.
4. Female kinship terms with –āl: This can be reconstructed to Proto-Dravidian times, as all Dravidian languages have derivative stems denoting a female human by addition of the suffix –āl.
5. Obligatory use of neuter plural suffix: This is a parallel development in SCD and Central Dravidian, as the suffix itself is different in each of these languages.
6. Past adverb with –cci: This must be a case of diffusion as Central languages show the allomorph of Proto-Dravidian *-i for the perfective participle. Kolami and Naiki do not have –cci. It must be noted that -ci is also found in Brahui.
7. Widespread use of non-past *tt: The reflexes of a dental stop (tu/ttu) as non-past marker occur in SD, SCD, and CD. This perhaps can be reconstructed as an aorist marker in Proto-Dravidian.
8. Negative adverb with –ak(k)a: This is clearly borrowed from Telugu into Central Dravidian.

Kolichala (2010) and Kolachina et al. (2011) apply computational phylogenetic techniques to the issue of Dravidian subgrouping and confirm the four major subgroupings of Figure 1.4. However, the results are inconclusive as regards nested hierarchies at the top. Pilot-Raichoor (2012b) argues that the development of the Dravidian languages over millennia cannot be simply explained by the linear splitting of an original mother-language into daughter-languages. Using Dixon’s punctuated-equilibrium model (1997) she proposes that the development of new convergent grammatical features in Dravidian is the result of some historical punctuating events. Further research is warranted.

1.6.3. Overview of the Dravidian languages

1.6.3.1. Phonology

Proto-Dravidian had ten vowels, *i, *e, *a, *o, *u, and their long counterparts *ī, *ē, *ā, *ō, *ū, and sixteen consonants; see Figure 1.6 and Tables 1.5 and 1.6. In addition, laryngeal /*H/, alveolar nasal /*ṉ/, and uvular stop /*q/ figure in some reconstructions.

Figure 1.6: Proto-Dravidian vowels

Table 1.5: Proto-Dravidian consonants

              labial   dental   alveolar   retroflex   palatal   velar
Stops         p        t        ṯ          ṭ           c         k
Fricatives
Nasals        m        n                   ṇ           ñ
Laterals                        l          ḷ
Flap                            r
Approximants  w                            ẓ           y

Table 1.6: Dravidian stop allophonic patterns

                        labial   dental   alveolar   retroflex   palatal   velar
Word initial            p-       t-       -          -           c-        k-
Geminates               -pp-     -tt-     -ṯṯ-       -ṭṭ-        -cc-      -kk-
Nasal+stop              -mb-     -nd-     -ṉḏ-       -ṇḍ-        -ñj-      -ṅg-
Intervocalic singleton  -w-      -d-      -ḏ-/-ṟ-    -ḍ-         -s-       -g-
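
The allophonic patterns in Table 1.6 amount to a lookup by place of articulation and phonological context, which can be sketched as follows. This is an illustrative encoding only: the Python names (PLACES, ALLOPHONES, realization) and the context labels are assumptions made for the sketch, and the cell values are simply copied from the table.

```python
from typing import Optional

# Column order of Table 1.6.
PLACES = ["labial", "dental", "alveolar", "retroflex", "palatal", "velar"]

# One row per context; None marks a gap (alveolar and retroflex stops
# do not occur word-initially).
ALLOPHONES = {
    "word-initial": ["p-", "t-", None, None, "c-", "k-"],
    "geminate":     ["-pp-", "-tt-", "-ṯṯ-", "-ṭṭ-", "-cc-", "-kk-"],
    "nasal+stop":   ["-mb-", "-nd-", "-ṉḏ-", "-ṇḍ-", "-ñj-", "-ṅg-"],
    "intervocalic": ["-w-", "-d-", "-ḏ-/-ṟ-", "-ḍ-", "-s-", "-g-"],
}


def realization(place: str, context: str) -> Optional[str]:
    """Return the surface pattern of the stop at `place` in `context`."""
    return ALLOPHONES[context][PLACES.index(place)]


if __name__ == "__main__":
    for context in ALLOPHONES:
        cells = [realization(place, context) or "(none)" for place in PLACES]
        print(f"{context:13}", "  ".join(cells))
```

Read row by row, the lookup restates the generalization in the table: voiceless word-initially and in geminates, voiced after a nasal, and lenis between vowels.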

The three-way distinction dental-alveolar-retroflex /t ṯ ṭ/ in the stop series, a separate series of phonemic retroflexes /ṭ ṇ ḷ ẓ/ (stop, nasal, lateral, approximant), and the absence of voice contrast in the stop series are typologically important features of the Proto-Dravidian consonantal system. The stops had lenis allophones [w, d, ḏ/ṟ,⁵⁶ ḍ, s, g] when intervocalic; after a nasal they were voiced, and as geminates they were always voiceless; see Table 1.6. Many descendant languages have developed aspiration and word-initial voicing through sound changes and borrowings from Indo-Aryan.

⁵⁵ While ẓ has been a standard symbol for representing the retroflex approximant in Dravidian literature, some authors use r̤ or ḻ.
⁵⁶ The voicing of voiceless ṯ in intervocalic position is currently found only in Central Dravidian languages. In SDr and SCDr it shows reflexes of ṟ. However, voicing in intervocalic position is common for other phonemes in Dravidian, and this perhaps represents the retention of the original voicing pattern.

Krishnamurti (1997) proposes reconstruction of a laryngeal *H to account for the peculiar phonology of demonstrative bases as well as alternation of vowel
length in personal and reflexive pronouns. The Old Tamil grammar Tolkāppiyam presents a phoneme known as āytam in these cases, the exact pronunciation of which is unknown. Krishnamurti suggests that Old Tamil āytam was a reflex of PDr. *H, and reconstructs the demonstrative bases as *aH ‘that’, *iH ‘this’, *uH ‘yonder’. The remote and proximate forms in Kuwi–Gondi in South-Central Dravidian and Kurux–Malto in North Dravidian have /h-/ freely varying with zero. Furthermore, the numeral ‘ten’ has aspirated variants in Kannada and Telugu from early times, Te. ēnbhadi ‘fifty’ (8th century; B. Radhakrishna 1971: 249), Ka. ombhattu ‘nine’, tombhattu ‘ninety’, which agrees with Tolkāppiyam’s description of the numeral ten as *paḥtu. Thus the arguments for the reconstruction of laryngeal *H appear tenable, although the limited evidence warrants caution and requires further study before accepting it for Proto-Dravidian.

Subrahmanyam (1983, 2008) suggests reconstructing an alveolar nasal /ṉ/ for Proto-Dravidian. The nasal contrast alveolar-dental is only found in Old Tamil and Malayalam. In the majority of cases, dental and alveolar nasals occur in complementary distribution, with dental [n] in initial position and before dental stop /t/, and alveolar [ṉ] elsewhere. A few cases in Malayalam where dental [n] occurs singly or as a geminate in non-initial position can be explained as resulting from loss of the stop in an original nt cluster. In a few other cases, final dental -n seems to be a variant of –m (Zvelebil 1990: 11). Still, there are a few Old Tamil forms where dental -n is manifested postvocalically, such as warunar ‘one who comes’, and werin ‘back’, which Zvelebil thinks may be a result of over-differentiation in Tamil orthography. Given the lack of evidence of any contrast in the other languages, there is no compelling reason to reconstruct two different phonemes (also see Shanmugam 1972).
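
The complementary distribution of the two nasals stated above can be put as a one-line rule, sketched here purely for illustration. The function name expected_nasal and its two parameters are hypothetical, and the rule covers only the majority pattern; the Old Tamil form warunar cited above is exactly the kind of case it does not predict.

```python
DENTAL_N = "n"
ALVEOLAR_N = "\u1e49"  # the character written ṉ in the text


def expected_nasal(word_initial: bool, next_segment: str = "") -> str:
    """Return the nasal expected under the majority distribution:
    dental in initial position and before dental stop /t/,
    alveolar elsewhere."""
    if word_initial or next_segment == "t":
        return DENTAL_N
    return ALVEOLAR_N


if __name__ == "__main__":
    print(expected_nasal(True))        # word-initial position
    print(expected_nasal(False, "t"))  # before the dental stop
    print(expected_nasal(False, "a"))  # elsewhere
    # The postvocalic nasal of warunar stands before a vowel, so the
    # rule expects the alveolar variant although the form is cited
    # with dental n.
    print(expected_nasal(False, "a") == DENTAL_N)
```
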
McAlpin (2003) introduces an interesting argument for the reconstruction of Proto-Dravidian uvular /q/ to explain the contrast of /q/ and /k/ in Malto.⁵⁷ However, as he also notes, /q/ occurs only before mid and low vowels in Malto, and all of the North Dravidian languages, including Brahui, attest only k- before i/ī.⁵⁸ Although his proposal has some methodological appeal, McAlpin does not show why a change *k > q /___ [e,o,a] should be ruled out. The issue needs further investigation. Possible influence of languages such as Kusunda, where /q/ occurs only before non-high vowels (Watters 2005: 25), must also be investigated.

⁵⁷ It must be noted that while Brahui has /k/ and /x/ natively, it always treated /q/ as a foreign phoneme. The uvular stop /q/ occurring in Perso-Arabic loan words is almost always nativized to /x/ or /k/, e.g., waqt > vaxt ‘time’; warq > varx ‘sheet (e.g. paper)’; faqīr > p(h)axīr ‘beggar’. (Thanks to Elena Bashir for providing the examples.)
⁵⁸ Kurux’s umlauting rule, which raises *xe to xi and *xo to xu if followed by a high vowel, is an independent development.

1.6.3.2. Phonotactics

Proto-Dravidian roots have the shape (C)V₁(C) = V₁, CV₁, V₁C, CV₁C (V₁ = long or short). Alveolar and retroflex consonants do not begin a root or word. All noninitial consonants except r and ẓ can be geminated. Vowel-ending roots may take formative suffixes of the shape C, CV, CCV, or CCCV. Roots ending in C followed by consonant-initial suffixes insert a V₂ = a, i, u. There are no consonant clusters word-initially. Non-initial clusters are either geminates or sequences of nasal and stop (+ stop). No vowel clustering is allowed in Dravidian. When two vowels come together in compound words, a vowel cluster simplification rule normally deletes the first vowel. When there is no sandhi, a glide /y/ or /w/ is inserted to prevent vowel clustering (or hiatus). If a word ends in a stop, it is followed by the “enunciative” vowel /u/. Roots of the type (C)VC- and (C)VCC- contrast when followed by derivative suffixes beginning with vowels.

Krishnamurti (1991) suggests that there is a phonological convergence in the emergence of (C)V̄C(V) and (C)VCC(V) as favored syllable types in Indo-Aryan and Dravidian. Several Dravidian languages show a regular loss of gemination after a long vowel. For example, Proto-Dravidian *āṭṭa-m ‘play, game’ became āṭa in Kannada, Tulu, Telugu, and Gondi. (But see also 2.3.4.1.) In addition, many trisyllabic forms became disyllabic in the descendant languages through the loss of the unstressed vowel, as in Proto-Dravidian *mar-untu ‘medicine’ > Telugu mandu, Kannada mardu, maddu, Parji merd, and Kurux mandar.

1.6.3.3. Morphology

Dravidian morphology is agglutinative. There are no prefixes or infixes; morphological relations are expressed using suffixation and compounding. The major grammatical categories are nouns and verbs. There is no conclusive evidence for reconstructing adjectives and adverbs; many forms with these functions appear to be defective nouns and verbs. Dravidian lacks conjunctions as well; non-finite verb forms are employed in place of conjunctions. There are no articles; the numeral ‘one’ may be used as indefinite article. Use of the accusative case of a neuter noun identifies it as definite.

1.6.3.3.1. Nominal morphology

Dravidian nominals consist of nouns, pronouns, numerals, and adverbs of time and place. Nouns carry gender and number and are inflected for a variety of cases.

1.6.3.3.1.1. Pronouns, number, and gender

The reconstructed personal pronouns of Proto-Dravidian are listed in Table 1.7 (Krishnamurti 2003: 243). Many Dravidian languages distinguish inclusive and exclusive first plural pronouns. However, the distinction is also found in Marathi, Gujarati, Sindhi, and Marwari, as well as several Tibeto-Burman languages. In Austroasiatic, including many Munda languages, “clusivity” is a common phenomenon.⁵⁹ The distinction is clearly an areal feature, but we cannot easily determine the original source (Masica 2001, Osada 2004).

Table 1.7: Proto-Dravidian personal pronouns

                    Singular                   Plural
                    Nominative    Oblique      Nominative    Oblique
First (exclusive)   *yaHn/*yān    *yan         *yaHm/*yām    *yam
First (inclusive)                              *ñām          *ñam
Second              *nīn          *nin         *nīm          *nim
Third               *tān          *tan         *tām          *tam

Deictic pronouns, some of which are used as 3rd-person pronouns, are derived from the deictic bases *a-/aH- ‘that’, *i-/iH- ‘this’; in addition there is interrogative *ya/yaH- ‘what’. South and South-Central Dravidian show alternative forms for the first singular, which Krishnamurti (2003: 245) thinks is a shared innovation, reconstructible to Proto-Southern Dravidian as *ñān-/*ñan-. In South Dravidian, a three-way gender distinction occurs in the singular, e.g. *awan ‘he’, *awaḷ ‘she’, *atu ‘it’; in the plural, *awar ‘they (human)’ and *away ‘they (non-human)’ are distinguished. South-Central and Central Dravidian show a two-way distinction in the singular and plural: *awantu ‘he’, *atu ‘she, it’; *awar ‘they (men)’; *away ‘they (non-human, or women)’, which Krishnamurti (2001) thinks represents the Proto-Dravidian system. He concludes that South Dravidian innovated *awaḷ in the singular, whereas North Dravidian and Telugu independently added the semantic category of ‘women’ under *awar. North Dravidian (Kurux and Malto) also introduced gender-marked verb forms for the 1st and 2nd persons.

⁵⁹ According to Anderson (p.c.), clusivity is a Munda feature, best manifested in the least Dravidianized North Munda languages; these languages, in fact, show dual and plural inclusive and exclusive first person pronouns along with corresponding verb inflections.

There are two numbers: singular (unmarked) and plural. Plural is marked with *-(V)r for humans, and with *-k(V), *-nk(V), *-nkk(V), *-Vḷ or *-V(n)kaḷ⁶⁰ for non-humans.

1.6.3.3.1.2. Cases

The reconstructible cases are nominative, accusative, dative, and possibly instrumental/locative. Sociative and ablative cases also exist, but are not clearly reconstructible. The nominative case is unmarked. Non-nominative cases are added to the oblique stem, which can also function as the genitive. A number of postpositions denoting cause, purpose, direction, etc., with the status of independent words, are also used in different languages.

1.6.3.3.1.3. Numerals

The numerals 1 to 5 and 8 to 10 consist of a root and a fused neuter morpheme *t, *tt, *k, e.g. *on-tu ‘one’ (DEDR 990), *ir-aṇṭu ‘two’ (DEDR 474), *mū-ntu ‘three’ (DEDR 5052), *nāl-(k)ku ‘four’ (DEDR 3655), *caymtu ‘five’ (DEDR 2826), *enṭṭu ‘eight’ (DEDR 784), *pa(H)-tu ‘ten’ (DEDR 3918); the forms *cātu ‘six’ (DEDR 2485) and *ēẓ ‘seven’ (DEDR 910) are also used with neuter (non-human) agreement; a human suffix *-war is added to the numeral roots when they qualify human nouns, e.g., *mūwar ‘three persons’ : *mūntu ‘three (non-human)’.

1.6.3.3.2. Verbal morphology

An inflected finite verb consists of verb stem + (modal auxiliary) + tense + gender/number/person (GNP) markers. The Dravidian verbal stem can be intransitive, transitive, or causative. An intransitive verb may be optionally extended by suffixes resulting in transitive and causative forms, as in Telugu taḍiyu- ‘become wet’, taḍupu- ‘make someone wet’, taḍipiñcu- ‘cause to make someone wet’. Similarly, an inherently transitive verb may become causative by adding a causative suffix. An extended verb stem may contain a transitive/causative marker and a reflexive suffix. Two tenses can be reconstructed: past and non-past. The existence of positive and negative conjugations is one of the most notable features. While non-past negative constructions are found in all subgroups, a past negative construction is only extant in Konda, Pengo, Manda, Kolami-Naiki, and Old Malayalam.

⁶⁰ This plural suffix -V(n)kaḷ is a pleonastic sequence of two underlying plural morphemes: V(n)k and aḷ.

1.6.3.3.3. Adjectives, adverbs, and clitics

There is no consensus on whether adjectives are a separate part of speech. Bloch (1946: 32) and Andronov (2003: 178–181) deny a distinct category, while D. N. S. Bhat (1994: 18–41) vehemently argues for it. Amritavalli and Jayaseelan (2003) observe that most of Bhat’s arguments are functional and not syntactic. However, Krishnamurti (2003: 389) considers bound adjectives like perum-, pēr-, peru- ‘great’ a separate part of speech, as they do not behave like nominal or verbal forms. Most adjectives are nouns in the genitive. There is, however, a small class of adjectives that occur in compounds: *kem ‘red’ (DEDR 1931), *weḷ ‘white’ (DEDR 5496a), *kitu ‘small’ (DEDR 1594).

The deictic bases are: *ā/*aH ‘that’ (DEDR 1), *ī/*iH ‘this’ (DEDR 410), *ū/*uH ‘yonder’ (DEDR 557a), and *yā/*yaH ‘which’ (DEDR 5151). Demonstrative pronouns denoting person, time, place, quantity, etc. are derived from these roots; e.g. PDr. *awantu ‘that man’, *atu ‘that woman, thing’, *appōẓ(u) ‘then’ (DEDR 1); *iwantu ‘this man’, *itu ‘this woman, thing’, *ippōẓ(u) ‘now’ (DEDR 410).

Adverbs of time and place are inflected for case like noun stems, but do not carry number and gender. Adverbs are also formed from descriptive adjectives by adding an inflected form⁶¹ of the verb ‘to be’. Onomatopoetic and echo words also generally function as adverbs. There are many clitics, of which four are reconstructible: emphatic *-ē, interrogative *-ā, conjunctive *-um, and dubitative-alternative *-ō. Each language and subgroup has evolved many others, mostly representing the contraction of finite verbs.

1.6.3.3.4. Compounds

Krishnamurti (2003: 200) sets up four different compound classes: 1) verb + verb, 2) noun + noun, 3) adjective + noun, and 4) verb + noun. Extensive use of reduplication and echo words is a characteristic feature of Dravidian, which is also shared by the other linguistic families of the subcontinent. Besides echo compound formation in nouns and verbs, Chandrasekaran (2011) described a further formation called “pleonastic compounding”. For instance, the words ūma and kuñci are individually attested in the meaning ‘owl’, but the pleonastic compound ūmaguñji is also attested in a few languages with the same meaning.

61 Mainly the perfective participle form; e.g. Kannada -āgi, Telugu -gā ‘having become’.


1.6.3.4. Syntax

The Dravidian languages are left-branching SOV languages. Although extensive case marking and verbal agreement in Dravidian permit relatively free word order and omission of arguments, there is a fairly strong tendency toward verb-finality. Dravidian also shares several word order universals of SOV languages: subordinate clauses precede main clauses; adverbs precede verbs; adjectives precede nouns; main verbs precede auxiliaries; and postpositions are used instead of prepositions.

A sentence may have a VP or NP predicate. When the predicate is an NP, it is common in Dravidian to have no copula. The finite verb, inflected for tense, carries Gender/Number/Person (GNP) agreement with the subject in the 3rd person, but only in number and person in the 1st and 2nd persons. A predicate NP also carries subject agreement in many cases. Subject-predicate agreement follows a nominative-accusative pattern. The constituent NPs of the VP carry case morphemes generally interpreted in terms of the semantic structure of the verb. The main clause ends in a finite verb with the internal structure stem + tense/mode + GNP; subordinate clauses typically end in non-finite verbs. Quotatives, derived from the perfective participle of *yan ‘having said’, can serve as complementizers.

It is generally assumed that Dravidian syntactic typology permits only one finite verb per sentence, except in quotative constructions, which may have their own finite verbs. Any violations of the single finite-verb constraint, such as in relative-correlative constructions, were commonly attributed to Indo-Aryan influence (Krishnamurti & Gwynn 1985: 361, Sridhar 1990: 47, Asher & Kumari 1997: 53). This view was challenged by Ramasamy (1981), Lakshmi Bai (1985), and especially Steever (1988, 1993), who convincingly argued that relative-correlatives are native to Dravidian.
Steever further claims that Dravidian relative clauses (RC) must be followed by a clitic particle (commonly =ō), and that the absence of post-RC clitics in North Dravidian is an innovation. Hock (2008) argues that the occurrence of clitic-less relative-correlatives in geographically northern languages as well as Old Tamil-Malayalam and Old Kannada is an archaism and that the post-RC clitics in southern languages are a regional innovation. At this point, the difference between Steever’s and Hock’s views remains unresolved, and further research by other scholars is needed. Serial verb formation, where two finite verbs are used as a compound verb, is another exception to the rule of one finite verb in the Dravidian sentence (Steever 1988).

Christiane Pilot-Raichoor (2012b) analyzes Dravidian morphology and syntax based on early Tamil-Brahmi inscriptions and boldly argues that Proto-Dravidian morphology was isolating, suggesting agglutination was a later innovation. She also suggests that early Dravidian was acategorical (no noun and verb categories), and that all the modern Dravidian categorical and relational grammatical features expressed through morphological constructions result from relatively late developments. She further argues that Proto-Dravidian word order might have been predominantly OVS. While these radically new proposals raise interesting questions, there has not been any detailed reaction by other scholars. Further research and discussion are needed.

The dative subject construction is another widespread areal feature of South Asia. In Dravidian, these dative NPs have many behavioral properties characteristic of subjects (Sridhar 1979, Verma & Mohanan (eds.) 1990, Umarani 2005). However, Jayaseelan (2004) and Amritavalli (2004) argue that the dative NPs are indirect objects, rather than syntactic subjects. Nizar (2010) analyzes South Dravidian data and concludes that treating the dative NPs as syntactic subjects more fully accounts for their behavior. (See also 5.3.1.)

1.6.4. Details of the subgroups in Dravidian

1.6.4.1. Shared features in South and South-Central Dravidian

Krishnamurti (1958) demonstrates that vowel-lowering, through which the high vowels (*i, *u) merged with mid-vowels (*e, *o) before a low vowel (*a) in the next syllable, affected not only South Dravidian⁶² but also the South-Central group. He presents this as a crucial argument for positing a common source for South and South-Central Dravidian. He suggests that at a much later stage, all instances of *e, *o became i, u in Early Tamil-Malayalam. As a consequence, Tamil and Malayalam have i, u corresponding to Telugu and Kannada e, o before a in the next syllable. Here the evidence of nonliterary Central and North Dravidian is useful because it preserves the original root vowels (Krishnamurti 1958).

Another phonological change shared by South Dravidian and some of South-Central Dravidian (Telugu and Gondi) is the sporadic loss of initial *c-. Krishnamurti (2003: 122) argues that the loss of PD *c- went through the intermediate stages s- and h-, and although the missing phonetic links were not recorded, he suggests that loanwords from Dravidian into Sanskrit and the Prakrits show evidence of -s- and -h-, as in kaṭāha- ‘heifer’ (< *kaṭac-), kalaha- ‘strife, quarrel’ (< *kalac-), sarāhaya- ‘a snake’ (< saraha < *carac-). Early Tamil attests the loss of Sanskrit and Prakrit sibilants in loanwords, perhaps prompted by a similar process. Emeneau considers the sound change a possible case of lexical diffusion, which failed to cover all eligible lexical items before it ceased to operate (1994: 12–14).

In morphology, South Dravidian and South-Central Dravidian also share the feature of competing forms for the first person singular pronoun: *yān/yan and *ñān/ñan. The latter form, according to Krishnamurti (2003: 269), owes its initial consonant to the analogy of the inclusive first plural *ñām/ñam- (contrasting with exclusive *yām/yam-). The innovation of *nīr as 2nd person plural pronoun, with replacement of the inherited plural suffix -m by -r, is another feature cited by Krishnamurti as evidence for including South and South-Central Dravidian under a nested hierarchy.

62 The Brahmin dialect of Tulu doesn’t show this change. Interestingly, this development is also shared by Koraga, judging by D. N. S. Bhat’s (1971) data.

1.6.4.2. South Dravidian

Figure 1.7: South Dravidian languages

1.6.4.2.1. Shared features in South Dravidian

The South Dravidian languages reveal a number of shared developments. The creation of separate demonstrative pronouns for the feminine category (*aw-aḷ, 3SG.F) is an innovation shared by all of South Dravidian but absent in the other Dravidian languages. (Toda is an exception, having lost all gender distinctions.) Some languages also have developed feminine verb suffixes. However, it must be noted that this development is not shared by Koraga, which is considered South Dravidian by Krishnamurti (2003) and Steever (1998).

The system of expressing intransitive-transitive forms by means of an alternation NC : (N)CC⁶³ in the tense marker is widely prevalent in most of South and South-Central Dravidian. Kannada and Tulu in South Dravidian and Telugu and Gondi in South-Central Dravidian have lost this feature, compensating for it by the use of additive transitivizers -cu and -pu beside the transitive-causative markers -incu/-isu, e.g. Tel. naḍucu- ‘to walk’, naḍupu- ‘to drive’, naḍipincu- ‘to make someone walk’. However, since the North Dravidian languages show some traces of paired intransitive-transitive stems, this feature should be reconstructed to Proto-Dravidian.

Loss of *ṯ in 3SG.M *aw-anṯ, *iw-anṯ ‘he’ is another change found in all of South Dravidian. The addition of a dental to the negative participle, the use of wēṇṭu and wiṭu as auxiliaries, and the generalization of enunciative ï are other innovations shared by all of South Dravidian. Other shared features include the optional nature of the neuter plural, widespread use of the non-past marker *-pp-, and allomorphs in past markers (Subrahmanyam 2008).

63 Krishnamurti uses the notation NP to represent nasal + stop and NPP to represent nasal + stop + stop. Since NP is used elsewhere in this section for Noun Phrase, NC and NCC are used for these two phonological combinations to avoid confusion.

1.6.4.2.2. Development of subbranches in South Dravidian

Tulu is the first to branch off from Proto-South Dravidian. Several innovations shared among the languages of the Tamil-Kannada group are absent in Tulu (Subrahmanyam 2008: 13). For example, in Tamil-Kannada, the neuter plural allomorph *-kaḷ replaced the allomorph -ḷ, while Tulu retains both forms. The addition of dental root extensions for the numerals five and eight is found in all of South Dravidian except Tulu.

Kannada is the next language to split, as there is a set of features shared by Tamil-Toda⁶⁴ which are absent from Kannada. The loss of the vowel in non-initial syllable after /r, l, ẓ, ḷ/, and the change of short mid-vowel to high vowel before syllables containing high vowels are some of the innovations of Kannada not found in Tamil-Toda. The Tamil-Toda loss of the nasal in *NCC is not shared by Kannada; compare *eṇṭṭu ‘eight’ > Ta., Ma. eṭṭu, Koḍ. eṭṭï, Ko. eṭ, To. öṭ, but Ka. eṇṭu.

Loss of gender-number distinction in 3rd person verb forms and loss of short vowels in non-initial syllables are some of the important exclusive innovations in the Toda-Kota subgroup. Emeneau (1967: 370) thought that Koḍagu split off from Tamil-Malayāḷam before Toda-Kota. But Krishnamurti (2003: 270) observes that the loss of -Vn as accusative marker places Koḍagu, Kurumba, and Iruḷa closer to Tamil-Malayāḷam than to Toda-Kota. Furthermore, the addition of *-kaḷ to the 1st and 2nd singular pronouns to derive exclusive plurals is a shared feature of Tamil-Malayāḷam-Iruḷa-Kurumba-Koḍagu (Krishnamurti 2003: 248). Subrahmanyam (2008) provides three additional shared innovations to show that Koḍagu belongs to a lower node than Toda-Kota.

64 Expressions of the type Tamil-Toda indicate the range of languages from Tamil through Toda in Figure 1.7.

1.6.4.2.3. The Nilgiri linguistic microarea

The Nilgiri region is a relatively isolated, mountainous area, and the languages of this region show high diversity. Zvelebil (1980) and Diffloth (1968) identified features of diffusion and convergence among various tribal languages, and proposed that the extended Nilgiri area must be treated as a “linguistic microarea”. Toda, Kota, Irula, and Kurumba are spoken by Scheduled Tribes (officially recognized indigenous peoples) in the Nilgiri Hills of western Tamil Nadu, near Karnataka. Badagu is also spoken in the Nilgiris. Koḍagu, which doesn’t belong to the Nilgiri region, shares some of the features of the Nilgiri linguistic microarea. Zvelebil suggests the following features as characteristic of the Nilgiri microarea:

1. With the exception of Kota and Badagu, all the languages show centralized vowels (Krishnamurti 2003: § 2.1.1). Most of the Nilgiri languages have centralized vowels caused by split of i and e when followed by retroflex (or alveolar in some cases), although this does not completely define the environments that centralize vowels in Toda. For example, Proto-South Dravidian *kiḷ-i/*kiṇ-i ‘parrot’ became Kodagu gïṇ-i; Proto-South Dravidian *eṇ-ṭṭ- ‘eight’ developed to Toda öṭ; and South Dravidian kēḷ ‘to hear, ask’ is the source of Irula kë:kka (infinitive, compare Tamil kēṭ-ka).
2. Several of the Nilgiri languages preserve the contrast of the three Proto-South Dravidian coronal consonants, t : ṯ : ṭ, particularly in postnasal position and gemination.
3. A labial formative morph -VvV- is found in several Nilgiri languages. For example, the cognates for Tamil īral ‘liver’ and Kannada hīri are Irula īrvo, Kurumba īruvu, Toda ǖruf.
4. Several Nilgiri languages including Kodagu, but not Kota and Badagu, also have unrounded back vowels. Interestingly, Tulu and Koraga also participate.
5. Zvelebil also lists four shared semantic features, besides an interesting set of vocabulary items as belonging to the common stock of the Nilgiri microarea.⁶⁵

A great deal has been published on the languages, geography, and ethnography of the Nilgiris during the past several decades (see Hockings 1989, 1997, 2012).

65 Zvelebil (1990: 65) speculates that there is a substratum of pre-Dravidian languages in this region.

1.6.4.2.4. Literary languages

1.6.4.2.4.1. Tamil

Of the four literary Dravidian languages, Tamil has the oldest literary tradition, dating to the beginning of the Common Era. The earliest written evidence comes from 2nd-century BCE cave inscriptions in Tamil-Brahmi script. The first known work in the Tamil language, Tolkāppiyam (1st–5th century CE),⁶⁶ is a treatise on grammar and poetics. It is probable that a considerable body of literature was already available, perhaps in the form of anthologies. (For a detailed discussion of the indigenous Dravidian grammatical tradition, see 7.3.) The history of Tamil can be categorized into three periods, Old Tamil (200 BCE–700 CE), Middle Tamil (700–1600), and Modern Tamil (1600–present), each with its own distinct grammatical characteristics.

Old Tamil appears to preserve many Proto-Dravidian features, including consonant inventory, syllable structure, and various grammatical features. However, we can safely suggest the following changes in Old Tamil based on firm comparative evidence. Malayalam, then a dialect of Tamil, shared in these changes. Old Tamil palatalized Proto-Dravidian initial *k to *c before a front vowel. When the front vowel was followed by a retroflex consonant, the change did not occur, perhaps because the vowel was articulated centralized as in the Nilgiri languages (Burrow 1943). For example, PDr. *kewi ‘ear’ changed to *cewi, but *keṭ-u ‘perish’ remained keṭ-u. Old Tamil preserved PD *y word-initially. Intervocalic -c-, with lenis articulation -s- in many languages, shows further lenition to -y-; e.g. *ucir ‘life’ > usir > uyir (Old Tamil). Furthermore, Old Tamil shows the loss of word-final -cu with compensatory lengthening of preceding a. Optionally, a glide /w/ or /y/ may be added, as a long vowel in the second syllable is not common in Dravidian; e.g. kaṭay/kaṭā < kaṭas-u < *kaṭac-u. Old Tamil made a distinction between dental [n] and alveolar [ṉ] nasals.
Two plural forms of Proto-South Dravidian pronouns, *yām/*yam- ‘we (exclusive)’ and *ñām/*ñam- ‘we (inclusive)’, were retained in Old Tamil. It had a three-way deictic distinction including a medial deictic series with u-, which can be reconstructed to Proto-Dravidian: iwan ‘this man’, uwan ‘that man nearby’, awan ‘that man yonder’. This distinction was lost in mainland Middle and Modern Tamil, but is preserved in Sri Lankan Tamil. Old Tamil had a distinct negative conjugation; e.g. kāṇēṉ ‘I do not see’, kāṇōm ‘we do not see’. Nouns could take pronominal suffixes like verbs; e.g. peṇṭirēm ‘we are women’ formed from peṇṭir ‘women’ and the first-plural marker -ēm. Causation was expressed both lexically and morphologically. Rajam’s grammar (1992) is a useful source of reference on Old Tamil.

66 Takahashi (1995) argues that Tolkāppiyam has several layers with the oldest dating to the 1st–2nd centuries CE, and the newest and final redaction dating to the 5th–6th centuries CE. Mahadevan (2003) notes that the practice of placing a dot (puḷḷi) mentioned in Tolkāppiyam is used only in the late Tamil-Brahmi inscriptions (2nd century – 4th century).

The propagation of Jainism and Buddhism in South India led to a number of lexical borrowings from Prakrit and Sanskrit in Old Tamil Cankam anthologies. Besides the Cankam anthologies, two long epics, Cilappatikāram and Maṇimēkalai, and a number of ethical and didactic texts show the growing influence of Sanskrit Kavya literature. It should be noted, however, that the direction of influence is in no sense one-way, and it is very likely that in some instances the direction of influence is from Old Tamil to Sanskrit.⁶⁷

The Middle Tamil period was characterized by a number of phonological and grammatical changes. The disappearance of āytam, the loss of contrast between alveolar and dental nasals, and the transformation of alveolar stop /ṯ/ into alveolar trill /ṟ/ were some of the important phonological changes. In grammar, Middle Tamil developed a distinct present tense which was lacking in Old Tamil. Of the two present tense markers -kkir-/-kinr- and -aninr- found in Middle Tamil, only the first survives in Modern Tamil. In Middle Tamil, causative stems are productively formed by suffixing -wi, -pi, or -ppi. Lexical causative forms of Old Tamil were no longer used. Middle Tamil shows a significant influence of Sanskrit. Religious poems and songs of the Bhakti poets dominate the literary scene. Tēvāram verses on Saivism and Nālāyira Tivya Pirapantam on Vaishnavism, adaptations of religious legends such as the 12th-century Tamil Ramayana by Kamban, and the story of Saivite devotees known as Periyapurāṇam were produced during this period. Iraiyaṉār’s Akapporuḷ, an early treatise on love poetics, and Naṉṉūl, a 13th-century grammar that became the standard grammar of literary Tamil, are also from this period.

Modern Tamil, like Greek and Arabic, has diglossia.
The standard written and spoken variety, called centamiẓ ‘beautiful Tamil’, is based on the earlier classical language and not on any of the contemporary regional dialects. The many spoken varieties of Tamil are called koṭuntamiẓ ‘crooked Tamil’ and are not used in formal speech and writing. (For detailed discussion of diglossia, see 6.4.) Modern spoken Tamil shows a number of changes: rounding of front vowels between bilabial and retroflex (e.g., peṭṭi > poṭṭi ‘box’; peṇ- > poṇṇu ‘woman’); palatalization of tt, nt after the high front vowel (e.g., paḍittēn > paḍiccēn); and the deletion of intervocalic w and k. The negative conjugation of verbs has fallen out of use. Negation is, instead, expressed through compound verbs (e.g. wara (INF) māṭṭān (NEG.FUT.3SG) ‘he will not come’). Similarly, causation is expressed using auxiliary verbs like waika ‘place’, ceyya ‘do’, and paṇṇa ‘make’. Spoken Tamil, interestingly, shows mid vowels corresponding to high vowels in Literary Tamil when the following vowel is a.⁶⁸ Phonologically, the difference between r and ṟ is lost in most of the dialects; and ẓ merges with ḷ in some dialects, although in modern times, the sound ẓ acquired a shibboleth-like status and a great deal of attention is paid to its production and “correct” use in Tamil words (Schiffman 1980). In the early 21st century, Tamil is spoken by more than 66 million people, mostly residing in India, northwestern Sri Lanka, Malaysia, Singapore, Mauritius, Fiji, and Myanmar (Burma).

67 According to George Hart (p.c.), Sanskrit aesthetics was in many areas indebted to the earliest Tamil, such as in its use of suggestion.

1.6.4.2.4.2. Malayalam

Malayalam is the principal language of Kerala and the Lakshadweep islands, and is spoken by over 35 million people. It was the west-coast dialect of Tamil until about the 9th century CE. Separated from the main speech community by the steep Western Ghats, the dialect gradually developed into a distinct language. The first literary work is Ramacaritam (late 12th or early 13th century). The first grammar, Lilatilakam (14th century), was written in Sanskrit. Unlike Tamil, and to a greater degree than Kannada and Telugu, Malayalam has liberally borrowed from Sanskrit. Malayalam changed the combination nasal + stop (NC) to geminate nasal (NC > NN); e.g., *ṅk became ṅṅ, as in Tamil poṅku ‘boil’ : Malayalam poṅṅu. Unlike other Dravidian languages, Malayalam inflects its finite verb only for tense, not for person, number, or gender. Malayalam does not have diglossia of the Tamil kind.

1.6.4.2.4.3. Kannada

Kannada is the official language of Karnataka. Inscriptions begin in the 5th century CE. The first extant literary text is the 9th-century Kavirājamārga, a work of rhetoric containing references to earlier texts, none of which are directly attested. Pampa Bharata of 941 CE is the earliest available literary work. Kesiraja’s Sabdamanidarpana (13th century) is the first comprehensive grammar written in Kannada.
Kannada literature was influenced by the Virasaiva and Haridasa movements. The 16th century is the golden age of the Haridasa movement with Purandaradasa and Kanakadasa, the former considered the father of Carnatic music, the classical music of southern India. Modern standard Kannada is based on the educated speech of southern Karnataka (associated with Mysore and Bangalore [Bengaluru]) and differs considerably from the northern (Hubli-Dharwar) and coastal (Mangalore) varieties. There are also caste dialects within each of the regions.

Kannada shows a regular sound change of word-initial /*w/ > /b/ from the earliest records. This sound change, often associated with eastern Indo-Aryan languages, is considered a parallel development in Kannada and North Dravidian. The change spread to neighboring Kodagu and Tulu from Kannada. /*w/ > /p/ in Toda may also be due to Kannada influence through Badagu. In Classical Kannada, earlier radical /e/ and /o/ merged with /i/ and /u/ respectively when followed by a high vowel. This shift is dated to the 8th century CE. Compare id-ir ‘opposite’ < *ed-ir : Te. ed-iri ‘opponent’, Ta. et-ir; sur-i ‘pour down’ (< *cor-i) : Ta. cor-i, Te. tor-gu (Krishnamurti 1958: 467). Middle Kannada changed South Dravidian word-initial *p- to h-; e.g. Old Kannada *pāl ‘milk’ > hāl(u).

68 Bright (1966: 312) and Tieken (2008: 54) argue that the mid vowels in spoken Tamil are actually retentions from the Proto-South-Dravidian stage. Tieken suggests that the forms with raised vowels were learned forms introduced in Classical Tamil, supposed to correct a presumably careless pronunciation in the spoken Tamil.

1.6.4.2.5. Nonliterary languages

1.6.4.2.5.1. Tulu

Among the nonliterary South Dravidian languages, Tulu is spoken by the largest population, approximately 1.7 million people. Most reside in the Dakshina Kannada district of Karnataka and the Cannanore district of Kerala. The Brahmin dialect of Tulu is heavily influenced by Kannada, while the widely used “common” Tulu is used by non-Brahmin castes. Tulu speakers use Kannada as their official language. There is a growing modern literature in Tulu, but there are no known early texts.

Tulu seems to share several features of phonology, grammar, and lexicon with members of Central Dravidian, such as Parji and Kolami, and there was some ambiguity on whether Tulu belongs to South or Central Dravidian. Subrahmanyam (1968) performed a detailed comparative study and concluded that Tulu indeed belongs to the South, although it was the first language to branch off from Proto-South Dravidian. Besides the features listed in section 1.6.4.2.2, Tulu stands out from other South Dravidian languages in several respects: The verbal adjective marker -i- in Tulu contrasts with -a- in other languages. Furthermore, the Brahmin dialect seems to have preserved the original contrast between high and mid vowels before /a/, which is lost elsewhere in South and South-Central Dravidian. In Tulu and Kodagu a preceding labial consonant tends to round i and e to u and o when followed by a retroflex; compare South Dravidian *piṭ-i ‘hold, grasp’ > Tulu, Kodagu puḍ-i.

1.6.4.2.5.2. Koraga

Koraga is a minor tribal language, adjacent to Tulu. About one thousand basket makers in South Kanara district speak this language as their native tongue. Koraga is almost like Tulu in most respects, but D. N. S. Bhat (1971) suggests a possible genetic closeness with North Dravidian, based on the following grammatical features in which Koraga resembles North Dravidian.

1. The past tense suffixes -k, -g, -kk
2. Non-past suffix -n, -nn, -ṇ; Kurux also has a 3rd plural -n suffix
3. Onti, a dialect of Koraga, has an imperfect suffix -o, which could be compared with the future suffix -o of North Dravidian
4. The gender-number distinction is identical with North Dravidian
5. No plural suffixes are added to nonhuman nouns

Further, Hock (2008) points out that Koraga, like North Dravidian, has relative-correlatives without post-RC clitics as well as Quotativals,⁶⁹ and concludes that the assumption that Koraga is South Dravidian needs to be reconsidered.

1.6.4.2.5.3. Badagu⁷⁰

Badagu has approximately 135,000 speakers, who form the dominant community of the Nilgiri area both numerically and economically. This community emerged in the Nilgiri hills in the course of time by aggregating various migrant and local people. For a long time it was taken for granted that Badagu was a dialect of Kannada. There is also some historical evidence for the migration of small groups from the Kannada-speaking area, mainly in 1565–1617, who fled the crumbled Empire of Vijayanagar (Hockings 2013). However, recent studies of the language by Pilot-Raichoor (1997, 2012a) and Balakrishnan (1999) show that Badagu contains a number of archaic non-Kannada features and that there is no linguistic ground to derive Badagu from late medieval Kannada. Pilot-Raichoor suggests that the new migrants first had to acquire the local intertribal lingua franca, mainly a variant of Kurumba, to communicate and survive in the hills. A variant of this Nilgiri lingua franca later became dominant, under the name of Badagu. Balakrishnan 1999 and Pilot-Raichoor 2012a are the main sources.

1.6.4.2.5.4. Kota

Kota is spoken by approximately 2,000 speakers in the Nilgiri region.
The language, along with Badagu, remarkably doesn’t show centralized vowels although all other languages of the Nilgiri region do. Emeneau 1944–1946 and Subbaiah 1972, 1973 are the main sources.

69 Hock uses the term “Quotatival” for direct discourse (DD) structures not embedded by means of a fully grammaticalized quotative marker.

70 On Pilot-Raichoor’s suggestion, I use the term “Baḍagu” for the name of the language, instead of the traditionally used adjectival form “Baḍaga”.


1.6.4.2.5.5. Toda

Toda, spoken by a small population of about 1,600, has the distinction of having the greatest number of phonological changes within the entire family, resulting in many unique features such as the presence of three trills (post-dental, alveolar, and retroflex) and two voiceless laterals (alveolar and retroflex) contrasting with corresponding voiced ones. Toda has the largest number of vowels (14) and consonants (37) of any Dravidian language. Emeneau’s excellent grammar with texts (1984) is the main source of information.

1.6.4.2.5.6. Kodagu

Also known as Kodava or Coorg, Kodagu is spoken in the Kodagu (formerly Coorg) district of Karnataka, bordering on Kerala. Kodagu speakers use Kannada as their official language and as the language of education. Although not directly in the Nilgiris, Kodagu shares several features of the region, and therefore is grouped in the Nilgiri microarea. It shares with Tulu the regular sound change of labial consonants changing unrounded vowels to rounded. Cole 1867, Balakrishnan 1976, 1977, and Ebert 1996 are the main sources of grammatical information.

1.6.4.2.5.7. Kurumba

The “Kurumba” languages are spoken by tribal communities such as Ālu Kurumba, Beṭṭa/Ūrāli Kurumba, Cholanaika, Jēnu Kurumba, Muḷḷu Kurumba, and Pālu Kurumba. Each of these dialects seems to have freely drawn features from the nearest literary languages. It is even unclear if these speech varieties are close enough to be labeled dialects of a single language (Pederson 2012). Although Krishnamurti (2003: 21) suggests Kodagu as Kurumba’s closest sibling, more definitive data and analysis may yield surprising results. Besides the short article by Kapp and Hockings (1989), documentation of these languages is currently under way in the “Dokumentation Bedrohter Sprachen” (DoBeS) project.

1.6.4.2.5.8. Irula

Irula is spoken by a population of 2,000 in the Nilgiris. Irula is closest to Old Tamil. Besides having centralized vowels, it shows a few sound changes not found in Tamil-Malayalam such as *ẓ > y. Diffloth 1968 and Zvelebil 1973, 1979, 1982, 2004 are important sources of information.


1.6.4.3. South-Central Dravidian languages

Figure 1.8: South-Central Dravidian languages

A major change affecting all members of this subgroup, albeit to different degrees, is “apical displacement”, the shifting of apical consonants from original postvocalic position to prevocalic position in root syllables. The resulting word-initial consonant clusters tend to get simplified. For instance, Proto-Dravidian *uẓ-u ‘plow’ became Kui, Kuwi, Pengo ṛū- ‘to plow’, Telugu ḍu-kki ‘plowing’; Proto-Dravidian *car-a-cu ‘snake’ became Telugu trācu, later tācu, Konda srāsu, Kui srācu, Kuwi rācu, Pengo rāc. Konda seems to be the phonologically most conservative. Kui-Kuwi and Pengo-Manda are closely related, and share an innovation of object-verb agreement, not found elsewhere in Dravidian (Bhattacharya 1972, 1975b, Steever 1993; see also this volume, 2.6.5).

1.6.4.3.1. Literary languages

1.6.4.3.1.1. Telugu

Among the Dravidian languages, Telugu is spoken by the largest population. Telugu place names occur in Prakrit inscriptions from the 2nd century CE. The first Telugu inscription is dated to 575 CE. The first literary work, by Nannaya (11th century), is a poetic translation of part of the Mahabharata. The first Telugu grammar, Āndhraśabdacintāmaṇi, written in Sanskrit, is said to have been composed by the same author.

Telugu phonology is characterized by vowel harmony with both progressive and regressive assimilation. In most cases, it is the suffix vowels that are subject to vowel harmony, not the root vowels. Krishnamurti (1998) gives the following rules:⁷¹

– High, non-root vowels in multisyllabic forms must agree in rounding
– In trisyllabic stems, medial vowels become low if the following vowel is low
– Medial vowels change to [i] if the following syllable has a non-back vowel

71 See also 3.3.4.

Another phonological feature characteristic of Telugu is the articulation of palatal stops /c/, /j/ as palatal affricates [tš], [dž] before front vowels, and as alveolar affricates [ts] and [dz] elsewhere. Outside Telugu, this phenomenon is observed in dialects of southern Oriya, northern Kannada, Marathi, and Konkani. Since this articulation is found in all dialects of Telugu, it is considered an areal feature diffused from Telugu (Emeneau 1956: 7–8; but see this volume, 2.6.3).

Three historical stages can be traced. The early stage (200–1000 CE) is marked by deretroflexion of ḷ, ṇ between vowels and in gemination; *ṯ > ṟ intervocalically and initially; ṯṯ > ṭṭ; nṯ > ṇḍ/nd; and the merging of PD *ẓ with ḍ and r; see for example ēḷ > ēlu ‘rule’, -koṇi > -koni ‘to receive’, wānṯu > wāṇḍu ‘he’, ēẓu > ēḍu ‘seven’, kẓocce ‘(one) engraved’ > krocce. Loss of preconsonantal nasal after a long vowel results in nasalization of the preceding vowel; e.g. wāṇḍu > wā̃ḍu. Telugu also had an ancient rule of palatalization that operated without any restrictions, unlike Old Tamil, as in Telugu ceḍ-u vs. Tamil keṭ-u ‘perish’ (Emeneau 1995).

Middle Telugu (1100–1599 CE) was marked by the merger of word-initial ḍ- with d- and -ṟ- with -r-. The word-initial consonant clusters formed through apical displacement were simplified, as in mrānu > mānu ‘tree’. A major morphosyntactic development was the change of predicate NPs with subject agreement to finite verbs; for example, waccinawā̃ḍu ‘he who came’ > waccināḍu ‘he came’. The classical durative -cu(n) was replaced by -tū (from the spoken language).
In Modern Telugu (1600–present), many spoken forms excluded from earlier written literature start surfacing; for example, īyaka > iyyaka ‘not giving’; cēstimi ‘we did’ (literary cēsitimi). Modern Telugu has added /æ/, derived from internal changes as well as from Perso-Arabic and English loanwords, e.g. waccǣḍu < waccināḍu ‘he came’; tāṭǣku (tāṭi+āku) ‘palm leaf’, bǣnku ‘bank’ (Krishnamurti & Gwynn 1985: 29–30). Loss of initial *w- is also evident in auxiliaries and compounds (Subrahmanyam 2008: 147). Modern Telugu has four regional dialects: Telangana, Rayalaseema, Coastal-Andhra, and Kalinga (Krishnamurti 1998).

1.6.4.3.2. Nonliterary languages

The nonliterary South-Central languages are all spoken by Scheduled Tribes. The Gondi-Kuwi languages show shared innovations such as loss of the nasal before geminate stops, merger of nongeminated retroflex *ṭ with alveolar *r, generalization of -c(c)i as the past suffix, and creation of the plural suffixes -sk for feminine and -nk for nonhuman nouns. Gondi seems to have diverged first from the group. Konda-Kuwi shares innovations such as degemination of stops after a short vowel. Pengo-Kuwi shares the merger of the alveolar stop with the palatal and the creation of an inclusive plural suffix -as. The dialect group comprising Kui-Kuwi must have separated from Pengo-Manda some 500 or 600 years ago.

1.6.4.3.2.1. Gondi

Gondi, with many dialects in the five neighboring states of Maharashtra, Madhya Pradesh, Chhattisgarh, Orissa, and Telangana, is spoken by more than 2.5 million people. The more important dialects are Northern Gondi, Southern Gondi, Khirwar, Maria, Koya, Nagarchal, and Pradhan/Pardhan. Some dialects are probably mutually unintelligible, particularly Maria Gondi and Koya in the south and southeast. The Gonds are mentioned (as Gondaloi) by Ptolemy of Naukratis, writing in the 2nd century CE. Rao 2008 is the main source.

1.6.4.3.2.2. Konda

Konda (also known as Kubi) is mainly spoken in the hills of northeastern Andhra Pradesh. Konda comprises several local dialects, many of which are mutually intelligible. Krishnamurti (1969) notes that Konda is the only South-Central language to preserve the alveolar stop as an alveolar trill, while the other languages merged it with reflexes of *r, *ṭ, or *t. It also retains the morphological process of forming transitive stems by changing intransitive NC to CC. It has a negative past tense form, which Krishnamurti considers a retention. For descriptions and references see Krishnamurti & Benham 1998.

1.6.4.3.2.3. Kui

Proto-Kui-Kuwi has two shared innovations: 1) lowering of the mid-vowels *e/*o to *a if the next syllable has *a, and 2) loss of the nasal in NC clusters. Kui is spoken in two districts of Orissa (Ganjam and Phulbani) by an estimated population of 650,000. Sound changes exclusive to Kui include *y > j and *l/ḷ > ḍ. Winfield’s grammar and vocabulary (1928, 1929) and Burrow & Bhattacharya 1961 are the main sources of information.

1.6.4.3.2.4. Kuwi

Kuwi is spoken in the districts of Ganjam, Kalahandi, and Koraput of Orissa, and Visakhapatnam and Srikakulam of Andhra Pradesh by an estimated population of 250,000. The sound change *c- > h- is found in Kuwi but not in Kui. Burrow & Bhattacharya 1963 and J. Reddy 1979 are the main sources.


1.6.4.3.2.5. Pengo

Proto-Pengo-Manda shows the shared innovations of shortening of long vowels in initial syllables and the creation of a feminine category in the 3rd person pronouns and finite verbs. Pengo is spoken in the Navarangpur district of Orissa by about 1,300 speakers. Burrow & Bhattacharya 1970 is the main source of information.

1.6.4.3.2.6. Manda

Manda is spoken in the town of Thuamul Rampur in Orissa. Burrow 1976 and the recently published Manda-English dictionary by R. Reddy (2009) are the main sources. Reddy reports that there are two other dialects, called Indi and Āwe, which could be distinct languages.

1.6.4.4. Central Dravidian

Figure 1.9: Central Dravidian languages

The Central Dravidian languages are spoken by over 200,000 people. All Central Dravidian languages have merged the Proto-Dravidian alveolar stop */ṯ/ with dental /d/ or retroflex /ḍ/. Moreover, this parent sound retained its stop feature intervocalically, unlike in South and South-Central Dravidian, where it became a trill /ṟ/. There is a sporadic loss of *n- in various languages (Krishnamurti 1961: 17, 91–2), which is more common in the Kolami–Naiki subgroup and almost regular in Naiki (Suvarchala 1992: 20). While Subrahmanyam (2008) considers the use of the non-human plural for females a shared innovation, Krishnamurti (2003) treats it as a retention. Another shared innovation is the introduction of derivational markers for the three genders in the first four numeral classifiers. The use of -cci for the perfective participle and of -Vt/ṭ for 2nd singular finite verbs are shared innovations of the Parji-Gadaba-Ollari sub-branch.

1.6.4.4.1. Kolami

Among the Central Dravidian languages, Kolami has the largest number of speakers: about 100,000, in the Adilabad district of Telangana and the Yeotmal and Wardha districts of Maharashtra. Kolami has borrowed heavily from Telugu. Sethumadhava Rao 1950 and Emeneau 1961 are the main sources.

1.6.4.4.2. Naikri

An estimated 1,500 people speak Naikri, in close proximity to Kolami. Naikri preserves some archaic features which Kolami has lost, e.g. nēm ‘we (inclusive)’ from PD *ñām [3647]. It also preserves the *l/*ḷ contrast, which is lost in all other Central Dravidian languages. Thomasiah 1986 is the only source on this language.

1.6.4.4.3. Naiki

Naiki is spoken by a population of 54,000 in the Chanda district of Maharashtra. Bhattacharya 1961 is the only source.

1.6.4.4.4. Parji

Parji, spoken in the Bastar district of Madhya Pradesh, has borrowed extensively from Halbi, a dialect of Marathi. Parji is geographically contiguous to Ollari and Gadaba, which are spoken in the Koraput district of Orissa and the Srikakulam district of Andhra Pradesh, respectively. In pre-Parji the low vowels a and ā became e and ē when followed by an alveolar consonant, as in PDr. *kal ‘stone’ > Parji kel, PDr. *man ‘be’ > Parji men. Burrow & Bhattacharya 1953 and Subrahmanyam 1964 are the main sources.

1.6.4.4.5. Gadaba

The people who identify themselves as Gadabas belong to two different language families: one group speaks a Dravidian language (Koṇekor Gadaba), the other a Munda language (Gutob Gadaba; see 1.7.2). Bhaskararao 1980, 1998 is the main source.

1.6.4.4.6. Ollari

While Burrow and Emeneau (1961, 1984) treat Ollari as a dialect of Gadaba, Krishnamurti (2003: 26) treats it as an independent language. Ollari has two alveolar affricates, ts and dz, that do not occur in Gadaba. In Gadaba the conditional suffix is -koṭ, while in Ollari it is -koṛen. The 3rd person masculine suffix is -ṇ in Gadaba but -ṇḍ in Ollari. Ollari also has past progressives, negative past progressives, and serial verb compounds which are not found in Gadaba (Bhaskararao 1998). Bhattacharya 1957b is the main source.

1.6.4.5. North Dravidian

Figure 1.10: North Dravidian languages

Several shared features in these three languages suggest a common undivided stage deeper in history. Krishnamurti (2001, 2003) suggests the following as shared innovations:

1. PDr. *k > q, x /___ [a, e, o]. In all the North Dravidian languages only k- is attested before *i, *ī.
2. PDr. *c > k /___ [u, ū]. Before u, ū, all three languages show a velar stop /k/, while /c/ appears in the rest of Dravidian. This has been called into doubt, as Emeneau (1988: 255–256) found valid counterexamples in Kurux-Malto.
3. PDr. *w- > *b-. This is a typologically motivated change due to convergence with central and eastern Middle Indo-Aryan, as postulated by Krishnamurti (2001: 321). Subramoniam (1991–1992) suggests that this change may be attributed to the spread of Magadhan Prakrit spoken by Jains, who may have influenced Tulu, Kannada, Kodagu, Kurux, Malto, and Brahui as the Jain religion spread in these areas. The change /v/ > /b/ is found only in the central and eastern languages of Indo-Aryan; in the northwest /v/ is either retained or becomes /w/. This is another argument in favor of a recent arrival of Brahui from mainland India.
4. Kurux-Malto and Brahui share a future/subjunctive marker in -o-, which has no cognates in other Dravidian languages (Ramaswamy 1929: 117).
5. Reflexes of the Proto-Dravidian interrogative pronouns *ya-/e- (Emeneau 1962) show variants with initial n- in Brahui and Kurux-Malto.
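The three sound changes listed above (items 1–3) are conditioned only by the word-initial consonant and the vowel that follows it, so they can be sketched as a small lookup in Python. Everything named here (`initial_reflex`, `K_ENV`, `C_ENV`, the `k_reflex` parameter) is hypothetical and serves only as an illustration. The text gives both q and x as outcomes of *k without saying which language shows which, so the caller must choose; the environments are taken literally from the list, with no long vowels added to item 1. The inputs in the usage note are schematic strings, not attested reconstructions.

```python
import unicodedata

# Environments exactly as listed in the text (an assumption of this
# sketch: item 1 names only a, e, o, so long vowels are not added).
K_ENV = set("aeo")
C_ENV = set(unicodedata.normalize("NFC", "uū"))


def initial_reflex(proto: str, k_reflex: str = "x") -> str:
    """Apply at most one of the three initial-consonant changes.

    proto:    romanized proto-form, with or without a leading '*'
    k_reflex: which of the two reflexes of *k to output, 'q' or 'x'
    """
    if k_reflex not in ("q", "x"):
        raise ValueError("k_reflex must be 'q' or 'x'")
    form = unicodedata.normalize("NFC", proto).lstrip("*")
    if not form:
        return form
    first, rest = form[0], form[1:]
    following = rest[:1]  # empty string if the form is one segment long
    if first == "k" and following in K_ENV:
        first = k_reflex          # item 1: *k > q, x before a, e, o
    elif first == "c" and following in C_ENV:
        first = "k"               # item 2: *c > k before u, ū
    elif first == "w":
        first = "b"               # item 3: *w- > b-
    return first + rest
```

With schematic inputs, `initial_reflex("*ka")` gives `xa`, `initial_reflex("*ki")` leaves `ki` unchanged, `initial_reflex("*cu")` gives `ku`, and `initial_reflex("*wa")` gives `ba`. Since item 2 has been called into doubt, a fuller model would need to treat it as a tendency rather than an exceptionless rule.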

McAlpin (2003, forthcoming) suggests there is not enough comparative evidence to place Brahui in North Dravidian. He argues that Brahui is not closely related to any other Dravidian language and proposes an early branch-off in the current location. However, the absence of any old Iranian loanwords in Brahui works against his proposal. The main Iranian contributor to Brahui vocabulary is Balochi, which arrived from the west (Kurdistan area) only around 1000 CE. On the cumulative evidence available so far, it seems warranted to place Brahui in the North Dravidian subbranch, but this requires further investigation.

Hock (1996a, 2005b with references) suggests that the modern distribution of these languages may be a result of migration from the Narmada Valley within recent history. In the case of Kurux-Malto, this speculation is supported by indigenous traditions, the lack of Dravidian place names in the present-day territory of Kurux-Malto, and some evidence for Kurux influence on Nihali and Kurku in the Narmada area.

1.6.4.5.1. Brahui

Brahui is the most geographically distant of all Dravidian languages, being spoken in western Pakistan. Because Brahui does not show any archaic features, it is considered likely (Krishnamurti 2003: 142; see 1.6.4.5 above) that its speakers migrated westward from the mainland, where they had lived next to the speakers of Kurux and Malto. Most Brahui speakers are bilingual in Balochi. Under the influence of the neighboring Indic and Iranian languages, Brahui lost the short vowels e and o; Proto-Dravidian *e developed to i/a/ē, and *o to u/a/ō under different conditions. Proto-Dravidian *n and *m became d and b respectively when followed by front vowels, for example *nīr > Br. dīr ‘water’, *mēy > Br. beī ‘grass fit for grazing’ (Krishnamurti 2001: 121). The Brahui verb is complex and still developing (Bashir 2010a). Only the basic morphology is likely to be old. Bray 1909, 1934 and Emeneau 1962 are major sources of information.

1.6.4.5.2. Kurux

Kurux, also known as Oraon, is spoken by 1.7 million people in the five neighboring states of Bihar, Jharkhand, Madhya Pradesh, Chhattisgarh, and West Bengal, where it is in contact with both Indo-Aryan and Munda languages. A dialect of Kurux, called Dhangar, is spoken in Nepal, apparently as a result of recent migration to the tea plantations under British rule. Hahn 1900 and Grignard 1924 are the main sources.

1.6.4.5.3. Malto

Malto is spoken by nearly 100,000 people in the Rajmahal hills of Bihar and in West Bengal. Kobayashi’s recent publication (2012) is a valuable addition to Mahapatra’s authoritative description (1979).

1.6.5. Conclusion

It is almost 200 years since Ellis (1816), the first scholar to recognize Dravidian as a separate language family, wrote the “Dravidian Proof”, and we have come a long way in our understanding since then. Substantial progress has been made since Caldwell’s pioneering Comparative grammar of the Dravidian or South-Indian family of languages (1856). Detailed comparative reconstructions of the Proto-Dravidian language at the phonological (Emeneau 1970, Zvelebil 1970, Subrahmanyam 1983, 2008, Krishnamurti 2003) and morphological levels (Krishnamurti 1961, 2001, 2003, Subrahmanyam 1971, 2013, Shanmugam 1971, Zvelebil 1977, Andronov 2003) are among the major achievements in the field. Burrow and Emeneau’s Dravidian etymological dictionary (1961, revised 1984) is a landmark contribution to Dravidian linguistics. The Dravidian languages, edited by Sanford Steever (1998), is a good summation of research on many important Dravidian languages. The Dravidian languages by Krishnamurti (2003) is a major work of consolidation and serves as the most comprehensive and authoritative source of reference. Dravidian syntax is relatively less explored, though Steever’s work (1988, 1993) provides a brilliant analysis of serial verb formation and relative-correlatives (see also Ramasamy 1981, Lakshmi Bai 1985, Hock 2005a, 2008). Subbarao’s recent book (2012), focusing on South Asian synchronic syntax, provides excellent coverage of Dravidian.

However, when compared to the work on other linguistic families, such as Indo-European, progress in Dravidian appears minimal. The celebrated Dravidian etymological dictionary gives an extensive list of cognates, but does not offer reconstructions. Although Krishnamurti provides reconstructions for about 500 entries (Krishnamurti 2003: 523–533), systematic historical reconstruction for all known cognates of Dravidian is still pending.
Historical grammars are available for four literary languages, but careful, scholarly analyses of these grammars from a comparativist perspective are still missing. There are still no good descriptions with texts and lexicon for several nonliterary languages and their dialects. The relative position of Koraga and the Kurumba languages in the Dravidian family tree continues to be doubtful. Application of computational techniques to the problems of Dravidian linguistics has been very limited.

The ancient history of the Dravidian languages is still shrouded in mystery. As discussed in 1.6.1.2, there have been several attempts to link Dravidian with other language families, but none is convincing. Any attempt to postulate a macro-family must be based on well-established reconstructions for the members of that family. However, for many of the language families to which Dravidian has been compared, reconstruction is still fragmentary. A multidisciplinary pursuit involving linguistics, archaeology, and history, correlated with new insights from genomic studies, will hopefully lead to a better understanding of Dravidian prehistory.

The number of Western and Indian scholars specializing in Dravidian linguistics is in steady decline. The few international scholars still interested in Dravidian linguistics are driven by interest in Dravidian/Indo-Aryan contact or macro-family relationships. The long-term health and vitality of Dravidian studies requires the development of high-quality research institutes, especially in India. Creative use of information technology to generate a large corpus of linguistic information on the Dravidian languages and to make it accessible to the international linguistic community will also go a long way toward sustaining the momentum attained in the last century.

1.7. Austroasiatic languages of South Asia
By Gregory D. S. Anderson

1.7.1. Classification of the Austroasiatic language phylum

The Austroasiatic [AA] languages are a phylum of languages spoken by mostly small population groups residing in primarily remote and inaccessible hilly or mountainous regions throughout Southeast Asia, as far west as central India and as far east as Vietnam. There are roughly one hundred and seventy Austroasiatic languages, which belong to numerous subgroups. The Austroasiatic languages of South Asia belong to three recognized subgroups: Munda (1.7.2), Khasic (1.7.3), and Nicobarese (1.7.4).

It was traditionally believed that a split in the Austroasiatic phylum happened at some point in the distant past between the ancestors of the languages of the Munda stock, now spoken mainly in central and eastern India and adjacent parts of Nepal and Bangladesh, and those of the remainder of the family, called Mon-Khmer, scattered throughout Southeast Asia (Wilhelm Schmidt 1906, Shafer 1952, Pinnow 1960b, 1963, 1965, Diffloth 1989, 2005, Donegan 1993; Peiros (1998) denies the early split on the basis of lexicostatistic data). This view has now been abandoned by most researchers, and Mon-Khmer is no longer considered a valid taxon. Sidwell 2009 is a thorough assessment of the classification of Austroasiatic.72

72 Kemiehua [kfj] and Kuanhua [xnh], each spoken by merely a thousand people in Jinghong County, Xishuangbanna Dai Autonomous Prefecture, Yunnan Province, China, near the Laotian border, have been tentatively assigned to Austroasiatic but not yet to any subgroup (Lewis 2009).


The AA phylum has been splitting apart for several millennia. It is conventional to speak of macro-areal cultural and linguistic convergence zones in southern and eastern parts of Asia. Munda languages form part of the so-called “Indosphere” of areal influences and most of the other language families are part of the “Sinosphere” (Bradley et al. 2003).73 Vietnamese is the most extreme case of contact, convergence, and restructuring towards Sinospheric norms; other groups show varying degrees of such influence.74 Munda languages reflect certain Indospheric norms (retroflexion, SOV syntax, etc.) found in Indo-Aryan, Dravidian, and South Asian Tibeto-Burman.

In many instances AA languages are spoken by only a few hundred or a few thousand speakers. However, AA languages are also the national majority languages of both Cambodia (Khmer) and Vietnam (the heavily Sinicized Vietnamese). In addition to Munda, AA consists of the following families: Bahnaric, Pakanic, Katuic, Khasic, Khmeric, Khmuʔic, Mangic, Monic, Palaungic-Wa, Pearic, Vietic, Aslian, and Nicobaric.75

Aslian is a major subgroup of Austroasiatic spoken primarily in Malaysia and adjacent areas of Thailand (Wilhelm Schmidt 1901, Benjamin 1976a, 1976b, Diffloth 1976b, 1976c, Haji Omar 1976, Adams 1989).76 The primary split is between a southern group (Semelaic) and a branch consisting of the northern (Jahaic) and central (Senoic) subgroups. Semelai [sza] is the best described of all Aslian languages (Kruspe 2004).77 Senoic consists of several languages.78 The most important of these is Semai [sea], the largest group, with possibly as many as 20,000 speakers. Temiar [tmh], with perhaps 10,000 speakers, is the best studied of this group (Carey 1961, Benjamin 1976b). Jah Hut [jah] may constitute an isolate branch within Aslian, although others consider it a divergent member of the Senoic subgroup (Diffloth 1976a). Jahaic languages are mainly spoken by very small groups of a few hundred speakers at most.79 Jehai (Jahai proper) [jhi] is the best described (Burenhult 2005).

73 It is proper to recognize a third linguistic zone of influence in Austroasiatic, the “Malayosphere”, which should be considered in the histories of both Aslian and Nicobarese.
74 Unsurprisingly, some of these interior Southeast Asian AA languages show significant homologies with Tai-Kadai and Hmong-Mien languages as well, two other families participating in the Sinospheric convergence zone.
75 It has been proposed that Viet-Muong may rather be coordinate with or sister to the remainder of “Mon-Khmer”. This proposal has little currency today.
76 Ethnoracially, the so-called Orang Asli of Malaysia fall into three subgroups, the Semang/Negrito, the Sakai/Senoi, and the Jakun/Aboriginal Malay (Parkin 1991: 41). The first name in each case was traditional but has become stigmatized, with the latter variant in each instance now being preferred. Curiously, some Semang/Negrito speakers prefer Sakai, although this is now considered offensive to those whom it originally designated (Parkin 1991: 42). Importantly, the linguistic subgroups of Aslian correspond only partially to this ethnoracial categorization.
77 The Semelaic branch consists of a small number of languages, each of which probably has fewer than 2,000 speakers. The Semelaic languages include, in addition to Semelai and Temoq, Semaq Beri [szc] and Maq Betiseq [mhe]; other names for these latter two include Mah Meri and Besisi.
78 Senoic also includes Lanoh [lnh] and the poorly known Sabüm [sbo].

“Mon-Khmer” used to be considered an important Austroasiatic subgroup (Haudricourt 1965, Thomas & Headley 1970, Gregerson 1976, Huffman 1976a, Ferlus 1974, 1979, 1980, Adams 1989, Shorto 1976, 2005/2006), but contemporary researchers such as Sidwell (2009) reject Mon-Khmer as a coherent notion. Further research will refine and revise the classification and internal relations of the Mon-Khmer languages; a conservative approach is offered here.

Bahnaric is a large group of minority languages spoken in southern central Vietnam, southern Laos, and northwestern Cambodia (Thomas 1971, 1980, Smith 1972, 1975, 1992, Gregerson et al. 1976, Bauer 1990, A. Ju. Efimov 1990, Smith 1992, Sidwell & Pascale 1999, 2003–2004, and later). The total number of speakers of all Bahnaric languages is likely less than one million.80

Katuic languages are spoken in the region where Laos, Cambodia, and Vietnam meet.81 There are two subgroups, conventionally called Eastern and Western Katuic (Peiros 1996, Diffloth 1983, Miller & Miller 1996). The total number of speakers of Katuic languages is approximately 200,000–300,000. Katuic speakers live mainly in Laos; many are undergoing shift to Lao.

Khmeric consists of two languages: Central or Standard Khmer [kmr], the national language of Cambodia, and Northern Khmer [kxm] spoken mainly across the border in Thailand, by as many as seven million speakers. Khmer has been attested since the seventh century and appears in at least four historical stages: Pre-Angkorian, Old Khmer, Middle Khmer, and Modern Khmer (Jacob 1992, Jenner & Pou 1982, Gorgoniev 1974).

79 The Jahaic subgroup includes Negrito groups as well as racially Senoic Chewong [cwg]. The Jahaic subgroup includes such languages as Kintaq (Kintaq Bong) [knq], Minriq [mnq], Mintil [mzt], Batek [btq], Kensiu [kns], and Tonga/Mos [tnz], which is mainly spoken in Thailand, and probably also the Lowland Semang [orb] of Sumatra, with several thousand speakers.
80 There are three or four major subdivisions within Bahnaric. The entire Southern subgroup is spoken in Vietnam, as are all but one of the Northern Bahnaric languages (Talieng [tdf] is spoken in Laos). West Bahnaric languages, on the other hand, are not found in Vietnam at all, but are dispersed throughout various enclaves in Laos and Cambodia. The Central Bahnaric languages, which include Bahnar proper, are a disparate group of five languages scattered across Vietnam, Laos, and Cambodia.
81 The Katu [ktv] proper (Costello 1991), the Pacoh (Alves 2006) and the closely related Phuong [phg], and the Khua [xhv] live in Vietnam, and the Kuy and Western Bru [brv] in Thailand (and northern Cambodia).


Khmuʔic consists of approximately a dozen languages scattered across Laos, Vietnam, and Thailand, with small enclaves in Myanmar and China as well (Smalley 1961, Svantesson 1983, Premsrirat 1987, 2004).82 The total number of Khmuʔic speakers is moderately large, with Khmuʔ [kjg] proper the largest group, having between 350,000 and 500,000 speakers in numerous local variants.

The Monic branch consists of just two languages: Mon [mnw] of Myanmar and Thailand, and Nyah Kur [cbw] of Thailand, with no more than a few thousand speakers (Huffman 1980, Diffloth 1984, Bauer 1989). Ethnic Mon may number nearly half a million, but the total number of speakers is significantly less, possibly only a tenth of that figure. Mon, like Khmer, has a long literary tradition, with texts dating back one thousand years; isolated inscriptional sources date back as far as the 7th century.83

The languages of the important Pearic branch were spoken by around 8,000–10,000 people in Cambodia before the ravages of the Vietnam War and the subsequent terror imposed by the Khmer Rouge (Headley 1977, 1978). Only a handful of speakers of the half dozen or so languages may remain.84

Members of the widespread Palaungic-Wa branch are found scattered throughout Myanmar, Thailand, Laos, and Yunnan province, China.85 The total number of Palaungic-Wa speakers is likely over one million.

Lametic languages (Charoenma 1983), consisting of Con [cno] with perhaps 1,000 speakers and Lamet [lbn] with maybe 10,000 speakers, show considerable influence from Khmuʔic, linguistically and culturally, enough to make their classification unclear. It is possible that they were originally speakers of a Palaungic language.

82 The Khmuʔic branch is further subdivided into the Khao, Mlabri, Xinh Mul, and Mal-Khmuʔ subgroups.
83 The Nyah Kur probably represent the remnant of an old Mon kingdom of southern Thailand. In Thai they are called Chaubon. Both ethnonyms mean ‘mountain people’.
84 The languages of the Pearic branch include Ch[h]ong [cog], known for its unusually developed system of register/voice quality contrasts characterizing its vowel system, Pear [pcb], Samre [scc], Somray [smu], Sa’och [scq], and the poorly known Suoy [syo] (not to be confused with the Katuic-speaking group of the same name).
85 Several divergent groups are to be found within this branch. The major languages or subgroups are the moribund Danau/Danaw, the various divergent Angkuic groups, Palaung proper (Paulsen 1992, Milne 1921), Riang, and the large Waic group with multiple subdivisions. The large and diverse Waic family constitutes a heterogeneous group. The number of Waic languages and the internal divisions remain open questions, despite considerable work by Diffloth (1980). Most Waic languages are spoken by small populations which range from under 100 to more than 100,000. Wa is often known as Va in China; other common ethnonyms referring to Wa-speaking groups include (Rankin 1991: 111): Vu, Vo, Lave, Ravet, Krak, Kut Wa, Hsap Tai, and Gaungpyat (“head-cutting”).


The large and diverse branch of Austroasiatic known as Vietic consists of an as yet indeterminate number of languages spoken primarily in Vietnam and adjacent parts of Laos (Hayes 1983, 1984, Premsrirat 1996). First and foremost belonging to this branch is Vietnamese [vie], far and away the Austroasiatic language with the most speakers, with perhaps as many as 60,000,000–70,000,000. In fact, Vietnamese (Ferlus 1992) has more speakers than all the other AA languages combined.86 Many Vietic languages are undergoing rapid assimilation to Vietnamese.87

In addition to the subgroups adduced above, there are a number of as yet unclassified or isolated groups. Most are relatively recently described minority languages, from China and Vietnam in particular. The Mang [zng] or Mang U of Vietnam and China number perhaps 1,000 speakers.88 The Palyu/Bolyu [ply], who occupy the Guangxi-Guizhou border region of China, have also been identified as Austroasiatic (Edmondson 1995, Edmondson & Gregerson 1996). They are locally known as Lai.89

86 Highly divergent within the family, with a developed tone system (Haudricourt 1954), lack of minor syllables and with monosyllabic structure, lack of affixation processes, and heavy lexical influence from Chinese, the AA affiliation of Vietnamese was not established until relatively recently (and is still disputed by some).
87 Among the other languages of the branch, Muong [mtq] stands out with at least 400,000–500,000 speakers, possibly a million (Sokolovskaja & Nguyen 1987). Most other Vietic languages have from several hundred to several thousand speakers, and are poorly known or indeed unattested linguistically, save perhaps an isolated word list.
88 Mang has similarities with Khmuʔic and Palaungic-Wa, more with the latter, but may constitute its own subgroup within Austroasiatic.
89 Not to be confused with the Tibeto-Burman Lai Chin of Bangladesh and Myanmar. Two recently identified languages, Bogan [bgh] and Buxinhua [bxt], seem to belong to this Pakanic group as well. Bugan [bbh] may belong here too (Li 1996).

Many Austroasiatic languages exhibit unusual or noteworthy phonological features, such as the predilection for sesquisyllabic (“one-and-a-half-syllable”) words consisting of a minor/reduced syllable followed by a major/full syllable (Cohen 1965, Diffloth 1976b, Nacaskul 1978, Thomas 1992). Examples of such words, with the atypical initial clustering that typifies Austroasiatic languages, can be found even in the names of several of them, e.g. Khmer [kmr], Khmu(ʔ) [kjg], Sre [kpm], (C/E/S) Mnong [cmo/mng/mnn], Mrabri [mra], etc. Vowel systems among Austroasiatic languages are frequently highly developed, with elaborate systems of back unrounded vowels, centralized vowels, diphthongs, etc., often in combination with various phonation types or register phenomena (e.g. Huffman 1976b). Phonation types include creaky voice, breathy voice, etc. This combination of large-core vowel systems and phonation types yields exploded inventories of syllable nuclei and/or vowel phonemes in various individual Austroasiatic languages. These rank among the largest, if not the largest, of such inventories in the languages of the world.

Tense/aspect morphology is not common among non-Munda Austroasiatic languages but may be found in Lyngngam of the Khasic branch of Austroasiatic (see 1.7.3 below) and in certain Bahnaric and Katuic languages. In addition to Munda, certain Aslian languages also show subject agreement in the verb, but otherwise this feature is not common in Austroasiatic, where uninflecting TAM particles (and/or auxiliaries) predominate.

Various proposals for larger genetic units that include Austroasiatic with Austronesian, Tai-Kadai (Kradai), Hmong-Mien (Miao-Yao), and even Sino-Tibetan (Benedict 1976) have been made, but none are accepted by specialists; they should therefore be treated with caution.

1.7.2. Munda

Munda languages are spoken in eastern and central India, primarily in the states of Orissa, Jharkhand, Bihar, and Madhya Pradesh and in adjacent areas of West Bengal, Maharashtra, and Andhra Pradesh. Some Munda speakers are found in expatriate or diaspora communities throughout India, and in Nepal and western Bangladesh as well. In earlier literature Munda is often referred to as Kol or Kolarian.

There is a major split in Munda between a North Munda and a South Munda subgroup. At least the following languages belong to the older and more internally diversified South Munda: Sora (Savara) [srb], Juray [juy], Gorum (Parengi ~ Parenga) [pcj], Gutob (Gadaba) [gbj], Remo (Bonda/Bondo) [bfw], Gtaʔ (Didey) [gaq], Kharia [khr], and Juang [jun]. It is clear that Sora and Gorum form a branch of their own, as do the closely related Gutob and Remo. Gtaʔ has been traditionally linked with Gutob-Remo in a so-called Gutob-Remo-Gtaʔ subgroup. In turn this has been coordinated with Sora-Gorum in a Koraput Munda group (Zide 1969). Kharia and Juang have been linked together in a putative Kharia-Juang branch as well (Stampe & Zide 1968).

Bhattacharya (1975a, 1975c), on the other hand, reckons i) a Lower Munda group based on the absence of object agreement, which is Zide’s Gutob-Remo-Gtaʔ group, and ii) an Upper Munda group that consists of the remaining aforementioned groups (North Munda, Kharia-Juang, Sora-Gorum). One obvious problem with this is the lack of object agreement (the diagnostic typological feature) in the putatively “Upper Munda” language Kharia. These larger classifications are tenuous and remain to be adequately demonstrated (Anderson 2001).


Table 1.8: Different classifications of Munda (Key: GRG Gutob-Remo-Gtaʔ; NM North Munda; SG Sora-Gorum; SM South Munda)

Language | Zide 1969 | Bhattacharya 1975c | Anderson 2001
---------|-----------|--------------------|--------------
Korku | NM > Korku | Upper Munda | NM > Korku
Santali | NM > Kherwarian | Upper Munda | NM > Kherwarian
Mundari | NM > Kherwarian | Upper Munda | NM > Kherwarian
Ho | NM > Kherwarian | Upper Munda | NM > Kherwarian
Bhumij | NM > Kherwarian | Upper Munda | NM > Kherwarian
Turi | NM > Kherwarian | Upper Munda | NM > Kherwarian
Birhor | NM > Kherwarian | Upper Munda | NM > Kherwarian
Asuri | NM > Kherwarian | Upper Munda | NM > Kherwarian
Agarija | NM > Kherwarian | Upper Munda | NM > Kherwarian
Bijori | NM > Kherwarian | Upper Munda | NM > Kherwarian
Koda | NM > Kherwarian | Upper Munda | NM > Kherwarian
Korwa | NM > Kherwarian | Upper Munda | NM > Kherwarian
Koraku | NM > Kherwarian | Upper Munda | NM > Kherwarian
Mah[a]li | NM > Kherwarian | Upper Munda | NM > Kherwarian
Karmali | NM > Kherwarian | Upper Munda | NM > Kherwarian
Juang | SM > Kharia-Juang | Upper Munda | SM > Juang
Kharia | SM > Kharia-Juang | Upper Munda | SM > Kharia
Sora | SM > Koraput > SG | Upper Munda | SM > Sora-Gorum
Juray | SM > Koraput > SG | Upper Munda | SM > Sora-Gorum
Gorum | SM > Koraput > SG | Upper Munda | SM > Sora-Gorum
Gutob | SM > Koraput > GRG | Lower Munda | SM > Gutob-Remo
Remo | SM > Koraput > GRG | Lower Munda | SM > Gutob-Remo
Gtaʔ | SM > Koraput > GRG | Lower Munda | SM > Gtaʔ

South Munda languages range in speaker number from 300,000+ (Sora), to 150,000–200,000 (Kharia), to approximately 30,000–50,000 (Gutob90), to around 15,000 (Juang). The remaining South Munda languages have around 3,000–8,000 speakers each. North Munda opposes Korku [kfq] and a large dialect/language continuum called Kherwarian. Kherwarian includes both the largest of the Munda languages, Santali [snt], with nearly seven million speakers, as well as the smallest, Koda [cdz] and Turi [trd], each with only a couple of hundred speakers remaining. Other languages include Ho [hoc] with over one million speakers, and Mundari with more than two million. Minor Kherwarian varieties include such languages as Agariya [agi], Asuri [asr], Bhumij [bhm/muw], Bijori [bix], Birhor [biy], Karmali [kfl], Koraku [ksz], Korwa [kfp], and Mahali [mjx]. Publications may be found in the larger of the Kherwarian languages (Mundari, Ho, Santali), including a range of Santali publications in a native orthography (the Ol’ Cemet script; Zide 1967, 1999–2000).

90 This number includes Dravidian-speaking Gadaba as well; the number of Munda-speakers is far less.

It is clear that Munda languages are Austroasiatic lexically (Pinnow 1959). Morphosyntactically, the highly synthetic Munda languages differ radically from their isolating sister languages to the east. All Munda languages are moderately agglutinating and show SOV basic clause structure. An extreme example of this agglutinative morphological structure comes from the following Kharia word with 8 morphemes:

(41) ɖoɖ-kay-ʈu-ɖom-bhaʔ-goɖ-na-m
     carry-BEN-TLOC-PASS-quickly-COMPLT-FUT-2
     ‘get yourself there for me quickly’ (Kharia; Malhotra 1982)

Among the more unusual phonological features of the Munda languages from a South Asian areal perspective are such features as low tone in Korku (Zide 2008a): bulù ‘thigh’ vs. lulu ‘REDPL:draw water’ or creaky voice in Gorum (Anderson & Rau 2008). Stops in final position typically have a characteristically “checked” or pre-glottalized articulatory feature (Zide 1958), readily distinguishing Munda languages from others of the subcontinent. In Mundari coda-position consonants (Osada 1992, 2008), the glottis is closed and the tongue and the lips simultaneously form an oral closure, then the glottal closure is released, which is optionally followed by nasal release and voicing. In Mundari a nasal release is found only in monosyllables: /ub/ ‘hair’ [uʔb̥m] but /udub/ ‘to tell’ [uduʔb]; /rid/ ‘to grind’ [riʔd̥n] but /birid/ ‘to stand up’ [biriʔd̥].
Kharia (Peterson 2008) shows a similar type of articulation for coda-position obstruents but the nasal release is not restricted to monosyllables. In Remo (Fernandez 1968, Anderson & Harrison 2008) the nasal release associated with checked consonants is an idiolectal feature. Like other Austroasiatic languages, Munda languages make extensive use of diphthongs; Santali has at least fifteen separate diphthongs and even triphthongs. Phonemic nasalized vowels are also found in Munda languages, e.g. Juang (Patnaik 2008) tɔɔrɔ ‘I fastened’ vs. tɔ̃ɔ̃rɔ ‘elephant’s trunk’, or Remo (Bhattacharya 1968) nkwĩ ‘father-in-law’ vs. nkwi ‘younger sister’. Monosyllables with short vowels are generally found only with particles and high-frequency function words in (at least South) Munda; otherwise a minimal-word constraint necessitates a minimum of two morae in any phonologically freestanding word (Anderson & Zide 2002, Anderson 2004).


A weak-strong prosodic word pattern is pervasive in the syllable structure of Munda languages and their systems of stress assignment. For example, in Kharia the low-high pitch contour, according to Peterson (2008), defines the domain of the phonological word. Kharia words begin with a low-tone pitch that gradually rises throughout the word, as in rochoʔb ‘side’ [rɔ.chɔ́ʔb̚m]; monosyllabics have a rising contour, as in laŋ ‘tongue’ [lǎŋ]. Santali has fixed second-position stress (Ghosh 2008). In addition, Osada (1992, 2008) states that in Mundari, a quadrisyllabic “syntactic” word is divided into two bisyllabic phonological words. Accent is allocated to each 2-syllable phonological word, e.g., aká+dandá ‘to feel astonished’, even if the word is not morphologically analyzable, as in this example. Finally, Plains Remo (Anderson & Harrison 2008; Fernandez 1968), like Santali, shows a majority of words with second-position stress, with subsequent even-numbered syllables getting secondary stress.

Many words in Munda languages consist of two syllables reflecting the prosodic word pattern of a weak syllable followed by a strong syllable, e.g., Ho ape ‘you (PL)’, bulu ‘thigh’, enɖel ‘meal leftovers’, daʈob ‘press compactly’, halmaɖ ‘salt lick’, tumbrub ‘short’, or dursu ‘aim (a bow)’. Gtaʔ has lost unstressed vowels in many initial weak syllables and has recreated a word-structure reminiscent of the sesquisyllabic word structure (or minor+major syllable sequence) that characterizes the majority of the Austroasiatic family: bsa ‘to grow long hair’, bnoʔ ‘ladder of single bamboo’, tmwaʔ ‘mouth’, lgoʔ ‘neck’, tboʔ ‘earth, ground’, etc.

Ghosh (2008) describes an ATR-type of height harmony system for Santali: within the same stress group, if /i/ or /u/ occurs, /ə/ but not /a/ will occur, while /a/ but not /ə/ co-occurs with /e o ɛ ɔ/. Further, if there is /ɛ/ or /ɔ/ in the first syllable of a stress unit having more than one syllable, there must always be /ɛ/ or /ɔ/ in the following syllables, never /e/ or /o/. In the Ho of Mayurbhanj district, Orissa, front harmony is seen in certain suffixes (Anderson, Osada & Harrison 2008), e.g. the progressive present suffix -tAn- or the declarative/finitizer in -A; see (42).

(42) a. tʃimiŋ hoː-ko kadʒi-ten-e
        how.many Ho-PL speak-PROG-FIN
        ‘How many Ho speak (their language)?’
     b. tʃimiŋ hoː-ko dʒagar-tan-a
        how.many Ho-PL speak-PROG-FIN
        ‘How many Ho speak (their language)?’ (Mayurbhanj Ho; Field Notes, KCN)91

91 [KCN] and other similar abbreviations, [DH], [CMH], [SoDM], [SuM], etc. stand for the names of the consultants who offered the forms; they come from the author’s field notes.
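The Santali co-occurrence restrictions reported from Ghosh (2008) above can be restated as a small checker. This is a minimal sketch under stated assumptions, not an implementation of Ghosh's analysis: the function name and rule labels are invented here, the input is assumed to be the vowels of a single stress group (one per syllable, segmentation left to the caller), and the third rule is read as "no /e/ or /o/ after an initial /ɛ/ or /ɔ/".

```python
HIGH = {"i", "u"}
MID = {"e", "o", "ɛ", "ɔ"}
LOW_MID = {"ɛ", "ɔ"}
HIGH_MID = {"e", "o"}


def harmony_violations(vowels):
    """Return labels of the restrictions violated by one stress group.

    vowels: list of vowel symbols, one per syllable, in order.
    """
    present = set(vowels)
    violations = []
    # /i u/ pattern with /ə/, not with /a/.
    if present & HIGH and "a" in present:
        violations.append("a-with-high")
    # /e o ɛ ɔ/ pattern with /a/, not with /ə/.
    if present & MID and "ə" in present:
        violations.append("schwa-with-mid")
    # Initial /ɛ ɔ/ excludes /e o/ in later syllables of the same group.
    if (
        len(vowels) > 1
        and vowels[0] in LOW_MID
        and any(v in HIGH_MID for v in vowels[1:])
    ):
        violations.append("high-mid-after-low-mid")
    return violations


# Vowels of Santali mɛrɔm 'goat' and hɔpɔn 'son', cited later in this section.
print(harmony_violations(["ɛ", "ɔ"]))  # []
print(harmony_violations(["ɔ", "ɔ"]))  # []
# Hypothetical vowel sequences that break one rule each.
print(harmony_violations(["i", "a"]))  # ['a-with-high']
print(harmony_violations(["ɛ", "o"]))  # ['high-mid-after-low-mid']
```

Whether longer inflected forms count as one stress group or several is an empirical question the sketch does not address.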


Munda languages make extensive use of auxiliary verb constructions (Hook 1991, Anderson 2007b). The auxiliary verb is generally the finite verb of the clause and appears in clause-final position following the lexical verb in the sentence; (43).

(43) no anɖigna niŋ bur-o a-gon-t-iŋ
     you without I live-CV NEG-CAPABIL-NPST-1
     ‘I can’t live without you.’ (Remo; Field Notes, SoDM)

In Ho, the lexical verb in the desiderative AVC appears in a form marked by what functions as an ablative case marker with nominals.

(44) aliŋ baro tʃa ɲu-te=liŋ sanaŋ-tan-a
     we.DU two tea drink-ABL=1DU DESID-PROG-FIN
     ‘We two wish to drink tea’ (Ho; Field Notes, CMH)

Doubled or serialized agreement is found in auxiliary verb constructions in Gorum (45), an inflectional pattern that is characteristic of local Dravidian languages as well, e.g. Parji, Gondi, etc. (Anderson 2003, 2006).

(45) miŋ ne-gaʔ-ru ne-laʔ-ru
     I 1-eat-PST 1-AUX-PST
     ‘I ate vigorously’ (Gorum; Aze 1973)

Fused formations that likely derive from V-Aux structures are also common, as in the Juang formation below, where the perfect markers represent fused original auxiliary formations, here deriving from a structure of *ma’d+dʒim+sɛ{ɖ}-ɔ or ma’d+dʒim+sɛ{ɖ}-kɛ, respectively.

(46) a. aiɲ ma’d-dʒim-sɛr-ɔ
        I beat-AUX-PRF-PST.II
        ‘I was beaten’
     b. aiɲ ma’d-dʒim-sɛ-kɛ
        I beat-AUX-PRF-PRS.II
        ‘I am beaten’ (Juang; Pinnow 1960a)

Such structures probably underlie the many tense/aspect inflectional affixes in Kherwarian languages like Ho (47). Thus, the complex suffix form in -le-n probably derives historically from a fused auxiliary structure, but it was certainly already present as an affix in Proto-Kherwarian and most likely in Proto-North Munda as well (Anderson 2007b).

(47) iɲ dʒajpura hatu-r=iɲ dʒonom-le-n-a
     I Jaypur:GEN village-LOC=1 born-T/A-ITR-FIN
     ‘I was born in Jaypur village’ (Ho; Field Notes, DH)

The inflectional characteristics of South Munda nouns include the use of an unusual objective case prefix *a- that conforms to a primary object pattern in Dryer’s (1986) sense, i.e., it marks “accusative” in mono-transitive and “dative” in ditransitive formations. It probably was originally restricted to pronouns but has been expanded to mark other nouns in various languages.

(48) a. a-no tajak-t-iŋ
        OBJ-you kick-NPST-1
        ‘I kick you’ (Remo; Field Notes, SoDM)
     b. no a-niŋ dʒu-lo-tə-no
        you OBJ-I look.at-CVB-NPST-2
        ‘You are looking at me’ (Remo; Field Notes, SuM)

A likely cognate element is found in some Katuic languages like Pacoh (or Ta’oih). Here too are found the same unusual features that are found associated with its potentially cognate South Munda element. In Pacoh, this case prefix marks dative with first and second person pronouns:

(49) ʔa-maj      ʔa-ɲaŋ
     DAT-2       DAT-1DL
     ‘to you’    ‘to us 2’ (Pacoh; Alves 2006: 31)

Verbs as a lexical category in Munda languages, generally speaking, are not easily or rigorously defined in opposition to nouns (D. N. S. Bhat 1997; Bhattacharya 1975a; Cust 1878; Pinnow 1966a; Evans & Osada 2005; but see Peterson 2005 for a different perspective). One and the same root may be used as noun (50a), as modifier (50b), and as predicate/verb (50c). Even a noun root like ‘house’ (50d) may be used verbally with verbal inflection in Santali (hɔɽ rɔɽ).

(50) a. kombro
        thief
        ‘thief’
     b. kombro mɛrɔm
        stolen goat
        ‘a stolen goat’
     c. mɛrɔm=ko kombro-ke-d-e-a
        goat=3PL steal-ASP-TR-3-FIN
        ‘they stole the goat’
     d. oɽak-ke-d-a=e
        house-ASP-TR-FIN=3
        ‘he made a house’ (Santali; Ghosh 1994: 21)

The default position for subject agreement clitics in Kherwarian is immediately pre-verbal. Note that this is true even if the element appearing in this position is itself an overt subject (or object) pronoun (51b, c).

(51) a. Kumbɽəbad-te=ko əgu-ke-’t-le-a
        K-DIR-3PL bring-ASP-TR-1PL-FIN
        ‘they brought us to Kumbrabad’ (Santali; Bodding 1929a: 208)
     b. hè~ iɲ=iɲ tʃala’k-a
        yes I=1 go.INTR-FIN
        ‘yes, I will go’ (Santali; Bodding 1929b: 58)
     c. iɲ am=iɲ ɲɛl-mɛ-a
        I you=1 see-2-FIN
        ‘I will see you’ (Santali; Ghosh 1994: 60)

A wide range of arguments or referents may be encoded within the Santali verbal complex. These include subjects, whose marking differs formally between non-imperative (52a, d, e) and imperative (52b, c) clauses, direct objects (52a), indirect objects (52b), beneficiaries (52c), and possessors of objects (52d) or subjects (52e). In forms like (52d, e) the possessor is marked by a formally distinct t-series of inflections that occupies the second-to-last position in the verbal template in Santali, allowing agreement in the verb with an object (52d) or subject (52e) and its possessor simultaneously!

(52) a. ba=ko sa’p-le-d-e-a
        NEG=3PL catch-ANT-TR-3-FIN
        ‘they did not catch him’ (Santali; Bodding 1929a: 212)
     b. im-əɲ=me
        give-1=2
        ‘give me’ (Bodding 1923: 22)
     c. dul-a-ɲ=me
        pour.out-BEN-1=2
        ‘pour out for me’ (Bodding 1923: 21–2)
     d. sukri=ko gɔ’tʃ-ke-d-e-tiɲ-a
        pig=3PL die-ASP-TR-3-1POSS-FIN
        ‘they killed my pig’ (Bodding 1929a: 209)
     e. hɔpɔn=e hɛ’tʃ-en-tiɲ-a
        son=3 come-PST.INTR-1.POSS-FIN
        ‘my son came’ (Ghosh 1994: 65)
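The slot order visible in the glosses of (52a) and (52d) can be sketched as a simple template. The slot names and the `build_verb` helper below are hypothetical labels introduced for illustration; they are read off the glosses (stem-ASP-TR-object-possessor-FIN) and do not represent a full account of the Santali verb. Subject clitics, which attach outside the verb word in these examples, are deliberately left out.

```python
# Suffix slots in the order they appear in the glosses of (52a) and (52d).
SLOT_ORDER = ("stem", "aspect", "transitivity", "object", "possessor", "finite")


def build_verb(**slots):
    """Join the filled slots with hyphens, in template order."""
    unknown = set(slots) - set(SLOT_ORDER)
    if unknown:
        raise ValueError(f"unknown slots: {sorted(unknown)}")
    if "stem" not in slots:
        raise ValueError("a verb stem is required")
    return "-".join(slots[name] for name in SLOT_ORDER if name in slots)


# (52d) 'they killed my pig': the possessor suffix sits just before the finite marker.
print(build_verb(stem="gɔ’tʃ", aspect="ke", transitivity="d",
                 object="e", possessor="tiɲ", finite="a"))
# gɔ’tʃ-ke-d-e-tiɲ-a

# (52a) 'they did not catch him': no possessor slot filled.
print(build_verb(stem="sa’p", aspect="le", transitivity="d",
                 object="e", finite="a"))
# sa’p-le-d-e-a
```

Keeping the order in one tuple makes the "second-to-last position" of the possessor series a property of the template rather than of each form.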

Various South Munda languages allow agreement in the verb not only with the object but also with a possessor of a logical argument, though not with both simultaneously. Unlike Santali, South Munda Gorum shows a pattern (53) in which the possessor is “raised” to a term argument and encoded in the verb in a manner identical to object marking (marked by suffixes), even if referring to the possessor of the subject.

(53) putiputi-nom ir-om luʔr-om
     heart-2 beat-2 AUX-2
     ‘your heart is beating’ (Aze 1973)

Much work remains to be done in comparative Munda linguistics, despite excellent ground-setting studies by scholars such as Pinnow (1966a), Bhattacharya (1966, 1969, 1972, 1975a, 1975b), Zide (1976, 1978, 1985), A. Zide (n.d.), Zide & Zide (1976), Osada (1996), Zide & Anderson (2001), Anderson & Zide (2001), or Anderson (2001, 2003, 2004, 2007b). When examining cognates across the Munda language family, it is very difficult to ascertain what the full form of a noun might have been in the Proto-Munda ancestor language. Roots underlying these nouns are usually cognate across the Munda languages, while the free forms themselves rarely are. Underlying roots in many South Munda languages remain active elements that serve as the basis for compounds and as the form of the noun that is incorporated within the verbal complex. It is these so-called “combining forms” that are cognate across South Munda (Starosta 1992, Mahapatra & Zide 1972), while the free forms are often not (see Anderson & Zide 2002, Anderson 2004, 2007b for more). The nature of Munda word structure is for roots to be augmented by prefixes or infixes, less commonly suffixes or lexical compound elements, to create freestanding forms (Table 1.9). Originally the prefixes likely expressed noun class reflecting some now opaque or lost semantic categorization; some infixes still have transparent semantics (instrument nouns *-n- or agent nouns *-m- for example).


Table 1.9: A sample of noun correspondence sets and patterns of freestanding noun forms in Munda languages

Gutob | Remo | Gtaʔ | Kharia | Juang | Sora | Gorum | Korku | Kherwarian | gloss
------|------|------|--------|-------|------|-------|-------|------------|------
titi | titi | nti/tti | tiʔ | iti | sʔi | siʔi | ti | ti ~ tii | ‘hand’
Rdpl | Rdpl | *N- (Rdpl) | -ʔ | *N- | -ʔ- | -ʔ- | Ø | Ø, ː | (pattern)
susuŋ | tiksuŋ | ntʃo | | idʒiɲ/ŋ | dʒʔeŋ | dʒiʔiŋ | naŋgà | dʒaŋga | ‘foot’
Rdpl | tik- | *N- | | *N- | -ʔ- | -ʔ- | -a | -a | (pattern)
gikil, kilɔ | kilɔ, kukurag | ŋku | kiɽo(g) | kiɭɔg | kɨna | kul(a) | kula | kul[a] (Mu) | ‘tiger’
*kV-, -ɔ | -ɔ, Rdpl+ -ag | *N- | -ɔg | -ɔg | -a | -a/Ø | -a | *-a | (pattern)
gusɔʔ | gusɔd | gsuʔ | soloʔ | sɛlog | kənsod | kusɔd | sita | seta | ‘dog’
*kV- | *kV- | *kV- | -Vl- | -Vl- | kən- | ku- *kən- | -a | -a | (pattern)
gubɔn | gibɛ | gbɛ | bane/ai | Banae | kəmbud | kibud | bana | bana | ‘bear’
*kV- | *kV- | *kV- | -ai | -ae | *kən- | *kən- | -a | -a | (pattern)

The correspondence sets above are typical of what one finds in comparative Munda linguistics. There are tantalizing local (areal-genetic) developments, trends, and tendencies, but there is also a frustrating lack of regular correspondences of the sort that makes it possible to reconstruct plausible looking forms for Proto-Munda.92

92 In fact, there are almost no full forms or nouns that are formally cognate across all the Munda languages. This kind of pan-Munda correspondence set (other than common instrument nouns like ‘broom’ derived from ‘sweep’ by -n-infixation in most Munda languages) is limited to the form that means both ‘turmeric’ and ‘yellow’, which is a reduplicated form, realized as C1-CVC, C1V1-CVC, C1V1C2-CVC, and C1V1ɽV1C2-CVC patterns of reduplication in the various individual languages and subgroups (e.g., Gtaʔ, Gutob+North Munda, other South Munda, and one variant in Juang, respectively). This correspondence set suggests that ‘turmeric’ (and possibly ‘yellow’) was probably expressed in Proto-Munda with a *C1V1C2 reduplication pattern of the root /saŋ/ (i.e., was realized as *saŋsaŋ in Proto-Munda).

Gutob | Remo | Gtaʔ | Kharia | Juang | Sora | Gorum | Korku | Kherwarian | gloss
------|------|------|--------|-------|------|-------|-------|------------|------
sasaŋ | saŋsaŋ | ssia | saŋsaŋ | sa(ɽa)ŋsaŋ | saŋsaŋ | saŋsaŋ | sasaŋ | sasaŋ (tʃatʃaŋ-) | ‘turmeric’
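The four reduplication templates named in footnote 92 can be applied mechanically to the root /saŋ/. The sketch below is illustrative only: `reduplicate` is a hypothetical helper, it assumes a three-segment CVC root written with one character per segment, and its outputs are schematic template instantiations rather than attested forms (the Gtaʔ form cited in the table, ssia, has undergone further changes).

```python
def reduplicate(root, pattern):
    """Prefix a (partial) copy of a CVC root according to the named template."""
    if len(root) != 3:
        raise ValueError("expected a CVC root with one character per segment")
    c1, v1, c2 = root
    copies = {
        "C1": c1,                             # C1-CVC
        "C1V1": c1 + v1,                      # C1V1-CVC
        "C1V1C2": c1 + v1 + c2,               # C1V1C2-CVC
        "C1V1ɽV1C2": c1 + v1 + "ɽ" + v1 + c2,  # C1V1ɽV1C2-CVC
    }
    return copies[pattern] + root


for pattern in ("C1", "C1V1", "C1V1C2", "C1V1ɽV1C2"):
    print(pattern, reduplicate("saŋ", pattern))
# C1 ssaŋ
# C1V1 sasaŋ
# C1V1C2 saŋsaŋ
# C1V1ɽV1C2 saɽaŋsaŋ
```

The second and third outputs match the sasaŋ and saŋsaŋ forms in the table, and the fourth matches the longer Juang variant sa(ɽa)ŋsaŋ.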


Putative Proto-North Munda forms are relatively easy to reconstruct. Proto-Sora-Gorum can also be relatively securely offered for this set of correspondences. The correspondences between Gutob and Remo are demonstrative of how even closely and obviously related languages can differ with respect to the way freestanding forms of nouns are related to their underlying roots. Proto-Gutob-Remo forms can be guessed for ‘hand’, ‘bear’, ‘dog’ and perhaps one variant of ‘tiger’. It is possible of course that Remo reflects the older Proto-Gutob-Remo form for ‘leg/foot’ that was analogically reformed via reduplication in Gutob, but that remains to be demonstrated. Gtaʔ as usual shows its complex set of forms. Most likely Gtaʔ tti is a loan from Remo (there are many Remo loans in Gtaʔ, a topic awaiting a specialized study), with the variant nti being the original Gtaʔ form, as many nouns in Gtaʔ are formed with the syllabic nasal prefix N-, particularly body parts, e.g. nlu ‘ear’, nle ‘tongue’.

The process of noun incorporation appears to be very old in Munda. All South Munda languages have noun incorporation either as an active morphosyntactic process (Sora, Gtaʔ), preserved in numerous lexicalized examples (Remo, Gutob, and Gorum, where it may well still be an active process), or in a small and decreasing number of lexicalized formations as in Juang and especially Kharia.93 Thus, all South Munda languages can be shown to have incorporated forms with a cognate combining form *=ti ‘hand’, despite showing a range of formations for free forms meaning ‘hand’: i) reduplication (Proto-Gutob-Remo and some Gtaʔ varieties), ii) syllabic N-prefixation (other Gtaʔ varieties, and historically Juang as well), and iii) glottal stop infixation (Proto-Sora-Gorum, and probably Kharia as well, but realized synchronically as suffixation in Kharia); see Table 1.9. North Munda languages mostly have lost noun incorporation altogether or preserve it in one or two expressions (a similar situation is found with the original Proto-Munda and Proto-Austroasiatic causative prefix in North Munda languages). Thus, Munda is like Siouan in having languages lacking or nearly lacking incorporation but other languages where it is a core part of the grammar. Noun incorporation is also found in Nicobarese (see 1.7.4 below) and in lexicalized form in a handful of Aslian (Bishop 1996) and “Mon-Khmer” languages (Thavung, Bolyu, Old Mon, etc.); so it appears to be an old feature of the Austroasiatic phylum.

(54) kawɔl ‘hug’, cf. wɔl ‘shoulder blade’
     pɯcpɛh ‘swing arms’, cf. k[ə]lapɛh ‘upper arm’
     (Kensiw [Aslian]; Bishop & Peterson 1994: 188, 193)

93 Kharia has undergone rather extensive influence from Mundari, where noun incorporation is almost entirely lacking.


(55) pasal-naq ki-chiibjuq
     reason-that 1PL-walk < *‘go.foot’
     ‘so we had to go on foot’ (Temiar; Carey 1961: 46)

(56) khɛʔɛʔ ‘to shit’ (So/Thavung [Vietic, Thailand]; Premsrirat 1996: 168)

(57) titey /titea/ ‘lead’, cf. tey /tea/ ‘hand’ (Old Mon [Monic]; Nai Pam Hla 1976: 907); cf. modern Mon datay /hetoa/

(58) tselei ‘beat cow’, vɯŋqɔ ‘to (catch) fish’, ɬjitlei ‘kill cow’, ɬjittsu ‘kill dog’, ɬjittən ‘butcher pig’, tɕənmət ‘start fire’ (Bolyu [Palyuic]; Edmondson 1995: 134, 141, 144, 154)

Sora is among the very small number of the world’s languages that allow for instances of multiple noun incorporation (59) as well as the typologically unusual incorporation of an agent noun with transitive verbs (60); it may in fact be unique in this regard.

(59) jo-me-bob-dem-te-n-ai
     smear-oil-head-RFLXV-NPST-INTR-CLOC/1
     ‘I will anoint my head with oil’ (Ramamurti 1931: 143)

(60) ɲam-kit-t-am
     seize-tiger-NPST-2
     ‘tiger will seize you’ (Ramamurti 1931: 40)

     sa-bud-t-am
     mangle-bear-NPST-2
     ‘bear will mangle you’ (Ramamurti 1931: 142)


That these forms contain incorporated nouns is demonstrated by the fact that it is the combining form that is found in them, not the syntactically freestanding full forms of the nouns, and that the undergoers are encoded by verbal object suffixes. Note that such formations as the Sora ones in (60) above violate alleged “universals” of noun incorporation put forth in the theoretical linguistic literature (e.g. Baker 1988, 1996). The Munda shift to SOV word order from original SVO~VSO may be attributed to “Indospheric” areal influence from Indo-Aryan or Dravidian (Anderson 2003; cf. Bhattacharya 1972, 1975b).

1.7.3. Khasic or Khasian

The Khasi [khi] are a group of Mon-Khmer speakers living predominantly in the Khasi and Jaintia Hills region of Meghalaya in northeastern India, and a smaller number in Assam, West Bengal, and Manipur. In some (particularly older) sources the Khasi have been called Khuchia. The vast majority of Khasi people (ca. 90 %) live in India, with a further 10 % living across the border in Bangladesh. Traditionally Khasi is divided into a number of “dialect” groups (Nagaraja 1993–1994), but it is linguistically more sound to speak of a small group of related languages, labeled here Khasic or Khasian. These “dialects” or closely related languages include the following (Ethnologue; Parkin 1991):

Khasic languages
i. Amwi [aml]
ii. Bhoi
iii. Lyngngam
iv. Pnar (a.k.a. Synteng or Jaintia) [pbv]
v. Khynriam or Cherrapunji/Standard Khasi [khi]
vi. War
vii. ? Nongtalang
viii. ? Others yet to be identified/described but known to exist

Of these dialects/languages, Lyngngam is linguistically most distant from the Standard Khasi dialect. Amwi (Weidert 1975, Daladier 2002) is also quite distant from Standard Khasi. Lyngngam (Nagaraja 1996) most likely includes a linguistically Khasified Garo element (a Tibeto-Burman language), so substratum features may in part explain the divergence of this variety. War and Bhoi may include assimilated Mikir (Tibeto-Burman) elements. The Pnar (Synteng/Jaintia) ruled a kingdom in the region from at least 1500 to 1835 when it was disbanded by the British colonial authorities (Parkin 1991: 58); so they exerted some local dominance in the not too distant past. Further Khasic varieties include Lakadong, Nongtalang, and Mynnar. Other undocumented local Khasic varieties exist and there is also considerable under-documented microlevel variation. An exact assessment of the diversity found in the Khasic languages is the subject of ongoing research. There are likely over one million total speakers of Khasi, which includes all the above-mentioned languages and dialects. Khasi is a literary language, and a language of media and government in Meghalaya, and one of the official languages of India. There are radio and television broadcasts in the Standard Khasi language.

Phonologically, Khasi exhibits some areally and typologically atypical initial clusters, e.g. [bt], [kt]. Khasi thus shows a characteristic Austroasiatic word profile with a minor syllable followed by a major syllable in a low-high prosodic word structure (61) as seen in examples such as bta ‘wash/besmear face’, ksew ‘dog’, kti ‘hand’, ktháw ‘grandfather’.

(61) Khasi [minor syllable + major syllable]Word (Rabel-Heymann 1976: 971)
     WORD = minor-σ (L) + major-σ (H)

Syntactically, Khasi is SVO while other Khasic languages can show different basic word orders, among many other different features.

(62) phi-m ʔiithuʔ ya ŋa
     you-NEG recognize OBJ I
     ‘don’t you recognize me?’ (Rabel 1961: 61)

Morphosyntactically, Khasi is characterized by use of gender markers and a system of personal verb inflection. This system of gender classifiers is highly marked for Austroasiatic, setting Khasi apart from its sister languages spoken to the east and west.

(63) u khɨnnaʔ u-m bam
     DET.M boy MASC-NEG eat
     ‘the boy doesn’t eat’ (Standard Khasi; Nagaraja 1993–1994: 5)

In terms of verbal derivational morphology, Khasi makes use of a causative prefix consisting of a labial consonant in various allomorphic realizations. Khasi lacks the infixed allomorph of the causative that is found in South Munda and Nicobarese (see 1.7.4).


(64) a. ph-rung ‘penetrate’ < rung ‘enter’
     b. ph-láit ‘clear away’ < láit ‘be free’
     c. b-ta ‘wash/besmear face’ (Khasi; Henderson 1976a: 487)

Negative is realized either as an enclitic on a subject pronoun or a gender agreement marker, or proclitic to the verb stem, depending on the tense/aspect value of the clause. Note also the presence of fused subject pronoun+tense forms in (65b), here marking first person singular future.

(65) a. phi-m ʔiithuʔ ya ŋa
        you-NEG recognize OBJ I
        ‘don’t you recognize me?’ (Standard Khasi; Rabel 1961: 61)
     b. ŋan ʔm-thoʔ
        I.FUT NEG-write
        ‘I’m not writing’
     c. u khɨnnaʔ u-m bam
        DET.M boy MASC-NEG eat
        ‘the boy doesn’t eat’ (Standard Khasi; Nagaraja 1993–1994: 5)

In Bhoi, the negative has a different phonological shape and occurs between the lexical verb and a postposed gender/agreement marker. Most likely negative and subject markers are bound elements in Bhoi but this awaits definitive demonstration.

(66) u khannaʔ bam re u
     DET.M boy eat NEG MASC
     ‘the boy doesn’t eat’ (Bhoi; Nagaraja 1993–1994: 5)

A suffixal past tense marker in -laʔ and a non-past in -diʔ are found in Lyngngam. Note that these most likely derive from fused auxiliary formations (Anderson 2006). The past tense form appears with an unmarked form of the verb stem (67a-c), which suggests that the likely original formation was a serial verb structure that was grammaticalized as a tense/aspect auxiliary verb construction. This auxiliary verb construction in pre-Lyngngam subsequently fused to become the attested suffixal formation in Lyngngam. The non-past construction on the other hand is of a formally different type. The future non-past form also probably derives from a fused auxiliary verb construction with an auxiliary verb that likely meant ‘go’ originally. Unlike the past tense form which appears with an unmarked stem of the verb, the future/non-past (67d-f) formation is found with what is most likely a nominalized form of the verb stem using the infix /-ənn-/ or /-ɨnn-/.


(67) a. brə kyu di-laʔ lɨŋba laʔtap
        man 3PL go-PST through forest
        ‘the men went through the forest’ (Nagaraja 1996: 43)
     b. nə di-laʔ
        I go-PST
        ‘I went’ (Nagaraja 1996: 44)
     c. mi baŋ-laʔ
        you eat-PST
        ‘you ate’
     d. nə dənni
        I go.NPST
        ‘I go’
     e. tu dənni-diʔ
        he go.NPST-FUT
        ‘he will go’
     f. mi bɨnnəŋ-diʔ
        you eat.NPST-FUT
        ‘you will eat’ (Nagaraja 1996: 44)

Khasi (like most Austroasiatic languages) may derive deverbal nominals through a process of -n-infixation, e.g. shnong ‘village’ < shong ‘live, sit’ (Henderson 1976a: 517). Note that sometimes the derived noun reflects a more archaic phonological form of the stem it historically derives from, e.g. preservation of initial s- in the word for ‘wing’ while in the corresponding verb stem ‘fly’, the stem-initial s has shifted to h.

(68) sner ‘feather, wing’ < her ‘fly’ (Khasi; Henderson 1976a: 518)

In addition, there is evidence in Khasic languages of a now covert noun-class system that manifests itself in the form of lexicalized prefixes in noun stems. Such lexicalized prefixes have reflexes in all the different subgroups of Austroasiatic to one degree or another. Like many Austroasiatic subgroups (Anderson & Zide 2002), Khasic languages show irregular correspondences in the free forms of nouns, while the corresponding “underlying” roots are clearly cognate across the subgroup; see Table 1.10.


Table 1.10: Irregular Khasic correspondences (Fournier 1974: 86–92; Nagaraja 1996: 38) Khasi

Lyngngam

Synteng

Amwi

Lakadong

Mynnar

War

gloss

ksew

ksu:/’su:

ksaw ~kswa

ksiá

ksaw

ksow

ksià

‘dog’

sim

ǝsim

sim

ksem

ksem

‘bird’

khmat

kh’mat

khmat

ma:t

ma:t

ma:t

‘eye’

khmut

leo-‘mut

khmut

mur-koŋ

mur-koŋ

myrkoŋ ‘nose’

kaçkor

ləkur

‘ear’

Elements like k-/kh- in words such as khmat, which are now lexicalized prefixes in Khasi, most likely once marked noun classes of some sort. The CVC root form in these correspondences is what is cognate. Such CVC root forms also generally serve as the combining form used in the frequent compounds and in lexicalized traces of noun incorporation found in the Khasi lexicon, examples of which can be seen in the related sets of forms given in (69).

(69) kti ‘hand’ but tiipdeŋ ‘middle finger’ (Rabel 1961: 44)
     khmat ‘eye’ but matliʔ ‘white of eye’, ʔiimat ‘eye’ < ‘see/eye/face’ (Rabel 1961: 149)
     khnaay ‘mouse, rat’ but naaysaaw ‘small red hill mouse’

Standard Khasi is a relatively well-documented language even if most of the other Khasic varieties are not. More than 150 years of documentation exist; see von der Gabelentz 1858, Roberts 1891, Wilhelm Schmidt 1904, Rabel 1961, Henderson 1965, 1966, 1967, 1976a, 1976b, Fournier 1974, Nagaraja 1979, 1984a, 1984b, 1985, 1993–1994, 1996, Sharma 1999, etc. There is also a native Khasi scholarly tradition, e.g. Bars 1973, Blah 1970, Subbarao & Temsen 2003.

1.7.4. Nicobarese

Nicobarese is a small group of at least five languages spoken across the Nicobar Islands, part of the Indian territory of the Andaman and Nicobar Islands, lying in the Indian Ocean south of the Andaman Islands, off the southeastern coast of India, north of Sumatra. Among this group of languages, Car Nicobarese [caq] (aka Pû) may have had 30,000 speakers prior to the 2004 tsunami which devastated the island; recent estimates are lacking (van Driem 2007). Nancowry Nicobarese or Central Nicobarese [ncb] has a couple of thousand speakers scattered across the islands of Nancowry and Camorta, Katchal (where it is called Téhñu), and Trinket [Trinkut] (where it is called Lâfûl). The other members of the Nicobarese language family include Teressa (a.k.a. Taih-Long) [tef] with under 3,000 speakers, spoken on Teressa Island and Bompoka Island (where it is called Powahat), and Chowra (a.k.a. Tutet/Tatet) [crv] with roughly 2,000 speakers, spoken on Chaura Islands.


Great Nicobarese or Southern Nicobarese [nik] may have as many as 5,000 speakers on Great Nicobar (where it is called Lo’ong), Little Nicobar (where it is called Ong), and some outlying islands, e.g. Kondul (known locally as Lâmongshé) or Milo (Miloh). One further language that is conventionally called Nicobarese, Shompeng [sii], is spoken in the interior of the southernmost Great Nicobar island, and appears highly divergent; some have even suggested that it may not be Austroasiatic at all (Chattopadhyay & Mukhopadhyay 2003, Blench 2007). Materials on this language remain scanty and the ones that exist are largely unreliable; claims that Shompeng might be a language isolate should be treated with caution. Little scientific documentation has been possible on Nicobarese in the past decades; much in the way of even basic facts remains to be investigated for most languages, and some languages are barely attested at all. Large dictionaries and brief materials on the grammars of various Nicobarese languages were prepared by missionaries and administrators in the 19th and early 20th centuries (e.g. Man 1888–1889). The Car and Central Nicobarese languages have received the most linguistic investigation. Temple 1902b is an early source on Nicobarese grammar. A. R. Das 1977 is the best “recent” source. Car was the subject of a dissertation by Braine (1970). The main source on Nancowry is Radhakrishnan 1981. The Central Institute for Indian Languages has produced a couple of short primary school books for use in elementary schools in the Nicobar Islands (Pongi 1990, Harry 1990). Adams (1989) included Nicobarese data in her study of numeral classifiers, a core feature of many Austroasiatic languages. The other members of the Nicobarese language family are very poorly attested.
As in many other Austroasiatic groups, there are irregular correspondences in the full forms of nouns across languages, while the underlying roots are clearly cognate:

(70) Nicobarese words for ‘hand’ (or ‘palm’) (Man 1975 [1888–1889])
     Central    Car      Shom Pen   Teressa
     kane-tai   el-ti:   noai-ti:   mɔh-ti:

(71) Related words in Car Nicobarese (A. R. Das 1977: 32)
     ɛlti ‘palm of hand’
     ukti ‘back of hand’
     kuntiː ‘finger’

The root form is used as a combining form in compounds and incorporated formations that are common in Nicobarese. For example, in Nancowry (Central), the root for ‘hand’ is incorporated in the following forms:

(72) Nancowry incorporation of -tay ‘hand’ (Radhakrishnan 1981: 106)
     təŋ ‘reach; up to’, təŋtatay ‘reach for’ (cf. təŋta/tənta ‘reach at’)

The languages, their histories, and their genetic classification


Certain sets of lexemes in individual Nicobarese languages are suggestive of a now opaque and lexicalized but formerly active system of noun classification. Recurrent prefixal elements in Car nouns include ta-, ha-, li- and ɛl-:

(73) Car Nicobarese (A. R. Das 1977: 17, 31–32, 41, 42)
     a. [ɛl]mɛh ‘nose’, ɛlmat ‘color’, ɛlkui ‘brain’, ɛlwaŋ ‘mouth’, ɛlti ‘palm of hand’, ɛlŋoh ‘chest’, ɛlran ‘sole, hoof’
     b. tarul ‘cloud’, tacam ‘dew’, tahɯi ‘today’, taaː ‘day after tomorrow’, takɯn ‘thigh’
     c. haniːŋ ‘axe’, hataːm ‘night’, harãp ‘evening’
     d. litak ‘tongue’, likɯn ‘nape’, likap ~ kilap ‘gullet’

As mentioned throughout this survey, among the more noteworthy features of Austroasiatic is the unusually frequent use of infixation, and Nicobarese is no exception. Two infixation processes that are found in Nicobarese and across the languages of the Austroasiatic stock are (74a) the nominalizing infix -n- (found in such forms as Mlabri chnrɛɛt ‘comb’ < chrɛɛt ‘to comb’, Mundari dunub ‘meeting’ < dub ‘sit’, Khmer kɒndaːr ‘auger, gimlet’ < kdaːr ‘pierce’, or Mon ginruŋ ‘laughter’ < gruŋ ‘laugh’, snāl ‘mat’ < sāl ‘spread’; Tran Nghia 1976: 1210; Nai Pam Hla 1976: 906), and (74b) the -m- agentive infix form (found also for example in Khmer chmam ‘watcher’ < cam ‘watch’ or Mon kamlɔt ‘thief’ < klɔt ‘steal’; Tran Nghia 1976: 1209–1210).

(74) -[i]n- deverbal and -[u]m- agentive nouns in Car Nicobarese (A. R. Das 1977: 34)
     a. kinriɔmə ‘dance’         kumriɔm ‘dancer’   /k-riɔm/ ‘dance’
     b. tinkɔːka ‘song, music’   tumkɔk ‘singer’    /t-kɔk/ ‘sing, play music’

It seems certain that Proto-Austroasiatic was richer morphologically than the majority of modern Austroasiatic languages, particularly in terms of derivation, but not as developed morphologically as the Munda languages, perhaps something like what is found in Nicobarese. Some derivational elements are cognate across the family, such as the causative verb formant, which in Proto-Austroasiatic appeared either as prefix or infix. Both prefix and infix elements are found in Munda, Nicobarese, Monic, and Khmuʔic; other branches generally preserve only the prefix.

(75) a. a’b-soŋ
        CAUS-buy
        ‘sell’
     b. kɔ-’b-sɔr (< kɔsɔr)
        dry-CAUS-dry
        ‘dry something’
     (Juang; Pinnow 1960a)


(76) a. ha-kah-naŋ
        CAUS-know-ear
        ‘make understand’
     b. p-um-lóʔ (< plóʔ)
        lose-CAUS-lose
        ‘make lose’
     (Nancowry; Radhakrishnan 1981: 87, 54)
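The infixation pattern illustrated in (74) and (76b) can be sketched computationally. The following is only a minimal illustration, not an analysis drawn from the cited sources: the function name `insert_infix` is hypothetical, and the sketch assumes that the infix is placed immediately after a single root-initial consonant written as one character. It covers the agentive -um- forms; the -in- nominalizations in (74) carry additional material (final -ə or -a, vowel length) that it does not model.

```python
def insert_infix(root: str, infix: str) -> str:
    """Place `infix` right after the first character of `root`.

    Hypothetical helper for illustration only; it assumes the root
    begins with a single consonant written as one character.
    """
    if not root:
        raise ValueError("root must be non-empty")
    return root[0] + infix + root[1:]


# Roots as cited in (74) and (76b), written without the morpheme hyphen.
examples = [
    ("kriɔm", "um"),  # Car 'dance' (A. R. Das 1977: 34)
    ("tkɔk", "um"),   # Car 'sing, play music' (A. R. Das 1977: 34)
    ("plóʔ", "um"),   # Nancowry 'lose' (Radhakrishnan 1981)
]

for root, infix in examples:
    print(f"{root} -> {insert_infix(root, infix)}")
# kriɔm -> kumriɔm
# tkɔk -> tumkɔk
# plóʔ -> pumlóʔ
```

The outputs match the attested forms kumriɔm ‘dancer’, tumkɔk ‘singer’, and p-um-lóʔ ‘make lose’; roots with initial clusters written as digraphs would need a segment-aware split rather than a character-based one.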

Two decades ago, Nicobarese was used in at least primary-level education (Nandan 1993) in areas where Nicobarese constitute the majority population or in monoethnic Nicobarese villages (Phulo Bhadi village on Great Nicobar); contemporary data is lacking. Little is known about the pre-history of Nicobarese, how the various individual languages developed, or what degree of lexical and grammatical variation is attested across Nicobarese. Indeed, it is not too much of a stretch to say that it remains to be demonstrated how many distinct Nicobarese languages there actually are! In short, much is left to be done in the basic documentation of Nicobarese.

1.8. The Tibeto-Burman languages of South Asia
By Carol Genetti

1.8.1. Introduction

This survey is an introduction to the Tibeto-Burman languages of South Asia and their genetic classification; it is intended to be a brief but useful starting point for those seeking an overview and guide to the literature.⁹⁴ It is important to note at the start that the field of comparative Tibeto-Burman is characterized by a lack of consensus on subgrouping, brought about by a paucity of data on many languages and the fact that for such a large family the careful “bottom-up” comparative reconstruction that would convincingly demonstrate higher-level subgroups has not been feasible. Various proposals have been based on a “top-down” approach, but these do not agree. Recently, efforts have been made to approach the problem statistically (LaPolla 2012), which may in time prove to be fruitful, especially as some success has been achieved with such methodologies for other languages (e.g. Nichols 1996). The uncertainty regarding subgrouping is compounded by a lack of consensus on names of languages and genetic groupings. As a result, the literature is contradictory and confusing. This contribution attempts to provide sufficient background to allow readers to identify and locate literature on the languages, to compare current subgrouping proposals, and to understand points of contemporary debate.

A basic introduction to the location, size, and status of the South Asian Tibeto-Burman languages is given in 1.8.2, while important bibliographic resources are presented in 1.8.3. More detailed information on the languages and low-level genetic groupings is introduced in 1.8.4, using the “agnostic” view of subgrouping proposed by van Driem (2002 and subsequent publications). This is a useful starting point because it identifies forty distinct “groupings” of languages that few would dispute are reliable genetic units. For each grouping, a basic description is provided and references are given to the primary descriptive and/or historical literature. 1.8.5 then surveys three current proposals for higher-level branches (Bradley 2002; Matisoff 2003; Thurgood & LaPolla 2003), using van Driem’s forty groupings as the fabric of comparison and a means to highlight areas of convergence and disagreement. 1.8.6 discusses outstanding issues in the field and directions for future research.

⁹⁴ This survey was supported by a fellowship from the Cairns Institute at James Cook University. The survey has benefited from comments by Mark Post, Yankee Modi, Scott DeLancey, George van Driem, Graham Thurgood, James A. Matisoff, Randy LaPolla, David Bradley, Gwendolyn Hyslop, and Nicholas Lester; all errors are my own.

1.8.2. Location, demographics, and vitality

Tibeto-Burman languages are spoken across a wide swath of Asia, from China in the east to Pakistan in the west, encompassing large parts of Southeast Asia, the Himalayan region, and the Tibetan plateau. Within South Asia, they extend from Bangladesh westward through Northeast India, Bhutan, Nepal, northern India, and Pakistan. For a variety of reasons,⁹⁵ we cannot say exactly how many Tibeto-Burman languages are spoken in South Asia, although the number is likely to be more than two hundred. Within this study, I discuss 257 distinct varieties, where the term VARIETY is neutral with respect to the language/dialect distinction; most of the varieties listed are probably distinct languages based on the criterion of mutual unintelligibility. The eastern part of the region, particularly Northeast India, is an area of especially high linguistic diversity, although little is known about many of the languages. Tibeto-Burman languages tend to have smaller populations than most of their predominantly Indo-Aryan neighbors. Statistics on speaker demographics are problematic as they are often estimates and/or count people by ethnic group rather than linguistic competence. Table 1.11, based on data in Ethnologue (Lewis 2009), presents statistics on 245 Tibeto-Burman languages of South Asia.⁹⁶ About half were reported to have fewer than 10,000 speakers and only a handful were reported to be larger than a half million.

⁹⁵ Reasons include the problem of distinguishing languages from dialects, an issue compounded by notions of ethnicity and language. Also, some languages are being newly discovered by linguists, so any list is likely to be incomplete.

Table 1.11: Population distributions across 245 Tibeto-Burman languages of South Asia

Estimated number of speakers    Number of languages
< 1,000                         33
1,000–10,000                    93
10,000–100,000                  83
100,000–500,000                 27
500,000–1,000,000               7
1,000,000
h. In: Wolfgang Meid (ed.), Sprache und Kultur der Indogermanen: Akten der X. Fachtagung der Indogermanischen Gesellschaft, 36–148. Wiesbaden: Reichert.
Hinüber, Oskar von 1968 Studien zur Kasussyntax des Pali, besonders des Vinaya-Pitaka. Universität Mainz dissertation.
Hinüber, Oskar von 2001 Das ältere Mittelindisch im Überblick. 2nd rev. ed. Wien: Österreichische Akademie der Wissenschaften.
Hock, Hans Henrich 1979 Retroflexion rules in Sanskrit. South Asian Languages Analysis 1: 47–62.
Hock, Hans Henrich 1981 Sanskrit causative syntax: A diachronic study. Studies in the Linguistic Sciences 11(2): 9–33.
Hock, Hans Henrich 1982 The Sanskrit quotative: A historical and comparative study. Studies in the Linguistic Sciences 12(2): 39–85.
Hock, Hans Henrich 1986 “P-oriented” constructions in Sanskrit. In: Krishnamurti et al. (eds.) 1986: 15–26.
Hock, Hans Henrich 1990 Oblique subjects in Sanskrit? In: Verma & Mohanan (eds.) 1990: 119–139.
Hock, Hans Henrich 1991 Dialects, diglossia, and diachronic phonology in early Indo-Aryan. In: William G. Boltz and Michael C. Shapiro (eds.), Studies in the historical phonology of Asian languages, 119–159. Amsterdam/Philadelphia: Benjamins.
Hock, Hans Henrich 1992 Spoken Sanskrit in Uttar Pradesh: Profile of a dying prestige language. In: Dimmock et al. (eds.) 1992: 247–260.
Hock, Hans Henrich 1996a Pre-Ṛgvedic convergence between Indo-Aryan (Sanskrit) and Dravidian? A survey of the issues and controversies. In: Houben (ed.) 1996: 17–58.
Hock, Hans Henrich 1996b Subversion or convergence? The issue of pre-Vedic retroflexion reconsidered. Studies in the Linguistic Sciences 23(2): 73–115.
Hock, Hans Henrich 1997a Bangani. http://www-personal.umich.edu/~pehook/bangani.hock.html (accessed 29 November 2014)


Hock, Hans Henrich 1997b Chronology or genre? Problems in Vedic syntax. In: Michael Witzel (ed.), Inside the texts — beyond the texts: New approaches to the study of the Vedas, 103–126. Harvard Oriental Series, Opera Minora, 2. Cambridge, MA: Harvard University.
Hock, Hans Henrich 1999 Out of India? The linguistic evidence. In: Johannes Bronkhorst and Madhav Deshpande (eds.), Aryan and Non-Aryan in South Asia: Evidence, interpretation, and ideology, Proceedings of the International Seminar on Aryan and Non-Aryan in South Asia, University of Michigan, Ann Arbor, 25–27 October, 1996, 1–18. Cambridge, MA: Harvard Oriental Series, Opera Minora, 3.
Hock, Hans Henrich 2002 South Asia: Historical. The Yearbook of South Asian Languages and Linguistics 2000: 220–237.
Hock, Hans Henrich 2005a How strict is strict OV? A family of typological constraints with focus on South Asia. In: Rajendra Singh and Tanmoy Bhattacharya (eds.), Yearbook of South Asian Languages and Linguistics 2005, 145–163. Berlin/New York: Mouton de Gruyter.
Hock, Hans Henrich 2005b The problem of time in South Asian convergence. Proceedings of the Murray B. Emeneau Seminar, Central Institute of Indian Languages, Mysore. https://www.dropbox.com/s/83t6504h1a0xjjy/TimeSoAsConv.pdf (accessed 29 November 2014)
Hock, Hans Henrich 2006 Reflexivization in the Rig-Veda (and beyond). In: Bertil Tikkanen and Heinrich Hettrich (eds.), Themes and tasks in Old and Middle Indo-Aryan linguistics, Papers of the 2004 World Sanskrit Conference, v. 5, 19–44. Delhi: Motilal Banarsidass.
Hock, Hans Henrich 2008 Dravidian syntactic typology: A reply to Steever. In: Rajendra Singh (ed.), Annual Review of South Asian Languages and Linguistics 2008, 163–198. Berlin/New York: Mouton de Gruyter.
Hock, Hans Henrich 2009 Middle Indo-Aryan “aspirate” clusters revisited. In: Klaus Karttunen (ed.), Anantam Śāstram: Indological and linguistic studies in honour of Bertil Tikkanen, 87–102. (Studia Orientalia 108.) Helsinki.
Hock, Hans Henrich 2012 Sanskrit and Pāṇini — Core and periphery. Saṁskṛta Vimarśa N. S. 6: 85–102. (World Sanskrit Conference Special.) New Delhi.
Hock, Hans Henrich 2014 The Sanskrit phonetic tradition and western phonetics. In: V. Kutumba Sastry (ed.), Sanskrit and development of world thought, 53–80. New Delhi: Rasthriya Sanskrit Sansthan/D. K. Printworld.
Hock, Hans Henrich (ed.) 1991 Studies in Sanskrit syntax: A volume in honor of the centennial of Speijer’s Sanskrit syntax (1886–1986). Delhi: Motilal Banarsidass.


Bibliographical references

Hock, Hans Henrich, and Rajeshwari Pandharipande 1976 The sociolinguistic position of Sanskrit in pre-Muslim South Asia. Studies in Language Learning 1(2): 106–138. Urbana: University of Illinois.
Hock, Hans Henrich, and Rajeshwari Pandharipande 1978 Sanskrit in the pre-Islamic context of South Asia. Aspects of sociolinguistics in South Asia, ed. by B. B. Kachru & S. N. Sridhar, 11–25. (= International Journal of the Sociology of Language, 16.)
Hock, Wolfgang 2006 Altwestnordisch pt : avestisch pt — eine diachron-typologische Parallele? In: Antje Hornscheidt, Kristina Kotcheva, Tomas Milosch, and Michael Rießler (eds.), Grenzgänger: Festschrift zum 65. Geburtstag von Jurij Kusmeno, 111–122. Berlin: Nordeuropa-Institut der Humboldt-Universität.
Hockings, Paul (ed.) 1989 Blue Mountains: The ethnography and biogeography of a South Indian region. New Delhi/New York: Oxford University Press.
Hockings, Paul (ed.) 1997 Blue mountains revisited. Cultural studies on the Nilgiri Hills. Delhi: Oxford University Press.
Hockings, Paul (ed.) 2012 Encyclopaedia of the Nilgiri Hills. New Delhi: Manohar Books.
Hockings, Paul (ed.) 2013 So long a saga: Four centuries of Badaga history. New Delhi: Manohar Books.
Hoernle, Augustus Friedrich Rudolf 1880 A comparative grammar of the Gaudian languages. London: Trübner.
Hook, Peter Edwin 1976 Aṣṭādhyāyī 3.4.21 and the role of semantics in Paninian linguistics. Papers from the 12th Regional Meeting of the Chicago Linguistic Society, 302–312.
Hook, Peter Edwin 1984 Panini’s aṣṭādhyāyī: A two-storey house for a three-storey language? 6th World-Sanskrit Conference, Philadelphia, PA.
Hook, Peter Edwin 1991 The compound verb in Munda: An areal and typological overview. In: Abbi (ed.) 1991: 181–195.
Hook, Peter Edwin 1993 Aspectogenesis and the compound verb in Indo-Aryan. In: Verma (ed.) 1993: 97–114.
Houben, Jan E. M. (ed.) 1996 Ideology and status of Sanskrit: Contributions to the history of the Sanskrit language. Leiden: Brill.
Huffman, Franklin E. 1976a The relevance of lexicostatistics to Mon-Khmer languages. In: Jenner et al. (eds.) 1976: 539–574.
Huffman, Franklin E. 1976b The register problem in fifteen Mon-Khmer languages. In: Jenner et al. (eds.) 1976: 575–590.
Huffman, Franklin E. 1990 Burmese Mon, Thai Mon, and Nyah Kur: A synchronic comparison. Mon-Khmer Studies 16–17: 31–84.


Hyslop, Gwendolyn 2011 A grammar of Kurtöp. University of Oregon PhD dissertation.
Hyslop, Gwendolyn 2013 On the internal phylogeny of East Bodish languages. In: Gwendolyn Hyslop, Stephen Morey, and Mark Post (eds.), North East Indian linguistics 5. New Delhi: Foundation Books/Cambridge University Press.
Illič-Svityč, Vladislav M. 1984 Opyt sravnenija nostratičeskix jazykov 3, Sravnitel’nyj slovar’ (r – q). Moskva: Nauka.
Illič-Svityč, Vladislav M., and Vladimir A. Dybo 1971 Opyt sravnenija nostratičeskix jazykov: Semitoxamitskij, kartvel’skij, indoevropejskij, ural’skij, dravidijskij, altajskij 1 Vvedenie: sravnitel’nyj slovar’ (b – ḳ). Moskva: Nauka.
Jacob, Judith 1993 Cambodian linguistics, literature and history: Collected articles. Ed. by D. A. Smythe. London: School of Oriental and African Studies.
Jacobi, Hermann 1886 Ausgewählte Erzählungen in Mâhârâshṭrî. Leipzig: Hirzel.
Jacobi, Hermann 1921 Sanatkumāracaritam, ein Abschnitt aus Haribhadras Nemināthacaritam: Eine Jaina Legende in Apabhraṁśa. (Abhandlungen der Bayerischen Akademie der Wissenschaften, Phil.-phil. u. hist. Kl., 31.2.) München.
Jacques, Guillaume 2004 Phonologie et morphologie du japhug (Rgyalrong). Université Paris VII, Denis Diderot PhD thesis.
Jacques, Guillaume In Press a The genetic position of Chinese. In: Rint Sybesma, James Huang, Wolfgang Behr, and Zev Handel (eds.), The encyclopedia of Chinese languages and linguistics. Leiden: Brill.
Jacques, Guillaume In Press b Rgyalrong. In: Rint Sybesma, James Huang, Wolfgang Behr, and Zev Handel (eds.), The encyclopedia of Chinese languages and linguistics. Leiden: Brill.
Jahani, Carina 2008 Expressions of future in Classical and Modern New Persian. In: Karimi, Samiian & Stilo (eds.) 2008: 155–176.
Jahani, Carina, and Agnes Korn 2009 Balochi. In: Windfuhr (ed.) 2009: 634–692.
Jahani, Carina, and Agnes Korn (eds.) 2003 The Baloch and their neighbours: Ethnic and linguistic contact in Balochistan in historical and modern times. Wiesbaden: Reichert.
JamaspAsa, Kaikhusroo 1982 Aogəmadaēčā: A Zoroastrian liturgy. Vienna: Österreichische Akademie der Wissenschaften.
Jamison, Stephanie W. 1991 The syntax of direct speech in Vedic. In: Hock (ed.) 1991: 95–112.


Jamison, Stephanie W. 2007 The Rig Veda between two worlds: quatre conférences au Collège de France en mai 2004. (Publications de l’Institut de Civilisation Indienne, série in-8°, 74.) Paris: Boccard.
Jamison, Stephanie W. 2008a Sanskrit. In: Woodward (ed.) 2008: 6–32.
Jamison, Stephanie W. 2008b Middle Indic. In: Woodward (ed.) 2008: 33–49.
Jayaseelan, K. A. 2004 The possessor-experiencer dative in Malayalam. In: Bhaskararao & Subbarao (eds.) 2004: 227–244.
Jenner, Phillip N., and Saveros Pou 1982 A lexicon of Khmer morphology. (Mon-Khmer Studies 9–10.) Honolulu: University Press of Hawaii.
Jenner, Phillip N., Laurence C. Thompson, and Stanley Starosta (eds.) 1976 Austroasiatic Studies, 2 volumes with consecutive page numbers. (Oceanic Linguistics, Special Publication, No. 13.) Honolulu: University of Hawaii.
Jeremiás, Éva 1993 On the genesis of the periphrastic progressive in Iranian languages. In: Wojciech Skalmowski and Alois van Tongerloo (eds.), Medioiranica: Proceedings of the International Colloquium organized by the Katholieke Universiteit Leuven from the 21st to the 23rd of May 1990, 99–116. Leuven: Peeters.
Joseph, P. M. 1989 The word Draviḍa. International Journal of Dravidian Linguistics 18(2): 134–142.
Jügel, Thomas 2009 Ergative remnants in Sorani Kurdish? Orientalia Suecana 58: 142–158.
Justin, Anstice 2000 Who are the Jarawa? Unpublished MS.
Kachru, Braj B., Yamuna Kachru, and S. N. Sridhar (eds.) 2007 Language in South Asia. Cambridge: Cambridge University Press.
Kachru, Yamuna 2007 Hindi–Urdu–Hindustani. In: Kachru et al. (eds.) 2007: 81–102.
Kansakar, Tej Ratna, Yogendra Prasad Yadava, Krishna Prasad Chalise, Balaram Prasain, Dubi Nanda Dhakal, and Krishna Paudel 2011 A sociolinguistic study of the Baram language. Himalayan Linguistics 10(1): 187–225.
Kapp, Dieter B., and Paul Hockings 1989 The Kurumba tribes. In: Hockings (ed.) 1989: 232–248.
Karimi, Simin, Vida Samiian, and Donald Stilo (eds.) 2008 Aspects of Iranian linguistics. Newcastle: Cambridge Scholars Publishing.
Karunatillake, W. S. 1977 The position of Sinhala among Indo-Aryan languages. Indian Journal of Linguistics 4: 1–6.
Karunatillake, W. S. 2001 Historical phonology of Sinhalese: From Old Indo-Aryan to the 14th century A. D. Colombo: S. Godage and Brothers. (Based on the author’s 1969 Cornell University PhD dissertation.)


Kashyap, V. K., T. Sitalaximi, B. N. Sarkar, and R. Trivedi 2003 Molecular relatedness of the aboriginal groups of Andaman and Nicobar Islands with similar ethnic populations. International Journal of Human Genetics 3(1): 5–11.
Katre, Sumitra Mangesh 1965 Some problems of historical linguistics in Indo-Aryan. Poona: Deccan College.
Katre, Sumitra Mangesh 1968 Problems of reconstruction in Indo-Aryan. Simla: Indian Institute of Advanced Study.
Kellens, Jean 1989 Avestique. In: Schmitt (ed.) 1989: 32–55.
Khatri, R. 2008 The structure of verbs and sentences of Raji. Tribhuvan University MA dissertation.
Kieffer, Charles M. 1989 Le parāćī, l’ōrmuṛī et le groupe des langues iraniennes du Sud-Est. In: Schmitt (ed.) 1989: 445–455.
King, John 2009 A grammar of Dhimal. Leiden: Brill.
Klaiman, M. H. 1978 Arguments against a passive origin of the IA ergative. In: Donka Farkas, Wesley M. Jacobsen, and Karol W. Todrys (eds.), Papers from the fourteenth regional meeting of the Chicago Linguistic Society, 204–216. Chicago: Chicago Linguistic Society.
Knorozov, Yuri V. (ed.) 1965 Predvaritel’noe soobščenie ob issledovanii protoindijskix tekstov. Moscow: Institut Etnografii.
Kobayashi, Masato 2004 Historical phonology of Old Indo-Aryan consonants. Tokyo: Research Institute for Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies.
Kobayashi, Masato 2012 Texts and grammar of Malto. Vizianagaram: Kotoba Books.
Kolachina, S., T. Rama, and B. Lakshmi Bai 2011 Maximum Parsimony method in the subgrouping of Dravidian languages. Proceedings of Quantitative Investigations in Theoretical Linguistics 4: 4–52.
Kolichala, Suresh 2010 Use of phylogenetic reconstruction methods in evaluating subgrouping in Dravidian. 14th Round Table on the Ethnogenesis of South and Central Asia, Harvard University, October 4–5, 2010.
Kölver, Ulrike, and Iswaranand Shresthacarya 1994 A dictionary of contemporary Newari (Newari-English). Bonn: VGH Wissenschaftsverlag.
Konow, Sten 1911 Notes on the classification of Bashgali. Journal of the Royal Asiatic Society of Great Britain and Ireland 1911: 1–47.


Konow, Sten 1913 Bashgali dictionary: An analysis of Col. J. Davidson’s notes on the Bashgali language. (Journal of the Asiatic Society of Bengal, New Series, 9, Extra Number.) Calcutta. Repr. 1986, Delhi: Gyan Publishing House.
Korn, Agnes 2003 Balochi and the concept of North-West Iranian. In: Jahani & Korn (eds.) 2003: 49–60.
Korn, Agnes 2005a Das Nominalsystem des Balochi, mitteliranisch betrachtet. In: Günter Schweiger (ed.), Indogermanica: Festschrift Gert Klingenschmitt (…), 289–302. Taimering: VWT-Verlag.
Korn, Agnes 2005b Towards a historical grammar of Balochi: Studies in Balochi historical phonology and vocabulary. Wiesbaden: Reichert.
Korn, Agnes 2008 Marking of arguments in Balochi ergative and mixed constructions. In: Karimi, Samiian & Stilo (eds.) 2008: 249–276.
Korn, Agnes 2009 Lengthening of i and u in Persian. In: Almut Hintze, François de Blois, and Werner Sundermann (eds.), Exegisti monumenta: Festschrift in honour of Nicholas Sims-Williams, 197–213. Wiesbaden: Harrassowitz.
Korn, Agnes 2010 Parthian ž. Bulletin of the School of Oriental and African Studies 73: 415–436.
Korn, Agnes 2013 Looking for the Middle Way: Voice and transitivity in complex predicates in Iranian. Lingua 135: 30–55.
Korn, Agnes, Geoffrey Haig, Simin Karimi, and Pollet Samvelian (eds.) 2011 Topics in Iranian linguistics. Wiesbaden: Reichert.
Koul, Omkar Nath (ed.) 2011 Indo-Aryan linguistics. Mysore: Central Institute of Indian Languages.
Krishnamurti, Bhadriraju 1958 Alternations i/e and u/o in South Dravidian. Language 34: 458–468. Repr. with a postscript in Krishnamurti 2001: 2. 29–41.
Krishnamurti, Bhadriraju 1961 Telugu verbal bases: A comparative and descriptive study. Berkeley: University of California Press. Repr. 1972, Delhi: Motilal Banarsidass.
Krishnamurti, Bhadriraju 1969 Konda or Kūbi: A Dravidian language. Hyderabad: Tribal Cultural Research and Training Institute.
Krishnamurti, Bhadriraju 1975 Gender and number in Proto-Dravidian. International Journal of Dravidian Linguistics 4: 328–50. Repr. with a postscript in Krishnamurti 2001: 8. 133–153.
Krishnamurti, Bhadriraju 1976 Review of Kamil Zvelebil, Comparative Dravidian phonology. Lingua 39: 139–153.


Krishnamurti, Bhadriraju 1985 An overview of comparative Dravidian studies since Current Trends (1969). In: Veneeta Z. Acson and Richard L. Leed (eds.), For Gordon H. Fairbanks, 212–231. Honolulu: University of Hawaii Press. Repr. in Krishnamurti 2001: 14. 243–260.
Krishnamurti, Bhadriraju 1991 The emergence of the syllable types of stems (C)VCC(V) and (C)V̄C(V) in Indo-Aryan and Dravidian: Conspiracy or convergence? In: W. G. Boltz and M. C. Shapiro (eds.), Studies in the historical phonology of Asian languages, 160–175. Amsterdam/Philadelphia: Benjamins.
Krishnamurti, Bhadriraju 1997 Proto-Dravidian laryngeal *H revisited. Pondicherry Institute of Linguistics and Culture (PILC) Journal of Dravidic Studies 7(2): 145–165. Repr. as “Evidence for a laryngeal H in Proto-Dravidian” in Krishnamurti 2001: 19. 323–244.
Krishnamurti, Bhadriraju 1998 Telugu. In: Steever (ed.) 1998: 202–240.
Krishnamurti, Bhadriraju 2001 Comparative Dravidian linguistics: Current perspectives. Oxford: Oxford University Press. (References are to articles followed by page numbers.)
Krishnamurti, Bhadriraju 2003 The Dravidian languages: A comparative, historical and typological study. Cambridge: Cambridge University Press.
Krishnamurti, Bhadriraju (ed.) 1968 Studies in Indian linguistics (Professor M. B. Emeneau Ṣaṣṭipūrti volume). Poona/Annamalainagar: Deccan College/Annamalai University.
Krishnamurti, Bhadriraju, Colin P. Masica, and Anjani Sinha (eds.) 1986 South Asian languages: structure, convergence, and diglossia. Delhi: Motilal Banarsidass.
Krishnamurti, Bhadriraju, and J. P. L. Gwynn 1985 A grammar of Modern Telugu. Delhi: Oxford University Press.
Krishnamurti, Bhadriraju, and Brett A. Benham 1998 Koṇḍa. In: Steever (ed.) 1998: 241–269.
Kruspe, Nicole 2004 A grammar of Semelai. Cambridge: Cambridge University Press.
Kuiper, F. B. J. 1962 Nahali: A comparative study. (Mededelingen der Koninklijke Nederlandse Akademie van Wetenschappen, Afd. Letterkunde, N. R., 25: 5.) Amsterdam: Noord-Hollandsche Uitgevers Maatschappij.
Kuiper, F. B. J. 1966 The sources of the Nahali vocabulary. In: Norman H. Zide (ed.), Studies in comparative Austroasiatic linguistics, 57–81. The Hague: Mouton. http://sealang.net/sala/archives/pdf8/kuiper1966sources.pdf (accessed 20 November 2013)
Kuiper, F. B. J. 1991 Aryans in the Rigveda. Amsterdam/Atlanta, GA: Rodopi.
Kulikov, Leonid 2007 Review of Southworth 2005. Acta Orientalia Vilnensia 8(1): 173–177.


Kumar, Pramod 2012 Descriptive and typological study of Jarawa. Jawaharlal Nehru University PhD dissertation.
Kümmel, Martin 2007 Konsonantenwandel: Bausteine zu einer Typologie des Lautwandels und ihre Konsequenzen für die vergleichende Rekonstruktion. Wiesbaden: Reichert.
Lakshmi Bai, B. 1985 Some notes on correlative constructions in Dravidian. In: Veneeta Z. Acson and Richard L. Leed (eds.), For Gordon H. Fairbanks, 181–190. Honolulu: University of Hawaii Press.
Lakshmi Bai, B., and B. Ramakrishna Reddy 1991 Studies in Dravidian and general linguistics: A festschrift for Bh. Krishnamurti. Hyderabad: Osmania University Publications in Linguistics.
LaPolla, Randy J. 2001 The role of migration and language contact in the development of the Sino-Tibetan language family. In: Alexandra Aikhenvald and R. M. W. Dixon (eds.), Areal diffusion and genetic inheritance, 225–254. Oxford: Oxford University Press.
LaPolla, Randy J. 2003 Overview of Sino-Tibetan morphosyntax. In: Thurgood & LaPolla (eds.) 2003: 22–42.
LaPolla, Randy J. 2012 Comments on methodology and evidence in Sino-Tibetan comparative linguistics. Language and Linguistics 13(1): 117–132.
Lazard, Gilbert 1992 Subjonctif et optatif en ossète. Studia Iranica 21: 57–66.
Lazard, Gilbert 2005 Structures d’actances dans les langues irano-aryennes modernes. In: Dieter Weber (ed.), Languages of Iran, past and present: Iranian studies in memoriam David Neil MacKenzie, 81–93. Wiesbaden: Harrassowitz.
Lenz, Wolfgang 1939 Zeitrechnung in Nuristan und am Pamir. (Abhandlungen der Preussischen Akademie der Wissenschaften, Jahrgang 1938, Phil.-hist. Klasse, 7.) Berlin.
Lewis, M. Paul (ed.) 2009 Ethnologue. 16th edition. Arlington, Texas: Summer Institute of Linguistics. http://www.ethnologue.com/ (accessed 8 December 2013)
Li, Jinfang 1996 Bugan: A new Mon-Khmer language of Yunnan Province, China. Mon-Khmer Studies 26: 135–160.
Lin, You-Jing 2009 Units in Zhuokeji rGyalrong discourse: Prosody and grammar. UCSB PhD dissertation.
Lipp, Reiner 2009 Die indogermanischen und einzelsprachlichen Palatale im Indoiranischen. I: Neurekonstruktion, Nuristan-Sprachen, Genese der indoarischen Retroflexe, Indoarisch von Mitanni. Heidelberg: Winter.


Longerich, Linda 1998 Acoustic conditioning for the RUKI rule. St. John’s: Memorial University of Newfoundland Ph.D. dissertation. http://www.collectionscanada.gc.ca/obj/s4/f2/dsk2/tape15/PQDD_0010/MQ36148.pdf (accessed 18 November 2013)
Lorimer, David L. R. 1935–1938 The Burushaski language. 3 volumes. Oslo: Instituttet for Sammenlignende Kulturforskning.
Lorimer, David L. R. 1939 The Ḍumāki Language. Nijmegen: Dekker and Van de Vegt.
Lorimer, David L. R. 1962 Werchikwar English vocabulary. Oslo: Instituttet for Sammenlignende Kulturforskning.
Lubotsky, Alexander 2001 The Indo-Iranian substratum. In: Chr. Carpelan, A. Parpola, and P. Koskikallio (eds.), Early contacts between Uralic and Indo-European: Linguistic and archaeological considerations: Papers presented at an international symposium held at the Tvärminne Research Station of the University of Helsinki, 8–10 January 1999, 301–317. (Mémoires de la Société Finno-ougrienne, 242.) Helsinki.
Lubotsky, Alexander 2002 Scythian elements in Old Iranian. In: Nicholas Sims-Williams (ed.), Indo-Iranian languages and peoples, 189–202. Oxford: Oxford University Press.
Lurje, Pavel 2010 Personal names in Sogdian texts. (Iranisches Personennamenbuch 2: 8.) Vienna: Österreichische Akademie der Wissenschaften.
Lynch, Owen 1963 Outline of Nihali grammar. Unpublished typescript.
MacKenzie, D. Neil 1954 Gender in Kurdish. Bulletin of the School of Oriental and African Studies 16: 528–541. Repr. in MacKenzie 1999, vol. 2: 353–366.
MacKenzie, D. Neil 1969 Iranian languages. In: Sebeok et al. (1969): 450–477.
MacKenzie, D. Neil 1990 The Khwarezmian element in the Qunyat al-munya. London: School of Oriental and African Studies.
MacKenzie, D. Neil 1999 Iranica diversa, 2 vols. Rome: Istituto italiano per l’Africa e l’Oriente.
Mahadevan, Iravatham 2002 Aryan or Dravidian or neither? A study of recent attempts to decipher the Indus script (1995–2000). Electronic Journal of Vedic Studies 2: 1–23.
Mahadevan, Iravatham 2003 Early Tamil epigraphy from the earliest times to the sixth century A. D. Chennai/Cambridge, MA: Harvard Oriental Series.
Mahapatra, B. P. 1979 Malto: An ethno-semantic study. Mysore: Central Institute of Indian Languages.
Mahapatra, Kageshwar, and Norman H. Zide 1972 Nominal combining forms in Gtaʔ. Indian Linguistics 33: 79–102.


Malhotra, Veena 1982 The structure of Kharia: A study in linguistic typology and change. Jawaharlal Nehru University PhD dissertation.
Maloney, Clarence 1978 People of the Maldive islands. Madras: Orient Longman.
Man, E. Horace 1875–1879 Andaman vocabulary. Part I (English-Andamanese); Part II (Andamanese-English). Handwritten manuscript. London: Royal Anthropological Institute.
Man, E. Horace 1881, 1883 On the aboriginal inhabitants of the Andaman Islands. Journal of the Anthropological Institute of Great Britain and Ireland 11, part I: 1–48, part II: 49–106, Part III: 107–173.
Man, E. Horace 1885 On the aboriginal habitants of the Andaman Islands with reports of researches into the language of the South Andaman Islands by A. J. Ellis. Anthropological Institute of Great Britain and Ireland.
Man, E. Horace 1888–1889 A dictionary of the Central Nicobarese language. Repr. 1975, Delhi: Sanskaran Prakashak.
Man, E. Horace 1923 A Dictionary of the South Andaman (Âkà-Bêa) language. Mazgaon, Bombay: British India Press.
Man, E. Horace, and R. C. Temple 1875–1878 A grammar of the Bojingyida or South Andaman language. Handwritten manuscript, loose sheets. London: Royal Anthropological Institute.
Manandhar, Thakur Lal 1986 Newari-English dictionary: Modern language of the Kathmandu valley. Ed. by Anne Vergati. Kathmandu: Agam Kala Prakashan.
Manoharan, S. 1980 Language of the present Great Andamanese. Journal of the Indian Anthropological Society 15: 43–55.
Manoharan, S. 1983 Subgrouping Andamanese group of languages. International Journal of Dravidian Linguistics 12(1): 82–95.
Manoharan, S. 1989 A descriptive and comparative study of the Andamanese language. Calcutta: Anthropological Survey of India.
Mansion, Joseph 1931 Esquisse d’une histoire de la langue sanscrite. Paris: Geuthner.
Manson, Ken 2010 Bibliography of Karen linguistics. Manuscript: https://www.academia.edu/215262/A_bibliography_of_Karen_linguistics (accessed 19 November 2014)
Maran, Laraw 1971 Burmese and Jingpho: A study in tonal linguistic processes. Urbana: University of Illinois, Center for Asian Studies.

The languages, their histories, and their genetic classification

Marlow, Elli Johanna Pudas 1974 More on the Uralo-Dravidian relationship: A comparison of Dravidian etymological vocabularies. University of Texas PhD dissertation. Marlow, Patrick Edward 1997 Origin and development of the Indo-Aryan quotatives and complementizers: An areal approach. University of Illinois PhD dissertation. Masefield, John (ed.) 1908 The travels of Marco Polo. London: Everyman’s Library. Masica, Colin P. 1976 Defining a linguistic area: South Asia. Chicago/London: University of Chicago Press. Masica, Colin P. 1991 The Indo-Aryan languages. Cambridge: Cambridge University Press. Masica, Colin P. 2001 The definition and significance of linguistic areas: Methods, pitfalls, and possibilities (with special reference to the validity of South Asia as a linguistic area). In: Peri Bhaskararao and K. V. Subbarao (eds.), The yearbook of South Asian Languages and Linguistics 2001, 205–267. New Delhi: Sage. Matisoff, James A. 1972 The Loloish tonal split revisited. Berkeley: Center for South and Southeast Asia Studies, University of California. Matisoff, James A. 1974 The tones of Jinghpaw and Lolo-Burmese: Common origin vs. independent development. Acta Linguistica Hafniensia 15(2): 153–212. Matisoff, James A. 2000 On “Sino-Bodic” and other symptoms of neosubgroupitis. Bulletin of the School of Oriental and African Studies 63(3): 356–369. Matisoff, James A. 2001 The interest of Zhangzhung for comparative Tibeto-Burman. In: Yasuhiko Nagano and Randy LaPolla (eds.), New research on Zhangzhung and related Himalayan languages: Bon studies 3, 155–180. Osaka: National Museum of Ethnology. Matisoff, James A. 2003 Handbook of Tibeto-Burman: System and philosophy of Sino-Tibetan reconstruction. Berkeley/Los Angeles: University of California Press. https://escholarship.org/uc/item/19d79619#page-13 (accessed 29 June 2014) Matisoff, James A. 2012 Mainland SE Asian languages: The state of the art in 2012. Max Planck Institute, Leipzig, November 29–December 1, 2012.
Manuscript. Matras, Yaron 2002 Romani: A linguistic introduction. Cambridge: Cambridge University Press. Matras, Yaron 2005 Para-Romani revisited. In: Matras (ed.) 1995: 1–27. Matras, Yaron 2010 Romani in Britain: The afterlife of a language. Edinburgh: Edinburgh University Press.

Matras, Yaron (ed.) 1995 Romani in contact: The history and sociology of a language. Amsterdam/Philadelphia: Benjamins. Matras, Yaron, Peter Bakker, and Hristo Kyuchukov (eds.) 1997 The typology and dialectology of Romani. Amsterdam/Philadelphia: Benjamins. Mayank 2009 Comparative lexicon of Great Andamanese languages. Jawaharlal Nehru University MPhil dissertation. Mayrhofer, Manfred 1956–1976 Kurzgefaßtes etymologisches Wörterbuch des Altindischen. Heidelberg: Winter. Mayrhofer, Manfred 1968 Über spontanen Zerebralnasal im frühen Indo-Arischen. In: Comité de Rédaction, Société Linguistique de Paris (ed.), Mélanges d’indianisme à la mémoire de Louis Renou, 509–517. Paris: Boccard. Mayrhofer, Manfred 1974 Die Arier im Vorderen Orient: Ein Mythos? (Sitzungsberichte der Österreichischen Akademie der Wissenschaften 294: 3.) Wien. Mayrhofer, Manfred 1983 Lassen sich Vorstufen des Uriranischen nachweisen? Akten der Österreichischen Akademie der Wissenschaften 120: 249–255. Mayrhofer, Manfred 1986–2001 Etymologisches Wörterbuch des Altindoarischen, 3 vols. Heidelberg: Winter. Mayrhofer, Manfred 1989 Vorgeschichte der iranischen Sprachen: Uriranisch. In: Schmitt (ed.) 1989: 4–24. Mayrhofer, Manfred 2002 Zur Vertretung der indogermanischen Liquiden in den indo-iranischen Sprachen. Indologica Taurinensia 28: 149–161. (Published 2004.) Mazaudon, Martine 1973 Phonologie du tamang: Étude phonologique du dialecte tamang de Risiangku (langue tibéto-birmane du Népal). (Langues et civilisations à tradition orale, 4.) Paris: Centre National de la Recherche Scientifique, Société d’Études Linguistiques et Anthropologiques de France. Mazaudon, Martine 1977 Tibeto-Burman tonogenetics. Linguistics of the Tibeto-Burman Area 3(2): 1–123. Mazaudon, Martine 1978 Consonantal mutation and tonal split in the Tamang sub-family of Tibeto-Burman. Kailash 6(3): 157–179. Kathmandu, Nepal. Mazaudon, Martine 2005 On tone in Tamang and neighbouring languages: Synchrony and diachrony.
In: Shigeki Kaji (ed.), Cross-linguistic studies of tonal phenomena: Historical development, tone-syntax interface, and descriptive studies, 79–96. Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies.

McAlpin, David W. 1981 Proto-Elamo-Dravidian: The evidence and its implications. Philadelphia: The American Philosophical Society. McAlpin, David W. 2003 Velars, uvulars, and the North Dravidian hypothesis. Journal of the American Oriental Society 123(3): 521–546. McAlpin, David W. Forthcoming Modern colloquial Eastern Elamite. Meenakshi, K. 1986 The quotative in Indo-Aryan. In: Krishnamurti et al. (eds.) 1986: 209–218. Menges, Karl H. 1977 Dravidian and Altaic. Anthropos 72: 129–179. Mesthrie, Rajend 2007 South Asian languages in the second diaspora. In: Kachru et al. (eds.) 2007: 497–514. Metspalu, Mait, Irene Gallego Romero, Bayazit Yunusbayev, Gyaneshwer Chaubey, Chandana Basu Mallick, Georgi Hudjashov, et al. 2011 Shared and unique components of human population structure and genome-wide signals of positive selection in South Asia. The American Journal of Human Genetics 89(6): 731–744. Michailovsky, Boyd 1994 Manner versus place of articulation in the Kiranti initial stops. In: Hajime Kitamura, Tatsuo Nishida, and Yasuhiko Nagano (eds.), Current issues in Sino-Tibetan linguistics, 766–781. (Proceedings of the 26th International Conference on Sino-Tibetan Languages and Linguistics, National Museum of Ethnology, Osaka, September 1993). Osaka: Organizing Committee of the 26th International Conference on Sino-Tibetan Languages and Linguistics. Michailovsky, Boyd, and Martine Mazaudon 1994 Preliminary notes on languages of the Bumthang groups. In: Tibetan Studies: Proceedings of the 6th Seminar of the International Association for Tibetan Studies 2, 545–557. Fagernes: The Institute for Comparative Research in Human Culture. Miller, John D., and Carolyn Miller 1996 Lexical comparison of Katuic Mon-Khmer languages with special focus on So-Bru groups of northeast Thailand. Mon-Khmer Studies 26: 255–290. Milne, M. Leslie 1921 Palaung grammar. Oxford: Clarendon Press. Miranda, Rocky V. 1978 Proto-language reconstruction from Mod. IAr. evidence.
Indian Linguistics 39: 277–295. Mohammad, Jan 1991 Causative constructions in Kati, a Nuristani language of Afghanistan. Ohio University MA thesis. Mohan, Shailendra Forthcoming Documentation and description of Nihali, a critically endangered language isolate of India. Endangered Languages Archive http://elar.soas.ac.uk/deposit/0168 (accessed 20 November 2013)

Monier-Williams, Monier n.d. A Sanskrit-English dictionary. Oxford: Clarendon Press. Repr. 2005, Delhi: Motilal Banarsidass. http://www.sanskrit-lexicon.uni-koeln.de/monier/ (accessed 17 November 2013) Moorjani, Priya, Kumarasamy Thangaraj, Nick Patterson, Mark Lipson, Po-Ru Loh, Periyasamy Govindaraj, Bonnie Berger, David Reich, and Lalji Singh 2013 Genetic evidence for recent population mixture in India. The American Journal of Human Genetics 93: 1–17. Morey, Stephen 2005 The Tai languages of Assam: A grammar and texts. Canberra: Pacific Linguistics. Morey, Stephen 2006 Constituent order change in the Tai languages of Assam. Linguistic Typology 10: 327–367. Morgenstierne, Georg 1926 Report on a linguistic mission to Afghanistan. (Instituttet for Sammenlignende Kulturforskning Serie C 1.2.) Oslo: Aschehoug. Morgenstierne, Georg 1929 The language of the Ashkun Kafirs. Norsk Tidsskrift for Sprogvidenskap 2: 192–289. Morgenstierne, Georg 1932 Report on a linguistic mission to North-Western India. (Instituttet for Sammenlignende Kulturforskning Serie C 3.1.) Oslo: Aschehoug. Morgenstierne, Georg 1934 Additional notes on Ashkun. Norsk Tidsskrift for Sprogvidenskap 7: 56–115. Morgenstierne, Georg 1945 Indo-European k’ in Kafiri. Norsk Tidsskrift for Sprogvidenskap 13: 225–238. Morgenstierne, Georg 1949 The language of the Prasun Kafirs. Norsk Tidsskrift for Sprogvidenskap 15: 186–334. Morgenstierne, Georg 1952 Linguistic gleanings from Nuristan. Norsk Tidsskrift for Sprogvidenskap 16: 117–135. Morgenstierne, Georg 1954 The Waigali language. Norsk Tidsskrift for Sprogvidenskap 17: 146–324. Morgenstierne, Georg 1965 Dardic and Kafir languages. In: B. Lewis, Ch. Pellat, and J. Schacht (eds.), Encyclopedia of Islam, 2nd edition, 2: 138–139. Leiden: Brill. Morgenstierne, Georg 1973 Die Stellung der Kafirsprachen. In: Irano-Dardica (collected papers of Georg Morgenstierne), 327–344. Wiesbaden: Reichert.
Morgenstierne, Georg 1974 Languages of Nuristan and surrounding regions. In: Karl Jettmar, in collaboration with Lennart Edelberg (eds.), Cultures of the Hindukush: Selected papers from the Hindu-Kush Cultural Conference held at Moesgård 1970, 1–10. Wiesbaden: Steiner.

Morin, Yves-Charles, and Étienne Tiffou 1988 Passive in Burushaski. In: Masayoshi Shibatani (ed.), Passive and voice, 493–524. Amsterdam/Philadelphia: Benjamins. Mortensen, David 2003 Comparative Tangkhul. MS. http://www.davidmortensen.org/papers/comparative_tangkhul.pdf (accessed 19 November 2014) Moseley, Christopher (ed.) 2007 Encyclopedia of the world’s endangered languages. New York: Routledge. Müller, Katja, Daniel Paul, Calvin Tiessen, and Gabriela Tiessen 2010 Language maintenance among the Parya of Tajikistan. SIL Electronic Survey Reports 2010–014: 32. Available at http://www.sil.org/silesr/abstract.asp?ref=2010–014 (accessed 8 December 2013) Mundlay, Aasha 1962–1965 Fieldnotes and reports on Nihali, including texts. Unpublished MS. Mundlay, Aasha 1964 Changing patterns of linguistic behaviour among the Nihals in Maharashtra. Unpublished MS. Mundlay, Aasha 1964–1972 Nihali (Kaltu-Mandi): Its place among the Munda languages of India. Unpublished typescript. Mundlay, Aasha 1965 Multilingual behaviour of the Nihals in some settlements in Buldana district. Unpublished MS, Chicago. Mundlay, Aasha 1966 Linguistic and ritual maintenance of self-identity among the Nihals. Unpublished MS. Mundlay, Aasha 1979 A Nihali lexicon. Unpublished typescript. Mundlay, Aasha 1988 Nihali lexicon, with phonological and etymological notes. Unpublished MS (Abstract by David Stampe International Journal of American Linguistics 32: 395, 1966.) Mundlay, Aasha 1996a Cognates in the Nihali lexicon. Mother Tongue 2: 11–16. Mundlay, Aasha 1996b Nihali lexicon. Mother Tongue 2: 17–40. Mundlay, Aasha 1996c Who are the Nihals? What do they speak? Mother Tongue 2: 5–9. Munshi, Sadaf 2006 Jammu and Kashmir Burushaski: Language, language contact, and change. University of Texas, Austin, PhD dissertation. Munshi, Sadaf In Progress Burushaski language documentation project. http://www.ltc.unt.edu/~sadafmunshi/Burushaski/archive.html (accessed 21 November 2013).
Murugaiyan, A., and Christiane Pilot-Raichoor 2004 Les prédications indifférenciées en dravidien: témoins d’une évolution typologique archaïque. In: Jacques François and Irmtraud Behr (eds.), Les constituants prédicatifs et la diversité des langues, 155–177. Louvain: Peeters.

Nacaskul, Karnchana 1978 The syllabic and morphological structure of Cambodian words. Mon-Khmer Studies 7: 183–200. Nagano, Yasuhiko 2003 Cogtse rGyalrong. In: Thurgood & LaPolla (eds.) 2003: 469–489. Nagaraja, K. S. 1979 Contraction of Khasi nouns in compounds. Indian Linguistics 40: 18–23. Nagaraja, K. S. 1984a Reduplication in Khasi. Indo-Iranian Journal 27(3): 189–200. Nagaraja, K. S. 1984b Compounding in Khasi. Bulletin of the Deccan College Research Institute 43: 79–90. Pune: Deccan College. Nagaraja, K. S. 1985 Khasi: A descriptive analysis. Pune: Deccan College. Nagaraja, K. S. 1993–1994 Khasi dialects: A typological consideration. Mon-Khmer Studies 23: 1–10. Nagaraja, K. S. 1996 The status of Lyngngam. Mon-Khmer Studies 26: 37–50. Nagaraja, K. S. 2006–2007 Nihali and Korku: A comparative note. Bulletin of the Deccan College Post-Graduate and Research Institute 2006–2007: 316–322. Nagaraja, K. S. 2008 Korku-Nihali kinship terms: A comparative note. Indian Linguistics 69: 259–265. Nagaraja, K. S. 2014 The Nihali language: Grammar, texts and vocabulary. Mysore: Central Institute of Indian Languages. Nagaraja, K. S. n.d. Fieldnotes on Nihali. Unpublished MS. Nai Pan Hla 1976 A comparative study of Old Mon epigraphy and Modern Mon. In: Jenner et al. (eds.) 1976: 891–918. Nandan, Anshu Prokash 1993 The Nicobarese of Great Nicobar. New Delhi: Gyan Publishing. Nara, Tsuyoshi 1979 Avahaṭṭha and comparative vocabulary of Indo-Aryan languages. Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa. Nāseḥ, Moḥammad Amīn 2001 Fohrsat-e pāyān-nāmehā-ye kāršenāsī-ye eršād va doktorī dar zamīne-ye gūyešhā-ye Īrān (1339–1379). Tehran: Farhangestān-e zabān-o-adab-e fārsī, 1380 h.š. Needham, J. F. 1894 Outline grammar of the … (Khâmtî) language, as spoken by the Khâmtîs residing in the neighbourhood of Sadiya. Rangoon: Superintendent of Government Printing, Burma. Nelson, David 1986 The historical development of the Nuristani languages.
University of Minnesota PhD dissertation.

Nepal Bhasa Dictionary Committee 2000 A dictionary of Classical Newari. Kathmandu: Cwasā Pāsā. Nguyen, V. Kh. 1987 Tu-Diên Anh-Viet: English-Vietnamese dictionary. Glendale, CA: Dainam Publishing. Nichols, Johanna 1996 The comparative method as heuristic. In: Mark Durie and Malcolm D. Ross (eds.), The comparative method reviewed: Regularity and irregularity in language change, 39–71. New York: Oxford University Press. Nishi, Yoshio 1990 The distribution and classification of Himalayan languages (Part I). Kokuritsu Minzokugaku Hakubutsukan Kenkyu Hokoku 15(1): 265–337. Nishida, Tatsuo 1988 On the mTsho-sna Monpa language in China. In: David Bradley, Eugénie J. A. Henderson, and Martine Mazaudon (eds.), Prosodic analysis and Asian linguistics: To honour R. K. Sprigg, 223–236. Canberra: Pacific Linguistics C-104. Nizar, Milla 2010 Dative subject constructions in South-Dravidian languages. University of California, Berkeley, Linguistics Undergraduate Honors Thesis. Noonan, Michael 2011 Aspects of the historical development of nominalizers in the Tamangic languages. In: Yap Foong Ha, Karen Grunow-Hårsta, and Janick Wrona (eds.), Nominalization in Asian languages: Diachronic and typological perspectives, 195–214. Amsterdam/Philadelphia: Benjamins. Norman, Jerry 1988 Chinese. Cambridge: Cambridge University Press. Norman, K. R. 1992 Pali lexicographic studies IX. Journal of the Pali Text Society 16: 77–85. Oberlies, Thomas 2001 Pāli: A grammar of the language of the Theravāda Tipiṭaka. Berlin/New York: de Gruyter. Oberlies, Thomas 2003 Aśokan Prakrit and Pāli. In: Cardona & Jain (eds.) 2003: 161–203. Ohno, Susumu 1980 Sound correspondences between Tamil and Japanese. Tokyo: Gakushuin University. Opgenort, Jean Robert 2004 Implosive and preglottalized stops in Kiranti. Linguistics of the Tibeto-Burman Area 27(1): 1–27. Opgenort, Jean Robert 2005 A grammar of Jero: With a historical comparative study of the Kiranti languages. Leiden: Brill.
Opgenort, Jean Robert 2011 A note on Tilung and its position within Kiranti. Himalayan Linguistics 10(1): 253–271. (Special Issue in Memory of Michael Noonan and David Watters.)

Osada, Toshiki 1992 A reference grammar of Mundari. Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa. Osada, Toshiki 1996 Notes on the Proto-Kherwarian vowel system. Indo-Iranian Journal 39: 245–258. Osada, Toshiki 2004 A historical note on inclusive/exclusive opposition in South Asian languages: Borrowing or retention or innovation? Mon-Khmer Studies 34: 79–96. Osada, Toshiki 2008 Mundari. In: Anderson (ed.) 2008: 99–164. Osada, Toshiki (ed.) 2009 Linguistics, archaeology and human past in South Asia. Delhi: Manohar. Pande, Satish, and Anvita Abbi 2010 Ethno-ornithology: Birds of Great Andamanese: Names, classification and culture. Pune/Bombay/New Delhi: Ela Foundation/Natural History Society/Oxford University Press. Pantcheva, Marina 2009 First phase syntax of Persian complex predicates: Argument structure and telicity. Journal of South Asian Linguistics 2: 53–72. http://ansatte.uit.no/marina.pantcheva/Pub/Pantcheva-CPTelicity.pdf (accessed 19 November 2014) Parkin, Robert 1991 A guide to Austroasiatic speakers and their languages. (Oceanic Linguistics, Special Publication 23.) Honolulu: University of Hawaii Press. Parpola, Asko 1994 Deciphering the Indus script. Cambridge: Cambridge University Press. Parpola, Asko 2002 From the dialects of Old Indo-Aryan to Proto-Indo-Aryan and Proto-Iranian. In: Nicholas Sims-Williams (ed.), Indo-Iranian languages and peoples, 43–102. Oxford: Oxford University Press. Patnaik, Manideepa 2008 Juang. In: Anderson (ed.) 2008: 508–556. Patry, Richard, and Étienne Tiffou 1998 Etude exploratoire des connecteurs de liaison dans un corpus de contes en bourouchaski du Yasin: Critères d’identification, quantité et distribution. In: Yves Duhoux (ed.), Langue et langues: Hommage à Albert Maniet, 225–253. (Bibliothèque des Cahiers de l’Institut de linguistique de Louvain, 97.) Louvain la Neuve: Peeters. Pattanayak, Debi Prasanna 1966 A controlled reconstruction of Oriya, Assamese, Bengali, and Hindi.
The Hague: Mouton. Paul, Ludwig 1998a Zazaki: Grammatik und Versuch einer Dialektologie. Wiesbaden: Reichert. Paul, Ludwig 1998b The position of Zazaki among West Iranian languages. In: Nicholas Sims-Williams (ed.), Proceedings of the Third European Conference of Iranian Studies (…): Part I: Old and Middle Iranian studies, 163–177. Wiesbaden: Reichert.

Paul, Ludwig 2003 The position of Balochi among Western Iranian languages: The verbal system. In: Jahani & Korn (eds.) 2003: 61–71. Paul, Ludwig 2008 Some remarks on the Persian suffix -rā as a general and historical linguistic issue. In: Karimi, Samiian & Stilo (eds.) 2008: 329–337. Paulsen, Debbie A. 1992 Phonological reconstruction of Proto-Plang. Mon-Khmer Studies 18–19: 160–222. Payne, John 1989 Pāmir languages. In: Schmitt (ed.) 1989: 417–444. Payne, John 1997 The Central Asian Parya. In: Shirin Akiner and Nicholas Sims-Williams (eds.), Languages and scripts of Central Asia, 144–153. London: School of Oriental and African Studies. Pederson, Eric W. 2012 Kurumba languages. In: Hockings (ed.) 2012: 505–508. Peiros, Ilja 1996 Katuic comparative dictionary. (Pacific Linguistics C-132.) Canberra: Australian National University. Peiros, Ilja 1998 Comparative linguistics in Southeast Asia. (Pacific Linguistics C-142.) Canberra: Australian National University. Peterson, David A., and Jonathan Wright 2009 Mru-Hkongso: A new Tibeto-Burman grouping. 42nd International Conference on Sino-Tibetan Languages and Linguistics. Payap University, Thailand. Peterson, John M. 1998 Grammatical relations in Pali and the emergence of ergativity in Indo-Aryan. München: LINCOM. Peterson, John M. 2005 There’s a grain of truth in every “myth”. Or, Why the discussion of lexical classes in Mundari isn’t quite over yet. Linguistic Typology 9: 391–405. Peterson, John M. 2008 Kharia. In: Anderson (ed.) 2008: 434–507. Pilot-Raichoor, Christiane 1997 Aperçu du système verbal badaga. Faits de langues 10: 163–172. Pilot-Raichoor, Christiane 2012a The Badaga language. In: Hockings (ed.) 2012: 97–104. Pilot-Raichoor, Christiane 2012b Tamil Brahmi inscriptions: A critical landmark in the history of the Dravidian language. In: A. Murugaiyan (ed.), New dimensions in Tamil epigraphy: A multi-disciplinary approach, 285–315. Chennai: Cre-A publishers.
Pinnow, Heinz-Jürgen 1959 Versuch einer Lautlehre der Kharia-Sprache. Wiesbaden: Harrassowitz. Pinnow, Heinz-Jürgen 1960a Beiträge zur Kenntnis der Juang-Sprache. Unpublished MS.

Pinnow, Heinz-Jürgen 1960b Über den Ursprung der voneinander abweichenden Strukturen der Munda und Khmer-Nikobar Sprachen. Indo-Iranian Journal 4: 81–103. Pinnow, Heinz-Jürgen 1963 The position of the Munda languages within the Austroasiatic language family. In: Shorto (ed.) 1963: 140–152. Pinnow, Heinz-Jürgen 1965 Personal pronouns in the Austroasiatic languages: A historical study. Indo-Pacific Linguistic Studies 1: 3–42. Pinnow, Heinz-Jürgen 1966a A comparative study of the verb in the Munda languages. In: Zide (ed.) 1966: 96–193. Pinnow, Heinz-Jürgen 1966b Review of Kuiper 1962. Orientalistische Literaturzeitung 61: 492–496. Pinnow, Heinz-Jürgen 1980 Remarks on the structure of the Khmer syllable and word. Mon-Khmer Studies 9: 131–138. Plaisier, Heleen 2007 A grammar of Lepcha. Leiden: Brill. Pollock, Sheldon 1996 The Sanskrit Cosmopolis, 300–1300: Transculturation, vernacularization, and the question of ideology. In: Houben (ed.) 1996: 197–248. Pongi, Rev. Fred 1990 Ro-Tarik I: 1st year Nicobarese school book. Mysore: Central Institute of Indian Languages. Portman, Maurice Vidal 1887 Manual of the Andamanese languages. London: W. H. Allen. Repr. 1992, Delhi: Manas Publications. Portman, Maurice Vidal 1898 Notes on the languages of the South Andaman group of tribes. Calcutta: Office of the Superintendent of Government Printing, India. Portman, Maurice Vidal 1899 A history of our relations with the Andamanese, 2 vols. Calcutta: Office of the Superintendent of Government Printing, India. Repr. 1990, New Delhi: Asian Educational Services. Possehl, Gregory L. 2003 The Indus civilization: A contemporary perspective. New Delhi: Vistaar Publications. Post, Mark 2011 Isolate substrates, creolization, and the internal diversity of Tibeto-Burman. Workshop on the Roots of Linguistic Diversity, James Cook University.
https://www.academia.edu/1689301/Isolate_substrates_creolization_and_the_internal_diversity_of_Tibeto-Burman (accessed 5 December 2014) Post, Mark 2012 The language, culture, environment and origins of Proto-Tani speakers: What is knowable, and what is not (yet). In: Toni Huber and Stuart Blackburn (eds.), Origins and migrations in the extended Eastern Himalayas, 153–185. Leiden: Brill.

Post, Mark, and Roger Blench 2011 Siangic: A new language phylum for North East India. 6th International Conference of the North East Indian Linguistic Society, Tezpur University, Assam. Premsrirat, Suwilai 1987 Khmu: A minority language of Thailand. (Papers in South-East Asian Linguistics 10. Pacific Linguistics A-75.) Canberra: Australian National University. Premsrirat, Suwilai 1996 Phonological characteristics of So (Thavung), a Vietic language of Thailand. Mon-Khmer Studies 26: 161–176. Premsrirat, Suwilai 2004 Register complex and tonogenesis in Khmu dialects. Mon-Khmer Studies 34: 1–18. Rabel, Lucy 1961 Khasi: A language of Assam. Baton Rouge: Louisiana State University Press. Rabel-Heymann, Lucy 1976 Analysis of loanwords in Khasi. In: Jenner et al. (eds.) 1976: 971–1034. Radcliffe-Brown, A. R. 1922 The Andaman islanders. Cambridge: Cambridge University Press. Repr. 1948, Glencoe, IL: Free Press. Radhakrishna, B. 1971 Early Telugu inscriptions upto 1100 AD. Hyderabad: Andhra Pradesh Sahitya Akademi. Radhakrishnan, R. 1981 The Nancowry word. Edmonton: Linguistic Research. Rajam, V. S. 1992 A reference grammar of classical Tamil poetry (150 BC – pre-fifth/sixth century AD). Philadelphia: American Philosophical Society. Ramamurti, G. V. 1931 A manual of the So:ra: (or Savara) language. Madras: Government Press. Ramasamy, K. 1981 Correlative relative clauses in Tamil. In: Agesthialingom & Rajasekharan Nair (eds.) 1981: 363–380. Ramaswami Aiyar, L. V. 1928 Dravidian notes (1. Nasals and nasal accentuation in Dravidian; 2. The tense forms of the Brahui verbs; 3. The affinities of the Kodagu language). Educational Review, Madras. Ramaswami Aiyar, L. V. 1936 The evolution of Malayalam morphology. Ernakulam: Cochin Government Press. Rana, B. K. 2002 New materials on Kusunda language. Fourth Roundtable Conference on Ethnogenesis of South and Central Asia. Harvard University, 11–13 May 2002. 
Rao, Garapati Uma Maheshwar 2008 A comparative grammar of the Gondi dialects: With special reference to phonology and morphology. Kuppam: Dravidian University.

Rapacha, Lal-Shyakarelu 2008 Indo-Nepal Kiranti bhashaharu: Vigat, samakalin parivesh ra bholika chunautiharu. [Indo-Nepal Kiranti languages: Past, contemporary scenario and future challenges]. Kathmandu: Research Institute for Kirãtology. Reddy, Joy 1979 Kuvi grammar. Mysore: Central Institute of Indian Languages. Reddy, Ramakrishna B. 2009 Manda-English dictionary. Mysore/Vadodara: Central Institute of Indian Languages/Bhasha Research and Publication Centre. Reich, David, Kumarasamy Thangaraj, Nick Patterson, Alkes L. Price, and Lalji Singh 2009 Reconstructing Indian population history. Nature 461: 489–494. Reichert, Pierre 1998 Anmerkungen zur Dialektologie des Kati (Lautlehre). In: V. V. Kushev, N. L. Luzhetskaia, Lutz Rzehak & I. M. Steblin-Kamensky (eds.), Countries and peoples of the East, 20: Central Asia: Eastern Hindukush, 123–135. St. Petersburg: Peterburgskoe Vostokovedenie. Reinhard, Johan, and Sueyoshi Toba 1970 A preliminary linguistic analysis and vocabulary of the Kusunda language. Kathmandu: Summer Institute of Linguistics/Tribhuvan University. Renou, Louis 1956 Histoire de la langue sanscrite. Lyon: AIC. Renou, Louis 1957 Introduction générale. (= Introduction to Wackernagel 1957.) Renou, Louis 1961 Grammaire sanskrite: Phonétique, composition, dérivation, le nom, le verbe, la phrase. 3rd ed. Paris: Adrien Maisonneuve. Reynolds, C. H. B. 1974 Buddhism and the Maldivian language. In: L. Cousins, Arnold Kunst, and K. R. Norman (eds.), Buddhist studies in honour of I. B. Horner, 193–198. Dordrecht: Reidel. Reynolds, C. H. B. 1978 Linguistic strands in the Maldives. In: C. Maloney (ed.), Contributions to Asian studies, 2: Language and civilization change in South Asia, 156–166. Leiden: Brill. Roberts, H. 1891 A grammar of the Khassi language. London: Kegan Paul, Trench, Trübner and Co. Romani Linguistics Website n.d. http://romani.humanities.manchester.ac.uk/ (accessed 8 December 2013) Rutgers, Leopold Roland 1999 Puroik or Sulung of Arunachal Pradesh.
5th Himalayan Languages Symposium, Kathmandu, Nepal. Sag, Ivan 1974 The Grassmann’s Law ordering pseudoparadox. Linguistic Inquiry 5: 591–607. Salāmī, ‘Abdonnabī 2004 Ganǰīne-ye gūyeš-šenāsī-ye fārs 1. Tehran: Farhangestān-e zabān-o-adab-e fārsī, 1384 h.š.

Sampson, John 1923 On the origin and early migration of the Gypsies. Journal of the Gypsy Lore Society, Third Series, 2(4): 156–169. Sampson, John 1926 The dialect of the Gypsies of Wales. Oxford: Oxford University Press. Saxena, Anju 1992 Finite verb morphology in Tibeto-Kinnauri. University of Oregon PhD dissertation. Saxena, Anju 1997 Towards a reconstruction of the Proto West Himalayish agreement system. In: David Bradley (ed.), Papers in Southeast Asian linguistics 14: Tibeto-Burman languages of the Himalayas, 73–94. (Pacific Linguistics A-86.) Saxena, Anju 2011 Towards empirical classification of Kinnauri varieties. In: Peter K. Austin, Oliver Bond, Lutz Marten, and David Nathan (eds.), Proceedings of the Conference on Language Documentation and Linguistic Theory 3, 15–25. London: School of Oriental and African Studies. Saxena, Anju, and Lars Borin 2011 Dialect classification in the Himalayas: A computational approach. Proceedings of NODALIDA 2011, 307–310. Riga: NEALT. Saxena, Anju, and Lars Borin 2013 Carving Tibeto-Kinnauri by its joints: Using basic vocabulary lists for genetic grouping of languages. In: Lars Borin and Anju Saxena (eds.), Approaches to measuring linguistic differences, 175–198. Berlin/New York: Mouton de Gruyter. Saxena, Anju, and Lars Borin (eds.) 2006 Lesser-known languages of South Asia: Status and policies, case studies and applications of information technology. Berlin/New York: Mouton de Gruyter. Scheucher, Bernhard 2006 Teilergativität in den modernen westiranischen Sprachen. In: Heiner Eichner, Bert Fragner, Velizar Sadovski, and Rüdiger Schmitt (eds.), Iranistik in Europa — gestern, heute, morgen, 169–193. Vienna: Österreichische Akademie der Wissenschaften. Schmidt, Johannes 1872 Die Verwandtschaftsverhältnisse der indogermanischen Sprachen. Weimar: Böhlau. Schmidt, Pater Wilhelm 1901 Die Sprachen der Sakai und Semang auf Malacca und ihr Verhältnis zu den Mon-Khmer-Sprachen. 
Bijdragen tot de Taal-, Land-, en Volkenkunde van Nederlandsch-Indië 52: 399–583. Schmidt, Pater Wilhelm 1904 Grundzüge einer Lautlehre der Khasi-Sprache. Abhandlungen der Königlich Bayerischen Akademie der Wissenschaften, Philosophisch-historische Classe 22(3): 657–810. Schmidt, Pater Wilhelm 1906 Die Mon-Khmer Völker: Ein Bindeglied zwischen Völkern Zentralasiens und Austronesiens. Archiv für Anthropologie, n.s. 5: 59–109. Braunschweig:

Vieweg und Sohn. Repr. 2012, Ulan Press. [Ulan Press has no known place of publication; its products are available through Amazon.] Schmitt, Rüdiger 1989 Altiranische Sprachen im Überblick. In: Schmitt (ed.) 1989: 25–31. Schmitt, Rüdiger 2000 Die iranischen Sprachen in Geschichte und Gegenwart. Wiesbaden: Reichert. Schmitt, Rüdiger 2009 Die altpersischen Inschriften der Achaimeniden: Editio minor mit deutscher Übersetzung. Wiesbaden: Reichert. Schmitt, Rüdiger (ed.) 1989 Compendium linguarum iranicarum. Wiesbaden: Reichert. Schwartz, Martin 2008 Iranian *L, and some Persian and Zazaki etymologies. Iran and the Caucasus 12: 281–287. Schweiger, Günter 1998 Kritische Neuedition der achaemenidischen Keilinschriften, 2 vols. Taimering: VWT-Verlag. Sebeok, Thomas A., Murray B. Emeneau, and Charles A. Ferguson (eds.) 1969 Current trends in linguistics, 5: Linguistics in South Asia. The Hague: Mouton. Sen, Subhadra Kumar 1973 Proto-New Indo-Aryan. Calcutta: Eastern Publications. Sen, Sukumar 1953 Historical syntax of Middle Indo-Aryan. Indian Linguistics 13: 355–473. Sen, Sukumar 1960 A comparative grammar of Middle Indo-Aryan. Poona: Deccan College. Sethumadhava Rao, P. 1950 A grammar of the Kolami language. Hyderabad: The Co-operative Press. Shackle, C. 1979 Problems of classification in Pakistan Punjab. Transactions of the Philosophical Society 1979: 191–210. Shafer, Robert 1952 Études sur l’austroasien. Bulletin de la Société de Linguistique 48: 111–158. Shafer, Robert 1954 The linguistic position of Dwags. Oriens: Zeitschrift der Internationalen Gesellschaft für Orientforschung 7: 348–356. Shafer, Robert 1957 Bibliography of Sino-Tibetan languages, 1. Wiesbaden: Harrassowitz. Shafer, Robert 1963 Bibliography of Sino-Tibetan languages, 2. Wiesbaden: Harrassowitz. Shafer, Robert 1966 Introduction to Sino-Tibetan, 1. Wiesbaden: Harrassowitz. Shafer, Robert 1967 Introduction to Sino-Tibetan, 2. Wiesbaden: Harrassowitz. Shafer, Robert 1968 Introduction to Sino-Tibetan, 3. 
Wiesbaden: Harrassowitz. Shafer, Robert 1970 Introduction to Sino-Tibetan, 4. Wiesbaden: Harrassowitz.

Shafer, Robert 1974 Introduction to Sino-Tibetan, 5. Wiesbaden: Harrassowitz. Shanmugam, S. V. 1971 Some problems of Old Tamil phonology. Indo-Iranian Journal 13: 31–43. Shanmugam, S. V. 1972 Dental and alveolar nasals in Dravidian. Bulletin of the School of Oriental and African Studies 35: 74–84. Shapiro, Michael C., and Harold F. Schiffman 1981 Language and society in South Asia. Delhi: Motilal Banarsidass. Sharma, H. S. 1999 A comparison between Khasi and Manipuri word order. Linguistics of the Tibeto-Burman Area 22(1): 139–148. Shorto, Harry L. 1976 The vocalism of Mon-Khmer. In: Jenner et al. (eds.) 1976: 1041–1068. Shorto, Harry L. 2005 A Mon-Khmer comparative dictionary. Canberra: Australian National University. Shorto, Harry L. (ed.) 1963 Linguistic comparison in Southeast Asia and the Pacific. London: School of Oriental and African Studies. Sidwell, Paul 2009 Classification of the Austroasiatic languages: History and state of the art. München: LINCOM. Sidwell, Paul, and Pascale Jacq 1999 Sapuan. München: LINCOM. Sidwell, Paul, and Pascale Jacq 2003 A handbook of comparative Bahnaric. Volume 1: West Bahnaric. Canberra: Australian National University. Sihler, Andrew L. 1997 The myth of direct reflexes of the PIE palatal series in Kati. In: Dorothy Disterheft, Martin Huld, and John Greppin (eds.), Studies in honor of Jaan Puhvel, Part One: Ancient languages and philology, 187–194. (Journal of Indo-European Studies Monograph No. 20.) Washington: Institute for the Study of Man. Simon, I. M. 1970 Aka language guide. Shillong: North-East Frontier Agency. Simon, I. M. 1979 Miji language guide. Shillong: North-East Frontier Agency. Sims-Williams, BD I-III = Sims-Williams 2000–2012 Sims-Williams, Nicholas 1979 On the plural and dual in Sogdian. Bulletin of the School of Oriental and African Studies 42: 337–346. Sims-Williams, Nicholas 1982 The double system of nominal inflexion in Sogdian. Transactions of the Philological Society 1982: 67–76. 
Sims-Williams, Nicholas 1990 Chotano-Sogdica II: Aspects of the development of nominal morphology in Khotanese and Sogdian. In: Gherardo Gnoli and Antonio Panaino (eds.),
Proceedings of the First European Conference of Iranian Studies (…): Part 1: Old and Middle Iranian Studies, 275–296. Rome: Istituto Italiano del medio ed estremo oriente. Sims-Williams, Nicholas 1994 The middle voice in Middle Iranian. Handout, Symposion anläßlich des 25-jährigen Bestehens der Kommission für Iranistik an der Österreichischen Akademie der Wissenschaften, 4–5 Nov. 1994. Sims-Williams, Nicholas 1996a Eastern Iranian languages. Encyclopædia Iranica VII: 649–652. Sims-Williams, Nicholas 1996b The Sogdian manuscripts in Brāhmī script as evidence for Sogdian phonology. In: Ronald E. Emmerick, Werner Sundermann, Ingrid Warnke, and Peter Zieme (eds.), Turfan, Khotan and Dunhuang: Vorträge der Tagung “Annemarie von Gabain und die Turfanforschung” (…), 307–315. Berlin: Berlin-Brandenburgische Akademie der Wissenschaften. Sims-Williams, Nicholas 1997a New light on ancient Afghanistan: The decipherment of Bactrian. London: School of Oriental and African Studies. Sims-Williams, Nicholas 1997b The denominal suffix -ant- and the formation of the Khotanese transitive perfect. In: Alexander Lubotsky (ed.), Sound law and analogy: Papers in honor of Robert S. P. Beekes on the occasion of his 60th birthday, 317–325. Amsterdam/Atlanta: Rodopi. Sims-Williams, Nicholas 2000–2012 Bactrian documents from Northern Afghanistan. I: Legal and economic documents. Oxford: Oxford University Press, 2000 (2nd ed. London: Nour Foundation, 2012); II: Letters and Buddhist texts. London: Nour Foundation, 2007; III: Plates. London: Nour Foundation, 2012. Sims-Williams, Nicholas 2004 The Parthian abstract suffix -yft. In: John H. W. Penney (ed.), Indo-European perspectives: Studies in honour of Anna Morpurgo Davies, 539–547. Oxford: Oxford University Press. Sims-Williams, Nicholas 2005 Towards a new edition of the Sogdian Ancient Letters: Ancient Letter 1. In: de la Vaissière & Trombert (eds.) 2005: 181–193. Sims-Williams, Nicholas 2007 The Sogdian potentialis. 
In: Maria Macuch, Mauro Maggi, and Werner Sundermann (eds.), Iranian languages and texts from Iran and Turfan: Ronald E. Emmerick memorial volume, 377–386. Wiesbaden: Harrassowitz. Sims-Williams, Nicholas 2008 The Bactrian inscription of Rabatak: A new reading. Bulletin of the Asia Institute 18 [2004]: 53–68. Sims-Williams, Nicholas 2009 The Bactrian fragment in Manichaean script (M 1224). In: Desmond Durkin-Meisterernst, Christiane Reck, and Dieter Weber (eds.), Literarische Stoffe und ihre Gestaltung in mitteliranischer Zeit: Kolloquium anlässlich des 70. Geburtstages von Werner Sundermann, 245–268. Wiesbaden: Reichert.

Sims-Williams, Nicholas 2010 Bactrian personal names. (Iranisches Personennamenbuch 2.7.) Vienna: Österreichische Akademie der Wissenschaften. Sims-Williams, Nicholas 2011a Differential object marking in Bactrian. In: Korn, Haig, Karimi & Samvelian (eds.) 2011: 23–38. Sims-Williams, Nicholas 2011b Remarks on the phonology of the Manichaean Bactrian fragment (M 1224). In: Elena K. Molčanova et al. (eds.), Leksika, ėtimologija, jazykovye kontakty: K jubileju doktora filologičeskix nauk, professora Džoj Iosifovny Ėdel’man, 244–251. Moscow: Tezaurus. Sims-Williams, Nicholas 2012a Bactrian historical inscriptions of the Kushan period. The Silk Road 10: 76–80. Sims-Williams, Nicholas, and Desmond Durkin-Meisterernst 2012b Dictionary of Manichaean texts III: Texts from Central Asia and China 2: Dictionary of Manichaean Sogdian and Bactrian. Turnhout: Brepols. Singh, K. S., and S. Manoharan 1993 Languages and scripts. (People of India National Series 9.) Oxford: Oxford University Press. Singh, Ram Adhar 1980 Syntax of Apabhraṁśa. Calcutta: Simant Publications. Skjærvø, Prods Oktor 1983a Case in Inscriptional Middle Persian, Inscriptional Parthian and the Pahlavi Psalter. Studia Iranica 12: 47–62, 151–181. Skjærvø, Prods Oktor 1983b Farnah-: mot mède en vieux-perse? Bulletin de la Société de Linguistique de Paris 78: 241–259. Skjærvø, Prods Oktor 1989 Modern East Iranian languages. In: Schmitt (ed.) 1989: 370–383. Skjærvø, Prods Oktor 1997 The state of Old Avestan scholarship. Journal of the American Oriental Society 117: 103–114. Skjærvø, Prods Oktor 2009a Middle West Iranian. In: Windfuhr (ed.) 2009: 196–278. Skjærvø, Prods Oktor 2009b Old Iranian. In: Windfuhr (ed.) 2009: 43–195. Slade, Benjamin 2013 The diachrony of light and auxiliary verbs in Indo-Aryan. Diachronica 30(4): 531–578. Smalley, William A. 1961 Outline of Khmuʔ structure. New Haven: American Oriental Society. Smith, Kenneth D. 1972 A phonological reconstruction of Proto-North-Bahnaric. 
Ukarumpa, Papua New Guinea: Summer Institute of Linguistics. Smith, Kenneth D. 1975 The velar-animal prefix relic in Vietnam languages. Linguistics of the Tibeto-Burman Area 2(1): 1–18.

Smith, Kenneth D. 1992 The -VC rhyme link between Bahnaric and Katuic. Mon-Khmer Studies 18–19: 106–159. Sokolovskaja, N. K., and Nguyen V. T. 1987 Jazyk Muong: Materialy sovetsko-v’etnamskoj lingvističeskoj èkspedicii 1979 goda [The Muong Language: Materials from the Soviet-Vietnamese linguistic expedition of 1979]. Moscow: Nauka. Som, Bidisha 2006 A lexico-semantic study of Great Andamanese: A thematic approach. Jawaharlal Nehru University PhD dissertation. Southworth, Franklin C. 1976 On subgroups in Dravidian. International Journal of Dravidian Linguistics 5(1): 114–137. Southworth, Franklin C. 2005 Linguistic archaeology of South Asia. London/New York: RoutledgeCurzon. Southworth, Franklin C. 2012 Rice in Dravidian. In: Magnus Fiskesjö and Yue-ie Caroline Hsing (eds.), Rice and language across Asia: Crops, movement, and social change, 142–148. (Special issue of the journal Rice.) New York: Springer. Speijer, J. S. 1886 Sanskrit syntax. Leiden: Brill. Speijer, J. S. 1896 Vedische und Sanskrit-Syntax. Straßburg: Trübner. Sridhar, S. N. 1979 Dative subjects and the notion of subject. Lingua 49: 99–125. Sridhar, S. N. 1990 Kannada: A descriptive grammar. London/New York: Routledge. Srivastava, Dayanand 1970 Historical syntax of Early Hindi prose (1800–1850 A. D.) Part I: Syntax of the cases. Calcutta: Atima Prakashan. Stampe, David L., and Norman H. Zide 1968 The place of Kharia-Juang in the Munda family. In: Krishnamurti (ed.) 1968: 370–377. Starosta, Stanley 1992 Sora combining forms and pseudo-compounding. Mon-Khmer Studies 18–19: 78–105. Starostin, George 2010 Dene-Yeniseian and Dene-Caucasian: Pronouns and other thoughts. In: Tuttle & Spence (eds.) 2010: 107–117. Starostin, Georgij S. 2005 Vnešnie sootvetstvija načaľnyx zvonkix smyčnyx v dravidijskix jazykax. Orientalia et Classica: Aspekty komparatistiki 1: 125–132. Moskva: Rossijskij gosudarstvennyj gumanitarnyj universitet (Trudy Instituta vostočnyx kuľtur i antičnosti). Starostin, S. A. 
2005 Sino-Caucasian. http://starling.rinet.ru/Texts/scc.pdf (accessed 22 November 2013).

Steever, Sanford B. 1988 The serial verb formation in the Dravidian languages. Delhi: Motilal Banarsidass. Steever, Sanford B. 1993 Analysis to synthesis: The development of complex verb morphology in the Dravidian languages. New York: Oxford University Press. Steever, Sanford B. (ed.) 1998 The Dravidian languages. London/New York: Routledge. Stilo, Donald 2004 Vafsi folk tales: Twenty four folk tales in the Gurchani dialect of Vafsi as narrated by Ghazanfar Mahmudi and Mashdi Mahdi and collected by Lawrence P. Elwell-Sutton. Wiesbaden: Reichert. Stilo, Donald 2005 Iranian as a buffer zone between Turkic and Semitic. In: Éva Csató, Bo Isaksson, and Carina Jahani (eds.), Linguistic convergence and areal diffusion: Case studies from Iranian, Semitic and Turkic, 35–63. London/New York: RoutledgeCurzon. Stilo, Donald 2009 Case in Iranian: From reduction and loss to innovation and renewal. In: Andrej Malchukov and Andrew Spencer (eds.), The Oxford handbook of case, 700–715. Oxford: Oxford University Press. Strand, Richard F. 1973 Notes on the Nuristani and Dardic languages. Journal of the American Oriental Society 93: 297–305. Strand, Richard F. 1985 Locality and nominal relationships in Kamviri. In: Arlene Zide et al. (eds.) 1985: 48–57. Strand, Richard F. 1991 Depicting cognitive images in Kâmviri. 20th Annual Conference on South Asia, University of Wisconsin, Madison, November, 1991. Strand, Richard F. 1997 Direction and location in the Nuristâni languages. 3rd Himalayan Languages Symposium, University of California at Santa Barbara, July 1997. Reproduced in Nuristân: Hidden Land of the Hindu Kush. http://nuristan.info/Nuristani/Kamkata/Kom/KomLanguage/kamdirec.html (accessed 20 November 2014) Strand, Richard F. 1997–1999 The history of the Kom. Nuristân: Hidden Land of the Hindu Kush. http://nuristan.info/Nuristani/Kamkata/Kom/KomTexts/KomHist.html (accessed 20 November 2014) Strand, Richard F. 1997–2007 The sound system of kâmv’iri. 
Nuristân: Hidden Land of the Hindu Kush. http://nuristan.info/Nuristani/Kamkata/Kom/KomLanguage/Lexicon/phon.html (accessed 20 November 2014) Strand, Richard F. 1997–present Nuristân: Hidden Land of the Hindu Kush. http://nuristan.info. (accessed 20 November 2014)

Strand, Richard F. 1998–2002 The kalaṣa of kalaṣüm. Nuristân: Hidden Land of the Hindu Kush. http://nuristan.info/Nuristani/Kalasha/kalasha.html (accessed 20 November 2014) Strand, Richard F. 1999 Review of Almuth Degener, Die Sprache von Nisheygram im afghanischen Hindukusch. Acta Orientalia 60: 236–244. Strand, Richard F. 1999–2009 Languages of the Hindu-Kush [Table]. Nuristân: Hidden Land of the Hindu Kush. http://nuristan.info/lngIndex0.html (accessed 20 November 2014) Strand, Richard F. 1999a–present Lexicons of the Hindu Kush. Nuristân: Hidden land of the Hindu Kush. http://nuristan.info/lngFrameL.html (accessed 20 November 2014) Strand, Richard F. 1999b–present Kâmviri grammar. Nuristân: Hidden land of the Hindu Kush. http://nuristan.info/lngFrameG.html (accessed 20 November 2014) Strand, Richard F. 2000–2002 The cognitive geometry of object relationships: Case markers and subject reference [in Aćharêtâ']. Nuristân: Hidden Land of the Hindu Kush. http://nuristan.info/IndoAryan/Indus/Atsaret/AtsaretLanguage/Lexicon/case.html (accessed 20 November 2014) Strand, Richard F. 2001 The tongues of Peristân. In: Alberto M. Cacopardo and Augusto S. Cacopardo (eds.), Gates of Peristan: History, religion and society in the Hindu Kush, 251–257. Roma: Istituto Italiano per l’Africa e l’Oriente. Strand, Richard F. 2002–2011 Phonological processes on the Indo-Iranian frontier. Nuristân: Hidden Land of the Hindu Kush. http://nuristan.info/Phonology/IIFproc.html (accessed 20 November 2014) Strand, Richard F. 2007–2008 Transcription and pronunciation of the Nuristâni languages. Nuristân: Hidden Land of the Hindu Kush. http://nuristan.info/Nuristani/phon.html (accessed 20 November 2014) Strand, Richard F. 2008–2010 The evolution of the Nuristâni languages. Nuristân: Hidden Land of the Hindu Kush. http://nuristan.info/Nuristani/NuristaniEvolution.html (accessed 20 November 2014) Strand, Richard F. 2010 Nurestâni Languages. In: Encyclopaedia Iranica, online edition. 
http://www.iranicaonline.org/articles/nurestani-languages (accessed 20 November 2014) Strand, Richard F. 2011 Indo-European spatial markers in Kâmviri. Nuristân: Hidden Land of the Hindu Kush. http://nuristan.info/Nuristani/IEmarkers.html (accessed 20 November 2014) Strand, Richard F. 2011a The sound system of kt'ivřâ·i-vari. Nuristân: Hidden Land of the Hindu Kush. http://nuristan.info/Nuristani/Kamkata/Kata/KataLanguage/Lexicon/phon.html (accessed 20 November 2014)

Strand, Richard F. 2012 Nûristânî etymological lexicon. Nuristân: Hidden Land of the Hindu Kush. http://nuristan.info/lngFrameL.html (accessed 20 November 2014) Strand, Richard F. 2013 Basic processes in the evolution of the Nûristânî languages. Nuristân: Hidden Land of the Hindu Kush. http://nuristan.info/Nuristani/BasicEvolutionaryProcesses.html (accessed 20 November 2014) Stump, Gregory, and Andrew Hippisley 2011 Valence sensitivity in Pamirian past-tense inflection: A realizational analysis. In: Korn, Haig, Karimi & Samvelian (eds.) 2011: 104–115. Subbarao, Karumuri V. 2012 South Asian languages: A syntactic typology. Cambridge/New York: Cambridge University Press. Subbarao, Karumuri V., and Gracious M. Temsen 2003 Wh-question formation in Khasi. In: Tej R. Kansakar and Mark Turin (eds.), Themes in Himalayan languages, 197–218. Heidelberg/Kathmandu: South Asia Institute/Tribhuvan University. Subbarao, Karumuri V., and Harbir Arora 2005 The conjunctive participle in Dakkhini Hindi-Urdu: Making the best of both worlds. Keynote, Murray B. Emeneau Centenary Celebrations-CIIL, Mysore, India, January 1, 2005. Subbiah, G. 1972 A note on agreement in Kota. In: S. Agesthialingom and V. Shanmugam (eds.), Third Seminar on Dravidian Linguistics, 285–292. Annamalainagar: Annamalai University. Subbiah, G. 1973 Kota personal pronouns. Indian Linguistics 34: 56–58. Subrahmanyam, P. S. 1964 Two problems in Parji verb forms. Indian Linguistics 25: 46–55. Subrahmanyam, P. S. 1968 Position of Tulu in Dravidian. Indian Linguistics 29: 47–66. Subrahmanyam, P. S. 1971 Dravidian verb morphology: A comparative study. Annamalainagar: Annamalai University. Subrahmanyam, P. S. 1983 Dravidian comparative phonology. Annamalainagar: Annamalai University. Subrahmanyam, P. S. 2008 Dravidian comparative grammar, 1. Mysore: Central Institute of Indian Languages. Subrahmanyam, V. I. 2013 The morphosyntax of the Dravidian languages. 
Thiruvananthapuram: Dravidian Linguistic Association. Subramoniam, V. I. 1991–1992 Alternation of v/b. In: Professor S. M. Katre felicitation volume, 407–410. (Bulletin of the Deccan College Post-Graduate and Research Institute.) Pune: Deccan College Post-Graduate and Research Institute.

Sun Hongkai, Lu Zhaozun, Zhang Jichuan, and Ouyang Jueya 1980 Menba, Luoba, Deng ren de yu yan [The languages of the Monpa, Lhopa, and Deng peoples]. Peking: Zhongguo shehui kexue chuban-she. Sun, Jackson T. S. 1993a A historical-comparative study of the Tani (Mirish) branch of Tibeto-Burman. University of California, Berkeley, PhD dissertation. Sun, Jackson T. S. 1993b The linguistic position of Tani (Mirish) in Tibeto-Burman. Linguistics of the Tibeto-Burman Area 16(2): 143–188. Sun, Jackson T. S. 2003a Caodeng rGyalrong. In: Thurgood & LaPolla (eds.) 2003: 490–502. Sun, Jackson T. S. 2003b Tani Languages. In: Thurgood & LaPolla (eds.) 2003: 456–466. Suvarchala, B. 1984 Central Dravidian comparative phonology. Hyderabad: Osmania University MPhil dissertation. Svantesson, Jan-Olof 1983 Kammu phonology and morphology. Lund: CWK Gleerup. Tagare, Ganesh Vasudev 1987 A historical grammar of Apabhraṁśa. Poona: Deccan College. Támsáng, Khárpú 1980 Lepcha-English encyclopedic dictionary. Kalimpong: Mayel Clymit Tamsang. Repr. 2009. Tâza, Samiullâh 1995 Alifbâî-i zabân-i nûristânî [Alphabets of Nûristâni language] (self published). Tâza, Samiullâh 2000 Adabiyât-i šafâhî-i nûristân [The oral literature of Nûristân]. Peshawar: Dâniš kitâbîûn qis̤a xawânî. Tedesco, Paul 1921 Dialektologie der mitteliranischen Turfantexte. Monde Oriental 15: 184–258. Temple, Richard C. 1902a A grammar of the Andamanese language, being Chapter IV of Part I of the Census Report on the Andaman and Nicobar Islands. Port Blair: Superintendent’s Printing Press. Reprint 1994, New Delhi. Temple, Richard C. 1902b A grammar of the Nicobarese language, being Chapter IV of Part II of the Census Report on the Andaman and Nicobar Islands. Port Blair: Superintendent’s Office. Thangaraj, K. Kumarasamy, Gyaneshwer Chaubey, Toomas Kivisild, Alla G. Reddy, Vijay Kumar Singh, Avinash A. Rasalkar, and Lalji Singh 2005 Reconstructing the origin of Andaman Islanders. Science 308: 996. 
Thangaraj, Kumarasamy, Gyaneshwer Chaubey, Vijay Kumar Singh, Ayyasamy Vanniarajan, Ismail Thanseem, Alla G. Reddy, and Lalji Singh 2006 In situ origin of deep rooting lineages of mitochondrial Macrohaplogroup ‘M’ in India. BMC Genomics 2006(7): 151.

Thangaraj, Kumarasamy, Lalji Singh, Alla G. Reddy, V. Raghavendra Rao, Subhash C. Sehgal, Peter A. Underhill, Melanie Pierson, Ian G. Frame, and Erika Hagelberg 2003 Genetic affinities of the Andaman Islanders: A vanishing human population. Current Biology 13(2): 86–93. Thomas, David L. 1971 Chrau grammar. (Oceanic Linguistics, Special Publication 7.) Honolulu: University of Hawaii Press. Thomas, David L. 1980 The place of Alak, Tampuan, and West Bahnaric. Mon-Khmer Studies 9: 171–186. Thomas, David L. 1992 On sesquisyllabic structure. Mon-Khmer Studies 21: 205–210. Thomas, David L., and Robert K. Headley 1970 More on Mon-Khmer subgroupings. Lingua 25: 398–418. Thomasiah, K. 1986 Naikri dialect of Kolami: Descriptive and comparative study. Annamalai University PhD dissertation. Thordarson, Fridrik 2009 Ossetic grammatical studies. Vienna: Österreichische Akademie der Wissenschaften. Thumb, Albert, and Richard Hauschild 1958 Handbuch des Sanskrit. Heidelberg: Winter. Thurgood, Graham 1984 The “Rung” languages: A major new Tibeto-Burman subgroup. Proceedings of the 10th Annual Meeting of the Berkeley Linguistics Society, 338–349. Thurgood, Graham 2003 A subgrouping of Sino-Tibetan languages: The interaction between language contact, change, and inheritance. In: Thurgood & LaPolla (eds.) 2003: 3–21. Thurgood, Graham, and Randy J. LaPolla (eds.) 2003 The Sino-Tibetan Languages. London/New York: Routledge. Tiffou, Étienne 1999 Parlons bourouchaski: État présent sur la culture et la langue des Bourouchos (Pakistan). Paris: L’Harmattan. Tiffou, Étienne 2014 Dictionnaire du bourouchaski du Yasin: Bourouchaski – français et français – bourouchaski. Louvain-la-Neuve: Peeters. Tiffou, Étienne (ed.) 2004 Bourouchaskiana: Actes du colloque sur le bourouchaski organisé à l’occasion du XXXVIème congrès international sur les études asiatiques et nord-africaines (Montréal 27 août – 2 septembre 2000). (Bibliothèque des Cahiers de Linguistique de l’Université de Louvain, 113.) 
Louvain-la-Neuve: Peeters. Tiffou, Étienne, and Richard Patry 1995a La relative en bourouchaski du Yasin. Bulletin de la Société de Linguistique de Paris 90(1): 335–390. Tiffou, Étienne, and Richard Patry 1995b La notion de pluralité verbale: Le cas du bourouchaski du Yasin. Journal asiatique 282(2): 407–444.

Tiffou, Étienne, and Yves-Charles Morin 2004 Le bénéfactif dans le bourouchaski du Yasin. In: Tiffou (ed.) 2004: 63–81. Tikkanen, Bertil 1987 The Sanskrit gerund: A synchronic, diachronic and typological analysis. Helsinki: Finnish Oriental Society. Tikkanen, Bertil 1988 On Burushaski and other ancient substrata in Northwestern South Asia. Studia Orientalia 64: 303–325. Tikkanen, Bertil 1999 Archeological-linguistic correlations in formation of retroflex typologies and correlating features in South Asia. In: R. Blench and M. Spriggs (eds.), Archaeology and language IV: Language change and cultural transformation, 138–148. London: Routledge. Toba, Sueyoshi 1991 A bibliography of Nepalese languages and linguistics. Kirtipur: Linguistic Society of Nepal, Tribhuvan University. Toba, Sueyoshi 2004 Tilung: An endangered Kiranti language: Preliminary observations. Nepalese Linguistics 20: 142–147. Trail, Robert L. (ed.) 1973 Patterns in clause, sentence, discourse in selected languages of India and Nepal, vol. 1. Norman, OK: Summer Institute of Linguistics. Tran Nghia 1976 Some characteristics of the Khmer-Mon languages. In: Jenner et al. (eds.) 1976: 1205–1214. Trask, Robert 1979 On the origins of ergativity. In: Frans Plank (ed.), Ergativity: Towards a theory of grammatical relations, 385–404. London: Academic Press. Trautmann, Thomas R. 1981 Dravidian kinship. Cambridge: Cambridge University Press. Tremblay, Xavier 2002 Ist die Aktivendung 3Pl -āra in einigen ostiranischen Sprachen inneriranische Entwicklung oder indogermanisches Erbe? (mit einem Exkurs über die athematischen Endungen des Chwaresmischen). Münchener Studien zur Sprachwissenschaft 62: 259–287. Tremblay, Xavier 2005a Bildeten die iranischen Sprachen ursprünglich eine genetische Familie oder einen Sprachbund innerhalb des indo-iranischen Zweiges? In: Olav Hackstein and Gerhard Meiser (eds.), Sprachkontakt und Sprachwandel: Akten der XI. Fachtagung der Indogermanischen Gesellschaft (…), 673–688. 
Wiesbaden: Reichert. Tremblay, Xavier 2005b Die Bildung des chotansakischen agentiven Präteritums (Beiträge zur vergleichenden Grammatik der iranischen Sprachen IX). In: N. N. Kazanskij and E. R. Kriuxkova (eds.), Hṛdā́ mánasā: Sbornik statej k 70–letiju so dnja roždenija professora Leonarda Georgeviča Gercenberga, 75–80. Sankt Peterburg: Nauka.

Tremblay, Xavier I-III Iranian historical linguistics in the twentieth century. Indo-European Studies Bulletin 11 (2005): 1–23 [= I]; 13 (2008): 1–51 [= II]; 14 (2009): 1–35 [= III] Trenckner, V[ilhelm], Dines Andersen, Helmer Smith, Oskar von Hinüber, and Ole Holten Pind 1924–1948 A critical Pāli dictionary. Copenhagen: Kongelige Danske Videnskabernes Selskab. http://pali.hum.ku.dk/cpd/ (accessed 17 November 2013) Turin, Mark 1998 The Thangmi verbal agreement system and the Kiranti connection. Bulletin of the School of Oriental and African Studies 61(3): 476–491. Turin, Mark 2004 Newar-Thangmi lexical correspondences and the linguistic classification of Thangmi. Journal of Asian and African Studies / Ajia Afurika gengo bunka kenkyu 68: 97–120. Turin, Mark 2011 A grammar of Thangmi with an ethnolinguistic introduction to the speakers and their culture. Leiden: Brill. Turner, Ralph L. 1921 Gujarati phonology. Journal of the Royal Asiatic Society 53: 329–365, 54: 505–544. Repr. in Turner 1973: 88–145. Turner, Ralph L. 1923 The Sindhi recursives or voiced stops preceded by glottal closure. Bulletin of the School of Oriental Studies 3: 301–315. Turner, Ralph L. 1924 Cerebralization in Sindhi. Journal of the Royal Asiatic Society 56: 555–584. Repr. in Turner 1973: 206–227. Turner, Ralph L. 1926a Middle Indian -d- and -dd-. In: Festgabe Hermann Jacobi, ed. by Willibald Kirfel, 34–45. Bonn: Klopp. Repr. in Turner 1973: 239–250. Turner, Ralph L. 1926b The position of Romani in Indo-Aryan. Journal of the Gypsy Lore Society, Third Series, 5(4): 145–194. Turner, Ralph L. 1927 Notes on Dardic. Bulletin of the School of Oriental Studies 4: 533–541. Turner, Ralph L. 1962–1969 A comparative dictionary of the Indo-Aryan languages. London: Oxford University Press. http://dsal.uchicago.edu/dictionaries/soas/ (accessed 17 November 2013) Turner, Ralph L. 1967 Geminates after long vowel in Indo-Aryan. Bulletin of the School of Oriental and African Studies 30: 73–82. Repr. 
in Turner 1973: 405–415. Turner, Ralph L. 1973 Indo-Aryan linguistics: Collected papers. Ed. by J. Brough. London: School of Oriental and African Studies. Repr. 1985, Delhi: Disha Publications. Turner, Ralph L. 1985 A comparative dictionary of the Indo-Aryan languages, vol. 3: Addenda and corrigenda. Ed. by James C. Wright. London: School of Oriental and African Studies.

Tuttle, Edwin H. 1940 Dravidian gender words. Bulletin of the School of Oriental and African Studies 4: 769–778. Tuttle, Siri, and Justin Spence (eds.) 2010 Working papers in Athabaskan languages (Alaska Native Language Center Working Papers, 8.) Fairbanks, Alaska: Alaska Native Language Center. Tyler, Stephen A. 1968 Dravidian and Uralian: The lexical evidence. Language 44(4): 798–812. Tyler, Stephen A. 1990 Summary of noun and verb inflectional correspondences in Proto-Dravidian and Proto-Uralian. In: Vitaly Shevoroshkin (ed.), Proto-languages and proto-cultures, 68–76. Bochum: Brockmeyer. Uma Maheshwar Rao, Garapati 2014 Dravidian and Mongolian affinity: The provisional evidence. International Journal of Dravidian Linguistics 43(1): 1–37. Umarani, Pappuswamy 2005 Dative subjects in Tamil: A computational analysis. South Asian Language Review 15(2): 40–62. Vacek, Jaroslav 1983 Dravido-Altaic: The Mongolian and Dravidian verbal bases. Journal of Tamil Studies 23: 1–17. Vacek, Jaroslav 1987 The Dravido-Altaic relationship: Some views and future prospects. Archiv Orientální 55(2): 134–149. Vaidya, Paraśurāma Lakṣmaṇa 1941 A manual of Ardhamagadhi. Poona: Wadia College. van Driem, George 1990 An exploration of Proto-Kiranti verbal morphology. Acta Linguistica Hafniensia 22: 27–48. van Driem, George 1991 Bahing and the Proto-Kiranti verb. Bulletin of the School of Oriental and African Studies 54(2): 336–356. van Driem, George 1992 Le proto-kiranti revisité, morphologie verbale du lohorung. Acta Linguistica Hafniensia 24: 33–75. van Driem, George 1993 The proto-Tibeto-Burman verbal agreement system. Bulletin of the School of Oriental and African Studies 56(2): 292–334. van Driem, George 1995 Black Mountain conjugational morphology, Proto-Tibeto-Burman morphosyntax, and the linguistic position of Chinese. In: Yoshio Nishi, James A. Matisoff, and Yasuhiko Nagano (eds.), New horizons in Tibeto-Burman morphosyntax, 229–259. (Senri Ethnological Studies 41.) 
Osaka: National Museum of Ethnology. van Driem, George 1997 Sino-Bodic. Bulletin of the School of Oriental and African Studies 60(3): 455–488.

van Driem, George 2001a Languages of the Himalayas, 2 volumes. Leiden: Brill. van Driem, George 2001b The Nahali. In: van Driem 2001a: 242–253. van Driem, George 2002 Tibeto-Burman phylogeny and prehistory: Languages, material culture and genes. In: Peter Bellwood and Colin Renfrew (eds.), Examining the farming/language dispersal hypothesis, 233–249. Cambridge: McDonald Institute for Archaeological Research. van Driem, George 2005 Tibeto-Burman vs. Indo-Chinese: Implications for population geneticists, archaeologists and prehistorians. In: Laurent Sagart, Roger Blench, and Alicia Sanchez-Mazas (eds.), The peopling of East Asia: Putting together the archaeology, linguistics and genetics, 81–106. London: RoutledgeCurzon. van Driem, George 2007 South Asia and the Middle East. In: Christopher Moseley (ed.), Encyclopaedia of the world’s endangered languages, 283–347. London/New York: Routledge. van Driem, George 2011a Tibeto-Burman subgroups and historical grammar. Himalayan Linguistics 10(1): 31–39. (Special Issue in Memory of Michael Noonan and David Watters). van Driem, George 2011b Lost in the sands of time somewhere north of the Bay of Bengal. In: Mark Turin and Bettina Zeisler (eds.), Himalayan languages and linguistics: Studies in phonology, semantics, morphology and syntax, 13–38. Leiden: Brill. van Driem, George 2013 Biactantial agreement in the Gongduk transitive verb in the broader Tibeto-Burman context. In: Tim Thornes, Johana Jensen, Gwen Hyslop, and Erik Andvik (eds.), Functional-historical approaches to explanation: In honor of Scott DeLancey, 69–82. Amsterdam/Philadelphia: Benjamins. van Driem, George, and Suhnu Ram Sharma 1996 In search of Kentum Indo-Europeans in the Himalayas. Indogermanische Forschungen 101: 107–146. van Skyhawk, Hugh 2003 Burushaski-Texte aus Hispar: Materialien zum Verständnis einer archaischen Bergkultur in Nordpakistan. Wiesbaden: Harrassowitz. 
VanBik, Kenneth 2006 Proto-Kuki-Chin: A reconstructed ancestor of the Kuki-Chin languages. University of California, Berkeley, PhD dissertation. Varma, G. Srinivasa 1970 Vaagri Boli, an Indo-Aryan language. Annamalainagar: Annamalai University. Verma, Manindra K. (ed.) 1993 Complex predicates in South Asian languages. New Delhi: Manohar. Verma, Manindra K., and K. P. Mohanan (eds.) 1990 Experiencer subjects in South Asian languages. Stanford: CSLI. Voegelin, Charles Frederick, and Florence Marie Voegelin 1977 Classification and index of the world’s languages. New York: Elsevier North Holland.

Wackernagel, Jakob 1896 Altindische Grammatik, 1: Lautlehre. Göttingen: Vandenhoeck & Ruprecht. Wackernagel, Jakob 1905 Altindische Grammatik, 2:1: Einleitung zur Wortlehre: Nominalkomposition. Göttingen: Vandenhoeck & Ruprecht. Wackernagel, Jakob 1957 Altindische Grammatik, 1: Lautlehre, 2nd ed. Göttingen: Vandenhoeck & Ruprecht. Wallace, William D. 1984 The interaction of word order and pragmatics in a Sanskrit text. Studies in the Linguistic Sciences 14(1): 167–188. Wang Feng 2005 On the genetic position of the Bai language. Cahiers de linguistique Asie Orientale 34(1): 101–127. Watters, David E. 2002 A grammar of Kham. Cambridge/New York: Cambridge University Press. Watters, David E. 2005 Kusunda: A typological isolate in South Asia. In: Yogendra Yadava, Govinda Bhattarai, Ram Raj Lohani, Balaram Prasain, and Krishna Parajuli (eds.), Contemporary issues in Nepalese linguistics, 375–396. Kathmandu: Linguistic Society of Nepal. Watters, David E., with Yogendra P. Yadava, Madhav P. Pokharel, and Balaram Prasain 2006 Notes on Kusunda grammar: A language isolate of Nepal. Himalayan Linguistics Archive 3: 1–182. (First published 2005, by the National Foundation for the Development of Indigenous Nationalities, Kathmandu, Nepal.) http://www.linguistics.ucsb.edu/HimalayanLinguistics/grammars/HLA03.html (accessed 18 November 2013) Wazir Shafi n.d. Burushaski Raẓun: A book on Burushaski grammar (in Yasin dialect) (Foreword in Burushaski and in English by Major Dr. Faiz Aman). Karachi: Bureau of Composition & Translation, University of Karachi. Weber, Dieter 1970 Die Stellung der sog. Inchoativa im Mitteliranischen. Clausthal-Zellerfeld: Boenecke. Weber, Dieter 1980 Beiträge zur historischen Grammatik des Ossetischen. Indogermanische Forschungen 85: 126–137. Weidert, Alfons K. 1975 I tkong Amwi: Deskriptive Analyse eines Wardialekts des Khasi. Wiesbaden: Harrassowitz. Weidert, Alfons K. 1977 Tai-Khamti phonology and vocabulary. Wiesbaden: Steiner. 
Wendtland, Antje 2008 On ergativity in the Pamir languages. In: Karimi, Samiian & Stilo (eds.) 2008: 419–433. Wendtland, Antje 2009 The position of the Pamir languages within East Iranian. Orientalia Suecana 58: 172–188

The languages, their histories, and their genetic classification


Wendtland, Antje 2011 The emergence and development of the Sogdian perfect. In: Korn, Haig, Karimi & Samvelian (eds.) 2011: 39–52.
Whitehouse, Paul, Timothy Usher, Merritt Ruhlen, and William S.-Y. Wang 2004 Kusunda: An Indo-Pacific language in Nepal. Proceedings of the National Academy of Sciences of the United States of America 101(15): 5692–5695. http://www.pnas.org/content/101/15/5692.full (accessed 18 November 2013)
Whitehouse, Paul 1997 The external relationships of the Nihali and Kusunda languages. Mother Tongue 3: 4–49.
Whitney, William Dwight 1885 The roots, verb-forms, and primary derivatives of the Sanskrit language. Leipzig: Breitkopf und Härtel.
Whitney, William Dwight 1889 Sanskrit grammar, 2nd ed. Cambridge, MA: Harvard University Press.
Whitney, William Dwight 1892 On the narrative use of imperfect and perfect in the Brāhmaṇas. Transactions of the American Philological Association 23: 5–34.
Wijeratne, P. B. F. 1945–1957 Phonology of the Sinhalese inscriptions up to the end of the 10th century AD. Bulletin of the School of Oriental and African Studies 11(3) (1945): 580–594; 11(4) (1946): 823–836; 12(1) (1947): 163–183; 13(1) (1949): 166–181; 14(2) (1952): 263–298; 19(3) (1957): 479–514.
Wijesundera, Stanley, G. D. Wijayawardhana, J. B. Disanayaka, Hassan Ahmed Maniku, and Mohamed Luthufee 1988 Historical and linguistic survey of Dhivehi: Final Report. MS, University of Colombo.
Wilaiwan, Khanittanan 1986 Kamti Tai: From an SVO to an SOV language. In: Krishnamurti et al. (eds.) 1986: 174–178.
Windfuhr, Gernot L. 2009 Dialectology and topics. In: Windfuhr (ed.) 2009: 5–42.
Windfuhr, Gernot L. (ed.) 2009 The Iranian languages. London/New York: Routledge.
Winfield, W. W. 1928 A grammar of the Kui language. Calcutta: Asiatic Society of Bengal.
Winfield, W. W. 1929 A vocabulary of the Kui language. Calcutta: Asiatic Society of Bengal.
Witzel, Michael 1989 Tracing the Vedic dialects. In: Caillat (ed.) 1989: 97–265.
Witzel, Michael 1995 Early Indian history: Linguistic and textual parameters. In: George Erdosy (ed.), The Indo-Aryans of ancient South Asia: Language, material culture, and ethnicity, 85–125. Berlin/New York: de Gruyter.
Witzel, Michael 1999a Substrate languages in Old Indo-Aryan. Electronic Journal of Vedic Studies 5: 1–67.


Witzel, Michael 1999b Early sources for South Asian substrate languages. Mother Tongue Special Issue October 1999: 1–70.
Woodward, Roger D. (ed.) 2008 Ancient languages of Asia and the Americas. Cambridge: Cambridge University Press.
Yadava, Yogendra P., and Warren W. Glover (eds.) 1999 Topics in Nepalese linguistics. Kathmandu: Royal Academy of Nepal.
Yoshida, Yutaka 2009a Minor moods in Sogdian. In: Yoshida Kazuhiko and Brent Vine (eds.), East and West: Papers in Indo-European studies, 281–293. Bremen: Hempen.
Yoshida, Yutaka 2009b Sogdian. In: Windfuhr (ed.) 2009: 279–335.
Zide, Arlene R. K. n.d. A Gorum-English lexicon. Unpublished MS, Chicago.
Zide, Arlene R. K., David Magier, and Eric Schiller (eds.) 1985 Proceedings of the Conference on Participant Roles: South Asia and Adjacent Areas. Bloomington: Indiana University Linguistics Club.
Zide, Norman H. 1958 Final stops in Korku and Santali. Indian Linguistics 19: 44–48.
Zide, Norman H. 1967 The Santali Ol Cemet script. In: Languages and areas: Studies presented to George V. Bobrinskoy, 180–189. Chicago: Department of Linguistics, Department of Slavic Languages and Literatures, and Committee on Southern Asian Studies, University of Chicago.
Zide, Norman H. 1969 Munda and non-Munda Austro-Asiatic languages. In: Sebeok et al. (1969): 411–430.
Zide, Norman H. 1976 ‘3’ and ‘4’ in South Munda. Linguistics 174: 89–98.
Zide, Norman H. 1978 Studies in the Munda numerals. Mysore: Central Institute of Indian Languages.
Zide, Norman H. 1985 Notes mostly historical on some participant roles in some Munda languages. In: A. Zide et al. (eds.) 1985: 92–103.
Zide, Norman H. 1996 On Nihali. Mother Tongue 2: 93–100. (1995 prepublication version at http://www.ling.hawaii.edu/austroasiatic/AA/nihali, accessed 20 November 2013)
Zide, Norman H. 1999–2000 Three Munda scripts. Linguistics of the Tibeto-Burman Area 22: 199–232.
Zide, Norman H. 2008a Korku. In: Anderson (ed.) 2008: 256–298.
Zide, Norman H. 2008b On Nihali. In: Anderson (ed.) 2008: 764–776.
Zide, Norman H. (ed.) 1966 Studies in comparative Austroasiatic linguistics. The Hague: Mouton.


Zide, Norman H., and Aasha Kelkar Mundlay n.d. Nihali, a Munda language? Unpublished MS. (Abstract by David Stampe, International Journal of American Linguistics 32: 395, 1966.)
Zide, Norman H., and Arlene R. K. Zide 1976 Proto-Munda cultural vocabulary: Evidence for early agriculture. In: Jenner et al. (eds.) 1976: 1295–1334.
Zide, Norman H., and Gregory D. S. Anderson 2001 The Proto-Munda verb: Some connections with Mon-Khmer. In: K. V. Subbarao and P. Bhaskararao (eds.), Yearbook of South-Asian Languages and Linguistics 2001, 517–540. Delhi: Sage Publications.
Zide, Norman, and Vishvajit Pandya 1989 A bibliographical introduction to Andamanese Linguistics. Journal of the American Oriental Society 109(4): 639–651.
Zoller, Claus Peter 1988 Bericht über besondere Archaismen im Bangani, einer Western Pahari-Sprache. Münchener Studien zur Sprachwissenschaft 49: 173–200.
Zoller, Claus Peter 1989 Bericht über grammatische Archaismen im Bangani. Münchener Studien zur Sprachwissenschaft 50: 159–218.
Zoller, Claus Peter 1993 A note on Bangani. Indian Linguistics 54: 112–114.
Zvelebil, Kamil V. 1970 Comparative Dravidian phonology. The Hague/Paris: Mouton.
Zvelebil, Kamil V. 1973 The Irula language. Wiesbaden: Harrassowitz.
Zvelebil, Kamil V. 1977 A sketch of comparative Dravidian morphology, Part 1. The Hague: Mouton.
Zvelebil, Kamil V. 1979 The Irula (ërla) language, Part 2. Wiesbaden: Harrassowitz.
Zvelebil, Kamil V. 1980 A plea for Nilgiri areal studies. International Journal of Dravidian Linguistics 9: 1–22.
Zvelebil, Kamil V. 1982 The Irula (ërla) language, Part 3: Irula lore, texts and translations. Wiesbaden: Harrassowitz.
Zvelebil, Kamil V. 1990 Dravidian linguistics: An introduction. Pondicherry: Pondicherry Institute of Linguistics and Culture.
Zvelebil, Kamil V. 2004 Prolegomena to an etymological dictionary to the Irula language. In: Jean-Luc Chevillard and Eva Wilden (eds.), South Indian horizons: Felicitation volume for François Gros on the occasion of his 70th birthday, with the collaboration of A. Murugaiyan, with a preface by R. E. Asher, 281–290. Pondichéry: Institut Français de Pondichéry.

2 Contact and convergence
Edited by Elena Bashir

2.1. Introduction
By Elena Bashir

The field of areal linguistics is relatively recent.¹ With recent challenges to the stammbaum model of language history, e.g. Dixon 1997, language contact phenomena are receiving increased attention; contact and convergence, on the one hand, and “genetic” inheritance, on the other, are now acknowledged to be equally important sources of language similarity. Important works in this newly vigorous area include Thomason & Kaufman 1988; Ramat 1998; Thomason 2001; Dahl 2001; Heine & Kuteva 2003, 2005, and 2008; Matras, McMahon & Vincent 2006; and Muysken 2008. An online Journal of Language Contact published its first issue in 2007. See also Hock 1986: 491–512, Hock & Joseph 2009: 347–424, and Winford 2003 for general discussion.

2.1.1. Areal linguistics and South Asia

Ever since Emeneau 1956, 1969b, 1974, and 1980b,² Kuiper 1968a, and Masica 1976, South Asia has been an increasingly active site for the study of contact and convergence phenomena.³ Studies adopting an areal approach to South Asian phenomena include Bashir 1988, Marlow 1997, and Eaton 2008. A new focus on micro- rather than macro-areas is emerging. A recent research program, the results of which are described in Ebert 2009, has reexamined Masica’s (1976) proposed features of a South Asian sprachbund, using a broader base of descriptive data than was available in 1976. These researchers found a dividing line around the 84th meridian, languages to the east of which show many Dravidian and Austroasiatic traits, but few of the sprachbund features. The Indo-Aryan (IA) languages of this area (Assamese, Bengali, Nepali, Oriya) were found to have adopted many features from neighboring Austroasiatic and Tibeto-Burman languages (see 2.6.7 and the references therein). A workshop on linguistic microareas in South Asia was held at Uppsala University, 5–6 May 2014 (Saxena 2013). Linguistic features discussed at the workshop as (potential) areal phenomena include retroflexion, “impersonal patients”, pronominal clitics, causal constructions, and evidentiality. Areal configurations discussed include the Southern Kirant microarea, the Indian Himalayas, the Amdo sprachbund, Manipur, Eastern-Central South Asia, and southern Maharashtra. The proceedings appear as Saxena (ed.) 2015.

¹ According to Google’s English corpus and Ngram viewer (http://books.google.com/ngrams/), use of the term “areal linguistics”, first observed in the late 1940s, peaked in the 1970s and has remained high since then. German “Sprachbund” first appears in Google’s English corpus in the 1940s and continues to rise in frequency until now. “Linguistic area” first appeared around 1850 and, after a peak around 1890, its frequency has continued to rise slowly or remain steady until the present. “Language contact”, first noted around 1949, has shown continuous steady increase in use.
² More of Emeneau’s essays on related topics are collected in Emeneau 1980a.
³ Andronov (1968: 13) argues that continued convergence may weaken genetic boundaries and eventually lead to the emergence of new language families. He thinks that ‘the development of the typological similarity of the Modern Indo-Aryan and Dravidian languages can be regarded as a prerequisite or an initial stage in the formation of a new linguistic family.’

2.1.2. Desiderata

Given the state of the field of South Asian contact linguistics, what research priorities should be established? Much empirical data on language contact and convergence in South Asia is now available, and this research continues apace. At this point, the field seems ripe for attention to typological or theoretical generalizations that can be drawn from the abundant data. Haspelmath (2004: 221) says: ‘When reading [Aikhenvald & Dixon 2001], one can get the overall impression that research on areal linguistics is currently still in the hunting and gathering stage. All the articles are rich in data and individual observations, but there is not much systematicity in this research — no sampling or quantitative methods, no evaluation of specific competing models or hypotheses (apart from Dixon’s punctuated equilibrium model …).’4 Problems of distinguishing between contact and inherited phenomena need much attention. Contact and convergence among closely related languages is discussed in Braunmüller 2009. This work focuses on the German-Danish border situation, but the general discussion is relevant to some of the knottiest problems of the South Asian situation, like teasing apart contact and genetic features in the Dardic languages. Braunmüller says (p. 67): ‘In any case, linguistic convergence between genetically closely related languages/varieties inevitably results […] in overt or covert code mixing […] rather than a clear separation of genetically related varieties.’

⁴ Zoller (2005: 11–12) interprets the situation of the Dardic languages in terms of Dixon’s punctuated equilibrium model, specifically Dixon’s second type of language split, in which the diverging groups remain in close geographical proximity. According to Zoller, ‘The latter type of language splitting, which seems to reflect the scenario of the Dardic languages, is, according to Dixon, invariably motivated by political reasons.’


Johanson & Robbeets 2012 is an important new collection of articles on the problem of distinguishing “copies” — similar forms resulting from language contact — from “cognates” — similar forms resulting from inheritance from a common ancestral form. This collection reflects the new recognition that genealogical and areal explanations for shared morphology must be considered together, and the studies in it share the goal of developing criteria for distinguishing between cognates and copies. This problem of distinguishing between copies and cognates has been alluded to several times, though not in those terms, in the contributions to this chapter, particularly in connection with the problem of determining the trajectories of various Perso-Arabic words in the languages of the Northwest; 2.4.1.1 this volume. See also 2.5.1.1 for discussion of this problem in northeast India. Semantic convergence has received much less attention. A handful of studies address such questions. Verma (1976: 185) argues that stativeness ‘is a fundamental entity of the conceptual structure of Indic languages’ and is expressed by a variety of structural devices including the compound verb system. Klaiman (1986: 180), amplifying Verma’s case and focusing on “dative subjects” and the parameter of volitionality, argues that ‘characteristic similarities among South Asian languages at the formal level may be attributable to the sharing of semantic parameters at the conceptual level.’ The arguments presented by Verma (1976) and Klaiman (1986) about synchronic semantic parallelisms seem to foreshadow the work in Butt & Ahmed 2011, in which they argue for a sort of cyclical diachronic semantic stability such that the development of new case markers in New Indo-Aryan (NIA), after the inflectional case system of Old Indo-Aryan (OIA) was eroded during Middle Indo-Aryan (MIA), reflects changing instantiations of a stable spatial semantics. 
Hook (1982: 1) argues that formal and semantic convergence sometimes coincide, and sometimes do not. He teases apart some differences between languages which converge morphologically or syntactically but use those convergent forms to express different meanings, and those which agree in the semantic distinctions they make but use different morphological or syntactic devices to make these distinctions. For example, the compound verb is found in both Hindi and Marathi, an instance of formal convergence. However, according to Hook, while it expresses both anteriority/posteriority and perfectivity in Hindi, it does not express perfectivity in Marathi. Conversely, the same set of semantic contrasts (perfectivity and anteriority) which are expressed in Hindi by the compound verb are expressed in Godwari by the appearance of an “ergative adverb” po (Hook 1982: 33). Further such studies of macro- or micro-semantic convergence areas would be very welcome. Finally, explicit attention to the translocal role of languages of wider contact in South Asia — Indo-Aryan (Sanskrit), Persian, Urdu (in Pakistan), and now English — could yield more general insight into the nature of change in these lingua francas themselves as well as in the local and regional vernaculars with which they have interacted.


2.2. Overall South Asia
By Colin Masica

2.2.1. Convergence and linguistic areas

That languages in contact, whether related or unrelated, often influence one another structurally, that is, to varying degrees “converge”, is now widely acknowledged as an important fact of linguistic history (Thomason & Kaufman 1988, Haspelmath et al. 2005: 1–2, Starostin 2011). Failure to take this into account may not only result in an incomplete picture of the history of a language, but also obscure and confuse its genetic relationships. The term LINGUISTIC AREA is usually reserved for a situation involving multiple shared traits and several languages of different “stocks”. Such shared traits may be conservative as well as innovational; that is, areal reinforcement may be a factor in preserving features in a language that are lost in its genetic relatives. More striking, however, is a situation where a language acquires features alien to its genetic relatives and to its own earlier forms, but characteristic of its neighbors — often new neighbors — either because the neighbors are intrusive or the language has moved to a new area. Languages participate in such areal convergences to varying degrees. Those at the center of a convergence zone typically share a larger number of traits; those at the margins fewer. Although linguistic areas may often seem to follow geographic configurations, it is important, when identifying and defining a linguistic area, not to base it on these, but rather on the distribution of the traits themselves. These distributions will generally not coincide, but if they cluster or especially if they form a CONCENTRIC PATTERN we have a “linguistic area” (sprachbund). Margins of an area typically show certain special characteristics: weakening or absence of some of the convergent features, or mixed phenomena (such as both prepositions and postpositions); and thus statistical frequency becomes a consideration. These “transition zones” themselves help define an area. 
Unlike genetic groupings, linguistic areas are non-unique and sometimes overlapping. That is, a language, typically one on the margins of an area, may have features of more than one area of convergence. A good example is Persian, in some respects connected to the South Asian/Central Asian area, in others to the Middle East/Balkan area (Heston 1980, 1981, 1983). Languages participating in a convergence area do not become typologically identical. Although at linguistic boundaries, often not a line but a mixed zone of much multilingualism, this may sometimes seem almost to be the case, more typically languages, especially unrelated ones, retain some typological differences and features of their genetic inheritance.
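
The reasoning above, that core languages share more convergent traits than marginal ones and that trait distributions may nest concentrically, can be illustrated with a small sketch. The language labels and trait codings below are invented placeholders, not data from any survey, and the function names are hypothetical; the point is only to show how trait counts and nested distributions might be checked.

```python
# Toy trait matrix: labels and values are hypothetical, for illustration only.
TRAITS = {
    "core_A":   {"SOV", "postpositions", "retroflexes", "experiencer_subjects"},
    "core_B":   {"SOV", "postpositions", "retroflexes", "experiencer_subjects"},
    "inner_C":  {"SOV", "postpositions", "retroflexes"},
    "margin_D": {"SOV", "postpositions"},
    "margin_E": {"SOV"},
}

def trait_counts(traits):
    """Return (language, number of traits) pairs, highest count first."""
    return sorted(((lang, len(ts)) for lang, ts in traits.items()),
                  key=lambda pair: (-pair[1], pair[0]))

def isogloss(traits, trait):
    """Return the set of languages that show the given trait."""
    return {lang for lang, ts in traits.items() if trait in ts}

def is_concentric(traits, ordered_traits):
    """True if each trait's distribution lies inside the previous one's."""
    zones = [isogloss(traits, t) for t in ordered_traits]
    return all(inner <= outer for outer, inner in zip(zones, zones[1:]))
```

Ranking languages with `trait_counts(TRAITS)` puts the hypothetical core languages first and the marginal ones last, and `is_concentric` tests whether the distributions, ordered from most to least widespread, form nested rings. Real distributions, as the text notes, will generally not coincide so neatly.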

2.2.2. South Asia as a linguistic area

The greater South Asian subcontinent eminently qualifies as a linguistic area according to the above criteria. It also exemplifies the stated caveats. Here we have representatives of at least⁵ six distinct linguistic stocks (Dravidian, Munda, Tibeto-Burman, Indo-European [Indo-Aryan, Iranian, and Nuristani], and the isolates Burushaski and Kusunda) sharing a well-defined space, and influencing one another in various ways, sometimes over millennia. All of them except Dravidian, Burushaski, and Kusunda have generally-recognized relatives outside the area that do not participate in the South Asian typological configurations. Even in the case of Indo-Aryan, there is Romani, now heavily “Balkanized” or “East Europeanized” typologically in syntax, phonology, and grammatical categories lost or acquired (Matras 2002).

Following initial observations that unrelated languages in a particular area share traits with one another that they do not share with relatives outside the area, the next step is to make a preliminary list of such traits. For South Asia such a list might include the following:

a. The whole complex of what has been called “left-branching (morpho)syntax”, modifier before modified, including:
- SOV order of basic elements in a clause
- AdjN, DemN, GenN order in the noun phrase
- Postpositions rather than prepositions
- Suffixes rather than prefixes
- Preposed subordinate clauses, often employing nonfinite verb forms, such as conjunctive participles (“absolutives”)

b. Phonological contrasts:
- Retroflex vs. dental apical consonants
- Aspirated vs. non-aspirated consonants
- Nasalized vs. non-nasalized vowels

c. Grammatical and semantic categories:
- Morphologically-marked CAUSATIVE VERBS
- Differential case-marking of Direct Objects, according to definiteness, animacy, etc.
- Marked (usually Dative) EXPERIENCER SUBJECTS (Klaiman 1986)
- INCLUSIVE/EXCLUSIVE distinction in 1st person plural pronouns
- ERGATIVE, QUASI-ERGATIVE, OR RESIDUAL ERGATIVE construction (in its fullest form entailing special case-marking of agents of transitive verbs, identity of case-marking of Subjects of intransitive verbs and Objects of transitive verbs, and agreement of transitive verbs with Objects; in the “split ergative” version, these features apply only to certain tense-aspect forms)
- Quasi-grammaticalized (to varying degrees) SPECIFICATION OF VERBS for orientation or manner of action by compounding a non-finite form of main verbs (often the conjunctive participle / “absolutive”) with a limited set of secondary finite verbs (“vectors”, “explicators”), which are generally common verbs emptied somewhat of their lexical content, rather than by morphological devices (cf. Arabic) or by the different verbs used to convey such meanings in other languages (if they are conveyed at all)
- Specification of nouns by NUMERAL “CLASSIFIERS”

⁵ Whether to include the Andaman languages in the area presents a problem. While they do share some “South Asian” traits (SOV, retroflexes), it is difficult to relate this to “contact”. It is possible to imagine a scenario in which they represent a remnant of a very ancient substratum once found on the mainland. (The Nicobarese languages are not typologically South Asian.)

Let us assume that cross-genetic convergences involving these traits have been shown to exist in the area. (This list is not exhaustive, only representative.) That is not enough. The third and crucial step is to trace the distribution of each trait to determine whether it more or less coincides with the South Asian subcontinent as such, including all its genetic groups, or defines an area smaller or larger than, or different from that. Considering them in the order presented above:

a. “Left-branching syntax”, which seems at first to be a prime characteristic of the area, does not hold up as a South Asia-defining criterion.
(1) Primarily, the problem is that its lynchpin, the SOV order of basic elements of a clause, turns out to define a much larger area, including most of Central, Northern, and Northeastern Asia as well, although the South Asian sub-area is still sharply set off in this regard from its immediate neighbors to the east and west, that is, from Southeast Asia and the Arab Middle East. A few languages in South Asia itself are either exceptions to the SOV norm, e.g. Khasi in northeastern India, or show various types of departures from it, e.g. Kashmiri, Sinhala. This need not detain us long: such “incomplete coverage” and attenuation, especially on the margins, is typical of all linguistic areas.

(2) Second, contrary to early idealizing formulations of word order typologies (Greenberg 1966, Lehmann 1978), subsequent work (Dryer 1988, 1992) has shown that the various components of the left-branching syntactic bundle, which may appear to be interdependent, are in fact independent variables to varying degrees; their co-occurrence is areal. The order of noun phrase elements in particular is independent of that of clausal elements; that is, with SOV, right-branching orders (Modified + Modifier, e.g. NAdj) are as common as left-branching (AdjN) ones. Some languages more peripherally involved in the “greater” South Asian convergence, e.g. Tibeto-Burman and Persian (but not Pashto), not surprisingly part company with it here. This is not a problem: it makes the sharing of left-branching at both clausal and phrasal levels by the other participating languages (including Pashto) all the more salient as an area definer, since it is not automatic. However, the complementary “Altaic” lobe of the larger SOV area also shares this feature (AdjN).

(3) Even the correlation of postpositions with SOV order begins to unravel at the margins of SOV areas, Persian being a prime example of a marginal SOV language with PREPOSITIONS (as well as NAdj and NGen order). Significantly, there also turn out to be TRANSITIONAL ZONES between the two, with both postpositions and prepositions, sometimes simultaneously (AMBIPOSITIONS), e.g. in Pashto (see Stilo 1987[2006], 2004; Heston 1987). Both Tibeto-Burman and “Altaic” are solidly postpositional.

(4) Exclusive suffixation also is a feature shared by most South Asian and “Altaic” languages, but ancestral prefixation persists in South Asia in Munda and Tibeto-Burman languages, and Burushaski (and in the tatsama or Sanskrit-derived layer of both Indo-Aryan and Dravidian lexicons, as well as the Persian-derived layer of some Indo-Aryan lexical derivation).

(5) Exclusively preposed subordinate clauses, although a feature of “Altaic”, are in South Asia a property only of Dravidian (exception: Iranian-influenced Brahui), Munda, most Tibeto-Burman languages, and apparently Burushaski. They are not a property of Indo-Aryan, which is mixed (though preposing is, appropriately, dominant in Marathi, and important in Kalasha and Khowar; see Bashir 1988), or of Iranian, which postposes such clauses.

b. Testing the phonological criteria, we again find mixed results.

(1) Only the RETROFLEXION contrast is pan-South Asian, affecting all genetic groups (i.e. Indo-Aryan, Dravidian, Munda, western Tibeto-Burman, eastern Iranian, Nuristani, Burushaski, Kusunda), although this consensus, most intense in the northwest (Southworth 1974), breaks down on the eastern margins, excluding eastern Tibeto-Burman and Indo-Aryan Assamese. It goes back to Proto-Dravidian, and probably Proto-Burushaski, but was acquired at different times by (?) Munda (a phonetically retroflex /ḍ/ is commonly reconstructed for Proto-Munda, but not a series), Indo-Aryan, Nuristani, and those Tibeto-Burman and Iranian languages that now have it.

(2) The ASPIRATION contrast is shared by Indo-Aryan, Burushaski, some Munda languages, and Tibeto-Burman; but excludes Dravidian, Eastern Iranian, and other Munda languages; and is lost in Sinhala. It extends beyond South Asia to Chinese and Thai, but not to “Altaic”.

(3) Although South Asia contrasts with neighboring regions (including Central/Northern Asia) by the presence of nasalized vowels, these are not pan-South Asian, but are found mainly in northern Indo-Aryan (not in Sinhala, Marathi, apparently Gujarati; or in Kashmiri or Shina or Khowar) and cross-genetically, in modern Tibetan, Newari, Burushaski, and some Munda languages. They are not found in Dravidian, although they are emergent in Tamil allophonically.


(4) In contrast with the syntactic features, none of the phonological features just discussed that characterize either most or a large part of South Asia are shared with the “Altaic” area, which has its own characteristics (Jakobson 1931). Phonology alone therefore could serve to distinguish these two lobes of the larger SOV area.

c. Among characteristic grammatical categories occurring cross-genetically in South Asia, both (1) MORPHOLOGICAL CAUSATIVES and (2) ACCUSATIVE CASE-MARKING OF DEFINITE OBJECTS are also characteristic of the “Altaic” area (in fact, more so).

(3) On the other hand, MARKED EXPERIENCER SUBJECTS are not.

(4) The INCLUSIVE/EXCLUSIVE distinction in 1st person plural pronouns is shared by a number of South Asian languages of different stocks (Dravidian, Munda, some neighboring Indo-Aryan languages) but is not pan-South Asian.

(5) Various stages of development or decay of ERGATIVITY are found in Burushaski, Indo-Aryan, Himalayan Tibeto-Burman, and Iranian, on into the languages of the Caucasus, but not in Dravidian, “Altaic”, or Austroasiatic. (For a fuller discussion see Masica 2001: 248–250).

(6) Of the two “lexico-grammatical categories” we have chosen to focus on here, SPECIFICATION OF VERBS BY VERBS (“explicator compounds”) is shared in specific form and function by Indo-Aryan, Dravidian and Munda, and also by a number of Central and Northeast Asian languages. This feature is quite old in Turkic (Orkhon inscriptions, 8th century AD; Liang & Hook 2001). Unknown in Sanskrit, it is however found in Pali (Hook 1974, 2001), and now seems most highly developed in Hindi and Panjabi. Its history in Dravidian needs clarification. Its point of origin (if not multiple) is unknown.⁶

(7) SPECIFICATION OF NOUNS BY CLASSIFIERS is clearly a Southeast Asian areal phenomenon, though extending to some eastern South Asian languages.
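
The third step, comparing each trait's distribution with the subcontinent as a whole, can be pictured as a set comparison. The sketch below is a rough illustration only: the stock-level codings are a coarse simplification of the survey above (qualifiers such as "some", "western", or "Himalayan" are flattened to simple presence), the labels for areas outside South Asia are placeholders, and the function name `classify` and its category names are hypothetical.

```python
# Stocks represented in South Asia, as listed in the text.
SOUTH_ASIA = frozenset({
    "Indo-Aryan", "Iranian", "Nuristani", "Dravidian",
    "Munda", "Tibeto-Burman", "Burushaski", "Kusunda",
})

# Simplified codings (assumption: partial attestation counts as presence).
DISTRIBUTION = {
    "retroflexion": set(SOUTH_ASIA),
    "SOV order": set(SOUTH_ASIA) | {"Altaic"},
    "ergativity": {"Burushaski", "Indo-Aryan", "Tibeto-Burman",
                   "Iranian", "Caucasus"},
    "aspiration": {"Indo-Aryan", "Burushaski", "Munda",
                   "Tibeto-Burman", "Chinese", "Thai"},
    "inclusive/exclusive": {"Dravidian", "Munda", "Indo-Aryan"},
}

def classify(trait_area, region=SOUTH_ASIA):
    """Compare a trait's distribution with the region it might define."""
    inside = trait_area & region    # stocks within the region showing the trait
    outside = trait_area - region   # groups beyond the region showing it
    if inside == region:
        return "wider area" if outside else "coextensive"
    return "links to adjacent areas" if outside else "sub-area"

for trait, area in DISTRIBUTION.items():
    print(f"{trait}: {classify(area)}")
```

Only a trait classified as coextensive would count toward defining South Asia as such; the other outcomes correspond to traits shared with a larger area, traits linking part of the area to its neighbors, and traits confined to a smaller area within it.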
In summary, of the cross-genetic convergences surveyed above, only two, namely the presence of a retroflex stop series and a marked category of Experiencer Subjects, define distributional parameters that coincide, more or less, with the South Asian subcontinent.⁷ The others are either shared with a wider (“Indo-Turanian”) area (SOV-AdjN-Postpositional syntax, morphological causatives, explicator compound verbs, definite objects), or link parts of South Asia with other adjacent areas (nominal classifiers, aspiration, ergativity), or define smaller areas within it (inclusive/exclusive pronouns, nasalized vowels).

⁶ Although many of the “Resultative Verb” constructions of Chinese and mainland Southeast Asian languages seem rather to be compressed sentences, with the Object of the first verb acting as Subject of the second verb, there is a subset of “directional” compounds, with intransitive verbs only, where the second verb can be seen as specifying the action denoted by the first, therefore marginally participating in this areal pattern. I thank Peter Hook for persuading me regarding these points, although he may still not agree with my modified conclusion. (See Liang & Hook 2001.)
⁷ Features not discussed here for lack of space include one of the earliest to be noted, the postposed quotative marker (Bloch 1934, Kuiper 1968a), and one of the most recent, the “subset of [idiomatic] complex predicates” involving the operator EAT + a “contained noun” (Pardeshi & Hook 2006). Both are “Altaic” as well as South Asian.

2.2.3. Processes of convergence and directions of influence

“Definition” of a South Asian area is not the only point. With its definite core, attenuation on some margins, even its overlap with other areas, South Asia is a typical linguistic area. In such a large area with a deep and complex history, no single influence or process is responsible for convergence, nor has it been all in one direction. There have been different influences, in different directions, at different times. Because it is well-defined geographically, with great genetic diversity, it constitutes an excellent laboratory for the study of convergence phenomena, whether pan-subcontinental or not. Among the processes that have been adduced as operative are: (1) bilingualism (Emeneau 1962b), favoring parallelism of structure for ease of processing; (2) language “shift” with incomplete learning of the new language (Thomason & Kaufman 1991); (3) reinterpretation of material in one language in terms of another language (e.g. animate/personal object-marking can become definite object-marking); (4) different internal dynamics in different linguistic stocks producing similar results from varied antecedents (cf. the development of retroflexion in Indo-Aryan and Tibeto-Burman; or the form of Dative suffixes in Dravidian and Indo-Aryan; or the development of ergativity in Indo-Iranian and Tibeto-Burman). “Natural” processes working on similar antecedents elsewhere, where areal models of retroflexion, ergativity, etc. were not present, did not produce the same results.

South Asia provides good material to test the alleged implicational universal (Comrie 1981) that structural borrowing implies lexical borrowing. Language shift, such as plausibly happened in much of the present Indo-Aryan area, need not entail much lexical borrowing, even while affecting structure. Meanwhile, even massive lexical borrowing does not necessarily entail structural borrowing: both Dravidian and Southeast Asian languages are full of Sanskrit loanwords, but their structure has been essentially unaffected. On the other hand, loanwords are an important factor in the establishment of a retroflex series in both Indo-Aryan and Tibeto-Burman. The use of Persian in administration in parts of South Asia for a thousand years seems to have had both lexical and structural effects (e.g. in the positioning and marking of quotative clauses, and in Definite Object-Marking), at least on northern Indo-Aryan.


Hans Henrich Hock

There seems to have been influence of Dravidian on Indo-Aryan, probably of Burushaski on Indo-Aryan and Tibeto-Burman (at least locally), of Indo-Aryan on Tibeto-Burman, Munda, and Dravidian, of Iranian (Baluchi) on Dravidian (Brahui) (see 2.3.1.2, 2.4, and 2.6), and of Persian (and possibly Turkic) on Indo-Aryan. Whatever influences there may have been in early periods (see Witzel 1999a for the Vedic period; also 2.3), Munda seems an unlikely source for the later patterns of South Asian convergence. In almost every respect in which it agrees typologically with its South Asian neighbors, Munda has parted company with its Austroasiatic relatives in Southeast Asia. Characteristics that it has retained or developed, such as prefixing and incorporating structures, have not been diffused.

2.3. Ancient contact, convergence, substratum influence
By Hans Henrich Hock and Franklin C. Southworth

2.3.1. Introduction
By Hans Henrich Hock

Since Pott (1833, 1836) it has been noted that Sanskrit (Old Indo-Aryan) exhibits features not shared by other early Indo-European languages but found in most South Asian languages. The feature recognized earliest is the contrast dental : retroflex. Other structural features, besides lexical and geographical evidence, were added later, especially by Emeneau (1956, 1980b) and Kuiper (1968a, b), both with earlier references. To date, the majority view considers early contact with Dravidian responsible for these features, assuming that speakers of Dravidian, forced by invading Indo-Aryans to adopt their language, changed that language. Some alternative sources for the features have been proposed, such as Munda (Kuiper 1948) or “Para-Munda” (Witzel 1999c: 10), and Burushaski or other languages of the northwest (Tikkanen 1987, 1988). Moreover, some have questioned the cogency of the “pro-Dravidian” arguments (e.g. Hock 1975, 2015). Assessing the relative merits of these proposals is made difficult by the uneven chronological attestations of the South Asian languages. Sanskrit, the earliest Indo-Aryan, is attested since at least 3500 BP;8 the earliest attested Dravidian, Tamil, may go back to about 2100 BP, with the other literary languages considerably later; the earliest attestation of Tibeto-Burman, Classical Tibetan, is yet later (ca. 1400 BP); and the other languages, including the Dravidian Tribal languages, all of Austro-Asiatic/Munda, and isolates such as Burushaski began to be recorded only some 150 years ago (Hock 2000).

8 For optimal comparison, I give dates in terms of Before Present (BP).

Contact and convergence


The following sections attempt to summarize the major claims and counterclaims for the prehistoric period and present a brief outline of possible contact developments in the post-Vedic period. Section 2.3.2 (by FCS) addresses lexical evidence for prehistoric contact and migrations; Section 2.3.3 (by HHH) discusses structural features and geographical evidence; 2.3.4 (by HHH) deals with possible contact developments in the post-Vedic period.

At the outset, it must be noted that all the views discussed here assume that prehistoric contact arose as a consequence of an Indo-Aryan in-migration, which is generally dated to around 3500 BP.9 This assumption, sometimes called the “Aryan Invasion Theory”, has been rejected by various Indian and Hindu nationalists. Many question the methodology of comparative-historical linguistics, without offering an alternative methodology (e.g. Misra 1992; Rajaram 1995). See Hock 1999, 2000 for discussion of early proposals. Talageri (2008) uses linguistic arguments to show that early Indo-European dialectology can be explained through migrations out of India, rather than into India (see Hock 1999). Talageri’s data are secondary, taken from Gamkrelidze & Ivanov (1995), who postulated the Indo-European homeland near the Caucasus;10 Talageri fails to show how the same data establish Indian origin instead. Moreover, to account for the issue of Indo-European dialectology, Talageri postulates a set of six hypothetical stages of migrations and reconfigurations, a violation of Occam’s Razor if compared to Hock’s simpler scenario of Indo-European dispersal from a central area in the steppes of southeastern Europe and Central Asia.

9 Some scholars assume a much earlier Indo-Aryan presence in South Asia (ca. 5500 BP), based on cultural, rather than linguistic considerations (e.g. Bellwood 2009, adopting Renfrew’s controversial hypothesis [1987] that Indo-European migrations spread farming from Anatolia). This assumption conflicts with the chronology suggested by the Indo-Iranian evidence for horse-and-chariot culture (e.g. Ved. rathe-ṣṭhā : Avest. raθaē-štā- ‘warrior’, lit. ‘standing on the chariot’). The horse-and-chariot culture complex in Sintashta, including horse burials, cannot be dated earlier than ca. 4000 BP (Anthony 2007), and the earliest horse burial in South Asia (Swat Valley) is from ca. 3700 BP (Kennedy 1995, Kenoyer 1995). Earlier migrations, therefore, are not likely to have been Indo-Aryan; and migrations by other, pre-Indo-Aryan Indo-Europeans must remain speculative, given the absence of verifiable linguistic evidence.

10 Talageri fails to appreciate that some of Gamkrelidze & Ivanov’s claims are problematic, such as an alleged special relationship between Tocharian, Anatolian, Italic, and Celtic; see Ringe 1990, Winter 1997 for an accurate assessment.

2.3.2. Lexical evidence
By Franklin C. Southworth

2.3.2.1. General remarks

The identification of non-Indo-European lexical elements in Old Indo-Aryan (OIA) has been controversial from the early days of Indo-European (IE) studies. Kuiper (1991b: 2–4) notes the reluctance of some Sanskritists to accept such elements, while some Indo-Europeanists saw signs of foreign linguistic influence even in the earliest Rigvedic hymns. (See also Das 1995, Kuiper 1995.) It is also true that early attempts to find etymologies for these words often led to premature guesses. In assessing these issues, the data must take precedence over any a priori assumptions about the probability or improbability of contact between particular languages, however difficult that may be.

Investigations of foreign elements in OIA have focused mainly on Austro-Asiatic and Dravidian, though attention has also been given recently to several other languages, including some of unknown genetic affiliation (such as Burushaski and Nahali/Nihali) and others whose existence is inferred from the evidence of the Indo-Iranian languages (such as the “Indus” languages, Proto-Bhili, and a presumed Central Asian substrate language). A number of possible sources have been neglected, either from lack of scholarly interest or lack of information; this is especially true of the Sino-Tibetan languages.

Witzel 1999c is a detailed discussion of the languages that served as sources of borrowings in OIA, with primary focus on the Rig Veda (RV). He notes that the RV contains some 383 words (‘roughly 4 % of its hieratic vocabulary’) of non-IA, non-IE origin, as shown by their phonological and morphological structure. Noting that many of the non-IE words in the RV have prefixes which are ‘close to, and in part even identical with those of Proto-Munda’, and following Kuiper (1962: 51, 102; cf. Zide 1996), Witzel uses “Para-Munda” and “Para-Austroasiatic” to designate the substrate assumed to have provided these words, suggesting the possibility of as yet unknown branches of Munda.
He distinguishes four Indus Valley substrates:
(1) a pre-Rigvedic Para-Munda substrate in the Panjab, ‘… [with] some hints which point to Munda influence in the Himalayas (Konow 1905, Witzel 1993)’;
(2) a “Northern Indus language” containing words of unknown origin with NIA reflexes;
(3) a Southern Indus language, also referred to as “Meluhhan”, the source of some 40 words referring to Indus Valley products used in the trade with Mesopotamia and recorded in ancient Mesopotamian sources (1999a: 24–25);
(4) Dravidian words in the middle and late Vedic period (Section 2.3.2.2 below).

Beyond the Indus region, Witzel (1999c: 5) mentions a hypothetical Central Asian substrate (proposed by Lubotsky 2001 [= 1999]) as the source of a number of words shared by Iranian and Indo-Aryan such as (Vedic/Avestan) uṣṭra/uštra ‘camel’, as well as Proto-Burushaski in the northwest, Tibeto-Burman in the Himalayas and in Kosala (1999a: 43–45), and predecessors of remnant languages now found in isolated pockets: Kusunda in Central Nepal, pre-Tharu in S. Nepal/UP, Nahali/Nihali in Central India (1999a: 46–48), the Vedda (in Sri Lanka, see De Silva 1972), and the inferred pre-Nilgiri substrates (for South India, see Zvelebil 1990).

2.3.2.2. Dravidian

Dravidian lexical borrowings in OIA have been discussed by Bloch (1925), Burrow (1945, 1946, 1947, 1948, 1973: 380–387), Emeneau (e.g. 1954, 1969a; see also Emeneau 1980b: 355–369), Kuiper (1955, 1991b, 1995), Parpola & Janhunen (2011), and Southworth (1979, 2005), among others. Hock (1975) and Witzel (1999c) provide important critiques of much of this work. OIA loanwords in Dravidian languages are treated in Emeneau & Burrow 1962, and the Appendix to Burrow & Emeneau 1984 (pp. 509–514) contains a supplementary list. Turner 1966 and Burrow & Emeneau 1984 contain many cross-references regarding possible borrowings (Southworth, forthcoming). Mayrhofer’s two etymological dictionaries (1956–1976, 1986–2001) deal with the origins of words at all periods of OIA, with emphasis on the Vedic period; in a number of cases, words marked as “wohl Dravidisch” in Mayrhofer 1956–1976 were reclassified as “unklar” or the like in Mayrhofer 1986; see Southworth 2005: 72–83 for examples. Witzel, after examining all of the proposed Dravidian loanwords in the Rigveda, concludes that the only acceptable cases occur from the middle Rigvedic period onwards (1999c: 15–20).

Recent work on Dravidian, including the possible relationship between Dravidian and Elamite (McAlpin 1981), has questioned the prevailing view of Dravidian prehistory, e.g. as proposed in Krishnamurti 2003. The case for a North Dravidian subgroup consisting of Brahui, Kurux, and Malto has been shown by McAlpin (2003) to rest on inadequate evidence.
McAlpin (forthcoming) shows that Brahui is closer to Elamite than to Dravidian, and accounts for the overall relationship in terms of a Zagrosian language family, consisting of (1) Elamitic, with Brahui and Elamite branches, and (2) Dravidian, with North (Kurux-Malto) and Peninsular branches. (See also Southworth 2011, Southworth & McAlpin 2013.) Speakers of Proto-Dravidian, or later forms of Dravidian, coming from western Iran may have entered the Indus Valley as early as the mid-4th millennium BCE (Bellwood 2009).

A new examination of the words accepted by Witzel as Dravidian loans in the RV suggests that some may be the result of earlier contact between Dravidian and Indo-Aryan speakers, a possibility envisaged by Witzel (1999c: 23; see also Parpola 2001, 2002a, b, Parpola and Janhunen 2011) and supported by the presence of cognates of the presumed Dravidian words in the NIA languages of the high Himalayas. Southworth (2011) notes a number of words with cognates in these languages, such as CDIAL 9051 phala ‘fruit’, with cognates in three Nuristani (N) languages and seven Dardic (D) languages, and 3834 khala ‘threshing floor’11 (N1-D8), and points out that the numbers and distribution of Himalayan cognates for these words are similar to those found for words of PIE origin and that these words must therefore be old.12 Thus these words may be the result of language contact OUTSIDE the Indus Valley, in Iran or Central Asia. If the Nuristani languages have been separate from Indo-Aryan since 3900 BP (Blažek & Hegedűs 2010), then these contacts may have been pre-Vedic. The continued appearance of Dravidian loanwords in late Vedic and post-Vedic texts, as Witzel notes (1999c: 39–40), is evidence of the presence of Dravidian-speaking groups within the IA-dominated society; thus the absence of Dravidian words in the earliest RV may be accounted for by the assumption that the Vedic Aryans were not then in contact with Dravidian-speaking groups, but only with speakers of Para-Munda, “Indus”, and other as-yet-unidentified languages.

2.3.2.3. Munda and Austroasiatic

Austro-Asiatic (AA) (including Munda) words in OIA have been discussed by Lévi (1923), Przyluski (1926), Kuiper (1948, 1955, 1991b), Mayrhofer (1951, 1956–1976), and Witzel (e.g. 1999c). Several of these authors, as well as Hock (1975: § 3.1), have noted that the earliest foreign elements found in the RV are Austro-Asiatic in appearance. Munda languages are at present located mainly in eastern India, the westernmost language being Korku on the middle Narmada. However, Kuiper has long held that the earliest identifiable foreign words in the Rigveda are of Munda or AA origin, which would imply the presence of speakers of these languages in the Panjab as early as the second millennium BCE. Many of the proposed Munda/AA words are names of individuals, tribal groups, or geographical features.
Some early work on Munda borrowings in OIA, for example that of Przyluski, was difficult to evaluate because of the many assumptions, both linguistic and non-linguistic, required to make the derivations plausible. Osada (2009) has criticized much of Kuiper’s and Przyluski’s work. In addition, some of Kuiper’s Munda interpretations have been challenged on the grounds that they may be Dravidian compounds (Krishnamurti 2003: 37–38, Gurov 2000).

11 Proto-Dravidian *qaḷ-am ‘threshing ground’ is derivable from Proto-Zagrosian *qal ‘field’: cf. Proto-Elamitic *xal, Elamite hal ‘land’.

12 Words inherited from PIE commonly show cognates in Nuristani and Dardic, though the numbers vary. At the high end are found words like 7655 pañca ‘5’ (N4-D17, out of a possible total of N6-D21), while words like 10016 mātṛ ‘mother’ (N0-D4) and 12357 śaśa ‘hare’ (N1-D3) are found at the lower end.


Witzel (1999c: 11–13) presents proposed Para-Munda etymologies of Rigvedic words, including personal names, river names, and common nouns. In a later section (pp. 36–40) he deals with the Para-Munda substrate in the post-Rigvedic period, when words of Munda/AA extraction continue to appear in texts. He summarizes (p. 38): ‘… a strong Austro-Asiatic substrate is found both in the early Panjab (RV, ca. 1500 BC) as well as later on in the Ganges valley (YV Samhitas, Brāhmaṇas, c. 1200–500 BC)’.

2.3.2.4. Inferred languages: “Indus”13

Southworth (2005: 67) suggested the “Indus” language(s) as a possible source for a number of words which are attested at early levels in both Indo-Aryan and Dravidian languages, without clear signs of provenance. Subsequently, in examining the reconstructed (“Proto-NIA” or “desi”) words in the CDIAL, Southworth (2006) noted that many of these words show wide distribution in the NIA languages of the plains, generally with little representation in the Himalayan languages.14 He proposed a working hypothesis that at least some of these words (those occurring in all the submontane regions of NIA) may have originated in one or more languages of the Indus Valley during the time when Indo-Aryan-speaking groups passed through the region,15 and suggested referring to them as “Indus” words until proven otherwise. Noting that a number of the words in the first category mentioned above, many of which are attested in OIA, also show the requisite distribution in NIA, and thus may have the same origin (for example, CDIAL 7563 nīla ‘dark blue, dark green, black’, 2360 ulūkhala ‘mortar’, 268 āmra ‘mango tree/fruit’), he proposed that these may also be provisionally classed as “Indus” words. Many of these words are names of plants which grew in the Indus Valley region in Harappan times, and thus could logically be expected to have names in the local languages. This includes most of the agricultural words in Masica’s “language X” (Masica 1979).

13 The term “Indus language” was used in Southworth 2006 to refer to this body of data for the reason given below. However, this term collides with Witzel’s prior use of the same term to refer to an “Indus language of the Panjab” (1999c: 10) and a “southern Indus language: Meluhhan” (1999c: 28–34). Since Witzel also suggested (1999c: 13) that Masica’s “language X” may belong to an older level, it may be appropriate to use the term “Pre-Indus” (with capital P) for these words, to distinguish them from Witzel’s “Indus”. This will be done in future publications.

14 In cases of words with more than a few cognates in Dardic and/or Nuristani languages, it may be necessary to assume origin outside of South Asia proper. See 2.3.2.1 above.

15 The simplest explanation is that these words entered IA from a language of the Indus Valley, since (it is assumed here that) the ancestors of all NIA languages passed through the region, some moving eastward, from Panjab into northern India, or from Sindh into Central India, the Deccan, and the east (Southworth 2005, chapters 5–6).


It is difficult of course to interpret these results, as these words did not all necessarily come originally from the same language, or even the same language family. A full study of these words, which should include all known cases of words attested in OIA with “Indus” distribution, could throw considerable light on the dynamics of prehistoric language contact in the subcontinent.16 Other inferred languages have been discussed in Section 2.3.2.1 above.

2.3.3. Structural features and geographical evidence
By Hans Henrich Hock

This section examines the major structural features considered to reflect prehistoric (and early historic) contact between Sanskrit/OIA and other South Asian languages, as well as geographical arguments believed to favor contact with Dravidian.

2.3.3.1. Structural features

Four features are commonly listed as characteristic of the South Asian Convergence Area.17 These are:
I. A phonological contrast between dental and retroflex, as in (1)
II. An unmarked major constituent order Subject (S) – Object (O) – Verb (V) and the order Main Verb (MV) before Auxiliary; (2)
III. A tendency to use non-finite converbs,18 where modern European languages tend to employ finite dependent clauses; (3)
IV. The marking of cited (direct) discourse by postposed quotative markers; (4)

(1) Skt.
    pāta- ‘flight’ : pāṭa- ‘portion’

(2) Hindi
    maiṁ  kitāb   paṛhtā   hūṁ
    I.S   book.O  read.MV  AUX
    ‘I read a book.’

(3) Skt.
    tatra  gatvā    na   muc-ya-se
    there  go.CONV  NEG  become.free-ITR-2SG.PRS
    ‘When you have gone there, you do not get free.’

(4) Skt.
    nakir   vaktā      [na   dād]       iti
    nobody  speak.FUT  NEG   give.SBJV  QUOT
    ‘Nobody will say, “He shall not give.”’

16 Note that Witzel (1999c) also uses the term “Indus language” for two different substrates of the Indus Valley (see 2.3.2.2 above). While there is substantial overlap between Southworth’s and Witzel’s use of the term, it may be advisable to keep them distinct.

17 Emeneau (1956) adds the use of Sanskrit api in functions parallel to those of Dravidian (*)um; but these functions appear relatively late. Another feature is lexical “reduplication” or iteration; see Abbi 1992. As Hock (1993) demonstrates, Sanskrit non-verbal iteration has parallels in other early Indo-European languages; only verbal iteration (as in utplutya + utplutya … ‘continually jumping up’) is innovated, appearing first in late Vedic. For the other South Asian languages, the historical facts are uncertain.

18 These forms are variously called “conjunctive participles”, “absolutives”, “gerunds”, etc., but all of these terms are potentially ambiguous. The term “converb” is employed in Altaic linguistics, is unambiguous, and has been introduced to general linguistics in Haspelmath & König 1995.

Although these features are highly characteristic of the South Asian area, there are exceptions, generally on the periphery. Thus, SOV order is absent in Ahom, Khasi, and Nicobarese, which instead have VO order. The contrast dental : retroflex is absent in Assamese and other Northeastern and Himalayan languages, including Ahom, Khasi, and much of Tibeto-Burman, plus Nicobarese, which have alveolar stops instead. For Ahom and Khasi/Nicobarese, the absence of both features is probably inherited from the linguistic ancestors. The Assamese absence of the dental : retroflex contrast is innovated, reflecting contact with neighboring Tibeto-Burman and probably also Ahom and Khasi.

All of the four features are present in the earliest stage of Indo-Aryan, Rigvedic Sanskrit; and each of them has been claimed to be innovated. Since Dravidian has all of the features, it is usually considered the source for their presence in Indo-Aryan. Moreover, feature II is commonly assumed to be weakly present in the Rigveda and to become stricter in late Vedic, reflecting continuing Dravidian influence (Emeneau 1956, 1980b; Kuiper 1968a, b; Thomason & Kaufman 1988).

Arguments against the “Dravidian hypothesis” are of several kinds. One view holds that the arguments for the hypothesis are not cogent: Feature I can be explained by internal developments in Indo-Aryan and thus does not require a contact explanation. Moreover, the generally assumed early Dravidian and Indo-Aryan systems differ considerably: Sanskrit has initial retroflex consonants, early Dravidian does not; Sanskrit has a retroflex sibilant ṣ, Dravidian does not; Dravidian has a retroflex approximant r̤,19 Sanskrit does not; Dravidian has a triple contrast dental : alveolar : retroflex, Sanskrit only has dental : retroflex. Feature II is inherited from Proto-Indo-European. The difference in “strictness” between Rigvedic and later Vedic-Prose SOV order reflects a genre difference (hymnal poetry vs. didactic prose), a difference persisting in Post-Vedic with minor differences (Hock 1997). The Dravidian hypothesis fails to account for the formal variation of Sanskrit converbs (feature III), depending on whether the verb is prefixed or not (e.g. kṛ-tvī ‘having done’ : vi-kṛ-t-ya ‘having changed’); Dravidian has no such variation, and no verbal prefixes. Finally, the combination of features II to IV can be argued to be characteristic of a subvariety of SOV typology that would favor the appearance or introduction of III and IV. See Hock 1996a, Southworth 2005: Chapter 3 for comprehensive surveys of the arguments.

19 An alternative transcription is ẓ, but the segment functions as retroflex rhotic.


Proponents of the Dravidian Hypothesis counter that even if internal developments may have set the stage for features I to IV — especially I — contact accelerated the developments (Emeneau 1980b, Thomason & Kaufman 1988). While in principle possible, this argument is not falsifiable and therefore methodologically problematic. A number of possible alternative candidates for substratum influence have been proposed, ranging from Burushaski to Nahali to even unknown languages. A good survey is found in Southworth 2005: Chapter 3; see also Tikkanen 1987, 1988, 1999. Based on lexical evidence, Southworth (2006) argues for an “Indus” language with a dental : retroflex stop contrast, found also in initial position (in contrast to early Dravidian). However, he provides no evidence for retroflex ṣ, which plays a crucial role in the development of Indo-Aryan retroflexion.20 The isolates Kusunda21 (Watters et al. 2006) and Andamanese22 (Abbi 2006) likewise offer an initial retroflex : dental contrast, and so does Burushaski (Berger 1998). Burushaski is especially interesting, because like Sanskrit and unlike Dravidian, it has a triple sibilant contrast s : ṣ : š. Given the importance of ṣ for Sanskrit retroflexion and the northwestern location of Burushaski, this may be significant, since Indo-Aryan speakers would have made first contact with languages of the Northwest. Tikkanen (1988, 1999), however, considers Burushaski influence questionable, since unlike Indo-Aryan and Dravidian it lacks a retroflex nasal. He concludes that ‘the pre-Aryan language(s) … at the time of the advent of the Indo-Aryans can HARDLY have been Dravidian, Burushaski or even Sino-Tibetan’ (1999: 147). A very different perspective argues that, instead of the traditional, unidirectional “substratum” approach, one should consider bi- or multidirectional “convergence”. 
The stop contrast dental : (alveolar :) retroflex is proposed to reflect convergent assimilatory developments in Dravidian, Indo-Aryan, and neighboring Iranian varieties (Avest. rt > ṣ̌), and possibly other languages too (Hock 1996b, 2015; see also Tikkanen 1987, Southworth 2005).

In spite of these various alternatives, the Dravidian hypothesis remains the majority view. The counterarguments, however, suggest a need for comprehensive reexamination of the issues. In this reexamination, the following points need to be considered. First, except for neighboring Iranian and Nuristani varieties (and to a more limited extent Tocharian; see Section 2.3.4), the contrast is not found in other early Indo-European languages. Second, the Indo-Aryan contrast dental : retroflex thus is clearly an innovation, whatever the precise scenario responsible. Third, the contrast exists in almost all of South Asia, from the Andaman languages to Burushaski and beyond. Its presence in early Indo-Aryan therefore is not likely to be accidental. Fourth, the presence of retroflexion in the Andaman languages raises important questions that so far have not been addressed about the prehistory of the feature, given that the Andaman languages appear to have been isolated from the mainland for centuries, if not millennia.

At the same time, as noted in 2.3.1, the chronologically uneven attestation of the South Asian languages creates major difficulties. Most obviously this is true for any attempts at attributing features I to IV to Burushaski: there is no way of determining the structure of Burushaski some 3500 BP. Even for Dravidian, a considerable gap exists between its earliest attestations (ca. 2000 BP) and the time that Sanskrit/Indo-Aryan developed the dental : retroflex contrast (ca. 3500 BP).23

20 In his most recent publication (Southworth & McAlpin 2013), Southworth returns to the hypothesis of early Dravidian contact (in the northwest); but the arguments are largely based on lexical evidence. (See 2.3.2.2 above.)

21 Kusunda also has alveolar stops. But according to Watters et al. (2006), the distribution of dental : alveolar : retroflex seems to be allophonic.

22 According to Abbi, there are two Andaman language groups (see 1.10.1 above).

2.3.3.2. Geographical evidence24

A common argument for the Dravidian hypothesis, in addition to structure and lexicon, has been the geographical position of Brahui, Kuṛux, and Malto, generally referred to as the North Dravidian branch of the family.25 The northern location of these languages and their separation from the rest of Dravidian has been argued to constitute evidence for a prehistoric presence of Dravidian languages throughout the north; especially important is Brahui in the northwest, close to the presumed area of first Indo-Aryan contact in South Asia. As noted by various scholars (e.g.
Grierson 1906: 406, Hahn 1911, Vesper 1971), the northern location of Kuṛux and Malto appears to be secondary; according to their own tradition, the Kurukh and Malto, close linguistic relatives of Brahui, migrated to their present locations, via the Narmada valley, from a much more southern area in Karnataka. Kuṛux and Malto, therefore, do not provide conclusive evidence for a prehistoric Dravidian presence in the north (Hock 1996a). The speakers of Brahui, too, believe that they came to their present location from outside; but their claim that they came from Aleppo, Syria (Bloch 1911) must be fanciful; moreover, the same claim is made by the Balochi (Jahani & Korn 2009). Elfenbein (1998) assumes that the Brahui migrated from the Central Deccan ca. 800–1000 AD. Kieffer (1989) claims Brahui presence in the northwest as early as 2500 to 2000 BC and extensive influence on Nuristani, Dardic, and the Pamir languages. Southworth and McAlpin (2013) deny that Brahui is Dravidian and classify it as closer to Elamite. (See also Kolichala, 1.6.1.3, this volume.) For Brahui, thus, the issue of migration or indigenousness so far remains unresolved.

23 Some evidence suggests a different Proto-Dravidian structure from the agglutinative one of early attested Dravidian. For instance, Old Tamil preserves evidence for “portmanteau” verbal endings (e.g. var-um ‘come [3rd person]’) that are being replaced by the familiar pronominal ones (e.g. varu-v-āṉ ‘he comes’, varu-v-ār ‘they come’), leading to transitional blends such as aṅk-un-tu ‘they move around’ = aṅk- + portmanteau -um + pronominal 3 sg. n. -tu (Lehmann 1994, Subrahmanyam 2008; more generally Murugaiyan & Pilot-Raichoor 2004).

24 This section addresses only geographical arguments for the Dravidian hypothesis. Other geographical issues are addressed in 2.3.2 above.

25 But see McAlpin 2003 and forthcoming.

2.3.4. Post-Vedic contact linguistics
By Hans Henrich Hock

This section examines three issues that can be considered Post-Vedic, involving the Prakrit stage of Indo-Aryan. Two of these mainly concern peninsular South Asia, the third pertains to the Northwest. Note that the term “Post-Vedic” is problematic because of the well-known existence of “Prakritic” phenomena even in the Rig Veda (see Section 1.3.1.3.1). Still, the developments responsible for the phenomena must postdate the linguistic stage presented by “standard” Vedic.

2.3.4.1. The “Two-Mora Conspiracy”

The early Prakrit reduction of trimoraic CV̄CC to bimoraic CV̆CC or CV̄C (“Two-Mora Conspiracy”) is well known; see e.g. Hock 1986: 159–161 with references. Krishnamurti (1991) shows that similar developments are found in Dravidian and claims that the Prakrit development reflects Dravidian substratum influence. As Krishnamurti acknowledges, the change did not take place in (north)western Indo-Aryan, which retained length distinctions before consonant clusters, as in Skt. rūpya(ka) > Panj. rūppā ‘rupee’, āṇḍa(ka) > āṇḍā ‘egg’ (Turner 1967). Tamil and Malayalam likewise maintain long vowel + CC, as in Tam. kēḷ- (kēṭp-, kēṭṭ-) ‘hear, ask’, cāṯṯu- ‘publish, announce’, tūṅku- ‘hang, sleep’.26 Moreover, in Prakrit varieties that have the change, the Two-Mora Conspiracy affects all clusters; in much of Dravidian vowels remained long before nasal + stop, as in Kan. kāṅke ‘heat’, Konda tōṇṭa ‘garden’, Kolami mūndin ‘three things’. The higher degree of application in Indo-Aryan varieties might be taken to indicate Indo-Aryan, not Dravidian origin of the change; but degree of application is a weak criterion. So the directionality of the change may not be resolvable. More important is the fact that the change does not take place on the southern and (north)western periphery, but is shared only by those languages which historically are in more intensive bilingual contact. This opens the possibility of CONVERGENT developments, rather than unidirectional substratum change.

26 Dravidian data from Subrahmanyam 2008.


2.3.4.2. rt-clusters and alveolars

Dravidian and Indo-Aryan exhibit a remarkably similar geographical distribution in the outcomes of geminate alveolar stops and rt-clusters, respectively; see Map 2.1. Hock (1996a, b) accounts for the similarity by proposing that OIA rt changed to ṯṯ, a change convergent with the development that introduced Dravidian ṯṯ, and that in Dravidian (except in extreme southern languages) and Indo-Aryan the resulting ṯṯ developed to ṭṭ in the east and extreme northwest, and to tt in the intermediate area. Hock acknowledges the problem posed by the considerable difference in chronology between Indo-Aryan and Dravidian (possibly some 800 years). However, if the account were rejected, the geographical alignments in Map 2.1 would have to be considered accidental. This, in turn, would raise significant questions about any contact accounts based on the geographical distribution of features, including the issue of South Asia and Indo-Aryan retroflexion.

Map 2.1. Development of rt in the Aśokan inscriptions and Modern Indo-Aryan (Turner 1921, 1924) and Dravidian development of (geminate) alveolar stops (from Hock 1996a, b) (Triangles refer to major Aśokan sites; shaded areas to Dravidian).

Hans Henrich Hock

2.3.4.3. Triplets of sibilants (± affricates) in the Northwest and beyond

As noted in Section 2.3.3.1, Burushaski and (Vedic) Sanskrit share the feature of a triple contrast s : ṣ : ś. This contrast is widespread in the modern northwestern languages, whether Indo-Aryan/Dardic, Nuristani, Iranian, or Burushaski. Moreover, within the larger triple-sibilant area there is a smaller (core?) area with a corresponding affricate contrast ċ : c̣ : č; see Figure 2.1, where the sibilant distribution is indicated by shading and the triple-affricate contrast by heavy borders.27 Note that at an earlier time the triple-sibilant contrast extended farther north; Shughni, Yazghulami, and Sariqoli changed ṣ to x. Interestingly, during the Middle Indo-Aryan and Iranian period, Saka (Tarim area, near present-day Sariqoli) displays the same triple contrasts, and Gāndhārī has been argued to do so too (Emmerick 1989, 2009, Brough 1962). Moreover, the triple sibilant contrast is also found in Tocharian, but for the affricates we only find ċ : č (Ringe 1996). The wide geographical distribution during this period certainly raises questions about the prehistory of these features, and various substratum explanations; see Section 1.2.4. The fact that Avestan, too, has a rich sibilant system (s : š : š́ : ṣ̌) and that Saka and modern Wakhi share with Sanskrit a palatal reflex in PIIr. *ḱw > Saka śś, Wakhi š, Skt. śv (vs. a dental outcome in the rest of Iranian) raises the possibility of a dialectal-spread account within Indo-Iranian (see 1.2.4). In that case, the time frame for the developments in question might have to be moved back to PRE-Vedic times.
A definitive resolution of these different possibilities does not seem possible at this time, especially because of the lack of contemporary attestations of Burushaski and other possible contact languages. What is interesting, however, is that while the different types of contrast may be shared, the developments leading to them differ from language to language; see the examples in (5).

(5) a. Saka
       PIIr.   č  > ċ
               kš > c̣
               ky > č

    b. Tocharian
       PIE     d  > ċ
               ty > č

    c. Dardic
       PIAr.     General Pattern    Special Developments
       č      :  č                  ċ (Kashmiri, Torwali)
       kṣ     :  c̣h                 čh (Kashmiri)
       tr     :  tr-                č (Poguli), c̣ (Shina, Torwali)

Figure 2.1. Approximate distribution of sibilant and triple affricate contrasts in the Northwest (sibilant contrasts indicated by shading; triple-affricate contrasts by heavy borders)

27 Data from the following sources. Pamir languages: Bashir 2009, Edelman & Dodykhudoeva 2009a,b; Payne 1989; Skjærvø 1989a,b; Pashto: Robson & Tegey 2009, Skjærvø 1989c; Parachi: Kieffer 1989, 2009; Dardic/Kashmiri: Bashir 2003, Koul 2003; Dumaki (IA): Lorimer 1939; Nuristani: Strand 2010; Burushaski: Berger 1998.

This fact has important consequences for our understanding of contact-induced change — whether convergent or substratal. What seems to be important is the shared “target”, and not specific changes that lead to the target. As a consequence, some of the objections to particular contact explanations voiced in earlier publications may need to be reconsidered.


Elena Bashir

2.4. The Northwest
Edited by Elena Bashir

For purposes of this volume, the “Northwest” is considered as consisting of the northwest mountain regions (2.4.1.1), Baluchistan (2.4.1.2), and present-day Pakistan (2.4.2.1) and Afghanistan (2.4.2.2). Since the events of 1947 profoundly altered the political map of the northwest and set in motion rapid linguistic changes, the area is discussed in its pre- and post-1947 phases.

2.4.1. Pre-1947 convergences
By Elena Bashir

2.4.1.1. Pamir-Hindukush-Karakoram-Kohistan-Kashmir region

The sharing of features by languages spoken over a continuous geographical area, which may belong to different families or sub-families, can be the result of shared retention of some inherited feature(s),28 substratum influence, language shift to a dominant language, regular typological cooccurrences, or convergence. Frequently these processes interact within the same region. Thus various areal configurations (both macro- and micro-regions) may represent “fossils” of differing and multiple types, and of differing ages.29 Research on such areal phenomena in the Pamir-Hindukush-Karakoram-Kohistan-Kashmir (PHKKK) region must deal with numerous levels of time depth, from the prehistoric to the recent.

28 An example of a shared retention is a NOM-ACC case marking in a subarea which includes Nuristani Prasun and IA Kalasha and Khowar, the latter two of which also retain traces of the OIA preterital augment. Edelman (1983: 56) suggests former geographical contiguity prior to the relatively recent immigration of Kati speakers into the area as the reason for this common feature. This is to be contrasted with the situation in Standard Bengali, Oriya, and Sinhalese (Masica 1991: 344), where the NOM-ACC system is an innovation resulting from the decay of a MIA split-ergative system. The presence of dental/alveolar, palatal, and retroflex sibilants and affricates in many of the PHKKK languages can be considered a shared retention, whatever the source of this areal phenomenon may have been.

29 Hook (1985: 157) discusses a method for discriminating similarities due to areal factors from those due to other causes, arguing that if the co-occurrence of two features is relatively rare, a high frequency of their co-occurrence points to causes other than typological harmonics, for example areal or convergence effects.

2.4.1.1.1. Macro-areas

Discussions of prehistoric language contacts involving this region include Tuite 1998 (on Burushaski and NE Caucasian), Lamberg-Karlovsky 2002, and Witzel 2003, 2005. Southworth 1979 is a study of Dravidian and Indo-Aryan (IA) relations; it also discusses some lexical items and a morphological property found in the PHKKK region. Emeneau 1965b is a pioneering study of the areal distribution of pronominal suffixes in eighteen languages, including those of the PHKKK area, with respect to whether they show these six behaviors: (1) index the subject on past tense of the verb; (2) index the direct object on the verb; (3) occur with nouns as possessives; (4) occur in phonological constituency with a form other than that with which they are in morphosyntactic immediate constituency; (5) can be repeated in a sentence, or resume/index a noun or independent pronoun; (6) occur in a verb structure consisting of modal or aspectual prefix + pronominal suffix + past stem +/– other elements. Toporov 1965 is a study of feature gradients in the phonological systems of the languages of a Central Asian Linguistic Union (CALU), which includes the Nuristani, Dardic, and Pamir languages, Tajik Persian, Domaki, and Burushaski. Toporov establishes a set of nine abstract phonological oppositions characterizing the consonant phonemes of these languages and then computes the percentage of consonant phonemes characterizable by each opposition for each of the languages in question.30 Zoller (2005: 12–13), applying a procedure similar to that of Toporov (1965) to the study of aspiration in the Dardic and Nuristani languages, finds a phonologically innovative center in Dir and Kalam Kohistan surrounded by more conservative areas to the east and northwest. Following Tikkanen (1988), he attributes loss of aspiration to a substratum influence. Tikkanen 2008 maps phonological isoglosses in the PHKKK region.
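The kind of computation attributed to Toporov 1965 above, the percentage of a language's consonant phonemes characterizable by each opposition, can be sketched roughly as follows. Everything in the sketch is a hypothetical stand-in: the toy inventory, the three opposition labels, and the function name are invented for illustration and are not Toporov's nine oppositions or his data.

```python
# Illustrative sketch only; the inventory and opposition labels are invented,
# not taken from Toporov 1965.
from typing import Dict, List, Set


def opposition_coverage(
    inventory: Dict[str, Set[str]], oppositions: List[str]
) -> Dict[str, float]:
    """For each opposition, return the percentage of consonant phonemes
    in the inventory that participate in it."""
    total = len(inventory)
    if total == 0:
        raise ValueError("inventory must contain at least one phoneme")
    return {
        opp: 100.0 * sum(1 for feats in inventory.values() if opp in feats) / total
        for opp in oppositions
    }


# Hypothetical four-consonant inventory: phoneme -> oppositions it takes part in.
toy_inventory = {
    "p": {"voicing"},
    "b": {"voicing"},
    "t": {"voicing", "retroflexion"},
    "ṭ": {"retroflexion"},
}

coverage = opposition_coverage(
    toy_inventory, ["voicing", "retroflexion", "aspiration"]
)
```

Computing such a profile per language and comparing the profiles across neighboring languages is one plausible way to read the “feature gradients” mentioned in the text; the source does not specify the procedure in more detail.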
Edelman (1980), discussing various semantic and morphological characteristics found in languages of a CALU more widely defined than in Toporov 1965, finds that the languages of the Hindukush, Pamirs, Karakorams, and part of the Himalayas share certain features which she attributes to substratal influences. Among these are: (1) retroflex sibilants and affricates, which are characteristic of Burushaski but not found in Indo-Aryan languages outside this small geographical area; (2) construction of the numerals from 11–20, which follow the pattern 10 + n, rather than the inherited Indo-Aryan n + 10 pattern; (3) morphological patterns (e.g. in Wakhi) analogous to prefixal Burushaski patterns. Edelman attributes several features to the (partial) acquisition of the characteristics of a language of the active type.31 They include: (4) treatment of some grammatically intransitive verbs, like ‘laugh’, ‘cough’, ‘weep’ as transitives;32 (5) a group of stative intransitives of the type ‘be red’; (6) the inclusive/exclusive distinction in the 1st person plural; (7) the treatment of inalienable and alienable possession, as in the use of obligatory affixal elements with inalienably possessed entities (like body parts or kinship terms), possibly under the influence of Burushaski; (8) a change from grammatical to semantic gender based on animacy;33 (9) the expression of the concept ‘to begin’. Edelman concludes that a substratal layer of the active type underlies this whole area at a deep time depth. Burushaski figures importantly in her discussion. Dodychudojev 1972 is a comparative study of the Pamir languages and their possible interactions.

30 Ramanujan & Masica 1969 is a similar study on a larger geographical scale.

31 An active-stative language or split-intransitive language is one in which the single argument (S) of an intransitive verb sometimes receives agentive marking and sometimes absolutive or nominative case. The case of the intransitive subject (S) varies according to criteria particular to each language, often depending on the degree of volition or control over the verbal action exercised by S. See 4.5.1.3.3, this volume, for a discussion of this phenomenon, now known as fluid agent marking or fluid intransitivity.

32 This phenomenon is also observed in Urdu to a limited extent. For example, the past tense of the verb hãsnā ‘to laugh’ can be us-ne hãsā ‘(s)he laughed’, in which the agentive form us-ne, usually found with subjects of perfective forms of transitive verbs, appears.

33 Animacy has become grammaticized in the nominal morphology of Torwali and Sawi. It is grammaticized in the verb systems of Kalasha, Khowar, Shumashti, and Pashai; and in the deictic systems of Torwali and Kalam Kohistani (Bashir 2003: 823). Recent work on Dameli (Perder 2013) shows that animacy is also grammaticized in the verb system of that language.

Convergence phenomena at several time depths and with varying geographic spread can be identified in Kalasha (Bashir 1988: 250). The largest-scale pattern and that with the greatest time-depth is an extensive left-branching area embracing Altaic, Tibeto-Burman, Burushaski, Dravidian, and (partially) Indo-Aryan. This is the area which Masica (1983: 5) designates “Indo-Turanian” (previously “Indo-Altaic”). Bashir 1988, a study of Kalasha’s areal typology, finds that in addition to features which it shares with the South Asian (SA) linguistic area as described in Masica 1976, Kalasha displays characteristics not associated with the SA linguistic area. (1) It shows a much higher incidence of left-branching structures than would be predicted for a language in its geographical position solely with respect to the SA linguistic area.34 These include the preponderance in conversational discourse of left-branching relative clauses which employ a finite verb or a participial form, left-branching SAY-complements,35 and an extremely strong predilection for the use of the conjunctive participle. (2) Kalasha and Khowar both show highly developed expression of the category of inferentiality36 in basic verb morphology. (3) There is significant use of the causative to express involuntary experience, coupled with weak development of the dative-subject construction. (4) Grammatical gender has been replaced by grammaticization of the animate-inanimate distinction. (5) The numerals 11–19 have been restructured in accordance with the pattern 10 + n rather than the inherited Indo-Iranian n + 10 schema. (6) Contrastive dental, palatal, and retroflex sibilants and affricates are preserved. (7) Kalasha is one of a small number of (contiguous) languages having retroflex vowels (Di Carlo 2008). (8) The infinitive ending is -k. These features show Kalasha to be strongly under the influence of factors other than position vis-à-vis the SA linguistic area.37 Other morphological characteristics, such as the absence of a copula in equational sentences in Kalasha and Khowar, raise the question of what substratum or contact influences are the sources of these features.

34 Hook 1985 discusses the distribution of left-branching subordinate-main and right-branching main-subordinate clause orders along a geographical gradient showing a correlation of decreasing frequency of subordinate-main order with increasing distance to the northwest, with Iranian Balochi, Pashto, and also Brahui showing 0 % and Dravidian Kannada, Tamil, Telugu, and Malayalam 100 %. Left-branching structures are more frequent in Kalasha than in Khowar, and more so in oral than in written discourse.

35 SAY-complements are found in Turkic languages, as well as in Burushaski and Tibeto-Burman (Bashir 1996a).

Seen against this broad picture, the injection of right-branching characteristics into Khowar and to a somewhat lesser extent into Kalasha appears as a later, smaller-scale effect, probably due to the widespread use of Persian in the Khowar-speaking area. To the most recent level can be assigned features linking Kalasha and its neighboring languages to the SA linguistic area (via Urdu). As communication and cultural links shift from ties with Central Asia or Nuristan to relations with the lower Indus valley, we can expect to see an increasing approximation of Kalasha to the South Asian linguistic pattern.

2.4.1.1.2. The role of Burushaski

The language most frequently mentioned in connection with substratum effects in the PHKKK area is Burushaski. Those features which seem attributable to interaction with a Burushaski substratum constitute another convergence/diffusion layer, later than that of the extensive left-branching area mentioned above. Many scholars, including Grierson (1919: 6), Morgenstierne (1935: xiii), Edel’man (1976, 1980, 1984), Tikkanen (1988, 1999), Bashir (1988: 392–401, 1997), Berger (1998: 4), Witzel (1999a,c) and Zoller (2005), have suggested that Burushaski was formerly spoken over a much wider area than it is now, and has contributed substratal elements in the region.38 Tikkanen (1988) discusses the question of a Burushaski substratum in the languages of this area. While he agrees that Burushaski-speakers were present in the ancient state of Bolor, he feels that contentions that there was a Burushaski substratum in most of the Hindukush and Pamir area are exaggerated (p. 305). Tikkanen 1999, a study of the types, origins, and patterns of convergence in retroflexion, concludes (p. 147) that ‘the pre-Aryan language(s) of Swat, Kashmir and the adjacent area at the time of the advent of the Indo-Aryans can hardly have been either Dravidian, Burushaski or even Sino-Tibetan.’

Burushaski also participates in more recent convergence phenomena. Exploring the question of a Burushaski substratum in Khowar, Bashir (1997) discusses Burushaski-Khowar commonalities, including the following features: a necessitative construction in baṣ; the use of causatives to express involuntative semantics; replacement of grammatical by semantic gender; the use of apparently redundant possessive pronouns with kinship terms; relative clauses consisting of a finite clause preceding the relative noun; the use of plural marking for multiple actions; some place names; and numerous lexical items, e.g. čumur ‘iron’, and many other old borrowings which have become completely nativized and are perceived as layers of original Khowar vocabulary. Yasin Burushaski shows much lexical and some morphological influence from Khowar, which has been pointed out and elaborated by several scholars, including Berger (1974: 40–41), Lorimer (1962: 26), Tiffou & Pesot (1989: 35), and Bashir (2007a: Section 3.2.7). Lorimer (1935, 1937) discusses morphological commonalities among Hunza and Nager Burushaski, Shina, Khowar, Domaki, and Wakhi; Berger (1996) treats Shina loans in Burushaski. Tikkanen (1988: 305) identifies Burushaski loan words in Shina, Khowar, Wakhi, and Balti, but notes that these borrowings seem relatively recent in the surrounding IA languages. Tikkanen 2007 points to the use of Burushaski huruṭ- ‘sit, remain’ as an aspectual auxiliary meaning ‘keep on V-ing’ as evidence for influence of Urdu and/or Shina on the Burushaski aspectual system, and the presence of this construction in Domaki as a probable Burushaski and/or Shina influence.

36 This is Haarmann’s (1970) indirekte Erlebnisform, or Aikhenvald’s (2003) Type I system.

37 Bashir 1996b is a similar discussion, focused on Khowar.

38 Berger 1959 and Djačok 1988 discuss Burushaski loanwords in Romani.
However, Tikkanen (1995: 518) found no clear indications of external influence on Burushaski converbs (conjunctive participles) except a recent Urdu-influenced use of Burushaski converbs sharing negation with the main clause. Tikkanen 2011 discusses Burushaski influence on the Domaki case system, and other morphological parallels between Domaki, Burushaski, and Shina. Morin & Dagenais 1977 is a study of how the pronunciation of Urdu words borrowed into Burushaski is affected by Burushaski phonology. Patry & Tiffou 1997 examines lexical influences of Urdu on Yasin Burushaski. Frembgen 1997 discusses English loan words in Burushaski. Burushaski transplanted in the late 19th century to Jammu and Kashmir has been heavily influenced by both Kashmiri and Urdu (Munshi 2006).

2.4.1.1.3. Micro-areas

Other features are shared by different subsets of the languages of this area. One is the presence of three- (or more) valued deictic systems. The Dardic languages Pashai, Shumashti, Khowar, Kalasha, Torwali, Indus Kohistani, Shina, and Palula share this feature (Bashir 2003: 823), as does Kashmiri (Koul 2003: 912). Among the East Iranian languages, Wakhi, the Shughni group (except for Yazghulami), Ishkashmi and Yidgha-Munji have three-valued deictic systems, while Ossetic, Yazghulami, and Yaghnobi have two-valued systems (Dodychudojev 1972: 466, Skjærvø 1989: 372–373). Burushaski appears to have a two-valued system (Lorimer 1935, Berger 1998: 81). Infinitives in -k are found in the micro-area including Gawar Bati, Kalasha, Pashai, Khowar, Wakhi, and Gilgit Shina. The conjunctive participle of the verb meaning ‘to adhere to’ functions as a postposition marking the causative agent in Kalasha, Khowar, Palula, and Gilgit Shina (Bashir 1988: 186–187, 424; 2003: 823; 2015). Di Carlo (2008), citing and building on Emeneau 1965b, discusses the distribution of two features as evidence for a diffusion sub-area including the Nuristani languages and some of the (formerly) adjacent Dardic languages. The first of these is the use of pronominal suffixes in possessive NPs involving kinship terms, a restricted version of Emeneau’s (1965b: 42) feature (3) (occurring as possessives with nouns), in Kalasha, the Nuristani languages, Dameli, and Gawar Bati. Recent fieldwork (Lehr 2014: 162–173) indicates that this feature is also found in Pashai. Di Carlo also considers the presence of retroflex vowels in Kamkataviri, Ashkun, and Waigali as strong evidence, given their typological rarity, of a specific, geographically limited convergence area. These retroflex vowels were first described in Kalasha by Trail and Cooper (1985), then represented in Trail and Cooper’s Kalasha dictionary (1999), and later analyzed in Mørch & Heegård 1997 and Heegård & Mørch 2004. Extending her previous research (Bashir 1988) on the grammaticization of evidential distinctions in Kalasha and Khowar, Bashir (2007b) has identified a large sub-areal grouping of languages having morphological expression of Type I evidentiality distinctions (Aikhenvald 2003).
To a significant extent, this intersects the Central Asian Linguistic Union (CALU) as described in Edelman 1980.

Khowar and Wakhi have undergone multi-layered interactions. Morgenstierne (1926, 1936) discusses numerous Iranian loans in Khowar belonging to different chronological strata, pointing to interaction with Middle Iranian as well as contemporary Eastern Iranian languages. He points out the important fact that Wakhi took its first person singular oblique pronoun from IA at a very early stage, and later its first and second person plural pronouns from Khowar. Simultaneously, Wakhi has contributed numerous loan words including basic vocabulary items to Khowar (Morgenstierne 1926: 79–80; 1938 II: 441; 1975: 434). Morgenstierne (1975) points to both early contact of pre-Wakhi with some form of IA at a time when diverging IA and Iranian were still largely mutually intelligible, and to later intense mutual interaction between Khowar and Wakhi. Bashir (2001) continues this line of research, discussing additional lexical borrowings into Khowar from Wakhi. The continuation of Khowar-Wakhi contact until fairly recently is a matter of living historical memory. Movement between Chitral and Wakhan was more frequent in the past than it is today, both because of constraints imposed by new political boundaries and because of topographical changes in which some formerly used passes between Chitral and Wakhan have been closed by recent glacier formation.

The numerous varieties of Shina participate in many local convergences. Schmidt & Kaul 2008 compares core vocabulary items in ten dialects of Shina and in Kashmiri, with a focus on questions of language contact, finding that Shina and Kashmiri do not share a large number of cognates and that the phonological developments of cognates in the two languages have been quite different (p. 231). The importance of loan words in Shina from Burushaski, Persian, and Tibetan is also stressed. Kohistani and Schmidt (2006: 141), discussing Shina dialects, note that while heavily influenced by Gawar Bati, another Dardic language, the outlying Shina dialect of Sawi is still mutually intelligible with Palula. Liljegren 2013 discusses the relatively recent convergence of Kalkoti, at base a Shina variety, with a variety of Kohistani spoken in its vicinity, while also pointing up similarities with Palula and Sawi. Liljegren (2009) argues for recent convergence between the northern and southern dialects of Palula, speakers of which, he argues, reached southern Chitral via separate migration routes. An oral text in the Shina of Gurez (Bashir & Hook forthcoming) indicates word-order convergence effects with Kashmiri. In contemporary Pakistan, the authors observe, Shina is coming under increasingly heavy influence from Urdu and English, particularly in Gilgit town.

Fussman 1972, volume I is a linguistic atlas which maps 167 lexical items in the Dardic, Nuristani, and bordering languages, giving forms from as many languages and varieties as possible and sorting them into form classes. For example, 50 forms for the lexical item APRICOT are given, which are mapped onto eight groups. Such data can shed light on questions of lexical borrowing among (subsets of) the languages in question.
Volume II is commentary and discussion of each lexical item. A phonological feature shared by many of the languages of North Pakistan is tonal systems. Baart (2003) finds that a majority of the languages of North Pakistan have tonal or pitch accent systems and outlines the beginning of a typology of the systems of these seventeen languages. A tonal language recently investigated is Kundal Shahi (Rehman & Baart 2005: 17). The authors find that its tone system is similar to that found in Shina, Indus Kohistani, and Palula, rather than that of Panjabi, Hindko, and Gujari. Kundal Shahi seems to be a language descended from an archaic form of Shina, which has undergone significant influence from Kashmiri, Hindko, and Indus Kohistani, with traces of contact with languages farther to the West (Swat, Dir, and Chitral) also discernable.

2.4.1.2. Baluchistan

2.4.1.2.1. Introduction

Baluchistan has seen intense convergence between Brahui and Baluchi,39 with the predominant influence from Baluchi to Brahui. Brahui, along with Kuṛux and Malto, has been generally considered to be a North or Northwest (Andronov 2006: 146) Dravidian language,40 now spoken mainly in Pakistani Balochistan, but also in Iran, Afghanistan, and minimally in Turkmenistan (Panikkar 1993). However, McAlpin, beginning in 1975 and developing his views since then (1975, 1980, 1981, 2003, 2015, forthcoming; and Southworth & McAlpin 2013), challenges this view, concluding most recently that Brahui is not Dravidian, but an independent branch of a putative Proto-Elamitic (sister to Proto-Dravidian) branch of a Proto-Zagrosian family (named for the Zagros Mountains of southwestern Iran). (See also 1.6.1.2 above.) Two hypotheses have been advanced as to how Brahui comes to be located so far to the northwest. According to the first scenario, proto-Brahui split off from the main body of Dravidian as the Dravidians moved from the northwest toward the Indian subcontinent, and has remained in its present position in Balochistan since about 3000 BCE (Andronov 2003: 21–23, 2006: 146). McAlpin’s work is consistent with this hypothesis, and Southworth & McAlpin (2013) present a more detailed historical scenario. The second, first proposed by Bloch (1924), and advocated by Elfenbein (1987), proposes that the Brahuis migrated from the Deccan to their present location about 1000 years ago.41 According to Elfenbein (1987: 223), the ethnonym “Brāhūī” (older “Brāhōī”) is itself of relatively recent origin, ‘first used in the 16th c. to refer to a now vanished tribe of Baloch, the Ibrāhīmī, who dwelt amongst the Jaṭṭs of Awārān in Pakistani Makran’, and is thus not helpful in tracing the earliest history of the language.
However, Andronov (2001: 25; 2006: 6) argues that the origin of the name “Brahui” is very old and purely Dravidian, and considers Elfenbein’s explanation to be based on folk etymology (see also 1.6, this volume). The oldest known name for the Brahui LANGUAGE is Kûr Gâlli (Lassen 1844: 339) or Kūrdgālī (Elfenbein 1987: 226).

Baluchi is a North Western Iranian language showing some features of South Western Iranian (Korn 2003, 2005: 37, and 1.4.2.1, this volume). It is spoken primarily in Pakistani Balochistan, Afghanistan, Iran, and marginally in Turkmenistan (Axenov 2006).42 It, too, has been and continues to be subject to numerous contact influences. Korn 2005 contains many references to contact with various Iranian and IA languages, and with Brahui. Work on Baluchi in Iran includes Spooner 1967; Jahani 1994, 1999, 2003, 2008; Baranzehi 2003; Mahmoodzahi 2003; Mahmoodi Bakhtiari 2003; Dabir-Moghaddam 2008; Rzehak 2009; and Delforooz 2010. Jahani’s works discuss specific influences of modern Persian on Iranian Baluchi. Rzehak 2003 discusses the status and development of Baluchi within the multiethnic, multilingual society of Afghanistan up to the beginning of the 1990s. He sees the long-standing close contact between Persian, Baluchi, Pashto, and Brahui in Afghanistan as having resulted in a sprachbund-like situation, especially with regard to vocabulary including ‘political, scientific and philosophical terminology as well as many terms for objects and other aspects of the real world’ (p. 263) and some morphological patterns. Ezāfa constructions, combinations of Persian and Baluchi prepositions, and copies of morphological-syntactic constructions, e.g. bād š-āī ‘after that’ (cf. Persian ba’d az īn), also reflect this convergence (pp. 263–267). With regard to Baluchi contact with IA languages, Elfenbein (1982: 80) thinks that study of Jaḍgālī (~ Jagḍālī), the language of people speaking an IA language thought to be a variety of Sindhi, who live between Čābahār and Gowātr (Gwadur) and are associated with the name “Jaṭṭ”,43 could yield insight into Baluchi-IA contacts. Bashir (2008) investigates certain features of contemporary Eastern Balochi, finding that it ‘has clearly acquired some of the characteristics of its IAr. neighbors: some retroflex consonants, contrastive nasalization, and the morphological passive in -ij,’ and that ‘features of E[astern]B[alochi] transitional between the inherited Iranian state of affairs and the IAr. areal norm include: the status of aspiration, a differentiating series of progressive verb forms, a conjunctive participle which has some but not all the properties of the IAr. C[onjunctive] P[articiple]; and serial verb and CP constructions which show early stages in the evolution of an IAr.-like compound verb’ (Bashir 2008: 78).

39 I use the spelling “Baluchi” for the language in its earlier stages and for the varieties spoken in Iran and Afghanistan. “Balochi” is now the official spelling in Pakistan, and will be used when specifically modern Pakistani Balochi is intended.

40 For a discussion of the history of this classification see McAlpin 2003: 521.

41 Elfenbein 1987: 216 maps these two scenarios.

42 Spooner 2012 is a historical and sociolinguistic treatment of Balochi.

2.4.1.2.2. Brahui-Baluchi convergence

Convergence studies involving Baluchi and Brahui have all stressed the remarkably intertwined relationship of these languages and their speakers. Morgenstierne (1932a: 8–9) observed that: ‘the tribal system of the Baloches and Brahuis, which in contrast to that of the Pathans favours the assimilation of racially foreign elements into the tribe, has no doubt led to frequent changes of language within many Baloch and Brahui clans.’44 As a result, there is no strong correlation of language and ethnicity within the Baloch and Brahui communities (Bray 1934; Elfenbein 1982, 1987; Emeneau 1962b); thus “Brahui tribesman” and “Brahui speaker” do not necessarily refer to the same population. Many Baluchi and Brahui speakers are “bilaterally bilingual” (Emeneau 1962b), a result of complex sociolinguistic factors, not the least of which is that ‘at one time in the history of the Brahui Confederacy there must have been more non-native speakers of Brahui, whose mother tongue was Baluchi, and descendants of such non-native speakers, than there were speakers who had learned Brahui from native speakers. It was this bilingual majority who handed on to later generations their version of Brahui, a version which in many features was essentially a calque of Baluchi clothed for the most part in Brahui forms’ (Emeneau 1962c: 60). According to relatively recent estimates, at least 30 percent of Brahui tribesmen speak no Brahui at all, and at least 80 percent of Brahui speakers are bilingual or trilingual (Elfenbein 1989). However, according to Elfenbein (1989), these bilaterally bilingual speakers never mix the two languages consciously, since the choice of language to be used is an important social and psychological decision. Perhaps this is one reason why despite the pervasive influence on Brahui lexicon, morphology, and syntax by Baluchi for perhaps the past 1000 years, Brahui remains a recognizably agglutinative Dravidian language (Elfenbein 1983: 103). Thomason & Kaufman (1988: 92–93) think that the Brahui-Baluchi case belongs either in their category 5 (very strong cultural pressure: heavy structural borrowing) or category 4 (strong cultural pressure: moderate structural borrowing), noting that the interference features that have been identified in Brahui are not particularly typologically disruptive. This situation has resulted in mutual influences between the two languages, albeit with more influence from Baluchi on Brahui than vice versa; and most studies so far have focused on Baluchi > Brahui influences.

43 Delforooz 2008 is the only published article on this language community that I know of.

44 Barth (1964) reaffirms this observation, deploying it to explain the territorial expansion of the Marri tribe.
Elfenbein 1982 discusses the long-standing cultural and linguistic interaction between Baluchi and Brahui; Elfenbein 1987 continues the discussion of Baluchi-Brahui parallelisms, giving many examples of specific shared features and discussing the sociolinguistic contexts in which Brahui-Baluchi bilateral bilingualism operates.45 Sabir (1995, 2003) summarizes these earlier observations. The fact that these two genetically different languages share so many lexical items has been noted since the earliest studies of Brahui. Lassen (1844: 402–404) mentions Persian, Arabic, and Baluchi words in Brahui. Bray (1909: 7) wrote that the lexicon of Brahui had been augmented, first with Iranian items (Persian and Baluchi, but not Pashto) and then with IA (Sindhi, Lahnda, Urdu), the degree of borrowing from one or another source varying from tribe to tribe according to their geographical position. The Brahui lexicon consists of approximately 15 % words of native Dravidian origin,

45

The degree to which Brahui speakers use Pashto is an interesting question. According to Elfenbein (1987: 223), no Brahui group uses Pashto at all, even as a secondary language. In the urban setting of Quetta, however, the four Brahui speakers interviewed by Archer (2003) all report knowing Pashto.


20 % of Baluchi origin, 20 % of IA origin (including many “Jaṭki” words borrowed through Baluchi), 35 % of Persian/Arabic origin, mainly through IA or Baluchi, and the remainder of unknown origin (Elfenbein 1989). According to Elfenbein (1983: 104), the Baluchi loanwords in Brahui come from at least two different dialects, probably at different times; a proper stratification of these in Brahui could throw important new light on the history of both languages.46 With regard to phonology, the vowel systems of Baluchi and Brahui are almost identical. Bray (1909: 24) finds both /ē/ and /ĕ/, e.g. bīn-ĕ ‘hunger-ACC’ vs. nē ‘to thee’. Morgenstierne, however, thought that Brahui has only /ē/, and that the Brahui system has been almost completely assimilated to that of Baluchi (1932a: 7). Emeneau (1962a) mentions the (limited) occurrence of /ĕ/ in non-accented, non-initial syllables, but concurs that Brahui has a non-Dravidian-like vowel system, likely due to Baluchi influence. Andronov (2006: 10–11) follows Bray (1909: 24) in affirming the existence of Brahui short /ĕ/ in various positions, citing minimal pairs such as /ē/ ‘that’ vs. /ĕ/ ‘is, exists’; Bashir (1991a: 15, n.13) contains 1990 fieldwork-based examples showing both /ĕ/ and /ŏ/, e.g. arĕ ‘husband’ vs. arē ‘is (exists)’; ē ‘that, those’ (distant but visible) vs. ĕ ‘is’ and ĕ ‘accusative case marker’; ō ‘that, those’ (not visible) vs. ŏ ‘are’. The consonant systems, however, are less similar, differing in the distribution of retroflexion and aspiration, and notably, the presence in Brahui of native fricatives. The Brahui voiceless lateral fricative /ɬ/ is unique in the region. Morphological parallelisms have also been long discussed. Trumpp (1880) pointed out several instances of what he conjectured to be Baluchi influence. He mentions the initial k- on the present/future tense of verbs of motion, common to Baluchi and Brahui (p. 40), the present progressive formation (p. 59), agent noun in -ok (pp. 73–74), and locative adverbs in -ngo, which he mentions as including an element -ng found in both Baluchi and Brahui (p. 120). Grierson (1906: 622) pointed to Brahui’s loss of the Dravidian distinction between rational and irrational nouns, attributing this to Iranian influence. Emeneau (1962a: 40–71) elaborates on the history of Baluchi-Brahui bilingual relationships and morphological parallelisms, listing the following features as probably reflecting Baluchi influence: loss of Dravidian gender system, loss of first person plural inclusive vs. exclusive distinction, an aspectual -a affix suffixed to the word preceding present-future or imperfect forms, and relative clauses employing ki. Brahui pronominal suffixes marking possessive and object relations use native Dravidian morphemes in a borrowed structure. Emeneau (1964) considered that Baluchi was as likely as Sindhi to be its source, but later (1965a: 40–71) favored Sindhi as the immediate source of this structural feature in Brahui.47 Elfenbein (1982: 95) agrees with Emeneau’s

46

Parkin (1989) discusses probable Baluchi influences on Brahui kinship terminology.

47

Emeneau (1965a: 66) spells out his latter position: ‘And in addition as we have found, the contiguous Dravidian language, Brahui, was also drawn, through bilingualism with


latter position. Thomason & Kaufman (1988: 93) think that Baluchi is the more likely source, ‘though Indic influence may have helped fix it in Brahui’. The paradigm of the Brahui verb kann- kar- kē- ‘do’ combines a Dravidian stem, kē-, with an IA stem, kar-, and an Iranian stem kann-. ‘The condition that facilitated such borrowing was the prior formation in Brahui from Dravidian sources of a subclass of verbs with irregular allomorphy of the type mann-, mar- ‘become’, bann-, bar- ‘come’, dann-, dar-, de- ‘cut, take’. The borrowed allomorphs kann- and kar-, with the regular k- that is seen in parallel instances of borrowing, then became the source of the unexpected k- in ke-’ (Emeneau 1964: 75). Elfenbein (1982: 96) argues that the Brahui locative ending -ā, indicating motion either to or from, is another example of Baluchi influence. Elfenbein (1983) mentions the suffixation of -ī to nouns to make adjectives (< Bal.). Bashir (2010: 31–37) discusses the increasing frequency of use of a specifically progressive form, which seems to be evolving in parallel with similar forms in Baluchi, with both languages increasingly under the influence of Urdu and English.48 Related to this development, a new type of Brahui non-finite negative form has emerged, employing a prefixal strategy borrowed from the Baluchi or Urdu (Iranian/IA) pattern and resulting in a new type of deverbal nominal (Bashir 2010: 38). Although influence has been predominantly from Baluchi to Brahui, Morgenstierne (1932a: 9) noted a number of Brahui words in the Baluchi of Noshke and Panjgur; Spooner (1967) comments on Brahui lexical influence on all Baluchi dialects; and Farrell (2003: 183) cites loaning from Brahui into Balochi, including syntactic calques, in Rakhshani-speaking areas. There are also cases in which the directionality of borrowing is not clear (Rossi 1979, Korn 2005).

2.4.1.2.3. Brahui and other languages

Grierson (1906: 627) expressed the possibility that Brahui has been influenced by other languages in addition to Baluchi. Rossi (1979) classifies loan-words in Brahui into several categories, giving all available etymological material: (a) items whose derivation from Baluchi is certain; (b) items attributed to any Iranian language different from Baluchi; (c) words common to two or more Iranian languages excluding Persian; (d) items common to Brahui and only one Iranian language, for

the contiguous Indo-Aryan language, Sindhi, into the linguistic area contained by this isogloss that represents the pronominal-suffix structural trait. The diffusion was that of a structural feature, which was clothed with native morphemes in several different languages or language groups, and the direction of diffusion was Iranian to Indo-Aryan to Dravidian.’

48

Elfenbein (1998: 403) says that the Baluchi “progressives” are an innovation originating in the Eastern Baluchi area. He feels that the Brahui progressives cannot be a calque on the Baluchi, and that Brahui could just as equally be the source of the Baluchi forms.


which there are not sufficiently clear grounds to attribute their source to Baluchi or another specific Iranian language; (e) words which are possibly of Iranian origin; (f) items previously attributed to Brahui but which are now considered doubtful; (g) items of Persian origin; (h) words common to Baluchi and at least one IA language. Bray (1934) contains a sophisticated discussion of the multiple sources and possible routes of borrowed lexical items in Brahui and also of the need to consider the borrowing of Dravidian words into other languages. He says (p. 28): ‘… Iranian and Indian philologists are now again brought up against the whole question of the presence of Dravidian words and this time of Brāhūī loan-words in particular, in Balōchī, Sindhī, Jaṭkī and even Pashtō’; Gren-Eklund (2003: 45), assuming the later, northwestward migration scenario for Brahui, and noting the lack of study of Brahui-IA contact relationships, suggests the desirability of studying possible contact relationships with other languages, especially Munda, prior to Brahui’s movement to its present position. With regard to IA structural influences on Brahui, Emeneau, discussing early contacts, assumed a historical scenario in which IA languages (specifically Sindhi) were in contact with Brahui prior to Baluchi (1965a: 61):

We may conclude that the Brahuis have only in the last centuries been in close contact with Balochis, that they may conceivably have had Persian-speaking neighbors earlier, but that for the earlier period and presumably for a long time they had more intimate contact with speakers of IAr. languages (note the Hindu rulers of Kalat). These in all geographical probability were Sindhi speakers, and we should add that even if Sindhi speakers should have been absent from the Brahuis’ present home, there is an evidently long-standing practice of the Brahuis whereby during the winter they migrate in large numbers into Sind where they have hereditary winter-quarters [Bray 1934: 10–12]. This in itself provides the bilingual situation which would allow an IAr., specifically a Sindhi, trait to diffuse into the Dravidian Brahui language.

Elfenbein, however, argues that there is ‘no real evidence for any deeper, structural influence from IA on Brahui’ (1982: 80). A recent study in Sindhi, Brohi 1994, compiles comparative word lists showing similar or identical words in Sindhi and Brahui and includes chapters entitled “Sindhi influences on Brahui” and “Brahui influences on Sindhi”. More recently, Bashir (2010) finds that the increasing grammaticization of the progressive forms and the use of the nominalizing suffix -ī can be viewed as IA influences. The questions of IA influence on Brahui and on Baluchi are, of course, closely related.

2.4.1.2.4. Desiderata

Advancement of knowledge about convergence phenomena in Balochistan requires a huge amount of work in documenting various dialects; collecting, transcribing, and annotating texts to facilitate linguistic analysis; and focused studies of specific syntactic differences, e.g. case marking systems with respect to differential


object marking, agent marking, and split ergativity. Emeneau (1964) stressed the importance of recording and study of the IA speech forms (Jaṭkī/Jaḍgālī) spoken in Baluchistan, which he thinks are likely to predate the arrival of Brahui and to be the source of early IA influence on both Brahui and Baluchi. He considers the lack of information about these IA dialects a significant gap in our knowledge. Another need is for close historical linguistic work on determining the directionality of morphological and syntactic influences; this would enable a better picture of potential Brahui influences on Baluchi. At the level of semantics, one might wish for comparative study of Brahui and Baluchi like that of Filippone (1996) on locative expressions and spatial models.49 Delforooz 2010 is a study of discourse features in Sistan Baluchi. Comparative discourse studies await more work like that of Delforooz 2010 and the availability of large corpora in both languages. One as yet unexploited resource is manuscript and other materials which exist in private libraries in Pakistan, which may contain older texts in these languages. Many of these collections were identified under the Private Libraries and Archival Survey Project (PLASP) (American Institute of Pakistan Studies n.d.). In addition to Baluchi-Brahui interactions, contact phenomena — both former and currently ongoing — involving both these languages with other languages like Pashto and Urdu also need study. Most of the convergence phenomena discussed in these paragraphs are the result of long-standing patterns of interaction; however, the sociolinguistic situation has changed rapidly since the days when those patterns were established, and new patterns are rapidly emerging. Language contact involving Balochi in multilingual, multicultural Karachi and Quetta (Pakistan) is discussed by Farrell (2003) and Archer (2003), respectively. 
Titus 2003 discusses sociolinguistic factors involving Brahui, Pashto, and Balochi in Pakistani Balochistan’s highland zone. Korn (2005: 48) notes that: ‘With increasing school attendance and the advent of mass media also in the remoter areas of Balochistan, the respective official languages Persian (Iran), Sindhi and Urdu (Pakistan), Dari and Pashto (Afghanistan), Russian and Turkmen (Turkmenistan) have made their influence felt much more than ever before.’ (See also Sections 2.2 above and 2.4.2.1.2.3 below.)

2.4.1.3. Iranian contact to the north and west

In this section a rough chronological sketch precedes discussion of specific languages. Bashir 2006b is a previous treatment of Iranian–Indo-Aryan interactions.

49

Filippone (1996: 19–20) mentions approximately 50 hours of interviews recorded and transcribed by her, a corpus of texts published in Balochi magazines, and three sets of unpublished sources.


2.4.1.3.1. Early influences

Ancient contacts between Iranian and Indo-Aryan (IA) resulted in influences in both directions, at a time when, ‘There must, in the 6th and 5th centuries B. C., have been hundreds of the most commonly used words which were practically identical on both sides of the linguistic border’ (Morgenstierne 1974: 271). Not only have Iranian languages influenced IA, but IA influences have been identified in Iranian languages. For example, the existence of some IA-like characteristics in E.Ir. Wakhi has been noted by Morgenstierne (1975), Pakhalina (1975, 1985), and Kuiper (1991a). Among such features, Morgenstierne (1975: 432) particularly mentions the retention of past participles in -n, common in IA but in Iranian found only in Wakhi (and one word in Sanglechi-Ishkashimi); and an oblique singular of the first person pronoun in maẓ, which he derives from *mazya- (cf. Skt. mahyam). Morgenstierne (1974: 279) notes the presence in the relict Iranian language Parachi of numerous loanwords from adjacent IA Pashai, and also a verb sī ‘it exists, is (inanimate)’, whose semantic development from ‘lying down, exists’ (< Skt. śete ‘it is lying down, exists’) to ‘is’ he attributes to either an early loan or semantic influence from Pashai, where šī(k) (Morgenstierne 1973a, Lehr 2014: 257) means ‘is (inanimate)’.50 For the Achaemenian period (ca. 550–330 BCE), however, Morgenstierne (1974) thinks it likely that the direction of borrowing was mostly from Iranian, via politically dominant Persian speakers. As a result of this great political and cultural influence on India, Old Persian words were adopted in Prakrit and in Sanskrit. Chatterji (1966) discusses some Iranian and Turkic loans in Sanskrit. Continuing this topic, Morgenstierne (1974: 273) distinguishes between two types of loanwords from this period: ordinary loanwords, e.g. Skt. kantha, according to Pāṇini a dialectal word for ‘town’ < Ir. kanθa, known from several M.Ir. languages; and what he calls ‘Ir. words which have been phonetically sanskritized, in other words Skt. words semantically influenced by Ir.’. He discusses several words whose Sanskrit meanings he feels point to an Iranian source, e.g. Skt. aśvavāra ‘horseman’. The next stage begins after the end of the Achaemenian period, from the time of the Parthian empire (ca. 247 BCE–224 CE) until the end of the Sasanian empire (224–651 CE), and for some time afterward. By then, Avestan and Old Persian had evolved into Middle Persian (e.g. Pahlavi). The Niya Prakrit was a NW Prakrit51 used as the administrative language of the Shan-Shan kingdom near the southern edges of

50

In the 2014 Pashai of Village Amla in Darrai Nur, the animate/inanimate distinction is restricted to the present tense (Lehr 2014: 257).

51

Burrow (1936) thinks that the Niya Prakrit most closely resembles Torwali, a Dardic language, spoken today in the mid range of the Swat Valley.


the Takla Makan desert in the third century CE.52 It is represented in the Kharoshthi documents studied by Burrow (1937), who, based on comparison of the Niya Prakrit with the language of the Ashokan edicts, concludes that it originated somewhere west of the Indus, and that some of its phonological characteristics result from the fact that the native language of Shan-Shan was (like) Tocharian. Words of probable Iranian origin in the Niya Prakrit documents are analyzed by Burrow (1933–1935a, b). Weber (1997: 36), discussing a few of the more than 40 Iranian loanwords in the Niya documents, concludes that the problem of identifying the source of Iranian loanwords in the Niya Documents is ‘more or less, a problem of loans within Iranian itself.’53 A complex picture of multiple layers of IA-Ir. interaction is seen in Khowar. Morgenstierne 1936 is a foundational article for Khowar etymological studies. It discusses four layers of historical accretion of Iranian lexical elements in Khowar: words from (a) unidentified Ir. sources, (b) Pamir languages, mainly Wakhi, (c) Middle Ir. languages, and (d) numerous words from New Persian.54 Notably, borrowings from Pashto are very few, and very recent. Morgenstierne (1975) points to both early contact of pre-Wakhi with some form of IA at a time when diverging IA and Iranian were still largely mutually intelligible, and to later intense mutual interaction between Khowar and Wakhi. These multi-layered interactions have resulted both in Wakhi’s relative isolation from its Eastern Iranian neighbors and in Khowar’s differences from its NWIA neighbors. Bashir (2001) continues the discussion of mutual Khowar-Wakhi influences, discussing unusual cases like the replacement of basic vocabulary items in Khowar, and the adoption of Khowar personal pronouns in Wakhi. Some of these innovative basic vocabulary items in Khowar appear to consist of a Wakhi verbal base + a Khowar suffix.
For instance, ligíni ‘tongue’ seems to consist of the stem of the verb ‘lick’, its [g] pointing

52

As to how and why an IA Prakrit came to be the administrative language of a small Central Asian kingdom, Brough (1965: 598) postulates ‘a period of Kuṣāṇa possession of the Shan-shan country, a period which may in fact have been quite short before independent rulers took over control. It must have been long enough for the establishment of Prakrit and the Kharosthi script for government purposes.’ This does not imply a large colony of Indians settled in the region; rather ‘at the most, one would assume that the Kuṣāṇa administration brought into Central Asia a relatively small number of Indian scribes and minor civil servants’ (Brough 1965: 605).

53

New work on Gandhari Prakrit is being done under the “Buddhist Manuscripts from Gandhāra” project at the Bavarian Academy of Sciences and Humanities, notably by Stefan Baums and Andrew Glass. The Dictionary of Gāndhārī, Bibliography of Gāndhārī Studies, Catalog of Gāndhārī Texts, and a collection of digital editions of Gāndhārī documents contributed by numerous scholars can be found at http://gandhari.org/.

54

Lorimer (1922) describes an enclave of Persian speakers in Madaglasht (Chitral, Pakistan). This community is of modern origin and their language is closest to the Persian of Badakhshan.


probably to Wakhi lix- : lix-t- rather than to Khowar li-, plus the Khowar instrumental derivational suffix -íni (Bashir 2001: 10). Buddruss 1989b discusses an example of recent contact between Khowar and Wakhi, with Khowar the recipient language. This article is an analysis of Pakhalina’s (1981) word list of a language variety called “Kivi”. Buddruss finds that Kivi, which is in fact the Wakhi name for Khowar, is virtually identical to the best-known variety of Khowar, spoken in Chitral (Pakistan), with only minor phonetic changes due to recent interaction with Wakhi and the incorporation of a few Wakhi lexical items. Neither IA Khowar nor its closest neighbor Kalasha has retained inherited IA grammatical gender, presumably under the influence of substratum effects from Burushaski (Bashir 1988: 409), coupled with the influence of various stages of Iranian, particularly early Iranian and Wakhi.55 This loss of gender must have occurred at a time before more recent divergences in Kalasha and Khowar lexicon and syntax occurred. Persian grammatical influences on Khowar include subordinate clauses introduced by ki, the ezāfa construction, conjunctive -o-, and spreading use of the Persian (animate) plural marker -ān. Direct case plurals in -án (from Persian), originally used in Khowar with Persian words denoting animate beings, e.g. buzurg-án ‘elders’, are spreading to native words, e.g. ḍaq-án ‘boys’, replacing the original unmarked direct-case plural (Bashir 2007a: 225–226). In addition to its close contacts with Khowar, Wakhi, an Eastern Ir. language, was in various types of contact with varieties of Persian over hundreds of years; and until recently Wakhi in Gojal (Pakistan) remained under the influence of Persian. This influence is, however, restricted to lexical items and set phrases, and has not affected the structure of the language. Persian is now used only by men when speaking in public, or by older speakers. 
Reinhold 2006 is a 333-page monograph devoted to the contact history of Wakhi. It includes chapters tracing the various stages of Persian influence on Wakhi, and on the changed situation after the massive introduction of Urdu and English into Gojal after 1947. Brahui has been the recipient of words from Ir. at various stages of its development. In a series of works, Rossi (1971, 1977, 1979) presents detailed etymological analyses of several classes of these words. Rossi 1971 treats 44 Brahui words ending in -ā/ănk, -ī/ĭnk, -ū/ŭnk, -ēnk, -ōnk, and previous discussions of them in Bray 1934: 25 and Morgenstierne 1932, 1937. Rossi concludes that most of these originate in various forms of Middle Persian, having entered Brahui at different times and from various specific speech communities. Rossi 1977 discusses 26 Brahui lexemes selected to include all possible outcomes of Proto/Old/Middle Iranian *-k(a)- stems borrowed by Brahui at any phase of development of the language. He divides these into four categories: (i) Br. ← Bal.; (ii) Br. ← Prs.; (iii) Br. (in)directly ← Prs.; (iv) Br. ← some Ir. language. Finally, Rossi (1979: vi) presents what he

55

New Persian remained the official, government language of Chitral, where Khowar and Kalasha are spoken, until 1953, when it was replaced by Urdu (Bashir 2006b).


modestly characterizes (p. vi) as ‘an etymological supplement to Bray’s vocabulary.’ It contains rigorous etymological analysis of Iranian elements in the Brahui lexicon, remedying what he perceives as earlier neglect of borrowed elements in Brahui. For further discussion of Brahui-Balochi interactions, see 2.4.1.2 above. Pashto has preserved a remarkable number of morphological archaisms. Some of its features, however, have been attributed to IA influence. Emeneau (1965b: 158–159) thinks that it is possible that the retention of grammatical gender in Pashto is at least partly due to IA influence: ‘… it seems suspicious that there is a bunching of Iranian languages with two-gender systems in the area nearest to the Indo-Aryan border, on the other side of which all the languages (with the exception of Khowar) have two-gender Indo-Aryan systems.’ The probable IA source of a causative morpheme -aw is discussed by Morgenstierne (1940: 113–114). Later IA loanwords in Pashto are mainly from Sindhi and “Lahndā”, e.g. koṭ ‘fort’ (from Lahndā), kaṛə́y ‘ring’ (from Sindhi) (Elfenbein 1997: 758). In IA loans with retroflex consonants, the retroflexes are retained in Pashto. Regarding retroflex /ṛ/, Elfenbein (1997: 758) notes that even in loans from New Persian containing alveolar /r/, ṛ develops, e.g. daṛd < dard ‘pain’. Strand (2011) describes the process of Pashto spread and its influence on, and displacement of, other languages, starting in the 15th and 16th centuries when the Khakhay branch of Afghans entered the Laghman, Swat, and Panjkora basins, displacing their indigenous Indo-Aryan speakers. Pashto still continues to displace Indo-Aryan and Nuristani speech in the Laghman, Kabul, and Indus valleys. Strand describes a similar process of “Farsification” occurring west of Nuristan, where Dari Persian is displacing the Nuristani and Pashai languages in their westernmost valleys of Řamgal and Farazhghan.
By the last centuries of the first millennium CE, Pahlavi had developed into New Persian, and northern India was conquered by the Persian-using Turks and Iranians. During this period vast numbers of Perso-Arabic loans entered most Indian languages.

2.4.1.3.2. New Persian and South Asia

New Persian has profoundly influenced both the languages and literatures of South Asia. Alam (2003: 185) discusses the social, political, and literary influences and development of Persian in premodern India, culminating under the Mughals, before finally being eclipsed by the mid-19th century when it was replaced as the language of power by English and some vernacular languages. Persian not only spread its lexical and some morphological influences into the indigenous languages with which it came into contact, but also was itself influenced by its Indian environment, developing a new literary variety, Sabk-e-Hindi. Abidi & Gargesh 2008 discusses this “Indianization of Persian”, citing both the borrowing of words from Indian languages and the use of expressions which are semantically and emotionally Indian. Code mixing with Indian languages is found at the levels of morpheme, phrase, and clause. Compound words include one item from Persian and the other from Hindi; and the ezafe construction and the conjunctive -o- are found joining Hindi words (Abidi & Gargesh 2008: 112). The development of Urdu from Khaṛī Bolī, its further incorporation of Persian elements, and the long-lasting consequences of this, have been discussed extensively elsewhere, mostly with a focus on literature or political history, and will not be treated here. Persian influence in Bengali has been treated by Chatterji (1926: 202–214). The language came to Bengal at the beginning of the 13th century, but did not have much influence before the time of the Mughals in the last quarter of the 16th century. It reached a peak of dominance in Bengal in the 18th century, and remained the language of the courts in Bengal until 1835. From the 17th century, when Hindustani (> Urdu) became a lingua franca for north India, Persian words also began to enter Bengali indirectly through that medium. According to Chatterji, Persian influence has been mainly lexical, with adopted Persian words relating largely to the subjects of kingship, warfare, and hunting; revenue, administration, and law; Muslim religion, intellectual culture, material culture, proper names, and some 500 words relating to common things. Hilali & Haq 1967 is a dictionary of 5,186 words of Perso-Arabic origin in Bengali. It also includes a list of 26 word elements and suffixes used in Bengali word formation. Kashmir is said to have had cultural and trade relations with Persia since ancient times; however, the influence of Persian language and culture burgeoned with the introduction of Islam around the mid-14th century (Koul 2008: 9–10).
This influence continued to increase with the immigration of nobles and scholars from Persia and Central Asia, and Persian functioned as the official language of Kashmir during the rule of the Mughals and Afghans until Urdu was declared the official language by Maharaja Pratap Singh in 1907.56 Koul 2008 discusses lexical borrowings and the phonological changes undergone by Persian loanwords after entering Kashmiri, and morphological patterns involving Persian elements. These include: Persian affixes added to Kashmiri stems, e.g. be-patš ‘untrustworthy’, ləṭ-dār ‘having a tail’; Kashmiri elements attached to Persian words, e.g. nazri-tal ‘under the sight of’; and hybrid compounds with Persian as the second element, e.g. tsok-ātaš ‘very sour’, or as the first element, e.g. mōm-bəty ‘candle’. Semantic changes undergone by Persian loans in Kashmiri are also treated (Koul 2008: 90–92).57

Panjabi has acquired Perso-Arabic words at various time levels. Shackle 1978 discusses the over 1,000 Persian loanwords that occur in the Ādi Granth, the great majority of which are nouns/adjectives. The article details the types of phonological changes that occur in these words and assigns them a relative chronology. Patterns of gender assignment to Persian nouns and their assimilation into existing declension classes of the IA language of the time, as well as the semantic fields into which most loans fall, and their deployment in the religious verse of the Ādi Granth are treated in detail. Semantic change in some Persian words and idioms found in modern Panjabi is discussed in Nirvair 1975 and Chopra 2000. The effects on Sindhi literary culture of its inclusion in the Persian cultural nexus are described in Asani 2003. The article treats script issues, and the waxing and waning of Persian literary influences on Sindhi. Persian was the language of education and literature until the British conquest of Sindh in 1843, and in 1853 Sindhi and English were introduced for educational purposes.58 The extent of the penetration of Arabic and Persian words into Sindhi can be gauged by the detailed treatment given to the Sindhi pronunciations of Arabic and Persian sounds in Trumpp’s classic grammar (1872). Most modern works on the Sindhi language are, however, in Sindhi, and not accessible to non-Sindhi-knowing scholars.

Persian lexical items have entered virtually all the languages of South Asia, both as single words and as idiomatic collocations. One example is idiomatic constructions involving EAT in the meaning of ‘experience’; these are treated in Hook and Pardeshi 2009, which offers persuasive evidence for the New Persian origin of these constructions in Hindi-Urdu and Marathi.59 In addition to single words and collocations, Persian affixal elements entered South Asian languages, where they became variably productive, entering into hybrid formations of various types. Kuczkiewicz-Fraś 2003 treats such formations in Hindi and Urdu. Fewer works discuss Persian influence on morphology and syntax. Bashir (1988: 279–284) discusses the differential uses of ki-clauses in Kalasha and Khowar and their influence in introducing elements of right-branching syntax into a previously left-branching language.60 Bashir 2006b discusses morphological and syntactic influences of New Persian in other IA languages, e.g. the ubiquitous, multifunctional clause-initial conjunction ki, the ezafa construction and compounds formed with unstressed, enclitic o ‘and’. Marlow (1997) provides general information on the origin and development of ki/ke in Indo-Aryan. Further diachronic study of the introduction and spread of ki-clauses in individual languages, with attention to their functions and distribution across discourse types in those languages, would be a valuable contribution. The “Persianization” of Indian languages has been compared to the later “Englishization” of Indian languages by, among others, Shackle (1978) and Abidi & Gargesh (2008). Even after the waning of direct Persian influence on South Asian languages, the increasing influence of Urdu in Pakistan and Kashmir continues the earlier trajectory established by New Persian, with Urdu functioning as the vector carrying Perso-Arabic words into other languages.

56

Other sources give differing dates.

57

O. N. Koul 2011 is a more accessible article, containing much of the information found in A. K. Koul 2008.

58

Khubchandani (1969) mentions a thesis by Allana (1964) on Arabic elements in Sindhi, but I have not been able to access this work.

59

Burrow (1933–1935b: 789–790) mentions a single instance of the use of EAT in the meaning of experience in the Niya Prakrit, śavatha khayaṃnae ‘to take an oath’, concluding that this usage is probably a Middle Iranian influence operating first on Indian Prakrits during the rule of the Kushans (first and second centuries CE), whence it traveled to Niya. In present day Torwali, Inam Ullah (p.c. April 2012) lists five such EAT expressions, which seem to be recent calques from Urdu.

60

Bashir (1988: 420 fn. 43) points out complementary parallels between Kalasha and languages like Marathi and Dakkhini Urdu. Kalasha had/has left-branching complementation and relativization strategies in the process of accommodating (to) right-branching structures, while Marathi and Dakkhini (Urdu) show (some) right-branching structures in transition to left branching. In Kalasha the left-branching structures are the legacy of its long-term membership in the Indo-Turanian area, while in Marathi and Dakkhini they are being re-acquired as a result of more recent interaction with Dravidian.

2.4.2. Post-1947 convergence in Pakistan and Afghanistan
Edited by Elena Bashir

2.4.2.1. Recent convergence and divergence in Pakistan
By Elena Bashir

The most important single factor in recent linguistic change in Pakistan61 is the institution of Urdu as the national language, and its almost universal use as medium of education.62 This has resulted mainly in the increasing influence of Urdu on the indigenous languages of Pakistan; however, Urdu itself has absorbed elements of the indigenous languages and diverged from Urdu as spoken in India to the extent that Pakistani Urdu is now recognized as a distinct variety.63

61 Green (2011) examines the role of Urdu in Afghanistan through its participation in the “Urdusphere”: ‘For the elites of Afghanistan in the late nineteenth and early twentieth centuries Urdu was more important than Pashto — or even in some cases, Persian — as a source of ideas and a means of engagement with the world beyond their borders’ (2011: 486).
62 The exceptions to this have been Sindh, where the medium of education has remained (partially) Sindhi, and Khyber Pakhtunkhwa (former NWFP), where some primary education has been conducted in Pashto.
63 The concept of pluricentric language, i.e. a language with more than one standard variety, has been applied to “Hindi-Urdu” (Dua 1992). Now, however, divergence has progressed to the point that Urdu itself must be considered a pluricentric language, with centers in India, Pakistan, and perhaps even the diaspora. Anjum (1991) is a study of the Urdu spoken by Pakistanis settled in Texas (USA).

2.4.2.1.1. Pakistani Urdu

The divergence of Urdu usage in Pakistan was remarked on as early as 1966 (Azhar 1966a, b). Azhar argued that absorption of words from the indigenous languages of Pakistan was a natural process and would eventually result in all Pakistanis taking ownership of Urdu. A set of 24 articles published on the 50th anniversary of Pakistan’s creation (Durrani 1997) contains instances of specifically Pakistani literary usages and words absorbed from the indigenous languages; it, too, stresses the revitalizing aspect of language change. Anecdotal observations on Pakistani Urdu include comments such as: (i) Its intonation and pronunciation are more precise and formal than that of Indian Urdu. (ii) Pakistani Urdu uses the subjunctive with polite imperative force more than does Indian Urdu. (iii) The imperative ending -o, traditionally associated with the second person familiar pronoun tum, is now being used with the formal second person pronoun āp, reportedly from the desire to combine informality with politeness.64 (iv) Some expressions for notions expressed by English ‘have’ are changing so that a sentence like ‘I have time’ is now most often expressed by mere pās waqt hai, with the locative postposition ke pās, instead of the traditional mujhe waqt hai with the dative case of the one who “has”. Bashir (2011) discusses some salient features of Pakistani Urdu. One change in progress is that there is increasing uncertainty about grammatical gender assignment, resulting from differences between inherited IA gender patterns in Urdu and gender patterns in the languages with which it is interacting. Some languages of Pakistan do not have grammatical gender (Balochi, Khowar, Kalasha, Brahui), and in some (e.g. Pashto), gender patterns operate differently than they do in Urdu.65 Bashir (1991b: 23–33, 247–255) discusses and illustrates how gender assignment patterns differ in Urdu and Pashto. 
This results in a tendency to assign many nouns default masculine singular gender, and may lead to the eventual loss of the category, as happened historically in Iranian languages like Persian, and in other IA languages like Khowar and Kalasha or Bengali.66 Bashir (1999) discusses reanalysis of the postposition -ne in Pakistani Urdu as an emerging agentive marker. Rigorous analytical or quantitative study of these divergences, however, remains a desideratum. New technology and corpus linguistics tools are now available to make such work feasible.

64 This is also observed in urban India, particularly with younger speakers (Hans Henrich Hock, p.c. 21 Dec. 2014).
65 Even in Panjabi and Sindhi, some nouns have different genders than they do in Urdu, e.g. axbār ‘newspaper’ is feminine in Panjabi but masculine in Urdu. See 2.4.2.1.2.2 below for Sindhi examples.
66 Khubchandani (1963: 266) describes gender vacillation in Indian Sindhi due to stem pattern differences in Hindi and Sindhi.

2.4.2.1.2. Interactions of indigenous languages with Urdu

2.4.2.1.2.1. Panjabi

Mutual influences of Urdu and urban Panjabi are widely noted. For example, in Panjabi Urdu one notices (i) phonetic changes, such as the deaspiration of Urdu voiced aspirates /bh/, /dh/, /ḍh/, /gh/, /jh/, e.g. gobī ‘cauliflower’ (~ gobhī), or insertion of an epenthetic vowel, e.g. bahārat ‘India’ (~ bhārat); (ii) incorporation of lexical items, e.g. Panjabi faṭāfaṭ ‘quickly’ alongside Urdu jaldī or fauran; (iii) allegedly Panjabi-influenced sentence patterns, e.g. mãĩ ne jānā hai ‘I want to go/am going’, alongside mujhe jānā hai.67 There is, however, a dearth of analytical or controlled quantitative study of these phenomena and virtually no publication on these topics. Urban Panjabi is undergoing relexification from Urdu, as well as weakening of the contrasts between retroflex /ḷ/ and /ṇ/ and their dental-alveolar counterparts, contrasts which are not present in increasingly dominant Urdu. Even when Panjabi is written in Perso-Arabic script (Shahmukhi), unique characters are not (yet) used for /ḷ/ and /ṇ/.68 A perhaps even more important development deserving attention is language shift in urban Punjab from Panjabi or Saraiki to Urdu (Shackle 1970, Baart 2003, Mansoor 1993, Asif 2005).

67 This particular sentence is a special target of language purists (Shackle 1970: 247, n. 5), and has been repeated often in the literature. Note that the actual Panjabi for the equivalent sentence is mãĩ jāṇā e, and that the Urdu sentence said to result from Panjabi influence results from adding the Urdu postposition ne to what is historically an oblique form. A Google search on November 28, 2013 yielded 16,600 “hits” for mujhe jānā hai and 29,500 for mãĩ ne jānā hai. The same search on November 28, 2014 yielded 3,580 “hits” for mujhe jānā hai and 13,900 for mãĩ ne jānā hai.
68 Mizokami (1987: 27 ff.) discusses interference effects involving these two phonemes among Hindi speakers speaking Panjabi and Panjabi speakers speaking Hindi in Jullundur (India).

2.4.2.1.2.2. Sindhi

Bughio 2001 is a quantitative sociolinguistic comparison of rural and urban Sindhi with respect to three phonological variables deemed by the author to be characteristic of either urban or rural Sindhi: final short vowel retention (rural/indigenous) vs. deletion (urban/innovative);69 simple vowels /o/, /e/ (rural/indigenous) vs. diphthongs /au/, /ai/ (urban/innovative); and the presence (indigenous) or absence (innovative) of /r/, a voiced apico-alveolar trill, following retroflex /ṭ/, /ḍ/, and /ḍh/. The results indicate consistent differences between older/rural speakers and younger/urban speakers in the degree to which the innovative pronunciations are adopted. Bughio 2009 continues this discussion. Since Urdu’s being made the national language ‘marks the origins of bilingualism and multilingualism in Sindhi society’ (p. 35), these changes can be directly linked to the advent of Urdu in a dominating position in Sindh.70 Bughio 2001 also discusses lexical developments in Sindhi and their relations to Urdu (pp. 70–75).

69 Sindhi preserves final short (whispered) vowels which have been lost in most NIA languages. Thus change in this basic phonological characteristic of Sindhi is diagnostic of contact-induced influence.

Shackle 2005 discusses the morphological naturalization of loanwords in Sindhi; for instance, most Persian nouns in -a are assimilated to the masculine declension in -ō, thus darvāzō ‘gate’. Consonant-final words add a final short vowel (usually not written), e.g. umata ‘community’ (< ummat with regular loss of gemination). Gender assignment is sometimes unpredictable, and sometimes different from Urdu, e.g. kitābu ‘book’ (m.), dili ‘heart’ (f.), versus Urdu kitāb (f.) and dil (m.). Besides conjunct verbs consisting of Persian nominal elements plus a Sindhi verbalizer like karaṇu ‘to do’, a few Sindhi verbs are derived from nominal loans, e.g. dafnāiṇu ‘to bury’ (< dafn), talbaṇu ‘to seek’ (< talab), nazrījaṇu ‘to appear, be seen’ (< nazar + Sindhi derivational passive morpheme -īj-).

Khubchandani 1963 examines changes in Sindhi in India following partition. It discusses massive influence of Hindi on Sindhi because of asymmetrical bilingualism between Sindhi and Hindi, but very little influence from India’s other regional languages (p. 81). Results of this interference in phonology (syllable structure, cluster patterns, vowel articulation, suprasegmental features) and morphology (e.g. vacillating gender, change in vocative case pronunciation, declining use of pronominal suffixes, partial productivity of some borrowed affixes) are detailed. Khubchandani (1969) characterizes the overall trend in India as “tatsamization”. Interestingly, he also mentions a few innovations which are not simple influences of Hindi.
Fusion of Hindi affixes with Sindhi words sometimes yields a new form which is not parallel to the Hindi model, e.g. cuṇḍ-əkU ‘electorate’ (pp. 269–270). A phonological innovation replaces Hindi /b/ and /g/ in clusters with the Sindhi implosives /ɓ/ and /ɠ/ (p. 260), altering the relative frequencies of various consonant clusters.

70 It is interesting that English loan words show somewhat higher final vowel retention than Urdu words. This reminds one of the tendency for English loans to become more colloquial/informal than an original Urdu word. For instance, in Lahore, eirporṭ (< Eng. ‘airport’) is almost exclusively used instead of Urdu hawāī aḍḍā ‘airport’.

2.4.2.1.2.3. Balochi and Brahui

Karachi is a multilingual microcosm of Pakistan. Pashto is the third most widely spoken language there, and there is also a sizeable Balochi-speaking community.71 Farrell (2003) discusses the relative importance of the influences of Sindhi, Brahui, English, and Urdu on Karachi Balochi phonology, lexicon, morphology, and syntax. He thinks that the predominance of postpositions in Karachi Balochi is likely to be an IA influence. Bashir 2008 discusses Eastern Balochi as a language transitional between its Iranian origin and its IA neighbors, and also recent influences of Urdu. Bashir 2010 describes aspects of the pervasive influence of Urdu on the lexis and even structure of the verb system of Balochi and Brahui. Barker & Mengal 1969 contains observations on Balochi treatment of loanwords from various sources.

71 Research on the Karachi varieties of the various languages spoken there would be highly rewarding.

2.4.2.1.2.4. Khowar

Bashir 2007a discusses contact effects on Khowar, whose speakers are increasingly becoming asymmetrically bilingual in Urdu and (in southern Chitral) Pashto. In southern Chitral, where Pashto is also widely spoken, retroflex sibilants and affricates are yielding ground to the palatals. For instance, ṣapík ‘bread’ is sometimes pronounced as šapík; and bac̣hoóɫ ‘calf’ as bačhoóɫ. In this case, multiple causation may be at work, since neither Urdu nor Pashto has a contrast between palatal and retroflex sibilants or affricates. One place where change can be predicted is with the velarized /ɫ/. Neither Urdu nor Pashto has this sound, while Khowar does not have a retroflex /ṛ/. However, Khowar’s /ɫ/ is written with the letter ڑ, as is Urdu /ṛ/. With the increase of literacy, both in Urdu and in Khowar, it is possible that this graphemic ambiguity will lead to a weakening of the distinctive status of /ɫ/ in Khowar. In addition to single lexical items, the versatile Urdu verb lag-,72 the basic meaning of which is ‘be attached/contiguous to’, has started appearing in Khowar as legík in one of its extended senses. For example, lag- is used in Urdu in the meaning of ‘take/require (amount of time or money)’. The older Khowar construction for such meanings employs the Khowar verb ganík ‘to take’, but now legík can be seen in this meaning.
For example, zap tayáar bikote(n) ju ganṭá ganiír/leguúr ‘It will take two hours for the clothes to be ready.’ Other syntactic changes, involving the evidential system and the development of new imperfective constructions, are also discussed.

72 See Shapiro 1987 for discussion of the multiple meanings of lag-.

2.4.2.1.2.5. Burushaski

Morin & Dagenais 1977 studies phonological changes observed in borrowings from Urdu into Burushaski, with the caveat that words showing such changes may have come directly from Urdu, or from Persian, Khowar, or Shina. Patry and Tiffou (1997) find what they consider relexification of Yasin Burushaski with Urdu words in progress. Especially with younger speakers, a high percentage of Urdu nouns are used, but grammatical morphemes and verbs are less affected. Frembgen 1997 is a study of English loan words in Burushaski. See 2.4.1.1.2 above for discussion of an aspectual auxiliary construction believed to originate in Urdu. The language of Burushaski speakers transplanted at the end of the 19th century into Jammu and Kashmir is examined in Munshi 2006 and 2010. Their language has been subject to heavy Urdu and Kashmiri influences, with lexical influence mainly from Urdu and structural influence from Kashmiri.

2.4.2.1.2.6. Domaki

Domaki, a Central IA language transplanted to Hunza about 200–300 years ago, is a severely endangered language under pressure from Burushaski, Shina, and Urdu. Backstrom (1992b: 81) gives lexical similarity percentages between Domaki, Urdu, Burushaski, and Shina which indicate 27%, 23%, and 40% vocabulary shared with Urdu, Burushaski, and Shina, respectively. However, Buddruss (1985: 30) states that, ‘... Ḍomáaki has a characteristic morphology and syntax of its own, hardly influenced by the neighboring languages,...’ Weinreich 2010 discusses language shift from Domaki to Burushaski and Shina.

2.4.2.1.2.7. Shina

Kohistani & Schmidt 2006 is a sociolinguistic study of Shina in contemporary Pakistan. The authors note increasing use of Urdu and Pashto in Shina-speaking population centers, but find that in rural areas bilingualism is rare (p. 143). Schmidt (p.c. May 2011) comments that loanwords from Urdu to Kohistani Shina are recent, as links with down-country Pakistan have become closer, but that most loanwords occurring in her texts, e.g. γaltíi ‘mistake’, dunyá ‘world’, faráz ‘duty’, could have come from Urdu, Persian, Panjabi, or even Pashto. Voiced aspirates in Kohistani Shina (but not in Gilgiti)73 seem to have been reintroduced through (fairly recent) borrowing, though the source of this borrowing is unclear (Schmidt & Kohistani 2008: 30–32).74

2.4.2.1.2.8. Balti

Balti is a Tibeto-Burman language spoken in the Baltistan region of northeast Pakistan.
Backstrom (1992a: 6–7) noted that the Balti lexicon showed relatively little influence of either neighboring Shina or politically dominant Urdu. Despite the presence of Shina speakers around and within Baltistan, and the influence of Urdu through government and schools in the area for many years, the common Balti vocabulary thus far showed relatively little influence from these or other IA languages. The standard list of 210 words used in Backstrom’s study showed only seven apparent Urdu loans, nearly all of which are nouns referring to objects or concepts not native to Baltistan, and only one apparent Shina loanword. Sering (2002: 5), however, presents quite a different picture, in which ‘Balti is at the mercy of other languages and literatures …’ and ‘… the random adoption of foreign loan words has further adulterated Balti, resulting in code-switching in everyday conversation.’ Twenty-three years have elapsed since Backstrom’s study; assessing the contemporary (2015) situation requires comparable new research.

73 There are no voiced aspirates in Gilgit Shina.
74 I know of no further research addressing recent contact-induced change in Shina.

2.4.2.1.2.9. Wakhi

The Wakhi-speaking population moved only in the 18th and 19th centuries to Gojal (Pakistan) from Afghanistan and Tajikistan, where Persian served as the language of poetry and of wider communication. Thus Persian has had considerable impact on the language, and until about 20 years ago its function was comparable to that of Urdu today. Spoken Wakhi today has a high percentage of Persian and Arabic loanwords; however, it is almost impossible to decide whether a particular Perso-Arabic word was borrowed from Persian or from Urdu (Reinhold 2006). Mock (1998: 38) finds that language vitality is strong in Gojal, even though bilingualism and literacy in Urdu are high, especially among males. Reinhold 2006, based on fieldwork with women, focuses on describing patterns of linguistic change in the context of changed social conditions and in specific social situations. In Tajikistan, Müller et al. (2008: 23) find that Wakhi is highly vital in most of the communities where it is spoken and is only declining in communities where ethnic Wakhi are a minority.

2.4.2.1.3. The spread of Pashto

Another major development in northern Pakistan is the spread of Pashto. The territory of Pashto speakers and the influence of the language have been expanding since the 15th century, when the lower parts of Swat and Dir, as well as Bajaur, once entirely occupied by speakers of Dardic languages, were conquered and settled by Pathans migrating from the south (Weinreich 2001, 2009: 16). The advance of Pashto into formerly Dardic territories was noted as early as 1880 by Biddulph (1880: 69–70). The entry of Pashto lexical items into Torwali began during this period. Inam Ullah 2005, part of which has been published as Inam Ullah 2011, lists 650 out of 5,493 lexical entries with Pashto etymology, either original or from Persian through Pashto. There are numerous Pashto loanwords in the Shina of Indus Kohistan in the cultural domain, such as weš ‘distribution of land’, hašár ‘joint cooperative effort’, and hújra ‘men’s guest house’. The most obvious influences from Pashto are lexical, with some resulting phonological effects, such as the phoneme /x/, an old rather than recent influence. In contrast, there is less Pashto influence in Gilgiti Shina (Ruth Laila Schmidt, p.c. May 2012).


R. Nichols 2008 is a history of Pashtun migration starting from the late 18th century. Pashto has continued to advance in the northern reaches of Pakistan since Partition. In addition to contact-induced change, language shift is also occurring. In Gawri-speaking villages in upper Dir, the population is in the late stages of shift to Pashto (Baart 2003: 4–5). Weinreich 2009 is a detailed documentation of the most recent phases of Pashtun migration and expansion in Gilgit-Baltistan (formerly Northern Areas) of Pakistan; and Weinreich 2010 treats recent changes in Pashto transplanted into Gilgit-Baltistan. Shackle 1980 discusses the influence of Pashto on Kohat and Peshawar Hindko. Since 1947, with the departure of non-Muslim Hindko speakers and their replacement by Pashto speakers, Hindko has lost ground in Kohat. In 1980 there were still fair numbers of Hindko speakers, some of whom had Persian as a home language, but bilingualism with Pashto appeared to be general (pp. 486–487). The strongest influence of Pashto is seen in the lexicon, most of the Pashto loans being nouns (p. 496). Contact with Pashto does not seem to have encouraged any greater use of pronominal suffixes in Kohat Hindko than is typical of other varieties of Hindko (p. 495). Discussing Peshawar Hindko, Shackle finds that command of Pashto is increasingly general, and all educated speakers are also fluent in Urdu. The Hindko of younger speakers, especially those with higher education, tends to contain a marked proportion of partially assimilated elements, especially from Urdu and Pashto (p. 497). Shackle also mentions a negative ‘be’, found exclusively in Peshawar Hindko, which inflects only for gender and number, m.sg. nī̃-gā ‘is not’, m.pl. nī̃-ge, f.sg. nī̃-gi, f.pl. nī̃-giā̃, and thinks that this development may have been encouraged by Pashto ništa ‘is not’ (p. 505).
These forms consist of the negative element + number- and gender-agreeing gā, also used in some varieties of Panjabi in affirmative contexts to emphasize existence. Fussman (1972: 5–6) describes the situation of the Dardic and Nuristani languages in Afghanistan, commenting that the pace of their erosion accelerated after 1924, and that by May 1970 the phonological systems of certain dialects were rapidly changing, dialectal differences were eroding, and foreign lexical elements were increasingly being used. Strand 2011 describes the advance of Pashto at the expense of the Dardic and Nuristani languages of contemporary northeast Afghanistan. Lehr (2014) reports that Pashai has been deeply influenced by Pashto culturally, socially, and linguistically. Only women still count in Pashai, while men prefer to count in Pashto. Pashai words for months and days and for many everyday objects have been replaced by Pashto words (p.c. May 2012).

2.4.2.1.4. Interactions with English

All the languages of Pakistan are increasingly being influenced by English. Studies of code-mixing and code-switching are numerous: for example, Rasul 2009 on code-mixing and hybridization, Rasul 2013 on code-mixing in Urdu children’s literature, and Janjua 2011, which finds that code switching from Urdu to English is so frequent that most Urdu discourses are no longer in standard Urdu, but rather in an “Urdish” which is emerging from increasingly frequent code switching, extending even to the morphemic level. Islam 2011 is a study of morphological treatment of English loans in Urdu, Pashto, Panjabi, and Sindhi. Conversely, Qadeer 2011 is a diachronic study of the appearance of Hindi and Urdu words in various editions of the Concise Oxford English Dictionary. See also 2.7 below.

2.4.2.2. Recent developments in Afghanistan
By Lutz Rzehak

2.4.2.2.1. Trends in linguistic research

It is neither possible, nor the object of this section, to give an overview of all the specialized research on the languages of Afghanistan which has been carried out during the last decades. Instead, some general trends in linguistic research will be sketched out to show the main achievements and to reveal the main desiderata of linguistic research in and on Afghanistan. Already at the beginning of the last century, the pioneer investigator of the languages spoken in Afghanistan, Georg Morgenstierne, stated (1926: 6) that ‘Afghanistan was actually the linguistic center of the Eurasian continent, and nearly all its chief families of languages were represented there’. Six decades later Charles Kieffer (1985: 501) explained the great linguistic and ethnic variety of that region by ‘Afghanistan’s ability to amalgamate rather than assimilate’. Morgenstierne’s (1926: 2) hope ‘to come across the last of the unknown Indo-European languages which are still spoken’ steered the main directions of linguistic research in Afghanistan in the 20th century. Starting with Morgenstierne, linguistic research in Afghanistan was originally carried out in the tradition and with the objectives of historical-comparative linguistics. As a result, the main focus was not on the largest linguistic groups with respect to the number of speakers but on those languages which were supposed to represent the oldest still-observable layers of language history. Wākhī and other Pamir languages, as well as Parāčī and Ōrmuṛī were seen as the most plausible candidates among the Iranian languages, Pašaī among the Indo-Aryan languages, and above all the so-called Nūrestānī languages.75 Rules of historical phonology were thoroughly worked out for particular languages in order to find equivalents in the sound system, to allow comparisons with other languages and to establish sub-grouping within the language families; morphological and lexical features were also considered. 
Thus it became clear very soon that the Dardic languages of Afghanistan (Pašaī, Gawar-Bātī, Tirāhī) are purely Indo-Aryan idioms of a very archaic type and that the Nūrestānī languages (Katī, Waigalī, Aškūn, Prasūn) constitute a separate third branch of Indo-Iranian which, however, is closely related to and profoundly influenced by the neighboring northwest Indo-Aryan languages (Morgenstierne 1979: 25).76 Due to the lack of linguistic data older than the recordings of the late 19th and early 20th centuries, the methods of historical comparative linguistics soon reached their limits. Neither was the question of the origin of Ōrmuṛī and Parāčī finally solved,77 nor could the genetic relatedness of the Pamir languages to each other be defined clearly, though numerous isoglosses like common features in phonetics and morphology bear testimony to a certain genetic relatedness.78 Dodychudojev (1972: 468) argues that the languages which are united under the term Pamir languages cannot be traced back to a common proto-Pamir language and that only for the languages of the Shuġnī-Rušānī-group and for Yazgulāmī can a common origin be assumed. Special features which mark the Pamir languages out as a group can better be explained by centuries of contiguity and long-term processes of linguistic convergence. Grjunberg and Steblin-Kamenskij (1974: 278) therefore bring forward the idea of a sprachbund.79 Furthermore, it was virtually ruled out that any of these languages developed from Bactrian (Kieffer 1985: 511), and it was assumed that Wākhī must belong to the most ancient stratum of Iranian in Afghanistan (Morgenstierne 1979: 27). See also 1.4.2.2 above.

75 The term “Kafir languages” is inconvenient and politically incorrect because in Afghanistan the speakers of these languages are no longer “infidels” (kāfer).
76 An overview of these languages is given by G. Fussman (1972).
77 Ch. Kieffer (1977: 72–73) acknowledges an ancient group of southeastern Iranian languages which is represented today only residually by Ōrmuṛī and Parāčī, whereas Morgenstierne in one of his later publications (1979: 27) suggests that they are the last remaining vestiges of a group of southwestern Iranian dialects. Efimov (1999a: 257, 1999b: 276) assigns both languages to the group of northwestern Iranian languages but admits that at least for Ōrmuṛī this position is under dispute.
78 In phonetics, common features include: the opposition of long and short vowels, the existence of reduced vowels, monophthongization, vowel changes, the absence of double consonants in initial position, the existence of the dental affricates c and ʒ, or the contrast of velar x́ and γ́ and uvular x and γ. Special characters used in this section include: voiceless velar fricative, voiced velar fricative, voiceless uvular fricative, voiced uvular fricative, voiced uvular fricative, voiceless dental affricate, voiced dental affricate, voiced retroflex fricative, voiced postalveolar affricate. Two different characters appear for the voiced uvular fricative since they originate in different scholarly traditions. In morphology, common features are relict forms of a two-case system in nominal declension, a casus rectus and casus obliquus of personal pronouns, three verb stems (present, past, and perfect), agreement of the personal endings, and the existence of enclitic formants for person and number (Dodychudojev 1972: 465 and Payne 1989: 424).
79 See also Steblin-Kamenskij 1999: 8.

Studies in the field of historical comparative linguistics would not have been possible without more-or-less extensive language documentation. Documentation of the languages of Afghanistan ranges from word lists, through selected phrases, proverbs and short folklore samples, to comprehensive publications with longer texts on everyday life, culture, and other related subjects including glossaries and grammatical accounts. Among the latter, besides the fundamental publications of Georg Morgenstierne in the Indo-Iranian Frontier Languages series, the documentation of Dardic splinter languages and Afghan Balōčī by Buddruss (1960, 1967, 1989a), of Wākhī by Grjunberg and Steblin-Kamenskij (1976), of Munǧī and Katī by Grjunberg (1972, 1980), as well as of Ōrmuṛī by Efimov (1986) and Kieffer (2003) should be mentioned. All these documentations deal with minority or residual languages of the Indo-Iranian families and some of these languages (Ōrmuṛī and Parāčī) seem to have been almost given up by their speakers, at least inside Afghanistan today.80 The same can be said for the Moġol language of the Mongols of Herat province which was documented by Weiers (1972) and, probably, also for the language of the Arabic-speaking Arabs of Balkh and neighboring western provinces, which was not documented in a comparable way.81 Thus the main focus of language documentation was on minority or residual languages and the primary object of description was usually a single language or variety, though almost all of these publications refer in more or less detail to questions of bi- and multilingualism as well. See especially Kieffer 1977, where bi- and multilingualism are studied in relation to language shift. Considering their social role and the number of speakers, the two official languages of Afghanistan, Darī-Persian and Pashto and their varieties, were studied to a relatively lesser degree. Basic lexicographical works were published both for Pashto (Aslanov 1985, Mōmand & Sahrāī 1994) and Darī-Persian (Kiseleva 1986, Bulkin 2010).
Kiseleva 1985 gives a general description of Darī as the Afghan variety of modern Persian, and Grjunberg 1987 presents the most detailed and systematic account of Pashto grammar. Like other publications of that kind they deal mainly with the written language. Our knowledge of the spoken varieties remains insufficient, though numerous publications deal with single phenomena in grammar or in the lexicon of spoken Darī and Pashto (e.g. Bečka 1969, Kiseleva 1973, Meyer-Ingwersen 1966, Ostrovskij 1996, and Roberts 2000). Broader descriptions of particular local varieties were presented for the Pashto of Kandahar by Penzl (1955), for the Dzadrānī dialect of Pashto by Septfonds (1994), for Kābolī Persian by Farhâdi (1955) and others,82 for the Hazāragī dialect by Efimov (1965) and Dulling (1973), and for the Persian dialect of Herat by Ioannesyan (1999), but our understanding of the general dialect division of Darī-Persian and Pashto as well as our knowledge of the local and social distribution of their varieties is still superficial. A preliminary overview of the main Persian dialects of Afghanistan and their characteristic features was presented by Kieffer (1985: 505–510). The widely accepted classification of Pashto dialects is mainly based on phonological criteria relating to five different phonemes as proposed by MacKenzie (1959: 232). Other conclusions about dialect distinctions would probably arise if the lexicon and other criteria were studied and taken into consideration more systematically (Hallberg 2004: 26). Reliable linguistic data for different varieties are the main desideratum in that field of knowledge. One of the most ambitious projects of linguistic research in Afghanistan was the edition of a linguistic atlas of Afghanistan (Atlas linguistique de l’Afghanistan) by G. Redard (Bern), which, however, remained unpublished except for some lexical maps (see Redard 1974 and Kieffer 1974). Annotated maps of the Dardic and Nūrestānī languages were published by Fussman (1972). Grjunberg and Steblin-Kamenskij (1974) propose an ethno-linguistic mapping of the Eastern Hindu Kush which is based not only on the historical-genetic classification of languages but also takes into consideration questions of linguistic behavior including bi- and multilingualism. Generally speaking, sociolinguistic research in or on Afghanistan is underdeveloped. Only a few studies on related subjects are available, most of them dealing with various aspects of language standardization (see MacKenzie 1959, Kalinina 1977, Kieffer 1983, Lorenz 1990, Bauer 1995, Rzehak 2003). Bi- and multilingualism, though widespread phenomena in Afghanistan, have rarely been objects of specialized studies (see Kiseleva 1982 and Rzehak 2009).

80 Already in the 1970s Kieffer (1977: 72) stated that these languages are ‘doomed to disappear in the near future’. No verifiable information about today’s situation is available. For contemporary Ōrmuṛī as spoken in Waziristan (Pakistan) see Hallberg 2004: 53–64 and Burki 2001.
81 Some more general information was presented by Kieffer (1981).
82 Afġānīnawīs (1335) gives the most comprehensive lexicological description of Kābolī Persian. For other publications on Kābolī see Kieffer 1985: 516–517.

Linguistic research in modern Afghanistan as described above reached its peak in the 1960s and 1970s and continued partially until the middle of the 1980s. Due to the difficult security situation, almost no linguistic fieldwork has been carried out in Afghanistan during the last two decades.83 Nearly all studies which have been published since then are based on data which had been collected much earlier. Today the most important shortcoming of linguistic research in and on Afghanistan is the lack of original studies on the contemporary linguistic situation, which is no longer the same as in the 1960s and 1970s. Among the deficits, the lack of qualified linguists in Afghanistan and the low quality of linguistic education at local universities must also be mentioned.

83 Editor’s note: Lehr (2014) is a dissertation on the southeastern Pashai of the village of Amla in the Darra-e-Nur Valley, based on recent fieldwork.

Lutz Rzehak

2.4.2.2.2. Trends in language development

Language development in modern Afghanistan is characterized by processes of both linguistic convergence and divergence. The main extra-linguistic factors affecting language development during the last decades include language planning activities, the civil war, economic change and labor migration, as well as modernization in the fields of education and media. Since Persian has been the dominant language of culture, education, and government for centuries, during the reign of Zahir Shah (1933–1973) language planning activities were primarily aimed at facilitating the use of Pashto in the public spheres. Since 1933, all state officials and civil servants have been obliged to learn both Persian and Pashto. For a certain period of time salary bonuses were paid for knowledge of Pashto, and sometimes Pashto-speaking persons were privileged in getting official positions. In 1936, Zahir Shah declared Pashto an official language together with Persian. This status was fixed again for both languages in the constitution of 1964 with Dari (darī) for the first time being used as the official name of the Persian language of Afghanistan.84 In 1937 an organization named paṣ̌tō ṭolǝna (Pashto Society) was founded. It took some basic decisions for the standardization of (written) Pashto, published books and journals on Pashto language and literature, and organized Pashto language courses for civil servants.85 Both official languages are compulsorily taught at school. To this end, the territory of Afghanistan was divided by the Ministry of Education according to the dominant local language into so-called Persian-speaking and Pashto-speaking regions, with Dari-Persian or Pashto, respectively, as the language of instruction and the other to be taught as a secondary language from the third class up.
84 This name is usually traced back to the expression zabān-e darbār ‘language of the court’, implying a claim that the Persian language of Afghanistan has preserved many archaic features which were characteristic of the Persian language spoken at the courts of Khorasan in early Islamic times.
85 For details see Lorenz 1990: 109–111 and Kiseleva 1982: 96.

As a result of the favored development of Pashto, some Pashto lexemes were introduced into Dari-Persian, most of them being official terms and belonging to the field of higher education, like pōhantūn ‘university’, pōhanzai ‘faculty’, pōhānd ‘professor’, pōhanwāl ‘assistant professor’; or to the military field like dagarwāl ‘colonel’, ǧagran ‘major’, tōlai ‘company’, ġūnd ‘regiment’, and others. However, the intended Dari-Pashto bilingualism of state officials and civil servants has never become a mass phenomenon among speakers of Persian. In Afghanistan, Dari-Pashto bilingualism is common in regions with a mixed population having Pashto as the local lingua franca, but it is seldom a result of successful language planning activities. Usually speakers with Pashto as the first language have a better command of Dari-Persian as the second language than vice versa. Even many speakers of Persian who have learned Pashto at school for almost ten years remain only passive speakers of Pashto. This imbalance can partially be explained by the fact that from the point of view of psychology of learning it is, at least at the beginning level, often easier to proceed from a more synthetic language such as Pashto to a more analytical language such as Persian than vice versa. But at the same time the principles of teaching and the learning content of the Pashto courses hardly met the requirements of achieving active language skills and they remain this way today. In some cases, cultural reservations can form a psychological obstacle to the learning of Pashto.

The use of some minority languages was facilitated after the Saur-revolution of 1978. Five minority languages were officially promoted to the rank of so-called “national languages” (Dari: zabān-e mellī, Pashto: melli žǝ́ba), to be distinguished from the official languages Dari-Persian and Pashto on the one hand, and from the languages of smaller minorities on the other hand. These were Ūzbakī, Torkmanī, Pašaī, Balōčī and the so-called Nūrestānī which, actually, was Katī. All five “national languages” were to be introduced into primary education. Some fundamental questions needed to be solved to make these languages suitable for the communicative purposes arising from their new status. Whereas Pašaī and Nūrestānī exist exclusively in Afghanistan, Torkmanī, Ūzbakī, and Balōčī are also spoken in neighboring countries. For these languages, the question to be answered was whether the standard and the written language to be developed in Afghanistan should follow the standard which already existed for these languages outside Afghanistan or not. It was decided that corpus planning for these languages should take into account the specific linguistic situation of Afghanistan with Dari-Persian and Pashto as official languages and with its own traditions in the fields of terminology and education.
All writing systems were created on the basis of the Arabic-Persian alphabet, therefore, and special letters such as for retroflex sounds in Balochi were created according to the established writing tradition of Pashto in Afghanistan. Already by 1981, textbooks in Ūzbakī, Torkmanī, and Balōčī had been issued by the Ministry of Education. Each of these languages had daily 30–60 minute broadcasts on Radio Kabul and TV. In the 1980s, numerous publications appeared in these languages and research on them was carried out in a newly created branch of the Institute of Languages and Literature at the Academy of Science.86 In 1991, all language planning activities were stopped abruptly when Islamic opposition groups captured Kabul. The 1990s were dominated by civil war and political chaos. The linguistic policy of the Taliban was a preferential treatment of Pashto, which was mainly realized by pressure and force. During these years the linguistic situation in Afghanistan was primarily influenced by mass migrations to Pakistan and Iran and to a lesser degree to Tajikistan.

86 For details see Kieffer 1983, Grjunberg 1988, and Rzehak 2003.

After the civil war and the fall of the Taliban, the ethnic factor gained much in importance. For the first time in the history of Afghanistan, the ethnic composition of the population was officially described in the Constitution of 2004. In the field of language policy, the new political order mainly followed the traditions of the 1980s. The official status of Dari-Persian and Pashto was reconfirmed in Article 16 of the new constitution. In addition to that, Ūzbakī, Torkmanī, Pašaī, Nūrestānī, Balōčī, and Pāmīrī (which actually stands for Šuġnī) were given the status of a third official language in regions where the majority of the population speaks one of these languages. New attempts were undertaken in the field of corpus planning for these languages. For the first time lexicographical works on Torkmanī and Balōčī which feature the vocabulary of these languages as spoken in Afghanistan were published (Rāsex 1388, Pahwāl 1386).87

Notwithstanding all the attempts undertaken in the last century to enhance the prestige of Pashto and to facilitate its use among non-Pashto speakers, Dari-Persian remains the dominant language for processes of linguistic convergence in Afghanistan, at least on the country-wide level. Dari-Persian has a colloquial standard which is mainly based on the dialect of Kabul, which is promoted in the media and has high prestige all over the country. This standard variety is called ʿasrī ‘modern’ by many speakers of rural dialects, and its use can, in fact, stand for a corresponding way of living and thinking. On a regional level, the urban dialects of bigger cities like Herat or Mazar-e Sharif can play a similar role. Besides official terms, only very few Pashto words were really incorporated into the active vocabulary of speakers of Persian, and after 2001 even some official terms were put into question by Persian-speaking language activists.
In 2008, students revolted in Kabul and Mazar-e Sharif demanding that the Pashto word pōhantūn ‘university’ be replaced by the Persian word dānešgāh in the official Dari names of their universities. No final decision has been taken up to now. Since then the Faculty of Languages and Literature of Kabul University has had no name plate at the entrance, and the new Higher Education Act cannot pass the parliament because it is unclear which words should be used to denote the educational institutions. Such processes of politically motivated linguistic divergence reflect a new ethnic consciousness and a new language awareness of Persian-speaking groups, which some speakers of Pashto also possess. Numerous neologisms were introduced into Pashto to replace words which previously had been borrowed from or via Dari-Persian, e.g. wulusmǝšr ‘President’, ṭōlṭākǝna ‘plebiscite, referendum’, zēẓ̌īz ‘AD (Anno Domini)’, ġūrdzang in the meaning of ‘movement’. However, most of these neologisms are mainly used in the written language, whereas the spoken language both in and outside the mass media shows a different picture. Pashto has no colloquial standard, and even in regions with no significant Persian-speaking population many speakers of Pashto tend to replace well-established Pashto words by their Dari-Persian equivalents. In spoken Pashto one can observe increasing Dari-Persian influence on the level of morphology and syntax as well. Most evident are Persian ezāfa-constructions, but they are usually copied as lexical units and hence are not grammatically productive. More interesting are new prepositions and circumpositions. Today, for example, as a synonym of the compound preposition wrusta lǝ ‘after’, the circumposition lǝ … na baʿd can be used; this is, evidently, a copy of the compound Dari-Persian preposition baʿd az. As a synonym of the preposition tǝr ‘than’, the circumposition nisbat … ta has become quite common in combination with comparative forms of adjectives to indicate the benchmark. This is, likewise evidently, a copy of Dari-Persian constructions with nesbat ba ‘than’, ‘in relation to’.88

Mass migrations to Iran and Pakistan as well as the development of education and the increasing importance of electronic mass media have promoted cross-border processes of linguistic convergence. Today the lexicon of Dari-Persian is increasingly influenced by the Persian of Iran, from where many political and other terms are borrowed and spread through Afghan mass media. Iranian textbooks are widely used in academia. However, there is no significant influence of Iranian Persian on the levels of phonology and grammar.89 For Pashto, the written standard which had been developed in Afghanistan is also accepted in Pakistan today. This concerns primarily the letters for retroflex sounds which in Pakistan were previously written according to the Urdu script. Neologisms of the kind mentioned above are conducive to the development of a unified terminology for Pashto in Afghanistan and Pakistan.

87 Both dictionaries were compiled under the linguistic guidance of experts of the Department for Central Asian studies at Berlin Humboldt-University. A recent language development project for Pašaī is described in Yun 2003.

Since many Afghan intellectuals had moved to Pakistan during the civil war, Peshawar became the most important center of Pashto literature. With regard to other languages, cross-border ties are of less importance for Ūzbakī and Torkmanī, though in southwest Afghanistan Balōčī shows some influence of the modern Persian language of Iran. On the regional level, processes of linguistic convergence and divergence can show a different picture depending on the particular linguistic situation. The importance of English for linguistic development in Afghanistan has increased significantly during the last decade due to the massive presence of international troops and organizations. Thus the political, economic, and development terminology of Dari-Persian and Pashto shows many characteristics which are direct or indirect copies of English matrixes.

88 For more examples of code-copying in which Dari-Persian presents a model code that is copied in colloquial Pashto see Rzehak 2012: 88–89.
89 Only in rare cases can one come across grammatical constructions which are characteristic of Iranian Persian such as the gerund with dāštan of the type man dāram mīrawam ‘I am going’, or possessive constructions with māl-e.

2.5. Contact and convergence in the Northeast

By Shobhana Chelliah and Nicholas Lester

Though the precise number is not known, there are at least 220 distinct language varieties90 spoken in an area of approximately 250,000 km² in the Northeast Indian states of Assam, Meghalaya, Manipur, Nagaland, Tripura, Mizoram, and Arunachal Pradesh. These languages are predominantly from the Tibeto-Burman (TB) branch of the Sino-Tibetan family, but also include members of the Indo-Aryan (IA) (Assamese, Bengali, Nepali) and Austro-Asiatic (Khasi) families. The intense and varied scenarios of contact between these languages make Northeast India an ideal laboratory for refining our understanding of language change through contact.

90 Burling (2011) discusses the difficulties associated with assigning a linguistic entity the status of language or dialect in Northeast India given the contrasting uses of the terms by, most notably, linguists and tribal members: while the former hinges on considerations such as mutual intelligibility and sufficient, distinctive innovations, the latter is based largely on claims of ethnic identity and membership (in some cases, without regard to linguistic similarity).

2.5.1. Contact situations

The specific contact situations we mention here are representative of reasons for change (influence due to migration, cultural contact, trade and geography, imperfect bilingualism, demography) and examples of change (phonological, morphological, and syntactic).

2.5.1.1. Subgroup relationship

LaPolla (2001: 234–242) discusses an often noted morphological and typological divergence in Tibeto-Burman that is the result of migration patterns and diffusion. The languages spoken in the areas along the southwestern edge of the Tibetan plateau (Nepal, northern India, and to a lesser extent, Bhutan) tend to have complex verb agreement morphology while many of the languages spoken in Northeast India have simpler agreement systems and predictable agglutinative, semantically compositional morphology. In addition, languages to the west have been influenced by long-term contact with IA languages while languages to the east are characterized by long-term contact with Chinese. Matisoff (1990) calls these the Indospheric and Sinospheric languages, respectively. Like Chinese, Sinospheric languages typically have tone and are commonly isolating. The Indospheric languages, unlike other TB languages but like Indic languages, typically have retroflex stops and post-head relative pronouns, and use the verb meaning SAY as a quotative or purpose marker (LaPolla 2001: 234–235). Dryer (2003: 43–55) notes that while most TB languages are verb final (OV), some show variation in word order, which hints at change due to contact with Mon-Khmer or Tai-Kadai (e.g. the VO Karen languages). LaPolla (2001: 243–245) gives examples of shared patterns of person marking from southwestern China to Northeast India in languages such as Angami Naga, Mikir, and the Kuki-Chin languages. Since these patterns are not original to Tibeto-Burman but are attested in languages spoken in areas along known paths of migration, they have most likely developed through contact. (But see DeLancey 2010b, 2011.)

Within the languages of Northeast India, subgrouping is not certain. Geographically proximate languages appear lexically similar; the question remains, however, whether this similarity is due to contact or genetic relationship. The question is complicated by different histories of contact so that languages in proximity may reflect genetic relatedness in one part of the grammar but not in another (Donohue, Dawson & Baker 2012). The following groups, listed along with their representative languages, are included in the Northeast Indian sprachbund by Burling (2003: 178). These groupings are largely based on suggestions made by previous researchers, comparative analyses of lexicon, and in some cases (given the current scarcity of thorough documentation of many of these languages) educated guesswork. For languages grouped together on the basis of geographic and typological similarity rather than genetic relationship, see Bradley 2002: 77–78. Group (1) constitutes the most extensive and most confident sub-grouping as presented in Burling 2003. Groups (2) and (3) present local groupings within the northern state of Arunachal Pradesh and the eastern border states of Nagaland and Manipur, respectively.

1. Bodo-Konyak-Jingphaw (Assam, Meghalaya, Tripura, Manipur, Nagaland, Myanmar)
   a. Jingphaw, Singpho
   b. Luish
      i. Andro
      ii. Sengmai
      iii. Kadu
      iv. Sak
   c. Konyak Group
      i. Tangsa
      ii. Nocte
      iii. Wancho, Konyak, Phon, Chang, Khiamngan
   d. Bodo-Koch
      i. Deori
      ii. Bodo (Dimasa, Kachari, Boro, Mech, Kokborok, Tiwa, Hill Kachari)
      iii. Koch (A’tong, Ruga, Koch, Rabha)
      iv. Garo
2. Arunachal Pradesh
   a. Tshangla-Takpa (Tshangla (a.k.a. Central Monpa), Takpa (a.k.a. Northern Monpa))
   b. Sherdukpen, Bugun/Khoa, Sulung, Lishpa
   c. Hrusish (Hruso (a.k.a. Aka), Shammai)
   d. Tani (Mirish, Misingish, Abor-Miri-Dafla)
   e. Idu-Digaru (a.k.a. Mishmi)
   f. Miju (a.k.a. Mishmi)
3. Eastern Border languages (along Myanmar border)
   a. Kuki-Chin
   b. Meitei, Karbi
   c. Tangkhul Group: Tangkhul (various distinct varieties), Maring
   d. Zeme group
      i. Nruanghmei, Puiron, Khoirao
      ii. Zeme, Mzieme, Liangmai
      iii. Maram
   e. Angami-Pochuri Group
      i. Angami, Chokri, Kheza, Mao
      ii. Regma, Simi
      iii. Rengma N, Pochuri
   f. Ao group
      i. Sangtam, Yimchungru, Lotha
      ii. Yacham-Tengsa
         1. Ao-Chungli
         2. Ao-Mongsen

The complexity of reconstructing the relationship of the TB languages in Northeast India and elsewhere is covered in Section 1.8, this volume.

2.5.1.2. Cultural contact

The specifics of subgroup relationships are not easy to characterize because we know so little about the socio-political histories, migration routes, and geographic proximities of the affected languages. While we have some idea of the early history of Burma (see LaPolla 2001: 238–239) and know that the north was inhabited by the Shan and Jingphaw, the central region by the Pyu, and the southern region by the Mon (Mon-Khmer), we do not have as clear a picture of the linguistic history of Northeast India. Consider, for example, Milang, a language of the Tani family, spoken in the northeast Himalayas in Arunachal Pradesh. Sun (1993) places Milang in the Eastern Tani branch. Four of the phonological innovations identified for Eastern Tani are seen in Milang. Furthermore, 64 % of its observed lexicon is shared with Eastern Tani languages, while only 4 % of its vocabulary is
shared with Western Tani languages. Post and Modi (2011) reevaluate Sun’s 1993 proposal, hypothesizing that trade needs and geographical positioning were contributing factors for intense contact between Milang and the Eastern Tani language Padam (Adi), which led to borrowing and lexical similarity between Milang and Padam. The similarity is misleading as there are other features of Milang that are inconsistent with Proto-Tani. Thus, while it is safe to say that Milang is closely related to Tani, convergence with Eastern Tani is due to contact; similarities are not due to close genetic relationship. In addition, Post (2010) explains how the Western Tani language Galo exhibits Eastern Tani features because of geographical and cultural influence, specifically from Minyong (Adi): speakers of Galo have adopted several basic cultural practices from the Minyong community, including methods of constructing houses, weaving (a move to cotton-thread loin-looms over plant-fiber), and familial guardianship (certain apparent trends in the hierarchical organization of kinship terms that are more prolific in Eastern Tani than Western Tani indicate contact along the Siyom corridor, which provided the only practicable contact zone in an area otherwise divided from north to south by a wedge of steep mountains). Burling (2003) notes the cultural influence of Tibetan via Buddhism on the languages of northern Arunachal Pradesh. For example, Tshangla and Takpa have some shared features with Sherdukpen not due to close relatedness but due to similar religious influences. Another striking example of change through cultural contact is the religious proselytizing from Indo-Aryan Hindu communities from the south and west of the Northeast Indian region. Chelliah (1997) describes massive lexical borrowing from Bengali into Meitei due to adoption of Hindu practices by Meitei speakers, which has had the effect of reshaping the phonemic inventory of Meitei to include a voiced aspirate series. 
A different language, Bishnupriya Manipuri, resulted from proselytizing immigrants bringing the Hindu tradition of Vaishnavism to Manipur. Through the broad migration patterns of this religious movement, Bishnupriya Manipuri acquired part of its lexicon from Meitei and other TB languages spoken near Manipur (Satyanath & Laskar 2008). For example, in a primarily speech-based corpus of Bishnupriya Manipuri (composed of recordings of roughly ten hours of running speech, as well as one complete text and 909 randomly collected lexical items taken from previous studies), Satyanath and Laskar found that about 30 % of the vocabulary came from Meitei and other TB languages spoken near Manipur; the other 70 % came from IA sources (mainly Hindi and Bengali). The distribution of lexical types helps trace the complex history of contact and reveal the sources contributing to the emergence of this contact language: about 70 % of kinship terms are IA, as are many body part terms. However, distinctive cultural categories are from either IA or TB sources; almost all weaving terms and most terms for textiles are TB while agricultural terms and words about marriage rituals are close
to evenly split between the two language families. Similar asymmetries occur in the functional categories: numerals, pronouns, simple verbs, verbal morphology, and clause linking strategies are mostly IA, whereas complex verbs, adverbs, and nominal morphology are more evenly distributed between TB and IA. The diversity of sources present within and across lexical categories, along with the apparent lack of leveling in Bishnupriya, when taken together, are suggestive of several sequential and transient periods of intense contact. Asymmetries across cultural-semantic and functional domains present evidence for the relative likelihood of borrowing, receptivity, and survival of different linguistic categories while also tracing the development of an emergent and distinctive Bishnupriya identity.

2.5.1.3. Bilingualism

Examples of the influence of bilingualism on language change are noted in Coupe 2007 for Nagamese and Nagaland Nepali. Nagamese is a pidgin-like variety with a typical Northeast Indian typological profile (SOV, suffixing, lack of agreement or complex grammatical case systems) and Assamese lexicon. Nagaland Nepali is the Nepali spoken in Nagaland by Tibeto-Burman descendants of Nepali immigrants who were brought from the Himalayan foothills to Northeast India by the British as soldiers. Some are newer immigrants from Nepal. Both Nagamese and Nagaland Nepali exhibit the use of the same clause structure for disjunctive interrogatives as illustrated for the “Naga”91 languages Ao, Chung, and Khiamniungan (p. 352). Relative clause strategies of these three TB languages and the IA languages Nagamese and Nagaland Nepali show convergence. As in other IA languages, relative clause heads in Nagamese and Nagaland Nepali typically occur after the participial but, under influence of Tibeto-Burman, may occur before (pp. 353–354). Additionally, in the TB language Mongsen Ao, an interrogative pronoun is used as a relative pronoun, copying a typical IA style of relative clause formation (p. 355). But the convergence goes even further in that Ao uses a topic marker at the end of the relative clause as found in Nagamese (see also 2.6.8 below). The adoption of IA relative clause strategies by TB languages in Northeast India is not uncommon (see Chelliah 1997 for one relative clause strategy in Meitei). Bilingualism in more than one TB language also affects patterns of contact-induced language change. LaPolla (2009) provides examples of three distinct manifestations of bilingual effects in contact: substratum influence (L1>L2), superstratum influence (L2>L1), and adstratum influence (L1<>L2), which are conditioned by cognitive and behavioral habits developed in L1, L2, or cooperative convergence of the two.

91 Following Burling (2003), we will use scare-quotes around the term “Naga” to indicate the caution with which it should be used to label a group of languages that, though spoken by tribes recognizing a general ethnic affiliation, exhibit huge internal heterogeneity.

Population size and economic strength also condition the direction in which borrowing occurs. In Manipur, Meitei speakers in the valley region are demographically and economically dominant while “Naga” and Kuki speakers belong to smaller communities. The “Naga” and Kuki languages show high degrees of borrowing from Meitei, which is the state lingua franca, while Meitei exhibits far fewer borrowings from the “Naga” and Kuki languages.

2.5.1.4. Imperfect bilingualism

Imperfect learning of an additional language has been shown to create new varieties of that language. We learn from Barz & Diller (1985), for example, that Assamese developed a classifier system unusual for Indo-Aryan on the basis of contact with Tai-Ahom starting in the 13th century. By the 16th century, the Tai-Ahoms intermarried with Hindu Assamese, at which time Assamese began to be used for day-to-day communication, while Tai was used for ceremonial and literary purposes. But the Tai-Ahom, as learners of a language they perceived as having lower prestige than their own, were, it is assumed, free to speak an imperfect variety of that language which was then imitated and adopted by native speakers (p. 170). Crucially, the adoption of a classifier system was made possible because Assamese was itself undergoing morphological restructuring; for example, there was a loss of case distinctions, and due to phonological erosion, a change in how singular and plural were indicated. The stage was set for development of a classifier system that would take on some of the functions of these morphological systems that indicated definiteness. Assamese now has a substantial set of classifiers with varied function, which further illustrates the variability with which prestige and other extra-linguistic pressures on contact-induced change can affect the languages involved.

2.5.1.5. Lingua francas and simplification

Burling (2007) discusses the effects on lingua francas in the Northeast Indian state of Nagaland, where speakers of 20 mutually unintelligible TB languages have been brought in contact with each other since the 19th century through the development of city centers, roads and transportation, and also the adoption of Christianity, which is common among the groups. English is used as a lingua franca for school and government purposes, though Assamese and “Naga” speakers alike regard English as difficult to learn. Another lingua franca, Nagamese, is preferred for daily interactions. Nagamese, which, as noted earlier, has an Assamese lexicon (with unanalyzed or completely absent morphology) and simplified grammatical structure, is spoken with a variety of accents depending on the first language of the speaker. This morpho-syntactic simplification is characteristic of lingua francas generally (see McWhorter 2007 for several in-depth analyses) and suggestive of
the influences of cognitive-acquisitional processes on language change in multilingual environments, for example, the relative tenacity of L1 phonology as opposed to morphosyntax or lexicon.

In keeping with observations by McWhorter (2007), we find that for Tibeto-Burman as well, simple, regularized grammatical systems tend to result when languages are used as lingua francas (Burling 2007, DeLancey 2010a). Examples are the lingua franca varieties of Jinghpaw, i.e. Valley Jinghpaw, spoken in Northern Burma, and Singpao, spoken in Assam. Jinghpaw has verb agreement and also exhibits plural marking that resembles that of more conservative TB languages, while Valley Jinghpaw and Singpao have simplified systems. The main reason put forth for this simplification in Assam is that the conquered local population acquired “easier” parts of Jinghpaw but did not acquire more complex morphology, such as tense and aspect indicated by postverbal suffixes. Similarly, Valley Jinghpaw does not have sentence-final verbal elements. DeLancey (2010a) argues that the more complex morphosyntactic systems are indicative of original (or at least quite old) Tibeto-Burman structures; the less complex morphological systems are due to ‘a reversion towards a creoloid structure’ (p. 46). However, the time depth of verb agreement in the Tibeto-Burman family is at issue; LaPolla (2003: 32) argues that morphological complexification in Tibeto-Burman arose through contact. The Bodo-Garo languages (e.g. Boro, Dimasa, Rabha, Atong, and Garo), which are spoken throughout Assam and in northwestern Bangladesh, exhibit simpler structures than the northern members of the family: grammatical regularity, no inflectional portmanteau or fused morphology, and no morphophonological alternations (DeLancey 2010a, 2012).
DeLancey (2010a: 48) suggests that this simplification may be understood by postulating that Bodo-Garo was introduced to this region some time in the 1st millennium BCE and became a lingua franca to the existing Austroasiatic speakers by the 4th–6th centuries CE. Imperfect acquisition, or to use McWhorter’s terminology as DeLancey does, “interrupted transmission”, led to systemic simplification of a common language. This interesting theory is based mostly on linguistic evidence and argued on analogy with other, more obvious cases (e.g. those put forth in McWhorter 2007). What is needed to support the simplification theory is more data on contact influences on lingua francas for this region. A new line of research in this area exists: Dey 2012 describes the spread of Bengali as a lingua franca through Bangladesh, West Bengal, and Assam, and the phonological changes to Assamese Bengali through contact with Tibeto-Burman. Sharma 2012 describes changes to Hindi in Shillong under the influence of Khasi, Nepali, Bengali, and Bhojpuri. 2.5.1.6. Demography, family structure, and loss Language change through contact takes place more rapidly in some situations than others. Jacquesson (2008) compares the “Naga” languages from Nagaland and

Contact and convergence


Manipur with the Tani languages spoken in the Siyom River Valley. The “Naga” tribes tend to reside in isolated, densely populated, fortified villages, a vestige perhaps of their past practice of headhunting (which rightfully made them distrustful of neighboring tribes). The Tani languages, by contrast, are spoken largely along a wide latitudinal swath of Arunachal Pradesh in less densely populated villages with less distinct borders. These languages form a continuum of intelligibility, despite the sometimes adverse geographical impediments. When comparing Tani and “Naga” language cognates for ‘stone’, ‘bird’, and ‘four’, Jacquesson finds that Tani languages have very similar forms while the “Naga” languages do not. Extralinguistic factors are provided to explain this difference. The Tani languages are slower to change because these small groups communicate with each other regularly due to their mutual reliance on the Siyom; each group can contribute to a balanced economic and social existence (p. 301). As a result, many of the Tani dialects are mutually intelligible (though those spoken on the westernmost and easternmost extremes are strikingly distinctive), and there is a great deal of cross-language borrowing. The “Naga” languages, on the other hand, are spoken in a comparatively smaller area and, surprisingly, the density of population does not encourage lexical similarities across the languages. Rather, speakers, who must lay claim to scarce land resources, keep their identities maximally distinct and do not conform to or adopt the linguistic habits of outsiders. There are many examples of contact causing language attrition. We know that unbroken dialect chains are less likely to be found since the spread of Indo-Aryan. For example, Burling (2003: 178) notes that the Bodo-Koch group, which was probably the main language group in the Assam valley, is now interspersed with Bengali, Assamese, Khasi, and other TB languages.
The last thirty years have seen an influx of Nepali speakers into Darjeeling, which was primarily inhabited by Tibeto-Burman groups such as the Lepcha, Sherpa, and Bhotia. Several smaller languages are under severe threat; in the Luish group for example, Andro and Sengmai are no longer spoken although communities still claim these as their heritage languages (p. 178). Dattamajumdar (2012) reports remarkable “shrinkage” in Lepcha (such as the loss of an article, gender, and dual marking) under the influence of Tibetan and Nepali in Sikkim and West Bengal, respectively. In the Bodic language Baram the lexicon consists of only 1000 native words while the rest are from Nepali (Dhakal 2012). A prevalent cause of language attrition is intermarriage. When speakers of different groups marry, it is often the lingua franca which is used rather than either spouse’s native language. Hvenekilde (2001) found in a study of families in Shillong that children are often spoken to in English in multilingual/multicultural households. If the wife comes from a matrilineal clan structure, she will expect her native language to be the language of the home, which can potentially cause conflict if the husband belongs to a patrilineal community. Using English allows the couple to avoid these conflicting expectations. Such compromises affect what languages will be taught in what measure, and whether the household will tolerate bilingualism, monolingualism in one of the parents’ native languages, or a neutral language apart from the parents’ L1s.

2.5.2. New trends and old questions

One of the burning questions in the field of Sino-Tibetan linguistics is the reconstruction of the family. We know there is an interaction between language change, contact, and genetic affiliation, but there is still much left to do before we can be sure about how the Sino-Tibetan languages are related. Thurgood (2003) presents as good an explanation of sub-grouping as possible with the given data, but the picture will change as the evidence improves. In particular, we need more descriptions of contact scenarios, better and more language documentation, a clearer understanding of language names and speaker affiliation (see Matisoff 1986 and Burling 2003 on this point), rigorous application of reconstruction methodology (see LaPolla 2013 for an application of J. Nichols’s 1996 probabilistic thresholds to the sub-grouping of Sino-Tibetan languages), and knowledge of the cultural, geographic, and historical background of the groups in contact. This will require interdisciplinary collaborations with anthropologists and historians. It will also require training and collaboration with native scholars who have access to oral histories and other local resources. It is more or less accepted in this field of study that a strict family tree model is not warranted. Reconstructions must consider more than lexical similarities or even systematic correspondences and shared innovations in morphosyntax (thought in large part to be the most powerful reconstructive evidence). Also important are migration patterns, history, and social factors. Incorporating these factors into the reconstructive method will enrich and extend our understanding of linguistic relatedness. Relatively new ways of discussing contact phenomena are replacing older ways, which include sometimes implausible migratory accounts. 
The so-called “abrupt discontinuities” (à la McWhorter 2007) that arise with the rapid popularization and spread of lingua francas are used to explain how some languages have absorbed and mixed features of many different languages as they expand (Burling 2007). This, importantly, shows one way in which borrowing is conditioned by the extant system of the receiving language, and also constitutes a step in a much more complex reconstructive methodology which ties relevant known factors in borrowing to dialect-dependent divergences in lingua francas.

2.5.3. Major publications and online resources

Major recurring publications providing descriptive and theoretical articles on the Tibeto-Burman languages of Northeast India are: Linguistics of the Tibeto-Burman Area; publications of The Sino-Tibetan etymological dictionary and thesaurus; the Proceedings of the North East Indian Linguistics Society; Himalayan Linguistics (online); Journal of the Southeast Asian Linguistics Society (online); and publications of the Central Institute of Indian Languages. A reference volume on Tibeto-Burman languages is Thurgood & LaPolla 2003. Online resources include the Tibeto-Burman Domain (http://tibeto-burman.net/), which includes a bibliography and links to listservs and relevant publications. Sound and interlinear analysis of data on Tibeto-Burman languages can be found on the Endangered Languages Archive (http://elar.soas.ac.uk/) and the Leipzig Endangered Languages Archive (http://www.eva.mpg.de/lingua/resources/lela.php). The Mouton Grammar Library includes grammars of several Tibeto-Burman languages which include information on language contact and convergence in Northeast India. These are Chelliah 2011, Coupe 2008, Genetti 2009, van Driem 1987, and van Driem 1993.

2.6. Other contact, regional and local
By Hans Henrich Hock

2.6.1. Introduction

Besides his better-known work on general South Asian convergence, Emeneau pointed to the existence of subareas of convergence: the Northwest and the Northeast (1980b) and the Nilgiris (1989). For these areas see also 1.6.4.2.3, 2.4, and 2.5, this volume. Other publications have noted further cases of regional and local convergence; and the various clusterings of phonological features in Ramanujan and Masica’s classic study (1976) suggest yet further subareas. Many of the publications are scattered in journals or edited volumes, some of which are out of print or difficult to access. A complete survey of the literature is still a desideratum. The following presentation attempts to provide a sample of the different types of contact effects that have been proposed, their implications, and remaining uncertainties.

2.6.2. Indo-Aryan/Dravidian contact in the South

The issue of Indo-Aryan/Dravidian interaction in the South has received a great amount of attention. It is useful to distinguish two major types of interaction, although the distinction is not always clear-cut — localized (largely involving transplanted varieties of Indo-Aryan [IA]), and border-area contact.


2.6.2.1. Localized contact Contact between (generally transplanted) varieties of IA and regional Dravidian languages has been investigated intensively. The best-known cases are those of Dakkhini Urdu and Telugu (e.g. Pray 1980, Arora 2004), Mangalore Saraswat Konkani (MSK) and Kannada (Nadkarni 1975), and Sinhala and Tamil (Gair 1976, 1980, 1985). Lesser-known cases are Saurashtri and Tamil (Pandit 1972, Učida 1991), Urdu and Kannada in Bidar (Upadhyaya 1971), and Bhalavali Marathi and Kannada (Varija 2005). Though there are differences in detail and in the extent of contact-induced change, some developments seem to be common to all these cases. These include the development of a post-sentential question particle (QP) and a change in relative-correlative (RC-CC) structures requiring a QP⁹² (or some other element) to occur after the relative clause (RC); in many cases, the IA relative pronoun (RP) of the type Urdu jo is replaced by the interrogative pronoun (IP). See e.g. (6a-c) and (7a-c). These patterns have a perfect counterpart in the Dravidian languages, as in (6d) and (7d); they have therefore been plausibly attributed to Dravidian influence.

(6) a. to   baro  āssa        -ki
       he   well  be.PRS.3SG  QP
       ‘Is he well?’ (MSK; Nadkarni 1975)

    b. āe          ki  naīṁ  ki
       come.PF.PL  QP  NEG   QP
       ‘Did they come or not?’ (Dakkhini Urdu; Pray 1980)

    c. chitra  ee    potǝ  kieuwa  dǝ
       Chitra  this  book  read    QP
       ‘Did Chitra read this book?’ (Sinhala; Slade 2011)

    d. occinr        -ō  lēd  -ō
       come.PST.3PL  QP  NEG  QP
       ‘Did they come or not?’ (Telugu; Pray)

(7) a. [khanco  mhāntāro  pepar  vāccat   āssa  -ki]RC
       IP(RP)   old.man   paper  reading  is    QP
       [to  ḍākṭaru  āssa]CC
       CP   doctor   is
       ‘The old man who is reading the paper is a doctor.’ (MSK; Nadkarni 1975)

    b. [kilās  meṁ  kon     avval  ātā             ki]RC
       class   LOC  IP(RP)  first  come.IMPF.SG.M  QP
       us ku   ich   vazīfā       miltā
       CP.DAT  EMPH  scholarship  come.to.IMPF.SG.M
       ‘Who comes first in class will get a scholarship.’ (Dakkhini Urdu; Arora 2004)

    c. [yam  kumariyak       ohu  duṭuvā]RC      da
       RP    princess.INDEF  him  see.PST.3SG.F  QP
       [oo  ohu  kerehi  piḷin̆da    sit   ætikara    gattāya]CC
       she  him  toward  connected  mind  developed  get.PST.3SG.F
       ‘Whatever princess saw him fell in love with him.’ (Sinhala; Gair & Karunatilaka 1974: 295)

    d. [yāva  mudakanu  pēpar  ōdutta   iddān  -ō]RC
       IP/RP  old.man   paper  reading  is     QP
       [avanu  ḍākṭaranu  iddāne]CC
       CP      doctor     is
       ‘The old man who is reading the paper is a doctor.’ (Kannada; Nadkarni 1975)

⁹² It is commonly assumed that ki is the complementizer (e.g. Pray 1980, Nadkarni 1975); but for the mainland contacts, the alternative (question) marker ki, as in Hindi-Urdu āogī ki nahīṁ ‘Are you coming or not?’, is equally possible. The evidence of Sinhala tilts the argument in favor of the latter, since da/dǝ can only be derived from an earlier alternative marker (Skt. utāho, Pali udāhu); see Slade 2011: 176–180.

Most accounts focus on the influence of the regional Dravidian language on local Indo-Aryan. Upadhyaya (1971), however, covers Urdu influence in Bidar Kannada (including the introduction of the complementizer ki); and Swarajya Lakshmi (1984) discusses Urdu influence on Telugu. The question whether similar “reverse” influence (at the local level) occurs in the other contact cases deserves further study.

2.6.2.2. Border-area contact This section addresses the broader issue of interaction at and across the border dividing Dravidian from Indo-Aryan. (See also Sections 2.6.3, 2.6.4, 2.6.6, and 2.6.7 below.) Most of the literature focuses on the influence of Dravidian on Indo-Aryan, generally under the assumption of an earlier Dravidian substratum. Important publications are Southworth 1971, 1974, Klaiman 1977, and the more comprehensive discussion in Sjoberg 1992. In addition, Masica 1991 provides useful summaries and discussions. The features most commonly attributed to contact are the following.

• A distinction inclusive : exclusive first person plural, found in Marathi, Gujarati, and some dialects of Rajasthani (Masica 1991: 251)
• Clause-final question particles in Bangla, Marathi, Sinhala (Masica 1991: 388), to which should be added Konkani
• The use of post-cited-discourse quotative markers based on a verb of saying in Oriya, Bangla, Assamese, Dakkhini Urdu, Marathi (Masica 1991: 402–403); note also Gujarati ɛm and Marathi asa, which, like Skt. iti, derive from adverbs meaning ‘so, thus’ rather than from verbs of saying.

The third of these features is the most problematic: As Masica notes, it is also found in Nepali, as well as in Tibeto-Burman languages. Moreover, it is widespread in the Munda languages (Anderson (ed.) 2008: 81, 242, 365, 421–423, 486, 546, 612, 667), and in the languages of the Northwest (Bashir 1996). Further, beside innovated bolke, early Dakkhini Urdu has kar (Arora 2004), which has counterparts in various northern Indo-Aryan dialects (Marlow 1997). Note also Sanskrit iti, MIA tti. At the same time, much of Modern Indo-Aryan has a preposed marker ki/kē adopted from Persian kĕ (Marlow 1997); peripheral areas preserve an earlier calque, je or jo. This marker, or calques on it, is found in a large number of Dravidian languages close to the Indo-Aryan border, as well as in many Munda languages. (Some languages permit both the preposed marker and the postposed quotative marker in the same clause; see Bayer 2001 for discussion.) While Marlow’s 1997 dissertation goes a long way toward providing a satisfactory historical explanation of the “intrusion” of the preposed marker into South Asia, the history of Modern Indo-Aryan postposed quotative marking, including possible morphological renewals, remains uncertain. The case is similar for Tibeto-Burman. Classical Tibetan does not yet seem to have a quotative marker but rather what can be called “quotatival” marking, involving various combinations of a preposed verb of speaking plus preposed di ‘this’ and postposed de ‘that’, elements such as skad(a) ‘speech’, pre- and postposed ces(a) ‘thus’, as well as pre- and postposed converbal verbs of speaking, such as (ba)sgoo ‘saying’ (Hock 1982). (See also Section 2.6.8 below.) Finally, the possibility of INDIRECT influence should not be dismissed.
For instance, the quotative marking based on a converb of bol- ‘speak’ in Oriya, Bangla, and Assamese could reflect a chain of contacts: Dravidian > Munda > Eastern Indo-Aryan. Various other scenarios are imaginable, involving spread within Eastern Indo-Aryan either from a Dravidian or Munda contact situation (Oriya?) or a Tibeto-Burman one (Assamese?). In addition, Assamese, Bangla,⁹³ Oriya, Marathi, and Konkani have postclausal QPs, a feature that could be attributed to Dravidian influence. But note that Nepali has the same feature, and for both Nepali and Assamese a Tibeto-Burman origin is equally possible (see also 2.6.8 below). Indo-Aryan influence on Dravidian has been traditionally assumed for relative-correlative constructions. Thus, Nadkarni 1975 assumes that the construction was first borrowed into Dravidian and then, in modified form, into Mangalore Saraswat Konkani. The work of Ramasamy (1981), Lakshmi Bai (1985), and Steever (1988) has led to a reassessment and to the conclusion that relative-correlatives are inherited in Dravidian.

⁹³ Masica (1991: 388) recognizes only Bangla and Marathi (in addition to Sinhala, for which see Section 2.6.2.1), and his examples show that the Bangla marker ki can also occur in preverbal position.

2.6.3. “Dentalization” of palatals in Central India

Since the time of Chatterji (1926: Appendix), the split of old palatal stops into alveolar and palatal affricates found in Marathi and Konkani as well as southern Oriya is commonly attributed to Dravidian influence, since it is also found in Telugu and northern Kannada. What might favor Dravidian origin is that Telugu is at the center of the area, with IA Oriya and Marathi/Konkani on the eastern and western peripheries. Moreover, the change is found in all of Telugu; and in Telugu it is attested as early as the 7th century AD (Kolichala, p.c. October 2013). While citing Chatterji’s view without further discussion at one point in his Indo-Aryan languages (1991: 450), at another (1991: 94) Masica views the Marathi/Konkani development in the larger context of Indo-Aryan, noting a general tendency to ‘pronounce the /c/ as an alveolar (or “dental”) affricate [ts]’, including in Nepali, Eastern and Northern dialects of Bangla, some of the Marwari dialects, Northern Lahnda, Kumauni, and many West Pahari dialects. Moreover, just as in Telugu, the change is found in all of Marathi/Konkani. More than that, in the latter languages it is paralleled by a similar split of *s into s and ś. Interestingly, the area in which the split of palatal stops is found corresponds roughly to the maximum extent of the Bahmani Sultanates and earlier Deccan kingdoms such as that of the Satavahanas. This might provide a sociopolitical context for the spread of the change. At this point, however, the issue of where the change originated does not seem to be resolvable.

2.6.4. Gangetic “dentalization” and beyond

Masica (1991: 95–98, 192–193) notes the merger of retroflex nasal and lateral⁹⁴ with dental (read: alveolar) n and l in a large eastern (“Gangetic”) part of Indo-Aryan. “Dentalization” of ṇ is found in Bengali, Assamese, Nepali, and the eastern Hindi area, of ḷ in most of the Hindi area, Nepali, Garhwali, Bengali, and Assamese. (Neither of these changes reaches Oriya.) Masica further notes the presence of a contrastive velar nasal in the same general area: Bengali, Assamese, Nepali, Maithili, and Bhojpuri.

⁹⁴ On the phonological distribution of retroflex vs. “dental” nasals and laterals in late Middle Indo-Aryan and the origin of this distribution see Masica 1991: 192–193 with references.


As it turns out, “dentalization” of ṇ and ḷ also occurs in eastern Dravidian languages, including Telugu,⁹⁵ most of Central Dravidian, Kuṛux, and Malto (Subrahmanyam 2008: 11, 80, 89). In Konḍa-Pengo-Manḍa-Kuwi, ḷ merges with r, rather than l, but significantly, is lost as a separate phoneme. Moreover, the same languages merge the retroflex approximant r͍ with ḍ, r, or ṛ, the latter being the most common outcome (p. 90). The Munda languages likewise have no contrastive retroflex nasal (except Mundari and Hill Remo), and only Juang has a retroflex lateral;⁹⁶ the remaining languages have retroflex ṛ instead.⁹⁷ Except for Ho, where ŋ may be in allophonic variation with ñ, all of the Munda languages have a contrastive velar nasal. (See the contributions to Anderson (ed.) 2008.) It thus appears that there are three overlapping areas: one with only alveolar nasals and laterals (“Gangetic” Indo-Aryan, most of Munda, eastern Dravidian), one with retroflex ṛ (most of Munda and eastern Dravidian), and one with contrastive ŋ (“Gangetic” Indo-Aryan, Munda). Interestingly, on all of these counts Oriya is an outlier. Given that retroflexion is inherited only for the voiced coronal stop of Munda, that ḍ easily changes to ṛ, and that contrastive velar nasals are a prominent feature of Munda, it is tempting to attribute the observed geographical patterns to Munda influence. Further work is needed to either strengthen or question this conclusion, as well as to provide some explanation for Oriya’s outlier status.

2.6.5. Dravidian and Munda

Bhattacharya (1972, 1975) deserves credit for raising the possibility of mutual interaction between Dravidian and Munda languages in the K(h)ondmals. One feature is the appearance of an object agreement marker (for first and second-person objects) in Kui, Kuvi, Pengo, and Maṇḍa, with traces in Koṇḍa, a pattern attributed to the influence of Munda, which is well known for marking both subject and object agreement. Steever (1986) shows that the Dravidian marking can be derived as grammaticalization of the verb tar- ‘give (to first or second person)’ and argues that Munda object agreement is really (pro)noun incorporation. The first part of his argument is no doubt correct; his incorporation account for Munda is less convincing, given that object agreement can be traced to Proto-Munda (Anderson 2001). What further complicates matters is that of the Munda languages in the area, only Sora and Gorum have object (and subject) agreement, while Gutob and Remo do not, and Gtaʔ only has traces. Anderson (2003) attributes this loss of object agreement and innovated suffixal subject agreement to Dravidian. What deserves further research are the geographical and social factors that explain why some of Dravidian adopts object marking from Munda while, in the same general area, some of Munda loses object marking due to Dravidian influence.

⁹⁵ Telugu has ḷḷ in sandhi.
⁹⁶ Even in Juang this alternates with ṛ. Could the ḷ of Juang be convergent with Oriya?
⁹⁷ Gutob has no retroflex sonorants at all.

Anderson (2003) provides detailed discussion of Dravidian influence on Munda. Some of his discussion focuses on broader, prehistoric issues, such as SOV word order. Note also the widespread use of converbs or converb-like structures (see Section 2.6.7 below), and the use of relative-correlative constructions, either using Munda interrogative/indefinite pronouns, or jo and the like, borrowed from Indo-Aryan (Anderson (ed.) 2008: 83–84, 186, 291, 356, 426, 487, 546, 658, 731). Given the Indo-Aryan origin of jo, the possibility cannot be excluded that a number of these broader typological features reflect combined Dravidian/Indo-Aryan influence. More relevant for present purposes is the fact that Gutob (Munda “Gadaba”) has undergone extensive influence from neighboring Ollari (Dravidian “Gadaba”), including the addition of -u after word-final consonants. Moreover, a number of South Munda languages have acquired what Steever (1988) refers to as “Serial Verbs”: sequences of (morphologically) finite verbs with person/number (/gender) agreement. As Steever shows, this pattern is inherited in Dravidian. Further discussion of possible southern Dravidian-Munda contact developments is found in Israel 1997 and Mohanty 1997. Northern contact developments are usually covered in the larger context of interaction with Indo-Aryan or even Tibeto-Burman; see Section 2.6.7 below. More specific Munda-Dravidian contact is discussed by Kobayashi and Murmu (2008: 186), who note that both Keraʔ Mundari (8a) and Kuṛux/Malto (8b) have relative-correlative constructions with demonstratives in the RC, rather than relative or interrogative pronouns, or as an alternative. This construction also occurs in the South Munda language Juang (M.
Patnaik 2008: 546; see (8c) below), which might suggest Munda origin of the construction.

(8) a. [ini  laṛki  bāre   jagar-ke-n-a-le]RC
       DEM   girl   about  talk-PST-ITR-FIN⁹⁸-1PL
       [ini  laṛki  ka-e     hej-kan-a-e]CC
       DEM   girl   NEG-3SG  come-CONT-FIN-3SG
       ‘The girl whom we are talking about has not come.’ (Keraʔ Mundari)

    b. endr  [īs        ās-im          malk-as]RC
       QP    this.SG.M  DEM.SG.M-EMPH  NEG.COP-3SG.M
       [okk-ar  tembālagyas     ās]CC
       sit.CVB  beg.PROG.3SG.M  DEM.3SG.M
       ‘Isn’t this the man who used to sit and beg for alms?’ (Kuṛux)

    c. auri  sudure  bhangi  kuru-cere
       that  red     shirt   wear-PRF
       ere   aiɲ-a  joḷa-ɲ
       that  I-GEN  friend-1SG
       ‘The boy who is wearing a red shirt is my friend.’ (Juang)

⁹⁸ FIN indicates a marker of finiteness, also referred to as IND[icative].

Additional evidence for closer regional contact may perhaps be the fact that Mundari has subject agreement on subordinate, non-finite verbs (Osada 2008: 150), and so does Malto (Steever 1998: 380). Moreover, Keraʔ Mundari can use finite verbs (note the FIN affix -a) in subordinate function (9a) (Kobayashi & Murmu 2008: 187), and so can Kuṛux (Hahn 1911: 58); see e.g. (9b).

(9) a. abu      jagar-ke-n-a-bu       du   ini  hej’-an-a-e
       we.INCL  talk-PST-ITR-FIN-1PL  FOC  he   come-PST-FIN-3SG
       ‘He came while we were talking.’ (Keraʔ Mundari)

    b. ēn  eskan          (kī)     barckan
       I   break.PST.1SG  CPLTVE   come.PST.1SG
       ‘Having broken I came’ (Kuṛux)

Further research on the Mundari-Kuṛux/Malto interaction would be highly desirable.

2.6.6. Interaction of Indo-Aryan, Munda, and Dravidian in Jharkhand

In addition to several features that are more widespread in South Asia (e.g. echo-word formations and onomatopoeia), Osada (1991) notes a structural feature shared only by the languages of Jharkhand: a four-way present-tense distinction between existential and copula verbs, negative and positive. Abbi (1997) focuses on Indo-Aryan influence on Munda and Dravidian, both lexical and structural. In the latter category she argues for Indo-Aryan influence regarding the converb markers of Kharia (-ke or -kon) and of Kuṛux (kī), as well as embedded relative clauses in Kharia. The most comprehensive study so far, focusing mainly on Munda and the Indo-Aryan lingua franca Sadri, is Peterson 2010. He adds convergent phenomena such as a formal distinction between alienable and inalienable possession, absence of grammatical gender, the incipient development of a pronominal dual number in the Sadri of native Kharia speakers, and a complex set of developments involving (or starting out from) the genitive. Significantly, while some of these developments originate in Sadri, others come from Munda, and yet others are of uncertain origin.

2.6.7. East of 84ºE: Tibeto-Burman and Munda (± Dravidian, Indo-Aryan)?

In his coverage of Tibeto-Burman, Konow (1909: 7) argues that several features widely (even if not consistently) found in the Himalayan or Kiranti area of Tibeto-Burman are innovated and reflect the influence of Munda languages; moreover, he claims that ‘Muṇḍās, or tribes speaking a language connected with those now in use among the Muṇḍās, have once lived in the Himalayas and left their stamp on the dialects spoken at the present day’ (p. 179). These features are a vigesimal system of numerals (pp. 7, 427), dual number and exclusive vs. inclusive first-person plural and dual (pp. 179, 273–274, 427), and verb agreement (pp. 8, 179, 274–275, 427). Further, he considers the Kiranti languages to form a link to the eastern Kuki-Chin languages which likewise have verb agreement. Konow’s view has been widely accepted. Recent publications, especially of Ebert (e.g. 1993, 1999) and Neukom (e.g. 1999, 2000), have added further arguments. Underlying Konow’s claims is the notion that the features in question are not inherited in Tibeto-Burman and that they are unique to the Kiranti (± Kuki-Chin) and Munda languages. However, there are problems on both counts. Vigesimal systems are fairly widespread in greater South Asia, occurring in many of the Iranian languages, Nuristani, Dardic, and Burushaski (Edelman 1999). Similarly, the contrast exclusive : inclusive first plural is also found in Dravidian, Marathi, Gujarati, and dialects of Rajasthani. Only the dual number seems to be limited to Tibeto-Burman and Munda in the modern period, but pre-modern Sanskrit also has a dual. Moreover, the dual number is found in Tibeto-Burman languages outside the Himalayan/Kiranti area (Genetti 2007: 7). The issue of verb agreement has become controversial. Some scholars have argued for Indo-Aryan origin (Maspero 1946, Egerod 1973).
Current scholarship favors indigenous origin, with various changes and morphological renewals; see Henderson 1957, Bauman 1975, Watters 1993, Jacques 2012, and especially DeLancey 2010b, 2011 (vs. LaPolla 2001, 2003). Under the circumstances Genetti’s conclusion seems appropriate that ‘without more substantive evidence than a small number of shared typological features (which are in fact shared by a wide number of languages and language families), the hypothesis of a Munda substratum appears untenable’ (2007: 7). The Tibeto-Burman/Munda connection has, however, recently been revived and expanded, especially in work by Ebert (1993, 1999, 2009) and Neukom (1999, 2000). A strong formulation of their view is found in Ebert 2009: 1000.

… East of the 84th meridian, two areas can be set off: (A) A former contact zone between TB, A[ustro-]A[siatic], and DRAV, stretching from Nepal to Orissa, and (B) the predominantly TB northeast … The non-IA languages from Nepal to Orissa (zone A) are characterized by a complex verbal morphology, which is not characteristic of the TB relatives farther north and east and may be due either to MU[nda] … or to an unidentified third substratum … Different from the converbal structures typical of OV languages, much of the complex pattern of person and tense marking is retained in subordination in MU languages, in Kurukh, and in Kiranti languages of eastern Nepal … It seems that long contact between DRAV, MU, TB, and possibly other language groups have lead [sic] to an area little affected by the rest of South Asian language developments …

Now, it is true that the agreement morphology of Kiranti languages (10a) and Munda (10b) is more complex than what is found in most of Dravidian and Indo-Aryan, encoding not only subject but also object (whether DO or IO). However, some of the Indo-Aryan languages in Ebert’s Zone A share this phenomenon (Rājbanshi (10c), Kurmali, Maithili, and Magahi), and so do TB languages of Ebert’s Zone B such as the Kuki-Chin languages Mizo (10d) and Hmar.⁹⁹ Even some Dravidian languages have acquired object marking in addition to subject marking (2.6.5 above).

(10) a. khan-na     asen       a-in-u-na
        you.SG-ERG  yesterday  2-buy-3-ART
        meruba  pu-metta-ŋ
        goat    look-CAUS-1SG
        ‘Show me the goat you bought yesterday.’ (Athpare; Bickel 1999)

     b. ñɛl-gɔt’-ka-t’-ko-a=ko
        see-EMPH-TAM-TR-3PL(OBJ)-FIN=3PL(SUBJ)
        ‘They saw them off.’ (Santali; Anderson 2007)

     c. kalʰi     di-m-(k)u-n
        tomorrow  give-FUT-2SG-1SG
        ‘I will give (it) to you tomorrow.’ (Rājbanshi; Wilde 2008)

     d. ka-tanpui-ce
        1SG-help-2SG
        ‘I help you.’ (Mizo; Subbarao 2001)

Even more striking is the fact that agreement in these languages can encode non-arguments (11), especially possessors (with various restrictions discussed in Subbarao 2012; see also Neukom 2000).

(11) a. ŋka    n-tak-ŋa
        I.ABS  2SG(POSS)-friend-1SG
        ‘I am your friend.’ (Belhare; Bickel 2003)

     b. gǝi=ko   idi-ke-d-e-tiñ-a
        cow=3PL  take-TAM-TR-3(OBJ)-1SGPOSS-FIN
        ‘They took my cow.’ (Santali; Anderson 2007)

     c. ʌmʰa-r    gari-ḍʌ     as-ec-ku
        they-GEN  cart-CLASS  come-PRS-2SG
        ‘Their cart is coming for your (2SG) benefit.’ (Rājbanshi; Wilde 2008)

     d. zova-n   ka-kut    a-mi-sɔ:p-pek
        Zova-AG  my-hands  3SG-1SG-washed-BEN
        ‘Zova washed my hands.’ (Hmar; Subbarao 2012)

⁹⁹ Many Tibeto-Burman languages, as well as Maithili and Magahi, have portmanteau affixes, such as Maithili –ǝuk 3NON-HON → 2NON-HON.

Significant for present purposes are the following facts, which are not easily accommodated in Ebert’s Zone A hypothesis.

• Multiple agreement (MA, including non-argument agreement) is found in TB languages of Ebert’s Zone B.
• MA is found in IA languages located between the Kiranti and Munda languages.
• MA is not found in other area-relevant IA languages, including Bangla and Sadri.
• MA is also not found in the area-relevant Dravidian languages Kuṛux and Malto.

The case looks more promising as regards Ebert’s claim that the TB and Munda languages of Zone A prefer subordinate structures with retention of verb agreement to converbal ones; see e.g. (12a). While in most of the North Munda languages such structures drop the FIN marker -a(-), as in (12b), Keraʔ Mundari retains the marker and thus has a fully finite subordinate structure (12c); see 2.6.3.

(12) a. khan-na     asen       a-in-u-na
        you-SG.ERG  yesterday  2-buy-3-ART
        meruba  pu-metta-ŋ
        goat    look-CAUS-1SG
        ‘Show me the goat you bought yesterday.’ (Athpare = (10a))
     b. ba=m     hɔhɔ-iñ-khan   ñur-k-ok’-a-ñ
        NEG=2SG  call-1SG-COND  fall-OPT-MID-FIN-1SG
        ‘If you had not called me I might have fallen in the ditch.’ (Santali; Ghosh 2008)
     c. abu      jagar-ke-n-a-bu       du   ini  hej’-an-a-e
        we.INCL  talk-PST-ITR-FIN-1PL  FOC  he   come-PST-FIN-3SG
        ‘He came while we were talking.’ (Keraʔ Mundari = (9a))

The Tibeto-Burman pattern must be considered in the larger context of the pervasive phenomenon of Tibeto-Burman clausal nominalization which, as in Korean (Yoon 1996), integrates an entire – finite – clause into a matrix structure by means of a post-clausal nominalizer, e.g. the “ART” in (12a). For Munda the case is less clear. Kherwarian languages tend to add case and other markers to non-FIN structures (Anderson (ed.) 2008: 66, 119), while South Munda languages tend to employ an affix containing n (variously labeled “attributive”, “N”, and the like) before the case marker (Anderson (ed.) 2008: 345, 416, 615–616, 731–732). Could the n-affixes be related to the Austro-Asiatic nominalizing infix -n- (Diffloth & Zide 1992), which is employed in subordinating function in Kharia (Peterson 2008: 488, ex. 191)? Examination of the contributions to Anderson (ed.) 2008 yields many other formations that precede a subordinating marker, including partial or complete root or stem reduplication. In short, the issue of what, if any, general subordination strategy was employed by Proto-Munda remains uncertain.100

What seems to be more certain is the related issue of converb use (or avoidance). North Munda languages do have converbs – Santali -(ka)tɛ, Mundari -ke-(n-)ate, Keraʔ Mundari koto(r), Ho -te, Korku -ṭen, where –(a)tɛ/te is an ablative marker. South Munda languages tend to have switch reference instead, with a “same-subject marker” (SS) opposed to a “different-subject marker” (DS); see Anderson 2007: 213–227. Based on the South Munda evidence and what he considers an SS marker -ci in Mundari,101 Anderson reconstructs an SS marker (*-čǝ/čɨ(ʔ)). DS marking is less unified, with cross-linguistic variation of marking or even no marking at all. Moreover, the DS marker may have other functions, including conditional. Considering that converbs prototypically occur under same-subject conditions, it would be possible to consider Anderson’s SS *-čǝ/čɨ(ʔ) to be a converb marker in origin which was (partly) renewed by other suffixes in North Munda, and which acquired SS status through contrast with an innovated DS marker. Munda thus does provide evidence for some kind of converb strategy, contrary to Ebert’s hypothesis.

Neukom (1999) presents an attempt to distinguish an area similar to Ebert’s Zone A (but also including Zone B) from more western languages. Some of his findings are similar to what has been discussed in 2.6.4 above; but as that discussion has shown, developments regarding retroflex vs. dental/alveolar segments are not unified, and some of them extend far to the west (especially the “dentalization” of ḷ). Other findings are problematic too, such as identifying the typical Munda phenomenon of “checked” consonants with TB “creaky voice” or not distinguishing the regional phenomenon of voiced aspirated sonorants (Bangla dialects, Bhojpuri, Nepali, Newari, Chamling, Dhimal, Dhangar Kuṛux) from the much more widespread voiced aspirated stops and attributing both phenomena to Indo-Aryan influence. What does remain is the fact that both Munda and Tibeto-Burman have contrastive ŋ; and that is certainly interesting.

Ebert and Neukom, thus, have added important new evidence and arguments suggestive of a special contact relation between Tibeto-Burman and Munda, as well as other, intervening languages. Further, more detailed work, paying attention to all the area-relevant languages, would certainly be appropriate. At this point, there are still too many uncertainties to consider the situation resolved.

100 Anderson (2007: 141–142) suggests a possible inherited “non-finite” marker -ken, reflected in Gtaʔ -kne, Korku -ken. However, in his analysis of Gtaʔ in Anderson 2008, he considers -kne to probably derive from verb stem + ‘-ke tense/aspect form’ + “definite/genitive” -ne.
101 Osada (2008: 152) analyzes -ci as an (alternative) converb marker.

2.6.8. Tibeto-Burman/Indo-Aryan contact

Discussions of contact between Indo-Aryan and Tibeto-Burman tend to be “one-sided” – Indo-Aryan influence is discussed in publications on Tibeto-Burman, Tibeto-Burman influence in publications on Indo-Aryan. A few major areas of contact effects emerge from the literature.

Indo-Aryan-oriented publications focus especially on AGENT (also called “ergative”) case marking not limited to perfective structures in languages bordering on Tibeto-Burman. Masica (1991: 345) finds such marking in Assamese, Bishnupriya Manipuri, and Shina, and notes variable extension of Agent marking in Nepali (p. 347), with the remark ‘This distribution roughly coincides with that of several Tibeto-Burman languages of Nepal, including the previously culturally dominant Newari.’ Noonan (2003: 2) further notes varieties of Nepali in which Agentive marking is found consistently, without any restrictions.102

Another feature, which Masica (1991: 388) finds only in languages bordering Dravidian (Bangla, Marathi, Sinhala), is the use of post-clausal QPs. This marking is also found in Assamese (Goswami & Tamuli 2003: 436) and Nepali (Bal 2004–2007), where Dravidian influence is unlikely. Post-clausal QP marking has, however, been reconstructed for Proto-Tibeto-Burman (Hargreaves 1996). In addition, at least two languages bordering Tibeto-Burman have been characterized as preferring participial relativization to relative-correlatives – Rājbanshi (Wilde 2008: 326) and Nepali (Riccardi 2003: 575–576). Noonan (2003: 11) plausibly attributes this to the fact that relative-correlatives are not native to Tibeto-Burman, which instead uses quasi-participial nominalizations.

In the area of phonology, the merger of retroflex and dental consonants into a single alveolar series in Assamese and certain varieties of Nepali has been attributed to Tibeto-Burman influence (Goswami & Tamuli 2003: 396, Noonan 2003: 2). Note however that in the Assam area there are members of two other language families that lack the contrast dental : retroflex and have alveolars instead – Khasi and the Daic languages.

On the Tibeto-Burman side, the appearance of relative-correlatives in languages in contact with Indo-Aryan is commonly noted as resulting from that contact (e.g., Chelliah 1997: 162–163, LaPolla 2001: 235, Noonan 2003: 11, Schackow 2008: 96). What is noteworthy is that in many cases, Indo-Aryan relative pronominals are employed, although in other cases interrogatives are used. Moreover, in many languages the relative clause is followed by a marker (conditional, topic, dubitative, or the like); see, e.g. Cable 2009 for Modern Tibetan, Coupe 2007 for Mongsen Ao, Subbarao 2008: 64 for Angami. (For an example see (14a) below.)

What complicates matters is that Classical Tibetan has a superficially similar type of construction, with interrogative/indefinite pronoun in the relative clause and a post-clausal nominalizer (13a) beside simple nominalization structures (13b). However, unlike the original Indo-Aryan (and Dravidian) pattern, the RC follows the head NP and thus is center-embedded into the matrix clause. Moreover, as Beyer argues (1993: 318–326), the pronoun functions as a “dummy role particle carrier”, serving to disambiguate the grammatical role of gapped coreferential NPs in the RC. The relationship of this kind of structure to the later RC-CC structures of various Tibeto-Burman languages would be worthy of further investigation.

(13) a. saṅgs-rgyasᵢ  [gaṅᵢ   dgon-pa-la     bžugs]     -pa-s
        Buddha        QP/IP   monastery-LOC  dwell.PRS  NMLZ-AGENTIVE
        tšhos   bšad
        dharma  teach.PST
        ‘The Buddhaᵢ whoᵢ dwells in the monastery taught the dharma.’
     b. [bla-mas  bgegs  btul]     -sa
        Lama      demon  tame.PST  NMLZ
        ri-la         yod
        mountain-LOC  COP
        ‘The place where the Lama tamed the demon is on the mountain.’

102 Fuller discussion of this issue is found in Section 4.3.8 of this volume.

The use of quotative markers has likewise been attributed to Indo-Aryan influence (LaPolla 2001: 235 with reference to Saxena 1988). In phonology, the development of a separate retroflex series, contrasting with dentals, has been attributed to Indo-Aryan (e.g. Noonan 2003: 5). Finally, LaPolla (2001: 243–245) considers Tibeto-Burman verbal agreement marking to have resulted from contact with Indo-Aryan. This hypothesis contrasts with the emerging majority view that verb agreement is inherited; see DeLancey 2010b, 2011 and the discussion in 2.6.7 above.

Two recent studies stand out by addressing the issue of regional or local BIDIRECTIONAL interaction – Noonan 2003 and Coupe 2007 (see also Bendix 1974). Noonan’s observations have been integrated in the above discussion: Nepali dialects that extend agentive marking without restrictions, merge dental and retroflex into alveolar, and avoid relative-correlatives on one hand, and neighboring Tibeto-Burman varieties that employ relative-correlatives on the other. Coupe’s work shows even more remarkable bidirectional influence between Mongsen Ao and Nagamese. As (14a) shows, Mongsen Ao adopts the Indo-Aryan relative-correlative construction while keeping the Tibeto-Burman post-clausal marker; Nagamese retains the Indo-Aryan relative-correlative structure and adopts the Tibeto-Burman post-clausal marker (14b).

(14) a. pa thak ku [tʃǝ́páʔ tʃà-mì-ǝ̀r]RC la
        3SG place.LOC INTERROG do-DESID-PRS PARTICLE
        [tʃ(à)-aŋ]CC
        do.IMP
        ‘[The lightning said] to it “Whatever you want to do, do (it)”’ (Mongsen Ao)
     b. [jun  manu  ahi-se]RC  le
        RP    man   come-PST   PARTICLE
        [utu  manu  amar  kokai          ase]CC
        CP    man   my    older.brother  be.PRS
        ‘The man who came is my older brother.’ (Nagamese)

2.6.9. Local contact

Joseph (2007) argues that all language contact is local and that its effects become areal only through secondary spread. Given that, in reality, speakers are bi- or multilingual, not “languages”, this claim makes eminent sense (although it needs to be modified to make allowance for discontinuous supraregional speech communities, such as Hindi or English; see Section 2.7).103

The most famous case probably is that of Kupwar (Gumperz & Wilson 1971), where Kannada (identity language of Jain merchants), Urdu (Muslim landholders), and Marathi (landless laborers) were in a long-standing contact situation for some 300 years.104 The result of this extended contact is bi- or multidirectional convergence. In some cases, such as gender, it is the “natural”-gender pattern of Kannada that prevails (15); in others, such as adjective and pronoun morphology (Standard Kannada variation between attributive and predicative forms) as well as copula use, it is that of Urdu and Marathi (16). (A prefixed K indicates the local variety of Kannada, Marathi, and Urdu.)

103 Even before the notion “standard language” became an issue, there was a long tradition of similar bi- or multilingual interactions between supraregional languages. Consider the case of Persian as a court language introduced by Muslim rulers and coexisting with various regional languages (2.4.1.3, 2.4.2.1 above). Similarly, Brown (1854) observes that ‘Government business in Southern India is chiefly conducted in … Tamil, Telugu, Kannadi, Malayalam or Maratha’, while Muslims use Hindustani, the local native tongue, and Persian. At an earlier period, Sanskrit played a similar role, as at the Vijayanagara court, where it coexisted with Prakrit, Shauraseni, Kannada, Telugu, Tamil, and various other languages; see e.g. Wagoner 1993. (For detailed discussion with further references see Mitchell 2009.)
104 Kulkarni-Joshi (2008, 2011) provides evidence suggesting that Gumperz and Wilson’s claim may have been over-argued. She also notes that similar phenomena are found in other areas of intensive Marathi-Kannada contact.

(15)          masculine            feminine    neuter
     Ka       + human              + human     – human (default)
 vs. Ur       ± human              ± human     -----
     KUr      ± human (default)    + human     -----
 vs. Ma       ± human              ± human     ± human
     KMa      + human              + human     – human (default)

(16) Ka    ī-Ø   mane   nim-du    Ø          i-du  nim-Ø    mane   Ø
     KKa   i-d   mani   nim-d     eti        i-d   nim-d    mani   eti
     Ma.   he    ghar   tumcā     hāi105     he    tumcā    ghar   hāi
     Ur.   ye    ghar   tumhārā   hai        ye    tumhārā  ghar   hai
           this  house  your      is         this  your     house  is
           ‘This house is yours.’            ‘This is your house.’

Equally noteworthy is the study of Nagpur by Pandharipande (1982), where Standard Marathi and Standard Hindi coexist with Marathi-influenced Hindi and Hindi-influenced Marathi, and where all four varieties are used concurrently as different registers of sociolinguistic identification. Hindi influence on Nagpuri Marathi occurs in compound verb structures, adverb formation, the progressive, etc.; the converse influence is found in the emphasizing particle, conditional structures, causatives, etc.

Similar bi- or multidirectional convergence can be observed in many of the “tribal” areas, even though the situation may not appear to be as stable as the one in Kupwar, for speakers of tribal languages tend to be under great pressure to shift to the language(s) of the more prestigious – and powerful – groups. But where the relationship is more stable, as it appears to be in the case of Mongsen Ao and Nagamese (section 2.6.8 above) or the Dravidian and Munda languages of Jharkhand (section 2.6.6), this kind of convergence seems to be more robust.

The extent to which the stability of such convergence depends on social circumstances can be gauged from more recent developments in Kupwar. As Kulkarni-Joshi (2008, 2011) notes, the traditional local interaction in Kupwar between Kannada, Marathi, and Urdu is rapidly fading; parents are now bringing up their children to be Standard Marathi-dominant, in order to assure job opportunities in the new reality of India’s “Linguistic States”. In fact, the effect of privileging the regional language in Indian states at the expense of minority languages is an issue that deserves detailed, cross-area investigation.

As the discussion in the preceding sections has shown, the most common approach has been to take a broader, large-area perspective and to talk about contact influence of one language on another or, in rare cases, bidirectional influence. In many cases this may be unavoidable, given limitations of evidence and time. However, detailed study of local contact, focusing on all sides involved in the contact, should be a desideratum, if only because it provides explanations in principle for cases where such local information may not be available.

105 This is the regional Marathi form of the copula.

2.6.10. Conclusions

This contribution is only a limited survey of the different contact situations (beyond the Northwest, the Northeast, and the Nilgiris). A more comprehensive survey would certainly be desirable. At the same time, I hope that the present survey does convey a representative perspective on the extent of language contact, the great variety of languages involved, the great variation in contact-induced developments, and the similarly great variation in the confidence with which these developments can be explained. I also hope to have shown that it is desirable to pay equal attention to all local or regional languages whose speakers are in bi- or multilingual contact and to be prepared for (at least local) bi- or multidirectionality of influence – i.e. true CONVERGENCE. A fundamental problem with the unidirectional approach to language contact is that it neglects to ask what happens to the “substratum” language in the process of contact interaction; and the language undergoing “substratum” influence is in essence deprived of agency.

2.7. English and South Asian languages
By Hans Henrich Hock

2.7.1. Introduction

English has been in contact with South Asian languages for some 300 years, without extensive shift of speakers in either direction. Some effects of the contact are well known and/or well researched (e.g. major features of South Asian English pronunciation or lexical borrowing/code mixing), others are not (e.g. the syntactic influence of English, especially on languages other than Hindi-Urdu).

2.7.2. Code mixing, code switching, and lexical change

There is an extensive literature on the general issue of code mixing and code switching, with different views as to what distinguishes the two phenomena; see the survey by Sankoff (2002: 650–652). For practical purposes it is useful to distinguish between the insertion of single words or collocations from one language into another, as in (17) (code mixing), and switching between longer, syntactically defined stretches coming from two different languages (code switching), as in (18). (English words and passages are in bold.)

(17) maiṁ  to   pyūr  hindī  hī    bolnā      lāik  kartī
     I     TOP  pure  Hindi  EMPH  speak.INF  like  do.IPF.SG.F
     hūṁ
     be.PRS.1SG
     miks  karne   kā        to   kvesčan   hī    nahīṁ
     mix   do.INF  GEN.SG.M  TOP  question  EMPH  NEG
     uṭhtā …
     arise.IPF.SG.M
     ‘I like to speak pure Hindi, the question of mixing (Hindi and English) does not arise.’ (Adapted from Snell 1993: 83)

(18) a. maiṁ-ne  us-se   pūch-ā        ki    where is Saral
        I-ERG    he-INS  ask-PRF.SG.M  COMP
     b. I asked him that/ki  saral  kahāṁ  hai
                    COMP     Saral  where  be.PRS.3SG
     c. I asked him ki    where is Saral
                    COMP

        ‘I asked him where Saral was.’ (Adapted from Snell 1993: 84)

Of these two phenomena, code mixing (and the related process of borrowing106) has a significantly greater effect – as a conduit for the incorporation of English words and collocations into South Asian languages. The effect has been studied most widely for Hindi-Urdu (HU), see Snell 1993 for discussion and references; but it is found throughout South Asia. A recent discussion of Urdu-English code switching is Anwar 2007. Significantly, code mixing/borrowing is not a one-way street; it also leads to the incorporation of South Asian lexis in South Asian English (henceforth SAE), on which there is a rich literature going back to B. B. Kachru 1969, 1983; see e.g. Baumgardner 1993, 1996, Sailaja 2009, Agnihotri & Singh (eds.) 2012.

The influx of English words meets resistance in literary languages which, with the exception of Urdu and Tamil, leads to the substitution of Sanskrit-based lexis for borrowings from English. Interestingly, the semantics and use of the Sanskrit words are informed by the English words that are being “avoided” – something that has been called “covert Englishization”; see e.g. Y. Kachru 1989, Hock 1992, Snell 1993, and Section 2.7.5.2 below.

While in most languages code mixing/borrowing merely leads to increased lexical choices (with different connotations, including “neutrality” in HU), Sridhar (1978) reports a case of institutionalization of a code-mixed variety of Kannada as professional wrestlers’ jargon. Further study of similar phenomena in other South Asian languages is desirable.

106 The two phenomena are notoriously difficult to distinguish, especially since both draw on the same nativization processes.

2.7.3. Phonology

Code-mixed items are phonologically nativized, along the same lines as “ordinary” borrowings (Hock 1986[1991]: § 16.1.3). This is also true for the English stretches in code switching, as in (19),107 as well as for SAE in general. The processes involved are well known for HU (see Hock 1986[1991]: § 14.3.1). They prominently include substitution of retroflex for alveolar stops (e.g. iṭ for [ɪt]), voiceless unaspirated for (slightly) aspirated stops (as in pīpal for [piypl̩]), aspirate th for [θ], monophthongal [e:, o:, i:, u:] for [ey, ow, iy, uw] (as in ḍōnṭ for [downt]).

(19) kahte          haiṁ        ki    pīpal   ḍōnṭ   lāik  iṭ
     say.IPF.3PL.M  be.PRS.3PL  COMP  people  don’t  like  it
     ki    rāj  nārāin   aikṭs  lāik  a  fūl
     COMP  Raj  Narayan  acts   like  a  fool
     ‘They say/it is said that people don’t like it that Raj Narayan acts like a fool’

Some of the literature on Indian English addresses phonological issues in other languages, such as Prabhakar Babu 1974 (Telugu), Vijayakrishnan 1978 (Tamil), Sethi 1980 (Panjabi), Wiltshire & Harnsberger 2006 (Gujarati, Tamil). But systematic studies are rare. Especially interesting would be research on the phonological “dialectology” of Indian English. A good survey is found at http://en.wikipedia.org/wiki/Indian_English#Phonology, but without detailed references. (To the observations found there might be added the Marathi distinction between Engl. v → [vh] and w → [β].) The most recent study, Sirsa & Redford 2013, based on listening judgments by Indian English speakers, shows that linguistically trained speakers can distinguish “Telugu English” from “Hindi English”, but that there seems to be a common ‘target phonology that is distinct from the phonology of the native Indian languages’.

While South Asian phonological influence on English is considerable, converse influence seems more limited. Some speakers of Hindi suggest that the widespread change of ph to f is due to English influence; but the change is also found in Bangla and Gujarati (Masica 1991: 103) and in the Panjabi of non-English-knowing speakers, e.g. fir ‘then, so’ (instead of phir). English influence may perhaps play a role in boosting the HU use of z, rather than the traditional substitution of j for foreign [z]. Girish (2005) reports the introduction of f in Malayalam, a development shared with other Dravidian languages. Further research is needed.

107 Example (19) is invented but versions of it have been tested with native speakers.

2.7.4. Morphology

South Asian influence on Indian English morphology seems minimal. The most widespread effect is the use of the HU imperative ending -o to incorporate verbs into SAE, as in gherā-o ‘protest by encirclement’.108 English influence on South Asian morphology manifests itself most prominently in the use of long “loose compounds” such as Kannada kruṣi1 varamāna2 terige3 vināyati4 masūde5 ‘farm1 income2 tax3 exemption4 bill5’, based on English models. (Example from Sridhar 2008: 339.)

2.7.5. Morphosyntax

2.7.5.1. “Indianization” of English

As B. B. Kachru notes (1969), some of the syntactic features characterizing SAE were noticed as early as Kindersley 1938. Beyond Kachru’s summary (1969: 646–647), several other publications discuss morphosyntactic features attributable to South Asian languages. Much of the literature focuses on SAE in the northern area, especially HU, for which Bhatt (2008) and Mahboob (2008) offer excellent summaries. Lange (2012) examines the syntax of colloquial Indian English. Coverage of the Dravidian south and especially of the northwest is much less extensive. Commonly cited features of SAE include the following (the first three were already noticed by Kindersley).

• The use of the pluperfect to indicate a remote past, rather than an anterior (relative) past (Q: Did you read the book? A: I had read it.109)
• The use of would as future marker (In this paper we would prove that …)
• The use of the progressive with statives (I am knowing this)
• Invariable isn’t it? in tag questions (You are coming, isn’t it?)
• Lack of inversion in questions (What you are thinking?)
• Focus marker only (She gave it to me only)
• Argument deletion (Throw [it to me]; Interestingly, the language of the interim report, though [it] covered all the factual aspects before it, seems to have left sufficient room for the accused to wriggle out of the blame …110)
• Variable presence or absence of the definite article (Segment /h/ is playing a significant role in phonological changes in Hindi111)

108 The phenomenon of adding the English verbal ending -ofy (imperative -o + English -(i)fy) to Panjabi verbs or nouns is widespread in English-matrix sentences spoken by Panjabi speakers, e.g. taṛkofy the dāl ‘Fry onion/garlic and put it on the (cooked) dal.’ (Bashir 1982).
109 From http://www.bollywhat-forum.com/index.php?topic=17316.0
110 Times of India, 7 March 2010.
111 From a recent issue of Indian Linguistics.

While many of these features may be pan-South Asian, a comprehensive study of their distribution on the subcontinent remains a desideratum.

2.7.5.2. “Englishization” of South Asian languages

Compared to the influence of South Asian languages on SAE, the converse influence of English on the languages of South Asia has received much less attention. What seems to have received the widest attention is English influence on the use of the passive. In Dravidian, English has been held responsible for a more common use of the passive (Marar 1971, Sridhar 2008). In Indo-Aryan, the innovation lies in the use of structures with overt agents such as (20a), in contrast to the traditional (in-)capabilitive value of the construction (20b); see B. N. Patnaik & Pandit 1986, Y. Kachru 1989, Masica 1991: 358. In both Dravidian and Indo-Aryan, the new/increased passive use is especially found in journalistic, governmental, and scholarly contexts.

(20) a. bhārat  sarkār-dvārā    ek  nay  regyulešan  anāuns kiyā gayā hai
        India   government-INS  a   new  regulation  announce.PASS.PFV.PRS
        ki
        COMP
        ‘A new regulation has been announced by the Government of India that …’
     b. mujh-se  yah   kām   nahīṁ  kiyā jātā hai
        I-INS    this  work  NEG    do.PASS.PRS
        ‘I can’t do this work.’ (lit. ‘This work is not done by me.’)

Related to this increased use of the passive is the Hindi type kahā jātā hai ‘it is said’ for earlier kahte haiṁ ‘(they) say’ (Mishra 1963). Another feature that has received wider attention is the use of post-nominal, “English-type” relative clauses as in (21a), instead of the older and, for most speakers, preferable112 relative-correlative (RC-CC or CC-RC) types (21b) and (21c).

(21) a. vah   ādmī  [jo  vahāṁ  baiṭhā (huā)  hai]RC   bahut  hośiyār
        that  man   RP   there  sit.PFV.SG.M  AUX.3SG  very   smart
        hai
        be.PRS.3SG
        ‘The man who is sitting there is very smart.’
     b. [jo  ādmī  vahāṁ  baiṭhā (huā)  hai]RC   [vah  (ādmī)  bahut
        RP   man   there  sit.PFV.SG.M  AUX.3SG  CP    (man)   very
        hośiyār  hai]CC
        smart    be.PRS.3SG
     c. [vah  ādmī  bahut  hośiyār  hai]CC      [jo  (ādmī)  vahāṁ
        CP    man   very   smart    be.PRS.3SG  RP   (man)   there
        baiṭhā (huā)  hai]RC
        sit.PFV.SG.M  AUX.3SG

112 For Bangla, Dasgupta states that structures of the type (21a) tend to be avoided (2003: 389).

Patnaik and Pandit (1986) attribute the Oriya counterpart of (21a) directly to English influence (see also Snell 1993), whereas Masica (1991: 414) and Y. Kachru (2008: 101) allow for a combination of earlier Persian and later English influence. Puri (2011) shows that postnominal RCs are a recent innovation in HU and first appear in translations from English; the assumption of earlier Persian influence so far remains a guess.

Beside (21a) there is a superficially similar, but more common alternative of the type (21d), in which the post-nominal RC is followed by a correlative – or resumptive? – pronoun. Some Indo-Aryan varieties apparently do not have (21a), but do have (21d); see Pandharipande 1997: 78–80 for Marathi and Wilde 2008: 324 for Rājbanshi. Structures of this type raise further questions regarding the origin of post-nominal RCs in Indo-Aryan. (Marlow 1994 argues that the pre-RC element in the type (21d) originally is left-extracted and that the type (21a) results from reanalysis of (21d).)

(21) d. vah   ādmī  [jo  vahāṁ  baiṭhā huā    hai]RC        vah  bahut  hośiyār
        that  man   RP   there  sit.PFV.SG.M  be.PRS.3SG.M  he   very   smart
        hai
        be.PRS.3SG
        ‘The man who is sitting there (he) is very smart.’

A feature that has been noted for both Kannada and HU is the use of prenominal (relative) participles as in (22a), especially in newspaper headlines; see Sridhar 2008: 338–339, Snell 1993: 86 (with reference to Lakshmi Bai 1984). Snell further observes that in structures of this sort the Sanskrit past participle (such as likhit ‘written’) is preferred to the corresponding Hindi form (likhā huā) – another case of covert Englishization. Elena Bashir (p.c. 8 April 2013) considers structures like (22b) perfectly normal and probably going back to pre-English influence – another issue that deserves further study.

(22) a. … sangatiyannu  jñāpakadalliṭṭira-bēkenda   pradhani avaru
        … fact.ACC      remember-must.say.REL.PPL   prime minister.HON
        ‘The Prime Minister, saying that we must remember the fact (that …)’
     b. angrezī  ke        mādhyam-se  kām karne vāloṁ      ke liye
        English  GEN.OBL.  medium-INS  work-doing.OBL.PL.M  for
        ‘for (those) working in the English medium’

In addition there are numerous references to English influence on individual South Asian languages, e.g. the contributions in Krishnamurti & Mukherji (eds.) 1984; but many are “hidden” in more general publications. A few of these are the following. George (1972: 15) notes the use of oru, lit. ‘one’, as indefinite article in Malayalam – a more widespread phenomenon in South Asian languages, but in languages like Hindi-Urdu such use of ek ‘one’ is better glossed as ‘a certain’, which might suggest indigenous development.113 Snell (1993) offers a range of phenomena in HU, including the use of right-peripheral conditional clauses, an increasing use of indirect, rather than direct speech, and the use of the progressive as near-future. Bashir (2006: 26) suggests that the increasing tendency of HU not to omit the present-tense auxiliary in negated clauses may reflect English influence. Bashir (2010: 31, 34) attributes the (increased) use of the progressive in Brahui (and Balochi) to Urdu and English influence – perhaps the only coverage so far of a language on the northwestern periphery.

Beyond sentence syntax, Y. Kachru (1989) and Snell (1993) note English influence on Hindi rhetorical structure, especially in officialese registers; and so does Hock (1992) for Sanskrit. Just as in the case of the “Indianization” of English, a more comprehensive investigation of the Englishization of South Asian languages, including regional variation, would be highly desirable.

113 In fact, the phenomenon is also seen in the “singulative marker”/“indefinite article” from ‘one’ in Burushaski, where English influence is unlikely (Elena Bashir p.c. November 2014).

2.7.6. Conclusions and implications

As has been observed by most of those working on the language, SAE has become indigenized, used primarily for communication within South Asia. What is less clear is whether it has regional variations, whether these are getting leveled out through communication across the regions, or whether regional “dialects” of SAE are developing.

A related issue is whether different national varieties of SAE are emerging. Comparison of Bhatt 2008 and Mahboob 2008 suggests that Pakistani and Indian English differ little. Too little has been published on Bangladeshi and Sri Lankan SAE to assess whether they are developing in a different direction. In short, much additional work is needed.

Finally, the interaction of English and the South Asian languages is a two-way street. Although “Englishization” and “Indianization” tend to be compartmentalized (even by scholars who have worked on both issues), they must be seen in the broader context of non-replacive, long-standing bilingualism – leading to convergence, rather than unidirectional “substratum” or “superstratum” influence. (Note however that in phonology, the structure of the South Asian languages prevails.)

Bibliographical references

Abbi, Anvita 1992 Reduplication in South Asian languages: An areal, typological, and historical study. New Delhi: Allied Publishers. Abbi, Anvita 1997 Languages in contact in Jharkhand: A case of language conflation, language change and language convergence. In: Abbi (ed.) 1997: 131–148. Abbi, Anvita 2006 Endangered languages of the Andaman Islands. München: LINCOM. Abbi, Anvita (ed.) 1997 Languages of tribal and indigenous peoples of India: The ethnic space. Delhi: Motilal Banarsidass. Abidi, S. A. H., and Ravinder Gargesh 2008 Persian in South Asia. In: Kachru, Kachru & Sridhar (eds.) 2008: 103–120. Acson, Veneeta A., and Richard L. Leed (eds.) 1985 For Gordon H. Fairbanks. Honolulu: University of Hawaii Press. Afġānīnawīs, ’Abdollah 1335 [= 1956/1957] Loġāt-e ‘āmiyāna-ye fārsī-ye afġānestān [Colloquial words of the Persian of Afghanistan]. Kābol: Mo’assesa-ye Balx. (For additional spellings of the author’s name and publication information, see http://viaf.org/viaf/63887581/ (accessed 15 Dec. 2014).) Agesthialingom, S., and N. Rajasekharan Nair (eds.) 1981 Dravidian syntax. (Annamalai University Publications in Linguistics, 73.) Annamalai University. Agesthialingom, S., and S. V. Shanmugan (eds.) 1972 Third seminar on Dravidian linguistics. Annamalainagar: Annamalai University. Agnihotri, Ramakant, and Rajendra Singh (eds.) 2012 Indian English: Towards a new paradigm. Hyderabad: Orient BlackSwan. Aikhenvald, Alexandra 2003 Evidentiality in typological perspective. In: Aikhenvald & Dixon (eds.) 2003: 1–31. Amsterdam: Benjamins.

Contact and convergence

Aikhenvald, Alexandra Y., and R. M. W. Dixon (eds.) 2001 Areal diffusion and genetic inheritance: Problems in comparative linguistics. Oxford: Oxford University Press. Aikhenvald, Alexandra Y., and R. M. W. Dixon (eds.) 2003 Studies in evidentiality. Amsterdam: Benjamins. Alam, Muzaffar 2003 The culture and politics of Persian in precolonial Hindustan. In: Pollock (ed.) 2003: 131–198. Allana, Ghulam Ali 1964 The Arabic elements in Sindhi. University of London MA thesis. American Institute of Pakistan Studies n.d. http://www.pakistanstudies-aips.org/programs/other-aips-projects/plasp (accessed 16 Dec. 2014) Anderson, Gregory D. S. 2001 A new classification of South Munda: Evidence from comparative verb morphology. Indian Linguistics 62: 27–42. Anderson, Gregory D. S. 2003 Dravidian influence on Munda. International Journal of Dravidian Linguistics 32(1): 27–48. Anderson, Gregory D. S. 2007 The Munda verb: Typological perspectives. Berlin/New York: Mouton de Gruyter. Anderson, Gregory D. S. (ed.) 2008 The Munda languages. London/New York: Routledge. Andronov, Mikhail S. 1968 Two lectures on the historicity of language families. Annamalainagar: Annamalai University. Andronov, Mikhail S. 2001 A grammar of the Brahui language in comparative treatment. München: LINCOM. Andronov, Mikhail S. 2003 A comparative grammar of the Dravidian languages. Wiesbaden: Harrassowitz. Andronov, Mikhail S. 2006 Brahui, a Dravidian language: A descriptive and comparative study. München: LINCOM. Anjum, Tanveer 1991 Urdu-English code-switching in the speech of Pakistani women in Texas. University of Texas, Austin, PhD dissertation. ProQuest Dissertations 9212482. Anthony, David W. 2007 The horse, the wheel, and language: How Bronze-Age riders from the Eurasian steppes shaped the modern world. Princeton: Princeton University Press. Anwar, Behzad 2007 Urdu-English code-switching: The use of Urdu phrases and clauses in Pakistani English (a non-native variety). ESP Worldwide 17: 3–14. 
http://www.espworld.info/Articles_16/PDF/Urdu_English.pdf (accessed 24 November 2014)

Archer, Bernice 2003 Acquiring a multilingual repertoire in Quetta, Balochistan. In: Jahani, Korn & Gren-Eklund (eds.) 2003: 157–168. Arora, Harbir 2004 Syntactic convergence: The case of Dakkhini Hindi-Urdu. Delhi: University of Delhi. Asani, Ali 2003 At the crossroads of Indic and Iranian civilizations: Sindhi literary culture. In: Pollock (ed.) 2003: 612–646. Asif, Saiqa Imtiaz 2005 Shame: A major cause of ‘language desertion’. Journal of Research 8: 1–13. Multan: Bahauddin Zakariya University, Faculty of Languages & Islamic Studies. Aslanov, Martiros G. 1985 Puštu-russkij slovar’ [Pashto-Russian dictionary]. Moskva: Russkij Jazyk. Azhar, A. D. 1966a Urdū yā Pākistānī? Nusrat 10 July 1966: 7–17. Azhar, A. D. 1966b Pākistani Urdū aur Hindūstānī Urdū. Nusrat 14 December 1966: 9–25. Baart, Joan L. G. 2003 Sustainable development and the maintenance of Pakistan’s indigenous languages. In: Proceedings, Conference on State of Social Sciences and Humanities: Current Scenario and Emerging Trends, Islamabad Dec. 15–17, 2003, 202–213. Islamabad: Committee on the Development of Social Sciences and Humanities, Higher Education Commission (HEC)/United Nations Educational, Scientific and Cultural Organization (UNESCO). http://unesdoc.unesco.org/images/0013/001352/135261eo.pdf (accessed 28 Nov. 2014) Baart, Joan L. G. 2003 Tonal features in languages of northern Pakistan. In: Joan L. G. Baart and Ghulam Hyder Sindhi (eds.), Pakistani languages and society: Problems and prospects, 132–144. Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics. Backstrom, Peter C. 1992a Balti. In: Backstrom & Radloff (eds.) 1992: 3–27. Backstrom, Peter C. 1992b Domaaki. In: Backstrom & Radloff (eds.) 1992: 77–83. Backstrom, Peter C., and Carla F. Radloff (eds.) 1992 Languages of Northern Areas. (Sociolinguistic Survey of Northern Pakistan, 2). 
Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics. Bagchi, P. C. (transl.) 1975 Pre-Aryan and pre-Dravidian in India. [Translation and edition of essays by Jules Bloch and Sylvain Lévi.] New Delhi: Asian Educational Services. Bal, Bal Krishna 2004–2007 Structure of Nepali grammar. PAN Localization, Working Papers 2004–2007: 332–396. http://www.panl10n.net/english/outputs/Working%20Papers/Nepal/Microsoft%20Word%20-%207_E_N_396.pdf (accessed 16 Dec. 2014)

Baranzehi, Adam Nader 2003 The Sarawani dialect of Balochi and Persian influence on it. In: Jahani, Korn & Gren-Eklund (eds.) 2003: 75–111. Barker, Muhammad Abd-al Rahman, and Aqil Khan Mengal 1969 A course in Balochi, volumes I and II. Montreal: Institute of Islamic Studies, McGill University. Barth, Fredrik 1964 Ethnic processes on the Pathan-Baluch boundary. In: Indo-Iranica; mélanges présentés à Georg Morgenstierne, à l’occasion de son soixante-dixième anniversaire, 13–20. Wiesbaden: Harrassowitz. Barz, R. K., and A. V. N. Diller 1985 Classifiers and standardization: Some South and South-East Asian comparisons. In: David Bradley (ed.), Papers in South-East Asian linguistics No. 9: Language policy, language planning, and sociolinguistics in South-East Asia, 155–184. Canberra: Department of Linguistics, Research School of Pacific Studies, Australian National University. Bashir, Elena 1982 Hybrid words in -ify in South Asian languages. MS, University of Wisconsin, Madison. Bashir, Elena 1988 Topics in Kalasha syntax: An areal and typological perspective. University of Michigan PhD dissertation. ProQuest Dissertations 8821545. Bashir, Elena 1991a A contrastive analysis of Brahui and Urdu. Peshawar, Pakistan/Washington, DC: Directorate of Primary Education, NWFP/Academy for Educational Development. Bashir, Elena 1991b A contrastive analysis of Pashto and Urdu. Peshawar, Pakistan/Washington, DC: Directorate of Primary Education, NWFP/Academy for Educational Development. Bashir, Elena 1996a Mosaic of tongues – Part II: Quotatives and complementizers in Northwest Indo-Aryan, Burushaski, and Balti. In: Hanaway & Heston (eds.) 1996: 187– 286. Bashir, Elena 1996b The areal position of Khowar: South Asian and other affinities. In: Elena Bashir & Israruddin (eds.), Proceedings of the Second International Hindukush Cultural Conference, 167–179. Karachi: Oxford University Press. Bashir, Elena 1997 Burushaski-Khowar commonalities. 
Third International Himalayan Languages Symposium, 18–20 July 1997, Santa Barbara, California. Bashir, Elena 1999 The Urdu postposition ne: Its changing role in the grammar. In: Rajendra Singh (ed.), The yearbook of South Asian languages and linguistics 1999, 11–36. New Delhi/London: Sage. Bashir, Elena 2001 Khowar-Wakhi contact relationships. In: Dirk W. Lönne (ed.), Tohfa-e-Dil: Festschrift Helmut Nespital, 3–17. Reinbek: Wezler.

Bashir, Elena 2003 Dardic. In: Cardona & Jain (eds.) 2003: 818–894. Bashir, Elena 2006a Change in progress: Negation in Hindi and Urdu. In: Singh (ed.) 2006: 3–29. Bashir, Elena 2006b Indo-Iranian frontier languages. In: Ehsan Yarshater (ed.), Encyclopaedia Iranica. http://www.iranicaonline.org/articles/indo-iranian-frontier-languagesand-the-influence-of-persian (accessed 29 December 2014) Bashir, Elena 2007a Contact-induced change in Khowar. In: Shafqat Saeed (ed.), New perspectives on Pakistan: Contexts, realities and visions of the future, 205–238. Karachi: Oxford University Press. Bashir, Elena 2007b Evidentiality in South Asian languages. In: Miriam Butt and Tracy Holloway King (eds.), Proceedings of the LFG-06 Conference, 30–50. http://cslipublications.stanford.edu/LFG/11/lfg06.html (accessed 21 Dec. 2014) Bashir, Elena 2008 Some transitional features of Eastern Balochi: An areal and diachronic perspective. In: Jahani, Korn & Titus (eds.) 2008: 45–82. Bashir, Elena 2009 Wakhi. In: Windfuhr (ed.) 2009: 825–862. Bashir, Elena 2010 Innovations in the negative conjugation of the Brahui verb system. Journal of South Asian Linguistics 3(1): 23–43. http://tiger.sprachwiss.uni-konstanz.de/~jsal/ojs/index.phpjsal/article/view/21/17 (accessed 15 Dec. 2014). Bashir, Elena 2011 Urdu and linguistics: A fraught but evolving relationship. The Annual of Urdu Studies 26: 97–123. http://www.urdustudies.com/pdf/26/11ElenaBashir.pdf (accessed 28 Nov. 2014) Bashir, Elena, and Peter Edwin Hook Forthcoming An oral history in the Shina of Gurez. Bashir, Elena, Madhav M. Deshpande, and Peter Edwin Hook (eds.) 1987 Select papers from SALA-7: South Asian Languages Analysis Roundtable Conference. Bloomington, IN: Indiana University Linguistics Club. 
Bauer, Erhard 1995 Sprachlicher Wandel als Ergebnis politisch motivierter Sprachlenkung: Neuere Entwicklungen im Paṣ̌to Afghanistans [Language change as a result of politically motivated language management: Recent developments in Pashto]. In: Bert G. Fragner et al. (eds.), Proceedings of the Second European Conference of Iranian Studies, 53–63. (Serie Orientale Roma LXIII.) Roma: Istituto Italiano per il Medio ed Estremo Oriente. Bauman, James 1975 Pronouns and pronominal morphology in Tibeto-Burman. University of California, Berkeley, PhD dissertation. ProQuest Dissertations 7615103. Baumgardner, Robert J. 1993 The English language in Pakistan. Karachi: Oxford University Press.

Baumgardner, Robert J. 1996 South Asian English: Structure, use, and users. Urbana: University of Illinois Press. Bayer, Josef 2001 Two grammars in one: Sentential complements and complementizers in Bengali and other South Asian languages. In: Bhaskararao & Subbarao (eds.) 2001: 11–36. Bečka, Jiřī 1969 A study in Pashto stress. (Dissertationes orientales, 12.) Prague: Oriental Institute in Academia, Czechoslovak Academy of Sciences. Bellwood, Peter 2009 Early farmers: Issues of spread and migration with respect to the Indian subcontinent. In: Toshiki Osada (ed.), Linguistics, archaeology and human past in South Asia, 55–70. Delhi: Manohar. Bendix, Edward H. 1974 Indo-Aryan and Tibeto-Burman contact as seen through Nepali and Newari verb tenses. International Journal of Dravidian Linguistics 3(1): 42–59. Berger, Hermann 1959 Die Burushaski-Lehnworter in der Zigeunersprache. Indo-Iranian Journal 3(1): 17–43. Berger, Hermann 1966 Remarks on Shina loans in Burushaski. In: Dil (ed.) 1966: 79–88. Berger, Hermann 1974 Das Yasin-Burushaski (Werchikwar). Wiesbaden: Harrassowitz. Berger, Hermann 1998 Die Burushaski-Sprache von Hunza und Nager. Wiesbaden: Harrassowitz. Beyer, Stephan V. 1993 The classical Tibetan language. Delhi: Sri Satguru Publications. Bhaskararao, Peri, and K. V. Subbarao (eds.) 2001 The yearbook of South Asian languages 2001. (= Proceedings of Tokyo symposium on South Asian languages: Contact, convergence, and typology.) Thousand Oaks/London/New Delhi: Sage. Bhatt, Rakesh M. 2008 Indian English: Syntax. In: Mesthrie (ed.) 2008: 547–562. Bhattacharya, Sudhibushan 1972 Dravidian and Munda: A good field for areal and typologic studies. In: Agesthialingom & Shanmugan (eds.) 1972: 241–256. Bhattacharya, Sudhibushan 1975 Linguistic convergence in the Dravido-Munda area. International Journal of Dravidian Linguistics 4: 199–214. Bickel, Balthasar 1999 Nominalization and focus constructions in some Kiranti languages. In: Yadava & Glover (eds.) 1999: 271–296. 
http://www.uni-leipzig.de/~bickel/research/papers/focnom99.pdf (accessed 16 Dec. 2014). Bickel, Balthasar 2003 Belhare. In: Thurgood & LaPolla (eds.) 2003: 546–570.

Biddulph, John 1880 Tribes of the Hindoo Koosh. Calcutta: Office of the Superintendent of Government Printing. Repr. 1977, Karachi: Indus Publications. Blažek, Václav, and Irén Hegedűs 2010 On the position of Nuristani within Indo-Iranian. In: Roman Sukač and Ondřej Šefčík (eds.), The sound of Indo-European, 2: Papers on Indo-European phonetics, phonemics and morphophonemics (Opava, Nov. 2010), 40–66. München: LINCOM. Bloch, Jules 1911 Review of Bray 1909. Journal Asiatique, 10e série, 17(1): 162–167. Bloch, Jules 1924–1925 Sanskrit et dravidien. Bulletin de la Société de Linguistique de Paris 25(1): 1–21 (English translation in Bagchi 1975: 35–62). Bloch, Jules 1934 L’indo-aryen du Veda aux temps modernes. Paris: Adrien-Maisonneuve. Bradley, David 2002 The subgrouping of Tibeto-Burman. In: Christopher I. Beckwith (ed.), Medieval Tibeto-Burman languages: Proceedings of a symposium held in Leiden, June 26, 2000, at the Ninth Seminar of the International Association of Tibetan Studies, 73–112. (Brill’s Tibetan Studies Library 2/6.) Boston: Brill. Bradley, David (ed.) 2003 Language variation: Papers on variation and change in the Sinosphere and Indosphere in honour of James A. Matisoff. Canberra: Pacific Linguistics. Braunmüller, Kurt 2009 Converging genetically related languages: Endstation code mixing? In: Kurt Braunmüller (ed.), Convergence and divergence in language contact situations, 53–69. Amsterdam/Philadelphia: Benjamins. Bray, Denys de S. 1909 The Brahui language, Part I: Introduction and grammar. Calcutta: Superintendent, Government Printing. Repr. 1977, Quetta: The Brahui Academy. Bray, Denys de S. 1934 The Brahui language, Part II: The Brāhūī Problem. Calcutta: Superintendent, Government Printing. Repr. 1978, Quetta: The Brahui Academy. Brohi, Dad Muhammad 1994 Sindhī brahuī bolī’a jo taqābulī jā’izo [A comparative look at the Sindhi and Brahui languages]. Haidarabad: Sindhī Bolī’a jo Bāikhtiyār Idāro [Sindhi Language Authority]. 
Bronkhorst, Johannes, and Madhav M. Deshpande (eds.) 1999 Aryan and non-Aryan in South Asia: Evidence, interpretation, and ideology: Proceedings of the International Seminar on Aryan and Non-Aryan in South Asia, University of Michigan, Ann Arbor, 25–27 October, 1996. (Harvard Oriental Series, Opera Minora, 3.) Cambridge, Mass.: Dept. of Sanskrit and Indian Studies, Harvard University. Brough, John 1962 The Gāndhārī Dharmapada. Repr. 2001, Delhi: Motilal Banarsidass. Brough, John 1965 Comments on third-century Shan-Shan and the history of Buddhism. Bulletin of the School of Oriental and African Studies 28(3): 582–612.

Brown, Charles Philip 1854 Miśra bhāṣā nighaṇṭu: Dictionary of mixed Telugu. Madras: Christian Knowledge Society Press. Buddruss, Georg 1960 Die Sprache von Woṭapūr und Kaṭārqalā: Linguistische Studien im afghanischen Hindukusch [The language of Woṭapūr and Kaṭārqalā: Linguistic studies in the Afghan Hindukush]. (Bonner Orientalistische Studien, neue Serie, 9.) Bonn: Orientalisches Seminar der Universität Bonn. Buddruss, Georg 1967 Die Sprache von Sau in Ostafghanistan: Beiträge zur Kenntnis des dardischen Phalūra [The language of Sau in eastern Afghanistan: Contributions to the knowledge of the Dardic Phalūra]. München: Kitzinger. Buddruss, Georg 1985 Linguistic research in Gilgit and Hunza: Some results and perspectives. Journal of Central Asia 8(1): 27–32. Islamabad: Centre for the Study of the Civilizations of Central Asia, Quaid-i-Azam University. Buddruss, Georg 1989a Aus dem Leben eines jungen Balutschen von ihm selbst erzählt [From the life of a young Baloch as told by himself]. (Abhandlungen für die Kunde des Morgenlandes, 48[4].) Stuttgart: Steiner/Deutsche Morgenländische Gesellschaft. Buddruss, Georg 1989b Kommentar zu einem Kivi-Vokabular aus dem sowjetischen Pamir. Studien zur Indologie und Iranistik 15: 197–205. Bughio, M. Qasim 2001 A comparative sociolinguistic study of rural and urban Sindhi (Study of language variation and change in Sindhi spoken in Sindh, Pakistan). München: LINCOM. Bughio, M. Qasim 2009 Stratification of /r/ pronunciation in Sindhi spoken in Sindh, Pakistan. International Journal of Arts & Humanities 37: 29–48. Bulkin, Carleton 2010 Dari practical dictionary: Dari-English/English-Dari. New York: Hippocrene Books. Burki, Rozi Khan 2001 Dying languages: Special focus on Ormuri. Pakistan Journal of Public Administration 6(2). http://www.khyber.org/publications/016–020/ormuri.shtml (accessed 28 Nov. 2014) Burling, Robbins 2003 The Tibeto-Burman languages of northeastern India. In: Thurgood & LaPolla (eds.) 2003: 167–192. 
Burling, Robbins 2007 The lingua franca cycle: Implications for language shift, language change, and language classification. Anthropological Linguistics 27(3/4): 207–234. Burling, Robbins 2011 Three meanings of “language” and “dialect” in northeast India. In: Gwendolyn Hyslop, Stephen Morey, and Mark Post (eds.), North East Indian linguistics, 35–45. Delhi: Cambridge University Press.

Burrow, Thomas 1933–1935a Iranian words in the Kharoṣṭhī documents from Chinese Turkestan-I. Bulletin of the School of Oriental Studies 7: 509–516. Burrow, Thomas 1933–1935b Iranian words in the Kharoṣṭhī documents from Chinese Turkestan-II. Bulletin of the School of Oriental Studies 7: 779–790. Burrow, Thomas 1936 The dialectal position of the Niya Prakrit. Bulletin of the School of Oriental Studies 8(2/3): 419–435. Burrow, Thomas 1937 The language of the Kharoṣṭhi documents from Chinese Turkestan. Cambridge: Cambridge University Press. Burrow, Thomas 1945 Some Dravidian words in Sanskrit. Transactions of the Philological Society 1945: 79–100. Burrow, Thomas 1946 Loan words in Sanskrit. Transactions of the Philological Society 1946: 1–30. Burrow, Thomas 1947 Dravidian studies VI. Bulletin of the School of Oriental and African Studies 12(1): 132–147. Burrow, Thomas 1948 Dravidian studies VII: Further Dravidian words in Sanskrit. Bulletin of the School of Oriental and African Studies 12(2): 365–396. Burrow, Thomas 1973 The Sanskrit language, 3rd ed. London: Faber & Faber. Burrow, Thomas, and Murray B. Emeneau 1984 A Dravidian etymological dictionary. Revised edition, Oxford: Oxford University Press. Butt, Miriam, and Tafseer Ahmed 2010 The redevelopment of Indo-Aryan case systems from a lexical semantic perspective. Morphology 21(3): 545–572. http://ling.uni-konstanz.de/pages/home/butt/ (accessed 14 Dec. 2014) Cable, Seth 2009 The syntax of the Tibetan correlative. In: Anikó Liptak (ed.), Correlatives cross-linguistically, 195–222. Amsterdam/Philadelphia: Benjamins. Cardona, George, and Dhanesh Jain (eds.) 2003 The Indo-Aryan languages. London/New York: Routledge. Caron, Bruce R., et al. (eds.) 1980 Proceedings of the Sixth Annual Meeting of the Berkeley Linguistics Society. Berkeley, CA: Berkeley Linguistics Society. CDIAL = Turner 1966. Chatterji, Suniti Kumar 1926 The origin and development of the Bengali language. 3 vols. Calcutta: Calcutta University Press. 
Reprinted 1970, London: Allen & Unwin; distributed by Motilal Banarsidass, Delhi. Chatterji, Suniti Kumar 1966 Some Iranian and Turki loans in Sanskrit. In: Dil (ed.) 1966: 123–140.

Chelliah, Shobhana L. 2011 A grammar of Meithei. 1st edn. 1997. Berlin/New York: Mouton de Gruyter. Chopra, R. M. 2000 Perso-Arabic words in Punjabi. Indo-Iranica 53: 91–108. Comrie, Bernard 1981 Language universals and linguistic typology. Chicago: University of Chicago Press. Coupe, Alexander R. 2007 Converging patterns of clause linkage in Nagaland. In: Matti Miestamo and Bernhard Wälchli (eds.), New challenges in typology: Broadening the horizons and redefining the foundations, 339–361. Berlin/New York: Mouton de Gruyter. Coupe, Alexander R. 2008 A grammar of Mongsen Ao. Berlin/New York: Mouton de Gruyter. Dabir-Moghaddam, Mohammad 2008 On agent clitics in Balochi in comparison with other Iranian languages. In: Jahani, Korn & Titus (eds.) 2008: 83–100. Dahl, Östen 2001 Principles of areal typology. In: Martin Haspelmath, E. König, W. Oesterreicher, and W. Raible (eds.), Language typology and language universals: An international handbook, vol. 2, 1456–1470. Berlin: Mouton de Gruyter. Das, Rahul Peter 1995 The hunt for foreign words in the Ṛgveda. Indo-Iranian Journal 38: 207–238. Dasgupta, Probal 2003 Bangla. In: Cardona & Jain (eds.) 2003: 351–390. Dattamajumdar, Sattarupa 2012 Linguistic shrinkage in Lepcha: Traces of language contact. Seventh Annual International Conference of the North East Indian Linguistics Society, Guwahati, Assam, India. Davison, Alice, and Frederick M. Smith (eds.) 1994 Papers from the 15th SALA Roundtable. Iowa City, IA: South Asia Studies Program, University of Iowa. De Silva, M. W. Sugathapala 1972 Vedda language of Ceylon: Texts and lexicon. (Münchener Studien zur Sprachwissenschaft. Beiheft, n. F. 7.) München: Kitzinger. DEDR = Burrow & Emeneau 1984. DeLancey, Scott 2010a Language replacement and the spread of Tibeto-Burman. Journal of the SouthEast Asian Linguistics Society 3(1): 40–55. DeLancey, Scott 2010b Towards a history of verb agreement in Tibeto-Burman. Himalayan Linguistics 9: 1–39. 
DeLancey, Scott 2011 Notes on verb agreement prefixes in Tibeto-Burman. Himalayan Linguistics 10: 1–29. DeLancey, Scott 2012 On the origins of Bodo-Garo. In: Gwendolyn Hyslop, Mark Post, and Stephen Morey (eds.), North East Indian Linguistics 4, 3–20. New Delhi: Cambridge University Press.

Delforooz, Behrooz Barjasteh 2008 A sociolinguistic survey among the Jadgal in Iranian Balochistan. In: Jahani, Korn & Titus (eds.) 2008: 23–43. Delforooz, Behrooz Barjasteh 2010 Discourse features in Balochi of Sistan: Oral narratives. Uppsala University PhD dissertation. (Studia Iranica Upsaliensia, 15.) Uppsala. Deshpande, Madhav M., and Peter E. Hook (eds.) 1979 Aryan and non-Aryan in India. Ann Arbor: Center for South and Southeast Asian Studies, University of Michigan. Dey, Kakoli 2012 Spirantization in Tibeto-Burman-Bengali bilinguals in the south of Assam. Seventh Annual International Conference of the North East Indian Linguistics Society, Guwahati, Assam, India, Jan 31–Feb 2. Dhakal, Dubi Nanda 2012 Contact induced changes in Baram. Seventh Annual International Conference of the North East Indian Linguistics Society, Guwahati, Assam, India, Jan 31– Feb 2. Di Carlo, Pierpaolo 2011 Two clues of a former Hindu-Kush linguistic area? In: Carol Everhard and Elizabeth Mela-Athanasopoulou (eds.), Selected papers from the International Conference on Language Documentation and Tradition with a special interest in the Kalasha of the Hindu Kush valleys, Himalayas, 7–9 November 2008, 101–114. Thessaloniki: Aristotle University of Thessaloniki. Diffloth, Gerard, and Norman Zide 1992 Austro-Asiatic languages. In: William Bright et al. (eds.), International encyclopedia of linguistics, 137–142. New York: Oxford University Press. Dil, Anwar S. (ed.) 1966 Shahidullah presentation volume. (Pakistani Linguistics Series, 7.) Lahore: Linguistics Research Group of Pakistan. Dixon, R. M. W. 1997 The rise and fall of languages. Cambridge/New York: Cambridge University Press. Djačok, M. T. 1988 Burušaski-cyganskie jazykovye kontakty (po povodu stat'i Ch. Bergera). In: B. V. Boldyrev and L. V. Petropavlovskaja (eds.), Grammatičeskaja i semantičeskaja struktura slova v jazykax narodov sibiri, 115–128. 
Novosibirsk: Akademia Nauk, SSSR, Sibirskoe Otdelenie, Institut Istorii filologii i filosofii. Dodychudojev, Rahim 1972 Die Pamir-Sprachen: Zum Problem der Konvergenz [The Pamir languages: On the problem of convergence]. Translated from Russian by Manfred Lorenz. Mitteilungen des Instituts für Orientforschung zu Berlin 17(3): 463–470. Donegan, Patricia, and David Stampe 2004 Comparative Munda (mostly North), rough draft, ed. by David Stampe, based on Heinz-Jürgen Pinnow’s Versuch einer historischen Lautlehre der Kharia-Sprache (Wiesbaden: Harrassowitz, 1959) and Ram Dayal Munda’s Proto-Kherwarian phonology, University of Chicago MA thesis, 1968. Online at http://www.ling.hawaii.edu/austroasiatic/AA/Munda/ETYM/Pinnow&Munda (accessed 15 Dec. 2014)

Donohue, Mark, Virginia Dawson, and Keren Baker 2012 The typological position(s) of the languages of northeast India and implications for social history. Seventh Annual International Conference of the North East Indian Linguistics Society, Guwahati, Assam, India, Jan 31–Feb 2. Dryer, Matthew S. 1988 Object-verb order and adjective-noun order: Dispelling a myth. Lingua 74: 77–109. Dryer, Matthew S. 1992 The Greenbergian word order correlations. Language 68: 81–138. Dryer, Matthew S. 2003 Word order in Sino-Tibetan languages from a typological and geographical perspective. In: Thurgood & LaPolla (eds.) 2003: 43–55. Dua, Hans R. 1992 Hindi-Urdu as a pluricentric language. In: Michael Clyne (ed.), Pluricentric languages, 381–400. Berlin/New York: Mouton de Gruyter. Dulling, Gurth Kenton 1973 The Hazaragi dialect of Afghan Persian: A preliminary study. (Central Asian Monograph, 1.) London: Central Asian Research Center. Durrani, Attash (ed.) 1997 Pākistanī Urdū ke khadokhāl [Characteristics of Pakistani Urdu]. Islamabad: National Language Authority. Ebert, Karen H. 1993 Kiranti subordination in the South Asian areal context. In: Karen H. Ebert (ed.), Studies in clause linkage: Papers from the First Köln-Zürich Workshop, 83–110. Zürich: Arbeiten des Seminars für Allgemeine Sprachwissenschaft. Ebert, Karen H. 1999 Nonfinite verbs in Kiranti languages. In: Yadava & Glover (eds.): 371–400. Ebert, Karen H. 2009 South Asia as a linguistic area. In: Keith Brown and Sarah Ogilvie (eds.), Concise encyclopedia of the world’s languages, 995–1001. Oxford: Elsevier. Èdel’man, Džoi I. 1976 Strukturnye “anomalii” vostočnoiranskix jazykov i tipologija substrata. In: Harry Spitzbardt and Bernd Barschel (eds.), Studien zur allgemeinen und vergleichenden Sprachwissenschaft: Karl Ammer zum Gedenken, 79–93. (Wissenschaftliche Beiträge der Friedrich-Schiller-Universität Jena.) Jena: Friedrich-Schiller University. Èdel’man, Džoi I. 
1980 K substratnomu naslediju central’noaziatskogo jazykovogo sojuza [On the substratal heritage of the Central Asian linguistic union]. Voprosy Jazykoznanija 5: 21–32. Èdel’man, Dzhoi I. 1983 The Dardic and Nuristani languages. Moscow: Nauka. Èdel’man, Džoi I. 1984 K substratnym javleniam v sintaksise iranskix jazykov [On substratal effects in the syntax of the Iranian languages]. Voprosy dialektologii i istorii jazyka, 34–41. Dušanbe: Donish.

Edelman, D(zhoy) (Joy) I., and Leila R. Dodykhudoeva 2009a The Pamir languages. In: Windfuhr (ed.) 2009: 773–786. Edelman, D(zhoy) (Joy) I., and Leila R. Dodykhudoeva 2009b Shughni. In: Windfuhr (ed.) 2009: 787–824. Edelman, Dzhoy (Joy) I. 1999 On the history of non-decimal systems and their elements in numerals of Aryan languages. In: Jadranka Gvozdanovic (ed.), Numeral types and changes world-wide, 221–241. Berlin/New York: Mouton de Gruyter. Efimov, Valentin Aleksandrovich 1965 Jazyk Afganskix Xazara: Jakaulangskij dialekt [The language of the Afghan Hazara: The dialect of Yakaulang]. Moskva: Nauka. Efimov, Valentin Aleksandrovich 1986 Jazyk Ormuri v sinxronnom i istoričeskom osveščenii [The Ormuṛi language in synchronic and historical perspective]. Moskva: Nauka. (English translation: Efimov, Valentin Aleksandrovich. The Ormuri language in past and present. Translated and edited, 2011, by Joan L. G. Baart. Islamabad: Forum for Language Initiatives.) Efimov, Valentin Aleksandrovich 1999a Parači jazyk [The Parachi language]. In: Jartseva et al. (eds.) 1999: 257–275. Efimov, Valentin Aleksandrovich 1999b Ormuri jazyk [The Ormuṛi language]. In: Jartseva et al. (eds.) 1999: 276–296. Egerod, Søren 1973 Review of Paul K. Benedict, Sino-Tibetan: A conspectus. Journal of Chinese Linguistics 1: 498–505. Elfenbein, Josef 1982 Notes on the Baluchi-Brahui linguistic commensality. Transactions of the Philological Society 1982: 77–98. Elfenbein, Josef 1983 The Brahui problem again. Indo-Iranian Journal 25: 103–132. Elfenbein, Josef 1987 A periplus of the ‘Brahui problem’. Studia Iranica 16: 215–233. Elfenbein, Josef 1989 Brahui. Iranica online. http://www.iranicaonline.org/articles/brahui (accessed 29 December 2014) Elfenbein, Josef 1997 Pashto phonology. In: Alan S. Kaye (ed.), Phonologies of Asia and Africa (including the Caucasus), 733–760. Winona Lake, Indiana: Eisenbrauns. Elfenbein, Josef 1998 Brahui. In: Steever (ed.) 1998: 388–414. Emeneau, Murray B. 
1954 Linguistic prehistory of India. Papers of the American Philosophical Society 98: 282–292. Emeneau, Murray B. 1956 India as a linguistic area. Language 32: 3–16. (Repr. in Emeneau 1980a: 105– 125.) Emeneau, Murray B. 1962a Brahui and Dravidian comparative grammar. Berkeley/Los Angeles: University of California Press.

Emeneau, Murray B. 1962b Bilingualism and structural borrowing. Proceedings of the American Philosophical Society 106(5): 430–442. Emeneau, Murray B. 1962c Iranian and Indo-Aryan influence on Brahui (adapted from chapter 4 of Emeneau 1962a: 47–61). Republished in: Emeneau 1980: 333–349. Emeneau, Murray B. 1964 Linguistic desiderata in Baluchistan. In: Indo-Iranica, mélanges présentés à Georg Morgenstierne, à l’occasion de son soixante-dixième anniversaire, 73–77. Wiesbaden: Harrassowitz. Emeneau, Murray B. 1965a India and historical grammar. (Annamalai University Publications in Linguistics, 5.) Annamalainagar: Annamalai University. Emeneau, Murray B. 1965b India and linguistic areas. Originally published in Emeneau 1965a: 25–75. Repr. in: Emeneau 1980a: 126–166. Emeneau, Murray B. 1969a Review of Mayrhofer, Kurzgefasstes Wörterbuch des Altindischen, fasc. 19. Language 45: 373–374. Emeneau, Murray B. 1969b Onomatopoetics in the Indian linguistic area. Language 45: 274–299. Emeneau, Murray B. 1974 The Indian linguistic area revisited. International Journal of Dravidian Linguistics 3: 92–134. (Repr. in Emeneau 1980a: 197–249.) Emeneau, Murray B. 1980a Language and linguistic area, Essays selected by Anwar S. Dil. Stanford, CA: Stanford University Press. Emeneau, Murray B. 1980b India and linguistic areas. In: Emeneau 1980a: 126–166. Emeneau, Murray B. 1989 The language of the Nilgiris. In: Paul Hockings (ed.), Blue mountains: The ethnography and biogeography of a South Indian region, 133–143. Delhi/New York: Oxford University Press. Repr. in Emeneau 1994. Emeneau, Murray B. 1994 Dravidian studies: Selected papers, ed. by Bh(adriraju) Krishnamurti. Delhi: Motilal Banarsidass. Emeneau, Murray B., and Thomas Burrow 1962 Dravidian borrowings from Indo-Aryan. Berkeley/Los Angeles: University of California Press. Emmerick, Ronald E. 1989 Khotanese and Tumshuqese. In: Schmitt (ed.) 1989: 204–229. Emmerick, Ronald E. 2009 Khotanese and Tumshuqese. In: Windfuhr (ed.) 
2009: 377–415. Farhâdi, Abd-ul-Ghafûr 1955 Le Persan parlé en Afghanistan: Grammaire du Kâboli [Spoken Persian in Afghanistan: A grammar of Kāboli]. Paris: Centre national de la recherche scientifique.

Bibliographical references

Farrell, Tim 2003 Linguistic influences on the Balochi spoken in Karachi. In: Jahani, Korn & Gren-Eklund (eds.) 2003: 169–209. Filippone, Ela 1996 Spatial models and locative expressions in Baluchi. (Balochistan Monograph Series, 4.) Naples: Istituto Universitario Orientale, Istituto Italiano per il Medio ed Estremo Oriente. Frembgen, Jürgen W. 1997 English loan words in Burushaski as a barometer of cultural change. In: Irmtraud Stellrecht and Matthias Winiger (eds.), Perspectives on history and change in the Karakorum, Hindukush, and Himalaya, 463–471. (Culture Area Karakorum, Scientific Studies 3.) Köln: Rüdiger Köppe. Fussman, Gérard 1972 Atlas linguistique des parlers dardes et kafirs. Vol. 1, cartes. Vol. 2, commentaire [Linguistic atlas of the Dardic and Kafir dialects. Vol. 1, maps. Vol. 2, comments]. Paris: École Française d’Extrême-Orient; Dépositaire: Adrien-Maisonneuve. Gair, James W. 1976 The verb in Sinhala, with some preliminary remarks on Dravidianization. International Journal of Dravidian Linguistics 5: 259–273. Gair, James W. 1980 Adaptation and naturalization in a linguistic area: Sinhala focused sentences. In: Caron et al. (eds.) 1980: 28–43. Gair, James W. 1985 How Dravidianized was Sinhala phonology? Some conclusions and cautions. In: Acson & Leed (eds.) 1985: 37–55. Gair, James W., and W. S. Karunatillake 1974 Literary Sinhala. Ithaca, NY: South Asia Program and Department of Modern Languages and Linguistics, Cornell University. Gamkrelidze, Tomas, and Vjacheslav Ivanov 1995 Indo-European and the Indo-Europeans: A reconstruction and historical analysis of a proto-language and proto-culture, 2 vols. Berlin/New York: Mouton de Gruyter. Genetti, Carol 2009 A grammar of Dolakhā Newār. Berlin/New York: Mouton de Gruyter. (1st ed. 2007.) George, K. M. 1972 Western influence on Malayalam language and literature. New Delhi: Sahitya Akademi Publications. Ghosh, Arun 2008 Santali. In: Anderson (ed.) 2008: 11–89. Girish, P. M. 2005 The influence of English on Malayalam language. Language in India 5. http://www.languageinindia.com/may2005/girishenglishmalayalam1.html (accessed 22 Nov. 2014). Goswami, G. C., and Jyotiprakash Tamuli 2003 Asamiya. In: Cardona & Jain (eds.) 2003: 391–443.

Green, Nile 2011 The trans-border traffic of Afghan modernism: Afghanistan and the Indian “Urdusphere”. Comparative Studies in Society and History 53(3): 479–508. Greenberg, Joseph H. 1963 Some universals of grammar with particular reference to the order of meaningful elements. In: Greenberg (ed.) 1966: 73–113. Greenberg, Joseph H. (ed.) 1966 Universals of language, 2nd ed. Cambridge, MA: MIT Press. Gren-Eklund, Gunilla 2003 Language contact in Balochistan from the Indian point of view. In: Jahani, Korn & Gren-Eklund (eds.) 2003: 33–48. Grierson, George Abraham 1906 [= Grierson (ed.) 1903–1928, Vol. 4] Grierson, George Abraham 1909 [= Grierson (ed.) 1903–1928, Vol. 3.1] Grierson, George Abraham 1919 [= Grierson (ed.) 1903–1928, Vol. 8.2] Grierson, George Abraham (ed.) 1903–1928 Linguistic survey of India, 11 volumes in 20. Calcutta: Office of the Superintendent of Government Printing. Repr. 1967: Delhi: Motilal Banarsidass. http://dsal.uchicago.edu/books/lsi/. Grjunberg, Aleksandr Leonovič 1972 Mundžanskij jazyk: Teksty, slovar’, grammatičeskij očerk [The Munjī language: Texts, dictionary, grammatical sketch]. (Jazyki vostočnogo Gindukuša [Languages of the Eastern Hindukush].) Leningrad: Nauka. Grjunberg, Aleksandr Leonovič 1980 Jazyk Kati. Teksty, grammatičeskij očerk [The Kati language: Texts, grammatical account]. (Jazyki vostočnogo Gindukuša [Languages of the Eastern Hindukush].) Moskva: Nauka. Grjunberg, Aleksandr Leonovič 1987 Očerk grammatiki Afganskogo jazyka (Pašto) [An account of the grammar of the Afghan language (Pashto)]. Leningrad: Nauka. Grjunberg, Aleksandr Leonovič 1988 Afganistan: Jazykovaja situacija i jazykovaja politika [Afghanistan: Linguistic situation and language policy]. Izvestija AN SSSR. Serija literatury i jazyka 47: 167–173. Grjunberg, Aleksandr Leonovič, and Ivan M. Steblin-Kamenskij 1974 Ėtnolingvističeskaja xarakteristika vostočnogo Gindukuša [An ethno-linguistic characterization of the eastern Hindukush]. In: Solomon Il'ich Bruk (ed.), Problemy kartografirovanija v jazykoznanii i etnografii [Problems of cartography in linguistics and ethnography], 276–283. Leningrad: Nauka. Grjunberg, Aleksandr Leonovič, and Ivan M. Steblin-Kamenskij 1976 Vaxanskij jazyk: Teksty, slovar’, grammatičeskij očerk [The Wakhi language: Texts, dictionary, grammatical account]. (Jazyki vostočnogo Gindukuša.) Moskva: Glavnaja redakcija vostočnoj literatury. Gumperz, John J., and Robert Wilson 1971 Convergence and creolization: A case from the Indo-Aryan/Dravidian border. In: Hymes (ed.) 1971: 151–168.

Gurov, Nikita V. 2000 Spisok Kuipera substratnaja leksika ‘Rigvedy’ [Kuiper’s list of substratum vocabulary in the Rigveda]. In: Processy jazykovoi interferensii v Južnoi Azii na rubeže 3-go tysiačeletija [Processes of language interference in South Asia in the 2nd–1st Millennia BCE], 25–37. Moscow. Haarmann, Harald 1970 Die indirekte Erlebnisform als grammatische Kategorie: Eine eurasische Isoglosse [Indirect experience as a grammatical category: A Eurasian isogloss]. Wiesbaden: Harrassowitz. Hahn, Ferd[inand] 1911 Grammar of the Kurukh language. Calcutta: Bengal Secretariat Press. Repr. 1985, Delhi: Mittal. Hallberg, Daniel G. 2004 Pashto, Waneci, Ormuṛi. (Sociolinguistic Survey of Northern Pakistan, 4.) Islamabad: National Institute of Pakistani Studies, Quaid-i-Azam University/Summer Institute of Linguistics. Hanaway, William L., and Wilma Heston (eds.) 1996 Studies of popular culture in Pakistan. Islamabad/Lahore: Lok Virsa/Sang-e-Meel Publications. Hargreaves, David 1996 From interrogation to topicalization: Proto-Tibeto-Burman *-la in Kathmandu Newar. Linguistics of the Tibeto-Burman Area 19(2): 31–44. Haspelmath, Martin 2004 How hopeless is genealogical linguistics, and how advanced is areal linguistics? (Review article of Aikhenvald & Dixon (eds.) 2001.) Studies in Language 28(1): 209–223. Haspelmath, Martin, and Ekkehard König (eds.) 1995 Converbs in cross-linguistic perspective: Structure and meaning of adverbial verb forms — adverbial participles, gerunds. Berlin/New York: Mouton de Gruyter. Haspelmath, Martin, Matthew S. Dryer, David Gil, and Bernard Comrie (eds.) 2005 The world atlas of language structures. New York: Oxford University Press. Heegård, Jan, and Ida E. Mørch 2004 Retroflex vowels and other peculiarities in the Kalasha sound system. In: Saxena & Borin (eds.) 2004: 57–76. Heine, Bernd, and Tania Kuteva 2003 Contact-induced grammaticalization. Studies in Language 27: 529–572.
Heine, Bernd, and Tania Kuteva 2005 Language contact and grammatical change. Cambridge: Cambridge University Press. Heine, Bernd, and Tania Kuteva 2008 Constraints on contact-induced linguistic change. Journal of Language Contact – THEMA 2: 57–90. Henderson, Eugenie J. A. 1957 Colloquial Chin as a pronominalized language. Bulletin of the School of Oriental and African Studies, University of London 20(1/3): 323–327. (Studies in Honour of Sir Ralph Turner, Director of the School of Oriental and African Studies, 1937–1957.)

Heston, Wilma L. 1980 Some areal features: Indian or Irano-Indian. International Journal of Dravidian Linguistics 9: 141–167. Heston, Wilma L. 1981 Review of Masica 1976. International Journal of Dravidian Linguistics 10(1): 180–187. Heston, Wilma L. 1983 “What linguistic area?” An Iranianist’s view. Association for Asian Studies, March 26, 1983. Heston, Wilma L. 1987 Pashto ambipositions and historical antecedents. In: Bashir, Deshpande & Hook (eds.) 1987: 163–181. Hilali, Shaikh Ghulam Maqsud, and Muhammad Enamul Haq 1967 Perso-Arabic elements in Bengali. Dacca: Central Board for Development of Bengali. Repr. 2002, Dhaka: Bangla Academy. Hock, Hans Henrich 1975 Substratum influence on (Rig Vedic) Sanskrit? Studies in the Linguistic Sciences 5(2): 76–125. Hock, Hans Henrich 1982 The Sanskrit quotative: A historical and comparative study. In: Hock (ed.) 1982: 39–85. Hock, Hans Henrich 1986 Principles of historical linguistics. 2nd edn. 1991. Berlin/New York: Mouton de Gruyter. Hock, Hans Henrich 1992 A note on English and modern Sanskrit. In: L. E. Smith and S. N. Sridhar (eds.), The extended family: English in global bilingualism (Studies in honor of Braj B. Kachru), 153–171. (World Englishes 11, parts 2–3.) Hock, Hans Henrich 1993 Review of Abbi 1992. Studies in the Linguistic Sciences 23(1): 169–192. Hock, Hans Henrich 1996a Pre-Ṛgvedic convergence between Indo-Aryan (Sanskrit) and Dravidian? A survey of the issues and controversies. In: J. E. M. Houben (ed.), Ideology and status of Sanskrit: Contributions to the history of the Sanskrit language, 17–58. Leiden: Brill. Hock, Hans Henrich 1996b Subversion or convergence? The issue of pre-Vedic retroflexion reconsidered. Studies in the Linguistic Sciences 23(2): 73–115. Hock, Hans Henrich 1997 Chronology or genre? Problems in Vedic syntax. In: Michael Witzel (ed.), Inside the texts — beyond the texts: New approaches to the study of the Vedas, 103–126. (Harvard Oriental Series, Opera Minora, 2.) 
Cambridge, MA: Harvard University. Hock, Hans Henrich 1999 Out of India? The linguistic evidence. In: Bronkhorst and Deshpande (eds.) 1999: 1–18.

Hock, Hans Henrich 2000 South Asia: Historical. In: Rajendra Singh (ed.), The yearbook of South Asian languages and linguistics 2000, 220–237. New Delhi: Sage. Hock, Hans Henrich 2015 The northwest of South Asia and beyond. The issue of Indo-Aryan retroflexion yet again. Journal of South Asian Languages and Linguistics 2(1): 111–135. Hock, Hans Henrich (ed.) 1982 Papers on diachronic syntax: Six case studies. (= Studies in the Linguistic Sciences 12(2).) Hock, Hans Henrich, and Brian Joseph (eds.) 2009 Language history, language change, and language relationship. 2nd ed. Berlin/New York: Mouton de Gruyter. Hook, Peter Edwin 1974 The compound verb in Hindi. Ann Arbor: University of Michigan, Center for South and Southeast Asian Studies. Hook, Peter Edwin 2001 Where do compound verbs come from? And where are they going? In: Bhaskararao & Subbarao (eds.) 2001: 101–130. Hook, Peter Edwin 1982 South Asia as a semantic area: Forms, meanings, and their connections. In: P. J. Mistry (ed.), Studies in South Asian languages and linguistics, 30–41. (Special Issue of South Asian Review 6(3).) Hook, Peter Edwin 1985 Linguistic areas: Getting at the grain of history. In: George Cardona and Norman H. Zide (eds.), Festschrift for Henry Hoenigswald, on the occasion of his seventieth birthday, 155–168. Tübingen: Narr. Hook, Peter Edwin, and Prashant Pardeshi 2009 The semantic evolution of EAT-expressions: Ways and byways. In: John Newman (ed.), The linguistics of eating and drinking, 151–172. Amsterdam/Philadelphia: Benjamins. Hvenekilde, Anne 2001 Kinship systems and language choice among academics in Shillong, Northeast India. International Journal of Applied Linguistics 11(2): 174–193. Hymes, Dell (ed.) 1971 Pidginization and creolization of languages. Cambridge: Cambridge University Press. Inam Ullah 2005 Torwali lexical database. Unpublished. Inam Ullah 2011 Torwali online dictionary. http://www.cle.org.pk/software/ling_resources/otd.htm (accessed 28 Nov. 2014). Islam, Riaz Ahmed 2011 The morphology of loanwords in Urdu: The Persian, Arabic and English strands. Newcastle University PhD thesis. https://theses.ncl.ac.uk/dspace/bitstream/10443/1407/1/Islam,%20R.A.%2012.pdf (accessed 28 Nov. 2014). Israel, M. 1997 Language situation and linguistic convergence (with special reference to Kuvi). In: Abbi (ed.) 1997: 121–130.

Jacques, Guillaume 2012 Agreement morphology: The case of Rgyalrongic and Kiranti. Language and Linguistics 13(1): 83–116. Jacquesson, François 2008 The speed of language change, typology, and history: Languages, speakers, and demography in North-East India. In: Alicia Sanchez-Masas, Roger Blench, Malcolm D. Ross, Ilia Peiros, and Marie Lin (eds.), Past human migrations in East Asia: Matching archaeology, linguistics, and genetics, 287–310. New York: Routledge. Jahani, Carina 1994 Notes on the use of genitive constructions versus iẓāfa constructions in Iranian Balochi. Studia Iranica 23: 285–298. Jahani, Carina 1999 Persian influence on some verbal constructions in Iranian Balochi. Studia Iranica 28: 123–143. Jahani, Carina 2003 The case system in Iranian Balochi in a contact linguistic perspective. In: Jahani, Korn & Gren-Eklund (eds.) 2003: 139–166. Jahani, Carina 2008 Restrictive relative clauses in Balochi and the marking of the antecedent: Linguistic influence from Persian? In: Jahani, Korn & Titus (eds.) 2008: 45–82. Jahani, Carina, and Agnes Korn 2009 Balochi. In: Windfuhr (ed.) 2009: 634–692. Jahani, Carina, Agnes Korn, and Gunilla Gren-Eklund (eds.) 2003 The Baloch and their neighbours: Ethnic and linguistic contact in Balochistan in historical and modern times. Wiesbaden: Reichert. Jahani, Carina, Agnes Korn, and Paul Titus (eds.) 2008 The Baloch and others: Linguistic, historical and socio-political perspectives on pluralism in Balochistan. Wiesbaden: Reichert. Jakobson, Roman 1931 K xarakteristike evrazijskogo jazykovogo sojuza [Toward a characterization of a Eurasian linguistic area]. Paris: Imprimerie de Navarre. Janjua, Fauzia 2011 Urdu-English switching: Decline of the linguistic capital of Urdu language in Pakistan. International Journal of Academic Research 3(4): 406–409. Jartseva, Victoria Nikolaevna, et al. (eds.) 1999 Jazyki mira: Iranskie Jazyki: II. Severo-zapadnye iranskie jazyki [Languages of the world: Iranian languages: II.
Northwestern Iranian languages]. Moskva: Indrik. Johanson, Lars, and Martine Robbeets (eds.) 2012 Copies versus cognates in bound morphology. Leiden/Boston: Brill. Johanson, Lars, and Martine Robbeets 2012 Bound morphology in common: Copy or cognate? In: Johanson & Robbeets (eds.) 2012: 3–22. Joseph, Brian D. 2007 Broad vs. localistic dialectology, standard vs. dialect: The case of the Balkans and the drawing of linguistic boundaries. In: Stavroula Tsiplakou, Marilena Karyolemou, and Pavlos Pavlou (eds.), Language variation – European perspectives II: Selected papers from the 4th International Conference on Language Variation in Europe (ICLaVE 4), Nicosia, June 2007, 119–134. Amsterdam/Philadelphia: Benjamins. Juannejan, Juli Arkadevič 1999 Geratskij dialekt jazyka dari sovremennogo Afganistana [The Herat dialect of contemporary Dari of Afghanistan]. (Jazyki narodov Azii i Afriki.) Moskva: Vostočnaja literatura, RAN. Kachru, Braj B. 1969 English in South Asia. In: Sebeok, Emeneau & Ferguson (eds.) 1969: 627–678. Kachru, Braj B. 1983 The Indianization of English: The English language in India. Delhi: Oxford University Press. Kachru, Braj B., Yamuna Kachru, and S. N. Sridhar (eds.) 2008 Language in South Asia. Cambridge: Cambridge University Press. Kachru, Yamuna 1989 Corpus planning for modernization: Sanskritization and Englishization of Hindi. Studies in the Linguistic Sciences 19(1): 153–164. Kachru, Yamuna 2008 Hindi-Urdu-Hindustani. In: Kachru, Kachru & Sridhar (eds.) 2008: 81–102. Kalinina, Z. M. 1977 Jazykovaja situacija v sovremennom Afganistane i dejatel’nost’ afganskoj Akademii jazyka i literatury Pašto-Tolyna [The linguistic situation in modern Afghanistan and the activities of the Afghan Academy of Language and Literature Pashto Tolena]. In: Leonid B. Nikol’skij (ed.), Jazykovaja politika v afroaziatskix stranax [Language politics in Afroasiatic countries], 214–222. Moskva: Nauka. Kennedy, Kenneth Adrian Raine 1995 Have Aryans been identified in the prehistoric skeletal record from South Asia? Biological anthropology and concepts of ancient races. In: George Erdosy (ed.), The Indo-Aryans of ancient South Asia: Language, material culture, and ethnicity, 32–66. Berlin/New York: de Gruyter. Kenoyer, Jonathan Mark 1995 Interaction systems, specialized crafts and culture change: The Indus Valley tradition and the Indo-Gangetic tradition in South Asia.
In: George Erdosy (ed.), The Indo-Aryans of ancient South Asia: Language, material culture, and ethnicity, 213–257. Berlin/New York: de Gruyter. Khubchandani, Lachman Mulchand 1963 The acculturation of Indian Sindhi to Hindi: A study of language in contact. University of Pennsylvania PhD dissertation. ProQuest Dissertations 6407380. Khubchandani, Lachman Mulchand 1969 Sindhi. In: Sebeok, Emeneau & Ferguson (eds.) 1969: 201–234. Kieffer, Charles M. 1974 L’établissement des cartes phonétiques: Premiers résultats: l’atlas linguistique des parlers iraniens [The establishment of the phonetic maps: First results: The linguistic atlas of the Iranian dialects]. In: Redard, Sana & Kieffer (eds.) 1974: 21–51.

Kieffer, Charles M. 1977 The approaching end of the relict South-East Iranian languages Ōrmuṛi and Parāči in Afghânistan. Linguistics 191: 71–100. Kieffer, Charles M. 1981 L’arabe et les Arabophones de Bactriane. I. Situation ethnique et linguistique [Arabs and Arabic speakers of Bactria. I. Ethnic and linguistic situation]. Die Welt des Islams 20: 178–196. Kieffer, Charles M. 1983 Die kleinen sprachlichen und ethnischen Gruppen Afghanistans: Gibt es ein linguistisches Problem in Afghanistan? [The small linguistic and ethnic groups of Afghanistan: Is there a linguistic problem in Afghanistan?]. In: Siegmar W. Breckle and Claus M. Naumann (eds.), Forschungen in und über Afghanistan: Situation der wissenschaftlichen Erforschung Afghanistans und Folgen der gegenwärtigen politischen Lage, 71–91. Hamburg: Deutsches Orient-Institut. Kieffer, Charles M. 1985 Afghanistan. V. Languages. In: Ehsan Yarshater (ed.), Encyclopaedia Iranica, vol. I, 501–516. London/Boston/Henley: Routledge & Paul. http://www.iranicaonline.org/articles/afghanistan-v-languages (accessed 29 Nov. 2014). Kieffer, Charles M. 1989 Le parāčī, l’ōrmuṛī et le groupe des langues iraniennes du Sud-Est. In: Schmitt (ed.) 1989: 445–455. Kieffer, Charles M. 2003 Grammaire de l’ōrmuṛī de Baraki-Barak (Lōgar, Afghanistan) [Grammar of the Ormuṛi of Baraki-Barak (Logar, Afghanistan)]. Wiesbaden: Reichert. Kieffer, Charles M. 2009 Parachi. In: Windfuhr (ed.) 2009: 693–720. Kindersley, A. F. 1938 Notes on the Indian idiom of English: Style, syntax, and vocabulary. Transactions of the Philological Society 37: 25–34. Kiseleva, Lidija Nikolaevna 1973 Očerki po leksikologii jazyka Dari [A sketch of Dari lexicology]. Moskva: Nauka. Kiseleva, Lidija Nikolaevna 1982 Dvujazyčie Pašto-Dari v Afganistane [Pashto-Dari bilingualism in Afghanistan]. Narody azii i afriki 6: 94–99. Kiseleva, Lidija Nikolaevna 1985 Jazyk Dari Afganistana [The Dari language of Afghanistan]. (Jazyki narodov Azii i Afriki.) Moskva: Nauka.
Kiseleva, Lidija Nikolaevna 1986 Dari-russkij slovar’ [Dari-Russian dictionary]. Moskva: Russkij jazyk. Klaiman, M. H. 1977 Bengali syntax: Possible Dravidian influence. International Journal of Dravidian Linguistics 6(2): 303–317. Klaiman, M. H. 1986 Semantic parameters and the South Asian linguistic area. In: Krishnamurti, Masica & Sinha (eds.) 1986: 179–194.

Kobayashi, Masato, and Ganesh Murmu 2008 Keraʔ Mundari. In: Anderson (ed.) 2008: 165–194. Kohistani, Razwal, and Ruth Laila Schmidt 2006 Shina in contemporary Pakistan. In: Saxena & Borin (eds.) 2006: 137–160. Konow, Sten 1905 On some facts connected with the Tibeto-Burman dialect spoken in Kanawar. Zeitschrift der Deutschen Morgenländischen Gesellschaft 59: 117–125. Konow, Sten (ed.) 1909 Linguistic survey of India 3.1. [= Grierson 1909.] Korn, Agnes 2003 Balochi and the concept of North-Western Iranian. In: Jahani, Korn & Gren-Eklund (eds.) 2003: 49–60. Korn, Agnes 2005 Towards a historical grammar of Balochi: Studies in Balochi historical phonology and vocabulary. Wiesbaden: Reichert. Koul, Ashok K. 2008 Lexical borrowings in Kashmiri. New Delhi: Indian Institute of Language Studies. Koul, Omkar N. 2003 Kashmiri. In: Cardona & Jain (eds.) 2003: 895–952. Koul, Omkar N. 2011 Kashmir iv. Persian Elements in Kashmiri. Encyclopaedia Iranica. http://www.iranicaonline.org/articles/kashmiri-language (accessed 28 Nov. 2014). Krishnamurti, Bh(adriraju) 1991 The emergence of the syllable types of stems (C)VCC(V) and (C)V̄C(V) in Indo-Aryan and Dravidian: Conspiracy or convergence? In: W. G. Boltz and M. C. Shapiro (eds.), Studies in the historical phonology of Asian languages, 160–175. Amsterdam/Philadelphia: Benjamins. Krishnamurti, Bh(adriraju) 2003 The Dravidian languages: A comparative, historical and typological study. Cambridge: Cambridge University Press. Krishnamurti, Bh(adriraju), and Aditi Mukherji (eds.) 1984 Modernization of Indian languages in the news media. Hyderabad: Osmania University Department of Linguistics. Krishnamurti, Bh(adriraju), Colin P. Masica, and Anjani K. Sinha (eds.) 1986 South Asian languages: Structure, convergence, and diglossia. Delhi: Motilal Banarsidass. Kuczkiewicz-Fraś, Agnieszka 2003 Perso-Arabic hybrids in Hindi. New Delhi: Manohar. Kuiper, F. B. J. 1948 Proto-Munda words in Sanskrit.
(Verhandelingen der Koninklijke Nederlandsche Akademie van Wetenschappen, Afd. Letterkunde, Nieuwe Reeks, Deel LI, No. 3.) Amsterdam. Kuiper, F. B. J. 1955 Rigvedic loanwords. In: Otto Spies (ed.), Studia Indologica: Festschrift für Willibald Kirfel, 137–185. Bonn: Orientalistisches Seminar der Universität.

Kuiper, F. B. J. 1962 Nahali: A comparative study. (Mededelingen der Koninklijke Nederlandse Akademie van Wetenschappen, Afd. Letterkunde, N. R., 25: 5.) Amsterdam: Noord-Hollandsche Uitgevers Maatschappij. Kuiper, F. B. J. 1966 The sources of Nahali vocabulary. In: Norman H. Zide (ed.), Studies in comparative Austroasiatic linguistics, 57–81. The Hague: Mouton. Kuiper, F. B. J. 1968a The genesis of a linguistic area. Indo-Iranian Journal 10(2/3): 81–102. Repr. 1974, International Journal of Dravidian Linguistics 3: 135–153. Kuiper, F. B. J. 1968b The Sanskrit nom. sing. viṭ. Indo-Iranian Journal 10(2/3): 103–125. Kuiper, F. B. J. 1991a An Indo-Iranian isogloss? Indo-Iranian Journal 34(1): 39–41. Kuiper, F. B. J. 1991b Aryans in the Rigveda. Amsterdam/Atlanta, GA: Rodopi. Kuiper, F. B. J. 1995 Foreign words in the Rigveda. Indo-Iranian Journal 38(3): 261. Kulkarni-Joshi, Sonal 2008 Deconvergence in Kupwad? Indian Linguistics 69: 153–162. Kulkarni-Joshi, Sonal 2011 Variation in a convergence area: The evidence from Marathi-Kannada contact. South Asian Languages Analysis 29, Central Institute of Indian Languages, Mysore, January 2011. Lakshmi Bai, B. 1984 Syntactic innovations in newspaper Hindi. In: Krishnamurti & Mukherji (eds.) 1984: 20–28. Lakshmi Bai, B. 1985 Some notes on correlative constructions in Dravidian. In: Acson & Leed (eds.) 1985: 181–190. Lamberg-Karlovsky, C. C. 2002 Language and archaeology: The Indo-Iranians. Current Anthropology 43(1): 63–88. Lange, Claudia 2012 The syntax of spoken Indian English. Amsterdam/Philadelphia: Benjamins. LaPolla, Randy J. 2001 The role of migration and language contact in the development of the Sino-Tibetan language family. In: Aikhenvald & Dixon (eds.) 2001: 225–254. LaPolla, Randy J. 2003 Overview of Sino-Tibetan morphosyntax. In: Thurgood & LaPolla (eds.) 2003: 22–42. LaPolla, Randy J. 2009 Causes and effects of substratum, superstratum, and adstratum influence, with reference to Tibeto-Burman languages.
Senri Ethnological Studies 75: 227–237. LaPolla, Randy J. 2013 Subgrouping in Tibeto-Burman: Can an individual-identifying standard be developed? How do we factor in the history of migrations and language contact? In: Balthasar Bickel, Lenore A. Grenoble, David A. Peterson, and Alan Timberlake (eds.), What’s where why? Language typology and historical contingency: In honor of Johanna Nichols, 463–474. Amsterdam/New York: Benjamins. Lassen, Christian 1844 Die Brahui und ihre Sprache. Zeitschrift für die Kunde des Morgenlandes 5: 337–409. Bonn: H. B. Koenig. Repr. 1971, Amsterdam: Associated Publishers. Lehmann, Thomas 1994 Grammatik des Alttamil unter besonderer Berücksichtigung der Caṅkam-Texte des Dichters Kapilar. Stuttgart: Steiner. Lehmann, Winfred (ed.) 1978 Syntactic typology: Studies in the phenomenology of language. Austin: University of Texas Press. Lehr, Rachel 2014 A descriptive grammar of Pashai: The language and speech community of Darrai Nur. University of Chicago PhD dissertation. ProQuest Dissertations 3638612. Lévi, S[ylvain] 1923 Pré-aryen et pré-dravidien dans l’Inde. Journal Asiatique 203: 1–57. (English translation in Bagchi 1975: 63–126.) Liang, Hsin-hsin, and Peter Edwin Hook 2001 The compound verb in Chinese and Hindi-Urdu and the Indo-Turanian linguistic area. In: Bhaskararao & Subbarao (eds.) 2001: 105–124. Liljegren, Henrik 2009 The Dangari tongue of Choke and Machoke: Tracing the proto-language of Shina enclaves in the Hindu Kush. Acta Orientalia 70: 7–62. Liljegren, Henrik 2013 Notes on Kalkoti: A Shina language with strong Kohistani influences. Linguistic Discovery 11(1): 129–160. Lorenz, Manfred 1990 Die Entwicklung des Paschto als moderne Literatursprache [The development of Pashto as a modern literary language]. In: István Fodor and Claude Hagège (eds.), Language reform: History and future, 105–125. Hamburg: Buske. Lorimer, David Lockhart Robinson 1922 The phonology of the Bakhtiari, Badakhshani, and Madaglashti dialects of Modern Persian, with vocabularies. London: Royal Asiatic Society. Lorimer, David Lockhart Robinson 1935–1938 The Burushaski language. I–III. (Publikationer Ser.
B: Skrifter, Instituttet for sammenlignende Kulturforskning, Serie B 29, 1–3.) Oslo: H. Aschehoug & Co. Lorimer, David Lockhart Robinson 1937 Burushaski and its alien neighbours: Problems in linguistic contagion. Transactions of the Philological Society 36: 63–98. Lorimer, David Lockhart Robinson 1939 The Dumaki language. Nijmegen: Dekker & van de Vegt. Lorimer, David Lockhart Robinson 1962 Werchikwar-English vocabulary. Oslo: Norwegian Universities Press.

Lubotsky, Alexander M. 2001 The Indo-Iranian substratum. In: C. Carpelan, A. Parpola, and P. Koskikallio (eds.), Early contacts between Uralic and Indo-European: Linguistic and archaeological considerations: Papers presented at an international symposium held at the Tvärminne Research Station of the University of Helsinki 8–10 January 1999, 301–317. Helsinki: Suomalais-Ugrilainen Seura. MacKenzie, David Neil 1959 A standard Pashto. Bulletin of the School of Oriental and African Studies 22: 231–235. Mahboob, Ahmar 2008 Pakistani English: Morphology and syntax. In: Mesthrie (ed.) 2008: 578–592. Mahmoodi Bakhtiari, Behrooz 2003 Notes on the tense system in Balochi and standard Persian. In: Jahani, Korn & Gren-Eklund (eds.) 2003: 133–145. Mahmoodzahi, Moosa 2003 Linguistic contact in Iranian Balochistan in historical and modern times. In: Jahani, Korn & Gren-Eklund (eds.) 2003: 147–156. Mansoor, Sabiha 1993 Punjabi, Urdu, English in Pakistan. Lahore: Vanguard. Marar, Kuttikrishana 1971 Malayala Sailee. Kozhikode: Mathrubhumi Publication. Marlow, Patrick Edward 1994 On the origin of embedded relative clauses in Hindi. In: Davison & Smith (eds.) 1994: 167–186. Marlow, Patrick Edward 1997 Origin and development of the Indo-Aryan quotatives and complementizers: An areal approach. University of Illinois PhD dissertation. ProQuest Dissertations 9737189. Masica, Colin P. 1976 Defining a linguistic area: South Asia. Chicago/London: The University of Chicago Press. Masica, Colin P. 1979 Aryan and non-Aryan elements in north Indian agriculture. In: Deshpande & Hook (eds.) 1979: 55–152. Masica, Colin P. 1983 South Asian languages: Typological coincidence or areal convergence? 35th meeting of the Association for Asian Studies. Masica, Colin P. 1991 The Indo-Aryan languages. Cambridge: Cambridge University Press. Masica, Colin P. 
2001 The definition and significance of linguistic areas: Methods, pitfalls, and possibilities (with special reference to the validity of South Asia as a linguistic area). In: Bhaskararao & Subbarao (eds.) 2001: 165–267. Maspero, Henri 1946 Notes sur la morphologie du tibeto-birman et du munda. Bulletin de la Société de Linguistique de Paris 21: 9–15.

Matisoff, James A. 1986 The languages and dialects of Tibeto-Burman: An alphabetic/genetic listing, with some prefatory remarks on ethnonymic and glossonymic complications. In: John McCoy and Timothy Light (eds.), Contributions to Sino-Tibetan studies, 3–75. Leiden: Brill. Matisoff, James A. 1990 On megalocomparison. Language 66(1): 106–120. Matras, Yaron 2002 Romani: A linguistic introduction. Cambridge: Cambridge University Press. Matras, Yaron, April McMahon, and Nigel Vincent (eds.) 2006 Linguistic areas: Convergence in historical and typological perspective. Houndmills, Basingstoke, Hampshire/New York: Palgrave Macmillan. Mayrhofer, Manfred 1951 Arische Landnahme und indische Bevölkerung im Spiegel der altindischen Sprache. Saeculum 2: 54–64. Mayrhofer, Manfred 1956–1976 Kurzgefaßtes etymologisches Wörterbuch des Altindischen. Heidelberg: Winter. Mayrhofer, Manfred 1986–2001 Etymologisches Wörterbuch des Altindoarischen [Etymological dictionary of Old Indo-Aryan], 3 vols. Heidelberg: Winter. McAlpin, David W. 1974 Toward Proto-Elamo-Dravidian. Language 50(1): 89–101. McAlpin, David W. 1975 Elamite and Dravidian: Further evidence of relationship. Current Anthropology 16(1): 105–115. McAlpin, David W. 1979 Linguistic prehistory: The Dravidian situation. In: Deshpande & Hook (eds.) 1979: 175–189. McAlpin, David W. 1980 Is Brahui really Dravidian? In: Caron et al. (eds.) 1980: 66–72. McAlpin, David W. 1981 Proto-Elamo-Dravidian: The evidence and its implications. Transactions of the American Philosophical Society, New Series 71(3): 1–155. McAlpin, David W. 2003 Velars, uvulars, and the North Dravidian hypothesis. Journal of the American Oriental Society 123(3): 521–546. McAlpin, David W. 2015 Brahui and the Zagrosian hypothesis. Journal of the American Oriental Society 135(3): 551–586. McAlpin, David W. Forthcoming Modern colloquial Eastern Elamite. McWhorter, John 2007 Language interrupted: Signs of non-native acquisition in standard language grammars.
Oxford: Oxford University Press. Mesthrie, Rajend (ed.) 2008 Varieties of English: Africa, South and Southeast Asia. Berlin/New York: Mouton de Gruyter.

Meyer-Ingwersen, Johannes 1966 Untersuchungen zum Satzbau des Paschto [Studies on sentence construction of Pashto]. Universität Hamburg PhD dissertation. Mishra, V. N. 1963 Hindī bhāṣā aur sāhitya par angrezī prabhāv (1870–1920) [Influence of English on Hindi language and literature]. Dehradun. (Cited in B. B. Kachru 1969.) Misra, Satya Swarup 1992 The Aryan problem: A linguistic approach. New Delhi: Munshiram Manoharlal. Mitchell, Lisa 2009 Language, emotion, and politics in South India: The making of a mother tongue. Bloomington/Indianapolis: Indiana University Press. Mizokami, Tomio 1987 Language contact in Panjab: A sociolinguistic study of the migrants’ language. New Delhi: Bahri. Mock, John Howard 1998 The discursive construction of reality in the Wakhi community of northern Pakistan. University of California, Berkeley, PhD dissertation. ProQuest Dissertations 9922976. Mohanty, Panchanan 1997 Loss of /o/ in Kui, Sora, and Oriya: A clue for sub-linguistic area. In: Abbi (ed.) 1997: 109–120. Mōmand, Qalandar, and Farīd Sahrāī 1994 Daryāb: Paṣ̌tō luġat [Daryāb: A dictionary of Pashto]. Peshawar: NWFP Textbook Board. Mørch, Ida Elisabeth, and Jan Heegård 1997 Retroflekse vokalers oprindelse i kalashamon i historisk og areallingvistisk perspektiv [The origin of retroflex vowels in Kalashamon in a historical and areal linguistic perspective]. University of Copenhagen MA thesis. Morgenstierne, Georg 1926 Report on a linguistic mission to Afghanistan. Oslo: Instituttet for Sammenlignende Kulturforskning. Morgenstierne, Georg 1932a Report on a linguistic mission to north-western India. Oslo: Instituttet for Sammenlignende Kulturforskning. Morgenstierne, Georg 1932b Notes on Balochi etymology. Norsk Tidsskrift for Sprogvidenskap 5: 37–53. Repr. in Morgenstierne 1973b: 148–164. Morgenstierne, Georg 1935 Preface to Lorimer 1935–1938. In: Lorimer 1935–1938, Vol. 1: vii–xxx. Morgenstierne, Georg 1936 Iranian elements in Khowar. Bulletin of the School of Oriental Studies 8: 657–671. Repr. in Morgenstierne 1973b: 241–255. Morgenstierne, Georg 1937 Review of Bray 1934. Journal of the Royal Asiatic Society (new series) 69(2): 345–348.

Morgenstierne, Georg 1938 Indo-Iranian frontier languages, vol. II: Iranian Pamir languages. Oslo: Instituttet for Sammenlignende Kulturforskning.
Morgenstierne, Georg 1940 Archaisms and innovations in Pashto morphology. Norsk Tidsskrift for Sprogvidenskap 12: 88–114.
Morgenstierne, Georg 1973a Indo-Iranian frontier languages, vol. III: The Pashai language, 1: Grammar. 2nd ed. Oslo/Bergen/Tromsö: Universitetsforlaget.
Morgenstierne, Georg 1973b Irano-Dardica. Wiesbaden: Reichert.
Morgenstierne, Georg 1974 Early Iranic influence upon Indo-Aryan. In: Commemoration Cyrus: Actes du Congrès de Shiraz 1971 et autres études rédigées à l’occasion du 2500e anniversaire de la fondation de l’Empire perse, vol. I, 271–279. (Acta Iranica 1.) Tehran/Leiden: Brill.
Morgenstierne, Georg 1975 Ancient contacts between N. E. Iranian and Indo-Aryan? In: Mélanges linguistiques offerts à Émile Benveniste, 431–434. (Collection Linguistique, Société de linguistique de Paris 7.) Paris/Louvain: Peeters.
Morgenstierne, Georg 1979 The linguistic stratification of Afghanistan. Afghan Studies 2: 23–33.
Morin, Yves-Charles, and Louise Dagenais 1977 Les emprunts ourdous en bourouchaski. Journal Asiatique 265: 307–343.
Müller, Katja, Elisabeth Abbess, Calvin Thiessen, and Gabriela Thiessen 2008 Language vitality and development among the Wakhi people of Tajikistan. Dallas: SIL International. www.sil.org/silesr/2008/silesr2008–011.pdf (accessed 28 Nov 2014)
Munshi, Sadaf 2006 Jammu and Kashmir Burushaski: Language, language contact and change. The University of Texas, Austin, PhD dissertation. http://www.lib.utexas.edu/etd/d/2006/munshis96677/munshis96677.pdf (accessed 29 December 2014)
Munshi, Sadaf 2010 Contact-induced language change in a trilingual context: The case of Burushaski in Srinagar. Diachronica 27(1): 32–72.
Murugaiyan, A., and Christiane Pilot-Raichoor 2004 Les prédications indifférenciées en dravidien: témoins d’une évolution typologique archaïque. In: Jacques François and Irmtraud Behr (eds.), Les constituants prédicatifs et la diversité des langues, 155–177. (Mémoires de la Société de linguistique de Paris, Nouvelle Série, 14.) Louvain: Peeters.
Muysken, Pieter (ed.) 2008 From linguistic areas to areal linguistics. Amsterdam/Philadelphia: Benjamins.
Nadkarni, Mangesh V. 1975 Bilingualism and syntactic change in Konkani. Language 51: 672–683.
Neukom, Lukas 1999 Phonological typology of northeast India. Linguistics of the Tibeto-Burman Area 22(2): 121–147.

Neukom, Lukas 2000 Argument marking in Santali. Mon-Khmer Studies 30: 95–113.
Nichols, Johanna 1996 The comparative method as heuristic. In: Mark Durie & Malcolm D. Ross (eds.), The comparative method reviewed: Regularity and irregularity in language change, 39–71. New York: Oxford University Press.
Nichols, Robert 2008 A history of Pashtun migration, 1776–2006. Oxford: Oxford University Press.
Nirvair, Darshan Singh 1975 Persian words in Pañjābī: A semantic view. Vishveshvaranand Indological Journal 13(1/2): 250–257.
Noonan, Michael 2003 Recent language contact in the Nepal Himalaya. In: Bradley (ed.) 2003: 65–88.
Osada, Toshiki 1991 Linguistic convergence in the Chotanagpur area. In: S. Bosu Mullick (ed.), Cultural Chotanagpur: Unity in diversity, 99–119. New Delhi: Uppal Publishing.
Osada, Toshiki 2008 Mundari. In: Anderson (ed.) 2008: 99–164.
Osada, Toshiki 2009 How many Proto-Munda words in Sanskrit? With special reference to agricultural vocabulary. In: Osada (ed.) 2009: 101–126.
Osada, Toshiki (ed.) 2009 Linguistics, archaeology and human past in South Asia. Delhi: Manohar.
Ostrovskij, Boris J. 1996 Opyt sistematizacii glagol’nyx kategorij (na materiale jazyka dari) [Attempt at a systematization of verb categories (based on material of the Dari language)]. Voprosy jazykoznanija 6: 119–132. Moskva: Nauka.
Pahwāl, ‘Abdorrahmān 1386 [2007] Balōčī gālband. Qāmūs-e balōčī, paštō, darī, englīsī. Ba kūšeš-e doktūr Lūts Žehāk wa Bēdollah Nārūyī [Balochi Vocabulary. Balochi-Pashto-Dari-English Dictionary, edited by Lutz Rzehak and Bedollah Naruyi]. Kabul/Peshawar: Al-Azhar Book Company.
Pakhalina, Tatiana Nikolaevna 1975 Vaxanskij jazyk [The Wakhi language]. Moskva: Nauka.
Pakhalina, Tatiana Nikolaevna 1981 Materialy po jazyku Kivi [Materials on the Kivi language]. In: Iranskoe jazykoznanie. Ežegodnik 1982, 112–114. Moskva: Nauka.
Pakhalina, Tatiana Nikolaevna 1985 Éléments indo-aryens dans les langues iraniennes orientales [Indo-Aryan elements in the Eastern Iranian languages, translated from Russian by Jacques Veyrenc]. In: Mélanges linguistiques offerts à Émile Benveniste, 441–445. (Collection Linguistique publiée par la Société de linguistique de Paris, 70.) Louvain: Peeters.
Pandharipande, Rajeshwari 1982 Counteracting forces in language change: Convergence vs. maintenance. In: Hock (ed.) 1982: 97–116.

Pandharipande, Rajeshwari 1997 Marathi. London/New York: Routledge.
Pandit, Prabodh B. 1972 Bilingual’s grammar: Tamil-Saurashtri grammatical convergence. In: Pandit (ed.) 1972: 1–25.
Pandit, Prabodh B. (ed.) 1972 India as a socio-linguistic area. Ganeshkind: University of Poona.
Panikkar, G. K. 1993 The Brahuis of Iran. International Journal of Dravidian Linguistics 22(2): 34–50.
Pardeshi, Prashant, and Peter Edwin Hook 2006 Toward a geotypology of EAT-expressions in languages of Asia: Visualizing areal patterns through WALS. Gengo Kenkyu 130: 89–108.
Parkin, Robert 1989 Some comments on Brahui kinship terminology. Indo-Iranian Journal 32(1): 37–43.
Parpola, Asko 1988 The coming of the Aryans to Iran and India and the cultural and ethnic identity of the Dāsas. Studia Orientalia 64: 195–302.
Parpola, Asko 1994 Deciphering the Indus script. Cambridge: Cambridge University Press.
Parpola, Asko 2001 Emergence, contacts and dispersal of Proto-Indo-European, Proto-Uralic and Proto-Aryan in archaeological perspective. In: C. Carpelan, A. Parpola, and P. Koskikallio (eds.), Early contacts between Uralic and Indo-European: Linguistic and archaeological considerations, 55–150. Helsinki: Suomalais-Ugrilainen Seura.
Parpola, Asko 2002a From the dialects of Old Indo-Aryan to Proto-Indo-Aryan and Proto-Iranian. In: Nicholas Sims-Williams (ed.), Indo-Iranian languages and peoples, 66–72. London: The British Academy.
Parpola, Asko 2002b Pre-Proto-Iranians of Afghanistan as initiators of Śākta Tantrism: On the Scythian/Saka affiliation of the Dāsas, Nuristanis and Magadhans. Iranica Antiqua 37: 233–324.
Parpola, Asko, and Juha Janhunen 2011 On the Asiatic wild asses and their vernacular names. In: Toshiki Osada and Hitoshi Endo (eds.), Occasional Paper 12: Linguistics, archaeology and the human past, revised version, 59–124. Kyoto: Research Institute for Humanity and Nature.
Patnaik, B. N., and Ira Pandit 1986 Englishization of Oriya. In: Krishnamurti, Masica & Sinha (eds.) 1986: 232–243.
Patnaik, Manideepa 2008 Juang. In: Anderson (ed.) 2008: 508–556.
Patry, Richard, and Étienne Tiffou 1997 Les emprunts lexicaux à l’ourdou en bourouchaski du Yasin: Un phénomène qui varie selon l’âge. In: Communications au congrès New Wave (Quebec 1997), 1–11.

Payne, John 1989 Pāmir languages. In: Schmitt (ed.) 1989: 417–444.
Penzl, Herbert 1955 A grammar of Pashto: A descriptive study of the dialect of Kandahar, Afghanistan. Washington: American Council of Learned Societies.
Peterson, John 2008 Kharia. In: Anderson (ed.) 2008: 434–507.
Peterson, John 2010 Language contact in Jharkhand: Linguistic convergence between Munda and Indo-Aryan in eastern-central India. Himalayan Linguistics 9(2): 56–86.
Pollock, Sheldon (ed.) 2003 Literary cultures in history: Reconstructions from South Asia. Berkeley/Los Angeles/London: University of California Press.
Post, Mark W. 2010 The Siyom River Valley: An essay on intra-subgroup convergence in Tibeto-Burman. Fifth International Conference of the North East Indian Linguistics Society, Gauhati University, Assam, India, Feb. 12–14.
Post, Mark W., and Yankee Modi 2011 Language contact and the genetic position of Milang in Tibeto-Burman. Anthropological Linguistics 53(3): 215–258.
Pott, August Friedrich 1833, 1836 Etymologische Forschungen auf dem Gebiete der indogermanischen Sprachen, 2 vols. Lemgo: Meyer.
Prabhakar Babu, B. A. 1974 A phonological study of English spoken by Telugu speakers in Andhra Pradesh. Hyderabad: Osmania University.
Pray, Bruce R. 1980 Evidence of grammatical convergence in Dakhini Urdu and Telugu. In: Caron et al. (eds.) 1980: 90–99.
Przyluski, Jean 1926 Un ancien peuple du Penjab: Les Udumbara. Journal Asiatique 206: 25–36.
Puri, Vandana 2011 The influence of English on the history of Hindi relative clauses. Journal of Language Contact 4: 250–268.
Qadeer, Altaf 2011 The socio-cognitive dynamics of Hindu/Urdu lexemes. In: Concise Oxford English dictionary. http://archiv.ub.uni-heidelberg.de/savifadok/1896/1/Hindi_Urdu_Words_in_Oxford_Eng_Dictionaries_June_27_2011_1.pdf (accessed 28 Nov. 2014)
Rajaram, Navaratna S. 1995 The politics of history: Aryan invasion theory and the subversion of scholarship. New Delhi: Voice of India.
Ramanujan, A. K., and Colin P. Masica 1969 Toward a phonological typology of the Indian linguistic area. In: Sebeok, Emeneau & Ferguson (eds.) 1969: 543–577.
Ramasamy, K. 1981 Correlative relative clauses in Tamil. In: Agesthialingom & Nair (eds.): 363–380.

Ramat, Paolo 1998 Typological comparison and linguistic areas: Some introductory remarks. Language Sciences 20(3): 227–240.
Rāsex Yāldaram, [pōhandōy] Mohammad Sāleh 1388 [= 2009/2010] Farhang-e rāsex (torkmanī darī) [The dictionary of Rasekh: Turkmen-Dari]. Kābol: Wazārat-e omūr-e sarhadāt wa aqwām-u qabāyel. Riyāsat-e našarāt wa omūr-e farhangī.
Rasul, Sarwet 2009 Code-mixing and hybridization in Pakistan: Linguistic, socio-cultural and attitudinal perspectives. Saarbrücken: VDM Verlag.
Rasul, Sarwet 2013 Borrowing and code mixing in Pakistani children’s magazines: Practices and functions. Pakistaniaat: A Journal of Pakistan Studies 5(2): 46–72.
Redard, Georges 1974 État des travaux et publication: Quelques cartes onomasiologiques [State of work and publication: Some onomasiological maps]. In: Redard, Sana & Kieffer (eds.) 1974: 7–19.
Redard, Georges, Sanaoullah Sana, and Charles M. Kieffer (eds.) 1974 L’Atlas linguistique des parlers iraniens: Atlas de l’Afghanistan. (University of Bern, Institut für Sprachwissenschaft, Arbeitspapiere, 13.) Bern: University of Bern.
Rehman, Khwaja A., and Joan L. G. Baart 2005 A first look at the language of Kundal Shahi in Azad Kashmir. http://www.sil.org/silewp/abstract.asp?ref=2005–008 (accessed 27 Nov. 2014)
Reinhold, Beate 2006 Neue Entwicklungen in der Wakhi-Sprache von Gojal (Nordpakistan): Bildung, Migration und Mehrsprachigkeit [New developments in the Wakhi language of Gojal (North Pakistan): Culture, migration and multilingualism]. Wiesbaden: Harrassowitz.
Renfrew, Colin 1987 Archaeology and language: The puzzle of Indo-European origins. London: Pimlico.
Riccardi, Theodore 2003 Nepali. In: Cardona & Jain (eds.) 2003: 538–580.
Ringe, Donald A., Jr. 1990 Evidence for the position of Tocharian in the Indo-European family? Die Sprache 34: 59–123.
Ringe, Don[ald A.], Jr. 1996 On the chronology of sound changes in Tocharian, 1. New Haven: American Oriental Society.
Roberts, Taylor 2000 Clitics and agreement. MIT PhD dissertation. ProQuest Dissertations 0802415.
Robson, Barbara, and Habibullah Tegey 2009 Pashto. In: Windfuhr (ed.) 2009: 721–772.
Rossi, Adriano V. 1971 Iranian elements in Brahui I: Stems with -ā/ănk, -ī/ĭnk, -ū/ŭnk, -ēnk, -ōnk. Annali dell’Istituto Orientale di Napoli 31: 400–407.

Rossi, Adriano V. 1977 Brāhūi and Western Iranian clusters *šk, *sk (being Iranian elements in Brahui, II). Annali dell’Istituto Orientale di Napoli, Supplemento n. 12 agli 37(3): 1–71.
Rossi, Adriano V. 1979 Iranian lexical elements in Brahui. (Seminario di Studi Asiatici, Series Minor, 8.) Naples: Istituto Universitario Orientale.
Rzehak, Lutz 2003 Some thoughts and material on Balochi in Afghanistan. In: Jahani, Korn & Gren-Eklund (eds.) 2003: 259–276.
Rzehak, Lutz 2009 Code copying in the Balochi language of Sistan. Iranian Journal of Applied Language Studies 1: 115–141. http://ijals2.usb.ac.ir/ (accessed 16 Dec. 2014)
Rzehak, Lutz 2012 How to name universities? Or: Is there any linguistic problem in Afghanistan? ORIENT: Deutsche Zeitschrift für Politik, Wirtschaft und Kultur des Orients 53(2): 84–90.
Sabir, Abdul Razzak 1995 Morphological similarities in Brahui and Balochi languages. International Journal of Dravidian Linguistics 24(1): 1–8.
Sabir, Abdur Razzak 2003 Language contact in Balochistan (with special reference to Balochi and Brahui). In: Joan L. G. Baart and Ghulam Hyder Sindhi (eds.), Pakistani languages and society: Problems and prospects, 121–131. Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.
Sailaja, Pingali 2009 Indian English. Edinburgh: Edinburgh University Press.
Sankoff, Gillian 2002 Linguistic outcomes of language contact. In: J. K. Chambers, Peter Trudgill, and Natalie Schilling-Estes (eds.), The handbook of language variation and change, 638–668. Malden, MA/Oxford: Blackwell.
Satyanath, Shobha, and Nazrin Laskar 2008 Lexicon in a contact language: The case of Bishnupriya. In: Stephen Morey and Mark W. Post (eds.), North East Indian linguistics, 75–92. New Delhi: Cambridge University Press.
Saxena, Anju 1988 On syntactic convergence: The case of the verb “say” in Tibeto-Burman. In: Kira Hall and Michael Meacham (eds.), Proceedings of the Fifteenth Annual Meeting of the Berkeley Linguistics Society: February 18–20 1989, 375–388. Berkeley, CA: Berkeley Linguistics Society.
Saxena, Anju 2013 International workshop on linguistic microareas in South Asia. http://www.lingfil.uu.se/calendar/konf/lmsa/ (accessed 14 Dec. 2014).
Saxena, Anju (ed.) 2015 Micro-linguistic areas in South Asia. (Journal of South Asian Languages and Linguistics 2(1), special issue.)

Saxena, Anju, and Lars Borin (eds.) 2006 Lesser-known languages of South Asia: Status and policies, case studies and applications of information technology. Berlin/New York: Mouton de Gruyter.
Schackow, Diana 2008 Clause linkage in Puma (Kiranti). Universität Leipzig Magisterarbeit.
Schiffman, Harold F., and Brian Spooner (eds.) 2012 Language policy and language conflict in Afghanistan and its neighbors. Leiden/Boston: Brill. [Brian Spooner’s coeditorship was inadvertently omitted in the published version.]
Schmidt, Ruth Laila, and Razwal Kohistani 2008 A grammar of the Shina language of Indus Kohistan. Wiesbaden: Harrassowitz.
Schmidt, Ruth Laila, and Vijay Kumar Kaul 2008 A comparative analysis of Shina and Kashmiri vocabularies. Acta Orientalia 69: 231–301.
Schmitt, Rüdiger (ed.) 1989 Compendium linguarum iranicarum. Wiesbaden: Reichert.
Sebeok, Thomas A., Murray B. Emeneau, and Charles A. Ferguson (eds.) 1969 Current trends in linguistics 5: Linguistics in South Asia. The Hague/Paris: Mouton.
Septfonds, Daniel 1994 Le dzadrâni: Un parler Pashto du Paktyâ (Afghanistan). (Travaux de l’Institut d’Études Iraniennes de l’Université de la Sorbonne Nouvelle, 15.) Louvain: Peeters.
Sering, Senge 2002 Key issues in contemporary Balti language and script. Eighth Himalayan Languages Symposium, University of Bern, Switzerland, Institute for Linguistics, September 2002.
Sethi, J. 1980 Word accent in educated Punjabi speakers’ English. Bulletin of the Central Institute of English 16(2): 31–55.
Shackle, Christopher 1970 Punjabi in Lahore. Modern Asian Studies 4(3): 239–267.
Shackle, Christopher 1978 Approaches to the Persian loans in the “Ādi Granth”. Bulletin of the School of Oriental and African Studies 41(1): 73–96.
Shackle, Christopher 1980 Hindko in Kohat and Peshawar. Bulletin of the School of Oriental and African Studies 43(3): 482–510.
Shackle, Christopher 2005 Sindhi. Encyclopaedia Iranica. http://www.iranicaonline.org/articles/sindhi (accessed 28 Nov 2014)
Shapiro, Michael 1987 Hindi lagnā: A study in semantic change. Journal of the American Oriental Society 107(3): 401–408.

Sharma, Maansi 2012 Phonological changes in the Hindi loanwords of contact Hindi in Shillong. Seventh Annual International Conference of the North East Indian Linguistics Society, Guwahati, Assam, India, Jan 31–Feb 2, 2012.
Singh, Rajendra (ed.) 2006 The yearbook of South Asian languages and linguistics 2006. Berlin/New York: Mouton de Gruyter.
Sirsa, Hema, and Melissa A. Redford 2013 The effects of native language on Indian English sounds and timing patterns. Journal of Phonetics 41: 393–406. http://www.sciencedirect.com/science/journal/00954470/41/6 (accessed 12 Dec. 2014)
Sjoberg, Andrée F. 1992 The impact of Dravidian on Indo-Aryan: An overview. In: Edgar C. Polomé and Werner Winter (eds.), Reconstructing languages and cultures, 507–529. Berlin/New York: Mouton de Gruyter.
Skjærvø, Prods O. 1989a Modern East Iranian languages. In: Schmitt (ed.) 1989: 370–383.
Skjærvø, Prods O. 1989b Yidgha and Munǰī. In: Schmitt (ed.) 1989: 411–416.
Skjærvø, Prods O. 1989c Pashto. In: Schmitt (ed.) 1989: 384–410.
Slade, Benjamin 2011 Formal and philological inquiries into the nature of interrogatives, indefinites, disjunction, and focus in Sinhala and other languages. University of Illinois PhD dissertation. ProQuest Dissertations 3496670.
Snell, Rupert 1993 The hidden hand: English lexis, syntax and idiom as determinants of Modern Hindi usage. In: David Arnold and Peter Robb (eds.), Institutions and ideologies: A SOAS South Asia reader, 74–90. Richmond, UK: Routledge Curzon.
Southworth, Franklin C. 1971 Detecting prior creolization: An analysis of the historical origins of Marathi. In: Hymes (ed.) 1971: 255–276.
Southworth, Franklin C. 1974 Linguistic stratigraphy of North India. International Journal of Dravidian Linguistics 3(1): 201–223.
Southworth, Franklin C. 1979 Lexical evidence for early contacts between Indo-Aryan and Dravidian. In: Deshpande & Hook (eds.) 1979: 191–233.
Southworth, Franklin C. 2005 Linguistic archaeology of South Asia. London/New York: Routledge Curzon.
Southworth, Franklin C. 2006 New light on three South Asian language families. Mother Tongue 11: 124–159.
Southworth, Franklin C. 2011 Rice in Dravidian. Rice 4(3–4): 142–148.
Southworth, Franklin C. Forthcoming Indo-Aryan-Dravidian language contact: An etymological investigation based on CDIAL and DEDR, Part I.

Southworth, Franklin C., and David W. McAlpin 2013 South Asia: Dravidian linguistic history. In: Immanuel Ness (ed.), Encyclopedia of global human migration, Chapter 30. Hoboken, NJ: Wiley-Blackwell.
Spooner, Brian 1967 Notes on the Baluchī spoken in Persian Baluchistan. Iran: Journal of the British Institute of Persian Studies 5: 51–71.
Spooner, Brian 2012 Balochi: Towards a biography of the language. In: Schiffman (ed.) 2012: 319–336.
Sridhar, S. N. 1978 On the functions of code-mixing in Kannada. International Journal of the Sociology of Language 16: 109–117.
Sridhar, S. N. 2008 Language modernization in Kannada. In: Kachru, Kachru & Sridhar (eds.) 2008: 327–341.
Starostin, Georgii 2011 (2010) Kak sozdaetsja edinaja klassifikacija jazykov mira [How a single classification of the languages of the world is being created]. Polytechnical Museum (Moscow), 16 December 2010. http://polit.ru/article/2010/11/23/starostin/ (accessed 12 Nov 2014)
Steblin-Kamenskij, Ivan M. 1999 Ėtimologičeskij slovar’ vaxanskogo yazyka [Etymological dictionary of the Wakhi language]. Sankt-Peterburg: Peterburgskoe vostokovedenie.
Steever, Sanford B. 1986 Morphological convergence in the Khondmals: (Pro)nominal incorporation. In: Krishnamurti, Masica & Sinha (eds.) 1986: 270–285.
Steever, Sanford B. 1988 The serial verb formation in the Dravidian languages. Delhi: Motilal Banarsidass.
Steever, Sanford B. 1998 Malto. In: Steever (ed.) 1998: 359–387.
Steever, Sanford B. (ed.) 1998 The Dravidian languages. London/New York: Routledge.
Stilo, Donald L. 1987 Ambipositions as an areal response: The case-study of the Iranian zone. In: Bashir, Deshpande & Hook (eds.) 1987: 308–336.
Stilo, Donald L. 2004 Iranian as buffer zone between the universal typologies of Turkic and Semitic. In: Éva Ágnes Csató, Bo Isaksson, and Carina Jahani (eds.), Linguistic convergence and areal diffusion: Case studies from Iranian, Semitic and Turkic, 35–63. London/New York: Routledge Curzon.
Stilo, Donald L. 2006 Circumpositions as an areal response: The case of the Iranian zone. (Reprint of Stilo 1987.) In: Lars Johanson and Christiane Bulut (eds.), Turkic-Iranian contact areas: Historical and linguistic aspects, 310–333. Wiesbaden: Harrassowitz.

Strand, Richard F. 2010 Nurestâni languages. Encyclopaedia Iranica. http://www.iranicaonline.org/articles/nurestani-languages (accessed 14 Nov 2014)
Strand, Richard F. 2011 Irânian-Speaking peoples of the Hindu-Kush region. http://nuristan.info/Iranian/Iranians.html (accessed 28 Nov. 2014)
Subbarao, Karumuri V. 2001 Agreement in South Asian languages and Minimalist inquiries: The framework. In: Bhaskararao & Subbarao (eds.) 2001: 457–492.
Subbarao, Karumuri V. 2008 Typological characteristics of South Asian languages. In: Kachru, Kachru & Sridhar (eds.) 2008: 49–78.
Subbarao, Karumuri V. 2012 South Asian languages: A syntactic typology. Cambridge: Cambridge University Press.
Subrahmanyam, P. S. 2008 Dravidian comparative grammar, 1. Mysore: Central Institute of Indian Languages.
Sun, Jackson 1993 A historical-comparative study of the Tani (Mirish) branch of Tibeto-Burman. University of California, Berkeley, PhD dissertation. ProQuest Dissertations 9408131.
Swarajya Lakshmi, V. 1984 Urdu influence on Telugu. Hyderabad: Department of Linguistics, Osmania University.
Talageri, Shrikant G. 2008 The Rigveda and the Avesta: The final evidence. New Delhi: Aditya Prakashan.
Thomason, Sarah G. 2001 Language contact: An introduction. Edinburgh/Washington, DC: Edinburgh University Press/Georgetown University Press.
Thomason, Sarah G., and Terrence Kaufman 1988 Language contact, creolization, and genetic linguistics. Republished 1991, Berkeley/Los Angeles: University of California Press.
Thurgood, Graham 2003 A sub-grouping of the Sino-Tibetan languages: The interaction between language change, contact, and inheritance. In: Thurgood & LaPolla (eds.) 2003: 1–21.
Thurgood, Graham, and Randy LaPolla (eds.) 2003 The Sino-Tibetan languages. London: Routledge.
Tiffou, Étienne, and Jurgen Pesot 1989 Contes du Yasin. Paris: Peeters/SELAF.
Tikkanen, Bertil 1987 The Sanskrit gerund: A synchronic, diachronic, and typological analysis. (Studia Orientalia 62.) Helsinki: Finnish Oriental Society.
Tikkanen, Bertil 1988 On Burushaski and other ancient substrata in northwestern South Asia. Studia Orientalia 64: 303–325.

Tikkanen, Bertil 1995 Burushaski converbs in their South and Central Asian areal context. In: Martin Haspelmath and Ekkehard König (eds.), Converbs in cross-linguistic perspective, 487–528. Berlin: Mouton de Gruyter.
Tikkanen, Bertil 1999 Archaeological-linguistic correlations in the formation of retroflex typologies and correlating areal features in South Asia. In: Roger Blench and Matthew Spriggs (eds.), Archaeology and language IV: Language change and cultural transformation, 138–148. London/New York: Routledge.
Tikkanen, Bertil 2007 Burushaski hurúṭas and Domaki beésiná ‘to sit, stay, dwell’ as aspectual auxiliaries and their regional parallels. Acta Orientalia 68: 135–160.
Tikkanen, Bertil 2008 Some areal phonological isoglosses in the transit zone between South and Central Asia. In: Israruddin (ed.), Proceedings of the Third International Hindu Kush Cultural Conference (Chitral, 26–30 September, 1995), 250–262. Karachi: Oxford University Press.
Tikkanen, Bertil 2011 Domaki noun inflection and case syntax. In: Bertil Tikkanen and Albion M. Butters (eds.), Pūrvāparaprajñābhinandanam: East and west, past and present: Indological and other essays in honour of Klaus Karttunen, 205–228. (Studia Orientalia 110.) Helsinki: Finnish Oriental Society.
Titus, Paul 2003 Linguistic contact in the Baloch-Pushtun boundary zone. In: Jahani, Korn & Gren-Eklund (eds.) 2003: 221–235.
Toporov, V. N. 1965 Neskol’ko zamečanij k fonologičeskoj xarakteristike centralnoaziatskogo jazykovogo sojuza. In: Symbolae linguisticae in honorem Georgii Kurylowicz, 322–330. (Polska Akademia Nauk. Oddział w Krakowie. Prace Komisji Językoznawstwa 5.) Wrocław: Zakład Narodowy im. Ossolińskich.
Trail, Ronald L., and Gregory R. Cooper 1985 Kalasha phonemic survey. MS.
Trail, Ronald L., and Gregory R. Cooper 1999 Kalasha dictionary – with English and Urdu. (Studies in Languages of Northern Pakistan, 7.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.
Trumpp, Ernst 1872 Grammar of the Sindhi language, compared with the Sanskrit-Prakrit and the cognate Indian vernaculars. London: Trübner and Co. Repr. 1970, Osnabrück: Biblio Verlag.
Trumpp, Ernst 1880 Grammatische Untersuchungen über die Sprache der Brāhūīs. (Sitzungsberichte der Bayerischen Akademie der Wissenschaften, Philosophisch-Philologische und Historische Klasse 6.) München: Bayerische Akademie der Wissenschaften.
Tuite, Kevin 1998 Evidence for prehistoric links between the Caucasus and Central Asia: The case of the Burushos. In: Victor H. Mair (ed.), The Bronze Age and Early Iron Age peoples of eastern Central Asia, 1, 448–475. (Journal of Indo-European Studies, Monograph 26.) Washington, DC: The Institute for the Study of Man, in collaboration with The University of Pennsylvania Museum Publications.
Turner, Ralph L. 1921 Gujarati phonology. Journal of the Royal Asiatic Society 1921: 505–554. Repr. in Turner 1973: 88–145.
Turner, Ralph L. 1924 Cerebralization in Sindhi. Journal of the Royal Asiatic Society 1924: 555–584. Repr. in Turner 1973: 206–227.
Turner, Ralph L. 1966 A comparative dictionary of the Indo-Aryan languages. London: Oxford University Press.
Turner, Ralph L. 1967 Geminates after long vowel in Indo-Aryan. Bulletin of the School of Oriental and African Studies 30: 73–82. Repr. in Turner 1973: 405–415.
Turner, Ralph L. 1973 Indo-Aryan linguistics: Collected papers. Ed. by J. Brough. London: Oxford University Press. Repr. 1985, Delhi: Disha Publications.
Učida, Norihiko 1991 The language of the Saurashtrans in Tirupati. Bangalore: Mahalaxmi Enterprises.
Upadhyaya, U. P. 1971 Effects of bilingualism on Bidar Kannada. Indian Linguistics 32(2): 132–138.
van Driem, George 1987 A grammar of Limbu. Berlin/New York: Mouton de Gruyter.
van Driem, George 1993 A grammar of Dumi. Berlin/New York: Mouton de Gruyter.
Varija, N. 2005 Syntactic convergence in Bhalavali Marathi. Professor Murray B. Emeneau Centenary International Conference on South Asian Linguistics, Mysore.
Verma, Manindra K. 1976 Typological aspects of the stative in some Indic languages. Revue roumaine de linguistique 24(2): 185–193.
Vesper, Don 1971 Kurukh syntax with special reference to the verbal system. University of Chicago PhD dissertation. ProQuest Dissertations T-22543.
Vijayakrishnan, K. G. 1978 Stress in Tamilian English: A study within the framework of generative phonology. Hyderabad: Central Institute of English and Foreign Languages.
Wagoner, Philip B. 1993 Tidings of the king: A translation and ethnohistorical analysis of the Rayavacakamu. Honolulu: University of Hawaii Press.
Watters, David E. 1993 Agreement systems and syntactic organization in the Kham verb [Nepal]. Linguistics of the Tibeto-Burman Area 16(2): 89–111.

Watters, David E., with Yogendra P. Yadava, Madhav P. Pokharel, and Balaram Prasain 2006 Notes on Kusunda grammar: A language isolate of Nepal. Himalayan Linguistics Archive 3: 1–182. http://www.linguistics.ucsb.edu/HimalayanLinguistics/grammars/2006/HLA03_Watters.pdf (accessed 16 Dec. 2014)
Weber, Dieter 1997 Iranian loans in the Niya documents re-examined. In: Shirin Akhiner and Nicholas Sims-Williams (eds.), Languages and scripts of Central Asia, 30–38. London: School of Oriental and African Studies.
Weiers, Michael 1972 Die Sprache der Moghol der Provinz Herat in Afghanistan [The language of the Moghols of Herat province in Afghanistan]. (Abhandlungen der Rheinland-Westfälischen Akademie der Wissenschaften, 49: Materialien zur Sprache und Literatur der Mongolen von Afghanistan I.) Opladen: Westdeutscher Verlag.
Weinreich, Matthias 2001 Die Pashto-Sprecher des Karakorum: Zur Migrationsgeschichte einer ethnolinguistischen Minderheit. Iran and the Caucasus 5: 107–132.
Weinreich, Matthias 2009 “We are here to stay”: Pashtun migrants in the Northern Areas of Pakistan. Berlin: Klaus Schwarz.
Weinreich, Matthias 2010 Language shift in northern Pakistan: The case of Domaakí and Pashto. Iran and the Caucasus 14: 43–56.
Wilde, Christopher P. 2008 A sketch of the phonology and grammar of Rājbanshi. University of Helsinki PhD dissertation.
Wiltshire, Caroline R., and James D. Harnsberger 2006 The influence of Gujarati and Tamil L1s on Indian English: A preliminary study. World Englishes 25: 91–104.
Windfuhr, Gernot (ed.) 2009 The Iranian languages. London/New York: Routledge.
Winford, Donald 2003 An introduction to contact linguistics. Malden, MA: Blackwell.
Winter, Werner 1997 Lexical archaisms in the Tocharian languages. In: Hans Henrich Hock (ed.), Historical, Indo-European, and lexicographical studies: A festschrift for Ladislav Zgusta on the occasion of his 70th birthday, 183–193. Berlin/New York: Mouton de Gruyter.
Witzel, Michael 1993 Nepalese hydronomy: Towards a history of settlement in the Himalayas. In: G. Toffin (ed.), Nepal, past and present: Proceedings of the Franco-German Conference, Arc-et-Senans, June 1990, 217–266. New Delhi: Sterling Publishers.
Witzel, Michael 1999a Early sources for South Asian substrate languages. Mother Tongue Special Issue (October 1999), 1–76.
Witzel, Michael 1999b Aryan and non-Aryan names in Vedic India: Data for the linguistic situation, c. 1900–500 BC. In: Bronkhorst and Deshpande (eds.) 1999: 337–404.

Witzel, Michael 1999c Substrate languages in Old Indo-Aryan. Electronic Journal of Vedic Studies 5: 1–67. http://www.ejvs.laurasianacademy.com/issues.html (accessed 27 Nov. 2014)
Witzel, Michael 2003 Linguistic evidence for cultural exchange in prehistoric western Central Asia. (Sino-Platonic Papers 129: 1–70.) Philadelphia: Department of East Asian Languages and Civilizations, University of Pennsylvania. http://www.sino-platonic.org/complete/spp129_prehistoric_central_asia_linguistics.pdf (accessed 27 Nov. 2014)
Witzel, Michael 2005 Central Asian roots and acculturation in South Asia: Linguistic and archaeological evidence from western Central Asia, the Hindukush and northwestern South Asia for early Indo-Aryan language and religion. In: Toshiki Osada (ed.), Linguistics, archaeology and the human past, 87–211. Kyoto: Indus Project, Research Institute for Humanity and Nature.
Yadava, Yogendra P., and Warren G. Glover (eds.) 1999 Topics in Nepalese linguistics. Kathmandu: Royal Nepal Academy.
Yoon, James Hye Suk 1996 A syntactic approach to category-changing phrasal morphology: Nominalizations in Korean and English. In: Hee-Don Ahn, Myung-Yoon Kang, Young-Suck Kim, and Sookhee Lee (eds.), Morphosyntax and generative grammar: Proceedings of 1997 Seoul International Conference on Generative Grammar, 63–86. Seoul: Hankuk Publishing Company.
Yun, Ju-Hong 2003 Pashai Language Development Project: Promoting Pashai language, literacy and community development. http://www.sil.org/asia/ldc/parallel_papers/ju-hong_yun.pdf (accessed 29 Nov. 2014)
Zide, Norman H. 1996 On Nihali. Mother Tongue 2: 93–100.
Zide, Norman H., and Arlene R. K. Zide 1976 Proto-Munda cultural vocabulary: Evidence for early agriculture. In: P. Jenner, L. Thompson, and S. Starosta (eds.), Austroasiatic studies: Papers from the First International Conference on Austroasiatic Linguistics, Honolulu, 1973, 1295–1334. Honolulu: University Press of Hawaii.
Zoller, Claus Peter 2005 A grammar and dictionary of Indus Kohistani. 1: Dictionary. Berlin/New York: Mouton de Gruyter.
Zvelebil, Kamil A. 1990 Dravidian linguistics: An introduction. Pondicherry: Pondicherry Institute of Linguistics and Culture.

3 Phonetics and phonology
Edited by Hans Henrich Hock

3.1. Introduction
By Hans Henrich Hock

3.1.1. General overview

The South Asian languages exhibit a number of phonetic characteristics that are “unusual” or “exotic” from a western perspective, including “voiced aspirates”, retroflex consonants, “implosives”, and even tonal differences. These characteristics have attracted a great amount of phonetic research, as shown in Section 3.2. Phonological work has been more limited, especially compared to the great amount of research that has been conducted in syntax and morphosyntax. Consider the space devoted to phonology in two major surveys of South Asian languages, Cardona & Jain (eds.) 2003 on Indo-Aryan and Steever (ed.) 1998 on Dravidian. Out of a total of 952 pages, the contributions to Cardona & Jain devote only about 100 pages to “phonology”, and most of that concerns phonetic inventories (± the relation of phonetics to graphemics); discussion of phonology such as sandhi phenomena, vowel harmony, palatalization, and umlaut is much more limited, and only one contribution (Cardona 2003) presents an extensive discussion of phonology (15 pages) — on Sanskrit, a language with a rich tradition of phonetic and phonological work in both modern western and indigenous scholarship (see Chapter 9). The case is similar for the contributions to Steever 1998. Of a total of 413 pages, only about 40 deal with “phonology”, and again, most of the coverage concerns phonetic inventories (± phonetics and graphemics); discussion of phonological phenomena such as sandhi and vowel harmony is limited to a few remarks, with the exception of Krishnamurti (1998) and Bhaskararao (1998) who provide information on Telugu phonological rules and Gadaba sandhi respectively. Similarly, of the modern South Asian languages covered in Windfuhr (ed.) 2009, only Balochi receives relatively extensive phonological discussion (Jahani & Korn 2009). Coverage in grammars and handbooks follows similar lines. Sanskrit receives extensive attention, especially in Wackernagel 1896/1957 and Whitney 1889. 
Modern-language grammars and handbooks devote less discussion to phonology, with traditional philologically-oriented publications tending to provide more detail. Thus, Berger 1998 (volume 1) offers 20 pages devoted to phonetics and phonology, and Beythan 1943 contains a detailed discussion of phonological issues, including sandhi (13 pages), whereas Krishnamurti & Gwynn 1985 focus entirely on issues of ‘Orthography and pronunciation’ (35 pages).

Recent years, however, have witnessed an increase in phonological studies, including work on suprasegmental and intonational prosody. See Section 3.3.

3.1.2. Ancient Indian phonetics and phonology

The focus in this chapter is on modern approaches to South Asian phonetics and phonology. There is, however, a long, impressive tradition of indigenous work in both areas. Earlier work, that of the prātiśākhyas, focuses especially on phonetics and offers remarkable insights into the difference between (modal) voicing and the breathy characteristics of “voiced aspirates” and voiced h [ɦ]; a classification of nasals as nasalized stops; and a change from articulatory to resonance-based phonetics. See Allen 1953, Varma 1961, Cardona 1986, Hock 2014. Much of modern articulatory phonetic terminology, such as Engl. voiced : voiceless, Germ. stimmhaft : stimmlos, is calqued on original Sanskrit terminology (ghoṣavat/ghoṣin : aghoṣa respectively). Besides their phonetic observations, the prātiśākhyas also deal with sandhi and other selected phonological issues. In the Pāṇinian and post-Pāṇinian phonological tradition, some of the earlier phonetic insights were replaced by phonologically motivated concepts and terminology, such as the characterization of breathy-voice aspirates as “voiced aspirated” (Hock 2014). At the same time, the later phonological tradition offers phonological rules that are more firmly integrated into the overall grammar and generally operate with simpler or more elegant rule formulations. (See also 7.2.2–7.2.3.)

3.2. Phonetics
By Peri Bhaskararao

The inventory of speech sounds of the subcontinent could form a large subset of the ‘articulatory possibilities of man’ (Catford 1968). It is not only the variety of speech sounds but also the phonemic contrasts they enter into that contribute to the variety of the phonological systems of South Asian languages. Maddieson 1984 contains a detailed listing of sound inventories of the languages of the subcontinent as part of its main concern of showing the ‘distribution of phonological segments in the world’s languages’. Based mostly upon available published materials, Ramaswami 1999, Reddy 2003, and Pandey 2005, 2006 list similar sound patterns for Indian languages. Ohala 1991 provides phonetic explanations for sound patterns in the Indian linguistic area, covering the features of retroflexion, aspiration, stress, nasalization, etc. A more comprehensive survey of work is available in Pandey 2007, 2014. As sound inventories of South Asian languages are already available in the works just cited, I will concentrate on some of the interesting phonetic topics, especially those that are based upon instrumental evidence, rather than purely perceptual observations.

3.2.1. Articulation

Sounds of the languages of the subcontinent are produced at a variety of places of articulation. There are languages such as Malayalam whose articulatory space is quite densely occupied as, for instance, its nasals are produced at seven different places of articulation (Ladefoged & Maddieson 1996). However, the use of canonical names for places of articulation for such complex systems sometimes does not give precise information as to the lower and upper articulators involved in their production. For instance, Dart and Nihalani (1999) say that Malayalam has ‘stops and nasals in seven places of articulation, but what is meant by “place of articulation” is not clear’. Hence, ‘palatograms and linguograms, with simultaneous audio recordings, were taken of nine speakers and eight test words illustrating the four coronal stops and nasals in word-medial position’, and (The) results show that reference must be made to both the upper and lower articulators in describing the articulation of these consonants. Some of the contrasts are made solely on the basis of place of articulation on the upper articulator (e.g. dental vs. alveolar for some speakers) and others only on the basis of apicality (e.g. alveolar vs. palatoalveolar, both of which are articulated on the alveolar ridge for most speakers, but differ in the tongue contact). Other articulations (such as retroflex) remain maximally distinct by utilizing both parameters. Acoustic results give further evidence of overall tongue shape differences between these four segments. (1999: 129)

Dixit (1990) studies linguotectal contact patterns during the production of Hindi dental stops and corresponding retroflex stops and finds that Overall contact was greater for the retroflex than for the dental stops, though the central constriction was narrower for the former than for the latter. In both the retroflex and dental stops, the central constriction was narrower for the voiced than for the voiceless stops. Vowel context affected the retroflex stops more than the dentals, perhaps indicating that exact place of contact is not critical for the retroflex but is for the dental stops. (1990: 189)

He further observes that ‘the traditional articulatory descriptions of the dental and retroflex stops are phonetically inaccurate, and that retroflex stops do not form a single invariant category in terms of previously proposed distinctive features’. Dixit and Flege (1991) observe the effects of vowel context, rate of pronunciation, and loudness on the place of articulation of Hindi /ʈ/ and find that the anterior and posterior boundaries of the /ʈ/ constriction moved progressively forward from /a/ to /u/ to /i/ context, which reflects the shift in the place of /ʈ/ articulation as a consequence of coarticulatory effects of vowel context. These coarticulatory effects of vowel context on the place of /ʈ/ constriction were quite large and similar across normal, fast and loud speech conditions. This forward shift in the A-P location of /ʈ/ constriction suggests that the degree of retroflexion during /ʈ/ production decreased systematically from /a/ to /u/ to /i/ as a function of vowel context. (1991: 227)

Changes in ‘speaking rate did not substantially affect the extent of the area of linguopalatal contact or A-P length of /ʈ/ constriction’. The ‘central-lateral occlusal constriction was always formed in all three vowel contexts and under all three speech conditions indicating that an occlusal constriction is critical for the production of /ʈ/’. D. N. S. Bhat (1974) shows that retroflexion and retraction are different processes and that the ‘consonants that are produced with the “curling in” or “curling out” of the tip of the tongue show a number of linguistically relevant characteristics … which differentiate them from other consonants that are produced with an uncurled tip’. Ladefoged and Bhaskararao (1983) show that there ‘are a number of cases in which it seems evident that the sounds of one language are not identical with sounds that may be given similar classificatory labels in other languages’ and the data reported by them show that ‘rather than there being a simple category retroflex, there are degrees of retroflexion just as there are degrees of vowel height’. These findings were reverified by Bakst (2012) whose results ‘mirror those in Ladefoged and Bhaskararao (1983), and the results were extended to rhotics and laterals as well’. Hamann 2003 contains the most comprehensive examination and analysis of the phonetics and phonology of retroflexes based upon findings from several languages of the world including those from the subcontinent. Hamann makes four major contributions: a new phonetic definition of retroflexion, description of cross-linguistically common phonological processes, phonetic grounding for these processes, and phonological analysis of these processes (in OT). The only language in the world which has been reported to have retroflex vowels of two degrees is Badaga. For this language, Emeneau (1939) established a contrast between ‘normal, half-retroflexed and fully-retroflexed’ vowels (p. 
44) by providing sets of words such as beˑ ‘mouth’, béˑ ‘bangle’, be̋ˑ ‘crops’, where [eˑ] is a normal vowel, [éˑ] is half-retroflexed, and [e̋ˑ] is fully retroflexed. However, as noted by Ladefoged and Maddieson (1996: 314), during 1992 fieldwork with the Badagas, Ladefoged and Bhaskararao found that the distinction of two degrees of retroflexion in Badaga was not as rigorous as it must have been during Emeneau’s fieldwork of the 1930s. For retroflex vowels in Kalasha see Mørch & Heegård 1997, Heegård & Mørch 2004. In the case of Tamil, Balasubramanian (1982a) observes that although ‘there are two orthographic symbols representing the n sounds and two other symbols representing the tap [ɾ] and the trill [r]’, ‘there is no one-one-correspondence between these four orthographic symbols and the sounds they represent and there is in fact one /n/ phoneme and one /r/ phoneme in the language’. These observations were made on the basis of palatographic and electrokymographic tracings. Direct palatography and electro-aerometer recordings of the nasals and laterals were used by Balasubramanian (1982b) to establish the following facts: articulators ‘make a firmer contact with each other for a longer duration during the articulation of the double consonants than during the articulation of their single counterparts’, and ‘duration of the double nasals and laterals is about two and half times that of their single counterparts’. McDonough and Johnson (1997) differentiate the five liquids of Tamil by means of the features of duration, dynamic tongue movement, and constriction location/spectral shape. Narayanan and Kaun (1999) discuss modeling of Tamil retroflex liquids by means of their acoustic parameters, based on ‘MRI-derived vocal tract data in conjunction with an articulatory synthesizer to investigate articulatory-acoustic mappings of five contrastive liquids of Tamil’. In their study, the ‘effects of varying constriction area, constriction location and side-cavity lengths are investigated’ and simulation results ‘show good correspondence between estimated and actual formant values’. Narayanan, Byrd, and Kaun (1999) present a characterization of the five liquid consonants of Tamil ‘in terms of articulatory geometry and kinematics, as well as their articulatory-acoustic relations’. This study ‘illustrates the use of multiple techniques — static palatography, magnetic resonance imaging (MRI), and magnetometry (EMMA) — for investigating both static and dynamic articulatory characteristics using a single native speaker of Tamil’. While trying to distinguish dental stops from alveolar stops in Malayalam, Jongman et al. (1985) find that ‘a measure of the rms amplitude of the burst normalized to the rms amplitude of the vowel could distinguish the two classes of stop consonants’.
Dutta and Redmon (2013) examine the acoustic differences between dental, alveolar, and retroflex stops in Malayalam and find that differences in F3 play an important role in distinguishing the stops, and that alveolars ‘show flatter slopes’ and have greater coarticulatory resistance than retroflex and dental stops. Reddy (2009) discusses the articulatory and acoustic features of Telugu fricatives, and Reddy and Srikumar (1988) present an articulatory study of Malayalam trills supported by acoustic investigations. The acoustics of the retroflex approximant of Malayalam was discussed by Punnoose and Khattab (2011), who find that it has ‘rhotic characteristics’ and that an ‘extrinsic phonetic interpretation of phonology is suggested’. Punnoose 2011 is a detailed study of the five liquids of Malayalam. Lahiri et al. (1984) describe a ‘method based on the change in the distribution of spectral energy from the burst onset to the onset of voicing’ which could ‘classify over 91 % of the stops in Malayalam, French and English’. Evers et al. (1998) study the acoustic aspects of distinction between [s] and [ʃ] which are allophonic in Bengali and Dutch, whereas they are phonemically distinct in English. Their findings lead to the conclusion that ‘phonological status does not affect the realisation’ of the phonetic distinction between [s] and [ʃ] and that ‘the appropriate acoustic correlate displays a relative rather than an absolute kind of invariance’. Palatalization is the major secondary articulation found in Kashmiri and the languages spoken north and northwest of the Kashmir area. Grierson 1897, 1911, Bailey 1937, and Morgenstierne 1943 are the earliest works which mention this feature in Kashmiri. Kelkar & Trisal 1964 and Kelkar 1984 contain detailed articulatory descriptions of Kashmiri palatalization. Bhaskararao et al. (2009) show in detail, with acoustic evidence, that the process of palatalization envelops both the pre-strictural and post-strictural portions of the concerned consonant. Schmidt and Kohistani (2008) illustrate palatalization in Shina of Indus Kohistan. (For Kashmiri palatalization see also 3.3.6 below.)

3.2.2. Airstream types

Sindhi is the textbook case of implosives in the subcontinent. Implosives are also reported in Saraiki, another language of the Sindh area. Nihalani (1986) demonstrates that Sindhi implosives involve ingressive airflow, unlike the implosives of Hausa. The immediate consequence of this fact is that the proposal that there are no true implosives, i.e., sounds that involve suction, must be rejected. It also raises the question whether implosives should be characterized in phonological theory as sounds involving suction, or as sounds involving the lowering of the larynx. Comparison of the implosives in Sindhi with those of Hausa also demonstrates the need for including certain kinds of “phonetic implementational phenomena” in the domain of phonology. Opgenort (2004a, b) reports that Wambule, a Kiranti language of the Tibeto-Burman group, has two voiced implosives (ɓ, ɗ) at the phonemic level in word-initial position. However, when a form with word-initial implosive becomes word-medial, the implosive is replaced by the corresponding pulmonic nonimplosive sound. Chatterji (1926) refers to sounds ‘made with simultaneous glottal closure’ occurring in East Bengali dialect(s). Though the exact meaning of the term “simultaneous glottal closure” is not clear, his equating the sounds with the implosives of Sindhi suggests that he is referring to implosives. A fuller study of East Bengali would be desirable.

3.2.3. Phonation types

Gujarati murmured sounds were studied extensively by Fischer-Jørgensen (1967). She found that from a physiological point of view, they are produced with a ‘strong air flow’ which assumes ‘stronger activity of the expiratory muscles’. Acoustically, the more stable cues are ‘duration, distribution of spectral energy, and distance to the tonal peak’. While comparing different phonation types across Gujarati, Hmong, Mazatec, and Yi, Keating et al. (2010) find that spectral tilt (H1*-H2*) is the ‘most important measure of phonation contrasts’ across these languages. S. Khan’s study of breathy voice in Gujarati (2012) is based on a series of measurements that include five spectral measures, four noise measures, and one electroglottographic measure. Khan finds that ‘breathy voice in Gujarati is a dynamic, multidimensional feature, surfacing through multiple acoustic cues that are potentially relevant to the listener’. Benguerel and Bhatia (1980) find that the voiced aspirates of Hindi can be described more precisely as ‘voiced phonoaspirated’. Schiefer (1986) deals with the ‘acoustical and perceptual importance of F0 perturbations in word-initial breathy CV syllables of Hindi’. She finds that breathy phonation ‘can be simulated by a low-rising F0 trajectory following the stop release in initial CV syllables’. From a perception angle, she finds that as ‘F0 onset, trajectory, and trajectory duration carry information about the glottal gesture, they play an important role in the perception of breathy CV syllables’. Yadav’s (1984a) fiberoptic and acoustic study of voicing and aspiration in Maithili shows that opening of the glottis during the oral stricture is ‘narrow’ for voiceless unaspirated stops; ‘wide open’ in voiceless aspirated stops; ‘closed’ in voiced unaspirated stops; and ‘closed’ but with a ‘posterior opening of the arytenoids at or immediately before the articulatory release’ in voiced aspirated stops. Further, ‘voicing is observed to be present throughout, including the time when there is a posterior opening’. Narang and Becker (1971) discuss the treatment of aspirated consonants as unit phonemes or clusters.
Srivastava (1968) utilizes morphonematics for addressing the question of aspirated phonemes of Hindi. Dixit (1993) uses transillumination technique to study spatiotemporal patterns of glottal dynamics for all the four types of stops of Hindi and observes the following. Unvoiced stops are produced without vocal fold vibration, whereas voiced stops are produced with vocal fold vibration. His observation about the state of the glottis in the production of the four types of stops is that for unvoiced unaspirated, the glottis is either open or closed; unvoiced aspirated stops have a wide glottal opening; voiced unaspirated have a closed glottis; and voiced aspirated have moderate glottal opening. Dixit (1989) examines the gestural differences of the vocal folds during the production of different phonation types found in various plosives of Hindi. Esposito et al. (2007: 275) investigate the nature of ‘murmured nasals in three Indo-Aryan languages’ and find that while ‘some within-language comparisons gave inconclusive results for Hindi and Bengali, other comparisons with Marathi and within-language phonological evidence pointed to the lack of breathy nasals in Hindi and an uncertain status for breathy nasals in Bengali’.
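
The spectral-tilt measure mentioned earlier in this section can be illustrated with a minimal Python sketch that computes an uncorrected H1–H2 value, that is, the difference in decibels between the amplitudes of the first and second harmonics. It is not the procedure of any of the studies cited. The function names (`harmonic_amplitude`, `h1_minus_h2_db`, `synth_two_harmonics`) and the synthetic test signal are assumptions made for illustration, the fundamental frequency is taken as known, and no correction for formant influence (which the asterisks in H1*-H2* conventionally indicate) is attempted.

```python
import math


def harmonic_amplitude(samples, sample_rate, freq_hz):
    """Amplitude of the sinusoidal component of `samples` at `freq_hz`.

    Evaluates the discrete Fourier sum at a single frequency. The result is
    exact only when the window holds a whole number of periods of `freq_hz`.
    """
    re = 0.0
    im = 0.0
    for n, x in enumerate(samples):
        angle = 2.0 * math.pi * freq_hz * n / sample_rate
        re += x * math.cos(angle)
        im -= x * math.sin(angle)
    return 2.0 * math.hypot(re, im) / len(samples)


def h1_minus_h2_db(samples, sample_rate, f0_hz):
    """Uncorrected H1-H2 in dB, assuming f0 is already known (hypothetical helper)."""
    h1 = harmonic_amplitude(samples, sample_rate, f0_hz)
    h2 = harmonic_amplitude(samples, sample_rate, 2.0 * f0_hz)
    return 20.0 * math.log10(h1 / h2)


def synth_two_harmonics(sample_rate, f0_hz, a1, a2, n_samples):
    """Synthetic signal with two harmonics of amplitudes a1 and a2 (illustration only)."""
    return [
        a1 * math.sin(2.0 * math.pi * f0_hz * n / sample_rate)
        + a2 * math.sin(2.0 * math.pi * 2.0 * f0_hz * n / sample_rate)
        for n in range(n_samples)
    ]


# A steeper tilt (weaker second harmonic) gives a larger H1-H2 value.
steep = synth_two_harmonics(8000, 100.0, 1.0, 0.5, 800)
flat = synth_two_harmonics(8000, 100.0, 1.0, 1.0, 800)
tilt_steep = h1_minus_h2_db(steep, 8000, 100.0)
tilt_flat = h1_minus_h2_db(flat, 8000, 100.0)
```

With recorded speech, the fundamental frequency would first have to be estimated and the analysis window chosen with care, so the sketch shows only the arithmetic behind the measure.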

Dutta’s (2009) detailed study of Hindi aspirated sounds shows that voiceless aspirated (VLAS) and breathy voiced stops (BVS) show shorter closure durations compared to unaspirated stops. Place of articulation-dependent duration of aspiration for the VLAS is close to 20–30 percent of the following vowel, while the duration of breathy/murmur following the BVS is nearly 30–50 percent. Place of articulation has no effect on the duration of the murmur. BVS show lower mean F0 values till about 20–30 percent of the vowel compared to voiced stops (VS) that tend to have comparably higher F0 values. The voiceless stops, both aspirated and unaspirated, exhibit higher mean F0 values compared to the voiced stops. Voicing-dependent F0 perturbation persists till 30 percent of the vowel. Berkson (2012: iii), on phonation type distinctions in Marathi, concludes that males and females ‘cue breathy phonation in sonorants differently’ (with H1–H2* more reliable in male speech and decreased cepstral peak prominence in female speech), and that ‘phonation type distinctions are not cued as well by sonorants as by obstruents’.

3.2.4. Voice Onset Time

Lisker (1958) discusses the issue of durational and voicing differences between the voiceless and voiced sets of plosives in Tamil and concludes that the voicing feature is a predominant marker for separation between these sets. Lisker and Abramson’s classic paper (1964) on Voice Onset Time (VOT) in different languages includes ample discussion on this phenomenon in Hindi, Marathi, and Telugu. Cho and Ladefoged (1999) study the VOT phenomenon in 18 different languages including Khonoma Angami. Poon and Mateer’s (1985) study gives the VOT parameters that differentiate sets of Nepali stop consonants. Reddy (1982) uses kymographic and spectrographic measurements for the study of Telugu aspiration. Rami et al. (1999) analyse the VOT and burst frequency characteristics that differentiate the four velar plosives of Gujarati.

3.2.5. Nasality

Narang and Becker (1971) discuss issues regarding nasalized vowels as sequences of oral vowels and nasal consonants — how this treatment brings in economy in the lexical representation, and how the schwa-syncope rule treats the above two types of sounds differently. Bhatia and Kenstowicz’s critique (1972) points out problems in Narang and Becker’s treatment of Hindi nasalization and shows that it is both descriptively and explanatorily inadequate. Ohala (1983) discusses the issue of homorganic nasals and nasalization in greater detail. Reddy (1998) examines the nature of vowel nasalization in Telugu. Dave (1970) examines the acoustic differences between nasalized and non-nasalized vowels in Gujarati.

Voiceless nasals (along with voiceless trills and laterals) are encountered in some of the Tibeto-Burman languages of Northeast India and adjacent countries. Bhaskararao and Ladefoged (1992) examine the voiceless nasals of two different varieties of Angami Naga (Kohima and Khonoma) as well as those of Mizo and Burmese and show for the first time that alignment of gestures of velic opening and voice onset during the terminal portion of voiceless nasals can vary across languages in a significant way. Sinhala is well-known for its “half-nasals” which are now better known as prenasalized stops (Dantsuji 1987; Ladefoged & Maddieson 1996; Gair & Paolillo 1997). Gair and Paolillo describe them in the following way (1997: 12): An unusual feature of the Sinhala consonant inventory not shared with other languages of the region (except for Divehi of the Maldives Islands) is the existence of a series of prenasalized stops. They are distinct in length from nasal-stop clusters, which also occur in Sinhala. Note that they pattern with single consonants where syllable weight is concerned, as illustrated by the e/ee alternations in several of the examples …

Some examples given by them (p. 13) are: an̆gee ‘horn.GEN’, aŋge ‘the feature’; kan̆dee ‘tree.trunk.GEN’, kande ‘hill.GEN’; kam̆bee ‘the rope’, kambe ‘the ola book cover’. Dantsuji (1987: 168) feels that ‘half nasal is a kind of prenasalized voiced plosive’. Ladefoged and Maddieson (1996: 121) had earlier studied the acoustic aspects of these sounds and concluded: ‘On a phonetic basis at least, this contrast in Sinhala is more appropriately described as a contrast of single versus geminate nasals followed by stops, that is [mb, nd] vs [mmb, nnd] etc.’. Feinstein (1979) discusses issues in the phonological representation of the Sinhala prenasalized consonants.

3.2.6. Aerodynamics

Dixit and Shipp (1985) find that the differences in the subglottal pressures for /p, ph, b, bh/ are in correlation with changes in glottal and supraglottal impedances. They also study the relationship between subglottal pressure and stress. Nihalani 1974a deals with supra-glottal and sub-glottal air pressures in the production of stops in Sindhi. The physiological events with reference to both the time factor and the aerodynamics of the breath stream are studied. Pneumotachography and laryngoscopy with fiber-optics bundle were some of the techniques employed for obtaining quantitative data. Since ‘phoneticians’ description of sounds is generally based on auditory impressions’ and ‘articulatory positions have not been examined with the help of any instrumental techniques’, Nihalani uses palatography and x-ray photography to confirm the validity of his ‘proprioceptive impressions, and also to demonstrate the precision with which the physiological features of the stop articulations in Sindhi can be described with the help of these techniques’. He concludes that ‘the number of phonetic categories has increased considerably’ and states that ‘such minute distinctions are absolutely essential if we are to organize our phonetic material accurately’. Nihalani (1975a: 89) ‘proprioceptively’ felt ‘that voiced and aspirated voiced stops in Sindhi are characterized by a slight nasalization’.¹ Nasal airflow during the production of voiced stops was measured and the ‘results suggest that there is an incomplete velopharyngeal closure which helps to absorb the transglottal airflow and thus prevents the rise in the supraglottal pressure’; ‘for the affricated palatal stops of Sindhi, the vocal fold vibrations were, however, maintained by expanding the walls of the supraglottal cavities’. Nihalani (1975b: 205) finds that ‘aspirated phonation has higher air flow rate than non-aspirated phonation’ and ‘the retroflex sounds have a higher air flow rate, irrespective of the phonation process involved, whereas the palatal sounds have the minimum air flow rate in general’.

3.2.7. Suprasegmentals

Tonal systems among the languages of the subcontinent are of three types: Panjabi-type, Mizo-type, and Tai-type. Gilgiti Shina is reported to have pitch-accent as illustrated by Radloff (1999) who says ‘every word has one and only one accent’ and defines a pitch-accented syllable as ‘one syllable which is more prominent than the other syllables of that word’ (p. 57) and ‘if a syllable contains a long vowel, accent can be associated with either the first or the second part of that vowel’ (p. 58). However, Rajapurohit (1983) considers a similar phenomenon in the Drasi variety of Shina as a type of “stress”. Earlier, Bailey (1924) proposed both stress and tone contrasts for Shina. Baart’s phonetic analysis of Kalam Kohistani (1997) shows that the language ‘appears to be a full-fledged tone language, firstly in the sense that the number of contrastive patterns that can occur, even on words of only one syllable, does not allow for an analysis in terms of accent’ (p. 41). The five contrastive pitches that he establishes are: high level; high-to-low falling; low level; delayed high-to-low falling; low-to-high rising. Baart 1999 enlarges the description of tonology of Kalam Kohistani by including changes in tones that are brought about by interaction of morphology and syntax on tone-bearing lexical items. Baart (2003) classifies the tone languages of northern Pakistan into three types: Panjabi-type (with three tonal contrasts), Shina-type (with two tonal contrasts), and Kalami-type (with no less than five tones). According to him, “Shina-type” languages include Shina, Burushaski, Palula, Indus Kohistani, and possibly Khowar, Gowro, Bateri, Chilisso, Dameli, Gawar-Bati, and Ushojo; the “Panjabi type” includes Panjabi, Hindko, Gujari, and possibly Pahari-Potwari; the “Kalami type” includes Kalami-Kohistani, Torwali, and possibly Kalkoti.

¹ A similar observation for Tibeto-Burman Nàvakat is found in Saxena 2011.

Huysmans (2007) reports contrastive word-accent in Sampang (a Kiranti language of Tibeto-Burman group spoken in eastern Nepal) illustrated by means of minimal pairs such as: 'inma ‘to sell’ : in'ma ‘to hear’; 'cumma ‘money’ : cum'ma ‘to pile up’. As found elsewhere in the world’s languages (Gandour 1974, Hombert 1978, J. Ohala 1978) breathy voice or murmur can develop into a tonal feature on adjacent vowels etc. Most of the tone-bearing syllables in Panjabi are related to historically murmured consonants or the [ɦ] sound. Bhaskararao (1999) shows that the process of tonogenesis from murmured sounds could be of different degrees as observed in data from several Indo-Aryan languages of the (sub-) Himalayan areas (such as Nepali, Mandeali, Kangri, Dogri, Northern Haryanvi) as well as Tibeto-Burman Newari. Purcell, Villegas, and Young (1978: 292) show that ‘pitch effects of Hindi consonants obtain in the second syllable’ even though the responsible consonants occur in the first syllable. They further show that ‘Panjabi tonal contours exhibit similar effects in the second syllable, even though the first syllable is judged prominent, and the Panjabi tone is ascribed to the first syllable’. In the tradition of descriptive linguistics, Bahl (1955–1956) is the earliest work on tones in Panjabi. It establishes three tones, viz. even, falling, and rising. These tones are examined in words with two different weights — monosyllabic and disyllabic. In addition, different combinations of vowels and consonants such as consonant clusters, short and long vowels etc. with superimposed tones are examined. Gill and Gleason (1972) have a similar description of Panjabi tones. The earliest instrumental study of the tones of the Majhi dialect of Panjabi by Sampat (1964) provides a clearer picture of the actual fundamental-frequency variations involved in the production of the three tonemes of the language. 
This study is based on photographs of double-beam oscillograms, a technique available in those days. Wells and Roach (1980) give fundamental-frequency correlates of the three tones using more modern methods of instrumental analysis. Joshi (1973) and Sandhu (1986) describe the acoustic characteristics of Panjabi tones. In his OT approach to Panjabi tones, Vijayakrishnan (2003a) presents ‘a phonological analysis of tone in the context of the neutral, word level pitch melody’ of the language and argues that ‘this leads to the most explanatory account of the birth of two contour tones namely, a fall and a complex contour tone of a fall-rise in the language.’ Tonal systems of many of the Tibeto-Burman languages of the area follow the general Sino-Tibetan pattern where different syllables in a “word” can carry different tones. However, internal tone-sandhi and external tone-sandhi (across words) can introduce further complications; see e.g. Coupe 2003. Toneme inventories of Tibeto-Burman languages of the subcontinent range from a minimal system of two tonemes in Manipuri (= Meitei) (Chelliah 1997) to that of four or five tonemes. The phonetic realization of tonemes can be register tones, contour tones, or a combination of both.
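
The distinction between register (level) and contour realizations can be made concrete with a toy classifier over a fundamental-frequency track. The Python sketch below is an illustration only, not a method from the studies cited: the function names, the 2-semitone threshold, and the sample F0 values are hypothetical, and comparing only the first and last voiced samples cannot detect complex contours such as a fall-rise.

```python
import math


def semitones(f0_hz, ref_hz):
    """Interval from ref_hz to f0_hz in semitones (12 per octave)."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def classify_contour(f0_track_hz, threshold_st=2.0):
    """Label an F0 track as 'level', 'rising', or 'falling'.

    Unvoiced frames are assumed to be coded as 0 and are skipped. The
    threshold is a hypothetical value chosen for illustration.
    """
    voiced = [f for f in f0_track_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced F0 samples")
    change = semitones(voiced[-1], voiced[0])
    if change >= threshold_st:
        return "rising"
    if change <= -threshold_st:
        return "falling"
    return "level"


# Invented F0 tracks in Hz, for illustration only.
level_label = classify_contour([120.0, 121.0, 119.0, 120.0])
rising_label = classify_contour([120.0, 130.0, 145.0, 160.0])
falling_label = classify_contour([160.0, 0.0, 140.0, 120.0])
```

Working in semitones rather than hertz keeps the threshold comparable across speakers with different pitch ranges, which is the main design choice in this sketch.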

3.2.8. Duration and syllable structure

Experimental work has also been conducted on issues of duration and syllable structure. For instance, based upon experimental data from Hindi, Ohala (2007: 368) shows that gemination seems to act as a prosody and ‘geminates are not two adjoining singletons, that is to say that they are not clusters, but, rather, a different type of unitary consonant’. Lahiri and Hankamer (1988) study differences in duration between geminated voiceless stops and their non-geminate counterparts in Bengali (along with those in Turkish). They find (p. 327) that geminated stops differ from their non-geminate counterparts in closure duration, irrespective of whether the geminated stops are tautomorphemic, concatenated, or derived by total assimilation. Hankamer et al. (1989: 283) test the perceptual cues for differentiating single and geminate stops in Bengali and Turkish and conclude that ‘in actual speech recognition there is no evidence that cues other than closure duration play a role in the discrimination of geminate and non-geminate stops in these languages’. Prakasam (1991, 1992) analyzes length in Telugu from a prosodic phonological perspective. Segmental duration in Telugu is studied in Reddy 1985 and the phonology of length beyond the word level in Reddy 2000. Reddy 1987 studies the typology of consonant clusters in a variety of Indian languages. Abbi and Mishra (1984–1985) examine consonant clusters and syllable structure in Meitei. The phonetic and phonological nature of syllables has been studied for Ao (Thakwani 1983), Bangla (Kar 2010), Hindi (W. E. Jones 1971, Mehrotra 1959), Maithili (Mishra 2006), Malayalam (K. P. Mohanan 1986, T. Mohanan 1989), Marathi (Khokle 1988), Sinhala (Feinstein 1979), Tamil (Vijayakrishnan 1982, Christdas 1988, 2013), and Telugu (Rao 1996).

3.2.9. Other resources on phonetics

In addition to regular papers, short illustrations of languages, following a set pattern, are published in a section “Illustrations of the IPA” of the Journal of the International Phonetic Association. South Asian languages covered include Hindi (Ohala 1994), Sindhi (Nihalani 1995), Tamil (Keane 2004), Nepali (Khatiwada 2009), Bengali of Bangladesh (S. Khan 2010), Assamese (Mahanta 2012), and Sumi (= Sema) (Teo 2012). Under its “Phonetic Reader Series”, the Central Institute of Indian Languages, Mysore, India has issued short phonetic descriptions of various languages, ‘with a view to presenting the range of phonetic variation obtaining’ in the South Asian subcontinent ‘and demonstrating the closeness of languages on the basis of phonetic patterning’. These readers ‘are biased towards learning the sound systems of languages’. The general format of each of these phonetic readers contains an introduction to the language; description of speech organs, sounds of the language, phonetic drills, phonemics of the language, writing system of the language. Following is a list of languages included in the series: Angami (Ravindran 1974); Ao-Naga (Gurubasave Gowda 1972); Assamese (Dutta Baruah 1992); Balti (Rangan 1975); Bengali (K. Bhattacharya 1999); Brokskat (Ramaswami 1975); Gojri (J. C. Sharma 1979); Gujarati (U. Nair 1979); Kannada (Upadhyaya 2000); Kashmiri (Handoo 1973); Khasi (Nagaraja 1990); Kota (Subbaiah 1986); Kurux (Ekka 1985); Kuvi (Ramakrishna Reddy et al. 1974); Ladakhi (Koshal 1976); Lotha (Acharya 1975); Malayalam (Syamala Kumari 2000); Manipuri (I. Singh 1975); Mishmi (G. D. P. Sastry 1984); Mundari (N. K. Sinha 1974); Panjabi (Dulai & Koul 1980); Sema (Sreedhar 1976); Shina (Rajapurohit 1983); Tamil (Rajaram 1972); Telugu (J. V. Sastry 2000); Thaadou (Thirumalai 1972); Tangkhul Naga (Arokianathan 1980); Tripuri (Karapurkar 1972); Urdu (N. Hassan & Koul 1980). A phonetic reader of Hindi published by the Indian Institute of Language Studies (Koul 1994) gives articulatory phonetic details of the Hindi language.

3.2.10. Further resources on Tibeto-Burman phonetics and phonology

The journal Linguistics of the Tibeto-Burman Area (LTBA) has been consistently publishing articles related to Tibeto-Burman languages of the subcontinent. Papers with phonetic and phonological content are listed below, sorted into categories.

PHONETICS AND PHONOLOGY: Spiti (S. R. Sharma 1979), Mising (Taid 1987), Manipuri (Meitei) (Chelliah 1990), Khonoma Angami (Blankenship et al. 1992), Classical Tibetan (Hogan 1996), Rgyalthang Tibetan (Wang 1996), Lai (Melnik 1997), Sharchhop (Fulop & Dobrovolsky 1999), Phek dialect of Chokri (Bielenberg & Nienu 2001), Tangbe, Tetang, and Chuksang dialects of Seke (Honda 2002), Manange (Hildebrandt 2005).

TONES: Khezha (Kapfo 1989), Tamang and Tibetan (Sprigg 1990), Paṭani and Central Tibetan (Saxena 1991), Lhasa Tibetan, Gar Tibetan, Gerze Tibetan, and Zedang Tibetan (Duanmu 1992), Garo (Burling 1992), Central Tibetan and Kham Tibetan (Haller 1999), Bodo languages (Joseph & Burling 2001), Dzongkha, Lhomi, Sherpa, Dolpo Tibetan, and Mugom Tibetan (S. A. Watters 2002), Hakha Lai (Hyman & VanBik 2002), Chin (Löffler 2002), Tai languages of NE India (Morey 2005b).

TONOGENESIS: Tibetan dialects (Huang 1995).

MORPHOPHONEMIC ALTERNATIONS: Meiteiron = Meitei (Thoudam 1989), Tiddim Chin (Bhaskararao 1989), Daai Chin (Hartmann-So 1989).

PROSODIES: Garo and rGyarong (Benedict 1994), Achang dialects and four languages of the Zaiwa group (Dempsey 2003).

NASALIZATION AND NASALS: Lhasa Tibetan (Hogan 1994), prenasalization and preglottalization of Daai Chin (Hartmann 2001).

GLOTTAL STOP AND/OR GLOTTALIZATION: Garo (Duanmu 1994), Lai (Roengpitya 1997), Daai Chin (Hartmann 2001).

3.3. Phonology and phrasal prosody

By Hans Henrich Hock

As noted earlier, much that has been published under the rubric “phonology” concerns phonetic inventories (± phonetics/graphemics relationships); phonological phenomena such as vowel harmony or sandhi receive less attention. The most important contributions in this area, beyond the extensive literature on Sanskrit sandhi, tend to be published in a variety of different journals and conference proceedings, many of which are not specifically oriented toward South Asia. A complete overview is, at this point, still elusive.

3.3.1. Major resources on phonology

Two grammar publication series offer at least some discussion of phonology. These are the Descriptive Grammar Series by Croom Helm/Routledge and the Mouton Grammar Series by de Gruyter Mouton. Relevant volumes in the former are Asher 1985 (Tamil), Asher & Kumari 1997 (Malayalam), Bhatia 1993 (Panjabi), Pandharipande 1997 (Marathi), Sridhar 1990 (Kannada), Wali & Koul 1997 (Kashmiri), Anderson (ed.) 2008 (Munda), and Thurgood & LaPolla (eds.) 2003 (Tibeto-Burman). Relevant volumes in the Mouton Grammar Series are David 2013 (Pashto and dialects), Chelliah 1997 (Meithei), Coupe 2007 (Mongsen Ao), David 2015 (Bangla), Genetti 2007 (Dolakha Newar), van Driem 1987 (Limbu), van Driem 1997 (Duma). Other publications on grammars of individual languages or language families include the following. Andronov 1996 (Malayalam), Bailey 1924 and Schmidt & Kohistani 2008 (Shina), Berger 1974, 1998 and Lorimer 1935–1938 (Burushaski), Beythan 1943 (Tamil), Dhongde & Wali 2009 (Marathi), Emeneau 1984 (Toda), Gair & Paolillo 1997 (Sinhala), Grierson 1911 and B. Kachru 1969 (Kashmiri), Krishnamurti 2003 (Dravidian comparative grammar), Liljegren 2008 (Palula), Morey 2005a (Tai languages of Assam), Opgenort 2004b (Wambule), Plaisier 2007 (Lepcha), Schiffmann 1979 (Spoken Tamil), D. D. Sharma 1988 (Kinnauri), Subrahmanyam 1983, 2008 (comparative Dravidian phonology), Thompson 2012 (Bangla), Whitney 1889 (Sanskrit). Publications specifically dedicated to phonology and phonological typology include Kelkar 1968 (Hindi-Urdu), with review by Srivastava (1969), Ohala 1983 (Hindi), Michailovsky 1988 (Nepali), Vasanthakumari 1989 (Tamil), Namkung (ed.) 1996 (Tibeto-Burman), Neukom 1999 (Northeast India), Shukla 2000 (Hindi), M. A. Khan 2000 (Urdu), and Modi 2013 (Gujarati). Note also Kaye (ed.) 
1997, with contributions on the phonology of selected South Asian languages: Kaye 1997 (Hindi-Urdu), Mistry 1997 (Gujarati), Elfenbein 1997a (Pashto), Elfenbein 1997b (Balochi), Elfenbein 1997c (Brahui), and Anderson 1997 (Burushaski).

Rajapurohit (ed.) 1986, Ramaswami 1999, Reddy 2003, and Pandey 2014 cover a broader range of Indian languages. Of these publications, Pandey’s stands out by providing a systematic bibliographical survey of earlier work as well as sketches of the phonology, both segmental and suprasegmental, of 148 languages belonging to all the language families of India, including contact languages and “Historical Varieties”. The Summer Institute of Linguistics has issued several monographs on the phonemic systems of the following languages: Chepang (Caughley 1969), Gurung (Glover 1969), Newari (Hale & Hale 1969), Sherpa (Gordon 1969), Sunwar (Bieri & Schulze 1969), Tamang (Taylor 1969), Thakali (Hari 1969), Kham (D. E. Watters 1971), Magar (Shepherd & Shepherd 1971), Parengi (Gorum) (Aze 1971), Khaling (Toba & Toba 1972), Lhomi (Vesaleinen & Vesaleinen 1976). (See also Section 3.2.9 above.) The following sections focus on issues that have received broader attention in the literature.

3.3.2. Sandhi

Sandhi, or the morphophonemic interaction of segments in morphology (“internal sandhi”) or across word boundaries (“external sandhi”), is a phenomenon probably found in all South Asian languages. Standard handbooks, especially philologically oriented ones such as Beythan 1943 for Tamil, Emeneau 1984 for Toda, or Wackernagel 1896/1957 and Whitney 1889 for Sanskrit, provide detailed information. In addition, note also Chelliah 1990 (Meithei morphologically conditioned voicing assimilation, aspirate dissimilation, various changes affecting l, etc.), S. Singh 1976 (Hindi morphophonemics), Bhaskararao 1989 (vowel alternations in Tiddim Chin reduplicated adverbs), Hartmann-So 1989 (sandhi phenomena in Daai Chin), Thoudam 1989 (morphophonemic rules for Meithei compounds), A. Singh 1994 (Hindi phonology-morphology interface), Pierrehumbert & Nair 1996 (Hindi gemination before y), Coupe 2003 (tone sandhi in Ao), Peet 2007 (Amdo Tibetan labial assimilation, an abstract account), Begam 2008 (assimilation in Bangla), M. D. Ramasamy 2011 (Tamil morphophonemics). Most of the processes are language-specific or have received little general attention in phonological literature. The following sections deal with phenomena that have received broader phonological discussion.

3.3.2.1. Sanskrit sandhi

The term “sandhi” goes back to the Sanskrit grammatical tradition, and both Pāṇini and the Prātiśākhyas cover sandhi extensively. In Pāṇini’s grammar, the entire last three chapters, the “tripādi”, are dedicated to sandhi, but internal sandhi rules occur throughout the grammar, such as iko yaṇ aci (6.1.77) which provides that high vowels and syllabic ṛ are replaced before a vowel by their non-syllabic counterparts. Sandhi also is covered in standard western grammars of Sanskrit, such as Wackernagel 1896/1957 (especially 301–343) and Whitney 1889 (Chapter 3). Several modern publications treat Sanskrit sandhi in detail. These are Emeneau 1952, Allen 1962, Emeneau & van Nooten 1968. Note also Zwicky 1964, which introduced Sanskrit sandhi phenomena into generative phonological discussion, and Kiparsky 1973a, which draws on Sanskrit internal sandhi phenomena in the context of the phonological “abstractness controversy” and, in the process, argues against many early generative accounts such as Zwicky’s. More recently, Kessler (1992) claims that with minor exceptions all sandhi rules can be described as syllable-structure rules. He further develops a computer program to test his formulation of sandhi rules. Vowel sandhi is discussed in McCarthy 2005, Gnanadesikan 1997, Kessler 1992, Gunkel & Ryan 2011, Jensen & Stong-Jensen 2012, and C. Smith 2012; see also Dočkalová 2009 on Sanskrit and Prakrit. Consonant sandhi is treated in Kessler 1992, 1994 and Dočkalová 2009 (Sanskrit and Prakrit). Several specific sandhi phenomena have been widely discussed in generative literature. They are covered in the following sections.

3.3.2.1.1. Grassmann’s Law

Early approaches attempt to analyze the synchronic reflexes of Grassmann’s Law (for which see 1.3.1.5.1.1) in terms of rules that are essentially identical to the traditional historical analysis; see e.g. S. Anderson 1970, Kiparsky 1973b. Sag 1974 shows that, instead, the synchronic analysis of Pāṇini, with aspirate throwback, must be accepted. See also Janda & Joseph 1989, Calabrese & Keyser 2006, and the discussion in Collinge 1985: 47–61.

3.3.2.1.2. RUKI and retroflex assimilation

The synchronic outcome of PIE *s as retroflex sibilant ṣ by RUKI (see 1.2.1.1) has led to alternations with unchanged s, as in (1).

(1) RUKI
        agni-su    →  agni-ṣu    ‘fire (LOC.PL)’
        pitṛ-su    →  pitṛ-ṣu    ‘father (LOC.PL)’
    vs. senā-su    =  senā-su    ‘army (LOC.PL)’
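The alternation in (1) can be illustrated with a minimal Python sketch. It is not a full statement of RUKI: the trigger set (r, k, and all vowels other than a/ā) is a simplifying assumption, the function name `apply_ruki` is hypothetical, and the rule is applied only to a suffix-initial s at the stem-suffix boundary.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize to NFC so letters such as ṣ compare as single code points."""
    return unicodedata.normalize("NFC", text)


# Simplified trigger set (an assumption of this sketch): r, k, and the
# non-a vowels. Stems in -ai/-au are covered because they end in i/u.
RUKI_TRIGGERS = tuple(nfc("r ṛ ṝ u ū k i ī e o").split())


def apply_ruki(stem: str, suffix: str) -> str:
    """Retroflex a suffix-initial s if the stem ends in a RUKI trigger."""
    stem, suffix = nfc(stem), nfc(suffix)
    if suffix.startswith("s") and stem.endswith(RUKI_TRIGGERS):
        suffix = nfc("ṣ") + suffix[1:]
    return f"{stem}-{suffix}"


for stem in ("agni", "pitṛ", "senā"):
    print(apply_ruki(stem, "su"))
```

Restricting the rule to the morpheme boundary is a deliberate simplification; it leaves aside morpheme-internal s and the further conditions discussed in the handbooks.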

Further, dental stops following ṣ are assimilated and become retroflex, leading to alternations between dental and retroflex stops, as in (2). Similar developments, of PIE “palatals” before obstruent (see e.g. 1.3.1.5.1.1 with example (10)), lead to comparable alternations.

(2) Retroflex assimilation
        piṣ-ta    →  piṣ-ṭa    ‘grind (PST.PTCP)’
        tuṣ-ta    →  tuṣ-ṭa    ‘please (PST.PTCP)’
    vs. as-ta     =  as-ta     ‘throw (PST.PTCP)’
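As a rough illustration of the pattern in (2), the following Python sketch retroflexes a suffix-initial dental stop after stem-final ṣ. The function name and the restriction to t and d (with their aspirates) are assumptions made for the example, not an analysis drawn from the literature cited here.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize to NFC so letters such as ṭ compare as single code points."""
    return unicodedata.normalize("NFC", text)


# Dental stops and their retroflex counterparts; aspirates (th, dh) are
# handled by changing only the first letter.
DENTAL_TO_RETROFLEX = {"t": nfc("ṭ"), "d": nfc("ḍ")}


def assimilate_after_retroflex_sibilant(stem: str, suffix: str) -> str:
    """Turn a suffix-initial dental stop into a retroflex after stem-final ṣ."""
    stem, suffix = nfc(stem), nfc(suffix)
    if stem.endswith(nfc("ṣ")) and suffix[:1] in DENTAL_TO_RETROFLEX:
        suffix = DENTAL_TO_RETROFLEX[suffix[0]] + suffix[1:]
    return f"{stem}-{suffix}"


for stem in ("piṣ", "tuṣ", "as"):
    print(assimilate_after_retroflex_sibilant(stem, "ta"))
```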

Early accounts such as Zwicky 1964, 1970 attempt to account for the outcome of RUKI, whether synchronically alternating as in (1) or not, by reformulating the historical changes as synchronic rules. Beginning with Kiparsky 1973a (see also 2010), O’Bryan 1974, and Vennemann 1974, it was realized that because of analogy and borrowings, the RUKI rule had become synchronically opaque and applied only in what Kiparsky calls “derived environments”. See also Longerich 1998. Arsenault (2008: 36–42 and 50–56), partly drawing on earlier work by Hall (1997a, 1997b), covers RUKI and issues of retroflex assimilation in the context of coronal feature theory. He concludes that retroflex is defined as [– distributed] and that it may also be [+ back] ‘at a post-lexical level’. Based largely on earlier historically-oriented work that questions whether RUKI ever was a single phonological process, Arsenault comes to the extraordinary conclusion that RUKI was not a unitary phonological process in Sanskrit either and that it was only Pāṇini’s unitary treatment that turned RUKI into a prescriptive rule, but not a natural rule. While the post-Vedic language limits RUKI and retroflex assimilation to internal sandhi, Hock 1979 shows that in Vedic it also applies variably in external sandhi; see e.g. 1.3.1.5.1.1 with example (11).

3.3.2.1.3. Nati or n-retroflection

Sanskrit dental nasals change to retroflex after retroflex ṣ or alveolar (post-dental) r and before [–stop] segments, with the restriction that no coronals may intervene between trigger and target. See e.g. (3).

(3)     varṣ-man-ā    →  varṣ-maṇ-ā    ‘top (INS.SG)’
        var-man-ā     →  var-maṇ-ā     ‘armor (INS.SG)’
        brah-man-ā    →  brah-maṇ-ā    ‘ritual priest (INS.SG)’
    vs. sad-man-ā     =  sad-man-ā     ‘seat (INS.SG)’
        vart-man-ā    =  vart-man-ā    ‘road (INS.SG)’
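The long-distance character of Nati, with a trigger, a target, and blocking by intervening coronals, can be sketched as a left-to-right scan. The Python code below is a simplified illustration under stated assumptions: the segment classes (`TRIGGERS`, `BLOCKERS`, `LICENSERS`) are approximations chosen for the example, with syllabic ṛ/ṝ added as triggers and “[–stop]” read as vowels plus m, y, v, n. The function names are hypothetical.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize to NFC so letters such as ṇ compare as single code points."""
    return unicodedata.normalize("NFC", text)


def seg_set(spec: str) -> set:
    return set(nfc(spec).split())


DIGRAPHS = seg_set("ai au kh gh ch jh ṭh ḍh th dh ph bh")
VOWELS = seg_set("a ā i ī u ū ṛ ṝ ḷ e ai o au")
TRIGGERS = seg_set("ṣ r ṛ ṝ")                  # assumed trigger class
BLOCKERS = seg_set("c ch j jh ñ ś ṭ ṭh ḍ ḍh ṇ t th d dh l s")  # coronals
LICENSERS = VOWELS | seg_set("m y v n")         # assumed reading of [-stop]


def tokenize(word: str) -> list:
    """Split a transliterated word into segments, keeping hyphens."""
    word, segs, i = nfc(word), [], 0
    while i < len(word):
        step = 2 if word[i:i + 2] in DIGRAPHS else 1
        segs.append(word[i:i + step])
        i += step
    return segs


def apply_nati(word: str) -> str:
    """Retroflex n after a trigger unless a coronal intervenes."""
    segs = tokenize(word)
    active = False  # True while a trigger is "in force"
    for i, seg in enumerate(segs):
        if seg == "-":
            continue  # morpheme boundaries are transparent
        if seg in TRIGGERS:
            active = True
        elif seg == "n":
            following = next((s for s in segs[i + 1:] if s != "-"), None)
            if active and following in LICENSERS:
                segs[i] = nfc("ṇ")
            active = False  # n/ṇ is itself coronal
        elif seg in BLOCKERS:
            active = False
    return "".join(segs)


for form in ("varṣ-man-ā", "var-man-ā", "brah-man-ā", "sad-man-ā", "vart-man-ā"):
    print(form, apply_nati(form))
```

The scan keeps a single flag rather than searching backwards from each n, which makes the blocking condition explicit: any coronal resets the flag, so the t of vart-man-ā cancels the effect of the preceding r.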

Like RUKI and retroflex assimilation, Nati entered phonological literature with Zwicky 1964, 1970; see also Schein & Steriade 1986. A number of recent publications discuss the phenomenon under the heading of consonant harmony; see 3.3.5 below. Hock 1979 shows that like RUKI and retroflexion, Nati variably applies across word boundary in Vedic Sanskrit.

3.3.2.2. Sandhi in Dravidian

Indigenous Dravidian scholarship goes back to the Tolkāppiam (before the 5th c. AD), which devotes some four sections of the first book (the Er̤uttatikāram) to sandhi phenomena. Two sandhi phenomena have received broader attention in modern phonological literature. One is a process of initial gemination (reminiscent of Italian raddoppiamento sintattico), the other involves neutralization and/or assimilation of root-final consonants before consonant-initial affixes.

3.3.2.2.1. Initial gemination

Initial gemination is an external-sandhi phenomenon whose application is conditioned (or inhibited) by complex syntactic, morphological, and phonological factors. For an example see (4) from Tamil. The process is found in Tamil and Malayalam, and possibly other (South) Dravidian languages; see e.g. Beythan 1943: 46–50; Andronov 1996: 27–28. It has something like a mirror-image counterpart in Toda, where stops appear as singletons under similar conditions but change to fricatives where Tamil and Malayalam have singletons; Emeneau 1984: 34.

(4) anta  ppustakattai  kkoḍu
    that  book.ACC      give.IMP
    ‘Give (me) that book.’

Generative accounts for Tamil initial gemination are proposed by Vijayakrishnan (1985, 1988), Christdas (1987), and Nagarajan (1994, 1995). The Toda counterpart would be an interesting challenge for phonological analysis.

3.3.2.2.2. Consonant neutralization and assimilation

In Tamil and Malayalam, liquids change to corresponding stops before stops; nasals do so likewise, but with some variation; a following dental t assimilates to preceding alveolar or retroflex, leading to an alveolar or retroflex geminate. See Beythan 1943: 41–44 and Andronov 1996: 26–27. In the case of alveolar liquids (and nasals), sound change leads to further complications in Tamil, in that original alveolar geminates are realized as ttr in the high variety and tt in colloquial varieties. A phonological account has been proposed for Tamil by Vijayakrishnan (1987); see also Wiltshire 2000 (focus on past-tense formations with suffix-initial t).
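The interaction of the two processes just described (liquid to stop, and assimilation of a following t) can be sketched for the laterals. The Python code below is only an illustration: it covers l and ḷ alone, writes the alveolar stop as ṯ (a notational assumption), and stops at the abstract geminate without modeling its later realization as ttr or tt. The roots kēḷ ‘hear’ and kal ‘learn’ are supplied here for illustration and are not taken from the sources cited.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize to NFC so letters such as ḷ compare as single code points."""
    return unicodedata.normalize("NFC", text)


# Root-final lateral -> corresponding stop (ṯ = alveolar, ṭ = retroflex).
LATERAL_TO_STOP = {nfc("l"): nfc("ṯ"), nfc("ḷ"): nfc("ṭ")}


def join_before_t(root: str, suffix: str) -> str:
    """Attach a t-initial suffix to a root, deriving a geminate after l or ḷ."""
    root, suffix = nfc(root), nfc(suffix)
    final = root[-1:]
    if final in LATERAL_TO_STOP and suffix.startswith("t"):
        stop = LATERAL_TO_STOP[final]
        # The lateral becomes a stop and the suffix-initial t assimilates to it.
        return root[:-1] + stop + stop + suffix[1:]
    return root + suffix


print(join_before_t("kēḷ", "t"))  # retroflex geminate
print(join_before_t("kal", "t"))  # alveolar geminate
```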

3.3.2.2.3. Other sandhi phenomena

Telugu has received the most attention. Kelley (1963) examines external vowel sandhi; see also Bhaskararao 1982. Krishnamurti (1957) deals with place and voicing assimilations in clusters resulting from syncope (for which see 3.3.3.2 below). Kolachina et al. (2011) propose an account of sandhi splitting for the purposes of automated tree banking. A number of the phenomena discussed in the following sections may also be considered sandhi processes (in a broader sense).

3.3.3. Syncope

Syncope or vowel deletion processes are common in Modern Indo-Aryan languages and are also a feature of Telugu morphophonology. Interestingly, Hindi and Telugu are diametrically different as regards (some of) the constraints on syncope.

3.3.3.1. Schwa-syncope in Indo-Aryan, especially in Hindi

Schwa-syncope is widespread in Indo-Aryan languages; see e.g. Mistry 1997: 160–162 and Cardona & Suthar 2003: 667 (Gujarati), Koul 2003: 905 (Kashmiri), Miranda 2003: 740 (Konkani), Yadav 2003: 484 (Maithili), Pandharipande 2003: 724 (Marathi), S. Singh 1992 and Bhatia 1993: 349–350 (Panjabi). The language that has received the greatest amount of attention is Hindi.² (Note that phonetic [ǝ] is phonologically the short counterpart of long ā [a:] and, following standard Indological practice, is transcribed here as a.) Important early publications on Hindi “schwa deletion” are Kelkar 1968, Pray 1970, and Narang & Becker 1971. The most significant contributions are those of Ohala (1974a, 1974b, 1977b, 1987, and especially 1983) and Pandey (1990). Important findings include the following constraints: A distinction must be made between “native” vocabulary and words borrowed from Sanskrit (which may be exempt from schwa-deletion). Schwa-deletion is blocked if it results in non-permissible triple consonant clusters (Narang & Becker 1971, modified in Ohala 1983). Schwa-deletion also is blocked in careful (non-allegro) speech if the vowel is flanked by homorganic consonants (Ohala 1983; Pandey 1990). Bakovic 2005 proposes that syncope is a process of “blind” deletion, constrained by “antigemination”. Based on an ingenious experiment involving nonce-derivations, Ohala (1974a, 1977b) further shows that for some speakers, the Hindi alternation between schwa and zero is accounted for by a rule of schwa-insertion, rather than deletion.

² There is also a historical process of schwa-apocope, but this has not left any traces in terms of synchronic alternations. What is found, however, is variation in non-native words, such as Urdu xatm : Hindi khatam ‘finish(ed)’, or Hindi karam : karma ‘karma’.

3.3.3.2. Syncope in Telugu and other Dravidian languages

As noted in 3.3.7.1.2, Mohanan draws on vowel loss to determine Malayalam stress placement (but see Terzenbach 2011). Vowel syncope is also found in Kannada (Sridhar 1990: § 3.4.4.1.2). See also Kissock & Reiss 2003 for Koya, with syncope very similar to Telugu. Telugu syncope has received greater attention. As in Koya, but unlike Malayalam and Kannada, it is an EXTERNAL-sandhi process, affecting final vowels before word or compound boundary. The first publication to account for the phonology of Telugu syncope is Kelley 1963; see also Wilkinson 1974a. Krishnamurti (1957) observes that syncope takes place only if the flanking consonants are homorganic, with the proviso that all coronals are homorganic. Kissock & Reiss (2003) observe that syncope also takes place before word-initial vowel and that, contrary to Krishnamurti, it also takes place if the flanking consonants are not homorganic (as in nellūru biyyam → nellūrbiyyam ‘Nellore rice’).³

³ Telugu also has internal-sandhi alternations between u and Ø, as in nalugu : nalgu ‘four’ (Kissock & Reiss 2003). Given that u is the default epenthetic vowel in Telugu, it is not clear whether the alternation involves u-epenthesis or u-syncope.

3.3.3.3. Syncope and (anti-)antigemination

As we have seen, Telugu requires or at least permits flanking consonants to be homorganic and thus allows for geminate outcomes. By contrast, Hindi blocks syncope if the vowel is flanked by homorganic consonants and thus does not allow for geminate outcomes. This difference creates interesting challenges to phonological theories of antigemination and anti-antigemination. See Odden 1988, Kissock & Reiss 2003. Since both Telugu and Hindi have geminates of independent origin, it does not seem possible to explain the difference in behavior in terms of linguistic structure. Odden (1988: 470) proposes that the conflicting behavior can be resolved ‘as phonologized alternative resolutions of this neural timing problem.’ Kissock and Reiss (who do not explicitly refer to Hindi) suggest an alternative, but their explanation for anti-antigemination looks speculative. What may be relevant is that, as Kissock and Reiss point out, Telugu syncope does not require homorganicity of flanking consonants, it only permits it. Note further that in INTERNAL sandhi, Telugu has been claimed to obey the antigemination (or OCP) constraint (Balusu 2009, 2011). These issues clearly deserve further research.

3.3.4. Vowel harmony and umlaut

Vowel harmony systems have been proposed for a number of South Asian languages, with heavy concentration in the east. Languages include Assamese (U. N. Singh 1985, Goswami & Tamuli 2003: 410–412, Mahanta 2007, 2009), Bangla (U. N. Singh 1985, T. Ghosh 2001, Dasgupta 2003: 357–358, Mahanta 2005, 2007, Begam 2008), Konkani (Miranda 2003: 740), Magahi (S. Verma 2003: 506–507), Oriya (U. N. Singh 1985); Gadaba (Bhaskararao 1998: 334), Telugu (Kelley 1959, 1963, Subbarao 1971, Wilkinson 1974a, 1974b, Prabhakar Babu 1981, Rao 1996, Krishnamurti 1998: 208); Santali (Anderson 2007: 13–14, A. Ghosh 2008), Mayurbhanj Ho (Anderson, Osada & Harrison 2008), Mundari (Stirtz 2013); note further Lhasa Tibetan (Delancey 2003: 271). See also 1.10.3.2 on the combined system of vowel and consonant harmony in Kusunda. The language that has received the most detailed attention is Telugu. Recent work (Kissock 2010 and Kissock & Dworak 2009) questions whether Telugu has vowel harmony, noting that it is morphologically and lexically highly restricted and, as shown by tests with nonce-words, not productive. Gupta and Dutta (2013) find that, unlike other languages with labial harmony, Telugu fails to exhibit ‘labial coarticulatory resistance’. The issue of Telugu vowel harmony, thus, deserves further research. Vowel harmony is not always easy to distinguish from umlaut, and many of the cases labeled vowel harmony in the literature may alternatively be considered umlaut, or just simple vowel assimilation. See e.g. Kissock 2010 and Kissock & Dworak 2009 on Telugu. This issue, too, requires further research. (Relatively) clear cases of umlaut are found in Sinhala (Gair 2003: 779),⁴ and especially in languages of the Northwest, including Kashmiri (Koul 2003: 904–905 and, more detailed, R. N. Bhat 2008), various Dardic languages (Bashir 2003: 823, 835, 882, 885, 876), and in East Middle Iranian Khotanese (Emerick 1989: 210–211 [“palatalization” and “labialization”]) and the Pamir languages (Payne 1989: 427; see also Edelman & Dodykhudoeva 2009: 792–792 for Shughni). In both Kashmiri and some of the Dardic languages, i-umlaut coexists with palatalization (R. N. Bhat 2008, Bashir 2003: 823) and, in Kashmiri, under certain conditions also with front-glide epenthesis before the palatalized consonant, evidently a case of palatalization-onglide segmentalization (for which see Hock 1986/1991: 119–120). Front-glide epenthesis is also found in Maithili (M. Mishra 2006). In both Maithili and in Kashmiri the phenomenon seems to be connected to the super-short final “matra vowels” reported for early Kashmiri by Grierson (1896, 1897, 1911); see also Mishra 2006 for Maithili. For Gadaba, Bhaskararao (1998: 330–331) notes suffix variation between u and i depending on whether the preceding root contains a labial consonant and/or a rounded vowel.

⁴ Umlaut also seems to have been at work in the history of Toda (see the data in Subrahmanyam 2008: 73–77). It is not clear whether it is reflected in synchronic alternations.

3.3.5. Consonant harmony

Retroflex harmony has been observed for Kalasha (Arsenault & Kochetov 2009, based on Bashir 2003) and for Malto, which also has velar/uvular harmony (Steever 1998c: 360–361). A comprehensive account of retroflex/dental harmony is Arsenault 2012, who finds that northern languages, irrespective of linguistic affiliation (Indo-Aryan, Munda, Burushaski), permit T … T and Ṭ … Ṭ, but not T … Ṭ or Ṭ … T, while southern languages (including most of Dravidian as well as Sinhala) have no restrictions. In Kalasha and Indus Kohistani, anticipatory retroflex harmony takes place only in each manner class (e.g. stops, fricatives), but not between different classes. In many of the languages, retroflex/dental harmony is a structural constraint or manifests in historical changes, but some languages have alternations that require a synchronic phonological rule (e.g. Burushaski). Two sandhi phenomena of Sanskrit are frequently included in discussions of consonant harmony: RUKI in Hamann 2003: 107–110; Nati (n-retroflexion) in Gafos 1999, Hansson 2001, Walker & Mpiranya 2005, Rose & Walker 2012.

3.3.6. Palatalization

Palatalization plays a major role in the morphophonology of Kashmiri, both as a phonetic and as a phonological phenomenon (Koul 2003: 902, 905). R. N. Bhat 2008 is a detailed study of Kashmiri palatalization and its correlation with umlaut and palatal onglide segmentalization. Palatalization has also been noted in Dhivehi (Arsenault 2008) and Konkani (Miranda 2003: 738–739), in some of the Dardic languages (Bashir 2003: 823), and with more limited effects in Marathi (Pandharipande 2003: 722). For Tamil see Schiffmann 1979: 60 and Vasanthakumari 1989: 105–106 and passim.

3.3.7. Prosody

The two major areas of prosodic research are stress or accent and intonation (including Focus).

3.3.7.1. Stress or accent

Research on stress or accent focuses on two major issues. One is the acoustic, articulatory, and auditory features of stress or (pitch) accent; the other is the placement of accent or stress within the word. The two phenomena are to some extent related, but research generally is limited to one or the other. A fair amount of the research is based on the examination of relatively short words and hence can lead to problems of interpretation. For instance, in a configuration CVCV́CV it would be difficult to tell whether stress or accent falls on the penult or on the second syllable (the PEN-ANT⁵); and similarly, a configuration CV́CVCV could be interpreted as having either antepenult or initial accent. Further, if only disyllabic words are examined, it is difficult to determine whether high pitch on the second syllable (after the initial low pitch) indicates a melody LH or L … H (with H spreading through a larger prosodic domain than the second syllable). Some claims that have been made about the nature of stress or accent and its location within the word must therefore be taken with a grain of salt. For many languages more research, based on words longer than three syllables and with a variety of different weights in each syllable, is urgently required. General studies on stress or word-accent are Vijayakrishnan 1982 and Keane 2006b (Tamil), Mohanan 1986 (Malayalam), Selkirk 2007 (Bangla), Mahanta 2002 (Assamese), Yadav 1979/1984b: Chapter 5 (Maithili). See also the crosslinguistic study in Schiering & van der Hulst 2010: 551–578, and see the coverage in Pandey 2014. Hindi(-Urdu) accent has perhaps received the greatest attention. It is discussed in Dixit 1963, Mehrotra 1965, Kelkar 1968, A. Sharma 1969, Hussain 1997, Ohala 1977a, 1986, Pandey 1989, Hayes 1995: 162–166, 276–277, Pierrehumbert & Nair 1996, Dyrud 2001, R. Nair 2001, Genzel & Kügler 2010. A useful survey of different views is found in Puri 2013: 36–46.

⁵ For “pen-ant” see Hock 1999: 16–17 with references.

3.3.7.1.1. The nature of stress or accent

A survey of the literature reveals a general tendency of South Asian languages to have a pitch accent with a LH melody, but the realization of the melody may differ. For Hindi(-Urdu), Ohala (1977a) finds a rising pitch on the accented syllable and falling pitch on the next syllable; see also Moore 1965, R. Nair 2001. Harnsberger (1994) observes a similar pattern, but notes an initial fall of F0. Dyrud (1997: 17, 24, 29) also observes a LH melody but notes that ‘(i)f the first vowel is short, most of the rise may be executed on the second syllable’. Hussain (1997) only notes a lower F0 on stressed vowels ‘due to low-tone alignment’. Vijayakrishnan (2003a) also observes a LH contour in Panjabi, but adds that final syllables have no following rise. He also notes that low pitch of the LH melody spreads to the pre-accented syllable (if any). In addition, he discusses further complications resulting from the interaction of (lexical) tone with pitch accent. In the same publication Vijayakrishnan refers to a study by Balusu, R. Mahanta, R. Mohanty, and Vijayakrishnan (In progress) which finds a low rise on stressed syllables in Panjabi, Telugu, Oriya, Bangla, and Assamese; but Balusu 2001 notes that Telugu initial stressed syllables have low pitch.

For Bangla, S. Khan (2008) postulates a “L* … Ha” default, i.e. with the entire word as domain for the rise to high pitch. A similar pattern has been observed for Kharia; Rehberg 2003: 23–28, Peterson 2006: 18–33, 2008: 436–437. According to Peterson, the Kharia pitch rise is compressed on monosyllables. Anderson (Section 1.7.2, this volume) notes a general ‘weak-strong prosodic word pattern … in the syllable structure of Munda languages and their systems of stress assignment.’ Interestingly, Vedic pitch accent has similar LH characteristics, but with L realized on the pretonic syllable (if any), and a fall on the posttonic one in some varieties. In the Rig-Vedic tradition, the posttonic fall (the “svarita”) actually starts on a higher pitch than that of the accented syllable. See Cardona 1993 and Section 1.5.1.1 for further variations. Mathew & Bhat 2010, primarily a study of intonation, suggests a default rising pattern for Kannada females and Konkani males and females, but a different HLH*L pattern for Kannada males and Tulu males and females.

3.3.7.1.2. Stress or accent placement

Except for some of the northwestern languages that have “free” accent (e.g. Gawarbati, Bashir 2003: 830, or Burushaski, Anderson 1997: 1028–1029) and possibly Sora (Anderson & Harrison 2008b: 306), two major tendencies can be observed: left-edge default accent, with possible shift to the right by weight (PROTRACTION⁶), and right-edge-oriented accent systems. Some languages are said to have strict left-edge accent, such as Bangla (e.g. Hayes & Lahiri 1991, Lahiri & Fitzpatrick-Cole 1999, Selkirk 2006) or the Tibeto-Burman languages Chantyal (Noonan 2003a: 317), Nar-Phu (Noonan 2003b: 399), Dolakha Newar (Genetti 2003: 357), and Belhare (Bickel 2003: 547). Initial accent is also often asserted for various Dravidian languages, including Tamil (Annamalai & Steever 1998: 103), Kannada (Steever 1998b: 131), and Goṇḍi (Steever 1998a: 274). See also Schiering & van der Hulst 2010: 566 for Marathi and Anderson, Osada & Harrison 2008: 204 for Ho.

For other languages, left-edge may be the default, but if the first syllable is light and the second is heavy, the accent is protracted, generally to the pen-ant. For Malayalam, Mohanan (1986) proposed this account based on vowel reduction and deletion processes. His analysis was accepted in Asher & Kumari 1997 and Hayes 1995. Mohanan’s claims regarding vowel reduction and deletion are questioned by Terzenbach (2011). More research is needed, especially work based on detailed observations of pitch features. Schiering and van der Hulst (2010: 555, 558, 831) propose similar accounts (left-edge default and protraction to the pen-ant by weight) for Nepali and Assamese (the latter with reference to Mahanta 2002) as well as for Lhasa Tibetan, and so do Shaw (1984) and Das (2001) for Bangla. (S. Khan 2008, however, argues for left-edge only.) See also Krishnamurti & Benham 1998: 244–245 for Koṇḍa. The pattern is also found in a number of Munda languages; see Patnaik 2008: 513 (Juang), Anderson & Harrison 2008a: 565 (Remo), A. Ghosh 2008: 30 and Schiering & van der Hulst 2010: 573–574 (Santali). For Remo, Anderson and Harrison employ a foot-structure analysis which further accounts for secondary stress on even-numbered syllables after the pen-ant. A foot-structure account may also be appropriate for Mundari, where quadrisyllabic words are realized the same way as two disyllabic words; the latter vary between initial and final stress depending on weight, and trisyllabic words have default final accent (Osada 2008: 104).

As it turns out, though Marathi has been characterized as having left-edge accent, the data in Pandharipande 2003: 720 suggest initial default, but if there is a heavy syllable, the accent occurs on the leftmost heavy syllable. A similar system has been proposed by Elfenbein (1997c: 809, 1998: 394) for Brahui and for Balochi (1997b: 774). Systems of this sort seem to display extended protraction by weight, beyond the pen-ant; see Hock 1999: 16–17 for discussion. A similar situation may perhaps hold for Telugu. Lisker and Krishnamurti (1991), based on the evidence of two- or three-syllable words, argue for a tendency for penult or final accent, depending on quantity. But following Sitapati (1936) and Srinivas (1992), Balusu (2001) finds that in words with only short vowels, including words of four or five syllables, the first syllable is stressed. Lisker and Krishnamurti’s penult or final accent, thus, may reflect protraction by weight. To test this possibility, it is necessary to examine four- or five-syllable words with weight variation in every syllable.

6 See Hock 1999: 16–17 (with references) for this phenomenon.
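As an illustration only, the two weight-sensitive left-edge patterns described above can be stated as explicit procedures. The following Python sketch assumes a drastically simplified representation in which a word is a string of syllable weights (L for light, H for heavy) and in which “pen-ant” is read as the second syllable; the function names are hypothetical, and the sketch is not drawn from any of the analyses cited. The first procedure models left-edge default with protraction to the pen-ant by weight; the second models extended protraction, with accent on the leftmost heavy syllable and initial accent as the default.

```python
def _check(weights: str) -> None:
    """Reject empty input and anything other than 'L' (light) or 'H' (heavy)."""
    if not weights or set(weights) - {"L", "H"}:
        raise ValueError("expected a non-empty string over 'L' and 'H'")


def protraction_to_pen_ant(weights: str) -> int:
    """Return the 0-based index of the accented syllable.

    Assumption: initial accent is the default; if the first syllable is
    light and the second is heavy, accent shifts to the second syllable.
    """
    _check(weights)
    if len(weights) >= 2 and weights[0] == "L" and weights[1] == "H":
        return 1
    return 0


def leftmost_heavy(weights: str) -> int:
    """Return the 0-based index of the leftmost heavy syllable.

    Assumption: if the word has no heavy syllable, accent is initial.
    """
    _check(weights)
    index = weights.find("H")
    return index if index != -1 else 0


if __name__ == "__main__":
    # Hypothetical weight profiles, not attested words.
    for profile in ("LL", "LH", "HL", "LLH", "LLLH"):
        print(profile, protraction_to_pen_ant(profile), leftmost_heavy(profile))
```

Under these assumptions the two procedures diverge only when the first heavy syllable lies beyond the second position (e.g. LLH), which is why longer words with weight variation in every syllable are needed to tell the two systems apart.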
Descriptions for Gujarati focus on two- and three-syllable words and conclude that accent occurs on the penult or antepenult, conditioned by weight, but that syllables containing a (< ā) attract the accent irrespective of position; see Mistry 1997: 660, Cardona & Suthar 2003: 667, de Lacy 2006: § 5.4. Cardona and Suthar, however, note that there is great judgment variation among both speakers and linguists. Since in the selected data antepenult coincides with initial position and penult with pen-ant, an examination of words longer than three syllables is needed to determine the correct analysis. The situation may be similar for Nepali. Schiering and van der Hulst (2010: 556–557) state that the accent is on the first syllable, except when the second vowel is long and the first is not; but Riccardi (2003: 552) provides an account in terms of final/penult/antepenult accentuation by weight.

There are, of course, a number of languages for which right-edge-oriented accentuation is certain, with stress on the final, penult, or antepenult (with placement within this range determined by weight). This is true, for instance, for Hindi-Urdu. However, different views exist regarding the exact placement rules. According to Kelkar (1968) stress falls on the final syllable if it is heavy and the preceding syllable is light; elsewhere, the penult is stressed. A similar account is found in Dixit 1963 (although formulated differently). By contrast, Pandey (1989) makes a distinction between heavy and extra-heavy (V̄C) syllables and formulates rules predicting final stress if the final syllable is extra-heavy, antepenult stress if both penult and final are light, and penult stress elsewhere. (See also Mehrotra 1965.) It remains to be investigated whether the differences reflect dialect distinctions or perceptual differences because of the tendency of the LH accent melody to carry over into the post-accent syllable. Moreover, analyses seem to have been based on words of two or three syllables (which, in addition to monosyllabics, form the bulk of the Hindi lexicon). It would be interesting to see how longer words such as asaphal(a)tā ‘failure’ are stressed. Similar right-edge systems are found in Maithili (Yadav 2003: 483), Bhojpuri (M. Verma 2003: 522), and possibly other languages in the “Hindi Belt”, as well as in Southern and perhaps also Eastern Balochi (Jahani & Korn 2009: 649). They also seem to be found in the Munda languages Gorum (Aze 1971: 35), Gtaʔ (Anderson 2008: 686), and Korku (Zide 2008: 260). For the Munda languages it remains to be seen to what extent reported judgments are based on two- or three-syllable words or whether longer words have been examined that would make it possible to determine whether, say, a configuration CVCV́CV should be analyzed as having penult or pen-ant accent.

3.3.7.1.3. Possible interaction of LH pitch accent and accent placement, and other effects

As Dyrud (1997: 17, 24, 29) observes regarding the LH pitch accent of Hindi-Urdu, ‘(i)f the first vowel is short, most of the rise may be executed on the second syllable’. This fact may provide an explanation, at least in principle, for the common protraction from left-edge accent to the pen-ant by weight: the second-syllable rise is reinterpreted as indicating accent location. An alternative, namely that pen-ant accent by weight is simply a linguistic misanalysis, could be dismissed by referring to languages like Marathi and Brahui which seem to exhibit extended protraction by weight, beyond the pen-ant. It is to be hoped that further research will lead to greater certainty. Interestingly, something like protraction has been invoked in comparative Dravidian linguistics to account for loss of vowels in initial syllables, a process often referred to as aphaeresis (Subrahmanyam 1983: 224–248) or, for some of the languages, as “apical displacement” (Krishnamurti 1978). Referring to Ramaswamy Aiyar (1931–1932: 471), Subrahmanyam accounts for the phenomenon as resulting from ‘redistribution of accent resulting in an increase of accent in the original second syllable and a conspicuous decrease in the first syllable’. Perhaps significantly, the majority of Tamil-Malayalam cases cited by Subrahmanyam have long vowels in the second syllable. The lack of such a restriction in other languages could be accounted for as resulting from further extension of the process.


3.3.7.2. Intonation, including focus

Bangla has received the greatest amount of attention, in good part because of Hayes & Lahiri 1991, a detailed study of the phonology of intonation of Bengali employing the descriptive framework of ToBI. They find that focus is characterized by a H tone ‘near the end of the focused constituent’. See further Fitzpatrick-Cole & Lahiri 1997, Lahiri & Fitzpatrick-Cole 1999 (which also considers emphatic clitics), and the later, more abstract OT reanalysis by Selkirk (2007). S. Khan (2008) further elaborates on earlier work, showing that Bengali focus is marked by ‘a special high tone’, with three different realizations (depending on the type of focus) as well as different patterns of phrasing. Truckenbrodt (2002) examines variation in p-phrasing in Bangla. Das 2001 examines the prosodic phonology of Tripura Bangla. Although not the major focus of Hayes and Lahiri (1991), their claim that in contrast to other OV languages, Bangla utterance-final verbs are prosodically prominent has been widely accepted; see e.g. Ladd 1996, 1980, Gussenhoven 2002, Truckenbrodt 2002. By contrast, Dutta and Hock (2006) present experimental evidence that Bangla utterance-final verbs avoid prosodic prominence. Lahiri (p.c. 2013) objects, claiming that her study (with Hayes) found that final verbs always form IPs of their own and that Dutta and Hock’s data show downstep on the final verb, not absence of prosodic prominence. While this proposal may work for some of Dutta and Hock’s data, it does not do so for others, especially examples with creaky phonation on the final verb. Further, more detailed investigation is needed.

Hindi has also received a fair amount of coverage. Féry 2010 and Patil et al. 2008 address issues of focus, word order, and intonation in Hindi. Harnsberger 1999 and Harnsberger & Judge 1996 observe greater pitch excursion and longer duration for focused phrases and pitch-range reduction for postfocal elements. Patil et al. (2008) also note pitch excursion and longer duration on focused elements, but no pitch rise, and find that pitch-range reduction of postfocal elements is a major feature of focus. See also Sengar and Mannell 2012 on the acoustic characteristics of Hindi intonation, which notes considerable speaker variation as regards chunking.

Other languages do not seem to have been as fully investigated. Prabhakar Babu (1978) gives a detailed description of intonational patterns of colloquial Telugu. Prakasam (1979) discusses various suprasegmental features of Telugu as part of its sentence phonology. The intonational system of Tamil is studied by Ravishankar (1988). Keane (2006a) shows that Tamil interrogatives tend to be followed by ‘lowering of f0 peaks … resulting in some compression of the pitch register’. Keane 2014 presents the results of a detailed follow-up study, investigating a large variety of different sentence types. Intonation systems of other Indian languages are discussed in Kelkar 1958 and Gokhale 1982 (Marathi), Nihalani 1984 (Sindhi), Pathak 1977 (Bagheli), Genetti & Slater 2004 (Dolakha Newar), C. S. Singh 2014 (Panjabi); see also Rajapurohit (ed.) 1986, with brief studies on tonal and intonational phenomena in a variety of Indian languages.

3.3.8. Word structure

Donegan (1993) and Donegan and Stampe (2002, 2004) propose a major rhythmic shift in Munda from original head-initial, iambic structure to head-final, trochaic structure (as part of a broader structural realignment). While not challenging the claim that Munda languages have trochaic structure, Brunelle and Pittayaporn (2012: 4) question the “rhythmic shift” hypothesis and ‘argue that direct rhythmic shifts are unlikely and that languages that have undergone such shifts probably went through a stage in which they had no word stress’. (Their major focus, however, is on correlations between iambic and trochaic foot structure and Southeast Asian monosyllabicity vs. sesquisyllabicity.) More important, some (South) Munda languages appear to have final/penult accent (see 3.3.7.1.2), and others have LH, i.e. iambic pitch accent. In fact, Anderson (1.7.2, this volume) argues for a general Munda tendency of weak-strong word structure (with L (…) H pitch contour) and notes that Gtaʔ tends to lose vowels in initial syllables and thus approximates sesquisyllabicity. What complicates matters is that the Kharia L (…) H pitch contour has been analyzed as initial accent, characterized by lower pitch, with pitch rise on the following, unaccented syllables (Rehberg 2003: 23–28), and similar accounts are possible for other Munda languages. Under such an analysis it becomes difficult to decide whether (some of the) Munda languages are left-edge dominant (because of initial and/or pen-ant accent placement) and thus possibly trochaic, or iambic (because of their LH, weak-strong prosodic organization). Anderson (1.7.3, this volume) notes that Khasi exhibits areally unusual initial clusters. ‘Khasi thus shows a characteristic Mon-Khmer word profile with a minor syllable followed by a major syllable in a low-high prosodic word structure … as seen in examples such as bta ‘wash/besmear face’, ksew ‘dog’, kti ‘hand’, ktháw ‘grandfather’.’ Khasi may thus be added to Gtaʔ as an Austro-Asiatic South Asian language with (near-)sesquisyllabicity. Sesquisyllables are clearly found in Boro-Garo and Kuki-Chin (Delancey 2014: 64). Matisoff (2003: 97) reconstructs sesquisyllabicity for Tibeto-Burman prefixed structures. It remains to be seen whether the sesquisyllables of Boro-Garo and Kuki-Chin are inherited or an innovation, perhaps attributable to the general South Asian trend toward LH pitch contours. A crosslinguistic study by Vijayakrishnan (2007) argues for a ‘disyllabic word minimum’, with different phonological manifestations in Bangla, Panjabi, and Tamil.


Bibliographical references

Abbi, Anvita 1987 Palatals or lamino-dentals in Khasi? A probe into feature theory. International Journal of Dravidian Linguistics 16: 99–107. Abbi, Anvita, and Awadesh K. Mishra 1984–1985 Consonant clusters and syllabic structures of Meitei. Linguistics of the Tibeto-Burman Area 8: 81–92. Abbi, Anvita, R. S. Gupta, and Ayesha Kidwai (eds.) 2001 Linguistic structure and language dynamics in South Asia: Papers from the proceedings of SALA XVIII Roundtable. Delhi: Motilal Banarsidass. Acharya, K. P. 1975 Lotha phonetic reader. Mysore: Central Institute of Indian Languages. Allen, W. Sidney 1953 Phonetics in ancient India. London: Oxford University Press. Allen, W. Sidney 1962 Sandhi: The theoretical, phonetic, and historical bases of word-juncture in Sanskrit. ’s-Gravenhage: Mouton. Anderson, Gregory D. S. 1997 Burushaski phonology. In: Kaye (ed.) 1997: 1021–1041. Anderson, Gregory D. S. 2007 The Munda verb. Berlin/New York: Mouton de Gruyter. Anderson, Gregory D. S. 2008 Gtaʔ. In: Anderson (ed.) 2008: 682–763. Anderson, Gregory D. S., and K. David Harrison 2008a Remo (Bonda). In: Anderson (ed.) 2008: 557–632. Anderson, Gregory D. S., and K. David Harrison 2008b Sora. In: Anderson (ed.) 2008: 299–380. Anderson, Gregory D. S., Toshiki Osada, and K. David Harrison 2008 Ho and the other Kherwarian languages. In: Anderson (ed.) 2008: 195–255. Anderson, Gregory D. S., and Felix Rau 2008 Gorum. In: Anderson (ed.) 2008: 381–433. Anderson, Gregory D. S. (ed.) 2008 The Munda languages. Oxford/New York: Routledge. Anderson, Stephen R. 1970 On Grassmann’s Law in Sanskrit. Linguistic Inquiry 1: 387–396. Andronov, Michail S. 1996 A grammar of the Malayalam language in historical treatment. Wiesbaden: Harrassowitz. Annamalai, E., and Sanford B. Steever 1998 Modern Tamil. In: Steever (ed.) 1998: 100–128. Arokianathan, S. 1980 Tangkhul Naga phonetic reader. Mysore: Central Institute of Indian Languages.


Arsenault, Paul 2008 Coronal features and retroflexion in Indo-Aryan languages. University of Toronto PhD Generals Paper. https://twpl.library.utoronto.ca/index.php/twpl/ article/view/6560/3523 (accessed 22 February 2014) Arsenault, Paul 2012 Retroflex consonant harmony in South Asia. University of Toronto PhD dissertation. https://twpl.library.utoronto.ca/index.php/twpl/article/view/6560/3523 (accessed 22 February 2014) Arsenault, Paul, and Alexei Kochetov 2009 Retroflex (consonant) harmony in Kalasha. 83rd Annual Meeting of the Linguistic Society of America. Asher, Ronald E. 1985 Tamil. London/Sydney/Dover: Croom Helm. Asher, Ronald E., and T. C. Kumari 1997 Malayalam. London/New York: Routledge. Aze, Richard 1971 Parengi (Gorum) phonemic summary. Kathmandu: Summer Institute of Linguistics. Baart, Joan L. G. 1997 The sounds and tones of Kalam Kohistani, with words and texts. Islamabad: Quaid-i-Azam University. Baart, Joan L. G. 1999 Tone rules in Kalam Kohistani (Garwi, Bashkarik). Bulletin of the School of Oriental and African Studies 62: 88–104. Baart, Joan L. G. 2003 Tonal features in languages of northern Pakistan. In: Joan L. G. Baart and Ghulam Hyder Sindhi (eds.), Pakistani languages and society: Problems and prospects, 132-144. Islamabad: National Institute of Pakistan Studies/Summer Institute of Linguistics. http://www.fli-online.org/documents/linguistics/tone_ in_np.pdf (accessed 29 June 2011) Bahl, Kali Charan 1955–1956 Tones in Panjabi. Indian Linguistics 17: 139–147. Bailey, Thomas Grahame 1914 A Panjabi phonetic reader. London: University of London Press. Bailey, Thomas Grahame 1924 Grammar of the Shina (Ṣiṇā) language. London: The Royal Asiatic Society. Bailey, Thomas Grahame 1937 The pronunciation of Kashmiri. London: The Royal Asiatic Society. Bakovic, Eric 2005 Antigemination, assimilation and the determination of identity. Phonology 22: 279–315. 
http://www.unice.fr/dsl/egg/constanta10/Reiss/Bakovic%20 antigemination.pdf (accessed 1 July 2014) Bakst, Sarah 2012 Rhotics and retroflexes in Indic and Dravidian. Cambridge University MPhil dissertation. Balasubramanian, T. 1980 The pure oral vowels of colloquial Tamil: A spectrographic study. International Journal of Dravidian Linguistics 9: 23–35.


Balasubramanian, T. 1982a The two r’s and the two n’s in Tamil. Journal of Phonetics 10: 89–97. Balasubramanian, T. 1982b Intervocalic double nasal and lateral consonant articulations in Tamil. Journal of Phonetics 10: 99–104. Balusu, Rahul 2001 Acoustic correlates of stress and accent in Telugu. South Asian Languages Analysis Meeting 21. https://files.nyu.edu/rb964/public/correlates.pdf (accessed 26 June 2014) Balusu, Rahul 2009 OCP effects in Telugu. New York University PhD dissertation. Balusu, Rahul 2011 OCP effects in Telugu. In: Wai-Sum Lee & Eric Zee (eds.), Online proceedings of the International Congress of Phonetic Sciences XVII, 284–287. Hong Kong: Department of Chinese, Translation and Linguistics, City University of Hong Kong. http://www.icphs2011.hk/resources/OnlineProceedings/Regular Session/Balusu/Balusu.pdf (accessed 25 August 2014) Balusu, Rahul, R. Mahanta, R. Mohanty, and K. G. Vijayakrishnan In Progress The low rise on stress in Punjabi, Telugu, Oriya, Bangla and Assamese. MS, English and Foreign Languages University. Bashir, Elena 2003 Dardic. In: Cardona & Jain (eds.) 2003: 818–894. Begam, Monira 2008 The role of “sandhi” in Bangla languages. Dhaka University Journal of Linguistics 1(1): 69–78. http://www.banglajol.info/index.php/DUJL/article/view/ 3355 (accessed 26 June 2014) Benedict, Paul K. 1994 Garo and rGyarong (Suomo) prosodies. Linguistics of the Tibeto-Burman Area 17: 179–180. Benguerel, Andre-Pierre, and Tej K. Bhatia 1980 Hindi stop consonants: An acoustic and fiberscopic study. Phonetica 37: 134–48. Berger, Hermann 1974 Das Yasin-Burushaski (Werchikwar). Wiesbaden: Harrassowitz. Berger, Hermann 1998 Die Burushaski-Sprache von Hunza und Nager, 3 vols. Wiesbaden: Harrassowitz. Berkson, Kelly Harper 2012 Phonation types in Marathi: An acoustic investigation. University of Kansas PhD dissertation. 
http://kuscholarworks.ku.edu/dspace/handle/1808/ 12339 (accessed 31 January 2014) Beythan, Hermann 1943 Praktische Grammatik der Tamilsprache in Umschrift. Leipzig: Harrassowitz. Bharti, Surabhi 1994 Aspects of the phonology of Hindi and English. New Delhi: Arnold Publishers. Bhaskararao, Peri 1982 A re-examination of consonantal sandhi in modern colloquial Telugu. Bulletin of the Deccan College Research Institute 41: 16–26.


Bhaskararao, Peri 1989 The process of chiming in Tiddim Chin. Linguistics of the Tibeto-Burman Area 12(1): 110–132. Bhaskararao, Peri 1998 Gadaba. In: Steever (ed.) 1998: 328–357. Bhaskararao, Peri 1999 Voiced aspiration and tonogenesis in some South-Asian languages. In: Shigeki Kaji (ed.), Cross-linguistic studies of tonal phenomena: Tonogenesis, typology and related topics, 337–345. Tokyo: ILCAA, Tokyo University of Foreign Studies. Bhaskararao, Peri, and Peter Ladefoged 1992 Two types of voiceless nasals. Journal of the International Phonetic Association 21(2): 80–88. Bhaskararao, Peri, and Peter Ladefoged 2009 Timing constraints within gestures: A re-examination of Toda sibilants. Indian Linguistics 70: 73–78. Bhaskararao, Peri, Sheeba Hassan, I. A. Naikoo, P. A. Ganai, N. H. Wani, and T. Ahmad 2009 A phonetic study of Kashmiri palatalization. In: M. Minegishi et al. (eds.), Field research, corpus linguistics and linguistic informatics, 1–17. (Working Papers in Corpus-based Linguistics and Language Education 3.) Tokyo: Tokyo University of Foreign Studies. http://www.aa.tufs.ac.jp/~bhaskar/dardic/kashpal.pdf (accessed 30 November 2014) Bhat, D. N. S. 1974 Retroflexion and retraction. Journal of Phonetics 2: 233–237. Bhat, Raj Nath 2008 Palatalization: A note on Kashmiri morphophonology. Indian Linguistics 69: 43–50. Bhatia, Tej K. 1993 Punjabi: A cognitive-descriptive grammar. London/New York: Routledge. (Reprinted 2000.) Bhatia, Tej K., and Michael J. Kenstowicz 1972 Nasalization in Hindi: A reconsideration. Papers in Linguistics 5: 201–212. Bhattacharya, K. 1999 Bengali phonetic reader. (Reprint) Mysore: Central Institute of Indian Languages. Bickel, Balthasar 2003 Belhare. In: Thurgood & LaPolla (eds.) 2003: 546–570. Bielenberg, Brian, and Zhalie Nienu 2001 Chokri (Phek dialect): Phonetics and phonology. Linguistics of the TibetoBurman Area 24(2): 85–122. Bielmeier, Roland 1982 On tone in Tibetan. In: Helga Uebach and Jampa L. 
Panglung (eds.), Studia Tibetica: Quellen und Studien zur tibetischen Lexikographie, 2: 43–54. München: Kommission für Zentralasiatische Studien, Bayerische Akademie der Wissenschaften. Bielmeier, Roland, and Felix Haller (eds.) 2011 Linguistics of the Himalayas and beyond. Berlin/New York: Mouton de Gruyter.


Bieri, Dora, and Marlene Schulze 1969 Sunwar phonemic summary. (Tibeto-Burman phonemic summaries.) Dallas: SIL International. Blankenship, Barbara, Peter Ladefoged, Peri Bhaskararao, and C. Nichumeno 1992 Phonetic structures of Khonoma Angami. Linguistics of the Tibeto-Burman Area 16(2): 69–88. Bodman, Nicholas 1989 Some remarks on Lepcha vowels. In: David Bradley, Eugénie J. A. Henderson, and Martine Mazaudon (eds.), Prosodic analysis and Asian linguistics: To honour R. K. Sprigg, 137–141. Canberra: Pacific Linguistics. Brunelle, Marc, and Pittayawat Pittayaporn 2012 Phonologically-constrained change: The role of the foot in monosyllabization and rhythmic shifts in Mainland Southeast Asia. Diachronica 29(4): 411–433. http://aix1.uottawa.ca/~mbrunell/Feet,%20monosyllabization%20and%20 rhythmic%20shifts%20in%20MSEA.pdf (accessed 29 June 2014) Bundrick, Camille 1987 A lexical phonological approach to Hindi schwa deletion. Studies in the Linguistic Sciences 17(1): 15–24. Burling, Robbins 1981 Garo spelling and Garo phonology. Linguistics of the Tibeto-Burman Area 6(1): 61–81. Burling, Robbins 1992 Garo as a minimal tone language. Linguistics of the Tibeto-Burman Area 15(2): 33–51. Burling, Robbins, and U. V. Joseph 2001 Tone correspondences among the Bodo languages. Linguistics of the TibetoBurman Area 24(2): 41–55. Burling, Robbins, and L. Amon Phom 1998 Phom phonology and word list. Linguistics of the Tibeto-Burman Area 21(2): 13–42. Calabrese, Andrea, and Samuel Jay Keyser 2006 On the peripatetic behavior of aspiration in Sanskrit roots. In: Eric Bakovic, Junko Ito, and John McCarthy (eds.), Wondering at the natural fecundity of things: Essays in honor of Alan Prince, 71–94. Santa Cruz: UC Santa Cruz Linguistics Research Center. 
http://homepages.uconn.edu/~anc02008/Papers/ On%20the%20Peripathetic%20Behavior%20of%20Aspiration%20in%20 Sanskrit%20roots.pdf (accessed 6 January 2014) Cardona, George 1986 Phonology and phonetics in ancient Indian works: The case of voiced and voiceless elements. In: Krishnamurti et al. (eds.) 1986: 60–80. Cardona, George 1991 On the dialectal status of Vedic forms of the type dakṣ-/dhakṣ-. In: B. Lakshmi Bai and B. Ramakrishna Reddy (eds.), Studies in Dravidian and general linguistics: A Festschrift for Bh. Krishnamurti. Hyderabad: Osmania University Publications in Linguistics. Cardona, George 1993 The Bhāṣika accentuation system. Studien zur Indologie und Iranistik 18: 1–40.


Cardona, George 2003 Sanskrit. In: Cardona & Jain (eds.) 2003: 104–160. Cardona, George, and Babu Suthar 2003 Gujarati. In: Cardona & Jain (eds.) 2003: 659–697. Cardona, George, and Dhanesh Jain (eds.) 2003 The Indo-Aryan languages. London/New York: Routledge. Catford, J. C. 1968 The articulatory possibilities of man. In: Bertil Malmberg (ed.), Manual of phonetics, 309–333. Amsterdam: North Holland. Caughley, Ross C. 1969 Chepang phonemic summary. (Tibeto-Burman phonemic summaries.) Dallas: SIL International. Chatterji, Suniti Kumar 1926 The origin and development of the Bengali language. Calcutta University Press. Repr. 1970, London: Allen & Unwin; distributed by Motilal Banarsidass, Delhi. Chelliah, Shobhana 1990 Level ordered morphology and phonology in Manipuri. Linguistics of the Tibeto-Burman Area 13(2): 27–71. http://www.ling.unt.edu/~chelliah/pdf/ PDF10_Level%20Ordered%20Morphology%20and%20Phonology%20 in%20Manipuri.pdf (accessed 25 June 2014) Chelliah, Shobhana 1991 Tone in Manipuri. In: Martha Ratliff and Eric Schiller (eds.), Proceedings of the 1st Meeting of the Southeast Asian Linguistics Society, 65–85. Tempe: Arizona State University. Chelliah, Shobhana 1997 A grammar of Meithei. Berlin/New York: Mouton de Gruyter. Cho, Taehong, and Peter Ladefoged 1999 Variation and universals in VOT: Evidence from 18 languages. Journal of Phonetics 27: 207–229. Christdas, Prathima 1987 On constraining the power of Lexical Phonology: Evidence from Tamil. Proceedings of the Northeastern Linguistic Society 17: 122–146. Christdas, Prathima 1988 The phonology and morphology of Tamil. Cornell University PhD dissertation. Christdas, Prathima 2013 The phonology and morphology of Tamil. Oxford/New York: Routledge. Collinge, N. E. 1985 The laws of Indo-European. Amsterdam/Philadelphia: Benjamins. Coupe, Alexander R. 2003 A phonetic and phonological description of Ao: A Tibeto-Burman language of Nagaland, North-East India. Canberra: Pacific Linguistics. Coupe, Alexander R. 
2007 A grammar of Mongsen Ao. Berlin/New York: Mouton de Gruyter. D’Souza, Jean 1985 Schwa syncope and vowel nasalization in Hindi-Urdu: A non-linear approach. Studies in the Linguistic Sciences 15(1): 11–30.


Dantsuji, M. 1987 Some acoustic observations on half nasals in Sinhalese. In: Tamaz Valerianovich Gamkrelidze (ed.), Proceedings of the Eleventh International Congress of Phonetic Sciences, Tallinn, vol. 4: 165–168. Tallinn: Academy of Sciences of the Estonian SSR. Dart, Sarah N., and Paroo Nihalani 1999 The articulation of Malayalam coronal stops and nasals. Journal of the International Phonetic Association 29(2): 129–142. Das, Shyamal 2001 Some aspects of the prosodic phonology of Tripura Bangla and Tripura Bangla English. Central Institute of Indian and Foreign Languages PhD dissertation. http://roa.rutgers.edu/files/493–0202/493–0202-DAS-0–0.PDF (accessed 31 January 2014) Dasgupta, Probal 2003 Bangla. In: Cardona & Jain (eds.) 2003: 351–390. Dave, Radhekant 1970 A formant analysis of the clear, nasalized and murmured vowels in Gujarati. Indian Linguistics 28: 1–47. Dave, Radhekant 1977 Retroflex and dental consonants in Gujarati: A palatographic and acoustic study. Annual Report of the Institute of Phonetics, University of Copenhagen 11: 27–156. David, Anne Boyle 2013 Descriptive grammar of Pashto and its dialects, ed. by Claudia Brugman. Berlin/New York: de Gruyter Mouton. David, Anne Boyle 2015 Bangla, ed. by Thomas J. Conners and Dustin Chacón. Berlin/New York: de Gruyter Mouton. de Lacy, Paul 2006 Markedness: Reduction and preservation in phonology. Cambridge: Cambridge University Press. Delancey, Scott 1989 Contour tones from lost syllables in Central Tibetan. Linguistics of the TibetoBurman Area 12(2): 33–34. Delancey, Scott 2003 Lhasa Tibetan. In: Thurgood & La Polla (eds.) 2003: 270–288. Delancey, Scott 2014 Sociolinguistic typology in North East India: A tale of two branches. Journal of South Asian Languages and Linguistics 1(1): 59–82. Dempsey, Jacob 2003 Analysis of rime-groups in Northern-Burmish. Linguistics of the TibetoBurman Area 26(1): 59–117. Dhongde, Ramesh Vaman, and Kashi Wali 2009 Marathi. Amsterdam/Philadelphia: Benjamins Dixit, R. 
Prakash 1963 The segmental phonemes of contemporary Hindi. University of Texas, Austin, MA thesis.


Dixit, R. Prakash 1989 Glottal gestures in Hindi plosives. Journal of Phonetics 1: 213–237. Dixit, R. Prakash 1990 Linguotectal contact patterns in the dental and retroflex stops of Hindi. Journal of Phonetics 18: 189–201. Dixit, R. Prakash 1993 Spatiotemporal patterns of glottal dynamics and control of voicing and aspiration in Hindi stops. Indian Linguistics 54: 1–36. Dixit, R. Prakash, and Jim Flege 1991 Vowel context, rate and loudness effects of linguopalatal contact patterns in Hindi retroflex /ʈ/. Journal of Phonetics 19: 213–229. Dixit, R. Prakash, and Thomas Shipp 1985 Study of subglottal air pressure during Hindi stop consonants. Phonetica 42: 53–78. Dočkalová, Lenka 2009 Development of sandhi phenomena in Sanskrit and in Aśokan Prakrit and Pāli. Linguistica Brunensia 57(1–2): 45–59. http://digilib.phil.muni.cz/bitstream/ handle/11222.digilib/115116/1_LinguisticaBrunensia_10–2009–1_6.pdf (accessed 7 January 2014) Donegan, Patricia 1993 Rhythm and vocalic drift in Munda and Mon-Khmer. Linguistics of the TibetoBurman Area 16(1): 1–43. http://www.ling.hawaii.edu/faculty/donegan/ Papers/1993rhythm.pdf (accessed 29 June 2014) Donegan, Patricia, and David Stampe 2002 South-East Asian features in the Munda languages: Evidence for the analyticto-synthetic drift of Munda. In: Patrick Chew (ed.), Proceedings of the TwentyEighth Annual Meeting of the Berkeley Linguistics Society, 111–120. Berkeley: Berkeley Linguistics Society. http://www.ling.hawaii.edu/faculty/donegan/ Papers/2002mundadrift.pdf (accessed 29 June 2014) Donegan, Patricia, and David Stampe 2004 Rhythm and the synthetic drift of Munda. In: Rajendra Singh (ed.), The yearbook of South Asian languages and linguistics 2004, 3–36. New Delhi: Thousand Oaks. http://www.ling.hawaii.edu/faculty/donegan/Papers/2004rhythm.pdf (accessed 29 June 2014) Duanmu, San 1992 An autosegmental analysis of tone in four Tibetan languages. Linguistics of the Tibeto-Burman Area 15(1): 65–91. 
Duanmu, San 1994 The phonology of the glottal stop in Garo. Linguistics of the Tibeto-Burman Area 17(2): 69–82. Dulai, Narinder K., and Omkar Nath Koul 1980 Punjabi phonetic reader. Mysore: Central Institute of Indian Languages. Dutta Baruah, P. N. 1992 Assamese phonetic reader. Mysore: Central Institute of Indian Languages. Dutta, Indranil 2009 Acoustics of stop consonants in Hindi: Voicing, fundamental frequency and spectral intensity. Saarbrücken: VDM Verlag.

Phonetics and phonology


Dutta, Indranil, and Hans Henrich Hock 2006 Interaction of verb accentuation and utterance finality in Bangla. In: Rüdiger Hoffmann and Hansjörg Mixdorff (eds.), Speech Prosody 2006, Dresden (CD-ROM Proceedings). (Studientexte zur Sprachkommunikation, 40.) Dresden: TUDpress. http://sprosig.isle.illinois.edu/sp2006/contents/papers/PS8–11_0161.pdf (accessed 20 June 2014) Dutta, Indranil, and Charlie Redmon 2013 Coarticulation and contrast in static and dynamic models of second formant trajectories. The Journal of the Acoustical Society of America 134(5): 4203. http://duttalab.wikispaces.com/file/view/ASA_SanFran.pdf/477961412/ASA_SanFran.pdf (accessed 19 June 2014) Dyrud, Lars O. 2001 Hindi-Urdu: Stress accent or non-stress accent? University of North Dakota MA thesis. http://arts-sciences.und.edu/summer-institute-of-linguistics/theses/_files/docs/2001-dyrud-lars.pdf (accessed 1 February 2014) Edelman, D. (Joy) I., and Leila R. Dodykhudoeva 2009 Shughni. In: Windfuhr (ed.) 2009: 787–824. Ekka, Francis 1985 Kurux phonetic reader. Mysore: Central Institute of Indian Languages. Elfenbein, Josef 1997a Pashto phonology. In: Kaye (ed.) 1997: 733–760. Elfenbein, Josef 1997b Balochi phonology. In: Kaye (ed.) 1997: 761–776. Elfenbein, Josef 1997c Brahui phonology. In: Kaye (ed.) 1997: 797–814. Elfenbein, Josef 1998 Brahui. In: Steever (ed.) 1998: 388–414. Emeneau, Murray B. 1939 The vowels of the Badaga language. Language 15: 43–47. Emeneau, Murray B. 1952 Sanskrit sandhi and exercises, rev. ed. London: Cambridge University Press. Emeneau, Murray B. 1984 Toda grammar and texts. Philadelphia: American Philosophical Society. Emeneau, Murray B., and B. A. van Nooten 1968 Sanskrit sandhi and exercises, 2nd ed. Berkeley: University of California Press. Emerick, Ronald E. 1989 Khotanese and Tumshuqese. In: Schmitt (ed.) 1989: 204–229. Esposito, Christina M., Sameer ud Dowla Khan, and Alex Hurst 2007 Breathy nasals and /Nh/ clusters in Bengali, Hindi, and Marathi.
Indian Linguistics 68: 275–299. http://www.macalester.edu/academics/linguistics/facultystaff/christinaesposito/documents/EspositoKhanHurstBreathyNasalsandNhClustersinBengaliHindiandMarathi.pdf (accessed 31 January 2014) Evers, Vincent, Henning Reetz, and Aditi Lahiri 1998 Crosslinguistic acoustic categorization of sibilants independent of phonological status. Journal of Phonetics 26: 345–370. Feinstein, Mark H. 1979 Prenasalization and syllable structure. Linguistic Inquiry 10: 245–278.


Bibliographical references

Féry, Caroline 2010 Indian languages as intonational ‘phrase languages’. In: I. Hasnain and S. Chaudhury (eds.), Festschrift to honour Ramakant Agnihotri. Delhi: Aakar Publisher. http://web.uni-frankfurt.de/fb10/fery/publications/Indian_Languages_Phrase_Languages.pdf (accessed 31 January 2014) Fischer-Jørgensen, Eli 1967 Phonetic analysis of breathy (murmured) vowels in Gujarati. Indian Linguistics 28: 71–139. Fitzpatrick-Cole, Jennifer, and Aditi Lahiri 1997 Focus, intonation and phrasing in Bengali and English. In: Antonis Botinis, Georgios Kouroupetroglou, and George Carayiannis (eds.), Intonation: Theory, models and applications: Proceedings of the ESCA Workshop, Athens, 119–122. Athens: ESCA/University of Athens Department of Informatics. Fromkin, Victoria (ed.) 1978 Tone: A linguistic survey. New York: Academic Press. Fulop, Sean A., and Michael Dobrovolsky 1999 An instrumental analysis of Sharchhop obstruents. Linguistics of the Tibeto-Burman Area 22(1): 59–70. Gafos, Adamantios I. 1999 The articulatory basis of locality in phonology. New York: Garland Publishing. http://web.jhu.edu/sebin/o/y/gafos_dissertation-table-of-content.pdf (accessed 7 January 2014) Gair, James W. 2003 Sinhala. In: Cardona & Jain (eds.) 2003: 766–817. Gair, James W., and John C. Paolillo 1997 Sinhala. München: LINCOM. Gandour, Jackson T. 1974 Consonant types and tone in Siamese. Journal of Phonetics 2: 337–350. Genetti, Carol 2003 Dolakhā Newār. In: Thurgood & LaPolla (eds.) 2003: 355–370. Genetti, Carol 2007 A grammar of Dolakha Newar. Berlin/New York: Mouton de Gruyter. Genetti, Carol, and Keith Slater 2004 An analysis of syntax and prosody interactions in a Dolakhā Newar rendition of The Mahābhārata (with appendices and sound files). Himalayan Linguistics 3: 1–91. http://www.linguistics.ucsb.edu/HimalayanLinguistics/articles/2004/PDF/HLJ01_Genetti_with.pdf (accessed 22 June 2014) Genzel, Susanne, and Frank Kügler 2010 The expression of contrast in Hindi. Speech Prosody 2010.
http://www.ling.uni-potsdam.de/~kuegler/docs/2010.Genzel.Kuegler.SP.paper100143.pdf (accessed 1 February 2014) Ghai, Veda Kumari 1991 Studies in phonetics and phonology with special reference to Dogri. New Delhi: Ariana. Ghosh, Arun 2008 Santali. In: Anderson (ed.) 2008: 11–98.


Ghosh, Tanmay 2001 Vowel harmony in Bangla: An optimality account. In: Abbi et al. (eds.) 2001: 144–163. Gill, Harjeet Singh, and Henry A. Gleason 1972 The salient features of Punjabi language. Pàkha Sanjam 5: 1–150. Patiala: Punjabi University. Glover, Warren W. 1969 Gurung phonemic summary. (Tibeto-Burman phonemic summaries.) Dallas: SIL International. Gnanadesikan, Amalia Elisabeth 1997 Phonology with ternary scales. University of Massachusetts, Amherst, PhD dissertation. https://rucore.libraries.rutgers.edu/rutgers-lib/37826/pdf/1/ (accessed 4 July 2014) Gokhale, S. B. 1982 Intonation in Marathi and Marathi English. Central Institute of English and Foreign Languages PhD dissertation. Gordon, Kent 1969 Sherpa phonemic summary. (Tibeto-Burman phonemic summaries.) Dallas: SIL International. Goswami, G. C., and Jyotiprakash Tamuli 2003 Asamiya. In: Cardona & Jain (eds.) 2003: 391–443. Green, R. Jeffrey 2012 The phonology of voicing and aspiration in Amdo Tibetan. Linguistics of the Tibeto-Burman Area 35(2): 1–31. Grierson, George 1896 The Kashmiri vowel system. Journal of the Asiatic Society of Bengal 65(1): 280–305. Grierson, George 1897 On the Kashmiri consonantal system. Journal of the Asiatic Society of Bengal 66(1): 180–184. Grierson, George 1911 Manual of the Kâshmiri language, comprising grammar, phrase book, and vocabularies, 2 vols. Oxford: Oxford University Press. Gunkel, Dieter, and Kevin Ryan 2011 Hiatus avoidance and metrification in the Rigveda. In: Stephanie W. Jamison, H. Craig Melchert, and Brent Vine (eds.), Proceedings of the 22nd Annual UCLA Indo-European Conference. Bremen: Hempen. http://www.indogermanistik.uni-muenchen.de/downloads/publikationen/publ_gunkel/hiatus_avoidance.pdf (accessed 7 January 2014) Gupta, Ganesh, and Indranil Dutta 2013 Labial coarticulatory resistance to vowel harmony in Telugu.
In: Acoustics2013NewDelhi (Proceedings of the Joint meeting of the French Acoustical Society and the Acoustical Society of India, New Delhi, 10th–15th November), 2013, 999–1004. https://docs.google.com/file/d/0BzFCMwQWT-WCMFdoTWh2enJwUm8/edit and http://duttalab.wikispaces.com/file/view/999–1004.pdf/477962328/999–1004.pdf (accessed 19 June 2014) Gurubasave Gowda, K. S. 1972 Ao-Naga phonetic reader. Mysore: Central Institute of Indian Languages.


Gussenhoven, Carlos 2002 Phonology of intonation. Glot International 6: 271–284. Hale, Austin 1970 Newari segmental synopsis. Occasional Papers of the Wolfenden Society on Tibeto-Burman Linguistics 3(1): 300–327. Hale, Austin 1982 Research on Tibeto-Burman languages. Berlin/New York: Mouton de Gruyter. Hale, Austin, and Margrit Hale 1969 Newari phonemic summary. (Tibeto-Burman phonemic summaries.) Dallas: SIL International. Hall, T. Alan 1997a The phonology of coronals. Amsterdam/Philadelphia: Benjamins. Hall, T. Alan 1997b The historical development of retroflex consonants in Indo-Aryan. Lingua 102: 203–221. Haller, Felix 1999 A brief comparison of register tone in Central Tibetan and Kham Tibetan. Linguistics of the Tibeto-Burman Area 22(2): 77–97. Haller, Felix 2012 Vowel harmony in Shigatse Tibetan. Linguistics of the Tibeto-Burman Area 35(2): 33–47. Hamann, Silke Renate 2003 The phonetics and phonology of retroflexes. Universiteit Utrecht PhD dissertation. http://user.phil-fak.uni-duesseldorf.de/~hamann/Hamann2003Diss.pdf (accessed 31 January 2014) Handoo, Jawaharlal 1973 Kashmiri phonetic reader. Mysore: Central Institute of Indian Languages. Hankamer, Jorge, Aditi Lahiri, and Jacques Koreman 1989 Perception of consonant length: Voiceless stops in Turkish and Bengali. Journal of Phonetics 17: 283–298. Hansson, Gunnar Ólafur 2001 Theoretical and typological issues in consonant harmony. University of California, Berkeley, PhD dissertation. http://faculty.arts.ubc.ca/gohansson/pdf/GH_diss.pdf (accessed 4 July 2014) Hari, Maria 1969 Thakali phonemic summary. (Tibeto-Burman phonemic summaries.) Dallas: SIL International. Harnsberger, James 1994 Towards an intonational phonology of Hindi. Laboratory Phonology V. http://www-personal.umich.edu/~jharns/hindi.html (accessed 6 January 2014) Harnsberger, James 1999 The role of metrical structure in Hindi intonation. South Asian Analysis Roundtable 20, University of Illinois.
http://www-personal.umich.edu/~jharns/hindi.html (accessed 6 January 2014) Harnsberger, James, and Jasmeet Judge 1996 Pitch range and focus in Hindi. 131st Meeting of the Acoustical Society of America.


Hartmann-So, Helga 1989 Morphophonemic changes in Daai Chin. Linguistics of the Tibeto-Burman Area 12(2): 51–65. Hartmann, Helga 2001 Prenasalization and preglottalization in Daai Chin and with parallel examples from Mro and Mara. Linguistics of the Tibeto-Burman Area 24(2): 123–142. Hassan, Nazir, and Omkar Nath Koul 1980 Urdu phonetic reader. Mysore: Central Institute of Indian Languages. Hayes, Bruce 1995 Metrical stress theory: Principles and case studies. Chicago/London: University of Chicago Press. Hayes, Bruce, and Aditi Lahiri 1991 Bengali intonational phonology. Natural Language and Linguistic Theory 9: 47–96. http://www.linguistics.ucla.edu/people/hayes/Papers/HayesLahiriBengaliIntonationalPhonology.pdf (accessed 31 January 2014) Heegård, Jan, and Ida Elisabeth Mørch 2004 Retroflex vowels and other peculiarities in the Kalasha sound system. In: Anju Saxena (ed.), Himalayan languages, past and present, 57–76. Berlin/New York: Mouton de Gruyter. Hildebrandt, Kristine A. 2005 A phonetic analysis of Manange segmental and suprasegmental properties. Linguistics of the Tibeto-Burman Area 28(1): 1–36. Hildebrandt, Kristine A. 2007 Prosodic and grammatical domains in Limbu. Himalayan Linguistics 8: 1–34. Hock, Hans Henrich 1979 Retroflexion rules in Sanskrit. South Asian Languages Analysis 1: 47–62. Hock, Hans Henrich 1986/1991 Principles of historical linguistics, 1st and 2nd editions. Berlin/New York: Mouton de Gruyter. Hock, Hans Henrich 1999 Finality, prosody, and change. In: O. Fujimura, B. D. Joseph, and B. Palek (eds.), Proceedings of LP’98, 15–30. Prague: The Karolinum Press. Hock, Hans Henrich 2014 The Sanskrit phonetic tradition and western phonetics. In: V. Kutumba Shastri (ed.), Sanskrit development of world thought, 53–80. Delhi: Rasthriya Sanskrit Sansthan and D. K. Printworld. Hogan, Lee C. 1994 Nasalization in Lhasa Tibetan. Linguistics of the Tibeto-Burman Area 17(2): 83–102. Hogan, Lee C. 1996 The moraic structure of Classical Tibetan.
Linguistics of the Tibeto-Burman Area 19(1): 115–149. Hombert, Jean-Marie 1978 Consonant types, vowel quality, and tone. In: Fromkin (ed.) 1978: 77–111. Honda, Isao 2002 Seke phonology: A comparative study of three Seke dialects. Linguistics of the Tibeto-Burman Area 25(1): 191–210.


Huang, Bufan 1995 Conditions for tonogenesis and tone split in Tibetan dialects. Linguistics of the Tibeto-Burman Area 18(1): 43–62. Hussain, Sarmad 1997 Phonetic correlates of lexical stress in Urdu. Northwestern University PhD dissertation. Huysmans, René 2007 The Sampang word accent: Phonetic realisation and phonological function. In: Bielmeier & Haller (eds.) 2007: 153–162. Hyman, Larry M., and Kenneth VanBik 2002 Tone and stem2-formation in Hakha Lai. Linguistics of the Tibeto-Burman Area 25(1): 113–121. Jahani, Carina, and Agnes Korn 2009 Balochi. In: Windfuhr (ed.) 2009: 634–692. Janda, Richard D., and Brian D. Joseph 1989 In further defense of a non-phonological account for Sanskrit root-initial aspiration alternations. In: Joyce Powers and Kenneth De Jong (eds.), ESCOL ’88: Proceedings of the Fifth Eastern States Conference on Linguistics, 246–260. Columbus: Department of Linguistics, The Ohio State University. Jensen, John T., and Margaret Stong-Jensen 2012 Sanskrit vowel hiatus. McGill Working Papers in Linguistics 22(1): 1–12. http://sanskrit.jnu.ac.in/rstudents/mphil/sachin.pdf (accessed 7 January 2014) Jones, W. E. 1971 Syllables and word stress in Hindi. Journal of the International Phonetic Association 1: 74–78. Jongman, Allard, Sheila A. Blumstein, and Aditi Lahiri 1985 Acoustic properties for dental and alveolar stop consonants: A cross-language study. Journal of Phonetics 13: 235–251. Joseph, U. V., and Robbins Burling 2001 Tone correspondences among the Bodo languages. Linguistics of the Tibeto-Burman Area 24(2): 41–55. Joshi, Shiv Sharma 1973 Pitch features of Panjabi tones. In: Harjeet Singh Gill (ed.), Linguistic atlas of the Punjab, 26–27. Patiala: Punjabi University. Kachru, Braj B. 1969 A reference grammar of Kashmiri. Urbana: Department of Linguistics, University of Illinois. Kapfo, Kedutso 1989 Tones in Khezha noun constructions. Linguistics of the Tibeto-Burman Area 12(2): 67–78.
Kar, Somdev 2010 Syllable structure of Bangla: An optimality-theoretic approach. Newcastle upon Tyne: Cambridge Scholars Publishing. Kar, Somdev 2011 Gemination before liquids in Bangla. International Conference of Phonetics and Phonology (ICPP 2011), 10–14 December 2011, Kyoto, Japan. Karapurkar, Pushpa 1972 Tripuri phonetic reader. Mysore: Central Institute of Indian Languages.


Kaye, Alan S. 1997 Hindi-Urdu phonology. In: Kaye (ed.) 1997: 637–652. Kaye, Alan S. (ed.) 1997 Phonologies of Asia and Africa. Winona Lake, IN: Eisenbrauns. Keane, Elinor L. 2004 Tamil. Journal of the International Phonetic Association 34: 111–116. Keane, Elinor L. 2006a Phonetics vs. phonology in Tamil wh-questions. In: Rüdiger Hoffmann and Hansjörg Mixdorff (eds.), Speech Prosody 2006, Dresden (CD-ROM Proceedings). (Studientexte zur Sprachkommunikation, 40.) Dresden: TUDpress. http://sprosig.isle.illinois.edu/sp2006/contents/papers/PS2–01_0002.pdf (accessed 20 June 2014) Keane, Elinor L. 2006b Prominence in Tamil. Journal of the International Phonetic Association 36: 1–20. Keane, Elinor L. 2014 The intonational phonology of Tamil. In: Sun-Ah Jun (ed.), Prosodic typology II: The phonology of intonation and phrasing, 118–153. Oxford: Oxford University Press. Keating, Patricia, Christina M. Esposito, Marc Garellek, Sameer ud Dowla Khan, and Jianjing Kuang 2010 Phonation contrasts across languages. UCLA Working Papers in Phonetics 108: 188–202. Kelkar, Ashok Ramchandra 1958 The phonology and morphology of Marathi. Cornell University PhD dissertation. Kelkar, Ashok Ramchandra 1968 Studies in Hindi-Urdu, 1. Pune: Deccan College. Kelkar, Ashok Ramchandra 1984 Kashmiri: A descriptive sketch. In: Koul & Hook (eds.) 1984: 62–89. Kelkar, Ashok Ramchandra, and Pran Nath Trisal 1964 Kashmiri word phonology: A first sketch. Anthropological Linguistics 6(1): 13–22. Kelley, Gerald 1959 Telugu vowel phonemes. Indian Linguistics 20: 146–158. Kelley, Gerald 1963 Vowel phonemes and external vocalic sandhi in Telugu. Journal of the American Oriental Society 83: 67–73. Kessler, Brett 1992 External sandhi in Classical Sanskrit. Stanford University MS thesis. http://www.artsci.wustl.edu/~bkessler/sanskrit-thesis/thesis.ps (accessed 7 January 2014) Kessler, Brett 1994 Sandhi and syllables in Sanskrit.
In: Erin Duncan, Donka Farkas, and Philip Spaelti (eds.), Proceedings of the Twelfth West Coast Conference on Formal Linguistics, 35–50. Stanford: CSLI. http://spell.psychology.wustl.edu/sandhiWCCFL/WCCFL-sandhi.html.en.utf8 (accessed 2 July 2014)


Khan, Mobin Ahmad 2000 Urdu phonology. Aligarh: Department of Linguistics, Aligarh Muslim University. Khan, Sameer ud Dowla 2006 Bengali intonational phonology. Journal of the Acoustical Society of America 120: 3092. Khan, Sameer ud Dowla 2008 Intonational phonology and focus prosody in Bangla. UCLA PhD dissertation. http://www.linguistics.ucla.edu/general/Dissertations/Khan_dissertationUCLA2008.pdf (accessed 28 June 2014) Khan, Sameer ud Dowla 2010 Bengali (Bangladeshi standard). Journal of the International Phonetic Association 40: 221–225. Khan, Sameer ud Dowla 2012 The phonetics of contrastive phonation in Gujarati. Journal of Phonetics 40: 780–795. Khatiwada, Rajesh 2009 Nepali. Journal of the International Phonetic Association 39: 373–380. Khokle, Vasant S. 1988 Syllable in Marathi phonology: Further evidence. International Journal of Dravidian Linguistics 17: 36–56. Kiparsky, Paul 1973a Abstractness, opacity, and global rules: Phonological representations. In: O. Fujimura (ed.), Three dimensions of linguistic theory, 57–86. Tokyo: Institute for Advanced Studies of Language. Kiparsky, Paul 1973b On comparative linguistics: The case of Grassmann’s Law. In: Henry Hoenigswald and Robert E. Longacre (eds.), Current trends in linguistics 2: Diachronic, areal, and typological linguistics, 115–134. The Hague: Mouton. Kiparsky, Paul 2010 Reduplication in stratal OT. In: Linda Uyechi and Lian Hee Wee (eds.), Reality exploration and discovery: Pattern interaction in language and life (Festschrift for K. P. Mohanan), 125–142. Stanford: CSLI Press. http://web.stanford.edu/~kiparsky/Papers/reduplication.pdf (accessed 7 July 2014) Kissock, Madelyn 2010 What counts as Vowel Harmony? Synchrony, diachrony, and epenthesis in Telugu. 6th North American Phonology Conference, Montreal, April 28–30, 2010. http://modlang-phonetica.concordia.ca/Naph6beamer.pdf (accessed 4 July 2014) Kissock, Madelyn, and Catherine Dworak 2009 Telugu vowel assimilation: Harmony, umlaut, or neither?
Seventeenth Manchester Phonology Meeting, May 28–30, 2009. http://modlang-phonetica.concordia.ca/17mfmhand.pdf (accessed 4 July 2014) Kissock, Madelyn, and Charles Reiss 2003 Anti-antigemination: Syncope and epenthesis in Telugu. 11th Manchester Phonology Conference, May 22–24, 2003. http://modlang-phonetica.concordia.ca/11mfmhand.pdf (accessed 2 July 2014)


Kolachina, Sudheer, Dipti Misra Sharma, Phani Gadde, Meher Vijay, Rajeev Sangal, and Akshar Bharati 2011 External sandhi and its relevance to syntactic treebanking. Polibits 43: 67–74. http://cs.jhu.edu/~myeleti/papers/cicling11-extSandhi.pdf (accessed 7 January 2014) Koshal, Sanyukta 1976 Ladakhi phonetic reader. Mysore: Central Institute of Indian Languages. Koul, Omkar Nath 1994 Hindi phonetic reader. Mysore: Indian Institute of Language Studies. Koul, Omkar Nath 2003 Kashmiri. In: Cardona & Jain (eds.) 2003: 895–952. Koul, Omkar Nath, and Peter E. Hook (eds.) 1984 Aspects of Kashmiri linguistics. New Delhi: Bahri. Krishnamurti, Bhadriraju 1957 Sandhi in modern colloquial Telugu. Indian Linguistics 17: 178–188. Krishnamurti, Bhadriraju 1978 Areal and lexical diffusion of sound change: Evidence from Dravidian. Language 54: 1–20. Krishnamurti, Bhadriraju 1998 Telugu. In: Steever (ed.) 1998: 202–240. Krishnamurti, Bhadriraju 2003 The Dravidian languages: A comparative, historical and typological study. Cambridge: Cambridge University Press. Krishnamurti, Bhadriraju, and Brett A. Benham 1998 Koṇḍa. In: Steever (ed.) 1998: 241–269. Krishnamurti, Bhadriraju, Colin P. Masica, and Anjani Sinha (eds.) 1986 South Asian languages: Structure, convergence, and diglossia. Delhi: Motilal Banarsidass. Krishnamurti, Bhadriraju, and J. P. L. Gwynn 1985 A grammar of modern Telugu. Delhi: Oxford University Press. Ladd, D. Robert 1980 Intonation, main clause phenomena, and point of view. Baltimore: University Park Press. Ladd, D. Robert 1996 Intonational phonology. Cambridge: Cambridge University Press. Ladefoged, Peter, and Ian Maddieson 1996 The sounds of the world’s languages. Oxford: Blackwell. Ladefoged, Peter, and Peri Bhaskararao 1983 Non-quantal aspects of consonant production: A study of retroflex consonants. Journal of Phonetics 11: 291–302. Lahiri, Aditi, and Jennifer Fitzpatrick-Cole 1999 Emphatic clitics and focus intonation in Bengali. In: R. Kager and W.
Zonneveld (eds.), Phrasal phonology, 119–144. Nijmegen: University of Nijmegen. Lahiri, Aditi, and Jorge Hankamer 1988 The timing of geminate consonants. Journal of Phonetics 16: 327–338. Lahiri, Aditi, Letitia Gewirth, and Sheila E. Blumstein 1984 A reconsideration of acoustic invariance for place of articulation in diffuse stop
consonants: Evidence from a cross-language study. Journal of the Acoustical Society of America 76: 391–404. Liljegren, Henrik 2008 Towards a grammatical description of Palula: An Indo-Aryan language of the Hindukush. University of Stockholm PhD dissertation. su.diva-portal.org/smash/get/diva2:198468/FULLTEXT01 (accessed 25 November 2014) Lisker, Leigh 1958 The Tamil occlusives: Short vs. long or voiced vs. voiceless? Indian Linguistics 19: 294–301. Lisker, Leigh 1972 On stops and gemination in Tamil. International Journal of Dravidian Linguistics 1: 144–150. Lisker, Leigh, and Arthur Abramson 1964 Cross-language study of voicing in initial stops: Acoustical measurements. Word 20: 384–422. Lisker, Leigh, and Bhadriraju Krishnamurti 1991 Lexical stress in a ‘stressless’ language: Judgments by Telugu- and English-speaking linguists. In: Proceedings of the XIIth International Congress of Phonetic Sciences, vol. 2: 90–93. Aix-en-Provence: Université de Provence. Löffler, Lorenz G. 1985 A preliminary report on the Paangkhua language. In: G. Thurgood, J. A. Matisoff, and D. Bradley (eds.), Linguistics of the Sino-Tibetan area: The state of the art, 279–286. Canberra: The Australian National University. Löffler, Lorenz G. 2002 The tonal system of Chin final stops. Linguistics of the Tibeto-Burman Area 25(2): 123–153. Longerich, Linda 1998 Acoustic conditioning for the RUKI rule. Memorial University of Newfoundland PhD dissertation. http://www.collectionscanada.gc.ca/obj/s4/f2/dsk2/tape15/PQDD_0010/MQ36148.pdf (accessed 18 November 2013) Lorimer, D. L. R. 1935–1938 The Burushaski language, 3 vols. Oslo: Instituttet for Sammenlignende Kulturforskning. Maddieson, Ian 1984 Patterns of sounds. Cambridge: Cambridge University Press. Mahanta, Shakuntala 2002 Some aspects of prominence in Assamese and Assamese English. Central Institute of Indian Languages MPhil dissertation. Mahanta, Shakuntala 2005 Morpheme realization and direction of vowel harmony.
In: Anna Asbury, Ivana Brasileiro, and Shakuntala Mahanta (eds.), Utrecht Institute of Linguistics Yearbook OTS, 51–64. https://www.academia.edu/2984688/Morpheme_Realization_and_Direction_of_Vowel_Harmony (accessed 22 February 2014) Mahanta, Shakuntala 2007 Directionality and locality in vowel harmony. Utrecht University PhD dissertation.


Mahanta, Shakuntala 2009 Morpheme-specific exceptional processes and emergent unmarkedness in vowel harmony. In: Rajendra Singh (ed.), Annual review of South Asian languages and linguistics, 65–100. Berlin/New York: Mouton de Gruyter. Mahanta, Shakuntala 2012 Assamese. Journal of the International Phonetic Association 42: 217–224. Mathew, Mili Mary, and Jayashree S. Bhat 2010 Nature of sentence intonation in Kannada, Tulu and Konkani. Language in India 10(11): 15–25. http://www.languageinindia.com/nov2010/milisentenceintonation.pdf (accessed 24 June 2014) Matisoff, James A. 2003 Handbook of Proto-Tibeto-Burman: System and philosophy of Sino-Tibetan reconstruction. Berkeley/Los Angeles: University of California Press. https://escholarship.org/uc/item/19d79619#page-13 (accessed 29 June 2014) Mazaudon, Martine 1977 Tibeto-Burman tonogenetics. Linguistics of the Tibeto-Burman Area 3(2): 1–123. Mazaudon, Martine 2012 The influence of tone and affrication on manner: Some irregular manner correspondences in the Tamang group. Linguistics of the Tibeto-Burman Area 35(2): 97–112. McCarthy, John J. 2005 Taking a free ride in morphophonemic learning. Catalan Journal of Linguistics 4: 19–55. http://www.mml.cam.ac.uk/dtal/courses/ugrad/paper_support/li8/683–0904–0-0.pdf (accessed 9 March 2014) McDonough, Joyce, and Keith Johnson 1997 Tamil liquids: An investigation into the basis of the contrast among five liquids in a dialect of Tamil. Journal of the International Phonetic Association 27: 1–26. Mehrotra, Ramesh Chandra 1959 Hindi syllabic structure. Indian Linguistics 20: 231–237. Mehrotra, Ramesh Chandra 1965 Stress in Hindi. Indian Linguistics 26: 96–105. Meile, Pierre 1949 Quelques particularités du sandhi au tamoul. Bulletin de la Société de Linguistique de Paris 45(1): 130. Melnik, Nurit 1997 The sound system of Lai. Linguistics of the Tibeto-Burman Area 20(2): 9–19. Michailovsky, Boyd 1988 Phonological typology of Nepal languages.
Linguistics of the Tibeto-Burman Area 11(2): 25–50. Miranda, Rocky 2003 Konkani. In: Cardona & Jain (eds.) 2003: 729–765. Mishra, Mithilesh 2006 The syllable structure and stress patterns of the Maithili language. University of Illinois PhD dissertation.


Mistry, P. J. 1997 Gujarati phonology. In: Kaye (ed.) 1997: 653–674. Modi, Bharati 2013 Some issues in Gujarati phonology. München: LINCOM. Mohanan, K. P. 1986 The theory of lexical phonology. Dordrecht: Reidel. Mohanan, K. P., and Tara Mohanan 1984 Lexical phonology of the consonant system in Malayalam. Linguistic Inquiry 15: 575–602. Mohanan, Tara 1989 Syllable structure in Malayalam. Linguistic Inquiry 20: 589–625. Moore, Robert R. 1965 A study of Hindi intonation. University of Michigan PhD dissertation. Mørch, Ida Elisabeth, and Jan Heegård 1997 Retroflekse vokalers oprindelse i kalashamon i historisk og areallingvistisk perspektiv [The origin of retroflex vowels in Kalashamon in historical and areal-linguistic perspective]. University of Copenhagen MA thesis. Morey, Stephen 2005a The Tai languages of Assam: A grammar and texts. Canberra: The Australian National University. Morey, Stephen 2005b Tonal change in the Tai languages of Northeast India. Linguistics of the Tibeto-Burman Area 28(2): 139–202. Morgenstierne, Georg 1943 The phonology of Kashmiri. Acta Orientalia 19: 79–99. Morse, Robert H. 1963 Phonology of Rawang. Anthropological Linguistics 5(5): 17–41. Nagaraja, K. S. 1990 Khasi phonetic reader. Mysore: Central Institute of Indian Languages. Nagarajan, Hemalatha 1994 A theory of post-syntactic phonology. Madras: T. R. Publications. Nagarajan, Hemalatha 1995 Gemination of stops in Tamil: Implications for the phonology-syntax interface. UCL Working Papers in Linguistics 7: 485–509. http://www.ucl.ac.uk/psychlangsci/research/linguistics/publications/wpl/95papers/NAGARAJA (accessed 6 January 2014) Nair, Rami 2001 Acoustic correlates of lexical stress in Hindi. In: Abbi et al. (eds.) 2001: 121–143. Delhi: Motilal Banarsidass. Nair, Usha 1979 Gujarati phonetic reader. Mysore: Central Institute of Indian Languages. Namkung, Ju (ed.) 1996 Phonological inventories of Tibeto-Burman languages.
(Sino-Tibetan Etymological Dictionary and Thesaurus Project, Monograph Series, 3.) Berkeley: University of California Center for Southeast Asian Studies. http://stedt.berkeley.edu/pubs_and_prods/STEDT_Monograph3_PhonologicalInv-TB.pdf (accessed 25 June 2014)


Narang, G. C., and Donald A. Becker 1971 Aspiration and nasalization in the generative phonology of Hindi-Urdu. Language 47: 646–667. Narayanan, Shrikanth S., and Abigail Kaun 1999 Acoustic modeling of Tamil retroflex liquids. In: John J. Ohala, Yoko Hasegawa, Manjari Ohala, Daniel Granville, and Ashlee C. Bailey (eds.), Proceedings of the 14th International Congress on Phonetic Sciences 1999, San Francisco, 2097–2101. Berkeley: University of California at Berkeley. Narayanan, Shrikanth S., Dani Byrd, and Abigail Kaun 1999 Geometry, kinematics, and acoustics of Tamil liquid consonants. Journal of the Acoustical Society of America 106: 1993–2007. Neukom, Lukas 1999 Phonological typology of northeast India. Linguistics of the Tibeto-Burman Area 22: 121–147. Nihalani, Paroo 1974a An aerodynamic study of stops in Sindhi. Phonetica 29: 193–224. Nihalani, Paroo 1974b Lingual articulation of stops in Sindhi. Phonetica 30: 197–212. Nihalani, Paroo 1975a Velo-pharyngeal opening in the formation of voiced stops in Sindhi. Phonetica 32: 89–102. Nihalani, Paroo 1975b Air flow rate in the production of stops in Sindhi. Phonetica 31: 198–205. Nihalani, Paroo 1984 On the anatomy of intonation in Sindhi. International Journal of Dravidian Linguistics 13(2): 213–228. Nihalani, Paroo 1986 Phonetic implementation of implosives. Language and Speech 29: 253–262. Nihalani, Paroo 1995 Sindhi. Journal of the International Phonetic Association 25: 95–98. Noonan, Michael 2003a Chantyal. In: Thurgood & LaPolla (eds.) 2003: 315–385. Noonan, Michael 2003b Nar-Phu. In: Thurgood & LaPolla (eds.) 2003: 336–352. O’Bryan, Margie 1974 Exceptions and the unity of phonological processes. In: Roger W. Shuy and Charles-James N. Bailey (eds.), Toward tomorrow’s linguistics, 185–193. Washington, DC: Georgetown University Press. Repr. 1988 in: Rajendra Singh (ed.), Modern studies in Sanskrit, 189–198. New Delhi: Bahri. Odden, David 1988 Anti antigemination and the OCP. Linguistic Inquiry 19: 451–475.
Ohala, John J. 1978 Production of tone. In: Fromkin (ed.) 1978: 5–39. Ohala, Manjari 1974a The abstractness controversy: Experimental input from Hindi. Language 50: 225–235.


Ohala, Manjari 1974b The schwa-deletion rule in Hindi: Phonetic and non-phonetic determinants of rule application. Bloomington, IN: Indiana University Linguistics Club. Ohala, Manjari 1977a Stress in Hindi. In: Larry M. Hyman (ed.), Studies in stress and accent, 4: 327–338. Los Angeles: Southern California Occasional Papers in Linguistics. Ohala, Manjari 1977b The treatment of phonological variation: An example from Hindi. Lingua 42(2–3): 161–176. Ohala, Manjari 1983 Aspects of Hindi phonology. Delhi: Motilal Banarsidass. Ohala, Manjari 1986 A search for the phonetic correlates of Hindi stress. In: Krishnamurti et al. (eds.) 1986: 81–92. Ohala, Manjari 1987 Schwa deletion in Hindi by linear and non-linear routes. In: Werner Bahner et al. (eds.), Proceedings of the Fourteenth International Congress of Linguists: Berlin / GDR, August 10–August 15, 1987, 497–501. Berlin: Akademie-Verlag. Ohala, Manjari 1991 Phonological areal features of some Indo-Aryan languages. Language Sciences 13: 107–124. Ohala, Manjari 1994 Hindi. Journal of the International Phonetic Association 24: 35–38. Ohala, Manjari 2001 Some patterns of unscripted speech in Hindi. Journal of the International Phonetic Association 31: 115–126. Ohala, Manjari 2007 Experimental methods in the study of Hindi geminate consonants. In: Maria-Josep Solé, Patrice Speeter Beddor, and Manjari Ohala (eds.), Experimental approaches to phonology, 351–396. Oxford: Oxford University Press. Ohala, Manjari, and John J. Ohala 1991 Nasal epenthesis in Hindi. Phonetica 48: 207–220. Opgenort, Jean Robert 2004a Implosive and preglottalized stops in Kiranti. Linguistics of the Tibeto-Burman Area 27(1): 1–27. Opgenort, Jean Robert 2004b A grammar of Wambule. Leiden: Brill. Osada, Toshiki 2008 Mundari. In: Anderson (ed.) 2008: 99–164. Pandey, Pramod Kumar 1977 Nasalization in Southern Havyaka Kannada. International Journal of Dravidian Linguistics 6: 256–264.
Pandey, Pramod Kumar 1978 A physiological note on the effect of nasalization on vowel height. International Journal of Dravidian Linguistics 7: 217–222.

Phonetics and phonology

425

Pandey, Pramod Kumar 1989 Word accentuation in Hindi. Lingua 77: 37–73. Pandey, Pramod Kumar 1990 Hindi schwa deletion. Lingua 82: 277–311. Pandey, Pramod Kumar 1991 Schwa fronting in Hindi. Studies in the Linguistic Sciences 21(1): 147–159. Pandey, Pramod Kumar 1992 Hindi-Urdu phonology since 1968. In: R. N. Srivastava (ed.), Language and text: Studies in honour of Ashok R. Kelkar, 155–170. Delhi: Kalinga Publications. Pandey, Pramod Kumar 1997 Optionality, lexicality and sound change. Journal of Linguistics 33: 91–130. Pandey, Pramod Kumar 2005 Vowel phoneme patterns in Indian languages. Indian Linguistics 66: 77–84. Pandey, Pramod Kumar 2006 Retroflex consonants in Indic languages. Indian Linguistics 67: 129–147. Pandey, Pramod Kumar 2007 Developments in Indian linguistics 1965–2005: Phonology. In: K. S. Nagaraja et al. (eds.), Research trends in lexicography, Sanskrit and linguistics: Proceedings of seminar in Honour of Professor S. M. Katre, 121–132. Pune: Deccan College. http://www.jnu.ac.in/Faculty/pkspandey/ papers/Developments_in_Indian_Linguistics-_1965–2005-_Phonology.doc (accessed 5 January 2014) Pandey, Pramod Kumar 2014 Sounds and their patterns in Indic languages, 2 vols. Delhi: Cambridge University Press India. Pandharipande, Rajeshwari 1997 Marathi. London/New York: Routledge. Pandharipande, Rajeshwari 2003 Marathi. In: Cardona & Jain (eds.) 2003: 698–728. Pathak, R. S. 1977 The intonation of Bagheli. Indian Linguistics 38(3–4): 197–209. Patil, Umesh, Gerrit Kentner, Anja Gollrad, Frank Kügler, Caroline Féry, and Shravan Vasishth 2008 Focus, word order, and intonation in Hindi. Journal of South Asian Linguistics 1: 53–70. http://www.ling.uni-potsdam.de/~patil/PatilEtAl-2008.pdf (accessed 31 January 2014) Patnaik, Manideepa 2008 Juang. In: Anderson (ed.) 2008: 508–556. Payne, John 1989 Pāmir languages. In: Schmitt (ed.) 1989: 417–444. Peet, Karl A. 2007 Implications of labial place assimilation in Amdo Tibetan. In: Bielmeier and Haller (eds.) 
2007: 225–246. Peterson, John M. 2006 Kharia: A South Munda language, vol. 1. Habilitationsschrift, Universität Osnabrück (published 2011). Leiden: Brill.

Peterson, John M. 2008 Kharia. In: Anderson (ed.) 2008: 434–507.
Pierrehumbert, Janet, and Rami Nair 1996 Implications of Hindi prosodic structure. In: J. Durand and B. Laks (eds.), Current trends in phonology: Models and methods, 549–584. Paris/Salford: CNRS, Paris-X/University of Salford Press. http://faculty.wcas.northwestern.edu/~jbp/publications/implications_hindi.pdf (accessed 6 January 2014)
Plaisier, Heleen 2007 A grammar of Lepcha. Leiden: Brill.
Poon, Pamela G., and Catherine A. Mateer 1985 A study of VOT in Nepali stop consonants. Phonetica 42: 39–47.
Prabhakar Babu, B. A. 1978 Intonation of Colloquial Telugu. International Journal of Dravidian Linguistics 7: 205–216.
Prabhakar Babu, B. A. 1981 Vowel harmony in Telugu. International Journal of Dravidian Linguistics 10: 82–85.
Prakasam, V. 1972 A systematic treatment of certain aspects of Telugu phonology. York University PhD dissertation.
Prakasam, V. 1976 Functional view of phonological features. Acta Linguistica Academiae Scientiarum Hungaricae 26: 77–88.
Prakasam, V. 1979 Aspects of sentence phonology. Archivum Linguisticum 10: 57–82.
Prakasam, V. 1991 Length as a formative prosody in Telugu. In: V. Prakasam, and S. Parasher (eds.), Linguistics at large: Papers in general and applied linguistics, 19–25. Hyderabad: Booklinks.
Prakasam, V. 1992 Length in Telugu. In: Paul Tench (ed.), Studies in systemic phonology, 70–76. London: Pinter Publishers.
Pray, Bruce R. 1970 Topics in Hindi-Urdu grammar. Berkeley: Center for South and Southeast Asian Studies, UC Berkeley.
Pulleyblank, Edwin G. 1986 Tonogenesis as an index of areal relationships in East Asia. Linguistics of the Tibeto-Burman Area 9: 65–82.
Punnoose, Reenu 2011 An auditory and acoustic study of liquids in Malayalam. Newcastle University PhD dissertation.
Punnoose, Reenu, and Ghada Khattab 2011 Phonetic and phonological investigation of the fifth liquid in Malayalam: Evidence for rhotic characteristics. In: W. S. Lee and E. Zee (eds.), Proceedings of the 17th International Congress of Phonetic Sciences, Hong Kong, 1646. Hong Kong: City University of Hong Kong. http://www.icphs2011.hk/resources/OnlineProceedings/RegularSession/Punnoose/Punnoose.pdf (accessed 4 July 2014)

Purcell, E. T., G. Villegas, and S. P. Young 1978 A before and after for tonogenesis. Phonetica 35: 284–293.
Puri, Vandana 2013 Intonation in Indian English and Hindi late and simultaneous bilinguals. University of Illinois PhD dissertation. https://www.ideals.illinois.edu/bitstream/handle/2142/45457/Vandana_Puri.pdf?sequence=1 (accessed 6 January 2014)
Radloff, Carla F. 1999 Aspects of the sound system of Gilgiti Shina. (Studies in Languages of Northern Pakistan, 4.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.
Rajapurohit, B. B. 1983 Shina phonetic reader. Mysore: Central Institute of Indian Languages.
Rajapurohit, B. B. (ed.) 1986 Acoustic studies in Indian languages. Mysore: Central Institute of Indian Languages.
Rajaram, S. 1972 Tamil phonetic reader. Mysore: Central Institute of Indian Languages.
Ramakrishna Reddy, B., P. Susheela, P. Thomas, U. P. Upadhyaya, and Joy Reddy 1974 Kuvi phonetic reader. Mysore: Central Institute of Indian Languages.
Ramasamy, Mohana Dass 2011 Topics in the morphophonology of Standard Spoken Tamil (SST): An Optimality Theoretic study. University of Newcastle on Tyne PhD dissertation.
Ramaswami, N. 1975 Brokskat phonetic reader. Mysore: Central Institute of Indian Languages.
Ramaswami, N. 1999 Common linguistic features in Indian languages: Phonetics. Mysore: Central Institute of Indian Languages.
Ramaswamy Aiyar, L. V. 1931–1932 Aphaeresis and sound-displacement in Dravidian. Quarterly Journal of the Mythic Society 22: 448–480.
Rami, Manish K., Joseph Kalinowski, Andrew Stuart, and Michael P. Rasattera 1999 Voice onset times and burst frequencies of four velar stop consonants in Gujarati. Journal of the Acoustical Society of America 106: 3736–3738.
Rangan, K. 1975 Balti phonetic reader. Mysore: Central Institute of Indian Languages.
Rao, G. Uma Maheshwar 1996 A nonlinear analysis of syllable structure and vowel harmony in Telugu. PILC Journal of Dravidian Studies 6(1): 55–84.
Ravindran, N. 1974 Angami phonetic reader. Mysore: Central Institute of Indian Languages.
Ravishankar, R. 1988 The intonational system of Tamil. Coimbatore: Bharatiyar University.
Ray, Tapas S. 2003 Oriya. In: Cardona & Jain (eds.) 2003: 444–476.
Reddy, Nagamma K. 1982 A kymographic and spectrographic study of aspiration in Telugu. Osmania Papers in Linguistics 7–8: 131–154.

Reddy, Nagamma K. 1985 Phonetic conditioning of duration of vowels and consonants in Telugu. Osmania Papers in Linguistics 11: 54–83.
Reddy, Nagamma K. 1987 Constraints on consonant sequences across some Indian languages: A typological view. Osmania Papers in Linguistics 13: 37–57.
Reddy, Nagamma K. 1998 Distinctive vowel quality, quantity and nasalization in Telugu and Hindi. Osmania Papers in Linguistics 24: 49–69.
Reddy, Nagamma K. 1999 Coarticulation in Telugu: Instrumental and phonetic evidence for and against syllable affinity. Osmania Papers in Linguistics 25: 1–24.
Reddy, Nagamma K. 2000 Linguistic functions of length beyond word level in Telugu. International Journal of Dravidian Linguistics 29: 81–90.
Reddy, Nagamma K. 2003 The vowel and consonant sounds of Indian languages. International Journal of Dravidian Linguistics 32: 33–54.
Reddy, Nagamma K. 2004 On the phonological status of /AE/ in Telugu. International Journal of Dravidian Linguistics 33: 93–104.
Reddy, Nagamma K. 2009 Fricatives in Telugu: Articulatory and acoustic characteristics. International Journal of Dravidian Linguistics 38: 37–56.
Reddy, Nagamma K., and K. Srikumar 1988 An articulatory and acoustic study of trills in Malayalam. Osmania Papers in Linguistics 14: 42–54.
Rehberg, Kerstin 2003 Phonologie des Kharia: Prosodische Strukturen und segmentales Inventar. Magisterthese, Universität Osnabrück.
Riccardi, Theodore 2003 Nepali. In: Cardona & Jain (eds.) 2003: 538–580.
Roengpitya, Rungpat 1997 Glottal stop and glottalization in Lai (connected speech). Linguistics of the Tibeto-Burman Area 20(2): 21–56.
Rose, Sharon, and Rachel Walker 2011 Harmony systems. In: John Goldsmith, Jason Riggle, and Alan C. L. Yu (eds.), Handbook of phonological theory, 2nd edition, 240–290. Oxford: Blackwell. http://idiom.ucsd.edu/~rose/RoseWalkerHarmonysystemsch8.pdf (accessed 7 January 2014)
Sadanand, Suchitra 1999 Malayalam phonology: An optimality-theoretic approach. University of Southern California PhD dissertation.
Sag, Ivan A. 1974 The Grassmann’s Law ordering pseudoparadox. Linguistic Inquiry 5: 591–607.

Sahu, Ram Niwas 1982 Vowel-sequences in Mahto (a Munda language). International Journal of Dravidian Linguistics 11: 374–375.
Sailaja, Pingali 1994 Tier conflation and bracket erasure: The case of Telugu. In: Alice Davison and Frederick M. Smith (eds.), Papers from the 15th South Asian Language Analysis Round Table Conference. Iowa City: University of Iowa.
Sampat, K. S. 1964 Tonal structure of Majhi. Indian Linguistics 25: 108–110.
Sandhu, Balbir Sing 1986 The articulatory and acoustic structure of the Panjabi consonants. Patiala: Punjabi University.
Santhakumar, P. 1983 Diphthongs in Malayalam. International Journal of Dravidian Linguistics 12: 441–448.
Sastry, G. Devi Prasada 1984 Mishmi phonetic reader. Mysore: Central Institute of Indian Languages.
Sastry, J. Venkateswara 2000 Telugu phonetic reader. Mysore: Central Institute of Indian Languages.
Saxena, Anju 1991 Tone in PaTani and Central Tibetan: Parallel developments? Linguistics of the Tibeto-Burman Area 14(1): 129–136.
Saxena, Anju 2011 The sound system of Nàvakat. Orientalia Suecana 60: 185–191.
Schein, Barry, and Donca Steriade 1986 On geminates. Linguistic Inquiry 17: 691–744.
Schiefer, Liselotte 1986 F0 in the production and perception of breathy stops: Evidence from Hindi. Phonetica 43: 43–69.
Schiering, René, and Harry van der Hulst 2010 Accentual systems in the languages of Asia. In: Harry van der Hulst, R. Goedemans, and E. van Zanten (eds.), Stress patterns of the world, 2: The data, 509–615. Berlin/New York: Mouton de Gruyter. http://homepage.uconn.edu/~hdv02001/Articles-pdfs/134%20-%20Asia.pdf (accessed 26 June 2014)
Schiffmann, Harold 1979 A grammar of Spoken Tamil. Madras: Christian Literature Society.
Schmidt, Ruth Laila, and Razwal Kohistani 2008 A grammar of the Shina language of Indus Kohistan. Wiesbaden: Harrassowitz.
Schmidt, Ruth Laila, with Mohammad Manzar Zarin 1981 The phonology and tonal system of Pālas /kohis’tyõ:/ Shina. Münchener Studien zur Sprachwissenschaft 40: 155–185.
Schmitt, Rüdiger (ed.) 1989 Compendium linguarum iranicarum. Wiesbaden: Reichert.
Schöttelndreyer, Burkhard 1971 A guide to Sherpa tone. Guide to tone in Nepal: Part 5: Sherpa and Sunwar tone studies. Kathmandu: Tribhuvan University and Summer Institute of Linguistics.

Schöttelndreyer, Burkhard 1980 Vowel and tone patterns in the Sherpa verb. Pacific Linguistics, Series A, 53: 113–123.
Selkirk, Elisabeth 2007 Bengali intonational phonology revisited: An optimality theoretic analysis in which FOCUS stress prominence drives FOCUS phrasing. In: Chung-min Lee, Matthew Gordon, and Daniel Büring (eds.), Topic and Focus: Cross-linguistic perspectives on intonation and meaning, 217–246. Dordrecht: Springer. http://people.umass.edu/selkirk/pdf/Bengali%20Intonation%20Revisited%20copy.pdf (accessed 31 January 2014)
Sengar, Anuradha, and Robert Mannell 2012 A preliminary study of Hindi intonation. Australasian International Conference on Speech Science and Technology 2012: 149–152. http://assta.org/sst/SST-12/SST2012/PDF/AUTHOR/ST120106.PDF (accessed 31 January 2014)
Shackle, Christopher 2003 Panjabi. In: Cardona & Jain (eds.) 2003: 581–621.
Shalev, Michael, Peter Ladefoged, and Peri Bhaskararao 1994 Phonetics of Toda. PILC Journal of Dravidic Studies 4(1): 19–56. (Earlier version: University of California Working Papers in Phonetics 84: 89–126.)
Shapiro, Michael C. 2003 Hindi. In: Cardona & Jain (eds.) 2003: 250–285.
Sharma, Aryendra 1969 Hindi word-accent. Indian Linguistics 30: 115–158.
Sharma, Devi D. 1988 A descriptive grammar of Kinnauri. Delhi: Mittal.
Sharma, Jagdish Chander 1979 Gojri phonetic reader. Mysore: Central Institute of Indian Languages.
Sharma, Suhnu Ram 1979 Phonological structure of Spiti. Linguistics of the Tibeto-Burman Area 4(2): 83–110.
Shaw, Rameshwar 1984 Stress-patterns in Bengali and Hindi: A comparative study. In: B. B. Rajapurohit (ed.), Papers in phonetics and phonology: Proceedings of an institute, 95–110. Mysore: Central Institute of Indian Languages.
Shepherd, Gary, and Barbara Shepherd 1971 Magar phonemic summary. (Tibeto-Burman phonemic summaries.) Dallas: SIL International.
Shukla, Shaligram 2000 Hindi phonology. München: LINCOM.
Simon, Walter 1977 Alternation of final vowel with final dental nasal or plosive in Tibetan. Bulletin of the School of Oriental and African Studies 40(1): 51–57.
Singh, Aanjanee 1994 The phonology-morphology interface: A case from Hindi. University of Delhi PhD dissertation.
Singh, Chander Shekhar 2014 Punjabi intonation: An experimental study. München: LINCOM.

Singh, Inder 1975 Manipuri phonetic reader. Mysore: Central Institute of Indian Languages.
Singh, Sudhakar 1976 Morphophonemics in contemporary Hindi. Lucknow University PhD dissertation.
Singh, Sukhvinder 1992 Schwa-deletion in Panjabi: A sociolinguistic perspective. South Asian Language Review 2(2): 84–93.
Singh, Udaya Narayan 1980 Comments on rule ordering in Bengali morphology. Indian Linguistics 41(2): 91–101.
Singh, Udaya Narayana, with Sujata Chatterjee 1985 Notes on vowel harmony in Bengali, Oriya and Assamese. ISDL Working Papers in Linguistics 1(3): 83–88.
Sinha, N. K. 1974 Mundari phonetic reader. Mysore: Central Institute of Indian Languages.
Sitapati, G. V. 1936 Accent in Telugu speech and verse. Indian Linguistics 6: 201–211.
Smith, Caley 2012 The development of sandhi [-ō] in Sanskrit. GSAS Workshop on Indo-European and Historical Linguistics. (‘Based on a 2010 M. A. thesis by Caley Smith’) https://www.academia.edu/2641133/The_Development_of_Sandhi_-o_in_Sanskrit (accessed 7 January 2014)
Sprigg, Richard Keith 1990 Tone in Tamang and Tibetan and the advantages of keeping register-based tone systems separate from contour-based systems. Linguistics of the Tibeto-Burman Area 13(1): 33–56.
Sreedhar, M. V. 1976 Sema phonetic reader. Mysore: Central Institute of Indian Languages.
Sridhar, S. N. 1990 Kannada: A descriptive grammar. London/New York: Routledge.
Srinivas, Ch. 1992 Word stress in Telugu and English. Central Institute of English and Foreign Languages PhD dissertation.
Srivastava, R. N. 1968 Theory of morphonematics and aspirated phonemes of Hindi. Acta Linguistica 18: 363–373.
Srivastava, R. N. 1969 Review of Kelkar 1968. Language 45: 913–917.
Srivastava, R. N. 1970 The problems of Hindi semi-vowels. Indian Linguistics 31: 129–137.
Steever, Sanford B. 1998a Goṇḍi. In: Steever (ed.) 1998: 270–297.
Steever, Sanford B. 1998b Kannada. In: Steever (ed.) 1998: 129–156.
Steever, Sanford B. 1998c Malto. In: Steever (ed.) 1998: 359–386.

Steever, Sanford B. (ed.) 1998 The Dravidian languages. London/New York: Routledge.
Stirtz, Timothy M. 2013 Mundari mid-vowel raising in [ATR] harmony and other phonology. http://mundari.webonary.org/files/MundariiPhonTimJul13.pdf (accessed 7 January 2014)
Subbaiah, G. 1986 Kota phonetic reader. Mysore: Central Institute of Indian Languages.
Subbarao, Karumuri V. 1971 Vowel harmony in Telugu and parentheses and infinite rule schemata notations. In: Papers from the Seventh Regional Meeting of the Chicago Linguistic Society, April 16–18, 1971, 543–552. Chicago: Chicago Linguistic Society.
Subrahmanyam, P. S. 1983 Dravidian comparative phonology. Annamalainagar: Annamalai University.
Subrahmanyam, P. S. 2008 Dravidian comparative grammar I. Mysore: Central Institute of Indian Languages.
Sun, Jackson T.-S. 1995 The typology of tone in Tibetan. Taiwan: Institute of History and Philology, Academia Sinica.
Syamala Kumari, B. 2000 Malayalam phonetic reader. Mysore: Central Institute of Indian Languages.
Taid, Tabu 1987 A short note on Mising phonology. Linguistics of the Tibeto-Burman Area 10(1): 130–137.
Taylor, Doreen 1969 Tamang phonemic summary. (Tibeto-Burman phonemic summaries.) Dallas: SIL International.
Teo, Amos 2012 Sumi (Sema). Journal of the International Phonetic Association 42: 365–373.
Terzenbach, Lauren M. 2011 Malayalam prominence and vowel duration: Listener acceptability. University of Texas, Austin, MA thesis. http://repositories.lib.utexas.edu/bitstream/handle/2152/ETD-UT-2011–12–4611/TERZENBACH-MASTERS-REPORT.pdf?sequence=1 (accessed 25 June 2014)
Thakwani, Pitamber 1983 Syllable structure of Ao words. International Journal of Dravidian Linguistics 12: 199–211.
Thirumalai, M. S. 1972 Thaadou phonetic reader. Mysore: Central Institute of Indian Languages.
Thompson, Hanne-Ruth 2012 Bengali. Amsterdam/Philadelphia: Benjamins.
Thoudam, Purna C. 1989 Conditioning factors for morphophonemic alternations of manner in Meiteiron. Linguistics of the Tibeto-Burman Area 12(2): 93–99.
Thurgood, Graham, and Randy LaPolla (eds.) 2003 The Sino-Tibetan languages. London/New York: Routledge.

Toba, Sueyoshi, and Ingrid Toba 1972 Khaling phonemic summary. (Tibeto-Burman phonemic summaries.) Dallas: SIL International.
Truckenbrodt, Hubert 2002 Variation in p-phrasing in Bengali. Linguistic Variation Yearbook 2: 259–303. http://www.ucalgary.com/dflynn/files/dflynn/Truckenbrodt03.pdf (accessed 31 January 2014)
Turin, Mark 2004 The phonology of Thangmi: A Tibeto-Burman language of Nepal. Journal of Asian and African Studies 67: 63–103.
Upadhyaya, U. Padmanabha 2000 Kannada phonetic reader. Mysore: Central Institute of Indian Languages.
Vacek, Jaroslav 1978 The palatal nasal in Tamil. International Journal of Dravidian Linguistics 7: 239–247.
van Driem, George 1987 A grammar of Limbu. Berlin/New York: Mouton de Gruyter.
van Driem, George 1997 A grammar of Duma. Berlin/New York: Mouton de Gruyter.
Varma, Siddheshwar 1961 Critical studies in the phonetic observations of Indian grammarians. Delhi: Munshi Ram Manoharlal.
Vasanthakumari, T. 1989 Generative phonology of Tamil. Delhi: Mittal.
Vennemann, Theo 1974 Sanskrit RUKI and the concept of a natural class. Linguistics 130: 91–97.
Verma, Manindra 2003 Bhojpuri. In: Cardona & Jain (eds.) 2003: 515–537.
Verma, Sheela 2003 Magahi. In: Cardona & Jain (eds.) 2003: 498–514.
Vesaleinen, Olavi, and Marja Vesaleinen 1976 Lhomi phonemic summary. Kathmandu: Summer Institute of Linguistics/Institute of Nepal and Asian Studies, Tribhuvan University.
Vijayakrishnan, K. G. 1982 The Tamil syllable. Hyderabad: Central Institute of English and Foreign Languages PhD dissertation.
Vijayakrishnan, K. G. 1985 The interaction of syntax and external sandhi: Evidence from Tamil. Central Institute of English and Foreign Languages Working Papers in Linguistics 2: Part 1.
Vijayakrishnan, K. G. 1987 Hierarchical representation of phonological features. Berkeley Linguistic Society: Proceedings of the Thirteenth Annual Meeting, 310–320. http://elanguage.net/journals/bls/article/view/2513/2480 (accessed 24 June 2014)
Vijayakrishnan, K. G. 1988 The parameters of phonological rule application. Central Institute of English and Foreign Languages Occasional Papers in Linguistics 5(1): 77–96.

Vijayakrishnan, K. G. 2003a Stress and tone in Punjabi. Central Institute of English and Foreign Languages Occasional Papers in Linguistics 10: Chapter 5. http://www.languageinindia.com/nov2003/ciefl10.html#chapter5 (accessed 20 June 2014)
Vijayakrishnan, K. G. 2003b Weakening processes in the Optimality framework. In: Jeroen van de Weijer, Vincent J. van Heuven, and Harry van der Hulst (eds.), The phonological spectrum, volume I: Segmental structure, 241–255. Amsterdam/Philadelphia: Benjamins. http://roa.rutgers.edu/files/329–0699/roa-329-k.g.vijayakrishnan-1.ps (accessed 22 February 2014)
Vijayakrishnan, K. G. 2007 The disyllabic word minimum: Variations on a theme in Bangla, Punjabi, and Tamil. In: Josef Bayer, Tanmoy Bhattacharya, and M. T. Hany Babu (eds.), Linguistic theory and South Asian languages: Essays in honour of K. A. Jayaseelan, 237–248. Amsterdam/Philadelphia: Benjamins. http://books.google.com/books?id=bE32-zeFKD8C&pg=PA237&dq=Vijayakrishnan+disyllabic+word&hl=en&sa=X&ei=yRE6UbupDLGO2QW8jIHgAQ&ved=0CDYQ6AEwAQ#v=onepage&q=Vijayakrishnan%20disyllabic%20word&f=false (accessed 20 June 2014)
Wackernagel, Jakob 1896/1957 Altindische Grammatik, vol. 1. Göttingen: Vandenhoeck & Ruprecht. (The 1957 edition contains a revised introduction, by Louis Renou, plus addenda by Albert Debrunner.)
Wali, Kashi, and Omkar N. Koul 1997 Kashmiri: A cognitive-descriptive grammar. London/New York: Routledge. Reprinted 2010.
Walker, Rachel, and Fidèle Mpiranya 2005 On triggers and opacity in coronal harmony. Proceedings of the Berkeley Linguistics Society 2005, 383–394. http://www-bcf.usc.edu/~rwalker/WalkerMpiranya.pdf and http://elanguage.net/journals/bls/article/viewFile/886/773 (accessed 7 January 2014)
Wang, Xiaosong 1996 Prolegomenon to Rgyalthang Tibetan phonology. Linguistics of the Tibeto-Burman Area 19(2): 55–67.
Watters, David E. 1971 Kham phonemic summary. (Tibeto-Burman phonemic summaries.) Dallas: SIL International.
Watters, Stephen Andrew 1996 A preliminary study of prosody in Dzongkha. University of Texas, Arlington, MA thesis.
Watters, Stephen Andrew 1999 Tonal contrasts in Sherpa. In: Yogendra P. Yadava and Warren W. Glover (eds.), Topics in Nepalese linguistics, 54–77. Kathmandu: Royal Nepal Academy.
Watters, Stephen Andrew 2002 The sounds and tones of five Tibetan languages of the Himalayan region. Linguistics of the Tibeto-Burman Area 25: 1–65.

Watters, Stephen Andrew 2003 An acoustic look at pitch in Lhomi. In: Tej Ratna Kansakar and Mark Turin (eds.), Themes in Himalayan languages, 249–264. Heidelberg/Kathmandu: South Asia Institute/Tribhuvan University.
Weidert, Alfons K. 1975 Componential analysis of Lushai phonology. Amsterdam/Philadelphia: Benjamins.
Wells, Clarice, and Peter Roach 1980 An experimental investigation of some aspects of tone in Panjabi. Journal of Phonetics 8: 85–89.
Whitney, William Dwight 1889 Sanskrit grammar, 2nd ed. Cambridge, MA: Harvard University Press.
Wilkinson, Robert W. 1974a A phonetic constraint on a syncope rule in Telugu. Language 50: 478–497.
Wilkinson, Robert W. 1974b Tense/lax vowel harmony in Telugu: The influence of derived contrast on rule application. Linguistic Inquiry 5: 251–270.
Wiltshire, Caroline R. 2000 The phonology of the past tense in Tamil. In: Jeff Good and Alan C. L. Yu (eds.), Proceedings of the 25th Annual Meeting of the Berkeley Linguistics Society: Special Session on Caucasian, Dravidian, and Turkic Linguistics, 42–53. Berkeley: Berkeley Linguistic Society. http://elanguage.net/journals/bls/article/view/3326/3310 (accessed 30 June 2014)
Windfuhr, Gernot L. (ed.) 2009 The Iranian languages. London/New York: Routledge.
Yadav, Ramawatar 1979 Maithili phonetics and phonology. University of Kansas PhD dissertation.
Yadav, Ramawatar 1984a Voicing and aspiration in Maithili: A fiberoptic and acoustic study. Indian Linguistics 45: 1–25.
Yadav, Ramawatar 1984b Maithili phonetics and phonology. Mainz: Selden & Tamm.
Yadav, Ramawatar 2003 Maithili. In: Cardona & Jain (eds.) 2003: 477–497.
Zide, Norman H. 2008 Korku. In: Anderson (ed.) 2008: 256–298.
Zwicky, Arnold M. 1964 Three traditional rules of Sanskrit. Quarterly Progress Report 74: 203–204. Cambridge, MA: MIT Research Lab of Electronics.
Zwicky, Arnold M. 1970 Greek-letter variables and the Sanskrit ruki class. Linguistic Inquiry 1: 549–555.

4 Morphology

Hans Henrich Hock
With contributions by Elena Bashir and K. V. Subbarao

4.1. Introduction

A comprehensive survey of the morphology of South Asian languages has yet to be written. The closest to such a survey is Grierson (ed.) 1903–1928. However, its primary focus is on inflectional morphology, including periphrastic structures such as Hindi progressives, as well as on derivational issues that come close to being inflectional, such as causative or passive formation; (other) derivational morphology receives minimal attention. Unfortunately, this focus on inflectional morphology and relative neglect of derivational morphology is a feature shared by the majority of morphological presentations, whether in grammars of individual languages or language families, in monographs on morphology, in journal papers, or in chapters of comprehensive volumes on particular language families or subfamilies.

This chapter presents an outline of general coverage of morphology (4.2), of issues of typological interest (4.3), of theoretical issues (4.4), and of issues related to morphosyntax (4.5). Given the limitations of what has been published on South Asian morphology, the chapter is naturally brief and relatively cursory in its coverage. The bibliographical references open up sources for further information and exploration.

4.2. Coverage

Two publication series offer detailed discussion of morphology (both inflectional and derivational). These are the Descriptive Grammar Series by Croom Helm / Routledge and the Mouton Grammar Series by de Gruyter Mouton. Relevant volumes in the former are Asher 1985 (Tamil), Asher & Kumari 1997 (Malayalam), Bhatia 1993 (Panjabi), Wali & Koul 1997 (Kashmiri), Pandharipande 1997 (Marathi), Sridhar 1990 (Kannada); relevant volumes in the latter series are Chelliah 1997 (Meithei), Coupe 2007 (Mongsen Ao), Genetti 2007 (Dolakha Newar), van Driem 1987 (Limbu), 1997 (Duma). Several other volumes offer significant coverage of both inflectional and derivational morphology. One is Kaye 2007, with Anderson 2007a on Burushaski, Cardona 2007 on Sanskrit, and Mistry 2007 on Gujarati. A second one is Anderson (ed.) 2008, with every contribution devoting some space to derivation. Especially helpful are Anderson 2008 (Gtaʔ, including coverage of complex verbs), Anderson & Harrison 2008a (Sora), Anderson & Harrison 2008b (Remo), Anderson, Osada & Harrison 2008 (Ho and other Kherwarian), Anderson & Rau (Gorum), Ghosh 2008 (Santali), Osada 2008 (Mundari), Patnaik 2008 (Juang), Peterson 2008 (Kharia), Zide 2008 (Korku). Finally there is Thurgood & LaPolla (eds.) 2003, with Bickel 2003 (Belhare), Chelliah 2003 (Meithei), DeLancey 2003 (Lhasa Tibetan), Genetti 2003 (Dolakha Newar), Hargreaves 2003 (Kathmandu Newar), and Noonan 2003ab (Chantyal, Nar-Phu). The sections below offer an overview of morphological coverage beyond the publications just mentioned, arranged by language families. Publications that offer more than cursory coverage of derivational morphology are specially mentioned.

4.2.1. Indo-Aryan¹

The best morphological coverage is available for Sanskrit, in large measure because of the great accomplishments of Pāṇini’s scholarship, which in turn (directly or indirectly) informs western scholarly grammars. Significantly, the Sanskrit tradition treats both inflectional and derivational morphology (for some details see 4.3.7 and 4.4.1 below). Although there are several editions of Pāṇini’s grammar (e.g. Böhtlingk 1887, Vasu 1897, Katre 1989), extensive training, including in the large commentatorial tradition, is required to make the work intelligible to modern linguistic audiences. Recent discussions of the architecture of Pāṇini’s grammar are Cardona 2009, Kiparsky 2009; see also Cardona 2000 for morphology. The most important western publications are Whitney 1889, Wackernagel 1905, Debrunner & Wackernagel 1930, Debrunner 1954, Macdonell 1916, Thumb-Hauschild 1959, Renou 1984, Cardona 2007. Middle Indo-Aryan morphology is covered to varying degrees in v. Hinüber 2001, Oberlies 2001, Sen 1960, Tagare 1987. For Modern Indo-Aryan, see Masica 1991 (limited coverage of derivational morphology) and Cardona & Jain (eds.) 2003 (best coverage of derivational morphology in Cardona 2003). The comparative morphology of Modern Indo-Aryan is treated in Zograph 1976. There are also individual-language publications, beyond the ones mentioned at the beginning of 4.2, especially for Hindi/Urdu (Butt 1995, Shapiro 1976, Shukla 2001, Singh & Agnihotri 1997) and Bangla (Bhattacharja 2007, Chatterji 1926, Thompson 2012). Other languages are covered too, including Marathi (Dhongde & Wali 2008), Maithili (Yadav 1996), Oriya (Neukom & Patnaik 2003), and Sinhala (Gair & Karunatillake 1974), as well as several “minor” languages, including Kangri (Eaton 2008), Palula (Liljegren 2008), and Rājbanshi (Wilde 2008).

¹ Some information on derivational morphology in Iranian languages of South Asia is found in Jahani & Korn 2009: 684 (Balochi), Edelman & Dodykhudoeva 2009: 816–817 (Shughni), and Bashir 2009: 855 (Wakhi). See Kümmel In Press for general coverage of early Indo-Iranian morphology.

4.2.2. Dravidian

In addition to the publications mentioned at the beginning of 4.2, morphology (both inflectional and derivational) is covered in the comparative Dravidian grammars by Caldwell (1875, 1913) and Krishnamurti (2003), and in Tolkāppiyam, the oldest Tamil grammar (Israel 1973). See also Steever 1988a for some general discussion of derivational morphology and Steever 1998b for fuller discussion of Kannada. Several publications are specifically devoted to morphology. See especially the comparative work of Rao 1991 (nominal derivation), Subrahmanyam 1971 (verb morphology), Suvarchala 1992 (Central Dravidian morphology), and Zvelebil 1970/1977 (Dravidian comparative morphology), as well as the more specialized publications of Christdas 2013 (Tamil), Garman 1986 (Tamil), Krishnamurti 1961 (Telugu verbal bases), Lehmann 1989 (Tamil), 1996 (Old Tamil), Rajam 1992 (Old Tamil), Vijayakrishnan 1994 (Tamil compounding), and Steever’s publications on issues in historical Dravidian morphology (1988 on “Serial Verbs”, 1993 “Analysis to synthesis”).

4.2.3. Tibeto-Burman, Munda, Burushaski

Tibeto-Burman, Munda languages, and even Burushaski are richly presented in the publications mentioned at the beginning of 4.2. Beyond these publications, Tibeto-Burman is the family with the largest number of relevant publications. These include Bauman 1975 (pronominal morphology), Beyer 1993 (derivational morphology), DeLancey 2010/2011b (verb morphology), Genetti 1992 (relative clause morphology), Genetti 2011 (nominalization), Jacques 2012ab (agreement morphology; verb morphology), Nishi et al. 1995 (morphosyntax), Saxena 1991, 1997, 2000 (verb morphology), Tournadre 2010 (Classical Tibetan case). The literature on Munda is considerably more limited; but see Anderson 2007b (Munda verb morphology) and Peterson 2007, 2011 (Kharia). There is also a publication on Tibeto-Burman and Munda morphology, Maspero 1948. See also Section 2.6 for work by Ebert and Neukom on Tibeto-Burman and Munda morphology. For Burushaski, the richest sources of information (beyond Anderson 2007a) are Berger 1974 and 1998; an earlier publication is Lorimer 1935–1938.² See also Anderson & Eggert 2001 (verb agreement), Bashir 2004 (evidentiality), Tiffou & Morin 1982 (ergativity), Willson 1990 (verb agreement and case marking).

² Anderson draws heavily on the information in Berger’s and Lorimer’s earlier work.

4.2.4. Computational approaches

Recent years have seen an explosion of computational approaches to the morphology of South Asian languages. Even when available in hard copy, most of the publications appear on the internet. For further information see Chapter 8.

4.3. Typological issues

Not unexpectedly, the morphological systems of South Asian languages can differ in a great number of details, both within and across language families, both within and across geographic areas. A detailed account of the different systems is beyond the scope of this review. A few features, however, are sufficiently interesting to deserve discussion, including features that have been considered characteristic of the South Asian convergence area or of some of its constituent members.

4.3.1. Agglutination vs. Isolation vs. Flexion

The Dravidian languages are usually considered to be quintessentially agglutinative and to always have been that way (e.g. Steever 1998: 18, Krishnamurti 2003: 28); Sjoberg (2001) considers this typology to account for what she sees as Dravidian resistance to morphological influence by Indo-Aryan (and other languages). Examples such as (1) support the view that Dravidian is agglutinative. However, agglutination is much more widespread in South Asian languages; see (2) – (5). In fact, some of the languages may have much more complex agglutinative structures than Dravidian; see especially (2), (4), and (5b). Still, it is generally agreed that the complex structures of Munda languages are innovated and that the ancestral Austro-Asiatic had much simpler morphology (e.g. Anderson 2007b);³ and to judge by the minimal verb morphology of Classical Tibetan (Beyer 1992), the structures in (3) may likewise be innovated. On the Indo-Aryan side, by contrast, the rich morphology of Sanskrit, as in (5b), has given way to morphologically much simpler systems, such as the Hindi near-equivalent (6)⁴ of (5b).

(1) a. avaṉ-ai tuṉpa p.paṭu-ttu-kiṟ-ēṉ
       he-ACC sorrow suffer-CAUS-PRS-1SG
       ‘I make him suffer sorrow.’ (Tamil; Beythan 1943: 116)
    b. huṛ-da-t-aṅ
       see-NON3-PST-1SG
       ‘I saw you’ (Pengo; Bhattacharya 1972, 1975)

(2) a. ñɛl-gɔt’-ka-t’-ko-a=ko
       see-EMPH-TAM-TR-3PL(OBJ)-FIN=3PL(SUBJ)
       ‘They saw them off.’ (Santali; Anderson 2007b)
    b. gǝi=ko idi-ke-d-e-tiñ-a
       cow=3PL take-TAM-TR-3(OBJ)-1SG.POSS-FIN
       ‘They took my cow.’ (Santali; Anderson 2007b)

(3) a. khan-na asen a-in-u-na meruba pu-metta-ŋ
       you.SG-ERG yesterday 2-buy-3-ART goat look-CAUS-1SG
       ‘Show me the goat you bought yesterday.’ (Athpare; Bickel 1999)
    b. zova-n ka-kut a-mi-sɔ:p-pek
       Zova.AG my-hands 3SG-1SG-washed-BEN
       ‘Zova washed my hands.’ (Hmar; Subbarao 2012)

(4)    a-t-i-man-i-m-i
       NEG-d.SUFFIX-3SG-become-LINKER-m.SUFFIX-3SG
       ‘he was not born’ (Burushaski; Berger 1998: 105)

(5) a. kalʰi di-m-(k)u-n
       tomorrow give-FUT-2SG-1SG
       ‘I will give (it) to you tomorrow.’ (Rājbanshi; Wilde 2008)
    b. ta-ṁ kār-ya-ṁ ci-kīr-ṣ-ay-i-ṣyā-m-i
       that-ACC.SG.M do-GERUNDIVE-ACC.SG.N REDUPL-do-DESID-CAUS-LINKVOWEL-FUT-1SG-NPST
       ‘I will make him desire to do that work (lit. ‘that to-be-done’).’ (Sanskrit)

(6)    us-se kām kar-vā-ūṁ-gā
       he-INS work do-CAUS-MODAL1SG-FUT
       ‘I will make him do the work.’ (Hindi)

³ There is, however, evidence for some morphology in Austro-Asiatic (Diffloth & Zide 1992).
⁴ A complete counterpart would probably require a much greater amount of periphrasis.

Beside these strong agglutinative tendencies, South Asian languages offer a great variety of other features. Even some of the elements of Dravidian morphology are noncompositional, such as the first-singular marker –ēṉ or the third plural masculine/feminine marker –ār, which combine features of person and number or even gender. Similarly, the marker –am of Sanskrit combines accusative, singular, masculine in tam but accusative, singular, neuter in kāryam. These are features usually associated with flexion rather than agglutination. (See also 4.3.7 below.)

Moreover, it has been long known that beside agglutinative structures of the type (1a), Old Tamil has very different structures with FLEXIONAL or even ISOLATING characteristics; e.g. Lehmann 1996, 1998. Beside the well-known “pronominal” agreement marking, which is added to the tense stem as in (1a), Old Tamil has a set of “portmanteau” endings with combination of (non-past) tense with person and underdifferentiation of third persons; see Table 4.1. Traces of this system are found in Modern Malayalam and, indirectly, in Parji; Israel 1964, Subrahmanyam 1964, 2010. The tendency of the “portmanteau” type to die out suggests that it is an archaism, compared to the “pronominal” type, whose later productivity suggests that it is innovated.

Table 4.1: Old Tamil verb agreement markers

“Portmanteau” SG . 1

-k(k)u

2

-t(t)i -um -um



3F Ṁ/F 3N

-um

“Pronominal”

PL .

SG .

PL .

-tum/-kum -kam -tir -um

-eṉ/-ēṉ/-al/-aṉ

-am/-ām -em/ēm -ir/-īr

-um/(-pa/-mār⁵) ---

-ai/-āy/-ōy -aṉ/-āṉ/-oṉ -aḷ/-āḷ/-ōḷ -t(t)u/-atu

-ar/-ār/-ōr -a

There are also Old Tamil examples such as (7), without any inflection; and just like the “portmanteau” suffixes, this ISOLATING type is replaced by the apparently newer agglutinative system.

(7)    mañcai aṟai-Ø īṉ-Ø muṭṭai
       peacock rock incubate egg
       ‘The egg that the peacock incubates on the mountain’ (Old Tamil; Lehmann 1994: 124)

⁵ This form is probably a blend of portmanteau –(u)m and pronominal –ār (Israel 1964).

As in the case of the “portmanteau” endings, the existence of such uninflected structures has been known since Caldwell 1875; see also Lehmann 1994: 52–54, 1998: 98. It is only recently that Pilot-Raichoor, partly in collaboration with Murugaiyan, has interpreted evidence of this type as indicating an earlier, nonagglutinative, isolating layer of Dravidian morphology; Murugaiyan & Pilot-Raichoor 2004, Pilot-Raichoor 2012. The claim has not yet been subjected to detailed discussion. However, combined with the evidence of the verb suffixes, it suggests that Dravidian has undergone considerable changes in morphological typology. This finding, however, should not distract from the fact that agglutination is a common tendency in present-day South Asia.

4.3.2. Incorporation

Beyond the quasi-incorporation in “conjunct verbs” along the lines of Baker 1988 (see 4.5.3.4 and 5.4.2.1),⁶ one group of South Asian languages exhibits classical or “true” incorporation: the insertion of a noun (usually in truncated form) INTO the verb. This is South Munda, with some traces also in North Munda. See Anderson 2007b: Chapter 6, as well as example (8), where (8a) gives the full nominal form kina-n ‘tiger’, while (8b) has the incorporated, truncated variant kid. Anderson’s examples show that incorporated nouns may be Agents as in (8b), Patients, Beneficiaries, or Instruments. Moreover, there may be multiple incorporation, as in (9).⁷

(8) a. kina-n ñam-t-am
       tiger-NOUN seize-NPST-2
       ‘The tiger will seize you.’
    b. ñam-kid-t-am
       seize-TIGER-NPST-2
       ‘Tiger will seize you’ (you will be tiger-seized)
       (Sora; Anderson 2007b: 189 with reference; Anderson’s translation)

(9)    jo-me-bo(:)b-dem-te-n-ai
       smear-OIL-HEAD-RFLXV-NPST-ITR-CISLOC1
       ‘I will anoint my head with oil’ (Sora; Anderson 2007b: 186 with reference)

Anderson shows that traces of incorporation are also found in Khasi and Nicobarese, the other South Asian members of Austro-Asiatic, as well as in some of the Southeast Asian varieties of Austro-Asiatic.

⁶ Kulikov (2012) argues for incorporation in Sanskrit. Many of his structures are compounds of noun stem + nonfinite verb; but one structure is less restricted: the so-called cvi-construction, as in Skt. śuddhī-karoti ‘makes pure’, śuddhī-bhavati ‘is pure’, from śuddha- ‘pure’ + inflected forms of kṛ- ‘do, make’ or bhū- ‘be’. Functionally, these are similar to the Hindi type śurū karnā ‘begin (tr.)’, śurū honā ‘be ready, begin (itr.)’; but their structure is that of a true compound.
⁷ Anderson gives his examples in modified IPA; they are retranscribed for ease of comparison.

4.3.3. Affixation

The general tendency of South Asian languages is suffixing. Some of the languages also have prefixes, including Old Indo-Aryan (e.g. Whitney 1889: 220–222, 350–355), Munda (Anderson 2007b: 20–22, 174), Tibeto-Burman (see examples (3ab) above), and Burushaski (example (4) above). Munda and Old Indo-Aryan additionally have infixes (Anderson 2007b: 17–18; Whitney 1889: 250–254). But all of these languages also have suffixes. Moreover, in the history of Indo-Aryan, prefixation tends to be lost or gets reinterpreted as part of the root, as in (10). In Burushaski, nominal prefixes are rare except for possessive marking as in (11) and for the borrowed prefixes a- (NEGATIVE) and su- (POSITIVE); Berger 1998: 44–46, 211–212.

(10) Skt. ud-val >        Hindi ubal-nā    ‘boil’
          ud-ghāṭaya- >         ughāṛ-nā   ‘expose’

(11) Bur. i-śák-ulo       a-śák-ulo
          3SG-arm-LOC     1SG-arm-LOC
          ‘on his arm’    ‘on my arm’

The prefixation in (11) looks like inalienable-possession marking, and it is common in words for body parts and family members; but as Berger observes, the present-day use or nonuse of pronominal prefixes is not fully predictable and the prefixes may occur to distinguish for instance the innards of a bird from its stomach.

4.3.4. Reduplication and echo-words

Reduplication, in one form or another, is common in South Asian languages. Two major subtypes can be distinguished: morphological reduplication, often with truncation and other phonological changes of the reduplicand, as in (12), and whole-word duplication, as in (13).

(12) a. o-gi~geb
        CAUS-REDUPL~heat
        ‘caused to heat’ (Remo; Anderson 2007b: 34)
     b. ci~kīr-ṣ-ay-i-ṣyā-m-i
        REDUPL~do-DESID-CAUS-LINKVOWEL-FUT.1SG-NPST
        ‘I will cause (s.b.) to desire to do.’ (Sanskrit)

(13) a. rāvaṇ kī baṛī-baṛī āṁkheṁ thīṁ
        Ravan GEN.F big.F-big.F eye.N.PL.F be.IPFV:PST.PL.F
        ‘Ravan had big-big eyes’ (Hindi; Abbi 1992)
     b. guċhárimi-guċhárimi
        go.PST.3SG-go.PST.3SG
        ‘he went and went’ = ‘he kept going’ (Berger 1998: 198)
     c. suté-sute vāvṛdhe … (Rig Veda 3.36.1c)
        pressed.LOC.SG.N-pressed.LOC.SG.N grow.PRF.MID.3SG
        ‘He grows at every pressed (soma) …’ (Sanskrit)

The latter type, “āmreḍita” in traditional Sanskrit grammar, has received the greatest amount of attention, being considered a special characteristic of South Asian languages; Abbi 1992. However, the feature is more widespread, and its origin is difficult to establish (Hock 1993). A related phenomenon, called ECHO word formation, with phonological change in the second element, is also widespread in South Asia; see Abbi 1992, as well as Masica 1991: 81 for Indo-Aryan, and compare the examples in (14). In many languages the type with replacement of the initial consonant (± vowel) (14ab) is more common; but a second type, with vowel change (14cd), is also found. As shown by Hock (2007), the first type has parallels outside South Asia, especially in Turkic.

(14) a. avaṉukku paccikiṟatu-cicikiṟatu
        he.DAT.SG hunger.PRS.3SG.N
        ‘He is always hungry or some damn thing!’ (Tamil; Steever 1988, his translation)
     b. ċhil-mil
        water
        ‘water etc.’ (Burushaski; Berger 1998: 223–224)
     c. ṭhīk-ṭhāk
        good
        ‘good and well; completely OK’ (Hindi)
     d. khāṇā-khūṇā
        food
        ‘food and the like’ (Panjabi; Bhatia 1993: 329)

4.3.5. “Two-storey”⁸ noun inflection and case

A common pattern in South Asian languages is a “two-storey” system of nominal inflection, with a distinction between Nominative/Absolute and an Oblique, which serves as the basis for further case affixes or clitics. The Oblique often is identical to the Genitive, as in Telugu and other Dravidian languages⁹ (Krishnamurti 2003: 217–240), and some of the Munda languages, including Korku (Zide 2008: 264), Juang (Patnaik 2008: 518), and Gutob (Griffiths 2008: 648); and in Burushaski the Oblique of Class II nouns is form-identical to the Genitive (Anderson 2007a). See for example the partial Telugu singular paradigm (15). In some Indo-Aryan languages, e.g. Hindi-Urdu, the Oblique may be of Genitive origin, at least in the plural (-oṁ/āṁ < Skt. –ānām); in others, such as Assamese, Oriya, Marathi, and Nepali,¹⁰ the synchronic Genitive serves as basis for case affixes and clitics; Goswami & Tamuli 2003: 419, Ray 2003: 455, Pandharipande 2003: 703, Riccardi 2003: 557.

(15) Nominative/Absolute   illu      ‘house’
     Oblique/Genitive      iṇṭi
     Accusative            iṇṭi-ni
     Dative                iṇṭi-ki

⁸ The term “two-storey” (“zweistöckig”) was introduced for Tocharian in Krause & Thomas 1960: 78; see also Pinault 2008: 426. Kümmel (2013) applies it in reference to Hindi as well as Romani.
⁹ In Tamil, the Oblique is different from the Genitive.
¹⁰ A few case markers are added to the Ergative instead.

What complicates matters is the general tendency to develop new case markers, sometimes referred to as “Relator Nouns” (e.g. Starosta 1985, Anderson 2007a: 1241), in which locational, manner, and other nouns added to Genitives create finer case distinctions. Where synchronic Genitives differ from Obliques, this may lead to a “third storey” of case inflection. See also Masica 1991: 231–240, with a yet finer distinction of four different “layers” in Indo-Aryan case systems. Consider (16) from Hindi-Urdu, where (16a) gives the Nominative/Absolute, (16b) an older stage of case marking based on the Oblique, and (16c) the new “Relator Noun” marking based on the Genitive. (In this particular instance, the older type (16b) has become grammaticalized, and only (16c) is synchronically productive.)

(16) a. vah
        ‘that; he, she, it’
     b. is-liye
        that.OBL-for
        *‘for that’ > ‘therefore’
     c. is-ke liye
        that.OBL-GEN for
        ‘for that, him, her, it’

4.3.6. Synthetic vs. analytic morphology

The case markers in many modern South Asian languages derive from earlier clitics which, themselves, reflect earlier (grammaticalized) full words. For instance, the Hindi marker liye in example (16) is transparently derived from the verb lenā ‘take’ (perfective participle liyā/liye/lī). In fact, in some cases it is difficult to draw a line between affix and clitic. The Hindi Genitive marker -kā/ke/kī, for instance, is written solid with a preceding pronoun but separate after nominals. Further, in contrast to Hindi-Urdu, Marathi and Bangla treat their counterparts of the Hindi-Urdu Genitive marker as suffixes. Facts like these suggest that much of modern synthetic morphology reflects earlier analytic morphology. This is true not only for nominal but also for verbal morphology. See e.g. Steever 1988 and especially 1993 for Dravidian. Remarkably, however, some modern languages, especially Hindi-Urdu, Panjabi, and Gujarati, retain a large amount of analytical verb morphology. Consider e.g. the periphrastic Hindi-Urdu present formations in (17).

(17) a. cal-tā hai
        go-IMPFV.SG.M be.PRS.3SG
        ‘he goes’
     b. cal rah-ā hai
        go remain-PFV.SG.M be.PRS.3SG
        ‘he is going’

What accounts for such “resistance” to synthesis in some cases but not in others is an issue that deserves further study.

4.3.7. Root alternations

The system of vocalic alternations in Sanskrit is well known, especially that of ablaut alternations, as in (18), which is deeply embedded in the morphological system. Other alternations are more idiosyncratic, such as the variant kīr- in example (5) above or kri- in the passive stem kri-ya-.

(18) a. Base form   kṛ    as in   kṛ-ta- ‘done’
     b. Guṇa        kar   as in   kar-tum ‘to do’
     c. Vṛddhi      kār   as in   kār-ay- ‘cause to do’ (causative stem)

Though much of this system has been lost in later Indo-Aryan, traces persist in alternations such as (19), between more and less transitive root forms.

(19) Hindi   ruk-nā ‘stop (ITR.)’     :   rok-nā ‘stop (TR.)’
             kaṭ-nā ‘be cut (ITR.)’   :   kāṭ-nā ‘cut (TR.)’

Kashmiri has acquired vowel alternations through umlaut, as in (20); Koul 2003: 904–905.

(20) Kashmiri   mōl NOM   :   mǝ̄l’ ERG     ‘father’
                krūr SG   :   kr ̄ r’ PL     ‘well(s)’


Dravidian vowel alternation is generally limited to quantity as in (21), with length generally restricted to monosyllabic forms, except for monosyllabic combining forms (21c) used in compounds or attributive contexts.

(21) Tamil   a. tāṉ NOM        (third singular reflexive)
             b. taṉṉ-ai ACC
             c. tan-tai        ‘his father’

Vowel harmony is found in Telugu and various Indo-Aryan and Munda languages; for discussion with references see 3.3.4 (this volume).

4.4. Theoretical issues

This section provides a brief overview of noteworthy theoretical approaches or proposals that have been made regarding South Asian morphology.

4.4.1. Root, stem, affix, derivation: Sanskrit tradition, its legacy, and its recent rejection

The Sanskrit grammatical tradition was the first to make a clear distinction between ROOT¹¹ (Skt. dhātu), noun STEM (prātipadika), AFFIX (pratyaya), and WORD (pada), and to have linked these by means of DERIVATION. These notions pervade most approaches to morphology since the 19th century, whether western or South Asian, with augmentation by the classical Western notion PARADIGM. It is worth noting, however, that unlike western approaches (e.g. Whitney 1889, Debrunner & Wackernagel 1930), Pāṇini does not extend the notion “Stem” to verbs. Rather, derived forms, such as the causative (suffix –i-) or the desiderative (suffix –sa-), are treated as (extended) roots; see Hock 2009. The traditional approach has recently been rejected in a framework of “Whole-Word” or “Seamless” Morphology developed by Rajendra Singh in collaboration with various scholars; see e.g. Ford, Singh & Martohardjono 1997, Dasgupta, Ford & Singh 2000, Singh, Starosta & Neuvel 2003, Singh & Agnihotri 2007, Bhattacharja 2007. It remains to be seen how well this new approach can be applied to morphologically highly complex languages like Sanskrit and whether it will lead to new insights for such languages.

¹¹ See 7.2.1 on the issue of “artificial” roots and whether all words can be derived from verbal roots.

4.4.2. Finite vs. nonfinite distinction in traditional Tamil grammar

A major contribution of the Tamil grammatical tradition is the clear functional distinction between finite and nonfinite. Tolkāppiyam defines nonfinite verbs as deficient (eccam; 2.227) and distinguishes two subtypes: viṉaiyeñcukiḷavi (2.228) and peyareñcukiḷavi (2.231), words (kiḷavi) that are incomplete (eñcu) in respect to a verb (viṉai) or a noun (peyar) respectively, that is, forms requiring a (following) verb or noun to produce a complete clause. (Later terminology, going back to Naṉṉūl, is viṉaiyeccam and peyareccam.) Unlike Pāṇinian grammar, this approach had no direct influence on western grammatical accounts (except those focused on Dravidian). Nevertheless, the insight, and its formulation, at a time between 100 BC and 300 AD is remarkable. (See Israel 1973 and Albert 1985.) A complication arises from the existence of Old Tamil structures with finite verbs in places where later normative grammar requires nonfinite verbs; see 5.4.3.3.4, this volume. Tolkāppiyam does not have any explicit discussion, but its sūtra 2.457 is later interpreted as covering the phenomenon, and an additional term muṟṟeccam (≈ ‘deficient/incomplete finite verb’) is introduced (Israel 1973: 221).

4.4.3. Categories

The Pāṇinian tradition distinguishes four major morphological categories: nouns, verbs, particles, and prefixes/adpositions. Tolkāppiyam’s categories are even more limited: nouns, verbs, and a category called iṭaicol which includes affixes and particles (including clitics). Neither tradition recognizes distinct major categories for pronouns, adjectives, numerals, or adverbs, in marked contrast to western traditions. The reason for this difference seems to lie in the fact that formal, inflectional criteria predominate in the classical Indian traditions (e.g., pronouns are inflected for the same cases as nouns and make the same number distinctions), while western traditions also consider functional criteria (the difference between nouns and pronouns) and agreement variation (for adjectives). Moreover, as discussed by Lehmann (1994: 24–26), although some Old Tamil words are sufficiently different from nouns (or verbs) to be recognized as a separate category ADJECTIVE, their number is exceedingly small.

The idea that (some of the) Munda languages do not have a categorial distinction between verbs, nouns, etc. goes back to Hoffmann¹² (1903) and has been supported in a number of other publications, such as Pinnow 1966, Bhat 1997. The claim has been refuted for Mundari by Evans and Osada (2005). However, in recent publications (2007, 2008, 2011) Peterson offers strong arguments for the claim, distinguishing only one major morphological category for Kharia: an “open” class with no distinction between nouns, verbs, adjectives, etc., and in addition two “closed” classes, consisting of functional morphemes and proforms/deictics. This is an issue that deserves further study, as well as further vetting by Munda experts and general typologists.

¹² In his later publication (Hoffmann & van Emelen 1930–1979), Hoffmann retreated from this position.

4.5. Morphosyntactic issues

Several aspects of morphology are heavily intertwined with morphosyntax. Three of these are discussed in this section: Agent marking (by Elena Bashir), Object marking (K. V. Subbarao), and Agreement marking (Hans Henrich Hock).

4.5.1. Agent Marking
By Elena Bashir

4.5.1.1. The concept of “agent”

The concept of “agent” employed in this section corresponds to the macro-role “Actor” defined in Foley & Van Valin 1984: 29 as ‘the argument of a predicate which expresses the participant which performs, effects, instigates, or controls the situation denoted by the predicate’,¹³ and thus includes both transitive and agentive intransitive subjects (“unergatives”). Underlying this approach is a continuum-of-agentivity model, ranging from prototypical agent (volitional, exerting control, animate) to inadvertent agent (including causee), incapable agent (passive), to experiencer agent. This section will treat only prototypical agents and not inadvertent or incapable agents, or experiencers of mental, physical, or deontic states. Such roles are discussed as “dative subjects” (Masica 1976), “experiencer subjects” (Verma & Mohanan 1990), “experiencing agents” (Hook 1982), or “affected agents” (A. Saksena 1980).¹⁴ See also 5.4.1, this volume.

Contemporary interest in agent marking in South Asian languages has developed largely in connection with study of case alignment typologies, especially of split-ergative languages like Hindi. Simultaneously, concern with defining the syntactic concept of subject has motivated much research on the topics of subject (and agent) marking; see e.g. Verma (ed.) 1976, Bhaskararao & Subbarao (eds.) 2004. Masica (1991: 339–364) discusses such work up to 1991. The bulk of this earlier research on agent marking focused on syntactic determinants of case marking. Recent research, especially on the Tibeto-Burman (TB) languages, has revealed the importance of semantic and discourse determinants of case marking.

¹³ This definition subsumes Pāṇini’s kartṛ ‘agent’ and hetu ‘instigator’ (1.4.54).
¹⁴ For the modern languages, the discussion will be limited to agents in finite, declarative clauses.

4.5.1.2. Diachrony of agent case marking

4.5.1.2.1. Indo-Aryan and Iranian

Some analyses of the origin of (split) ergativity in IA languages, e.g., Miltner 1965, Pray 1976 and S. R. Anderson 1977, have advanced a “passive to ergative” hypothesis, under which ergative alignment developed from a presumed NOM-ACC system in OIA. This analysis was challenged by Klaiman (1978), who argued that the OIA ta-participle already patterned ergatively; Hock (1986) continued this trajectory; and recently Butt (2001) has turned the discussion toward the role of semantics in case marking. Other studies examining the development of (split) ergative systems in MIA and NIA include Zakharyin 1979 on Dardic, Khokhlova 1992 on NIA, Peterson 1998 on Pali, Jamison 2000 on Gandhari, and Wallace 1982 and Poudel 2008a, 2008b on Nepali. Bubenik 1989a discusses both the development and the attrition of ergativity in MIA in general.

The relation between genitive-marked and instrumental-marked agents in OIA has been discussed by Pirejko (1979) and Andersen (1986a, 1986b), who concludes that in both Vedic and MIA there are two different ta-participle constructions: an ergative construction whose genitive-marked agent represents old information, and a passive construction in which the instrumental-marked agent represents new information. He further suggests that in MIA these two different constructions marked the definiteness status of the agent. Hock 1986 includes diachronic discussion of genitive agent marking, and Bubenik 1989a discusses the interplay of genitive and ergative marking of agents in OIA, MIA, and NIA languages including “Lahnda”¹⁵ and Sindhi, and in Iranian Pashto and Balochi. Stroński 2009b continues the debate on both these issues. Oguibénine 2006 discusses uses of instrumental case in Buddhist (Hybrid) Sanskrit, outside the context of alignment typology.

Ergative characteristics peaked in late MIA, after which ergativity began to diminish in the syntactic systems of most NIA languages during the 17th to 20th centuries (Khokhlova 2001: 172).
Today Eastern NIA Bengali¹⁶ and Oriya (Montaut 2004: 40), and Nuristani Prasun (Morgenstierne 1949), are strictly NOM-ACC.¹⁷ Chatterji (1926: 947–948, 971–972) discusses the presence of ergative features in Old Bengali and the chronology of their subsequent loss.

¹⁵ “Lahnda” (‘west’) is a term coined by Grierson (1919a) for the numerous varieties of western and southern “Greater Panjabi” (including Saraiki, Hindko, Potohari and other local varieties), itself a fraught term. “Lahnda” has subsequently been used by some scholars, but has never been a self-designation for any language variety.
¹⁶ Conventional English spellings for language names will be used in this article; these do not represent vowel length or consonant retroflexion.
¹⁷ NIA Kalasha and Khowar are also both NOM-ACC, but preserve remnants of the OIA augment and appear not to have passed through an ergative stage (Bashir 1988), whereas Prasun may have (see Morgenstierne 1949).

A cluster of studies focuses on the phase of diminishing ergativity. These include Stump 1983, Klaiman 1987, and Montaut 2009 on NIA languages in general; Khokhlova 2001 on Western NIA; Farrell 1995 on Balochi (WIr.); and Payne 1980 on the Pamir languages (EIr.). Tiffou 1977 and Tiffou & Morin 1982 describe the structure of, and changes in, the ergative system of Burushaski.

Another group of studies synchronically rank degrees or types of ergativity in modern South Asian languages. Skalmowski 1974 is an early work of this type for Pamir and Dardic languages. Klaiman 1987 compares morphological correlates of (split) ergativity in sixteen languages in a move toward establishing an implicational hierarchy of features that could serve as a metric for grading the degree of “ergativeness”. Also relevant here are Stump 1983, Stroński 2009a on Eastern Hindi, Rajasthani, and Pahari, and Sigorskiy 2007 on early Hindi. Magier (1983b: 309) finds that ergative case marking of agents in Marwari is nearly extinct, but ergative verbal agreement patterns remain. Deo and Sharma (2006) employ Optimality Theory to discuss loss of ergativity in Hindi, Nepali, Gujarati, Marathi, and Bengali.

Assamese (Devi 1986; Amritavalli & Sarma n.d.) and Shina (Bailey 1924) have extended agentive subject marking to all tenses of transitive verbs and to some intransitives. Some dialects of Shina now have two ergative case markers, one marking subjects of imperfective transitives and the other of perfective transitives (Schmidt & Kaul 2010: 199; Hook 1996: 149). Brokskat has three ergative markers: -ya (nouns and pronouns in past), -i (proper nouns in past), and -sa (non-past) (Palancar 2002, after Ramaswami 1982). D. D. Sharma (1994: 98–99) summarizes the use of ergative case in “Tibeto-Himalayan”¹⁸ languages, noting that they use ergative for all transitive subjects regardless of tense, but pointing out Spiti as a significant exception.

¹⁸ “Tibeto-Himalayan” is a geographical cover term for two clusters (Tibetan and Himalayan) of tribal languages spoken in Himachal Pradesh and Uttarakhand. D. D. Sharma 1994 compares fifteen of these languages.

In addition to NOM-ACC and ERG-ABS patterns, a variety of alignment systems have been identified in IA and Iranian languages. In Balochi, for example, there are neutral (in which S, A, and P are identically marked in the PAST domain), double oblique (in which A and P are oblique while S is in the direct case), and tripartite (in which S, A, and P are all in different cases) patterns (Korn 2008). Tripartite and double-oblique systems have been noted for Pamir languages (Payne 1980; Wendtland 2008). Wendtland (2008: 420) argues that a distinction should be made between the decaying ergative system developing into a nominative-accusative system, as presented in Payne 1980, and developments for which an analysis referring to (formerly) ergative systems is not sufficient. See also Haig 2008. Some of these different modes of explanation may be found in the concepts of fluid ergativity and transitivity discussed below. Some TB languages also show (partial) tripartite, accusative, or neutral systems (Bickel 2000 and references therein).

Recently, Butt (2001: 136) has suggested that the model of a shift from accusative to ergative should be replaced with an account which does not presuppose a rigid classification of case systems into accusative vs. ergative vs. active, but allows for both structural and lexical case marking within a single, more complex, system. Hook and Koul (2004) argue that ergative marking of transitive subjects in finite clauses, usually in preterit or perfect tenses, is not semantically conditioned but should be analyzed as an agreement phenomenon, i.e. can be considered as “structural case” as argued in Davison 2001, while the ergative marking of intransitives demands another analysis, involving semantic and discourse variables. ‘Animacy, aktionsart, point-of-view, all are inputs to the speaker’s choice of case for subjects of intransitive inceptives in Kashmiri. When speakers regard the referents of such subjects as being distant from them or external to their interests, they choose the nominative case. If they want to assume a stance that is closer to the presumed viewpoint of the subject they use the ergative’ (Hook & Koul 2004: 218).

4.5.1.2.2. Tibeto-Burman¹⁹

Studies on the diachrony of case-marking patterns in TB languages are more recent and very different from those for IA, although the terminology applied to IA diachrony is sometimes employed. Vollmann 2008 is a historiographical and cross-cultural account of descriptions of ergativity in Tibetan. LaPolla (1995: 213, 217) argues that agentive/ergative marking was not a feature of Proto-Tibeto-Burman, and that its presence in so many TB languages is the result of independent but parallel innovations, a long-term “drift” which ‘reflects a semantically based system of grammatical organization rather than one based on syntactic functions such as subject and object.’ He finds a continuum of ergative marking types in Tibeto-Burman. At one extreme are systems with optional ergative/agentive forms, whose function is to clarify which of two possible nominals is the actual agent. Some languages also have forms which mark a “non-agent”. At the other extreme are languages which ‘have relatively stable paradigmatic ergative systems [in which] the use of the ergative marker is obligatory [in certain grammatically defined contexts], for example in Kham […]. Word order, information structure, agency, and volitionality are all not relevant to the use or non-use of the marker’ (LaPolla 1995: 216). DeLancey says about Lhasa Tibetan that, ‘In all cases “optionality” of ergative marking simply means that its function is pragmatic rather than syntactic or semantic’ (1990: 306), and that ‘the discourse-pragmatic function of the ergative, […] seems, when viewed diachronically, to be a later overlay on the semantic function’ (DeLancey 1990: 308). Discussing Central Tibetan, Saxena (1991: 115) finds that the ergative marker has increasingly become “optional”, possibly because it is being reanalyzed as a topic marker, and that the language may be moving away from the ergative pattern. Vollmann (2010: 249) thinks that case marking is a secondary phenomenon in Tibetan, and that ‘a diachronic “development” from ERG to NOM or from NOM to ERG cannot be observed in Tibetan.’ Devi 1986 is a diachronic discussion of ergativity in Assamese.

19 When describing TB languages, some authors use the term “ergative” to refer to structural case categories and “agentive” to refer to semantic distinctions, but usage is not consistent. In this summary of research, “ergative” should be taken as referring to case markers sometimes called “ergative” and sometimes “agentive”.

4.5.1.3.

Differential agent marking

Two types of differential agent marking patterns must be distinguished: split and fluid subject/agent case marking.

4.5.1.3.1.

Split transitive (ergative) patterns

A large number of studies deal with the various types of split ergative patterns found in South Asian languages. (Partially) tense-based splits are found in Pashto (Tegey 1979)20 and Burushaski (Tiffou & Morin 1982). Aspect-based splits are the norm in IA languages, and animacy-hierarchy splits, including person/number splits, are fairly numerous. In some languages, only third person subjects of transitive clauses in perfective aspect take an ergative (or oblique) marker, as in Panjabi (Bhatia 1993), Marwari (but only with pronominal subjects, and in free variation with direct case forms; Magier 1983b: 245), Domaki (Weinreich 2008), Karachi Balochi (Farrell 1995: 240), Burushaski (for tenses except the future), and TB Kham (Watters 2002: 66). Li (2007: 1462) argues that ‘Nepali is a split ergative language conditioned by the semantic nature of NPs. In the domain of inanimate NPs, the language is ergative; elsewhere, it resists classification as ergative or accusative.’ Li finds (2007: 1470) that ‘the use of [the ergative marker] -le on inanimate transitive subjects is obligatory, but its use on animate transitive subjects varies according to tense/aspect. In the perfective domain, -le on the A argument is obligatory; in the imperfective domain, the use of -le is optional.’

20 Roberts (2001: 1) argues, based on the behavior of conjunct (NP/ADJ + light verb) verbs, that ‘Pashto’s ergative split is more intricate than has hitherto been noted, being determined by both tense and aspect.’

4.5.1.3.2.

Split intransitive patterns

In a split-intransitive pattern, intransitive verbs are divided into two classes, one of which requires agentive and the other non-agentive subject marking (Creissels 2010). Agha (1993) bases his discussion of Lhasa Tibetan on the distinction between agentive and non-agentive verbs. Agentive subjects can bear zero-marking or ergative/instrumental case. He argues that there are two types of “splits” in subject case marking (Agha 1993: 64–65). One depends on case relations and the other, in which only first-person subjects can receive agentive marking, on mood. Split intransitive patterns have been identified in Limbu, a TB language of Nepal (Michailovsky 1997, cited in Creissels 2008). Li (2007) shows that Nepali has both split transitive and split intransitive patterns. The use of the ergative marker -le with intransitive subjects varies according to the verb. With unaccusatives, no case marker can be used; however, with unergatives, the use of -le is possible in the perfective domain for some speakers (Li 2007: 1470). Li argues that the intransitivity split emerges from the interaction of agentivity and telicity; however, Li uses “split intransitivity” to mean a phenomenon in which some intransitive verbs pattern together with respect to a certain test while other intransitive verbs behave differently and group together with regard to that test, a notion different from that in Creissels 2010. So for Nepali we find a complex split involving interaction of both aspect and NP class.

4.5.1.3.3.

Fluid intransitivity/ergativity systems

In fluid systems, on the other hand, fluctuations in S-marking are the norm: the case of the subject depends on its role in a particular context, and is conditioned by a host of often interacting variables, including volitionality, animacy, conscious choice, and control. Current interest in this type of variation in IA and Iranian languages is perhaps foreshadowed in Hock 1985, which mentions instances in Marathi (citing Bloch 1970: 262–264) and Gujarati (citing Cardona 1965: 109). Hook et al. 1987 and Hook 1997 discuss this pattern in Gujarati, Kashmiri, and Marathi. Early discussions of fluid case marking for Hindi and Urdu include Kachru & Pandharipande 1978, Tuite et al. 1985, Mohanan 1994: 69–78, Butt & King 1991, Bashir 1999, and Davison 1999. More recently, de Hoop and Narasimhan (2008), employing Optimality Theory, argue for a notion of “strong subject”, not identical to volitionality, to explain the marking of some intransitive subjects with ergative -ne. McGregor (2010: 1616) notes that a geographical concentration of optional ergative marking is found in northern India–Nepal–Tibet–Western China, where many TB as well as IA/Ir languages show it. Fluid ergativity/intransitivity has been described in Wakhi (Bashir 1986, 2009), Kashmiri (Hook & Koul 1997; Wali & Koul 1997: 153), Assamese (Amritavalli & Sarma n.d.), and Nepali (Butt & Poudel 2007).

The situation for TB languages is enormously complex, and fundamentally different from that of IA and Iranian languages. In TB languages, agent marking is effected not only by the presence of an agentive case marker on an NP, but by interaction with conjunct/disjunct systems, evidential operators, the presence or absence of person marking, and auxiliary selection. Also, the extent to which agent marking is syntactically or semantically determined varies from language to language, as for instance between Kathmandu and Dolakha Newar (Genetti 1994, 1988). Fluid subject marking was first described for Lhasa Tibetan by Chang and Chang (1980), and has been central in discussion of the Tibetan ergative since then (DeLancey 1990, Saxena 1991, Agha 1993, Tournadre 1991, 1996). About Lhasa Tibetan, DeLancey (1990: 305) says that the distribution of ergative case is ‘conditioned by a combination of transitivity, volitionality, aspect, and topicality. […] there is no occurrence of ergative that requires explanation in strictly syntactic terms.’ Genetti (1988) has described this situation in Kathmandu Newari.21 McGregor (2010: 1616) says: ‘it would seem that ergative marking in Tibeto-Burman is prototypically optional’ and gives a list of over 25 languages/groups (2010: 1631–1632, with references), in which optional ergative marking has been described.

4.5.1.4.

Extended meanings of ergative markers

Poudel (2008a: 9–10) argues that in modern Nepali, -le, already established as the marker of transitive subjects in perfective clauses and in the injunctive, began to be contrastively employed to indicate meanings of accomplishment vs. non-accomplishment (NOM), obligation vs. desire (NOM), certain vs. uncertain (NOM) future, optative vs. declarative (NOM), possible vs. simple (NOM) future, individual level (referring to an inherent property of a referent) and stage level (referring to the property of a referent that holds at a particular moment) predications (NOM), and volitional and non-volitional (DAT) predicates. Intransitive subjects which initiate an action and are unaffected by it are marked by -le, and reason clauses can also be marked by -le. Bashir (1999) proposes that the ergative postposition ne in (Pakistani) Urdu is being reanalyzed and becoming semanticized to mark agentivity outside of its historical domain of perfective tenses of transitive verbs, now indicating “conscious choice”/volitionality and even futurity. Similar studies on other IA languages would be welcome.

21 In 1988, the usual term for these languages was “Newari”. Now, however, the preferred term is “Newar”.

In many TB languages, agentive/ergative markers have acquired pragmatic and discourse functions, and a wealth of new research on this is currently appearing.22 Tshangla, a Bodic language spoken in Bhutan, has been characterized as an ergative system with an “active/stative” split, in which the presence of the agentive case marker -gi is determined by a complex interaction among syntactic, semantic, and pragmatic factors affecting both the individual clause and the larger discourse context, no single one of which is sufficient to motivate agentive marking (Andvik 1999: 194, 2010). For instance, verbs of perception, cognition, or utterance almost always appear with an agentive-marked subject. In this case, the choice of an agentive case marking appears to reflect ‘not agentivity, but a natural starting point for the flow of attention’ (Andvik 2010). In Kinnauri, the ergative marker seems to be used to indicate a change in perspective. In constructions other than those reporting speech in a connected narrative, the ergative marker occurs when a clause describes something against expected behavior, or describes the magnitude of surprise or urgency (Saxena 2009). Discussing Ladakhi, researchers at Tübingen University similarly find that, ‘Another factor that plays a crucial role […] is distance or closeness in terms of space, time, and emotion. Events that are perceived as close tend to receive less overt marking than those that are perceived as distant. Emotional distance includes all kinds of personal involvement: surprise, embarrassment, compassion, or being highly affected’ (Tübingen University 2011).23

Important work has been done on Meithei (Manipuri) by Chelliah (1997, 2009). The ergative marker -nə marks some semantic and discourse dimensions, including specificity of the subject and new information. For instance, -nə can appear either when ‘a particular instance of an unusual activity for subjects [or] a generic statement of an activity characteristic for the subject is expressed’ (Chelliah 2009: 392). If a routine activity is recast as unusual or noteworthy, the agent is marked by -nə (Chelliah 2009: 387). Additionally, a homophonous, but distinct, contrastive marker -nə has developed, by which the speaker places an entity in contrastive focus. Chelliah’s central claim is that this change of semantic role markers to pragmatic marking is not random, but is clearly motivated through metonymy. ‘Recognition of the connection between new information markers and semantic role marking provides a means for understanding argument marking in other Tibeto-Burman languages, many of which exhibit the same homophony discussed here for Meithei. Tibetan, for example, has homophonous agentive and contrastive focus markers’ (Chelliah 2009: 398). Discussing Manipuri, Poudel (2007, 2008c) argues that in addition to distinguishing between unergatives and unaccusatives, intransitive usages also subclassify between individual-level and stage-level predications. Subjects of individual-level constructions (including statives) are ergative, those of stage-level predicates are nominative.

22 Chelliah & Hyslop (eds.) 2011, 2012, special issue of Linguistics of the Tibeto-Burman Area, includes eight articles (Chelliah & Hyslop 2011, DeLancey 2011a, Coupe 2011, Peterson 2011, Morey 2012, Teo 2012, Willis 2011, and Zeisler 2012) concerned with optional case marking in TB languages.

23 Interestingly, though dealing with the same semantic/discourse parameter, these findings are the opposite of those reported in Hook & Koul 2004 for Kashmiri inceptives.

4.5.1.5.

Desiderata

More text collection and text-based studies are needed. Case marking variation may show up in natural texts while going unobserved in elicited data or isolated sentences. For example, Bashir (2009: 843) has found that two different systems of marking operate in Hunza Wakhi: context-free, elicited sentences show a pattern in past tenses in which a maximally distinguishing morphological marking strategy employing OBL 1 for objects and OBL 2 for agents appears. In texts, however, OBL 2 also occurs as an option for objects, depending on discourse variables and choice of subject-marking strategy. A similar phenomenon has been noted in the Pamir language Bartangi (Wendtland 2008: 431).

Most discussions in the literature are of agent marking by case desinences on an agent NP. However, often an agent is indexed or exclusively expressed by pronominal clitics on the verb or other constituent, as for example in Wakhi (Bashir 2009: 835), Kashmiri (Hook & Koul 1984), Sindhi (Khubchandani 2003: 652), western and southern “Greater Panjabi” (Shackle 1976), Pashai (Morgenstierne 1973, Lehr 2014), Balochi (Korn 2008: 256), Pashto (Tegey 1979), Pamir languages (Payne 1980), and Munda (Anderson 2007b). For Balochi, Jahani (2003: 125) notes that the use of agent clitics is more frequent in Iranian than in Pakistani varieties. Indexing of the agent on the verb (clause level, head-marking agreement) or other constituent has not been treated in this summary, even though this division is somewhat arbitrary.
In all these languages, the distribution of agent clitics and the correlations of their appearance with other factors like appearance of a full NP or free pronominal agent, discourse variables, and dialect need study. Klaiman 1987 and Bubenik 1989b are starts at this for IA languages. Especially fruitful would be comparison of these systems with those of the complex pronominalizing Tibeto-Burman languages.

It is also useful to consider nuances of agent marking in NOM-ACC languages, such as Sinhala. In Sinhala, neither transitivity nor tense/aspect is a significant determinant of case marking; the primary distinction is between volitive and involitive sentences. ‘An intentional, animate subject, which is canonically the agent, is always nominative, whereas an unintentional animate subject can be assigned accusative, dative or instrumental case depending on the semantic properties that the subject bears’ (Henadeerage 2002: 14). However, some verbs which normally appear with nominative subjects may also take instrumental subjects when the subject is ‘corporate or organizational’ in nature, whereas when the subject appears in the nominative, the reference is commonly to the collective membership of the organization rather than to the organization as entity (Gair & Paolillo 1997: 31). Such subtleties need to be explored for other NOM-ACC languages, which tend to be ignored in discussions of agent marking. Creissels 2009 and Zeisler 2007 offer valuable summaries/typologies, which could be good starting points for developing research initiatives.

Research on the Munda languages has not yet focused on agent-marking. From Anderson 2007b, one can conclude that, in general, all subjects, both agentive and non-agentive, are indexed on the verb in the same way. However, ‘… the use of the two series of tense/aspect markers interacts with the semantic role of the actor in various Munda languages in different and complex ways.’ In certain South Munda languages the TAM suffixes roughly correlate with an intransitive and a transitive class (with the exception that some logically intransitive roots are inflected as if they were transitive; Greg Anderson, p.c. May 2011). Available knowledge about case marking in individual Munda languages can be gleaned from Anderson (ed.) 2008.

4.5.2.

Object marking By K. V. Subbarao

The accusative case marker in traditional grammars has generally been treated as an object case marker, which to a large extent holds. However, there may be direct objects that occur without an overt lexical accusative case marker in South Asian languages in particular and in many other languages in general. This shows that the occurrence of the accusative marker on the direct object is not obligatory and there may be other factors that govern the occurrence or nonoccurrence of the accusative marker. In the last decade attempts have been made to show that the occurrence of the accusative case marker on the direct object may, amongst other things, indicate specificity in addition to animacy. This section provides a brief account of the work that has been done on the DIFFERENTIAL nature of the occurrence of the accusative case marker on the direct object in terms of its presence vis à vis its absence, and the factors that govern it. Works that discuss this phenomenon include Masica 1982, Magier 1987, 1990, Lehmann 1989, Mahajan 1990, Butt 1993, M. Singh 1994, Mohanan 1994, and Bhatt 2007. The accusative marker may be present or absent in nominative-accusative constructions, while it does not occur in the dative subject construction (DSC) on the theme/patient, except in a few South Asian languages such as Tamil and Malayalam of the Dravidian language family and Bodo and Rabha of the Tibeto-Burman language family. In Subbarao 2012, several arguments were presented to show that the reason for the non-occurrence of the accusative case marker in the DSC is that the predicate in the DSC is [-transitive], and hence, it cannot assign accusative Case to the theme. Subbarao (2012) provides arguments as to why the accusative case marker occurs in the DSC on the theme in some South Asian languages such as Bangla, Tamil, Malayalam, and Bodo.

In some Tibeto-Burman languages such as Mizo, Hmar, and Thadou, the accusative marker occurs only when the direct object (patient or theme) is [+definite]. For example,

(22) lala-n    sakei  a-hmu
     Lala-ERG  tiger  3SG-saw
     ‘Lala saw a tiger.’ (Hmar; Tibeto-Burman, Vanlal Bapui, p.c.)

(23) lala-n    sakei-cu       a-hmu
     Lala-ERG  tiger-ACC/DEF  3SG-saw
     ‘Lala saw the tiger.’ (Hmar; Tibeto-Burman, Vanlal Bapui, p.c.)

(22) and (23) show that the direct object is differentially marked.

With this brief background in mind, we shall now discuss the work that has been done on Differential Object Marking (DOM).24 In most of the South Asian languages the accusative marker denotes the feature [specificity] and it occurs with [+animate] themes. The marker ko that occurs with the theme in nominative-accusative constructions is a marker of specificity and animacy, as Magier (1987,25 1990) and Mahajan (1990) have shown for Marwari and Hindi-Urdu respectively. Tamil (Dravidian) in contrast to Hindi-Urdu presents a very interesting case. In Tamil, when the object noun phrase is indefinite and “irrational” ([-animate]) and is marked by a [-definite] determiner, the occurrence of the accusative case marker is optional (Lehmann 1989: 80), while in Hindi-Urdu the occurrence of the accusative case marker invariably denotes specificity. The following examples are illustrative.

(24) kumār  oru  peṭṭi-(y.ai)  vāṅk-in-āṉ
     Kumar  a    box-(ACC)     buy-PST-3SG.M
     ‘Kumar bought a box.’ (Tamil)

(25) kumār  ne   ek  peṭī  (?ko)  kharīdī/(kharīdā)
     Kumar  ERG  a   box   ACC    bought
     ‘Kumar bought a box.’ (Hindi-Urdu)26

24 The following discussion is abstracted from Subbarao (2012) and hence, reference to the source is not always cited.

25 Magier (1987: 192–193) clearly states that ko in Hindi does not ‘… convey relational information’ when it occurs with ‘direct objects’, but it ‘follows a semantic hierarchy of specificity and animacy that contributes to the overall salience of the marked object noun.’

26 Magier (1999: 56) points out that the marker ko that occurs with the derived subject in passives in Hindi-Urdu connotes ‘that the action was intentionally carried out by the agent [italics in the original], while this connotation is absent’ in sentences without the marker ko in passive sentences.

(i)  šāntī mārī gayī
     ‘Shanti was killed.’
(ii) šāntī ko mārā gayā
     ‘Shanti was killed.’ (i.e. murdered)

It is not clear whether such a semantic distinction holds with different kinds of predicates, as the following sentences illustrate.

(iii) kal rāt ko   is muhalle meṁ  hī    ek cor   pakṛā gayā
      last night   in this area    EMPH  a thief  was caught
      ‘Last night a thief was caught right in this area.’
(iv)  kal rāt ko   vah cor / us cor ko  pakṛā gayā  ki  nahīṁ
      last night   that thief           was caught  or  not
      ‘Was that thief caught last night or not?’

There does not seem to be any semantic distinction between the two alternatives (with or without the marker ko on the derived subject) in (iv). An in-depth study of the occurrence of the specificity marker in passives in Hindi-Urdu needs to be carried out to be able to arrive at firm conclusions. In Dravidian languages the occurrence of the accusative marker on the derived subject is not permitted and hence, it is always case-marked nominative.

In Bangla, the features ANIMACY and SPECIFICITY play an important role in the assignment of the accusative case marker ke to the theme, just as in many Indo-Aryan languages, such as Hindi-Urdu (Mahajan 1990), and Marwari (Magier 1987, 1990). The accusative case marker ke does not occur when the theme is [-definite] and [-animate].

(26) rina-r    kono  jiniš  bhalo  lage    na
     Rina-GEN  any   thing  good   appear  NEG
     ‘Rina does not like any thing.’ (Bangla; Probal Dasgupta, p.c.)

Note that the marker ke is not present with the theme kono jiniš ‘any thing’ which clearly shows that ke is a marker that does not occur when the theme is [-definite] and [-animate]. If the theme is under focus or contrastive stress, the marker ke occurs, see (27), as Probal Dasgupta (p.c.) points out.

THEME UNDER CONTRASTIVE FOCUS
(27) rina-r    kono  jiniš  ke-i      šotti-šotti27  bhalo  lage    na
     Rina-GEN  any   thing  ACC-EMPH  really         good   appear  NEG
     ‘Rina does not like really any thing at all.’

27 šotti-šotti is a reduplicated form.

Dasgupta further points out that the correlation between the behavior of the patient in the experiencer subject sentences (27) and in the agent (nominative subject) sentences (28) is exact. There is an interaction with ANIMACY and SPECIFICITY, but that interaction is identical in the two clause types. Note that the marker ke is not present in (28), while it is present in (29), when the theme is under contrastive focus.

THEME UNDER NEUTRAL FOCUS
(28) rina  kono  jiniš  pɔchondo  kɔre  na
     Rina  any   thing  liking    does  NEG
     ‘Rina does not like any thing.’

THEME UNDER CONTRASTIVE FOCUS
(29) rina  kono  jiniš  ke-i      šotti-šotti  pɔchondo  kɔre  na
     Rina  any   thing  ACC-EMPH  really       liking    does  NEG
     ‘Rina does not like really any thing at all.’
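The distribution of ke illustrated in (26)–(29) can be restated as a small decision procedure. The following is only a minimal sketch under stated assumptions: the function name `ke_on_theme` and its three boolean features are hypothetical labels introduced here for illustration, and the function returns `None` for feature combinations on which the passage makes no explicit claim.

```python
from typing import Optional


def ke_on_theme(animate: bool, definite: bool,
                contrastive_focus: bool = False) -> Optional[bool]:
    """Hypothetical summary of the Bangla examples (26)-(29).

    True  -> ke occurs on the theme
    False -> ke does not occur
    None  -> the passage does not settle this combination
    """
    # A theme under focus or contrastive stress takes ke, as in (27) and (29).
    if contrastive_focus:
        return True
    # A [-definite], [-animate] theme under neutral focus lacks ke, as in (26) and (28).
    if not animate and not definite:
        return False
    # Other combinations are said to involve animacy and specificity,
    # but no explicit rule is given for them here.
    return None


# kono jiniš 'any thing': inanimate and indefinite
print(ke_on_theme(animate=False, definite=False))
print(ke_on_theme(animate=False, definite=False, contrastive_focus=True))
```

The sketch takes no account of clause type, which is in line with Dasgupta's observation that the pattern is the same in experiencer-subject and nominative-subject sentences.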

Hence, we can conclude that the marker ke in Bangla in the nominative-accusative construction and the genitive-accusative construction is a SPECIFICITY MARKER, and not an accusative marker, just like the marker ko in Hindi-Urdu which is treated as a specificity marker (Mahajan 1990; Magier 1987, 1990). In the case of Malayalam and Tamil, a similar fact is observed.

(i) The case of Malayalam

We now provide evidence to show that the accusative marker -ye in Malayalam functions as a specificity marker. Sentence (30) is a DSC, and the accusative marker -ye occurs with the theme āna ‘elephant’.

(30) kuṭṭi-k’k’ə  āna-ye        išṭam   āyi
     child-DAT    elephant-ACC  liking  became
     ‘The child liked the elephant.’ (Malayalam; Jayaseelan 2004: 229)

Interestingly, this construction alternates with a nominative subject construction (30’). The accusative marker -ye occurs with the theme.

(30’) kuṭṭi  āna-ye        iṣṭa-ppeṭṭu
      child  elephant-ACC  like-PST
      ‘The child liked the elephant.’ (Malayalam; Jayaseelan 2004: 229)

When the theme is [-animate] and [-definite], the accusative marker -ye does not occur (30’’).

(30’’) eni-k’k’ə  oru  māṅṅa      vēṇam
       I-DAT      one  mango.NOM  want
       ‘I want a mango.’ (Malayalam; Jayaseelan 2004: 234)

Thus, the features ANIMACY and DEFINITENESS explain the occurrence of the accusative marker -ye, and it is NOT the transitive nature of the predicate that is responsible for its presence.

(ii) The case of Tamil

Tamil permits an accusative case-marked theme in a DSC (Paramasivam 1979: 65–66, Lehmann 1989: 184, Schiffman 2000: 37). Lehmann (1989: 184) labels such DSCs as the “DAT-ACC” pattern. According to him, the predicates that require this pattern are:

a) verbs of mental experience such as teri ‘know’, puri ‘understand’;
b) verbs of emotional experience such as piṭi ‘like’; and
c) verbs of physical and biological experience such as paci ‘be hungry’, vali ‘feel pain’, ari ‘itch’, kūcu ‘feel ticklish’.

Lehmann treats these predicates as morphologically defective, as they exhibit agreement in the neuter. This, of course, is expected as there is no nominative case-marked subject to agree with. Hence, this should be treated as DEFAULT CASE like in many other South Asian languages such as Hindi-Urdu, Panjabi, and Telugu.

(31) kumār-ukku  inta  ūr-ai.t    teri.yum
     Kumar-DAT   this  place-ACC  know.FUT-3SG.N
     ‘Kumar knows this place.’ (Tamil; Lehmann 1989: 184)

In Tamil, too, the features [+animacy] and [+specificity] play a crucial role in the occurrence of the specificity marker (see Subbarao 2012 for details). In Bodo (TB), the adjective mɯjaŋ ‘good’ together with a tense marker imparts the meaning of ‘like’, and this predicate assigns genitive case -ha to its subject. Note that adjectives behave like verbs in many Tibeto-Burman languages (see Subbarao 2012 for details). The patient in such cases is accusative case-marked by -khɯu.

(32) khampha-ha   laogi-khɯu  mɯjaŋ-mɯn
     Khampha-GEN  Laogi-ACC   good-PST
     ‘Khampha liked Laogi.’ (Bodo, Tibeto-Burman)

We do not have further data to show that the accusative marker -khɯu is a specificity marker in Bodo.

In conclusion, though the phenomenon of accusative/dative case marking of the theme in Bangla, Malayalam, and Tamil seems to suggest that the predicate in DSCs is [+transitive], we have demonstrated that the marker that occurs with the theme in such constructions is a marker of specificity and animacy, as Magier (1987,28 1990) and Mahajan (1990) have shown for Marwari and Hindi-Urdu respectively for the nominative-accusative construction. Hence, predicates in Non-Nominative Subject (NNS) constructions are syntactically [-transitive]. The occurrence of the accusative marker is generally construed as an indicator of the predicate’s [+transitive] nature. I have shown that the object marker in the DSC does not function like an accusative case marker. Subbarao 2012 presents arguments to show that the predicate in the DSC is [-transitive]. The above discussion shows that the occurrence of the accusative marker in the DSC does not constitute counterevidence to the claim that the predicate in non-nominative subject constructions (NNS) is [-transitive], as argued in Subbarao 2012.

Aissen (2003) discusses differential object marking (DOM), where some objects are case-marked, and some others are not, depending upon the semantic and pragmatic features of the object. Aissen points out that DOM depends on two features, ANIMACY and DEFINITENESS, and they compete with each other for dominance. While Persian prefers definiteness, Hindi chooses animacy. As we have observed, the case marking of theme in the DSC (dative/genitive subject construction, to be very specific) in Bangla, Malayalam, and Tamil by the accusative depends on animacy/specificity independent of transitivity in the [-NNS construction]. Recall that in nominative-accusative constructions too, the accusative marker is associated with TRANSITIVITY and ANIMACY/SPECIFICITY and hence, should be treated as a specificity marker (Magier 1987, 1990; Mahajan 1990). Thus, with regard to Differential Object Marking (DOM) in South Asian languages, when the accusative case marker denoting specificity occurs, the predicate is [-transitive] in the DSC (dative/genitive subject construction), and the predicate is [+transitive] in the nominative subject construction. Based on these facts, Subbarao (2012) proposes the following parameter to account for this variation:

The Differential Object Marking (DOM) parameter: When the noun phrase is accusative case-marked, the object marker is either associated with transitivity and animacy/specificity in the [+NNS construction],29 or purely with animacy/specificity, independent of transitivity, in the [-NNS construction].

28 Recall that Magier (1987: 192–93) clearly articulated that ko in Hindi does not ‘… convey relational information’ when it occurs with “direct objects”, but it ‘follows a semantic hierarchy of specificity and animacy that contributes to the overall salience of the marked object noun.’ Hans Henrich Hock (p.c.) points out that ‘definite DO marking is fine with inanimate direct objects that are highly definite and specific as in (i), while for many speakers it is not acceptable in other cases as in (ii).’

(i)  us ko     haṭā do
     this.ACC  remove.IMP
     ‘remove this (thing)’
(ii) ?? us ne   (us)    kitāb ko  paṛhā
        he.ERG  (that)  book.ACC  read.PST
     ‘he read that book.’

I feel that sentences such as (ii) are grammatical under contrastive stress as in (iii).

(iii) us ne   (us)    kitāb ko  paṛhā,    is ko     nahīṁ
      he.ERG  (that)  book.ACC  read.PST  this.ACC  NEG
      ‘He read that book and not this one.’

4.5.3.

Agreement marking By Hans Henrich Hock

The main focus of this section is on major issues in (finite) verb agreement. South Asian languages exhibit a great variety of verb agreement marking strategies, ranging from Ø-agreement, to single-agreement marking, to multiple agreement, to agreement “distributed” over several components of a complex verb. Moreover, object agreement interacts with “Ergativity” and thus relates to issues discussed in 4.5.1 and 4.5.2. Agreement may be in terms of person, number, gender (or a combination of these); animacy and person hierarchies may play a role,30 and so may the notion of honorificity (4.5.3.3 below); in the absence of antecedents, morphological requirements may call for default marking. Some languages also exhibit agreement on non-finite verbs; see 2.6.5, 2.6.7, and 5.4.3.3.4. Special complications arise in agreement with conjoined antecedents, especially with mixed gender; see e.g. Hock 2012 and 2015: § 3.3 for Sanskrit; Benmamoun, Bhatia & Polinsky 2009 for Hindi; Beythan 1943: 179 and Corbett 1991: 276 for Tamil.

4.5.3.1.

Agreement features

Beside the crosslinguistically well-known agreement with person and number, many Indo-Aryan languages also have gender agreement, as in (33), and most of Dravidian has mixed person-number and gender agreement in third persons (34).31

(33) a. ādmī   āy-ā
        man.M  come.PFV-M
        ‘the man came’
     b. aurat    ā-ī
        woman.F  come.PFV-F
        ‘the woman came’ (Hindi)

29 Thanks to Alice Davison for the formulation of this parameter.

30 See e.g. Wali & Koul 1997: 250–251, Devyani Sharma 2001 for Kashmiri; Paudyal 2008 for Darai; DeLancey 1989, Bickel 2000 for Tibeto-Burman.

31 Toda, Koḍagu, Kuṟumba, and several dialects of Tamil do not mark gender distinctions in third persons; and Malayalam has no person agreement. The exact nature of gender agreement varies in Dravidian. See Krishnamurti 2003: 205–215.

(34) a. paiyaṉ  vant-āṉ
        boy.M   come.PST-3SG.M
        ‘the boy came’
     b. amma      vant-āḷ
        mother.F  come.PST-3SG.F
        ‘mother came’
     c. kālam   vant-atu
        time.N  come.PST-3SG.N
        ‘the time came’ (Tamil)

4.5.3.2.

Agreement triggers32

4.5.3.2.1.

Subjects, Agents, Objects, and the issue of Ergativity33

The most common trigger for verb agreement is the nominative subject, whether Subject (S) or Agent (A) in the Dixonian classification (1994); this definition of subject is referred to as “subject (S/A)” in the remainder of this section. Agreement of this sort is the prevailing pattern in Dravidian; see (34). Complications arise in two contexts: Dative and other Oblique Subjects, and “Ergativity”.

Dative/Oblique subjects generally do not control verb agreement; but note Hook 1990 for Shina and Paudyal 2008 for Darai, an Indo-Aryan language of Nepal. Normally, agreement, if any, is triggered by a nominative NP in the Dative or Oblique subject construction, as in (35).

(35) us    ādmī ko    yah   kahānī       acchī   nahīṁ  lagt-ī        hai
     that  man.M=DAT  this  story.NOM.F  good.F  NEG    seem.IMPFV-F  be.PRS.3SG
     ‘That man does not like this story.’

In languages with “Ergativity”, a major distinction must be made between “ergative” AGENT MARKING and “ergative” VERB AGREEMENT. In Hindi-Urdu the two phenomena are correlated, in so far as verb agreement can be linked to Subject, Agent, and Patient marking. In intransitive structures, morphologically unmarked, nominative Subjects trigger agreement; see (33). In transitives, Agents marked by the agentive (or “ergative”) postposition ne fail to trigger agreement; rather agreement is with unmarked, nominative Patients; see (36ab). Patients marked by the postposition ko fail to trigger agreement and the verb has (masculine) Default agreement marking; (36cd). Generative accounts, therefore, link Hindi-Urdu agreement with the presence or absence of overt case marking; see Subbarao 2012: 97–98 with references.

(36) a. rām ne    kitāb       paṛh-ī
        Ram.M=AG  book.NOM.F  read.PFV:PST-F
        ‘Ram read the book.’
     b. *rām ne   kitāb       paṛh-ā
        Ram.M=AG  book.NOM.F  read.PFV:PST-M
     c. rām ne    sītā ko     dekh-ā
        Ram.M=AG  Sita.F=ACC  see.PFV:PST-M (DEFAULT)
        ‘Ram saw Sita.’
     d. *rām ne   sītā ko     dekh-ī
        Ram.M=AG  Sita.F=ACC  see.PFV:PST-F

32 A different perspective is found in Subbarao 2012: 95–109.
33 For a bibliography on ergativity see Drocco 2009.
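The case-marking generalization for Hindi-Urdu illustrated in (36) can be stated as a small procedure. The Python sketch below is only an illustration under simplifying assumptions: it tracks gender alone, ignores person, number, and tense/aspect, and the names `Argument`, `agreement_controller`, and `verb_gender` are hypothetical labels introduced here, not terms from any of the cited accounts. It encodes the idea that the verb agrees with the highest argument that carries no overt postposition, and otherwise shows masculine Default marking.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Argument:
    role: str                      # "S", "A", or "P"
    gender: str                    # "M" or "F"
    marker: Optional[str] = None   # overt postposition ("ne", "ko"); None = unmarked nominative


def agreement_controller(args: List[Argument]) -> Optional[Argument]:
    """Return the highest unmarked argument (S/A before P), or None if all are marked."""
    for role in ("S", "A", "P"):
        for arg in args:
            if arg.role == role and arg.marker is None:
                return arg
    return None


def verb_gender(args: List[Argument]) -> str:
    """Gender shown on the verb; masculine is assumed as the Default value."""
    controller = agreement_controller(args)
    return controller.gender if controller is not None else "M"


# (36a) rām ne kitāb paṛh-ī: Agent marked by ne, Patient unmarked and feminine.
example_36a = verb_gender([Argument("A", "M", "ne"), Argument("P", "F")])
# (36c) rām ne sītā ko dekh-ā: both arguments marked, so Default agreement.
example_36c = verb_gender([Argument("A", "M", "ne"), Argument("P", "F", "ko")])
```

Under these assumptions `example_36a` comes out feminine and `example_36c` masculine, matching the grammatical forms in (36a) and (36c).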

While this approach works for Hindi-Urdu, it fails to provide satisfactory answers for other “ergative” Indo-Aryan languages. First, there are languages like Assamese, Nepali, and Shina with “ergative” marking of transitive Agents,34 but verb agreement with Agents and Subjects; see e.g. Goswami & Tamuli 2003: 432, Riccardi 2003: 557–558, Bashir 2003: 881. This is, of course, also a widespread Tibeto-Burman phenomenon. Further, Panjabi and Marathi do not have “Ergative” marking of first and second person pronouns,35 and Wotapuri-Katarqalai does not have it for first and second person pronouns in the plural; agreement, however, in all these languages is with the Patient, not the unmarked Agent; Shackle 2003: 612–615, Pandharipande 2003: 711, Bashir 2003: 873. Finally, Gujarati perfective-past verbs agree with the Patient, even though it is case-marked, and this holds optionally for Marathi too; Cardona & Suthar 2003: 682, Pandharipande 2003: 711.36

There are thus at least two types of “ergative” languages outside of Hindi-Urdu. One of these has “ergative” Agentive marking, but no “ergative” Patient agreement; the other has Patient agreement even if Agents are unmarked and/or Patients are marked. Beyond its implications for formal syntax, this difference raises questions as to what is “true ergativity”: Agent marking or Patient agreement. (For a detailed survey and discussion of variation see Deo & Sharma 2006, Verbeke 2013, Verbeke & Willems 2012, and especially Wunderlich 2012.)

34 There is a fair amount of variation regarding the tense/aspect categories and the extent to which some intransitives (under certain circumstances) can have “Ergative”-marked Subjects; details in Goswami & Tamuli 2003, Riccardi 2003, and Bashir 2003.
35 In colloquial varieties of Panjabi, the marker is optional even in third persons (Bashir, p.c. April 2013).
36 In Marwari, patient marking (by -ne) fails to block agreement even in passives; Magier 1990.

4.5.3.2.2. Object and Possessor agreement

A large number of South Asian languages have (direct or indirect) object agreement or even possessor agreement, in addition to subject (S/A) agreement. In some of these languages, a distinction can be made between affixal agreement and clitic agreement; but the distinction often is blurred. Object and possessor agreement is found especially in Munda, the Kiranti and Kuki-Chin groups of Tibeto-Burman, Burushaski, as well as Indo-Aryan languages near the Kiranti and Munda languages (see 2.6.7) and in the (North-)West.

4.5.3.3. Single and multiple agreement, clitic agreement, honorificity, and other features

Some South Asian languages have no verb agreement marking at all, including Meithei (Chelliah 2003: 2, 106), Tibetan-Gurung (DeLancey 1989: 329), and Malayalam (Krishnamurti 2003: 205–215). Sanskrit and most of Indo-Aryan, as well as most of Dravidian, have agreement just with one constituent, most commonly the subject (S/A), but the Patient in “ergative” agreement languages (see 4.5.3.2.1).

Of greater typological interest are languages with MULTIPLE agreement, whether compositional or portmanteau. Multiple agreement commonly involves both Subject (S/A) and direct or indirect object; see e.g. (1b),37 (2a), (3), and (5) repeated below for convenience (examples not relevant to the present discussion are omitted). But it may also involve Possessors (with various restrictions, for which see Subbarao 2001); e.g. (2b), (3b). Even ablatives may be involved; Subbarao 2012: 99.

(1) b. huṛ-da-t-aṅ
       see-NON3-PST-1SG
       ‘I saw you’ (Pengo; Bhattacharya 1972, 1975)

(2) a. ñɛl-gɔt’-ka-t’-ko-a=ko
       see-EMPH-TAM-TR-3PL(OBJ)-FIN=3PL(SUBJ)
       ‘They saw them off.’ (Santali; Anderson 2007b)
    b. gǝi=ko   idi-ke-d-e-tiñ-a
       cow=3PL  take-TAM-TR-3(OBJ)-1SG.POSS-FIN
       ‘They took my cow.’ (Santali; Anderson 2007b)

(3) a. khan-na     asen       a-in-u-na    meruba  pu-metta-ŋ
       you-SG.ERG  yesterday  2-buy-3-ART  goat    look-CAUS-1SG
       ‘Show me the goat you bought yesterday.’ (Athpare; Bickel 1999)
    b. zova-n   ka-kut    a-mi-sɔ:p-pek
       Zova.AG  my-hands  3SG-1SG-washed-BEN
       ‘Zova washed my hands.’ (Hmar; Subbarao 2012)

(5) a. kalʰi     di-m-(k)u-n
       tomorrow  give-FUT-2SG-1SG
       ‘I will give (it) to you tomorrow.’ (Rājbanshi; Wilde 2008)

37 In Dravidian, this type is limited to languages in intensive contact with Munda; see 2.6.5.

Examples of this type, with just two agreement markers, are widespread in Munda, the Tibeto-Burman Kiranti (e.g. Athpare) and Kuki-Chin languages (e.g. Hmar), and several Indo-Aryan languages, including Darai (Paudyal 2008), Kurmali (Mahto 1989), and Rājbanshi (Wilde 2008); it is also found in Iranian Balochi (Bashir 2008: 47).

There are even cases with triple agreement; see e.g. (26a) from Maithili. Note, however, that the Maithili agreement system is quite complex, and only some of the complications can be mentioned here. (For further details see Yadav 1996: 172–185, Stump & Yadav 1988, Yadava 1999, Bickel, Bisang & Yadava 1999.) First, Maithili agreement prominently involves HONORIFICITY,38 as indicated by the notations HH (high honorific), MH (mid honorific), and NH (non-honorific). Secondly, the final suffix, -nh, in (37ab) marks 3HON whether subject (S/A), object, or any other category. Third, there is a fair amount of portmanteau morphology. Fourth, as noted by Bickel, Bisang and Yadava (1999: 488), the structure in (37a) is the only triple-agreement structure permitted by the morphology and morphotactics of Maithili. Finally, the same suffix combination can express a different relation; (37b).

(37) a. hamᵢ   toharⱼ            bābu-jiₖ-kē    dekh-al-iᵢ-auⱼ-nhₖ
        I.NOM  you.2NH/MH.GEN    father.HH-ACC  see-PST-1–2NH/MH-3H
        ‘Iᵢ saw yourⱼ fatherₖ.’ (Maithili; Bickel, Bisang & Yadava 1999: 510)
     b. hamᵢ   torā        kaniyā-kē  dekh-au-l-iᵢ-auⱼ-nhₖ
        I.NOM  you.2NH/MH  bride-DAT  see-CAUS-PST-1–2NH/MH-3H
        ‘Iᵢ showed youⱼ the brideₖ.’ (Maithili; Bickel, Bisang & Yadava 1999: 482)

38 A similar form of honorific agreement is found in the closely related Magahi; see S. Verma 2003: 513 and Rakesh & Kumar 2013 for details. More widespread is the use of plural and/or third person pronouns for honorificity with effects on verb agreement. This type of agreement, also found outside South Asia, is not considered in this report.

Triple agreement is also possible in Santali, as well as Kashmiri and other (north-)west Indo-Aryan languages; see e.g. (38) and (39). But note that these involve at least one clitic. Moreover, CLITIC AGREEMENT tends to have its own peculiarities. In Santali, the traditional, unmarked position of the subject marker is enclitic on the pre-verbal constituent, rather than proclitic on the verb; for discussion and details see Anderson 2007b, Ghosh 2008, and Kidwai 2005. In the NW Indo-Aryan languages, clitics may double other agreement markers, as in (39b), where clitic -se doubles up on the (gender) agreement suffix -āṁ. Moreover, in Panjabi (some of) the clitics attach to the preverbal negation rather than to the verb; see (39c) and Butt 2007. For agreement in Kashmiri, Sindhi, Lahnda/Saraiki/Western Panjabi see also Grierson 1919a: 42.65, 85–86; 1919b: 260–261, 270–271. A number of Dardic languages (e.g. Pashai; Bashir 2003: 829, Lehr 2014) also exhibit clitic agreement of this type, and so does Eastern Balochi (Bashir 2008: 47). Note that in all of the NW Indo-Aryan languages, the core agreement (the first, suffixal, non-clitic marker after the verb) remains “ergative” in the perfective past; see (39b), where -āṁ agrees with the implicit Patient.

(38) ako-ge=ko            idi-ke-t’-ko-tako-a
     they-EMPH=3PL(SUBJ)  take-ASP-TR-3PL(OBJ)-3PL:POSS-FIN
     ‘They took theirs away themselves.’ (Santali; Anderson 2007b: 95)

(39) a. bɨ     ču-s=an=ay       su      tse      havālɨ     karān
        I.NOM  be-1SG=3SG=2SG   he.ACC  you.DAT  hand.over  do.PPL
        ‘I am handing him over to you.’ (Kashmiri; Wali & Koul 1997: 253)
     b. chaḍḍi-āṁ=īṁ=se
        give.up.PFV:PST-M(OBJ)ᵢ-3SG(OBL)ⱼ-3SGᵢ
        ‘heⱼ gave himᵢ up’ (Sindhi; adapted from Grierson 1896a)
     c. fāwād=ne    nai=s(u)ᵢ  ditt-īⱼ
        Fawad.M=AG  NEG=3SG    give.PFV:PST.SG.F
        ‘Fawad did not give (thisⱼ) to herᵢ.’ (Panjabi; adapted from Butt 2007)

Multiple marking, as well as clitic agreement (with the Agent), is also found in Old Avadhi as a transitional phase between ergative and accusative verb agreement, (40);39 B. Saksena (1937). Grierson (1896ab) considers the clitic =uṁ to be pronominal in origin; see also Butt 2007. B. Saksena (1937: 253–254) instead proposes derivation from the copula. The fact that Old Marathi has similar structures whose clitics are clearly of copular origin (Master 1964: 129–132) may support Saksena; but third person clitics like Old Avadhi -nh(i/a) and the -nh of Maithili are more compatible with pronominal origin (compare OAv. tinha ‘that.OBL’). The issue of the origin of Indo-Aryan agreement clitics deserves further, in-depth study.

(40) so    suni       samujhi         sah-i=uṁ       saba  sūlā
     that  heard.CVB  understood.CVB  endured-F=1SG  all   torture.F
     ‘Having heard and understood that, I (Bharata) endured all the torture.’ (Old Avadhi; Ram Carit Manas 2. 261)

39 In Modern Avadhi, -i- and -uṁ have fused into a single suffix, marking agreement with the Agent, not with the Patient.

An interesting variant on multiple marking with doubling is found in Burushaski, where a subset of intransitives can mark the Subject both in the subject (S/A) and object slots of the morphological template; see (41b) vs. (41a). As noted by Anderson (2007a: 1251–1252), the construction in (41b) is used when the Subject is not in control of the action.

(41) a. γurc-ím-i
        sink-m.SUFFIX-3SG(SUBJ)
        ‘he dived under’
     b. i-γurc-ím-i
        3SG(OBJ)-sink-m.SUFFIX-3SG(SUBJ)
        ‘he sank’ (Burushaski; adapted from Anderson 2007a: 1252)

In some South Asian languages, multiple marking is realized in terms of portmanteau morphology. This is especially the case in Tibeto-Burman. Thus, Limbu subject (S/A) and object agreement is realized as in (42), with portmanteau suffixes for 1>2, 1>3, 3>1 (DeLancey 1989: 319); and DeLancey reconstructs a similar system for Proto-Tibeto-Burman (1989: 321).40

(42)               Object
                   1          2         3
     Subject  1               V-nɛ      V-uŋ
              2    kɛ-V-aŋ              kɛ-V-u
              3    V-aŋ       kɛ-V-a    V-u
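The Limbu pattern in (42) is essentially a lookup from a (subject person, object person) pair to an affix frame around the verb stem V. The Python sketch below shows one way to encode that lookup; the dictionary `LIMBU_AFFIXES` and the function `limbu_template` are hypothetical names introduced for illustration, and the cell assignments follow the reading of (42) in which 1>1 and 2>2 are unfilled. It covers person only and says nothing about number or tense.

```python
from typing import Dict, Tuple

# (subject person, object person) -> (prefix, suffix); "" means no prefix.
# Cell assignments assume the layout of (42) with 1>1 and 2>2 left empty.
LIMBU_AFFIXES: Dict[Tuple[int, int], Tuple[str, str]] = {
    (1, 2): ("", "nɛ"),
    (1, 3): ("", "uŋ"),
    (2, 1): ("kɛ", "aŋ"),
    (2, 3): ("kɛ", "u"),
    (3, 1): ("", "aŋ"),
    (3, 2): ("kɛ", "a"),
    (3, 3): ("", "u"),
}


def limbu_template(subject: int, obj: int) -> str:
    """Return the affix frame for a subject/object person pair, e.g. 'kɛ-V-aŋ'."""
    prefix, suffix = LIMBU_AFFIXES[(subject, obj)]  # KeyError for unfilled cells
    parts = ([prefix] if prefix else []) + ["V", suffix]
    return "-".join(parts)


second_on_first = limbu_template(2, 1)
first_on_second = limbu_template(1, 2)
```

Keeping the prefix and suffix as a pair, rather than storing whole strings, makes it easy to see that 2>1 and 3>1 share the suffix -aŋ and differ only in the prefix kɛ-.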

40 On the issue of Tibeto-Burman agreement reconstruction, see the discussion and references in 2.6.7.

4.5.3.4 Agreement in complex verbs and “distributed” agreement

Complex verbs are a common feature of South Asian languages. Two major subtypes can be distinguished: Noun (or adjective) + Verb structures (CONJUNCT VERBS) and Verb + Verb structures. The latter come in two subvarieties: Nonfinite verb + Auxiliary (“AUXILIARY STRUCTURES”) and Verb + Verb compounds (“COMPOUND VERBS”).41 Various issues of agreement marking and its distribution over the entire verbal complex can arise. This is especially the case in Indo-Aryan, where participles and other nonfinite verbs in Auxiliary Structures tend to be adjectival and thus subject to gender/number agreement marking, while the auxiliary is subject to person/number agreement. But agreement issues also arise in the other Complex Verb types.

4.5.3.4.1 Agreement in Conjunct Verbs: Incorporation vs. nonincorporation

In cases where the first element of a Conjunct Verb is a noun, the question of gender agreement with the following quasi-auxiliary must arise. The general tendency is for the noun to become “incorporated” in the sense of Baker (1988); as a consequence there is no gender agreement with the following verb and no linkage of the preceding “complement” by means of genitive marking; see e.g. (43a). However, for some combinations of noun + quasi-auxiliary there is variation as in (43b) vs. (43a); see e.g. McGregor 1972: 58. The most common verbs in structures of this sort are verbs meaning ‘do, make’ and ‘be, become’; (43cd).42

(43) a. mujhe  rām        yād          āy-ā
        I.DAT  Ram.M      memory(.F)   come.PFV:PST-SG.M
        ‘I remembered Ram.’
     b. mujhe  rām kī     yād          ā-ī
        I.DAT  Ram.M-GEN  memory.F     come.PFV:PST-SG.F
        ‘I remembered Ram.’
     c. rām           śyām ko    yād        kar rahā hai
        Ram.NOM.SG.M  Shyam-ACC  memory(F)  do.PROGR.PRS.3SG.M
        ‘Ram is remembering Shyam.’
     d. āp ko        rām           yād        hogā
        you.HON-DAT  Ram.NOM.SG.M  memory(F)  be.FUT.3SG.M
        ‘You will remember Ram.’

41 The terms Conjunct and Compound Verb are standard in South Asian linguistics; see e.g. Kachru 1980, 1982, Hook 1974, as well as 5.4.2. The term “Auxiliary Structure” is used here for ease of exposition.
42 The examples in (43) are from Hindi.

4.5.3.4.2 Agreement in Auxiliary Structures

The general pattern in Indo-Aryan Auxiliary Structures is for the nonfinite and auxiliary components to agree with the same constituent, whether Subject (including S/A subject) or Patient/Object, but in terms of different features; see e.g. (44), where the nonfinite element agrees in gender/number, and the finite element in person/number. This pattern may be called DISTRIBUTED AGREEMENT.

(44) a. sa             samāgato                ’sti
        that.NOM.SG.M  come.ta.PTCP.NOM.SG.M   be.PRS.3SG
        ‘He came/has come.’ (Sanskrit)
     b. ve          ā     rah-ī    haiṁ
        that.PL(F)  come  PROGR-F  AUX.PRS.3PL
        ‘They(F) are coming.’ (Hindi)
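Distributed Agreement as described for (44) amounts to splitting one controller's feature bundle across the two parts of the verbal complex. The following Python sketch makes that split explicit; `distribute_agreement` and the feature labels are hypothetical names chosen for illustration, and the sketch assumes the single-controller pattern only, not cases where the two elements agree with different antecedents.

```python
from typing import Dict, Tuple

Features = Dict[str, str]


def distribute_agreement(controller: Features) -> Tuple[Features, Features]:
    """Split one controller's features over the nonfinite element and the auxiliary."""
    # Nonfinite (participial, adjective-like) element: gender and number.
    nonfinite = {key: controller[key] for key in ("gender", "number")}
    # Finite auxiliary: person and number.
    auxiliary = {key: controller[key] for key in ("person", "number")}
    return nonfinite, auxiliary


# (44b) ve ā rah-ī haiṁ 'They(F) are coming': third person plural feminine controller.
participle, aux = distribute_agreement({"person": "3", "number": "PL", "gender": "F"})
```

Number is the only feature that appears on both elements in this sketch; person and gender are each realized once.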

Remarkably, in varieties of Rajasthani, the two elements may agree with different antecedents in the “ergative” construction; (45) and see Magier 1983a. Apparently there is some variation in this regard, perhaps because of Hindi normative influence; see Khokhlova 2002 vs. Bahl 1972, Stroński 2010.

(45) mhaiṁ  saugan  lai   lī               hūṁ
     I      oath.F  take  take.PFV:PST.F   AUX.PRS.1SG
     ‘I have taken an oath.’ (Rajasthani; Magier 1983a)

The construction is strikingly similar to the Old Avadhi one in (40), which raises the question whether the similarity can be attributed to similar historical developments.

4.5.3.4.3 Agreement in Compound Verbs and the issue of Serial Verbs

The most widespread type of Compound Verb combines a nonfinite form of the first verb (the POLE) with a finite second verb (the VECTOR), as in (46); agreement marking is restricted to the Vector. The Vector tends to add an element of aspect or aktionsart; but as (46b) shows, it may have other functions as well. See Slade 2013 and 5.4.2.3 (this volume) for discussion of different perspectives on the nature and historical developments of Indo-Aryan Compound Verbs, as well as cross-language variation.

(46) a. vah         ā     gay-ā
        that.SG(M)  come  go.PFV:PST-SG.M
        ‘He came/arrived.’
     b. us-ne               kitāb    paṛh  l-ī
        that.OBL.SG(M)-AG   book(F)  read  take.PFV:PST.SG.F
        ‘He read the/a book (for himself).’

A special type of Complex Verb is what Steever (1988) refers to as SERIAL VERB: a structure in which the two verbs agree in person, number, and (if applicable) gender, forming a single construction. The examples in (47), from Steever 1988, show that this type functionally straddles the fence between Compound Verb and Auxiliary Structure. The relation between the two verbs in (47a) involves auxiliation; in (47b) the relation is similar to that of governing verb + purpose infinitive; and (47c) represents the Compound Verb variety, a “Balance Verb” whose meaning transcends those of its component parts. See also 5.4.2.1, this volume.

(47) a. celvēm       allēm
        go.NPST.1PL  be(come).NEG.1PL
        ‘We will not go.’ (OTamil)
     b. irakku        vāreṉ
        beg.NPST.1SG  come.NEG.1SG
        ‘I do not come to beg.’ (OTamil)
     c. biba      injo       ōti       samgiri     uṭar           ticar
        marriage  house.LOC  take.CVB  provisions  drink.PST.3PL  eat.PST.3PL
        ‘They ate-drank, i.e. consumed, the provisions brought to the marriage house.’ (Pengo)

Similar constructions are found in a number of other South Asian languages, as in (48), at different times and in different, geographically unconnected areas, except perhaps for (48c) which may reflect Dravidian influence.

(48) a. av-uṁ     ch-uṁ
        come-1SG  be.PRS-1SG
        ‘I come’ (Gujarati; Cardona & Suthar 2003: 681)43
     b. yaŋ    te·s-u-ŋ      sur-u-ŋ
        money  spend-3-1PST  AUX-3-1PST
        ‘I’ve spent the money.’ (Limbu; van Driem 1987: 119, cited by Anderson 2007b: 253)
     c. miŋ  ne-gaʔ-ru  ne-laʔ-ru
        I    1-eat-PST  1-AUX-PST
        ‘I ate vigorously’ (Gorum; Anderson 2007b: 253)

An interesting variant of the Serial Verb construction consists of structures in which the two verbs agree only in part. Incomplete agreement may involve subject or object, tense, and/or modality. For instance in (49a) the first verb exhibits a “Truncated Personal Ending” (TPE) (Steever 1988: 72–74);44 in (49b) it omits tense; in (49c) it is underspecified for modality, with the first form only in the “Casual” (C) imperative, while the second verb may be in the nonpolite (N) or polite (P) imperative; and (49d) shows person neutralization on the first verb and inexact, “hortative” agreement (imperative vs. subjunctive) in the case of first-person second verbs.

(49) a. 1SG  vā-t-a        suṛ-t-a            1PL  vā-t-a        suṛ-t-ap
             come-PST-TPE  see-PST-1SG             come-PST-TPE  see-PST-1PL(EX)
             ‘I came and saw’                      ‘We came and saw’
        2SG  vā-t-i        suṛ-t-i            2PL  vā-t-i        suṛ-t-ider
             come-PST-TPE  see-PST-2SG             come-PST-TPE  see-PST-2PL
             ‘You came and saw’                    ‘You PL came and saw’
        (Konḍa; Steever 1988: 72–74)
     b. ḍokra    na   bole-nturiaʔ        na-čoṅ  na-ug-k(e)=ṛe
        old.man  you  rice-millet.gruel   2-eat   2-drink-PST.II=Q
        ‘Old man, didn’t you drink/eat your rice millet and gruel?’
        (Gtaʔ; Anderson 2007b: 265 with reference; Anderson’s glosses and translation)
     c. baŋani  aḍ-t-a        pī-gǝy
        door    close-REAL-C  go-IMP.SG.P
        ‘Go, close the door (please).’
        baŋani  aḍ-t-a        bār-ǝy
        door    close-REAL-C  come-IMP.SG.N
        ‘Come, close the door.’ (Betta Kurumba; Coelho 2012: 50)
     d. bhakṣa       + ehi         mā          + ā viśa
        food.VOC.SG  come.IMP.2SG  I.ACC.CLIT  enter.IMP.2SG
        ‘Come, food, enter me.’ (Taittirīya Saṁhitā 3.2.5.1)
        eta          + u nv  indraṁ          stavāma
        come.IMP.2PL PCL     Indra.ACC.SG.M  praise.SBJV.1PL
        ‘Come now, let us praise Indra.’ (Rig Veda 8.24.19)
        (Vedic Sanskrit; Hock 2002)

43 Similar structures are found in early Urdu (Schmidt 2003: 323), Marwari (Grierson 1908: 26–28), Braj Bhākhā (Grierson 1916: 81).
44 Further examples from Burushaski and Limbu, with suppression of object agreement, are given by Anderson (2007b: 260, 261).

4.5.3.5. A note on adjective, demonstrative, and noun agreement45

In Sanskrit and Middle Indo-Aryan, adjectives and demonstrative, interrogative, or relative pronouns exhibit case, number, and gender agreement with their nominal head in attributive constructions and with their underlying subjects in predicate structures; see the Sanskrit examples in (50).46 There is a similar agreement, but only in case, between predicate nouns and their subjects; see (51) below.

(50) a.  s-aḥ           sundar-aḥ           rājā
         that.NOM.SG.M  beautiful.NOM.SG.M  king.NOM.SG.M
         ‘that beautiful king’
     a’. s-aḥ           rājā           sundar-aḥ           (asti)
         that.NOM.SG.M  king.NOM.SG.M  beautiful.NOM.SG.M  be.PRS.3SG
         ‘that king is beautiful.’
     b.  s-ā            sundar-ī            rājñī
         that.NOM.SG.F  beautiful.NOM.SG.F  queen.NOM.SG.F
         ‘that beautiful queen’
     b’. s-ā            rājñī           sundar-ī            (asti)
         that.NOM.SG.F  queen.NOM.SG.F  beautiful.NOM.SG.F  be.PRS.3SG
         ‘that queen is beautiful.’
     c.  t-ad           sundar-am           rājy-am
         that.NOM.SG.N  beautiful.NOM.SG.N  kingdom.NOM.SG.N
         ‘that beautiful kingdom’
     c’. t-ad           rājy-am           sundar-am           (asti)
         that.NOM.SG.N  kingdom.NOM.SG.N  beautiful.NOM.SG.N  be.PRS.3SG
         ‘that kingdom is beautiful.’

45 For Hindi “Long Distance” gender/number agreement, see 5.2.2.3.2 and Subbarao 2012: 114–122.

46 Examples in this section are cited without sandhi. A Vedic complication is that bare demonstrative or relative pronoun subjects normally agree in gender with their predicates, not the other way around, as in the following example.
   ye       tuṣāḥ       sā         tvak
   RP.PL.M  shell.PL.M  that.SG.F  skin.SG.F
   ‘What are the shells, that is the skin.’ (≈ ‘The shells are tantamount to the skin.’) (Aitareya Brāhmaṇa 1.22.14)

The rule of predicate agreement with underlying subjects leads to interesting consequences in impersonal passive-like constructions, “locative-absolutes”, and other structures, in which the underlying subject is not in the nominative: By agreement, predicate nouns and adjectives wind up appearing in non-nominative cases. See the predicate noun structures in (51).

(51) a. s-aḥ           rājā           (asti)
        that-NOM.SG.M  king.NOM.SG.M  be.PRS.3SG
        ‘He is (a) king.’
     b. ta-smin        rājñ-e         bhavati …
        that-LOC.SG.M  king-LOC.SG.M  be.PRS.PTCP.LOC.SG.M
        ‘With him being king …’
     c. t-ena          rājñ-ā         bhavitavyam
        that-INS.SG.M  king-INS.SG.M  be.GERUNDIVE.NOM.SG.N
        ‘He must/may be king’ (Lit. ‘By him (is) having-to-be by king.’)

Bibliographical references

Abbi, Anvita 1992 Reduplication in South Asian languages: An areal, typological, and historical study. Delhi: Allied Publishers.
Abbi, Anvita (ed.) 1997 Languages of the tribal and indigenous peoples of India: The ethnic space. Delhi: Motilal Banarsidass.
Abbi, Anvita, R. S. Gupta, and Ayesha Kidwai (eds.) 2001 Linguistic structure and language dynamics in South Asia: Papers from the proceedings of SALA XVIII Roundtable. Delhi: Motilal Banarsidass.
Agha, Asif 1993 Structural form and utterance context in Lhasa Tibetan: Grammar and indexicality in a non-configurational language. New York: Peter Lang.
Albert, D. 1985 Tolkāppiyam: Phonology and morphology (An English translation). Madras: International Institute of Tamil Studies.
Aissen, Judith 2003 Differential object marking: Iconicity vs. economy. Natural Language and Linguistic Theory 21: 435–483.
Amritavalli, R. 1997 A Kannada perspective on morphological causatives. In: M. Hariprasad, Hemalatha Nagarajan, P. Madhavan, and G. Vijayakrishnan (eds.), Phases and interfaces of morphology, 180–205. Hyderabad: Central Institute of English and Foreign Languages.
Amritavalli, R., and K. A. Jayaseelan 2005 Finiteness and negation in Dravidian. In: G. Cinque and R. S. Kayne (eds.), The Oxford handbook of comparative syntax, 178–220. Oxford: University Press.
Amritavalli, R., and Partha Protim Sarma n.d. A case distinction between unaccusative and unergative subjects in Assamese. www.ledonline.it/Snippets/allegati/snippets5001.pdf (accessed 4 May 2011)
Andersen, Paul Kent 1986a Die ta-Partizipialkonstruktion bei Asoka: Passiv oder ergativ? Zeitschrift für vergleichende Sprachforschung 99: 75–95.

Andersen, Paul Kent 1986b The genitive agent in Rigvedic passive constructions. In: Franciszek Sławski et al. (eds.), Collectanea linguistica in honorem Adami Heinz, 9–13. (Prace Komisji Językoznawstwa 53). Wrocław/Warszawa/Kraków/Gdańsk/Łódź: Wydawnictwo Polskiej Akademii Nauk.
Anderson, Gregory D. S. 2007a Burushaski morphology. In: Kaye (ed.) 2007: 1233–1276.
Anderson, Gregory D. S. 2007b The Munda verb: Typological perspectives. Berlin/New York: Mouton de Gruyter.
Anderson, Gregory D. S. 2008 Gtaʔ. In: Anderson (ed.) 2008: 682–763.
Anderson, Gregory D. S. (ed.) 2008 The Munda languages. Oxford/New York: Routledge.
Anderson, Gregory D. S., and Felix Rau 2008 Gorum. In: Anderson (ed.) 2008: 381–433.
Anderson, Gregory D. S., and K. David Harrison 2008a Sora. In: Anderson (ed.) 2008: 299–380.
Anderson, Gregory D. S., and K. David Harrison 2008b Remo. In: Anderson (ed.) 2008: 557–632.
Anderson, Gregory D. S., and Randall H. Eggert 2001 A typology of verb agreement in Burushaski. Linguistics of the Tibeto-Burman Area 24(2): 235–254.
Anderson, Gregory D. S., Toshiki Osada, and K. David Harrison 2008 Ho and the other Kherwarian languages. In: Anderson (ed.) 2008: 195–255.
Anderson, Stephen R. 1977 On the mechanisms by which languages become ergative. In: Charles N. Li (ed.), Mechanisms of syntactic change, 217–264. Austin/London: University of Texas Press.
Andvik, Erik 1999 Tshangla grammar. University of Oregon PhD dissertation.
Andvik, Erik 2010 Pragmatically motivated marking of the agentive case in Tshangla. 16th Himalayan Languages Symposium, 1–5 September 2010, SOAS, University of London.
Annamalai, E., and S[anford] B. Steever 1998 Modern Tamil. In: Steever (ed.) 1998: 100–128.
Asher, Ronald E. 1985 Tamil. London/Sydney/Dover: Croom Helm.
Asher, Ronald E., and T. C. Kumari 1997 Malayalam. London/New York: Routledge.
Bahl, Kali Charan 1972 On the present state of Modern Rajasthani grammar. Parampara 33: 1–76.
Bailey, T. Grahame 1924 Grammar of the Shina (Ṣinā) language. London: The Royal Asiatic Society.
Baker, Mark C. 1988 Incorporation: A theory of grammatical function changing. Chicago/London: University of Chicago Press.

Bashir, Elena 1986 Beyond split-ergativity: Subject marking in Wakhi. In: Anne M. Farley, Peter T. Farley, and Karl-Erik McCullough (eds.), Papers from the regional meeting, Chicago Linguistic Society 22(1), 14–35. Chicago: Chicago Linguistic Society.
Bashir, Elena 1988 Topics in Kalasha syntax: An areal and typological perspective. University of Michigan PhD dissertation. ProQuest, UMI Dissertations Publishing 8821545.
Bashir, Elena 1999 The Urdu and Hindi ergative postposition ne: Its changing role in the grammar. In: Rajendra Singh (ed.), The Yearbook of South Asian Languages and Linguistics 1999, 11–36. New Delhi/Thousand Oaks/London: Sage Publications.
Bashir, Elena 2003 Dardic. In: Cardona & Jain (eds.) 2003: 818–894.
Bashir, Elena 2004 Le préfixe d- en Bourouchaski: deixis et point de référence. (French translation of The d-prefix in Burushaski: Deixis and viewpoint, originally presented at the 36th International Congress of Asian and North African Studies (ICANAS 2000), Montreal, August 27–September 2, 2000.) In: Étienne Tiffou (ed.), Bourouchaskiana, 17–62. Louvain-la-Neuve: Peeters.
Bashir, Elena 2008 Some transitional features of Eastern Balochi: An areal and diachronic perspective. In: Carina Jahani, Agnes Korn, and Paul Titus (eds.), The Baloch and others: Linguistic, historical and socio-political perspectives on pluralism in Balochistan, 46–82. Wiesbaden: Reichert.
Bashir, Elena 2009 Wakhi. In: Windfuhr (ed.) 2009: 825–858.
Bauman, James 1975 Pronouns and pronominal morphology in Tibeto-Burman. University of California, Berkeley, PhD dissertation.
Benmamoun, Elabbas, Archna Bhatia, and Maria Polinsky 2009 Closest-conjunct agreement in head-final languages. Linguistic Variation Yearbook 9: 67–88.
Berger, Hermann 1974 Das Yasin-Burushaski (Werchikwar). Wiesbaden: Harrassowitz.
Berger, Hermann 1998 Die Burushaski-Sprache von Hunza und Nager, 3 vols. Wiesbaden: Harrassowitz.
Beyer, Stephan V. 1993 The classical Tibetan language. Delhi: Sri Satguru Publications.
Beythan, Hermann 1943 Praktische Grammatik der Tamilsprache in Umschrift. Leipzig: Harrassowitz.
Bhaskararao, Peri, and Karumuri V. Subbarao (eds.) 2001 The yearbook of South Asian languages 2001. (= Proceedings of Tokyo symposium on South Asian languages: Contact, convergence, and typology.) Thousand Oaks/London/New Delhi: Sage.
Bhaskararao, Peri, and Karumuri V. Subbarao (eds.) 2004 Non-nominative subjects. 2 vols. Amsterdam/Philadelphia: Benjamins.

Bhat, D. N. S. 1997 Noun-verb distinction in Munda languages. In: Abbi (ed.) 1997: 227–251.
Bhatia, Tej K. 1993 Punjabi: A cognitive descriptive grammar. London/New York: Routledge. Reprinted 2000.
Bhatt, Rajesh 2007 Unaccusativity and case licensing. Lecture, McGill University.
Bhattacharja, Shishir 2007 Word formation in Bengali: A whole word morphological description and its theoretical implications. München: LINCOM.
Bhattacharya, Sudhibushan 1972 Dravidian and Munda: A good field for areal and typologic studies. In: S. Agesthialingom and S. V. Shanmugan (eds.), Third Seminar on Dravidian Linguistics, 241–256. Annamalainagar: Annamalai University.
Bhattacharya, Sudhibushan 1975 Linguistic convergence in the Dravido-Munda area. International Journal of Dravidian Linguistics 4: 199–214.
Bickel, Balthasar 1999 Nominalization and focus constructions in some Kiranti languages. In: Yadava & Glover (eds.) 1999: 271–296. http://www.uni-leipzig.de/~bickel/research/papers/focnom99.pdf (accessed 8 May 2013)
Bickel, Balthasar 2000 Introduction: Person and evidence in Himalayan languages. Linguistics of the Tibeto-Burman Area 23(2): 1–11. www.spw.uzh.ch/bickel-files/papers/Bickel2000Person.pdf (accessed 8 May 2013)
Bickel, Balthasar 2003 Belhare. In: Thurgood & LaPolla (eds.) 2003: 546–570.
Bickel, Balthasar, Walter Bisang, and Yogendra P. Yādava 1999 Face vs. empathy: The social foundations of Maithili verb agreement. Linguistics 37: 481–518.
Bloch, Jules 1970 The formation of the Marathi language. (English translation of original 1920 edition, by Dev Raj Chanana.) Delhi: Motilal Banarsidass.
Böhtlingk, Otto von 1887 Pâṇini’s Grammatik. Leipzig: Haessel. Repr. 1998, Darmstadt: Wissenschaftliche Buchgesellschaft.
Bubenik, Vit 1989a On the origins and elimination of ergativity in Indo-Aryan languages. Canadian Journal of Linguistics 34(4): 377–398.
Bubenik, Vit 1989b An interpretation of split ergativity in Indo-Iranian languages. Diachronica 6(2): 181–211.
Butt, Miriam 1993 Object specificity and agreement in Hindi/Urdu. Papers from the 29th regional meeting of the Chicago Linguistics Society, 80–103. Chicago: Chicago Linguistic Society.
Butt, Miriam 1995 The structure of complex predicates in Urdu. Stanford, CA: CSLI.

Butt, Miriam 2001 A reexamination of the accusative to ergative shift in Indo-Aryan. In: Miriam Butt and Tracy Holloway King (eds.), Time over matter: Diachronic perspectives on morphosyntax, 105–141. Stanford, CA: CSLI.
Butt, Miriam 2003 The light verb jungle. Harvard Working Papers in Linguistics 9: 1–49.
Butt, Miriam 2007 The role of pronominal suffixes in Punjabi. In: Annie Zaenen (ed.), Architectures, rules and preferences, 341–368. Stanford: CSLI. http://ling.uni-konstanz.de/pages/home/butt/main/papers/final-punjabi.pdf (accessed 8 May 2013)
Butt, Miriam, and Aditi Lahiri 2002 Historical stability vs. historical change. MS, Universität Konstanz. http://ling.uni-konstanz.de/pages/home/butt/main/papers/stability.pdf (accessed 8 May 2013)
Butt, Miriam, and Tafseer Ahmed 2008 The redevelopment of Indo-Aryan case systems from a lexical semantic perspective. Morphology 21: 545–572.
Butt, Miriam, and Tikaram Poudel 2007 Distribution of the ergative in Nepali. MS, Universität Konstanz. http://ling.uni-konstanz.de/pages/home/butt/ (accessed April 2011)
Butt, Miriam, and Tracy Holloway King 1991 Semantic case in Urdu. In: Lise M. Dobrin, Lynn Nichols and Rosa M. Rodriguez (eds.), Papers from the Annual Regional Meeting of the Chicago Linguistic Society 27, 31–45. Chicago: Chicago Linguistic Society.
Butt, Miriam, and Tracy Holloway King 2004 The status of case. In: Dayal & Mahajan (eds.) 2004: 153–198.
Butt, Miriam, Tracy Holloway King, and Gillian Ramchand (eds.) 1994 Theoretical perspectives on word order in South Asian languages. Stanford, CA: CSLI.
Caldwell, Robert 1856 A comparative grammar of the Dravidian or South-Indian family of languages. Madras: University of Madras.
Caldwell, Robert 1875 A comparative grammar of the Dravidian or South-Indian family of languages. 2nd edition, revised and enlarged. London: Trübner & Co.
Caldwell, Robert 1913 A comparative grammar of the Dravidian or South-Indian family of languages. 3rd edition, revised and edited by J. L. Wyatt and T. Ramakrishna Pillai. Reprinted 1974, New Delhi: Oriental Books Reprint Corporation.
Cardona, George 1965 A Gujarati reference grammar. Philadelphia: University of Pennsylvania Press.
Cardona, George 2000 Old Indic grammar. In: Geert E. Booij, Christian Lehmann, and Joachim Mugdan (eds.), Morphologie/Morphology: Ein internationales Handbuch zur Flexion und Wortbildung/An international handbook on inflection and word formation, vol. 1, 41–51. Berlin: Mouton de Gruyter.

Cardona, George 2003 Sanskrit. In: Cardona & Jain (eds.) 2003: 104–160.
Cardona, George 2004 From Vedic to modern Indic languages. In: Geert E. Booij, Christian Lehmann, Joachim Mugdan, and Stavros Skopeteas (eds.), Morphologie/Morphology: Ein internationales Handbuch zur Flexion und Wortbildung/An international handbook on inflection and word formation, vol. 2, 1712–1729. Berlin: Mouton de Gruyter.
Cardona, George 2007 Sanskrit morphology. In: Kaye (ed.) 2007: 775–824.
Cardona, George 2009 On the structure of Pāṇini’s system. In: Huet et al. (eds.) 2009: 1–32.
Cardona, George, and Babu Suthar 2003 Gujarati. In: Cardona & Jain (eds.) 2003: 659–697.
Cardona, George, and Dhanesh Jain (eds.) 2003 The Indo-Aryan languages. London/New York: Routledge.
Chang, Kun, and Betty Shefts Chang 1980 Ergativity in spoken Tibetan. Bulletin of the Institute of History and Philology 51(1): 15–32. Taipei: Academia Sinica.
Chatterji, Suniti Kumar 1926 The origin and development of the Bengali language. Calcutta University Press. Repr. 1970, London: Allen & Unwin; distributed by Motilal Banarsidass, Delhi.
Chelliah, Shobhana L. 1997 A grammar of Meithei. Berlin/New York: Mouton de Gruyter.
Chelliah, Shobhana L. 2003 Meithei. In: Thurgood & LaPolla (eds.) 2003: 427–438.
Chelliah, Shobhana L. 2009 Semantic role to new information in Meithei. In: Jóhanna Barðdal and Shobhana L. Chelliah (eds.), The role of semantic, pragmatic, and discourse factors in the development of case, 377–400. Amsterdam/Philadelphia: Benjamins.
Chelliah, Shobhana L., and Gwendolyn Hyslop 2011 Linguistics of the Tibeto-Burman Area 34(2): Special issue on optional case marking in Tibeto-Burman, Part 1.
Chelliah, Shobhana L., and Gwendolyn Hyslop 2012 Linguistics of the Tibeto-Burman Area 35(1): Special issue on optional case marking in Tibeto-Burman, Part 2.
Christdas, Prathima 2013 The phonology and morphology of Tamil. Oxford/New York: Routledge.
Coelho, Gail 2012 The re-emergence of finite serial verbs in South Dravidian. In: Rajendra Singh and Shishir Bhattacharja (eds.), Annual review of South Asian languages and linguistics 2012, 45–75. Berlin/New York: de Gruyter Mouton.
Corbett, Greville G. 1991 Gender. Cambridge: Cambridge University Press.
Coupe, Alexander R. 2007 A grammar of Mongsen Ao. Berlin/New York: Mouton de Gruyter.

Coupe, Alexander R. 2011 On core case marking patterns in two Tibeto-Burman languages of Nagaland. Linguistics of the Tibeto-Burman Area 34(2): 21–48.
Creissels, Denis 2008 Remarks on split intransitivity and fluid intransitivity. In: O. Bonami and P. Cabredo Hofherr (eds.), Empirical issues in syntax and semantics 7, 139–168. http://www.cssp.cnrs.fr/eiss7 (accessed May 2011)
Creissels, Denis 2009 Ergativity/accusativity revisited. 8th Biennial Meeting of the Association for Linguistic Typology (ALT VIII), Berkeley, CA, July 24–28, 2009. www.deniscreissels.fr/public/Creissels-ergativity.pdf (accessed May 2011)
Creissels, Denis 2010 Fluid intransitivity in Romance languages: A typological approach. Archivio Glottologico Italiano 95(1): 117–151. www.deniscreissels.fr/public/Creisselsfluid_intransitivity.pdf (accessed May 2011)
Dasgupta, Probal 2003 Bangla. In: Cardona & Jain (eds.) 2003: 351–390.
Dasgupta, Probal, Alan J. Ford, and Rajendra Singh 2000 After etymology: Towards a substantivist linguistics. München: LINCOM.
Davison, Alice 1999 Ergativity: Functional and formal issues. In: Michael Darnell, Edith Moravcsik, Frederick Newmeyer, Michael Noonan, and Kathleen Wheatley (eds.), Functionalism and formalism in linguistics, Volume I: General papers, 177–208. Amsterdam/Philadelphia: Benjamins.
Davison, Alice 2001 Ergative case licensing in a split ergative language. In: Abbi et al. (eds.) 2001: 291–307.
Dayal, Veneeta, and Anoop Mahajan (eds.) 2004 Clause structure in South Asian languages. Dordrecht: Kluwer.
de Hoop, Helen, and Bhuvana Narasimhan 2008 Ergative case-marking in Hindi. In: Helen de Hoop and Peter de Swart (eds.), Differential subject marking, 63–78. Dordrecht: Springer.
Debrunner, Albert 1954 Altindische Grammatik, 2.2: Nominalsuffixe. Göttingen: Vandenhoeck & Ruprecht.
Debrunner, Albert, and Jakob Wackernagel 1930 Altindische Grammatik, 3: Nominalflexion, Zahlwort, Pronomen. Göttingen: Vandenhoeck & Ruprecht.
DeLancey, Scott 1989 Verb agreement in Proto-Tibeto-Burman. Bulletin of the School of Oriental and African Studies 52(2): 315–333. http://www.academia.edu/785247/Verb_agreement_in_Proto-Tibeto-Burman (accessed 9 May 2013)
DeLancey, Scott 1990 Ergativity and the cognitive model of event structure in Lhasa Tibetan. Cognitive Linguistics 1(3): 289–321.
DeLancey, Scott 2003 Lhasa Tibetan. In: Thurgood & LaPolla (eds.) 2003: 270–288.

DeLancey, Scott 2010 Towards a history of verb agreement in Tibeto-Burman. Himalayan Linguistics 9: 1–39.
DeLancey, Scott 2011a “Optional” “Ergativity” in Tibeto-Burman languages. Linguistics of the Tibeto-Burman Area 34(2): 9–20.
DeLancey, Scott 2011b Notes on verb agreement prefixes in Tibeto-Burman. Himalayan Linguistics 10: 1–29.
Deo, Ashwini, and Devyani Sharma 2006 Typological variation in the ergative morphology of Indo-Aryan languages. Linguistic Typology 10(3): 369–418.
Devi, Jayantimala 1986 Ergativity: A historical analysis in Assamese. University of Delhi PhD dissertation.
Dhongde, Ramesh Vaman, and Kashi Wali 2009 Marathi. Amsterdam/Philadelphia: Benjamins.
Diffloth, Gerard, and Norman Zide 1992 Austro-Asiatic languages. In: William Bright (editor-in-chief), International encyclopedia of linguistics, 137–142. New York: Oxford University Press.
Dixon, Robert M. W. 1994 Ergativity. Cambridge: Cambridge University Press.
Dokumentation Bedrohter Sprachen/Documentation of Endangered Languages n.d. Singpho/Tai/Tangsa. http://www.mpi.nl/DOBES/projects/singpho_tai_tangsa/ (accessed 9 May 2013)
Drocco, Andrea 2009 Bibliography on ergativity in Indo-Aryan. http://www.indoaryanlinguistics.com/allegati/Bibliography%20on%20ergativity%20in%20Indo-Aryan.pdf (accessed 9 May 2013)
Eaton, Robert D. 2008 Kangri in context: An areal perspective. University of Texas, Arlington, PhD dissertation.
Edelman, D. (Joy) I., and Leila R. Dodykhudoeva 2009 Shughni. In: Windfuhr (ed.) 2009: 787–824.
Emeneau, Murray B. 1984 Toda grammar and texts. Philadelphia: American Philosophical Society.
Evans, Nicholas, and Toshiki Osada 2005 Mundari: The myth of a language without word classes. Linguistic Typology 9: 351–390.
Farrell, Tim 1995 Fading ergativity? A study of ergativity in Balochi. In: David C. Bennett, Theodora Bynon, and B. George Hewitt (eds.), Subject, voice and ergativity, 218–243. London: School of Oriental and African Studies.
Foley, William A., and Robert D. Van Valin, Jr. 1984 Functional syntax and universal grammar. Cambridge: Cambridge University Press.

Ford, Alan, Rajendra Singh, and Gita Martohardjono 1997 Pace Panini: Towards a word-based theory of morphology. New York: Peter Lang.
Gair, James W., and John C. Paolillo 1997 Sinhala. München: LINCOM.
Gair, James W., and W. S. Karunatillake 1974 Literary Sinhala. Ithaca, NY: South Asia Program/Department of Modern Languages and Linguistics, Cornell University.
Garman, Michael 1986 An approach to Dravidian derivational morphology. International Journal of Dravidian Linguistics, Working Papers in Linguistics 2(1): 47–67.
Genetti, Carol 1988 A syntactic correlate of topicality in Newari narrative. In: John Haiman and Sandra A. Thompson (eds.), Clause combining in grammar and discourse, 29–48. Amsterdam/Philadelphia: Benjamins.
Genetti, Carol 1992 Semantic and grammatical categories of relative clause morphology in the languages of Nepal. Studies in Language 16(2): 405–427.
Genetti, Carol 1994 A descriptive and historical account of the Dolakha Newari dialect. Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa.
Genetti, Carol 2003 Dolakhā Newar. In: Thurgood & LaPolla (eds.) 2003: 355–370.
Genetti, Carol 2007 A grammar of Dolakha Newar. Berlin/New York: Mouton de Gruyter.
Genetti, Carol 2011 Nominalization in Tibeto-Burman languages of the Himalayan area: A typological perspective. In: Foong Ha Yap, Karen Grunow-Hårsta, and Janick Wrona (eds.), Nominalization in Asian languages: Diachronic and typological perspectives, 163–194. Amsterdam/Philadelphia: Benjamins.
Ghosh, Arun 2008 Santali. In: Anderson (ed.) 2008: 11–89.
Goswami, G. C., and Jyotiprakash Tamuli 2003 Asamiya. In: Cardona & Jain (eds.) 2003: 391–443.
Grierson, George Abraham 1896a On pronominal suffixes in the Kāçmīrī language. Journal of the Asiatic Society of Bengal 64(1): 336–351.
Grierson, George Abraham 1896b On the radical and participial tenses of the modern Indo-Aryan languages. Journal of the Asiatic Society of Bengal 64(1): 352–375.
Grierson, George Abraham 1908 [= Grierson ed. 1903–1928, vol. 9.2]
Grierson, George Abraham 1916 [= Grierson ed. 1903–1928, vol. 9.1]
Grierson, George Abraham 1919a [= Grierson ed. 1903–1928, vol. 8.1]

Grierson, George Abraham 1919b [= Grierson ed. 1903–1928, vol. 8.2]
Grierson, George Abraham (ed.) 1903–1928 Linguistic survey of India, 11 volumes in 20. Calcutta: Office of the Superintendent of Government Printing. Repr. 1967, Delhi: Motilal Banarsidass.
Griffiths, Arlo 2008 Gutob. In: Anderson (ed.) 2008: 633–681.
Haig, Geoffrey 2008 Alignment change in Iranian languages: A Construction Grammar approach. Berlin/New York: Mouton de Gruyter.
Hargreaves, David 2003 Kathmandu Newar (Nepālī Bhāśā). In: Thurgood & LaPolla (eds.) 2003: 371–384.
Henadeerage, Deepthi Kumara 2002 Topics in Sinhala syntax. The Australian National University PhD dissertation.
Hinüber, Oskar von 2001 Das ältere Mittelindisch im Überblick. 2nd rev. ed. Wien: Österreichische Akademie der Wissenschaften.
Hock, Hans Henrich 1985 Transitivity as a gradient feature? Evidence from Indo-Aryan, especially Sanskrit and Hindi. In: A. Zide et al. (eds.) 1985: 247–263.
Hock, Hans Henrich 1986 “P-Oriented” constructions in Sanskrit. In: Krishnamurti et al. (eds.) 1986: 15–26.
Hock, Hans Henrich 1993 Review of Abbi 1992. Studies in the Linguistic Sciences 23(1): 169–192.
Hock, Hans Henrich 2002 Vedic éta … stávāma: Subordinate, coordinate, or what? In: Mark R. V. Southern (ed.), Journal of Indo-European Studies Monograph 43: 89–102. Washington, DC: Institute for the Study of Man.
Hock, Hans Henrich 2007 South Asia and Turkic: The Central Asian connection? In: Masica (ed.) 2007: 65–90. Delhi: Motilal Banarsidass.
Hock, Hans Henrich 2009 Stämme oder Wurzeln im Sanskrit? Primäre vs. sekundäre Verbalstammbildung und das Kausativ. In: Jost Gippert (ed.), International Conference on Morphology and Digitisation, 63–80 (= Ústav srovnávací jazykovědy – Chatreššar 2009). Prague.
Hock, Hans Henrich 2012 Issues in Sanskrit agreement. In: Klein & Yoshida (eds.) 2012: 49–58.
Hock, Hans Henrich 2015 Some issues in Sanskrit syntax. In: Peter M. Scharf (ed.), Sanskrit syntax: Selected papers presented at the seminar on Sanskrit syntax and discourse structures, 13–15 June 2013, Université Paris Diderot, 1–52. Providence, RI: The Sanskrit Library.

Hoffmann, John 1903 Mundari grammar. Calcutta: The Secretariat Press.
Hoffmann, John, and Arthur van Emelen 1930–1979 Encyclopedia Mundarica, 16 volumes. Patna: Government Printing.
Hook, Peter Edwin 1974 The compound verb in Hindi. Ann Arbor: Center for South and Southeast Asian Studies, University of Michigan.
Hook, Peter Edwin 1982 South Asia as a semantic area: Forms, meanings and their connections. In: Studies in South Asian languages and linguistics, 30–41. Special issue of South Asian Review 6(3), edited by P. J. Mistry.
Hook, Peter Edwin 1990 A note on expressions of involuntary experience in the Shina of Skardu. Bulletin of the School of Oriental and African Studies 53(1): 77–82.
Hook, Peter Edwin 1996 Kesar of Layul: A Central Asian epic in the Shina of Gultari. In: William Hanaway and Wilma Heston (eds.), Studies in Pakistani popular culture, 212–186. (Final report of the University of Pennsylvania/Lok Virsa Multi-Disciplinary Study of Pakistan Culture.) Islamabad/Lahore: Lok Virsa/Sang-e-Meel Publications.
Hook, Peter Edwin, and Omkar N. Koul 1984 Pronominal suffixes and split ergativity in Kashmiri. In: Koul & Hook (eds.) 1984: 123–135.
Hook, Peter Edwin, and Omkar N. Koul 2004 Case as agreement: Ergative in Eastern Shina, dative in Kashmiri and Poguli, and labile subjects in Kashmiri and Gujarati intransitive inceptives. In: Bhaskararao & Subbarao (eds.) 2004, vol. 1: 213–225.
Hook, Peter Edwin, with Omkar N. Koul 1997 Fluid ergativity in Gujarati and Kashmiri inceptives. In: Kora Singer, Randall Eggert, and Gregory Anderson (eds.), Proceedings from the panels of the Chicago Linguistic Society’s thirty-third meeting, April 17–19, 1997, 163–174. Chicago: Chicago Linguistic Society.
Hook, Peter Edwin, Omkar N. Koul, and Ashok Kumar Koul 1987 Differential S-marking in Marathi, Hindi-Urdu, Kashmiri. In: Barbara Need, Eric Schiller, and Anna Bosch (eds.), Papers from the 23rd annual regional meeting of the Chicago Linguistic Society. Part one: The general session, 148–165. Chicago: Chicago Linguistic Society.
Huet, Gérard, Amba Kulkarni, and Peter Scharf (eds.) 2009 Sanskrit computational linguistics. Berlin/Heidelberg: Springer.
Israel, M. 1964 The finite verb of the “ceyyum” pattern in Tamil. Indian Linguistics 25: 179–181.
Israel, M. 1973 Treatment of morphology in Tolkāppiyam. Madurai: Madurai University.
Jacques, Guillaume 2012a Agreement morphology: The case of Rgyalrongic and Kiranti. Language and Linguistics 13(1): 83–116.

Jacques, Guillaume 2012b An internal reconstruction of Tibetan stem alternations. Transactions of the Philological Society 110(2): 212–224.
Jahani, Carina 2003 The case system in Iranian Balochi in a contact linguistic perspective. In: Carina Jahani & Agnes Korn (eds.), The Baloch and their neighbours: Ethnic and linguistic contact in Balochistan in historical and modern times, 113–132. Wiesbaden: Reichert.
Jahani, Carina, and Agnes Korn 2009 Balochi. In: Windfuhr (ed.) 2009: 634–692.
Jamison, Stephanie W. 2000 Lurching towards ergativity: Expressions of agency in the Niya documents. Bulletin of the School of Oriental and African Studies 63(1): 64–80.
Jayaseelan, K. A. 2004 The possessor-experiencer dative in Malayalam. In: Bhaskararao & Subbarao (eds.) 2004, vol. 1: 227–244.
Kachru, Yamuna 1980 Toward a typology of compound verbs in South Asian languages. Studies in the Linguistic Sciences 10(1): 113–124.
Kachru, Yamuna 1982 Conjunct verbs in Hindi-Urdu and Persian. South Asian Review 6(3): 117–126.
Kachru, Yamuna, and Rajeshwari Pandharipande 1978 On ergativity in selected South Asian languages. Studies in the Linguistic Sciences 8(1): 111–127.
Katre, Sumitra M. 1987 Aṣṭādhyāyī of Pāṇini. Austin: University of Texas Press. Repr. 1989, Delhi: Motilal Banarsidass.
Kaye, Alan S. (ed.) 2007 Morphologies of Asia and Africa. Winona Lake: Eisenbrauns.
Khokhlova, Liudmila V. 1992 Trends in the development of ergativity in New Indo-Aryan. Osmania Papers in Linguistics 18: 71–89.
Khokhlova, Liudmila V. 2001 Ergativity attrition in the history of Western New Indo-Aryan languages. In: Bhaskararao & Subbarao (eds.) 2001: 159–184.
Khokhlova, Liudmila V. 2002 Syntactic peculiarities of Rajasthani. 17th European Conference on Modern South Asian Studies, Heidelberg. http://serv.iaas.msu.ru/pub_on/khokhlova/syntactic%20perculiarities%20of%20rajasthani.pdf (accessed 9 May 2013)
Khubchandani, Lachman M. 2003 Sindhi. In: Cardona & Jain (eds.) 2003: 622–658.
Kidwai, Ayesha 2005 Santali ‘Backernagel’ clitics: Distributing clitic doubling. In: Rajendra Singh (ed.), The yearbook of South Asian languages and linguistics (2005), 189–207. Berlin/New York: Mouton de Gruyter.
Kiparsky, Paul 2009 On the architecture of Pāṇini’s grammar. In: Huet et al. (eds.) 2009: 33–94.

Klaiman, Miriam H. 1978 Arguments against a passive origin of the IA Ergative. In: Donka Farkas, Wesley M. Jacobsen, and Karol W. Todrys (eds.), Papers from the fourteenth regional meeting of the Chicago Linguistic Society, 204–216. Chicago: Chicago Linguistic Society.
Klaiman, Miriam H. 1987 Mechanisms of ergativity in South Asia. Lingua 71: 61–102.
Klein, Jared S., and Kazuhiko Yoshida (eds.) 2012 Indic across the millennia: From the Rigveda to Modern Indo-Aryan. (14th World Sanskrit Conference, Kyoto, Japan, September 1st–5th, 2009, Proceedings of the Linguistic Section.) Bremen: Hempen.
Kobayashi, Masato, and Ganesh Murmu 2008 Keraʔ Mundari. In: Anderson (ed.) 2008: 165–194.
Korn, Agnes 2008 Marking of arguments in Balochi ergative and mixed constructions. In: Simin Karimi, Vida Samiian, and Donald Stilo (eds.), Aspects of Iranian linguistics, 249–276. Newcastle upon Tyne: Cambridge Scholars Publishing.
Korn, Agnes 2009 The ergative system in Balochi from a typological perspective. Iranian Journal of Applied Language Studies 1(1): 43–79. http://titus.uni-Frankfurt.de/personal/agnes/ergativ.pdf (accessed April 2011)
Koul, Omkar Nath 2003 Kashmiri. In: Cardona & Jain (eds.) 2003: 895–952.
Koul, Omkar Nath, and Peter E. Hook (eds.) 1984 Aspects of Kashmiri linguistics. New Delhi: Bahri.
Krause, Wolfgang, and Werner Thomas 1960 Tocharisches Elementarbuch, 1. Heidelberg: Winter.
Krishnamurti, Bhadriraju 1961 Telugu verbal bases: A comparative and descriptive study. Berkeley/Los Angeles: University of California Press. Repr. 1972, Delhi: Motilal Banarsidass.
Krishnamurti, Bhadriraju 1998 Telugu. In: Steever (ed.) 1998: 202–240.
Krishnamurti, Bhadriraju 2003 The Dravidian languages: A comparative, historical and typological study. Cambridge: Cambridge University Press.
Krishnamurti, Bhadriraju, Colin P. Masica, and Anjani Sinha (eds.) 1986 South Asian languages: Structure, convergence, and diglossia. Delhi: Motilal Banarsidass.
Kulikov, Leonid 2010 Nominal composition, noun incorporation and non-finite formations in Sanskrit: Delimiting the boundaries of the verbal paradigm. In: Klaus Karttunen (ed.), Anantaṁ śāstram: Indological and linguistic studies in honour of Bertil Tikkanen, 111–131. Helsinki: Finnish Oriental Society.
Kümmel, Martin Joachim 2013 Zur Frage “zweistöckiger” Kasussysteme. In: Martin Lachout (ed.), Aktuelle Tendenzen der Sprachwissenschaft: Ausgewählte Beiträge zu den GeSuS-Linguistiktagen an der Metropolitan Universität Prag, 26.–28. Mai 2011, 205–216. Hamburg: Dr. Kovač.

Kümmel, Martin Joachim In Press The morphology of Indo-Iranian. In: Jared S. Klein, Brian D. Joseph, and Matthias Fritz (eds.), Comparative Indo-European linguistics: An international handbook of language comparison and the reconstruction of Indo-European. Berlin/New York: Mouton de Gruyter.
LaPolla, Randy J. 1992 On the dating and nature of verb agreement in Tibeto-Burman. Bulletin of the School of Oriental and African Studies 55(2): 298–315. http://v2.linguistlist.org/~lapolla/rjlapolla/papers/LaPolla-Dating_and_Nature_of_Verb_Agreement_in_TB.pdf (accessed 9 May 2013)
LaPolla, Randy J. 1995 ‘Ergative’ marking in Tibeto-Burman. In: Yoshio Nishi, James A. Matisoff, and Yasuhiko Nagano (eds.), New horizons in Tibeto-Burman morphosyntax, 189–228. Osaka: National Museum of Ethnology.
LaPolla, Randy J. 2001 The role of migration and language contact in the development of the Sino-Tibetan language family. In: R. M. W. Dixon and A. Y. Aikhenvald (eds.), Areal diffusion and genetic inheritance: Case studies in language change, 225–254. Oxford: University Press.
LaPolla, Randy J. 2003 Overview of Tibeto-Burman morphosyntax. In: Thurgood & LaPolla (eds.) 2003: 22–42.
Lehmann, Thomas 1989 A grammar of Modern Tamil. Pondicherry: Pondicherry Institute of Linguistics and Culture.
Lehmann, Thomas 1994 Grammatik des Alttamil unter besonderer Berücksichtigung der Caṅkam-Texte des Dichters Kapilar. Stuttgart: Franz Steiner Verlag.
Lehmann, Thomas 1998 Old Tamil. In: Steever (ed.) 1998: 75–99.
Li, Chao 2007 Split ergativity and split intransitivity in Nepali. Lingua 117: 1462–1482.
Liljegren, Henrik 2008 Towards a grammatical description of Palula: An Indo-Aryan language of the Hindukush. University of Stockholm PhD dissertation. su.diva-portal.org/smash/get/diva2:198468/FULLTEXT01 (accessed 25 November 2014)
Lorimer, D. L. R. 1935–1938 The Burushaski language, 3 vols. Oslo: Instituttet for Sammenlignende Kulturforskning.
Macdonell, Arthur Anthony 1916 A Vedic grammar for students. Oxford: University Press. Repr. 1953, Bombay/Calcutta/Madras: Oxford University Press.
Magier, David S. 1983a Components of ergativity in Marwari. In: Amy Chukerman, Mitchell Marks, and John F. Richardson (eds.), Papers from the Nineteenth Regional Meeting of the Chicago Linguistic Society, 244–255. Chicago: Chicago Linguistic Society.

Magier, David S. 1983b Topics in the grammar of Marwari. UCLA PhD dissertation.
Magier, David S. 1987 The transitivity prototype: Evidence from Hindi. Word 38(3): 187–199.
Magier, David S. 1990 Dative/accusative subjects in Marwari. In: Verma & Mohanan (eds.) 1990: 213–220.
Magier, David S. 1999 The transitivity prototype and Hindi ko. In: Omkar Nath Koul (ed.), Topics in Hindi linguistics, 76–82. New Delhi: Bahri.
Mahajan, Anoop Kumar 1987 Notes on wh questions in Hindi. South Asian Languages Analysis 1987, Cornell University.
Mahajan, Anoop Kumar 1990 The A/A-bar distinction and movement theory. MIT PhD dissertation. (Distributed by MIT Working Papers in Linguistics.)
Mahto, Panchanan 1989 On the nature of empty pronominals. Hyderabad: Central Institute of English and Foreign Languages PhD dissertation.
Masica, Colin P. 1976 Defining a linguistic area. Chicago/London: The University of Chicago Press.
Masica, Colin P. 1982 Identified object marking in Hindi and other languages. In: Omkar Nath Koul (ed.), Topics in Hindi linguistics 2, 16–50. New Delhi: Bahri.
Masica, Colin P. 1990 Varied case marking in obligational constructions. In: Verma & Mohanan (eds.) 1990: 335–342.
Masica, Colin P. 1991 The Indo-Aryan languages. Cambridge: Cambridge University Press.
Masica, Colin P. (ed.) 2007 Old and new perspectives on South Asian languages: Grammar and semantics. Delhi: Motilal Banarsidass.
Maspero, H. 1948 Notes sur la morphologie du Tibeto-birman et du Munda. Bulletin de la Société de Linguistique de Paris 44: 155–185.
Master, Alfred 1964 A grammar of Old Marathi. Oxford: Clarendon Press.
McGregor, William B. 2010 Optional ergative case marking systems in a typological-semiotic perspective. Lingua 120: 1610–1636.
Michailovsky, Boyd 1997 Catégories verbales et intransitivité duale en limbu. Studi Italiani di Linguistica Teorica e Applicata 26: 307–325.
Miltner, Vladimir 1965 From OIA Passive to NIA Active. Asian and African Studies 1: 143–146.
Mistry, P. J. 2007 Gujarati morphology. In: Kaye (ed.) 2007: 825–854.

Mohanan, Tara 1994 Argument structure in Hindi. Stanford: CSLI. (Stanford University PhD dissertation, 1990.)
Montaut, Annie 2004 Oblique main arguments in Hindi as localizing predications. In: Bhaskararao & Subbarao (eds.) 2004, vol. 1: 33–56.
Montaut, Annie 2009 Ergative and pre-ergative patterns in Indo-Aryan as predications of localization: A diachronic view of past and future systems. In: Ali R. Fatihi (ed.), Language vitality in South Asia, 295–325. Aligarh: Dept. of Linguistics, Aligarh Muslim University. http://halshs.archives-ouvertes.fr/docs/00/54/98/74/PDF/diachrony_of_ergative_futureJr.pdf (accessed 9 May 2013)
Morey, Stephen 2012 The Singpho agentive: Functions and meanings. Linguistics of the Tibeto-Burman Area 35(1): 1–14.
Morgenstierne, Georg 1949 The language of the Prasun Kafirs. Norsk Tidsskrift for Sprogvidenskap 15: 186–334.
Morgenstierne, Georg 1973 Indo-Iranian frontier languages, Vol. III, The Pashai language, Part I, Grammar (2nd ed.). Oslo: Universitetsforlaget.
Murugaiyan, A., and Christiane Pilot-Raichoor 2004 Les prédications indifférenciées en dravidien: témoins d’une évolution typologique archaïque. In: Jacques François and Irmtraud Behr (eds.), Les constituants prédicatifs et la diversité des langues, 155–177. Louvain: Peeters.
Neukom, Lukas 2000 Argument marking in Santali. Mon-Khmer Studies 30: 95–113.
Neukom, Lukas, and Manideepa Patnaik 2003 A grammar of Oriya. Zürich: Seminar für Allgemeine Sprachwissenschaft der Universität Zürich.
Nishi, Yoshio, James A. Matisoff, and Yasuhiko Nagano 1995 New horizons in Tibeto-Burman morphosyntax. Osaka: National Museum of Ethnology.
Noonan, Michael 2003a Chantyal. In: Thurgood & LaPolla (eds.) 2003: 315–335.
Noonan, Michael 2003b Nar-Phu. In: Thurgood & LaPolla (eds.) 2003: 336–352.
Oberlies, Thomas 2001 Pāli: A grammar of the language of the Theravāda Tipiṭaka. Berlin/New York: de Gruyter.
Oguibénine, Boris 2006 Notes on the instrumental case of the subject/agent vs. other cases in Buddhist Sanskrit. In: Bertil Tikkanen and Heinrich Hettrich (eds.), Themes and tasks in Old and Middle Indo-Aryan linguistics, 89–119. Delhi: Motilal Banarsidass.
Osada, Toshiki 2008 Mundari. In: Anderson (ed.) 2008: 99–164.

Palancar, Enrique 2002 The origin of agent markers. (Studia Typologica 5). Berlin: Akademie Verlag.
Pandharipande, Rajeshwari 1997 Marathi. London/New York: Routledge.
Pandharipande, Rajeshwari 2003 Marathi. In: Cardona & Jain (eds.) 2003: 698–728.
Paramasivam, K. 1979 Effectivity and causativity in Tamil. Trivandrum: Dravidian Linguistics Association.
Patnaik, Manideepa 2008 Juang. In: Anderson (ed.) 2008: 508–556.
Paudyal, Netra Prasad 2008 Agreement patterns in Darai: Typological study. Nepalese Linguistics 23: 186–207. http://www.academia.edu/852830/Agreement_Patterns_in_Darai_typological_Study (accessed 9 May 2013)
Payne, John R. 1980 The decay of ergativity in Pamir languages. Lingua 51: 147–186.
Peterson, David A. 2011 Core participant marking in Khumi. Linguistics of the Tibeto-Burman Area 34(2): 73–100.
Peterson, John M. 1998 Grammatical relations in Pali and the emergence of ergativity in Indo-Aryan. München: LINCOM.
Peterson, John M. 2007 Languages without nouns and verbs? An alternative to lexical classes in Kharia. In: Masica (ed.) 2007: 274–303.
Peterson, John M. 2008 Kharia. In: Anderson (ed.) 2008: 434–507.
Peterson, John M. 2011 Aspects of Kharia grammar: A Role and Reference Grammar approach. In: Rajendra Singh & Ghanshyam Sharma (eds.), Annual review of South Asian languages and linguistics 2011, 81–124. Berlin/New York: Mouton de Gruyter.
Pilot-Raichoor, Christiane 2012 Tamil Brahmi inscriptions: A critical landmark in the history of the Dravidian languages. In: Appasamy Murugaiyan (ed.), New dimensions in Tamil epigraphy, 285–315. Chennai: CreA publishers.
Pinnow, Heinz-Jürgen 1966 A comparative study of the verb in the Munda languages. In: N. Zide (ed.) 1966: 96–193.
Poudel, Tikaram 2007 Ergativity and stage/individual level predications in Manipuri. Stuttgart, November 2007. http://ling.uni-konstanz.de/pages/home/tafseer/manipuristuttgart.pdf (accessed April 2011)
Poudel, Tikaram 2008a Nepal ergativity: A historical perspective. Workshop on Case and Alignment in Indo-European, University of Bergen, 10–11 December 2008. http://ling.uni-konstanz.de/pages/home/tafseer/bergen_erg.pdf (accessed April 2011)

Poudel, Tikaram 2008b The evolution of the ergative in Nepali. 14th Himalayan Languages Symposium, University of Gothenburg, 21–23 August 2008. http://ling.uni-konstanz.de/pages/home/tafseer/nepali_erg.pdf (accessed April 2011)
Poudel, Tikaram 2008c Ergative/nominative alternations in Manipuri intransitive constructions. Workshop on Transitivity and Case Alternations, University of Stuttgart, January 14–15, 2008. http://independent.academia.edu/Tikarampoudel/Papers (accessed April 2011)
Pray, Bruce R. 1976 From passive to ergative in Indo-Aryan. In: Verma (ed.) 1976: 195–211.
Rajam, V. S. 1992 A reference grammar of classical Tamil poetry (150 BC – pre-fifth/sixth century AD). Philadelphia: American Philosophical Society.
Rakesh, Nilu, and Rajesh Kumar 2013 Agreement in Magahi complex predicate. International Journal of Linguistics 5(1): 176–190. http://www.macrothink.org/journal/index.php/ijl/index (accessed 9 May 2013)
Ramaswami, N. 1982 Brokskat grammar. Mysore: Central Institute of Indian Languages.
Rao, Goparaju Sambasiva 1991 A comparative study of Dravidian noun derivatives. New Delhi: Bahri.
Ray, Tapas S. 2003 Oriya. In: Cardona & Jain (eds.) 2003: 444–476.
Renou, Louis 1984 Grammaire sanscrite. Paris: Maisonneuve.
Riccardi, Theodore 2003 Nepali. In: Cardona & Jain (eds.) 2003: 538–580.
Roberts, Taylor 2001 Split-agreement and ergativity in Pashto. Originally published in Kurdica 5(3), now at: http://www.yorku.ca/twainweb/troberts/ling/roberts-2001–kurdica5–3.html (accessed May 2011)
Saksena, Anuradha 1980 The affected agent. Language 56(4): 812–826.
Saksena, Baburam 1937 Evolution of Awadhi. Allahabad: Indian Press. Repr. 1971, Delhi: Motilal Banarsidass.
Saxena, Anju 1991 Pathways of the development of the ergative in Central Tibetan. Linguistics of the Tibeto-Burman Area 14(1): 109–118.
Saxena, Anju 1992 Finite verb morphology in Tibeto-Kinnauri. University of Oregon PhD dissertation.
Saxena, Anju 1997 Aspect and evidential morphology in Standard Lhasa Tibetan: A diachronic study. Cahiers de Linguistique – Asie Orientale 26(2): 282–306.

Saxena, Anju 2000 Diverging sources of new aspect morphology in Tibeto-Kinnauri: External motivation or internal development. In: John Charles Smith and Delia Bentley (eds.), Historical linguistics 1995, 1: General issues and non-Germanic languages, 361–375. Amsterdam/Philadelphia: Benjamins.
Saxena, Anju 2009 Optional ergative as a discourse marker in Kinnauri and other Himalayan languages. Unpublished MS.
Schiffman, Harold 1999 A reference grammar of Spoken Tamil. Cambridge: Cambridge University Press.
Schmidt, Ruth Laila 2003 Urdu. In: Cardona & Jain (eds.) 2003: 286–350.
Schmidt, Ruth Laila 2004 Compound verbs in the Shina of Kohistan. Acta Orientalia 65: 19–31.
Schmidt, Ruth Laila, and Vijay Kaul 2010 A grammatical sketch of Guresi Shina. In: Klaus Karttunen (ed.), Anantaṁ śāstram: Indological and linguistic studies in honour of Bertil Tikkanen, 195–214. Helsinki: Finnish Oriental Society.
Sen, Sukumar 1960 A comparative grammar of Middle Indo-Aryan. Poona: Deccan College.
Shackle, Christopher 1976 The Siraiki language of central Pakistan: A reference grammar. London: School of Oriental and African Studies.
Shackle, Christopher 2003 Panjabi. In: Cardona & Jain (eds.) 2003: 581–621.
Shapiro, Michael C. 1976 The analysis of Hindi morphologically related verb sets. Indian Linguistics 37(1): 1–44.
Shapiro, Michael C. 2003 Hindi. In: Cardona & Jain (eds.) 2003: 250–285.
Sharma, D. D. 1994 A comparative grammar of Tibeto-Himalayan languages (of Himachal Pradesh & Uttarakhand). New Delhi: Mittal Publications.
Sharma, Devyani 2001 Kashmiri case clitics and person hierarchy effects. In: Peter Sells (ed.), Formal and empirical issues in Optimality Theoretic syntax, 225–256. Stanford: CSLI. http://web.mit.edu/pritty/www/ergativity/documents/sharma2001.pdf (accessed 9 May 2013)
Shukla, Shaligram 2001 Hindi morphology. München: LINCOM.
Sigorskiy, Alexander A. 2007 Case, split nominativity, split ergativity and split accusativity in Hindi: A historical perspective. In: Masica (ed.) 2007: 34–61.
Singh, Mona 1994 Perfectivity, definiteness, and specificity: A classification of verbal predicates in Hindi. University of Texas, Austin, PhD dissertation.


Bibliographical references

Singh, Rajendra, and Ramakant Agnihotri 1997 Hindi morphology: A word based description. Delhi: Motilal Banarsidass.
Singh, Rajendra, Stanley Starosta, and Sylvain Neuvel (eds.) 2003 Explorations in Seamless Morphology. New Delhi: Sage.
Sjoberg, Andrée F. 2001 Convergence and resistance to morphological change in agglutinative languages of South and Central Asia. In: Bhaskararao & Subbarao (eds.) 2001: 369–390.
Skalmowski, Wojciech 1974 Transitive verb constructions in the Pamir and Dardic languages. Studia Indoeuropejskie, 205–212. (Prace Komisji Językoznawstwa 37.) Krakow: Polska Akademia Nauk.
Slade, Benjamin 2013 The diachrony of light and auxiliary verbs: Evidence from Indo-Aryan. Diachronica 30(4): 531–578.
Sridhar, S. N. 1990 Kannada: A descriptive grammar. London/New York: Routledge.
Starosta, Stanley 1985 Relator nouns as a source of case inflection. In: Veneeta Z. Acson and Richard L. Leed (eds.), For Gordon H. Fairbanks, 111–133. Honolulu: University of Hawaii Press.
Steever, Sanford B. 1986 Morphological convergence in the Khondmals: (Pro)nominal incorporation. In: Bh. Krishnamurti et al. (eds.) 1986: 270–285.
Steever, Sanford B. 1988 The serial verb formation in the Dravidian languages. Delhi: Motilal Banarsidass.
Steever, Sanford B. 1993 Analysis to synthesis: The development of complex verb morphology in the Dravidian languages. New York: Oxford University Press.
Steever, Sanford B. 1998 Introduction to the Dravidian languages. In: Steever (ed.) 1998: 1–39.
Steever, Sanford B. (ed.) 1998 The Dravidian languages. London/New York: Routledge.
Stroński, Krzysztof 2009a Variation of ergativity patterns in Indo-Aryan. Poznań Studies in Contemporary Linguistics 45(3): 237–253. http://versita.metapress.com/content/n562t273u6468267/ (accessed April 2013)
Stroński, Krzysztof 2009b Approaches to ergativity in Indo-Aryan. Lingua Posnaniensis 51: 77–118.
Stroński, Krzysztof 2010 Variation of ergativity patterns in Indo-Aryan. Poznań Studies in Contemporary Linguistics 46: 237–253.
Stump, Gregory 1983 The elimination of ergative patterns of case-marking and verbal agreement in Modern Indic languages. Ohio State University Working Papers in Linguistics 27: 140–164.


Stump, Gregory T., and Ramawatar Yadav 1988 Maithili verb agreement and the control agreement principle. In: Diane Brentari, Gary Laron, and Lynn MacLeod (eds.), Papers from the parasession on agreement in grammatical theory, 304–321. Chicago: Chicago Linguistic Society.
Subbarao, Karumuri V. 2001 Agreement in South Asian languages and Minimalist inquiries: The framework. In: Bhaskararao & Subbarao (eds.) 2001: 457–492.
Subbarao, Karumuri V. 2012 South Asian languages: A syntactic typology. Cambridge: Cambridge University Press.
Subrahmanyam, P. S. 1964 Two problems in Parji verb forms. Indian Linguistics 25: 47–55.
Subrahmanyam, P. S. 1971 Dravidian verb morphology: A comparative study. Annamalainagar: Annamalainagar University.
Suvarchala, B. 1992 Central Dravidian comparative morphology. New Delhi: Navrang.
Tagare, Ganesh Vasudev 1987 A historical grammar of Apabhraṁśa. Poona: Deccan College.
Tegey, Habibullah 1979 Ergativity in Pushto (Afghani). In: Irmengard Rauch and Gerald F. Carr (eds.), Linguistic method: Essays in honor of Herbert Penzl, 369–418. The Hague: Mouton.
Teo, Amos 2012 Sumi agentive and topic markers: no and ye. Linguistics of the Tibeto-Burman Area 35(1): 49–74.
Thompson, Hanne-Ruth 2012 Bengali. Amsterdam/Philadelphia: Benjamins.
Thumb, Albert, and Richard Hauschild 1959 Handbuch des Sanskrit II: Formenlehre. Heidelberg: Winter.
Thurgood, Graham, and Randy LaPolla (eds.) 2003 The Sino-Tibetan languages. London/New York: Routledge.
Tiffou, Étienne 1977 L’effacement de l’ergatif en Bourouchaski. Studia Linguistica 31(1): 18–37.
Tiffou, Étienne, and Yves-Charles Morin 1982 A note on split ergativity in Burushaski. Bulletin of the School of Oriental and African Studies 45: 88–94.
Tirumalesh, K. V. 1997 Kannada prefixation. In: M. Hariprasad et al. (eds.), Phases and interfaces of morphology, 74–84. Hyderabad: Central Institute of English and Foreign Languages.
Tournadre, Nicolas 1991 The rhetorical use of the Tibetan ergative. Linguistics of the Tibeto-Burman Area 14: 93–107.
Tournadre, Nicolas 1996 L’ergativité en Tibétain. Approche morphosyntaxique de la langue parlée. (Bibliothèque de l’information grammaticale 33). Paris/Leuven: Peeters.


Tournadre, Nicolas 2010 The Classical Tibetan cases and their transcategoriality: From sacred grammar to modern linguistics. Himalayan Linguistics 9(2): 87–125. http://www.nicolas-tournadre.net/wp-content/uploads/2014/07/2010-HLGrammar.pdf (accessed 30 November 2014)
Tuebingen University 2011 Project B11: Semantic roles, case relations, and cross-clausal reference in Tibetan. http://www.sfb441.uni-tuebingen.de/b11/b11fieldwork.html#clauseTypes (accessed May 2011)
Tuite, Kevin J., Asif Agha, and Randolph Graczyk 1985 Agentivity, transitivity, and the question of active typology. In: William H. Eilfort, Paul D. Kroeber, and Karen L. Peterson (eds.), Papers from the parasession on causatives and agentivity at the twenty-first regional meeting of the Chicago Linguistic Society, Part 2, 252–270. Chicago: Chicago Linguistic Society.
van Driem, George 1987 A grammar of Limbu. Berlin/New York: Mouton de Gruyter.
van Driem, George 1997 A grammar of Duma. Berlin/New York: Mouton de Gruyter.
Vasu, Srisa Chandra 1897 The Ashtádhyáyí of Páṇini. Benares: Sindhu Charan Bose. Repr. 1962, Delhi: Motilal Banarsidass.
Verbeke, Saartje 2013 Alignment and ergativity in New Indo-Aryan languages. Berlin/New York: de Gruyter Mouton.
Verbeke, Saartje, and Klaas Willems 2012 Ergativity in Modern and Middle Indo-Aryan: A critical digest. In: Klein & Yoshida (eds.) 2012: 209–226.
Verma, Manindra K. 2003 Bhojpuri. In: Cardona & Jain (eds.) 2003: 515–537.
Verma, Manindra K. (ed.) 1976 The notion of subject in South Asian languages. (South Asian Studies 2). Madison: University of Wisconsin, Dept. of South Asian Studies.
Verma, Manindra K. (ed.) 1993 Complex predicates in South Asian languages. Delhi: Manohar Publishers.
Verma, Manindra K., and K. P. Mohanan (eds.) 1990 Experiencer subjects in South Asian languages. Stanford: CSLI.
Verma, Sheela 2003 Magahi. In: Cardona & Jain (eds.) 2003: 498–514.
Vijayakrishnan, K. G. 1994 Compound typology in Tamil. In: Butt et al. (eds.) 1994: 263–278.
Vollmann, Ralf 2008 Descriptions of Tibetan ergativity: A historiographical account. (Grazer Vergleichende Arbeiten, 23.) Graz: Leykam.
Vollmann, Ralf 2010 Optional ergative case marking in Tibetan. Online publication. http://www.uni-graz.at/~vollmanr/pubs/VR2010A_opt_erg_12.pdf (accessed April 2013)


Wackernagel, Jacob 1905 Altindische Grammatik, 2.1: Einleitung zur Wortlehre, Nominalkomposition. Göttingen: Vandenhoeck & Ruprecht.
Wali, Kashi, and Omkar N. Koul 1997 Kashmiri: A cognitive-descriptive grammar. London/New York: Routledge. Reprinted 2010.
Wallace, William D. 1982 The evolution of ergative syntax in Nepali. Studies in the Linguistic Sciences 12(2): 147–211.
Watters, David 2002 A grammar of Kham. Cambridge/New York: Cambridge University Press.
Weinreich, Matthias 2008 Two varieties of Domaakí. Zeitschrift der Deutschen Morgenländischen Gesellschaft 158(2): 299–316.
Wendtland, Antje 2008 On the ergativity in the Pamir languages. In: Simin Karimi, Vida Samiian, and Donald Stilo (eds.), Aspects of Iranian linguistics, 419–433. Newcastle upon Tyne: Cambridge Scholars Publishing.
Whitney, William Dwight 1889 Sanskrit grammar, 2nd ed. Cambridge, MA: Harvard University Press.
Wilde, Christopher P. 2008 A sketch of the phonology and grammar of Rājbanshi. University of Helsinki PhD dissertation.
Willis, Christina 2011 Optional case marking in Darma (Tibeto-Burman). Linguistics of the Tibeto-Burman Area 34(2): 101–132.
Willson, Stephen R. 1990 Verb agreement and case marking in Burushaski. University of North Dakota MA thesis.
Windfuhr, Gernot L. (ed.) 2009 The Iranian languages. London/New York: Routledge.
Wunderlich, Dieter 2012 Case and agreement variation in Indo-Aryan. http://www.zas.gwz-berlin.de/fileadmin/mitarbeiter/wunderlich/Case_and_agreement_variation_in_Indo-Aryan.pdf (accessed 8 May 2013)
Yadav, Ramawatar 1996 A reference grammar of Maithili. Berlin/New York: Mouton de Gruyter.
Yadava, Yogendra P. 1999 The complexity of Maithili verb agreement. In: Rajendra Singh (ed.), The yearbook of South Asian languages and linguistics 1999, 139–152. New Delhi: Sage.
Yadava, Yogendra, and Warren G. Glover (eds.) 1999 Topics in Nepalese linguistics. Kathmandu: Royal Nepal Academy.
Zakharyin, B. 1979 On the formation of ergativity in Indo-Aryan and Dardic. Osmania Papers in Linguistics 5: 50–71.


Zide, Arlene R. K., David Magier, and Eric Schiller (eds.) 1985 Proceedings of the Conference on Participant Roles: South Asia and Adjacent Areas. Bloomington: Indiana University Linguistics Club.
Zide, Norman H. 2008 Korku. In: Anderson (ed.) 2008: 356–298.
Zide, Norman H. (ed.) 1966 Studies in comparative Austroasiatic linguistics. The Hague: Mouton.
Zograph, G. A. 1976 Morfologičeskij stroj novyx indoarijskix jazykov. Moscow: Nauka.
Zvelebil, Kamil V. 1970 Comparative Dravidian morphology. The Hague/Paris: Mouton.
Zvelebil, Kamil V. 1977 A sketch of comparative Dravidian morphology (part I). The Hague: Mouton.

5 Syntax and semantics

Edited by Hans Henrich Hock1

5.1. Introduction

South Asian languages offer a number of syntactic phenomena that are of broad relevance for syntacticians in general, and some of these phenomena pose interesting challenges to syntactic theory. These phenomena prominently include “split ergativity” (characteristic of many of the Indo-Aryan and Iranian languages), Oblique Subjects and complex predicate structures (found throughout South Asia), and evidentiality (especially characteristic of the Himalayan and Northwestern languages, but also found elsewhere).

Except for Khasi and the Daic languages, which are generally SVO, the languages of South Asia are verb-final. (Kashmiri has V2 in main clauses and in clauses introduced by the complementizer ki/zi.) The correlation of SOV with finite and/or nonfinite subordination has been a topic of controversy. A more recent controversy concerns the question whether there is any difference between Oblique Subjects (defined in terms of their syntactic behavior) and Oblique Experiencers (which do not necessarily have the syntactic behavior of subjects).

These and other issues are discussed in the following sections. Section 5.2 deals with transformational-generative approaches, especially Minimalism, the dominant formal approach to syntax. Section 5.3 presents an overview of Cognitive Linguistic approaches to the syntax of South Asian languages, with references to work in Construction Grammar. Section 5.4 is devoted to morphosyntactic typological issues: Oblique Experiencers/Oblique Subjects; complex verbs; and finite vs. nonfinite subordination. (The phenomena discussed in this section relate to issues addressed in Chapter 4, and there is some natural overlap in presentation.) Section 5.5 is dedicated to the morphosemantic typology of evidentiality.

5.2. Formal syntax

Formal approaches to the syntax of South Asian languages have been dominated by the transformational generative framework (from the 1960s, through Government and Binding and Principles and Parameters, to Minimalism). This dominance is reflected in this section, although all of the contributions also refer to nontransformational generative work conducted in HPSG (Kopris & Davis 2005, Dost 2007 on Pashto) and in LFG (Mohanan 1994a, 1994b, 1995, Butt 1995 on Hindi-Urdu). In fact, on a range of issues there is a lively interchange between scholars working in one or another of these three generative frameworks. The LFG approach is especially vigorously presented in contributions to annual LFG Conferences, whose proceedings are edited by Miriam Butt and Tracy Holloway King (http://web.stanford.edu/group/cslipublications/cslipublications/site/ONLN.shtml), and in publications by Miriam Butt (with various coauthors), including Butt 1994, 2003, 2010, Butt, King & Maxwell 2003, Butt & Sadler 2003, Butt & Deo 2013. Note also Raza & Ahmed 2011. HPSG seems to be less well presented, but note e.g. Arsenault 2002, Poornima 2012, Poornima & Koenig 2008, Rau 2007. Beyond these formal approaches, there is a wealth of information that can be gleaned from traditional grammars, as well as linguistic surveys of the various languages and language families; see 5.2.1.3 below.

1 Many thanks to Elena Bashir and K. V. Subbarao for their help in developing and editing this chapter.

5.2.1. An overview of generative syntactic work and reference resources in South Asian languages

By Alice Davison

5.2.1.1. Introduction

This brief contribution presents an overview of generative syntactic work on South Asian languages and reference resources on syntax. My goal is to call attention to significant sources of data and analyses that have been conducted from a generative point of view, construing that term very broadly. I want to call attention to linguistic research on a wide variety of languages of South Asia, work which is useful for sources of data and ideas. I have omitted journal articles and dissertations, with a few exceptions from the early phase of generative work on South Asian languages, as well as purely pedagogical and reference grammars. Bhatt’s contribution (5.2.2) presents more specific coverage on the topics of scrambling and variation in phrasal order; local vs. long-distance agreement; and the various ways in which question scope can be expressed.

5.2.1.2. Generative studies since the 1960s

Analyses of South Asian languages inspired by Chomsky’s work began in the mid to late 1960s, responding to the transformational studies on English and a few other languages like Japanese. Many of these first applications of transformational grammar to South Asian languages were done at universities with a center for South Asian Studies as well as a Linguistics department, such as Chicago, Illinois, Pennsylvania, and Cornell. A lot of the work was not exactly published; it circulated in mimeographed or xeroxed form, or was published informally, and is now not easily accessible. Some was published in India in book form, and most of this work is readily available in many university libraries, through the Library of Congress purchasing program in India. A wealth of useful bibliographic references about descriptive and early generative work on Indo-Aryan languages can be found in Masica 1991, which also includes very perceptive descriptions of the phonology, morphology, and syntax of these languages.

The early generative work naturally looked for structures found in many other languages, such as case marking, non-finite sentence embedding, reflexive binding, questions, and relative clauses. But linguists were also noting structures for which there were no counterparts in (modern) English or other familiar languages, such as VV and VN combinations, ergative case, dative subjects, reduplication, in-situ questions, and correlative clauses.

Some of the very earliest general work was on Hindi (Kachru 1966, 1968). Bahl 1964 focused on the verb and its combinations in Hindi. This topic was pursued by Hook (1974), who offers a rich array of data and generalizations about “vector” verbs in combination with main verbs. Compounds formed by noun or adjective plus verb in Hindi are the subject of two early volumes which are still of great interest, Bahl 1974 and 1979. These volumes are all that was published of the results of a hand-done corpus study of Hindi verb collocations. They give naturally occurring examples of the verb karnā ‘do’ with adjectives and nouns, a rich source of lexical items in Hindi and Urdu. Entries include the case marking of subjects and logical objects, as well as the immediate surrounding context. The theme of complex predicates is expanded in a collection of articles on a variety of South Asian languages in Verma (ed.) 1993.
Topics in South Asian languages were the subject of dissertations and monographs fairly early on. Nadkarni (1970) defined island effects in relative clauses in Kannada and Konkani. Annamalai (1969/1997) explored the semantic and syntactic restrictions on participial relative clauses in Tamil. Subbarao (1974/1984) provided transformational rules for Hindi finite and non-finite embedded clauses. Abbi (1974) studied clausal reduplication in Hindi with like subjects. Kachru (1968) showed that dative experiencers count as the antecedent of subject-oriented reflexives, while Davison (1969) showed that Hindi dative subjects are controllers of the null subject of the conjunctive participle, a common construction in South Asian languages. Verma (ed.) 1976 is an extremely interesting collection of papers on how subjects are defined grammatically in modern South Asian languages and in Sanskrit.

There has been a trend to start from analyses of a single language, usually Hindi or a Dravidian language, and then to look at a specific construction in other languages as well. The existence of good traditional grammars was often an incentive to translate observations into generative terms. But Dasgupta (1980), finding no model for a generative analysis of Bangla syntax, wrote an eclectic grammar of Bangla, inspired by generative concepts, in order to articulate his analysis of embedded clauses and in-situ questions.

Most of the earlier work I have cited here has been published in India or informally in the US, or is only available in the form of dissertations. There has been a growing awareness of the importance of South Asian languages in defining the range of properties of human language. The distinctive structure of languages of the South Asian linguistic area as discussed by Masica (1976) is turning out to reveal more about how syntactic structures are expressed, and even to pose serious challenges to the ability of current theories to explain them, as Bhatt shows in his contribution to this volume. Properties of South Asian languages, including Tibeto-Burman languages, are a rich source for theorizing about the universal properties of human language. For example, D. N. S. Bhat (e.g. 1994, 1999) has published original and thought-provoking books on general topics integrating much information from South Asian languages. The grammatical facts of Hindi-Urdu and other languages like Kashmiri and Kannada are becoming part of the common knowledge in the field of linguistics, so that there are more incentives to publish articles and books about them. These languages are the subject of rich typological studies, such as Abbi 1992 on reduplication, and Subbarao 2012, a comprehensive study of the major grammatical structures in all of the language families of South Asia. Both are based on extensive fieldwork, as well as linguistic analysis.

5.2.1.3. Comprehensive reference grammars

Comprehensive reference grammars written by generative linguists are fairly recent. The monograph series published by Routledge under the direction of Bernard Comrie includes four on South Asian languages. These are exhaustive reference grammars which follow a standard descriptive organization.
Bhatia 1993 summarizes a very large body of descriptive papers and monographs on Panjabi, a language with many similarities to Hindi-Urdu but which has not received as much analytical attention as Hindi-Urdu. He has access to sources both in English and Panjabi. Sridhar 1990 gives a clear overview of Kannada with copious illustrations of sentence structures, again based on linguistic work both in English and Kannada. Pandharipande 1997 is a similar comprehensive description of Marathi, with perhaps an emphasis on Nagpuri Marathi, again based on both Marathi and English language sources. In this connection I should mention Wali 2006, a collection of her theoretical papers on Marathi, perhaps representing a different variety of Marathi, more to the center and north of the Marathi-speaking area. Wali and Koul (1997) offer a comprehensive description of Kashmiri; one of the authors (Koul) is a native speaker of Kashmiri. In addition to the standard descriptive categories, they add information about the V2 word order which is distinctive in this language, as well as the clitic agreement system.

There are two recent additions to the analysis of Kashmiri as well as Hindi-Urdu, framed in Minimalist terms, which should be noted here. They are a revision of the dissertation on Kashmiri V2 syntax by Emily Manetta (2011), as well as a related article with a novel explanation for the difference between Kashmiri and Hindi-Urdu (Manetta 2010). The author has built on the work of Wali and Koul, and supplemented their information with new sentences judged by native speakers of Kashmiri.

The Routledge series also includes grammars on other languages of South Asia, bringing together information from many sources as well as fieldwork reflecting native speaker judgments. These include Asher 1983 on Tamil and Asher & Kumari 1997 on Malayalam. There are also volumes in the Mouton Grammar Series, which includes Chelliah 1997 on Meithei, Coupe 2007a on Mongsen Ao, Genetti 2007 on Dolakha Newari, and van Driem 1987 on Limbu. Useful information on the syntax of South Asian languages is also found in edited volumes of the Routledge Language Family Descriptions series, including Anderson (ed.) 2008 (Munda), Cardona & Jain (eds.) 2003 (Indo-Aryan), Steever (ed.) 1998 (Dravidian), and Thurgood & LaPolla (eds.) 2003 (Tibeto-Burman).

Outside of these series, there are recent comprehensive grammars of Hindi-Urdu: Montaut 2004 and Kachru 2008 for Hindi, and Schmidt 1999 for Urdu. Neukom & Patnaik 2003 is a linguistically informed comprehensive grammar of Oriya. Gair 1998 is a collection of his linguistic analyses of Sinhala and Sri Lankan Tamil. Th. Lehmann 1993 is a detailed and very thorough study of Tamil syntax. Steever (1988) makes an important contribution to the understanding of embedded clauses in Dravidian languages, proposing that there is at most one finite verb in a complex sentence, with consequences for participles and the “quotative” marking of embedded clauses.
Hock (2005) discusses this issue with further evidence and arguments, with reference to Sanskrit.

There are some notable collections of studies on specific theoretically related themes. These include Lust et al. (eds.) 2000, a work which was designed to give as uniform a characterization as possible of reflexives and pronouns in the major Dravidian and Indo-Aryan languages, as well as one Munda language. The authors followed the same format of presentation, so that it is possible to find corresponding information for each language. The basic framework that each author adopted for the volume was the Binding Theory of Chomsky 1981 and following works. Another collection of contributions on a specific theme is Bhaskararao & Subbarao (eds.) 2004, which includes contributions on non-nominative subjects in both South Asian and other languages; South Asian languages are heavily represented, both Dravidian and Indo-Aryan. There are several contributions on the acquisition of dative subjects (Tamil and Telugu), as well as analyses of non-nominative subjects from various theoretical points of view. A third collection is Dayal & Mahajan (eds.) 2004, comprising contributions on a variety of topics related to South Asian clause structure. It represents the work of ten scholars in India, Europe, and the US who are active in the theory-based analysis of South Asian languages. Jayaseelan 1999 is a collection of theoretically based articles on Malayalam.

There are also some published versions of dissertations which investigate in depth, and from a specific theoretical point of view, such topics as compound verbs, case marking, focus particles, and question scope. These include Mohanan 1994a (Hindi), Butt 1995 (Urdu), and Bayer 1996: Chapter 7 (Bangla/Bengali).

5.2.2. Minimalist approaches to South Asian syntax

By Rajesh Bhatt

5.2.2.1. The Minimalist Program

The Minimalist Program is a set of ideas developed by Noam Chomsky and associated researchers in a series of papers (Chomsky 1993, 1995, 1998, 1999). It can be seen as a development of ideas that have been pursued under the rubric of the Government and Binding (GB) Theory (Chomsky 1981) and more generally under the Principles and Parameters Approach to generative syntax (Chomsky & Lasnik 1993). The chief strategy within the Minimalist Program has been to examine critically whether the various theoretical and representational devices used within GB Theory are necessary and to dispense with any device that is found to have insufficient conceptual and empirical justification. The best example of this concerns the levels of D-structure and S-structure within GB Theory. GB Theory assumes four levels of representation: the underlying level of D-structure; the surface level of S-structure; Logical Form (LF), the input to the logico-conceptual system; and Phonological Form (PF), the input to the articulatory-perceptual system. The Minimalist Program eliminates D-structure and S-structure as privileged levels of representation because, unlike PF and LF, they lack independent justification. Their only motivation is theory-internal, and reexamination of the data that motivated them reveals that it is possible to construct a theory that does not require reference to these levels. By Minimalist reasoning, such a theory is simpler and is to be preferred. This leads to a theory with only two privileged levels, each associated with an interface, and a derivational mechanism that relates the two. All aspects of syntactic behavior now have to follow from properties of the derivational system or from the properties of the interface.

The Minimalist Program, as the name suggests, is a program; it does not correspond to any single theory of syntax.
We can think of it intensionally as in the above paragraph: any syntactic system that questions its primitives and attempts to reduce them to the minimum could be thought of as being minimalist. But for the purposes of this survey, I will think of it extensionally in terms of the choices made by scholars inspired by the ideas expressed in the Minimalist Program. These choices are not necessarily intrinsically minimalist (many appear in GB Theory and in other frameworks) and are not shared by everyone working within the Minimalist Program; they are best exemplified by examining the specific treatments of the syntactic phenomena discussed in this survey.

Minimalist work assumes that there is a core computational system that consists of primitive operations such as Merge, Move, and Agree that creates syntactic structure. This system is taken to be invariant across languages. Structure building consists of an interleaving of these primitive operations. Variation across languages is taken to follow from the properties of their lexical items, which can be smaller than phonological words.

Like most syntactic work on Indian languages, minimalist work has unfortunately focused on only a few languages. The bulk of the work is on Hindi-Urdu, followed by Bengali, Malayalam, and Kashmiri. There is, of course, work on other languages,2 but I think it is fair to say that there is much more to be done. My survey will not attempt to be comprehensive. I will attempt to convey the richness of the minimalist work on South Asian languages by narrowly limiting myself to three topics: scrambling, case and agreement, and question formation, and there too focusing largely on Hindi-Urdu. This means for example that I will not be discussing the significant minimalist work on clefting, word order, and question formation in Malayalam (e.g. Jayaseelan 1996, 2000, 2004a, and 2008), or the work on DP structure in Bengali (Bhattacharya 1999).

5.2.2.2. Scrambling

The considerable flexibility in word order found in most South Asian languages has been the focus of much work in contemporary syntax. We start with the observation that in several South Asian languages, a clause with three arguments allows for all 4! = 24 orders of the three arguments and the verb. These orders can be grouped as follows.

(1) a. Base order: S IO DO V
    b. Movement below the subject: S DO IO V (1 order)
    c. Movement above the subject: IO S DO V, DO S IO V, IO DO S V, DO IO S V (4 orders)
    d. Orders where the verb is not final: 18 orders (“rightward scrambling”)
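The counts in (1) can be checked mechanically. The following is a minimal sketch, not part of the original survey: it enumerates all permutations of the four elements and sorts them into the groups of (1). The function name `classify` and the group labels are illustrative assumptions introduced here.

```python
from itertools import permutations

def classify(order):
    """Assign a word order to one of the four groups in (1).

    `order` is a tuple over the labels S, IO, DO, V (hypothetical encoding).
    """
    if order[-1] != "V":
        return "d: verb not final"
    args = order[:-1]
    if args == ("S", "IO", "DO"):
        return "a: base order"
    if args[0] == "S":
        # Subject stays first, so the objects have been reordered below it.
        return "b: movement below the subject"
    # Some object precedes the subject.
    return "c: movement above the subject"

# Tally every permutation of the three arguments and the verb.
counts = {}
for order in permutations(("S", "IO", "DO", "V")):
    group = classify(order)
    counts[group] = counts.get(group, 0) + 1

for group in sorted(counts):
    print(group, counts[group])
```

Under these assumptions the tally comes to 1, 1, 4, and 18 orders for groups (a) to (d), which together exhaust the 24 permutations.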

2 Without being exhaustive, there is also work on Kannada (Amritavalli 2004b), Tamil (Sundaresan & McFadden 2009, Sundaresan 2012), Meithei (Bhattacharya & Devi 2003, Kidwai 2010), and Kutchi Gujarati (Grosz & Patel-Grosz 2014); in a very hopeful development, younger syntacticians are increasingly exploring a wider range of languages.

This word order flexibility is sometimes called “scrambling”. Scrambling in this sense is merely a description of the availability of many word orders. Does it correspond to a primitive operation? Does the same operation derive all the orders in (1b–d)? What drives these movements? The syntactic literature offers different answers to these questions but it agrees that orders where the verb is not final (“rightward scrambling”) are to be distinguished from the ones where the verb is final (“leftward scrambling”). For this reason, we will delay the discussion of the derivation of the rightward scrambling orders in (1d). Limiting ourselves to leftward scrambling, we can further distinguish between scrambling within the clause and scrambling out of a finite clause.

(2) Scrambling within the clause:
    a. Short Scrambling: below the subject
       nuur-ne  kitaab  anjum-ko   di-i
       Nur-ERG  book.F  Anjum-DAT  give.PFV-F
       ‘Nur gave the book to Anjum.’
    b. Intermediate Scrambling: above the subject
       anjum-ko   nuur-ne  kitaab  di-i
       Anjum-DAT  Nur-ERG  book.F  give.PFV-F
       ‘It is Nur who gave a/the book to Anjum.’

(3) Long Scrambling: Out of a finite clause
    a. “Topicalization”
       anjum-ko_i  yuusuf   soc-taa         hai         [ki   nuur-ne  t_i  kitaab  di-i]
       Anjum-DAT   Yusuf.M  think-HAB.M.SG  be.PRS.3SG  that  Nur-ERG       book.F  give.PFV-F
       ‘Anjum, Yusuf thinks that Nur gave a book to.’
    b. “Interleaving”
       yuusuf   anjum-ko_i  soc-taa         hai         [ki   nuur-ne  t_i  kitaab  di-i]
       Yusuf.M  Anjum-DAT   think-HAB.M.SG  be.PRS.3SG  that  Nur-ERG       book.F  give.PFV-F
       ‘Anjum, Yusuf thinks that Nur gave a book to.’

The syntactic literature on phrasal movement distinguishes between A-movement (passive, raising) and A-bar movement (topicalization, wh-movement). One major question has been to determine whether the movement operations involved in the derivation of the word orders in (1b-d) can be characterized as A or A-bar. Gurtu (1992), Srivastav Dayal (1994), and Kidwai (2000) argue that scrambling is always an A-bar movement, while Mahajan (1990/1994) argues that scrambling is sometimes an A-movement and sometimes an A-bar movement (but not both simultaneously, contra Webelhuth 1989).

The traditional distinction between A-movement and A-bar movement concerns the target of movement and the reason for the movement. A-movement targets case positions such as [Spec,TP] and is driven by case/EPP motivations. In contrast, A-bar movement targets positions like [Spec,CP] and adjoined positions and is driven by features such as wh, scope, and topic/focus. There are also correlational properties: A-movement does not cause weak crossover (WCO) effects while A-bar movement does.

(4) a. A-movement does not cause WCO:
       Every student_i seems to his_i mother [t_i to be intelligent].
       Which student_i t_i seems to his_i mother [t_i to be intelligent].
    b. A’-movement causes WCO:
       */???His_i mother loves every boy_i.
       */???Who_i does his_i mother love t_i?

Other correlational properties include quantifier stranding (A-movement can strand quantifiers, A-bar movement cannot), reconstruction (A-movement can but does not have to reconstruct, A-bar movement obligatorily reconstructs), and parasitic gaps (A-bar movement licenses parasitic gaps, while A-movement does not).

Deciding where the different kinds of scrambling movement fall with respect to the A/A-bar distinction is not straightforward. Distinguishing between various specifier positions is non-trivial in SOV languages, so the target of movement is of little diagnostic help. This leaves us with the properties that drive the movements and the correlational properties. With respect to the properties that drive scrambling, there is consensus that optional scrambling is related to information-structural notions such as given/new and topic/focus. This point was noted in Gambhir 1981 and is developed in Kidwai 2000. On the other hand, the correlational properties, in particular weak crossover and reconstruction, point towards a non-unified picture that suggests three distinct cases: short scrambling below the subject (2a), which has only A-properties; scrambling out of a finite clause (3), which has only A-bar properties; and intermediate scrambling that crosses the subject (2b), which seems to have both kinds of properties. Let us see what the correlational properties of weak crossover and reconstruction tell us about short scrambling, intermediate scrambling, and long scrambling respectively.

5.2.2.3 Short scrambling

By short scrambling, I refer to scrambling of the direct object over the indirect object. Such scrambling amnesties weak crossover violations.

(5) (from Bhatt & Anagnostopoulou 1996)
    a. *unhõ-ne   [us-kii mãã]-ko_i  har laṛkaa_i  lauṭaa-yaa
        they-ERG   his mother-KO      every boy     return.PFV
       *‘They returned his_i mother every boy_i.’
    b. unhõ-ne   har laṛkaa_i  [us-kii mãã]-ko_i  ___  lauṭaa-yaa
       they-ERG  every boy     his mother-KO           return.PFV
       ‘They returned every boy_i to his_i mother.’


Rajesh Bhatt

Such scrambling creates new binding possibilities not just for variable binding but also for reciprocal binding.

(6) a. *unhõ-ne   [ek duusre]-ke_i  lekhakõ-ko  [ve do kitaabẽ]_i  dĩ-ĩ
        they-ERG   each-other-GEN    writers-KO  these two books.F  give-PFV.F.PL
       *‘They gave each other’s writers_i these two books.’
    b. unhõ-ne   [ve do kitaabẽ]_i  [ek duusre]-ke_i  lekhakõ-ko  t_i  dĩ-ĩ
       they-ERG  these two books.F  each-other-GEN    writers-KO       give-PFV.F.PL
       ‘They gave these two books_i to each other’s writers_i.’

The direct object cannot bind a reciprocal inside the indirect object when it follows it but can when it precedes it. Moreover short scrambling does not reconstruct for the purposes of binding theory. Scrambling of the direct object over the indirect object eliminates a reading where the indirect object binds into the direct object.

(7) (from Bhatt & Anagnostopoulou 1996)
    a. *unhõ-ne_i  laṛkiyõ-ko_j  [ek duusre_i/j  kii  kitaabẽ]  dĩ-ĩ
        they-ERG    girls-KO       each other     GEN  books     give-PFV
       ‘They_i gave the girls_j each-other_i/j’s books.’
    b. *unhõ-ne_i  [ek duusre  kii  kitaabẽ]_i/*j  laṛkiyõ-ko_j  ___  dĩ-ĩ
        they-ERG    each other  GEN  books          girls-KO           give-PFV
       ‘They gave each-other_i/j’s books to the girls.’

The flip side of this is that scrambling of a direct object over the indirect object also gets rid of a Condition C violation in the base order.

(8) (from Bhatt & Anagnostopoulou 1996)
    a. us-ne_i  us-ko_j  [aaditya_*i/*j/k  kii  kitaab]  lauṭ-aa  di-i
       DEM-ERG  DEM-KO    Aditya           GEN  book.F   return   give-PFV.F
       ‘He_i returned him_j Aditya_*i/*j/k’s book.’
    b. us-ne_i  [aaditya_*i/j/k  kii  kitaab]  us-ko_j  ___  lauṭ-aa  di-i
       DEM-ERG   Aditya          GEN  book.F   DEM-KO        return   give-PFV.F
       ‘He_i returned Aditya_*i/j/k’s book to him_j.’

Put together, these diagnostics point towards short scrambling being an A-movement, but with the added property that it does not reconstruct. Like standard A-movement, short scrambling creates new binding possibilities, both for variable and reciprocal binding, and it does not obligatorily reconstruct, eliminating disjoint reference effects. However, in addition, it does not permit reconstruction at all, eliminating binding options that reconstruction would make available. We can conclude that movement to the Subj-IO medial site cannot be A’-movement;


if A’ sites are created by adjunction, this also indicates that adjunction cannot take place at this site.³ Some support for this conclusion comes from the parallels between short scrambling and the obligatory object shift operation found in Hindi-Urdu double object constructions. -ko marked Direct Objects in Hindi undergo obligatory object shift to a Subj-IO medial site (Bhatt & Anagnostopoulou 1996).

(9) a. raam-ne  [VP aniitaa-ko  ciṭṭhii   bhej-ii]
       Ram-ERG      Anita-KO    letter.F  send-PFV.F
       ‘Ram sent a letter to Anita.’
    b. raam-ne  ciṭṭhii-ko_i  [VP aniitaa-ko  t_i  bhej-aa]
       Ram-ERG  letter-KO         Anita-KO         send-PFV
       ‘Ram sent the letter to Anita.’
    c. #raam-ne  [VP aniitaa-ko  ciṭṭhii-ko_i  bhej-aa]
        Ram-ERG      Anita-KO    letter-KO     send-PFV
       ‘#Ram sent Anita to the letter.’ (NOT: Ram sent the letter to Anita.)

The oddness of (9c) has been taken to show that ‘two -ko marked NP’s cannot appear in a sentence’ (Mohanan 1994b, Kidwai 2000: 78–80). When the DO is a pronoun that refers to a human, it must be -ko marked and consequently, object shift is forced.

(10) vo ‘Dem’ i.e. he/she/it/that
     a. yusuf-ne   [VP niinaa-ko  vo   di-yaa]
        Yusuf-ERG      Nina-DAT   DEM  give-PFV.M
        ‘Yusuf gave that/*him to Nina.’
     b. yusuf-ne   use/us-ko_i         [VP niinaa-ko  t_i  di-yaa]
        Yusuf-ERG  DEM.DAT/DEM.OBL-KO      Nina-DAT        give-PFV.M
        ‘Yusuf gave him/her/???that to Nina.’

Object-shifted direct objects behave the same as short-scrambled direct objects with respect to the movement diagnostics. If object shift is an A-movement, then it is plausible that short scrambling is too. Before turning to intermediate scrambling where the results are complicated, I will first consider long scrambling which patterns unequivocally with A-bar movement.
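
The diagnostics reviewed so far can be summarized in a small sketch. The Python snippet below is only an illustration and not part of any of the cited analyses: the class `Profile`, its field names, and the function `shared_properties` are hypothetical, and the value `new_binding=False` for A-bar movement is an assumption inferred from the contrast drawn above rather than something stated outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical record of three movement diagnostics discussed in the text."""
    wco_amnesty: bool     # does the movement repair weak crossover?
    new_binding: bool     # does the landing site create new binding options?
    reconstruction: str   # "none", "optional", or "obligatory"


# Expected profiles, following the summary of A vs. A-bar properties above.
# new_binding=False for A-bar movement is an assumption of this sketch.
A_MOVEMENT = Profile(wco_amnesty=True, new_binding=True, reconstruction="optional")
A_BAR_MOVEMENT = Profile(wco_amnesty=False, new_binding=False, reconstruction="obligatory")

# Short scrambling as reported for examples (5)-(8).
SHORT_SCRAMBLING = Profile(wco_amnesty=True, new_binding=True, reconstruction="none")


def shared_properties(observed: Profile, expected: Profile) -> list:
    """List the diagnostics on which the observed profile matches the expected one."""
    fields = ("wco_amnesty", "new_binding", "reconstruction")
    return [f for f in fields if getattr(observed, f) == getattr(expected, f)]


print(shared_properties(SHORT_SCRAMBLING, A_MOVEMENT))      # ['wco_amnesty', 'new_binding']
print(shared_properties(SHORT_SCRAMBLING, A_BAR_MOVEMENT))  # []
```

On this encoding short scrambling shares two of three diagnostics with A-movement and none with A-bar movement; the mismatch on reconstruction corresponds to the "added property" noted above.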

3 An alternative characterization of these facts would be to assume that Hindi-Urdu allows for two Merge configurations for double object constructions: one with the Goal above the Theme and the other with the Theme above the Goal, with neither structure derived from the other.


5.2.2.3.1. Long scrambling

By long scrambling, I refer to movement out of a finite clause.

(11) (from Gambhir 1981: 303–304)
     a. A inquires of B if he knows what time the stores open. B replies:
        dukaanẽ_i  [meraa  khayaal  hai        [ki   t_i  nau baje   khul  jaa-tii   haĩ]]
        stores.F    my     idea     be.PRS.SG  that       9 o’clock  open  GO-HAB.F  be.PRS.PL
        ‘The stores, I think, open at 9 o’clock.’
     b. Two friends are talking about their common friend, Ramesh. One of them adds:
        raamesh-ko_i  [maĩ-ne  sun-aa    hai        [ki   t_i  baink-mẽ  naukrii  mil   gayii     hai]]
        Ramesh-DAT     I-ERG   hear-PFV  be.PRS.SG  that       bank-in   job.F    find  GO.PFV.F  be.PRS.SG
        ‘Ramesh, I heard, has got a job in a bank.’

In the above examples, the fronted element appears in a left peripheral position but this is not always the case. “Interleaving” orders as in (3b) are also possible though these are not as natural. Long scrambling triggers weak crossover. Unlike short scrambling, there is no weak crossover amnesty.

(12) No WCO amnesty
     a. Base:
        *[[us-kii_i  behin]-ne   soc-aa     [ki   raam-ne  [kaun-saa/har  aadmii]  dekhaa]]
           his       sister-ERG  think-PFV  that  Ram-ERG   which/every   man      see.PFV
        ‘*Which man_i did his_i sister think that Ram saw t_i?’
        ‘*His_i sister thought that Ram saw [every man]_i.’
     b. Long Scrambling:
        *[kaun-saa/har  aadmii]_i  [[us-kii_i  behin]-ne   soc-aa     [ki   raam-ne  t_i  dekhaa]]
          which/every   man          his       sister.ERG  think-PFV  that  Ram-ERG       see.PFV
        ‘*Which man_i did his_i sister think that Ram saw t_i?’
        ‘*His_i sister thought that Ram saw [every man]_i.’

Possessive and reciprocal anaphors in long scrambled NPs can be bound by embedded subjects, indicating that reconstruction to the base position is possible. Long scrambling also makes binding by the matrix subject an option, indicating that intermediate reconstruction is also an option.

(13) a. Base:
        [siita_j  soc-tii      hai        [ki   [raam_i  apnii   behin]-ko_i/*j  pasand  kar-taa      hai]]
         Sita     think-HAB.F  be-PRS.SG  that   Ram     self’s  sister-ACC      like    do-HAB.M.SG  be-PRS.SG
        ‘Sita thinks that Ram likes his/*her sister.’
     b. Scrambled
        [apnii_i/??j  behin]-ko_k  siita_j  soc-tii      hai        [ki   raam_i  t_k  pasand  kar-taa      hai]]
         self’s       sister-ACC   Sita     think-HAB.F  be-PRS.SG  that  Ram          like    do-HAB.M.SG  be-PRS.SG
        ‘Sita_j thinks that Ram_i likes his/???her sister.’

Moreover, unlike short scrambling, long scrambling does not void Condition C violations.

(14) a. Base:
        *[vo_i  soc-tii      hai        [ki   [ṭiinaa-kaa_i  bhaai]   monaa-ko  pasand  kar-taa      hai]]
          she   think-HAB.F  be-PRS.SG  that   Tina’s        brother  Mona-ACC  like    do-HAB.M.SG  be-PRS.SG
        ‘*She_i thinks that Tina_i’s brother likes Mona.’
     b. Scrambled
        *[ṭiinaa-kaa_i  bhaai]_j  [vo_i  soc-tii      hai        [ki   t_j  monaa-ko  pasand  kar-taa      hai]]
          Tina’s        brother    she   think-HAB.F  be-PRS.SG  that       Mona      like    do-HAB.M.SG  be-PRS.SG
        ‘*Tina_i’s brother, she_i thinks, likes Mona.’

These tests all point in the same direction, namely that long scrambling is A-bar movement.

5.2.2.3.2. Intermediate scrambling

Finally we turn to the tricky case of intermediate scrambling, by which we refer to movement past the subject but within the same finite clause. Unlike long scrambling, but like short scrambling, intermediate scrambling amnesties weak crossover violations.

(15) a. ???[us-kii_i  behin]-ne   [har   laṛke]-ko_i  (sigreṭ    piite hue)  dekh-aa
            his       sister-ERG   every  boy-ACC       cigarette  smoking     see-PFV.M
        ‘???His_i sister saw every boy_i (smoking a cigarette).’
     b. [har   laṛke]-ko_i  [us-kii_i  behin]-ne   t_i  (sigreṭ    piite hue)  dekh-aa
         every  boy-ACC       his       sister-ERG        cigarette  smoking     see-PFV.M
        ‘Every boy_i was seen (smoking a cigarette) by his_i sister.’

This could be taken to indicate that intermediate scrambling is A-movement. If that is the case, then it is A-movement that allows for reconstruction as it does not disrupt binding possibilities that exist prior to scrambling.

(16) a. unhõne_i  ek-duusre-ke_i  bhaaiõ-ko     maar-aa
        they.ERG  each-other-GEN  brothers-ACC  hit-PFV
        ‘They hit each other’s brothers.’
     b. [ek-duusre-ke_i  bhaaiõ]-ko_j  unhõne_i  t_j  maar-aa
         each-other-GEN  brothers-ACC  they.ERG       hit-PFV
        ‘They hit each other’s brothers.’

This sets it apart from short scrambling, which does not reconstruct and hence destroys pre-scrambling binding possibilities. Intermediate scrambling also displays some A’-properties. Intermediate scrambling of an XP containing an R-expression does not amnesty an existing Condition C violation, suggesting that reconstruction is forced.

(17) a. *us-ne_i  [mohan-kii_i  kitaab]  paṛh-ii
         he-ERG    Mohan-GEN.F  book.F   read-PFV.F
        ‘*He_i read Mohan_i’s book.’
     b. *[mohan-kii_i  kitaab]_j  us-ne_i  t_j  paṛh-ii
          Mohan-GEN.F  book.F     he-ERG        read-PFV.F
        ‘*He_i read Mohan_i’s book.’

Together the two reconstruction effects point towards A-bar movement. We are in a bind now — taken at face value, the weak crossover amnesty points toward A-movement, while the reconstruction data points towards A-bar movement. These are the kinds of facts that led Webelhuth (1989) to argue that scrambling in German has both A and A-bar properties. Mahajan (1990) argues instead that intermediate scrambling can be one of two operations — an A-movement or an A-bar movement, but not both simultaneously. However, if intermediate scrambling can be both A and A-bar, then the Condition C reconstruction data in (17) is unexpected. Independent of the particular analysis, the possibility of variable binding indicated by the weak crossover amnesty does not sit well with a reconstruction-based treatment of the disjoint reference effect in (17). Presumably a different treatment will be needed for (17). Mahajan’s dual treatment of intermediate scrambling has an A-position past the subject and predicts that fronted XPs should be able to bind anaphoric elements


in the subject. Indeed, Mahajan (1994) presents contrasts such as the following which show that intermediate scrambling allows for reflexive binding into the subject.

(18) a. *[apne_i  baccõ]-ne     mohan-ko_i  maar-aa
          self’s  children-ERG  Mohan-ACC   hit-PFV
        ‘*Self’s children hit Mohan.’
     b. ?mohan-ko_i  [apne_i  baccõ]-ne     t_i  maar-aa
         Mohan-ACC    self’s  children-ERG       hit-PFV
        ‘?Self’s children hit Mohan.’

However, Srivastav (1994: 248–249) and Kidwai (2000: 128–129) note that Mahajan is unable to explain why (18b) is degraded (Srivastav herself judges it as ungrammatical). Srivastav and Kidwai relate the marginal acceptability of (18b) to a referential/“logophoric” usage of X0 reflexives in Hindi.⁴ I agree with Srivastav and Kidwai that the source of the oddness of (18) has to do with reflexive pronouns. The reciprocal counterpart of (18b) is not degraded.

(19) a. *[ek-duusre-ke_i  fains]-ne  kaṭriinaa  aur  saif-ko_i  pehcaan    liyaa
          each-other-GEN  fans-ERG   Katrina    and  Saif-ACC   recognize  take.PFV
        ‘*Each other’s fans recognized Katrina and Saif.’
     b. [kaṭriinaa  aur  saif-ko]_i  [ek-duusre-ke   fains]-ne  t_i  pehcaan    liyaa
         Katrina    and  Saif-ACC     each-other-GEN  fans-ERG        recognize  take.PFV
        ‘Katrina and Saif were recognized by each other’s fans.’

I take this to support Mahajan’s proposal that there is an A-position past the subject. Summing up, intermediate scrambling patterns with A-bar movement with respect to Condition C reconstruction, but with A-movement with respect to binding. There are tensions between these properties which exist both for Mahajan’s dual approach and Kidwai’s uniform adjunction approach. In addition, Kidwai’s uniform approach needs an explanation for why weak crossover amnesty is unavailable with long scrambling but is readily available with intermediate scrambling.⁵

4 Both Srivastav and Kidwai note that the unacceptability of structures with reflexive/reciprocal subjects is not ameliorated by scrambling. They take this to be a further argument against the existence of an A-position past the subject.

(i) (from Kidwai 2000: 31–32)
    a. *ek-duusre-ne_i  [mohan  aur  siitaa-ko_i]  maar-aa
        each-other-ERG   Mohan  and  Sita-ACC      hit-PFV
       ‘*Each other hit Mohan and Sita.’
    b. *[mohan  aur  siitaa]-ko_i  ek-duusre-ne_i  maar-aa
         Mohan  and  Sita-ACC      each-other-ERG  hit-PFV
       ‘*Each other hit Mohan and Sita.’

I speculate that the ungrammaticality of (i.b) is related to a more general restriction against reciprocals/reflexives as subjects of finite clauses and is not due to a failure of binding by the fronted object.

5.2.2.3.3. Rightward movement

In the syntactic literature on movement, it has long been known that so-called rightward movement has properties that are different from the better-studied leftward movement. For one thing, rightward movement is subject to much more stringent locality restrictions than leftward movement (note the Right Roof Constraint of Ross 1967). For this reason, there have been attempts to eliminate the option of rightward movement from the inventory of syntactic operations. For Hindi-Urdu too, Mahajan (1997) puts forward arguments which show that rightward movement, by which we mean movement to the right of the finite verb, has properties that are quite distinct from leftward movement. Mahajan (1997) notes that rightward scrambling, unlike leftward scrambling, does not create new binding possibilities; it does not amnesty weak crossover violations. The rightward scrambled object quantifier in (20b) is unable to bind a pronoun inside the subject phrase, while the leftward scrambled quantifier in (20c) is able to do so. In this respect the rightward scrambled structure behaves like the canonical SOV structure in (20a):

(20) a. SOV: weak crossover configuration
        *[us-ke_i     bhaai-ne]    [har    ek   aadmii-ko]_i  maar-aa
          he-GEN.OBL  brother-ERG   every  one  man-ACC       hit-PFV
     b. SVO: no weak crossover amnesty
        *[us-ke_i     bhaai-ne]    maar-aa  [har    ek   aadmii-ko]_i
          he-GEN.OBL  brother-ERG  hit-PFV   every  one  man-ACC
     c. OSV: weak crossover amnesty
        [har    ek   aadmii-ko]_i  [us-ke_i     bhaai-ne]    maar-aa
         every  one  man-ACC        he-GEN.OBL  brother-ERG  hit-PFV
        ‘His_i brother hit [every man]_i.’

The verdict from other diagnostics such as reciprocal binding and Condition C is the same: rightward scrambling preserves the scopal relations of the corresponding structure without rightward movement. Examination of multiple rightward scrambling reveals a similar pattern.

5 Kidwai (2000: 136) suggests that weak crossover amnesty is marginally possible with long scrambling, and the seeming contrast stems from an interaction between the independent markedness of long scrambling and the typically non-presuppositional nature of quantificational elements.

Three kinds of treatments have been proposed for rightward scrambling. Working within the framework of antisymmetry (Kayne 1994), Mahajan (1997) proposes a treatment where rightward scrambling is reanalyzed as stranding: the material around the ostensibly rightward-moved material moves to the left. As a result the material on the left is structurally higher than the material on the right (see also Simpson & Bhattacharya 2003, with a related proposal, based on different empirical grounds, for Bangla, a closely related Indo-Aryan language).

Bhatt and Dayal (2007) argue that Mahajan’s antisymmetric account does not in fact derive the scopal predictions it sets out. Instead they propose an account that utilizes rightward movement of a verbal projection that has been evacuated by its verbal head. The ostensibly rightward scrambled elements do not move themselves; what moves is a larger projection that contains them. This remnant is taken to reconstruct at LF, recreating the pre-rightward movement scopal configuration.

Manetta (2012) resurrects Mahajan’s original (1988) analysis of rightward scrambling as involving actual rightward movement of the material to the right of the verb. Mahajan (1997) had abandoned his original analysis on the grounds that this predicted the wrong scopal relations. Manetta assumes that rightward-moved material obligatorily reconstructs. She further assumes that the head that triggers rightward movement forces rightward movement of every DP in its scope. With this additional assumption, Manetta is able to recreate the pre-rightward movement scopal configuration.

5.2.2.4. Case and agreement

5.2.2.4.1. Local agreement

The broad facts of Hindi-Urdu case and agreement can be succinctly stated as follows. Subjects of transitive verbs appear with ergative marking in the presence of perfective aspect; in non-perfective aspects, subjects do not receive any overt case marking. Objects may or may not be accompanied by overt case marking. Verbal agreement goes with the structurally most prominent argument that is not overtly case marked. When the subject is not overtly case marked, it controls agreement. When the subject is overtly case marked but the object is not, the object controls agreement. When both are overtly case marked, we have default agreement which corresponds to 3SG.M features. Despite the succinctness of this statement, it was challenging to handle these facts within existing frameworks. This is because initially, minimalist approaches to case and agreement assumed a tight coupling between case, agreement, and syntactic position. The approaches to case and agreement pursued in the syntactic literature on Hindi-Urdu can be seen as following the developments in the Minimalist Program. The earlier stages of the Minimalist Program associate specific specifier positions with specific structural cases. Likewise elements that trigger agreement occupy


designated specifier positions. So nominative case was taken to be licensed in [Spec,AgrSP] or [Spec,TP] and accusative case in [Spec,AgrOP] or the outer specifier of vP. These positions were also taken to be the locus of subject and object agreement respectively. These assumptions inform Mahajan’s (1989) analysis of Hindi-Urdu agreement. Mahajan assumes the following phrase structure.

(21) [IP ... [AgrP ... [VP DP [V’ DP V]] Agr] I]
     (Mahajan’s VP corresponds to the contemporary vP; the combination of V with Agr is not shown.)

The [Spec,AgrP] position is the locus of agreement in this system. Agreement goes with the subject or the object depending upon which passes through this position. In the case of subject agreement, the subject moves through [Spec,AgrP] to [Spec,IP] where it receives nominative case. Since it moves through [Spec,AgrP], we get subject agreement. The object receives case in situ from V.

(22) Subject Agreement:
     [IP raam_i    [AgrP t_i  [VP t_i  [V’ roṭiii   khaa-egaa]] Agr] I]
         Ram.M.SG                          bread.F  eat.FUT.3SG.M
     ‘Ram will eat bread.’

To derive object agreement, Mahajan assumes that perfective verbs do not assign case to their objects. Consequently the object needs to raise to [Spec,AgrP] for case. The subject in such structures typically has inherent ergative case, which blocks agreement. Since the object ends up in [Spec,AgrP], we have object agreement.

(23) Object Agreement:
     [IP raam-ne_i  [AgrP roṭiiij  [VP t_i  [V’ t_j  khaa-yiiij]] Agr] I]
         Ram-ERG          bread.F                    eat.PFV.F
     ‘Ram ate bread.’

Subsequent developments in the Minimalist Program dissociated movement and case. Case is no longer taken to drive syntactic movement. Case assignment and φ-agreement are both taken to follow from the operation Agree. Agree establishes a relationship between a head, the probe, and a designated phrase, the goal, as long as the probe c-commands the goal and certain other locality and feature-compatibility requirements are met. What is crucial is that a [Spec, Head] configuration is not required for Agree and consequently case/agreement does not require designated specifier positions. This is the position adopted in Bhatt 2005. The specific assumption is that finite T licenses nominative case, and transitive v licenses accusative case.

(24) ... [T0 [vP DP[NOM] [v’ v0 [VP V DP[ACC]]]]]
     (heads are shown on the left for readability)


This proposal for case assignment is the same as the one that is commonly put forward for English. The proposal for agreement diverges from English. In English, there is always a nominative NP, and that NP controls agreement. Hindi-Urdu allows for non-nominative subjects and in these cases, the object can control agreement.⁶ To handle such cases, Bhatt (2005) proposes that T probes for φ-features. He assumes that φ-features are visible only on bare DPs and not on DPs that bear a postposition. If the closest DP (the subject) is bare, its φ-features are visible to T and T agrees with its features. However if the closest DP bears a postposition, its features are not visible to T and T cannot agree with it. In this case, T probes further into the vP. If it finds an unmarked argument NP, it agrees with it. If there is no such NP, the agreement features on the T probe receive default 3SG.M specification.

(25) a. T0[uF] ... [Asp0[uF] ... [vP SUBJ[φF] v [VP V OBJ[φF]]]]   (Asp = Non-Perfective) → subject agreement
     b. T0[uF] ... [Asp0[uF] ... [vP SUBJ-ERG v [VP V OBJ[φF]]]]   (Asp = Perfective) → object agreement
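
The probing procedure just described can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not Bhatt’s (2005) formalization: the names `DP`, `agree`, and `DEFAULT` are hypothetical, φ-features are reduced to plain strings, and the visibility of features is modeled as a single boolean recording whether the DP bears a postposition.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT = "3SG.M"  # default agreement when no bare DP is found


@dataclass
class DP:
    """Hypothetical stand-in for an argument DP."""
    phi: str       # phi-features, e.g. "3SG.F"
    marked: bool   # True if the DP bears a postposition (-ne, -ko, ...)


def agree(subject: DP, obj: Optional[DP] = None) -> str:
    """T probes the subject first, then the object; phi-features are
    visible only on bare DPs. Falls back to default agreement."""
    for goal in (subject, obj):
        if goal is not None and not goal.marked:
            return goal.phi
    return DEFAULT


# (22): bare subject, bare object -> subject agreement
print(agree(DP("3SG.M", marked=False), DP("3SG.F", marked=False)))  # 3SG.M
# (23): ergative subject, bare feminine object -> object agreement
print(agree(DP("3SG.M", marked=True), DP("3SG.F", marked=False)))   # 3SG.F
# (9b): ergative subject, -ko marked feminine object -> default agreement
print(agree(DP("3SG.M", marked=True), DP("3SG.F", marked=True)))    # 3SG.M
```

The loop order encodes "closest DP first"; nothing in the sketch models the Asp head or case licensing, which the text treats separately.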

Legate (2008: 73) endorses Bhatt’s (2005) proposal but criticizes its reliance on the surface markedness of argument DPs. She takes agreement relations to be determined in the syntax while the morphological realization of case is determined post-syntactically. So agreement should not directly make reference to the details of morphological realization but should instead make reference to abstract syntactic features such as case. Reformulated thus, the proposal can handle languages like Panjabi and Marathi, whose agreement systems are similar to those of Hindi-Urdu, even though only some abstractly ergative subjects bear ergative case marking (Legate 2008: 94–95).

5.2.2.4.2. Long distance agreement

Mahajan (1989) introduced the phenomenon of Long Distance Agreement to the syntactic literature on Hindi-Urdu. Though the existence of the phenomenon had been noted before, Mahajan was the first to provide a formal analysis. Long Distance Agreement (LDA) refers to the situation exemplified below where an argument of an embedded non-finite clause triggers agreement on the embedding predicate.

(26) (from Mahajan 1989)
     a. LDA:
        raam-ne  [roṭii    khaa-nii]  caah-ii
        Ram-ERG   bread.F  eat-INF.F  want-PFV.SG.F
        ‘Ram wanted to eat bread.’
     b. no LDA:
        raam-ne  [roṭii    khaa-naa]  caah-aa
        Ram-ERG   bread.F  eat-INF.M  want-PFV.SG.M
        ‘Ram wanted to eat bread.’

6 There is a rich and extensive literature on non-nominative subjects in South Asian languages. Davison’s (2004b) analysis of ergative case is highly relevant to and compatible with the discussion in the main text. Mohanan 1994a, Verma & Mohanan (eds.) 1990, and Bhaskararao & Subbarao (eds.) 2004 are valuable surveys.

To handle LDA, Mahajan assumes that infinitival verbs in Hindi-Urdu license accusative case optionally. When they license accusative case, the object in question stays in situ and does not trigger agreement. This leads to optionality in agreement, which is something we only find in Hindi-Urdu with Long Distance Agreement. When the infinitival verb does not license accusative case, the embedded object needs to move for case. The closest case position is the [Spec,AgrP] associated with the matrix predicate. So the embedded object moves there and since it is in [Spec,AgrP], we get long-distance object agreement. Mahajan’s analysis links movement of the embedded object with agreement. This link is also pursued and developed in work by Chandra (2007).

Bhatt (2005) uses the Agree-based system set up for local agreement to handle LDA. If T0, the agreement probe, is unable to locate any features on the subject or in its clause, it probes into the embedded infinitival clause and locates the features on the embedded object leading to LDA. Legate (2008) refers to this as AGGRESSIVE AGREEMENT. Bhatt argues that LDA is only possible with restructuring infinitival clauses and that infinitival complements can be either restructuring infinitives or not. The optionality of LDA then reduces to the optionality of restructuring. Modal verbs only take restructuring infinitives and with them LDA is obligatory.⁷

Between the full movement into the matrix clause analyses of Mahajan (1989) and Chandra (2007) and the no-movement analyses of Davison (1991), Butt (1995), Boeckx (2004), and Bhatt (2005), there are also analyses which propose that LDA involves a short movement of the LDA trigger to the edge of the infinitival clause. These analyses propose that this movement to the edge is necessary for the trigger to be visible to agreement probes in the matrix clause. Polinsky and Potsdam’s (2001) analysis of LDA in Tsez is perhaps the first example of such an analysis. Keine (2013) presents a short movement analysis of LDA for Hindi-Urdu.

7 Butt’s (1995) analysis of LDA is not couched within the Minimalist Program but its core idea can be easily rendered in minimalist terms. Butt proposes that LDA be seen as two steps of agreement: first the infinitival clause, which she argues has nominal properties, agrees with its object, and then the matrix predicate agrees with the infinitival clause, its object. Thus there is strictly speaking no long distance agreement as each step of agreement is local. Davison’s (1991) proposal for LDA is also relevant here.

5.2.2.5. Questions

Questions are another domain that has been the topic of much minimalist analysis for Hindi-Urdu, Bengali, and Kashmiri. There is a considerable diversity of question-related phenomena found in these languages: Hindi-Urdu and Bengali are pre-theoretically wh-in-situ while Kashmiri has overt wh-movement. The wh-in-situ strategy often coexists with an overt movement strategy and with the scope marking construction. Broadly speaking, research on this topic has focused on answering the following questions:

(i) What is the proper analysis of wh-in-situ: does it involve covert movement or is there overt wh-movement that is disguised by subsequent movement? If so, what is the target of the movement?
(ii) How does one explain the locality restrictions on question construal that relate to finiteness and directionality?
(iii) Is the overt movement of wh-phrases found in these languages derived by a special rule of wh-movement or by the more generally available process of scrambling?
(iv) Does the proper analysis of scope marking involve an interclausal syntactic dependency or is the interclausal relationship semantic?

As we shall see, the answers that have been given to these questions are not independent.

5.2.2.5.1. Wh-in-situ

Let us look at the empirical phenomena one by one and examine the analytical choices taken by various researchers. We start with the phenomenon of wh-in-situ. In Hindi-Urdu, wh-words do not need to be fronted for local question construal. In fact, fronting is perceived to be less natural than the in-situ option. There is an overall preference for wh-phrases to be immediately preverbal even if that would not be their default word order in the declarative counterpart (28b).

(27) Questioned element: Object
     a. S wh-O V, most natural:
        raam-ne  [kyaa  ciiz]    khaa-ii
        Ram-ERG   what  thing.F  eat-PFV.F
        ‘What thing did Ram eat?’
     b. wh-O S V:
        [kyaa  ciiz]    raam-ne  khaa-ii
         what  thing.F  Ram-ERG  eat-PFV.F
        ‘What thing did Ram eat?’

(28) Questioned element: Subject
     a. wh-S O V:
        kis-ne   billuu-ko  maar-aa
        who-ERG  Billu-ACC  hit-PFV
        ‘Who hit Billu?’
     b. O wh-S V, most natural:
        billuu-ko  kis-ne   maar-aa
        Billu-ACC  who-ERG  hit-PFV
        ‘Who hit Billu?’

Matrix question construal is possible even if the in-situ wh-phrase is in an embedded infinitival complement clause or a nonfinite adjunct clause.

(29) a. in-situ inside an infinitival complement clause (from Mahajan 1990: 160)
        raam-ne  [PRO  kis-ko   dekh-naa]  caah-aa
        Ram-ERG        who-ACC  see-INF    want-PFV
        ‘Who did Ram want to see?’
     b. in-situ inside an infinitival adjunct clause
        raam   [kyaa  khaa-te hue]  ghar  ga-yaa
        Ram.M   what  eating-while  home  go-PFV.SG.M
        ‘What_i did Ram go home while eating t_i?’

However if the embedded clause is finite, matrix question construal is not possible in Hindi-Urdu. With embedded finite clauses, only a local question construal is available and if that is incompatible with the semantics of the embedding predicate, we get ungrammaticality.

(30) a. jaan ‘know’ takes both declarative and interrogative complements:
        wajaahat   jaan-taa       hai        [ki   riimaa  kis-ko   pasand  kar-tii   hai]
        Wajahat.M  know-HAB.SG.M  be.PRS.SG  that  Rima.F  who-ACC  like    do-HAB.F  be.PRS.SG
        Embedded Question: ‘Wajahat knows who Rima likes.’
        *Matrix Question: ‘Who does Wajahat know Rima likes?’
     b. maan ‘believe’ takes only declarative complements:
        wajaahat   maan-taa          hai        [ki   riimaa  kis-ko   pasand  kar-tii   hai]
        Wajahat.M  believe-HAB.SG.M  be.PRS.SG  that  Rima.F  who-ACC  like    do-HAB.F  be.PRS.SG
        *Embedded Question: ‘Wajahat believes who Rima likes.’
        *Matrix Question: ‘Who does Wajahat believe Rima likes?’


Another factor, and one that is perhaps more decisive, is the location of wh-in-situ. The generalization seems to be that if the in-situ wh-phrase is to the right of the finite verb, it cannot receive a question construal associated with that verb. We have seen this in (30) where the finite clause that contains the in-situ wh appears to the right of the matrix predicate. In (30), though, we cannot say whether it is finiteness or syntactic placement which is responsible for the lack of matrix wh-construal. This point is resolved by examining extraposed infinitival clauses: unlike non-extraposed infinitival clauses, these do not allow matrix construal of in-situ wh.

(31) wh-in-situ:
     a. Extraposed infinitival complement, complement wh-XP:
        *Ram-ne   caah-aa   [PRO  kis-ko   dekh-naa]
         Ram-ERG  want-PFV        who-ACC  see-INF
        ‘*Ram wanted to see who.’
     b. Extraposed infinitival complement, adjunct wh-XP:
        *Ram-ne   caah-aa   [PRO  gaaṛii  kaise  ṭhiik    kar-naa]
         Ram-ERG  want-PFV        car.F   how    correct  do-INF
        ‘*Ram wanted to fix the car how.’

In fact, rightward scrambled wh-phrases are by themselves unable to receive a normal question interpretation.

(32) Only echo/rhetorical reading:
     Ram-ne   kitaab  di-i        kis-ko
     Ram-ERG  book.F  give-PFV.F  who-DAT
     ‘Ram gave a book to WHO?’

Parallel data are noted for Bengali in Bayer 1996: 284–289. The clearest evidence for the role of directionality in the interpretation of wh-in-situ comes from a contrast between Hindi-Urdu and Bengali noted by Josef Bayer. Hindi-Urdu finite clauses are obligatorily extraposed; they are ungrammatical in a preverbal position. Bengali, however, permits finite clauses to appear both pre- and post-verbally. Preverbal finite clauses can only have a final complementizer bole while postverbal finite clauses can only have an initial complementizer. What we find then in Bengali is that in-situ wh in an embedded preverbal finite clause can have matrix construal but not in a postverbal finite clause.

(33) (Bengali, from Bayer 1996: 272–275)
a. Preverbal Complement Clause: wide scope possible
ora [[ke aS-be] (bole)] Sune-che
they who come-FUT.3 COMP hear.PST-3
Narrow Scope: ‘They have heard who will come.’
Wide Scope: ‘Who have they heard will come?’

Rajesh Bhatt

b. Postverbal Complement: wide scope impossible
ora Sune-che [ke aS-be]
they hear.PST-3 who come-FUT.3
Narrow Scope: ‘They have heard who will come.’
*Wide Scope: ‘Who have they heard will come?’
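The linear generalization illustrated in (32) and (33) can be restated as a toy procedure. The Python sketch below is purely expository and is not one of the analyses discussed in this section: the function name `available_construals` and the flat token representation are assumptions introduced here. It encodes only the descriptive claim that an in-situ wh-phrase can be construed with a finite verb if it linearly precedes that verb, and it ignores the selectional properties of the embedding predicate seen in (30).

```python
def available_construals(tokens, wh, verbs):
    """Return the clause labels at which an in-situ wh-word can take scope.

    tokens: the words of the sentence in surface order
    wh:     the in-situ wh-word (must occur in tokens)
    verbs:  dict mapping a clause label to that clause's finite verb

    Descriptive assumption only: scope at a clause is available
    if and only if the wh-word precedes that clause's finite verb.
    """
    wh_pos = tokens.index(wh)
    return [label for label, verb in verbs.items()
            if wh_pos < tokens.index(verb)]


# (33a) Bengali, preverbal complement clause
preverbal = ["ora", "ke", "aS-be", "bole", "Sune-che"]
# (33b) Bengali, postverbal complement clause
postverbal = ["ora", "Sune-che", "ke", "aS-be"]
# (32) Hindi-Urdu, rightward-scrambled wh-phrase
scrambled = ["Ram-ne", "kitaab", "di-i", "kis-ko"]

bengali_verbs = {"embedded": "aS-be", "matrix": "Sune-che"}

print(available_construals(preverbal, "ke", bengali_verbs))
print(available_construals(postverbal, "ke", bengali_verbs))
print(available_construals(scrambled, "kis-ko", {"matrix": "di-i"}))
```

On this sketch, (33a) allows both embedded and matrix construal, (33b) allows only embedded construal, and (32) allows no ordinary question construal at all, which matches the judgments reported above.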

Similar facts obtain for Marathi (Wali 1988).

5.2.2.5.2. Overt movement

We have seen in (27) that wh-phrases can be fronted, though this is not obligatory and is in fact a dispreferred option. However, in configurations where in-situ wh does not receive matrix construal, wh-fronting becomes necessary to achieve a matrix question interpretation.

(34) a. (Bengali, from Bayer 1996: 297)
tumi [ki OSukh-e]_i bhab-cho [CP je ram t_i mara ge-che]?
you which illness-LOC think-2 COMP Ram die GO.PST-3
‘Of which illness do you think that Ram died?’

b. (Hindi, from Srivastav 1991a)
kaun_i tum soc-te ho [ki t_i aa-egaa]?
who you.PL think-HAB.PL.M be.PRS.2PL that come-FUT.3SG.M
‘Who do you think will come?’

This is also the case with extraposed infinitival clauses. The missing wide scope construal in (31) becomes accessible if the in-situ wh-phrases are overtly moved into the matrix clause.

5.2.2.5.3. Scope marking

Wh-fronting is a possible strategy in Hindi-Urdu for forming long distance questions but it is not the most natural strategy. In some other Indo-Aryan languages, this strategy is strongly dispreferred or restricted — Panjabi (Bhatia 1993) and Marathi (Wali 1988). Often speakers rephrase the question to avoid an extraction out of a finite clause. They also use the scope-marking strategy which achieves matrix scope without overtly moving the in-situ wh out of a finite clause (Dayal 1994, 1996, 2002, Lahiri 2002, Mahajan 1990, 2002, Manetta 2010).

(35) a. jaan ‘know’:
wajaahat kyaa jaan-taa hai [ki riimaa kis-ko pasand kar-tii hai]
Wajahat.M WHAT know-HAB.SG.M be.PRS.SG that Rima.F who-ACC like do-HAB.F be.PRS.SG
Matrix Question: ‘Who does Wajahat know Rima likes?’
(Literally: What does Wajahat know, who does Rima like?)

b. maan ‘believe’ takes only declarative complements:
wajaahat kyaa maan-taa hai [ki riimaa kis-ko pasand kar-tii hai]
Wajahat.M WHAT believe-HAB.SG.M be.PRS.SG that Rima.F who-ACC like do-HAB.F be.PRS.SG
Matrix Question: ‘Who does Wajahat believe Rima likes?’
(Literally: What does Wajahat believe, who does Rima like?)

In the scope marking strategy, a WHAT appears in the matrix clause where we want the in-situ wh to take scope while the wh-expression being questioned stays in situ. The WHAT can be thought of as marking the scope of the in-situ wh, as its presence is what makes the matrix construal possible.

5.2.2.5.4. Analytical choices

5.2.2.5.4.1. No wh-movement

The major analytical choice in the analysis of questions in Hindi-Urdu (and other Indo-Aryan languages except Kashmiri) is how wh-in-situ is to be analyzed. Mahajan (1990) and Dayal (1996) take the surface appearance of wh-in-situ to reflect the reality, i.e. they assume that there is no obligatory overt movement of wh-phrases to a designated structural position. For them, wh-phrases will appear in the same surface positions as their corresponding non-wh-phrase. This is largely true with two major exceptions. The first is that rather than undergo fronting, wh-phrases are felt to be most natural in the immediately preverbal position, even when that is not the canonical position for the corresponding non-wh-phrase. The second is the environments where in-situ wh cannot receive matrix construal. In such environments, fronting is obligatory for matrix construal.

Mahajan (1990) and Dayal (1996) propose that the relationship between the complementizer associated with question meaning and the wh-phrase is established via QR at LF. In contemporary terms, one could also explore establishing this relationship via the Agree operation without appealing to movement of the wh-phrase at any level. The restrictions on in-situ wh receive an explanation in terms of restrictions on QR. For the finite-clause cases, a proposal that was entertained was that finite clauses are scope islands and hence islands for QR. However, as we have seen, finiteness is actually not the determinant of the possibility of


matrix construal. The real determinant is the location of the wh-word with respect to the verb of the question. Mahajan (1990) assumes that postverbal clauses are extraposed clauses and hence, by a pattern seen elsewhere, they are islands for extraction. Consequently an in-situ wh-phrase in an extraposed clause cannot QR into the matrix clause, and matrix-wh construal is impossible. It is, however, possible to overtly extract out of finite clauses. Mahajan’s (1990) explanation for why this is possible goes as follows. Finite clauses in Hindi-Urdu are generated in a preverbal argument position from which they are obligatorily extraposed. In the argument position, the clause is not an island for extraction and this is when extraction takes place. Extraposition takes place after overt movement of the wh-phrase has taken place. Covert movement would have to take place after extraposition, but extraposition creates an island, blocking further movement.

Bayer (1995) criticizes Mahajan’s (1990) explanation in the context of the corresponding German data. He notes that if we can use ordering to get out of an extraposed clause in this case, then what would stop us from voiding freezing effects more generally? The following kinds of cases are at stake.

(36) a. John read a book about quasars yesterday.
b. What did John read a book about yesterday?
c. John read a book yesterday about quasars.
d. *What did John read a book yesterday about?

However, there is an important difference between the kinds of movements involved in the movement of the wh-phrase in English and Hindi-Urdu. Movement of wh-phrases in English is bona-fide wh-movement that targets [Spec,CP], while according to Mahajan the overt movement of wh-phrases in Hindi-Urdu is a scrambling movement that targets a lower position. The difference between Hindi-Urdu and English is then not surprising.

The remaining challenge for in-situ proposals is the fact that even unembedded postverbal wh-phrases do not allow for an ordinary question construal. Only rhetorical/echo question readings are available. The challenge posed by this restriction for the in-situ account was noted by Mahajan (1997: 209). Bhatt and Dayal (2007) present a treatment of rightward movement from which the restriction against postverbal placement of in-situ wh follows without assuming overt wh-movement. We now turn to overt movement approaches which bring a different perspective to the preverbal/postverbal asymmetry with respect to question construal.

5.2.2.5.4.2. Overt wh-movement

Simpson and Bhattacharya (2003) and Manetta (2010) tackle the challenge posed by the restriction on in-situ wh in postverbal positions by developing overt wh-movement accounts for Bengali and Hindi-Urdu respectively. The basic proposal is that the wh-phrase obligatorily and overtly moves to a distinguished specifier position associated with the verbal projection, but a position that is not left-peripheral. Further movement then takes place that typically disguises the obligatory and overt movement of the wh-phrase. Since specifier positions in Hindi-Urdu and Bengali, and presumably universally, are on the left of the head, we have a handle on why there might be a preverbal/postverbal asymmetry with respect to the placement of wh-phrases. Beyond this common core, Simpson and Bhattacharya (2003) and Manetta (2010) differ in assumptions and details, and so I will now consider their proposals separately.

Working within Kayne’s (1994) antisymmetry program, Simpson and Bhattacharya (2003) assume that Bengali is an SVO language with overt wh-movement to a specifier in the C-domain. To derive the default SOV order, they assume that various verbal dependents move to designated specifier positions for case and related motivations. Antisymmetry only allows for left specifiers, and so movement to these positions creates a verb-final order as long as we assume that verbal heads stay lower than these specifier positions.⁸ Finite complement clauses, being CPs, do not need case and hence do not need to move to a specifier position. In fact, je-clauses are blocked from appearing in preverbal positions. The postverbal positioning of finite clauses under this perspective is the base position. The preverbal position is the derived one. For a wh-phrase to get matrix question construal, the wh-phrase needs to overtly move to a specifier position in the C-domain. They assume a left periphery in which the default surface position of the subject is higher than the part of the C-domain associated with wh-movement. As a result, wh-movement in Bengali cannot be equated with wh-fronting. With these assumptions in place, if a wh-phrase appears inside a postverbal embedded clause, we know that it has not moved to a specifier position in the C-domain of the matrix clause.
Hence it cannot receive a matrix question construal.

(37) a. wh-phrase in postverbal CP: no movement → ungrammatical
*Subject ... C[+wh] ... V [CP ... wh-XP ...]
b. wh-phrase moves to matrix CP:
Subject wh-XP_i C[+wh] ... V [CP ... t_i ...]
c. wh-phrase pied-pipes embedded CP to matrix CP:
Subject [CP ... wh-XP ...]_i C[+wh] ... V t_i

⁸ The introduction of auxiliaries might bring in additional complications which I will not address here. See Bhatt & Dayal 2007 for relevant discussion in the context of related facts from Hindi-Urdu, which may or may not generalize to Bengali.

Next we can consider what happens to wh-phrases embedded inside preverbal finite complement clauses. Simpson and Bhattacharya (2003) argue that in these cases, the embedded wh-phrase moves locally within the embedded clause and then causes pied-piping of the entire clause to the C-domain. They note the precedent for clausal pied-piping analyses of wh-movement in treatments of Basque (Arregi 2003) and Quechua. Even though Simpson and Bhattacharya do not set out to handle the restriction on postverbal unembedded wh-phrases, their analysis could be extended to derive this restriction. Since antisymmetry does not allow for rightward movement, postverbal DPs would be derived by stranding. As long as the verb does not move past the C-domain, a postverbal positioning of a wh-phrase will indicate that the wh-phrase has not undergone wh-movement.

Manetta (2010) analyses Hindi-Urdu and Kashmiri wh-movement. She also assumes overt wh-movement but she does not assume an antisymmetric system. She assumes that all complement clauses in Hindi-Urdu are generated in a preverbal position but that finite clauses obligatorily get linearized postsyntactically at the right edge of their clause. Manetta (2010) differs from Simpson and Bhattacharya (2003) in the target of wh-movement. Simpson and Bhattacharya have wh-movement targeting the C-domain (in Bengali) while Manetta has wh-movement targeting the outer specifier of v (in Hindi-Urdu).⁹ In order to derive the tendency for wh-phrases to appear in the immediately pre-verbal position, she needs to assume that all vP-internal material scrambles to a position preceding the outer specifier of vP. The proposal is able to block matrix question construal for wh-phrases inside postverbal finite clauses. If a wh-phrase appears inside a postverbal finite clause, it means that it did not undergo wh-movement to the [Spec,vP] of the matrix verb. It follows that it cannot receive matrix construal. Manetta does not discuss the preverbal finite clauses of Bengali but she could straightforwardly adopt Simpson and Bhattacharya’s (2003) clausal pied-piping proposal. Handling the restriction on postverbal unembedded wh-phrases is not straightforward. Manetta (2012) explicitly allows for rightward movement of DPs. So she needs to have an explicit statement that wh-moved phrases cannot undergo rightward movement.
Since this is exactly what we would like to explain, it does not constitute a satisfactory explanation.

⁹ For Kashmiri, Manetta (2010) argues that wh-movement targets the C-domain.

5.2.2.5.4.3. The scope-marking construction

A number of intertwined questions are at stake in the analysis of the scope-marking construction; see (35) above. The first is whether the wh-phrase in the embedded clause is in the matrix clause at some level of representation: one class of analyses assumes that the wh-phrase moves into the matrix clause at LF, while another class of analyses assumes that the wh-phrase remains in the embedded clause at all levels. The second is whether the wh-phrase in the matrix clause, kyaa lit. ‘what’ in Hindi-Urdu, is semantically an expletive. The so-called direct dependency analyses (Davison 1984, Mahajan 2000) answer both these questions positively: the wh-phrase in the embedded clause moves to the matrix clause at LF and replaces the expletive kyaa. The so-called indirect dependency analyses (Dayal 1996, Lahiri 2002) answer both these questions negatively: the wh-phrase remains within the embedded clause and the kyaa in the matrix clause is semantically contentful; the relationship between the kyaa and the embedded clause is mediated in the semantics. Manetta’s (2010) analysis answers the first question in the negative but the second one positively: it treats kyaa as an expletive but gives matrix scope to the embedded wh via unselective binding; this allows the wh-phrase in the embedded clause to remain in the embedded clause at LF. The fourth combination is not attested in the literature, presumably because a semantically contentful kyaa cannot be replaced by another element at LF.

The state of the art in the analysis of the scope-marking construction has made salient two limiting conditions, one syntactic and one semantic. The first is that the complement clause in a scope-marking construction is tightly integrated into the embedding clause and in fact in the syntactic scope of all the elements in the embedding clause. This is also the case for ordinary complement clauses. This observation was originally made by Haider for German and then for Bengali by Bayer. Since then there is consensus that this is also the case for Hindi-Urdu. A consequence of this is that for the Hindi-Urdu construction, we cannot have analyses where the relationship between kyaa and the complement CP is entirely semantic. Dayal notes the existence of something we could see as the discourse counterpart of the scope marking construction.

(38) What do you think? Who does Mary like?

Here the relationship between the ‘what’ and the subsequent question is indeed entirely semantic. The point of the Haider/Bayer observation is that this kind of analysis cannot be directly extended to the Hindi-Urdu scope marking construction.
Subsequent analyses have taken this point seriously and have either treated the complement clause as an argument of the embedding predicate (Manetta 2010) or as an obligatorily extraposed modifier of kyaa (as explored in Lahiri 2002). The second limiting condition comes to us from Lahiri (2002). Lahiri shows that at LF, the restrictor of the wh-phrase in the embedded clause always takes low scope; the wh-phrase does not covertly move to the matrix clause at LF. This rules out direct dependency analyses for Hindi-Urdu but is agnostic between indirect dependency analyses and an analysis like Manetta’s (2010) where the matrix +Q C0 binds the wh-phrase, which stays in situ.

5.2.3. Generative approaches to Pashto syntax
By Taylor Roberts

With the exception of Persian and some research on Kurdish and Digor Ossetic, Pashto has been the best known Iranian language to receive attention in generative linguistics. Spoken in parts of Afghanistan and Pakistan by approximately 20 million people, Pashto has been noted for having an archaic vocabulary, and


the same may be said of its syntax, as the language contains many properties of interest to generative grammarians that other Indo-European languages have only inherited piecemeal. For example, Pashto contains such features as second-position clitics, verbal clitics, pro-drop (allowing for both null subjects and objects), complex predicates, ergativity, scrambling, external possession, dislocation, resumptive pronouns, psych predicates with varying subject markings, and split agreement, thus providing a rich testing ground for hypotheses within the Minimalist framework. Unfortunately, years of political instability in Pashto-speaking areas have resulted in few speakers being trained in generative grammar who might fully explore the theoretical promise of the language.

The dean of Pashto generative linguistics was Habibullah Tegey (1932–2005). His 1977 dissertation, The grammar of clitics, brought attention to the language’s second-position clitics, which continue to attract interest from linguists to the present. Following his 1979 article on ergativity, Tegey turned to more descriptive and pedagogical works, coauthoring with Barbara Robson a series of learning materials for the Center for Applied Linguistics in Washington, DC, which culminated in a reference grammar (1996) and glossary (1993) that stand as the fullest and most useful works on the language and are available for free download from the Education Resources Information Center (ERIC). In this later part of his career, Tegey was also senior editor of the Afghan language service at the Voice of America.

Pashto’s second-position clitics are illustrated below. As optional, sentence-initial items are removed, the clitics take as a host whatever other element appears initially. (Here and throughout, the second-position clitics are underlined.) The last sentence illustrates that, although the language is rigidly verb-final, the clitic’s requirement to have a host to its left is stronger:

(39) a. Khushal mee zyaati ne wah-i
Khoshal 1SG anymore NEG hit-PRS.3SG
‘Khoshal does not hit me anymore.’
b. zyaati mee ne wah-i
anymore 1SG NEG hit-PRS.3SG
‘He doesn’t hit me anymore.’
c. ne mee wah-i
NEG 1SG hit-PRS.3SG
‘He doesn’t hit me.’
d. wah-i mee
hit-PRS.3SG 1SG
‘He hits me.’
(Tegey 1977: 132)

The following sentence illustrates more specifically that the clitics follow not the first word, but rather the first constituent:

(40) [NP aagha sheel kalena danga aw khaaysta peeghla] mee nen byaa welida
that 20 year tall and pretty girl 1SG today again saw
‘I saw that twenty-year-old tall and pretty girl again today.’
(Tegey 1977: 83–84)
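The descriptive pattern in (39) and (40), in which the clitic follows the first constituent and not the first word, can be sketched as a small procedure. The Python code below is an expository toy and not Tegey's analysis or any later one: the function name `place_clitic` and the representation of a clause as a list of constituents (each a list of words) are assumptions made here, and the sketch deliberately ignores stress and prosody.

```python
def place_clitic(constituents, clitic):
    """Place a second-position clitic after the first constituent.

    constituents: list of constituents, each a list of words
    clitic:       the clitic to be placed

    Toy assumption: the host is simply the leftmost constituent,
    however many words it contains.
    """
    if not constituents:
        raise ValueError("a second-position clitic needs a host to its left")
    host, rest = constituents[0], constituents[1:]
    words = list(host) + [clitic]
    for constituent in rest:
        words.extend(constituent)
    return " ".join(words)


# (39a) and (39d): the host changes as initial items are removed
print(place_clitic([["Khushal"], ["zyaati"], ["ne"], ["wah-i"]], "mee"))
print(place_clitic([["wah-i"]], "mee"))

# (40): a seven-word NP counts as a single host
big_np = ["aagha", "sheel", "kalena", "danga", "aw", "khaaysta", "peeghla"]
print(place_clitic([big_np, ["nen"], ["byaa"], ["welida"]], "mee"))
```

Because the host is a whole constituent, the same function yields the order in (39a), the verb-initial order in (39d), and the placement after the full noun phrase in (40).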

Certain verbs such as akhistel ‘to buy, to take’, which have variable stress placement in their imperfective forms, may actually be divided by a clitic when stressed initially:

(41) a. a-khistéle mee
PREF-buy 1SG
‘I was buying them’
b. á mee khistele
PREF 1SG buy
‘I was buying them’
(Tegey 1977: 89)

Such sentences pose a problem to as basic a notion as what constitutes a “word”, and David (2009) notes the strong plausibility that analogy extended this unusual behavior to other verbs in Pashto. Tegey 1977 was the first major work to confront Pashto from a generative perspective. Generative syntax (including Minimalism, Principles & Parameters, Transformational Grammar, etc.) works from the assumption that an invariant Universal Grammar (UG) underlies all languages, with the source of variation residing in the lexicon — not only in the phonological shapes of words, but in the syntactic features that are projected in the derivation of sentences that may (or may not) drive such processes as movement or agreement, which in turn give rise to the surface dissimilarity of languages. When Tegey initially tackled these constructions, this constraining aspect of generative syntax was in its infancy, which allowed him to offer explanations that would not be so readily acceptable today. For example, Tegey (1977: 122) suggested that second-position clitics ‘are placed after the first major surface constituent that bears at least one main stress — where “major constituent” may be directly dominated by S, VP, or V.’ Such a statement, while descriptively adequate, conflates syntactic and phonological processes in a way that would be discouraged in later versions of syntactic theory, in which syntactic processes are strictly ordered before phonological ones. Moreover, due to some unexpected interactions between processes of vowel coalescence and clitic placement, Tegey was led to conclude that the architecture of the grammar itself must be reorganized to accommodate the hitherto unobserved facts of Pashto, a conclusion that practitioners of generative syntax today would avoid. An additional theoretical gap that existed in 1977 was that empty pronominals (pro) had not been widely discussed — and as Pashto has pro-drop in both subject


and object positions, it was surely difficult to label the syntactic structure of the most basic sentences. For example, in contrast to strong pronouns in English, which appear in fixed positions nearly identical to those of full NPs, seemingly mobile pronominal clitics present a great challenge whose details (while taken up enthusiastically today by dozens of researchers) were somewhat overlooked in the 1970s with the greater goal of at least describing phenomena that Anglocentric generativists had until then avoided. The positioning of clitics in all languages requires reference to syntactic, morphological, and phonological/prosodic features, and it is precisely in their variation that their great interest lies. Again, this broader perspective has emerged after decades of research into many languages. Kaisse (1981) was the first to take up Tegey’s challenge, by astutely noting that some apparently monomorphemic verbs such as akhistel ‘to buy, to take’ above should in fact be regarded as historically polymorphemic with synchronically opaque morphology (akin to such English words as permit, remit, transmit, commit, compel, concur, recur, transfer) and separability (such as the English prefixes pre- and post-, or pro- and anti-). By assuming this perspective, Pashto becomes less troubling for theorists, as is to be desired from the theory itself. However, neither Tegey nor Kaisse explicitly considered whether clitic placement is primarily syntactic or phonological. While both authors often seem to assume that it is syntactic movement, neither addressed what positions clitics move to or from. Hock (1992) diverged from predecessors by advocating strongly for a primarily prosodic approach to clitic placement. Similarly, van der Leeuw (1995; 1997) endeavoured to have clitics placed primarily by the phonological component of the grammar. 
Roberts (1997) attempted a similar analysis within Optimality Theory, noting that its basic approach — that grammatical constraints are violable under the force of more highly ranked constraints — can be fruitfully applied to clitics, as descriptions of their placement often require disclaimers (e.g., ‘place a clitic after the leftmost constituent unless it is not stressed’). Roberts’s OT approach sought to formalize some of Hock’s suggestions by having syntactic maximal projections (e.g. NP) derive prosodic phrase boundaries, which could then be manipulated late in the derivation (in the phonology, i.e. after Spell Out); a challenge for this approach, however, remained the troublesome separable-prefix words such as akhistel ‘to buy, to take’ (as well as simple coordinated phrases as illustrated above, which ought not to be so troublesome). All of the researchers hitherto mentioned had concerned themselves mainly with defining what “second position” is in Pashto in order to explain the surface positions of the clitics, but none had yet closely addressed the syntactic properties of the clitics, an oversight that gradually became clearer as the relevant empty categories and projections in better studied languages became more widely scrutinized by generative syntacticians. With the great help of Babrakzai (1999),


a descriptive grammar that provided much additional data and insight to supplement Tegey & Robson 1996, and University of Arizona linguist Jan Mohammad as principal consultant, Roberts (2000) was the first work to investigate such issues as whether Pashto’s pronominal clitics are arguments of the verb, or agreement morphemes, and whether they are base-generated (merged) in (or moved to) their surface positions. In order to address these questions, though, it first became necessary to propose a phrase structure for the language by establishing such basic properties as the head directionality of phrases. Probing the basic structure of the language in Minimalist terms had the result of bringing to light some grammatical phenomena that otherwise might not have been uncovered within other theories (although this should not be taken to demonstrate that Minimalism is superior to other theories, as the descriptive results stand on their own and require explanation within any grammatical theory). At first glance, Pashto’s verb-final sentences would suggest that phrases are head-final, but Roberts found that headedness is not uniform: It is specifically the lexical categories NP and VP that are head-final, while many functional categories such as PP, NegP, CP, and DP are head-initial (a split that also obtains in Persian, according to Dehdari 2006). Additionally, while developing a structure for Pashto’s complex predicates (which are common in Indo-Iranian, and which in Pashto comprise a noun or adjective followed by a transitive or intransitive auxiliary verb), a process of split agreement came to light. Until Roberts 2000, Pashto had seemed unlike Hindi/Urdu in defining its ergative split on tense, rather than aspect, but it was learned that the agreement of the first constituent of the compound verb has aspect-driven ergativity, while the second constituent of the compound verb shows the usual tense-driven ergativity. 
Again, accounting for this variation within a generative phrase structure offers great potential not only for the theory of grammar itself, but for understanding the historical development of the modern Indo-Iranian languages. With a more articulated set of functional projections, Roberts adduced evidence that Pashto’s pronominal clitics are not arguments of the verb, but rather agreement morphemes that identify empty pronominals; as such, they are merged in functional projections that are structurally high, with the result that clitics are often generated directly in their surface positions, i.e., they do not depend on any type of clitic-moving rule. Similarly, the adverbial clitics (which cooccur in second position) are likewise merged in high projections, thereby explaining why they appear alongside the pronominal clitics in a cluster, and eliminating the need for a template to derive the correct ordering of clitics with respect to each other. Roberts proposed that it is not the clitics that move, but rather the nominals (in the relevant sentences) via an independently required process of scrambling. Although movement of clitics in certain configurations seemed to depend on prosodic properties, such operations were strictly post-syntactic, crucially not requiring that the architecture of the grammar be modified.


Roberts (2000) also closely examined some properties of interest within Minimalism, such as resumption and dislocation, as well as the ambiguity that can occur in the interpretation of pronominal clitics (as arguments of the verb or as possessors of full nominals), as illustrated below:

(42) plaar mee dee léeg-i
father 1SG 2SG send-PRS.3SG
‘My father is sending you’ or ‘Your father is sending me’
(Tegey & Robson 1996: 175)

The separation of the possessive clitic from its possessee has the appearance of possessor raising (also called external possession) in other languages. However, a close comparison of this phenomenon in Pashto to similar phenomena in such languages as Kurdish, Hebrew, French, Spanish, and Japanese, revealed that the constructions are not comparable, and so Roberts named the Pashto phenomenon “possessor dislocation”, in order to distinguish it from the superficially similar constructions of other languages.

Pashto’s second-position clitics are not only worthy of study in themselves, but their placement has proven useful as a diagnostic in clarifying other, apparently unrelated syntactic properties of the language. Several researchers since Tegey (1977) have thought that ergatives could not function as heads of relative clauses, since such clauses are obligatorily left-dislocated, leaving a coreferential clitic in the base clause, as well as requiring that the head of the relative clause not appear in ergative (oblique) case, but rather be “demoted” to absolutive (direct) case.

(43) a. hagha ndzheley [tshee de tor sare wlaarr-e we], maassaam mee sinamaa te byaay-i
that girl.ABS COMP with Tor with standing-SG.F be.PST.3SG.F evening 1SG.ACC cinema to take.PRS.1PFV-3
‘That girl who is standing with Tor is taking me to the movies this evening.’
b. *hagha ndzheley [tshee de tor sare wlaarr-e we] mee maassaam sinamaa te byaay-i
that girl.ABS COMP with Tor with standing-SG.F be.PST.3SG.F 1SG.ACC evening cinema to take.PRS.1PFV-3
(Pate 2012: 70 citing Tegey 1977: 128)

Due to the case of the head noun most obviously being in absolutive form (rather than the expected ergative), and an obligatory ergative resumptive clitic in the main clause that seemingly appears in “third” position, most researchers have failed to note that the restriction only obtains when the relative clause appears clause-initially. Using the appearance and placement of second-position clitics, Pate (2012: § 4.5) successfully demonstrated that Pashto remarkably in fact forbids clause-initial complex DPs more generally, regardless of whether they are ergative, absolutive, nominative, or accusative (the only exception being when they are selected by a control predicate such as ghwosstel ‘to want’). The obligatory left-dislocation from a coreferential clitic in the matrix clause bears resemblance to Hanging Topic Left Dislocation in Germanic, although the precise loci of variation (and whether there may be a common Indo-European source) remain to be investigated. Here, the binding principles familiar from Minimalism could be useful diagnostics for clarification.

Pashto continues to attract interest in Minimalist typological work (e.g. Neeleman & Szendrői 2007 and Larson 2009), as well as within Head-driven Phrase Structure Grammar (Kopris & Davis 2005; Dost 2007) and Lexical Functional Grammar (Bögel 2010). However, many aspects of Pashto syntax remain poorly studied. Despite the research of many years, the language remains a rich area for the study of clitics generally, as it has second-position clitics as well as dative clitics that attach to the verb. The former generally do not double an overt NP, while the latter do, and this discrepancy poses a great problem for a unified analysis (although this state of affairs becomes less frustrating if the seeming discrepancy is caused merely by linguists inappropriately using the general term “clitic” to describe two kinds of morphemes that should, in fact, be distinguished).

Surprisingly, the areas of Pashto syntax that require further study are also the most basic, falling under the general categories of case and word order. For example, although Pashto has been described as SOV, OSV is also possible. Similarly in ditransitives, any order of arguments is possible (at least for some speakers and/or dialects), as long as the verb remains final. These alternations should be systematically studied with reference to such properties as topic and focus in order to better understand how discourse affects word order and prosody.
Any such conclusions would also better inform an analysis of clitics, as they would help to isolate more clearly how each of syntax and prosody determines their placement. Another understudied topic is anaphora. It would be worthwhile to determine the distribution of reflexive pronouns, as they may differ interestingly from those in more familiar languages. The full (compound) reflexive pronoun is khpel dzhan ‘one self’, of which the first possessive element often may be omitted. Furthermore, that first possessive element, khpel, may serve as a possessive reflexive on its own (where English would simply employ the usual possessive pronoun ‘his’). Again, having a solid understanding of word order would be essential before proceeding with anaphora, but this would be highly desirable in order to establish the behaviour of Pashto with regard to binding principles, which would then facilitate comparison to better studied languages. There is also a need to better understand what kinds of phrases may occupy subject position. Psych-predicates such as ‘like’ and ‘think’ select a possessive subject, which comprises the possessive preposition dee/de and its oblique-marked complement (Tegey & Robson 1996: 184–188), while other predicates require

Taylor Roberts

their subjects to appear as complements of locative, dative, or ablative adpositions (Babrakzai 1999: Chapter 7). As illustrated below, the subject’s degree of volition may be indicated by these varying ways of marking the subject:

(44) a. de    laylaa  delta  pinze  kaala      teer    shw-el
        POSS  Layla   here   five   years.DIR  passed  became-3PL.M
     b. pe    laylaa  baandee  delta  pinze  kaala      teer    shw-el
        LOC   Layla   on       here   five   years.DIR  passed  became-3PL.M
     c. laylaa     delta  pinze  kaala      teer    krr-el
        Layla.OBL  here   five   years.DIR  passed  did-3PL.M
        ‘Layla spent five years here’ (Babrakzai 1999: 179–180)

Although all of these sentences have the same literal meaning, the most neutral interpretation occurs with the possessive-marked subject (44a). In (44b), the subject is flanked by the locative circumposition pe … baandee, and the sentence implies that Layla had no choice in her stay. In contrast, the verb in (44c) is transitive, and the subject is a bare NP (although still marked oblique because of past-tense ergativity); in this sentence, Layla is interpreted as having exercised volition in her stay, deliberately spending five years in one place.

An ever-present complication in establishing any of the above properties in Pashto is ergativity, which is similar but not identical to that found in Hindi. Ergativity is problematic generally for creating a unified analysis of case assignment and agreement, and so the challenges posed by Pashto are similar to those in better studied languages; nevertheless, it remains an open question how and why the languages differ in their ergative pivots. While there has not been any evidence that ergativity affects word order or binding in Pashto, those facts must nevertheless be predictable from whatever phrase structure representation is proposed for ergative structures.

Finally, in recent years, the collaborative possibilities of the Internet have resulted in Pashto’s syntactic properties (with illustrative examples) being entered by Pate and Roberts into a freely available database, Syntactic Structures of the World’s Languages (http://sswl.railsplayground.net/). The database, created by Chris Collins and Richard Kayne and funded by the National Science Foundation, presents typological properties in a standardized, descriptive format, and can only help to increase interest in Pashto (and other poorly studied languages) among linguists of varied theoretical backgrounds.

5.3. Cognitive Linguistics
By Bhuvana Narasimhan

5.3.1. Introduction

Cognitive Linguistics involves studying language as it relates to cognitive structures and mechanisms outside of language. Croft and Cruse (2004: 1–4) outline three main guiding principles of the cognitive linguistic approach to language.10 First, language is seen, not as an autonomous module, but as a form of knowledge whose representation and processing is not significantly different from that of other kinds of nonlinguistic knowledge. A number of general cognitive capacities such as analogy, recursion, viewpoint and perspective, figure-ground organization, and conceptual integration (Fauconnier & Turner 1998) are involved in both linguistic and nonlinguistic processing. Second, knowledge of language, including grammatical inflections and constructions, involves the conceptualization of experience in ways that go beyond establishing a simple truth-conditional correspondence with the world (Jackendoff 2011). Importantly, it includes the ability to project metaphorically from structures in the physical domain to structures in the abstract domain (Lakoff 1987). Third, knowledge of language emerges from language use. While language users construct abstract representations by generalizing from specific utterances produced in specific contexts of use, they nonetheless retain highly detailed knowledge of idiomatic and semiproductive patterns in language. In the following, I elaborate on each of these principles in further detail, and discuss relevant research in South Asian languages.

10 I would like to thank Cecily Jill Duffield for her help in compiling the bibliographic references that were used as the basis for this report.

5.3.1.1. Linguistic and nonlinguistic knowledge are not different

Since linguistic knowledge is viewed as an integral part of general cognition and thinking (Ibarretxe-Antuñano 2004), cognitive linguistic research is highly interdisciplinary, drawing on research in synchronic and diachronic linguistics, language acquisition and processing, cognitive anthropology, computational linguistics, and cognitive psychology.
In particular, psychological models of memory, perception, attention, and categorization have played an important role in inspiring theorizing about the structure of language. For instance, research on the role of prototypes in categorization (Rosch 1973, 1977, 1978, 1983; Rosch & Mervis 1975; Mervis & Rosch 1981) has demonstrated that categories have a graded structure such that some members are more typical of the category than others. Category members share a family resemblance
with one another and typically share some properties, but there is no core set of definitional properties that is necessary and sufficient for category membership. These ideas have inspired a body of research investigating the prototype structure of linguistic categories (see Taylor 1995 for an overview). As an example, patterns of polysemy in the preposition over can be characterized in terms of a more prototypical (central) spatial meaning (e.g. The picture is over the mantelpiece) that is metaphorically related to a less prototypical (peripheral) meaning (e.g. Jane has a strange power over him) having to do with control (Brugman & Lakoff 1988, see also Evans, Bergen & Zinken 2007).

Concepts having to do with attention and perception from gestalt psychology (Wertheimer 1923; Koffka 1935), such as figure-ground organization, have also inspired accounts of linguistic construal and profiling (e.g. Talmy 1988, 2000; Langacker 1991). The figure-ground asymmetry has to do with humans’ tendency to perceive one aspect of a situation as the figure or foreground and the rest as the ground or background. Talmy extends this notion to ‘the pervasive system by which language establishes one concept as a reference point or anchor for another concept’ (2000: 311). For instance, in an event of motion or location, the smaller, moveable object is assigned the role of figure, while the larger, more stationary object is construed as a reference entity with respect to which the figure’s path, site, or orientation is characterized. This asymmetry is taken to be responsible for the oddity of sentences such as the house is near the bike (versus the bike is near the house) wherein the figure role is assigned to the house, which possesses the characteristics typically associated with the ground.

Psychological models of memory have also inspired work on knowledge representation in terms of frames and domains (Croft & Cruse 2004).
Semantic frames are knowledge structures stored in long-term memory consisting of assemblages of conceptual elements that characterize culturally salient scenarios. These conceptual elements cannot be understood independently of the frame with which they are associated (Fillmore 1982; see discussion in Evans, Bergen & Zinken 2007). For example, in order to understand the word buy, one would have to evoke the semantic frame of commercial transfer within which the lexical concept is embedded, and which involves a seller, a buyer, goods, money, and so on.

Research in Cognitive Linguistics thus suggests that an intimate relationship exists between language on the one hand, and culture and cognition on the other. Although the specific topics of research discussed above (e.g. prototype structure of categories or the use of culturally salient semantic frames) have not been investigated in any detail in any of the South Asian languages, there is a growing body of work investigating how linguistic patterns correlate with cultural practices and cognitive behaviour. Research by Bickel (1999a) demonstrates that a strong parallelism exists between spatial language and cultural practices in Belhare (Tibeto-Burman). For instance, the importance given to the UP-DOWN-ACROSS axis that is prominent in the environment and geography of this Himalayan culture
demonstrates that ‘both linguistic and nonlinguistic cultural practice draw upon the same cognitive background’. In a study of speakers of Tamil, Pederson (1995) shows that habitual encoding patterns in speakers of a language correlate with how spatial arrays are encoded in nonlinguistic tasks. The study focuses on speakers from two different subcommunities who vary in how they habitually encode spatial arrays in Tamil (e.g. using a speaker-based frame of reference, the cup is to the LEFT of the saucer, or an environmentally-based frame of reference, the cup is to the NORTH of the saucer). When asked to perform nonlinguistic tasks that involved mentally encoding spatial arrays, Tamil speakers from the two communities employed the spatial frame of reference that they also used to linguistically encode such arrays, a finding that has been taken to support the linguistic relativity hypothesis.

On the other hand, research conducted with children acquiring Hindi (Indo-Aryan), Nepali (Indo-Aryan), and Newari (Tibeto-Burman), among others, suggests that the correlation between linguistic behavior and performance on nonlinguistic spatial tasks is not strong, and is mediated by other factors, such as the demands of the specific task in which participants engage (Mishra, Dasen & Niraula 2003). A similar absence of strong linguistic-relativity effects was reported in a comparison of linguistic and nonlinguistic motion-event encoding in 17 genetically and typologically diverse languages, including Hindi and Tamil (Bohnemeyer, Eisenbeiss & Narasimhan 2006). However, a study on monolingual and bilingual members of the Kond tribe in Orissa, conducted by Mohanty and Babu (1983), demonstrates the influence of language on cognition on a somewhat different level.
Bilingual children who were fluent in Kui (Dravidian) and Oriya (Indo-Aryan) showed higher levels of metalinguistic ability than monolinguals who only spoke Oriya, even when the effects of age, socio-economic status, intelligence, educational experience, and cultural background were taken into account. The studies reported here attest to the variety of topics and languages that are beginning to be investigated in order to elucidate the intricate relationship between language, culture, and mind. But further crosslinguistic and crosscultural comparative research needs to be done so that we can address these complex issues in a more systematic manner.

5.3.1.2. Conceptualization is involved at different levels of linguistic structure

In addition to the ‘Cognitive Commitment’ (Lakoff 1987) that leads cognitive linguists to link language with other cognitive faculties, the second major guiding principle of this approach is a focus on conceptualization. Thus, a fundamental aspect of human cognition is taken to involve ‘the ability to form structured conceptualizations with multiple levels of organization, to conceive of a situation at varying levels of abstraction, to establish correspondences between facets of different structures, and to construe the same situation in alternate ways’ (Langacker 1987, 1991, cited in Fauconnier 2003). According to Croft and Cruse (2004: 40)
‘all aspects of the grammatical expression of a situation involve conceptualization in one way or another, including inflectional and derivational morphology and even the basic parts of speech.’ Thus alternative construals can be conveyed by truth-conditionally equivalent expressions at lexical (e.g. spend versus waste) as well as phrasal or clausal levels (e.g. the use of a presentational vs. a declarative construction as in There was Sam sitting on the floor versus Sam was sitting on the floor, 2004: 41). For many cognitive linguists, concepts are grounded in our bodily, physical, social, and cultural experiences and the neural structures that give rise to them (Fauconnier 2003; Ibarretxe-Antuñano 2004), a notion termed “embodiment” (Johnson 1987; Lakoff 1987; Lakoff & Johnson 1980, 1999). A critical insight within this approach is that figurative aspects of language such as metaphor and metonymy are not in the ‘rhetorical periphery of language’ but are conceptual mappings that form the core of human thought (Fauconnier 2003). In metaphor theory (Lakoff & Johnson 1980), source domains structure target domains by means of metaphorical mappings that constitute the mechanism whereby embodied experiences form the basis for abstract reasoning (e.g. ‘states are locations’, ‘causes are forces’, ‘purposes are destinations’). Much of the experimental and descriptive work investigating semantic patterns in South Asian languages is relevant to our understanding of conceptualization in language. One of the pioneering studies in this regard is that of Sridhar (1988), who investigates how grammatical structure is influenced by the ways in which we perceive and organize nonlinguistic information. For instance, he hypothesizes that an important aspect of our conceptualization abilities (perceiving figure-ground asymmetry) influences word order and grammatical structure during online sentence production. 
Speakers of 10 different languages, including Kannada (Dravidian), described visual arrays in which the perceptual salience of the objects in the array was manipulated (e.g. by changing the shape, size, or orientation of the objects). His research reveals that perceptual factors, as well as saliency and pragmatic factors, influence aspects of sentence production in Kannada as well as other languages. In more recent research, B. Narasimhan (2007) employs an elicited production task to compare the encoding of events of cutting and breaking in Hindi and Tamil, focusing on the kinds of factors that play a role in speakers’ verb choice (physical properties of the figure and the ground, use of instruments to perform the action, etc.). A similar methodology was employed to investigate the encoding of placement and removal events in speakers of Hindi and Tamil (B. Narasimhan 2012) and the Indo-Aryan language Kalasha (Petersen 2012), and to investigate the categorization of parts of the body in Panjabi (Majid 2006). These studies form a part of large-scale surveys designed to systematically investigate the structure of semantic fields across languages within the growing field of “semantic typology” (Levinson 2012; Majid, Enfield & van Staden 2006; Majid & Bowerman 2007;

Kopecka & Narasimhan 2012). Hence they also contribute to an understanding of how event and object categorization in South Asian languages relates to categorization patterns in areally, genetically, and typologically unrelated languages. Other work focusing on spatial and event semantics includes an investigation of locative expressions in Sinhala (Tilakaratne 1992) and the study of event realization in Tamil (Pederson 2007). In research conducted within the framework of Conceptual Metaphor Theory (Lakoff & Johnson 1980, 1999), Shinohara and Pardeshi (2011) examine metaphorical extensions of spatial metaphors to the temporal domain in Japanese and Marathi. They show that spatial terms such as Marathi puḍhe ‘in front’ and māge ‘behind’ are mapped onto the temporal notions EARLIER and LATER respectively. But in some contexts, the same terms exhibit the reverse mapping, viz. when they are used with positional terms (i.e., named members of cyclical events, e.g., summer/autumn/winter/spring). Other work conducted within a cognitive-functional framework includes an investigation of postural verbs in Marathi (Pardeshi, Horie & Sato 2010) and the semantics of complex predicates in East and South Asian languages (Hook, Pardeshi & Liang 2012).

South Asian (psycho-)linguistic research has also focused on the kinds of meanings associated with grammatical structures and forms. Early work includes research by Saxena (1979) investigating how notions such as “affected” agent play a role in the interpretation of causative verbs in Hindi. More recently, Bashir (1999) explores new functions that are emerging for the ergative postposition ne in Pakistani Urdu, proposing that ne is being regrammaticised as a marker for the semantic category SOURCE. Ahmed (2006) accounts for the diverse uses of the accusative/dative postposition ko in Urdu as originating from metaphorical extensions of a core spatial meaning.
Focusing on the interpretation of perfectives in Hindi, Arunachalam and Kothari (2012) experimentally examine speakers’ interpretations of simple and complex perfective verb forms that describe change-of-state events in which the change of state was fully achieved or not fully achieved. They find that verb semantics does not suffice to account for participants’ interpretations of such forms in Hindi, and suggest that real-world contextual knowledge also plays a role. In a study of proximal and non-proximal deictic forms in Tamil oral narratives, Herring (1994) finds that semantic factors such as ‘physical proximity to the speaking ego’ do not suffice to account for speakers’ patterns of use, but additional metaphorical and textual functions also need to be taken into account. The research reported in this section ranges from psycholinguistic experiments and corpus studies to traditional linguistic research that relies on native speaker judgments. The studies also reflect a range of topics pertinent to the issue of how conceptual structures are encoded in language. Yet they constitute only a small cross-section of descriptive semantic work on South Asian languages. Other relevant research in South Asian languages appears in a variety of venues which will be briefly discussed in section 5.3.2.

5.3.1.3. Language structure emerges from language use

The notion that language structure emerges from language use is a third core principle guiding the Cognitive Linguistic approach identified by Croft and Cruse (2004). Proponents of constructional and usage-based approaches to grammar (e.g. Langacker 1987; Fillmore 1989; Bybee 1995; Goldberg 1995; Croft 2001) posit that information derived from specific utterances in particular communicative contexts of use is retained as part of one’s grammatical representation, and coexists with more schematic representations that are formed by abstracting over specific instances. As formulated by Tomasello, ‘when people repeatedly say “similar” things in “similar” situations, what may emerge over time is a pattern of language use, schematized in the minds of users as one or another kind of linguistic category or construction – with different kinds of abstractions’ (2006: 3). There is no distinction between meaningful words and abstract, formal rules; rather, there is a continuum of meaningful constructions that ranges from words and phrases to clauses and sentences. Thus, for example, the pattern ‘X Verbed Y the Z’ signals (metaphorical) transfer of possession (Tomasello 2006), and the pattern ‘The X-er the Y-er’ is used to signal comparison using a two-part structure with definite noun phrases (Fillmore, Kay & O’Connor 1988).

This view of grammatical knowledge has not only given rise to an increasingly prominent body of research in adult language, but has, in particular, influenced the field of developmental psycholinguistics. Since Cognitive Linguistics is interested in the nature of ‘stored linguistic experience’ and its use in communication, it is of interest to investigate how that linguistic experience is built up and used (Tomasello 2000: 77).
In the constructional approach, children are thought to acquire a first language by first learning concrete words, phrases, and utterance schemas that they frequently encounter in the input. Children’s early linguistic competence is thus organized, not by distinct kinds of representations for words versus rules, but by an inventory of item-based constructions, spanning words as well as phrases and clauses, that form the basis for the generalization of more abstract schemata. There is relatively little research within a usage-based or constructional approach in South Asian languages, but studies of adult and child language that adopt this perspective do exist and are increasing in number. The pioneering work of R. Narasimhan and colleagues on Tamil language acquisition, although not explicitly formulated within a Cognitive Linguistics framework, provides a pragmatically based account of the development of forms and their functions in early child language that also takes into account the kind of input that caregivers provide to children (R. Narasimhan 1998; R. Narasimhan & Vaidyanathan 1984; Vaidyanathan 1988, 1991). In more recent work, B. Narasimhan (2003) accounts for differences between Hindi and English in the syntactic co-occurrence patterns of motion verbs in adult

language in terms of crosslinguistic differences in the availability of a “directed motion construction”. Subsequent research within a usage-based perspective investigates the acquisition of verbs and syntactic frames in children acquiring Hindi (Budwig, B. Narasimhan & Srivastava 2006; Srivastava 2008) as well as Tamil (Sethuraman, Laakso & Smith 2011). And as part of a larger project on the study of adult and child language in Chintang (a Tibeto-Burman language spoken in Eastern Nepal), Stoll and colleagues investigate patterns in the use of nouns and verbs in child-caregiver interactions (Stoll et al. 2011). Other work relevant to the emergence of language and its relationship to cognitive development includes research on the acquisition of ergative case-marking in children acquiring Hindi (B. Narasimhan 2005), the use of caused posture verbs in children acquiring Dutch and Tamil (B. Narasimhan & Gullberg 2011), and the comprehension of mental state terms in children acquiring Oriya (Patnaik & Babu 2001).

5.3.2. Cognitive Linguistics and South Asian languages: Future directions

In section 5.3.1, I provided a broad overview of recent as well as earlier research on South Asian languages that addresses topics of interest from a Cognitive Linguistics perspective. However, this listing is not exhaustive, and the interested reader is directed to additional research on these languages reported in venues such as Studies in the Linguistic Sciences (published by the Linguistics Department, University of Illinois, Urbana-Champaign), Indian Linguistics (published by the Linguistic Society of India), and Psycho-Lingua (published by the Psycholinguistic Association of India), as well as more theoretically oriented journals such as the Journal of South Asian Linguistics. Other journals in which linguistic and psycholinguistic research on South Asian languages appears include Cognitive Linguistics, Journal of Pragmatics, Language, Journal of Psycholinguistic Research, Language Sciences, and Journal of Child Language. In addition, a Hindi-Urdu corpus, annotated for both syntactic structure and semantic role labels, is available as a resource for researchers in linguistics, computer science, and neighboring disciplines (Palmer et al. 2009). A Tamil child language corpus collected by R. Narasimhan (1981) and two Persian corpora collected by Neiloufar Family (Family 2009) and Habibeh Samadi (1996) are also freely available on the CHILDES database (http://childes.psy.cmu.edu) and constitute a valuable resource for cognitive linguistic research from a developmental perspective.

Most of the studies on South Asian languages investigating questions that are important from a cognitive linguistic perspective focus on patterns in lexical semantics or the syntax-semantics interface. Much of the research also relies on traditional linguistic methods such as the use of native speaker judgments, often from just one or two speakers; the use of corpora or experimental methods is mostly limited to a few studies.
Further, a preponderance of the research is focused on Hindi and Tamil, with a smattering of studies in Marathi, Chintang, Kannada,

Panjabi, and Belhare, among others. Not only do we need to investigate topics of central interest in Cognitive Linguistics, but we need to do so in a greater variety of languages. It is encouraging, however, that advances are gradually beginning to be made in a greater range of topics, e.g. the investigation of the language-cognition-culture interface, conceptual metaphor, and the development and use of language within a usage-based approach. Further, with the increasing availability of electronic resources as well as the growing use of experimental approaches in Cognitive Linguistics (e.g. see Gonzalez-Marquez et al. 2007) it is likely that a greater range of methodological approaches will be used in the study of South Asian languages in the future. Further crosslinguistic comparisons are required to elucidate the extent to which languages vary, and the degree to which linguistic patterns correlate with cultural practices in the community and the cognitive abilities of individual language users. In this respect, detailed investigation of South Asian languages, in all their rich diversity, can provide critical data to evaluate fundamental questions about the nature of the relationship between language, culture, and the mind.

5.4. Morphosyntactic typology

This section overlaps to some extent with Section 4.5, which deals with morphosyntactic issues in morphology.

5.4.1. Oblique Experiencers and Oblique Subjects11
By Hans Henrich Hock

5.4.1.1. Introduction

A feature commonly considered to be characteristic of South Asian languages is the widespread12 existence of Oblique-Experiencer constructions as in (45) (from Hindi), whose Theme (what is experienced) usually is in the nominative. The Experiencer NP is commonly marked Dative, but other markings are possible, such as Genitive (Assamese, Bangla, Oriya) and Accusative (Sinhala, Bodo), and some languages have variation between Dative and Genitive or Accusative (Oriya, Sinhala); see e.g. Dasgupta 2003: 376, Gair 2003: 790–791, Goswami & Tamuli 2003: 422, Ray 2003: 467–468. Volumes that contain important relevant information are Verma (ed.) 1976, Verma & Mohanan (eds.) 1990, and Bhaskararao & Subbarao (eds.) 2004; see also Shibatani 1999 and Subbarao 2012: Chapter 5.

(45) a. mujhe  yah   kitāb  pasand  hai
        I.DAT  this  book   “like”  be.PRS.3SG
        ‘I like this book.’
     b. mujhe  jānā    hai
        I.DAT  go.INF  be.PRS.3SG
        ‘I have to/must go.’

11 I am grateful for comments by Elena Bashir, Alice Davison, and K. V. Subbarao.

12 Oblique-Experiencer constructions seem to be most common in Indo-Aryan and Dravidian. Hook (2014) provides some overview of the distribution of Oblique-Experiencer constructions in South Asia. He finds that in Tibeto-Burman, Oblique Experiencers tend to be limited to languages in close proximity to Indo-Aryan and that Santali (Munda) uses a structure with “incorporated” Experiencer dative in the verb. Subbarao et al. (2014) note that Santali Experiencer structures are limited to possessors and physical affliction and provide more detailed information on verb agreement marking. Among the isolate languages, Kusunda has a Dative Experiencer; see also 1.9.3 (this volume) for Daic.

Since at least the time of Kachru 1968 and Davison 1969 it has been known that in many of the South Asian languages, the Experiencer NP exhibits syntactic behavior that is normally associated with subjects (“subject properties”), such as control of (the PRO or implicit subject of) converbs (a.k.a. conjunctive participles, absolutives, gerunds, etc.) and of reflexivization; see e.g. (46) from Hindi. While the term “Oblique Subject” (or more specifically, “Dative Subject”) is often used more widely, it is useful to distinguish Oblique Subjects from simple Oblique Experiencers, which may not exhibit these subject properties.

(46) a. kitāb  paṛh ke   fil(a)m  nahīṁ  pasand  hai
        book   read.CVB  film     NEG    “like”  be.PRS.3SG
        ‘Having read the book, (I) don’t like the film.’
     b. mujhe_i  baccoṁ_j ko   apne_i/*j  ghar  pahuṁcānā  hai
        I.DAT    children.ACC  self’s     home  take.INF   be.PRS.3SG
        ‘I have to take the children to my/*their home.’

Several issues are of general interest and/or controversial. One is the question whether all South Asian languages have Oblique Subjects. A related question is how to account for the subject properties of Oblique Subjects in languages that have them. There is also the question whether the distinction between Oblique Subjects and Oblique Experiencers is meaningful and the related question whether there has been a linguistic change from Oblique Experiencers to Oblique Subjects in Indo-Aryan languages.

5.4.1.2. Oblique Subjects in South Asian languages

Strong arguments have been presented that the Oblique Experiencers of Hindi-Urdu, Panjabi, Maithili, Oriya, Kashmiri, Tamil, Malayalam, Kannada, and Belhare are Oblique Subjects, based especially on the evidence of converb and reflexive control; see e.g. Kachru 1968, Davison 1969, Kachru, Kachru & Bhatia 1976,
Sridhar 1976, Cole, Hermon & Sridhar 1980, U. N. Singh 1983, Rakesh Bhatt 1999, Bickel 2003: 565, Davison 2004a, 2011, 2012, Nizar 2010, B. N. Patnaik (n.d.). For Bangla, Telugu, and Kharia, see also Subbarao 2012: 156–159 with references.13 While Mohanan and Mohanan (1990) argue for Subject status of Malayalam Dative Experiencers, Jayaseelan (1976, 1990, 2004b) and Amritavalli (2004a) raise questions for Malayalam and Kannada, arguing that reflexive control is not a reliable indicator of subject properties, and that Dative Experiencers are really indirect objects. Similarly, Rangan and Rajendran (2001) consider reflexivization an unreliable criterion. Nizar (2012) shows that Dative Experiencers differ in their syntactic behavior from indirect objects and that treating them as subjects ‘can more fully account for the data’. Pappuswamy (2005) presents a more nuanced perspective on Tamil, distinguishing between structures with nominative and accusative Theme, and finding variation regarding reflexive control. Muralikrishnan (2011) argues on the basis of experimental data that Dative Subjects are processed differently from Nominative Subjects and indirect objects.

The situation is even more uncertain for Marathi. Pandharipande (1990) argues that reflexive control is not a reliable indicator of subject properties. By contrast, Joshi (1993) finds there to be sufficient evidence for subjecthood of Dative Experiencers. Wali (2004) similarly observes that Dative Experiencers control the reflexive āpaṇ, as well as the converb; but the reflexive svatāh can be controlled by both the Dative Experiencer and the Theme. See also Rosen & Wali 1989. Dhongde and Wali (2009) consider the issue to be still unresolved. Perhaps some of these differences can be attributed to regional or dialect differences. (See 5.2.1.3 above.)
For Nepali, Wallace (1985: 136–139) notes that Oblique Experiencers control reflexivization; but converb control is more complex: Experiencers of intransitive structures exert control, those of transitive structures do not; but the Theme controls marginally. Wallace’s findings contrast with Bickel’s claim (2004) that in Himalayan languages, including Nepali, control is exerted by topicality, not by subjecthood. For Kalasha and Khowar, Bashir (1988b: Chapter 3, see also 2014) shows that Oblique Experiencers are construed as direct objects of causative verbs in an older layer of the languages, and that the “Dative Subject” construction is a later development. More research is needed, especially on languages that have received less research attention than Hindi-Urdu.

Special variants of Oblique-Subject constructions are “Double Dative” constructions in Dravidian languages with Possessor Raising, i.e. a part-whole relationship between the two dative-marked NPs (Subbarao 2012: 189–192) and “Reversible Dative Subjects”, in which Experiencer and Theme can switch case marking and control features (Yamabe 1990, Joshi 1993, Davison 2012). Note also Subbarao 2012: 246–262 on “Backward Control”.

13 Mistry (2004) notes that Dative Experiencers control reflexivization in Gujarati; but he does not address the question of converb control.

The question of how to account for Oblique Subject constructions in the grammar has received various answers, reflecting changes and differences in grammatical theory. In the Relational-Grammar account of Sridhar 1976 and Cole, Hermon & Sridhar 1980, for instance, Oblique Subjects are underlying Subjects. Mohanan and Mohanan (1990) argue for a semantically informed argument structure. Davison (2004a) proposes a Lexical-Case analysis under which dative [and other oblique-subject] case is licensed by the theta roles of particular arguments; see also Davison 2011. Operating within LFG, Butt and Holloway (2004) propose a three-way distinction between structural, semantic, and “quirky” case.

5.4.1.3. Oblique Experiencer vs. Oblique Subject and historical developments

The distinction between Oblique Experiencers (in general) and Oblique Subjects (as a special, syntactically defined subset) has been questioned in recent publications by Barðdal and Eythórsson (2009), Barðdal and Smitherman (2012), and Barðdal (2013). Working within a variant of Construction Grammar, they define Oblique Subjects broadly as any construction in which the left-most argument in the subcategorization frame is defined as the subject. Given this broad definition, they are able to argue that Indo-Aryan Dative-Experiencer constructions, just like those of languages like Icelandic, are inherited from Proto-Indo-European and thus do not constitute a novel phenomenon.
Butt and Deo (2013) question this account on the grounds that there is no clear evidence for Dative-Subject constructions in Sanskrit (Hock 1990, 1991) and propose that Marathi Dative Subjects are a late development, coming about in three ways: Sanskrit change of state predicates become experiencer verbs, experiencer is marked with dative; originally intransitive verbs acquire a psych verb reading, experiencer is marked with dative; nominative experiencers of transitive predicates [are] reanalyzed as dative experiencers. While there is indeed no clear evidence for Dative-Subject constructions in Sanskrit (but only for Possessor-Genitive Subjects), Hock (1990) presents strong evidence that Sanskrit did have Dative (and other oblique) Experiencer constructions (which, however, did not meet the criteria for subject status).14 Barðdal (et al.) may therefore be correct as regards their (pre)historical claims. At the same time, by failing to distinguish between Oblique Subjects and other Oblique Experiencers, they fail to address the complex analytical issues that arise when taking the differences between these constructions to be significant.

The exact historical pathway by which modern Indo-Aryan acquired Dative Subject constructions (and other Oblique Subject structures, other than Possessor Subject ones) is still to be determined. A good beginning, focusing on Hindi-Urdu and including evidence from Sanskrit, Prakrit, and Apabhraṁśa, is Montaut 2013; see also Hook 2014.

14 Hook (1990b: 331, fn. 4) objects to Hock’s (1990) argument that structures claimed to be Dative-Subject constructions by Hook are at best ambiguous and might simply be examples of “sloppy” converb control. Hook claims that, in contrast to constructions of this type (with apparent converb control by the Experiencer NP), there are no Sanskrit examples of constructions in which the “experiencee” (i.e. the Theme) controls converbs. Example (ii) below, with the converb controlled by the “experiencee” of the Dative-Experiencer verb ruc ‘be pleasing’/“like”, shows this claim to be problematic.

(ii) yo ’brāhmaṇo vidyām
     who.RELPRON.NOM.SG.M non-brahmin.NOM.SG.M wisdom.ACC.SG.F
     an-ūcya na + iva roceta
     NEG-study.CVB NEG + PARTICLE be.pleasing.OPT.3SG
     sa etām̐ś caturhotṝṇ … vyācakṣīt(a)
     he.NOM.SG.M these caturhotṛ (formulas).ACC.SG.M … recite.OPT.3SG
     ‘A non-brahmin who, not having studied (sacred) wisdom, might not please should recite these caturhotṛ formulas …’ (Kāṭhaka Saṁhitā 9.16)

5.4.1.4. Agreement issues in Oblique-Subject (and Ergative) constructions

Dative and other Oblique Subjects normally do not control verb agreement; see 4.5.3.2.1 above. This is the one feature in terms of which such subjects robustly differ from nominative subjects. Exceptions have been noted for Shina (Hook 1990a) and Darai (Paudyal 2008). It remains to be seen whether these are the only languages with verb agreement triggered by Dative/Oblique Subjects. If there is any verb agreement, it is normally with the nominative-marked Theme. Accusative Themes appear to occur in Bangla, Tamil, and Malayalam; but Subbarao (2012: 173–178; 4.5.2, this volume) argues that accusative in these structures does not mark transitivity of the Oblique-Subject construction but rather animacy and definiteness of the Theme. In Tamil, verb agreement is the default third singular neuter. Apparently there are no examples of verb agreement with non-nominative Themes of Oblique-Subject constructions. This is an issue that would benefit from further research.

The issue of Theme (or DO) verb agreement does arise in Indo-Aryan languages with “Split Ergativity”. In languages like Hindi-Urdu, agreement in the perfective is blocked if the Theme is marked by the accusative/definite postposition =ko (see e.g. 5.2.2.3.1); however, in Gujarati, Marwari, and (optionally) Marathi, the corresponding postposition does not block agreement (Cardona & Suthar 2003: 682, Magier 1990, Pandharipande 2003: 711). As noted in 4.5.3.2.1, this difference in morphosyntactic behavior raises important questions for syntactic theory.

5.4.2. Complex Verbs

Complex verbs are widespread in South Asian languages, with two major subtypes usually distinguished: structures of Noun (or adjective) + Verb (CONJUNCT VERBS) and Verb + Verb (COMPOUND VERBS). This chapter deals with these and related constructions.

5.4.2.1. Introduction
By Hans Henrich Hock

Important early work on South Asian complex verbs is found in Hook 1974 and Kachru 1980, 1982; see also the contributions in Verma (ed.) 1993. More up-to-date general surveys are Subbarao 2012: 22–24, 182–187 and the cross-linguistic study of Anderson 2006. Sections 5.4.2.2 and 5.4.2.3 present current overviews of Dravidian and Indo-Aryan. The following brief survey shows that complex-verb constructions also occur in the other major language families of South Asia, and beyond.

Tibeto-Burman Compound Verbs and Conjunct Verbs are found e.g. in Belhare (Bickel 2003: 559–560), Camling (Ebert 2003: 542), Dolakha Newar (Genetti 2003: 361), Kathmandu Newar (Hargreaves 2003: 480), Nar-Phu (Noonan 2003: 345), Tamang (Mazaudon 2003: 301). Of special interest is Tournadre & Jiatso 2001, a detailed study of Literary Tibetan and of Tibetan dialects, with discussion of the use of certain Compound-Verb constructions to mark evidentiality (see also 5.5.2 below).

An early study of Munda is Hook 1991b. Anderson 2007: Chapter 8 provides extensive coverage under the heading ‘Auxiliary verb construction and other complex predicate types’. Anderson makes the important observation that, at least in Munda, the dividing line between Conjunct and Auxiliary Verb constructions is difficult to draw.

Iranian, too, makes extensive use of Complex-Verb constructions; see 1.4.4 (this volume) for Iranian in general, and Bashir 2009: 833 and Edelman & Dudykhudoeva 2009: 798 for Conjunct Verbs in Wakhi and Shughni.

Complex verbs are also found in areas adjacent to South Asia. For Turkic, see Erdal 1998: 144, 151, Johanson 1998: 38, 42; for Chinese see Liang & Hook 2006.

Special issues, discussed in 4.5.3.4.1 and 4.5.3.4.3, concern agreement. In languages like Hindi-Urdu some putative Conjunct Verbs show agreement between the nominal and the verb, as in (47b), while others do not (47a). Moreover, in (47b) yād retains nominal status, as indicated by the genitive marking on the preceding rām, while in (47a) it does not, and the preceding rām is treated as the subject of the entire collocation yād ā-. The difference can be explained in terms of different degrees of grammaticalization or “incorporation” (in the sense of Baker 1988). Full incorporation, into the verbal morphology, is found in Munda languages, especially in South Munda; see Anderson 2007: Chapter 6, and Section 4.3.2, this volume.

(47) a. mujhe rām yād āy-ā
        I.DAT Ram.M memory(.F) come.PFV:PST-SG.M
        ‘I remembered Ram.’
     b. mujhe rām kī yād ā-ī
        I.DAT Ram.M-GEN memory.F come.PFV:PST-SG.F
        ‘I remembered Ram.’

A different kind of agreement phenomenon is found in “SERIAL VERBS”, as defined by Steever (1988). The construction might be considered a subtype of Compound Verbs, with agreement between the two verbs of the construction; but see below. As section 4.5.3.4.3 shows, agreement may be complete (in terms of gender, number, person), but variants with partial or “attenuated” agreement are also found. Steever (1988), focusing on Dravidian, considers the second verb to be syntactically finite and the first verb to receive its agreement features morphologically. This account is intuitively appealing for the more common type (48a), in which the relation between the two verbs is asymmetric and the final verb can be considered the head of the construction; it is problematic for the “Balance-Verb” type (48b), in which the two verbs are in a symmetric relationship. Balance-Verb structures also raise questions about considering Serial Verbs a subtype of Compound Verbs. Further work is desirable.

(48) a. celvēm allēm
        go.NPST.1PL be(come).NEG.1PL
        ‘We will not go.’ (Old Tamil; Steever 1988)
     b. biba injo ōti samgiri uṭar ticar
        marriage house.LOC take.CVB provisions drink.PST.3PL eat.PST.3PL
        ‘They ate-drank, i.e. consumed, the provisions brought to the marriage house.’ (Pengo; Steever 1988)

5.4.2.2. Expanded verbs in Dravidian
By E. Annamalai

5.4.2.2.1. Introduction

Jules Bloch’s observation that ‘Reduced to its basic elements, the Dravidian is an incomplete and a simple system’ (1946: 69) reflects a view of Dravidian morphology in comparison with the inflectional morphology of languages such as Sanskrit and Latin. Dravidian verb morphology is considered simple because it marks nothing more than tenses and agreement with the subject (the latter is absent in Malayalam) and these markers are transparent except for phonetically motivated sandhi changes. It is incomplete because of its simplicity leaving out from the morphology other functional elements of the verb such as aspect and mood. These other functions, however, are expressed by syntax. The earliest indigenous grammar of Tamil, Tolkāppiyam, which antedates the European grammarians of Dravidian by two millennia, describes only the tense and agreement morphology of the verb besides the two syntactically defined categories of verbs, viz. the finite (mur̠r̠u) and the participial (eccam), with the adverbial, conditional, the relative, and the infinitive being classified under the latter.

The verb in Dravidian, however, does not remain reduced. The two ways in which the verb is expanded by syntactic means are to add a certain kind of verbs from a finite set to the infinitive or to the verbal (i.e. adverbial) participle form of the verb. The verbs added to the infinitive are “defective”, in the sense that they are not inflected for agreement. These constructions equate generally with modal forms (Steever 2005: 86–87) that express grammatical functions such as obligation, desirability, possibility, probability, and admissibility of the act. Another set of verbs added to the infinitive are “defective” in another sense, in that they are semantically bleached; they express the grammatical functions of how an act comes into effect such as by causation, trial, allowance, and imminence. Some of these qualify to be complex predicates. These grammatical functions of this verbal construction may not be found universally in all Dravidian languages. The following examples, as most of the examples in this article, are from Tamil.

(49) a. avan kār-ai ōṭ-a vai-tt-ān
        he car-ACC run-INF place-PST-3SG.M
        ‘He made the car run.’
     b. avan kār-ai ōṭṭ-a pār-tt-ān
        he car-ACC run(TR)-INF see-PST-3SG.M
        ‘He tried to drive the car.’
     c. avan kār-ai ōṭ-a viṭ-ṭ-ān
        he car-ACC run-INF let.go-PST-3SG.M
        ‘He let the car run.’
     d. avan kār-ai ōṭṭ-a pō-n-ān
        he car-ACC run(TR)-INF go-PST-3SG.M
        ‘He was about / going to drive the car.’
     e. avan kār-ai ōṭṭ-a iru-nt-ān
        he car-ACC run(TR)-INF be-PST-3SG.M
        ‘He was all set to drive the car.’

These syntactically expanded verb constructions, whose second verbs are a closed set, just like suffixes, and do not have a referential meaning like their corresponding full verbs, are not the subject of this article.

5.4.2.2.2. Verb + Light Verb structures

The second way of expanding the verb in Dravidian is to add “light” verbs (similar to the semantically bleached verbs with the infinitive and shared in both constructions) to the verbal participle form of the verb (not to the bare form of the verb as in Indo-Aryan). These light verbs are also a finite set. They are conjugated in the same way as their corresponding full verbs, but their meanings are reduced to grammatical meanings. Nevertheless, the selection of verbs to this set of light verbs is semantically motivated since the synonymous verbs have the same grammatical meaning in this construction. Some of the light verbs may be phonologically reduced to the extent of not having the canonical form of verbs; this, however, does not remove their ability to conjugate. Some light verbs do not have any lexical restriction based on the meaning of the verb of the verbal participle they combine with, just like suffixes. These light verbs have various technical designations in the literature such as secondary verb (Subbarao 1979), auxiliary verb (Schiffman 1969), vector (Bhat 1979), and operator, depending on the analyst’s syntactic theory. The combined verbal construction is also variously called serial verb (Fedson 1981), complex verb (or predicate) (Krishnamurti 1992), or compound verb (Fedson 1981). The designation compound verb is less common in Dravidian, unlike in Indo-Aryan (Hook 1974). The light verbs with the verbal participle could be of any of the verbal forms such as the finite verb, imperative, infinitive, verbal participle, and relative participle. But they are more elaborate and common in the finite verb form, and it is this form that has been studied more by Dravidian linguists. The examples in this article are of the finite form of the light verbs. There are restrictions on the possible sequences of light verbs after the verbal participle, in which each preceding light verb is in the verbal participial form (Annamalai 1985: 13–14). 
The order in sequencing is determined by the verbhood status of the particular light verb (the one with the higher status precedes the one with a lower status) as well as the semantic compatibility of the light verbs (the light verb meaning that the act is for the benefit of the subject and the light verb meaning that the act is for the benefit of another do not combine with each other).

There are four issues that are most studied on these light verbs in Dravidian languages. They are: what verbs constitute the finite set of light verbs; what the meaning is of the light verbs individually and collectively; whether the light verbs express a single grammatical function such as aspect, or more than one; what the structural relationship of the light verbs to full verbs is, i.e. whether they form two distinct categories of verbs or form a cline in a single category of verbs. There are some studies that posit a syntactic structure of the full verb + light verb construction (Schiffman 1969, Steever 2005, Subbarao 1979). These syntactic analyses are short-lived, as the syntactic theory underlying these structures changes over time. Hence the syntactic representation of the construction under discussion (i.e. the compound verb) is not discussed in this article. The syntactic process producing this structure may be called auxiliation (Kuteva 2001, Steever 2005: 9–12), contrasting it with other processes, viz. inflecting and compounding, though not all analysts use this term.

Incomplete lists of light verbs in the literary Dravidian languages are given below. It is possible to have linguistic criteria to identify the light verbs based on their syntactic behavior (Annamalai 1985: 15). These include tests for the formal integrity of the V1 + V2 construction (i.e. the compound verb, where V1 is a verbal participle), such as inadmissibility of insertion, scrambling, movement leaving a trace on the site (such as relativization), and duplication of V1 to give the meaning of repetition of the act; tests for the ability of V2 to have its independently motivated arguments or verb complements; and tests for the differential scope of the negative as to both verbs in the construction or to one of them. The common practice of native-speaker linguists, however, is to go by their intuition about the absence of the full verb meaning and the occurrence of a bleached meaning of a light verb in this construction. Thus the lists are not comparable for their completeness and robustness within one language or across languages.

Tamil has the following types (Schiffman 1969, Fedson 1981, Annamalai 1985, Steever 2005).
A. viṭu ‘let go’; pō ‘go’; āku ‘become’; iru ‘be’; koḷ ‘have, hold’
B. koṭu ‘give’; viṭu ‘leave’; vai ‘place down’; pōṭu ‘drop’; eṭu ‘take’; pār ‘see’; kāṭṭu ‘show’
C. tolai ‘lose’; kiẓi ‘tear up’; taḷḷu ‘push’
And there are others in each type.

Each of the three types (A, B, C) of the light verbs indicated above is characterized by its own syntactic behavior and semantic nature. The characteristics more or less correspond respectively with the grammatical functions expressed by temporal contour, subject orientation, or speaker point of view, and attitude about and manner of performing the act. It is clear that these three non-referential functions are not monolithic. Type A is commonly identified with aspect (Schiffman 1969, Annamalai 1979), but even this is not internally homogeneous. The aspectual notion associated with at least some of the light verbs in this construction is called “perfect”, following the grammatical description of European languages. D. N. S. Bhat (1979: 309) broadens the notion of Aspect to include manner of occurrence of an act. Given these problems, it is fair to say that the compound verbs consisting of verbal participle + light verb are a heterogeneous category grammatically and semantically. The structural device of V1 + V2 available in Dravidian accommodates multiple expressive functions. The following are some illustrative examples from Tamil.

(50) a. avan kār vānk-i-viṭ-ṭ-ān
        he car buy-PST-let.go-PST-3SG.M
        ‘He has (surely) bought a car.’
     b. avan kār vānk-i-yiru-nt-ān
        he car buy-PST-be-PST-3SG.M
        ‘He had bought a car.’
     c. avan kār vānk-i-kkoṇ-ṭ-ān
        he car buy-PST-hold-PST-3SG.M
        ‘He bought a car for himself.’

(51) a. avan kār vānk-i-kkoṭu-tt-ān
        he car buy-PST-give-PST-3SG.M
        ‘He bought / helped buying a car (for someone else).’
     b. avan kār vānk-i-viṭ-ṭ-ān
        he car buy-PST-leave-PST-3SG.M
        ‘He bought a car for some use (such as to run it as a taxi).’
     c. avan kār vānk-i-vai-tt-ān
        he car buy-PST-place-PST-3SG.M
        ‘He bought a car for some purpose (such as to give it later to his daughter as a gift).’
     d. avan kār vānk-i-ppār-tt-ān
        he car buy-PST-see-PST-3SG.M
        ‘He bought a car to test (to see if he could afford to maintain it, for instance).’
     e. avan kār vānk-i-kkāṭṭ-in-ān
        he car buy-PST-show-PST-3SG.M
        ‘He bought a car to demonstrate (that he could afford it, for instance).’

(52) a. avan kār vānk-i-ttolai-tt-ān
        he car buy-PST-lose-PST-3SG.M
        ‘He bought a car with no choice of avoiding it (and being pushed to it).’
     b. avan kār vānk-i-kkiẓi-tt-ān
        he car buy-PST-tear-PST-3SG.M
        ‘He bought a car, damn it (I bet he wouldn’t).’

Malayalam has the following types (Nayar 1979: 289–299): kaẓi ‘be finished’; vay ‘place’; kāṇ ‘see’ / iri ‘sit’ / ēku ‘join’;15 kaḷay ‘cast away’; pō ‘go’; iri ‘sit’ / nil ‘stand’; tulay ‘destroy’; aruḷ ‘act suitably’; eṭu ‘take’ / koḷ ‘contain’; koṭu ‘give (to the third person)’ / tar ‘give (to the first or second person)’; nōkku ‘look’.

Kannada has the following types (D. N. S. Bhat 1979: 300–309).
A. hōgu ‘go’; āḍu ‘play’; biḍu ‘release’
B. koḍu ‘give’; nōḍu ‘see’; hāku ‘put’

The light verbs of Type B occur mostly with transitive full verbs. In addition to these six light verbs, there is koḷ, which makes the full verb reflexive. The light verbs of Type A express the manner of occurrence of an act of the full verb and Type B verbs express the result of an act.

Telugu has the following types (Subbarao 1979: 268–276): paḍ ‘fall’; kon ‘buy’; wēs ‘put’; caw ‘die’; peṭṭ ‘keep’; kūrcon ‘sit’, cūs ‘see’, and pō ‘go’.16

15 These light verbs have this meaning only in their future tense forms.
16 The senses of light verbs in Malayalam, Kannada, and Telugu given here are my abstractions from the meaning descriptions of the authors of the original papers. The light-verb function of the last three verbs was provided by K. V. Subbarao (p.c. 2014).


It is difficult to compare the light verbs and their meanings in these four languages because of the different extents of data coverage and different ways of designating the meaning of light verbs. Some apparent differences may disappear if a unified analysis is applied to all the languages. Nevertheless, it should be noted that the phenomenon of V1 + V2, where V1 is the full verb in its verbal participle form and V2 is the light verb with a specific non-referential meaning, is shared by the Dravidian languages; there are cognate and non-cognate light verbs across the languages; the meanings of the light verbs are a shared set, though not always expressed by cognate light verbs. The use of non-cognate light verbs for a shared meaning is not surprising because this is noticed between the dialects of the same language as well. This raises a problem for comparative reconstruction: Can a grammatical phenomenon be reconstructed for the proto stage while the forms of particular light verbs that are instances of the phenomenon cannot be? Given the paucity of light verbs in old Tamil texts before the Common Era, one may speculate that “delexicalization” of full verbs into light verbs is a development that is shared, but its time and instantiation are specific to individual languages. One reason for the differences in the semantic specification of light verbs among analysts is the problem of boundary between semantics and pragmatics. This may be illustrated with one example, (50a), where the light verb is commonly said to give the sense of completion, with an additional sense ‘as anticipated’ or ‘as not anticipated’. (50a) in one context is understood as indicating that the subject of the sentence was planning to buy a car and that the speaker knew this and now acknowledges that he accomplished it. 
In another context the same sentence is understood as meaning that the speaker did not believe that the subject of the sentence was capable of buying a car, but that the subject did so against the anticipation of the speaker. The second reading will be invited when the subject is followed by the emphatic suffix -ē (avanē ‘even he’). So the element of anticipation is not a part of the sense of completion but is a conversational implicature of the utterance. A thorough pragmatic analysis of the light verbs is yet to be done. Some analyses take the full verb and the light verb to belong to different grammatical categories, the former to the lexicon and the latter to the grammar of the auxiliary verb system. Other analyses (Annamalai 1979, 1985, Subbarao 1979) show that the full verbs and light verbs form a cline. The evidence is the following. There is ambiguity of interpretation whether V2 is a full verb or a light verb in particular instances because the semantic bleaching of a light verb is relative. There are structural factors such as case markings and adverbs that differentiate one verb from another, but they are not dependent on category difference. The light verbs themselves form a cline which is reflected in the classification of light verbs into the A, B, C types as given above. This squishiness allows full verbs, light verbs, and bound suffixes to form a cline, where the light verbs, in the middle, have members that are closer to full verbs and other members that are closer to suffixes.

V1 + V2 may merge semantically in some constructions to form new words. The syntactic structure is thus lexicalized, in which case the light verb does not add an inferential meaning to the full verb but changes its referential meaning. Some examples from Tamil are pārttukkoḷ ‘look after, take care’ (< pār ‘see’ + koḷ ‘have, hold’), kāttiru ‘wait’ (< kā ‘guard’ + iru ‘be’), kāṭṭikoṭu ‘betray’ (< kāṭṭu ‘show’ + koṭu ‘give’), viṭṭukkoṭu ‘concede, yield’ (< viṭu ‘let go’ + koṭu ‘give’), taḷḷivai ‘postpone’ or ‘excommunicate’ (< taḷḷu ‘push’ + vai ‘place down’). These developments expand the class of verbs but not the grammatical and semantic scope of verbs.

5.4.2.2.3. Noun + Light Verb structures

There is another strategy to expand the class of verbs, which is adding the light verbs to nouns to make them verbs. These are different from noun + verb combinations (or collocations), where a generic (or bleached) verb, in combination with the noun, acquires a special meaning. A Tamil example is (53a) below, where ‘fever left’ has the meaning of ‘fever subsided’. The meaning of (53a) is not ‘the fever left him’, because the dative avanukku ‘to him’ is the Experiencer of the reduced fever. This contrasts with (53b), where viṭu has the literal meaning ‘leave’ and takes the accusative avanai ‘him’.

(53) a. avan-ukku kāyccal viṭ-ṭ-atu
        he-DAT fever leave-PST-3SG.N
        ‘Fever left him / he was free of fever.’
     b. pīṭai avan-ai viṭ-ṭ-atu
        affliction he-ACC leave-PST-3SG.N
        ‘The affliction left him / he was free of affliction.’

One analytical task is to separate the N+V construction that is a syntactic string of two words as in (54) below, from the N+V construction that produces one single word, viz. a verb, combining the two words into one, as in (55) below. In the first case, the most common syntactic relation of the noun with the verb is that of an object, as in (54). The noun could be relativized, questioned, pluralized, modified like any other noun, and passivized like any other object.

(54) a. aracu enkaḷ ūr-ukku bas viṭ-ṭ-atu
        government our town-DAT bus let.go-PST-3SG.N
        ‘The government ran buses to our town.’
     b. arac-āl enkaḷ ūr-ukku bas viṭ-a-ppaṭ-ṭ-atu
        government-INS our town-DAT bus let.go-INF-fall-PST-3SG.N
        ‘Buses were run to our town by the government.’

The sentences in (55) are candidates for treating N+V as a conjunct verb. (These are often called compound verbs in Dravidian linguistic literature.) The light verbs illustrated in (55) are productive, but other verbs are also used in this construction.

(55) a. cey / paṇṇu
        avan vēlai-kku muyaṟci^cey-t-ān / paṇṇ-in-ān
        he job-DAT effort make-PST-3SG.M / do-PST-3SG.M
        ‘He made efforts for a job.’
     b. paṭu
        avan vēlai-kku avacara^ppaṭ-ṭ-ān
        he job-DAT hurry experience-PST-3SG.M
        ‘He hurried / was in a hurry for a job.’
     c. paṭuttu
        avan vēlai-kku enn-ai avacara^ppaṭu-ttin-ān
        he job-DAT I-ACC hurry experience-TR.PST-3SG.M
        ‘He hurried me for a job.’
     d. aṭi
        avan vīṭṭ-ukku veḷḷai^aṭi-tt-ān
        he house-DAT white strike-PST-3SG.M
        ‘He struck white paint on to the house / he whitewashed the house.’
     e. pōṭu
        avan vēlai-kku tiṭṭam^pōṭ-ṭ-ān
        he job-DAT plan drop-PST-3SG.M
        ‘He made a plan / planned for a job.’
     f. eṭu
        avan ētō coll-a vāy^eṭu-tt-ān
        he some say-INF mouth take-PST-3SG.M
        ‘He opened his mouth to say something.’
     g. viṭu
        avan koṭṭāvi^viṭ-ṭ-ān
        he yawn let.go-PST-3SG.M
        ‘He let out a yawn / he yawned.’

The question is whether each N+V in (55) constitutes a single verb, as its English gloss tends to suggest. As it turns out, the question is not just one of translational equivalence, it may even arise in Tamil in some instances. For example, the simple verb muyal ‘try’ can substitute for muyaṟci cey in (55a). Moreover, it is possible to substitute full verbs for light verbs, such as eṭu ‘take’ for cey ‘do’ in (55a) without change in meaning. Further, one or more of the tests mentioned above with reference to (54) for claiming the noun to be syntactically independent of the verb are positive in (55) also. These suggest that the N+Vs in (55) are also word collocations as in (54) and not conjunct verbs.

Nevertheless, in some cases the object noun is incorporated within the verb, making the construction a complex predicate, and this predicate can have another object. Thus, (56) is an alternative to (55d), where the accusative vīṭṭai is used in place of the dative vīṭṭukku.

(56) avan vīṭṭ-ai veḷḷai^aṭi-tt-ān
     he house-ACC white-strike-PST-3SG.M
     ‘He painted his house white / he whitewashed the house.’

More commonly, the meanings of the noun and the verb are fused to give a composite (or idiomatic) meaning, and they behave syntactically as single, unified words in that they fail the tests mentioned above. These, then, are real conjunct verbs. Some examples are kaṇpōṭu ‘eye + put’, kaṇpaṭu ‘eye + touch’, kaiviṭu ‘hand + let go’, talaiyeṭu ‘head + take’.

5.4.2.2.4. Conclusion

The above description, based on Tamil data, shows that there is fluidity in making compound and conjunct verbs. It remains to be seen whether this holds for Dravidian as a whole.

5.4.2.3. Compound verbs in Indo-Aryan
By Benjamin Slade

5.4.2.3.1. Introduction

One prominent feature of modern Indo-Aryan languages (IA) is the use of COMPOUND VERBS (CVs), a particular sort of verb-verb collocation where one of the verbal elements behaves as a “light verb”, that is with much of its normal semantic content bleached, which modifies the other (“main”) verb. Common “light” (or vector) verbs in Indo-Aryan (IA) include GIVE, TAKE, GO, COME, FALL, RISE.17 Despite the prominence of this feature, much remains to be done in terms of descriptive and theoretical analysis of the properties of CVs in the various modern IA languages, including differences in morphosyntactic constraints on CVs, semantic constraints on possible V1-V2 combinations, and resulting semantics of V1-V2 combinations. The most complete descriptions by far concern Hindi, with relatively few in-depth analyses of CVs in other IA languages; for Hindi see Hook’s 1974 book-length study and Nespital’s extensive dictionary of 1997, as well as dissertations by Butt (1995) and Poornima (2012). Further, the historical development of IA CVs is not entirely clear, especially with respect to the origin of CVs; it would be desirable to be able to establish a rough terminus ad quem (and terminus a quo) for this origin, as well as working out more of the details of the relationship of IA CVs to CVs in geographically proximate languages, including, in particular, Dravidian.

17 ALLCAPS is used to indicate the English gloss of the main-verb sense of a light verb. As discussed herein, the light verb senses are sometimes related transparently to their main verb meanings, e.g. Hindi GIVE, often signalling other-benefaction; but sometimes are related rather opaquely, as in Hindi SIT, signalling regret.

5.4.2.3.2. Basic morphosyntax

In IA, CVs are prototypically formed from two verbal elements, where one verb is semantically contentful (often referred to in the Indological tradition as the POLE or POLAR verb), and the other verb (traditionally referred to as the VECTOR verb) acts as a modifier, contributing aspectual/aktionsart, attitudinal, and/or other features such as benefactivity and volitionality. In almost all cases both verbal elements also occur in the language in verbal simplexes. The polar verb appears in a fixed grammatical form (often referred to as an absolutive form) which does not show differentiation for number, person, tense etc. All agreement and tense/aspect morphology is borne by the vector verb. Usually the pole immediately precedes the vector.18

Examples (57) and (58) give a contrast between a simplex and a CV construction in Hindi. Both examples use the same main verb, ā ‘come’; in (58) ā appears in absolutive form, followed by jā ‘go’ appearing as vector and contributing a sense of perfectivity (see Hook 1974, M. Singh 1998).

(57) vah          kal       āegā
     he/she.NOM   tomorrow  come.FUT.3SG.M
     ‘He will come tomorrow.’ (Hindi)

(58) vah     kal        ā-Ø       gayā
     he/she  yesterday  come-ABS  go.PST.SG.M
     ‘He came yesterday.’ (Hindi)

18 Some IA languages permit “reversal” in which the vector appears in absolutive form and the pole appears in finite form, see Hook 1974, Poornima & Koenig 2008, 2009, Poornima 2012.

Related to IA CVs are what are sometimes referred to as “CONJUNCT VERBS” (Kachru 1982, Masica 1993), which also involve what might be thought of as “light verbs” (Butt 2010) combining with more semantically-contentful elements, in this case nouns and adjectives. The most frequent light verbs which appear in this function are DO (e.g. Hindi śurū karnā ‘begin (tr.)’ (lit. “start make”)) and BE/BECOME (e.g. Hindi śurū honā ‘begin (intr.)’ (lit. “start be”)), where these elements behave as semantically-empty verbalisers for making transitive and intransitive verbs, respectively, from nouns and adjectives.19 However, other light verbs can appear in N/ADJ+V combinations including GIVE, TAKE, HIT/KILL, which also appear in V+V collocations, e.g. lāt mārnā ‘kick’ (lit. “leg (n.) hit/kill (v.)”), mol lenā ‘purchase’ (lit. “price (n.) take (v.)”).

5.4.2.3.3. Diachronic evolution

Historically, IA CV constructions derive from collocations involving converbs, with which they continue to coexist. The fixed form of the absolutive pole derives from that of the converb (also referred to as a “conjunctive participle”). However, converbs, unlike the pole of CVs, denote a state or event independent of that expressed by any other verbal element of the clause; contrast (59), where ā appears as a converb, with (58), where it appears as an absolutive.

(59) vah          kal        ā-ke      gayā
     he/she.NOM   yesterday  come-CVB  go.PST.SG.M
     ‘Yesterday, he came and went.’ (Hindi)

In many IA languages the converb has undergone secondary morphological differentiation from the absolutive by the (somewhat optional) use of various extensions, including the ke shown in the Hindi example (59). However, CV and converb constructions remain potentially ambiguous in some instances (Hook 1974: 54; Slade 2013: 534).

The modern IA CV has traditionally been considered an innovation of the modern period, as clear examples of CVs do not appear in IA languages until the early modern period (Masica 1991: 325; Slade 2013), with the exception of Sinhala, although various authors have pointed to apparent early instances of CVs in Apabhraṁśa (R. A. Singh 1980, Hook 1993, Bubenik 1998), Pāli (Hook 1993), and Sanskrit, including even Vedic (Butt & Lahiri 2002, Butt 2003, 2010; see also Tikkanen 1987).

Both historically and synchronically, IA languages show the use of a variety of verb-verb (V-V) collocations which involve complex predication in which one verb provides the main verbal semantics, and the other verb contributes aktionsart/aspectual information. These other V-V constructions involve the “main” verb appearing in a participial form (rather than as an absolutive/converb). Example (60) provides an instance of such a V-V from Vedic prose, while (61) and (62) are from modern Hindi and Nepali, respectively.

19 This is a particularly common strategy for borrowed elements, and many conjunct verbs involve Persian/Arabic elements. Masica (1991: 368) suggests that the conjunct verb structure itself may reflect Persian influence.


(60) adaṇḍyaṁ                    daṇḍena       ghnantaś                caranti
     not-to-be-beaten.ACC.SG.M   stick.INS.SG  hit.PRS.PTCP.NOM.PL.M   move.PRS.3PL
     ‘They keep beating (somebody who is) not to be beaten.’ (Pañcaviṁśa Brāhmaṇa 17.1.9; see Whitney § 1075b) (Sanskrit, Vedic prose)

(61) jāgte               raho
     wake.PRS.PTCP.PL.M  continue.IMPV
     ‘stay alert!’ (Hindi)

(62) ma       bhandai          jānchu,
     I.NOM    speak.PRS.PTCP   go.PRS.1SG
     taṁ      lekhtai          jā
     you.NOM  write.PRS.PTCP   go.IMPV
     ‘I’ll keep speaking and you keep writing!’ (Nepali)

Various explanations of the historical development of IA CVs have been offered (see Hook 1991a, 1993; Butt 2003, 2010; Butt & Lahiri 2002; Poornima & Painter 2010; Slade 2013; Kimmig 2014), but a number of issues remain unclear, including the question of when the first CVs are found in IA. Resolving this question would involve interpreting evidence from Pāli and Apabhraṁśa. (Pāli offers certain additional difficulties arising from the fact that numerous texts originate in Sri Lanka or South India and thus instances of apparent CVs in Pāli may reflect calquing from Sinhala or Dravidian which would then have no direct bearing on developments in later continental IA.) As well, given that the modern IA languages show numerous differences in terms of both their inventories of vectors and the morphosyntax of CVs, much work remains to be done on the later history of the development of IA CVs, including the historical development of CVs in particular languages.

Butt (2003, 2010) argues that modern IA CVs have existed with few changes since Vedic Sanskrit (as part of a larger argument about the special status of light verbs). Various studies (Hook & Pardeshi 2005; Slade 2013; Kimmig 2014) have pointed out difficulties with such claims. In part, though various types of complex predicates are found in early IA, there is no obvious continuity between these and modern IA CVs (see Slade 2013). The earliest examples of verb-verb collocations which are truly ancestral to the modern IA CVs are uncertain: only dubious examples exist in Sanskrit; Pāli evidence is difficult to interpret; however, it is possible that CVs may be already found in Apabhraṁśa (R. A. Singh 1980, Bubenik 1998). Further investigation by scholars versed in the intricacies of Apabhraṁśa is required.
It does appear that CVs are found already in the Sinhala of the 8th to 10th century (Paranavitana 1956: § 501), though these show numerous similarities to Dravidian CVs and could represent an early instance of convergence not shared with other IA languages; thus the relevance of such examples is bound up with the question of the relation of IA and Dravidian CVs.

The relationship of IA CVs to CV structures in Dravidian is not fully clear: it has frequently been suggested that IA CVs may represent a calque of Dravidian CVs (Chatterjee 1926, Hook 1991a, Herring 1993) based on numerous similarities. However, it is also certainly possible that IA CVs represent an independent development which (in certain languages) underwent convergence with Dravidian (see Hook 1991b) or are part of a larger areal phenomenon (including at least both other South Asian language families as well as Central Asian languages (see Masica 1976); on the potential participation of IA in a larger Sprachbund stretching as far east as China, see Liang & Hook 2006).

One language for which more research would be highly desirable is Sinhala, unusual amongst modern IA languages in its possession of a fairly complete continuous literary record over the past 1000 years (with some earlier texts as well). Sinhala shows a clear use of CVs well before other IA languages (ca. 800–1000 AD; Paranavitana 1956: § 501). Sinhala’s use of CVs is interesting in a number of respects: the language diverged from mainland IA over two millennia ago and has had relatively little contact with other IA languages since that time. It would thus be instructive to know whether Sinhala’s use of CVs reflects an IA inheritance or is due to influence from Dravidian (or both).

5.4.2.3.4. Semantics

The semantic contribution of the vector in CVs varies from instance to instance and can be difficult to pin down in many cases. In general, many CV combinations involve a sense of perfectivity/completion (Hook 1974, 1993), which goes beyond the grammatically-encoded imperfective : perfective distinction that is found in IA examples like:

(63) a. vah          ātā                   thā
        he/she.NOM   come.IPFV.PTCP.M.SG   be.PST.SG.M
        ‘he was coming / he used to come’ (Hindi)
     b. vah          āyā
        he/she.NOM   come.PFV.PTCP.SG.M
        ‘he came’ (Hindi)

That is, even where an action/event is encoded in terms of using a past/perfective participle, as in (63b), that does not require that the action be complete, whereas the use of a CV, as in (58), requires that the action/event must have reached true completion; see the Nepali example in (64) where a perfect form of the verb is possible, but not a perfective CV.


(64) mai-le  us-lāī   ciṭhī   (diẽ           / *di-i       sakẽ),
     I-AGT   him-DAT  letter  (give.PST.1SG  / *give-ABS   complete.PST.1SG)
     tara  us-le   liena
     but   he-AGT  take.PST.3SG.NEG
     ‘I gave him the letter, but he didn’t take it.’ (see Hook 1974: 163–168) (Nepali)

In other cases the choice of a particular vector can involve features other than aspect. One such is the explicit signalling of the actor’s volitionality with respect to the event/action (e.g. in Hindi ḍālnā ‘(lit.) pour’ attributes conscious choice to the subject, while paṛnā ‘(lit.) fall’ signals the subject’s lack of control over the event; see Butt 1993). Another example of a CV with a non-aspectual feature, present in many IA languages, is the use of a GIVE vector to indicate that the action is done for another’s benefit or else has some sort of outward-direction (e.g. Hindi paṛh denā, ‘(lit.) read give’ is used to mean reading aloud), while the TAKE vector indicates that the action benefits or is directed towards the subject. Other vectors can indicate that the action/event occurred suddenly (e.g. Hindi uṭhnā ‘(lit.) rise’) or violently (e.g. Hindi ḍālnā ‘(lit.) pour’), or that the speaker regrets the action/event (e.g. Hindi baiṭhnā ‘(lit.) sit’).

Fairly detailed formal semantic analyses of IA CVs are found in M. Singh 1998 and Butt & Ramchand 2001, and more recently, Poornima 2012. Poornima examines the semantics of several vectors in Hindi, and suggests that semantic notions like ‘regret’ which linguists often assign to vectors are in fact pragmatic inferences which arise from more basic semantic properties. She argues that the aspectual properties of vectors involve boundedness, with other properties (like ‘regret’) arising from affectedness. Further formalisation of the semantics involved in IA CVs, as well as consideration of the synchronic connection of the light verb senses to their full verb counterparts (if any), remain areas worthy of study.

5.4.2.3.5. Language-particular collocational and syntactic constraints

For many vectors, in many IA languages, there are large numbers of polar verbs they may combine with; however, there exist certain collocational constraints on particular vectors in particular languages (see U. N. Singh et al. 1986). In some languages (e.g. Bengali) there is a fairly strict general requirement that pole and vector both be transitive or both be intransitive (Dasgupta 1977); other cooccurrence restrictions may be more idiosyncratic and less transparent. There exist completely idiomatic CV collocations like Hindi bajā lānā ‘carry out, obey’ (Hook 1974: 115) or Nepali ai-pugnu ‘to arrive’, where pugnu only occurs as a vector in this CV collocation.

Though there seems to be a core group of vectors shared by most IA languages, which includes GIVE, and often GO, TAKE, THROW/PUT, RISE, FALL, and COME, a good deal of variation exists, both in the total number of vectors and in which vectors are found. Hindi uses between 24 (Hook 1974) and 47 (Nespital 1997) vectors, while Nepali uses somewhere between 11 (Sharma 1980) and 16 (Pokharel 1991); in Shina 6 vectors have been identified (Schmidt 2004), and just 3 in Kalasha (Bashir 1993). Though TAKE is common in many IA languages, it is conspicuously absent in Nepali and apparently in Shina and Kalasha as well. There is also a great deal of variation in terms of the lexical meanings of the full verbs used to express various vector meanings: the vector associated with ‘regret’ is SIT in Hindi but SEND in Nepali. A full and detailed study of the vectors found in each language is highly desirable.

IA languages vary not only in the collocational constraints on V-V combinations, but also in terms of the syntactic environments in which CV constructions are permitted or required. For example, Hindi CVs are fairly infrequent in negative contexts and “semi-negative” contexts like sirf … hī ‘only’, śāyad hī ‘hardly’ (Hook 1974, 1988), while the same constraint is not as strong in other languages; e.g. in Marathi certain CV combinations can be easily negated (Hook 1988, Pardeshi 2001). Similarly, in Hindi, CVs are nearly obligatory whenever an event/action is perfective/completive (Hook 1974), while the same requirement is not found in other IA languages like Marathi (Hook 1988) or Nepali (Slade 2013). Hook (1988, 1993) points out also that certain contexts in Hindi strongly prefer or disprefer the use of CVs; in Hindi, strongly CV-preferring environments include clauses dependent on a verb expressing fear, as in:

(65) mujhe   ḍar   thā          ki    kahīṁ  tum  use
     me.DAT  fear  be.PST.M.SG  that  lest   you  he.DAT
     ciṭṭhī  nā   de        do
     letter  NEG  give.ABS  give.SBJV
     ‘I was afraid that you might give him the letter’ (Hook 1993: 100) (Hindi)

Hook (1988, 1993) points out that other IA languages may display such preferences more weakly or not at all. In Marathi, and likewise in Nepali, verbs of fear do not trigger the use of CVs:

(66) tai-le   ciṭhī   ta        dienas            holā
     you-AGT  letter  PARTICLE  give.PST.2SG.NEG  be.FUT.3SG
     bhanera  ma-lāī  ḍar   lāgethyo
     QUOT     me-DAT  fear  apply.PST.1SG
     ‘I was afraid that you might give him the letter’ (Nepali)


A more complete description of such differences across a wider range of IA languages is highly desirable, as would be diachronic studies focussed on the paradigmatisation of CVs (as in Hook 1988, 1993; Slade 2013).

5.4.2.3.6. Language-specific morphosyntactic differences

IA CVs also differ from language to language in terms of their morphosyntactic properties. These include reversibility (see fn. 18 above), interruptibility, vector recursion, head dominance, and other morphosyntactic restrictions (on the last four, see Slade 2013). In terms of interruptibility, many speakers of Hindi allow for pole and vector to be separated by particles, pronouns, and for some speakers even full NPs, while in Nepali nothing at all may intervene between the two verbal elements. Vector recursion refers to the ability of a pole to support multiple vectors: languages like Nepali and Sinhala allow for this (67), while Hindi does not. Hindi, Nepali, and many other IA languages require special marking of agents of transitive verbs in certain tenses; head dominance refers to whether the pole or the vector determines whether the entire CV is treated as transitive or intransitive for purposes of agent-marking.

(67) meyāge   bandinə   wayəsə  dæŋ  pahu  wē-gənə-yanəwa
     his/her  marrying  age     now  past  become.ABS-take.ABS-come.ABS
     ‘Her marrying age is now passing by.’ (Paolillo 1989, cited in Herring 1993) (Sinhala)

A wide-ranging study of morphosyntactic differences across IA languages remains to be done.

5.4.2.3.7. Conclusion

CVs are a distinctive feature of modern IA languages. Although this feature has not been neglected in linguistic research, much remains to be investigated; particularly desirable are in-depth studies of CVs in languages other than Hindi/Urdu and crosslinguistic comparison of the differences in the morphosyntax of CVs across IA languages, from both synchronic and diachronic perspectives. Questions concerning the position of CVs within the grammar, in terms of the environments in which CVs are permitted and the environments in which they are obligatory, have been explored from various perspectives in Hook 1988, Poornima & Painter 2010, and Slade 2013, though more remains to be done in this area, particularly in terms of comparison of different IA languages.

5.4.2.3.8. Appendix: A non-exhaustive listing of studies by language

Assamese: Buragohain 2008
Bengali: Pal 1972; Dasgupta 1977; see also Butt & Lahiri 2002
Gujarati: See Cardona 1976: 124–133
Hindi/Urdu: Hook 1974; Butt 1993, 1995; Verma 1993; Nespital 1997; Butt & Ramchand 2001; Poornima 2012; see also Nespital 1966, 1989; Hook 1977, 1988, 1991a, 1993; Masica 1976; Kachru & Pandharipande 1980; Abbi & Gopalakrishnan 1991; Kachru 1993; Butt & Lahiri 2002; Butt 2003, 2010; Hook & Pardeshi 2005; Poornima & Koenig 2008, 2009; Slade 2013
Kalasha: Bashir 1988b: 218–263, 1993
Kashmiri: Kaul 2006; Buragohain 2008
Khowar: Bashir 1988b: 218–263
Marathi: Pandharipande 1993; Pardeshi 2001
Marwari: Hook 1993
Nepali: Pokharel 1991; see also Sharma 1980: 100–111; Slade 2013
Panjabi: Bahl 1969; Raja 2003
Shina: Schmidt 2004
Sinhala: Paolillo 1989

5.4.3. Finite and nonfinite subordination20
By Hans Henrich Hock

5.4.3.1. Introduction

Until recently the standard view held that Dravidian languages are “Strict SOV” languages permitting only one finite verb21 per complex sentence and requiring any other, subordinate verb to be nonfinite (the “Finiteness Constraint”), in contrast to Indo-Aryan, which permits both finite and nonfinite subordination. For Dravidian, the perception goes back to Caldwell 1875:

In every sentence there is but one finite verb, which is the last word in the sentence, and the seat of government; and all the verbs which express subordinate actions or circumstances, whether antecedent or contemporaneous, assume an indeterminate, continuative character, as verbal participles or gerundials, without the need of conjunctions or copulatives of any kind; so that the sense (and more or less the time also) waits in suspense for the authoritative decision of the final governing verb. (Caldwell 1875: 379)

The perception remains widespread in later publications including Winfield 1928, Emeneau 1956, 1965, Kuiper 1967, Thomason & Kaufman 1988. Dravidian RELATIVE-CORRELATIVE (“RCCC”22) structures with finite verbs in both clauses, such as (68), were considered alien, borrowed from Indo-Aryan; e.g. Burrow & Bhattacarya 1970, Masica 1976, Sridhar 1981, Thomason & Kaufman 1988. Indigenous relativization was considered to be nonfinite participial as in (69).

(68) [evaṉ  naṉṟāka  ur̤aikkiṟāṉ]RC     =ō     [avaṉ  vār̤kkaiyil   muṉṉēṟuvāṉ]CC
     who    hard     work.PRS.3SG.M    =CLIT  that   life.LOC.SG  succeed.FUT.3SG.M
     ‘Who works hard will progress in life.’ (Tamil23)

(69) naṉṟāka  ur̤aikkiṟa      paiyaṉ         vār̤kkaiyil   muṉṉēṟuvāṉ
     hard     work.PRS.PTCP  boy.NOM.SG.M   life.LOC.SG  succeed.FUT.3SG.M
     ‘A boy who works hard will progress in life.’ (Tamil)

Regarding Indo-Aryan, the concern was raised that RCCC constructions with clause-initial relative pronouns are typologically unusual, since relative clauses of this type supposedly cannot precede their referents. Some scholars assumed Dravidian influence; see the following passage from Masica 1976, and also Masica 1991: 401–416.

[Left branching is consistent only in Dravidian] Although conditional, temporal, and adverbial clauses normally precede the main clauses to which they are subordinated in all the languages [of South Asia], only Dravidian […] forces nominal object clauses to precede their verb and adjectival (“relative”) clauses to precede the noun they qualify; in Hindi, Bengali, and Santali they normally follow. [The Hindi relative-correlative construction is a kind of] compromise as far as relative clauses are concerned. (Masica 1976: 25)

20 A comprehensive account, but from a different perspective, is Subbarao 2012: Chapters 6 and 8.
21 For Dravidian the term “finite PREDICATE” is more appropriate (Steever 1988, 2008).

22 RCCC stands for Relative Clause – Correlative Clause.
23 Example (68) = (70a) comes from Ramasamy 1981: 363. I am grateful to Chaitra S. Prasad, a native speaker of Tamil, for checking the related (69), (70b), (83).

Two developments led to a reassessment of the “standard view”. One was work by Ramasamy (1981) and Lakshmi Bai (1985), pursued by Steever (1988), showing that RCCCs are inherited in Dravidian: they are found throughout the language family, go back to the oldest stages, and differ from Indo-Aryan by being limited to the order RC-CC. The second development was the recognition of RCCCs as a feature widespread in SOV languages (but not in others); e.g. Downing 1978, Chr. Lehmann 1984, Keenan 1985, Lipták 2009.

While RCCCs with finite verbs in both clauses thus are fully compatible with SOV typology and are indigenous to Dravidian, this does not render the distinction between finite and nonfinite subordination meaningless. Tamil and other modern Dravidian languages exhibit a complementary relationship between finite and nonfinite subordination in RCCCs. If the relative clause has a finite verb it must be followed by a clitic, most commonly =ō (70a); if it has the nonfinite conditional, the clitic does not occur (70b). Steever (1988) accounts for the type (70a) by claiming that the post-RC clitic “shields” the finite RC verb from the Finiteness Constraint.

(70) a. [evaṉ         naṉṟāka  ur̤aikkiṟāṉ]RC      =ō
        RP.NOM.SG.M   hard     work.PRS.3SG.M     CLIT
        [avaṉ         vār̤kkaiyil   muṉṉēṟuvāṉ]CC
        CP.NOM.SG.M   life.LOC.SG  progress.FUT.3SG.M
        ‘Who works hard progresses in life.’ (Tamil)
     b. [evaṉ         naṉṟāka  ur̤aittālum]RC        Ø
        RP.NOM.SG.M   hard     work.COND(NONFIN)
        [avaṉ         vār̤kkaiyil   muṉṉēṟuvāṉ]CC
        CP.NOM.SG.M   life.LOC.SG  progress.FUT.3SG.M
        ‘Who works hard progresses in life.’ (Tamil)

The definition of “finite” or “finiteness” has been subject to some debate; see e.g. Adger 2007, Amritavalli & Jayaseelan 2005, Landau 2004. The absence of agreement in Malayalam finite verbs (in the sense of Caldwell’s “final governing verb”) rules out the intuitively appealing definition in terms of agreement marking. For present purposes, a good working definition of “finite” is the morphological form of a VP head in simple, non-composite sentences. (For some complications see 5.4.3.4 below.)

The issue of finite vs. nonfinite subordination extends beyond relativization, and there are considerable differences both geographically and chronologically as regards finite and nonfinite subordination.

5.4.3.2. Finite subordination

At least three different finite subordination strategies occur in South Asian languages: relative clauses, clausal nominalizations, and quotatives/complement structures. A few languages may employ all of these, others just two. With “Strict SOV” redefined along the lines of Steever (1988) and Hock (2005, see below), most languages conform to the Finiteness Constraint. The advent of center-embedded postnominal relative clauses breaks the Constraint.

5.4.3.2.1. Relative-correlatives

Relative-correlatives are widespread in South Asian languages. In Indo-Aryan and Dravidian they are indigenous; see the references above and Hock 1988, 2005. Their occurrence in Tibeto-Burman languages most likely reflects Indo-Aryan influence (see 2.6.8 with references). In Munda, they may reflect Indo-Aryan influence (especially where the Indo-Aryan relative pronoun in j- is used), but Dravidian influence cannot be excluded (2.6.5).

Indo-Aryan stands out by having a special relative pronoun distinct from the interrogative, marked by y- in Old and Middle Indo-Aryan, j- or its reflex in Modern Indo-Aryan. Further, it permits both the order RC-CC and CC-RC, a feature inherited from Proto-Indo-European; see Hock 1988 and 2013. So, corresponding to (70a), Sanskrit permits both (71a) and (71b).

(71) a. [yo           yatnena           kāryaṁ         karoti]RC
        RP.NOM.SG.M   effort.INST.SG.M  work.ACC.SG.N  do.PRS.3SG
        [sa           saṁsāre          puṣyati]CC
        CP.NOM.SG.M   world.LOC.SG.M   thrive.PRS.3SG
        ‘Who works hard progresses in life.’
     b. [sa           saṁsāre          puṣyati]CC
        CP.NOM.SG.M   world.LOC.SG.M   thrive.PRS.3SG
        [yo           yatnena           kāryaṁ         karoti]RC
        RP.NOM.SG.M   effort.INST.SG.M  work.ACC.SG.N  do.PRS.3SG

Except where the Indo-Aryan j-pronoun is adopted, the other South Asian languages with RCCCs use the interrogative pronoun (which may also serve as indefinite). Exceptions are certain Munda and North Dravidian languages which may use a demonstrative (see 2.6.7, this volume), as well as Nihali (Nagaraja 2014: 128) and Palula (Liljegren 2008: 349–357). There are also RCCCs without any pronoun (relative, interrogative, or demonstrative) in the RC; see e.g. Gondi (Steever 1998c: 280), Marathi (Pandharipande 2003: 715), and Khowar, Kalasha, and Shina (Bashir 2003: 849–850, 856, 882). In some of the Iranian languages of the Northwest, RCs are marked by relative particles in preverbal position; see e.g. Shughni (Edelman & Dodykhudova 2009: 812) and Wakhi (Bashir 2009: 850). And there is the more common Iranian type with clause-initial relative marker kĕ, čĕ, etc.; see relevant contributions in Windfuhr 2009 (ed.). Further research on these variants and their relationships to each other is highly desirable.

Both Dravidian and Indo-Aryan permit relative-correlatives with more than one RP, corresponding to more than one CP, as in (72) (Hindi-Urdu, adapted from Bhatt 2003).

(72) [jis-ne  jo      karnā   cahā]RC
     RP.ERG   RP.ABS  do.INF  want.PFV.PST.SG.M
     [us-ne   vah     kiyā]CC
     CP.ERG   CP.ABS  do.PFV.PST.SG.M
     “Who wanted to do what, that one did that.” ≈ ‘Everybody did whatever he wanted.’


In discussions of Modern Indo-Aryan the term relative-correlative is usually restricted to structures of the type (71a), and type (71b) is considered an extraposed variant of center-embedded postnominal structures; see e.g. Masica 1991: 401–415, Srivastav 1991b/Dayal 1996.24 The account works for languages like Modern Hindi-Urdu, which have postnominal structures (see below), but fails for earlier stages of Indo-Aryan, which lacked such relatives.

Most recent publications analyze left-peripheral RCs as underlyingly adjoined (e.g. Srivastav 1991b/Dayal 1996, Davison 2009 for Hindi-Urdu). Bhatt (2003) accounts for structures with single RPs by movement from matrix-clause-internal position and considers only RCs with multiple RPs to be underlyingly adjoined; for critique see Davison 2009. For Sanskrit, Hock (1989) presents arguments for analyzing RCCC (and CCRC) as conjoined, an analysis accepted and modified by Davison 2009.

Comparison of (71) and (72) with (70a) illustrates an important difference between Indo-Aryan and Modern Dravidian languages like Tamil. The latter require a post-RC clitic (if the RC has a finite verb), the former do not. Post-RC marking is also found in Burushaski (Hock 2005), Kalam Kohistani (Baart 1999: 27, 150), and Nagamese/Mongsen Ao (Coupe 2007b). It is absent in the geographically northern Dravidian languages and in Koraga, and optional in Malayalam; it is also absent in Old Tamil, Old Malayalam, and Old Kannada (Hock 2005, 2008), and of course in Indo-Aryan. Hock argues for supplementing Steever’s “clitic-shielding” account for type (70a) by an account that permits (peripheral) RCs in RCCC constructions to be syntactically autonomous and hence not subject to the Finiteness Constraint (2005). Steever (2008) rejects Hock’s findings for Old Dravidian, but does not offer a synchronic account for the Modern Dravidian languages that lack post-RC clitics (or only have them optionally). The issue deserves further, independent research.

POSTNOMINAL (center-embedded) RCs, as in (73), are an innovation found in a number of Modern Indo-Aryan languages (2.7.5.2). They are often considered “bookish” or “officialese”, and some speakers do not like them; for others they are grammatical. The question whether and how the new relativization strategy affects the grammar of languages like Hindi-Urdu and how it relates to “double-DEM” structures like (74) deserves fuller investigation. For the time being, see Srivastav 1991b/Dayal 1996. At any rate, structures of this sort cannot be reconciled with the Finiteness Constraint in a principled way.

24 Dwivedi’s (2003) proposal that HU right-peripheral RCs are quasi-conjoined has not gained traction.

(73) vah  ādmī  [jo  vahāṁ  baiṭhā hai]RC  merā  bhāī     hai
     DEM  man   RP   there  sit.PFV.3SG    my    brother  be.PRS.3SG
     ‘The man who is sitting there is my brother.’

(74) vah  ādmī  [jo  vahāṁ  baiṭhā hai]RC  vah  merā  bhāī     hai
     DEM  man   RP   there  sit.PFV.3SG    DEM  my    brother  be.PRS.3SG
     ‘The man who is sitting there is my brother.’

5.4.3.2.2. Nominalizations

Tibeto-Burman languages generally employ a different, inherited relativization strategy which however has much broader application — clausal nominalization (see e.g. Matisoff 1972, Genetti 1992, 2011, Noonan 1997, DeLancey 2002).25 Prototypically, this type of nominalization is characterized by clause-internal finite syntax, with a post-verbal nominalizer or a similar element, e.g. the “article” in (75) integrating the clause into the matrix clause as one of its constituents. Yoon (1996) characterizes structures of this sort as having in effect two heads, one being the finite verb, the other the nominalizer. (75) [ [khan-na asen a-in-u] -na 2.buy.3 ART you-SG . ERG yesterday meruba pu-metta-ŋ] goat look-CAUS -1 SG ‘Show me the goat you bought yesterday.’ (Athpare; Bickel 1999b) There is, however, some variation. For instance, Bickel (1999b) notes that some, but not all, of the Kiranti languages have reduced tense-modal marking in the nominalized clause. Instead of acting as independent constituents of the matrix clause and being “internal-headed” in Subbarao’s (2012) terminology, they may modify matrix-clause constituents and hence be “external-headed”; see e.g. (76b) vs. (76a) (Classical Tibetan, Beyer 1993). Following the general rules of TibetoBurman, these modifiers can either precede or follow their “external head”. (76) a.

25

[bla-mas bgegs btul] -sa ri-la yod Lama.ERG demon tamed NMLZ mountain.LOC is ‘The Lama who tamed the demon is on the mountain.’

TB clausal nominalization can have a variety of other functions. For its mirative use, see 5.5.2.2.

Syntax and semantics

b.   saṅgs-rgyas_i [(gaṅ_i) dgon-pa-la bžugs] -pa-s
     Buddha QP/IP monastery.LOC dwell.PRES NMLZ
     tšhos bšad
     dharma teach.PST
     ‘The Buddha_i (who_i) dwells in the monastery taught the dharma.’

Example (76b) is interesting for another reason: the optional appearance of an interrogative/indefinite pronoun in the nominalized structure. As Beyer (1993: 318–326) explains, the pronoun functions as a “dummy role particle carrier”, serving to disambiguate the grammatical role of gapped coreferential NPs in the RC. Although often found in translations of Sanskrit RCCCs, the usage is indigenous according to Beyer. Further research on structures like (76b) and their possible relation to Modern Tibetan RCCCs like (77) (from Cable 2009) and similar structures in other modern Tibeto-Burman languages is needed. Given the Classical Tibetan evidence, the analysis of the post-RC marker in (77) is ambiguous: it may be an (original) nominalizer or a marker shielding the preceding RC from the Finiteness Constraint.

(77) [khyodra-s gyag gare nyos yod] na
     you.AG yak what buy AUX COND
     nga-s de bsad pa yin
     I.AG that kill PERF AUX
     ‘I killed whatever yak you bought.’

Munda, too, offers structures that may be considered nominalizations (2.6.7), but their internal syntax (finite vs. nonfinite) requires further study. Tamil (and perhaps other Dravidian languages, too) permits the quotative verb eṉ to be nominalized and thereby exhibits a limited degree of clausal nominalization; see (78) from Beythan 1943: 206.

(78) [maṉitar ellārum pāvikaḷ] eṉ-patu uṇmai tāṉ
     people.MF.PL all.MF.PL.EMPH sinner.MF.PL eṉ-NMLZ truth itself
     ‘The fact that all people are sinners is truth itself.’

5.4.3.2.3. Quotatives and other Complement structures

Many of the South Asian languages, especially the Dravidian ones, employ QUOTATIVES to mark cited discourse that may contain finite verbs; e.g. (79). In most cases, the markers are derived from a verb SAY (often a converb), but some languages (e.g. Marathi, Gujarati, Sanskrit) have a marker meaning ‘thus’. Quotatives may exhibit extended uses, with verbs of thinking, hearing, seeing, as well as in purpose or cause constructions; as a consequence they may gain properties usually associated with Complementizers. See Kachru 1979 for a pilot study, Hock 1982 for Sanskrit, and 2.6.2.2, 2.6.8 of this volume for regional distribution.

(79) nāṉ [[avaṉ iṅkē vantāṉ] eṉṟu] niṉaikkirēṉ (Tamil)
     I.NOM.SG he.NOM.SG here come.PST.3SG.M QUOT think.PRS.1SG
     ‘I think that he has come here.’

Quotative markers normally follow the cited discourse or complement clause; but see Marlow 1997 for preposed markers in some Indo-Aryan languages. Dravidian languages employ a number of syntactically similar devices, especially structures marked by a form of āk- ‘be, become’, a growing variety of nouns, as well as particles like “dubitative” =ō (Steever 1988). In typological accounts, constructions of this type are commonly considered Complement structures and are contrasted, as Left-branching, with the “Right-branching” Complement constructions marked by preposed ki/ke or a form of the RP in j- that are widespread in Indo-Aryan; e.g. (80). See e.g. Bayer 2001, Davison 2007, Subbarao 2012. Davison (2007) makes the interesting observation that Left-branching complementation is found only in languages with clause-final yes/no-question markers.

(80) maiṁ (yah) soc rahā hūṁ [ki [vah yahāṁ āyā]]
     I this think.PRS.PROGR.1SG COMP he here come.PFV.PST.SG.M
     ‘I think that he came here.’

As noted by Bayer, Davison, and Subbarao, the Left- and Right-branching structures are not equivalent in their behavior. Differences include the following. Left-branching structures canonically occur in the object position of the matrix clause, but may be moved to the left or right periphery. (Sridhar 1990 notes that rightward movement is normal in Kannada for longer embedded clauses.) Right-branching structures can only be right-peripheral.26 Left-branching structures tend to be more limited in usage than right-peripheral ones. Left-branching structures may exhibit “stacking” of markers, as in (81a) (Tamil, from Subbarao 2012: 198) and (81b) (Sanskrit; iti iti → itīti; see Hock 1982).27 Right-branching structures do not permit this. Note further that Hindi-Urdu Right-branching structures are similar to CC-RC constructions, in that the preceding “main clause” may contain a correlative-like demonstrative pronoun; see the optional yah in (80).

26 HU structures in which the ki-clause is preceded by yah bāt (lit. ‘this matter’) or the like are an exception (Subbarao 2012: 221).

27 The structures in (81) differ. Subbarao’s Dravidian “stacking” examples seem to be pleonastic, but the Sanskrit one is syntactically motivated by multiple “embedding”. Something similar occurs in (Dravidian) Toda, where Emeneau’s (1984) texts contain numerous examples with cited discourse followed by ïḏs̱ k ïḏti ‘ says, so they say’, with two instances of the quotative verb ïn-. It remains to be seen whether this is more widespread in Dravidian.

(81) a. nī vara-v-illai eṉṟu (solli) nāṉ vara-v-illai
        you come.NEG QUOT QUOT I come.NEG
        ‘(Because)28 you did not come, I did not come.’

     b. tena kathitam
        DEM.INS.SG.M say.ta.PTCP.NOM.SG.N
        [mayā cintitam [[rājā
        I.INSTR.SG think.ta.PTCP.NOM.SG.N king.NOM.SG.M
        na samāgata] iti] iti]
        NEG arrive.ta.PTCP.NOM.SG.M QUOT QUOT
        ‘He said “I thought that the king has not come.”’

28 The causal interpretation is one of the extended functions of the quotative.

A number of South Asian languages have both Left- and Right-branching structures. These include Bangla, Oriya, Marathi, Gujarati, Kashmiri, as well as Old and Middle Indo-Aryan. In the latter, Right-branching structures are of the CC-RC type and are not as common as Left-branching ones. A subset of the languages that permit both branchings also have structures with an initial and final “complementizer” in the same clause; e.g. (82). These languages include Old and Middle Indo-Aryan, Marathi, Gujarati, Oriya, Kalasha, Palula, and Gadaba (see Bayer 2001 and Davison 2007 for Marathi, Gujarati, and Oriya, Bashir 1988: 279–282 for Kalasha, Liljegren 2008: 334–341 for Palula, and Steever 1998: 36 for Gadaba29). As noted by Bayer and Davison, structures of this sort present a challenge to syntactic analysis.

29 The preclausal marker ki of Gadaba comes from Indo-Aryan.

(82) a. tvaṣṭā tam asyā ā badhnād
        Tvaṣṭṛ.NOM.SG.M that.ACC.SG.M she.LOC.SG tie-on.SUBJ.3SG
        yathā putraṁ janād iti
        RP.ADV son.ACC.SG.M give-birth.SUBJ.3SG QUOT
        ‘Tvaṣṭṛ shall tie that on to her so that (she) shall give birth to a son.’ (Sanskrit, Atharva Veda 6.81.3; Hock 1982)

     b. rām mhaṇto ki mī sindhī śiken asa/mhaṇūn
        Ram say.PRS.SG.M COMP I Sindhi learn.FUT.1SG QUOT
        ‘Ram says that I (= he) will learn Sindhi.’ (Marathi, Pandharipande 2003: 716)


Steever (1988) accounts for the widespread Dravidian use of quotatives and similar Left-branching constructions as licensed by their postclausal marker, just as in the RCCC type (70a). This account could be adapted to the markers of Right-branching structures unless, focusing on the optional DEM in structures like (82), we analyze these as relative-correlatives. The general tendency in South Asian morphosyntactic typology discussions is to look at Right-branching structures as a major departure from (“strict”) OV typology. The issue deserves further investigation.

5.4.3.3. Nonfinite subordination

Although they use various finite strategies, South Asian languages have a strong tendency to use nonfinite subordination devices, much stronger than that of western Eurasian languages, and in this respect they seem to conform to the idealized picture of “Strict OV” typology. The following sections survey the major nonfinite strategies.

5.4.3.3.1. Nonfinite relativization

Nonfinite relativization or participialization is common to Dravidian and Indo-Aryan and inherited in both families. Tibeto-Burman languages use similar structures, but these typically are nominalizations (at least in origin); and the case may be similar in Munda (more research is needed). Significantly, participialization originally behaves differently in Dravidian and Indo-Aryan. Dravidian has no restrictions on the Accessibility Hierarchy; see e.g. (83a) = (69) above beside (83b).30 Old Indo-Aryan (Sanskrit) restricts participialization to coreference between the (surface) subject of the embedded structure and the phrase that it modifies; see the grammatical (84a), with gender agreement between participle and its antecedent vs. the ungrammatical (84b) without agreement (which is left untranslated).

30 For a “comitative” exception see Subbarao 2012: 291.

(83) a. naṉṟāka ur̤aikkiṟa paiyaṉ vār̤kkaiyil muṉṉēṟuvāṉ
        hard work.PRS.PTCP boy.NOM.SG.M life.LOC.SG succeed.FUT.3SG.M
        ‘A boy who works hard will progress in life.’ (Tamil)

     b. paiyaṉ naṉṟāka ur̤aikkiṟa vēlai veṟṟi
        boy hard work.PRS.PTCP work successful
        ‘Work (that) a boy does (by working) hard (is) successful.’ (Tamil)

(84) a. yatnena kāryaṁ kurvan
        effort.INST.SG.M work.ACC.SG.N do.PRS.PTCP.NOM.SG.M
        bālaḥ saṁsāre puṣyati
        boy.NOM.SG.M world.LOC.SG.M thrive.PRS.3SG
        ‘A boy who works hard will progress in life.’

     b. *yatnena kāryaṁ kurvat
        effort.INST.SG.M work.NOM.SG.N do.PRS.PTCP.NOM.SG.N
        bālaḥ saṁsāre puṣyati
        boy.NOM.SG.M world.LOC.SG.M thrive.PRS.3SG

As Subbarao notes, this restriction is relaxed, to different degrees, in Modern Indo-Aryan languages, especially in close contact with Dravidian and Tibeto-Burman (2012: 278–294 with references). Subbarao also provides useful information on case marking in Modern Indo-Aryan participial constructions. See 5.4.3.3.3 below for “absolute” participial formations.

5.4.3.3.2. Nonfinite nominalization, verbal nouns, and infinitives

Terms like nominalization, verbal noun, and infinitive are widespread in grammatical discussions, especially of Indo-Aryan and Dravidian languages; but there is a fair amount of variation in the application of the terms. In principle, the designated phenomena are distinguishable in terms of their internal and external behavior. Nonfinite Nominalizations (NFN) and Verbal Nouns (VN) have case/number variation on the head, determined by the matrix clause; infinitives do not. VNs differ from NFNs by non-nominative marking of internal subjects. In practice, the categories are not so clearly distinguished. For instance, Dravidian “VNs” may have nominative subject marking (85), behaving more like NFNs; and the “infinitive” of languages like Hindi-Urdu may behave more like a VN by showing case/number and even gender variation31 on the verb and genitive case on the internal subject (86b).

31 The issue of gender variation in structures like (86c) has given rise to continuing syntactic debate on “Long Distance Agreement”; see Subbarao 2012: 114–122 and see 5.2.2.3.2 (this volume).

(85) nāṉ mantiri va-nt-at-ai pār-tt-ēn
     I.NOM minister.NOM come.PST.VN.ACC see.PST.1SG
     ‘I saw the minister coming.’ (Tamil; Annamalai & Steever 1998: 121)

(86) a. us-ko jā-ne do
        that.DAT go.INF.OBL give.IMP.2SG
        ‘Let him/her go.’

     b. us-ke jā-ne ke bād bāt kareṁ
        that.GEN go.INF.OBL after talk do.SUBJ.1PL
        ‘Let us talk after he leaves.’

     c. mujhe cīnī kharīd-nā/nī hai
        I.DAT sugar.F buy.INF.M/F be.PRS.3SG
        ‘I have to buy sugar.’

Historically, too, the distinctions are not always clear, and formations may change from one category to another. Thus, the Sanskrit infinitive marker -tum is a “frozen” accusative of a VN in -tu (which preserves alternative case forms in Vedic: -tos GEN/ABL and -tave/tavai DAT), but the form is lost in Modern Indo-Aryan, whose “infinitives” reflect Sanskrit VNs (e.g. Hindi-Urdu -nā from Skt. -na). Moreover, Tibeto-Burman exhibits fluctuation between finite and nonfinite nominalizations; see e.g. Genetti 2011. Detailed crosslinguistic studies of the morphosyntax of South Asian NFNs, VNs, and infinitives, and of their similarities, differences, and relations to each other are still a desideratum. One aspect of NFN/VNs has received a fair amount of attention, namely their use in Dravidian “cleft” constructions such as (87), in which a focused element is construed as predicate of a nominalization of the rest of the sentence (with or without copula). For recent discussion see Jayaseelan & Amritavalli 2005.

(87) a. nāṉ piṟa-nt-atu maturai(y.il)
        I.NOM be.born-PST-“VN” Madurai(.LOC)
        ‘It is Madurai where I was born.’ (Tamil; Annamalai & Steever 1998: 123)

     b. John Mary-ye innale āṇǝ kaṇḍ-atǝ
        J. M.-ACC yesterday COP saw-NMLZ
        ‘It was yesterday that John saw Mary.’ (Malayalam; Jayaseelan & Amritavalli 2005)

Subbarao notes a similar construction in Tibeto-Burman Mizo and Hmar, with both finite and nonfinite nominalization (2012: 130–131).

5.4.3.3.3. Converbs and similar nonfinite subordination devices

One of the defining features of South Asian convergence is a structure variously referred to as “adverbial past participle” (Carey 1804), “verbal participle or gerundial” (Caldwell 1875), “conjunctive participle” (Grierson 1903–1928), “absolutes Participium” (A. Schlegel 1820), “gerund(ium)” (Bopp 1819, note 77), “absolutive” (e.g. Bloch 1934); see Masica 1976: 109–110, Haspelmath 1995: 45–46. The use of terms containing “participle”, especially widespread in English-language publications, is problematic since the form is adverbial rather than adjectival, and there are problems with the other traditional terms too. The term CONVERB, traditional in Altaic, is more felicitous, has come to be used more widely after Haspelmath 1995 and Haspelmath & König 1995, and is used in TB and Munda linguistics (e.g. Noonan 1999, Willis 2007a, Genetti 2011; Anderson 2007). Haspelmath’s definition (1995: 3–8) as ‘a non-finite verb form whose main function is to mark adverbial subordination’ captures most of the characteristics of South Asian converbs. It needs to be supplemented by the fact that South Asian converbs generally refer to anterior, sequenced actions and generally require subject control. Some languages also have simultaneous-action converbs, e.g. Sanskrit (Renou 1930), Chantyal (Noonan 1999). Further, many languages permit occasional exceptions to subject control. See Tikkanen 1995 for fuller discussion (including the transparency of converbs to the modality, negation, etc. of the main verb).

The general tendency for subject control has been widely used in South Asian linguistics as one of the criteria for determining whether oblique experiencers have subject properties. See the contributions in Verma & Mohanan 1990, as well as 1.3.1.5.1.2 for Sanskrit, and 5.4.1 for general discussion.

Some South Asian languages have a system of SWITCH REFERENCE, where Same-Subject converbs have Different-Subject counterparts; see 2.6.7 for Munda, and Watters 2002, Willis 2007a for certain TB languages. Dravidian may use infinitive structures for this purpose, as in (88); see Caldwell 1875: 427. Khowar employs an infinitive with oblique case marking for the same purpose (Elena Bashir, p.c. December 2012). For transitive verbs, Sanskrit comes close with the option of using the ta-participle instead of the converb; see (90) below. But as Anderson notes (2007), Different-Subject structures tend to serve a range of other functions besides switch reference. A fuller investigation of Different-Subject strategies in South Asian languages has yet to be conducted.

(88) oruvaṉ muṉ ir̤ukk-a maṟṟavar piṉ taḷḷiṉārkaḷ
     one.NOM.SG.M front pull.INF other.NOM.PL.M/F back push.PRS.PL.3M/F
     ‘While one pulled from the front, the others pushed from the back.’ (Tamil; Beythan 1943: 164)

Converbs serve as the basis for South Asian compound verb constructions (Masica 1976: Chapter 5; and section 5.4.2, this volume) and they are widely used for “Clause Chaining”, as in (89).32

32 See Haspelmath 1995: 20–27 for “Clause Chaining”.

(89) tad ākarṇya brāhmaṇaś chāgaṁ tyaktvā
     that.ACC.SG.N hear.CVB brahmin.NOM.SG.M goat.ACC.SG.M drop.CVB
     muhur nirīkṣya punaḥ skandhe kṛtvā
     repeatedly look.at.CVB again shoulder.LOC.SG.M do.CVB
     dolāyamānamatiś calitaḥ
     wavering.minded.NOM.SG.M go.ta.PTCP.NOM.SG.M
     ‘Hearing that, the brahmin dropping the goat, repeatedly looking at (it), putting (it) again on (his) shoulder, went (on) with a wavering mind.’ = ‘When the brahmin heard that, he dropped the goat, repeatedly looked at it, put it again on his shoulder, and went on with a wavering mind.’ (Sanskrit; Hitopadeśa 4. 53)

Following Bloch 1930, the use of converbs as discourse linkers as in (90a) has been considered to reflect Dravidian influence; see especially Emeneau 1971. Under the name Tail-Head Linkage, the phenomenon of nonfinite recapitulation has been shown to be widespread in (folk) narratives, irrespective of syntactic typology (Thompson & Longacre 1985: 209–213); and under the term Catena it has been shown to occur also in Ancient Greek (Migron 1993). In Sanskrit, converbs alternate with ta-participles in a Switch Reference system (90b). A further alternative is the use of an “Absolute Participle” construction (90c), with the participle in the locative (or genitive) and case/number/gender agreement with the subject of the construction. (In Hindi-Urdu, the subject instead tends to be in the genitive; McGregor 1972: 158.) This construction usually is employed if the subject of the embedded clause has no counterpart in the matrix clause.

(90) a. … pratyūcus te divaukasaḥ | … … ||
        reply.PRF.3PL that.NOM.PL.M heaven-dweller.NOM.PL.M
        evam uktvā kaliṁ devā … yayuḥ ||
        thus speak.CVB Kali.ACC.SG.M God.NOM.PL.M go.PRF.3PL
        ‘Those heaven-dwellers replied “… …”. Having thus spoken to Kali, the Gods went …’ (Sanskrit; Mahābhārata 3.55.7–11)

     b. yudhiṣṭhira uvāca | … … ||
        Yudhiṣṭhira.NOM.SG.M speak.PRF.3SG
        evam uktas tato rājñā
        thus speak.ta.PTCP.NOM.SG.M then king.INST.SG.M
        ḍhaumyo ’tha … | akarot vidhivat sarvaṁ
        Ḍhaumya.NOM.SG.M then do.IPFV.3SG duly all.ACC.SG.N
        ‘Yudhiṣṭhira spoke “… …”. Thus spoken to by the king, Ḍhaumya then did everything duly.’ (Sanskrit; Mahābhārata 4.4.51–52)

     c. evam ukte nalena …
        thus speak.ta.PTCP.LOC.SG.N Nala.INST.SG.M
        nṛpaḥ … āsasāda … bibhītakam
        king.NOM.SG.M sit.near.PRF.3SG vibhītaka.(tree).ACC.SG.M
        ‘Nala thus having spoken, … the king sat near a vibhītaka tree.’ (Sanskrit; Mahābhārata 3.70.6)

Besides converbs, Dravidian languages offer another nonfinite construction, the CONDITIONAL (which may also be used with RCCCs; 5.4.3.1 above). Similarly, Noonan (1999) talks about conditional converbs in Chantyal; and Bangla, Asamiya, and Oriya have conditionals distinct from converbs and infinitives (Dasgupta 2003: 370; Goswami & Tamuli 2003: 424; Ray 2003: 462). Note further the Hindi-Urdu use of the imperfective/present participle in contrary-to-fact conditionals (Schmidt 2003: 327). In Indo-Aryan, finite RCCC structures of the type Skt. yadi … tarhi ‘if … then’ are an alternative; but the relative adverb yadi (or the like) may be omitted. A comprehensive examination of conditional syntax in South Asian languages is still a desideratum.

5.4.3.3.4. Finitization (?)

As noted in 2.6.7, Mundari, Malto, and Kuṛux share the feature of adding (secondary) agreement markers to nonfinite verbs or simply using finite verbs in lieu of nonfinite ones. (The latter strategy is also widespread in Kiranti languages; see 2.6.7 with references.) Similar phenomena are found elsewhere. Emeneau (1984: 51–52, 137–138) recognizes a category of Toda subordinate structures whose verb has person/number agreement, but the third person marker differs from that of ordinary finite verbs. Old Tamil has structures with person markers where nonfinite verbs would be expected (91a), in this case a converb. Traditional Tamil grammar refers to this formation as muṟṟeccam ‘a finite verb in the function of a non-finite verb’, a term also used for the more widespread grammaticalized type (91bc) with two finite agreeing verbs, which Steever refers to as Serial Verbs; see Steever 1988: 50–52, Th. Lehmann 1994: 128–130, and 5.4.2.1, this volume. Steever argues that only the final predicate of Serial Verbs is syntactically finite and that the finiteness of the other one is morphological. This may work for structures like (91b), where the final verb can be considered a syntactic head, transmitting its features to the preceding verb; it does not work for the “Balance Verb” type (91c), since there is no hierarchical relation between the two verbs. Hock’s proposal that Serial Verbs involve (paratactic) coordination (1988) may work as a historical explanation but may not be appropriate synchronically. The syntax of Serial Verbs (91bc) and their relation to the type (91a) deserve further investigation.

(91) a. … maliv-aṉ-am maṟukk-i … nir̤al iru-nt-aṉ-am
        be-full.NPST.EUPH.1PL wander.CVB shade remain.PST.EUPH.1PL 33
        ‘... (we) were filled (with desire), wandered (around), and remained in the shade.’ (Old Tamil; Th. Lehmann 1994: 129)

     b. celvēm allēm
        go.NPST.1PL become.NEG.1PL
        ‘We will not go.’ (Old Tamil; Steever 1998: 42)

     c. ... vizu aya kulur poṭi tinad uṇad
        all that crane bird eat.NPST.3SG.N drink.NPST.3SG.N
        ‘... that crane consumes everything.’ Lit. ‘that crane eats, drinks everything’ (Koṇḍa; Steever 1988: 70)

33 EUPH indicates a so-called euphonic morphological linking element with no discernable syntactic function.

The occurrence of structures with finite verbs where nonfinite ones are expected raises questions about the robustness of the distinction finite : nonfinite in early Dravidian and hence also about the Finiteness Constraint.

5.4.3.4. Non-finite verbs in finite function and vice versa

As the result of grammaticalization, originally predicate verbal adjectives may come to be used as main verbs in simple, non-composite sentences. Consider the examples in (92). The first of these (92a) is the well-known case of the Sanskrit ta-participle coming to be used as (equivalent to) a finite past tense, the ancestor of the Modern Indo-Aryan perfective past. Example (92b) illustrates a similar use of the Sanskrit gerundive (GDV) in obligational or future function; this, too, has reflexes in Modern Indo-Aryan, including in future forms of the eastern languages. (92c) illustrates the Tamil use of the infinitive in (finite) optative function (Beythan 1943: 99).

(92) a. devadattaḥ samāgataḥ
        D.NOM.SG.M come.ta.PTCP.NOM.SG.M
        ‘Devadatta came/has come.’ (Lit. ‘D. [is] come.’)

     b. devadattena pustakaṁ paṭhitavyam
        D.INS.SG.M book.NOM.SG.N read.GDV.NOM.SG.N
        ‘Devadatta must/will read a book.’ (Lit. ‘By D. a book [is] to be read’)

     c. atu niṟka
        this.NOM.SG.N stand.INF
        ‘Let this stand = enough of this’ (Lit. ‘this [is] to stand’)

Significantly, the same forms continue to be used in their usual non-finite functions as well, as in the attributive structures in (93a, b) or the purpose structure in (93c). This fact raises the question of whether the distinction finite : non-finite breaks down here or whether the form-identical adjectival and finite/main-verb forms are morphosyntactically different.

(93) a. samāgataṁ devadattaṁ paśya
        come.ta.PTCP.ACC.SG.M D.ACC.SG.M see.IMP.2SG
        ‘See Devadatta (who has) come.’

     b. paṭhitavyaṁ pustakaṁ paṭha tāvat
        read.GDV.ACC.SG.N book.ACC.SG.N read.IMP.2SG then
        ‘So read the book (that is) to be read.’

     c. āciriyar māṇavaṉai veḷiyil niṟka vaittār
        teacher.NOM.SG.M student.ACC.SG.M outside stand.INF place.PST.3SG.M
        ‘The teacher made the student stand outside’ (Annamalai & Steever 1998: 112)

The converse phenomenon, the use of finite verbs where nonfinite forms are expected, is found in Old Tamil (94a), where a finite past form is used for the expected conditional (94b). This phenomenon, however, appears to have been only transitory. (For discussion see 4.5.3.4.3, this volume.)

(94) a. ... pāṭiṉai celiṉ ir̤ai peṟukuvai
        sing.PST.2SG go.COND jewel get.NPST.2SG
        ‘If you sing and go, you will receive jewelry.’ (Lit. ‘You sang, if you go, you will receive jewelry.’) (Th. Lehmann 1994: 130)

     b. ... pāṭiṉ celiṉ ir̤ai peṟukuvai
        sing.COND go.COND jewel get.NPST.2SG

5.4.3.5. Relation between finite and nonfinite alternatives

Little work has been done on the issue of which of two or more coexisting finite and nonfinite subordination alternatives is preferred, and under which circumstances. The only area that has received some attention is the relation between finite and nonfinite subordination in Indo-Aryan and Dravidian. For Indo-Aryan, Masica notes that Marathi, Nepali, and Shina, on the southern and northern periphery, prefer nonfinite relativization (1991: 415–416), and Wilde observes the same tendency for Rājbanshi (2008: 326). Similarly, Dravidian prefers nonfinite relativization (e.g. Sridhar 1990: § 1.1.2.3), a fact that seemed to support the traditional view that CCRCs are alien, “borrowed” from Indo-Aryan (see 5.4.3.1 above). Sridhar notes that CCRCs are used ‘especially to relativize minor constituents such as location, circumstance, etc.’, where they offer ‘greater accessibility in relativization’ in contrast to the participial strategy which ‘obscures the grammatical relations’. A more comprehensive study, of Sanskrit narrative texts, is that of Tsiang-Starcevic 1997, which demonstrates that nonfinite subordination is preferred in narration, while finite subordination occurs more commonly in cited-discourse deliberations. (An earlier publication, Tsiang & Watanabe 1987, suggests that the preference for nonfinite subordination in narration is shared by Ancient Greek and may thus be a more widespread phenomenon.) More comprehensive work on the relation between finite and nonfinite subordination in South Asian languages remains a desideratum.

5.5. Morphosemantic typology: Evidentiality

The morphology and morphosyntax of evidentiality and mirativity have received most prominent attention in the languages of the Himalayas and the Northwest of South Asia, but are not limited to these. The following sections are intended to provide up-to-date surveys of research results and remaining challenges.

5.5.1. Evidentiality and mirativity in Iranian, Nuristani, Indo-Aryan, Burushaski, and Dravidian
By Elena Bashir

5.5.1.1. Introduction

This section focuses on morphological marking of evidentiality and mirativity, and does not treat analytic or syntactic means, i.e. non-morphological evidential strategies, in detail. I follow DeLancey (1997, 2001) in considering mirativity and evidentiality as distinct grammatical categories, though they often share a grammatical marker. The position adopted in this section is reflected in Hyslop 2011: 43: ‘The mirative as a conceptual category is different from, but related to, evidentiality and epistemic modality, and is perhaps best understood in light of these two. Evidentiality is concerned with source of knowledge; epistemic modality encodes certainty of knowledge, while mirativity is concerned with expectations of knowledge.’ I would stress that in my view mirativity is concerned with change in the speaker’s mental state rather than with modal or epistemic properties of the utterance itself. It is well known that evidentiality marking systems are extremely prone to borrowing and other contact influences (e.g. Aikhenvald 2004: 21; Epps 2005). Thus, not surprisingly, I have found that the evidential strategies examined so far in South Asian languages tend to cluster geographically.

5.5.1.2. Northern groups: Iranian, Nuristani, Indo-Aryan, and Burushaski

The centrality of evidentiality and mirativity in TB languages (Section 5.5.2 below), and the existence of morphologically marked indirectivity in the Persian of Iran (Lazard 1985, 1996, 2000; Windfuhr 1982; Jahani 2000; Windfuhr & Perry 2009) and Tajik Persian (Perry 2000, 2005) have long been known.34 Ostrovsky 1996 and 1997 are studies of the use of perfect forms to express evidential meanings in Dari Persian. Wakhi (Iranian) also achieves some indirective and mirative meanings by use of the perfect (Bashir 2006, 2009). So far, data and studies on the other Iranian languages of the region, i.e. Balochi, Pashto, and the other East Iranian languages, are lacking. Until fairly recently, it had been thought that evidentiality and related categories are not morphologically indicated in Indo-Aryan (IA) languages; however, evidence for evidential systems in IA languages has begun to emerge. Nepali has (at least) three forms marked for evidential meanings: (i) the inferential perfect, (ii) a hearsay particle re, and (iii) a mirative copula rahecha (Bashir 2006a: 35). A form called ajñāt bhūt ‘unknown past’ was known to Nepali grammarians, and mentioned by Clark (1963). Michailovsky (1996) also discusses this form, calling it “inferential”. Peterson (2000) continues this discussion, now in terms of the newly-identified category of mirativity. He elaborates on the specific meanings developing from the inferential forms, and also discusses the specifically hearsay particle re. Bashir (1988a, 1988b) discusses the evidentiality systems of Khowar and Kalasha, “Dardic” languages spoken in northwest Pakistan. In these languages, there is an obligatory morphological distinction between direct, which Bashir (1988a, 1988b) termed “actual”, and indirective “inferential” past tense verb forms. The “actual” forms encode actions/events known by first-hand witnessing, or which are part of the speaker’s established knowledge. The “inferential” forms encode several indirective meanings, including hearsay, traditional knowledge, and inference from observation of resultant state, as well as mirativity.
In the non-past tenses, indirective meanings are accomplished, in both languages, by use of inferential past tense forms of ‘become’. Bashir (1988b: 144–145) offers the hypothesis that Kalasha and Khowar retain the indirective semantics of the OIA perfect and subsequently reinforce it under the influence of contact with Persian, Wakhi, and perhaps Turkic languages. It appears that in Kashmiri, mirativity semantics emerges with the auxiliary gatshun ‘to become’, which expresses a sense of surprise, joy, happiness, etc. (Wali & Koul 1997: 47, citing Shauq 1983, which I have not been able to access). Palula, an archaic variety of Shina spoken adjacent to the Khowar-speaking area, uses the particle maní, a non-finite form of ‘say’, in events reported as hearsay and in traditional tales (Bashir 2006a: 40, Liljegren 2008: 230–231); a few of Liljegren’s examples suggest that it may also have a mirative function. Bashir (2010a) finds possible indications of mirative marking in eastern varieties of Shina, achieved by the use of a non-tense-marked past participle in contrast to past par34

Utas (2000: 269), however, finds no regular way in Classical New Persian of expressing an inferential or reportative perspective.

586

Elena Bashir

ticiple + tensed auxiliary forms for direct narration. In Kalam Kohistani, hearsay and mirative meanings and indirect knowledge are indicated by a sentence-final particle -yer, which appears to be from a defective verb -ar- ‘say’, which now exists only in past tense forms (Bashir 2006a: 41). In Torwali, spoken in the upper Swat Valley, I have so far identified two particles indicating evidential meanings. A sentence-final particle -a is employed in all tenses for sentences representing information acquired indirectly; a particle ko marks information acquired by inference from visual evidence of a resultant state (Bashir 2006a: 41–42). In Urdu and Hindi, indication of evidentiality/inferentiality semantics is associated with at least three morphological patterns: (i) simple verb vs. compound verb (Bashir 1993); (ii) simple perfective participle vs. tense-marked perfective (Montaut 2001); (iii) use of the negative marker na vs. nahīṁ (Bashir 2006a: 42–43, 2006b: 24–25).35 Inferential/indirective systems in the Nuristani languages Waigali (Kalaṣa-alâ), Kâmviri, and Ashkun (Aṣkuṇu), and in Yasin Burushaski are discussed in Bashir 2006a: 37–38. The indirective form in Yasin Burushaski appears to result from contact influence of Khowar, since Khowar is spoken both in Yasin and immediately adjacent to it in Chitral, and all Yasin Burushaski speakers also reportedly speak it (Backstrom 1992: 35). 5.5.1.3. Southern group: Dravidian, Dakkhini, and Marathi So far, there has not been much work on the expression of indirective meanings in Dravidian languages. Bashir (2006a) includes fresh data from fieldwork addressing these questions. Some tentative generalizations emerge from the work reported in that paper. In general, in this cluster of languages, most evidentiality distinctions are not morphologically encoded, but are distributed throughout the grammar. 
Some generalizations do, however, emerge: (i) the use of the perfect for reporting of inferences based on observation of a resultant state, attested for Malayalam, Tamil, and Marathi; (ii) particles derived from ‘say’, which are used in Tamil, Kannada, Telugu, Toda, Dakkhini Urdu, and Marathi for “hearsay” or indirect information; (iii) a form glossable as ‘like (similar to)’ is employed in Tamil, Kannada, Telugu, and Dakkhini Urdu in meanings ranging from inference from observation of a resultant state to mirative; and (iv) all these languages employ “surprise” particles to report new or surprising information (cf. mirative meaning). If Tamil -ām ‘hearsay particle’ is, in fact, derived from a form of ‘become’, it would be typologically comparable to the languages of the northern cluster. In Toda, paragraphs often end in a particle (ï)dti ‘so they say’, which is distinct from quotative ïdu. An example of a sentence containing both the quotative and (ï)dti is found in Emeneau 1984: 250 (Story 53, sentence 4).36 The situation in Marathi is complex, involving both selection between tense-aspect forms and the use of various particles (Bashir 2006a: 46–47).

As one example, the Telugu situation is presented here in detail (see Bashir 2006a: 42–43 for the examples).37 In Telugu, marking of evidential meanings is distributed, including: (i) the particle anṭa ‘saying’; (ii) a surprise particle -ē; (iii) the morpheme -aṭl- ‘like’. Of these, anṭa ‘saying’ functions to indicate hearsay and other types of indirect knowledge. No meaning of reduced belief in the statement is inherent in statements with -anṭa, but it can be used as a discourse strategy to distance the speaker from responsibility for a statement, and to quote proverbs. In reporting the actions of a third person, in combination with the emphatic marker -ē, anṭa can yield a mirative-like meaning; with a first-person subject, -ē can evoke the mirative sense. The morpheme -aṭl- ‘like’, which follows the non-finite verbal element, can indicate indirect knowledge of events or situations acquired from sources other than (extended) speech, e.g. inference from observation of a resulting state, or, with a first-person agent, giving a nuance of inadvertent action.

35 Sigorskiy (2010) summarizes some of the previous work on evidentiality in Hindi, but concludes that evidentials, inferentials and miratives constitute a single category ‘which denotes a source of new information’, stressing that it is everywhere combined with modal semantics.

5.5.1.4. Emerging generalizations

So far, most of the IA and Iranian languages studied appear to have Type A1 evidentiality systems (Aikhenvald 2004: 25–26); that is, they have two choices: first-hand and non-first-hand. In such systems, the first-hand term typically appears when reporting information acquired directly by the speaker/observer through his or her own senses, and the non-first-hand covers everything else, including hearsay, traditional knowledge, and inference from resultant states.
Thus they focus on the reception of the information/utterance by the observer/speaker rather than on its source or the specific sensory mechanism by which it was perceived. For languages having this type of system, the category is sometimes referred to as “inferentiality” (e.g. Bashir 1988a, 1988b).38 The development of indirective evidential meanings from the perfect is crosslinguistically well attested (Willett 1988; Bybee & Dahl 1989: 73–77). This typological regularity is observed both in the languages of the northern clusters and in the southern group. In the southern group, the picture is more complex, and some of the languages, e.g. Tamil, appear to have three (or even more) choices, most of these involving non-morphological strategies (cf. Aikhenvald’s B-type systems).

36 I thank Hans Henrich Hock for pointing me to Emeneau’s work on Toda.

37 Telugu examples and discussion are due to Nagaraj Paturi, formerly Fellow, Centre for Folk Culture Studies, School of Social Sciences, University of Hyderabad, India.

38 A few languages, e.g. Nepali and Waigali, also have a specifically “hearsay”-marking particle. Many of the languages in the northern cluster also have quotative particles derived from ‘say’ (Bashir 1996).

5.5.1.5. Desiderata

For many South Asian languages, data on evidential or indirective systems is sparse or totally lacking. Even for the most-studied language, Sanskrit, as far as I have been able to determine, there is relatively little work on evidentiality and mirativity. The relevant research I have been able to locate concerns (i) the use of the perfect for reporting of events unwitnessed by the speaker, and (ii) the use of the particle kila. Discussion of the perfect as used to report events unwitnessed by the speaker originates with Pāṇini’s rule P.3.2.115 (parokṣe liṭ). Deshpande (1981: 62), after considering various scholarly positions and analyzing the sequential relationships of P.3.2.115, P.3.2.110 and P.3.2.111, concludes that the perfect (liṭ) has the following features: [+] past, [–] recent, and [–] seen; he also stresses Pāṇini’s distinction between his own and Vedic usage (Deshpande 1981: 63): ‘Pāṇini, who gives very precise roles concerning the use of the aorist, imperfect and perfect in his own Sanskrit, says that these varieties occur at random in early Vedic (chandas).’ Cardona (2002: 236–37) continues the discussion, arguing that the (preferred) use of the perfect for unwitnessed events is also found in some Vedic texts. Most recently, Hock (2012: 93–99) finds that the one text that, according to Cardona (2002), testifies to Pāṇini’s distinctions (the story of Śunaḥśepa in the Aitareya Brāhmaṇa), does not conform to the distinction between perfect (parokṣe) and imperfect. Emeneau (1969: 244) finds that the meanings of kila include vārttā ‘report’, aitihya ‘tradition’, āgama ‘traditional account’, prasiddhi ‘general opinion, universal knowledge’, or gloss with iti śrūyate ‘so it is heard’.
Since this work was done before recent research in identifying and defining the categories of evidentiality and mirativity, it is possible that renewed attention to texts with these issues in mind might yield information on the appearance of certain verb forms or of kila in typical mirative contexts. Since a large number of searchable electronic texts is now readily available, this seems quite feasible. Given the geographical clustering of evidential marking mechanisms observed in both the northern and southern groups, historical studies of the development and geographical spread of these systems would be highly desirable. This is especially feasible in south India, where the major languages have a long written tradition. Also, intensive work on these questions for Marathi would be especially interesting, given the complexity of the situation in that language, and its position straddling the IA and Dravidian spheres.

Evidentiality and mirativity need study in Balochi and Brahui. Rossi (1989), citing Windfuhr’s (1982) discussion of inferentiality in Persian, argues on the basis of elicited Balochi sentences patterned on sentences in Windfuhr (1982) that in the Balochi of Chakansar/Kang (Afghanistan), some verb forms are used with inferential meaning. Lazard (2000: 225), discussing Barker and Mengal’s (1969) Past II (“Past Completive”) and Past Perfect II (“Past Perfect Completive”), based on Rakhshani Balochi (Pakistan), thinks that the meanings of these forms may have “nuances médiatives”.39 This question needs textual analysis focused on the new questions of whether and how inferential and/or mirative semantics is encoded in Balochi. Regarding Brahui, I have not yet been able to identify any forms or constructions which convey evidential or indirective meanings. However, the correlation observed in available texts of the occurrence of the (new) present progressive forms with the complements of verbs of perception, mental activity, or speech suggests that these progressive forms may be associated with the “actual/witnessed” pole of (an emerging) system of expressing a range of epistemic senses ranging from directly witnessed to non-witnessed events (Bashir 2010b: 34).

A recent study on Sinhala involuntative verbs focuses on the fact that nominative and ergative subjects can occur with involitive verb morphology, yielding a nonvolitional reading signaling an action counter to speaker expectations (Zubair 2008: 177). This suggests that the concepts of mirativity and doxastic modality, which for Zubair ‘indicates eventualities that occur counter to speaker expectations’ (2008: 174), may be closely related.

Despite an upsurge in recent work on these categories, work remains in its beginning stages. For example, the types of theoretical (syntactic and semantic) questions now being asked about evidentiality in other languages (e.g. McCready & Ogata 2007 for Japanese) have yet to be approached for South Asian languages. For most languages, this stage cannot be reached before much more data is available.
However, new work by Annamalai on evidentiality in Tamil (Annamalai 2012) is addressing questions such as the extent to which evidentiality is correlated with speaker assessment of the reliability of the information and/or commitment to its truth. The two dimensions — whether the proposition is based on perceptual evidence and whether the speaker believes in its truth — are logically independent; however, sometimes indirect evidentials can carry the implication that the speaker does not necessarily vouch for the truth of the statement. To what extent does this hold for the various forms in the different languages? To what extent do evidential forms appear in subordinate clauses (relative clauses, adverbial clauses)? Are they used in questions? Do they take scope over modality? To what extent are they the same as or similar to exclamatory illocution? Can evidential or indirective forms occur in reported speech (direct discourse, indirect discourse)? These sorts of questions remain almost unexplored for South Asian languages.

39 Sabir Badalkhan, a native speaker of Balochi, does not accept either Rossi’s or Lazard’s views, especially for Pakistani Balochi (p.c., 16 April 2006).

5.5.2. Evidentiality and Mirativity in Tibeto-Burman

By Scott DeLancey

This section discusses a set of categories which have been discussed under the broad rubric of evidentiality. This is a relatively recent addition to our typological inventory, having become a common topic of research only with Chafe & Nichols (eds.) 1986. Since documentation of Tibeto-Burman languages is likewise in a very undeveloped state, much of what we can say about the intersection of the two areas is very preliminary.

5.5.2.1. Evidentiality in Tibeto-Burman

Tibeto-Burman (TB) languages have been prominent in the contemporary discussion of evidentiality (Mushin 2001, Aikhenvald 2004, Aikhenvald & LaPolla 2007). Most contemporary studies of evidentiality as a crosslinguistic category discuss data from one or more TB languages. Three of fifteen studies of specific evidential systems in Chafe & Nichols (eds.) 1986 are of TB languages. A full issue of the journal Linguistics of the Tibeto-Burman Area has been devoted to evidential systems in the family (vol. 30(2), 2007, edited by Aikhenvald), and two consecutive issues, 23(2) (2000) and 24(1) (2001), under the editorship of Balthasar Bickel, to ‘Person and evidence in Himalayan languages’, dealing primarily with TB languages.

Evidential marking in TB languages is concerned with the distinction between first-hand and inferred or reported information, and with the “mirative” category of information which the speaker presents as unanticipated or contradicting previous expectations. There are few reported TB examples of a sensory evidential system as reported for several New World languages, which distinguish visual, auditory, and other channels of information. Many TB languages of the Himalayas show some variation on the unusual “conjunct/disjunct” or “egophoric” pattern. Evidentiality is not universal in TB; it has been reported sporadically or not at all in some branches of the family (e.g. Bodo-Garo, Kuki-Chin, Lolo-Burmese).
The considerable representation of TB in the evidentials literature is due to the broad interest of the unusual “conjunct/disjunct” systems of the Bodish and other languages of the Eastern Himalayas, which seems to be an areal rather than a genetic feature, although almost all the languages involved are TB. (Until recently this was thought to be a uniquely Himalayan phenomenon, but there are recent reports of apparently similar phenomena in Altaic, Caucasian, and Andean languages).

The hotbed of evidentiality is the Himalayan region. Evidentiality has been described for at least one language of every subgroup of the Bodic or Western branch, including West Himalayan Kinnauri (Saxena 2000) and Darma (Willis 2007b), Central Himalayan Kham (Watters 2002) and Magar (Grunow-Hårsta 2007), Newar (Bendix 1974, Genetti 1986, Hargreaves 2005, Hale & Shresta 2006), Kiranti languages including Sunwar (DeLancey 1992a, 1997) and Limbu (van Driem 1987, see above); East Bodish Kurtöp (Hyslop 2011), Tshangla (Andvik 2010), and Tamangic Kaike (Watters 2006). It is best known from the Tibetan languages, with an extensive literature on Lhasa (Chang & Chang 1984, DeLancey 1985, 1986, 1990, Agha 1993, Tournadre 1996, 2008, Denwood 1999, Garrett 2001, inter alia) and other Tibetic languages (Woodbury 1986, Sun 1993, Saxena 1997, van Driem & Tshering 1998, Bielmeier 2000, Haller 2000a,b, Huber 2000, 2005, Volkart 2000, Zeisler 2000, Häsler 2000, Tournadre 2001, Hein 2001, 2007, Hongladarom 2007, Gawne 2013). Evidential categories are widespread in the rGyalrongic (Sun 2007, Lin 2003, Jacques 2004) and Qiangic (LaPolla 2003, Shirai 2007) languages of western China, as well as the unclassified Baima (Chirkova 2008).

Evidentiality has not been widely reported for Lolo-Burmese languages, but typical egophoric systems have been reported for two Southern Loloish languages, Akha (Egerod & Hanson 1974, Thurgood 1986) and Sankong (Matisoff 1993). Since this is not a prevalent feature of Lolo-Burmese languages, we have to suppose that its occurrence here represents contact influence from languages farther west. Evidential marking is reported in a few languages of North East India and Northern Burma, including Meitei (Chelliah 1997, 2001) and Tani (Post 2013), and with further documentation we should expect to find more. But there is no evidence of it in the languages of the plains, such as Karbi or the Bodo-Garo languages.

While it is probable that a majority of TB languages have some evidential constructions, in all of those which have been described they are evidently recent developments (DeLancey 1992b, Bickel 2000).
Despite the prevalence of egophoric systems and other evidential categories throughout the modern Tibetic languages, there is no obvious grammaticalized evidentiality in Classical Tibetan, and it appears to be a recent development in the modern languages (Zeisler 2004: 304, see also Hongladarom 1997, Volkart 2000). Based on comparative as well as internal reconstruction we must conclude either that the category is a relatively new development in the area, or else that evidential constructions are diachronically unstable and subject to regular replacement.

5.5.2.2. Mirative and inferential categories

Two semantic parameters reflected in Tibeto-Burman evidential systems are the inferential, where a proposition is marked as known to the speaker only through secondary evidence, and the mirative, marking a proposition as one which the speaker (or, sometimes, the addressee) would not have predicted based on his or her knowledge up to that point. As in other languages (e.g. the “mediative” category reported in Balkan and Turkic languages), there is often some functional overlap between these.


An example of a specifically inferential construction is Lhasa Tibetan -zhag (< bzhag ‘put’, see DeLancey 1991), which marks a statement as based on secondary evidence rather than direct perception of the actual event, as in these examples (adapted from Denwood 1999: 159–161, see also DeLancey 1985).

(95) de=ring  char=pa  btang-’dug
     today    rain     fall-PERFECT
     ‘It has been raining today.’

(96) de=ring  char=pa  btang-zhag
     today    rain     fall-PERFECT/INFERENTIAL
     ‘It has been raining today.’

(95), with -’dug, explicitly entails that the speaker has seen the rain directly. (96), with the inferential -zhag, makes clear that the speaker has not actually seen the rain, but infers it from secondary evidence, e.g. fresh rain puddles on the ground. Similarly, the speaker of (97) remembers leaving the umbrella behind, while (98) would be said by someone who has just reached for his umbrella and finds it missing.

(97) nga’i  nyi=gdugs  lus-’dug
     my     umbrella   leave-PERFECT
     ‘[I] left my umbrella behind.’

(98) nga’i  nyi=gdugs  lus-zhag
     my     umbrella   leave-PERFECT/INFERENTIAL
     ‘[I] left my umbrella behind.’

A grammaticalization of a verb ‘put’ is a recurrent source for inferential constructions; the same path has been described for Newar by Genetti (1986).

The mirative can be illustrated by pairs like (99) and (100) from Kham (Watters 2002: 288–296). The mirative construction is a nominalized verb form in construction with a finite copula inflected for 3rd person singular:

(99) ba-duh-ke-rə
     go-PRIOR-PFV-3PL
     ‘They already went.’

(100) ya-ba-duh-wo       o-le-o
      3PL-go-PRIOR-NMZ   3SG-be-NMZ
      ‘They already went!’ (unexpected information)

This construction can be used both when the information being related is perceived at first hand, as in (101), said when the speaker had just seen a leopard which he and the addressee were looking for, and also when it is only inferred from perceived evidence, as in (102), said when the speaker discovered traces showing that the leopard had eaten his dog.

(101) nə-kə     zə    ci   syã:-də   u-li-zya-o      oleo      sani
      there-at  EMPH  CEP  sleep-NF  3-be-CONT-NMZ   MIRATIVE  CONFIRM
      ‘He’s right there sleeping, see!’

(102) a-kə     zə    o-kəi-wo    oleo
      here-at  EMPH  3s-eat-NMZ  MIRATIVE
      ‘He ate [him] right here!’

Relatively few specifically mirative constructions have been reported, e.g. van Driem’s (1987: 241–242) “deprehensative” sentence-final particle in Limbu (a form borrowed from Nepali). Often mirativity in Tibeto-Burman is bound up with inferentiality, person, and/or control in conjunct-disjunct systems (see LaPolla 2003, inter alia).

5.5.2.3. Conjunct-Disjunct or Egophoric systems

The most striking, and most uniquely Tibeto-Burman, grammatical system encoding the evidential-mirative domain is what has been called the “conjunct-disjunct” (Hale 1980, DeLancey 1992b, Hargreaves 2005) or more recently “egophoric” (Tournadre 2008) pattern found along the Himalayas in many Bodic and a few other TB languages. This is the grammatical association of a mirative or evidential category with person, and often agentivity or volition as well. In these systems we find a distinct “conjunct” or “egophoric” verbal form used for statements with 1st-person, and questions with 2nd-person subject, and a “disjunct” form used in all other circumstances. The phenomenon was noted by several scholars of Tibetan (Goldstein 1973, Goldstein & Nornang 1970, Jin 1979, Chang & Chang 1984); the “conjunct-disjunct” terminology comes from Hale’s (1980) analysis of Newar. The Newar and Tibetic systems have attracted growing attention over the past 30 years (DeLancey 1985, 1986, 1990, 1992a, Agha 1993, Zeisler 2000, Mushin 2001, Häsler 2001, Hein 2001, Hargreaves 2005, Tournadre 2001, 2008, Garrett 2001, de Villiers et al. 2009). We now have reports of essentially the same phenomenon, described in various terms, in rGyalrongic languages (Lin 2003: 270–271), Tani (Post 2013), and Loloish (Thurgood 1986, Matisoff 1993). At present we cannot say how widespread it might be in the largely undocumented languages of North East India and northern Burma.

The egophoric pattern is essentially the grammaticalization of the natural association between mirativity and person (DeLancey 2001, Häsler 2001).
Crosslinguistically, mirative constructions with 1st-person subject are unusual. In Tibetic and other languages the tendency for mirative utterances to be about non-1st person has crystallized into a system that is sometimes mistaken for verb agreement. (Many Tibeto-Burman languages do have true verb agreement, but it is a distinct and much older system, which has been lost in many languages of the family; DeLancey 2010.) Genetti (1988: 188) proposes that some disjunct morphology in Kathmandu Newar is reanalyzed from the Proto-TB 3rd-person suffix. The conjunct-disjunct system emerges when this contrast is incorporated into the finite verb system, usually through the innovation of new tense/aspect forms based on copulas (DeLancey 1991, 2011). The new forms based on the non-mirative copula are restricted to statements with 1st and questions with 2nd-person subject, and forms based on the mirative occur everywhere else.

When it is imported into the finite verb system, the egophoric contrast is associated with volition or intention. Egophoric forms are possible only if the event is presented as intentional; an unintentional 1st-person action takes non-egophoric forms (Hargreaves 2005, DeLancey 1990). In Tibetan it is probably more correct to say that such an event is not presented as a 1st-person action but as an impersonal event in which the egophoric actor is involved.

The egophoric system appears to be a recent development. No attested system is morphologically ancient; all are transparent grammaticalizations from motivated constructions. In Written Tibetan we see no evidence of any such system until the introduction of the new copula red (Shigatse sbas) into the system. Presumably this was originally a mirative form, and the innovation was a new mirative distinction, as has happened even more recently in Sunwar (DeLancey 1992a, 1997). The egophoric system then appears as a specialization of the mirative category, entwining with the category of volition/intention which seems otherwise to be a major functional category in Himalayan languages.

5.5.2.4. Directions for further research

The domain of evidentiality is an area of current interest and typological and theoretical research. Tibeto-Burman languages have contributed significantly to this field since its beginning, and will continue to do so. The desperate need in Tibeto-Burman linguistics is for more and better descriptions.
We have adequate grammars of only a very small fraction of the languages of the family. With particular respect to the issue of evidentiality, it is essential that authors describing Tibeto-Burman languages be aware of the phenomenon and the common forms which it takes in the family, and explicitly address the topic. While progress has been made in the understanding of egophoric systems, much remains to be done, and further progress depends on detailed, semantically sophisticated analyses of the varieties of conjunct/disjunct marking found in the various languages. Several languages have been reported to have obscure person-based alternations in verb forms which are more suggestive of egophoricity than of true verb agreement (Post 2013); we need to understand these languages better. An important direction for research is the sources of evidential constructions. As noted above, they generally result from recent developments. We need to catalogue the various paths by which evidential, mirative, mediative, and egophoric systems arise (see Genetti 1988, Huber 2005: 97–126).


Bibliographical References

Abbi, Anvita (ed.) 1991 India as a linguistic area revisited. (Special issue of Language Sciences 13(2).)
Abbi, Anvita 1975 Reduplication in Hindi: A generative semantic study. Cornell University PhD dissertation.
Abbi, Anvita 1992 Reduplication in South Asian languages. New Delhi: Allied Publishers.
Abbi, Anvita, and Devi Gopalakrishnan 1991 Semantics of explicator compound verbs in South Asian languages. Language Sciences 13(2): 161–180.
Adger, David 2007 Three domains of finiteness: A Minimalist perspective. In: Irina Nikolaeva (ed.), Finiteness: Theoretical and empirical foundations, 25–58. Oxford: Oxford University Press.
Agha, Asif 1993 Structural form and utterance context in Lhasa Tibetan: Grammar and indexicality in a non-configurational language. New York: Peter Lang.
Ahmed, Tafseer 2006 Spatial, temporal and structural uses of Urdu ko. In: Miriam Butt and Tracy Holloway King (eds.), Online proceedings of the LFG 2006 Conference, 1–13. Stanford: CSLI Publications. http://web.stanford.edu/group/cslipublications/cslipublications/LFG/11/lfg06ahmed.pdf (accessed 6 December 2014)
Aikhenvald, Alexandra 2004 Evidentiality. Oxford: Oxford University Press.
Aikhenvald, Alexandra, and Randy LaPolla (eds.) 2007 New perspectives on evidentials: A view from Tibeto-Burman. (Special issue of Linguistics of the Tibeto-Burman Area 30(2).)
Aissen, Judith 2003 Differential object marking: Iconicity vs. economy. Natural Language and Linguistic Theory 21: 435–483.
Amritavalli, R. 2004a Experiencer datives in Kannada. In: Bhaskararao & Subbarao (eds.) 2004, vol. 1: 1–24.
Amritavalli, R. 2004b Some developments in the functional architecture of the Kannada clause. In: Veneeta Dayal and Anoop Mahajan (eds.), Clause structure in South Asian languages, 13–38. Dordrecht: Kluwer.
Amritavalli, R., and K. A. Jayaseelan 2005 Finiteness and negation in Dravidian. In: G. Cinque and R. S. Kayne (eds.), The Oxford handbook of comparative syntax, 178–220. Oxford: Oxford University Press.
Anderson, Gregory D. S. 2006 Auxiliary verb constructions. Oxford: Oxford University Press.
Anderson, Gregory D. S. 2007 The Munda verb: Typological perspectives. Berlin/New York: Mouton de Gruyter.


Anderson, Gregory D. S. (ed.) 2008 The Munda languages. Oxford/New York: Routledge.
Andvik, Erik 2010 A grammar of Tshangla. Leiden: Brill.
Annamalai, E. 1969 Adjectival clauses in Tamil. University of Chicago PhD dissertation. Repr. 1997, Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies.
Annamalai, E. 1979 Aspects of aspect in Tamil. International Journal of Dravidian Linguistics 8(2): 260–267.
Annamalai, E. 1985 Dynamics of verbal extension in Tamil. Trivandrum: Dravidian Linguistics Association. (Also in International Journal of Dravidian Linguistics 11: 22–166, 1982.)
Annamalai, E. 2012 Evidentiality in Tamil. University of Chicago, February 2012.
Annamalai, E., and S[anford] B. Steever 1998 Modern Tamil. In: Steever (ed.) 1998: 100–128.
Arregi, Karlos 2003 Clausal pied-piping. Natural Language Semantics 11(2): 115–143.
Arsenault, Paul Edmond 2002 Toward an HPSG account of case in Hindi. University of Hyderabad MS thesis.
Arunachalam, Sudha, and Anubha Kothari 2012 An experimental study of Hindi and English perfective interpretation. Journal of South Asian Linguistics 4: 27–42.
Asher, Ronald E. 1985 Tamil. London/Sydney/Dover: Croom Helm.
Asher, Ronald E., and T. C. Kumari 1997 Malayalam. London/New York: Routledge.
Baart, Joan 1999 A sketch of Kalam Kohistani grammar. Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University. http://www.academia.edu/1992272/A_Sketch_of_Kalam_Kohistani_grammar (accessed 25 November 2014)
Babrakzai, Farooq 1999 Topics in Pashto syntax. University of Hawai'i, Manoa, PhD dissertation.
Backstrom, Peter C. 1992 Burushaski. In: Peter C. Backstrom and Carla F. Radloff (eds.), Languages of Northern Areas, 31–56. (Sociolinguistic Survey of Northern Pakistan 2.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.
Bahl, Kali Charan 1964 A transformational analysis of the Hindi verb. Chicago: Department of South Asian Languages and Civilizations, University of Chicago.
Bahl, Kali Charan 1969 Punjabi. In: Sebeok et al. (eds.) 1969: 153–200.


Bahl, Kali Charan 1979 Studies in the semantic structure of Hindi, 2. Delhi: Manohar Publications.
Baker, Mark C. 1988 Incorporation: A theory of grammatical function changing. Chicago/London: University of Chicago Press.
Barðdal, Jóhanna 2013 Construction-based historical-comparative reconstruction. In: Thomas Hoffmann and Graeme Trousdale (eds.), Oxford handbook of Construction Grammar, 438–457. Oxford: Oxford University Press.
Barðdal, Jóhanna, and Thórhallur Eythórsson 2009 The origin of the Oblique Subject construction: An Indo-European comparison. In: Vit Bubenik, John Hewson, and Sharon Rose (eds.), Grammatical change in Indo-European languages, 179–193. Amsterdam: John Benjamins.
Barðdal, Jóhanna, and Thomas Smitherman 2012 The quest for cognates: A reconstruction of Oblique Subject constructions in Proto-Indo-European. Language Dynamics and Change 3(1): 28–67.
Barker, Abd-al-Rahman, and Aqil Khan Mengal 1969 A course in Baluchi. Montreal: Institute of Islamic Studies, McGill University.
Bashir, Elena 1988a Inferentiality in Kalasha and Khowar. In: Proceedings of the 24th regional meeting of the Chicago Linguistic Society, 47–59. Chicago: Chicago Linguistic Society.
Bashir, Elena 1988b Topics in Kalasha syntax: An areal and typological perspective. University of Michigan PhD dissertation. ProQuest, UMI Dissertations Publishing 8821545.
Bashir, Elena 1993 Causal chains and compound verbs. In: Verma (ed.) 1993: 1–30.
Bashir, Elena 1996 Mosaic of tongues: Quotatives and complementizers in Northwest Indo-Aryan, Burushaski and Balti. In: Hanaway & Heston (eds.) 1996: 187–286.
Bashir, Elena 1999 The Urdu postposition ne: Its changing role in the grammar. In: Rajendra Singh (ed.), The yearbook of South Asian languages and linguistics 1999, 11–36. New Delhi/London: Sage Publications.
Bashir, Elena 2003 Dardic. In: Cardona & Jain (eds.) 2003: 818–894.
Bashir, Elena 2006a Evidentiality in South Asian languages. In: Miriam Butt and Tracy Holloway King (eds.), Proceedings of the LFG 06 conference, 30–50. Stanford: CSLI. http://web.stanford.edu/group/cslipublications/cslipublications/LFG/11/lfg06bashir.pdf (accessed 12 Nov 2014)
Bashir, Elena 2006b Change in progress: Negation in Hindi and Urdu. In: Rajendra Singh (ed.), The yearbook of South Asian languages and linguistics 2006, 3–29. Berlin/New York: Mouton de Gruyter.
Bashir, Elena 2009 Wakhi. In: Windfuhr (ed.) 2009: 825–862.


Bashir, Elena 2010a Traces of mirativity in Shina. Himalayan Linguistics 9(2): 1–55. http://www.linguistics.ucsb.edu/HimalayanLinguistics/articles/2010/PDF/HLJ0902A.pdf (accessed 12 Nov. 2014)
Bashir, Elena 2010b Innovations in the Brahui verb system. Journal of South Asian Linguistics 3(1): 23–44. http://tiger.sprachwiss.uni-konstanz.de/~jsal/ojs/index.php/jsal/article/view/21/17 (accessed 12 Nov. 2014)
Bashir, Elena 2014 Two micro-areal developments in northwestern South Asia: Causative involuntatives and causee-marking postpositions. International Workshop on Linguistic Microareas in South Asia, Uppsala University, 5–6 May 2014. (Published in Journal of South Asian Languages and Linguistics 2(1): 29–62.)
Bayer, Josef 1995 On the origin of sentential arguments in German and Bengali. In: Hubert Haider, Susan Olsen, and Sten Vikner (eds.), Studies in comparative Germanic syntax, 47–75. Dordrecht: Kluwer.
Bayer, Josef 1996 Directionality and Logical Form: On the scope of focussing particles and wh-in-situ. Dordrecht: Kluwer.
Bayer, Josef 2001 Two grammars in one: Sentential complements and complementizers in Bengali and other South Asian languages. In: Bhaskararao & Subbarao (eds.) 2001: 11–36.
Bayer, Josef, Tanmoy Bhattacharya, and Hany Babu (eds.) 2007 Linguistic theory and South Asian languages. Amsterdam/Philadelphia: Benjamins.
Bendix, Edward 1974 Indo-Aryan and Tibeto-Burman contact as seen through Nepali and Newari verb tenses. International Journal of Dravidian Linguistics 3(1): 42–59.
Beyer, Stephan V. 1993 The Classical Tibetan language. New Delhi: Sri Satguru Publications.
Beythan, Hermann 1943 Praktische Grammatik der Tamilsprache in Umschrift. Leipzig: Harrassowitz.
Bhaskararao, Peri, and K. V. Subbarao (eds.) 2001 The yearbook of South Asian languages 2001. (= Proceedings of Tokyo symposium on South Asian languages: Contact, convergence, and typology.) Thousand Oaks/London/New Delhi: Sage.
Bhaskararao, Peri, and Karumuri V. Subbarao (eds.) 2004 Non-nominative subjects, 2 vols. Amsterdam/Philadelphia: Benjamins.
Bhat, D. N. S. 1994 The adjectival category: Criteria for differentiation. Amsterdam/Philadelphia: Benjamins.
Bhat, D. N. S. 1979 Vectors in Kannada. International Journal of Dravidian Linguistics 8(2): 300–309.

Bhat, D. N. S. 1999 The prominence of tense, aspect and mood. Amsterdam/Philadelphia: Benjamins.
Bhatia, Tej K. 1978 Negation in South Asian languages. University of Illinois PhD dissertation.
Bhatia, Tej K. 1993 Punjabi: A cognitive descriptive grammar. London/New York: Routledge. (Reprinted 2000.)
Bhatia, Tej K. 1995 Negation in South Asian languages. Patiala: Indian Institute of Language Studies.
Bhatt, Rajesh 2003 Locality in correlatives. Natural Language and Linguistic Theory 21: 485–541.
Bhatt, Rajesh 2005 Long distance agreement in Hindi-Urdu. Natural Language and Linguistic Theory 23(4): 757–807.
Bhatt, Rajesh 2007 Unaccusativity and case licensing. Lecture, McGill University.
Bhatt, Rajesh, and Elena Anagnostopoulou 1996 Object shift and specificity: Evidence from ko phrases in Hindi. In: Lise M. Dobrin, Kora Singer, and Lisa McNair (eds.), Papers from the 32nd regional meeting of the Chicago Linguistic Society, 11–22. Chicago: Chicago Linguistic Society.
Bhatt, Rajesh, and Veneeta Dayal 2007 Rightward scrambling as rightward remnant movement. Linguistic Inquiry 38(2): 287–301.
Bhatt, Rakesh M. 1999 Verb Movement and the syntax of Kashmiri. London: Kluwer.
Bhattacharya, Tanmoy 1999 The structure of the Bangla DP. University College, London, PhD dissertation.
Bhattacharya, Tanmoy (ed.) 2005 The yearbook of South Asian languages and linguistics. Berlin/New York: Mouton de Gruyter.
Bhattacharya, Tanmoy, and Thangjam Hindustani Devi 2003 Why cleft? University of Delhi MS. ling.auf.net/lingbuzz/000457/current.pdf (accessed 1 December 2014)
Bickel, Balthasar 1999a Cultural formalism and spatial language in Belhare. In: Balthasar Bickel and Martin Gaenszle (eds.), Himalayan space: Cultural horizons and practices, 73–101. Zürich: Museum of Ethnography.
Bickel, Balthasar 1999b Nominalization and focus constructions in some Kiranti languages. In: Yadava & Glover (eds.) 1999: 271–296. http://www.uni-leipzig.de/~bickel/research/papers/focnom99.pdf (accessed 8 May 2013)

Bickel, Balthasar 2000 Introduction: Person and evidence in Himalayan languages. Linguistics of the Tibeto-Burman Area 23(2): 1–11. www.spw.uzh.ch/bickel-files/papers/Bickel2000Person.pdf (accessed 8 May 2013)
Bickel, Balthasar 2003 Belhare. In: Thurgood & LaPolla (eds.) 2003: 546–570.
Bickel, Balthasar 2004 The syntax of experiencers in the Himalayas. In: Bhaskararao & Subbarao (eds.) 2004, vol. 1: 77–112.
Bielmeier, Roland 2000 Syntactic, semantic, and pragmatic-epistemic functions of auxiliaries in Western Tibetan. Linguistics of the Tibeto-Burman Area 23(2): 79–125.
Bloch, Jules 1930 Some problems of Indo-Aryan philology. Bulletin of the School of Oriental Studies 5: 719–756.
Bloch, Jules 1934 L’indo-aryen du veda aux temps modernes. Paris: Adrien-Maisonneuve.
Bloch, Jules 1946 Structure grammaticale des langues dravidiennes. Paris: Adrien-Maisonneuve.
Boeckx, Cedric 2004 Long-distance agreement in Hindi: Theoretical implications. Studia Linguistica 58: 23–36.
Bögel, Tina 2010 Second-position and endoclitics in Pashto. Fifteenth International Lexical Functional Grammar Conference (LFG 2010), Carleton University, Ottawa. Abstract at http://www.carleton.ca/lfg2010/abstracts/Bogel.pdf (accessed 17 May 2014)
Bohnemeyer, Jürgen, Sonja Eisenbeiss, and Bhuvana Narasimhan 2006 Ways to go: Methodological considerations in Whorfian studies on motion events. Essex Research Reports in Linguistics 50: 1–19. Colchester: University of Essex, Department of Language and Linguistics.
Bopp, Franciscus 1819 Nalus, carmen sanscritum e Mahâbhârato, edidit, latine vertit, et adnotationibus illustravit. London/Paris/Straßburg: Treuttel und Würtz.
Brugman, Claudia, and George Lakoff 1988 Cognitive topology and lexical networks. In: Steven L. Small, Garrison W. Cottrell, and Michael K. Tanenhaus (eds.), Lexical ambiguity resolution, 477–507. San Mateo, CA: Morgan Kaufmann.
Bubenik, Vit 1998 A historical syntax of late Middle Indo-Aryan (Apabhraṁśa). Amsterdam/Philadelphia: Benjamins.
Budwig, Nancy, Bhuvana Narasimhan, and Smita Srivastava 2006 Interim solutions: The acquisition of early constructions in Hindi. In: Eve V. Clark and Barb Kelly (eds.), Constructions in acquisition, 163–185. Stanford: CSLI.
Buragohain, Dipima 2008 Explicator compound verbs in Assamese and Kashmiri: A comparative analysis. In: Stephen Morey (ed.), North East Indian linguistics, 203–220. New Delhi: Foundation Books.

Burrow, Thomas, and S. Bhattacharya 1970 The Pengo language: Grammar, texts, and vocabulary. Oxford: Clarendon Press.
Butt, Miriam 1993a Conscious choice and some light verbs in Urdu. In: Verma (ed.) 1993: 31–46.
Butt, Miriam 1993b Object specificity and agreement in Hindi/Urdu. Papers from the 29th regional meeting of the Chicago Linguistic Society, 80–103. Chicago: Chicago Linguistic Society.
Butt, Miriam 1994/1995 The structure of complex predicates in Urdu. Stanford: CSLI. (Stanford University PhD dissertation.)
Butt, Miriam 2003 The light verb jungle. Harvard Working Papers in Linguistics 9: 1–49.
Butt, Miriam 2010 The light verb jungle: Still hacking away. In: Mengistu Amberber, Brett Baker, and Mark Harvey (eds.), Complex predicates: Cross-linguistic perspectives on event structure, 48–78. Cambridge: Cambridge University Press.
Butt, Miriam, and Aditi Lahiri 2002 Historical stability vs. historical change. MS, Universität Konstanz. http://ling.uni-konstanz.de/pages/home/butt/main/papers/stability.pdf (accessed 26 November 2014)
Butt, Miriam, and Ashwini Deo 2013 A historical perspective on Dative Subjects in Indo-Aryan. 18th International Lexical Functional Grammar Conference, Debrecen. http://ling.uni-konstanz.de/pages/home/butt/main/papers/lfg13-slides.pdf (accessed 14 November 2014)
Butt, Miriam, and Gillian Ramchand 2001 Complex aspectual structure in Hindi/Urdu. Oxford Working Papers in Linguistics, Philology and Phonetics 6: 1–30.
Butt, Miriam, and Louisa Sadler 2003 Verbal morphology and agreement in Urdu. In: Uwe Junghanns and Luka Szucsich (eds.), Syntactic structures and morphological information, 57–100. Berlin/New York: Mouton de Gruyter.
Butt, Miriam, and Tracy Holloway King 2004 The status of case. In: Dayal & Mahajan (eds.) 2004: 152–198.
Butt, Miriam, Tracy Holloway King, and Gillian Ramchand (eds.) 1994 Theoretical perspectives on word order in South Asian Languages. Stanford: CSLI.
Butt, Miriam, Tracy Holloway King, and John T. Maxwell III 2003 Complex predicates via restriction. In: Miriam Butt and Tracy Holloway King (eds.), Proceedings of the LFG03 Conference, 95–104. Stanford: CSLI. http://web.stanford.edu/group/cslipublications/cslipublications/LFG/8/lfg03buttetal.pdf (accessed 1 December 2014)
Bybee, Joan L. 1995 Regular morphology and the lexicon. Language and Cognitive Processes 10: 425–455.

Bybee, Joan L., and Östen Dahl 1989 The creation of tense and aspect systems in the languages of the world. Studies in Language 13(1): 51–103.
Cable, Seth 2009 The syntax of the Tibetan correlative. In: Lipták (ed.) 2009: 195–222.
Caldwell, Robert 1856 A comparative grammar of the Dravidian or South-Indian family of languages. London/Edinburgh: Williams and Norgate.
Caldwell, Robert 1875 A comparative grammar of the Dravidian or South-Indian family of languages. 2nd edition, revised and enlarged. London: Trübner & Co.
Cardona, George 1965 A Gujarati reference grammar. Philadelphia: University of Pennsylvania Press.
Cardona, George 2002 The Old Indo-Aryan tense system. Journal of the American Oriental Society 122(2): 235–241.
Cardona, George, and Babu Suthar 2003 Gujarati. In: Cardona & Jain (eds.) 2003: 559–597.
Cardona, George, and Dhanesh Jain (eds.) 2003 The Indo-Aryan languages. London/New York: Routledge.
Carey, William 1804 A grammar of the Sungscrit language, composed from the works of the most esteemed grammarians … Serampore: Mission Press.
Chafe, Wallace, and Johanna Nichols (eds.) 1986 Evidentiality: The linguistic coding of epistemology. Norwood, NJ: Ablex.
Chandra, Pritha 2007 (Dis)Agree: Movement and agreement reconsidered. University of Maryland PhD dissertation.
Chandra, Pritha, and Richa Srishti (eds.) 2014 The lexicon-syntax interface: Perspectives from South Asian languages. Amsterdam/Philadelphia: Benjamins.
Chang, Betty Shefts, and Kun Chang 1984 The certainty hierarchy among Spoken Tibetan verbs of being. Bulletin of the Institute of History and Philology, Academia Sinica 55: 603–635.
Chatterjee, Suniti Kumar 1926 The origin and development of the Bengali language, 3 vols. Calcutta: Calcutta University Press. Revised, 1971; reprinted 1985, New Delhi: Rupa.
Chelliah, Shobhana 1997 A grammar of Meithei. Berlin/New York: Mouton de Gruyter.
Chelliah, Shobhana 2001 Text collection and elicitation in linguistic fieldwork. In: Paul Newman and Martha Ratliff (eds.), Linguistic fieldwork, 152–189. Cambridge: Cambridge University Press.
Chirkova, Ekaterina 2008 Baimayu yu Zangyu fangyan de shizheng fanchou (The evidential category in Baima and Tibetan dialects). Minzu Yuwen 2008: 36–43.

Chomsky, Noam 1981 Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam 1993 A minimalist program for linguistic theory. In: Kenneth Hale and Samuel Keyser (eds.), The view from Building 20: Essays in linguistics in honor of Sylvain Bromberger, 1–52. (Studies in Linguistics 24.) Cambridge, MA: MIT Press.
Chomsky, Noam 1995 The minimalist program. (Studies in Linguistics 28.) Cambridge, MA: MIT Press.
Chomsky, Noam 1998 Minimalist inquiries: The framework. (MIT Occasional Papers in Linguistics 15.) Cambridge, MA: MITWPL.
Chomsky, Noam 1999 Derivation by Phase. (MIT Occasional Papers in Linguistics 18.) Cambridge, MA: MITWPL.
Chomsky, Noam, and Howard Lasnik 1993 The theory of Principles and Parameters. In: Joachim Jacobs, Arnim von Stechow, Wolfgang Sternefeld, and Theo Vennemann (eds.), Syntax: An international handbook of contemporary research, 506–569. Berlin: de Gruyter.
Clark, Thomas W. 1963 Introduction to Nepali. Cambridge: W. Heffer and Sons.
Cole, Peter, Gabi Hermon, and S. N. Sridhar 1980 The acquisition of subjecthood. Language 56: 719–743.
Coupe, Alexander R. 2007a A grammar of Mongsen Ao. Berlin/New York: Mouton de Gruyter.
Coupe, Alexander R. 2007b Converging patterns of clause linkage in Nagaland. In: Matti Miestamo and Bernhard Wälchli (eds.), New challenges in typology: Broadening the horizons and redefining the foundations, 339–361. Berlin/New York: Mouton de Gruyter.
Croft, William 2001 Radical Construction Grammar: Syntactic theory in typological perspective. Oxford: Oxford University Press.
Croft, William, and D. Alan Cruse 2004 Cognitive Linguistics. Cambridge: Cambridge University Press.
Dasgupta, Probal 1977 Compound verbs in Bangla. Indian Linguistics 38: 68–85.
Dasgupta, Probal 2003 Bangla. In: Cardona & Jain (eds.) 2003: 351–390.
David, Anne E. 2009 On the evolution of Pashto ‘endoclitics’. MS, Center for Advanced Study of Language, University of Maryland.
Davison, Alice 1969 Reflexivization and movement rules in relation to a class of Hindi psychological predicates. Papers from the 5th regional meeting of the Chicago Linguistic Society 5: 37–52. Chicago: Chicago Linguistic Society.
Davison, Alice 1984 Syntactic constraints on Wh in-situ: Wh questions in Hindi-Urdu. 1984 Annual Meeting of the Linguistic Society of America.

Davison, Alice 1991 Feature percolation and agreement in Hindi-Urdu. South Asia Conference, University of Wisconsin.
Davison, Alice 2001 The VP structure and case checking properties of sentences with nonnominative subjects: Hindi/Urdu. http://clas.uiowa.edu/linguistics/files/linguistics/tokyowp.PDF (accessed 22 November 2014)
Davison, Alice 2004a Non-nominative subjects in Hindi-Urdu: VP structure and case parameters. In: Bhaskararao & Subbarao (eds.) 2004, vol. 1: 141–168. http://clas.uiowa.edu/linguistics/files/linguistics/nonnom.PDF (accessed 16 November 2014)
Davison, Alice 2004b Structural case, lexical case, and the verbal projection. In: Dayal & Mahajan (eds.) 2004: 199–226.
Davison, Alice 2007 Word order, parameters, and the Extended COMP projection. In: Bayer, Bhattacharya & Babu (eds.) 2007: 175–198.
Davison, Alice 2009 Adjunction, features, and locality in Sanskrit and Hindi-Urdu correlatives. In: Lipták (ed.) 2009: 223–262.
Davison, Alice 2012 Reversible and non-reversible dative subjects: A structural account. http://linguisticssouthasia.commons.yale.edu/files/2012/07/davison_hindi.pdf (accessed 15 November 2014)
Dayal, Veneeta Srivastav 1994 Scope marking as indirect wh-dependency. Natural Language Semantics 2(2): 137–170.
Dayal, Veneeta 1996 Locality in Wh-quantification: Questions and relative clauses in Hindi. Studies in Linguistics and Philosophy 62: 51–88.
Dayal, Veneeta 2002 Scope marking: Cross linguistic variation in indirect dependency. In: Uli Lutz, Gereon Müller, and Arnim von Stechow (eds.), Wh-scope marking. Amsterdam/Philadelphia: Benjamins.
Dayal, Veneeta, and Anoop Mahajan (eds.) 2004 Clause structure in South Asian languages. Dordrecht: Kluwer.
de Villiers, J. G., J. Garfield, H. Gernet-Girard, T. Roeper, and M. Speas 2009 Evidentials in Tibetan: Acquisition, semantics, and cognitive development. New Directions for Child and Adolescent Development 2009 (125): 29–47.
Dehdari, Jonathan 2006 Crossing dependencies in Persian. Brigham Young University MA thesis.
DeLancey, Scott 1985 Lhasa Tibetan evidentials and the semantics of causation. Proceedings of the Eleventh Annual Meeting of the Berkeley Linguistics Society, 65–72. Berkeley: Berkeley Linguistics Society.
DeLancey, Scott 1986 Evidentiality and volitionality in Tibetan. In: Chafe & Nichols (eds.) 1986: 203–213.

DeLancey, Scott 1990 Ergativity and the cognitive model of event structure in Lhasa Tibetan. Cognitive Linguistics 1(3): 289–321.
DeLancey, Scott 1991 The origins of verb serialization in Modern Tibetan. Studies in Language 15(1): 1–23.
DeLancey, Scott 1992a Sunwar copulas. Linguistics of the Tibeto-Burman Area 15(1): 31–38.
DeLancey, Scott 1992b The historical status of the conjunct/disjunct pattern in Tibeto-Burman. Acta Linguistica Hafniensia 25: 39–62.
DeLancey, Scott 1997 Mirativity: The grammatical marking of unexpected information. Linguistic Typology 1(1): 33–52.
DeLancey, Scott 2001 The mirative and evidentiality. Journal of Pragmatics 33(3): 371–384.
DeLancey, Scott 2002 Relativization and nominalization in Bodic. Proceedings of the 28th Annual Meeting of the Berkeley Linguistics Society 28: 55–72. Berkeley: Berkeley Linguistics Society.
DeLancey, Scott 2010 Towards a history of verb agreement in Tibeto-Burman. Himalayan Linguistics 9(1): 1–38.
DeLancey, Scott 2011 Finite structures from clausal nominalization in Tibeto-Burman languages. In: Foong Ha Yap, Karen Grunow-Hårsta, and Janick Wrona (eds.), Nominalization in Asian languages: Diachronic and typological perspectives, 343–362. Amsterdam/Philadelphia: Benjamins.
Denwood, Philip 1999 Tibetan. Amsterdam/Philadelphia: Benjamins.
Deshpande, Madhav M. 1981 Pāṇini and the Vedic evidence: A peep into the “past”. In: T. H. Dharmadhikari (ed.), Golden jubilee volume, 52–65. Poona: Vaidika Saṁśodhana Maṇḍala.
Dhongde, Ramesh Vaman, and Kashi Wali 2009 Marathi. Amsterdam/Philadelphia: Benjamins.
Dost, Ascander H. 2007 Linearization, square pegs, and round holes. University of California, Santa Cruz, PhD dissertation.
Downing, Bruce T. 1978 Some universals of relative clause structure. In: J. H. Greenberg (ed.), Universals of human language, 4: 375–418. Stanford, CA: Stanford University Press.
Dwivedi, V. 2003 The view from the left periphery: Hindi right-adjoined relatives. Proceedings from the Annual Meeting of the Chicago Linguistic Society 39(1): 32–48. Chicago: Chicago Linguistic Society.
Ebert, Karen H. 2003 Camling. In: Thurgood & LaPolla (eds.) 2003: 533–545.

Edelman, D. (Joy) L., and Leila R. Dodykhudova 2009 Shughni. In: Windfuhr (ed.) 2009: 787–824.
Egerod, Søren, and Inga-Lill Hanson 1974 An Akha conversation on death and funeral. Acta Orientalia (Hafnensia) 36: 225–284.
Emeneau, Murray B. 1956 India as a linguistic area. Language 32: 3–16.
Emeneau, Murray B. 1965 India and historical grammar. Annamalainagar: Annamalai University.
Emeneau, Murray B. 1969 Sanskrit syntactic particles – kila, khalu, nūnam. Indo-Iranian Journal 11: 241–268.
Emeneau, Murray B. 1971 Dravidian and Indo-Aryan: The Indian linguistic area. In: Andrée F. Sjoberg (ed.), Symposium on Dravidian civilization, 33–68. Austin/New York: Jenkins Publishing. Repr. in Emeneau 1980: 167–196.
Emeneau, Murray B. 1980 Language and linguistic area. Essays by Murray B. Emeneau, selected and introduced by Anwar S. Dil. Stanford: Stanford University Press.
Emeneau, Murray B. 1984 Toda grammar and texts. Philadelphia: American Philosophical Society.
Epps, Patience 2005 Areal diffusion and the development of evidentiality: Evidence from Hup. Studies in Language 29(3): 617–650.
Erdal, Marcel 1998 Old Turkic. In: Johanson & Csató (eds.) 1998: 138–157.
Evans, Vyvyan, Benjamin K. Bergen, and Jörg Zinken 2007 The Cognitive Linguistics enterprise: An overview. In: Vyvyan Evans, Benjamin K. Bergen, and Jörg Zinken (eds.), The Cognitive Linguistics reader, 2–36. London: Equinox.
Family, Neiloufar 2009 Lighten up: The acquisition of light verb constructions in Persian. In: Jane Chandlee, Michelle Franchini, Sandy Lord, and Gudrun-Marion Rheiner (eds.), Proceedings of the 33rd Annual Boston University Conference on Language Development, 139–150. Somerville, MA: Cascadilla Press.
Fauconnier, Gilles 2003 Cognitive Linguistics. In: Lynn Nadel (ed.), Encyclopedia of Cognitive Science, 1: 539–543. London: Macmillan.
Fauconnier, Gilles, and Mark Turner 1998 Conceptual integration networks. Cognitive Science 22(2): 133–187.
Fedson, Vijayarani 1981 The Tamil serial or compound verb. University of Chicago PhD dissertation.
Fillmore, Charles J. 1982 Frame semantics. In: Linguistic Society of Korea (ed.), Linguistics in the Morning Calm, 111–137. Seoul: Hanshin.

Fillmore, Charles J. 1989 Grammatical construction theory and the familiar dichotomies. In: Rainer Dietrich and Carl F. Graumann (eds.), Language processing in social context, 17–38. Amsterdam: North Holland/Elsevier.
Fillmore, Charles J., Paul Kay, and Mary Catherine O’Connor 1988 Regularity and idiomaticity in grammatical constructions: The case of let alone. Language 64: 501–538.
Fouchécour, Charles-Henri de, and Philippe Gignoux (eds.) 1989 Études irano-aryennes offertes à Gilbert Lazard. Paris: Association pour l’avancement des études iraniennes. (Studia Iranica 7.)
Gair, James W. 1998 Studies in South Asian linguistics: Sinhala and other South Asian languages. Oxford: Oxford University Press.
Gair, James W. 2003 Sinhala. In: Cardona & Jain (eds.) 2003: 766–817.
Gambhir, Vijay 1981 Syntactic restrictions and discourse functions of word order in Standard Hindi. University of Pennsylvania PhD dissertation.
Garrett, Edward 2001 Evidentiality and assertion in Tibetan. UCLA PhD dissertation.
Gawne, Lauren 2013 Lamjung Yolmo copulas in use: Evidentiality, reported speech and questions. University of Melbourne PhD dissertation.
Genetti, Carol 1986 The grammaticalization of the Newari verb tɔl. Linguistics of the Tibeto-Burman Area 9(2): 53–70.
Genetti, Carol 1988 A contrastive study of the Dolakhali and Kathmandu Newari dialects. Cahiers de Linguistique – Asie Orientale 17(2): 161–191.
Genetti, Carol 1992 Semantic and grammatical categories of relative clause morphology in the languages of Nepal. Studies in Language 16: 405–427.
Genetti, Carol 2003 Dolakhā Newār. In: Thurgood & LaPolla (eds.) 2003: 355–370.
Genetti, Carol 2011 Nominalization in Tibeto-Burman languages of the Himalayan area: A typological perspective. In: Foong Ha Yap, Karen Grunow-Hårsta, and Janick Wrona (eds.), Nominalization in Asian languages: Diachronic and typological perspectives, 163–194. Amsterdam/Philadelphia: Benjamins.
Goldberg, Adele E. 1995 Constructions: A construction grammar approach to argument structure. Chicago/London: University of Chicago Press.
Goldstein, Melvyn 1973 Essentials of Modern Literary Tibetan: A reading course and reference grammar. Repr. 1991, Berkeley/Los Angeles: University of California Press.
Goldstein, Melvyn, and Nawang Nornang 1970 Modern Spoken Tibetan: Lhasa dialect. Seattle/London: University of Washington Press.

Gonzalez-Marquez, Monica, Irene Mittelberg, Seana Coulson, and Michael J. Spivey 2007 Methods in Cognitive Linguistics. Amsterdam/Philadelphia: Benjamins.
Goswami, G. C., and Jyotiprakash Tamuli 2003 Asamiya. In: Cardona & Jain (eds.) 2003: 391–443.
Grierson, George Abraham (ed.) 1903–1928 Linguistic Survey of India, 11 volumes in 20. Calcutta: Office of the Superintendent of Government Printing. Repr. 1967, Delhi: Motilal Banarsidass.
Grosz, Patrick, and Pritty Patel-Grosz 2014 Agreement and verb types in Kutch Gujarati. In: Chandra & Srishti (eds.) 2014: 217–244.
Grunow-Hårsta, Karen 2007 Evidentiality and mirativity in Magar. Linguistics of the Tibeto-Burman Area 30(2): 151–194.
Guentchéva, Zlatka (ed.) 1996 L’Énonciation médiatisée. Louvain/Paris: Peeters.
Gurtu, Madhu 1992 Anaphoric relations in Hindi and English. New Delhi: Munshiram Manoharlal.
Hale, Austin 1980 Person markers: Finite conjunct and disjunct verb forms in Newari. In: Ronald Trail (ed.), Papers in South-East Asian linguistics 7, 95–106. (Pacific Linguistics Series A 53.) Canberra: Australian National University.
Hale, Austin, and Kedar Shrestha 2006 Newār (Nepāl Bhāsā). München: LINCOM.
Haller, Felix 2000a Verbal categories of Shigatse Tibetan and Themchen Tibetan. Linguistics of the Tibeto-Burman Area 23(2): 175–191.
Haller, Felix 2000b Dialekt und Erzählungen von Shigatse. Bonn: VGH Wissenschaftsverlag.
Hanaway, William L., and Wilma Heston (eds.) 1996 Studies in Pakistani popular culture: Final report of the University of Pennsylvania/Lok Virsa Multi-Disciplinary Study of Pakistan Culture. Islamabad/Lahore: Lok Virsa/Sang-e-Meel Publications.
Hargreaves, David 2003 Kathmandu Newar (Nepāl Bhāśā). In: Thurgood & LaPolla (eds.) 2003: 371–384.
Hargreaves, David 2005 Agency and intentional action in Kathmandu Newar. Himalayan Linguistics 5: 1–48.
Häsler, Katrin 2001 An empathy-based approach to the description of the verb system of the Dege dialect of Tibetan. Linguistics of the Tibeto-Burman Area 24(1): 1–34.
Haspelmath, Martin 1995 The converb as a cross-linguistically valid category. In: Haspelmath & König (eds.) 1995: 1–55.

Haspelmath, Martin, and Ekkehard König (eds.) 1995 Converbs in cross-linguistic perspective. Berlin/New York: Mouton de Gruyter.
Hein, Veronika 2001 The role of the speaker in the verbal system of the Tibetan dialect of Tabo/Spiti. Linguistics of the Tibeto-Burman Area 24(1): 35–48.
Hein, Veronika 2007 The mirative and its interplay with evidentiality in the Tibetan dialect of Tabo (Spiti). Linguistics of the Tibeto-Burman Area 30(2): 195–214.
Herring, Susan C. 1993 Aspectogenesis in South Dravidian: On the origin of the ‘compound continuative’ KONTIRU. In: Henk Aertsen and Robert J. Jeffers (eds.), Historical linguistics 1989: Papers from the 9th International Conference on Historical Linguistics, Rutgers University, August 14–18, 1989, 167–185. Amsterdam/Philadelphia: Benjamins.
Herring, Susan C. 1994 Discourse functions of demonstrative deixis in Tamil. In: Proceedings of the 20th Annual Berkeley Linguistics Society: General Session Dedicated to the Contributions of Charles F. Fillmore, 246–259. Berkeley: Berkeley Linguistics Society.
Hettrich, Heinrich 1988 Untersuchungen zur Hypotaxe im Vedischen. Berlin: De Gruyter.
Hock, Hans Henrich 1982 The Sanskrit quotative: A historical and comparative study. Studies in the Linguistic Sciences 12(2): 39–85.
Hock, Hans Henrich 1988 Review article on Steever 1988. Studies in the Linguistic Sciences 18(2): 211–231.
Hock, Hans Henrich 1989 Conjoined we stand: Theoretical implications of Sanskrit relative clauses. Studies in the Linguistic Sciences 19(1): 93–126.
Hock, Hans Henrich 1990 Oblique subjects in Sanskrit? In: Verma & Mohanan (eds.) 1990: 119–141.
Hock, Hans Henrich 1991 Possessive agents in Sanskrit. In: Hans Henrich Hock (ed.), Studies in Sanskrit syntax, 55–69. Delhi: Motilal Banarsidass.
Hock, Hans Henrich 1992 What’s a nice word like you doing in a place like this? Syntax vs. Phonological Form. Studies in the Linguistic Sciences 22(1): 39–87.
Hock, Hans Henrich 2005 How strict is strict OV? A family of typological constraints with focus on South Asia. In: Bhattacharya (ed.) 2005: 145–164.
Hock, Hans Henrich 2008 Dravidian syntactic typology: A reply to Steever. In: Rajendra Singh (ed.), Annual Review of South Asian Languages and Linguistics, 164–198. Berlin/New York: Mouton de Gruyter.

Hock, Hans Henrich 2012 Sanskrit and Pāṇini – core and periphery. Saṁskṛta Vimarśa N. S. 6: 88–104. (World Sanskrit Conference Special.) New Delhi: Rashtriya Sanskrit Sansthan.
Hock, Hans Henrich 2013 Proto-Indo-European verb finality: Reconstruction, typology, validation. In: Leonid Kulikov and Nikolaos Lavidas (eds.), Proto-Indo-European syntax and its development, 49–76. (Journal of Historical Linguistics 3(1).) Amsterdam/Philadelphia: Benjamins.
Hongladarom, Krisadawan 1997 Historical development of the Tibetan evidential tuu. In: Arthur Abramson (ed.), Southeast Asian linguistics studies in honor of Vichin Panupong, 115–126. Bangkok: Chulalongkorn University Press.
Hongladarom, Krisadawan 2007 Evidentiality in Rgyalthang Tibetan. Linguistics of the Tibeto-Burman Area 30(2): 17–44.
Hook, Peter Edwin 1974 The compound verb in Hindi. Ann Arbor: Center for South and Southeast Asian Studies, University of Michigan.
Hook, Peter Edwin 1977 The distribution of the compound verb in the languages of North India and the question of its origin. International Journal of Dravidian Linguistics 6: 335–351.
Hook, Peter Edwin 1988 Paradigmaticization: A case study from South Asia. In: Proceedings of the Fourteenth Annual Meeting of the Berkeley Linguistics Society, 293–303. Berkeley: Berkeley Linguistics Society.
Hook, Peter Edwin 1990a A note on expressions of involuntary experience. Bulletin of the School of Oriental and African Studies 67: 77–82.
Hook, Peter Edwin 1990b Experiencers in South Asian languages: A gallery. In: Verma & Mohanan (eds.) 1990: 319–334.
Hook, Peter Edwin 1991a The emergence of perfective aspect in Indo-Aryan languages. In: Elizabeth Traugott and Bernd Heine (eds.), Approaches to grammaticalization, vol. 1: 59–89. Amsterdam/Philadelphia: Benjamins.
Hook, Peter Edwin 1991b The compound verb in Munda: An areal and typological overview. In: Abbi (ed.) 1991: 181–195.
Hook, Peter Edwin 1993 Aspectogenesis and the compound verb in Indo-Aryan. In: Verma (ed.) 1993: 97–114.
Hook, Peter Edwin 2014 The distribution of South Asian ‘dative subjects’ in time and space. Workshop on Diachronic typology of differential argument marking, Konstanz, 5–6 April 2014.

Hook, Peter Edwin, and Prashant Pardeshi 2005 Are vector verbs eternal? South Asian Linguistic Analysis (SALA)-25, University of Illinois, 16–18 September 2005.
Hook, Peter Edwin, Prashant Pardeshi, and Hsin-hsin Liang 2012 Semantic neutrality in complex predicates: Evidence from East and South Asia. Linguistics 3: 605–632.
Huber, Brigitte 2000 Preliminary report on evidential categories in Lende Tibetan (Kyirong). Linguistics of the Tibeto-Burman Area 23(2): 155–174.
Huber, Brigitte 2005 The Tibetan dialect of Lende (Kyirong). Bonn: VGH Wissenschaftsverlag.
Hyslop, Gwendolyn 2011 Mirativity in Kurtöp. Journal of South Asian Linguistics 4(1): 43–60.
Ibarretxe-Antuñano, Iraide 2004 What’s Cognitive Linguistics? A new framework for the study of Basque. Cahiers (Cahiers de l’Association for French Language Studies) 10: 3–31.
Jackendoff, Ray 2011 Conceptual Semantics. In: Claudia Maienborn, Klaus von Heusinger, and Paul Portner (eds.), Semantics: An international handbook of natural language meaning, 1: 688–709. Berlin: Mouton de Gruyter.
Jacques, Guillaume 2004 Phonologie et morphologie du japhug (Rgyalrong). Université Paris VII, Denis Diderot, PhD thesis.
Jahani, Carina 2000 Expressions of indirectivity in spoken Modern Persian. In: Johanson & Utas (eds.) 2000: 185–208.
Jayaseelan, K. A. 1990 The Dative Subject construction and the Pro-Drop Parameter. In: Verma & Mohanan (eds.) 1990: 269–283.
Jayaseelan, K. A. 1996 Question word movement to focus and scrambling in Malayalam. Linguistic Analysis 26: 63–83.
Jayaseelan, K. A. 1999 Parametric studies in Malayalam. New Delhi: Allied Publishers.
Jayaseelan, K. A. 2000 A Focus Phrase above vP. In: Nguyen Chi Duy Khuong, Richa, and Samar Sinha (eds.), The Fifth Asian GLOW: Conference proceedings, 195–212. Mysore: Central Institute of Indian Languages.
Jayaseelan, K. A. 2004a Question movement in some SOV languages and the theory of feature checking. Language and Linguistics 5(1): 5–27.
Jayaseelan, K. A. 2004b The possessor-experiencer dative in Malayalam. In: Bhaskararao & Subbarao (eds.) 2004, vol. 1: 227–244.
Jayaseelan, K. A. 2007 The argument structure of the dative construction. In: E. Reuland, T. Bhattacharya, and G. Spathas (eds.), Argument structure, 37–48. Amsterdam/Philadelphia: Benjamins.

Jayaseelan, K. A. 2008 Topic, focus, and adverb position in clause structure. Nanzan Linguistics 4: 43–68.
Jayaseelan, K. A., and R. Amritavalli 2005 Scrambling in the cleft construction in Dravidian. In: Joachim Sabel and Mamoru Saito (eds.), The free word order phenomenon: Its syntactic sources and diversity, 138–163. Berlin/New York: Mouton de Gruyter.
Johanson, Lars 1998 The structure of Turkic. In: Johanson & Csató (eds.) 1998: 30–66.
Johanson, Lars, and Bo Utas (eds.) 2000 Evidentials: Turkish, Iranian and neighbouring languages. Berlin/New York: Mouton de Gruyter.
Johanson, Lars, and Éva Á. Csató (eds.) 1998 The Turkic languages. London/New York: Routledge.
Johnson, Mark 1987 The body in the mind: The bodily basis of meaning, reason and imagination. Chicago/London: University of Chicago Press.
Joshi, Smita 1993 Selection of grammatical and logical functions in Marathi. Stanford University PhD dissertation.
Kachru, Yamuna 1966 An introduction to Hindi syntax. Urbana: Department of Linguistics, University of Illinois.
Kachru, Yamuna 1968 Studies in a transformational grammar of Hindi. Dhanbad: East West Books.
Kachru, Yamuna 1979 The quotative in South Asian languages. South Asian Language Analysis 1: 63–78.
Kachru, Yamuna 1980 Toward a typology of compound verbs in South Asian languages. Studies in the Linguistic Sciences 10(1): 113–124.
Kachru, Yamuna 1982 Conjunct verbs in Hindi-Urdu and Persian. South Asian Review 6(3): 117–126.
Kachru, Yamuna 1990 Experiencer and other oblique subjects in Hindi. In: Verma & Mohanan (eds.) 1990: 59–76.
Kachru, Yamuna 1993 Verb serialization in syntax, typology, and historical change. In: Verma (ed.) 1993: 115–134.
Kachru, Yamuna 2008 Hindi. Amsterdam/Philadelphia: Benjamins.
Kachru, Yamuna, and Rajeshwari Pandharipande 1980 Toward a typology of compound verbs in South Asian languages. Studies in the Linguistic Sciences 10(1): 113–124.
Kachru, Yamuna, Braj B. Kachru, and Tej K. Bhatia 1976 The notion ‘subject’: A note on Hindi-Urdu, Kashmiri, and Panjabi. In: Verma (ed.) 1976: 79–108.

Syntax and semantics

Kaisse, Ellen M. 1981 Separating phonology from syntax: A reanalysis of Pashto cliticization. Journal of Linguistics 17: 197–208. Kaul, Vijay Kumar 2006 Compound verbs in Kashmiri. Delhi: Indian Institute of Language Studies. Kayne, Richard Stanley 1994 The antisymmetry of syntax. (Linguistic Inquiry Monographs 25.) Cambridge, MA: MIT Press. Keenan, Edward L. 1985 Relative clauses. In: Timothy Shopen (ed.), Language typology and syntactic description, 2: 141–170. Cambridge: Cambridge University Press. Keine, Stefan 2013 On the role of movement in Hindi/Urdu long-distance agreement. In: Stefan Keine and Shayne Sloggett (eds.), Proceedings of NELS 42, 273–284. Amherst, Massachusetts: Graduate Linguistics Student Association. Kidwai, Ayesha 2000 XP-adjunction in universal grammar: Scrambling and binding in Hindi-Urdu. (Oxford studies in comparative syntax.) New York: Oxford University Press. Kidwai, Ayesha 2010 The cartography of phases: Facts and inference in Meiteilon. In: Anna Maria Di Sciullo and Virginia Hill (eds.), Edges, heads, and projections: Interface properties, 233–262. Amsterdam/Philadelphia: Benjamins. Kimmig, Rainer 2014 ‘Light Verbs’ in Old and Middle Indo-Aryan. 30th South Asian Languages Analysis Roundtable (SALA), Hyderabad, February 2014. (Forthcoming under the title ‘Verb-Verb-Sequences in Old and Middle Indo-Aryan’.) Koffka, Kurt 1935 Principles of gestalt psychology. Oxford: Harcourt, Brace. Kopecka, Anetta, and Bhuvana Narasimhan (eds.) 2012 Events of “putting” and “taking”: A crosslinguistic approach. Amsterdam/Philadelphia: Benjamins. Kopris, Craig A., and Anthony R. Davis 2005 Endoclitics in Pashto: Implications for lexical integrity. Fifth Mediterranean Morphology Meeting (MMM5), Fréjus, France. Krishnamurti, Bhadriraju 1992 Complex predicates in Telugu. In: Amrit Madhav Ghatage (ed.), S. M. Katre felicitation volume, 313–318. (Bulletin of the Deccan College and Research Institute 51–52.) Krishnamurti, Bhadriraju, Colin P. 
Masica, and Anjani Sinha (eds.) 1986 South Asian languages: Structure, convergence, and diglossia. Delhi: Motilal Banarsidass. Kuiper, F. B. J. 1967 The genesis of a linguistic area. Indo-Iranian Journal 10: 81–102. Repr. 1974, International Journal of Dravidian Linguistics 3: 135–153. Kuteva, Tania 2001 Auxiliation: An enquiry into the nature of grammaticalization. New York: Oxford University Press.

Bibliographical References

Lahiri, Utpal 2002 On the proper treatment of ‘expletive wh’ in Hindi. Lingua 112: 501–540. Lakoff, George 1987 Women, fire, and dangerous things: What categories reveal about the mind. Chicago/London: University of Chicago Press. Lakoff, George, and Mark Johnson 1980 Metaphors we live by. Chicago/London: University of Chicago Press. Lakoff, George, and Mark Johnson 1999 Philosophy in the flesh: The embodied mind and its challenge to western thought. New York: Basic Books. Lakshmi Bai, B. 1985 Some notes on correlative constructions in Dravidian. In: V. Z. Acson and R. L. Leed (eds.), For Gordon H. Fairbanks, 181–190. Honolulu: University of Hawaii Press. Landau, Idan 2004 The scale of finiteness and the calculus of control. Natural Language and Linguistic Theory 22: 811–877. Langacker, Ronald W. 1987 Foundations of Cognitive Grammar, 1: Theoretical prerequisites. Stanford, CA: Stanford University Press. Langacker, Ronald W. 1991 Foundations of Cognitive Grammar, 2: Descriptive application. Stanford, CA: Stanford University Press. LaPolla, Randy J. 2003 Evidentiality in Qiang. In: Alexandra Aikhenvald and R. M. W. Dixon (eds.), Studies in evidentiality, 63–78. Amsterdam/Philadelphia: Benjamins. Larson, Richard K. 2009 Chinese as a reverse Ezafe language. Yuyanxue Luncong (Journal of Linguistics) 39: 30–85. Peking University. http://semlab5.sbs.sunysb.edu/~rlarson/LarsonCAREL09.pdf (accessed 17 May 2014) Lazard, Gilbert 1985 L’inférentiel ou passé distancié en persan. Studia Iranica 14: 27–42. Lazard, Gilbert 1996 Le médiatif en Persan. In: Guentchéva (ed.) 1996: 21–30. Lazard, Gilbert 1999 Mirativity, evidentiality, mediativity, or other? Linguistic Typology 3(1): 91–109. Lazard, Gilbert 2000 Le médiatif: Considérations théoriques et application à l’iranien. In: Johanson & Utas (eds.) 2000: 209–228. Legate, Julie Anne 2000 Relative asymmetries and Hindi correlatives. 
In: Artemis Alexiadou, Andre Meinunger, Chris Wilder, and Paul Law (eds.), The syntax of relative clauses, 1–51. Amsterdam/Philadelphia: Benjamins. Legate, Julie Anne 2002 Towards a unified treatment of wh-expletives in Hindi and German. In: Uli Lutz, Gereon Müller, and Arnim von Stechow (eds.), Wh-scope marking. Amsterdam/Philadelphia: Benjamins.

Legate, Julie Anne 2008 Morphological and abstract case. Linguistic Inquiry 39(1): 55–101. Lehmann, Christian 1984 Der Relativsatz: Typologie seiner Strukturen, Theorie seiner Funktionen, Kompendium seiner Grammatik. Tübingen: Narr. Lehmann, Thomas 1993 A grammar of modern Tamil. Pondicherry: Pondicherry Institute of Linguistics and Culture. Lehmann, Thomas 1994 Grammatik des Alttamil unter besonderer Berücksichtigung der Caṅkam-Texte des Dichters Kapilar. Stuttgart: Franz Steiner Verlag. Levinson, Stephen C. 2012 Preface. In: Kopecka & Narasimhan (eds.) 2012: xi–xv. Liang Hsin-hsin, and Peter Edwin Hook 2006 The compound verb in Chinese and Hindi-Urdu and the plausibility of macro linguistic areas. In: Masica (ed.) 2006: 105–126. Lidz, Liberty 2007 Evidentiality in Yongning Na (Mosuo). Linguistics of the Tibeto-Burman Area 30(2): 45–87. Liljegren, Henrik 2008 Towards a grammatical description of Palula: An Indo-Aryan language of the Hindukush. Stockholm: Department of Linguistics, University of Stockholm. su.diva-portal.org/smash/get/diva2:198468/FULLTEXT01 (accessed 25 November 2014) Lin, You-Jing 2003 Tense and aspect morphology in the Zhuokeji rGyalrong verb. Cahiers de linguistique – Asie orientale 32(3): 245–286. Lipták, Anikó 2009 The landscape of correlatives: An empirical and analytical survey. In: Lipták (ed.) 2009: 1–46. Lipták, Anikó (ed.) 2009 Correlatives cross-linguistically. Amsterdam/Philadelphia: Benjamins. Lönne, Dirk (ed.) 2001 Tohfa-e-Dil: Festschrift Helmut Nespital. Reinbek: Wezler. Lust, Barbara C., Kashi Wali, James W. Gair, and K. V. Subbarao (eds.) 2000 Lexical anaphors and pronouns in selected South Asian languages: A principled typology. Berlin/New York: Mouton de Gruyter. Magier, David S. 1990 Dative/accusative subjects in Marwari. In: Verma & Mohanan (eds.) 1990: 213–220. Mahajan, Anoop Kumar 1989 Agreement and agreement phrases. In: Itziar Laka and Anoop Kumar Mahajan (eds.), Functional heads and clause structure, 217–252. 
(MIT Working Papers in Linguistics, 10) Cambridge, MA: MITWPL. Mahajan, Anoop Kumar 1990 The A/A-bar distinction and movement theory. MIT PhD dissertation. (Distributed by MIT Working Papers in Linguistics.)

Mahajan, Anoop Kumar 1994 Toward a unified theory of scrambling. In: Norbert Corver and Henk C. van Riemsdijk (eds.), Studies on scrambling: Movement and non-movement approaches to free word-order phenomena, 301–330. Berlin/New York: Mouton de Gruyter. Mahajan, Anoop Kumar 1997 Universal Grammar and the typology of ergative languages. In: Artemis Alexiadou and T. Alan Hall (eds.), Studies on Universal Grammar and typological variation, 35–57. Amsterdam/Philadelphia: Benjamins. Mahajan, Anoop Kumar 2000 Relative asymmetries and Hindi correlatives. In: Artemis Alexiadou, Andre Meinunger, Chris Wilder, and Paul Law (eds.), The syntax of relative clauses, 1–51. Amsterdam/Philadelphia: Benjamins. Majid, Asifa 2006 Body part categorisation in Punjabi. Language Sciences 28(2–3): 241–261. Majid, Asifa, and Melissa Bowerman (eds.) 2007 The semantic categories of cutting and breaking events: A crosslinguistic perspective. (Special issue of Cognitive Linguistics, 18(2).) Majid, Asifa, N. J. Enfield, and Miriam van Staden (eds.) 2006 Parts of the body: Cross-linguistic categorisation. (Special issue of Language Sciences 28(2–3).) Manetta, Emily 2010 Wh-expletives in Hindi-Urdu: The vP phase. Linguistic Inquiry 41(1): 1–34. Manetta, Emily 2011 Peripheries in Kashmiri and Hindi-Urdu: The syntax of discourse-driven movement. Amsterdam/Philadelphia: Benjamins. Manetta, Emily 2012 Reconsidering rightward scrambling: Postverbal constituents in Hindi-Urdu. Linguistic Inquiry 43(1): 43–74. Marlow, Patrick Edward 1997 Origin and development of the Indo-Aryan quotatives and complementizers: An areal approach. University of Illinois PhD dissertation. Masica, Colin P. 1976 Defining a linguistic area: South Asia. Chicago/London: University of Chicago Press. Masica, Colin P. 1991 The Indo-Aryan languages. Cambridge: Cambridge University Press. Masica, Colin P. 1993 Subareal variation in conjunct verbs. In: Verma (ed.) 1993: 157–162. Masica, Colin P. (ed.) 
2006 Old and new perspectives on South Asian languages: Grammar and semantics (Proceedings of the 5th International Conference on South Asian Linguistics). Delhi: Motilal Banarsidass. Matisoff, James 1972 Lahu nominalization, relativization, and genitivization. In: J. Kimball (ed.), Syntax and semantics, vol. 1: 237–257. New York: Seminar Press.

Matisoff, James 1993 Sangkong of Yunnan: Secondary “verb pronominalization” in Southern Loloish. Linguistics of the Tibeto-Burman Area 16(2): 123–142. Mazaudon, Martine 2003 Tamang. In: Thurgood & LaPolla (eds.) 2003: 291–314. McCready, Eric, and Norry Ogata 2007 Evidentiality, modality and probability. Linguistics and Philosophy 30(2): 147–206. McGregor, Ronald Stuart 1968 The language of Indrajit of Orchā: A study of early Braj Bhāṣā prose. Cambridge: Cambridge University Press. McGregor, Ronald Stuart 1972 Outline of Hindi grammar: With exercises. Oxford: Oxford University Press. Mervis, Carolyn B., and Eleanor Rosch 1981 Categorization of natural objects. Annual Review of Psychology 32: 89–115. Michailovsky, Boyd 1996 L’inférentiel du Népali. In: Guentchéva (ed.) 1996: 109–123. Migron, Saul 1993 Catena and climax in Vedic prose. Die Sprache 35: 71–80. Mishra, Ramesh C., Pierre R. Dasen, and Shanta Niraula 2003 Ecology, language, and performance on spatial cognitive tasks. International Journal of Psychology 38(6): 366–383. Mistry, P. J. 2004 Subjecthood of non-nominatives in Gujarati. In: Bhaskararao & Subbarao (eds.) 2004, vol. 2: 1–32. Mohanan, K. P., and Tara Mohanan 1990 Dative subjects in Malayalam: Semantic information in syntax. In: Verma & Mohanan (eds.) 1990: 43–57. Mohanan, Tara 1994a Argument structure in Hindi. Stanford: CSLI Publications. (Stanford University PhD dissertation, 1990.) Mohanan, Tara 1994b Case OCP: A constraint on word order in Hindi. In: Butt, King & Ramchand (eds.) 1994: 185–216. Mohanty, Ajit K., and Nandita Babu 1983 Bilingualism and metalinguistic ability among Kond tribals in Orissa, India. The Journal of Social Psychology 121: 15–22. Montaut, Annie 2001 On the aoristic behaviour of the Hindi/Urdu simple past: From aorist to evidenciality. In: Lönne (ed.) 2001: 345–364. Montaut, Annie 2004 A grammar of Hindi. München: LINCOM. Montaut, Annie 2013 The rise of non-canonical subjects and semantic alignments in Hindi. 
Studies in Language Companion Series 140 (1): 91–117. https://hal.archives-ouvertes.fr/file/index/docid/962420/filename/Montaut_Non_canonical_subjects.pdf (accessed 22 November 2014)

Muralikrishnan, R. 2011 An electrophysiological investigation of Tamil dative-subject constructions. Inaugural-Dissertation, Universität Marburg. http://pubman.mpdl.mpg.de/pubman/item/escidoc:1203560:4/component/escidoc:1400694/muralikrishnan.pdf (accessed 22 November 2014) Mushin, Ilana 2001 Evidentiality and epistemological stance. Amsterdam/Philadelphia: Benjamins. Nadkarni, M. V. 1970 NP embedded structures in Kannada and Konkani. UCLA PhD dissertation. Nagaraja, K. S. 2014 The Nihali language: Grammar, text and vocabulary. Mysore: Central Institute of Indian Languages. Narasimhan, Bhuvana 2003 Motion events and the lexicon: A case study of Hindi. Lingua 113: 123–160. Narasimhan, Bhuvana 2005 Splitting the notion of ‘agent’: Case-marking in early child Hindi. Journal of Child Language 32(4): 787–803. Narasimhan, Bhuvana 2007 Cutting, breaking, and tearing verbs in Hindi and Tamil. In: Majid & Bowerman (eds.) 2007: 195–205. Narasimhan, Bhuvana 2012 Putting and Taking in Tamil and Hindi. In: Kopecka & Narasimhan (eds.) 2012: 201–230. Narasimhan, Bhuvana, and Marianne Gullberg 2011 The role of input frequency and semantic transparency in the acquisition of verb meaning: Evidence from placement verbs in Tamil and Dutch. Journal of Child Language 38(3): 504–532. Narasimhan, Bhuvana, Anetta Kopecka, Melissa Bowerman, Marianne Gullberg, and Asifa Majid 2012 Putting and taking events: A crosslinguistic perspective. In: Kopecka & Narasimhan (eds.) 2012: 1–18. Narasimhan, Rangaswamy 1981 Modeling language behavior. Berlin: Springer. Narasimhan, Rangaswamy 1998 Language behaviour: Acquisition and evolutionary history. New Delhi: Sage Publications. Narasimhan, Rangaswamy, and R. Vaidyanathan 1984 Language behaviour: Interaction between a child and her parents: An extended corpus. Bombay: Tata Institute of Fundamental Research. Nayar, Prabodachandran V. R. 1979 Aspectual system in Malayalam. International Journal of Dravidian Linguistics 8(2): 289–299. 
Neeleman, Ad, and Kriszta Szendrői 2007 Radical pro drop and the morphology of pronouns. Linguistic Inquiry 38(4): 671–714.

Nespital, Helmut 1966 Verbal aspect in Indo-Aryan and Dravidian languages: The relation of simple verbs to verbal expressions (‘compound verbs’). Berliner Indologische Studien 9–10: 247–258. Nespital, Helmut 1989 Verbal aspect and lexical semantics in Indo-Aryan languages: The typology of verbal expressions (‘compound verbs’) and their relation to simple verbs. Studien zur Indologie und Iranistik 15: 159–195. Nespital, Helmut 1997 Hindī kriyā-koś/Dictionary of Hindi verbs. Allahabad: Lokbharati. Neukom, Lukas, and Manideepa Patnaik 2003 A grammar of Oriya. Zürich: Universität Zürich. Nizar, Milla 2010 Dative subject constructions in South-Dravidian languages. University of California, Berkeley, undergraduate honors thesis. Noonan, Michael 1997 Versatile nominalizations. In: Joan L. Bybee, John Haiman, and Sandra A. Thompson (eds.), Essays on language function and language type, 373–394. Amsterdam/Philadelphia: Benjamins. Noonan, Michael 1999 Converbal constructions in Chantyal. In: Yadava & Glover (eds.) 1999: 401–420. http://archiv.ub.uni-heidelberg.de/savifadok/volltexte/2008/183/pdf/Converbal_Constructions.pdf (accessed 25 November 2015) Noonan, Michael 2003 Nar-Phu. In: Thurgood & LaPolla (eds.) 2003: 336–352. Ostrovsky, B. Y. 1996 Glagol jazyka dari: perfektnye formy v evidencialnom značenii [The Dari verb: Perfective forms in evidential meanings]. Vestnik Moskovskogo Gosudarstvennogo Universiteta. (Vostokovedenije Series 13, No. 11.) Ostrovsky, B. Y. 1997 Evidencialnost i perfektnye formy: na materiale jazyka Dari [Evidentiality and perfect forms: (based) on materials from the Dari language]. Voprosy Jazykoznanija 6: 75–88. Pal, Aminesh K. 1972 Aspect in Bengali verbal compounds. Journal of the Asiatic Society 12: 110–114. Palmer, Martha, Rajesh Bhatt, Bhuvana Narasimhan, Owen Rambow, Dipti Misra Sharma, and Fei Xia 2009 Hindi syntax: Annotating dependency, lexical predicate-argument structure, and phrase structure. 
In: Proceedings of the 7th International Conference on Natural Language Processing (ICON) 2009, Hyderabad, India, 259–268. Chennai: Macmillan Publishers. (Pre-release version: http://ltrc.iiit.ac.in/treebank_H2014/) Pandharipande, Rajeshwari V. 1990 Experiencer (dative) NPs in Marathi. In: Verma & Mohanan (eds.) 1990: 161–180.

Pandharipande, Rajeshwari V. 1993 Serial verb construction in Marathi. In: Verma (ed.) 1993: 177–196. Pandharipande, Rajeshwari V. 1997 Marathi. London/New York: Routledge. Pandharipande, Rajeshwari V. 2003 Marathi. In: Cardona & Jain (eds.) 2003: 698–728. Paolillo, John C. 1989 Deictic and dynamic interactions in Sinhala verb-verb compounds. MS, Stanford University. Pappuswamy, Umarani 2005 Dative subjects in Tamil: A computational analysis. South Asian Language Review 15(2): 42–62. Paranavitana, Senarat 1956 Sigiri graffiti: Sinhalese verses of the eighth, ninth and tenth centuries. London: Oxford University Press. Pardeshi, Prashant 2001 The compound verb in Marathi: Definitional issues and criteria for identification. Kobe Gengogaku Ronsô (Kobe Papers in Linguistics) 3: 94–111. Pardeshi, Prashant, Kaoru Horie, and Shigeru Sato 2010 An anatomy of the posture verb ‘sit’ in Marathi: A cognitive-functional account. In: Sally Rice and John Newman (eds.), Empirical and experimental methods in cognitive/functional research, 91–107. Stanford: CSLI. Pate, David M. 2012 Second position clitics and subordinate tʃe clauses in Pashto. Graduate Institute of Applied Linguistics MA thesis. http://www.gial.edu/images/theses/pate_david-thesis.pdf (accessed 1 December 2014) Patnaik, B. N. n.d. Some non-nominative subject constructions in Oriya. http://home.iitk.ac.in/~patnaik/documents/nnom.pdf (accessed 22 November 2014) Patnaik, Bhaswati, and Nandita Babu 2001 Relationship between children’s acquisition of a theory of mind and their understanding of mental terms. Psycho-Lingua 31(1): 3–8. Paudyal, Netra Prasad 2008 Agreement patterns in Darai: Typological study. Nepalese Linguistics 23: 186–207. http://www.academia.edu/852830/Agreement_Patterns_in_Darai_typological_Study Pederson, Eric 1995 Language as context, language as means: Spatial cognition and habitual language use. Cognitive Linguistics 6(1): 33–62. Pederson, Eric 2007 Event realization in Tamil. 
In: Melissa Bowerman and Penelope Brown (eds.), Crosslinguistic perspectives on argument structure: Implications for learnability, 331–357. New York: Lawrence Erlbaum. Perry, John 2000 Epistemic verb forms in Persian of Iran, Afghanistan and Tajikistan. In: Johanson & Utas (eds.) 2000: 229–257. Perry, John 2005 A Tajik Persian reference grammar. Leiden/Boston: Brill.

Petersen, Jan Heegård 2012 How to put and take in Kalasha. In: Kopecka & Narasimhan (eds.) 2012: 349–366. Peterson, John 2000 Evidentials, inferentials, and mirativity in Nepali. Linguistics of the Tibeto-Burman Area 23(2): 13–37. Peterson, John 2002 The Nepali converbs: A holistic approach. In: Rajendra Singh (ed.), The yearbook of South Asian languages and linguistics, 93–133. Thousand Oaks, CA: Sage Publications. Pokharel, Madhav P. 1991 Compound verbs in Nepali. Contributions to Nepalese Studies 18(2): 149–173. Repr. 1999 in: Yogendra P. Yadava & Warren W. Glover (eds.), Topics in Nepalese linguistics, 185–208. Kathmandu: Royal Nepal Academy. Polinsky, Maria, and Eric Potsdam 2001 Long-distance agreement and topic in Tsez. Natural Language and Linguistic Theory 19: 583–646. Poornima, Shakthi 2012 Hindi aspectual complex predicates at the syntax-semantics interface. State University of New York, Buffalo, PhD dissertation. Poornima, Shakthi, and Jean-Pierre Koenig 2008 Reverse complex predicates in Hindi. In: S. Moran, D. Tanner, and M. Scanlon (eds.), Proceedings of the 24th Northwest Linguistics Conference, 27, 17–26. Seattle: University of Washington. Poornima, Shakthi, and Jean-Pierre Koenig 2009 Hindi aspectual complex predicates. In: Stefan Müller (ed.), Proceedings of the 16th International Conference on Head-Driven Phrase Structure Grammar, Georg-August-Universität Göttingen, Germany, 276–296. Stanford: CSLI. http://web.stanford.edu/group/cslipublications/cslipublications/HPSG/2009/poornima-koenig.pdf (accessed 1 December 2014) Poornima, Shakthi, and Robert Painter 2010 Diverging pathways: The current status of grammaticalization of Hindi light verbs. In: Rajendra Singh (ed.), Annual review of South Asian languages and linguistics 2011, 75–104. Berlin/New York: Mouton de Gruyter. Post, Mark 2007 A grammar of Galo. LaTrobe University PhD dissertation. Post, Mark 2013 Person-sensitive TAME marking in Galo: Historical origins and functional motivation. 
In: Timothy Thornes, Erik Andvik, Gwendolyn Hyslop, and Joana Jansen (eds.), Functional and historical approaches to explanation, 107–130. Amsterdam/Philadelphia: Benjamins. Raja, Nasim Akhtar 2003 Aspectual complex predicates in Punjabi. In: Rajendra Singh (ed.), The yearbook of South Asian languages and linguistics 2003, 99–129. Berlin/New York: Mouton de Gruyter.

Ramasamy, K. 1981 Correlative relative clauses in Tamil. In: Agesthialingom and N. Rajasekharan Nair (eds.), Dravidian syntax, 363–380. (Annamalai University Publications in Linguistics, 73.) Annamalainagar: Annamalai University. Rangan, K., M. Suseela, and S. Rajendran 2001 Exploring the notion subject in Tamil. Indian Linguistics 62: 59–72. Rau, Nalini 2007 Verb agreement in Kannada: A constraint based account. University of Illinois PhD dissertation. Ray, Tapas S. 2003 Oriya. In: Cardona & Jain (eds.) 2003: 444–476. Raza, Ghulam, and Tafseer Ahmed 2011 Argument scrambling within Urdu NPs. In: Miriam Butt and Tracy Holloway King (eds.), Proceedings of the LFG11 Conference, 461–481. Stanford: CSLI. http://web.stanford.edu/group/cslipublications/cslipublications/LFG/16/papers/lfg11razaahmed.pdf (accessed 5 December 2014) Renou, Louis 1930 L’absolutif sanskrit en -am. Mémoires de la Société Linguistique de Paris 23: 359–392. Roberts, Taylor 1997 The optimal second position in Pashto. In: Geert Booij and Jeroen van de Weijer (eds.), Phonology in progress — progress in phonology: HIL phonology papers III, 367–401. The Hague: Holland Academic Graphics. Roberts, Taylor 2000 Clitics and agreement. MIT PhD dissertation. Rosch, Eleanor 1973 Natural categories. Cognitive Psychology 4: 328–350. Rosch, Eleanor 1977 Human categorization. In: Neil Warren (ed.), Studies in cross-cultural psychology, 1: 1–49. London: Academic Press. Rosch, Eleanor 1978 Principles of categorization. In: Eleanor Rosch and Barbara B. Lloyd (eds.), Cognition and categorization, 27–48. Hillsdale, NJ: Erlbaum. Rosch, Eleanor 1983 Prototype classification and logical classification. In: Ellin Scholnik (ed.), New trends in cognitive representation: Challenges to Piaget’s theory, 73–86. Hillsdale, NJ: Erlbaum. Rosch, Eleanor, and Carolyn B. Mervis 1975 Family resemblances: Studies in the internal structure of categories. Cognitive Psychology 7: 573–605. 
Rosen, Carol, and Kashi Wali 1989 Twin passives, inversion, and multistratalism in Marathi. Natural Language and Linguistic Theory 7: 1–50. Ross, John Robert 1967 Constraints on variables in syntax. MIT PhD dissertation. Rossi, Adriano V. 1989 L’inferenziale in Baluci. In: Fouchécour & Gignoux (eds.) 1989: 283–291.

Samadi, Habibeh 1996 The acquisition of Persian: Grammatically-based measures for assessing normal and abnormal Persian language development. University of Sheffield PhD dissertation. Saxena, Anju 1992 Finite verb morphology in Tibeto-Kinnauri. University of Oregon PhD dissertation. Saxena, Anju 1997 Aspect and evidential morphology in Standard Lhasa Tibetan: A diachronic study. Cahiers de linguistique – Asie orientale 26(2): 282–306. Saxena, Anju 2000 Evidentiality in Kinnauri. In: Johanson & Utas (eds.) 2000: 471–482. Saxena, Anuradha 1979 Grammar of Hindi causatives. UCLA PhD dissertation. Schiffman, Harold 1969 A transformational grammar of the Tamil aspectual system. University of Chicago PhD dissertation. (Studies in Linguistics and Language Teaching, 7.) Seattle: University of Washington. Schlegel, August Wilhelm von 1820 Ausgaben indischer Bücher: Nalus. Indische Bibliothek 2: 96–128. Schmidt, Ruth Laila 1999 Urdu: An essential grammar. London/New York: Routledge. Schmidt, Ruth Laila 2003 Urdu. In: Cardona & Jain (eds.) 2003: 286–350. Schmidt, Ruth Laila 2004 Compound verbs in the Shina of Kohistan. Acta Orientalia 65: 19–31. Sebeok, Thomas A., Murray B. Emeneau, and Charles A. Ferguson (eds.) 1969 Current trends in linguistics, 5: Linguistics in South Asia. The Hague: Mouton. Sethuraman, Nitya, Aarre Laakso, and Linda B. Smith 2011 Verbs and syntactic frames in children’s elicited actions: A comparison of Tamil- and English-speaking children. Journal of Psycholinguistics 40: 241–252. Sharma, Tara Nath 1980 The auxiliary in Nepali. University of Wisconsin, Madison, PhD dissertation. Shauq, Shafi 1983 A contrastive study of some syntactic patterns of English and Kashmiri with special reference to complementation and relativization. University of Kashmir PhD dissertation. Shibatani, Masayoshi 1999 Dative subject constructions twenty-two years later. Studies in the Linguistic Sciences 29(2): 45–76. 
Shinohara, Kazuko, and Prashant Pardeshi 2011 The more in front, the later: The role of positional terms in time metaphors. Journal of Pragmatics 43: 749–758. Shirai, Satoko 2007 Evidentials and evidential-like categories in nDrapa. Linguistics of the Tibeto-Burman Area 30(2): 125–150.

Sigorskiy, Alexander A. 2010 Evidentiality, inferentiality and mirativity in the Modern Hindi. Lingua Posnaniensis 52(1): 71–80. Simpson, Andrew, and Tanmoy Bhattacharya 2003 Obligatory overt wh-movement in a wh-in-situ language. Linguistic Inquiry 34(1): 127–142. Singh, Mona 1998 On the semantics of the perfective aspect. Natural Language Semantics 6(2): 171–199. Singh, Ram Adhar 1980 Syntax of Apabhraṁśa. Calcutta: Simant Publications. Singh, Udaya Narayana 1983 Subjecthood hierarchy in Maithili. Indian Linguistics 44: 75–81. Singh, Udaya Narayana, Karumuri V. Subbarao, and Samir K. Bandyopadhyay 1986 Classification of polar verbs in selected South Asian languages. In: Krishnamurti et al. (eds.) 1986: 244–269. Slade, Benjamin 2013 The diachrony of light and auxiliary verbs in Indo-Aryan. Diachronica 30(4): 531–578. Sridhar, S. N. 1976 The notion ‘subject’ in Kannada. In: Verma (ed.) 1976: 212–239. Sridhar, S. N. 1981 Linguistic convergence: Indo-Aryanization of Dravidian. Lingua 53: 199–220. Sridhar, S. N. 1988 Cognition in sentence production: A cross-linguistic study. New York: Springer. Sridhar, S. N. 1990 Kannada: A descriptive grammar. London/New York: Routledge. Srivastav Dayal, Veneeta 1994 Binding facts in Hindi and the scrambling phenomenon. In: Butt & Ramchand (eds.) 1994: 237–262. Srivastav, Veneeta 1991a Subjacency effects at LF: The case of Hindi wh. Linguistic Inquiry 22(4): 762–769. Srivastav, Veneeta 1991b The syntax and semantics of correlatives. Natural Language and Linguistic Theory 9: 637–686. Srivastava, Smita 2008 Hindi-speaking two-year-old children’s development of verb constructions: An examination of experimental and everyday contexts. Clark University PhD dissertation. Steever, Sanford B. 1988 The serial verb formation in the Dravidian languages. Delhi: Motilal Banarsidass. Steever, Sanford B. 1998a Introduction to the Dravidian languages. In: Steever (ed.) 1998: 1–39. Steever, Sanford B. 1998b Kannada. In: Steever (ed.) 1998: 129–157.

Steever, Sanford B. 1998c Gondi. In: Steever (ed.) 1998: 270–300. Steever, Sanford B. 2005 The Tamil auxiliary verb system. London/New York: Routledge. Steever, Sanford B. 2008 What’s so subversive about Dravidian? Revisiting finiteness in Dravidian syntax. In: Rajendra Singh (ed.), Annual review of South Asian languages and linguistics, 117–162. Berlin/New York: Mouton de Gruyter. Steever, Sanford B. (ed.) 1998 The Dravidian languages. London/New York: Routledge. Stoll, Sabine, Balthasar Bickel, Elena Lieven, Netra P. Paudyal, Goma Banjade, Toya N. Bhatta, Martin Gaenszle, Judith Pettigrew, Ichchha Purna Rai, Manoj Rai, and Novel Kishor Rai 2011 Nouns and verbs in Chintang: Children’s usage and surrounding adult speech. Journal of Child Language 39(2): 284–321. Subbarao, Karumuri V. 1974 Noun phrase complementation in Hindi. University of Illinois PhD dissertation. Subbarao, Karumuri V. 1979 Secondary verbs in Telugu. International Journal of Dravidian Linguistics 8(2): 268–276. Subbarao, Karumuri V. 1984 Complementation in Hindi syntax. Delhi: Academic Publications. (Reprint of Subbarao 1974.) Subbarao, Karumuri V. 2012 South Asian languages: A syntactic typology. Cambridge/New York: Cambridge University Press. Subbarao, Karumuri V., and Peri Bhaskararao 2004 Non-nominative subjects in Telugu. In: Bhaskararao & Subbarao (eds.) 2004, vol. 2: 161–196. Subbarao, Karumuri V., J. Mayuri, G. Uma Maheshwar Rao, and Sanat Hansdā 2014 Experiencer subjects and possessors: A case of subject-object agreement swapping in Santali. International Conference of South Asian Languages and Literatures, Benares Hindu University, 2014. Sun, Jackson T.-S. 1993 Evidentials in Amdo Tibetan. Bulletin of the Institute of History and Philology 1993: 945–1001. Sun, Jackson T.-S. 2007 The irrealis category in rGyalrong. Language and Linguistics 8(3): 797–819. Sundaresan, Sandhya 2012 Context and (co)reference in the syntax and its interfaces. 
University of Tromsø/University of Stuttgart PhD dissertation. Sundaresan, Sandhya, and Thomas McFadden 2009 Subject distribution in Tamil and other languages: Selection vs. case. Journal of South Asian Linguistics 1(2): 5–34. Talmy, Leonard 1988 The relation of grammar to cognition. In: Brygida Rudzka-Ostyn (ed.), Topics in Cognitive Linguistics, 165–205. Amsterdam/Philadelphia: Benjamins.

Talmy, Leonard 2000 Toward a cognitive semantics. Cambridge, MA: MIT Press. Taylor, John R. 1995 Linguistic categorization: Prototypes in linguistic theory. Oxford: Clarendon Press. Tegey, Habibullah 1977 The grammar of clitics: Evidence from Pashto and other languages. University of Illinois PhD dissertation. Tegey, Habibullah 1979 Ergativity in Pushto (Afghani). In: Irmengard Rauch and Gerald F. Carr (eds.), Linguistic method: Essays in honor of Herbert Penzl, 369–418. The Hague: Mouton. Tegey, Habibullah, and Barbara Robson 1993 Pashto-English glossary for the CAL Pashto materials. Washington, DC: Center for Applied Linguistics. Tegey, Habibullah, and Barbara Robson 1996 A reference grammar of Pashto. Washington, DC: Center for Applied Linguistics. Thomason, Sarah Grey, and Terrence Kaufman 1988 Language contact, creolization, and genetic linguistics. Berkeley/Los Angeles: University of California Press. Thompson, Sandra A., and Robert E. Longacre 1985 Adverbial clauses. In: Timothy Shopen (ed.), Language typology and syntactic description, 1: 171–234. Cambridge: Cambridge University Press. Thurgood, Graham 1986 The nature and origins of the Akha evidentials system. In: Chafe & Nichols (eds.) 1986: 214–222. Thurgood, Graham, and Randy J. LaPolla (eds.) 2003 The Sino-Tibetan languages. London/New York: Routledge. Tikkanen, Bertil 1987 The Sanskrit gerund: A synchronic, diachronic and typological analysis. Helsinki: Finnish Oriental Society. Tikkanen, Bertil 1995 Burushaski converbs in their areal context. In: Haspelmath & König (eds.) 1995: 487–528. Tilakaratne, Sunanda K. 1993 A cognitive semantic study of locative expressions in Sinhala and a comparison with English locative expressions. University of Kansas PhD dissertation. Tomasello, Michael 2000 First steps toward a usage-based theory of language acquisition. Cognitive Linguistics 11: 61–82. Tomasello, Michael 2006 Construction Grammar for kids. Constructions 1: 1–11. 
Tournadre, Nicolas 1996 Comparaison des systèmes médiatifs de quatre dialectes tibétains (tibétain central, ladakhi, dzongkha et amdo). In: Guentchéva (ed.) 1996: 195–211.

Tournadre, Nicolas 2008 Arguments against the concept of ‘conjunct’/‘disjunct’ in Tibetan. In: Brigitte Huber, Marianne Volkart, and Paul Widmer (eds.), Chomolangma, Demawend und Kasbek: Festschrift für Roland Bielmeier zu seinem 65. Geburtstag, 281–308. Halle: International Institute for Tibetan and Buddhist Studies. Tournadre, Nicolas, with Konchok Jiatso 2001 Final auxiliary verbs in Literary Tibetan and in the dialects. Linguistics of the Tibeto-Burman Area 24(1): 49–111. Tsiang, Sarah, and Albert Watanabe 1987 The Pañcatantra and Aesop’s Fables: A comparison of rhetorical structures in Classical Indian and Western literature. Studies in the Linguistic Sciences 17(1): 137–146. Tsiang-Starcevic, Sarah 1997 The discourse functions of subordinate constructions in Classical Sanskrit narrative texts. University of Illinois PhD dissertation. Utas, Bo 2000 Traces of evidentiality in Classical New Persian. In: Johanson & Utas (eds.) 2000: 259–271. Vaidyanathan, R. 1988 Development of forms and functions of interrogatives in children: A longitudinal study in Tamil. Journal of Child Language 15: 533–549. Vale, Ramchandra Narayan 1948 Verbal composition in Indo-Aryan. Poona: Deccan College. van der Leeuw, Frank 1995 Alignment and integrity constraints in cliticization. In: Audra Dainora, Rachel Hemphill, Barbara Luka, B. Need, and S. Pargman (eds.), Proceedings of the 31st regional meeting of the Chicago Linguistic Society, parasession on clitics, vol. 2: 168–180. Chicago: Chicago Linguistic Society. van der Leeuw, Frank 1997 Clitics: Prosodic studies. The Hague: Holland Academic Graphics. (University of Amsterdam doctoral dissertation.) van Driem, George 1987 A grammar of Limbu. Berlin/New York: Mouton de Gruyter. van Driem, George, with Karma Tshering 1998 Dzongkha. (Languages of the Greater Himalayan Region 1.) Leiden: Research School CNWS. Verma, Manindra K. 1993 Complex predicates and light verbs in Hindi. In: Verma (ed.) 1993: 197–215. Verma, Manindra K. (ed.) 
1976 The notion of subject in South Asian languages. Madison: South Asian Studies, University of Wisconsin. Verma, Manindra K. (ed.) 1993 Complex predicates in South Asian languages. New Delhi: Manohar. Verma, Manindra K., and K. P. Mohanan (eds.) 1990 Experiencer subjects in South Asian languages. Stanford: CSLI. Volkart, Marianne 2000 The meaning of the auxiliary morpheme ’dug in the aspect systems of some Central Tibetan dialects. Linguistics of the Tibeto-Burman Area 23(2): 127–153.

628

Bibliographical References

Wali, Kashi 1988 A note on wh-questions in Marathi and Kashmiri. Cornell Working Papers in Linguistics 8: 161–180. Wali, Kashi 2004 Non-nominative subjects in Marathi. In: Bhaskararao & Subbarao (eds.) 2004, vol. 2: 223–242. Wali, Kashi 2006 Marathi: A study of comparative South Asian structures. Delhi: Indian Institute of Language Studies. Wali, Kashi, and Omkar N. Koul 1997 Kashmiri: A cognitive-descriptive grammar. London/New York: Routledge. Reprinted 2010. Wallace, William David 1985 Subjects and subjecthood in Nepali: An analysis of Nepali clause structure and its challenges to Relational Grammar and Government & Binding. University of Illinois PhD dissertation. Watters, David E. 2002 A grammar of Kham. Cambridge/New York: Cambridge University Press. Watters, David E. 2006 The conjunct-disjunct distinction in Kaike. Nepalese Linguistics 22: 300–319. Webelhuth, Gert 1989 Syntactic saturation phenomena and the modern Germanic languages. University of Massachusetts, Amherst, PhD dissertation. (Distributed by GLSA.) Wertheimer, Max 1923 Untersuchungen zur Lehre von der Gestalt. Psychologische Forschung: Zeitschrift für Psychologie und ihre Grenzwissenschaften 4: 301–350. Abridged transl. 1950, “Laws of organization in perceptual forms” by Willis D. Ellis. In: Willis D. Ellis (ed.), A source book of gestalt psychology, 71–88. New York: Humanities Press. Whitney, William Dwight 1879/1889 Sanskrit grammar, including both the classical language and the older dialects of Veda and Brahmana, 1st/2nd ed. Leipzig: Breitkopf and Härtel. Reprint 1995, Delhi: DK Publications. Wilde, Christopher P. 2008 A sketch of the phonology and grammar of Rājbanshi. University of Helsinki PhD dissertation. Willett, Thomas 1988 A cross-linguistic survey of the grammaticization of evidentiality. Studies in Language 12: 51–97. Willis, Christina Marie 2007a Converb constructions in Darma — A Tibeto-Burman language. 
In: Frederick Hoyt, Nikki Seifert, Alexandra Teodorescu, and Jessica White (eds.), Proceedings of the Texas Linguistics Society 9: Morphosyntax of Underrepresented Languages, 299–318. Stanford: CSLI. Willis, Christina Marie 2007b A descriptive grammar of Darma: An endangered Tibeto-Burman language. University of Texas, Austin, PhD dissertation.

Syntax and semantics

629

Willis, Christina Marie 2007c Evidentiality in Darma (Tibeto-Burman). Linguistics of the Tibeto-Burman Area 30(2): 89–124. Windfuhr, Gernot L. 1982 The verbal category of inference in Persian. In: Monumentum Georg Morgenstierne II, 263–287. (Acta Iranica 22.) Leiden: Brill. Windfuhr, Gernot L. (ed.) 2009 The Iranian languages. London/New York: Routledge. Windfuhr, Gernot L., and John Perry 2009 Persian and Tajik. In: Windfuhr (ed.) 2009: 415–544. Winfield, W. W. 1928 A grammar of the Kūi language. Calcutta: Baptist Mission Press. Woodbury, Anthony 1986 Interactions of tense and evidentiality: A study of Sherpa and English. In: Chafe & Nichols (eds.) 1986: 188–202. Yadava, Yogendra P., and Warren W. Glover (eds.) 1999 Topics in Nepalese linguistics. Kathmandu: Royal Nepal Academy. Yamabe, Junji 1990 Dative Subject constructions in Indic languages. University of Tokyo MA thesis. Yoon, James Hye Suk 1996 A syntactic approach to category-changing phrasal morphology: Nominalizations in Korean and English. In: Hee-Don Ahn, Myung-Yoon Kang, YoungSuck Kim, and Sookhee Lee (eds.), Morphosyntax in generative grammar, 63–86. Seoul: Hankuk Publishing Company. Zeisler, Bettina 2000 Narrative conventions in Tibetan languages: The issue of mirativity. Linguistics of the Tibeto-Burman Area 23(2): 39–77. Zeisler, Bettina 2004 Relative tense and aspectual values in Tibetan languages. Berlin/New York: Mouton de Gruyter. Zubair, Cala 2008 Doxastic modality as a means of stance taking in colloquial Sinhala. In: Texas Linguistic Forum 52, Proceedings of the Sixteenth Annual Symposium about Language and Society – Austin April 11–13, 2008, 174–190. http://studentorgs. utexas.edu/salsa/proceedings/2008/Zubair_2008.pdf (accessed 25 November 2014)

6 Sociolinguistics
Edited by Elena Bashir

6.1. Introduction
By Elena Bashir

Given the comprehensive scope of this volume, and the vast existing literature on the sociolinguistics of South Asian languages, we have chosen a few topics which seem to us to be of great current importance, or to have received relatively less attention than others. Section 6.2 deals with perhaps the most pressing sociolinguistic issue today, that of language endangerment and loss. The situations in India and Pakistan are discussed by Anvita Abbi and Elena Bashir, respectively. Intimately related to matters of language loss or survival is the area of language policy and planning, discussed by Harold Schiffman in Section 6.3. Issues of language policy take on more urgency in India today (2014) in light of some of the unintended consequences of the three-language policy. Diglossia has been much studied for Arabic and Tamil, for example, but less so for Bangla. Section 6.4 on Diglossia contains original analysis and synthesis by Probal Dasgupta for Bangla and E. Annamalai for Tamil. In Section 6.5, Ian Smith provides a valuable synthesis of discussion and an extensive list of references on pidgins and creoles in South Asia, a relatively under-researched topic. In Section 6.6, Tej Bhatia summarizes the state of research on South Asian languages in the diaspora, concluding that most diasporic studies focus either on the history or the sociology of migrations rather than linguistic issues. Those studies that do treat South Asian languages in the diaspora mostly discuss language attrition and loss. This points to the need for much work on the specifics of linguistic changes in diasporic languages.

Many important topics which have been discussed in earlier literature are not treated here. Several of these are listed below, with a few suggested references.

— Code mixing and code-switching: Vaid 1980 on code-mixing in Indian films, S. N. Sridhar 1978 on code-mixing in Kannada, B. B. Kachru 1978, Kachru, Kachru & Sridhar (eds.) 2008, Bhattacharja 2010 on “Benglish” verbs, Rasul 2013 on code mixing in Pakistani children’s magazines. (See also Section 2.7.2 above.)
— Multilingualism: Pattanayak (ed.) 1990, Annamalai 2001, Pandharipande 2003, Bhatia & Ritchie 2012, Canagarajah & Ashraf 2013 on multilingualism and educational policy
— Dialect studies: Grierson’s Linguistic Survey of India 1903–1922, Ferguson & Gumperz (eds.) 1960, Miranda 1978
— Language and power: Rahman 1996a, 2002 (contributions at http://www.tariqrahman.net/); Ayres 2009
— Historical sociolinguistics: Ferguson 1959, Hock & Pandharipande 1978, D’Souza 1987, Deshpande 1993
— Language and identity: Bharadwaj 2001, Khan 2004, Lahiri 2008, Shetty 2008, Pemberton & Nijhawan (eds.) 2009
— Gender and language: Valentine 1986, Nagar 2008, Zubair 2010
— Language and religion: Brass 1974, Bartholomeusz 1999; Pandharipande 2001, 2006.

A selected bibliography on sociolinguistics in South Asia up to 1978 is available as Anonymous 1978. A more recent overview is Jain 2003. An emerging research area on which very little work on South Asia has yet appeared is linguistic landscape studies (LLS), in the new sense of languages as represented in the visual landscape of a specific (urban) space, for example by signage or graffiti (e.g. Choksi 2014).

6.2. Language endangerment and documentation

6.2.1. The situation in India and adjacent areas
By Anvita Abbi

6.2.1.1. Introduction

The Indian parliament was shaken by the publication of UNESCO’s Atlas of the World’s Languages in Danger (Moseley [ed.] 2010), which highlighted the fact that India, with 197 endangered languages, had the highest rate of language loss in the world. This draws attention to several glaring facts: (i) While the number of endangered languages may not be that high, it is true that the number and vitality of Indian languages have been gradually but continuously decreasing in post-independence India. (ii) The methods of assessment of language attrition or language death adopted by international standards, viz. Fishman’s ‘Graded Intergenerational Disruption Scale’ (GIDS), adopted in UNESCO’s Language Vitality and Endangerment (2003) and Ethnologue’s 17th edition Expanded Graded Intergenerational Disruption Scale (EGIDS) (Lewis, Simons & Fennig 2014), are not appropriate for the South Asian case. The South Asian sociolinguistic scene, demanding distinct and additional parameters for the assessment of language vitality, draws attention to the need for serious research into the constituents of traditional South Asian multilingual society that have helped sustain it in the past, and into the threats to it in the last sixty years. One major marker of the robustness of South Asian languages is ORALITY. However, in the absence of any tested and authenticated different scale of parameters, we will follow the UNESCO guidelines to assess the endangerment of South Asian languages. The four best sources of information on endangerment referred to here are: Census of India 2001, National Census of Nepal 2011, Ethnologue (Lewis, Simons & Fennig 2014), and Moseley (ed.) 2010.

The languages of India, Nepal, Bhutan and Sri Lanka represent seven language families, viz. Dravidian, Indo-Aryan, Tibeto-Burman, Austroasiatic, Great Andamanese, Austronesian (Angan languages), and Tai-Kadai. A brief account of the endangered languages in each language family is given below. Three significant factors that play a major role in endangerment are small population base, diminishing use of the language in various domains, and decreasing or total absence of intergenerational transfer. In India alone, as many as 156 languages are spoken by fewer than 10,000 speakers. The geographical spread of these languages shows an uneven distribution: The North Eastern states of India and the State of Jammu and Kashmir together are home to 93 of the 156 languages under consideration. The family-wise distribution of the endangered languages spoken by fewer than 10,000 speakers is as follows.

6.2.1.2. Language-family wise situations

6.2.1.2.1. Dravidian

Endangered Dravidian languages are found in southern India and in some parts of central and eastern India. The languages are listed on a state-wise basis; numerals in brackets beside the language names represent numbers of speakers. Kerala: Ara Nandan (200), Moopan (3000), Maduga (3370), Paliya (9520). Mannan (7850) and Eravallan (5000) are each spoken in both Kerala and Tamil Nadu. Karnataka: Hakkipikki (8414), Kutiya (2800). Toda (1560), spoken in both Karnataka and Tamil Nadu, is considered vulnerable due to heavy language shift to the State languages. Tamil Nadu: Jenu Kurumba (3500), Kurumba (5498), Malasar (7760), and Kota (1186), spoken in Tamil Nadu, Karnataka, and Kerala. Orissa: Two languages, Manda (4040) and Pengo (2000), are highly endangered. Introducing education in State languages and ignoring the mother tongue medium has impeded the growth of these languages. Documentation of these languages is not adequate.

6.2.1.2.2. Indo-Aryan

The geographical and demographic spread of this language family is vast, stretching from Afghanistan to Pakistan (considered in Section 6.2.2), India, Nepal, Bangladesh, and Sri Lanka. A large number of endangered Indo-Aryan languages of India are concentrated in Jammu and Kashmir. Among the 156 languages spoken by fewer than 10,000 speakers in India, 37 Indo-Aryan languages are spoken in the region of Jammu and Kashmir alone. Other regions in which endangered languages are found are Himachal Pradesh with 3000 speakers of Bharmauri, 750 speakers of Chinali, and only 31 speakers of Baghati; and Chattisgarh and Madhya Pradesh with 6790 speakers of Bhunjia. Speakers of Birjia/Binjhia are spread out in three states: 5365 speakers in Jharkhand, 9479 in Orissa, and 1654 in Bengal. Other languages which are spoken by more than 10,000 speakers are also vulnerable due to reductionist and subjugating education policy. The ancient language Sanskrit, although claimed by 14,100 people as mother tongue (Census of India 2001), is also endangered, as it is no longer a spoken language and remains the language of the Classical texts, used by Hindus only in specific religious domains.1 Bote-Majhi in Nepal has no monolingual speaker left, and the total figure of speakers is 8770 (National Census of Nepal 2011). It is undergoing rapid shift to Nepali. The situation is similar for Kumhali, which has 12,200 speakers but is shifting to Nepali. Palpa, which was spoken in the Lumbini zone, is now extinct. Vedda, spoken in Sri Lanka, is definitely endangered, with only 300 speakers.

6.2.1.2.3. Tibeto-Burman [with input from Carol Genetti]

A large area of the Himalayan region is home to most of the languages of the Tibeto-Burman family stretching from Western Nepal to Bhutan and to the so-called “Seven Sisters”,2 i.e. the Northeast of India. Languages of this family are also found in some areas of Jammu and Kashmir, Uttarakhand, Himachal Pradesh, West Bengal, Assam, and Bangladesh. The area is marked by both linguistic multiplicity and linguistic diversity. The endangerment scenario for the languages of this family reflects ongoing shift towards State languages and English in some or most domains. As a result, more than 90 % of the languages and language varieties reported in Ethnologue and in the Encyclopedia of the World’s Endangered Languages (van Driem 2007) are classified at some level of endangerment between “unsafe” and “moribund”. According to UNESCO, more than 40 % are classified as “endangered”, more than 20 % as “severely endangered”, and more than 6 % are considered “moribund”. Despite governmental support extended by Bhutan to preserve minority languages as part of the cultural heritage of the country, languages with a small population base run the risk of being “definitely endangered” as claimed by UNESCO. Hence, the Black Mountain (500), Brokkat (300), Chali (1000), Gongduk (1000), Lakha (8000), Lhokpu (2500), and Dakpa (1000) languages are threatened. Speakers of Turung (1200) no longer speak a Tai language but a variety of Singpho, a Tibeto-Burman language (Morey 2010). Highly endangered languages spoken by fewer than 10,000 speakers in various states of India other than the Seven Sisters are Dhimal (450), Mech and Toto (1000) in West Bengal; Jad (300) and Kanashi (500–700) in Himachal Pradesh, and Dargari (400–500) in Jammu and Kashmir. The list of the endangered languages in the seven states of the Northeast is large, representing language shift to major languages and the perception that proficiency in a minority language is a hurdle in socio-economic development. In Bangladesh, Mru (18,000), Bom (7000), Pankua (3000), Chak (6000), Khyang (2000), Kumi (2000), and Mizo (1000) are minor languages that share their speech areas with India. These are endangered and need to be safeguarded.

1 Editorial note: See Hock 1992 and Hastings 2004 for different perspectives.
2 “Seven Sisters” is the name given to the contiguous seven states of northeastern India, viz. Assam, Arunachal Pradesh, Tripura, Mizoram, Manipur, Meghalaya, and Nagaland.

6.2.1.2.4. Austroasiatic [with input from Gregory Anderson]

Virtually all Austroasiatic languages of South Asia are threatened or at varying stages of endangerment and shift. Jahaic languages in particular are mainly spoken by very small groups of a few hundred speakers. All seven Nicobaric languages, namely Luro (2000), Lâmongshé (400), Muot (10,000), Pû (5000), Sanenyo (1300), Takahanyilang (3000), and Shompen (400), are threatened or critically endangered, and warrant immediate documentation. Within Munda, only Santali is safe and healthy; Sora is stable in some areas but should be considered threatened where it is not yet outright endangered. The Kherwarian languages Turi (5000), Birhor (approximately 2000 speakers spread out in five to six states), Asuri (5000), and Kodaku (15,700) are poorly described and at varying stages of shift from incipient to advanced. Munda languages in Jharkhand are shifting to a local variety of Sadri, an Indo-Aryan contact language. Munda languages spoken in Orissa, such as Bondo/Remo (9000), GtaɁ/Didyi (5000), Gorum (43), and Gutob/Gadaba (8000) are highly endangered. The same holds true for Koḍa (1300) in Bangladesh. Standard Khasi is an official language of India, so it is stable and healthy, but most minor Mon-Khmer languages range from “threatened” to “endangered” in status. These include Lyngngam, Pnar, War, and Nongtalang. In Meghalaya, English and Standard Khasi are expanding at the expense of Khasi’s minority sister languages.

6.2.1.2.5. Great Andamanese

This language family represents the languages of the descendants of the first settlement of South and Southeast Asia. Great Andamanese, with an estimated population of 3000 to 3500 in the early part of the nineteenth century, was reported to have been reduced to 625 speakers by the twentieth century (Temple 1903). There were ten different varieties of mutually intelligible languages, viz. Sare, Khora, Jero, Bo, Puchikwar, Kede, Kol, Juwai, Âkà-Bêa, and Aka-Bale; all of these are now extinct. The language family as a whole has only five speakers left, who speak the present-day Great Andamanese language, which is a mixture of the four northern varieties.3 Most of the languages in this family became extinct by the mid 1930s, and currently the entire language family is on the verge of extinction. Documentation of these languages is far from satisfactory; however, Âkà-Bêa and present-day Great Andamanese are richly documented.

3 Details on this family are given in Section 1.10.1 on the Andaman languages.

6.2.1.2.6. Austronesian (Angan Languages)

Angan is a newly established family of languages (Blevins 2007), consisting mainly of Jarawa (300), Onge (100), Sentinelese (exact number not known), and Jangil (extinct), all represented in the Andaman Islands. Except Jangil, which became extinct by 1925, all the other languages are spoken by small populations, transmitted to the younger generation, and used in all relevant domains of the hunter-gatherer society. However, recent contact with outsiders may endanger Onge and Jarawa and force them to meet the same fate as Great Andamanese. Sentinelese alone has the possibility of long-term survival, as the community has been successful in maintaining an insular society and resisting all contacts with the outside world. Although Jarawa and Onge have been described by linguists and anthropologists to some extent (see Section 1.10.1.3.1), further documentation is needed as there are no dictionaries or intensive grammars.

6.2.1.2.7. Tai-Kadai

Tai languages cover a fairly wide area of South and Southeast Asia including Myanmar. Languages of this family in India are spoken in Arunachal Pradesh and Assam. Out of the seven languages spoken in India, four are endangered, viz. Aiton (1500), Khamiyang (50), Phakial (5000), and Phake (2000); one is stable, Tai Khamti, and two are extinct, Ahom and Nora. All Ahoms, including other Tai groups, are now Theravada Buddhists and speak Assamese. Khamti, found in the Lohit district of Arunachal Pradesh, is the largest, with 12,890 speakers, among all the Tai languages that are found in India and is used in education (Das 2014). Aiton and Phake are well documented by Morey (2005). Khamyangs are mainly settled in Pawoimukh, seven miles downstream of Margherita in Tinsukia district in Assam. It is reported that 50 elderly speakers residing in Pawoimukh can speak this variety of Tai and are taking measures to teach the language to the younger generation (Morey 2005: 29–32). Tai languages are severely endangered because of heavy language shift to Assamese. However, in Lohit District of Arunachal Pradesh, where Khamti is a dominant language, people of other Tai varieties use it as a lingua franca. Their own dialects are influenced by Khamti (Das 2014).

6.2.1.2.8. Conclusions

To sum up, more than 32 % of the languages which are spoken by fewer than 10,000 speakers belong to the Tibeto-Burman group. Most of the threatened Indo-Aryan languages (37) are spoken in the state of Jammu and Kashmir. The Northeast of India and Jammu and Kashmir together hold 93 of the 156 endangered languages under consideration. Out of this total of 156, only ten are used for literacy and education, and none for judicial purposes (Narayanan 2014). Approximately 87 % of these languages are spoken by the section of the population which is deprived of educational privileges. There are also 21 unclassified languages which are threatened. In addition to individual threatened languages, the continent is on the threshold of losing the entire language family of Great Andamanese. This is a mammoth, irrevocable loss. Interestingly, the highest concentration of endangered languages is in the conflict zones of India: the Seven Sisters of the Northeast; Jammu and Kashmir; and the east central states of Jharkhand, Chattisgarh, and Madhya Pradesh.

Multilingualism was never seen as a threat to language maintenance in South Asia, but the recent trend of banishing indigenous languages from the home domain by various tribal and minority groups, in the context of the three-language formula adopted in schools and the hegemony of the State language, is disturbing (Abbi 1997). Further, in many South Asian countries it is observed that the will to disassociate from the indigenous language is stronger than the will to use it as an identity marker (Abbi 2008). In India, the biggest damage is being done by the current education policy where students are forced to be educated in the State language of domination, and the existing three-language formula does not take any cognizance of the indigenous languages (Abbi 2009). Proficiency in English and State major languages is considered the road to development and a place in the higher strata of society. These factors motivate speakers to shift to the language of the majority. Revival programs are few and temporary because of lack of sustained interest. Institutional support is almost nil. The only saving grace is that All India Radio broadcasts programs in some of the minor languages regularly. These are very popular and may sustain enough interest for the speakers to continue speaking their indigenous languages. However, documentation of the languages, especially those spoken in the conflict zones, is needed to preserve indigenous knowledge bases.

Linguistic diversity, the greatest human-made treasure, is at stake, ironically at the hands of those who created it. The extinction of each language results in the irrecoverable loss of unique cultural, historical, and ecological knowledge. Reversing the trend of language shift and the abandonment of indigenous languages in the wake of globalization, the Internet, and necessary education in English will be a real challenge for all the countries of South Asia.

6.2.2. Pakistan and Afghanistan
By Elena Bashir

6.2.2.1. Introduction

UNESCO’s Atlas of the World’s Languages in Danger (Moseley [ed.] 2010) lists over 20 languages in Pakistan whose status ranges from vulnerable to severely endangered according to Brenzinger et al.’s (2003) intergenerational language transmission scale. Of these, seven are classed as vulnerable, ten as definitely endangered, and five as severely endangered. For Afghanistan, 23 languages are classed as vulnerable to critically endangered. While not necessarily agreeing with the UNESCO ratings for each language or the completeness of the list, this summary of research on language endangerment and language documentation work in Pakistan and Afghanistan discusses institutions and persons involved in work on endangered languages.4

6.2.2.2. Government and education policy in Pakistan

Regional and local languages in Pakistan have suffered from neglect since the formation of the country. The “one nation one language” philosophy enunciated by Muhammad Ali Jinnah at the outset,5 combined with preoccupation with establishing a national identity for Pakistan and socioeconomic pressures, set the scene for neglect, or even suppression of languages other than Urdu.6 Educational policy established Urdu as the medium of education in Government schools in all provinces except in Sindh, and apart from short-lived attempts in the early 1990s to introduce Pashto as medium of instruction in NWFP, and Balochi, Brahui, and Pashto in Balochistan. Local practice has frequently discouraged or actively punished the use of local languages in schools. For example, Asif 2005 describes the feelings of shame inculcated in schoolchildren when they use their mother tongue Saraiki, which, though not an endangered language, is subject to the same sorts of pressures that the smaller languages are. Abbasi & Khattak 2010, Abbasi & Asif 2010, and Abbasi, Khattak & Bin Saeed 2011 describe a similar situation for Pahari, which is undergoing shift to Urdu largely due to pressure from school environments. Manan, David & Dumanig 2014 describes this in private schools in Quetta. Farrell 2000 and Tan 2000 discuss local attempts in Karachi to develop mother-tongue education in Balochi.

In 2011, after the passage of the 18th Constitutional Amendment and the subsequent devolution of the portfolio of Education to the Provincial governments, the Khyber Pakhtunkhwa (formerly NWFP) cabinet gave approval, in principle, to introducing five regional languages (Pashto, Hindko, Saraiki, Khowar, and Kohistani) into the school syllabus of the areas where they are spoken.7 Pashto would be taught as a compulsory subject in the areas where it is predominantly spoken. Hindko, Saraiki, Khowar, and Kohistani were to be taught as compulsory subjects from classes 1 to 7 in the areas in which they are spoken from academic session 2012–2013, and from classes 8 to 12 by 2017–2018, in Government as well as private schools throughout Khyber Pakhtunkhwa (Khyber Pakhtunkhwa Government 2011; Dawn 4/18/2013). If this policy is maintained, and is implemented, it could be a very significant development, serving as a precedent for similar steps in the other provinces. Another sign of growing official awareness of the necessity of preserving and encouraging indigenous languages is a call for the establishment of a language commission to develop criteria for giving national language status to all major languages spoken in the country (http://www.chitraltoday.net/2014/03/14/commission-to-set-criteria-for-national-status-to-local-languages/#sthash.FFlLPzNl.dpuf).

4 Rahman 2004 gives an extensive appendix listing the smaller languages of Pakistan, their status, and the resources available in them.
5 ‘[…] But let me make it very clear to you that the State Language of Pakistan is going to be Urdu and no other language. Any one who tries to mislead you is really the enemy of Pakistan. Without one State Language, no Nation can remain tied up solidly together and function. Look at the history of other countries. Therefore, so far as the State Language is concerned, Pakistan’s language shall be Urdu.’ From a 1948 speech on “National Consolidation” in Dacca (Jinnah 1989).
6 Addleton 1986 is an early statement on the importance of the smaller languages of Pakistan. Baart 2003 is a more recent statement of the need for language documentation and preservation work. It also contains an appendix listing the languages spoken as mother tongues in Pakistan, with estimated numbers of speakers. Rahman 2006 is another general discussion of this topic.

6.2.2.3. Government and education policy in Afghanistan

Language preservation work is in its infancy in Afghanistan.
The 2004 Constitution of the Islamic Republic of Afghanistan recognizes all languages spoken in Afghanistan, listing the major minority languages by name. Chapter 1 Article 16 states: ‘The state adopts and implements effective plans for strengthening, and developing all languages of Afghanistan.’ The Ministry of Higher Education, under a grant from the World Bank, began developing educational materials for the minority languages in 2006. Beginning with class one primers and working their way up to high-school textbooks, the minority language staff of the Ministry has been writing curriculum to teach minority languages to their own language populations. These textbooks are intended to be used in government schools, and are intended for third-language teaching in the minority speaking regions, not as full-scale language instruction. At this time it is not known whether any of the materials have yet been systematically adopted and implemented. Some of these minority languages have previously been unwritten, and the process of developing educational materials has prompted government-sponsored work on orthography development for them. Jamal (2010: 24) describes the sociolinguistic situation of Hazaragi Persian, and recalls being scolded for using Hazaragi in school.

7 Khowar and Indus Kohistani are classed as vulnerable, and two other varieties of Kohistani as definitely endangered.

Pashai, spoken in Eastern Afghanistan and classed as “vulnerable”, has also had an indigenous effort to document and publish reading materials.8 Beginning in 2003, The Darrai Noor Language Committee, a project of SERVE, an international NGO, has developed an orthography based on Pashto. This orthography differs somewhat from that produced by the Ministry, being more phonemically based and linguistically informed. It is intended to help Pashai speakers preserve the unique features of their language. There is over 80 % illiteracy in the rural, mountainous Pashai-speaking regions and a high degree of bilingualism, putting the language at risk for endangerment. Lehr 2014, based on recent fieldwork on the Darrai Nur dialect of Pashai (vulnerable), and constituting the first English-language Pashai grammar since Morgenstierne’s work in the early 20th century (Morgenstierne 1944, 1956, 1967), is a notable recent documentation project. Her dissertation includes a descriptive grammar and discussion of current language preservation efforts, women’s role in preserving language, and ethnography of the Pashai-speaking community.

In addition to Degener 1998, Strand 1997–2014 remains the most recent documentation of the Nuristani languages (all severely or definitely endangered); for Prasun see also Buddruss & Degener 2016. Work on Wakhi (definitely endangered) is under way in Pakistan, Afghanistan, and Tajikistan. Orthography development for Wakhi is still very much in flux (Anonymous 2011).
In Afghanistan, the Ministry of Education has published Wakhi language teaching texts in two parts (Yevam Sinf, Boyem Sinf), by Mir Ali Wakhani.9 Dodykhudoeva 2007 discusses language shift, documentation work, and revitalization efforts with speakers of the Pamiri languages, spoken in Afghanistan, Pakistan, Tajikistan, and China. For more on education policy in Afghanistan see Section 2.4.2.2 above. 6.2.2.4. Economic, social and political factors The long years of war in Afghanistan and the resulting insecurity in Afghanistan and northern Pakistan have resulted in disruption of small language communities and displacement of their speakers. These include Pashai (vulnerable), GawarBati (definitely endangered), Sawi (definitely endangered), and all the Nuristani

8 I am grateful to Rachel Lehr for the information in this paragraph.
9 Information on these Wakhi teaching texts is courtesy of John Mock, whom I thank. No other publication information is available to me.

Sociolinguistics


languages in Afghanistan (definitely to severely endangered).10 In summer 2010, catastrophic floods in Swat and Indus Kohistan (Pakistan) resulted in the destruction of the homes and habitats of many speakers of vulnerable and endangered languages (e.g. Kalam Kohistani, Ushojo, Torwali). The Kalasha language, which is intimately connected with the local physical environment of its speakers, has been threatened by deforestation, excessive tourism, and most recently terrorism (Hussain & Zaman 2004; Lines 1996). The erosion of Kalasha culture by conversion and modernization also threatens the language; see Di Carlo 2010 for the role of traditional ritual practices in language preservation.

6.2.2.5. Language shift

Language shift is taking place in many regions of Pakistan, even involving the larger languages. For example, in urban Punjab, where Panjabi has little social or economic status, is associated with illiteracy and low-level jobs, and is plagued by feelings of inferiority, there is widespread shift from Panjabi to Urdu (Mansoor 1993: 129; Zaidi 2001, 2011: 6; Ayres 2008: 923, 2009; Khoklova 2014). In northern Pakistan, where many small and severely threatened languages are found, small linguistic communities are in the process of language shift, usually to the locally dominant language. For instance, in southern Chitral, Kalasha speakers have shifted to Palula or Khowar (Alberto Cacopardo 1991; Augusto Cacopardo 1991; Cacopardo & Cacopardo 1991; Mørch 2000); in Hunza and Nager, speakers of Domaki11 (severely endangered) are switching to Burushaski or Shina (Weinreich 2010: 10); in the upper Dir and Swat valleys, Gawri (Dir/Kalam Kohistani) speakers are shifting to Pashto (Sagar 2003a; Baart 2003: 4–5); the Kundal Shahi-speaking community in the Neelam Valley is shifting to Hindko (Rehman & Baart 2005; Baart 2003: 5–6); Ushojo-speaking (severely endangered) households in the Chail Valley in upper Swat are adopting Pashto (Sagar 2003b; S. J. Decker 1992: 75–76), and/or Torwali (Inam Ullah, p.c.). In Pakistan-controlled Kashmir, Kashmiri speakers are shifting to Urdu (Dhar 2009). The reasons for language shift fall into two clusters: those connected with (i) speakers' feelings of inferiority or shame about their language, and (ii) economic realities. Some languages are affected by both kinds of factors, for example Domaki in the northwest, Saraiki in southern Punjab, and Panjabi in central Punjab. Some languages are under pressure for economic reasons, but their speakers maintain a positive attitude toward the language, e.g. Khowar and Kalasha. And some, like Pashto in the northwest and Sindhi in the southeast of the country, seem to resist both types of pressure. Language vitality has been investigated in detail for the

10 Descriptive, historical linguistic, and ethnographic information on the Nuristani language communities can be found at Strand 1997–2014.
11 Tikkanen 2011 is an important new descriptive piece on Domaki grammar.


Elena Bashir

languages of northern Pakistan by the authors of the five-volume Sociolinguistic survey of Northern Pakistan: Rensch, S. J. Decker & D. G. Hallberg 1992; Backstrom & Radloff 1992; Rensch, C. E. Hallberg & O’Leary 1992; D. G. Hallberg 1992a; and K. D. Decker 1992a. Structural changes in minority and threatened languages are occurring with great rapidity, but have so far hardly been studied. Bashir (2007, 2008, 2010) discusses such changes for Khowar, Balochi, and Brahui, respectively. In Srinagar (India), a transplanted variety of Burushaski is rapidly being influenced by Kashmiri and Urdu (Munshi 2006, 2010).

6.2.2.6. Documentation efforts

Some of these languages have been described in older scholarship, written before the current concern with language endangerment; information about these works is available elsewhere, so they will not be discussed here. One exception is a Russian grammar of Ormuri which has recently been translated into English (Efimov 2011[1986]). Other studies have been written by persons with a focused interest in language documentation in the context of the 21st century, both by Pakistani language activists and by foreign linguists. Work by local activists often begins with lexicography; notable examples include print and online dictionaries of Torwali (Inam Ullah 2010a, 2010b), a Gilgiti Shina-Urdu dictionary (Zia 2010), works on Kohistani Shina (e.g., Kohistani & Schmidt 1996), a Khowar-Urdu dictionary (Naji 2008), work in progress on Ormuri by Burki (2001), and a three-volume Burushaski-Urdu dictionary (Burushaski Research Academy 2007, 2009, 2014). Local literary-cultural organizations are playing an increasing role in raising awareness about the importance of documenting and preserving indigenous languages, and some of them are making use of new technologies like the internet and social media.12 The Anjuman-e-Taraqqi Khowar, originally founded in 1956, has a substantial number of publications to its credit.
More recently the Mothertongue Institute for Education and Research has been organized and is engaged in language documentation and development work. Some results of their work and many other articles on Khowar language and culture can be seen at www.mahraka.com. Kalasha-language activists have created web resources where culture and language preservation matters are aired (https://www.facebook.com/KalashaPeople, http://kalashaheritage.org). The Gandhara Hindko Board (http://www.gandharahindko.com, https://www.facebook.com/pages/Gandhara-Hindko-Board/119133821494355) is active in publishing and in cultural activities like

12 Some of these organizations have created websites, but some of these websites have vanished, while some have migrated to Facebook and other social media. The reader is encouraged to search for workarounds for those websites which may no longer be available or which may have moved.


the 5th KPK Language and Culture Conference held on 25–26 December 2010, a Seraiki-Hindko Conference held in March 2014, and the Third International Hindko Conference held in November 2015. The Anjuman-e-Taraqqi-e-Palula was founded in 2003 to promote the language and to encourage writing in a Unicode-compatible script. The Wakhi Tajik Culture Association (https://www.facebook.com/pages/Wakhi-Tajik-culture-Association-WTCA/111159485661668) is active in Pakistan. A blog has been established offering online lessons in Wakhi (http://learnwakhi.blogspot.com/). The Idara Baraye Taleem-o-Taraqqi is a Torwali community organization for multilingual education and development. These and other similar organizations cooperate in holding annual Mother Tongue Day (Feb. 22) functions in several cities of Pakistan. The Mother Tongue and Heritage for Education and Research (MOTHER) group is active at a regional level in northern Pakistan. Also active are the Burushaski Research Academy (originally founded in 1982 in Hunza and Gilgit), the Gawri Cultural Society (founded in 1996 as the Kalam Cultural Society, and renamed in 2007 as the Gawri Cultural Society; see Sagar 2008 for a detailed description of its activities), and the Shina Language and Culture Promotion Society, which has established a presence on Facebook and has advocated establishment of a Language Authority in the Gilgit-Baltistan region to promote the languages of the region (http://tribune.com.pk/story/113368/shina-balti-should-be-recognised-as-national-languages/). In Baltistan, the Baltistan Cultural Foundation is active in awareness-building activities aimed at preserving Balti language and culture.
The Brahui Academy and the Participatory Development Initiatives NGO organized a conference in January 2015 to raise awareness of the issues facing the Brahui language (http://www.brahuiconference.org).13 The Forum for Language Initiatives-Islamabad, formerly Frontier Language Institute-Peshawar, (FLI) is an NGO which offers training and library resources for local language activists and encourages mother-tongue literacy programs. The Allama Iqbal Open University in Islamabad has started a Department of Pakistani languages. There is, however, so far no government-sponsored effort focused specifically on documentation of languages under threat. Localization efforts by the computational linguistics community in Pakistan are still focused on the larger languages, but there are indications that the techniques developed by computational linguists will also be deployed in efforts to document smaller languages (see Rahman 2004, Inam Ullah 2012). As an example of such a potential development, Hippisley, Stump, and Finkel (2009), working on Shughni in Tajikistan, discuss the use of computing technology to model data collection strategy and be ‘a means of furnishing the field-worker with elicitation tasks whose results feed into an enhanced understanding of the data, which in turn show the path to the next stage

13 A similar national-level Seminar on Brahui Language and Literature was held on May 11–12, 2014 in Karachi.


of elicitation, ultimately leading to a well-informed and robust account of the data which is already digitized and therefore exchangeable.’ Their Shughni project is described at: http://www.rch.uky.edu/Shughni/. Documenting endangered languages by work with speakers in the diaspora is another possibility. An example of such work is a project at City University of New York under which researchers plan to survey a small Afghan neighborhood in Flushing, Queens, for speakers of Ormuri (Roberts 2010). Information available to me on resources about language vitality studies [V] or documentation work [D] on specific languages is summarized in Table 6.1.

Table 6.1 Documentation of threatened languages in Pakistan

Language | Endangerment Status | Documentation and Vitality Studies
Bateri (Baterawal Kohistani, bhaT’esa z’ib) | Definitely endangered | Zoller (2005) [D]; Strand (2011b) [D]; Hallberg (1992b) [D]
Brahui | Vulnerable | Bashir (2010) [D]
Burushaski | Vulnerable | Burushaski Research Academy (2007, 2009) [D]; Jammu and Kashmir Burushaski: Munshi (2006, 2010) [V, D]; Hunza Burushaski: Munshi (2009, 2010–present) [D]
Chilisso | Severely endangered | Hallberg (1992b) [D, V]
Dameli (Damia) | Severely endangered | Perder (2008) [V, D]; Perder (2013) [D]
Domaki | Severely endangered | Weinreich (1999) [D]; Weinreich (2009) [D]; Weinreich (2010) [V]; Tikkanen (2011) [D]
Gawar-Bati | Definitely endangered | Decker, K. D. (1992b) [V]
Gowro (Gabaro) | Severely endangered | Hallberg, D. G. (1992b) [V]
Indus Kohistani (“Maiyā̃”) | Vulnerable | Hallberg (1992) [D, V]; Hallberg, D. G. & Hallberg, C. E. (1999) [D]; Zoller (2005) [D]
Kalam Kohistani (Bashkarik, Gawri, Kalami) | Definitely endangered | Baart (1997, 1999, 2004) [D];14 Baart & Sagar (2004) [D]
Kalasha (Kalashamon) | Severely endangered | Bashir (1988) [D]; Peterson (2006) [D]; Trail & Cooper (1999) [D]; Mørch (2000) [V]; Decker, K. D. (1992c) [V]
Kati | Definitely endangered | Strand (2011a) [D]
Khowar | Vulnerable | Bashir (in progress) [D]; Decker, K. D. (1992a) [V]
Kundal Shahi | Definitely endangered | Rehman & Baart (2005) [V, D]
Ormuri | Definitely endangered | Burki (2001) [V]; Hallberg, D. G. (1992a) [V]; Efimov (2011[1986]), English translation by Baart in 2011 [D]
Palula (Phalura, Dangarikwar) | Definitely endangered | Decker, K. D. (1992a) [V]; Strand (2000) [D]; Liljegren (2008) [D]
Torwali | Definitely endangered | Inam Ullah (2010a, 2010b) [D]
Ushojo | Definitely endangered | Decker, S. J. (1992) [V]
Wakhi | Definitely endangered | Mock (1998) [V, D]; Müller et al. (2008) [V]
Yidgha | Definitely endangered | Janjua (2011) [V]

14 Two dialect surveys, Sagar 2003a, 2003b, were formerly available online, but one of them is no longer so.

6.3. Language policy and planning in South Asia
By Harold F. Schiffman

6.3.1. Introduction

Language policy in the nations of South Asia has been a contentious issue for some time. For researchers who consider language policy to be only the explicit or “official” formulation of decisions and rules about language, research on this issue in South Asia can be expected to focus on changes in policy after the nations of the region became independent, i.e. in the late 1940s. More serious studies of the issue date back to the conflict over whether the British colonial powers should utilize the indigenous languages in their governance of British India and Ceylon, or whether only English would suffice as a way both to govern and to bring modernity to the region.15 Other researchers, who see language policy as not only explicit, overt, and “official”, but as involving also implicit, covert, unofficial, grass-roots phenomena that are deeply rooted in the cultural traditions of a nation, would look further back in South Asian history, where concern for language and for protecting its purity goes back thousands of years.16 Given the extreme linguistic diversity of the region (Ferguson & Gumperz 1960) and the existence of sacred texts in languages as diverse as Sanskrit, Arabic, and Tamil, it was perhaps a mystery to the colonial powers that the region could endure so long without serious linguistic conflict, given that they tended to think that a monistic approach to language policy was the only way to survive in the modern world. The region’s long history of multilingualism and linguistic tolerance was then gradually abandoned as independence approached, with ethnolinguistic conflict becoming the norm. Since India is the birthplace of at least two major religions and has religious minorities who are members of other major religions such as Islam, Sikhism, and Christianity, it is not surprising that religious differences surfaced before and after independence, resulting in Pakistan becoming a separate state, with Urdu as its official “state” language. Meanwhile in Sri Lanka, the Tamil-Sinhala conflict also had religious overtones, since the Sinhalese were mostly Buddhist and the Tamils were Hindu, Christian, or Muslim. In some countries in the region, this conflict resulted in separation (e.g. Bangladesh from Pakistan), in attempted separatism resulting in a long, and finally unsuccessful, civil war (Sri Lanka, see Kearney 1978), and in other lesser conflicts within India and especially Nepal (Sonntag 2006) that may still be working themselves out. For extensive discussion of language policy in Afghanistan, see Section 2.4.2.2 above.

15 This issue, culminating in what is often referred to as the “Macaulay Minute of 1835”, has been described in various sources, e.g. Schiffman 1996: 158–160.
16 For more on the genesis of South Asian linguistic culture, see Schiffman 1996, Chapter 6. Editorial note: For discussion of how “unofficial” language policy enforced in schools affects linguistic culture, see 6.2.2.2 above.

6.3.2. Language policy in India

Language policy in India can be viewed as occurring in three different stages:

• The “classical” period, extending from the earliest period of Indian history up until the arrival of Europeans
• The colonial period, from the arrival of European colonialists until Independence
• The post-Independence period, from 1947 to the present.

The Classical Period is not usually treated as important in language policy development in India, since many researchers see language policy only as explicit and overt, rather than considering also the implicit, covert, unwritten grass-roots cultural and historical aspects of policy. An exception to this is Deshpande 1979, which looks closely at attitudes about language, especially the place of Vedic Sanskrit in a world of non-Aryan, ritually inferior languages that the Aryans encountered when they arrived in India; see also 1.3.1.3.1, this volume. Language issues did not constitute an important part of early colonialism in India, since early contacts simply involved trade. But as the British extended their control over the subcontinent, and displaced or defeated other colonial powers such as the Portuguese and the French, language issues came to a head. Publications about these issues include Spear (ed.) 1958, DeBary 1958, Brass 1974, and Shapiro & Schiffman 1981. Concern for post-Independence language policy actually began in the 1920s and involved debates over whether Hindi could take the place of English in an Independent India, and if so, what kind of Hindi that would be: plain old Hindustani, a Persianized version (Urdu), or a Sanskritized version. After 1947 this struggle continued, since the Constitution did not decree a changeover to Hindi until fifteen years after Independence; but when that moment arrived, severe conflict ensued. Scholarship for this period includes Apte 1976, Benedikter 2009, Brass 1974, Das Gupta 1969 and 1970, Ekbote 1984, Ferguson 1959, Ferguson & Gumperz 1960; Fishman, Ferguson & Das Gupta 1968; Laitin 1989, Mitchell 2009, Nayar 1966 and 1969, Schiffman 1996, Shapiro & Schiffman 1981, and Tsui & Tollefson (eds.) 2006.

6.3.2.1. Language policy in the classical period

As noted above, India has a long, ancient tradition involving the care and transmission of language and sacred texts. Though cultural literacy in ancient India was at first totally oral and focused on the magical power of language, the issue of transmission and survival of the culture then became paramount.
Responses included the development of phonetic and grammatical accounts (Chapter 7, this volume) on the one hand, and differentiation between the classical language and the “vernaculars” on the other, leading to various forms of diglossia (see 1.3.1.3.1 for Sanskrit and Prakrit and 6.4.2 for Tamil and other Dravidian languages). For a classic account of diglossia see Ferguson 1959. Some Prakrit languages, too, acquired status as literary and religious languages and, consequently, became languages of wider communication and also were described in grammars of their own; see 1.3.1.3.2 and 7.2.3.5.

6.3.2.2. Language policy under colonialism

Since colonialism in South Asia was a gradual process, beginning with trade at various coastal ports and not involving the acquisition of territory at first contact, language policy was laissez-faire, utilizing whatever lingua franca was available and worked. European contacts began with the Portuguese, and involved the use of a pidginized (or creolized) version of Portuguese, and as contact increased,


the language began to become indigenized in coastal settlements in India and Sri Lanka; see Shapiro & Schiffman 1981: 194–222 and 6.5.2, this volume. When the British and other colonial powers arrived, it seemed logical to make use of this lingua franca. But as control of territory, mostly by Britain, increased over time, contact with other linguistic varieties also increased, and by the 19th century it became an issue whether Britain would control its territories by using indigenous languages, or whether it would involve English.17 The British did not at first allow missionaries to work in their colonies, but when the pressure to allow them became irresistible, the issue of indigenous language use became inescapable. This came to a head when the Act of 1813 was passed, allowing the introduction of ‘useful knowledge, and of religious and moral improvement’ (Spear (ed.) 1958: 526). But even before this, scholars interested in the languages of the area had managed to learn some Sanskrit.18 The significance of the language increased when Sir William Jones gave his famous speech on the perfection of Sanskrit and its relation to Latin and Greek (Jones 1786). This positive attitude to the indigenous languages by the “Orientalists” was not shared by the “Anglicists”, who wanted to promote the use of English; see De Bary et al. 1958, vol. 2: 38 and Schiffman 1996: 159. This led to a struggle between the “Anglicists” and the “Orientalists” (as well as the missionaries, who wanted to communicate with people in their own languages). The deadlock was broken by Thomas Babington Macaulay, who formulated the famous (or infamous) “Minute on Education” of 1835, according to which government funds would be used to support education in English in India, and the curriculum would be based on the one prevalent in England. 
All parties seem to be agreed on one point, that the dialects commonly spoken among the natives of this part of India contain neither literary nor scientific information, and are, moreover, so poor and rude that, until they are enriched from some other quarter, it will not be easy to translate any valuable work into them. It seems to be admitted on all sides that the intellectual improvement of those classes of the people who have the means of pursuing higher studies can at present be effected only by means of some language not vernacular amongst them. (Macaulay, Prose and Poetry, quoted in de Bary (ed.) 1958: 44)

17 Though the British continued to acquire territory up until the so-called Mutiny in the mid-19th century, many states, kingdoms, and other territories remained under indigenous control, although the British demanded fealty from these rulers, who had to acknowledge the dominion of the British King-Emperor. In these non-British territories, language use was traditional, and was not under the control of the British Viceroy.
18 Traditionally, only sons of Brahmin families could learn Sanskrit, since it was considered ritually polluting to have it even fall upon the ears of impure people. But exceptions could be made, and some Europeans succeeded in studying Sanskrit and recognizing its value.


Macaulay’s scorn for the value of local languages still rankles the sensitivities of many a modern linguist, as well as, naturally, Indian nationalists. Though the use of English in education, and especially higher education, after English-medium universities were founded in various cities in British India, came to predominate and even continued into the post-Independence period, the indigenous languages did not die on the vine, but instead found English to be a challenge that some of their users decided to try to counter.

6.3.2.3. Language policy in independent India

Well before India became independent in 1947, the issue of which language(s) would be official, and what the role of English would be, began to be discussed. In the 1920s and 1930s the Congress Party passed resolutions declaring Hindustani to be the language that would be official in an independent India, a move that prompted some other Indian language groups to begin to resist this plan. Supporters of Hindi advocated a Sanskritized form of the language, and similarly supporters of Urdu favored a Persianized form of their language, but Gandhi argued instead for a “neutral” form of the language, referred to as Hindustani. Amrit Rai 1984 and Alok Rai 2002 are important discussions of these issues. From the early 19th century, the role of the English language in British India grew in prestige and in use. In reaction to this anglicization, the Congress Party had long been involved with the question of the status of Indian languages in the postcolonial period. In 1920, at Mahatma Gandhi’s urging, the Congress had organized itself on the basis of linguistic and cultural regions, and the Nehru Committee Report of 1928 pressed for state boundaries based on linguistic factors, so that state business could proceed in the regional vernaculars.
Gandhi emphasized the pressing need for an indigenous all-India language, and promoted Hindustani, a north Indian koiné that blurred the distinction between Hindi and Urdu. Gandhi did not advocate that the regional languages should be ignored; rather, he felt that a common Indian language for an independent country was of utmost concern. As Congress Party nationalists heatedly debated the issue as members of the constituent assembly tasked with drafting a Constitution, there was hardly any question about the desirability of a common official language, or that some form of Hindi would play that role. India’s constitution therefore specified that Hindi would eventually become the official language for all-Union business, supplanting English (Laitin 1989: 418). But the issue of which languages would be used in the states was not specified in the early drafts of the Constitution, and what evolved after Independence was not envisioned in the early plans, although as Laitin notes, the urgency of redefining state boundaries along linguistic lines was an early concern and had been called for in the Nehru Committee Report of 1928. After Independence in 1947, and the drafting of a Constitution in 1950 that did not call for the removal


of English until fifteen years had passed, there was a brief lull in the turmoil about language policy. But not for long; movement in various central government ministries to change over to Hindi prompted non-Hindi regions to begin to agitate for reorganization of states along linguistic lines. In spite of the fact that two Congress committees advised against redrawing state boundaries commensurate with language use, agitation in Andhra for a Telugu language state intensified, and Prime Minister Nehru gave in to the pressure, which had become quite violent. Furthermore, he announced that in the future, no Indian would be compelled to use Hindi. King 1997 discusses Nehru’s dealing with the issues of national language and linguistic states. In 1956, the State Reorganization Commission recommended that state boundaries be redrawn to take into account linguistic realities. Eventually there was general reorganization of states along linguistic lines. Once state reorganization was carried out administratively, virtually all states legislated a single official language (Laitin 1989: 420).

6.3.2.4. The turmoil of the 1960s

As noted above, the Constitution promulgated in 1950 postponed the decision about what to do about English for fifteen years, and little preparation was made during this lull; but when 1965 dawned, advocates of Hindi demanded immediate compliance with the provision to substitute Hindi for English, while those who preferred English reacted with intense emotion. In the south, Tamils and Telugus attacked trains, setting them and, in some cases, themselves on fire, or drinking poison.19 In the end, after several years of turmoil, a compromise, known as the Three-Language Formula (TLF), was reached in 1968. Indian citizens would be expected to learn their mother tongue, English, and Hindi. Those who were already Hindi speakers would be expected to learn Hindi, English, and another (south) Indian language.
English would be retained as an additional language, and could be used alongside Hindi in dealings with the Central government, e.g. in the Lok Sabha (Parliament).20 It is significant that the TLF was instigated by chief ministers of some of the states rather than emanating from the Central Government. It is also significant that though the dominant language groups in the new linguistic states have linguistic rights, smaller groups, of which there are many, as well as linguistic minorities from adjacent titular language states often have no rights, unless there are bilateral agreements between states. The southern states of Andhra Pradesh, Karnataka, Tamilnadu, and Kerala have taken the lead in working out these bilateral agreements, by which Telugu speakers in Karnataka, for example, get reciprocal rights to schooling in Telugu, if Kannada speakers in

19 For more on this see Ramaswamy 1997 and Mitchell 2009.
20 Other languages could also be used in the Lok Sabha if the request was submitted with 24 hours’ notice, so that simultaneous interpretation could be arranged.


Andhra receive the same right on an equal reciprocal basis. But the rights of even smaller groups, as noted, are minimally protected, if at all (Benedikter 2009). In some areas, e.g. Karnataka, small groups such as Tulu or Kodagu speakers have long since accepted the status of linguistic minority, receive their education in Kannada, and even use Kannada script to write their own languages. Other states, such as Andhra, are reported to be considering a three-language formula with Telugu as the first language, Urdu as the second, and English as third, for some districts. Andhra was of course the first state created due to agitation for linguistic states, and is now in turmoil for other, non-linguistic reasons, which has led to the separation of the Telangana area. As for the issue of linguistic states and their reorganization in the 1950s, another solution to the issue of linguistic rights has been to create new linguistic states over time, carving off territory where a linguistic group is dominant. This has been particularly common in the northeast, where new states such as Meghalaya, Nagaland, Arunachal Pradesh, Manipur, Mizoram, and Tripura were all carved out of Assam and other former states, as their dominant ethnolinguistic groups demanded statehood.

6.3.3. Language policy in Pakistan

6.3.3.1. Literature review

The “classical” period is in effect identical with that of India, and the literature on the period is also the same. For research on 19th-century British colonial policy in the part of India that was to become Pakistan, see Mir 2010 and Diamond 2012. The bulk of research on post-Independence Pakistan has been carried out by Tariq Rahman (1996a, b, 1999, 2002, 2010).

6.3.3.2. The birth of Pakistan and the genesis of a language policy

Though Pakistan came into existence only in 1947 after the partition of what had been British India, language policy in Pakistan has been entangled in questions of what is Hindi and what is Urdu and how it came to be that two competing varieties (one written in Devanagari script and borrowing learned vocabulary from Sanskrit, the other written in Perso-Arabic script and obtaining its learned vocabulary from Persian and Arabic) dominated the linguistic scene in large parts of North India. The fact that these two varieties are almost indistinguishable at the spoken level, where they have been referred to as Hindustani, complicates the picture even further. Since many of the authorities in the princely states that the British encountered as they began to take over India were Muslim, and Persian was largely the language they used to administer their domains, Persianized Hindustani, later


known as Urdu, became the language the British preferred to deal with, up until the introduction of English after the Macaulay “Minute”. The name most associated with the establishment of a separate Muslim homeland in South Asia, i.e. Pakistan, is that of Muhammad Ali Jinnah. Although Jinnah first resisted the idea, he came to accept and champion the notion of an independent Pakistan, whose one and only national language had to be Urdu. (Jinnah was himself not a mother-tongue speaker of Urdu — his home language was Gujarati, and Urdu was not even his second language.) Soon after Independence, Jinnah declared in Dacca (now Dhaka), the capital of the eastern wing of Pakistan, that ‘… the State language of Pakistan is going to be Urdu and no other language. Anyone who tries to mislead you is really the enemy of Pakistan. Without one State language, no Nation can remain tied up solidly together and function’ (Hamid 2011: 194, citing Oldenburg 1985: 76; see also 6.2.2.2 above, fn. 4). The idea that Urdu had to supersede all other languages in an independent Pakistan did not go over well with the large population of Bengali speakers in the eastern wing of Pakistan, nor with speakers of Sindhi and some other languages, though Panjabi speakers, who today constitute 44.5 % of Pakistan’s population,21 were more accepting, since Urdu had been in common use since British times as a language of bureaucratic administration (Diamond 2012). Jinnah’s proposal conflicted with the fact that the native-speaker population of Urdu at the time of Independence was probably less than 3 %, and does not exceed 7 % in present-day Pakistan, where it is mainly the mother tongue of the “Mohajirs”, who came as refugees from India to Pakistan after 1947. 
Be that as it may, Urdu was made the national language of Pakistan: ‘The National language of Pakistan is Urdu, and arrangements shall be made for its being used for official and other purposes within fifteen years from the commencing day.’ (Article 251 of the 1973 Constitution of the Islamic Republic of Pakistan, quoted in Rahman 2006: 74). As for English, ‘English was supposed to continue as the official language of Pakistan till such time that the national language(s) replaced it. However, this date came and went, as had many other dates before it, and English is as firmly entrenched in the domains of power in Pakistan as it was in 1947’ (Rahman 2006: 77). The reason Rahman gives for this entrenchment is that English has remained the monopoly of the power elite, which weaken[s] the local languages and lower[s] their status even in their home country. This militates against linguistic and cultural diversity, weakens the “have-nots” even further, and increases poverty by concentrating the best-paid jobs in the hands of the international elite and the English-using elite of the peripheries (Rahman 2006: 79). (Editorial note: On 8 September 2015, the Supreme Court issued an order that Article 251 of the

21 This figure is from Rahman 2006: 73, and counts Saraiki as a language distinct from Panjabi.

Sociolinguistics

1973 Constitution be implemented, beginning with short-term measures within three months of the date of announcement. See http://supremecourt.gov.pk/web/user_files/File/Const.P._56_2003_E_dt_3-9-15.pdf)

It is clear that similar ideologies about language are operating in both Pakistan and India: there has to be one language and only one, whether it be known as a “state” language, an “official” language, or the “national” language, and it has to represent and reinforce the identity of the nation in some way, either in terms of national history or religious tradition.22 Unfortunately for Pakistan, the idea that only Urdu could fulfill this role meant that linguistic unrest in the eastern wing would eventually result in a civil war and the breakup of Pakistan into (west) Pakistan and Bangladesh in the east. After the Bangladesh war, Pakistan continued to rely on Urdu, the language of only a minority of the population, in the face of continued resentment from its other linguistic groups, and complicated by the preference for English of its elites (Rahman 2006). This has meant that groups other than the elites feel resentment that they must learn Urdu, which does not lead to good jobs, while elite groups get to study in prestigious English-medium schools, and get better jobs.

6.3.4. Language policy in Bangladesh

The literature on colonial language policy in what is now Bangladesh is in fact the same as the literature on colonial language policy in India. As far as the Bengali-speaking area of colonial India is concerned, it is important to note that the establishment of what began as the colonial capital in Calcutta meant that English gained a foothold there earlier than in many other parts of India; but otherwise, the fact that Bengali was a well-established language with its own literature meant that few other Indian languages could challenge it. Eventually, the Bengal Presidency was divided by Curzon in 1905, which separated the Hindu and Muslim communities and set the stage for East Bengal, where Islam predominated, to become part of Pakistan. After Independence, when the province of East Bengal was ceded to Pakistan, threats to the dominance of Bengali came to a head. We have already noted the ill-advised pronouncement by Jinnah that there could be no question but that Urdu had to be the “state” language of Pakistan, made to an audience in Dacca in 1948. And in fact, Urdu was imposed as the “state” language of East Bengal, just as in West Pakistan. Violent protest in East Pakistan, culminating in the tragic shooting

22 If this seems like a contradiction, it is because this is another example of a covert (Schiffman 1996) language policy: the overt, official policy is that Urdu is the “state” (“national”, “official”) language; but a covert (or undeclared) policy that reserves English as the domain of the elite undermines the official declared policy and dilutes it, as well as the power of all linguistic minorities.

deaths of 21 February 1952,23 led to the recognition of both Bengali and Urdu as state languages of Pakistan. Under the circumstances, neither Bengali nor Urdu but English became the common language for communication between East and West Pakistan. More important, linguistic tensions between East and West Pakistan did not subside, and after decades of turmoil a civil war broke out that eventually led to the independence of Bangladesh in 1971. ‘After the liberation, Bangladesh made Bengali the state language and the status of English was drastically reduced. Bengali replaced English in all official communications except those in foreign missions and countries and in armies, where English is still used as official language’ (Hasan 2004). Since that time, there has been little tension within Bangladesh over language. Bengali is spoken by 98 % of the population (Imam 2005: 474), which means that speakers of other languages have few grievances about the lack of use of their languages, any of which is spoken by less than 1 % of the population. What does raise some hackles is the use of English, which, as in other South Asian nations, remains the language of elite education. As Imam points out, ‘The national elite continues to invest privately, as it always has, in English language and culture’ and Bangla-medium education ‘threatens to signify not only lower cultural status but global incompetence’ (Imam 2005: 474). This statement in some ways summarizes the language policy situation for all of the nations of South Asia, where English-medium education and its benefits continue to be preferred by elites, and even non-elites strive to claim it, rather than languish in “ghettoized” non-English medium schools. But as Hosain and Tollefson (2006: 241–257) point out, one of the results of this dominance is a lack of curriculum materials in Bengali.
This means that higher education has to continue in the medium of English, which limits access to it by Bengali-medium graduates and deepens the social divide between those who can access it and those who cannot (Hosain & Tollefson 2006: 241–257). This is of course a problem in the rest of the subcontinent as well, and attempts to deal with the lack of higher education in local languages have mostly led to failure.

6.3.5. Language policy in Sri Lanka

Language policy in Sri Lanka has been studied by many researchers, because the Government’s attempt at pushing Sinhala at the expense of other languages has been extremely controversial (some might even say disastrous), especially as the Tamils developed a counter-policy under the Liberation Tigers of Tamil Eelam (LTTE). In addition, there is the ubiquitous issue of the role of English. For general

23 The shooting deaths of four people are linked to the founding of “International Mother-Language Day”, described at the following website: http://www.timeanddate.com/holidays/un/international-mother-language-day

discussion see DeSilva 1997, Coperahewa 2009, DeVotta 2004 and 2007, and Canagarajah 2005, 2008, and 2009. Thirumalai 2002 and Coperahewa 2009 are studies of language planning in Sri Lanka; Lim & Ansaldo 2007 is a study of a much neglected linguistic minority in Sri Lanka, the Malays (see also 6.5.5 below). We have already alluded to the fact that language policy in Sri Lanka has been perhaps the most problematic of all the language policies in the region. By this is meant the denial of linguistic rights to a large linguistic minority, the Tamils, by the majority Sinhalese speakers in a blatant attempt to reverse what the latter saw as favoritism to the Tamils under the British. This policy, known as the “Sinhala Only Act”, was instituted in 1956 by the Marxist-oriented populist government.24 What followed was an almost farcical failure to implement the policy, including the failure to implement a constitutional modification in 1978 which would have granted “national” language status to Tamil, while retaining the status of “official” language for Sinhala. But this status, despite claims to the contrary, remained symbolic and honored only in the breach, which resulted in the ethnic riots of 1983, and civil war. Again, in 1988, ‘[A]s a part of the 13th amendment to the constitution, Tamil was raised to the status of an official language, while English was assigned the position of a “link language”. This part of the 13th amendment to the constitution stated, “Tamil shall also be an official language”’(Coperahewa 2009: 121). But as various sources have noted, this provision was ‘too little, and too late’ and the seeds of separatism which fed the civil war overwhelmed any constitutional concessions that might have sufficed if enacted three decades earlier. 
As Coperahewa points out, ‘The tardiness in the implementation of statutes making Tamil an official language is so evident that the O[fficial] L[anguage] C[ommission] has admitted “there is an enormous gap between the constitutional provisions and their application”’ (Coperahewa 2009: 134). On May 16, 2009, the president of Sri Lanka declared that the LTTE had been defeated; but despite good intentions, the same questions that have plagued language policy in Sri Lanka remain unresolved.

6.3.6. Language policy in Nepal

Research on language policy in Nepal is, in terms of numbers of articles, slim, but what there is, mostly the work of one author, is thorough and up-to-date. Sonntag (1980, 1995, 2006) provides an excellent review of the past and present of the policy, while her other work (2002, 2003) focuses on issues such as globalization of English in South Asia.

24 See Canagarajah 2005 for details of this. As he notes, English still survived as a language of elites under this policy, and Tamil was reduced to second-class status.

As she points out in her 2006 article, Nepal never underwent colonization, whether from Britain or other colonial powers, but a British presence south of its borders nonetheless had certain effects, if only as a haven for Nepali dissidents. It therefore remained a Hindu monarchy, so even into the 21st century attitudes about language similar to those of classical India prevailed. Nepali, though not originally the dominant language, did become the dominant language as the Rana dynasty extended its hold over the Katmandu valley from the 9th to the 17th century, leading to marginalization of other linguistic groups.25 After other countries of the region became independent, perhaps in the spirit of decolonialization, some changes took place. ‘In the early 1950s, after close to a hundred years of autocratic rule by a prime ministerial dynasty, the Ranas, Nepal tumultuously joined the 20th century by ushering in an experiment in democracy’ (Sonntag 1980: 76). King Tribhuvan, along with the Nepali Congress party, introduced representative democracy, but this was short-lived; his son, Mahendra, ‘dismissed the elected government in 1960, and reestablished absolute monarchical rule’ (Sonntag 2006: 3). With this also came what one might call a parallel in language policy: the reliance on absolutism for the Nepali language, which spread first as a second language and then even became the first language of many former speakers of other languages, part of what Sonntag refers to as “Nepalification” (2006: 7). By the 1990s, however, a people’s movement arose to counter this regime, and for a while it appeared that democracy had been restored. But in 1996 a Maoist insurgency arose, relying on general discontent with the failures of democracy. Meanwhile, after a deranged son of the then king killed members of the royal family, and then himself, the king’s brother Gyanendra took over as king and reestablished autocratic rule.
After some years of turmoil, the Maoists succeeded in disestablishing the monarchy, but stability has not returned to Nepal, and a state of permanent political stalemate seems to have become the norm. What this means for language policy in Nepal is difficult to determine, but implementation of a more democratic policy does not seem to be what is happening. A new constitution was adopted in 1990, but it ‘adopt[ed] a somewhat ambiguous position on the question of language … The new constitution recognized Nepal as a “multilingual” country, but still gave prominence to Nepali over the country’s other languages by designating the former the rashtra bhasa (official/state language) and all the other languages as rashtriya bhasa (national languages)’ (Sonntag 2006: 8–9). Sonntag details more attempts in the late 2000s to fashion a language policy that would give rights to other languages, with some groups attempting to make Sanskrit compulsory in Nepali education while others advocated English.

25 Nepal may be unique in that it contains groups from the Indo-Aryan language family, the Tibeto-Burman family, the Austro-Asiatic family, and even the Dravidian language family.

6.4. Diglossia

All societies have at least some variation between formal and less formal uses of language. South Asia is well known as offering paradigm cases of the much more formidable distinction that is characterized as DIGLOSSIA, the use of historically related language varieties whose structure and prestige differ more radically than do, say, Standard and regional or social Dialect in languages like English; see Ferguson 1959 and 1991. The following two sections present details on two of these diglossic situations in South Asia: Bangla and the Dravidian languages (especially Tamil). It is worth mentioning that something like a diglossic situation, at the lexical level, is found in Hindi and Urdu. At the every-day level, the two are virtually a single language, but the ideologically and politically motivated practice of coining neologisms by drawing on Sanskrit for Hindi and Persian/Arabic for Urdu has meant that at a highly formal level the two languages are mutually unintelligible and, at least as important, have become unintelligible to ordinary speakers. Even Sanskrit-based neologisms like višvavidyālay(a) ‘university’ are not understood by ordinary Hindi speakers, who use and understand the English-based yunivarsiṭī instead, not to mention longer expressions as in (1a), where ordinary speakers use and understand only the version in (1b).26 A fuller investigation of this phenomenon would be desirable.

(1) a. āp kī dūrbhāṣ saṅkhyā kyā hai?
    b. āp kā fon nambar kyā hai?
       ‘What is your phone number?’

Similarly, for the Urdu situation in Pakistan, one finds the native Urdu word hawāī aḍḍā ‘airport’ only on (some) traffic signs and in (some) journalistic writing; otherwise, the normal colloquial word is eirporṭ. A nascent diglossia-like situation seems to be developing in Pakistan — one in which the H variety is Urdu, and the indigenous languages (with many English loanwords) are the L varieties. Interest in this question is beginning to increase in Pakistan; see, for example, the discussion in 6.2.2.2 above, in which diglossia-like situations are described, but without reference to “diglossia”. This topic deserves much attention.

26 For early Indo-Aryan see also 1.3.1.3.1, this volume.

6.4.1. Diglossia in Bangla

By Probal Dasgupta

The Bangla grammar debate that took place from about 1900 onwards highlighted the cleavage between two forms of Bangla. The Sadhu (/ʃadhu/) or ‘pure’ variety used archaic forms of verbs (e.g. khaiya ‘having eaten’) and pronouns (e.g. uhara ‘they’) and preferred Sanskrit loanwords (e.g. marjar ‘cat’) over ordinary Bangla words (e.g. beral ‘cat’). This variety, used only in writing (or when reading texts out loud), contrasted with the Cholit (/čolit/) or ‘colloquial’ variety spoken by the metropolitan elite in Kolkata and featuring phonologically contemporary forms (kheye ‘having eaten’, ora ‘they’). The documents of the grammar debate are now easily accessible (Azad 1984 is one useful anthology, and includes some of the relevant writings by Tagore). The fact that Rabindranath Tagore was a participant in that debate and supported the cause of language modernization makes it possible to hope for serious historiographical attention. Following up on that grammar debate and pressing the case for replacing Sadhu with Cholit Bangla in the educational system and the public space, Pramatha Chaudhuri, with Tagore’s support, spearheaded a movement for language reform, which alone, he argued, would rescue Bengal from its cultural impasse. Here is a characteristic comment of his on the modern educational system of Bengal: We have been caught between modern Europe and classical India to the point of nearly forgetting the Bangla we speak. We learn an English and write a Sadhu Bangla which are separated by a Sanskrit distance. We have quite properly sown the seeds of our English education in the appropriate terrain of classical India first; but we must now replant the resulting saplings in the soil of our own Bangla; otherwise the literature of our land will not blossom (1914/1968: 29, as translated in Dasgupta 1993: 98).

Chaudhuri was successful; the Cholit variety did gradually eclipse Sadhu Bangla even as a literary medium. The Sadhu/Cholit conceptualization in terms of which this historical process was steered, and experienced, precedes and grounds the conceptualization of diglossia proposed by Ferguson (1959). In his classic paper, Charles Ferguson notes that certain languages, including Arabic and Greek, exhibit a sharp formal and functional cleavage — which he calls diglossia — separating an H (‘high’) variety of the language, used in writing and/or in formal settings, from an L (‘low’) variety restricted to informal contexts of speech. It is important to note that Ferguson cut his linguistic teeth on the descriptive study of Bangla (Ferguson 1945). Note also the importance of the Sadhu vs. Cholit cultural contestation in the recent history of the language; when Ferguson was doing his doctoral fieldwork, the victory of Cholit over Sadhu in the public space was far from complete. One may thus assume that his 1959 article was written with some awareness of the indigenous debates in Bengal — supplemented, no doubt, by what he had learnt from

other sources. Ferguson sought to generalize not just over the phenomena, but over the conceptualizations as well. The historical study of diglossia theory will involve contextualizing Ferguson’s early work in terms of its roots in his knowledge of the history of Bangla. Such inquiry cannot be attempted here. Until it is undertaken, a discursive gap will continue to separate Anglophone publications purporting to represent the state of the art of a linguistics-embedded “diglossia theory” from academic reflections on these issues published in Arabic, Bangla, Greek, or Telugu that go back to earlier endogenous conceptualizations in the domain of vertical code cleavage. Anglophone linguistics-embedded diglossia theory rests on two early papers that delineated the field of inquiry: Ferguson 1959 and Fishman 1967. Joshua Fishman argued that the code cleavages Ferguson had highlighted were to be seen as special cases where the H and L terms of the diglossic pairing happened to be varieties of the same language. Underplaying the formal unity of the language structure (a unity definable in terms of what the H and L varieties share), Fishman drew attention instead to the functional differentiation of a speaker’s repertoire into distinct formal and informal strata of verbal ability. Focusing on migrants in the United States who use English for public purposes and, say, Spanish in their private lives, Fishman proposed to describe them as diglossic bilingual speakers, in contrast to Ferguson’s diglossic monolinguals. Some studies of South Asian sociolinguistic scenarios drawing both on Ferguson’s conception of what we may call “classical” diglossia and on Fishman’s approach to “extended” diglossia are collected in Krishnamurti, Masica & Sinha (eds.) 1986, the proceedings volume of a 1980 conference Ferguson attended. 
There Ferguson argued that his take on classically diglossic languages threw some light on options available in their diachrony that were due to the specificity of diglossia (his remark was made in response to a question and has never been reported in print). The reexamination of classical diglossia in Bangla by U. N. Singh and Maniruzzaman (1983, reviewed in Dasgupta 1990) — an attitude-based study of the status of the rising L and declining H varieties of Bangla in specific domains — sidesteps the issue of where English is located in the verbal repertoire of a speaker of Bangla, and by the same token sidesteps Fishman’s extension of diglossia. Dasgupta 1993, a diglossia-theoretic study focusing on the location of English on the South Asian map but also paying some lateral attention to the sociolinguistics of Bangla, uses Ferguson’s conceptualization to make sense of the specific history of Bangla; it also employs Fishman’s apparatus at the level of the pan-Indian alignment of H English with L Indian languages. However, that study critiques the structural-functionalist account of languages as codes that the Ferguson-Fishman debate presupposed. Viewing languages as sites of discourse, Dasgupta 1993 proposes that the cognitive content associated with the H and L poles of a discursive system dyad gets negotiated as part of the management of a diglossic process. This proposal is part of the broader transition from structural-functional studies

of verbal behavior towards generative and related investigation of knowledge of language and of cultural/cognitive systems surrounding it. This move made in Dasgupta 1993 and updated in Dasgupta 2004a was based on Francis Britto’s (1986) conceptualization of diglossia as a universal differentiation of language into H and L “diasystems” and on Abel’s (1998) “substantivist” rearticulation of Britto’s approach. Dasgupta 2004b and 2011 connect this move in diglossia theory to the methodological status of generative linguistics. Dasgupta 2004b argues that in diasystemic terms the basic portrait of language that generative grammarians hope to draw is a portrait of the H diasystem embodied in the written language — of ‘the formal perfection of the idealized competence of a typical authorial mind which brings out the infinity of the writing potential and thus escapes closure in any one writer’s performance’ — and that ‘the interruptibility which reflects the substantive recycling of spokens in transactive assembly’ distinguishes speech from writing. (For a social science approach to issues of speech and writing that this line of inquiry draws on, see Goody 1986.) 
In a perspective that assumes that the object of linguistic study is heterogeneous and that general linguistics must treat the duality of written and spoken substances of linguistic form as constitutive rather than historically contingent — a “substantivist” perspective in the sense of Dasgupta, Ford & Singh 2000 — one must then construe ‘the generative revolution’s formal infinity’ in such terms as, ‘Written prose alone unpacks the operative fullness of speech’, terms that are ‘convertible into the langue-as-writing basis of poststructuralism.’ When other authors step in — as they surely will — to contest these claims, they will find it appropriate to take on board the fact that a series of influential interventions by Tagore have shaped the endogenous tradition of analysis of the Sadhu/Cholit code cleavage in Bangla. See Azad 1984, an anthology that inter alia showcases the Bangla grammar debate and helps relate Tagore’s interventions to the views of other participants in the debate, and Tagore 1984, a collection that helps place those interventions in the context of Tagore’s other writings on the linguistics of Bangla. Despite high praise from Suniti-Kumar Chatterji — ‘the first Bengali with a scientific insight to attack the problems of the language was the poet Rabindranath Tagore’, Chatterji wrote in the most celebrated monograph in the history of the field (1926: xvi) — Tagore’s contribution to linguistics is unknown to most linguists. The Sadhu/Cholit divide remained an abiding theme in his linguistic writings from the early 1880s to the late 1930s (Dasgupta 1985). When the Anglophone sociolinguistics of diglossia gets its conceptual and descriptive act together, an academic encounter with this strand of Rabindranath Tagore’s legacy will be called for.

6.4.2. Diglossia in Dravidian languages

By E. Annamalai

Traditional grammars of Dravidian languages recognize varieties of the language they describe. Tolkāppiyam, the first grammar of Tamil of the early centuries of the Common Era, mentions twelve regional varieties of Tamil (Chevillard 2008). It also mentions two non-regional varieties of Tamil, viz. ceyyuḷ ‘that which is composed (in verse)’ and vaḻakku ‘that which is in practice (in life)’. Modern linguistic scholars of Tamil take these two terms to refer to the language of literature and the language of speech, respectively, and assume that they differed grammatically. Another variety of Tamil is found in inscriptions that are contemporaneous with literary works from the beginning of the written history, and its grammatical and lexical features are at variance with those of the contemporaneous literary language. The first grammar of Telugu by Nannaya of the 11th century (some place him in the 16th century) classifies the words of Telugu origin into dēsya ‘of the (Telugu) country’ and grāmya ‘of the village’, of which the latter is not fit for literature. Modern Dravidian languages have variations which are correlated with regional-social divisions, with class (educated and uneducated), and with the medium of communication (writing and speaking). Theorizing a particular kind of language variation, Ferguson (1959) introduced the concept of diglossic difference, setting it apart from differences between dialects, styles, and registers. He suggested that Tamil is illustrative of it. Though this concept is largely ignored by present-day traditional grammarians of Dravidian languages, who reject the lower of the diglossic varieties as unworthy, linguists have enthusiastically embraced it and examined their languages to validate the existence of diglossia. Ferguson’s original idea was to describe a property of some languages which have two varieties grammatically and functionally differentiated in special ways. 
The concept of diglossia was expanded by others (Fishman 1967) to describe a speech community using functionally differentiated codes, whether the codes are grammatically related (i.e. varieties of a language) or not (i.e. distinct languages), and was brought to overlap with bilingualism. It should, however, be said that in identifying a code, whether it is a distinct language or a variety of the same language, non-linguistic factors play a role; the functional differentiation of Sanskrit and Prakrit (Deshpande 1986), for example, may be treated as an instance of bilingual use or diglossic use depending on how their relationship is conceptualized. Dravidian linguists stayed close to the original concept of diglossia as primarily a property of language rather than of speakers. They, however, expanded the concept in another direction. Analysis of the language of pre-modern literature, which was rarely in prose, and the speech of modern times showed grammatical and lexical differences between them; the former was to be learned formally and was endowed with prestige as the learned language. As these are the characteristics attributed by Ferguson to the High variety in a diglossic situation, the existence

of a literary variety and a colloquial variety in a Dravidian language came to be regarded as evidence for diglossia. The issue to decide was the extent of the grammatical difference between the two varieties. The extent of differences and the grammatical complexity of the language of early literature in Telugu, Kannada, and Malayalam are due to Sanskritization of the literary language both in the (basic) lexicon and in word formation including compounding. This has consequences for phonology (including morphophonemics) and, to a limited extent, for syntax. The case of Tamil is different with regard to the nature of the differences and complexity of its language of literature. There were two parallel languages of literature in Kerala, viz. maṇipravāḷa and pāṭṭu. They differed in the lexicon, with regard to the source words, compounds, and assimilation of Sanskrit loans in phonology. The maṇipravāḷa literary language, practiced by the upper castes and considered superior, was codified in the grammatical work Lilātilakam of the 14th century (Freeman 1998), which sought to give autonomy to this language from Tamil. In the 17th–18th century Eḻuttaccan, who rendered the Mahābhārata in Malayalam (as the language had come to be called by then), drew from folk meters and blended the language of maṇipravāḷa and pāṭṭu literary compositions. The language of Malayalam literature was reduced to one standard (Gopinatha Pillai 1985) and it was less distant from speech than maṇipravāḷa. There was no emergence of diglossia, though this is disputed by some (Gopinathan 1980) on the basis of some grammatical differences between modern spoken and literary Malayalam. There were differences between the language of literature and contemporaneous speech (as evidenced in inscriptions) in Kannada. The language of literature was influenced by Sanskrit.
The phonological forms of spoken Kannada were accepted, though sparingly, by grammarians starting from Keśirāja of the 13th century and by some poets from roughly the same period (Nayak 1967: 26–28). This acceptance paved the way for the difference between the language of literature and speech to become a difference of style. Nayak (1967: 32) describes the difference between literary Kannada and a regional-social dialect and claims that the use of the literary style in everyday conversation would be ridiculed as bookish and, conversely, the use of colloquial style in a public lecture would be ridiculed as boorish. This fits with the attitudinal difference between diglossic varieties. But this relates to the non-standard dialect that Nayak describes. The spoken Kannada accepted as the standard dialect, based on the upper-caste speech of the Mysore region, which is influenced by the literary education of its speakers, is not ridiculed. It has phonological features in lexical forms that are common with those of the literary style. This standard spoken dialect was accepted in the modern period for the prose that emerged to write literature, textbooks, articles, and news; it was also used in public speech and in the classroom. Any difference between the common literary and standard colloquial is a matter of style and, hence, it does not qualify to be called diglossic.

The language of pre-modern Telugu literature was highly Sanskritized and was the language learned and used in traditional literary education. This literary language was standardized by a trio of poets (kavitraya) from the 11th to 14th centuries (Krishnamurti 1979: 3). Prose was used in inscriptions from the 6th century, but the literature that emerged a few centuries later was in verse (except for some ornamental prose between verses). Prose came to be the language of the new genres of literature such as novels, and of other genres of writing such as textbooks and journals of news and information. The Madras School Book and Vernacular Society that came into existence in 1820 produced school textbooks in different subjects in the language of the prevailing documentary records, which had a mixture of spoken and literary forms. This language was changed to the literary when the Society’s chairmanship went in 1855 to a literary scholar and teacher, Chinnayasuri, who wrote a grammar, Bālavyākaraṇamu, codifying the literary language, called grānthika, for its modern users. The colloquial language, called vyavahārika, was condemned as unsuitable for the new prose, literary as well as expository. There was a long struggle against this by social reformers and novelists that involved changing the mindset of the government, university authorities, and the general public. These modernists finally prevailed over the traditionalists, in phases and after initial failures, in the use of siṣṭavyavahārika, which is a polite or polished colloquial language, in different domains including education and mass communication (Krishnamurti 1979). This variety is closer to the standard spoken Telugu based on the upper-caste speech of coastal Andhra and has features drawn from the simplified literary variety (saralagrānthika). The above social developments prevented the emergence of diglossia in modern Telugu, though the old grānthika style has not totally gone out of use (Radhakrishna 1980).
The language of literature in Tamil has a long history of two millennia and, among the literary Dravidian languages, was the least influenced by Sanskrit. This language did change over time morphologically, syntactically, lexically, and semantically, but the phonology, i.e. the spelling of words and combinatory alterations in simple words (sandhi), remained constant with some minor changes in the permissible sequence of phonemes (letters in spelling) in a word. Changes in morphology and syntax were evolutionary with no drastic shifts; changes in the lexicon were from contact with Sanskrit; changes in meaning reflect changes in the world view of Tamil society. Scholars have speculated on the impact of speech on these changes in literary Tamil, presuming co-existence of two parallel varieties, literary and spoken. This is based on the evidence from inscriptions with regard to the phonological forms of words. It is difficult, however, to demonstrate the impact of speech on the grammar, setting it apart from evolutionary grammatical changes.

The language of inscriptions in Tamil is as old as its language of literature. The difference between the two is transparent, the major one being the prose used in inscriptions. Scholars (e.g. Velu Pillai 1971: 52) have observed that the language of inscription is closer to spoken Tamil by the evidence of occurrence of non-literary phonological forms of words in inscriptions. But the inscriptions also show the phonemes (voiced and aspirated stops, sibilants) of Prakrit and Sanskrit, mostly in loan words. This spelling, however, is not stable, and it alternates with forms written with literary and assimilated phonology (spelling) (Agesthialingom & Shanmugam 1970, Velu Pillai 1976). The loan word phonology suggests that the inscriptional language is not identical or even close to the colloquial language in phonology in that the loan words are not assimilated to the native phonology as they are in the speech of the common people. It is likely that the language of inscriptions represents a documentary language register of Tamil and that the influence of the spoken language and Prakrit/Sanskrit on this register is more than on literary writing.

The Tamil language entered its modern period in the 19th century, with the production of textbooks, science books, translations from English, novels, newspapers, and magazines. The language came to be used for writing non-literary content (going beyond the content of inscriptions), and these writings were in prose. The College of Fort St. George, established in Madras in 1812 to teach language to British officers, compiled textbooks in Tamil which included prose pieces. Though it had come to be used for religious propagation by Christian missionaries two centuries earlier, prose later became the new language of literature. To create a prose for the new uses, the authors had these models to base the new prose on: the language of poetry, of commentaries (literary, grammatical, and religious), of inscriptions, and of speech. None of the traditional grammatical treatises was written for the language of speech, which deprived speech of legitimacy. The only grammars that included spoken Tamil in their description were written by foreign missionaries in a language other than Tamil (such as Portuguese or Latin), but they did not change the status of spoken Tamil.
The fact that there were only dialects in speech and no standard dialect with wide acceptance until the later part of the 20th century was also probably a factor in discouraging the adaptation of the spoken language, or any modified version of it, for writing prose. The prose of inscriptions and of religious commentaries was of a special register; the maṇipravāḷa prose of the commentaries of Vaishnavite hymns was highly specialized and sectarian. The compressed and metrical language of poetry was unsuited for prose. The pedagogical language used in traditional learning settings to explicate and interpret the versified literature had elements of literary and grammatical commentaries. Multiple kinds of prose coexisted for the purposes of rendering purāṇas and other literature in prose, propagation of religion, and secular teaching of Tamil. The prose gradually became relatively simpler by reducing old morphological forms and the complexity of sentences, but it kept the conventional spelling of words. This variety, with modifications, began to be used in the new literature in prose as well as in essays, news reports, textbooks, and in public speech and classrooms. The motivation for the simplifying process was not, however, to close the gap between writing and speaking. As a result, modern Tamil ended up having two varieties that are variously named literary/written/formal and colloquial/spoken/informal. These two varieties are identified respectively with H(igh) and L(ow) varieties of diglossia by modern linguists. They satisfy the structural and functional properties identified by Ferguson with the two diglossic varieties. The first of the two varieties carries prestige, is grammatically sanctioned, is learnt through formal instruction (i.e. it is a taught language; competence in it is restricted to the educated class more than the standard dialect), is derived from the language of culturally valued and historically antecedent literature, is grammatically complex in the sense that it has more than one morphological and syntactic form to express the same meaning, and has more synonyms (Ramaswamy 1997: 126, 172–197). The second variety, believed to lack grammar or to be grammatically deviant, is acquired at home and in the streets in the process of primary socialization, and is lacking in cultural value and prestige.

These two varieties are functionally differentiated as well. The H variety is used in social situations where the relation between the communicators is distant and impersonal; they include public speech (including religious discourse), classroom instruction, interviews, and writing. The L variety is used in social situations where the relation between the communicators is intimate and personal; they include conversations, instructions, curses, and jokes. Because of these properties, the differences between the two varieties cannot be equated with two styles or registers or with a standard/non-standard dialect distinction. The two varieties of Tamil qualify to be called H and L varieties. It should be noted, however, that the choice of a variety is not categorical in specific situations (Schiffman 1997: 210), nor does the specific situation remain invariant. Jokes in a public political speech and admonitions in the classroom are in L.
The early political diary of Ananda Ranga Pillai of the 18th century was in L, but his personal life diaries are in H or in L. The medium may change the choice of a variety for the same function; written instructions are in H. The chosen variety in a specific situation may not be uniformly used; gossip columns in a printed magazine may be in L. The linguistic features of each variety are not fixed; they make a range which allows the lower end of the range in H to come close to the higher end of the range in L. Personal letters, though written, may be open to L variety features depending on intimacy. Such diffusion in correlating form and function in language use is natural; it does not disprove the existence of diglossic varieties.

The characteristic differences in prestige, acquisition, literary heritage, and grammatical complexity between the diglossic varieties are predicated on grammatical and lexical differences between the varieties. The use of the tag of diglossia for Tamil crucially depends on this linguistic distance. Lesser linguistic distance is the difference between standard and non-standard dialects or between elevated and ordinary styles, even if they meet some of the social and psychological attributes of the diglossic difference. Scholars have made divergent claims about the depth of the linguistic distance in Tamil. For some, H and L varieties of Tamil are as different as distinct languages (Caldwell 1991: 81, Shanmugam Pillai 1960: 27) or are opposite ends of a pole (Zvelebil 1964: 237). For others, they share a large common core and constitute an unstable spectrum (Arokianathan 1988: 41–65). For Ferguson, the linguistic distance between H and L is less than the distance between languages but more than that between dialects, which Britto (1986: 10) calls the optimal distance to define diglossia. There is, however, no objective measurement to determine grammatical distance.

The broad contours of the linguistic difference between H and L in Tamil are the following.[27] They differ enormously in the phonological structure of words. L avoids word-final consonants by dropping word-final nasals and subsequently nasalizing the preceding vowel, and also avoids final laterals and /y/ under certain conditions; it tolerates more clusters of stops and nasals with liquids and /r/ in word-medial position, but assimilates or reduces heterogeneous consonant clusters in medial position; it has palatalization of dental stops and nasals triggered by preceding front vowels; it has vowel harmony of high back vowel to a front vowel in the preceding syllable, centralizes short high front vowels in medial syllables, and entertains lowering of short, high front and back vowels to mid vowels before a low vowel in the following syllable; it also has some analogical changes in the phonological structure of words. The scholarly consensus is to give historical precedence to H (Sethu Pillai 1974, Mahapatra 1985) or underlying status to it (Ramaswami 1997), and to derive the phonological forms of L from H. There is an enormous difference in sandhi (which covers morphophonemic changes and writing conventions) between H and L varieties. Sandhi in the writing of classical literature (as evidenced in palm manuscripts) extended to combining clauses and even sentences. It is reduced to internal sandhi (combining of morphemes within a word and in some compound words) in the modern H variety.
It is unstable across words in this variety, in the sense that sandhi across words may be present in some cases, but not in others, or may be present in the same word sequence sometimes but not at other times by the same writer. Reduction of sandhi is a simplification in H of the classical literary writing (Annamalai 2011: 39–42). H and L differ in some respects in the internal morphological structure (arrangement and number of morphemes) of words, but differ substantially in the inflected morphological form as a result of the phonological differences described above. The structural differences include absence of the empty morphs (cāriyai) to link morphemes; loss of contrast between neuter singular and plural in verb agreement; tripartite structure of verbs (stem + tense + agreement) becoming bipartite structure (stem + agreement [portmanteau]) in neuter in one class of verbs; differentiating in form between neuter finite verbs and action nouns by the absence and presence of tense respectively; choice of present tense or future tense marker to refer to non-past tense in relative participles, participial nouns, and action nouns; loss of contrast between the two negative verbs, viz. negation of existence and negation of identity; and grammaticalization of some semi-lexical morphemes and some words. These morphological features of the L variety are relatable to those of the H variety by a set of rules. The difference in syntax between the two varieties is minimal, including the presence of the passive construction in H. The H variety has carryovers from the older literary language, which leaves H with more than one sentence structure for the same proposition. Syntactic innovations in L, which are very few, are considered erroneous in H in its higher range, such as classroom composition and essays on literary and scientific content. The carryover is true of the lexicon also. H has synonyms, some of which are not used in L. H stipulates words that are of Tamil origin etymologically as appropriate (this also applies to the newly coined technical terms). It does not admit words that are specific to colloquial Tamil or words spelled as they are pronounced in colloquial Tamil. It does not resist words from classical Tamil of the past as it does contemporary colloquial words. Carryover of the meaning of words from the language of old literature can be found in H. That is, words in modern H become polysemic, adding some of the meanings they had in classical Tamil. Resistance to bringing the two varieties closer is strongest with regard to the phonological form of the words in H and L, as mentioned earlier. These and other differences are to be learned in school. Given the linguistic distance added to the functional characteristics attributed by Ferguson to diglossic varieties, Tamil is a case of a diglossic language.

[27] Differences attributable to literate and oral modes of language in communication, which are based on how the information is packaged and are universal in nature, are, however, not counted (Annamalai 2011: 56–62).

There are some sociolinguistic and socio-political factors that explain the difference between Tamil and other Dravidian languages with regard to diglossia.
The classical literary language of Tamil was not Sanskritized as were other Dravidian languages. There were no conscious efforts in the literary history of Tamil to bring the literary language closer to speech, though that language changed over time. This lack of bridging is almost absolute with regard to the phonological form (spelling). Spelling has become a cultural heritage. There was a Tamil renaissance in the 19th century built around the printing of classical literary works, which increased their accessibility to Tamil readers and fed into Tamil nationalism in the political arena. This endowed the old literary language with special cultural significance and the Tamil community with motivation for its preservation. This was nurtured by a movement in the next century to eschew words of Sanskrit explicitly and all foreign words and colloquialism implicitly. This movement succeeded in written Tamil, increasing its lexical distance from spoken Tamil (Annamalai 2011: 26–28). To maintain a high variety of Tamil was thus a political-cultural need. The need for this variety was increased by the absence of an accepted standard spoken dialect at a time when Tamil confronted the demand of new uses for prose at the beginning of the modern period.

There is a question of whether Tamil diglossia is a stable phenomenon. A characteristic of diglossia, according to Ferguson, is relative stability. The two varieties of Tamil are not immutable and both change by mutual influence, but the changes do not make them one. The question of stability has two dimensions — from the points of view of the past and of the future. Some Tamil scholars (Deivasundaram 1981: 19, Arokianathan 1986, 1988: 23) take it to be in existence from the beginning of the recorded history of Tamil on the basis of the mention of ceyyuḷ and vaḻakku in Tolkāppiyam and the difference in the language of inscriptions, which suggests the influence of a spoken language. Britto (1986: 109) and Ramaswami (1997: 29) place fully-formed diglossia a few centuries later. Britto’s observation is based on a speculation that literary Tamil (centamiḻ) ceased to be identified with one dialect at some point. There is no hard evidence that the varieties in the past were grammatically distant enough for them to be called H and L varieties, or close enough that they be called different styles. Tolkāppiyam’s distinction may parallel, but not be identical with, the distinction between Sanskrit and Prakrit, or it may be the distinction between the literary language with its conventions and the ordinary language with its naturalness. The features of spoken Tamil that are not found at all in literary Tamil but are found alternatingly in inscriptions are the different phonological forms of words that resemble the form of modern colloquial Tamil. This fact may not be sufficient to call the grammatical distance optimal in the sense of Britto (1986: 10). Moreover, there is no evidence in grammars or in literature for the prevalence of the social and psychological attributes of the two varieties in the pre-modern period. 
Only if the absence of evidence for hierarchical functional and optimal grammatical (other than phonological) differences is ignored could diglossia be said to have existed from the earliest history of Tamil. Otherwise, diglossia in Tamil should be related to the emergence of prose for new uses at the beginning of the modern period.

With regard to stability in the future, the currently dominant cultural-political ideology is to have the H variety in education, both for teaching Tamil and other subjects (Britto 1986: 267–284). There is a cultural-heritage need to have it as the variety of Tamil shared by the Tamils in India and Sri Lanka, whose spoken dialects have diverged greatly (Yesudhasan 1980). These are factors that will support the maintenance of diglossia. But the diglossic boundary, which has already become porous, will become even more hazy. L is no longer a collection of dialects; there is now a standard spoken Tamil of educated speakers, whose number is increasing. The standard dialect is not the adoption of a particular regional-social dialect, but has developed through a process of dropping features from one’s dialect that mark the speaker’s regional-social background. The dropped feature can be replaced by a neutral feature common to more dialects or from literary Tamil (Annamalai 2011: 75, Gnanasundaram 1980: 76). Writing allows L to be used in more and more situations, all of which have the shared thrust of reaching the readers and being closer to life in communication. So do formal speech and interviews, since the electronic media and a democratic polity draw less-formally educated people into public speech and interviews. In spite of their best efforts to use H, they slip into L because their mastery of H is less than optimal. Increasing fascination with English education makes the English-medium educated less competent in H and more withdrawn from it, so that any speech or interview they give in their subject of specialization is mixed with L or English. The digital age takes away the power of editorial control. The convention of using H in the conversation of historical and mythological figures in stories and films is diluted. The literary strategy of using H for the author’s voice in the narrative passages in fiction and L for characters’ voices in conversation (Shanmugam Pillai 1965), though never categorical with regard to the occurrence and shape of lexical and grammatical forms (Deivasundaram 1981: 54–64, Schiffman & Arokianathan 1986), has begun to yield place to using L for both. The grammatical convergence between H and L is accelerating in non-scholarly and non-analytical writings (Annamalai 2011: 46–55). Their differentiation, however, is maintained by H keeping the conventional phonological form (spelling) of words and the alternating morphological forms of words. A diglossic situation less distanced formally but still hierarchized in prestige may continue to exist in Tamil, differing from a standard/non-standard dialect distinction in that spellings unreflective of the standard speech and free-varying morphological alternatives of H will have to be learnt formally in schools superposed on L.

6.5. South Asian pidgins and creoles

By Ian R. Smith

6.5.1. Introduction

There are six known pidgins/creoles or pidgin/creole-like languages in South Asia (Smith 2008): Nagamese, Bazaar Hindi, Indo-Portuguese (with several mutually unintelligible sub-varieties), Sri Lanka Malay, Veddah, and Butler English. This section will focus on these languages, but mention should also be made of two pidgins based on South Asian languages spoken outside the area: Fiji Pidgin Hindustani (Siegel 1987, 1988, 1990) and Gulf Pidgin Urdu, about which little is known. Also noteworthy, but beyond the scope of this section, is the fact that speakers of South Asian languages have had a significant role in the development of some pidgins/creoles spoken outside the region, such as Pidgin Madam, a pidginized Arabic spoken by Sinhala-speaking domestic servants in Lebanon and other parts of the Middle East (Bizri 2010).

Pidgins and creoles are new languages that result from the need for communication between groups having no common language. Prototypical pidgins are minimal systems used in restricted social circumstances such as trade or employment. Creoles have become the native language of some community and are full language systems, like any non-creole. Pidgins that become useful beyond their initial restricted setting may expand, developing the necessary lexical and structural resources until they are fully elaborated systems, without necessarily acquiring native speakers. Both prototypical pidgins and prototypical creoles take their lexicon predominantly from one language (the LEXIFIER), usually the sociopolitically dominant one in the contact situation, but the grammar develops under the influence of the SUBSTRATE languages, i.e. those spoken by the nondominant groups.

The definition of CREOLE is the subject of considerable debate among creolists, and the characterization offered above is seen by some as an “improper generalization” [‘généralisation abusive’] (Chaudenson 1995: 13). Mufwene (1997, 2000) argues that CREOLE is not a category that can be defined on the basis of general sociohistorical conditions linked to typological outcomes. Rather it can be applied only to ‘a group of vernaculars whose developments are similar especially in their temporal and geographical positions, viz. in tropical colonies settled by Europeans practicing slave-based economy from the 17th to the 19th centuries’ (2000: 78, following Chaudenson 1989, 1992). According to this minority view, then, this section is superfluous, as no such creoles are found in South Asia. This existential debate is typical of the intellectual ferment within creolistics, a field that could legitimately be termed the “wild west” of linguistics. For the present purposes, I will take pidgins and creoles to be natural languages whose transmission has been maximally disrupted through untutored second language acquisition in contexts where access to the normal spoken variety of the lexifier is restricted.
The number of South Asian pidgins and creoles is rather small for the size and population of the region, but the languages are significant in that they have different social and structural characteristics from the Atlantic creoles which served as the main developing ground for creole theory in the second half of the 20th century. While all of the Atlantic creoles are lexified by European languages, four of the six South Asian candidate languages are lexified by non-European languages; all but a few of the Atlantic creoles developed in circumstances involving the displacement of substrate language speakers to work on plantations under conditions of slavery, but no displacement of substrate language speakers took place in the development of South Asian creoles. South Asian creoles are thus important for the perspective they can offer on the relationship between differing input conditions and outcomes of the creolization process. The remainder of this section looks at each of the languages in turn, summarizing the theoretical issues to which each has contributed.

6.5.2. Indo-Portuguese

The Portuguese colonial period in South Asia began with Da Gama’s discovery of a sea route from Europe in 1498. Trading bases were established at numerous ports around the coast of India, from Diu to Hugli, and also in Sri Lanka, in many of which Portuguese-lexified creole languages developed. Indo-Portuguese creoles continue to be spoken in Korlai near Bombay (Clements 1996), Daman (Clements & Koontz-Garboden 2002), Diu (Cardoso 2009a), and until recently Cochin[28] in India, and in Batticaloa (Smith 1977, 1979, 2001) and Trincomalee in Sri Lanka. No creole appears to have developed in Goa, the administrative centre of the Portuguese Estado da Índia. As in other parts of Asia, Portuguese-based creoles have remained continuously in contact with their substrate languages, but contact with their lexifier, Portuguese, differed from region to region. The Portuguese were dispossessed of their last holdings in Ceylon by the Dutch in 1658, and the creole had very little further contact with Portuguese speakers.[29] By contrast, the Portuguese administered Daman and Diu up until 1961, when they were relieved of the task by the Indian government. The lexifier thus maintained a strong presence from colonization almost up to the present. Between these two extremes lies Korlai, where there was no Portuguese administration after 1740, but the presence of a Portuguese priest kept at least a minimal contact between the creole and its lexifier (Clements 2009, Clements & Mahboob 2000). On the basis of a comparison of various features in the Korlai and Diu creoles, Clements (2009) shows that substrate influence had a greater impact on Korlai than on Daman and Diu creoles, arguing that the greater presence of the lexifier in the latter had inhibited substrate influence. This finding is in consonance with the fact that Sri Lanka Portuguese exhibits even greater amounts of substrate influence (Smith 1979).
Smith (2012c) tests the hypothesis on word order features in a wider sample of eight Portuguese- or Spanish-based creoles, four from South Asia and four from Southeast and East Asia. A statistically significant correlation is found between stronger lexifier presence and conformity to lexifier word order. These findings indicate that although Sri Lanka Portuguese now exhibits morphosyntactic conformity to its substrates, it was probably more akin typologically to its lexifier in the early stages, but has subsequently undergone ADSTRATE influence from the (former) substrates during its long period of isolation from its lexifier. (A similar argument can be made for Philippine creole Spanish.) In other words, its modern structure is as much a product of convergence(/metatypy) under conditions of bilingualism as of pidginization/creolization under conditions of untutored second language acquisition. Indeed, ongoing contact-induced syntactic change is observable in Korlai Portuguese (Clements 1991). That said, a number of close similarities in the Dravidian typological elements of Malabar Portuguese (Cardoso 2013) and Sri Lanka Portuguese possibly indicate a very early presence (at least variably) of certain South Asian features in these creoles; further work is needed, however, to rule out the possibility of parallel development. Finally, comparing modern spoken Sri Lanka Portuguese with 19th century texts, Bakker (2000a, 2000b) argues that adstrate influence is recent; Smith (2011), on the other hand, claims the 19th century texts were written in a missionary-created high variety that masked the presence of a spoken low variety that has not changed radically since then. No clear distinction has been found between contact-induced developments that take place under bilingualism and those that occur under pidginization/creolization, though Smith (2002, 2013) posits that untutored second language acquisition may result in the grammaticalization of elements whose semantics and syntax in the lexifier bear little relation to the new use. Such “hijacked” forms and constructions are claimed to be rare in convergence in the context of bilingualism.

[28] The last speaker is reported to have passed away in 2010 (Pradeep 2010).

[29] A few monks and priests from Portuguese India worked in Sri Lanka from 1687 to 1842, mostly under the auspices of the Portuguese Padroado, and presumably spoke a variety of Portuguese. They were largely replaced from 1842 by Italian- and French-speaking missionaries sent from Europe by the Propaganda Fide (Don Peter 1996).

6.5.3. Nagamese

The mountainous terrain of Nagaland has produced a large number of languages in a small area. The Nagas have a long history of trade with neighbouring areas of Assam, which likely provided the context in which Nagamese, an Assamese-lexified pidgin, developed (Sreedhar 1974: 38). Because of its utility for intertribal communication, Nagamese followed the path of expansion mentioned in the introduction and became a fully-elaborated language which functions as the chief lingua franca of Nagaland. One community (the non-Naga Kacharis of Dimapur) has adopted it as their L1. Descriptions are available in Sreedhar 1974 and 1985 and Bhattacharjya 2007. Bhattacharjya notes several ways in which Nagamese differs from a prototypical pidgin:

Its genesis and development involved no conquest or colonization, no plantation slavery or major population displacement, and there was no dramatically unequal power relationship between substrate and superstrate speakers. Morphologically [Nagamese] is more complex than P[idgins]/C[reole]s based on European languages, and it has SOV word-order, like its superstrate (2007: 238).

Rather than being non-prototypical, in my view Nagamese indicates that a slave-based plantation economy and a European colonial context are not attributes central to the pidgin prototype. Of greater weight are frequent contacts in a restricted domain, such as trade, and untutored second language acquisition. Other pidgins lexified by non-European languages also developed from trade contexts not associated with European colonial plantations, e.g. Hiri Motu (Dutton 1985) and Yimas-Arafundi pidgin (Foley 1988, 1991), both spoken in Papua New Guinea. As for its morphological complexity, Smith (2008) argues that the regional differences in noun suffixes as well as the optionality of many case-forms (with syntactic alternatives for all but one case) reported by Sreedhar (1974) indicate that early Nagamese likely lacked case morphology and the regional differences reflect independent developments. Again we see the need to consider that creole languages are subject, like all other languages, to both internal and contact-induced change (cf. Cardoso 2009b: 17, citing Arends 1989: 89).

6.5.4. Bazaar Hindi

Bazaar Hindi is a simplified form of Hindi/Urdu used, particularly outside the Hindi/Urdu heartland, when English is not available as a lingua franca. No work has appeared on the language since the pioneering reports of Chatterji 1931 (on Calcutta), Chernyshev 1971, Apte 1976 (on Bombay), and V. D. Singh 1981 (on Shillong). Researchers have noted lexical and grammatical influence from dominant local languages, but the varieties also have traits in common, such as the use of ham (‘we’) as a first-person singular pronoun and the invariant future in -egaa, generalized from the Hindi third-person singular future form. These could reflect common simplification strategies on the part of Hindi speakers, or the areal spread of features through speaker mobility. A sociolinguistically sophisticated study of one of the varieties would be of great interest. From a wider perspective, the close genetic relationship between Bazaar Hindi and its substrates in the Indo-Aryan region has allowed it to exhibit more lexifier morphology and have a more porous lexicon than is typical for a pidgin (Smith 2008).

6.5.5. Sri Lanka Malay

The Sri Lanka Malay community dates from around 1640, when the Dutch brought soldiers recruited from their East-Indies possessions to assist them in ousting the Portuguese from Sri Lanka. During their rule, the Dutch continued to bring troops as well as convicts and exiled nobles. Although these arrivals were of varied backgrounds, they spoke Vehicular Malay as a common lingua franca. Many of the single men (particularly among the soldiers) found local wives, especially in the island’s existing Tamil-speaking Muslim community. After control was ceded to the British in 1796, Malay troops continued to be recruited for the Ceylon Rifle Regiment until it was disbanded in 1873 (Hussainmiya 2008). A full-scale description of the language now exists (Nordhoff 2009).

Ian R. Smith

The genesis and development of Sri Lanka Malay is a fiercely contested topic. The language has been classed as a creole (e.g. Hussainmiya 1986, De Silva Jayasuriya 2002, Smith, Paauw & Hussainmiya 2004), a (bilingual) mixed language (e.g. Ansaldo 2008, Ansaldo & Nordhoff 2009, Meakins 2013), a product of convergence/metatypy/“conversion” (Bakker 2000a, 2006, Ansaldo 2008, Nordhoff 2009, 2012), and a creole that has subsequently undergone convergence/metatypy (Bakker 2000b, 2003). One of the chief diagnostic features of early creoles is the absence (or near absence) of inflectional morphology. The fact that this condition was already present in Vehicular Malay makes the categorization of Sri Lanka Malay difficult on the basis of structure, particularly given the lack of early data and the subsequent accretion of complex morphosyntax on the Lankan model. Several fundamental issues are in dispute which broadly divide scholars into two opposing camps. One camp argues for a close historical relationship between the Malays and their fellow Muslims, the Moors — a relationship that extends to significant intermarriage between these communities (e.g. Hussainmiya 1986, 2008, Slomanson 2013). Not surprisingly, this group argues for the primary influence of Shonam (Sri Lanka Muslim Tamil) on the development of Sri Lanka Malay, with Sinhala playing a secondary, mainly post-independence role (e.g. Smith, Paauw & Hussainmiya 2004; Slomanson 2011; Smith 2012a, 2012b). The other camp argues that relationships between the Malay community and the numerically dominant (and currently politically dominant) Sinhalese community could not have been less intense than those between the Malays and the Moors. Further, this group disputes the evidence for significant exogamy among the Malays and claims that interethnic marriages did not favour Moors (e.g. Ansaldo 2005, 2008, 2011; Nordhoff 2009). 
For this group, Sinhala is of at least equal importance in the genesis and development of Sri Lanka Malay (e.g. Ansaldo 2005, 2008, 2011; Nordhoff 2009, 2012, 2013). The papers in Nordhoff (ed.) 2013 reflect various approaches to the issues. A further contentious topic is the timing of Lankan influence on the language. As with Sri Lanka Portuguese, Bakker (2000a, 2000b) posits recent rapid convergence, while Smith and Paauw (2006) argue for 19th century diglossia with early rapid developments. Ansaldo (2005, 2008), Ansaldo and Nordhoff (2009), and Nordhoff (2009), on the other hand, argue for a slow process of convergence. A less controversial line of research examines the regional origins of the Malay component of Sri Lanka Malay. Adelaar (1991) identifies Maluku Malay as the source of many phonological and lexical traits. Paauw’s extensive lexical studies (2004, 2013) indicate an equally significant role for Java Malay.

6.5.6. Veddah

DeSilva (1972) and Dharmadasa (1974) argue that the variety of Sinhala spoken by the Veddahs, an aboriginal community of Sri Lanka, shows grammaticalization and lexicalization patterns reminiscent of those found in pidgins and creoles. Unfortunately, little is known about the sociohistorical context in which Veddah developed or about the original language of the Veddahs. No recent work on the language has appeared, and the language is reportedly no longer used (Lewis 2009).

6.5.7. “Butler English”

Butler English (Hosali 2000, 2005; Hosali & Aitchison 1986) is the English of poorly educated service workers, particularly in the context of the British Raj. Terms such as “Kitchen English” and “Babu English” refer to similar phenomena (Kachru 1983: 25). Since its prime function was (is) to communicate with competent English speakers, rather than with speakers of other Indian languages, it remains mutually intelligible with English and has not developed an independent stable norm like a true pidgin. Thus, while it exhibits reduction and simplification reminiscent of pidgins, it is more like a collection of interlanguages, similar to the German of foreign workers in Germany (Blackshire-Belay 1993).

6.5.8. Observations and conclusion

From the above survey some common research themes arise. First are efforts to distinguish among different products of language contact through the relationships between their sociohistorical contexts, the linguistic processes involved, and the linguistic products observed. Second is the recognition that these languages have accretions of contact-induced change (as well as possible internal change) that obscure their earlier form. Third is the need to be sensitive to differences between spoken and written material. None of these themes is limited to South Asia, but they are particularly acute in the region because of the patterns of long-term minority language maintenance, the linguistic practices associated with extensive and intensive bi- and multilingualism (Nadkarni 1975), and the penchant for diglossia. Of the six languages surveyed four are endangered. Sri Lanka Malay is now quite well studied due to the efforts of a critical mass of interested scholars. Sri Lanka Portuguese and Butler English are comparatively less well known, and further data collection is needed before they disappear. Nagamese and Bazaar Hindi are not endangered, but have not attracted the attention of many investigators, despite the opportunities they present for fruitful and significant research. Veddah is also comparatively little researched, but time to study it may have already passed.

6.6. South Asian languages in diaspora

By Tej K. Bhatia

6.6.1. Introduction

Although no firm figure is available, based on national censuses and other sources it is estimated that the South Asian (SA) diaspora is fifty million strong and continues to grow in the age of globalization. According to The Economist (Nov 19–25, 2011, p. 13), some twenty-two million Indians alone are currently scattered around the globe. However, the SA diaspora is neither recent nor a product of globalization alone. In order to understand the salient features of South Asian languages in the diaspora, it is imperative to examine the causes and history which led South Asian people to leave their homelands. For the purpose of this paper, the SA diaspora can be classified into three broad periods: Pre-Indentured Era, Indentured Era, and Post-Indentured Era. The Pre-Indentured Era (330 BCE–800 CE) can best be characterized as an Ancient Indian diaspora. During this period Hindu sages and Buddhist monks traveled to distant lands in search of knowledge and the meaning of life, and to spread the word of the Buddha. Indian traders also traveled in search of new trade, wealth, and skills. The linguistic consequences of the ancient diaspora are still visible in the form of Sanskrit, Tamil, and Pali-Prakrit influence as far away from India as Central Asia and Southeast Asia. This was followed by the darkest chapter in the history of the Indian diaspora, the Gypsy or Roma diaspora (see Bhatia 2001 for more details). The Indentured Era migration of Indians began in the nineteenth century, soon after the abolition of slavery in the European colonies. This marked the onset of the indentured labor system (often called the “Coolie” system). Under this system Indians were brought to the Caribbean, Africa, and islands in the Indian and Pacific Oceans to work on sugar plantations, to fill a gap left by the emancipation of slaves, and to seek economic opportunities abroad. 
Although the indenture system was terminated in 1920, Indian immigration into the British colonies continued. Indian migration during the Indentured Period was not entirely indentured, though. Non-indentured migration to Southeast Asia and free migration to Africa were the two notable waves of migration during this period. See Mesthrie 2008 on the second diaspora. The Post-Indentured Era can be termed the “modern/new diaspora” (i.e. the diaspora during the post-independence period since 1947) during which Indians, Pakistanis, Bangladeshis, and Sri Lankans set out to the United Kingdom, the United States, Canada, Australia, and New Zealand. By 1990, approximately two-thirds of Mauritians, more than half of Fijians, about half of Guyanese, and about one-third of Trinidadians were of Indian ancestry. Additionally, the Indian diaspora has a marked presence in many other countries (Malaysia, Singapore, Hong
Kong, Kenya, Nigeria, South and East Africa, the United Kingdom, the Netherlands, Germany, Australia, Canada, the United States, and many countries of the Middle East). The forces of globalization, information technology (including the flow of information across borders), and the entertainment industry (“global Bollywood”) have added yet another facet to the modern Indian diaspora. Due to historical factors, during the Indentured Period the Indian diaspora must be considered together with the Pakistani and the Bangladeshi diasporas. Therefore, the term “Indian diaspora” refers to all South Asian countries and their languages. Although the Indian diaspora is the primary vehicle for the spread of SA languages abroad, non-Indians such as Dutch, British, and French merchants and missionaries also played a secondary role in the spread of SA languages.

6.6.2. Major resources and accomplishments

Of the three stages of the Indian diaspora, the study of the Indian/SA language diaspora during the Indentured Period represents the core of language diasporic studies.

6.6.2.1. The Ancient Indian diaspora

Notable works on the ancient Indian diaspora include Gonda 1997 and Zakharyin 2013 on the influence of Sanskrit on Southeast Asian languages, primarily in the context of religion and literature. Patnaik 2003 discussed the rise of maritime trade of the Kalingas in ancient India. Parya and Romani represent diasporas of a different nature. Parya is a Central IA language now spoken in the border regions between Tajikistan and Uzbekistan, whose speakers migrated to their present location perhaps in the late 19th century. See Payne 1997 for a general treatment, and Abbess et al. 2005 for discussion of the surprising vitality of this minority language. Romani, another diasporic Central IA language, has absorbed multiple influences over the course of its historical trajectory and diversified into multiple dialects. Important sources on Romani include Matras 2002, Matras, Bakker & Kyuchkov (eds.) 1997, Proctor 2008, and the work of Ian Hancock (e.g. 1987 and in the Romani Archives and Documentation Center at the University of Texas at Austin, http://www.radoc.net/radoc.php?doc=about_radoc&lang=en). See also Section 1.3.2.8, this volume.

6.6.2.2. The Indentured Era diaspora

The Indentured Era diaspora represents the best researched area of SA language diaspora. Bhatia 1982a and Mesthrie 2008 examine the context, nature, and scope
of this work. The work on the SA language diaspora began in the context of Hindi and Tamil as “international languages” and can be grouped in the following two categories: descriptive contact studies and pedagogical studies. Descriptive contact studies are devoted to the developments in SA languages abroad. In-depth studies on the formation of overseas Bhojpuri and Hindi include Domingue 1971 on Mauritius Bhojpuri; Siegel 1975 and 1987 on Fiji Hindi; Durbin 1973 and Mohan 1977 and 1978 on Trinidad Bhojpuri-Hindi; Bosch 1978 on Surinam Hindi; Gambhir 1981 on Guyanese Bhojpuri-Hindi; Mesthrie 1991 on Bhojpuri-Hindi in South Africa. In addition, a collection of articles on overseas Hindi-Bhojpuri has also been published (Barz & Siegel (eds.) 1988). Two important works of a pedagogical nature are the texts on Fiji and Surinam Hindi by Moag (1977) and Huiskamp (1978), respectively. As is evident from the above discussion, Bhojpuri-Hindi has been the primary focus of study on the transplanted varieties of SA languages. These studies deal primarily with the following four aspects of Bhojpuri-Hindi: (1) demographic information; (2) language/dialect identification and description; (3) language contact and change; and (4) language pedagogy. In addition to Bhojpuri and Hindi, other SA languages (e.g. the Dravidian languages Tamil and Telugu, and other Indo-Aryan languages, namely Panjabi and Gujarati) had a significant presence among the diasporic migrants; however, research on such languages is largely neglected. During the Indentured Era, multiple Indian languages (e.g. Panjabi, Bengali, Marathi, Tamil, and Telugu) and dialects of Hindi (Bihari, Rajasthani, Awadhi, Bhojpuri) came in contact with each other. This led to the evolution of South Asian language varieties which underwent the process of dialect leveling and language convergence as well as divergence. 
The process of dialect leveling led to the elimination of some “sharp”, marked, or idiosyncratic dialectal features on the one hand and convergence of features leading to a “common denominator” on the other. Thus, new speech forms of SA languages emerged due to language contact in each colony. Eventually a new form of Bhojpuri, namely Trinidad Bhojpuri or “Plantation Hindustani” (Tinker 1993: 208), emerged as a lingua franca of Indians, similar to the case of Bazaar Hindi. The process of dialect leveling and koineization has been studied in detail in Siegel 1987, Gambhir 1988, Mohan 1978, Moag 1979, Mesthrie 1991, Domingue 1971, and others. Bhatia 1988 and 1982b and Mesthrie 1991 attempt to focus on language contact and change in the colonies at the supralocal level. For instance, Mesthrie 1991 examines the loan words from Fanagalo pidgin and English in South African Bhojpuri, and Bhatia 1988 and 1982b present an account of code-switching with English in Trinidad Hindi together with word order and morpho-syntactic changes due to contact with Trinidad Creole. The existence of this new Bhojpuri, along with Standard Hindi and English creoles, created a diglossic situation in which Bhojpuri served as a Low variety. In essence, the majority of studies concerning the SA language diaspora are devoted to the historic account of migration from the different linguistic
and geographical regions of India and the change undergone by Bhojpuri due to language contact with creole English and Standard Hindi. Media, missionaries, and government policies played an important role in the introduction of Standard Hindi as the prestige variety over nativized overseas Bhojpuri. A collection of articles in Sharma & Annamalai (eds.) 2003 not only extends the scope of work to the neglected South Indian language diaspora in Fiji, Mauritius, South Africa, Singapore, and Hong Kong during the Indentured Period but also includes articles on new diasporas, namely Gujarati, Panjabi, and Hindi-Urdu diasporas in the UK (Safder 1985). Bhatia 2001 and K. Sridhar 2008 examine SA languages in the USA and their interface with issues of identity and language assimilation.

6.6.3. Journals and other resources

Although there is an online journal devoted exclusively to the South Asian diaspora (South Asian Diaspora, published by Taylor & Francis), most of the articles discuss the experiences of various diasporic communities; they do not focus on specifically language-oriented content. However, linguistics journals such as Language, International Journal of Sociology of Language, Journal of Sociolinguistics, and other journals devoted to bilingualism serve as important potential sources of research on the topic. The South/South East Asia Library of the University of California-Berkeley serves as an important digital resource on the SA language diaspora (http://www.lib.berkeley.edu/SSEAL/SouthAsia/diaspora.html). Another resource is the South Asian American Digital Archive (http://www.saadigitalarchive.org/). It contains some digital images of materials in some South Asian languages, which might furnish data for diachronic studies.

6.6.4. Issues needing further research

Both general and particular issues remain either unresolved or inadequately researched. First, consider that the notion of “diaspora” itself can be defined in a wider sense as “dispersion from the homeland” or in a narrow sense as a “link (physical or psychological) with the homeland”. Unlike the new and the Indentured diaspora, under the narrow sense the Roma diaspora cannot be considered a part of the SA diaspora. Specific linguistic issues fall into three categories: language maintenance, shift, and attrition; typological modeling; and language acquisition. Although dialect leveling and koineization studies have made ground-breaking contributions to the study of SA languages in diaspora, issues of language maintenance, shift, and loss need in-depth examination on both cross-linguistic and explanatory grounds. For example, questions such as why overseas Bhojpuri could not survive in places like
Jamaica but did survive in Trinidad and Guyana need to be addressed (Mufwene 2004: 476). Such issues need further investigation, particularly within the framework of social identity and social accommodation theory on the one hand, and in the context of language choices, language power (English and/or Hindi as killer languages for Bhojpuri), and attitudes involving the indentured population on the other. Typological modeling of the SA language diaspora remains largely understudied. Studies devoted to the theoretical modeling of the developmental cycles/processes of SA languages in the diaspora are lacking. The developmental models of World Englishes such as those by Moag (1992), and Schneider (2003) among others can serve as a benchmark model for adding explanatory power and can test hypotheses such as the Language Bioprogram hypothesis and the relexification hypothesis. Language attrition (inter-generational language transmission) studies like Bhatia 1982b and Mohan & Zador 1986 can provide further impetus to investigate the role of personal, societal, and political bilingualism in bilingual and trilingual language acquisition (Berkes & Flynn 2013). The modern SA language diaspora also needs urgent attention. Although initial progress is under way (e.g. Sharma & Annamalai 2003), examination of the roles of media, globalization, and varying contact with homeland languages can shed light on issues involving bilingual education, language variation, social networks, intelligibility, appropriateness, and acceptance of overseas SA languages in the modern era. Surprisingly, work on Tamil still remains largely neglected in the Indentured as well as modern eras.

Bibliographical References

Abbasi, Muhammad Gulfraz, and Saiqa Imtiaz Asif 2010 Dilemma of usage and transmission: A sociolinguistic investigation of Dhundi-Pahari in Pakistan. Language in India 10: 197–214. www.languageinindia.com/may2010/pakistanpaharifinal.pdf (accessed 19 Dec. 2014)
Abbasi, Muhammad Gulfraz, and Zafar Iqbal Khattak 2010 Official ways to subjugate languages: School setting as a cause of Pahari Dhundi-Kairali decline. Language in India 10: 19–27. www.languageinindia.com/sep2010/pahariprestige.pdf (accessed 19 Dec. 2014)
Abbasi, Muhammad Gulfraz, Zafar Iqbal Khattak, and Sayyam Bin Saeed 2011 Psycho-sociological processes under-mining Pahari teaching in schools and homes in Pakistan. Procedia – Social and Behavioural Sciences 15: 3656–3660. http://www.sciencedirect.com/science/article/pii/S1877042811008986 (accessed 19 Dec. 2014)
Abbess, Elizabeth, Katja Müller, Calvin Tiessen, Daniel Paul, and Gabriela Tiessen 2005 Language maintenance amongst the Parya of Tajikistan. In: John M. Clifton (ed.), Studies in languages of Tajikistan, 25–64. Dushanbe/St. Petersburg: National State University of Tajikistan/North Eurasia Group, SIL International.
Abbi, Anvita 1997 Languages of tribal and indigenous peoples of India: The ethnic space. Delhi: Motilal Banarsidass.
Abbi, Anvita 2008 Tribal languages. In: Kachru, Kachru & Sridhar (eds.) 2008: 151–174.
Abbi, Anvita 2009 Vanishing diversities and submerging identities: An Indian case. In: Asha Sarangi (ed.), Language and politics in India, 299–311. New Delhi: Oxford University Press.
Abel, Rekha 1998 Diglossia as a linguistic reality. In: Rajendra Singh (ed.), Yearbook of South Asian languages and linguistics 1998, 83–103. New Delhi: Sage.
Addleton, Jonathan S. 1986 The importance of regional languages in Pakistan. Al-Mushir 28(2): 55–80.
Adelaar, K. Alexander 1991 Some notes on the origin of Sri Lanka Malay. In: H. Steinhauer (ed.), Papers in Austronesian linguistics, 23–37. (Pacific Linguistics Series A–81, 1.) Canberra: Australian National University.
Agesthialingom, S., and K. Karunakaran (eds.) 1980 Sociolinguistics and dialectology: Seminar papers. Annamalainagar: Annamalai University.
Agesthialingom, S., and S. V. Shanmugam 1970 The language of inscriptions 1250–1350 A. D. Annamalainagar: Annamalai University.
Annamalai, E. 2001 Managing multilingualism: Political and linguistic manifestations. New Delhi/Thousand Oaks, CA: Sage Publications.
Annamalai, E. 2011 Social dimensions of modern Tamil. Chennai: Cre-A.
Anonymous 1978 A selected bibliography of recent publications on aspects of sociolinguistics in South Asia. International Journal of the Sociology of Language 16: 119–121.
Anonymous 2011 International committee proposed for standardization of Wakhi alphabets. http://pamirtimes.net/2011/10/15/formation-of-international-committee-proposed-for-standardization-of-wakhi-alphabets/ (accessed 19 Dec. 2014)
Ansaldo, Umberto 2005 Typological admixture in Sri Lanka Malay: The case of Kirinda Java. MS, University of Amsterdam.
Ansaldo, Umberto 2008 Sri Lanka Malay revisited: Genesis and classification. In: A. Dwyer, D. Harrison, and D. Rood (eds.), A world of many voices: Lessons from documented endangered languages, 13–42. Amsterdam/Philadelphia: Benjamins.
Ansaldo, Umberto 2011 Sri Lanka Malay and its adstrates. In: Lefebvre (ed.) 2011: 367–382. Amsterdam/Philadelphia: Benjamins.

Ansaldo, Umberto, and Sebastian Nordhoff 2009 Complexity and the age of languages. In: Enoch O. Aboh and Norval Smith (eds.), Complex processes in new languages, 345–363. Amsterdam/Philadelphia: Benjamins.
Apte, Mahadev L. 1976 Language controversies in the Indian parliament (Lok Sabha): 1952–1960. In: William M. O’Barr and Jean F. O’Barr (eds.), Language and politics, 213–234. The Hague: Mouton.
Arends, Jacques 1989 Syntactic developments in Sranan: Creolization as a gradual process. University of Nijmegen PhD dissertation. http://dbnl.org/tekst/aren012synt01_01/aren012synt01_01.pdf (accessed 19 Dec. 2014)
Arokianathan, S. 1986 moḻiyiyal: iraṭṭai vaḻakku [Linguistics: Diglossia]. Chidambaram: Manivasagar Nulagam.
Arokianathan, S. 1988 Language use in mass media. Delhi: Creative Publishers.
Asif, Saiqa Imtiaz 2005 Shame: A major cause of ‘language desertion’. Journal of Research 8: 1–13. Multan: Faculty of Islamic Studies and Languages, Bahauddin Zakariya University.
Ayres, Alyssa 2008 Language, the nation, and symbolic capital: The case of Punjab. Journal of Asian Studies 67(3): 917–946.
Ayres, Alyssa 2009 Speaking like a state: Language and nationalism in Pakistan. Cambridge: Cambridge University Press.
Azad, Humayun (ed.) 1984 bangla bhaSa (prothom khOnDo): bangla bhaSabiSOyok probondhoSOnkolon (1743–1983). [Bangla language [vol. 1]: Papers about the Bangla language 1743–1983]. Dhaka: Bangla Akademi.
Baart, Joan L. G. 1997 The sounds and tones of Kalam Kohistani. (Studies in Languages of Northern Pakistan 1.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.
Baart, Joan L. G. 1999 A sketch of Kalam Kohistani grammar. (Studies in Languages of Northern Pakistan 5.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.
Baart, Joan L. G. 2003 Sustainable development and the maintenance of Pakistan’s indigenous languages. In: Proceedings of the Conference on the State of the Social Sciences and Humanities, Islamabad, Pakistan, September 26–27, 2003. http://fli-online.org/documents/sociolinguistics/development_maintenance_of_pak_lgs.pdf (accessed 19 Dec. 2014)
Baart, Joan L. G. 2004 Contrastive tone in Kalam Kohistani. Linguistic Discovery 2(2): 1–20.

Baart, Joan L. G., and Muhammad Zaman Sagar 2004 The Gawri language of Kalam and Dir Kohistan. http://fli-online.org/documents/languages/gawri/gawri_introduction.pdf (accessed 19 Dec. 2014)
Backstrom, Peter C., and Carla F. Radloff 1992 Languages of Northern Areas. (Sociolinguistic Survey of Northern Pakistan 2.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.
Bakker, Peter 2000a Convergence intertwining: An alternative way towards the genesis of mixed languages. In: L. D. G. Gilbers, J. Nerbonne, and J. Schaeken (eds.), Languages in contact, 29–35. (Studies in Slavic and General Linguistics 28.) Amsterdam/Atlanta: Rodopi.
Bakker, Peter 2000b Rapid language change: Creolization, intertwining, convergence. In: Colin Renfrew, Larry Trask, and April McMahon (eds.), Papers in the prehistory of languages, 585–620. Cambridge: McDonald Institute for Archaeological Research.
Bakker, Peter 2003 Pidgin inflectional morphology and its implications for creoles. In: Gert Booij and Jaap van Marle (eds.), Yearbook of morphology 2002, 3–33. New York/Dordrecht: Kluwer Academic.
Bakker, Peter 2006 The Sri Lanka sprachbund: The newcomers Portuguese and Malay. In: Yaron Matras, April McMahon, and Nigel Vincent (eds.), Linguistic areas: Convergence in historical and typological perspective, 135–159. Houndmills, Basingstoke, UK: Palgrave Macmillan.
Bartholomeusz, Tessa 1999 First among equals: Buddhism and the Sri Lankan state. In: Ian Harris (ed.), Buddhism and politics in twentieth-century Asia, 173–193. New York: Continuum (by the editor and authors).
Barz, Richard K., and Jeff Siegel (eds.) 1988 Languages transplanted: The development of overseas Hindi. Wiesbaden: Harrassowitz.
Bashir, Elena 1988 Topics in Kalasha syntax: An areal and typological perspective. University of Michigan PhD Dissertation. ProQuest Dissertations 8821545.
Bashir, Elena 2007 Contact-induced change in Khowar. In: Shafqat Saeed (ed.), New perspectives on Pakistan: Contexts, realities and visions of the future, 205–238. Karachi: Oxford University Press.
Bashir, Elena 2008 Some transitional features of Eastern Balochi: An areal and diachronic perspective. In: Carina Jahani, Agnes Korn, and Paul Titus (eds.), The Baloch and others: Linguistic, historical and socio-political perspectives on pluralism in Balochistan, 45–82. Wiesbaden: Reichert.

Bashir, Elena 2010 Innovations in the Brahui verb system. Journal of South Asian Linguistics 3(1): 23–44. http://tiger.sprachwiss.uni-konstanz.de/~jsal/ojs/index.php/jsal/article/view/21/17 (accessed 19 Dec. 2014)
Benedikter, Thomas 2009 Language policy and linguistic minorities in India: An appraisal of the linguistic rights of minorities in India. Berlin: LIT Verlag.
Berkes, Eva, and Suzanne Flynn 2013 Multilingualism: New perspectives on syntactic development. In: Bhatia & Ritchie (eds.) 2013: 137–167.
Bharadwaj, Vasudha 2011 Languages of nationhood: Political ideologies and the place of English in 20th century India. University of Rochester PhD dissertation. ProQuest Dissertations 3442774.
Bhatia, Tej K. 1982a Transplanted South Asian languages: An overview. Studies in the Linguistic Sciences 11(2): 129–134.
Bhatia, Tej K. 1982b Trinidad Hindi: Three generations of a transplanted variety. Studies in the Linguistic Sciences 11(2): 135–150.
Bhatia, Tej K. 1988 Trinidad Hindi: Its genesis and a generational profile. In: Barz & Siegel (eds.) 1988: 179–196.
Bhatia, Tej K. 2001 Media, identity and diaspora: Indians abroad. In: Diaspora, identity, and language community. Special Issue of Studies in the Linguistic Sciences 31(1): 269–287.
Bhatia, Tej K., and William C. Ritchie (eds.) 2004 The handbook of bilingualism and multilingualism, 1st edn. Malden, MA: Blackwell.
Bhatia, Tej K., and William C. Ritchie (eds.) 2012 The handbook of bilingualism and multilingualism, 2nd edn. Oxford: Wiley-Blackwell.
Bhatia, Tej K., and William C. Ritchie 2012 Bilingualism and multilingualism in South Asia. In: Bhatia & Ritchie (eds.) 2012: 843–870.
Bhattacharja, Shishir 2010 Benglish verbs: A case of code-mixing in Bengali. In: PACLIC 24 Proceedings, 75–84. http://www.aclweb.org/anthology/Y10–1011 (accessed 23 Nov. 2014)
Bhattacharjya, Dwijen 2007 Nagamese (Restructured Assamese). In: John Holm and Peter L. Patrick (eds.), Comparative creole syntax, 237–254. London/Colombo: Battlebridge.
Bizri, Fida 2010 Pidgin madame: une grammaire de la servitude. Paris: Geuthner.
Blackshire-Belay, Carol 1993 Foreign workers’ German: Is it a pidgin? In: Francis Byrne and John Holm (eds.), Atlantic meets Pacific: A global view of pidginization and creolization, 431–440. Amsterdam/Philadelphia: Benjamins.

Blevins, Juliette 2007 A long lost sister of Proto-Austronesian? Proto-Ongan, mother of Jarawa and Onge of the Andaman Islands. Oceanic Linguistics 46(1): 155–198.
Bosch, Tinke 1978 The distribution of verb forms in Saranami Hindustanie narratives. Penama: Lenguas Penama.
Brass, Paul R. 1974 Language, religion and politics in North India. London/New York: Cambridge University Press.
Brenzinger, Matthias, Akira Yamamoto, Noriko Aikawa, Dmitri Koundiouba, Anahit Minasyan, Arienne Dwyer, Colette Grinevald, Michael Krauss, Osahito Miyaoka, Osamu Sakiyama, Rieks Smeets, and Ofelia Zepeda 2003 Language vitality and endangerment. Paris: UNESCO Expert Meeting on Safeguarding Endangered Languages.
Britto, Francis 1986 Diglossia: A study of the theory with application to Tamil. Washington, D. C.: Georgetown University Press.
Buddruss, Georg, and Almuth Degener 2016 Materialien zur Prasun-Sprache des afghanischen Hindukusch, Teil I: Texte und Glossar. (Harvard Oriental Series, 80.) Cambridge, MA: Department of South Asian Studies, Harvard University. Distributed by Harvard University Press.
Burki, Rozi Khan 2001 Dying languages; Special focus on Ormuri. Originally published in Pakistan Journal of Public Administration 6(2). http://www.khyber.org/publications/016–020/ormuri.shtml (accessed 19 Dec. 2014)
Burushaski Research Academy 2007 Burūšaskī-Urdū luγat – Jild awwal [Burushaski-Urdu dictionary, vol. 1]. Karachi: Bureau of Composition, Compilation & Translation, University of Karachi.
Burushaski Research Academy 2009 Burūšaskī-Urdū luγat – Jild doam [Burushaski-Urdu dictionary, vol. 2]. Karachi: Bureau of Composition, Compilation & Translation, University of Karachi.
Burushaski Research Academy 2014 Burūšaskī-Urdū luγat – Jild soam [Burushaski-Urdu dictionary, vol. 3]. Karachi: Bureau of Composition, Compilation & Translation, University of Karachi.
Cacopardo, Alberto 1991 The other Kalasha: A survey of Kalashamun-speaking people in Southern Chitral, Part I: The eastern area. East and West 41: 273–310.
Cacopardo, Alberto, and Augusto Cacopardo 1991 The other Kalasha: A survey of Kalashamun-speaking people in Southern Chitral, Part III: Jinjeret Kuh and the problem of Kalasha origins. East and West 42: 333–375.
Cacopardo, Augusto 1991 The other Kalasha: A survey of Kalashamun-speaking people in Southern Chitral, Part II: The Kalasha of Urtsun. East and West 41: 311–350.

Caldwell, Robert 1981 A comparative grammar of the Dravidian languages or south Indian family of languages. Delhi: Gian Publications. First published 1875, London, Trübner.
Canagarajah, A. Suresh 2005 Dilemmas in planning English/vernacular relations in post-colonial communities. Journal of Sociolinguistics 9(3): 418–447.
Canagarajah, A. Suresh 2008 Language shift and the family: Questions from the Sri Lankan Tamil diaspora. Journal of Sociolinguistics 12(2): 143–176.
Canagarajah, A. Suresh 2009 The plurilingual tradition and the English language in South Asia. Association Internationale de Linguistique Appliquée Review 22: 5–22.
Canagarajah, A. Suresh, and Hina Ashraf 2013 Multilingualism and education in South Asia: Resolving policy/practice dilemmas. Annual Review of Applied Linguistics 33: 258–285.
Cardona, George, and Dhanesh Jain (eds.) 2003 The Indo-Aryan languages. London/New York: Routledge.
Cardoso, Hugo C. 2009a The Indo-Portuguese language of Diu. University of Amsterdam PhD dissertation. Utrecht: LOT. http://dare.uva.nl/document/2/66896 (accessed 8 Nov. 2014)
Cardoso, Hugo C. 2009b Jacques Arends’ model of gradual creolization. In: Rachel Selbach, Hugo C. Cardoso, and Margot van den Berg (eds.), Gradual creolization: Studies celebrating Jacques Arends, 13–23. (Creole Language Library 34.) Amsterdam/Philadelphia: Benjamins.
Cardoso, Hugo C. 2013 Convergence in the Malabar: The case of Indo-Portuguese. Conference on Language Contact in India: Historical, Typological and Sociolinguistic Perspectives, Pune, February 6–8.
Cardoso, Hugo C. (ed.) 2014 Language endangerment and preservation in South Asia. (Language Documentation & Conservation Special Publication No. 7.) http://scholarspace.manoa.hawaii.edu/bitstream/handle/10125/4607/master.pdf?sequence=1 (accessed 23 Nov. 2014)
Census of India 2001 Office of the Registrar General & Census Commissioner, India. Table Raw C-16. Section on ‘Languages spoken by less than 10,000 speakers’. Supplied by Central Institute of Indian Languages, Mysore.
Chatterji, Suniti Kumar 1926 The origin and development of the Bengali language. 3 vols. Calcutta University Press. Reprinted 1970, London: Allen & Unwin; distributed by Motilal Banarsidass, Delhi.
Chatterji, Suniti Kumar 1931 Calcutta Hindustani: A study of a jargon dialect. Indian Linguistics 1: 2–4.
Chaudenson, Robert 1989 Créoles et enseignement du français. Paris: L’Harmattan.

Sociolinguistics


Chaudenson, Robert 1992 Des îles, des hommes, des langues: Essais sur la créolisation linguistique et culturelle. Paris: L’Harmattan.
Chaudenson, Robert 1995 Les créoles. (Series: Que sais-je 2970.) Paris: Presses Universitaires de France.
Chaudhuri, Pramatha 1914/1968 Sobuj pOtrer mukhopOtro. [Message from the editor of Sabuj Patra]. Repr. [from Sabuj Patra’s Baisakh 1321 issue] in Chaudhuri 1968, 25–30.
Chaudhuri, Pramatha 1968 probondhoSOnggroho. [Collected essays.] Kolkata: Visvabharati.
Chevillard, Jean-Luc 2008 The concept of ticaiccol in Tamil grammatical literature and the regional diversity of Tamil classical literature. In: M. Kannan (ed.), Streams of language: Dialects of Tamil, 21–51. Pondicherry: French Institute of Pondicherry.
Choksi, Nishaant 2014 Scripting the border: Script practices and territorial imagination among Santali speakers in eastern India. International Journal of the Sociology of Language 227: 47–63.
Clements, J. Clancy 1991 The Indo-Portuguese creoles: Languages in transition. Hispania 74(3): 637–646.
Clements, J. Clancy 1996 The genesis of a language: The formation and development of Korlai Portuguese. Amsterdam/Philadelphia: Benjamins.
Clements, J. Clancy 2009 Accounting for some similarities and differences among the Indo-Portuguese creoles. Journal of Portuguese Linguistics 8: 23–47.
Clements, J. Clancy, and Ahmar Mahboob 2000 Wh-words and question formation in pidgin/creole languages. In: John McWhorter (ed.), Language change and language contact in pidgins and creoles, 459–497. Amsterdam/Philadelphia: Benjamins.
Clements, J. Clancy, and Andrew Koontz-Garboden 2002 Two Indo-Portuguese creoles in contrast. Journal of Pidgin and Creole Languages 17: 191–236.
Coperahewa, Sandagomi 2009 The language planning situation in Sri Lanka. Current Issues in Language Planning 10(1): 69–150.
D’Souza, Jean 1987 South Asia as a sociolinguistic area. University of Illinois PhD dissertation. ProQuest Dissertations 8721625.
Das Gupta, Jyotirindra 1969 Official language problems and policies in South Asia. In: Sebeok, Emeneau & Ferguson (eds.) 1969: 578–596.
Das Gupta, Jyotirindra 1970 Language conflict and national development: Group politics and national language policy in India. Berkeley: University of California Press.


Das, Bishakha 2014 A descriptive grammar of Tai-Khamti. Jawaharlal Nehru University PhD dissertation.
Dasgupta, Probal 1985 The rethinking of language. Visvabharati Quarterly 48(1–4): 138–50.
Dasgupta, Probal 1990 Review of Udaya Narayana Singh and Maniruzzaman 1983, Diglossia in Bangladesh and language planning, and Udaya Narayana Singh and R. N. Srivastava 1987, Perspectives in language planning. Indian Linguistics 49: 131–142.
Dasgupta, Probal 1993 The otherness of English: India’s auntie tongue syndrome. New Delhi: Sage.
Dasgupta, Probal 2004a Dynamic diglossia and South Asian modernization. In: Lipi Ghosh and Achintya Dutta (eds.), Indian Association for Asian and Pacific Studies: Proceedings of the First Biennial Conference 2002, 108–115. Kolkata: Progressive.
Dasgupta, Probal 2004b Relativizing the formal to the substantive in linguistics. In: Ashok Kumar (ed.), Language, context and culture: In honour of Professor Shivendra K. Verma, 68–80. Lucknow: Gurukul.
Dasgupta, Probal 2011 Loĝi en homaj lingvoj: la substancisma perspektivo. New York: Mondial. [English version: Inhabiting human languages: The substantivist visualization, 2012, Delhi: Samskriti/Indian Council of Philosophical Research.]
Dasgupta, Probal, Alan Ford, and Rajendra Singh 2000 After etymology: Towards a substantivist linguistics. München: LINCOM.
De Bary, William Theodore, et al. (eds) 1958 Sources of Indian tradition, Vol. 2. New York: Columbia University Press.
De Silva Jayasuriya, Shihan 2002 A unique Malay: Sri Lankan Malay creole. NUSA: Linguistic Studies of the Languages of Indonesia 50: 43–57. http://sealang.net/nusa/ (accessed 1 Nov. 2014)
Decker, Kendall D. (ed.) 1992a Languages of Chitral. (Sociolinguistic Survey of Northern Pakistan 5.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.
Decker, Kendall D. 1992b Gawar Bati. In: K. D. Decker (ed.) 1992: 151–165.
Decker, Kendall D. 1992c Kalasha. In: K. D. Decker (ed.) 1992: 96–114.
Decker, Sandra J. 1992 Ushojo. In: Rensch, S. J. Decker, & Hallberg 1992: 65–80.
Degener, Almuth 1998 Die Sprache von Nisheygram im afghanischen Hindukusch. (Neuindische Studien 14.) Wiesbaden: Harrassowitz.
Deivasundaram, N. 1981 Tamil diglossia. Tirunelveli: Nainar Patippagam.


Deshpande, Madhav M. 1979 Sociolinguistic attitudes in India: An historical reconstruction. Ann Arbor: Karoma.
Deshpande, Madhav M. 1986 Sanskrit grammarians on diglossia. In: Krishnamurti, Masica & Sinha (eds.) 1986: 312–321.
Deshpande, Madhav M. 1993 Sanskrit & Prakrit: Sociolinguistic issues. Delhi: Motilal Banarsidass.
DeSilva, M. W. Sugathapala 1972 The Vedda language of Ceylon: Texts and lexicon. München: Kitzinger.
DeSilva, M. W. Sugathapala 1997 Diglossia and literacy. Mysore: Central Institute of Indian Languages.
DeVotta, Neil 2004 Blowback: Linguistic nationalism, institutional decay, and ethnic conflict in Sri Lanka. Stanford: Stanford University Press.
DeVotta, Neil 2007 Sinhalese Buddhist nationalist ideology: Implications for politics and conflict resolution in Sri Lanka. (Policy Studies 40.) Washington, DC: East-West Center Washington.
Dhar, Nazir Ahmad 2009 Media-induced language valorization in multilingual speech community. 28th South Asian Languages Analysis Roundtable, University of North Texas, October 9–11, 2009.
Dharmadasa, K. N. O. 1974 The creolization of an aboriginal language: The case of Vedda in Sri Lanka (Ceylon). Anthropological Linguistics 16(2): 79–106.
Di Carlo, Pierpaolo 2010 Take care of the poets! Verbal art performances as key factors in the preservation of Kalasha language and culture. Anthropological Linguistics 52(2): 141–159.
Diamond, Jeffrey M. 2012 A ‘vernacular’ for a ‘new generation’? Historical perspectives about Urdu and Punjabi and the formation of language policy in colonial northwest India. In: Schiffman (ed.) 2012: 282–318.
Dimock, Edward C., Braj B. Kachru, and Bh(adriraju) Krishnamurti (eds.) 1992 Dimensions of South Asia as a sociolinguistic area: Papers in memory of Gerald B. Kelley. New Delhi: Oxford University Press/IBHK.
Dodykhudoeva, Leila R. 2007 Revitalization of minority languages: Comparative dictionary of key cultural terms in the languages and dialects of the Shughnani-Rushani group. In: Peter K. Austin, Oliver Bond, and David Nathan (eds.), Proceedings of Conference on Language Documentation and Linguistic Theory, London: SOAS. www.hrelp.org/eprints/ldlt_09.pdf (accessed 19 Dec. 2014)
Domingue, Nicole 1971 Bhojpuri and creole in Mauritius: A study of linguistic interference and its consequences in regard to synchronic variation and language change. University of Texas, Austin, PhD dissertation. ProQuest Dissertations 7215744.


Don Peter, W. L. A. 1996 The Catholic Church in Sri Lanka: A history in outline. http://xoomer.virgilio.it/alperera/intro.html (accessed 1 Nov. 2014)
Durbin, Mridula 1973 Formal changes in Trinidad Hindi as a result of language adaptation. American Anthropologist 75(5): 1290–1304.
Dutton, Tom 1985 Police Motu: Iena sivarai (Its story). Port Moresby: University of Papua New Guinea Press.
Efimov, Valentin Aleksandrovich 2011 The Ormuri language in past and present. Translated and edited 2011 by Joan L. G. Baart. Islamabad: Forum for Language Initiatives. Originally published 1986 as Jazyk Ormuri v sinxronnom i istoričeskom osveščeni [The Ormuri language in a synchronic and historical light]. Moscow: Nauka.
Ekbote, Gobalrao 1984 A nation without a national language. Hyderabad: Hindi Prachar Sabha.
Farrell, Tim 2000 Mother tongue education and the health and survival of the Balochi language. In: Jahani (ed.) 2000: 19–32.
Ferguson, Charles Albert 1945 A chart of the Bengali verb. Journal of the American Oriental Society 65(1): 54–55.
Ferguson, Charles Albert 1959 Diglossia. Word 15: 325–340. Repr. 1971 in: Anwar S. Dil (ed.), Language structure and language use: Essays by Charles A. Ferguson, 1–26. Stanford: Stanford University Press.
Ferguson, Charles Albert 1992 South Asia as a sociolinguistic area. In: Dimock, Kachru & Krishnamurti (eds.) 1992: 25–36. Repr. 1996 in Thom Huebner (ed.), Sociolinguistic perspectives: Papers on language in society, 1959–1994, 84–96. New York: Oxford University Press.
Ferguson, Charles Albert, and John Joseph Gumperz (eds.) 1960 Linguistic diversity in South Asia: Studies in regional, social and functional variation. (International Journal of American Linguistics 26(3), pt. 3.) Bloomington, IN: Indiana University Research Center in Anthropology, Folklore and Linguistics.
Fishman, Joshua A. 1967 Bilingualism with and without diglossia; Diglossia with and without bilingualism. Journal of Social Issues 23(2): 29–38.
Fishman, Joshua A., Charles Albert Ferguson, and Jyotirindra Das Gupta (eds.) 1968 Language problems of developing nations. New York: Wiley.
Foley, William 1988 Language birth: The processes of pidginization and creolization. In: Frederick J. Newmeyer (ed.), Linguistics: The Cambridge survey, vol. 4, 162–183. Cambridge: Cambridge University Press.
Foley, William 1991 The Yimas language of New Guinea. Stanford: Stanford University Press.


Freeman, Rich 1998 Rubies and coral: The lapidary crafting of language in Kerala. The Journal of Asian Studies 57(1): 38–65.
Gambhir, Surendra 1981 The East Indian speech community in Guyana: A sociolinguistic study with reference to koine-formation. University of Pennsylvania PhD dissertation. ProQuest Dissertations 8207963.
Gambhir, Surendra 1988 Structural development of Guyanese Bhojpuri. In: Barz & Siegel (eds.) 1988: 69–94.
Gnanasundaram, V. 1980 Standard spoken Tamil: What and how. In: Agesthialingom & Karunakaran (eds.) 1980: 66–97.
Gonda, Jan 1997 Indonesian linguistics. Leiden: Brill.
Goody, Jack 1986 The interface between the written and the oral. New York: Oxford University Press.
Gopinatha Pillai, N. R. 1985 Standardization of the poetical language. In: Prabhakara K. M. Variar (ed.), History of Malayalam language, 91–98. Madras: University of Madras.
Gopinathan, V. P. 1980 Diglossic situation in Malayalam. In: Agesthialingom & Karunakaran (eds.) 1980: 119–129.
Hallberg, Daniel G. 1992a Pashto Waneci Ormuri. (Sociolinguistic Survey of Northern Pakistan 4.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.
Hallberg, Daniel G. 1992b The languages of Indus Kohistan. In: Rensch et al. (eds.) 1992: 83–141.
Hallberg, Daniel G., and Calinda E. Hallberg 1999 Indus Kohistani: A preliminary phonological and morphological analysis. (Studies in Languages of Northern Pakistan 8.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.
Hamid, M. Obaidul 2011 Planning for failure: English and language policy and planning in Bangladesh. In: Joshua A. Fishman and Ofelia García (eds.), Handbook of language and ethnic identity: The success-failure continuum in language in ethnic identity efforts, Vol. 2, 192–203. Oxford: Oxford University Press.
Hancock, Ian 1987 The pariah syndrome: An account of Gypsy slavery and persecution. Ann Arbor: Karoma.
Hasan, Md. Kamrul 2004 A linguistic study of English language curriculum at the secondary level in Bangladesh. Language in India 4(8). http://www.languageinindia.com/aug2004/hasandissertation1.html (accessed 13 Dec. 2014)


Hastings, Adi M. 2004 Past perfect, future perfect: Sanskrit revival and the Hindu nation in contemporary India. University of Chicago PhD dissertation. ProQuest Dissertations 3125608.
Hippisley, Andrew, Gregory Stump, and Raphael Finkel 2009 Computing in the field: Language modeling for elicitation and documentation of Shughni. 1st International Conference on Language Documentation and Conservation (ICLDC) http://scholarspace.manoa.hawaii.edu/handle/10125/5066 (accessed 19 Dec. 2014)
Hock, Hans Henrich 1992 Spoken Sanskrit in Uttar Pradesh: Profile of a dying prestige language. In: Dimock, Kachru & Krishnamurti (eds.) 1992: 247–260.
Hock, Hans Henrich, and Rajeshwari V. Pandharipande 1978 Sanskrit in the pre-Islamic sociolinguistic context of South Asia. International Journal of the Sociology of Language 16: 11–25.
Hosain, Tanya, and James W. Tollefson 2006 Language policy in education in Bangladesh. In: Tsui and Tollefson (eds.) 2006: 241–257.
Hosali, Priya 2000 Butler English: Form and function. Delhi: B. R. Publishing.
Hosali, Priya 2005 Butler English. English Today 21(1): 34–39.
Hosali, Priya, and Jean Aitchison 1986 Butler English: A minimal pidgin? Journal of Pidgin and Creole Languages 1(1): 51–79.
Huiskamp, A. K. 1978 Soeoe se soeroe kar: An audio-visual course in Sarnami Hindustani for beginners. Paramaribo-Zuid: Summer School of Linguistics.
Hussain, Sehr, and Saima Zaman 2004 Land, culture and identity: The case of the Kalasha. In: Victoria Tauli-Corpuz and Joji Cariño (eds.), Reclaiming balance: Indigenous peoples, conflict resolution and sustainable development, 267–291. Baguio City, Philippines: Tebtebba Foundation.
Hussainmiya, B. A. 1986 Melayu Bahasa: Some preliminary observations on the Malay creole of Sri Lanka. Sari 4(1): 19–30.
Hussainmiya, B. A. 2008 Orang regimen: The Malays of the Ceylon Rifle Regiment. 2nd edition. Dehiwala: A. J. Prints.
Imam, Syeda Rumnaz 2005 English as a global language and the question of nation-building education in Bangladesh. Comparative Education 41(4): 471–486.
Inam Ullah 2010a Torwali-Urdu luγat [Torwali-Urdu dictionary]. Lahore: Centre for Research in Urdu Language Processing, National University of Computer and Emerging Sciences.


Inam Ullah 2010b Torwali-Urdu luγat [online Torwali-Urdu dictionary]. Lahore: Center for Language Engineering, Al-Khwarizmi Institute of Computer Science, University of Engineering and Technology. http://cle.net.pk/otd/ (accessed 19 Dec. 2014)
Inam Ullah 2012 Documenting languages in danger. http://www.cle.org.pk/clt12/slides/CLT12_Documenting%20Languages%20in%20Danger.pdf (accessed 5 Nov. 2014)
Jahani, Carina (ed.) 2000 Language in society: Eight sociolinguistic essays on Balochi. (Studia Iranica Upsaliensia 3.) Uppsala: Acta Universitatis Upsaliensis.
Jain, Dhanesh 2003 Sociolinguistics of the Indo-Aryan languages. In: Cardona & Jain (eds.) 2003: 391–443.
Jamal, Abedin 2010 Attitudes toward Hazaragi. Southern Illinois University PhD dissertation. ProQuest Dissertations 1477416.
Janjua, Fauzia 2011 Causes of decline of Yadgha language. Canadian Social Science 7(2): 249–255.
Jinnah, Mohammad Ali 1989 Quaid-i-Azam Mohammad Ali Jinnah: Speeches and statements as Governor General of Pakistan 1947–48. Islamabad: Ministry of Information & Broadcasting, Government of Pakistan.
Jones, Sir William 1786 The third anniversary discourse, delivered 2nd February, 1786: On the Hindus. Published 1789 in Asiatick Researches 1: 415–431.
Kachru, Braj B. 1978 Toward structuring code-mixing: An Indian perspective. International Journal of the Sociology of Language 16: 27–46. The Hague: Mouton.
Kachru, Braj B. 1983 The Indianization of English: The English language in India. Delhi: Oxford University Press.
Kachru, Braj, Yamuna Kachru, and S. N. Sridhar (eds.) 2008 Language in South Asia. Cambridge: Cambridge University Press.
Kearney, Robert N. 1978 Language and the rise of Tamil separatism in Sri Lanka. Asian Survey 18(5): 521–534.
Khan, Hussain Ahmad 2004 Re-thinking Punjab: The construction of Siraiki identity. Lahore: Research and Publication Centre, National College of Arts.
Khoklova, Liudmila V. 2014 Majority language death. In: Cardoso (ed.) 2014: 19–45. http://scholarspace.manoa.hawaii.edu/handle/10125/4600 (accessed 19 Dec. 2014)
Khyber Pakhtunkhwa Government (Pakistan) 2011 http://www.kptbb.gov.pk/includes/secure_file.cfm?ID=38&menuID=6 (accessed Nov. 2013)


King, Robert D. 1997 Nehru and the language politics of India. Delhi: Oxford University Press.
Kohistani, Razwal, and Ruth Laila Schmidt 1996 Ṣiṇā qāida [Shina primer]. Islamabad: Himalayan Jungle Project.
Krishnamurti, Bh(adriraju) 1979 Classical or modern: A controversy of styles in education in Telugu. In: E. Annamalai (ed.), Language movements in India, 1–24. Mysore: Central Institute of Indian Languages.
Krishnamurti, Bh(adriraju), Colin P. Masica, and Anjani K. Sinha (eds.) 1986 South Asian languages: Structure, convergence and diglossia. Delhi: Motilal Banarsidass.
Lahiri, Sharmita 2008 A language made our own: The politics of identity and language in Indian writing. University of Houston PhD dissertation. ProQuest Dissertations 3318765.
Laitin, David 1989 Language policy and political strategy in India. Policy Sciences 22: 415–436.
Lefebvre, Claire (ed.) 2011 Creoles, their substrates and language typology. Amsterdam/Philadelphia: Benjamins.
Lehr, Rachel 2014 A descriptive grammar of Pashai: The language and speech community of Darrai Nur. University of Chicago PhD dissertation. ProQuest Dissertations 3638612.
Lewis, M. Paul (ed.) 2009 Ethnologue: Languages of the world, 16th edition. Dallas, Tex.: SIL International. http://www.ethnologue.com/ (accessed 6 Jan. 2015)
Lewis, M. Paul, Gary F. Simons, and Charles D. Fennig (eds.) 2014 Ethnologue: Languages of the world, 17th ed. Dallas, Texas: SIL International. Online version: http://www.ethnologue.com (accessed 6 Jan. 2015)
Liljegren, Henrik 2008 Toward a grammatical description of Palula: An Indo-Aryan language of the Hindu Kush. Stockholm University PhD thesis. http://www.diva-portal.org/smash/get/diva2%3A198468/FULLTEXT01.pdf (accessed 9 Jan. 2015)
Lim, Lisa, and Umberto Ansaldo 2007 Identity alignment in the multilingual space: The Malays of Sri Lanka. In: E. A. Anchimbe (ed.), Linguistic identity in multilingual postcolonial spaces, 218–243. Newcastle upon Tyne: Cambridge Scholars Publishing.
Lines, Maureen 1996 A sad legacy: Environmental problems in the Kalash valleys. In: Elena Bashir and Israr-ud-Din (eds.), Proceedings of the Second International Hindukush Conference, 439–446. Karachi: Oxford University Press.
Macaulay, Thomas Babington 1970 Prose and poetry. Edited by G. M. Young. Cambridge: Harvard University Press.
Mahapatra, Ranganayaki 1985 inṟaiya tamiḻ: ilakkiya vaḻakkum pēccu vaḻakkum [Contemporary Tamil: Literary use and spoken use]. Chennai: Pari Nilaiyam.


Manan, Syed Abdul, Maya Khemlani David, and Francisco Perlas Dumanig 2014 Language management: A snapshot of governmentality within the private schools in Quetta, Pakistan. Language Policy, online first at: http://link.springer.com/article/10.1007%2Fs10993-014-9343-x (accessed 5 Jan. 2015)
Mansoor, Sabiha 1993 Punjabi, Urdu, English in Pakistan: A sociolinguistic study. Lahore: Vanguard.
Matras, Yaron 2002 Romani: A linguistic introduction. Cambridge: Cambridge University Press.
Matras, Yaron, Peter Bakker, and Christo Kyuchkov (eds.) 1997 The typology and dialectology of Romani. Amsterdam/Philadelphia: Benjamins.
Meakins, Felicity 2013 Mixed languages. In: Peter Bakker and Yaron Matras (eds.), Contact languages: A comprehensive guide, 159–228. Berlin/New York: de Gruyter Mouton.
Mesthrie, Rajend 1991 Language in indenture: A sociolinguistic history of Bhojpuri-Hindi in South Africa. London: Routledge.
Mesthrie, Rajend 2008 South Asian languages in the second diaspora. In: Kachru, Kachru & Sridhar (eds.) 2008: 497–514.
Mir, Farina 2010 The social space of language: Vernacular culture in British colonial Punjab. Berkeley: University of California Press.
Miranda, Rocky 1978 Caste, religion and dialect differentiation in the Konkani area. International Journal of the Sociology of Language 16: 77–91.
Mitchell, Lisa 2009 Language, emotion, and politics in South India: The making of a mother tongue. Bloomington: Indiana University Press.
Moag, Rodney 1977 Fiji Hindi. Canberra: Australian National University.
Moag, Rodney 1979 The linguistic adaptation of the Fiji Indians. In: Vijay Misra (ed.), Rama’s banishment: A centenary tribute to the Fiji Indians 1879–1979, 112–138. London: Heinemann Educational.
Moag, Rodney 1992 The life cycle of non-Native Englishes: A case study. In: Braj Kachru (ed.), The other tongue, 233–244. Urbana: University of Illinois Press.
Mock, John Howard 1998 The discursive construction of reality in the Wakhi community of Northern Pakistan. University of California, Berkeley, PhD dissertation. ProQuest Dissertations 9922976.
Mohan, Peggy Ramesar 1977 Creole and Bhojpuri serial verbs in Trinidad: A case of syntactic reinforcement. 1977 Annual Meeting of the Linguistic Association of the Southwest, Baton Rouge, Louisiana, November 10–12, 1977.


Mohan, Peggy Ramesar 1978 Trinidad Bhojpuri: A morphological study. University of Michigan PhD dissertation. ProQuest Dissertations 7813706.
Mohan, Peggy Ramesar, and Paul Zador 1986 Discontinuity in a life cycle: The death of Trinidad Bhojpuri. Language 62: 291–320.
Mørch, Ida Elizabeth 2000 How fast will a language die when it is officially no longer spoken? In: C. E. Lindberg and S. Nordahl (eds.), 17th Scandinavian Conference of Linguistics II, 161–176. (Odense Working Papers in Language and Communication 19.) Lund.
Morey, Stephen 2005 The Tai languages of Assam: A grammar and texts. (Pacific Linguistics 565.) Canberra: Research School of Pacific and Asian Studies, the Australian National University.
Morey, Stephen 2010 Turung: A variety of Singpho language spoken in Assam. (Pacific Linguistics 614.) Canberra: Research School of Pacific and Asian Studies, the Australian National University.
Morgenstierne, Georg 1944 Indo-Iranian frontier languages, Volume III, The Pashai language, Part 2: Texts and translations. Oslo: H. Aschehoug & Co.
Morgenstierne, Georg 1956 Indo-Iranian frontier languages, Volume III, The Pashai language, Part 3: Vocabulary. Oslo: Universitetsforlaget.
Morgenstierne, Georg 1967 Indo-Iranian frontier languages, Volume III, The Pashai language, Part 1: Grammar. Oslo: Universitetsforlaget.
Moseley, Christopher (ed.) 2010 Atlas of the world’s languages in danger. 3rd edn. (Memory of Peoples Series.) Paris: UNESCO Publishing. http://www.unesco.org/culture/en/endangeredlanguages/atlas (accessed 19 Dec. 2014)
Mufwene, Salikoko S. 1997 Jargons, pidgins, creoles, and koines: What are they? In: Arthur K. Spears and Donald Winford (eds.), The structure of pidgins and creoles including selected papers from the meetings of the Society for Pidgin and Creole Linguistics, 35–69. Amsterdam/Philadelphia: Benjamins.
Mufwene, Salikoko S. 2000 Creolization is a social not a structural process. In: Ingrid Neumann-Holzschuh and Edgar Werner Schneider (eds.), Degrees of restructuring in creole languages, 65–84. Amsterdam/Philadelphia: Benjamins.
Mufwene, Salikoko S. 2004 Multilingualism in linguistic history: Creolization and indigenization. In: Bhatia & Ritchie (eds.) 2004: 460–488.
Müller, Katja, Elisabeth Abbess, Calvin Tiessen, and Gabriela Tiessen 2008 Language vitality and development among the Wakhi people of Tajikistan. SIL Electronic Survey Report 2008–011 http://www2.sil.org/silesr/2008/silesr2008-011.pdf (accessed 19 Dec. 2014)


Munshi, Sadaf 2006 Jammu and Kashmir Burushaski: Language, language contact, and change. University of Texas, Austin, PhD dissertation. ProQuest Dissertations 3263374.
Munshi, Sadaf 2009 Documenting the Burushaski language: Issues in data collection, transmission, preservation, and revitalization. http://burushaskilanguage.com/ (accessed 9 Sept. 2015)
Munshi, Sadaf 2010 Contact-induced language change in a trilingual context: The case of Burushaski in Srinagar. Diachronica 27(1): 32–72.
Munshi, Sadaf 2010–present Work in progress on an Archive of Annotated Burushaski Texts. http://ltc.unt.edu/~sadafmunshi/Burushaski/index.html (accessed 19 Dec. 2014)
Nadkarni, Mangesh V. 1975 Bilingualism and syntactic change in Konkani. Language 55(3): 672–683.
Nagar, Ila 2008 Language, gender and identity: The case of kotis in Lucknow-India. Ohio State University PhD dissertation. ProQuest Dissertations 332816.
Naji, Naji Khan 2008 Khowār-Urdū luγat [Khowar-Urdu dictionary]. Shotkhar, Torkhow: Naji Sons Publications.
Narayanan, R. Karthik 2014 Assessing vitality of languages spoken by less than 10,000 speakers in India. Jawaharlal Nehru University MPhil Dissertation.
National Census of Nepal 2011 Kathmandu: Nepal Central Bureau of Statistics.
Nayak, Harōgadde Mānappa 1967 Kannada: Literary and colloquial: A study of two styles. Mysore: Rao and Raghavan.
Nayar, Baldev Raj 1966 Minority politics in the Punjab. Princeton: Princeton University Press.
Nayar, Baldev Raj 1969 National communication and language policy in India. New York: Praeger.
Nordhoff, Sebastian 2009 A grammar of upcountry Sri Lanka Malay. University of Amsterdam PhD thesis. Utrecht: LOT. http://dare.uva.nl/record/1/319874 (accessed 8 Nov. 2014)
Nordhoff, Sebastian 2012 Establishing and dating Sinhala influence in Sri Lanka Malay. Journal of Language Contact 5: 23–57.
Nordhoff, Sebastian 2013 The genesis of Sri Lanka Malay as a multi-layered process. In: Nordhoff (ed.) 2013: 217–239.
Nordhoff, Sebastian (ed.) 2013 The genesis of Sri Lanka Malay: A case of extreme language contact. Leiden: Brill.


Oldenburg, Philip 1985 A place insufficiently imagined: Language, belief, and the Pakistan crisis of 1971. The Journal of Asian Studies 44(4): 711–733.
Paauw, Scott H. 2004 A historical analysis of the lexical sources of Sri Lanka Malay. York University MA thesis. ProQuest Dissertations MQ99370 (accessed 19 Dec. 2014)
Paauw, Scott H. 2013 The lexical sources of Sri Lanka Malay revisited. In: Nordhoff (ed.) 2013: 129–143.
Pandharipande, Rajeshwari V. 2001 The role of language of religion in the convergence of South Asian languages. In: Rajendra Singh (ed.), The yearbook of South Asian languages and linguistics 2001, 289–310. New Delhi: Sage.
Pandharipande, Rajeshwari V. 2003 Sociolinguistic dimensions of Marathi: Multilingualism in central India. München: LINCOM.
Pandharipande, Rajeshwari V. 2006 Ideology, authority, and language choice: Language of religion in South Asia. In: Joshua A. Fishman and Tope Omoniyi (eds.), Explorations in the sociology of language and religion, 141–164. Amsterdam/Philadelphia: Benjamins.
Patnaik, Ashutosh Prasad 2003 The early voyagers of the East. Vol. 2. Delhi: Pratibha Prakashan.
Pattanayak, Debi Prasanna (ed.) 1990 Multilingualism in India. Clevedon, Avon (England)/Bristol, PA: Multilingual Matters.
Payne, John 1997 The Central Asian Parya. In: Shirin Akiner and Nicholas Sims-Williams (eds.), Languages and scripts of Central Asia, 144–153. London: School of Oriental and African Studies.
Pemberton, Kelly, and Michael Nijhawan (eds.) 2009 Shared idioms, sacred symbols, and the articulation of identities in South Asia. New York/Abingdon, Oxon: Routledge.
Perder, Emil 2008 Dameli, a preliminary sketch. International Conference on Endangered Language Documentation and Tradition with a Special Interest in the Kalasha of the Hindu Kush Valleys, Himalayas, 7–9 November 2008, Aristotle University, Thessaloniki.
Perder, Emil 2013 A grammatical description of Dameli. University of Stockholm PhD thesis.
Peterson, Jan Heegård 2006 Local case-marking in Kalasha. University of Copenhagen PhD thesis.
Pradeep, K. 2010 Tribute to Cochin Creole Portuguese. The Hindu, 26 Sept. http://www.thehindu.com/life-and-style/society/article795353.ece (accessed 1 Nov. 2014)
Proctor, Edward 2008 Gypsy dialects: A selective annotated bibliography of materials for the practical study of Romani. Hatfield, England: University of Hertfordshire Press.


Radhakrishna, B. 1980 Study of diglossia and the Telugu situation. In: Agesthialingom & Karunakaran (eds.) 1980: 229–241.
Rahman, Tariq 1996a Language and politics in Pakistan. Karachi: Oxford University Press.
Rahman, Tariq 1996b The history of the Urdu-English controversy in Pakistan. Repr. 1998, Islamabad: National Language Authority.
Rahman, Tariq 1999 Language, education, and culture. Oxford/New York: Oxford University Press.
Rahman, Tariq 2002 Language, ideology and power: Language-learning among the Muslims of Pakistan and North India. Karachi: Oxford University Press.
Rahman, Tariq 2004 Language policy and localization in Pakistan: Proposal for a paradigmatic shift. 2004 SCALLA conference, Kathmandu, Nepal, 5–7 January 2004. http://www.elda.org/en/proj/scalla/SCALLA2004/rahman.pdf (accessed 19 Dec. 2014)
Rahman, Tariq 2006 Language policy, multilingualism and language vitality in Pakistan. In: Saxena & Borin (eds.) 2006: 73–106.
Rahman, Tariq 2010 Language policy, identity, and religion: Aspects of the civilization of the Muslims of Pakistan and North India. Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University.
Rai, Alok 2002 Hindi nationalism. New Delhi: Orient Longman.
Rai, Amrit 1984 A house divided: The origin and development of Hindi/Hindavi. Delhi/New York: Oxford University Press.
Ramaswami, N. 1997 Diglossia: Formal and informal Tamil. Mysore: Central Institute of Indian Languages.
Ramaswamy, Sumathi 1997 Passions of the tongue: Language devotion in Tamil India, 1891–1970. Berkeley: University of California Press.
Rasul, Sarwet 2013 Borrowing and code mixing in Pakistani children’s magazines: Practices and functions. Pakistaniaat: A Journal of Pakistan Studies 5(2): 46–72.
Rehman, Khwaja A., and Joan L. G. Baart 2005 A first look at the language of Kundal Shahi in Azad Kashmir. http://www.sil.org/silewp/2005/silewp2005-008.pdf (accessed 19 Dec. 2014)
Rensch, Calvin R., Sandra J. Decker, and Daniel G. Hallberg 1992 Languages of Kohistan. (Sociolinguistic Survey of Northern Pakistan 1.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.


Rensch, Calvin R., Calinda E. Hallberg, and Clare F. O’Leary 1992 Hindko and Gujari. (Sociolinguistic Survey of Northern Pakistan 3.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.
Roberts, Sam 2010 Listening to (and saving) the world’s languages. New York Times, April 29, 2010. http://www.nytimes.com/2010/04/29/nyregion/29lost.html?sq=language&st=cse&scp=10&pagewanted=1 (accessed 19 Dec. 2014)
Safder, Alladina 1985 South Asian languages in Britain: Criteria for description and definition. (Occasional Paper N: 5.) London: Centre for Multicultural Education, University of London Institute of Education.
Sagar, Muhammad Zaman 2003a Report on a dialect survey mission to Dir Kohistan. http://www.oocities.org/kcs_kalam/dialdirkh.html (accessed 5 Nov. 2014)
Sagar, Muhammad Zaman 2003b Report on a language survey trip to the Bishigram valley. http://fli-online.org/ (accessed 5 Nov. 2014)
Sagar, Muhammad Zaman 2008 A multilingual education project for Gawri-speaking children in northern Pakistan. The 2nd International Conference on Language Development, Language Revitalization, and Multilingual Education in Ethnolinguistic Communities, Bangkok 1–3 July, 2008. http://www.seameo.org/_ld2008/doucments/Presentation_document/Gawri_presentation_Bangkok2008.pdf (accessed 3 Dec. 2014)
Saxena, Anju, and Lars Borin (eds.) 2006 Lesser-known languages of South Asia: Status and policies, case studies and applications of information technology. Berlin/New York: Mouton de Gruyter.
Schiffman, Harold F. 1996 Linguistic culture and language policy. London/New York: Routledge.
Schiffman, Harold F. 1997 Diglossia as a sociolinguistic situation. In: Florian Coulmas (ed.), The handbook of sociolinguistics, 205–216. Oxford: Blackwell.
Schiffman, Harold F., and S. Arokianathan 1986 Diglossic variation in Tamil film and fiction. In: Krishnamurti, Masica & Sinha (eds.) 1986: 371–381. Delhi: Motilal Banarsidass.
Schiffman, Harold F., and Brian Spooner (eds.) 2012 Language policy and language conflict in Afghanistan and its neighbors: The changing politics of language choice. Leiden: Brill. (Editor’s note: Information about Brian Spooner being co-editor reached Brill too late to be included on the title page, but should be noted, as his contribution was significant.)
Schneider, Edgar 2003 The dynamics of new Englishes: From identity construction to dialect rebirth. Language 79(2): 233–281.
Sethu Pillai, R. P. 1974 Tamil: Literary and colloquial. Madras: University of Madras.

Sociolinguistics

701

Shanmugam Pillai, M. 1960 Tamil: Literary and colloquial. International Journal of American Linguistics 26(3): 27–42. Shanmugam Pillai, M. 1965 Merger of literary and colloquial Tamil. Anthropological Linguistics 7(4): 98–103. Shapiro, Michael C., and Harold F. Schiffman 1981 Language and society in South Asia. Delhi: Motilal Banarsidass. Sharma, Rekha, and E. Annamalai (eds.) 2003 Indian diaspora: In search of identity. Mysore: Central Institute of Indian Languages. Shetty, Malavika Leeladhar 2008 Television and the construction of Tulu identity in South India. University of Texas, Austin, PhD dissertation. ProQuest Dissertations 3341959. Siegel, Jeff 1975 Fiji Hindustani. Working Papers in Linguistics 7(3): 127–144. Honolulu: University of Hawaii. Siegel, Jeff 1987 Language contact in a plantation environment: A sociolinguistic history of Fiji. Cambridge: Cambridge University Press. Siegel, Jeff 1988 The development of Fiji Hindustani. In: Barz & Siegel (eds.) 1988: 121–149. Siegel, Jeff 1990 Pidgin Hindustani in Fiji. In: Jeremy H. C. S. Davidson (ed.), Pacific Islands languages: Essays in honour of G. B. Milner, 173–196. London: School of Oriental and African Studies. Singh, Udaya Narayana, and Maniruzzaman 1983 Diglossia in Bangladesh and language planning. Kolkata: Gyan Bharati. Singh, V. D. 1981 Portrait of a pidgin: Bazaar Hindi. In: K. P. Acharya (ed.), Papers in Indian linguistics: Proceedings of a seminar: Convergence, Pidginization and Simplification with specification [sic] reference to the Indian situation, March 1981, Part – II on Pidgins,Pidginization,Creolization & Simplification, paper no. 5. (CIIL ebook.) Mysore: Central Institute of Indian Languages. http:// www.ciil-ebooks.net/html/piil/acharya11.html (accessed 1 Nov. 2014) Slomanson, Peter 2011 Dravidian features in the Sri Lankan Malay verb. In: Lefebvre 2011 (ed.): 383–409. Amsterdam/Philadelphia: Benjamins. 
Slomanson, Peter 2013 Known, inferable, and discoverable in Sri Lankan Malay research. In: Nordhoff (ed.) 2013: 77–104. Smith, Ian R. 1977 Sri Lanka Creole Portuguese phonology. Cornell University PhD dissertation. ProQuest Dissertations 7800089. Published 1978, Trivandrum: Dravidian Linguistics Association. Smith, Ian R. 1979 Convergence in South Asia: A creole example. Lingua 48: 193–222.

702

Bibliographical References

Smith, Ian R. 2001 Sri Lanka Portuguese. In: Philipp Strazny (ed.), Encyclopedia of linguistics, 1033–1036. Chicago: Fitzroy Dearborn. Smith, Ian R. 2002 Creolization and convergence in morphosyntax: Sri Lanka Portuguese and Sourashtra nominal marking typology. In: Peri Bhaskararao and K. V. Subbarao (eds.), The yearbook of South Asian languages and linguistics 2001, 391–409. London/New Delhi: Sage. Smith, Ian R. 2008 Pidgins, creoles and Bazaar Hindi. In: Kachru, Kachru & Sridhar (eds.) 2008: 253–268. Smith, Ian R. 2011 Diglossia in Sri Lanka Portuguese: The role of Anglophone missionaries. MS. Smith, Ian R. 2012a Adstrate influence in Sri Lanka Malay: Definiteness, animacy and number in accusative case marking. Journal of Language Contact 5: 5–22. Smith, Ian R. 2012b Comments on Nordhoff’s “Establishing and dating Sinhala influence in Sri Lanka Malay”. Journal of Language Contact 5: 58–72. Smith, Ian R. 2012c Measuring substrate influence: Word order features in Ibero-Asian creoles. In: Hugo Cardoso, Alan Baxter, and Mário Pinharanda Nunes (eds.), IberoAsian creoles: Comparative perspectives, 125–148. Amsterdam/Philadelphia: Benjamins. Smith, Ian R. 2013 Hijacked constructions in untutored second language acquisition: Implications for Sri Lanka Malay. In: Nordhoff (ed.) 2013: 195–232. Smith, Ian R., and Scott Paauw 2006 Sri Lanka Malay: Creole or convert? In: Ana Deumert and Stephanie Durrleman (eds.), Structure and variation in language contact, 159–181. (Creole Language Library 29.) Amsterdam/Philadelphia: Benjamins. Smith, Ian R., Scott Paauw, and B. A. Hussainmiya 2004 Sri Lanka Malay: The state of the art. In: Rajendra Singh (ed.), The yearbook of South Asian languages and linguistics 2004, 197–215. Berlin/New York: Mouton de Gruyter. Sonntag, Selma K. 1980 Language planning and policy in Nepal. ITL – Review of Applied Linguistics 48: 71–92. Sonntag, Selma K. 1995 Ethnolinguistic identity and language policy in Nepal. 
Nationalism and Ethnic Politics 1(4): 108–120. Sonntag, Selma K. 2002 Minority language politics in North India. In: James W. Tollefson (ed.), Language policies in education: Critical issues, 165–178. Mahwah, NJ: Lawrence Erlbaum. Sonntag, Selma K. 2003 The local politics of global English: Case studies in linguistic globalization. Lanham, MD: Lexington Books.

Sociolinguistics

703

Sonntag, Selma K. 2006 Change and permanence in language politics in Nepal. In: Tsui & Tollefson (eds.) 2006: 205–217. Spear, Percival (ed.) 1958 The Oxford history of India. 3rd edition. Oxford: Clarendon Press. Sreedhar, M. V. 1974 Naga pidgin: A sociolinguistic study of inter-lingual communication pattern in Nagaland. Mysore: Central Institute of Indian Languages. Sreedhar, M. V. 1985 Standardized grammar of Naga pidgin. Mysore: Central Institute of Indian Languages. Sridhar, Kamal K. 2008 South Asian diaspora in Europe and the United States. In: Kachru, Kachru & Sridhar (eds.) 2008: 515–533. Sridhar, S. N. 1978 On the functions of code-mixing in Kannada. International Journal of the Sociology of Language 16: 109–118. Strand, Richard 1997–2014 Nuristan: Hidden land of the Hindu-Kush. http://nuristan.info/index.html (accessed 19 Dec. 2014) Strand, Richard 2000 aćharêtâ' lexicon. http://nuristan.info/lngFrameL.html (accessed 19 Dec. 2014) Strand, Richard 2011a The sound system of kt'ivřâ·i vari. http://nuristan.info/Nuristani/Kamkata/ Kata/KataLanguage/Lexicon/phon.html (accessed 19 Dec. 2014) Strand, Richard 2011b bhaT'esa z'ib lexicon. http://nuristan.info/lngFrameL.html (accessed 19 Dec. 2014) Tagore, Rabindranath 1984 bangla SobdotOtto [Bangla philology]. 3rd edition. Kolkata: Visvabharati. Tan, Eunice 2000 A mother tongue literacy programme among the Baloch of Singo Line, Karachi. In: Jahani (ed.) 2000: 59–67. Temple, R. C. 1903 The Andaman and Nicobar Islands. Report on the Census. (Census of India 1901, vol 3.) Calcutta. Office of the Superintendent of Government Printing, India. Thirumalai, M. S. 2002 Sri Lanka’s language policy: A brief introduction. Language in India 1(9): Paper no. 4. http://www.languageinindia.com/jan2002/index.html (accessed 14 Dec. 2014) Tikkanen, Bertil 2011 Domaki noun inflection and case syntax. In: Bertil Tikkanen and Albion M. 
Butters (eds.), Pūrvāparaprajñābhinandanam: East and west, past and present: Indological and other essays in honour of Klaus Karttunen, 205–228. (Studia Orientalia 110.) Helsinki: Finnish Oriental Society.

704

Bibliographical References

Tinker, Hugh 1993 A new system of slavery: The export of Indian labour overseas, 1830–1920. London: Hansib Publishing. Trail, Ronald L., and Gregory R. Cooper 1999 Kalasha dictionary, with English and Urdu. Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics. http://fli-online.org/ (accessed 20 Dec. 2014) Tsui, Amy B. M., and James W. Tollefson (eds.) 2006 Language policy, culture, and identity in Asian contexts. (New Perspectives on Language and Education.) Mahwah, NJ: Lawrence Erlbaum Associates. UNESCO 2003 Language vitality and endangerment. (Report of the Ad Hoc Expert Group on Endangered Languages.) Paris: UNESCO Ad Hoc Expert Group on Endangered Languages. http://www.unesco.org/new/fileadmin/MULTIMEDIA/HQ/CLT/ pdf/Language_vitality_and_endangerment_EN.pdf (accessed 19 Dec. 2014) Vaid, Jyotsna 1980 The form and functions of code-mixing in Indian films: The case of Hindi and English. Indian Linguistics 41: 376–384. Valentine, Tamara Marie 1986 Aspects of linguistic interaction and gender in South Asia. University of Illinois PhD dissertation. ProQuest Dissertations 8701643. van Driem, George 2007 South Asia and the Middle East. In: Christopher Moseley (ed.), Encyclopedia of the world’s endangered languages, 2nd edition. 283–347. London/New York: Routledge. Velu Pillai, A. 1971 cācanamum tamiḻum [Inscriptions and Tamil Studies]. Paradenia, Sri Lanka: Author. Velu Pillai, A. 1976 Study of the dialects in inscriptional Tamil. Thiruvananthapuram: Dravidian Linguistics Association. Weinreich, Matthias 1999 Der Domaaki-Dialekt von Nager. Studien zur Indologie und Iranistik 22: 203– 214. Weinreich, Matthias 2009 Two varieties of Ḍomaakí. Zeitschrift der Deutschen Morgenländischen Gesellschaft 158(2): 299–316. Weinreich, Matthias 2010 Language shift in Northern Pakistan: The case of Domaakí and Pashto. Iran and the Caucasus 14: 43–56. Yesudhasan, C. 1980 Sri Lanka Tamil and mainland Tamil. 
In: Agesthialingom & Karunakaran (eds.) 1980: 463–477. Zaidi, Abbas 2001 Linguistic cleansing: The sad fate of Punjabi in Pakistan. In: Thomas J. Hubschman (ed.), Best of GOWANUS: New writing from Africa, Asia and the Caribbean, 208–214. New York: GOWANUS Books.

Sociolinguistics

705

Zaidi, Abbas 2011 Ethnolinguistic vitality of Punjabi in Pakistan: A GIDS approach. Journal of Language and Literature Review 1(1): 1–17. Zakharyin, Boris 2013 Sanskrit and Pāli influence on languages and literatures of ancient Java and Burma. Lingua Posnaniensis 4(2): 151–158. Zia, Muhammad Amin 2010 Ṣiṇā-Urdū luγat [Shina-Urdu dictionary]. Gilgit: Zia Publishers. Zoller, Claus Peter 2005 A grammar and dictionary of Indus Kohistani, Vol. 1: Dictionary. Berlin/New York: Mouton de Gruyter. Zubair, Shirin 2010 Not easily put-downable: Magazine representations and Muslim women’s identities in Southern Punjab, Pakistan. Feminist Formations 22(3): 176– 195. http://muse.jhu.edu/journals/feminist_formations/v022/22.3.zubair.pdf (accessed 25 Nov. 2014) Zvelebil, Kamil 1964 Spoken languages of Tamil Nadu. Archiv Orientálni 32: 237–264.

7 Indigenous South Asian grammatical traditions

Edited by Hans Henrich Hock

7.1. Introduction

By Hans Henrich Hock

Most of this volume focuses on modern linguistic scholarship on South Asia. There is, however, a long and distinguished indigenous tradition of grammatical scholarship, both in Sanskrit and in the Dravidian languages. Far from being of interest only to historians of linguistics, philologists, and other “antiquarians”, the tradition has important insights to offer to modern linguists. Like the work of, say, Delbrück, Speijer, Wackernagel, Whitney; or Krishnamurti, Masica, G. Anderson, Genetti; or Bhatt, Davison, Subbarao, the work of the indigenous grammarians must be considered earlier scholarship, to be argued against where appropriate, but not to be neglected.

The two contributions to this chapter deal with the best known and most influential streams of scholarship, that of the Sanskrit grammarians (especially Pāṇini) and the grammars of various Dravidian languages (especially Tolkāppiyam). At least two other grammatical traditions can be recognized. One is an indigenous Tibetan grammatical tradition, which has been discussed by Miller (1976, 1993, 2000) and Verhagen (2000a, b); see also Tournadre 2010. The other is the Sidat San̆garā, a 13th to 14th century grammar of Sinhala, edited and translated by Gair and Karunatillake (2013).

For Hindi, Bhatia (1987) suggests that the use of Sanskrit models of grammatical description is a relatively recent phenomenon and that the foundations of Hindi grammar are to be sought in western, colonial grammars. The case may be similar for the grammars of other modern Indo-Aryan languages. Further research is needed.

7.2. Indo-Aryan grammatical traditions (Sanskrit and Prakrit)

By Hans Henrich Hock

7.2.1. Introduction

Concern with ritual linguistic purity is found as early as the Rig Veda (e.g. 10.71.9). Late Vedic offers glimpses of a developing grammatical terminology, e.g. kurvat ‘present’, kariṣyat ‘future’, cakṛvat ‘past’ (Kauṣītaki Brāhmaṇa 22.3), but early grammatical activity remains largely hidden. This includes the development of the concept “root”. Early on, the third singular present must have been used to designate the root, such as juhoti ‘offer(s)’ = √hu ‘offer’. Traces are preserved in Pāṇini’s grammar. A plausible intermediate form is the ta-participle (e.g. hu-ta), which employs the weakest root form. While speculative, this account would explain the fact that the final stage, the familiar monosyllabic root, normally presents the weakest root form (e.g. √hu, rather than the fuller form ho of ju-ho-ti).

A strand of linguistic activity that can be traced better is Vedic phonetics and sandhi analysis. The starting point must have been the development of a padapāṭha (‘[prepausal] word-recitation’) beside the saṁhitāpāṭha (‘continuous recitation’) that is used in the ritual. Recitation of individual words in prepausal form undoes sandhi and so produces an invariant representation (e.g. agníḥ for agnís, agníṣ, agníś, …). The next stage was the development of rules converting padapāṭha into saṁhitāpāṭha. Together with detailed phonetic observations, as well as other elements (e.g. statements on recitational practices), this tradition culminated in the Prātiśākhyas (6th–5th c. BC?), phonetic/phonological minigrammars associated with different branches of the Veda. The later Śikṣās, influenced by the grammatical tradition, tend to replace the phonetic approach of the Prātiśākhyas with a phonological one (Hock 2014; see also this volume, 3.1.2).

The etymological tradition, which explains word meanings through synchronic derivation, is assumed to have started with Śākaṭāyana, who proposed that all words are derived from verbal roots. This view is accepted in the first extant text, Yāska’s Nirukta (5th–4th c. BC?), where it is contrasted with an opposing view, attributed to Gārgya (Matilal 1990). Adopting Śākaṭāyana’s view required postulating “artificial roots” to account for words not derivable from verbal roots, such as √aś ‘run’ for aśva ‘horse’.

A distinct tradition of grammar effectively starts with Pāṇini (4th c. BC?). In principle it postdates the Prātiśākhyas and Nirukta, but the grammatical, phonetic, and etymological traditions continued interacting with each other; an exact relative chronology of the extant texts is difficult to establish. Further, Pāṇini refers to numerous predecessors whose grammars have not survived, presumably because they were superseded by Pāṇini’s. (Cardona 1976: 146–148.)

Pāṇini’s grammar is composed in “sūtra” (i.e. highly compressed) style and presumably was accompanied by an autocommentary, which has been lost. Interpretation of Pāṇini depends on later commentaries, especially the Mahābhāṣya of Patañjali (2nd c. BC), which also contains comments by Kātyāyana (3rd c. BC?) and other grammarians, with Patañjali generally upholding Pāṇini against alternative views. The commentatorial tradition continues to the present. (Cardona 1976: 276–293.)

Later grammars make various adjustments to Pāṇini’s approach and/or extend it to Pali and the Prakrits. Most of these are composed in Sanskrit, except for the Pali grammars. (See 7.2.3.5.)

Although primarily focused on other issues, various philosophical traditions also address points of grammar. Bhartṛhari’s Vākyapadīya (5th c. AD) is the major representative of language philosophy. (See 7.2.4.)
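The step from padapāṭha to saṁhitāpāṭha mentioned above can be illustrated with a minimal Python sketch. It covers only one textbook pattern, the replacement of a word-final visarga (ḥ) by a sibilant before a voiceless palatal, retroflex, or dental stop, which yields the variants agniś, agniṣ, agnis. It is not a rendering of any Prātiśākhya rule, it ignores accent, and it leaves every other context untouched; the names `VISARGA_BEFORE`, `initial_stop`, and `join_padas` are assumptions made for illustration.

```python
import unicodedata

# Hypothetical rule table (illustration only): the sibilant that replaces
# a final visarga before a given word-initial voiceless stop.
VISARGA_BEFORE = {
    "c": "ś", "ch": "ś",   # palatal stops
    "ṭ": "ṣ", "ṭh": "ṣ",   # retroflex stops
    "t": "s", "th": "s",   # dental stops
}


def initial_stop(word):
    """Return the triggering initial stop of `word`, or None.

    Two-character (aspirated) spellings are tried before single characters.
    """
    for length in (2, 1):
        if word[:length] in VISARGA_BEFORE:
            return word[:length]
    return None


def join_padas(padas):
    """Turn a list of prepausal word forms into a continuous string.

    Only final visarga before the stops listed above is adjusted;
    all other junctures are copied unchanged.
    """
    padas = [unicodedata.normalize("NFC", p) for p in padas]
    out = []
    # Pair every word with its successor; the last word is paired with "".
    for current, following in zip(padas, padas[1:] + [""]):
        stop = initial_stop(following)
        if current.endswith("ḥ") and stop is not None:
            current = current[:-1] + VISARGA_BEFORE[stop]
        out.append(current)
    return " ".join(out)


print(join_padas(["agniḥ", "ca"]))
print(join_padas(["agniḥ", "tam"]))
```

The point of the sketch is the direction of the mapping: one invariant prepausal form is stored, and the contextual variants are produced by rule when the words are joined.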

7.2.2. Phonetics

The Prātiśākhyas present the first known articulatory classification of speech sounds. Insights include the following. Voice and voicelessness are defined as involving approximation or openness of the glottis respectively. The breathy voice of “voiced aspiration” and voiced h is recognized as involving a glottis setting intermediate between open and approximated. Nasals are classified as nasalized stops. The articulatory difference between stops and fricatives is clearly recognized. The entire sound system is organized in terms of place and manner of articulation; see 1.3.1.1, Table 1.1. (This organization is adopted for the order of letters when writing is introduced.)

In addition to articulatory classification, there are developments toward a resonance-based system, according to which the velars, glottals, and a-vowels share neutral resonance and hence are classified by one term (kaṇṭhya, lit. ‘glottal, guttural’). (Cardona 1986.)

The later Śikṣās and grammatical tradition tend to replace the phonetic classifications of the Prātiśākhyas with phonological ones, such as “voiced” for both voiced and “voiced aspirated” stops, or the classification of r as retroflex, rather than alveolar (like retroflex ṣ, it triggers retroflexion of n to ṇ). (Hock 2014.)

The Sanskrit tradition exerted a significant influence on early western phonetics. Thus, terms like “voiced” and “voiceless” are calques of Skt. ghoṣin/ghoṣavat ‘having voice’ and aghoṣa ‘without voice’. However, the influence came through western grammars based on the later grammatical tradition; hence stops like bh are classified as “voiced aspirates”, ignoring their breathy-voice articulation. Through the same channel, the phonetically inaccurate retroflex classification of r entered western accounts of Sanskrit. (Hock 2014.)

7.2.3. Grammar

Pāṇini’s grammar, the Aṣṭādhyāyī,¹ is the crown jewel of the older Indo-Aryan grammatical tradition. Most of this section therefore focuses on this grammar. Later developments are briefly discussed in 7.2.3.5.

7.2.3.1. General characteristics of Pāṇini’s grammar²

Pāṇini’s work is recognized as a śabdānuśāsana, i.e. a grammar accounting for the correct formation of WORDS. Its major focus is phonology and morphology. Some issues of syntax are covered, but many aspects are not, or only implicitly. These include the notions “subject” and “predicate”, “sentencehood”, and “subordination”.

¹ Editions by Böhtlingk (1887), Vasu (1897), Katre (1989).
² For recent discussions of the architecture of Pāṇini’s grammar see Cardona 2009, Kiparsky 2009; and see Raster 1993 for syntax and Cardona 2000 for morphology.

Notice that in the Pāṇinian framework, “grammaticality” is defined in terms of formal correctness; the grammar does not rule out semantically anomalous structures. Thus, the Sanskrit version (1) of Chomsky’s Colorless green ideas sleep furiously would be perfectly grammatical.

(1) avarṇā              haritā          matayaḥ        sakopaṁ    svapanti
    colorless.NOM.PL.F  green.NOM.PL.F  idea.NOM.PL.F  furiously  sleep.PRS.3PL

Pāṇini’s grammar is generative in the technical sense (but not transformational). The need for a generative account is explicitly discussed, and justified, in Patañjali’s Mahābhāṣya (Paspaśā 7).

The major object of the grammar is the spoken language that Pāṇini was familiar with. The language of the Vedic tradition receives more cursory treatment. (See also 7.2.3.2 below.)

The grammar was most likely composed and transmitted orally. Rules such as a → b / c _ d therefore are stated in an oral metalanguage (rather than graphically): ‘a-genitive b-nominative c-ablative d-locative’; see (3) below. The most recent attempt to deny orality is Bronkhorst 2011; but see Deshpande 2011 and Witzel 2011.

The grammar presupposes the insights of the Prātiśākhyas and the etymological tradition (in some version). However, in order to make phonological generalizations, Pāṇini reorganizes the sound system as in (2), the “pratyāhāra” or “Śiva” sūtras. Grammatical markers (here signaled by capitals) make it possible to refer to classes of elements. Thus iK indicates the vowels starting with i and ending with the last vowel before the marker K, i.e. i-, u-, ṛ-, and ḷ-vowels; these are the vowels that have non-syllabic counterparts. Note that only short vowels are listed, since long, short, nasalized (etc.) vowels generally behave the same in sandhi. Where length distinctions are relevant, this is indicated by grammatical markers following the vowel. See (3) for an example of how Pāṇini uses the pratyāhāra sūtras to make phonological generalizations.

(2) a i u Ṇ
    ṛ ḷ K
    e o Ṅ
    ai au C
    h y v r Ṭ
    l Ṇ
    ñ m ṅ ṇ n M
    jh bh Ñ
    gh ḍh dh Ṣ
    j b g ḍ d Ś
    kh ph ch ṭh th c ṭ t V
    k p Y
    ś ṣ s R
    h L

(3) iK-o            yaṆ             aC-i       (6.1.77)
    iK-GEN          yaṆ.NOM         aC-LOC
    {i, u, ṛ, ḷ} →  {y, v, r, l}³   / __ V
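The way the markers in (2) pick out classes of sounds, and the way rule (3) uses those classes, can be sketched in Python. This is a minimal illustration under stated assumptions, not a model of the Aṣṭādhyāyī: the function names `expand` and `iko_yan_aci` are hypothetical, a class is read off as running from the named sound to the first occurrence of the marker at or after it, only the short vowels listed in (2) are handled, and the pairing of vowels with their non-syllabic counterparts is done by position.

```python
# The fourteen sūtras of (2) as (sounds, closing marker) pairs.
SIVA_SUTRAS = [
    (["a", "i", "u"], "Ṇ"),
    (["ṛ", "ḷ"], "K"),
    (["e", "o"], "Ṅ"),
    (["ai", "au"], "C"),
    (["h", "y", "v", "r"], "Ṭ"),
    (["l"], "Ṇ"),
    (["ñ", "m", "ṅ", "ṇ", "n"], "M"),
    (["jh", "bh"], "Ñ"),
    (["gh", "ḍh", "dh"], "Ṣ"),
    (["j", "b", "g", "ḍ", "d"], "Ś"),
    (["kh", "ph", "ch", "ṭh", "th", "c", "ṭ", "t"], "V"),
    (["k", "p"], "Y"),
    (["ś", "ṣ", "s"], "R"),
    (["h"], "L"),
]


def expand(start, marker):
    """List the sounds from `start` up to the first `marker` at or after it."""
    sounds = []
    collecting = False
    for members, closing in SIVA_SUTRAS:
        for sound in members:
            if sound == start:
                collecting = True
            if collecting:
                sounds.append(sound)
        if collecting and closing == marker:
            return sounds
    raise ValueError(f"no class {start}{marker} in the list")


def iko_yan_aci(first, second):
    """Sketch of (3): a final iK vowel becomes the matching yaṆ sound before an aC vowel.

    Simplifications (assumptions of this sketch): short vowels only,
    and final ai/au are excluded so that their last letter is not
    mistaken for a simple i or u.
    """
    ik, yan, ac = expand("i", "K"), expand("y", "Ṇ"), expand("a", "C")
    respective = dict(zip(ik, yan))  # i→y, u→v, ṛ→r, ḷ→l, paired by position
    final = first[-1]
    if (
        final in respective
        and not first.endswith(("ai", "au"))
        and any(second.startswith(vowel) for vowel in ac)
    ):
        return first[:-1] + respective[final] + second
    return first + " " + second


print(expand("i", "K"))
print(expand("y", "Ṇ"))
print(iko_yan_aci("dadhi", "atra"))
```

Because the marker Ṇ occurs twice in (2), the sketch has to fix which occurrence closes a class; taking the first one after the starting sound gives y, v, r, l for yaṆ, which is what rule (3) needs.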

³ Sūtra 1.3.10 provides for “respective” substitution, so that e.g. y substitutes for i, v for u.

As regards the issue whether all forms can be derived from verbal roots (7.2.1 above), Pāṇini steers a middle course, accepting “artificial roots”, but listing them in a separate component of the grammar.

The overall structure of the grammar is as follows. The core consists of eight books: 1. Technical terms, metarules, “kārakas” (see 7.2.3.3); 2. Nominal composition, surface morphology; 3. Primary derivation; 4. and 5. Secondary derivation; 6. and 7. Accentology and lexical phonology; 8. External sandhi. To these are added the pratyāhāra sūtras (2), the gaṇapāṭha (list of stems, grammatically organized), the dhātupāṭha (list of roots, grammatically organized), and (presupposed) the uṇādi sūtras (list of “artificial roots”).

Significantly, the various components of the core are organized thematically, not in the order of application during a derivation. In fact, derivations typically draw on rules distributed over various parts of the grammar. Traditionally, this is accomplished by accessing rules from the text as stored in memory, much like accessing data in a computational data bank.

7.2.3.2. Rule application and interaction

Except in certain sections, where strict ordering is specifically prescribed (e.g. in the “kāraka” section), the application of rules follows from general principles. Unfortunately, these are not specified in Pāṇini’s grammar, but are mentioned in later commentaries. One of the principles is uncontroversial, the “elsewhere” principle: A more specific rule takes precedence over a competing general rule. A second principle, that of bracketing, is the subject of continuing controversy, mainly between Cardona who accepts it (e.g. 1977: xvi–xxiii) and Kiparsky who does not (e.g. 1991). According to this principle, a rule that applies in the closer domain (“antaraṅga”) takes precedence over one that applies in the broader domain (“bahiraṅga”).

Pāṇini also recognizes optionality of rule application, using the terms anyatarasyām (lit. ‘either way’), vā (‘or’), and vibhāṣā (‘alternative’). Kiparsky (1979) tries to distinguish different values of these terms, supporting his analysis with the evidence of late Vedic texts roughly contemporary with Pāṇini. A problem is the apparent existence of dialect differences between Pāṇini and the mainstream of Vedic texts; see 1.3.1.4.1. In references to Vedic, Pāṇini often uses bahulam ‘variously’, a term that apparently serves to indicate variation without going into details.

7.2.3.3. Abstract and overt case and related issues

As noted, Pāṇini’s grammar does not operate with the notions “subject” and “predicate”. Instead, it works with two distinct categories of what may be called “case”, as well as an abstract set of verbal affixes (“LA-kāras”), which together make it possible to (nontransformationally) relate active and passive.

The abstract case system consists of the “kārakas”, which define different “roles” of participants in an action. The status of the kārakas has been the topic of some debate. The initial definition of kārakas in semantic terms, as in (4a), might suggest something like “semantic case”; and some scholars have accepted this idea (e.g. Sinha 1973). However, subsequent rules typically restrict the definition in purely formal terms, as in (4b). The first definition, therefore, may be considered an initial approximation (in terms that are easily understood), but the “real” definition consists of the totality of defining sūtras and is operational, rather than semantic. Nevertheless, the basic semantic definition remains significant in the grammar. (See Cardona 1976: 216–221 with references.)

(4) a. karmaṇā yam abhipraiti sa saṁpradānam (1.4.32)
       ‘Whom one aims at with the karman [see below], that is SAṀPRADĀNA (which is generally realized as dative)’
    b. rucy-arthānāṁ priyamāṇaḥ (1.4.33)
       ‘With (roots) meaning ruc “please”, (the person) pleased (is saṁpradāna).’

The grammatically most important kārakas are “karman” and “kartṛ”, loosely translatable as ‘patient’ and ‘agent’. However, the semantic notion of “agency” is irrelevant for defining kartṛ. Thus, an axe is kartṛ in (5), if a speaker conceives of it as “agent”, even though by the semantically-based kāraka rules it should be classified as karaṇa ≈ ‘instrument’ (there is no rule deriving its kartṛ-status from a more basic “instrumental” one). Kartṛ effectively designates an underlying subject, and karman an underlying object.

(5) asir          vṛkṣaṁ         chinatti
    axe.NOM.SG.M  tree.ACC.SG.M  cut.PRS.ACT.3SG
    ‘The axe cuts the tree’

The terms karman and kartṛ are also relevant in verbal derivation: The abstract LA-kāras of transitive verbs may be either marked for kartṛ (active voice) or karman (passive). For intransitives, the option is kartṛ or bhāva (≈ action pure and simple; impersonal passive).

The “vibhakti” (surface-case) realization of kārakas depends on (negative) cross-reference of nominal and verbal marking. Kartṛ and karman are realized as instrumental and accusative respectively if not marked on the verb, but as default nominative if marked. LA-kāras are realized in terms of appropriate finite or non-finite stems and agreement suffixes. See the simplified schemas in (6).

(6) a.                    asi           vṛkṣa         chid
                          axe.KARTṚ     tree.KARMAN   cut.LA.KARTṚ
       vibhakti rules     ------        ACCUSATIVE
       default vibhakti   NOMINATIVE
       LA-kāra rules                                  chi-na-t-ti
                          asir          vṛkṣaṁ        chinatti
                          ‘The axe cuts the tree.’

    b.                    asi           vṛkṣa         chid
                          axe.KARTṚ     tree.KARMAN   cut.LA.KARMAN
       vibhakti rules     INSTRUMENTAL  ------
       default vibhakti                 NOMINATIVE
       LA-kāra rules                                  chid-ya-te
                          asinā         vṛkṣaś        chidyate
                          ‘The tree is cut by the axe.’
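The negative cross-reference between verbal and nominal marking can be restated as a small Python sketch. It is an illustration of the schema just described, under simplifying assumptions: only kartṛ and karman are covered, the verb is taken to express exactly one of kartṛ, karman, or bhāva, and the function name `realize_cases` and its data layout are hypothetical rather than part of the Pāṇinian system.

```python
def realize_cases(roles, verb_marks):
    """Map each noun's kāraka to a surface case (vibhakti).

    roles      -- dict from noun stem to its kāraka: "kartṛ" or "karman"
    verb_marks -- what the LA-kāra expresses: "kartṛ", "karman", or "bhāva"
    """
    # Case used when the kāraka is NOT already expressed on the verb.
    case_if_unmarked = {"kartṛ": "instrumental", "karman": "accusative"}
    cases = {}
    for noun, role in roles.items():
        if role == verb_marks:
            # Already expressed by the verb: default nominative.
            cases[noun] = "nominative"
        else:
            cases[noun] = case_if_unmarked[role]
    return cases


roles = {"asi": "kartṛ", "vṛkṣa": "karman"}
print(realize_cases(roles, "kartṛ"))    # active
print(realize_cases(roles, "karman"))   # passive
```

With the verb marked for kartṛ the axe comes out nominative and the tree accusative; with the verb marked for karman the axe comes out instrumental and the tree nominative. When the verb marks bhāva, no noun is cross-referenced, so a kartṛ surfaces as instrumental.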

In a similar way, an impersonal passive corresponding to (1) can be derived by LA-kāra marking for bhāva; this produces the structure in (1’), which cannot be idiomatically glossed in English, but is perfectly grammatical in Sanskrit.

(1’) avarṇābhir           haritābhir       matibhiḥ        sakopaṁ    supyate
     colorless.INST.PL.F  green.INST.PL.F  idea.INST.PL.F  furiously  sleep.PRS.PASS.3SG

The notion kartṛ plays a role in a number of other contexts. For instance, the introduction of the converb affix Ktvā (where K is a grammatical marker) is governed by the restriction that the kartṛ of the converb and of the matrix finite verb must be identical (Pāṇini 3.4.19).

7.2.3.4. Agreement and the issue of sentencehood

Pāṇini takes care of agreement by the notion samānādhikaraṇa ‘having the same reference’. Elements having the same reference have the same case, number, gender, etc. Clearly, however, the notion samānādhikaraṇa requires a domain in which it applies, and that would have to be the sentence. Similarly, sūtra 8.1.28, which states that a finite verb preceded by another finite verb is accented, makes sense only if the verbs are in the same domain, namely the sentence. However, as noted, Pāṇini does not offer an overt definition of the sentence. These problems are addressed in Patañjali’s Mahābhāṣya, which cites a suggestion by Kātyāyana to define sentences as “ekatiṅ” (having just one finite verb), only to show that this suggestion cannot account for the kind of structures addressed by Pāṇini’s sūtra 8.1.28. (See Deshpande 1991 for discussion.)

7.2.3.5. Later grammars: Sanskrit, Pali, Prakrit⁴

Within the Sanskrit tradition, two different strands need to be distinguished. One consists of relatively late recasts of Pāṇini’s grammar, whose epitome is the Siddhāntakaumudī of Bhaṭṭojī Dīkṣita (17th c. AD). In addition to providing extensive exemplification, these grammars tend to reorganize the grammar and to “simplify” it, e.g. by collapsing the kāraka and vibhakti rules.

The second strand starts with the Kaumāralāta (early 4th c. AD?) and the slightly later Kātantra. These grammars, too, recast Pāṇini’s grammar, but add a number of interesting features. First, they do not adopt Pāṇini’s pratyāhāra or Śiva sūtra arrangement of the sound system (see (2) above), but instead revert to (or continue?⁵) the Prātiśākhya list of speech sounds. Moreover, instead of being ancillary to the grammar (as with Pāṇini), the list is incorporated as the beginning of the grammar. Second, being composed by and for Buddhists, they include coverage of Buddhist Sanskrit, which often differs from Pāṇinian Sanskrit as in bhāveti vs. bhāvayati ‘causes to be’, under the label ārṣa ‘belonging to the ṛṣis [of Buddhism]’. Third, instead of the abstract grammatical elements of Pāṇini’s grammar, such as LAṄ ‘imperfect’, they use descriptive terms, such as hyastanī ‘yesterday’s; past’. Fourth, in some cases they follow Pāṇini in spite of the fact that Buddhist Sanskrit grammar differs from Pāṇini’s (e.g. in the use of the three past tenses). Finally, they reflect the influence of writing, as in the label bindu for the anusvāra (after the droplike anusvāra symbol in Brāhmī script).

⁴ For comprehensive discussion see Scharfe 1977a.
⁵ Burnell (1875) proposed that the Kātantra tradition (as well as Tolkāppiyam, for which see Section 7.3) goes back to a pre-Pāṇinian “Aindra” school. For discussion and critique see Cardona 1976: 150–151.

The Kātantra in turn influenced Kaccāyana’s Pakarana (after 7th c. AD), written in Pali, which initiates a tradition of Pali grammars. Like the Kātantra, the Pakarana starts with the Prātiśākhya list of speech sounds (minus ṛ and ḷ, which do not exist in Pali), and uses descriptive terminology such as hiyattanī (Skt. hyastanī). To some extent it departs from the Kātantra, by introducing new abstract elements such as GHA for feminine ā-stems. A later Pali grammar, Moggallāna’s Saddalakkhana (12th c. AD), differs from the Pakarana on several points, such as including short ĕ, ŏ among the vowels, which are needed for Pali, but not for Sanskrit (and hence not included in the Prātiśākhya list). The most influential Pali grammar, at least in Southeast Asia, is Aggavaṁsa’s Saddanīti (17th c. AD). (See Pinde 1995 for the Pali tradition.)

The Prakrit grammarians (Vararuci, Hemacandra, and others) essentially derive Prakrit from Sanskrit, but also add lists of deśī words (words not derivable from Sanskrit). While the derivation of Prakrit from Sanskrit may suggest a historical approach, this was not the intent of the Prakrit grammarians. The grammars are written in Sanskrit, with the possible exception of Caṇḍa’s grammar of “ārṣa” Prakrit (Ardhamāgadhī); see v. Hinüber 2001: 89 with references, as well as Nitti-Dolci 1938.

7.2.4. Philosophy and grammar

A number of philosophical systems have addressed issues of linguistic interest. A common concern is whether words, i.e. the words of Sanskrit (in contrast to, say, Prakrit), are eternal, an issue raised already in Patañjali’s Mahābhāṣya. The school of Mīmāṁsā (1st c. BC +) accepts the eternity of words and their meanings, and so does Bhartṛhari in his Vākyapadīya (5th c. AD). This view is not accepted by Nyāya (2nd c. BC +) and the Buddhist philosophers (2nd c. AD +), who hold that words and their meanings are conventional.

Another issue is the question of sentence meaning. The most radical perspective is that of Bhartṛhari, who considers grammarians’ concepts such as word and speech sound to be unreal abstractions; the meaning of a sentence arises as a “sphoṭa” (‘spark’) upon hearing it. Bhartṛhari’s perspective is questioned, in different ways, by the other philosophical schools, but the Prābhākara branch of Mīmāṁsā takes an intermediate position, arguing that while individual words contribute meaning, full meaning arises only from the connection of words within the sentence.

More specific contributions include the following. Mīmāṁsā, whose major goal was the interpretation of Vedic ritual, focused on the analysis of Vedic injunctions and argued that the core of the injunction lies in the verb (expressing the ritual action to be taken), and the major element in the verb is the optative suffix (indicating that the ritual action must be undertaken).

Nyāya contributes the concepts of “subject” (uddeśya) and “predicate” (vidheya). Sentence interpretation requires four major elements: Sannidhi (contiguity of words), ākāṅkṣā⁶ (the fact that there is an ‘expectancy’ between words), yogyatā (semantic appropriateness), and tātparya (speaker’s intention).⁷ In a free-word-order language, the notion sannidhi creates problems, but these can be finessed by means of ākāṅkṣā. Of greater concern is yogyatā. Not only does it rule out sentences such as (1) and (1’), but also the Pāṇinian analysis of structures like (5) and (6), since considering an axe to be a kartṛ, i.e. AGENT, is semantically inappropriate. A formal grammar that would license such structures would have to be very different from Pāṇini’s, but such an alternative grammar is not proposed.

⁶ This concept is already found in Patañjali’s Mahābhāṣya and also in Mīmāṁsā. Pāṇini (3.2.114, 8.1.35) uses s(a-)ākāṅkṣā ‘with expectancy’ in reference to the relation between conditional clauses.
⁷ The first three notions also occur in Bhartṛhari’s Vākyapadīya (e.g. 1.100, 2.337, 2.460).

An important contribution of Buddhist philosophy is the theory of apoha: The meaning of a word is definable only in terms of what it does NOT designate. (See e.g. Hattori 2000.)

7.3.

Tamil and Dravidian grammatical traditions By E. Annamalai

7.3.1.

Introduction

The grammatical tradition described in this section is indigenous in the sense of its ‘cultural embeddedness’ (Kniffka 2001: 1), which includes its epistemological grounding in the country of its origin. It has been part of the mainstream intellectual pursuit of language and related matters for over two millennia and is practiced to this day in educational institutions from schools to universities. In the modern period, it is indigenous in the sense of standing in contrast to the western grammatical tradition, with which it is in a simultaneously competing and complementing relationship.

7.3.2.

The grammatical traditions of Telugu, Kannada, and Malayalam

The Dravidian grammatical tradition is pluralistic in terms of the four literary Dravidian languages. With regard to the grammars of Telugu and Kannada, the earliest grammarians imported the Sanskrit grammatical tradition, especially of Pāṇini, to describe these languages, much like the grammarians of Prakrit. These grammars were written in the Sanskrit language, as Prakrit grammars were. Their writers conceptualized their languages to be relatable, if not related, to Sanskrit and, as a consequence of this and of the dominance of Sanskrit as the medium of philosophical inquiry of their times, they used, conceptually and terminologically, the grammatical model developed for Sanskrit to describe the grammar of their languages. The Sanskrit language itself was considered to be the metalanguage, as much as the conceptual and analytical apparatus of grammatical analysis, for the description of these Dravidian languages. This was the case whether their grammars were written in the target Dravidian language or in Sanskrit. The language variety chosen for description was literary. This variety with its preponderance of Sanskrit and Prakrit loans (words and compounds) and with its use of Sanskrit literary forms, themes, and conventions made the choice of the Sanskrit descriptive model as well as its rules less problematic.

The tradition claims that the first grammar of Telugu is the Āndhra Śabda Cintāmaṇi and that it was written by Nannaya Bhattaraka of the 11th century CE, in Sanskrit. Scholarly opinion, however, holds that it belongs to the 16th century, and that the first grammar of Telugu is the Āndhra Bhāṣā Bhūṣaṇamu written in the 13th century in Telugu by Kētana. The later supplementing work of this grammar by Atarvaṇa is known by the name Vikṛti Vivēkamu ‘(Body of) knowledge of the modified’. Nannaya’s grammar considers Telugu to be a vikṛti, a modified form (of Sanskrit, like a Prakrit) and goes on to classify the words of Telugu into Sanskrit-derived words, further categorized into those with (tadbhava) and without (tatsama) phonological change, and native words, further categorized into common (desya) and rustic (grāmya). Description of Telugu is thus made in relation to Sanskrit in order to identify the similarities and explain the differences. Bāla Vyākaraṇamu, also written in Telugu, but in the Sanskrit tradition and terminology, by Parvatsu Cinnaya Sūri in the 19th century, eclipsed the earlier works. (Purushottam 1996, 1997.)

Similar adherence to the Sanskrit tradition is evidenced in Kannada, but the grammar is contextualized within the description of poetics, not just in the choice of the literary language for description. The first full-fledged grammatical work is Karnāṭaka Bhāṣā Bhūṣaṇa written in Sanskrit with a short commentary by Nagavarma II, a Jain, in the 12th century. He mentions a few earlier works of grammar, which are lost. He also wrote a work on poetics (Kāvyāvalōkana) in Kannada, of which grammar (Śabdasmriti) is a part.
This work is preceded by the first Kannada work in poetics called Kavirāja Mārga of the 9th century, which has some description of grammar to complement poetics. The second full-fledged grammar is Śabdamaṇidarpaṇa of Kēśirāja written in the 13th century (Kulli 1976), which is exceptional in its choice of Kannada to write the grammatical description, though it is the traditional medium for describing poetics. The third grammatical work is Śabdānuśāsana of the 17th century written by Bhaṭṭākaḷaṅka in Sanskrit. Irrespective of the medium, all grammars follow the Sanskrit model for theory and analysis (Kulli 1991, 1997). For them, Sanskrit grammar, in its twin aspects of external codification and internal structure, is universally applicable. These grammars of Telugu and Kannada borrow nothing from the Tamil grammatical tradition codified in Tolkāppiyam, which does not share the assumption of the universality of Sanskrit grammar, even though this alternative tradition had been practiced for about a millennium by the time of the first grammars of Telugu and Kannada and was built around another southern language. Nevertheless, they, especially the Kannada grammars, absorb the early Tamil tradition of taking grammar to be not just a description of the literary language but also a part of the description of literature itself.


The Malayalam grammatical description is similar to that of Telugu and Kannada in approach and modeling, though its literary and grammatical heritage was common with Tamil. Before becoming an independent language in the early centuries of the second millennium, its speech area was part of the Tamil land and its speakers were participants in the literary and grammatical activities of Tamil. For sociopolitical reasons, Chera (Kerala) Tamil claimed autonomy from Tamil, fueled by the literary and lexical impact of Sanskrit. Its first grammatical description is in Lilātilakam written in Sanskrit in the 14th century with a commentary by the author. It is primarily a book on poetics. The grammatical section of this work shows familiarity with the Tamil grammatical tradition and its description of the phonology of Dravidian word formation reflects the sandhi rules of Tolkāppiyam and Nannūl, an elder contemporary of Lilātilakam. It even cites the former in support of its analysis and refers to the latter by name; it takes examples from Tamil grammatical commentators to illustrate a point or to differentiate Malayalam from Tamil (Elayaperumal 1972). One objective of this work is to endow linguistic autonomy to Kerala Bhasha (which around the 16th century came to be known as Malayalam), highlighting the lexical and some grammatical differences with the rest of Tamil on the east and the north (Shanmugam 1992a). Though the linguistic content leads Lilātilakam to draw from Tamil sources for authority, its analytical model and terminology are of the Sanskrit tradition (Nampoothiry 1972). Its grammatical part describes the language of literature called maṇipravāḷa, which is highly Sanskritized lexically, metrically, and thematically and is distinct from pāṭṭu, which is the literary residue of the Tamil heritage held on to by the non-elite segment of society (Ezutaccan 1975, 1997). 
The description of literary rhetoric is entirely Sanskritic with no use for the poetics of Tolkāppiyam. In spite of the shared ancestry with the Tamil tradition, the tendency to project Malayalam grammar onto the Sanskrit tradition is strong in Malayalam grammatical practice. The name of the 19th century grammar, Kērala Pāṇiniyam by Rāja Rāja Varma, shows this, in spite of the modern influence on it from the model of European grammatical description introduced by missionary grammarians.

7.3.3.

The Tamil tradition

The Tamil grammatical tradition is different from the tradition of these other Dravidian languages. The beginnings of the Tamil literary tradition in the early centuries before the Common Era are distinct from Sanskrit and have claims to literary independence in terms of temporal themes and indigenous theory. Tamil country was politically divided, but each part was sovereign under a Tamil-speaking ruler. Tamil literati claimed parity for their language with Sanskrit; there was a total absence of a “Prakrit mentality” about Tamil. No Tamil grammar was ever written in Sanskrit, in contrast to the grammars of Prakrit and other Dravidian languages. The Tamil grammatical tradition can be taken to represent the Dravidian tradition in the sense that it claims to parallel the Sanskritic tradition and that it is informed by the structure of a Dravidian language. Parallel tradition does not mean there was no contact between the two traditions (for a comparison of the two, see Meenakshi 1997, 1999 and Ananthanarayana 1984–1986: 470–490). Tamil grammarians, including Tolkāppiyar, the author of Tolkāppiyam ‘the ancient book’, who was probably a Jain, were cognizant of Sanskrit grammatical traditions including Pāṇini’s. Some of them were likely bilingual in Tamil and Sanskrit in spite of the fact that there is no grammar of Sanskrit written in Tamil until the modern period. Tamil borrowed a limited number of words from Sanskrit, and these few loans were assimilated to the phonological structure of Tamil; the preferred mode was to create semantically parallel words where necessary.

Tolkāppiyam mentions Sanskrit⁸ as one of the sources of words found in the language of literature, the other kinds of words being the regional lexical alternates (ticai-c col ‘words of (geographical) directions’) (Chevillard 2008: 21–50), semantically “deviant” words (tiri col ‘words deviant [from the norm]’ such as homonyms and synonyms) and (semantically) normal words (iyaṟ col ‘normal words’). The grammarians of other Dravidian languages also classify words (Shanmugam 1992a: 145–147), as mentioned earlier, but only by the language of their origin and the region of their use. Lexical classification in Pāṇini’s grammar of Sanskrit, on the other hand, is limited to identifying chandas (Vedic) and bhāṣā (spoken-language) words for the operation of some grammatical rules. The different ways of classifying words show different perceptions of the purpose of grammar. For Tolkāppiyar, the basis for the classification is the difference in the knowledge needed for getting the meaning of words in poetry.
⁸ Tolkāppiyar does not use the word Sanskrit, but uses the word vaṭamoẓi ‘northern language’ (in opposition to tenmoẓi ‘southern language’ that refers to Tamil at later times), which must include Sanskrit and Prakrit, the latter of which came in contact with Tamil earlier through Jain traders and monks. Commentators differ in the interpretation of the kinds of words identified by Tolkāppiyar; a full discussion of this can be found in Shanmugam 1984: 64–113.

When there are parallels between the Tamil and Sanskrit traditions, modern scholars tend to conclude that the Tamil grammarians borrowed the concepts, analysis, or terminology from Sanskrit grammars or even translated sūtras from Sanskrit works. (Subrahmanya Sastri 1934 is an early example; for an alternative view of parallelism and adaptation to fit the facts of Tamil, see Thirugnanasambandham 1992.) Borrowing and translation are misnomers when used to characterize every feature that is shared or similar. Tamil grammatical studies were part of the Indian epistemological enquiries on language with their similarities and differences, which are highlighted variously from time to time. This would explain the similarities found in Tolkāppiyam, written around the beginning of the Common
Era⁹ (on the basis of paleographic evidence; Mahadevan 2003), with Pāṇini’s Aṣṭādhyāyī, associated texts (Prātiśākhyas, Śikṣās etc.), as well as non-Pāṇinian grammars of Sanskrit.¹⁰ There is, however, no acknowledged evidence, it seems, that the Sanskrit grammarians were cognizant of the Tamil grammatical tradition including poetics, though some of them worked in the Tamil environment and probably were speakers of Tamil.¹¹ Explanation must be found in the political economy of knowledge in ancient India.

⁹ Though Tolkāppiyar is the earliest grammarian of Tamil, he refers to some analyses as ‘so they say, so say scholars’. These anonymous references could be to earlier grammarians or to the evolving contemporary schools of grammatical thought. It will be illuminating to investigate if any of these references relate to any grammarian in the Sanskrit tradition and to suggest a common tradition.

¹⁰ Uncertainty about the time of the authors is a problem for suggesting the influence of one on the other. Tolkāppiyar and Bharata are an example. The similarities between the former’s meyppāṭu and the latter’s rasa are striking, but they could still be misleading. Scholars have argued that the significations of these two concepts are very different.

¹¹ The literary compositions themselves in Sanskrit and Prakrit share some literary conventions and themes with Tamil (Hart 1976). Sanskritic literary theory is constructed from such literary works. There are rare references in modern scholarship to traces in Sanskrit literary theory of the poetic theory of Tamil as enunciated in Tolkāppiyar’s chapter on constructing poetry (ceyyuḷiyal), the largest chapter in the entire work (Rajagopalan 1968). Tolkāppiyar makes the point that the sequence of prosodic units must be meaningful to constitute a poem (Shanmugam 2006). The chronologically later notion of avisatya in Sanskrit theory of literature probably expresses this point.

The nature of grammatical description derives from the grammarian’s notion of language. For Tolkāppiyar, language is a code to be analyzed to understand literature. Grammar describes the ordinary language the speakers use as well as the language which the literature creates to satisfy its aesthetics, which includes metaphors, metonyms, rhyme, prosody, and the like. The language of literature cannot be divorced from literary content and structure. The base language is called vaẓakku ‘the one in vogue’ and the literary language made out of it is called ceyyuḷ ‘the one constructed’. Because the subject matter of grammar is the language’s double construct and dual signification, Tolkāppiyam is a three-part grammar consisting of the description of sounds (eẓuttu) in a word, description of words (col) in a sentence, and description of the literary form and substance (poruḷ). The Sangam poetic categories of akam ‘(poems of) the interior’ and puṟam ‘(poems of) the exterior’ are in a sophisticated symbolic language and these poems cannot be understood without an understanding of this language. To understand this language of literature, knowledge of the ordinary language is a prerequisite. This knowledge is codified in the first two parts of the grammar called eẓuttatikāram ‘chapter on sound’, on the production of sounds, their distribution in a word, and their changes in a sequence of grammatical and lexical forms (Shanmugam
2001),¹² and collatikāram ‘chapter on word’, which is on the morphological structure of words in isolation and in relation to others in a sentence. This is the foundation of the Tamil grammatical tradition, which sets it apart from the Pāṇinian tradition with its concentration on the middle part of the above three. The theory of sound production and distribution and the theory of literary composition and interpretation were not just parts of language studies, as in the Sanskrit tradition, but were parts of the grammatical study in the Tamil tradition. The purpose of grammar for Tolkāppiyar was to help understand and interpret literature (only poetry in his times) as a double-layered sign system. Tolkāppiyar’s attention to the written signs (letters of the alphabet), not found in Pāṇini, is a reflection of the literate literary culture as well as the need to codify the emerging writing system of Tamil. An extension of the latter is his observation that loan words from Prakrit (and Sanskrit) are not written with letters from outside the Tamil alphabet (i.e. with the letters of Prakrit (and Sanskrit)), which has been a characteristic of literary Tamil but not of the co-existing documentary Tamil of inscriptions (Subbarayalu 2009).

¹² It may be noted that a writing system for Tamil adapting the Brahmi script to its system of phonology had emerged and was becoming standardized during the time of Tolkāppiyam. Defining the sound structure of words was very important for the grammarian.

Tolkāppiyar did not construct a metalanguage to have his grammatical rules compressed and ordered. His metalanguage is limited to the technical terms for grammatical concepts, many of which are valid for the description of the Tamil of modern days. Though the rules are called sūtras ‘strings’ by the commentators adopting the Sanskrit term, the Tamil term for the rules is nūṟpā ‘string-verses or grammar-verses’. These in Tolkāppiyam have the structure and rhythm of verses with occasional poetic flashes. Their language is flowing rather than being formulaic. This language probably reflects the purpose of the grammar mentioned above in the Tamil tradition.

The grammatical terms used by Tolkāppiyar are original Tamil words or semantic adaptations of Sanskrit terms. They reflect respectively the concepts developed specifically for Tamil and concepts applied to Tamil from the description of Sanskrit. Such a terminological practice is the opposite of what the grammarians of other Dravidian languages followed, which was to import them from Pāṇini along with his metalanguage. The commentarial tradition in Tamil, however, uses metalinguistic techniques for interpreting sūtras such as anuvṛtti (called māṭṭēṟṟu in Tamil), which is the technique of carrying over words from one sūtra to the following one(s); adhikāra, which is the context provided by the preceding sūtra to interpret the following one; and such others. The commentarial style is also shared between the two traditions, consisting of raising a question and answering it in an imagined teacher-student interaction. Tolkāppiyar sets up four classes of words, which are nouns (peyar ‘name’), verbs (vinai ‘activity’), grammatical words and morphemes (iṭai ‘the middle or
in-between’), and special words (uri ‘belonging (to special semantics)’, i.e. semantically “special” words like the uncommon, the onomatopoetic, the symbolic, the intensifiers, the collectives, synonyms, homonyms etc.).¹³ Of these, the first two are parts of speech in modern terminology and they are essential to describe sentence formation. The third is essential for word and sentence formation and the fourth for semantic interpretation.

¹³ Shanmugam 1986: 115–175 is a detailed discussion of the meaning of uriccol; it brings in meaning and literature in the understanding of this term. For iṭaiccol, see Shanmugam 1992.

Adjectives and adverbs are not identified separately and recognized as independent word categories or parts of speech, just as in Sanskrit grammar (Scharfe 1977b). Adjectives in the sense of being modifiers of nouns can be derived from nouns and verbs while nouns and verbs themselves can assume adjectival function. Words that modify a noun but are non-derived, non-nouns and non-verbs, are very few in Old Tamil. The same is true of adverbs in the sense of being modifiers of verbs that are not derived or inflected (unlike non-finite verb forms). Besides the presence of these morphological and functional facts in support of non-recognition of adjective and adverb as separate word categories or parts of speech, another reason for their absence in Tamil grammar is the particular way of conceptualizing a sentence in Tolkāppiyam. The core of a sentence in the Sanskrit grammatical tradition is a noun and a verb, of which the former has the referencing function and the latter the predicating function. The European grammatical tradition has subject and predicate as the pivots of a sentence and allows the predicate to include word categories other than the verb such as adjective (Bhat 1984). So does the Tamil tradition. Tamil, as Tolkāppiyar shows, has verbs and nouns as predicate, but not adjectives. He sets up subject (eẓuvāy ‘the opener’) and predicate (payanilai ‘the site of function’), which function can be performed by a verb or a noun. The verb includes action that bears time (tense) and nouns of possession or belonging (see below) that do not bear time but are inflected like verbs to agree with the subject; the noun, which may be simple or derived (but not inflected), includes referential nouns and interrogative pronouns. Tolkāppiyar also needs the notion predicate to describe verb agreement with the subject, which he describes in relatively more detail (compared to, for example, verb conjugation). He divides nouns into grammatical sub-categories for this purpose, one of which is the noun of common gender (viravu-t tiṇai), i.e. common between human and non-human genders. Tolkāppiyar needs a predicate for relating the nominative case to it. He does not set up abstractly a verb be in sentences where the predicate is a noun and thus does not make the predicate and the verb coterminous, as Kātyāyana does for Sanskrit. He has six kinds of predicate: stating existence, commanding, action, open (i.e. wh-) question, description of quality or belonging, and object identification.

In terms of word categories, the first three are verbs in Tamil, the fourth and the last are nouns, and the fifth is treated as a category of verb (kuṟippu vinai ‘(tense) implicit verb’). He sets up six kinds on the basis of their agreement characteristics with the subject. Examples of tense-implied (i.e. tenseless) verbs as predicate are gender-inflected nouns (malaiyan ‘hills-he’, malaiyai ‘hills-you’ from malai ‘hill’) and stative verbs (iniyan ‘sweet-he’, iniyai ‘sweet-you’ from in ‘be sweet’). These are treated as verbs because of the view that their morphology instantiates agreement, though they are not marked for tense, which is the defining characteristic of a verb, according to Tolkāppiyam. In spite of the theoretical step of setting up a sub-category of verbs without tense, Tolkāppiyar recognizes that predicates are not only verbs but could be nouns as well. The case relationship other than that of the nominative is described as the relationship between the noun and the verb (both tense marked and tense implied), though the verb and the predicate are not coterminous in Tamil. This can be seen by the list of verbal meanings that a case is compatible with. The nouns are identified and differentiated by the case markers (vēṟṟumai urupu) they carry. The subject noun does not have a case marker and is called peyar vēṟṟumai ‘nominal case’. The term vēṟṟumai (‘difference’) is explained by commentators: As a functional term, it refers to the fact that nouns are differentiated as to their relation with the verb by the case marker on them; as a formal term, it means that nouns are differentiated as to their form from their occurrence in citation or in a dictionary; they are also differentiated from the nominative case form. The formal understanding of case as ‘difference’ would not create a problem to list the genitive and the vocative as cases. 
Though case is a relation between a noun and the verb in a sentence, the relation between two nouns in some nominal compounds (called vēṟṟumai-t tokai ‘case compound’, where the case marker is elided) is explained to be one of recovering the formally elided case, which relates the nouns of the compound through a tense-implied verb in its relative-participle form. Thus, poṟṟoṭi ‘gold bangle’ would expand as ‘bangle that is made of gold’. This is true of the genitive case also (pon vilai ‘the price gold has’ i.e. ‘the price of gold’). The tense-implied verb as predicate would be analyzed in the same fashion (avan malaiyan / iniyan ‘he is one who belongs to hills / has sweetness’). Thus a case relation between a noun and another noun or a gender suffix as the head of a phrase (as in anmoẓittokai ‘compound with an absent head’) is mediated by a tenseless verb. Similar analysis would hold for predicates taken to be tenseless verbs (kuṟippu vinai), not as gendered nouns, as they relate to cases in the sentence. Examples are: nī ennin iniyai ‘you are sweeter from (than) me’, nī enakku iniyai ‘you are sweet / nice to me’. The unique concept of kuṟippu vinai, needed for the description of agreement as well as case, allows not having adjectives as a word category; it makes a putative “adjective” such as nal ‘good’ into a relative participle of a tenseless verb (kuṟippu-p peyareccam).

Tolkāppiyar gives a set of possible semantic characteristics of verbs to which each case is linked. The tense of the verb may be explicit or implicit. The “second case”, which is the accusative case, has, for example, twenty-eight senses of verbs to relate to. The sense list is not a closed one, as the sūtra ends with ‘and such senses’. Thus a case is defined by a set of senses of verbs, not by any semantic role assigned to the noun in which it is used in a sentence. Tolkāppiyar, however, introduces the concept of toẓilmutal ‘the primaries of action’. The ‘primaries’ are eight: the act, actor, the thing acted on, the place acted in, the time acted in, the instrument to do the act with, the thing or person acted for, and the purpose of the act. In spite of superficial similarity, these are not the semantic roles of nouns of Pāṇini’s kārakas, which are six. The list includes the act itself, which the verb represents; the commentators say it separates the action from doing it, as in ‘do weaving’; it should cover the occurrence of doubling for emphasis of the root form of the verb as in utai utaittaan ‘kicked a kick’, where the first utai is not an object, since the sentence will have an object in the accusative case and this doubling is possible with intransitive verbs. The list leaves out the source or comparison of the act, which is assigned to the fifth case (ablative¹⁴). It separates place and time semantically, but both are assigned to the seventh (locative) case. It adds a new case relationship, viz. purpose, different from the recipient represented by the dative case. This list, therefore, is not one of semantic roles or kārakas to relate the case-marked nouns to. Coming after the description of case alternation and before the sūtra which says that all these eight might not be found in each sentence, this can be said to set a parameter from which the meaning of a case marker could be obtained when it is not the commonly assigned one for the case: an interpreter parameter of cases.
It would mean that kūlikku ‘for wages’ in the dative case will not be interpreted as representing a recipient, but a purpose; it would predict that no other case can be interpreted to occur in the sense of the ablative case.

Tolkāppiyam has a separate chapter on case indeterminacy (vēṟṟumai mayakkam), which includes three things: case alternation, which is more than one case marker having the same meaning (like ‘hit the eye, hit on the eye’), case ambiguity, which is to have more than one interpretation when the case marker is absent (mostly in poetry) (‘the elephant-kill (killed or killing)-tiger’), and case substitution, which is obtaining the meaning of one case marker when another case marker is used (only in poetry). This is recognition of the fact that there could be indeterminacy in interpretation of cases and there could be, semantically, merger of cases, potential for multiple cases, and uncommon use of cases. It expresses the fact that the relation between case markers and the verb is neither transparent nor straightforward. This informs the case theory of Tamil grammatical tradition, which is a theory of interpretation rather than of generation. This theory of indeterminacy makes untenable any theory of semantic roles of case marked nouns or kārakas.

¹⁴ Tolkāppiyar assigns to this case the sense of comparison of things that differ in degree, but the commentators add to this case the sense of source from which a thing leaves. They also interpret comparison of unlike things as suggestive of being different from one another; that which is compared to is the source or ground deserving ablative case. It should be noted that comparison is not a sense of the ablative case in Pāṇini, nor in modern case theory.

The Tamil grammatical tradition codified in Tolkāppiyam was commented on and explicated from the eleventh century, after a lapse of a millennium, to the twentieth century (see for example Tamilannal 1993, 1994, which is a user-friendly commentary meant for non-specialists). The preeminence of Tolkāppiyam in the Tamil grammatical tradition can be seen in the elaborate commentaries on it in full or in part (there are seven of them in the premodern period); in the recapitulation of it (Ilakkaṇa Viḷakkam of Vaittiyanāta Tēcikar, a Saivite, in the 17th century, which is nicknamed ‘the little Tolkāppiyam’); in the rereading of it in the last century in the light of the modern Tamil renaissance asserting the distinctiveness of the Tamil tradition in its contemporary political context;¹⁵ and in the reinterpretation of it in the light of modern linguistic theories by S. V. Shanmugam (1984, 1986, and elsewhere) and by other linguists.

The grammatical study of Tamil in the second millennium, which is invariably conducted with reference to Tolkāppiyam, whether in consonance or in dissonance with its commentaries, shows some new developments. The distinctive idea of the Tamil tradition that the grammar is a tool for literary interpretation is abandoned except by a few; the theory of poetics is decoupled from the theory of grammar, which comes to be treated as autonomous and to consist of two parts, one on sounds (eẓuttu) and another on words (col). Tamil literary form and content changed from the time of Sangam poetry, which the third part of Tolkāppiyam on substance (poruḷ) theorizes (Niklas 2001), and new separate works on parts of poetics started to appear from the 7th century.
But the developments in literary history did not prompt a fresh look at the first and second parts of the grammar with a view to integrating all three parts; rather they came to be treated autonomously, though Tolkāppiyar’s attention to the language of poetry that includes description of metaphor and metonymy was continued in the grammar, as they are features of the ordinary language as well. The new limiting concept of grammar brings the Tamil tradition closer to the Sanskrit tradition. Of the two, sound and word, greater interest attaches to the latter, as revealed in the new developments in the analysis of the internal structure of words.16 This is similar to Pāṇini’s focus on the word.

15

The leader of the political party Dravida Munnetra Kazhagam and the former Chief Minister of Tamil Nadu, M. Karunanidhi, has written a celebratory exposition of 100 selected sūtras (out of around 1600) from all three parts with visuals for public consumption by the name Tolkāppiya-p pūnkā ‘the flower garden of Tolkāppiyam’ (2003).

16

There are more commentaries on the part dealing with the word in Tolkāppiyam than on the other two parts, of which the one by Cēnāvaraiyar (13th century) is considered to be superior for its informed use of the logical and grammatical traditions of Sanskrit in Tamil grammatical argumentation. A later grammarian, Civañāna Munivar (18th century), praises his mastery of Sanskrit grammar and claims that Cēnāvaraiyar wrote his commentary on the part on the word alone because it is this part that is important to Sanskrit grammarians (Srinivasan 2000: 91).

E. Annamalai

This development of incorporation of elements from Pāṇini’s tradition into the Tamil tradition spans a wide spectrum, ranging from minimal deviation from Tolkāppiyam to the treatment of Tamil grammar as an image of the grammar of Sanskrit. Meenakshisundaran (1974) calls the latter a foreign model of Tamil grammar. An example is Ilakkaṇa-k kottu by Cāmināta Tēcikar, a Saivite, of the 17th century, who admired Tolkāppiyam and was well versed in it, but believed that Tamil differed minimally from Sanskrit as a language and so in grammar. He belongs to the school of grammarians who critiqued Ilakkaṇa Viḷakkam mentioned above. This school did not critique Tolkāppiyam, but differed from its commentators and other successor grammarians, claiming that their interpretation of Tolkāppiyam is ill informed and defective and that Tolkāppiyam should be read in the light of the ideas of the Sanskrit grammatical tradition. Some of the positions in this range have similarities with the grammatical ideology and theory in other Dravidian languages of the comparable period as regards Sanskrit but, unlike them, they are a cross between Tolkāppiyar’s and Pāṇini’s models in actual description. Vīracōẓiyam by Puttamittirar, a Buddhist grammarian, of the 11th century, is an example of this. It switches to Sanskrit grammatical terms, pays greater attention to the internal composition of words, sets up roots, and adopts a bipartite division of words into roots and suffixes (rather than Tolkāppiyam’s three elements in a word, viz., the base form, the middle form, and the end form), has a marker for the nominative case, describes kārakas, equates the implied verb with taddhitānta of Sanskrit, analyzes the morphology of Sanskrit loans in Tamil using the rules of Sanskrit grammar, and does other such things (Rajaram 1992, Srinivasan 2000: 93–115). But it does not adopt the metalinguistic conventions of Pāṇini and treats poetics as a part of grammar.

Nannūl, ‘the good book’, by Pavaṇanti, a Jain of the 13th century, is a judicious accommodation of some features from the Sanskrit tradition responding to change in the language and the linguistic milieu of the medieval period. Its basic approach and analysis are those of Tolkāppiyam. Its grammatical description of the Tamil of his period is very condensed compared to Tolkāppiyam. The two separate sections on case in addition to the one on the vocative in Tolkāppiyam are a part of the chapter on nouns, in which case indeterminacy is treated just in one sūtra, reproduced from Tolkāppiyam, which gives freedom to interpret case markers by the sense appropriate to the linguistic context. Nannūl does not discuss kārakas. It is, in general, less concerned with syntax and semantics (it gives just five sense categories of verbs for the accusative case), for both of which the description of case is important to Tolkāppiyar. It pays greater attention to the internal composition of words, like Pāṇini, and has, for example, more details about verb conjugation

Indigenous South Asian grammatical traditions

than Tolkāppiyam. The description of morphology in general is more detailed and complete. However, the phonological structure of words is as important as their morphological structure for Nannūl as it is for Tolkāppiyam. Its sūtras are compacted, but are not embedded in a metalanguage like Pāṇini’s. The most important difference between Tolkāppiyam and Nannūl is that the latter does not have poetics as part of the grammar. By this time, poetics had come to be a separate discipline, out of which its subareas such as prosody (yāppu ‘binding’) and rhetoric17 (aṇi ‘ornament’) had also become separate, as mentioned by the first commentator on Tolkāppiyam. This separation is due to a change in the idea of grammar, namely that it pertains just to the general language used in elite speech, prose, and poetry (in its linguistic register) but does not include the special language of literature (with its symbolism and all). Grammar is a description of an object for itself and not necessarily for an understanding and interpretation of poetry. It also tends to be a prescription for good writing. Unlike Tolkāppiyam, Nannūl brings in the kinds and characteristics of teaching and texts in a prelude to its description of grammar. It should be noted that all aspects of poetics had come to be prescriptive — to be guidelines for writing poetry. Tolkāppiyar’s idea of a chapter on poetics in his grammar (this is the largest of the three chapters) as a framework or theory to explain literature and its special language had changed. Nannūl became the new standard for the description of Tamil and it motivated many commentaries, but it did not eclipse Tolkāppiyam, which continued to have its revered place and sustained influence on the Tamil grammatical tradition. This tradition has had continuity as the main stream of grammatical thought over two millennia. 
This continuity, however, is not static and its interaction with the Sanskritic tradition and acceptance of changes in the language made modifications in the mainstream itself while keeping imitation of the other tradition at bay and in the margins.

7.3.4. Colonial grammars

The new grammatical tradition that came in contact with the Tamil tradition in the colonial period through Christian missionaries is the European tradition as espoused in its classical languages. It brought about two shifts in grammatical thinking, one concerning the pedagogical orientation of grammar and the other the legitimacy of colloquial Tamil in grammar. The former introduced the classification of Tamil verbs into weak and strong based on their conjugation; the latter added communication with ordinary people as a purpose of the grammar. Another new development is that, for the first time, a grammar of Tamil was written in a language other than Tamil. Constanzo Beschi (1680–1747), a Jesuit from Italy, wrote in Latin a grammar of literate Tamil (centamiẓ) and a grammar of spoken Tamil (koṭuntamiẓ). The grammars use the terminology of Nannūl in translation as well as Latinate terminology; they introduced some new analyses of grammatical facts such as modals, not addressed by the native Tamil grammarians.

The postcolonial period brought the Tamil tradition into contact with modern linguistics, whose grammatical theories were developed in the Western world. In spite of resistance to these theories among the practitioners of the Tamil grammatical tradition, influence of some of their insights is discernible in the traditional grammars and commentaries written by postcolonial scholars trained in the tradition. Scholars trained in modern linguistics reinterpret the tradition in the light of modern Western theories of grammar. They also conduct grammatical analysis of modern, medieval, and old Tamil in the framework of these theories. They write in Tamil and English. There is not yet a holistic grammar of Tamil written in this new parallel tradition (which, unlike the indigenous tradition, allows fragmented grammars of aspects of Tamil) that can compete with the achievements of the indigenous tradition. When a holistic grammar is written, it will not be totally divorced from indigenous grammatical thoughts.18

17

These independent works on rhetoric are adoptions of its description in Sanskrit without any serious concern to relate it to Tamil literary practices (Niklas 2001). They take Tamil literary composition in the second millennium to be the same as Sanskrit literary composition. The similarity of this view to the treatment of the grammars of Tamil and Sanskrit alike by a school of Tamil grammatical tradition is easy to see.

18

This section greatly benefitted from comments and information provided by S. V. Shanmugam, A. Damotharan, S. Rajaram, R. Srinivasan, and Yigal Bronner.

Bibliographical References

Ananthanarayana, H. S. 1984–1986 Tolkāppiyam and Aṣṭādhyāyī: A comparative study. Ṛtam: Journal of Akhila Bharatiya Sanskrit Parishat 16–18.
Albert, D. 1985 Tolkāppiyam: Phonology and morphology (An English translation). Madras: International Institute of Tamil Studies.
Bhat, D. N. S. 1984 Word and its meaning in the Indian grammatical tradition. International Journal of Dravidian Linguistics 13: 47–59.
Bhatia, Tej K. 1987 A history of the Hindi grammatical tradition: Hindi-Hindustani grammar, grammarians, history and problems. Leiden: Brill.
Böhtlingk, Otto von 1887 Pâṇini’s Grammatik. Leipzig: Haessel. Repr. 1998, Darmstadt: Wissenschaftliche Buchgesellschaft.
Bronkhorst, Johannes 2011 Illiteracy as a socio-cultural marker. In: Houben & Rotaru (eds.) 2011: 44–56.

Burnell, Arthur Coke 1875 The Aindra school of Sanskrit grammarians, their place in the Sanskrit and subordinate literatures. Mangalore: Basel Mission Book and Tract Depository.
Cardona, George 1976 Pāṇini: A survey of research. The Hague/Paris: Mouton. 2nd ed. 1998, Delhi: Motilal Banarsidass.
Cardona, George 1986 Phonology and phonetics in ancient Indian works: The case of voiced and voiceless elements. In: Krishnamurti et al. (eds.) 1986: 60–80.
Cardona, George 1997 Pāṇini: His work and its traditions, 1: Background and introduction. 2nd ed. Delhi: Motilal Banarsidass.
Cardona, George 2000 Old Indic grammar. In: Geert E. Booij, Christian Lehmann, and Joachim Mugdan (eds.), Morphologie/Morphology: Ein internationales Handbuch zur Flexion und Wortbildung/An international handbook on inflection and word formation, vol. 1: 41–51. Berlin: De Gruyter.
Cardona, George 2004 Recent research in Pāṇinian studies. 2nd ed. Delhi: Motilal Banarsidass.
Cardona, George 2009 On the structure of Pāṇini’s system. In: Huet et al. (eds.) 2009: 1–32.
Chevillard, Jean-Luc 2008 The concept of ticai-c col in Tamil grammatical literature and the regional diversity of Tamil classical literature. In: M. Kannan (ed.), Streams of language: Dialects in Tamil, 21–51. Pondicherry: Institut Français de Pondichéry.
Deshpande, Madhav M. 1991 Pāṇinian syntax and the changing notion of sentence. In: Hans Henrich Hock (ed.), Studies in Sanskrit syntax, 31–44. Delhi: Motilal Banarsidass.
Deshpande, Madhav M. 2011 From orality to writing: Transmission and interpretation of Pāṇini’s Aṣṭādhyāyī. In: Houben & Rotaru (eds.) 2011: 57–100.
Elayaperumal, M. 1972 The influence of Tolkappiyam on Liilaatilakam. In: V. I. Subramoniam (ed.), Proceedings of the First All India Conference of Dravidian Linguists (1971), 170–173. Trivandrum: Dravidian Linguistics Association.
Ezuttaccan, K. N. 1975 History of grammatical theories of Malayalam. Thiruvananthapuram: Dravidian Linguistics Association.
Ezuttaccan, K. N. 1997 Malayalam. In: V. I. Subramoniam (ed.), Dravidian encyclopedia, vol. 3: Language and literature – grammar, 246–247. Thiruvananthapuram: International School of Dravidian Linguistics.
Gair, James W., and W. S. Karunatillake 2013 The Sidat San̆garā: Text, translation and glossary. New Haven, CT: American Oriental Society.
Hart, George L. 1976 The relation between Tamil and classical Sanskrit literature. Wiesbaden: Harrassowitz.

Hattori, Masaaki 2000 Dignaga’s theory of meaning: An annotated translation of the Pramanasamuccayavrtti Chapter V: Anyapohapariksa (I). In: Jonathan A. Silk (ed.), Wisdom, compassion and the search for understanding: The Buddhist studies legacy of Gadjin M. Nagao, 137–146. Honolulu: University of Hawaii Press.
Hinüber, Oskar von 2001 Das ältere Mittelindisch im Überblick. 2nd rev. ed. Wien: Österreichische Akademie der Wissenschaften.
Hock, Hans Henrich 2014 The Sanskrit phonetic tradition and western phonetics. In: V. Kutumba Sastry (ed.), Sanskrit and development of world thought, 53–80. New Delhi: Rashtriya Sanskrit Sansthan and D. K. Printworld.
Houben, Jan, and Julieta Rotaru (eds.) 2011 Le Veda-Vedāṅga et l’Avesta entre oralité et écriture/Veda-Vedāṅga and Avesta between orality and writing. (Travaux de Symposium International: Le livre, la Roumanie, l’Europe/The International Symposium: The Book, Romania, Europa, Part IIIA, ed. by Florin Rotaru.) Bucharest: Éditeur Bibliothèque de Bucarest.
Huet, Gérard, Amba Kulkarni, and Peter Scharf (eds.) 2009 Sanskrit computational linguistics. Berlin/Heidelberg: Springer.
Israel, M. 1973 Treatment of morphology in Tolkāppiyam. Madurai: Madurai University.
Karunanidhi, M. 2003 Tolkāppiya-p pūnkā. Chennai: Tamilkkani Patippakam. English translation, The flower garden of Tolkāppiyam, by G. Thiruvasagam, Coimbatore/New Delhi: Bharathiar University/Macmillan Publishers India.
Katre, Sumitra M. 1987 Aṣṭādhyāyī of Pāṇini. Austin: University of Texas Press. Repr. 1989, Delhi: Motilal Banarsidass.
Kiparsky, Paul 1979 Pāṇini as a variationist. Cambridge, MA: MIT Press.
Kiparsky, Paul 1991 On Paninian studies. Journal of Indian Philosophy 19: 189–225.
Kiparsky, Paul 2009 On the architecture of Pāṇini’s grammar. In: Huet et al. (eds.) 2009: 33–94.
Kniffka, Hannes 2001 Editor’s introduction. In: Hannes Kniffka (ed.), Indigenous grammar across cultures, 1–29. Frankfurt: Peter Lang.
Krishnamurti, Bhadriraju, Colin P. Masica, and Anjani Sinha (eds.) 1986 South Asian languages: Structure, convergence, and diglossia. Delhi: Motilal Banarsidass.
Kulli, J. S. 1976 Kēśirāja’s Śabdamaṇidarpana. Dharwar: Karnatak University.
Kulli, J. S. 1991 History of grammatical theories in Kannada. Thiruvananthapuram: International School of Dravidian Linguistics.

Kulli, J. S. 1997 Kannada. In: V. I. Subramoniam (ed.), Dravidian encyclopedia, vol. 3: Language and literature – grammar, 244–246. Thiruvananthapuram: International School of Dravidian Linguistics.
Mahadevan, Iravatham 2003 Early Tamil epigraphy from the earliest times to the sixth century A. D. Chennai/Cambridge, MA: Harvard University Press.
Matilal, Bimal Krishna 1990 The word and the world: India’s contribution to the study of language. Oxford: Oxford University Press.
Meenakshi, K. 1997 Tolkappiyam and Astadhyayi. Chennai: International Institute of Tamil Studies.
Meenakshi, K. 1999 Literary criticism in Tamil and Sanskrit: Tolkāppiyam Poruḷatikāram and Sanskrit Alaṁkāra Śāstra. Chennai: International Institute of Tamil Studies.
Meenakshisundaran 1974 Foreign models in Tamil grammar. Trivandrum: Dravidian Linguistics Association.
Miller, Roy Andrew 1976 Studies in the grammatical tradition in Tibet. Amsterdam/Philadelphia: Benjamins.
Miller, Roy Andrew 1993 Prolegomena to the first two Tibetan grammatical treatises. Wien: Arbeitskreis für tibetische und buddhistische Studien, Universität Wien.
Miller, Roy Andrew 2000 The early Tibetan grammatical treatises and Thon-mi Sambhoṭa. In: S. Auroux, E. F. Konrad Koerner, Hans-Josef Niederehe, and Kees Versteegh (eds.), History of the language sciences: An international handbook on the evolution of the study of language from the beginnings to the present, vol. 1: 203–206. Berlin/New York: De Gruyter.
Nampoothiry, Easwaran 1972 Influence of Panini’s grammar on Lilathilakam. In: V. I. Subramoniam (ed.), Proceedings of the First All India Conference of Dravidian Linguists (1971), 138–143. Trivandrum: Dravidian Linguistics Association.
Niklas, Ulrike 2001 Poetics, prosody, and rhetorics in classical Tamil grammar. In: Hannes Kniffka (ed.), Indigenous grammar across cultures, 135–150. Frankfurt: Peter Lang.
Nitti-Dolci, Luigia 1938 Les grammairiens prakrits. Paris: Adrien-Maisonneuve. Engl. translation (1972) by Prabhākara Jhā, The Prākṛita grammarians. Delhi: Motilal Banarsidass.
Pind, Ole Holten 1995 Pāli and the Pāli grammarians: The methodology of the Pāli grammarians. In: Mirja Juntunen, W. L. Smith, and C. Suneson (eds.), Sauhṛdyamaṅgalam: Studies in honor of Siegfried Lienhard on his 70th birthday, 281–297. Stockholm: Association of Oriental Studies.

Purushottam, Boddupalli 1996 The theories of Telugu grammar. Thiruvananthapuram: International School of Dravidian Linguistics.
Purushottam, Boddupalli 1997 Telugu. In: V. I. Subramoniam (ed.), Dravidian encyclopedia, vol. 3: Language and literature – grammar, 244–256. Thiruvananthapuram: International School of Dravidian Linguistics.
Rajagopalan, N. V. 1968 Influence of Tholkaapiyam and akapporull in Sanskrit poetics. In: R. E. Asher (ed.), Proceedings of the Second International Conference-Seminar of Tamil Studies, vol. 2: 171–178. Chennai: International Association of Tamil Research.
Rajaram, S. 1992 Vīracōẓiya ilakkaṇa-k kōṭpāṭu [The grammatical theory of Vīracōẓiyam]. Nagercoil: Ragavendra.
Raster, Peter 1993 Die indische Grammatiktradition. In: Joachim Jacobs, Arnim von Stechow, and Wolfgang Sternefeld (eds.), Syntax: Ein internationales Handbuch zeitgenössischer Forschung/An international handbook of contemporary research, 199–207. Berlin/New York: Mouton de Gruyter.
Scharfe, Hartmut 1977a Grammatical literature. (History of Indian literature 5.2, ed. by Jan Gonda.) Wiesbaden: Harrassowitz.
Scharfe, Hartmut 1977b Grammars of the Dravidian languages. In: Scharfe 1977a: Chapter 13.
Sebeok, Thomas A., Murray B. Emeneau, and Charles A. Ferguson (eds.) 1969 Current trends in linguistics, 5: Linguistics in South Asia. The Hague: Mouton.
Shanmugam, S. V. 1984 Collilakkaṇa-k kōṭpāṭu-tolkāppiyam [Morphological theory of Tolkāppiyam], Part 1. Annamalainagar: All India Association of Tamil Linguistics.
Shanmugam, S. V. 1986 Collilakkaṇa-k kōṭpāṭu-tolkāppiyam [Morphological theory of Tolkāppiyam], Part 2. Annamalainagar: All India Association of Tamil Linguistics.
Shanmugam, S. V. 1992 Collilakkaṇa-k kōṭpāṭu-tolkāppiyam [Morphological theory of Tolkāppiyam], Part 3. Chennai: Manivasagar Patippakam.
Shanmugam, S. V. 1992a Malayāḷa moẓiyin mutal ilakkaṇam: Camūkamoẓiyal āyvu [The first grammar of the Malayalam language: A sociolinguistic investigation]. Chennai: Manivasagar Patippakam.
Shanmugam, S. V. 2001 Eẓuttilakkaṇa-k kōṭpāṭu [Phonological theory (of Tolkāppiyam)]. Chennai: International Institute of Tamil Studies. (First edition 1980.)
Shanmugam, S. V. 2006 Yāppum nōkkum: Tolkāppiyarin ilakkiya-k koḷkaikaḷ [Propriety and point of view: Literary theories of Tolkappiyar]. Chidambaram: Meyyaappan Patippakam.
Sinha, Anil C. 1973 Generative semantics and Pāṇini’s kārakas. Journal of the Oriental Institute, Baroda 23: 27–39.

Srinivasan, R. 2000 Tamiẓ ilakkaṇa marapukaḷ (ki.pi. 800–1400): Ilakkaṇa nūlkaḷum uraikaḷum [Tamil grammatical traditions (A. D. 800–1400): Grammatical works and commentaries]. Chennai: The Parkar.
Staal, J. F. 1969 Sanskrit philosophy of language. In: Sebeok et al. (eds.) 1969: 499–531.
Subbarayalu, Y. 2009 Sanskrit in Tamil inscriptions. In: M. Kannan and Jennifer Clare (eds.), Passages: Relationships between Tamil and Sanskrit, 115–124. Pondicherry: Institut Français de Pondichéry.
Subrahmanya Sastri 1934 History of the grammatical theories in Tamil and their relation to the grammatical literature in Sanskrit. Journal of Oriental Research, Madras. Repr. 1997, Chennai: The Kuppuswami Sastri Research Institute.
Tamilannal 1993 Tolkāppiyam: Eẓuttatikāram [Tolkāppiyam: Phonology]. Chennai: Manivasagar Patippakam.
Tamilannal 1994 Tolkāppiyam: Collatikāram [Tolkāppiyam: Morphology]. Chennai: Manivasagar Patippakam.
Thirugnanasambandham, P. 1992 Sanskrit Tamil contact. Thiruvananthapuram: International School of Dravidian Linguistics.
Tournadre, Nicolas 2010 The Classical Tibetan cases and their transcategoriality: From sacred grammar to modern linguistics. Himalayan Linguistics 9(2): 87–125. http://www.nicolas-tournadre.net/wp-content/uploads/2014/07/2010-HLGrammar.pdf (accessed 30 November 2014)
Vasu, Srisa Chandra 1897 The Ashtádhyáyí of Páṇini. Benares: Sindhu Charan Bose. Repr. 1962, Delhi: Motilal Banarsidass.
Verhagen, Pieter Cornelius 2000a The Classical Tibetan grammarians. In: S. Auroux, E. F. Konrad Koerner, Hans-Josef Niederehe, and Kees Versteegh (eds.), History of the language sciences: An international handbook on the evolution of the study of language from the beginnings to the present, vol. 1: 207–210. Berlin/New York: De Gruyter.
Verhagen, Pieter Cornelius 2000b The influence of the Sanskrit tradition on Tibetan indigenous grammar. In: S. Auroux, E. F. Konrad Koerner, Hans-Josef Niederehe, and Kees Versteegh (eds.), History of the language sciences: An international handbook on the evolution of the study of language from the beginnings to the present, vol. 1: 210–214. Berlin/New York: De Gruyter.
Witzel, Michael 2011 Gandhāra and the formation of the Vedic and Zoroastrian canons. In: Houben & Rotaru (eds.) 2011: 490–532.

8 Applications of modern technology to South Asian languages
Edited by Elena Bashir

8.1. Introduction
By Elena Bashir

Computational linguistics (CL), with both theoretical and applied components, lies in the intersection of linguistics and computer science, including the latter’s sub-field of artificial intelligence. The recent revolution in information and communication technologies is profoundly affecting many aspects of language use and study. Digital audio and video devices enrich language description and documentation efforts. The effects of globalization and localization on language use patterns are in dynamic tension. On the one hand, some fear a loss of cultural diversity due to globalization; on the other hand, the World Wide Web is becoming both a tool for language revitalization, thus helping to stem the tide of language endangerment, and a medium for innovation. Information on South Asian languages is increasingly available to a wide range of people through digital dictionaries and libraries. Each of these aspects of the interaction of technology with languages and linguistics is a field of study in itself. The articles in Saxena & Borin (eds.) 2006 provide an overview of many aspects of this area.

This chapter will focus on only a few aspects of the interaction of CL with South Asian languages and linguistics. Section 8.2 treats localization, the process of enabling all aspects of computing by users of local languages, having varying linguistic structures and employing differing scripts; Section 8.3 discusses the development of language and linguistic resources; and Section 8.4 deals, country by country, with specific language engineering applications. Theoretical CL, which deals with issues in theoretical linguistics and cognitive science, is not treated here. Rashel (2011: 1), discussing the state of the field in Bangladesh, and relationships between computation, linguistics, and computational linguistics, concludes that ‘the actual work that has been done in Bangladesh... has so far been... largely motivated by the demands and interest of practical processing systems and that information technology has rather little influence on linguistics at large.’ This would appear to apply to other South Asian countries as well.

8.2. Localization
By Elena Bashir

8.2.1. Introduction

Localization, ‘the process of enabling computing experience in local culture and language’ (S. Hussain, Durrani & Gul 2005: 3), encompasses the main thrusts of computational linguistics work on South Asian languages at this time. In the age of the global digital revolution, it is driven by the need to extend the benefits of information and communication technology to developing countries, in particular those whose languages employ non-Roman-based scripts. In the context of globalization and the increasing endangerment of many languages, it can also be viewed as a means of language maintenance, and even revitalization. Broadly conceived, localization includes development of local language and linguistic resources (Section 8.3) and specific applications (Section 8.4). Prerequisites for localization of computing resources and techniques include, first of all, detailed linguistic descriptions and analyses, followed by agreement by language communities on character sets to be used, provision of Unicode code points for necessary characters not already existing, and development of fonts, rendering engines, and input methods including keyboard layouts and voice-to-text applications. Building local research capacity is a critical component of sustainable localization efforts (Shams & Hussain 2011).

8.2.2. Pan Asian Networking (PAN) Localization Project

The PAN Localization Project (www.panl10n.net), supported by the International Development Research Centre, Canada, has been a major engine of localization work on South Asian languages. For an introduction to the Project background, see S. Hussain & Gul 2004. Spanning Phase I (March 2004–March 2007) and Phase II (April 2007–March 2010), the Project involved the following countries and languages: Afghanistan (Pashto and Dari), Bangladesh (Bangla), Bhutan (Dzongkha), Cambodia (Khmer), Laos (Lao), Nepal (Nepali), Sri Lanka (Sinhala and Tamil), and Pakistan (Urdu).1 Pan Localization Project Teams 2008 is a language-wise collection of research publications through the Project up to 2008. Documents on Project activities carried out from 2004 to 2012 are accessible at http://www.panl10n.net/activities/. Shams et al. 2012 summarizes the evaluation findings for the Project. The following sections discuss Project activities in some of the participant countries. Because India was not a participant in this Project, the huge amount of localization work done there is not discussed here. See Section 8.3.1.2 for work on development of language resources in India.

1

Pakistan was host of the Regional Secretariat of the Project.

8.2.2.1. Sri Lanka

The Language Technology Research Laboratory was established at the University of Colombo in 2004 under the PAN Localization Project. Under Phase I, the following linguistic resources were developed: the University of Colombo School of Computing Sinhala part-of-speech (POS) tag set, a 500,000-word tagged Sinhala corpus, a 10-million-word contemporary Sinhala corpus, and a trilingual Sinhala-Tamil-English dictionary. Language tools developed include a Sinhala text-to-speech system, a Sinhala screen reader, an optical character recognizer (OCR) for Sinhala, and utilities for encoding conversion. Under Phase II of the Project, linguistic resources developed were a 100,000-word English-Sinhala parallel corpus and a 1,000-word Sinhala WordNet (Welgama et al. 2011). Generic top-level domains (gTLD) were defined for Sinhala, and teaching materials for local languages were developed. Secondary results of the project include development of a Sinhala lexicon and spell checker, and localized versions of Windows operating systems. Specific aspects of this work include English-to-Sinhala machine translation (Liyanapathirana & Weerasinghe 2011) and automatic speech recognition for Sinhala (Nadungodage & Weerasinghe 2011). Earlier work (Weerasinghe 2004) discussed Sinhala-Tamil machine translation. Detailed lists of research papers on specific applications, including OCR and text-to-speech work, and software downloads can be found at http://www.panl10n.net/sri-lanka-phase-i/ and http://www.panl10n.net/sri-lanka-phase-ii/.2 The site http://www.panl10n.net/english/2012PDF/PAN%20L10N%20-%20Sri%20Lanka_Last%20day.pdf summarizes work done under the PAN Localization project.

8.2.2.2. Nepal

For localization work done in Nepal, see http://www.panl10n.net/nepal-phase-i/ and http://www.panl10n.net/nepal-phase-ii/. See also Section 8.3.1.4 for language resource development in Nepal, and Section 8.4.2.4 for discussion of language technology applications.

2

Information in this paragraph is from: http://www.panl10n.net/english/2012PDF/PAN%20L10N%20-%20Sri%20Lanka_Last%20day.pdf, where further details and 22 technical references can be found.

8.2.2.3. Bangladesh

The status and challenges of local language computing in Bangladesh at the outset of the PAN Localization project are summarized in Mumit Khan n.d. Documents reporting work carried out by the Bangladesh component of the Project are at http://www.panl10n.net/bangladesh-phase-i/ and http://www.panl10n.net/bangladesh-phase-ii/. See also Section 8.3.1.5 on language resource development in Bangladesh and Section 8.4.2.4 on technology applications.

8.2.2.4. Afghanistan

The first-ever National Computational Linguistics Seminar in Afghanistan was conducted on August 12–13, 2006. It was organized by the Afghanistan component of the PAN Localization Project and sponsored by the Afghan Telecom and Alcatel Afghanistan office. Work carried out under this project is available at http://www.panl10n.net/afghanistan/.

8.2.2.5. Pakistan

Publications and software produced under the PAN Localization Project are available at http://www.panl10n.net/pakistan-phase-i/ and http://www.panl10n.net/pakistan-phase-ii/. Details of Pakistan’s participation in Phase II can be found at http://www.cle.org.pk/research/projects/Details/pan.htm. Urdu localization of the SeaMonkey internet application suite was announced in 2010 by the Centre for Research in Urdu Language Processing (CRULP), and work on a Microsoft Vista Urdu Language Interface Pack soon followed. A pioneering localization effort in Pakistan was Project Dareecha ‘little window’, aimed at developing strategies for access to information technology and local language content generation in rural schools (S. Hussain, Shams & Sarfraz 2012). The goal was for school children to be able to access the Internet at the beginning of the training program, and by the end of the program be able to produce their own content and make it available on the Internet. For a detailed presentation on the project, see http://www.panl10n.net/dareecha/index.htm.

So far in Pakistan, localization efforts have been mostly for Urdu, with some work on Sindhi and Pashto.3 Ismaili, Bhatti & Shah 2011 reports work on a Graphical User Interface (GUI) for Sindhi. T. Rahman (2004) argued early on for the need for localization involving all the languages of Pakistan.

3

Rasool Bux Sarang reports that Microsoft has included Panjabi and Sindhi as Locale Languages in its Windows 8 operating system (http://download.cnet.com/Punjabi-and-Sindhi-Keyboard-Layouts/3000-2110_4-75453482.html). Pashto and Kashmiri are also available as Locale Languages (http://msdn.microsoft.com/en-us/goglobal/bb964664.aspx).

8.3. Language and linguistic resources
Edited by Elena Bashir

This section focuses on the development of the kinds of resources that enable localization efforts and the kinds of practical applications discussed in Section 8.4, as well as typological or theoretical linguistic research on particular languages. It includes contributions on the history and methodologies of such work in India by Niladri Sekhar Dash (Section 8.3.1.2.1), work on Sanskrit by Amba Kulkarni (Section 8.3.1.2.2), and work in Nepal by Yogendra P. Yadava (Section 8.3.1.3).

8.3.1. Corpus and lexical resources

8.3.1.1. Early work
By Elena Bashir

Work on corpus development for South Asian languages began largely outside South Asia. These early efforts faced the challenges of building an electronic corpus at a time when most written texts of South Asian languages were either on paper or in the form of image files. The earliest such work I am aware of is an Urdu corpus based solely on the (text-based) BBC Urdu website (Becker & Riaz 2002). The corpus was intended, on completion, to consist mostly of raw Urdu text marked up to the paragraph level so it could be used as input for natural language processing (NLP) tasks. It was also to be manually part-of-speech (POS) tagged, with the aim of training and testing NLP tools. Riaz 2007 and Riaz 2010 deal with Urdu stemming and named entity recognition, respectively.

A major pioneering effort, the EMILLE/CIIL Project (Enabling Minority Language Engineering), was established in 2000 at the Universities of Lancaster and Sheffield and continued with the cooperation of the Central Institute of Indian Languages, Mysore. The goals of the Project (extending a language engineering (LE) architecture, developing corpora, and developing basic LE tools) are explained at http://www.emille.lancs.ac.uk/about.php. The primary resource developed by the project is the EMILLE Corpus, a set of monolingual corpora for fourteen South Asian languages, including Bengali, Hindi, Gujarati, Panjabi, Urdu, Sinhala, and Tamil, totaling more than 96 million words, and a parallel corpus of English and five of these languages. See McEnery et al. 2000 and Baker et al. 2004 for discussions of the purposes, methodology, and execution of the project. Hardie et al. 2006 is another accessible account of the work of the EMILLE project. The corpus can be accessed at http://www.lancs.ac.uk/fass/projects/corpus/emille/.
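The paragraph-level markup mentioned above for the Becker & Riaz Urdu corpus can be pictured with a short sketch. It is only an illustration under stated assumptions: the element names `text` and `p`, the attributes `lang` and `n`, and the function name `mark_up_paragraphs` are hypothetical and are not taken from that corpus or from EMILLE; the sample sentences are invented.

```python
import xml.etree.ElementTree as ET


def mark_up_paragraphs(raw_text: str, lang: str = "ur") -> ET.Element:
    """Wrap each blank-line-separated paragraph of raw text in a <p> element.

    The element and attribute names are hypothetical, chosen for illustration.
    """
    root = ET.Element("text", attrib={"lang": lang})
    # Split on blank lines and drop chunks that are empty after stripping.
    paragraphs = [chunk.strip() for chunk in raw_text.split("\n\n") if chunk.strip()]
    for number, paragraph in enumerate(paragraphs, start=1):
        p = ET.SubElement(root, "p", attrib={"n": str(number)})
        p.text = paragraph
    return root


# Two invented Urdu paragraphs separated by a blank line.
sample = "یہ ایک مثال ہے۔\n\nیہ دوسرا پیراگراف ہے۔"
corpus = mark_up_paragraphs(sample)

# Serialize as a Unicode string so the Urdu text stays readable.
print(ET.tostring(corpus, encoding="unicode"))
```

Markup of this kind leaves the raw text untouched inside each paragraph, so later layers such as POS tags could in principle be added without re-segmenting the source.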

8.3.1.2. India

An overview of the history and methodologies of lexical and corpus development research in India (Section 8.3.1.2.1) is followed by a specialized discussion of work on Sanskrit (Section 8.3.1.2.2).

8.3.1.2.1. History and methodologies
By Niladri Sekhar Dash

The development of indigenous corpora in Indian languages began in earnest in 1991, when computer scientists and linguists together initiated a concerted effort, under the aegis of the Department of Electronics (DOE), Government of India, to generate digital text corpora for most of the Indian languages for use in Natural Language Processing (NLP). This effort, however, should not be considered the first of its kind in India, since corpus generation had begun earlier, at the individual level, at Shivaji University, Kolhapur (Shastri 1988). To the best of our knowledge, the Kolhapur Corpus of Indian English (KCIE) is the first Indian corpus to have been systematically developed following the principles and norms adopted for the Brown corpus (Francis & Kucera 1964) and the Lancaster-Oslo-Bergen (LOB) corpus (Atwell, Leech & Garside 1984). This corpus, as the name shows, consists of written text samples of modern Indian English compiled with the goal of making cross-comparisons between British English, American English, and Indian English (Shastri 1988). The KCIE consists of approximately one million words of Indian English drawn proportionally from texts published in 1978. Text samples were collected from fifteen different text categories to make it maximally comparable to Brown and LOB. The text samples were keyed into a computer manually in ASCII (American Standard Code for Information Interchange) so that the corpus would be maximally retrievable and accessible by its target users. At present, the KCIE has been included as a representative sample of Indian English in the International Corpus of English and is treated as one of the most reliable and authentic linguistic resources for comprehensive analysis and description of Indian English.
Although the KCIE was initially planned to be maximally comparable to Brown and LOB, it nevertheless deviates from the standard format in some respects, owing to certain logical and practical considerations. The first deviation is that it does not match Brown and LOB in synchronicity: while Brown and LOB store sample English texts published in 1961, the KCIE stores sample English texts published in 1978. In spite of this difference in years, the corpus designers were very careful to preserve the maximum amount of comparability between these corpora and to maintain parallelism in text representation across genres and text types. Thus the value of the KCIE for describing the actual form and nature of
Indian English is attested in its faithful representation of sample texts. First, it succeeds in projecting the distinct texture of Indian English — enriched with unique lists of Indian words and terms (Wilson 1992). Second, it faithfully reflects the differences noted in the unique syntactic patterns (Shastri 1992, 1996) and semantic loads of Indian English (Schilk 2006). Finally, it exhibits the independence of Indian English from the overwhelming shadow of British English (Mukherjee 2002). Thus, KCIE is successful in exhibiting the “Indianness” of Indian English as a notable phenomenon of post-Independence India, which has evolved into a distinct entity within the last few decades (Leitner 1991). As the KCIE is meant to represent Indian English as used in printed and published texts, the text samples are accumulated in a simple, stratified, and random manner from the following sources: (a) printed books (140 sample texts from 1200 titles of different disciplines); (b) government documents (37 sample texts from central and state governments); (c) press materials (53 sample texts from six national and fifteen regional English newspapers); and (d) periodicals (50 text samples from almost all disciplines). The most notable difference of the KCIE from the other two corpora, however, is that the balance in text representation is slightly tilted towards informative prose. This seems inevitable as texts available in imaginative prose fall short of the number required for balance in composition of the corpus. Moreover, the larger number of short stories than full-length novels helps to tilt the balance towards informative texts. It is clear that the KCIE cannot claim to be maximally representative of Indian English, as it is built to ensure maximum comparability with other English corpora rather than developing a maximally representative text database of Indian English. 
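The stratified random selection described above can be illustrated with a short sketch. It is written in Python purely for illustration. The per-source quotas and the figure of 1200 book titles come from the text, but the other pool sizes, the item names, and the function `stratified_sample` are hypothetical placeholders and do not describe the actual KCIE compilation procedure.

```python
import random

# Sample quotas per source type, as reported for the KCIE.
QUOTAS = {"books": 140, "government": 37, "press": 53, "periodicals": 50}

# Candidate pools. Only the 1200 book titles come from the text;
# the other pool sizes and all item names are hypothetical.
POOLS = {
    "books": [f"book-{i}" for i in range(1200)],
    "government": [f"gov-{i}" for i in range(300)],
    "press": [f"press-{i}" for i in range(500)],
    "periodicals": [f"periodical-{i}" for i in range(400)],
}


def stratified_sample(pools, quotas, seed=0):
    """Draw a simple random sample of fixed size from each stratum."""
    rng = random.Random(seed)  # seeded so that the draw is reproducible
    sample = {}
    for stratum, quota in quotas.items():
        pool = pools[stratum]
        if quota > len(pool):
            raise ValueError(f"stratum {stratum!r} has fewer than {quota} items")
        # random.sample draws without replacement
        sample[stratum] = rng.sample(pool, quota)
    return sample


sample = stratified_sample(POOLS, QUOTAS)
total = sum(len(items) for items in sample.values())
print(total)  # 280 sample texts in all
```

Fixing the quota per stratum, rather than sampling from one undivided pool, is what keeps the proportions across source types under the compiler's control.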
Despite several limitations and shortcomings, the KCIE has immense value in Indian linguistics in general, because it provides a great opportunity to analyse Indian English with close reference to real-life usage. In fact, the availability of the KCIE makes it possible to observe the distinct features of Indian English, which had not been possible before. At present it is used as a primary resource for Indian English, and has been made open for various linguistic studies and investigations. It is freely available for applied linguistics and language technology work, such as designing language processing tools and teaching materials. The next project for generating corpora in Indian languages began in 1991 under the aegis of the Department of Electronics & Information Technology, Government of India, in the form of a pan-Indian project entitled Technology Development for Indian Languages (TDIL), the goal of which is to develop text corpora in digital form for all major Indian languages. The mission of the project was to collect representative text samples of three million words from each Indian language, following a uniform sampling technique through which proportional amounts of language data are collected from different disciplines (Murthy & Despande 1998: 2). This project is distributed among several research groups across the country, and each group is entrusted with the tasks of compiling text corpora
from the language(s) assigned to it and of processing the corpora for developing systems and tools for spell-checking, morphological processing, grammar checking, machine translation, information retrieval, lexical databases, and other applications. The basic objectives of the project may be summarised as follows (Murthy & Despande 1998: 4): (a) Develop a text corpus for each Indian language as a true representative of the language. (b) Develop language processing tools to facilitate man-machine interaction. (c) Enhance multilingual knowledge sharing and management across Indian languages. (d) Promote development of techniques and systems for cross-lingual studies and research. (e) Generate tools and resources for machine translation, human-machine interaction, language learning, and other applications. Keeping these objectives in view, the TDIL project was initiated to generate machine-readable texts in major Indian languages, build machine-aided translation systems for Indian languages and from English to Indian languages, develop man-machine interfaces for information exchange, and develop computer-assisted techniques for language learning and teaching. At the preparatory stage certain questions were raised about the effort for corpus generation in Indian languages. It had to be decided whether the proposed corpora would contain samples of written texts, samples of transcribed spoken texts, or both. Although it is acknowledged that speech is a better representative of the basic structure and fundamental organisation of a natural language (Biber 1986, Halliday 1989), this project focussed on written texts only, owing to certain technical and logistic constraints that were simply insurmountable at the time when the project was started.
The actual work of corpus development started in 1992 with a search for suitable answers to the following questions: why was there a need for developing corpora in Indian languages; who were to develop these corpora and how; how large would these corpora be; who would be interested in using these corpora, where and how; what kind of texts would be included in these corpora; how much time would be required to develop these corpora; which time-span would be represented in these corpora; what kind of language data would be collected and from which texts; and what kind of formalities would be followed for final text storage and representation (Dash 2003). Many such questions needed to be addressed satisfactorily before the actual work of corpus generation started. The generation of the TDIL corpora involved various linguistic issues, such as size of corpora, choice of documents, collection of documents (e.g. books, newspapers, magazines, periodicals), selection of text samples, sorting of text materials, manner of page selection (random, regular, selective), determination of target
users, manner of data input, methods of corpus sanitation, management of corpus databases, and release of corpora for public access. Although most of the issues were addressed adequately with reference to English language corpora (Atkins, Clear & Ostler 1992), some of them were found to be redundant for Indian languages. On the other hand, some other issues, which were not addressed elsewhere, were considered necessary for Indian languages (Dash 2007). Most of these issues were settled for the TDIL corpora as follows (Dash 2009). (a) Corpus size: each corpus contained three to four million words from each language. (b) Text representation: samples were selected from all kinds of texts. (c) Text selection: both random and selective methods were adopted for text selection. (d) Time span: texts produced between 1981 and 1990 were selected. (e) Documents: all documents were classified based on subject area, year of publication, and title. (f) Newspapers: classified according to name, year, month, and date. (g) Books: from all disciplines published between 1981 and 1990. (h) Authors: texts produced by authors irrespective of gender, age, ethnicity, influence, race, locality, merit, education, or profession were selected. (i) Target users: people of all walks of life. Following these principles and strategies, the working groups were able to generate a text corpus of three to four million words in each of the Indian languages included in the project by the end of March 1995. The corpora are now in the custody of the Central Institute of Indian Languages (CIIL), Mysore, which has the authority to disseminate the resources to interested Indian scholars and institutes for research and development purposes. Unfortunately, these corpora have so far been little used, because they were generated in ISCII (Indian Script Code for Information Interchange), and converting ISCII text to Unicode is not straightforward.
Although the corpora of some of the languages (e.g. Hindi, Bengali, Telugu, Marathi) have been converted into Unicode by individual efforts, the rest of the database still remains in ISCII and is therefore still not readily usable. Once converted into Unicode-compatible texts and normalized, these corpora will: (a) present better scope for studying variations of the relations of linguistic elements across all text types; (b) provide a wider spectrum of language use to study the frequency of occurrence of various linguistic items; (c) provide better scope for exploring behavioural patterns of various language elements; (d) increase the number of citations of lexical items in a language, which will make possible systematic classification of linguistic items in terms of their usage patterns and meanings;
(e) ensure better opportunity to obtain all kinds of statistical results of language elements with the aim of making various observations and hypotheses; (f) provide a wider spectrum of usage patterns of individual lexical items, to generalise about the grammar of a language (Sinclair 1991: 18); (g) give scope for faithful descriptive study of patterns of use of compounds, lexical collocations, phrases, clauses, sentences, technical and scientific terms, idioms and proverbial expressions in the language; (h) provide opportunity to track the coinage of new words and their fields of usage, and to find variations in the sense of words caused by variation in context, as well as patterns of formation of words, compounds, idioms, and phrases; (i) allow authentic analysis and citation of examples of spelling variation, a major issue in most of the Indian languages, including Bengali, Oriya, and Tamil. In 2012, the Indian Languages Corpora Initiative (ILCI) Phase-I project was completed; in it, parallel corpora were developed across twelve Indian languages, with Hindi as the source language and the others as target languages. The first phase of the project (2009–2012) generated 50,000 POS-tagged parallel sentences in each of the Indian languages involved in the project, covering two major domains of human knowledge: health and tourism. The corpus database thus comprises 600,000 annotated sentences, with each sentence averaging sixteen or more words (Jha 2010). The most striking feature of this corpus database is that parallelism is preserved at its highest possible level across the Indian languages, making it an indispensable resource for cross-lingual machine translation, information retrieval, and cross-cultural research and investigation.
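The Phase-I figures can be cross-checked with simple arithmetic. The sketch below is in Python, with variable names chosen here for illustration. It treats the reported average of sixteen words per sentence as a lower bound, so the word total it derives is a minimum estimate and not a figure reported by the project.

```python
LANGUAGES = 12                    # languages in ILCI Phase I
SENTENCES_PER_LANGUAGE = 50_000   # POS-tagged parallel sentences per language
MIN_AVG_WORDS_PER_SENTENCE = 16   # "sixteen or more words" per sentence

total_sentences = LANGUAGES * SENTENCES_PER_LANGUAGE
min_total_words = total_sentences * MIN_AVG_WORDS_PER_SENTENCE

print(total_sentences)   # 600000
print(min_total_words)   # 9600000
```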
Another important contribution of this project is the development of a nationally approved tagset (known as the BIS tagset), a benchmark standard to be adopted and used across all Indian languages (Dash 2013). A product of this project is a bilingual parallel lexical database that will eventually lead to the compilation of digital bilingual dictionaries (with Hindi as the source language) in all the languages involved in the project. This corpus is available in Unicode format for general access from the Data Centre of the TDIL homepage of the Department of Electronics & Information Technology, Ministry of Communication & Information Technology, Government of India (http://www.tdil.mit.gov.in/Default.aspx).4

4 One has to send a formal request to the TDIL, which will then provide the corpus. http://tdil-dc.in/index.php?option=com_vertical&parentid=58&lang=en.

The second phase of the ILCI (Phase-II) project, which began in February 2012, includes the remaining eleven constituent languages and adds 1,700,000 new sentences (1,100,000 sentences in eleven new languages including the four languages of the North-East, and 600,000 in the twelve languages treated in
Phase I). The total size of the corpora after the second phase is estimated to be approximately 38,000,000 parallel-annotated words including all the 23 languages in the domains of (a) health, (b) tourism, (c) agriculture, and (d) entertainment. The recent work on developing IndoWordNet on a pan-Indian basis has been well received internationally (Bhattacharyya, Fellbaum & Vossen 2010), because a well-formed and well-framed WordNet has been accepted as one of the most valuable resources for language processing, machine learning, language education, machine translation, information retrieval, and many other academic and commercial activities. Since the development of a digital lexical resource in the form of IndoWordNet has tremendous application potential in a multilingual country like India, sincere efforts have been under way since the year 2000, by individual institutes and consortia, to develop WordNets for a number of Indian languages keeping Hindi as the pivot language (Bhattacharyya 2010). WordNets are, in essence, networks of lexical structures composed of synsets (sets of synonyms) and the semantic relations among them: each synset groups words that refer to the same or nearly the same idea or concept, and synsets are linked with one another by semantic relations such as hypernymy and hyponymy (is-a relations), meronymy and holonymy (part-of relations), or troponymy (manner-of relations). In WordNet creation, central attention is paid not to the words but to the concepts which the word(s) are meant to denote. A concept can have several words to denote it. In that case, all the words denoting this concept are competent candidates for expressing the concept.
In such a situation, the particular word which can best express the concept is treated as the “primary unit” and assigned a specific identity code (ID), while the other words are considered synonyms of the primary unit and are, therefore, put together with it under a single synset. For example, in Hindi, the concept of WOMAN is expressible by several terms like mahilā, nārī, strī, kāminī, abalā, and ramanī. Since these words stand for a common concept, they are grouped as a set of synonymous words within a synset, where the word mahilā may be assigned a specific ID as it is more generic, best explicates the concept, and has the highest percentage of use in a balanced monitor corpus of modern Hindi. Depending on the number and relationships of the language(s) to be covered, WordNets are constructed using two basic approaches (Vossen (ed.) 1998): the Merge Approach and the Expansion Approach. If a WordNet is being developed for a single language without including geographically, genealogically, or typologically related languages, then the Merge Approach is preferable. In this approach, an effort is first made to prepare and record an exhaustive sense repository of each word found to be used in the language. The lexicographers and lexicologists are then encouraged to construct synsets for each unique sense found in the language, following the principles of minimality, coverage, and replaceability, the three controlling constraints that govern the creation of synsets. On the other hand, if a
WordNet is being developed for a group of languages geographically, genealogically, or typologically related, then the Expansion Approach is the better strategy. In this case, a particular language is treated as the Source Language (SL) and a careful attempt is made to generate an exhaustive list of synsets after carefully studying the meanings of words available in the language. Once the list is generated in the SL, the lexicons of the target languages (TL), that is, the member languages of the group, are accessed to extract conceptually equivalent lexical items from their respective vocabularies, which are then tagged with the synsets of the SL as representing similar concepts, ideas, or meanings. In the case of IndoWordNet, both these approaches are followed. For Hindi WordNet the first approach was adopted (Chakrabarti et al. 2002) following the basic design principles of the Princeton WordNet for English (Fellbaum 1998). At first, synsets in Hindi were created manually by looking up the various listed meanings of words in different dictionaries. Once the lists were ready, the Expansion Approach was used to develop the Sanskrit WordNet (M. Kulkarni et al. 2010), Telugu WordNet (Arulmozi 2010), and WordNets of other Indian languages. After realizing the enormous functional relevance of the resource, several consortia (e.g. Dravidian WordNet, Indradhanush) were formed to construct a general IndoWordNet for eighteen languages of India, including (alphabetically) Assamese, Bengali, Bodo, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Malayalam, Manipuri, Marathi, Nepali, Odia, Panjabi, Sanskrit, Tamil, Telugu, and Urdu. These languages cover the length and breadth of India and are used by about nine hundred million people across the world. The primary concern of the IndoWordNet is to express a concept unambiguously in the target languages wherever possible.
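A synset of the kind described above can be sketched as a small data structure. The Python fragment below is illustrative only: the class `Synset`, the function `build_synset`, the ID value 101, and the corpus counts are assumptions made for this example and do not reflect the actual IndoWordNet schema or real frequency data. It also reduces the choice of primary unit to corpus frequency alone, whereas the text names genericness and explanatory value as further criteria.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    synset_id: int   # hypothetical ID, not a real IndoWordNet ID
    concept: str     # the concept that the member words denote
    members: list    # primary unit first, then its synonyms
    hypernym_ids: list = field(default_factory=list)  # is-a links to other synsets

    @property
    def primary_unit(self):
        return self.members[0]


def build_synset(synset_id, concept, corpus_counts, hypernym_ids=()):
    """Rank candidate words by corpus frequency; the top word is the primary unit."""
    ranked = sorted(corpus_counts, key=lambda w: (-corpus_counts[w], w))
    return Synset(synset_id, concept, ranked, list(hypernym_ids))


# Hypothetical frequency counts standing in for a balanced corpus of modern Hindi.
woman_counts = {"mahilā": 900, "nārī": 400, "strī": 350,
                "kāminī": 20, "abalā": 15, "ramanī": 10}

woman = build_synset(101, "WOMAN", woman_counts)
print(woman.primary_unit)  # mahilā
print(woman.members)
```

Under the Expansion Approach, the conceptually equivalent words of each target language would be attached to the same `synset_id` rather than to a new one.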
The basic guidelines followed for all the language groups in the process of synset generation for IndoWordNet are: (a) Synsets are divided into six broad classes: UNIVERSAL (synset which has indigenous lexemes for a concept in all languages, e.g. sun, moon, star), PAN INDIAN (synset which has indigenous lexemes in all Indian languages but no equivalent in English, e.g. pāpaḍ [pa:pɔɽ] ‘thin cake of dried ground pulses variously spiced’), IN-FAMILY (synset found in a particular language family, e.g. kal/ḷ- ‘toddy, fermented juice from the flower of the palmyra tree’ in Dravidian,5 found in Tamil kaḷ, Telugu kallu, Kannada and Malayalam kaḷ, Tulu kali), LANGUAGE SPECIFIC (synset unique to a particular language, e.g. bihu ‘a kind of group dance of Assam’ in Assamese), RARE (specific technical terms, e.g. modem), and SYNTHESIZED (synset that is created in a language due to influence of another language, e.g. pizza).

5 This is a tree in south India, not found elsewhere. Thanks to Professor E. Annamalai for providing this example.

(b) Member groups should not transliterate the words used in the Hindi synset. They have to understand the meaning or concept expressed by the synset
with the help of the attached glossary and example sentence(s) to select the most appropriate conceptual equivalents from their languages in order of frequency. (c) To express the concepts, the member groups will provide conceptual equivalents from their respective languages either as single-word units (most preferred) or multi-word units. (d) To supply conceptual equivalents, one of these four alternative methods will be followed, in order of preference: regular words in dictionary, transliteration, short phrases or multiwords, and neologisms (coinage of new words). (e) Dictionary words will be included in IndoWordNet according to their frequency of occurrence in their respective languages. For example, if in Bengali mānus, jan, byakti, and lok are used as synonyms to denote the concept HUMAN, then mānus is the first choice as the nearest conceptual equivalent as it records the greatest frequency of occurrence in the language as well as being the most generic concept. (f) Transliteration will be used for those scientific, technical, and medical terms where no native language equivalents are available or where translation would produce a term which is less intelligible for common users. For instance, the English term “diabetes” should be transliterated in Bengali as ডায়াবেটিস ḍāyābeṭis, rather than using mɔdhumeha, which is less intelligible to common people. (g) The use of short phrases or compounds is necessitated when the concept is available in the target language but there is no single word unit that can express the concept. Therefore, the best option open, in this case, is to go for short phrases or multiword expressions. (h) Neologisms should be used only when there is a unique concept in the SL for which there is no equivalent concept in the TL. Since this concept is new in the TL, obviously there is no word or expression that can express it. Therefore, neologism is the best option in such situations.
Since coining new words involves the issue of standardization, it is least encouraged. (i) Separate lists of Language Specific Synsets (LSS) should be created in each language, since each TL can have some unique concepts not available in the SL. These unique culture-, region-, or language-specific concepts should be expressed in words not available in other languages. (j) The same synset ID has to be maintained across all the languages involved in the IndoWordNet. This means that if a particular concept is assigned a unique ID, that ID should be used for the words denoting the same concept in all target languages. For instance, the ID of the Hindi word mā̃ ‘mother’ should be exactly the same for Bengali mā, Odia mā, and words denoting MOTHER in other languages.
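Guidelines (d), (e), and (j) lend themselves to a short sketch. The Python code below is a simplified illustration under stated assumptions: the function `choose_equivalents`, the method labels, the frequency values, and the synset ID 201 are all hypothetical, and the special treatment of technical terms in guideline (f) is not modelled.

```python
# Methods for supplying a conceptual equivalent, in order of preference (guideline d).
PREFERENCE = ("dictionary", "transliteration", "phrase", "neologism")


def choose_equivalents(candidates):
    """Return (method, words) for the most preferred method that has candidates.

    `candidates` maps a method name to a list of (word, frequency) pairs.
    Words are ordered by descending frequency (guideline e).
    """
    for method in PREFERENCE:
        pairs = candidates.get(method, [])
        if pairs:
            ranked = sorted(pairs, key=lambda pair: -pair[1])
            return method, [word for word, _ in ranked]
    return None, []


# Bengali candidates for HUMAN; the frequencies are invented for illustration.
human_bn = {"dictionary": [("lok", 300), ("mānus", 800), ("byakti", 150), ("jan", 200)]}
method, words = choose_equivalents(human_bn)
print(method, words)  # dictionary ['mānus', 'lok', 'jan', 'byakti']

# One synset ID shared by all languages (guideline j); 201 is a made-up ID.
mother = {201: {"Hindi": ["mā̃"], "Bengali": ["mā"], "Odia": ["mā"]}}
print(sorted(mother[201]))  # ['Bengali', 'Hindi', 'Odia']
```

Keying the cross-lingual table on the synset ID, rather than on any one language's word, is what lets every language group attach its own equivalents to the same concept.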

The construction of IndoWordNet will, no doubt, open up a new vista of lexical resources for most of the Indian languages. It will eventually become an open resource usable in all works of descriptive, applied, and computational linguistics.6

8.3.1.2.2. Sanskrit
By Amba Kulkarni

8.3.1.2.2.1. Electronic corpora

Classical languages like Sanskrit, in addition to being repositories of classical literature, play an important role as carriers of culture and history. With the advent of the electronic age, it was natural that enthusiasts as well as academics took an interest in preserving the texts using new technology. Like other Indian languages, Sanskrit also faced initial hurdles as regards keyboards, fonts, etc. Scharf & Hyman 2012 examines fundamental linguistic issues in encoding Sanskrit texts and discusses processing principles relevant to the present technology. The birth of the Internet paved the way for enthusiasts to make their individual collections public. Thanks to World Wide Web technology, numerous digital collections of Sanskrit could be made available online. The critical edition of the Sanskrit epic Mahābhārata, based on John Smith’s revision of Muneo Tokunaga’s version of the text, has been made available online by the Bhandarkar Oriental Research Institute (http://bombay.indology.info). George Cardona headed a project to create a databank and relational database of major Sanskrit grammatical texts such as the Aṣṭādhyāyī, Mahābhāṣya, Kāśikā, Prātiśākhyas, and Nirukta (http://sanskritlibrary.org). Another major contribution is the bibliographical listing of the philosophical literature of India during its classical phase, and of the secondary material in European languages on this literature, at http://faculty.washington.edu/kpotter/ckeyt/home.htm, with Karl Potter as General Editor. The year 1999–2000 (Kaliyugabda 5101) was declared the “Year of Sanskrit” by the Government of India, and during this year a major initiative, SanskNet, was taken up by the Rashtriya Sanskrit Vidyapeetha, Tirupati, under the leadership of K. V. Ramakrishnamacharyulu.
SanskNet brought together various institutes of traditional learning, oriental research institutes, manuscript collection centers, and libraries to develop a digital corpus of Sanskrit, and make it available online to researchers. In the first phase, around 475 books were digitized. The mammoth task of digitizing Sanskrit texts is being continued further by the Rashtriya Sanskrit Sansthan.

6 Editorial note: Chaplot, Bhingardive & Bhattacharyya 2014 presents a graphical user interface to browse and explore the IndoWordNet lexical database for various Indian languages.

The Göttingen Register of Electronic Texts in Indian Languages (GRETIL), a comprehensive repository of e-texts in Sanskrit and other Indian languages, was started by Reinhold Grünendahl, with the intention of its being a ‘cumulative register of the numerous download sites for electronic texts in Indian languages’ (http://gretil.sub.uni-goettingen.de/). Rather than scanned books or typeset PDF files, these documents are plain texts, in a variety of encodings, and are machine-readable, so that (for instance) word search can be performed on them. Searches become even more effective if the text is semantically marked. SARIT (http://sarit.indology.info/), an initiative by Dominik Wujastyk, hosts texts that are marked up (tagged) using the rich Text Encoding Initiative (TEI) system and provides a Search and Retrieval facility for them.

8.3.1.2.2.2. Lexicon development

In the late 1990s the Cologne Digital Sanskrit Lexicon (CDSL) project (http://www.sanskrit-lexicon.uni-koeln.de/) undertook the digitisation of major bilingual Sanskrit dictionaries of the 19th century. Major works such as Monier-Williams’ Sanskrit-English dictionary, Böhtlingk and Roth’s Sanskrit-German dictionary, Apte’s English-Sanskrit dictionary, Wilson’s Sanskrit-English dictionary, and Stchoupak’s Sanskrit-French dictionary have been digitised and are available online with various search features. The search features were developed in collaboration with Peter Scharf’s Sanskrit Library Project (http://sanskritlibrary.org). There was a similar effort at Rashtriya Sanskrit Vidyapeetha Tirupati by Varakhedi and his team in the last decade to digitise the Sanskrit encyclopedic lexicon Vācaspatyam (Varakhedi & Jaddipal 2009).
While the knowledge structure of Amarakośa has been explored by computer scientists (http://sanskrit.uohyd.ac.in/scl/amarakosha/), Sanskrit scholars have also taken the initiative in the development of modern e-lexicons such as WordNet (http://www.cfilt.iitb.ac.in/wordnet/webswn/) under the guidance of Malhar Kulkarni. Sanskrit being highly inflectional and productive in its derivational morphology, accessing dictionaries such as Monier-Williams is laborious and needs a good knowledge of grammar. The free word order and the ambiguities at various levels of analysis make it quite difficult to understand a Sanskrit text. Current technology provides a mechanism through which one can bring some relief to the life of a Sanskrit reader. The late 1990s and early years of the 21st century saw various individual efforts towards the development of Sanskrit computational tools. The first parser for Sanskrit using Integer Programming was developed by Pushpak Bhattacharyya in 1987 as a part of his M.Tech. dissertation at IIT-Kanpur. In the early 1990s, the Center for Development of Advanced Computing (CDAC) and the Academy for Sanskrit Research, Melkote, developed morphological generators. An ambitious research and development activity for Sanskrit and other Indian
languages was started by Girish Nath Jha at Jawaharlal Nehru University, Delhi (http://sanskrit.jnu.ac.in). Amba Kulkarni from the Akshar Bharati group, who was actively engaged in developing language accessors among Indian languages taking insights from Pāṇini’s Grammar, joined hands with K. V. Ramakrishnamacharyulu and Srinivas Varakhedi of Rashtriya Sanskrit Vidyapeetha and started developing an accessor for Sanskrit. Gérard Huet, while developing a hypertext Sanskrit-French dictionary, felt strongly the need to link the words to the morphological generator; this resulted in the implementation of a Sanskrit computational linguistics platform conceived as a coordinated set of Web services around a lexical database obtained mechanically from his highly structured Sanskrit Heritage dictionary (http://sanskrit.inria.fr). A typical Sanskrit reader needs access to a dictionary, as well as grammar. The Sanskrit Heritage site provides a user with a hypertext dictionary where the words are linked to the morphological generator, a segmenter to split a continuous Sanskrit text into words, and a word-level and a sentence-level analyser, all under one integrated environment. Peter Scharf (http://sanskritlibrary.org) developed a ‘digital library dedicated to facilitating education and research in Sanskrit by providing access to digitized primary texts in Sanskrit and computerized research and study tools to analyze and maximize the utility of digitized Sanskrit text.’ The Sanskrit Library is also engaged in aligning digital texts with digital images of manuscripts, which allows immediate focused access to particular passages in manuscripts and conversely, when examining a manuscript, to complementary texts, metadata, linguistic tools, and lexical resources.

The Digital Corpus of Sanskrit (DCS) (http://kjc-fs-cluster.kjc.uni-heidelberg.de/dcs/index.php), developed by Oliver Hellwig, is another major collection of Sanskrit texts; it provides access to a searchable collection of lemmatized Sanskrit texts and to the database of the SanskritTagger software. With the SanskritTagger it is possible to analyze unprocessed digital Sanskrit text both lexically and morphologically. The DCS has been designed to support research in Sanskrit philology. It provides search of lexical units and their collocations from a corpus of around 3,000,000 words.

Collaboration between the Institute for Research in Computer Science and Automation (INRIA) in Paris and the Department of Sanskrit Studies of the University of Hyderabad provided a common platform to researchers working in the field of Sanskrit computational linguistics in the form of a symposium. In October 2007, the joint team organized the First International Sanskrit Computational Linguistics Symposium at INRIA. This was followed by a series of symposia: at Brown University (May 2008), University of Hyderabad (Jan. 2009), JNU, Delhi (Dec. 2010), and IIT Bombay (Jan. 2013). The symposium proceedings have been published as Huet, Kulkarni & Scharf (eds.) 2009, A. Kulkarni & Huet (eds.) 2009, Jha (ed.) 2010, and M. Kulkarni & Dangarikar (eds.) 2013.

Applications of modern technology to South Asian languages


The Sanskrit Consortium of seven institutes7 in India developed tools for analysis of Sanskrit texts, viz. Samsaadhanii (http://sanskrit.uohyd.ac.in/scl; http://tdil-dc.in/san/). The tools contain a morphological analyser and generator, tools for segmentation and formation of sandhi, compound processors, a parser leading to kāraka analysis, and a machine translation system from Sanskrit into Hindi. Gaveśikā, a search engine for Sanskrit, was launched by the University of Hyderabad in March 2012. This search engine integrates a Sanskrit morphological analyser with the basic search engine resulting in better search results. Efforts are also in progress to integrate Peter Scharf’s Sanskrit Library, Gérard Huet’s Heritage segmenter, and Amba Kulkarni’s parser (P. Goyal et al. 2012). In addition to developing various tools, the consortium under the guidance of K. V. Ramakrishnamacharyulu has developed tagsets and guidelines for annotating Sanskrit compounds and developing Sanskrit treebanks following Pāṇinian grammar. A treebank of around 4,000 sentences following these guidelines has been developed.

Developing computational tools for Sanskrit poses many challenges. Though Sanskrit has a rich grammatical tradition, this rich literature is not readily digestible by a computer scientist working on Sanskrit. It is only the efforts of Cardona (1988), Gillon (1995), Kiparsky (2009), Staal (1967), and others throwing light on various aspects of Pāṇinian grammar such as the organisation and interpretation of Pāṇini’s sūtras, presenting Pāṇini’s formal grammar in modern linguistic terminology, and describing the structure of the Sanskrit language, that came to the rescue of computational linguists engaged in bridging the gap between modern technology and the enormous descriptive grammatical literature of the Pāṇinian tradition. Pāṇini’s Aṣṭādhyāyī is often compared to a computer program, for its rigour and coverage. No wonder there are notable efforts in “modeling” Pāṇini by Swami Sri Taralabalu (http://www.taralabalu.org/panini/), Mishra (2009), Scharf (2009), P. Goyal, Kulkarni & Behera (2009), and Subbana & Varakhedi (2010).
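The benefit of coupling a morphological analyser to a search engine, as described above for Gaveśikā, can be pictured with a toy lemma-based index. This is only a sketch under stated assumptions: the `LEMMAS` table is a hypothetical stand-in for a real analyser's output, the function names are invented for illustration, and nothing here reflects how Gaveśikā itself is implemented. Real Sanskrit text would also need sandhi splitting before any such lookup.

```python
from collections import defaultdict

# Hypothetical analyser output: inflected form -> lemma (illustrative only).
LEMMAS = {
    "rāmaḥ": "rāma", "rāmam": "rāma", "rāmeṇa": "rāma",
    "vanam": "vana", "gacchati": "gam", "paśyati": "paś",
}

# Two tiny example documents, already split into words.
DOCS = {1: "rāmaḥ vanam gacchati", 2: "sītā rāmam paśyati"}


def lemma_of(form):
    """Return the lemma for a form, falling back to the form itself."""
    return LEMMAS.get(form, form)


def build_index(docs):
    """Map each lemma to the set of documents containing any of its forms."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for form in text.split():
            index[lemma_of(form)].add(doc_id)
    return index


def search(index, query):
    """Look up the query by lemma, so any inflected form can be used."""
    return sorted(index.get(lemma_of(query), set()))


index = build_index(DOCS)
print(search(index, "rāmeṇa"))  # matches documents containing rāmaḥ or rāmam
```

A plain string search for rāmeṇa would find neither document, since that exact form occurs in neither; indexing by lemma is what lets differently inflected forms of the same word match.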

7 International Institute of Information Technology-Hyderabad, Jagadguru Ramanandacharya Rajasthan Sanskrit University-Jaipur, Jawaharlal Nehru University-Delhi, Poornaprajna Vidyaapetha-Bangalore, Rashtriya Sanskrit Vidyapeetha-Tirupati, Sanskrit Academy Osmania University-Hyderabad, and University of Hyderabad-Hyderabad.


8.3.1.3. Nepal
By Yogendra P. Yadava

Applications of modern technology to Nepal’s languages can be traced back to the 1970s, when the Summer Institute of Linguistics (SIL) and Nepalese linguists collected texts from various languages including Nepali (Bandhu 1971), created databases, wordlists, and concordances for them, and attempted to analyze them linguistically within the framework of tagmemic grammar. Most of these works appear in printed form or mimeograph. Later Acharya, in his doctoral dissertation (1991), used a computer programme to create an analyzed corpus of a selected Nepali written text and then prepared a Nepali grammar on the basis of this corpus. While compiling the Prajna Nepali dictionary (ongoing), the Nepal Academy used the SIL computer software, Shoebox, in preparing a partial Nepali text database based on school-level Nepali textbooks and Gorkhapatra, a Nepali daily newspaper, interlinearizing it and making lexical entries. Computer software also supported the compilation of the Dictionary of Classical Newari: Compiled from manuscript sources (Malla 2000), especially for the purposes of scanning and formatting. The Gurung-Nepali-English dictionary (W. Glover, J. Glover & Gurung (eds.) 1977) is a trilingual dictionary prepared using Shoebox. This programme has been used in compiling multilingual dictionaries of several minority languages of Nepal at Tribhuvan University.8

Following these early initiatives, development of corpora and lexical resources started in earnest with the European Union-funded Nepali Language Resources and Localization for Education and Communication (NELRaLEC, called Bhashasanchar in Nepali) project (2005–2007). The most significant outcome of this undertaking is the construction of the 14 million-word Nepali National Corpus (NNC). This corpus includes both spoken and written data, the latter incorporating a Nepali match for the Freiburg-Lancaster/Oslo/Bergen (FLOB) Corpus of British English and a broader collection of texts. Additional resources within the NNC include parallel data (English-Nepali and Nepali-English) and a speech corpus. The NNC is encoded as Unicode text and marked up in CES-compatible XML. The whole corpus is also annotated with part-of-speech (POS) tags (see Yadava, Hardie et al. 2008 for details). Hardie, Lohani, Regmi & Yadava 2005 and 2009 describe the process of devising a tagset and retraining a tagger for the Nepali language, for which no corpus resources previously existed. More recently, Hardie, Lohani & Yadava 2011 deals with the extension of automated text and corpus annotation in Nepali from POS tags to lemmatisation, enabling a more complex set of corpus-based searches and analyses. This approach to tokenisation and lemmatisation may be helpful for analyzing other Himalayan languages with Nepali-like morphological behavior.

Some of the important applications of the NNC are the development of the online Contemporary dictionary of Nepali, the first corpus-based dictionary of the language (http://www.nepalisabdakos.com/),9 Nepali text-to-speech (TTS), developed using the framework of the Festival Speech Synthesis System (http://www.bhashasanchar.org/textspeech_intro.php), and the launch of courses in corpus and computational linguistics at Tribhuvan University. The NNC is being used by linguists as a resource in grammatical research as well. The corpora developed during the Bhasha Sanchar Project are now distributed and maintained by ELRA. See http://universal.elra.info/product_info.php?cPath=42_43&products_id=2077. In addition, under the PAN Localization Project, Phase I (2004–2007) and Phase II (2007–2009) (http://panl10n.net), much work has been accomplished on the software localisation and natural language processing fronts in Nepal. This work includes NepaLinux (http://nepalinux.org), a Linux distribution completely localised into Nepali, and other Nepali language-processing utilities such as the Nepali Spell checker, Thesaurus, Nepali Computational Grammar Analyzer, and corresponding tools including a tokeniser, POS tagger, chunker, and parser (see also 8.2.2.2 above). In addition to these applications, 100,000 words from the Penn Treebank corpus have been translated into Nepali and POS tagged as well. (Details at http://ltk.org.np.) A machine translation project, Dobhase, was completed in 2006 with support from the Pacific Asia Networking Information and Communication Technologies Research and Development (PAN ICT R&D) Grants Program to develop an online machine translation system for English to Nepali. This has been hosted online to be freely accessible at http://ltk.org.np.

8 Editorial note: Michailovsky 2006 is an early summary of digitized resources for languages of Nepal. Hardie 2007 and 2008 are early studies of the collocational properties of Nepali adpositions.
This online machine translation system is capable of taking general English text as input (excluding technical registers) and generating a corresponding Nepali text as output, thus conveying the meaning of the English source text by representing it in Nepali. However, as with many other machine translation systems, the result needs post-editing to make it more natural (Yadava et al. 2005). Also, this system, which is based on a limited lexicon, needs further enlargement and eventual replacement by a system using parallel corpora.

9 The dictionary contains about 7000 lexical entries and requires further enhancement.

8.3.1.4. Pakistan
By Elena Bashir

Linguistics research is in its early stages in Pakistan. Following a typical historical trajectory, language-related work has begun with the compilation of dictionaries. Most of these works, often by language enthusiasts and activists, are, however, still available only in very small, locally published print runs. A valuable and now technologically feasible project would be to scan these documents, apply OCR to them, and collect them in a central database. So far, only a few dictionaries are available in electronic form. These include a searchable online Urdu-Urdu dictionary developed at the Center for Language Engineering (CLE), University of Engineering and Technology, Lahore, which includes essential grammatical information and examples of historical usages (http://182.180.102.251:8081/oud/default.aspx), a Torwali-Urdu dictionary compiled by Inam Ullah, a native speaker of Torwali (http://www.cle.net.pk/otd/), and a Sindhi-English dictionary developed jointly by Jennifer Cole at the University of Illinois Urbana-Champaign and Sarmad Hussain at CLE (http://182.180.102.251:8081/sed1/homepage.aspx). This dictionary has a clickable alphabetical word index and is searchable by either Sindhi-script entries or by roman transliterations.10

Development of a “born digital” lexicon of post-2002 Urdu is described in Ijaz & Hussain 2007. A total of 50,365 words were obtained from a 19.3 million word corpus containing 104,341 orthographically unique words collected from online editions of Jang and BBC Urdu news (www.jang.com.pk and http://www.bbc.co.uk/urdu/). The corpus on which the Urdu lexicon described above is based has been followed by a set of multi-genre corpora (100,000; 500,000; and 1,000,000 words, each of which is a complete subset of the next larger one) drawn from the Urdu Digest, a popular long-standing, general-interest magazine. These corpora are to be further elaborated under the Essential Urdu Language Resources Project, a joint endeavor of the CLE in Lahore and the University of Konstanz in Germany (http://cle.org.pk/eulr/scope.html). This project will also include development of (i) a POS tagset and POS-tagged corpus, (ii) an Urdu WordNet (Ahmed & Hautli 2011), (iii) an Urdu VerbNet, and (iv) a sense-tagged corpus. A three-day International Corpus Linguistics Workshop was organized by the International Islamic University, Islamabad in collaboration with the University of Birmingham and was held on 9–11 February 2015.
Work on other languages of Pakistan is in its early stages; see Section 8.4.3 for discussion of Panjabi, Sindhi, Pashto, and Kashmiri.

10 Some older dictionaries of languages spoken in Pakistan, including Balochi, Panjabi, Pashto, Sindhi, and Urdu, have been digitized and made available at http://dsal/dictionaries/index.html. Born-digital dictionaries of Torwali, Pashto, and Khowar are in various stages of development under the Digital Dictionaries of South Asia component of the Digital South Asia Library at the University of Chicago.

8.3.1.5. Bangladesh
By Elena Bashir

In Bangladesh, work on the development of language resources for Bangla is being done at Daffodil International University in Dhaka, Khulna University in Khulna, and BRAC University in Dhaka. Work at these institutions is represented by Rashel’s (2011) overview of computational linguistics in Bangladesh (Daffodil), Shamshed & Karim 2010 (Khulna) on corpus building for Bangla, Md. Islam & Rajon 2007 (BRAC) on corpus design, and Mahmud & Khan 2007 (BRAC) on Treebank building for Bangla. See also Section 8.2.2.3 for localization work done in Bangladesh under the PAN Localization Project. There is, naturally, a large body of work on Bangla being done in India, e.g. Chakrabarti 2011 on POS tagging for Bangla, and Biswas et al. 2008, an example of a study utilizing the kinds of language resources discussed in Section 8.3.1.2.1. Darbari et al. (eds.) 2008 contains many other valuable papers.

8.3.2. Treebanking – Hindi/Urdu
By Rajesh Bhatt

The term “treebank” refers to corpora whose sentences have been annotated with syntactic structure. Treebanks have played an increasingly important role in current natural language processing techniques. They have also been used for theoretical linguistics research as they allow for easy extraction of linguistically significant patterns and large-scale evaluation of theoretical proposals. Large-scale treebanks now exist for many languages such as Arabic, Chinese, Czech, English, French, German, Icelandic, Korean, Spanish, and Turkish; however, within the South Asian languages, large-scale treebanks have only been constructed for Hindi and Urdu. The Hindi/Urdu treebanks have been built as part of a collaborative NSF project involving the University of Colorado at Boulder, IIIT Hyderabad, the University of Washington, Columbia University, and the University of Massachusetts (Bhatt et al. 2009). Once we decide to build a treebank, we have to decide what kinds of information we want to annotate and what kinds of representations we want to use to represent this information. Most existing treebanks represent one particular kind of information (e.g. syntactic structure) and one particular kind of representation (e.g. phrase structure trees). The treebanks for Hindi/Urdu have been designed to be multi-layered and multi-representational. In addition to representing syntactic information, they also represent semantic information about verb meaning. Moreover, each sentence has a phrase structure tree representation as well as a dependency representation. The treebank for Hindi has 425,000 words, while the treebank for Urdu has 200,000 words. The Hindi corpus is largely drawn from newswires (350,000 words). There are also sections devoted to tourism (50,000 words) and conversational data (25,000 words). The Urdu corpus has a similar breakdown. 
The treebank is currently in pre-release status; interested researchers can request access via http://verbs.colorado.edu/hindiurdu/. The construction of the Hindi/Urdu treebanks involved a number of steps, which are listed below.

(1) a. Tokenization
    b. Morphological analysis
    c. POS tagging
    d. Chunking
    e. Dependency annotation
    f. PropBank (Proposition Bank) annotation
    g. Phrase structure annotation

The first four steps are done automatically, followed by manual annotation of the dependency structure and the PropBank relations. The phrase structure annotation is also created automatically by transforming the dependency structure using additional information from the PropBank.

The first step in the treebanking process involves taking a sentence and breaking it down to the basic elements of the syntactic annotation. What the basic elements are is of course decided by the syntactic theory underlying the annotation. For the Hindi treebank, tokenization follows the pre-theoretic notion of “word” as delimited by whitespace, thus following the lead of orthographic conventions. For example, when a postposition is written together with an immediately preceding pronoun (usne ‘s/he.ERG’), it is treated as a single token, and when it is written with a whitespace in between it is treated as two tokens. One exception is the treatment of hyphenated compounds. These are broken down into hyphen-separated pieces. The separators are also treated as tokens. So bhaai-behen ‘brother-sister’ is tokenized as three tokens, bhaai, ‘JOIN’, and behen. For the Urdu treebank, the process is similar, though one needs to distinguish between perceived whitespace between non-connecting letters and real whitespace.

Once the sentence has been decomposed into its constituent words, the words are morphologically analyzed and tagged with a part-of-speech (POS) label. The Hindi/Urdu treebanks use the ILMT (Indian Language Machine Translation) tag set. The following is an illustration of two words with their POS tags and morphological analyses.

(2) a. laṛake, NN,
    b. ne, PSP,

“NN” is the POS tag for common nouns and “PSP” for postpositions. “fs” stands for “feature structure” and “af” is the composite attribute which consists of the morphological analysis of a word. It includes information such as root, category, gender, number, person, case, tense/aspect, and suffix. In case a word is not specified for one of these attributes, the field is left unspecified. The POS-tagged and morphologically analyzed words are automatically grouped into chunks. Each chunk has a head, and dependency relations are manually marked between the chunk heads. After dependency annotation has been done, the chunks can be automatically expanded. Minimal noun phrases form a chunk together with their modifiers, as do verbs together with any associated auxiliaries. See below for an illustration of chunking.

(3) Chunks = sequences in parentheses
    (raam ke)   (us   ghoṛe  ne)   (kelaa)  (khaayaa  thaa)
    Ram   GEN   that  horse  ERG   banana   eat.PFV   be.PST
    ‘That horse of Ram’s had eaten a banana.’
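The tokenization and chunking conventions described above can be sketched in a few lines of Python. This is a minimal illustration, not the treebank's actual tooling: the function names are invented here, the tags other than NN and PSP (NNP, PRP, DEM, VM, VAUX) are assumed for the purpose of the example, the POS tags are supplied by hand rather than by a tagger, and the chunking rule is a deliberately crude approximation that only covers simple cases like (3).

```python
from __future__ import annotations

JOIN = "JOIN"  # token that stands in for the hyphen inside a compound


def tokenize(sentence: str) -> list[str]:
    """Split on whitespace, then split hyphenated compounds into pieces,
    keeping a JOIN token for each separator."""
    tokens: list[str] = []
    for word in sentence.split():
        for i, piece in enumerate(word.split("-")):
            if i > 0:
                tokens.append(JOIN)
            if piece:
                tokens.append(piece)
    return tokens


NOUN_TAGS = {"NN", "NNP", "PRP"}  # assumed nominal tags for this sketch
VERB_TAGS = {"VM", "VAUX"}        # assumed main-verb and auxiliary tags


def chunk(tagged: list[tuple[str, str]]) -> list[list[str]]:
    """Group (token, tag) pairs into chunks: a nominal with its modifiers
    and postposition, or a verb with its auxiliaries."""
    chunks: list[list[str]] = []
    current: list[str] = []
    for i, (token, tag) in enumerate(tagged):
        current.append(token)
        next_tag = tagged[i + 1][1] if i + 1 < len(tagged) else None
        closes = (
            (tag == "PSP" and next_tag != "PSP")
            or (tag in NOUN_TAGS and next_tag != "PSP")
            or (tag in VERB_TAGS and next_tag != "VAUX")
        )
        if closes:
            chunks.append(current)
            current = []
    if current:  # trailing material that no rule closed
        chunks.append(current)
    return chunks


tokens = tokenize("raam ke us ghoṛe ne kelaa khaayaa thaa")
tags = ["NNP", "PSP", "DEM", "NN", "PSP", "NN", "VM", "VAUX"]  # hand-assigned
print(chunk(list(zip(tokens, tags))))
print(tokenize("bhaai-behen"))
```

On the sentence in (3) the sketch reproduces the four parenthesized groups, and `tokenize("bhaai-behen")` yields the three tokens mentioned in the text. A real chunker would need many more rules, for example for adjectives and noun-noun sequences.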

The dependency annotation is based on a dependency grammar inspired by Pāṇini’s grammar for Sanskrit. Sentences are treated as a series of modifier-modified relations. The nodes in a tree for the most part consist of words. If w2 modifies w1, we have a dependency subtree rooted in w1 with w2 as a daughter. The arc connecting w1 to w2 is labeled with the name of the dependency relation that relates the two words. The annotation scheme used for the Hindi/Urdu treebanks has three kinds of labels. The first kind are kāraka relations, which are relations between a verb and its direct participants. The most important of the kāraka relations are kartā, the locus of activity; karma, the locus of the result; and sampradān, the beneficiary. For convenience, kartā, karma, and sampradān are abbreviated as k1, k2, and k4. Other dependents of the verb such as instruments, locative modifiers, and temporal modifiers are also marked by kāraka relations. Dependents of categories other than verbs are not marked by kāraka relations; instead a second kind of label is used, such as r6 (for genitives) and nmod relc (for relative clauses). Finally, there are also labels that are part of the representation but which are strictly speaking not part of the dependency system. These include pof ‘part of’, which is used to indicate complex predicates, and fragof ‘fragment of’, which is used to relate elements which would have been chunked together but cannot because of non-contiguity. Here is a sample dependency representation.

(4) (Parentheses indicate chunking, boldface indicates head of chunk)
    (jo   laṛkaa)  (naac   rahaa      hai)  vo  (aatif  ko)   (jaantaa        hai)
    REL   boy      dance   PROG.M.SG  is    he  Atif    ACC   know.HAB.M.SG   is
    ‘The boy who is dancing knows Atif.’
    (The dependency label which should label the arc connecting the modifier and the modified is shown right above the modifier.)
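As a rough picture of what a labeled dependency structure over chunk heads looks like as data, the sketch below encodes sentence (3) with the labels k1, k2, and r6 introduced above. The `Chunk` class and `arcs` function are hypothetical names chosen for illustration, and the label assignments are a reading of the definitions in the text (agent as k1, affected object as k2, genitive as r6), not an excerpt from the treebank itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A chunk with its words, its head word, and its labeled dependents."""
    words: list[str]
    head: str
    dependents: list[tuple[str, Chunk]] = field(default_factory=list)

    def add(self, label: str, dependent: Chunk) -> Chunk:
        """Attach a dependent chunk under the given relation label."""
        self.dependents.append((label, dependent))
        return self


def arcs(node: Chunk):
    """Yield (head, label, dependent) triples, depth-first."""
    for label, dependent in node.dependents:
        yield (node.head, label, dependent.head)
        yield from arcs(dependent)


# Sentence (3): (raam ke) (us ghoṛe ne) (kelaa) (khaayaa thaa)
ram = Chunk(["raam", "ke"], head="raam")
horse = Chunk(["us", "ghoṛe", "ne"], head="ghoṛe").add("r6", ram)   # genitive
banana = Chunk(["kelaa"], head="kelaa")
verb = Chunk(["khaayaa", "thaa"], head="khaayaa")
verb.add("k1", horse).add("k2", banana)  # kartā and karma of the verb

for head, label, dependent in arcs(verb):
    print(f"{dependent} --{label}--> {head}")
```

Because relations hold between chunk heads, the genitive chunk attaches to ghoṛe ‘horse’ rather than to the verb, which matches the statement that dependents of non-verbal categories receive non-kāraka labels.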

For the most part, the dependency annotation eschews null elements. Thus there are no silent subjects or traces. The exception to this is in cases where postulation of a null element is necessary to construct a connected tree. This is the case with coordinations without overt coordinators, free relatives, and most significantly ellipsis. In a number of ellipsis constructions such as gapping, the verb goes missing and a silent verb needs to be postulated for the dependents of the elided verb to have something to modify. PropBanking provides semantic role information. For all the verbs in the treebank corpus, a set of semantic roles is defined. The corpus is then tagged with verb-specific semantic role information. In the Hindi/Urdu treebanks, PropBanking has been done on top of the dependency annotation. As well as providing a PropBank label in addition to the dependency label, PropBanking makes certain kinds of implicit syntactic information explicit. The dependency annotation does not distinguish between unaccusative verbs like khul- ‘open’ and unergative verbs like naac- ‘dance’ but the PropBank does. The PropBank label for the unique argument of khul- ‘open’ is Arg1 (‘thing opened’) while the PropBank label for the unique argument of naac- ‘dance’ is Arg0 (‘agent of dancing’). The PropBank also makes null subjects explicit, in both finite and non-finite clauses. Finally, we turn to the phrase structure representation, which is inspired by the Minimalist Program. The representation is systematically binary branching. Unlike the dependency system, it assumes that there is a canonical word order and divergences from that word order are indicated using traces. Each simplex clause is taken to correspond to a verbal projection with three designated positions, which are indicated in the following tree.

The node label VP corresponds to the full clause in other treebanks and the node label VPPred to VP. In an ordinary transitive, the subject would occupy the XP1 position, and the object would start in XP3 and move to XP2. In an unaccusative, the same phrase would occupy all three positions. With small clause complements, all three positions would be occupied by different elements.

(5) miinaa  aatif  ko   bewakuuf  samajhtii       hai
    Mina    Atif   ACC  stupid    consider.HAB.F  is
    ‘Mina considers Atif stupid.’
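The way the PropBank layer exposes the unaccusative/unergative contrast discussed earlier can be illustrated with a small lookup table. This is a sketch only: the `FRAMES` dictionary holds just the two verbs and role glosses mentioned in the text, the function names are invented for illustration, and treating "sole argument is Arg1" as a test for unaccusativity is a simplifying assumption of this example rather than a documented procedure of the project.

```python
# Role sets for the two verbs discussed in the text (root -> {label: gloss}).
FRAMES = {
    "khul": {"Arg1": "thing opened"},
    "naac": {"Arg0": "agent of dancing"},
}


def sole_argument_label(root):
    """Return the PropBank label of a one-argument verb's only argument."""
    roles = FRAMES[root]
    if len(roles) != 1:
        raise ValueError(f"{root!r} does not have exactly one argument")
    (label,) = roles  # unpack the single key
    return label


def is_unaccusative(root):
    """Simplifying assumption: a sole Arg1 argument signals unaccusativity."""
    return sole_argument_label(root) == "Arg1"


for root in ("khul", "naac"):
    kind = "unaccusative" if is_unaccusative(root) else "unergative"
    print(root, sole_argument_label(root), kind)
```

Information of this kind is what the automatic conversion to phrase structure would need in order to decide whether a sole argument occupies all three clausal positions or only the subject position.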

Unlike the dependency and PropBank representations, which have been manually annotated for the full corpus, the phrase structure representations have been manually annotated for only a very small subset of the corpus. The rest of the phrase structure representations will be derived automatically from the dependency annotation, using additional information about unaccusativity, null arguments, and causatives.

8.4. Applications
By Elena Bashir

8.4.1. Introduction

The techniques, tools, and applications of computational linguistics are often called language engineering (LE) or language technology (LT). One goal of LE is to create software products which can respond to natural language, both written texts and speech, to improve human-machine interaction. End-use applications of such natural language interfaces include database queries, search engine technology, monolingual and crosslingual information retrieval and management, machine translation, expert systems, and robot control. Systems for crosslingual information and knowledge management will help surmount language barriers for e-commerce, education, and international cooperation. The emergence of the information society and the rapid growth of the Internet/WWW pose urgent new challenges for LT. Language technology for content management is necessary to turn the exponentially increasing wealth of digital information into structured, easily accessible knowledge. For browsing, navigating, filtering, and processing the information on the web, we need software that can identify the contents of documents, for example by a method called Named Entity Recognition. The increasing multilinguality of the global Web means that it can only be mastered with the help of multilingual tools for indexing and navigating; hence the importance of localization efforts.

Specific computational results that enable the overarching purposes described above include: at the lexical level, development of corpora, dictionaries, and concordances; at the morphological level, morphological analysis, stemming, and spell checking; at the syntactic level, part-of-speech (POS) tagging, parsing, and grammar checking; and at the semantic level, machine translation, information extraction, discourse analysis, and dialogue systems. Necessary for all of these written-language applications is optical character recognition (OCR); for applications involving spoken language, software for speech recognition, text-to-speech conversion including speech synthesis, and speech-to-text conversion is needed.

8.4.2. Countrywise summaries

The volume and variety of computational linguistics work on South Asian languages are increasing rapidly. The following sections summarize some of the important work done in India, Bangladesh, Nepal, Sri Lanka, and Pakistan.

8.4.2.1. India

Computational linguistics applications in India, with its multitude of languages and relatively advanced technology, are so numerous that space here allows mention of only a few centrally important websites, where many further links to specific information can be found. These include: http://tdil.mit.gov.in/, http://www.cse.iitb.ac.in/~pb/pubs-yearwise.html (a list of technical papers), http://www.tdil-dc.in/, and www.ildc.in. The Technology Development for Indian Languages (TDIL) website includes a (semi-)annual newsletter, Vishwabharat (http://tdil.mit.gov.in/Publications/Vishwabharat.aspx#content), which publishes non-technical information on products, tools, services, activities, developments, and achievements in the area of Indian language software. Sourabh 2013, a literature review on cross-language information retrieval (CLIR) work in India, contains many recent references. It discusses CLIR and machine translation (MT) systems for Indian languages. Various approaches to CLIR utilize MT, parallel corpora, or bilingual dictionaries. This survey paper covers major ongoing developments in CLIR and MT with extensive lists of specific projects, along with the participants, status, results, and Government participation in the projects. Activities and results under the Eleventh Five Year Plan (2007–2012) are detailed. See also Section 8.3.1.2 above for work on development of linguistic resources in India.


8.4.2.2. Bangladesh

In Bangladesh, an International Conference on Computer and Information Technology has been held annually since 1997 (http://en.wikipedia.org/wiki/International_Conference_on_Computer_and_Information_Technology).11 L. Rahman & Hossain 2006 gives data on the numbers of papers (total of 127) on Bangla computational linguistics presented in various conferences from 1997 to 2005. M. S. Islam 2009 is a similar state-of-the-art summary, discussing OCR, morphological parsing, speech processing, and MT. Some early articles on MT are M. Ali & Ali 2002, Uddin, Ashraf et al. 2004, and Uddin, Murshed & Hasan 2005. Numerous articles on phonetic analysis of Bangla sounds for computational purposes are listed at http://faculty.daffodilvarsity.edu.bd/akhter/index.php.

8.4.2.3. Sri Lanka

The early report at http://www.emille.lancs.ac.uk/reports/milleftreport2.pdf describes the prospects for language engineering work in Sri Lanka prior to the major efforts of the PAN Localization Project. For work done under this Project, see Section 8.2.2.1 above. Silva & Weerasinghe 2008 discusses English-to-Sinhala machine translation. Weerasinghe, Herath & Welgama 2009 discusses a corpus-based lexicon, and Welgama et al. 2011 is on developing a Sinhala WordNet. Additionally, many research papers on computational linguistics in Sri Lanka have been collected at http://www.mendeley.com/groups/1860571/computational-linguistics-in-sri-lanka/papers/.

8.4.2.4. Nepal

For work done in Nepal and on languages of Nepal, see http://www.bhashasanchar.org/ and http://ltk.org.np/about_ltk.php. These sites include reports on numerous topics, e.g. Nepali computational grammar, OCR, and spell checking. A localized Nepali OpenOffice.org suite, including a spell checker, along with other free software like Firefox and Thunderbird is described at http://www.nepalinux.org/. See also Section 8.3.1.3 above on lexical and corpus resources and Section 8.2.2.2 on the PAN Localization Project for Nepal. Prasain 2011 is a recent doctoral dissertation on Nepali computational morphology.

11 A conference proceedings volume, School of Engineering & Computer Science, Independent University, Bangladesh 2006, contains over 30 papers on various aspects of computational work on Bangla, including automatic segmentation of Bangla speech, character recognition systems, and OCR. Unfortunately this book is not available online.


8.4.2.5. Pakistan

In Pakistan, the main type of linguistics work being done is computational linguistics, the largest component of which is on Urdu (Bashir 2011: 12).12 The Centre for Research on Urdu Language Processing (CRULP) was established in July 2001 at the National University of Computer and Emerging Sciences in Lahore to conduct research and development in speech processing, computational linguistics, and script processing. Work there resulted in an Urdu speech interface for computers; language applications for Urdu, including a machine translation system; Urdu grammar and spell checkers; and an Urdu lexicon. Much early research by CRULP students and faculty was published in the CRULP Annual Student Reports 2001–2002, 2002–2003, and 2003–2004, which can be found at http://www.cle.org.pk/resources/reports.htm. Sarmad Hussain (2003) outlined the state of affairs at that time and set out basic recommendations, including the need for basic linguistic research on the languages of Pakistan. Work under way then included lexical development, corpus-based lexical data acquisition, and grammar modeling at CRULP; machine translation at Karachi University and the Pakistan Institute of Engineering and Applied Sciences; linguistic research at CRULP and the National University of Modern Languages; OCR at Ghulam Ishaq Khan Institute; and speech synthesis and recognition at CRULP. The Computer Science Department at The University of Peshawar has produced work on computational problems in both Urdu and Pashto (e.g. Zuhra & Khan 2007, R. Ali, Khan & Rabbi 2007, R. Ali, Khan & Ahmad 2008, Rabbi et al. 2008, Zuhra & Khan 2009). In 2010, the CRULP research group shifted to the University of Engineering and Technology (UET) in Lahore, as the Centre for Language Engineering (CLE), Al-Khwarizmi Institute of Computer Science, under the leadership of Sarmad Hussain (www.cle.org.pk).
This is now a centrally important site for information on computational linguistics work in Pakistan.13 The Society for Natural Language Processing was established in 2008, with the objective of coordinating the multiple efforts in computational linguistics going on in Pakistan. It is now headquartered at the CLE, UET, Lahore. OCR for Nastaliq script is a particularly difficult problem. Precursor work toward this goal includes Lodhi 2004, a computationally oriented work aimed at 12 13

Early computational work on Urdu is represented by Hardie 2003, 2004, and 2005. Formerly some computational linguistics work was done by the Centre of Excellence for Urdu Informatics at the National Language Authority (NLA), Islamabad (http://cen tralasiaonline.com/en_GB/articles/caii/features/2009/07/23/feature-09), but as of October 2012, the NLA was reconstituted as the National Language Promotion Department under the Ministry of National Heritage & Integration and National Language Promotion Department (http://www.nlpd.gov.pk/) (http://www.thenews.com.pk/TodaysNews-2–140276-NLA-renamed-as-Nation …%E2 %80 %8E).

Applications of modern technology to South Asian languages

763

developing an Urdu character pattern recognition system which can classify patterns even under non-optimal conditions. By 2013, work on OCR for Nastaliq script was under way in several places in Pakistan, including the CLE in Lahore; the Computer Science Department at NED University of Engineering & Technology in Karachi (Sattar et al. 2009); the Ghulam Ishaq Khan (GIK) Institute of Engineering Sciences and Technology (Husain & Amin 2002); Quaid-i-Azam University, Islamabad (Satti 2013); and the College of Signals, National University of Science & Technology (NUST), Rawalpindi (Rehman 2010). Working at SUNY Buffalo in the USA, Mukhtar, Setlur and Govindaraju (2009) discuss and include a bibliography of previous work on Nastaliq OCR, mentioning in particular the need for a corpus of handwritten Urdu. Javed & Hussain 2013 discusses segmentation-based Urdu OCR. M. Naz, Akram & Hussain 2013 discusses evaluation of binarization for Urdu Nastaliq. In August 2014, OCR software for Nastaliq was released in alpha version by CLE (http://182.180.102.251:8080/ocr/). This can scan at 30 milliseconds/page with 90 % accuracy. The most recent work is described in Akram, Hussain, Niazi et al. 2014 and Akram, Hussain, Adeeba et al. 2014. OCR coupled with the text to speech (TTS) software, now available (http://182.180.102.251:8080/UrduTTS/), will make Urdu textual content accessible to illiterate and visually impaired people. W. Ahmad & Hussain 2011 treats enabling complex Asian scripts on mobile devices, especially smart phones. A dialog system to supply weather data on mobile phones has been released (http://cle.org.pk/dialog/); this will yield the additional benefit of collecting audio data on Urdu dialects and pronunciation by micro-region. An Urdu WordNet currently has around 28,967 synsets consisting of nouns, verbs, adverbs, and adjectives (Adeeba & Hussain 2011; Zafar et al. 2012 and http://www.cle.net.pk/urduwordnet/). 
Important works on the computational grammar of Urdu are Rizvi 2007, which ‘... proposes an algorithm for parsing Urdu sentences based on closed-wordclasses... which helps in identifying chunks based on the linguistic characteristics of the word classes (p. vi)’; Bögel et al. 2008 on a morphological analyzer for Urdu; and Humayoun, Hammarström & Ranta 2007, which presents work on software for Urdu grammar.

8.4.3. Under-resourced major languages

A major desideratum for all South Asian countries is work on under-resourced languages. Work is also being done to bring smaller languages into the digitally connected world. For example, Unicode code points have recently been added for Perso-Arabic characters needed for writing Burushaski, Torwali, and Khowar (Bashir, Hussain & Anderson 2006, and http://www.unicode.org/charts/PDF/U0750.pdf). However, much remains to be done on under-resourced major languages. Four such languages are discussed here, which have in common that they are spoken

Elena Bashir

across national boundaries: Panjabi, Sindhi, and Kashmiri in both India and Pakistan; and Pashto in Pakistan and Afghanistan.

8.4.3.1. Panjabi

Considering its large number of speakers in India, Pakistan, and the diaspora, Panjabi is a severely under-resourced language compared with, for example, Urdu, Hindi, or Bengali. Since it is written in two different scripts, Gurmukhi in India and Perso-Arabic (Shahmukhi) in Pakistan, computational work on Panjabi has followed two separate trajectories. According to a 2013 report on the status of linguistic resources for several languages of Pakistan (S. Hussain 2013), in Pakistan Panjabi lacks even minimal support in any of the areas of the basic work necessary for language technology development, including linguistic data collection, core linguistic analysis, and definition and annotation of linguistic data. Malik 2005 discusses the need to agree upon and provide Unicode support for a character for retroflex ḷ, “RLAAM”, a character essential for representing Panjabi accurately, and offers some examples of varying usages for retroflex n [ṇ] and l [ḷ]. This lack was reiterated in Javaid et al. 2011, and remains the case as of 2015. An essential step for Panjabi in Pakistan, for both literary and computational work, is agreement by the Shahmukhi-using Panjabi community upon characters to be used for retroflex n (RNOON) and for retroflex l (RLAAM), and their subsequent adoption by Unicode if necessary.14 An important contribution toward computational work on Panjabi in Perso-Arabic script is Humayoun & Ranta 2010, which discusses morphology, corpus building, and lexicon. Given the use of two disparate scripts for Panjabi, conversion from one script to the other is a problem which has attracted the attention of scholars working on machine transliteration either from Shahmukhi to Gurmukhi or vice versa.
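The encoding question raised above for retroflex n can be checked mechanically. The following is a minimal sketch, not a tool described in any of the works cited: it uses Python's standard unicodedata module to look up U+0768, the code point that this chapter's notes report Singh 2011 as listing for retroflex n, and tests whether it falls in the Arabic Supplement block (U+0750 to U+077F) covered by the Unicode chart cited earlier. The names RNOON_CANDIDATE, describe, and in_arabic_supplement are hypothetical, chosen for illustration only.

```python
import unicodedata

# Candidate character for retroflex n (RNOON). U+0768 is the code point
# cited from Singh 2011 in this chapter; treating it as the Panjabi RNOON
# is an assumption, since the chapter notes it is not yet agreed upon.
RNOON_CANDIDATE = "\u0768"


def describe(char: str) -> dict:
    """Return basic Unicode properties for a single character."""
    return {
        "code_point": f"U+{ord(char):04X}",
        # Fall back to a placeholder if the local Unicode database
        # has no name for this code point.
        "name": unicodedata.name(char, "<unnamed in this Unicode version>"),
        "category": unicodedata.category(char),
    }


def in_arabic_supplement(char: str) -> bool:
    """True if the character lies in the Arabic Supplement block."""
    return 0x0750 <= ord(char) <= 0x077F


info = describe(RNOON_CANDIDATE)
print(info["code_point"], info["name"], info["category"])
print("Arabic Supplement:", in_arabic_supplement(RNOON_CANDIDATE))
```

A check of this kind only shows that a code point exists and is named in the installed Unicode database. It says nothing about whether the Shahmukhi-using community has adopted the character, which is the open issue described in the text.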
Work on Shahmukhi to Gurmukhi includes Malik 2006;15 Saini & Lehal 2008; Saini, Lehal & Kalra 2008; Singh 2011; and Lehal & Saini 2012. Lehal 2009b and Lehal, Saini & Chowdhary 2012 are discussions of Gurmukhi to Shahmukhi. Work in Panjabi computational linguistics in India is more advanced than it is in Pakistan. Lehal 2009a, a comprehensive survey of work in Panjabi computational linguistics, discusses font development, dictionary tools, spell checkers and grammar checkers, corpus development, POS tagging, morphological analyzers, and machine translation. Despite the relatively advanced state of Panjabi linguistics in India compared with Pakistan, Lehal notes a scarcity of linguistic resources and comments that work in India is scattered. Newer works originating in India include D. Kumar & Rana 2010 on development of a stemmer for Panjabi, and Kaur et al. 2010 and Narang 2012 on development of a Panjabi WordNet. V. Goyal & Lehal 2009, V. Goyal & Lehal 2010, and V. Goyal 2010 discuss Hindi-to-Panjabi machine translation. Pr. Kumar & Goyal 2010 is on a Hindi-Panjabi parallel corpus. Bansal, Ahuja & Sharma 2011 discusses a Panjabi morphological analyzer. Virk, Humayoun & Ranta 2011 and Virk 2013 discuss work on a computational grammar for Panjabi developed in GF (Grammatical Framework), a programming language for multilingual grammar applications. Pa. Kumar & Sharma 2012 discusses EnConversion of input Panjabi sentences to an interlingua representation called Universal Networking Language (UNL), a formal language designed to represent semantic data extracted from natural language texts, with the ultimate aim of facilitating machine translation or information retrieval applications. Gupta & Lehal 2013 discusses an automatic text summarization system for Panjabi and contains many further references.

14 Singh 2011 mentions retroflex n [ɳ] /ṇ/ and lists a Shahmukhi character ݨ (U+0768), which is used for Saraiki but has not yet been accepted for Panjabi in Pakistan. Retroflex l is not mentioned.
15 Malik, Boitet & Bhattacharyya 2008 discusses the potential extension of this transliteration system to deal with other languages written in two mutually incomprehensible scripts, e.g. Sindhi or Kashmiri.

8.4.3.2. Sindhi

Like Panjabi, Sindhi is written in two scripts: Devanagari in India and Perso-Arabic in Pakistan. And, also like Panjabi, Sindhi is under-resourced compared to other South Asian languages of comparable size. However, since Sindhi is taught in schools and is a medium of instruction and administration in most of Sindh Province in Pakistan, the prospects for computational work on Sindhi there are brighter than for Panjabi.16 S. Hussain (2013) notes that Sindhi has “some support” in the core areas of linguistic data collection, publishing language computing standards, and publishing data annotation schemas. Work on Sindhi computational linguistics has begun in several areas. M. U. Rahman 2008 is an early assessment of problems facing Sindhi computing.
Oad 2009 reports work in progress on developing a resource grammar for Sindhi computation. M. U. Rahman 2010 reports on problems and progress in Sindhi corpus construction; by that time 4.1 million words had been analyzed and 70,576 distinct word forms found. Ismaili, Bhatti & Shah 2012 discusses work on developing Unicode-based digital Sindhi dictionaries. Leghari & Rahman 2010 is on transliteration between Devanagari and Perso-Arabic. Tafseer Ahmed and his team at the DHA Suffa University in Karachi (p.c. 17 Dec. 2014) are working on Roman to Perso-Arabic transliteration for Pakistani languages. Their version 0.9 is now available (http://cs.dsu.edu.pk/faculty/tafseer/PakTrans/), to be followed soon by version 0.91, and then expanded to include Pashto, Saraiki, and Balochi, as well as user interfaces. Bhatti, Ismaili, Shaikh & Soomro 2013 describes a Unicode-based typing system for Sindhi which does not require special fonts or language settings. Mahar & Memon 2011 discusses probabilistic analysis of Sindhi word prediction using N-grams. Word prediction is important for computational linguistic applications like POS tagging, word sense disambiguation, speech recognition, spell checking, and diacritic restoration. Mahar, Memon & Danwar 2011 treats lexicon-driven word segmentation. Mahar, Memon & Shaikh 2011 discusses automatic diacritic restoration. Mahar & Memon 2010a and 2010b discuss rule-based POS tagging using WordNet; Mahar, Shaikh & Solangi 2011 compares syntactic and rule-based semantic POS tagging systems. See also M. U. Rahman 2012 and the references therein on developing a Sindhi POS tagset. Bhatti, Ismaili, Khan & Nizamani 2012 discusses a rule-based approach to Sindhi spell checking. Mughal & Rahman 2013 presents a corpus-based statistical analysis of various spelling error patterns.

Sindhi is written in the Naskh (Arabic) script style. Work on developing OCR systems for Sindhi has been under way for some time. S. Naz et al. MS, a review of various approaches to character recognition for languages employing Arabic script, contains numerous references on OCR work being done on various languages. N. A. Shaikh, Z. A. Shaikh & Ali 2008 and N. A. Shaikh, Mallah & Z. A. Shaikh 2009 are on Sindhi character segmentation work directed toward laying the foundation for OCR work. Nizamani & Janjua 2013 discusses back-propagation neural networks for Sindhi OCR.

Work on speech recognition and generation is also being done. Keerio 2010 presents acoustic analytic work prerequisite for an automatic speech recognition system. Mahar, Memon & Shah 2010 is on a WordNet-based Sindhi text-to-speech synthesis system, and Abbasi & Hussain 2012 analyzes syllabification in Sindhi-English loanwords.

16 Thanks to Tafseer Ahmed for making this important point. Thanks also to Mutee ur Rahman for pointing me to many references.

8.4.3.3. Pashto

Computational work on Pashto is mainly being carried out at the Department of Computer Science, University of Peshawar, Pakistan. M. A. Khan & Zuhra 2007 describes the development of a 10,000-word, open-ended corpus for Pashto. Zuhra & Khan 2007 discusses computational treatment of the verb; and, continuing with corpus-related work, Zuhra & 2009 describes a corpus-based morphological analyzer for Pashto verbs. Rabbi et al. 2008 reports on precursor work for parsing; Bilal, Khan & Ali 2009 discusses identification of syntactic ambiguities; and R. Ali, Khan & Khan 2011 describes the development of a treebank for Pashto. Several researchers have published on anaphora resolution, e.g. R. Ali, Khan & Rabbi 2007; M. A. Khan, Ali & Khan 2007; R. Ali, Khan & Ahmad 2008; and R. Ali, Khan & Ali 2009. R. Ali, Khan, Bilal & Rabbi 2008a,b and Rabbi, Khan & Ali 2008 discuss development of a tagset for POS tagging; and Rabbi, Khan & Ali 2009 reports work on rule-based POS tagging. Z. Ahmad et al. 2012 represents work on constituent splitting, a prerequisite for machine translation work; and Abbas, Ahmad & Ali 2012 reports on initial work toward an automatic speech recognition system. Work carried out at the National University of Computer and Emerging Sciences includes Wahab, Amin & Ahmad 2009 and R. Ahmad, Amin & Khan 2010 on precursor work for Pashto OCR. In addition, computational work on Pashto has been done in the U.S. (e.g. Precoda et al. 2004; Decerbo, MacRostie & Natarajan 2004), and in France (Mostafa et al. 2012).

8.4.3.4. Kashmiri

Kak, Mehdi & Lawaye 2009a on Perso-Arabic/Devanagari conversion, and 2009b on developing a tagset for Kashmiri, are among the first works on Kashmiri computational linguistics. Chachoo & Quadri 2011 discusses morphological analysis from a raw Kashmiri corpus. R. A. Bhat & Sharma MS discusses a hybrid approach to shallow parsing (chunking) of Kashmiri; S. M. Bhat 2012 is on Kashmiri treebanking.

Bibliographical references

(Many of the publications listed here appear only online, so editorial information and page ranges are sometimes incomplete. Since URLs change rapidly and frequently, some items may be more easily found by searching directly by item title.)

Abbas, Arbab Waseem, Nasir Ahmad, and Hazrat Ali 2012 Pashto spoken digits database for the automatic speech recognition research. In: 18th International Conference on Automation and Computing (ICAC), 2012, 1–5. http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6330527&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D6330527 (accessed 7 Dec. 2014)
Abbasi, Abdul Malik, and Sarmad Hussain 2012 Syllable structure and syllabification in Sindhi-English loanwords. International Researchers 1(4): 120–134. http://cle.org.pk/Publication/papers/2012/Syllable%20Structure.pdf (accessed 7 Dec. 2014)
Acharya, Jayaraj 1991 A descriptive grammar of Nepali. Washington, DC: Georgetown University Press.
Adeeba, Farah, and Sarmad Hussain 2011 Experiences in building Urdu WordNet. In: Proceedings of 9th Workshop on Asian Language Resources (ALR9). http://www.cle.org.pk/Publication/papers/2011/UrduWordNet.pdf (accessed 7 Dec. 2014)
Ahmad, Riaz, Syed Hassan Amin, and Muhammad A. U. Khan 2010 Scale and rotation invariant recognition of cursive Pashto script using SIFT features. In: Proceedings of the 6th International Conference on Emerging

Technologies, Islamabad, Pakistan, 2010, 299–303. http://www.gbv.de/dms/ tib-ub-hannover/646130196.pdf (accessed 7 Dec. 2014) Ahmad, Waqar, and Sarmad Hussain 2011 Enabling complex Asian scripts on mobile devices. Localisation Focus: The International Journal of Localisation 10(1): 18–28. http://www.localisa tion.ie/resources/lfresearch/Vol%2010 %20Issue%201.pdf (accessed 7 Dec. 2014) Ahmad, Zaheer, Mohammad Abid Khan, Jehan Zeb Khan Orakzai, Rahman Ali, and Ibrar Ahmad 2012 A computational multilingual text constituent splitter and phrasing: A case of Pashto language. In: Proceedings of the Conference on Language & Technology 2012, 1–7. http://www.cle.org.pk/research/rep/proceedingsCLT12. pdf (accessed 7 Dec. 2014) Ahmed, Tafseer, and Annette Hautli 2011 A first approach towards an Urdu WordNet. Journal of Language and Literature Review 1(1): 1–14. http://umt.edu.pk/llr/data/08102011/LLR-Research-Papers. pdf (accessed 18 Dec. 2014) Akram, Qurat ul Ain, Sarmad Hussain, Aneeta Niazi, Umair Anjum, and Faheem Irfan 2014 Adapting tesseract for complex scripts: An example for Urdu Nastalique. http://www.cle.org.pk/research/papers.htm (accessed 7 Dec. 2014) Akram, Qurat ul Ain, Sarmad Hussain, Farah Adeeba, S. Rehman, and M. Saeed 2014 Framework of Urdu Nastalique Optical Character Recognition system. In: Proceedings of Conference on Language and Technology 2014 (CLT 14), Karachi, Pakistan. http://cs.dsu.edu.pk/clt14/ (accessed 7 Dec. 2014) Ali, Mortuza, and Muhammad Masroor Ali 2002 Development of machine translation dictionaries for Bangla language. In: Proceedings of 7th International Conference on Computer and Information Technology (ICCIT), 272–276. http://research.banglacomputing.net/iccit/ ICCIT_pdf/5th%20ICCIT-2002_p272-p276.pdf (accessed 7 Dec. 2014) Ali, Rahman, Mohammad Abid Khan, and Ihsan Rabbi 2007 Strong personal anaphora resolution in Pashto discourse. In: International Conference on Emerging Technologies, 2007, Islamabad, Pakistan (ICET 2007), 148–153. 
http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source =web&cd=4&ved=0CDcQFjAD&url=http%3A%2F%2Fwww.researchgate. net%2Fpublication%2F4334738_Strong_Personal_Anaphora_Resolution_in_ Pashto_Discourse%2Ffile%2F9c960517f6802efd2e.pdf&ei=3OQoUu-TKMq VygGRkYDQCQ&usg=AFQjCNFEG_t2SmpFw8sZxUlfwcqFrhWueA& bvm=bv.51773540,d.aWc (accessed 7 Dec. 2014) Ali, Rahman, Mohammad Abid Khan, and Mushtaq Ali 2009 Reflexive anaphora resolution in Pashto discourse. In: Proceedings of the Conference on Language & Technology 2009, 41–45. http://www.cle.org.pk/ clt09/download/Papers/Paper6.pdf (accessed 7 Dec. 2014) Ali, Rahman, Mohammad Abid Khan, and Rashid Ahmad 2008 Implementation of the rule-based approach for the resolution of strong personal anaphora in Pashto discourse. In: Multitopic Conference 2008 (INMIC 2008), 501–507. http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnum ber=4777790&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_ all.jsp%3Farnumber%3D4777790 (accessed 8 Dec. 2014)

Ali, Rahman, Mohammad Amir Khan, and Mohammad Abid Khan 2011 Development of Pashto treebank. In: International Conference on Computer Networks and Information Technology (ICCNIT), 2011, 257–262. http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6020939&url= http%3A%2F%2Fieeexplore.ieee.org%2Fstamp%2Fstamp.jsp%3Ftp %3D%26arnumber%3D6020939 (accessed 7 Dec. 2014) Ali, Rahman, Mohammad Abid Khan, Muhammad Bilal, and Ihsan Rabbi 2008a Empirical analysis of Pashto text for types of Pashto anaphora. In: Proceeding of International Conference on Information & Communication Technology (ICICT 2008), University of Science & Technology, Bannu, Pakistan. http://scholar.google.com/scholar?q=%22computational+linguistics %22+and+%22Pashto&hl=en&as_sdt=0&as_vis=1&oi=scholart&sa=X& ei=qJwnUty-CqPSyAHu7YD4Cg&ved=0CCcQgQMwAA (accessed 7 Dec. 2014) Ali, Rahman, Muhammad Abid Khan, Muhammad Bilal, and Ihsan Rabbi 2008b Reciprocal anaphora resolution in Pashto discourse. In: 4th International Conference on Emerging Technologies, 2008 ICET 2008, 1–5. http://www. researchgate.net/publication/224382042_Reciprocal_anaphora_resolution_ in_Pashto_discourse (accessed 7 Dec. 2014) Ali, Rahman, Muhammad Abid Khan, Rashid Ahmad and Ihsan Rabbi 2008 Rule based personal references resolution in Pashto discourse for better machine translation. In: Proceeding IEEE ICEE 2nd International Conference on Electrical Engineering. UET Lahore, Pakistan, 57–62. http://ieeexplore. ieee.org/xpl/articleDetails.jsp?arnumber=4553941 (accessed 11 Sept. 2015) Arulmozi, Selvaraj 2010 Telugu WordNet. In: Proceedings of the 5th International Conference on Global WordNet (GWC10), January 31 – February 4, 2010, Mumbai, India. www.globalwordnet-iitb2010.in (accessed 7 Dec. 2014) (Editorial note: This website contains many other articles on WordNets of other Indian languages.) Atkins, Sue, Jeremy Clear, and Nicholas Ostler 1992 Corpus design criteria. Literary and Linguistic Computing 7(1): 1–16. 
Atwell, Eric, Geoffrey Leech, and Roger Garside 1984 Analysis of the LOB Corpus: Progress and prospects. In: Jan Aarts and Willem Meijs (eds.), Corpus linguistics, 41–52. Amsterdam: Rodopi. Baker, Paul, Andrew Hardie, Tony McEnery, Richard Xiao, Kalina Bontcheva, Hamish Cunningham, Robert Gaizauskas, Oana Hamza, Diana Maynard, Valentin Tablan, Cristian Ursu, B. D. Jayaram, and Mark Leisher 2004 Corpus linguistics and South Asian languages: Corpus creation and tool development. Literary and Linguistic Computing 19(4): 509–524. Bandhu, Churamani 1971 The computer concordance of spoken Nepali. Norman, OK: Summer Institute of Linguistics. Bansal, Gagan, Satinder Pal Ahuja, and Sanjeev Kumar Sharma 2011 Improving existing Punjabi morphological analyzer. Research Cell: An International Journal of Engineering Sciences 5: 221–229. Bashir, Elena 2011 Urdu and linguistics: A fraught but evolving relationship. Annual of Urdu Studies 26: 97–123.

Bashir, Elena, Sarmad Hussain, and Deborah Anderson 2006 Proposal to add characters needed for Khowar, Torwali, and Burushaski. In: Proceedings of 49th Meeting of ISO/IEC JTC1/SC2/WG2, Tokyo, Japan, 2006, 1–22. http://cle.org.pk/Publication/papers/2006/n3117.pdf (accessed September 9, 2013) Becker, Dara, and Kashif Riaz 2002 A study in Urdu corpus construction. In: Proceedings of the 3rd Workshop on Asian Language Resources and International Standardization at the 19th International Conference on Computational Linguistics. August 2002. http:// dl.acm.org/citation.cfm?doid=1118759.1118760 (accessed 8 Dec. 2014) Bhat, Riyaz Ahmad, and Dipti Misra Sharma MS A hybrid approach to Kashmiri shallow parsing. http://hnk.ffzg.hr/bibl/ ltc2011/book/papers/PAR1–1.pdf (accessed 8 Dec. 2014) Bhat, Shahid Mushtaq 2012 Introducing Kashmiri dependency treebank. In: Proceedings of the Workshop on Machine Translation and Parsing in Indian Languages (MTPIL-2012), COLING 2012, Mumbai, December 2012, 53–60. http://www.aclweb.org/ anthology-new/W/W12/W12–56.pdf (accessed 8 Dec. 2014) Bhatt, Rajesh, Bhuvana Narasimhan, Martha Palmer, Owen Rambow, Dipti Sharma, and Fei Xia 2009 A Multi-representational and multi-layered treebank for Hindi/Urdu. In: Proceedings of the Third Linguistic Annotation Workshop, held in conjunction with ACL-IJCNLP 2009, Singapore. http://aclweb.org/anthology/W/W09/ W09–3036.pdf (accessed 8 Dec. 2014) Bhattacharyya, Pushpak 2010 IndoWordNet. In: Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelois Piperidis, Mike Rosner, and Daniel Tapias (eds.), Proceedings of the 7th International Language Resources and Evaluation Conference (LREC 2010), 19–21 May 2010, Valletta, Malta, 986–992. Paris: European Language Resources Association. Bhattacharyya, Pushpak, Christiane Fellbaum, and Piek Vossen (eds.) 
2010 Principles, construction and application of multlingual wordnets, Proceedings of the 5th Global WordNet Conference, January 31–February 4, 2010, Mumbai. New Delhi: Narosa Publishing House. Bhatti, Zeeshan, Imdad Ali Ismaili, Asad Ali Shaikh, and Wasim Javaid Soomro 2012 Spelling error trends and patterns in Sindhi. Journal of Emerging Trends in Computing and Information Sciences 3(10): 1435–1439. http://cisjournal. org/journalofcomputing/archive/vol3no10/vol3no10_13.pdf (accessed 8 Dec. 2014) Bhatti, Zeeshan, Imdad Ali Ismaili, Waqar-ul-Islam Khan, and Aamir Shahzad Nizamani 2013 Development of Unicode based Sindhi typing system. Journal of Emerging Trends in Computing and Information Sciences 4(3): 309–314. http://www.cis journal.org/journalofcomputing/archive/vol4no3/vol4no3_10.pdf (accessed 8 Dec. 2014) Biber, Douglas 1986 Spoken and written textual dimensions in English. Language 62(4): 384–414.

Bilal, Muhammad, Mohammad Abid Khan, and Rahman Ali 2009 Identification of syntactic ambiguities in Pashto text. In: International Conference on Emerging Technologies, 2009 (ICET 2009), 1–6. http://ieeexplore. ieee.org/xpl/login.jsp?tp=&arnumber=5353211&url=http%3A%2F%2 Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D5353211 (accessed 8 Dec. 2014) Bögel, Tina, Miriam Butt, Annette Hautli, and Sebastian Sulger 2008 Developing a finite-state morphological analyzer for Urdu and Hindi. In: Thomas Hanneforth & Kay-Michael Würzner (eds.), Finite-state methods and natural language processing: Sixth International Workshop FSMNLP Potsdam, Germany. September 14-16: Revised papers, 86-96. Potsdam: Universitätsverlag Potsdam. http://ling.uni-konstanz.de/pages/home/boegel/ Dateien/boegeletal_fsmnlp07.pdf (accessed 8 Dec. 2014) Cardona, George 1988 Pāṇini: His work and its traditions. Vol. I. Delhi: Motilal Banarsidass. Chachoo, Manzoor Ahmad, and S. M. K. Quadri 2011 Morphological analysis from the raw Kashmiri corpus using open source extract tool. Trends in Information Management 7(2): 176–187. http://ojs.uok. edu.in/ojs/index.php/crdr/article/view/22 (accessed 8 Dec. 2014) Chakrabarti, Debasri, Dipak Kumar Narayan, Prabhakar Pandey, and Pushpak Bhattacharyya 2002 Experiences in building the Indo WordNet — A Wordnet for Hindi. https:// www.cse.iitb.ac.in/~pb/papers/gwn-2002.ps (accessed 30 Nov. 2015) Chaplot, Devendra Singh, Sudha Bhingardive, and Pushpak Bhattacharyya 2014 IndoWordnetVisualizer: A Graphical User Interface for Browsing and Exploring Word-net of Indian Language. 7th Global WordNet Conference (GWC 2014), Tartu, Estonia. http://www.cse.iitb.ac.in/~chaplot/documents/ rnd_report.pdf (accessed 8 Dec. 2014) Dash, Niladri Sekhar 2003 Corpus linguistics in India: Present scenario and future direction. Indian Linguistics 64(1–2): 85–113. Dash, Niladri Sekhar 2007 Indian scenario in language corpus generation. 
In: Niladri Sekhar Dash, Probal Dasgupta, and Pabitra Sarkar (eds.), Rainbow of linguistics, vol. 1, 129–162. Kolkata: T. Media Publication. Dash, Niladri Sekhar 2009 Corpus based analysis of the Bengali language. Saarbrücken: VDM Publications. Dash, Niladri Sekhar 2013 Part-of-speech (POS) tagging in Bengali written text corpus. Journal of Linguistics and Technology (SNLTR Journal) 1(1): 53–96. Decerbo, Michael, Ehry MacRostie, and Premkumar Natarajan 2004 The BBN Byblos Pashto OCR System. (ACM Digital Library.) http://dl.acm. org/citation.cfm?id=1031447 (accessed 8 Dec. 2014) Fellbaum, Christiane (ed.) 1998 WordNet: An electronic lexical database, Cambridge, MA: MIT Press.

Francis, William Nelson, and Henry Kucera 1964 Manual of information to accompany A standard corpus of present-day edited American English. Providence, RI: Brown University, Department of Linguistics. (Editorial note: This unpublished document was revised and published in 1979 as Francis & Kucera 1979, Manual of information to accompany A standard corpus of present-day edited American English, for use with digital computers. Providence, RA: Brown University Department of Linguistics.) Gillon, Brendan 1995 The autonomy of word formation: Evidence from Classical Sanskrit. Indian Linguistics 56: 17–52. Glover, Warren W., Jessie R. Glover, and Deu Bahadur Gurung 1977 Gurung-Nepali-English dictionary, with English-Gurung and Nepali-Gurung indexes. Canberra: Dept. of Linguistics, Research Scool of Pacific Studies, Australian National University. Government of Nepal, National Planning Commission Secretariat 2011 2001 population report: Kathmandu, Central Bureau of Statistics. http://cbs. gov.np/?p=513 (accessed 9 Jan. 2015) Goyal, Pawan, Amba Kulkarni, and Laxmidhar Behera 2009 Computer simulation of Ashtadhyayi. In: Gérard Huet, Amba Kulkarni, and Peter Scharf (eds.), Sanskrit computational linguistics 2007/2008 LNCS (LNAI) 5402, 139–161. Heidelberg: Springer. Goyal, Pawan, Gérard Huet, Amba Kulkarni, Peter Scharf, and Ralph Bunker 2012 A distributed platform for Sanskrit processing. Proceedings of COLING 2012: Technical papers, 1011–1028. http://aclweb.org/anthology//C/C12/C12–1062. pdf (accessed 8 Dec. 2014) Goyal, Vishal 2010 Development of a Hindi to Punjabi machine translation system. Punjabi University PhD thesis. http://languageinindia.com/oct2010/vishalthesis.pdf (accessed 8 Dec. 2014) Goyal, Vishal, and Gurpreet Singh Lehal 2009 Evaluation of Hindi to Punjabi machine translation system. International Journal of Computer Science Issues 4(1): 36–39. Goyal, Vishal, and Gurpreet Singh Lehal 2010 Web Based Hindi to Punjabi Machine Translation System. 
Journal of Emerging Technologies in Web Intelligence 2(2): 148–151. http://ojs.academypublisher. com/index.php/jetwi/article/view/0202148151/1846 (accessed 8 Dec. 2014) Gupta, Vishal, and Gurpreet Singh Lehal 2013 Automatic text summarization system for Punjabi language. Journal of Emerging Technologies in Web Intelligence 5(3): 257–271. http://www. google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&sqi=2&ved= 0CDMQFjAB&url=http%3A%2F%2Fojs.academypublisher.com%2Findex. php%2Fjetwi%2Farticle%2Fdownload%2Fjetwi0503257271 %2F7597&ei= eFEzUq-wCoSAqQGVzIDQCQ&usg=AFQjCNGRfOck5ErGxHwiSCDbrl G3bcNkAw&bvm=bv.52164340,d.aWM (accessed 8 Dec. 2014) Halliday, M. A. K. 1989 Spoken and written language. Oxford: Oxford University Press.

Hardie, Andrew 2003 Developing a tag-set for automated part-of-speech tagging in Urdu. In: Dawn Archer, P. Rayson, Andrew Wilson, and A. M. McEnery (eds.), Proceedings of the Corpus Linguistics 2003 Conference, 1–11. (UCREL Technical Papers Series 16.) Lancaster, UK: Centre for Computer Corpus Research on Language Technical Papers, University of Lancaster. http://citeseerx.ist.psu. edu/viewdoc/download?doi=10.1.1.218.513&rep=rep1&type=pdf (accessed 10 Jan. 2015) Hardie, Andrew 2004 The computational analysis of morphosyntactic categories in Urdu. University of Lancaster PhD dissertation. http://eprints.lancs.ac.uk/106/ (accessed 8 Dec. 2014) Hardie, Andrew 2005 Automated part-of-speech analysis of Urdu: Conceptual and technical issues. In: Yogendra P. Yadava and Govinda Bhattarai (eds.), Contemporary issues in Nepalese linguistics, 49–72. Kathmandu: Linguistic Society of Nepal. Hardie, Andrew 2007 Collocational properties of adpositions in Nepali and English. In: Matthew Davies, Paul Rayson, Susan Hunston, and Pernilla Danielsson (eds.), Proceedings of the Corpus Linguistics Conference, CL 2007. Birmingham: University of Birmingham. http://www.birmingham.ac.uk/documents/college-artslaw/ corpus/conference-archives/2007/88Paper.pdf (accessed 23 August 2013) Hardie, Andrew 2008 A collocation-based approach to Nepali postpositions. Corpus Linguistics and Linguistic Theory 4(1): 19–61. Hardie, Andrew, Paul Baker, Tony McEnery, and B. D. Jayaram 2006 Corpus-building for South Asian languages. In: Saxena & Borin (eds.) 2006: 211–241. Hardie, Andrew, Ram Raj Lohani, and Yogendra P. Yadava 2011 Extending corpus annotation of Nepali: Advances in tokenization and lemmatization. Himalayan Linguistics 10(1): 151–161. Hardie, Andrew, Ram Raj Lohani, Bhim N. Regmi, and Yogendra P. Yadava 2005 Categorisation for automated morphosyntactic analysis of Nepali: Introducing the Nelralec Tagset (NT-01). (Nelralec/Bhasha Sanchar Working Paper 2.) 
http:// www.bhashasanchar.org/pdfs/nelralec-wp-tagset.pdf (accessed 22 August, 2013) Hardie, Andrew, Ram Raj Lohani, Bhim N. Regmi, and Yogendra P. Yadava 2009 A morphosyntactic categorisation scheme for the automated analysis of Nepali. In: Rajendra Singh (ed.), Annual review of South Asian languages and linguistics 2009, 171–195. Berlin/New York: Mouton de Gruyter. Huet, Gérard, Amba Kulkarni, and Peter Scharf (eds.) 2009 Sanskrit computational linguistics 2007/2008 LNCS (LNAI) 5402. Heidelberg: Springer. http://www.informatik.uni-trier.de/~ley/db/conf/sanskrit/sanskrit 2008.html (accessed 8 Dec. 2014) Humayoun, Muhammad, and Aarne Ranta 2010 Developing Punjabi morphology, corpus and lexicon. In: Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation,

9 Writing systems
Edited by Elena Bashir

9.1. Introduction
By Elena Bashir

The relations between spoken language and the visual symbols (graphemes) used to represent it are complex. Orthographies can be thought of as situated on a continuum from “deep” systems, in which there is no one-to-one correspondence between the sounds of the language and its graphemes, to “shallow” systems, in which the relationship between sounds and graphemes is regular and transparent (see Roberts & Joyce 2012 for a recent discussion). In orthographies for Indo-Aryan and Iranian languages based on the Arabic script and writing system, the retention of historical spellings for words of Arabic or Persian origin increases the orthographic depth of these systems.

Decisions on how to write a language always carry historical, cultural, and political meaning. Debates about orthography usually focus on such issues rather than on linguistic analysis; this can be seen in Pakistan, for example, in discussions regarding orthography for Kalasha, Wakhi, or Balti, and in Afghanistan regarding Wakhi or Pashai. Questions of orthography are intertwined with language ideology, language planning activities, and goals like literacy or standardization. Woolard 1998, Brandt 2014, and Sebba 2007 are valuable treatments of such issues.

In Section 9.2, Stefan Baums discusses the historical development and general characteristics of the (non-Perso-Arabic) writing systems used for South Asian languages, and his Section 9.3 deals with recent research on alphasyllabic writing systems, script-related literacy and language-learning studies, representation of South Asian languages in Unicode, and recent debates about the Indus Valley inscriptions. Elena Bashir’s Section 9.4 treats adaptations of the Perso-Arabic script used for various languages of South Asia, and her Section 9.5, on current research areas and desiderata, concludes the chapter.

9.2.

General historical and analytical By Stefan Baums

9.2.1.

Early scripts

We have no securely datable documents from South Asia until the edicts of the emperor Aśoka in the third century BCE.1 The edicts, throughout Aśoka’s empire in northern and parts of southern India, are written in two different scripts: those in the northwest (modern Afghanistan and Pakistan) in Kharoṣṭhī, the rest in Brāhmī. Greek and Aramaic versions of the edicts have likewise been found in Afghanistan. The connections of the Kharoṣṭhī script and scribal institutions with the Achaemenid Empire suggest that Kharoṣṭhī was invented not later than the fourth century BCE (Baums 2014) and that Aśoka was following established custom when he used Kharoṣṭhī in his northwestern edicts. The Sanskrit grammar of Pāṇini (fourth century BCE and a native of the northwest) likewise mentions writing (lipi) and books (grantha), but it remains unclear whether he referred to Aramaic or Kharoṣṭhī script. Around 325 BCE, Alexander’s general Nearchos observed that the inhabitants of the northwest wrote letters on tightly woven cloth (whether in Aramaic or Kharoṣṭhī). Some twenty years later, the Greek ambassador Megasthenes denied that writing was used for legal proceedings in northeastern India, but this does not preclude its use for other purposes (v. Hinüber 1990). The evidence thus indicates a continuous writing tradition since Achaemenid times in northwestern South Asia, with a transition from Aramaic to Kharoṣṭhī at some point before the third century BCE. In contrast, there is no conclusive evidence for writing in mainland India until the time of Aśoka, and it remains possible, though by no means certain, that he or his immediate predecessors may have invented the Brāhmī script on the model of Kharoṣṭhī.

9.2.1.2.

Kharoṣṭhī

Kharoṣṭhī remained throughout its existence a regional script of the northwest (ancient Gandhāra). Following the Silk Roads, it spread into Central Asia and in the third century CE was used to write administrative and legal records in the area of Loulan and Niya in the southern Tarim basin. Around the same time, expatriate South Asian Buddhist communities used Kharoṣṭhī in the capital of China. In its homeland, Kharoṣṭhī died out by the fourth century CE and was replaced by variants of Brāhmī; in the northern Tarim basin it continued to be used for another century or two.

1

The discussion in this section is an updated and expanded version of Baums 2011: 240–250.

Like its model, the Aramaic script used by the Achaemenid administration in Gandhāra, Kharoṣṭhī is written from right to left, and the two scripts agree in the shapes and sound values of some (but by no means all) of their letters. While Aramaic writes only the consonants of words, not their vowels, the developers of Kharoṣṭhī added an original system of marking vowels that became the model for Brāhmī and all later scripts derived from it. A consonant letter without modification, such as the letter k, signifies not just the bare consonant, in this case k, but the syllable ka; the vowel a is said to be “inherent” in consonant signs. If a vowel other than a follows a consonant, its presence is indicated by attaching a vowel diacritic to the consonant sign, as in ki and ku. The phonemic difference between short and long vowels is not marked: the sign for ki is used to write both [ki] and [kiː], etc. Some scribes indicate preconsonantal nasal segments by a hook (anusvāra, transliterated ṃ) under the preceding consonant sign: compare ka with kaṃ. Where one consonant immediately follows another without intervening vowel, this is indicated by combining the two consonant signs into one conjunct consonant or ligature: the sequence tsa, for instance, is represented by combining the signs for ta and sa into a single ligature, thus cancelling the vowel a inherent in ta; the sequence tasa, on the other hand, is written with the two separate signs ta and sa. While most consonant clusters are written with transparent conjuncts, some, such as staṃ or kṣa, are indicated by opaque (and possibly atomic) signs. In order to write a vowel at the beginning of a word or a vowel following another vowel, its vowel diacritic is attached to a “vowel carrier”; the carrier alone signifies a, the carrier with the i diacritic is i, the carrier with the u diacritic is u, and so on. (It is likely that this so-called vowel carrier was in reality the consonant [ʔ].) The graphical unit of a simple consonant sign or ligature followed by an optional vowel diacritic and/or anusvāra is called an akṣara.
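
The akṣara structure just described can be illustrated with a minimal sketch that works on romanized Gāndhārī rather than on Kharoṣṭhī glyphs. It assumes that every consonant is transliterated by a single character and that every akṣara ends in one of the five vowels a, i, u, e, o (length being unmarked), optionally followed by ṃ; the names `aksaras` and `describe` are illustrative only and come from no existing tool.

```python
import re

VOWELS = "aiueo"        # Kharoṣṭhī does not distinguish short and long vowels
ANUSVARA = "\u1e43"     # the transliteration character ṃ

# One akṣara: optional consonant(s), one vowel, optional anusvāra.
AKSARA = re.compile(f"([^{VOWELS}{ANUSVARA}]*)([{VOWELS}])({ANUSVARA}?)")


def aksaras(word):
    """Split a romanized word into akṣaras."""
    return [m.group(0) for m in AKSARA.finditer(word)]


def describe(aksara):
    """Name the graphical components of a single romanized akṣara."""
    onset, vowel, anusvara = AKSARA.fullmatch(aksara).groups()
    if not onset:
        base = "vowel carrier"            # initial or post-vocalic vowel
    elif len(onset) == 1:
        base = "simple consonant sign"
    else:
        base = "ligature"                 # conjunct of two or more consonants
    return {
        "base": base,
        "diacritic": None if vowel == "a" else vowel,   # a is inherent
        "anusvara": bool(anusvara),
    }


print(aksaras("tsa"), aksaras("tasa"))
print(describe("ki"))
```

On this model tsa is a single akṣara built on a ligature, while tasa is two akṣaras, each a simple consonant sign with inherent a.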
The akṣara forms the basic graphical unit of Kharoṣṭhī and all Brāhmī-derived scripts. The letters of the Kharoṣṭhī script are conventionally arranged in the alphabetic order a, ra, pa, ca, na, etc., the so-called arapacana alphabet. Its origins are obscure, but it seems to have undergone systematic extension (Baums 2009: 194–197) and, after the demise of Kharoṣṭhī, lived on as a Buddhist magical formula (Brough 1977). Equally obscure is the origin of the name “Kharoṣṭhī” (see Falk 1993: 84–90 for a summary of theories). Kharoṣṭhī was almost exclusively used for the Middle Indo-Aryan literary language Gāndhārī, but attempts were made to adapt it to the writing of Sanskrit (Salomon 2008, Strauch 2012).

9.2.1.3.

Brāhmī

In contrast to Kharoṣṭhī, the other early South Asian writing system, Brāhmī, is (almost always) written from left to right, and where Kharoṣṭhī (like Aramaic) has a distinctly cursive ductus, early Brāhmī is a monumental script whose letters consist of straight lines, circles, and other basic geometric shapes. Both properties may have been inspired by some degree of familiarity with the Greek script (Falk 1993:

109–112). The general system of Brāhmī is the same as that of Kharoṣṭhī, but it improves on the older script by using separate diacritics for short and long vowels; Brāhmī also employs independent signs for initial vowels instead of a vowel carrier plus diacritic like Kharoṣṭhī: ka, kā, ki, and kī are written with four distinct forms of the k sign, and syllable-initial a, ā, i, and ī with four independent vowel signs. Brāhmī, like Kharoṣṭhī, was originally developed for Middle Indo-Aryan and only later came to be used for Sanskrit. Among the earlier Brāhmī inscriptions, conjunct consonants are therefore rare, and where they occur in the Girnar edict of Aśoka they may represent coarticulations rather than true clusters (pta, e.g., may be a labialized [tʷ] < OIA [tʋ]; v. Hinüber 2001: 196–197). The name “Brāhmī” may refer to its use (starting in the first century CE) for writing the language of the Brahmans, Sanskrit (Falk 1993: 106–108). The letters of the Brāhmī script are arranged in an alphabetic order called varṇamālā (‘sound garland’) that is based on phonological principles. In the full form used for Sanskrit, first come the simple vowels in pairs of short and long, then the diphthongs; then anusvāra and visarga (ḥ, syllable-final voiceless [h]); then the stop consonants in the order unvoiced plain and aspirate, voiced plain and aspirate, and nasal (in order of place of articulation, starting in the back of the mouth); then the four semivowels ya, ra, la, va; then the sibilants śa, ṣa, sa; and finally, ha (voiced [ɦ]). Early Brāhmī (third to first century BCE) became the ancestor of all later South Asian (as well as the main Southeast and some Central Asian) scripts. With few exceptions, the akṣara system remained constant, and only the forms of consonant signs, vowel signs, and vowel diacritics continued to evolve. Following Dani (1963), we can distinguish the following stages of development: Early, Middle, and Late Brāhmī.

9.2.1.3.1.

Early Brāhmī

Changes in letter shape resulted from cursivization or stroke reduction (as in the later forms of Aśokan ka and na); from changes in the order or direction of strokes (as in the successive forms of Aśokan ṇa); and from the incorporation into the writing system of originally insignificant mechanical changes. The last type of change had the greatest effect on the ductus of scripts. The little blot of ink, for instance, that a stylus leaves where it first touches the writing surface at the top of letters later developed into the long horizontal headlines of Nagari and Bengali and the variously shaped heads of the Southern scripts, such as the “check mark” of Telugu or the “umbrella” of Oriya. Regional variations first appeared in the Early Brāhmī period (third to first centuries BCE). The letter dha was mirrored, ga was cursivized, and ma was angularized. More radical systemic changes occurred in Old Tamil cave inscriptions of the second and first centuries BCE. Mahadevan’s (2003) system TB-I uses the vowel mātrā ā to represent both short and long [a] / [aː]; in this

orthography consonant signs without the ā mātrā do not have an inherent vowel and always represent the bare consonant. In system TB-II, the ā mātrā always represents long [aː], whereas vowelless consonant signs can be read either with inherent short [a] or as bare consonants. By the second century CE, the ambiguity of ā in TB-I and of bare consonant signs in TB-II (and the influence of the standard Brāhmī system) led to system TB-III in which a dot (puḷḷi) marks the absence of a vowel when placed above basic consonant signs, and shortening when used with the vowels e and o (in contrast to Old and Middle Indo-Aryan, Dravidian languages have short as well as long e and o phonemes). Old Tamil Brāhmī added signs for ḻ, ḷ, ṟ, ṉ and possibly ṅ to the original inventory of Brāhmī. Middle Indo-Aryan inscriptions from Bhattiprolu in South-East India (second century BCE) employ two separate diacritics for short and long a (one for ka, another for kā), which seems to be an extension of the Old Tamil Brāhmī system TB-I. The main reason for abandoning inherent [a] in Tamil Brāhmī does not apply in the case of the Bhattiprolu inscriptions since Middle Indo-Aryan does not have word-final consonants or non-homorganic clusters. This implies that the dedicated long ā mātrā, too, was first introduced in a Tamil context, and the resulting system only later imitated in Bhattiprolu, but no such Tamil inscription has yet been discovered.
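
The difference between the three Tamil Brāhmī systems can be summarized as a small lookup of possible readings, sketched here under stated assumptions. The function `readings` and the mark labels `"plain"`, `"aa"`, and `"pulli"` are hypothetical names chosen for illustration; the reading of an unmarked sign in TB-III as consonant plus inherent a is inferred from the reference to the standard Brāhmī system rather than stated outright, and the shortening use of the puḷḷi with e and o is left out.

```python
LONG_A = "\u0101"   # the transliteration character ā


def readings(system, consonant, mark="plain"):
    """Return the set of possible readings of one consonant sign.

    mark is "plain" (no diacritic), "aa" (the ā mātrā) or
    "pulli" (the dot introduced in TB-III).
    """
    c = consonant
    tables = {
        # TB-I: the ā mātrā covers short and long a; a plain sign is vowelless.
        "TB-I": {"plain": {c}, "aa": {c + "a", c + LONG_A}},
        # TB-II: the ā mātrā is always long; a plain sign is ambiguous.
        "TB-II": {"plain": {c, c + "a"}, "aa": {c + LONG_A}},
        # TB-III (assumed): inherent a, long ā mātrā, puḷḷi cancels the vowel.
        "TB-III": {"plain": {c + "a"}, "aa": {c + LONG_A}, "pulli": {c}},
    }
    if system not in tables or mark not in tables[system]:
        raise ValueError(f"no such sign in {system}: {mark}")
    return tables[system][mark]


for system in ("TB-I", "TB-II", "TB-III"):
    print(system, sorted(readings(system, "k")), sorted(readings(system, "k", "aa")))
```

Only in TB-III does every sign have exactly one reading, which is the disambiguation the paragraph above attributes to the introduction of the puḷḷi.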

9.2.1.3.2.

Middle Brāhmī

In the Middle Brāhmī period (first to third centuries CE), local variants became more distinct; Dani (1963) distinguishes Kauśāmbī, Mathurā, Western Deccan, and Eastern Deccan regional styles. Headmarks developed different shapes (linear, square, triangular, etc.), there were further angularizations (as in ja) and cursivizations (as in sa), and the vowel diacritics tended to assume more elaborate forms (as in dī). In this period, Sanskrit was first used in inscriptions and manuscripts, and additional signs were introduced to represent its sounds (as in kṛ, kau, kaḥ, and ṅa). Northwestern manuscripts of the first century CE contain the first examples of a vowel cancellation marker (virāma) indicating cancellation of inherent a wherever there was no following consonant sign to form a ligature with. This early virāma device consists in lowering the sign for the vowelless consonant below the baseline, linking it with the preceding akṣara, and putting a short horizontal line on top of it (as in tvāt). In the later scripts, just the equivalents of that horizontal line, now placed diagonally below the consonant sign, are used as virāma, with the consonant sign in normal position (cf. Devanāgarī त्वात् tvāt).

9.2.1.3.3.

Late Brāhmī

In the Late Brāhmī period (fourth to sixth centuries CE), graphical differentiation reached a point where regional forms of Brāhmī would have to be learned separately and we can therefore speak of different scripts rather than variants of

a single script. While older scholarship distinguished between “Western” and “Eastern” Gupta scripts and South Indian Brāhmī, Dani (1963) suggests a categorization into nine main geographical divisions. In the South Indian scripts, letters began to assume their typical round forms because they were now incised into the surface of palm leaves, then rubbed with ink, and straight lines would have tended to rupture the leaf. Some letters reached their modern forms (compare e.g. northern ga with Nāgarī ग). In Central Asia, local scripts were developed on the basis of a northwestern Gupta type, in order to write Sanskrit locally and for the writing of Tocharian, Uyghur, and Khotanese.

9.2.2.

Transitional Script period

In the Transitional Script period (seventh to tenth centuries CE), proto-Śāradā (“Gilgit-Bamiyan type II” in Sander’s 1968 terminology) emerged as a distinct northwestern script. In its fully developed form, Śāradā was used for the writing of Sanskrit and Kashmiri and gave rise to the regional traders’ scripts Takri (used for Western Pahari) and Landa (used for Sindhi and Panjabi). In the rest of Northern India, the Siddhamātṛkā script was in use, eventually giving rise to Nagari and Bengali and living on in East Asia as the “Siddham” script. The Tibetan writing system was developed at the beginning of the Transitional Script period under Central Asian and North Indian influences (van Schaik 2011 argues for a predominance of the latter). In upper South India, a distinct proto-Kannada-Telugu script began to take shape. In the far South, three different scripts were emerging: the Grantha script for Sanskrit, and the Tamil and cursive Vaṭṭeḻuttu scripts for Tamil. The Grantha script used by the Pallava dynasty became the basis of the Southeast Asian scripts. The Sinhalese script had so far mostly developed in isolation; now it was subjected to a strong influence from Pallava Grantha.

9.2.3.

Modern Devanagari and Gujarati

The modern Nagari (or Devanagari) script (Maurer 1976) had assumed a distinct identity by the 11th century and is now used throughout northern India for Hindi, Nepali, Marathi, local dialects like Bhojpuri, and non-Indo-Aryan languages like Gondi.2 From the 11th to 16th centuries, a regional form called Nandinagari was used in southern India, and between the 12th and 16th centuries, an ornamental variant called Ranjana took shape in Nepal under influence from Bengal. Sanskrit is primarily printed in Nagari today, and in spite of strong loyalty to the local scripts, Nagari has developed a presence throughout modern India.

2

Some of the examples for the modern Indic scripts are drawn from Daniels & Bright 1996.

Nagari letters are characterized by horizontal headlines and right angles. The vowel diacritics for ā, i, ī, and o have drooped to the base line of the akṣara (in the case of i on its left side): compare का kā and कि ki with their remote ancestors in Early Brāhmī. At a stage in the linguistic prehistory of Hindi, word-final, and under certain conditions word-medial, short [ə] disappeared, but this change is not mirrored by the writing system of Hindi, so that ‘Monday’, e.g., is written सोमवार somavāra but pronounced [soːmʋaːr]. For the representation of peripheral phonemes that have entered Hindi via loanwords, a subscript dot (nuqtā) is optionally added to consonant signs of similar pronunciation: क़ qa, ख़ xa, ग़ γa, ज़ za, and फ़ fa, imitating the way the Arabic script was extended for the writing of languages such as Persian. Nagari gave rise to regional traders’ scripts such as Modi (used for Marathi). The earliest inscriptions in the Gujarati language, dating from the 15th century, are written in Nagari script. Later a cursive variant of Nagari began to develop into a separate Gujarati script, which only attained widespread currency in the middle of the 19th century. Gujarati betrays its cursive origin in the lack of a headline (compare ગ ga and ત ta with their Nagari counterparts ग and त) and developed a consistently analytic notation for initial vowels: the signs for ā (આ), e (એ), ai (ઐ), o (ઓ), and au (ઔ) are all derived by diacritic vowel signs from the sign for initial a (અ), while in Nagari this is only true of ā (आ), o (ओ), and au (औ) (all from अ).

9.2.4.

Gurmukhi and Khojki

The Gurmukhi script is used for Panjabi, especially by Sikhs in and from the Indian Panjab. It was developed by Guru Angad (1504–1552) on the basis of Landa and takes its name from this fact and from its use in the Adi Granth. Gurmukhi has a unique system of writing initial vowels by adding the vowel diacritics to one of three different vowel carriers: ਅ is used for ਅ a, ਆ ā, ਐ ai, and ਔ au (low vowels and diphthongs); ੲ for ਇ i, ਈ ī, and ਏ e (front vowels); and ੳ for ਉ u, ਊ ū, and ਓ o (back vowels). Panjabi lost the voiced aspirates and developed a system of high, mid, and low tones. Synchronically, the “voiced aspirate” letters have the following tone-marking functions: a vowel preceded by a “voiced aspirate” (stem-initial, or stem-medial between a short and a long vowel) carries a low tone (ਘੋੜਾ ghoṛā [kòɽɑ], ਪਘਾਰਨਾ paghāranā [pəɡɑ̀rnɑ]); a vowel followed by a stem-final “voiced aspirate” letter carries a high tone (ਮਾਘ māgha [mɑ́ɡ]); non-initial ਹ ha also represents high tone on the preceding vowel (ਤੀਹ tīha [tí]). Another innovation is a diacritic called addak, marking gemination of consonants (ਪੱਕੀ pakkī), imitating the Arabic tašdīd. As in Nagari, peripheral loanword phonemes (and more recently, indigenous retroflex ḷ) are marked by subscript dots. Around the same time as the development of Gurmukhi, the Landa script was also adapted by the South Asian Ismaili community for the writing of their religious literature, the ginān (Asani 1987). The development of this Ismaili Khojki script

(from Persian xwājah ‘master’) is attributed to Pir Sadr al-Din (15th century). The main improvements of Khojki over the traders’ script Landa consist in the addition of medial vowel marks (here called lākanā), of a gemination marker called šadda (corresponding to the Gurmukhi addak), and especially the introduction of systematic punctuation to separate words. The Khojki script fell out of general use by the 1970s and has been replaced by the Gujarati and Arabic scripts.

9.2.5.

Bengali and Oriya

Like Nagari, the Bengali script derives from the north Indian Siddhamātṛkā. The close relationship of Bengali script and Nagari is apparent from their use of a horizontal head line and the shape of letters such as ক / क ka and ন / न na. The ductus of the Bengali script is defined by acute angles. For the representation of postconsonantal e, ai, o, and au the Bengali script employs so-called pṛṣṭhamātrās (‘backstrokes’): e and ai are written to the left of the consonant sign (কে, কৈ), o and au surround it (কো, কৌ); the other modern scripts with pṛṣṭhamātrās are Oriya, Malayalam, Tamil, and Sinhalese. While the orthography of most South Asian scripts is close to their pronunciation, Bengali orthography is very conservative and does not reflect many sound changes in the history of Bengali: the distinction between long and short i / ī and u / ū is only made in writing (কুল kula and কূল kūla, e.g., are both pronounced [kul]); the three sibilants শ śa, ষ ṣa, and স sa are pronounced the same (mostly [ʃ], [s] before dentals); ণ ṇa and ন na are both pronounced [n]. Conversely, there is only one sign (a, অ or inherent) for the not wholly predictable two pronunciations [ɔ] and [o]; and one (এ or ে) for the pronunciations [æ] and [e]. Original consonant clusters are written as such, but pronounced as geminates if the first element is medial, and as the simple first element if it is initial (শ্বাস śbāsa [ʃaʃ], বিদ্বান bidbāna [biddan]). Nasal stops and y as second elements of clusters lead to nasalization and palatalization of following vowels (স্মারক smāraka [ʃãrok], ব্যাকরণ byākaraṇa [bækɔron]). The Oriya script is descended from proto-Bengali, but has been influenced by the ductus of South Indian scripts in the round shape of letters and in the “umbrella” covering most letters, corresponding to the head line of Nagari or Bengali (cf. କ ka, Nagari क, Bengali ক; ଲ la, Nagari ल, Bengali ল).

9.2.6.

South Indian scripts

Among the modern South Indian scripts, one subgroup is formed by the Kannada and Telugu scripts, another by the Grantha-derived Malayalam and Tamil. Kannada and Telugu have their first precursor in the script of the Kadamba and Cālukya inscriptions of the 5th to 7th centuries. After the 10th century, one can speak of a distinct Old Kannada script, which by the year 1500 had begun to differentiate into Kannada and Telugu varieties; the differences between the two scripts were

standardized by their use in printing from the 19th century. The main distinction between the modern Kannada and Telugu scripts is the different shape of the headmark, which is a horizontal line with a hook at the right in Kannada (e.g. in ಕ ka, ಚ ca, ತ ta) but looks like a check mark on top of the letters in Telugu (క ka, చ ca, త ta). In both scripts, some of the aspirate letters are formed by addition of a diacritic subscript line or dot to the unaspirated letter (e.g. Telugu ఛ cha from చ ca, Kannada and Telugu ಧ / ధ dha from ದ / ద da); the subscript line is then also applied to aspirates that have their own distinct letter (e.g. Kannada and Telugu ಘ / ఘ gha, cf. ಗ / గ ga). In Kannada only, the diacritics for the long vowels ī, ē, and ō are derived from those of the corresponding short vowels: ಕಿ ki vs. ಕೀ kī, ಕೆ ke vs. ಕೇ kē, ಕೊ ko vs. ಕೋ kō. Also in Kannada, consonant clusters with initial r are written with the combining form of r following the other consonant: ರ್ತ rta. The Malayalam and Tamil scripts form another subgroup, as shown by similarities of letter shape and systemic features. The vowel diacritic for short i, e.g., is placed to the right of the akṣara only in these two scripts: Malayalam കി ki, Tamil கி ki vs. Kannada ಕಿ ki, Telugu కి ki. The vowel diacritics for ā, e, ē, ai, o, ō, and au are physically separated from the main part of the akṣara. Malayalam and Tamil use pṛṣṭhamātrās for these vowels, again in contrast with Kannada and Telugu, but in common with Sinhalese (as well as Bengali and Oriya). Six Malayalam characters (ക ka, ണ ṇa, ന na, ര ra, ല la, and ള ḷa) form ligatures with the virāma sign ( ്), the so-called cillakṣarams: ൿ, ൺ, ൻ, ർ, ൽ and ൾ (cf. ഖ്‌ kh or ഗ്‌ g).
Orthographic reforms in the 1970s and 1980s introduced new signs for postconsonantal u, ū, ṛ, and r that are placed on the left and right side of the akṣara (പു pu, പൂ pū, പൃ pṛ, and പ്ര pra) and replaced all consonant conjuncts by combinations with virāma or cillakṣaram (the ligature ക്ത kta became ക്‌ത, the ligature ന്ത nta became ന്‌ത, etc.). The Malayalam script is primarily used for the Malayalam and Tulu languages. The Tamil script can be traced back to the 9th century (like its cursive variant Vaṭṭeḻuttu) and assumed its modern form by the 15th century. The sign inventory of Tamil is much smaller than that of the other Brāhmī-derived scripts: signs for aspirate stops were abandoned because they do not occur in Tamil, and signs for voiced stops because they only occur as allophones of their voiceless counterparts (க ka represents both [k] and [ɡ], etc.). There are separate vowel signs for short and long e and o. Tamil consistently uses the puḷḷi sign instead of consonant conjuncts (with the exception of க்ஷ kṣa which is regarded as a basic letter). The character repertoire of Tamil has three layers: the core characters needed to write Tamil itself; five characters inherited from Grantha and used for Sanskrit words (ஜ ja, ஷ ṣa, ஸ sa, ஹ ha and க்ஷ kṣa); and the visarga sign (ஃ), called āytam. The last is also used in combination with other consonant signs to write peripheral loanword phonemes (corresponding functionally to the subscript dot, nuqtā, of Nāgarī and other scripts): ஃப (ḥ + p) for [f] and ஃஜ (ḥ + j) for [z].
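
The two Tamil devices mentioned above, the puḷḷi in place of conjuncts and the āytam for loanword phonemes, can be sketched in terms of their Unicode encoding. This is an illustration only: the helper names `spell_cluster` and `LOAN_PHONEMES` are hypothetical, the four letters in `LETTERS` are an arbitrary sample, and the code says nothing about how fonts render these sequences.

```python
import unicodedata

PULLI = "\u0bcd"    # encoded as TAMIL SIGN VIRAMA
AYTAM = "\u0b83"    # encoded as TAMIL SIGN VISARGA

# A small sample of basic consonant letters, keyed by transliteration.
LETTERS = {"k": "\u0b95", "t": "\u0ba4", "p": "\u0baa", "j": "\u0b9c"}

# Loanword phonemes written as āytam plus a consonant letter.
LOAN_PHONEMES = {"f": AYTAM + LETTERS["p"], "z": AYTAM + LETTERS["j"]}


def spell_cluster(consonants):
    """Spell a consonant cluster followed by inherent a.

    Every consonant except the last receives a puḷḷi, so no
    conjunct ligature is needed.
    """
    letters = [LETTERS[c] for c in consonants]
    return PULLI.join(letters)


kta = spell_cluster(["k", "t"])
print(kta, [unicodedata.name(ch) for ch in kta])
print(LOAN_PHONEMES["f"], LOAN_PHONEMES["z"])
```

Joining the letters with the puḷḷi leaves only the final consonant with its inherent vowel, which is how the script avoids conjuncts altogether.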

9.2.7.

Sinhalese

Writing was introduced to Sri Lanka by the 2nd century BCE (and possibly as early as the fourth). The subsequent development of the Sinhalese script was characterized by long periods of isolation, interrupted by occasional strong influence from mainland South Indian scripts (especially Pallava Grantha), and by the 14th century it had approached its modern form. It has special signs for the Sinhalese open vowels ä (ඇ, –ැ) and ǟ (ඈ, –ෑ) and the prenasalized stops n̆ga (ඟ), n̆ḍa (ඬ), n̆da (ඳ), and m̆ba (ඹ). The Sinhalese character repertoire can be divided into two layers: the core characters for Classical Sinhalese (eḷu hōḍiya) and an outer set with signs for Sanskrit and Pāli words (ṛ, ṝ, ai, au, the aspirates, the nasals ṅa and ña, and the sibilants śa and ṣa); the complete alphabet is called miśra hōḍiya. Letters added in the miśra hōḍiya are in normal speech pronounced like corresponding letters from the eḷu hōḍiya (ඝ gha and ග ga are both [ɡə], etc., and the sibilants are all [s]).

9.2.8.

Dhivehi

Dhivehi, spoken on the Maldives, is closely related to Sinhalese, and was first committed to writing in the 13th century in the evēlā (‘ancient’) script derived from Sinhalese, developing further into the dhivehi akuru (‘island letters’) script (Geiger 1919: 20–29, 149–168, DeSilva 1969). In the early 17th century this was replaced by the current script, called Thaana, which is written from right to left and contains 24 consonant letters. The first nine are based on the Arabic numerals; the next nine on an old set of local numerals; and the last six letters, used for loanwords, are modifications of other letters or borrowings from Arabic. There are ten vowel signs, including one for short a (◌ަ) which is not inherent in consonant signs. The absence of a vowel is redundantly signalled by a cancellation mark (◌ް, called sukun). The character alifu (އ ’) serves as vowel carrier for initial and post-vocalic vowels. Alifu, ށ š, or ތ t in combination with sukun (އް, ށް, ތް) represent a glottal stop word-finally; when preceding another consonant, they indicate gemination, and ތް also adds an off-glide [j] to the preceding vowel: ބައްޓެއް ba’ṭe’ [baʈʈeʔ] ‘eggplant’, ރަށް raš [raʔ] ‘island’, އަތްޕުޅު atpuḷu [ajppuɭu] ‘hand’; prenasalization is marked by ނ n (without sukun) and sometimes left unmarked: ކަ(ނ)ޑު ka(n̆)ḍu (examples from Gair & Cain 1996). Diphthongs are written as vowel + alifu + vowel (e.g. ފައި fa’i [fai] ‘leg’). Loanwords from Arabic are written in Arabic script or with the help of twelve additional characters formed by adding dots to Thaana letters.
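
The reading rule for alifu, š, and t with sukun can be restated as a short function, given here as a sketch over phonetic strings rather than Thaana text. The name `sukun_value` and the ASCII labels `"alifu"`, `"sh"`, and `"t"` are hypothetical, and the code assumes that the off-glide [j] of t with sukun appears only before a following consonant, which is one possible reading of the description above.

```python
GLOTTAL_STOP = "\u0294"   # IPA ʔ


def sukun_value(letter, next_consonant=None):
    """Phonetic contribution of alifu, š or t when written with sukun.

    next_consonant is the IPA value of the following consonant,
    or None when the sign is word-final.
    """
    if letter not in {"alifu", "sh", "t"}:
        raise ValueError(f"unexpected letter: {letter}")
    if next_consonant is None:
        return GLOTTAL_STOP                  # word-final: glottal stop
    glide = "j" if letter == "t" else ""     # assumed: glide only before a consonant
    return glide + next_consonant            # doubles the following consonant


RETRO_T = "\u0288"   # IPA ʈ
RETRO_L = "\u026d"   # IPA ɭ

island = "ra" + sukun_value("sh")
eggplant = "ba" + sukun_value("alifu", RETRO_T) + RETRO_T + "e" + sukun_value("alifu")
hand = "a" + sukun_value("t", "p") + "pu" + RETRO_L + "u"
print(island, eggplant, hand)
```

The three words built at the end correspond to the examples raš, ba’ṭe’, and atpuḷu cited from Gair & Cain 1996.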

9.2.9.

Latin script

The only South Asian language using (since the 16th century) the Latin script as its primary writing system is Konkani, where vowel length is not marked and retroflexes are indicated by double consonants. Attempts to replace other South Asian scripts by the Latin script were unsuccessful, reflecting strong attachment to and identification with the local scripts as well as the effectiveness of the Brāhmī-derived scripts in representing Indian languages.

9.2.10.

Numerals and punctuation

Early Kharoṣṭhī had number signs for 1, 10, 20, 100, and 1000 (𐩀, 𐩄, 𐩅, 𐩆, 𐩇); 20 is a cursive combination of two signs for 10 arranged on top of each other. Later a separate sign for 4 (𐩃) is added to the inventory and cursive forms for 2 and 3 develop (𐩁 from 𐩀𐩀, 𐩂 from 𐩀𐩀𐩀). The Kharoṣṭhī number system is additive, with higher number signs preceding lower ones in the reading direction: 16, e.g., is written 10 (+) 4 (+) 1 (+) 1 (𐩀𐩀𐩃𐩄). Multiples of 100 and 1000 are written with a multiplier preceding the 100 or 1000: 200 is 𐩆𐩀𐩀. The Kharoṣṭhī system is based on the Aramaic one (Chrisomalis 2010: 68–74, 83–86). The early Brāhmī number system is also additive, but has a larger number of basic signs: one each for 1 to 9, one each for 10 to 90, and one each for 100 and 1000. Higher signs precede lower ones in the reading direction: 16 is 10 (+) 6. Multiples of 100 and 1000 are written with a multiplier following and conjoined with the 100 and 1000 (the sign for 1000 followed by a separate 4 is 1004, but 1000 conjoined with 4 is 4000); 200/2000 and 300/3000 are written by adding one or two horizontal strokes to the signs for 100 and 1000. The origin of the early Brāhmī number system remains unclear, but inspiration from China (Falk 1993: 168–176) or Egypt (Chrisomalis 2010: 191–192) may have played a role. Around the 7th century, the positional system came into use, with the signs for 1 to 9 continuing those of the older system and a new sign for 0 (a dot or circle). Some of Aśoka’s edicts use word spacing to mark syntactic units (Janert 1972). Other inscriptions and manuscripts are written continuously, but use various punctuation marks (such as 𐩐, 𐩑, 𐩕 and 𐩓 in Kharoṣṭhī, with further marks in Brāhmī), and later the signs daṇḍa (।) and double daṇḍa (॥). Since the early 20th century, European punctuation has increasingly been used in Indian texts.
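
The two additive systems can be compared in a short sketch that decomposes a number into sign values listed in reading order. It rests on several assumptions not settled by the description above: the Kharoṣṭhī function uses only the signs 1, 4, 10, 20, 100, and 1000 (ignoring the cursive 2 and 3), it writes a single 100 or 1000 without a multiplier, and the Brāhmī function represents a conjoined multiple as a tuple. The names `kharosthi_signs` and `brahmi_signs` are illustrative.

```python
def kharosthi_signs(n):
    """Sign values for n (1..9999), higher signs first in reading order."""
    if not 1 <= n <= 9999:
        raise ValueError("supported range is 1..9999")

    def below_hundred(m):
        signs = []
        for value in (20, 10, 4, 1):
            count, m = divmod(m, value)
            signs.extend([value] * count)
        return signs

    signs = []
    for unit in (1000, 100):
        multiplier, n = divmod(n, unit)
        if multiplier > 1:
            signs.extend(below_hundred(multiplier))   # multiplier precedes the unit
        if multiplier:
            signs.append(unit)
    return signs + below_hundred(n)


def brahmi_signs(n):
    """Sign values for n (1..9999); (unit, k) marks a conjoined multiple."""
    if not 1 <= n <= 9999:
        raise ValueError("supported range is 1..9999")
    signs = []
    for unit in (1000, 100):
        multiplier, n = divmod(n, unit)
        if multiplier == 1:
            signs.append(unit)
        elif multiplier > 1:
            signs.append((unit, multiplier))   # multiplier attached to the unit sign
    tens, ones = divmod(n, 10)
    if tens:
        signs.append(tens * 10)                # dedicated signs for 10..90
    if ones:
        signs.append(ones)                     # dedicated signs for 1..9
    return signs


print(kharosthi_signs(16), kharosthi_signs(200))
print(brahmi_signs(16), brahmi_signs(1004), brahmi_signs(4000))
```

The outputs for 16, 200, 1004, and 4000 follow the examples given in the text; the larger Brāhmī sign inventory is what makes its numerals shorter.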

9.3.

Recent script-related research By Stefan Baums

9.3.1.

Recent work on alphasyllabic writing systems

The two ancient writing systems of South Asia, Kharoṣṭhī and Brāhmī, were first deciphered with the help of bilingual coin legends and by working forwards from the letter shapes of the Semitic scripts (Kharoṣṭhī) and backwards from those of later South Asian scripts (Brāhmī). After pioneering efforts by Charles Wilkins (1749–1836), Henry Thomas Colebrooke (1765–1837), Christian Lassen (1800–1876), and others, James Prinsep (1799–1840) announced his decipherment success in two articles: in 1837 on the inscriptions of Sanchi, and in 1838 on the Indo-Greek coin legends. The decipherment of Kharoṣṭhī was consolidated when Norris (1846) published his reading of the newly-discovered Aśokan edict at Shabazgarhi (Falk 1993: 99–103; Salomon 1998: 199–215). From the mid-nineteenth until the beginning of the twentieth century, knowledge of the historical scripts of South Asia solidified and received first synthetic treatments by Dowson (1863) on Kharoṣṭhī, Burnell (1874, 1878) on the South Indian scripts, and Bühler (1896) in his comprehensive paleography. The modern study of the historical scripts can be divided into three formative phases. The first of these was prompted by the discovery of large numbers of early Buddhist manuscripts from South Asia along the Silk Roads in modern Xinjiang, China, and near Gilgit in modern Pakistan. Hoernle (1916) and Boyer, Rapson, Senart & Noble (1920–1929) provided the first major publications and analyses of the Central Asian Brāhmī and Kharoṣṭhī material; v. Hinüber (1979) summarizes research on the Gilgit manuscripts.
Next, the mid-twentieth century saw a number of new syntheses by Das Gupta (1958) on Kharoṣṭhī, Dani (1963, 1986) in a new comprehensive paleography giving particular attention to regional developments of Brāhmī, Sircar (1942, 1965a, 1965b, 1966, 1983) in a series of reference works on South Asian epigraphy, Sander (1968) on the development of Brāhmī in Central Asia, and Jensen’s (1969) overview of the South Asian scripts as part of a general history of writing systems. Finally, another series of compendia inaugurated the third and current phase of historical studies: v. Hinüber (1990) and Falk (1993) reevaluated our knowledge of writing and literacy in ancient South Asia and assembled a comprehensive history of research (cf. the review article by Salomon 1995). Daniels & Bright 1996 gave a new overview of the world’s writing systems, replacing Jensen’s book with a collection of essays. Salomon’s (1998) handbook of South Asian epigraphy similarly updated Sircar 1965a, and Falk & Slaje (eds.) 2000–2005 broke new ground with a paleographical database assembling the contributions of numerous experts; Einicke (2009) draws on this database for a comprehensive handbook of South Asian scribal notation from the fifth century to modern times. The third phase,

like the first, is also characterized by the emergence of large amounts of new primary material. Recent discoveries started off with Coningham, Allchin, Batt & Lucy’s (1996) report of a find of potsherds from Anurādhapura, Sri Lanka, dated stratigraphically to the early fourth century BCE and thus potentially indicating a history of Brāhmī as a traders’ script before its adoption by Aśoka. The discovery of around eighty-five Gāndhārī manuscripts on birch bark and palm leaf (Salomon 1999; Strauch 2008; Baums & Glass ongoing) has put Kharoṣṭhī manuscript studies on an entirely new footing (Glass 2000 discusses the paleography of this material, Baums 2009: 110–200 its orthography). The recovery of large numbers of first-millennium Sanskrit manuscript fragments from Gilgit and Bamiyan (Hartmann 2000, Braarvig 2000) and of early second-millennium Sanskrit manuscripts from Tibet (Steinkellner 2004) is filling gaps in our knowledge of the development of Brāhmī (Sander 2000) and the Transitional Scripts. The gradual replacement of the Kharoṣṭhī by the Brāhmī tradition and the concomitant switch from Middle Indo-Aryan to Sanskrit for the transmission of Buddhism is discussed by Salomon (2008) and Strauch (2012). The decipherment of the rare Bhaikṣukī (or Arrow-Headed) script of the Transitional period has been completed with the help of a manuscript discovered in Nepal (Dimitrov 2010). The Shell Script remains one of the undeciphered writing systems of South Asia (Salomon 1987). With the rise of writing-system studies as a part of modern linguistics, new analyses and questions were brought to bear on the South Asian writing systems. The traditional typology of writing systems as either alphabetic, syllabic, or logographic could not accommodate Kharoṣṭhī and the Brāhmī-derived scripts, and a fourth type — called “alphasyllabic” or “abugida” — was defined for the purpose (Bright 1994: 323–324; Bright 1999; Salomon 2000; Coulmas 2003: 131–150; Swank 2008). 
Building on the formal analysis of Sproat (2006a), Weingarten (2011) suggests that the typological category of the South Asian scripts varies with the specific feature examined and the perspective (semasiological or onomasiological) adopted.

9.3.2. Script and literacy

Modern studies of script acquisition and literacy in South Asia can be traced back to development studies and the “comparative reading” approach of the 1970s and 1980s (Oommen 1973; Malmquist 1982; Hladczuk & Eller 1987). Examples of recent work are Karanth & Suchitra 1993 on Hindi and Kannada, Patel 1995 on Gujarati, Prakash & Joshi 1995 on Kannada, and the 2004 special issue of the journal Reading and Writing on ‘reading and writing in semi-syllabic [= alphasyllabic] scripts’ with contributions like Vasanta 2004 (Telugu), Gupta 2004 (Hindi), Karanth, Mathew & Kurien 2004 (Kannada), and Chengappa, Bhat & Padakannaya 2004 (Hindi and Kannada). The question of orality and literacy and their relative spheres of application in South Asia is addressed by Bright (1990: 130–146); Patel (1993) (on ancient South Asia; cf. v. Hinüber 1990); Glück (1994: 741–742) (on the relationship of literacy and diglossia); Jain (2003: 50–53); and Agnihotri (2008). The effects of South Asian “multigraphism”, “digraphia”, or “bi-literacy”, i.e. the use of more than one script by an individual, are studied by Ferguson (1978); Pederson (2003) (Tamil-English biliterate readers have more precise shape recognition than monoliterate readers); Prakash et al. (1993) (Kannada-English and Hindi-English readers perform better at phonemic segmentation); Wali et al. (2009); Unseth (2005: 36–37); Vaid (1995) (script directionality influences production and perception of non-linguistic shapes).

9.3.3. South Asian scripts in Unicode

After birch bark, palm leaf, and paper, the South Asian writing systems are now undergoing a further transition to digital encoding as one of their primary media. The Unicode Consortium (representing industry, academic, and governmental interests) is responsible for defining the universal digital coding system of the world’s scripts (Unicode Consortium 1991–; Baums 2006: 111–116), and the first version of the Unicode standard (1.0, released in October 1991) provided support (modelled on the 1988 ISCII standard) for the nine major modern indigenous scripts of India: Devanagari, Bengali, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, and Telugu. Eight years later (3.0, September 1999) the standard added support for Sinhala and Thaana, completing coverage of the major modern indigenous scripts of South Asia. Further South Asian scripts were added as follows: in March 2005 (4.1), Kharoṣṭhī and Syloti Nagri (a nineteenth-century modification of the Bengali script for the Syloti dialect); in April 2008 (5.1), Saurashtra (developed in the nineteenth century on the basis of Gujarati and Oriya, and used by Gujarati immigrants to southern India) and Ol Chiki (an alphabet developed in 1925 by Raghunath Murmu for the Munda language Santali; Zide 1996: 612); in October 2009 (5.2), Kaithi (a cursive form of Nagari used by traders from the 16th to the early 20th century) and Meetei Mayek (developed for the Tibeto-Burman language Manipuri and used until the early 18th century); in October 2010 (6.0), Brāhmī; and in January 2012 (6.1), Śāradā, Takri, and Sorang Sompeng (an alphabet invented in 1936 by Mangei Gomango for the Munda language Sora; Zide 1996: 613). The Unicode standard thus currently contains support for six historical scripts, the eleven major modern indigenous scripts, and four modern minority scripts. 
The most urgent desiderata of the standard are coverage and usage guidelines for the remaining historical scripts, informed by a proper historical classification and meeting the practical needs of the scholarly community, and the implementation of software support for all covered scripts.
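The release chronology described in this section can be tabulated to check the closing tallies. The following Python sketch is only an illustration: the version numbers, dates, and script names are taken from the paragraph above, whereas the labelling of each later addition as "historical" or "minority" is our own reading of the text, and the names `RELEASES`, `encoded_by`, and `tally` are hypothetical.

```python
from collections import Counter

# (version, release date, [(script, category), ...]) as given in the text.
# The category labels for the post-3.0 additions are an assumption made
# here so that the totals quoted in the text can be checked.
RELEASES = [
    ((1, 0), "October 1991", [(name, "major") for name in (
        "Devanagari", "Bengali", "Gujarati", "Gurmukhi", "Kannada",
        "Malayalam", "Oriya", "Tamil", "Telugu")]),
    ((3, 0), "September 1999", [("Sinhala", "major"), ("Thaana", "major")]),
    ((4, 1), "March 2005", [("Kharoṣṭhī", "historical"),
                            ("Syloti Nagri", "minority")]),
    ((5, 1), "April 2008", [("Saurashtra", "minority"),
                            ("Ol Chiki", "minority")]),
    ((5, 2), "October 2009", [("Kaithi", "historical"),
                              ("Meetei Mayek", "historical")]),
    ((6, 0), "October 2010", [("Brāhmī", "historical")]),
    ((6, 1), "January 2012", [("Śāradā", "historical"),
                              ("Takri", "historical"),
                              ("Sorang Sompeng", "minority")]),
]


def encoded_by(version):
    """Return the scripts listed above that were encoded up to `version`."""
    return [script
            for released, _, scripts in RELEASES if released <= version
            for script, _ in scripts]


def tally():
    """Count the listed scripts per category."""
    return Counter(category
                   for _, _, scripts in RELEASES
                   for _, category in scripts)


if __name__ == "__main__":
    for (major, minor), date, scripts in RELEASES:
        names = ", ".join(script for script, _ in scripts)
        print(f"{major}.{minor} ({date}): {names}")
    print(dict(tally()))
```

Under this grouping the table yields six historical, eleven major modern, and four minority scripts, matching the totals stated in the text; a different assignment of, say, Meetei Mayek would shift the first and last figures.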


9.3.4. The Indus inscriptions

The inscriptions of the “Indus” or “Harappan” civilization (Wheeler 1968; Kenoyer 1998) on seals and other objects have long been considered the earliest — and as yet undeciphered — writing system of South Asia (Possehl 1996 provides a good overview). A precursor to the sign system of the seal inscriptions were potters’ marks, used by the people of the early Indus civilization from the fourth millennium BCE to about 2600 BCE. Then, after a relatively short transition period, the fully developed Indus sign system came into being by 2500 BCE. It disappeared with the decline of the Indus civilization around 1900 BCE (1700 BCE in its southern outposts in Maharashtra). The overall inspiration for the development of the Indus sign system may have come from the Indus people’s western trade partners in Mesopotamia (trade relations are attested by around 40 Indus seals found in the Near East). The Indus civilization did not, however, borrow the Mesopotamian cuneiform, but invented their own sign system. Some of the shapes of the Indus signs appear to point back to the earlier potters’ marks. More than 4,000 inscriptions in the Indus sign system are known today, most of them seal inscriptions, some on amulet tablets and pottery. The corpus of known Indus inscriptions is catalogued by Joshi & Parpola (1987), Shah & Parpola (1991), and Parpola, Pande & Koskikallio (2010), building on earlier work by Mahadevan (1977). It is reasonable to suppose that the referents of the seal inscriptions are similar to those of Mesopotamian seals: items of merchandise and the names of owners and titles of office, often incorporating the names of gods. The average length of the inscriptions is just five signs, ranging from single-sign inscriptions to an untypically long 28-sign inscription on the sides of a prismatic amulet. The total number of different signs in these texts is roughly 400. 
If the Indus sign system does in fact represent a full writing system (see below), then this relatively high number together with the pictorial nature of most signs would point to a logographic writing system. That the number of signs is not even larger may have to do with the specialized nature of the texts. Since the first publication of an Indus seal in 1875 (Cunningham 1875: 108, pl. XXXIII), many attempts have been made to decipher the Indus inscriptions as a writing system, but none of them is fully convincing. Parpola (1994, 1996) summarizes earlier attempts and discusses the challenges to interpreting the Indus inscriptions. Our understanding is hampered by the brevity of the texts, the absence of parallel texts in another writing system, and our ignorance of the language (if any) that is used in the Indus inscriptions. Beyond that, many proposed decipherments also suffer from methodological weakness. Equating Indus signs with similar-looking characters in other ancient writing systems is, for instance, a doubtful procedure due to the arbitrariness of similarity judgments and the fact

that different writing systems tend to use the same basic geometric shapes for different purposes. More promising are attempts starting from distributional criteria. Using these, it has been argued that a certain class of signs probably represents suffixes or phonetic/semantic determinatives. Another class of characters, consisting of groupings of vertical lines, probably represents the numerals of the language iconically. These numeral signs typically precede a limited class of other signs, probably denoting the things being counted; if the inscriptions do write a language, then the position of the numerals in front of their headword would give a typological clue to the type of language represented. (It seems relatively certain, from the stroke order of characters and their spacing in lines, that the Indus inscriptions were incised from right to left.)

The strongest contender for a language underlying the Indus inscriptions is Dravidian, as suggested in the work of Yurii Knorozov (e.g. 1965), Asko Parpola (e.g. 1994, 1996, 2008), and Iravatham Mahadevan (e.g. 1977, 2003). There are pockets of speakers of Dravidian in Northern India and Baluchistan, indicating an area of use much larger in ancient times than today; this is confirmed by the presence of Dravidian loan-words in the Ṛgveda, composed in the Indus region in the second millennium BCE.3 Some of the distributional patterns of Indus signs are also reminiscent of patterns of homophony in Dravidian languages. It is a priori less likely that the Indus inscriptions record an early Indo-Aryan dialect (recent suggestions to this effect include Rao 1982 and Jha & Rajaram 2000), since we have no evidence that Indo-Aryan was used in South Asia when the inscriptions were produced.
A radically different approach to the interpretation of the Indus inscriptions has recently been introduced by Farmer, Sproat, and Witzel (2004), who argue that the Indus inscriptions reflect a non-linguistic symbolic system. They adduce the inscriptions of the southeast European Vincha complex (38–39) and the Near Eastern emblems used, e.g. on boundary stones (39–40), as ancient parallels for such non-linguistic symbolic systems, adding the medieval Scottish heraldic system as a later example (27–28). In their interpretation, the function of the Indus inscriptions was not the conveying of linguistic messages, but the association of important natural, supernatural, and social entities (40–43) in a cohesive ideological system operating in a dispersed multilingual population (45). Contrary to traditional interpretations of the Indus inscriptions, even number signs may not have served accounting purposes, and may sometimes have been used metaphorically (e.g. to refer to numeric sets of deities; 41–42). From the archeological absence of longer inscriptions on nonperishable material and of writing utensils, Farmer, Sproat, and Witzel argue further not only that the known Indus inscriptions are non-linguistic, but that the Indus civilization had no writing system at all.

3 See also Section 2.3 for a different view, and 1.6.1.2 for recent Dravidologist views on Brahui.


In a rejoinder to Farmer, Sproat, and Witzel’s suggestion, Parpola (2008) continues to maintain a linguistic (and most likely Dravidian) interpretation of the Indus inscriptions. He particularly stresses that the alternation of different number signs before an identical non-number sign does point to counting (116); that cotton cloth was one of the main trade goods of the Indus civilization, and according to Nearchus was used for writing in the 4th century BCE, but that no ancient specimens of cloth are preserved from the Indus area, making it less unlikely that longer Indus texts on perishable material existed (117); that brushes were evidently used for inscribing pots and possibly also for manuscripts (118–119); and that seal impressions have been found on clay tags that were probably attached to merchandise, indicating a commercial use of some Indus seals after all (122). The work of Yadav and Vahia (2011) illustrates the ongoing formal analysis of the Indus inscriptions on the assumption that they do represent a writing system. While no scholarly consensus on the status of the inscriptions as writing or non-linguistic symbols and (if the former) on the language of the Indus civilization has yet emerged, the arguments have reached a level of refinement and methodological reflection that will hopefully result in a clearer definition of the possible scope and limits of our knowledge.

9.4. Perso-Arabic adaptations for South Asian languages
By Elena Bashir

9.4.1. Early adaptations

The Arabic script consists of 28 consonant letters, three of which can indicate both consonants and long vowels, short vowels being (optionally) indicated with diacritics. Like scripts derived from it, it is cursive and has no lower-/upper-case distinction. When the Arabic script was adopted for other, non-Semitic languages, various kinds of modifications became necessary.4 For writing Persian, the 28 original Arabic characters were supplemented by the addition of ژ, چ, گ, and پ for [ž], [č], [g], and [p], respectively. Later, Urdu required representing the phonological distinctions between retroflex and dental and between aspirated and unaspirated consonants, and a unique final /ē/ (ے) to indicate grammatical distinctions. Parvez (1996: 15) notes that these early extensions of the Arabic script exhibited partial systematicity, i.e. three dots below to indicate voiceless sounds, e.g. پ [p], چ [č]. Not all voiceless sounds, however, have three dots below, e.g. ت [t]. In Urdu, the representation of retroflexion by a small ط diacritic above, and of aspiration by a digraph consisting of stop consonant + ھ, is consistent.
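The two Urdu devices mentioned above differ in how they surface in encoded text, and a short Python sketch can make this concrete. It is an illustration under stated assumptions rather than a description of any particular software: the code points are our identification of the letters cited in the paragraph, the character names come from the Unicode database shipped with Python, and the helper names `describe` and `aspirate` are hypothetical.

```python
import unicodedata

# Letters added to the 28 Arabic letters for writing Persian, keyed by the
# sound given in the text. The code points are our own identification.
PERSIAN_ADDITIONS = {"ž": "\u0698", "č": "\u0686", "g": "\u06AF", "p": "\u067E"}

# ھ, the second element of the Urdu aspirate digraphs (assumed code point).
ASPIRATION_MARK = "\u06BE"

# ٹ, an Urdu retroflex letter: encoded as one character, small ط included.
RETROFLEX_T = "\u0679"


def describe(letter):
    """Return the code point and Unicode character name of one letter."""
    return f"U+{ord(letter):04X} {unicodedata.name(letter)}"


def aspirate(stop):
    """Spell an aspirated stop as the two-character digraph stop + ھ."""
    return stop + ASPIRATION_MARK


if __name__ == "__main__":
    for sound, letter in PERSIAN_ADDITIONS.items():
        print(f"[{sound}] {describe(letter)}")
    # Retroflexion: a single code point with no canonical decomposition.
    print(describe(RETROFLEX_T), repr(unicodedata.decomposition(RETROFLEX_T)))
    # Aspiration: a sequence of two code points (here ب + ھ for [bh]).
    print([describe(ch) for ch in aspirate("\u0628")])
```

In other words, the retroflex diacritic is part of the encoded letter, whereas aspiration remains a sequence of two letters, so any software that counts "letters" in Urdu text has to decide how to treat the digraphs.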

4 See 9.2.8 above for discussion of the Arabic-based script used for Dhivehi.

Panjabi has been written in the Perso-Arabic script (“Shahmukhi”) since the 12th century, using the same set of characters as Urdu. Panjabi, however, is a tone language, while Urdu is not; original voiced aspirates have changed in Panjabi to either voiceless unaspirated or voiced unaspirated stops, depending on their position. Syllable-initial Urdu voiced aspirated stop letters represent Panjabi voiceless stops with low tone on the following vowel, e.g. [pā̀bī] ‘brother’s wife’. Word-final Shahmukhi voiced aspirate letters indicate unaspirated voiced stops with high tone on the preceding vowel, e.g. [sā́d] ‘holy man, ascetic’. The two other letters representing consonant /h/, ہ and ح, also indicate tone in Panjabi, e.g. [pṓ] ‘tenth month of Bikrami calendar’, or [nikā́] ‘Muslim marriage contract’. Urdu also lacks /ṇ/ and /ḷ/. Panjabi’s phonemic /ṇ/ and /ḷ/ are still not uniquely represented in the Perso-Arabic script used for Panjabi. Various proposals for representing them in Unicode are under discussion (Malik 2005).

Kashmiri has been written in the Arabic script since the 15th century. The consonants are the same as those used for Urdu, but the vowel symbols have been considerably augmented by various diacritics to represent Kashmiri vowels not found in Urdu (Koul 1995). The orthography is still not standardized.

The earliest known Pashto manuscript is dated to the mid-17th century, but it is not known when the current standard orthography was adopted (MacKenzie 1997). In the current standard, retroflexion is marked with a ring attached to the body of the letter, e.g. ډ /ḍ/, and there are two “versatile” consonants, ږ and ښ for voiced and voiceless fricatives respectively, which are pronounced [ẓ], [γ̌], or [γ] (voiced retroflex fricative, voiced palato-velar fricative, or voiced velar fricative); and [ṣ], [x̌], or [x] (voiceless retroflex fricative, voiceless palato-velar fricative, or voiceless velar fricative), respectively, according to a speaker’s dialect (MacKenzie 1959: 232). Some orthographic differences exist between Afghan and Pakistani Pashto; for Afghan Pashto see Penzl 1954. Mirdehghan 2010 compares the representations of consonantal and vocalic sounds in Persian, Urdu, and Pashto and their orthographic systems.

The present Sindhi script, including 52 characters and seven diacritics, was instituted by the British in the 1850s. It represents aspiration and retroflexion inconsistently, e.g. ڍ [ḍh] but ڀ [bh], and ڊ [ḍ], but ٽ [ṭ]; three of the implosive consonants are represented consistently with two vertical dots (ٻ [ɓ], ڄ [ʄ], and ڳ [ɠ]), but the fourth, ڏ [ɗ̣], is not. Khubchandani (2003: 635) gives a complete list of Sindhi letters. All the Sindhi characters in use have Unicode code points.5

Though Balochi has been written since at least 1873 (Jahani 1989: 23), and though questions of orthography and spelling have been hotly debated for years, ‘no one orthography has won general acceptance among the Balochi cultural elite’ (Jahani & Korn 2009: 638). Jahani (1989) discusses the historical, political, and

5 For an early discussion of the introduction of Unicode for languages of Pakistan, especially Sindhi, see Bhurgri MS.


linguistic issues in great detail. Unique developments in Balochi orthography are morphophonemic symbols, for instance the use of ءَ for the oblique singular case ending /ā/ and ءِ for the genitive suffix /əy~ī/ (Barker & Mengal 1969, Vol 2: 9, 37–39). An attempt was made in the early 1990s to introduce four new hybrid “cross-dialectal” symbols that could represent the dialectal variation between Eastern and Western Balochi in [f/p(h)], [γ/g], [x/k], and [θ/t], respectively, and which would function like the Pashto “versatile” letters ښ and ږ (Balochistan Textbook Board 1989, Barker & Mengal 1969, Vol 2: 8). Speakers of Eastern dialects would pronounce the first (fricative) variant, while speakers of other dialects would say the second (stop) variant. This interesting proposal, however, has not survived.

Brahui began to be written in the 19th century, but literary production received impetus after Independence in Quetta (Elfenbein 1983: 107). Since the 1960s it has been written with the same Urdu-style symbol set as Balochi, but with the addition of ڷ (U+06B7) to represent the voiceless lateral fricative [ɬ] (Balochistan Textbook Board 1991). In 2008 a Brahui Language Board was established; one of its tasks was to be redesigning the script for Brahui. One of their efforts can be seen at https://sites.google.com/site/brahuilb/ (this orthography is different from that in Balochistan Textbook Board 1991). Fonts and keyboard layouts have now been developed for Brahui, facilitating its use in the modern world.

Khowar has been written, using Persian (later Urdu) orthography, since the 17th century, first in a mixture of Persian and Khowar (Bashir 2006). In the early 20th century, symbols were devised by Prince Hussam ul-Mulk and his son Samsam ul-Mulk (ul-Mulk, n.d.) for the Khowar consonant sounds not found in Persian or Urdu: /ṣ/, /c̣/, /j̣/, and /ẓ/.
These symbols have remained in use, and are now encoded in the Unicode Standard as ݰ (U+0770), ݯ (U+076F), ݮ (U+076E), and ݱ (U+0771). Letters for /ts/ and /dz/ were already encoded as څ (U+0685) and ځ (U+0681), respectively (www.unicode.org/charts). Buddruss (1982) analyzes the considerable orthographic variation obtaining at the time of his writing; it seems that with increasing writing and publication in Khowar the orthography is slowly moving in the direction of standardization.

9.4.2. Recent adaptations

Several recent adaptations reflect modern linguistic analysis. These include those for Torwali, Kalasha, Burushaski, Wakhi, Shina, Gawri, and Pashai. Many northwestern languages have phonemic tone (Baart 2003), which is only sometimes represented. Panjabi (though indirectly), Burushaski, and Kohistani Shina (Schmidt & Kohistani 1995) represent tone in their writing systems, using different techniques. Debate on whether and how to represent tone in other varieties of Shina and in Khowar continues. The advent of Unicode, coinciding with increased concern about language endangerment and language documentation, has given a new impetus to work

on developing scripts for previously unwritten languages. Recently new Unicode characters were proposed for Khowar, Torwali, and Burushaski (Bashir, Hussain & Anderson 2006) and have been added to the Arabic Supplement code page.

9.4.2.1. Shina

The first published attempt at accurately representing Shina’s phonology was by Namus (1961: 28–29), who introduced eight new consonant symbols and a system of vowel diacritics indicating four degrees of length: slight, short, normal, and long. Zia (1986, 2010) and Taj (1989) have since followed, each with a different proposal. Since the beginning, debates on (Gilgit) Shina orthography have continued to focus on the questions of whether or not it is necessary to represent vowel length and phonemic tone. Schmidt & Kohistani (1995 ms: 5) discuss the analysis underlying their work on an orthography for Kohistani Shina, focusing on the question of representing tone. They conclude that, ‘It is possible to use length to predict stress and the occurrence of tone, and this approach is more appropriate to the consonant-rich Arabic orthography.’ The article includes a list of symbols used in their scheme.6 Buddruss 1983 is a history of the development of writing in Shina.

9.4.2.2. Burushaski

Burushaski began to be written in the 20th century through the efforts of Allama Nasir ud-Din Nasir Hunzai. Some early publications employed roman representations (e.g. Hunzai n.d.); recently, however, a Perso-Arabic representation is being employed by the Burushaski Research Academy in their dictionary project (Burushaski Research Academy 2006, 2009, 2014). A list of characters used and their phonetic values can be seen in Burushaski Research Academy 2011. The vowel symbols do represent tone.

9.4.2.3. Saraiki

Development of a specifically Saraiki writing system has begun within the last quarter century.
Both Saraiki and Sindhi have four voiced implosives, [ɓ], [ʄ], [ɗ], and [ɠ], usually represented as ٻ, ڄ, ݙ, ڳ, respectively, although these representations are not yet entirely standardized. The existing Sindhi characters for

6 This paper was subsequently published in Israr-ud-Din (ed.) 2008, Proceedings of the Third International Hindu Kush Cultural Conference, 283–287, Karachi: Oxford University Press. However, the published version introduces errors not present in the original 1995 manuscript. The reader is advised to consult the original, at http://www.hf.uio.no/ikos/english/research/projects/shina/publications/Schmidt%20and%20Kohistani%202008-original-1.pdf.


these sounds are not employed in the same way. Several suggestions for other representations are current, but none has won acceptance. Some of these can be seen in Shackle 2003: 598. The Unicode character ݨ (U+0768) has been included for Saraiki and Potohari [ṇ]. Rasoolpuri 1976 is a short history of Saraiki orthography.

9.4.2.4. Kalasha

Several different schemes for writing Kalasha have been put forward, some employing roman script and others Perso-Arabic. Trail & Cooper 1999 is a dictionary using a system devised by its authors in collaboration with the Kalasha community. Following the Urdu convention for representing retroflex consonants, it represents retroflex vowels also with a small diacritic ط above the vowel symbol. Several unique consonant symbols have also been devised, but to my knowledge, these have not yet been incorporated in Unicode. Heegård 2000 discusses the interplay of political and linguistic factors in the designing of alphabets for this language.

9.4.2.5. Torwali

Inam Ullah (2004) reports on his work on developing a writing system for Torwali. His system reflects phonological analysis and distinguishes the retroflex sibilants and affricates and the low front vowel أ [æ] (see Inam Ullah n.d.). His Torwali-Urdu dictionary (Inam Ullah 2010) and online Torwali dictionary (http://www.cle.net.pk/otd/) employ this system.

9.4.2.6. Gojri

Losey 2002 is a phonological analysis of Gojri, done to provide a foundation for script development efforts; it includes discussion of the representation of tone and a list of the characters used and their phonemic values.

9.4.2.7. Gawri

Work on developing a script and orthography for this language, spoken in the upper reaches of the Swat Valley, has made rapid progress since 1995, when a Kalam-based Spelling Committee was established. Linguistic considerations and conventions for indicating consonant sounds not present in Urdu, tone, and vowel length are discussed in Baart & Sagar n.d.: 8–10.
Sagar 2008 discusses literacy efforts and publications using the Perso-Arabic script variant adopted. The Gawri characters can also be viewed in Inam Ullah n.d.

9.4.2.8. Pashai

Pashai has until very recently been unwritten. Current efforts to create a standard orthography draw from both the locally-organized Darrai Nur Language Committee (DNLC) and the Minority Language Committee at the Afghan Ministry of Education. Both suggested orthographies are based on the Pashto script (Perso-Arabic), and use the same notation for retroflex consonants. The Darrai Nur Language Committee’s orthography is based on a phonemic analysis of Pashai, and is intended for adult literacy training, as opposed to the Ministry’s version, which retains historical spellings for borrowed words. A new character adopted for Pashai is ڵ (U+06B5) to represent [ɬ] (Rachel Lehr p.c. 3 Dec. 2014; see also Yun 2003). Lamuwal & Baker 2013 contains two sample texts of a well-known folk tale, one in the DNLC orthography and the other in the orthography developed by the Ministry of Education.

9.4.2.9. Wakhi

Roman/IPA (Ali 1980) and Perso-Arabic (Sakhi 2000) script alternatives have been advocated for Wakhi in Pakistan. In Pakistan, the roman/IPA approach seems to be predominating, while in Afghanistan Perso-Arabic Pashto orthography has been adopted, and in Tajikistan Cyrillic is used (Mock 1998: 36–37). Beg, Mock & Wakhani 2014 is a detailed discussion of recent developments in orthography debates, and of computer fonts and keyboards for Wakhi.

9.4.2.10. Balti

Starting in 727 CE, when Baltistan was conquered by the Tibetans, writings in Balti, a Tibeto-Burman language, were in the Tibetan script; in the 16th century the Perso-Arabic script was introduced (https://baltistaan.wordpress.com/category/history/). Several orthographies have been employed for (contemporary) Balti. Sprigg (1996) describes his development of a roman-based orthography and dictionary for Balti, the phonological representations in which are intended to facilitate comparison with Classical Tibetan. This effort culminated in Sprigg’s (2002) dictionary.
Recently there has been a local initiative by literary scholars and social activists working through the Baltistan Cultural Foundation to revive the Tibetan script for Balti, in an attempt to preserve indigenous Balti culture and ethnic identity.7 See Kazmi 1996 and Khan 2000 for discussion of these issues. The Baltistan Cultural Foundation has published a Tibetan-script primer for Balti, and has encouraged the use of this script on local signboards.8 As of 2002, however, the Perso-Arabic alternative seemed to have retained its dominant position. National Language Authority 2002 is a Perso-Arabic Balti primer, which includes five non-Urdu characters. Chitrali 2004 is another, privately-published, Perso-Arabic primer, which includes another six non-Urdu characters.

7 On the request of local activists, the September 2006 meeting of ISO/IEC 10646 WG2 agreed to encode two new characters in the Tibetan block, 0F6B Tibetan letter KKA and 0F6C Tibetan letter RRA, in order to facilitate writing Urdu loanwords used in modern Balti using Tibetan script (https://baltistaan.wordpress.com/category/history/).

9.4.3. Diverse representations

Several languages have phonetically similar or identical sounds, but their speakers, presumably wanting to maintain cultural uniqueness, have chosen to use separate characters for them. For example, the retroflex voiceless sibilant [ṣ] is found in several languages of northwestern Pakistan, but it is represented differently in each of them: Kohistani Shina uses the basic shape for س with two short horizontal lines above it (no Unicode code point yet); Khowar uses ݰ (U+0770); Burushaski uses ݽ (U+077D); Torwali uses ݜ (U+075C); Kalasha uses ش with a small ط diacritic above it (no Unicode code point yet); and Gawri uses ݭ (U+076D).

9.5. New research areas and desiderata
By Elena Bashir

Currently research on writing systems is very active. According to Sproat (2000: 127), ‘The question of what kinds of linguistic elements written symbols represent is the single most investigated issue in the study of writing systems.’9 Current research moves past earlier classifications like the “deep : shallow” distinction and asks other, new questions.10 Veldhuis & Kurvers 2012 raises the question of how the acquisition of writing affects language processing (in the brain); and Banga et al. 2012 asks whether knowledge of the relationship between speech and writing in one language influences understanding of that relationship in another. This question is particularly relevant for South Asia, where common scripts are shared by languages with many layers of historical accretion, convergence, and divergence.

8 Pandey 2010 is a proposal to the Unicode Consortium recommending yet another script for writing Balti.

9 A conference, Signs of Writing: The Cultural, Social and Linguistic Contexts of the World’s First Writing Systems, was held at the University of Chicago on 8–9 November 2014.

10 Sproat 2000: 128–144 contains an overview of various taxonomies of writing systems and many references.

Orthographic conventions like spelling, even when generally agreed upon, are often not consistently applied. Spelling and punctuation variation are fertile fields for study, especially as computational approaches to language processing and analysis make them more feasible. Other emerging fields of research are the interesting use of roman to represent South Asian languages in email and text messaging, and the representation of code mixing and code switching in writing; see Sebba, Mahootian & Jonsson (eds.) 2012. What may such roman and/or mixed representations reveal about users’/writers’/speakers’ understandings of their languages and of themselves? See Section 9.3 above for discussion of three other important areas of current research.

Bibliographical References

Agnihotri, Rama Kant 2008 Orality and literacy. In: Braj B. Kachru, Yamuna Kachru, and S. N. Sridhar (eds.), Language in South Asia, 271–284. Cambridge: Cambridge University Press.
Ali, Haqiqat 1980 xəkwōr zik [Wakhi language], Book 1. Gilgit: Wakhi Culture Association.
Asani, Ali 1987 The Khojkī script: A legacy of Ismaili Islam in the Indo-Pakistan subcontinent. Journal of the American Oriental Society 107: 439–449.
Baart, Joan L. G. 2003 Tonal features in the languages of northern Pakistan. In: Joan L. G. Baart and Ghulam Hyder Sindhi (eds.), Pakistani languages and society: Problems and prospects, 132–144. Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.
Baart, Joan L. G., and Muhammad Zaman Sagar n.d. The Gawri language of Kalam and Dir Kohistan. http://www.fli-online.org/documents/languages/gawri/gawri_introduction.pdf (accessed 5 December 2014)
Balochistan Textbook Board 1989 Baločī qāeda [Balochi primer]. Quetta: Qalat Publishers.
Balochistan Textbook Board 1991 Brāhūī qāeda [Brahui primer]. Quetta: Mahmud Stationers and Book Sellers.
Banga, Arina, Esther Hanssen, Robert Schreuder, and Anneke Neijt 2012 How subtle differences in orthography influence conceptual interpretation. Written Language & Literacy 15(2): 185–208.
Barker, Muhammad Abd-al-Rahman, and Aqil Khan Mengal 1969 A course in Baluchi, volumes I and II. Montreal: Institute of Islamic Studies, McGill University.
Bashir, Elena 2006 Indo-Iranian frontier languages. In: Iranica [online] http://iranica.com/articles/indo-iranian-frontier-languages-and-the-influence-of-persian (accessed 5 December 2014)

Bashir, Elena, Sarmad Hussain, and Deborah Anderson 2006 Proposal for characters for Khowar, Torwali and Burushaski. http://www.cle.org.pk/Publication/papers/2006/n3117.pdf (accessed 5 December 2014)
Baums, Stefan 2006 Towards a computer encoding for Brāhmī. In: Adalbert J. Gail, Gerd J. R. Mevissen, and Richard Salomon (eds.), Script and image: Papers on art and epigraphy, 111–143. (Papers of the 12th World Sanskrit Conference 11.1.) Delhi: Motilal Banarsidass.
Baums, Stefan 2009 A Gāndhārī commentary on early Buddhist verses: British Library Kharoṣṭhī fragments 7, 9, 13 and 18. University of Washington PhD dissertation. ProQuest Dissertations 0822059.
Baums, Stefan 2011 Indiske skrifter: Seglskrift, akṣaras og “arabertal”. In: Stig T. Rasmussen (ed.), Verdens skrifter, 239–263. København: Forlaget Vandkunsten.
Baums, Stefan 2014 Gandhāran scrolls: Rediscovering an ancient manuscript type. In: Jörg Quenzer and Jan U. Sobisch (eds.), Manuscript cultures: Mapping the field, 183–225. (Studies in Manuscript Cultures 1.) Berlin: De Gruyter.
Baums, Stefan, and Andrew Glass Ongoing Catalog of Gāndhārī texts. http://gandhari.org/catalog/ (accessed 12 December 2014)
Beg, Fazal Amin, John Mock, and Mir Ali Wakhani 2014 Recent developments in Wakhi orthography. Manuscript submitted for publication.
Bhurgri, Abdul-Majid MS Enabling Pakistani languages through Unicode. http://www.bhurgri.com/bhurgri/downloads/PakLang.pdf (accessed 5 December 2014)
Boyer, Auguste M., E. J. Rapson, E. Senart, and P. S. Noble 1920–1929 Kharoṣṭhī inscriptions discovered by Sir Aurel Stein in Chinese Turkestan. Oxford: Clarendon Press.
Braarvig, Jens (ed.) 2000 Buddhist manuscripts, volume I. (Manuscripts in The Schøyen Collection 1.) Oslo: Hermes Publishing.
Bright, William 1990 Language variation in South Asia. New York: Oxford University Press.
Bright, William 1994 Evolution of the Indian writing system. In: Hartmut Günther and Otto Ludwig (eds.), Schrift und Schriftlichkeit: Ein interdisziplinäres Handbuch internationaler Forschung, 1527–1535. (Handbücher zur Sprach- und Kommunikationswissenschaft 10.) Berlin: De Gruyter.
Bright, William 1999 A matter of typology: Alphasyllabaries and abugidas. Studies in the Linguistic Sciences 30(1): 63–71.
Bright, William, and Peter T. Daniels 1996 The world’s writing systems. New York: Oxford University Press.

Brough, John 1977 The arapacana syllabary in the old Lalitāvistara. Bulletin of the School of Oriental and African Studies 40: 85–95.
Buddruss, Georg 1982 Khowar-Texte in arabischer Schrift. (Akademie der Wissenschaften und der Literatur, Abhandlungen der Geistes- und Sozialwissenschaftlichen Klasse 1.) Wiesbaden: Steiner.
Buddruss, Georg 1983 Neue Schriftsprachen im Norden Pakistans: Einige Beobachtungen. In: Aleida Assmann, Jan Assmann, and Christof Hardmeier (eds.), Schrift und Gedächtnis: Beiträge zur Archäologie der literarischen Kommunikation, 231–244. München: W. Fink.
Burnell, Arthur Coke 1874 Elements of South-Indian palæography from the fourth to the seventeenth century A. D.: Being an introduction to the study of South-Indian inscriptions and MSS. 1st ed. Mangalore: Stolz & Hirner, Basel Mission Press.
Burnell, Arthur Coke 1878 Elements of South-Indian palæography from the fourth to the seventeenth century A. D.: Being an introduction to the study of South-Indian inscriptions and MSS. 2nd ed. London: Trübner & Co.
Burushaski Research Academy 2006 Burūšaskī-Urdū luγat, jild-e awwal [Burushaski-Urdu dictionary, vol. 1]. Karachi: Director, Bureau of Composition, Compilation & Translation, University of Karachi.
Burushaski Research Academy 2009 Burūšaskī-Urdū luγat, jild-e doam [Burushaski-Urdu dictionary, vol. 2]. Karachi: Director, Bureau of Composition, Compilation & Translation, University of Karachi.
Burushaski Research Academy 2011 Burushaski alphabets with examples. http://www.scribd.com/doc/31593787/Burushaski-Alphabets-With-Examples (accessed 5 December 2014)
Burushaski Research Academy 2014 Burūšaskī-Urdū luγat, jild-e soam [Burushaski-Urdu dictionary, vol. 3]. Karachi: Director, Bureau of Composition, Compilation & Translation, University of Karachi.
Cardona, George, and Dhanesh Jain (eds.) 2003 The Indo-Aryan languages. London/New York: Routledge.
Chengappa, S., S. Bhat, and P. Padakannaya 2004 Reading and writing skills in multilingual/multiliterate aphasics: Two case studies. Reading and Writing: An Interdisciplinary Journal 17: 121–135.
Chitrali, Rahmat Aziz 2004 Balti primer. Karachi/Chitral: Khowar Academy. http://www.scribd.com/doc/247229102/Balti-Language-Qaida-by-Rehmat-Aziz-Chitrali-Publishedby-Khowar-Academy-Chitral-Pakistan (accessed 4 December 2014)
Chrisomalis, Stephen 2010 Numerical notation: A comparative history. Cambridge: Cambridge University Press.

Coningham, Robin A. E., F. Raymond Allchin, Catherine M. Batt, and D. Lucy 1996 Passage to India? Anuradhapura and the early use of the Brahmi script. Cambridge Archaeological Journal 6: 73–97.
Coulmas, Florian 2003 Writing systems: An introduction to their linguistic analysis. Cambridge: Cambridge University Press.
Cunningham, Alexander 1875 Report for the year 1872–73, volume V. (Archæological Survey of India.) Calcutta: Office of the Superintendent of Government Printing.
Dani, Ahmad Hasan 1963 Indian palaeography. Oxford: Clarendon Press.
Dani, Ahmad Hasan 1986 Indian palaeography. 2nd ed. New Delhi: Munshiram Manoharlal Publishers.
Daniels, Peter T., and William Bright (eds.) 1996 The world’s writing systems. New York: Oxford University Press.
Das Gupta, Charu Chandra 1958 The development of the Kharoṣṭhī script. Calcutta: Firma K. L. Mukhopadhyay.
DeSilva, M. W. Sugathapala 1969 The phonological efficiency of the Maldivian writing system. Anthropological Linguistics 11: 199–208.
Dimitrov, Dragomir 2010 The Bhaikṣukī manuscript of the Candrālaṃkāra: Study, script tables, and facsimile edition. (Harvard Oriental Series 72.) Cambridge, MA: Department of Sanskrit and Indian Studies, Harvard University.
Einicke, Katrin 2009 Korrektur, Differenzierung und Abkürzung in indischen Inschriften und Handschriften. (Abhandlungen für die Kunde des Morgenlandes 68.) Wiesbaden: Harrassowitz.
Elfenbein, Josef 1983 The Brahui problem again. Indo-Iranian Journal 25: 103–132.
Falk, Harry 1993 Schrift im alten Indien: Ein Forschungsbericht mit Anmerkungen. (ScriptOralia 56.) Tübingen: Gunter Narr.
Falk, Harry, and Walter Slaje (eds.) 2000–2005 IndoSkript: Eine elektronische indische Paläographie. http://www.indologie.uni-halle.de/forschung/indoskript/ (accessed 12 December 2014)
Farmer, Steve, Richard Sproat, and Michael Witzel 2004 The collapse of the Indus-script thesis: The myth of a literate Harappan civilization. Electronic Journal of Vedic Studies 11(2): 19–57.
Ferguson, Charles A. 1978 Patterns of literacy in multilingual situations. In: James E. Alatis (ed.), International dimensions of bilingual education, 582–590. Washington: Georgetown University Press.
Gair, James W., and Bruce D. Cain 1996 Dhivehi writing. In: Daniels & Bright (eds.) 1996: 564–568.
Geiger, Wilhelm 1919 Máldivian linguistic studies. (Journal of the Ceylon Branch of the Royal Asiatic Society 27.) Colombo: H. C. Cottle, Govt. Printer.

Glass, Andrew 2000 A preliminary study of Kharoṣṭhī manuscript paleography. University of Washington MA thesis.
Glück, Helmut 1994 Schriftlichkeit und Diglossie. In: Hartmut Günther and Otto Ludwig (eds.), Schrift und Schriftlichkeit: Ein interdisziplinäres Handbuch internationaler Forschung, 739–766. Berlin: De Gruyter.
Gupta, Ashum 2004 Reading difficulties of Hindi-speaking children with developmental dyslexia. Reading and Writing 17: 79–99.
Hartmann, Jens-Uwe 2000 Zu einer neuen Handschrift des Dīrghāgama. In: Christine Chojnacki, Jens-Uwe Hartmann, and Volker M. Tschannerl (eds.), Vividharatnakaraṇḍaka: Festgabe für Adelheid Mette, 359–367. Swisttal-Odendorf: Indica et Tibetica Verlag.
Heegård, Jan 2000 Linguistic and political aspects of alphabet-making for a threatened language. In: Carl-Erik Lindberg and Steffen Nordahl Lund (eds.), 17th Scandinavian Conference of Linguistics II, 161–176. (Odense Working Papers in Language and Communication 19.) Odense: Institute of Language and Communication, University of Southern Denmark.
Hinüber, Oskar von 1979 Die Erforschung der Gilgit-Handschriften: Funde buddhistischer Sanskrit-Handschriften I. (Nachrichten der Akademie der Wissenschaften in Göttingen, 1: Philologisch-historische Klasse, 327–360.) Göttingen: Vandenhoeck & Ruprecht.
Hinüber, Oskar von 1990 Der Beginn der Schrift und frühe Schriftlichkeit in Indien. (Abhandlungen der geistes- und sozialwissenschaftlichen Klasse 1989, 11.) Mainz: Akademie der Wissenschaften und der Literatur.
Hinüber, Oskar von 2001 Das ältere Mittelindisch im Überblick. (Österreichische Akademie der Wissenschaften, philosophisch-historische Klasse, Sitzungsberichte 467 / Veröffentlichungen der Kommission für Sprachen und Kulturen Südasiens 20.) 2nd ed. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Hladczuk, John, and William Eller 1987 Comparative reading: An international bibliography. (Bibliographies and Indexes in Education 4.) New York: Greenwood Press.
Hunzai, Allamah Nasiruddin Nasir n.d. Burūśo Birkiṣ [Treasures of the Burusho]. Hunza/Gilgit/Karachi: Burushaski Research Academy.
Inam Ullah 2004 Lexical database of the Torwali dictionary. Asia Lexicography Conference, Chiangmai, Thailand, 24–26 May, 2004. http://www.fli-online.org/documents/languages/torwali/Inam_paper_2004.pdf (accessed 5 December 2014)
Inam Ullah 2010 Torwālī-Urdū luγat [Torwali-Urdu Dictionary]. Lahore: Center for Research in Urdu Language Processing, National University for Computer and Emerging Sciences. http://www.cle.net.pk/otd/ (accessed 5 December 2014)

Inam Ullah n.d. Scripts and characters used in the languages of Northern Pakistan. http://www.cle.org.pk/IDN/download/NPLang-Charachters.pdf (accessed 5 December 2014)
Jahani, Carina 1989 Standardization and orthography in the Balochi language. (Acta Universitatis Upsaliensis, Studia Iranica Upsaliensia 1.) Uppsala: Uppsala University.
Jahani, Carina, and Agnes Korn 2009 Balochi. In: Gernot Windfuhr (ed.), The Iranian languages, 634–692. London/New York: Routledge.
Jain, Dhanesh 2003 Sociolinguistics of the Indo-Aryan languages. In: Cardona & Jain (eds.) 2003: 391–443.
Janert, Klaus Ludwig 1972 Abstände und Schlußvokalverzeichnungen in Aśoka-Inschriften: mit Editionen und Faksimiles in 107 Lichtdrucktafeln. (Verzeichnis der orientalischen Handschriften in Deutschland, Supplementband 10.) Wiesbaden: Steiner.
Jensen, Hans 1969 Die Schrift in Vergangenheit und Gegenwart. Berlin: Deutscher Verlag der Wissenschaften.
Jha, N., and Navaratna Srinivasa Rajaram 2000 The deciphered Indus script: Methodology, readings, interpretations. New Delhi: Aditya Prakashan.
Joshi, Jagat Pati, and Asko Parpola (eds.) 1987 Corpus of Indus seals and inscriptions 1: Collections in India. (Annales Academiae Scientiarum Fennicae B 239 / Memoirs of the Archaeological Survey of India 86.) Helsinki: Suomalainen Tiedeakatemia.
Karanth, Pratibha, and M. G. Suchitra 1993 Literacy acquisition and grammaticality judgments in children. In: Robert J. Scholes (ed.), Literacy and language analysis, 143–156. Hillsdale: Lawrence Erlbaum Associates.
Karanth, Pratibha, Anu Mathew, and Priya Kurien 2004 Orthography and reading speed: Data from native readers of Kannada. Reading and Writing 17: 101–120.
Kazmi, Syed Muhammad Abbas 1996 The Balti language. In: P. N. Pushp and K. Warikoo (eds.), Jammu, Kashmir & Ladakh: Linguistic predicament. New Delhi: Himalayan Research and Cultural Foundation/Har Anand Publications. http://koshur.org/Linguistic/7.html (accessed 4 December 2014)
Kenoyer, Jonathan Mark 1998 Ancient cities of the Indus civilization. Karachi: Oxford University Press.
Khan, Tarik Ali 2002 Little Tibet: Renaissance and resistance in Baltistan. http://www.phayul.com/news/article.aspx?id=325&t=0 (accessed 4 December 2014)
Khubchandani, Lachman M. 2003 Sindhi. In: Cardona & Jain (eds.) 2003: 622–658.

Knorozov, Yurii (ed.) 1965 Predvaritel’naja soobščenie ob issledovanii protoindijskix tekstov [Preliminary communication on the study of the proto-Indian texts]. Moskva: Akademia Nauk SSSR.
Koul, Omkar N. 1995 Standardization of Kashmiri script. In: Imtiaz Hasnain (ed.), Standardization and modernization: Dynamics of language planning, 269–278. New Delhi: Bahri.
Lamuwal, Abd-El-Malek, and Adam Baker 2013 Southeastern Pashayi. Journal of the International Phonetic Association 43(2): 243–246. http://journals.cambridge.org/action/displayJournal?jid=IPA (accessed 3 December 2014)
Losey, Wayne E. 2002 Writing Gojri: Linguistic and sociolinguistic constraints on a standardized orthography for the Gujars of South Asia. University of North Dakota MA thesis. http://www.fli-online.org/documents/languages/gojri/losey_thesis.pdf (accessed 3 December 2014)
MacKenzie, D. N. 1959 A standard Pashto. Bulletin of the School of Oriental and African Studies 22(1/3): 231–235.
MacKenzie, D. N. 1997 The development of the Pashto script. In: Shirin Akiner and Nicholas Sims-Williams (eds.), Languages and scripts of Central Asia, 137–143. London: School of Oriental and African Studies.
Mahadevan, Iravatham 1977 The Indus script: Texts, concordance and tables. (Memoirs of the Archaeological Survey of India 77.) New Delhi: Archaeological Survey of India.
Mahadevan, Iravatham 2003 Early Tamil epigraphy: From the earliest times to the sixth century A. D. (Harvard Oriental Series 62.) Chennai: Cre-A.
Malik, M. G. Abbas 2005 Towards a Unicode compatible Punjabi character set. 27th Internationalization and Unicode Conference, April 2005, Berlin. https://hal.inria.fr/file/index/docid/1002347/filename/mgam05-1.pdf (accessed 24 December 2014)
Malmquist, Eve (ed.) 1982 Handbook on comparative reading: An annotated bibliography and some viewpoints on university courses in comparative reading. Newark: International Reading Association.
Maurer, Walter H. 1976 On the name Devanāgarī. Journal of the American Oriental Society 96: 101–104.
Mirdehghan, Mahinnaz 2010 Persian, Urdu, and Pashto: A comparative orthographic analysis. Writing Systems Research 2(1): 9–23.
Mock, John Howard 1998 The discursive construction of reality in the Wakhi community of Northern Pakistan. University of California PhD dissertation. ProQuest Dissertations 9922976.

Namus, M. Shuja 1961 Gilgit aur šinā zabān [Gilgit and the Shina language]. Bahawalpur, Pakistan: Urdu Academy.
National Language Authority, Pakistan 2002 Balti qāeda [Balti primer]. Islamabad: National Language Authority.
Norris, Edwin 1846 On the Kapur-di-Giri rock inscription. The Journal of the Royal Asiatic Society of Great Britain and Ireland 8: 303–307.
Oommen, Chinna 1973 India. In: John A. Downing (ed.), Comparative reading: Cross-national studies of behavior and processes in reading and writing, 403–425. New York: Macmillan.
Pandey, Anshuman 2010 Introducing another script for writing Balti. http://www-personal.umich.edu/~pandey/ (accessed 4 December 2014)
Parpola, Asko 1994 Deciphering the Indus script. New York: Cambridge University Press.
Parpola, Asko 1996 The Indus script. In: Daniels & Bright (eds.) 1996: 165–171.
Parpola, Asko 2008 Is the Indus script indeed not a writing system? In: Airāvati: Felicitation volume in honour of Iravatham Mahadevan, 111–131. Chennai: Varalaaru.com.
Parpola, Asko, B. M. Pande, and Petteri Koskikallio (eds.) 2010 Corpus of Indus seals and inscriptions 3: New material, untraced objects and collections outside India and Pakistan 1: Mohenjo-Daro and Harappa. (Annales Academiae Scientiarum Fennicae B 359 / Memoirs of the Archaeological Survey of India 96.) Helsinki: Suomalainen Tiedeakatemia.
Parvez, Aslam 1996 The adaptation of Perso-Arabic script for Urdu, Panjabi, and Sindhi. New Delhi: Monumental Publishers.
Patel, Purushottam G. 1993 Ancient India and the orality-literacy divide theory. In: Robert J. Scholes (ed.), Literacy and language analysis, 199–208. Hillsdale: Lawrence Erlbaum Associates.
Patel, Purushottam G. 1995 Brahmi scripts, orthographic units and reading acquisition. In: Taylor & Olson (eds.) 1995: 265–275.
Pederson, Eric 2003 Mirror-image discrimination among nonliterate, monoliterate, and biliterate Tamil subjects. Written Language and Literacy 6(1): 71–91.
Penzl, Herbert 1954 Orthography and phonemes in Pashto (Afghan). Journal of the American Oriental Society 74(2): 74–81.
Possehl, Gregory L. 1996 Indus age: The writing system. Philadelphia: University of Pennsylvania Press.

Prakash, P., and R. Malatesha Joshi 1995 Orthography and reading in Kannada: A Dravidian language. In: Taylor & Olson (eds.) 1995: 95–108.
Prakash, P., D. Rekha, R. Nigam, and P. Karanth 1993 Phonological awareness, orthography, and literacy. In: Robert J. Scholes (ed.), Literacy and language analysis, 55–70. Hillsdale: Lawrence Erlbaum Associates.
Prinsep, James 1837 Note on the facsimiles of inscriptions from Sanchí near Bhilsa, taken for the Society by Captain Ed. Smith, Engineers. The Journal of the Asiatic Society of Bengal 6: 451–477.
Prinsep, James 1838 Additions to Bactrian numismatics, and discovery of the Bactrian alphabet. The Journal of the Asiatic Society of Bengal 7: 636–658.
Rao, S. R. 1982 The decipherment of the Indus script. Bombay: Asia Publishing.
Rasoolpuri, Aslam 1976 Sirāikī rasmulxat kī muxtasar tārīx [A short history of Siraiki orthography]. Multan: Bazme Saqafat.
Sagar, Muhammad Zaman 2008 A multilingual education project for Gawri-speaking children in northern Pakistan. 2nd International Conference on Language Development, Language Revitalization, and Multilingual Education in Ethnolinguistic Communities, Bangkok 1–3 July, 2008. http://www.seameo.org/_ld2008/doucments/Presentation_document/Gawri_presentation_Bangkok2008.pdf (accessed 24 December 2014)
Sakhi, Ahmad Jami 2000 waxī zabān – tārīx ke āene mẽ – mah qāeda [Wakhi language in the mirror of history, including a primer]. Gilgit: Ahmad Jami Sakhi, printed at Farman Printing Press.
Salomon, Richard 1987 A recent claim to decipherment of the “shell script”. Journal of the American Oriental Society 107: 313–315.
Salomon, Richard 1995 On the origin of the early Indian scripts. Journal of the American Oriental Society 115: 271–279.
Salomon, Richard 1998 Indian epigraphy: A guide to the study of inscriptions in Sanskrit, Prakrit, and the other Indo-Aryan languages. New York: Oxford University Press.
Salomon, Richard 1999 Ancient Buddhist scrolls from Gandhāra: The British Library Kharoṣṭhī fragments. Seattle: University of Washington Press.
Salomon, Richard 2000 Typological observations on the Indic script group and its relationship to other alphasyllabaries. Studies in the Linguistic Sciences 30: 87–103.

Salomon, Richard 2008 Whatever happened to Kharoṣṭhī? The fate of a forgotten Indic script. In: John Baines, John Bennet, and Stephen Houston (eds.), The disappearance of writing systems: Perspectives on literacy and communication, 139–155. London: Equinox Publishing.
Sander, Lore 1968 Paläographisches zu den Sanskrithandschriften der Berliner Turfansammlung. (Verzeichnis der orientalischen Handschriften in Deutschland, Supplementband 8.) Wiesbaden: Steiner.
Sander, Lore 2000 A brief paleographical analysis of the Brāhmī manuscripts in volume I. In: Jens Braarvig (ed.), Buddhist manuscripts, volume I, 285–300. Oslo: Hermes Publishing.
Schmidt, Ruth Laila, and Razwal Kohistani 1995 The mirror of writing. Third International Hindu Kush Cultural Conference, 26–30 August, 1995 in Chitral, Pakistan. Original and more accurate version at http://www.hf.uio.no/ikos/english/research/projects/shina/publications/Schmidt%20and%20Kohistani%202008-original-1.pdf (accessed 3 December 2014)
Sebba, Mark 2007 Spelling and society: The culture and politics of orthography around the world. Cambridge: Cambridge University Press.
Sebba, Mark, Shahrzad Mahootian, and Carla Jonsson (eds.) 2012 Language mixing and code-switching in writing: Approaches to mixed-language written discourse. New York/Abingdon, Oxon: Routledge.
Shackle, Christopher 2003 Panjabi. In: Cardona & Jain (eds.) 2003: 581–621.
Shah, Sayid Ghulam Mustafa, and Asko Parpola (eds.) 1991 Corpus of Indus seals and inscriptions 2: Collections in Pakistan. (Annales Academiae Scientiarum Fennicae B 240 / Memoirs of the Department of Archaeology and Museums, Government of Pakistan 5.) Helsinki: Suomalainen Tiedeakatemia.
Sircar, Dines Chandra 1942 Select inscriptions bearing on Indian history and civilization. Calcutta: University of Calcutta.
Sircar, Dines Chandra 1965a Indian epigraphy. Delhi: Motilal Banarsidass.
Sircar, Dines Chandra 1965b Select inscriptions bearing on Indian history and civilization, Volume I: From the sixth century B. C. to the sixth century A. D. 2nd ed. Calcutta: University of Calcutta.
Sircar, Dines Chandra 1966 Indian epigraphical glossary. Delhi: Motilal Banarsidass.
Sircar, Dines Chandra 1983 Select inscriptions bearing on Indian history and civilization, Volume II: From the sixth to the eighteenth century A. D. Delhi: Motilal Banarsidass.

Sprigg, Richard K. 1996 My Balti-Tibetan and English dictionary. The Tibet Journal 21(4): 3–22.
Sprigg, Richard K. 2002 Balti-English, English-Balti dictionary. London: Routledge Curzon.
Sproat, Richard 2000 A computational theory of writing systems. (AT&T Labs research report.) http://www.cslu.ogi.edu/~sproatr/newindex/wsbook.pdf (accessed 6 December 2014)
Sproat, Richard 2006a Brahmi-derived scripts, script layout, and segmental awareness. Written Language and Literacy 9(1): 45–65.
Sproat, Richard 2006b A computational theory of writing systems. Cambridge: Cambridge University Press.
Steinkellner, Ernst 2004 A tale of leaves: On Sanskrit manuscripts in Tibet, their past and their future. Amsterdam: Royal Netherlands Academy of Arts and Sciences.
Strauch, Ingo 2008 The Bajaur collection of Kharoṣṭhī manuscripts: A preliminary survey. Studien zur Indologie und Iranistik 25: 103–36.
Strauch, Ingo 2012 The character of the Indian Kharoṣṭhī script and the “Sanskrit revolution”: A writing system between identity and assimilation. In: Alex de Voogt and Joachim Friedrich Quack (eds.), The idea of writing: Writing across borders, 131–168. Leiden: Brill.
Swank, Heidi 2008 It all hinges on the vowels: Reconsidering the alphasyllabary classification. Written Language and Literacy 11: 73–89.
Taj, Abdul Khaliq 1989 Ṣiṇā qā’eda [Shina primer]. Gilgit: Usmani Kitabkhana.
Taylor, Insup, and David R. Olson 1995 Scripts and literacy: Reading and learning to read alphabets, syllabaries and characters. (Neuropsychology and Cognition 7.) Dordrecht: Kluwer.
Trail, Ron, and Gregory Cooper 1999 Kalasha dictionary – with English and Urdu. (Studies in Languages of Northern Pakistan 7.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics. http://www.fli-online.org/ (accessed 6 December 2014)
Ul-Mulk, Samsam n.d. Khovār qā’eda [Khowar primer]. Peshawar: Department of Publications, NWFP.
Unicode Consortium 1991– http://unicode.org/ (accessed 12 December 2014)
Unicode Consortium 2011 Arabic Supplement code page. http://www.unicode.org/charts/ (accessed 8 January 2015)

Unseth, Peter 2005 Sociolinguistic parallels between choosing scripts and languages. Written Language and Literacy 8: 19–42.
Vaid, Jyotsna 1995 Script directionality affects nonlinguistic performance: Evidence from Hindi and Urdu. In: Taylor & Olson (eds.) 1995: 295–310.
van Schaik, Sam 2011 A new look at the Tibetan invention of writing. In: Yoshiro Imaeda, Matthew T. Kapstein, and Tsuguhito Takeuchi (eds.), New studies of the Old Tibetan documents: Philology, history and religion, 45–96. Tokyo: Research Institute for Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies.
Vasanta, Duggirala 2004 Processing phonological information in a semi-syllabic script: Developmental data from Telugu. Reading and Writing 17: 59–78.
Veldhuis, Dorina, and Jeanne Kurvers 2012 Offline segmentation and online language processing units. Written Language & Literacy 15(2): 165–184.
Wali, Aamir, Richard Sproat, Prakash Padakannaya, and Bhuvaneshwari B. 2009 Model for phonemic awareness in readers of Indian script. Written Language and Literacy 12: 161–169.
Weingarten, Rüdiger 2011 Comparative graphematics. Written Language and Literacy 14: 12–38.
Wheeler, Mortimer 1968 The Indus civilization: Supplementary volume to the Cambridge history of India. 3rd ed. Cambridge: Cambridge University Press.
Yadav, Nisha, and M. N. Vahia 2011 Indus script: A study of its sign design. Scripta 3: 1–36.
Yun, Ju-Hong 2003 Pashai language development project: Promoting Pashai language, literacy and community development. Conference on Language Development, Language Revitalization and Multilingual Education Activities, Bangkok, Thailand, 6–8 November 2003. http://www.sil.org/asia/ldc/parallel_presentations.html (accessed 6 December 2014)
Zia, Mohammad Amin 1986 Ṣiṇā qāeida aur græmar [Shina primer and grammar]. Gilgit (Pakistan): Zia Publications.
Zia, Mohammad Amin 2010 Ṣiṇā-Urdū luγat [Shina-Urdu dictionary]. Gilgit (Pakistan): Zia Publications.
Zide, Norman 1996 Scripts for Munda languages. In: Daniels & Bright (eds.) 1996: 612–618.

Appendix — Sources and resources

The listings in this Appendix are intended to complement and extend the bibliographical information contained in the chapters of this volume. The purpose is to present a comprehensive overview of written and online sources and resources, with focus on Journals and periodicals; Bibliographies; Corpora, digital texts, and other online materials; Online dictionaries; Language endangerment and language preservation; General linguistic surveys; and Descriptions and handbooks on language families and individual languages, including Sign Language. While some important articles are included among the publications, the major focus is on edited volumes, monographs, and other monograph-size publications. It is hoped that this information can be put online after publication, with possibilities for online additions and updates. (URLs change rapidly and frequently; some items may therefore be more easily found by searching directly by title.)

Journals and periodicals

Acta Linguistica Hafniensia (University of Copenhagen)
Acta Orientalia (Academia Scientiarum Hungarica)
Annual of Urdu Studies (Department of Languages and Cultures of Asia, University of Wisconsin, Madison)
Annual Review of South Asian Languages and Linguistics (Mouton de Gruyter, Trends in Linguistics: Studies and Monographs, 2007, 2008, 2009, 2010, 2011, 2012; continuation of the Yearbook of South Asian Languages and Linguistics; see below)
Bulletin d’études indiennes (Association Française pour les Études Sanskrites)
Bulletin of the Central Institute of English (and Foreign Languages) (Central Institute of English and Foreign Languages, Hyderabad)
Bulletin of the Deccan College Post-Graduate and Research Institute (Deccan College, Pune)
Bulletin of the School of Oriental and African Studies (London)
Central Institute of English and Foreign Languages (Occasional) Working Papers in Linguistics (Central Institute of English and Foreign Languages, Hyderabad)
CIEFL Working Papers in Linguistics (= Central Institute of English and Foreign Languages (Occasional) Working Papers in Linguistics) (Central Institute of English and Foreign Languages, Hyderabad)
Electronic Journal of Vedic Studies (http://www.ejvs.laurasianacademy.com)
Himalayan Linguistics (https://escholarship.org/uc/himalayanlinguistics)
Indian Journal of Applied Linguistics (Delhi: Bahri)
Indian Journal of Linguistics (Calcutta)
Indian Linguistics (Linguistic Society of India)
Indo-Iranian Journal (Leiden: Brill)
International Journal of Dravidian Linguistics (Dravidian Linguistics Association, Thiruvananthapuram)


ISDL Working Papers in Linguistics (Dravidian Linguistics Association, Thiruvananthapuram)
Journal Asiatique (Leuven: Peeters)
Journal of Asian and African Studies / Ajia Afurika gengo bunka kenkyu (Tokyo: Tōkyō Gaikokugo Daigaku, Ajia Afurika Gengo Bunka Kenkyūjo)
Journal of South Asian Languages and Linguistics (Berlin: Mouton de Gruyter)
Journal of South Asian Linguistics (http://jsal-journal.org)
Journal of Tamil Studies (Madras/Chennai: International Institute of Tamil Studies)
Journal of the American Oriental Society (American Oriental Society)
Journal of the Oriental Institute, Baroda (Oriental Institute, Baroda)
Journal of the Pali Text Society (Pali Text Society; http://www.palitext.com/palitext/jours.htm#)
Journal of the Royal Asiatic Society (http://journals.cambridge.org/action/displayJournal?jid=JRA)
Journal of the Southeast Asian Linguistics Society (http://www.jseals.org)
Language in India (http://www.languageinindia.com)
Linguistics of the Tibeto-Burman Area (Amsterdam/Philadelphia: Benjamins)
Mon-Khmer Studies (http://www.mksjournal.org)
Nepalese Linguistics (Linguistic Society of Nepal)
Osmania Papers in Linguistics (Osmania University, Hyderabad)
Pakistaniaat: A Journal of Pakistan Studies (http://pakistaniaat.org/index.php/pak/)
PILC Journal of Dravidic Studies (Pondicherry Institute of Linguistics and Culture)
South Asian Language Review (http://journalsalr.com/home/)
South Asian Review (South Asian Literary Association)
Studia Iranica (Leuven: Peeters)
Studia Orientalia (Suomen Itämainen Seura, Finska Orientsällskapet, Finnish Oriental Society)
Studies in the Linguistic Sciences (Department of Linguistics, University of Illinois, Urbana-Champaign)
Studien zur Indologie und Iranistik (Bremen: Ute Hempen Verlag)
Transactions of the Philological Society (http://onlinelibrary.wiley.com/journal/10.1111/(ISSN)1467-968X)
Wiener Zeitschrift für die Kunde des Morgenlandes (Institut für Orientalistik, Universität Wien)
Yearbook of South Asian Languages and Linguistics (Sage Publications, taken over by Mouton de Gruyter, 1998–2006)
Zeitschrift der Deutschen Morgenländischen Gesellschaft (Deutsche Morgenländische Gesellschaft)

Bibliographies

Agesthialingom, S., and S. Sakthivel 1973 A bibliography of Dravidian linguistics. Annamalainagar: Annamalai University.
Aggarwal, Narinder K. 1985 A bibliography of studies on Hindi language and linguistics, 2nd ed. Gurgaon: Indian Document Service.


Aggarwal, Narinder K. 1991 Studies on Nepali language and linguistics: A bibliography. Gurgaon: Indian Document Service.
Andronov, Mikhail 1966 Materials for a bibliography of Dravidian linguistics. Kuala Lumpur: Department of Indian Studies, University of Malaya. http://ccat.sas.upenn.edu/%7Eharoldfs/dravling/projects/androbib.html (accessed 15 January 2015)
Anonymous Ongoing Project Language in Sri Lanka. https://sites.google.com/site/languagesinsrilanka/ (accessed 15 January 2015)
Anonymous No date The Tibeto-Burman bibliography, sorted by language name. http://www.tibetoburman.net/bib/language.html (accessed 15 January 2015)
Baart, Joan L. G., and L. Baart-Bremer 2001 Bibliography of languages of Northern Pakistan. Islamabad: National Institute of Pakistan Studies/Summer Institute of Linguistics.
Banerjee, Satya Ranjan 1977 A bibliography of Prakrit language. Calcutta: Sanskrit Book Depot.
Bashir, Elena 2000 A thematic survey of Burushaski research. History of Language 6(1): 1–14.
BrillOnline Bibliographies 2015a India’s contribution to Persian lexicography. http://bibliographies.brillonline.com/entries/index-islamicus/indias-contribution-to-persian-lexicography-A33902?s.num=0&s.f.s2_parent=s.f.book.index-islamicus&s.q=India%27s+contribution+to+Persian+lexicography
BrillOnline Bibliographies 2015b Persian in India, with special reference to the contributions of Hindu writers and poets. http://bibliographies.brillonline.com/entries/index-islamicus/persianin-india-with-special-reference-to-the-contribution-of-hindu-writers-andpoets-A33902?s.num=0&s.f.s2_parent=s.f.book.index-islamicus&s.q=India%27s+contribution+to+Persian+lexicography
BrillOnline Bibliographies [Annual Updates] Linguistic bibliography. http://bibliographies.brillonline.com/browse/linguistic-bibliography
Cardona, George 1976 Pāṇini: A survey of research. The Hague/Paris: Mouton. 2nd ed. 1998, Delhi: Motilal Banarsidass.
Cardona, George 2004 Recent research in Pāṇinian studies. 2nd ed. Delhi: Motilal Banarsidass.
Central Institute of Indian Languages No date Annotated bibliography [of Kannada]. http://www.ciil-lisindia.net/Kannada/Kan_infores.html (accessed 15 March 2015)
Deshpande, Madhav M., and Hans Henrich Hock 1991 A bibliography of writings on Sanskrit syntax. In: Hans Henrich Hock (ed.), Studies in Sanskrit syntax, 219–244. Delhi: Motilal Banarsidass.
Drocco, Andrea 2009 Bibliography on ergativity in Indo-Aryan. https://sites.google.com/site/indianbhashas/ (accessed 4 December 2015)


Geetha, K. R. 1983 Classified state bibliography of linguistic research on Indian languages 1: Hindi-speaking states. Mysore: Central Institute of Indian Languages.
Ghosh, Arun 1988 Bibliotheca Austroasiatica: A classified and annotated bibliography of the Austroasiatic people and languages. Calcutta: Firma KLM.
Hock, Hans Henrich 2015 A bibliography of Sanskrit syntax. In: Peter M. Scharf (ed.), Sanskrit syntax: Selected papers presented at the seminar on Sanskrit syntax and discourse structures, 13–15 June 2013, Université Paris Diderot, with an updated and revised bibliography by Hans Henrich Hock, 319–470. Providence, RI: The Sanskrit Library.
Huffman, Franklin 1986 Bibliography and index of mainland Southeast Asian languages and linguistics. New Haven, CT: Yale University Press. (Also covers Munda languages.)
Konow, Sten 1949 Primer of Khotanese Saka: Grammatical sketch, chrestomathy, vocabulary, bibliography. Oslo: H. Aschehoug & Co.
Koul, Omkar Nath, and Madhu Bala 1992 Punjabi language and linguistics: An annotated bibliography. Patiala: Indian Institute of Language Studies.
Laddu, Sureshachandra Dnyaneshwar, and Kamal Lochan Kar 1983 A select bibliography on the development of Sanskrit language, ed. by Satyaprakash Sharma. Badaun, Uttar Pradesh: Satyaprakash Sharma.
Mahmud, Shabana 1992 Urdu language and literature: A bibliography of sources in European languages. London: Mansell.
Nagaraja, K. S. 1989 Austroasiatic languages: A linguistic bibliography. Pune: Deccan College Post-Graduate and Research Institute.
Peterson, John Ongoing Project Bibliography of seldom studied and endangered South Asian languages, with assistance by Christian Peters and Yingying Hong. http://www.isfas.uni-kiel.de/de/linguistik/forschung/southasiabibliography/bibliography (accessed 19 March 2015)
Ramaiah, L. S. 1994 An international bibliography of Dravidian languages and linguistics, 1. General and comparative Dravidian languages and linguistics. Madras: T. R. Publications.
Ramaiah, L. S. 1995 Tamil language and linguistics. (An international bibliography of Dravidian languages and linguistics, 2.) Madras: T. R. Publications.
Ramaiah, L. S. 1998 Telugu language and linguistics. (An international bibliography of Dravidian languages and linguistics, 3.) Madras: T. R. Publications.
Ramaiah, L. S., and B. Ramakrishna Reddy 2005 Tribal and minor Dravidian languages and linguistics. (An international
bibliography of Dravidian languages and linguistics, 6.) Chennai: T. R. Publications.
Ramaiah, L. S., and C. R. Karisiddappa 2003 Kannada languages and linguistics. (An international bibliography of Dravidian languages and linguistics, 4.) Chennai: T. R. Publications.
Ramaiah, L. S., and M. Kanakachary 1990 Tribal linguistics in India: A bibliographical survey of international resources. Madras: T. R. Publications.
Ramaiah, L. S., and N. Rajasekharan Nair 2001 Malayalam language and linguistics. (An international bibliography of Dravidian languages and linguistics, 5.) Chennai: T. R. Publications.
Satyaprakash 1984 A bibliography of Sanskrit language and literature. Gurgaon: Indian Documentation Service.
Schmidt, Ruth Laila, and Omkar N. Koul 1983 Kohistani to Kashmiri: An annotated bibliography of Dardic languages. Patiala: Indian Institute of Language Studies.
Shafer, Robert 1957 Bibliography of Sino-Tibetan languages, 1. Wiesbaden: Harrassowitz.
Shafer, Robert 1963 Bibliography of Sino-Tibetan languages, 2. Wiesbaden: Harrassowitz.
Shapiro, Michael 1979 Current trends in Hindi syntax: A bibliographical survey. (Studien zur Indologie und Iranistik, Monographie 5.) Reinbek: Wezler.
Sharma, Shakuntala 1978 Classified bibliography of linguistic dissertations on Indian languages. Mysore: Central Institute of Indian Languages.
Singh, Udaya Narayana 1986 A bibliography of Bengali linguistics. Mysore: Central Institute of Indian Languages.
Spence, Justin 1994 Burushaski [bibliography]. https://linguistlist.org/issues/5/5-221.html (accessed 15 March 2015)
Stampe, David 1983 Munda bibliography to 1983. http://www.ling.hawaii.edu/austroasiatic/AA/Munda/BIBLIO/biblio.authors (accessed 15 March 2015)
Strand, Richard No Date A bibliography of the languages and cultures of Nuristân and environs. http://nuristan.info/bibliography.html (accessed 15 March 2015)
Toba, Sueyoshi 1991 A bibliography of Nepalese languages and linguistics. Kirtipur: Linguistic Society of Nepal, Tribhuvan University.
Wikibooks Contributors Ongoing Project Research on Tibetan languages: A bibliography. Wikibooks: The Free Textbook Project. http://en.wikibooks.org/wiki/Research_on_Tibetan_Languages:_A_Bibliography (accessed 19 March 2015)
Zide, Norman H., and V. Pandya 1989 A bibliographical introduction to Andamanese linguistics. Journal of the American Oriental Society 109: 639–651.

Corpora, digital texts, and other online materials
Emille Corpus. http://www.lancs.ac.uk/fass/projects/corpus/emille/ (corpora for Assamese, Bengali, Gujarati, Hindi, Kannada, Kashmiri, Malayalam, Marathi, Oriya, Panjabi, Sinhala, Tamil, Telugu, Urdu)
GRETIL — Göttingen Register of Electronic Texts in Indian Languages. http://gretil.sub.uni-goettingen.de/ (Veda, Epics, Purāṇas, Religious texts, Poetics/Metrics, grammatical, philosophical texts, dharmaśāstras, and other śāstras; Pāli, Prakrit, Hindi, Marathi, Tamil, Malayalam, Tibetan)
LDC-IL — Linguistic Data Consortium for Indian Languages, Text Corpora. http://www.ldcil.org/resourcesTextCorp.aspx (Sample corpora from Assamese, Bengali, Bodo, Dogri, English, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Kodava, Maithili, Malayalam, Manipuri (Meithei), Marathi, Nepali, Oriya, Punjabi, Santhali, Sanskrit, Tamil, Telugu, Urdu, Yarava; the materials are accessible by scrolling down to Sample Files of Text Corpora and clicking on the numbered links)
Mahābhārata. http://bombay.indology.info/mahabharata/welcome.html
Nepali Written Corpus (information available at http://universal.elra.info/product_info.php?cPath=42_43&products_id=2077).
Rāmāyaṇa. http://bombay.indology.info/ramayana/welcome.html
SARIT — Search and Retrieval of Indic Texts. http://sarit.indology.info/exist/apps/sarit/works/ (marked-up Sanskrit texts, including Arthaśāstra, Brahmapurāṇa, Kāvyādarśa, Manusmṛti, and Vākyapadīya)
Syntactic Structures of the World’s Languages. http://sswl.railsplayground.net (Materials on Bengali, Hindi, Kannada, Kusunda, Malayalam, Meiteilon [Meithei], Nepali, Panjabi, Pashto, Sanskrit (Vedic), Telugu)
The Sanskrit Library. http://www.sanskritlibrary.org (Vedic and Upaniṣadic texts, Vedic Sutra texts; Prātiśākhyas, Aṣṭādhyāyī and other grammatical texts; Manu Smṛti and other smṛti texts, Arthaśāstra, Kāmasūtra; Mahābhārata, Rāmāyaṇa; Pañcatantra, Hitopadeśa; and various classical literary texts)
TITUS — Thesaurus Indogermanischer Text- und Sprachmaterialien. http://titus.uni-frankfurt.de/indexe.htm?/texte/texte2.htm#ved (Veda, Classical, Epic, and Buddhist Sanskrit, Pali, Prakrit, Rajasthani, Hindi, Dhivehi; Avestan, Khotanese Saka, Tumshuqese Saka, Sogdian; Tamil)

Online dictionaries
Bengali Wordnet. http://www.isical.ac.in/~lru/wordnetnew/index.php/site/wordnet
Cologne Sanskrit and Tamil Dictionaries. http://www.sanskrit-lexicon.uni-koeln.de/scans/MWScan/tamil/index.html (Sanskrit and Tamil); see also http://www.sanskrit-lexicon.uni-koeln.de (Sanskrit-English dictionaries include Monier-Williams 1877, 1899, Apte 1890, Macdonell 1893; English-Sanskrit dictionaries: Monier-Williams 185, Borooah 1877, Apte 1884; Sanskrit-French: Burnouf 1866, Stchoupak 1932; Sanskrit-German:
Böhtlingk-Roth 1855, Grassmann 1873, etc.; Sanskrit-Sanskrit: Śabdakalpadruma 1822, Vācaspatyam 1873)
Critical Pāli Dictionary. http://pali.hum.ku.dk/cpd/
Digital Corpus of Sanskrit. http://kjc-fs-cluster.kjc.uni-heidelberg.de/dcs/index.php?contents=dictionary
Digital South Asia Library. http://dsal.uchicago.edu
Digital Dictionaries of South Asia. http://dsal.uchicago.edu/dictionaries/ (Languages covered: Assamese, Baluchi, Bengali, (Indian) English, Hindi, Kashmiri, Lushai, Malayalam, Marathi, Nepali, Oriya, Pali, Pashto, Persian, Rajasthani, Sanskrit, Sindhi, Sinhala, Tamil, Telugu, Urdu, Comparative Dravidian (DEDR), Comparative Indo-Aryan (Turner 1962–1985); dictionaries of other languages are under preparation or negotiation.)
Gujarati Wordnet. http://www.cfilt.iitb.ac.in/gujarati/
Héritage du Sanskrit: Dictionnaire sanskrit-français (1998, by Gérard Huet). http://sanskrit.inria.fr/Heritage.pdf
Hindi Wordnet. http://www.cfilt.iitb.ac.in/wordnet/webhwn/wn.php
IndoWordNet. http://www.cfilt.iitb.ac.in/indowordnet/ (access to Assamese, Bengali, Bodo, English, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Malayalam, Manipuri (Meithei), Marathi, Nepali, Oriya, Panjabi, Sanskrit, Telugu, Urdu)
Kashmiri Wordnet. http://indradhanush.unigoa.ac.in/kashmiriwordnet/public/webcontent/webcontent.php?id=1&langid=19
Konkani Wordnet. http://konkaniwordnet.unigoa.ac.in/public/webcontent/webcontent.php?id=11&langid=7
Odiya Wordnet. http://indradhanush.unigoa.ac.in/odiawordnet/public/webcontent/webcontent.php?id=1&langid=19
Online Torwali-Urdu dictionary. http://182.180.102.251:8081/otd/HomePage.aspx (or through http://www.cle.org.pk/index.htm)
Online Urdu dictionary. http://182.180.102.251:8081/oud/default.aspx (or through http://www.cle.org.pk/index.htm)
Punjabi Wordnet. http://wordnet.thapar.edu (there may be problems linking to this site from some servers or on some browsers)
Samakālīn Nepālī Śabdakoś. http://www.nepalisabdakos.com (in Nepali, Nagari script)
Sindhi-English dictionary. http://182.180.102.251:8081/sed1/homepage.aspx (or through http://www.cle.org.pk/index.htm)
Urdu Wordnet. http://indradhanush.unigoa.ac.in/urduwordnet/public/wordnet/wordnet.php?langid=19&id=2
Urdu Wordnet. http://www.cle.net.pk/urduwordnet/ (there may be problems linking to this site from some servers or on some browsers)

Language endangerment and language preservation
Abbi, Anvita (ed.) 1997 Languages of the tribal and indigenous peoples of India: The ethnic space. Delhi: Motilal Banarsidass.
Abbi, Anvita 2006 Endangered languages of the Andaman Islands. München: LINCOM.
Cardoso, Hugo C. (ed.) 2014 Language endangerment and preservation in South Asia. (Language Documentation & Conservation Special Publication No. 7.) http://scholarspace.manoa.hawaii.edu/bitstream/handle/10125/4607/master.pdf?sequence=1 (accessed 23 Nov. 2014)
Müller, Katja, Elisabeth Abbess, Calvin Thiessen, and Gabriela Thiessen 2008 Language vitality and development among the Wakhi people of Tajikistan. Dallas: SIL International. www.sil.org/silesr/2008/silesr2008-011.pdf (accessed 28 Nov. 2014)
Munshi, Sadaf 2010–present Work in progress on an Archive of Annotated Burushaski Texts. http://burushaskilanguage.com/ (accessed 19 Dec. 2014)
Narayanan, R. Karthik 2014 Assessing vitality of languages spoken by less than 10,000 speakers in India. Jawaharlal Nehru University MPhil dissertation.
van Driem, George 2008 Endangered languages of South Asia. In: M. Brenzinger (ed.), Language diversity endangered, 303–341. Berlin: Mouton de Gruyter.
Yun, Ju-Hong 2003 Pashai Language Development Project: Promoting Pashai language, literacy and community development. http://www.sil.org/asia/ldc/parallel_papers/ju-hong_yun.pdf (accessed 29 Nov. 2014)
See also Bhasha Trust, http://www.bhasharesearch.org, and People’s Linguistic Survey, http://peopleslinguisticsurvey.org

General linguistic surveys
Abbi, Anvita (ed.) 1991 India as a linguistic area revisited. (= Special issue of Language Sciences 13(2).)
Abbi, Anvita, R. S. Gupta, and Ayesha Kidwai (eds.) 2001 Linguistic structure and language dynamics in South Asia: Papers from the proceedings of SALA XVIII Roundtable. Delhi: Motilal Banarsidass.
Backstrom, Peter C., and Carla F. Radloff (eds.) 1992 Languages of Northern Areas. (Sociolinguistic Survey of Northern Pakistan, 2.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.
Bhasha Research and Publication Centre 2013–present People’s Linguistic Survey of India. (Volumes on Assam, Indian Sign Language(s), Jammu and Kashmir, Maharashtra, Meghalaya, Rajasthan, and Uttarakhand, as well as The being of Bhasha: General introduction to the People’s Linguistic Survey of India, have appeared as of 2014; other volumes are to follow.) http://peopleslinguisticsurvey.org/publishing-PLSI.aspx (accessed 2 May 2015)
Bhaskararao, Peri, and K. V. Subbarao (eds.) 2001 The yearbook of South Asian languages 2001. (= Proceedings of Tokyo symposium on South Asian languages: Contact, convergence, and typology.) Thousand Oaks/London/New Delhi: Sage.
Bhaskararao, Peri, and Karumuri V. Subbarao (eds.) 2004 Non-nominative subjects, 2 vols. Amsterdam/Philadelphia: Benjamins.
Bielmeier, Roland, and Felix Haller (eds.) 2011 Linguistics of the Himalayas and beyond. Berlin/New York: Mouton de Gruyter.
Breton, Roland J.-L. 1997 Atlas of the languages and ethnic communities of South Asia. Walnut Creek/London/New Delhi: AltaMira Press.
Bright, William 1990 Language variation in South Asia. New York: Oxford University Press.
Butt, Miriam, Tracy Holloway King, and Gillian Ramchand (eds.) 1994 Theoretical perspectives on word order in South Asian Languages. Stanford: CSLI.
Central Institute of Indian Languages 1975–2010 Grammars. (Grammars of individual languages, mostly “tribal”; see http://www.ciil.org/PubBook.aspx for a listing.)
D’Souza, Jean 1987 South Asia as a sociolinguistic area. University of Illinois PhD dissertation. ProQuest Dissertations 8721625.
Dayal, Veneeta, and Anoop Mahajan (eds.) 2004 Clause structure in South Asian languages. Dordrecht: Kluwer.
Deshpande, Madhav M. 1979 Sociolinguistic attitudes in India: An historical reconstruction. Ann Arbor: Karoma.
Ferguson, Charles Albert, and John Joseph Gumperz (eds.) 1960 Linguistic diversity in South Asia: Studies in regional, social and functional variation. (International Journal of American Linguistics 26(3), pt. 3.) Bloomington, IN: Indiana University Research Center in Anthropology, Folklore and Linguistics.
Grierson, George Abraham (ed.) 1903–1928 Linguistic Survey of India, 11 volumes in 20. Calcutta: Office of the Superintendent of Government Printing. Repr. 1968, Delhi: Motilal Banarsidass. http://dsal.uchicago.edu/books/lsi/
  1:1 Introductory
  1:2 Comparative vocabulary
  1: Supp.II Addenda et corrigenda minora
  2 Mōn-Khmēr and Siamese-Chinese families (including Khassi and Tai)
  3:1 Tibeto-Burman family (General introduction, Tibetan, Himalayan, and North Assam languages)
  3:2 Tibeto-Burman family (Bodo, Naga, Kachin languages)
  3:3 Tibeto-Burman family (Kuki-Chin and Burma languages)
  4 Muṇḍā and Dravidian languages
  5:1 Indo-Aryan family, Eastern group (Bengali and Assamese languages)
  5:2 Indo-Aryan family, Eastern group (Bihari and Oriya languages)
  6 Indo-Aryan family, Mediate group (Eastern Hindi)
  7 Indo-Aryan family, Southern group (Marathi)
  8:1 Indo-Aryan family, North-western group (Sindhi and Lahnda)
  8:2 Indo-Aryan family, North-western group (Dardic, including Kashmiri)
  9:1 Indo-Aryan family, Central group (Western Hindi and Panjabi)
  9:2 Indo-Aryan family, Central group (Rajasthani and Gujarati)
  9:3 Indo-Aryan family, Central group (Bhil languages)
  9:4 Indo-Aryan family, Central group (Pahari and Gujuri)
  10 Specimens of languages of the Iranian family
  11 Gipsy languages
  [Appendix] Index of language names
Kachru, Braj B., Yamuna Kachru, and S. N. Sridhar (eds.) 2008 Language in South Asia. Cambridge: Cambridge University Press.
Kaye, Alan S. (ed.) 1997 Phonologies of Asia and Africa. Winona Lake, IN: Eisenbrauns.
Kaye, Alan S. (ed.) 2007 Morphologies of Asia and Africa. Winona Lake, IN: Eisenbrauns.
Krishnamurti, Bhadriraju, Colin P. Masica, and Anjani Sinha (eds.) 1986 South Asian languages: Structure, convergence, and diglossia. Delhi: Motilal Banarsidass.
Lust, Barbara C., Kashi Wali, James W. Gair, and K. V. Subbarao (eds.) 2000 Lexical anaphors and pronouns in selected South Asian languages: A principled typology. Berlin/New York: Mouton de Gruyter.
Masica, Colin P. 1976 Defining a linguistic area: South Asia. Chicago/London: The University of Chicago Press.
Masica, Colin P. (ed.) 2007 Old and new perspectives on South Asian languages: Grammar and semantics. Delhi: Motilal Banarsidass.
Morey, Stephen (ed.) 2008 North East Indian linguistics. New Delhi: Foundation Books.
Osada, Toshiki (ed.) 2009 Linguistics, archaeology and human past in South Asia. Delhi: Manohar.
Pandey, Pramod Kumar 2014 Sounds and their patterns in Indic languages, 2 vols. Delhi: Cambridge University Press India.
Pandit, Prabodh B. (ed.) 1972 India as a socio-linguistic area. Ganeshkind: University of Poona.
Pattanayak, Debi Prasanna (ed.) 1990 Multilingualism in India. Clevedon, Avon (England)/Bristol, PA: Multilingual Matters.
Saxena, Anju (ed.) 2015 Micro-linguistic areas in South Asia. (Journal of South Asian Languages and Linguistics 2(1), special issue.)
Saxena, Anju, and Lars Borin (eds.) 2006 Lesser-known languages of South Asia: Status and policies, case studies and applications of information technology. Berlin/New York: Mouton de Gruyter.
Sebeok, Thomas A., Murray B. Emeneau, and Charles A. Ferguson (eds.) 1969 Current trends in linguistics, 5: Linguistics in South Asia. The Hague: Mouton.
Shapiro, Michael C., and Harold F. Schiffman 1981 Language and society in South Asia. Delhi: Motilal Banarsidass.
Southworth, Franklin C. 2005 Linguistic archaeology of South Asia. London/New York: RoutledgeCurzon.
Subbarao, Karumuri V. 2012 South Asian languages: A syntactic typology. Cambridge: Cambridge University Press.
Valentine, Tamara Marie 1986 Aspects of linguistic interaction and gender in South Asia. University of Illinois PhD dissertation. ProQuest Dissertations 8701643.
van Driem, George 2001 Languages of the Himalayas, 2 volumes. Leiden: Brill.
Verma, Manindra K. (ed.) 1976 The notion of subject in South Asian languages. (South Asian Studies 2.) Madison: University of Wisconsin, Dept. of South Asian Studies.
Verma, Manindra K. (ed.) 1993 Complex predicates in South Asian languages. New Delhi: Manohar.
Verma, Manindra K., and K. P. Mohanan (eds.) 1990 Experiencer subjects in South Asian languages. Stanford: CSLI.
Yamabe, Junji 1990 Dative Subject constructions in Indic languages. University of Tokyo MA thesis.
Zide, Arlene R. K., David Magier, and Eric Schiller (eds.) 1985 Proceedings of the Conference on Participant Roles: South Asia and Adjacent Areas. Bloomington: Indiana University Linguistics Club.

Language family descriptions and handbooks

Andamanese languages
Abbi, Anvita 2006 Endangered languages of the Andaman Islands. München: LINCOM.
Abbi, Anvita 2011 A Great Andamanese dictionary. Delhi: Ratna Sagar.
Abbi, Anvita 2013 A grammar of the Great Andamanese language: An ethnolinguistic study. Leiden: Brill.
Avtans, Abhishek 2006 Deictic categories in Great Andamanese. Jawaharlal Nehru University MPhil dissertation.
Chaudhary, Narayan 2007 Developing a computational framework for the verb-morphology of Great Andamanese. Jawaharlal Nehru University MPhil dissertation.
Kumar, Pramod 2012 Descriptive and typological study of Jarawa. Jawaharlal Nehru University PhD dissertation.
Man, E. Horace 1923 A Dictionary of the South Andaman (Âkà-Bêa) language. Mazgaon, Bombay: British India Press.
Man, E. Horace, and R. C. Temple 1875–1878 A grammar of the Bojingyida or South Andaman language. Handwritten manuscript, loose sheets. London: Royal Anthropological Institute.
Manoharan, S. 1989 A descriptive and comparative study of the Andamanese language. Calcutta: Anthropological Survey of India.
Mayank 2009 Comparative lexicon of Great Andamanese languages. Jawaharlal Nehru University MPhil dissertation.
Portman, Maurice Vidal 1887 Manual of the Andamanese languages. London: W. H. Allen. Repr. 1992, Delhi: Manas Publications.
Portman, Maurice Vidal 1898 Notes on the languages of the South Andaman group of tribes. Calcutta: Office of the Superintendent of Government Printing, India.
Som, Bidisha 2006 A lexico-semantic study of Great Andamanese: A thematic approach. Jawaharlal Nehru University PhD dissertation.
Temple, Richard C. 1902 A grammar of the Andamanese language, being Chapter IV of Part I of the Census Report on the Andaman and Nicobar Islands. Port Blair: Superintendent’s Printing Press. Reprint 1994, New Delhi.

Austro-Asiatic
Jenny, Mathias, and Paul Sidwell (eds.) 2015 The handbook of Austroasiatic languages. Leiden: Brill.
Parkin, Robert 1991 A guide to Austroasiatic speakers and their languages. (Oceanic Linguistics Special Publication 23.) Honolulu: University of Hawaii Press.
Sidwell, Paul 2009 Classification of the Austroasiatic languages: History and state of the art. München: LINCOM.
Zide, Norman H. (ed.) 1966 Studies in comparative Austroasiatic linguistics. The Hague: Mouton.

Burushaski
Backstrom, Peter C. 1992 Burushaski. In: Peter C. Backstrom and Carla Radloff (eds.), Languages of Northern Areas, 31–56. Islamabad: National Institute of Pakistan Studies/Summer Institute of Linguistics.
Berger, Hermann 1974 Das Yasin-Burushaski (Werchikwar). Wiesbaden: Harrassowitz.
Berger, Hermann 1998 Die Burushaski Sprache von Hunza und Nager, Teil I: Grammatik; Teil II: Texte mit Übersetzungen; Teil III: Wörterbuch Burushaski-Deutsch, Deutsch-Burushaski. Wiesbaden: Harrassowitz.
Berger, Hermann 2008 Beiträge zur historischen Laut- und Formenlehre des Burushaski. Wiesbaden: Harrassowitz.
Burushaski Research Academy 2006–2015 Burūšaskī Urdū luγat [Burushaski-Urdu Dictionary], 3 vols. Karachi: Bureau of Composition, Compilation & Translation, University of Karachi.
Frémont, Annette 1982 Contribution à l’étude du Burushaski: Dix-neuf récits inédits de Raja Ali Ahmed Jan (Nagir) avec mot-à-mot traduction, notes, commentaires et lexique. Thèse de Doctorat de Troisième Cycle, Université de la Sorbonne Nouvelle, Paris III.
Frémont, Annette 1992 Récits inédits en Burushaski: Transcription — traduction — commentaire — lexique. Tome I–II. Thèse pour le Doctorat d’état ès-lettres, Université Sorbonne Nouvelle, Paris III.
Holst, Jan Henrik 2014 Advances in Burushaski linguistics. Tübingen: Narr Verlag.
Lorimer, David Lockhart Robinson 1935–1938 The Burushaski language, 3 vols. Oslo: Instituttet for Sammenlignende Kulturforskning.
Lorimer, David Lockhart Robinson 1962 Werchikwar English vocabulary. Oslo: Instituttet for Sammenlignende Kulturforskning.
Munshi, Sadaf 2006 Jammu and Kashmir Burushaski: Language, language contact and change. University of Texas, Austin, PhD dissertation. http://www.lib.utexas.edu/etd/d/2006/munshis96677/munshis96677.pdf (accessed 29 December 2014)
Munshi, Sadaf In Progress Burushaski language documentation project. http://burushaskilanguage.com/ (accessed 16 Sept 2015)
Tiffou, Étienne 1999 Parlons bourouchaski: État présent sur la culture et la langue des Bourouchos (Pakistan). Paris: L’Harmattan.
Tiffou, Étienne 2014 Dictionnaire du bourouchaski du Yasin: Bourouchaski – Français et Français – Bourouchaski. Louvain-la-Neuve: Peeters.
Tiffou, Étienne (ed.) 2004 Bourouchaskiana: Actes du colloque sur le bourouchaski organisé à l’occasion du XXXVIème congrès international sur les études asiatiques et nord-africaines (Montréal 27 août – 2 septembre 2000). (Bibliothèque des Cahiers de Linguistique de l’Université de Louvain, 113.) Louvain-la-Neuve: Peeters.
van Skyhawk, Hugh 2003 Burushaski-Texte aus Hispar: Materialien zum Verständnis einer archaischen Bergkultur in Nordpakistan. Wiesbaden: Harrassowitz.
Wazir Shafi n.d. Burushaski Raẓun: A book on Burushaski grammar (in Yasin dialect) (Foreword in Burushaski and in English by Major Dr. Faiz Aman). Karachi: Bureau of Composition & Translation, University of Karachi.
Willson, Stephen R. 1990 Verb agreement and case marking in Burushaski. University of North Dakota MA thesis.
Willson, Stephen R. 1999 Basic Burushaski vocabulary. (Studies in Languages of Northern Pakistan, 6.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.

Daic
Das, Bishakha 2014 A descriptive grammar of Tai-Khamti. Jawaharlal Nehru University PhD dissertation.
Morey, Stephen 2005 The Tai languages of Assam: A grammar and texts. Canberra: The Australian National University.
Needham, J. F. 1894 Outline grammar of the (Khâmtî) language, as spoken by the Khâmtîs residing in the neighbourhood of Sadiya. Rangoon: Superintendent of Government Printing, Burma.
Weidert, Alfons K. 1977 Tai-Khamti phonology and vocabulary. Wiesbaden: Steiner.

Dravidian
Agesthialingom, S., and N. Rajasekharan Nair (eds.) 1981 Dravidian syntax. Annamalainagar: Annamalai University.
Andronov, Mikhail S. 2003 A comparative grammar of the Dravidian languages. Wiesbaden: Harrassowitz.
Bloch, Jules 1946 Structure grammaticale des langues dravidiennes. Paris: Adrien-Maisonneuve.
Burrow, Thomas, and Murray B. Emeneau 1961 A Dravidian etymological dictionary. Oxford: Clarendon Press.
Burrow, Thomas, and Murray B. Emeneau 1984 A Dravidian etymological dictionary. Revised edition. Oxford: Clarendon Press.
Caldwell, Robert 1856 A comparative grammar of the Dravidian or South-Indian family of languages. Madras: University of Madras.
Caldwell, Robert 1875 A comparative grammar of the Dravidian or South-Indian family of languages. 2nd edition, revised and enlarged. London: Trübner & Co.
Caldwell, Robert 1913 A comparative grammar of the Dravidian or South-Indian family of languages. 3rd edition, revised and edited by J. L. Wyatt and T. Ramakrishna Pillai. Reprinted 1974, New Delhi: Oriental Books Reprint Corporation.
DEDR = Burrow & Emeneau 1984
Emeneau, Murray B. 1962 Brahui and Dravidian comparative grammar. Berkeley/Los Angeles: University of California Press.
Emeneau, Murray B. 1970 Dravidian comparative phonology: A sketch. Annamalainagar: Annamalai University.
Emeneau, Murray B. 1994 Dravidian studies: Selected papers, ed. by Bh. Krishnamurti. Delhi: Motilal Banarsidass.
Emeneau, Murray B., and Thomas Burrow 1962 Dravidian borrowings from Indo-Aryan. Berkeley/Los Angeles: University of California Press.
Krishnamurti, Bhadriraju 2001 Comparative Dravidian linguistics: Current perspectives. Oxford: Oxford University Press.
Krishnamurti, Bhadriraju 2003 The Dravidian languages: A comparative, historical and typological study. Cambridge: Cambridge University Press.
McAlpin, David W. 1981 Proto-Elamo-Dravidian: The evidence and its implications. Philadelphia: The American Philosophical Society.
Ramakrishna Reddy, B. 2003a Word structure in Dravidian. Kuppam: Dravidian University.
Ramakrishna Reddy, B. 2003b Agreement in Dravidian languages. Chennai: International Institute of Tamil Studies.
Rao, Goparaju Sambasiva 1991 A comparative study of Dravidian noun derivatives. New Delhi: Bahri.
Steever, Sanford B. 1988 The serial verb formation in the Dravidian languages. Delhi: Motilal Banarsidass.
Steever, Sanford B. 1993 Analysis to synthesis: The development of complex verb morphology in the Dravidian languages. New York: Oxford University Press.
Steever, Sanford B. (ed.) 1998 The Dravidian languages. London/New York: Routledge.
Subrahmanyam, P. S. 1971 Dravidian verb morphology: A comparative study. Annamalainagar: Annamalai University.
Subrahmanyam, P. S. 1983 Dravidian comparative phonology. Annamalainagar: Annamalai University.
Subrahmanyam, P. S. 2008 Dravidian comparative grammar, 1. Mysore: Central Institute of Indian Languages.
Subrahmanyam, P. S. 2013 The morphosyntax of the Dravidian languages. Thiruvananthapuram: Dravidian Linguistics Association.
Suvarchala, B. 1992 Central Dravidian comparative morphology. New Delhi: Navrang.
Zvelebil, Kamil V. 1970 Comparative Dravidian morphology. The Hague/Paris: Mouton.
Zvelebil, Kamil V. 1970 Comparative Dravidian phonology. The Hague/Paris: Mouton.
Zvelebil, Kamil V. 1977 A sketch of comparative Dravidian morphology, Part 1. The Hague: Mouton.
Zvelebil, Kamil V. 1990 Dravidian linguistics: An introduction. Pondicherry: Pondicherry Institute of Linguistics and Culture.

Indo-Aryan
Beames, John 1872–1879 A comparative grammar of the modern Aryan languages of India. London: Trübner. Repr. 1970, New Delhi: Munshiram Manoharlal.
Bloch, Jules 1934 L’indo-aryen du veda aux temps modernes. Paris: Adrien-Maisonneuve.
Bloch, Jules 1965 Indo-Aryan from the Vedas to modern times, trans. of Bloch 1934 by Alfred Master. Paris: Adrien-Maisonneuve.
Bubenik, Vit 1996 The structure and development of Middle Indo-Aryan dialects. Delhi: Motilal Banarsidass.
Bubenik, Vit 1998 A historical syntax of Late Middle Indo-Aryan (Apabhraṁśa). Amsterdam/Philadelphia: Benjamins.
Caillat, Colette (ed.) 1989 Dialectes dans les littératures indo-aryennes. Paris: Collège de France.
Cardona, George, and Dhanesh Jain (eds.) 2003 The Indo-Aryan languages. London/New York: Routledge.
Chatterji, Suniti Kumar 1926 The origin and development of the Bengali language. 3 vols. Calcutta: Calcutta University Press. Reprinted 1970, London: Allen & Unwin; distributed by Motilal Banarsidass, Delhi.
Deo, Ashwini 2006 Tense and aspect in Indo-Aryan languages: Variation and diachrony. Stanford University PhD dissertation.
Èdel’man, Dzhoi I. 1983 The Dardic and Nuristani languages. Moscow: Nauka.
Hinüber, Oskar von 2001 Das ältere Mittelindisch im Überblick. 2nd rev. ed. Wien: Österreichische Akademie der Wissenschaften.
Hoernle, Augustus Friedrich Rudolf 1880 A comparative grammar of the Gaudian (Aryo-Indian) languages. London: Trübner.
Katre, Sumitra Mangesh 1965 Some problems of historical linguistics in Indo-Aryan. Poona: Deccan College.
Katre, Sumitra Mangesh 1968 Problems of reconstruction in Indo-Aryan. Simla: Indian Institute of Advanced Study.
Masica, Colin P. 1991 The Indo-Aryan languages. Cambridge: Cambridge University Press.
Marlow, Patrick Edward 1997 Origin and development of the Indo-Aryan quotatives and complementizers: An areal approach. University of Illinois PhD dissertation. ProQuest Dissertations 9737189.
Nara, Tsuyoshi 1979 Avahaṭṭha and comparative vocabulary of Indo-Aryan languages. Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa.
Pattanayak, Debi Prasanna 1966 A controlled reconstruction of Oriya, Assamese, Bengali, and Hindi. The Hague: Mouton.
Sen, Subhadra Kumar 1973 Proto-New Indo-Aryan. Calcutta: Eastern Publications.
Sen, Sukumar 1953 Historical syntax of Middle Indo-Aryan. Indian Linguistics 13: 355–473.
Sen, Sukumar 1960 A comparative grammar of Middle Indo-Aryan. Poona: Linguistic Society of India.
Turner, Ralph L. 1962–1969 A comparative dictionary of the Indo-Aryan languages. London: Oxford University Press. http://dsal.uchicago.edu/dictionaries/soas/ (accessed 17 November 2013)
Turner, Ralph L. 1973 Indo-Aryan linguistics: Collected papers, ed. by J. Brough. London: School of Oriental and African Studies. Repr. 1985, Delhi: Disha Publications.
Turner, Ralph L. 1985 A comparative dictionary of the Indo-Aryan languages, vol. 3: Addenda and corrigenda, ed. by James C. Wright. London: School of Oriental and African Studies.
Vale, Ramchandra Narayan 1948 Verbal composition in Indo-Aryan. Poona: Deccan College.
Verbeke, Saartje 2013 Alignment and ergativity in New Indo-Aryan languages. Berlin/New York: Mouton de Gruyter.
Zograph, G. A. 1976 Morfologičeskiĭ stroĭ novyx indoariĭskix jazykov. Moscow: Nauka.

Iranian
Encyclopædia Iranica, ed. Ehsan Yarshater. New York: Bibliotheca Persica Press, 1985–present. Online edition at http://www.iranicaonline.org (accessed 30 November 2014)
Haig, Geoffrey 2008 Alignment change in Iranian languages: A construction grammar approach. Berlin/New York: Mouton de Gruyter.
Morgenstierne, Georg 1938 Indo-Iranian frontier languages, Vol. II, Iranian Pamir Languages. Oslo: Instituttet for Sammenlignende Kulturforskning.
Redard, Georges, Sanaoullah Sana, and Charles M. Kieffer (eds.) 1974 L’Atlas linguistique des parlers iraniens: Atlas de l’Afghanistan. (University of Bern, Institut für Sprachwissenschaft. Arbeitspapiere 13.) Bern: University of Bern.
Schmitt, Rüdiger (ed.) 1989 Compendium linguarum iranicarum. Wiesbaden: Reichert.
Windfuhr, Gernot (ed.) 2009 The Iranian languages. London/New York: Routledge.

Kusunda
Reinhard, Johan, and Sueyoshi Toba 1970 A preliminary linguistic analysis and vocabulary of the Kusunda language. Kathmandu: Summer Institute of Linguistics/Tribhuvan University.
Watters, David E. 2005 Kusunda: A typological isolate in South Asia. In: Yogendra Yadava, Govinda Bhattarai, Ram Raj Lohani, Balaram Prasain, and Krishna Parajuli (eds.), Contemporary issues in Nepalese linguistics, 375–396. Kathmandu: Linguistic Society of Nepal.
Watters, David E., with Yogendra P. Yadava, Madhav P. Pokharel, and Balaram Prasain 2006 Notes on Kusunda grammar: A language isolate of Nepal. Himalayan Linguistics Archive 3: 1–182. (First published 2005, by the National Foundation for the Development of Indigenous Nationalities, Kathmandu, Nepal.) https://escholarship.org/uc/item/83v8d1wv (accessed 18 November 2013)

Munda (Austro-Asiatic)
Anderson, Gregory D. S. 2007b The Munda verb: Typological perspectives. Berlin/New York: Mouton de Gruyter.
Anderson, Gregory D. S. (ed.) 2008 The Munda languages. Oxford/New York: Routledge.
Bhattacharya, Sudhibhushan 1975 Studies in comparative Munda linguistics. Simla: Indian Institute of Advanced Study.
Appendix — Sources and resources

Hoffmann, John 1903 Mundari grammar. Calcutta: The Secretariat Press. Hoffmann, John, and Arthur van Emelen 1930–1979 Encyclopedia Mundarica, 16 volumes. Patna: Government Printing. Zide, Norman H. 1978 Studies in the Munda numerals. Mysore: Central Institute of Indian Languages.

Nahali/Nihali Kuiper, F. B. J. 1962 Nahali: A comparative study. (Mededelingen der Koninklijke Nederlandse Akademie van Wetenschappen, Afd. Letterkunde, N. R., 25: 5.) Amsterdam: Noord-Hollandsche Uitgevers Maatschappij. Nagaraja, K. S. 2014 The Nihali language: Grammar, texts and vocabulary. Mysore: Central Institute of Indian Languages.

Nicobarese (see Austro-Asiatic) Nuristani Edelʹman, Džoi I. 1999 Dardskie i nuristanskie jazyki. (Jazyki Mira, 7.) Moskva: Indrik. Fussman, Gérard 1972 Atlas linguistique des parlers dardes et kafirs. (Publications de lʼÉcole Française dʼExtrême-Orient, 86.) Paris: Adrien-Maisonneuve. Nelson, David 1986 The historical development of the Nuristani languages. University of Minnesota PhD dissertation. Strand, Richard F. 1999–present Kâmviri grammar. Nuristân: Hidden land of the Hindu Kush. http://nuristan.info/lngFrameG.html (accessed 20 November 2014)

Pamir languages (see Iranian) Tai (see Daic) Tibeto-Burman Bauman, James 1975 Pronouns and pronominal morphology in Tibeto-Burman. University of California, Berkeley, PhD dissertation. Beckwith, Christopher I. (ed.) 2002 Medieval Tibeto-Burman languages: Proceedings of a symposium held in Leiden, June 26, 2000, at the 9th Seminar of the International Association of Tibetan Studies. Leiden: Brill. Benedict, Paul K. 1972 Sino-Tibetan: A conspectus. Cambridge: Cambridge University Press.

Burling, Robbins 1967 Proto-Lolo-Burmese. Bloomington: Indiana University. Chelliah, Shobhana L., and Gwendolyn Hyslop 2011 Linguistics of the Tibeto-Burman Area 34(2): Special issue on optional case marking in Tibeto-Burman, Part 1. Chelliah, Shobhana L., and Gwendolyn Hyslop 2012 Linguistics of the Tibeto-Burman Area 35(1): Special issue on optional case marking in Tibeto-Burman, Part 2. Hale, Austin 1982 Research on Tibeto-Burman languages. Berlin/New York: Mouton de Gruyter. Matisoff, James A. 2003 Handbook of Tibeto-Burman: System and philosophy of Sino-Tibetan reconstruction. Berkeley/Los Angeles: University of California Press. Namkung, Ju (ed.) 1996 Phonological inventories of Tibeto-Burman languages. (Sino-Tibetan Etymological Dictionary and Thesaurus Project, Monograph Series 3.) Berkeley: University of California Center for Southeast Asian Studies. http://stedt.berkeley.edu/pubs_and_prods/STEDT_Monograph3_PhonologicalInv-TB.pdf (accessed 25 June 2014) Nishi, Yoshio, James A. Matisoff, and Yasuhiko Nagano 1995 New horizons in Tibeto-Burman morpho-syntax. Osaka: National Museum of Ethnology. Shafer, Robert 1957 Bibliography of Sino-Tibetan languages, 1. Wiesbaden: Harrassowitz. Shafer, Robert 1963 Bibliography of Sino-Tibetan languages, 2. Wiesbaden: Harrassowitz. Shafer, Robert 1966–1974 Introduction to Sino-Tibetan, 5 volumes. Wiesbaden: Harrassowitz. Sharma, D. D. 1994 A comparative grammar of Tibeto-Himalayan languages (of Himachal Pradesh & Uttarakhand). New Delhi: Mittal Publications. Thurgood, Graham, and Randy LaPolla (eds.) 2003 The Sino-Tibetan languages. London/New York: Routledge.

Individual languages and smaller subfamilies (Abbreviations: AA = Austro-Asiatic, And = Andamanese, Dr = Dravidian, IAr = Indo-Aryan, Ir = Iranian, Nur = Nuristani, TB = Tibeto-Burman)

Aka (TB ?) Simon, I. M. 1970 Aka language guide. Shillong: North-East Frontier Agency.

Angami (TB) Giridhar, Puttushetra Puttuswamy 1980 Angami grammar. Mysore: Central Institute of Indian Languages.

Ao (TB) Coupe, Alexander R. 2003 A phonetic and phonological description of Ao: A Tibeto-Burman language of Nagaland, North-East India. Canberra: Pacific Linguistics. Coupe, Alexander R. 2007 A grammar of Mongsen Ao. Berlin/New York: Mouton de Gruyter.

Apabhraṁśa (see Prakrit) Ardhamagadhi (see Prakrit) Assamese (Asamiya) (IAr) Goswami, Golok Chandra 1966 An introduction to Assamese phonology. Poona: Deccan College. Kakati, Banikanta 1962 Assamese, its formation and development. Gauhati: Lawyer’s Book Stall. 2nd ed. 1972, Gauhati: Lawyer’s Book Stall. Tamuli, Jyotiprakash 1998 The compound verb in Assamese. University of Reading PhD dissertation.

Athpare (see Kiranti languages) Avadhi (IAr) Saksena, Baburam 1937 Evolution of Awadhi. Allahabad: Indian Press. Repr. 1971, Delhi: Motilal Banarsidass.

Badaga (Dr) Balakrishnan, R. 1999 Badaga: A Dravidian language. Annamalainagar: Annamalai University. Hockings, Paul, and Christiane Pilot-Raichoor 1992 A Badaga-English dictionary. Berlin/New York: Mouton de Gruyter. Pilot-Raichoor, Christiane 1991 Le Badaga: langue dravidienne (Inde): description et analyse. Université de la Sorbonne-Nouvelle PhD dissertation.

Bagri (IAr) Gusain, Lakhan 2000 Bagri. München: LINCOM.

Balochi (Ir) Adamík, Jozef 1977 The origins and dialect differentiation of Balōčī: Chronological reconstruction of developments in the nominal morphology. Harvard University PhD dissertation. ProQuest Dissertations 0322100. Barker, Abd-al-Rahman, and Aqil Khan Mengal 1969 A course in Baluchi, 2 vols. Montreal: Institute of Islamic Studies, McGill University. Dames, Longworth M. 1881 A sketch of the Northern Balochi language, containing a grammar, vocabulary and specimens of the language. (Extra number of the Journal of the Asiatic Society of Bengal 1, 1880). Calcutta. Dames, Mansel Longworth 1922 A text book of the Balochi language, consisting of miscellaneous stories, legends, poems and a Balochi-English vocabulary. Lahore: Superintendent, Govt. Print., Punjab. Jahani, Carina (ed.) 2000 Language in society: Eight sociolinguistic essays on Balochi. (Studia Iranica Upsaliensia 3.) Uppsala: Acta Universitatis Upsaliensis. Jahani, Carina, Agnes Korn, and Paul Titus (eds.) 2008 The Baloch and others: Linguistic, historical and socio-political perspectives on pluralism in Balochistan. Wiesbaden: Reichert. Jahani, Carina, and Agnes Korn (eds.) 2003 The Baloch and their neighbours: Ethnic and linguistic contact in Balochistan in historical and modern times. Wiesbaden: Reichert. Korn, Agnes 2005 Towards a historical grammar of Balochi: Studies in Balochi historical phonology and vocabulary. Wiesbaden: Reichert. Shahbaksh, A. 2004 The Balochi verb: An etymological study. School of Oriental and African Studies PhD dissertation. ProQuest Dissertations U185193.

Bashgali (Nur) Davidson, John 1902 Notes on the Bashgalī (Kāfir) language. (Journal of the Asiatic Society of Bengal, Vol. 71, extra no. 1.) Calcutta. Konow, Sten 1913 Bashgali dictionary: An analysis of Col. J. Davidson’s notes on the Bashgali language. (Journal of the Asiatic Society of Bengal, New Series, 9, Extra Number.) Calcutta. Repr. 1986, Delhi: Gyan Publishing House.

Belhare (see Kiranti languages) Bengali (Bangla) (IAr) Bayer, Josef 1996 Directionality and Logical Form: On the scope of focussing particles and wh-in-situ. Dordrecht: Kluwer. Bhattacharja, Shishir 2007 Word formation in Bengali: A whole word morphological description and its theoretical implications. München: LINCOM. Bhattacharya, Tanmoy 1999 The structure of the Bangla DP. University College, London, PhD dissertation. Chatterji, Suniti Kumar 1926 The origin and development of the Bengali language. 3 vols. Calcutta: Calcutta University Press. Reprinted 1970, London: Allen & Unwin; distributed by Motilal Banarsidass, Delhi. Dasgupta, Probal 1980 Questions and relative and complement clauses in a Bangla grammar. New York University PhD dissertation. Dash, Niladri Sekhar 2009 Corpus based analysis of the Bengali language. Saarbrücken: VDM Publications. David, Anne Boyle 2015 Descriptive grammar of Bangla, ed. by Thomas J. Conners and Dustin Chacón. Berlin/New York: de Gruyter Mouton. Forbes, Duncan 1861 Grammar of the Bengali language. London: Sampson Low, Marston & Co. Singh, Udaya Narayana 1986 A bibliography of Bengali linguistics. Mysore: Central Institute of Indian Languages. Singh, Udaya Narayana, and Maniruzzaman 1983 Diglossia in Bangladesh and language planning. Kolkata: Gyan Bharati. Thompson, Hanne-Ruth 2012 Bengali. Amsterdam/Philadelphia: Benjamins.

Bhili (IAr) Kulkarni, S. B. 1976 Bhili of Dangs. Poona: Deccan College.

Bhumij (Munda, AA) Ramaswami, N. 1992 Bhumij grammar. Mysore: Central Institute of Indian Languages.

Bhojpuri (IAr) Shukla, Shaligram 1981 Bhojpuri grammar. Washington, DC: Georgetown University Press. Tiwari, Udai Narain 1960 The origin and development of Bhojpuri. Calcutta: The Asiatic Society. Repr. 2001, Kolkata: Asiatic Society.

Bonda (see Remo) Brahui (Dr) Andronov, Mikhail S. 2001 A grammar of the Brahui language in comparative treatment. München: LINCOM. Andronov, Mikhail S. 2006 Brahui, a Dravidian language: A descriptive and comparative study. München: LINCOM. Bray, Denys de S. 1909 The Brahui language, Part I: Introduction and grammar. Calcutta: Superintendent, Government Printing. Repr. 1977, Quetta: The Brahui Academy. Bray, Denys de S. 1934a The Brahui language, Part II: The Brāhūī Problem. Delhi: Manager of Publications. Repr. 1978, Quetta: The Brahui Academy. Bray, Denys de S. 1934b The Brahui language, Part III: Brahui etymological dictionary. Delhi: Manager of Publications. Repr. 1978, Quetta: The Brahui Academy. Emeneau, Murray B. 1962 Brahui and Dravidian comparative grammar. Berkeley/Los Angeles: University of California Press. McAlpin, David W. 2015 Brahui and the Zagrosian hypothesis. Journal of the American Oriental Society 135(3): 551–586. McAlpin, David W. Forthcoming Modern colloquial Eastern Elamite. Trumpp, Ernst 1880 Grammatische Untersuchungen über die Sprache der Brāhūīs. (Sitzungsberichte der Bayerischen Akademie der Wissenschaften, Philosophisch-Philologische und Historische Klasse 6.) München: Bayerische Akademie der Wissenschaften.

Braj (IAr) McGregor, Ronald Stuart 1968 The language of Indrajit of Orcha. Cambridge: Cambridge University Press. Varma, Dhirendra 1935 La langue braj: dialecte de Mathurā. Paris: Adrien Maisonneuve.

Brokskat (Brokpa) (see Dardic) Bundeli (IAr) Jaiswal, M. P. 1962 A linguistic study of Bundeli. Leiden: Brill.

Camling (see Kiranti languages) Chepang (TB) Caughley, Ross 2000 Dictionary of Chepang: A Tibeto-Burman language of Nepal. Canberra: Pacific Linguistics.

Chin (see Kuki-Chin) Coorg (see Kodagu) Dameli (see Dardic) Dari (see Persian in Afghanistan) Dardic (IAr) Baart, Joan L. G. 1997 The sounds and tones of Kalam Kohistani. (Studies in Languages of Northern Pakistan, 1.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics. Baart, Joan 1999 A sketch of Kalam Kohistani grammar. Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University. http://www.academia.edu/1992272/A_Sketch_of_Kalam_Kohistani_grammar (accessed 25 November 2014) Bailey, T. Grahame 1924 Grammar of the Shina language. London: The Royal Asiatic Society. Bashir, Elena 1988 Topics in Kalasha syntax: An areal and typological perspective. University of Michigan PhD dissertation. ProQuest Dissertations 8821545. Buddruss, Georg 1959 Beiträge zur Kenntnis der Pašai-Dialekte. Wiesbaden: Steiner. Buddruss, Georg 1960 Die Sprache von Woṭapūr und Kaṭārqalā: Linguistische Studien im afghanischen Hindukusch [The language of Woṭapūr and Kaṭārqalā: Linguistic studies in the Afghan Hindukush]. (Bonner Orientalistische Studien, neue Serie 9.) Bonn: Orientalisches Seminar der Universität Bonn. Buddruss, Georg 1967 Die Sprache von Sau in Ostafghanistan: Beiträge zur Kenntnis des dardischen Phalūra [The language of Sau in eastern Afghanistan: Contributions to the knowledge of the Dardic Phalūra]. München: Kitzinger.

Buddruss, Georg 1982 Khowar-Texte in arabischer Schrift. (Akademie der Wissenschaften und der Literatur, Abhandlungen der Geistes- und Sozialwissenschaftlichen Klasse 1.) Wiesbaden: Steiner. Decker, Kendall D. 1992 Languages of Chitral. (Sociolinguistic survey of northern Pakistan, 5.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics. Edelman, D(žoi) I. 1983 The Dardic and Nuristani languages. Moscow: Nauka. Edelʹman, Džoi I. 1999 Dardskie i nuristanskie jazyki. (Jazyki Mira, 7.) Moskva: Indrik. Fussman, Gérard 1972 Atlas linguistique des parlers dardes et kafirs. (Publications de lʼÉcole Française dʼExtrême-Orient, 86.) Paris: Adrien-Maisonneuve. Grierson, George A., and Aurel Stein 1929 Torwali: an account of a Dardic language of the Swat Kohistan. London: Royal Asiatic Society. Hallberg, Daniel G., and Calinda E. Hallberg 1999 Indus Kohistani: A preliminary phonological and morphological analysis. (Studies in Languages of Northern Pakistan, 8.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics. Lehr, Rachel 2014 A descriptive grammar of Pashai: The language and speech community of Darrai Nur. University of Chicago PhD dissertation. ProQuest Dissertations 3638612. Liljegren, Henrik 2008 Towards a grammatical description of Palula: An Indo-Aryan language of the Hindukush. University of Stockholm PhD dissertation. su.diva-portal.org/smash/get/diva2:198468/FULLTEXT01 (accessed 25 November 2014) Liljegren, Henrik 2011 Palula vocabulary. Islamabad: Forum for Language Initiatives. Lunsford, Wayne A. 2001 An overview of linguistic structures in Torwali, a language of Northern Pakistan. University of Texas at Arlington MA thesis. Mørch, Ida Elisabeth, and Jan Heegård 1997 Retroflekse vokalers oprindelse i kalashamon i historisk og areallingvistisk perspektiv [The origin of retroflex vowels in Kalashamon in a historical and areal linguistic perspective]. University of Copenhagen MA thesis. Morgenstierne, Georg 1973a Indo-Iranian frontier languages, Vol. III, The Pashai language, Part I, Grammar, 2nd ed. Oslo: Universitetsforlaget. Morgenstierne, Georg 1973b Indo-Iranian frontier languages, Vol. IV, The Kalasha language, 2nd ed. Oslo: Universitetsforlaget.

Morgenstierne, Georg 1941 Notes on Phalûṛa: An unknown Dardic language of Chitral. (Skrifter utgitt av det Norske Videnskaps-Akademi i Oslo, Hist.-Fil. Klasse, 1940: 5.) O’Brien, D. J. T. 1895 Grammar and vocabulary of the Khowar dialect (Chitrali). Lahore: Civil & Military Gazette Press. Perder, Emil 2013 A grammatical description of Dameli. University of Stockholm PhD dissertation. Peterson, Jan Heegård 2006 Local case-marking in Kalasha. University of Copenhagen PhD dissertation. Radloff, C. F. 1999 Aspects of the sound system of Gilgit Shina. (Studies in Languages of Northern Pakistan, 4.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics. Ramaswami, N. 1982 Brokskat grammar. Mysore: Central Institute of Indian Languages. Schmidt, Ruth Laila, and Omkar N. Koul 1983 Kohistani to Kashmiri: An annotated bibliography of Dardic languages. Patiala: Indian Institute of Language Studies. Schmidt, Ruth Laila, and Razwal Kohistani 2008 A grammar of the Shina language of Indus Kohistan. Wiesbaden: Harrassowitz. Trail, Ronald L., and Gregory R. Cooper 1999 Kalasha dictionary, with English and Urdu. Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics. http://fli-online.org/ (accessed 20 Dec. 2014) Varma, Siddheshwar 1978 Dardic or Pisacha languages: A linguistic analysis. Hoshiarpur: Punjab University. Yun, Ju-Hong 2003 Pashai Language Development Project: Promoting Pashai language, literacy and community development. http://www.sil.org/asia/ldc/parallel_papers/ju-hong_yun.pdf (accessed 29 Nov. 2014) Zoller, Claus Peter 2005 A grammar and dictionary of Indus Kohistani. 1: Dictionary. Berlin/New York: Mouton de Gruyter.

Darma (TB) Willis, Christina Marie 2007 A descriptive grammar of Darma: An endangered Tibeto-Burman language. University of Texas, Austin, PhD dissertation. ProQuest Dissertations 3324629.

Dhimal (TB) King, John 2009 A grammar of Dhimal. Leiden: Brill.

Dhivehi (Maldivian) (IAr) Cain, Bruce D. 2000 Dhivehi (Maldivian): A synchronic and diachronic study. Cornell University PhD dissertation. Cain, Bruce D., and James W. Gair 2000 Dhivehi (Maldivian). München: LINCOM. Fritz, Sonia 2002 The Dhivehi language: A descriptive and historical grammar of Dhivehi and its dialects, 2 vols. Würzburg: Ergon Verlag. Geiger, Wilhelm 1919 Máldivian linguistic studies. (Journal of the Ceylon Branch of the Royal Asiatic Society 27.) Colombo: H. C. Cottle, Govt. Printer. Gnanadesikan, Amalia In Press A descriptive grammar of Dhivehi. Berlin/New York: de Gruyter Mouton. Wijesundera, Stanley, G. D. Wijayawardhana, J. B. Disanayaka, Hassan Ahmed Maniku, and Mohamed Luthufee 1988 Historical and linguistic survey of Dhivehi: Final Report. MS, University of Colombo.

Dumaki (IAr) Lorimer, David Lockhart Robinson 1939 The Ḍumāki language. Nijmegen: Dekker & van de Vegt.

Dumi (see Kiranti languages) Dzongkha (TB) van Driem, George, with Karma Tshering 1998 Dzongkha. (Languages of the Greater Himalayan Region 1). Leiden: Research School CNWS.

English (Butler English; see Pidgins, creoles, and other contact languages) English (South Asian English) Agnihotri, Ramakant, and Rajendra Singh (eds.) 2012 Indian English: Towards a new paradigm. Hyderabad: Orient BlackSwan. Baumgardner, Robert J. 1993 The English language in Pakistan. Karachi: Oxford University Press. Baumgardner, Robert J. 1996 South Asian English: Structure, use, and users. Urbana: University of Illinois Press. Dasgupta, Probal 1993 The otherness of English: India’s auntie tongue syndrome. New Delhi: Sage. Hundt, Marianne, and Devyani Sharma (eds.) 2014 English in the Indian diaspora. Amsterdam/Philadelphia: Benjamins.

Kachru, Braj B. 1983 The Indianization of English: The English language in India. Delhi: Oxford University Press. Lange, Claudia 2012 The syntax of spoken Indian English. Amsterdam/Philadelphia: Benjamins. Mansoor, Sabiha 1993 Punjabi, Urdu, English in Pakistan: A sociolinguistic study. Lahore: Vanguard. Prabhakar Babu, B. A. 1974 A phonological study of English spoken by Telugu speakers in Andhra Pradesh. Hyderabad: Osmania University. Sailaja, Pingali 2009 Indian English. Edinburgh: Edinburgh University Press.

Gadaba (Dr) Bhaskararao, Peri 1980 Koṇekor Gadaba: A Dravidian language. Poona: Deccan College.

Galo (TB) Post, Mark 2007 A grammar of Galo. LaTrobe University PhD dissertation.

Garo (TB) Burling, Robbins 1961 A Garo grammar. Poona: Deccan College Postgraduate and Research Institute. Phillips, E. G. 1904 Outline grammar of the Garo language. Shillong: Assam Secretariat Press.

Gondi (Dr) Burrow, Thomas, and Sudhibushan Bhattacharya 1960 A comparative vocabulary of the Gondi dialects. Calcutta: The Asiatic Society. Mitchell, A. N. 1942 A grammar of Maria Gondi, as spoken by the Bison Horn or Dandami Marias of Bastar State. Jagdalpur: Bastar State Press. Moss, Clement F. 1950 An introduction to the grammar of the Gondi language. Jabbalpore: Mission Press. Rao, Garapati U. 2008 A comparative grammar of the Gondi dialects: With special reference to phonology and morphology. Kuppam: Dravidian University. Subrahmanyam, P. S. 1968 A descriptive grammar of Goṇḍi. Annamalai: Annamalai University. Williamson, Henry Drummond 1890 Gondi grammar and vocabulary. London: Society for Promoting Christian Knowledge.

Gorum (Munda, AA) Zide, Arlene R. K. 1979 A reconstruction of Sora-Gorum morphology. University of Chicago PhD dissertation. Zide, Arlene R. K. n.d. A Gorum-English lexicon. Unpublished MS, Chicago.

Gujarati (IAr) Cardona, George 1965 A Gujarati reference grammar. Philadelphia: University of Pennsylvania Press. Doctor, Raimond 2004 A grammar of Gujarati. München: LINCOM. Modi, Bharati 2013 Some issues in Gujarati phonology. München: LINCOM. Tisdall, William St. Clair 1892 A simplified grammar of the Gujarati language, together with a short reading book and vocabulary. London: Kegan Paul, Trench, Trübner & Co. https://archive.org/details/simplifiedgramma00tisdiala (accessed 18 March 2015)

Gujari (IAr) Rensch, Calvin R., Calinda E. Hallberg, and Clare F. O’Leary 1992 Hindko and Gujari. (Sociolinguistic Survey of Northern Pakistan, 3.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics.

Hazara(gi) (see Persian in Afghanistan) Hindi (IAr) Aggarwal, Narinder K. 1985 A bibliography of studies on Hindi language and linguistics, 2nd ed. Gurgaon: Indian Document Service. Bahl, Kali Charan 1974 Studies in the semantic structure of Hindi. Delhi: Motilal Banarsidass. (Actually the report of the first corpus study of Hindi complex predicates, nouns with karnā.) Bahl, Kali Charan 1979 Studies in the semantic structure of Hindi, 2. Delhi: Manohar Publications. Bains, Gurprit 1989 Complex structures in Hindi-Urdu: Explorations in Government and Binding theory. New York University PhD dissertation. Barz, Richard K., and Jeff Siegel (eds.) 1988 Languages transplanted: The development of overseas Hindi. Wiesbaden: Harrassowitz. Das, Pradeep Kumar 2006 Grammatical agreement in Hindi-Urdu and its major varieties. München: LINCOM.

Dayal, Veneeta 1996 Locality in wh-quantification: Questions and relative clauses in Hindi. Dordrecht: Kluwer. Dwivedi, Veena Dhar 1994 Syntactic dependencies and relative phrases in Hindi. University of Massachusetts PhD dissertation. Dyrud, Lars O. 2001 Hindi-Urdu: Stress accent or non-stress accent? University of North Dakota PhD dissertation. Gambhir, Vijay 1981 Syntactic restrictions and discourse functions of word order in Standard Hindi. University of Pennsylvania PhD dissertation. Hook, Peter Edwin 1974 The compound verb in Hindi. Ann Arbor: Center for South and Southeast Asian Studies, University of Michigan. Kachru, Yamuna 1966 An introduction to Hindi syntax. Urbana: University of Illinois, Department of Linguistics. Kachru, Yamuna 1968 Studies in a transformational grammar of Hindi. Dhanbad: East West Books. Kachru, Yamuna 2008 Hindi. Amsterdam/Philadelphia: Benjamins. Kelkar, Ashok Ramchandra 1968 Studies in Hindi-Urdu. Pune: Deccan College. Kellogg, Samuel Henry 1876 A grammar of the Hindi language. Allahabad: American Presbyterian Mission Press. 2nd ed. 1893, London: Routledge & Kegan Paul. Repr. 1972, New Delhi: Oriental Books Reprint Corp. Kidwai, Ayesha 2000 XP-adjunction in universal grammar: Scrambling and binding in Hindi-Urdu. (Oxford studies in comparative syntax.) New York: Oxford University Press. Koul, Omkar N. 2009 Modern Hindi grammar. Delhi: Indian Institute of Language Studies. Lakshmibai, B. 1973 A case grammar of Hindi. Agra: Central Institute of Hindi. Lienhard, Siegfried 1961 Tempusgebrauch und Aktionsartenbildung in der modernen Hindī. Stockholm: Almqvist & Wiksell. Mahajan, Anoop Kumar 1990 The A/A-bar distinction and movement theory. MIT PhD dissertation. (Distributed by MIT Working Papers in Linguistics.) Manetta, Emily 2011 Peripheries in Kashmiri and Hindi-Urdu: The syntax of discourse-driven movement. Amsterdam/Philadelphia: Benjamins. McGregor, Ronald Stuart 1972 Outline of Hindi grammar: With exercises. Oxford: Oxford University Press.

McGregor, Ronald Stuart 1993 The Oxford Hindi dictionary. Oxford: Oxford University Press. [The definitive dictionary with etymologies and word senses.] Mohanan, Tara 1995 Argument structure in Hindi. Stanford: CSLI. (Stanford University PhD dissertation, 1990.) Montaut, Annie 1991 Aspects, voix et diathèses en hindi moderne: Syntaxe, sémantique, énonciation. Louvain: Peeters. Montaut, Annie 2004 A grammar of Hindi. München: LINCOM. Nespital, Helmut 1997 Dictionary of Hindi verbs. Allahabad: Lokbharati Prakashan. [Comprehensive listing of Hindi verbs with the vector verbs which introduce different senses, with examples.] Nespital, Helmut 1997 Hindī kriyā-koś/Dictionary of Hindi verbs. Allahabad: Lokbharati. Ohala, Manjari 1983 Aspects of Hindi phonology. Delhi: Motilal Banarsidass. Platts, John T. 1884 A dictionary of Urdu, Classical Hindi, and English. London: W. H. Allen & Co. Repr. 1965, Oxford: Oxford University Press. Poornima, Shakthi 2012 Hindi aspectual complex predicates at the syntax-semantics interface. State University of New York, Buffalo, PhD dissertation. Shukla, Shaligram 2000 Hindi phonology. München: LINCOM. Shukla, Shaligram 2001 Hindi morphology. München: LINCOM. Saxena, Anuradha 1979 Grammar of Hindi causatives. UCLA PhD dissertation. Singh, Rajendra, and Ramakant Agnihotri 1997 Hindi morphology: A word based description. Delhi: Motilal Banarsidass. Srivastava, Dayanand 1970 Historical syntax of Early Hindi prose (1800–1850 A. D.) Part I: Syntax of the cases. Calcutta: Atima Prakashan. Srivastav, Veneeta 1991 WH dependencies in Hindi and the theory of grammar. Cornell University PhD dissertation. Subbarao, Karumuri V. 1984 Complementation in Hindi syntax. Delhi: Academic Publications.

Hindko (IAr) Bahri, Hardev 1962 Lahndi phonology, with special reference to Awáṇkári. Allahabad: Bharati Press.

Bahri, Hardev 1963 Lahndi phonetics, with special reference to Awáṇkári. Allahabad: Bharati Press. Rensch, Calvin R., Calinda E. Hallberg, and Clare F. O’Leary 1992 Hindko and Gujari. (Sociolinguistic Survey of Northern Pakistan, 3.) Islamabad: National Institute of Pakistan Studies, Quaid-i-Azam University/Summer Institute of Linguistics. Toker, Halil 2014 A practical guide to Hindko grammar. Bloomington, IN: Trafford Publishing (www.trafford.com).

Ho (Munda, AA) Burrows, Lionel 1915 Ho grammar with vocabulary: An Eastern Himalayan dialect. Calcutta: Catholic Orphan Press. Repr. 1980, New Delhi: Cosmo Publications. Deeney, John 1975 Ho grammar. Chaibasa: Xavier Ho Publications.

Indo-Persian Abidi, S. A. H., and Ravinder Gargesh 2008 Persian in South Asia. In: Braj B. Kachru, Yamuna Kachru, & S. N. Sridhar (eds.), Language in South Asia, 103–120. Cambridge: Cambridge University Press. Baevskiĭ, Solomon I. 2007 Early Persian lexicography: Farhangs of the eleventh to the fifteenth centuries, transl. by N. Killian, revised and updated by John R. Perry. (Languages of Asia, no. 6). Folkestone, UK: Global Oriental. Dudney, Arthur Dale 2013 A desire for meaning: Ḳhān-i Ārzū’s philology and the place of India in the eighteenth-century Persianate world. Columbia University PhD dissertation. ProQuest Dissertation 3595479. Gaborieau, Marc 1994 Late Persian, early Urdu: The case of “Wahhabi” literature (1818–1857). In: Françoise “Nalini” Delvoye (ed.), Confluence of cultures: French contributions to Indo-Persian studies, 170–196. New Delhi/Teheran: Centre for Human Sciences, Inst. français de recherche en Iran. Ghani, Muhammad Abdul 1994 Pre-Mughal Persian in Hindustan. In 2 vols. Gurgaon: Vintage Books. Original edition 1941. Hadi, Nabi 1995 Dictionary of Indo-Persian literature. New Delhi: Indira Gandhi National Centre for the Arts. Hakala, Walter Nils 2010 Diction and dictionaries: Language, literature, and learning in Persianate South Asia. University of Pennsylvania PhD dissertation. ProQuest 3447480 (accessed 4 April 2015).

Hakala, Walter Nils 2015 On equal terms: The equivocal origins of an early Mughal Indo-Persian vocabulary. Journal of the Royal Asiatic Society 25.2: 209–227. Jones, William 1771 A grammar of the Persian language. London: W. & J. Richardson. Perry, John R. 1996 Persian during the Safavid period: Sketch for an etat de langue. In: Charles Melville (ed.), Safavid Persia: The history and politics of an Islamic society, 269–283. London: I. B. Tauris. Phillott, Douglas Craven 1919 Higher Persian grammar for the use of the Calcutta University, showing differences between Afghan and modern Persian, with notes on rhetoric. Calcutta: Calcutta University. Sharma, Shri Ram 1982 A descriptive bibliography of Sanskrit works in Persian. Hyderabad (India): Abul Kalam Azad Oriental Research Institute. Spooner, Brian, and William L. Hanaway 2012 Literacy in the Persianate world: Writing and the social order. Philadelphia: University of Pennsylvania Press/University of Pennsylvania Museum of Archaeology and Anthropology. Windfuhr, Gernot L. 1979 Persian grammar: History and state of its study. The Hague/Paris/New York: Mouton. Ziauddin, Muhammad (ed.) 1935 A grammar of the Braj Bhakha: The Persian text critically edited from original MSS., with an introd., translation, and notes, together with the contents of the Tuḥfatu-l-Hind. Trans. by M. Ziauddin, with a foreword by Suniti Kumar Chatterji. Calcutta: Visva-Bharati Book-Shop.

Indo-Portuguese (see Pidgins, creoles, and other contact languages) Irula (Dr) Diffloth, Gérard 1968 The Irula language: A close relative of Tamil. UCLA PhD dissertation. Perialwar, R. 1978a Irula phonology with vocabulary. Annamalainagar: Annamalai University. Perialwar, R. 1978b A grammar of the Irula language. Annamalainagar: Annamalai University. Zvelebil, Kamil V. 1973 The Irula language. Wiesbaden: Harrassowitz. Zvelebil, Kamil V. 1979 The Irula (ërla) language, Part 2. Wiesbaden: Harrassowitz. Zvelebil, Kamil V. 1982 The Irula (ërla) language, Part 3: Irula lore, texts and translations. Wiesbaden: Harrassowitz.

Jarawa (And) Kumar, Pramod 2012 Descriptive and typological study of Jarawa. Jawaharlal Nehru University PhD dissertation.

Jero (see Kiranti languages) Juang (Munda, AA) Matson, Dan Mitchell 1964 A grammatical sketch of Juang. University of Wisconsin, Madison, PhD dissertation. Patnaik, Manideepa 2000 Aspects of Juang syntax. University of Delhi PhD dissertation. Patnaik, Manideepa In Press Grammatical sketch of Juang. München: LINCOM. Pinnow, Heinz-Jürgen 1960 Beiträge zur Kenntnis der Juang-Sprache. Unpublished MS.

Kalasha (see Dardic) Kalaṣa-alâ (Nur) Degener, Almuth 1998 Die Sprache von Nisheygram im afghanischen Hindukusch. Wiesbaden: Harrassowitz.

Kangri (IAr) Eaton, Robert D. 2008 Kangri in context: An areal perspective. University of Texas, Arlington, PhD dissertation.

Kannada (Dr) Andronov, Mikhail 1969 The Kannada language. Moscow: Nauka. Gai, G. S. 1946 Historical grammar of Old Kannaḍa, based entirely on the Kannaḍa inscriptions of the 8th, 9th, and 10th centuries AD. Poona: Deccan College. Hiremath, R. G. 1980 The structure of Kannada. Dharwad: Prasaranga. Kittel, Ferdinand 1903 A grammar of the Kannada language. Mangalore: Basel Mission. Repr. 1982, New Delhi: Asian Educational Services. Kulli, J. S. 1991 History of grammatical theories in Kannada. Thiruvananthapuram: International School of Dravidian Linguistics.

Nadkarni, M. V. 1970 NP embedded structures in Kannada and Konkani. UCLA PhD dissertation. Nayak, Harōgadde Mānappa 1967 Kannada: Literary and colloquial: A study of two styles. Mysore: Rao and Raghavan. Rau, Nalini 2007 Verb agreement in Kannada: A constraint based account. University of Illinois PhD dissertation. Sridhar, S. N. 1990 Kannada: Descriptive grammar. London: Routledge.

Kashmiri (IAr) Bhatt, Rakesh M. 1999 Verb Movement and the syntax of Kashmiri. London: Kluwer. Grierson, George 1911 Manual of the Kâshmiri language, comprising grammar, phrase book, and vocabularies, 2 vols. Oxford: Oxford University Press. Grierson, George 1932 A dictionary of the Kashmiri language. Calcutta: Asiatic Society of Bengal. Kachru, Braj B. 1969 A reference grammar of Kashmiri. Urbana: Department of Linguistics, University of Illinois. Kaul, Vijay Kumar 2006 Compound verbs in Kashmiri. Delhi: Indian Institute of Language Studies. Koul, Ashok K. 2008 Lexical borrowings in Kashmiri. New Delhi: Indian Institute of Language Studies. Koul, Omkar N., and Kashi Wali 2006 Modern Kashmiri grammar. Hyattsville: Dunwoody Press. Koul, Omkar N., and Peter E. Hook (eds.) 1984 Aspects of Kashmiri linguistics. New Delhi: Bahri. Koul, Omkar N., and Ruth Laila Schmidt 1983 Kashmiri: A sociolinguistic survey. Patiala: Indian Institute of Language Studies. Manetta, Emily 2011 Peripheries in Kashmiri and Hindi-Urdu: The syntax of discourse-driven movement. Amsterdam/Philadelphia: Benjamins. Wali, Kashi, and Omkar N. Koul 1997 Kashmiri: A cognitive-descriptive grammar. London/New York: Routledge. Reprinted 2010.

Kaṭārqalā (see Dardic) Kati (Nur) Grjunberg, Aleksandr Leonovič 1980 Jazyk Kati: Teksty, grammatičeskij očerk [The Kati language: texts, grammatical account]. (Jazyki vostočnogo Gindukuša [Languages of the Eastern Hindukush].) Moskva: Nauka.


Mohammad, Jan 1991 Causative constructions in Kati, a Nuristani language of Afghanistan. Ohio University MA thesis.

Kham (TB) Watters, David E. 2002 A grammar of Kham. Cambridge/New York: Cambridge University Press.

Khamti (see Daic) Kharia (Munda, AA) Banerjee, G. C. 1894 An introduction to the Kharia language. Calcutta: Bengal Secretariat Press. Biligiri, H. S. 1965 Kharia: Phonology, grammar and vocabulary. Poona: Deccan College. Malhotra, Veena 1982 The structure of Kharia: A study in linguistic typology and change. Jawaharlal Nehru University PhD dissertation. Peterson, John M. 2006 Kharia: A South Munda language, 3 vols. Habilitationsschrift, Universität Osnabrück. Published 2011, Leiden: Brill. Pinnow, Heinz-Jürgen 1959 Versuch einer Lautlehre der Kharia-Sprache. Wiesbaden: Harrassowitz. Rehberg, Kerstin 2003 Phonologie des Kharia: Prosodische Strukturen und segmentales Inventar. Magister-These, Universität Osnabrück.

Khasi (AA) Bars, Rev. E 1973 Khasi-English Dictionary. Shillong: Don Bosco. Gabelentz, H. C. von der 1858 Grammatik und Wörterbuch der Khassia-Sprache. Verhandlungen der königlichen Gesellschaft der Wissenschaften zu Leipzig, philologisch-historische Klasse 10: 1–66. Nagaraja, K. S. 1985 Khasi: A descriptive analysis. Pune: Deccan College. Rabel, Lucy 1961 Khasi: A language of Assam. Baton Rouge: Louisiana State University Press. Roberts, H. 1891 A grammar of the Khassi language. London: Kegan Paul, Trench, Trübner and Co.


Khowar (see Dardic) Kinnauri (TB) Saxena, Anju 1992 Finite verb morphology in Tibeto-Kinnauri. University of Oregon PhD dissertation. Sharma, Devi D. 1988 A descriptive grammar of Kinnauri. Delhi: Mittal.

Kiranti languages (TB) Bickel, Balthasar 1996 Aspect, mood, and time in Belhare: Studies in the semantics-pragmatics interface of a Himalayan language. Zürich: ASAS-Verlag. Ebert, Karen H. 1994 The structure of Kiranti languages. Zürich: ASAS-Verlag. Ebert, Karen H. 1997a Athpare grammar. München: LINCOM. Ebert, Karen H. 1997b Camling. München: LINCOM. Ebert, Karen H. 2000 Camling texts and glossary. München: LINCOM. Opgenort, Jean Robert 2004b A grammar of Wambule. Leiden: Brill. Opgenort, Jean Robert 2005 A grammar of Jero: With a historical comparative study of the Kiranti languages. Leiden: Brill. Rapacha, Lal-Shyakarelu 2008 Indo-Nepal Kiranti bhashaharu: Vigat, samakalin parivesh ra bholika chunautiharu. [Indo-Nepal Kiranti languages: Past, contemporary scenario and future challenges]. Kathmandu: Research Institute for Kirãtology. Rutgers, Roland 1998 Yamphu: Grammar, texts, and lexicon. Leiden: Research School CNWS. Schackow, Diana 2008 Clause linkage in Puma (Kiranti). Magisterarbeit, Universität Leipzig. van Driem, George 1987 A grammar of Limbu. Berlin/New York: Mouton de Gruyter. van Driem, George 1997 A grammar of Dumi. Berlin/New York: Mouton de Gruyter.

Kodagu (Kodava, Coorg) (Dr) Balakrishnan, R. 1976 Phonology of Kodagu with vocabulary. Annamalainagar: Annamalai University. Balakrishnan, R. 1977 A grammar of Kodagu. Annamalainagar: Annamalai University.


Cole, Robert Andrews 1867 An elementary grammar of the Coorg language. Bangalore: Wesleyan Mission Press. Ebert, Karen H. 1996 Koḍava. München: LINCOM.

Kohistani (see Dardic) Kolami (Dr) Emeneau, Murray B. 1955 Kolami: A Dravidian language. Berkeley/Los Angeles: University of California Press. Sethumadhava Rao, P. 1950 A grammar of the Kolami language. Hyderabad: The Co-operative Press. Thomasiah, K. 1986 Naikri dialect of Kolami: Descriptive and comparative study. Annamalai University PhD dissertation.

Konda (Dr) Krishnamurti, Bhadriraju 1969 Konda or Kūbi: A Dravidian language. Hyderabad: Tribal Cultural Research and Training Institute.

Konkani (IAr) Katre, Sumitra Mangesh 1966 The formation of Konkani. Poona: Deccan College. Madtha, William 1984 The Christian Konkani of South Kanara: A linguistic analysis. Dharwad: Karnatak University. Maffei, Angelus Francis Xavier 1882 A Konkani grammar. Mangalore: Basel Mission Press. Nadkarni, M. V. 1970 NP embedded structures in Kannada and Konkani. UCLA PhD dissertation.

Koraga (Dr) Bhat, D. N. S. 1971 The Koraga language. Poona: Deccan College Postgraduate and Research Institute. Shetty, Ramakrishna T. 2008 Koraga grammar. Kuppam: Dravidian University.


Korku (Munda, AA) Drake, John 1903 A grammar of the Kurku language. Calcutta: Baptist Mission Press. Nagaraja, K. S. 1999 Korku language: Grammar, texts, and vocabulary. Tokyo: Institute for the Study of the Languages and Cultures of Asia and Africa. Zide, Norman H. 1960 Korku phonology and morphophonemics. University of Pennsylvania PhD dissertation.

Kota (Dr) Emeneau, Murray B. 1944–1946 Kota texts, I–IV. (University of California Publications in Linguistics 2: 1–191.) Berkeley/Los Angeles: University of California Press. Subbaiah, G. 1985 A grammar of Kota. Annamalainagar: Annamalai University.

Koya (Dr) Tyler, Stephen S. Koya: An outline grammar (Gommu dialect). Berkeley/Los Angeles: University of California Press.

Kūbi (see Konda) Kui (Dr) Friend-Perreira, J. E. 1909 A grammar of the Kui language. Calcutta: Bengal Secretariat Book Depot. Maheswaran, C. 2008 A descriptive grammar of the Kui language. Kuppam: Dravidian University. Winfield, W. W. 1928 A grammar of the Kui language. Calcutta: Asiatic Society of Bengal. Winfield, W. W. 1929 A vocabulary of the Kui language. Calcutta: Asiatic Society of Bengal.

Kuki-Chin (TB) VanBik, Kenneth 2006 Proto-Kuki-Chin: A reconstructed ancestor of the Kuki-Chin languages. University of California, Berkeley, PhD dissertation.

Kurmali (IAr) Mahto, Panchanan 1989 On the nature of empty pronominals. Central Institute of English and Foreign Languages PhD dissertation.


Kurtöp (TB) Hyslop, Gwendolyn 2011 A grammar of Kurtöp. University of Oregon PhD dissertation.

Kuṛux (Kurukh) (Dr) Grignard, A. 1924 A grammar of the Oraon language. Calcutta/Rome: Catholic Orphan Press. Hahn, Ferd[inand] 1911 Grammar of the Kurukh language. Calcutta: Bengal Secretariat Press. Repr. 1985, Delhi: Mittal. Pfeiffer, Martin 1972 Elements of Kuṛux historical phonology. Leiden: Brill. Vesper, Don 1971 Kurukh syntax with special reference to the verbal system. University of Chicago PhD dissertation. ProQuest Dissertations T-22543.

Kuvi (Dr) Israel, M. 1964 A grammar of the Kuvi language (with texts and vocabulary). Trivandrum: Dravidian Linguistics Association. Reddy, Joy 1979 Kuwi grammar. Mysore: Central Institute of Indian Languages.

Lahnda (see Hindko and Saraiki) Lepcha (TB) Plaisier, Heleen 2007 A grammar of Lepcha. Leiden: Brill. Támsáng, Khárpú 1978 A grammar of the Lepcha language. Kalimpong: Lyangson Tamsang. Támsáng, Khárpú 1980 Lepcha-English encyclopedic dictionary. Kalimpong: Mayel Clymit Tamsang. Repr. 2009.

Limbu (see Kiranti languages) Magar (TB) Grunow-Hårsta, Karen 2008 A descriptive grammar of two Magar dialects of Nepal: Tanahu and Syangja Magar. University of Wisconsin, Milwaukee, PhD dissertation.


Maithili (IAr) Jha, Subhadra 1958 The formation of the Maithili language. London: Luzac. Mishra, Mithilesh 2006 The syllable structure and stress patterns of the Maithili language. University of Illinois PhD dissertation. Singh, Udaya Narayana 1979 Some aspects of Maithili syntax: A transformational-generative approach. University of Delhi PhD dissertation. Yadav, Ramawatar 1984 Maithili phonetics and phonology. Mainz: Selden & Tamm. Yadav, Ramawatar 1996 A reference grammar of Maithili. Berlin/New York: Mouton de Gruyter. Yadava, Yogendra P. 1998 Issues in Maithili syntax. München: LINCOM.

Malay (see Pidgins, creoles, and other contact languages) Malayalam (Dr) Andronov, Michail S. 1996 A grammar of the Malayalam language in historical treatment. Wiesbaden: Harrassowitz. Asher, Ronald E., and T. C. Kumari 1997 Malayalam. London/New York: Routledge. Ezuttaccan, K. N. 1975 History of grammatical theories of Malayalam. Thiruvananthapuram: Dravidian Linguistics Association. Jayaseelan, K. A. 1999 Parametric studies in Malayalam. New Delhi: Allied Publishers. Nizar, Milla 2010 Dative subject constructions in South-Dravidian languages. University of California, Berkeley, Undergraduate Honors thesis. Ramaswami Aiyar, L. V. 1936 The evolution of Malayalam morphology. Ernakulam: Cochin Government Press. Sadanand, Suchitra 1999 Malayalam phonology: An optimality-theoretic approach. University of Southern California PhD dissertation.

Maldivian (see Dhivehi) Malto (Maler) (Dr) Das, Sisir Kumar 1973 Structure of Malto. Annamalainagar: Annamalai University. Droese, Ernest 1884 Introduction to the Malto language. Agra: Secundra Orphanage Press.


Kobayashi, Masato 2012 Texts and grammar of Malto. Vizianagaram: Kotoba Books. Mahapatra, B. P. 1979 Malto: An ethno-semantic study. Mysore: Central Institute of Indian Languages. Mahapatra, B. P. 1987 Malto-Hindi-English dictionary. Mysore: Central Institute of Indian Languages.

Manda (Dr) Reddy, Ramakrishna B. 2009 Manda-English dictionary. Mysore/Vadodara: Central Institute of Indian Languages/Bhasha Research and Publication Centre.

Manipuri (see Meithei) Marathi (IAr) Bloch, Jules 1970 The formation of the Marathi language. (English translation of original 1920 edition, by Dev Raj Chanana.) Delhi: Motilal Banarsidass. Dhongde, Ramesh Vaman, and Kashi Wali 2009 Marathi. Amsterdam/Philadelphia: Benjamins. Ghatage, A. M. 1963 A survey of Marathi dialects. Bombay: Maharashtra State Literature and Culture Board. Gupte, Sharad M. 1975 Relative constructions in Marathi. Michigan State University PhD dissertation. Kelkar, Ashok Ramchandra 1958 The phonology and morphology of Marathi. Cornell University PhD dissertation. Master, Alfred 1964 A grammar of Old Marathi. Oxford: Clarendon Press. Pandharipande, Rajeshwari V. 1997 Marathi. London/New York: Routledge. Pandharipande, Rajeshwari V. 2003 Sociolinguistic dimensions of Marathi: Multilingualism in central India. München: LINCOM. Wali, Kashi 2006 Marathi: A study of comparative South Asian structures. Delhi: Indian Institute of Language Studies.


Marwari (see Rajasthani) Meithei, Meitheirón (TB) Bhat, D. N. S. 1997 Manipuri grammar. München: LINCOM. Chelliah, Shobhana L. 1997 A grammar of Meithei. Berlin/New York: Mouton de Gruyter. Chelliah, Shobhana L. 2011 A grammar of Meithei. 2nd ed. Berlin/New York: Mouton de Gruyter. Singh, C. Y. 1984 Some aspects of Meiteilon (Manipuri) syntax. Jawaharlal Nehru University PhD dissertation.

Mewati (see Rajasthani) Miji (TB) Simon, I. M. 1979 Miji language guide. Shillong: North-East Frontier Agency.

Mongsen Ao (see Ao) Mundari (Munda, AA) Cook, W. A. 1965 A descriptive analysis of Mundari: A study of structure. Georgetown University PhD dissertation. Hoffmann, John 1903 Mundari grammar. Calcutta: The Secretariat Press. Osada, Toshiki 1992 A reference grammar of Mundari. Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa. Sinha, N. K. 1975 Mundari grammar. Mysore: Central Institute of Indian Languages.

Munji (see Pamir languages) Naga Pidgin/Nagamese (see Pidgins, creoles, and other contact languages) Nepali (IAr) Acharya, Jayaraj 1991 A descriptive grammar of Nepali. Washington, DC: Georgetown University Press. Bandhu, Churamani 1971 The computer concordance of spoken Nepali. Norman, OK: Summer Institute of Linguistics. Clark, Thomas W. 1963 Introduction to Nepali. Cambridge: W. Heffer and Sons.


Sharma, Tara Nath 1980 The auxiliary in Nepali. University of Wisconsin, Madison PhD dissertation. Srivastava, Dayanand 1962 Nepali language: Its history and development. Calcutta: Calcutta University Press. Toba, Sueyoshi 1991 A bibliography of Nepalese languages and linguistics. Kirtipur: Linguistic Society of Nepal, Tribhuvan University. Turner, R. L. 1931 A comparative and etymological dictionary of the Nepali language. London: Routledge & Kegan Paul. Repr. 1980, New Delhi: Allied Publishers. Wallace, William David 1985 Subjects and subjecthood in Nepali: An analysis of Nepali clause structure and its challenges to Relational Grammar and Government & Binding. University of Illinois PhD dissertation. Yadava, Yogendra P., and Warren W. Glover (eds.) 1999 Topics in Nepalese linguistics. Kathmandu: Royal Academy of Nepal.

Newar(i) (Nepāl Bhāsā) (TB) Genetti, Carol 1994 A descriptive and historical account of the Dolakha Newari dialect. Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa. Genetti, Carol 2009 A grammar of Dolakhā Newār. Berlin/New York: Mouton de Gruyter. (1st ed. 2007.) Hale, Austin, and Kedar Shresta 2006 Newār (Nepāl Bhāsā). München: LINCOM. Kölver, Ulrike, and Iswaranand Shresthacharya 1994 A dictionary of contemporary Newari (Newari-English). Bonn: VGH Wissenschaftsverlag. Malla, Kamal P. (ed.) 2000 A dictionary of Classical Newari: Compiled from manuscript sources. Kathmandu: Nepal Bhasa Dictionary Committee. Manandhar, Thakur Lal 1986 Newari-English dictionary: Modern language of the Kathmandu valley, ed. by Anne Vergati. Kathmandu: Agam Kala Prakashan. Nepal Bhasa Dictionary Committee 2000 A dictionary of Classical Newari. Kathmandu: Cwasā Pāsā.

Nicobarese (AA) Braine, Jean Critchfield 1970 Nicobarese grammar (Car dialect). University of California, Berkeley, PhD dissertation. Das, A. R. 1977 A study on the Nicobarese language. Calcutta: Anthropological Survey of India.


Man, E. Horace 1888–1889 A dictionary of the Central Nicobarese language. Repr. 1975, Delhi: Sanskaran Prakashak. Nandan, Anshu Prokash 1993 The Nicobarese of Great Nicobar. New Delhi: Gyan Publishing. Temple, Richard C. 1902 A grammar of the Nicobarese language, Being chapter IV of part II of the census report on the Andaman and Nicobar Islands. Port Blair: Superintendent’s Office.

Ollari (Dr) Bhattacharya, Sudhibushan 1957b Ollari: A Dravidian speech. New Delhi: Department of Anthropology (Memoir No. 3), Government of India.

Onge (And) Dasgupta, Dipankar, and S. R. Sharma 1982 A handbook of Onge language. Calcutta: Anthropological Survey of India.

Oraon (see Kuṛux) Oriya Bal, B. K. 1990 Comp and complementizers in Oriya and English. CIEFL PhD dissertation. Majumdar, P. C. 1970 A historical phonology of Oriya. Calcutta: Sanskrit College. Neukom, Lukas, and Manideepa Patnaik 2003 A grammar of Oriya. (Arbeiten des Seminars für Allgemeine Sprachwissenschaft; 17.) Zürich: Seminar für Allgemeine Sprachwissenschaft der Universität Zürich. Tripathi, Kunjabihari 1962 The evolution of Oriya language and script. Cuttack: Utkal University.

Ormuri (Ir) Efimov, Valentin Aleksandrovich 1986 Yazyk ormuri v sinxronnom i istoričeskom osveščenii [The Ormuṛi language in synchronic and historical perspective]. Moskva: Nauka. (English trans.: Efimov, Valentin Aleksandrovich. The Ormuri language in past and present, translated and edited, 2011, by Joan L. G. Baart. Islamabad: Forum for Language Initiatives.) Hallberg, Daniel G. 2004 Pashto, Waneci, Ormuṛi. (Sociolinguistic Survey of Northern Pakistan, 4.) Islamabad: National Institute of Pakistani Studies, Quaid-i-Azam University/Summer Institute of Linguistics.


Kieffer, Charles M. 2003 Grammaire de lʼōrmuṛī de Baraki-Barak (Lōgar, Afghanistan) [Grammar of the Ormuṛi of Baraki-Barak (Logar, Afghanistan)]. Wiesbaden: Reichert. Morgenstierne, Georg 1973 Indo-Iranian frontier languages, Vol. I: Parachi and Ormuri. (2nd ed.) Oslo: Universitetsforlaget. Morgenstierne, Georg 1974 Etymological vocabulary of the Shughni group. Wiesbaden: Reichert.

Pali (see Prakrit) Palula (see Dardic) Pamir languages (Ir) Grjunberg, Aleksandr Leonovič 1972 Mundžanskij jazyk: Teksty, slovar’, grammatičeskij očerk [The Munjī language: Texts, dictionary, grammatical sketch]. (Jazyki vostočnogo Gindukuša [Languages of the Eastern Hindukush].) Leningrad: Nauka. Grjunberg, Aleksandr Leonovič, and Ivan M. Steblin-Kamenskij 1976 Vaxanskij jazyk: Teksty, slovar’, grammatičeskij očerk [The Wakhi language: Texts, dictionary, grammatical account]. (Jazyki vostočnogo Gindukuša.) Moskva: Glavnaja redakcija vostočnoj literatury. Lorimer, David Lockhart Robinson 1958 The Wakhi language, 2 vols. London: School of Oriental and African Studies. Morgenstierne, Georg 1973 Indo-Iranian frontier languages, Vol III, Iranian Pamir languages: Yidgha-Munji, Sanglechi-Ishkashmi and Wakhi (2nd ed.). Oslo: Universitetsforlaget. Morgenstierne, Georg 1974 Etymological vocabulary of the Shughni group. Wiesbaden: Reichert. Müller, Katja, Elisabeth Abbess, Calvin Thiessen, and Gabriela Thiessen 2008 Language vitality and development among the Wakhi people of Tajikistan. Dallas: SIL International. www.sil.org/silesr/2008/silesr2008-011.pdf (accessed 28 Nov. 2014) Nawata, Tetsuo 1979 Shughni. Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa, Tokyo University. Paxalina, Tatiana Nikolaevna 1959 Iškašimskij jazyk [The Ishkashimi language]. Moskva: Nauka. Paxalina, Tatiana Nikolaevna 1966 Sarikol’skij jazyk [The Sarikoli language]. Moskva: Nauka. Paxalina, Tatiana Nikolaevna 1969 Pamirskie jazyki [Pamir languages]. Moskva: Nauka. Paxalina, Tatiana Nikolaevna 1975 Vaxanskij jazyk [The Wakhi language]. Moskva: Nauka.


Paxalina, Tatiana Nikolaevna 1983 Issledovanie po sravnitel’no-istoričeskoi fonetike pamirskix jazykov [Introduction to the comparative-historical phonetics of the Pamir languages]. Moskva: Nauka. Paxalina, Tatiana Nikolaevna 1989 Sravnitel’no-istoričeskaja morfologija pamirskix jazykov [Comparative-historical morphology of the Pamir languages]. Moskva: Nauka. Reinhold, Beate 2006 Neue Entwicklungen in der Wakhi-Sprache von Gojal (Nordpakistan): Bildung, Migration und Mehrsprachigkeit [New developments in the Wakhi language of Gojal (North Pakistan): Culture, migration and multilingualism]. Wiesbaden: Harrassowitz.

Panjabi (Punjabi) (IAr) Bhatia, Tej K. 1993 Punjabi: A cognitive-descriptive grammar. New York: Routledge. Reprinted 2000. Cummings, T. F., and T. G. Bailey 1925 Panjabi manual and grammar. Calcutta: Baptist Mission Press. Repr. 1961, in Panjabi manual and grammars, Patiala: Languages Department, Punjabi University. Koul, Omkar Nath, and Madhu Bala 1992 Punjabi language and linguistics: An annotated bibliography. Patiala: Indian Institute of Indian Languages. Mansoor, Sabiha 1993 Punjabi, Urdu, English in Pakistan: A sociolinguistic study. Lahore: Vanguard. Newton, E. P. 1896 Panjabi grammar. Sialkot: Mission Press. Repr. 1961, in Panjabi manual and grammars, Patiala: Languages Department, Punjabi University. Shackle, Christopher 1972 Punjabi. London: English Universities Press. Shackle, Christopher 1983 An introduction to the sacred language of the Sikhs. London: School of Oriental and African Studies. Tolstaya, Natalya I. 1981 The Panjabi language: A descriptive grammar, trans. by G. L. Campbell. London: Routledge.

Parachi (Ir) Efimov, Valentin Aleksandrovich 2009 Jazyk parači. Moscow: Vostočnaja literatura. Morgenstierne, Georg 1973 Indo-Iranian frontier languages, Vol. I: Parachi and Ormuri. (2nd ed.) Oslo: Universitetsforlaget.


Nawata, Tetsuo 1983 Parachi. Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies.

Parji (Dr) Burrow, Thomas, and Sudhibushan Bhattacharya 1953 The Parji language of Bastar. Hertford: S. Austin/Max Müller Memorial Fund.

Pashai (see Dardic) Pashto (Ir) Babrakzai, Farooq 1999 Topics in Pashto syntax. University of Hawai’i, Manoa, PhD dissertation. Bečka, Jiří 1969 A study in Pashto stress. (Dissertationes orientales 12.) Prague: Oriental Institute in Academia, Czechoslovak Academy of Sciences. David, Anne Boyle 2013 Descriptive grammar of Pashto and its dialects, ed. by Claudia Brugman. Berlin/New York: Mouton de Gruyter. Grjunberg, Aleksandr Leonovič 1987 Očerk grammatiki Afganskogo jazyka (Pašto) [An account of the grammar of the Afghan language (Pashto)]. Leningrad: Nauka. Hallberg, Daniel G. 2004 Pashto, Waneci, Ormuṛi. (Sociolinguistic Survey of Northern Pakistan, 4.) Islamabad: National Institute of Pakistani Studies, Quaid-i-Azam University/Summer Institute of Linguistics. Meyer-Ingwersen, Johannes 1966 Untersuchungen zum Satzbau des Paschto [Studies on sentence construction of Pashto]. Universität Hamburg PhD dissertation. Morgenstierne, Georg 2003 A new etymological vocabulary of Pashto, ed. by J. Elfenbein, D. N. MacKenzie, and Nicholas Sims-Williams. Wiesbaden: Reichert. Penzl, Herbert 1955 A grammar of Pashto: A descriptive study of the dialect of Kandahar, Afghanistan. Washington: American Council of Learned Societies. Roberts, Taylor 2000 Clitics and agreement. MIT PhD dissertation. Tegey, Habibullah 1977 The grammar of clitics: Evidence from Pashto and other languages. University of Illinois PhD dissertation. Tegey, Habibullah, and Barbara Robson 1996 A reference grammar of Pashto. Washington, DC: Center for Applied Linguistics.


Pengo (Dr) Burrow, Thomas, and Sudhibushan Bhattacharya 1970 The Pengo language. Oxford: Clarendon Press.

Persian (in Afghanistan) (Ir) Bulkin, Carleton 2010 Dari practical dictionary: Dari-English/English-Dari. New York: Hippocrene Books. Dulling, Gurth Kenton 1973 The Hazaragi dialect of Afghan Persian: A preliminary study. (Central Asian Monograph 1.) London: Central Asian Research Center. Efimov, Valentin Aleksandrovich 1965 Yazyk afganskix Xazara: Jakualangskij dialekt [The language of the Afghan Hazara: The dialect of Yakualang]. Moskva: Nauka. Farhâdi, Abd-ul-Ghafûr 1955 Le Persan parlé en Afghanistan. Grammaire du Kâboli [Spoken Persian in Afghanistan. A grammar of Kāboli]. Paris: Centre national de la recherche scientifique. Jamal, Abedin 2010 Attitudes toward Hazaragi. Southern Illinois University PhD dissertation. ProQuest Dissertations 1477416. Kiseleva, Lidija Nikolaevna 1973 Očerki po leksikologii jazyka dari [A sketch of Dari lexicology]. Moskva: Nauka. Kiseleva, Lidija Nikolaevna 1985 Jazyk dari Afganistana [The Dari language of Afghanistan]. (Jazyki narodov Azii i Afriki.) Moskva: Nauka. Kiseleva, Lidija Nikolaevna 1986 Dari-russkij slovar’ [Dari-Russian dictionary]. Moskva: Russkij jazyk. Perry, John 2005 A Tajik Persian reference grammar. Leiden/Boston: Brill.

Persian in South Asia (see Indo-Persian) Phalura (see Dardic) Pidgins, creoles, and other contact languages Cardoso, Hugo C. 2009 The Indo-Portuguese language of Diu. University of Amsterdam PhD dissertation. Utrecht: LOT. http://dare.uva.nl/document/2/66896 (accessed 8 Nov. 2014) Clements, J. Clancy 1996 The genesis of a language: The formation and development of Korlai Portuguese. Amsterdam/Philadelphia: Benjamins. Hosali, Priya 2000 Butler English, form and function. Delhi: B. R. Publishing. Nordhoff, Sebastian 2009 A grammar of upcountry Sri Lanka Malay. University of Amsterdam PhD dissertation. Utrecht: LOT. http://dare.uva.nl/record/1/319874 (accessed 8 Nov. 2014)


Nordhoff, Sebastian (ed.) 2013 The genesis of Sri Lanka Malay: A case of extreme language contact. Leiden: Brill. Paauw, Scott H. 2004 A historical analysis of the lexical sources of Sri Lanka Malay. York University MA thesis. ProQuest Dissertations MQ99370 (accessed 6 May 2015). Smith, Ian R. 1977 Sri Lanka Creole Portuguese phonology. Cornell University PhD dissertation. ProQuest Dissertations 7800089. (Published version 1978, Trivandrum: Dravidian Linguistics Association.) Sreedhar, M. V. 1974 Naga pidgin: A sociolinguistic study of inter-lingual communication pattern in Nagaland. Mysore: Central Institute of Indian Languages. Sreedhar, M. V. 1985 Standardized grammar of Naga pidgin. Mysore: Central Institute of Indian Languages.

Portuguese (see Pidgins, creoles, and other contact languages) Prakrit (including Pali) (IAr) Bubenik, Vit 1996 The structure and development of Middle Indo-Aryan dialects. Delhi: Motilal Banarsidass. Bubenik, Vit 1998 A historical syntax of late Middle Indo-Aryan (Apabhraṁśa). Amsterdam/Philadelphia: Benjamins. Burrow, Thomas 1937 The language of the Kharoṣṭhi documents from Chinese Turkestan. Cambridge: Cambridge University Press. Davids, Thomas Rhys, and William Stede (eds.) 1931 The Pali Text Society’s Pali-English dictionary. London: Pali Text Society. Online http://dsal.uchicago.edu/dictionaries/pali/ (accessed 17 November 2013) Deshpande, Madhav M. 1979 Sociolinguistic attitudes in India: An historical reconstruction. Ann Arbor: Karoma. Deshpande, Madhav M. 1993 Sanskrit & Prakrit: Sociolinguistic issues. Delhi: Motilal Banarsidass. Elizarenkova, T. Y., and V. N. Toporov 1976 The Pali language. Moscow: Nauka. Geiger, Wilhelm 1916 Pali Literatur und Sprache. Strassburg: Trübner. (Engl. trans. by B. Ghosh, 1943, Calcutta: University of Calcutta.) Geiger, Wilhelm 1994 A Pāli grammar, trans. by K. R. Norman. Oxford: The Pali Text Society. Hendriksen, Hans 1944 Syntax of the infinite verb forms of Pāli. Copenhagen: Munksgaard.


Hinüber, Oskar von 1968 Studien zur Kasussyntax des Pali, besonders des Vinaya-Pitaka. Universität Mainz dissertation. Hinüber, Oskar von 2001 Das ältere Mittelindisch im Überblick. 2nd rev. ed. Wien: Österreichische Akademie der Wissenschaften. Mayrhofer, Manfred 1951 Handbuch des Pali, 2 vols. Heidelberg: Winter. Mishra, M. 1992 A grammar of Apabhramsha. Delhi: Vidyānidhi Prakāshan. Nitti-Dolci, Luigia 1938 Les grammairiens prakrits. Paris: Adrien-Maisonneuve. English trans. 1972 by Prabhākara Jhā, The Prākṛita grammarians. Delhi: Motilal Banarsidass. Oberlies, Thomas 2001 Pāli: A grammar of the language of the Theravāda Tipiṭaka. Berlin/New York: De Gruyter. Peterson, John M. 1998 Grammatical relations in Pali and the emergence of ergativity in Indo-Aryan. München: LINCOM. Pischel, Richard 1900 Grammatik der Prakrit-Sprachen. Strassburg: Trübner. Engl. transl. 1981 by Subhadra Jha, A grammar of the Prakrit languages. Delhi: Motilal Banarsidass. Singh, Ram Adhar 1980 Syntax of Apabhraṁśa. Calcutta: Simant Publications. Tagare, Ganesh Vasudev 1987 A historical grammar of Apabhraṁśa. Poona: Deccan College. Vaidya, Paraśurāma Lakṣmaṇa 1941 A manual of Ardhamagadhi. Poona: Wadia College. Warder, A. K. 1974 Introduction to Pali. London: Pali Text Society.

Prasun (Nur) Buddruss, Georg, and Almuth Degener 2016 Materialien zur Prasun-Sprache des afghanischen Hindukusch, Teil I: Texte und Glossar. (Harvard Oriental Series, 80.) Cambridge, MA: Department of South Asian Studies, Harvard University. Distributed by Harvard University Press.

Puma (see Kiranti languages) Rajasthani (IAr) Gusain, Lakhan 2001 Shekhavati. München: LINCOM. Gusain, Lakhan 2003 Mewati. München: LINCOM.


Gusain, Lakhan 2004 Marwari. München: LINCOM. Magier, David 1983 Topics in the grammar of Marwari. University of California, Berkeley PhD dissertation.

Rajbanshi (IAr) Wilde, Christopher P. 2008 A sketch of the phonology and grammar of Rājbanshi. University of Helsinki PhD dissertation.

Remo (also Bonda; Munda, AA) Fernandez, Frank 1968 A grammatical sketch of Remo: A Munda language. University of North Carolina PhD dissertation. Ramachandra Rao 1981 The sound system of Remo. Osmania University MPhil dissertation.

Saka (Ir) Bailey, Harold W. 1979 Dictionary of Khotan Saka. Cambridge: Cambridge University Press. Emmerick, Ronald E. 1968 Saka grammatical studies. London: Oxford University Press. Konow, Sten 1949 Primer of Khotanese Saka: Grammatical sketch, chrestomathy, vocabulary, bibliography. Oslo: H. Aschehoug & Co.

Sanglechi-Ishkashmi (see Pamir languages) Sanskrit (IAr) Allen, W. Sidney 1953 Phonetics in ancient India. London: Oxford University Press. Allen, W. Sidney 1962 Sandhi: The theoretical, phonetic, and historical bases of word-juncture in Sanskrit. ’s-Gravenhage: Mouton. Böhtlingk, Otto von 1887 Pâṇini’s Grammatik. Leipzig: Haessel. Repr. 1998, Darmstadt: Wissenschaftliche Buchgesellschaft. Böhtlingk, Otto von, and Rudolph Roth 1855–1875 Sanskrit Wörterbuch. St. Petersburg: Kaiserliche Akademie der Wissenschaften. http://www.sanskrit-lexicon.uni-koeln.de/scans/PWGScan/disp2/index.php (accessed 28 Sept. 2015) Burrow, Thomas 1955 The Sanskrit language. London: Faber & Faber. 3rd ed. 1973.


Cardona, George 1976 Pāṇini: A survey of research. The Hague/Paris: Mouton. 2nd ed. 1998, Delhi: Motilal Banarsidass. Cardona, George 1997 Pāṇini: His work and its traditions, 1: Background and introduction. 2nd ed. Delhi: Motilal Banarsidass. Cardona, George 2004 Recent research in Pāṇinian studies. 2nd ed. Delhi: Motilal Banarsidass. Debrunner, Albert 1954 Altindische Grammatik, 2: Nominalsuffixe. Göttingen: Vandenhoeck & Ruprecht. Debrunner, Albert, and Jakob Wackernagel 1930 Altindische Grammatik: 3: Nominalflexion, Zahlwort, Pronomen. Göttingen: Vandenhoeck & Ruprecht. Delbrück, Berthold 1878 Die altindische Wortfolge aus dem Çatapathabrâhmaṇa dargestellt. Halle: Waisenhaus. Delbrück, Berthold 1888 Altindische Syntax. Halle: Waisenhaus. Deshpande, Madhav M. 1979 Sociolinguistic attitudes in India: An historical reconstruction. Ann Arbor: Karoma. Deshpande, Madhav M. 1993 Sanskrit & Prakrit: Sociolinguistic issues. Delhi: Motilal Banarsidass. Deshpande, Madhav M., and Hans Henrich Hock 1991 A bibliography of writings on Sanskrit syntax. In: Hock (ed.) 1991: 219–244. Edgerton, Franklin 1946 Sanskrit historical phonology. New Haven: American Oriental Society. Edgerton, Franklin 1953 Buddhist Hybrid Sanskrit grammar and dictionary. New Haven, CT: Yale University Press. Emeneau, Murray B. 1952 Sanskrit sandhi and exercises, rev. ed. London: Cambridge University Press. Emeneau, Murray B., and B. A. van Nooten 1968 Sanskrit sandhi and exercises, 2nd ed. Berkeley: University of California Press. Gonda, Jan 1951 Remarks on the Sanskrit passive. Leiden: Brill. Hale, Mark Robert 1987 Studies in the comparative syntax of the oldest Indo-Iranian languages. Harvard University PhD dissertation. Hastings, Adi M. 2004 Past perfect, future perfect: Sanskrit revival and the Hindu nation in contemporary India. University of Chicago PhD dissertation. ProQuest Dissertations 3125608. Haudry, Jean 1977 L’emploi des cas en védique: introduction à l’étude des cas en indo-européen. Lyon: L’Hermes.


Hettrich, Heinrich 1988 Untersuchungen zur Hypotaxe im Vedischen. Berlin: de Gruyter. Hock, Hans Henrich (ed.) 1991 Studies in Sanskrit syntax: A volume in honor of the centennial of Speijer’s Sanskrit syntax (1886–1986). Delhi: Motilal Banarsidass. Hock, Hans Henrich 2015a Some issues in Sanskrit syntax. In: Scharf (ed.) 2015: 1–52. Hock, Hans Henrich 2015b A bibliography of Sanskrit syntax. In: Scharf (ed.) 2015: 399–470. Huet, Gérard, Amba Kulkarni, and Peter Scharf (eds.) 2009 Sanskrit computational linguistics. Berlin/Heidelberg: Springer. Jamison, Stephanie W. 1983 Function and form in the -áya-formations of the Rig Veda and Atharva Veda. Göttingen: Vandenhoeck & Ruprecht. Jha, Girish Nath (ed.) 2010 Sanskrit computational linguistics: 4th international symposium, New Delhi, India, December 10–12, 2010. Heidelberg: Springer. http://link.springer.com/book/10.1007/978-3-642-17528-2/page/1 (accessed 8 Dec. 2014) Kiparsky, Paul 1979/1980 Panini as a variationist. Cambridge, MA/Pune: MIT Press/Center for Advanced Study, University of Poona. Klein, Jared S. 1985 Toward a discourse grammar of the Rigveda. Vol. 1: Coordinate conjunction. Heidelberg: Winter. Klein, Jared S. 1992 On verbal accentuation in the Rigveda. (American Oriental Society Essay Number 11.) New Haven, CT: American Oriental Society. Kobayashi, Masato 2004 Historical phonology of Old Indo-Aryan consonants. Tokyo: Research Institute for Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies. Kulikov, Leonid 2001 The Vedic ya-presents. Universiteit Leiden Proefschrift. Kulikov, Leonid 2012 The Vedic -ya-presents: Passives and intransitivity in Old Indo-Aryan. Amsterdam: Rodopi. Kulkarni, Amba, and Gérard Huet (eds.) 2009 Sanskrit computational linguistics: Third international symposium, Hyderabad, India, January 15–17, 2009: Proceedings. Heidelberg: Springer. http://link.springer.com/book/10.1007/978-3-540-93885-9/page/1 (accessed 8 Dec. 2014) Kulkarni, Malhar, and Chaitali Dangarikar (eds.) 2013 Recent researches in Sanskrit computational linguistics: Fifth international symposium proceedings, 4–6 January 2013, IIT, Bombay, India. Delhi: D. K. Printworld. Kupfer, Katharina 2002 Die Demonstrativpronomina im Rigveda. Frankfurt am Main: Lang.

Lahiri, Prabodh Chandra 1933 Studies in the word-order of Sanskrit prose. University of London PhD dissertation. Macdonell, Arthur Anthony 1916 A Vedic grammar for students. Oxford: University Press. Repr. 1953, Bombay/Calcutta/Madras: Oxford University Press. Mansion, Joseph 1931 Esquisse d’une histoire de la langue sanscrite. Paris: Geuthner. Matilal, Bimal Krishna 1990 The word and the world: India’s contribution to the study of language. Oxford: Oxford University Press. Mayrhofer, Manfred 1956–1976 Kurzgefaßtes etymologisches Wörterbuch des Altindischen: A concise etymological Sanskrit dictionary. Heidelberg: Winter. Mayrhofer, Manfred 1986–2001 Etymologisches Wörterbuch des Altindoarischen, 3 vols. Heidelberg: Winter. Meenakshi, K. 1997 Tolkappiyam and Astadhyayi. Chennai: International Institute of Tamil Studies. Meenakshisundaran 1974 Foreign models in Tamil grammar. Trivandrum: Dravidian Linguistics Association. Monier-Williams, Monier n.d. A Sanskrit-English dictionary. Oxford: Clarendon Press. Repr. 2005, Delhi: Motilal Banarsidass. http://www.sanskrit-lexicon.uni-koeln.de/monier/ (accessed 17 November 2013) Oertel, Hanns 1926 The syntax of cases in the narrative and descriptive prose of the Brāhmaṇas, I: The disjunct use of cases. Heidelberg: Winter. Renou, Louis 1956 Histoire de la langue sanscrite. Lyon: AIC. Renou, Louis 1961 Grammaire sanskrite: Phonétique, composition, dérivation, le nom, le verbe, la phrase. 3rd ed. Paris: Adrien Maisonneuve. Scharf, Peter M. (ed.) 2015 Sanskrit syntax: Selected papers presented at the seminar on Sanskrit syntax and discourse structures, 13–15 June 2013, Université Paris Diderot. Providence, RI: Sanskrit Library. Schäufele, Steven 1990 Free word order syntax: The challenge from Vedic Sanskrit to contemporary syntactic theory. University of Illinois PhD dissertation. Speijer, J. S. 1886 Sanskrit syntax. Leiden: Brill. Speijer, J. S. 1896 Vedische und Sanskrit-Syntax. Straßburg: Trübner.
Staal, Johan Frederik 1967 Word order in Sanskrit and universal grammar. Dordrecht: Reidel.

Staal, J. Frits (ed.) 1972 A reader on the Sanskrit grammarians. Cambridge, MA: MIT Press. Repr. 1985, Delhi: Motilal Banarsidass. Thumb, Albert, and Richard Hauschild 1958 Handbuch des Sanskrit. Heidelberg: Winter. Thumb, Albert, and Richard Hauschild 1959 Handbuch des Sanskrit II: Formenlehre. Heidelberg: Winter. Tikkanen, Bertil 1987 The Sanskrit gerund: A synchronic, diachronic and typological analysis. Helsinki: Finnish Oriental Society. Tsiang-Starcevic, Sarah 1997 The discourse functions of subordinate constructions in Classical Sanskrit narrative texts. University of Illinois PhD dissertation. Varma, Siddheshwar 1961 Critical studies in the phonetic observations of Indian grammarians. Delhi: Munshi Ram Manoharlal. Vasu, Srisa Chandra 1897 The Ashtádhyáyí of Páṇini. Benares: Sindhu Charan Bose. Repr. 1962, Delhi: Motilal Banarsidass. Wackernagel, Jakob 1896 Altindische Grammatik, 1: Lautlehre. Göttingen: Vandenhoeck & Ruprecht. Wackernagel, Jakob 1905 Altindische Grammatik, 2:1: Einleitung zur Wortlehre: Nominalkomposition. Göttingen: Vandenhoeck & Ruprecht. Wackernagel, Jakob 1957 Altindische Grammatik, 1: Lautlehre, 2nd ed. Göttingen: Vandenhoeck & Ruprecht. Whitney, William Dwight 1879/1889 Sanskrit grammar, including both the classical language and the older dialects of Veda and Brahmana, 1st/2nd ed. Leipzig: Breitkopf and Härtel. Repr. 1995, Delhi: DK Publications. Whitney, William Dwight 1885 The roots, verb-forms, and primary derivatives of the Sanskrit language. Leipzig: Breitkopf und Härtel. Repr. 1963, Delhi: Motilal Banarsidass.

Santali (Munda, AA) Bodding, Paul O. 1923 Materials for a Santali grammar (mostly phonetic). Benegaria: Santal Mission Press. Bodding, Paul O. 1929a Materials for a Santali Grammar, mostly morphological. Dumka: Santal Mission. Bodding, Paul O. 1929b A Santali grammar for beginners. Benegaria: Santal Mission Press. Reprinted 1944 and 1952. Ghosh, Arun 1994 Santali: A look into Santali morphology. New Delhi: Gyan Publishing.

Neukom, Lukas 2001 Santali. München: LINCOM.

Shina (see Dardic) Shekhawati (see Rajasthani) Shom Pen (AA?) Blench, Roger M. 2007 The language of the Shom Pen: A language isolate in the Nicobar Islands. http://www.rogerblench.info/Language/Isolates/Shompen%20paper.pdf (accessed 26 November 2013) Chattopadhyay, Subhash Chandra, and Asok Kumar Mukhopadhyay 2003 The language of the Shompen of Great Nicobar: A preliminary appraisal. Kolkata: Anthropological Survey of India.

Shughni (see Pamir languages) Sign Language AA Collaborative Research Environment n.d. Indo-Pakistan Sign Language. http://aasl.aacore.jp/wiki/Indo-Pakistan_Sign_Language [This site includes an extensive bibliography on Indo-Pakistani sign language, including dialects.] (accessed 2 May 2015) ABSA Research Group 1995 A dictionary of Pakistan Sign Language (focus on Karachi). Karachi, Pakistan: Anjuman Behbood-e-Samat-e-Atfal (ABSA School for the Deaf). Ali, Syed Asif 2013 Detection of Urdu sign language using Harr algorithms. International Journal of Inventive Engineering and Sciences 1(6): 50–54. http://www.ijies.org/attachments/File/v1i6/F0223051613.pdf (accessed 2 May 2015) Anonymous 2014 Pakistan’s first sign language digital tools developed. Dawn. http://www.dawn.com/news/1145609/pakistans-first-sign-language-digital-tools-developed (accessed 2 May 2015) Devy, Ganesh N. 2014 Indian sign languages. (People’s Linguistic Survey of India, 38.) New Delhi: Orient Black Swan. Morgan, Michael W. 2009 Typology of Indian sign language (ISL) verbs from a comparative perspective. In Rajendra Singh (ed.), Annual review of South Asian languages and linguistics 2009: 101–132. Berlin/New York: Mouton de Gruyter. PSL — Pakistan Sign Language n.d. Pakistan Sign Language lexicon. http://psl.org.pk/ (accessed 2 May 2015) Sinha, Samar 2012 A grammar of Indian Sign Language. Jawaharlal Nehru University PhD dissertation.

Sulman, Nasir, and Sadar Zuberi 2000 Pakistan Sign Language – A synopsis. Sustainable Development Networking Programme, Pakistan. IUCN – The World Conservation Union. https://www.academia.edu/2708088/Pakistan_Sign_Language_–_A_Synopsis (accessed 2 May 2015) Vasishta, Madan, James Woodward, and Susan de Santis 1980 An introduction to Indian Sign Language (focus on Delhi). College Park, MD: Sign Language Research. (Also, New Delhi: All India Federation of the Deaf, 1980.) Wallang, Melissa G. 2010 Shillong Sign Language: A multi-media lexicon. Jawaharlal Nehru University PhD dissertation. Zeshan, Ulrike 2000 Sign language in Indo-Pakistan: A description of a signed language. Philadelphia: Benjamins. Zeshan, Ulrike 2006 Regional variation in Indo-Pakistani sign language: Evidence from content questions and negatives. In: Ulrike Zeshan (ed.), Interrogative and negative constructions in sign languages, 303–323. Nijmegen: Ishara Press. Zeshan, Ulrike, and Sibaji Panda 2011 Reciprocal constructions in Indo-Pakistani sign language. In: Nicholas Evans, Alice Gaby, Stephen C. Levinson, and Asifa Majid (eds.), Reciprocals and semantic typology, 91–113. Amsterdam: Benjamins. Zeshan, Ulrike 2002 Classificatory constructions in Indo-Pakistani sign language: Grammaticalization and lexicalization processes. In: Karen Emmorey (ed.), Perspectives on classifier constructions in sign languages, 111–139. Mahwah, NJ: Lawrence Erlbaum Associates.

Sindhi (IAr) Addleton, Hubert F., and Pauline A. Brown 1981 Functional Sindhi: A basic course for English speakers. Shikarpur: Indus Christian Fellowship (Baptist). Khubchandani, Lachman Mulchand 1963 The acculturation of Indian Sindhi to Hindi: A study of language in contact. University of Pennsylvania PhD dissertation. ProQuest Dissertations 6407380. Trumpp, Ernst 1872 Grammar of the Sindhi language, compared with the Sanskrit-Prakrit and the cognate Indian vernaculars. London: Trübner and Co. Repr. 1970, Osnabrück: Biblio Verlag. Yegorova, R. P. 1971 The Sindhi language. Moscow: Nauka.

Sinhala (IAr) Chandralal, Dileep 2010 Sinhala. Amsterdam/Philadelphia: Benjamins. De Silva, M. W. S. 1979 Sinhalese and other island languages in South Asia. Tübingen: Narr. Gair, James W. 1970 Colloquial Sinhalese clause structures. The Hague: Mouton. Gair, James W. 1998 Studies in South Asian linguistics: Sinhala and other South Asian languages. Oxford: Oxford University Press. Gair, James W., and John C. Paolillo 1997 Sinhala. München: LINCOM. Gair, James W., and W. S. Karunatillake 1974 Literary Sinhala. Ithaca, NY: South Asia Program and Department of Modern Languages and Linguistics, Cornell University. Gair, James W., and W. S. Karunatillake 2013 The Sidat San̆garā: Text, translation and glossary. New Haven, CT: American Oriental Society. Geiger, Wilhelm 1938 A grammar of the Sinhalese language. Colombo, Sri Lanka: The Royal Asiatic Society, Ceylon Branch. Geiger, Wilhelm 1941 An etymological glossary of the Sinhalese language. Colombo: Royal Asiatic Society. Henadeerage, Deepthi Kumara 2002 Topics in Sinhala syntax. The Australian National University PhD dissertation. Karunatillake, W. S. 2001 Historical phonology of Sinhalese: From Old Indo-Aryan to the 14th century A. D. Colombo: S. Godage and Brothers. (Based on the author’s 1969 Cornell University Ph.D. dissertation.) Ratanajoti, Hundirapola 1975 The syntactic structure of Sinhalese and its relation to that of the other Indo-Aryan dialects. University of Texas, Austin, PhD dissertation. Slade, Benjamin 2011 Formal and philological inquiries into the nature of interrogatives, indefinites, disjunction, and focus in Sinhala and other languages. University of Illinois PhD dissertation. ProQuest Dissertations 3496670. Sumangala, Lelwala 1992 Long distance dependencies in Sinhala: The syntax of focus and Wh questions. Cornell University PhD dissertation.

Saraiki (Siraiki, Seraiki) Jukes, A. 1900 Dictionary of the Jatki or western Panjábi language. Lahore: Religious Book and Tract Society.

O’Brien, E. 1903 Glossary of the Multani language (or south-western Panjabi). Lahore: Punjab Government Printing. Shackle, Christopher 1976 The Siraiki language of central Pakistan: A reference grammar. London: School of Oriental and African Studies. Smirnov, Y. A. 1975 The Lahndi language. Moscow: Nauka. Wilson, J. 1899 Grammar and dictionary of Western Panjabi as spoken in Shahpur District. Lahore: Panjab Government Press.

Sora (Savara) (Munda, AA) Ramamurti, G. V. 1931 A manual of the So:ra: (or Savara) language. Madras: Government Press. Starosta, Stanley 1967 Sora syntax: A generative approach to a Munda language. University of Wisconsin PhD dissertation. Zide, Arlene R. K. A reconstruction of Sora-Gorum morphology. University of Chicago PhD dissertation.

Sri Lankan Malay (see Pidgins, creoles, and other contact languages) Tajik (see Persian in Afghanistan) Tamang (TB) Mazaudon, Martine 1973 Phonologie du tamang: Étude phonologique du dialecte tamang de Risiangku (langue tibéto-birmane du Népal). (Langues et civilisations à tradition orale, 4.) Paris: Centre National de la Recherche Scientifique, Société d’Études Linguistiques et Anthropologiques de France.

Tamil (Dr) Agesthialingom, S. 1979 A grammar of Old Tamil, with special reference to Patiṟṟuppattu. Annamalainagar: Annamalai University. Albert, D. 1985 Tolkāppiyam: Phonology and morphology (An English translation). Madras: International Institute of Tamil Studies. Arden, A. H. 1942 A progressive grammar of common Tamil. 5th edn., revised by A. C. Clayton. Madras: Christian Literature Society. Asher, Ronald E. 1985 Tamil. London/Sydney/Dover: Croom Helm.

Beythan, Hermann 1943 Praktische Grammatik der Tamilsprache in Umschrift. Leipzig: Harrassowitz. Britto, Francis 1986 Diglossia: A study of the theory with application to Tamil. Washington, D. C.: Georgetown University Press. Christdas, Prathima 2013 The phonology and morphology of Tamil. Oxford/New York: Routledge. Deivasundaram, N. 1981 Tamil diglossia. Tirunelveli: Nainar Patippagam. Lehmann, Thomas 1989 A grammar of Modern Tamil. Pondicherry: Pondicherry Institute of Linguistics and Culture. Lehmann, Thomas 1994 Grammatik des Alttamil unter besonderer Berücksichtigung der Caṅkam-Texte des Dichters Kapilar. Stuttgart: Franz Steiner Verlag. Meenakshi, K. 1997 Tolkappiyam and Astadhyayi. Chennai: International Institute of Tamil Studies. Meenakshisundaran, T. P. 1965 History of the Tamil language. Poona: Deccan College. Nizar, Milla 2010 Dative subject constructions in South-Dravidian languages. University of California, Berkeley, Linguistics Undergraduate Honors thesis. Rajam, V. S. 1992 A reference grammar of classical Tamil poetry (150 BC – pre-fifth/sixth century AD). Philadelphia: American Philosophical Society. Ramaswami, N. 1997 Diglossia: Formal and informal Tamil. Mysore: Central Institute of Indian Languages. Schiffman, Harold 1969 A transformational grammar of the Tamil aspectual system. University of Chicago PhD dissertation. (Studies in Linguistics and Language Teaching, 7.) Seattle: University of Washington. Schiffman, Harold 1979 A grammar of Spoken Tamil. Madras: Christian Literature Society. Sethu Pillai, R. P. 1974 Tamil: Literary and colloquial. Madras: University of Madras. Steever, Sanford B. 2005 The Tamil auxiliary verb system. London/New York: Routledge. Subrahmanian, S. V. 2004 Tolka:ppiyam in English (content and cultural translation with short commentary). Chidambaram: Meyyappan Pathippagam. Subrahmanya Sastri, P. S. Tolkāppiyam: The earliest extant Tamil grammar: Text in Tamil and roman scripts with a critical commentary in English.
Madras: Kuppuswami Sastri Research Institute. Sundaresan, Sandhya 2012 Context and (co)reference in the syntax and its interfaces. University of Tromsø/University of Stuttgart PhD dissertation.

Varadaraja Iyer, E. S. (transl.) 1948 Tolkāppiyam-Poruḷatikāram. Annamalai: Annamalai University. Vasanthakumari, T. 1989 Generative phonology of Tamil. Delhi: Mittal. Vijayakrishnan, K. G. 1978 Stress in Tamilian English: A study within the framework of generative phonology. Hyderabad: Central Institute of English and Foreign Languages.

Telugu (Dr) Arden, A. H. 1937 A progressive grammar of the Telugu language. 4th edn. Madras: Christian Literature Society. Krishnamurti, Bhadriraju 1961 Telugu verbal bases: A comparative and descriptive study. Berkeley/Los Angeles: University of California Press. Repr. 1972, Delhi: Motilal Banarsidass. Krishnamurti, Bhadriraju 2009 Studies in Telugu linguistics. Hyderabad: C. P. Brown Academy. Krishnamurti, Bhadriraju, and J. P. L. Gwynn 1985 A grammar of Modern Telugu. Delhi: Oxford University Press. Nizar, Milla 2010 Dative subject constructions in South-Dravidian languages. University of California, Berkeley, Undergraduate Honors thesis. Purushottam, Boddupalli 1996 The theories of Telugu grammar. Thiruvananthapuram: International School of Dravidian Linguistics. Swarajya Lakshmi, V. 1984 Urdu influence on Telugu. Hyderabad: Department of Linguistics, Osmania University.

Thangmi (TB) Turin, Mark 2011 A grammar of Thangmi with an ethnolinguistic introduction to the speakers and their culture. Leiden: Brill.

Tibetan (TB) Agha, Asif 1993 Structural form and utterance context in Lhasa Tibetan: Grammar and indexicality in a non-configurational language. New York: Peter Lang. Beyer, Stephan V. 1993 The classical Tibetan language. Delhi: Sri Satguru Publications. Denwood, Philip 1999 Tibetan. Amsterdam/Philadelphia: Benjamins.

Goldstein, Melvyn 1973 Essentials of Modern Literary Tibetan: A reading course and reference grammar. Repr. 1991, Berkeley/Los Angeles: University of California Press. Goldstein, Melvyn, and Nawang Nornang 1970 Modern Spoken Tibetan: Lhasa dialect. Seattle/London: University of Washington Press. Jäschke, Heinrich August 1883 Tibetan grammar, 2nd ed. London: Trübner. Miller, Roy Andrew 1976 Studies in the grammatical tradition in Tibet. Amsterdam/Philadelphia: Benjamins. Miller, Roy Andrew 1993 Prolegomena to the first two Tibetan grammatical treatises. Wien: Arbeitskreis für tibetische und buddhistische Studien, Universität Wien. Tournadre, Nicolas 1996 L’ergativité en Tibétain. Approche morphosyntaxique de la langue parlée. (Bibliothèque de l’information grammaticale 33.) Paris/Leuven: Peeters. Vollmann, Ralf 2008 Descriptions of Tibetan ergativity: A historiographical account. (Grazer Vergleichende Arbeiten, 23.) Graz: Leykam. Zeisler, Bettina 2004 Relative tense and aspectual values in Tibetan languages. Berlin/New York: Mouton de Gruyter.

Toda (Dr) Emeneau, Murray B. 1984 Toda grammar and texts. Philadelphia: American Philosophical Society. Saktivel, S. 1976 Phonology of Toda with vocabulary. Annamalainagar: Annamalai University. Saktivel, S. 1977 A grammar of the Toda language. Annamalainagar: Annamalai University. Nara, Tsuyoshi, and Peri Bhaskararao 2001 Toda vocabulary: A preliminary list. Osaka: Endangered Languages of the Pacific Rim Series A3–005. Nara, Tsuyoshi, and Peri Bhaskararao 2002 Toda Texts. Osaka: Endangered Languages of the Pacific Rim Series A3–005. 94 pp. [+ CD with sound files of the texts.] Nara, Tsuyoshi, and Peri Bhaskararao 2003 Songs of the Toda. Osaka: Endangered Languages of the Pacific Rim Series A3–011. 91 pp. [+ 3 CDs with sound files of the songs.]

Torwali (see Dardic) Tulu (Dr) Bhat, D. N. S. 1967 Descriptive analysis of Tuḷu. Poona: Deccan College Postgraduate and Research Institute.

Bhat, Shankara L. 1971 A grammar of Tulu. University of Wisconsin PhD dissertation. Shetty, Ramakrishna T. 2001 A comprehensive grammar of Tulu. Annamalainagar: Annamalai University.

Tshangla (TB) Andvik, Erik 2010 A grammar of Tshangla. Leiden: Brill. Das Gupta, K. 1968 An introduction to Central Monpa. Shillong: North-East Frontier Agency.

Urdu (IAr) Bains, Gurupreet 1989 Complex structures in Hindi-Urdu: Explorations in Government and Binding theory. New York University PhD dissertation. Butt, Miriam 1995 The structure of complex predicates in Urdu. Stanford: CSLI. Das, Pradeep Kumar 2006 Grammatical agreement in Hindi-Urdu and its major varieties. München: LINCOM. Dyrud, Lars O. 2001 Hindi-Urdu: Stress accent or non-stress accent? University of North Dakota PhD dissertation. Kelkar, Ashok Ramchandra 1968 Studies in Hindi-Urdu. Pune: Deccan College. Kidwai, Ayesha 2000 XP-adjunction in universal grammar: Scrambling and binding in Hindi-Urdu. (Oxford studies in comparative syntax.) New York: Oxford University Press. Manetta, Emily 2011 Peripheries in Kashmiri and Hindi-Urdu: The syntax of discourse-driven movement. Amsterdam/Philadelphia: Benjamins. Mansoor, Sabiha 1993 Punjabi, Urdu, English in Pakistan: A sociolinguistic study. Lahore: Vanguard. Phillott, D. C. 1918 Hindustani manual. 3rd edn. Calcutta: By the author. Platts, John T. 1884 A dictionary of Urdu, Classical Hindi, and English. London: W. H. Allen & Co. Repr. 1965, Oxford: Oxford University Press. Schmidt, Ruth Laila 1999 Urdu: An essential grammar. London/New York: Routledge. Swarajya Lakshmi, V. 1984 Urdu influence on Telugu. Hyderabad: Department of Linguistics, Osmania University.

Vaagri Boli (IAr) Varma, G. Srinivasa 1970 Vaagri Boli, an Indo-Aryan language. Annamalainagar: Annamalai University.

Vedda (IAr ?) De Silva, M. W. Sugathapala 1972 Vedda language of Ceylon: Texts and lexicon. (Münchener Studien zur Sprachwissenschaft. Beiheft n. F. 7.) München: Kitzinger.

Wakhi (see Pamir languages) Wambule (see Kiranti languages) Waneci (Ir) Hallberg, Daniel G. 2004 Pashto, Waneci, Ormuṛi. (Sociolinguistic Survey of Northern Pakistan, 4.) Islamabad: National Institute of Pakistani Studies, Quaid-i-Azam University/Summer Institute of Linguistics.

Woṭapūr language (see Dardic) Yamphu (see Kiranti languages) Yidgha (see Pamir languages)

Language index

A Abor-Miri-Dafla 142, 302 Achang 387 Afro-Asiatic 73 Agariya 114 Ahom 155, 257, 636 Ainu 172 Aiton 155, 156, 636 Aka-Bale 159, 163, 635 Âkà-Bêa 159–162, 163, 635, 636 Aka-Bo (also Bo) 159, 162, 163, 159 Aka-Cari 159 Aka-Jeru (also Jeru) 159, 161, 162, 163 Aka-Jowoi 159 Aka-Kede (also Kede) 159, 160, 163, 635 Aka-Kol (also Kol) 112, 159, 163, 635 Aka-Kora (also Kora) 159, 162, 635 Aka-Pucikwar (also Pučikwar) 159, 163 Aka (see Hruso) Altaic 73, 74, 247, 249, 256, 266, 579, 590 Amdo 139, 242, 389 Amdo Sherpa 139 Amdo sprachbund 242 Amdo Tibetan 389 Amwi 123, 127, 187, 236 Anatolian 251 Andamanese 2, 9, 157–165, 258, 633, 635–637 Andro 143, 144, 301, 307 Angami 144, 145, 301, 302, 321, 382, 383, 387 Angami-Pochuri 144, 145, 152, 153, 302 Angan 162–164, 633, 636 Ang languages 159, 162 Ao 144, 152, 153, 302, 304, 321–324, 386, 388, 389, 437, 505, 571 Apabhraṁśa 22–24, 33, 34, 548, 562 Apatani 143 Arabic 39, 54, 65, 70, 94, 246, 273, 274, 283, 290, 561, 631, 646, 651, 657–659, 669, 755, 763, 766, 787, 793, 794, 796, 803, 804, 806 Ara Nandan 633

Ardhamāgadhī/Ardhamagadhi 23, 24, 26, 715 Arik 139 Ashkun/Aškūn/Âṣkuňu-Saňu-vîri 14, 66, 67, 269, 293, 586 Asho 145 Aslian 108, 109, 112, 121 Aśokan Prakrit 22, 24–26, 74 Assamese/Asamiya 40, 242, 247, 257, 300, 304–307, 311–313, 321, 386, 387, 394, 397, 398, 446, 452, 454, 456, 467, 545, 567, 636, 746 Asuri 113, 114, 635 Āṭhpahariyā/Athpare 141, 318, 319, 441, 469, 572 Atong 306 A’tong 143, 301 Australian languages 74 Austro-Asiatic/Austroasiatic 9, 85, 107–109, 111, 112, 114, 115, 121, 124, 126, 128, 129, 173, 242, 248, 250, 252, 254, 255, 300, 306, 319, 402, 440, 443, 633, 635, 656 Austronesian 112, 162, 163, 633, 636 Avadhi/Awadhi 470, 471, 473, 678 Old Avadhi 470, 471, 473 Avestan 11, 15–17, 30, 52, 53, 56, 60, 61, 64, 252, 262, 278 B Babu English 675 Bachen 139 Bactrian 52–54, 57–62, 293 Baḍagu/Badaga 97, 378 Baghati 633 Bagheli 401 Bagri 843 Bahing 141 Bahnaric 108, 109, 112 Bái 3, 88, 106, 137, 151, 313, 330, 568 Baima 139, 591, 602 Baloch(i)/Balōčī/Baluchi 15, 55, 57, 66, 105, 250, 260, 266, 271–277, 285, 287, 288, 294, 297–299, 331, 375, 388, 399,

400, 438, 451, 452, 454, 458, 469, 470, 585, 588, 589, 638, 639, 642, 754, 765, 804, 805 Eastern Balochi 58, 62, 272, 288, 400, 470 Balti 138, 268, 289, 290, 387, 643, 787, 808, 809 Baltic 11 Bangani 9 Bangla (also Bengali) 5, 9, 32, 39, 40, 42, 242, 264, 282, 285, 300, 303, 306, 307, 311–313, 319–321, 327, 329, 380, 381, 386–389, 395, 397–399, 401, 402, 438, 446, 451, 452, 459, 461–464, 504, 506, 507, 517, 521, 523, 524, 526–529, 545, 546, 548, 564, 567, 568, 575, 581, 631, 652–654, 657–660, 678, 736, 739, 743, 744, 746, 747, 754, 755, 761, 764, 790, 792, 794, 795, 800 Cholit Bangla 658 Sadhu Bangla 658 Tripura Bangla 401 Bangru 148 Bantawa 141 Baram 147, 307 Bartangi 458 Bashgali 66 Basque 51, 166, 527 Batek 109 Bateri 384, 644 Bathang 139 Bazaar Hindi/Bazar Hindi 669, 673, 675, 678 Belhare 318, 398, 438, 538, 544, 545, 549 Bengali (see Bangla) Benglish 631 Bengni 143 Betta Kurumba 475 Bhalavali Marathi 310 Bharmauri 633 Bhili 172 Bhoi 123, 125 Bhojpuri 38, 40, 306, 313, 320, 400, 678–680, 792 Bhotia 307 Bhumij 113, 114 Bhunjia 634

Bidar Kannada 311 Bihari 38, 678 Bijori 113, 114 Birhor 113, 114, 635 Birjia 634 Bishnupriya Manipuri 303, 321 Black Mountain 147, 151, 634 Bo (see Aka-Bo) Bodic 135, 151–153, 307, 457, 590, 593 Bodish 138–140, 151–153, 590, 591 Bodo/Boro 143, 301, 306, 387, 459, 463, 545, 746 Bodo-Garo/Boro-Garo 143, 306, 402, 590, 591 Bodo-Koch 143, 144, 301, 307 Bokar 143 Bolyu 111, 121, 122 Bom 635 Bonda (see Remo) Bori 143 Bote-Majhi 634 Brahmaputran 143, 144, 153 Brahui 5, 74, 80, 83, 104, 105, 247, 250, 253, 259, 260, 266, 271–277, 280, 281, 285, 287, 288, 331, 388, 399, 400, 588, 589, 638, 642–644, 802, 805 Braj (Bhākhā) 474 Brokkat 634 Brokskat/Brokpa 263, 387, 452 Buddhist Hybrid Sanskrit 451 Bugun 148, 302 Bujheli 147 Bumthang 139, 140 Bunan 140 Bundeli 847 Burmese 383 Burushaski 9, 16, 165–168, 245, 247, 248, 250, 252, 258, 259, 262, 263, 264–270, 280, 288, 289, 317, 331, 384, 388, 396, 398, 437, 439, 441, 444, 445, 452, 454, 468, 471, 474, 571, 584, 586, 641–644, 763, 805, 806, 809 Hunza 166, 644 Nager 268 Yasin 165–168, 268, 288, 586 Butler English 669, 675, 692 Byangsi 140

C Caló (Romani) 51 Camling 141, 549 Caribbean Hindustani 37 Car Nicobarese 127–129 Celtic 251 Chabcha 139 Chak 635 Chali 139, 634 Chali-Bumthang 140 Chamdo 139 Chamling 320 Chang 143, 144, 301, 456, 591, 593 Changtang 139 Chantyal 140, 398, 438, 579, 581 Chathare 141 Chaudangsi 140 Chepang 147, 151, 389 Chepangic 147, 152 Chɨlɨng 141 Chilisso 384, 644 Chin 111, 145, 146, 149, 387, 389 Chinali 633 Chinese 111, 134, 136, 165, 247, 248, 300, 549, 755 Chintang 141, 543 Cho 145, 382 Cho-Asho 145 Chocangacakha 139 Chokri 144, 302, 387 Chulikata 148 Chungli 144 Coorg(i) (see Kodagu) D Daai Chin 387, 389 Dagyab 139 Daic (Tai) 9, 155, 321, 501, 544 Dakkhini Hindi-Urdu 36 Dakpa 139, 634 Dameḷi/Dameli 66, 67, 72, 263, 266, 269, 384, 644 Damu 143 Danish 69 Darai 465, 466, 469, 548 Dardic 12, 14, 17, 40, 42, 66, 242, 254, 255, 260, 262, 263, 265, 268–270, 278, 290–292, 294,

295, 317, 395, 396, 451, 452, 470, 585 Dargari 635 Dari/Darī 277, 281, 294, 296–298, 585, 736 Darma 140, 590 Dartsedo 139 Deori 301 Derge 139 Dhangar Kuṛux 320 Dhimal 146, 320, 634 Dhimalish 146, 151, 152 Dhimmai 148 Dhivehi 45–48, 396, 796, 803 Digarish 148, 152, 153 Digaru 148 Digor Ossetic 529 Dimasa 143, 301, 306 Dogri 385 Dolakha Newar/Dolakhā Newār 388, 398, 402, 437, 438, 456, 549 Dolpo 139, 387 Dolpo Tibetan 387 Domaki (Dumaki) 50, 262, 263, 265, 268, 289, 454, 641, 644 Domari (Romani) 50 Dramding 147 Dravidian 1, 4, 9, 16, 29, 35, 36, 39, 40, 46, 48, 73–93, 95–99, 102–108, 116, 123, 173, 241, 242, 245, 247–250, 252–261, 265, 266, 268, 271, 273–276, 283, 309–319, 321, 322, 324, 327–329, 375, 388, 392, 394, 396, 398, 400, 439–443, 445, 447–449, 459–461, 465, 466, 468, 474, 503, 505, 539, 540, 544, 546, 549–554, 556, 558–560, 562, 563, 567–571, 573–584, 586, 588, 633, 647, 656, 657, 661–663, 667, 672, 678, 707, 716, 718, 719, 721, 726, 746, 791, 802, 803 Central 77–80, 85, 96, 102, 103, 314, 439 North 76, 78, 80, 83, 85, 88, 89, 91, 96, 97, 104, 105, 253, 259, 570 South-Central 73–80, 83, 85, 89, 90, 96, 99, 102 South 48, 77, 78, 85, 89–92, 96, 97 Drigung 139

Dumaki (see Domaki) Dumi 141 Dungmali 141 Dura 147, 151 Dutch 380, 543, 671, 673, 677 Dzala 139 Dzala-Dakpa 140 Dzongkha/Dzonkha 139, 387, 736 E Elamite 16, 74, 80, 253, 254, 260 Elamo-Dravidian 74 English 9, 39, 40, 61, 100, 132, 133, 164, 241, 243, 268, 270, 275, 280, 281, 283, 285, 287, 288, 291, 292, 299, 305, 307, 323, 325–332, 380, 451, 502–504, 519, 526, 532, 535, 542, 558, 559, 634, 635, 637, 642, 645, 647–659, 664, 669, 673, 675, 678–680, 713, 728, 739–743, 746, 747, 752, 753, 755 Indian English 327, 328, 331, 740, 741 South Asian English 325, 326 Eravallan 633 Erromintxela (Romani) 51 F Falam 145 French 379, 534, 647, 677, 755 G Gadaba (Dravidian; also Ollari) 76, 102–104, 113, 315, 375 Gadaba (Munda; see Gutob) Gallong 143 Galo 141, 143, 303 Gāndhārī/Gandhari 26–27, 74, 262, 279, 451, 789, 799 Garhwali 313 Garo 123, 143, 301, 306, 387 Gar 139, 387 Gawarbati/Gawar-Bati/Gawar Bati/GawarBātī 263, 269, 270, 293, 384, 398, 640, 644 Gawri 641, 643, 644, 805, 807 Gergye 139 German 10, 69, 241, 514, 526, 529, 675, 755 Germanic 535

Gertse 139 Gilaki 55, 59, 62, 64, 65 Godwari 243 Gojri 387, 807 Goṇḍi/Gondi 83, 84, 89, 90, 100, 101, 116, 398, 570, 792 Gongduk 135, 140, 147, 151, 634 Gorum 112–114, 116, 119–121, 314, 389, 400, 438, 474, 635 Gowro 384, 644 Grangali 263 Great Andamanese 157–165, 633, 635–637 Gtaʔ, 112, 113, 115, 120, 121, 314, 320, 400, 402, 437, 475 Gujarati 39, 40, 42, 43, 46, 76, 85, 247, 311, 312, 317, 327, 380–382, 387, 388, 393, 399, 437, 447, 452, 455, 467, 474, 507, 546, 549, 567, 573, 575, 652, 678, 679, 739, 746, 792–794, 799, 800 Gujari/Gujuri 270, 384 Gurung 140, 389, 752 Gutob/Gadaba 103, 112, 113, 315, 635 Gypsy (see Roma, Romani) H Hakha 145, 387 Hakha Lai 387 Hakkipikki 633 Halbi 103 Halung 139 Haryanvi 385 Hausa 380 Hayu 141 Hazara(gi) 640 Hebrew 534 Hill Miri 143 Himalayish 140, 146, 151–153 West 140, 151–153 Hindi 4, 5, 30, 33, 37–40, 42, 49, 157, 164, 170, 172, 243, 248, 256, 282, 283, 285–287, 292, 303, 306, 313, 323, 324, 326–331, 377, 381, 382, 385–389, 393, 394, 397, 400, 401, 437, 438, 440, 441, 443–447, 450, 452, 455, 460, 463–465, 472, 473, 475, 503, 505, 506, 511, 515, 524, 533, 536, 539–545, 559–568, 586, 647, 649–651, 657, 669, 673, 675, 678–

680, 707, 739, 743–747, 751, 755–758, 764, 792, 793, 799 Bhojpuri-Hindi 678 Bhojpuri-Hindi (South Africa) 678 Fiji 37, 678 Hindko 270, 291, 384, 451, 639, 641–643 Hindustani 37, 39, 282, 323, 647, 649, 651, 669, 678 Fiji Pidgin Hindustani 669 Plantation Hindustani 678 Hiri Motu 673 Hmar 145, 318, 319, 441, 460, 469, 578 Hmong 381 Ho 2, 113–117, 139, 314, 320, 395, 398, 438 Hrusish 148, 151, 152, 302 Hruso 148, 302 Huang-chung-hsien 139 I Idu 148 Idu-Digaru 152, 302 Indo-Altaic 266 Indo-Aryan 1, 2, 4, 5, 9, 11–22, 24, 25, 27, 33, 35–40, 42, 43, 45–50, 56, 62, 66, 67, 69–72, 75, 82, 84, 88, 96, 104, 105, 107, 108, 123, 131, 138, 144, 156, 169, 170, 172, 173, 241–243, 245, 247–255, 257–262, 265, 266, 275, 277, 278, 281, 284, 292, 293, 300, 303, 305, 307, 309, 311–318, 320–322, 329, 330, 375, 381, 385, 393, 396, 438, 440, 444–448, 451, 461, 465–470, 472, 473, 475, 501, 503, 505, 517, 524, 525, 539, 540, 544, 545, 547–549, 552, 559, 567–571, 574–578, 581–585, 633, 635, 637, 656, 657, 673, 678, 707, 709, 787, 789–791, 799, 802 Old Indo-Aryan 9, 18, 19, 21, 22, 25, 39, 243, 250, 252, 444, 576 Indo-European 4, 5, 9, 11, 12, 52, 66, 74, 75, 106, 165, 245, 250–252, 256, 258, 292, 530, 535 Indo-Iranian 9, 11, 12, 15, 17, 42, 56, 249, 251, 252, 262, 267, 293, 294, 438, 533 Indo-Persian 855, 856 Indo-Turanian 248, 266, 283 Indus language 252, 255, 256 Northern Indus language 252

Iranian 1–4, 9, 11–17, 50–66, 105, 245, 247, 248, 250, 252, 258, 262, 266, 269, 271–281, 283, 285, 288, 292, 293, 299, 317, 395, 438, 451, 452, 455, 456, 458, 469, 501, 529, 549, 570, 584, 585, 587, 787 Old Iranian 11, 15, 51–54, 56, 57, 59–62, 64 Irula 91, 92, 98 Ishkashimi/Ishkashmi/Iškaš(i)mi 263, 269 Italian 392 Italic 251 J Jad 634 Jaḍgālī 272, 277 Jangil 157, 159, 636 Japanese 502, 534, 541, 589 Japonic 73 Jarawa 157–159, 161–164, 636 Jaṭkī 276, 277 Jenu Kurumba 633 Jero 141, 635 Jeru (see Aka-Jeru) Jingphaw 137, 144, 153, 301, 302 Jirel 139 Juang 112–114, 116, 120, 121, 129, 314–316, 399, 438, 445 Juray 112, 113 Juwai 163, 635 K Kachari 301 Kachin(ic) 143, 144, 150, 152, 153 Kadai (see Tai-Kadai) Kadu 143, 301 Kafir languages (see Nuristani) 292 Kagate 139 Kaike 591 Kalaṣa-alâ (Nuristani) 66, 68–72, 586 Kalasha/Kalashamon (Dardic) 4, 247, 263, 264, 266–269, 280, 283, 285, 378, 396, 452, 540, 546, 565, 567, 570, 575, 585, 641, 645, 787, 805, 807, 809 Kalkoti 270, 384 Kaman 148, 152 Kamarupan 150–152

Kami 139 Kamkatavari/Kâmk‘atavari 14, 15 Kanashi 140, 634 Kangri 385, 439 Kannada 73, 83, 84, 87–92, 95–98, 100, 104, 266, 310, 311, 313, 323, 324, 326, 328, 330, 387, 388, 394, 398, 437, 439, 503, 504, 507, 540, 543, 545, 546, 555, 571, 574, 586, 631, 650, 651, 662, 716–718, 746, 794, 795, 799, 800 Karbí 149, 152, 153, 302, 591 Kardze 139 Karenic 136, 151 Karmali 113, 114 Kartvelian 73 Kashmiri 42, 246, 247, 262, 263, 268, 270, 282, 289, 380, 387, 388, 393, 395, 396, 437, 447, 453, 455–458, 465, 470, 501, 504, 505, 507, 521, 525, 528, 545, 567, 575, 585, 641, 642, 738, 746, 754, 764, 767, 792, 804 Kaṭārqalā 467 Kâtavari 14 Kati/Katī 66, 70, 264, 293, 294, 297, 645 Katuic 108, 109, 112, 117 Kede (see Aka-Kede) Kensiu/Kensiw 109, 121 Keraʔ Mundari 315, 316, 319, 320 Khaling 141, 389 Kham 139, 147, 387, 389, 454, 590, 592 Khamiyang 636 Kham Tibetan 139, 387 Khamti (Tai) 155, 156, 636 Khamyang 155 Khardong 138 Kharia 6, 112–115, 120, 121, 316, 319, 398, 402, 438, 439, 449, 546 Khaṛī Bolī 39, 282 Khasic 107, 108, 112, 123, 124, 126, 127 Khas(s)i 9, 123–127, 156, 246, 257, 300, 306, 307, 321, 387, 402, 443, 501, 635 Khengkha 139 Kherwarian 113, 114, 116, 118, 319, 438, 635 Kheza 145, 302 Khezha 387 Khiamngan 143, 301 Khmer 108–111, 129, 302, 736

Khmeric 108, 109 Khmuʔic 108, 110, 111, 129 Khoa 302 Kho-Bwa 148, 151, 152 Khoirao 145, 302 Khonoma Angami 382, 387 Khora 159, 162, 163, 635 Khotanese (see Saka) Khowa 148 Khowar 4, 247, 263, 264, 266–269, 279–281, 283, 285, 288, 384, 452, 546, 567, 570, 579, 585, 586, 639, 641, 642, 645, 754, 763, 805, 806, 809 Khumbo 139 Khumi 145 Khwarezmian 52, 54, 57, 60–62, 64, 65 Khyang 635 Khynriam (see Khasi) Kinnaur(i) 139, 140, 388, 457, 590 Kintaq 109 Kiranti 135, 141, 142, 147, 151, 152, 317–319, 380, 385, 468, 469, 572, 581, 591 Kivi 280 Koch 143, 301 Kochari 143 Koḍagu (Dravidian) 91, 92, 98, 465 Kodaku (Munda) 635 Koda (Munda) 113 Kohi 141 Kohima Angami 383 Kohistani 263, 266, 268, 270, 289, 380, 384, 388, 396, 571, 586, 639, 641, 642, 644, 805, 806, 809 Dir 270, 641 Indus 268, 270, 384, 396, 639, 644 Kalam 266, 384, 571, 586, 641, 644 Kokborok 143, 301 Kolami 80, 96, 102, 103, 260 Kol (Andamanese; see Aka-Kol) Kol/Kolarian (Munda) 112, 635 Koṇḍa/Konda (also Kūbi) 86, 99, 101, 260, 314, 399, 582 Kongbo 139 Konkani 36, 42, 100, 310–313, 393, 395, 396, 398, 503, 746, 797 Mangalore Saraswat Konkani 310, 313 Konyak 143, 144, 301

Kora/Khora (see Aka-Kora) Koraga/Kodava 89, 90, 92, 96–98, 106, 571 Koraku (Munda) 113, 114 Koraput Munda 112 Korean 319, 755 Korku (also Kurku) 6, 105, 113, 114, 120, 172, 173, 254, 320, 400, 438, 445 Koro 146 Korwa 113, 114 Kota 92, 97, 387, 633 Koya 101, 394 Kūbi (see Konda) Kui 99, 101, 314, 539 Kuki-Chin 145, 149, 301, 302, 317, 318, 402, 468, 469, 590 Kukish 145, 152 Kulung 141 Kumhali 634 Kumi 635 Kundal Shahi 270, 645 Kurdish 55, 57, 60, 62, 65, 529, 534 Kurku (see Korku) Kurmali 318, 469 Kurmanji 55, 59, 61, 62 Kurtöp 139, 140, 591 Kurumba 91, 92, 97, 98, 106, 475, 633 Kuṛux/Kurukh (also Oraon) 105, 259, 271, 314–316, 318–320, 581 Kusunda 2, 9, 83, 168–171, 245, 247, 253, 258, 395, 544 Kutchi Gujarati 507 Kutiya 633 Kuvi 314, 387 Kuwi 83, 99, 101 L Ladakhi 138–139, 387, 457 Lahnda 41, 42, 273, 313, 451, 470 Lahoul 139 Lai 111, 145, 387 Lakadong 123 Lakha 634 Lâmongshé 128, 635 Langtang 139 language X, 255 Lao 109, 736 Lepcha 148, 151–153, 307, 388


Levai 148 Lhasa/Lhasa Tibetan (see Tibetan) Lhoka 139 Lhokpu 147, 151, 634 Lhomi 139, 387, 389 Liangmai 145, 302 Limbu 141, 147, 388, 437, 455, 471, 474, 505, 591, 593 Limirong 139 Lishpa 148, 302 Lithang 139 Little Nicobar 128 Lohorung 141, 147 Lolo-Burmese 136, 137, 149, 151, 152, 590, 591 Lomavren (Romani) 50, 51 Lotha 144, 302, 387 Luish 137, 143, 144, 152, 301, 307 Luri 55 Luro 635 Lushai 439, 829 Lyngngam 112, 123, 125, 635 M Macro-Caucasian 165 Maduga 633 Māgadhī/Magadhi 24–26, 47 Magahi 38, 40, 318, 395, 469 Magar 147, 389, 590 Magaric 147, 151, 152 Mahali (Munda) 114 Mahārāṣṭrī 24–26, 34 Mahl 46 Maithili 38, 39, 313, 318, 381, 386, 393, 395, 397, 400, 438, 469, 470, 545 Majhi 385 Malasar 633 Malay Sri Lanka Malay 669, 673–675 Malayalam 73, 83, 86, 93, 95, 260, 266, 323, 327, 331, 377, 379, 386–388, 392, 394, 397, 398, 437, 442, 459, 462–465, 468, 505–507, 545, 546, 548, 551, 555, 569, 571, 578, 586, 662, 716, 718, 746, 794, 795, 800 Maldivian 46, 47 Malto/Maler 83, 85, 104–106, 253, 259, 271, 314–316, 319, 396, 581


Manange 140, 387 Manchad 140, 263 Maṇḍa/Manda 86, 102, 314, 633 Mandeali 385 Mangic 108 Manichean Middle Persian 52–55 Manipuri (see Meithei) Mannan 633 Mao 145, 302 Mara 145 Maram 145, 302 Marathi 32, 40, 42, 43, 46, 76, 85, 100, 103, 172, 243, 247, 283, 310–313, 317, 321, 323, 324, 327, 330, 381, 382, 386, 388, 393, 396, 398–401, 437, 438, 446, 452, 455, 467, 470, 504, 519, 524, 541, 543, 546, 547, 549, 565, 567, 570, 573, 575, 583, 586–588, 678, 743, 746, 792, 793 Nagpuri Marathi 324, 504 Old Marathi 470 Maring 149, 302 Marwari 38, 40, 85, 313, 452, 454, 460, 461, 463, 467, 474, 549, 567 Mauritius Bhojpuri 678 Mayurbhanj Ho 115, 395 Mazatec 381 Mdzorganrabar 139 Mech 143, 301, 634 Median 15, 56, 57 Meithei/Meitheilon/Meitei/Meitelon/Meiteiron 149, 152, 302–305, 321, 385–389, 437, 438, 457, 468, 507, 591, 746, 800 Meluhhan 252, 255 Mewahang 141 Mewati 38, 40 Middle Indo-Aryan 5, 18, 19, 21, 22, 24, 25, 27, 33, 43, 46, 67, 104, 243, 262, 313, 438, 475, 570, 575, 789–791, 799 Middle Iranian 3, 15, 16, 51–54, 58–60, 62, 65, 269, 280, 283, 395 Middle Persian 54–57, 59, 62, 64, 278, 280 Middle Telugu 100 Midzuish 148, 152, 153 Miji 148

Miju 148, 152, 302 Mikir 123, 149, 301 Milang 143, 146, 302, 303 Mili 139 Minriq 109 Mintil 109 Minyong 303 Mirish 302 Mishmi 148, 302, 387 Misingish 302 Mising-Padam 143 Mitanni 17 Mizo 145, 149, 318, 383, 460, 578, 635 Mkharmar 139 Moġol language 294 Mon (Austro-Asiatic) 110, 121, 122, 129, 302 Old Mon 121, 122 Mongsen/Mongsen Ao 144, 304, 321–324, 388, 437, 505, 571 Monic 108, 110, 122, 129 Mon-Khmer 107–109, 121, 123, 301, 402, 635 Monpa 147, 148, 302 Moopan 633 Mru 149, 152, 153, 635 Mugom Tibetan 387 Mugu(m) 139 Munda 1, 2, 6, 9, 76, 85, 103, 105, 107, 108, 112–117, 119–121, 123, 124, 129, 172, 173, 245, 247, 248, 250, 252, 254, 255, 276, 312, 314–320, 324, 388, 396, 398–400, 402, 439, 440, 443–445, 448–450, 458, 459, 468, 469, 505, 544, 549, 550, 569, 570, 573, 576, 579, 635, 800 North 85, 112, 113, 120, 121, 172, 173, 319, 320, 443 South 112, 113, 117, 119–121, 124, 173, 315, 319, 320, 443, 459, 550 Mundari 113–115, 121, 129, 314–316, 319, 320, 387, 395, 399, 438, 449, 581 Munji/Munġi 58, 64, 263 Muot 635 Mustang 139 Mynnar (Khasic) 123 Mzieme 145, 302

N Nachiring 141 Naga 143, 301, 304–307, 383, 387 Nagaland Nepali 304 Naga languages 153 Nagamese 304, 305, 322–324, 571, 669, 672, 673, 675 Naga Pidgin 304, 672, 673 Nahali/Nihali 2, 9, 105, 168, 171–173, 252, 253, 258, 570 Naiki 80, 102, 103 Naikri 103 Nakchu 139 Nancowry 127, 128, 130 Nangchen 139 Naphuk 139 Nar-Phu 140, 398, 438, 549 Nàvakat 384 Ndzorge 139 Nepali 42, 45, 242, 300, 304, 306, 307, 312, 313, 320–322, 382, 385, 386, 388, 398, 399, 446, 451, 452, 454–456, 467, 539, 546, 561–567, 583, 585, 587, 593, 634, 656, 736, 746, 752, 753, 761, 792 Newaric 147, 151, 152 Newar(i)/Nepāl Bhāsā 147, 247, 320, 321, 385, 388, 389, 398, 402, 437, 438, 456, 505, 539, 549, 590, 592–594, 752 Kathmandu Newar 147, 438, 549, 594 New Iranian 51, 54, 59–63, 65 Ngari 139 Nicobarese/Nicobaric 9, 107, 108, 121, 124, 127–130, 245, 257, 443, 635 Great Nicobar(ese) 128, 130 Nihali (see Nahali) Nilgiri languages 92, 93 Nishing 143 Niya Prakrit 26, 278, 279, 283 Nocte 143, 144, 301 Nongtalang (Khasic) 123, 635 Nora 155, 636 Nostratic 74 Nruanghmei 145, 302 Nubra 138 Nubri 139 Nungish 137, 150, 152, 153


Nuristani/Nûristânî/Nūrestānī 9, 12, 14–17, 42, 47, 66–72, 245, 247, 254, 255, 258, 260, 262, 263, 264, 265, 269, 270, 281, 291–293, 295, 297, 298, 317, 451, 584, 586, 640, 641 Nyarong 139 Nyisu 143 O Odia/Oriya 40, 100, 242, 264, 311–314, 330, 395, 397, 438, 446, 451, 505, 539, 543, 545, 575, 581, 744, 746, 747, 790, 794, 795, 800 Old Kuki 145 ‘Olekha 147 Ollari (see Gadaba, Dravidian) Ombule 141 Onge 75, 157–163, 636 Oraon (see Kuṛux) Ormuri/Ormuṛi/Ōrmuṛī 15, 57, 58, 292–294, 642, 644, 645 Ossetic 54, 57–62, 64, 65, 269, 529 P Pacoh 109, 117 Padma 139 Pahari 42, 45, 313, 452, 639, 792 Pahari-Potwari 384 Pakanic 108, 111 Palaungic-Wa 108, 110, 111 Pali 22–27, 33, 47, 49, 248, 310, 451, 708, 714, 715 Paliya 633 Palpa 634 Palula/Palura/Phalura 4, 263, 268–270, 384, 388, 439, 570, 575, 585, 641, 645 Pāmīrī (see Shughni) Pamir languages 15, 57, 62, 63, 65, 260, 262, 265, 266, 279, 292, 293, 395, 452, 458 Pāñcthare 141 Pangkhua 145 Panjabi/Punjabi 27, 41, 42, 45, 248, 270, 282, 283, 285, 286, 289, 291, 292, 327, 328, 384, 385, 387, 388, 393, 397, 402, 437, 445, 447, 451, 454, 458, 463, 467, 470, 504, 519, 524, 540, 544, 545, 567,


641, 652, 678, 679, 738, 739, 746, 754, 764, 765, 792, 793, 804, 805 Greater 451, 458 Western 470 Pankua 635 Parachi/Parāčī 15, 57, 58, 262, 263, 278, 292–294 Para-Munda 250, 252, 254, 255 Parengi (see Gorum) Parji 84, 96, 103, 116, 442 Parthian 52, 54–57, 61, 62, 64, 278 Parya 50, 677 Pashai/Pašaī 263, 266, 268, 269, 278, 281, 291–293, 295, 297, 298, 458, 470, 640, 787, 805, 808 Pashto/Pushto/Pakhto/Pukhto 15, 57, 58, 60–65, 70, 246, 247, 262, 263, 266, 272, 273, 277, 279, 281, 284, 285, 287–292, 294–299, 388, 451, 454, 458, 502, 529–536, 585, 638–641, 736, 738, 754, 762, 764–767, 804, 805, 808 Paṭani 140, 387 Pearic 108, 110 Pengo 86, 99, 102, 314, 441, 468, 474, 550, 633 Persian 9, 11, 13, 15, 34, 39, 51–65, 69, 70, 243, 244, 246, 247, 249, 250, 265, 267, 270, 272–285, 287–291, 294–299, 312, 323, 330, 464, 529, 533, 543, 561, 584, 585, 588, 640, 651, 657, 787, 793, 794, 803–805 Indo-Persian 855, 856 Old Persian 11, 13, 15, 53, 56, 57, 61, 64, 278 Phake 155, 156, 636 Phakial 636 Phedāppe 141 Phobji 139, 140 Phom 143, 144 Phon 301 Pidgin Madam 669 Plantation Hindustani 678 Pnar 123, 635 Pochuri 145, 302 Poguli 263 Portuguese 9, 647, 664, 671–675 Indo-Portuguese 669, 671 Korlai Portuguese 671, 672

Malabar Portuguese 672 Sri Lanka Portuguese 671, 672, 674, 675 Portuguese Creole 669, 671–675 Potohari 451, 807 Prakrit 22, 23, 26, 33, 74, 89, 94, 99, 104, 260, 278, 279, 283, 323, 390, 548, 647, 661, 664, 668, 707, 714–721 Prasun/Prasūn 15, 66, 264, 293, 451, 452, 640 Proto-Austro-Asiatic 121, 129, 254 Proto-Indo-European 3, 11, 257, 547, 570 Proto-Munda 119–121, 247, 314, 320 Proto-Zagrosian 254, 271 Pû 127, 635 Pučikwar (see Aka-Pucikwar) Puiron 145, 302 Pujjukar 159 Puma 141 Punjabi (see Panjabi) Purang 139 Purik 138 Puroik 148 Pyu 137, 151, 302 Q Quechua 528 Qiangic 137, 151–153, 591 R Rabha 143, 301, 306, 459 Rajasthani 38, 40, 42, 43, 311, 317, 452, 473, 678 Rājbanshi 318, 321, 330, 439, 441, 469, 583 Raji 146 Raji-Raute 146, 151 Rangdum 139 Rangkas 140 Raute 146 Rebkong 139 Regma 302 Remo (also Bonda) 2, 112–117, 120, 121, 314, 399, 438, 444, 635 Rengma 145, 302 Rgangya 139 rGyalrong 137, 153, 387 rGyalrongic 137, 151–153, 591, 593

Rgyalthang 139, 387 Rkangtsha 139 Rmagsar 139 Rmastod 139 Rngaba 139 Roma, Romani 37, 45, 48–51, 245, 267, 445, 676, 677, 679 Rong 148 Rongmai 145 Rongpo 140 Rtsekhog 139 Ruga 143, 301 Rushani/Rušānī 293 Ruthog 139 S Sadani (see Sadri) Sadri (also Sadani) 316, 319, 635 Sak 143, 144, 301 Saka 15, 16, 51, 52, 56–58, 60–62, 64, 65, 262, 395, 792 Sal 150, 152 Sām 141 Sampang 141, 385 Sanenyo 635 Sangam 144, 720, 725 Sanglechi 263, 278 Sangtam 302 Sankong 591 Sanskrit 1, 3–5, 11, 13, 18–20, 22–28, 31–33, 47, 75, 76, 89, 94, 95, 99, 243, 248–250, 256–259, 262, 278, 312, 317, 323, 326, 330, 331, 375, 376, 388–391, 393, 396, 437, 438, 440–445, 447, 448, 451, 465, 468, 473, 475, 476, 503, 505, 547, 548, 551, 561, 562, 570, 571, 573–576, 578–583, 588, 634, 646–648, 651, 656–658, 661–664, 667, 668, 676, 677, 707–710, 713–722, 725–727, 739, 740, 746, 748–751, 757, 788–792, 795, 796, 799 Vedic 11, 75, 391, 475, 562, 646 Sant(h)ali 113–115, 117–119, 318–320, 395, 399, 438, 441, 468, 470, 544, 568, 635, 800 Saraiki/Seraiki/Siraiki 286, 380, 451, 470, 638, 639, 641, 652, 764, 765, 806, 807 Sare 159, 162, 163, 635 Sariqoli 262, 263 Śaurasenī 25, 26


Saurashtri 36, 310 Savara (see Sora) Sawi 263, 266, 270, 640 Scandoromani 51 Seke 140, 387 Semang 108, 109 Sema (also Sima/Simi, Suma/Sumi) 145, 302, 386, 387 Semnani 55, 60 Sengmai 143, 144, 301, 307 Senthang 145 Sentinelese 157–159, 162, 163, 636 Sertha 139 Sham 138 Shammai 302 Shando 139 Sharchhop 387 Shardukpen/Sherdukpen 148, 302, 303 Shekhavati/Shekhawati 874 Sherpa 139, 307, 387, 389 Shina 247, 263, 268–270, 288–290, 321, 380, 384, 387, 388, 452, 466, 467, 548, 565, 567, 570, 583, 585, 641–643, 805, 806, 809 Gilgit 269, 289 Kohistani 289, 642, 805, 806, 809 Shompeng/Shom Pen 128, 635 Shughni/Šuġnī 58, 59, 262, 263, 269, 298, 395, 438, 549, 570, 643, 644 Shumashti 263, 266, 268 Siangic 143, 146, 151 Sign languages 3 Simi (see Sema) Sindhi 27, 42, 45, 85, 272–277, 281, 283–288, 292, 380, 383, 384, 386, 401, 451, 458, 470, 575, 641, 652, 738, 754, 764–766, 792, 804, 806 Singpho 143, 301, 634 Sinhala 42, 43, 45–48, 76, 246, 247, 310–312, 321, 383, 386, 388, 395, 396, 438, 458, 505, 541, 545, 561–563, 566, 567, 589, 654, 655, 674, 707, 736, 737, 739, 761, 800 Sinitic 135, 136, 151 Sino-Tibetan (see also Tibeto-Burman) 112, 133–136, 151, 252, 268, 300, 308, 309, 385 Siraiki (see Saraiki)


Sizang 145 Slavic 11, 49 Sogdian 3, 51–54, 57, 58, 60–65 Sora 112, 113, 120–123, 314, 398, 438, 443, 635, 800 Sorani 55, 60–63, 65 So/Thavung 121, 122, 61, 122, 172, 260, 455, 518–520, 528, 556, 570, 583, 585–587, 669, 722, 738, 753, 756 Spanish 534, 659, 671, 755 Spiti 139, 387, 452 Sri Lanka Malay 669, 673–675 Stot 139 Sulung 148, 302 Suma/Sumi (see Sema) Sumerian 73 Sunwar 141, 389, 591, 594 Surinam Hindi 678 T Tâba 147 Tagin 143 Tai-Kadai 108, 112, 301, 633, 636 Tai Khamti 636 Tairung/Turung 155, 634 Tai (see Daic) Tajik 265, 585, 643 Takahanyilang 635 Takpa 302, 303 Talyshi 55, 63 Tamang 140, 387, 389, 549 Tamangic 138, 140, 151–153, 591 Tamarkhole 141 Tamil 36, 46, 48, 73, 76, 79, 83, 89, 91–95, 98, 100, 247, 250, 259, 260, 266, 310, 323, 326, 327, 378, 379, 382, 386–389, 392, 396–398, 401, 402, 437, 439, 441, 442, 445, 448, 449, 459, 460, 462–466, 503, 505, 507, 539–543, 545, 546, 548, 550, 551, 553, 554, 556–559, 568, 569, 571, 573, 574, 576–579, 581–583, 586, 588, 589, 631, 633, 646, 647, 654, 655, 657, 661–669, 674, 676, 678, 680, 716–723, 725–728, 736, 739, 744, 746, 790–792, 794, 795, 800 Old 83, 93, 94, 98, 100, 259, 439, 442, 449, 550, 571, 581–583, 722, 790, 791

Tangkhul/Tangkhul Naga 149, 152, 302, 387 Tangsa 143, 144, 301 Tani/Tani languages 141–143, 146, 152, 153, 302, 303, 307, 591, 593 Taraon 148 Tati 55 Telugu 40, 73, 79, 80, 83–87, 89, 90, 95, 99, 100, 103, 266, 310, 311, 313, 314, 323, 327, 375, 379, 382, 386, 387, 393–395, 397, 399, 401, 439, 445, 446, 448, 463, 505, 546, 555, 586, 587, 650, 651, 659, 661–663, 678, 716–718, 743, 746, 790, 794, 795, 799, 800 Middle 100 Temiar (Austro-Asiatic) 108, 122 Teressa 127, 128 Tewo 139 Thaadou/Thado/Thadou 145, 387, 460 Thakali 140, 389 Thangmi 147 Tharu 253 Themchen 139 Thet 145 Thulung 141 Tibetan 131, 134, 135, 138–140, 151, 153, 247, 250, 270, 300, 303, 307, 312, 321, 387, 389, 395, 398, 438–440, 452–457, 549, 572, 573, 591–594, 707, 792, 808 Central 139, 387, 454 Classical 250, 312, 321, 387, 439, 440, 572, 573, 591, 808 Gar, 387 Gerze 387 Lhasa/Lhasa Tibetan 139, 387, 395, 398, 438, 454–456, 591, 592 Tibetan-Gurung 468 Tibetic 591, 593 Tibeto-Burman 1, 2, 9, 85, 108, 111, 123, 130–134, 136, 140, 142, 144, 146–148, 150–154, 156, 168, 169, 173, 242, 245–250, 253, 257, 266, 289, 300, 301, 304, 306–309, 312, 315, 317–322, 380, 383–385, 387, 388, 398, 402, 439, 444, 451, 453, 456–460, 463, 465, 467–469,

471, 504, 505, 538, 539, 543, 544, 549, 569, 572, 573, 576–578, 590, 591, 593, 594, 633, 634, 636, 656, 800 Tibeto-Himalayan 452 Tiddim Chin 387, 389 Tilung 141 Tinan 140 Tirahi/Tirāhī 293 Tiwa 143, 301 Tocharian 251, 258, 262, 279, 445, 792 Toda 5, 90–92, 96, 98, 388, 389, 392, 395, 465, 575, 581, 586, 587, 633 Tokpe Gola 139 Tonga/Mos 109 Torkmānī/Turkmen 277, 297 Torwali 263, 266, 268, 278, 283, 290, 384, 586, 641–643, 645, 754, 763, 805–807, 809 Toto 146, 634 Tregâmî 66, 70 Trinidad Bhojpuri-Hindi 678 Trinidad Creole 678 Tripuri 387 Tromowa 139 Tsez 521 Tshangla 148, 151–153, 302, 303, 457, 591 Tshangla-Takpa 302 Tsochen 139 Tsum 139 Tŭjiā 137, 151 Tulu 84, 89–92, 96–98, 104, 398, 651, 746, 795 Tumshuqese (see Saka) Turi 113, 114, 635 Turkic 65, 248, 250, 266, 278, 445, 549, 585, 591 Turkish 386, 755 Turung 155, 634 U Ü dialects 139 Uralic 73, 74 Urdish 292 Urdu 35, 39, 40, 167, 243, 266–268, 270, 273, 275, 277, 280, 282–292, 299, 310–312, 323, 324, 326, 331, 387, 388, 393, 438, 455, 456, 474, 503, 505, 506, 533,


541, 566, 567, 586, 638, 639, 641, 642, 646, 647, 649, 651–654, 657, 669, 673, 736, 738, 739, 746, 754–758, 762–764, 803–805, 807, 808 Gulf Pidgin 669 Pakistani 284, 285, 541 Ushojo 384, 641, 645 Uyghur 792 Ūzbakī/Uzbek 297–299 V Vaagri Boli 36 Vafsi 55, 61, 63, 65 Vâsivari/Vâs’ivari 12, 14, 70 Vedda(h) 253, 634, 669, 674, 675 Vedic (see Sanskrit) Vietic 108, 111, 122 Vietnamese 108, 111 W Waigali/Waigalī 66, 269, 293, 586, 587 Wakhi/Wākhī 16, 56, 58–60, 64, 262, 263, 265, 268, 269, 278–280, 290, 292–294, 438, 456, 458, 549, 570, 585, 640, 643, 645, 787, 805, 808 Hunza 458 Wambule 380, 388 Wancho 143, 144, 301 Waneci 348, 691, 871, 888 War 110, 123, 635 Wotapuri 263, 467 Wotapuri-Katarqalai 467 Y Yacham-Tengsa 144, 302 Yaghnobi 57, 58, 60–65, 269 Yakkha 141 Yamphu 141 Yano 143 Yarava 828 Yazghulami/Yazgulāmī 58, 60, 262, 263, 269, 293 Yi 381 Yidgha 58, 64, 263, 645 Yimas-Arafundi pidgin 673 Yimchungru 302 Yolmo 139


Z Zagrosian 74, 253 Zaiwa 148, 387 Zaiwa languages 387 Zanskar 139 Zazaki 55, 58–62, 64, 65

Zedang Tibetan 387 Zeme 145, 152, 153, 302 Zhangzhung 140, 151 Zhung 138, 139 Zotung 145

Subject index


A A-bar movement 508–509, 511, 513–515 Ablative 19, 86, 116, 169, 320, 468, 536, 710, 724 Ablaut 447 Absolute/absolutive case (see also Casus rectus, Direct case) 166, 256, 265, 445–446, 534, 560–561 Absolute participial formations 476, 577, 580 Absolutive (form of verb; see also Converb, Gerund, Conjunctive participle) 245–246, 256, 545, 560–561, 578 Abugida writing 799 Accent (see also Pitch accent, Stress) 6, 16, 18, 28–30, 61, 63, 115, 270, 274, 384–385, 396–400, 402, 711, 713 Accessibility hierarchy 576, 583 Accusative 22, 84, 86, 88, 91, 117, 169–170, 248, 274, 442, 446, 453–455, 458–465, 470, 518, 520, 535, 541, 545–546, 548, 557, 559, 578, 713, 724, 726, 758 Active (voice; see also Voice) 20, 62, 64, 712 Active-stative (language) 265–266, 453, 457 Actor (see also Agent) 450, 459, 564, 594, 724 Adjective 20, 61–62, 64–65, 84, 87–88, 96, 275, 282, 299, 323, 449–450, 463, 471, 475–476, 503, 533, 549, 560–561, 582, 722–723, 763 Adjoined/adjunction 508, 511, 516, 571 Adposition (see also Ambiposition, Circumposition, Postposition, Preposition) 19–20, 60, 64–65, 156, 449, 536, 752 Adstrate/adstratum (see also Substrate/Substratum, Superstrate/Superstratum) 75, 304, 671–672 Adverb/Adverbial 80, 84, 87–88, 243, 274, 322, 324, 397, 541, 551, 578–579, 581, 722, 730

Affected agent 450, 541, 564 Affix (see also Infix, Prefix, Suffix) 63–64, 116, 142, 266, 274, 282–283, 287, 316, 318–319, 392, 444–446, 448–449, 468, 712–713 Agent (see also Actor, Affected agent) 25, 30–31, 59, 62–63, 71, 156, 246, 265–266, 269, 277, 285, 321–322, 329, 443, 450–460, 462, 466–467, 470, 541, 564, 566, 587, 593, 712, 716, 758 Agentive/agentivity 129, 266, 322, 450, 452–453, 455–457, 459, 593 Agentive marking (see also Ergative case) 265, 285, 321–322, 453, 455, 457, 466–467 Agent(ive) noun 119, 122, 129, 274 Agglutinating/agglutination 88, 114, 164, 440, 442–443 Agreement (see also Attenuated agreement, Default agreement, Hierarchical agreement, Incomplete agreement, Long distance agreement, Multiple agreement, Object agreement, Possessor agreement, Split agreement, Subject agreement) 62, 86, 88, 99–100, 112, 116, 118–119, 125, 141–142, 153, 170, 246, 293, 300, 304, 306, 314–319, 322, 439–440, 442, 449–450, 452–453, 458, 463, 465–476, 502, 505, 507, 517–520, 530–531, 533, 536, 544, 548–551, 560, 569, 576–577, 580–581, 593–594, 666, 713–714, 722–723 Agreement triggers 466, 517, 519–520, 548 Akṣara 789–791, 793, 795 Aktionsart 64, 453, 473, 560–561 Alignment 261, 383, 397, 402, 450–452, 659 Allative 169 Alphasyllabic writing 787, 798 Alveolar 4, 18, 30, 67, 81–84, 92–94, 98, 100–103, 155, 257–258, 261, 264, 281, 286, 293, 313–314, 320–322, 327, 377, 379, 391–392, 709


Ambiposition (see also Circumposition) 247 Amnesty (syntactic term) 512, 514, 516 A-movement 508–511, 514–515 Analytic morphology 446, 447 Anaphor 512, 514, 535, 766 Āndhra Bhāṣā Bhūṣaṇamu 717 Āndhra Śabda Cintāmaṇi 717 Anglicists 648 Animacy (see also Animate, Inanimate) 60, 245, 266, 453–455, 459–465, 548 Animate (see also Animacy) 60, 156, 170, 249, 267, 278, 280, 450, 454, 458, 460–462, 464 Anti-antigemination 394 Anti-causative 170 Antigemination 393–394 Antisymmetry 517, 527–528 Aorist 19–20, 61, 80, 588 Apical 99–100, 169, 245, 377, 400 Applications (of computational linguistics) 735–736, 753, 759–762, 765 Arabic-derived script/writing systems (see also Perso-Arabic script) 39, 54, 286, 651, 763–766, 787, 793–794, 796, 803–809 Aramaic script 3, 788–789, 797 Arapacana 789 Archaism 25, 56, 64, 88, 281, 442 Areal linguistics 241–242 Articulatory phonetics (see also Place of articulation) 376, 387, 709 Aryan invasion 67, 251 Aśoka/Aśokan 1, 3–5, 9–10, 13, 15–18, 20–32, 34–39, 45–49, 52–54, 56, 58, 60, 62–65, 73–74, 80–94, 96, 98, 100–105, 107–117, 120–125, 127–129, 131, 133–134, 137, 139, 141, 145, 149, 151–155, 157, 164–166, 169–172, 241–244, 246–250, 252–257, 259, 261–262, 264–268, 274–275, 277–284, 286–288, 290, 302, 307, 309–313, 315–317, 320–322, 325–329, 331, 375–379, 382–383, 387–390, 392, 395–397, 399, 402, 437–439, 441–442, 445–446, 449–450, 453–461, 463–475, 477, 501–503, 506, 508–513, 515, 518, 523, 526–527, 529, 532–533, 535–536,

541, 543, 545, 548, 551, 554, 555, 557–561, 563–567, 569, 571, 573–575, 577–582, 585–587, 589–590, 592–594, 631, 634–635, 639, 641–644, 653–654, 656–658, 660, 664–666, 669, 672–674, 676–680, 708–711, 723, 736–739, 744, 746, 748–765, 767, 788–791, 794–795, 797–799, 802–810 Aspect (see also Imminent, Imperfective, Incompletive, Perfective, Progressive) 31, 61, 63–64, 112, 116, 125, 167, 170, 246, 265, 274, 289, 306, 320, 454–456, 458–459, 467, 473, 517, 533, 551–552, 554, 560–561, 564, 587, 594, 756 Aspiration 13, 45–47, 56, 67, 82, 169, 247, 249, 265, 272, 274, 286, 376, 381–382, 709, 803–804 Attenuated agreement 550 Attrition (see Language attrition) Auxiliary 62, 64, 76, 86, 94, 116, 125, 166, 256, 268, 289, 331, 456, 472, 474, 533, 549, 552, 556, 585–586 Auxiliary verb constructions 116, 549 Āytam 83, 94, 795 B Backward control 547 Balance verb 474, 581 Bangla script/Bengali script 39, 401, 658, 736, 764, 790, 792, 794–795, 800 Benefactive 33, 167 Beneficiary 757 Bhaikṣukī script 799 Bhattiprolu inscriptions 791 Bidirectional convergence 322–324 Bilingual/Bilingualism (see also Multilingual/Multilingualism) 249, 273–274, 287, 289–291, 296, 300, 304–305, 308, 332, 640, 661, 672, 679–680 Binding (syntax) 501, 503, 505–506, 510, 512, 514–516, 529, 535–536, 727 Bollywood 677 Border-area contact 309, 311 Borrowing 12, 14–16, 23, 25–26, 29, 39, 61, 82, 94, 172, 249, 252–254, 268–270, 273, 275–276, 278–279, 281–282, 288–289, 303–305, 307–308, 325–327, 391, 584, 651, 719, 796

Brāhmī 714, 721, 788–793, 795, 797–800 Brāhmī-derived writing systems 789, 795, 799 C Case (see also Ablative, Absolutive case, Accusative, Agentive marking, Casus rectus, Dative, Direct case, Ergative case, Genitive, Instrumental, Locative, Nominative, Oblique case) 19–20, 51, 59, 60, 63–65, 71, 84, 86–88, 116–117, 166, 169–170, 243, 245–246, 248, 264–265, 268, 274, 276, 280, 285, 287, 293, 304–305, 319, 321, 439–440, 445–446, 450–465, 467, 475–476, 503, 506–509, 517–520, 527, 534–536, 543, 546–547, 556, 577–580, 673, 712–713, 722–726, 756, 805 Case system 19, 59, 64, 243, 268, 293, 304, 446, 453, 712 Casus rectus (see also Absolutive case, Direct case) 51 Categories (morphology) 59–64, 84, 88, 245, 248, 275, 280, 303–304, 383, 387, 449, 453, 467, 504, 532–533, 535, 537–538, 551–552, 556, 577, 584–585, 588–591, 678–679, 712, 720, 722–723, 726, 740, 757 Causal constructions 242, 575 Causative 20, 61, 64, 71, 86, 91, 94, 121, 124, 129, 167, 170, 245, 248–249, 267–269, 281, 324, 437, 447–448, 541, 546, 759 Census of India 40, 161, 633–634 Center-embedding 322, 569, 571 Central Asia 12, 50, 54, 75, 244, 251–252, 254, 265, 267, 269, 279, 282, 298, 563, 676, 788, 790, 792, 798 Checked consonant 114, 320 Child language acquisition 133, 542–543 Chronology 21–22, 67, 251, 261, 283, 452, 708 Chunking 401, 756–757, 767 Circumposition (see also ambiposition) 65, 299, 536 Classification (of languages or dialects; see also Subgrouping) 1, 9, 15, 21–22, 30, 40, 41, 43–45, 45–47, 51, 55–57, 69,


72, 76, 77–81, 89–93, 99, 102, 104–105, 107–112, 112–113, 123–124, 127–128, 130–155, 155, 159, 161, 162–163, 165, 168, 172–173, 271, 295 Classifier (see also Noun class marker) 102, 124, 128, 156, 246, 248–249, 305 Clausal nominalization 319, 569, 572–573 Clause chaining 171, 579 Cleft constructions 507, 578 Clitic (see also Enclitic, Proclitic) 19, 20, 28, 59, 63–65, 87–88, 97, 118, 163, 242, 401, 445–446, 449, 458, 470, 530–535, 571 Clusivity 85 Code mixing 1, 242, 282, 325–326, 631, 810 Code switching 1, 292, 325–327, 810 Cognitive linguistics 2, 71–72, 167, 306, 537–544 Colonial grammar 707, 727 Colonialism and language 647 Comitative (see also Sociative) 169–170, 576 COMP, 326–327, 329, 523–524, 534, 574–575 Complement structures (see also ki/ke-clauses/complementizer) 569, 573–574 Complex predicate (see also Complex Verb, Compound verb, Conjunct verb, Explicator compound Verb) 64, 249, 501, 503, 530, 533, 541, 549, 551, 559, 562, 757 Complex verb (see also Complex predicate, Compound verb, Conjunct verb, Explicator compound verb) 300, 304, 317, 437, 465, 471–473, 501, 549, 552 Compound, compounding 33, 84, 87, 100, 104, 119, 127–128, 131, 150, 156, 246, 248, 254, 282, 284, 299, 301, 328, 389, 394, 439, 443, 448, 472, 503, 535, 549, 553, 559, 662, 666, 717, 723, 744, 747, 751, 756 Compound verb (see also Complex predicate, Complex verb, Conjunct verb, Explicator compound verb) 21, 31, 88, 94, 243, 249, 272, 324, 472–474, 506,


533, 549, 550, 552–554, 558–559, 579, 586 Computational linguistics 537, 643, 735, 736, 738, 748, 750, 753, 754, 759–762, 764–765, 767 Conceptual Metaphor Theory 541 Conditional 19, 62, 65, 104, 171, 320, 321, 324, 331, 551, 568, 569, 581, 583, 715 Condition C, 510, 513–516 Conjoined/conjunction 84, 284, 379, 465, 567, 571, 797 Conjunctive participle (see also Absolutive form of verb, Converb, Gerund) 245–246, 256, 266, 268–269, 272, 503, 545, 561, 578 Conjunct verb (see also Complex predicate, Complex verb, Compound verb, Explicator compound verb) 287, 443, 471–472, 549, 558–561 Consonant harmony 391, 395–396 Construction 31–33, 64, 75, 89, 116, 125, 170, 248, 267, 272, 283, 299, 459–460, 462–466, 473–475, 503, 511, 531, 537, 540, 542–543, 544–567, 574–576, 579, 584–594, 758 Construction grammar/constructional 501, 542, 547 Contact/contact linguistics 1, 10, 15–17, 48, 58, 75, 76, 79, 105, 107–108, 141–143, 146, 151, 154, 164, 169, 241–332, 563, 577, 584–586, 591, 635, 636, 670–673, 675, 678–680 Contact language 262, 303, 389, 635 Continuum-of-agentivity 450 Control 30–31, 450, 455, 466, 471, 503, 517, 519, 535, 545–548, 579, 593 Converb (see Absolutive, Gerund, Conjunctive participle) 20–21, 30–31, 75, 171, 256–257, 268, 312, 315–317, 319–320, 545–548, 561, 573, 578–581, 713 Convergence 1, 16, 84, 56, 75, 81, 84, 92, 104, 108, 241–332, 440, 563, 578, 669, 672, 674, 678, 809 Convergence area (see also Linguistic area, Sprachbund) 243–244, 256, 269, 440

Coordinate/coordination 65, 171, 532, 581, 758 Copula 61–62, 65, 88, 267, 316, 323–324, 470, 578, 585, 592, 594 Copy (structural borrowing) 299, 304 Coreference/coreferential 25, 322, 534–535, 573, 576 Corpus/corpora 2, 69, 53–54, 70, 107, 167, 241, 277, 285, 297–298, 303, 503, 541, 543, 550, 737, 739–755, 758–767, 801 Corpus development 739–755, 760–767 Co-subordinate 171 Counterfactual (see also Irrealis) 61 Covert Englishization 326, 330 Covert movement (syntax) 521, 526 Creaky phonation/creaky voice (see also Voice/voiced/voiceless) 111, 114, 320, 401 Creole 631, 669–675, 678–679 Creolize/creolization 154, 647, 670, 672 Crossover (syntax) 509, 512–514, 516 D Dative 19, 31, 59, 76, 86, 89, 117, 170, 243, 245, 249, 267, 285, 446, 450, 458–459, 463–464, 466, 503, 505, 535–536, 541, 544–548, 557, 559, 712, 724 Dative experiencer/subject (see Experiencer, Non-nominative subject/experiencer, Oblique experiencer) Default agreement 463, 466, 467, 517, 519, 548 Definite (see also Indefinite) 60, 84, 170, 248–249, 320, 328, 460, 464, 542, 548 Definiteness 245, 305, 451, 463–464, 548 Deictic (pronoun or adverb) 32, 85, 87, 93, 266, 268–269, 450, 541 Demonstrative (pronoun or adverb) 19–20, 32, 82–83, 87, 90, 121, 315, 475–476, 570, 575 Dental 4, 13–14, 16, 18–19, 24, 26, 28, 30, 46–47, 66–67, 80, 82–83, 93–94, 102, 155, 165, 245, 250, 256–259, 262, 264, 267, 286, 293, 313–314, 320–322, 377, 379, 390–392, 396, 666, 794, 803 Dentalization (Gangetic) 313–314, 320

Dentalization of palatals 313 Derivational morphology 49, 124, 437–439, 540, 749 Devanagari (see also Nagari) 39, 651, 765, 767, 791, 792–793, 800 Dhivehi script 796, 803 Diacritics 6, 54, 74, 84, 766, 774, 786, 797–799, 801, 803, 811–812, 814–815, 817 Dialect/dialectology 3, 10, 15–17, 24–27, 36, 37–40, 41–43, 46–49, 49–51, 51–52, 55–58, 61, 66, 89, 93, 94–95, 96, 97, 98, 100–101, 102, 103, 105–106, 113, 123, 131–132, 132, 136–139, 142, 144–149, 165, 251, 262, 270, 274–275, 276–277, 291, 293–295, 298, 300, 307, 312–313, 317, 320, 322, 327, 331, 380, 385, 387–388, 400, 452, 458, 465, 546, 549, 631, 640, 644, 657–669, 677–678, 679, 712, 792, 800, 804–805 Dialect leveling 24, 678, 679 Diaspora 1, 37, 112, 284, 631, 644, 676–680, 764 Dictionary/dictionaries (see also Lexicography, Lexicon development) 54, 69–70, 102, 106, 128, 133, 138, 147–148, 160–164, 168, 253, 269, 279, 282, 292, 298, 309, 636, 642, 735, 737, 744, 746–747, 749–750, 752–754, 760, 764–765, 806–808, 828 Differential Agent Marking 454–456 Differential Argument Marking 60, 63 Differential Case Marking 245, 459 Differential Object Marking (see also Direct object marking, Object marking) 60, 156, 245, 276–277, 459–465 Diglossia 1, 22–23, 94–95, 631, 647, 657–669, 674–675, 800 Diphthong 18, 111, 114, 156, 286, 790, 793, 796 Direct case (see also Absolutive case, Casus rectus) 51, 71, 280, 452, 454, 534 Direct object 59–60, 118, 245, 265, 459–465, 468, 509–511, 546 Direct object marking 60, 118, 245, 265, 459–465


Directional (semantics) 61, 70–72 Directionality (syntax, scripts) 248, 260, 275, 277, 325, 521, 523, 533, 800 Discourse 32, 34, 71–72, 266, 277, 284, 292, 451–454, 457–458, 529, 535, 580, 587, 665, 760 Dislocation (syntax) 530, 534–535 Ditransitive 117, 535 Divergence 123, 280, 284–291, 296, 298–299, 308, 678, 809 DO, 156, 318, 464, 507, 511, 548, 560 Documentation (see Language documentation) Double(-)oblique (see also Oblique case) 63, 452 Dravidian influence on Indo-Aryan 75, 107, 250–263, 311–313 Dravidian influence on Munda 76, 116, 123, 314–316 Dual 19, 60, 85, 141, 307, 316–317 Dubitative 87, 321, 574 Durative/durativity 64, 100, 166 E Echo (compound/formation/reduplication/word) 87, 316, 444–445 Education 23, 37, 98, 283–284, 291, 295, 296–299, 631, 633–634, 636–637, 638–640, 642–643, 648–649, 651, 654, 656, 658–669, 680, 716, 745, 750, 759 Egophoric/egophoricity 590–594 Embedded question 522 Embedding 32, 97, 170, 316, 322, 503–505, 519–520, 522–523, 526–529, 569, 571, 574, 576, 580 Emphatic (marking) 30, 87, 401, 556, 587 Enclitic (see also Clitic) 125, 284, 293, 470 Englishization (of South Asian languages) 284, 326, 329–332 Ergative-absolutive/ERG-ABS (alignment) 452 Ergative/ergativity 21, 59, 60, 62–63, 166, 246, 248–249, 264, 277, 321, 440, 450–458, 465–468, 470, 473, 501, 530, 533, 536, 548–549 Ergative case (see also Agentive marking) 166, 246, 321, 446, 450–458,


465–468, 503, 517–519, 534, 536, 541, 543, 589 Evidential/evidentiality 242, 269, 288, 440, 456, 501, 549, 584–594 Expanded verbs (see also Complex verbs) 550–559 Experiencer (see also Oblique experiencer) 170, 245, 248, 450, 462, 503, 557 Explicator (see also Vector) 246 Explicator compound verb (see also Complex predicate, Complex verb, Compound verb, Explicator compound verb) 246, 248–249 Extraposition 32, 523–524, 526, 529, 571 Ezāfa/eżāfe construction 65, 272, 280, 282, 284, 299 F Feminine (see also Gender) 19, 60, 90, 100, 102, 285, 324, 442, 714 Figure-ground 537–540 Finite, finiteness 20, 64–65, 84, 86–88, 95, 100, 102, 116, 166–167, 170–171, 245–246, 256, 266, 268, 275, 315–316, 319, 443, 449, 453, 465, 472–475, 501, 503, 505, 508–509, 512–513, 516, 518–519, 521–528, 550–552, 567–576, 576–584, 592, 594, 666, 713–714, 758 Finiteness constraint/restrictions 88, 505, 567–576, 582 Finitization 581–582 Flexion 440–443 Fluid agent marking 265, 454 Fluid ergativity 453, 455–456 Fluid intransitivity 265, 455–456 Fluid transitivity 453, 455–456 Focus 48, 328, 401–402, 457, 461–462, 506, 509, 535, 578 Future (see also Tense) 19, 61–62, 65, 97, 104, 125, 170, 274, 328, 331, 454, 456, 555, 582, 666, 673, 707 G Geminate, gemination (see also initial gemination) 27–28, 33, 46, 82–84, 92, 95, 100–101, 261, 287, 383, 386, 389, 392, 394, 793–794, 796

Gender (grammatical) 19, 60, 62, 80, 84–88, 90–91, 95, 97, 102, 124–125, 165, 266–268, 274, 280–281, 283, 285, 287, 291, 307, 315–316, 323, 442, 465, 470, 472–476, 550, 576–577, 580, 713, 722–723, 756 Gender (sex) and language 23, 290, 382, 398, 632, 743 Genetics/genomics 73, 75, 107, 154, 162, Genitive 12–13, 19–20, 59, 65, 86–87, 166, 169, 316, 320, 445–446, 451, 462–464, 472, 545, 547, 550, 577, 580, 710, 723, 757, 805 Gerund (see also Absolutive form of verb, Conjunctive participle, Converb) 256, 299, 545, 567, 578 Gerundive 20, 31, 34, 441, 477, 582 Glottal 4, 18, 114, 121, 155, 169, 380–381, 383–384, 387, 709, 796 Glottalization (see also Preglottalization) 58, 114, 380, 387 Goal (of motion verb) 25, 511, 518 Government and Binding 501, 506 Grammaticalization/grammaticization 21, 59, 62–63, 97, 125, 163, 246, 266–267, 269, 276, 314, 446, 541, 550, 581–582, 591–594, 667, 672, 674 Grantha script 792, 794–796 Grapheme 787 Grassmann’s Law 28, 390 Greek script 789 Group inflection 65 Gujarati script 792–794 Gurmukhi script 764, 793–794, 800 H Half nasals in Sinhala 383 Head/headedness (syntax) 20, 170, 300, 304, 322, 475, 517–518, 527, 533–534, 550, 566, 569, 572, 577, 581, 723, 756–757 Headedness (phonology) 402 Head-marking 163, 458 Hearsay 585–587 Hierarchical agreement 141 Hijacked constructions 672 History of languages/Linguistic history 10–12, 13–15, 19, 21–22, 35, 39, 44,

46, 51–52, 58–61, 61–63, 73–76, 93, 106–107, 130, 154, 172, 244, 248, 253, 271–274, 292, 562–563 Hortative (see also Modal, Mood) 170, 475 HPSG/Head-Driven Phrase Structure Grammar 502, 535 I Iambic (structure) 402 Identity (sociolinguistic, political, ideological) 50, 300, 304, 323, 632, 637–638, 653, 679–680, 787, 808 Imminent aspect (see also Aspect) 170 Imperative (see also Modal, Mood) 19, 118, 142, 170, 285, 328, 475, 552 Imperfect (see also Tense) 19–20, 61, 97, 274, 588, 714 Imperfective (see also Aspect) 61, 63–64, 288, 452, 455, 531, 563, 581 Implosive 287, 375, 380, 804, 806 Inalienable possession/inalienability 163, 266, 316, 444 Inanimate (see also Animacy) 170, 267, 278, 454, 464 Inchoative 61 Inclusive vs. exclusive (pronoun, marking) 85, 90, 93, 101, 103, 141, 246, 248–249, 266, 274, 311, 317 Incomplete agreement 474 Incompletive aspect (see also Aspect) 170 Incorporation 121–123, 127–128, 314, 443, 472, 550 Indefinite (see also Definite) 60, 84, 315, 321, 331, 460, 570, 573 Indenture 676–680 Indianization (of English) 328, 331 Indigenous grammar 376, 438–439, 448–449, 551, 707–728 Indigenous grammar and western linguistics 376, 448–449, 707, 728 Indirect object 59–60, 89, 118, 468, 509–510, 546 Indo-Altaic area 266 Indosphere 108 Indo-Turanian area 249, 266, 283 Indus script/writing 74, 787, 801–803

Inferential/inferentiality 267, 557, 585–589, 591–593 Infinitive (see also Verbal noun) 20, 61, 92, 267, 269, 474, 520, 551–552, 577–579, 581–582 Infix (see also Affix) 65, 84, 119, 121, 124–126, 129, 319, 444 Inflection 19–20, 30, 62, 65, 76, 116–118, 124, 170, 243, 437–439, 442, 445–446, 449, 537, 540 Inflectional morphology 49, 60, 306, 437, 540, 551, 674, 749 Initial gemination 392 Injunctive 19, 456 In situ 503–504, 518, 520–529 Instrumental 19, 23, 31, 86, 280, 451, 455, 458, 712–713 Intermediate scrambling 508–509, 511, 513–516 Interrogative (see also Wh) 22, 85, 87, 105, 166, 304, 310, 315, 321, 475, 522, 570, 573, 722 Intonation 285, 376, 396, 398, 401–402 Irrealis (see also Counterfactual) 64, 170–171 Isogloss 43, 55–58, 142, 265, 275, 293 Isolation (morphology) 440–443 J jargon 171, 327 Jharkhand convergence 316, 324 K Kaccāyana 714 Kaithi script 800 Kannaḍa script 651, 792, 794–795, 799–800 Karnāṭaka Bhāṣā Bhūṣaṇa 717 Kātantra 714 Kaumāralāta 714 Kharoṣṭhī 788–790, 797–800 Khojki script 793–794 ki/ke-clauses/complementizer (see also Complement clauses) 34, 283–284, 310–311, 501, 523, 574–575 Killer language 680 Koiné 22, 24, 26, 157, 162, 165, 649, 678 Kupwar convergence 323–324

L Labial 4, 18, 25, 28, 82, 92, 94, 96, 98, 124, 155, 165, 389, 395, 790 Laminal 169 Landa script 792–794 Language and power 10, 57, 281, 324, 631, 652–653, 672, 680 Language attrition 307, 631–632, 679–680 Language death/loss 1, 306–309, 631–632, 679 Language documentation 1, 10, 54, 147, 154, 160, 164, 167, 294, 308–309, 636, 638, 642–645, 735–736, 805–809 Language endangerment 1, 2, 10, 54, 132, 141, 147–148, 154–155, 157, 168, 289, 631–637, 638–645, 675, 735–736, 805 Language maintenance 637, 675, 679, 736 Language planning 296, 297–298, 631, 645–656, 787 Language policy 298, 631, 645–656 Language shift 75, 109, 132, 249, 264, 267, 286, 289, 291, 294, 324, 633–637, 639–641, 679 Language use 132, 537, 542–544, 648, 650, 664–665, 727, 735, 743 Laryngeal (PIE/Dravidian consonant) 12, 15, 56, 81–83 Lateral (phonetics) 4–5, 19, 26, 82, 98, 155, 169, 274, 313–314, 378–379, 383, 666, 805 Latin script (see also Roman script) 664, 797 Layers (in morphology) 445–446 Left-branching (syntax) 48, 65, 88, 245–246, 266–267, 283, 568, 574–576, Left edge (accent) 398–402 Left dislocation/left extraction/leftward movement/leftward scrambling 330, 508, 512, 516–517, 527, 534–535, 574, Lepcha script 148 Lexical borrowing (see Borrowing) Lexical case 453, 547 Lexicography/lexicon development (see also Dictionary/dictionaries) 70, 72, 160, 172, 642, 749, 753–754, 761–762 Lexifier (language) 670–673 LFG/Lexical-Functional Grammar 502, 535, 547

LF/Logical Form 506 LH (pitch contour) 28–29, 397–402 Ligature (in writing systems) 789, 791, 795 Light verb 31, 64, 170, 454, 552–558, 559–564 Lilātilakam 95, 662, 718 Lingua franca (see also Link language) 39–40, 43, 97, 144, 243, 282, 296, 305–308, 316, 636, 647–648, 672–673, 678 Linguistic area (see also Convergence area, Sprachbund) 1, 10, 48, 75, 241, 244–246, 249, 266–267, 275, 376, 504 Linguistic states 324, 650–651 Link language (see also Lingua franca) 24, 655 Literacy 288, 290, 637, 640–641, 647, 787, 798–800, 807–808 Literacy programs 642–643, 807–808 Loanwords (see also borrowing(s)) 45, 56, 58, 75–76, 83, 89, 100, 105, 121, 249, 253–254, 267–270, 274–276, 278–283, 287–292, 657–658, 662, 664, 678, 717, 719, 721, 726, 766, 793, 795–796, 802, 808 Localization (computational linguistics) 643, 735–739, 752–753, 755, 760–761 Localized contact 310–311 Locative 19, 86, 169, 274–275, 277, 285, 476, 536, 541, 580, 710, 724, 757 Locative absolute (see Absolute participial formation) Logophoric 515 Long-distance agreement 502, 519–520, 577 Long scrambling 508–509, 511–513, 516 M Machine translation 737, 742, 744–745, 751, 753, 756, 759–762, 764–765, 767 Macro-areas 108, 241, 243, 264 Major syllable 115, 124, 402 Malayalam script 95, 794–795, 800 Maṇipravāḷa 662, 664, 718 Masculine 19, 26, 60, 104, 285, 287, 323, 441–442, 466 Mediative 591, 594

Merge (syntax) 507, 511, 533 Metaphor(ic) 537–544, 720, 725, 802 Metatypy 672, 674 Micro areas 92, 98, 241–242, 268–269 Middle (voice; see also Voice) 20, 30, 61–62, 64, 153 Migration 17, 36, 50, 74–75, 97, 105, 132, 154, 157, 251, 259–260, 264, 270, 271–276, 282, 290–291, 296–297, 299, 300–303, 308, 631, 676–678 Minimalism 501, 531, 533–535 Minority language 1, 54–55, 109, 111, 132–133, 290, 294, 297, 324, 634–640, 651, 653, 655, 675, 677, 739, 752, 800, 808 Minor syllable 111, 115, 124, 402 Minute on Education (Macaulay) 648–649, 652 Mirativity 584–585, 588–590, 593 Missionaries 10, 69, 128, 648, 664, 671–672, 677, 679, 718, 727 Modal/Modality (see also Hortative, Imperative, Mood, Optative, Subjunctive) 19, 34, 62–65, 86, 166, 170, 265, 474–475, 520, 551, 572, 579, 584, 586, 589, 728 Modi script 793 Monosyllabic/monosyllabicity 111, 115, 156, 385, 400, 402, 448, 708 Mood (see also Hortative, Imperative, Modal, Optative, Subjunctive) 62, 455, 551 Morphology (see also Agglutinating/Agglutination, Analytic morphology, Derivational morphology, Inflectional morphology, Isolation, Layers (in morphology), Portmanteau, Synthetic morphology, Whole-word morphology, Word, Word structure) 1, 19–20, 22, 27, 35, 49, 59, 61, 69, 84, 86, 88–89, 105, 112, 124, 138, 142, 147, 155–156, 165, 166, 168–169, 173, 243, 266–267, 273, 283, 287–289, 293, 299–300, 304–306, 317–318, 323, 328, 384, 389, 437–442, 446–448, 450, 469, 471, 503, 532, 540, 544, 550–551, 560, 584, 589, 593, 663, 673–674, 709, 711, 723, 726–727, 749, 761, 764

Morphosyntax 151, 306, 308, 328, 375, 437, 439, 450, 560, 562, 566, 578, 584, 674 Movement (syntax) 507–529, 531–533, 553, 571, 574 Multilingual/Multilingualism (see also Bilingual/Bilingualism) 1, 10, 164, 173, 244, 272, 277, 287, 294–295, 306–307, 323, 325, 631–632, 637, 643, 646, 656, 675, 742, 745, 752, 760, 765, 802 Multiple agreement 319, 465, 468–471 N Nagari (see also Devanagari) 790, 792–795, 800 Nagpur convergence 324 Nandinagari script 792 Nannūl 718, 726–728 Nasal 4–5, 18, 27, 47, 76, 80–84, 90–91, 93–95, 100–101, 114, 155, 169, 258, 260, 313–314, 376–377, 379, 381, 382–384, 387, 391–392, 666, 709 Nasalized/nasalization 5–6, 18, 47, 100, 114, 245, 247, 249, 272, 376, 382–384, 387, 666, 709–710 Naskh script 766 Nastaliq script 762–763 Nati (see also n-retroflexion) 391, 396 National language (see also Official language) 37, 40, 109, 132–133, 284, 287, 297, 639, 650, 652, 656, 678, 762, 809 Natural language processing 739–740, 753, 755, 762 Negation 65, 94, 268, 470, 553, 565, 579, 667 Negative marker 80, 91, 94, 125, 166, 275, 291, 444, 470, 586 Negative (conjugation) 86, 93–94, 101, 104, 142, 291, 316, 667 Neuter 19, 22, 60, 80, 84, 86, 91, 323, 442, 463, 548, 666 Nilgiri convergence area 92–93, 97–98, 253, 309, 325 Nirukta 708, 748 Nominalized/nominalization (see also Clausal nominalization) 125, 129, 276, 319, 321–322, 439, 569, 572–573, 576–578, 592

Nominative 19, 22, 26, 47, 61, 63, 86, 88, 169, 265, 445–446, 453, 458–464, 466, 476, 505, 518–519, 535, 544, 546–548, 577, 589, 710, 713, 722–723, 726 Nominative-accusative/NOM-ACC (alignment) 63, 88, 264, 453, 459–460, 462, 464 Non-finite 20, 88, 166, 245–246, 256, 275, 316, 320, 443, 449, 465, 472–473, 501, 503, 519, 522, 567–569, 573, 576–585, 587, 713, 722, 758 Non-nominative subject/experiencer (see also Experiencer, Oblique experiencer, Oblique subject) 505, 519, 548 Nontransformational 502, 712 Northeast (of South Asia) 123, 131, 142–143, 148, 150–153, 155, 243, 246, 257, 300–309, 317, 325, 383, 388, 634–635, 637, 651, 788 Northwest (of South Asia) 21, 24–27, 40, 42–43, 49, 104, 139, 243, 247, 250, 253, 258–260, 260–263, 264–300, 309, 312, 325, 328, 331, 395, 398, 501, 570, 584–585, 641, 788, 791–792, 805, 809 Noun class marker (see also Classifier) 119, 126–127, 129 Noun phrase/NP 65, 88–90, 100, 156, 245–246, 269, 322, 454–456, 458, 460, 465–466, 511–512, 519, 532–533, 535–536, 542, 544–546, 548, 566, 573, 756 n-retroflexion (see also Nati) 30, 396, 709 Number (category) 19, 60, 62, 84–88, 91, 95, 97, 141–142, 166, 254, 291, 293, 315–317, 442, 454, 465, 472–475, 550, 560, 577, 580, 581, 713, 756 Numeral 17, 60, 83–84, 86, 91, 144, 154, 265, 267, 304, 317, 449 Numeral symbols 797, 802 Numerative 60 Nuqtā 793, 795 O Object agreement 112, 314, 465, 468, 471, 474, 518–520 Object marking 60, 119, 156, 249, 277, 315, 318, 450, 459–465 Object shift 511

Oblique case (see also Double oblique) 19, 59, 71, 80, 85–86, 278, 286, 445–446, 452, 454, 534–536, 579, 805 Oblique experiencer (see also Experiencer, Non-nominative subject/experiencer, Oblique subject) 156, 501, 544–549, 579 Oblique subject (see also Experiencer, Non-nominative subject/experiencer, Oblique experiencer) 31, 466, 501, 544–549 OCR (optical character recognition) 737, 753, 760–763, 766–767 Official language (see also National language) 35–36, 39–40, 43, 73, 95–96, 98, 124, 277, 282, 294, 296–298, 635, 649–650, 652, 654–655 Ol’ Cemet script 114 Ol Chiki script 800 Old Kannada script 794 Online sentence production 540 Onomatopoeia/onomatopoetic 87, 316, 722 Optative (see also Modal, Mood) 19, 34, 62, 64–65, 170, 456, 582, 715 Optimality Theory/OT 378, 385, 401, 452, 455, 532 Orality 632, 710, 799 Orientalists 648 Oriya script 790, 794–795, 800 Orthography 55, 83, 114, 375, 640, 787, 791, 794, 799, 804–808 Overcounting (numeral system) 144, 154 Overt case marking 467, 517, 712 P Palatal 4, 11–14, 16–18, 24, 26, 28–29, 56–57, 66, 82, 100–101, 155, 165, 262, 264, 267, 288, 313, 384, 390, 396 Palatalized/palatalization 12–14, 16, 93–94, 100, 375, 380, 395–396, 666, 794 Pallava Grantha script 792, 796 Pāṇini 25, 28, 30–31, 278, 376, 389–391, 438, 448–450, 588, 707–716, 718–721, 724–727, 750–751, 757, 788 PAN (Pan Asian Networking) 736–738, 753, 755, 761

Paradigm 62, 275, 446, 448, 566 Parameter 243, 248, 464–465, 501, 506, 531, 591, 724 Participial formations (see Absolute participial formations, Participial relative clause, Relative participle) Participial relative clause (see also Relative participle) 503 Participle (see also Present participle, Relative participle, ta-participle, Verbal participle) 20–21, 61–62, 80, 87–88, 91, 102, 157, 266, 278, 304, 321, 330, 446, 472, 503, 505, 552, 561, 563, 567–568, 576–577, 580–581, 583, 585–586, 667, 723 Particle 17, 20, 48, 61–62, 65, 88, 112, 114, 157, 310–311, 322–324, 449, 506, 566, 570, 573–574, 585–588, 593 Passive (see also Voice) 20–21, 30–31, 61–64, 272, 287, 329, 437, 447, 450–451, 460–461, 467, 476, 508, 667, 712–713 Past (see also Preterit(e), ta-participle, Tense) 19–20, 25, 61–64, 80, 86, 91, 97, 100–101, 104, 125, 170, 265–266, 278, 293, 328, 330, 442, 452, 458, 467, 470, 536, 563, 578, 582–583, 585–586, 588–589, 667, 707, 714 Patañjali 23, 708, 710, 714–715 Patient 62, 156, 242, 443, 459–460, 462–463, 466–468, 470, 473, 712 Pen-ant 397–400, 402 Penult 397, 399–400, 402 Perfect (tense; see also Tense) 19, 61–65, 116, 293, 453, 554, 563, 585–589 Perfective/Perfectivity (see also Aspect) 20–21, 61, 63, 80, 87–88, 102, 243, 266, 321, 446, 452–456, 467, 470, 517–519, 541, 548, 560, 563, 565, 582, 586 Perso-Arabic script (see also Arabic script) 39, 286, 651, 764, 787, 803–804, 807–809 Person 19, 60, 62, 70, 86–88, 95, 124, 141–142, 153, 170, 293, 301, 315, 317, 442, 454, 456, 465, 472–473, 475, 550, 560, 581, 593, 756

Phonetics 1–2, 3–6, 18, 27, 30, 71, 247, 293, 375, 376–388, 708–709 Phonology 1, 11, 14, 18, 27–30, 33, 35, 47, 51, 55–56, 58, 66, 71, 81–84, 99–100, 111–112, 114–115, 124, 155–156, 165, 168–169, 245, 247–248, 256–259, 265, 270, 274, 279, 287, 309, 313, 321–322, 327, 375, 388–402, 444–445, 531–532, 709, 710–711, 806 Phrase structure 65, 518, 533, 535–536, 755, 756, 758–759 Pidgin 631, 647, 669–675 Pitch accent (see also Accent, Stress) 18, 28, 29, 270, 397–398, 400, 402 Place of articulation 169, 377, 382, 790 Plural marking 60, 268, 280, 306 Point-of-view 453, 554 Pole (in compound verbs) 473, 560–561, 564, 566 Politeness 32, 285, 475 Polysynthetic 163 Portmanteau (morphology) 259, 306, 318, 442, 468–469, 471, 666 POS (Part of Speech) labeling 737, 739, 752–756, 760, 764, 766, 767 Possessive (pronoun/reflexive) 30, 163, 265, 268–269, 274, 512, 534–535 Possessor 31, 59, 118–119, 318, 468, 534, 544, 546–548 Possessor agreement 118–119, 318–319, 468–469 Possessor dislocation 534 Possessor subject 31, 547–548 Postposition (see also Adposition) 59, 65, 86, 88, 156, 244–245, 247, 249, 269, 285–286, 288, 456, 466, 519, 541, 548–549, 756 Pragmatic case marking 454, 457, 464 Pragmatics 540, 542–543, 556, 564, Prātiśākhyas 376, 389, 708–710, 748 Prefix (see also Affix, Preverb) 20, 61–62, 64–65, 76, 117, 119, 121, 124, 126–127, 129, 166–167, 170, 245, 247, 250, 252, 257, 265, 275, 323, 402, 444, 532, 534 Preglottalization (see also Glottalization) 114, 387 Preposition (see also Adposition) 65, 70, 156, 244, 247, 272, 299, 535, 538

Present (see also Tense) 19–21, 61, 63–65, 94, 115, 170, 274, 278, 293, 316, 331, 447, 581, 589, 666, 707 Present participle (see also Participle) 21, 61, 581 Preterit(e) (see also Past, Tense) 70, 264, 453 Preverb (see also Prefix) 61, 65 Pre(-)verbal position 118, 312, 470, 521, 523, 525–528, 570 Principles & Parameters 501, 506, 531 PRO, 522–523, 530–531, 545 Proclitic 125, 163–164, 470 Pro-drop 530–531 Progressive (verb formation; see also Aspect) 21, 61, 64, 104, 115, 272, 274–276, 324, 328, 331, 377, 437, 589 Pronoun/pronominal (see also Deictic (pronoun), Demonstrative pronoun, Interrogative pronoun, Inclusive vs. exclusive pronoun, Possessive pronoun, Relative pronoun, Resumption/resumptive pronoun) 19–20, 22, 30, 32, 59, 63–65, 71, 80, 83, 84–85, 87, 89–91, 93, 102, 105, 117–118, 125, 166, 242, 246, 248–249, 259, 265, 268–269, 274–275, 278–279, 285, 287, 291, 293, 300, 304, 310, 314–316, 321–323, 330, 439, 449, 442, 444, 447, 449, 452, 454, 458, 467, 469–470, 475–476, 505, 511, 515–516, 530, 532–535, 566, 568, 570, 573, 575, 658, 673, 722 Prosody/prosodic 115, 124, 376, 386–388, 396–402, 532–533, 535 Prototype structure 537–538 Prototypical meaning 538 Protraction (of accent) 398–400 Psycholinguistic/psycholinguistics 2, 541–543 Psych-predicates 535 Q QP (question particle) 48, 310–312, 321 Quantifier 509, 516 Question 48, 310–311, 328, 502–504, 506–507, 520–529, 574, 589, 593–594, 721–722 Quirky case 547

Quotative 20, 31–32, 34–35, 75, 88, 97, 248–249, 256, 300, 311–312, 322, 505, 569, 573–576, 587 R Ranjana script 792 RC-CC (see Relative clause/relative-correlative) Realis 170 Reciprocal 510–516, 555 Reconstruct/reconstruction (comparative-historical) 24, 58, 80–87, 91, 93, 106–107, 120–121, 130, 134, 136, 138, 140, 142, 145, 149, 151, 153, 154, 247, 255, 302, 308, 320, 402, 471, 556, 591 Reconstruction (syntactic term) 509–515 Reduplication 28, 87, 120–121, 256, 320, 389, 444–445, 461, 503–504 Referential 60, 164, 515, 552, 554, 556–557, 722 Reflexive/reflexivization 30–31, 83, 86, 142, 153, 170, 448, 503, 505, 515, 535, 545–546, 555 Relational Grammar 547 Relative clause/relative-correlative (see also Participial relative clause, Relative participle) 20, 32, 65, 88, 97, 106, 156, 266, 268, 274, 300, 304, 310, 312–313, 315–316, 321–322, 329–330, 439, 503, 534, 568–571, 573, 575–576, 581, 589, 757 Relative participle (see also Participial relative clause) 330, 552, 667, 723, 758 Relative pronoun 300, 304, 310, 475–476, 568, 570 Relator noun 446 Relexification 286, 288, 680 Religion 38, 104, 282, 632, 646, 664, 677 Resumption/resumptive pronoun 330, 530, 534 Retroflex 2–5, 13, 15–17, 18–19, 26, 28, 29–30, 46–47, 58, 75–76, 82, 84, 92–94, 96, 98, 100, 102, 108, 155, 165, 242, 245, 247–250, 256–259, 262, 264–265, 267–269, 272, 274, 281, 286, 288, 293, 297, 299, 300, 313–314, 320–322, 327, 375, 376–379, 384, 390–392, 396, 709, 764, 793, 803–804, 807–809

Retroflex approximant 4, 82, 155, 257, 314, 379 Retroflex harmony 396 Retroflex vowels 267, 269, 378, 807 Rhotic 4, 5, 19, 26, 142, 155, 169, 257, 378, 379 Right-branching (syntax) 65, 156, 246, 266–267, 283, 574–576 Right Roof Constraint 516 Rightward scrambling 507–508, 516–517 Roman script (see also Latin script) 46, 754, 765, 806–808 Root 30, 84, 86–87, 89, 91, 99–100, 117, 119–121, 126–128, 134, 146, 166, 320, 392, 395, 444, 447–459, 707–708, 711–712, 724, 726 Root alternation 447–448 Root extension 91 RUKI 11, 12, 14–16, 390–391, 396 S Śabdamaṇidarpaṇa 717 Sandhi 21, 27–29, 33, 84, 314, 375–376, 385, 388–394, 396, 551, 663, 666, 708, 710–711, 718, 751 SanskNet 748 Śāradā script 792, 800 Saurashtra script 800 Scheduled language 73 Scheduled tribes and languages 92, 100 Scope 502, 506, 509–529, 553, 589 Scrambling 502, 507–517, 521, 523, 526, 528, 530, 533, 553 Second-position stress 115 Second position (syntax) 530–535 Semantic case 712 Semantic convergence (see also Convergence) 243 Semantic/semantics 60, 85, 88, 92, 119, 243, 245, 265–266, 268, 277–278, 281–283, 300, 304, 326, 451, 453–454, 456–461, 464, 501, 503, 521–522, 528–529, 538, 540–541, 543, 547, 551–554, 556–557, 559–561, 563–564, 584–594, 663, 672, 710, 712, 715–716, 719, 722–726, 741, 745, 749, 755, 758, 760, 765–766, 802 Semi-speaker 168

Serial verb 88, 104, 106, 125, 272, 315, 439, 473–475, 550, 552, 581 Sesquisyllabic/sesquisyllabicity 111, 115, 402 Shahmukhi script 286, 764, 804 Shame (as a factor in language attrition/loss) 638, 641 Shell Script 799 Shibboleth 95 Short scrambling 508–514 Sidat San̆garā 707 Siddhamātṛkā script 792, 794 Siddham script 792 Sign language 3 Śikṣā 708–709, 720 Simplify/simplification 13, 46, 84, 99, 100, 144, 154, 305–306, 663–664, 666, 673, 675 Sinhalese script 792, 794–796 Sinosphere 108 Sloppy Control 547 Sociative (see also comitative) 86 Sociolinguistic/sociolinguistics 1, 35, 164, 173, 271, 273, 277, 286, 289, 295, 324, 631–680 Sorang Sompeng script 800 Specific/specificity 60, 170, 457, 459–465 Specifier 509, 517–518, 526–528 Spell checker 737, 753, 761–762, 764 Split agreement 530, 533 Split-ergative/split-ergativity 63, 71, 246, 264, 277, 450–452, 454–455, 501, 533, 548 Split-headedness 533 Split-intransitive 265, 455 Sprachbund (see also Convergence area/Linguistic area) 56, 58, 75, 241–242, 244, 272, 293, 301, 563 State language 149, 633–634, 637–638, 646, 652–654, 656 Stem 61–65, 80, 86–88, 91, 94, 100–101, 125–126, 265, 275, 279–280, 282, 285, 293, 320, 442–443, 447–448, 666, 711, 713 Stranding 509, 517, 528 Stress (see also Accent, Pitch accent) 115, 165, 376, 383–384, 394, 396–400, 402, 461, 464, 531–532, 806

Strict SOV 257, 567, 569, 576 Structural borrowing 249, 273 Structural case 453, 517 Subgrouping (see also Classification) 2, 12–13, 15, 58, 76–81, 130–155, 301 Subject 30–32, 60–63, 70–71, 76, 88–89, 100, 112, 118–119, 125, 142, 243, 245–246, 248, 256, 265–267, 314, 316, 318, 320, 450, 452–464, 466–477, 501, 503, 505, 507–509, 512–520, 522, 527, 530–531, 535–536, 544–548, 550–556, 564, 576–577, 579–580, 587, 589, 593–594, 639, 663, 710, 712, 715, 722–723, 758 Subject agreement 88, 100, 112, 118, 314, 316, 465–477, 518–519 Subject properties 31, 545–548, 579 Subjunctive (see also Modal, Mood) 19, 62, 65, 104, 285, 475 Subordination 20, 166–167, 320, 501, 567–569, 576, 578–579, 583–584, 710 Subordinator 65 Substrate/substratum (see also adstrate/adstratum) 12, 74–75, 92, 123, 140, 143, 154, 245, 250, 252–253, 255–256, 258, 260–268, 280, 304, 311, 317, 325, 332, 670–673 Suffix (see also Affix) 23, 30, 60–62, 64–65, 70, 72, 76, 84, 86, 90, 93–94, 97, 99–101, 104, 115–116, 119, 121, 123, 125, 153, 157, 166, 170, 245, 247, 249, 265, 269, 274–276, 279–280, 287, 291, 304, 306, 314, 320, 392, 395, 442–444, 447–448, 459, 469–471, 552, 556, 594, 673, 713, 715, 723, 726, 756, 802, 805 Superstrate/superstratum (see also adstrate/adstratum) 304, 332, 672 Switch reference 320, 579–580 Syllable structure 18, 21, 28–29, 84, 91, 93, 99–102, 111, 114–115, 124, 134, 287, 381, 383–386, 390, 397, 402, 666 Syloti Nagri 800 Syncope 382, 393–394 Syntax 20, 27, 30, 32–35, 48, 52–53, 69, 87–89, 106, 108, 114, 121, 124, 138, 151, 156, 166–167, 169–170, 173, 243, 245–246, 248–249, 273, 275–277, 280, 283, 288–289, 299, 305–306, 308, 325,

328, 331, 375, 384, 437, 439, 450–451, 467, 501–509, 516–519, 521, 531–536, 542–547, 549, 551–553, 557, 560, 562, 564–566, 571–582, 584, 662–663, 665, 667, 671–674, 709, 726, 755–759, 760, 766, Synthetic morphology 447 T Tag sets/tagging 737, 739, 744, 746, 749–756, 758, 760, 764, 766, 767 Takri script 792, 800 Tamil script 83, 93, 721, 790–795, 800 ta-participle (see also Verbal participle) 19–21, 31, 61, 451, 579, 580–582, 708 Telugu script 717, 790, 792, 794–795, 799–800 Tense (see also Future, Imperfect, Past, Perfect, Preterit(e), Present) 19–20, 25, 61, 63–65, 86, 88, 90, 94–95, 97, 101, 112, 116, 125, 142–143, 170, 246, 265–266, 274, 278, 306, 316–317, 320, 331, 392, 442, 452–456, 458–459, 463, 467, 474, 533, 536, 551, 555, 560, 566, 572, 582, 585–587, 594, 666–667, 674, 714, 722–723, 756 Thaana script 796, 800 Theme (syntax) 459–464, 511, 544, 546, 548, 660, Three-language policy 631, 637, 650–651 Tibetan grammatical tradition 707 Tibetan script 792, 808–809 ToBI (phonology) 401 Tolkāppiyam (Tolkāppiam) 83, 93, 392, 449, 551, 661, 668, 707, 714, 717–727 Tolkāppiyar 719–727 Tone/tonal language 6, 25, 67–68, 103, 111, 114–115, 133, 140, 156, 166, 270, 300, 307, 375, 384–385, 387, 389, 397, 401–402, 646, 793, 802, 804–807 Tonogenesis 385, 387 Topic/topicality 304, 321, 454, 456, 508–509 Transcription 2–5, 11, 14, 19, 28–29, 46, 54, 71–72, 167, 257 Transformational 501–503, 531, 701–702, 710, 712 Transitional stages 272, 288

Transitional zones 247 Transitive/transitivity 20–21, 30–31, 61–62, 64, 86, 90–91, 101, 117, 122, 141–142, 164, 166, 246, 248, 265–266, 447, 450, 452–456, 458–459, 463–467, 471, 517–518, 533, 535–536, 546–548, 555, 561, 564, 566, 579, 712, 724, 758 Treebanking/treebank 751, 753, 755–758, 766–767 Tribal language(s) 10, 92, 96, 98, 168, 250, 324, 452 Triphthong 114 Trochaic (structure) 402 Two Mora Conspiracy 21, 27, 33, 260 U Umlaut 45, 47–48, 83, 375, 394–396, 447 Unaccusative 455, 458, 758 Unergative 450, 455, 458, 758 Unicode and South Asian scripts 643, 736, 743–744, 752, 763–766, 787, 800, 804–805, 807, 809 Usage-based approach 542–544 Uvular 4, 81, 83, 169, 293, 396 V Vākyapadīya 708, 715 Varṇamālā 164, 790 Vaṭṭeḻuttu script 792, 795 Vector (in compound verbs) 246, 473, 503, 552, 559–560, 562–566 Velar 4, 11, 12, 18, 28, 76, 82, 104, 155, 165, 169, 293, 313, 314, 382, 396, 709, 804 Verbal noun (see also Infinitive) 20, 61, 577 Verbal participle (see also ta-participle) 552–554, 556, 567, 578 Verb phrase (VP) 88, 511, 518–519, 531, 533, 569, 758 Vikṛti Vivēkamu 717

Virāma (script symbol) 791, 795 Visarga (script symbol) 4, 790, 795 Voice (grammatical; see also Active, Middle, Passive) 20, 62, 64, 82, 110–111, 320, 381–383, 385, 709, 712 Voice/voiced/voiceless (phonology; see also Creaky voice) 4, 12–14, 27–28, 57, 63, 67, 82, 98, 110–111, 114, 169, 274, 286, 289, 293, 303, 314, 320, 327, 376–377, 380–386, 664, 709, 790, 793, 795, 803–806, 809 Voice-to-text 736 Volitionality 243, 454–456, 560, 564 Vowel harmony 99, 375, 388, 394–395, 448, 666 W Wackernagel position/Wackernagel’s Law (see also second position) 63 Weak crossover 509, 512–514, 516 Weak-strong prosody 727 Wh (see also Interrogative) 509, 523–527, 529 Wh-in-situ 521, 523, 525 Wh-movement 508, 521, 525–528 Whole-word morphology 448 Word (morphological term) 119, 124, 384, 396, 402, 444, 448, 662, 708, 715–716, 725–727, 747, 754, 756 Wordnet 737, 745–749, 754, 761, 763, 765, 766 Word order 20, 49, 76, 88–89, 123–124, 246, 301, 315, 401, 454, 504, 507–508, 521, 535–536, 540, 671, 678, 749, 758 Word structure (see also Morphology) 84, 115, 119, 124, 166, 402, 444, 557 World Englishes 680 Writing systems (see also Orthography) 1, 297, 387, 721, 787, 789–790, 792–793, 797–803, 805–807, 809