The Oxford Handbook of Languages of the Caucasus 9780190690694, 9780190690700, 9780190690717, 0190690690

The Oxford Handbook of Languages of the Caucasus is an introduction to and overview of the linguistically diverse langua

English Pages 1189 Year 2020

Table of contents:
Cover
The Oxford Handbook of Languages of the Caucasus
Copyright
Dedication
Table of Contents
Editor
Abbreviations
Notes on Contributors
Maps: Yuri Koryakov
Map 1. Caucasus: Administrative division
Map 2. Language families of the Caucasus
Map 3. Languages of the Caucasus
Map 4. Nakh-Dagestanian languages
Introduction
I.1. Navigating the Area
I.2. A Linguistic Snapshot of the Caucasus
I.3. Scholarship on Languages of the Caucasus
I.4. Structure of This Handbook
Acknowledgments
Part I: General Overview of the Caucasus
Chapter 1: Languages and Sociolinguistics of the Caucasus
1.1 Introduction
1.2 Family Affiliation
1.2.1 Caucasian Languages
1.2.2 Kartvelian
1.2.3 Northwest Caucasian
1.2.4 Nakh-Dagestanian
1.2.5 Indo-European Languages
1.2.6 Turkic Languages
1.2.7 Semitic
1.3. Official Statistics on Language Speakers and Users
1.3.1 Armenia
1.3.2 Azerbaijan and Artsakh (Nagorno-Karabakh)
1.3.3 Georgia, Abkhazia, and South Ossetia
1.3.4 Russia
1.4 Writing and Scripts
1.5 Multilingualism in the Caucasus
1.5.1 South Caucasus
1.5.1.1 Azerbaijan
1.5.1.2 Armenia
1.5.1.3 Georgia
1.5.1.4 Abkhazia
1.5.2 Russia
1.5.3 Dagestan
1.6 Language Contact in the Caucasus
1.6.1 Lexical Borrowing
1.6.2 Contact-Induced Change in Grammar: An Overview
1.6.3 Alternations in Recipient Marking
1.6.4 Ordinal Numerals Formed with ‘Say’
1.6.5 ‘Find’ as an Epistemic Auxiliary
1.6.6 Differential Object Marking in Udi
1.7 Conclusion
Acknowledgements
Chapter 2: North Caucasus: Regions and Their Demography
2.1 Northeastern Caucasus
2.1.1 Dagestan
2.1.2 Chechnya
2.1.3 Ingushetia
2.2 Northwestern Caucasus
2.2.1 North Ossetia
2.2.2 Kabardino–Balkaria
2.2.3 Karachay–Cherkessia
2.2.4 Adygea
Further Reading
Part II: Nakh-Dagestanian Languages
Chapter 3: Nakh-Dagestanian Languages
3.1 Introduction
3.1.1 Structure of the family
3.1.2 Historical Sources
3.1.3 Current Sociolinguistic Situation
3.1.4 History of Research
3.2 Phonetics and Phonology
3.2.1 Consonant Inventories
3.2.2 Vowel Inventories
3.2.3 Stress and Tone
3.2.4 Phonotactics
3.2.5 Main Phonological Processes
3.3 Lexical Classes
3.4. Nominal Morphology
3.4.1 Nominal Gender
3.4.2 Nominal Inflection and Inflectional Features
3.4.3 Personal Pronouns
3.4.4 Reflexive and Reciprocal Pronouns
3.4.5 Demonstrative Pronouns
3.4.6 Determiners
3.4.7 Adjectives
3.4.8 Numerals
3.4.9 Postpositions
3.5 Verbal Morphology
3.5.1 Morphological Classification of the Verbal Lexicon
3.5.2 Verbal Inflection
3.5.3 Finiteness
3.5.4 Indicative Forms
3.5.5 Non-Finite Forms
3.5.6 Non-Indicative Forms
3.5.7 Periphrastic Verb Forms
3.5.8 Negation
3.5.9 Valency-Changing Operations
3.5.10 Agreement Features and Their Morphological Exponence
3.6 Simple Clauses
3.6.1 Structure of Noun Phrases
3.6.2 Predicate Structure
3.6.3 Major Valency Classes
3.6.4 Minor Valency Classes and Non-Canonical Argument Marking
3.6.5 Word Order
3.6.6 Questions
3.6.7 Agreement
3.6.8 Grammatical Relations
3.7 Complex Sentences
3.7.1 General Profile of Complex Sentence Formation
3.7.2 Clause Chaining
3.7.3 Relative Clauses
3.7.4 Complement Clauses
3.7.5 Adverbial Clauses
3.8 Lexicon
3.9 Future Directions of Research
Acknowledgments
Chapter 4: Dargwa
4.1 Language and Demographics
4.1.1 Location and Dialects
4.1.2 Current Sociolinguistic Situation
4.1.3 Brief History of Research
4.2 Phonetics and Phonology
4.2.1 Consonant Inventory
4.2.2 Vowel Inventory
4.2.3 Stress
4.2.4 Important Phonological Processes
4.3 Lexical Classes
4.4 Nominal Morphology
4.4.1 Gender
4.4.2 Nominal Inflection
4.4.3 Inflectional Features
4.4.3.1 Number
4.4.3.2 Case
4.4.3.3 Locative Forms
4.4.4 Personal Pronouns
4.4.5 Reflexives and Reciprocals
4.4.6 Demonstrative Pronouns
4.4.7 Other Pronouns and Quantifiers
4.4.7.1 Interrogative Pronouns
4.4.7.2 Indefinite Pronouns
4.4.7.3 Quantifiers
4.4.8 Adjectives
4.4.9 Numerals
4.4.10 Adverbs
4.5 Verbal Morphology
4.5.1 Morphological Verb Classes
4.5.2 Verbal Inflection
4.5.2.1 Types of Verb Forms
4.5.2.2 Verb Stem Derivation
4.5.2.3 The Structure of Verbal Forms
4.5.2.4 Stress in the Verbal Domain
4.5.2.5 Inflectional Categories
4.5.3 Non-Finite Forms
4.5.3.1 Simple Converbs
4.5.3.2 Participles
4.5.3.3 Infinitives
4.5.3.4 Specialized Converbs
4.5.3.5 Verbal Nouns
4.5.4 Core Indicative Forms
4.5.4.1 Preterite
4.5.4.2 Forms Based on the Preterite Stem
4.5.4.3 Present and Past Progressive
4.5.4.4 Future
4.5.5 Non-Indicative Forms
4.5.5.1 Imperative
4.5.5.2 Prohibitive
4.5.5.3 Irrealis
4.5.5.4 Optative
4.5.5.5 Conditionals
4.5.5.6 Propositive
4.5.5.7 Short Optative (Optative Noun)
4.5.6 Negation
4.5.7 Causative
4.5.8 Agreement Features and Their Morphological Exponence
4.6 Other Word Classes
4.7 Simple Clauses
4.7.1 Noun Phrases
4.7.2 Predicate Structure and the Problem of Finiteness
4.7.3 Alignment, Major Valency Classes, and the Antipassive
4.7.4 Minor Valency Classes and Non-Canonical Argument Marking
4.7.5 Word Order and Information Structure
4.7.5.1 Word Order
4.7.5.2 Information Structure
4.7.6 Agreement
4.7.6.1 Gender Agreement
4.7.6.2 Person Agreement
4.7.7 Local Anaphora
4.8 Complex Sentence
4.8.1 Coordination and Subordination
4.8.2 Clause Chaining
4.8.3 Relative Clauses
4.8.4 Complement Clauses
4.8.5 Adverbial Clauses
4.8.6 Long-Distance Anaphora
4.9 Topics for Further Study
Acknowledgments
Chapter 5: Lak
5.1 The Language and Its Speakers
5.1.1 Location and Status
5.1.1.1 Dialects
5.1.2 Historical Sources
5.1.3 Current Sociolinguistic Situation
5.1.5 Brief History of Research
5.2 Phonetics and Phonology
5.2.1 Consonants
5.2.2 Vowels
5.2.3 Phonotactics
5.2.4 Morphophonemics
5.2.4.1 Ablaut
5.2.4.2 Velar~Palatal Alternations
5.2.4.3 Metathesis
5.3 Lexical Classes
5.4 Nominal Morphology
5.4.1 Nominal Classification
5.4.1.1 Agreement Markers
5.4.2 Stem Formation in Nouns
5.4.3 Inflection for Case
5.4.3.1 Grammatical Cases
5.4.3.2 Locative Cases
5.4.3.3 Functions of Locative Forms
5.4.4 Personal and Other Pronouns
5.4.5 Reflexive and Reciprocal Pronouns
5.4.6 Demonstratives and Deixis
5.4.7 Determiners
5.4.8 Adjectives and Adverbs
5.4.8.1 Gradation
5.4.8.2 Adverbs
5.4.9 Numerals
5.4.10 Postpositions
5.4.10.1 Locative Postpositions
5.4.10.2 Non-Locative Postpositions
5.5 Verbal Morphology
5.5.1 Morphological Classification of Verbal Lexicon
5.5.2 Verbal Inflection
5.5.3 Core Indicative Forms: Inventory of Synthetic TAM Forms
5.5.4 Non-Finite Forms
5.5.4.1 Non-Modal Gerunds and Participles
5.5.4.2 Masdars and Verbal Nouns
5.5.5 Non-Indicative Forms
5.5.6 Periphrastic Forms
5.5.7 Evidentiality
5.5.8 Negation and Interrogation
5.5.9 Valency-Changing Operations
5.5.10 Agreement
5.6 Simple Clauses
5.6.1 Structure of Noun Phrases
5.6.2 Predicate Structure
5.6.3 Finiteness
5.6.4 Major Valency Classes
5.6.5 Minor Valency Classes
5.6.6 Word Order
5.6.9 Clausal and Constituent Negation
5.7 Complex Sentences
5.7.1 General Profile of Complex Sentence Formation
5.7.2 Clause Chaining
5.7.3 Relative Clauses
5.7.4 Complement Clauses
5.7.5 Adverbial Clauses
5.7.6 Long-Distance Anaphora
5.8 Topics for Further Study
Chapter 6: Avar
6.1 The Language and Its Speakers
6.2 History of Research and Documentation
6.3 Phonology
6.3.1 Phonemic Inventory and Alphabet
6.3.2 Phonotactics and Accent
6.3.3 Morphophonology
6.4 Morphology
6.4.1 Nouns
6.4.2 Pronouns
6.4.3 Verbal Morphology
6.4.3.1 Synthetic and Analytic Tenses
6.4.3.1.1 Simple Present (-ula / -ola / -la / -una)
6.4.3.1.2 Future (-ila / -ela / -la / -ina)
6.4.3.1.3 Aorist (-ana / -una / -na; neg. -č’o / -ič’o / -inč’o)
6.4.3.1.4 Habitual Past (-ula-ʔan / -ola-ʔan / -la-ʔan / -una-ʔan)
6.4.3.1.5 Perfect (perfective converb + present tense copula-auxiliary)
6.4.3.1.6 Pluperfect (perfective converb + aorist copula-auxiliary)
6.4.3.1.7 Compound Present (present participle + present tense copula-auxiliary)
6.4.3.1.8 Compound Past (present participle + aorist copula-auxiliary)
6.4.3.1.9 Compound Future (present participle + future copula-auxiliary)
6.4.3.1.10 Prospective Future (infinitive + present tense copula-auxiliary)
6.4.3.2 Non-Indicative Verb Forms
6.4.3.2.1 Imperative (-a / -e / -j) & Prohibitive (-uge / -unge / -ge)
6.4.3.2.2 Optatives
6.4.3.2.3 Hortative (infinitive/future + emphatic enclitic =in)
6.4.3.2.4 Irrealis (future + -ʔan, i.e. -ila-ʔan / -ela-ʔan / -la-ʔan / -ina-ʔan)
6.4.3.3 Non-Finite Verb Forms: Conditionals, Deverbal Nouns, Participles, Converbs
6.4.3.3.1 Infinitive (-ize / -ze / -ine)
6.4.3.3.2 Masdar (-i / -j / -in)
6.4.3.3.3 Participles
6.4.3.3.4 Converbs
Perfective Converb (-un / -n)
Imperfective Converb (simple present + =go)
Specialized Converbs
Realis Conditional (-ani / -uni / -ni; neg. -č’o-ni / -ič’o-ni / -inč’o-ni)
Irrealis Conditional (participle + -ani)
Concessives (-ani=gi / -uni=gi / -ni=gi; neg. -č’o-ni=gi / -ič’o-ni=gi /-inč’o-ni=gi)
6.4.3.4 Valency Classes and Valency Alternations
6.4.4 Adjectives
6.4.5 Numerals
6.4.6 Postpositions, Adverbs, Enclitics, and Particles
6.4.7 Word Formation
6.5 Syntax
6.5.1 Noun Phrases and Other Types of Phrases
6.5.2 Simple Clauses
6.5.3 Agreement
6.5.4 Interrogative Clauses
6.5.5 Complex Clauses
6.5.5.1 Coordination
6.5.5.2 Adverbial Clauses
6.5.5.3 Complement Clauses
6.5.5.4 Relative Clauses
6.5.6 Reflexivization and Reciprocalization
6.6. Areas of Future Research
Acknowledgements
Chapter 7: Archi
7.1 The Language and Its Speakers
7.2 Phonetics and Phonology
7.2.1 Consonant Inventories
7.2.2 Vowel Inventories
7.2.3 Stress and Tone
7.2.4 Phonotactics
7.2.5 Most Important Phonological Processes
7.3 Lexical Classes
7.4 Nominal Morphology
7.4.1 Nominal Classification: Gender
7.4.2 Nominal Inflection
7.4.3 Inflectional Features
7.4.4 Personal Pronouns
7.4.5 Reflexive and Reciprocal Pronouns
7.4.6 Demonstrative Pronouns
7.4.7 Determiners
7.4.8 Adjectives and Other Nominal Modifiers
7.4.9 Numerals
7.4.10 Postpositions
7.5 Verbal morphology
7.5.1 Morphological Classification of Verbal Lexicon
7.5.2 Verbal Inflection
7.5.3 Core Indicative Forms: Inventory of Synthetic TAM Forms
7.5.4 Non-Finite Forms
7.5.5 Non-indicative Forms
7.5.6 Periphrastic Forms
7.5.7 Negation and Other Clausal Operators Expressed Morphologically on the Verb
7.5.8 Agreement Features and Their Morphological Exponence
7.6 Simple Clause
7.6.1 Structure of Noun Phrases
7.6.2 Predicate Structure
7.6.3 Finiteness
7.6.4 Major Valency Classes
7.6.5 Minor Valency Classes
7.6.6 Word Order
7.6.7 Agreement
7.6.8 Local Anaphora
7.6.9 Grammatical Relations
7.6.10 Negation
7.7 Complex Sentence
7.7.1 General Remarks
7.7.2 Clause Chaining and Converbs
7.7.3 Relative Clauses
7.7.4 Complement Clauses
7.7.5 Adverbial Clauses
7.7.6 Long-Distance Anaphora
7.8 Outstanding Issues
Acknowledgments
Chapter 8: Chechen and Ingush
8.1 The Languages and Their Speakers
8.1.1 Geography and Sociolinguistics
8.1.2 History of Research
8.2. Phonetics and Phonology
8.2.1 Consonant Inventories
8.2.2 Vowel Inventories
8.2.3 Stress and Tone
8.2.4 Phonotactics
8.2.5 Phonological Processes
8.3 Lexical Classes
8.4 Nominal Morphology
8.4.1 Noun Classification
8.4.2 Nominal Inflection
8.4.3 Inflectional Categories of Nominals
8.4.4 Personal Pronouns
8.4.5 Reflexive and Reciprocal Pronouns
8.4.6 Demonstrative Pronouns
8.4.7 Demonstratives
8.4.8 Adjectives
8.4.9 Numerals
8.4.10 Postpositions
8.5 Verb Morphology
8.5.1 Morphological Classifications
8.5.2 Verbal Inflection
8.5.3 Synthetic TAM Forms
8.5.4 Non-Finite Forms
8.5.5 Non-Indicative Forms
8.5.6 Periphrastic (Analytic) TAM Forms
8.5.7 Negation
8.5.8 Valence-Changing Operations
8.5.9 Agreement
8.5.10 The Addressee Dative Construction
8.6 Simple Clauses
8.6.1 Structure of Noun Phrases
8.6.2 Predicate Structure
8.6.3 Finiteness
8.6.4 Major Valence Classes
8.6.5 Minor Valence Classes
8.6.6 Word Order and Information Structure
8.6.7 The Syntax of Agreement
8.6.8 Local Anaphora
8.6.9 Grammatical Relations
8.6.10 Negation
8.7. Complex Sentences
8.7.1 Coordinating and Subordinating Constructions
8.7.2 Clause Chaining
8.7.3 Relative Clauses
8.7.4 Complement Clauses
8.7.5 Adverbial Subordination
8.7.6 Long-Distance Anaphora
8.8 Open Questions
8.8.1 The it-Cleft
8.8.2 Other Clefting
8.8.3 The Proximal Demonstrative Referring to the VIP
8.8.4 Radical P Alignment
8.8.5 Schwa-Zero Alternations
8.8.6 Second Person in Long-Distance Reflexivization
Part III: Northwest Caucasian Languages
Chapter 9: The Northwest Caucasian Languages
9.1 The Languages and Their Speakers
9.2 Phonetics and Phonology
9.2.1 Consonants
9.2.2 Vowels
9.2.3 Phonotactics and Syllable Structure
9.2.4 Stress and Prosody
9.2.5 (Morpho)phonological Processes
9.3 Lexical Classes
9.4 Nominal Morphology
9.4.1 Nominal Inflection
9.4.2 Pronominal Morphology
9.4.3 Numerals
9.4.4 Postpositions
9.5 Verbal Morphology
9.5.1 The General Outline of Verbal Morphology
9.5.2 Argument Indexing and Valency Alternations
9.5.3 Expression of Spatial Meanings
9.5.4 Finiteness
9.5.5 Tense and Aspect
9.5.5.1 Dynamic and Stative Verbs
9.5.5.2 Tense Systems
9.5.5.3 Aspectual Marking
9.5.6 Mood and Modality
9.5.7 Negation
9.6 Simple Clause
9.6.1 Structure of Noun Phrases
9.6.2 Predicate Structure
9.6.3 Valency Classes
9.6.4 Word Order
9.6.5 Agreement
9.6.6 Anaphora
9.6.7 Negation
9.6.8 Question Formation
9.7 Complex Sentence
9.7.1 Clause Chaining and Coordination
9.7.2 Relative Clauses
9.7.3 Complement Clauses
9.7.4 Adverbial Clauses
9.8 Areal and Typological Profile
9.9 Outstanding Issues
Acknowledgments
Chapter 10: Abaza and Abkhaz
10.1 The Languages and Their Speakers
10.1.1 Geographical Location and Internal Subgrouping
10.1.2 Historical Sources
10.1.3 Current Sociolinguistic Situation
10.1.4 History of Research
10.2 Phonetics and Phonology
10.2.1 Consonant Inventories
10.2.2 Vowel Inventories
10.2.3 Stress and Pitch
10.2.4 Phonotactics
10.2.5 Phonological Processes
10.3 Lexical Classes
10.3.1 Basic Lexical Classes
10.3.2 Derived Lexical Classes
10.4 Nominal Morphology
10.5 Verbal Morphology
10.5.1 Morphological Classification of Verbal Lexicon
10.5.2 Verbal Inflection
10.5.3 Agreement Features and Their Morphological Exponence
10.5.4 Indicative (Finite) Tense Forms
10.5.4.1 Stative Indicative
10.5.4.2 Dynamic Indicative
10.5.5 Non-Finite Forms
10.5.5.1 Converbs
10.5.5.2 Relatives
10.5.5.3 Content Interrogatives
10.5.5.4 Polar Interrogatives
10.5.5.5 Subordinate Non-Finite Forms
10.5.6 Other Moods and Modal Forms
10.5.6.1 Imperative
10.5.6.2 Desiderative
10.5.7 Negation and Other Clausal-Level Verb Morphology
10.5.7.1 Negation
10.5.7.2 Potential
10.5.7.3 Evidential
10.5.8 Valence-Changing Operations
10.5.8.1 Reciprocals
10.5.8.2 Causative
10.5.8.3 Applicative
10.5.8.4 Labile Verbs
10.6 Simple Clauses
10.6.1 Structure of Noun Phrases
10.6.2 Predicate Structure
10.6.3 Word Order
10.7 Complex Sentences
10.7.1 General Principles of Complex Sentence Formation
10.7.2 Relative Clauses
10.7.3 Complement Clauses
10.7.4 Adverbial Clauses
10.8 Areas of Further Research
Part IV: Kartvelian Languages
Chapter 11: Kartvelian (South Caucasian) Languages
11.1 The Languages and Their Speakers
11.2 Phonetics and Phonology
11.2.1 Consonants
11.2.2 Vowels
11.2.3 Stress and Tone
11.2.4 Phonotactics
11.2.5 Main Phonological Processes
11.3 Lexical Classes
11.4 Nominal morphology
11.4.1 Nominal Classification
11.4.2 Nominal Inflection
11.4.3 Inflectional Features
11.4.4 Personal Pronouns
11.4.5 Reflexive and Reciprocal Pronouns
11.4.6 Demonstrative Pronouns
11.4.7 Determiners
11.4.8 Adjectives and Other Nominal Modifiers
11.4.9 Numerals
11.4.10. Adpositions
11.5 Verbal Morphology
11.5.1 Morphological Classification of Verbal Lexicon
11.5.2 Verbal Inflection
11.5.3 Finiteness
11.5.4 Core Indicative Forms
11.5.5 Non-finite Forms
11.5.6 Non-indicative Forms
11.5.7 Periphrastic Forms
11.5.8 Negation
11.5.9 Valency-Changing Operations
11.5.10 Agreement
11.6 Simple Clause
11.6.1 Structure of Noun Phrases
11.6.2 Predicate (VP) Structure
11.6.3 Major Valency Classes
11.6.4 Minor Valency Classes
11.6.5 Word Order
11.6.6 Agreement
11.6.7 Local Anaphora
11.6.8 Grammatical Relations
11.6.9 Negation
11.7 Complex Sentence
11.7.1 General Remarks
11.7.2 Clause Chaining
11.7.3 Relative Clauses
11.7.4 Complement Clauses
11.7.5 Adverbial Clauses
11.7.6 Long-Distance Anaphora
11.8 Areal and Typological Profile
11.9 Outstanding Issues
Acknowledgments
Chapter 12: Megrelian
12.1 The Language and Its Speakers
12.2 State of Research
12.2.1 Brief History of Research
12.2.2 Grammatical Descriptions and Dictionaries
12.2.3 Electronic Resources and Corpora
12.3 Phonetics and Phonology
12.3.1 Segmental Units
12.3.2 Suprasegmental Units
12.3.3 Phonotactics
12.3.4 Phonological Processes
12.3.5 Phonological Word
12.4 Lexical Classes
12.4.1 Verbs
12.4.2 Nominals
12.4.3. Inflection-Free Lexical Classes
12.5 Morphology
12.6 Nominal Morphology
12.6.1 Nouns
12.6.2 Adjectives
12.6.3 Pronouns
12.6.4 Pronominals and Deixis
12.6.5 Numerals
12.7 Verbal Morphology
12.7.1 Tense-Aspect-Mood-Evidentiality
12.7.2 Person and Number
12.7.3 Preverbs
12.7.4. Non-Finite Verb Forms
12.8 Syntax
12.8.1 Noun Phrases
12.8.2 Simple Clauses
12.8.2.1 Verb Classes
12.8.2.2 Case Marking
12.8.2.3 Valency-Changing Operations
12.8.2.4 Word Order
12.8.2.5 Negation
12.8.2.6 Question Formation
12.8.3 Complex Sentences
12.8.3.1 Complement and Adverbial Clauses
12.8.3.2 Relative Clauses and Clefts
12.8.3.3 Reported Speech
12.9 Lexicon
12.9.1 Phonosemantic Component in the Lexicon
12.9.2 Animacy and Selectional Restrictions in the Lexicon
Acknowledgments
Part V: Indo-European Languages
Chapter 13: Indo-European Languages of the Caucasus
13.1 The Languages and Their Speakers
13.1.1 Geographical Location of the Family
13.1.2 Early Sources
13.1.3 Sociolinguistic Situation
13.2 Phonetics and Phonology
13.2.1 Consonant Inventories
13.2.1.1 Armenian
13.2.1.2 Ossetic
13.2.1.3 Judeo-Tat
13.2.1.4 Talyshi
13.2.2 Vowel Inventories
13.2.2.1 Armenian
13.2.2.2 Ossetic
13.2.2.3 Judeo-Tat
13.2.2.4 Talyshi
13.2.3 Stress and Tone
13.2.4 Phonotactics: Syllable Structure, Consonant Clusters
13.2.4.1 Eastern Armenian
13.2.4.2 Ossetic
13.2.4.3 Judeo-Tat
13.2.4.4 Talyshi
13.2.5 Phonological processes
13.2.5.1 Eastern Armenian
13.2.5.2 Ossetic
13.2.5.3 Judeo-Tat
13.2.5.4 Talyshi
13.3 Lexical Classes
13.4 Nominal Morphology
13.4.1 Nominal Classification
13.4.2 Nominal Inflection
13.4.2.1 Eastern Armenian
13.4.2.2 Ossetic
13.4.2.3 Judeo-Tat
13.4.2.4 Talyshi
13.4.3 Inflectional Features
13.4.3.1 Eastern Armenian
13.4.3.2 Ossetic
13.4.3.3 Judeo-Tat
13.4.3.4 Talyshi
13.4.4 Personal Pronouns
13.4.4.1 Eastern Armenian
13.4.4.2 Ossetic
13.4.4.3 Judeo-Tat
13.4.4.4 Talyshi
13.4.5 Reflexive and Reciprocal Pronouns
13.4.5.1 Eastern Armenian
13.4.5.2 Ossetic
13.4.5.3 Judeo-Tat
13.4.5.4 Talyshi
13.4.6 Demonstrative Pronouns
13.4.6.1 Eastern Armenian
13.4.6.2 Ossetic
13.4.6.3 Judeo-Tat
13.4.6.4 Talyshi
13.4.7 Determiners and Quantifiers
13.4.7.1 Eastern Armenian
13.4.7.2 Ossetic
13.4.7.3 Judeo-Tat
13.4.7.4 Talyshi
13.4.8 Adjectives and Adverbs
13.4.8.1 Eastern Armenian
13.4.8.2 Ossetic
13.4.8.3 Judeo-Tat
13.4.8.4 Talyshi
13.4.9 Numerals
13.4.10 Adpositions
13.4.10.1 Eastern Armenian
13.4.10.2 Ossetic
13.4.10.3 Judeo-Tat
13.4.10.4 Talyshi
13.4.11 Interrogatives and Indefinites
13.4.11.1 Eastern Armenian
13.4.11.2 Ossetic
13.4.11.3 Judeo-Tat
13.4.11.4 Talyshi
13.5 Verbal Morphology
13.5.1 Morphological Classification of Verbal Lexicon: Complex Verbs and Preverbal Derivates
13.5.2 General Structure of the Verbal Paradigm
13.5.3 Finiteness
13.5.4 Core Synthetic Indicative Forms
13.5.4.1 Eastern Armenian
13.5.4.2 Ossetic
13.5.4.3 Judeo-Tat
13.5.4.4 Talyshi
13.5.5 Non-finite Forms
13.5.5.1 Eastern Armenian
13.5.5.2 Ossetic
13.5.5.3 Judeo-Tat
13.5.5.4 Talyshi
13.5.6 Non-indicative Forms
13.5.6.1 Eastern Armenian
13.5.6.2 Ossetic
13.5.6.3 Judeo-Tat
13.5.6.4 Talyshi
13.5.7 Periphrastic Forms
13.5.7.1 Eastern Armenian
13.5.7.2 Ossetic
13.5.7.3 Judeo-Tat
13.5.7.4 Talyshi
13.5.8 Negation
13.5.8.1 Eastern Armenian
13.5.8.2 Ossetic
13.5.8.3 Judeo-Tat
13.5.8.4 Talyshi
13.5.9 Valency-changing operations
13.5.9.1 Eastern Armenian
13.5.9.2 Ossetic
13.5.9.3 Judeo-Tat
13.5.9.4 Talyshi
13.5.10 Agreement Features and Their Morphological Exponence
13.5.10.1 Eastern Armenian
13.5.10.2 Ossetic
13.5.10.3 Judeo-Tat
13.5.10.4 Talyshi
13.6 Simple Clause
13.6.1 Structure of Noun Phrases
13.6.1.1 Eastern Armenian
13.6.1.2 Ossetic
13.6.1.3 Judeo-Tat
13.6.1.4 Talyshi
13.6.2 Predicate (VP) Structure
13.6.2.1 Eastern Armenian
13.6.2.2 Ossetic
13.6.2.3 Judeo-Tat
13.6.2.4 Talyshi
13.6.3 Major Valency Classes
13.6.3.1 Eastern Armenian
13.6.3.2 Ossetic
13.6.3.3 Judeo-Tat
13.6.3.4. Talyshi
13.6.4 Minor Valency Classes
13.6.4.1 Eastern Armenian
13.6.4.2 Ossetic
13.6.4.3 Judeo-Tat
13.6.5 Word Order
13.6.6 Agreement
13.6.7 Local Anaphora
13.6.8 Negation
13.7 Complex Sentence
13.7.1 General Profile of Complex Sentence Formation
13.7.2 Relative Clauses
13.7.4 Complement Clauses
13.7.5 Adverbial Clauses
13.8 Outstanding Issues
13.8.1 Areal Connections and the Caucasian Sprachbund
13.8.2 Descriptive Gaps
13.8.3 Phonological issues
13.8.4 Morphosyntactic Issues
Acknowledgments
Chapter 14: Iron Ossetic
14.1 The Language and Its Speakers
14.1.1 Geographical Location of the Language, Dialects
14.1.2 Historical Sources
14.1.3 Current Sociolinguistic Situation
14.1.4 Brief History of Research, Language Materials
14.2 Phonetics and Phonology
14.2.1 Consonant Inventory
14.2.2 Vowel Inventory
14.2.3 Stress
14.2.4 Phonotactics
14.2.5 Main Phonological Processes
14.3 Lexical Classes
14.4 Nominal Morphology
14.4.1 Nominal Inflection
14.4.2 Inflectional Features
14.4.3 Personal Pronouns
14.4.4 Reflexive and Reciprocal Pronouns
14.4.5 Demonstratives
14.4.6 Determiners
14.4.7 Adjectives and Other Nominal Modifiers
14.4.8 Numerals
14.4.9 Prepositions and Postpositions
14.4.10 Indeterminates: Wh-Words and Indefinites
14.4.11 Conjunctions
14.5 Verbal Morphology
14.5.1 Morphological Classification of Verbal Lexicon
14.5.2 Verbal Inflection
14.5.3 Core Indicative Forms: Inventory of Synthetic TAM Forms
14.5.4 Non-Finite Forms
14.5.5 Non-Indicative Forms
14.5.6 Periphrastic Forms
14.5.7 Valency-Changing Operations
14.5.8 Agreement
14.6 Simple Clause
14.6.1 Noun Phrase
14.6.2 Predicate Structure
14.6.3 Major Valency Classes
14.6.4 Minor Valency Classes
14.6.5 Word Order and Main Information Structure Effects
14.6.6 Agreement
14.6.7 Local Anaphora
14.6.8 Grammatical Relations
14.6.9. Negation
14.6.10 Questions
14.6.12 Clitics
14.7 Complex Sentence
14.7.1 Complex Sentence Formation
14.7.2 Clause Chaining
14.7.3 Relative Clauses
14.7.4 Complement Clauses
14.7.5 Adverbial Clauses
14.7.6 Long-Distance Anaphora
14.8 Outstanding Issues
Acknowledgments
Part VI: Phenomena
Chapter 15: Segmental Phonetics and Phonology in Caucasian languages
15.1 Introduction
15.2 Phonemic Inventories
15.2.1 Kartvelian
15.2.1.1 Phonation
15.2.1.2 Phonetic Properties of Other Laryngeal Features
15.2.1.3 Ejective Stops and Vowel Duration
15.2.1.4 Aspiration
15.2.1.5 Megrelian, Laz, and Svan
15.2.2 Northwest Caucasian
15.2.2.1 Typologically Rare Segments
15.2.2.2 Fricatives
15.2.2.3 Vocalic Inventories
15.2.3 Nakh-Dagestanian
15.2.3.1 Laryngeal Features
15.2.3.2 Lateral Obstruents
15.2.3.3 Pharyngeal Place of Articulation
15.2.3.4 Pharyngealization as Secondary Articulation
15.3 Phonotactics
15.3.1 Georgian Clusters
15.3.2 Svan Clusters
15.3.3 Outside Kartvelian
15.4 Processes
15.4.1 Laz Identical Consonant Deletion
15.4.2 Megrelian Nasalization
15.4.3 Focus Gemination
15.4.4 Reduplication
15.4.5 Processes Targeting Word-Final Voiceless Stops
15.4.5.1 Final Gemination
15.4.5.2 Final Voicing
15.5 Conclusions and Future Directions
Acknowledgements
Chapter 16: Word Stress in Languages of the Caucasus
16.1 Introduction
16.2 Free Stress Languages
16.3 Fixed-Stress Languages
16.4 Quantity-Sensitive Stress Systems
16.4.1 Unbounded Systems
16.4.2 Stress-Window Systems
16.5 Morphologically Conditioned Stress Placement
16.5.1 Stress-Window Systems
16.5.2 Other Morphological Generalizations
16.6 More Complex Cases
16.6.1 NWC Stress
16.6.1.1 Kabardian and Adyghe
16.6.1.2 Abkhaz and Abaza
16.6.2 Variation in “Stress Strength”
16.7 Stress Systems That Require Further Study
16.7.1 Insufficient Descriptions
16.7.2 Conflicting Descriptions and Accounts
16.8 Acoustic Correlates and Instrumental Studies
16.9 Conclusion
Acknowledgments
Chapter 17: Tone and Intonation in Languages of the Caucasus
17.1 Introduction
17.2 Tonal Properties
17.2.1 Limited Tonal Systems
17.2.2 Tonal Contrasts on Stressed Syllables
17.2.3 Tonal Contrasts on Each Syllable
17.2.4 Conflicting Descriptions and Accounts
17.2.5 Stress and Changes in F0
17.3 Phrasal Prominence Languages
17.3.1 Ossetic
17.3.2 Georgian
17.3.2.1 Word Stress Approaches
17.3.2.2 Phrasal Prominence Approaches
17.3.2.3 Mixed Approaches
17.4 Phrasal Intonation
17.4.1 Non-Instrumental Observations
17.4.2 Instrumental Studies
17.4.2.1 Kabardian
17.4.2.2 Georgian
17.4.2.3 Other Languages
17.5 Conclusion
Acknowledgments
Chapter 18: Ergativity in the Caucasus
18.1 Introduction
18.2 Case and Agreement in the Languages of the Caucasus
18.2.1 Nakh-Dagestanian Languages
18.2.2 Northwest Caucasian Languages
18.2.3 Kartvelian Languages
18.3 Languages of the Caucasus: Absence of Syntactic Ergativity
18.4 Split Ergativity
18.4.1 Biabsolutive Construction in Nakh-Dagestanian
18.4.2 Ergative-Absolutive Syncretism
18.5 Split Intransitivity
18.6 Properties of the Ergative Case
18.6.1 Ergative Case and Theta-Roles
18.6.2 Raising
18.6.3 Person Interaction Phenomena
18.6.4 Is the Ergative Argument a DP or a PP?
18.6.5 Locus of Case Assignment
18.6.6 Configurational Case Assignment
18.7 Summary
Acknowledgments
Chapter 19: The Nominal Domain in Languages of the Caucasus
19.1 Introduction
19.2 NP- vs. DP-Languages
19.2.1 Abkhaz: A Language with Articles
19.2.2 Case-Based (In)definiteness: Kabardian
19.2.3 Pazar Laz: A Misfit
19.3 NP/DP Split Revisited: The Proposal
19.4 Ordering of Projections in the Nominal Domain and Left Branch Extraction
19.4.1 The Nominal Domain in Abkhaz
19.4.2 The Nominal Domain in Kabardian
19.4.3 The Nominal Domain in Pazar Laz
19.5 Possessors and Condition B and C Effects
19.6 Conclusions
Acknowledgments
Chapter 20: Agreement in Languages of the Caucasus
20.1 Introduction
20.2 Nakh-Dagestanian
20.2.1 Gender Agreement
20.2.2 Agreement in Special Syntactic Constructions
20.2.2.1 Long-Distance Agreement
20.2.2.2 Biabsolutive Constructions
20.2.3 Person Agreement
20.3 Northwest Caucasian
20.3.1 Phi-Feature Agreement
20.3.2 Wh-Agreement
20.4 Kartvelian
20.4.1 Prefixal Agreement
20.4.2 Suffixal Agreement
20.5 Conclusion and Open Questions
Appendix
Chapter 21: Binding and Indexicality in the Caucasus
21.1 Pronouns
21.1.1 Nakh-Dagestanian
21.1.1.1 Pronominal Inventory
21.1.1.2 Systems of Reflexive Marking
21.1.1.3 Case Marking and Constituent Order in Complex Reflexives and Reciprocals
21.1.1.4 Person and Gender Features in Reflexive Pronouns
21.1.2 Kartvelian
21.1.3 Northwest Caucasian
21.2 Locality Domains
21.3 Antecedents
21.3.1 Nakh-Dagestanian Languages
21.3.1.1 Complex Reflexives
21.3.1.2 Simple and Emphatic Reflexives
21.3.1.3 Non-Canonical Antecedents in Reflexive Constructions: Coreference vs. Binding
21.3.1.4 Demonstrative Pronouns
21.3.2 Kartvelian
21.3.3 Northwest Caucasian Languages
21.4 Exempt Anaphora
21.5 Non-Reflexive Functions of Reflexive Pronouns
21.6 Indexicality
21.7 Summary
Acknowledgments
Chapter 22: Correlatives in Languages of the Caucasus
22.1 Introduction
22.2 Correlatives across Caucasian Languages
22.2.1 Nakh-Dagestanian
22.2.2 Northwest Caucasian
22.2.3 Section Summary
22.3 A Semantics for Interrogative-Based Correlatives
22.3.1 Deriving Wh-Questions
22.3.2 Deriving wh-correlatives
22.3.3 Summary and a Note on Abkhaz Correlatives
22.4 Toward a Typology of Correlatives
22.4.1 The Optionality of the Demonstrative Pronoun
22.4.2 Evidence for the Proposed LF: Island Effects
22.4.3 Interrogative Morphology
22.4.4 Similarities with Conditionals
22.4.5 Summary
22.5 Conclusions
22.6 Further Readings
Acknowledgments
Chapter 23: Ellipsis in Languages of the Caucasus
23.1 Introduction
23.2 Main Types of Ellipsis
23.2.1 Ellipsis within a Noun Phrase
23.2.2 Argument Omission
23.2.3 Verb Phrase Ellipsis
23.2.3.1 Modal Complement Ellipsis
23.2.3.2 Ellipsis in Complex Predicates
23.2.4 Gapping, Pseudogapping, and Right Node Raising
23.2.4.1 Gapping
23.2.4.2 Pseudogapping
23.2.4.3 Right Node Raising
23.2.5 Ellipsis in Comparative Constructions
23.2.6 Stripping
23.2.7 Ellipses Involving Negation
23.2.8 Sluicing
23.2.8.1 Classical Sluicing
23.2.8.2 Pseudosluicing
23.2.8.3 Sluicing beyond Wh-questions
23.2.9 Fragments
23.3 Types of Ellipsis in Languages of the Caucasus
23.3.1 Ellipsis within a Noun Phrase
23.3.2 Verb Phrase Ellipsis
23.3.3 Gapping and Right Node Raising
23.3.4 Right Node Raising
23.3.5 Pseudogapping
23.3.6 Ellipsis in Comparatives
23.3.7 Ellipsis with Negation
23.3.7.1 Y/N Ellipsis
23.3.7.2 Negative-Contrast Ellipsis
23.3.8 Stripping
23.3.9 Sluicing and Its Generalizations
23.3.10 Fragment Questions and Fragment Answers
23.4. Idiosyncratic Ellipsis Types
23.4.1 Deletion of a Verb and a Case-Marked DP Head
23.4.2 Word-Internal Ellipsis
23.5 Conclusion: Overview and Theoretical Implications
Acknowledgments
Chapter 24: Information Structure in Languages of the Caucasus
24.1 Introduction
24.2 Nakh-Dagestanian Languages
24.2.1 Focus and Contrast
24.2.1.1 Constituent Order in Phrases and Clauses
24.2.1.2 Clefts and Cleft-Like Constructions
24.2.1.3 Focus-Sensitive Particles
24.2.2 Topic
24.2.3 Backgrounding
24.2.4 Givenness
24.3 Northwest Caucasian Languages
24.3.1 Focus and Contrast
24.3.1.1 Constituent Order
24.3.1.2 Clefts and Pseudo-Clefts
24.3.1.3 Focus-Sensitive Particles
24.3.3 Topics and Givenness
24.4 Kartvelian Languages
24.4.1 Focus and Contrast
24.4.1.1 Constituent Order
24.4.1.2 Cleft Constructions
24.4.1.3 Focus-Sensitive Particles
24.4.2 Topics and Givenness
24.5 Summary and Conclusion
Acknowledgements
References
Appendix I: Languages and Language Names
Appendix II: Transliteration Tables
Index

The Oxford Handbook of
LANGUAGES OF THE CAUCASUS

Edited by
MARIA POLINSKY

Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and certain other countries.

Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America.

© Oxford University Press 2020

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by license, or under terms agreed with the appropriate reproduction rights organization. Inquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above.

You must not circulate this work in any other form and you must impose this same condition on any acquirer.

Library of Congress Cataloging-in-Publication Data
Names: Polinsky, Maria, editor.
Title: The Oxford handbook of languages of the Caucasus / Maria Polinsky.
Other titles: Handbook of languages of the Caucasus
Description: Oxford, United Kingdom; New York, NY: Oxford University Press, 2020. | Series: Oxford handbooks | Includes bibliographical references and index.
Identifiers: LCCN 2020021155 (print) | LCCN 2020021156 (ebook) | ISBN 9780190690694 (hardback) | ISBN 9780190690700 (ebook other) | ISBN 9780190690717 (epub)
Subjects: LCSH: Caucasian languages. | Caucasus--Languages.
Classification: LCC PK9004 .O84 2020 (print) | LCC PK9004 (ebook) | DDC 499/.96--dc23
LC record available at https://lccn.loc.gov/2020021155
LC ebook record available at https://lccn.loc.gov/2020021156

1 3 5 7 9 8 6 4 2
Printed by Sheridan Books, Inc., United States of America

In memory of Helma van den Berg (1965–2003): The light will not go out . . .

Table of Contents

Abbreviations
Notes on Contributors
Maps
    Yuri Koryakov
    Map 1. Caucasus: Administrative division
    Map 2. Language families of the Caucasus
    Map 3. Languages of the Caucasus
    Map 4. Nakh-Dagestanian languages
Introduction
    Maria Polinsky

Part I: General Overview of the Caucasus

1. Languages and Sociolinguistics of the Caucasus
    Nina Dobrushina, Michael Daniel, and Yuri Koryakov
2. North Caucasus: Regions and Their Demography
    Konstantin Kazenin

Part II: Nakh-Dagestanian Languages

3. Nakh-Dagestanian Languages
    Dmitry Ganenkov and Timur Maisak
4. Dargwa
    Nina Sumbatova
5. Lak
    Victor A. Friedman
6. Avar
    Diana Forker
7. Archi
    Marina Chumakina
8. Chechen and Ingush
    Erwin R. Komen, Zarina Molochieva, and Johanna Nichols

Part III: Northwest Caucasian Languages

9. The Northwest Caucasian Languages
    Peter Arkadiev and Yury Lander
10. Abaza and Abkhaz
    Brian O’Herin

Part IV: Kartvelian Languages

11. Kartvelian (South Caucasian) Languages
    Yakov G. Testelets
12. Megrelian
    Alexander Rostovtsev-Popiel

Part V: Indo-European Languages

13. Indo-European Languages of the Caucasus
    Oleg Belyaev
14. Iron Ossetic
    David Erschler

Part VI: Phenomena

15. Segmental Phonetics and Phonology in Caucasian Languages
    Gašper Beguš
16. Word Stress in Languages of the Caucasus
    Lena Borise
17. Tone and Intonation in Languages of the Caucasus
    Lena Borise
18. Ergativity in the Caucasus
    Dmitry Ganenkov
19. The Nominal Domain in Languages of the Caucasus
    Balkız Öztürk and Ömer Eren
20. Agreement in Languages of the Caucasus
    Steven Foley
21. Binding and Indexicality in the Caucasus
    Dmitry Ganenkov and Natalia Bogomolova
22. Correlatives in Languages of the Caucasus
    Ömer Demirok and Balkız Öztürk
23. Ellipsis in Languages of the Caucasus
    David Erschler
24. Information Structure in Languages of the Caucasus
    Diana Forker

References
Appendix I: Languages and Language Names
Appendix II: Transliteration Tables
    Yuri Koryakov
Index

Editor

Maria Polinsky is Professor of Linguistics and Associate Director of the Language Science Center at the University of Maryland. She has done extensive primary work on several language families, in particular, on languages of the Caucasus: Nakh-Dagestanian, Northwest Caucasian, and Kartvelian. Her research emphasizes the importance of lesser-studied languages for theoretical linguistics.

Abbreviations

:   stem marker
#   number
1, 2, 3   first, second, third person
i, ii, iii, etc.   gender categories (Nakh-Dagestanian family)
a/b/c   Set A/B/C agreement affixes (Kartvelian family)
A   agent-like argument of a two-place verb
abl   ablative
abs   absolutive
abst   abstract
acc   accusative
act   active
actl   actualizing
ad   near, by (reference point/localization)
add   additive
addr   addressee-oriented
adj   adjective
adjz   adjectivizer
adv   adverb, adverbial
aff   affective case
affirm   affirmative
agr   agreement
agt   agent
al   alienable
all   allative
alv.-pal.   alveolo-palatal
alveo.   alveolar
an   action nominal
anim   animate
ant   anterior
ante   in front (reference point/localization)
aor   aorist
ap   antipassive
appl   applicative
appr   approximative

apud   near, by, at (reference point/localization)
art   article
assoc   associative
at   at (reference point/localization)
attr   attributive
aug   augment
aux   auxiliary
b   B gender (Chechen and Ingush)
ben   benefactive
bilab.   bilabial
caus   causative
cf   counterfactual
chez   adverb formative meaning ‘at the home of . . .’, ‘at . . .’s place’ in Ingush (Fr. chez)
circ   circumferential
cisl   cislocative
cl   classifier
cmpl   completive
coll   collective
comit   comitative
comp   complementizer
compr   comparative
con   conative
Con   constraint inventory
conc   concessive
cond   conditional
conj   conjunctive
cont   on a vertical surface/in an attachment configuration (reference point/localization)
contr   contrastive
coord   coordination
cop   copula
cv   characteristic vowel
cvb   converb
d   D gender (Chechen and Ingush)
dat   dative
deb   debitive
decl   declarative
def   definite
deic   deictic
dem   demonstrative
dem.addr   demonstrative ‘close to the addressee’

dem.down   demonstrative ‘lower than the speaker’s reference point’
dem.near   demonstrative ‘near the speaker’
dem.sp   demonstrative ‘close to the speaker’
dem.up   demonstrative ‘higher than the speaker’s reference point’
det   determiner
df   degrees of freedom
dir   directional, directive
dist   distal
distr   distributive
do   direct object
dom   differential object marking
dubit   dubitative
dur   durative
dyn   dynamic
EANC   Eastern Armenian National Corpus
ec   euphonic consonant
ego   egophoric
elat   elative
em   extension marker
emph   emphatic
ep   epenthetic
eq   equative
erg   ergative
ess   essive
ev   euphonic vowel
evid   evidential
evt   eventual
exc   excessive
excl   exclusive
exp   experiencer
exst   existential
ezf   ezafe
f   feminine
fact   factive
fcl   facilitive
fin   finalis
foc   focus
fut   future
FV   final voicing
gen   genitive
ger   gerund
glot.   glottal

gm   gender marker
gnt   general tense
h   human
hab   habitual
hbl   habilitative
HG   Harmonic Grammar
hor   horizon of interest
hort   hortative
iam   iamitive
ideoph   ideophone
IE   Indo-European
imp   imperative
impers   impersonal
imprf   imperfect
in   inside hollow space (reference point/localization)
inact   inactive
inal   inalienable
inanim   inanimate
inc   inceptive
inch   inchoative
incl   inclusive
ind   indicative
indef   indefinite
indir   indirect
inf   infinitive
infer   inferential
ins   instrument(al)
intens   intensive; intensifier
inter   inside a mass/solid substance, amorphous space (reference point/localization)
intj   interjection
invol   involuntative
io   indirect object
ipfv   imperfective
irr   irrealis
iter   iterative
itf   intentional future
itr   intransitive
j   J gender (Chechen and Ingush)
K   Kartvelian
l   local (1st or 2nd) person
l.-d.   labio-dental

labial.   labialized
lat   lative
lat. appr.   lateral approximant
lat. fric.   lateral fricative
lat. lab.   lateral labialized
like   similarity
lnk   linker
loc   locative
log   logophoric
ls   lexical stem (within complex verbs)
lv   light verb
m   masculine
mal   malefactive
mdt   meditative
med   medial demonstrative
mir   mirative
mnr   manner
mod   modal
ms   millisecond
msd   masdar
n   neuter
n   non-
narr   narrative
ND   Nakh-Dagestanian
NDEB   Non-Derived Environment Blocking
neg   negation, negative
nfc   non-finite conditional
nmlz   nominalizer, nominalization
nom   nominative
nondum   ‘not yet’ marker
num   number, numerative
NWC   Northwest Caucasian
obj   object
obl   oblique
OCP   Obligatory Contour Principle
opt   optative
ord   ordinal
OS   oblique stem
OT   Optimality Theory
P   patient-like argument of a two-place verb
pal.   palatal
palat.   palatalized

pass   passive
PCC   Person-Case Constraint
pfv   perfective
phar.   pharyngeal
phar./epigl.   pharyngeal/epiglottal
phar.&lab.   pharyngealized and labialized
pharyng.   pharyngealized
pl   plural
pluprf   pluperfect
pn   person-number ending
poss   possessive
post   behind (reference point/localization)
post-al.   post-alveolar
postp   postposition
pot   potential
pq   polar question
pr   possessor series of personal prefixes
pref   prefix
pret   preterite
prf   perfect
prl   prolative
prog   progressive
proh   prohibitive
prox   proximal
prp   preposition
prs   present
prvt   privative
pst   past
ptcl   particle
ptcp   participle
purp   purposive
pv   preverb
q   question
qnt   quantifier
quot   quotative
r   root
r.ext   root extension
rdp   reduplication
re   refactive
rec   reciprocal
ref   referential

refl   reflexive
rel   relative; relativizer; relative pronoun, relative pronoun affix
rep   reportative
repet   repetitive
res   resultative
rev   reversative
rq   rhetorical question
rt   pre-root vowel
S   single argument of a one-place verb
SAP   speech act participant
sbjv   subjunctive
sbst   substantivizer
sep   separation (spatial)
sg   singular
sim   simultaneous
sm   series marker
sml   similative
spress   superessive
SSP   Sonority Sequencing Principle
st   stative
sub   under (reference point/localization)
subj   subject
subord   subordinator
subst   substitutive
suff   suffix
super   above, over, on top (reference point/localization)
tag   tag question
TAM   tense, aspect, mood
TAME   tense, aspect, mood, evidentiality
temp   temporal
th   thematic element
tr   transitive
tral   translocative
trans   translative
ts   thematic suffix
uvul.   uvular
v   V gender (Chechen and Ingush)
vel.   velar
verif   verificative
vers   versionizer
vn   verbal noun

voc   vocative
VOT   voice onset time
wh   wh-agreement
whq   wh-question
wit   witnessed

Notes on Contributors

Peter Arkadiev holds a PhD in theoretical, typological, and comparative linguistics from the Russian State University for the Humanities and a Habilitation degree from the Russian Academy of Sciences. Currently he is a Senior Researcher at the Institute of Slavic Studies of the Russian Academy of Sciences and an Assistant Professor at the Russian State University for the Humanities. His fields of interest include language typology and areal linguistics, morphology, case and alignment systems, tense-aspect, Baltic languages, and Northwest Caucasian languages. He co-edited “Contemporary Approaches to Baltic Linguistics” (with Axel Holvoet and Björn Wiemer) and “Borrowed Morphology” (with Francesco Gardani and Nino Amiridze), both published by De Gruyter Mouton in 2015, and “The Complexities of Morphology” (with Francesco Gardani), published by Oxford University Press in 2020.

Gašper Beguš is an Assistant Professor of Linguistics at the University of California, Berkeley. His work focuses on experimental and computational phonology, phonetics, and historical linguistics. His PhD from Harvard University (2018) proposes a model that combines experimental phonology with statistical modeling of sound change to better understand cognitive and historical influences on phonological grammar.

Oleg Belyaev is a Lecturer in the Department of Theoretical and Applied Linguistics at Moscow State University, a Research Fellow at the Department of Linguistic Typology at the Institute for Linguistics of the Russian Academy of Sciences, and a Research Fellow at the Pushkin Russian Language Institute in Moscow. His research interests include languages of the Caucasus (especially Ossetic and Dargwa), the syntax and semantics of clause linkage, and the typology and theory of morphological case systems. He is one of the editors of the Languages of the Caucasus series at Language Science Press (http://langsci-press.org/catalog/series/loc).

Natalia Bogomolova is a PhD candidate in the Department of Linguistics at the University of Bamberg and a Research Fellow in the Department of Caucasian Languages at the Institute for Linguistics of the Russian Academy of Sciences in Moscow. She holds a PhD in Russian Literature from Moscow State University. Her work concentrates on description and theoretical analysis of Tabasaran and other Lezgic languages. She is currently interested in the syntax of ergativity, clitic doubling, binding, and indexical shift.

Lena Borise is a Research Fellow at the Research Institute for Linguistics (Budapest, Hungary). Her work focuses on the acoustic manifestation of prosodic prominence, syntactic realization of information structure, and the interaction between the two. In her PhD dissertation at Harvard University (Borise, 2019b), she investigated the prosodic and syntactic properties of focus in Georgian.

Marina Chumakina is a Research Fellow in the Surrey Morphology Group at the University of Surrey. Her work focuses on morphology and syntax of Nakh-Dagestanian languages, especially agreement. She has done extensive fieldwork on the Archi language resulting in an online Archi dictionary (Chumakina, Brown, Corbett, & Quilliam, 2007a) and a collection on Archi agreement (Bond, Corbett, Chumakina, & Brown, 2016).

Michael Daniel is a Professor at the School of Linguistics and a Research Fellow in the Linguistic Convergence Laboratory, both at the Higher School of Economics (HSE) in Moscow. He received his PhD from the Russian State University for the Humanities in 2001. His interests include fieldwork in Dagestan and typology of nominal categories.

Ömer Demirok is an Assistant Professor in the Department of Linguistics at Boğaziçi University, Istanbul. He received his PhD in Linguistics from the Massachusetts Institute of Technology in 2019. His primary research interests include syntax, formal semantics, and their interface. He has done fieldwork on the Pazar (At’ina) dialect of Laz and on heritage Georgian varieties spoken in Turkey.

Nina Dobrushina is a Professor at the School of Linguistics and Head of the Linguistic Convergence Laboratory at the Higher School of Economics (HSE) in Moscow. She graduated from Moscow State University and received her PhD from the Russian State University for the Humanities in 1995. She works on languages of Dagestan, multilingualism, typology of mood and modality, and language variation. In 2016, she published a monograph on the Russian subjunctive. She has done fieldwork on Archi, Mehweb, Rutul, and some other languages.

Ömer Eren is a PhD student in the Linguistics Department at the University of Chicago. He holds an MA from Boğaziçi University in Turkey (2017). The primary focus of his work is on morphology, syntax, and their interface in Turkic and Caucasian languages. He is mainly interested in the structure of nominals and spatial constructions.

David Erschler is a Lecturer in the Department of Foreign Literatures & Linguistics at Ben-Gurion University of the Negev. He received his PhD in Mathematics from Tel Aviv University in 2007, and a PhD in Linguistics from the University of Massachusetts at Amherst in 2018. His research interests include syntactic and morphological typology, ellipsis, fieldwork, and diachronic syntax. He has done fieldwork on Ossetic, Georgian, and Svan.

Steven Foley is a postdoctoral researcher at Princeton University. With interests in both formal syntax and psycholinguistics, he has conducted theoretical and experimental research on Georgian and on Zapotec languages. This work has touched on issues including agreement, clitic pronoun movement, and relative-clause processing.

Diana Forker is a Full Professor of Caucasus Studies at the University of Jena. Her work focuses on Caucasian languages from a functional-typological perspective, in particular morphosyntax, language contact, and sociolinguistics. She has recently published “A Grammar of Sanzhi Dargwa” (Language Science Press). Together with Oleg Belyaev and a team of experts in Caucasian languages, she is building an electronic lexical database of Caucasian languages (LexCauc).

Victor A. Friedman is Andrew W. Mellon Distinguished Service Professor Emeritus in the Humanities at the University of Chicago and Honorary Adjunct at La Trobe University. He holds a PhD in both Linguistics and in Slavic Languages and Literatures from the University of Chicago (1975). He has received the Annual Award for Outstanding Contributions to Scholarship from both the American Association of Teachers of Slavic and East European Languages (2009) and the Association for Slavic, East European, and Eurasian Studies (2014). His main research interests are all linguistic aspects relating to languages of the Balkans and the Caucasus.

Dmitry Ganenkov is a Research Fellow in the Department of English and American Studies at the Humboldt University of Berlin and a Senior Research Fellow at the Laboratory for Languages of the Caucasus at the Higher School of Economics (HSE) in Moscow. He received his PhD in Linguistics in 2005 from Moscow State University. His work focuses on documentation, description, and syntactic analysis of Nakh-Dagestanian languages. His theoretical interests include ergativity, agreement, binding, and obligatory control.

Konstantin Kazenin has a PhD in Linguistics from Moscow State University. He worked at the Department of Theoretical and Applied Linguistics of Moscow State University from 1997 to 2012, researching languages of Dagestan (Lak, Tsakhur, Bagwalal) and syntactic typology. In 2012, he joined the Russian Academy for National Economy and Public Administration, where he is the head of the Department for Regional Studies. His current research concentrates on demography (mainly fertility dynamics) in the Caucasus and Central Asia, as well as labor migration from Central Asia to Russia. He also serves as Associate Professor in the Department of Demography of the Higher School of Economics (HSE) in Moscow. He is a member of the European Association for Population Studies.

Erwin R. Komen received his PhD at the Radboud University Nijmegen, and combines his job writing software programs for humanities researchers with his work as a linguistics consultant at SIL International. His recent web application “Cesar” facilitates syntactic research in, among others, the Nijmegen parsed corpus of modern Chechen.

Yuri Koryakov is a Senior Researcher in the Department of Areal Linguistics in the Institute for Linguistics of the Russian Academy of Sciences in Moscow. His research interests include Caucasian sociolinguistics and writing systems, and, most importantly, language cartography and taxonomy. He is author or co-author of several monographs, including the “Atlas of Caucasian Languages” (2002).

Yury Lander is an Associate Professor in the School of Linguistics at the Higher School of Economics (HSE) in Moscow. He has worked on both Nakh-Dagestanian and Northwest Caucasian languages, including Udi, Dargwa, Adyghe (West Circassian), Kabardian (East Circassian), and Abaza, as well as on some Austronesian languages. He is particularly interested in various topics in syntactic and morphological typology, including relativization, possessive constructions, and polysynthesis.

Timur Maisak is a Senior Research Fellow at the Institute for Linguistics of the Russian Academy of Sciences and a Research Fellow at the Linguistic Convergence Laboratory of the Higher School of Economics (HSE) in Moscow. He obtained his PhD in Linguistics from Moscow State University in 2002. He has conducted research on Nakh-Dagestanian languages since the 1990s, mainly on the Lezgic and Andic branches of the family. His research interests include language documentation and description, typology of verbal categories, and grammaticalization theory.

Zarina Molochieva is a Lecturer at the Department for General Linguistics at the University of Kiel (Germany). She received her PhD from the University of Leipzig in 2011. Her research interests lie in verbal morphology, especially tense, aspect, and mood systems, as well as language documentation and description.

Johanna Nichols is Professor Emerita at the University of California, Berkeley. She received her PhD from the University of California, Berkeley, and has taught there for most of her career. She works on typology, historical linguistics, languages of the Caucasus (especially Ingush), and language spreads in northern Eurasia.

Brian O’Herin has conducted research on the morphology and syntax of Abaza since the 1980s, receiving his PhD in Linguistics from the University of California at Santa Cruz in 1995. He taught at Biola University before serving as the International Linguistics Coordinator for SIL International, and has recently returned to field research, focusing on syntax and text linguistics.

Balkız Öztürk is an Associate Professor of Linguistics at Boğaziçi University, Istanbul. She received her PhD from Harvard University in 2004. Her research interests include the interface between syntax, morphology, and the lexicon. She focuses on Altaic and South Caucasian languages. She is the author of the monograph “Case, Referentiality and Phrase Structure” and has co-edited the volumes “Exploring the Turkish Linguistic Landscape”, “Morphological Complexity Within and Across Boundaries”, and “Pazar Laz”.

Maria Polinsky is a Professor of Linguistics and Associate Director of the Language Science Center at the University of Maryland, College Park. Her main interests are in theoretical syntax, with an emphasis on cross-linguistic variation. She is also interested in the integration of experimental methodologies in linguistic research. She has done extensive primary work across several language families, including languages of the Caucasus.

Alexander Rostovtsev-Popiel is a Research Fellow at the Institute for Slavic, Turkic and Circum-Baltic Studies at Mainz University. He received his PhD in Linguistics, for a dissertation on grammaticalization in Kartvelian, at Frankfurt University in 2012. From 2017 to 2019, he was a Postdoctoral Researcher at Collège de France. He specializes in morphology, morphosyntax, and pragmatics of Kartvelian languages.

Nina Sumbatova is a Professor of Linguistics at the Russian State University for the Humanities in Moscow. She received her advanced degrees from the Institute for Linguistics at the Russian Academy of Sciences. Her research interests lie in the morphology and syntax of Dargwa and other Nakh-Dagestanian languages, as well as theoretical syntax and linguistic typology. She is the author of two descriptions of Dargic languages: “Itsari Dargwa” (2003, with Rasul Mutalov) and “Tanti Dargwa” (2014, with Yury Lander).

Yakov G. Testelets (Testelec) is a Professor of Linguistics at the Russian State University for the Humanities and head of the Department of Caucasian Languages in the Institute for Linguistics at the Russian Academy of Sciences. His work has focused on the grammar of languages of the Caucasus, especially Svan (Kartvelian), Adyghe (West Caucasian), Tsakhur, Bezhta, and Avar (Nakh-Dagestanian), problems of grammatical theory and typology (anaphora, linear order, grammatical asymmetries, ergativity, case), and the syntax of Russian.

Maps

Yuri Koryakov

Map 1. Caucasus: Administrative division

Map 2. Language families of the Caucasus

Map 3. Languages of the Caucasus

Map 4. Nakh-Dagestanian languages

Introduction
Maria Polinsky

I.1. Navigating the Area

The Caucasus is a relatively small land mass between two seas: the Black Sea to the west and the Caspian Sea to the east. Its northernmost area includes the Great Caucasus mountain range, and its southernmost part shares a border with Turkey and Iran. The Caucasus is separated from Russia by the Kuban and Terek Rivers in the north and is bounded by the Kura and Araxes Rivers in the south. Famous for its dizzying cultural and linguistic diversity, this small, rectangular region of mountains (including the best-known peaks, Mount Elbrus and Mount Kazbek), hills, plateaus, valleys, and meadows has long been the homeland of many ethnic groups. “The ethnic complexity of the Caucasus is unequalled in Eurasia, with nearly sixty distinct peoples, including Russians and Ukrainians” (Colarusso, 2009). Rarely does an overview fail to mention the nickname given to the Caucasus by medieval Arab historians, “a mountain of tongues” (see Catford, 1977; Chumakina, 2011a, among others).

Traditionally the Caucasus is divided into two main parts: the North Caucasus (Ciscaucasus, Ciscaucasia) and the South Caucasus (Transcaucasus, Transcaucasia). While a hundred or so languages are spoken in the Caucasus, there are three major language families that exist solely in the Caucasus and do not have any member languages outside the area (various late diasporas do not count here). These three families are considered indigenous. Sometimes, the phrase “languages of the Caucasus” or, more accurately, “Caucasian languages” refers to these languages only.1 Two of these indigenous families are found in the North Caucasus; the third is in the south. The north can be conveniently divided into the northwest, home of the Northwest Caucasian (Abkhaz-Adyghe) family, and the northeast, home of the Nakh-Dagestanian family.2 The south is where languages of the Kartvelian (South Caucasian) family are spoken.

Both the Northwest Caucasian family and the Kartvelian family are small in terms of member languages. The former consists of Abkhaz, Abaza, Kabardian and Adyghe (these two are often combined under the umbrella term “Circassian”), and Ubykh. The Kartvelian family includes Georgian, Megrelian, Laz, and Svan. The Nakh-Dagestanian family, by contrast, includes many more languages. As its name suggests, this family comprises two main branches: Nakh and Dagestanian. While the Nakh languages form a single genealogical grouping,3 the languages traditionally called Dagestanian do not; this term reflects common geography rather than early branching in the history of the family.4

Researchers looking for long-range linguistic comparisons place Kartvelian languages in the Nostratic family (Bomhard, 2008; Illich-Svitych, 1971, among others) and connect the Northwest Caucasian and Nakh-Dagestanian families to Sino-Tibetan (Nikolaev & Starostin, 1994). No matter how we look at it, the three indigenous language families do not form a genealogical unit.5 Why, then, treat them together? Bernard Comrie offers an explanation, relying on traditional training and common geography: “One reason is historical, namely that the training of specialists has tended to be across the range of Caucasian languages, even if with greater specialization in just one of the three families. This also makes sense practically, for instance in that students of these languages share certain prerequisites, such as at least a reading knowledge of Russian, often also of Georgian. But perhaps more important than this is the fact that these languages occupy a more or less contiguous geographical area at the boundary of Europe and Asia as both geographical and cultural entities, an area that is moreover surrounded by representatives of much larger language families . . .” (Comrie, 2005, p. 1).

In addition to the three indigenous families, the Caucasus is home to several languages that belong to families with wider distribution. Most notable among the Indo-European languages are Armenian and Ossetic, whose speakers have long lived in the area. Northern Kurdish and (Judeo)Tat are fading, with fewer and fewer native speakers left.6 Of the Turkic family, Azerbaijani, spoken in the south, is the largest. Other Turkic languages include Kumyk, Karachay-Balkar, and Noghay. For several other languages of the area, see chapter 1.

1 See Comrie (2005) for the terminological distinction between “languages of the Caucasus” and “Caucasian languages,” and see also chapter 1. The indigenous status of Caucasian languages does not prevent speakers of individual languages of these families from arguing with each other about who got there first. This is a difficult topic, associated with many political and cultural issues, often confounded by a lack of clear historical data. Since this Handbook focuses on the linguistic richness of the area in modern times, it does not include any discussion of territorial origins or genetics. Genetic investigations addressing the migration history in the area have appeared in the last decade (Balanovsky et al., 2011; Karafet et al., 2016; Wang et al., 2018), but more work remains to be done. Of resources in English, see King (2008) and Forsyth (2013) for the history of the region and Rayfield (2012) for the history of the South Caucasus, with further references therein.
2 Here and below, I will be using the most common names of language families and individual languages. For alternative names (of which there are many), see chapter 1 and appendix I.
3 See chapters 3 and 8.
4 See chapters 1 and 3 for more discussion.
5 See also chapter 1.
6 See chapter 13.

The maps included with this Handbook show the main administrative divisions in the area, the distribution of the main families, and a more detailed distribution of languages within these families.

In an area as compact and densely populated as the Caucasus, multilingualism is more a norm than an exception, and research on language contact among languages of the area has always been very productive. At some point, researchers were even tempted to propose the concept of the Caucasian Sprachbund (Chirikba, 2008b; Klimov, 1978; Klimov & Alekseev, 1980; but see Tuite, 1999, for arguments against this approach). The main trends in multilingualism and contact in the Caucasus are discussed in chapter 1, with further references on this topic.

Aside from the many local languages in contact, several other languages have been present in the region, too, by virtue of geography and politics. Located at the peripheries of Turkey, Iran, and Russia, and literally at the crossroads of Europe and Asia, the Caucasus has long been an arena for expansionism and political, military, religious, and cultural rivalries. Until the end of the 18th century, the area was first aligned, politically and culturally, with the Arab world, and later with the Persian and Ottoman Empires. The languages associated with these outside forces left a strong mark within the Caucasus, to the point that numerous Arabic, Turkic, and Iranian (Iranic) borrowings remain throughout the languages of the region.7 Many words of Middle Eastern origin show up in all of these languages, and it is not always easy to determine whether a given loanword comes directly from Arabic, Turkish (or other Turkic languages), Persian, or another Iranian language, or whether it traveled from one of these outsider languages to another and only later to a particular Caucasian language. The literature on loanwords from Arabic, Turkish, and Iranian languages in Kartvelian languages is quite substantial (Fähnrich, 2007; Gippert, 1990; Klimov, 1998a, and references therein). For loans from Northwest Caucasian into Kartvelian, see Chirikba (1998, 2006) and references therein, and for Nakh-Dagestanian loans in Kartvelian, see Fähnrich (1988, 2007). Studies of Arabic, Turkic, and Iranian loanwords in languages of the North Caucasus are also popular in the local philological tradition. For monographic descriptions of such borrowings into Nakh-Dagestanian languages, see Dzhidalaev (1990), Selimov (2010), Zabitov (2001), and Zabitov and Èfendiev (2001); these studies include many further references.

7 Loans from Turkish dominate Turkic borrowings. Among Iranian borrowings, Persian loans are most noticeable. Throughout this Handbook, references to Turkic/Turkish and Persian/Iranian can be found interchangeably.

Yet another outside language has maintained a formidable and vigorous presence in the region since the 19th century: Russian. At the beginning of the 19th century, the Caucasus was annexed by the Russian Empire (see Baddeley, 1908; Potto, 1887–1889, for the history of the Russian invasion and subsequent annexation). The Russian conquest of the Caucasus was not unlike the settlement of New Zealand by the British or the conquest of the Sahara by the French. The remote, strange, and, at times, bleak landscape seemed squalid and uninhabitable; both its climate and its horticulture were entirely foreign. The steep mountains did not appeal to the Russian peasant farmers, who were more interested in the rich fields and forests of Siberia. Promises of natural resources and salt mines were played up by the locals, but those remained unfulfilled. And in 1801, oil drilling was not a lucrative undertaking. Instead, this alien terrain attracted vagabonds, criminals, and romantic literati who marveled at the exotic locale. The rest of the Russian settlers were moved forcibly, often as part of army divisions.

Despite reservations, the Russian Empire was drawn to the Caucasus for two reasons. First, the tsars were trying to establish a reliable border with Iran and Turkey, one that they could hold steady. In this regard, the South Caucasus was the real prize, whereas the North Caucasus was viewed as more of a nuisance: the price that had to be paid in order to create a Russian presence at the Iranian and Turkish borders. Second, as a strong Christian nation which considered itself a direct descendant of Byzantium, the Russian Empire sought to protect Christians in the Caucasus, such as Georgians and Armenians (and the less numerous Greeks). For their part, the Georgians and Armenians in the South Caucasus were also looking to align themselves with the Russians for religious reasons, as they were worried that an alliance with the Persians or the Ottoman Empire would force them into Islam. With a heavy heart, the Georgian Bagrationi dynasty accepted the inclusion of their lands in the Russian empire as the lesser of two evils.8

The time that has passed since Russia’s conquest of the Caucasus has not been easy. Following the disintegration of the Soviet Union in 1991, periods of independence have been punctuated by vicious military fighting, such as a series of brutal Chechen wars (see German, 2003, and references therein; see also chapter 2) and the Russo-Georgian war of 2008.
Political and military turmoil aside, the linguistic presence of Russian has remained significant throughout the area since the 19th century, especially in the North Caucasus where Russian has displaced a dozen or so local languages that used to be linguae francae, becoming the main common language (see chapter 1). Russian is “considered by many not to be a truly ‘foreign’ language (like French, German or English), but rather a sort of second native language (regardless of how well they actually spoke it)” (Blauvelt, 2013, p. 3). The role played by Russian is evident from the local migration patterns. As soon as speakers of a local language move to a more urban setting (which is often associated with migration from the highlands to the multiethnic lowlands), Russian becomes dominant. This ongoing switch to Russian has consequences both for Russian and for local languages. First, as Russian remains a prestigious, important language in the area, one associated with upward mobility, local varieties of Russian emerge (Belikov, 2011; Daniel, Dobrushina, & Knyazev, 2010; chapter 1 of this volume). In the Soviet days, such varieties of Russian were mostly ignored and considered substandard; current work on these varieties is in its early stages, and they need to be investigated more.

8 Although the Orthodox Christianity shared by the Georgians and Russians was important in the dialogue between the two nations, Georgian kings also pursued the option of aligning with the Catholic Church (Lang, 1957).

Introduction 5 Second, despite the fact that many censuses indicate large numbers of speakers for certain languages (see chapters 1 and 2 in this Handbook), a significant proportion is represented by semi-speakers or heritage speakers: recessive bilinguals who are more dominant in Russian. Furthermore, quite a few groups in the Caucasus identify themselves based on ethnicity and may state that they speak the corresponding language, when really, they only know a few words (see chapters 1 and 2).9 The growing dominance of Russian underscores the urgency of studying the languages of the northern Caucasus; the often-misleading numbers of speakers of a given language may give researchers the sense of false comfort concerning linguistic vitality. Though Russian has supplanted several local languages that used to be widely spoken, at least two languages, Georgian and Armenian, have withstood its pressure. Their endurance in the Russian Empire, and later in the Soviet Union, can be explained in part by the long-standing literary traditions in both languages, not to mention the sheer number of speakers for each. Both the Armenian and Georgian scripts go back to the 5th century (their origins are a point of contention), and medieval chronicles in both languages date back to the 9th century. There is a tremendous body of literature in both languages, which forms a common cultural background for the populations, who have an extremely high literacy rate. In the Soviet Empire, the constitutions of the local republics provided for the use of the titular (local) language and Russian, although Russian was tacitly assumed to be the more important, more prestigious language (Blauvelt, 2013; Slezkine, 1994). The Soviet “ethnophilia” of the 1920s, in which all minority languages and ethnicities were supported, yielded to the policies of the mid-1930s, which supported larger nationalities, especially ones that had titular republics within the Soviet Union. 
Georgian and Armenian benefited significantly in both periods, becoming the languages of state bureaucracy (Blauvelt, 2014). Around the mid-1930s, the central Soviet government decided that Georgia and Armenia would serve as the model “advanced republics” of the Union. As a result, their languages, cultures, and what was called “ethnogenesis” became the focus of all republican academic institutions created by the party and state—including unions of writers, institutes of history, ethnography, literature, archaeology, and so on. This special status played out in many ways. One example can be traced back to the late 1930s, when Georgian and Armenian were able to retain their traditional scripts (granted, they had had these traditional scripts for centuries, as mentioned above). Republican languages that did not have traditional writing systems, but rather, Latin-based orthographies developed in the 1920s, were all required to use the Cyrillic script in the late 1930s (the Azerbaijanis switched back to the Latin orthography in the 1990s, after the fall of the Soviet Union). At the same time, the languages of the minority groups in Georgia and Armenia (Abkhaz and Ossetic in Georgia, Kurdish in Armenia) switched to Georgian and Armenian scripts, respectively.10 9 While this tendency is often noted, the actual numbers of semi-speakers or non-speakers who self-identify with a given group are not known. 10 Ossetic is particularly telling in that regard: in North Ossetia, the writing system was switched to Cyrillic, and in South Ossetia, to a Kartvelian script. (See chapter 1 for a more general discussion of the writing systems used in the area.)

6 Maria Polinsky Georgians were unique in openly protesting against the spread of Russian as the Soviet government attempted to change the constitutional status of languages in Georgia, particularly in 1978. The protestors even disregarded the Soviet regime’s oppressive policies on demonstrations (Cornell, 2001). Thus, despite the strong Russification of the Soviet empire in the last several decades, state support for titular languages and institutions continued, creating a kind of paradox wherein official scholarly institutes became bastions of national projects. Even though their allegiance to their own language was unshakeable, the Georgians did not have second thoughts about the subjugation of more minor Kartvelian languages (Laz, Svan, Megrelian), Abkhaz (spoken in the contested Georgian territory), and Ossetic (spoken in another contested Georgian territory), having even fought off official support for the recognition of peoples they considered to be their own ethnic subgroups (Blauvelt, 2014). Nor did the Armenians worry much about the fate of Neo-Aramaic (Assyrian) or Northern Kurdish (spoken by the Yazidi population) in their country. Russian pushed out minority languages in the North Caucasus, but Georgian and Armenian did the same in their respective domains, too.

I.2. A Linguistic Snapshot of the Caucasus

Since the languages spoken in the Caucasus are diverse and varied, sweeping generalizations about their design are often superficial and incomplete. All of the region’s major language families are known for striking characteristics that receive too much attention, often becoming distorted in the process. Mention Circassian or Kabardian and a likely reaction is that these languages have no vowels, a misinterpretation of the claim that the vowels are fully predictable and, therefore, should not be counted as part of the phonemic inventory (see Catford, 1994, 1997; Kumakhov, 1977, and chapter 15 for a discussion). Languages of Dagestan are best known for their prolific use of case forms (which are, in fact, spatial forms of nouns with incorporated postpositions, see chapter 3; Comrie & Polinsky, 1998) or for their gender oppositions, which are more complex than the usual masculine-feminine distinction.11 Kartvelian languages are famous for their consonant clusters and complex verb forms, often with different argument alignment depending on the tense, aspect, and presence of additional affixes, such as applicatives, in the verb. This Handbook intends to show the genuine complexity and diversity in the Caucasus with the goal of shifting researchers’ attention away from the few catchy, Guinness-World-Record-type properties, which are much less exotic than they may seem from the outside.

11 There may be three to eight classes depending on the language. See Corbett (1991), and chapters 3, 8, and 20 of this volume.

Introduction 7 Undeniably, the Caucasus is a phonetician’s paradise. Most indigenous languages of the Caucasus have rich consonant systems with three-way distinctions in the laryngeal features of obstruents that include ejective consonants, as well as a rich inventory of post-uvular articulation, especially in Nakh-Dagestanian. Gašper Beguš (chapter 15) provides a detailed account of the main phonetic and phonological properties that characterize the three major families. As proposed by some researchers, the consistent presence of ejectives may constitute an areal feature (Catford, 1977); beyond the three indigenous families, ejectives are found in Ossetic (see chapters 13 and 14), Neo-Aramaic, as well as in some dialects of Kumyk, Azerbaijani, and Armenian (Chirikba, 2008b, p. 44; Maddieson, 2013). This spread is typically accounted for by the influence from the indigenous languages or the substrate. I have already mentioned the extensive borrowings from Turkic languages, Iranian languages, and Arabic in languages of the Caucasus. Although borrowings are found in most of the world’s languages, the pattern employed by the languages of the Caucasus deserves special mention due to its consistency. Words that relate to politics, religion, some professional names, and even some everyday items are among common borrowings. Furthermore, these words are often so tightly integrated into the lexical systems of the languages that it is hard to identify them as loanwords. The spread of Russian has resulted in a great number of Russian borrowings, as well as the integration of international lexica that arrived via Russian. Borrowings often bear a distinctive phonetic signature, for example, with voiceless stops represented by ejectives in Kartvelian, some Nakh-Dagestanian languages, and Armenian, as in Georgian p’rop’aganda ‘propaganda’, lep’t’op’i ‘laptop’, Avar q’alam ‘pencil’, Hinuq mark’a ‘stamp’,12 Mehweb Dargwa k’amp’it’ ‘candy’, and so on. 
Systematic comparative work on phonetic features of loanwords in the Caucasus is still outstanding. Most languages of the area are head-final: they have postpositions rather than prepositions, and non-finite clauses are predicate-final.13 At least one language of the area should be described as having SOV word order and no case marking on noun phrases: Abkhaz (Hewitt, 1979a). The absence of case-marking is typically correlated with verb-medial orders (SVO), and Greenberg’s Universal 41 specifically states that, “if in a language the verb follows both the nominal subject and nominal object as the dominant order, the language almost always has a case system” (Greenberg, 1963, p. 75). Thus, Abkhaz is relatively unusual in that regard.14 In languages of the area, the word order at the main clause level is usually less rigid, and although verb-initial orders are less common, verb-final and verb-medial orders are typical, as shown in example (1). In quite a few languages, the immediate preverbal position is dedicated to focus constituents; this is a recurrent theme in several descriptive chapters and in Diana Forker’s chapter on information structure (chapter 24). A rich 12 In Tsezic languages, borrowings from Russian only show the ejective k’ (Comrie & Khalilov, 2009b). 13 But see chapter 13, on prepositions in Indo-European languages of the area. 14 Combining the features “SOV order” and “no case marking” yields 18 languages of 565 instances of SOV listed by Dryer (2013b) in the World Atlas of Language Structures.

postverbal periphery (often referred to as the right periphery) is commonly used for encoding various types of backgrounded or newsworthy information, and in that regard, languages of the Caucasus await comparisons with Hindi-Urdu or Turkish, where the syntax of the right periphery has been investigated (see Kural, 1997; Manetta, 2011, among others). A hallmark of head-final languages is the complex predicate, formed from a lexical component and a light verb such as ‘be’ (for intransitives) and ‘do’ (for transitives); such predicates are very common throughout the area. In languages of the North Caucasus, we find a clean distinction between clause-medial (non-finite, converbal) forms and the single finite predicate of a complex sentence with which they co-occur. Consider this long example from Agul (Nakh-Dagestanian), where the only finite predicate is the copular form x-a-j-e, itself built on a converb.

(1) Agul
peʡ ud-u-na, mertː aq’-u-na iǯi-di, fajš-u-na,
chicken.abs tear-pfv-cvb clean do-pf-cvb good-adv bring-pfv-cvb
ha-te ʜüjeg-i-ʕ ʕix-a-s bašlamiš aq’-u-guna kitan
emph-dem.dist pot-obl-inter inter-put-inf begin do-pfv-cvb cat.abs
x-a-j-e me peʡ-ela-k-as.
become-ipfv-cvb-cop dem.prox chicken-obl-sub.cont-elat
‘They pluck the chicken, clean it up really well and bring it over, but when they are ready to put it in the pot, the chicken turns into a cat.’

Is this head-final structural design special to the Caucasus? Probably not. Head-final languages dominate the global linguistic landscape. For instance, all over South Asia, Indo-Aryan and Dravidian languages manifest a similar pattern of head-finality, with participial or converbal clauses dependent on the sole finite predicate. Languages of the Caucasus share non-rigid, head-final properties, including the extended right periphery, with the neighboring Persian and Turkish.
It may well be that all of these languages have the most insipid word order and, therefore, areal features should not be held responsible for the apparent uniformity. All things being equal, one would expect to find the predominance of suffixal morphology in a head-final language. And while suffixation is common across languages of the area, agreement exponents appear before the verbal root in most languages of the three Caucasian families. In Northwest Caucasian and Kartvelian, these exponents index person and number15; in Nakh-Dagestanian languages, it is primarily gender and number (see chapter 20). Elements that index person, number, or gender do not have the same categorial status in all the languages of the area. Furthermore, for most languages of the area, whether these elements are morphological prefixes or clitics has yet to be determined. Distinguishing between agreement affixes and clitics is not an easy 15 Abkhaz also has gender agreement, also marked before the verb root (Hewitt, 1979a, pp. 103–125; Shaduri, 2006).

Introduction 9 task, but an important one, as this differentiation leads to a better understanding of agreement phenomena in languages of the Caucasus, as well as the order of constituents in the verbal complex, and the nature of ergativity—the feature that I will take up next. Most languages of the area are ergative and lack passive voice constructions, the latter gap a common, albeit not necessary, corollary to ergativity (see Kazenin, 2001c, for a discussion of this commonly assumed correlation). Ergativity is clearly present in the three indigenous families, yet that superficial parallel is where the similarities end (Catford, 1974; Tuite, 1999; and chapter 18 of this volume). Nakh-Dagestanian languages are consistently ergative, in terms of both their case marking and the agreement with the absolutive in gender (noun class). Their ergativity is purely morphological, it has no syntactic consequences; all types of arguments, regardless of case marking and agreement, can undergo extraction, leaving a gap in the base position. Ergativity is different in Northwest Caucasian languages. In those languages of the family that have overt case marking, noun phrases are marked for absolutive and ergative, and the ergative coincides with the generalized oblique marker (some researchers argue that it is a single marker). Agreement is with the ergative and with the absolutive, in person and number (gender is present in some but not all languages of the family). The pattern of extraction is different from Nakh-Dagestanian and Kartvelian; in Northwest Caucasian languages, only absolutive arguments can undergo extraction with a gap and no change in the verb form. That characterizes them as syntactically ergative—unlike languages of the other two families. Finally, in Kartvelian, the ergative appears only in a subset of tense-aspect-mood forms (in Georgian, in the aorist-optative group of TAM forms; see Nash, 2017b, for an analysis). 
And Kartvelian agreement, famous in its own right for its remarkable complexity, follows the nominative-accusative pattern and tracks only person and number features (see chapter 20).16 Kartvelian ergativity is thus quite different from the more familiar patterns (of which Nakh-Dagestanian ergativity is probably the textbook case), and some researchers classify Kartvelian languages as having active-inactive rather than ergative case alignment, although the reasons for such an analysis may differ (Harris, 1981; Hewitt, 1987a; Klimov, 1973; and see footnote 16). The main argument for classifying these languages as active-inactive has to do with a large number of verbs that can traditionally be thought of as intransitive (‘dance’, ‘scream’, ‘yawn’) which however have their sole argument marked the same way as a regular transitive subject; in the meantime, the more patient-like arguments of intransitive verbs are marked as transitive objects. This approach, which is more valid for the languages of the family other than Georgian, is reflected in the survey chapter on Kartvelian (chapter 11); but see chapter 18 where these languages are viewed as pretty much middle-of-the-road split-ergative. Clearly the final word on this issue is still to come, and if we want to go beyond just 16 Using more idiosyncratic criteria, Klimov and Alekseev (1980) examine ergativity in all three families and conclude that the Northwest Caucasian languages are the most prototypically ergative, Nakh-Dagestanian languages have elements of nominative-accusative strategies, and Kartvelian languages represent a combination of active, ergative, and nominative types.

naming a particular pattern, it is important to operationalize the criteria which define an alignment as ergative-absolutive or active-inactive.

The majority of languages in the Caucasus also have extensive pro-drop. Unlike in the better-known pro-drop languages, not only subjects but also direct objects and other non-subject arguments in Caucasian languages can be freely omitted as long as they are recoverable from discourse. It is common to associate pro-drop with rich agreement, and though many languages of the area may have rich agreement (as I mentioned earlier, it is not always clear whether this is agreement or cliticization), pro-drop is also present in languages that lack agreement, such as Lezgian or Agul. Although pro-drop in languages of the Caucasus has been documented (it is hard to miss!), it has not been fully explored yet. Meanwhile, there are at least two main directions of future research on the nature of pro-drop in languages of the Caucasus. The first one has to do with licensing mechanisms and identification of the null pronominal. Is it due to rich agreement (in other words, are these languages akin to Romance with regard to pro-drop; see Rizzi, 1986), or are the null pronominals identified by their association with a discourse topic, in a pattern similar to the one claimed for Chinese (see Huang, 1989, 1991)? The second avenue of research involves patterns of pronominal reference and resolution. Such patterns have been studied in the more familiar Romance languages, where only subjects can be deleted. For Romance, researchers have proposed that null pronouns are preferentially linked to subject antecedents and overt pronouns to antecedents in lower structural positions (Carminati, 2005). Thus, in the Spanish example in (2), the null pronoun in the second clause is preferentially interpreted as referring to the subject, and the overt pronoun él as referring to the object:

(2) Spanish
a. Juan_i pegó a Pedro_k. pro_i>k está enfadado.
   Juan hit prp Pedro be.prs.3sg angry.m
b. Juan_i pegó a Pedro_k. Él_k>i está enfadado.
   Juan hit prp Pedro he be.prs.3sg angry.m
‘Juan hit Pedro. He is angry.’ (Keating, Jegerski, & Van Patten, 2016, p. 38)

Since all arguments can be dropped in languages of the Caucasus, what strategies of pronominal reference can we expect? Consider the following example, wherein both the subject and the object are dropped in the second clause, and the clause is ambiguous. So far there has not been any work on strategies of pronominal reference in the Caucasus, and this line of research is promising in that it can bring together issues in theoretical syntax and sentence processing.

(3) Georgian
sap’rezident’o debat’-eb-ši beridze_i-m gelašvili_k uk’mexad ga-a-k’rit’ik’-a.
presidential debate-pl-loc Beridze-erg Gelashvili.nom harshly pv-vers-criticize-aor.3sg.3sg
amit’om pro_1SG pro_i/k ar a-v-i-rčev.
because.of.that neg pv-1sg-vers-choose.fut
‘At the presidential debates, Beridze_i harshly criticized Gelashvili_k. For that reason, I won’t vote for him_i/k.’

In a number of languages of the area, quantifier phrases are built on uniform indeterminate bases (either full words or stems) that are invariable across different categories, a paradigm that is familiar from Japanese (Haspelmath, 1997; Kuroda, 1965; Nishigauchi, 1990; Shimoyama, 2006). These indeterminate bases combine with an additional morphological exponent (which is typically analyzed as encoding a semantic operator). Depending on the exponent they combine with (including the null one), indeterminate phrases can take on a number of interpretations: interrogative, existential, universal, comparative, negative, negative-polarity, free-choice, and so on. Usually the bare forms have the interrogative interpretation. Consider the following paradigm from Svan (David Erschler, personal communication):

Table I.1 Indeterminate Expressions in Svan

          interrogative   existential     n-words
person    jær             erwaːle         dær
thing     mæj             maːle/moːle     maːmgweš/demgwaš
place     ime             imwaːle         deme
time      šoma            šomwaːle        demčik

Unified or close to unified paradigms of indeterminate expressions are found in most Nakh-Dagestanian languages (see Tatevosov, 2002, for Godoberi, Lak, and Tsaxur; Kibrik, Kazenin, Lyutikova, & Tatevosov, 2001, pp. 165–167 for Bagvalal; Polinsky, 2015b, for Tsez) and in Armenian and Ossetic (Haspelmath, 1997, pp. 281–282). Kartvelian languages have a mostly uniform paradigm for interrogative, existential, negative, and free-choice expressions, but their universal pronouns often have different forms. Northwest Caucasian languages have a partially unified paradigm, with universal and free-choice expressions derived from interrogatives (Nikolaeva, 2012). Indeterminate expressions raise a number of important questions with respect to quantification, syntactic displacement, or focus, and the addition of Caucasian language data to the growing body of research on syntax and semantics of these expressions holds a great deal of promise.

Moving on to morphology, most Caucasian languages are agglutinative; that much can be deduced from the examples presented so far. Northwest Caucasian and Kartvelian languages are characterized by long verb forms that include multiple indexing of person and number of participants, aspect, Aktionsart, and applicative verbal affixes. Such complexity of verb forms, coupled with extensive pro-drop, has led researchers to characterize Northwest Caucasian and Kartvelian languages as polysynthetic (Testelets, 2009a; Wier, 2011).17

Indexical shift is another structural phenomenon common to the area. Indexicals are expressions that depend on the context of utterance (e.g., I, you, now, here, tomorrow). Traditional accounts of indexicals assume that their referents are fixed regardless of the syntactic environments they are used in. On such accounts, indexicals always refer to the actual context of utterance (Kaplan, 1989; Sudo, 2012). Over the last two decades, researchers have shown that in a number of languages, indexicals may be interpreted in the context of the utterance (the direct reading), or in relation to the reported context (the shifted reading). In the Georgian example in (4), the first person pronoun is ambiguous; it can refer either to the speaker or to Nino. Referring to the speaker, the indexical receives its standard, unshifted interpretation based on the actual context of the utterance. Referring to Nino, the same expression is interpreted in the context of the report.

(4) Georgian
nino-m tkv-a (rom) xval mo-val-o.
Nino-erg say-aor.3sg that tomorrow pv-go.fut.1sg-quot
‘Nino said that I [=the speaker] will come tomorrow.’
‘Nino said that she will come tomorrow.’

Aside from Georgian, indexical shift has been observed in Svan and Laz (Demirok & Öztürk, 2015; see also chapter 21). It is widely attested in Nakh-Dagestanian (chapter 3; chapter 21; Polinsky, 2015a) and may also exist in Northwest Caucasian languages (Ershova, 2013). Because of this widespread presence, the Caucasus is a promising area for studying indexical shift. However, as with word order or complex consonantal systems, indexical shift is unlikely to be specific to the Caucasus.
Kaplan described shifted indexicals as monsters; once the first monsters were uncovered (Schlenker, 1999, 2003), many more were found all over the world (see Deal, 2018, for a recent tally). So far, the data presented in this section make us think that parallels and similarities across different families in the Caucasus are more or less accidental. The reasons for this may be twofold: first, the languages are indeed diverse and share little beyond basic properties (pro-drop, head-finality); and second, the level of comparison is too coarse-grained, and the features we examine may need to be refined. Below, in no particular order, are some less general properties that appear across the languages of the major families with some recurrence. The list is not exhaustive; rather, it is the beginning of a tally which will hopefully grow as we learn more about the languages of the area.

17 Much in that characterization depends on the criterial properties of a polysynthetic language (see Baker, 1996, for an extensive list): is the indexing of arguments on the verb and extensive pro-drop enough? Is noun incorporation a necessary condition? Answers may be pending, but the characterization of Northwest Caucasian and Kartvelian languages as polysynthetic has thus far led to interesting comparisons of these languages to such polysynthetic exemplars as Salish, Iroquoian, or Algonquian (Testelets, 2009a; Lander & Testelets, 2017).

Furthermore, as with all overviews, certain things have been omitted. For more on the features shared across languages of the Caucasus, see Chirikba (2008b), Klimov (1978), and further references therein.18

A morphological optative (the modal form that expresses wishes, desires, potentialities, or hopes) is found in almost all of the area’s languages. Example (5) uses Ancient Greek to illustrate another common property of morphological optatives: co-occurrence with a particular tense-aspect, in this case, aorist:

(5) Ancient Greek
génoitó moi katà tò rhêmá sou.
happen.opt.aor 1sg.dat according det word 2sg.poss
‘May it happen to me according to your word.’

Optative meaning can be expressed by a number of constructions, but the use of dedicated morphology to do so is quite rare. In the Caucasus, however, morphological optatives are extremely widespread (Chirikba, 2008b; Dobrushina, 2011b; Dobrushina, van der Auwera, & Goussev, 2013).19 Consider examples from the three indigenous families, as well as some other languages of the area (and see Dobrushina, 2011b, for more examples from the Nakh-Dagestanian family):

(6) a. Adyghe
qə-š’-ere-č̥’əx qeʁaʒ̬ĕ-xe-r.
dir-loc-opt-grow flower-pl-abs
‘Let flowers grow here!’ (Kuznetsova, 2009, p. 291)

b. Georgian
man unda gadac’eros es c’eril-i.
3sg.erg mod 3sg.copy.opt dem letter-nom
‘He needs to copy this letter.’ (Cherchi, 1997, p. 260)

c. Lezgian
wa-z allah-di hamišan üsret gu-raj.
2sg-dat Allah-erg always help give-opt
‘May God always help you.’ (Haspelmath, 1993, p. 151)

d. Kumyk
tez jaz bol-ʁaj e-di.
soon summer be-opt aux-pst
‘I wish summer would come soon.’ (Dobrushina, 2011b, p. 104)

e. Judeo-Tat
soχ-o-m.
do-opt-1sg.pst
‘Let me do it!’ (chapter 13, p. 610)

18 See also chapters 1 and 3 for a discussion of properties shared across Nakh-Dagestanian languages. 19 Chirikba (2008b, p. 52) refers to this category as the “potential.”

Another common property of languages of the Caucasus has to do with vestiges of a vigesimal counting system found across all three families (Klimov, 1978, pp. 20–21). Comrie (2013) shows that languages of all three indigenous families have a hybrid decimal-vigesimal system in which “the numbers up to 99 are expressed vigesimally, but the system then shifts to being decimal for the expression of the hundreds, so that one ends up with expressions of the type x100 + y20 + z.” Given the intensive contact in the area, this is not surprising: the counting systems were shared and could spread from one group to the others.

Unusual argument mapping of objects in a subset of transitive verbs that denote physical contact is another recurrent feature in at least Nakh-Dagestanian and Kartvelian. The verbs in question most commonly include ‘hit’, ‘shoot’, ‘touch’, ‘kiss’, ‘wipe’, ‘comb’, ‘paint’, and ‘stab’. They presuppose an object that is affected by the action, and the medium (instrument) of the respective action. In more familiar languages, the entity undergoing such eventualities is expressed as a direct object, and the medium/instrument, if expressed at all, is in an oblique form. Yet in Nakh-Dagestanian and Kartvelian languages, the mapping of non-subject arguments appears reversed: the instrument of the action is expressed as a direct object, and the undergoer appears in the dative or locative form (Klimov, 1978, pp. 58–59).20 For example:

(7) Georgian
gogo-m k’at’a-s (top-i) esrola.
girl-erg cat-dat gun-nom throw.aor.3sg
‘The girl shot (lit. threw the rifle to/at) the cat.’

(8) Tsez
čanaqan-ä zey-qo (tupi) caƛi-n.
hunter-erg bear-poss.ess rifle.abs.iv throw-pst.nwit
‘The hunter shot (lit. threw the rifle at) the bear.’

Since the expression of the instrument/medium can be omitted, one could form an impression that such verbs are somehow special, missing a direct object entirely, which they are not.

20 Klimov (1978, p. 59) suggests that the same unusual mapping is found in Northwest Caucasian languages, but this observation is not supported by the empirical data. The examples listed in Klimov (1978) represent intransitive verbs whose subject is in the absolutive, whose undergoer is expressed as an indirect object, and whose instrument appears either in the instrumental form or as another indirect object. For example, in (i), the subject is in the absolutive, and the agreement on the verb reflects an intransitive pattern; the instrument is expressed by a PP (č’e is the instrumental postposition that requires an oblique complement), and the notional object is in the oblique form:

(i) Adyghe
cwəweč’ə-m-č’e cwəwe-r cwə-me ja-we.
rod-obl-ins whacker-abs bull-pl.obl 3pl.io+obl+dyn-beat.prs
‘The whacker is racing the bullocks with a whip.’ (Arkadiev, Lander, Letuchiy, Sumbatova, & Testelets, 2009, p. 54, glosses modified from the original)
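The hybrid decimal-vigesimal pattern quoted above from Comrie (2013), x100 + y20 + z, can be made concrete with a little arithmetic. What follows is only a minimal sketch in Python: the function names `hybrid_vigesimal` and `describe` are hypothetical and introduced here purely for illustration, and the code models the arithmetic of the pattern, not the numeral morphology of any particular language of the area.

```python
from typing import Tuple


def hybrid_vigesimal(n: int) -> Tuple[int, int, int]:
    """Split n into (x, y, z) such that n == x*100 + y*20 + z.

    Hundreds are counted decimally (x); the remainder below 100 is
    counted in twenties (y) plus a rest z, with 0 <= y < 5 and 0 <= z < 20.
    Illustrative assumption: n is a non-negative integer.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    x, below_hundred = divmod(n, 100)   # decimal part: hundreds
    y, z = divmod(below_hundred, 20)    # vigesimal part: twenties and rest
    return x, y, z


def describe(n: int) -> str:
    """Render the decomposition in the 'x100 + y20 + z' style of the quotation."""
    x, y, z = hybrid_vigesimal(n)
    parts = []
    if x:
        parts.append(f"{x}x100")
    if y:
        parts.append(f"{y}x20")
    if z or not parts:                  # keep a bare 0 for n == 0
        parts.append(str(z))
    return " + ".join(parts)


print(describe(99))    # 4x20 + 19
print(describe(256))   # 2x100 + 2x20 + 16
```

Under these assumptions, 99 comes out as four twenties plus nineteen, while 256 switches to a decimal count for the hundreds and keeps twenties only for the remainder.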

Yet another property shared by languages of the area has to do with the expression of motion events. Talmy (1975, 1985) contends that in the domain of motion events, languages fall into two major types: Path (or v[erb]-framed) languages, which lexicalize the path of motion in the verb and express the manner of motion, if specified at all, outside the verb; and Manner (or s[atellite]-framed) languages, which lexicalize the manner of motion in the verb and express the path in a complement (“satellite”) to the verb. Romance languages are a common example of the Path type, and Germanic languages instantiate the Manner type. Compare the contrast between Spanish and English in example (9):

(9) a. Spanish
La botella entró a la cueva (flotando).
the bottle entered at the cave floating

b. The bottle floated into the cave.

Although no languages of the Caucasus are exclusively of the Manner or Path type, the Path type is preferred. The manner of motion is rarely expressed by a single verb; instead, we find basic motion verbs such as ‘go’ or ‘come’ combined with a nonfinite verb form or an adverb expressing a concomitant action (running/in the running manner, floating/in the floating manner, etc.), as illustrated in (10):

(10) Chirag Dargwa
cːade šːa duc’-b-ulq-le arg-an-de.
woman+pl.abs home run-h.pl-ipfv-cvb go:ipfv-ptcp-pst
‘The women were running home.’

Furthermore, a number of languages of the area lack such verbs as ‘fly’ or ‘swim’. Taken together, these lexical observations (which have not been systematized so far) are indicative of a promising area of research, one that would combine careful descriptive work on verbs of motion in languages of the area with further testing of Talmy’s initial hypothesis.

I have already mentioned the rich morphological makeup of verbs in the languages of the three indigenous families. In particular, most languages allow the construction of morphological causatives of transitives (and further valency increases are also possible, leading to pluritransitive verbs). Throughout the Caucasus, in causatives of transitives, the causer appears in the ergative, the object of the transitive remains in the absolutive, and the causee appears in an oblique form; the alignment where the causee is expressed as the direct object is unattested (Klimov, 1978, p. 57). To illustrate:21

(11) Adyghe
a. č̥’ale-m ʁwəč̥ə-r j-e-wəfe.
young.man-erg iron-abs 3sg.erg-dyn-bend.prs
‘The young man is bending iron.’

b. pŝaŝe-m č̥’ale-m ʁwəč̥ə-r r-j-e-ʁe-wəfe.
girl-erg young.man-obl iron-abs obl-3sg.erg-dyn-caus-bend.prs
‘The girl is making the young man bend iron.’ (Letuchiy, 2009a, p. 377)

(12) Georgian
a. švil-ma p’ur-i mo-i-t’an-a.
child-erg bread-nom pv-vers-bring-aor.3sg
‘The child brought bread.’

b. deda-m švil-s p’ur-i mo-a-t’an-in-a.
mother-erg child-dat bread-nom pv-vers-bring-caus-aor.3sg
‘Mother made the child bring bread.’

(13) Tsez
a. kid-b-ä magalu b-aħ-xo.
girl-th-erg bread.abs iii-bake-prs
‘The girl is baking bread.’

b. eniy-ä kid-be-q magalu b-aħ-er-xo.
mother-erg girl-th-poss.ess bread.abs iii-bake-caus-prs
‘The mother is making the girl bake bread.’

21 Northwest Caucasian languages have an extremely impoverished inventory of morphological cases, almost as a mirror image of their case-rich neighbors in the northeast. In Adyghe, the ergative and oblique cases have the same exponent, -m. Some researchers use that syncretism as evidence that these are one and the same case (see chapter 9; Testelets, 2009a). However, the distribution of -m-marked forms and their control of verbal agreement vary by structural position. Here I adopt the view that -m can mark different cases and that the case of the causee in (11b) is oblique, not ergative (see also chapter 18).

Although this alignment of causatives of transitives is not unique to the Caucasus (it is found in morphological causatives in Japanese, see Harley, 2008), the pervasiveness of this feature among languages of the area is striking. It is found in Ossetic as well (see chapter 14), which suggests that it may be an areal feature.

It is more common to discuss categories and properties present in a given language rather than focus on what is absent. However, some significant “omissions” in the structures of languages of the area should also be noted. In particular, Kartvelian and Northwest Caucasian languages lack infinitives. Instead, they use deverbal nouns (often described as masdars, the Arabic term for a verbal noun) or other nominalized forms, such as the supine in the Northwest Caucasian family (Klimov, 1978, pp. 18–19, 78).

With the exception of Armenian (see chapter 13), Old Georgian, and the Northwest Caucasian family (see chapters 9 and 10), Caucasian languages lack articles. That makes them good candidates for testing hypotheses concerning differences in the fundamental design of DP and NP languages (Bošković, 2008), an issue that Öztürk and Eren take up in a separate chapter in this volume (chapter 19). Further work in this domain is imperative.

In their demonstrative system, Caucasian languages all distinguish between at least three deictic categories: close to the speaker (hic), close to the hearer (iste), and away from both speech participants (ille). Actual realizations may vary from language to language (Klimov, 1978, pp. 19–20, 83) and often include the distinction between what is visible (here, there) and what is out of sight (yonder), as well as distinctions based on the position of the reference point on a vertical (higher, lower, at the same level/next to). The three-way distance contrast is also common in locative expressions. Additionally, most of these languages lack dedicated third person pronouns and use demonstratives instead. Given the dizzying array of demonstratives, it would be intriguing to find out which particular items in the demonstrative class are chosen to denote third person referents. Is it ‘this’, ‘that’, ‘next to the speaker’, or ‘below the speaker’s reference point’? A number of options are attested, and a study that could systematize the use of demonstratives for third person referents across languages of the area is much needed.

I.3. Scholarship on Languages of the Caucasus

The data on many languages of the Caucasus are descriptively rich, though not always easily accessible. In order to appreciate the existing scholarship, one must be able to read several languages. The earlier research was written up in German, Russian, French, and Georgian, and most of the contemporary literature is in English and Russian.

Early work on languages of the Caucasus can be roughly divided into the work done by local researchers and the work done by outsiders (Klimov, 1986, p. 25). Of the former, most studies were done in Georgia, with an emphasis on Georgian in general and on Bible translations into Georgian in particular. Early local scholars often downplayed the role of other Kartvelian languages. For instance, Megrelian was characterized as a nonstandard, uneducated variety of Georgian.22

22 See also chapter 12 for some discussion of this issue.

Of the work done by outsiders, early studies on languages of the area are associated with the names of explorers, military officers, and administrators who traveled to the Caucasus and helped map out the area’s ethnic and linguistic diversity. The first lexical lists and dictionaries of indigenous languages appeared in the late 1700s (Güldenstädt, 1787–1791; Klaproth, 1812–1814, 1814). More detailed and varied work soon followed. Marie-Félicité Brosset’s long and illustrious career studying Georgian and Armenian paved the way for serious historical and philological work in the South Caucasus. Franz Anton Schiefner, Adolf Dirr, and Baron Peter (Pëtr) von Uslar laid the foundations of modern study of Caucasian languages for the three indigenous families. They were not linguists by training, and their interests spanned ethnography, folklore, history, and language. Thanks to their dedication, we now have detailed grammars and dictionaries of several languages from the area (Dirr, 1904, 1905, 1908, 1928a, 1928b; Uslar, 1887, 1888, 1889, 1890, 1892, 1896, 1979).23 Baron von Uslar was also responsible for the creation of early Cyrillic-based orthographies for Nakh-Dagestanian languages.24

23 See chapters 3 and 9 for further discussion of early linguistic work in this area.
24 Russian scholars in the 1920s and 1930s built on that work, creating more alphabets, first based on the Latin script, and later on, as the USSR went back to more imperial aspirations, based on Cyrillic. Nikolay Yakovlev and Lev Zhirkov developed writing systems for a number of Caucasian languages (Alpatov, 2017).

The Russian-language journal “Sbornik materialov dlja opisanija mestnostej i plemjon Kavkaza” (SMOMPK) was published in Tbilisi from 1881 through 1915 (additional issues appeared in 1926 and 1929) and remains a valuable resource of ethnographic and linguistic observations. (In fact, many SMOMPK issues are listed in the references to this Handbook.) Before he gained notoriety for the idea that all of the world’s languages descend from a single proto-language with four exclamations as its entire vocabulary, Nicholas (Nikolay/Nikolai) Marr carried out important work on Georgian and Armenian philology. Nikolai Trubetzkoy conducted phonetic/phonological and comparative analysis of languages in the North Caucasus, and his work is still valid and current (e.g., Trubetzkoy, 1922, 1930, 1931).

Several outstanding Russian linguists worked in the area in the 1930s–1960s, with Moscow, Leningrad, and Tbilisi being established centers of research in Caucasian languages (the first department of Caucasian Language Studies was established at Tbilisi State University in the 1930s). Descriptions of languages produced in these centers remain authoritative sources of data to this day, and sometimes constitute a baseline which allows us to compare an earlier stage of a particular language to the way it is spoken now. Evgeny and Anatoly Bokarev, Arnold Chikobava, Zeynab Kerasheva, Ketevan Lomtatidze, Georgy Rogava, Akaki Shanidze, Nikolay Yakovlev, Lev Zhirkov, Varlam Topuria, Ilia Tsertsvadze, Bakar Gigineishvili: these are just some of the illustrious names on the roster of Caucasologists who worked in Russia/the USSR in the 20th century.

A new model of language study and description was pioneered by Alexander Kibrik and Sandro Kodzasov who, over two decades, led groups of researchers on annual fieldwork trips in the Caucasus. Kibrik’s work was undergirded by the desire to combine rigorous theoretical analysis with thorough description of a language (preferably under- or un-described) through intensive fieldwork, typically conducted by entire research teams (see Kibrik, 1972, 1977c, for the main principles of such team fieldwork). Not only did Kibrik and Kodzasov’s fundamental work lead to excellent descriptions and analyses of Caucasian languages (A. E. Kibrik, 1977a, 1977b, 1977c, 1992, 1996; Kibrik & Kodzasov, 1988, 1990; Kibrik, Kodzasov, Olovjannikova, & Samedov, 1977a, 1977b; Kibrik & Testelets, 1999; Kibrik, Kazenin, Lyutikova, & Tatevosov, 2001), but it also set a precedent for group field trips, which serve as incubators for training students and collecting data in all kinds of languages. The Adyghe collection referenced throughout this volume (Testelets, 2009a) is the result of one such field trip.

A good place to start for English sources is a special issue of Lingua edited by Helma van den Berg (2005a) that includes an overview of each family’s phonology, morphology, and syntax. Wixman (1980) provides an excellent ethnographic and sociolinguistic overview of the North Caucasus. Greppin (1989–2004) is a collection of more detailed descriptions, with an overview of each family and descriptions of their languages. Chumakina (2011a) provides a useful annotated bibliography of the main readings on languages of the area, with basic readings for all of the families. Comprehensive bibliographies on particular language families are also available: see Jaimoukha (2009) for Northwest Caucasian; Alekseev and Kikilashvili (2013) and Erschler (2014a) for Nakh-Dagestanian (in Russian and in English, respectively). For Kartvelian, there is no single publication with a relevant bibliography, but the following papers and books have extensive bibliographies: Boeder (2005), Greppin and Harris (1991),25 and Tuite (1998a).

Fieldwork in the Caucasus is changing. The area is more open to international researchers than ever before, which has led to worldwide collaboration among scholars (Bond, Corbett, Chumakina, & Brown, 2016; Chumakina, Brown, Corbett, & Quilliam, 2007a, 2007b), nascent experimental work on languages of the area (Gagliardi, 2012; Lau et al., 2018; Polinsky, Gomez-Gallo, Graff, & Kravtchenko, 2012), and extensive new grammars (Forker, 2013b, is a recent example; see also chapter 3 for more detail). Furthermore, in the late 1990s and early 2000s, the Max Planck Institute for Evolutionary Anthropology in Leipzig supported the publication of dictionaries, language descriptions and documentation, and folklore collections, primarily from the Nakh-Dagestanian family.

There is a new sense of urgency in studying the languages of the Caucasus because many have become endangered, due either to dwindling populations or to speakers moving away to areas where Russian or Georgian takes over (see chapter 2; also van den Berg, 1992).

I.4. Structure of This Handbook

This Handbook is an attempt to bring the descriptive riches of the Caucasus to an English reader, with an additional emphasis on the theoretical promise held by languages of the Caucasus. With that goal in mind, several chapters in this Handbook conclude with a section on outstanding issues or topics for future study. As previously mentioned, the reader who is looking to learn more about the history of languages of the Caucasus may have to look at other references; the emphasis in this volume is on synchronic description.26 Likewise, someone seeking information about extinct languages that were spoken in the area, for example, Hurrian or Hattic, will be disappointed; this Handbook does not include any such descriptions.

25 This volume is part of the series Greppin (1989–2004).
26 However, chapters 11 and 13 briefly discuss some aspects of the history of Kartvelian and Indo-European languages, respectively.

Part I includes chapters that present a general overview of the area, with emphasis on geography, demographic trends, and social aspects of language use. Demographic research in the Caucasus is still uneven; chapter 2 by Konstantin Kazenin is concerned only with the northern part of the area, and we have been unable to secure comparable chapters for the Kartvelian area, a clear indication of where future work is needed.

Each of the indigenous families is described in an overview chapter, and there is also an overview chapter on the local Indo-European languages (Parts II–V). In addition, this Handbook includes chapters on selected languages from the main families. Thus, each overview chapter is accompanied by a chapter (or several chapters) on selected languages; special effort was made to include lesser-described languages. For example, in Part IV, the Kartvelian overview is accompanied by a chapter on Megrelian, which has received less attention than the largest language of the family, Georgian (for descriptions of Svan, another understudied language of the family, see Tuite, 1998b, 2018, and references therein).

The Indo-European languages of the Caucasus share striking areal features (see chapter 13). By contrast, the Turkic languages of the Caucasus do not appear to have acquired features specific to the area and present clear examples of Turkic (and broader, Altaic) typology, including vowel harmony and consonantal restrictions at the beginning of a word, the nominative-accusative alignment, and visible agglutination. The relevant languages have been described relatively well, and the interested reader should consult Schönig (1998) for Azerbaijani, Berta (1998) for Kumyk and Karachay-Balkar, and Csató & Karakoç (1998) for Noghay, with further references therein. Since these languages use Cyrillic (see chapter 1), their transliteration conventions are included in Appendix II.

Chapters on language families and individual languages follow more or less the same format, with some deviations. For example, non-finite forms play a crucial role in Nakh-Dagestanian grammars but are much less relevant for the other two families, so the description of such forms is much more extensive in the Nakh-Dagestanian chapters. The discussion of grammatical relations may be more important for some languages, where their status has been a subject of dispute, and may be absent from other chapters where the data are insufficient or the issue does not even arise. For some languages, certain structural domains are studied comparatively less; while descriptive gaps may constitute obstacles for research, they also offer opportunities for future work.

While the authors of overviews and related language chapters made a concerted effort to coordinate their presentations to avoid duplication, some repetitive material is inevitable, and it may be less repetitive than it seems. For instance, the overview chapter on the Northwest Caucasian family includes charts showing the consonants of Abaza and Abkhaz (chapter 9), and so does the chapter by Brian O’Herin (chapter 10). However, the charts represent different dialects, and further still, the authors of the respective chapters have somewhat different views on the sound systems under consideration. This is an inevitable circumstance in the field, where discoveries are still being made and analyses are being actively worked out. Above I already brought up different views on ergativity in Kartvelian, which are reflected in individual chapters.

I have also mentioned the complex nominal forms in Nakh-Dagestanian languages used to encode spatial meaning. Some researchers analyze them as postpositional phrases (see chapter 3; Comrie & Polinsky, 1998), while others treat them as part of the nominal case paradigm (in chapter 5, Victor Friedman presents arguments in favor of this approach to Lak spatial forms).

The final part of this Handbook (Part VI) includes overview chapters that address particular aspects of language structure, from phonetics and phonology to grammar and information structure. The choice of topics was, to a large extent, motivated by available research (and researchers). For instance, there is virtually no research on lexical semantics in languages of the Caucasus and only very preliminary work on propositional semantics of these languages (mainly by Sergey Tatevosov and co-authors, see the chapters on semantics in Tsakhur and Bagvalal descriptions: Kibrik & Testelets, 1999, and Kibrik, Kazenin, Lyutikova, & Tatevosov, 2001, respectively). That explains one of the gaps in the Handbook. It would not be hard to find other areas of inquiry that are missing, and it is my hope that this volume will stimulate new research to fill in these holes.

And finally, some housekeeping notes are in order. Despite its relatively small geographic area, the Caucasus features a nearly overwhelming variety of language names (see also footnote 3). Throughout this Handbook, language names have been unified; Appendix I lists the most commonly used names of languages and language groups together with the existing alternatives. For instance, the Handbook uses the name Batsbi throughout, and Appendix I gives its alternate names: Bats, Batsaw, Tsova, Tsova-Tush. Names in the Caucasus are often more than names; some evoke a history of strife, divisions, or oppression, or carry other strong connotations. For example, the name Kartvelian, commonly adopted for one of the families, is rejected by the Laz, whose language belongs to that family, but who insist on the name South Caucasian (see chapters 19 and 22). And the language name Adyghe, widely used in the typological literature, and throughout this volume, may be less appropriate than West Circassian, the term used in the literature as well (e.g., Smeets, 1984); see chapter 9 for more discussion.27 While this Handbook has adopted a fairly conservative approach, keeping most names as they are found in the bibliographic tradition, it is incumbent upon researchers working in the Caucasus to be cognizant of ethnic or local names going forward.

27 The choice of names for the Circassian languages is further complicated by aspirations of terminological symmetry; if we use West Circassian for Adyghe, it is more appropriate to refer to Kabardian as East Circassian. And if we want to keep the more-common Kabardian, that may tip the scale in favor of Adyghe.

The variety of spellings and orthographic conventions is yet another issue that any intrepid researcher of the area has to face. With the exception of Azerbaijani, no language in the Caucasus uses Latin script (and many languages do not have writing systems, see chapter 1). Coupled with the complex sound systems, this creates serious challenges in transliterating names of languages or dialects, place names, or names of historical figures and local researchers. Difficulties are further confounded by the existence of several romanization systems for Cyrillic (which is widely used throughout the Caucasus) and for Georgian. Appendix II serves to show the most common correspondences between Cyrillic, Latin, and the International Phonetic Alphabet (IPA), which should help with future reading of particular texts. As much as possible, the authors have tried to use consistent romanization of personal names and names of locations, but old habits die hard and some chapters may have slightly varied transliteration for personal names and names of locations in the Caucasus. This is particularly evident with the romanization of Georgian where several systems compete (the most recent of those is the National System established in 2002 by the State Department of Geodesy and Cartography of Georgia and by the Institute of Linguistics of the Georgian Academy of Sciences). One of the main points of divergence has to do with the representation of ejectives: should they be marked with an apostrophe, with a dot under the consonant symbol, or by capitalization? (This Handbook adopts the first convention.) Differences in transliteration of personal names and local names linger, but we have attempted to keep the transcription of the Georgian data as uniform as possible throughout the volume; most exceptions have to do with the transliteration and glossing lifted from earlier work. The transliteration of Cyrillic follows the scholarly (academic) system (in particular, using the symbols č, š, ž among others), and this is used systematically for examples from Russian or the transliteration of book or article titles. Maintaining the same consistency in the transliteration of last names and names of locations is harder, since some names have already been used in a different transliteration; for some, we even find two different spellings (for example, Testelec and Testelets, or Daghestan and Dagestan).
Where possible we have tried to present the most common transliteration found in the literature; for example, the capital of Georgia is most commonly written in Latin characters as Tbilisi (as opposed to the previously used Georgian name T’pilisi or the older Russian name Tiflis, based on the older Georgian name), and this name is used throughout this Handbook. An additional problem arises when Georgian names appear in a Russian-language source; in such cases, we opted to transliterate the Russian form, for example, Dzheyranishvili (1971, 1984). In the bibliography to the volume the reader may find alternative transliterations of some last names, with a cross-reference to the more common transliteration (for example, Cagareli: see Tsagareli).

A note on glossing is in order as well. For languages as complex as languages of the Caucasus, morphological division and glossing is an art in and of itself, and a number of conventions have been established for particular languages or families. For instance, infixation is indicated with angle brackets; clitics and affixes are sometimes differentiated by using + and the hyphen, respectively. In Nakh-Dagestanian, where gender agreement is pervasive, Roman numerals are used in glosses to indicate the gender of a noun and the matching of that gender on the agreeing constituent. A number of glossing abbreviations conform to the Leipzig Glossing Rules, but quite a few are not on the Leipzig list, and the list of abbreviations in the beginning of this Handbook is understandably long. As with other aspects of data representation, the authors have tried to make the glossing as consistent as possible. Yet some differences are unavoidable, and they go beyond pure terminology. For example, some authors make a distinction between the generic evidential (evid) and non-evidential (nevid): the respective forms express different ways in which evidence was acquired and related to the assertion (was it the event itself that was sensed or was it some other state of affairs that implies the event). Meanwhile, other researchers, in particular those working on descriptions of several Nakh-Dagestanian languages, maintain the more fine-grained distinction between witnessed (wit), a subtype of direct evidential, and non-witnessed (nwit), a subtype within the non-evidential category. Accordingly, both categories and the respective abbreviations appear throughout this volume.
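The hyphen-based morpheme division mentioned in the note on glossing lends itself to a simple mechanical check: under the Leipzig Glossing Rules, a word in the object language and its gloss should contain the same number of hyphens. The following Python sketch illustrates that check on the Tsez example (13a) from this chapter. The function name `check_alignment` is hypothetical, and the sketch deliberately handles only the hyphen; the +, angle-bracket, and colon conventions used by some authors are not modeled.

```python
from typing import List, Tuple


def check_alignment(text_line: str, gloss_line: str) -> List[Tuple[str, str, bool]]:
    """Pair each word with its gloss and report whether their
    hyphen-separated segments match in number.

    Assumptions (illustrative only): words are separated by spaces,
    morphemes by hyphens, and a period inside a gloss (bread.abs)
    joins categories expressed by a single morpheme.
    """
    words = text_line.split()
    glosses = gloss_line.split()
    if len(words) != len(glosses):
        raise ValueError("text and gloss lines have different word counts")
    report = []
    for word, gloss in zip(words, glosses):
        # Drop sentence-final punctuation so it is not counted as a morpheme.
        segments = word.rstrip(".?!").split("-")
        labels = gloss.split("-")
        report.append((word, gloss, len(segments) == len(labels)))
    return report


# Example (13a), Tsez: every word aligns with its gloss.
for word, gloss, ok in check_alignment(
    "kid-b-ä magalu b-aħ-xo.",
    "girl-th-erg bread.abs iii-bake-prs",
):
    print(word, gloss, "ok" if ok else "MISMATCH")
```

A check of this kind only catches segmentation slips (for instance, a gloss line that drops a label, as in the deliberately broken pair kid-b-ä / girl-erg); it says nothing about whether the abbreviations themselves are used consistently across chapters.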

Acknowledgments

Many colleagues have encouraged and supported me in the work on this volume. Dmitry Ganenkov deserves special thanks for discussion of the content, potential contributors, and chapter templates. I am also grateful to Peter Arkadiev, Zurab Baratashvili, Timothy Blauvelt, Lena Borise, Michael Daniel, Marcel den Dikken, Nina Dobrushina, Anna Dybo, David Erschler, Steven Foley, Anton Kukhto, Yuri Lander, Beth Levin, Tamara Kalkhitashvili, Léa Nash, Alexander Rostovtsev-Popiel, Peggy Speas, Nina Sumbatova, Yakov Testelets, and Thomas Wier. Many thanks to Anton Kukhto and Zachary Wellstood for editorial and technical assistance in the preparation of this volume. Part of this chapter was written during my visit at the Hungarian Academy of Sciences under the Distinguished Guest Scientist Program in 2017.

Part I: General Overview of the Caucasus

Chapter 1: Languages and Sociolinguistics of the Caucasus

Nina Dobrushina, Michael Daniel, and Yuri Koryakov

1.1 Introduction

In this chapter, we give an overview of the classification and sociolinguistic situation of the languages of the Caucasus. Section 1.2 presents a summary of family affiliation and classification problems. Section 1.3 discusses language usage statistics as provided by official sources. Section 1.4 provides information on writing systems and the (recent) history of alphabetization. Section 1.5 is an overview of multilingualism in the area. Finally, section 1.6 is a brief discussion of language contact, providing several examples of contact-induced change in the area. As all of us have a deeper knowledge of Nakh-Dagestanian (East Caucasian) languages, this chapter provides more coverage of these languages as compared to languages of the other families.

1.2 Family Affiliation

The Caucasus is home to three indigenous language families: Kartvelian (also known as South Caucasian), Northwest Caucasian (other names: Abkhaz-Adyghe, West Caucasian), and Nakh-Dagestanian (also known as East Caucasian or Northeast Caucasian). The three families are often grouped under the rubric “Caucasian languages.” There is a considerable representation of Indo-European and Turkic language families, and small ethnic groups speaking Neo-Aramaic (Semitic).

Bold is used for primary language names. Italics are used for autonyms, that is, the name used by a given group to refer to their language, as well as for names in languages other than English. Together with the primary language name, the following information is given in parentheses: alternate names, autonyms, and (estimated) number of speakers.1 The counts do not include speakers outside the Caucasus.

1.2.1 Caucasian Languages

All Caucasian languages were traditionally spoken in southernmost Russia, Georgia, and Azerbaijan (Koryakov, 2002). The only exception is Laz, spoken mainly in Turkey. Many speakers of other languages (e.g., the extinct Ubykh and Circassian languages) moved to Turkey or the Middle East. Section 1.5 discusses the composition of the three groups.

Peter Uslar in his letters of 1864 was the first to voice the idea that the three indigenous language families may be related, though later he himself expressed serious doubts about that (Tuite, 2008, p. 9). This idea was further taken up by the Georgian historian Ivane Javaxishvili (1950) and his successors, A. Chikobava, V. Topuria, G. Rogava, and K. Lomtatidze. It was Chikobava who coined the name “Ibero-Caucasian.” The idea was later criticized by many Caucasologists, especially by G. A. Klimov (1968, 1969). The relationship between Kartvelian, on the one hand, and Nakh-Dagestanian and Northwest Caucasian, on the other, is now rejected by the vast majority of scholars (Comrie, 2005, p. 1; Tuite, 2008, p. 32).

That Northwest Caucasian and Nakh-Dagestanian may be related to each other was proposed by Nikolai Trubetzkoy (1930) and then followed up in A North Caucasian Etymological Dictionary by Nikolaev and Starostin (1994). Their reconstruction has been met with skepticism (Nichols, 1997b; Schulze, 1997a); nonetheless “a considerable number of scholars regard (the) North Caucasian hypothesis as at least an interesting possibility worthy of further investigation” (Tuite, 2008, p. 22).

Long-range comparativists further include Northwest Caucasian and Nakh-Dagestanian (North Caucasian) into an even more controversial macrofamily, Sino-Caucasian (Bengtson & Starostin, 2011), relating it, for example, to Basque, Burushaski, and Yeniseian (Starostin, 2010). Kartvelian, on the other hand, is included into the putative Nostratic macrofamily (Bomhard, 2008; Illich-Svitych, 1971). Theories linking North Caucasian to Hattian (Ardzinba, 1979; Braun, 1994; Chirikba, 1996, pp. 406–432) or Nakh-Dagestanian to Hurrito-Urartian (Diakonoff & Starostin, 1986) were recently considered in detail by Alexei Kassian (2010, 2011a, 2011b). He concludes that there are no grounds to establish any close relationship, although they could all be members of the putative Sino-Caucasian macrofamily.

1 Estimates are based on census figures, supplemented by village population figures for some minor languages where we have more direct data (Andic, Tsezic, and Dargwa).

Languages and Sociolinguistics of the Caucasus 29

1.2.2 Kartvelian

Kartvelian includes four languages, spoken to the south of the Greater Caucasus (see Harris, 1991b, and chapter 11 for an overview of the family). Georgian (kharthuli ena; 3.5 million), by far the largest language of the family, is mostly spoken in Georgia, with smaller communities in Turkey, Azerbaijan, Russia, and Iran. Svan (lušnu nin; 30,000) is spoken in the northwest of Georgia and the Upper Kodor valley in Abkhazia. Megrelian (Mingrelian; margaluri nina; 300,000) is spoken in the lowlands of western Georgia and southeastern Abkhazia (see chapter 12 for its description). Laz (Chan; lazuri nena; 22,000) is spoken primarily in northeastern Turkey and in one part of the village of Sarpi, which is divided by the state border between Turkey and Georgia.

Speakers of Svan and Megrelian consider themselves ethnic Georgians and use Georgian as their written language. One consequence is that in Georgia, Svan, Megrelian, and Laz are often considered Georgian dialects. Formerly, Megrelian and Laz were mistakenly considered two dialects of a single language, Zan. Church Georgian is a form of Old Georgian used liturgically by Christian speakers of all Kartvelian languages. Modern Georgian uses the Mkhedruli version and Church Georgian uses the Asomtavruli and Nuskhuri versions of the original Georgian script. Svan and Megrelian are not officially written except for Megrelian in Abkhazia (see section 1.4); occasionally, the Georgian script is used. Laz is written in the Latin alphabet.

1.2.3 Northwest Caucasian

The Northwest Caucasian family comprises Circassian and Abkhaz-Abaza branches. The extinct Ubykh is transitional between the two groups (Chirikba, 1996, pp. 7–8).

Circassian (adəɣabzɜ) is considered a single language by its speakers. This view is maintained in the diaspora. However, in Russia, Adyghe (West Circassian; č’axəbzə; 115,000), spoken in Adygea and Krasnodar Krai, and Kabardian (East Circassian, Kabardino-Cherkess; q’ɜbɜrdɜj-čɜrkjɜsəbzɜ; 505,000), spoken in Kabardino-Balkaria and Karachay-Cherkessia, are officially considered distinct languages. Two literary standards were created in the early 20th century, with dialects divided (sometimes arbitrarily) between them. In Russia, both standard languages use Cyrillic. Circassian is also to a limited extent written outside Russia, where the Latin alphabet is often used for this purpose.

Ubykh (tʷaxə́-bza) was spoken along the coast of the Black Sea (in the area of modern Sochi). Its speakers moved to the Ottoman Empire in 1864, where they switched to Circassian dialects. The last known speaker of Ubykh died in 1992 in Hacı Osman Köyü, a village near the Sea of Marmara.

Abkhaz (ápʰsšʷa; 124,000) has two dialects, Abzhui and Bzyb, spoken in Abkhazia. Several other dialects fully moved to Turkey (e.g., Sadz and Ahchypsow). Abaza (abaza bəzŝa; 36,600) is spoken in Karachay-Cherkessia. Ashqar, officially considered a dialect of Abaza, and Abkhaz seem to be mutually intelligible, while Tapanta Abaza is distinct enough to be viewed as a separate language. For further discussion, see chapters 9 and 10.


1.2.4 Nakh-Dagestanian

Nakh-Dagestanian is a language family of six branches spoken in the eastern part of Northern Caucasus with some communities on the southern slopes of the Greater Caucasus.

For some languages (Andic, Tsezic, and Dargwa branches), the figures for speakers are given based on village population rather than on census counts. We do this for the following reasons. Highland villages are usually ethnically and linguistically homogeneous (see section 1.5.3), and their populations do not shift to major languages. These groups speak their ethnic language as their L1, as suggested by the census. On the other hand, many villagers have moved to the lowlands, and their presence in the towns may now be higher than in their original villages. Such families tend to lose their ethnic language very quickly, sometimes within the first generation of resettlers, but may continue to indicate their ethnic language as their L1 (Rus. rodnoj jazyk, lit. ‘native language’) as a way to express their identity. This makes attempts at evaluations based on censuses unreliable (see Friedman, 2010; Kazenin, 2002a). Counts based on village population may provide more accurate estimates of the number of language speakers.

Dagestanian and Nakh languages were originally considered two separate families. Klaproth (1831) recategorized them as two branches of one family (van den Berg, 2005b; Hewitt, 2004), hence the family name Nakh-Dagestanian. As it turned out, there are almost no shared innovations that are common to all Dagestanian languages as opposed to Nakh languages (cf. Nichols, 2003, p. 241). It is therefore plausible to view the Nakh branch as a sister to other branches of Nakh-Dagestanian (Forker, 2013b; Koryakov, 2002; but see a different conclusion in Nichols, 2003). Chapter 3 presents an overview of the family.

In the Nakh branch, Chechen (nu̯oχčiːn mu̯otː; 1.3 million) and Ingush (ʁalʁaːj motː; 293,000) are grouped together under the name Veynakh.
Both languages are written in Cyrillic. See chapter 8 for details. Batsbi (Bats, Tsova Tush; bacbur mɔt’ː; 500) is an unwritten language spoken in northern Georgia. Speakers of Batsbi identify ethnically as Georgians. The language is severely endangered.

Avar, Andic, and Tsezic languages are sometimes grouped together in a single branch (Avar-Andic-Tsezic). In the Soviet censuses after 1937, these languages were not listed separately, and all speakers were registered as Avars. In the 2002 and 2010 censuses they were again listed separately. In practice, however, ethnic identification was not consistent, so that the figures are unreliable. Below in this section, we provide estimates based on village population, but even these figures are sometimes overestimated because many villagers prefer to keep their original address and registration even after they have moved.

Avar (maʕarul macʼ, awar macʼ; 693,000) is represented by several dialects (some of which might be distinct enough to be treated as separate languages) in southwestern and central Dagestan and northern Azerbaijan (Zakatala Avar). See chapter 6 for more details.

Andic languages, spoken to the west of Avar in the middle basin of the Andi Koysu River, include Andi (Gʷanːab mic’ːi; 22,500), Botlikh (bujχałi mic’ːi; 7,400), Godoberi (ʁibdiƛi micːi; 3,200), Karata (k’ːirƛi mac’ːi; 11,000), Tukita (1,300) (usually considered a dialect of Karata, but see Dobrushina & Zakirova, 2019), Northern Akhvakh (ašʷaƛi mic’ːi; 9,500), Southern Akhvakh (8,000 total, 350 in highlands, estimate provided by Indira Abdulaeva, personal communication), Bagvalal (bagwalal mis’ː; 5,500), Tindi (idarab micːi; 9,300), and Chamalal (č’amalaldub mic’ː; 9,600). Tindi and Bagvalal are close to each other, both geographically and linguistically, as are Botlikh and Godoberi. Some languages show visible divergence even on the level of dialects, as Andi in the villages of Andi, Zilo, Rikvani, and especially Muni and Kvankhidatl.

The Tsezic (Didoic) languages, spoken to the south of the Andic languages, in the upper-middle basin of the Andi Koysu River, include Tsez (Dido; cezjas mec; 12,300), Khwarshi (2,200), Hinuq (hinuzas mec; 450), Bezhta (Kapuchi; bežƛ’alas mic; 6,500), and Hunzib (honƛ’odos mɨc; 1,000). Language experts consider some of these figures to be underestimates. The Sagada dialect (soƛ’o; 700 speakers) of Tsez is sufficiently divergent to be considered a distinct language. Similarly, the variety of Khwarshi spoken in Inkhokwari is sometimes classified as a separate language.

The Dargwa (Dargi) languages are spoken in the southern central part of Dagestan and include a large range of lects traditionally considered dialects of one language, Dargwa. They are all treated as one language in the censuses.
On structural grounds, one may distinguish Northern Dargwa (133,000), Muira (muirala; 34,500), Tsudaqar (c’udqurla; 30,500), Kaytag (Kajtak, Kaytak, Xaidaq; χajdaq’la; 23,600), Shari (1,200), Tanti or Southwestern Dargwa (Tanti-Sirhwa-Amuq; 13,700), Usisha-Butri (7,600), Kubachi-Ashti (ʕūʁbugan-išt’ala; 6,200), Gapshima (Hapshima; ħabšila; 2,300), Chirag (xarʁnilla kub; 2,000), Sanzhi-Itsari (sanǯi-ic’arila; 2,000), Mehweb (Megweb, Megeb; meħwela; 800), and Amuzgi-Shiri (ʡaˁmuzʁan, xːeran; 200–400). Speakers of the Dargwa languages and language varieties generally speak the standard language, which is closest to the Aqusha and Urakhi dialects of Northern Dargwa. The use of standard Dargwa in the south is more limited. The Mehwebs, forming a Dargwa exclave surrounded by Avars and Laks and being taught Avar at school, are not proficient in the standard language. Chirag is probably the most divergent member of the branch, deep in the south of the Dargwa-speaking area. Various Dargwa varieties are endangered due to migration to the lowlands. For more on Dargwa, see chapters 3 and 4.

The Lak language (lakːu maz; 140,000) is spoken to the west of Dargwa; Dargwa and Lak may form a deep-level genealogical grouping. For a language with a relatively high number of speakers, Lak does not show strong dialectal variation. See also chapter 5.

The Lezgic languages, spoken in southeastern Dagestan and northern Azerbaijan, include Archi (aršatːen č’at; 1,500), Tabasaran (tabasaran č’al; 117,000), Agul (Aghul; aʁul č’al; 27,000), Lezgian (lezgi č’al; 546,000), Rutul (mɨχaˤbišdɨ č’ɛl; 27,300), Tsakhur (Tsaxur, Caxur; c’aˁχna / jɨˁqnɨ miz; 20,000, although this figure may be a strong overestimate because of the massive shift of the Tsakhurs of Azerbaijan to Azerbaijani), Budukh (budanu mɛz; 200 speakers), Kryz (Jek, Alik, Kryts, Dzhek; ɢrɨc’ä mɛz; 4,400), and Udi (udin muz; 4,900, also in Georgia and among recent migrants to Russia). Udi is exceptional in that it is by far the earliest documented language of the family. It is a descendant of the ancient Caucasian Albanian (Aghwan) language or of its sister (Gippert, Schulze, Aleksidze, & Mahé, 2008; see also section 1.4).

Khinalug (kätš micʼ; 2,200) is spoken in northern Azerbaijan. It may be distantly related to the Lezgic branch, with which it is traditionally grouped. Today this closeness is sometimes explained by a strong Lezgic influence.

Together with Batsbi, Budukh is one of the few Nakh-Dagestanian languages which seem to be immediately endangered. The village of Budukh, the only village speaking the language, is reported to be shifting to Azerbaijani (Adigoezel Hacijev, personal communication, July 4, 2018). While language shift is widespread in the lowlands, affecting all Nakh-Dagestanian languages, Budukh is one of the few known cases of language shift currently in progress right at the original location where a Nakh-Dagestanian language is spoken. More such cases may have happened relatively recently but remained undocumented, such as a probable shift from Tabasaran to Azerbaijani or Lezgian in some villages in the south of Dagestan (Genko, 2005, p. 203).

1.2.5 Indo-European Languages

Russian is the most widely spoken language of the Caucasus (20.5 million). In terms of the number of L1 speakers, it is slightly behind Azerbaijani (8.3 vs. 8.5 million). Apparently, the number of Russian speakers (both L1 and L2) in the southern Caucasus including Georgia, Armenia, and Azerbaijan has declined since the fall of the Soviet Union, although no reliable statistics are available. The use of Russian in Dagestan, on the contrary, is increasing. All ethnic groups speak Russian as L1 in towns, where the majority of the population has lost its ethnic languages. In villages, monolingual speakers of ethnic languages are exceptional, and, in most cases, Russian is the only lingua franca between neighboring villages (see section 1.5).

Speakers of Ukrainian (110,000) are scattered throughout the Caucasus but are especially dense in northern to central Krasnodar Krai.

Several Iranian languages are spoken in the Caucasus. One is a Northwestern Iranian language, Northern Kurdish (Kurmanji; kurmanǯi; 60,000), spoken by scattered communities in Armenia, Azerbaijan, and some parts of Northern Caucasus (especially the Republic of Adygea). The Yezidis (Êzidî), a separate Kurdish religious sect, claim their variety of Kurdish to be a distinct language, Ezdiki. It appeared as such in the Armenian census, while the census administration in Russia, after linguistic consultations, merged Ezdiki and Kurdish. Talysh (tolɨši; 77,400, possibly more) is spoken in the southeast of Azerbaijan and in adjacent Iranian territory.
Southwestern Iranian is represented by Tat (26,600) spoken by three different confessional groups which have their own autonyms for the language: Muslim Tats (tati; northeast Azerbaijan and the suburbs of Baku), Mountain Jews (ǯuhuri; Quba in Azerbaijan, few speakers in Dagestan, others moved to Israel or scattered over other towns in the Caucasus and elsewhere in Russia; written in the Hebrew script) and the nearly extinct Christian Tats (pʰarseren; Armeno-Tat; formerly in Madrasa and Kilvar in Azerbaijan, but moved to Armenia and Russia).

Northeastern Iranian is represented by Ossetic (0.5 million), spoken in the Russian Republic of North Ossetia and in the self-proclaimed republic of South Ossetia. Ossetic has two strongly divergent dialects, Iron and Digor (see chapters 13 and 14).

The Indo-Aryan branch of Indo-European is marginally represented by Romani (romani čʰib; 34,000) dialects, the language of the Roma (Gypsies) scattered throughout the Northern Caucasus and Georgia. In many areas, the Roma have assimilated linguistically to the surrounding languages. Romani is not an official language anywhere in the Caucasus. In recent years, it has had limited use in writing (Cyrillic). Closely related Domari (Karachi, Mitrib, Kaloro, Cingāna) is spoken by scattered Dom communities in Azerbaijan (and elsewhere in the Middle East). Lomavren (Bosha, 35,000–40,000) is a nearly extinct mixed language, spoken by the Lom people in Armenia, southern Georgia, and northeastern Turkey. It has retained most of its Indo-Aryan lexicon, but its grammar is almost entirely Armenian.

The Armenian branch is represented in the Caucasus by a dialect network usually considered to be a single language, Eastern Armenian (hajeɾen; 3.6 million). Some dialects lack mutual intelligibility. It is the main language of Armenia and Nagorno-Karabakh (where a strongly divergent Artsakh dialect is spoken as L1). Eastern Armenian is also spoken in Georgia and Iran. Its sister language, Western Armenian, is used in Turkey and by the majority of the Armenian diaspora outside Iran. In the Northern Caucasus, there are two divergent Western Armenian varieties. One is Nor-Nakhichevan Armenian, spoken in the Rostov region by descendants of Crimean Armenians. The other is Hamshen Armenian (homšecʰma), spoken by Christians in Abkhazia and Krasnodar Krai, and also by Muslims in northeastern Turkey, who do not consider their language to be Armenian (Koryakov, 2018).
Many Armenian dialects became extinct in the aftermath of the genocide in the early 20th century. At least two Hellenic languages are present in the Caucasus. Divergent varieties of Pontic (roméjka, 30,000–40,000) are spoken in southern Georgia (most of the speakers emigrated to Greece or Russia), Northern Caucasus, Abkhazia, and Armenia. Kappadokian Greek was spoken in Cappadocia (Central Turkey). In the 1920s, most speakers were forced to migrate to Greece, but some moved to Georgia, where Kappadokian Greek is still marginally present in some communities.

1.2.6 Turkic Languages

Turkic languages are widely spoken in some parts of the Caucasus, especially in Azerbaijan, but also in Dagestan. Turkish (türkče; 85,000) is represented in the Caucasus mainly by Meskhetian Turkish (ahɨska türkče), originally in southern Georgia, and now widely dispersed throughout Russia, Central Asia, Azerbaijan, Ukraine, Turkey, and the USA. Azerbaijani (Azeri; azərbajǯan dili; 9 million) is the main language of Azerbaijan (with more speakers in Iran than Azerbaijan) and of a few villages in southern Dagestan, where it is one of the official languages. Kumyk (Kumuk; qumuq til; 403,000) is a language of lowland Dagestan; it used to be a lingua franca for some parts of Dagestan. Karachay-Balkar (qaračay-malqar til, tawlu til; 299,000), the language of two ethnic groups, Karachays and Balkars, occupies the highest areas in Karachay-Cherkessia and Kabardino-Balkaria, respectively. Noghay (Nogai; noɣaj tili; 73,000) is spoken in several areas in the northern Caucasus, mostly in northeastern Dagestan and Karachay-Cherkessia.

1.2.7 Semitic

Speakers of several Northeastern Neo-Aramaic languages, the largest branch of Neo-Aramaic, migrated to the southern Caucasus in the 19th and 20th centuries and are scattered throughout Armenia (2,400 speakers), Georgia (several dozen speakers left after mass migration to Russia) and the Russian northwestern Caucasus (1,400 speakers). Today, Urmian Jewish Aramaic (lišān didān, lišānān) is still spoken in Tbilisi and in the village of Urmia in the Krasnodar Krai (around 80 speakers). Several hundred speakers of Northern Bohtan Aramaic (Hértevin; sôreth) live in Krymsk and Novopavlovsk (southern Russia), where they moved from Gardabani (Georgia).

1.3. Official Statistics on Language Speakers and Users

Only a few of the ex-Soviet republics kept Russian as their official language. In the Caucasus, Armenia, Azerbaijan, and Georgia each have one official language (Armenian, Azerbaijani, Georgian, respectively), as does the unrecognized state of Artsakh (Nagorno-Karabakh Republic), where the official language is also Armenian. The partially recognized republics of South Ossetia and Abkhazia declared Russian as their official language, in addition to the dominant languages (Ossetic and Abkhaz, respectively). The situation with actual language use is, of course, much more complicated.

Almost all of the countries in the Caucasus are successors to the Russian Empire and the Soviet Union, which had one of the world’s longest uninterrupted histories of ethnic and linguistic statistics based on systematic census data. The first census in which language data were collected was the Russian Imperial Census of 1897. It was followed by the Soviet “All-Union” census in 1926. The 1926 census included a question regarding the respondent’s first language (Rus. rodnoj jazyk ‘mother tongue’). The question was repeated in Soviet censuses almost every decade: in 1939,² 1959, 1970, 1979, and 1989.

2 The census was originally carried out in 1937. It was then announced that its results had been intentionally distorted by the organizers (accused of being “enemies of the people”), and the census was redone two years later.

Table 1.1 Last Two Decennial Censuses in the Caucasus

Country          Year   Year   Census Questionᵃ
Abkhaziaᵇ        2003   2011   L1
Armenia          2001   2011   L1, OL
Azerbaijan       1999   2009   L1, proficiency in Azerbaijani, OL
Georgia          2002   2014   L1, proficiency in Georgian
Artsakhᵇ         2005   2015   L1
Russia           2002   2010   proficiency in Russian, OL, L1
South Ossetiaᵇ   –      2015   L1
Turkey           2000   2011   –ᶜ

ᵃ L1 – first language, OL – other language[s] spoken fluently.
ᵇ Unrecognized or partially recognized states.
ᶜ The 1965 census in Turkey included a question regarding the language usually spoken at home, as well as a question on other languages used by the respondent.

The practice of collecting detailed ethnic and linguistic statistics in censuses continued after the collapse of the USSR (see Table 1.1). In most cases, this practice was further developed and sociolinguistically refined. Table 1.2 presents the data on the largest languages of the Caucasus. It includes data from Abkhazia, Armenia, Artsakh, Azerbaijan, Georgia and South Ossetia, and nine administrative units of the Russian areas of the Caucasus.

Table 1.2 Ten Most Spoken Languages of Caucasus (2009–2015 Censuses)

Language      L1           L2           L1+L2
Azerbaijani   8,506,270    524,841      9,031,111
Russian       8,320,492    12,197,152   20,517,644
Armenian      3,510,783    77,281       3,588,064
Georgian      3,310,978    181,416      3,492,394
Chechen       1,284,271    9,473        1,293,744
Avar          618,673      112,853      738,526
Lezgian       532,614      13,786       546,400
Kabardian     488,700      15,672       504,372
Ossetic       462,724      19,756       482,480
English       –            771,422      771,422
Total         31,182,548
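The arithmetic behind Table 1.2 can be checked mechanically. The sketch below is illustrative only: the figures are transcribed from the table as printed, the names `TABLE_1_2`, `TERRITORY_TOTALS`, and `mismatched_rows` are hypothetical, and reading the Total row as the sum of the population totals printed in Tables 1.3–1.9 is an inference from the arithmetic rather than something the source states.

```python
# (L1, L2, printed L1+L2) per language, transcribed from Table 1.2 as printed.
# English has no L1 figure in the table, so it is stored as None.
TABLE_1_2 = {
    "Azerbaijani": (8_506_270, 524_841, 9_031_111),
    "Russian": (8_320_492, 12_197_152, 20_517_644),
    "Armenian": (3_510_783, 77_281, 3_588_064),
    "Georgian": (3_310_978, 181_416, 3_492_394),
    "Chechen": (1_284_271, 9_473, 1_293_744),
    "Avar": (618_673, 112_853, 738_526),
    "Lezgian": (532_614, 13_786, 546_400),
    "Kabardian": (488_700, 15_672, 504_372),
    "Ossetic": (462_724, 19_756, 482_480),
    "English": (None, 771_422, 771_422),
}

# Population totals printed in the "Total" rows of Tables 1.3-1.9
# (assumption: these are the territories that Table 1.2 aggregates).
TERRITORY_TOTALS = {
    "Armenia": 3_018_854,
    "Azerbaijan": 8_922_447,
    "Artsakh": 137_737,
    "Georgia": 3_713_804,
    "Abkhazia": 240_705,
    "South Ossetia": 53_532,
    "Northern Caucasus (Russia)": 15_095_469,
}


def mismatched_rows(table):
    """Map each language whose L1 + L2 differs from the printed sum to (computed, printed)."""
    mismatches = {}
    for language, (l1, l2, printed) in table.items():
        computed = (l1 or 0) + l2  # a missing L1 figure counts as zero
        if computed != printed:
            mismatches[language] = (computed, printed)
    return mismatches


print(mismatched_rows(TABLE_1_2))
print(sum(TERRITORY_TOTALS.values()))
```

With the figures as transcribed, the Avar row is the only one flagged (618,673 + 112,853 = 731,526, against a printed 738,526); the table alone does not show which of the three figures is off. The territory totals add up to the 31,182,548 printed in the Total row.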

In the rest of this section, we present and discuss observations that may be drawn from the census data on language usage, country by country in alphabetical order. Political affiliations of some territories in the Caucasus are vigorously disputed. The authors do not take any political stance on the territorial conflicts of the region.

1.3.1 Armenia

Table 1.3 reproduces the data from the 2011 Census in Armenia.³ Armenian is the only official language in the country. Russian continues to be widely used, even if its use has declined considerably in the decades following the collapse of the USSR (see section 1.5 for qualitative discussion). Note the speakers of Neo-Aramaic, whose presence was officially reported as early as in the 1897 Census. Yezidi and Kurdish, though counted separately, are in fact the same language, and the difference in the census data should be attributed to self-identification based on religious affiliation.

Table 1.3 Linguistic Composition of Armenia, 2011

Language                   L1          L2
Armenian                   2,956,615   43,420
Yezidi                     30,973      5,370
Russian                    23,484      1,591,246
Assyrian (= Neo-Aramaic)   2,402       1,468
Kurdish                    2,030       1,309
Ukrainian                  733         1,151
Greek                      332         2,136
Georgian                   455         6,151
Persian                    397         4,396
English                    491         107,922
French                     –           10,106
German                     –           6,342
Other                      913         10,339
Refuse to answer           29          –
Total                      3,018,854

3 Armenian Census 2011, http://armstat.am/ru/?nid=517

1.3.2 Azerbaijan and Artsakh (Nagorno-Karabakh)

Table 1.4 shows language data from the official 2009 Census in Azerbaijan, including data on Artsakh (Nagorno-Karabakh). For minority languages (i.e., languages other than Azerbaijani, Russian, and English), the counts only include respondents who indicated the corresponding ethnicity. For instance, only those people who declared themselves Talysh were asked about their proficiency in Talysh and, depending on their answer, identified as L1 or L2 speakers of Talysh. All respondents were asked about their proficiency in Azerbaijani, Russian, and English. A comparison of the 2009 and 1999 census data also suggests that the number of L1 speakers is underestimated, at least for Talysh, Georgian, and Tsakhur. For Georgian, apparently only Christian Georgians were counted, while Muslim Georgians were counted as Azerbaijanis, and their proficiency in Georgian was not reflected in the census. Finally, Budukh is not listed in the statistics, while Kryz and Khinalug, not listed in the previous censuses, are included.

Table 1.4 Linguistic Composition of Azerbaijan, 2009ᵃ

Language       L1          L2
Azerbaijanis   8,148,282   519,417
Lezgians       162,450     2,885
Armenians      120,180     0
Russians       117,988     626,877
Talyshis       47,600      560
Avars          46,463      398
Turks          31,806      684
Tatars         24,139      104
Ukrainians     20,984      22
Tats           19,001      277
Tsakhurs       11,722      111
Georgians      9,682       20
Jews           8,509       55
Kurds          2,202       964
Kryz           1,254       18
Udis           3,773       0
Khinalugs      2,143       2
Other          7,648       228
English        –           71,380
Total          8,922,447

ᵃ Population by ethnic group: language and L2 proficiency, 2009, http://www.stat.gov.az/source/demoqraphy/en/1_11-12en.xls

Another issue has to do with the population of Artsakh (Nagorno-Karabakh). The figures were calculated based on the data of the 1989 census and thus include the Azerbaijani refugees from the Artsakh territory. Artsakh conducted its own post-Soviet censuses in 2005 and 2015. Table 1.5 provides available linguistic data (on L1).

Table 1.5 Linguistic Composition of Artsakh, 2005ᵃ

Language    Native Speakers
Armenian    136,366
Russian     1,274
Ukrainian   7
Other       90
Total       137,737

ᵃ Preliminary results of the 2015 Nagorno-Karabakh population census: http://www.stat-nkr.am/hy/2010-11-24-10-40-02/597—2015

1.3.3 Georgia, Abkhazia, and South Ossetia

Table 1.6 shows the data from the 2014 census in Georgia.⁴ It includes the number of native speakers per language together with the number of those who speak fluent Georgian. Native speakers of Azerbaijani and Armenian have the lowest shares of fluent Georgian speakers. As in the Soviet censuses, recent Georgian censuses do not include statistics for some minority languages: Svan, Megrelian, Laz, and Batsbi (Nakh branch of Nakh-Dagestanian). For these speakers, only their fluency in non-native language(s) is registered. As is evident from a comparison with the 2002 census, Greek, Kurdish, and Kist (a dialect of Chechen), whose speakers are not considered to be ethnic Georgians, are included as “Other” in the 2014 census.

The low counts for Ossetic and Abkhaz indicate that the census does not include data on the breakaway republics of Abkhazia and South Ossetia, which are given in Table 1.7 and Table 1.8, based on the official counts provided by these republics. In Table 1.7, the majority of those who indicated Georgian as their L1 are in fact first-language speakers of Megrelian. South Ossetia conducted its only census in 2015. It included the question on ethnic affiliation, but not on language use (see Table 1.8).

4 2014 General Population Census Results, http://census.ge/en/results/census

Table 1.6 Linguistic Composition of Georgia, 2014

Language      Native Speakers   Fluently Speak Georgian   Do not Fluently Speak Georgian   Not Stated
Georgian      3,254,852         3,254,852                 –                                –
Azerbaijani   231,436           43,579                    172,134                          15,723
Armenian      144,812           57,316                    74,258                           13,238
Russian       45,920            29,179                    9,099                            7,642
Ossetic       5,698             4,831                     189                              678
Abkhaz        272               163                       19                               90
Other         30,742            19,095                    8,007                            3,640
Not Stated    72                –                         –                                72
Total         3,713,804         3,409,015                 263,706                          41,083
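The remark that Azerbaijanis and Armenians have the fewest Georgian speakers is easiest to see when fluency is expressed as a share of each group's native speakers rather than as a raw count. The following sketch, with the hypothetical names `TABLE_1_6` and `fluent_shares`, computes those shares from the native-speaker and fluent-in-Georgian figures of Table 1.6; it is an illustration of the comparison, not a calculation taken from the census report.

```python
# (native speakers, of whom fluent in Georgian), from Table 1.6 (2014 census).
# The Georgian row is left out: all of its native speakers count as fluent.
TABLE_1_6 = {
    "Azerbaijani": (231_436, 43_579),
    "Armenian": (144_812, 57_316),
    "Russian": (45_920, 29_179),
    "Ossetic": (5_698, 4_831),
    "Abkhaz": (272, 163),
    "Other": (30_742, 19_095),
}


def fluent_shares(table):
    """Share of each group's native speakers who are fluent in Georgian."""
    return {language: fluent / native for language, (native, fluent) in table.items()}


shares = fluent_shares(TABLE_1_6)

# Print the groups from the lowest to the highest share.
for language, share in sorted(shares.items(), key=lambda item: item[1]):
    print(f"{language:<12} {share:.1%}")
```

Sorted this way, Azerbaijani and Armenian come first, while Ossetic has the highest share; in absolute numbers the small Abkhaz group would otherwise appear to have the fewest Georgian speakers.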

Table 1.7 Linguistic Composition of Abkhazia, 2011ᵃ

Language    Total Native Speakers   Of the Same Ethnic Group   Of Other Ethnic Groups
Abkhaz      121,697                 120,817                    880
Armenian    40,831                  40,731                     100
Georgian    38,020                  37,933                     87
Megrelian   3,112                   3,112                      –
Russian     28,580                  21,921                     6,659
Greek       –                       862                        –
Ukrainian   717                     706                        11
Ossetic     –                       426                        –
Abaza       –                       308                        –
Turkish     –                       370                        –
Romani      –                       253                        –
Estonian    –                       225                        –
Other       7,748                   –                          3,955
Total       240,705

ᵃ Census results for the territory of Abkhazia 1886–2011, http://www.ethno-kavkaz.narod.ru/rnabkhazia.html

As in Table 1.4, the number of minority language speakers in Table 1.7 only represents the respondents who identified with a given language based on their ethnicity. As a result, local patterns of multilingualism (the Abaza speaking Ossetic, etc.) are not accounted for. The only data on multilingualism that are available cover the knowledge of the major regional languages: Abkhaz, Armenian, Georgian, Russian, and Ukrainian.

Table 1.8 Ethnic Composition of South Ossetia, 2015ᵃ

Ethnic Group   Population
Ossetians      48,146
Georgians      3,966
Russians       610
Other          810
Total          53,532

ᵃ South Ossetia census results: http://ugosstat.ru

1.3.4 Russia The exact number of language questions and their formulations vary from census to census. The 2002 census included the following relevant questions: Do you speak Russian? What other languages do you speak? (up to three languages). In the 2010 census, an additional question (taken from the Soviet censuses) was added: What is your first language? Russian is the only official language of the entire Russian Federation. In some of the republics that form the Russian Federation, local languages are normally co-official languages, written in the Cyrillic alphabet, except indicated otherwise. The area of the Caucasus that is part of the Russian Federation comprises seven federal republics5: Adygea (Adyghe), Chechnya (Chechen), Ingushetia (Ingush), Kabardino-Balkaria (Kabardian, Karachay-Balkar), Karachay-Cherkessia (Karachay-Balkar, Kabardian, Noghay), North Ossetia-Alania (Ossetic), and Dagestan. According to the language legislation of the Republic of Dagestan, all indigenous languages are official. These languages are not, however, officially listed in the language policy laws. In practice only the written languages are treated as official: Agul, Avar, Chechen, Dargwa, Kumyk, Lak, Lezgian, Noghay, Rutul, Tabasaran, Tat, and Tsakhur. Table 1.9 provides combined data on L1 and L2 for nine administrative units of Russia conventionally included in the Northern Caucasus (Krasnodar and Stavropol Krai, Republics of Adygea, Chechnya, Dagestan, Ingushetia, Kabardino-Balkaria, KarachayCherkessia, and North Ossetia–Alania). The data are based on the 2010 census. Listed are all languages traditionally spoken in the Caucasus, as well as immigrants’ languages with more than 1,000 speakers. The total number given in the last line is the total p opulation of the nine Caucasian administrative units of the Russian Federation, not the result of adding together the numbers of speakers for each language in the table. The latter would

5 We show languages spoken in each republic in parentheses following the name of the republic.

Table 1.9 Linguistic Composition of Northern Caucasus (Russia), 2010

Language               L1+L2
Russian           14,350,496
Chechen            1,293,744
Avar                 685,726
Kabardian            504,372
Dargwaᵃ              456,151
Englishᵇ             429,120
Ossetic              428,636
Kumyk                402,373
Lezgian              363,100
Karachay-Balkar      299,179
Ingush               292,609
Armenian             245,020
Azerbaijani          171,703
Lak                  140,394
Tabasaran            116,778
Adyghe               114,970
Germanᵇ              107,346
Ukrainian             75,419
Noghay                73,008
Georgian              57,907
Turkish               37,161
Abaza                 36,555
Romani                34,009
Frenchᵇ               28,883
Rutul                 27,225
Agul                  26,953
Tatar                 24,534
Greek                 23,280
Turkmen               12,063
Tsez                  11,994
Kurdish               11,505
Uzbek                 10,290
Arabicᵇ               10,251
Tsakhur                9,364
Belarusian             8,591
Spanish                8,308
Bezhta                 5,899
Kazakh                 5,712
Andiᶜ                  5,417
Moldovan               4,149
Korean                 3,997
Tajikᵇ                 3,492
Polishᵇ                3,157
Italianᵇ               2,801
"Dagestanian"ᵃ         2,268
Tindiᶜ                 2,109
Latinᵇ                 1,941
Neo-Aramaic            1,910
Circassian             1,896
Abkhaz                 1,758
Bulgarian              1,756
Khvarshi               1,729
Chuvash                1,686
Mordvin                1,649
"Jewish"ᵈ              1,447
Bagvalalᶜ              1,435
Kyrgyz                 1,425
Tat                    1,423
Lithuanian             1,215
Bashkir                1,162
Persian                1,059
Archi                    961
Hebrew                   945
Hunzib                   918
Chamalalᶜ                470
Udi                      433
Megrelian                230
Karataᶜ                  222
Akhvakhᶜ                 185
Botlikhᶜ                 179
Godoberiᶜ                120
Total             15,095,469

ᵃ Collective term for more than one language.
ᵇ A second language for most respondents.
ᶜ For these languages, the numbers are strongly underestimated.
ᵈ “Jewish” includes Yiddish and Hebrew (both of which were also listed separately) and Judeo-Tat (Mountain Jewish), which are all genetically unrelated.

be substantially higher, because the figures of speakers per language include not only L1 but also L2 speakers (sometimes respondents indicated several L2s). Speakers of some languages may identify themselves with another, larger ethnic group and its language. This can lead to an underestimation of the number of speakers for some languages (Table 1.9, note c). For instance, the counts for most Andic languages are much lower than independent estimates. This reflects the fact that many Andic speakers are registered as Avars. Section 1.2 provides more realistic figures (based on village population, usually monolingual in terms of L1). Some of our language consultants report pressure from the authorities to register as one of the majority ethnic groups.

1.4 Writing and Scripts Not all languages of the Caucasus are written. By 2017, 27 languages spoken in the Caucasus had the official status of a written language. Georgian and Armenian literary traditions have existed continuously from the early Middle Ages. In various other areas, Arabic was the main language of written communication. Udi (Lezgic branch), or its sister dialect, was also written in the Middle Ages in a script probably related to Old Armenian and Old Georgian: the so-called Aghwan or Caucasian Albanian. The script subsequently fell into oblivion, and its language, long known only from short texts such as inscriptions on artefacts, remained a mystery, identified by different scholars with many languages in the Caucasus and beyond. Very recently, after a palimpsest dating to the end of the first millennium was found in a monastery on Mount Sinai and became accessible for academic research, the language has been reliably identified as an older stage of Udi (Gippert & Schulze, 2007; Gippert, Schulze, Aleksidze, & Mahé, 2008).

Apart from the three languages which have ancient scripts, and apart from the occasional use of the Arabic script for Avar (and several other larger languages), the languages of the Caucasus acquired their writing systems in the late 19th to the 20th century. Modern literatures exist primarily in Azerbaijani, Abkhaz, Chechen, Ingush, Avar, Dargwa, Lezgian, and Lak. The scripts in use changed several times depending on the political situation. For example, the first Abkhaz alphabet was created in 1862 by Peter Uslar and was based on the Cyrillic script. In 1926, it was replaced by an alphabet based on the Latin script. From 1938 to 1954, the Georgian alphabet was used in schools. In 1954, Cyrillic was re-introduced and has been used ever since.6

6 Even for the languages considered unwritten, there is a possibility of discovering relatively old manuscripts in personal archives. For example, Archi, a Lezgic language spoken in one highland village (about 1,500 speakers), has seen the publication of several small poetic religious texts, apparently dating back to the mid-19th century (Magomedxanov, 2009). At that time and earlier, an Arabic script was used.

Not all languages that are officially written are equally common in writing. For example, a Cyrillic-based writing system for Rutul was introduced in 1992. Before that, there were almost no written documents in Rutul. Azerbaijani was used for official documents, religious texts, and poetry. After 1992, Rutul classes were incorporated into the school curriculum, two hours a week. Several textbooks have been published. Currently, the local newspaper Rutul’skie Vesti (‘Rutul News’) publishes most articles in Russian and only some in Rutul. However, our consultants report difficulties with reading the articles, because the newspaper uses the variety of Rutul spoken in the village of Rutul (the administrative center), whereas most villages have their own dialects, sometimes considerably different from the Rutul of Rutul Village. Rutul speakers have little motivation to learn another dialect of Rutul to read it. First, the written variety of a different dialect may not align with ethnic identity; and second, all important documents are in Russian anyway. Rutul classes in school are not popular among either students or their parents. The situation with written Tsakhur, a sister Lezgic language spoken in the neighboring villages, seems to be quite similar. Archi, a minority language spoken in a single village, provides another example of issues related to transliteration. In the 2000s, a team led by Aleksandr Kibrik suggested a practical orthography for Archi. It was used in a number of texts published by Chumakina, Arkhipov, Kibrik, & Daniel (2008), and in the online Archi dictionary compiled by the Surrey Morphology Group (Chumakina, Brown, Corbett, & Quilliam, 2007a). In practice, however, this writing system is only used in interactions between field researchers and Archi native consultants (e.g., when transcribing Archi recordings). Khinalug, a language spoken in a single village in the north of Azerbaijan, was not written until the 1980s. In the 1980s, a Cyrillic-based orthography was used by the local poet, Rahim Alhas. A slightly different Cyrillic orthography was independently used by Faida Ganieva in her dictionary of Khinalug (Ganieva, 2002). At that time, however, Azerbaijani had already switched to a Latin-based orthography, so attempts to create Cyrillic orthographies were ill-timed. A systematic Latin-based orthography was first suggested by a team led by Aleksandr Kibrik and then further negotiated with the community in the 2010s by Monika Rind-Pawlowski. Unlike the situation with Archi, some villagers seem to be enthusiastic about the Khinalug orthography; still, its application and utility seem unclear. The situation is not much different with major Nakh-Dagestanian languages such as Avar, Dargwa, or Lezgian. It is clear that most Dagestanians read and write Russian much more often (if not exclusively) than their native language, even if they are speakers of Avar or Lezgian. There are similar reports for Ingush. Local enthusiasts use the written minority languages, especially in writing poetry, but the poetry is hardly ever read. The audience of the many newspapers which publish materials in languages other than Russian is very limited. Having said this, no true and reliable estimate of the written use of these languages has been made to date, so that these claims must be considered impressionistic. The first known attempts to create a script for Megrelian were made in the second half of the 19th century. Since then, however, Megrelian has been written mostly for academic purposes of language documentation, by the early Soviet administration in the 1920s and 1930s, and currently by several newspapers (in particular, in the breakaway region of Abkhazia). In addition, several local enthusiasts in Georgia use this system to write prose and poetry. Native speakers of Megrelian not involved in these activities are unlikely to be fluent in reading the written language (Alexander Rostovtsev-Popiel, personal communication, May 5, 2017). Some major languages also have practical problems with established writing systems. The first Ingush alphabet (created in 1921) was based on Latin script.
In 1938, it was converted to Cyrillic, and that version is still in use. Johanna Nichols (personal communication, May 10, 2017) indicates problems which arise due to the fact that Ingush and Chechen both have schwa-zero alternations, which make it difficult to use a phonemic orthography. Speakers complain that pronunciation and spelling are very distinct. The younger generation, who are generally less literate in Chechen or Ingush, do not write weak vowels, which makes the spelling somewhat chaotic. Table 1.10 shows the written languages of the Caucasus. The writing systems are divided into traditional (the ones that have been in existence since the Middle Ages), systems that were introduced in the late 19th to early 20th century, transitional systems, and recently established systems. All modern scripts in the parts of the Caucasus within the Russian Federation are based on Cyrillic. Sound types alien to Russian are represented by combinations of letters rather than by the use of diacritics; especially frequent and multifunctional is the symbol I (Rus. paločka), primarily used following obstruents to represent ejectives. There are, however, slight differences in the use of symbols even among different languages within Dagestan. This stems from the fact that sound inventories differ among languages, so that sound types present in some languages may use character combinations that are used for other sound types in some other languages. This does not create confusion because speakers are usually literate in only one language of Dagestan, and

Table 1.10 Scripts Used by Caucasian Languages

Language

Fam.

Arabic Script

Roman Script

Cyrillic Script

Other Scripts

Georgian

K

since 5th c. (Georgian)

Armenian

IE

since 5th c. (Armenian)

Aghwan

ND

5th–10th cc. (Aghwan)

TRADITIONAL

LATE 1800s–EARLY 1900s

Adyghe

NWC

1918–1927 (spor. 19th c.)

1927–1937 (att. 1980s)

1937 (spor. 19th c.)

Kabardian

NWC

1920–1923

1923–1936 (att. 1980s)

1936–

Abaza

NWC

1926–1938

1938–

Abkhaz

NWC

1926–1938

1862–1926, 1954–

1938–1954 (Georgian)

Chechen

ND

1918–1925 (spor. 19th c.)

1925–1938 (att. 1990s)

1938–

Ingush

ND

1918–1923 (spor. 19th c.)

1923–1938

1938–

Avar

ND

1918–1928 (spor. 15th c.)

1928–1938 (att. 1990s)

1938–

spor. 10–14th cc. (Georgian)

Dargwa

ND

1918–1928 (spor. 16th c.)

1928–1938

1938–

Kaytag

ND

(spor. 14th c.)

Lak

ND

1918–1928 (spor. 15th c.)

1928–1938

1938–

Lezgian

ND

1918–1928 (spor. 19th c.)

1928–1938

1938–

Tabasaran

ND

1928–1938

1938–

Tat

IE

1928–1938, 1990s

1938–

1870s–1928 (Hebrew)

Talysh

IE

Currently (Iran)

1928–1938, 1990s

1938–1990s (spor.)

Ossetic

IE

1923–1938

(att. 18th–19th cc.) 1844–1923, 1938–

att. 18th–19th cc., 1938–54 (Georgian)

Azerbaijani

Tu

? –1929 (Az.) / Currently (Iran)

1925–1939, 1992–

1939–2001/ Currently (RF)

Kumyk

Tu

Karachay-Balkar

Tu

Noghay

19th c.–1928

1928–1938

1938–

1910–1925

1924–1938 (att. 1990s)

1938–

Tu

19th c.–1928

1928–1938

1938–

Megrelian

K

(spor. 1860s)

1920–1933, spor. 1990s, 2003–

Udi

ND

(att. late 1990s) 1935–1936, att. 1990s

Rutul

ND

1928–1938

Tsakhur

ND

1928–1938 (att. 1990s)

1938–1940,1992–

TRANSITIONAL

1938–1940,1992–

RECENTLY ESTABLISHED

Agul

ND

1992–

Andi

ND

att. 1992

Tsez

ND

att. 1993

Laz

K

1984–

Abbreviations: att. — attempts

spor. — sporadically

Russian is mostly used in reading and writing anyway. Abkhaz Cyrillic uses a modified version of Uslar’s alphabet and is very different from any other Cyrillic alphabet used in the Caucasus. For transliteration of some scripts based on Cyrillic, see Appendix II.

1.5 Multilingualism in the Caucasus Due to the high language density in the area, multilingualism “was the norm in many Caucasian communities” (Chirikba, 2008b, p. 30). Large ethnic groups in the Caucasus tended to be monolingual, but the language minorities, and in some cases the speakers of major languages living close to linguistic borders, were bilingual in the languages of their neighbors. Under certain socio-economic circumstances, people could also speak

several unrelated languages; this was particularly common among male speakers who traveled and worked outside the home. The typical linguistic repertoire of Caucasian peoples changed significantly in the course of the 20th century, as command of Russian spread throughout the Caucasus. In the second half of the century, Russian became the language of interethnic communication, and, in many (though not all) places, displaced the use of local L2s. Sometimes Russian even became a threat to the main language of a particular community. However, language death is, so far, not a common phenomenon in the Caucasus. After the fall of the Soviet Union, the role of the Russian language and culture declined in those parts of the Caucasus that became independent countries: Azerbaijan, Georgia, and Armenia (Pavlenko, 2008). At the same time, the presence of Russian has grown in Dagestan, Chechnya, Adygea, Abkhazia, and South and North Ossetia.

1.5.1 South Caucasus Before 1991, South Caucasus was part of the Soviet Union and was administratively organized into three first-order units (“Soviet Republics”) with five second-order units (Rus. автономная республика ‘autonomous republic’ and автономная область ‘autonomous district’) within them. These were: Georgia (including Abkhazia, South Ossetia, and Adjara), Armenia, and Azerbaijan (including Nagorny Karabakh and Nakhchivan). In 1991, all these territories except Adjara and Nakhchivan declared their independence, but only Georgia, Armenia, and Azerbaijan have been commonly recognized. See Map 1 in this volume.

1.5.1.1 Azerbaijan Before the advent of Russian, Azerbaijanis were essentially monolingual, despite the fact that even in the border areas, bilingualism was (and still is) typical for their neighbors. An Azerbaijani woman who married into a non-Azeri family and lived in a non-Azeri village would hardly learn to speak her husband’s language, and most children born into such families would be more likely to speak Azerbaijani as their primary language (Murad Sulejmanov, personal communication, May 8, 2017). Dialects of Azerbaijani are very much alive. In many places, children can switch from the dialect with their parents and other family members to the standard when communicating with outgroup interlocutors. In recent decades, Azerbaijani has replaced Russian in many spheres where Russian was dominant during the Soviet era. Russian remains the most widespread second language among middle-aged and elderly Azerbaijanis, but the number of Russian speakers is much lower than in neighboring Dagestan, and young Azerbaijanis usually do not speak it. Due to the role it plays in politics and media, Turkish is becoming a more popular second language among speakers of Azerbaijani, especially among the youth, since many people watch Turkish TV. We have personally observed the same trend in Azerbaijani villages in Dagestan (e.g., Darvag, Yersi, and others).

Azerbaijani is the dominant language of instruction in primary and secondary schools across the country. Russian is the language of instruction in more than 300 state schools. Most state universities offer undergraduate and graduate programs in Russian, though programs in English are becoming increasingly widespread. In the schools with Azerbaijani as the language of instruction, Russian is sometimes taught as an L2, but English, French, German, Persian, and Arabic are also offered. There is no presence of dialects at schools, and teachers are supposed to teach in the standard language only. To a certain extent, dialects are present in the media, constituting a constant source of worry for the National Council for Television and Radio Broadcast, which will occasionally warn television presenters that they are supposed to restrict themselves to the standard language (Murad Sulejmanov, personal communication, May 8, 2017). Azerbaijan has numerous linguistic minorities: Armenians, Avars, Budukhs, Georgians, Khinalugs, Kryz, Kurds, Lezgians, Rutuls, Talysh, Tats, and Tsakhurs. Georgian communities in the Qax region (sometimes called Ingiloys, though Georgians reject this designation as derogatory) remain ethnically homogeneous: they resist mixed marriages and have Georgian schools for children, and some locals are even monolingual in Georgian. Other minority languages are taught as heritage languages in some villages where there are native speakers, but only on an extracurricular basis. The main source of sociolinguistic information on minorities in Azerbaijan is a collection of papers by Clifton et al. (Clifton, 2002, 2003) and a fieldtrip to the Qax region undertaken by two of us in July 2018. We will now discuss the situations of the Talysh, Tat, Kryz, and Tsakhur. The Talysh comprise one of the largest linguistic minorities of Azerbaijan.
There are about 350 Talysh villages, and in some areas, the Talysh constitute up to 95% of the population. Bilingualism in Azerbaijani is typical for the vast majority of the Talysh. According to Clifton, Deckinga, Lucht, & Tiessen (2003b), within Talysh settlements, Azerbaijani is used in formal situations where non-locals are present, while Talysh is used in informal situations for communication between locals. Older Talysh speakers are more likely to use Talysh in their everyday communication. Conversely, for younger people, Azerbaijani may become dominant. In ethnically mixed communities, only Azerbaijani is used in informal situations. There are significant differences between lowland villages with a stronger influence of Azerbaijani and mountain villages where Talysh is better preserved. Since the 1930s, Azerbaijani has been the language of instruction in Talysh schools. In 2003, Clifton, Deckinga, Lucht, & Tiessen (2003b) reported that a program of Talysh had been designed for grades 1 to 4 in homogeneous Talysh communities, and that there were plans to expand this program to include higher grades. Knowledge of Russian is widespread because many Talysh people go to Russia for temporary jobs. The use of Russian is significantly lower than the use of Talysh and Azerbaijani, particularly in the highland communities. Tat has been displaced by Azerbaijani in many communities where it was traditionally spoken. In all Tat communities, average proficiency in Azerbaijani is high and Azerbaijani is the language of schools. According to another recent study by Clifton, Deckinga, Lucht, & Tiessen (2003a), the viability of the vernacular is tied to the economic viability of a given community and to the remoteness of the village. Tat may survive in large lowland

communities with ample resources, as well as in highland villages that are relatively stable economically. For example, in Qırmızı Qǝsǝbǝ, a settlement with a high concentration of Tats, the local language is still viable. It is the main language used by families, including in communicating with children, as well as between community members (Clifton, Deckinga, Lucht, & Tiessen, 2003a). The population as a whole also knows and uses Azerbaijani and Russian. In poorer villages, be they in highlands or lowlands, Azerbaijani has become the main language of Tat families, with parents communicating in it with their school-aged children. For the majority of Tat communities, Russian plays a secondary role; only a small portion of the population can speak Russian well. Kryz (Nakh-Dagestanian, Lezgic) is spoken by a small group in northern Azerbaijan, including all age groups in the mountain villages of Hapıt, Əlik, Cek, and Qrız (Authier, 2009, p. 1). Outside Kryz villages (i.e., within families who live in lowland towns), the proficiency in Kryz varies from full fluency to passive understanding, and even to complete loss of the language. Clifton, Mak, Deckinga, Lucht, & Tiessen (2002) report fluency not only in Azerbaijani but also in Russian, though our own impressionistic observations in Əlik show limited command of Russian among women and younger men (similar findings are reported for the Khinalug). Azerbaijani has always been the language of instruction in the highland villages of Hapıt, Əlik, Cek, and Qrız, and no Kryz classes have ever been offered in any of the villages. Another Lezgic language, Tsakhur, is spoken in northern Azerbaijan. The Zakatala, Qax, and Sheki regions border southern Dagestan and were places of traditional seasonal migrations by the Tsakhur and Rutul peoples from Kurdul, Gelmets, Mikik, Kina, and some other villages.
In the Qax region (briefly investigated by a field team led by two of us in July 2018), Tsakhurs hardly retain their language. Most Tsakhurs prefer Azerbaijani for communication, even within their family. Azerbaijani also dominates in those villages that are traditionally considered Tsakhur settlements (e.g., Kum). We do not have systematic data on Avar, Akhvakh, Rutul, or Lezgian spoken in Azerbaijan, but all these languages are reported to be strongly present in the northern areas bordering western Dagestan.

1.5.1.2 Armenia Armenian is spoken by everyone in Armenia, except a minority with Russian as their first language. Western and Eastern Armenian are usually considered different varieties of Armenian or even different languages. With some exceptions (e.g., Hamshen or Kesab), Western Armenian dialects became extinct after the genocide of 1915. Most eastern dialects, on the other hand, are vigorously alive. These are considerably divergent varieties of Eastern Armenian, some of which are claimed to be unintelligible to the speakers of the standard language, especially the dialect of Artsakh (Nagorno-Karabakh). After the genocide, most refugees from Turkey fled to Europe and the United States, but some escaped to Armenia, which presumably led to an interesting and understudied situation of dialect mixing and leveling. Dialects are not taught at school. Modern Western Armenian is taught at certain schools, especially after the beginning of the civil war in Syria and the immigration of Syrian Armenians to Armenia. Classical Armenian is still used as the language of liturgy.

Besides Armenian, both active and passive command of Russian is widespread in Armenia. Certain generational biases are observed for L2 speakers, with younger generations being less proficient in Russian and more in English, and vice versa for older generations (Victoria Khurshudyan, personal communication, May 10, 2017; see also Dum-Tragut, 2013). The presence of other ethnic groups in Armenia is limited (see Schulze & Schulze, 2016, for an overview). The largest minorities are Kurds and Yazidis, both bilingual in Armenian. Neo-Aramaic and Kurdish are taught at certain schools.

1.5.1.3 Georgia In Georgia, Georgian is used in all domains of everyday life: education, work, life, and media. Standard Georgian is the language of instruction at schools and at universities. TV and radio broadcasts are in standard Georgian. Some TV channels have one- to two-hour informational blocks in Ossetic and Abkhaz. Megrelian and Svan are not used in the media or in schools. There are some Russian schools with the full instruction cycle in Russian, although their number decreased considerably throughout the 1990s. English as a second language is gaining popularity but is still spoken by only a small part of the population. Georgians born before the late 1980s continue to use Russian as their main L2. Even in border settlements, Georgians are usually monolingual, while some of their neighbors speak Georgian. Some Armenians and Azerbaijanis living in Georgia use Russian as a lingua franca. Dialects of Georgian are vigorous, not only in rural areas but also in towns. There are several language minorities in Georgia, including speakers of the other Kartvelian languages (Megrelian, Svan, and Laz), as well as speakers of languages from other families, including Abkhaz, Azerbaijani, Russian, Ossetic, Kurdish, Armenian, Neo-Aramaic, and Batsbi. The situation with Kartvelian minorities is complicated by the fact that these languages belong to the same family as Georgian and are sometimes considered Georgian dialects. By and large, their speakers identify themselves as Georgians (see Tuite, 2017b, p. 226, on Svan; Vamling, 2017, on Megrelian). In Georgia, all three languages remain mostly unwritten (see Tuite, 2017b, on attempts to write in Svan; see section 1.4 about attempts to write in Megrelian). In Gal, a newspaper published in Abkhazia, Megrelian is written in a modified Georgian script. Svan is still spoken in its places of origin, Upper Svaneti and northwest Samegrelo, whereas Lower Svaneti and southeast Samegrelo are reported to have shifted to Georgian.
In Mestia, according to recent reports, Svan is fluently spoken only by people older than 40. By contrast, in the villages of Latali, Ipari, and Adishi, Svan remains the primary language of everyday communication, including for children (Kevin Tuite, personal communication, June 12, 2017). The loss of proficiency in Svan seems to be a relatively recent development. According to Nizharadze (1964, pp. 169–172, quoted by Hewitt, 1992), Georgian was primarily acquired by Svan men when they were working in the Georgian lowlands during winter. In 1964, out of the 290 men in the Svan village

of Ushguli, 160 knew Georgian; in K’ala, 199 of 219; in Ipari, 306 of 546. As early as 1870, the command of Georgian was reported for three to six men in each of the villages. According to Chirikba (2009, p. 28), Svans and Megrelians acquired Georgian in the 20th century largely due to the introduction of Soviet schools, where Georgian was the language of instruction. Megrelian, also under strong pressure from Georgian, remains the dominant language of its area, at least outside institutional and administrative interactions (Alexander Rostovtsev-Popiel, personal communication, May 5, 2017). The situation of Laz is problematic in Sarpi, the only Laz village in southwestern Georgia, where locals are unwilling to admit they are Laz. In Turkey, Laz is much more viable. Few Turkish Laz, however, consider their L1 to be a social advantage, because it is unlikely to provide them a job and has low prestige in Turkey. Thus, young people tend to shift to Turkish. Batsbi is a one-village language in the Georgian province of Kakheti. All speakers of Batsbi are bilingual in Georgian (Gippert, 2008). Similar to other Georgian minorities, the Batsbi prefer to be considered Georgians. According to the project of language documentation of the languages of Georgian minorities (http://dobes.mpi.nl/projects/svan/language), at the beginning of the 21st century only the generation older than 50 had perfect competence in the language; younger adults could understand it and were able to speak but did not use the language. There were no children who understood or used Batsbi. Rural Armenian populations living in the Samtskhe-Javakheti district of southwestern Georgia are descendants of those Armenians who resettled there in the 19th century during the Russo-Turkish Wars. They are predominantly agriculturalists who tend to have a lower level of education in comparison to Armenians living in Tbilisi.
Samtskhe-Javakheti Armenians are seldom proficient in Georgian and speak the Karin dialect of Armenian, also used in northern Armenia. The Azerbaijani community in Tbilisi is very small. Georgian Azerbaijani populations are primarily rural agriculturalists concentrated in Kvemo Kartli, to the south of the capital near the border with Azerbaijan. Like the rural Armenians of Samtskhe-Javakheti, they are seldom fluent in Georgian and tend to have little formal education (Driscoll, Berglund, & Blauvelt, 2016).

1.5.1.4 Abkhazia During the Soviet era and in the post-Soviet period, Abkhaz was taught at school and was the language of instruction until the fourth grade (i.e., up to the age of about ten). Starting from the fifth year of school, instruction switched to Russian (Chirikba, 2009, p. 7). Some subjects were taught in Abkhaz at the university level. Radio and broadcasting in Abkhazia are in Abkhaz and Russian. L1 speakers of Abkhaz are the majority in Abkhazia. The largest minorities are Armenians and Megrelians. In the recent past, some Abkhaz people in the south of Abkhazia were bilingual in Megrelian. Bilingualism in Georgian, Armenian, Svan, or Greek is also reported, but is rare (Chirikba, 2009, p. 8). During the war with Georgia in 1992–1993, most Georgians, and many Megrelians and Russians, left the republic. Some

of them came back after the war, but in general the proportion of non-Abkhaz population in Abkhazia has declined. Local Armenians mostly speak Hamshen Armenian, but school instruction is in Standard Armenian in Armenian schools; Hamshen and standard Armenian are reported to be mutually unintelligible (Chirikba, 2008a). Proficiency in Russian was very high during Soviet times and did not decrease significantly after the fall of the Soviet Union (Chirikba, 2009). Russian continues to be the main language of Abkhaz towns, especially among younger speakers.

1.5.2 Russia In Ingushetia and Chechnya, Russian influence was stronger than in some other areas of the Caucasus because of the deportations, which were much more restricted and selective elsewhere (see chapter 2). As everywhere, Russification was stronger among urban populations. But while in the highlands of Dagestan, villagers continue to use their ethnic languages, this is not always the case in Ingushetia and Chechnya. Under Stalin, highland villagers were forcibly resettled, and the Chechen population decreased significantly. Those who came back in the 1950s often settled in the lowlands. Chechnya is large both in terms of its area and its population. It is largely monoethnic, with very few Chamalal villages in the East, on the border with Dagestan, and equally few Kumyk and Noghay villages in the North. The command of Russian is very high and has not been affected even by the military conflicts with Russia. Russian is the main language of education and science, and the media are in both Chechen and Russian. Even in rural primary schools, the language of instruction is Russian; Chechen is present only in Chechen language and literature classes (Rus. rodnoj jazyk i literatura, ‘native language and literature’; Zarina Molochieva, personal communication, June 12, 2017). Formally, Ingush is the official language of Ingushetia, used in education, media, and literature. However, in the opinion of some experts, the language situation is unstable. Children in mixed families usually speak Russian. In the minds of many young people, the Ingush language and culture are associated with a rural, low-prestige lifestyle (Nichols, 2011). Nearly all Ingushes are bilingual in Russian. Many have a passive knowledge of Chechen. Active command of Chechen is infrequent and is more typical of mixed Ingush-Chechen families. Even less frequent is the command of Ossetic or Georgian which, in older days, used to be acquired by visiting marketplaces. 
Chechen and Ingush are close, but there is no mutual intelligibility. Long periods of contact have resulted in passive bilingualism on both sides; oftentimes, people in a multilingual dialogue each speak their own language. If members of other ethnicities are present, Russian is used (Nichols, 2011). The whole population of North Ossetia is bilingual in Russian. In towns, especially in Vladikavkaz, Russian is used more widely than Ossetic. Children born in Vladikavkaz usually do not speak Ossetic at all. Between 2001 and 2002, only 62 of 224 schools had primary school instruction in Ossetic. Only one of the 62 was located in Vladikavkaz (Kambolov, 2007, p. 22). Thanks to the local activist Tamerlan Kambolov, some

kindergartens have started offering classes where caregivers communicate with children exclusively in Ossetic. There is a literary tradition, but according to Kambolov (2007), only about 10% of the population read books in Ossetic. Most prefer reading in Russian. The situation with radio and TV broadcasting is similar; although there are programs in Ossetic, Russian broadcasts are more diverse and are preferred by locals. Ossetic as taught at school is closer to the Iron dialect than to Digor. Newspapers, broadcasts, and theaters exist in both dialects (David Erschler, personal communication, June 2017). The situation is similar in South Ossetia. Russian is still the primary L2, and many children in the urban areas do not acquire Ossetic. There are some Georgian schools, especially in the villages close to the Georgian border (Parastaev & Mearakishvili, 2016), but in most schools, instruction is in Russian. The whole population of Adygea is bilingual in Russian. Adyghe remains the L1 for adults and children in villages, though code-mixing in Adyghe speech is becoming more and more frequent (Irina Bagirokova, personal communication, June 2017). The situation in Maykop, the capital of the republic, is that most children born there do not speak or understand Adyghe, even if their parents speak it as their L1. In kindergartens in both Maykop and the villages, caregivers speak Russian. At schools, a standard variety of Adyghe is used only in classes on the Adyghe language, literature, and customs. The standard variety is close to some dialects of Adyghe (Temirgoy and in some respects also Bzhedug), yet distant from other lects of the Adyghe-Kabardian continuum stretching across the area where standard Adyghe is taught at school. Adyghe is sometimes used for academic purposes, including being one of the working languages at conferences in which representatives of the global Adyghe diaspora participate.
There are radio and television broadcasts in Adyghe, as well as Adyghe theaters.

1.5.3 Dagestan

The Republic of Dagestan is special not only in terms of language density, extreme even on the scale of the Caucasus, but also because of the vitality of its minority languages. More than 40 languages are still spoken in the highlands. Endogamy is probably one of the factors which has sustained linguistic diversity. Traditionally, people from highland rural communities preferred to find marriage partners in the same village; often, there existed strict prohibitions on marrying out (Comrie, 2008). The pattern was probably not ubiquitous; according to Wixman (1980, p. 94), it was not strict in Tabasaran communities, where mixed marriages with Lezgians were not uncommon. Our own field experience revealed strict endogamy in Central and Northern Dagestan, and less strict endogamy in Southern Dagestan. For example, in the Rutul village of Kina, many men married women from a neighboring Tsakhur village; there is extensive intermarriage between the Agul village of Khpyuk and the Lezgian village of Ursun, though we cannot be sure that these are not innovative developments. In any case, marrying into another village was only possible for women; men did not normally move to the village of their wives.

A large-scale sociolinguistic survey started in 2011 made available a massive amount of data on multilingualism in Dagestan, both in its present state and its recent past (see http://multidagestan.com). Such data are unavailable for other areas in the Caucasus.

There is a huge difference in the patterns of interethnic communication in Dagestan before and after the advent of Russian in the middle of the 20th century. Prior to the Soviet period and the spread of Russian, the Caucasus had never had one single lingua franca (Chirikba, 2008b, p. 30). The language of interaction had always been negotiated between particular communities. The most frequent pattern of linguistic interaction between neighboring villages used to be “neighbor multilingualism,” where one of the languages of two neighboring villages would be used. If residents of two villages located within walking distance spoke different first languages, multilingualism was usually asymmetrical, meaning that less than half of the population of one village could speak the language of their neighbors, while more than half of their neighbors were bilingual (Dobrushina, 2011a, 2013). This was clearly the case of Mehweb: 97% of the Mehweb-speaking population born before 1919 spoke Avar, whereas in the neighboring Avar village, Obokh, only 8% spoke Mehweb.⁷

In some parts of Dagestan, a lingua franca was mainly used between villages with different languages. This was the case in southern Dagestan, where the communities near the border with Azerbaijan were proficient in Azerbaijani. For example, the residents of the Rutul village Kina communicated with their Tsakhur neighbors from Gelmets in Azerbaijani. Another lingua franca was Kumyk, the Turkic language of the lowlands.
Residents of the villages speaking Kadar Dargwa (e.g., Chabanmakhi, Vanashimakhi, Chankurbe) used Kumyk to speak with their Avar neighbors from the village of Durangi. In some parts of the central and northern highlands, Avar was the lingua franca. A striking example is the use of Avar in the Akhvakh district, where speakers of several Andic languages (Karata, Akhvakh, Bagvalal, and Tukita), which are much more closely related to each other than to Avar, communicated exclusively in Avar and rarely spoke each other’s language (Dobrushina & Zakirova, 2019).

The choice of dominant language in asymmetrical bilingualism or as a general lingua franca was governed by many factors, including the language of the local marketplace, population size in the area, and, more visibly, the relative altitude of the settlements. That the altitude of a language community may correlate with asymmetrical bilingualism is known from the patterns of the vertical hierarchy of Andean cultures (Murra, 1968). This was independently suggested in Caucasian studies (Lavrov, 1953; Volkova, 1974; see also Nichols, 2004). According to Nichols (2013, p. 43), “Highlanders learned lowland languages for market communication and also because a good portion of the adult working-age male population was transhumant, spending the winter months in lowland pastures; but lowlanders had no need to learn highland languages.”

7 All numbers in this section come from the survey within the project Atlas of Multilingualism of Dagestan (http://multidagestan.com).

Apart from competence in neighboring languages, people in a number of areas in Dagestan also spoke some more distant languages (i.e., those which were spoken as L1 beyond their immediate neighborhood). Such languages were acquired due to transhumant (nomadic) husbandry, seasonal jobs or services, or trade. For example, in Archib (Archi), Shalib (Lak), and Chittab (Avar), people used to go to Azerbaijan to work as tinners. As a result, in these villages about 30% of the population born before 1919 could speak some Azerbaijani. In Balkhar, a Lak village, about 15% of the population born before 1919 could speak Avar, because Avar-speaking areas were important for selling pottery, for which Balkhar was famous.

The traditional patterns of multilingualism differed across genders. Languages of distant communities were spoken almost exclusively by men, because the practices which led to the acquisition of distant languages (such as tinning) were only typical of men (Dobrushina, Kozhukhar, & Moroz, 2019). Communication with immediate neighbors was frequent among men and women alike, so the gender bias was much less pronounced in the knowledge of geographically neighboring languages. Figure 1.1 shows that the percentage of those who spoke Avar in the Andi village of Rikvani was roughly the same among men and women. However, the gender bias is much more apparent for distant languages such as Russian, Chechen, and Kumyk.

Before the Soviet period, Dagestan had almost no secular schools. Some villages had madrasahs, where Arabic literacy was taught. Basic knowledge of the Quran and Classical Arabic script was also obtained through personal training provided by educated people, often the mollahs. As mentioned in section 1.4, Avar, Lak, Dargwa, Lezgian, and Kumyk used the Arabic script for writing.
There are also some documents in minority languages written with the Arabic script. According to our research, in the generations born between 1880 and 1920, between 20% (Kina, Darvag) and 50% (Karata, Archib) could read the Arabic script (see http://multidagestan.com; also Bennigsen & Lemercier-Quelquejay, 1985, on Arabic in Dagestan). Knowledge of the Arabic script did not imply command of Classical Arabic as a language. Many people were able to read Arabic letters but could not understand the meaning of the text. This is also true today: some Dagestanians can read the Quran, but knowledge of Arabic is rare.

Figure 1.1. Languages spoken by men and women born between 1889 and 1940 in Rikvani (Andi). Data collected in interviews in Rikvani in 2015. [Chart; vertical axis 0–100 (percentage of speakers); languages shown: Andi, Avar, Russian, Chechen, Kumyk; series: women, men.]

According to historical evidence, Russian was spoken by less than 1% of the Dagestanian population at the end of the 19th century (Svod, 1893; Volkova, 1974, p. 31). In the 1930s, Soviet schools were opened in most Dagestanian villages. Russian was one of the main subjects and the primary language of instruction. The curriculum was compulsory. Most people thus acquired Russian and started reading and writing in Cyrillic. Figure 1.2 shows command of Russian in four villages of Dagestan as a function of the year of birth. In the generations born after 1950, almost everyone speaks Russian.

The expansion of Russian has strongly enhanced the level of literacy and has severely endangered the patterns of local multilingualism. Since the middle of the 20th century, Russian has been the language of higher education and upward social mobility in Dagestan. Younger generations speak Russian instead of the languages of their neighbors. Figure 1.3 shows the dynamics of four languages in the village of Mehweb as a function of the date of birth. The native language of the village, Mehweb Dargwa, is spoken by all residents. Neighboring languages, Lak and Avar, started their decline in the 1930s and 1980s, respectively. Avar persisted longer because it was taught at school, together with Russian. All villagers born after 1950 speak Russian.

Figure 1.2. Command of Russian among people born between 1900 and 2000 in Darvag, Archib, Tsulikana, and Ersi (generated via http://multidagestan.com). [Plot; vertical axis 0–100; horizontal axis year of birth, 1860–2000; series: Russian in Darvag, Archib, Tsulikana, and Ersi.]

Figure 1.3. Languages spoken in Mehweb among generations born between 1870 and 2000 (generated via http://multidagestan.com). [Plot; vertical axis 0–100; horizontal axis year of birth, 1860–2000; series: Mehweb, Avar, Russian, Lak.]

At present, Russian is the main language of interethnic communication in many highland areas and in the vast majority of lowland settlements. The latter are meeting places for people with different linguistic repertoires. An important factor in the spread of Russian was that townspeople quickly abandoned traditional prohibitions on interethnic marriages. Mixed marriages are still much more frequent among urban dwellers than in rural populations. The villagers who marry women from other ethnicities (whom they often meet in the lowlands, for example, during their graduate studies) move to towns. Such mixed families choose Russian as the main means of family communication. Under the influence of local languages, and probably also because of the drastic decrease of the ethnically Russian population in the 1990s, Russian used by Dagestanians has acquired special phonetic, morphological, and syntactic features and can be considered a distinct ethnolect (Daniel, Dobrushina, & Knyazev, 2010). At present, migration is the main social process which has a strong impact on the loss of local languages. Migration to the lowlands was partially initiated by the Soviet authorities as an economically driven policy to relocate highlanders. From the 1950s through the 1970s, the residents of some mountain villages were put under pressure to move down and were financially supported to build their new houses in the lowlands (Karpov & Kapustina, 2011). As a result, many villages in various areas of Dagestan are now underpopulated or deserted. For instance, the speakers of Hinuq, a Tsezic minority language, started moving to the village of Monastirsky in the Kizlyar district of Dagestan in 1986 (Forker, 2013b). Quite frequently, migration to the lowlands was linked to seasonal herding of village sheep in lowland pastures far away from the home village, so-called kutans. Originally, kutans were only used in winter. In other seasons sheep were pastured in the mountains. 
Nowadays, people often prefer to stay in lowland settlements for the whole year, thereby establishing new villages. This process was boosted by the economic collapse of agriculture in the post-Soviet period (Kazenin, 2012).

The ethnic and linguistic composition of these new villages varies significantly. Sometimes they host people from different parts of Dagestan with different home languages. For example, the village of Druzhba, founded by Dargwa resettlers in the Kayakentsky District, also hosts Tabasarans, Aguls, Laks, and a few Lezgians, Russians, Kumyks, and Avars. According to Rasul Mutalov (personal communication, May 10, 2017), people use Russian for interethnic communication and practice mixed marriages. The situation in Druzhba inevitably leads to the loss of local languages. Some new lowland settlements maintain the ethnic homogeneity that was typical of traditional Dagestan and consolidate residents of several villages who speak different dialects of the same language. Personal communication with residents suggests that they are aware of dialect leveling in these villages, but at least the local language is not entirely lost. Neither language loss nor dialect leveling in the lowlands has been studied; one study of code-switching in Sanzhi-Itsari spoken in Druzhba is a notable exception (Forker, 2019b).

1.6 Language Contact in the Caucasus

Caucasian languages share a number of common properties (see also the Introduction to this volume), for example, ergative alignment and the presence of ejectives. It is true that the three endemic families are overwhelmingly ergative. Ejectives seem even more significant in that they have spread to non-endemic families, such as Ossetic (Iranian), and are argued to be present in some dialects of Kumyk, Karachay-Balkar, and Azerbaijani (Turkic; see Fallon, 1998, p. 320). Yet, shared linguistic features of the Caucasian languages as a whole are less striking than, for instance, shared features in the Mesoamerican linguistic area (see Campbell, Kaufman, & Smith-Stark, 1986). Tuite (1999) argues that the “three ergativities” in the Caucasus are morphosyntactically very different, each more similar to ergative patterns attested elsewhere in the world than to one another (see also the Introduction to this volume). Ejectives may be acoustically different even in languages of the same family (see Grawunder, 2017, for a thorough overview of the Caucasus as a phonetic contact area).

It seems that, as with many other areas of language contact, it is more useful to look for features shared by some, rather than all, or even most, languages. It also seems that the boundaries of language families in the Caucasus are less transparent to language contact than in some other areas, such as Mesoamerica or Amazonia. Strong cross-family contact is observed in Ossetic (influence from Nakh languages) and in Armenian (influence from Iranian and Turkic languages). Nakh-Dagestanian shows lexical contact in border languages, such as lexical borrowings from Georgian into Tsezic, while there are many features that seem to result from structural alignment within the Nakh-Dagestanian family itself.

60 Nina Dobrushina, Michael Daniel, and Yuri Koryakov

1.6.1 Lexical Borrowing

Loanwords are an obvious point of interest for historical studies of Caucasian languages. Lexical contact in Armenian and Georgian is a topic of interest for traditional comparative linguists (for a recent overview of Armenian etymologies, see Martirosyan, 2010; for loanwords in Nakh-Dagestanian and Northwest Caucasian, see Höhlig, 1997; Khalilov & Comrie, 2010; Klimov & Khalilov, 2003; Nikolaev & Starostin, 1994; Shagirov, 1989; for Kartvelian etymologies, see Klimov, 1998a; various Caucasian etymologies are summarized at http://starling.rinet.ru).

Lexical borrowings in the Caucasus are included in the World Loanword Database (WoLD) project, whose perspective is mainly sociolinguistic (Haspelmath & Tadmor, 2009). In addition to several languages with a long tradition of comparative lexical research, such as Ossetic and Armenian (Indo-European), Georgian (Kartvelian), or Kumyk and Azerbaijani (Turkic), WoLD also covered less well-studied and/or minority languages, Archi and Bezhta (Nakh-Dagestanian; see Chumakina, 2009a, 2009b; Comrie & Khalilov, 2009a, 2009b).

Taking Nakh-Dagestanian languages as an example, based on WoLD and other sources, the case may be made for distinguishing between “vertical borrowing,” with loanwords coming from “external,” culturally dominant languages, such as Arabic, Turkic, Iranian, and, more recently, Russian, and “horizontal borrowing,” with loanwords between branches of the same family, or, more rarely, across family borders. Horizontal borrowing may occur from locally important languages to minority languages, such as Lak borrowings in Archi, or Avar and Georgian borrowings in Bezhta (cf. Bezhta ɣadri ‘embers’ (Georgian ɣadari), kanɬi ‘light’ (Avar kanɬi) versus ulka ‘country’ (Turkic ülke), insan ‘someone, person’ (Arabic insa:n), picka ‘match’ (Russian spíčka)) (Comrie & Khalilov, 2009b).
Vertical borrowings may have entered minority languages via locally important languages (Avar in the case of Bezhta), which blurs the sociolinguistic relevance of this difference.

An ongoing project on quantitative analysis of horizontal lexical borrowings in Dagestan has already yielded some results (see Chechuro, 2018). In the south, where bilingualism in Azerbaijani was almost universal (including some villages shifting to Azerbaijani, see section 1.5), the presence of Turkic loanwords is far more visible. Across Andic languages, bilingualism in Azerbaijani or Kumyk was much less widespread, sometimes even non-existent, and Turkic loans may have been acquired via contact with other Nakh-Dagestanian languages. Andic languages show very strong influence from Avar, which served as the local lingua franca (see Magomaeva & Khalilov, 2005; Dobrushina & Zakirova, 2019). The results of this ongoing project correlate with the data on local multilingualism (http://multidaghestan.com).

During fieldwork in 2018, we collected lexical lists in the Qax region of Azerbaijan and discovered that the Azerbaijani dialect of the villagers of Ilisu contained quite a few lexical items with ejectives (previously reported in Aslanov, 1974, referring back to a 1945 field report by Shiraliev).⁸ The presence of these items was registered in interviews with the villagers of Ilisu or with people born there. All these items were nouns designating objects such as utensils or plants; however, data collection was not systematic across the lexicon, so borrowed items from other lexical categories may also exist. Almost all these words have been identified as Tsakhur lexical items that are known to the Tsakhur in Dagestan. This is unsurprising, because Ilisu is known to be an old center of Tsakhur presence in the territory of present-day Azerbaijan (the former Sultanate of Elisu). Surprisingly, none of the interviewees identified these lexical items as Tsakhur loans; they were all claimed to be merely characteristic of the local dialect of Azerbaijani. Additionally, in their speech, voiced consonants in some Azerbaijani words of clearly Turkic origins had a more or less articulated ejective realization (such as t’ana for Azerbaijani dana ‘calf’, used almost consistently). In this case, the usual direction of contact-induced change (from Turkic to Nakh-Dagestanian) is inverted, most probably due to the fact that it resulted not from bilingualism in a dominant language but from language shift from a minority language (also inferred by Aslanov, 1974). Reconstructing local patterns of language shift may thus provide insights not only into local ethnic history but also into theories of phonetic contact.

8 Unfortunately, the latter source was unavailable to us.

1.6.2 Contact-Induced Change in Grammar: An Overview

Outcomes of lexical contact in the Caucasus, such as various types of pattern replication in the lexical domain, have not yet been studied on a systematic basis.⁹ Similarly, pattern replication in grammar has only attracted considerable attention in recent decades, and the studies have so far been rather selective. Donabédian (2000, 2018) provides an overview of contact phenomena in the history of Armenian grammar, including its evolution from Classical Armenian into the drastically different system of the modern language. For a discussion of contact phenomena in Northwest Caucasian, see Höhlig (1997).

There is nascent research on contact phenomena in Nakh-Dagestanian, with its multitude of languages and contact situations inevitably leading to structural convergence, but many languages of the area have not been documented thoroughly enough to identify contact-induced changes. Some recent studies include Dobrushina (2017), who argues that the origins of a specific use of volitional moods in subordination are due to contact with Turkic; Authier (2010), who considers contact-induced morphology in Kryz (both pattern and matter copy); or the claim that the evolution of person in Batsbi is due to strong influence from Georgian (Kojima, 2019). For other studies, see also Belyaev, 2019; Desheriev, 1953, p. 85; Maisak, 2019a, 2019b. Maisak (2016b) is an overview of contact-induced change in Nakh-Dagestanian (as well as the pertinent methodological issues), and the discussion of contact-induced change in Nakh-Dagestanian in this section largely follows his overview.

Given the high degree of neighbor multilingualism (see section 1.5), contact-induced phenomena in grammar are expected. However, shared grammatical features which arise via the borrowing of abstract grammatical patterns are notoriously difficult to attribute to language contact. In selecting examples of apparent contact-induced phenomena, we chose phenomena where shared features cannot be easily interpreted as inherited from the protolanguage, even if they are shared by languages that have never been in direct contact. These must have spread through chains of local bilingualism; a feature is borrowed from one language to a neighboring language, and from that latter language to yet another language. Sections 1.6.3 and 1.6.4 give examples of such features.

9 See Haspelmath (2009) and Koptjevskaja-Tamm & Liljegren (2017) for a general discussion.

1.6.3 Alternations in Recipient Marking

Nakh-Dagestanian languages are cross-linguistically rare in having two distinct patterns of recipient encoding, which we will refer to as the “dative strategy” (with ez ‘to me’ in (1)) and “lative strategy” (with ju-w oq’er-mu-ra-k ‘to this pauper’ in (2)). Consider the constructions in Archi in (1) and (2):

(1) Dative Strategy in Archi
jella wiš ja-r laha pari χanum-li,
thus you.pl.gen this-2 girl.obl(erg) P. X.-obl(erg)
atʼu-li, jeb sotː-or ʟo, χir
npl.cut.pfv-cvb this.pl bead-pl npl.give.pfv behind
da-qˁa-li, ez ʟo
2-come.pfv-evid npl.i.dat npl.give.pfv
‘Thus, this girl of yours, Pari Hanum, tore off her necklace, ran after me and gave it to me.’ (Archi Electronic Corpus)

(2) Lative Strategy in Archi
kʼan harak-du-t iq-n-a ja-r ɬːanna
most before-attr-4 day-obl-in this-2 woman.obl(erg)
čʼut bo-ʟo-li ju-w oqʼer-mu-ra-k da‹b›χi-s
jug 3-give.pfv-evid this-1 pauper-obl.1-cont-lat ‹3›hit-inf
‘On the (very) first day this woman gave this pauper a jug (of butter) to churn.’ (Archi Electronic Corpus)

At first glance, the difference is between permanent transfer to a recipient (“dative strategy”) and temporary transfer (“lative strategy”). The event in (1) involves the delivery of a gift, while (2) describes temporary transfer. Upon closer scrutiny, the distinction is more complex. The lative strategy is also used when the object is given back to its owner (šahʁuli apːpːas-a-l-di ‘to Shakhguli Abbas’ in (3)) or when it is given to a final recipient by a mediator (d-is ɬːanna-ra-k ‘my wife’ in (4)).¹⁰

10 For further details on this phenomenon, see Daniel (2019); Daniel, Khalilova, and Molochieva (2010).

(3) Lative Strategy in Agul
me ruš š-u-ne fac-u-na qa-i-ne
this girl go-pfv-aor catch-pfv-cvb re-give.pfv-aor
šahʁuli apːpːas-a-l-di ħajwan.
Shakhgul Abbas-obl-super-lat horse
‘This girl went there, caught the horse and gave it back to Shakhguli Abbas.’ (Agul Electronic Corpus)

(4) Lative Strategy in Archi
un daki ʟo-tʼu d-is
2sg(erg) why 4.give.pfv-neg 2-I.gen
ɬːanna-ra-k is amanat bo-li
woman.obl-cont-lat 4.I.gen present say.pfv-evid
‘Why didn’t you give my wife the present that I gave (to you for her)—he asked.’ (Archi Electronic Corpus)

These subtle semantic distinctions are consistent across all languages of the family, from the Nakh languages in the northwest to the Lezgic languages in the north of Azerbaijan. The only known Nakh-Dagestanian language that lacks this distinction is Udi, a Lezgic outlier in central Azerbaijan; Udi was cut off from the geographic continuity of Nakh-Dagestanian languages for centuries. From Nakh languages, the distinction also expanded into both Iron and Digor Ossetic (Belyaev, 2019; Belyaev & Daniel, 2014a, 2014b). On the other hand, this feature is probably not inherited from an ancestral language, because the lative strategy in different languages is encoded by different morphological means that are certainly not cognate in all languages. Even in the two Lezgic languages considered here, Archi and Agul, the spatial markers used in the lative strategy are semantically different (cont vs. super), and neither of the lative markers is cognate. Marking the distinction between the two types of transfer is cross-linguistically rare. Outside the Nakh-Dagestanian family, a similar pattern is known to exist in Dravidian.

1.6.4 Ordinal Numerals Formed with ‘Say’

In Mehweb Dargwa, ordinal numerals are formed by combining the root of the cardinal numeral with the attributive form of the verb ‘say’ (lit. ‘one (to which one should) say N’):

(5) Mehweb Dargwa
k’ʷi-e-s-i
two-say.ipfv-inf-attr
‘second’

This way of forming ordinal numerals is attested in Lezgic, Dargwa, Lak, and Tsezic, as well as in Akhvakh and some dialects of Avar; that is, in all branches of the family except Nakh and Khinalug (in Khinalug, Azerbaijani loans are used). Thus it seems that the construction is common to the languages from central to southern Dagestan (Maisak, 2016b). As further argued in a comprehensive family-wide overview of the data in Nasledskova and Netkachev (to appear), this strategy of forming ordinal numerals does not seem to be cross-linguistically frequent (if attested at all). The verb ‘say’ used in this construction is not necessarily cognate in all branches and comes in slightly different morphological forms. This suggests another case of structural alignment which could have spread through chain bilingualism.

1.6.5 ‘Find’ as an Epistemic Auxiliary

In Mehweb Dargwa, the verb ‘find’ is used in several constructions related to epistemic and evidential domains.

(6) Direct Evidential Use of ‘Find’ in Mehweb Dargwa
šejt’at-une-jni id-di d-arʔ-i-le
devil-pl-erg this-pl npl-gather.pfv-pst-cvb
ar-d-uχ-i-le d-arg-i-le le-r
pv-npl-bring.pfv-pst-cvb npl-find.pfv-pst-cvb cop-npl
‘(When he came back to the place where he dropped the gold) he discovered that the devils picked it up and carried it away.’

(7) Epistemic Use of ‘Find’ in Mehweb Dargwa
abzul=la huj-ni abx-i-le d-urg-a-re ʡʷaˤnd
all=add road-erg open-pst-cvb npl-find.ipfv-irr-pst ptcl
b-ik’-a-re b-emž-ul-le.
hpl-come.ipfv-pot-pst hpl-be.hot.pfv-ptcp-cvb
‘(The windows) must have been open all the way (lit. on all the road), otherwise we would have suffocated in the heat.’

(8) Conditional Use of ‘Find’ in Mehweb Dargwa
nu q’oˤj-ħe w-arg-a-k’a, uk-iša.
I go.ipfv.ptcp-in(lat) m-find.pfv-irr-cond m.lead.pfv-fut.ego
‘If (you go) where I go, I will give you a ride.’

In these three cases, three different forms of the verb ‘find’ are used as quasi-auxiliaries with lexical verbs to convey three very different meanings. The perfect form in (6) is used to convey direct evidentiality; in (7), the future form is used for presumptive evidentiality (i.e., deduction by knowledge of the real world); and in (8), the conditional converb is used in a periphrastic conditional. All three constructions with ‘find’ are also attested in some Andic and Tsezic languages, Avar, and Archi. Avar, Andic, and probably Tsezic are distantly related within the family and might have inherited the construction. Archi and Mehweb Dargwa belong to other branches of Nakh-Dagestanian, but form exclaves in Avar-speaking territories. They have probably been in a state of constant bilingualism in Avar for many centuries, so the presence of the ‘find’ constructions makes a strong case for contact-induced change. The presence (or absence) of these uses of ‘find’ in other Nakh-Dagestanian languages still needs to be checked, but it does not seem to be attested in Lezgic languages other than Archi.

1.6.6 Differential Object Marking in Udi

Udi, a Lezgic outlier of central Azerbaijan, features differential object marking (DOM). In (9) the form is ereqː-a ‘hazelnut (Dat)’, and in (10), it is ereqː ‘hazelnut’:

(9) Differential Object Marking in Udi
(…) ajiz-e 60-ǯi usen-χo-stːa ereqː-a üše
village-loc 60-ord year-pl-ad hazelnut-dat at.night
bašqː-esun modaχ=e=j.
steal-nmlz common=3sg=pst
‘In the village, it was common in the sixties to steal hazelnuts in the night.’ (Kasyanova, 2017, p. 630)

(10) Differential Object Marking in Udi
hälä ereqː toj-stː-a bar=te=tːun=ne=j
yet hazelnut sell-lv.inf-dat let=neg=3pl=lv.prs=pst
‘It was not permitted to sell hazelnuts.’ (Kasyanova, 2017, p. 630)

Kasyanova (2017) argues that the primary factor triggering DOM in Udi is the referentiality of the object. Definite objects tend to be marked with a dative, while other objects are in the nominative (unmarked). Kasyanova notes that DOM is not attested in other Nakh-Dagestanian languages, but it is found in Azerbaijani, Farsi, and Armenian. DOM, while widespread cross-linguistically, seems to be rare in ergative languages (Malchukov, 2006). Thus, it is very likely that its presence in Udi is due to contact with surrounding languages. In Turkic languages, DOM is also sensitive to the referentiality of the object. In Armenian, the primary condition on DOM is animacy, but within animate nouns it is highly referential nouns that are marked. Whatever the exact scenario, the emergence of differential object marking in Udi seems to be a relatively clear case of a contact-induced phenomenon.

1.7 Conclusion

On the verge of the 21st century, “the amazing historical stability of the linguistic landscape of the Caucasus is coming to an end” (Gippert, 2008, p. 161). The populations in Azerbaijan, Georgia, and Armenia are rapidly losing proficiency in Russian. The respective national languages of these countries have been bolstered by recent transitions to independence. They are now used in most domains of education and culture. At the same time, minority languages of Azerbaijan, Georgia, and Armenia did not benefit from independence. The dominant language simply switched from Russian to the national language of the country: Azerbaijani, Georgian, and Armenian, respectively. Most minority languages are not taught at school, are not used for reading or writing, and have low prestige in their communities. In Abkhazia and Ossetia, Russian has stayed strong, and the use of Abkhaz and Ossetic has not expanded into new domains.

The parts of the Caucasus which remained part of Russia are all characterized by the prevalence of Russian. Local languages (e.g., Ingush, Chechen, Ossetic, Adyghe, and many languages of Dagestan) are well-preserved in rural areas but are largely lost in cities. Since the majority of the population of the Caucasus live in villages, linguistic diversity is still preserved, but in some areas, local languages are endangered, primarily because of migration to the lowlands.

Acknowledgements

We are grateful to experts who shared their knowledge about the current sociolinguistic situation in various regions of the Caucasus as well as references to the relevant literature: Irina Bagirokova, Oleg Belyaev, Viacheslav Chirikba, Anaïd Donabédian, David Erschler, Diana Forker, Adigoezel Haciyev, Victoria Khurshudian, Yury Lander, Tamar Makharoblidze, Zarina Molochieva, Johanna Nichols, Alexander Rostovtsev-Popiel, Murad Sulejmanov, Kevin Tuite, and Arseny Vydrin. The chapter was prepared within the framework of the Basic Research Program at the National Research University Higher School of Economics and supported within the framework of a subsidy by the Russian Academic Excellence Project “5-100”.

Chapter 2

North Caucasus: Regions and Their Demography

Konstantin Kazenin

The goal of this chapter is to provide basic information on the ethnic composition of the North Caucasus, with emphasis on changes occurring over the last several decades. The chapter compares census data from different Soviet and post-Soviet years and discusses the major reasons which underlie population shifts (such as voluntary and forced migrations and changes in birth rate). Major interethnic conflicts of recent decades in the North Caucasus are also surveyed. Section 2.1 discusses northeastern regions; section 2.2 discusses the northwestern ones. Some of the discussion in this chapter complements the material in chapter 1.

2.1 Northeastern Caucasus

The northeastern Caucasus includes three republics within the Russian Federation: Dagestan, Chechnya, and Ingushetia. All three have a small Russian population (less than 5%) and a very high proportion of Sunni Muslims (more than 90%). However, they differ considerably in ethnic composition and recent patterns of migration.

2.1.1 Dagestan

Dagestan is the most populous and diverse region of the North Caucasus. Its population was 3,041,900 in 2017, and its ethnic diversity is the richest in Russia (the exact number of peoples depends on how certain minorities are grouped). Dagestan is the easternmost republic in the North Caucasus, bordered by the Caspian Sea to the east and connected by road to Azerbaijan in the south. Mountains cover almost half of Dagestan, and population density in the mountains is much higher than in other North Caucasian republics.

Table 2.1 shows the ethnic composition of Dagestan according to the Soviet censuses of 1959 and 1989 and the Russian census of 2010 (some minorities are not included). Six groups account for nearly 85% of the entire population: Avars, Dargwa, Kumyks, Lezgians, Laks, and Tabasarans. Among them, Avars, Dargwa, Laks, and Tabasarans originate from local mountain ranges and have migrated in great numbers to the valleys since the second half of the twentieth century. Lezgians are the most numerous people of southern Dagestan, residing in both mountains and valleys. They also form a majority in northeastern Azerbaijan. Kumyks reside almost exclusively in the lowlands of the republic and in the foothills of the Caucasus Mountains.

As Table 2.1 shows, all six major ethnic groups of Dagestan experienced at least 200% growth between 1959 and 2010. That growth was mainly due to a considerable decrease in mortality, especially infant mortality, in the 1950s and 1960s (Muduev, 2003, p. 41), after which the birth rate remained high for several decades (Kazenin & Kozlov, 2016). A radical decrease in birth rate starting from the 1990s suggests that the net population increase observed from 1959 to 2010 will not continue in the near future.

Minorities originating from the mountains underwent a similar increase in population between 1959 and 2010. This includes the Aguls, Rutuls, and Tsakhurs in the highlands of southern Dagestan. Table 2.1 does not include the Andic groups or the Tsez, ethnicities closely related to the Avars who reside in the western Dagestanian mountains. They were not even registered as separate ethnicities when the Soviet censuses started in the area in 1939. In the censuses held after the collapse of the Soviet Union, some registered under separate ethnonyms, and others preferred to keep their Avar identity (see Kazenin, 2002a; on the expansion of the list of ethnicities in post-Soviet censuses, see Bogojavlenski, 2008).

Table 2.1 Major Ethnic Groups in Dagestan in 1959, 1989, 2010

Ethnic group   1959               1989               2010
Avar           239,373 (22.5%)    496,077 (27.5%)    850,011 (29.2%)
Dargwa         148,193 (13.9%)    280,431 (15.6%)    490,384 (16.9%)
Kumyk          120,859 (11.4%)    231,805 (12.8%)    431,736 (14.8%)
Lezgian        108,615 (10.2%)    240,370 (11.3%)    385,240 (13.2%)
Lak            53,451 (5.0%)      91,682 (5.1%)      161,276 (5.5%)
Tabasaran      33,545 (3.2%)      78,196 (4.3%)      118,848 (4.1%)
Agul           6,378 (0.6%)       13,791 (0.8%)      28,054 (1.0%)
Rutul          6,566 (0.6%)       14,955 (0.8%)      27,849 (1.0%)
Tsakhur        4,278 (0.4%)       5,194 (0.3%)       9,771 (0.3%)
Chechen        12,798 (1.2%)      57,877 (3.2%)      93,658 (3.2%)
Noghay         14,939 (1.4%)      28,294 (1.6%)      40,407 (1.4%)
Russian        213,754 (20.1%)    165,904 (9.2%)     104,020 (3.6%)
Azeri          38,224 (3.6%)      75,463 (4.2%)      130,919 (4.5%)
Tat            19,155 (1.8%)      25,978 (1.4%)      455 (0.1%)

Source: Soviet population censuses of 1959 and 1989, Russian population census of 2010.

Dagestanian Chechens reside mainly in and around the town of Khasavyurt, near the border of Chechnya. After the exile to Central Asia and Kazakhstan imposed upon them by Soviet authorities from 1944 to 1957, Chechens were not allowed to return to some of their rural homelands in Dagestan, instead settling in neighboring areas. The Auch municipal district of Dagestan, which was, for the most part, inhabited by Chechens in 1944, has yet to be restored. This has caused interethnic tensions in those areas historically inhabited by Chechens (Kazenin, 2013b).

Official statistics classify two distinct groups of people as “Azeri.” One group consists of Azeris mainly residing in the old quarters of Derbent, the biggest town in southern Dagestan, located on the Caspian Sea. This population is composed mostly of Shia Muslims. The second group, the Sunni Azeris, reside in rural valleys in southern Dagestan and in the towns of Derbent and Dagestanskiye Ogni. The two groups differ in their dialects and in self-identification.

The Tat population in Dagestan has shown the most drastic decrease in recent decades (from over 25,000 in 1989 to less than 500 in 2010). Of the three confessional groups (Jewish, Muslim, and Christian) associated with this Iranian minority, mainly the Jewish Tats are present in Dagestan.
Dagestanian Tats are mainly concentrated in the town of Derbent; after the collapse of the Soviet Union, many moved to Israel or scattered throughout regions of Russia.

Russians in Dagestan have settled primarily in two areas. One includes major cities, mainly the capital, Makhachkala, and the capital’s satellite, Kaspiysk. A boom in the Russian population took place during the Soviet era, when officials supported migration into the region by industrial workers and engineers who were employed at the plants built there. Russians are also concentrated in the north of the republic, mainly in the town of Kizlyar and the rural valleys around it. Russian Cossacks have inhabited that area since at least the eighteenth century. Intensive migration of Russians out of the region started in the 1960s and 1970s and resurged after the collapse of the Soviet Union (Belozerov, 2005, pp. 54–108). The ongoing movement of indigenous peoples from the mountains into areas where Russians used to live is the most commonly accepted reason for the migration of Russians out of the region (see more in section 2.1.2), in addition to cuts in the number of workers at those plants where many Russians were employed.

Noghays are a Turkic people found in Dagestan and some other areas of southern Russia. In Dagestan, the largest group of Noghays lives in the northernmost valleys near the border with Stavropol. There are also Noghays living in some areas close to Makhachkala, along the Sulak River by the Caspian Sea. The Noghay population’s modest growth between 1989 and 2010, compared to that of the other major peoples in Dagestan, can be explained by an earlier decrease in the birth rate, as well as their intense migration out of Dagestan (mainly to western Siberia, but also to the Astrakhan region in the southern part of the Volga River Basin).

Since the 1950s, indigenous groups have moved in great numbers from the mountains (highlands) to the valleys (lowlands). This migration and the urbanization of local populations have radically changed the ethnic composition of the Dagestanian lowlands. Three major waves of migration during the Soviet era can be identified, each differing in its causes and conditions (see Osmanov, 2000; for a general economic and demographic overview of post-Soviet Dagestan, see Radvanyi & Muduev, 2007).

The first wave was related to the deportations of the 1940s. After Chechens were sent into exile in 1944, about 60,000 of Dagestan’s highland residents were forced to resettle the areas previously inhabited by the Chechens (Poljan, 2001, p. 133), including areas in today’s Chechnya and Dagestan. After Chechens were allowed to return to their homeland in 1956, Dagestanians who had moved there from the mountains were resettled once again, this time in the rural valleys of Dagestan.

A second migration took place due to the agricultural policies of Soviet authorities, who distributed huge swaths of pastureland in the valleys among cattle breeders who were originally based in the highlands. This land redistribution resulted in the relocation of agricultural workers and their families from the highlands to the rural lowlands. On many occasions, a minor migration started by small groups of agricultural workers was later augmented by many others who followed from their mountain homelands. Settlement communities of migrants, some with more than 500 households, gradually spread throughout pastoral lands. In 2010, more than 100 such settlements were reported by local authorities, some without any legal status. Their total population is estimated to be around 50,000 (Kazenin, 2012).
Finally, in the 1960s and 1970s, Soviet authorities prompted migrations in the wake of earthquakes and other natural disasters in the mountains. Then, after the collapse of the Soviet Union, both rural-to-urban and mountain-to-valley migration accelerated noticeably due to poor economic conditions in rural Dagestan, especially in the highlands. Unfortunately, official statistics only reflect data on rural-to-urban migration, not rural-to-rural migration from highlands to lowlands. Table 2.2 shows the proportions of the urban population among major ethnic groups according to the 1989 and 2010 censuses. Actual numbers were probably higher, as migration in the northern Caucasus was underrepresented by the 2010 census, since many migrants were registered in their homeland (Mkrtchyan, 2011). In the decades before the collapse of the Soviet Union, the urban population of indigenous peoples in Dagestan was also growing; for example, the Avar urban population grew threefold between 1959 and 1989 (Belozerov, 2005, p. 237).

Table 2.2 Percentage of Urban Populations, 1989 and 2010

Ethnic group   1989      2010
Avar           30.77%    36.94%
Dargwa         31.54%    40.14%
Kumyk          47.30%    50.19%
Lak            62.25%    71.36%
Lezgian        38.02%    45.70%
Tabasaran      33.05%    44.91%

Source: Soviet population census of 1989, Russian population census of 2010.

Migrants originated from areas that were, for the most part, monolingual and often represented just one dialect of a particular language. Generally, this is not the case in the resettlement areas. However, in many instances, large groups of migrants speaking the same language or dialect have continued to live in relatively compact groups after relocating to the lowlands. A considerable number of rural settlements in the valley are inhabited by migrants from just one highland village; occasionally, people from two or three villages are intermixed in such settlements. This provides relatively favorable conditions for the preservation of their native languages.

Currently, some parts of the Dagestanian lowlands are among the rural territories of Russia with the highest population density (Èldarov, Holland, Abdulagatov, Aliev, & Ataev, 2007). This causes intensive migration out of the region into other areas of the Russian Federation. For instance, there are sizable Dargwa diasporas in the neighboring Stavropol region (49,302 in 2010) and Kalmykia (7,590 in 2010), mainly in agricultural areas, and in western Siberia (3,722 in Tyumen Oblast in 2010), where they are mainly employed in oil fields.
Lezgians also migrated intensively to western Siberia (16,247 in Tyumen Oblast in 2010), while many Avars (4,719 in 2010) reside in the city of Astrakhan on the northern shore of the Caspian Sea, where they are involved in retail for the most part (for details on post-Soviet migrations from Dagestan, see Karpov & Kapustina, 2011).
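
The claim in this section that all six major ethnic groups of Dagestan grew by at least 200% between 1959 and 2010 can be checked directly against the counts in Table 2.1. The following is a minimal Python sketch of that arithmetic; the names `COUNTS` and `percent_growth` are hypothetical and introduced only for illustration, and the figures are copied from the 1959 and 2010 columns of the table.

```python
# Census counts for Dagestan copied from Table 2.1: (1959, 2010).
COUNTS = {
    "Avar": (239_373, 850_011),
    "Dargwa": (148_193, 490_384),
    "Kumyk": (120_859, 431_736),
    "Lezgian": (108_615, 385_240),
    "Lak": (53_451, 161_276),
    "Tabasaran": (33_545, 118_848),
}


def percent_growth(start: int, end: int) -> float:
    """Relative increase from start to end, expressed in percent."""
    return (end - start) / start * 100


# Growth per group, rounded to one decimal place.
growth = {
    group: round(percent_growth(start, end), 1)
    for group, (start, end) in COUNTS.items()
}

for group, pct in growth.items():
    print(f"{group}: {pct}% growth, 1959 to 2010")
```

Note that growth of 200% corresponds to a tripling of the population, so the check is equivalent to asking whether each 2010 count is at least three times the 1959 count.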

2.1.2 Chechnya

Chechnya (the Chechen Republic), centrally located in the northeastern Caucasus, borders Dagestan to the east and Ingushetia to the west. Among North Caucasian republics, Chechnya is the second most populous (1,414,865 as of January 1, 2017), with Chechens comprising 95.3% of the population according to the 2010 census. Between the 1920s and 2000s, the administrative borders and ethnic composition of today’s territory of the Chechen Republic underwent a number of dramatic changes, which are partly reflected in its current demographic and political situation.

In 1922, during the first decade of the Soviet state, Chechens first received their own ethnic autonomous district (Rus. avtonomnaja oblast´). The territory of that district roughly corresponded to today’s Chechen Republic without its northernmost areas. Initially, the population of the autonomous district was composed almost entirely of ethnic Chechens (94.0% according to the 1926 census). Later on, however, territories where Russian populations prevailed were incorporated into the Chechen autonomous district. In 1929, Grozny, an industrial town where Russians composed 70% of the population and Chechens only 2% (1926 census), became a part of the autonomous district, along with some rural areas populated by Russian Cossacks. Despite the close proximity of the Chechen and Russian areas, migration between them was very uncommon at that time. Chechens resided in both the highlands and the lowlands of what is today’s Chechen Republic, whereas Russians inhabited only the lowlands.

In 1934, the Chechen and Ingush autonomous districts were merged into the Chechen-Ingush District (in 1936 it became the Chechen-Ingush Autonomous Soviet Socialist Republic within the Russian Federation). This did not seriously affect the ethnic composition of the territory, because the Chechen and Ingush populations infrequently moved into each other’s territories during the time they shared an administrative district, except for some migration of Ingushes, mainly officials and their family members, to the capital, Grozny.

The next set of changes occurred in 1944, when the Chechens and Ingushes were accused of collaborating with the Nazis who invaded parts of the North Caucasus during World War II. Based on these accusations, Chechens and Ingushes were forcibly resettled to Central Asia. The Chechen-Ingush Autonomous Soviet Socialist Republic was dismantled. Most of its territory was absorbed into the newly established Grozny District (named after its capital, Grozny), where Russians became the majority. Some territories in the east of the former Chechen-Ingush Republic were transferred to Dagestan, and the Dagestanian population was forced to relocate there (see section 2.1.1).

In 1956, the Chechens were allowed to return to their homeland, and the following year, the Chechen-Ingush Autonomous Republic was reestablished. Most Chechens who had been living in the highlands before the exile preferred to live in the lowlands after their return, and many of those who had been living in rural areas became urbanized.
This resulted in an overlap between the Russian and Chechen populations and in the depopulation of the highland areas. For instance, in the rural Shelkovskoy District to the northeast, where Chechens composed less than 1% of the population before exile, their population had grown to 19.4% by 1970. Changes in the ethnic composition of the lowlands were also due to a higher birth rate among Chechens as compared to Russians (Belozerov, 2005, pp. 231–237). The growth of the lowland Chechen population, especially in Grozny, caused a number of violent conflicts after their return from exile (Kozlov, 2006). Although the majority of Chechens returned from Central Asia as early as 1956–1959 (Poljan, 2001, pp. 160–162), their return continued through the 1960s and 1970s, causing their population to grow even more (see Table 2.3).

After the collapse of the Soviet Union, the situation in Chechnya started to change dramatically. In 1991, Chechen separatists led by a former Soviet Army general, Dzhokhar Dudaev, declared the independence of the Chechen Republic, which was then given the name the Chechen Republic of Ichkeria. This set off an outburst of violence, as a result of which the government of the Chechen-Ingush Autonomous Republic ceased to exist by the beginning of 1992, and Chechnya became predominantly controlled by separatists (and partly by unofficial militias which competed with the separatists and declared themselves pro-Russian). Ingushetia separated from Chechnya and became a republic of the Russian Federation in the same year (see section 2.1.3 for details).

Table 2.3 Major Ethnic Groups of Chechnya in 1970, 1989, and 2010

Ethnic group   1970               1989               2010
Chechen        499,962 (54.7%)    715,306 (66.0%)    1,205,551 (95.1%)
Russian        329,701 (36.1%)    269,130 (24.8%)    24,382 (1.9%)
Ingush         14,543 (1.6%)      25,136 (2.3%)      1,296 (0.1%)
Kumyk          6,865 (0.8%)       9,591 (0.9%)       12,221 (1.0%)
Noghay         5,503 (0.6%)       6,885 (0.6%)       3,444 (0.3%)
Avar           4,196 (0.5%)       6,035 (0.6%)       4,864 (0.4%)

Note: Unlike other regions, we provide census results from 1970 for Chechnya because the ethnic composition reflected in the 1959 census was questionable, due to the ongoing return of Chechens and Ingushes from Central Asia in exactly that period.

Source: Soviet population censuses of 1970 and 1989, Russian population census of 2010.

The military conflict between separatists and the Russian Army from 1994 to 1996 (the First Chechen War) did not bring Chechnya under the control of the Federal government. The Kremlin was more successful in the Second Chechen War (1999–2002), which resulted in the defeat of the separatists and the authorization of a pro-Russian government, wherein former separatist commanders played, and still play, the main role. After the Second Chechen War, the Kremlin started large-scale investments in restoring the physical and social infrastructure of the Chechen Republic, which was declared a separate region of the Russian Federation in its Constitution of 2003. The military conflicts had several consequences for the population and the ethnic composition of the Chechen Republic. Foremost, they resulted in large-scale losses of life and many refugees. During the First Chechen War, human rights activists were the main source of data for civilian casualties. The Russian human rights center Memorial reports that 30,000 to 40,000 civilians were killed in Chechnya from 1994 to 1996. The Federal government reported 17,000 losses among separatists. Assessments of civilian losses during the second conflict vary between 1,000 (Russian officials) and 25,000 (Amnesty International). Russian assessments of the separatists’ losses from that war vary between 10,000 and 15,000. The number of refugees from both wars amounted to tens of thousands, although exact estimates made by military observers vary. The highest concentration of refugees was in the neighboring republic of Ingushetia. Until 2010, most refugees who returned to Chechnya came from Ingushetia (see section 2.1.3). The number of refugees who went to other regions is hard to assess due to the unreliability of migration statistics. The years of separatism and war turned Chechnya into an ethnically homogeneous region for the first time in its history. 
This was reflected in the 2002 and 2010 census results (for 2010, see Table 2.3). The decrease in the Russian population in 2010 was especially noticeable when compared to 1989. Unlike the Chechens, most Russians who left the area during wartime never returned. The number and proportion of Ingushes also decreased after 1989. At the time of the Chechen-Ingush Autonomous Republic, Ingushes in Chechnya were found only in the present-day city of Grozny. Their numbers in today’s Chechnya have dwindled from 25,136 in 1989 to 1,296 in 2010, which shows the intensity of their emigration, mainly to Ingushetia.

Since the split between Chechnya and Ingushetia in 1992, no border between the two republics has been officially agreed upon. However, the unofficial border was configured so that no predominantly Ingush areas are included in Chechnya. Nevertheless, in 2018 debates concerning the borders of the republics intensified. After the head of Chechnya, Ramzan Kadyrov, and the head of Ingushetia, Junus-Bek Evkurov, signed a border demarcation agreement in October 2018, more than 10,000 people rallied in Ingushetia, protesting against the secretive nature of the agreement and the particulars of how the border was settled.

There are only three minorities in Chechnya with more than 2,000 people: Kumyks, Noghays, and Avars. Kumyks (12,221 in 2010) reside mainly in some villages in eastern Chechnya. Noghays (3,444 in 2010) reside in the northern part of Chechnya, close to the Dagestanian Noghay populations. The Noghay population reduced twofold from 1989 to 2010 because, during wartime, they fled to other regions of Russia, including Dagestan, Astrakhan, and Tyumen, where large Noghay communities already existed. Finally, Avars, though mostly found in Dagestan, live in some eastern villages in Chechnya (4,864 in 2010). They are mainly descendants of the Avars relocated to Chechnya by Soviet authorities after the Chechens were exiled in 1944 (see section 2.1.1). Only a very small part of the Avar population stayed in Chechnya after the Chechens returned in 1957. In Borozdinovskaya, the village with the highest concentration of Avars today, Avars experienced severe pressure from some paramilitary groups in 2005, when 11 local residents, still missing today, were taken away by force from the village.
After that, nearly 1,500 Avars migrated from Borozdinovskaya to Dagestan.

All in all, in less than a century, the territory of today’s Chechen Republic has transitioned from a multiethnic region with separate residential areas for each major ethnic group to a region with interethnic mixing, and, most recently, to an almost entirely monoethnic region. The post-war restoration of Chechnya improved living conditions, both in towns and in major rural settlements, with the result that post-war urbanization remained limited (possibly also because it was partially controlled by local authorities). Although the population of the regional capital, Grozny, which was totally rebuilt after the wars, grew 38% between 2002 and 2017, Chechnya still has one of the highest proportions of rural residents in Russia (65% in 2016).

High population growth is another demographic characteristic of Chechnya. Its population grew 11.5% between the 2010 census and January 1, 2017. This may be due in part to the high birth rate, which was 23‰ in 2016, almost double that of Russia as a whole.

2.1.3 Ingushetia

Ingushetia (the Republic of Ingushetia) has the smallest territory of all Russian regions in the northern Caucasus (3,628 square km). As a distinct region of Russia, its present borders were established in 1992. Before that, between the 1920s and 1950s, the territory of today’s Ingushetia had changed several times. From 1957 to 1991, it was a part of the Chechen-Ingush Autonomous Soviet Socialist Republic within the USSR. When Chechnya claimed independence in the 1990s, Ingushetia remained a part of the Russian Federation.

Almost immediately after its formation, the Republic of Ingushetia received two dramatic influxes of people. Both were caused by nearby incidents of mass violence. One began after a conflict in North Ossetia. By the time of the Soviet Union’s collapse, tens of thousands of Ingushes lived in North Ossetia (32,723, according to the census of 1989, but that is purported to be an underestimate). More than 50% of Ingushes in North Ossetia were concentrated in the Prigorodny District, a territory that belonged to the Chechen-Ingush Autonomous Republic before 1944. Upon their return from exile in 1957, Ingushes were allowed to reside in their homeland, part of which remained within the territory of North Ossetia. In October 1992, a bloody conflict between the Ossetic and Ingush populations broke out in Prigorodny District, with the following casualties: 583 dead, 939 injured, and 261 missing (based on the Russian authorities’ reports). As a result of that conflict, almost the entire Ingush population of North Ossetia was forced to move to Ingushetia. Estimates for the number of refugees ranged from 30,000 to 60,000, depending on the source. Although a fraction of those refugees has since returned to North Ossetia (see section 2.2.1), an uncertain number of refugees and their adult children still remain in Ingushetia.

In addition to the conflict in North Ossetia, the other major cause of people relocating to Ingushetia was extensive violence in Chechnya, whose declaration of independence resulted in two large wars, described in section 2.1.2.
Almost all the Ingushes who lived in the capital, Grozny (21,346 according to the 1989 census), fled to Ingushetia when Grozny became the center of the rebellion in Chechnya. Many Chechen refugees fled to Ingushetia as well. According to the Danish Refugee Council, the total number of refugees from Chechnya to Ingushetia was nearly 106,000 in 2002. In the same year, the Russian census registered 95,403 Chechens in Ingushetia. Afterward, Chechen refugees either returned to Chechnya or relocated to other parts of Russia. From 2002 to 2010, the number of Chechens in Ingushetia shrank fivefold. However, very few, if any, Ingushes who fled Chechnya in the 1990s have returned to Chechnya (see Table 2.4 for changes of ethnic proportions in Ingushetia between 1959 and 2010).

Table 2.4 Major Ethnic Groups of Ingushetia in 1959, 1989, and 2010

Ethnic group   1959              1989               2010
Ingush         44,634 (40.6%)    138,626 (74.5%)    385,537 (93.5%)
Chechen        5,643 (5.1%)      19,195 (10.3%)     18,765 (4.5%)
Russian        51,549 (46.9%)    24,641 (13.2%)     3,215 (0.8%)

Note: For 1959 and 1989, the numbers are given for those administrative units of the Chechen-Ingush Autonomous Republic whose territory was mainly included in the Republic of Ingushetia in 1992.

Source: Soviet population censuses of 1959 and 1989, Russian population census of 2010.

Overall, immigration has considerably increased the total number of Ingushes who now permanently reside in Ingushetia, although it is hard to assess how many of them are actually migrants. Migration of Russians out of the region and a large disparity between their birth rates and those of local Ingushes, who have the highest birth rate in Russia, have made Ingushetia nearly monoethnic; Ingushes compose 93.5% of the population according to the 2010 census. The overall population growth rate is very high. At present, the lowest population density in Ingushetia is found in the southern highlands, where less than 5% of the total population resides. The highest population density is found in the central part of the region, where towns and rural settlements form an agglomeration along the federal highway. The current proportion of the urban population is estimated to be 41.8% (2017). However, the actual level of urbanization is somewhat lower, as many districts of the major towns (Nazran, Malgobek, Karabulak) consist mostly of private one-family buildings and generally have rural infrastructure. Only the newly built capital of Ingushetia, Magas, has full-fledged urban infrastructure. Its population, however, is estimated at only 7,818 in 2017.
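
The "fivefold" decline in the number of Chechens in Ingushetia reported in this section follows from the two census figures quoted in it: 95,403 in 2002 and 18,765 in 2010. The short Python sketch below illustrates the arithmetic; the function name `fold_decrease` is hypothetical and used only for this example.

```python
# Chechens registered in Ingushetia, as quoted in section 2.1.3.
CHECHENS_2002 = 95_403  # Russian census of 2002
CHECHENS_2010 = 18_765  # Russian census of 2010 (Table 2.4)


def fold_decrease(earlier: int, later: int) -> float:
    """Return how many times smaller the later count is than the earlier one."""
    return earlier / later


ratio = fold_decrease(CHECHENS_2002, CHECHENS_2010)
print(f"Decrease from 2002 to 2010: {ratio:.2f}-fold")
```

The ratio comes out slightly above five, which is consistent with the rounded wording used in the text.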

2.2 Northwestern Caucasus

The northwestern Caucasus includes four republics within the Russian Federation: North Ossetia, Kabardino–Balkaria, Karachay–Cherkessia, and Adygea. Compared to the northeastern Caucasus, these republics have a higher proportion of Russians and a lower proportion of practicing Muslims. All in all, the ethnic and religious composition of these republics contrasts less with the rest of Russia than does that of the republics in the northeast. Sometimes, Stavropol Krai and Krasnodar Krai are also considered part of the northwestern Caucasus. We will not discuss them in this chapter, as Russians constitute the overwhelming majority of their population.

2.2.1 North Ossetia

North Ossetia (the Republic of North Ossetia–Alania), centered in the North Caucasus, is the only region of the North Caucasus which has a direct connection to Georgia and South Ossetia by road. It shares borders with Ingushetia to the east, Kabardino–Balkaria to the west, Stavropol to the north, and Georgia and South Ossetia to the south. The total population amounted to 703,262 in 2017. The urban population in 2017 constituted 64.2% of the total population, considerably higher than in other republics of the North Caucasus. Most of the urban population is concentrated in the capital, Vladikavkaz (324,836 in 2017).

The religious composition of North Ossetia is also unique among North Caucasian republics, as the proportion of Muslims is much lower than 50%. The two major ethnic groups, Ossetians (64.5%, according to the 2010 census) and Russians (20.6%, according to the same census), are predominantly Christian. There are some Ossetian minorities who historically identified as Muslim, but their numbers are small. Russians and Ossetians have a long history of living together, since Vladikavkaz, as early as the nineteenth century, was the North Caucasian administrative center of Tsarist Russia. According to the 2010 census, more than 54% of the Russian population of the republic was concentrated in Vladikavkaz; in rural districts, Ossetians make up almost 90% of the population. The northern part of the republic (the town of Mozdok and the surrounding rural areas) is the only area where Russians are close to a majority (49.9% in 2010). That area was added to North Ossetia in 1944 and has a very different ethnic composition from the rest of the republic (see below in this section about the Kumyks living there).

Changes in the ethnic composition of North Ossetia between 1959 and 2010 are shown in Table 2.5. A relatively small decrease in the Russian population from 1970 to 2010 is explained by moderate rates of emigration from North Ossetia, much lower than in the republics of the northeast. The increased proportion of Ossetians could be due to less emigration among them than among the Russians and, partly, to immigration (differences in birth rate are probably not relevant, as they are negligible between the two groups). In the late 1980s and early 1990s, Ossetians migrated to their homeland primarily from Georgia. According to the 1989 census, 164,055 Ossetians lived in Georgia, at that time including the South Ossetic Autonomous Oblast (SOAO), which was part of the Georgian Soviet Socialist Republic. After 1989, when a military conflict between Ossetians and Georgians broke out, Ossetians almost completely left the regions of Georgia that were outside the SOAO.
They migrated to both North Ossetia and South Ossetia. Further migration to North Ossetia took place from South Ossetia, where the living conditions were very harsh in the 1990s. Since not all of those relocations were registered, it is impossible to calculate the exact increase in the Ossetian population in North Ossetia they have led to. Russian officials estimated the number of Ossetian refugees in North Ossetia at about 30,000 by 1993 (Rossijskaja gazeta, March 10, 1993).

Table 2.5 Major Ethnic Groups of North Ossetia in 1970, 1989, and 2010

            1970              1989              2010
Ossetian    215,463 (47.8%)   334,876 (53.0%)   459,688 (64.5%)
Russian     178,654 (39.6%)   189,159 (29.9%)   147,090 (20.6%)
Ingush      6,071 (1.3%)      32,783 (5.2%)     28,336 (4.0%)
Kumyk       3,921 (0.9%)      9,478 (1.5%)      16,092 (2.3%)
Armenian    12,012 (2.7%)     13,619 (2.2%)     16,235 (2.3%)
Georgian    8,160 (1.8%)      12,284 (1.9%)     9,095 (1.3%)

Source: Soviet population censuses of 1970 and 1989, Russian population census of 2010.

Their subsequent number is even more difficult to determine because of “pendulum migration” between North and South Ossetia which often stays unregistered. The two most populous minorities of North Ossetia are the Kumyks and Ingushes. Ingushes are concentrated in the Prigorodny District between Vladikavkaz, its suburbs, and the border with Ingushetia. Their numbers fluctuated seriously in the post-Soviet era because of the conflict in the Prigorodny District in 1992 (see section 2.1.3). However, the 2010 census showed that the number of Ingushes, who were almost completely gone in 1992, had almost reached their pre-conflict level from 18 years earlier. That is partially due to the return of Ingush refugees and partially to the much higher birth rate among the Ingush population than among the Ossetian and Russian populations (Belozerov, 2005, pp. 131–137). Another Muslim minority, the Kumyks, are concentrated in the northern reaches of North Ossetia. In the Mozdok District, they composed 18.6% of the population (according to the 2010 census), outnumbered only by Russians. Georgians and Armenians are concentrated in Vladikavkaz (Armenians are also found in the northern areas, and Georgians in some far southern villages, close to the highway connecting North Ossetia with Georgia).

2.2.2 Kabardino–Balkaria

Kabardino–Balkaria is a republic in the center of the North Caucasus with an estimated population of 864,454 (2017). Approximately 40% of its territory is covered by mountains, the rest being foothills and lowlands. Three major peoples are found there: Kabardians, Russians, and Balkars. Kabardians, whose proportion of the population in the region is well above 50%, belong to the eastern branch of the Circassian ethnic group. Kabardians inhabit the rural valleys in the west and southeast of the region almost exclusively. They also compose more than 90% of the population in the western town of Baksan. Kabardians started migrating to towns outside their traditional areas in the mid-twentieth century. In the 2010 census, less than 1% of Kabardians identified themselves with the broader ethnonym, “Circassian” (see also chapter 9). Balkars, a Turkic people, had mainly inhabited highlands and neighboring foothills before 1944. In that year, they were forced to move to Central Asia by Soviet authorities, who accused them of collaboration with the Nazis during World War II. After returning to their homeland in 1957, the Balkars have maintained a relatively stable geographic range of residence. However, some Balkars, who had been living in the mountains before exile, resettled in Balkar villages around the capital, Nalchik. Later on, a small number of Balkars moved to ethnically mixed lowlands in northeastern Kabardino–Balkaria. Russians mainly reside in two areas: Nalchik and the northeastern regions, including the town of Prokhladny. In recent decades, the proportion of Russians in Kabardino–Balkaria has not decreased as sharply as in the northeastern Caucasus; this is due to less drastic rates of emigration and a narrower gap in the birth rate between the Russians and the indigenous populations.

Table 2.6 Major Ethnic Groups of Kabardino–Balkaria in 1959, 1989, and 2010

            1959              1989              2010
Balkar      34,088 (8.1%)     70,793 (9.4%)     108,577 (12.6%)
Kabardian   190,284 (45.3%)   363,494 (48.2%)   490,453 (57.0%)
Ossetian    6,442 (1.5%)      9,996 (1.3%)      9,129 (1.1%)
Russian     162,586 (38.7%)   240,750 (31.9%)   193,155 (22.5%)
Ukrainian   8,400 (2%)        12,826 (1.7%)     4,800 (0.6%)

Source: Soviet population censuses of 1959 and 1989, Russian population census of 2010.

Other ethnic groups comprise less than 10% of Kabardino–Balkaria’s total population. Meskhi Turks are among the minorities whose number has been growing in recent decades (4,162 in 1989, 11,053 in 2002, and 13,965 in 2010). Expelled by Stalin from Georgia in 1944, they were forced to move to Central Asia, and some returned to Kabardino–Balkaria after their exile was lifted in 1957 (Poljan, 2001, pp. 175–179). Another group migrated from Central Asia around the collapse of the Soviet Union. Today, they mainly reside in rural parts of eastern and northeastern Kabardino–Balkaria. Since the 1960s, internal migration in Kabardino–Balkaria has gone in two directions. First, Kabardians and Balkars tend to migrate to Nalchik and the neighboring rural settlements. The proportion of Kabardians in Nalchik changed from 18.8% in 1970 to 49.25% in 2010, and the proportion of Balkars, from 2.7% to 12.16%. Some rural settlements around Nalchik are now multiethnic, but others remain monoethnic (with either Balkars or Kabardians). Second, Kabardians and Meskhi Turks tend to migrate to the northeastern parts of the republic, where Russians are gradually leaving rural areas. The central town of the northeast of the republic, Prokhladny (57,879 in 2017), formerly a Russian Cossack settlement, has become remarkably diverse in recent decades in terms of its ethnic composition (for details on changes of “ethnic borders” due to migration within the region, see Babich & Stepanov, 2009). All in all, however, post-Soviet rural-to-urban migration was less intense in Kabardino–Balkaria compared to the northeastern Caucasus, possibly due to the higher current level of agricultural development creating more job opportunities in the rural part of this republic (see Table 2.6 for changes in the ethnic composition of the region between 1959 and 2010).

2.2.3 Karachay–Cherkessia

Karachay–Cherkessia (Karachai–Cherkessia) is a republic in the northwestern part of the Caucasus. Until 1990, when it became a separate member of the Russian Federation, it was part of Stavropol Krai, where Russians constitute a majority. The total population of the republic amounted to 466,432 in 2017. Almost 80% of its territory is covered by mountains and foothills. The major peoples are the Karachays, a Turkic people linguistically close to the Balkars of Kabardino–Balkaria; the Circassians, represented here mainly by an ethnic subgroup closely related to Kabardians; the Abazins (Abaza), who are related to the Circassians; and the Noghays, a Turkic people also present in Dagestan, Stavropol, and some other regions of Russia. None of these four major peoples constitute a majority of the total population of the region. Most of Karachay–Cherkessia’s current territory was first integrated by Soviet authorities in 1922 into a single administrative unit, which was dissolved into several parts in 1926 and reunited in 1957 as an autonomous district within the Stavropol region. After the collapse of the Soviet Union, many proposals to split the region into two or three republics were put forward, but none of them were implemented, despite considerable interethnic tension in the region at the time (Kazenin, 2009b, pp. 121–128). With the exception of the regional capital, Cherkessk (originally founded as a Cossack village), and a small number of rural settlements with high ethnic diversity, the people of Karachay–Cherkessia still live in separate communities, where most settlements are monoethnic and a few are inhabited by just two ethnic groups. Karachays were concentrated in highlands to the south and foothills to the east before their deportation to Central Asia in 1943. After their return in 1957, they only partially went back to their former homeland. Many of them chose to settle in rural areas previously inhabited almost exclusively by Russians, thereby expanding into rural territories in the northeastern and western parts of the region (Belozerov, 2005, pp. 108–130). Karachays also constitute a majority in the town of Karachayevsk, which was the center of the separate Karachay autonomous district during its short existence between 1926 and 1943.
Their proportion in Cherkessk has been increasing (from 6.2% in 1979 to 16.2% in 2010). Circassians inhabit the northwestern part of the republic. Rural Circassians reside primarily in ethnically homogeneous villages. Their proportion in Cherkessk grew from 6.4% in 1979 to 13.0% in 2010. Russians still constitute a majority in Cherkessk. Their migration from rural settlements shared with Karachays continues today. For instance, in the northeastern Prikubansky District, the proportion of Russians and Karachays was 28.4% and 47.0%, respectively, in 1979, 18.5% and 56.2% in 2002, and 17.2% and 75.69% in 2010. This trend is part of a general tendency for Russians to move from rural to urban areas in the south of the country (Nefedova, 2015). Aside from urbanization, emigration out of the republic is the main reason for the ongoing decrease in the Russian population (see Table 2.7). Differences in the birth rates of Russians and other peoples in Karachay–Cherkessia are not currently significant. Abazins are concentrated in several enclaves scattered throughout the region and in Cherkessk, whereas Noghays inhabit the northwest, close to Circassians. In rural areas, Abazin and Noghay villages tend to be monoethnic. In 2007, Abazins and Noghays were made titular ethnic groups in two respective districts. Nearly half of Abazin villages were included in the Abazinsky (Abaza) District, and the Noghaysky (Noghay) District constitutes almost the whole area of Noghay settlement in the region (Kazenin, 2012, pp. 104–127).

Table 2.7 Major Ethnic Groups of Karachay–Cherkessia in 1959, 1989, and 2010

             1959              1989              2010
Abazin       18,159 (6.5%)     27,475 (6.6%)     36,919 (7.7%)
Circassian   24,145 (8.7%)     40,241 (9.7%)     56,466 (11.8%)
Karachay     67,830 (24.4%)    129,449 (31.2%)   194,324 (40.7%)
Noghay       8,903 (3.2%)      12,993 (3.1%)     15,654 (3.3%)
Russian      141,843 (51.0%)   175,931 (42.4%)   150,025 (31.4%)

Source: Soviet population censuses of 1959 and 1989, Russian population census of 2010.

Like Kabardino–Balkaria, Karachay–Cherkessia experienced less rural-to-urban and highland-to-lowland migration than the northeastern republics did after the collapse of the Soviet Union. However, the migration of Russians out of rural areas, together with the migration of all ethnic groups to the regional capital, still changed the ethnic composition of this area dramatically in the post-Soviet period.

2.2.4 Adygea

Adygea (Adyghea) is the westernmost republic of the North Caucasus; it borders Krasnodar Krai, a major region of Southern Russia. Adygea was established in 1937 as an autonomous district within Krasnodar Krai, with borders similar to the current ones. In 1990, Adygea separated from Krasnodar Krai and became a separate region of the Russian Federation. Unlike other republics of the North Caucasus, the titular ethnic group of Adygea, the Circassians (Adyghe), are actually a minority in its population. They have never amounted to even a quarter of Adygea’s population, whereas the proportion of Russians has stayed above 60% (see Kabuzan, 1996, p. 117; Kazenin, 2009b, pp. 14–20). In fact, at the time of its formation, Adygea combined predominantly Circassian and Russian territories, and the population of the latter outnumbered the population of the former. Differences between parts of the region in ethnic composition have been preserved throughout the republic up to the present day. When considering the ethnic composition of Adygea, some shifts in ethnic terminology should be taken into account. In the 2010 census, the Circassian population was given the option to register either under the name “Circassian” (Rus. čerkesy) or “Adyghe” (Rus. adygejcy); only the latter was officially acknowledged during most of the Soviet era. The ethnonym “Adyghe” was initially applied by Soviet authorities to all Circassians outside Kabardino–Balkaria (Circassians there were classified as Kabardians). Starting in the 1930s, the use of “Adyghe” was restricted to the Circassians of Adygea. Before the 2010 census, some local Circassian activists advocated for the return of the term “Circassian” in place of “Adyghe” as a symbolic reunification of the entire Circassian population of the Caucasus. However, rather few (fewer than 3,000) Circassians actually used the opportunity to change their official self-identification in the census, so their choice did not affect the apparent ethnic composition of the republic (see Table 2.8 for changes in the ethnic composition of Adygea between 1959 and 2010). Circassians/Adyghes are a majority in the west of the republic, close to the city of Krasnodar. There they live in the town of Adygeysk, considered a satellite of Krasnodar, and in a number of rural settlements. Some Circassians were moved to their current location in the early 1970s, before their former territory was flooded by the Krasnodar Water Reservoir built in 1973. Circassians also live in the northeastern part of Adygea, along the west bank of the Kuban River. The central part of Adygea, which separates the two Circassian enclaves, is predominantly inhabited by Russians. That area includes Adygea’s capital, Maykop, and two rural districts where the number of Circassians is especially low (less than 5% in 2010). Comparison of census data from 1959, 1989, and 2010 for Adygea shows that ethnic proportions have changed little since 1959. The lack of noticeable changes may be due to the relatively low emigration rates among other North Caucasian groups and to insignificant differences in the birth rates between the Russians and the Circassians. At the end of the Caucasian Wars in the nineteenth century, many Circassians had migrated to Turkey and the Middle East; possible repatriation of their descendants was widely discussed after the collapse of the Soviet Union, but the actual immigration of Circassians to their homeland in Adygea has been extremely modest and has not affected the total proportion of Circassians in the region.

Table 2.8 Major Ethnic Groups of Adygea in 1959, 1989, and 2010

             1959              1989              2010
Russian      235,539 (72.8%)   293,640 (68.0%)   270,714 (61.5%)
Adyghe       65,955 (20.4%)    95,439 (22.1%)    107,048 (24.3%)
Ukrainian    9,461 (2.9%)      13,755 (3.2%)     5,856 (1.3%)
Circassian   —                 —                 2,651 (0.6%)
Armenian     4,659 (1.4%)      10,460 (2.4%)     15,561 (3.5%)
Kurd         —                 —                 4,582 (1.0%)

Source: Soviet population censuses of 1959 and 1989, Russian population census of 2010.
The low proportion of Circassians in Adygea has caused political tensions in the post-Soviet era, as Russian political activists have protested against the high proportion of Circassians among local authorities and against some laws of the republic, which they claimed discriminate against the Russian population (including the requirement for presidential candidates to speak Circassian and the system of parliamentary representation which, in practice, guaranteed Circassians a majority in the upper house of the local parliament). In the early 2000s, most of these disputed regulations, including the language requirement, were abolished (Kazenin, 2009b, p. 31).

Migration within Adygea has not been intensive in recent decades. Only Maykop has attracted a considerable population influx. Its total population grew from 82,135 in 1959 to 144,249 in 2010, and the proportion of Circassians in Maykop changed from 3.4% in 1959 to 18.9% in 2010. The rather low proportion of Circassians in the capital of the republic, as well as its more modest population growth compared to the capitals of other republics in the North Caucasus, can be explained by its proximity to Krasnodar, where the economy and urban infrastructure are much more developed than in Maykop. Therefore, Krasnodar is a more attractive urban destination for the rural citizens of Adygea than its own capital is. Yet not all Circassians of Krasnodar Krai have migrated from Adygea. Of the 13,800 Circassians registered in the 2010 census of the Krasnodar District, most lived in its areas already inhabited by Circassians for several centuries. There are, however, reasons which suggest that the present-day migration of both Russians and Circassians from Adygea to the Krasnodar District is understated. Another reason for relatively low migration into the capital of Adygea may have to do with the advantages that the climate of the lowlands offers to agricultural work. Highly developed agriculture in both Circassian and Russian areas of the republic forestalls urbanization, at least to some degree. At present, the urban population constitutes 47.31% (2017), which is much lower than in Russia as a whole. Apart from Maykop, there are only two areas in Adygea where considerable ethnic intermixing is found—first, along the westernmost margin of the republic, immediately adjacent to Krasnodar. Although Circassians constitute the majority there, the Russian population is motivated to relocate there, as home prices are lower there than in Krasnodar, and the distance from the city is short.
Second, Russians and Circassians share some rural parts of northeastern Adygea, where they have been neighbors for more than a century. Among the ethnic minorities present in Adygea, Armenians are the largest. They can be considered a subgroup of the much more numerous Armenian population of Southern Russia, mainly living in Krasnodar Krai. Kurds migrated to Adygea after the collapse of the Soviet Union, mainly from Central Asia, where they had been deported during Stalin’s reign (Poljan, 2001, p. 137). They inhabit a small number of rural settlements in Adygea. Ukrainians, who initially moved to agricultural areas of Adygea in the nineteenth and early twentieth centuries, are almost entirely assimilated with Russians today.

Further Reading

Fundamental studies of ethnic composition and other aspects of the demography of the North Caucasus are almost all in Russian. This section offers a brief survey of the most informative among them. Belozerov (2005) gives an overview of interethnic differences in birth rates, paths of migration, and proportions of urban population, mainly focusing on the western part of the North Caucasus. Changes in the ethnic composition in various parts of the region in the last decades of the USSR and the early post-Soviet years are also considered. Kabuzan (1996) offers a historical account of changes in the ethnic composition of the North Caucasus in the Russian Empire and in the Soviet era, making use of a large set of documents. Karpov and Kapustina (2011) study migration of rural (mainly highland) populations of the North Caucasus to the lowlands within and away from their native regions. Social transformations accompanying those migration processes are also discussed. Karpov (2017) provides a study of the formation of administrative units (republics, etc.) in the North Caucasus in the 1920s and 1930s. The book accounts for the emergence of a system of administrative borders which, for the most part, has been preserved in the North Caucasus to this day and to which most conflicts mentioned in this chapter are related.

Part II

Nakh-Dagestanian Languages

Chapter 3

Nakh-Dagestanian Languages

Dmitry Ganenkov and Timur Maisak

This chapter provides an overview of the basic properties of the languages of the Nakh-Dagestanian (East Caucasian) family. Given the size of the family, we cannot cover even the most typical features in full here, let alone describe details of the variation that exists. Likewise, we cannot do full justice to all individual languages or even branches within the family and must instead confine the discussion to occasional mentions of languages and branches here and there. The goal of this chapter is to complement the body of previously published surveys of the family and its branches, such as van den Berg (2005b), Bokarev and Lomtatidze (1967), Klimov and Alekseev (1980), Smeets (1994), Alekseev (1998b), Hewitt (2004), and Job (2004), and to provide a state-of-the-art update on the major issues in the grammar of Nakh-Dagestanian. Where appropriate, we refer the reader to other chapters in this volume or to existing family- or branch-wide overview studies of specific phenomena. For reasons of space, however, we do not provide references to individual grammatical descriptions, except when citing examples from the literature. Examples without references are drawn from our own fieldwork.

3.1 Introduction

In this section, we discuss the diachronic relationships and geographical distribution of languages within the Nakh-Dagestanian family (3.1.1). We then proceed to an overview of available historical sources (3.1.2), the current sociolinguistic situation (3.1.3), and the history of research (3.1.4) on languages of the family.


3.1.1 Structure of the Family

The Nakh-Dagestanian languages are spoken in the eastern Caucasus (and for this reason they are also commonly labeled “East Caucasian”). The majority of these languages are located in the Republic of Dagestan, Russian Federation. The Republics of Chechnya and Ingushetia (both also belonging to the Russian Federation) are home to Chechen and Ingush, respectively. Northern regions of Azerbaijan and eastern parts of Georgia bordering Dagestan and Chechnya also host Nakh-Dagestanian-speaking communities. Smaller communities live in Turkey, Kazakhstan, and Kyrgyzstan and are dispersed more widely across the Russian Federation. The family is divided into four accepted branches (Nakh, Avar-Andic-Tsezic, Lezgic, and Dargwa) and two family-level isolates, Lak and Khinalug, each constituting a separate branch (see Figure 3.1). Lack of clear dialectal divisions and dearth of historical reconstructions of Nakh-Dagestanian are the main reasons why the internal composition of the family, especially on the Dagestanian side, is still subject to debate. The groupings presented here are similar, but not identical, to the groupings discussed in chapter 1, and this is a reflection of an ongoing debate in Dagestanian language studies. In particular, it is not clear whether, in fact, the Nakh branch is opposed to a Dagestanian branch including all other groups, as suggested by the family’s name, or whether the root node splits off into several branches as illustrated in Figure 3.1. It may be that the Tsezic languages are a separate branch, while Lak has been grouped together in a branch with Dargwa. Khinalug, spoken in one village in Azerbaijan, was formerly included in the Lezgic branch, but the affinity between the two is now considered to be areal rather than genetic in nature. The Nakh languages are spoken in Chechnya (Chechen), Ingushetia (Ingush), and Georgia (Batsbi, also known as Tsova-Tush).
Nakh-Dagestanian
  Avar-Andic-Tsezic
    Avar-Andic
      Avar
      Andic: Akhvakh, Andi, Bagvalal, Botlikh, Chamalal, Godoberi, Karata, Tindi
    Tsezic
      East: Bezhta, Hunzib
      West: Hinuq, Khwarshi, Tsez
  Lezgic
    East: Agul, Lezgian, Tabasaran
    West: Rutul, Tsakhur
    South: Budukh, Kryz
    Archi
    Udi
  Khinalug
  Lak
  Dargwa: Aqusha, Chirag, Itsari, Kaytag, Kubachi, Mehweb, …
  Nakh: Batsbi, Chechen, Ingush

Figure 3.1. Nakh-Dagestanian languages.

The nine Avar-Andic languages—Avar, Akhvakh, Andi, Bagvalal, Botlikh, Chamalal, Godoberi, Karata, and Tindi—are spoken in the western part of Dagestan. Avar communities are also known in Azerbaijan and Georgia. One Akhvakh-speaking village (Akhakh-dere) is also located in Azerbaijan. The Tsezic branch comprises five languages—Bezhta, Hinuq, Hunzib, Khwarshi, and Tsez—all spoken in the southwestern part of Dagestan. A Bezhta-speaking community is reported to live in Georgia. The Lezgic branch is the most internally diverse within Nakh-Dagestanian and splits further into lower-level groups. The East Lezgic group includes Agul, Lezgian, and Tabasaran. The West Lezgic group includes Tsakhur and Rutul. The South Lezgic group includes Budukh and Kryz. Archi and Udi are not closely related to any of those groups or to each other. Archi is spoken in one village in the central part of Dagestan surrounded by Avar- and Lak-speaking villages. The East and West Lezgic languages are spoken in the southern part of Dagestan. Large Lezgian, Tsakhur, and Rutul communities are also spread across the northern part of Azerbaijan. Budukh and Kryz are spoken in Azerbaijan only; Udi is spoken in Azerbaijan, with one Udi-speaking village (Oktomberi/Zinobiani) in Georgia. Caucasian Albanian, an extinct language spoken in the medieval state known as Caucasian Albania, has been shown to be an ancestor of modern Udi, and is sometimes referred to as Old Udi in the literature (Gippert, Schulze, Aleksidze, & Mahé, 2008). The Dargwa languages and Lak are spoken in the central part of Dagestan. Dargwa is traditionally considered to be a single language with divergent dialects, but from a structural and lexical point of view, it instead represents a branch comprising several related languages. Traditionally around 40 different dialects are distinguished; their internal subgrouping is unclear. On the basis of lexicostatistical counts, Koryakov (2002) identifies 17 Dargwa languages.
Along with clearly divergent languages, like Chirag, Kaytag, and Kubachi, there exists a dialect continuum with unclear internal divisions. Several clusters can still be identified. The northern cluster includes relatively close lects/dialects (Aqusha, Gapshima, Urakhi, Mekegi, Gubden, Kadar, Murego, Mehweb, Muira, Mulebki, Muhi) which may yet be identified as representing a smaller number of languages. The southern cluster shows greater internal diversity, including a number of one-village lects, where each village can readily be identified as speaking a separate language (spoken in the eponymous villages: Qunqi, Khuduts, Amuq, Itsari, Sanzhi, Shari, Shiri, Amuzgi). Most of these villages have been (almost) abandoned over the past 25 years and the languages are, therefore, severely endangered. The other Dargwa-speaking villages form a transition zone between the northern and southern clusters (Sirhwa, Tanti, Tsudaqar, Kharbuk, Deybuk, Usisha, Hinta, Butri). Many languages are split into dialects, which are sometimes significantly divergent and not mutually intelligible. Several cases are worth mentioning: Qushan Agul versus other Agul dialects, Tukita Karata versus other Karata varieties, Gigatli Chamalal versus other Chamalal varieties, Inkhokwari Khwarshi versus Khwarshi proper, Southern versus Northern Tabasaran, Southern versus Northern Akhvakh, Lower versus Upper Andi. Sizable Nakh-Dagestanian-speaking communities live in Turkey, Kyrgyzstan, and Kazakhstan. Turkey hosts Avar-, Lezgian-, and Tsez-speaking communities. Kyrgyzstan became home to various Dagestanian languages after the Soviet campaign of deportations and executions of wealthy farmers (dekulakization, Rus. raskulačivanie) in 1937–1939; the exact languages spoken there, their location, and number of speakers are unknown. At least one Tsudaqar-speaking, one Chirag-speaking, and one Agul-speaking community now exist in Kyrgyzstan. Kazakhstan has been home to Chechen and Ingush communities since Stalin’s deportation of the Vainakhs in 1944 (see chapter 8). The external relations of Nakh-Dagestanian are unclear (see chapter 1), and however bold, none of the proposals concerning long-range diachronic comparisons can be unconditionally accepted due to the poor state of Nakh-Dagestanian historical linguistics (see section 3.1.4 for references on this topic). Many Nakh-Dagestanian languages (or divergent dialects) are endangered, though the degree of endangerment varies. Two languages, Budukh (Lezgic) in Azerbaijan and Batsbi (Nakh) in Georgia, are probably in the worst condition, being spoken only by speakers in their 40s and above. The endangered status of other languages mainly results from resettlement from highland villages to the lowlands, which has been occurring since the 1990s (see chapter 2). It is very common for the home language to be lost in the first or, at best, the second generation born in the lowlands. Given that many highland villages are (almost) depopulated now, with younger speakers leaving their home villages after graduation from high school, it is only a matter of time before the languages/dialects of those villages will cease to be spoken. The most critical condition is observed in villages speaking southern Dargwa languages (Chirag, Amuq, Ashty, Qunqi, Khuduts, Itsari, Sanzhi, Shari, Shiri, Amuzgi), Agul dialects (Tsirkhe, Burshag, Burkikhan, Fite), and archaic Lezgian dialects spoken in the highlands, as well as some Andic and Tsezic languages and dialects.

3.1.2 Historical Sources

The only historical sources attesting an older Nakh-Dagestanian language are the two Albanian palimpsests found in Saint Catherine’s Monastery on Mt. Sinai (Egypt) in the 1990s containing biblical readings for liturgy, mainly from the New Testament. These palimpsests were published after a decade of research (Gippert, Schulze, Aleksidze, & Mahé, 2008), and the analysis proposed suggests that they are written in Caucasian Albanian, which represents an earlier state of Udi (Gippert & Schulze, 2007). The language was written in its own script, which bears some resemblance to the Georgian and Armenian scripts. A wealth of religious and medical literature from the 17th to the 19th centuries, employing the Arabic writing system, has survived to the present day at least in Avar, Lezgian, Lak, and several Dargwa languages (Aqusha, Kaytag, Gubden). These sources, however, have not been published and thus remain unavailable for research.

3.1.3 Current Sociolinguistic Situation

From a sociolinguistic point of view, most Nakh-Dagestanian languages are minority languages with little or no official status. Almost all communicative domains make use of the national language dominant in each respective country: Russian, Azerbaijani, Georgian, or Turkish. Although officially all languages of Dagestan are declared to be official languages in the Republic, most have not gained any considerable use in spheres other than everyday communication and, to a limited extent and only informally, official communication at the lowest administrative levels (village and district administrations in highland districts). Before Soviet times, the Arabic writing system adapted for Nakh-Dagestanian (called adjam) was used to write some Nakh-Dagestanian languages, such as Avar, Aqusha, Lak, and Lezgian. During Soviet times, writing systems were developed for a number of languages. After a short period when a Latin-based orthography was used (1929–1939), Cyrillic-based writing systems were introduced for Nakh-Dagestanian. Until the 1990s, only seven Nakh-Dagestanian languages had a written standard: Chechen, Ingush, Avar, Lak, Dargwa, Lezgian, and Tabasaran. The written standards were used in schooling and had a limited presence in broadcast media (newspapers, radio). In the 1990s, Rutul, Tsakhur, and Agul were also officially recognized as written languages, though this recognition was restricted to the development of an orthography and a few primers. Currently, Chechen, Ingush, Avar, Lak, Dargwa, Lezgian, and Tabasaran have a number of printed media (weekly newspapers) and some TV programming (one-hour weekly programs each). Rutul, Tsakhur, and Agul are used occasionally in local newspapers; only a few textbooks and collections of poetry and folklore have been published in each of these languages. The other languages of Dagestan remain unwritten.
In Azerbaijan, a script based on the modern Azerbaijani alphabet was introduced for Udi and is used in school (in the village of Nij) and in publications.1 Given that the phonological systems of unwritten Nakh-Dagestanian languages are similar, to a large extent, to those of the written languages, it is possible in principle to write almost any language with only a few adaptations. This is supported by recent efforts to publish dictionaries of unwritten languages at the Dagestanian Scientific Center, and collections of texts or translations in several languages (Bezhta, Tsez, Khwarshi, Kaytag), which use Avar- or Dargwa-based orthographies adapted for a specific language. In practice, however, hardly any of these are used by speakers other than the authors of the grammars and text collections themselves. In addition, the Institute for Bible Translation runs a program translating the Bible into non-Slavic languages of Russia which currently includes projects on Agul, Andi, Avar, Bezhta, Chechen, Dargwa, Ingush, Lak, Lezgian, Rutul, Tabasaran, and Tsakhur (see https://ibt.org.ru/). The writing systems of Nakh-Dagestanian languages (except modern Udi) are Cyrillic-based and all use similar orthographic principles and conventions. Apart from letters of the Russian alphabet, the orthographies make extensive use of digraphs to represent pharyngealized consonants and vowels, ejectives, and geminate consonants. Although the digraphs used are very similar across languages, they sometimes have different phonological values (e.g., кь stands for /q’/ in Lezgian, but for /ʟ’/ in Avar). Moreover, it is quite common for the orthographies to be underspecified with respect to the phonology.

1 See also chapter 1.

A significant proportion of speakers of Nakh-Dagestanian languages are multilingual, with monolingualism restricted to preschool children and older women living in highland districts. Most speakers are at least bilingual in their native language and the national language. Traditionally, patterns of multilingualism were much more diverse and included several levels: neighboring language(s) and a regional lingua franca (Avar, Lak, Kumyk, or Azerbaijani). These traditional patterns are still preserved to some degree among older speakers. Many speakers of Andic languages, for example, are currently trilingual in their native language, Avar, and Russian. Knowledge of Lak, Kumyk, and Azerbaijani as linguae francae has declined (see Chirikba, 2008b; Dobrushina, 2013; and chapter 1 of this volume).

3.1.4 History of Research The history of scientific research on Nakh-Dagestanian languages largely began in the 19th century with the work of extraordinary members of the Russian Academy of Sciences, Franz Anton Schiefner and Pëtr von Uslar, who each left an impressive legacy of several published grammars (see Schiefner, 1856 (Batsbi), 1862 (Avar), 1863 (Udi), 1864 (Chechen); Uslar, 1888 (Chechen), 1889 (Avar), 1890 (Lak), 1892 (Urakhi Dargwa), 1896 (Lezgian), and 1979 (Tabasaran)). In the early 20th century, a series of grammatical sketches (which also included texts and word lists) were published by Adolf Dirr (see Dirr, 1904 (Udi), 1905 (Tabasaran), 1906 (Andi), 1907 (Agul), 1908 (Archi), 1912 (Rutul), 1913 (Tsakhur)). During the early Soviet period, research on Nakh-Dagestanian languages was concentrated in the Soviet Union, with only occasional instances of scholarly work on the Caucasus appearing in the West, such as Bouda (1949 (Lak)). Several Russian researchers left an important mark in Nakh-Dagestanian linguistics. Lev Zhirkov published a dictionary of Avar (1936) and a grammar of Lak (1955). Evgeny Bokarev published a handbook of the Tsezic languages (1959). His younger brother Anatoly Bokarev wrote the first description of Avar syntax (1949b) and a grammatical sketch of Chamalal (1949a), both published posthumously. Anatoly Genko compiled a dictionary of Tabasaran, covering the lexicon of several dialects and including a survey of Tabasaran dialects and sample texts, which was published in 2005 (Genko, 2005). Work by scholars trained in the Georgian school includes Magometov (1963 (on Kubachi Dargwa), 1965 (on Tabasaran), 1970 (on Agul), 1982 (on Mehweb Dargwa)); Gudava (1962 (on Botlikh), 1971 (on Bagvalal)); Imnaishvili (1963 (on Tsez)); Tsertsvadze (1965 (on Andi)); Magomedbekova (1967 (on Akhvakh), 1971 (on Karata)); and Dzheyranishvili (1971 (on Udi), 1983, 1984 (on Tsakhur and Rutul)). 
Collective work under the direction of Alexander Kibrik, based at Moscow State University, includes grammars of Khinalug (Kibrik, Kodzasov, & Olovjannikova, 1972), Archi (Kibrik 1977a, 1977b; Kibrik, Kodzasov, Olovjannikova, & Samedov, 1977b), Godoberi (Kibrik, 1996), Tsakhur (Kibrik & Testelets, 1999), Bagvalal (Kibrik, Kazenin, Lyutikova, & Tatevosov, 2001), and selected issues in Tabasaran (Zvegintsev, 1982).

From the 1980s onward, Western scholars also resumed their work on Nakh-Dagestanian; consider Charachidzé (1981) on Avar, Schulze (1982, 2001) and Harris (2002) on Udi, Schulze (1997b) on Tsakhur, Haspelmath (1993) on Lezgian, van den Berg (1995 (on Hunzib), 2001 (on Aqusha Dargwa)), Authier (2009) on Kryz, Nichols (2011) on Ingush, Forker (2013b (on Hinuq), 2020a (on Sanzhi Dargwa)). Recent dissertations on Nakh-Dagestanian languages defended in Europe include Babaliyeva (2013) on Tabasaran, Khalilova (2009) on Khwarshi, and Molochieva (2010) on Chechen. A number of grammars are currently in preparation: Polinsky and Comrie (Tsez), Belyaev (Shiri Dargwa), Ganenkov (Chirag Dargwa), Ganenkov, Maisak, & Merdanova (Agul). Shorter grammatical sketches are often included as appendices to dictionaries or text collections (see the discussion below in this section). A number of volumes have been published containing shorter sketches of individual languages and surveys of the family and individual branches: Bokarev and Lomtatidze (1967), Smeets (1994), Alekseev (1998b), Job (2004). In addition to the present volume, a handbook of Nakh-Dagestanian languages is in preparation, containing a wealth of chapters on individual languages and dialects (Koryakov & Maisak, to appear). During the Soviet period, lexicographical work focused on the compilation of dictionaries of standard/written languages: Avar (Zhirkov, 1936), Lezgian (Gadzhiev & Talibov, 1966), Lak (Khajdakov & Zhirkov, 1962), Tabasaran (Khanmagomedov & Shalbuzov, 2001). Rare examples of work on minor languages are Gukasjan (1974, Udi), Meylanova (1979, Budukh), Kibrik, Kodzasov, Olovjannikova, and Samedov (1977a, Archi).
Lexicographical work became more intensive in the 1990s and 2000s under the direction of Madzhid Khalilov (Makhachkala): Khalilov (1995, Bezhta), Isakov and Khalilov (2001, Hunzib), Khalilov and Isakov (2005, Hinuq), Khalilov (1999, Tsez), Magomedova (1999, Chamalal; 2003, Tindi; 2004a, Bagvalal), Saidova (2006, Godoberi; 2008, dialectological dictionary of Avar), Saidova and Abusov (2012, Botlikh), Magomedova and Khalidova (2001, Karata), Magomedova and Abdulaeva (2007, Akhvakh), Ganieva (2002, Khinalug). A number of electronic resources have been compiled using the software of the East Armenian National Corpus (www.eanc.net): corpora of Standard Lezgian, Standard Avar, Standard Dargwa, Archi, and Chirag Dargwa. An electronic dictionary of Archi was developed by Chumakina, Brown, Corbett, and Quilliam (2007a, 2007b), and the Atlas of Multilingualism in Dagestan by Dobrushina, Staferova, and Belokon (2017) is now available online (https://multidagestan.com). Besides the text samples included as appendices to grammatical descriptions, dedicated collections of texts have been published in Archi, Tsez, Agul, Khwarshi, Aqusha Dargwa, and Kaytag Dargwa (with translations into Russian). Larger collections of texts are available for Chechen, Ingush, Standard Avar, Standard Dargwa, Standard Lezgian, Standard Lak, and Standard Tabasaran, which all have a large body of fiction and journalistic prose due to their status of written languages in the Soviet period. Other minor languages featuring original written literature are Agul, Tsakhur, Rutul, and Udi.

Large collections of texts have been compiled over the past 10 years as a result of recent efforts to document endangered languages of the family: Sanzhi Dargwa (Forker, 50,000 tokens), Shiri Dargwa (Belyaev, 50,000 tokens), Chirag Dargwa (Ganenkov, 400,000 tokens), Agul dialects (Ganenkov, Maisak, Merdanova; 250,000 tokens), Khinalug (Rind-Pawlowski, 900,000 tokens). Works on comparative-historical issues in the Nakh-Dagestanian branches include Alekseev (1985, Lezgic, and 1988, Avar-Andic-Tsezic); Gudava (1964, Andic, and 1979, Tsezic); Imnaishvili (1977, Nakh); Talibov (1980), and Schulze (1988, Lezgic). Gigineyshvili (1977) and Bokarev (1981) are general surveys of Nakh-Dagestanian historical phonology, and Nichols (2003) is an attempt at an alternative to the Nakh-Dagestanian reconstruction proposed by Nikolaev and Starostin (1994). A detailed review of Nikolaev and Starostin (1994) can be found in Alekseev and Testelec (1996). Some recent work on the problems of Nakh-Dagestanian genealogical classification includes Kassian (2015, 2017) and Kassian & Testelets (2017). For general bibliographies of Nakh-Dagestanian studies, see especially Erschler (2014a), Chumakina (2011a), and Alekseev and Kikilashvili (2013) (the two latter works have broader scope).

3.2 Phonetics and Phonology In this section, we discuss consonant (3.2.1) and vowel (3.2.2) inventories, the problem of identification of stress and tone (3.2.3), as well as phonotactics (3.2.4) and main phonological processes (3.2.5).

3.2.1 Consonant Inventories Nakh-Dagestanian consonant inventories employ the following points of articulation: labial, dental, alveolar, postalveolar, velar, uvular, and glottal, as well as pharyngeal and/or epiglottal. Common manners of articulation are sonorant, stop, affricate, and fricative. The inventory of sonorants is very stable across the family: /n/, /m/, /j/, /l/, /w/, and the rhotic trill /r/. Stops and affricates are in complementary distribution across points of articulation: alveolars and postalveolars are affricates, and others are stops. The usual phonation types are voiced (vd), voiceless aspirated (vl), and ejective. Many languages also have a fortis-lenis contrast among voiceless consonants. The core of the inventory of obstruents is shown in Table 3.1 (standard International Phonetic Alphabet, or IPA, accompanies non-standard symbols used in Nakh-Dagestanian studies).2

2 See also Appendix II for transliteration tables, which include Cyrillic alphabets for several languages.

Table 3.1 Obstruents in Nakh-Dagestanian

                               Labial  Labiodental  Dental  Alveolar   Postalveolar  Velar  Uvular
vd stop / affricate            b       -            d       ʒ /d͡z/     ǯ /d͡ʒ/        g      ɢ
vl aspirated stop / affricate  p       -            t       c /t͡s/     č /t͡ʃ/        k      q
vl ejective stop / affricate   p’      -            t’      c’ /t͡s’/   č’ /t͡ʃ’/      k’     q’
vd fricative                   -       -            -       z          ž /ʒ/         ɣ      ʁ
vl fricative                   -       f            -       s          š /ʃ/         x      χ

Some gaps commonly occur in this chart. /p’/ is often absent or plays a marginal role, being used mainly in ideophones and, more rarely, loanwords; other consonants often missing in individual languages are /ʒ/, /ɣ/, and /ɢ/. Archi and Udi are unusual in lacking velar fricatives. In addition to the usual ejective stops and affricates, Bagvalal and Chamalal have ejective fricatives: /s’/ in Chamalal, /s’/ and /ʃ’/ in Bagvalal. In the Rikvani variety of Andi, a glottalized /l’/ is attested. Udi is the only language of the family that lacks ejective consonants, only possessing a distinction between aspirated and unaspirated voiceless stops and affricates, though some sources analyze unaspirated consonants as ejectives (Harris, 2002; Schulze-Fürhoff, 1994). The fortis-lenis distinction materializes as a distinction between aspirated obstruents on one hand and unaspirated or geminated obstruents on the other (see chapter 15). Apart from non-ejective voiceless stops/affricates, the fortis-lenis contrast may also be present in voiceless fricatives, as in some Dargwa varieties, Avar, and Lak, and/or ejective stops/affricates, as in Avar-Andic. Lateral obstruents, either alveolar or velar, are found in languages of the Avar-Andic and Tsezic branches, as well as in the Lezgic language Archi. The inventory of lateral obstruents minimally includes the voiceless alveolar fricative /ɬ/, voiceless alveolar affricate /tɬ/ (ʟ or ƛ are often used in descriptions of Nakh-Dagestanian languages), and voiceless alveolar ejective affricate /tɬ’/ (ʟ’ or ƛ’ in the local tradition), as in Tsezic languages. In Archi, Avar, and most Andic languages, the inventory is expanded by adding the fortis fricative /ɬː/. Akhvakh is unique in showing a fortis-lenis distinction in the voiceless fricative, voiceless affricate, and ejective affricate. Except in Archi, which has the voiced fricative [ɮ] appearing in a few words as a variant of /tɬ’/, voiced lateral obstruents are absent. 
Archi is also different from other Nakh-Dagestanian languages in having velar rather than alveolar laterals. Post-uvular articulations vary considerably. Glottal consonants include the glottal stop /ʔ/ and the voiceless fricative /h/ and are found in all Nakh-Dagestanian languages. Epiglottals and pharyngeals are also very typical, often distinguishing the epiglottal stop /ʡ/, a voiceless pharyngeal /ħ/ or epiglottal /ʜ/ fricative, and a voiced pharyngeal /ʕ/ or epiglottal /ʢ/ fricative. Some dialects of Agul are unique in having two parallel series of fricatives: pharyngeal /ħ/ and /ʕ/ and epiglottal /ʜ/ and /ʢ/ (see also chapter 15).

Some languages present typologically unusual possibilities. Northern Dargwa languages have a retroflex affricate, either voiceless /tʂ/ or voiced /dʐ/. In Udi, there is a series of pharyngealized postalveolar fricatives and affricates: /šˤ/, /žˤ/, /čˤ/, /čːˤ/, /ǯˤ/. Tabasaran has a series of labiosibilants: /s̊/, /z̊/, /c̊/, /c̊ː/, /c̊’/, /ʒ̊/.3 The core inventories described in the current section can be extended by secondary articulations. Labialization is very common, absent from only a few languages such as Bezhta, Hunzib, Northern Tabasaran, Udi, and some northern Dargwa languages. Labialization co-occurs with velar and uvular stops/affricates but can also appear with other consonants, including glottals and epiglottals. The contrast in labialization is usually neutralized before the labialized vowels /u/ and /o/. However, Andi has been reported to preserve the contrast in this position as well (Kibrik & Kodzasov, 1990, p. 322). Palatalization as a phonemic feature is attested in some Lezgic languages and in Khinalug. In Tsakhur, palatalization is found with dental, alveolar, and velar consonants. Rutul and Khinalug only have palatalized velars and /l/. Udi and some Agul dialects only have palatalized velars.

3 These symbols are not part of traditional IPA, but they are used in Nakh-Dagestanian descriptions to indicate labiosibilants in Tabasaran.

3.2.2 Vowel Inventories Vowel inventories in Nakh-Dagestanian are more varied. The minimal system of vowel qualities comprises /i, u, a/, as in Lak. Dargwa languages extend the inventory by adding /e/. Avar-Andic languages and Hinuq further supplement the system with /o/. Khinalug, Tsezic, and Lezgic languages have more complex vowel inventories. The most typical way to expand the system is to add more mid-vowels (/ə, ɨ/) and/or introduce a roundedness contrast among front vowels. Some Lezgic languages (e.g., Udi, Kryz, and Budukh) and Khinalug have a fully contrastive set of rounded and unrounded central and mid-vowels, probably developed under the influence of Azerbaijani. The Nakh languages have an even more complex system of basic vowels, which not only contrasts at least six basic vowel qualities /i, u, ʌ, o, e, a/ but also includes the diphthongs /oa/, /ei/, /ou/, /ʌi/. Only a few languages, such as Avar, Agul, and Khinalug, make use of no further distinctions (e.g., length and nasalization) beyond the basic vowel quality inventory. Length is commonly found in languages of the Nakh, Andic, and Tsezic branches, and occasionally in languages of other branches, for example, Kubachi Dargwa and, in the Lezgic branch, Tsakhur. Nasalization is characteristic of Andic and Tsezic languages but is missing elsewhere. The Yargun dialect of Lezgian has been reported to have devoiced vowels (Chitoran & Iskarous, 2008). Secondary articulations, such as pharyngealization and epiglottalization, are also commonly found in Nakh-Dagestanian (Kodzasov, 1986). Pharyngealization is widespread across the family, missing only from the Nakh and Avar-Andic languages, Lezgian, dialects of Agul (Lezgic), and some Tsezic languages (Bezhta, Hinuq, Hunzib). In some languages, for example, Archi, pharyngealization is described as a property of the syllable or word rather than the vowel, due to its ability to spread over the syllable/word as a whole (Kibrik, Kodzasov, Olovjannikova, & Samedov, 1977a). In other languages pharyngealization is probably better analyzed as a secondary articulation of particular vowels, as in Aqusha, for example, where only one pharyngealized vowel /aˤ/ is found (see also chapter 15).

3.2.3 Stress and Tone Along with languages that have clear lexical stress, such as Lezgian, Avar, or Aqusha Dargwa, quite a few have been reported to have weak stress only or to lack stress altogether, such as Tsezic and Andic languages, as well as some Lezgic languages (e.g., Tsakhur, Budukh, and Kryz). Many southern Dargwa languages have a weaker stress which is distinctively perceived only in plural nouns (the plural suffix always bears stress), while singular nouns often have no acoustically prominent syllable. Bagvalal is reported to have three different classes of words: some unambiguously bear stress, others have only weak stress, and still others are stressless (Kodzasov, 2001, p. 44). Chamalal is described as having two different types of stress (“weak” and “strong,” see Magomedova, 1999, p. 414). Different descriptions, however, may diverge on whether or not a language has lexical stress. The phonological and acoustic properties of stress definitely require further research in Nakh-Dagestanian. In languages which do make use of lexical stress, its placement can vary, usually obeying strict rules which are often different for nouns and verbs. In Lezgian, for example, stress is located on the second syllable in the majority of words, with only some verb forms regularly deviating from this pattern. In Avar-Andic, several classes of nouns can be isolated with respect to stress placement, depending on whether the stress is located on the stem or the suffix in the plural and in oblique cases. In the 1970s, several Andic and Tsezic languages, as well as Tsakhur and Tabasaran, were claimed to have tonal systems and described as having three to four tones (high, low, rising, falling); minimal pairs were registered in some languages, such as Chamalal. 
The tonal analysis is reflected in the comparative lexicon of Dagestanian languages (Kibrik & Kodzasov, 1988, 1990) for as many as seven languages (Andi, Akhvakh, Chamalal, Tsakhur, Dyubek Tabasaran, Inkhokwari, and Bezhta). However, no later research on specific languages has confirmed the existence of tones in a single Dagestanian language, and even later publications by Kodzasov (1996, 1999a) provide a different account of “tones” in Godoberi and Tsakhur. For example, Kodzasov (1999a) indicates that acoustic analysis shows no difference in fundamental frequency in those Tsakhur words where the difference in pronunciation was earlier interpreted as tonal. A revised analysis describes the difference in terms of additional articulatory movements associated with specific words, but the status of these differences (whether they are phonetic, phonological, or optional paralinguistic) is unknown. The existence of tone in Nakh-Dagestanian is thus controversial, though earlier tonal analyses may reflect some real distinctions in pronunciation (see also Nichols, 2011, who describes Ingush as having high and low tones).4

3.2.4 Phonotactics Many Nakh-Dagestanian languages have relatively straightforward phonotactics, with no consonant clusters other than sonorant-obstruent (RT) in the syllable coda or at syllable juncture. The most typical syllabic structures in Nakh-Dagestanian are (C)V, (C)VC, and (C)VRT. Traditional descriptions often indicate that the syllable must have an onset, analyzing syllables that appear without an initial consonant as having an (optionally pronounced) glottal stop in the onset. Nominal and verbal inflection does not usually lead to violations of phonotactic rules, meaning that all inflected forms conform to the general structure. Change of stress position in inflection can induce regular vowel reduction, thus leading to the emergence of consonant clusters. In Chirag Dargwa, for example, the plural suffix always attracts stress, inducing vowel deletion in the last syllable of disyllabic nouns. As a result, plural forms can display more complex consonant clusters, for instance: qisqan ‘spider’, qisqne ‘spiders’. While this simple picture may be true for some languages or even the core of vocabulary in most languages, some languages demonstrate syllable-initial clusters where (historical) rules of vowel reduction lead to the emergence of stem-initial consonant clusters, sometimes quite complex, such as in Lezgian, Khinalug, and Nakh languages, e.g., Lezgian kk’al ‘small rock’, kk’lam ‘tick’, ktkana ‘got used’; Khinalug bzɨ ‘pear’, psɨ ‘bear’, gra ‘wolf ’, pχra ‘dog’; Ingush tq’aam ‘wing’, pxo ‘bullet’, txou ‘ceiling’, cq’a ‘once’.
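The core syllable templates named above, (C)V, (C)VC, and (C)VRT, can be made concrete with a small pattern check. The Python sketch below is illustrative only. It assumes input that is already segmented into phonemes, takes the sonorant set from section 3.2.1, and uses a deliberately simplified vowel set that is an assumption of the sketch rather than a claim about any one language. The function names are hypothetical.

```python
import re

# Sonorants as listed in section 3.2.1.
SONORANTS = {"n", "m", "j", "l", "w", "r"}
# Simplified vowel set, assumed here for illustration only.
VOWELS = {"a", "e", "i", "o", "u", "ɨ"}

# (C)V, (C)VC, (C)VRT: C is any consonant, R a sonorant, T an obstruent.
CORE_TEMPLATE = re.compile(r"[RT]?V(?:RT|[RT])?")


def skeleton(segments):
    """Map a list of phonemes to a string over V (vowel), R (sonorant), T (obstruent)."""
    symbols = []
    for seg in segments:
        if seg in VOWELS:
            symbols.append("V")
        elif seg in SONORANTS:
            symbols.append("R")
        else:
            symbols.append("T")  # anything else is treated as an obstruent
    return "".join(symbols)


def fits_core_template(segments):
    """True if the syllable matches (C)V, (C)VC, or (C)VRT."""
    return CORE_TEMPLATE.fullmatch(skeleton(segments)) is not None


# Syllables of Chirag Dargwa qisqan fit the template; Khinalug gra does not.
print(fits_core_template(["q", "i", "s"]))
print(fits_core_template(["q", "a", "n"]))
print(fits_core_template(["g", "r", "a"]))
```

Forms with stem-initial clusters, such as Khinalug gra or psɨ, fail the check, which is exactly the sense in which they depart from the simple picture.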

4 See chapter 16 for a detailed discussion. See also chapter 17 for further discussion of stress and tone.

3.2.5 Main Phonological Processes The Nakh-Dagestanian languages are characterized by relatively simple morphophonology, where morphologically conditioned phonological processes are mainly restricted to superficial adjustments without considerable changes in the appearance of morphological forms. Vowel harmony is exceedingly rare. The only language with a consistent pattern of vowel harmony is the Tladal dialect of Bezhta, which shows harmony between front and back vowels (Kibrik & Testelets, 2004). The same process also affects consonants: words with front vowels can only have postalveolar fricatives/affricates, whereas words with back vowels contain only alveolar fricatives/affricates. Khinalug comes close to this situation, requiring harmony between root vowels and also showing harmonic effects in affixes with close vowels. Although no other language demonstrates consistent vowel harmony, a few languages show affixal allomorphy where the choice of allomorph can be related to the quality of the root vowels. Nij Udi, for example, shows front/back vowel harmony in the dative suffix -a, which is, however, optional and affects no other suffixes. Chirag Dargwa shows optional front/back vowel harmony in verbal prefixes but not elsewhere. Labialized consonants can also affect adjacent vowels in inflection, for example, the low unrounded vowel /a/ often transforms into the high rounded /u/, as in Lezgian sʷas ‘daughter-in-law’ (abs.sg) → sus-ar ‘daughters-in-law’ (abs.pl). Other processes are attested as well, which are sometimes better analyzed as historical phonological effects rather than synchronically productive phenomena. For example, Lezgian shows vowel syncope and several different types of consonant alternations; however, all of these affect only the native lexicon, reflecting historical phonological changes, and do not extend to recent borrowings. Consonant metathesis is attested in a number of languages.

3.3 Lexical Classes On the basis of morphological and syntactic properties, the descriptive tradition identifies nominals, verbs, adjectives, and adverbs, as well as several classes of uninflectable words. Nominals, including nouns, personal pronouns, demonstrative pronouns, reflexives, and reciprocal pronouns, inflect for case, number, and function as heads of noun phrases. Verbs morphologically express clausal properties, such as aspect, tense, mood, evidentiality, polarity, and illocutionary force. Among verbs, a class of stative verbs and/or auxiliaries can often be identified. Both show only temporal but not aspectual distinctions and are morphologically defective, having a reduced set of forms. Adjectives and adverbs are identified as separate classes on the basis of several criteria. For example, syntactically, adjectives function as modifiers within NPs, while adverbs are modifiers to verbs. However, adjectives and manner adverbs share a number of morphological and syntactic properties with verbs. Some other classes of adverbs, such as temporal and locative, can appear in the genitive case, thus sharing some properties of nominals. Locative nouns can be identified as a subclass of nominals whose unmarked form (absolutive case) has a locative interpretation. The inventory of locative nouns is usually limited to local toponyms and a few landscape terms. Morphologically, locative nouns usually have a subset of nominal forms, namely the genitive and certain spatial forms. Many languages of the family also have a class of words that can function as both attributive modifiers and heads of a noun phrase. These words either express evaluation, positive or negative (e.g., ‘brave (person)’ and ‘fool’), or indicate ethnicity, such as ‘Avar’.

Like nouns, evaluatives and ethnic group names can function as the head of an NP; like adjectives, they can be used as modifiers to head nouns. Uninflectable words can be classified based on their semantics and, to some extent, their syntactic behavior.

3.4. Nominal Morphology In this section, we describe the category of nominal gender, general principles of nominal inflection and inflectional features, followed by an overview of various classes of pronouns, determiners, adjectives, numerals, and postpositions.

3.4.1 Nominal Gender Grammatical gender is absent only from Udi, Lezgian, Agul, and a few southern Tabasaran dialects (see also chapter 20). Gender systems vary from the two-gender system seen in Tabasaran up to eight genders in Batsbi. Morphologically, gender is apparent from agreement on verbs and attributive modifiers. Often other clausal elements can also bear agreement morphology: auxiliaries, adverbs, postpositions, spatial (or other) case forms of nouns, and even discourse particles. Phonological exponence is similar across the whole family, thus allowing for the reconstruction of a four-way gender system in Proto-Nakh-Dagestanian (Alekseev, 2003). Gender is not visible on the noun itself, except in a few isolated examples, as, for instance, in Avar wasː ‘boy, son’ versus jasː ‘girl, daughter’, where gender prefixes can be identified (w- m.sg, j- f.sg). Gender systems in Nakh-Dagestanian minimally distinguish human versus nonhuman (neuter) nouns, as in Tabasaran. Other languages further distinguish between masculine and feminine among human nouns. Three-way systems (masculine, feminine, neuter) are attested in Avar, most Andic (Akhvakh, Tindi, Bagvalal, Godoberi, Botlikh), and most Dargwa languages. Further distinctions among neuter nouns may be made. Four-way systems involving masculine, feminine, and two neuter classes are common in Lezgic (Rutul, Tsakhur, Kryz, Budukh, Archi), Tsezic (Tsez, Hinuq), and Khinalug. The four-way gender system in Lak is unusual in that younger/unmarried females belong to one of the neuter genders, while the feminine gender only includes older/married females. Mehweb Dargwa shows a similar pattern (likely developed under Lak influence), but with younger women constituting a separate class. Systems with a larger number of genders are rarer: five genders in Andi, Chamalal, Inkhokwari, Bezhta, and Hunzib; six genders in Chechen, Ingush, and dialects of Andi; eight genders in Batsbi. 
For an overview of the gender systems in Nakh-Dagestanian see Schulze-Fürhoff (1992). Usually no clear-cut semantic motivation can be identified for the distribution of non-human nouns over different neuter classes, although semantic and phonotactic tendencies can sometimes be detected (Mel'nikov & Kurbanov, 1964, on Tsakhur; Plaster, Polinsky, & Harizanov, 2013, and Rajabov, 1997, on Tsez). Plural nouns often show a reduced set of gender distinctions. In most languages, plural nouns distinguish between human and non-human; that is, masculine and feminine nouns have syncretic agreement exponence, while neuters are also collapsed into a single class. In Tsez and Hinuq, plural agreement distinguishes masculine nouns from other genders. In Avar and Tabasaran, no genders are distinguished in the plural (see chapter 6 on Avar). As well as gender and number, a few languages (Archi, Chechen, Ingush, and most Dargwa languages) distinguish person in gender agreement: plural personal pronouns ‘we’ and ‘you’ use the same agreement markers as (one of the classes of) neuter nouns rather than human plural agreement markers used with third person NPs. Corbett and Baerman (2013) and Chumakina, Kibort, and Corbett (2007) discuss this pattern.

3.4.2 Nominal Inflection and Inflectional Features Nominal inflection includes the categories of number, case, localization, and orientation. Singular nouns are unmarked, and plural is marked by a suffix. Plural marking usually has several lexically distributed allomorphs, as in Nij Udi -χo, -urχo, -uχ, -uruχ, ‑muχ, ‑mχo. Plural forms may demonstrate irregularities (e.g., root-internal vowel or consonant alternations) or use suppletive stems, as in Chirag Dargwa unc ‘ox’—anc-e ‘ox-pl’, bare ‘day’—banre ‘day.pl’ (< *bar-ne ‘day-pl’), xade ‘woman’—cːad-e ‘woman-pl’. The plural is very productive and can often be formed from all nouns, including mass nouns (which do not combine with numerals). Plural forms of mass nouns have a distributive interpretation (e.g., location, sort, or portion): Tabasaran χaχ-ar ‘sorts of sour cream’ (sour.cream-pl), Rutul jak-bɨr ‘pieces of meat’ (meat-pl), Chirag k’um-re ‘sour cream in different containers’ (sour.cream-pl). Pluralia tantum are rare. Some nouns, such as plant and berry names, are mainly used in the plural. A study of the morphology and semantic interpretation of plural nouns in Nakh-Dagestanian can be found in Kibrik (2003). In some languages, regular plural suffixes attached to human-denoting nouns, especially proper names, yield an associative plural reading denoting a plurality that includes an anchor referent of the noun and a group associated with the anchor referent, e.g., Agul Maħamad-ar ‘Mahamad-pl’ (‘Mahamad and people associated with him’). Some languages have a separate associative plural marker, like Southern Tabasaran -ʁar and Aqusha Dargwa -qali. The case paradigm includes dozens of forms in most languages, except for Udi, Khinalug, southern dialects of Avar, and some northern Dargwa languages. Case inflection is sensitive to the distinction between the absolutive and all other cases, which are considered oblique (in opposition to the absolutive). 
The absolutive-oblique distinction is operative in the singular and the plural. The absolutive form is morphologically unmarked; the oblique stem, which is the basis of all other cases, is formed from the absolutive by means of special suffixes. Suffixes forming the oblique stem can vary; in other words, within a given language, different nouns can form oblique stems with different affixes. The distribution of these oblique-stem-forming suffixes appears to be lexically determined and is not predictable from form or meaning. Usually, about a dozen oblique-stem-forming suffixes are attested for a given language, but Lak is known for its extremely large inventory of oblique stem markers. Kibrik and Kodzasov (1990, p. 277) mention more than 30 in the dialect they describe.

From a morphological point of view, the case paradigm can be divided into grammatical cases and spatial cases. Grammatical cases are formed from the oblique stem by means of monomorphemic markers, while spatial case markers, also suffixed to the oblique stem, consist of up to three components. The inventory of grammatical cases usually includes absolutive, ergative, genitive, and dative (the inventory of other typical grammatical cases is described in Daniel & Ganenkov, 2008). The ergative has a special status in some languages, such as Lezgian and Bezhta, where it coincides with the oblique stem, as in (1). In most languages, however, the ergative case is expressed by means of a special suffix on a par with other grammatical cases, as in (2).

(1) Oblique stem formation and grammatical cases in Lezgian

Gloss       Absolutive   Oblique Stem/Ergative5   Genitive       Dative
‘horse’     balk’an      balk’an-di-              balk’an-di-n   balk’an-di-z
‘donkey’    lam          lam-ra-                  lam-ra-n       lam-ra-z
‘mouse’     q’if         q’if-re-                 q’if-re-n      q’if-re-z
‘hand’      ʁil          ʁil-i-                   ʁil-i-n        ʁil-i-z
‘forest’    tːam         tːam-u-                  tːam-u-n       tːam-u-z
‘girl’      ruš          ruš-a-                   ruš-a-n        ruš-a-z
‘hare’      qːür         qːür-e-                  qːür-e-n       qːür-e-z
‘village’   χür          χür-ü-                   χür-ü-n        χür-ü-z

(2) Oblique stem formation and grammatical cases in Bagvalal

Gloss      Absolutive   Oblique Stem   Ergative   Genitive   Dative
‘house’    misa         mis-u-         mis-u-r    mis-u-ɬː   mis-u-la
‘mouth’    el           el-a-          el-a-r     el-a-ɬː    el-a-la
‘carpet’   dum          dum-i-         dum-i-r    dum-i-ɬː   dum-i-la
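The pattern in (1) can be restated procedurally: the oblique-stem suffix has to be listed noun by noun, whereas the genitive and dative markers are constant. The following Python sketch is only an illustration built from the eight Lezgian nouns in (1); the names OBLIQUE_SUFFIX and lezgian_paradigm are invented for this sketch and are not taken from any existing tool or analysis.

```python
# Illustrative sketch only: the data are the eight Lezgian nouns in example (1).
# OBLIQUE_SUFFIX and lezgian_paradigm are hypothetical names made up for this sketch.

# Lexically determined oblique-stem suffix, listed per absolutive form.
OBLIQUE_SUFFIX = {
    "balk’an": "di",
    "lam": "ra",
    "q’if": "re",
    "ʁil": "i",
    "tːam": "u",
    "ruš": "a",
    "qːür": "e",
    "χür": "ü",
}


def lezgian_paradigm(absolutive: str) -> dict:
    """Build the four grammatical case forms shown in (1) for a listed noun."""
    # The suffix cannot be predicted, so it is looked up rather than computed.
    oblique = f"{absolutive}-{OBLIQUE_SUFFIX[absolutive]}"
    return {
        "absolutive": absolutive,
        "ergative": oblique,          # in Lezgian the ergative coincides with the oblique stem
        "genitive": oblique + "-n",
        "dative": oblique + "-z",
    }


if __name__ == "__main__":
    for noun in OBLIQUE_SUFFIX:
        print(lezgian_paradigm(noun))
```

The lookup table reflects the lexical idiosyncrasy of the stem formative; a noun that is not listed raises a KeyError rather than receiving a guessed suffix.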

5 The floating hyphens at the end indicate that a form is a stem that needs a case suffix to be attached at the place specified by the hyphen (including a zero ergative suffix, as in Lezgian).

Spatial cases are regularly formed by combining morphemes positioned in two or three morphological slots. One slot (localization) is reserved for morphemes indicating the reference point in a locative configuration: horizontal or vertical surface, inner hollow space, front part, rear part, and so on. The second slot (orientation) expresses type of movement with respect to the localization area, such as movement from, to, or across. The absence of an orientation marker indicates the absence of movement (i.e., stasis, expressed by essive) in most languages.

(3) Partial paradigm of spatial cases of ʁʷan ‘rock’ in Agul

ʁʷan-di-l                  ʁʷan-di-l-di               ʁʷan-di-l-as
rock-obl-super             rock-obl-super-all         rock-obl-super-abl
‘(sit) on the rock’        ‘(climb) on the rock’      ‘(jump) from the rock’

ʁʷan-di-q                  ʁʷan-di-q-tːi              ʁʷan-di-q-as
rock-obl-post              rock-obl-post-all          rock-obl-post-abl
‘(hide) behind the rock’   ‘(walk) behind the rock’   ‘(walk) from behind the rock’
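The slot structure behind (3) can be sketched as a small table-driven composition. The Python fragment below is a minimal illustration that uses only the forms attested in (3); the layout of the tables and the names spatial_form and paradigm_size are assumptions made for this sketch, and the allative marker is simply listed per localization as it appears in the example rather than derived by rule.

```python
# Illustrative sketch only: forms and glosses are those of example (3), Agul 'rock'.
# The table layout and all function names are assumptions made for this sketch.

OBLIQUE_STEM = "ʁʷan-di"  # rock-obl

# Slot 1 (localization): markers attested in (3).
LOCALIZATION = {"super": "l", "post": "q"}

# Slot 2 (orientation): listed per localization exactly as attested in (3).
# The empty string stands for the unmarked essive (no movement).
ORIENTATION = {
    "super": {"ess": "", "all": "di", "abl": "as"},
    "post": {"ess": "", "all": "tːi", "abl": "as"},
}


def spatial_form(localization: str, orientation: str) -> str:
    """Concatenate oblique stem + localization + (optional) orientation marker."""
    parts = [OBLIQUE_STEM, LOCALIZATION[localization]]
    marker = ORIENTATION[localization][orientation]
    if marker:  # the essive adds nothing after the localization marker
        parts.append(marker)
    return "-".join(parts)


def paradigm_size(*slot_sizes: int) -> int:
    """Number of combinations, given how many options each slot offers."""
    total = 1
    for size in slot_sizes:
        total *= size
    return total


if __name__ == "__main__":
    for loc in LOCALIZATION:
        for ori in ORIENTATION[loc]:
            print(loc, ori, spatial_form(loc, ori))
    print(paradigm_size(len(LOCALIZATION), 3))  # the six forms in (3)
```

Because the size of the paradigm is the product of the slot sizes, adding options to any slot multiplies rather than adds to the number of spatial forms.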

In Dargwa languages, the absence of an orientation marker (i.e., a localization suffix used without an orientation suffix) conveys the idea of movement to a reference point, while location at a reference point is expressed by agreement markers matching the features of the clause-mate absolutive argument.

(4) Partial paradigm of spatial cases of qːarqːa ‘rock’ in Khuduts Dargwa

qːarqːa-l-e
rock-obl-super
‘(climb) on the rock’

qːarqːa-l-e-b
rock-obl-super-n.sg
‘(sit) on the rock’ (n.sg absolutive argument)

qːarqːa-l-e-d
rock-obl-super-n.pl
‘(sit) on the rock’ (n.pl absolutive argument)

qːarqːa-li-gu
rock-obl-sub
‘(move) under the rock’

qːarqːa-li-gu-b
rock-obl-sub-n.sg
‘(hide) under the rock’ (n.sg absolutive argument)

qːarqːa-li-gu-d
rock-obl-sub-n.pl
‘(hide) under the rock’ (n.pl absolutive argument)

Spatial case markers may include a third morphological slot. In Tsez, for example, this slot hosts a distance marker (Comrie & Polinsky, 1998). In some Dargwa languages, deixis and gravity are encoded in this position (see (5)).

(5) Deictic and gravity distinctions in spatial cases in Qunqi Dargwa

a. qal-li-šːa-r-ka
   house-obl-ad-abl-down
   ‘down from the house’

b. qal-li-šːa-r-ha
   house-obl-ad-abl-up
   ‘up from the house’

c. qal-li-šːa-r-de
   house-obl-ad-abl-thither
   ‘there away from the house’

d. qal-li-šːa-r-ca
   house-obl-ad-abl-hither
   ‘here away from the house’

Depending on the number of distinctions available in each slot, the total number of spatial cases may reach more than a hundred (Comrie & Polinsky, 1998). A few languages have evolved further from the Nakh-Dagestanian prototype and have considerably reduced their case inventories. Kadar Dargwa, for example, has a drastically reduced case inventory which includes only nine forms: absolutive, ergative, genitive, dative, and five spatial cases.

Genitive case can encode both alienable and inalienable possession. Budukh and Khinalug, however, have different cases for alienable and inalienable possession.

(6) Budukh

a. zaː            q’ıl
   1sg.gen.inal   head
   ‘my head’

b. zo           k’ant
   1sg.gen.al   knife
   ‘my knife’
(Authier, 2013)

Tsezic languages have case concord whereby the genitive takes one form when it modifies a noun in the absolutive (gen1), and the other form when it modifies the head noun in any oblique case (gen2).

(7) Tladal Bezhta

a. abo-s         is
   father-gen1   brother
   ‘father’s brother’

b. abo-la        is-t’i-l
   father-gen2   brother-obl-dat
   ‘to father’s brother’
(Kibrik & Testelets, 2004, p. 232)

Tsakhur has been described as lacking the genitive and instead using an attributive suffix to mark nominal dependents in NPs (a similar situation is found in Rutul) (see section 3.6.1).

3.4.3 Personal Pronouns Personal pronouns distinguish person, number, and clusivity in the first person plural. Only first and second person pronouns are found; demonstrative pronouns function as third person pronouns. Plural pronouns are subject to cross-linguistic variation, while the inventory of singular pronouns (1sg, 2sg) is stable across the family. The maximal system of five pronouns (1sg, 2sg, 1pl.incl, 1pl.excl, 2pl) is found in Avar-Andic, Nakh, Archi, Agul, Tabasaran, and some Dargwa languages (e.g., Kaytag, Chirag, and Qunqi). A few languages do not make a clusivity distinction and thus have four personal pronouns: Tsezic, Lak, and some Lezgic languages (e.g., Lezgian, Tsakhur, Rutul, Budukh, and Udi). A three-way inventory of personal pronouns has been found in Itsari Dargwa and Shari Dargwa, which have only one plural pronoun, used to express both 1pl and 2pl. Finally, Alik Kryz has a different four-way system, one that does distinguish clusivity in the plural: žin ‘1pl.excl’ versus jin, which covers both ‘1pl.incl’ and ‘2pl’.

The pronominal paradigm in Nakh-Dagestanian mirrors the nominal paradigm, with a few differences. First, absolutive and ergative cases may be morphologically syncretic (see chapter 18 for a discussion). Second, unlike nouns, personal pronouns often form the ergative on the basis of the absolutive rather than the oblique stem, while cases other than the ergative and absolutive are based on the oblique stem. Finally, a subset of nominals (personal pronouns in Lezgic; personal pronouns and masculine nouns in Andic) lack a regular genitive form and use a possessive attributive form.

3.4.4 Reflexive and Reciprocal Pronouns Virtually all Nakh-Dagestanian languages have a stem used as a long-distance reflexive which also serves as the base for the formation of emphatic and complex reflexives. Reciprocal pronouns always involve reduplication of either the numeral ‘one’ or the indefinite pronoun ‘some.’ (For a detailed account of reflexive and reciprocal pronouns, see chapter 21.)

3.4.5 Demonstrative Pronouns Monomorphemic demonstrative roots form a system which may include pronouns, locative adverbs, manner adverbs, manner attributives, predicative particles, and locative verbs. Demonstrative roots are differentiated on the basis of two parameters: (i) distance from the speaker and/or addressee, with the two basic values ‘close to the speaker’ (dem.sp) and ‘far away from the speaker,’ with a third value ‘close to the addressee’ (dem.addr) as a frequent addition; and (ii) position on the vertical axis with respect to the speaker, which has three values: ‘higher than the speaker,’ ‘lower than the speaker,’ and ‘at the same level as the speaker’ (Fedorova, 2001; Schulze, 2003) (see Table 3.2).

Demonstrative pronouns usually represent the least marked form of the demonstrative root and serve as the base for derivation of other classes of demonstrative words: locative adverbs (‘here close to the speaker,’ etc.), allative adverbs (‘toward the speaker,’ etc.), manner adverbs (‘the way the speaker does,’ etc.), and presentative particles, similar in function to voici and voilà in French (see Table 3.3). Dargwa languages also have demonstrative-based locative predicates denoting existence at a location (e.g., ‘being there, away from the speaker and the addressee’).

Table 3.2 Demonstrative Roots in Three Languages

Language   dem.dist   dem.sp   dem.addr   dem.up   dem.down
Agul       te         me       –          le       ge
Aqusha     it         iš       il         ik’      iχ
Udi        tːe        me       ke         –        –

Morphologically, demonstrative pronouns can show gender and number distinctions. In some languages, such as Nakh and Dargwa, demonstrative pronouns do not express any gender and number distinctions. By contrast, in Tsezic, Avar-Andic, and most Lezgic languages demonstrative stems in the absolutive case include a gender-number agreement position which reflects gender-number features of the head noun. Gender-number distinctions are often seen only in part of the paradigm and often show syncretism. Thus, Avar-Andic shows a full gender-number paradigm only in the absolutive. In oblique cases which have no dedicated agreement morpheme, gender-number values are indirectly conveyed by oblique stem markers (see Table 3.4).

3.4.6 Determiners There are no article-like determiners in modern Nakh-Dagestanian languages, though a definite article (o m, a f) is attested in Old Udi. The numeral ‘one’ is used to introduce new referents into discourse and thus could be considered an analogue of the indefinite article. The use of ‘one,’ however, is not obligatory with indefinite nouns; see Fedorova (1999) on Archi. Becker (2018) identifies ‘one’ in Agul as a separate article type she labels presentational article.

Table 3.3 Demonstrative Derivation of Distal Demonstrative: Agul, Udi, Aqusha Dargwa Language

Locative Adv.

Allative Adv.

Manner Adv.

Presentative

Predicate

Agul

ti-sa

ti-č

ti-štːi

t-aha

—

tː-ija

tː-aʁaj

tːe-tär

tː-ila

—

it-tːe

—

te-b

Udi Aqusha

it-tːu

Table 3.4 Partial Paradigm of the Godoberi Demonstrative Pronoun ho-w ‘that’

Case         m.sg        f.sg        n.sg        m/f.pl       n.pl
Absolutive   ho-w        ho-j        ho-b        ho-b-e       ho-r-e
Ergative     ho-š-tːi    ho-tːi      ho-tːi      ho-r-du-di   ho-r-di-di
Dative       ho-šːu-ɬi   ho-ɬːi-ɬi   ho-ɬːi-ɬi   ho-r-du-ɬi   ho-r-di-ɬi

Source: Kibrik (1996, p. 42).

3.4.7 Adjectives In some languages of the family, adjectives can be unambiguously identified as a category of their own, based on their inflectional behavior, sharing little or no morphology with either nouns or verbs, as in Lezgian and Udi. Adjectives are used as attributive modifiers to express properties. Nakh-Dagestanian languages have medium-sized adjectival systems covering all major semantic classes: size and dimension, shape, color, age, value, and physical properties. In many Nakh-Dagestanian languages, however, adjectives share a number of properties with verbs. A. E. Kibrik (1992) argues that there is no separate class of adjectives in Archi, and traditional adjectives must be analyzed as stative verbs (see also Polinsky, 2016a; Daniel, 2018b, argues for a view that adjectives constitute a lexical class separate from verbs in Archi). While Archi may demonstrate an extreme point in this respect, it is true that many Nakh-Dagestanian languages demonstrate similarities in the behavior of adjectives and (stative) verbs. Sumbatova and Lander (2014) also argue that adjectival modification can be considered a subtype of prenominal relative modification in Tanti Dargwa. Manner adverbs in such languages can then be analyzed as converbs of adjectives (stative verbs).

3.4.8 Numerals Cardinal numerals in Nakh-Dagestanian follow either a decimal or vigesimal system of counting. The core of Nakh-Dagestanian number systems is the set of underived numerals from 1 to 10 and 20. Numerals for multiples of ten below 100 use a decimal or vigesimal base, or a mix of the two. Numerals from 11 to 19 are compounds based on 10 combined with a unit from 1 to 9. Numerals above 20 and below 100 are complex expressions consisting of tens connected to a digit by a clitic (either a regular additive/coordinating clitic, as in Agul and Lezgian, or a dedicated clitic not used elsewhere, as in Dargwa). Nij Udi is unusual among Nakh-Dagestanian in having borrowed terms for multiples of 10 higher than 20. Numerals in the hundreds and thousands are based on nouns for ‘hundred’ and ‘thousand,’ respectively. The head of the NP containing a cardinal numeral is usually singular, as in (8) and (25).
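The difference between a decimal and a vigesimal base can be made concrete with a little arithmetic. The Python sketch below is purely numerical and contains no word forms from any language; the function name decompose and the three-part result are assumptions made for this illustration, and the treatment of a remainder between 11 and 19 as ‘10 plus a unit’ simply follows the description of the teens given above.

```python
# Illustrative arithmetic only: no actual numeral forms of any language are modeled.
# The function name and the shape of its result are assumptions made for this sketch.

def decompose(n: int, base: int) -> tuple:
    """Split 0 < n < 100 into (multiples of base, tens in the remainder, units).

    With base 10 the middle value is always 0; with base 20 the remainder can
    run up to 19, so it may itself contain one ten plus a unit.
    """
    if base not in (10, 20):
        raise ValueError("base must be 10 (decimal) or 20 (vigesimal)")
    if not 0 < n < 100:
        raise ValueError("n must be between 1 and 99")
    multiple, rest = divmod(n, base)
    ten, unit = divmod(rest, 10)
    return multiple, ten, unit


if __name__ == "__main__":
    for base in (10, 20):
        multiple, ten, unit = decompose(57, base)
        print(f"base {base}: 57 = {multiple} x {base} + {ten} x 10 + {unit}")
```

For 57, the decimal reading is 5 × 10 + 7, while the vigesimal reading is 2 × 20 + 17, with 17 in turn analyzable as 10 + 7.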

Ordinal numerals are derived from cardinal numerals. Ordinal suffixes often go back to a participle of the verb ‘say,’ though in a few languages the origin of the ordinal exponent is unclear. Udi employs the suffix -(i)mǯi borrowed from Azerbaijani. Distributive numerals are formed by reduplication of the numeral stem. Depending on the language and the structure of the numeral (e.g., simple vs. compound, monosyllabic vs. polysyllabic), full or partial reduplication may be used.

(8) Ingush
aaz       itt~itt    kinashjka   q’east-q’eastaa   dwa-dahwtar.
1sg.erg   distr~10   book.abs    individually      pv-d.send.wit
‘I sent books off in packages of ten.’ (Nichols, 2011, p. 205)

Numeral stems also serve to derive adverbs indicating number of times (repetitions) and collective cardinal numerals denoting the number of groups (used in particular to count pairs of objects).

3.4.9 Postpositions Nakh-Dagestanian languages have postpositions only. The majority of spatial postpositions, and often temporal postpositions as well, represent historical spatial forms of a nominal root, with the dependent noun in the genitive case. Many postpositions can also function as locative or temporal adverbs.

3.5 Verbal Morphology Although it is the nominal paradigm (especially spatial case forms) that has garnered wide acclaim for Nakh-Dagestanian languages, their verbal systems can be very rich, too. However, variation across language groups and individual languages is high: while both agglutination and fusion are common, as are synthetic and periphrastic forms, the size of verbal paradigms and grammatical distinctions encoded by verb forms can differ widely from one language to another.6 On average, however, the verbal morphology of Nakh-Dagestanian languages is relatively simple compared to that of Northwest Caucasian and Kartvelian.

6 According to the famous calculation made by Kibrik (1977b, p. 37), as many as 1,502,839 inflectional forms, both synthetic and periphrastic, can be derived from a single verb root in Archi. This number includes all possible gender-agreeing forms, all case-number forms of regular nominalizations, and also forms that could alternatively be interpreted as combinations of verb forms with quotative and modal particles (see also chapter 1).

In this section, we start with the discussion of the simplex/complex verbs distinction and the main principles of verbal inflection. We will then proceed to the problem of finiteness and an overview of indicative, non-finite, non-indicative, and periphrastic verb forms. Negation strategies are summarized in 3.5.8. We describe valency-changing operations in 3.5.9 and discuss verbal agreement in 3.5.10.

3.5.1 Morphological Classification of the Verbal Lexicon To varying degrees, the contrast between simplex and complex verbs is relevant to all branches of the family. Simplex verbs comprise verbs with morphologically unanalyzable stems or with stems that include derivational prefixes or suffixes. Among the derivational prefixal morphemes, spatial prefixes (also known as “preverbs”) play an especially important role in Dargwa, Khinalug, and the Lezgic languages. There can be up to three spatial preverbs in a verb stem. As in the system of spatial case forms, there are separate series of prefixes expressing localization and orientation. In some instances, it is evident that localization preverbs have the same origin as locative case suffixes in the nominal paradigm; this is shown in example (9) from Agul, where the prefix q- of the localization ‘post’ (‘behind’) is identical to the post-essive case marker.

(9) Agul
ruš-a      gardan-i-q      šarf        q-ix-i-ne.
girl-erg   neck-obl-post   scarf.abs   post-put-pfv-aor
‘The girl put a scarf on (lit. behind) her neck.’ (Maisak & Ganenkov, 2016, p. 3587)

However, some prefixes are found that do not have any obvious locative meaning and are traditionally described as “expressive” (e.g., in Lezgic). In some languages (i.e., Lezgian and Agul), a regular repetitive prefix meaning ‘again’ or ‘backward’ is also attested. It is often the case that prefixal verbs develop idiomatic meanings, so that the original locative component is no longer evident. Like gender markers, prefixes can become lexicalized (“fossilized”), turning into phonological material associated with the verb roots (Schulze-Fürhoff, 1992). Suffixal verbal derivation is not very common, except for the derivation of causatives, which are well represented in the Andic, Tsezic, and Dargwa branches.
In some languages, derivational suffixes seem to result from the grammaticalization of complex verbal constructions: thus, the inchoative and causative verbalizers in Andi are related to the verbs ‘become’ and ‘do,’ respectively.

Verb compounding in many languages is the chief, or even the only, available means of creating new verbs. Complex verbs or light-verb constructions consist of a lexical component (sometimes described as the coverb) and a light verb. As a rule, light verbs host all inflectional marking like tense-aspect-mood and polarity. They are usually high-frequency verbs with generalized meanings, such as ‘be, become,’ ‘do, make,’ ‘give,’ ‘say,’ ‘hit, beat,’ ‘go,’ ‘come,’ and some others. In light-verb constructions, it is common to find intransitive/transitive pairs with different light verbs which share the same lexical component, for instance, the light verb ‘be, become’ in the intransitive and ‘do, make’ in the transitive. The lexical component of light-verb constructions can be a noun, an adjective, an adverb, an ideophone, or a nominal or verbal (or even “acategorical”) bound stem which cannot function as an independent morpheme. The most productive means of expanding verbal vocabulary is to borrow verbs from a dominant language (Russian, or Azerbaijani for southern Lezgic speakers, or Avar for Tsezic speakers) in combination with a light verb. Russian (and Avar) verbs are normally borrowed in the infinitive, while the Azerbaijani loans take the form of the perfect participle with -miš. Some examples of complex verbs in Chechen are shown in (10).

(10) Chechen

ja:z d- ‘write’           < ja:z ‘write’ (bound stem)             + d- ‘do, make’
zakaz d- ‘order’          < zakaz ‘order’ (noun, Russian loan)    + d- ‘do, make’
wojla j- ‘judge, think’   < wojla ‘idea’ (noun)                   + j- ‘do, make’
wojla xil- ‘intend’       < wojla ‘idea’ (noun)                   + xil- ‘become’
telefon tuox- ‘phone’     < telefon (noun, Russian loan)          + tuox- ‘strike’
nwox da:qq- ‘plow’        < nwox ‘plow’ (noun)                    + da:qq- ‘take’
barkalla ba:x- ‘thank’    < barkalla ‘thanks’                     + ba:x- ‘say, talk’
(Nichols, 1994, pp. 48–49)

Light-verb constructions may be fully productive or may be fully lexicalized. Thus, describing complex verbs in Archi, Chumakina (2016, p. 3600) states that “[s]yntactically, all types of complex verbs demonstrate the characteristics of a single word: the order of the parts is fixed (the lexical part is followed by the light verb) and the insertion of other lexical material between these parts is not, as a rule, allowed.” In particular, in complex verbs formed with the light verb ‘do,’ the lexical component may no longer be assigned a thematic role, and instead becomes an incorporated component of the complex verb. The resulting transitive complex verb can then take an independent object. The internal components of complex verbs can also become so closely linked that the whole complex (not just the light verb) can serve as input for morphological operations like prefixation, as in the Agul repetitive qa-un-aq’as ‘call again’ (< un aq’as ‘to call’, lit. ‘sound do’), where the repetitive prefix qa- precedes the incorporated component (Maisak & Ganenkov, 2016, p. 3583). As a final stage, a light verb can even disappear as a result of phonological reduction. The latter situation can be illustrated with Lezgian complex verbs that include the light verb awun ‘do’ (e.g. k’ʷalaχ awun ‘to work’, lit. ‘work do’). Such verbs occur in two forms: in their full form the light verb is present, as in the aorist k’ʷalaχ awu-na ‘worked’; in their reduced form, the light verb ‘do’ is not visible, as in k’ʷalaχ-na, with the aorist suffix -na added directly to the lexical component (Haspelmath, 1993, p. 178).

A development in the opposite direction is also possible; that has occurred in Udi, where simplex verbs were reanalyzed as bipartite, probably by analogy with the historically bipartite complex verbs, which far outnumber underived, monomorphemic verb stems in the language. This reanalysis has led to the rise of the cross-linguistically rare clitic type known as endoclitics, that is, clitics that can occur inside words (Harris, 2000, 2002). Personal agreement markers in Udi can occur as enclitics (both on verbs and non-verbal constituents, see section 3.5.10) but also appear inside the verb stem, dividing the verb into two parts. With complex verbs, the part detached is the light verb (see aš-b- ‘work-do’ in (11)), but with simplex verbs it is just the last consonant, which is devoid of any meaning on its own (see the simplex root beˁʁ- in (11)).

(11) Udi

a. aš=ne=b-sa
   work=3sg=do-prs
   ‘s/he works’

b. beˁ=ne=ʁ-sa
   look1=3sg=look2-prs
   ‘s/he looks’
(Harris, 2002, pp. 122, 125)
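The two placements in (11) can be restated as a simple splitting procedure. The Python sketch below is a deliberately simplified illustration that covers only the stem-internal placement shown in (11), not the enclitic uses mentioned above; the function names are invented for this sketch, and treating the last character of a simplex root as its last consonant is an assumption that holds for beˁʁ- but is not offered as a general rule of Udi.

```python
# Illustrative sketch only: it reproduces the two forms in example (11) and nothing more.
# Function names are hypothetical; the "last character = last consonant" shortcut is an
# assumption that happens to work for the root in (11b).

def split_simplex(root: str) -> tuple:
    """Detach the final consonant of a simplex root (simplified: the final character)."""
    return root[:-1], root[-1]


def with_endoclitic(first_part: str, detached_part: str, clitic: str, suffix: str) -> str:
    """Place the person clitic between the two parts of the stem, then add the suffix."""
    return f"{first_part}={clitic}={detached_part}-{suffix}"


if __name__ == "__main__":
    # (11a) complex verb: the detached part is the light verb b- 'do'.
    print(with_endoclitic("aš", "b", "ne", "sa"))
    # (11b) simplex verb: the detached part is the last consonant of the root.
    first, last = split_simplex("beˁʁ")
    print(with_endoclitic(first, last, "ne", "sa"))
```

The only difference between the two cases is where the split point comes from: a morpheme boundary in the complex verb, and a purely phonological cut in the simplex verb.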

3.5.2 Verbal Inflection In inflection, verb forms typically include a stem and one or more suffixes. Inflectional prefixation is not so prominent (unlike derivational prefixation described in section 3.5.1); inflectional prefixes include gender-number and negation markers and, rarely, aspectual prefixes (e.g., perfectives in Tabasaran or repetitives in Lezgian and Agul). Gender-number, negation, and aspectual morphemes (especially those marking imperfective aspect) can also occur as infixes. Infixation is preferred when a verb includes a derivational (locative) prefix, even a lexicalized one, although simplex verb stems in some languages allow infixation as well. As a rule, synthetic verb forms can be grouped according to the stem they are derived from. In languages such as Lezgic, Dargwa, Lak, and Khinalug, there is a clear opposition between perfective and imperfective stems. Morphologically, the distinction can manifest in a handful of ways, including suffixation, apophony, infixation (where one stem or the other may include an infix, usually a sonorant -r- or -l-), or reduplication. The languages show a great deal of variation in the extent to which different morphological strategies are employed to distinguish the stems. On the whole, the markedness relation between the perfective and the imperfective stem can be analyzed as equipollent, although there are clear cases where the marked imperfective stem is derived from the perfective (especially when infixation or reduplication is at work) (see also Daniel, 2018a). High-frequency verbs like ‘go,’ ‘come,’ ‘say,’ ‘give,’ ‘do,’ and ‘be, become’ tend to have suppletive aspectual stems. TAM-based suppletion is also attested in certain verb forms of certain verbs and does not just involve stems (e.g., the present tense neχ ‘says’ is a fully suppletive form in Udi, not only displaying a special stem of the verb ‘say’ but also omitting the present tense suffix ‑sa). 
Example (12) illustrates the main types of formal relations between stems, with data from Archi, Tsakhur, and Mehweb Dargwa.

(12) Formal relations between perfective and imperfective stems

Language   Verb       pfv      ipfv         Morphological Strategy
Archi      ‘die’      k’a      k’a-r        r-suffixation
           ‘wash’     ocː’u    o‹r›cː’u-r   r-infixation + r-suffixation
           ‘freeze’   qa       qe‹r›qi-r    r-infixation + r-suffixation + stem reduplication + vowel alternation
Tsakhur    ‘do’       haʔ-u-   haʔ-a-       vowel suffixation
           ‘beat’     ɨˁχ-ɨ-   ɨˁχīχ-a-     vowel suffixation + stem reduplication
           ‘give’     hiwo-    hele-        suppletion
Mehweb     ‘open’     abx-     ibx-         vowel alternation
           ‘fill’     -ic’-    -i‹l›c’-     l-infixation
           ‘throw’    ihʷ-     i‹r›hʷ-      r-infixation
           ‘reap’     irx-     irx-         (no distinction)

As in the nominal domain, Nakh-Dagestanian verb systems are predominantly agglutinative, although fusional morphology can also be found. Besides the perfective and imperfective aspects, which in some branches of the family are expressed by distinct suffixes, other TAM values may be unmarked or expressed cumulatively by the verb form as a whole. In particular, there are normally no distinct tense morphemes, apart from present or past tense auxiliaries, which, as parts of periphrastic constructions, do not always express temporal reference.7 Certain TAM values (as well as negation and person, see sections 3.5.8, 3.5.10) can be expressed by clitics or, at least, morphemes which enjoy more morphological freedom compared to canonical affixes. Conditional, reportative, or past (“retrospective shift”) markers can belong to this class, being able to combine with various TAM forms. Such markers typically originate from auxiliaries which are in the process of losing their syntactic autonomy (thus, reportative clitics may originate from speech verbs and retrospective shift markers from past tense copulas).

7 For example, periphrastic perfects tend to have the structure perfective converb + present copula, with a past tense interpretation; the future tenses also make use of the present auxiliary, which reflects their aspectual and/or modal origin (e.g., ‘is going to do’ or ‘has to do’) (see also section 3.5.7).

3.5.3 Finiteness The distinction between finite and non-finite verb forms is not always straightforward. On one hand, some verb forms have a clear status and are used only in syntactically independent (root) clauses, where they appear in the indicative or imperative. Other verb forms are only found in certain types of subordinate clauses (infinitives in purpose or complement clauses, converbs in adverbial clauses, participles in relative clauses, etc.). On the other hand, in many, if not most, languages of the family syncretism between finite and non-finite forms is observed with either one or several verb forms. Perfect or past tense forms can be identical to perfective converbs or participles, while present and future tense forms can be identical to imperfective converbs or participles (or to infinitives, in the case of future tenses). Such syncretism can be purely historical in nature, or even accidental in that the functional relation between a finite and non-finite form may be unclear on a synchronic level; however, in some cases the homophony of finite and non-finite forms can be substantial.

Tsakhur presents a case of massive finite/non-finite syncretism in the core indicative paradigm: the three main finite tense forms (namely, present, past perfective, and future) each possess two variants. Of these, the first variant is identical to the imperfective converb, the perfective converb, and the infinitive (“potentialis”), respectively; the second is identical to corresponding participles, derived by means of a regular attributive suffix:

(13) Tsakhur

Stem                Converb/Finite Form                   Participle/Finite Form
Perfective Stem     āqɨ ‘having opened’ / ‘opened’        āqīn ‘which opened’ / ‘opened’
Imperfective Stem   āqa ‘while opening’ / ‘is opening’    āqan ‘which is opening’ / ‘is opening’
Infinitive Stem     āqas ‘to open’ / ‘will open’          āqasɨn ‘which will be open’ / ‘will open’
(Kibrik & Testelets, 1999, pp. 86–87)

In general, finite subordinate clauses are not common in the languages of the family, as complement, adverbial, and relative clauses are normally headed by converbs and participles (see sections 3.7.3–3.7.5). In languages with person agreement, the agreement tends to be restricted to independent clauses (thus constituting one criterion of finiteness), although the correlation is not absolute: both person-agreeing dependent forms (e.g., subjunctives, see section 3.6.6) and non-agreeing independent forms (e.g., imperatives) are also attested.

3.5.4 Indicative Forms There is high variation among the languages of the family as to the balance between synthetic and periphrastic forms in the core indicative paradigm. While some languages show preference for synthetic forms (e.g., Udi), in others periphrasis plays a major role. Most languages combine the two. In functional terms, the indicative paradigm typically includes several past tenses, one or more presents, and one or more futures. Thus, in Tsakhur, besides the synthetic forms listed in (13), the core part of the indicative also includes two periphrastic forms with the present tense copula wod ‘is’, namely the durative and the perfect; two periphrastic forms with the past tense auxiliary ɨxa ‘was’, namely the imperfect and the pluperfect; and two periphrastic forms with the future tense auxiliary ixes ‘will be’, namely the imperfective ‘potentialis’ and the perfective ‘potentialis’ (Kibrik & Testelets, 1999, p. 90).

In general, the present subparadigms may include both a present continuous tense and a more general present habitual. As a rule, lexically restricted progressives are not attested in the languages of the family, although there may exist periphrastic constructions with a “focalized” progressive interpretation, as in Chechen. Several future tenses may coexist, with vague semantic differences in the degree of certainty expressed by the future form. A dedicated future form may be missing, which is especially the case when there are forms that display future/habitual polysemy (e.g., the form in -da in Lezgian); see Haspelmath (1998) and Tatevosov (2005) on diachronic explanations for this. Besides futures, prospectives and/or intentionals are found in some languages (e.g., Avar, Bagvalal, Khwarshi, and Udi), which describe the preparatory stage of a situation (‘about to happen’) or the intention of the agent. The latter forms are not, strictly speaking, temporal, but rather aspectual or modal; furthermore, they can be combined with past tenses, shifting their temporal reference to the past.

Past tense forms usually present the most complex paradigm. It is common to find a distinction between a perfective past tense (aorist) and a perfect or resultative; from comparison between closely related languages, it is clear that cognate forms often vary in meaning across languages, corresponding to different stages of grammaticalization along a common path: resultative > perfect > perfective past.8 Imperfects (imperfective pasts) and pluperfects are typically periphrastic and utilize auxiliaries in the past tense.
In some languages (e.g., Agul or Andi), it is the perfect, often a periphrastic perfect, that also functions as an indirect evidential form (for both hearsay and inference). In other languages (e.g., in the Nakh and Tsezic branches), unwitnessed pasts are synthetic forms which do not express perfect meanings. There are also languages where no evidential contrasts are grammaticalized in the verb system (like Udi). Such languages may express evidentiality by means of clitics (e.g., =lda in Lezgian, a hearsay marker originating from the verb ‘say’) or periphrastic forms (see section 3.5.7). For an overview of evidentiality in Nakh-Dagestanian languages, see Forker (2018b) and Verhees (2019); various aspects of their TAM systems are also discussed in Authier and Maisak (2011) and Forker and Maisak (2018).

8 Belyaev (2018) is a detailed study of the synchrony and diachronic evolution of past tense systems in Dargwa, and Maisak (to appear) presents an overview of perfects and aorists in Lezgic languages.

3.5.5 Non-Finite Forms Typically, a Nakh-Dagestanian verb paradigm includes non-finite forms such as converbs, participles, infinitives, and nominalizations (or “action nominals,” often labeled “masdars” in the descriptive tradition). As a rule, such forms can be derived without any lexical restrictions from any verb (except for morphologically defective copulas or existential verbs in some instances). In some languages, converbs, participles, or infinitives can also be used as finite forms without any additional marking, except perhaps for person agreement markers in those languages that have them (see section 3.5.3 on finite/non-finite syncretism). Some languages (e.g., Udi) also possess a regular morphological way to derive agent nouns. Action nominals and agent nouns inflect for case and number, as do infinitives in some Dargwa languages.

Converbs are a heterogeneous class of non-finite forms used in adverbial clauses. This class usually includes semantically more general converbs (typically, a perfective and an imperfective converb which can also appear in periphrastic constructions), as well as a plethora of forms specifying various temporal relations (‘when,’ ‘before,’ ‘after,’ ‘immediately after,’ ‘until’; see section 3.7.5). Archi also has a special “imperative converb,” i.e., a subordinate form with ‑lli which only appears in clauses with the finite verb in the imperative.

(14) Archi
zaba-lli       q’owq’i
come-imp.cvb   sit.imp
‘Come and sit down!’ (Kibrik, 1977a, p. 297)

Sometimes there also exist dedicated forms to express manner, reason, comparison, conformity (‘in accordance with’), etc. Purpose is typically expressed by infinitives, although there may be other means of encoding it as well. Some languages feature a series of locative converbs (‘being in a place’), which can bear various locative inflections. An example of a system of converbs in a single language, Akhvakh, is given in (15); as a rule, the allomorphs are distinguished according to gender agreement or have free distribution.

(15) Akhvakh
a. general converb: -o(ho), -e(he), -e, -i, -ere
b. progressive converb: -ero, -eri, -ere
c. locative converb: -iɬː- + spatial case (essive -i, lative -a, or elative -u[ne])
d. posterior converb: -idiɬːi, -eɬːi ~ -adeɬːi
e. simultaneous converb: -ideɬːi
f. inceptive converb (‘from the moment when V-ing began/begins’): ‑ariɬoχːa
g. two immediate converbs (‘as soon as V-ing occurred’): -ik’ena, -ula
h. anterior converb (‘before/until V-ing occurs/occurred’): ‑alaq’o, -aloq’o
i. imminent converb (‘just before V-ing occurs/occurred’): -idaɬa, -idaɬoqːe
j. non-posterior converb: -iʟeda
k. conditional converb: -ala or -ãčala
l. concessive converb: -alala or -aloʁola ~ -aloʁona
m. similative converb (‘in the same way as. . .’): -eroqe ~ -ereqe
n. gradual converb (‘in proportion as. . .’, ‘the more…, the more. . .’): -ũda ~ -ũdaɬe
o. explicative converb (‘because. . .’): -erogu ~ -eregu
p. purposive converb (‘in order to. . .’): -uʁana
(based on Creissels, 2010a)

In the participial domain, a typical distinction is between a perfective (resultative) and an imperfective (habitual) participle, although some other categories can be found as well. There is no distinction between “active” and “passive” participles (see section 3.7.3). In some languages with a regular attributivization strategy (e.g., Archi and Andi), almost any finite verb form can be turned into a modifier by means of a special morpheme.

Although the origin of non-finite forms is not always transparent, in many instances the affixes bear a clear resemblance to case markers: thus, the infinitival -s/-z in Lezgic languages reflects the Proto-Lezgic dative case suffix (Alekseev, 1985, p. 100). Converbs, especially those expressing temporal relations, often include locative case markers. Those forms that convey a meaning of precedence in time usually employ elative cases, sometimes with a postposition/adverb like ‘back, after.’ In Chirag Dargwa, for instance, the posterior converb is the historical ante-elative case of the infinitive. In some languages, converbs with the temporal meaning ‘as soon as’ or ‘immediately after’ include the comparative marker ‘as, like’ attached to a participle. The word ‘time’ has become a general temporal suffix in some Andic and Lezgic languages (note that, in Agul and Tabasaran, this is the now obsolete Persian noun gah ‘time’).

3.5.6 Non-Indicative Forms

Non-indicative forms include the imperative and various subjunctive and conditional forms. Plurality of the addressee (2sg vs. 2pl) in the imperative can be left unmarked, as in Avar and Lezgian, can be expressed by a distinct morpheme, as in Mehweb and Tanti Dargwa, or can be expressed by regular person markers, as in Udi, or plural suffixes, as in Andi. Imperatives are often derived from a special stem (not one of the aspectual stems) and may be zero-marked. The negative imperative (prohibitive) is typically unrelated to the imperative morphologically: it tends to be derived from the imperfective stem and employs a special negation marker. In Udi, the prohibitive is derived directly from the imperative, but a special negation marker is still used.

Indirect commands can be expressed by the (ad)hortative with first persons and jussive (‘let X Verb’) for third persons. The (ad)hortative is often expressed, or at least reinforced, by means of a dedicated clitic or an auxiliary. In Lezgic languages, the verb ‘come’ (in the imperative or adhortative form) is often used as an auxiliary in adhortative utterances. Jussives can be used to express optatives (i.e., wishes of the speaker); otherwise, there may be a dedicated optative form. On the whole, according to Dobrushina (2011b), two different kinds of optatives can be distinguished in the languages of the family: performative optatives express blessings and curses, while desiderative optatives express a “powerless wish” of the speaker (e.g., a dream or longing).

The verb forms used in hypothetical and counterfactual conditional protases are traditionally included among the specialized converbs, because such forms are restricted to subordinate clauses. When used independently, conditional clauses typically express optative meaning. Concessive forms are regularly derived from conditional ones by means of a clitic meaning ‘also, even,’ see (69) in section 3.7.5. In counterfactual apodoses, future-in-the-past forms can be employed, which are morphological futures with a past marker (clitic or auxiliary).

3.5.7 Periphrastic Verb Forms

Alongside synthetic verb forms, periphrastic constructions play an important part in the tense and aspect systems of the Nakh-Dagestanian languages (though the prevalence of their use varies). Periphrastic forms are composed of a non-finite component (usually a participle, a converb, or an infinitive) and a postpositional auxiliary. The most common type of auxiliary found is the copular verb, which is typically morphologically defective and mainly occurs in the present and past tenses (or in the present only). In some languages, an existential verb such as ‘be located (inside)’ or other auxiliaries with similar meanings can also be employed.

The degree of grammaticalization of periphrastic forms varies considerably both on the formal and on the semantic side. Some historically periphrastic forms have turned into synthetic ones, mainly as a result of copula loss or the merger of the auxiliary with a lexical verb. Moreover, structurally similar forms in different languages may cover different ranges of usage along the same path of grammaticalization. Thus, constructions with a perfective lexical verb and the auxiliary in the present tense, like perfective converb + copula and perfective converb + existential verb, give rise to meanings that fall along a spectrum of grammaticalization, from resultative to perfect to simple past (aorist). Constructions based on imperfective forms of lexical verbs tend to result in a different set of meanings, from progressive present, to habitual present, to future (or a form with a modal interpretation). Finally, the infinitive-based forms may result in a future, a prospective, or a deontic meaning. The example in (16) shows some of the most common periphrastic constructions from different languages. For an account of periphrastic paradigms in Lezgic presents and futures, see also Maisak (2011).

(16) Periphrastic constructions

converb + copula
a. Perfect (Bagvalal; perfective converb + copula)
w-išːi-w-o ek’ʷa
m-catch-m-cvb cop
‘has caught’
b. Continuous (Chechen; imperfective converb + copula)
hwooqu-sh v-u
rub:ipfv-cvb i-cop
‘is cutting (hay)’

converb + existential verb
c. Resultative (Shiri Dargwa; perfective converb + existential verb)
kejsː-un.ni te-w
m.lie.pfv-cvb be.there-m
‘is lying (over there)’
d. Present (Lezgian; imperfective converb + existential verb)
req’i-zwa (< req’i-z awa)
die-prs die-ipfv.cvb be.in.prs
‘dies, is dying’

participle + copula
e. Experiential (Agul; perfective participle + copula)
ruχ-u-f-e (< ruχ-u-f e)
read-pfv-nmlz-cop read-pfv-nmlz cop
‘has read’

infinitive + copula
f. Prospective/Debitive (Avar)9
w-ač’-ine w-ugo
m-come-inf m-cop
‘is going to come, should come’

Periphrastic forms such as aorists and perfects in the perfective domain, or presents, habituals, and futures in the imperfective domain, include auxiliaries in the present tense, but these auxiliaries can be omitted or merge with the lexical verb. Those forms in which there is a past tense auxiliary give rise to imperfects and past habituals, pluperfects and “discontinuous pasts” (i.e., “past with no present relevance”, in the terms of Plungian & van der Auwera, 2006). Also, “futures-in-the-past”, i.e., counterparts to future tenses including a past tense auxiliary, are regularly used as counterfactual irrealis forms (‘would have done’).

As well as primary periphrastic forms, Nakh-Dagestanian languages also make use of a wide range of secondary periphrastic forms featuring a regular verb ‘be, become, happen.’ Being morphologically regular and fully productive, unlike copulas or stative existential verbs, the verb ‘be, become’ can potentially take any form when used as an auxiliary. The Bagvalal example in (17) uses a periphrastic imperfect of the “perfect series.” While the “plain” periphrastic imperfect includes the imperfective converb and the past auxiliary, buk’a ‘be, become’, in the “perfect series” imperfective, the auxiliary is itself in the periphrastic perfect form buk’uro ek’ʷa (perfective converb + present copula).

(17) Bagvalal
hinč’a-b hins’-a-ɬi b-uhu-r-o,
big-n square-obl-inter h.pl-gather-h.pl-cvb
ek-unā-χ b-uk’-ur-o ek’ʷa.
eat-ipfv-cvb h.pl-be-h.pl-cvb cop
‘Having gathered together on a big square, (they) were eating.’ (Kibrik, Kazenin, Lyutikova, & Tatevosov, 2001, p. 728)

Alongside the canonical periphrastic forms, a number of aspectual, modal, and/or evidential constructions are attested, with varying degrees of auxiliation. Thus, in Lezgian, the stative auxiliary ama ‘stay, remain’ occurs in a series of highly morphologized continuative forms. In Agul, a periphrastic construction with the auxiliary verb aq’as ‘do’ and the imperfective converb is attested, with an iterative meaning. In many languages, the verb ‘be, become,’ besides being used in various tense and aspect forms (in contexts not available to the copula), is also employed as a modal verb, expressing both internal (‘can, be able’) and external possibility (‘may, be allowed’); as a rule, the verb takes a complement headed by the infinitive; see Lezgian example (44) in section 3.6.4.

In some branches of the family (especially Avar-Andic-Tsezic, partly in Dargwa) and in Archi, a construction with the auxiliary ‘find (accidentally), be found’ in the future tense is used to express conjecture or assumption (‘it is probable that’). There is also another construction with ‘find’ (predominantly in one of the past tenses) which encodes an evidential meaning, namely the presence of direct evidence for the situation on the part of the speaker or another perceiving subject (‘X found that,’ ‘X witnessed that’) (see Daniel & Maisak, 2018, for details).

9 See also chapter 6.

3.5.8 Negation

Negation is normally expressed by means of prefixal, infixal, or suffixal morphology; it is common for copulas to have a suppletive negative form. In some languages (e.g., Udi), negative morphemes are clitics rather than affixes. In periphrastic forms, the negative form of an auxiliary verb is used. Most Nakh-Dagestanian languages have more than one negation marker. Prohibitives always have a distinct negation marker, which, in most languages of the family, includes the sonorant /m/. Non-finite or non-indicative forms may have their own form of negation (such as nu in Udi). A split may occur between the form of negation used in past and non-past tenses (as in Avar).10

Constituent negation is most typically expressed by a grammatical focus construction, with the negated constituent followed by the negative copula and then the rest of the sentence. Simple finite clauses with the negative form of the verb can also be interpreted as constituent negation, though this interpretation is far less prominent than clausal negation.

10 See also chapter 6.

3.5.9 Valency-Changing Operations

Causatives are the most common valency-changing operations in the languages of the family. Causatives are predominantly formed analytically, with verbs like ‘do,’ ‘give,’ or ‘let’; morphological causatives are available but less common. In periphrastic constructions, the embedded verb usually takes the infinitive form. Causative auxiliaries do not always show clear signs of grammaticalization and can sometimes be regarded as ordinary complementizing verbs. The choice of the causative verb may depend on the transitivity of the embedded predicate; thus, in Kryz ‘do’ is used with intransitives, and ‘give’ with transitives. In some languages, former causative auxiliaries (e.g., ‘do’ in Avar and Tsakhur, ‘give’ in Udi) became morphologized as suffixes or morphologically bound light verbs, while in other languages (e.g., Andic and Tsezic), there exist suffixal causatives with an unclear lexical source. In several languages, periphrastic and morphological causatives co-occur, with some differences in interpretation (see Daniel, Maisak, & Merdanova, 2012, for a discussion of Agul causatives).

In morphological causatives formed from intransitive verbs, the semantic subject of the lexical verb (“causee”) appears in the absolutive, (18). In morphological causatives formed from transitive verbs, the causee takes the form of an oblique argument, (19).

(18) Ingush
aaz yz qier-iit
1sg.erg 3sg.abs fear-caus.ind.prs
‘I give him cause to be afraid, I make him afraid.’ (Nichols, 2011, p. 489)

(19) Ingush
cuo_i cynga_j nidzagha kinashjka diesh-iit.
3sg.erg 3sg.all force book.abs d.read-caus.ind.prs
‘He_i forces her_j to read the book.’ (Nichols, 2011, p. 489)

In causatives from verbs with non-canonically marked subjects (e.g., experiencers; see section 3.6.4), case assignment patterns of the verb tend to be preserved.

(20) Bagvalal
χajžat-i-r rasul-i-ba madinat j-ēč-ē.
Khadizhat-obl-erg Rasul-obl-aff Madinat.abs f-forget-caus
‘Khadizhat made Rasul forget Madinat.’ (Kibrik, Kazenin, Lyutikova, & Tatevosov, 2001, p. 387)

A phenomenon that resembles a valency-increasing operation is the morphological “verificative” in Agul and Archi, a set of verb forms that introduce a new agentive argument (the “verifier”) in the ergative case and express the meaning “to find out the truth value or the value of an unknown variable” (see Daniel & Maisak, 2014; Maisak, 2016a, for details).

The inventory of valency-decreasing operations is small: causative/decausative alternations are commonly expressed by labile verbs or by pairs of complex verbs (with ‘do’ vs. ‘be, become’ as light verbs), and not by morphological means. Nonetheless, in some languages, productive decausatives do exist. In Lezgian, for example, the decausative is formed by adding the light verb xun ‘be, become’ to the verb stem (Haspelmath, 1993, pp. 165–166), while in Udi, the decausative also includes the verb stem and a bound light verb, -ec-, originating from a motion verb. Dargwa has an antipassive construction in which the verb is not overtly marked (Sumbatova & Lander, 2014; van den Berg, 2001), shown in (21).

(21) Tanti Dargwa
a. murad-li tʼantʼi-d qul-re d-irqʼ-u-le=sa-j.
Murad-erg Tanti-n.pl(ess) house-pl.abs n.pl-do:ipfv-prs-cvb=cop-m.sg
b. murad tʼantʼi-w qul-r-a-li w-irqʼ-u-le=sa-j.
Murad.abs Tanti-m.sg(ess) house-pl-obl.pl-erg m.sg-do:ipfv-prs-cvb=cop-m.sg
‘Murad builds houses in Tanti.’ (Sumbatova & Lander, 2014, p. 278)

In (21a), which is a regular ergative construction, the subject of the transitive verb is in the ergative case, whereas the direct object is in the absolutive. Gender-number agreement on the lexical verb is with the absolutive argument (i.e., the direct object). In the antipassive construction shown in (21b), the subject of the transitive verb is in the absolutive case, while the former direct object is in the ergative. Gender-number agreement on the lexical verb is, again, with the absolutive argument (i.e., the subject). The argument case marking thus is reversed in the antipassive compared to the ergative construction, with no overt morphological marking on the verb. While the object appears in the ergative case in the antipassive, that ergative case is different from the ergative seen in the subject position, primarily with respect to its agreement possibilities: the antipassive object in the ergative cannot trigger the agreement that the transitive subject in the ergative is able to trigger, such as person agreement or gender-number agreement on the auxiliary. The antipassive object, therefore, is best analyzed as an oblique argument.

In a few languages (Avar, Tsezic, and some Andic), a morphological antipassive is attested (see Comrie, Forker, & Khalilova, to appear). In the latter group, antipassive marking is sometimes described as an aspectual (or “actional”) iterative; for Godoberi, Tatevosov (2011) claims that it is a basic “detelicizing” aspectual function which is responsible for suppression of the patient. There are no instances of morphologically marked reflexives or reciprocals; both are expressed by full pronominals.11

11 See also chapter 21.

3.5.10 Agreement Features and Their Morphological Exponence

Verbs show agreement in gender, number, and, in some languages, person. Number agreement may be conflated with person or gender but is sometimes expressed independently. For agreement and concord in the noun phrase, see section 3.6.1.

Gender agreement follows the ergative-absolutive pattern (see sections 3.4.1, 3.6.7). In verb stems, gender agreement can be marked by a prefix or an infix. Some verbal suffixes, especially participial ones (and sometimes converbs too, cf. Creissels, 2010a, 2012, on Akhvakh), also possess a gender agreement slot. Gender can be marked more than once in a single verb form, a phenomenon known as “exuberant exponence” or “multiple exponence” (Harris, 2009). For example, in the Batsbi form of the verb -ex- ‘destroy’, the fifth gender d-, which agrees with the absolutive noun phrase ‘old house,’ occurs three times, as both a prefix and a suffix to the root.

(22) Batsbi
tišin c’a daħ d-ex-d-o-d-an-iš
old house.abs pv v-destroy-v-prs-v-evid1-2pl.erg
‘You all are evidently tearing down the old house.’ (Harris, 2009, p. 268)

Sometimes different gender agreement slots can even be associated with different agreement controllers. For example, in a participle form, a slot in the verb stem will be controlled by the absolutive verb argument (“internal” agreement), while a suffixal slot will be controlled by the head of a noun phrase (“external” agreement). Thus, in (23), the prefixal slot agrees with ‘wolf’ in the neuter gender, while the suffixal agreement is with the masculine head noun, ‘man.’

(23) Tindi
[bac’a b-ixːʲu-w] hek’ʷa
wolf.abs n.sg-catch-m.sg man.abs
‘a man who caught a wolf’ (Magomedova, 2012, p. 176)

The morphological independence of plural agreement from gender agreement can be illustrated by data from Andi. In participles, and also in prohibitives and intransitive imperatives, the plural is marked suffixally, following the pattern of plural agreement found in demonstratives or adjectives (e.g., ho-w ‘this-m’ vs. ho-w-ul ‘this-m-pl’ and sir-dosːub ‘be.afraid-proh’ vs. sir-dosːub-ul ‘be.afraid-proh-pl’). A handful of verbs with stem-initial vowels express plural agreement with the absolutive argument by means of apophony such as /i/>/o/ or /u/>/a/. Thus, in the minimal pair j-ik’o ‘f-be.aor’ vs. j-ok’o ‘f-pl\be.aor’, the first slot indicates gender (here, feminine), while the vowel alternation expresses number. A similar mechanism of plural marking on the verb is also attested in Chechen and Ingush.

Person agreement is an innovation found in some Nakh-Dagestanian branches and is typically marked by suffixes or enclitics (see section 3.6.6). In Tabasaran and Udi, all or at least some person agreement markers bear an obvious resemblance to personal pronouns, and most probably originate from postpositional pronominal subjects which became encliticized to verb forms. In Lak and Udi, person markers are hosted by focused constituents, and can thus be placed on non-verbal phrases as well.

(24) Lak
na qːatri=ra d-ullali-sːa
1sg house.abs=1sg iv-build.dur-ptcp
‘I am building a house.’ (Kazenin, 2002b, p. 293)

In terms of morphological expression, inflection for person is relatively scarce. Dargwa languages distinguish number only in the second person, while the singular and plural forms are identical in both first and third person. Lak distinguishes between singular and plural in both first and second person but not between first and second person. By contrast, Udi has a full agreement paradigm, distinguishing three persons in both singular and plural.

3.6 Simple Clauses

In this section, we provide an overview of syntactic structures in Nakh-Dagestanian, including noun phrases, predicate structure, valency classes, word order, question formation, agreement, and grammatical relations.

3.6.1 Structure of Noun Phrases

A noun phrase consists of a head noun and its modifiers, including demonstratives, adjectives, numerals, participial clauses, and nouns in the genitive case. Modifiers normally precede the head noun. The most neutral order is shown in (25):

(25) Agul
ha-me ʡu c’üre baw-a-n berʜem
emph-prox two old mother-obl-gen dress
‘these two old dresses of Mom’s’

Numerals and demonstratives usually have a fixed order with respect to each other, while other modifiers are more flexible. In particular, adjectives can be placed either before or after the demonstrative and numeral.

Adjectival and genitive modifiers can be extraposed from the NP: either left-dislocated to the sentence-initial position or right-dislocated to the postverbal position. By contrast, numerals and demonstratives cannot be dislocated. Participial clauses, especially longer ones, can also be placed in the postverbal position. The dislocation of nominal modifiers in some languages obeys stronger syntactic restrictions. Polinsky (2015b) reports that in Tsez left dislocation is only possible out of absolutive and ergative NPs, whereas NPs in other cases do not allow subextraction. Research on the DP or NP status of nominal phrases in Nakh-Dagestanian is still lacking, but see some discussion in chapter 19.

A few languages (Tsakhur and languages of the Tsezic group) distinguish between direct and oblique forms of modifiers, depending on whether the head is in the absolutive case or in an oblique case (see also Kibrik, 1995).

(26) Tsakhur
a. ǯagʷara-na balkan
white-attr.abs horse.abs
‘a white horse’
b. ǯagʷara-ni balkan-ɨ-lʲ
white-attr.obl horse-obl-super
‘on a white horse’ (Kibrik & Testelets, 1999, p. 193)

Avar-Andic languages, Archi, Tabasaran, and Tsakhur show gender-number agreement of attributive modifiers with the head noun.

(27) Godoberi
a. q’aruma-w ima
greedy-m.sg father
‘greedy father’
b. q’aruma-j ila
greedy-f.sg mother
‘greedy mother’
c. q’aruma-b hamaχi
greedy-n.sg donkey
‘greedy donkey’ (Kibrik, 1996, p. 25)

In some languages, attributive markers can derive attributes from various types of phrases (including those headed by locative forms and adverbs, comitatives, or finite verb forms). When attached to adjectives and participles, attributive suffixes can also express contrastive focus, i.e., a property that sets the modified noun apart from a set of alternatives, as in Tanti Dargwa (see chapter 4). However, as Boguslavskaja (1995, pp. 236–237) reports, it is typical of Nakh-Dagestanian to mark contrastivity by a dedicated suffix.

3.6.2 Predicate Structure

Stative and dynamic verbs are the most typical predicates able to head an independent finite clause in Nakh-Dagestanian. Other parts of speech functioning as predicates are accompanied by the copula, thus forming copular clauses; consider the contrast between a verbal predicate (28a) and a copular predicate (28b) in Archi.

(28) Archi
a. zari to-w-mu-n q’onq’ eɬu.
1sg.erg dem-i.sg-obl-gen book.abs steal.pfv.iv.sg
‘I stole his book.’
b. gudu č’egʷ-du w-i.
dem.i.sg.nom bald-attr.i.sg i.sg-be.prs
‘He is bald.’ (Chumakina, Brown, Corbett, & Quilliam, 2007a)

In languages with person agreement, person markers can be attached directly to predicate nouns and adjectives without the copula.

(29) Chirag Dargwa
a. du murad-la babaj=da.
1sg.abs Murad-gen mother.abs=1
‘I am Murad’s mother.’
b. du waj-ze=da.
1sg.abs bad-attr=1
‘I am bad.’

3.6.3 Major Valency Classes

Transitive and intransitive verbs constitute the two major valency classes. Nakh-Dagestanian languages are morphologically ergative; that is, the subject of an intransitive verb and the direct object of a transitive verb are grouped together with respect to morphological case marking, both appearing in the unmarked absolutive case. The subject of a transitive verb is marked by the ergative case.

The distinction between unaccusative and unergative intransitive verbs has morphological reflexes in a number of languages of the family. In Udi, the difference is represented overtly through subject case marking: subjects of unergative intransitives are in the ergative, while subjects of unaccusative intransitives are in the absolutive.

(30) Nij Udi
a. χaˤ-j-en baˤp=e=ne.
dog-obl-erg bark=3sg=lv:prs
‘The dog is barking.’
b. χaˤ har-e=ne.
dog.abs come-perf=3sg
‘The dog came.’

A similar situation is found in Batsbi (Holisky, 1987), where personal pronouns in the subject position of an intransitive clause can be in either the absolutive or the ergative case depending on volitionality and control; third person subjects of intransitive verbs are invariably in the absolutive. In Southern Tabasaran, the unergative-unaccusative distinction is made visible via the use of different sets of pronominal verbal clitics. The pronominal clitics used with unergative verbs are those which clitic-double transitive subjects; meanwhile, unaccusative verbs employ those which clitic-double direct objects.

(31) Southern Tabasaran
a. uzu uvu ʁ-uˤ‹r›χ-un=za=vu.
1sg.erg 2sg.abs pfv-‹h.sg›save-aor=1sg:a=2sg:p
‘I saved you.’
b. uzu ʁa-f-un=za.
1sg.abs pfv-come-aor=1sg:a
‘I came.’
c. uvu ka‹r›c’-un=vu.
2sg.abs ‹h.sg›get.dirty-aor=2sg:p
‘You got dirty.’ (Bogomolova, 2012, pp. 103, 104)

In Akhvakh, unergative subjects pattern with transitive subjects in their ability to determine person agreement; unaccusative subjects, like direct objects, cannot trigger person agreement (Creissels, 2008). In Agul, some other Lezgic languages, and Lak, the distinction shows up as a restriction on occurrence in the Involuntary Agent Construction, a construction which expresses an external causer of an intransitive event who has a lower degree of agentivity (volitionality, control). Only intransitive change-of-state verbs can be used in the construction (Ganenkov, Maisak, & Merdanova, 2008). In addition, in Lezgian, the morphological causative -r is only used with unaccusative verbs. In the Avar-Andic and Tsezic languages, only unergative verbs have a frequentative or iterative form.

(32) Tsez
a. uži-bi k’oƛa-nay-s.
boy-pl.abs.i.pl run-iter-pst.wit
‘The boys kept running.’
b. *buq č’ura-nay-s.
sun.abs.iii shine-iter-pst.wit
Intended: ‘The sun was/kept shining.’ (Polinsky, 2015b)

Ditransitive verbs have three arguments: the agent, the transferred object, and the goal of transfer (“recipient”). The first two arguments correspond to the subject and the direct object of transitive verbs. The addressee in many Nakh-Dagestanian languages can be marked by the dative case or a spatial case. The dative marking conveys the idea of change in ownership, whereas spatial marking only suggests a physical act of transfer from the agent to the addressee (see Daniel, Khalilova, & Molochieva, 2010). Regardless of case marking, the recipient argument has never been reported to have any special syntactic properties, such as the ability to trigger agreement, so ditransitive verbs in Nakh-Dagestanian constitute a subclass of the transitive verbs which subcategorize for the recipient.

Nakh-Dagestanian languages are consistently ergative. However, personal pronouns across the family often manifest morphological syncretism between the absolutive and ergative cases. Despite this syncretism, the patterns of gender agreement point to the distinction between the underlying absolutive (which determines agreement) and the underlying ergative (which does not). In addition to case syncretism in personal pronouns, unusual case marking is found in the biabsolutive construction, wherein both the transitive subject and the direct object are in the absolutive (see section 3.6.8; see also chapters 18 and 20).

Udi stands out among the Nakh-Dagestanian languages in showing differential object marking: inanimate direct objects may be expressed in either the absolutive case or the dative case depending on animacy, specificity, and definiteness. Personal and demonstrative pronouns, as well as all human DPs, are obligatorily marked with the dative (Harris, 2002, pp. 244–248; Kasyanova, 2017).

3.6.4 Minor Valency Classes and Non-Canonical Argument Marking

Nakh-Dagestanian languages have a fairly uniform inventory of minor valency classes with non-canonically marked subjects. These minor classes include verbs of perception and cognition with the experiencer subject (e.g., ‘see,’ ‘hear,’ ‘know,’ ‘find out,’ ‘understand,’ ‘forget,’ ‘remember,’ ‘love,’ and ‘want’). Frequent additions to this core class are the verbs ‘find,’ ‘lose,’ and ‘be afraid’ (Comrie & van den Berg, 2006; Ganenkov, 2006). The experiencer typically appears in the dative case, as in Lezgian, Ingush, and Lak. Andic languages use two different strategies for the case-marking of the experiencer: dative or affective, as in (33). A similar division is found in Avar and Mehweb Dargwa. Tsakhur has three subject-experiencer valency classes: dative subject verbs, affective subject verbs, and ablative (more specifically, ad-elative) subject verbs. Differences in case-marking are associated with differences in meaning, but the full range of interpretative differences is yet to be explored.

(33) Godoberi
a. di-ra ʕajšati j-iʔa.
1sg-aff Ayshat.abs f.sg-know.pst
‘I got acquainted with Ayshat.’
b. waš-u-ɬːi idaɬ-ida jaši.
boy-obl-dat love-hab girl.abs
‘The boy loves the girl.’ (Kibrik, 1996, pp. 79, 120)

Non-canonical subjects of different experiencer verbs may also differ with respect to their morphosyntactic properties. In Chirag Dargwa, for example, dative experiencers with the verbs ‘see,’ ‘hear,’ ‘find out,’ and ‘understand’ can trigger person agreement like canonical subjects, whereas the dative arguments of ‘find,’ ‘remember,’ and ‘forget’ cannot. In Aqusha and most other Dargwa languages, experiencer verbs have undergone ergative shift and are now regular transitive verbs; the only remaining dative subject verb is the verb ‘want’ (Ganenkov, 2013). The only Nakh-Dagestanian language that shows no evidence of a separate class of experiencer verbs is Nij Udi, wherein all these verbs pattern as transitives (Ganenkov, 2008). Vartashen Udi employs a different set of agreement clitics with experiencer verbs, while still marking ergative case on the subject (Harris, 2002).

The verb ‘manage, be able, can’ often stands apart from other verbs with respect to subject marking, having its subject in a spatial case of the elative series. Often, this is the same case as is used in the involuntary-agent construction, as in Lezgian.

(34) Lezgian
a. a-da-w-aj qːaraʁ-iz xa-na-č.
3sg-obl-ad-elat stand_up-inf can-aor-neg
‘He could not stand up.’ (Lezgian Corpus)
b. zamira-di-w-aj get’e χa-na.
Zamira-obl-ad-elat pot.abs break-aor
‘Zamira broke the pot accidentally/involuntarily.’ (Haspelmath, 1993, p. 293)

Tsezic languages are unique in having a regular morphological derivation of “potential verbs,” which denote ability and a lower degree of agentivity (voluntariness, control); subjects of derived potential verbs also receive non-canonical (locative) case marking (Forker, 2013a; Polinsky, 2015b).

Predicative possessive constructions also display special behavior. Since Nakh-Dagestanian languages lack a verb ‘have,’ possessive constructions are formed with the existential verb. The possessum is expressed as the absolutive argument, and the possessor as an oblique argument. Nakh-Dagestanian languages often distinguish between permanent possession (ownership) and temporary possession. Most typically, permanent possession is expressed with the genitive case, as in (35a), whereas a locative case marks a temporary possessor, (35b).

(35) Hinuq
a. debe iyo, baru goɬ=e debe?
2sg.gen.abs mother.abs wife.abs be=q 2sg.gen.abs
‘Do you have a mother, a wife?’
b. hayɬo-qo zoq’we-n omoq’i
3sg-at be-pst.nwit donkey.abs
‘He had a donkey.’ (lit. ‘There was a donkey by him.’) (Forker, 2013b, p. 534)

In other languages, this distinction can be conveyed by different locative cases.

(36) Agul
a. za-q ʡu ruš=na sa gada q-a-a.
1sg-post two daughter.abs=and one son.abs post-be-prs
‘I have two daughters and one son.’
b. za-w nis=na guni f-a-a.
1sg-ad cheese.abs=and bread.abs ad-be-prs
‘I have cheese and bread with me (so we can have a snack now).’ (Ganenkov, Maisak, & Merdanova, 2008, p. 174)

3.6.5 Word Order

Word order in Nakh-Dagestanian languages is basically SOV, although this order can be altered to manipulate information structure, especially in the matrix (root) clause. The major structural components involved are the preverbal field and the postverbal field. In the preverbal field, the neutral (basic) order of arguments is Subject–Indirect Object–Direct Object. In actual usage, the arguments are arranged from left to right in order of decreasing topicality (from more to less topical); the immediately preverbal position is usually occupied by sentence focus. Nuclear stress is normally placed on the immediately preverbal constituent. The postverbal field is reserved for background information, that is, arguments that are recoverable from the context but still mentioned for the sake of clarity. For more on word order and its relation to information structure, see Testelec (1998a), Forker and Belyaev (2016), and chapter 24 of this volume.
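The ordering tendencies just described can be made concrete with a small sketch. It is only an illustration under stated assumptions: the numeric topicality score, the class `Constituent`, and the function `linearize` are hypothetical devices introduced here, not part of any published analysis, and real clauses are subject to many further factors.

```python
from dataclasses import dataclass


@dataclass
class Constituent:
    form: str
    topicality: float = 0.0   # hypothetical scale: higher means more topical
    focus: bool = False       # carries sentence focus
    background: bool = False  # recoverable from context, mentioned for clarity


def linearize(constituents, verb):
    """Arrange constituents around the verb following the tendencies above."""
    postverbal = [c for c in constituents if c.background]
    focused = [c for c in constituents if c.focus and not c.background]
    topical = [c for c in constituents if not c.focus and not c.background]
    # More topical material comes first; sorted() is stable for ties.
    topical = sorted(topical, key=lambda c: c.topicality, reverse=True)
    ordered = topical + focused  # focus ends up immediately preverbal
    return [c.form for c in ordered] + [verb] + [c.form for c in postverbal]


clause = [
    Constituent("DO", topicality=0.2, focus=True),
    Constituent("S", topicality=0.9),
    Constituent("IO", topicality=0.5),
]
print(linearize(clause, "V"))
```

With a focused direct object, the sketch yields the neutral Subject–Indirect Object–Direct Object–Verb order; marking an argument as background moves it after the verb.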

3.6.6 Questions

Polar questions are often marked on indicative forms by special suffixes or enclitics, as in (37).

(37) Hinuq
r-iker-i-me de dew-qo ax?
v-show-q-neg 1sg.erg 2sg.obl-at cheese(v)
‘Did I not show you the cheese?’ (Forker, 2013b, p. 743)

In some languages, for example, Lezgic and Dargwa, the question marker in polar questions can be attached only to the verb, while in other languages, as in Avar-Andic-Tsezic, the question marker can also be attached to a non-verbal constituent to mark focus.

Wh-questions vary in terms of question marking on the verb; in some languages the verb carries an interrogative marker, which may or may not be identical to the one used in polar questions. An interrogative verbal suffix or enclitic is found in Andic or Dargwa languages; in Andi wh-questions, interrogative enclitics are hosted by wh-phrases. In other languages the verb in wh-questions is not marked as compared to the affirmative. It is possible that some languages use clefts or pseudo-clefts to form wh-questions, but this issue requires further study.

Another common variety of question marking has to do with meditative questions (‘I wonder whether. . .’). A dedicated meditative marker can be used either instead of a default question marker, or in addition to it, as in (38).

(38) Andi
emi=le=qe hoɬːu w-aχunni-r?
who=q=mdt here m-live-prog
‘I wonder who lives here?’ (a girl asks herself after finding a house in the woods)

In alternative questions, the alternatives are simply juxtaposed or combined by means of a disjunctive coordinator. The verb can again be marked with an interrogative particle, as just described.

Embedded questions either feature a dedicated suffix attached to an indicative form or include the verb in a conditional form, as in (39) from Lezgian.

(39) Lezgian
hik’ čünüχ-na-t’a sadra sühbet aja kʷan.
how steal-aor-cond ptcl talk.abs do:imp ptcl
‘Say how (they) stole it.’ (Haspelmath, 1993, p. 426)

3.6.7 Agreement

Most Nakh-Dagestanian languages feature at least some agreement at clause level. Only Lezgian and Agul lack agreement of any kind. Gender agreement is a characteristic feature of Nakh-Dagestanian, absent only from a few southern Tabasaran dialects, Agul, Lezgian, and Udi. Gender agreement morphology is normally found on verbs (40a), but also on less canonical agreement targets, such as adverbs (40b) or spatial cases (40c); see chapter 20 for further discussion.

(40) Aqusha Dargwa
a. rursi r-ak’-ib.
girl.abs f.sg-come:pfv-aor
‘The girl came.’

b. rursi r-uħna a-r-ac’-ib.
girl.abs f.sg-inside pv-f.sg-go_in:pfv-aor
‘The girl went inside.’
c. rursi ši-li-zi-r-ad r-etkaq-ib.
girl.abs village-obl-in-f.sg-elat f.sg-disappear:pfv-aor
‘The girl disappeared from the village.’

Not all members of a lexical class (verbs, adjectives, or adverbs) show overt agreement. The ability to host gender-agreement marking seems to depend on the phonetic shape of the hosting stem but may also be determined lexically.

Gender agreement bears no relation to finiteness. If a stem/lexeme is specified as bearing the gender agreement marker, then it shows agreement in all clauses, finite and non-finite, as shown in (41) for the nominalization in Godoberi.

(41) Godoberi
c’aq’ab ida [min w-aʔ-ir].
good cop 2sg.abs m.sg-come-nmlz
‘It’s good that you have come.’ (Kibrik, 1996, p. 175)

In terms of agreement controllers, gender agreement operates on an ergative-absolutive basis: the absolutive argument (subject or direct object) determines gender agreement on verbs and other agreeing constituents. Although ergative-absolutive agreement is very strict in most languages of the family, Dargwa languages can show agreement with the ergative argument on agreeing auxiliaries and locative adverbs (agreement on lexical verbs is nevertheless strictly ergative-absolutive), a phenomenon that remains poorly understood (see Belyaev, 2016, 2017a; Ganenkov, 2018; Sumbatova & Lander, 2014; van den Berg, 1999).

(42) Khuduts Dargwa
rasul-li žuz b-uč’-unni {ca-b / ca-w}.
Rasul-erg book.abs n.sg-read:ipfv-cvb cop-n.sg cop-m.sg
‘Rasul is reading a book.’

Another non-canonical pattern of gender agreement arises in biabsolutive constructions. The most common strategy is for the direct object to control gender agreement on the lexical verb, and the subject on the auxiliary (see section 3.6.8, and also chapter 20).

In contrast to gender agreement, person indexing on the verb is attested only in a subset of Nakh-Dagestanian languages: Batsbi, Udi, Lak, Dargwa, Tabasaran, Akhvakh, and some dialects of Avar. Morphosyntactically, person indexing can be either agreement or clitic doubling. In Udi, Tabasaran, and Akhvakh, person indexing is clearly a recent innovation, although it should probably be reconstructed for Proto-Dargwa (Sumbatova, 2011a). Unlike gender agreement, person indexing in Nakh-Dagestanian is mostly restricted to finite clauses. Some types of non-finite clauses in Dargwa (conditional clauses, subjunctives, and person-inflected infinitives in a few southern Dargwa languages) and Udi (conditional clauses and subjunctives) also show person agreement.

Nakh-Dagestanian languages are fairly diverse with respect to controllers of person agreement. Udi has regular subject agreement. In Dargwa, agreement is sensitive to the person hierarchy and the hierarchy of grammatical functions. For example, Aqusha has hierarchical agreement with a preference for the direct object; that is, person agreement is controlled by the direct object unless it is third person, in which case it is controlled by the subject (van den Berg, 1999). In contrast, in Chirag, if the subject is first or second person, agreement is with the subject; otherwise, agreement is with the direct object. Various southern Dargwa languages (Qunqi, Khuduts, Itsari) have person agreement regulated by the person hierarchy 2 > 1 > 3. Thus, a second person argument preferentially determines person agreement, whether it is the subject or the direct object. In the absence of a second person argument, agreement is controlled by a first person subject or direct object. Third person agreement occurs only where both core arguments are third person (for more on agreement in Dargwa, see Belyaev, 2016; Sumbatova, 2011b).

Mehweb Dargwa, Akhvakh, and Zaqatala Avar have egophoric agreement. In egophoric agreement, the argument which determines agreement varies depending on illocutionary force (see Floyd, Norcliffe, & San Roque, 2018, for a recent typological study of egophoricity). In declarative sentences, first person subjects trigger agreement, while second and third person subjects appear with the unmarked verbal form. In questions, however, it is second person subjects that trigger agreement, whereas first- and third-person subjects do not. Egophoric agreement in declaratives and interrogatives has identical morphological paradigms (Creissels, 2008; Ganenkov, 2019a).

Lak is unique in having varying rules of person agreement with different verbal TAM forms. In some verb forms, agreement is hierarchical as in Dargwa; in other verb forms, agreement is ergative-absolutive (Kazenin, 2013a).

Tabasaran has clitic doubling rather than person agreement sensu stricto: verbal person markers are simply reduced personal pronouns in the required case. Tabasaran also allows personal pronouns in any syntactic position in the clause to be clitic-doubled on the finite verb. Subject clitic doubling is obligatory; non-subject arguments are optionally clitic-doubled. Up to two clitics, a subject clitic and a non-subject clitic, can appear on the finite verb.

(43) Southern Tabasaran
a. mama uzu-q lic-ur=zu-q.
mom.abs 1sg-post search-fut=1sg-post
‘Mom will be looking for me.’
b. durar-i uvu-z nikː tuv-un=vu-z.
3pl-erg 2sg-dat milk.abs give-aor=2sg-dat
‘They gave you some milk.’
c. uzu uvu-x-na ʁ-uš-un=za=vu-x-na.
1sg.abs 2sg-apud-lat pfv-(h.sg)go-aor=1sg:a=2sg-apud-lat
‘I went to your place (lit. went to you).’ (Bogomolova, 2012, pp. 101, 104, 107)
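The rules for choosing the controller of person agreement in Dargwa, and the egophoric pattern, lend themselves to a compact restatement. The following is a minimal sketch that assumes a transitive clause whose subject and direct object are each given as a person value (1, 2, or 3); the function names are hypothetical labels introduced here, and the sketch ignores intransitives, TAM-dependent variation, and language-particular exceptions.

```python
def aqusha_controller(subj, obj):
    """Aqusha: the direct object controls unless it is third person."""
    return "object" if obj in (1, 2) else "subject"


def chirag_controller(subj, obj):
    """Chirag: a first or second person subject controls; otherwise the object."""
    return "subject" if subj in (1, 2) else "object"


def southern_dargwa_person(subj, obj):
    """Qunqi, Khuduts, Itsari: the person indexed follows the hierarchy 2 > 1 > 3."""
    rank = {2: 0, 1: 1, 3: 2}
    return min((subj, obj), key=rank.get)


def egophoric_marked(subject_person, clause_type):
    """Egophoric pattern: marked form with first person subjects in
    declaratives and with second person subjects in questions."""
    trigger = {"declarative": 1, "question": 2}[clause_type]
    return subject_person == trigger


# A first person subject acting on a second person object:
print(aqusha_controller(1, 2), chirag_controller(1, 2), southern_dargwa_person(1, 2))
```

The same clause thus receives object-controlled agreement in Aqusha, subject-controlled agreement in Chirag, and second person indexing in the southern languages.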


3.6.8 Grammatical Relations

Although subject versus non-subject asymmetries have been claimed to be irrelevant for many Nakh-Dagestanian languages (Forker, 2014; A. E. Kibrik, 1992, 1997a), grammatical relations can be identified, since the subject and the direct object each have a unique set of syntactic properties that set them apart from each other and from oblique arguments.

Morphosyntactically, subjects and direct objects often have dedicated morphological cases. For the direct object, this is invariably the absolutive case. For the subject, this is usually the absolutive case with intransitive verbs, the ergative case with transitive verbs, and dative and/or locative cases with verbs of minor valency classes (see section 3.6.4). Gender agreement is determined exclusively on the basis of morphological case: only absolutive arguments, irrespective of their syntactic function and semantic role, can control gender agreement. Person agreement, however, is usually sensitive to either the subject alone, as in Udi, or both the subject and the direct object, as in Dargwa languages and Lak.

The subject can bind any other clause-internal argument, including the direct object, but cannot itself be bound in its domain. Similarly, the direct object can bind any clause-internal argument except the subject and can be bound only by the subject, but not by any other clause-internal argument (complications are possible, however, as discussed in chapter 21 of this volume). In contrast to all other arguments, the subject appears in the PRO position in obligatory control constructions; no other argument can be controlled.

Among valency-increasing derivations, the causative is the most prominent (section 3.5.9). In some Lezgic languages (Udi, Kryz, Budukh, Lezgian), the subject and the direct object participate in an anticausative construction, wherein the ergative subject is suppressed and cannot be overtly expressed, while the direct object becomes a derived subject in the absolutive case and controls person agreement.

(44) Lezgian
ha i ara-da sadlahana samolet cːaw.u-z χkaž xa-na.
emph this moment-in suddenly aircraft.abs sky-dat lift become-aor
‘Exactly at this moment the airplane suddenly rose into the sky.’ (Lezgian Corpus)

Dargwa antipassives are unusual because they involve no verbal morphology and their case marking appears to be the exact opposite of what is observed in transitive clauses. The derived intransitive subject is in the absolutive case, whereas the logical object is expressed as an oblique argument, in the ergative case, as shown in (45). Unlike transitive subjects in the ergative, the oblique-ergative object of the antipassive cannot determine gender agreement on the auxiliary.

(45) Khuduts Dargwa
rasul žuz-li uč’-unni {ca-w / *ca-b}.
Rasul.abs book-erg (m.sg)read:ipfv-cvb cop-m.sg cop-n.sg
‘Rasul is engaged in book-reading.’

Biabsolutive constructions are found in Avar-Andic, Tsezic, Lak, and some Lezgic languages (Archi, Tsakhur). In the biabsolutive construction, which is predominantly periphrastic, both the subject and the direct object of a transitive verb are in the absolutive; pragmatically, it is the subject (agent) which is topicalized. Gender agreement on the lexical verb is controlled by the direct object, whereas gender agreement on the auxiliary is determined by the subject (see Bond, Corbett, Chumakina, & Brown, 2016; Forker, 2012; Gagliardi, Goncalves, Polinsky, & Radkevich, 2014; see also chapter 20). The following examples illustrate a standard ergative-absolutive construction with a transitive verb ‘open’ and its biabsolutive counterpart in Lak.

(46) Lak
a. Ergative
aˁli-l čawaxulu t’it’laj b-u-r.
Ali-erg window.abs open.prog n.sg-aux-3
‘Ali is opening a/the window.’
b. Biabsolutive
aˁli čawaxulu t’it’laj u-r.
Ali.abs window.abs open.prog (m.sg)aux-3
‘Ali is opening a/the window.’ (Gagliardi, Goncalves, Polinsky, & Radkevich, 2014, pp. 137–138, glosses changed)
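The division of labor between the two agreement targets can be summarized in a short sketch. It assumes that each argument is represented as a (case, gender) pair and that, outside the biabsolutive construction, a clause contains exactly one absolutive argument; the function name and data layout are hypothetical, and the Dargwa pattern of auxiliary agreement with the ergative argument is deliberately not modeled.

```python
def gender_agreement(args, biabsolutive=False):
    """Return the gender value shown on the lexical verb and on the auxiliary.

    args maps 'subject' (and optionally 'object') to (case, gender) pairs.
    """
    if biabsolutive:
        # Object controls the lexical verb, subject controls the auxiliary.
        return {"lexical_verb": args["object"][1], "auxiliary": args["subject"][1]}
    # Assumption: exactly one absolutive argument in an ordinary clause.
    absolutive = [gender for case, gender in args.values() if case == "abs"]
    controller = absolutive[0]
    return {"lexical_verb": controller, "auxiliary": controller}


# Values follow the Lak examples in (46): 'Ali' is m.sg, 'window' is n.sg.
ergative = {"subject": ("erg", "m.sg"), "object": ("abs", "n.sg")}
biabs = {"subject": ("abs", "m.sg"), "object": ("abs", "n.sg")}
print(gender_agreement(ergative))
print(gender_agreement(biabs, biabsolutive=True))
```

The auxiliary comes out as n.sg in the ergative clause (compare b-u-r) and as m.sg in the biabsolutive clause (compare u-r), as in the Lak examples.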

3.6.9 Local Anaphora

Nakh-Dagestanian languages have elaborate anaphoric systems, which are discussed in chapter 21 of this volume.

3.7 Complex Sentences

This section reviews major structures associated with the syntax of complex sentences.

3.7.1 General Profile of Complex Sentence Formation

Complex sentences are most commonly formed by embedding a non-finite clause under a matrix clause. Finite embedded clauses are rarely attested in Nakh-Dagestanian; where they occur, they appear as complements of verbs of speech and thought across the family. Coordination of finite clauses is almost non-existent, apart from the use of the conjunction wa ‘and’, borrowed from Arabic, mainly in written language under the influence of Russian. Another conjunction used with finite clauses is the disjunctive ja ... ja ... . The functional equivalents of coordination are constructions with “chaining” converbs, which can display some properties of coordination. For an overview of coordinating constructions in Nakh-Dagestanian, see van den Berg (2004).

3.7.2 Clause Chaining

General converbs (as opposed to specialized converbs) are non-finite forms used in the “chaining strategy” of describing a sequence of events. In such chains, only the final verb is in the finite form, whereas all preceding clauses forming the chain are headed by a converb. The crucial distinction is between perfective and imperfective converbs. The perfective converb describes a completed event which precedes the event described in the matrix clause. The imperfective converb describes an event occurring simultaneously with the matrix clause event.

General converbs demonstrate mixed behavior with respect to coordination and subordination. Kazenin and Testelets (2004) show for Tsakhur that, despite being non-finite, converbal clauses can show properties of coordination or subordination according to a number of syntactic diagnostics, such as the locus of morphosyntactic marking, relativization, and the possibility of center embedding. The authors suggest that the variation correlates with semantic interpretation: converbal constructions describing a temporal sequence of events are coordinating, whereas the constructions that describe a causal relation between two events display subordination properties (see also Belyaev, 2011, for a discussion of converbs in three Dargwa languages).

3.7.3 Relative Clauses

Relative clauses are prenominal participial clauses modifying the head noun.

(47) Godoberi
[wacː-u-di b-aχi-bu] hamaχi b-ič’a.
brother-obl-erg n.sg-buy.pst-ptcp donkey.abs n.sg-die.pst
‘The donkey that my brother had bought died.’ (Kibrik, 1996, p. 211)

Since participles in Nakh-Dagestanian usually distinguish only aspect and not tense, a reduced set of semantico-syntactic distinctions is found in relative clauses. Depending on the inventory of a specific language, this may either leave only a distinction between perfective and imperfective participles, as in Dargwa languages, or allow for an elaborate system including perfective, imperfective, habitual, resultative, and optative participial clauses, as in Agul.

Almost any nominal constituent is accessible for relativization. In Tanti Dargwa, for example, the head noun modified by a participial clause can semantically correspond to almost any clausal argument or adjunct which remains unexpressed inside the participial clause (“gap strategy”).

(48) Tanti Dargwa
a. [ čutːu b-erk-un ] umra
chudu.abs n.sg-eat:pfv-aor(ptcp) neighbor.abs
‘the neighbor who ate a/the chudu (a kind of stuffed flatbread)’
b. [ da-li čutːu b-ičː-ib ] durħaˤ
1sg-erg chudu.abs n.sg-give:pfv-aor(ptcp) boy.abs
‘the boy to whom I gave a/the chudu’ (Sumbatova & Lander, 2014, p. 193)

If the head noun corresponds to a more deeply embedded NP inside the participial clause, such as a genitive modifier or an argument of a complement clause, resumptive pronouns are commonly used, though the gap strategy is often still available. If the head noun corresponds to an argument of an adverbial clause or embedded question, resumptive pronouns must be used. It is reflexive pronouns that take on the function of resumptives in Nakh-Dagestanian.

(49) Tanti Dargwa
[ sa-ri šin-ni-ja ag-ur-anne hit-i-liⱼ xːar-b-aʁ-ib-se ] rursːiᵢ da-li ʕaˤ-r-alχ-a-d.
refl-f.sg.abs water-obl-super(lat) go:pfv-aor-eq 3sg-obl-erg ask-n.sg-lv:pfv-aor-attr girl.abs 1sg-erg neg-f.sg-know:ipfv-prs-1
‘I do not know the girlᵢ about whom heⱼ asked whether sheᵢ went for water.’ (Sumbatova & Lander, 2014, p. 196)

Participial clauses where the head noun is not represented at all are also possible (see Comrie, Forker, & Khalilova, 2017); such relative clauses and their head nouns instantiate gapless apposition, well known from Japanese (e.g., Matsumoto, 1997). In examples (50) and (51), the head noun is not represented inside the participial clause, that is, the noun ‘smell’ in (50) is not a semantic or syntactic argument of the verb ‘fry’; likewise, the noun ‘meat’ in (51) is not an argument of the verb ‘slaughter.’ The participial clauses in both examples instantiate attributive modification where the head noun and the situation described by the participial clause are related by association.

(50) Bezhta
[ bisa ziza-jas ] mäh
fish.abs fry-prs.ptcp smell.abs
‘the smell of fish frying’ (Comrie, Forker, & Khalilova, 2017, p. 138)

(51) Archi
[ ħawan bu-ʟ’u-tːu-t ] aʟ’ ʟo-t’u.
ram.abs iii-slaughter.pfv-attr-iv meat.abs iv.give.pfv-neg
‘They didn’t give the meat from the slaughter of the ram.’ (Daniel & Lander, 2010)
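The choice between the gap strategy and resumptive pronouns described in this section can be restated as a simple lookup. The position labels and the names `STRATEGIES`, `available_strategies`, and `requires_resumptive` are hypothetical conveniences introduced here; the table only encodes the general tendencies stated in the text and makes no claim about any individual language.

```python
# Position of the relativized element inside the participial clause, mapped
# to the strategies the text describes, listed in order of preference.
STRATEGIES = {
    "clause_argument_or_adjunct": ("gap",),
    "genitive_modifier": ("resumptive", "gap"),
    "complement_clause_argument": ("resumptive", "gap"),
    "adverbial_clause_argument": ("resumptive",),
    "embedded_question_argument": ("resumptive",),
}


def available_strategies(position):
    """Return the strategies described for a given relativized position."""
    return STRATEGIES[position]


def requires_resumptive(position):
    """True where only the resumptive (reflexive) pronoun is described."""
    return STRATEGIES[position] == ("resumptive",)


# Example (49) relativizes an argument of an embedded question:
print(requires_resumptive("embedded_question_argument"))
```

In (49) the head noun ‘girl’ corresponds to an argument of an embedded question, which is why the reflexive resumptive appears there.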

Participial clauses may express both restrictive and non-restrictive modification; (52) illustrates the latter.

(52) Lezgian
[ däwe tː-akː-ur ] zun baχtːlu insan ja.
war.abs neg-see-aor.ptcp 1sg.abs happy person.abs cop
‘I, who did not see the war, am a happy person.’

In terms of external syntax, participial clauses are head-external attributive modifiers. No head-internal relativization is attested across the family, in contrast to Northwest Caucasian and Kartvelian. Participles in headless relative clauses either acquire overt nominalizing morphology or inflect for nominal categories without additional nominalizing morphology (in such instances, the nominalizing morphology may be null).

With respect to their internal syntax, participial clauses are non-finite clausal constituents. Core arguments of participial clauses in most languages do not differ in their morphological marking from arguments of finite clauses and they preserve obligatory gender agreement (see examples (47), (48), and (51) above; no person agreement is found in participial clauses).

Besides acting as modifiers, headless participial clauses are extensively used in grammatical focus constructions and can even function as the main predicate in such constructions; see chapter 24, Kazenin (2002b), and Rudnev (2015).

(53) Bagvalal
maħammad-i-r weč’e [ di-ha kumuk ǯ-oː-b ].
Mahammad-obl-erg cop:neg 1sg-dat help.abs do-ptcp-n.sg
‘It is not Mahammad who helped me.’ (Kibrik, Kazenin, Lyutikova, & Tatevosov, 2001, p. 697)

(54) Avar
šːiw aħ-a-ra-w?
who.abs shout-pst-ptcp-m.sg
‘Who shouted?’ (Rudnev, 2015, p. 115)

Headless participial clauses can also function as temporal adverbials and factive complement clauses. In that usage, they undergo nominalization and can appear with oblique case marking.

3.7.4 Complement Clauses

Complement clauses are predominantly non-finite. The only finite complement found consistently across the family is in reported speech with verbs of speech and thought. Morphologically, a similar inventory is employed across the family to mark embedded verbs in non-finite complements (see section 3.5.5); their distribution across different complement types is also stable. Participial complement clauses are used with factive matrix verbs like ‘know,’ ‘understand,’ and so on.

(55) Bagvalal
ʕali-r ek’ʷabq’al-šːu-ba š’ereː [ e-w-da ʕabija-w weč’-u-b ].
Ali-erg everyone-obl-aff prove.pst refl-m.sg-emph guilty-m.sg cop:neg-ptcp-n.sg
‘Ali proved to everyone that he was not guilty.’ (Kibrik, Kazenin, Lyutikova, & Tatevosov, 2001, p. 522)

Nominalizations are usually employed with activity-denoting complements.

(56) Kryz
jin [ rikime-k ibur ki-j-iǯ ] targitmiš ar-ay.
2pl door.pl-sub ear.abs pv-put-nmlz quitting do.imp-imp
‘Stop listening by the door!’ (Authier, 2009, p. 312)

Imperfective converbs often serve to mark complements with perception verbs and the verb ‘begin.’

(57) Lak
ga [ gikːu-sːa laboratorija-nu-wu zi-j ] aji‹w›x-unu u-r.
dem.abs there-attr laboratory-obl-in work-ipfv.cvb ‹m.sg›begin-pfv.cvb (m.sg)aux-3
‘He has started working in the lab there.’ (Kazenin, 2013a, p. 176)

The perfective converb is most typically used with the matrix verb ‘finish’ and in non-controlled complements of desiderative verbs like ‘want,’ ‘decide,’ and so on.

(58) Khuduts Dargwa
aba-l [ rasul urkaraqi ag-ur-ri ] q’ast b-arq’-ib.
mother-erg Rasul.abs to.Urkaraq go:pfv-aor-cvb decision n.sg-do:pfv-aor
‘The mother decided that Rasul would go to Urkaraq.’ (Ganenkov, 2019b)

Infinitives are used to mark complements with reduced semantico-syntactic properties, particularly when the matrix clause and the infinitival clause share a semantic subject. Infinitival complementation is used with modal, phasal, implicative, and desiderative verbs and may involve restructuring, subject control, object control, and so on.

(59) Ingush
a. [ txyn-ciga dwa-vuola ] mogii hwuona?
1pl.excl-chez pv-v.come.inf can꞊q 2sg(dat)
‘Could you come over here?’
b. muusaa [ qiera hwal-eibie ] ghert.
Musa.abs stone.abs up.lift-b.caus.inf intend.prs
‘Musa intends to lift the stone.’ (Nichols, 2011, p. 552)

Subject-to-subject raising is rare in Nakh-Dagestanian, mainly found with semantically empty auxiliaries, but not with phasal (e.g., ‘begin’ and ‘continue’) or modal (e.g., ‘can’) verbs. No subject-to-object raising has ever been reported in a Nakh-Dagestanian language.

Apart from canonical forward control/raising constructions, Nakh-Dagestanian languages are known to have a less familiar pattern of backward control, where the shared subject is overtly expressed in the complement clause, while remaining silent in the matrix clause.

(60) Tsez
___ᵢ [ kid-baːᵢ zija b-išr-a ] j-oq-si.
ii.abs girl.ii-erg cow.iii.abs iii-feed-inf ii-begin-pst.evid
‘The girl began to feed the cow.’ (Polinsky & Potsdam, 2002, p. 248)

A few Dargwa languages (e.g., Itsari, Qunqi, Amuq, and Khuduts) have developed innovative person-inflected infinitives used in complements to control verbs.

(61) Khuduts Dargwa
ʕušːa [ urkaraqi d-uq’-aˤtː-aj ] q’ast b-arq’-ib-da.
2pl.erg to.Urkaraq 1/2pl-go:pfv-2-inf decision n.sg-make:pfv-aor-2pl
‘You decided to go to Urkaraq.’ (Ganenkov, 2019b)

Case marking of arguments in non-finite complement clauses, including infinitives and nominalizations, is usually identical to that in finite environments. In some languages, such as Bagvalal and Tsez, nominalization requires one of its core arguments (the subject or the direct object) to be expressed in the genitive rather than the case required by the nominalized verb.

Although it preserves some typical family traits, Udi stands out among Nakh-Dagestanian languages for possessing unusual features acquired from neighboring Iranian languages, as well as Armenian and Azerbaijani. Nij Udi makes extensive use of finite complement clauses introduced by the subordinating particle ki (Lander, 2014b).
(62) Nij Udi
beχ=e=baftː-i [ ki ǯöj sa jaqː fikir-b-sun ]=e lazum.
understand=3sg=lv-aor that other one way.abs think-lv-nmlz=3sg necessary
‘He realized that it was necessary to figure out another way (to do that).’

Udi also has finite subjunctive complements used with typical control verbs like ‘want,’ ‘allow,’ and so on. As in Iranian languages and Armenian, subjunctive complements are non-controlled.

(63) Nij Udi
čur=uz=za [ hun sa äči-n hava far-kː-a=vaχ ].
want=1sg=lv.prs 2sg.erg one dance-gen melody.abs play-lv-sbjv=2sg:dat
‘I want you to play the tune of a dance.’
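The typical pairing of non-finite forms with complement types described in this section can be collected into a small lookup. This is a sketch of the stated tendencies only: the context labels and the names `TYPICAL_FORM` and `complement_form` are hypothetical, individual languages deviate from the pattern (for instance, the Tsez verb ‘begin’ in (60) takes an infinitive), and finite complements such as those of Udi are left out.

```python
# Typical non-finite form for each complement context named in the text.
TYPICAL_FORM = {
    "factive": "participle",              # 'know', 'understand'
    "activity": "nominalization",         # activity-denoting complements
    "perception": "imperfective converb",
    "begin": "imperfective converb",
    "finish": "perfective converb",
    "modal": "infinitive",
    "implicative": "infinitive",
}


def complement_form(context, shared_subject=True):
    """Return the form the text describes as typical for a complement context."""
    if context == "desiderative":
        # Non-controlled complements of 'want', 'decide' take the perfective
        # converb; complements sharing the matrix subject take the infinitive.
        return "infinitive" if shared_subject else "perfective converb"
    return TYPICAL_FORM[context]


# (58): 'The mother decided that Rasul would go' has distinct subjects.
print(complement_form("desiderative", shared_subject=False))
```

The only context that needs a second parameter is the desiderative one, where the form depends on whether the two clauses share a subject.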

3.7.5 Adverbial Clauses

Adverbial clauses are predominantly non-finite, expressed by specialized converbs which, unlike general converbs, express very specific semantic relationships between the matrix and embedded clause events: mostly temporal, but also manner, cause, reason, purpose, condition, and so forth.

(64) Agul
[ ha-le pːarza-ji-l-di q-aj-i-guna ], aʁ-a-a mi-s mi.
emph-that.up mountain_pass-obl-super-lat rep-come-pfv-temp say-ipfv-prs this-dat this.erg
‘When they have reached that mountain pass, he says to him. . .’

(65) Chirag Dargwa
[ di-la datːe j-ebč’-ib-tːel, ] it-i-la zu dami če-h-d-išː-ible.
1sg-gen father.abs m-die:pfv-aor-causal.cvb that-obl-gen name.abs 1sg.super super-up-n.pl-put:pfv-res
‘Because my father died, they named me with his name.’

(66) Avar
[ di-cːa ab-uqe ] ha-b-e he-b ħalt’i.
1sg-erg say-manner do-n.sg-imp this-n.sg work.abs
‘Do this job as I said.’ (Alekseev et al., 2014, p. 229)

Infinitival clauses are also often used to express purpose in addition to (or instead of) purposive converbs, especially with verbs of motion.

Conditional converbs are used in the protasis of conditional constructions. Realis and irrealis (counterfactual) conditional converbs are often distinguished, though some languages uniformly use the same marker in both.

(67) Khwarshi
wallah, do Ø-enλ’-a goli, me mesed-is sanqisi=n guga-qa=n gul-ɬo.
honestly 1sg.abs i-go-inf be.prs 2sg.erg gold-gen1 trunk=add back-cont=add put-cond
‘I swear, I will go, if you put a box of gold on my back.’ (Khalilova, 2009, p. 413)

The Dargwa languages (and also Udi) differ from the rest of the family in having person-marked conditional forms.

(68) Aqusha Dargwa
ħu-ni nab kumek b-ar-ad-li, nu-ni miskin-t-a-s k’el ɢuruš d-urt’-is.
2sg-erg 1sg(dat) help n.sg-do:pfv-2sg-cond 1sg-erg poor-pl-obl-dat two rouble.abs n.sg-distribute-fut.1
‘If you help me, I will distribute two roubles among the poor.’ (van den Berg, 2001, p. 49)

Concessive clauses are often morphologically based on conditional clauses. The most typical way to form a concessive is by appending the additive clitic to the conditional form.

(69) Lezgian
za šeker qiweh-na-t’a=ni, i čaj-di-q dad gala-č.
1sg:erg sugar.abs throw-aor-cond=add this tea-obl-post taste.abs be.behind-neg
‘Although I added sugar, this tea does not taste good.’ (Haspelmath, 1993, p. 396)
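The composition of the concessive from the conditional can be shown with a toy string-building sketch based on the Lezgian forms in (39) and (69). The function names are hypothetical, the default suffix and clitic shapes are simply read off the glossed examples, and the sketch performs plain concatenation with no morphophonology.

```python
def conditional(stem, cond_suffix="-t’a"):
    """Attach the conditional suffix (glossed cond in the Lezgian examples)."""
    return stem + cond_suffix


def concessive(stem, cond_suffix="-t’a", additive="=ni"):
    """Concessive = conditional form plus the additive clitic (glossed add)."""
    return conditional(stem, cond_suffix) + additive


# Aorist stem of 'throw' as segmented in example (69).
print(concessive("qiweh-na"))
```

Other languages would require different suffix and clitic shapes to be passed in; none are supplied here because the text does not give their concessive forms.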

Morphologically, specialized converbs often (at least historically) represent case forms of other non-finite forms: general converbs, nominalizations, participles, or infinitives (see section 3.5.5). Adverbial clauses with overt subordinators are also attested. Usually, subordinators are co-opted from the inventory of postpositions, such as ‘behind,’ ‘after,’ ‘in front of.’ Sometimes they are dedicated grammatical words, as is especially typical in purposive constructions, which often include a dedicated subordinator ‘in order to’, such as bahnadan in Chirag Dargwa or badala in Agul. Specialized converbal clauses usually demonstrate full syntactic independence from the matrix clause: they have no restrictions on the reference of the subject, independently license negation and NPI elements, and constitute a separate domain for case marking and gender agreement. However, temporal converbal clauses often do not receive their own temporal interpretation and express aspectual rather than temporal relations between the adverbial clause and the matrix clause; the temporal interpretation of such clauses is usually inherited from the matrix clause.


3.8 Lexicon

Studies of individual lexical groups in Nakh-Dagestanian are still scarce, even though the lexicon of many languages has been documented in dictionaries (see section 3.1.4). Terms used in crop farming, animal husbandry, and other traditional activities are particularly popular topics of study in Dagestanian lexicology; see language-particular overviews of the relevant lexical groups in Ganieva (2004) for Lezgian, Ganieva (2015) for Khinalug, Temirbulatova (2008) for Dargwa, and Abdulmedžidova (2014) for Bezhta, among others. Studies in other lexical-semantic fields include kinship (Kadyradžiev, 1985), body parts (Kadyradžiev, 1986), plant names (Ganieva, 1989), spatial terms (Ataev, 1990), expression of time (Ataev, 1991), and names of birds and animals (Magomedova, 1988); see also Mikailov (1984). General overviews of lexical systems include Khajdakov (1961) on Lak, Gajdarov (1966) on Lezgian, and Zagirov (1981) on Tabasaran. A recent series of papers by Solmaz Merdanova and co-authors provides an in-depth analysis of a few lexical groups in Agul in the vein of modern lexical typology (Merdanova, 2009, on pain predicates; Merdanova & Reznikova, 2015, on verbs denoting animal sounds; and Reznikova & Merdanova, 2018, on verbs of searching and finding).

Color terms have not been studied systematically in most languages of the family. The recent overview of the color term system in Avar (Ataev, 2018) and the experimental study of Tsakhur by Davies, Sosenskaja, and Corbett (1999) are rare exceptions. The Avar system seems to be a fairly typical example of Nakh-Dagestanian, featuring the following eight basic color terms: baʕarab ‘red’, qaħab ‘white’, č’eʕerab ‘black’, ʕurčːinab ‘green’, qaħilab ‘blue’, t’ohilab ‘yellow’, sːurmijab ‘brown’, c’ːaħilab ‘gray’. The Tsakhur system of basic color terms for adult speakers comprises 12 words, including the first-ever reported case of a basic color term for turquoise.
Kinship term systems in Nakh-Dagestanian usually include words for parents (‘father’ and ‘mother’), parents’ siblings (‘uncle,’ ‘aunt’), often having different terms for paternal and maternal uncles and aunts, such as Lezgian χala ‘maternal aunt’, eme ‘paternal aunt’, χalu ‘maternal uncle’, imi ‘paternal uncle’. Words for grandparents are often complex expressions based on ‘mother’ and ‘father’, as in Agul aħa baw ‘grandmother’ (lit. ‘big/old mother’), aħa dad ‘grandfather’ (lit. ‘big/old father’), though some languages have underived terms (e.g., Tabasaran aba ‘grandfather’, bab ‘grandmother’) or express these relationships syncretically with ‘mother’ and ‘father’ (Chirag Dargwa babaj ‘mother’, ‘grandmother’). Siblings are expressed using different words for male and female siblings (‘brother’ and ‘sister’), sometimes also distinguishing special terms for older brothers and sisters. Terms for cousins are either complex descriptions (e.g., Agul adda gada ‘uncle’s son’) or special words unproductively derived from ‘brother’ and ‘sister’ (Avar wacːʕal ‘male cousin’, cf. wacː ‘brother’). Some languages also have more elaborate systems for cousins; consider Aqusha Dargwa uziq’ar ‘male first cousin’, ruziq’ar ‘female first cousin’, q’arigan ‘second cousin’, garigan ‘third cousin’. Terms for descendants are ‘son’ and ‘daughter,’ which are often syncretic with ‘boy’ and ‘girl,’ respectively. Many languages lack any other terms for descendants, using complex expressions for grandchildren (‘son’s son,’ etc.), although Lezgic languages usually have a word for ‘grandchild’ (Rutul χɨdɨl). Some Lezgic languages have more complex systems of terms for descendants. For example, Tabasaran features χudul ‘grandchild’, gudul ‘great-grandchild’, c’udul ‘great-great-grandchild’, and budul ‘great-great-great-grandchild’. Children of parent’s siblings are denoted by the word ‘nephew/niece’ or a complex description. Terms for in-laws usually include only ‘mother-in-law’, ‘father-in-law’, ‘daughter-in-law’, and ‘son-in-law’. Nakh-Dagestanian languages often have terms for husband of wife’s sister and wife of husband’s brother, as in Lezgian q’elit’ ‘wife of husband’s brother’, bažanaχ ‘husband of wife’s sister’.

Systems of basic motion verbs are not particularly rich. As a rule, ‘go’ and ‘come’ are lexically distinguished and describe unspecified manner of motion. Their reference point can be the speaker or a different landmark. Alongside generic verbs of going away and coming, there may be a third basic verb which is deictically unspecified and rather describes the starting point of motion. For example, in Archi, such a three-term system is represented by akːis ‘go (away)’, aʟis ‘come’ and qˤes ‘leave, go toward’. Verbs that denote the reaching of a destination are very common. Direction of motion can also be specified by locative prefixes (especially in Dargwa and Lezgic), as in (70) from Tabasaran, where the movement from the inside of the landmark is expressed on both the noun denoting the landmark and the verb.

(70) Standard Tabasaran
     čuvl-i-an         duk’        a-d-a‹b›q-nu.
     sack-obl-in.elat  millet.abs  in-elat-‹n.sg›pour-aor
     ‘Millet poured out of the sack.’ (Khanmagomedov & Shalbuzov, 2001, p. 49)

Various manners of motion like ‘run’, ‘jump’, or ‘fly’ also tend to be expressed with special lexical roots, although jumping and flying are sometimes lexicalized syncretically (e.g., in Tsezic languages Bezhta and Hunzib, cf. Plungian & Rakhilina, 2007, p. 744). Predicates describing swimming are often periphrastic expressions (like ‘water do’) rather than dedicated verb roots; see also an overview of “aquamotion” verbs in a number of Caucasian languages by Maisak, Rostovtsev-Popiel, & Xurshudjan (2007). There are no in-depth studies of the semantics of motion verbs, although a general account of the Avar system can be found in Magomedova (2006), who also treats other verb classes.

Another popular topic in the local philological tradition is the analysis of loanwords in Nakh-Dagestanian. A number of studies are available on Arabic, Persian, and Turkic loanwords (see also the Introduction to this volume). Khalilov (2004) is a study of Georgian borrowings into Nakh-Dagestanian. While Nakh-Dagestanian has had a centuries-long history of contacts between its individual branches and languages, which must have left multiple traces in the lexicon, family-internal loanwords are poorly understood, except for the Avar influence on Andic and Tsezic, which is the subject of Magomaeva and Khalilov (2005), Khalidova (2006) (Avar to Andic), and Karimova and Khalilov (2013) (Avar to Khwarshi).

144 Dmitry Ganenkov and Timur Maisak

A number of comparative wordlists have been published that include lexical data from Nakh-Dagestanian: Comrie & Khalilov (2010); Khajdakov (1973); Kibrik and Kodzasov (1988, 1990); Klimov and Khalilov (2003); and Murkelinskij (1971b). A historical reconstruction of the Nakh-Dagestanian lexicon is available in Nikolaev and Starostin (1994). Mudrak (2016a, 2016b) is a recent attempt to reconstruct the lexicon of Proto-Dargwa. Sources on historical phonology also include valuable lexical data and cognate sets: Gudava (1964) for Andic, Gudava (1979) for Tsezic, Imnaishvili (1977) for Nakh, Musaev (1978) for Dargwa, and Talibov (1980) for Lezgic.

3.9 Future Directions of Research Despite over a century of research on Nakh-Dagestanian languages, the family remains poorly studied. Descriptive work is still lacking. Many languages or dialects of the family are endangered and almost undocumented. Although (shorter) grammatical sketches or older traditional grammars are available for most languages of the family, detailed, contemporary grammatical descriptions are lacking for even some major Nakh-Dagestanian languages such as Lak, Chechen, and Avar, let alone smaller languages. Wordlists have been collected and published for most of the languages of the family as part of comparative work, but for many languages there are no published dictionaries. Fewer than a dozen unwritten languages of the family can boast a transcribed corpus of natural speech of any considerable size, or high-quality recordings of a wordlist (e.g., Dargwa: Sanzhi, Shiri, Chirag, Mehweb; Lezgic: Archi, Agul, Udi; Tsezic: Tsez, Hinuq; Nakh: Chechen, Ingush; Khinalug). Published or unpublished collections of written texts (not including appendices to published grammars) do exist for some languages (see section 3.1.3): those are, first of all, the written languages of the family, but some unwritten languages are included (e.g., Tsezic: Tsez, Khwarshi, Bezhta; Dargwa: Kaytag, Kubachi, and Kharbuk). Most other languages and divergent dialects require immediate documentary and descriptive efforts. Nakh-Dagestanian languages are known to make extensive use of pharyngeal and epiglottal articulations in both consonant and vowel inventories, having some typologically unique properties (see chapter 15 for more discussion). Yet, they remain severely understudied with respect to segmental phonetics and phonology, as well as suprasegmental features. Nakh-Dagestanian languages have had some impact on theory construction, and their potential has by no means been fully exhausted. 
For example, Tsez has been documented to show theoretically challenging patterns of obligatory control and the phenomenon of true long-distance agreement (Polinsky & Potsdam, 2001, 2002). Udi endoclitics offer another example of a typologically and theoretically relevant phenomenon. Harris (2000, 2002) carefully documents the existence of clitics intruding into underived verbal roots and proposes a system of rules regulating their placement in the clause. Despite continuing work which has attempted to reanalyze the data in a more conventional way, no theoretical account of Udi endoclitics has been proposed.

Although they have been a focus of research for at least 40 years, patterns of case marking and agreement in Nakh-Dagestanian are still not well understood, remaining a locus of major questions in both descriptive and theoretical analysis in many of the individual languages and in a comparative perspective. In particular, agreement with the absolutive is attested even in nominalizations (masdars), which is challenging for theories that connect agreement with finiteness (see Polinsky, 2016a). Likewise, there are no accounts of gender agreement on non-verbal targets (but see Polinsky, Radkevich, & Chumakina, 2017, for a subset of data). The structure of the biabsolutive construction is still poorly understood (see Gagliardi, Goncalves, Polinsky, & Radkevich, 2014 for an initial analysis).

The semantics of grammatical categories is another domain deserving intensive study. While nominal categories, especially spatial cases, have been a topic of research in a number of dedicated works, the semantics of verbal categories like tense, aspect, evidentiality, and mood/modality has received much less attention. So far, there have been only a handful of in-depth studies on the semantics of verb forms in individual languages and comparative studies of a verbal category. Almost no work has been done on aspectual verb classes (“actionality”) and their interaction with viewpoint aspect (Comrie, 1976; Smith, 1997, among many others).

Lexical semantics is also awaiting investigation, as it has been largely neglected in Nakh-Dagestanian linguistics so far. Even lexical fields that usually attract a great deal of attention in modern lexical typology (such as color terms, kinship terms, or motion verbs) lack any thorough language-specific or comparative descriptions.
Another potential direction of research, which has gone almost unaddressed in the current landscape of Nakh-Dagestanian studies, concerns the historical linguistics of the family and its branches, from phonological reconstruction and the construction of an etymological dictionary to reconstruction of morphology and syntax.

Acknowledgments We are grateful to Maria Polinsky and Steven Kaye for their comments on an earlier version of this chapter and to Violetta Ivanova, Garik Moroz, and Zachary Wellstood for helping us prepare the manuscript for publication. Financial support from the Basic Research Program of the National Research University Higher School of Economics (Moscow) in 2018, to Dmitry Ganenkov, is gratefully acknowledged.

chapter 4

Dargwa

Nina Sumbatova

4.1 Language and Demographics Dargwa constitutes a separate branch of the Nakh-Dagestanian family (Alekseev, 1998c; Friedman, 2010). It is open to debate whether different varieties of Dargwa constitute dialects of a single language or separate languages. Officially and in most linguistic works, Dargwa is typically treated as a single language.

4.1.1 Location and Dialects Dargwa is known for its dialectal variation, with the number of varieties ranging from 12 (Khajdakov, 1985) to 38 (Abdullaev, 1954). Different varieties of Dargwa probably diverged more than 2,000 years ago (ca. 300–400 BCE) (Koryakov & Sumbatova, 2007). Some authors prefer to treat Dargwa as a group of closely related languages (see chapter 3 for discussion). This chapter describes the Dargwa variety spoken in the village of Tanti. This village is situated in the Aqusha district, 12 km to the south of Aqusha. The Tanti dialect is related to Tsudaqar Dargwa (see Abdullaev, 1954, p. 8; Comrie & Khalilov, 2010; Sumbatova & Lander, 2014).

4.1.2 Current Sociolinguistic Situation See chapters 1, 2, and 3 for sociolinguistic and demographic information.


4.1.3 Brief History of Research The first notable work on Dargwa was the grammar of Urakhi by Uslar (1892). The 1926 grammar by L. Zhirkov was mainly based on Uslar’s data. Currently, the most influential grammar is by Abdullaev (1954); this is a grammar of Standard Dargwa, but it also contains data on different dialects. Other important works on Standard Dargwa are those by Musaev (2002) and Mutalov (2002). Van den Berg (2001) contains a short grammar sketch of Standard Dargwa and a collection of glossed texts. There are a number of papers on Dargwa (Dargic) dialectology (Gasanova, 1961, 1971; Kibrik, 1979, 2003; Magometov, 1962, 1976, 1978a, 1978b, 1981, 1984, 1986) and grammatical descriptions of several dialects: Uslar (1892) on Urakhi; Magometov (1963) on Kubachi; Magometov (1982) on Mehweb; Sumbatova and Mutalov (2003) on Itsari; Temirbulatova (2004) on Kaytag; Sumbatova and Lander (2014) on Tanti; Daniel, Dobrushina, and Ganenkov (2019) on Mehweb; Forker (2020a) on Sanzhi. Recently, a comparative etymological dictionary (Mudrak, 2016a, 2016b) has been published. In 2017, two big Dargwa-Russian dictionaries were published (Yusupov, 2017; Abdullaev, 2017); the most widely used Russian-Dargwa dictionaries are Abdullaev (1948), Isaev (1988), and Yusupov (2005). In the last 10 years, studies of Dargwa have been very active, with several projects to describe and document some Dargic dialects: Chirag, Shiri, and Sanzhi (Forker, 2019a, 2019b, 2020b) and Mehweb (Daniel, Dobrushina, & Ganenkov, 2019).

4.2 Phonetics and Phonology In terms of phonetics and phonology, Tanti is a typical representative of the Nakh-Dagestanian family; see chapters 3 and 15 for comparison.

4.2.1 Consonant Inventory In most Dargwa varieties, Tanti included, the consonant system includes four-way contrasts in occlusive stops (voiced—voiceless aspirated—geminated unaspirated— glottalized: /b, p, pː, p’/) and three-way contrasts of fricatives (/z, s, sː/). Tanti has five pharyngeal and glottal consonants: two stops /ʔ, ʡ/ and three fricatives /h, ʜ, ʕ/. Occlusive consonants can be also labialized; the status of labialized consonants needs further research (k’went’ ‘lip’, qːwinaˁ ‘crow’, χwala ‘big’, ixwa ‘throw’ [imperative]). Most Dargic dialects have the same four-way contrasts, but some Northern dialects have no geminated consonants. Besides, the system of pharyngeal and glottal consonants in most dialects lacks the voiceless fricative /ʕ/.

Table 4.1 Consonants

             Occlusive     Affricate    Fricative    Sonorant
Labial       b p pː p’                               m w
Dental       d t tː t’     c cː cʼ      z s sː       n l r
Alveolar                   č čː čʼ      ž š šː
Palatal                                              j
Velar        k kː k’                    ǧ x xː
Uvular       q qː qʼ                    ʁ χ χː
Pharyngeal                              ʕ
Epiglottal   ʡ                          ʜ
Glottal      ʔ                          h

4.2.2 Vowel Inventory There are four basic vowels: /i, u, e, a/; three of them can be pharyngealized /aˁ, uˁ, eˁ/, as in q’waˁl ‘cow’, uˁli ‘eye’, weˁral ‘seven’. Other dialects/languages have the same system of basic vowels, but the number of pharyngealized vowels varies across dialects; for example, in Standard Dargwa and in Muira, there is only /aˁ/.

4.2.3 Stress Stress has not been studied systematically, but see chapter 16 for some discussion. At least some varieties (including Tanti) have dynamic stress that can occur on different morphological units. Stress shift is widely used to express grammatical distinctions (see 4.5.2.4). Some morphemes always bear stress, e.g., nominal plurals and the prohibitive prefix ma-.

4.2.4 Important Phonological Processes Phonetic processes attested in Tanti are degemination, devoicing, and pharyngealization. Geminate consonants are degeminated before other consonants and at the ends of words, e.g., likːa ‘bone’, lik-ne ‘bone-pl’; kulpatːe ‘family-pl’, kulpat ‘family’. Two consecutive voiced occlusives are realized as a geminate voiceless consonant between two vowels (devoicing), e.g., letːe ‘(they) existed’ ( ø), - j3

feminine ‹r›

non-human (neutral) ‹b›

1st/2nd person NPs ‹d›

human ‹b›

non-human ‹d›

2 See Abdullaev (1954, p. 74), Alekseev (2003, p. 210), and Sumbatova and Mutalov (2003, pp. 18–19) for the same phenomenon in Standard Dargwa and some other varieties.

3 Gender markers are cited in the small angle brackets (‹w›, ‹r›, ‹b›, ‹d›) when no reference is made to their position in the word and when they are infixed. Suffixal gender markers are shown as -w/-j, -r, -b, -d, prefixal as w-, r-, b-, d-.


4.4.2 Nominal Inflection Nouns inflect for number and case. In addition, nouns have a large number of locative (spatial) forms (see 4.4.3.3). Nominal forms are derived from two stems, direct and oblique, in both singular and plural (Kibrik, 2002; Kibrik & Kodzasov, 1990, pp. 251–258, 281–283; and chapter 3 of this volume). In Tanti, the stems are as follows:

(1) a. direct singular stem (= absolutive singular): the simplest morphological form; used as the base for the absolutive and adverbial case in the singular, and for some nouns, for genitive and other case forms, e.g., rursːi ‘girl’;
    b. oblique singular stem (= ergative singular): consists of the direct singular stem + -li; used as the base for most other singular forms, e.g., rursːi-li;
    c. direct plural stem (direct singular stem + plural): this stem coincides with the absolutive plural form, e.g., rurs-be;
    d. oblique plural stem (absolutive plural + -a, last vowel of the plural suffix is dropped): this stem forms all oblique case forms in the plural except the absolutive and adverbial, e.g., rurs-b-a-.
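The four-stem pattern in (1) can be illustrated with a minimal sketch. It assumes the regular case only (an oblique singular in -li) and takes the direct plural as given with its suffix hyphenated, since the choice of plural suffix is lexical; the function name `tanti_stems` and the vowel list are illustrative assumptions, not part of the description above.

```python
def tanti_stems(direct_sg: str, direct_pl: str) -> dict:
    """Derive the two oblique stems from the two direct stems (regular pattern only).

    direct_pl must be hyphenated before its plural suffix (e.g. 'rurs-be'),
    because the plural suffix is chosen lexically and cannot be computed here.
    """
    if "-" not in direct_pl:
        raise ValueError("direct_pl must mark the plural suffix with a hyphen")
    base, suffix = direct_pl.rsplit("-", 1)
    vowels = "aeiu"  # assumption: plain vowels only
    # (1d): the last vowel of the plural suffix is dropped before -a
    trimmed = suffix[:-1] if suffix[-1] in vowels else suffix
    return {
        "direct_sg": direct_sg,              # (1a) absolutive singular
        "oblique_sg": f"{direct_sg}-li",     # (1b) direct singular stem + -li
        "direct_pl": direct_pl,              # (1c) absolutive plural
        "oblique_pl": f"{base}-{trimmed}-a-" # (1d) plural stem + -a
    }


stems = tanti_stems("rursːi", "rurs-be")
print(stems["oblique_sg"])  # rursːi-li
print(stems["oblique_pl"])  # rurs-b-a-
```

Passing the plural form in explicitly keeps the sketch honest about what is predictable (the oblique stems) and what is lexical (the plural suffix and any stem change that accompanies it).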

4.4.3 Inflectional Features The nouns inflect for number and grammatical case; furthermore, they have a number of locative (spatial) forms.

4.4.3.1 Number The singular is unmarked and the plural is marked by one of many possible suffixes. Tanti uses the following plural markers: ‑ane, ‑be, ‑e, ‑le, ‑lume, ‑me, ‑ne, ‑re, ‑te, ‑ulme, ‑upːe, ‑urbe, ‑uˁpːe, ‑qali. The choice of the plural suffix is lexically determined (except ‑qali). The basic function of ‑qali is to express the associative plural, like rasul-qali ‘Rasul and those around him’. In Tanti, this suffix can only be combined with proper nouns and kinship terms whose referents are unique (like ‘mother’ or ‘husband’), e.g., neš-qali ‘mother and those around her’, but *rucːi-qali (ll

‘say’      se chechen    aal-a         ool-a        eel-ira          aalla
           ingush        aal-a         oal          eal-ar           eanna           *ln>nn

‘take’     proto-nakh    *d.aaqq-an    *d.aaqq-o    *d.aaqq-ira      *d.aaqq-ina
           ne chechen    d.aaqq-an     d.ɔqq-u      d.æqq-ira        d.æqq-ina
           se chechen    d.aaqqa       d.oaqqa      d.eqq-ira        d.eqq-ina
           ingush        d.aaqq-a      d.oaqqa      d.eaqq-ar        d.eaqq-aa       *Vna>aa

‘strike’   proto-nakh    *toox-an      *toox-u      *toox-ira        *toox-ina
           ne chechen    tuox-an       tuux-u       tüːx-ira         tüːx-ina
           se chechen    tuox-a        toox-a       tuux-ira         tuux-ina
           ingush        tuox-a        tox          tiex-ar          tiex-aa

‘drink’    proto-nakh    *mal-an       *mal-u       *mal-ira         *mal-ina
           ne chechen    mal-an        mol-u        mel-ira          mel-la
           se chechen    mal-a         mol-a        mel-ira          mal-la
           ingush        mal-a         mol          mel-ar~mal-ar    menna~manna

* Here and below, a dot (.) marks the morpheme boundary for the gender-marking initial consonant. The conventional citation form is D gender.

marker (e.g., butt ‘moon’, B gender), but this is not regular gender assignment and is certainly not gender agreement (see Nichols, 2011, pp. 147–150, 2007, pp. 1182–1184). Nouns can head NPs without special morphology; used as noun modifiers, they take the genitive case; used as predicate nominals, they require a copula.

Verbs inflect for TAM (finite) or form converbs, participles, and verbal nouns (nonfinite). There is no person agreement. Verbs can agree in gender; about 30% of the basic roots have a gender agreement slot, always initial. These include most of the TAM auxiliaries and light verbs, high-frequency items, so the frequency of actual gender agreement in running text is closer to 50%. Derived verbs inherit the gender agreement slot from the base verb if it has a gender slot. Verbs can be clausal predicates without additional morphological marking; to function as attributives, they must take participial morphology.

Adjectives inflect for case, making a nominative/oblique distinction; a minority of them have an initial gender agreement slot. Attributive demonstratives and cardinal numerals make the same case distinction. Adjectives can form comparatives (with a suffix only in Ingush) and superlatives (with a preposed superlative word). Basic adjectives function as attributive modifiers and predicate nominals without modification; they can also function adverbially, i.e., basic adjectives are also basic adverbs; to function as NP heads they must be nominalized. Derivational morphology generally produces dedicated adjectives or dedicated adverbs.

Pronouns and independent demonstratives are noun-like in their inflection and behavior. Postpositions take no inflection, with the exception of laecna ‘concerning’ which becomes laacii in imperatives. Many of the simple postpositions can function as directional/local prefixes to verbs of motion and some others (derivationally, not inflectionally), and most can also be used as adverbs, with or without the locative or ablative suffixes (-ħ, -ra).

324 Erwin R. Komen, Zarina Molochieva, and Johanna Nichols

8.4 Nominal Morphology Noun and pronoun morphology is dominated by the covert category of gender and the overt category of case.

Chechen and Ingush 325

8.4.1 Noun Classification There are four gender markers which define three or four non-human genders (Corbett, 1991, pp. 150–154, 2006, p. 274) and one, two, or three human genders, depending on how they are counted. We label the genders by the form of their morphemes (which have no allomorphy). A gender is defined by the pair of singular and plural markers, shown in Table 8.6. Gender is predictable for words referring to humans and arbitrary for all others. The gender marker pairs define three larger animacy-based classes: nominals with human referents, for which gender is predictable based on sex (V-gender for masculine, J for feminine); within human nominals, those with third versus first/second person reference, for which the plural gender markers differ; and non-human nouns, with arbitrary gender. It could also be said that non-human nouns have lexical gender and human nouns and pronouns do not but take their agreement from the person and sex of the referent.

Table 8.6 Gender Markers for Chechen and Ingush

Gender            Singular   Plural   Notes
1st/2nd person    V/J        D        Pronouns only
Human             V/J        B        V/J for M/F sex
Various           B          B        Minority of B gender
Various           D          D        Minority of D gender
Various           J          J
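The logic behind Table 8.6 can be read as a small lookup: for humans and speech-act participants the marker follows from sex and person, while for non-human nouns it has to be listed in the lexicon. The sketch below is only an illustration under that reading; the function name, the argument labels, and the representation of lexical gender as a (singular, plural) pair are assumptions made for the example.

```python
from typing import Optional, Tuple


def agreement_marker(referent: str, number: str,
                     sex: Optional[str] = None,
                     lexical_pair: Optional[Tuple[str, str]] = None) -> str:
    """Return the gender marker (V, J, B, or D) for an agreement controller.

    referent: 'speech_participant' (1st/2nd person), 'human', or 'nonhuman'
    number:   'sg' or 'pl'
    sex:      'masculine' or 'feminine' (needed for singular humans/pronouns)
    lexical_pair: (singular marker, plural marker) stored with a non-human noun
    """
    if number not in ("sg", "pl"):
        raise ValueError("number must be 'sg' or 'pl'")
    if referent == "nonhuman":
        # Non-human gender is arbitrary, so it must come from the lexicon.
        if lexical_pair is None:
            raise ValueError("non-human nouns need a lexical (sg, pl) pair")
        return lexical_pair[0] if number == "sg" else lexical_pair[1]
    if referent not in ("human", "speech_participant"):
        raise ValueError("unknown referent type")
    if number == "sg":
        # Singular: V for masculine, J for feminine.
        if sex not in ("masculine", "feminine"):
            raise ValueError("singular human reference needs sex")
        return "V" if sex == "masculine" else "J"
    # Plural: D for 1st/2nd person pronouns, B for other humans.
    return "D" if referent == "speech_participant" else "B"


print(agreement_marker("human", "sg", sex="feminine"))     # J
print(agreement_marker("human", "pl"))                     # B
print(agreement_marker("speech_participant", "pl"))        # D
print(agreement_marker("nonhuman", "pl", lexical_pair=("J", "J")))  # J
```

Keeping the non-human pair as input rather than hard-coding it mirrors the statement that non-human nouns have lexical gender, whereas human nouns and pronouns take their agreement from the referent.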

8.4.2 Nominal Inflection The two languages have near-identical case paradigms, and typologically similar inflectional classes of stem changes and extensions, though not all cognate nouns fall in the same paradigms. Case endings are mostly monoexponential and there is a separate plural suffix in most of the plural case forms. (2) shows the template for case paradigms:

(2) Root (Extension) (Plural) Case

The nominative singular is zero in all paradigms. The nominative plural has a fusional case-number ending -ii used in some paradigms, while most nouns have the plural suffix -aš/-až and a zero ending. Table 8.7 shows the endings and Table 8.8 gives examples of nouns with an unchanging stem, stem ablaut, and extensions: meaningless suffixes that create an oblique or plural stem (see also chapter 3, for the more general use of extensions in Nakh-Dagestanian nominal paradigms).

In Ingush, the locative and ablative are not regular cases. Their formatives -(a)ħ and ‑(a)ra can be suffixed to various adverbs and other words. The ablative suffix consists of the allative suffix plus -ra. It is productive with pronouns and names but seems not to appear on all nouns. The local/directional locative -ħ exhibits similar behavior. Some nouns form an adverb with -a; the resultant adverb then takes the locative -ħ and ablative -ra as well.
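As a rough illustration of the template in (2), the following sketch concatenates the four slots, leaving out any slot that is empty. The function name is hypothetical, and the segmentation of the sample forms (in particular treating the initial vowel of the ending as part of the case suffix) is an assumption made for the example; the forms themselves are taken from this chapter.

```python
def build_noun_form(root: str, extension: str = "",
                    plural: str = "", case: str = "") -> str:
    """Assemble Root (Extension) (Plural) Case, hyphenating the filled slots.

    Empty strings stand for slots that are not filled; a zero ending
    (as in the nominative singular) is simply an empty case slot.
    """
    slots = [root, extension, plural, case]
    return "-".join(slot for slot in slots if slot)


# Ingush dative plural, segmented as in the gloss 'Ingush-pl-dat'
print(build_noun_form("ghalghaa", plural="ž", case="ta"))  # ghalghaa-ž-ta
# Chechen instrumental singular of 'girl' (no extension, no plural)
print(build_noun_form("joʕ", case="aca"))                  # joʕ-aca
# Nominative singular: zero ending
print(build_noun_form("joʕ"))                              # joʕ
```

The sketch deliberately ignores ablaut and the lexical choice of extension, both of which are described as lexically conditioned and so would have to be stored with each noun.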

Table 8.7 Case Endings in Chechen and Ingush

         Chechen                              Ingush
Case     singular     plural                  singular            plural
nom      -Ø           -aš-Ø, -ii, -i          -Ø                  -až-Ø, -ii, -j
gen      -(a)n        -in                     -a, -n              -ii, -i
dat      -na, -n      -aš-na                  -na, -aa            -až-ta
erg      -uo, -as     -š-a                    -uo, -z, -aa, -a    -aš-a, -až
ins      -(a)ca       -aš-ca                  -ca                 -až-ca
loc      -(a)ħ        -aš-aħ, -aš, kaħ        –                   –
lat      -(a)x        -iax, -ix               -x, -gh             -ex, -egh
abl      -gara        -aš-kara                –                   –
all      -(a)ga, -ie  -aš-ka                  -ga                 -až-ka
compr    -(a)l        -ial, -il               -l                  -el

Table 8.8 Example Singular Case Paradigms with and without Ablaut

         Chechen                              Ingush
Case     joʕ ‘girl’      c’aa ‘house’         niq’ ‘road’    dig ‘axe’
nom      joʕ             c’aa                 niq’           dig
gen      joʕan           c’iinan              neaq’a         dogara
dat      joʕana          c’ianna              neaq’aa        dogaraa
erg      joʕas           c’iinuo~c’eenuo      neaq’uo        dogaruo
ins      joʕaca          c’iinaca             neaq’aca       dogaraca
loc      joʕagaħ         c’aaħ                –              –
lat      joʕax           c’iinax              neaq’agh       dogaragh
abl      joʕagara        c’iiniera            –              –
all      joʕaga, joʕie   c’iinie              neaq’aga       dogaraga
compr    joʕal           c’iinal              neaq’al        dogaral

The allomorph -ie of the Chechen allative exists in Ingush as a derivational adverbalizer. The Ch. -ia- / Ing. -e- element in the lative and comparative plural suffixes is not synchronically identifiable as a plural marker outside these endings. The following examples show one Chechen noun without ablaut or extensions, one with ablaut and -n- extension, and one Ingush noun with ablaut but no extension, and an Ingush example with both ablaut and an extension. Ablaut is partly inherited from ancestral Nakh-Dagestanian and partly the result of umlaut-like assimilation of the root vowel to the next vowel (which was then reduced to /a/). Extensions are also ancient. Both ablaut and extensions are lexically conditioned and largely arbitrary in Chechen and Ingush. Both are mostly limited to simple words. They are occasionally found in loanwords, e.g., Ing. ghum, oblique stem ghamar- ‘sand’ from a Turkic language, probably Kumyk.

8.4.3 Inflectional Categories of Nominals Regular inflectional categories are gender, number, and case. Gender is lexical or referent-based in nouns, referent-based in pronouns, and a category of agreement in adjectives and verbs. Note that nouns referring to male and female animals, even important domestic animals that are given names, do not have gender assignment that reflects their sex; for several of them, there are distinct lexemes such as Ch. jiett, Ing. jett(B) ‘cow’ versus Ch. stu, Ing. ust (B) ‘bull, ox’. [±human] is a covert feature that is revealed in gender assignment (section 8.4.1). The singular-plural contrast is overtly marked by suffixes on nouns and independent pronouns and by singular versus plural genders. Some adjectives and a few attributive demonstratives agree with their head noun. Verbs agree in gender with a nominative argument, and this also entails number agreement. In addition, a number of simple verbs encode S/O number with stem changes (section 8.6.7).

Case is overtly marked by suffixes on nouns, pronouns, adjectives, and participles (section 8.4.2). The case paradigm of nouns and most independent pronouns is ergative; some pronouns have syncretic paradigms. Attributives (adjectives, attributive pronominals, attributive participles) make a one-dimensional nominative/oblique distinction, where the oblique agrees with any non-nominative case. Person is a lexical feature of pronouns but not an inflectional category. The different plural gender categories of first and second person pronouns versus third person and nouns (section 8.4.4) are probably best considered, typologically, a matter of different plural inflection at the highest levels of the animacy hierarchy (and not as person inflection per se).
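The nominative/oblique distinction on attributives amounts to a two-way choice keyed to the case of the head noun. A minimal sketch, using the Chechen neutral demonstrative forms i (nominative) and cu (oblique) given in Table 8.11; the function name and the case labels are assumptions made for the illustration.

```python
def attributive_form(nom_form: str, obl_form: str, head_case: str) -> str:
    """Pick the attributive form that agrees with the head noun's case.

    Attributives distinguish only nominative from oblique: the oblique
    form is used with a head noun in any non-nominative case.
    """
    return nom_form if head_case == "nom" else obl_form


# Chechen neutral demonstrative: i (nominative) vs. cu (oblique)
for case in ("nom", "erg", "dat", "ins"):
    print(case, attributive_form("i", "cu", case))
```

Because every non-nominative case maps to the same oblique form, the attributive itself never has to encode which oblique case the head noun bears.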

8.4.4 Personal Pronouns Personal pronouns take the same cases as nouns with much the same endings. Stem alternations are different: pronouns have initial CV metathesis in the ergative case (first and second persons), or nominative/oblique stem suppletion (third person). The inclusive has nominative-ergative-genitive syncretism. We include the third person forms as personal pronouns, though they are demonstratives in origin.

8.4.5 Reflexive and Reciprocal Pronouns Every personal pronoun paradigm has a reflexive counterpart with largely the same case endings but different stem vowels. Third persons use an entirely different root. Second and third plural are syncretic, and the Ingush endings are partly based on those of nouns. All reflexives exhibit nominative-ergative syncretism. Partial paradigms are given in Table 8.10 (for full sets see Molochieva & Komen, to appear; Nichols, 2011, p. 175).

8.4.6 Demonstrative Pronouns There are three demonstratives: proximal (‘this’, near speaker/hearer), neutral (‘aforementioned, this, that’), and distal (‘that, that over there’). All have attributive and independent (nominal) forms, which inflect like adjectives and nouns, respectively. The independent form of the neutral demonstrative is also the third person pronoun (see Table 8.9). Only attributives have oblique forms; only independent pronouns have genitive and other specific cases (see Table 8.11).

8.4.7 Demonstratives Chechen and Ingush do not have determiners. Definiteness can be conveyed with the neutral demonstrative and indefiniteness with cħa ‘one’ (obl. cħan-).

Table 8.9 Personal Pronouns

         first singular         second singular        third singular
         Ch.        Ing.        Ch.        Ing.        Ch.        Ing.
nom      so         so          ħo         ħo          i/iza      yz
gen      san        sy          ħan        ħa          cünan      cun
dat      suuna      suona       ħuuna      ħuna        cunna      cynna
erg      as         aaz         aħ/a       ʕa          cuo        cuo
all      süega      suoga       ħüega      ħuoga       cunga      cunga
abl      süegara    suogara     ħüegara    ħuogara     cüngara    cyngara
ins      süeca      suoca       ħueca      ħuoca       cünca      cynca
lat      sox        sogh        ħox        ħogh        cunax      cynagh
compr    sol        sol         ħol        ħol         cul        cul

         first plural (excl)    first plural (incl)    second plural          third plural
         Ch.        Ing.        Ch.        Ing.        Ch.        Ing.        Ch.        Ing.
nom      txo        txo         vai        vai         šu         šu          üš         yž
gen      txan       txy         vain       vai         šun        šyn         ceeran     caar(a)
dat      txuuna     txuona      vaina      vaina       šuna       šoana       caarna     caarna
erg      ooxa       oaxa        vai        vai         aš         oaša        caara      caar(a)
all      txüega     txuoga      vaiga      vaiga       šüga       šuoga       caerga     caarga
abl      txüegara   txuoga      vaigara    vaigara     šügara     šuogara     caergara   caargara
ins      txüeca     txuoca      vaica      vaica(a)    šüca       šuoca(a)    caarca     caarca
lat      txox       txogh       vaix       vaigh       txox       txogh       caarax     caaregh
compr    txol       txol        vail       vail        txol       txol        caaral     caarel

Table 8.10 Partial Reflexive Paradigm

         first singular       second singular      third singular       second/third plural
         Ch.       Ing.       Ch.       Ing.       Ch.       Ing.       Ch.       Ing.
nom      suo       sie        ħuo       ħie        šaa       šie        šaeš      šoaž
gen      sai(n)    sei        ħai(n)    ħei        šie(n)    šie        šai(n)    šei/šoi
dat      saina     seina      ħaina     ħaaina     šiena     šiena      šaina     šoažta
erg      ais       eisa       aiħ       ʕaaixa     šaa       šie        šaeš      šoaž

Table 8.11 Partial Demonstrative Paradigm

Proximal
         Chechen                                     Ingush
         attributive   independent                   attributive   independent
                       sg          pl                              sg        pl
nom      hara          hara        horš              je            jer       jeraž
obl      (ho)qu                                      uq
gen                    (ho)qun     hoqar                           uqan      aqaar
erg                    (ho)quo     (ho)qaara                       uquo      aqaar

Neutral
         Chechen                                     Ingush
         attributive   independent                   attributive         independent
                       sg            pl              sg        pl        sg        pl
nom      i             i/iza         üš              yz        yž        yz        yž
obl      cu                                          cy        cy
gen                    cun/cünan     ceeran                              cun       caarna
erg                    cuo           caara                               cuo       caar(a)

Distal
         Chechen                                                          Ingush
         attributive      independent                                    attributive   independent
                          sg                          pl                               sg             pl
nom      dʕoora/dʕaara    dʕoora(‑nig)/dʕaara(‑nig)   dʕooranaš/dʕaaranaš   dʕaara        dʕaara+d.ar    dʕaaran+d.araž
obl      dʕaara-ču                                                       dʕaara-ča
gen                       dʕaara-čun                  dʕaara-čeran                     dʕaara-čyn     dʕaara-čaara
erg                       dʕaara-čuo                  dʕaara-čaara                     dʕaara-čuo     dʕaara-čaar

Table 8.12 Interrogative Pronouns*

         ‘who’                        ‘what’
         Ch.        Ing.              Ch.        Ing.
nom      mila       mala              hun        fy
gen      ħeenan     ħan               stien      sen
dat      ħaanna     ħanna/ħanaa       stianna    senna/sienaa
erg      ħa/ħaa     ħan(uo)           stie       sievuo

* See Maciev (1961, p. 598) and Nichols (2011, p. 180).

(3) Chechen
    i    zulam  cuo      dina      xilaran     šiekuo     ju.
    dem  crime  3sg.erg  d.do.prf  be.inf.gen  suspicion  j.prs
    ‘There is a suspicion that he committed that crime.’ (p86-00012: 7)³

(4) Chechen
    cħa  huma   dan       dieza.
    one  thing  d.do.inf  d.need.prs
    ‘Something needs to be done.’ (based on p34-00002: 14)

The neutral demonstrative i ‘that’ modifying zulam ‘crime’ in (3) indicates to the reader that the crime has been mentioned beforehand. The number cħa ‘one’ in (4) is not to be taken literally but points to an indefinite amount.

8.4.8 Adjectives Adjectives, like demonstrative pronouns, make only a nominative/oblique case distinction, where the oblique agrees with a head noun in any non-nominative case.

(5)    Chechen                  Ingush
    a. d.ouxa    xi             d.ʕaaixa     xii
       hot       water          hot          water
    b. d.ouxača  xinuo          d.ʕaaixa-ča  xiv
       hot.obl   water.erg      hot.obl      water.erg
       ‘hot water’              ‘hot water’

A minority of the adjectives, like ‘hot’ (5), agree in gender with the head noun. Comparatives are formed suffixally, and superlatives, analytically, by adding a dedicated superlative to the positive degree (Chechen) or the comparative degree (Ingush). Comparatives and superlatives can be either predicative or attributive in Chechen, but in Ingush they can only be predicative; for an attributive function, they are relativized with the participle of ‘be’, yielding a periphrastic attributive form.

(6) Chechen             Ingush
    d.oqqa              d.oaqqa             ‘big’
    d.oqqax/d.oqqox     d.oaqqagh           ‘bigger’
    uggar d.oaqqa       eggara d.oaqqagh    ‘biggest’

3 Example references starting with “p86” and “p34” refer to texts taken from the Nijmegen Parsed Corpus of Modern Chechen (NPCMC; see http://erwinkomen.ruhosting.nl/che/crp). They can be consulted online (https://cesar.science.ru.nl).

(7) Chechen                 Ingush
    uggar xaza joʕ          eggara xozagh jola joʕ
    superl pretty girl      superl pretty-compr J.be.ptcp girl
    ‘the prettiest girl’    ‘the prettiest girl’

8.4.9 Numerals

Numerals are attributive in their simplest form. A few of them make a nominative/oblique distinction and those also have separate independent forms. The numbers 1–10 are simplex forms; the teens are composed of unit + 10; the decades are base-20. ‘Thousand’ is a Persian loan; ‘million’ and above are Russian loans (see Table 8.13).

Table 8.13 Numerals

        Chechen                                   Ingush
        nom                 obl       independent nom         obl     independent
1       cħa                 cħan      cħaʔ        cħa         cħan    caʕ (< *cħa-’)
2       ši                  šin/šim   šiʔ         ši          šin     šiʔ
3       qo                  qaʔan     qoʔ         qo          qea     qoʔ
4       d.i                 deʔan     d.iʔ        d.i                 d.iʔ
5       pxi                 pxeʔan    pxiʔ        pxi         pxie    pxiʔ
6       jaalx                                     jaalx
7       vorh                                      vorh
8       baarh                                     baarh
9       iss                                       iis
10      itt                                       itt
11      cħaitt                                    cħaitt
12      šiitt                                     šiitt
20      tq’a                tq’o                  tq’o        tq’ea
21      tq’ecħa                                   tq’eacħa
30      tq’eitt                                   tq’ea itt
40      šouztq’a                                  šouztq’a
60      quuztq’a/qouztq’a                         qouztq’a
80      d.eztq’a/d.oeztq’a                        d.ieztq’a
100     bʕee                                      bʕea
200     ši bʕee                                   ši bʕea
1000    ezar                                      ezar
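The arithmetic behind the numeral system (teens as unit + 10, decades counted in twenties) can be sketched in a few lines of Python. This is only an illustration of the decomposition visible in Table 8.13 (e.g., 30 = 20 + 10, 40 = 2 × 20); the function names are hypothetical, the sketch covers 1–99 only, and it deliberately does not generate Chechen or Ingush word forms.

```python
def vigesimal_parts(n: int) -> tuple[int, int]:
    """Split n (1..99) into (number of twenties, remainder 0..19)."""
    if not 1 <= n <= 99:
        raise ValueError("this sketch covers 1..99 only")
    return divmod(n, 20)


def describe(n: int) -> str:
    """Return an arithmetic description of how n is composed.

    Remainders 11..19 are shown as 'unit + 10', mirroring the teens.
    """
    scores, rest = vigesimal_parts(n)
    parts = []
    if scores:
        parts.append("20" if scores == 1 else f"{scores} x 20")
    if 11 <= rest <= 19:
        parts.append(f"({rest - 10} + 10)")
    elif rest:
        parts.append(str(rest))
    return " + ".join(parts)


if __name__ == "__main__":
    for number in (12, 21, 30, 40, 60, 80):
        print(number, "=", describe(number))
```

Under these assumptions, 30 comes out as 20 + 10 and 80 as 4 × 20, matching the composition of the forms for 30 and 80 in the table.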


8.4.10 Postpositions

Dative is the default case assigned by postpositions to their complement. Simple postpositions and other short ones can also function as verb prefixes and adverbs:

(8) Ingush
    a. ghalghaa-ž-ta jiq’ie
       Ingush-pl-dat among
       ‘among the Ingush’
    b. jiq’ie-d.oal
       between-d.go
       ‘come/go in between; come up (in conversation), come into vogue’
    c. jiq’ie ʕa=čy-b.ullaž č’ii
       between deic=in-b.go.sim.cvb bobbin
       ‘a bobbin inside (of it)/in between (the parts of a machine)’

In (8c) the prefix sequence means ‘down, down in’. It fills the prefix slot, showing that jiq’ie here is not a prefix but an adverb. Derived postpositions include converbs lexicalized as postpositions, often with a more specific sense than the source verb has, e.g., Ch. d.üezna (D.connect.ant.cvb) ‘in connection with; about, concerning’, Ing. Leacaa (catch.ant.cvb) ‘id’.

8.5 Verb Morphology

This section begins with verbal inflection and then turns to related phenomena: negation, valence-changing operations, and verb-noun agreement patterns.

8.5.1 Morphological Classifications

Simple verbs are a closed class of about 200 members (300 including plural and pluractional stems, most of which are formed by ablaut, consonant alternation, or suppletion). Compound verbs are light verb constructions, most of which use one of a handful of high-frequency light verbs, e.g., Ch. tuox- ‘strike’, d.- ‘make/do’, ħaaq- ‘rub, wipe, apply laterally’, d.aaqq- (pl. d.aax-) ‘take’, d.ill- ‘put’ (pl. d.axk-) (Molochieva & Witzlack-Makarevich, 2010). The first element is most often a noun in the nominative case, and the light verb agrees with it in gender; the construction is lexically a verb but syntactically and prosodically indistinguishable from a verb phrase with a nominative object and transitive verb:

(9) Chechen     Ingush
    naab j.-    nab j.-    ‘sleep’

(10) Ingush
     cuo nab jyr.
     3sg.erg sleep(j) j.do-pst.wit
     ‘S/he slept.’

Where the first element is an adjective, the compound is usually written as one word. The verb agrees with an independent clause argument, not with the first element:

(11) Ingush
     aaz ʕaž č’orma+b.eaqqar.
     1sg.erg apple(b) peeled+b.take-pst.nwit
     ‘I peeled the apple.’
     (NB: ch’orma ‘skinned, abraded, peeled’ is not a fully independent adjective.)

If the first element is a verb, it is in the anterior converb form and the compound is written as two words. The second verb is usually a verb of position or motion. It agrees with the external nominative argument.

(12) Ingush
     a. ʕa-xeina d.aagha
        down-sit.ant.cvb d.sit
        ‘sit, be sitting’
     b. ħal-’ellaa ull
        up-hang.ant.cvb lie
        ‘hang, be suspended’
     c. vad-da v.ax-aa-v
        v.run-ant.cvb v.go-prf-v
        ‘ran away’ (lit. ‘went running’)

8.5.2 Verbal Inflection

Table 8.14 shows the full structure of a simple verb. Probably no naturally occurring form contains all possible affixes, but acceptable examples can be constructed.

Table 8.14 Nakh Simple Verb Template

Slot   Content
1      Deictic prefix
2      Local prefix
3      Clitic =‘a ‘and, even’
4      Gender prefix
5      Root (ablaut grade)
6      Number (nom argument) and/or pluractional (root-internal)
7      Derivational suffixes: causative, inceptive
8      TAM
9      Gender (for TAM categories with suffixed gender)
10     Clitic =‘a ‘and, even’; pragmatic clitics (e.g., addressee dative, see 8.5.10)

Verbs have four basic stem forms in each language, marked by ablaut. In addition, many verbs have a plural and/or pluractional counterpart, formed usually by vowel ablaut, consonant alternation, or suppletion. Each of these has its own three or four stems, but all of them can be regarded as additional stems of the basic verb, for a total of seven stems (Molochieva, 2010, p. 75; see Table 8.15). Inflectional suffixes take a particular stem: the simple present and imperfect take the imperfective stem, the witnessed past and non-witnessed tense take the perfective stem, and the infinitive and masdar (verbal noun) take the infinitive stem. The infinitive stem is usually taken as the underlying form; the others result from umlaut (no longer transparent). The stem from which most of the others can be predicted is the infinitive stem for Chechen and the present stem for Ingush (Handel, 2003). Chechen has a total of 35 ablaut classes and Ingush has 16 (see Molochieva, 2010; Nichols, 2007, 2011, pp. 237–239; Nichols & Vagapov, 2004). Beerle (1988) and Handel (2003) give phonological analyses of the alternations and complete listings of the simple verbs and their ablaut types. For a historical discussion of these alternations, see Imnaishvili (1977). There are also approximately ten irregular verbs, most with suppletive forms in one or another tense paradigm.

Table 8.15 Four Base Stem Forms in Chechen and Ingush

                ‘strike’               ‘drink’
                Ch.        Ing.        Ch.       Ing.
Infinitive      tuox-      tuox-       mal-      mal-
Imperfective    tuox-      tuox-       mol-      mol-
Perfective      tüex-      tiex-       mel-      mel-/mal-
Perfective 2    tüːx-
Pluractional
Infinitive      d.iett-    d.iett-     miil-     miel-
Imperfective    d.üett-    d.iett-     muul-     miel-
Perfective      d.iitt-    d.iitt-     miil-     miil-
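The pairing of inflectional categories with stems described in this section can be sketched as a simple lookup. The mapping below restates only what the text says (present and imperfect on the imperfective stem; witnessed past and non-witnessed tense on the perfective stem; infinitive and masdar on the infinitive stem), and the stems are the Chechen basic stems from Table 8.15. The names `STEM_FOR_FORM`, `select_stem`, and the dictionary keys are hypothetical labels chosen for illustration; the sketch selects a stem and does not attempt to build full inflected forms.

```python
# Which stem each inflectional category selects, as stated in the text.
STEM_FOR_FORM = {
    "present": "imperfective",
    "imperfect": "imperfective",
    "witnessed_past": "perfective",
    "non_witnessed": "perfective",
    "infinitive": "infinitive",
    "masdar": "infinitive",
}

# Chechen basic (non-pluractional) stems copied from Table 8.15.
CHECHEN_STRIKE = {"infinitive": "tuox-", "imperfective": "tuox-", "perfective": "tüex-"}
CHECHEN_DRINK = {"infinitive": "mal-", "imperfective": "mol-", "perfective": "mel-"}


def select_stem(stems: dict[str, str], form: str) -> str:
    """Return the stem that the given inflectional category attaches to."""
    if form not in STEM_FOR_FORM:
        raise ValueError(f"category not covered by this sketch: {form}")
    return stems[STEM_FOR_FORM[form]]


if __name__ == "__main__":
    for category in STEM_FOR_FORM:
        print(category, "->", select_stem(CHECHEN_DRINK, category))
```

A fuller model would also need the pluractional stems and the ablaut classes mentioned above, which this sketch leaves out.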

8.5.3 Synthetic TAM Forms

The witnessed past and perfect create an evidentiality contrast: the witnessed past is used when the speaker has seen or otherwise perceived the event, and the perfect when the speaker has not seen the event but infers it from a result (it has a strong resultative sense). There are also some analytic evidential tenses using an evidential tense form of the auxiliary (see Molochieva, 2010). Table 8.16 lists all synthetic tenses, including those that also serve as evidentials.

Table 8.16 Synthetic Tenses (D = citation form for gender)

Form                       Chechen      Ingush
Perfective stem
  Witnessed past           -ar          -ar (with high tone)
  Recent witnessed past    -i           (-ii; marginal)
  Perfect                  -(a/i)na     -na=D/-aa=D (< *-ana=D)
  Remote past              -niera       -niera
Imperfective stem
  Present                  -u/-a        -a
                           -ra          -ar
                           -r           –

8.5.4 Non-Finite Forms

The participles are adjectives and make a nominative/oblique case distinction when used as attributives. Both Chechen and Ingush have separate forms for attributive and predicative participles.⁴ See Table 8.17 for the attributive forms. The infinitive is a verb complement; participles are used in relativization and as attributives; the anterior and simultaneous converbs are used in chaining (and the simultaneous converb also in adverbial subordination ‘while’); the masdar is used in verb complement clauses and nominalization; and the subjunctive is used in verb complement clauses. There are about two dozen other converbs used in subordination, e.g., the temporal converb in -ča (both languages) ‘when’, -alc ‘until’, and others (see Good, 2003a, 2003b; Molochieva & Komen, to appear; Nichols, 2011, pp. 297–308).

(13) a. Infinitive              Infinitive stem with Ch. -an, Ing. -a
     b. Past participle         Homophonous with the perfect in Chechen; -na/-aa in Ingush (perfect tense without gender suffix)
     c. Anterior converb        Homophonous with the past participle
     d. Simultaneous converb    Imperfective stem plus Ch. -uš/-aš, Ing. -až
     e. Verbal noun (masdar)    Infinitive stem with -ar (derives D gender noun)
     f. Subjunctive             Any stem with -aljg(a) in Ingush

4 We gloss attributive participles as rel.

Table 8.17 Attributive Participle Forms Used in Relative Clauses

Verb      Tense       Finite form   Participle form         Free relative
                                    nom        obl          nom         erg
cop       pres, aff   du            dolu       dolču        derg        dolčuo
          pres, neg   daac          doocu      doocuču      doocurg     doocučuo
          past        dara          –          –            –           –
aux       pres        xülu          xülu       xüluču       xülurg      xülučuo
          past        xilla         xilla      xillaču      xillarg     xillačuo
‘read’    pres        jüešu         jüešu      jüešuču      jüešurg     jüešučuo
          past        jiešna        jiešna     jiešnaču     jiešnarg    jiešnačuo
‘want’    pres        laeʔa         luʔu       luʔuču       luʔurg      luʔučuo
          past        liʔna         liʔna      liʔnaču      liʔnarg     liʔnačuo

8.5.5 Non-Indicative Forms

The simple imperative is homophonous with the infinitive, and is used for suggestions, requests, commands, and so on. There are also two mild, or polite, imperatives: -al and -alaħ (or, analyzed differently, -l and -laħ added to the plain imperative); the second, at least for Ingush, is described by speakers as a future imperative, but it is also polite.

(14) biexk ma=b.illa-laħ.     (Chechen)
     bexk my=b.aaqqa-laħ.     (Ingush)
     offense neg=b.put/take-imp
     ‘Excuse me.’

The polite imperatives carry high tone, which is realized on the vowel before the -l (see also section 8.2.3). The finite conditional (used in the apodosis of a conditional construction) is a synthetic form composed of the future stem plus the past tense of ‘be’. The protasis uses a conditional converb.

8.5.6 Periphrastic (Analytic) TAM Forms

Periphrastic tenses are numerous. They are composed of a simultaneous or anterior converb plus an auxiliary, a finite form of one of the two verbs ‘be’ (or, for Ingush, the verb ‘stand’, which is used when the speaker sees or saw the event). The converb marks aspect, and the auxiliary marks tense. The auxiliary also determines the valency of the clause: since the auxiliaries are intransitive, the subject (regardless of valency) is nominative, as is the object. This is the biabsolutive (“binominative”) construction. It is optional for some progressives, which allow the subject case to be assigned by either the auxiliary (biabsolutive) or the converb. Hence there are three possibilities for progressives of transitive verbs:

(15) Ingush Non-Progressive
     cuo xii mol.
     3sg.erg water drink.prs
     ‘He drinks water.’

(16) Ingush Progressive
     cuo xii mol-až dy.
     3sg.erg water drink-sim.cvb d.is
     ‘He is drinking water.’

(17) Ingush Biabsolutive Progressive
     yz xii mol-až vy.
     3sg.nom water drink-sim.cvb v.is
     ‘He’s drinking water (these days),’ or ‘He’s a water-drinker, teetotaler.’

(18) Ingush Visible Progressive
     yz xii mol-až laatt.
     3sg.nom water drink-sim.cvb stand
     ‘He’s drinking water.’ (right now; visible or at least known to speaker)

Chechen also has these progressive forms, though with some semantic differences. The non-progressive is used to indicate events in the present. The progressive meaning, in Chechen, is expressed by biabsolutive constructions.

(19) Chechen Non-Progressive
     cuo xi molu.
     3sg.erg water drink.prs
     ‘He drinks water.’

In Chechen the non-biabsolutive progressive is used for ongoing actions where the object argument is under focus:

(20) Chechen Progressive
     cuo xi molu-š du.
     3sg.erg water drink-sim.cvb d.is
     ‘He is drinking water,’ or ‘The water is being drunk by him.’

The biabsolutive construction is a true progressive as in Ingush:

(21) Chechen Biabsolutive
     iza xi molu-š vu.
     3sg.nom water drink-sim.cvb v.is
     ‘He’s drinking water (now).’

The verb ‘stand’ is used only when the subject is visible to the speaker, and not for progressive situations in general.

(22) Chechen Visible Progressive
     iza xi molu-š laetta.
     3sg.nom water drink-sim.cvb d.stand
     ‘He’s drinking water.’ (right now; visible or at least known to speaker)

The ‘be’ verbs used include the basic copula (which distinguishes only present vs. past tense and is durative with no contrasts for aspect or evidentiality) and the habitual/iterative ‘be’ verb xil-/xül- (which is non-durative). The combination of their different aktionsarten and the converbal aspects yields many tense-aspect forms (see Molochieva, 2010; Molochieva & Komen, to appear):

(23) Chechen
     ža d.aaž-o-š v.ara iza.
     sheep d.graze-caus-sim.cvb v.was 3sg.nom
     ‘He was herding/guarding the sheep.’ (basic copula)

(24) Chechen Past Progressive Habitual
     ža d.aaž-o-š xil-ura iza.
     sheep d.graze-caus-sim.cvb be-ipfv 3sg.nom
     ‘He used to herd/guard the sheep.’ (habitual/iterative copula)

(25) Chechen Non-Progressive Future
     ža d.aaž-o-r du cuo.
     sheep d.graze-caus-fut d.is 3sg.erg
     ‘He will guard the sheep.’ (basic copula)

(26) Chechen Progressive Future
     ža d.aaž-o-š xira vu iza.
     sheep d.graze-caus-sim.cvb be.fut v.is 3sg.nom
     ‘He will be guarding the sheep.’

The non-witnessed forms are periphrastic. Such forms consist of a converb and auxiliary xilla ‘be.perf’. The lexical verb is a simultaneous or anterior converb depending on the aspectual properties of the event.

(27) Chechen Unwitnessed Past with Simultaneous Converb
     ža d.aaž-o-š xilla iza.
     sheep d.graze-caus-sim.cvb be.prf 3sg.nom
     ‘He was herding/guarding the sheep (I did not see it).’

(28) Chechen Unwitnessed Past with Anterior Converb
     ža d.aaž-i-na xilla cuo
     sheep d.graze-caus-ant.cvb be.prf 3sg.erg
     ‘He herded/guarded the sheep (I did not see it).’

The auxiliaries can also form converbs, so periphrastic tenses in converbal form can be used as complex stems with a further auxiliary, and so on. In elicitation, at least, this can be continued indefinitely, giving the impression that the tense-aspect system is highly productive and open. In any case it is very large; 49 different tense-aspect forms are in regular and frequent use. For a full list with examples and analyses of meanings, see Molochieva (2010).

8.5.7 Negation

Chechen uses a proclitic ca= for negation of all synthetic indicative forms. Ingush uses a suffix -c (present) and -ndz- plus tense suffix (past) for finite forms. In both languages, the copula has a separate negative form with the suffix and ablaut.

(29) Chechen      Ingush
     ca=oolu      oala-c          ‘doesn’t say’
     ca=eelir     eal-andz-ar     ‘didn’t say’
     d.aac        d.aac           ‘isn’t’

(30) Chechen
     ocu beerana šiega jüħanca školieħ hun jüːcu a ca-xaeʔa.
     dem.obl child.dat 3sg.refl.all beginning school.loc what j.tell conj neg-know.prs
     ‘At first that kid doesn’t even know what he is taught at school.’ (p86-00009: 17)

(31) Chechen
     naana c’aaħ jaac.
     mother home-adv j.be-neg
     ‘Mother isn’t home.’

(32) Chechen
     vaj ʕilman-an konferenc-ieħ doklaad-aš jie-š daac.
     1pl.incl science-gen conference-loc report-pl j.do-ptcp.prs d.neg
     ‘We are not delivering reports at a scientific conference.’ (p34-00002: 16)

Non-finites take the proclitic in both languages, e.g., converbs:

(33) huma=’a ca=ooluš     (Chechen)
     hama=’a cy=oalaž     (Ingush)
     thing=and neg=say.cvb
     ‘not saying a thing’

     huma=’a ca=aella     (Chechen)
     hama=’a cy=eanna     (Ingush)
     thing=and neg=say.cvb
     ‘having said nothing’

The exception is the negative participle, formed from the perfective stem. The Chechen negative participle has the meaning ‘not yet’ and is used with an auxiliary ‘be’:

(34) Chechen
     iza baazara j.axa-za j.ara.
     3sg bazaar-adv j.go-ptcp.neg j.was
     ‘She hadn’t gone to the market yet,’ or ‘She didn’t go to the market yet.’

The Ingush analog is mostly lexicalized as adjectives, e.g., d.iishandza ‘uneducated’, d.ittandza ‘unwashed, dirty (laundry)’, d.oa-d.andza ‘intact’ (lit. ‘not destroyed’), qendza ‘immature, not grown up’. Imperatives and desideratives are negated with Ch. ma=, Ing. my=, as in ma=aala, my=aala ‘Don’t say’.

(35) Chechen
     šaa ʕüllučüra t’ulg a ma-ħabie!
     3sg.refl lie.prs.abl stone and neg-move.b.imp
     ‘Don’t even move a stone from where it lies!’ (p86-00033: 29)

(36) Chechen
     daaxarieħ irs aettuo a, c’ient’ieħ iiman-bierkat a ma-iešadojla.
     life.loc happiness luck and house.loc faith-abundance and neg-lack.d.desid
     ‘May you not lack happiness and luck in your life and faith and abundance at your home!’ (p86-00094: 11)

The negative morpheme has high tone, which is audible on the proclitic and spreads left one syllable from the suffix. Negation is symmetrical in terms of Miestamo (2005): apart from the negation of the verb there is no structural difference between positive and negative sentences. For the syntax of negation, see section 8.6.10.
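The choice of negation strategy described in this section can be summarized as a small decision function. This is a rough sketch under stated assumptions: the function name, the `Negator` type, and the category labels are hypothetical; the Ingush proclitic cy= is taken from example (33); and the sketch ignores the special negative forms of the copula and the negative participle.

```python
from typing import NamedTuple


class Negator(NamedTuple):
    position: str  # "proclitic" or "suffix"
    marker: str


def negation_marker(language: str, form: str) -> Negator:
    """Pick the negator for a verb form.

    language: "chechen" or "ingush"
    form: "present", "past", "imperative", "desiderative", or "nonfinite"
    """
    if language not in ("chechen", "ingush"):
        raise ValueError(f"unknown language: {language}")
    if form in ("imperative", "desiderative"):
        # Prohibitive proclitic: Ch. ma=, Ing. my=
        return Negator("proclitic", "ma=" if language == "chechen" else "my=")
    if form == "nonfinite":
        # Non-finites take the proclitic in both languages.
        return Negator("proclitic", "ca=" if language == "chechen" else "cy=")
    if form not in ("present", "past"):
        raise ValueError(f"form not covered by this sketch: {form}")
    if language == "chechen":
        # Chechen: proclitic on all synthetic indicative forms.
        return Negator("proclitic", "ca=")
    # Ingush finite forms: suffix -c (present), -ndz- plus tense suffix (past).
    return Negator("suffix", "-c" if form == "present" else "-ndz-")


if __name__ == "__main__":
    print(negation_marker("chechen", "present"))
    print(negation_marker("ingush", "past"))
```

The split between proclitic and suffix is the main point: Chechen is uniform across finite and non-finite forms, while Ingush switches to suffixes only in finite indicatives.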

8.5.8 Valence-Changing Operations

There are several transitivizing derivations, one that derives ambitransitives and a construction that allows the transitive subject, A, to be freely deleted. These last two are the closest Chechen and Ingush come to detransitivization; neither can properly be called a passive.

The direct causative applies only to intransitive verbs and two-argument verbs with nominative A, plus ‘eat’ and ‘drink’. It derives a transitive verb where the change of state or position or location comes about as a result of direct, often physical, action by the added A. The intransitive subject (S) becomes the transitive object (O), semantically a patient (P), with no case change as both the input and output are nominative. The suffix is transparently derived from d.- ‘make/do’.⁵

The indirect causative applies to any verb, deriving a transitive verb whose A allows the event to occur or makes it occur but without necessarily direct physical action. The input S becomes an O, still nominative, but (at least for human nouns) with some semantic agency or responsibility for the action; it is often not semantically a patient. The input A becomes an allative causee. The suffix is derived from d.it- ‘leave’, without the gender prefix. The double causative causativizes a direct causative.

The inceptive/potential adds the stem of ‘give’ to produce a verb with either bounded aktionsart (this effect is clearest where the input verb is stative or durative while the output is punctual, ingressive, or iterative) or a meaning ‘can . . .’ Sometimes both meanings are possible, and sometimes one or the other is strongly preferred or uniquely possible; the contexts determining which meaning applies have not been worked out yet. Here we shorten the term to inceptive (inc). The derivation applies to verbs of any valency. It adds no arguments but changes the case of an input ergative A to dative. If the input verb is transitive, the output is ambitransitive, i.e., the A is optional.

In the pattern Creissels (2014) calls “radical P alignment”, the A of a transitive can be omitted entirely.

In all of these valence changes, the nominative argument of the input remains nominative. An input ergative A (causee output in causatives) changes its case. Table 8.18 is adapted from Nichols (2011, p. 485). For Ingush examples grouped by input valence, see Nichols (2011, pp. 485–493). In (37) to (42) we see Chechen examples with ‘drink’:

Table 8.18 Valence-Changing Verb Derivations

Derivation           Suffix                    Input                   Output                                  Case change of A   Added A
Direct causative     -d.u ‘make’               Non-transitive (nom)    Transitive (erg nom)                    erg > dat          erg
Indirect causative   -(i)t ‘leave’             Any                     Ditransitive (erg nom all)              erg > all          erg
Double causative     -d.eit ‘make’+‘leave’     Non-transitive          Ditransitive + all (erg dat all nom)    erg > dat          erg
Inceptive            -lu ‘give’                Any non-ingressive      Ambitransitive ((dat) nom)              erg > dat/all      –

5 This verb is one of two that consist of a gender initial (here in citation form d-), a tense suffix, and a conjugation class (determining the ablaut class and/or present tense ending) but no root.

(37) Chechen Transitive Input
     beeraš šura molu.
     child-erg milk drink
     ‘The child drinks milk.’

(38) Chechen Direct Causative
     aħ beer-ana šura mala-j.o.
     2sg.erg child-dat milk drink-j.caus
     ‘You have/make the child drink milk,’ or ‘You give the child milk to drink.’

(39) Chechen Indirect Causative
     aħ beer-aga šura mal-üːt-u.
     2sg.erg child-all milk drink-indir.caus-prs
     ‘You let the child drink milk.’

(40) Chechen Double Causative
     naana-s beer-ana ħöga šura mala-j.a-it-u.
     mother-erg child-dat 2sg.all milk drink-j.caus-caus-prs
     ‘The mother lets you let/make/have the child drink milk.’

(41) Chechen
     šura mala-lo.
     milk drink-inc
     ‘The milk is drinkable.’

(42) Chechen
     beer-aga šura mala-lo.
     child-all milk drink-inc
     ‘The child can drink milk.’

There are three possible interpretations of a clause with a transitive verb, an O, and no overt A. The absence of A could be interpreted as null anaphora, an unspecified null pronoun, or true absence of an agent. This set of interpretations, especially the last, constitutes radical P alignment. Consider (43):

(43) Chechen
     (Ø) šura melira.
     milk drink-pst.wit
     i. ‘He/she/we/you/they drank the milk.’
     ii. ‘(Someone) drank the milk.’
     iii. ‘The milk was/got drunk.’

There is no applicative (and, in fact, no process of any kind that changes the argument role of objects), no passive, and no antipassive.


8.5.9 Agreement

Verbs agree in gender with the nominative argument of the clause. Gender markers differ between singular and plural for many nouns, so gender agreement also encodes number agreement. Gender marking is root-initial. About 30% of the verb roots have a gender agreement slot; the rest do not. There is no evident basis, semantic or otherwise, for which verbs do or do not have gender. For agreeing verbs, all forms (finite and nonfinite) have gender agreement. Some tense forms, and some tense auxiliaries, distinguish gender (without exception, so verbs that lack initial gender agreement nonetheless have it on the relevant tense forms).

Some verbs have additional number agreement with the S/O, marked by ablaut, consonant alternation, and/or suppletion, e.g., Ing. sg. d.oll-, pl. d.oxk- ‘insert’, sg. qoss-, pl. qous- ‘cast’, sg. ull-, pl. d.aad- ‘lie’; Ch. sg. d.aaqq-, pl. d.ax- ‘take’, sg. d.ad-, pl. d.oud- ‘run’, sg. d.ill-, pl. d.axk- ‘put down’, sg. diž-, pl. d.iiš- ‘lie down’. These are a minority of verbs (Nichols, 2011, p. 314, lists 23 such pairs for Ingush, and Maciev, 1961, pp. 601–602, lists 13 for Chechen).

A smaller minority of adjectives (under 10%) agree in gender/number with the head noun (again with no evident semantic basis). Unlike verb agreement, which is only with nominatives, adjective agreement applies to all cases. Participles of gender-agreeing verbs also agree. One adjective has a separate plural form, i.e., Ing. v.oaqqa sag ‘old man, big man’, b.oaqqii nax ‘elders’ with plural -ii. One has a suppletive plural, i.e., sg. zʕamiga ‘young’, pl. kegii, but zʕamiga in its broader and more basic sense ‘small’ is not suppletive.

8.5.10 The Addressee Dative Construction

An additional evidential-like category is the addressee dative construction, where a dative personal pronoun is cliticized immediately after the finite verb form. The meaning relates to information structure, resembling miratives, or ethical datives of other languages. A dative second person pronoun indicates that the statement is important to the hearer, possibly unexpected, and the background information is known to the hearer and speaker. It may be a warning or an announcement:

(44) Chechen
     muusaa, laetta ma ħieža – bʕaaguor booghur bu=ħuuna!
     Musa ground.adv neg look:iter.imp dizziness b.come-fut b.is=2sg.dat
     ‘Musa, don’t keep looking at the ground; you’ll get dizzy!’

(45) Ingush
     ʕa-lieg-až dy=ħuona!
     down-fall:iter-sim.cvb d.prog=2sg.dat
     ‘(Be careful), he still falls down a lot.’ (Parent warning someone that a child still can’t walk very well.)

It may also be a statement of an authority figure, in this example God to the father of a disobedient son:

(46) Ingush
     ʕa jaaxar cy=dear ħa amarca vaac=ħuona.
     2sg.erg say-ptcp.nmlz neg=d.do.nmlz 2sg.gen order-ins be.neg=2sg.dat
     ‘One who doesn’t obey you isn’t your responsibility.’ (0540)

Ingush also uses the inclusive in this construction, to state something important that is known to both speaker and hearer but not in the hearer’s immediate consciousness.

(47) Ingush
     aara žei-doaxa liela-do=i=vaina?
     outside sheep-cattle keep-dir.caus=q=1pl.incl
     ‘Well, sometimes livestock are kept outside after all, aren’t they?’ (0409.22)

Either of these can co-occur with a governed dative, showing that it is not an argument. In addition, the clitic can be phonologically reduced, unlike an argument pronoun.

(48) Ingush
     ħa-ieca ealar, ħuona=‘a by=ħuona cy=čy
     dx-take say.pst.wit 2sg.dat=and b.is=2sg.dat dem=in
     ‘Take it, he said, there’s something for you too in there.’ (0395A.31)

For more on these constructions, see Molochieva (2010); Molochieva and Nichols (to appear); Nichols (2011, pp. 280–283).

8.6 Simple Clauses

8.6.1 Structure of Noun Phrases

Noun phrase word order is relatively strict. Relative clauses can be extraposed and possessors can be extracted for prosodic reasons and topicalization (in (49) and (50) the underscore marks the extraction site, brackets mark the NP, and the extracted possessor is indexed):

(49) Chechen
     [NP bulan-aš d-ieb-iita-r a, ʕaalašja-r a __ₖ dieqar] d-u [masseer-an a]ₖ
     aurochs-pl d-multiply-caus-nmlz and protect-nmlz and duty d-prs all-gen and
     ‘It is everyone’s duty to protect the aurochs and let them multiply.’ (p86-00032: 22)

(50) Ingush
     [NP ___ₖ kuogal’gaž] mel zʕamiga by xoi ħuona cynₖ?
     feet how.much small b.be.pres know.q 2sg.dat 3sg.gen
     ‘You know how tiny his feet are?’ (Adult to child, about a baby.) (0746)

8.6.2 Predicate Structure

Verbs of all kinds can form predicates; other lexical classes can be predicates only with a copula or other verb.

(51) Ingush Predicate noun
     yz dika sag jy.
     3sg good person j.be.prs
     ‘She’s a good person.’

(52) Ingush Predicate adjective
     yz dika dy.
     3sg good d.be.prs
     ‘That’s good.’ or ‘It’s good.’

(53) Ingush Locative Predicate
     yz aaraħ vy
     3sg outside v.be.prs
     ‘He’s outside.’

8.6.3 Finiteness

All and, with very few exceptions, only main clauses are finite. Since there is no person agreement, person cannot be used as a criterion for finiteness. The only category unique to finites and categorically absent from non-finites is tense. Some non-finites have relative converbal tense (e.g., anterior or simultaneous to an adjacent clause or main clause), but these are not deictic like finite tenses.

Some non-finites clearly belong to non-verb lexical classes: the masdar is morphologically a noun, hosting gender (D) and nominal cases; participles are morphologically adjectives, with the bipartite nominative/oblique distinction of adjectives and, when used attributively, case agreement with the head noun. Converbs have no distinguishing features of any part of speech; they are sometimes regarded as verbal adverbs, but adverbs as a class do not have any morphological identifiers.

8.6.4 Major Valence Classes

There are two major valence classes: intransitives (one argument, nominative S) and transitives (two or more arguments; nominative O, ergative A). Transitive subclasses include monotransitives (ergative A, nominative O), ditransitives (ergative A, dative or allative goal, nominative theme), and polytransitives (indirect causatives of ditransitives: ergative A, dative higher causee, allative causee; see section 8.5.8). All light verbs have one of these patterns (mostly transitive). Nearly all ditransitives encode the theme as a direct object (nominative case, agreement) and the goal in an oblique case.

8.6.5 Minor Valence Classes

A handful of high-frequency verbs of cognition and perception have a dative experiencer and a nominative stimulus. Another subset of these verbs have a nominative experiencer and a lative (or occasionally other) stimulus (see section 8.6.9).

(54) Chechen ‘know’ (dat nom)
     Muusa-na duqqa tüːra-naš xaeʔa
     Musa-dat many fairytale-pl know.prs
     ‘Muusa knows many stories.’

(55) Chechen ‘fear’ (nom lat)
     so cu žʕalax č’oogha qoeru.
     1sg.nom dem dog very fear.prs
     ‘I fear this dog.’

8.6.6 Word Order and Information Structure

Clauses, like phrases, are head-final by default. All non-finite clauses are verb-final. Main clauses are often verb-final, especially episode-initial ones, but the verb may also come before the subject. For Chechen this is described as marked verb-subject order, favored by certain structural and pragmatic factors (Komen, 2013; Molochieva & Komen, to appear). For Ingush it is described as verb-second order, frequent and probably unmarked in main clauses other than narrative openers (Nichols, 2011, pp. 673–677). In this order, a finite TAM auxiliary or light verb is in second position and the non-finite lexical verb is clause-final.

(56) Ingush
     muusaa vy ħuona telefon jettaž.
     Musa v.be.prs 2sg.dat phone j.strike:iter.sim.cvb
     ‘It’s Musa on the phone for you’ or ‘Musa is calling you.’

A verb prefix or the first element of a compound is clause-final while the finite component is in second position:

(57) Ingush (cf. hwa-d.oagha ‘come’)
     paččaħ v.oagha uquzaħ ħa.
     king v.come here deic
     ‘The king is coming.’ (PL 1.1)

(58) Ingush (NB: bwarjga ‘eye’ + gu/d.ei- ‘see’ means ‘see’)
     myčaa j.ei-n-ii ħuona c’ie mettig bʕarjga?
     where j.see-pst.nwit-j.q 2sg.dat red place eye
     ‘Have you ever seen an all-red place?’ (0240a)

A main clause is verb-initial if it immediately follows a chained clause, but not if it immediately follows a subordinate clause. In (59) and (60) the converb clauses are bracketed; note the position of aara-vealar in each:

(59) Ingush Core-Chained Clause
     [so bʕarjga=’a jeina] aara-vealar Muusaa.
     1sg eye=and j.see.cvb out-v.go.pst.wit Musa
     ‘Musa saw me and left.’

(60) Ingush Time Subordinate Clause
     [so bʕarjga-jeiča], Muusaa aara-vealar.
     1sg eye-j.see.cvb Musa out-v.go.pst.wit
     ‘When he saw me, Musa left.’

In core chaining, the prefix generally remains on the verb. (For more examples and discussion, see Molochieva & Komen, to appear; Nichols, 2011, pp. 678–683.)

A striking difference between Chechen and Ingush is the frequency of subjects occurring after the finite verb. Counts by Komen and Bugenhagen (2017) indicate that two-thirds of Ingush subjects occur after the finite verb, while only one-third of Chechen subjects do. So while postverbal subjects appear to be the norm in Ingush, they are rare in Chechen. Especially rare are postverbal pronominal subjects: only one-third of the postverbal subjects are pronouns, and these are possible only where a participant has been established in the preceding discourse (for more, see Komen & Bugenhagen, 2017).


8.6.7 The Syntax of Agreement Gender and number agreement are almost entirely phrase- or clause-internal (section 8.5.9). Two verbs, meg ‘may, can, possible’ and d.ieza ‘should, must’, agree in gender with the subject of their infinitive clause (see also section 8.8.6). (61) Ingush ħa=joala-jie meg-agjar yz deic=j.go-j.caus.inf may-j.cond 3sg ‘. . . it would be possible to bring her back . . .’ Impersonal predicates referring to temperature, weather, and so on, agree in the J gender: (62) Ingush taxan šiila jy. today cold j.be.prs ‘It’s cold today.’ The abstract nouns vaaxar ‘life’, valar ‘death’ have fixed initial v- which does not agree with any clause member. The nouns themselves belong to D gender, like all masdars. (63) Ingush handz vai vaaxar xala dy. now 1pl.incl.gen life difficult d.be.prs ‘Our life is hard these days.’ or ‘Life is hard these days.’ In an NP containing a numeral, the head noun is always singular. A demonstrative modifying the NP is plural for numerals over one, but the verb agrees in gender/number with the head noun: (64) Ingush yž qo voaqqa sag myča vaxar? dem.pl 3 v.big person where v.go ‘Where did those three old men go?’ In copular clauses the verb agrees not with the subject but with the predicate noun or adjective. In (65) a female actor plays a male role: (65) Ingush uq spektakle-ħ Marem Muusaa vy. dem.obl performance-adv Mariam Musa v.be.prs ‘In this play Mariam (F) plays Musa (M).’

When a verb needs to agree with coordinated nouns of different genders, the rules vary between the languages. In Ingush, the verb agrees with the last conjunct:

(66) Ingush
suona je ghar=ji tata=ji k’orda-dea=d.
1sg.dat dem noise(j)=and bang(d)=and fed.up-d.vblz.nwit=d
‘I'm sick of all this shouting and banging.’

Agreement is with the last conjunct regardless of whether the verb follows or precedes the coordinate phrase, i.e., proximity to the verb is not a factor in agreement. In standard Chechen, the verb takes D agreement regardless of the genders of the coordinates. Highland Chechen, at least the upper Itum-Kale dialect investigated here, is like Ingush.
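The agreement rules for coordinated nouns described above can be restated as a small lookup. The following Python sketch is only an illustration of those rules: the function name, the variety labels, and the lowercase gender codes are hypothetical conveniences, not notation from the chapter.

```python
def agreement_gender(conjunct_genders, variety):
    """Gender marker on a verb agreeing with coordinated nouns.

    conjunct_genders: gender codes in conjunct order, e.g. ["j", "d"].
    variety: "ingush", "highland_chechen", or "standard_chechen"
             (hypothetical labels chosen for this sketch).
    """
    if not conjunct_genders:
        raise ValueError("need at least one conjunct")
    if variety in ("ingush", "highland_chechen"):
        # Last-conjunct agreement; the verb's position plays no role.
        return conjunct_genders[-1]
    if variety == "standard_chechen":
        # D agreement regardless of the conjuncts' genders.
        return "d"
    raise ValueError(f"unknown variety: {variety}")


# Example (66): noise(j) and bang(d) give D agreement in Ingush.
print(agreement_gender(["j", "d"], "ingush"))
```

With the conjuncts in the opposite order (a constructed case, not an attested example), the sketch returns J for Ingush but still D for standard Chechen.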

8.6.8 Local Anaphora

Within clauses, a coreferent to the subject or (for possessor reflexivization) object is reflexive, preserving the person of the coreferent (see Nichols, 2001, 2011, pp. 640–644). Only animates can control reflexivization.

(67) Subject-Controlled Possessor Reflexivization in Ingush
bieražₖ šeiₖ žʕalegh qer.
child-pl 3pl.refl.gen dog-lat fear
‘The children are afraid of their own dog.’

(68) Object-Controlled Possessor Reflexivization in Ingush
šeiₖ žʕalez bieražₖ qiera-dyr.
3pl.refl.gen dog-erg child-pl fear-d.caus.pst.wit
‘Their own dog scared the children.’

Any other participant can be coreferred to using the neutral demonstrative (examples throughout) or, under various pragmatic and discourse conditions, the proximal demonstrative (section 8.8.3). Alternatively, anaphoric pronouns may be null, as the stylistic preference is to minimize overt tokens, using null coreferents after the first overt one (Nichols, 2011, pp. 638–640; for more on null pronouns, see Nichols, 2018). Overt tokens are entirely grammatical and not emphatic or focused (i.e., these are not pro-drop languages), but they are not particularly frequent.

8.6.9 Grammatical Relations

There are almost no valence-changing processes other than the causative derivations (see section 8.5.8). This means that mapping argument structure to valence is very straightforward. For every argument role, there is a default valence mapping and usually a non-default one displayed by a handful of verbs. Ditransitives have mostly primary/secondary object alignment, with T encoded as O and G encoded differently. Both languages have very consistent ergative case and agreement morphology, but the syntax is generally accusative (S/A controller) or nominative-controlled. In phrase syntax, the default (and near-exclusive) treatment of possessors is genitive case; the default case for objects of postpositions is dative, but postpositions derived from verbs govern the same case as the verb. Section 8.6.5 illustrates non-default patterns; examples of default valence are found throughout this chapter. Nichols (2011, pp. 462 ff.) attempts to list verbs with non-default valence for Ingush completely.

Table 8.19 Mapping Argument Structure to Valence

Argument   Default      Non-default
A          Ergative     Dative (verbs of perception, cognition, etc.); Nominative (‘fear’ and a few others)
S          Nominative
O          Nominative   Lative (‘fear’, above)
T          Nominative
G          Dative       Allative
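The default and non-default mappings of Table 8.19 can be written out as a simple lookup. This Python sketch is illustrative only; it assumes that the non-default entries Lative and Allative in the table belong to O and G respectively, and the names `DEFAULT_CASE`, `NON_DEFAULT_CASES`, and `case_for` are hypothetical.

```python
# Default case for each argument role, following Table 8.19.
DEFAULT_CASE = {
    "A": "ergative",
    "S": "nominative",
    "O": "nominative",
    "T": "nominative",
    "G": "dative",
}

# Non-default cases shown by a handful of verbs.
# Assumption: Lative is the O entry and Allative the G entry.
NON_DEFAULT_CASES = {
    "A": {"dative", "nominative"},
    "O": {"lative"},
    "G": {"allative"},
}


def case_for(role, lexical_case=None):
    """Return the case of an argument role, honouring a verb-specific override."""
    if lexical_case is None:
        return DEFAULT_CASE[role]
    if lexical_case in NON_DEFAULT_CASES.get(role, set()):
        return lexical_case
    raise ValueError(f"{lexical_case!r} is not a listed option for role {role}")


# A verb like 'fear': nominative A and lative O.
print(case_for("A", "nominative"), case_for("O", "lative"))
```

The override argument stands in for lexical marking on individual verbs, which is how the chapter describes the non-default patterns.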

8.6.10 Negation

For the morphology of negation, see section 8.5.7. The negative marker is always hosted by the verb, regardless of which clausal constituents are in its scope. There is no double or multiple negation and no dedicated negative words other than Ch. cq’aa=’a, cq’aana cq’a=’a, Ing. c’aqqa=’a, cq’ea-cq’a ‘never, not once’. Negated indefinites (including nouns stag/sag ‘person; someone’ and huma/hama ‘[some]thing’) often undergo focus gemination and/or host the coordinating clitic =’a but otherwise do not change their form. An indefinite, if present, is the preferred focus of negation.

(69) a. Chechen
suuna cħa sag ca=gi-ra.
1sg.dat one person neg=see-pst.wit
‘I didn’t see anyone.’

b. Ingush
suona sag bʕarjga+vein-dz-ar.
1sg.dat person eye+v.see-neg-pst.wit
‘I didn’t see anyone.’

(70) a. Chechen
cħa=’a huma=’a dala ca=dieza.
one=and thing=and d.give-inf neg=d.should
‘free, no-cost’ (lit. ‘not needing payment’)

b. Ingush
cħaaqqa hama=’a dala cy=diezaž.
any thing=and d.give-inf neg=d.should.sim.cvb
‘free, no-cost’ (lit. ‘not needing payment’)

In what Foley and Van Valin (1984) term “nuclear chaining,” the sequence of converb and main verb, which is often lexicalized as a unit, is negated as a whole with negation internal to the main clause.

(71) Ingush
vad-da vax-andz-ar.
v.run-ant.cvb v.go-neg-pst.wit
‘He didn’t run away.’ (not ‘went away without running’ or ‘ran but didn’t go away’)

In other non-finite constructions negation is marked on the clause in its scope; scope is restricted to that clause.

(72) Ingush
kog loza=’a bea, dʕa-liela-luž vaac yz.
leg hurt=and b.caus.ant.cvb deic-walk-inc.sim.cvb v.prog.neg 3sg
‘He hurt his leg and can’t walk.’ (1309)

(73) Ingush
ghaalii kuorta ħa=cy=’a boaqqaž yz cigara vaxaav.
tower.gen head deic=neg=and b.take.sim.cvb 3sg there.abl v.go.pst.nwit.v
‘He left without putting up the ‘tower head’ (i.e., capstone of a tower’s roof).’ (0743)

(74) Chechen
so baq’derg dʕa-ca-aelča ʕaluš vaac.
1sg truth away-neg-say.when rest.pot.ptcp.prs v.neg
‘I can’t but say the truth.’ (lit. ‘I can’t rest without completely telling the truth.’) (p34-00002: 116)

(75) Chechen
Muusa c’a-ca=v.eʔa-ča jilxi-ra Zaaraa.
Musa home-neg=v.come-temp.cvb j.cry-pst.wit Zara
‘Zara cried when Musa did not come home.’


8.7 Complex Sentences

This section highlights a number of phenomena that involve multiple clauses.

8.7.1 Coordinating and Subordinating Constructions

Phrasal coordination places the clitic =ii/=i/=ji or =’a after each conjunct. =ii and its variants are dedicated phrasal coordinators, probably used as the default coordinators and regularly used when the coordinates form a natural or complete set or act as a unit. Conditions for the use of =’a have not been investigated closely, but they include emphatic coordination (e.g., ‘both . . . and’).

(76) Ingush⁶
a. aħmada=’a, Maħmada=’a jaaz-dea=d keaxat.
Ahmed=and Mohammed=and write-d.pst.nwit=d letter
‘Ahmed and Mohammed each wrote a letter.’

b. aħmada=ji, Maħmada=ji jaaz-dea=d keaxat.
Ahmed=and Mohammed=and write-d.pst.nwit=d letter
‘Ahmed and Mohammed wrote the/a letter (together).’

8.7.2 Clause Chaining

Sequences of converb clauses are informally referred to as “clause chaining,” but for Chechen and Ingush there is an important difference between true chaining, involving the chaining of converbs, and other chains, which constitute adverbial subordination. In (77) we can see the grammatical diagnostics of true clause chains.

(77) Diagnostics for True Clause Chains in Chechen and Ingush
a. Chaining converb appears as predicate;
b. Coordinating enclitic =’a encliticized to content before the verb; if there is no object to host the clitic, it can attach to a prefix; if there is no prefix, a reduplication of the verb root hosts the clitic.
c. Verb-initial word order in the main clause;
d. Chaining converbs make an anterior/simultaneous distinction but no deictic tense distinctions.
e. Chained and main clauses share an argument, which is overt in one of the clauses (usually the main clause) but cannot be overt in the other, unlike typical argument coreference, which can have an overt coreferent (e.g., noun, anaphoric pronoun, reflexive pronoun) in each clause.

6 Minimal pair from Jakovlev (2001, p. 252).
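Diagnostic (77b) amounts to an ordered fallback for choosing the host of the chaining enclitic =’a. The Python sketch below restates that order under simplifying assumptions: the function name and parameters are hypothetical, the forms are passed in as ready-made strings, and the optional nature of prefix hosting ("it can attach to a prefix") is treated as a fixed preference.

```python
CLITIC = "=’a"


def clitic_host(preverbal=None, prefix=None, reduplicant=None):
    """Pick the host of the chaining enclitic, following the order in (77b).

    preverbal:   content before the verb (e.g. an object), if any
    prefix:      a verbal prefix, if any
    reduplicant: the reduplicated copy of the verb root, used as a last resort
    Returns a (host_kind, host_plus_clitic) pair.
    """
    candidates = (
        ("preverbal", preverbal),
        ("prefix", prefix),
        ("reduplicant", reduplicant),
    )
    for kind, host in candidates:
        if host:
            return kind, host + CLITIC
    raise ValueError("no available host for the enclitic")


# Forms taken from the chapter's own examples: an object host (axča),
# a prefix host (ħa), and a reduplicated root (ieza).
print(clitic_host(preverbal="axča"))
print(clitic_host(prefix="ħa", reduplicant="ieza"))
print(clitic_host(reduplicant="ieza"))
```

The sketch deliberately takes the reduplicant as input rather than deriving it from the verb, since the chapter does not state how the copy is formed.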

Semantically, chained clauses are very similar to coordinated clauses, and since in Chechen and Ingush there is almost no clause coordination akin to that of European languages (with finite verbs and conjunctions), chaining can be regarded as the morphosyntactic encoding of semantic coordination. In (78), a converbal clause with a clitic attached to a prefix is followed by a converbal clause with reduplicated verb root hosting the clitic.

(78) Ingush
mašen ħa=’a jett-aa, ieza=’a iez-aa
vehicle deic=and j.load-ant.cvb red=and weigh-ant.cvb
ʕa=jeassa-jeai.
deic=j.empty-j.caus.pst.nwit
‘They loaded the truck, weighed it, and unloaded it.’

The next examples contrast a chaining construction (79a) with adverbial subordinate converb clauses, which have verb-final order in the main clause, no =’a, possible overt realization of the coreferential argument as a long-distance reflexive, and possible noncoreference between the two clauses. (Bold represents the shared argument in its converb-clause instantiation; Ø represents anaphoric zero.)

(79) Ingush
a. peat’mat-aa axča=’a danna, aara-veal-ar Muusaa.
Peat’mat-dat money=and d.give.ant.cvb out-v.go-pst.wit Musa
‘Musa gave Peat’mat money and went out.’

b. Øₖ peat’mat-aa axča dalča, Muusaaₖ aara-veal-ar.
Peat’mat-dat money d.give.temp.cvb Musa out-v.go-pst.wit
‘When Musaₖ had given Peat’mat the money, heₖ went out.’

c. šieₖ Peat’mat-aa axča dalča, Muusaaₖ aara-veal-ar.
3sg.refl Peat’mat-dat money d.give.temp.cvb Musa out-v.go-pst.wit
‘When heₖ had given Peat’mat the money, Musaₖ went out.’

d. aħmad-az Peat’mat-aa axča dalča, Muusaa aara-veal-ar.
Ahmed-erg Peat’mat-dat money d.give-temp.cvb Musa out-v.go-pst.wit
‘When Ahmed gave Peat’mat money, Musa went out.’

In modern usage, clause chaining is often used with an accusative pattern and an S/A pivot. However, O pivots are also possible, and more common among older speakers (for more information, see Conathan and Good, 2000; Good, 2003a, 2003b; Nichols, 2011, ch. 24; Peterson, 2001).


8.7.3 Relative Clauses

Chechen and Ingush use a gap strategy of relativization, where a participle heads the prenominal relative clause. In analytic tenses and light verb constructions, only the auxiliary takes participle form (see Table 8.17 for attributive participle forms).

(80) Chechen
[beerana xuʔuču] mattaħ q’amiel dan dieza.
child.dat know.ptcp.obl language.loc speech d.do.inf d.need.prs
‘One needs to speak the language understood by a child.’ (p34-00002: 45)

(81) Chechen
[vaešna guš bolu] nieq’ biicar
1pl.incl.refl.dat see.prs.sim.cvb b.ptcp.prs way b.talk.inf.nmlz
du [vaj dan] diezarg.
d.prs 1pl.incl d.do.inf d.need.ptcp.nmlz
‘What we need to do is talk about the path we see.’ (p34-00002: 48)

Nominalized relative clauses, as in (81) vaj dan diezarg ‘what we need to do’, are equivalent to free relatives. These clauses differ from what has been shown so far in two ways: (i) they do not have a head noun; (ii) they have a nominalization suffix (or a case suffix) on the verb, -rg in this example.

Agreement between relative clauses and their head nouns is complex. There is gender agreement within the relative clause and case agreement outside the relative clause, between the participle and the head noun. The copular attributive participle bolu in (81) agrees in gender with the gap in the relative clause where nieq’ ‘way’ would have been, while it agrees in case (nominative) with nieq’ ‘way’ in the main clause. Another agreement example is (82), where the relative clause agrees internally in noun class with the implied pronominal subject üš ‘they’, while it agrees externally in case: the main verb dieza ‘should’ takes a dative-case subject.

(82) Chechen
iza xaʔa dieza [ʕiedallieħ žüepallieħ] bolčaarna a.
3sg know.inf d.need.prs power.loc answerable.loc b.rel.pl.dat emph
‘Those in the responsible positions in the government should know it.’ (p34-00002: 593)

Since Chechen and Ingush are head-final languages, the default order is for the relative clause to precede the nominal head it modifies. Extraposition is possible as well. Extraposed relative clauses are clause-final, as described in more detail by Komen (2014). There is one difference between restrictive and appositive extraposed relative clauses, and this subtle difference is related to information structure. NPs in all positions in the sentence can have an appositive extraposed relative clause. But only NPs that occur in the default focus position can have a restrictive extraposed relative clause.⁷ While the reason for this link between relative clauses and information structure is not completely clear, see Komen (2014) for one possible account.

There are two relative clause variations to discuss. The first is the it-cleft, a biclausal construction where one of the clauses is a relative clause (see section 8.8.1). The second is a logical extension of the free relative, whose English correlate, an adverbial locative phrase, would be quite different syntactically, e.g., ‘where I live’ or ‘where they go’. Chechen and Ingush, due to rich locative case systems, do not use adverbial phrases with a head like where, but they use “free locatives”, that is, free relative clauses that are headed by a locative nominal suffix rather than a full NP.

(83) Chechen
taxanleeraču diinaħ, masalaa, so veexačuoħ,
today.obl day.loc example 1sg v.live.prs.rel.loc
šu dolčuoħ a xir du iza ištta, noxčiin muott
2pl d.rel.loc conj be.fut d.prs 3sg thus Chechen.pl.gen language
ħüexuš a baac-qa.
teach.sim.cvb conj b.neg-prt
‘As of today, for example, at the place I live, or at your places, the Chechen language is not taught.’ (p34-00002: 68)

The example in (83) contains two free locatives: so veexačuoħ ‘where I live’ and šu dolčuoħ ‘where you are (=at your place)’. Comparable to the English adverbial locative phrases, the head noun is not overtly mentioned, but assumed to be something generic like ‘place’, e.g., ‘the place [where] I live’.

(84) Chechen
amma, exxar a šaa aara-vaelča, šien saalaz
but finally and 3sg.refl out-v.go.when 3sg.refl.gen sledge
a taqajoj, beeraš dolču dʕaħodura iza.
and mount.j.prs.conj child.pl d.rel.all away.run.ipfv 3sg
‘But when he did come outside in the end, he would mount the sledge and run to the other children.’

Example (84) contains the free locative beeraš dolču ‘to [the place where] the other children [were]’. The form dolču might look like an attributive participle form of the copula, as listed in Table 8.17, but it is not. It is the shortened form of the allative (the full form would have been dolčünga).

7 The default focus position is the syntactic position that immediately precedes the finite verb, as shown in Komen (2007b).


8.7.4 Complement Clauses

Complement clauses are mostly non-finite. The few finite complements are asyndetic clauses with verbs of speech and cognition:

(85) Ingush
ħo guržii vy mott cynna.
2sg Georgian v.be.prs seem 3sg.dat
‘He thinks you're Georgian.’ or ‘You look Georgian to him.’

(86) Ingush
aaz yz dika sag vy ealar.
1sg.erg 3sg good person v.be.prs say.pst.wit
‘I said he was a good person.’

This includes finite complements with interrogatives:

(87) Ingush
suona xou, t’eħa mala vys-aa-v.
1sg.dat know behind who v.stay-pst.nwit-v
‘I know who was late.’

And also complements with finite interrogatives:

(88) Ingush
šoana xaz-aa-d-ii xaac suona.
2pl.dat hear-pst.nwit-d-q know-neg 1sg.dat
‘I don't know if/whether you’ve heard this.’ (0408)

(89) Chechen
cigaħ xaza duj xaeʔara k’antana.
there.loc beautiful d.prs.q know.pfv boy.dat
‘The boy knew it was beautiful there.’

Direct speech is a literal quote of the original speech, optionally using a converb of ‘say’ as a complementizer to speech verbs. The TAM form of the converbal complementizer depends in part on that of the main verb: the anterior converb with perfective matrix verbs, the simultaneous converb with progressive and durative matrix verbs, and Ch. -ii, Ing. -ie for iterative, generic, or imperative matrix verbs.

(90) Chechen
“daada! va daada!” – booxuš, maeħarii a ħüequš,
father voc father say.ptcp.prs shout.pl conj strike.ptcp.prs
čulilxina beeraš.
in.jump.pfv child.pl
‘The children jumped in, shouting and crying: “Daddy! O daddy!” ’ (p86-00152: 2)

Matrix clauses often follow reported speech clauses, and the matrix clause is usually verb-initial. Semi-direct speech is much like direct speech except that a subject pronoun coreferential to the speaker is third person reflexive (logophoric).

(91) Ingush⁸
ad-daa-c šie, eal-ar joax.
say-d.fut-neg log say-pst.wit quot
‘I won't tell you, he said.’ or ‘He said he wouldn’t tell him.’ (0408)

All other deictics remain as in the speech under report. In (92), the imperative d.aa ‘give (me)’ is one of very few verbs that index person (of the indirect object), and it retains its first person form even though the logophoric pronoun is third person:

(92) Ingush
cuo č’ʕoagha diexar dead-ar suo-ga, šii-na axča
3sg.erg very request d.make-pst.wit 1sg-all log-dat money
daa ean-na.
give.1sg say-ant.cvb
‘He begged me to lend him money.’

Unlike regular third person reflexives, logophoric ones do not control reflexivization of further coreferents in subordinate or chained clauses; but first and second person subjects control (non-logophoric) long-distance reflexivization (see Nichols, 2011, pp. 558–559).

Verbs meaning ‘know’, ‘believe’, and so on, take the subjunctive in Ingush and the cognate desiderative in Chechen (this is more common with these verbs than indicative finite complementation):

(93) Chechen
rašidana xaeʔara, naana šaa kiečam muuxa
Rashid.dat know.pst.wit mother 3sg.rfl preparation how
bina ħožur jujla.
b.do.perf watch.fut j.prs.sbd
‘Rashid knew that (his) mother would watch how he had prepared.’

8 joax is a hearsay evidential on the main verb ‘say’, because the speaker is telling about an event he heard about from someone else.

(94) Ingush
ʕaišiet-aa xou, suo-ga xa joaca-ljga.
Aisha-dat know 1sg-all time j.be:neg-sbjv
‘Aisha knows I have no time.’

(95) Ingush⁹
dolxaljga xoura txuona, txo kulaakaž bar.
d.go.sbjv know.ipfv 1pl.excl.dat 1pl.excl kulak.pl b.be.pst
‘We knew we were going, we were kulaks.’ (0238A.10)

The verbal noun, or masdar, is used with a number of verbs. It heads a nominalized clause whose arguments all appear in their regular cases. The masdar itself is in the case governed by the main verb or the construction.

(96) Ingush (NB: ‘surprised’ takes a lative object)
muusaa cec+vealar taxan dogha delxaragh.
Musa surprise+v.lv.pst.wit today rain d.precipitate.msd.lat
‘Musa was surprised that it rained today.’

(97) Ingush
barkal xalda ħuona ʕa suona gho-daragh.
thanks be.opt 2sg.dat 2sg.erg 1sg.dat help-msd.lat
‘Thanks for helping me.’

(98) Chechen
šajxi, qeču mattaħ jaazjar qeču mattaħ
Shajxi other.obl language.loc write.nmlz other.obl language.loc
ojla jarca düezna duj-tie?
thought j.do.nmlz.ins connected d.prs.q-tag
‘Shajxi, writing in another language is connected with thinking in another language, isn’t it?’ (p34-00002: 242)

(99) Chechen
cundeela laaramza daac bilggal šu qajqar.
therefore accidental d.neg exactly 2pl call.inf.nmlz
‘Therefore it is not accidental that it was namely you who have been invited.’ (p34-00002: 7)

Infinitives are used much as in European languages, but only in same-subject complements; there is nothing like raising in Chechen or Ingush. Infinitives are required by modals and phasal verbs.

9 In predicate nominal constructions the copula agrees in gender/number with the predicate noun, not the subject. In the second clause here kulaakaž is a human plural noun, therefore B gender; txo ‘we’ is first person, so D gender.

(100) Ingush
ciga dʕa-vaxa magac=ii ħuona?
there deic-v.go.inf can.neg=q 2sg.dat
‘Couldn’t you go over there?’

(101) Ingush
čaarx c’eaxxaa qesta juola-jalar
wheel suddenly turn-inf j.begin-j.inc-pst.wit
‘The wheel suddenly started turning.’

With two of these verbs, there is what Nichols (2011, pp. 478–480, 553–554) calls case attraction: the case of the main-clause subject is determined by the transitivity of the infinitive.

(102) Ingush¹⁰
muusaa čy-v.aa meg.
Musa in-v.go.inf be.able.prs
‘Maybe Musa will come home.’ or ‘It may be that Musa will come home.’

(103) Ingush¹¹
aaz yz televiizar ieca meg.
1sg.erg dem TV buy-inf be.able.prs
‘Maybe I’ll buy this TV.’

In Ingush, unlike Dagestanian languages with similar case alternations, there is no evidence (in prosody, word order, scope of negation, etc.) that constructions with case attraction are monoclausal, other than case attraction itself (Nichols, 2011, pp. 479–480).

10 Nominative subject of meg with intransitive ‘go’.
11 Ergative subject of meg with transitive ‘buy’.

8.7.5 Adverbial Subordination

For the most part, adverbial subordinate clauses use converbs, of which there are about two dozen with meanings such as ‘while’, ‘when, after’, ‘before’, ‘although’, and others. Table 8.20 lists some of the main ones for Chechen; a full list for Ingush is in Nichols (2011, pp. 297–308, 794). Some examples of temporal clauses:

(104) Ingush
max t’iera ʕa my beallangeħ aaz seina mašen iecagjy.
price down deic emph b.go.cvb 1sg.erg 1sg.refl.dat car buy.fut.j
‘As soon as prices go down I’ll buy a car.’

Table 8.20 Selection of Subordinating Converb Suffixes in Chechen

Suffix        Meaning
-ča           temporal ‘when’ (perfective)
-lie          ‘before’
-alc          ‘until’ with focus gemination (section 8.2.1)
-(n)ie        ‘as soon as’
-aħ           irrealis, potential
-ššieħ        ‘even though,’ ‘starting from’
-čox          manner
-čuol/-čul    comparative
-al           extent
-čieħ         locative

(105) Chechen
k’orda-dall-alc louzu-š, lüːču-š a ʕii-na
bore-d.lv-until.cvb play-sim.cvb bathe-sim.cvb and rest-ant.cvb
staancie juxa-daexk-ira txo.
station-all back-d.come-pst.wit 1pl
‘We played until we got bored, bathed, and then returned to the station.’

Causal clauses take a causal converb and may additionally have a conjunction ‘because’ in the main clause:

(106) Ingush
aaz derriga ursaž jaašjkaa=čy cħan
1sg.erg d.all knife.pl drawer=in together
ʕa=čy-dexkan-dea, ursaž sixa earh-lu.
deic=in-d.put:pl-causal.cvb knife.pl fast dull-vblz.prs
‘Because I put all the knives together in the drawer they get dull fast.’

(107) Ingush (NB: ‘because’ is literally ‘if/when [you] ask why’)
c’agha šiila jy hana_ealča, uq šera pišjk
house.adv cold j.is because dem year.adv furnace
toa-j.ea-joacan-dea.
repair-j.caus-neg-causal.cvb
‘It’s cold in the house because we didn’t repair the furnace this year.’

Conditional converbs have a dedicated ending -ie. Finite conditionals use a past ending on a future stem.

(108) Ingush
ħaai kert-aa doala-die, ħie q’uonax valie.
2sg.refl.gen head-dat d.go-d.caus.imp 2sg.refl man v.be.cond.cvb
‘Control your head if you’re a man.’

(109) Ingush
suoga axča dalaarie, so Jivroopie ghog-jar.
1sg.all money d.be.irr.cvb 1sg Europe.adv go:fut-j.cond
‘If I had money I’d go to Europe.’

There is also a subordinating conjunction nagaħ ‘if ’ or nagaħ sanna ‘if like/as’, used together with a conditional converb:

(110) Chechen
nagaħ sanna vajn ʕiedal a delaħ, respublika
if if 1pl.incl.gen power and d.sbjv.cond republic
a jelaħ, ištta programma čeqjaaqqa jiezaš
and j.sbjv.cond such program through.j.bring.inf j.need.sim.cvb
ju-q vaj.
j.prs-ptcl 1pl.incl
‘If the power is in our hands, and if we are a Republic, we have to complete such a program.’ (p34-00002: 208)

(111) Ingush
t’aaqqa, nagaħ t’ormig cynca balie, t’ormig=’a ħa-ec ʕa.
so if suitcase 3sg-ins b.be.irr.cvb suitcase=and deic-take 2sg.erg
‘If he has a suitcase with him, you take it too.’ (0415.12)

8.7.6 Long-Distance Anaphora

Both languages make systematic and frequent use of long-distance reflexivization. The subject of any clause can bind any coreferent in any lower subordinate clause; a reflexive cannot be bound by a lower antecedent. There are some constraints related to person, animacy, and interference from other potential controllers, but apart from these constraints, long-distance reflexivization is completely regular. For details and many examples, see Nichols (2011, pp. 645–658). In (112)–(115), lower clauses are bracketed, an underscore marks a relativization gap, and Ø is an optional null third person anaphoric pronoun.

(112) Ingush (Adverbial Converbal Subordinate)
[aaz šiigaᵢ telefon tiexa-ča,] muusaaᵢ čy-vaxar.
1sg.erg 3sg.refl.all phone strike-temp.cvb Musa in-v.went
‘When I phoned himᵢ Musaᵢ went home.’

(113) Ingush (Finite Complement)
[sieᵢ myčaa=i] xaac suonaᵢ
1sg.refl where=j.be know.neg 1sg.dat
‘I don’t know where I am.’ (0398B.33)

(114) Ingush (Chained Clause)
kuotamažᵢ yštta [q’easttaa šoažtaᵢ guonaħa kart=’a jea]
chicken-pl thus separately 3pl.refl.dat around fence=and j.make.cvb
dʕa=čy-joxkacar, joxkarii?
deic=in-j.insert:pl.neg.pst.wit j.insert:pl.pst.wit.q
‘Didn’t he fence the chickens off (in a separate cage) (out of the garden)?’ (lit. ‘Didn't he [build a fence around themselves and] keep the chickens separately?’) (0409.22)

(115) Ingush (Relative)
[__ᵢ šiighᵢ ħearčaa] dieqar dʕa=dannad Øᵢ.
3sg.refl.lat enmesh.rel debt deic=d.give.pst.nwit.d 3sg.erg
‘He paid off his debts.’ (lit. ‘He paid off the debts that enmeshed himself.’)

Reflexives from different clauses can intervene with each other, but there can never be more than one long-distance reflexive in a clause. In (116), ‘he’ in the main clause and ‘I’ in the lower clause both control their respective reflexives. The middle clause cannot have a first person reflexive, since it has another reflexive (a non-reflexive first person pronoun would be possible, but then the first person in the matrix clause would be zero, to avoid multiple tokens of coreferential anaphoric pronouns; see section 8.6.8).

(116) Ingush
cynna xou, [ [Ø šie bʕarjga+vei-na] suona xoza+xieta-ljga.]
3sg.dat know 1sg 3sg.refl eye+v.see-ant.cvb 1sg.dat glad+lv-sbjv
‘He knows I'm glad I saw him.’

8.8 Open Questions

8.8.1 The it-Cleft

The example in (117) shows a construction that looks suspiciously like a relative clause, but it is not; it is a temporal it-cleft.

(117) Chechen
[vaj i düːcuš dolu] duqa xaan ju.
1pl.incl dem speak.prs.sim.cvb d.rel much time j.prs
‘We have been talking for a long time about it.’ (lit. ‘It has been a long time that we have been saying this.’) (p34-00002: 31)

The construction is biclausal. The main clause is duqa xaan ju ‘[it] is a long time’, where English needs to supply a pronominal subject. Chechen and Ingush do not require an expletive. If a subject were present, it would be a generic time denotation such as xaan ‘time’: xaan duqa [xaan] ju ‘the time is long’.¹² The subjectless main clause links with the relative clause vaj i düːcuš dolu ‘that we have been saying this’ in two ways: (i) the gap in the relative clause duqa xeenaħ ‘for a long time’ matches the NP complement of the main clause duqa xaan ‘a long time’; (ii) the head of the relative clause is the not overtly specified generic time denotation xaan ‘time’ that is the implied subject of the main clause.

There are a number of variations in which Chechen it-clefts can occur, as explained in Komen (2015). The function of Chechen it-clefts appears to be exclusively text structuring. A temporal it-cleft in Chechen either sets out a story or provides a clear transition to a new paragraph. This same function also appears to be present in it-clefts found in Norwegian and Swedish, but those languages can also have it-clefts as a focusing strategy, which has not been reported for Chechen (so far).

8.8.2 Other Clefting

A common way to question an interrogative pronoun is with clefting. This is near-obligatory when the pronoun would otherwise be in an oblique case. The pronoun, regardless of the syntactic role and case it would otherwise have, is the nominative subject of ‘be’, and the clause is clefted with a nominalized (headless) relative.

(118) Ingush
ʕa-j.iežaar fy=j?
down-j.fall.ptcp.nmlz what=j.be
‘What fell?’ (lit. ‘What is it that fell?’ or ‘What is the one that fell?’)

(119) Ingush
ʕa-v.iežaar mala=v?
down-v.fall.ptcp.nmlz who=v.be
‘Who fell?’ (lit. ‘Who is it that fell?’ or ‘Who is the one that fell?’)

(120) Ingush
je kinašjka jaaz-d.ear mala=v?
dem book write-d.ptcp.nmlz who=v.be
‘Who wrote the book?’

12 The xaan that is part of the complement can then be left out, which is normal for ellipsis. Other main clauses for it-clefts like this may have time specifications like sho ‘year’, e.g., hinca qo sho du ‘[it] is now three years’. Rephrasing these in Chechen with an overt subject does not result in ellipsis, e.g., xaan hinca qo sho du ‘the time is now three years’.


8.8.3 The Proximal Demonstrative Referring to the VIP

The independent proximal demonstrative hara ‘this’ (and its case inflections, see section 8.4.6) is prototypically used deictically to refer to someone standing nearby, as in (121), which is taken from the Chechen narrative Beshtuo (Baduev, 1991).

(121) Chechen
txovsa hara dʕajiga üš sħabaaghaħ,
tonight dem away.j.lead.inf 3pl here.b.come.cond
quo cħa jühwʕaeržuo jarna qieram bu.
dem.erg one shameful.thing j.do.inf.nmlz.dat danger b.prs
‘When they come to lead her away tonight, there is danger she will do something shameful.’ (Baduev, 1991, p. 268)

The context includes Vahid speaking about his sister Busana, who is standing nearby. This sentence introduces his reasoning on what they should do to her to make sure she marries someone (apparently against her will). The proximal demonstratives hara ‘this one’ and quo, an ergative form of hara, are used deictically. Example (122) shows a non-deictic use of the proximal demonstrative pronoun.

(122) Chechen
busanaᵢ beštuoga ħaežira. qunnaᵢ šiena bolu
Busana Beshtuo.all look.pfv dem.erg 3sg.refl.dat b.rel
cataam c’eħħana bicbelira.
sadness suddenly b.forget.pfv
‘Busana looked at Beshtuo. Suddenly she forgot her distress.’ (Baduev, 1991, p. 257)

The author could have used a personal pronoun cunna ‘she’ to refer back to the subject of the preceding clause, Busana, but chose to use the proximal demonstrative instead. Baduev only starts using proximal demonstratives to refer to people after they have been introduced and have become main players in the narrative. Such participants can be termed the ‘VIP’. The proximal demonstrative, in its VIP usage, does not need its antecedent to be in a particular syntactic position while referring to it (see Baduev, 1991, p. 261).

In a probably related function, Ingush discourse often assigns the proximal deictic to one participant and the neutral deictic or non-deictic vož ‘the other’ to another, without reference to spatial proximity, but based on viewpoint or topicality (Nichols, 2011, pp. 658–660).

8.8.4 Radical P Alignment

Is the nominative of (43) an O (in an A-less clause) or an S (in an intransitive clause)? If the latter, there has been conversion or zero derivation of an intransitive from a transitive. If the former, the construction is similar to the impersonal passives of, for example, Finnish or some Tungusic languages, where the O retains its accusative case, the A is absent, verb morphology marks the derivation, and the verb does not agree with the O. Differences are that in Chechen and Ingush, no verb morphology marks the derivation, gender agreement is not lost, and the ergative case alignment makes it impossible to say on morphological grounds whether the input O is an output O or S.

8.8.5 Schwa-Zero Alternations

In sequences of two syllables from both languages and all dialects, one schwa is generally reduced to a brief whisper or aspiration of the preceding consonant, or elided entirely, while the other is fully vocalized with its normal pronunciation. Details differ (among varieties, among speakers, and even among utterances of the same speaker), but the general principle seems to be that a schwa is reduced in a syllable before a full vowel and remains a full vowel before a reduced schwa:

(123) Ingush
a. ħa=čy-v.eanna=v {. . . čə . . . } [ħačəvænnu:]
deic=in-v.go.pst.nwit=v
‘has come in’

b. ħa=čy=’a v.eanna { . . . čə=‘ə . . . } [ħačyʔvænnə]
deic=in=and v.go.ant.cvb
‘having come in’

This is similar to the schwa-zero alternations of French, Tundra Nenets, or medieval Slavic. In the orthographies and in the transcriptions used here, all schwas are written (as “a”) regardless of their pronunciation. This is the uncontested choice of linguists, lexicographers, and practical grammarians, but it makes the spelling system difficult to learn and difficult for native speakers to master unless they have extensive exposure in school (as French speakers, of course, do, but many, perhaps most, native speakers of Chechen and Ingush today do not). The cross-linguistic variation is phonologically interesting and understudied, and a challenge of theoretical and applied interest is how to devise a spelling system for these alternations.
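The general principle stated above (a schwa is reduced before a full vowel and stays full before a reduced schwa) can be sketched as a right-to-left pass over syllable nuclei. This is only a schematic illustration, not an analysis proposed in the chapter: the function name `reduce_schwas`, the flat list of nuclei, and in particular the treatment of a schwa in the last syllable as full are assumptions introduced for the example, and the rule is applied categorically although the text stresses that details vary.

```python
def reduce_schwas(nuclei, schwa="ə"):
    """Return one flag per syllable nucleus: True means 'reduced schwa'.

    Principle illustrated (from the text): a schwa is reduced before a
    syllable with a full vowel and stays full before a reduced schwa.
    Assumption (not from the text): a schwa in the last syllable is full.
    """
    reduced = [False] * len(nuclei)
    # Go from the end of the word backwards, because each decision
    # depends on the status of the following syllable.
    for i in range(len(nuclei) - 2, -1, -1):
        if nuclei[i] == schwa:
            reduced[i] = not reduced[i + 1]
    return reduced


# Nuclei of (123b), schematically: a - ə - ə - æ - ə
print(reduce_schwas(["a", "ə", "ə", "æ", "ə"]))
```

On the schematic nuclei of (123b), only the second schwa is flagged as reduced, which corresponds to the transcription [ħačyʔvænnə], where the schwa after the glottal stop is not pronounced.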

8.8.6 Second Person in Long-Distance Reflexivization

In Ingush, coreference between second person pronouns in different clause chains makes long-distance reflexivization ungrammatical when it would be grammatical with other persons (Nichols, 2011, pp. 650–651). The effect seems to be consistent across speakers. It is triggered not by second person forms but by second person reference, as it applies also to the first person inclusive. The reasons are unknown.

Part III

NORTHWEST CAUCASIAN LANGUAGES

Chapter 9

The Northwest Caucasian Languages

Peter Arkadiev and Yury Lander

9.1 The Languages and Their Speakers

The Northwest Caucasian (NWC) family, also known as West Caucasian or Abkhaz-Adyghe, comprises five languages, which are grouped into three branches:

(1) a. Abaza-Abkhaz:
       i. Abkhaz (including the Sadz, Ahchypsy, Bzyp, Tsabal, and Abzhywa dialects)
       ii. Abaza (nominally including the Tapanta and Ashkharywa dialects)
    b. Ubykh
    c. Circassian:
       i. Adyghe/West Circassian¹ (with the Bzhedugh, Shapsugh, Abzakh/Abadzekh, and Temirgoy dialects)
       ii. Kabardian (also known as East Circassian and including the Besleney, Baksan, Mozdok, Malka, Terek, and Kuban dialects)

This division is based on linguistic and sociolinguistic considerations and involves some simplifications. For example, Adyghe and Kabardian are often considered by their speakers to constitute a single Adyghe (or Circassian) language (despite the absence of mutual intelligibility), and some dialects of Abkhaz (e.g., Sadz) and Adyghe (e.g., Shapsugh) may be treated as separate languages. Nonetheless, in the sections that follow we use the language list as given in (1), with the proviso that, whenever possible, we overtly mark the variety referred to, with the exception of Abaza, whose examples always represent the Tapanta dialect.

¹ The term “Adyghe” is used now quite widely but somewhat erroneously, since the corresponding local term refers to all Circassians, including both West Circassians and East Circassians (Kabardians). In Russian linguistic terminology, the problem is solved by contrasting the adjectives адыгский ‘Circassian’ and адыгейский ‘West Circassian’, but in English no such contrast exists. In this chapter, we use the term “Adyghe” following the conventions of the handbook. However, it is more accurate to use “West Circassian,” as we ourselves and some of our predecessors have done in numerous publications. We recommend using West Circassian in future work.

Speakers of NWC languages traditionally inhabited areas to the north and partly to the south of the western part of the Caucasian Ridge including the northeastern coast of the Black Sea. The situation changed drastically in the middle of the 19th century, when many Circassian, Abkhaz-Abaza, and Ubykh communities migrated to the Ottoman Empire after their lands were occupied by the Russian Empire. Speakers of NWC languages now live not only in the Northwest Caucasus per se (primarily in the Russian regions of Adygea, Karachaevo-Cherkesia, Kabardino-Balkaria, and Krasnodarsky Kray, and in the de facto independent Republic of Abkhazia), but also within a massive diaspora in Turkey, Syria, Jordan, and Israel. Outside the Caucasus, the people of this diaspora are referred to as the Cherkess, i.e., Circassians, irrespective of their actual origin.

According to the 2010 Russian census, Adyghe in Russia has about 117,500 speakers; Kabardian, about 515,700 speakers; and Abaza, about 38,000 speakers. The number of Abkhaz speakers in Abkhazia and in Russia is about 100,000. There are no parallel data for the diaspora, mainly because of the complicated status of the NWC languages in Turkey (see below in this section). Tevfik Esenç, the last competent speaker of Ubykh, died in Turkey in 1992.

In the Russian Federation, Adyghe is one of the official languages in the Republic of Adygea, and is also used in Krasnodarsky Kray, where its position is much less healthy.
Kabardian is one of the official languages in Kabardino-Balkaria and in Karachaevo-Cherkesia; the latter is also home to Abaza. All these languages are taught at school, are represented in local media, and have literature, mostly published in partly standardized varieties which use a Cyrillic-based orthography with many digraphs and even trigraphs (see section 9.2.1). Abkhaz is the state language of the de facto independent Republic of Abkhazia, where it is represented in media and literature (including academic publications). All these languages are under considerable pressure from Russian, both in the Russian Federation and in Abkhazia, especially in urban areas.

In Turkey, where the number of the representatives of the NWC peoples exceeds their number in Russia, the use of their languages was much more restricted for political reasons. That is why Ubykhs lost their language completely; however, they were mostly bilingual in Circassian even before migration. Outside Turkey, the Abkhaz-Abaza communities have shifted to Circassian or Arabic. Over the last decades there have been numerous attempts to revive NWC languages in all of these countries.

For long-range genealogical comparisons involving NWC languages, see chapter 1. A number of studies compare languages within the family from both historical and typological perspectives: Dumézil (1932), Shakryl (1971), Colarusso (1988), a series of monographs by Kumakhov (1964, 1971, 1981, 1989) and Kumakhov and Vamling (2009) on Circassian, and Chkadua (1970) on Abkhaz-Abaza, to mention just a few.

Systematic studies of NWC languages started in the 19th century with Peter Uslar’s grammar of Abkhaz. Uslar left insightful notes on other NWC languages including Ubykh (Uslar, 1887). In the 20th and 21st centuries, NWC languages (especially their standardized varieties) obtained a number of detailed grammatical descriptions, mainly in Russian but also in some other European languages (as well as in NWC languages themselves). Compare Russian grammatical descriptions: Jakovlev and Ashkhamaf (1941), Rogava and Kerasheva (1966), and Zekokh (2004) for Adyghe; Abitov et al. (1957), Bagov, Balkarov, Kuasheva, Kumakhov, and Rogava (1970), Jakovlev (1948), Kumakhov, Apazhev, Bizhoev, Zekoreev, & Taov (2006), and Turchaninov and Tsagov (1940) for Kabardian; Aristava, Bgazhba, Tsikolia, Chkadua, and Shakryl (1968) and Jakovlev (2006) for Abkhaz; Lomtatidze (2006; also in Georgian, which is the original version of her sketch published in English in 1989) and Tabulova (1976) for Abaza. The descriptions in other languages include sketches by Colarusso (1989) on Kabardian; Hewitt (1989) on Abkhaz; Lomtatidze & Klychev (1989) on Abaza; Paris (1989) on Abzakh Adyghe; Charachidzé (1989) on Ubykh; Abkhaz grammars by Hewitt (1979a) and Chirikba (2003a); Kabardian grammars by Colarusso (1992, 2006) and Matasović (2010a); and Ubykh grammars by Dirr (1928c), Dumézil (1931), von Mészáros (1934), and especially Fenwick (2011), who summarized the previous research. In addition to these sources, there are numerous papers and monographs devoted to specific aspects of NWC languages, as well as numerous descriptions of dialects and local varieties of the languages of the family.
The electronic corpora of NWC languages include an Abkhaz corpus which is not tagged, and a West Circassian annotated corpus allowing search based on specific morphological information (Arkhangelskiy and Lander, 2016). In this chapter, whenever no source is given, examples either come from the authors’ field notes or are taken from texts, including those in one of the corpora.

9.2 Phonetics and Phonology

For detailed information and an extensive bibliography on segmental inventories, see chapter 15. This section briefly outlines the most salient facts as well as the conventions of phonological representation that we adhere to. For a detailed description based on instrumental analysis, see Colarusso (1988) on the family in general, Gordon and Applebaum (2013) on Circassian in general, Höhlig (2003) on West Circassian, Paris (1974) and Gordon and Applebaum (2006) on Turkish Kabardian, and Vaux (2012) on Abkhaz.

9.2.1 Consonants

NWC consonant inventories are among the richest in the world, ranging from about 50 in standard Kabardian to more than 80 in Ubykh. This is due to a large number of sibilant fricatives (cf. Paschen, 2015) and affricates as well as to secondary articulations such as labialization and palatalization (and, in Ubykh only, pharyngealization). The typical system of plosives distinguishes three series: voiced, ejective, and voiceless (often with a non-distinctive aspiration), but Bzhedugh and Shapsugh dialects of Adyghe feature a four-way system contrasting plain and aspirated plosives, reconstructed back to Proto-Circassian (Chirikba, 1996, pp. 109–117; Kuipers, 1963, pp. 69–71; Kumakhov, 1981, pp. 121–141; Paschen, 2019). Consider Bzhedugh tʰәʁe ‘given’ versus tәʁe ‘sun’. This contrast is observed not only in stops and affricates but in fricatives as well (cf. Bzhedugh šʼʰe ‘milk’ vs. šʼe ‘sell’). Ubykh has uvular stops and fricatives distinguishing plain, palatalized, labialized, pharyngealized, and pharyngo-labialized series.

Sonorant inventories are, by contrast, poor, being limited to just /j/, /w/, /n/, /m/, and /r/, with /l/ present only in Abkhaz-Abaza and Ubykh, and /ɥ/ only in Abkhaz and also in some varieties of Abaza. In Circassian, except for the Shapsugh varieties near the Black Sea and possibly some other varieties in closer contact with Russian, the voiced lateral is a fricative /ɮ/ rather than an approximant.

The systems of sibilant fricatives and affricates in NWC are particularly rich, distinguishing four points of articulation (in eastern Kabardian dialects and in the standard language, the system is reduced to three), whose characterization is not uncontroversial (see, e.g., Ladefoged and Maddieson, 1996, pp. 161–163).
Traditional (Russian-oriented) grammars distinguish between dental s /s/, c /ʦ/, alveolar ŝ /ɕ/, ĉ /ʨ/, plain postalveolar š /ʃ/, č /ʧ/, and palatalized postalveolar š’ /ʃʲ/, č’ /ʧʲ/ series (Höhlig, 2003; Rogava and Kerasheva, 1966, pp. 30–34, 38–40). Colarusso (1988, pp. xxvi, 18, 33) identifies these as lamino-dental, alveopalatal, apico-palato-alveolar, and lamino-palato-alveolar, respectively, while Hewitt (2005a, pp. 94–98) calls them alveolar, alveolo-palatal, retroflex, and palato-alveolar. Yet Catford (1977) proposes a different classification which is uncritically accepted by Beguš (chapter 15 of this volume).

NWC languages boast many rare consonants, such as the Circassian glottalized fricatives, e.g., the “hissing-hushing” /ṣ̂/² and the lateral /λ̣/, attested in both Adyghe and Kabardian, and the mutually corresponding Adyghe labialized alveolar /ṣ̂w/ and Kabardian labio-dental /f̣/. No less exotic are palatalized uvular stops and fricatives attested in Abkhaz-Abaza and Ubykh (Colarusso, 1988, pp. 219–292), the palatalized glottal stop /ʔ’/ in the Abzakh dialect of Adyghe (Kumakhova, 1972, pp. 15, 48), and the Abkhaz labial-palatal approximant /ɥ/.

² Ladefoged and Maddieson (1996, p. 161), with reference to Catford’s work.

Given the dearth of fully reliable and comparable instrumental studies for all NWC varieties and a discrepancy between different sources, we refrain from using IPA (International Phonetic Alphabet) symbols, reverting to the traditional Caucasological phonemic transcription employed in Smeets (1984) and Testelets (2009a). Tables 9.1 through 9.5 represent consonantal systems of Standard Adyghe, Standard Kabardian, Standard Abkhaz, Tapanta Abaza, and Ubykh (in the absence of an orthography for Ubykh, we use the transcription in Fenwick (2011) as a reference point; Fenwick’s symbols are shown only when different from those used in this chapter).³ Phonemes attested only in loans appear in parentheses.

Table 9.1 Consonants: Standard Adyghe

Plosives

–voice

+glottal

+voice

–voice

+glottal

+voice

nasals

resonants

Labial Labialized

p

ṗ ṗw

b

f

(v)

m

w

Dental Labialized

t

d

s

z

n

r

Affricates

c

ṭ ṭẉ c̣

ʒ

ʒ̂w

“Hissing-hushing” Labialized čw

Fricatives

Sonorants

ŝ ŝw

ṣ̂ ṣ̂w

ẑ ẑw

š š’

ž ž'

Palato-alveolar Palatalized

č č’

č̣ č̣’

ǯ ǯ’

Lateral

λ

λ̣

ɮ

Palatal

j

Velar Labialized

k kw

ḳ ḳw

gw

x

γ

Uvular Labialized

q qw

χ χw

ʁ ʁw

Pharyngeal

h

Laryngeal Labialized

ʔ ʔw

³ For correspondences between the sounds presented here and the Cyrillic orthography adopted for the languages other than Ubykh, see Appendix II.

Table 9.2 Consonants: Standard Kabardian

Plosives

Fricatives

Sonorants

–voice

+glottal

+voice

–voice

+glottal

+voice

nasals

resonants

Labial

p

ṗ

b

f

f̣

v

m

w

Dental Affricates

t c

ṭ c̣

d ʒ

s

z

n

r

“Hissing-hushing”

ŝ

ṣ̂

ẑ

Palato-alveolar

č

č̣

ǯ

š

ž

Lateral

λ

λ̣

ɮ

Palatal

j

Velar Labialized

k kw

ḳ ḳw

gw

x xw

γ

Uvular Labialized

qχ qχw

q̇ q̇ʷ

χ χw

ʁ ʁw

Pharyngeal Laryngeal Labialized

ʔ ʔw

h

9.2.2 Vowels

In contrast to the exuberant consonantal inventories, the vocalic systems of NWC are quantitatively reduced, although qualitatively quite complex. Abkhaz and Abaza have only two vowel phonemes, low /a/ (ɐ) and (mid-)high /ə/ (ɨ); Ubykh and Circassian also have the mid-low /e/ (ɜ).⁴ Such “vertical” vocalic systems, first posited for Kabardian in Jakovlev (1923), with members displaying minimal contrast, are typologically rare. It is therefore no surprise that the NWC vocalic systems have received much attention in the literature, with divergent views on their composition (see Hewitt, 2005a, pp. 99–100). Kuipers (1960) and Allen (1965b), followed by Anderson (1978), posit a one-vowel system for Circassian and Ubykh, arguing that the surface vocalic contrasts are determined positionally. Kumakhov (1977) and later Choi (1991) and Catford (1997, pp. 99–102) argue for a three-vowel system in Circassian, as does Fenwick (2011, pp. 24–27) for Ubykh. Two-vowel analyses for Circassian, collapsing /e/ and /a/, have also been proposed; see Colarusso (1988, pp. 294, 312–329), Halle (1970), and Jakovlev (1923).

⁴ The symbols used by Fenwick (2011) for Ubykh appear in parentheses.

Table 9.3 Consonants: Standard Abkhaz

Plosives

–voice +glottal +voice –voice +glottal +voice nasals resonants

Labial

p

ṗ

b

(f)

(v)

m

w

Dental Labialized Affricates

t tw c

ṭ ṭw c̣

d dw ʒ

s

z

n

r

č̣

ʒ̂

“Hissing-hushing” labialized č

w

Fricatives

w

w

Sonorants

Palato-alveolar Palatalized Labialized

č č’

č̣ č̣’

ǯ ǯ’

š š’ šw

ž ž’ žw

Lateral

l

Palatal

j ɥ

Velar Palatalized Labialized

k k’ kw

ḳ ḳ’ ḳw

g g’ gw

Uvular Palatalized Labialized

q̇ q̇’ q̇w

χ χ’ χw

ʁ ʁ’ ʁw

Pharyngeal Labialized

h hw

Any theory positing fewer than three vowel phonemes is falsified by the existence of unquestionable minimal pairs (cf. Besleney Kabardian šxә ‘eat it!’ ~ šxe ‘eat! (antipassive)’ ~ šxa ‘s/he ate (antipassive)’, or Ubykh asš’ә́n ‘I reap it’ ~ asš’én ‘I milk it’ ~ asš’án ‘I milk/reap them’) (Fenwick, 2011, p. 25, after Dumézil, 1965, p. 202). Dispensing with such pairs can be achieved by postulating covert consonants and additional phonological rules. Such analyses are not entirely unmotivated, since the distribution of vocalic contrasts in NWC is fairly restricted. Thus, in Circassian and Ubykh /a/ and /e/ are neutralized to /a/ word-initially (in Ubykh also word-finally; Fenwick, 2011, pp. 26–27), and in Circassian /a/ is derived from /e/ by a morphophonological rule (see section 9.2.5), with “stable /a/” restricted to just a few morphemes. In any case, the contrast between /ə/ and /a/ in Abkhaz-Abaza and /e/ in Circassian and Ubykh has a clear functional load both in lexical roots and affixes. As Colarusso (1988, pp. 350–372) argues, it is impossible to predict the occurrence of /ə/ on the basis of syllable structure or morphological environment.
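The logic of the minimal-pair argument can be made concrete with a small sketch. The following Python fragment is a toy check, not a phonological analysis: the function name `contrasting_vowels`, the one-character-per-segment encoding, and the use of ə for the chapter's schwa symbol are assumptions made for the illustration, and the sketch says nothing about alternative analyses that posit covert consonants.

```python
from itertools import combinations

# The three vowel qualities under discussion (schwa written here as "ə").
VOWELS = {"ə", "e", "a"}


def contrasting_vowels(forms):
    """Collect vowels that are the sole difference between two forms."""
    found = set()
    for x, y in combinations(forms, 2):
        if len(x) != len(y):
            continue  # a minimal pair must have the same number of segments
        diffs = [(a, b) for a, b in zip(x, y) if a != b]
        # Exactly one differing position, and both segments there are vowels.
        if len(diffs) == 1 and set(diffs[0]) <= VOWELS:
            found.update(diffs[0])
    return found


# Besleney Kabardian forms cited above: 'eat it!', 'eat!', 's/he ate'
forms = ["šxə", "šxe", "šxa"]
print(contrasting_vowels(forms) == VOWELS)
```

Because every pair in the cited triple differs only in its vowel, all three qualities come out as mutually contrastive in the same frame, which is the observation that one- and two-vowel analyses have to explain away.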

Table 9.4 Consonants: Tapanta Abaza

Plosives

Fricatives

–voice

+glottal

+voice

–voice

+glottal

+voice

nasals

resonants

Labial

p

ṗ

b

(f)

(f̣)

(v)

m

w

Dental Affricates

t c

ṭ c̣

d ʒ

s

z

n

r

“Hissing-hushing” ĉ

ĉ̣

ŝ

ẑ

Palato-alveolar Palatalized

č č’

č̣ č̣’

ǯ ǯ’

š š’

ž ž’

Lateral

(λ)

(λ̣)

(ɮ’)

l

Palatal

j

Velar Palatalized Labialized

k k’ kw

ḳ ḳ’ ḳw

g g’ gw

Uvular Palatalized Labialized

q

χ χ’ χw

qw

q̇ q̇’ q̇w

Pharyngeal Labialized

h hw

Laryngeal

ʔ

ʕ ʕw

ʒ̂

Sonorants

ʁ ʁ’ ʁw

Basic contrasts have been contentious even within three-vowel systems. The contrast between /e/ and /a/ is especially controversial; Colarusso (1988), Hewitt (2005a), and Jakovlev (1923) characterize it as a quantitative opposition between /a/ and /aː/, respectively. At least for Circassian, this analysis is invalidated by instrumental studies (Choi, 1991; cf. also Catford, 1997, pp. 100–101), and by the presence of genuine, if marginal, quantitative contrasts. Compare in Bzhedugh (Paschen, 2014; Sitimova, 2004, pp. 26, 100–101)⁵:

(2) Bzhedugh Adyghe
a. q̇-aː-ḳwe
   cisl-dyn-go
   ‘s/he comes’
b. q̇a-ḳwe!
   cisl-go.imp
   ‘come!’

⁵ Throughout this chapter, examples are from our field notes, unless specified otherwise.

Table 9.5 Ubykh Consonants Plosives

Fricatives

−voice

+glottal

+voice

−voice

+glottal

+voice

nasals

resonants

Labial Pharyngealized

p pˁ

ṗ ṗˁ

b bˁ

f

v vˁ

m mˁ

w wˁ

Dental Labialized Affricates

t tw c

ṭ ṭ w c̣

s

z

n

r

“Hissing-hushing” Labialized

ĉ ĉw

ĉ̣ ĉ̣w

d dw ʒ

ŝ ŝw

ẑ ẑw

Palato-alveolar Palatalized Labialized

č č’

č̣ č̣’

ǯ ǯ’

š š’ šw

ž ž’ žw

Lateral

λ

λ̣

l

Palatal

j

Velar Palatalized Labialized

k k’ kw

ḳ ḳ’ ḳw

g g’ gw

x xw

γ

Uvular Palatalized Labialized Pharyngealized Lab.+pharyng. Laryngeal

q q’ qw qˁ qwˁ

q̇ q̇’ q̇w q̇ˁ q̇wˁ

χ χ’ χw χˁ χwˁ h

ʁ ʁ’ ʁw ʁˁ ʁwˁ

ʒ̂ ʒ̂w

Sonorants


Despite the dearth of vocalic contrasts in phonology, phonetically NWC languages have diverse vowel qualities due to the “coloring” of vowels by adjacent consonants (Colarusso, 1988, pp. 295–304 in general; Moroz, 2018, on Abaza). Thus, in Circassian /e/ and /ə/ are realized close to [o] and [u] after labialized consonants, while /e/ becomes almost indistinguishable from /a/ when adjacent to laryngeals. we/ew and wә/әw tend to be realized as [o] and [u], while je/ej and jә/әj as [e]⁶ and [i]. Word-initially and intervocalically, glides are preserved. These processes work differently across languages and dialects; thus, in Temirgoy Adyghe ew, ej, әw, and әj remain intact, while in Kabardian and Abkhaz-Abaza they undergo monophthongization.

Nasalized vowels are reported for Bzhedugh and Shapsugh dialects of Adyghe (Kerasheva, 1957a[1995], p. 231; Rogava and Kerasheva, 1966, p. 24) (cf. Bzhedugh psә̃ vs. Temirgoy psә ‘water’). Deriving these from original combinations with nasal consonants is problematic, since the cognates of the very few forms for which nasalized vowels are reported do not show any traces of final nasal consonants in other NWC varieties, and neither is nasal drop with vowel nasalization a synchronic phonological process in Bzhedugh and Shapsugh.
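The realization of glide-vowel sequences described earlier in this section (we/ew as [o], wә/әw as [u], je/ej as [e], jә/әj as [i], with glides preserved word-initially and intervocalically) can be summarized procedurally. The Python sketch below is an illustration under simplifying assumptions only: the function name `monophthongize`, the segment-list input, and the schematic test sequences are introduced here for the example and are not attested forms; the rule is applied categorically although the text describes a tendency; and varieties such as Temirgoy Adyghe, where these sequences remain intact, are not modeled.

```python
VOWELS = {"a", "e", "ə"}
GLIDES = {"w", "j"}
# (glide, vowel) -> monophthong, following the correspondences in the text.
MERGED = {("w", "e"): "o", ("w", "ə"): "u", ("j", "e"): "e", ("j", "ə"): "i"}


def monophthongize(segments):
    """Merge glide+vowel and vowel+glide sequences into one vowel.

    Glides are left alone word-initially and between two vowels.
    """
    out = []
    i, n = 0, len(segments)
    while i < n:
        seg = segments[i]
        prev = segments[i - 1] if i > 0 else None
        nxt = segments[i + 1] if i + 1 < n else None
        after = segments[i + 2] if i + 2 < n else None
        # Vowel + glide (ew, əw, ej, əj), unless the glide is intervocalic.
        if seg in VOWELS and nxt in GLIDES and (nxt, seg) in MERGED \
                and after not in VOWELS:
            out.append(MERGED[(nxt, seg)])
            i += 2
            continue
        # Glide + vowel (we, wə, je, jə), unless the glide is word-initial
        # or preceded by a vowel.
        if seg in GLIDES and i > 0 and prev not in VOWELS \
                and (seg, nxt) in MERGED:
            out.append(MERGED[(seg, nxt)])
            i += 2
            continue
        out.append(seg)
        i += 1
    return out


# Schematic sequences (not real words):
print(monophthongize(["t", "e", "w"]))   # vowel + glide merges
print(monophthongize(["w", "e", "t"]))   # word-initial glide is kept
print(monophthongize(["e", "w", "e"]))   # intervocalic glide is kept
```

The two guards in the function correspond directly to the two environments in which the text says glides are preserved; everything else about segmentation is left to the caller.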

9.2.3 Phonotactics and Syllable Structure

NWC languages show considerable variation in their phonotactics and syllable structure (cf. Moroz, 2019b, on Adyghe). The constraint against vocalic hiatus is the only general rule, at least if recent borrowings are excluded. The most common syllable type is C(C)V, but complex onsets and complex codas are well attested. Consonant clusters can appear inside a morpheme both in roots and affixes and across morpheme boundaries. Intramorphemic initial clusters are in most cases biconsonantal and decessive with all members sharing the features of voice and glottalization, e.g., Adyghe pχe ‘wood’ ~ bʁe ‘breast’ ~ ṭḳʷә ‘melt’ or Abaza ẑʕʷa ‘shoulder’ ~ ŝχə ‘carrot’. Accessive clusters are diachronically secondary (cf. Adyghe λfe vs. Kabardian λxʷe ‘give birth’). Intramorphemic triconsonantal clusters are rare (cf. Adyghe pske ‘cough’ or Ubykh tχre ‘break’; Fenwick, 2011, p. 27). At least in Circassian, most affixes have a CV structure, and those which feature consonant clusters, such as certain preverbs, clearly go back to lexical roots. Syllable- and word-final clusters usually result from the dropping of final /ə/ (cf. Adyghe je.pλ ‘look at it!’ ~ je.pλә.ʁ ‘s/he looked’). The range of consonant sequences created by morphological rules is much greater and includes typologically unusual ones:

(3) Besleney Kabardian
a. f̣e-v-bz-t (root bzә)
   mal-2pl.erg-cut-ipfv
   ‘you were slaughtering it’
b. je-t-t-t-jә (root tә)
   dat-1pl.erg-give-ipfv-add
   ‘because we gave it to him/her’

⁶ As opposed to more open [ɜ] in other environments.

Such complex clusters, especially the ones containing both voiced and unvoiced consonants, normally do not arise in Ubykh and Adyghe; at least in the latter this is due to the preservation of /ə/. In Abkhaz-Abaza, in contrast to the other NWC languages, non-syllabic sonorants can occur in word-initial clusters (4) but not in word-final clusters (5):

(4) Abaza
a. mʕʷa
   ‘road’
b. j-s-taqә́-ṗ
   3sg.n.abs-1sg.io-want-npst.decl
   ‘I want it’

(5) Abaza
sә-ĉ-ṭ ~ sә́-ĉә-n
1sg.abs-sleep.aor-decl   1sg.abs-sleep-pst.decl
‘I slept’

Vowels in word-peripheral positions are subject to a number of restrictions. In particular, /ə/ is impossible in the word-initial position across the entire family, with the exception of certain Adyghe dialects. We already mentioned that between /e/ and /a/, only /a/ is possible word-initially in Circassian and Ubykh, and even its occurrence is restricted to a few grammatical morphemes. For Circassian, one can argue that all a-initial roots have a prothetic glottal stop; in Kabardian, a-initial prefixes receive a prothetic /j/ word-initially:

(6) Besleney Kabardian
a. w-a-λeʁʷ-a
   2sg.abs-3pl.erg-see-pst
   ‘they saw you’
b. ja-λeʁʷ-a
   3pl.erg-see-pst
   ‘they saw it’

In Kabardian /ə/ does not occur word-finally except for monosyllables and the additive suffix -jә.
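The Kabardian prothesis pattern in (6) lends itself to a very small procedural statement. The sketch below is merely illustrative: the function name `add_prothetic_j` and the representation of a word as a list of morphs are assumptions made for the example, and the function presupposes that the first morph is a prefix (the text argues that a-initial roots behave differently, taking a prothetic glottal stop).

```python
def add_prothetic_j(morphs):
    """Add /j/ before a word-initial a-initial prefix (pattern in (6)).

    Assumption: morphs[0], if present, is a prefix rather than a root.
    """
    if morphs and morphs[0].startswith("a"):
        return ["j" + morphs[0]] + list(morphs[1:])
    return list(morphs)


# (6b): the 3pl.erg prefix a- is word-initial and surfaces as ja-
print("-".join(add_prothetic_j(["a", "λeʁʷ", "a"])))
# (6a): the same prefix is word-internal, so nothing changes
print("-".join(add_prothetic_j(["w", "a", "λeʁʷ", "a"])))
```

The second call shows why the rule has to refer to word position rather than to the prefix itself: the same 3pl.erg morph surfaces unchanged once another prefix precedes it.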

9.2.4 Stress and Prosody

For details of NWC stress, see chapter 16. All NWC languages have dynamic stress, although its perceptual salience differs across languages. With regard to stress assignment, there is a major division between the mobile morphologically determined stress in Abkhaz-Abaza and Ubykh versus the more fixed stress in Circassian, though even there the stress is bound to the morphologically determined stem rather than to the whole word.

For intonation, see chapter 17. Sentence intonation is one of the most under-investigated fields of NWC grammar. Some instrumental work has been recently done on the Kabardian varieties spoken in Turkey (see Applebaum, 2010, 2013; Applebaum and Gordon, 2007). These studies are in many respects inconclusive, primarily because they do not take into account the syntactic encoding of focus (for the latter, see Rygaev, 2016; Sumbatova, 2009b).

9.2.5 (Morpho)phonological Processes

Although phonological processes play an important role in NWC phonology and morphology, fusion and obliteration of morpheme boundaries are rare. Most processes other than those of surface phonology (e.g., the coloring of vowels by adjacent consonants mentioned in section 9.2.2) are at least partly morphologically conditioned. All NWC languages have consonant assimilation, which mostly affects personal prefixes, and vowel-hiatus resolution. Personal prefixes consisting of a single obstruent (in the non-absolutive series, see section 5.2) regressively assimilate their laryngeal features to those of the following consonants⁷ (cf. (7a) and (7b)):

(7) Besleney Kabardian
a. t-λeʁʷ-a
   1pl.erg-see-pst
   ‘we saw it’
b. d-ʁe-ḳʷ-a
   1pl.erg-caus-go-pst
   ‘we sent him/her’
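The regressive assimilation seen in (7) can be sketched as a choice between the voiceless and the voiced variant of a single-consonant prefix, conditioned by the following consonant. This Python fragment is a hedged illustration, not a description of any particular grammar: the function name `assimilate_prefix`, the list of variant pairs, and the set of voiced consonants are assumptions chosen to cover the examples at hand, and no claim is made about which variant is underlying. Before ejectives the sketch returns the voiceless variant, which leaves open the question of whether glottalization is also induced.

```python
# Voiceless/voiced variants of single-obstruent prefixes (illustrative only).
PAIRS = [("t", "d"), ("s", "z"), ("p", "b"), ("f", "v")]
# A partial, assumed set of voiced consonants in the chapter's transcription.
VOICED = {"b", "d", "g", "z", "ž", "ẑ", "v", "γ", "ʁ", "ɮ"}


def assimilate_prefix(prefix, next_consonant):
    """Pick the prefix variant that agrees in voicing with what follows."""
    for voiceless, voiced in PAIRS:
        if prefix in (voiceless, voiced):
            return voiced if next_consonant in VOICED else voiceless
    return prefix  # prefixes outside the table are returned unchanged


# (7a): 1pl.erg before voiceless λ; (7b): 1pl.erg before voiced ʁ
print(assimilate_prefix("t", "λ"), assimilate_prefix("t", "ʁ"))
```

Stating the rule symmetrically, as a choice between two variants, keeps the sketch neutral between a devoicing and a voicing analysis of the prefix.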

In Abkhaz-Abaza, all types of personal prefixes can consist of a single consonant; however, assimilation does not occur in the absolutive position (cf. (8)).

(8) Abaza
a. j-ʕ-ẑ-әj-ṭ
   3sg.n.abs-1pl.erg-boil-prs-decl
   ‘we boil it’
b. h-ž-әj-ṭ
   1pl.abs-dig-prs-decl
   ‘we dig’ (Tabulova, 1976, p. 114)

⁷ It is unclear whether the ejectives induce glottalization of the prefix or only its devoicing.

Akin to this assimilation is the intervocalic voicing of indirect object and ergative personal prefixes in Kabardian:

(9) Besleney Kabardian
a. f-je-ž-a
   2pl.abs-dat-wait-pst
   ‘you.pl waited for him/her’
b. q̇ә-v-e-ž-a
   cisl-2pl.io-dat-wait-pst
   ‘s/he waited for you.pl’

Otherwise there is no intervocalic voicing of consonants in Kabardian. Besides that, in Adyghe the prefixes of 1sg s- and 1pl t- fuse with the following sibilants yielding affricates, not attested otherwise (cf. (10)).

(10) Temirgoy Adyghe
ĉ̣e-r-ep < {s-ṣ̂e-r-ep}
1sg.erg-know-dyn-neg
‘I don’t know’ (Smeets, 1984, pp. 118–119)

The only instance of progressive assimilation is found in Abkhaz-Abaza and concerns the adverbial question prefix -ba, which turns into -pa after voiceless consonants (cf. (11)).

(11) Abkhaz
a. d-a-bá-ca-wa?
   3sg.h.abs-rel.loc-q.adv-go-ipfv
   ‘where does he go?’
b. wә-š-pá-q̇a-w?
   2sg.m.abs-rel.mnr-q.adv-live-prs.nfin
   ‘how are you?’ (Spruit, 1986, pp. 123–124)

Vowel sequences are normally disallowed at morpheme boundaries except for recent borrowings. Such sequences are resolved by the deletion of the higher of the two vowels, i.e., a > (e >) ә (cf. Ubykh bz-anṭé < {bzә-anṭé} water-snake ‘river eel’; Fenwick, 2011, p. 28, after Vogt, 1963, p. 92). The general rule, however, has exceptions; for example, in Temirgoy Adyghe the /e/ of the cislocative preverb deletes before the 3sg.erg prefix ә-: q-ә-ʔʷa-ʁ < {qe-ә-ʔʷe-ʁe} cisl-3sg.erg-say-pst ‘s/he said’. In Abaza, the second of the two vowels is preserved (cf. a-hʷ-әj-ṭ < {a-hʷa-әj-ṭ} 3sg.n.erg-say-prs-decl ‘it says’). In addition, there are instances of vowel coalescence associated with particular morphemes; thus, in Abkhaz and Abaza the imperfective suffix -wa coalesces with the final /a/ of the preceding morpheme (cf. (12)).

(12) Abkhaz
s-co-jṭ < {s-ca-wa-jṭ}
1sg.abs-go-ipfv-decl
‘I am going’ (Hewitt, 1979a, p. 267)

Such coalescence is impossible word-finally and before some non-finite endings (cf. (13)).

(13) Abaza
h-ca-wa
1pl.abs-go-ipfv
‘for us to go’

A number of morphophonemic processes involve affixes containing /j/. In Circassian vowels are deleted if followed by /jV/ (cf. (14)).

(14) Adyghe
q-j-e-ʔʷate < {qe-j-e-ʔʷate}
cisl-3sg.erg-dyn-tell
‘s/he tells’

In Kabardian this rule normally applies only to unstressed vowels (cf. (15)), and stem-internal vowels are preserved in both languages, as shown in (16).

(15) Besleney Kabardian
χʷ-á-jә
become-pst-add
‘it happened’

(16) Adyghe
de-ḳʷe-ja-ʁ
loc-go-up-pst
‘s/he went up’

When two j-prefixes occur in a sequence, the first one dissimilates to /r/ in Circassian (17a), and is dropped in Abkhaz-Abaza (17b). However, the absolutive relative prefix is preserved, as shown in (18). In Circassian, when the 3pl.io prefix a- is combined with a j-prefix which is not transformed into r-, metathesis occurs; contrast (19a) and (19b).

(17) a. Adyghe
        r-jә-ʔʷa-ʁ < {j[e]-jә-ʔʷa-ʁ}
        dat-3sg.erg-say-pst
        ‘s/he said to him/her’
     b. Abkhaz
        jә-l-to-jṭ < {jә-jә-l-ta-wa-jṭ}
        [3sg.n.abs-]3sg.m.io-3sg.f.erg-give-ipfv-decl
        ‘she gives it to him’ (Hewitt, 1979a, p. 267)

(18) Abkhaz
jə-j-hʷa-z
rel.abs-3sg.m.erg-say-pst.nfin
‘that what he said’ (Hewitt, 1979a, p. 267)

(19) Adyghe
a. a-r-jә-tә-ʁ.
   3pl.io-dat-3sg.erg-give-pst
   ‘S/he gave it to them’
b. j-a-s-tә-ʁ.
   dat-3pl.io-1sg.erg-give-pst
   ‘I gave it to them’

Another instance of consonant dissimilation concerns the allomorphy of the 3pl non-absolutive prefix in Abkhaz-Abaza, which is normally r(ә)-, but changes to d(ә)- before the homophonous causative prefix; compare (20a) and (20b). This dissimilation is not automatic: when two 3pl prefixes cooccur, both surface as r(ә)- (20c).

(20) Abkhaz
a. jә-r-bo-jṭ
   3sg.n.abs-3pl.erg-see.ipfv-decl
   ‘they see it’
b. jә-d-dә-r-bo-jṭ
   3sg.n.abs-3pl.io-3pl.erg-caus-see.ipfv-decl
   ‘they show it to them’ (Aristava, Bgazhba, Tsikolia, Chkadua, & Shakryl, 1968, p. 130)
c. jә-rә-r-to-jṭ
   3sg.n.abs-3pl.io-3pl.erg-give.ipfv-decl
   ‘they give it to them’ (Hewitt, 1979a, p. 266)

The doubling of the causative prefix itself does not result in dissimilation, either (cf. (21) in Abaza).

(21) Abaza
jә-w-sә-r-r-cu-šṭ
3sg.n.abs-2sg.m.io-1sg.erg-caus-caus-go.ipfv-fut.decl
‘I won’t let you lead it’ (Tabulova, 1976, p. 181)

Instances of haplology include the deletion of the ergative relativizer d(ә)- before the homophonous causative prefix in Ubykh (22a), and optional deletion of one of the causative prefixes in double causatives in Circassian (22b).

(22) a. Ubykh
        sә-[dә-]dә-ṗč̣’-ewt-ә́
        1sg.abs-[rel.erg-]caus.sg-guest-fut-nfin
        ‘the one who will give me hospitality’ (Fenwick, 2011, p. 29, after Dumézil, 1957, p. 64)
     b. Temirgoy Adyghe
        s-jә-(ʁe-)ʁe-č’anә-ʁ
        1sg.io⁸-3sg.erg-(caus-)caus-sharp-pst
        ‘S/he made me sharpen it.’ (Letuchiy, 2009b, p. 401)

A peculiar case of metathesis is found in Ubykh with the plural possessive prefix ew- prefixed to a-initial nouns (Fenwick, 2011, pp. 29, 49–50; cf. s-ew-č’ә́ (1sg.pr-pl-horse) ‘my horses’ vs. s-abˁé < {s-ew-abˁé} (1sg.pr-pl-sick) ‘my sick people’). Other apparent instances of metathesis involving /ə/ in Ubykh and Abkhaz-Abaza can be analyzed as “variant realizations of multiple instances of underlying” /ə/ (Fenwick, 2011, p. 29; cf. Abaza bzә ‘tongue’ vs. á-bәz def-tongue < {bәzә}). Circassian languages have two vocalic alternations determined by, and indicative of, morphological structure. The first one is the dissimilation e–e > a–e in the last disyllabic foot of the stem, which is the clearest indication of the stem boundary in Adyghe (see Arkadiev & Testelets, 2009, pp. 122–131; Smeets, 1984, pp. 206–211); in Kabardian the alternation is closely tied to stress, which in such contexts falls on the penultimate /a/ (