Grammaticalization from a Typological Perspective (Oxford Studies in Diachronic and Historical Linguistics) [Illustrated] 9780198795841, 019879584X

This volume explores the way in which grammaticalization processes - whereby lexical words eventually become markers of

English Pages 496 [493] Year 2019

Table of contents :
Cover
Grammaticalization from a Typological Perspective
Copyright
Contents
Series preface
Preface
List of abbreviations
The contributors
1: Introduction: Typology and grammaticalization
1.1 Introduction
1.2 Typological features influence grammaticalization
1.3 Grammaticalization as a possible explanation for typological features of languages
1.4 Conclusion and organization of the book
2: Grammaticalization in Africa: Two contrasting hypotheses
2.1 Introduction
2.1.1 A note on methodology
2.1.2 Grammaticalization studies on African languages: an overview
2.1.3 Two hypotheses
2.1.3.1 The parallel reduction hypothesis
2.1.3.2 The meaning-first hypothesis
2.1.4 The present chapter
2.2 Case studies from African languages
2.2.1 De-volitive proximatives
2.2.2 From body-part noun to reflexive marker
2.2.3 From action verb to comparative marker
2.2.4 A de-andative future
2.3 Observations on Sinitic languages
2.4 Discussion
2.5 Conclusions
Acknowledgements
3: Typological features of grammaticalization in Semitic
3.1 Introduction
3.1.1 Grammaticalization and Semitic
3.1.2 Scope of this chapter
3.1.3 Shared Semitic features
3.2 From body-part nominal to prepositional
3.3 From synthetic to analytic possessive strategy
3.4 From independent personal pronoun to verb agreement
3.4.1 Third-person pronouns as copula
3.4.2 Third-person pronouns as expletives
3.4.3 Primacy of third-person masculine singular in frequency
3.5 Conclusion
4: Grammaticalization and inflectionalization in Iranian
4.1 Introduction
4.2 The grammaticalization of person indexing in Iranian: subjects versus objects
4.2.1 From pronoun to agreement affix: the typological perspective
4.2.2 The clitic pronouns of Middle Iranian
4.2.3 Clitic pronouns as subjects (A)
4.2.4 Pronominal clitics as objects
4.2.5 Summary: The (non-)grammaticalization of Iranian clitic pronouns
4.3 Further topics in Iranian grammaticalization
4.3.1 The grammaticalization of direct object case markers
4.3.2 The grammaticalization of auxiliaries
4.3.3 Continuous aspect from ‘have’: Colloquial Persian dastan
4.4 Summary
5: Grammaticalization in the languages of Europe
5.1 Introduction
5.2 Europe as a linguistic area
5.3 Grammaticalization processes in European languages
5.4 Are European perfects special?
5.5 Inflectionalization rates in Europe and elsewhere
5.6 Conclusion
6: Revisiting the anasynthetic spiral
6.1 Overview
6.2 Analyticizations
6.3 The anasynthetic spiral
6.4 A bit of history
6.5 From agglutination to isolation via fusion?
6.6 Remarks on holistic anasynthesis
6.7 What drives the anasynthetic spiral?
6.8 Conclusion
7: Grammaticalization in the North Caucasian languages
7.1 North Caucasian languages: overview
7.2 Grammaticalization in the Circassian languages (West Caucasian)
7.2.1 Basics of Circassian morphosyntax
7.2.2 Expression of spatial meanings: grammaticalized nouns and verbs
7.2.2.1 Locative preverbs from body-part nouns
7.2.2.2 Directional suffixes from verbs of motion
7.2.3 Auxiliary verb constructions
7.3 Grammaticalization in the Lezgic languages (East Caucasian)
7.3.1 Morphosyntactic features and their diachrony
7.3.2 Verbs and copulas as auxiliaries
7.3.3 Polygrammaticalization of ‘say’
7.3.4 Morphologization with and without clause union
7.4 Conclusion
8: Grammaticalization in Turkic
8.1 Introduction
8.1.1 The Turkic-speaking world
8.1.2 Classification
8.1.3 Turkic as Transeurasian
8.1.4 Typological characteristics
8.2 Representative grammaticalization processes
8.3 Grammaticalization of converbs
8.3.1 Grammaticalization of converbs as postpositions
8.3.2 Grammaticalization in postverbial constructions
8.3.2.1 Phase specification
8.3.2.1.1 Transformativizing constructions
8.3.2.1.2 Non-transformativizing constructions
B constructions
A constructions
8.3.2.2 Spatial orientation
8.3.2.3 Version
8.3.2.4 Potentiality
8.3.3 Grammaticalization as viewpoint aspect markers
8.4 Ambiguity
8.5 Some conclusions
9: Grammaticalization in Japanese and Korean
9.1 Introduction
9.1.1 Japanese and Korean as languages of Northeast Asia
9.1.2 Typological characteristics and historical documentation of the two languages
9.2 Some representative processes of grammaticalization in these languages
9.2.1 Converbs with grammatical function
9.2.2 Grammaticalization of nouns as markers of verbal categories (‘mermaid constructions’)
9.2.3 Development of de-verbal postpositions
9.2.4 Noun classification
9.3 What is special about grammaticalization in Japanese and Korean
9.3.1 Good fit with respect to traditional reductive criteria of grammaticalization
9.3.2 Frequent grammaticalization into interpersonal domains (intersubjectification)
9.3.3 Grammaticalizations from written language
9.4 Conclusion
Acknowledgements
10: Grammaticalization processes in the languages of South Asia
10.1 Introduction
10.2 The languages of South Asia
10.3 South Asia as a linguistic area
10.4 Grammaticalization of relational morphology and metaphorical extensions
10.4.1 ‘Side, flank’ postposition
10.4.2 Grammaticalization of converb morphology from postpositions
10.4.3 Grammaticalization of discourse connectives
10.4.4 ‘Hand/arm’ terminative postpositional clitic, terminative aspect
10.5 ‘Send’, ‘give’ morphological causative
10.6 ‘Eat’ passive/middle/reciprocal/reflexive marking
10.7 ‘See/look’ conative modality ‘try, test out’
10.8 Relative-correlative constructions
10.9 Concluding comments
Acknowledgements
11: Grammaticalization in isolating languages and the notion of complexity
11.1 Introduction
11.2 Typological properties of EMSEA languages
11.2.1 Introduction
11.2.2 Tone
11.2.3 Syntactic formations
11.2.4 Polyfunctionality
11.2.5 Optional grammatical marking
11.2.6 Summary
11.3 Reflections
11.3.1 The problem reframed
11.3.2 Morphological elaboration
11.3.3 Morphological simplification
11.3.4 Reconciling tensions
11.3.5 Final considerations
11.4 Conclusions
12: Typology and grammaticalization in the Papuan languages of Timor, Alor, and Pantar
12.1 Introduction
12.2 Grammaticalization of verbs to adpositions and affixes
12.2.1 Typological features of TAP languages
12.2.2 The development of proto-TAP locational verb *mi ‘be in, at’
12.2.3 The development of proto-TAP deictic verb *ma ‘come’
12.2.4 The development of proto-TAP handling verb *med ‘take’
12.2.5 Grammaticalization of verbs in TAP and local Austronesian languages
12.3 Grammaticalization of nouns: nouns > numeral classifiers
12.3.1 The development of nouns into numeral classifiers
12.3.2 The role of contact with Austronesian
12.3.3 The role of contact with Indonesian
12.3.4 Summary: grammaticalization of nouns, typology, and the role of contact
12.4 Conclusions and discussion
13: Grammaticalization and typology in Australian Aboriginal languages
13.1 Introduction
13.2 Accounting for the shortage of grammaticalization studies of Australian Aboriginal languages
13.3 Evidence for grammaticalization in the development of second-position clitic constructions
13.3.1 Grammaticalization and the development of dual pronoun systems
13.3.2 Grammaticalization and the development of tense/aspect clitics in second position
13.3.3 How constructionalized is the second-position clitic construction?
13.3.4 Pragmatic motivations for the constructionalization of second-position clitic constructions
13.4 Conclusion
Acknowledgements
14: Grammaticalization in Oceanic languages
14.1 Introduction
14.1.1 The Oceanic subgroup
14.1.2 A perspective on grammaticalization
14.2 The grammaticalization of verbs
14.2.1 Verbs of posture and localization > aspectual markers or demonstratives
14.2.2 Verbs of motion or path > directionals > further developments
14.2.2.1 Directionals developing into benefactive or recipient markers
14.2.2.2 Directionals with aspectual or modal values
14.2.2.3 Verbs expressing comparison of inequality
14.2.2.4 Verbs developing into intensifiers, reflexive and reciprocal markers
14.2.2.5 Verbs developing into prepositions
14.2.3 Grammaticalization of the verbs ‘return’ and ‘follow’
14.2.3.1 The verb ‘return’
14.2.3.2 The verb ‘follow’
14.2.4 Verbs of transfer and saying
14.2.4.1 Verbs ‘give’, ‘help’, and ‘say’
14.2.4.2 Verbs ‘take’, ‘take off, throw away’
14.2.5 Phasal verbs
14.2.6 Verbs of nonexistence
14.3 Grammaticalization of nouns
14.3.1 Noun ‘thing’ > aspect marker
14.3.2 Noun ‘child’ > relative marker
14.4 Secondary grammaticalization: development of benefactive markers from possessive markers
14.5 Relexification
14.5.1 Verbal compounds and verbal classifiers
14.5.2 Reanalysis of articles into nouns
14.5.3 Reanalysis of a preverbal particle
14.6 Instances of degrammaticalization?
14.6.1 Formation of existential and manner deixis verbs
14.6.2 Applicative suffix and/or preposition: the POc *-akin(i)
14.7 Conclusion
15: Shaping typology through grammaticalization: North America
15.1 Introduction
15.2 Grammaticalization via auxiliation
15.2.1 Wintuan
15.2.2 Pomoan
15.2.3 Yuki
15.2.4 Wappo
15.2.5 Contact
15.3 Grammaticalization via compounding
15.3.1 Noun-verb compounds: noun incorporation
15.3.2 Verb-verb compounds
15.4 Conclusion
Acknowledgements
16: Areal diffusion and the limits of grammaticalization: An Amazonian perspective
16.1 Areal diffusion and grammaticalization: a preamble
16.2 The Vaupés River Basin linguistic area: a snapshot
16.3 Contact-induced change in Tariana under East Tucanoan impact
16.4 Grammaticalization as a source for newly developed bound morphology in Tariana
16.5 Verb compounding and grammaticalization in Tariana
16.6 The limits of grammaticalization
Acknowledgements
17: Diachronic stories of body-part nouns in some language families of South America
17.1 Introduction
17.2 Target constructions
17.2.1 Locative adpositions
17.2.2 Classifiers
17.2.3 Body-part prefixes
17.2.4 Summary
17.3 Source constructions
17.3.1 Incorporated nouns
17.3.2 Derivational nominal compounds
17.3.3 Generic genitives
17.3.4 Locative compounds
17.4 Conclusions
18: Addressing questions of grammaticalization in creoles: It’s all about the methodology
18.1 Introduction
18.2 Grammaticalization theory
18.3 Grammaticalization in creoles
18.3.1 Problems in defining creole grammaticalization
18.3.2 How we apply a diachronic approach to creole data
18.4 The data and methodology
18.5 The habitual morpheme asé and questions of provenance
18.6 Building the case for the grammaticalization of asé: what the clues tell us
18.6.1 Indicator 1: polarity
18.6.1.1 Prediction
18.6.1.2 Results
18.6.2 Indicator 2: marker deletion and polarity
18.6.2.1 Predictions
18.6.2.2 Results
18.6.3 Indicator 3: aspectual functions of asé
18.6.3.1 Prediction
18.6.3.2 Results
18.6.4 Indicator 4: progressive morpheme ta as habitual
18.6.4.1 Prediction
18.6.4.2 Results
18.6.5 Indicator 5: semantic dissonance
18.6.5.1 Predictions
18.6.5.2 Results
18.6.6 Indicator 6: frequency of preverbal asé versus main verb asé
18.6.6.1 Prediction
18.6.6.2 Results
18.7 Typological markedness and its relationship to grammaticalization
18.7.1 Marked versus unmarked values
18.7.2 Default meanings
18.7.3 Asymmetries in past versus present temporal reference
18.7.3.1 Predictions
18.7.3.2 Results
18.8 Discussion
18.9 Conclusion
19: Is grammaticalization in creoles different?
19.1 Introduction
19.2 Substrate-modelled grammaticalization
19.3 Grammar-internal grammaticalization in Saramaccan
19.3.1 Tense-aspect-mood markers
19.3.2 New information marking
19.3.3 Copulas
19.3.4 Predicate negation
19.3.5 Ablative marking
19.4 Implications
19.4.1 Substrate models are not the main drivers of grammaticalization in creoles
19.4.2 Grammaticalization in creoles is fecund
19.4.3 But why?
19.4.4 There is no ‘creole’ kind of grammaticalization
19.4.5 Relevance to creole genesis theory
References
Index of languages
Index of authors
Index of subjects

Heiko Narrog (editor)
Bernd Heine (editor)


Grammaticalization from a Typological Perspective

OXFORD STUDIES IN DIACHRONIC AND HISTORICAL LINGUISTICS

GENERAL EDITORS
Adam Ledgeway and Ian Roberts, University of Cambridge

ADVISORY EDITORS
Cynthia Allen, Australian National University; Ricardo Bermúdez-Otero, University of Manchester; Theresa Biberauer, University of Cambridge; Charlotte Galves, University of Campinas; Geoff Horrocks, University of Cambridge; Paul Kiparsky, Stanford University; Anthony Kroch, University of Pennsylvania; David Lightfoot, Georgetown University; Giuseppe Longobardi, University of York; George Walkden, University of Konstanz; David Willis, University of Cambridge

RECENTLY PUBLISHED IN THE SERIES
The Development of Latin Clause Structure: A Study of the Extended Verb Phrase, Lieven Danckaert
Transitive Nouns and Adjectives: Evidence from Early Indo-Aryan, John J. Lowe
Quantitative Historical Linguistics: A Corpus Framework, Gard B. Jenset and Barbara McGillivray
Gender from Latin to Romance: History, Geography, Typology, Michele Loporcaro
Clause Structure and Word Order in the History of German, edited by Agnes Jäger, Gisella Ferraresi, and Helmut Weiß
Word Order Change, edited by Ana Maria Martins and Adriana Cardoso
Arabic Historical Dialectology: Linguistic and Sociolinguistic Approaches, edited by Clive Holes
Grammaticalization from a Typological Perspective, edited by Heiko Narrog and Bernd Heine

For a complete list of titles published and in preparation for the series, see pp. –

Grammaticalization from a Typological Perspective Edited by HEIKO NARROG and BERND HEINE

Great Clarendon Street, Oxford, OX DP, United Kingdom

Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

© editorial matter and organization Heiko Narrog and Bernd Heine 
© the chapters their several authors 

The moral rights of the authors have been asserted

First Edition published in 
Impression: 

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this work in any other form and you must impose this same condition on any acquirer

Published in the United States of America by Oxford University Press
 Madison Avenue, New York, NY , United States of America

British Library Cataloguing in Publication Data
Data available

Library of Congress Control Number: 
ISBN ––––

Printed and bound by CPI Group (UK) Ltd, Croydon, CR YY

Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.

Contents

Series preface
Preface
List of abbreviations
The contributors

1. Introduction: Typology and grammaticalization
   Heiko Narrog and Bernd Heine
2. Grammaticalization in Africa: Two contrasting hypotheses
   Bernd Heine
3. Typological features of grammaticalization in Semitic
   Mohssen Esseesy
4. Grammaticalization and inflectionalization in Iranian
   Geoffrey Haig
5. Grammaticalization in the languages of Europe
   Östen Dahl
6. Revisiting the anasynthetic spiral
   Martin Haspelmath
7. Grammaticalization in the North Caucasian languages
   Peter Arkadiev and Timur Maisak
8. Grammaticalization in Turkic
   Lars Johanson and Éva Á. Csató
9. Grammaticalization in Japanese and Korean
   Heiko Narrog, Seongha Rhee, and John Whitman
10. Grammaticalization processes in the languages of South Asia
   Alexander R. Coupe
11. Grammaticalization in isolating languages and the notion of complexity
   Umberto Ansaldo, Walter Bisang, and Pui Yiu Szeto
12. Typology and grammaticalization in the Papuan languages of Timor, Alor, and Pantar
   Marian Klamer
13. Grammaticalization and typology in Australian Aboriginal languages
   Ilana Mushin
14. Grammaticalization in Oceanic languages
   Claire Moyse-Faurie
15. Shaping typology through grammaticalization: North America
   Marianne Mithun
16. Areal diffusion and the limits of grammaticalization: An Amazonian perspective
   Alexandra Y. Aikhenvald
17. Diachronic stories of body-part nouns in some language families of South America
   Roberto Zariquiey
18. Addressing questions of grammaticalization in creoles: It’s all about the methodology
   Hiram L. Smith
19. Is grammaticalization in creoles different?
   John H. McWhorter

References
Index of languages
Index of authors
Index of subjects

Series preface

Modern diachronic linguistics has important contacts with other subdisciplines, notably first-language acquisition, learnability theory, computational linguistics, sociolinguistics, and the traditional philological study of texts. It is now recognized in the wider field that diachronic linguistics can make a novel contribution to linguistic theory, to historical linguistics, and arguably to cognitive science more widely. This series provides a forum for work in both diachronic and historical linguistics, including work on change in grammar, sound, and meaning within and across languages; synchronic studies of languages in the past; and descriptive histories of one or more languages. It is intended to reflect and encourage the links between these subjects and fields such as those mentioned above.

The goal of the series is to publish high-quality monographs and collections of papers in diachronic linguistics generally, i.e. studies focusing on change in linguistic structure, and/or change in grammars, which are also intended to make a contribution to linguistic theory, by developing and adopting a current theoretical model, by raising wider questions concerning the nature of language change, or by developing theoretical connections with other areas of linguistics and cognitive science as listed above. There is no bias towards a particular language or language family, or towards a particular theoretical framework; work in all theoretical frameworks, and work based on the descriptive tradition of language typology, as well as quantitatively based work using theoretical ideas, also feature in the series.

Adam Ledgeway and Ian Roberts
University of Cambridge

Preface

This volume has emerged out of a symposium with the title ‘Grammaticalization in Japanese and Across Languages’, held at the National Institute for Japanese Language and Linguistics, Tokyo, rd to th July . We wish to thank the head of the National Institute, Professor Taro Kageyama, and the head of the sponsoring department, Professor Prashant Pardeshi, for their generous and kind support that enabled this occasion. Proceeding with the book project, we received valuable advice from Oxford University Press and two anonymous peer reviewers. The feedback that we received, and the circumstances of some of the original contributors, led to a reshaping. We are very grateful for the feedback, and very much obliged to those colleagues who had the patience to stick with the project, and finally to those who agreed to newly join, at a slight disadvantage compared with the ‘first-generation’ collaborators, in terms of familiarity with the task, and time pressure.

Heiko Narrog wishes to thank the Japanese Society for the Promotion of Science for the support received from grants number  and H. Bernd Heine wishes to thank Guangdong University of Foreign Studies and Haiping Long, and the University of Cape Town and Matthias Brenzinger for the academic hospitality he received as a visiting professor while working on the book.

List of abbreviations

1               1st person
2               2nd person
3               3rd person
>               interclause coreferentiality relation
A               agent/subject of transitive verb
ABL             ablative
ABS             absolutive
ACC             accusative
ACT             actor
ADD             additive
ADES            adessive
ADJ             adjective
ADN             adnominal
ADV             adverbial
AGT             agentive
ALIEN           alienable
ALL             allative
ANA             anaphoric
ANT             anterior tense
AOR             aorist
APPL            applicative
APUD            ‘near’ localization
ASP             Aspect
ASS             assertive
ASSC            associative
ATTR            attributive
AUX             auxiliary
BEN             benefactive
BOU             boulomaic modality (intention)
BU              buffer element
CAU             causative
CFUG            centrifugal directional
CL              nominal class marker
CLF             classifier
CNTR            contrastive
CNV             converb
COM             comitative
COMP            comparative
COND            conditional
CONJ            conjunction
CON             conative
CONN            connective
CONT            continuative
COORD           coordination
COP             Copula
CPL             complementizer
DAT             dative
DEC             declarative
DEF             definite
DEIC            deictic
DEM             demonstrative
DEP             dependent
DES             desiderative
DETR            detransitive
DIR             directional
DIST            distal demonstrative
DM              discourse marker
DU              dual
DUR             durative
ELAT            elative
EMPH            emphatic
EMSEA           East and Mainland South East Asia
EP              epenthetic
ERG             ergative
ESS             essive
EVID            evidential
EXHORT          exhortative
EXCL            exclusive
EXP             experiential
EZ              ezafe (linking) particle
F, f            feminine
FOC             focus
FREQ            frequentative
FUT             future
GEN             genitive
GER             gerund(ive)
HAB             habitual
HON             honorific
HORT            hortative
HRSY            hearsay
HUM             human
IMM             immediate future
IMP             imperative
IMPF            imperfect
IMPV            imperfective
IN              ‘inside’ localization
INC             inceptive
INC             incompletive
INCH            inchoative
INCL            inclusive
INCON           inconsequential
IND             indicative (mood)
INDF            indefinite
INESS           inessive
INF             infinitive
INJ             interjection
INS             instrumental
INT             intensive
INTR            intransitive
IO              indirect object
IRR             irrealis
JUSS            jussive
LAT             lative
LK              linker
LOC             locative
LV              loan verb
M, m            masculine
MAL             malefactive
MID             middle
MSD             masdar
n               neuter
N               noun
N               noun class 
NAR             narrative
NEG             negation
NF, nf          non-feminine
NFIL            non-final
NFIN            non-finite
NFUT            non-future
NI              new information
NMLZ            nominalizer
NOM             nominative
NON.POSS        non possessed
NON.PROX        non proximal to the addressee
NP              Noun Phrase
NPS             non-past tense
NRL             non-relational noun prefix
NSG             non-singular
NSIT            new situation
NSPC            non-specific article
NVIS            non-visible
OBJ             object
OBL             oblique
OBLIG           obligation
OBV             obviative
ONOM            onomatopoeia
ORD             ordinal
P               patient/direct object of transitive verb
PASS            passive
PAST.VIS.INTER  visual past interrogative
PE              previous event
PEOc            Proto-Eastern Oceanic
PERM            permissive
PERS            personal
PERT            pertensive
PFV             perfective
PL, pl          plural
PN              proper noun
PNP             Preposition-Noun-Preposition
POc             Proto-Oceanic
POSS            possessive
POSS.LINK       Possessive Linker
POST            ‘behind’ localization
POT             potential
PP              past participle
PPP             past/perfective passive participle
PR              possessor
PRET            preterite
PREX            prefix
PRF             perfect
PROG            progressive
PROP            proprietive
PROX            proximate
PRS             present
PRS.NONVIS      present non-visual
PRS.VIS         present visual
PRV             preverb
Ps              person
PST             past
PST.IMP         past imperfect
PTC             particle
PTCP            participle
PURP            purposive
Q               question/interrogative
QUAL            qualitative predication
QUO             quotative
R               Russian loan
RDP             reduplication
RE              refactive
REAL            realis
REL             relativizer
REM.P           remote past
REM.P.REP       remote past reported
REM.P.VIS       remote past visual
REP             repetitive
RESTR           restrictive particle
RETR            retrospective
RFL             reflexive
RL              relational noun prefix
RPR             reportative
RSN             reason
S               single argument of an intransitive predicate
S/A             S/A indexing suffix
SAE             Standard Average European
SBD             subordinator
SBJ             subjunctive mood
SBZ             substantivizer
SE              simultaneous event
SENS            sensory
SEQ             sequential
SFP             sentence-final particle
SG, sg          singular
SPC             specific article
SPEC            specific
SS              same subject
STAT            stative
SUB             ‘under’ localization
SUBJ            subject
SUBO            subordinate
SUCC            successive aspect
SUR             surpass
SVC             serial verb construction
TEMP            temporal
TERM            terminative
TOP             topic
TOP.NON.A/S     topical non-subject
TR              transitive
U               undergoer
V               Verb
VB              verbalizer
VENT            ventive
VERIF           verificative
VOC             vocative

The contributors A LEXANDRA Y. A IKHENVALD is Distinguished Professor, Australian Laureate Fellow, and Director of the Language and Culture Research Centre at James Cook University. She is a major authority on languages of the Arawak family, from northern Amazonia, and has written grammars of Bare () and Warekena (), plus A Grammar of Tariana, from northwest Amazonia (Cambridge University Press, ), in addition to essays on various typological and areal features of South American languages. Her other major publications, with Oxford University Press, include Classiﬁers: A Typology of Noun Categorization Devices (), Language Contact in Amazonia (), Evidentiality (), The Manambu Language from East Sepik, Papua New Guinea (), Imperatives and Commands (), Languages of the Amazon (), The Art of Grammar (), How Gender Shapes the World (), and Serial Verbs (). U MBERTO A NSALDO is Professor in Linguistics and Head of the School of Literature, Arts and Media at The University of Sydney. He specializes in language contact and typology of the Asian region. He is the author of Languages in Contact: Ecology and Evolution in Asia (Cambridge University Press, ) and the co-editor in chief of the journal Language Ecology (John Benjamins). P ETER A RKADIEV holds a PhD in theoretical, typological, and comparative linguistics from the Russian State University for the Humanities. Currently he is a senior researcher at the Institute of Slavic Studies of the Russian Academy of Sciences. His ﬁelds of interest include language typology and areal linguistics, case and alignment systems, tense-aspect, and Baltic and Caucasian languages. He has co-edited Borrowed Morphology (with Francesco Gardani and Nino Amiridze, de Gruyter, ) and Contemporary Approaches to Baltic Linguistics (with Axel Holvoet and Björn Wiemer, de Gruyter, ). W ALTER B ISANG studied General Linguistics, Chinese, and Georgian at the University of Zürich. 
He has been Professor of General and Comparative Linguistics at the University of Mainz (Germany) since , and Director of the Collaborative Research Center on ‘Cultural and Linguistic Contact’ from  to . His research interests include grammaticalization, linguistic typology, and language contact/areal typology. His languages of interest are: East and mainland Southeast Asian languages, Caucasian languages (Georgian), Austronesian languages (Bahasa Indonesia, Tagalog, Yabêm, Paiwan), and Yoruba (together with Remi Sonaiya). A LEXANDER R. C OUPE is Associate Professor of Linguistics and Multilingual Studies at Nanyang Technological University, Singapore. He is the author of A Grammar of Mongsen Ao () and has a major research focus on the documentation and analysis of the TibetoBurman, Indo-Aryan, and Austroasiatic languages of Northeast India, having worked on the languages of this region for over two decades. The output of this research feeds his theoretical interests in linguistic typology, grammaticalization theory, and the development of complexity in the grammars of the world’s languages.

xvi

The contributors

É VA Á. C SATÓ is Professor Emeritus in Turkic Languages at the Department of Linguistics and Philology, Uppsala University, Sweden. She has made contributions to the typology of Turkic languages as well as the documentation of less-studied and endangered Turkic varieties. Together with Lars Johanson she edited the volume The Turkic Languages (). She is a member of the editorial boards of the journals Turkic Languages (Harrassowitz). Ö STEN D AHL is Professor Emeritus of General Linguistics at Stockholm University, Sweden. He obtained his academic training at the universities of Gothenburg, Uppsala, and Leningrad (St Petersburg), and was active at the University of Gothenburg for ten years before moving to Stockholm in . In recent years, his research has mainly been typologically oriented with a strong interest in diachronic approaches to grammar. He has published the monographs Tense and Aspect Systems (), The Growth and Maintenance of Linguistic Complexity (), and Grammaticalization in the North: Noun Phrase Morphosyntax in Scandinavian Vernaculars (). M OHSSEN E SSEESY , PhD, is an Associate Professor of Arabic and International Affairs and the Chair of the Department of Classical and Near Eastern Languages and Civilizations at George Washington University in Washington, DC. His publications include Grammaticalization of Arabic Prepositions and Subordinators: A Corpus-Based Study (Brill, ); Contemporary Business Arabic (Georgetown University Press, forthcoming); Arabic for Speciﬁc Purposes in Handbook for Arabic Teaching Professionals in the st Century, vol.  (Routledge, ). Dr Esseesy is a member of the Editorial Board of the series Routledge Studies in Language and Identity, and was an elected member the Executive Board of the American Association of Teachers of Arabic, –. G EOFFREY H AIG is Professor of General Linguistics at the Institute for Oriental Studies at the University of Bamberg. 
He received his PhD in  at the University of Kiel, and has since worked as a full-time researcher and lecturer in linguistics, focusing on corpus-based approaches to language typology, the historical syntax of Iranian, areal linguistics (Western Asia), and Kurdish (see https://www.uni-bamberg.de/aspra/mitarbeiter/prof-dr-geoffrey-haig/). M ARTIN H ASPELMATH is senior scientist at the Max Planck Institute for the Science of Human History (Jena) and Honorary Professor at Leipzig University. Between  and , he was a member of the linguistics department of the Max Planck Institute for Evolutionary Anthropology (Leipzig). His research interests are primarily in the area of broadly comparative and diachronic morphosyntax (Indeﬁnite Pronouns, ; Understanding Morphology, ) and in language contact (Loanwords in the World’s Languages, co-edited with Uri Tadmor, ). He is one of the editors of Oxford University Press’s World Atlas of Language Structures (), and of the Atlas of Pidgin and Creole Language Structures (). B ERND H EINE is Emeritus Professor at the Institute of African Studies (Institut für Afrikanistik), University of Cologne, Germany. He is presently Yunshan Chair Professor at Guangdong University of Foreign Studies, China. His  books include Possession: Cognitive Sources, Forces, and Grammaticalization (Cambridge University Press, ); Cognitive Foundations of Grammar (Oxford University Press, ); with Tania Kuteva, World Lexicon of Grammaticalization (Cambridge University Press, ); Language Contact and Grammatical Change (Cambridge University Press, ). He has held visiting professorships in Europe, Eastern

The contributors

xvii

Asia (Japan, South Korea, China), Australia (LaTrobe University, Melbourne), Africa (University of Nairobi, University of Cape Town), North America (University of New Mexico, Dartmouth College), and South America (Universidade Federal Fluminense, Rio de Janeiro). He has been a fellow of the Center for Advanced Study in the Behavioral Sciences, Stanford, USA (–), the Netherlands Institute for Advanced Study (NIAS), Wassenaar (–), and Tokyo University of Foreign Studies (–). His present main research areas are discourse grammar and grammaticalization theory.

LARS JOHANSON is a linguist and Turcologist. For many years he has been Professor of Turcology at the Department of Oriental Studies of the University of Mainz, Germany, and a Senior Lecturer in Turkic Languages at Uppsala University, Sweden. He has been instrumental in transforming the field of Turcology, which was traditionally more philologically oriented, into a linguistic discipline. Apart from his contributions to Turcology, Lars Johanson has made a number of pioneering contributions to general linguistics and language typology, in particular to the typology of viewpoint aspect systems and the theory of language contact. He is the editor of the journal Turkic Languages (Harrassowitz) and of the monograph series Turcologica (Harrassowitz).

MARIAN KLAMER is Professor of Austronesian and Papuan Linguistics at Leiden University. Over the last two decades she has led research projects describing and documenting Austronesian and Papuan minority languages in Eastern Indonesia, comparing their typological characteristics and reconstructing their history. Her publications include grammars on Kambera (), Teiwa (), and Alorese (), as well as several edited volumes, and over fifty articles on a wide range of topics, including language contact and grammaticalization in Indonesia.
She is currently leading the NWO-VICI project ‘Reconstructing the past through languages of the present: the Lesser Sunda Islands’ (–).

TIMUR MAISAK holds a PhD in theoretical, typological, and comparative linguistics from Lomonosov Moscow State University (). Currently he is a senior researcher at the Institute of Linguistics of the Russian Academy of Sciences. His fields of interest include language typology and language documentation, grammaticalization theory, tense and aspect systems, and East Caucasian (Nakh-Daghestanian) languages. He has co-edited Tense, Mood, Aspect and Finiteness in East Caucasian Languages (with Gilles Authier, Brockmeyer, ) and The Semantics of Verbal Categories in Nakh-Daghestanian Languages: Tense, Aspect, Evidentiality, Mood/Modality (with Diana Forker, Brill, ).

JOHN H. MCWHORTER is Professor of Linguistics, Philosophy and Music History at Columbia University. He is the author of The Missing Spanish Creoles, Language Interrupted, Defining Creole, and Linguistic Simplicity and Complexity, as well as trade books such as The Power of Babel, Our Magnificent Bastard Tongue, The Language Hoax, and Words on the Move.

MARIANNE MITHUN is Professor of Linguistics at the University of California, Santa Barbara. Her work focuses on morphology, syntax, discourse, prosody, and their interrelations; language contact and language change; typology and universals; language documentation; and the linguistics of languages indigenous to North America and eastern Austronesia.

CLAIRE MOYSE-FAURIE is a French linguist from the Lacito-CNRS, Paris. Since , she has done fieldwork on several Kanak languages (Drehu, Xârâcùù, Xârâgurè, West Uvean, and
Haméa) and on two Polynesian languages (East Futunan and East Uvean), elaborating dictionaries, grammars, and oral tradition texts. She is also involved in several typological projects (such as categorization, nominalization, reflexives and reciprocals, valency and verb classes, and spatial deixis), as well as in documentation projects on endangered languages.

ILANA MUSHIN is Associate Professor and Reader in Linguistics at the University of Queensland, Australia, specializing in pragmatics, typology, and the description of Australian languages. She is the author of A Grammar of (Western) Garrwa (, Mouton de Gruyter) and co-editor of Discourse and Grammar in Australian Languages (, Benjamins). Her research into Garrwa grammar has focused on the pragmatic underpinnings of grammatical development, especially pronoun systems and the development of second-position phenomena in Australian languages.

HEIKO NARROG is Professor at Tohoku University, Japan. He received a PhD in Japanese studies from the Ruhr University Bochum in , and a PhD in language studies from Tokyo University in . His publications include Modality in Japanese and the Layered Structure of Clause (Benjamins, ), Modality, Subjectivity, and Semantic Change: A Cross-Linguistic Perspective (Oxford University Press, ), and The Oxford Handbook of Grammaticalization (Oxford University Press, ) and The Oxford Handbook of Linguistic Analysis, nd edn (Oxford University Press,), both co-edited with Bernd Heine.

SEONGHA RHEE is Professor at Hankuk University of Foreign Studies, Korea. He received a PhD in Linguistics from the University of Texas at Austin in . He served as President of the Linguistic Society of Korea, –, and as President of the Discourse and Cognitive Linguistic Society of Korea, –.
He has published book chapters in The Oxford Handbook of Grammaticalization (Oxford University Press, ), Rethinking Grammaticalization (Benjamins, ), Shared Grammaticalization (Benjamins, ) among others, and has published research articles in journals including Journal of Pragmatics, Language Sciences, and Journal of Historical Pragmatics.

HIRAM L. SMITH received a PhD in Spanish Linguistics from Penn State University in . He currently works as Assistant Professor of Spanish and Linguistics at Bucknell University, Lewisburg, Pennsylvania. His research focuses on language variation and change, with an emphasis on marginalized speech communities in Spanish-speaking and Anglophone America. Currently, he is researching change in Palenquero Creole, spoken in the Afro-Hispanic village of San Basilio de Palenque, Colombia. An integral part of his research philosophy is that whenever possible community members should take part in the scientific study of their own community, and that researchers should always put science at the service of the community.

PUI YIU SZETO is a PhD candidate at the Department of Linguistics, University of Hong Kong. His research interests lie primarily in the field of language contact, particularly in contact-induced grammaticalization as well as the interplay between linguistic and social factors in the emergence and evolution of contact varieties.

JOHN WHITMAN is a Professor in and Chair of the Department of Linguistics at Cornell University, New York. His linguistic specializations are East Asian linguistics, comparative syntax, language typology, and historical linguistics. He is currently editor of the journal Korean Linguistics (Benjamins). Recent publications in linguistics include a co-edited volume Ryūkyū shogo to Kodai Nihongo: Nichiryū sogo no saiken ni mukete (Ryūkyūan and premodern
Japanese: toward the reconstruction of Proto-Japanese-Ryūkyūan, , with Yukinori Takubo and Tatsuya Hirako), the chapter ‘Old Korean’ in the Blackwell Handbook of Korean Linguistics (), and the chapter on ‘Topic prominence’ in the Blackwell Companion to Syntax (with Waltraud Paul, to appear).

ROBERTO ZARIQUIEY holds a PhD in Linguistics from LaTrobe University, Melbourne, Australia. As his PhD thesis, he wrote a reference grammar of Kakataibo, which was granted an honourable mention in the prestigious Panini Award (from the Association for Linguistic Typology). This grammar is about to be published in the Grammar Library of Mouton de Gruyter (). He is currently an associate professor at the Pontificia Universidad Católica del Perú (PUCP), where he also is the Director of the Masters Program in Linguistics and the Co-Director (and founder) of the Digital Archive of Peruvian Languages. In recent years, he has been a visiting scholar in prestigious institutions such as the University of California at Berkeley (), the Max Planck Institute for Evolutionary Anthropology (MPI-EVA) (), and the Collegium de Lyon ().

1 Introduction: Typology and grammaticalization

HEIKO NARROG AND BERND HEINE

1.1 INTRODUCTION

The goal of this volume is to identify aspects of grammaticalization that correlate with typological features of languages. Previously, the hypothesis that certain criteria of grammaticalization may apply differently to different types of languages, and concomitantly language areas, was mainly raised with respect to Southeast and East Asian languages (Bisang , , ; Narrog and Ohori ). Our idea, then, was to pursue this hypothesis by casting the net wider and more systematically, and invite experts on different language areas to reflect on the relationship between language type and language area on the one hand and grammaticalization on the other. To start out with the basics, we define typology and grammaticalization as follows:

()

Linguistic typology ‘concerns itself with the study of structural differences and similarities between languages. [. . .] [It] is the study and interpretation of linguistic or language types’ (Velupillai : ).

()

Grammaticalization concerns ‘the way grammatical forms arise and develop through space and time’ (Heine : ).

The process of grammaticalization can be divided into the four basic aspects listed in () (cf. Heine and Narrog : ).

()

(a) extension (or context generalization): use in new contexts;
(b) desemanticization: loss (or generalization) in meaning content;
(c) decategorialization: loss in morphosyntactic properties characteristic of lexical or other less grammaticalized forms; and
(d) erosion (or ‘phonetic reduction’): loss in phonetic substance.

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Heiko Narrog and Bernd Heine . First published  by Oxford University Press

In our view, the pragmatic-semantic processes (a) and (b) are essential for grammaticalization. That is, we consider semantic change as the core of grammaticalization,
as will also be argued in Heine (Chapter  this volume). The structural processes given in (c, d) may or may not follow, and if they do, by hypothesis they do so in the order of the list in (). If there is typological variation in grammaticalization, it may in principle affect all aspects of grammaticalization. There is no reason to exclude some aspect a priori. However, if our hypothesis is correct that semantic change is the essence of grammaticalization and other aspects follow in the order given above, we can hypothesize that the order in which typological variation in grammaticalization occurs is in reverse: (d) > (c) > (b) > (a). Accordingly, formal reductive change should be most susceptible to variation and semantic/functional change least susceptible.

From the literature on grammaticalization and typology cited in the first paragraph of this section, one may infer that it is primarily extant typological features that in some way influence grammaticalization. But interaction between grammaticalization and typology does not have to go in one direction. As argued in Narrog (a), the two directions of influence in () and () are not only hypothetically possible but have also been empirically observed.

() Typological features influence aspects of grammaticalization.

() Grammaticalization motivates structural features that can be typologized.

Furthermore, typological features interacting with grammaticalization have been identified with certain language areas or groups of languages rather than individual languages. This is due to the fact that structural features are often shared between areally adjacent languages, like the East and mainland Southeast Asian languages, sometimes across genetic boundaries. Therefore, while the target of our project is the interaction between typological features of languages and grammaticalization, areal groups of languages that share structural features have been chosen as the subject of most of the studies in this project.
Sections . and . elaborate on () and () above, and refer to chapters in this volume that show these inﬂuences at work. Section . also contains a short conclusion. Finally, section . provides a conclusion and a brief overview of the structure of this volume.

1.2 TYPOLOGICAL FEATURES INFLUENCE GRAMMATICALIZATION

It is a long-standing question what ultimately drives, or motivates, grammaticalization. Haspelmath (, Chapter  this volume), for example, has been arguing that inflation on the one hand and extravagance on the other hand are the ultimate causes of a perpetual cycle of grammaticalization. For Hawkins (), efficiency and ease of processing are the causes of language change and grammaticalization, as he considers grammars as the ‘conventionalizations of the same processing mechanisms that psychologists find evidence for in experimental and corpus data’ (p. ). When it comes to the shape taken by new grammaticalizations, or the paths through which
they grammaticalize, it is only reasonable to assume that extant language structures, including typologically relevant structures, may play a role, even if it is difficult (if possible at all) to demonstrate a clear cause–effect relationship between older structures and the form of emerging structures, since 1:1 replication of an older structure would be a rarity.

One also has to keep in mind that the definition of grammaticalization adopted also determines the extent to which grammaticalization can be analysed as being influenced by typological features. Concretely, the more grammaticalization is reduced to a universal essence, the less room there is for variation. Himmelmann (: ), for example, claims that ‘a grammaticisation process can be defined as a process of context expansion’ on the three levels of (a) host class formation/expansion, (b) syntactic context expansion, and (c) semantic-pragmatic context expansion. This definition intentionally excludes typological factors from grammaticalization, which are viewed as epiphenomenal. Himmelmann (p. ) explicitly states:

()

The above deﬁnition of grammaticisation [. . .] differs from previous deﬁnitions of grammaticisation in [. . .] singling out context expansion in general and semantic-pragmatic context expansion in particular as the major deﬁning feature of grammaticisation. All the other phenomena which are often observed in grammaticisation processes and which are considered criterial in other deﬁnitions [. . .] do not occur in this deﬁnition [. . .] [E]rosion and fusion here are considered epiphenomena. Their occurrence depends on at least two factors. For one, they depend on the overall typological proﬁle of a given language (e.g. in isolating languages the potential for fusion is generally very limited). For another, the construction type also appears to play a major role.

In generative approaches to grammaticalization as well, grammaticalization tends to be reduced to an essence that is universal and not amenable to typological influences. Thus, for example, Roberts and Roussou (: ) regard grammaticalization ‘as the diachronic development of lexical heads into functional heads’, and van Gelderen () conceptualizes grammaticalization as change from lower head to higher head (‘Late Merge’) or from Spec to Head. There might be some typological variation in the heads that are the sources and the targets of change, but the essence will remain unaffected.

In this book, however, we are interested in the ‘full package’ of grammaticalization, including phonological and morphological changes. This full package is the subject of the remainder of this section.

Phonological and morphological typological features may be a guiding factor for grammaticalization processes in several ways. First, it has been claimed that grammaticalization in tonal, isolating languages, like the Sinitic languages, does not lead to reduction of syllables. Ansaldo and Lim (: ) suggest that ‘[s]trongly isolating languages typically do not allow yesterday’s syntax to become today’s morphology [. . .] syllable boundaries are discrete and phonotactic constraints rule out reduced syllables of the kind observed elsewhere, the material available for reduction is not easily found at the morphological level’ (cf. also Bisang , ). In the same vein, Ansaldo, Bisang, and Szeto (Chapter  this volume) claim that ‘elaboration of morphological structure only happens in a certain type of languages’. That is, ‘the formal aspects of canonical
grammaticalization do not happen in [East and mainland Southeast Asian] languages.’ Narrog, Rhee, and Whitman (Chapter  this volume; also Narrog and Ohori ) point out that, in contrast to the situation in the ‘isolating’ languages of East and mainland Southeast Asia, morphological parameters of grammaticalization seem to apply particularly well to Northeast Asian languages like Korean and Japanese, and probably to the so-called Transeurasian languages in general, which tend to be agglutinating. Mithun (Chapter  this volume) emphasizes the relatively high degree of morphological complexity as a product of extensive grammaticalization in North American languages. While the most common path leads from grammaticalization to univerbation, in North American languages we also often ﬁnd the reverse, namely univerbation preceding grammaticalization. This is a source of a degree of polysynthesis that is not found in many language areas of the world. On the other hand, the fact that morphological complexity is the result of grammaticalization also points to the fact that there is a two-way relationship between grammaticalization and typological features— that is, typological features do not unilaterally determine features of grammaticalization. In the case of the languages described by Bisang, structural typological features seem to perpetuate themselves horizontally, so to speak, across languages and vertically across the history of individual languages. Very generally, we may assume that, unless there is some disruption, the typological morphological tendencies of a language result in the extant morphological types as target structures: for example, in languages with agglutinative morphology, grammaticalization is more likely to lead to afﬁxation than in isolating languages. 
Dahl (Chapter  this volume), while in general not agreeing with Ansaldo et al.’s view of different rates of grammaticalization in East and mainland Southeast Asian versus European languages, nevertheless agrees that ‘the likelihood for a certain grammaticalization process to appear is at least to some extent dependent on structural properties of the language’. Concretely, the inﬂectional endings on verbs and nouns that modern European languages have still preserved (but to a considerable extent already lost) are likely to be a remnant of grammaticalization in head-ﬁnal proto-Indo-European. But there are also synchronic semantic and syntactic features of languages that may guide grammaticalization. First, as discussed in more detail in section ., in headﬁnal languages grammaticalization is more likely to lead to bound morphemes than in head-initial languages because of the tendency for postposed rather than preposed morphemes to become bound. This theme is echoed by Dahl (Chapter  this volume) and by Narrog, Rhee, and Whitman (Chapter ), as already seen above. Second, Bisang (, , ) claims a number of constraints on grammaticalization in Sinitic languages, which stand representatively for the East and mainland Southeast Asian languages. Besides the small extent of changes in phonology and morphology, these include (i) lack of obligatoriness of grammatical categories; i.e. lack of grammatical paradigms in general, (ii) lack of clearly determined semantic domains, (iii) existence of rigid word-order patterns within which lexical items grammaticalize, and pervasiveness of inference, which enables language users to encode and decode the function of a speciﬁc item in a speciﬁc semantic context, (iv) no grammaticalization chains (i.e. continuous grammaticalization from one category to the next). In other

Introduction



words, ‘widespread polyfunctionality undermines the semantic dimension of canonical grammaticalization in [East and mainland Southeast Asian] languages’. Accordingly, none of Lehmann’s () commonly cited six parameters of grammaticalization applies except syntagmatic variability, nor do other traditional models of grammaticalization. But even this parameter, according to Ansaldo, Bisang, and Szeto (Chapter  this volume) appears to be dubious: ‘An aspect of grammaticalization in this area may be the loss of autonomy, or constructionalization, but even this is undermined by polyfunctionality and lack of obligatory marking.’ Esseesy (Chapter  this volume) does not recognize such kinds of constraints of grammaticalization in Arabic languages. He writes that ‘there appears to be no typological limit found on the evolution of meaning and form in Semitic of the type described in Bisang’s () study’. Esseesy highlights the path of one speciﬁc gram, fī>f- ‘in, at’ that underwent rampant polygrammaticalization. It may go without saying that there is also a lot of intra-language variation, or variation within languages of one group with respect to degrees of grammaticalization. This is amply shown not only by Esseesy (Chapter ) but also by Haig (Chapter ), Mushin (Chapter ), Moyse-Faurie (Chapter ), and Arkadiev and Maisak (Chapter ). Thus, Arkadiev and Maisak demonstrate that ‘grammaticalizing constructions in North Caucasian languages display various degrees of integration, ranging from highly autonomous auxiliaries to those partly or totally fused with lexical verbs, up to the extent of becoming afﬁxes’. This kind of replication of existing structures through grammaticalization may apply not only to morphology (and syntax) but also to grammatical functions. Ideally speaking, every language would have just one exponent for every cross-linguistically available grammatical category. 
In reality, in any given language, expressions for specific categories may be grammaticalized en masse, while other categories are not grammaticalized at all. These categories may be expressed then indirectly through other categories, lexically, or not at all. As a result, a certain set of categories is grammaticalized over and over in waves or cycles. Those dominant categories may further impact the way other categories are grammaticalized.

In the area of tense-aspect-mood (TAM), Bhat () suggests that languages tend to be either tense-, aspect-, or mood-prominent, and the categories which are not overtly expressed are often indirectly expressed through the well-grammaticalized ones. In terms of grammaticalization, we can surmise that languages tend to maintain this typological profile by repeatedly grammaticalizing their preferred category rather than grammaticalizing the ‘neglected’ ones. Bhat (: ) concretely suggests different paths of grammaticalization: ‘[L]anguages that give greater prominence to aspect than to tense develop a perfective form from an earlier perfect construction and an imperfective form from an earlier progressive construction, whereas languages that give greater prominence to tense than to aspect develop past and present forms directly from their perfect and progressive constructions respectively.’

Furthermore, aspect-prominent and mood-prominent languages show distinct tendencies of change when they develop temporal distinctions. In the case of aspect-prominent languages, we generally find a two-way past/non-past distinction or a three-way past-present-future distinction
developing from an earlier perfective-imperfective distinction [. . .]. In the case of mood-prominent languages, on the other hand, the general tendency is to develop primarily a future/non-future distinction. (Bhat : –)

Chafe ( presents a related phenomenon with his idea of ‘ﬂorescence’: ‘Like forests, languages may develop toward a climax stage where particular combinations of features, like plant communities, may ﬂourish to deﬁne a particular language type. I think it is useful to think in terms of the ﬂorescence of linguistic features in this sense— the ﬂowering of features that come to dominate the form a language takes’ (p. ). Some features of Iroquoian languages serve as examples. Firstly, especially Northern Iroquoian languages can have an elaborate inventory of up to – pronominal preﬁxes. Besides singular, plural, and dual number distinctions in the ﬁrst and second persons of both agents and patients, Cherokee has even an inclusive/exclusive distinction and gender distinctions with third persons (cf. Chafe : ). This elaborate pronominal preﬁx system is described (reconstructed) by Chafe as the result of not one but successive waves of grammaticalizations. Secondly, Iroquoian languages are characterized by noun incorporation of a wide range of categories, e.g. animals, foods, and body parts. Even whole events can be incorporated into nouns as verb roots. In this case as well, inter-language comparison suggests that these incorporations developed not at once but successively, newer incorporations following older ones. They also range from productive to idiomatic. In well-known languages such as English or German, the relatively large variety of modal and semi-modal verbs may stand for the same phenomenon, although surely not with the same abundance. Historically speaking, there was a wave of emergence of the modals from Old to Middle English, and another wave of emergence of the semi-modals from Middle to Modern English. The prior emergence of one or two auxiliary verbs seems to have drawn others along the same path, and the modals as extant structures probably induced the later development of the semi-modals. 
Krug () explained the development of the semi-modals in terms of a ‘gravitational model’, which operates on the principle that ‘larger masses (in our case highly frequent emerging auxiliaries) attract smaller masses (in our case less common constructions)’ (p. ). Based on frequency and similarity, Krug (p. ) calculated to which extent speciﬁc semi-modals are inﬂuencing others. More generally, then, grammatical items and constructions with low frequency change in analogy to highfrequency items and constructions. Type frequency determines the inﬂuence of a group of items on other items outside that group. This had already been observed by Paul (: ): ‘all that part of language which lacks the support of an environing group, or which enjoys it only in a limited measure, proves, unless impressed by repeated usage intensely upon the memory, not strong enough to withstand the power of the larger groups’.¹ Bybee and Thompson (: ) likewise suggest that

¹ ‘Alles dasjenige aber, was die Stütze durch eine Gruppe entbehrt oder nur in geringem Masse geniesst, ist, wenn es nicht durch häuﬁge Wiederholung besonders intensiv dem Gedächtnisse eingeprägt wird, nicht widerstandsfähig genug gegen die Macht der grösseren Gruppen’ (Paul : ).

‘high type frequency ensures that a construction will be used frequently, which will strengthen its representational schema, making it more accessible for further use, possibly with new items’. In contrast to type frequency, token frequency may determine the item or construction with the biggest influence within a specific group.

In this manner, the dominant structural and functional types of a language exert influence on the paths of new grammaticalizations. Many chapters in this volume reveal categories that seem to be grammaticalized with preference in certain language groups. A good example may be the stunning range of aspectual and actional categories that is grammaticalized across Turkic languages (Johanson and Csató, Chapter  this volume), and the instrumental affixes in many North American languages. Kutenai (Mithun, Chapter ), for example, has several hundreds of such affixes. Similarly, as Klamer (Chapter ) shows, Papuan languages have grammaticalized a wealth of applicative prefixes. If one can on the other hand speak of a ‘florescence’ or preponderance of certain structures as sources, there is no dearth of examples either. For instance, converbs are prominently involved in grammaticalization in Turkish (Johanson and Csató, Chapter ) as well as other Northeast Asian Transeurasian languages (Narrog, Rhee, and Whitman, Chapter ); body-part nouns are prominent source structures in South American (Zariquiey, Chapter ) and Caucasian languages (Arkadiev and Maisak, Chapter ), while serial verbs serve to grammaticalize a wide range of categories in Timor-Alor-Pantar (Klamer, Chapter ) and Oceanic languages (Moyse-Faurie, Chapter ). However, forces influencing grammaticalizations are not restricted to extant language- or language-area-specific structures.
DeLancey () claims that there are functions that are ‘important enough, cross-linguistically, that in language which does not formally express it with dedicated grammatical machinery, any construction or lexical means which expresses a related function is a likely candidate for grammaticalization’ (p. ). He labels this kind of cross-linguistically salient function as a ‘functional sink’. For example, the formation of adjectives is based on the function of noun modiﬁcation, which is universal, even if adjectives as a part of speech are not. Certain nouns or verbs may be drawn into this functional sink, eventually leading to the development of a new category adjective in a given language. In the domain of grammar, Thornes () claims that such a functional sink is at work in the grammaticalization of causative constructions in Northern Paiute. In his view, causative is a grammatical function with a high communicative need that is usually available in some form in any language, even in the absence of grammatical means. Because of the frequent need of expression, it attracts lexical materials to grammaticalize. These sorts of universal aspects of grammaticalization also resonate well with the chapters on creoles in this volume. Both McWhorter (Chapter ) and Smith (Chapter ) argue against creole exceptionalism. McWhorter suggests that, ‘in terms of the grammaticalization as a process, creoles offer no insights that could not be gained from other languages. No development in Saramaccan (or in other creoles that I am aware of) exempliﬁes a process, trend, or directionality counter to the grammaticalization process as documented in languages around the world [. . .].’ Smith (Chapter  this volume) applies accountable quantitative methods and shows that grammaticalization theory can account for patterns we observe in the domain of tense-aspect



Heiko Narrog and Bernd Heine

expression of Palenquero: ‘Palenquero is behaving no differently in the realm of tense and aspect than any other world language, despite its classiﬁcation as a creole.’ McWhorter (Chapter ), on the other hand, points to a quantitative rather than qualitative difference: ‘Grammaticalization has indeed occurred to an unusually vast degree in the few centuries that most creoles are known to have existed, such that it is reasonable to state that rampant grammaticalization is a deﬁning feature of languages born from pidgins and reconstituted as new languages.’ The presumptive reason is that creoles are originally the product of adult language acquisition, and rapid grammatical elaboration took place in later generations. As McWhorter puts it, ‘circumstances were ripe for the emergence of entire new paradigmatic systems and overt markings of semantic categories many languages leave to context.’

1.3 GRAMMATICALIZATION AS A POSSIBLE EXPLANATION FOR TYPOLOGICAL FEATURES OF LANGUAGES

As already indicated in the previous section, grammaticalization can also be taken as at least a partial explanation for certain typological features of languages. Three kinds of scenarios can be identified in the literature. First, grammaticalization has been proposed as a partial explanation for some implicational universals; second, grammaticalization has been proffered as a (partial) explanation for the order of affixed material; and third, grammaticalization has been given as an explanation for cross-linguistic types of the expression of certain grammatical categories. Additionally, a number of authors in this volume point to the role of grammaticalization in the creation of analytic (vs synthetic) language structures.

Concerning the first case, Lehmann (: ) observed that the process of grammaticalization can be taken as a causal factor for some of Greenberg’s implicational universals of word order, such as () and ().

()

In languages with prepositions, the genitive almost always follows the governing noun, while in languages with postpositions it almost always precedes.

()

Languages with dominant VSO order are always prepositional.

These generalizations can be explained by the fact that when lexical heads of complex constructions grammaticalize (e.g. relational noun to adposition, verb to auxiliary), they usually remain in their original position. Relevantly for the generalizations given above, relational nouns may grammaticalize to adpositions and remain in their original position. However, Lehmann (: ) does not consider grammaticalization as the ultimate cause or motivation for the order and relationship between elements of the sentence, but instead as a channel of change from the lexical to the grammatical category. Thus, grammaticalization does not serve here as a full explanation for the change. Relatedly, Greenberg himself () referred to grammaticalization within a four-part approach to diachronic typology, which consisted of a dynamicized

Introduction



state-process model, an elaborate sub-typology, intragenetic comparison, and intergenetic comparison. Within this approach, Greenberg (pp. , –) primarily saw a signiﬁcant role for grammaticalization in intragenetic comparison, where grammaticalization theory provides knowledge about the directionality of change, but also in intergenetic comparison, where grammaticalization theory does the same on a larger scale. Greenberg was mainly interested in how grammaticalization interacts with global constituent order and word order changes to produce detailed variations in word order within one language, and counterexamples to implicational universals. In the reconstructed stages of word order development in Ethiopian Semitic languages, as represented in (), the part of the chain shown in bold type constitutes a violation of implicational universal in (). ()

Pr/NG/NA → Pr/NG/AN → Pr/GN/AN → Pp/GN/AN (Greenberg : ; A=adjective, G=genitive, N=noun, Pr=preposition, Pp=postposition)
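The chain above can be checked mechanically against the preposition/genitive generalization quoted earlier. The following is only a minimal sketch: it assumes that the universal at issue is the one pairing prepositions with noun-genitive (NG) order and postpositions with genitive-noun (GN) order, and the function name is purely illustrative.

```python
# Reconstructed stages, written as adposition type / noun-genitive order / noun-adjective order.
STAGES = ["Pr/NG/NA", "Pr/NG/AN", "Pr/GN/AN", "Pp/GN/AN"]

# Assumed reading of the universal: prepositions go with NG, postpositions with GN.
EXPECTED_GENITIVE_ORDER = {"Pr": "NG", "Pp": "GN"}


def violates_adposition_genitive_universal(stage: str) -> bool:
    """Return True if the stage combines an adposition type with the unexpected genitive order."""
    adposition, genitive_order, _adjective_order = stage.split("/")
    return genitive_order != EXPECTED_GENITIVE_ORDER[adposition]


violating = [s for s in STAGES if violates_adposition_genitive_universal(s)]
print(violating)  # ['Pr/GN/AN']
```

Under this reading, only the intermediate stage in which genitive order has already shifted while the adposition is still a preposition conflicts with the generalization.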

This violation can be explained through the interplay of principles of grammaticalization and global constituent order and word order change in this language. Noun–adjective order is the type of word order that is the least stable and the most susceptible to change. It will change first, followed by noun-genitive order, while grammaticalization from relational noun to adposition will take the most time, and therefore adpositions will lag behind, leading to the apparent violation of universals.

A second case in which grammaticalization is at least partially responsible for typological structures is the order of affixed material. There are at least two related topics in linguistic typology: the suffixing preference and morpheme order in complex words. It has long been known that there is an overall tendency in languages to prefer suffixes over prefixes, the ‘suffixing preference’. As Bybee et al. () showed, this tendency even holds in the majority of head-initial, especially SVO, languages. Table . shows the overall preference for suffixation, irrespective of basic word order, according to cross-linguistic data by Dryer (). Although there is also more recent literature, perhaps the most detailed study on the extent of the suffixation preference and its causes is still found in Bybee et al. (). In this study, the authors give an overview of the suffixation preference, overall and by

T .. Preﬁxing versus sufﬁxing in inﬂectional morphology (Dryer : ) No. of languages Little or no inﬂectional morphology Predominantly sufﬁxing Moderate preference for sufﬁxing Approximately equal amounts of sufﬁxing and preﬁxing Moderate preference for preﬁxing Predominantly preﬁxing

     

Total






word-order type, investigate a number of psycholinguistic (processing) and phonological factors possibly leading to the suffixing preference, and find that none of these factors can account for it. They conclude that the ‘fossilized syntax hypothesis’ explains the suffixing preference best (Bybee et al. : ). The ‘fossilized syntax hypothesis’ goes back to the idea by Givón () that ‘yesterday’s syntax is today’s morphology’. It states ‘that the position of an affix is the same as the position of the non-bound lexical or grammatical material from which the affix developed’ (Bybee, Pagliuca, and Perkins : ). The idea of fossilized syntax led to the revival of interest in grammaticalization from the s. Strikingly, fossilized syntax does not necessarily reflect normal word order. The most conspicuous example is that person endings are suffixed even in SOV languages, where subjects precede verbs. In response to this problem, it has been hypothesized that it is unstressed subject pronouns postposed to the verb, rather than pronouns in their normal position, that get grammaticalized (cf. Bybee et al. : ). Further research has shown that within person paradigms, there is a prefixing preference for very small and very large paradigms, while medium-size paradigms are predominantly suffixing (Cysouw ). Cysouw concludes: ‘The big riddle of the suffixation preference thus actually consists of various smaller-scale riddles concerning different kinds of affixation asymmetry’ (p. ). Mithun () similarly argues that the ultimate answer to the suffixing preference must be sought in the history (i.e. grammaticalization) of the individual morphemes that as an aggregate make up the suffixing preference.
Generally speaking, we may safely assume that the position of a bound morpheme with respect to the lexical stem reflects the order of the erstwhile independent word vis-à-vis this host, unless one can make the well-founded assumption that a morpheme changed its position after grammaticalization. While Harris and Campbell (: –) refer rather abstractly to cases where reanalysis in grammaticalization-like changes led to a change in position, these cases should be considered as exceptional, since bound morphemes are far less mobile than independent ones. However, a change of position of a bound morpheme becomes more likely if the morpheme goes through a clitic stage. As Comrie (: –; : –) observed, the position of clitics is freer than that of other bound morphemes, and, for prosodic reasons, clitics may follow rules differing from those for independent words. Thus, if a clitic stage is involved in the grammaticalization of a morpheme, the likelihood rises that morpheme order does not reflect erstwhile word order. In any case, it does seem that grammaticalization is the most important mechanism behind the suffixing preference. Of course, grammaticalization as such usually cannot explain why an element is in a position before or after a lexical stem at the time when it grammaticalizes.

Beyond the phenomenon of the suffixing preference, morpheme order among affixes in morphologically complex languages is also an intriguing problem that is not fully resolved. But grammaticalization appears to be at least one important motivation, as has been shown by Mithun () in her study on the Navajo verb. The order of morphemes poses a number of difficulties for explanation. For example, (i) languages with verb-final syntactic structure are expected to be suffixing.




Verb-final Navajo, by contrast, is exclusively prefixing. (ii) Mutually dependent morphemes should be contiguous, but in Navajo, some are scattered throughout the verb. (iii) Inflectional affixes are expected to occur further away from the root than derivational affixes, but in Navajo, derivational and inflectional prefixes are interwoven. (iv) Paradigmatically related affixes are expected to occur in the same position in a template, but they do not do so in Navajo. According to Mithun (: –), these phenomena are especially challenging. Furthermore, previous attempts at explanation—for example in terms of syntax (the ‘mirror principle’, Baker ), semantic scope (Rice ), and a combination of syntactic and phonological principles (Hale )—have failed. Instead, there is good evidence that the order of morphemes can be explained as the order in which these morphemes grammaticalized. Mithun () notes an increase in () phonological reduction, () generality and abstraction, and () diffuse meaning, from left to right in the template. In other words, there is an increasing degree of grammaticalization from left to right. Therefore, ‘the positions of prefixes in the verb correlate with their age: those closest to the stem are the oldest, and those furthest the youngest’ (Mithun : ). Hence it is grammaticalization that best explains morpheme order. Thus, in contrast to the case of the suffixing preference, in the case of morpheme order—at least in Navajo—grammaticalization appears to be the immediate cause.

The third area in which grammaticalization has been found to motivate typological patterns concerns grammaticalization as the source, and hence also the explanation, of cross-linguistically recurring types of expression of certain grammatical categories. Many linguistic categories are cross-linguistically expressed by a limited number of structural types. These structural types in turn are the product of grammaticalization.
Furthermore, the source and the degree of grammaticalization of these structures can explain at least some of their morphosyntactic and semantic features. Especially prominent research linking typological patterns with grammaticalization is associated with Heine (a; Heine and Kuteva ) and Bybee (a; Bybee, Perkins, and Pagliuca ). In the following, we present a number of examples.

Indefinite articles. According to Heine (; Heine and Kuteva ), about  per cent of all indefinite articles cross-linguistically are derived from the numeral ‘one’. This explains some positional tendencies of indefinite articles, the fact that they are often confined to singulars, and the following implicational hierarchy for their application: mass noun > plural noun > singular noun.

Possessive constructions. Heine (a) identified the eight cross-linguistic source schemas for possessive constructions that are listed in ().

()
Action       X takes Y
Location     Y is located at X
Companion    X is with Y
Genitive     X’s Y exists
Goal         Y exists for/to X
Source       Y exists from X
Topic        As for X, Y exists
Equation     Y is X’s (Y)




Crucially, these source schemas can account for some characteristics of speciﬁc possessive constructions, such as why they often have non-verbal, or copular-like, predicates, how the possessor is encoded in a speciﬁc language, i.e. as a comitative, locative, etc., or why they frequently have locative morphology etc. (cf. Heine a: –). Furthermore, these constructions often undergo a development at the end of which () the possessor precedes the possessee, () the possessor has properties of a subject, and the possessee has properties of a clausal object, () the possessor is deﬁnite and the possessee is indeﬁnite. Therefore, possessive constructions often display ‘hybrid’ properties between source and target structures depending on the stage of development (Heine a: –). In this manner, grammaticalization as a process can account for the properties of these constructions. Future. In her analysis of the cross-linguistic polysemy of future morphemes, Bybee () found that the polysemy can be explained by reference to their diachronic evolution, which takes place in the form of paths, such as from movement (‘go’ or ‘come’) to intention, then to prediction, the core future meaning, and further to other meanings such as supposition or imperative. According to Bybee (: –), such paths of development are explanatory because they explain () why it is difﬁcult to ﬁnd a single abstract meaning for a polysemous morpheme (like many future morphemes), () the cross-linguistic similarities of grammatical meanings by similar paths of development and principles of historical change, and () differences between morphemes in different languages with reference to different lexical sources and differing extent of change along the universal paths of change. Furthermore, () they make it possible to predict possible combinations of meanings, and () they allow the reconstruction of the lexical sources of grammatical morphemes. Passives. 
According to Givón (, ), six common types of passive constructions can be identiﬁed: () the adjectival-stative passive (e.g. English common passive); () the reﬂexive passive (e.g. English get-passive); () the serial-verb adverse passive (e.g. Chinese); () the VP-nominalization passive (e.g. Ute); () the leftdislocation-cum-impersonal-passive (e.g. Kimbundu); () the zero-anaphora passive (e.g. Sherpa). Types ()–() are so-called promotional passives (i.e. the erstwhile object is promoted to subject), while ()–() are non-promotional. Note that the concept of passives applied here is fairly broad; not every study of passives would include the same range of constructions. In any case, the concrete structural properties of these passive structures in individual languages can be explained by the degree to which they have grammaticalized to more prototypical passives. For example, non-promotional passives can eventually become promotional as subject properties gradually shift to the object-patient. Similarly, oblique agents may eventually be added. In this way, types of passives and their morphosyntactic features can be explained with reference to their source construction and their degree of grammaticalization. Heine (Chapter  this volume) refers to a number of avenues of grammaticalization that are prevalent in Africa and belong to limited sets of schemas accounting for the large majority of grammaticalizations of speciﬁc grammatical categories cross-linguistically. One example of this process is verbs of action being appropriated for the expression of comparison; another is de-andative futures.




Besides the cross-linguistically common schemas, some chapters also discuss typical sources for various grammaticalizations such as posture verbs for aspectual meanings and ‘say’-verbs as quotatives (Arkadiev and Maisak, Chapter  this volume; Moyse-Faurie, Chapter ), adversative constructions for passives (Coupe, Chapter ), or verbs of transfer marking beneﬁciaries or recipients (Chapter ). But there is also the occasional outlier like the polygrammaticalization of ‘return’ into reﬂexive and reciprocal markers, prepositions (‘until’) and conjunctions (‘then’), or the grammaticalization of ‘go down’ into reﬂexives and reciprocals in Oceanic languages (Chapter ). A topic brought up in several contributions to this volume is the role of grammaticalization in the change from synthetic to analytic structures, and back to synthetic structures over longer periods of time, and relatedly, the genesis of inﬂections as typical synthetic structures. Haspelmath (Chapter ), alluding to historical linguists of the th and early th century, coins the term ‘anasynthetic spiral’ for this large-scale type of change. While its application to languages as a whole is controversial, the occurrence of anasynthetic change in speciﬁc categories in speciﬁc languages is much easier to demonstrate. Haspelmath brings up a number of examples and possible motivations for the anasynthetic spiral, and concludes that the extravagance and inﬂation model of grammaticalization is best suited to account for it. Likewise, Esseesy (Chapter ) states that ‘[g]rammaticalization has been shown to facilitate the change from the direction of synthetic to analytic in several Semitic languages [. . .] and facilitates the transition from one state to another and in some cases perhaps back in a cyclical fashion’. 
Haig (Chapter ) also emphasizes the role of grammaticalization in cyclical typological change in Iranian languages, suggesting that ‘the history of grammaticalization can to some extent be seen as the gradual re-acquisition of lost morphological categories’. Part of Haspelmath’s study is the presumptive role of a ﬂectional-fusional stage leading from agglutinating to isolating structures, and he bemoans the fact that crosslinguistically the origins of ﬂectional/fusional patterns are mostly unknown. This topic is echoed in other chapters in a more concrete form. Haig (Chapter ), looking at historical data from Iranian languages, concludes that ‘inﬂectionalization is evidently a process that requires millennia, not centuries, to achieve’. That is, historically observing an entire process of inﬂectionalization from lexical item to inﬂectional ending would require a time-depth of data that extant historical records of languages do not afford us. Thus, Haig writes, ‘the assumed ﬁnal stage of grammaticalization, namely into fully-ﬂedged inﬂection, is an exceedingly slow process indeed, taking millennia before all traces of the lexical, or at least non-inﬂectional, origins of grammatical formatives are lost.’ This may be one reason why even in Indo-European languages, which have been the cornerstone of grammaticalization research, very few inﬂectionalizations have been observed historically, as Dahl (Chapter ) remarks, thus relativizing the contrast between East mainland and Southeast Asian languages with little morphological grammaticalization, on the one hand, and European languages, on the other. Likewise, pointing to the required time depth for morphological grammaticalizations, Mushin (Chapter ), points out with respect to Australian languages: ‘Few




grammatical categories are regularly marked by forms whose lexical source is still available as a free form. [. . .] It is therefore challenging to ﬁnd clear comparative evidence of contemporary bound afﬁxal forms that in some languages may retain features of their lexical origins.’ Often, the best that comparative evidence can provide is to identify different stages in the process of grammaticalization of the same categories (in this case, clitic constructions) in related languages. The slow pace of the genesis of inﬂections contrasts with a potentially rapid pace of their decay. With respect to the Iranian languages, Haig (Chapter ) concludes: ‘Inﬂectionalization is evidently a process that requires millennia, not centuries, to achieve, though paradoxically, its loss can be quite rapid, even catastrophic.’ While morphologization and especially inﬂectionalization may generally take a very long time, the emergence of the expression of new grammatical categories through independent morphemes or clitics can be rather quick given the right circumstances, as shown by the high pace of grammaticalization in creole languages (Smith, Chapter , and McWhorter, Chapter ). Beyond the speciﬁc issue of creoles, it is clear that the rapid emergence of new categories, or a profound change in the typological proﬁle of a language, is more likely to be brought about by intense language contact than by the primarily languageinternal changes that are at the core of the idea of an anasynthetic spiral. Here as well, grammaticalization plays an important role. Cases in point are the development of classiﬁers (through grammaticalization) in Papuan languages (Klamer, Chapter ), and of evidentials as a category in Tariana (Aikhenvald, Chapter ), an Arawak language in strong contact with Tucanoan languages.

1.4 CONCLUSION AND ORGANIZATION OF THE BOOK

The studies in this book deal both with cases where typological features of language apparently influence grammaticalization paths and with cases in which grammaticalization creates typological features. As for the first, variation in grammaticalization is most obvious with respect to phonological and morphological aspects of grammaticalization, but may also pertain to syntax and semantics. The most striking case where typological features of a language constrain morphological and phonological grammaticalization is still the tendency of isolating languages not to develop affixal material and grammatical paradigms (Ansaldo, Bisang, and Szeto, Chapter ). However, the extent to which grammaticalization differs in these languages may not be as great as is sometimes thought (cf. Dahl, Chapter ). Furthermore, if the core of grammaticalization is semantic (functional) change, as argued in Heine (Chapter ) and Narrog (b), then the morphological aspects are more peripheral, although still of interest. On the semantic and syntactic side, it seems that there is generally a tendency in languages to follow already trodden grammaticalization paths and reproduce or flesh out established grammatical categories, often to a considerable extent, rather than to create entirely new structures and categories. A salient departure from this tendency towards conservatism, or inertia, is most likely to take place under intense language




contact, as has been demonstrated perhaps most clearly in Aikhenvald’s () analysis of Amazonian languages.

As for the case of grammaticalization creating language structures of typological relevance, it has been found that the order of grammatical morphemes can be explained by their position at the time of their grammaticalization. This seems to hold for morphologically bound as well as for non-bound morphemes. Second, commonalities and divergences in the coding of grammatical categories across languages seem to be motivated to a large degree by grammaticalization. Thus, as Bybee (: ) put it, ‘grammaticalization has great potential for explaining the similarities as well as the differences among languages’. A third area where grammaticalization strongly contributes to typological features of languages is the cycle—or spiral—between analytic and synthetic language structures that clearly takes place in the life of specific categories in specific languages, but perhaps even at the level of overall structure of a specific language. This cycle/spiral feeds on the mechanisms of grammaticalization. A complicating factor in documenting full cycles or spirals is that while the decay or disappearance of extant morphological marking can be very quick, its development may take a very long time, especially when it comes to full-fledged morphologization such as inflectionalization. Lastly, grammaticalization is an important player when languages develop new categories or a new typological profile under intense language contact.

There is a strong tendency for languages in contact to adopt features from each other and thus develop a similar typological profile. This is a major reason why we decided to organize this book mainly in terms of language areas. The order of chapters in this book starts with Africa and then proceeds towards the north and the east, roughly but not exactly mimicking the possible spread of mankind.
The chapters on African and Iranian languages by Heine, Esseesy, and Haig are followed by chapters on European and Caucasian languages. The chapter by Haspelmath is classiﬁed as a ‘European’ chapter although it tackles a general issue, because this general issue has grown out of traditional European linguistics and evidence has been brought up mainly from European languages. We then proceed towards Turkic languages (Johanson and Csató, Chapter ), and as an eastward extension, part of the group of so-called Transeurasian languages, Korean and Japanese (Narrog, Rhee, and Whitman,Chapter ). Moving a step back westwards, Coupe (Chapter ) deals with South Asian languages, and Ansaldo, Bisang, and Szeto (Chapter ) with issues in East Asian mainland and Southeast Asian languages. These contributions are followed by a cluster of chapters on Papuan (Klamer, Chapter ), Australian (Mushin, Chapter ), and Oceanic (Moyse-Faurie, Chapter ) languages. Finally, across the Paciﬁc, three chapters deal with indigenous American languages—Mithun (Chapter ) for North America, and Aikhenvald (Chapter ) and Zariquiey (Chapter ) for South America—and the last two chapters by Smith and McWhorter have (American) creole languages as their topic.

2 Grammaticalization in Africa: Two contrasting hypotheses

BERND HEINE

2.1 INTRODUCTION

2.1.1 A note on methodology

Work on grammaticalization is based on historical reconstruction, and the safest way to achieve reconstruction is by drawing on historical documents that provide information on earlier states of language use. However, restricting the study of grammaticalization to written languages would mean that more than  per cent of the world’s languages would have to be excluded. We therefore also adopt an alternative but well-established methodology of reconstruction that has been employed mostly for unwritten but also for written languages. This methodology relies mainly on three components: (a) diachronic reconstruction, e.g. by means of the comparative method, (b) internal reconstruction, and (c) typological generalization.

The following example may illustrate this methodology (see also Heine : ). The Bantu language Swahili of eastern Africa has a future tense prefix -ta-, which is hypothesized to be historically derived from the volition verb -taka ‘want’ on the basis of the following evidence. By using (a) it is possible to establish that the verb must be older than the future tense marker: the application of the comparative method shows that the verb -taka is a modern reflex of the Proto-Bantu verb *-càk-a ‘desire’, while it is not possible to reconstruct the future tense marker back to Proto-Bantu (Guthrie –). (b) Internal reconstruction suggests, for example, that the earlier form of the tense marker is likely to have been -taka- since the form -taka- is still retained in relative clauses. Method (c) allows for two kinds of generalization. First, it establishes that verbs of volition (‘want’, ‘desire’) quite commonly give rise to future tense markers in the languages of the world, the English will-future being a case in point (see Heine and Kuteva , WANT > FUTURE).
And second, processes of this kind tend to involve specific types of semantic, morphosyntactic, and phonological change: loss of lexical in favour of grammatical meaning (desemanticization), loss of morphosyntactic properties, such as loss of word status (decategorialization), and loss of phonetic substance (erosion). On the basis of these methodological tools it is possible to formulate a strong hypothesis to the effect that the Swahili future tense marker -ta- is the result of a common grammaticalization process, having lost its lexical meaning of volition (desemanticization), its status as an independent verb (decategorialization), and part of its phonetic substance, being reduced from -taka to -ta- (erosion). To conclude, while it is always desirable to search for historical records, we argue that such records are not a requirement for the reconstruction of grammaticalization processes.

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Bernd Heine . First published  by Oxford University Press
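The three parameters applied to the Swahili case (desemanticization, decategorialization, erosion) can be illustrated with a small sketch. The record type, its field names, and the way each parameter is tested are assumptions made purely for illustration; in particular, erosion is approximated here simply as a reduction in the number of letters of the form.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    form: str               # e.g. "-taka"
    gloss: str              # rough meaning label
    lexical_meaning: bool   # does the item still carry lexical content?
    independent_word: bool  # does it still have the status of an independent verb?


def segment_count(form: str) -> int:
    # Count letters only, ignoring the hyphens that mark morpheme boundaries.
    return len(form.replace("-", ""))


def parameters_of_change(source: Stage, target: Stage) -> list[str]:
    """List which of the three parameters distinguish the target from the source."""
    changes = []
    if source.lexical_meaning and not target.lexical_meaning:
        changes.append("desemanticization")
    if source.independent_word and not target.independent_word:
        changes.append("decategorialization")
    if segment_count(target.form) < segment_count(source.form):
        changes.append("erosion")
    return changes


# Values taken from the Swahili example in the text.
source = Stage(form="-taka", gloss="want", lexical_meaning=True, independent_word=True)
target = Stage(form="-ta-", gloss="future", lexical_meaning=False, independent_word=False)
print(parameters_of_change(source, target))
```

For the pair -taka > -ta- all three tests succeed, which mirrors the hypothesis formulated in the text; the sketch says nothing about the comparative or internal evidence needed to establish which form is older.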

2.1.2 Grammaticalization studies on African languages: an overview

For over  years now, African languages have been the subject of studies in grammaticalization (e.g. Heine and Reh ; Heine and Claudi ; Heine and Hünnemeyer ; Heine et al. a, b; Heine a, a, b, c, , a; Heine and König ; Heine and Miyashita ; Heine and Narrog ). As all these studies suggest, many of the pathways of grammaticalization that have been recorded from other parts of the world are also documented in Africa (see Heine and Kuteva ). These studies were based mostly on a comparative methodology. Much of the work aimed at contributing to the description of African languages also had a typological perspective. Underlying this work there were a number of goals, but clearly the main goal was to explain language structure and to search for typological regularities. Since language structure is a product of language use in the past, explanations were sought mainly in the diachronic development of grammar (Heine a).

This work resulted in a number of different kinds of publication. First, there is a monographic treatment of grammaticalization in African languages in Heine and Reh (). Second, there are a number of general typological studies that include but are not restricted to grammaticalization in African languages (Heine et al. a; Heine , a, a, b; Heine and Kuteva , , , ). And third, there are typological studies each dealing with a specific domain of grammar or grammatical function. Specifically, the domains and functions focused on were: from compounding to derivation (Heine and Hünnemeyer ; Heine et al. a, ; Heine and Kuteva ); from noun to adposition (Heine et al. b); from verb to auxiliary (Heine ); from verb to complementizer (Lord ); the grammaticalization of serial verbs (Hünnemeyer ; Lord ); reflexives and reciprocals (Heine ; Schladt ; Heine and Miyashita ); comparative constructions of inequality (Heine a; Leyew and Heine ); verbal proximative aspects (Heine , a, c); the metaphorical basis of grammaticalization (Claudi and Heine ; Heine et al. a; Mkhatshwa ); and grammaticalization chains as linguistic categories (Heine ).

Finally, a considerable part of this work was dedicated to understanding the role that language contact has played in shaping the areal landscape of the African




continent. This work aimed, on the one hand, at defining Africa as a whole as a linguistic area (Heine and Leyew ; Güldemann ; Clements and Rialland ; Creissels et al. ). On the other hand, it was driven by a search for areal patternings within Africa cutting across genetic (genealogical) boundaries (Heine a; Kuteva ; Leyew and Heine ; Güldemann , a; Kießling, Mous, and Nurse ; Heine b; Heine and Kuteva ). Despite the fact that there are hardly any historical documents on earlier states of African languages, research carried out in the course of previous decades demonstrates not only that it is possible to reconstruct grammatical evolution but also that grammaticalization theory can be of help in defining processes leading to areal diffusion in Africa.

With regard to our understanding of areal diffusion, findings made in this work can be summarized thus. First, it is possible, at least on a quantitative basis, to distinguish the languages of Africa from those in other parts of the world (Heine and Leyew ). Second, there are a few linguistic macro-areas in Africa, most of all the Ethiopian area (for a summary, see Crass and Mayer , ) and the Macro-Sudanic Belt (Güldemann a). Third, there are also some micro-areas, such as the Tanzanian Rift Valley (Kießling et al. ). And fourth, in all this work on areal patternings in Africa, findings on grammaticalization have played an important role (see especially the contributions in Heine and Nurse ).

..   As the work alluded to in the preceding section suggests, grammaticalization is a well-researched topic in African linguistics—more than in some other regions of the world. The present chapter will deal with grammaticalization processes in general and more speciﬁcally with the relationship between form and meaning in such processes. To this end, we will look at African language data in order to evaluate two hypotheses that have been proposed on this issue. These hypotheses, which we will refer to as the parallel reduction (PR) hypothesis and the meaning-ﬁrst (MF) hypothesis, are now looked at in more detail.

2.1.3.1 The parallel reduction hypothesis

According to a widespread assumption, going back to the early phase of modern grammaticalization studies (Bybee and Dahl ; Bybee, Perkins, and Pagliuca : , ; Lehmann : ), meaning and form proceed in parallel; that is, there is coevolution, captured appropriately by the parallel reduction hypothesis of Bybee et al. (: ). This hypothesis, henceforth called the PR hypothesis, can be summarized in short as in ().¹

()  The parallel reduction hypothesis
    Form change parallels meaning change in grammaticalization

¹ The term ‘grammaticalization’ has been employed for a wide range of linguistic changes. In the present chapter we are restricted to ‘paradigm’ cases that arise via the evolution of grammatical categories expressing schematic functions relating to tense, aspect, modality, evidentiality, number, gender, (in)definiteness, case, subordination, etc. The term ‘meaning’ is used here in contrast with ‘form’, i.e. ‘meaning’ in this sense also includes pragmatically induced factors. That it is useful to distinguish semantics from pragmatics has been documented abundantly in the relevant literature, including the literature on grammaticalization (e.g. Bisang ).

Bybee and associates found that ‘form and meaning covary in grammaticization on a large body of data’ (Bybee et al. : ; see also Bybee, Pagliuca, and Perkins ). Distinguishing two types of form change, namely the reduction or loss of phonetic bulk and the fusion of the grammaticalizing material to surrounding material, the authors found that ‘both types of formal change in grammaticization parallel the main types of semantic change in grammaticization’ (Bybee et al. : ). On the basis of substantial cross-linguistic data on the evolution of tense, aspect, and modality, Bybee et al. (: –) in fact provide strong evidence in favour of the hypothesis in (). This evidence is based on typological analysis and comparison of data from  languages in a carefully chosen sample of the languages of the world. The data rest on grammatical descriptions of established grammatical forms in the languages concerned, i.e. on more or less conventionalized grams or, in other words, on established grammatical categories. This coevolution hypothesis has provided an important generalization on the evolution of grammaticalization. But its scope in explaining grammaticalization is restricted, as some lines of research suggest, most of all that by Bisang (; see also Bisang ). Bisang concludes that East Asian and mainland Southeast Asian languages represent a type of grammaticalization that is characterized by its limited coevolution of meaning and form (Bisang : ; but see also Ansaldo and Lim ). Note also that, as observed by Narrog (b: –), there is evidence to the effect that form change and function change in grammaticalization do not share the same motivation, and that ‘formal grammaticalization as such cannot be regarded as essential for grammaticalization’.

2.1.3.2 The meaning-first hypothesis

The PR hypothesis contrasts with that proposed by Heine, Claudi, and Hünnemeyer (a: ; see also Heine a: –) according to which change in meaning precedes change in form in grammaticalization; let us refer to this as the MF (meaning-first) hypothesis.² Observations in support of this hypothesis can be found already in some of the work on grammaticalization in the s and s (e.g. Givón , ; Lord ), and such observations have also been made in some form or other in more recent work; cf. the extravagance hypothesis of Haspelmath () or the extraclarity hypothesis of Michaelis and Haspelmath (). The hypothesis is suggested most of all by work written in the tradition of Heine et al. (a; see also Heine a), which is firmly based on the MF hypothesis. In this tradition it is argued that the main motivation underlying grammaticalization is to communicate successfully. One salient human strategy consists in using linguistic forms for meanings that are concrete, easily accessible, and/or clearly delineated to also express less concrete, less easily accessible and/or less clearly delineated meaning contents. To this end, lexical or less grammaticalized linguistic expressions are pressed into service for the expression of more grammaticalized functions (Heine, Claudi, and Hünnemeyer a: ; Heine a: –, ; Narrog and Heine ).³ The only reasonable conclusion to be drawn from this hypothesis is that interlocutors are first concerned with what they say, i.e. with meaning, before changing their habits on how they say what they say; hence, semantic change is assumed to precede formal change. On this position, the MF hypothesis can be formulated as in ():

()  The meaning-first hypothesis
    Semantic change is primary in grammaticalization and precedes form change (i.e. morphosyntactic and phonological change) in time

² This hypothesis covers also cases of grammaticalization that have undergone changes in meaning but not in form (Bisang ; see below), and where it would be correct to say ‘meaning only’ rather than ‘meaning first’. I am grateful to Heiko Narrog (p.c.) for having drawn my attention to this observation.

That () is correct has also been argued in Heine, Kuteva, and Narrog (). According to the observations made there, the only unambiguous factor that appears to account for directionality in grammatical change is the semantic relation between the source structure and the target structure, where the former is frequently but not necessarily a lexical structure. As this study suggests, other than the semantic relation between source and target there does not seem to be any other factor, such as contextual features, inferential mechanisms, analogy, or constructional form, that ultimately can be held responsible for unidirectionality in the history of, e.g., the English be going to future. Hypothesis() has also implicitly or explicitly been claimed in a number of other studies, where it is argued that semantic change drives form change in grammaticalization (e.g. Fischer : ; see Börjars and Vincent : ). It forms a central assumption of the framework of Heine et al. (a: –) and Heine (a: –), where grammaticalization is viewed essentially as a cognitive-communicative and semantic process. Accordingly, explaining this process must be ﬁrst of all with reference to meaning. That it is the meaning (or function) that drives grammaticalization has also been suggested in studies concerned with discourse functions, such as Harder and Boye (: ), who identify as a prerequisite for grammaticalization the relative usefulness of source expressions for ‘a discursively secondary role’. This usefulness accounts for their frequency and subsequent conventionalization in their secondary role.

³ This hypothesis is outlined in Heine et al. (a): ‘there is one specific principle that can be held responsible for the creation of linguistic forms serving the expression of grammatical concepts.’ This principle is referred to by Werner and Kaplan (: ) as ‘the principle of the exploitation of old means for novel functions’. By means of this principle, concrete concepts are employed in order to understand, explain, or describe less concrete phenomena. In this way, clearly delineated or structured entities and non-physical experience are understood in terms of physical experience, time in terms of space, cause in terms of time, or abstract relations in terms of physical processes or spatial relations (Heine et al. a: ).

Most commonly, the MF hypothesis is expressed implicitly rather than explicitly. For example, Michaelis and Haspelmath observe:

    Grammaticalization involves (i) semantic change, which often results in (ii) functionalization (content item > function item), and then (iii) compaction (cliticization, agglutination, fusion). (Michaelis and Haspelmath : ; bold print in the original)

This depiction appears to be in accordance with the MF hypothesis in that it implies that semantic change and ‘functionalization’ precede formal changes in the process.

..    In spite of all the work that has been done on grammaticalization in the course of the last decades, there does not seem to be conclusive evidence to decide which hypothesis is correct. This chapter is restricted to linguistic observations as they can be made in grammaticalization processes commonly observed in African languages. Grammaticalization is as a rule a long process, extending over decades and centuries, and it is inﬂuenced by many factors. Furthermore, the entry point in grammaticalization and the pace of development differ from one marker to another, and from one construction to another (Narrog : ; : –). Accordingly, when we study the evolution of some grammatical category we are confronted with a long and a complex history, which is not necessarily restricted to language-internal factors but may as well involve language contact (Heine and Kuteva , ). This history is to a considerable extent a process of semantic, morphosyntactic, and phonological reduction, as captured by diagnostic techniques such as those of Lehmann (: ), Hopper (), and Heine and Kuteva (: ; see also Heine and Narrog : ). But comparing the semantic and formal structure of modern grammatical categories with that of their non-grammaticalized sources may therefore tell us little about the motivations responsible for the rise of such categories. The question to be addressed here, therefore, does not concern the overall evolution of grammatical categories—a topic that has aptly been covered in work such as that by Bybee and associates alluded to above; rather, our interest is with the motivations of interlocutors that can ultimately be held responsible for this evolution. We will therefore look at evidence from African languages that allows us to evaluate the two hypotheses. If the PR hypothesis is correct, form and meaning should generally go together—i.e. 
once grammatical change has taken place it should simultaneously have involved both. By contrast, if the MF hypothesis is correct, there should be evidence to show that there was semantic change but no corresponding form change in grammaticalization. More speciﬁcally, there should be language data to show that at the earliest stage of the process it is the former that has taken place while the latter hasn’t (yet).⁴ Such data can be found if there are ⁴ We are not aware that an opposite hypothesis to the effect that formal change generally precedes semantic change in grammaticalization has been proposed. Hence, we do not pursue this possibility in this chapter. An anonymous reader, however, draws our attention to a claim ﬁrst made in the th century that analytic formations in Romance languages were a response to the phonetic reduction of Latin grammatical markers.



Bernd Heine

T . The context model of grammaticalization (where source meaning=nongrammaticalized, temporarily prior; target meaning=new grammatical meaning derived from the source meaning; Heine ) Stage

Context

Resulting meaning

I Initial stage

Unconstrained

Source meaning

II Bridging context

A speciﬁc context giving rise to an inference in favour of a new meaning

Target meaning foregrounded

III Switch context

A new context which is incompatible with the source meaning

Source meaning backgrounded

IV Conventionalization

The target meaning no longer needs to be supported by the context that gave rise to it; it may be used in new contexts

Target meaning only

examples of grammaticalization where there was a change from e.g. lexical to grammatical meaning but no accompanying formal change—that is, where there is ambiguity between the two kinds of meaning while their form is (still) the same. What kind of data these are can be illustrated by means of the context model proposed in Heine (), depicted in Table .. According to this model, the trigger of grammaticalization can be seen in Stage II (and to some extent also Stage III) situations of grammatical change, where in a speciﬁc context a linguistic expression is enriched by the rise of a new meaning (the target meaning) and this change does not affect the form of the expression concerned. For the present study, therefore, which is concerned with the reconstruction of the motivations underlying grammaticalization, Stage II and Stage III situations appear to be of paramount importance. The data used by Bybee et al. (: –) are for the most part not of this kind; they typically concern conventionalized grammatical categories as they surface in reference grammars—i.e. Stage IV situations. As we will see in section ., this has a bearing on the results obtained. For the purpose of the following discussions, ‘change’ will be said to obtain whenever a linguistic expression exhibits a recurrent feature that was absent in an earlier use of the same expression. This feature is semantic in the case of meaning change, while in form change the feature is phonological, morphological, or syntactic, where phonological change includes both segmental and suprasegmental features.
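The way the context model sorts individual uses into stages, and the kind of data that would speak for the MF hypothesis, can be illustrated with a small decision procedure. The following Python sketch is not part of the original analysis: the record type `Use`, its fields, and the function names are hypothetical, and treating a reading as simply available or unavailable (rather than foregrounded or backgrounded to some degree) is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    INITIAL = "I"
    BRIDGING = "II"
    SWITCH = "III"
    CONVENTIONALIZED = "IV"


@dataclass(frozen=True)
class Use:
    """One use of an expression in a given context (hypothetical record)."""
    source_reading: bool         # source (e.g. lexical) meaning available?
    target_reading: bool         # target (grammatical) meaning available?
    context_free: bool = False   # target meaning independent of the triggering context?
    form_changed: bool = False   # recurrent phonological or morphosyntactic difference?


def classify(use: Use) -> Stage:
    """Assign a use to one of the four stages of the context model."""
    if not use.target_reading:
        return Stage.INITIAL            # only the source meaning
    if use.source_reading:
        return Stage.BRIDGING           # ambiguity between source and target
    if use.context_free:
        return Stage.CONVENTIONALIZED   # target meaning only, in new contexts
    return Stage.SWITCH                 # context rules out the source meaning


def supports_meaning_first(use: Use) -> bool:
    """True for the kind of data the MF hypothesis predicts:
    a Stage II or III use whose form is unchanged."""
    return classify(use) in (Stage.BRIDGING, Stage.SWITCH) and not use.form_changed
```

On this sketch, a use that is ambiguous between source and target meaning and shows no form change counts as a bridging (Stage II) use of the kind the MF hypothesis predicts, whereas a conventionalized (Stage IV) gram drawn from a reference grammar does not bear on the question either way.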

2.2 CASE STUDIES FROM AFRICAN LANGUAGES

The following is a survey of four kinds of grammaticalization commonly found in African languages. The goal of this discussion is to evaluate the hypotheses proposed in section 2.1.3.

2.2.1 De-volitive proximatives

A grammaticalization process that is cross-linguistically widespread but has been reported to be particularly common in Africa (Heine c) concerns proximative aspect forms and constructions. The function of proximatives is to denote the temporal phase immediately preceding the initial boundary of the situation described by the main verb, common English paraphrases being ‘be about to do’, ‘being on the verge of doing’, or ‘nearly, almost’ (Heine , a, c; Romaine ). Presumably the most common though not the only pathway (see Heine c) is one involving the auxiliation of a verb of volition (‘want’, ‘desire’, etc.) where this verb turns into a proximative marker while the complement of this verb in the source structure turns into the new main verb in the grammaticalized construction. Example (), from Swahili of East Africa, illustrates the process concerned: (a) illustrates the lexical source construction, where the verb -taka has the lexical meaning ‘want’. This meaning still exists in (b), but since dying is not something that one normally wants, the grammatical meaning of proximative is foregrounded. Accordingly, in such contexts there is a proximative meaning ‘be about to’ of the bridging Stage II. This meaning is the only one in contexts where the subject referent is inanimate, i.e. where the semantics of this referent rules out the lexical source meaning of volition, as in (c). Hence Swahili has also developed a switch Stage III meaning where the proximative provides the only reading and the source meaning is backgrounded, even if it may still be recoverable, for example, in metaphorical interpretations.

()  Swahili (Bantu, Niger-Congo; Heine c: )
    a. Nilitaka ku- mpiga.
       .SG- PST- want INF- .SG.OBJ- hit
       ‘I wanted to hit him.’
    b. Nilitaka ku- fa.
       .SG- PST- want INF- die
       (i) ‘I wanted to die.’
       (ii) ‘I nearly died. I narrowly escaped death.’
    c. Mvua i- litaka kunyesha.
       rain it- PST- want INF- rain
       ‘It was about to rain.’ (*‘The rain wanted to rain.’)

In accordance with the context model of Table ., we interpret the three examples in () as each representing a different stage of grammatical evolution, where (a) illustrates the lexical source of Stage I, (b) the bridging Stage II, and (c) the switch Stage III. The grammaticalization process illustrated by this Swahili example was restricted to the manipulation of meaning in context; it did not affect the morphosyntactic or phonological forms, which both remained essentially unaffected. To conclude, already at Stage II there must have been (optional) meaning change not accompanied by change in form, in accordance with the meaning-first (MF) hypothesis in (), but not with the PR hypothesis in (). The process reconstructed above has not proceeded beyond Stage III, i.e. there is no conventionalized Stage IV construction in Swahili. Examples of proximatives like the one illustrated in () are legion in African languages; they can be said to present weakly grammaticalized categories since they have not proceeded beyond Stage III. But there are also examples in African languages where the process has proceeded further, giving rise to fully grammaticalized categories of Stage IV. We may illustrate this with an example from a language not genetically related to Swahili, namely the Maa language of the Nilotic family. The data are taken from the Chamus dialect of north-central Kenya. This dialect appears to have gone through the same stages II and III but has gone one step further, resulting in a full-fledged proximative category. Example (a) illustrates the Stage I source construction involving the volition verb -yyéú ‘want’ and (b) the Stage III target meaning of the process, where this construction occurs with inanimate subjects.⁵

()  Chamus (Maa dialect, Eastern Nilotic, Nilo-Saharan; Heine : )
    a. k- eyyéú m- partút.⁶
       k- .SG- want F- woman
       ‘He wants a woman/wife.’
    b. k- éyyeu lcáni néuróri.
       k- .SG- want M- tree NAR- fall
       ‘The tree almost fell.’ (lit. ‘The tree wanted to fall.’)

Both stages exhibit essentially the same formal structure, once again in support of hypothesis (): there has been semantic change but no formal change. But Maa speakers have proceeded beyond these stages: subsequently there has also been formal change, in that the erstwhile verb form k-e-yyéú ‘s/he wants’ developed into an invariable proximative aspect particle (k)éyyeu, illustrated in (). Thus, this verb form has undergone internal decategorialization, turning into a frozen particle; in this capacity it is exclusively a proximative marker, which can equally take inanimate and human subject referents. Thus, in addition to retaining the earlier Stage II and Stage III structures, grammaticalization has also led to a fully conventionalized Stage IV construction, where the lexical source meaning of volition is ruled out.

()  Chamus (Maa dialect, Eastern Nilotic, Nilo-Saharan; Heine : )
    kéyyeu aók nánʊ kʊlɛ́.
    PROX .SG- drink .SG.N milk.A
    ‘I was about to drink milk.’

To conclude, Chamus has acquired a new grammatical category, namely an aspect marker, via the grammaticalization of a verb of volition inflected in its third person singular imperfective form, but the new construction, illustrated in (), coexists with the earlier, weakly grammaticalized construction in (b). The rise of the Stage IV construction in () had dramatic consequences for the morphosyntactic format of the construction, which need not concern us here (see Heine : for details). Suffice it to mention the following morphosyntactic change: whereas the inflected verb form k-e-yyéú ‘s/he wants’ requires the following main verb to be encoded in the narrative tense (using the narrative inflection n-; cf. (b)), the aspect particle (k)éyyeu, illustrated in (), takes the verb in the unmarked main clause syntax. But what is of interest here is the fact that semantic change must have preceded formal change, which may be taken to suggest that the latter is an epiphenomenal effect of the former.

⁵ We are ignoring the tonal inflections to be observed in the following examples, which are morphophonologically conditioned and need not concern us here.
⁶ The prefix k- is restricted to the imperfective paradigm of verb forms; its exact meaning is unclear.

..  -     The de-volitive proximative categories looked at in section .. are in no way exceptional, as can be shown with many other kinds of grammaticalization process. In the present section we look at another process, which concerns the evolution of reﬂexive markers. A survey of reﬂexive constructions in the languages of the world suggests that reﬂexive markers are mainly the product of the grammaticalization of four kinds of conceptual processes, which are based on the strategies listed in Table .. Our concern here is exclusively with the noun strategy, which seems to be of universal signiﬁcance but is more widespread in Africa than elsewhere (Heine ; Schladt ). In accordance with this strategy, noun phrases consisting of a body (-part) noun, usually taking a coreferential possessive modiﬁer, are grammaticalized to reﬂexive markers when serving as arguments. The source noun is in most cases ‘body’, less commonly also nouns for ‘head’, and this situation does not seem to be dramatically different in other parts of the world, as the percentages in Table . suggest. T . The main strategies to develop reﬂexive markers (Heine , )

a

Label

Strategy

Pronoun strategy [uR] = ‘unmarked reﬂexive’

Use personal pronouns

b

Intensiﬁer strategy

Add an ‘intensiﬁer’ to (a)

c

Noun strategy

Use a ‘body’ noun

d

Non-transparent

(Unknown strategy)

T . Nominal sources of reﬂexive markers in Africa and elsewhere (Schladt : ; Heine, own data from  African languages,  forms) Nominal source

Africa

Other continents

Nominal source

Frequency

%

Frequency

%

‘body’ ‘head’ ‘soul/life’ Other body parts Total

    

. . . . 

 

. .

 

. 

Total

    



Bernd Heine

The following example from the Efik language of southeast Nigeria illustrates this pathway: (a) is characteristic of the lexical source structure of Stage I, where the complement ídém ‘body’ is a noun. Example (b), by contrast, is suggestive of the switch Stage III, where in the context involving the main verb ńdíwòt ‘kill’ the intended meaning of the phrase ídém ésiě (‘her body’) is no longer nominal but rather reflexive. In such contexts the lexical Stage I meaning is backgrounded and the grammatical, reflexive meaning foregrounded, while the form is still that of the lexical source construction. Once again we see that semantic change precedes formal change in grammaticalization.

()  Efik (Benue-Congo, Niger-Congo; Essien )
    a. Árìt éyě ídém.
       Arit has body
       ‘Arit has a beautiful body.’ (lit. ‘Arit has body.’)
    b. Árìt óyòm ńdíwòt ídém ésiě.
       Arit want kill body her
       ‘Arit wants to kill herself.’ (lit. ‘Arit wants to kill her body.’)

But in many African languages the process has advanced one step further, giving rise to a conventionalized Stage IV construction where formal change has also now taken place and the lexical source meaning of Stage I is no longer available. We may illustrate this situation with the following example from Yoruba of southwest Nigeria, which exhibits the whole range of stages of grammaticalization, as the description by Awolaye () suggests. In example ((i)), the noun ara ‘body’ (plus its possessive modifier wọn ‘their’) is interpreted in its lexical Stage I meaning, whereas ((ii)) is suggestive of Stage II, which appears to be an optional variant of ((i)), showing a grammatical (i.e. a reflexive) meaning. Thus, the construction has undergone semantic change by inviting a bridging Stage II interpretation, while the form seems to have remained unaffected by the change.

()  Yoruba (Kwa, Niger-Congo; Awolaye : )
    won rí ara wọn.
    they saw body their
    (i) ‘They saw their bodies.’
    (ii) ‘They saw themselves.’

..       Cross-linguistically there is a wide range of constructions used to express comparisons between two different items, and there are a number of different comparative concepts that tend to be distinguished. Our concern in this section is with only one of these concepts, namely with the comparative of inequality, or the superior comparative as it has also been called (Stassen ).

Grammaticalization in Africa



T . The main event schemas used for encoding comparative constructions (see Heine a: ) Form of schema

Label of schema

X is Y surpasses Z X is Y at Z X is Y from Z X is Y to Z X is Y, Z is not Y X and Z, X is Y

Action Location Source Goal Polarity Topic

Comparison is a relatively abstract concept, and, as we argue here, expressions of comparison are likely to be historically derived from more concrete meanings via grammaticalization. These meanings have been described in Heine (a) in terms of conceptual templates, called event schemas. Cross-linguistically there is only a small set of event schemas that tend to be recruited to grammaticalize comparative constructions; the most common of these schemas are summarized in Table .. In accordance with these schemas, comparatives are built on concepts such as action, where the standard of comparison is presented by means of an action verb (Action Schema), location (Location Schema), source or ablative (Source Schema), direction or benefactive (Goal Schema), an antonymic relation (Polarity Schema), or in terms of thematic conjuncts (Topic Schema). In principle, speakers of a given language may select any of the schemas listed in Table . to develop a new comparative construction; and in many languages, more than one schema has been grammaticalized. It would seem, however, that neighbouring linguistic communities are more likely to draw on the same schema than are communities living at some distance from one another. This is suggested by the fact that there are geographically deﬁned macro-areas where a preference for a speciﬁc kind of schema can be observed. Table . summarizes the results of a cross-linguistic survey of these constructions, carried out by Stassen ().⁷ Our interest here is with the macro-area of Africa, which exhibits a clear preference pattern: according to Table ., more than half of all African sample languages ( per cent) have grammaticalized the Action Schema to a comparative construction. 
But perhaps more significantly, almost two thirds ( per cent) of all languages of the worldwide sample in Table . having made use of this schema are spoken in Africa.⁸

⁷ The sample of  languages has been established on what Stassen () argues is a genetically and areally balanced selection of the world’s languages.
⁸ Another linguistic area where the Action Schema (‘surpass comparative’) provides the main source of grammaticalization for comparatives of inequality is mainland Southeast Asia (Ansaldo , ).

Table . Event schemas serving as sources for the grammaticalization of comparatives of inequality (primary options only; sample:  languages of worldwide distribution; Stassen ; Heine a: ). Columns: Europe; Asia; Africa; The Americas; Indian/Pacific Ocean; Total. Rows: Action, Location, Source, Goal, Polarity, Topic, Opaque schemas*, Total.
* ‘Opaque schemas’ are conceptual sources whose genesis is etymologically not recoverable.

As we wish to show now, the context model depicted in Table . also applies to the grammaticalization of comparatives of inequality. In most cases, the Action Schema involves a verb meaning ‘defeat’, ‘surpass’, or ‘pass’, but in a few languages there is a verb for ‘leave (behind)’ instead. In the !Xun language of southwestern Africa, two of these verb types have been grammaticalized. !Xun (or Ju), formerly known as Northern Khoisan, belongs to the Kx’a family (Heine and Honken ); the data presented below are taken from the W dialect of !Xun spoken in Ekoka of northern Namibia (Heine and König ). The examples presented in () illustrate the first pathway, involving the verb n̏|hūnyā ‘leave (behind)’, where (a) represents the lexical and (b) the grammatical meaning of a marker denoting the standard of comparison. Which of the two meanings is expressed depends on the context in which these verbs are used. The context illustrated in (a) highlights the spatial meaning of the movement verb n̏|hūnyā, hence the lexical meaning of Stage I of the context model (Table .) surfaces in this example. In (b), by contrast, the manner of the action performed is foregrounded; accordingly, the only reasonable interpretation for !Xun speakers is one with reference to Stage III, where the lexical source meaning is backgrounded in favour of the grammatical meaning.

()  !Xun (W dialect; Kx’a; Heine and König , ; König and Heine )
    a. Cālò má kē n̏|hūnyā hȁ n!āō.
       Calo TOP PST leave his house
       ‘Calo left his house.’
    b. hȁ má ḿ n̏|hūnyā mí.
       N TOP eat leave .SG
       ‘He eats more than I.’

Essentially the same pathway can be reconstructed for the second verb, !’ālā ‘pass (by)’. Thus, in (a) and (c) essentially the same applies as in (a) and (b), respectively. But in this case there is an intermediate stage, i.e. a bridging stage situation of Stage II, where !’ālā is ambiguous between the lexical meaning (b (i)) and the grammatical meaning (b (ii)). The difference between (b) and (c) lies in the context provided by the preceding verb: whereas !!’hùȁm ‘run’ can be interpreted in this sentence with reference either to its meaning of spatial movement or to the manner of action, a spatial interpretation is ruled out with the stative verb nǁā’à ‘be big’; hence, (b) is ambiguous while (c) does not allow for a spatial interpretation: there is only the grammatical concept (‘more than’) of a comparative marker.

()  !Xun (W dialect; Kx’a; Heine and König , ; König and Heine )
    a. mí má kē !’ālā n!āō.
       .SG TOP PST pass house
       ‘I passed by the house.’
    b. !xó má !!’hùȁm !’ālā gùmì.
       elephant TOP run pass cattle
       (i) ‘The elephant runs, overtaking the cow.’
       (ii) ‘An elephant runs better (or faster) than a cow.’
    c. !xó má nǁā’à !’ālā gùmì.
       elephant TOP be.big pass cattle
       ‘An elephant is bigger than a cow.’

There is no conventionalized standard marker in !Xun, i.e. grammaticalization has not proceeded beyond Stage III. As in sections 2.2.1 and 2.2.2, we hypothesize that there was a change leading from a lexical meaning (Stage I) to a schematic, grammatical meaning without a corresponding change in form: as far as can be ascertained, both items, !’ālā ‘pass (by)’ and n̏|hūnyā ‘leave (behind)’, are phonologically and morphosyntactically identical, irrespective of whether the lexical source or the grammatical target meanings are expressed. Once again, there is support for the MF hypothesis, according to which grammaticalization was triggered by context-induced semantic change with no corresponding formal change.

..  -  The ﬁnal case concerns a pathway of grammaticalization that is widespread in Africa but presumably equally widespread in other parts of the world (see e.g. Bybee et al. ; Bybee, Perkins, and Pagliuca ; Heine and Kuteva ), examples of it can be found in European languages such as English or French. Our example is also taken from the !Xun language of southwestern Africa, but this time from the N dialect spoken in southeastern Angola. This dialect uses a weakly grammaticalized future tense, showing the grammaticalization pathway from a lexical construction involving the verb ú ‘to go’, illustrated in (a), via an ambiguity Stage II situation in (b), where the grammatical meaning of future tense is foregrounded but the lexical meaning is still available. This is the situation that obtains with most kinds of verbs serving as complements of ú ‘go’ in this dialect. However, when the meaning of the complement verb is semantically incompatible with that of ú ‘go’, then the lexical source meaning is ruled out. This is the case in (c), where the spatial deixis of the complement verb tcí ‘come’ is incompatible with that of ú ‘go’; in such a



Bernd Heine

context, ú is exclusively a future tense marker—i.e. we are dealing with a Stage IV situation.⁹ ()

!Xun (N dialect; Kx’a; own data) a. yà gǀúyē hā kē ū- ā. N yesterday PROG PST go- TR ‘He was going yesterday.’ b. yà ǀōā úá ‘ḿ. N NEG go/FUT- TR eat (i)‘He doesn’t go to eat.’ (ii) ‘He will not eat.’ c. yà ǀōā úá tcí. N NEG FUT- TR come ‘He will not come.’

This ﬁnal case provides another example where grammaticalization was apparently triggered by meaning change arising in speciﬁc contexts. And once again, there is no evidence that this change affected the form of the construction.

2.3 OBSERVATIONS ON SINITIC LANGUAGES

There is conceivably a problem with the MF hypothesis, surfacing from an analysis by Ansaldo and Lim (), who demonstrate that even in isolating tone languages grammaticalizing words undergo subtle changes in pronunciation. They demonstrate that in two morphologically strongly isolating and tonally complex Sinitic languages, namely Cantonese and Hokkien as spoken in Singapore, function words show vowel/syllable reduction and erosion typical of grammaticalization. The grammaticalization pathways analysed are summarized in (), where the forms to the left of the arrow are the lexical and those to the right the grammaticalized forms.

()
a. Cantonese gwo³³ ‘to pass/cross’ > comparative (SURPASS) marker
b. Cantonese dou³³ ‘to arrive’ > resultative verb
c. Hokkien ho³³>²¹ laŋ²⁴>³³ ‘give people’ > causative, permissive, passive marker
d. Hokkien khi²¹ ‘to go’ > perfective aspect marker

On the basis of their phonetic experiments, Ansaldo and Lim (: –) conclude, ‘[I]t is beyond doubt that phonetic erosion can be consistently observed in the grammaticalized morphemes while it does not occur, or occurs to a lesser degree, in the lexical occurrences.’

⁹ The verb ú ‘go’ is intransitive, and in order to be transitivized it takes the transitive suffix -ā (T). In (a), the suffix is required because the preposed adverb gǀúyē ‘yesterday’ is treated as its complement. The mid tone of the suffix can either assimilate a preceding high tone to mid, as in (a), or be assimilated to high tone, as in (b,c). These assimilations appear to be optional; they have been observed in both lexical and grammatical uses of the item ú (Heine and König : –).




The question then is: do these ﬁndings lend support to the PR hypothesis outlined in section ..? It would seem that the answer is in the negative, for the following reasons. First, the cases discussed by Ansaldo and Lim () appear to have been recognized as conventionalized grammaticalizations by grammarians who have described these languages, representing Stage IV situations according to Table .. This is different in the case of incipient stages of grammaticalization like the ones discussed in section ., which are not normally recognized by grammarians as distinct grammatical forms. Second, in some of the forms grammaticalization has proceeded to the extent that the grammatical forms have been generalized and the lexical forms are restricted to speciﬁc contexts. Thus, Cantonese gwo³³ has only limited lexical uses left, and for Hokkien ho³³>²¹ laŋ²⁴>³³, the default meaning is passive rather than ‘give people’. And third, as far as is shown by the data provided by Ansaldo and Lim (), none of these cases is suggestive of an ambiguity in Stage II use—one that could be taken as evidence of the presence of an incipient stage of grammaticalization. To conclude, the forms analysed by Ansaldo and Lim () do not seem to be weakly grammaticalized Stage II or Stage III forms; rather, they all seem to be conventionalized Stage IV constructions. Accordingly, it comes as no surprise that they have undergone erosion, i.e. they have become phonologically detached from their lexical counterparts, and they tend to occur as elements ‘of a syntactically and semantically closer/tighter phrasal unit’ (Ansaldo and Lim : ). This means that these examples are not really relevant to the issue discussed here, viz. the motivations that interlocutors have at the initial stage of grammaticalization, prior to their conventionalization.

2.4 DISCUSSION

In the case studies discussed in section . it was argued that there was change in meaning but not in form. The question then is whether the data presented are really appropriate to allow for such a conclusion. Form change was defined in section . in a broad sense as including phonological, morphological, and syntactic features. Presumably the least controversial of these are morphological features. Ignoring context-related differences, the morphological makeup and the compositionality of the constructions concerned are essentially the same, irrespective of whether there is a Stage I, Stage II, or even a Stage III situation. It is only at Stage IV that there may also be morphological change, as we saw in the Chamus example of section .. (ex. ()). It would seem that much the same applies to syntax. Linguistic expressions generally tend to be associated with a range of different contexts in which they are used, and each of these contexts may have its own syntactic constraints. The factor that distinguishes the cases discussed in section . from many other cases of context-induced variation is that there is one context that triggers an optional interpretation in terms of a grammatical meaning (Stage II of the context model in Table .). For example, it was observed in section .. that the switch Stage III use of the Swahili proximative requires a context where the subject referent of the verb taka ‘want’ is inanimate. This syntactic constraint appears to be a necessary but not a sufficient condition for this grammaticalization process; it also applies to some uses of the verb -taka which do not involve proximative meaning, such as the following: ()

Swahili (Bantu, Niger-Congo; own data)
Barua hii i- nataka ku- fika kesho.
letter this it- PRES- want INF- arrive tomorrow
‘This letter must arrive by tomorrow.’

Thus, there is reason to argue that in context extensions such as those discussed in section . the lexical source construction was enriched with another context, rather than there being a syntactic change. In this respect, the cases examined in section . differ from the Cantonese and Hokkien examples of conventionalized grammatical functions mentioned in (), which are each associated with their own syntactic format, as described by Ansaldo and Lim (: ). The situation is different in cases where this interpretation acquires higher frequency of use and becomes entrenched as a new, regular use pattern. Once this happens, it is likely to also have morphosyntactic and/or phonological implications, as we saw in our Chamus example in section .. (ex. ()) or the Sinitic examples of section . (ex. )). But such cases are not the concern of this chapter, which is restricted to the question of what motivates grammaticalization and, hence, to the initial phase of the process; in this phase, there appears to be neither morphological nor syntactic change. Much the same presumably applies also to phonology: Whenever there is a new semantic interpretation of Stage II, this is unlikely to be paralleled by corresponding phonological change. The main evidence that we have in support of this claim is the following. When we asked our language consultants on the data in sections .. and .. whether the ambiguous Stage II forms of Swahili in (b) and of !Xun in (b) were different depending on whether the lexical meaning in (i) or the grammatical meaning in (ii) was intended, their reaction was unanimously in the negative. Also, when we ourselves pronounced (b) and (b), they would declare that the only difference they could see was one of meaning. To be sure, this is not compelling evidence, especially considering that there are alternative observations that have been made in the relevant literature. 
For example, already in  Givón had suggested:

Temporal-intonational packaging is the oldest, most subtle, most iconic and most ubiquitous element in syntactic structure. It takes place almost automatically and is extremely sensitive to the cognitive dimensions of information processing and chunking. (Givón 1991: 123)

Having argued earlier in favour of the MF hypothesis (Givón , , a), Givón () concludes that there may also be support for the PR hypothesis in light of prosodic and related considerations. Furthermore, reacting against the ‘no form–meaning correspondence in SE Asian languages’ hypothesis of Bisang (), Ansaldo and Lim () show by means of experimental evidence that in addition to meaning change there are also subtle changes in phonology, such as stress and vowel length.¹⁰ The data examined by the authors just cited seem to concern examples where fairly well-established grammatical use patterns are involved, and where one might not be surprised to find that change has already proceeded from meaning to form. At the same time, the hypothesis proposed here does not seem to be at variance with the quantitative data presented by Bybee et al. () in support of the PR hypothesis, for the following reason. Our concern is with the motivations of interlocutors in the initial phase of the process, and the effect these motivations have when grammaticalization arises. The two need to be distinguished, as can be shown with the following example of grammaticalization. Describing the development of the English particle to from purpose preposition to infinitive marker, Fischer (: –) sketches this development as in (). ()

Structural stages in the grammaticalization of English to (Fischer : ; α, β = forms; x, y = meanings)

a.  α        b.  α        c.  α   β
    |            |            |   |
    x           x y           x   y

With reference to the context model presented in Table ., the structural stage (a) corresponds to Stage I of that model, where to is exclusively a purpose preposition (meaning x). (b) is suggestive of the bridging (II) or switch stage (III), where to acquires a second function, namely that of a semantically empty infinitive marker (meaning y). The result is ambiguity in that there is now one form (α) expressing two meanings (x and y). This result is in accordance with the MF hypothesis, whereby grammaticalization first involves change in meaning (from purpose to infinitive) and only subsequently one of form. Finally, (c) is suggestive of the conventionalization of Stage IV, where the target meaning no longer needs to be supported by the context that gave rise to it, but instead has its own formal expression. This expression, which in the present example is a reduced form of the purpose preposition to, introduces ‘a new stable isomorphic relation’ (Fischer : ), where there is now a one-to-one relation between form and meaning both in the grammatical source (i.e. the purpose preposition) and in the target gram (the infinitive marker).¹¹ Our interest in this chapter was in reconstructing the motivation leading to grammaticalization and, hence, in changes leading from (a) to (b). That of Bybee et al. (), by contrast, is with the evolution of grammatical categories and, hence, with the change from (a) to (c).¹² Note that reference grammars, which formed the main source of the quantitative data analysed by these authors, are in most cases silent about stage (b) situations. Now, comparing (a) with (c) obviously allows one to conclude that grammaticalization led from the form–meaning unit (α/x) to a new form–meaning unit (β/y), where (β) is a phonetically reduced version of (α) and (y) a semantically reduced version of (x). From this perspective of a long-term evolution, then, there is in fact parallel reduction, seemingly lending support to the PR hypothesis of (). At the same time, this perspective hides the first part of this evolution where there is semantic (or functional) change but not yet form change, in accordance with the MF hypothesis in ().

¹⁰ I am grateful to Heiko Narrog (p.c.) for having drawn my attention to these observations.
¹¹ Fischer (: ) treats this as a case of ‘re-iconization’, an interpretation that need not concern us here (but see Börjars and Vincent : ).
¹² With this account we are referring merely to one aspect of the work of Bybee and associates; a number of other studies (e.g. Bybee a) provide detailed accounts of grammaticalization processes leading from (a) to (b).

2.5 CONCLUSIONS

Studies in African linguistics have brought a number of insights on the nature of grammaticalization processes, and have contributed to our understanding of the effects that language contact has in shaping the areal landscapes of grammatical structures on the African continent. The main concern of the present chapter has been with a more general issue, namely with what drives grammaticalization; and to this end, two contrasting hypotheses were outlined in section ... The data examined in section ., which formed the main part of the chapter, provided strong evidence in favour of the meaning-first (MF) hypothesis. But as was argued in section ., both hypotheses have their place in an account of grammaticalization. The parallel-reduction (PR) hypothesis is concerned with grammaticalization as a long-term evolutionary process. It takes care of the fact that conventionalized grammatical categories are as a rule the joint result of a reduction of semantic content (desemanticization), morphosyntactic properties (decategorialization), and phonetic features (erosion). But when it comes to establishing what motivates a process of grammaticalization, which concerns the initial phase of the process, then this hypothesis appears to be of little use; this phase is instead best captured by the MF hypothesis.

ACKNOWLEDGEMENTS

The author wishes to express his gratitude for valuable comments to two anonymous reviewers, to Christa König, and to Tania Kuteva, as well as to participants in the Symposium on Grammaticalization Typologically, which took place at the National Institute for Japanese Language and Linguistics in Tokyo, – July . Most of all, his gratitude is due to Heiko Narrog, who suggested a number of highly valuable revisions. Furthermore, he wishes to thank Guangdong University of Foreign Studies and Haiping Long, and the University of Cape Town and Matthias Brenzinger, for the academic hospitality he received as a visiting professor while working on the chapter.

3 Typological features of grammaticalization in Semitic

MOHSSEN ESSEESY

3.1 INTRODUCTION

Semitic is a classification the German orientalist August Ludwig Schlözer created in  for a collection of genetically related languages. That classification is biblically linked to Shem, one of Noah’s sons. As a language family, Semitic is traceable to the Afro-Asiatic phylum. A hypothetical Proto-Semitic is considered to have split from Afro-Asiatic around the sixth to fifth millennium BCE (Diakonoff ). Competing models have been postulated for the genetic classification of the Semitic languages; the most widely accepted one to date is Hetzron’s () model. It presents genetic relationships in the form of a family tree structure under which two major branches, East (Akkadian) and West Semitic, are separated. West Semitic is further divided into South and Central Semitic. South Semitic includes Ethiopian, Epigraphic South Arabian, and Modern South Arabian languages. Central Semitic divides into Aramaic and Arabo-Canaanite, whose branches include Arabic and Canaanite (e.g. Hebrew, Phoenician) languages. Under Semitic,  distinguishable languages and varieties have been identified, with an approximate estimate of living native speakers in excess of  million.¹ The historic homeland for these languages roughly comprises the Near East, North Africa, and the Horn of Africa. Yet in modern times, a sizeable number of Semitic native speakers is spread around many regions of the world. Extant historical records reveal that speakers of the Semitic languages have remained in continuous areal contact over several millennia extending from the third millennium BCE until now. Certain Semitic languages (e.g. Ancient Hebrew, Literary Aramaic, and Classical Arabic) have risen to a global level beyond their birthplace owing to their significance as vehicles for religious scriptures and traditions, pertaining to Judaism, Christianity, and Islam respectively. Lacking native speakers, these languages now are largely confined to liturgical use.

¹ This estimated total figure is an aggregate derived from the list of languages and their corresponding native speakers, which were cited in Grimes ().

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Mohssen Esseesy . First published  by Oxford University Press

..    Interest in grammaticalization studies in Semitic languages collectively and individually has been on the rise in very recent decades, as evidenced by the growth in the number of publications adhering to its tenets. This trend represents a notable departure from the hitherto ubiquitous comparative diachronic philological tradition such as those pursued in Wright (), Brockelmann ( [–]), Moscati (), O’Leary (), Gray ([]), and Lipiński (), which dominated Semitics for a considerable period. Grammaticalization studies in Semitic increasingly cover a wide range of forms and constructions that have evolved to serve grammatical functions in both extinct and living Semitic languages. Most notably representative of this emerging trend are two volumes focusing entirely on the manifestation of grammaticalization across Semitic languages, Rubin’s () Studies in Semitic grammaticalization and Eades’s () edited supplementary volume of the Journal of Semitic Studies, Grammaticalization in Semitic. ‘Grammaticalization’ in this chapter refers to the emergence and evolution of forms and constructions which come to serve grammatical functions. Their erstwhile sources are typically autonomous lexical items belonging to an open class. With continued grammaticalization, functional forms often develop into a tighter, closed paradigm of functionally similar members. The typical changes that the forms and constructions undergo in the process of grammaticalization involve semantic extension and generalization, morphosyntactic changes, and in many cases eventual phonological losses. The changes often lead to morphologization of independent forms into dependent elements requiring a host, such as clitics or afﬁxes (Hopper and Traugott : ).

..     This chapter provides a selected number of typological features in the grammaticalization processes undergone by some notable constructions across Semitic that highlight both intragenetic similarities and parallels. Among these are () the evolution pattern of select body-part terms to adpositionals; () grammaticalization processes facilitating the change from older synthetic possessive strategies in ancient Semitic languages to more analytic ones in modern daughter languages; and () personal pronouns grammaticalized as agreement markers, and expanded uses of independent third-person pronouns beyond marking of person as copulas and expletives among several other functions. These processes and the changes they induced in Semitic have had a remarkably enduring effect on the Semitic grammatical system.




..    A number of shared prominent, stable linguistic commonalities, albeit nondeﬁnitional, are found among members of the Semitic language family. They include: (a) PHONOLOGICAL FEATURES. Rich consonantal inventory compared to the impoverished vocalic counterparts, and the use of guttural and emphatic consonants. Consonant clustering is rather rare in ﬁrst and ﬁnal syllables compared to medial ones. The number of clustered consonants does not exceed two. (b) MORPHOLOGICAL PROPERTIES. Use of sets of two to four consonantal radicals, though sets of three consonants are most common. Their order within the set helps generate a general semantic realm. Consonantal sets in turn interlock with vocalic templates according to well-established patterns, to create a nonconcatenative meaning-bearing word stem (root plus pattern). The consonantal root sets, while unpronounceable, contain some general semantic properties. When combined with vocalic templates, they receive semantic speciﬁcity and formal categorization. The formed stem is typically open to morphological derivation and inﬂection. Verb stems are of two types, perfective and imperfective. Generally, Semiticists (e.g. Hasselbach ; Russel ) agree that the imperfective stem diachronically preceded the perfective one.² Both permit inﬂections for person, gender, number, and voice. The imperfective additionally shows mood inﬂections. Obligatorily encoded on the verb are person, gender, and number (PGN). These bound morphological markings appear as sufﬁxes on the perfective stem and preﬁxes/circumﬁxes on the imperfective. Sufﬁx verbal conjugation that denote pastness is seen by Hetzron () as a deﬁning innovation of West Semitic. Independent personal pronouns, the source for bound PGN marking, may also co-occur with the inﬂected verb. Hence, there is optional double marking of the subject/agent. 
Personal pronouns: not all Semitic personal pronouns specify person, number, and gender fully or consistently. Whereas the second person speciﬁes all three features, the ﬁrst person speciﬁes only the ﬁrst two (i.e. person and number). The third person in certain functions (e.g. as copula) speciﬁes only gender and number but not person (for Arabic: Fehri ; for Hebrew: Ariel ). Morphological case is attested in a number of Semitic languages (e.g. Akkadian, Ugaritic, Ge‘ez, Arabic). However, the origins of case and its spread across major Semitic branches is still a topic that generates intense debate. (See Hasselbach  for a summary of the contending theories.) In the languages that retain case (e.g. Arabic) the case system tends to be triptotic in the singular and diptotic in plural forms. ² In support of this assumed diachronic word-order cyclicality is Hasselbach’s () reconstruction of the constituent order, subject–predicate, in verbless clauses, which aligns well with the order of pronominal preﬁxes relative to the verb stems in older Semitic languages. A later shift to VSO, according to Hasselbach, resulted in the sufﬁxation of agreement markers on the Semitic verb. Finally, across Semitic a return to a predominant SVO occurred in modern daughter languages.



Mohssen Esseesy (c) SEMANTIC PROPERTIES. There appears to be a high degree of regularity in the semantic modiﬁcation of the verb pattern. For example, within the verb derivational systems (ranging from ﬁfteen in Classical Arabic, to ten in Modern Standard Arabic, to seven in modern Arabic dialects and Modern Hebrew), one typically ﬁnds a predictable semantic relation holding between basic-derived patterns. (d) SYNTACTIC PROPERTIES. The sentential syntax in a large number of modern Semitic languages is a dominant SVO that reﬂects a cyclical change from a dominant VSO word order in older languages (e.g. Biblical Hebrew, Classical Arabic). However, the oldest attested Semitic language, Old Akkadian, is the exception, since it had a dominant SOV sentence syntax under substrate inﬂuence of Sumerian, which is non-Semitic. Adpositions: Semitic is predominantly prepositional, but some Ethiopian languages (e.g. Modern Harari and Amharic) have shifted towards postpositional constructions to varying degrees. Prepositions linearly precede their (pro)nominal dependents and mark them in genitive. Members of this class may be arranged morphosyntactically along several distinguishable layers (see Esseesy ), ranging from proclitics (e.g. li- ‘to, for’), to simple stem (min ‘of, from’), to bi-morphemic (e.g. fī wasaṭi ‘amid’), to complexes of the type PNP (e.g. bi-r-raġmi min ‘in spite of ’). Possession: may be synthetically expressed through genitive constructions (i.e. status constructus), where the two juxtaposed nouns are morphologically marked, in the older Semitic. In the younger Semitic languages, besides the older synthetic construction, an analytic possessive linker (known as the ‘genitive exponent’) is more frequently inserted between the possessee and possessor. Verbless sentences are formed by mere juxtaposition of two NPs without any morphological change on either of them (the second syntactic slot may also be ﬁlled by an adjectival form). 
N₁ typically functions as the topic or subject and N₂ typically assumes the comment or predicate function. Copulas in the form of the third-person independent pronoun are frequently interposed between the two NPs to mark the present, particularly in Modern Arabic and Hebrew.

3.2 FROM BODY-PART NOMINAL TO PREPOSITIONAL

Cross-linguistically, body-part terms have been shown to be one of the major donor lexical categories to the grammaticalization of prepositional and case affixes (e.g. Svorou ; Heine a, ). Semitic follows this known pattern. A few terms denoting sub-parts of the (human) body, while still maintaining their erstwhile referential semantic denotations, have evolved via metaphor to serve prepositional and other related functions. This evolutionary process is noted in a number of studies (e.g. Svorou ; Heine, Claudi, and Hünnemeyer b). The process fits a pattern whereby a body-part term first develops into a locative and from there into a temporal expression along the metaphors SPACE IS AN OBJECT and TIME IS SPACE.




T .. Body-part nominals as source and target of the grammaticalization process Language

Source

Target

Akkadian

muxxi ‘skull’ ṣērum ‘back’

ina muxxi ‘on, over’ ṣēriš ‘towards, against’

Arabic

fū/fī/fā ‘mouth’* xalfu/xalﬁ/xalfa ‘back’ wasaṭu, wasaṭi, wasaṭa ‘mid-section’ jānibu/jānibi/jāniba ‘ﬂank, side’

fī ‘in, at’ xalfa ‘back, behind’ wasṭa ‘mid, amid’ jāniba/janba ‘beside, next to’

(Biblical) Hebrew

’aḥar ‘backside’ ’eṣel ‘joint, side’

’aḥar ‘behind, after’ ’eṣel ‘beside, near, toward’

Aramaic

ge(n)b ‘side, bank’ appaē ‘face’

ge(n)b ‘beside, near’ ’al appay ‘upon, in front of ’

Modern Aramaic (Telsqof) reesh-a ‘head’ Amharic

gˇarba ‘back’

reesh ‘top of ’ gˇarba ‘behind’

* Forms separated by (/) represent triptote nominal inﬂections. Source: Akkadian, Aramaic, and Amharic examples: Rubin (2005); Arabic: Esseesy (2010); Hebrew: Moscati (1964) and Hardy (2014); Modern Aramaic, i.e. Telesqof: Rubba (1994). Transcriptions are copied from source.

Semantic extensions leading to grammaticalization of some body-part terms are shown in Table . to be widespread across a number of Semitic languages belonging to various subgroups (e.g. East: Akkadian; Northwest: Aramaic; Central: Arabic; and African Semitic: Amharic). Table . highlights important common tendencies in the grammaticalization of Semitic body-part terms, including:

(i) Overall no loss of root consonants: they remain intact in all grammaticalized forms (more on this below), despite minor morphophonological modification to the stem.
(ii) Case invariance: the invariant -a accusative form (prototypical nouns are triptote), which is taken as indicative of decategorialization towards adverb-like status (e.g. Arabic: fawq-a ‘above’; bayn-a ‘between’). It is also found in Geʿez: qədm-ä ‘in front’; ḫäb-ä ‘to, toward’ (Gragg : ).
(iii) Syntactic head: retaining head status, exemplified by governing their dependents in the genitive in like manner as nouns in status constructus across Semitic.
(iv) Periphrasis: newer (complex) prepositions are created according to a cross-linguistically attested pattern PREPOSITION + RELATIONAL NOUN + NP, such as found in English (see Svorou ).
(v) Functional variance: uneven functional distribution among prepositionals. While some (e.g. xalfa ‘back > behind’) remained within the locative-spatial domain, others such as jānib ‘flank, side’ progressed to discourse functions as in bi-jānib ‘besides, in addition to’.

These changes are unidirectional in nature and have allowed body-part terms to denote spatio-temporal relations, among others, in a multitude of contexts, including some abstract ones.

OUP CORRECTED PROOF – FINAL, 22/9/2018, SPi




The typological range of spatial relations of body-part terms in Semitic has developed as follows: IN, ON, FRONT, MIDDLE/AMID, BEHIND. Whereas all recruited body-part terms primarily serve locative and/or temporal functions, in some exceptional cases (e.g. Arabic fū), extensive polysemies as well as a functional range beyond all others in Table . have occurred. The form fī ‘in, at’ is illustrative of the pathway of a body-part term that has achieved greater evolutionary maturity, as shown in some detail below. The ancestral triptote nominal fū (fī genitive and fā accusative) has also done so. Its grammaticalized form fī appears invariably in the genitive, which is insensitive to variation in the syntactic position. This invariable genitive, which has been suggested in Brockelmann ([–]: , ), is the result of an earlier frequent collocation in a compound-like construction, with bi- ‘at, with’, thus forming bi-fī ‘in mouth’. Worthy of note is that as a prototypical noun, fī also had the plural ’afwāh and permitted the suffixed indefinite article -n and prefixed definite article ’al-. Grammaticalized fī lacks inflection for number and definiteness. As a preposition, fī developed through metaphoric extensions a number of polysemic clusters: locatives showing a gradational change from full containment to partial containment to non-containment (Esseesy : ). Additionally, it has developed temporal (e.g. fī-š-šahr ‘per month, monthly’) and manner meanings (e.g. fī dahšatin lit. ‘in surprise = surprisingly’), and in the context of verbs it may denote cause (e.g. māta fīh ‘died because of it’). In some spoken Arabic dialects, fī>fi>f- continued its grammaticalization further to serve as an existential pronoun. It occurs in certain contexts side by side with its prepositional form (which is often a proclitic in morphophonologically eroded form) without ensuing ungrammaticality, as in () in the Egyptian spoken dialect of Arabic: ()

fī(h) f-rās-o ḥāgah
there in-head-his thing
‘There is something on his mind’ (Esseesy : )

As with the negation of other prepositions, the negator miš is placed before the predicate with fī-: ()

huwwa miš fi-l-bēt
he NEG in-the-house
‘He is not in the house’

As a particle meaning ‘exist’, it is negated like verbal predicates with the circumﬁx ma . . . š: ()

ma-fī-š ’amal fī-h
NEG-exist-NEG hope in-him
‘There is no hope in him/it’

Relatedly, fī > fi in a number of modern Arabic dialects is further grammaticalized as an aspectual particle. Brustad (: ) identifies an aspectual/durative function added by the use of fi in Moroccan, Egyptian, and Levantine. Mitchell (cited in Brustad, p. ) gives the following example, where fi is an obligatory aspectual particle triggering an atelic meaning of the verb ‘read’ in the Jordanian dialect. Absence of fi, according to Mitchell’s native speaker informants, renders the sentence ungrammatical. ()



qarat fi-l-kitāb ‘iddit sa‘āt
Read.SGPAST in-the-book several hours
‘I read (in) the book for several hours’

Insofar as fī had commenced its process of grammaticalization as a sub-constituent of a prepositional phrase, later it also collocated with younger prepositional forms that it helped develop (‘secondary prepositions’, in Lehmann’s (: ) terms). This is a periphrasis stage, similar to the one identified in Croft (: ), whereby a periphrastic construction is recruited to serve a certain grammatical function according to the existing rules of grammar. Two such cases are fī xilāli ‘during’ > xilāla ‘through’ and fī ’aṯnā’i > ’aṯnā’a ‘during’. This process is very common in the evolution of Semitic prepositional forms, whether or not they originated in body-part terms. It has parallels in the incipient stages of the grammaticalization of prepositional forms in other Semitic languages, as the example from Biblical Hebrew illustrates: [[b-]PREP [gll + NP]NP]PP ‘on the matter (of)’ > [[bgll]PREP + NP]PP ‘because of’ (Hardy : ). In the incipient stages of their grammaticalization, they appear in multiple constructions along this projected pathway. Kaufman () also observes that in Aramaic the development of new prepositions typically involved collocations with ‘simple common Semitic prepositions’, i.e. primary ones, as in mn gb ‘from the top’ and mn byny ‘from between’ (Arabic: min bayni ‘from between’ > bayna ‘between’) (p. ). Similarly, Gragg (: ) observes that complex prepositions like bä-qädmä ‘in front of’ are common in Ge‘ez (African Semitic). Furthermore, fī also collocates with verbs in the formation of prepositional verbs such as fakkara fī ‘thought of’, sāhama fī ‘contributed to’, batta fī ‘adjudicated’ (Esseesy : ). Traditionally they are labelled ‘verb-preposition idioms’. In the last example, fī continues to be functionally specialized and semantically integrated with the lexical verb, and the combination has developed into a near semantic and functional equivalent of the single verb qarrara ‘decide’.
Beyond being merely a preposition or a particle, fī heads constructions serving as textual organizers, such as fī-l-wāqiʿi ‘indeed’, that denote intensification. The early grammaticalization process can thus be mapped as a two-step process:

a. Primary preposition + Relational noun + Nominal dependent
b. Relational noun/Prepositional + Nominal dependent

As noted earlier, the majority of body-part terms from which the adpositions discussed above have emerged did not undergo morphophonetic erosion of their root consonants. Reduction in phonetic material, including the root consonants, has occurred only in the more mature prepositions of the type CV- across Semitic (e.g. b(V)- ‘in, at’; k(V)- ‘as, like’; l(V)- ‘to, for’). That these still govern their dependents in the genitive suggests that they too most likely emerged from fuller lexical stems that contained at least two root radicals, even though currently they are non-root. Reductions in the phonological material of these Semitic prepositions seem typically to occur from right to left, as follows: (a) reduction in case inflections (e.g. fī above), (b) reduction in the third radical (consonant) (e.g. min ‘from’ > mi-). Further reduction would result in the loss of the medial consonant (e.g. ʿalā ‘on, above’ > ʿa-), in which case the preposition cliticizes to a nominal host. Thus, the leftmost radical consonant seems to outlast all others in the root set.⁴ This is largely due to the frequent collocation of prepositions with their syntactic dependents, which eventually results in reduction in phonological material and in cliticization and prefixation to their nominal/pronominal host. Even when they are reduced in size and become non-root, prepositions still govern their dependents in the genitive.

³ Brustad notes that the /q/ in the verb should have been realized in the speech of the informant Jordanians either as /’/, under the influence of the urban Palestinian dialect, or as /g/, under Bedouin influence.

3.3 FROM SYNTHETIC TO ANALYTIC POSSESSIVE STRATEGY

In older Semitic languages, the synthetic genitive (known as status constructus) was dominant in the expression of possessive relations of all types (inalienable and alienable). The possessive construction typically has two morphologically marked, juxtaposed nominals, N₁ + NGEN. No other elements, such as modifiers, are permitted to intervene between them. In addition to the genitive marking on the second noun, the syntactic head noun in these constructions is marked variously across Semitic, most often by reduction of some sort:⁵ case (e.g. Akkadian), definiteness (e.g. Arabic), or phonological reduction of the stem (e.g. Hebrew), as shown in Table .. In spite of variations in the morphological modification of the syntactic head of the construct state, which reach considerable complexity in individual Semitic languages, the premise of the underlying semantic and syntactic relation holding between the two constituents remains constant.

Expression of possessive relations through the genitive presents some semantic, pragmatic, and syntactic challenges. This is particularly true when multiple nouns and modifiers are serialized. Marking the head and dependent constituents in these constructions represents redundancy, as noted for Arabic in Versteegh (: ), where the head noun cannot take the definite article and the dependent receives genitive marking. Moreover, additional awkwardness is encountered when, for example, modifiers are added to the constructions, as these must be postposed after the entire possessive construction. Semantic ambiguity may arise as a result of postposed modifiers of nouns in the construct state, particularly when the nouns belong to the same gender.

Table .. Marking of sub-constituents in Semitic construct state

Language | Individual N | Construct state
Akkadian | bīt-um ‘the-house-NOM’ + David | > bēt David ‘house-of David’
Arabic | bayt-u-n ‘house-NOM-INDF’ + al-rajul-u ‘the-man-NOM’ | > bayt-u ‘house-NOM’ al-rajul-i ‘the-man-GEN’ ‘the house of the man = the man’s house’
Hebrew | bayit ‘house’ + David | > bēt David

Source: Adapted from Gensler (2011: 293).

In Classical Arabic, dative and ablative prepositions, such as li- ‘to, for’ and min ‘from, of’, are used in possessive constructions that contain several nouns.⁶ This strategy was used alongside the synthetic genitive described above. Possibly with the loss of morphological case endings in Semitic languages such as Akkadian, Ugaritic, and Arabic, the move from synthetic to more analytic expression of possession motivated the insertion of a possessive marker to code that relation. Hence, besides the older synthetic possessive structure we find a younger strategy (the so-called ‘genitive exponent’) expressing possession periphrastically. These alternatives began to appear as early as Akkadian. For example, besides the older construct state bīt šarrim ‘the house of the king’ one finds bītum ša šarrim ‘the house of the king’ (+REL PRO) (Hasselbach : ).

In several modern Semitic languages (e.g. Modern Hebrew, Aramaic, and Arabic) diverse possessive morphemes, often called ‘genitive exponents’, have become widely used as a formal strategy for expressing possessive relations (i.e. ‘of’), which, as noted, has reflexes in ancient languages, as shown in Table .. In modern spoken Arabic dialects, several genitive exponents have evolved from various nominal lexical sources to serve as possessive markers. In spite of the shared common lexical source, matāʿ ‘property’, in Syrian, Egyptian, Moroccan, and Maltese, each vernacular developed its own particular form, as shown in Table .. As shown in Table ., an erstwhile lexical form signalling possession with the etymological meaning ‘right’, ‘belonging’, or ‘property’ (Harning ; Holes ) is typically inserted between the possessed and possessor. Although these forms could

⁴ This tendency is not absolute. The motion verbal root RWḤ ‘go’ retains the rightmost consonant ḥ as a grammaticalized, phonetically eroded prefix, which marks futurity on verbs in some modern Arabic dialects.

⁵ Ge‘ez exhibits the opposite trend; the head noun is lengthened (e.g. bet ‘house’ becomes bet-a, which resembles accusative marking). See Gensler (: ) for more details.

⁶ In his synchronic treatment of the prepositional strategy and analytic genitives, Fehri (: ) states that ‘the preposition li-, has the same form as the dative one’, which presupposes homomorphic morphology, where a historical connection between the preposition and the dative marker is nonexistent. However, in Esseesy () as well as in this study, the two li- forms are conceived as one polysemous form. They are the same form, diachronically linked via grammaticalization, such that allative/purposive li- has evolved to serve dative function.


T .. Possessive linkers in a number of ancient and modern Semitic languages Language

Form

Akkadian Hebrew Western Neo-Aramaic Eastern Neo-Aramaic Syriac Tigrinya Tigre

ša ‘of ’ šel ‘of ’ -il ‘of ’ d- ‘of ’ d- ‘of ’ nāy ‘property’ nāy ‘property’

Source: Hebrew and Akkadian forms: Gragg and Hobermann (2012: 197–8); Tigrinya and Tigre: Rubin (2005: 53).

T .. Possessive linkers in a number of modern Arabic dialects Arabic variety

Possessive linker

Egyptian General Gulf Kuwaiti Maltese Moroccan Syrian

bitā‘ < matā‘ ‘propertyʼ ḥagg ‘the right of ’ māl ‘what (belongs) to’ taʿ < matāʿ ‘property’ dyal(dī~d)/n(tāc) < matāʿ ‘propertyʼ tabaʿ < matāʿ ‘property’

equally express alienable and inalienable possession, their use is commonly restricted to alienable possession (except in the Moroccan dialect, where they express both, as noted in Brustad : ). This is perhaps due to a constraint remaining from the erstwhile meaning of matāʿ ‘property’, which semantically denotes alienable possession. These forms have become specialized in use where:

(i) A definite head noun is followed directly by a modifier, instead of the modifier being postposed after the entire synthetic genitive construct (example () below).
(ii) The possessed noun is a dual nominal (i.e. one ending in -ēn); hence badlit-ēn bitūʿi is preferred over the morpho-semantically awkward badlit-ēn-ī ‘my two suits’. Furthermore, certain borrowed lexical nouns do not easily permit suffixed possessive pronouns in spoken dialects (Brustad : ).
(iii) A special pragmatic focus is sought (Brustad : ), where the nominal following the possessive marker (i.e. ‘linker’ in Croft’s  terms) receives primary focus in the possessive relation.
(iv) A uniquely identifying character trait or proclivity is expressed (examples () and () below).


As for condition (i) above, in the Gulf dialect, the possessive linker may occur immediately after a modifier of a definite noun:

() il-bēt li-čbīr māl ṣadīg-i
the-house the-big POSS.LINK friend-my
‘The big house of my friend = my friend’s big house’ (Holes : )

As for examples () and (), the possessive linker expresses a defining/permanent trait:⁷

Egyptian dialect
() bitāʿ mašākil
POSS.LINK problems
‘troublemaker’

Similar use is also found in the Levantine dialect of Arabic:

() tabaʿ mašākil
POSS.LINK problems
‘troublemaker’

There is variation in the extent of the use of the synthetic and analytic genitive across modern Arabic dialects (e.g. the Moroccan dialect makes the most use of the analytic genitive, while the urban Syrian dialect seems to use it least), as noted in Brustad (). These variations are construed here as reflexes of the various degrees of grammaticalization of such forms in each dialect, such that the wider the use in marking possession, the more grammaticalized the possessive linker is. As for the Moroccan genitive exponent, enduring bilingualism under the influence of pervasive contact with the Romance languages may have contributed to the extensive conventionalization and grammaticalization of the possessive linker dyal, a composite form that diachronically has become eroded in some usages to d- in Moroccan Arabic. Ouhalla () attributes its frequency of occurrence to the translation of Romance idioms into Arabic-Spanish during the period of the th– th centuries, when Spanish Arabic was acquired natively in Spain. Later, it was brought through migration from Spain into Morocco up until the th century.

The etymological origin of the composite form dyal is still in scholarly dispute. Whereas Harning (: ) assumes a collocation of demonstrative-relative elements, Ouhalla (: ) believes that the source is the frequent combining of the Spanish genitive preposition di- and the Arabic definite article al- of the following noun, as shown in the sketched stages below. On the other hand, Heath (: ), following Wahrmund, the th-century German Arabist, claims that a reflex of the Late Latin genitive particle dē ‘from, out of’, along with the Latin definite article ille > el, later confused with the Arabic definite article al- (: ), is the source of the Moroccan dyal.

⁷ Brustad (: ) labels usage of this type ‘a shared idiomatic expression meaning someone who likes’ (emphasis original).


In some modern sub-dialects of Moroccan (e.g. Shamaliya, spoken in the northwestern region of Morocco), the analytic-type possessive has almost fully replaced the older synthetic genitive type, except for names of older geographical landmarks, as Ouhalla notes. Layering of this form is also evidenced. The form dyal and its phonetically eroded variant d- have come to be in complementary distribution: the fuller variant is used with pronouns, whereas the shorter form is cliticized on lexical hosts (Ouhalla : ). It is unclear whether contact with Spanish can fully account for the widespread use of dyal and its variant vis-à-vis the analogical pressure from the Arabic-speaking eastern region. However, the fact that in Shamaliya and related dialects the older synthetic genitive was effectively ousted from use and replaced by both dyal and the native stem n(taʿ) suggests a combination of geographical properties and bilingualism as factors responsible for such change.

To recap, the typological features of these forms most notably include:

1. A definite article is absent.
2. They are unable to take a modifier and are relatively syntactically fixed (having to precede the possessor linearly).
3. The erstwhile specific lexical content is bleached and only the general notion of ownership and possession is preserved. This notion was perhaps used in double-marked constructions at an early stage as an optional reinforcement of possession together with the synthetic possessive pronominals (e.g. in Egyptian Cairene ‘arabiyyiti bita‘ti, lit. ‘my car, my own = my own car’).
4. They are used in syntactic constructions and contexts that were not previously possible (al-qiṣṣah ḥagg zamīlī ‘the story of my colleague’, *?‘the story is true of my friend’).
5. The root consonants of their source are reorganized and/or phonetically altered; e.g. matā‘ > taba‘/bit‘/nta‘/ta‘. The morphophonological link to their ancestral form is thereby weakened, as found in several of the genitive exponents in Table ..
6. The grammaticalized form in most spoken dialects is still relatively transparent, and has not achieved a recognizable degree of fusion with possessors. The exceptions are in Moroccan, which is geographically the furthest removed from the Arab heartland in the Middle East, and in Maltese Arabic, which is under a dominant influence from Italian, including the Romanized script.
7. Their frequency in discourse is high, owing to semantic and functional extensions, which often require their use in the expression of alienable possessives and in some cases, as in the Moroccan dyal, inalienable possessives.

The general diachronic evolution is an instance of the pattern identified in Heine and Kuteva ():

() PROPERTY > A-POSSESSION

The morphological attributes of the exponent forms reveal notable variations. Whereas, beside the eroded native variant (n)taʿ < matāʿ, the Moroccan dyal often appears as the proclitic d-, the Egyptian bitāʿ is not only fuller morphologically but also inflects for the gender and number of the possessed. These morphological variations are manifestations of different degrees of categoriality and grammaticalization.


The three strategies for expression of possession in Semitic seem to indicate diachronic layers on a grammaticalization scale:

() Layered possessive strategies in Semitic
OLDER >> YOUNGER
I. Synthetic (status constructus N Ngen)
II. Synthetic/+ Prep/Analytic
III. Analytic/Synthetic

I. An older morphosyntactic synthetic strategy juxtaposing two nouns in a construct state.
II. Alongside the strategy in (I), prepositional elements, primarily allative l(V) ‘to, for’ and ablative mn ‘from, of’, were used in multi-term constructs, or with a definite head noun (e.g. Arabic).
III. A relatively newer analytic strategy using a possessive morpheme/linker that originated in a nominal source whose meaning relates to ‘property’ or ‘right’. In Modern Hebrew, an older Semitic relative pronoun še, which is found in Akkadian, is combined with the bound preposition -l, resulting in the composite šel. The possessive linkers used in Arabic vary in the extent to which they retain lexical phonological material and gender and number inflections, and in their use in alienable and inalienable possession, such that the more grammaticalized the form, the broader its use, which extends to some inalienable possessions.

From the foregoing analysis of possessive strategies in Semitic, it appears that the gradual global change from synthetic to analytic contributed to the rise of a specialized set of possessive linking devices inserted between the possessee and the possessor. Such new devices did not necessarily lead to the demise of the older synthetic possessive strategy. Rather, the two coexisted and continued to function side by side for several millennia. What motivates the choice of one strategy over the other often seems to relate to discourse functions. In the case of Arabic, for example, the use of the synthetic or analytic strategy depends to a great degree on the level of formality in discourse, the conceptual nature of the possessee (alienable or inalienable), and the relative focus given to the possessor or possessee, among other discourse-pragmatic factors.

3.4 FROM INDEPENDENT PERSONAL PRONOUN TO VERB AGREEMENT

As a category consisting of a closed set, Semitic personal pronouns exhibit common cross-linguistic (i.e. morphosyntactic, semantic, and pragmatic) features identified in Heine and Song () and Sugamoto (, cited in Gardelle and Sorlin : ). These features include: (1) morphological autonomy; (2) referential but non-specific semantic content; (3) inability to take modifiers; (4) limitations on reference interpretation; (5) lack of morphological constancy; and (6) lack of pragmatic implicational value or particular stylistics.


To the above features, I add syntactic autonomy as a language-specific feature of the personal pronouns. Unlike their English counterparts, Arabic personal pronouns, for example, can form a complete utterance. In (), modelled on Siewierska (: ), use of the pronouns by themselves is grammatical in (b) for Arabic but not in (a) for English.

() Who wrote that?
a. *I/*he/*we
b. ’anā ‘I’/huwa ‘he’/naḥnu ‘we’/hunna ‘they.FPL’

Furthermore, in Cairene Arabic, as well as in many other modern Arabic dialects, the independent personal pronoun assumes a function typically expected of object pronouns, such as English me. Instead of an object pronoun, e.g. the bound form -nī, the autonomous subject pronoun is used in ().

() mīn ʿāyiz šāy
who wants tea
‘Who wants tea?’
’anā
I
‘I am/I do’ = ‘me’

These personal pronouns must occur in their independent form when used in all expected syntactic environments. They typically function as topics of verbless (e.g. equational) sentences comprising two juxtaposed NPs, NP–NP, without any added morphological markings or verbal forms. However, as stated in ..(b), they are syntactically optional in verbal sentences since their corresponding reduced, bound forms are inherently marked on the Semitic verb. As also stated in ..(b), these bound forms typically appear as suffixes on the perfective stem and as prefixes (or, much less commonly, circumfixes) on the imperfective stem. Hasselbach () relates these positional variations of the affixes on the Semitic verb to a diachronic change in word order, where the independent pronouns at a certain archaic stage occupied a preverbal position and at another stage a postverbal position. The Semitic verbal sentence may contain double subject marking: an independent subject pronoun functioning as an optional verbal argument alongside a bound form of the same pronoun functioning as an agreement marker inherently bound to the verb. As regards the consequences of grammaticalization from an independent personal pronoun to a bound one on the Semitic verb, Table . contains a summary of comparisons between the source and target within Lehmann’s ([]) six syntagmatic and paradigmatic parameters.
The correlated parameters of grammaticalization in Table . jointly measure the increasing degree of grammaticalization of independent personal pronouns in Semitic as they become morphologized as agreement markers on verb stems, which is captured in the process illustrated in ():

() INDEPENDENT PERSONAL PRONOUN > INFLECTION AGREEMENT


T .. Measuring grammaticalization of independent pronouns and bound agreement markers Comparison

Independent person form

Bound person form

Phonological Syntagmatic

Free

Afﬁxed

Coordination with another personal pronouns Coordination potential lost (e.g. huwa wa-hiya ‘he and she’) possible* Paradigmatic

Full forms

Reduced/eroded

Morphosyntactic Syntagmatic Independent/mobile

(Context-)dependent/ﬁxed

Paradigmatic

Optional

Obligatory

Functional Syntagmatic

Controller of agreement

Target of agreement

Member of a general subject-marking paradigm

Member of verbal agreement indexation

Paradigmatic

* Van Gelderen (2011a: 39) considers coordination a feature of full pronominal status, which is disallowed for agreement markers.

T .(). Second-person singular pronouns and their corresponding preﬁxed agreement on imperfective verb stem in selected Semitic languages Person Akkadian/Old Babylonian

Arabic

msg fsg

’anta > ta’attā > ta- ’anta > ti’anta > tə’anti > ta- . . . -ī(na) ’att > t . . . -ī ’anti > ta . . . - īn ’anti > tə . . . -i

atta > taatti > ta- . . . -ī

Hebrew

Aramaic

Geʿez

Source: Modiﬁed from Gragg and Hoberman (2012: 190, 177) and Lipiński (2001: 306–7).

This cline of grammaticalization is not unique to Semitic; it is observed cross-linguistically (e.g. Lehmann []: ; Heine and Kuteva : ). Tables .(a) and .(b) contain partial reconstructions of further grammaticalizations of second-person pronouns from independent personal pronouns to bound agreement markers on the imperfective and perfective verb stems in a number of Semitic languages. Within the range of functions of the reduced/bound forms, an intermediary, overlapping stage along the grammaticalization cline is observed in () and ().

() jā’at l-banāt-u (Fehri : )
Came.F the-girls-NOM
‘The girls came.’


T .(). Second-person singular pronouns and their corresponding sufﬁxed person agreement on perfective verb stem in selected Semitic languages Person

Akkadian/Old Babylonian

Arabic

Hebrew

Aramaic

Geʿez

msg fsg

atta > -āta* atti > -āti

’anta > -ta ’anti > -ti

’attā > -ta ’att > -t

’anta > -t(a) ’anti > -t(i)

’anta > -k(a)** ’anti > -k(i)

* The [t] verbal sufﬁxes of the second person were generalized in West Semitic to all persons (Faber 1997: 9), hence, in Arabic daras.tu ‘I studied’, daras.ta ‘you.studied.MSG’, daras.at ‘she studied’. ** Geʿez verbal sufﬁxes are identical to possessive/object pronouns for second person. These pronouns are analogous to the ﬁrst-person verbal sufﬁx -ku in Akkadian, which has its origin in the ﬁrst-person independent pronoun an(ku). Source: Modiﬁed from Gragg and Hoberman (2012: 190, 176), Kaufman (1997: 126).

() jā’at (Fehri : )
came.FSG
‘She came’

Despite the identical surface marking in the above examples, in () only gender is specified, whereas in () PGN are specified, as noted by Fehri. Full specification of PGN in () on the verb indicates that it is an agreement inflection on the verb. Hence, that one polysynthetic word is a complete sentence. Moreover, () can have an optional overt pronominal subject hiya before jā’-at, similar to Fuss’s (: ) example from Italian, cited in (). In such a case, pronominal subject specifications are redundantly marked on the verb, which supports their status as agreement suffixes in the marking of the three agreement features on the verb. This agreement property is taken by Fuss, and here, to rule out the status of the pronominal agreement in Arabic and in Italian as clitic.

() Lui/Lei mangia
He/she eat.SG
‘He/she eats.’

There is an observable dichotomy in morphological form and discourse role among the three persons of the independent personal pronouns. Specifically, it is between the first and second person on the one hand and the third person on the other. This contrast is both formal and functional. On the formal side, the first and second person across Semitic have an (’)an(t)V morphophonological structure (e.g. first person in Akkadian: anā(ku), Hebrew ’ānōki > ’anī, Arabic and Aramaic ’anā) vis-à-vis the third person, which has the šV morphophonological base (a diachronic sound change resulted in hV) (e.g. Akkadian šū, Hebrew and Aramaic hū, Arabic huwa). Morphologically, none of these bases seems to have contributed to the rise of verbal inflection for third person in Semitic. Russel () reached a similar conclusion, as he doubted Givón’s () claim that third-person prefixes and suffixes in Semitic arose from independent anaphoric pronouns. As for the functional contrast, it has been noted in the literature on the Accessibility Hierarchy (e.g. Givón ; Ariel ) that the first and second person are discourse participants (the speaker and addressee respectively) while the third person lacks such a function. It is plausible that the absence of morphological correspondence between the third-person independent pronouns and the verbal agreement inflection is a reflex of their status as non-discourse participants and their lower position on the accessibility hierarchy. This distinction in discourse function has a metalinguistic reflex in the works of Arabic and Hebrew grammarians, who labelled the third-person pronoun al-ġā’ib ‘the absentee’ and ‘concealed one’ (Ariel : ) respectively. A correlation in morphological coding is further assumed between a given referent and its accessibility, as Siewierska (: ) claims, such that the more accessible the referent, the less coding is needed, which in part explains the absence of gender marking on the first person, singular and plural. Furthermore, consistent with Accessibility Theory as used by Ariel, morphological markings of agreement on the Semitic verb are mostly on first and second person, the two most accessible referents. Absence of person agreement specification for third person has been attributed in Ariel () to its frequent co-occurrence with a referent NP. Huehnergard and Pat-El (: ) went a step further and claimed that ‘Semitic is a -person family’, and that the original function of the third person was, and in many Semitic languages still is, that of a (distal) demonstrative. The underspecification for person and absence of a discourse role are likely to have contributed to the selection of the third person for use in functions where specification for person is inessential (e.g. copula, expletive, identitive, discourse marker of agreement, and conclusion), to which I now turn.

.. -    In a number of modern Semitic languages parallel developments show the use of third-person pronouns as a copula in the present tense. Past-tense marking requires the use of a verbal source (e.g. Arabic root KWN and Hebrew HYY). This diachronic evolution has been examined in a number of works, including Reckendorf (), Doron (), Eid (), and Li and Thompson (). The pronoun serving copula function is often advanced enough in its grammaticalization cline that it has ceased to be co-referential with the topic of the equational sentence. It has also ceased to mark the feature PERSON, as noted in Fehri (: –). Loss of this feature thus evidences its lack of deictic properties and generalization. However, it also gains a new grammatical function, which marks a separation between topic and comment, as shown in () and () by Fehri (): () anta huwa l-mas’ūl-u you.MSG he/COP the-responsible-NOM ‘You (are) the responsible (one)’ Further evidence for lack of person speciﬁcation is found in Fehri’s example in (), which shows the ungrammaticality arising from the use of an identical personal pronoun as copula.⁸

⁸ Van Gelderen (a: ) attributes this type of ungrammaticality to ‘anti-homophony constraints’. It is doubtful that this assumption is correct, given that huwwa huwwa ‘he is the same’ is grammatical in non-copula usage.


() *’anta ’anta l- mas’ūl-u you.MSG you.MSG the-responsible-NOM ‘You are the responsible (one)’ Its use as a copula is not limited to human referents, perhaps in no small part due to its non-discourse participant function. In (), which is culled from the electronic Arabic Corpus Tool, the pronominal copula agrees in gender with a non-human noun referring to an abstract concept. () al-’amnu huwa l-laḏī yaḥkumu the-security he/COP REL.MSG MSG.govern.IMPV misra wa-laysa ’ayyatu jihatin ’uxrā Egypt and-not be any agency other ‘Security (is) what governs Egypt and not any other ofﬁcial agency’ In Hebrew, Li and Thompson () note the obligatoriness of the copula in certain syntactic environments (e.g. full NP subject). This usage marks a more advanced stage of grammaticalized third person as copula when compared to the one in Arabic.⁹ () David hu ha-ganav David he/COP the-thief ‘David is the thief ’

(Li and Thompson : )

() *David ha-ganav (Li and Thompson : ) ‘David the-thief ’ In Arabic, the use of third-person pronouns as copulas allows two deﬁnite NPs to form a sentence, which otherwise would be merely a phrase: () al-bintu l-miskīnatu the-girl the-poor ‘The poor girl’ () al-bintu hiya l-miskīnatu the-girl she/COP the-poor ‘The girl (is poor/is the poor one’) In Qumran Hebrew, an extinct variety from , years ago, use of the copula was only possible with deﬁnite second NP (Naudé , cited in van Gelderen a: ), as shown in (): () ’th hw’ yhwh you he lord ‘You are the Lord’

(Naudé : , cited in van Gelderen a: )

It is worth noting that the third-person pronouns as copulas in Semitic still retain their full autonomous form, thus making them indistinguishable from their usage as

⁹ The Arabic equivalent for () is a phrase, not a sentence (David al-ḥarmˆ ‘David the thief ’).

OUP CORRECTED PROOF – FINAL, 22/9/2018, SPi

Grammaticalization in Semitic



personal pronouns. That they have not yet undergone any phonological erosion or contractions, as did the verb be (e.g. I’m, you’re) in English, for example, is indicative of a grammaticalization level not advanced enough to trigger loss of phonological substance.

.. -    The independent third-person pronouns are also grammaticalized to serve in subjectless equational sentences as expletives, similar to it in it is love. This function is quite distinct from copula use (Fehri : –), as shown in the following examples modelled after Fehri’s: () huwa l-qadaru he/it the-destiny ‘It is destiny’ () hiya d-dunyā she/it the-universe ‘It is the universe’ In () and (), the pronouns lack speciﬁcity for person even though they show gender and number agreement with the postposed subject. This function again corroborates Givón’s () observation that unlike ﬁrst and second persons, which are speech act participants, third-person referents need not be so, and by extension their use with inanimate referents is sanctioned. Additionally, what seems to motivate the use of third person in this function and as the copula is the contextual metaphoric reconceptualization of an entity, actual or virtual, as person.

..   -     The masculine singular form, which is the citation form in standard Arabic grammar, assumes varied roles not only in verb morphology but also in syntax. This is evident in the Hebrew binyanim and Arabic ‘awzān verb paradigms, where the citation form is invariably the masculine singular form. The choice of this citation form rests on some factors of relevance to grammaticalization in Semitic. The importance attached to the citation form in Semitic goes far beyond Aronoff ’s notion of being ‘the address of the lexeme whose paradigm is being generated’ (: ). One can easily dismiss the third-person form as a convention that has no linguistic basis for selection, much like the use of the ﬁrst-person singular present indicative active for verbs in Classical Greek and Latin dictionaries, as Aronoff suggests. In Semitic, the situation seems entirely different. First, as stated in ..(b), the (consonantal) root morpheme by itself is not pronounceable without intervening vowels. Second, the citation form of the verb is invariably the third-person masculine singular form. This claim is corroborated for Hebrew by Ariel (: ), who states that the third person is not linked to a referent, i.e. person. Third, there is

OUP CORRECTED PROOF – FINAL, 22/9/2018, SPi



Mohssen Esseesy

further evidence to suggest additional polygrammaticalization of the third-person masculine singular as a form specialized for marking the following relations:

(a) As a discourse marker indicating agreement or conclusion:

()
wa-huwa kaḏālika
and-he=it so
‘All right’

In such cases, as a token of its grammaticalization, huwa alone can occupy that position (cf. *wa-hiya kaḏālika, with hiya ‘she’).

(b) As an interrogative marker in spoken (Egyptian) Arabic.¹⁰ There is an implicit speaker doubt communicated in this construction. Again, no other person pronoun can substitute for huwa without ensuing ungrammaticality:

()
huw(wa) ’intī bi-tḥibbī-nī
he you.FSG ASP-FSG.love-me
‘Is it the case that you love me?’

Frequency of occurrence in two corpora of roughly the same size (ShuruqColumn: ,,; Adab Literature: ,,) of the online Arabic Corpus Tool also confirms the primacy of the third-person masculine singular in textual frequency compared to all other persons, singular or plural (Table .). This broad distribution can only be attributed to the polyfunctionality of that pronoun.

Table .. Primacy of independent third-person masculine singular in textual frequency of two online Arabic corpora

Independent pronoun | 1st person | 2nd person | 3rd person MSG | 3rd person FSG
Frequency of occurrence in ShuruqColumn | Singular: , Plural: , | Singular:*  Plural (m):  Plural (f): | Singular: , Plural: , | Singular:  Plural:
Frequency of occurrence in Adab literature | Singular: , Plural: | Singular: , Plural (m):  Plural (f): | Singular: , Plural: | Singular: , Plural: ,

* 2MSG and 2FSG independent pronouns are indistinguishable in form in this corpus.

This exceptionally consistent high frequency of occurrence of the third-person masculine singular is typologically striking. If textual frequency of occurrence of similarly functioning grams is taken into account, the high frequency of the independent third-person masculine singular pronoun should be regarded as evidence for polygrammaticalization vis-à-vis the pronouns of the first and second person. As it turns out, the high frequency of the independent third-person masculine singular pronoun may be explained as a consequence of its underspecification of certain features, such as person, in certain grammaticalized functions, e.g. copula and expletive. In those cases, the independent third-person pronoun assumes more grammaticalized functions not shared by other personal pronouns.

¹⁰ In this usage, huwwa also appears to help the speaker avoid addressing the addressee directly, as it combines with ḥaḏrit-ak ‘your presence’ instead of ’anta ‘you’.

3.5 CONCLUSION

The grammaticalization processes of the various forms and constructions examined have shown the ubiquitous working of grammaticalization in various domains in individual Semitic languages and subgroups. In so doing, the evolutionary processes not only shed some light on linking the grammaticalized forms to their ancestral sources but also illustrated the directionality of the change. The grammaticalization of body-part terms, which resulted in their use in various prepositional functions in Semitic, has shown that these terms are still morphologically relatively close to their pre-grammaticalized lexical sources. Preservation of the morphological form also holds true for the independent third-person pronouns functioning as copulas and expletives. The one gram that continued its evolution further, i.e. fī > f- ‘in, at’, highlights the far-reaching possibility for similar grams to evolve. In such mature cases of grammaticalization, polysemies, decategoriality, fusion, phonetic erosion, and expansion into marking textual relations beyond phrases and clauses can be expected. Hence, there appears to be no typological limit found on the coevolution of meaning and form in Semitic of the type described in Bisang’s () study of EMSEA languages. Grammaticalization has been shown to facilitate the change from the direction of synthetic to analytic in several Semitic languages. Examples include the recruitment of lexical forms to serve as substitutes or variants for the synthetic genitive construction marking, and the insertion of pronominal copula in verbless equational sentences. In the latter case, the evolution of independent personal pronouns to agreement markers and copulas, among several other functions, attests to the workings of grammaticalization in already grammaticalized members of a restrictive, tight paradigm. Thus, grammaticalization does not seem to operate in a vacuum. Change is also neither mechanistic nor arbitrary.
Rather, it is motivated by the communicative expressivity of the linguistic community at a given synchronic state of the language’s evolution, and facilitates the transition from one state to another. In some such cases, the change from SVO to VSO and perhaps back is cyclical, as referenced in ..(d), and along the line described in Croft (), even though grammaticalization itself is most often unidirectional. This study should not suggest that typologically the pathways to grammaticalization in Semitic are unfettered. There are a number of sociolinguistic factors that seem to directly constrain the workings of grammaticalization in some Semitic languages and varieties. Diglossia in Arabic, for example, poses a real challenge to change by grammaticalization. Within the framework of Arabic diglossia, Modern Standard Arabic forms and constructions are very slow to change compared to spoken vernaculars, which are uncodified and much freer to evolve.


Due to the sociolinguistic partitioning of Arabic varieties according to prestige, many grammaticalized forms are often dismissed by most language purists as ‘corrupted’ or ‘deformed’, since they deviate in form and meaning from the codiﬁed and standardized variety, the use of which garners prestige. The various functional elements examined here, such as the preposition fī ‘in, at’, as an existential particle; possessive linkers, such as dyal, tabaʿ, that replaced construct state possessives; the independent pronouns hu/huwa functioning as interrogatives—are all sociolinguistically marked (i.e. low), despite being natively acquired and used in most informal discourses. The consequence is that constraints are being unduly imposed on the natural evolution of Arabic, which is the ofﬁcial language of more than twenty countries in the Middle East and North African region. Because of the low status assigned to the spoken Arabic dialects by language purists and most native speakers, the aforementioned function words and expressions continued their natural evolution unrestrained, and therefore deviated from the idealized and rigid standard register. Thus, the interface between grammaticalization and sociolinguistics, particularly the attitudes preventing or otherwise promoting change, should be considered in the study of language evolution.

4 Grammaticalization and inﬂectionalization in Iranian GEOFFREY HAIG

4.1 INTRODUCTION

The Iranian languages constitute a branch of Indo-European, within which their closest relatives are the Indo-Aryan languages. The earliest uncontroversially dated attestation of Iranian is Old Persian, preserved in a series of cuneiform inscriptions located in today’s southwestern Iran (the best-known in Behistun), which stem from the th to the th centuries BCE. In addition to Old Persian, a body of texts known as Avestan (Old and Young) also provides evidence for the oldest layers of Iranian. Avestan texts are ritual and didactic in character, rooted in the belief system of the Zoroastrians. They were ‘transmitted orally over centuries and even millennia before being committed to writing some time after  CE’ (Skjærvø : ). Thus although the texts clearly represent an ancient form of Iranian (and are very similar to the oldest parts of the Rigveda), the Zoroastrian priests who ultimately committed the Avestan texts to writing spoke much later (Middle Iranian) languages, a fact which has evidently led to some mixing of the linguistic systems that ultimately crystallized in the written form of Avestan, and which makes dating and interpretation of the texts a delicate issue.

Traditionally, Iranian philologists split Iranian into two groups, west Iranian and east Iranian. Although this assumption faces serious (and possibly insoluble) empirical problems, as already pointed out by Sims-Williams (), I continue to maintain it here as a pre-theoretical taxonomy. Most of this chapter will deal with what are traditionally termed west Iranian languages. Among the west Iranian languages, Persian has the longest time-depth of attestation, going back to Old Persian of the preceding paragraph, and has enjoyed the greatest cultural and political prestige as the language associated with successive dynasties of Persian empires, and spreading as a language of administration, science, and literature across Asia to the Indian subcontinent.
Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Geoffrey Haig . First published  by Oxford University Press

Unsurprisingly, research on grammaticalization, and indeed on historical Iranian morphosyntax in general, has concentrated on Persian. However, the pre-eminence of Persian in the linguistic literature is an artefact of the political and

cultural domination associated with Persian-speakers, rather than reﬂecting any intrinsically central or salient feature of the Persian language within Iranian. On the contrary, in many respects Persian is an unusual, or even atypical, representative of Iranian (quite comparable to English as a lingua franca associated with current economic and political prestige, yet an atypical representative of Germanic). However, the vast majority of other contemporary Iranian languages lack any written attestation beyond a couple of centuries, rendering their historical reconstruction particularly challenging. The developments in Iranian morphosyntax over the past two millennia exhibit many parallels to the better-known branches of Indo-European, Romance and Germanic. Old Iranian preserved much of the rich (though irregular) inﬂectional morphology of Proto-Indo-European, including nominal gender, declensional classes, inﬂectional expressions of aspect, and so on. But the transition from Old to Middle Iranian (around the beginning of the Christian Era) witnessed the collapse and levelling of much of the inherited morphology. These processes are best documented for Persian, which lost gender, all case marking, and entire paradigms in the verbal system (e.g. aorist, old perfect). Morphologically, Middle Persian is thus considerably impoverished when compared to Old Iranian—not unlike the difference between contemporary French and classical Latin. Similar processes of morphological erosion affected most of the other languages, though inherited morphological categories of case and gender have survived in attenuated form in a number of contemporary languages. 
Since the great levelling of morphology some two thousand years ago, Iranian languages have been gradually reacquiring morphosyntactic complexity through, for example, univerbation of erstwhile copulas with lexical verbs, the grammaticalization of lexical verbs into modal, aspectual, and voice auxiliaries, the grammaticalization of adpositions to phrasal afﬁxes with case functions, but also in the restructuring and recreation of person agreement systems on the verbs. Iranian has therefore much to offer for scholars of grammaticalization. However, dedicated research on grammaticalization within Iranian has not yet achieved the same coverage as grammaticalization research in Romance, Slavic, or Germanic (see Davari and Kohan  for recent discussion and references, and Jügel : ch.  for the grammaticalization of auxiliary verbs). Given the scale of the issues, the time-depth of attestation, and the number of languages involved, the present chapter is of necessity selective, and will almost exclusively deal with west Iranian. I will begin with a discussion of the grammaticalization— more precisely, the inﬂectionalization—of person and number agreement from erstwhile pronouns (section .), because this is a topos of grammaticalization research, and the Iranian languages offer an unusually long-term perspective on some of the main issues involved. One of the main ﬁndings of this brief survey is that the assumed ﬁnal stage of grammaticalization, namely into fully-ﬂedged inﬂection, is an exceedingly slow process, taking millennia before all traces of the lexical, or at least non-inﬂectional, origins of grammatical formatives are lost. Section . will take up a variety of other issues in the existing literature, in particular the grammaticalization of auxiliary verbs, and the grammaticalization of case marking. Section . offers some more general considerations and a summary of the main points.

Grammaticalization in Iranian



4.2 THE GRAMMATICALIZATION OF PERSON INDEXING IN IRANIAN: SUBJECTS VERSUS OBJECTS

..     :    In a number of genetically diverse languages (e.g. Turkic, Bantu), there are striking phonological similarities between free pronouns and the corresponding verbal person agreement afﬁxes. Explanations for these similarities are generally framed in terms of a grammaticalization process: the originally free pronouns have gradually coalesced with verbal hosts, yielding phonologically dependent (clitic or afﬁxal) doublets of the pronouns. These bound forms lose their pronominal status, hence are no longer subject to Binding Conditions, and may co-occur with a co-referential NP in the same local syntactic domain. Ultimately, they become obligatory items of the verb’s inﬂectional morphology: agreement markers. The original pronouns either continue to function as free pronouns or are replaced by innovated pronouns. Siewierska (: ) refers to this cycle as ‘a continuous process on-going in all languages in all times’. According to Culbertson (), it can currently be observed in spoken French, and the unstressed French subject pronouns je, tu, etc. can now be analysed as agreement clitics (see Kibrik  for similar claims, and De Cat  for an alternative analysis). The process is often modelled in the form of a cline of form types, as in (), from Fuss (: ): ()

independent pronoun → weak pronoun → clitic pronoun → affixal (agglutinative) agreement marker → fused agreement marker → ø

Despite different theoretical assumptions, the grammaticalization account of the emergence of agreement has remained dominant in historical linguistics, though with different emphases and terminologies. Indeed, Fuss (: ) refers to the recognition of ‘a universal historical pathway’ in the rise of agreement markers; see Schnell (to appear) for critical evaluation. Most previous research has focused on the grammaticalization of subject pronouns. However, object agreement is also cross-linguistically attested, and in the relevant literature it is generally assumed that the grammaticalization of object agreement from object pronouns basically follows the same path as that of subject agreement. Thus Bresnan and Mchombo (: ) claim that the bound object pronouns in Bantu are in the process of grammaticalization into agreement markers, ‘parallel to the earlier evolution of the SM [Subject Marker—GH]’. The assumption of a uniﬁed grammaticalization pathway for subject and object pronouns has largely remained unchallenged (see van Gelderen  for recent discussion). However, as Siewierska () points out, cross-linguistically, examples of truly obligatory object agreement are vastly less frequent than of subject agreement. If both involve the same mechanisms, it is not readily obvious why this imbalance should obtain.




As it turns out, historical data from Iranian is particularly relevant for this question. In a number of west Iranian languages, an identical paradigm of clitic pronouns came to be used for both subjects, and direct objects, albeit in mutually exclusive domains (with predicates based on past stems and present stems respectively). The roots of this system can be traced back for more than two millennia, so that Iranian provides a natural historical laboratory for tracing the respective developments of subject and object pronouns, each carried by a paradigm of phonologically identical forms. The lesson to be learned from Iranian is that the fates of the two sets of pronouns have been very different: while the subject pronouns have, as predicted by Siewierska’s cyclical view, in some languages at least reached the stage of obligatory agreement markers, the object pronouns have basically plateaued at the same stage that obtained more than , years ago. In what follows I will brieﬂy summarize these developments; see Haig () for more details.

..       The clitic pronouns at the centre of this discussion emerged through syncretism across various non-nominative forms of clitic pronouns, for which cognates are identiﬁable in Old Iranian and Old Indic (Korn ). By the Middle Iranian period some two thousand years ago, these forms had merged to yield a single paradigm of non-nominative clitic pronouns, often referred to in Iranian philology as ‘oblique’ pronouns. The Middle Iranian forms are provided in Table ., based on Parthian and Middle Persian;¹ the remainder of this section takes up the fate of the cognates of this paradigm in two different functions: transitive subject (A) and direct object (P). Many west Iranian languages have retained a paradigm of pronominal clitics recognizably cognate with those of Table .. A selection of contemporary West Iranian languages and their pronominal clitics are provided in Table .. Despite certain minor differences (some involve superﬁcial phonetic processes such as deletion of ﬁnal -n, while others stem from deeper historical origins; see Korn ), the overarching similarities across the paradigms are evident, as are the similarities to the Middle Iranian forms of Table .. Not all west Iranian languages have preserved these clitics; some languages of the nothwestern peripheries of west T .. Pronominal clitics in Parthian and Middle Persian

First person Second person Third person

Singular

Plural

=m =t / =d =š

=mān =tān / =dān =šān / =š

(Jügel 2015: 222)

¹ Table . ignores some complications, see Skjærvø (: ) for a slightly different version, and Korn () for more detailed presentation.




T .. Pronominal clitics in selected west Iranian languages

SG SG SG PL PL PL

Persian

Mukri (Central Kurdish)*

Hawrami*

Sivand*

=am =at =aš =mān =tān =šān

=(i)m =(i)t =ī / =y (after vowels) =mān / =in =tān =yān

=(ı)m =(ı)t =(ı)š =mā =tā =šā

=em =et =eš =emā =etā =ešā

* Sources for the languages other than Persian: Mukri Kurdish: Öpengin (2016: 92); Hawrami: MacKenzie (1966: 25); Sivand dialect: Lecoq (1979: 40). Apparent differences in the qualities of the vowels are in part due to differences in the transcription practices of the sources; they are irrelevant for the present purposes.

Iranian lack them (e.g. Zazaki, Kurmanji Kurdish, Mazanderani), but it is probably reasonable to assume that the earliest stages of these languages also had them. From their earliest attestations, the clitics were used to express adnominal possessors, experiencers, benefactives, and external possessors (see Haig : –), and following the syncretisms among the various non-subject cases, one and the same set was also used for direct objects. Having dealt with the forms, we turn now to their distribution and functions. In the Old Iranian period, the inherited ﬁnite past and perfective verb forms gradually disappeared, echoing similar changes across much of Indo-European, where ﬁnite past tense and perfective aspect forms were disappearing. In Iranian, the sole form that remained to effect past tense reference were participles, basically verbal adjectives with resultative semantics (Haig : ), that had long been in existence in Old Iranian and beyond. In tandem with this change, a fundamental reorganization of the morphosyntax of past-tense clauses occurred, yielding ergative (or nonaccusative) alignments in these tenses. These changes have been dealt with in detail elsewhere (Haig : ch. ; Jügel ; Haig ) and need not concern us here; for the present purposes it is sufﬁcient to note that in the past tenses of transitive verbs, the paradigm of clitic pronouns introduced in the last section (Table .) also served as transitive subject (A) pronouns (in what follows I will refer to ‘subject’ pronouns, but in fact only transitive subjects are involved) with all past-tense transitive verbs. Thus the typologically unusual situation arose in which one and the same paradigm of person forms indexed² the direct object (P) in present tenses and the A in past tenses (see Arkadiev  for the theoretical implications of this kind of system). 
The import of this situation for our purposes is that it created a natural laboratory for observing the respective developments of the grammaticalization of subject and object indexing, because both began with phonologically identical input material. It is important to bear in mind that in Old and Middle Iranian, these clitics were special clitics (in the sense of Anderson ), whose position was basically after the first constituent of their clause (Wackernagel position).

² I adopt the term ‘indexing’ from Haspelmath (), as a neutral term for both anaphoric and agreement relations obtaining between a target and controller; the term ‘agreement’ is reserved for those types of indexing where the presence of the index is obligatory, regardless of presence or absence of the controller in the same clause, and regardless of the information configuration of the entire clause. Agreement is thus a syntactic relation, definable without reference to pragmatics.

..     () In Old Iranian, and well into Middle Iranian, the clitic pronouns used for the A were largely restricted to occurrence in clauses otherwise lacking an overt subject NP. The following examples are from Middle Iranian; () shows a clitic pronoun A, while () has a NP in the A role, and no clitic pronoun: ()

čē=t ātaxš ī man pus ōzād because=SG:A ﬁre of my son extinguish.PST.SG ‘because you extinguished the ﬁre of my son [. . .]’ (Middle Persian, Haig : )

()

pas ōšbām oy az pidar bōxt [. . .] then ōšbām:A SG from father rescue.PST.SG then Ošbām rescued her from (her) father [. . .] (Zoroastrian Middle Persian, Jügel : , glosses added)

The pronominal nature of these clitics was maintained into the Middle Iranian period. This is evident from the fact that they may be omitted under the condition that their reference is recoverable from the context (see Jügel :  for details). The following example shows an overt clitic pronoun for the A of the ﬁrst clause, and zero for the co-referential A of the subsequent clause: ()

a. u=š ardawān ōzad [. . .] and=SG:A Ardawān kill.PST.SG b. ud duxt ī ardawān pad zanīh kard and daughter of Ardawān to wife make.PST.SG ‘And hei killed Ardawān [. . .] and (hei) took his daughter as wife’ (Zoroastrian Middle Persian, Jügel : , glosses added)

From Jügel (), two facts emerge that support the interpretation of the Middle Iranian clitic A-pronouns as pronouns, rather than agreement: ﬁrst, the incompatibility of the pronoun with a free expression of the A,³ and second, the ability to be omitted under pragmatically felicitous conditions, in a manner comparable to the omission of a free pronoun. We will follow this analysis for the time being, but in the following discussion I will suggest that this is probably not the whole story. The system of indexing the A through a pronominal clitic has disappeared in some west Iranian languages, notably Persian, but elsewhere it has survived remarkably ³ Jügel (: –) discusses the few examples from Middle Persian where the A-clitic is doubled by an overt A in the clause ( attested in a corpus of , clauses). Most of these involve some form of dislocation, e.g. an afterthought or a preposed constituent, with unclear clausal loyalties, or are scribal errors. Otherwise, clitic pronoun and overt subject NP are mutually exclusive.

Grammaticalization in Iranian



well. In Central Kurdish, the system is still recognizably that of Middle Iranian, but with one very crucial difference: the pronominal clitics that index an A in the past tenses have become fully obligatory: ‘every single past transitive construction requires an A-past clitic’, regardless of the presence or absence of an overt A constituent in the same clause (Haig : ), or any other pragmatic conditions. The following examples illustrate the co-occurrence of subject NP and clitic in Central Kurdish, and in two other Iranian languages with a similar system: ()

eme to=mān nard bo šar-ī PL:A S:P=PL:A send:PST to city-OBL ‘We sent you to the city’ (Central Kurdish, Mukri dialect, Ergin Öpengin p.c.)

()

me ketaw=em xeri SG book=SG.A buy.PST.SG ‘I bought the book’ (Laki, Dabir-Moghaddam : )

()

mæ=m ketav ese SG=SG.A book buy.PST.SG ‘I bought the book’ (Davani, Dabir-Moghaddam : )

In Central Kurdish, Laki, and Davani, we have a typologically unusual kind of subject agreement, in which the subject index itself is a mobile clitic, clearly reflecting its pronominal origins. But unlike a typical subject pronoun, the clitics of Central Kurdish are not omissible, even in environments where pronouns are normally dispreferred or even disallowed (e.g. same-subject clause coordination, or subject relativization). Conditions of space preclude illustration of these properties; I refer to the ample documentation of Central Kurdish subject clitics in MacKenzie (, ) and Öpengin (); all recent research converges on the verdict that they are exponents of an agreement relationship (Samvelian ; Haig ; Öpengin ).

A further stage in the assumed grammaticalization of subject agreement is attested in other west Iranian languages, for example the dialect of Semnan (Majidi ). In the past tenses, we find two distinct paradigms of person agreement suffixes on the verb, one for transitive verbs (in bold type) and the other for intransitives, illustrated in Table . (from Majidi : , transcription and segmentation adapted; the initial prefix is an indicative marker). For ease of comparison, the pronominal clitics of Middle Iranian from Table . have been added to Table ..

Table .. Intransitive and transitive person indexing (past indicative), Semnan dialect

        Intransitive ‘die’    Transitive ‘do, make’    Middle Iranian clitic pronouns
1SG     ba-mard-un            hā-kard-an               =m
2SG     ba-mard-e             hā-kard-at               =t / =d
3SG     ba-mard-e             hā-kard-eš               =š
1PL     ba-mard-in            hā-kar-mun               =mān
2PL     ba-mard-in            hā-kar-tun               =tān / =dān
3PL     ba-mard-an            hā-kar-šun               =šān / =š

It is evident that in Semnan, the paradigm of person suffixes for transitive verbs is distinct from that of intransitive verbs. Furthermore, we can assume that the paradigm found with transitives is an innovation, and its source is the clitic pronoun paradigm of Middle Iranian, as shown in the right-hand column. Presumably the Semnan system arose when the erstwhile clitic pronouns lost their syntactic mobility, and grammaticalized to the verb, becoming effectively inflectional agreement morphology.⁴ Systems like the Semnan one are also noted in other Iranian languages (Jügel : –). It should also be noted that even in those languages like Central Kurdish, or Gorani, where the subject index retains its syntactic mobility, the verb itself is a frequent host option for the clitic; cf. Gorani kamtar ward=iš? (vulture ate.PST=SG.A) ‘did a vulture eat (them)?’, where the third singular subject index attaches to the verb ward ‘ate’ (Mahmoudveysi, Bailey, Paul, and Haig : text :). Thus the development in Semnan, where the erstwhile clitic is now exclusively found on the verb itself, represents the grammaticalization of an already available positional variant, rather than a completely novel development.

⁴ My account of Semnan is based on Majidi (). However, Masoud Mohammadirad (p.c.) reports that the subject-indexing clitics on the past transitive verb are not obligatory (based on recent fieldwork with speakers) in Semnani, and are sometimes omitted. If this is confirmed, then the account provided here needs to be modified accordingly. Such a state of affairs would actually provide further support for the fact that the form of the index is not necessarily a good predictor of its function. Although the subject index in Semnani looks superficially like an agreement affix (i.e. fixed position on the verb), it has apparently retained some of its pronominal properties. Similar mismatches between form and function characterize the object indexes which are discussed below.

Schematically, the development of subject agreement from clitic pronouns in Iranian can be sketched as in Table .. Here I distinguish the dimensions of ‘obligatoriness’, or ‘inflectionalization’, from the degree of phonological bondedness (Kibrik ; Norde ). The latter subsumes two criteria: morphological integration into the host and degree of freedom of host selection. The sequence of stages set out in Table . presupposes that the predecessors of Central Kurdish and Semnan dialect had a clitic system similar to that of the attested Middle Iranian languages Parthian, Middle Persian, and Bactrian.⁵

⁵ It is important to note that we lack direct records for the predecessors of these contemporary languages. In fact they cannot be direct descendants of either Middle Persian or Parthian, because we assume that common Kurdish must have preserved the inherited Iranian case distinctions, and Semnani does so to this day. In Parthian and Middle Persian, however, case had already been largely lost. Central Kurdish, Semnan, etc. must therefore go back to some contemporary of Parthian etc., which had retained case (and gender), but for which no historical records exist.

As we have no records of the immediate predecessors of Kurdish and Semnan, this is obviously hypothetical, but it nevertheless appears to be plausible in the light of what is known about the pronominal clitic system across west Iranian. Note also that Table . represents but one possible line of development. Others are attested, most notably for Modern Persian, where the pronominal clitics simply disappeared from the subject function, and Persian past-transitive verbs came to carry subject agreement




T .. Schematic summary of the inﬂectionalization of person indexing in Iranian Stage/Language

Syntactic mobility (‘bondedness’)

Old Iranian/Middle Iranian Syntactically mobile, high freedom of host selection, transition Wackernagel position. Verb forms (participles) lacking subject-indexing morphology become the norm for past transitive constructions; use of genitive/ dative clitic pronouns as ‘subjects’ begins, probably inherited from existing non-canonical subject constructions (Haig ).

Degree of inﬂectionalization Not obligatory, clearly still pronominal; absolute numbers of relevant examples in the extant corpus is too limited to draw ﬁrm conclusions.

Middle Iranian (Middle Persian, Parthian, Bactrian, Jügel ) Participles now the norm for past-tense transitives, generalized ‘oblique’ pronoun is normal expression of subject.

Syntactically mobile, Wackernagel position. High freedom of host selection (including subject NP itself, or complementizer).*

Not obligatory, but already signiﬁcantly more frequent than would be expected of a corresponding free pronoun (see below in this section).

Contemporary Central Kurdish (e.g. Öpengin )

Obligatory agreement marker Syntactically mobile, VPbased position, some freedom (but see n. ). of host selection, but subject NP and complementizers are no longer possible hosts. Morphological integration into predicate is possible.

Contemporary Semnan dialect (Majidi )

Positionally ﬁxed (bonded to verb stem).

Obligatory agreement marker.

* Jügel (2015: 249) ﬁnds ‘no restrictions [keine Einschränkungen, GH]’ on the host category for the Middle Iranian clitic pronouns, except for prepositions.

morphology from a different source (identical to that used for intransitive verbs, suggesting that the latter is the source candidate). The other possibility is that the nonobligatory nature of the Middle Iranian system continued, which seems to be the case in Taleshi (Paul ). However, a system with obligatory subject indexing (agreement), still carried by mobile clitic pronouns, is certainly a possible outcome in the west Iranian context. The Semnan dialect shows how these pronouns may ﬁnally lose their syntactic mobility and become in effect verbal afﬁxes, part of verbal inﬂection.



Geoffrey Haig

A question that needs to be addressed is the nature of the mechanisms involved in the shift from clitic pronoun to agreement, and the chronology of the events. Thanks to Jügel’s () rich documentation of the Middle Iranian facts, it is now possible to advance at least some tentative hypotheses in this direction. As already mentioned, Jügel himself analyses the mobile subject clitics of Middle Iranian as pronouns, mainly due to the lack of attested cases of clitic doubling, which appears to be his main diagnostic for distinguishing pronoun from agreement marker (cf. Jügel : –, n. ). On this view, the clitic subject pronouns of Middle Iranian are basically prosodically deﬁcient versions of normal pronouns. However, Jügel’s own material actually suggests that this is not the whole story. The most striking piece of evidence is the sheer frequency of occurrence of the clitic pronouns, which can be inferred from Jügel (: , table .): in the largest corpus, Middle Persian, around  per cent of all past transitive clauses contained a clitic pronoun exponent of the subject (N=,). The full signiﬁcance of this ﬁgure emerges when we compare it with the percentage of overt pronouns in transitive clauses of other languages which allow referential null subjects. Relevant ﬁgures for contemporary spoken Persian (Adibifar ), Cypriot Greek (Hadjidas and Vollmer ), and Northern Kurdish (Haig and Thiele ) are: Persian:  per cent (N=), Cypriot Greek:  per cent (N=); Northern Kurdish:  per cent (N=).⁶ For these languages, and indeed most others that allow null referential subjects, the favoured form of expression for subjects is zero, not pronominal. The ﬁgures from Middle Persian are thus distinctly odd, and require explanation. 
I would tentatively suggest that the clitic pronouns had already at the Middle Persian stage undergone a rapid rise in frequency, and that this frequency increase should be seen as the precursor to the later grammaticalization (cf. Meyerhoff  for a frequency-based account of the grammaticalization of subject agreement, and Schnell, to appear, for critical discussion). Jügel cautiously relates the development towards the agreement system of Central Kurdish to the presence of ‘topic agreement’ (Jügel : –), i.e. the use of a resumptive clitic pronoun following a stage-setting, left-dislocated new topic, which is attested in some Middle Iranian texts. The construction would, on this account, have lost its pragmatic markedness after the Middle Iranian period, and become reanalysed as subject–verb agreement in Central Kurdish, and other languages with this kind of agreement (basically in line with Givón’s  account of the emergence of agreement via topicalization of pronouns). But while these structures may have played some role in the process, they cannot account for the overall jump in frequency in the Middle Iranian data. In the light of cross-linguistic research on pronoun omission in transitive clauses, the figure of  per cent pronoun retention is highly significant, and indicates that these clitic pronouns were qualitatively distinct from free pronouns; if this is on the right track, then the grammaticalization process from pronoun to agreement marker was already well under way at the Middle Persian stage. These considerations underscore the necessity to look beyond the issue of ‘clitic doubling’ as a diagnostic for the degree of grammaticalization, and to take comparative-frequency data into account. What makes a pronoun a pronoun is not just its inability to appear in the same local domain as its antecedent (lack of ‘clitic doubling’). Pronouns are also characteristically prone to be omitted under conditions of pragmatic identifiability of the referent. Indeed, many of Jügel’s Middle Iranian examples illustrate clitic pronouns in contexts where pronouns would not normally be expected, for example the following:

()  ēk, ke=š man brēhēnīd
    one, that=SG.A SG create.PST.SG
    ‘one which created me’ (lit. ‘one that he created me’, Zoroastrian Middle Persian, Jügel : )

⁶ The comparatively high figure for Northern Kurdish is noteworthy; it is probably related to the fact that in the Northern Kurdish corpus, many of the verbs are past-tense transitives, which lack overt agreement morphology. It is likely that in these contexts, more overt pronouns are used as compensation for the lack of overt subject indexing, as proposed by Kibrik () with reference to Russian past-tense forms (likewise lacking person agreement). This needs more research; it also underscores the inapplicability of categorical typologies along the lines of ‘pro-drop’ vs ‘non pro-drop’; there are evidently intermediate degrees, reflected in graded frequency of omission in discourse.

The subject pronoun =š attaches to the complementizer/relativizer, although resumptive pronouns are generally not required in Iranian subject relativization. The suggestion I am making is that the grammaticalization process begins with an extension of the clitic pronouns into syntactic environments where pronoun omission (zero) would generally be the norm. Note that such an extension will not necessarily yield instances of clitic doubling; it will simply yield an overall statistical increase in the frequency of overt pronouns, and a corresponding drop in the rates of zero anaphora. Available cross-corpus research indicates that in the case of transitive clauses, somewhere between  per cent and  per cent of subjects have non-lexical expressions, i.e. are either pronominal or zero (see Haig and Schnell ). Throughout its attested history, Persian is known to be a language that licenses referential null subjects and, concomitantly, avoids pronominal subjects. Thus the most common form for transitive subjects in Persian is zero, and there is little reason to suppose that this has changed significantly over the history of the language. Against that background, the frequency data extractable from Jügel () are extremely revealing for understanding the grammaticalization process. For the time being, I conclude that the clitic subject pronouns of Middle Iranian, while not agreement markers in a strict sense, nevertheless differed in their distribution significantly from free subject pronouns in other Iranian languages (and, I suspect, from free pronouns in present-tense transitives in Middle Iranian, though this awaits further research). Kibrik’s () notion of ‘bound tenacious pronoun’, implying a prosodically bound form that is approaching (to different degrees) agreement status, would perhaps be appropriate.
Having outlined the development from clitic pronoun to verbal agreement afﬁx, we now turn to consider the fate of the cognate clitic pronouns in object functions.




..     The development of clitic pronouns in the object role is simpler than that just sketched for the subject role, and this section is correspondingly brief (see Haig, , a for a more detailed discussion). Clitic pronouns in the direct object function are attested in earliest records of Iranian, so we may assume that this was a syntactic possibility available perhaps for as long as , years. In Old Iranian, there was still a dedicated paradigm of accusative clitic pronouns, which later syncretized with the other non-nominative clitic pronouns to yield the paradigm provided in Table .. Examples from Old Persian, with the still-distinct form of the accusative pronoun, are the following (note again the Wackernagel position of the clitic): ()

pasāva=dim manā frābara after.that=SG.ACC SG.GEN/DAT bestow.PST.SG ‘After that he bestowed it on me’ (Old Persian, Haig : )

() kāra hya aθuriy hau=dim abar yātā Bābirauv people which Assyrian, DEM=SG.ACC bring.PST to Babylon ‘The Assyrian people - they brought it to Babylon’ (Old Persian, Haig : ) Examples of Middle Iranian clitic pronouns in object function are given below (from Haig : ): () čīd=mān pāyēd always=PL:P protect.PRS.SG ‘(It) always protects us’ () [. . .] u=š hamēw bōžēnd [. . .] and=SG:P always save.PRES.PL ‘(the Gods) always save him’ Throughout Old and Middle Iranian, pronominal clitics could only express the object in the absence of an overt object NP. Thus the object clitics were, despite their clitic status, fairly obviously pronouns, rather than any kind of agreement: they were in complementary distribution to a free NP (or full-form pronoun) object (Jügel : ). Some , years later, the reﬂexes of the Middle Iranian clitic pronouns remain widely attested as clitic object pronouns in numerous contemporary west Iranian languages. The main change has been what Haig () refers to as ‘rightward drift’ with regard to clitic placement: the Old and Middle Iranian Wackernagel position has given way to a VP-based clitic placement system, with the ﬁnite verb itself as a very common host (Wackernagel position is retained in a small number of languages). Typical examples of clitic pronouns attaching to the ﬁnite verb are the following: () hâlâ ne-mi-bin-am=aš now NEG-IND-see.PRS-S=SG:P ‘now I don’t see it’ (Modern Persian, Roberts : )




()  m-war-im=šān
    IND-eat.PRS-SG=PL:P
    ‘I will eat them’ (Gorani, Mahmoudveysi et al. : )

()  sob mo-gor-im=eš
    morning IND-take.PRS-PL=SG:P
    ‘In the morning we will take it’ (dialect of Sivand, Lecoq : )

In some languages, the clitics also occur as what are arguably ‘endoclitics’, i.e. enveloped within inflectional verbal morphology, most notably in Central Kurdish (see Harris  on endoclisis, and Öpengin  for analysis of the relevant facts for Kurdish). The following examples illustrate the position of the object clitics in the Mukri dialect of Central Kurdish (Northwest Iranian, West Iran, Öpengin ):

()  a. kut=ī ‘segbāb bo de=m=guž-ī?’
       say.PST=SG.A dog.son why IND=SG.P=kill.PRS-SG
       ‘He said: “Son of a dog, why are you killing me?”’
    b. kut=im ‘bāb=im nā=t=guž-im’
       say.PST=SG.A brother=POSS.SG NEG=SG.P=kill.PRS-SG
       ‘I said: “O brother, I am not killing you”’ (Öpengin , ŽB –)

Evidently the ‘clitics’ here resemble affixes, because they are morphologically integrated into the predicate. Functionally, however, the object clitic of Central Kurdish continues to be in complementary distribution with an NP object. In other words, the presence of the clitic pronoun is only licensed in the absence of the coreferential object. This is demonstrated in (), where both direct objects are expressed as free pronouns (in bold type), and no corresponding clitic pronoun is allowed on the verb:

()  emin de-kāte wezīrī destī.řāstī šā ʕebās-ī w
    SG(P) IND-make.PRS.SG Vizier-of right.hand.of Shah Abbas-OBL and
    eto de-kāte kāłek-firoš
    SG(P) IND-make.PRS.SG melon-seller
    ‘(God) is appointing (lit. making) me the right-hand vizier of Shah Abbas and making you a melon-seller.’ (Öpengin , KF.–)

In sum, the history of clitic object pronouns in Iranian can be reliably traced back for over , years.
In terms of the formal properties of the exponents, from the earliest attestation they exhibit typical properties of special clitics, such as syntactic mobility (clause-second in Old and Middle Iranian, VP-second in much of Kurdish) and freedom of host selection, though in some languages they may also be closely integrated into the predicate, as in Central Kurdish. Yet despite their lack of prosodic independence, they have generally failed to advance down the postulated grammaticalization cline beyond the stage of pronouns. In other words, throughout the entire attested history of west Iranian, the object clitics have not evolved into obligatory object agreement in the category of person in any language known to me (see Jügel : , n. ). Instead, the clitic pronouns have remained pronominal expressions of the object, in complementary distribution with co-referent free NP objects, and also omissible if the object is pragmatically recoverable. With regard to the proposed grammaticalization cline from pronoun to agreement (), they have basically remained stuck at the same stage for , years, namely as clitic pronouns.⁷

.. :  (-)     The developments of the Iranian clitic pronouns provide us with a natural laboratory for investigating the differences between subject and object grammaticalization processes, because the initial phonological material was identical for both (cf. Table .), and both began their careers as Wackernagel clitics. In their later developments, however, they have diverged remarkably. The object clitic pronouns have basically remained just that: prosodically dependent object pronouns, in complementary distribution with free-form objects. Nowhere can we ﬁnd a convincing case that they have shifted closer towards an agreement system, bar sporadic cases of clitic doubling mentioned in n. . Note however that this statement applies to object agreement expressed by the descendants of the clitic pronouns. Object agreement is possible in the category of gender (e.g. in Tati: see Stilo, to appear). But the exponents of gender agreement do not originate in pronominal clitics, and thus are the outcome of a different process from the one discussed here. As for the clitic pronouns used for transitive subjects, there was indeed a shift from alternating to obligatory, precisely in line with the predictions of grammaticalization theory.

. FURTHER TOPICS IN IRANIAN GRAMMATICALIZATION In this section I will survey some current topics in grammaticalization within Iranian, with particular emphasis on ﬁndings of high typological relevance. Section .. revisits a classic of Iranian grammaticalization, the object marker =râ, while .. looks at a less well-known instance of the grammaticalization of an auxiliary.

..        A well-known example of grammaticalization in Iranian is that of object case markers. As mentioned, already by Middle Iranian, some west Iranian languages had lost the ⁷ In fact, ‘colloquial registers’ of Persian do permit sporadic instances of object clitic doubling (Samvelian and Tseng ), and this has been argued to represent an instance of agreement. However, the cited examples are pragmatically quite marked. Such clitic doubling is not possible with e.g. indeﬁnite or focal controllers (Rasekh ). I am not aware of any pragmatically neutral syntactic conﬁguration in Persian where object clitic doubling is obligatory. This would seem to preclude an analysis as agreement in the narrower sense. Of course, different researchers delineate ‘agreement’ in different ways (see Corbett ), and the differing viewpoints are in part terminological in nature.

Grammaticalization in Iranian



Old Iranian case morphology, meaning that with the exception of some minimal vestiges in the pronoun system and in kinship terms, Middle Persian and Parthian no longer marked direct objects (Jügel : ). Modern Persian, however, has renovated its case system, and now regularly marks speciﬁc⁸ direct objects via a clitic [=rɒ:], often realized as [=rɔ:], or just [=ɔ:]. Different sources use different conventions for representing this clitic; I render it orthographically with =râ, regardless of the source, or degree of phonetic attrition. Non-speciﬁc direct objects remain unmarked (i.e. modern Persian has DOM). A simple example is the following: () Sârâ=râ did-am Sara=ACC see.PST-SG ‘I saw Sara’ (Bohnacker and Mohammadi : ) Sârâ, as an inherently deﬁnite proper noun, requires overt object marking. The development of the Persian object marker was identiﬁed by Hopper and Traugott (: –) as an example of the development of the cline shown in (): ()

lexical word > postposition > sufﬁx

Note, however, that contemporary =râ is at best a phrasal affix, rather than a component of nominal morphology; it attaches to the final element of its NP, which might, for example, be an adjective:

()  lebâs-e sefid=râ xarid-am
    dress-of white=ACC buy.PST-SG
    ‘I bought the white dress’

In terms of stress placement, Kahnemuyipour () notes that =râ is non-stress bearing, and is ‘outside the phonological word’ (p. ). Thus we need to interpret the ‘affix’ stage of the cline in () fairly loosely, to include clitics and phrasal affixes; even after a millennium of attestation as an object marker, =râ is not a fully morphologically integrated affixal case marker of the kind that characterized the Old Iranian case system. The roots of this accusative clitic can be traced back ultimately to a nominal element rādiy, meaning something like ‘because, on account of’, already used in Old Persian as a postposition in this sense (Kent : ; Paul ), requiring the genitive case of its complement. From this, a postposition rāy developed, with benefactive, possessive, and recipient semantics in Middle Iranian, which continued into Early New Persian (approx. – CE). But already in Middle Persian, it had become extended to use as a marker of definite direct objects, though the exact pathway of the development remains obscure. A Middle Persian example with a direct object is the following:

⁸ ‘Speciﬁc’ is to be understood here as shorthand for ‘related to a complex bundle of factors involving pragmatic identiﬁability, topicality, and speciﬁcity’. In fact, the factors determining DOM in Persian remain disputed; see Paul () for discussion and references.




()  ka ān šagrān rāy zīndag ō amā āwarēd
    that those lions-PL ACC living to PL bring.PL
    ‘That you bring those lions to us alive’ (Jügel : )

The semantic pathway transcribed by rādiy is thus one of lexical to grammatical; from a more concrete lexical meaning ‘because, on account of’, to marking a particular grammatical relation, that of direct object (in fact, with no clear semantic core). Traces of its origins remain, however, for example in the interrogative pronoun čerâ (what=râ), lit. ‘for what, why’. The benefactive/possessive meaning of =râ survived as late as the th century CE, as in the following:

()  ahālī-ye orūpā=râ ‘aqīde īn ast ke
    people-of Europe=râ opinion this is that
    ‘The people of Europe are of the opinion that [. . .]’ (lit. ‘to/for the people of Europe the opinion is . . . ’, Paul : )

Similarly, =râ continues to mark left-dislocated, frame-setting topics:

()  in dar=râ, qofl=aš=râ diruz šekast-am
    this door=râ, lock=POSS.SG=râ yesterday break.PST=SG
    ‘This door, I broke the (lit. its) lock yesterday’ (p.c. Mohammad Rasekh-Mahand)

Remnant semantic content is also visible in the restriction on marking only specific direct objects, possibly an inheritance of its origin as a marker of recipients, which in discourse tend to be overwhelmingly definite. However, one would, on this view, also expect the feature of +/- human to be relevant in modern Persian DOM, but this does not appear to be the case. Hopper and Traugott (: –) consider the shift towards direct object marking to involve a ‘contraction of range with respect to thematic roles’, but this view is based on the assumption that the direct object role is restricted to Patients and Themes in Persian, which is not the case (the object in (), for example, is not a Patient).
In fact, the semantic development fits well with the most basic assumption of grammaticalization as involving a loss of lexical meaning; the direct object role is defined syntactically, not semantically, and the association of =râ with this role (though the match is not perfect, as shown above) can indeed be interpreted as a clear case of a shift from semantic to grammatical function. It is worth pointing out that there is nothing inevitable in the Persian developments just sketched. Other west Iranian languages likewise lost inherited case morphology, but have not to this day replaced it (e.g. Central Kurdish, Southern Kurdish), leaving subjects and objects equally unmarked. Notably, there has been no shift towards (S)VO in these languages, which remain, as does the totality of Iranian, (S)OV. Elsewhere, innovated object markers have been recruited, though from sources etymologically distinct from the rādiy postposition discussed above (see Windfuhr ; Haig : ch. ; Stilo ; Paul  for discussion of Iranian case systems), while in other contemporary languages, a particle cognate with râ is attested, but it has not become an accusative marker. Thus the grammaticalization of rādiy > râ in Persian emerged through a contingent combination of factors that together yielded this specific development. The larger pan-Iranian framework, however, is the renewal of strategies for object marking which has led to several distinct solutions in the individual languages.

..     Perhaps the most fertile area of grammaticalization in Iranian involves the renewal of verbal TAM categories from erstwhile full verbs, a topic that has also been central to grammaticalization research since its inception (see Hopper and Traugott : – for early discussion, Hengeveld  and Narrog  for recent developments). Well-known cases include the development of future tense markers from verbs of motion (English gonna < going to, Spanish ir ‘go’+ a + inﬁnitive), or from ‘have’ in Romance. A particularly well-documented case in Iranian is the development of an analytical future tense with an auxiliary verb originally meaning ‘want’, xâstan. The development of a future marker from ‘want’ has obvious parallels in, for example, Germanic, and I will not discuss the Persian case here. For the grammaticalization and univerbation of ‘be’, see Jügel (: –). Other cases of grammaticalization involve the modern Persian modal particle bâyad (obligation), from an erstwhile ﬁnite verb construction, the development of a passive auxiliary from ‘come’ (e.g. hatin ‘come’) in Kurmanji Kurdish (Öpengin and Haig ), or from ‘become’, as in Persian šodan, which itself goes back to a verb of motion, cf. Old Persian š(i)yav‘to set, go forth’ (Cheung : ). A development that can be related to the grammaticalization of auxiliaries is the emergence of complex predicates, consisting of a non-verbal element plus a light verb, to express numerous basic verbal meanings (basically, this is the main strategy for creating new verbal lexemes in much of Iranian, where productive derivational morphology for this function is lacking; see e.g. Haig  and Samvelian and Faghiri , for discussion of complex predicates in Iranian). Finally, I should point to a typologically very rare type of grammaticalization that has recently been discussed in Iranian linguistics, namely the development of deﬁniteness sufﬁxes (e.g. 
in Central Kurdish) from diminutives; see Jahani () and Haig (to appear) for discussion. A less well-described and cross-linguistically more unusual development is the grammaticalization of a continuous marker from a lexical ‘have’ verb. A verb ‘have’ is not mentioned as a lexical source of continuous aspect in Heine and Kuteva (), so we can assume that the development is fairly unusual. In this section I will brieﬂy outline the facts, while referring to Davari and Kohan () for a more detailed exposition.

..    ‘’:   ˆ ˇ  Persian verbs have two stems, called here a past and a present stem, respectively associated (approximately) with past and present temporal reference. Both stems form the base for a number of distinct TAM formations. The present tense is broadly characterized by an opposition between forms preﬁxed with a stressed preﬁx mi-, used for all forms of the indicative, and forms that lack this preﬁx. The latter may be



Geoffrey Haig

T .. Preliminaries for the Persian tense and modality system Gloss

Past stem

Present stem

ind.prs.sg

sbj.prs.sg

do, make hit go

kard zad raft

konzanrav-

mi-kon-am mi-zan-am mi-rav-am

be-kon-am be-zan-am be-rav-am

preﬁxed with be- (with variant bo-), generally referred to as a subjunctive preﬁx, or may lack a preﬁx entirely (as in certain imperatives). The basics of this system are illustrated in Table . for three common verbs. Within this system (largely adhered to in the standard written language), there is no grammaticalized distinction of aspect in the present tense. In other words, the same indicative present verb form with mi- may be used for habitual, for punctual, or for ongoing and continuous events; aspectual nuances are supplied contextually, or must be inferred from the lexical verb semantics (Aktionsart), as illustrated in the following (mi- with a present tense verb is glossed with INDIC(ATIVE):⁹ () dar_bâreye mosabeq-e futbol sohbat mi-kon-im about game-EZ football conversation IND-do.PRS-PL ‘We are talking about the football game.’ (process, interpretable as referring to the current time of speech) () mâ dar kelâs ingilisi sohbat mi-kon-im PL in class English conversation IND-do.PRS-PL ‘We (generally) talk English in the classroom.’ (process, interpretable as referring to a habitual event, rather than the time of speech) () vaqti suzan=râ dar bâdkonak foru ø-kon-i, when needle=ACC in balloon downward SBJ-do.PRS-SG mi-tark-ad IND-burst.PRS-SG ‘If you push a needle into a balloon, it bursts’ (punctual, appropriate for referring to a general truth holding at all times) () man esm=aš=râ mi-dân-am SG name=POSS.SG=ACC IND-know.PRS-SG ‘I know his name’ (state, obtains for an unbounded period, including the time of speech) The west Iranian verbal system underwent a transition from the aspect-based system of Old Iranian to a tense-based system, built on the fundamental opposition between

⁹ I am grateful to Shirin Adibifar for native-speaker intuitions on these examples. My analysis differs from many others in that I do not ascribe any aspectual value to mi- when it occurs with a present stem.

Grammaticalization in Iranian



the two stems. In the wake of this major restructuring, a number of contemporary Iranian languages have since developed secondary aspectual distinctions, recruited from various sources. The mi- preﬁx just discussed is the result of such a grammaticalization, going back to an erstwhile adverb, hamē ‘alwaysʼ (Windfuhr : ). There are numerous functionally parallel (though etymologically distinct) counterparts across Iranian, yielding a particularly compelling case of parallel grammaticalization across related languages (or ‘drift’, to use Sapir’s term); an example is the di- preﬁx in Northern Kurdish, illustrated in () and (). Although I have argued here for an aspect-neutral analysis of the modern mipreﬁx with present stems, its origins as an aspectual marker are still evident from two facts: First, it does not occur with the inherently state predicate dâštan ‘haveʼ (present stem dâr-), or with the defective copular verb hastan ‘be’. Thus to express indicative present ‘I haveʼ the expected form *mi-dâr-am is not possible, and instead we ﬁnd dâr-am. This is presumably due to an incompatibility of the original continuous meaning of the source of mi- with the stative meaning of dâstan, or with the copula, quite comparable to the ungrammaticality of present continuous with English have (cf. *I am having to express a state of possession). Otherwise, however, mi- is compatible, and indeed obligatory, with indicative present forms for all Persian verbs, regardless of their aspectual semantics, and irrespective of whether they are negated or interrogative. Second, the aspectual component of mi- is preserved with past stems, where it optionally expresses imperfective (past continuous, but also certain modal nuances), and forms an opposition with an unmarked perfective past. In combination with the present stem, however, this opposition is simply not available (in the morphology at least); hence I gloss mi- in the present as IND,¹⁰ but in the past as IMPV. 
Where these erstwhile aspect markers have become bleached of aspectual content (as I have argued above for Persian mi-), additional aspectual markers may be added to the system. In spoken Persian, an innovated continuous form has emerged, based on a finite form of the verb dâštan ‘have’. In this construction, the main verb is likewise finite; it must take the same tense as dâštan, carry the mi- prefix, and the appropriate person and number agreement. The first two examples illustrate the past tense, while () and () illustrate the present tense (the forms of dâštan are glossed PROG(RESSIVE)).

()  dâšt az ânjâ rad mi-šod
    PROG.PST(SG) from there passing IMPV-become.PST(SG)
    ‘He was passing by there’ (Adibifar , g_f_)

()  faqat dâšt-am komak mi-kard-am
    just PROG.PST-SG help IMPV-do.PST-SG
    ‘I was just helping’ (Davari and Kohan , glosses modified)

¹⁰ Presumably for these two reasons, much of the relevant literature on Persian does assume an aspectual contribution of mi- in the present tense, and glosses it as e.g. durative (Taleghani ), or imperfective; see Davari and Kohan () for discussion.




()  dâr-i či kâr mi-kon-i?
    PROG.PRS-SG what work IND-do.PRS-SG?
    ‘What are you doing?’

()  dah sâl=e dâr-e piano mi-zan-e
    ten year=COP.SG PROG.PRS-SG piano IND-hit.PRS-SG
    ‘He has been playing piano for ten years’

The last example makes it clear that this is not merely objective marking of ‘continuous/progressive aspect’, but involves more subtle aspects of speaker’s stance towards the assertion being made. Obviously the subject in () has not been playing the piano continuously (without interruption) for ten years. Rather, the speaker has chosen to portray an activity undertaken at regular intervals in terms of a continuous process. Davari and Kohan () point to further semantic nuances expressed by progressive dâštan, which are not immediately derivable from a purely aspectual sense. With inherently non-progressive verbs, such as ‘fall’, or ‘die’, the use of the dâštan-auxiliary expresses prospective aspect ‘is about to’ (examples from Davari and Kohan , glosses adapted).

()  dâr-e mi-mir-e
    PROG.PRS-SG IND-die.PRS-SG
    ‘He is about to die’

()  begir=eš dâr-e mi-oft-e
    hold.PRS.IMP=SG.P PROG.PRS-SG IND-fall.PRS-SG
    ‘Hold it! It is about to fall’

In the past, the dâštan-progressive may express a prospective state as the temporal framework within which a punctual action occurs:

()  dâšt-am mi-raft-am ke u zang zad
    PROG.PST-SG IMPV-go.PST-SG CPL SG ring strike.PST.SG
    ‘I was about to go when he called’ (Shirin Adibifar, p.c.)

Davari and Kohan () interpret these non-progressive uses of the dâštan progressive as evidence of increasing ‘subjectification’ of the aspectual marker, drawing on Narrog’s () account of an increase in ‘speaker orientation’. The origins of the dâštan progressive are obscure. Jügel (: ) finds no evidence for early grammaticalization of ‘have/hold’ as an auxiliary in his Middle Persian corpus. Davari and Kohan () cite Dehghan (), who mentions written examples from the beginning of the th century as the earliest attestations. It is of course likely that these constructions had been available in the spoken language for centuries, but have only relatively recently been committed to writing. Davari and Kohan () see the point of origin for this construction in the reanalysis of a construction involving a centre-embedded relative clause containing the verb dâštan, which is reanalysed as a single clause, though this remains to be confirmed in a more comprehensive survey. As an example for this kind of bridging context, they cite among others the following:

Grammaticalization in Iranian



() parde=i ke dâšt-and bâz mi-kard-and
curtain=INDF that have.PST-PL open IMPV-do.PST-PL
‘The curtain that they had, they were opening’ > ‘They were opening the curtain’

It should be noted that the dâštan progressive is not fully integrated into the TAM system of Persian, because it lacks a negated form. A functionally similar renewal of aspect has also occurred in the Behdinî dialect of Northern Kurdish (the dialect spoken in the northernmost parts of Iraq bordering on Syria and Turkey). In Behdinî, an additional aspectual distinction is available in the present tenses to indicate ongoing and immediate activity. Thus () contains the particle yê, and indicates immediate and ongoing activity,¹¹ in contrast with (), lacking the particle, which stresses habitual, rather than current, activity:

() Azad yê l=Duhok-ê šul di-ke-t
Azad PROG.M.SG in=Dohuk-OBL.F work IND-do.PRS-SG
‘Azad is working (now) in Duhok’

() Azad l=Duhok-ê šul di-ke-t
Azad in=Dohuk-OBL.F work IND-do.PRS-SG
‘Azad works (regularly) in Duhok’

The etymological source of the progressive particle in () is quite different from the source of the Persian dâštan progressive (unlike Persian, Northern Kurdish entirely lacks a lexical HAVE-verb). The particle yê originates in a linking particle (known as the ezafe in Iranian linguistics), used among other things to link adnominal attributes to nouns. The main evidence for associating the aspectual particle of () with the ezafe is that both inflect for number and gender in a very similar manner. Thus if the subject in () were female, the particle would have the form ya, and if the subject were plural, yêt. Originally the ezafe was of pronominal or demonstrative origin, and this appears to be the source of its function here; presumably it grew from some kind of cleft construction (‘Azad, the one working in Dohuk’), though this remains speculative (see Haig  for discussion).
Whatever the source, the result is that in the present tense an additional aspectual distinction has been added to the verb system, achieving a similar result to the dâštan progressive in Persian. The parallels to the Persian developments also extend to the constraint against negation: the innovated progressive of Behdinî is likewise ungrammatical with a negated main verb.

¹¹ It is notable that in colloquial Behdinî, the ‘progressive’ form is exceedingly frequent, probably more so than the simple form in (). Its claimed aspectual force is thus weak, and only really emerges when speakers wish to make a contrast. However, the subtle nuances of usage remain largely unresearched.

. SUMMARY This chapter has surveyed some of the grammaticalization processes that can be traced back across two and a half millennia of Iranian languages, focusing on the contrast between the grammaticalization of person indexing for subjects from
erstwhile clitic pronouns and the non-grammaticalization of the cognate set of pronouns in object function. Despite identical input material, the trajectories of these two processes have been very different. I argue that in the case of object clitics, for well over two thousand years the object clitic pronouns have remained just that: clitic pronouns, with little further development towards object agreement. From their earliest Old Iranian origins the clitic object pronouns were prosodically deﬁcient, bound elements. In the course of their subsequent development they have undergone changes in placement principles, and formally may superﬁcially resemble afﬁxes, but there has been no evident shift towards becoming obligatory indexes (agreement) rather than alternating indexes. The subject pronouns, on the other hand, have demonstrably developed into agreement markers in some languages. Although we lack evidence for the precise pathway of development, I have suggested that the process was inaugurated through an increase in frequency of these pronouns already in Middle Iranian, in which the clitic pronouns began to occur in contexts where previously pronouns would have been dispreferred (in same-subject coordination, for example). As mentioned, much of the inherited inﬂectional morphology (case, gender, and TAM categories) eroded in a relatively short transition between the Old and Middle Iranian periods, and the history of grammaticalization can to some extent be seen as the gradual reacquisition of lost morphological categories, for example case, modality, and aspect (though the restitution of gender appears to be rare). Notably, across different Iranian languages, these processes do not necessarily draw on cognate source material, and may in fact be absent altogether. Thus some languages lack grammaticalized aspect in the present tenses; others have not restituted structural case marking (e.g. still lack an accusative or genitive case). 
Where these inﬂectional categories have been reinstated, the process has been slow, and most examples still bear obvious traces of the source construction, e.g. through stress patterns (Persian =râ), or traces of syntactic mobility of the marker (e.g. subject agreement in Central Kurdish), or a failure to cover all the relevant slots in a paradigm (e.g. the failure of Persian dâštan-progressive to occur with negated main verbs). There are thus quite discernible structural differences between the ‘inherited’ morphology (where it has survived at all) and the ‘innovated’ inﬂectional morphology of Iranian (Haig : ; to appear, b). To reach that state, then, the inherited morphology itself must have gelled over an exceedingly long developmental period—far longer than the two thousand years of attested history of Iranian languages available for our perusal. It is thus scarcely surprising that the origins of (for example) inherited verbal person agreement morphology lie far beyond the limits of historical records, and are likely to remain unknown to historical linguists. Inﬂectionalization is evidently a process that requires millennia, not centuries, to achieve, though paradoxically, its loss can be quite rapid, even catastrophic.

5 Grammaticalization in the languages of Europe

ÖSTEN DAHL

. INTRODUCTION Is it possible to give a characterization of grammaticalization processes in European languages? To make any sense, such a characterization must tell us not only what grammaticalization in Europe is like, but also how it differs from grammaticalization in other parts of the world. The problem that arises is not primarily that we know too little about grammaticalization in Europe; it is rather that we know so little about it elsewhere. For many European languages, we can follow their history back for more than two millennia, with abundant written documentation. Outside of Europe, this is the case only for a very limited number of languages. It is also the case that the scientiﬁc study of language has been very much focused on the major languages of Europe, and most linguists have been native speakers of European languages. The study of grammaticalization has been no exception, with the standard examples tending to be taken from the history of ‘Standard Average European’ (SAE) languages. So it is unlikely that we will ﬁnd anything which deviates radically from conventional wisdom by just trying to see what is found in European languages. Excluding chance as an explanation, similarities in developments between languages could be explained either (i) through similarities in preconditions—either internal, i.e. shared structural properties, or external—shared ecologies, or universal cognitive properties, or (ii) through inﬂuence due to language contact. The ﬁrst type of explanation, which is generally favoured in typology, allows for generalizations over a set of languages—but it demands that we treat the members of the set as independent cases, in order to exclude inﬂuence through contact. By contrast, in areal linguistics, scholars normally want to exclude parallel independent developments. There are stumbling blocks of different kinds here. Someone who is looking for

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Östen Dahl . First published  by Oxford University Press



Östen Dahl

independent cases may ﬁnd that there simply are not enough of them. This problem is bad enough in typology of the normal all-encompassing kind, but is even worse if we are trying to generalize over a region. But we also have to ask whether it is in principle possible to disentangle the different types of explanations. Linguists tend to have a preference for explanations that involve one type of causal mechanism, but in a more probabilistic model we would rather think in terms of probability-enhancing factors that do not exclude each other.

. EUROPE AS A LINGUISTIC AREA It is questionable if it makes sense to treat Europe as a whole as one linguistic area. It is not even clear how to delineate Europe as a geographical entity; in particular, there are different opinions about how much of the region around the Caucasus should be included, and as pointed out by Heine and Kuteva (), where the borderline is drawn can make quite a difference given that this is where the greatest linguistic diversity in Europe is found. In addition, the languages in the Caucasian region are also typologically quite different from the languages in other parts of Europe. This holds to some extent also for the non-Indo-European (Uralic and Turkic) languages of the rest of European Russia, which tend to be somewhat neglected in discussions of European languages. In actual practice, discussions of areal phenomena in Europe tend to focus on the western part of the continent. Haspelmath (b) defines a set of languages that he refers to as ‘Standard Average European’ (SAE, a term first used by Benjamin Lee Whorf in : see Whorf ), comprising ‘Romance, Germanic and Balto-Slavic languages, the Balkan languages, and more marginally also the westernmost Finno-Ugrian languages’ (Fig. .). Within the SAE languages he finds a nucleus consisting of the continental West Germanic languages (Dutch and German) and Gallo-Romance (French, Occitan, and northern Italo-Romance) that he, following van der Auwera (), calls the ‘Charlemagne Sprachbund’, since the area it covers largely coincides with Charlemagne’s realm in the late th century (and also, incidentally, with the original six member states of the European Community). Haspelmath lists the following as ‘major Standard Average European features’:

1. definite and indefinite articles;
2. relative clauses with relative pronouns;
3. ‘have’-perfect;
4. nominative experiencers;
5. participial passives;
6. anticausative prominence;
7. dative external possessors;
8. negative pronouns and lack of verbal negation;
9. particles in comparative constructions;
10. relative-based equative constructions;
11. intensive-reflexive differentiation.

Grammaticalization in Europe



[Map not reproduced. Languages labelled: Icelandic, Finnish, Norwegian, Komi, Swedish, Estonian, Latvian, Lithuanian, Irish, Welsh, English, Breton, Basque, Portuguese, Spanish, Dutch, Polish, German, French, Czech, Greek, Maltese, Udmurt, Tatar, Ukrainian, Hungarian, Romanian, Italian, Slovenian, Serbo-Croat, Bulgarian, Albanian, Sardinian, Russian, Lezgian, Georgian, Armenian, Turkish. Contour labels: 6, 7, 8, 9.]

Fig. .. Standard Average European languages. Number of SAE features according to Haspelmath (b) is indicated. Only languages sampled by Haspelmath are included.

The following are his ‘further likely SAE features’:

1. verb fronting in polar interrogatives;
2. comparative marking of adjectives;
3. subject person affixes as strict agreement markers;
4. ‘A and B’ conjunction;
5. comitative-instrumental syncretism;
6. suppletive second ordinal.

. GRAMMATICALIZATION PROCESSES IN EUROPEAN LANGUAGES In Heine and Kuteva ()—the hitherto most ambitious attempt to map grammaticalization processes in the languages of the world—around  ‘source–target pairs’ found in grammaticalization processes are enumerated, with examples of each pair. The authors did not, however, aim to give total coverage of all known cases of these processes. Thus, the fact that some grammaticalization process is not listed for a given part of the world does not mean that it is not found there. One further limitation is




T .. Source–target pairs in Heine and Kuteva () Outside Europe

Western Europe

Eastern Europe

Attested grammaticalizations

+

+

+



+

+

+

 +

 

+ +

+

 

+ +



that the authors concentrate on cases where examples from more than one language family were found, which means that some examples of processes unique to some region may have been missed. On the other hand, the percentage of processes that are attested from non-European languages may tell us something about the degree to which our picture of grammaticalization has a Eurocentric bias. As it turns out, things are not quite as bad as one might perhaps think in view of what was said in the Introduction. Table . shows the distribution of the examples in the book. ‘Western Europe’ comprises the Romance, Germanic, and Celtic languages and Basque. As can be seen from the table, more than  per cent of all the source–target pairs in the book are attested in non-European languages. Only about twenty pairs are only exemplified by European languages, and of those only about half come from western Europe. Fig. . shows the ‘European-only’ source–target pairs, sorted into rough groupings. As can be seen, it is a rather mixed bunch. It is indeed hard to draw any conclusions about the character of grammaticalization in Europe from it.

In Heine and Kuteva (), the focus is on a small set of processes that have contributed to the typological profile of SAE languages:

1. grammaticalization of demonstratives and the numeral ‘one’ into definite and indefinite articles, respectively;
2. development of a perfect based on expressions for predicative possession;
3. development of comitative-instrumental polysemy by extension of the use of the comitative marker to express instruments;
4. extension of the use of question words to mark clause subordination.

It is noteworthy that these processes all seem to have spread along similar routes during the same period.
With origins in the Mediterranean area, they extended to western and northern Europe in the ﬁrst millennium of our era, with the Romance and Germanic languages as their core area and a more hesitant inﬂuence on the surrounding Basque, Celtic, Uralic, Baltic, and Slavic languages.

[Diagram not reproduced. Group labels: TENSE-ASPECT, POSSESSION, SUBORDINATION, ADPOSITIONS. Source > target pairs shown: use > habitual; come > continuous; change of state > future; perfect > perfective; copula > avertive; copula, locative > h-possessive; locative > b-possessive; A-possessive > partitive; w-question > complementizer; demonstrative > complementizer; allative > temporal; field > out; intensive-refl > even; circle > around (spatial); owe > obligation; pass > after; environs > around; near > after; trace > after.]

Fig. .. Grammaticalizations possibly unique to Europe

Heine and Kuteva (: ) claim that ‘the processes leading to the gradual “Europeanization” of peripheral languages are to some extent predictable within limits’, and that languages that are in intense contact with an SAE language for a long time will tend to undergo processes in those directions. However, they hedge this statement by saying that ‘the result [. . .] will not necessarily be spectacular’. Indeed, we do not in general see a wholesale repetition of the four processes enumerated above outside the core area or after the medieval period. It does not appear possible at the present stage of development of areal typology to identify more precisely what factors delimit the spread of grammaticalization processes, but it may be noted that these factors are probably both linguistic and non-linguistic. Thus, although the core SAE area is included in the sphere of inﬂuence of the medieval Catholic Church, many of the languages at its eastern border have been largely untouched by at least the ﬁrst three of the processes in the list. Here, the fact that the languages in question are not Germanic or Romance may have been more important. In the next section, I shall focus on the second of the processes listed above, ‘the development of a perfect based on expressions for predicative possession’.

. ARE EUROPEAN PERFECTS SPECIAL? Verbal forms or constructions traditionally labelled as perfects are found in the grammars of many European languages. In the languages spoken today, they are more or less universally periphrastic and most of them involve auxiliaries.




Usually one speaks of two kinds of perfects depending on what auxiliary is used: (i) ESSE or ‘be’ perfects and (ii) HABEO or ‘have’ perfects—notions that will have to be problematized in what follows. While ESSE perfects have transparent sources in resultative constructions, the standard view of the origin of HABEO perfects goes back at least to Behaghel (: ), according to whom a sentence such as he has found him originally meant ‘he has/possesses him as someone found’ (‘er hat ihn, besitzt ihn als einen Gefundenen’), where the possessor role of the subject was reinterpreted as that of an agent due to what we would today call the conventionalization of an implicature. Subsequently, the verb ‘have’ obtained the character of an auxiliary and the participle was reanalysed as the main verb, sometimes leading to a change in word order. The construction also spread from transitive to intransitive verbs and extended its meaning to non-resultative readings. The situation for perfects in the SAE area is well known; detailed accounts are found in Heine and Kuteva (: ch. ) and Drinka (). As mentioned in section ., constructions involving verbs originally meaning ‘have’ or ‘hold, keep’ have spread over the Romance and Germanic areas. In some languages, notably in the core SAE area, the two types of perfects are combined in a split system, where a subset of intransitive verbs use the ﬁrst type and all other verbs the second type. In the same region, these constructions have also largely been grammaticalized further to become perfectives or general pasts. The situation in eastern Europe is more variegated, with some clear examples of ESSE perfects in the north and HABEO perfects in the south. While ESSE perfects are more widespread, the European HABEO perfects are of particular interest to the topic of this chapter in view of the apparent lack of parallels outside Europe. 
Both Heine and Kuteva () and Drinka () suggest that the spread of possessive perfects in Europe is due to language contact. Heine and Kuteva, who see the spread as involving replication of the grammaticalization process rather than borrowing in the sense of transfer of form–meaning units, support their claim about the role of language contact by the cross-linguistic rareness of the development, but also by the argument that the resulting perfect construction is most strongly grammaticalized in languages that must have provided the model for replication. The more geographically and/or genealogically peripheral a European language is, the less likely it is to have a full-ﬂedged HABEO perfect. At the same time, it should be emphasized (as is also done by the cited authors) that many details of the process remain unclear, and that the role of contact—or its direction—cannot in all cases be fully established. One difﬁculty lies in the reliance of accounts of the early stages of the development of perfects in Europe on a limited number of examples from written sources with sometimes dubious interpretations. To a large extent, we simply do not know what was going on in the spoken language. Also, those early stages were often characterized by a multitude of similar but not identical constructions, the relations between which tend to remain inscrutable. In recent typological literature, HABEO perfects are commonly referred to as ‘possessive perfects’. This notion is more problematic than it may seem. Taxonomies of constructions are often ambiguous as to whether they should be understood as




pertaining to synchronic form or diachronic source. A basically synchronic deﬁnition of possessive perfects is given by Heine and Kuteva (: ): a possessive perfect is a construction which takes the form of a predicative possessive construction (I have a car) where the direct object is replaced by a non-ﬁnite verb form (I have built a house)—it would perhaps have been more precise to say that it is replaced by a phrase headed by such a verb form. But the deﬁnition is problematic in that it would exclude cases such as the Spanish haber perfect, since haber is not used in possessive constructions in modern Spanish. The alternative, then, is a diachronically based deﬁnition, as in Dahl and Velupillai (), where possessive perfects are said to be historically derived from possessive constructions. In the context of this chapter, it is in any case the grammaticalization process behind the constructions that is of interest. But recently the existence of a possessive source has been questioned by several scholars (Jacob ; Nuti ; Acosta ) for a central case—the HABEO perfect of Latin and its reﬂexes in the Romance languages. It is well known that transitive verbs for predicative possession in the Indo-European languages develop relatively late out of verbs with more concrete meanings such as ‘grasp’, ‘hold’, and ‘keep’. In Early Latin (i.e. Latin in the period before the ﬁrst century BCE) habeo ‘have’ was still competing with the dative + copula mihi est x construction, and in many contexts still retained its older meaning of ‘keep’. It is possible to ﬁnd a sizeable number of examples of habeo used together with a passive perfect participle in texts from this time. However, as has been argued by Jacob () and Nuti (), this construction—at least in the majority of cases—had a ‘durative-causative’ meaning, that is, it could be described as ‘causing something to remain in a state over a period in time’. 
Here, then, the previous meaning of habeo is clearly retained, expressing what we can call ‘stative causation’. Consider the following examples from the works of the playwright Plautus (quoted from Drinka : ): ()

Early Latin
vir me habet pessumis despicatam modis
man.M.SG.NOM me.F.ACC.SG have.SG.PRS.ACT worst.M.ABL.PL despised.F.ACC.SG manner.M.ABL.PL
‘My husband holds me in a state of scorn in a most unworthy manner’ (lit. ‘holds me despised’) (Casina , , )

()

Early Latin
Sed hoc tu tecum tacitum habeto
But this SG.NOM with_you silence.PPP.N.ACC.SG hold.SG.FUT.IMP
‘But keep this a secret to yourself’ (Poenulus , , )

Notice that Drinka translates habeo as ‘hold’ or ‘keep’ rather than by an English perfect. I think that the semantics of this construction is too far from the semantics of constructions usually labelled as perfects for them to be seen as belonging to the same




category (‘gram type’). The use in imperatives, as in (), also suggests that the construction is not a perfect. On the other hand, it is not so difﬁcult, as also argued by Jacob and Nuti, to see how a ‘durative-causative’ construction could acquire uses more typical of perfects, involving what we can call ‘dynamic causation’. Someone who keeps something hidden somewhere is also likely to be the person who put it there in the ﬁrst place. Notice that verbs for ‘hide’ are typically ambiguous between dynamic and durative causation. I will not here go into the rather contentious relationships between the developments in Latin and Greek; but it can be noted that parallel examples to the early Latin ones are found somewhat later in Greek, involving the verb ékhein ‘have’ and a passive participle, as in the following example from Plutarch (c.– CE): ()

Koiné Greek
Toùs mèn adelphoùs [. . .] eíche [. . .] kekrumménous
the PTC brother.ACC.PL have.SG.IMPF hide.PPP.M.ACC.PL
‘she kept her brothers hidden’ (Pelopidas .)

Similarly in the New Testament: ()

Koiné Greek
kúrie, idoù hē mnã sou, hḕn eĩchon apokeiménēn en soudaríōi
lord.VOC behold DEF.F.SG.NOM pound SG.GEN which have.IMPF.SG hide.PPP.F.SG.ACC in handkerchief.DAT.SG
‘Master, here is your mina, which I kept put away in a handkerchief’

The most striking parallels to what we see in Latin, however, are found in Hittite. As described in Hoffner and Melchert (), Hittite has a perfect strongly resembling the split auxiliary constructions found in Western Europe, using (i) the auxiliary verb ḫar(k)- + a non-agreeing participle with transitive and some intransitive verbs, (ii) the copula eš- + a participle which agrees with the subject for other intransitive verbs. The lexical meaning of ḫar(k)- is given as ‘have, hold, keep’. Although there is no etymological connection to the verbs for ‘have’ in other Indo-European languages, the original meaning appears to be the same: Puhvel () says that ḫar(k)- is ‘convincingly connected’ to Latin arceo ‘hold in, shut up, hold off; keep at a distance, hinder’. It is also this meaning that is manifested in what is called the ‘stative construction’. From a formal point of view, this construction is by and large identical to the perfect but has a different semantics in that it ‘expresses the maintenance of a state, either in the present or in the past’ (Inglese and Luraghi, to appear): ()

Old Hittite
nu KUR-e paḫḫašnuwan ḫarker
CONN land.ACC.PL protect.PTCP.N/A.N have.PST.PL
‘They kept the land protected.’ (KUB . i , NH/NS)

Grammaticalization in Europe ()



Old Hittite nu=mu DINGIR-LUM ištamanan lagān CONN=SG.DAT god ear.ACC bend.PTCP.N/A.N ḫar(a)k have.IMP.SG ‘O god, keep your ear inclined to me.’ (KUB . i –, NH/NS)

It can be seen that the Hittite ‘stative construction’ provides a close parallel to examples like () and () from Early Latin. Moreover, like its Latin counterpart, it appears to be anterior to the use of ‘have’ as a perfect auxiliary. Luraghi and Inglese agree with the claims in Boley (, ) that only the stative construction was found in Old Hittite (– BCE), although they point to a few possible ambiguous examples which could serve as ‘bridging contexts’ for the ensuing development of the perfect. To sum up what we have seen here: One way of interpreting the Latin (and maybe also the Greek) data is to postulate a development in which a transitive verb with the original meaning of ‘grasp, hold’ obtains two new roles, almost in parallel: (i) as the head of the major predicative possession construction; (ii) as the auxiliary in a perfect construction—the latter via combinations with passive participles expressing ‘maintenance of a state’. The striking parallels found in Hittite strengthen the plausibility of this account—in which no clear role for the notion of possession is apparent. On the other hand, at least in the case of Latin, it takes almost a millennium from the ﬁrst attestations of ‘have’ + passive participle combinations to the point where we can speak of a grammaticalized perfect with a reasonably high text frequency. At that point, ‘have’ verbs not only were the preferred way of expressing predicative possession but were well on their way to becoming the multifunctional auxiliary and/or light verb found in modern European languages. In other words, the place of the ‘have’ verbs in the system at later stages of the development may be as important as the character of the ultimate precursors of HABEO perfects. Furthermore, what happened in the spread of HABEO perfects to other European languages may not have duplicated early stages of the Latin development. 
As for the perfects of Germanic languages, it is controversial whether they arose independently or by areal pressure, and the early history is not well documented. An observation that may not carry too much weight is that Gothic uses an analogous construction to the Greek one in the translation of (): ()

Gothic
frauja, sai, sa skatts þeins þanei habaida galagidana in fanin
lord see that.M.NOM.SG money your.M.NOM.SG which.M.SG.ACC have.PST.SG place.PPP.M.SG.ACC in cloth.DAT.SG
‘Master, here is your mina, which I kept put away in a handkerchief’

The distance in time between the Old Hittite scribes and the Roman author Plautus is one and a half millennia, and the geographical distance is at least , km. From a more global perspective, on the other hand, Hittite and Latin both belong to the same Mediterranean civilization and the same language family (like Greek also).




Thus, although a direct causal link cannot be assumed, it also cannot be excluded that there are shared features of these languages that are favourable to the rise of perfects of the kind we see here. One such factor could be the presence in the language of a verb with the polysemy pattern of Latin habeo, Greek ékho, and Hittite ḫar(k)-, in particular one shifting from an original meaning ‘hold, keep’ to the general expression of predicative possession. Admittedly, this does not make possessive perfect as a comparative concept in the sense of Haspelmath () any less problematic. However, it is still of interest to look for typological parallels to the European HABEO perfects, and for possible connections between perfects and possessive constructions elsewhere. To start with, the languages where we ﬁnd the HABEO perfects of western Europe and the Balkans—and also Hittite—all employ one speciﬁc ‘event schema’ (Heine b) or ‘encoding strategy’ (Stassen ) as their major construction for predicative possession, viz. the ‘Action Schema’ (Heine) or ‘Have-Possessive’ (Stassen), where the possessor and possessee are expressed as arguments of a transitive verb. In the languages of the world, this type of predicative possession construction is a minority option but still the most common type in Stassen’s -language sample, and is found in languages belonging to  language families from many different parts of the world. One might perhaps expect to ﬁnd parallels to the European HABEO perfects in this group. However, as it turns out, the constructions suggested to be ‘possessive perfects’ in the literature are not in general linked to possessive constructions based on transitive verbs. Thus, in some languages spoken east of the SAE area there are examples of perfects or perfect-like constructions which would ﬁt Heine and Kuteva’s synchronic deﬁnition of possessive perfects, i.e. 
their structure is analogous to that of the major predicative possession construction in the language, although that construction follows a different strategy: a locational one. The best-known examples are from North Russian dialects, but other languages in the Circum-Baltic area have also been claimed to have such a perfect, as in the following example from Estonian, where the subject is in the Adessive case, which is also used to mark the possessor in predicative possession constructions. ()

Estonian
Mu-l on poe-s käi-dud.
I-ADES be.PRS.SG shop-INESS go-PPP
‘I have done the shopping.’/‘I have been to the store.’ (Lindström and Tragel : )

The link to possession of the constructions in the Circum-Baltic languages has been disputed: according to Seržant (), the source is instead ‘a patient-oriented resultative construction based on the copula with a predicative resultative participle’. In such a construction, the subject slot was ﬁlled by what would later become the direct object, and the prepositional phrase which later took on agentive meaning was ‘an adverbial referring to a participant that is physically or mentally affected by the resultant state’. Seržant does not provide any clear examples of such adverbials, however, which makes it difﬁcult to evaluate his argument, particularly in view of the formal coincidence with predicative possessive constructions, which he admits

(p. ) ‘may have provided additional reinforcement for the development of the adessive PP into the Agent and, subsequently, subject of the perfect’. Similar difﬁculties of interpretations are found with the formations in Indo-Iranian languages, Armenian, and Georgian with dative- or genitive- marked subjects that have been regarded as examples of possessive perfects. Three further candidates for possessive perfects are mentioned by Heine and Kuteva (: ) as not exhibiting ‘any marked areal clustering’: Hdi (Chadic), Chukchi (Chukotko-Kamchatkan), and Ancient Egyptian (?). The claim that Hdi has a possessive perfect refers to what in Frajzyngier and Shay () is called the ‘stative aspect’ construction used in a sentence such as: ()

Hdi
ndá xná hlà
STAT cut cow
‘the cow is slaughtered’

Frajzyngier and Shay hypothesize that the stative marker ndá found here derives diachronically from an ‘associative marker’ ndá ‘with, and’ as used in a predicative possession construction: ‘The notion of “having” is thus extended from having an object to having, or being in, a state expressed by a transitive or intransitive verb’ (Frajzyngier and Shay : ). Thus, in (), the cow would ‘own’ the state of being slaughtered. But notice that this is quite different from the way the relationship between the European HABEO perfects and possessive constructions has been described. In Behaghel’s example he has found him, it is the agent who is the ‘owner’, and what is owned is the object of the action, not the state that results from the action. The case of Old Egyptian, as discussed by Satzinger (: ), is somewhat similar to Frajzyngier’s account of Hdi. It concerns the so-called n-forms (also referred to as sdmn=f), where the marker n is assumed to originally have been a possessive preposition. By a shift of segmentation, ‘[a]n original formation jrj | n + NOUN, “to N. belongs doing,” or perhaps “to N. belongs what has been done,” was grammaticalized into jrj.n | NOUN, “N. has done”’. Here, what is ‘owned’ appears to be the action or its result. The Chukchi example goes back to Nedjalkov (), where verb forms in Chukchi are marked by the prefix ge- and labelled ‘perfect’ (the semantics of these verb forms is problematic, but I will not go into that question here). The ge- forms derive historically from ‘predicative forms denoting possession of the object named by the noun stem’ (Nedjalkov : ). Thus, he shows that the possessive construction (a) and the intransitive perfect (b) have an identical make-up. ()

Chukchi
(a) ge-keli-lin
    POSS-book-SG
    ‘he has a book’
(b) ge-wʔi-lin
    PERF-die-SG
    ‘he has died’


Again, we see that this is not analogous to the assumed source of the European possessive perfect: we apparently have a metaphorical ownership relation between the undergoer of an event and the event itself. Furthermore, the Chukchi perfect shows similarities but also clear differences in its semantics in relation to other perfects; among other things, it is used for remote rather than recent events. It is perhaps unclear if these differences should disqualify it as a perfect in a cross-linguistic sense, as was implied in Dahl and Velupillai (); but in any case the etymology it is given by Nedjalkov entails that we cannot regard its genesis as equivalent to that of European possessive perfects. Paradoxically, it may be argued that the label ‘possessive perfect’ is more apt for languages such as Chukchi and Hdi than for European possessive perfects. Summing up these cases, the problem is that although we may well be dealing with perfects or perfect-like constructions that derive historically from expressions for predicative possession, the analyses offered in the literature do not make it possible to draw close parallels with the European HABEO perfects, whether or not we accept the traditional account of them. Cysouw (: ), quoting the statement in Dahl and Velupillai () that possessive perfects are ‘attested almost exclusively’ in Europe, labels the European HABEO perfects ‘a European quirk’. Indeed, as we have seen, it is hard to ﬁnd close parallels anywhere else, except for Hittite ḫar(k)-, but I think ‘quirk’ may not be the right word, since it gives the impression that there is something odd about it. Rather, the genesis of a construction such as the English perfect is a complex chain of events, at each point inﬂuenced by many different internal and external factors. 
Although grammaticalization is governed by general mechanisms of language change and tends to take place along similar pathways in human languages, we should not expect histories of individual grammatical constructions to be copied in detail elsewhere. Let me now turn to another question: to what extent have the sources of the European perfects had an impact on their semantic and expressional properties? Bernhard Wälchli and I have conducted a study of variation among perfects and similar constructions in the languages of the world based on a parallel sample of Bible texts, which at present comprises translations of the New Testament into around , languages. We have identiﬁed around  relevant ‘grams’ in the languages of the corpus which are more or less similar to the well-studied perfects of European languages. As such, the study is a synchronic one; if (as is the case for the majority of the languages included) no diachronic information is available from other sources, conclusions about grammaticalization paths will remain educated guesses from the synchronic patterns that can be observed. What follows is a summary of our ﬁndings; for details, the reader is referred to Dahl and Wälchli (). One large group that we are paying special attention to comprises what has been called ‘iamitives’ (Olsson ), which, to the extent that their historical origin can be determined, normally derive from words meaning ‘already’ or ‘ﬁnish’.¹ In Dahl ()

¹ Data from creole languages suggest that these are not alternative sources; rather, words for ‘ﬁnish’ develop into iamitives via a stage where they are used in the sense of ‘already’.

and Dahl and Velupillai () these were subsumed under perfects, but in the latter work it was noted that one could argue that they should be seen as a separate ‘gram type’. It may be noted that perfects and words like English already are close in meaning and often occur together, as in She has already arrived, although already in its core uses implies that the event in question arrived earlier than expected. But with stative predicates there is a clear difference in temporal reference: She is already here and She has been here do not mean the same thing. A iamitive such as Indonesian sudah resembles English already in that it expresses a present rather than a past state with such predicates. But the uses of iamitives go beyond what we associate with words like English already in at least two important ways. The ﬁrst is that iamitives tend to become more or less obligatorily used in contexts expressing ‘natural developments’ as with predicates meaning ‘old’, ‘ripe’, or ‘near in time’, i.e. the tendency is to say ‘already old’ rather than just ‘old’ even if there is no earliness element involved. The other tendency involves a general increase in the use of iamitive markers in contexts we associate with perfects—in other words, what the distributional patterns suggest is a convergence with perfects from other sources. It is here that we ﬁnd the largest variation among iamitives, and it is particularly pronounced in the frequency of the markers in question in relative clauses. Our study shows that iamitives in languages spoken in Mainland South East Asia and Indonesia, such as Vietnamese (Austro-Asiatic) and Indonesian (Austronesian), are much closer to European-style perfects than iamitives in, for example, Austronesian languages spoken in the Philippines, such as Cebuano. 
In European languages, we do not in general ﬁnd fully grammaticalized iamitives, but it should be noted that there are some words meaning ‘already’ that have noticeably higher frequencies than English already and which may be seen as representing incipient grammaticalization processes. Cases in point are above all Ibero-Romance já or ya and Afrikaans al, and to a somewhat lesser degree Russian uže and German schon. Our study builds on the assumption that differences in the meanings of expressions will be reﬂected in differences in the distributions of those expressions in a parallel corpus. As a means of visualization of the relationships between the grams in the sample, we use the technique called multi-dimensional scaling, in which differences between entities are reduced to a small number of dimensions representable in a diagram. Such a diagram is shown in Fig. ., where each gram is represented by a circle, the size of which corresponds to the frequency of the gram in the NT corpus. We can see that the frequencies are lowest in the upper left corner and increase going down and to the right. Grams from three clusters of areally and/or genealogically related languages are marked in the diagram: six Austronesian languages spoken in the Philippines, nine Austronesian languages spoken in Indonesia, and eleven grams from Germanic, Romance, and Finno-Ugric languages spoken in Europe. We can see the two ﬁrst clusters as representing the end-points of two different evolutionary pathways starting in the upper left corner and leading to grammaticalized iamitives with a low or high convergence with perfects, respectively. In the third cluster, we ﬁnd European perfects deriving from non-iamitive sources. The fact that they are concentrated in one corner of the diagram suggests that they have somewhat special properties. A more

Fig. . Multidimensional scaling diagram of perfects and iamitives in the Parallel Bible Corpus (labelled clusters: ‘European’, ‘Indonesian’, ‘Philippine’)

detailed analysis of the distribution of the grams in the sample conﬁrms this. We have identiﬁed at least two context types where grams in the European group are more systematically used than any other grams in the sample: negated experientials, as in I never met him, and ‘universal’ perfects, as in You have always been my friend. We think that the general property of these is that they involve a statement about the way the event type identiﬁed by the predicate occurs in an ‘extended time span’. The question that arises is again whether to ascribe these properties to internal factors—more speciﬁcally, to the diachronic source of the perfects in the European group—or to external factors, in other words to areal convergence. We can then note that the perfects in the group belong to two different language families and are derived from two different sources: HABEO perfects and ESSE perfects. Still, they are more similar to each other in their distribution than to any other perfects in the sample. In addition, one gram from the same region, the Welsh perfect, is derived from a third source, a construction involving the preposition wedi ‘after’, and is still fairly close in its distribution to the grams in the European group—and also close to them in the MDS (multi-dimensional scaling) diagram. This calls for an explanation in terms of areal inﬂuence. Still, we cannot exclude the possibility that the fact that HABEO perfects have had a central role in the growth of European perfects may have inﬂuenced the character of all of them. 
This would be in accordance with the ‘source determination hypothesis’ (Bybee, Perkins, and Pagliuca : ), which is similar to the notion of ‘persistence’ of Hopper (): ‘the actual meaning of the construction that enters into grammaticization uniquely determines the path that grammaticization follows, and consequently, the resulting grammatical meanings.’ The source determination hypothesis is conﬁrmed insofar as the grams that derive from ‘already’ and ‘ﬁnish’ (‘iamitives’) on the whole have a different semantics from perfects derived from other sources, as we saw above. But in the case of European perfects, we can see that as a result of areal inﬂuence, grams deriving from different sources may converge so as to at least weaken the signal from the source signiﬁcantly. Similarly, areal

inﬂuence may lead to grams from the same source diverging rather strongly. As an example from outside Europe, we may take the Morisyen (Mauritian Creole) marker ﬁnn and Melanesian Pidgin pinis/ﬁnis, which both derive from words in the respective lexiﬁers meaning ‘ﬁnish’ (Winford : f.), but which are very far from each other in Fig. ., with Morisyen at the upper right end next to the European perfects and Tok Pisin in the lower half, with most of the languages close to it in the diagram also being close to it geographically.
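
The distributional reasoning behind the MDS diagram discussed above can be illustrated with a minimal sketch in Python. It is not the procedure used in the study: the gram names and usage vectors are invented toy data, and the distance measure (the proportion of contexts in which two grams differ) and the choice of classical (Torgerson) scaling are assumptions made only for illustration. NumPy is assumed to be available.

```python
import numpy as np


def hamming_distances(usage):
    """Pairwise proportion of contexts in which two grams differ (used vs. not used)."""
    usage = np.asarray(usage, dtype=float)
    diff = np.abs(usage[:, None, :] - usage[None, :, :])
    return diff.mean(axis=2)


def classical_mds(dist, n_dims=2):
    """Classical (Torgerson) MDS: coordinates whose Euclidean distances approximate dist."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain an inner-product matrix.
    inner = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(inner)
    # Keep the n_dims largest eigenvalues; clip small negative values to zero.
    order = np.argsort(eigvals)[::-1][:n_dims]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Hypothetical toy data: rows are grams, columns are corpus contexts (e.g. verses);
# 1 means the gram is used in the translation of that context, 0 that it is not.
labels = ["gram_A", "gram_B", "gram_C", "gram_D"]
usage = np.array([
    [1, 1, 1, 0, 0, 0, 1, 0],
    [1, 1, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1, 0, 0, 1],
])

dist = hamming_distances(usage)
coords = classical_mds(dist)
frequency = usage.sum(axis=1)  # what circle size would encode in a diagram

for label, (x, y), freq in zip(labels, coords, frequency):
    print(f"{label}: x={x:.2f}, y={y:.2f}, frequency={freq}")
```

In the toy data, gram_A and gram_B are used in nearly the same contexts and therefore end up close together in the two-dimensional layout, far from gram_C and gram_D. Grams with similar distributions cluster regardless of their historical source, which is the kind of pattern the diagram is meant to make visible.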

INFLECTIONALIZATION RATES IN EUROPE AND ELSEWHERE

There is one problem worth discussing which perhaps does not immediately come to mind in connection with the issue of characteristics of grammaticalization in European languages. It requires that we go beyond the borders of Europe to look at the languages of East and Mainland Southeast Asia (EMSEA), which stand out as being possibly the largest grouping of languages of the kind traditionally labelled ‘isolating’ in the world. They present two challenges to the study of language change: how did they acquire their isolating character, in particular their lack of inflectional morphology, and how do they manage to preserve it over time? Ansaldo, Bisang, and Szeto (Chapter  this volume) argue for the thesis that grammaticalization is a ‘type-specific’ or ‘areal’ phenomenon, and that grammaticalization processes in the languages of the EMSEA area differ from those of languages in other parts of the world. They say that EMSEA languages have ‘common typological properties in tone and syntactic formations which may lead to the polyfunctionality of markers and lack of obligatory grammatical marking’ and which may help account for their grammaticalization characteristics. A full discussion of their proposal would take me beyond the topic of this chapter, but I want to make a couple of remarks that have bearing on European languages. One remark concerns the claim of Ansaldo et al. about the ‘rampant polyfunctionality’ of items in EMSEA languages. This relates primarily to what the authors call ‘particulization’, a process whereby ‘particles are continually developing from fully lexical morphemes’, and which is characterized by the continued existence of the item’s lexical meaning and the absence of phonetic erosion, the combined effect of which is ‘rampant’ polyfunctionality. 
But it is clear that polyfunctionality is also found in European languages with items that are more or less grammaticalized but have not developed into afﬁxes. A case in point is a word like English have, the role of which as a perfect auxiliary was discussed above. The Oxford English Dictionary enumerates  different major senses of have in English, many of which have several separate sub-senses. This does not include the numerous set phrases that contain have. Such polyfunctionality seems characteristic of high-frequency verbs that have developed uses as auxiliaries or light verbs (another example is get, which has  senses in the OED), but extends also to other types of words. Near can be an adverb, an adjective, a preposition, or a verb, past an adjective, a noun, an adverb, or a preposition.

A more general remark is that the question why inflectionalization is not more common in EMSEA languages assumes that we should expect it to be. The truth is that we do not really know what the ‘normal’ rate of morphological growth is. What we can say with some confidence is that the proportion of grammaticalization processes that lead in the end to the creation of inflectional morphology is relatively small. In Dahl (), I discuss Bloomfield’s claim that ‘[m]erging of two words into one is excessively rare’ (: ), noting that even if the formulation ‘excessively rare’ is too strong, it is clear that the number of such events that can be observed within a given time-frame is usually not large and that they certainly constitute a relatively small part of what goes on in grammaticalization in general. The example mentioned by Bloomfield is the well-known development of the West Romance inflectional future from the combination of an infinitive with the verb ‘to have’, as in French je chanterai from Latin cantare habeo. Referring to the discussion of future tenses in European languages in Dahl (), I note that there have only been two clear cases of the creation of inflectional futures in European languages in the last two millennia (the Romance one and a very similar development in Ukrainian), while at least ten examples of periphrastic constructions in the future domain have arisen during the same period. Looking more broadly at grammaticalization processes in the core SAE area since the beginning of the Common Era, it turns out that the West Romance future is in fact the only clear case of such a process resulting in the creation of an inflection from free morphemes.² If we include the more peripheral parts of the SAE area, we find two clear cases in North Germanic: the development of morphological middles/passives out of reflexives and the rise of suffixal definite articles from postposed demonstratives. 

² The restriction ‘from free morphemes’ is important. The inflected infinitives of Portuguese and other Romance varieties are arguably new inflections, but the suffixes are clearly derived from other person-number inflections in the language, not from free forms.

The total number of morphologization events in the SAE area during the last two millennia, in particular in the core ‘Charlemagne’ area, is thus so low that it is questionable whether it is possible to find significantly lower rates in other areas, and indeed to put a number on it. It is striking that in spite of the wide spread of HABEO perfects, as discussed in section ., and their further grammaticalization as pasts or perfectives, there is no language involved in this spread where a possessive perfect has undergone univerbation and become an inflection. The number of languages in the EMSEA linguistic area is admittedly larger than that of the SAE area; but given that the number of univerbation events per millennium in a language grouping like the SAE one approaches zero, it is not obvious that the apparent absence of such events in South East Asia shows a significant deviation from what could be expected. This also illustrates the difficulty of speaking of the characteristics of one region without precise comparisons with what has happened elsewhere.

One observation concerning the cases of inflectionalization mentioned above is that they all give rise to suffixal marking. This is obviously nothing unexpected: the majority of all inflectional affixes in the world’s languages follow the stem. In the sample described in Dryer (a),  per cent of the languages with a more than rudimentary amount of affixation showed a suffixing preference. However, for a morpheme to develop into a suffix, it must normally be derived from a free morpheme that follows the word that the suffix attaches to. Thus, a demonstrative can only develop into a definite suffix if the word order is Noun Dem. Likewise, for a modal to become a suffix marking future tense, the word order needed is V Aux, and the development of a passive suffix from a reflexive presupposes VO word order. In a way, then, it was a happy coincidence that these word orders were present in the languages where the developments took place. We can also note that the overwhelming majority of inflections in the Indo-European languages of Europe arose at a time when the ancestors of those languages were still predominantly verb-final. An illustration of the possible influence of word order on grammaticalization is in fact discussed by Ansaldo et al. (Chapter  this volume). The suffixal preference is particularly strong for case affixes, where more than  per cent in the sample in Dryer (b) are suffixal. Correspondingly, there is a correlation between the presence of case affixes and adpositional word order: comparing Dryer (b) and (c), we can see that while roughly half of the languages with postpositions have morphological case, only about a quarter of the prepositional languages do. Ansaldo et al. discuss the development of case morphology in Sri Lanka Malay, the first step of which is said to be ‘Migration of adpositional material to postposition as part of VO > OV change’, claiming that this happens ‘under typological influence of Sinhala and Tamil’. While the fact that these languages themselves do have morphological case systems is bound to be important, it appears likely that the grammaticalization would not have happened without the change in adposition order. Maybe it is more fruitful to ask why inflections arise than why they do not arise. 
The somewhat impressionistic observations made here give some support to the idea that the likelihood of a certain grammaticalization process appearing is at least to some extent dependent on structural properties of the language. However, before any ﬁnal conclusions are drawn, we need more careful analyses of data, in particular concerning differences between areas.

CONCLUSION

In the original proposal for the Tokyo symposium, the idea that grammaticalization is a uniform phenomenon was said to be ‘a mistaken impression’, there being evidence for ‘considerable differences in different types of languages with respect to what typically constitutes grammaticalization’. It is possible to define typological profiles that are specific to the languages of Europe or parts of it, and it is also possible to specify to a lesser or greater degree the diachronic processes that have shaped these profiles. The central role of language contact in this has been argued for strongly by Heine and Kuteva (). But the formulation ‘what typically constitutes grammaticalization’ suggests that it may not be sufficient to say that languages change according to a principle ‘do what your neighbour does’. Rather, grammaticalization would be governed by more general properties of languages in an area, or of their speakers. Nobody would deny that internal factors play a role in language change, but here we would be dealing with internal factors that are not specific to one language but that

operate uniformly over a smaller or larger area. This is what Ansaldo et al. (Chapter ) suggest for the EMSEA languages, which has led me to some critical comments but also to make some observations pertaining to European languages, where I argued that some grammaticalization events were contingent upon speciﬁc initial conditions, such as a certain word order. It is obviously more difﬁcult to identify factors that would be operative over longer stretches of time and space, especially if they are also supposed to inﬂuence not only speciﬁc paths of development but also grammaticalization in general in the languages concerned. For reasons discussed in the Introduction, the languages of Europe are probably not the most rewarding place to start, so I don’t think I have had much to offer in this respect. One of the more prominent areal grammaticalization patterns in Europe is represented by the HABEO or ‘have’ perfects, which have been called ‘possessive perfects’ in view of their apparent links to predicative possession constructions. I chose this as a suitable topic for a case study in view of its connections with my interest in the typology of perfects. A closer look at recent work on the origins of HABEO perfects made me realize that these are less straightforward than has been thought. In-depth studies of related categories in other parts of the world will hopefully contribute to a clearer picture.

6 Revisiting the anasynthetic spiral

MARTIN HASPELMATH

OVERVIEW

Some of the best-known claimed macro-change patterns in grammatical studies, originating in typological considerations in the th century, are the sequences in (a,b) (cf. Horne ; Ramat ). ()

a. isolating → agglutinative → flective/fusional
b. synthetic or flective/fusional → analytic or isolating

Even though an important part of the ideology behind these developments (value judgements favouring ﬂective languages) was given up long ago, the awareness of these macro-change patterns is still very much with us (e.g. Hock and Joseph : ; Dixon : ; Croft : ; Igartua : ). They are no longer seen to necessarily apply to entire languages, but they are widely regarded as an important outcome of grammaticalization processes. In this chapter, I would like to revisit these patterns and ask to what extent they can be seen as supported by evidence and to what extent we have been able to explain them. Brieﬂy, while I do not think that we can distinguish between agglutinative and ﬂective types or stages (cf. Haspelmath ), or that there is enough evidence for saying that replacement of synthetic by analytic patterns tends to go via a stage of fusion, I do think that there is sufﬁcient evidence to say that language patterns tend to undergo changes that can in some sense be seen as cyclic alternations between synthetic and analytic patterns, as in (). ()

synthetic → analytic → synthetic (...)

However, since the term ‘synthetic’ generally implies the expression of multiple meanings within a single word, and there is no good way of defining ‘word’ (Haspelmath a), we cannot synchronically distinguish between synthetic and analytic patterns. But in section . I will argue that it is possible to maintain some of the original intuitions if one adopts a dynamic, diachronic perspective (with ANALYTICIZATION as a crucial concept).

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Martin Haspelmath . First published  by Oxford University Press



Moreover, the formulation in () might suggest a return to an earlier synthetic stage, but of course the relevant changes do not literally reverse earlier changes. For this reason, von der Gabelentz ([]) and Meillet () used the term ‘spiral’, which I adopt here, calling this kind of development ANASYNTHETIC SPIRAL. More generally, an anasynthetic development (or ANASYNTHESIS) is a change in which an earlier synthetic pattern (such as the Latin future tense, e.g. canta-bi-t ‘will sing’) is replaced by an analytic pattern (such as the Late Latin modal construction with habere ‘have’, e.g. cantare habet ‘has to sing’), which then undergoes various coalescence changes (Haspelmath b) and in this way becomes ‘synthetic again’.

ANALYTICIZATIONS

Since August Wilhelm von Schlegel’s (: ) discussion of grammatical changes from Latin to Romance, languages have commonly been classified as synthetic or analytic, where ‘synthetic’ means that words of the language consist of several (or even many) elements, while ‘analytic’ refers to languages where grammatical notions tend to be expressed by auxiliary words. Schlegel mentioned the following features of analytic languages (the French examples are added here for concreteness): ()

a. definite articles before nouns (French la table ‘the table’)
b. personal pronouns before verbs (French je vois ‘I see’)
c. auxiliary verbs (French j’ai vu ‘I have seen’)
d. prepositions instead of cases (French de la table ‘of the table’)
e. adverbs of comparative degree (French plus grand ‘bigger’, lit. ‘more big’)

Like the distinction between isolating, agglutinative, and ﬂective types, this distinction is still very much with us, even though nowadays it tends to be used more for constructions (e.g. ‘analytic tenses’, ‘analytic causatives’) than for entire languages. Greenberg () was the ﬁrst to attempt to measure the degree of analyticity of a language on the basis of a corpus, and such measures of analyticity are still being applied these days (e.g. Siegel, Szmrecsanyi, and Kortmann ).¹ While some of the literature of the th and early th century may have associated analytic language structure with ‘analytic thought’ (cf. Weinrich ), nowadays it is universally accepted that the only difference between a synthetic pattern and an analytic pattern is that the former is a word-internal combination of formatives, while the latter involves multiple words. The distinction is thus exactly as well-founded as the notion of ‘word’. But as Schwegler () and Haspelmath (a) have concluded, after surveying a substantial amount of earlier literature, there is no coherent cross-linguistically applicable concept of ‘word’ that would correspond to the intuitions that linguists have about words. It seems that these intuitions are to a large extent based on our spelling habits, and these do not

¹ A derivative of the term ‘synthetic’, the notion of ‘polysynthetic’ languages, has also enjoyed considerable popularity in more recent times (e.g. Evans and Sasse ).

correspond clearly to anything in the language structure. Grammatical elements are intuitively regarded as more or less tightly linked to the host root, but this ‘tightness’ of combination is due to a range of diverse properties that do not necessarily coincide with each other. For instance, the English Saxon Genitive marker ’s is tightly linked to its host noun in that it shows grammatically conditioned allomorphic variation (zero after nouns ending in plural -s, e.g. the boys’ room, not *boys’s, cf. children’s room), but is loosely linked (or more clitic-like) to it in that it can occur after a postmodifying phrase (the king of Scotland’s throne). Similarly, the Portuguese object person-form o ‘him’ is tightly linked to its host verb in that it changes its form to lo after an inﬁnitive (vejo-o ‘I see him’, vê-lo ‘to see him’), but it is loosely linked to it (or more clitic-like) in that it occurs in pre-verbal position under certain conditions (não o vejo ‘I do not see him’). Thus, clitics and afﬁxes cannot be generally distinguished from each other (see also Haspelmath a), and neither can phrases and compound words (is Italian macchina da scrivere ‘typewriter’ a compound or a phrase?). This means that analytic languages (or patterns) cannot be distinguished from synthetic languages (or patterns), at least not straightforwardly.² But Schlegel’s observations about the changes from Latin to Romance were not completely unfounded. While he was mistaken (along with a large number of later linguists) in thinking that the difference between Latin and Romance is a simple synchronic typological difference, it is clear that there were a range of parallel morphosyntactic changes from Latin to French: ()

a. Subject person-forms deriving from independent personal pronouns (je, tu, il, etc.) have become the main expressions of subject person (je vois ‘I see’, tu vois ‘you see’, il voit ‘he sees’),³ sometimes replacing subject suffixes.
b. Auxiliary verbs (‘have’, ‘be’) are used for passive voice (il est vu ‘he is seen’) and for Perfect (j’ai vu ‘I have seen’) and other tenses, some of them replacing the older forms.
c. The Latin Dative case and Genitive case were replaced by prepositions (ad and de, which became French à and de).
d. The Latin comparative in -ior (e.g. fort-ior ‘stronger’) was replaced by the adverb plus ‘more’ (plus fort ‘stronger’).

In each of these, an earlier pattern with tightly linked grammatical markers was replaced by a new pattern based on an earlier content word (or more concrete word, or less grammatical word). Synchronically, French il-voit ‘he sees’ may be just as ‘synthetic’ as Latin vide-t ‘he sees’ (cf. Weinrich ; Miller ), and French plus-fort may be no more ‘analytic’ than Latin fort-ior; but diachronically there is no

² A reviewer asks why borderline cases, which are always there, invalidate the general distinction. The answer is that there is no good reason to assume a priori that there should be a general distinction between word-internal grammatical structure and word-external structure. If no coherent characterization of the two putative domains is possible, then one must conclude that there are not two domains to begin with, and that our spelling-based intuitions have no counterpart in spoken language.

³ Note that the final consonant letter in vois/voit is no longer pronounced.



Martin Haspelmath

doubt that the new function items derive from earlier content items (or more concrete items). Thus, while there is no clear synchronic typology based on the synthetic/analytic distinction, there is a clear diachronic trend for older (‘synthetic’) patterns with strongly grammaticalized function items to be replaced sooner or later by newer (‘analytic’) patterns based on content (or more concrete) items.⁴ Thus, we can distinguish locally between analytic and synthetic patterns: when one pattern occupies the same functional slot as another pattern but is clearly younger and based on a content item, it can be regarded as ‘analytic relative to the earlier synthetic pattern’, and we can speak about ANALYTICIZATIONS (cf. Haspelmath and Michaelis ). A crucial ingredient of this ‘dynamicized’ view of the ‘synthetic/analytic’ terminology is the notion of REPLACEMENT. For this reason, there is no counterpart to (a) (articles in Romance languages) in (). The articles are a new grammatical pattern in Romance, deriving from the less grammaticalized Latin demonstratives, but they do not replace any grammatical pattern in Latin. Thus, their rise is not a kind of analyticization, and the same is true for the English will future (which does not replace an earlier, more grammaticalized future tense in Old English) or the Mandarin Chinese object marker bă (which does not replace an earlier object marker in Classical Chinese). Now one might object that in those cases where an earlier form seems to have been ‘replaced’, a closer look will show that in fact the new construction is used in somewhat different ways. Thus, the Slavic l-Perfect (e.g. Russian ja pisa-l ‘I wrote’), which replaced the earlier Aorist (Old Church Slavonic pisaxŭ ‘I wrote’), has a somewhat different range of uses, and in some Slavic languages (especially Bulgarian), both the Perfect and the Aorist coexist. So can we speak of analyticization here?
This depends on the extent to which we would be willing to say that the new form replaces the old one. In some cases, nobody would deny this (e.g. the French comparative plus fort ‘more strong’), but in others one might have doubts, e.g. whether the French ‘have’ Perfect (j’ai vu ‘I have seen, I saw’) replaces the old Simple Past (je vis ‘I saw’), because their range of uses is not identical. I would say that to the extent that we have doubts about the replacement relation between the Perfect and the Simple Past, we do not regard the change as an analyticization. A reviewer asks why I keep the traditional term for the ‘revised concept’, because he feels that ‘the most basic intuition’ is that the contrast between analytic and synthetic has to do with ‘degree of morphological independence or autonomy’. The answer is that I am interested in research continuity. The terms ‘analytic’ and ‘synthetic’ are present in the earlier literature and they will not go away, so I ask how they can be defined in such a way that the earlier insights can be preserved to a maximal extent. I have not ‘revised’ their definition, because there was no coherent earlier definition, as far as I am aware (the intuition of ‘morphological independence’ seems to be based entirely on our spelling habits). One may of course choose to describe the relevant developments with entirely different terms, but an important impetus of the current chapter is to ask to what extent the earlier ideas are still relevant.

⁴ Occasionally it is claimed that all grammatical markers arise in this way (‘all morphemes begin their life as lexical words or stems’, Givón : ). Even if this is too strong, there is little doubt that the bulk of grammatical markers in the world’s languages have an origin of this type.

THE ANASYNTHETIC SPIRAL

On the basis of the definition of the dynamic concepts of ‘analyticization’, as well as the diachronically relativized concepts ‘synthetic’ (= to be replaced by a new analytic form) and ‘analytic’ (replacing an old synthetic form), we can create the new concept ‘anasynthetic’: ()

Anasynthetic change = a change whereby a new analytic construction arises that competes with an earlier synthetic pattern and grammaticalizes, eventually becoming the primary expression of its meaning, and thus ‘synthetic again’

The term ‘anasynthetic’ can be seen as formed with the Greek element ana- ‘again, back’ (cf. ana-baptist, ana-phora), or it can be seen as a fusion of ‘analytic’ and ‘synthetic’. It is a new term, but the concept is of course very old. The reason for coining a new term is that I feel that the concept is not sufﬁciently widely known and has not been sufﬁciently widely investigated, and that the older discussions of the developments have often been somewhat confused in that they did not distinguish properly between language-wide developments and constructional developments, or in that they saw a crucial role for a fusional/ﬂective stage intermediate between the synthetic/agglutinative stage and the analytic/isolating stage (see section .). Changes of this kind have often been regarded as CYCLIC (cf. Heine and Kuteva’s (: ) and Igartua’s (: ) term ‘morphological cycle’, as well as van Gelderen ), but arguably, the term ‘spiral’ is more appropriate, because a cyclic development implies that the change pattern leads back to exactly the same point, whereas in language, every new round of replacement brings with it substantial changes. Thus, following Gabelentz ([]) and Meillet (), the term ‘spiral’ is used here. Two concrete examples of changes exemplifying the anasynthetic spiral are given in () and (). In each case, four idealized stages (I–IV) can be distinguished.⁵ ()

Anasynthetic spiral of the Latin–French future tense: four stages (I–IV)

Stage  Old construction          New construction               Schema
I      canta-bi-t ‘will sing’    –                              H-m₁ / –
II     canta-bi-t                cantare habet ‘has to sing’    H-m₁ / H + extra-form
III    (canta-bi-t)              cantar ha                      (H-m₁) / H + marker
IV     –                         chant-er-a ‘will sing’         – / H-m₂

⁵ It is not an accident that these four stages are very similar to the four stages distinguished by von Humboldt (); cf. Lehmann (: ).




At the ﬁrst stage, only the old synthetic construction exists (the Latin future tense cantabit), which is schematized as a host (H) with a marker (-m₁). At the second stage, a new competing construction is introduced, based on a content item (called ‘extra-form’ in the schema). At the third stage, the old construction is on its way out and the new construction is undergoing some formal reduction of the extra-form, which becomes a marker. At the fourth stage, only the new construction exists, and it has become ‘synthetic again’, with a completely grammaticalized new marker (-m₂). Another example comes from Classical Arabic and Maltese, where the earlier genitive sufﬁx -i was replaced by the genitive preﬁx ta-: ()

Anasynthetic spiral of the Arabic-Maltese genitive marker: four stages (I–IV)

Stage  Old construction            New construction                               Schema
I      al-kitaab-i ‘of the book’   –                                              H-m₁ / –
II     al-kitaab-i                 mataaʕu l-kitaab-i ‘possession of the book’    H-m₁ / extra-form + H
III    (al-kiteeb)                 mtaaʕ al-kiteeb                                (H-m₁) / marker + H
IV     (–)                         ta-l-ktieb ‘of the book’                       – / m₂-H

In Classical Arabic, the genitive sufﬁx -i was the only way of signaling adnominal possession, but later, a new construction making use of the content word mataaʕu ‘possession’ as an extra-form came into use (see Eskell Harning ; Koptjevskaja-Tamm ). This was then reduced (the alternative forms bitaaʕ and mtaaʕ are still found in some contemporary Arabic varieties), and now it is written as a preﬁx in Maltese (when a deﬁnite article follows), while the old genitive has largely disappeared (though it can still be used with a few inalienable nouns). The anasynthetic spiral can be schematized as in Fig. .. At the ﬁrst schematic stage (I), there is a (‘synthetic’) marker m₁; at the second stage (II), there is an additional, periphrastic (‘extra-form’) way of expressing the same notion; at the third stage (III), this has turned into an ‘analytic’ marker, and at the fourth stage (IV), this marker has become fully grammaticalized (anasynthetic, m₂).

[Figure: spiral diagram linking four stages. I: m₁ (synthetic); II: m₁ or extra-form; III: extra-marker (analytic); IV: m₂ (anasynthetic)]

Fig. .. The anasynthetic spiral




A BIT OF HISTORY

Before relating the anasynthetic spiral to the more famous ‘isolating → agglutinative → flective → isolating’ cycle, let us briefly look at the development of ideas in ‘evolutive typology’ (as Lehmann (: ) calls this approach). The three most influential works in the early th century were von Schlegel (), Bopp (), and Humboldt (), whose speculations about the origins of morphological complexity began a long tradition (see e.g. Horne ). While the idea that grammatical markers derive from earlier content items had been around even in the th century, the continuous tradition of typological speculation and investigation began only with Friedrich von Schlegel’s discussion of the relationship between ‘organic’ languages of the Indo-European flective type contrasted with the ‘mechanic’ languages of the Turkish type. For the latter, it seemed clear to everyone that their morphological patterns arose from ‘glueing’ (agglutinating) earlier words onto host roots, so the term ‘agglutination’ (coined by Wilhelm von Humboldt in ) came to refer both to the process of creation of new function items and their coalescence, and to a morphological type of languages. But for the Indo-European languages with their stem changes, it seemed far less clear to Schlegel and Humboldt that their inflections were due to coalescence of earlier full forms. As discussed by Stolz (: §), both Schlegel and Humboldt envisaged the possibility of creating affixes ‘from within the root’.⁶ Bopp (), by contrast, advocated a coalescence origin for the Indo-European person suffixes -mi, -si, -ti, as well as other affixes (see Lehmann  for more on Bopp’s agglutination theory). So for a while, the ideas of agglutinative and ‘de-radical’ origins of inflections were in competition.
However, the idea of grammatical markers arising from full forms prevailed, not only because of the prestige of Bopp’s work on Indo-European, which was evidently successful in many ways, but probably also because Romance linguists were able to show conclusively that some of the Romance suffixes (especially the future tense suffixes and the adverbial suffix -ment(e)) had their origins in Latin words. Thus, von der Gabelentz ([]: ) regarded his views on the spiral-like developments of grammatical forms as generally accepted, and Meillet’s () famous article that first introduced the term ‘grammaticalization’ was intended as a popular account for a general audience.⁷ There was no similar evidence for the older idea that inflections arise from within the root, but the notion that agglutinative affixes were somehow essentially different from ‘true’ Indo-European-style inflection lingered on.

⁶ Humboldt puts it as follows: ‘Durch die unerforschliche Selbstthätigkeit der Sprache brechen die Suffixa aus der Wurzel hervor und dies geschieht so lange und so weit, als das schöpferische Vermögen der Sprache ausreicht. Erst wenn dies nicht mehr thätig ist, kann mechanische Anfügung antreten’ (Humboldt ; : ).

⁷ Nevertheless, the neogrammarians did not focus on grammaticalization, and they tended to prefer to look for analogy-based origins of morphological elements. Jespersen (: ch. ) even attacked the agglutination theory, and for several decades it was not widely pursued. (Tauli  and Hodge  were non-mainstream and not influential during their times; interest in grammaticalization became widespread again only with Givón .)




One of the reasons why morphological typology has not generally had a good reputation since the s is that it was often associated with value judgements: the Schlegels and Humboldt were clear that they regarded the ‘organic’ patterns of the Indo-European languages as superior to the ‘mechanic’ patterns of the agglutinating languages, and isolating and analytic languages were even less appreciated. At the same time, linguists were trying to arrange languages in temporal order, but this was difficult, because it seemed that morphological structure could be built up (as in pre-Indo-European) and disintegrate (as in Romance languages). One famous proposal for understanding the seemingly contradictory patterns was Schleicher’s () idea that languages built up complexity prehistorically, but are losing complexity in historical times. Another famous proposal was the opposite idea (Jespersen []) that while synthetic complexity was old and poorly designed, analytic simplicity constituted progress (cf. McMahon (: §.) for an accessible account of these discussions). These views of linear developments were then superseded by the modern view that developments are basically cyclic, and that morphological patterns do not reflect cultural progress or decay. The reason why von der Gabelentz () and Meillet () are still widely cited is that their views hardly differ from contemporary views. Nevertheless, there is one aspect of the earlier stage of morphological typology that is still widely assumed to be correct: the idea that the development from agglutinative to analytic/isolating (patterns or languages) goes via a stage of ‘flection’ or ‘fusion’:⁸ ()

isolating → agglutinative/synthetic → flective/fusional → analytic/isolating

This will be discussed in the next section.

FROM AGGLUTINATION TO ISOLATION VIA FUSION?

One widespread assumption, seemingly confirmed by the history of Romance and Germanic languages, is that the change from earlier synthetic patterns to new analytic patterns was primarily due to phonetic erosion. Variants of this view are still widely held, and the difference between agglutinative and fusional patterns might plausibly be related to sound change, so it is not so surprising that we still encounter the old idea that the change from the agglutinative stage to the isolating/analytic stage generally passes through a fusional/flective stage, as in (). This development, schematized in Fig. ., is called the ‘agglutination–fusion–isolation cycle’ here. The cycle

⁸ While the term ‘agglutinative’ (German agglutinierend) has no competitors, the literature contains both the terms ‘ﬂective’ and ‘fusional’ (the latter apparently ﬁrst used by Sapir ). No clear distinction between them seems to have been drawn by anyone, and perhaps the main reason for introducing ‘fusional’ was that ‘inﬂection’ had come to acquire a more general sense by the th century, referring not only to Indo-European-style fusional inﬂection. The term ‘ﬂective’ (Plank , for German ﬂektierend) has the advantage of being unique (in contrast to ‘inﬂectional’, ‘inﬂecting’, etc.) and of preserving the continuity with the long Humboldtian tradition. I use both terms interchangeably.




[Figure: cycle diagram linking three stages: isolating, agglutinative, fusional]

Fig. .. The agglutination–fusion–isolation cycle

is presented and discussed at some length in Dixon (: –), Crowley and Bowern (: –), and Igartua (: ), and it is also mentioned without criticism by Hock and Joseph (: ), Dixon (: ), and Croft (: ). But what is the evidence for the intermediate position of fusion between agglutination and isolation? I have not been able to find much discussion of this. In their influential textbook, Hock and Joseph (: ) simply say:

In agglutinating languages, the affixes retain their phonetic identity to such an extent that it is easy to tell where one affix begins and the next one ends. If sound change obscures the boundaries between affixes and brings about their amalgamation, the result is an inflectional language.

Likewise, Dixon (: ) states that ‘from an agglutinative profile, the operation of . . . phonological change will effectively preserve the same morphological elements but fuse their realisations’. But do the properties of flective languages really result from sound change? As discussed in Haspelmath (), the three main distinctive features of flective patterns are generally thought to be (i) cumulative exponence (e.g. suffixes like Russian -ov for genitive + plural), (ii) the existence of stem alternations (e.g. English sing/sang), and (iii) the existence of affix alternations (e.g. Russian dative forms in -u/-e/-i depending on the inflection class) (see also Igartua : §. for a similar account). It seems that nobody has made a strong case that flective patterns result from phonological reductions, but many people have made this assumption. I cannot examine the question in detail here, but I will now give some reasons why I have very little confidence in the truth of the claim. Cumulative exponence as a feature of flective languages is easy to illustrate from older Indo-European languages, but it actually seems to be quite rare, apart from person–number cumulation (which is frequent not only in bound person markers, but also in independent personal pronouns, cf. Daniel ; since it is extremely frequent everywhere, it is not discussed further here). In particular, the kind of number–case cumulation that is found widely in the older Indo-European languages and that contrasts so strikingly with the separative exponence found in non-Indo-European languages (e.g. Russian dom-óv ‘of houses’, contrasting with Turkish ev-ler-in [house-PL-GEN], Igartua : ) seems to be very rare in the world’s languages. Be that as it may, what is the evidence that its origins may have to do with sound changes? There are many speculative ideas about the origin of the older Indo-European plural endings (*-es, *-ns, *-om, *-su, *-bʰi; cf. Clackson : ) but as far




as I know, only one of the endings has a possible origin in an earlier separative (i.e. agglutinative) combination, namely the accusative plural suffix -ns, which has been claimed to go back to -m-s (accusative -m plus plural -s). But this hypothesis does not have much plausibility, because it would show a plural suffix outside a case suffix, a pattern that is virtually unattested in other languages (cf. Greenberg’s  Universal ). Thus, almost all instances of case–number cumulation in Indo-European go back to the protolanguage, and their origin is obscure. Igartua (: §) claims that Estonian and Basque show the incipient development of fusion (flective patterns) due to sound change, but the evidence for this is actually very slim. He contrasts Estonian and Finnish number–case paradigms and shows that Estonian has somewhat more cumulation, as seen in the partial paradigms in () (Estonian lipp ‘flag’, Finnish lippu ‘flag’). ()

              Finnish                     Estonian
              SG          PL              SG          PL
Nominative    lippu       lipu-t          lipp        lipu-d
Genitive      lipu-n      lipu-j-en       lip-u       lippu-de
Partitive     lippu-a     lippu-j-a       lipp-u      lippu-sid
Partitive     –           –               –           lipp-e
Illative      lippu-un    lippu-i-hin     lipu-sse    lipu-de-sse
Illative      –           –               lipp-u      –
Inessive      lipu-ssa    lipu-i-ssa      lipu-s      lipu-de-s
Adessive      lipu-lla    lipu-i-lla      lipu-l      lippu-de-l

It is true that Estonian has two forms that are clearly cumulative and which do not have cumulative counterparts in Finnish (partitive  plural lippu-sid, partitive  plural lipp-e), but the ﬁrst of these does not seem to result from sound change, and the Finnish paradigm, too, shows a striking instance of cumulation, namely the nominative plural sufﬁx -t. In fact, the biggest difference between the two languages is that the dental stop has been extended from the nominative to most of the other cases (lipudesse, lipudes, lippudel), thus actually eliminating some cumulation. Thus, this is not a good example of phonetically induced cumulation, and neither is the Basque paradigm which Igartua also discusses (this shows quite a bit of allomorphy, whose origins seem obscure, and only one case of cumulation, the absolutive plural sufﬁx -ak, which is not due to phonetic changes either). Indo-European languages also sometimes show person–number–tense cumulation, as in the French Passé Simple illustrated in (), and compared with the Latin Perfect, from which it derives. ()

SG SG SG PL PL PL

Latin cantavi cantavisti cantavit cantavimus cantavistis cantaverunt

French chantai chantas chanta chantâmes chantâtes chantèrent




It is true that the Latin tense suffix -v disappeared in French, and in this sense the SG form chantai (< cantavi) and the PL form chantèrent (< cantaverunt) are now more cumulative. But overall it is hard to say that the new paradigm is more cumulative, because the French paradigm has a new tense marker -a/-è, and clearly segmentable person markers at least in the plural.⁹ In Latin, by contrast, four of the six paradigm forms have multiple exponence, with cumulative person forms -i, -isti, -istis, and -erunt. Thus, the French paradigm can even be said to be somewhat more separative than the Latin paradigm. And the one paradigm in French verb inflection which clearly shows substantial cumulation, the future tense (with future suffix -r plus future-specific person forms -ai/-as/-a/-ons/-ez/-ont) did not arise via sound change, but via coalescence (as seen earlier in ()). While cumulation does not seem to arise commonly via sound change, it is easier to provide examples of stem alternations and affix alternations (features (ii) and (iii) above) that result from phonological developments. In fact, phonologically conditioned alternations are by no means restricted to or even characteristic of ‘flective’ languages and are very common in all types of languages, including those traditionally called agglutinative. For example, Hagège (: –) notes that Turkish has stem consonant alternations in nominative/accusative forms such as sebep/sebeb-i ‘cause’, kelebek/kelebeğ-i ‘butterfly’, and Kannada has affix alternations as in katte/katte-ge (nominative/dative singular of ‘donkey’) versus katte-gaḷu/katte-gaḷ-ige (nominative/dative plural). These are clearly phonologically conditioned and plausibly due to an earlier sound change. But for the characteristic inflectional classes of Indo-European languages (which involve affix suppletion, not just alternation), e.g. the Latin o-, a-, and i-declension, it is much less clear that the different affixes have anything to do with sound changes. Why does the genitive of populus end in -i (popul-i ‘of the people’), while the genitive of rex ‘king’ ends in -is (reg-is)? Why does the dative plural of populus end in -is (popul-is) and the dative plural of rex in -ibus (reg-ibus)? Nobody seems to know, and phonological change does not seem to be the main reason. Perhaps the most striking phenomena of Sanskrit and the Germanic languages, which deeply impressed Friedrich von Schlegel and Jakob Grimm two centuries ago, are the vowel changes (called Ablaut by Grimm), especially in the verbal system, which seemed to go back to a vowel-change system in the protolanguage (apparently corresponding to more residual vowel changes in Greek and Latin, cf. Greek légō ‘say’ and lógos ‘word’, Latin tego ‘cover’ and toga ‘covering piece of cloth’). In the meantime, there have been many attempts to reduce these vowel alternations to earlier phonological changes, but especially the e/o alternation has resisted attempts at explanation (cf. Clackson : –). The origins of the even more striking vowel alternations in the Semitic languages are equally obscure. Thus, contrary to a widespread presumption, the most salient aspects of flective languages do not seem to go back to sound changes, and their origins are typically unknown. More generally, we do not know how it is that robust inflectional patterns

⁹ The development of the second person singular form chantas can only be explained by analogical levelling, not by sound change.




with cumulative and suppletive afﬁxes arise. I have not seen good evidence that ﬂective patterns tend to be intermediate between agglutinative and isolating patterns. It thus appears that the idea of an agglutination–fusion–isolation cycle is a remnant of the th century, when it was widely assumed that ﬂective languages were a higher, more advanced development from the more primitive, less perfect agglutinative languages. It is time to abandon that view (or to make a serious effort to come up with actual evidence that supports it). By contrast, the anasynthetic spiral—and the original idea of bound forms arising from earlier free forms, of function items going back to content items—has stood the test of time and has been conﬁrmed by many different examples showing basically the same pattern as the examples in () and ().

REMARKS ON HOLISTIC ANASYNTHESIS

The recognition that it may not be entire language systems, but particular constructions, that develop in systematic ways has been around for a long time (e.g. Sapir : ). This is a retreat from the much stronger earlier hypothesis that it is entire language systems that develop in coherent ways—apparently a necessary retreat. By and large, languages do not seem to be governed by large-scale regularities, and the search for ‘macroparameters’ or ‘great underlying ground-plans’ has proven largely futile so far (Haspelmath ). As van Gelderen (: ) puts it: ‘Macrocycles’ have remained controversial. Nevertheless, it has sometimes been suggested that entire languages can shift from being largely synthetic to largely analytic or vice versa. The most striking development is that of Egyptian-Coptic, as described by Hintze () and famously by Hodge () (cf. also Reintges ; Haspelmath b: §.). Egyptian-Coptic is attested over more than three millennia, and even though the hieroglyphic script does not represent the vowels, it is quite clear that a fair number of constructions with postposed function items were replaced by new constructions with preposed function items. In the examples in (), the symbol ≫ means ‘is replaced by’, and the symbol > means ‘turns into’. The left-hand example represents earlier Egyptian, and the right-hand example represents Coptic (which was written in Greek script including vowel letters). Note that the new preposed markers are generally quite different from the earlier postposed markers (except for (c)). ()

a. Postposed demonstrative ≫ preposed demonstrative pei-/tei-
   rmṯ pn        ≫  pɜj rmt       >  pei-rôme
   man this         this man         this-man
   ‘this man’

b. Preposed demonstrative > prefixed definite article p-/t-
   pɜ rmṯ        >  pɜ rmt        >  p-rôme
   ‘this man’       ‘the man’        ‘the man’

c. Numeral ‘one’ > prefixed indefinite article ou-
   ḥfɜw wʕ       ≫  wʕ (n) ḥfɜw      >  ou-hof
   snake one        one (of) snake      INDF-snake

d. Ordinal numeral suffix -nw ≫ prefix meh-
   ḫmt-nw        ≫  mḥ-ḫmt        >  meh-šomnt
   three-ORD        fill-three       ORD-three
   ‘third’

e. Suffixed possessive pronoun ≫ prefixed possessive pronoun (following the article)
   rn-k          ≫  pɜj-k rn         >  p-ek-ran
   name-SGM         DEF-SGM name        DEF-SGM-name
   ‘your name’

f. Postverbal-subject construction ≫ pre-subject-TAM construction
   sḏm-n-f       ≫  jr-f sdm         >  a-f-sôtm
   hear-PRF-SGM     do-SGM hear         PRET-SGM-hear
   ‘he heard’

g. Stative construction with agreement > Stative without agreement
   X st wḏɜ-tj            >  st wḏɜ       >  s-ouoj
   X she whole.STAT-SGF      she whole       SGF-whole.STAT
   ‘she is whole’ (X = some particle)

h. Synthetic suffixed passive ≫ passive-like construction with PL person form
   sḏm-w-f          ≫  a-u-sotm-f
   hear-PASS-SGM       PRET-PL-hear-SGM
   ‘he was heard’      ‘he was heard’ (‘they heard him’)

i. Periphrastic construction > subject-verb construction
   X sw ḥr sḏm      >  f-sôtm
   he on hear          SGM-hear
   ‘he is hearing’     ‘he is hearing, he hears’
   (X = some particle)

j. Suffix object pronouns (on infinitives) ≫ prepositional accusative
   sḏm-n         ≫  sdm jm-n          >  sôtm mmo-n
   hear.INF-PL      hear.INF in-PL       hear ACC-PL
   ‘to hear us’

It thus appears that the Egyptian-Coptic language underwent a wholesale change from a suffixing or function-item-postposing macro-pattern to a prefixing or prepositing macro-pattern. This development is fascinating, as there is no strong reason why the changes should be connected in this way. Languages clearly do not have to change their patterns in such a concerted way, but it is difficult to believe that these changes should be entirely accidental.¹⁰

¹⁰ Note that there is no claimed connection between the analyticization and the change in the position of the forms. The latter is puzzling, though a parallel has been found in Romance languages, as discussed immediately below.




A similar macro-pattern has been described for the development of Romance languages, and more speciﬁcally French, by Baldinger () (see also the discussion in Jacob ). Baldinger notes that quite a few function items in French are preposed to their hosts, whereas the corresponding Latin items occur after their hosts. This goes beyond the old observations by August Wilhelm von Schlegel in that Baldinger highlights the change in the ordering of the elements (in the spirit of Greenberg ). ()

a. definiteness (le, la)
b. case (à, de, par)
c. number (definite articles: le/les, la/les; possessive determiners: mon/mes)
d. gender (le/la, un/une)
e. comparison (plus grand)
f. compound tenses (j’ai chanté etc.)
g. relative pronoun (qui chante) (replacing the Latin participle)
h. subject person forms (je, tu, il, . . . )
i. question particle (est-ce que)

Again it seems difficult to believe that these changes should be unconnected, but how exactly they might be connected is not clear. If there were a tendency for entire languages to lose their old synthetic forms and acquire new ones, one might expect to find larger language families where different branches differ in that some preserve the old synthetic forms, while others have lost them and replaced them by entirely new forms (whether with a consistently different order, as in Coptic and French, or with no ordering regularities). However, there are not many candidates for such changes, and they all appear to be controversial. Nichols (: ) mentions Austroasiatic, Niger-Congo, and Trans-Himalayan as possible cases. In the following paragraphs, I make a few more comments on these three families, without offering a clear conclusion. For the Austroasiatic family, Donegan and Stampe () claim that the languages were originally analytic and head-initial, like the Mon-Khmer languages in the east, whereas the synthetic and head-final patterns of Munda languages in the west represent an innovation. However, according to Zide and Anderson (), Proto-Austroasiatic morphology was more like Munda morphology, and the Mon-Khmer languages adopted the areal characteristics of the other Mainland Southeast Asian languages (Tai-Kadai, Hmong-Mien, Trans-Himalayan). In the Niger-Congo (or Atlantic-Congo) language family, one sees a striking contrast between more synthetic languages like the well-known Bantu languages and more analytic languages (often without gender categories) like the Kwa, Defoid, and Igboid languages. There are also more analytic Bantu languages, especially those of the Bantu A subgroup in the northwest, and there has been an interesting recent debate between Güldemann () and Hyman ().
While Güldemann claims that Proto-Bantu was more like the (analytic) Kwa languages, in line with the general features of the Macro-Sudan belt, Hyman thinks that the analytic northwestern Bantu languages are innovative and that Proto-Bantu was like the better-known

Revisiting the anasynthetic spiral



Zulu or Swahili type.¹¹ Similarly, Good () considers the Kwa-type noun patterns as secondary compared to the more elaborate Bantu type. Finally, for the Trans-Himalayan family (also called Sino-Tibetan), Scott DeLancey has recently published a series of papers in which he claims that some of the languages retain old synthetic patterns, while others have innovative synthetic patterns (e.g. DeLancey ). Thus, Kiranti and Gyalrongic are two subfamilies spoken in different regions which have very similar and rather idiosyncratic person index paradigms, which are therefore reconstructed for the protolanguage. Sinitic and Tibetic are two families that do not show person indexing at all, while Kuki-Chin languages have clearly innovative, anasynthetic person indexing paradigms. Thus, even though not only the changes we see in Romance (and Germanic) languages but also the really striking changes observed in Egyptian-Coptic may seem to support the idea of holistic anasynthesis, there are not many other clear cases of macro-anasynthesis. This may be because we do not have the kind of good diachronic data that is available for Egyptian-Coptic and Latin-Romance, or it may be due to the fact that such changes are genuinely uncommon. One suggestion that points in this direction is the proposal that large-scale changes of grammatical patterns occur only when massive bilingualism disrupts the development of a language, as happened in the Western Roman Empire, where a large number of people learned Latin as a foreign language (cf. McWhorter ). The same must have happened in Egyptian-Coptic (where the large number of foreign labourers in Egypt may have been a factor in unusual language change), and quite possibly elsewhere (see also DeLancey ).

WHAT DRIVES THE ANASYNTHETIC SPIRAL?

Before concluding this chapter, I would also like to revisit the explanation for the driving force behind the anasynthetic spiral. I will contrast three explanations, which I call (i) therapeutic periphrasis (‘periphrasis saves’), (ii) extravagance and inflation, and (iii) redundancy regulation. I will argue that the second explanation must be the correct one.

The best-known explanation is the therapeutic explanation, which assumes that older grammatical markers were weakened by phonological reduction and then had to be replaced by new periphrastic forms in order to preserve the functionality of the language. This explanation was commonly assumed throughout the 19th century, and also widely in the 20th century. In Georg von der Gabelentz’s ([: ])

¹¹ ‘There has been plenty of time for Proto-Bantu (and even more time for Proto-Niger-Congo) to cycle back and forth, grammaticalizing full words as inﬂectional proclitics and preﬁxes, losing them, and creating them once more. . . . [Dating] may not be easy to do, given the cyclicity. We all seem to agree that Proto-Bantu came from an earlier analytic stage—the question, however, is whether Basaá, Tunen etc. represent that unchanged stage, or whether they are completing the cycle: analytic > agglutinative > analytic. I maintain that the latter is the case’ (Hyman : ).



Martin Haspelmath

famous characterization of the anasynthetic spiral, the ideas of ‘wearing off’ of older forms and compensatory periphrasis are very clear:

Die Affixe verschleifen sich, verschwinden am Ende spurlos; ihre Funktionen aber oder ähnliche drängen wieder nach Ausdruck. Diesen Ausdruck erhalten sie, nach der Methode der isolierenden Sprachen, durch Wortstellung oder verdeutlichende Wörter. Letztere unterliegen wiederum mit der Zeit dem Agglutinationsprozesse, dem Verschliffe und Schwunde, und derweile bereitet sich für das Verderbende neuer Ersatz vor: periphrastische Ausdrücke werden bevorzugt.¹²

This view was also at the basis of Jespersen’s (: ) discussion of the cyclic developments that have later become known as ‘Jespersen’s Cycle’:

The history of negative expressions in various languages makes us witness the following curious fluctuation: the original negative adverb is first weakened, then found insufficient and therefore strengthened, generally through some additional word, and this in turn may be felt as the negative proper and may then in the course of time be subject to the same development as the original word.

More recently, the therapeutic view was explicitly defended by Geurts () (a response to Haspelmath ; see my reply in Haspelmath ).¹³ While the idea of therapeutic periphrasis hypothesizes that phonological reduction is the driving force, the ‘extravagance and inﬂation’ view sees reduction as the consequence of semantic change from content meaning to grammatical meaning, which leads to frequent use, in a pragmatically governed inﬂationary process. Novel forms are introduced for their special extravagant effect, but when they are copied and become more frequent, this effect weakens, just as the value of a currency goes down when too many bank notes are in circulation (Dahl ; : – calls this ‘rhetorical devaluation’). Thus, the two ﬁrst accounts can be seen as making opposite claims: ()

a. reduction first → periphrasis saves or repairs
b. extravagance/periphrasis → inflation and reduction

There are five reasons why the first explanation does not work and the second explanation must be correct. First, it is implausible that phonological reduction would lead to dysfunctional patterns. Even though the metaphor of ‘wearing off’ is often used for phonological change, sounds, unlike material objects, do not lose their substance through frequent use. Second, the loss of older categories and their replacement by new forms also happens when there is little or no phonological reduction. Thus, in the Balkan Slavic

¹² ‘The affixes are worn down, disappear without a trace at the end; their functions or similar ones demand expression again. They receive this expression, after the manner of the isolating languages, through word order or clarifying words. These are again gradually subject to the agglutination process, to wearing down and to loss, and in the meantime a replacement is being prepared for what perished: periphrastic expressions are preferred.’
¹³ ‘Then β gets the upper hand, wears down due to the general drive towards efficiency of expression, until it is weakened to the point where it has to be replaced by some γ’ (Geurts : ).

languages (Bulgarian and Macedonian), the older Slavic case system has been drastically reduced and replaced by prepositions, even though the phonological development did not differ noticeably from that of other Slavic languages. For Jespersen’s Cycle, Kiparsky and Condoravdi () ﬁnd that in their data, phonetic reduction played no role. And for eastern Asian languages, it has been claimed explicitly that phonological reduction is not part of grammaticalization processes (Bisang ). A reviewer also points out that polysynthetic languages, which express many categories in the verb, may still show rich periphrastic patterns. Third, grammaticalization not only ‘restores’ grammatical categories that were lost but often creates completely novel categories by the same mechanisms, such as the deﬁnite article in Romance languages. Such developments cannot be explained by reduction. (However, strictly speaking these cases do not fall under anasynthesis, as deﬁned in section ..) Fourth, new grammatical categories may arise even when the old categories do not disappear (right away). For example, both English and French have a traditional future (I will write/j’écrirai), but this has not prevented the grammaticalization of another future, based on ‘go’, that is subtly different in meaning (I’m gonna write/je vais écrire). In many northern Italian varieties, the subject clitics are grammaticalized as agreement markers, although the agreement sufﬁxes inherited from Latin are still largely intact. Bulgarian has preserved the old imperfect/aorist (=imperfective past/perfective past) distinction of early Indo-European, but this has not stopped it from grammaticalizing the new perfective/imperfective opposition as found in other Slavic languages. Again, these developments do not fall under anasynthesis as deﬁned earlier, but the changes are in no way different from the changes that replace earlier categories. Fifth, we ﬁnd quite similar developments in lexical change. 
Speakers occasionally introduce elaborate, vivid (‘extravagant’) expressions for relatively banal contents in order to be noticed, or in other words because of the greater salience associated with the novel expressions. A similar explanation can be given for many cases of lexical-semantic change, e.g. developments from ‘speak’ to ‘say’ (e.g. Polish mówić) or from ‘walk’ to ‘go’ (e.g. Italian andare), or from ‘intact’ to ‘whole’ (Latin integer > French entier). These can be accounted for by the inflationary model, but not by the ‘periphrasis saves’ model.

But what about the third explanation, ‘redundancy regulation’? This explanation was advanced by Lüdtke (; ) and taken up by Keller (: –) as well as Haspelmath (a). It starts out from the observation that language use varies both along the phonetic dimension and the morphosyntactic dimension (for the latter, see also Croft ), and speakers have a whole range of reduced or expanded options at their disposal for the purposes of ‘redundancy regulation’. According to these authors, there is an asymmetry: variation along the phonetic dimension is open toward the reduction pole (phonetic reduction can be indefinite) and closed toward the expansion pole (we do not expand phonetically, i.e. we do not speak more clearly than fully clearly). By contrast, variation along the morphosyntactic dimension is said to be open toward the expansion pole (verbosity can be indefinite, i.e. we can always add further explanatory words and



phrases) but closed toward the reduction pole (we do not reduce morphosyntactically, i.e. we do not simply omit afﬁxes or function words). Hence, the range of reduced and expanded options continually changes in the direction of morphosyntactically expanded forms. But as Campbell (: ) noted correctly, it is not quite true that phonetic expansion is impossible, because expansive sound changes do occur (lengthening, strengthening, epenthesis, and so on). Moreover, this explanation, too, relies on the idea that phonologically weak reduced forms disappear on their own, and on the idea of ‘compensatory’ morphosyntactic enrichment. This view neglects the fact that there are a lot of possibilities for repairing older categories if they become indistinct due to sound change. For example, the singular/plural distinction was preserved in English, even though most of the Old English plurals were no longer distinct from the singulars after ﬁnal vowels and nasals were dropped. What happened was that the one plural ending that was still distinct phonologically (the -s plural) spread over (almost) the entire class of nouns. There was no need to introduce a completely new plural form based on a content item, along the lines of Seychelles Creole bann (from French bande ‘group’) or Tok Pisin ol (from English all) (cf. Michaelis and Haspelmath, to appear). Thus, I conclude that the best explanation for the anasynthetic spiral is the extravagance and inﬂation model of grammaticalization.

CONCLUSION

The most important idea of 19th-century evolutionary typology that has survived into the 21st century is the hypothesis that many or most grammatical markers derive from earlier content items, and that the re-creation of grammatical patterns and systems on the basis of content items (or more concrete items) is a common process in language change. When earlier forms get competition from newer constructions based on content items, we can speak about analyticizations, and when these constructions become the most grammaticalized pattern in the language, we can speak about anasynthesis. Such developments can often be seen at the level of particular constructions, and sometimes perhaps at the level of entire language systems, as in Egyptian-Coptic.

Another idea that is still widely found but that has not been substantiated is the claim that there is generally a fusional or flective stage intermediate between the older agglutinative synthetic stage and the analytic stage. Flective patterns (cumulative exponence, stem alternations, affix alternations) do not seem to originate in sound changes; the origins of the most robust patterns of this kind seem to be obscure.

The driving force behind the grammaticalization changes that are reflected in anasynthetic patterns is best described as extravagance with inflation, i.e. the semantic developments precede any formal developments (as also emphasized by Heine, Chapter  this volume). The older idea that anasynthetic changes are a reaction to the destructive force of sound changes is not well motivated.

Finally, readers should be aware that the judgements expressed in this chapter about the value of particular ideas and approaches are entirely based on experience and intuition. I have not brought any quantitative evidence to bear on the competing hypotheses. Perhaps this is a development that future research of macro-change patterns will take: linguists may develop cross-linguistic databases of comparable diachronic developments in different languages from different parts of the world, and then we will be more conﬁdent about our results. However, just as the bold speculations of the Schlegels, Humboldt, Bopp, Schleicher, and Jespersen contributed to our knowledge by inspiring much further research, I think that speculative big-picture ideas still have a valuable role in our times.

7 Grammaticalization in the North Caucasian languages
PETER ARKADIEV AND TIMUR MAISAK

NORTH CAUCASIAN LANGUAGES: OVERVIEW

The Caucasus is home to dozens of languages spoken by several million people. While some Indo-European, Turkic, and Semitic languages are present in the region, most languages of the Caucasus belong to one of the three indigenous families: the (North-)West Caucasian or Abkhaz-Adyghe, the (North-)East Caucasian or Nakh-Daghestanian, and the Kartvelian or South Caucasian. Although at present there is no consensus regarding the genetic relationships between the three families, the idea that the West Caucasian and the East Caucasian families are distantly related (being unrelated to the Kartvelian family) seems to us the most promising (Starostin ). The three families share some important typological properties like rich consonantism, complex morphology, ergativity, SOV word order, and prefixal conjugation, which is sometimes taken as evidence of the existence of the Caucasian Sprachbund (Chirikba ), although this position also remains debatable.

The (macro)family comprising the West Caucasian and the East Caucasian branches is known as North Caucasian (Nikolayev and Starostin : –), and in the present chapter we examine selected grammaticalization processes in the two branches, focusing on one compact language group from each of them. Section ., written by Peter Arkadiev, describes the Circassian group of the West Caucasian family, and section ., written by Timur Maisak, deals with the Lezgic group of the East Caucasian family. The choice of the groups is mainly determined by the authors’ expertise, in particular by fieldwork experience with Circassian and Lezgic languages.

The Circassian group of the West Caucasian family consists of two closely related languages (or groups of dialects): Adyghe or West Circassian, and Kabardian or East Circassian. The other known languages of the family are the closely related

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Peter Arkadiev and Timur Maisak . First published  by Oxford University Press

Grammaticalization in North Caucasian languages



Abkhaz and Abaza and the now extinct Ubykh.¹ Up to the s the Circassian branch must have constituted a dialect continuum occupying a vast territory from the Black Sea coast in the northwest to the borders of modern North Ossetia in the southeast; but following the military defeat of the Circassians by the Russian Empire and ensuing mass deportations and resettlement, the original linguistic landscape was disrupted and a number of dialects became extinct. Currently Circassians live in compact patches of land surrounded by areas settled by the Russians in the Russian republics of Adygeya, Karachay-Cherkessia, and Kabardino-Balkaria, in certain districts of the Krasnodar region, as well as in several countries of the Middle East, most notably in Turkey. The total number of speakers of Circassian languages in all countries is hard to estimate due to the lack of reliable information about the language proﬁciency in their diaspora. According to Korjakov (: ), of the c.. million speakers, , Adygheans and , Kabardians live in Russia. Technically speaking, Circassian languages and dialects are not endangered, at least in Russia, where, despite total bilingualism in Russian, both standard Adyghe and Kabardian and the local rural dialects are still spoken and acquired by children as well as employed in media including the internet. Languages of the East Caucasian family are mainly spoken in the Russian republics of Daghestan, Chechnya, and Ingushetia, as well as in the adjacent areas of Azerbaijan and Georgia. The total number of languages in the family is about thirty, but this ﬁgure may turn out to be higher given that the traditional classiﬁcation tends to lump some mutually unintelligible idioms into one language, counting them as mere dialects.² Lezgic languages constitute the southern branch of the family and are spoken in the southern part of Daghestan and the northern part of Azerbaijan by more than , people. 
There are nine Lezgic languages, with the three groups of closely related idioms, namely East Lezgic (Lezgian, Tabasaran, and Agul), West Lezgic (Tsakhur and Rutul), and South Lezgic (Kryz and Budugh), and the two outliers Archi in central Daghestan and Udi in northern Azerbaijan. Among them there are major languages with developed literary standards (Lezgian and Tabasaran), smaller languages which became written only in s, with almost no original literature (Agul, Tsakhur, and Rutul), and even smaller unwritten languages (Kryz, Budugh, and Archi). Udi enjoys a special position as the closest living relative to the extinct Caucasian Albanian language,³ the only East Caucasian language with an ancient, albeit interrupted, written tradition (cf. the palimpsests found on Mt Sinai and published in Gippert et al. ). Other East Caucasian branches include Nakh (comprising Chechen, Ingush, and Batsbi), Avar-Andic (comprising Avar and eight smaller languages of the Andic group), Tsezic with five or six languages, Lak and Khinalug (each constituting a separate family-level

¹ For a general overview of the family see Hewitt (, ) and Korjakov (: –); for a historical-comparative perspective see Chirikba ().
² For a general overview of the family, see van den Berg () and a collection of grammatical sketches in Smeets () and Job (); for the issues dealing with genetic classification, see Korjakov (: –) and Nichols ().
³ Caucasian Albanian (alternatively, Agwan) is the conventional name of the dominant language of Caucasian Albania, an ancient state in the eastern Caucasus, unrelated to the Albania of the Balkans.



Peter Arkadiev and Timur Maisak

branch), and Dargwa (the latter includes literary Dargwa, as well as a large number of highly divergent dialects/languages). Despite many common features, the two North Caucasian branches differ considerably in their morphosyntax, in particular in the morphological structure of the verb and in the inventory of inﬂectional and derivational categories that can be expressed within word boundaries. The same is true of the preferred grammaticalization paths, which include both mutually and cross-linguistically common developments on the one hand and family-particular ones on the other. We start by presenting the general typological proﬁle of each family (sections .. and ..), and then proceed to the discussion of case studies that seem remarkable to us. Most of them pertain to the verbal system, which morphologically is the most complex in both families. For Circassian, the ﬁrst domain of interest is the grammaticalization of body-part nouns into locative applicatives and of motion verbs into directional sufﬁxes (section ..). In particular, it can be shown that the applicative use of body-part nouns does not necessarily presuppose their postpositional function, as has sometimes been claimed. On the whole, both nominal and verbal roots become fully integrated into the structure of the polysynthetic verbal complex of Circassian as grammaticalized markers of spatial speciﬁcation of the event. The picture is different as far as auxiliary verb constructions are concerned, which are the focus of section ... The system of periphrastic verb forms complements the complexity of morphological structure, and offers rich material concerning the role of constructional patterns in grammaticalization as well as on the gradual nature of morphosyntactic integration of such constructions. 
Various diagnostics show that constructions with auxiliary verbs in Circassian form a cline from free combinations of two independent verbs each heading its own clause to tightly integrated complexes where the auxiliary has almost become a suffix. The same is true of the Lezgic periphrastic constructions, which are discussed in section .. On the whole, the range of verbs that have become grammaticalized in Lezgic languages is very restricted (especially in comparison with the neighbouring Turkic languages, for example). It is often the case, though, that the same lexical item can be found in a number of grammaticalizing constructions. Thus, copulas and the verbs ‘be’ and ‘become’ appear regularly in various periphrastic tense and aspect forms. This issue is dealt with in section .., where we focus on the degree of autonomy of the auxiliaries. Multiple grammaticalization paths are also characteristic of the verb ‘say’, which gives rise to a number of markers, instantiating both cross-linguistically common and quite rare grammaticalization paths, as described in section ... Alongside some bona fide cases of grammaticalization, in which a verb loses its autonomy as it gradually changes into a grammatical marker (e.g. auxiliary or subordinator), an unusual development is attested in some Lezgic languages where the morphological coalescence occurs at a faster rate, anticipating even syntactic fusion.⁴ In section .., the

⁴ A similar case has been recently described for the West Caucasian Abaza in Panova ().

origin of the ‘veriﬁcative’ category in two Lezgic languages will be outlined, which also involves the grammaticalization of a matrix verb but stands out with respect to both the structure of the source construction and the unexpected discrepancy between morphological fusion and the lack of syntactic monoclausalization. In the discussion of North Caucasian grammaticalization we draw both on existing descriptions and on our own ﬁeldwork, as well as on textual sources. The Circassian data mainly come from two varieties, the Temirgoy dialect of Adyghe, which is the basis of the standard language, and the Besleney dialect of Kabardian as spoken in the village Ulyap in the Republic of Adygeya.⁵ However, some of the texts actually come from the published literature in standard Adyghe. In the Lezgic sections we give examples from most languages of the group, including elicited data stemming from Timur Maisak’s ﬁeldwork on Agul, Tsakhur, and Udi.⁶ For uniﬁcation reasons, the transcription in examples cited from others’ works has been changed or adapted, and glosses were added in case the original did not have them.

GRAMMATICALIZATION IN THE CIRCASSIAN LANGUAGES (WEST CAUCASIAN)

..     All Circassian varieties share the following most important structural characteristics: • Little distinction between major word classes (nouns, adjectives, and verbs), all of which can occur as arguments, predicates, and modiﬁers without any special derivational marking (Lander and Testelets ; Lander : –), as well as blurred distinctions between inﬂection and derivation on the one hand and derivation and compounding on the other (see Lander : ). Although the latter issue has direct relevance for grammaticalization, it is not discussed here for reasons of space. • Polysynthesis, i.e. indexing of all verbal arguments (S, A, P as well as various indirect objects such as recipient, benefactive, and location, cf. e.g. Smeets ) by means of pronominal preﬁxes, and a rich system of afﬁxes marking argument-related, aspectual, temporal, and modal meanings (Kumaxov ; Smeets ; Korotkova and Lander ; Lander and Letuchiy ; Lander

⁵ Both varieties have been subject to fieldwork conducted by a group of linguists including Peter Arkadiev, organized under the auspices of the Russian State University for the Humanities (–) and jointly by the latter and the National Research University Higher School of Economics (–present). When both Adyghe and Kabardian cognate forms are cited, the Adyghe form comes first.
⁶ Textual examples from the Huppuq’ dialect of Agul are taken from the oral corpus collected by Dmitry Ganenkov, Timur Maisak, and Solmaz Merdanova in the s.



and Testelets ). A naturally occurring example of a characteristically ‘long’ verbal form is given in ():

()

Standard Adyghe (corpus data) zewap’e-mi zə-qə-Øi7-r-a-r-jə-ʁe-xə-ʁ-ep battleﬁeld-OBL RFL.ABS-DEIC-SG.IO-LOC-PL.IO-DAT-SG.ERG-CAU-carry-PST-NEG ‘He did not ask them to carry him from the battleﬁeld.’

• Ergativity in both head- and dependent-marking (Smeets ; Kumakhov and Vamling ; Letuchiy ), coupled with an impoverished case system comprising only the Absolutive (-r, marks S (a) and P (b)) and the Oblique (several allomorphs, of which -m is the basic one, marks A (b), all types of indirect objects (b), and adnominal possessors (c)). Personal pronouns, possessed nominals and proper names, as well as non-referential common nouns normally do not admit overt case marking (on the latter see Testelets and Arkadiev ).

()  Standard Adyghe (elicited)
    a. čʼjale-rᵢ  Øᵢ-me-čəje.
       boy-ABS  .ABS-PRS-sleep
       ‘The boy is sleeping.’
    b. čʼjale-mᵢ  pŝaŝe-mⱼ  txəλə-rₖ  Øₖ-Øⱼ-r-jᵢ-e-tə.
       boy-OBL  girl-OBL  book-ABS  .ABS-SG.IO-DAT-SG.ERG-PRS-give
       ‘The boy is giving the book to the girl.’
    c. c’əfə-mᵢ  Øᵢ-jə-wəne
       man-OBL  SG.PR-POSS-house
       ‘the man’s house’

• Head-final word order in most types of clauses and phrases, see () above, though word order of major constituents, especially in independent clauses, is quite flexible.

The word in Circassian languages is defined on the basis of rigid morphological structure and morphophonological rules. Among the latter, the alternation /eCe/ ~ /aCe/ is most important (see Smeets : –). Generally speaking, the alternation applies once in a word and signals the right edge of the stem; the class of morphemes occurring to the right of the domain of the alternation (the so-called ‘endings’, see Smeets : –), as well as of those exempt from it, is well defined and quite limited in all Circassian varieties. According to Lander (: ), both verbal and nominal words in Circassian languages are constituted by five morphological zones schematically shown in Fig. .. Each of the zones, especially the argument structure zone (A), the stem (D), and the endings (E), can contain more than one morpheme, whose order partly reflects their semantic scope (see Korotkova and Lander ) and partly adheres to a rigid

⁷ Beyond this section we will not mark and gloss zero morphemes. The subscript indices show crossreferencing of the noun phrase by the pronominal preﬁx.

OUP CORRECTED PROOF – FINAL, 22/9/2018, SPi

[Figure: five morphological zones in left-to-right order: Argument structure zone (A) | Pre-stem elements (B) | Causative marker(s) (C) | Stem (D) | Endings (E)]

Fig. .. The morphological composition of the Circassian word

template; elements from different zones can interact with each other in intricate ways (see e.g. Arkadiev and Letuchiy ). Example () presents a verb manifesting elements from all morphological zones, while example () shows the so-called nominal complex, a formation equivalent to a noun phrase with modifiers in syntax but displaying morphological and phonological properties of a single word (see e.g. Lander ), with three of the morphological zones.

Besleney Kabardian (corpus data)
()  [ja]A-[mə]B-[ʁe]C-[šx-a]D-[wə]E
    [PL.ERG]A-[NEG]B-[CAU]C-[eat-PST]D-[ADV]E
    ‘they having not fed (him/her)’

()  [Ø-jə]A-[ʁʷəneʁʷ-čʼjele-c’ək’ə]D-[m]E
    [SG.PR-POSS]A-[neighbor-boy-little]D-[OBL]E
    ‘(to) her neighbor, a little boy’
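The five-zone template illustrated by the two Besleney Kabardian forms above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the bracketed plain-text notation is taken from the examples as printed here, the function names `parse_zones` and `align` are hypothetical, and the check that zones occur in A to E order (with some zones possibly absent) is a simplification of the template, not an analysis proposed by the authors. Glosses are copied exactly as they appear above.

```python
import re

# Zone labels in their fixed left-to-right order (A-E), following the figure above.
ZONE_NAMES = {
    "A": "argument structure zone",
    "B": "pre-stem elements",
    "C": "causative marker(s)",
    "D": "stem",
    "E": "endings",
}

# One bracketed zone in the plain-text notation of the examples, e.g. "[šx-a]D".
ZONE_PATTERN = re.compile(r"\[([^\]]*)\]([A-E])")


def parse_zones(form):
    """Return [(zone_label, [morphemes])] and check that zones appear in A-E order."""
    zones = [(label, content.split("-")) for content, label in ZONE_PATTERN.findall(form)]
    labels = [label for label, _ in zones]
    # Assumption: zones may be absent, but those present are unique and ordered.
    if labels != sorted(set(labels)):
        raise ValueError(f"zones out of template order: {labels}")
    return zones


def align(form, gloss):
    """Pair every morpheme with its gloss, zone by zone."""
    form_zones, gloss_zones = parse_zones(form), parse_zones(gloss)
    if [z for z, _ in form_zones] != [z for z, _ in gloss_zones]:
        raise ValueError("form and gloss use different zones")
    rows = []
    for (zone, morphs), (_, glosses) in zip(form_zones, gloss_zones):
        if len(morphs) != len(glosses):
            raise ValueError(f"zone {zone}: morpheme/gloss count mismatch")
        rows.extend((zone, m, g) for m, g in zip(morphs, glosses))
    return rows


# The verb form with all five zones, as cited above.
verb = align("[ja]A-[mə]B-[ʁe]C-[šx-a]D-[wə]E",
             "[PL.ERG]A-[NEG]B-[CAU]C-[eat-PST]D-[ADV]E")
for zone, morph, gloss in verb:
    print(f"{zone} ({ZONE_NAMES[zone]}): {morph} = {gloss}")
```

Running the same `parse_zones` on the nominal complex yields only zones A, D, and E, which matches the statement that it shows three of the morphological zones.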

Circassian languages offer a plethora of phenomena relevant for grammaticalization studies, such as the development of body-part nouns into locative applicatives, and of verbs of motion into directional sufﬁxes (section ..), and a rich system of auxiliary verb constructions expressing a variety of meanings and showing different degrees of formal integration (section ..), as well as grammaticalization of nouns and verbs into postpositions and sentence connectors, semantic bleaching of posture verbs and their role in expression of spatial conﬁgurations and motion events, and, ﬁnally, analytical challenges in distinguishing between nominal or adjectival roots and bound afﬁxes in nominal complexes. For reasons of space, only the ﬁrst two kinds of phenomena will be addressed here.

..    :     Circassian languages, like West Caucasian in general, possess elaborate systems of markers expressing spatial meanings, mostly concentrated within the verbal form (see e.g. Smeets : – and various contributions to Tabulova and Temirova ). These include both preﬁxes and sufﬁxes, which often interact with each other. Both of these categories are interesting from the perspective of grammaticalization, and they will be discussed in turn.

Locative preverbs from body-part nouns

Like other languages of the Caucasus, Circassian languages employ prefixes (preverbs) for spatial specification of the event expressed by the verb. Each Circassian

variety employs at least twenty such preﬁxes, though their productivity varies. Most prominently such preﬁxes are attested with positional roots -t- ‘stand’, -s- ‘sit’, and -λ- ‘lie’, which are used to express position and location not only of people but also of animals and inanimate entities (see e.g. Ryžova and Kjuseva ), but locative preﬁxes can also attach to verbal roots expressing many other types of events, not necessarily denoting position or motion. The choice of the preﬁx is mainly determined by the topological properties of the landmark (the entity with respect to which the spatial conﬁguration of the event is assessed) and by the entire spatial conﬁguration (see e.g. Paris ), cf. the following illustrative examples. ()

Standard Adyghe (corpus data)
a-dre-r < . . . > lebe-č’jej de-sə-ʁ.
DEM-other-ABS Labe-valley LOC:area-sit-PST
‘The other man . . . lived in the valley of the Labe river.’

()

qʷeʁʷə-m qʷe-tə-ʁe-r zə-pχe-djəwan < . . . >
corner-OBL LOC:behind-stand-PST-ABS one-wood-sofa[R]
‘a wooden sofa standing in the corner’

()

p’ek’ʷerə-m p’ʷeble-p’c’ane jə-λə-ʁ.
bed-OBL mat-bare LOC:enclosure-lie-PST
‘There was a bare mat on the bed.’

()

()

Besleney Kabardian (corpus data) a-bə a dehap’e-m dje ʔʷə-t-te-r napəžj-xe DEM-OBL DEM passage-OBL at LOC:beside-stand-IMPV-ABS SURNAME-PL j-a-č’jale. POSS-PL.PR-boy ‘Napyzhevs’ son stood in that passage.’ školə-m sə-č’e-s-wə-re . . . school[R]-OBL SG.ABS-LOC:under-sit-ADV-CNV ‘when I was a school-girl (lit. when I sat under the school)’

() pŝaŝe-m jə-ʔeʁʷape blatəkʷ xe-λ-t-jə p-ŝ’e-re.
girl-OBL POSS-sleeve kerchief LOC:among-lie-IMPV-ADD SG.ERG-know-Q
‘The girl had a kerchief in her sleeve (lit. a kerchief lay in her sleeve), you know.’

As is common for spatial markers cross-linguistically (see e.g. Heine and Kuteva ), Circassian locative preverbs mostly go back to incorporated nouns denoting body parts or parts of objects such as ‘corner’ or ‘bottom’ (see e.g. Kimov  for an analysis of metaphorical extensions and grammaticalization of the Kabardian body-part nouns). Table . (based on Kumaxov : –) lists some of the locative preverbs with their corresponding nominals; unless indicated otherwise, Adyghe forms are given. Some of the preverbs are actually morphologically complex, consisting of a body-part noun and another preverb.


Grammaticalization in North Caucasian languages



T .. Lexical sources of Circassian locative preverbs Locative preverb

Corresponding nominal

č’e- ‘under the landmark’

č’e ‘bottom’

pe-, pə- ‘on the frontal part of the landmark’

pe ‘nose, front’

qʷe- ‘behind the landmark’

qʷeʁʷə ‘corner’, cf. ex. ()

λə- ‘moving following the landmark’

λe ‘foot, footprint’

ble- ‘moving along the landmark’

ble ‘forearm’

gwe- ‘beside or near the landmark’

gʷə ‘heart’

k’ʷec’ə- ‘inside the landmark’

k’ʷec’ə ‘intestines’

ʔʷə- ‘beside or near the front of the landmark’

ʔʷə ‘mouth, lips’

bʁe-de- (Kabardian) ‘near the landmark’

bʁe ‘chest’

bʁʷə-rə- ‘beside the landmark’

bʁʷə ‘side’

čʼje-rə- ‘on top or end of the landmark’

čʼje ‘tail, end’

že-xe- ‘close to the landmark’

že ‘mouth’

ŝhe-de-, ŝhe-rə-, ŝhe-ŝə- ‘over the (top of ) the landmark’

ŝhe ‘head, top’

A number of locative preverbs have no transparent cognates among synchronically attested lexemes, e.g. de- ‘in’, tje-/tər- ‘on’, jə- ‘in’, xe- ‘among’. Notably, these are the preverbs which are attested in all dialects. They have developed various non-spatial and idiomatic meanings and appear to be used with the greatest frequency, which might be indicative of their higher degree of grammaticalization. On the other hand, some of the preverbs listed in Table ., e.g. pe- ‘in front of’, also belong to the oldest layer of morphological elements, being attested not only in the Circassian but also in the Abkhaz-Abaza branch of the family.

As is evident from the table, the locative meanings of some preverbs are more or less transparently related to the semantics of the nouns they originate from; cf. č’e- ‘under’ and č’e ‘bottom’ or bʁʷərə- ‘beside’ and bʁʷə ‘side’ (see e.g. Heine and Kuteva : –, – for similar developments in other languages). On the other hand, the spatial semantics of some of the preverbs is not so evidently related to the lexical meaning of their possible nominal sources, e.g. ble- ‘movement along the landmark’ and ble ‘forearm’ or gwe- ‘beside or near the landmark’ and gʷə ‘heart’ (Heine and Kuteva  do not list corresponding paths of grammaticalization in their lexicon). Finally, there are several preverbs which retain the original semantics of their nominal correspondences and can be considered incorporated body-part terms, especially since morphologically they almost always combine with other preverbs (some such combinations of body-part noun with a preverb, e.g. bʁe-de- [chest-LOC:in-] ‘near’ or ŝhe-de- [head-LOC:in-] ‘over’, have acquired purely spatial meanings). To this group belong že ‘mouth’, cf. some verbs of physical actions such as Adyghe že-de-we [mouth-LOC:in-hit] ‘hit in the mouth or in the face’, že-de-xə


[mouth-LOC:in-take] ‘take out of the mouth’, as well as a number of metaphorical extensions having to do with speech, e.g. Adyghe že-de-zə [mouth-LOC:in-fall] ‘utter’, and ʔe ‘hand’, as in Adyghe zə-ʔe-čʼje-ʁe-zə [RFL.ABS-hand-LOC:under-CAU-fall] ‘drop, let fall’ or Standard Kabardian ʔe-ŝ’e-ne [hand-LOC:under-remain] ‘remain in someone’s hands’, and in the following textual example: Standard Adyghe (corpus data) ŝhenʁʷəpčje-m-čʼje () qə-ʔe-čʼje-fa-ʁe-r DEIC-hand-LOC:under-fall-PST-ABS window-OBL-INS qə-r-jə-ʒ-ə-ʁ DEIC-LOC.in-SG.ERG-throw-ELAT-PST ‘he threw out of the window those who fell into (lit. ‘under’) his hands’ Beside the more literal meaning the combination ʔe-čʼje-/ʔe-ŝ’e- ‘under the hand’ has developed into a marker of non-expected or non-volitional action (see Kumakhov and Vamling : –; Arkadiev and Letuchiy : –), cf. example (): Standard Adyghe (corpus data) ʔa-č’je-k’ʷede-žjə-ʁa-p () ʁʷegʷə-m šjə-ʁʷeze-nə-r road-OBL LOC-understand-MSD-ABS hand-LOC:under-lose-RE-PST-ASS ‘he completely lost the understanding of the road’ Morphosyntactically, all Circassian spatial preverbs are applicatives adding to the verb an indirect object denoting the landmark. This argument can be expressed by an overt noun phrase in the oblique case, as in examples () and (), or unmarked, as in examples () and (), and can be cross-referenced by overt personal preﬁxes appearing before the preverb except for third person singular. Cf. the following textual examples: Standard Adyghe (corpus data) () šə-λabz^e-me q-a-č’e-zə-re jet’e-taqərə-šxʷe-xe-r horse-hoof-OBL.PL DEIC-PL.IO-LOC:under-fall-PRS soil-piece-large-PL-ABS ‘large pieces of soil falling from horses’ hoofs’ Besleney Kabardian (corpus data) () λ’əʁe p-xe-λ-q’əm courage SG.IO-LOC.in-lie-NEG ‘you don’t have courage’, lit. “there is no courage inside you” The same concerns incorporated body parts, cf. 
the following example with ‘hand’: Standard Adyghe (corpus data) qə-f-a-šjejə-ʁe-r a-ʔ-jə-xə-ʁ () bžj-ew horn-ADV DEIC-BEN-PL.ERG-ﬁll-PST-ABS PL.IO-hand-SG.ERG-take-PST ‘He took from their hands the horn they had ﬁlled for him.’ It is worth noting that the other Circassian applicative preﬁxes with more abstract functions, such as the benefactive fe-/xwe-, the malefactive ŝ’we-/f ’e-, the comitative de-, and the instrumental/prolative rə-, have also most probably developed from spatial preverbs, thus exemplifying a more advanced stage of grammaticalization. For


example, the benefactive still retains the allative meaning in combination with verbs of motion ((a,b)), while the malefactive is productively used in the peculiar spatial meaning ‘on the tip of the landmark’ (see Mazurova ) ((a,b)).

Standard Adyghe (corpus data)
() a. qə-fa-k’ʷe, setenaj!
DEIC-LOC:towards-go(IMP) PROPER.NAME
‘Come here, Setenay!’
b. λepŝə čʼjele-c’ək’ʷə-m sawəsərəqʷ c’-ew f-jə-wəsə-ʁ.
PROPER.NAME boy-little-OBL PROPER.NAME name-ADV BEN-SG.ERG-invent-PST
‘Tlepsh invented for the baby boy the name “Sosruko”.’

Besleney Kabardian
() a. pəʔe-r ŝhe-m f’e-s-λh-a.
hat-ABS head-OBL LOC:tip-SG.ERG-put-PST
‘I put the hat on my head.’ (elicited, Lomize : )
b. ǯja-r-jə d-a-x-r-jə f’e-x-a-ʒ-a.
DEM-ABS-ADD LOC:in-PL.ERG-carry-CNV-ADD MAL-LOC.among-PL.ERG-throw-PST
‘And so they carried him [the wounded soldier] and threw him out (to his detriment).’ (corpus data)

From a typological perspective, Circassian languages appear to offer a fairly clear case of body-part nouns grammaticalizing into applicative markers on verbs, similarly to what Nordlinger () has shown for Murrinh-Patha (Northern Australia), and contrary to claims by Peterson (: –) that applicatives can only arise from an adpositional use of such nouns. Though several Circassian preverbs indeed have postpositional counterparts (e.g. pə- ‘front’ ~ pe ‘before’, č’e- ‘under’ ~ č’eʁ ‘under’; see Kumaxov : ), it is by no means the case that all preverbs (or at least all preverbs with transparent lexical sources) have corresponding postpositions. The opposite is also true, e.g. the noun wəžə ‘footprint’ has developed into a postposition meaning ‘behind, after’, but is never found as a part of the verbal complex. The use of unequivocal incorporated nouns as applicatives introducing overt indirect objects, as in (), also speaks in favour of this scenario.

... Directional suffixes from verbs of motion

Another type of encoding of spatial semantics in Circassian languages is constituted by roots of verbs of directed motion used as verbal suffixes (see Smeets : –; Kumaxov : –; Urusov ) encoding the path and direction of concrete or abstract motion. The use of these morphemes is always accompanied by prefixation, and the roots themselves fall into two groups depending on whether the choice of the prefix is fixed or not. To the first group belong directional roots -xə- ‘go down’


and -ž je- ‘depart; begin’, always combining with the semantically bleached ‘dative’ applicative preﬁx (j)e-/r-. Examples () and () show these morphemes as verbal roots, while in examples () and () they feature as sufﬁxes attaching to other verbal roots. Standard Adyghe (corpus data) () the-r b-ʁe-gʷəbžə-ʁe, ŝ’ex-ew q-je-xə-žj! God-ABS SG.ERG-CAU-angry-PST quickly-ADV DEIC-DAT-go.down-RE(IMP) ‘You have angered God, now quickly go down!’ Besleney Kabardian (corpus data) () bγə-m q’-je-že-xə-n-wə k’ʷ-a. mountain-OBL DEIC-DAT-run-DOWN-POT-ADV go-PST ‘He went skiing (lit. ‘down’) from a hill.’ Standard Adyghe (corpus data) λes-ew () kʷaχʷe je pčjeʁʷə q-ə-št-jə, DEIC-SG.ERG-take-ADD on.foot-ADV pitchfork or stake q-a-d-je-žja-ʁ. DEIC-PL.IO-COM-DAT-depart-PST ‘He took a pitchfork or a stake and departed with them on foot.’ Besleney Kabardian (corpus data) () t’ane hade-r q’-a-ʔat-r-jə, then deceased-ABS DEIC-PL.ERG-raise-CNV-ADD šjə-r-a-hə-žje-m . . . REL.TEMP-DAT-PL.ERG-carry-INCH-OBL ‘when they raised the body of the deceased and started carrying it . . . ’ Formally, the circumﬁxes je-V-λ’e ‘movement towards’ () and de-V-je () ‘movement upwards’, which are not used as freestanding verbs and do not have any clear etymology, belong to the same group of elements. Standard Adyghe (corpus data) () bəsλəmen-xe-r j-a-wənaʁʷe-me j-a-k’ʷe-λ’e-žjə-ʁe-x Muslim-PL-ABS POSS-PL-family-OBL.PL DAT-PL.IO-GO-ALL-RE-PST-PL ‘The Muslims returned to their families.’ () šjebzašje-r-jə waŝʷe-m d-e-bəbə-je arrow-ABS-ADD sky-OBL LOC-PRS-ﬂy-UP ‘The arrow ﬂies up into the sky, too.’ The second group of directional afﬁxes that developed from verbs comprises two directionals which combine with the semantically appropriate locative preverbs, i.e. -he ‘motion in or towards the landmark’ (lative) and -čʼjə ‘motion out of or from the landmark’ (elative), which transparently correspond to verbal roots meaning, respectively, ‘go in’ and ‘go out’. 
These roots are always used with locative preverbs, and the same is true of their sufﬁxal counterparts. The following examples show these verbs used on their own ((), ()) and as directional markers ((), ()). It is worth noting that in the literature such cases are sometimes described as ‘incorporation’ of


the verbal root into the preverb + directional verb combination (see e.g. Kumaxov : –; Rogava and Keraševa : –); see Lander (: ) for arguments against such an analysis. Besleney Kabardian (corpus data) ǯjər-jə () nəse-m-re čʼjale-m-re daughter.in.law-OBL-COORD boy-OBL-COORD still-ADD s-a-λexe-he-q’əm SG.ABS-PL.IO-LOC:towards-go.in-NEG ‘I still do not visit my son and my daughter in law.’ () jet’ane c’əxʷə-r šə-m t-o-t’əs-ha-r-jə then man-ABS horse-OBL LOC:on-PRS-sit.down-LAT-CNV-ADD ‘Then the man mounts (lit. ‘sits onto’) a horse.’ Standard Adyghe (corpus data) () rwəsλan wəne-m jə-č’jə-ʁ PROPER.NAME house-OBL LOC:in-go.out-PST ‘Ruslan went out of the house.’ Besleney Kabardian () baʒe-r karobke-m q’ə-de-pšə-čʼj-a ﬂy-ABS box[R]-OBL DEIC-LOC:in-crawl-ELAT-PST ‘The ﬂy went (lit. ‘crawled’) out of the box.’ (elicited, Lomize : ) Lative and elative sufﬁxes can attach not only to verbs of motion but to verbs of other semantic types as well; the development of ‘abstract motion’ and Aktionsart meanings shown in examples () and () is indicative of a high degree of grammaticalization. Standard Adyghe (corpus data) () γərz-maqe-me zəgʷere q-a-xe-kʷəwə-č’jə-ʁ moan-voice-OBL.PL someone DEIC-PL.IO-LOC:among-shout-ELAT-PST ‘Someone of [those with] moaning voices screamed out.’ () čʼje-txə-čʼjə-žjə-n LOC:under-write-ELAT-RE-POT ‘to rewrite, copy’ (Arkadiev and Letuchiy : ) Summing up, we have seen how Circassian languages employ originally nominal and verbal roots as grammaticalized markers of spatial speciﬁcation of the event. In both cases these markers are fully integrated into the structure of the polysynthetic verbal complex and interact with syntax: locative preverbs, being applicatives, augment the valency of the verb, and directional sufﬁxes, always combining with preverbs or other applicatives, ultimately do the same.

..    In addition to rich verbal morphology, Circassian languages possess an elaborate system of auxiliary verb constructions expressing aspectual and modal meanings


(for the most comprehensive description of the Adyghe auxiliary system to date, see Kimmelman ). We deﬁne auxiliary verbs as verbs deprived of their own lexical meaning and argument structure and expressing some aspectual, temporal, modal, or other modiﬁcation of the morphosyntactically subordinate lexical verb. Auxiliaries are functionally similar to afﬁxes, but retain some morphosyntactic properties of independent predicates, e.g. their own inﬂection and selectional requirements on the form of the lexical verb. The basic inventories of both lexical source verbs and auxiliary verb constructions are identical across Circassian varieties, though, of course, some variation in their use and functions is attested. A prominent feature of this system is ‘polygrammaticalization’ (Craig ; Robert ), i.e. coexistence of different constructions employing the same lexical source verb in different functions, while the number of distinct source verbs used as auxiliaries is fairly limited. This situation can be exempliﬁed by the verb χʷə ‘become, happen’, which, in addition to its lexical use shown in (), has several grammaticalized uses exempliﬁed in (a–f ) taken from Kimmelman (: ): Standard Adyghe (corpus data) () adəγe-xe-r bəsλəmen zə-χʷə-ʁe-xe-r a-šj Circassian-PL-ABS Muslim REL.TEMP-become-PST-PL-ABS DEM-OBL fedjəz-ew beŝ’aʁ-ep. like-ADV long.ago-NEG ‘It was not a long time ago that the Circassians became Muslims.’ ()

Temirgoy Adyghe (elicited)
a. se školə-m sə-k’ʷe χʷə-ʁe
SG school[R]-OBL SG.ABS-go become-PST
‘I began to go to school.’ (aspect: inchoative)
b. se školə-m sə-k’ʷe χʷə-šjt
SG school[R]-OBL SG.ABS-go become-FUT
‘I am allowed to go to school.’ (modality: deontic possibility)
c. se školə-m sə-k’ʷ-ew me-χʷə
SG school[R]-OBL SG.ABS-go-ADV PRS-become
‘Sometimes I go to school (but not all the time).’ (aspect: raritive or habitual)
d. se školə-m sə-k’ʷe-n-ew me-χʷə
SG school[R]-OBL SG.ABS-go-POT-ADV PRS-become
‘I have to go to school.’ (modality: external necessity)
e. se školə-m sə-k’ʷe-n-č’j-jə me-χʷə
SG school[R]-OBL SG.ABS-go-POT-INS-ADD PRS-become
‘Maybe I will go to school.’ (modality: epistemic)
f. se školə-m sə-k’ʷe-n-m-jə me-χʷə
SG school[R]-OBL SG.ABS-go-POT-COND-ADD PRS-become
‘I can go to school (but it doesn’t matter if I don’t).’ (modality: external possibility)

The example of the verb χʷə clearly shows that the interpretation of the construction depends on the form of the lexical verb (bare stem in (a,b), adverbial form in (c),


and various forms built on the potential sufﬁx -n in (d,f )) and on the tense form of the auxiliary itself (cf. (a) vs (b)). A similar situation obtains with other auxiliary verbs, the most frequent of which, besides χʷə, are faj(e) / xwje(jə) ‘want; need; must’ and various combinations of the stative root -t ‘stand’ with locative preverbs, giving rise to aspectual and modal constructions. From the point of view of their morphosyntax, constructions with auxiliary verbs form a cline from free combinations of a fully-ﬂedged matrix verb with a complement verb to tightly integrated complexes where the auxiliary has almost become a sufﬁx. Polysynthetic morphosyntax of Circassian provides a whole range of diagnostics for assessing the degree of independence respecting integration of such constructions. These diagnostics fall into two groups: () Word order: (a) Can the lexical verb and the auxiliary be permutated? (b) Can any word form be inserted between the lexical verb and the auxiliary? () Locus of inﬂection: (a) Can the auxiliary host its own inﬂectional preﬁxes like cross-reference or subordination markers or do all inﬂectional preﬁxes have to appear on the lexical verb? (b) Can the lexical verb inﬂect for tense and other categories, or do all inﬂectional sufﬁxes have to appear on the auxiliary? Constructions involving two lexical verbs, e.g. with faje/xwje(jə) ‘want’ as a matrix verb, give positive answers to all the questions from (a) to (b), while most of the auxiliary verb constructions yield mixed results. Compare, for instance, the behaviour of xwje(jə) in Besleney Kabardian as a matrix verb in (a–g) versus as an auxiliary expressing deontic necessity in (a–f). As examples (a–g) show, xʷje as a lexical verb ‘want’ governing a sentential complement headed by the verb in the -n-wə form common for irrealis subordinate clauses demonstrates full morphosyntactic autonomy. 
In particular, it projects its own argument structure, as manifested by the obligatory preﬁx denoting the absolutive argument (d,f) and hosts morphology expressing the syntactic status of the whole construction (i.e. subordinators) (g). Besleney Kabardian (elicited) sə-xʷjejə-ne () a. bžjəhaŝhe-m sə-žjejə-žjə-n-wə evening-OBL SG.ABS-sleep-RE-POT-ADV SG.ABS-want-FUT ‘In the evening I will want to sleep.’ b. sə-xʷjejə-ne sə-žjejə-žjə-n-wə bžjəhaŝhe-m SG.ABS-want-FUT SG.ABS-sleep-POT-ADV evening-OBL ‘In the evening I will want to sleep’ (permutation) bžjəhaŝhe-m sə-xʷjejə-ne c. sə-žjejə-žjə-n-wə SG.ABS-sleep-POT-ADV evening-OBL SG.ABS-want-FUT ‘In the evening I will want to sleep’ (split) d. *se žjejə-n-wə sə-xʷje SG sleep-POT-ADV SG.ABS-want intended: ‘I want to sleep.’ (omission of cross-referencing preﬁxes on the dependent verb)


e. *se sə-žjəjə-n-wə xʷje
SG SG.ABS-sleep-POT-ADV want
‘I want to sleep.’ (omission of cross-referencing prefixes on the matrix verb)
f. s-o-ŝ’e wə-žjejə-n-wə wə-č’ə-xʷje-r
SG.ERG-PRS-know SG.ABS-sleep-POT-ADV SG.ABS-REL.RSN-want-ABS
‘I know why you want to sleep.’ (subordinating marker on the matrix verb)
g. *s-o-ŝ’e wə-č’e-žjejə-n-wə wə-xʷje-r
SG.ERG-PRS-know SG.ABS-REL.RSN-sleep-POT-ADV SG.ABS-want-ABS
intended: ‘I know why you want to sleep.’ (subordinating marker on the dependent verb)

By contrast, the same verb used as a modal auxiliary combining with the lexical verb in the -n form is fairly tightly integrated into the construction: it must occur right after the lexical verb (a–c), cannot inﬂect for person (d), and even subordinators can attach to the lexical verb (e) as if the construction were a single word; nonetheless, the auxiliary retains some of its autonomy, still being able to host relativizers (f ).⁸ xʷje () a. se pisjmo s-txə-n SG letter[R] SG.ERG-write-POT AUX:must ‘I must write a letter.’ xʷje s-txə-n b. *se pisjmo SG letter[R] AUX:must SG.ERG-write-POT intended: ‘I must write a letter.’ (permutation) c. *se xʷje pisjmo s-txə-n SG AUX:must letter[R] SG.ERG-write-POT intended: ‘I must write a letter.’ (split) d. *se pisjmo s-txə-n sə-xʷje SG letter[R] SG.ERG-write-POT SG.ABS-AUX:must intended: ‘I must write a letter.’ (cross-referencing preﬁx on the auxiliary) xʷje-r e. s-o-ŝ’e pisjmo zerə-s-txə-n SBD-SG.ERG-write-POT AUX:must-ABS SG.ERG-PRS-know letter ‘I know that I must write a letter.’ (subordinating preﬁx on the lexical verb) zerə-xʷje-r f. s-o-ŝ’e pisjmo s-txə-n SG.ERG-PRS-know letter SG.ERG-write-POT SBD-AUX:must-ABS ‘I know that I must write a letter.’ (subordinating preﬁx on the auxiliary) The most grammaticalized auxiliary verbs in Circassian have completely lost their morphosyntactic freedom and have become afﬁxes. Some of them, like the frequentative -zepət (< ‘stand one after another’) both in Adyghe and Kabardian, and the

⁸ See also section .. on the status of auxiliary ‘want’ in different modal constructions of Agul, a Lezgic language.


imperfective past -šjtəʁ(e) ( bzegʷəχ-wə šjə-t-a DEM-ABS gossip-ADV LOC-stand -PST ‘He was a gossiper.’

⁹ In some varieties of Adyghe, e.g. in the Bzhedug dialect, the whole preverb has been lost, the sufﬁxes appearing as -təʁ and -t.


Temirgoy Adyghe () bzəwə-r bəbə-n-ew šjə-t bird-ABS ﬂy-POT-ADV LOC-stand ‘The bird has to ﬂy.’ (elicited, Kimmelman : ) Besleney Kabardian wəne-m sə-q’-jə-č’jə-n-wə šjə-t-a () žjə-w early-ADV house-OBL SG.ABS-DEIC-LOC:in-go.out-POT-ADV LOC-stand-PST ‘I had to leave home earlier.’ (elicited, Tjurenkova : ) These cases show that in Circassian languages lexical verbs become auxiliaries and then even sufﬁxes. Even more importantly, widespread polygrammaticalization amply attested in Circassian presents a strong case for the principal role of whole morphosyntactic constructions rather than simple lexical items in grammaticalization (cf. Traugott ).

. GRAMMATICALIZATION IN THE LEZGIC LANGUAGES (EAST CAUCASIAN)

..      The East Caucasian family shares with its West Caucasian sister such typological properties as predominantly verb-ﬁnal word order and ergative case alignment. At the same time, the morphological makeup of the East Caucasian languages is quite different, as they lack the polysynthetic complexes so typical of the West. Also, the distinction between major word classes is usually well-articulated in the East. Languages of the Lezgic branch of the East Caucasian family are well known for their extraordinarily rich case inventories with numerous locative forms, as well as elaborate verb paradigms. Apart from tense, aspect, mood, and evidentiality distinctions, the verb is marked for either class (gender) or person agreement in most languages. While class agreement is an archaic feature, already lost in a few languages (Agul, Lezgian, Udi), person agreement is an independent innovation in Tabasaran and Udi; thus, Tabasaran happens to combine both types of agreement. The two sentences () and () from Rutul illustrate the typical SOV word order, the ergative case alignment and the ergative pattern of class agreement, which is here marked preﬁxally and inﬁxally on verbs (the verbs agree with the absolutive noun phrase uχun ‘dress’, which belongs to one of the two non-human genders¹⁰). The absolutive case is unmarked, while other cases are derived by means of sufﬁxes. These examples also show the use of periphrastic forms (here, the perfect with the auxiliary

¹⁰ As class agreement systems vary across the languages of the Lezgic branch, we gloss the various class markers invariably as “CL”.


verb ‘be inside’ and the aorist with the morphologized copula), which will be discussed in section ... Rutul, Mukhad dialect (Maxmudova : –) () did-e wa-s uχun lü-w-šu-r Ɂa! father-ERG SG-DAT dress.ABS PRV-CL-take.PFV-CNV IN.be.PRS ‘Father has bought you a dress!’ () uχun zul~zul w-iši-r-i. CL-become.PFV-CNV-COP dress.ABS torn ‘The dress got torn.’ Apart from the scarce available records in Old Udi (Caucasian Albanian), which can be approximately dated to between the late th and the th century (Gippert et al. : I-), there are virtually no data on older stages of the Lezgic languages before the th century, when the ﬁrst grammatical sketches and texts were published. This is why the grammaticalization sources and the evolution scenarios of such old and prominent phenomena of these languages as locative cases or gender agreement markers are not clear, although attempts to discover their origins have been undertaken in works on comparative reconstruction (cf. esp. Alekseev ). For example, as far as the locative case forms are concerned, it seems plausible that the corresponding markers go back to a set of locative adverbs (or, perhaps, postpositions) which fused with nominals, ﬁnally becoming sufﬁxes. In a different syntactic construction, namely as verbal modiﬁers in the preverbal position, the same items ended up as verbal preﬁxes (preverbs).¹¹ In the modern languages, the historical afﬁnity of verbal preﬁxes and locative case forms can be seen in a still common (despite the semantic changes in preﬁxed verbs) congruence between the two sets of markers. Not only is their form similar or even identical, but verbs with a particular locative preﬁx typically go together with the dependent noun phrases with the same localization marker. In the following Tabasaran examples, kː- (a) and x- (b) are preﬁxes of locational verbs (with the root ‘be’), and one can easily see that the locative case sufﬁxes on nouns are cognate with them. 
Standard Tabasaran (Zagirov et al. : )
() a. har.i-kː kː-a
tree-SUB SUB-be.PRS
‘(s/he) is under the tree’
b. har.i-x x-a
tree-APUD APUD-be.PRS
‘(s/he) is near the tree’

¹¹ As a cross-linguistic parallel, cf. Indo-European languages, where etymologically cognate verbal prefixes and prepositions are traced back to adverbs (see e.g. Delbrück : –, –, or Pinault  for a more recent overview). In Kartvelian languages as well, many locative prefixes have correspondences among the postpositions and adverbs (Harris b).

Unlike in Circassian (see section ..), locative prefixes and locative case markers of Lezgic languages are too ancient to be traced to any particular lexemes. The


etymology of postpositions is usually more transparent, as relational nouns like ‘lower part, bottom’, ‘upper side’, ‘side, ﬂank’, or ‘inside(s)’ in various locative cases are the most typical sources (often such nouns are obsolete and do not occur outside of postpositional phrases). Body-part terms can be also identiﬁed among the sources of postpositions, e.g. q’iliw ‘near, to’ in Lezgian is based on q’il ‘head’, ulixde ‘before, in front’ in Rutul is the sub-essive case of ul ‘eye’ (lit. ‘under the eye’), and aq’ʷal ʲ ‘on, on the outside’ in Tsakhur is the super-essive case of aq’ʷa ‘face’ (lit. ‘on the face’). Among the non-locative cases, it is only some recent formations whose ultimate historical source can be discerned. Thus, the Agul comitative in -qaj clearly originated in a construction with a dependent clause headed by the converb qaj ‘having’ derived from a stative verb qaa. This verb, with the preﬁx q- of the POST (‘behind’) localization, has the locative meaning ‘be behind’, but it is also the main means of expressing predicative possession; the possessor noun phrase occurs in the postessive case in -q (). The comitative case wa-qaj ‘with you’ in () resulted from the coalescence of the noun in the post-essive with the converb: the source structure like *wa-q qaj ‘you having’ ended up as a regular case form introducing a secondary participant (Merdanova : –). Agul, Huppuq’ dialect (corpus data) () za-q jaq’u gada=ra qa-a, sa ruš=ra qa-a. SG-POST four son.ABS=ADD POST.be-PRS one daughter.ABS=ADD POST.be-PRS ‘I have four sons, and also a daughter.’ () gada quš-u-f-e wa-qaj, p-u-naa. son.ABS go_away-PFV-SBZ-COP SG-COM say-PFV-PRF ‘The son went away with you, they said.’ We will now focus on several examples of grammatical markers based on source constructions with verbal lexemes.

..      An average Lezgic tense and aspect system includes both synthetic and periphrastic (analytic) forms in a varying proportion. Synthetic forms are sufﬁxal and are basically derived from one of the aspectual stems, namely perfective vs imperfective. The two stems are usually morphologically distinguished by means of sufﬁxes or inﬁxes, but sometimes apophony, reduplication, or suppletion are also employed—as in the perfective/imperfective pairs in Tsakhur āqɨ/āqa ‘open’, hiχu/heχʷa ‘run away’, hiwo/hele ‘give’, uχo/uχoχa ‘give birth’ (Kibrik and Testelec : –). Periphrastic forms are composed of a non-ﬁnite component (e.g. participle, converb, inﬁnitive) and a postpositional auxiliary. The most common auxiliaries are the copula, the existential verb ‘be’ or ‘be inside’, and the regular verb ‘become, happen’. Copulas and existential verbs are morphologically deﬁcient and possess a very reduced paradigm; as auxiliaries, they mainly occur in one of the two synthetic tenses, the present or the past. To the contrary, the verb ‘become’ has a complete paradigm, and within periphrastic forms can potentially take any form, both


synthetic and periphrastic. (Not all the potentially possible periphrastic constructions are frequently used, however, or even attested in natural speech at all.) Example () from Tsakhur illustrates the synthetic aorist (syncretic with the perfective converb), the periphrastic perfect with the copula wod as an auxiliary (the final consonant of the copula is a class marker), the periphrastic pluperfect with the auxiliary ‘become’ in the synthetic aorist form ɨxa, and the ‘surcomposé’ past including the auxiliary ‘become’ in the periphrastic perfect ɨxa wod.¹²

Tsakhur, Mishlesh dialect (based on Kibrik and Testelec : –, )
() a. maˁhammadˠ-ē ɢulʲ āqɨ.
      Muhammad-ERG window.ABS CL.open.PFV
      ‘Muhammad opened the window.’
   b. maˁhammadˠ-ē ɢulʲ āqɨ wo-d.
      Muhammad-ERG window.ABS CL.open.PFV COP-CL
      ‘Muhammad has opened the window.’
   c. maˁhammadˠ-ē ɢulʲ āqɨ ɨxa.
      Muhammad-ERG window.ABS CL.open.PFV CL.become.PFV
      ‘Muhammad had opened the window {and now it is closed again}.’
   d. maˁhammadˠ-ē ɢulʲ āqɨ wo-d ɨxa.
      Muhammad-ERG window.ABS CL.open.PFV COP-CL CL.become.PFV
      (‘Muhammad had opened the window.’)

The morphosyntactic evolution of periphrastic forms, especially those with a phonologically light copula or a verb ‘be’, involves a gradual drift towards synthetic, morphologically bound forms with the (former) auxiliary becoming affixed to the main verb (see also section .. on the varying degree of independence of auxiliaries in Circassian, where a similar cline can be observed). For example, in Agul virtually all the core indicative tense and aspect forms are originally periphrastic (),¹³ but in the modern language they mostly appear as highly morphologized, with the fusion of the non-finite main verb and the auxiliary accompanied by sound changes typical of word-internal morpheme boundaries (e.g. frequent vowel drops, elision of glides and vowel coalescence in the present, the /d/>/tː/ devoicing in the negative future):

Agul, Huppuq’ dialect (based on Merdanova : )
() a. ruχ-u-ne < *ruχ-u-na e
      read-PFV-AOR < read-PFV-CNV COP
      ‘read’ (aorist)

¹² By ‘surcomposé’ forms I mean those periphrastic forms which are ‘double composed’, as the auxiliary is itself in a periphrastic form. The term ‘surcomposé’ stems from the Romance linguistic tradition, cf. the French ‘surcomposé’ past in Il a eu mangé, lit. ‘he has had eaten’ (Saussure and Sthioul : ). Note that, as argued in Kibrik and Testelets (: –), the copula wod in the Tsakhur surcomposé forms moves from the periphrastic auxiliary ɨxa wod to the main verb according to the general rule of copula placement on a focused element. The translation of (d) is provisional, as all the ‘surcomposé’ forms are very rare. ¹³ Only a partial paradigm is presented in (); for a more detailed treatment, see Merdanova (: –) and Majsak ().




   b. ruχ-u-na(j)a
      read-PFV-PRF
      ‘has read’ (perfect)

-lla), and the combination of the imperfective converb marker -r with the interrogative -ra (i.e. -r-ra) yields simply -ra or -r, as in (). On the original structure of the Agul veriﬁcative, see below.




source of the verificative affix, i.e. the original verb of ‘checking’. Undoubtedly, in Archi it goes back to the verb akːus ‘see’, which loses the first vowel and becomes affixed to the verb in the complement clause, e.g. boʟo-r-kːu ‘checked whether (they) will give’ < *boʟo-r akːu ‘saw whether (they) will give’. In Agul, the source is not immediately obvious, but most plausibly the verificative marker, which has the dialectal variants -čug-/-čuk’-, is the result of the fusion of the conditional in -či with various forms of the matrix verb agʷas ‘see’ (cognate to the Archi akːus), e.g. ruχunaj-čuk’ ‘check whether (he) has learnt’ < *ruχunaj-či agʷ ‘see whether (he) has learnt’.²⁶ The source verb akːus/agʷas in both languages refers to passive visual perception, its active counterpart (‘look’) being encoded by other lexical items. As in many other East Caucasian languages, ‘see’ in Agul and Archi belongs to the experiential class with the dative subject marking (cf. za-s agʷ-a-a [SG-DAT see-IMPV-PRS] ‘I see’ in Agul). Interestingly, in the verificative the subject (‘one who checks’), if present, is only encoded with the ergative case, i.e. as a canonical agent, not experiencer. This means that apart from the complete morphological fusion of the complement and the matrix predicate, the evolution of the verificative involved a semantic shift from passive visual perception (‘see’) to active ‘inquisitive’ meaning (‘check, find out’), with the concomitant shift of subject encoding from the pattern typical for experiencers (dative) to the one typical for agents (ergative).

Agul, Huppuq’ dialect (Danièl´ and Majsak : )
() sa zargar aj-čuk’-a-j-e mi.
   [one goldsmith.ABS IN.be.PRS]-VERIF-IMPV-CNV-COP this.ERG
   ‘He is checking whether there is a goldsmith (in the town).’

Archi (Danièl´ and Majsak : )
() tu-w-mu baˁk’ bu-ʟ’u-r-kːu-qi zari.
   [this-CL-ERG sheep.ABS CL-slaughter.AOR-Q]-VERIF-FUT SG.ERG
   ‘I’ll check whether he slaughtered a ram.’

The morphologization of the verificative yielded verbal forms which are exceptional in a number of ways (apart from being unusually polymorphemic). Forms like aj-čuk’-a-j-e ‘is checking whether (he) is there’ in () or buʟ’ur-kːuqi ‘will check whether (he) slaughtered’ in () not only refer to two situations, but morphologically have two independent positions for tense and aspect marking. For example, buʟ’ur-kːuqi contains the ‘external’ future form (as -qi is the future tense inflection), and the ‘internal’ aorist form (as the situation to be checked, namely ‘(he) slaughtered’, is expressed by the verb ‘slaughter’ in the aorist). The two parts of verificatives, one referring to the embedded question and another to the situation of checking, keep even more of their syntactic autonomy: each has its own set of arguments (unlike in causatives, in verificatives the argument encoding in the embedded part does not change, hence the two ergatives in ()), and can adjoin its own adverbials, as in ().

²⁶ Other dialectal variants of the veriﬁcative marker include -čuq’- with the ejective uvular and -magʷ-, whose initial consonant does not resemble the conditional afﬁx. Possibly, alternative source constructions (or some idiosyncratic sound changes) should be postulated in these cases.


Agul, Huppuq’ dialect (Danièl´ and Majsak : )
() zun jaʕa gadaji naq’ dars ruχ-u-naj-čuk’-a-s-e.
   SG.ERG today [boy.ERG yesterday lesson.ABS read-PFV-PRF]-VERIF-IMPV-INF-COP
   ‘I will check today whether the boy learnt his lesson yesterday.’

Thus, the morphologization of the verificative does not appear to be the result of clause union (as a stage on the path from looser to tighter structure): it was not preceded by syntactic fusion of the matrix predicate and its complement. On the contrary, the verificative turns out to be the mirror image of those complex predicates which comprise two (or more) morphologically autonomous verbs, at the same time being monoclausal on the syntactic level (periphrastic causative constructions with the verb ‘do’ in French and other Romance languages can be mentioned as a paradigm case of the latter).²⁷

It is quite mysterious why it was exactly the ‘verificational’ construction with the verb ‘see’ (not a common grammaticalization source in the languages of the world), which is not particularly frequent in discourse, that has undergone such a development.²⁸ Another puzzle that still remains to be solved is the occurrence of morphological verificative only in the two Lezgic languages which are not very close genetically or geographically (cf. also Danièl´ and Majsak ). Since the Proto-Lezgic status of verificative is highly dubious given the inter-language (or even interdialectal, in the case of Agul) variation in the source structure, it may turn out that this Agul-Archi peculiarity reflects some ancient areal connections, and not trivial ones.

CONCLUSION
In this chapter we have presented a number of case studies of grammaticalization phenomena in two subgroups of the two branches of the North Caucasian macrofamily. Despite the considerable differences in their morphological makeup, both the Circassian and the Lezgic languages share a trend of creating tense-aspect and modal markers from verbal sources. The grammaticalizing constructions display various degrees of integration, ranging from highly autonomous auxiliaries to those partly or totally fused with lexical verbs, up to the extent of becoming affixes. Some of the criteria showing the degree of autonomy are common to both families (e.g. word order permutations, the possibility of insertion of any material between the auxiliary and the lexical verb, or phonological erosion). Other criteria are language-specific, like the distribution of inflections between the lexical verb and the auxiliary, or the blocking of stem-final vowel alternations in West Caucasian.

²⁷ See also Maisak () for elaboration on the morphological vs syntactic fusion asymmetry of the veriﬁcative. ²⁸ The semantic shift from ‘see’ to ‘check, ﬁnd out’ is not unique (see Alm-Arvius : – for the discussion of the use of English see ‘as a near-synonym of ﬁnd out (about) or check’, or Ibarretxe-Antuñano () for similar Spanish and Basque examples), but we are not aware of any other cases where this shift would result in the auxiliation, let alone afﬁxation, of the source verb.




The set of lexical items that are most commonly employed as sources is rather restricted and includes copulas, existential verbs ‘be’ or ‘become, happen’, posture verbs (e.g. ‘stand’), or modal predicates (e.g. ‘want’). Polygrammaticalization, i.e. the coexistence of various grammaticalization paths involving one and the same lexical source item, is also characteristic of both branches, especially as regards constructions with the auxiliaries like ‘become’, ‘want’, ‘stand’, and ‘say’. On the other hand, there are grammaticalization paths, or rather families of grammaticalization paths which are amply represented in one branch but rare in the other. In particular, body-part nouns and motion and posture verbs are the obvious and the most important sources of locative markers in West Caucasian languages, whereas the equivalent source items are only rarely found in the East Caucasian family, where the origin of locative markers largely remains unclear. The development of the ‘veriﬁcative’ in some Lezgic languages involves a grammaticalization source uncommon for the Caucasus (the verb ‘see’) and is an example of a cross-linguistic, and not only family-internal, rarissimum. Interestingly, morphological veriﬁcatives amount to the creation of polysynthetic structures so typical of the Western branch: being the result of complete morphological fusion between a matrix verb and its complement, veriﬁcatives not only remain syntactically biclausal, but also include two positions for tense and aspect marking—a property unparalleled in other grammaticalized structures of East Caucasian languages. Finally, it is obviously the massive (and still ongoing) grammaticalization and morphologization of erstwhile analytic structures that has been responsible for the creation of the Circassian and more broadly West Caucasian polysynthetic morphosyntax, which makes it so distinct from the East Caucasian languages (see e.g. Chirikba to appear).

ACKNOWLEDGEMENTS
We are grateful to Yury Lander, two anonymous reviewers, and the editors of the volume for their comments on the draft of this chapter. All faults and shortcomings remain ours. Peter Arkadiev also acknowledges the financial support of the Russian Foundation for the Humanities, grants #-- and --.

8 Grammaticalization in Turkic
LARS JOHANSON AND ÉVA Á. CSATÓ

INTRODUCTION
This chapter represents the whole Northern Eurasian area, where Turkic languages are spoken in close contact with other Transeurasian languages, Mongolic and Tungusic. These three language families share significant grammaticalization strategies and typological characteristics with each other as well as with Koreanic and Japonic. First, the distribution, classification, and some basic typological features of Turkic languages will be briefly presented in comparison to other Transeurasian languages. The main focus will be on typically non-European grammaticalization processes that are representative of the whole family and recur throughout the known history of Turkic. A detailed account of different grammaticalization strategies of so-called converb forms will complement the treatment of similar processes in other Transeurasian languages that are otherwise less elaborated in this volume. This account will highlight grammaticalized categories of actional modification and viewpoint aspect typical of Turkic. Finally, some theoretically interesting issues such as the lack of formal marking resulting in systematic ambiguity will be addressed.

Notation: Turkish examples are given in the official orthography marked by chevrons (e.g. ‹düşün-e dur-›); Turkic data from languages other than Turkish are given in Turcological notation. Specific features include: ị, ụ̈ , ụ, ị̈ are near-high lax vowels; u̇, ȯ, ȧ are near-front vowels. ḳ and ġ are back stops. Curly brackets of the type { } are used for morphophonemic transcriptions. Brackets of the type ⟨ ⟩ are used for glosses.

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Lars Johanson and Éva Á. Csató . First published  by Oxford University Press




..  -  The Turkic language family is one of the largest in the world in terms of territorial extension. It covers a huge geographical area extending over all Eurasia. The Turkicspeaking world extends from the southwest, from Turkey and its neighbours, to the southeast, to Central Asia and further into China. From here it stretches to the northeast, via southern and northern Siberia up to the Arctic Ocean, and ﬁnally to the northwest, across western Siberia and eastern Europe to northwestern Europe. Regions in which Turkic is spoken include Anatolia, the Balkans, Azerbaijan, Iran, Iraq, Afghanistan, Central Asia, the immense areas formerly called West and East Turkistan, China, mainly Xinjiang, the Russian Federation, i.e. southern, northern, and western Siberia, the Volga region, the southern Russian steppes, the Caucasus area, and, in the last few decades, northwestern Europe, particularly Germany. The Turkic-speaking world formerly included compact areas in the PontoCaspian steppes, Crimea, the Balkans, etc. It currently counts a dozen standard languages, such as Turkish, Uzbek, Azeri, Kazakh, Turkmen, Kirghiz, Tatar, Bashkir, Chuvash, Uyghur, Tuvan, and Yakut.

..  The Turkic languages can be classiﬁed genealogically, with respect to their relationships to one another in the family tree. The dynamic history of the Turkicspeaking groupings makes it difﬁcult to set up a classiﬁcation that combines geographical and genealogical criteria in a consistent way. The following rough geographical division into branches and lower-level groupings mirrors basic genealogical afﬁliations, and to some extent also typological features. The modern languages belong to: • the Southwestern (SW), or Oghuz, branch, encompassing Turkish, Gagauz, Azeri, Turkmen, Khorasan Turkic, etc.; • the Northwestern (NW), or Kipchak, branch, encompassing Karachay-Balkar, Kumyk, Karaim, Tatar, Bashkir, Noghay, Kazakh, Karakalpak, Kipchak Uzbek, Kirghiz, South Altay, etc.; • the Southeastern (SE), or Karluk, branch, encompassing Uzbek, Uyghur, etc.; • the Northeastern (NE), or Siberian, branch, encompassing Tuvan, Tofan, Soyot, Dukhan, Tuhan, Khakas, Shor, North Altay, Chulym, Yakut, Dolgan, etc.; • the Oghur branch, represented by Chuvash; • the Arghu branch, represented by Khalaj.

..    It is still disputed whether the Turkic family belongs to a higher line of descent, a phylum derived from a common ‘Altaic’ or ‘Transeurasian’ ancestor. The latter term has recently, according to a proposal presented by Martine Robbeets and Lars




Johanson, come to be used in the ‘macro-Altaic’ sense for a large group of geographically adjacent languages that share signiﬁcant properties: Turkic, Mongolic, Tungusic, Koreanic, and Japonic. They form a vast linguistic continuum that extends from the Paciﬁc in the east to the Mediterranean and the Baltic in the west. Unlike ‘Altaic’, the term does not necessarily imply genealogical relatedness and avoids the historically incorrect reference to the Altai mountains as the potential homeland of the languages in question. If a Transeurasian ancestor did exist, it may have emerged around  BCE in a compact area in southern Manchuria (cf. Janhunen : ).

..   The Turkic languages can also be classiﬁed typologically, according to similarities and differences, which will be essential for the argumentation in the present chapter. The languages show substantial common features. They belong to a distinct linguistic type represented by a transcontinental belt of areally adjacent Transeurasian and Uralic languages, which share a number of basic structural traits and close similarities in phonology, bound morphology, and syntax. The syntactic features are strikingly similar to those of the other Transeurasian languages. With respect to relational typology, based on the expression of grammatical relations, Turkic adheres to the nominative–accusative pattern, rather than the absolutive–ergative pattern. Thus, in main clauses, the subject stands in the nominative case, and the speciﬁc direct object in the accusative case. According to another parameter, constituent order, the syntax is head-marking, with markers placed on the head of the syntactic phrases, indicating the relations between constituents. Turkic has a left-branching syntax with modiﬁers and dependents preceding their heads, according to the so-called ‘rectum-regens principle’. In noun phrases, determiners such as demonstratives, attributive adjectives, and relative clauses precede the head. In verb phrases, the dependent arguments such as subject and objects precede the verbal head. The unmarked order of clause constituents is subject + object + predicate. Deviations, e.g. cases of ‘scrambling’, occur for discourse-pragmatic and stylistic reasons. The constituent order is less variable in non-main clauses. There are few instances of grammatical agreement. Agreement in number or case between dependents and heads does not occur. Agreement markers are used to indicate person and number of the subject, though third-person singular forms of verbs are often unmarked. Redundant use of other devices is largely avoided. 
The unmarked singular is thus used after quantiﬁers, e.g. numerals. Omission of overt constituents such as subject and object is permitted if the referents are pragmatically recoverable, for instance inferable from the discourse context, often referred to as ‘pro-drop’ or ‘null anaphora’. Person-number sufﬁxes largely make subject pronouns redundant. Main clauses are typically headed by ﬁnite verb forms, which contain grammatical information in terms of person, number, viewpoint aspect, mood, and tense. Non-main




clauses are headed by action nominals, participant nominals, and converbs provided with bound subjunctors that largely fulﬁl the functions of conjunctions in languages of the English type. There are few indigenous free junctors, but foreign conjunctions are sometimes copied. The overall Transeurasian syntactic parallels are immediately striking. Some authors have therefore attempted to reconstruct a common syntactic archetype. Many shared features may, however, be attributable to general typological principles. They may belong to the elements that spread rather easily across languages, and thus do not provide any conclusive evidence for genealogical kinship. The word structure, both of the noun and the verb, is agglutinative. Turkic lacks different declension and conjugation classes, irregular verbs, suppletive forms, etc. Another characteristic is a high degree of synthesis, a parameter based on the number of morphemes per word. Turkic word classes include nouns, verbs, adjectives, adverbs, pronouns, determiners, numerals, postpositions, interjections, particles, and copulas. Postpositions of various kinds correspond to English prepositions. Grammatical gender is lacking, and there are no traces of Proto-Turkic gender distinctions. Deﬁnite articles do not occur; the indeﬁnite articles are formally identical to the numeral ‘one’. Turkic languages display numerous bound morphemes serving word formation and grammatical marking, non-clitic and clitic sufﬁxes, as well as bound particles. Preﬁxes do not normally occur, the few exceptions being copied from contact languages. There is a wide variety of simple and complex aspect-mood-tense forms. 
The verbal morphology comprises numerous categories expressing grammatical notions of actionality (Aktionsart), voice/diathesis (passive, middle-reﬂexive, causative, cooperative-reciprocal), deontic modality (possibility, impossibility, necessity), epistemic modality, evidentiality, negation, viewpoint aspect (intraterminal, postterminal), mood (imperative, voluntative, optative, hypothetical, etc.), tense (past), interrogation, subject person–number agreement. Actional modiﬁcations are expressed by postverbial constructions with postposed auxiliary verbs, a phenomenon that will be examined in section ...

REPRESENTATIVE GRAMMATICALIZATION PROCESSES
In the following, some processes of grammaticalization representative of Turkic will be discussed. While it is difficult to quantify what is representative, we focus on examples that are
• recurring, in the sense that several morphemes or constructions have undergone the same kind of grammaticalization in various periods,
• not commonly found in the more well-known European languages, and
• recorded in historical documents and not merely a matter of historical reconstruction.




GRAMMATICALIZATION OF CONVERBS
Converbs are non-finite verb forms typically functioning as predicates in non-main clauses having a modifying or non-modifying function in the matrix clause (cf. Johanson ). The following example illustrates the use of converbs in Turkish.

Ali Trabzon’-a var-ınca hiç bekle-meden deniz-e gir-elim.
Ali Trabzon-DATIVE arrive-CONVERB no wait-CONVERB sea-DAT enter-VOLUNTATIVEPLURAL
‘When Ali arrives at Trabzon let us go (to swim) in the sea without waiting.’

Converbs are targets of different grammaticalization processes creating postpositions or compound lexical items. They also participate in the grammaticalization of so-called postverbial constructions, which typically function as actionality modifiers. These can be further grammaticalized as viewpoint aspect markers.

The simplest Turkic converb markers are of two types. Type ⟨A⟩ ends in a vowel, e.g. Turkish {-(y)A} in ‹gel-e› ‘coming’, from ‹gel-› ‘to come’. Type ⟨B⟩ ends in a labial stop, e.g. Turkish {-(y)Ịp} in ‹gel-ip›. In Khakas, type ⟨B⟩ is dropped after consonant stems, e.g. kil, from kil- ‘to come’. A suffixless converb is also preserved in Khalaj. Yakut {-An} < *{-ỊbAn} corresponds to type ⟨B⟩. Chuvash {-SA} corresponds functionally to ⟨B⟩.

Type ⟨A⟩ originally possessed an intraterminal aspectual value, the envisagement of an event within its limits, which is largely retained in modern constructions. Type ⟨B⟩ has partly maintained its original post-terminal aspectual value, the envisagement of an event at a point where its relevant initial or final limit is transcended. In many cases, however, the opposition ⟨A⟩ vs ⟨B⟩ has been neutralized, leading to relatively vague functions.
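The morphophonemic formulas {-(y)A} and {-(y)Ịp} can be read as small rules: the bracketed glide appears only after vowel-final stems, and the capital vowel symbol is filled in by vowel harmony. The following Python sketch illustrates this reading for Turkish orthographic stems. It is a deliberately simplified illustration, not an analysis proposed in this chapter: the function names are hypothetical, stems are assumed to be lowercase native Turkish stems, and consonant alternations and exceptional stems are ignored.

```python
# Simplified sketch of how the Turkish converb formulas {-(y)A} and
# {-(y)Ịp} can be spelled out. Assumptions (not from the source):
# lowercase native stems, regular vowel harmony, no consonant changes.
BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")
VOWELS = BACK_VOWELS | FRONT_VOWELS

# Fourfold harmony for the high vowel of {-(y)Ịp}.
HIGH_VOWEL = {
    "a": "ı", "ı": "ı",
    "e": "i", "i": "i",
    "o": "u", "u": "u",
    "ö": "ü", "ü": "ü",
}


def last_vowel(stem):
    """Return the last vowel of the stem, which governs harmony."""
    for ch in reversed(stem):
        if ch in VOWELS:
            return ch
    raise ValueError(f"no vowel in stem: {stem!r}")


def glide(stem):
    """The bracketed (y) surfaces only after vowel-final stems."""
    return "y" if stem[-1] in VOWELS else ""


def converb_a(stem):
    """Vowel-final converb type, {-(y)A}: twofold harmony (a / e)."""
    low = "a" if last_vowel(stem) in BACK_VOWELS else "e"
    return stem + glide(stem) + low


def converb_b(stem):
    """Converb type ending in a labial stop, {-(y)Ịp}: fourfold harmony."""
    return stem + glide(stem) + HIGH_VOWEL[last_vowel(stem)] + "p"
```

Under these assumptions, `converb_a("gel")` gives `gele` and `converb_b("gel")` gives `gelip`, matching the forms ‹gel-e› and ‹gel-ip› cited above.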

..      Simple Turkic converbs have often developed into postpositions (Johanson , ). Forms based on verbs meaning ‘to see’ are the sources of postpositions meaning ‘in view of ’, ‘with respect to’, ‘according to’, ‘on account of ’, ‘considering’, ‘because of ’, ‘to judge from’, e.g. Chaghatay kör-ä, Turkish gör-e, Kumyk gör-e, KarachayBalkar kör-e, Tatar, Bashkir kür-ä, Uzbek kȯr-ȧ, Chuvash kur-a. See the following Turkish example. Ceza bana göre haksız. punishment I.DATIVE according.to unfair ‘According to me the punishment is unfair.’ Forms based on verbs meaning ‘to look’ have become markers meaning ‘in the direction of ’, ‘towards’, e.g. Turkmen baḳ-a, Bashkir ḳara-y, Noghay, Uyghur ḳara-p,




Uzbek ḳaṙ ȧ-b. The negative forms mean ‘although’, ‘in spite of ’, e.g. Uzbek ḳaṙ ȧ-mȧ-y, Chuvash pị̈χ-ma-sị̈raχ, in the ofﬁcial orthography ‹пӑхмасӑрaх›. Converbs of verbs for ‘to reach’ have given rise to markers of ‘until’, ‘up to’, ‘as far as’, e.g. Turkish ‹değin›, Kazakh dey-ịn (cf. Heine and Kuteva : ). Converbs of verbs for ‘to count’ have produced expressions meaning ‘every’ or ‘throughout’, e.g. East Old Turkic saː-yụ, Uzbek yil sayịn ‘every year’, Chuvash śulsärän ‘every year’, Noghay yïl sayị̈n ‘the whole year’. Converbs of verbs meaning ‘to put (down)’, e.g. ḳoːδ-, have developed into markers meaning ‘down(wards)’, ‘under’, ‘below’, e.g. East Old Turkic ḳoː-δị̈ ‘downwards’, Chaghatay ḳoy-ị̈, Tuvan ḳud-ụ. Converbs of the verb biːr-lä- ‘to unite’, derived from the numeral biːr ‘one’, have yielded comitative and instrumental postpositions of the type bir-lä, e.g. Turkmen bile(n), Uzbek bilȧn, Tuvan bilä, Kirghiz men(en), Khakas mïnan. The Chuvash comitative case marker {-pA(lA)} has the same origin. Converbs of verbs meaning boy-la- ‘to extend’, derived from boːδ ‘stature’, have developed into postpositions meaning ‘comprising the distance or period (of )’, ‘along’ (spatial and temporal), ‘throughout’, e.g. Uzbek boy-lȧ-b. Converbs of verbs of the types bašla- ‘to begin’ and al- ‘to take’ have developed into abtemporal markers, e.g. Noghay basla-p, al-ị̈p, Uzbek båšlȧ-b, ål-ịp ‘from . . . on’, ‘since’.

..     Turkic languages have developed systems of postverbial constructions, consisting of a lexical verb in converbal form followed by an auxiliary verb of a restricted class (Johanson a: –). In the following Turkish example, dolaş-ıp dur-du is a postverbial construction. Geceyarısına kadar şehirde dolaş-ıp midnight-DATIVE until.POSTPOSITION city-LOCATIVE stroll-B.CONVERB dur-du. stand.AUXILIARY-PAST ‘He kept on strolling around in the city until midnight.’ Postverbial constructions are actional operators, serving the expression of actionality (Aktionsart), operating on various types of verbs and actional phrases and modifying their intrinsic actional values, describing the manner of action in qualitative and quantitative terms more accurately than is possible with single lexical verbs. They may convey highly differentiated shades of meaning. The functions of postverbial constructions largely correspond to those of IndoEuropean preverbal constructions, where a preverb (préverbe, Präverbium, preverbio) modiﬁes a following verbal lexeme and ultimately undergoes univerbation with it, e.g. German ausgehen ‘to go out’ (gehen ‘to go’, aus ‘out’). Turkic uses postposed equivalents.




The constructions are based on limited sets of auxiliary verbs with strongly generalized grammatical meanings. The lexical source verbs are motion verbs meaning ‘to go’, ‘to come’, ‘to go away’, ‘to go out’, ‘to proceed’, postural verbs meaning ‘to stand’, ‘to remain’, ‘to sit’, ‘to lie’, phasal verbs such as ‘to begin’, ‘to finish’, and verbs denoting various activities such as ‘to do’, ‘to put’, ‘to place’, ‘to hold’, ‘to throw’, ‘to send’, ‘to give’, ‘to take’. The grammatical functions of the constructions are mostly not predictable from the lexical meanings of the source verbs. The individual languages often employ different auxiliaries to express one and the same actional notion, and one and the same lexical verb has been grammaticalized to convey different actional meanings. Thus, constructions with ber- ‘to give’ are transformativizers (see ...) and can also express continued action, sudden, unintentional action, objective version, etc. Nonetheless, many similarities are found across the various systems. A number of constructions show identical patterns in the Northwestern, Northeastern, and Southeastern branches.

The auxiliaries are mostly free forms, not yet incorporated into their preceding lexical hosts. However, certain constructions represent various degrees of agglutination. In the course of the grammaticalization processes, their shapes have sometimes been significantly reduced. In a few cases, they are already suffixes, mostly subject to sound harmony, e.g. Turkish ‹yaz-a-dur-› ⟨write-A.CONVERB stand.AUXILIARY⟩ ‘to continue writing’. The converb suffix has sometimes fused with the auxiliary verb into one grammatical suffix. Thus, ⟨B⟩ al-, with al- ‘to take’, has developed into Uyghur {-(Ị)w-al-}, e.g. yez-ịw-al- < *yaz-ị̈b al- (lit. ‘to take writing’). Tuvan and Tofan {-(Ị)β-Ịt-} has developed from ⟨B⟩ ït-, e.g. biž-ịβ-ịt- < *biž-ịb ït- (‘to send writing’) ‘to write down’. Khakas, Shor, and Chulym Turkic show the similar form {-(Ị)b-ỊS-} < ⟨B⟩ ïs-, e.g. Khakas kiz-ịb ịs- < *kis-ịb ïs- ‘to cut off’. In some cases, the converb marker and the auxiliary verb have fused into a single, synchronically unanalysable lexeme, e.g. Khakas äkːäl- ‘to bring’ < al-ị̈p käl- (al- ‘to take’ + käl- ‘to come’). This is an instance of lexicalization of a postverbial construction. The converb suffix is sometimes dropped so that the auxiliary verb is attached directly to the verb stem. In Khakas, the ⟨B⟩ marker is maintained after vocalic stems, but mostly reduced to zero after consonantal stems, e.g. pas sal- ‘to write down’ < *bas-ị̈p sal-.

As to the areal distribution, postverbial constructions may be said to be highly developed in most Turkic languages. The languages of the Northeastern branch make extensive use of them; the most developed systems are found in South Siberia. The Southwestern branch exhibits relatively poor systems. Varieties spoken in Iran and Azerbaijan employ restricted sets of constructions, often only lexicalized items. Standard Turkish is surprisingly poor in postverbial constructions, but some Anatolian dialects possess rich systems typical of vernacular narrative styles. The use is now partly register-dependent, with considerable differences between standard and non-standard dialects. Uyghur non-standard dialects differ markedly from the standard language with regard to inventories, phonetic shapes, and functions of postverbial constructions (Yakup : ). Karaim and Gagauz are often claimed




to lack postverbial constructions altogether—a claim which is certainly exaggerated. Both languages have some residues of postverbial constructions. The modern analytic postverbial constructions replace the old synthetic actional markers found in Transeurasian languages; they are still relatively well preserved in Tungusic but have largely vanished in others. The old markers may well go back to analytic constructions themselves; a possible linking segment such as a converb marker may have dropped.

Phase specification
Postverbial constructions are typically used for phase specification, highlighting an inherent phase of the meaning of the actional phrase, specifying it qualitatively or quantitatively. The basic classificatory criterion is transformativity. An actional content is transformative if it has a natural evolutional turning point, a crucial initial (initiotransformative) or final limit (finitransformative). A non-transformative actional content does not imply any such limit. Postverbial constructions can recategorize the internal phase structure, specifying one inherent phase of the action (Johanson : –). Connotations of durativity, continuity, intensity, suddenness, involuntary or unintentional action, etc., are often effects of these functions.

Transformativizing constructions
Transformativizing postverbial constructions operate on actionally ambiguous actional phrases, highlighting a dynamic (initial or final) phase of the action. They thus block non-transformative readings, e.g. ‘to write down’ rather than ‘to write’, ‘to sit down’ rather than ‘to sit’, ‘to catch sight’ rather than ‘to look’, ‘to catch fire’ or ‘to burn down’ rather than ‘to burn’. Connotations of thoroughness, resoluteness, unexpectedness, suddenness, or quickness are frequent side effects. Highlighting the sudden transgression of the initial limit of the action yields ingressive readings. The auxiliary verbs employed in these constructions are developed from finitransformative source verbs with lexical meanings describing telic actions, e.g. ‘to send’, ‘to put’, ‘to throw’, ‘to go away’, ‘to come’, ‘to go out’, ‘to fall’. They are typically combined with ⟨B⟩ converbs.

The finitransformative verb ïːδ- ‘to send’, ‘to release’ has been used in constructions of the type ⟨B⟩ ïːδ-, e.g. Old Uyghur biti-p ïːδ- ⟨write-B.CONVERB send.AUXILIARY⟩ ‘to write down’. Abakan and Sayan Turkic display constructions with the auxiliaries ïs- and ïd- in bound forms such as {-(Ị)B-Ịs-} and {-(Ị)B-(Ị)t-}, e.g. Khakas čị-bịs- ‘to eat up’, Shor tur-ị̈bị̈s- ‘to stand up’, Tuvan bižị-βịt- ‘to write down’, udï-βị̈t- ‘to go to sleep’. The auxiliary verb has various phonetic shapes such as Kirghiz, Altay iy-, Yakut ïːt-, Chuvash yar-, Khalaj hiː-, e.g. Kirghiz sög-üp iy- ‘to curse’, Chuvash pär-Zä yar- ‘to shoot suddenly’, Khalaj käl-iː- ‘to come suddenly’ < *käl-i hiː-. This simple auxiliary verb has mostly been replaced by ïːδ-u ber- ‘to send away, to release’, consisting of ïːδ- ⟨A⟩ ber- ‘to give’, e.g. Turkmen yiber-, Bashkir yịbär-, Tatar ǰịbär-, Kazakh žiber-, Uzbek yu̇bår-. Altay exhibits ⟨B⟩ iy-ä bär-, Khakas and Shor ⟨B⟩ ïz-a bär-. The constructions often express the sudden beginning of an action, e.g. Tatar yị̈ɣla-p ǰịbär- ⟨cry-B.CONVERB send.AUXILIARY⟩, Uzbek yiɣlȧ-b yu̇bår- ‘to burst out



Lars Johanson and Éva Á. Csató

crying’, Tatar sayra-p ǰịbär- ‘to start singing’, uyna-p ǰịbär- ‘to begin to play’, Bashkir ḳïṣ ḳïṛ -ị̈p yịbär-, Tatar ḳïč̣ ḳïṛ -ị̈p ǰịbär-, Uzbek ḳičḳir-ịb yubår- ‘to scream suddenly’, Noghay kül-ịp yibär-, Kazakh kül-ịp žiber-, Uzbek ku̇l-ịb yu̇bår- ‘to burst out laughing’. Kumyk oχu-p yiber- ‘to start reading’, Noghay kül-ịp yiber- ‘to burst out laughing’. Some constructions in which the auxiliary obviously goes back to *ï:δ-u ber(Johanson : ) express sudden, swift, quick, rapid, spontaneous, easy, effortless, casual action, often with overtones of unintentional or involuntary performance. Turkish {-(y)Ị-ver-} occurs in formations such as ‹al-ı-ver-› ‘to buy quickly’, ‹at-ı-ver-› ‘to throw easily’, ‹gül-i-ver-› ‘to burst out laughing’, ‹oku-yu-ver-› ‘to read fast, easily’, ‹öl-ü-ver-› ‘to die suddenly’, ‹yaz-ı-ver-› ‘to write quickly, easily’, ‹sal-ı-ver-› ‘to release without warning’. Other examples: Gagauz gir-ị-ver- ‘to enter quickly, without notice’, baḳ-ị̈-ver ‘to look suddenly’, ḳoy-ụ-ver- ‘to let loose’, Turkmen yöräː-ber- ‘to run suddenly, easily’, bašlaː-ber- ‘to start suddenly’, iy-ber-‘to eat quickly’. This type is traditionally mistaken to represent hAi bär- ‘to give’ (e.g. Kornﬁlt : ), though it is difﬁcult to explain the semantics and the sufﬁx variants containing high vowels. The verb sal- ‘to move (transitive)’, ‘to place’, ‘to lay down’ is used in constructions hBi sal-, denoting fast and unexpected action, e.g. Tatar äyt-ịp sal- hsay-B.CONVERB move.AUXILIARYi ‘to say suddenly, thoughtlessly’, Kazakh at-ị̈p sal- ‘to shoot suddenly’, Uzbek åč-ịb sål- ‘to disclose’, Khakas pas sal- ‘to write down’, χas sal- ‘to dig up’, čị-p sal- ‘to eat up’. The verb ḳoy- ‘to put’ is represented by Turkmen ġoy-, Tatar ḳuy-, Uzbek ḳoy-, etc. in constructions expressing suddenness and quickness, e.g. 
Turkmen al-ị̈p ġoy- htakeB.CONVERB put.AUXILIARYi ‘to suddenly take’, Karachay-Balkar ayt-ị̈b ḳoy- ‘to blurt out’, Tatar yaz-ị̈p ḳuy-, yịšịr-ịp ḳuy- ‘to hide (transitive)’, ḳurḳị̈t-ị̈p ḳuy- ‘to frighten’, bušat-ị̈p ḳuy- ‘to empty’, bül-ịp ḳuy- ‘to divide’, Uzbek yåz-ịb ḳoy- ‘to write down’, išlȧ-b ḳoy- ‘to carry out’. Chuvash uses χur- ‘to place, to put’ combined with the converb in {-SA}. Verbs meaning ‘to throw (away)’ are used in constructions of the type hBi tasta-, which may express fast, energetic, resolute action, but also casual, careless performance without special concern or attention, e.g. Tatar ịšlä-p tašla- hdo-B.CONVERB throw.AUXILIARYi ‘to get done (efﬁciently)’, ịč-ịp tašla- ‘to drink fast’, bịtịr-ịp tašla‘to ﬁnish fast’, ḳị̈r-ị̈p tašla- ‘to break (at once, ﬁercely)’, Kazakh kes-ịp tasta- ‘to cut off ’, Uzbek åč-ịb tȧšlȧ- ‘to open up’, käs-ịb tȧšlȧ- ‘to cut up’, yåz-ịp tȧšlȧ- ‘to write rapidly’, ‘to dash off ’, Khakas pas tasta- ‘to write down’. The Yakut equivalent is käbis-; cf. East Old Turkic kämiš- ‘to throw away, to abandon’. Verbs meaning ‘to go (away)’ are used in similar constructions, mostly derived from intransitive verbs, sometimes with directional connotations, e.g. Old Uyghur ölụ̈p bar- hdie-B.CONVERB go.AUXILIARYi ‘to die, to pass away’, örtän-ịp bar- ‘to burn down (intransitive)’, Turkmen sars-ị̈p git- ‘to be startled’, yarïl-ị̈p git- ‘to break (in two) (intransitive)’, yïḳïl-ị̈p git- ‘to fall down’, Tatar awị̈r-ị̈p kit- ‘to begin to hurt’, ḳị̈zị̈p kit- ‘to get hot/excited’, tuŋ-ị̈p kit- ‘to freeze’, yörị-p kit- ‘to get moving’, bul-ị̈p kit‘to become’, Kazakh er-ịp ket- ‘to melt away’, öl-ịp ket- ‘to die, to pass away’, žan-ị̈p ket- ‘to burn down (intransitive)’, Kirghiz ḳïzar-ị̈p kät- ‘to turn red’, Uzbek isi-b ket‘to get warm’, Uyghur häri-p kät- ‘to become exhausted’, ḳiz-ịp kät- ‘to get hot’, čiḳ-ịp kät- ‘to get out’, Khakas čịt par- ‘to get lost’, usχun par- ‘to wake up’, öl par- ‘to die’,

Grammaticalization in Turkic



paz-ị̈l par- ‘to be written down’, sï-n par- ‘to get broken’. The corresponding Chuvash construction is {-SA} ḳay-, e.g. kil-Zä ḳay- ‘to arrive’. Examples of constructions based on verbs meaning ‘to take’: Tatar aŋla-p al- ⟨understand-B.CONVERB take.AUXILIARY⟩ ‘to realize’, ‘to grasp’ rather than ‘to understand’, ịč-ịp al- ‘to drink up’ rather than ‘to drink’, kür-ịp al- ‘to catch sight’ rather than ‘to see’, ḳara-p al- ‘to take a look’ rather than ‘to look’, ḳị̈z-ị̈p al- ‘to get angry’ rather than ‘to be angry’, kụ̈t-ịp al- ‘to await’ rather than ‘to wait’. Constructions with verbs meaning ‘to reach’ include Tatar aŋla-p ǰit- ⟨understand-B.CONVERB reach.AUXILIARY⟩ ‘to grasp’, bar-ị̈p ǰit- ‘to get there’, ḳat-ị̈p ǰit- ‘to get hard’, ḳayt-ị̈p ǰit- ‘to reach home’, kil-ịp ǰit- ‘to arrive’, pịš-ịp ǰit- ‘to become well done’, ‘to ripen’. Chuvash uses śit- ‘to reach’, combined with the converb in {-SA}. Chuvash constructions with ük- ‘to fall’ have similar uses, e.g. χïra-Za ük- ‘to be frightened’ rather than ‘to be afraid’, čirlä-Zä ük- ‘to fall ill’ rather than ‘to be ill’. Turkic ḳal- is an initiotransformative verb meaning ‘to get into a state’ + ‘to remain in the state’. Accordingly, the construction ⟨B⟩ ḳal- highlights the initial dynamic phase and also includes the following post-transformative phase. Definitions commonly found in grammars, e.g. ‘transition of an action into a state’, as suggested for Azeri ⟨B⟩ ġal-, or ‘completion’, as suggested for Turkmen ⟨B⟩ ġal-, correspond only to the first phase. The translation ‘to remain’ of Uzbek ⟨B⟩ ḳål- covers only the second phase. Examples: Old Uyghur ärt-ịp ḳal- ⟨be-B.CONVERB get.into.a.state.and.remain.AUXILIARY⟩, Turkmen bol-ụp ġaːl- ‘to become’ + ‘remain’, Karachay-Balkar kel-ịb ḳal- ‘to arrive’ + ‘stay’, ket-ịb ḳal- ‘to leave’ + ‘to be gone’, arï-b ḳal- ‘to become tired’ + ‘to remain tired’, Tatar yoḳla-p ḳal- ‘to fall asleep’ + ‘to sleep’, ‘to oversleep’, utïr-ị̈p ḳal- ‘to take one’s seat and thus be seated’, al-ị̈p ḳal- ‘to take’ + ‘to keep’, kür-ịp ḳal- ‘to catch sight and thus see’, ḳat-ị̈p ḳal- ‘to stiffen and thus be stiff’, kil-ịp ḳal- ‘to come and thus be present’, kit-ịp ḳal- ‘to leave and thus be absent’, Kazakh kel-ịp ḳal-, kep ḳal- ‘to come’ + ‘stay’, Uzbek sin-ịb ḳål- ‘to get broken and thus be broken’, kȯr-ịn-ịb ḳål- ‘to become visible and thus be visible’, Uyghur čiḳ-ịp ḳal- ‘to go out and thus be outside’, yet-ịp ḳal- ‘to lie down + remain lying’, körụ̈n-ụ̈p ḳal- ‘to become visible and thus be visible’, uχla-p ḳal- ‘to fall asleep and thus sleep’, Khakas kör χal- ‘to catch sight and thus see’, Dukhan dur-ụp ġal- ‘to stop and thus stand’, öl-ụ̈p ġal- ‘to die and thus be dead’, Altay uyuḳta-p ḳal-, Yakut utuy-an χaːl- ‘to fall asleep and thus sleep’, öl-ön χaːl- ‘to die and thus be dead’, süːr-än χaːl- ‘to run away (suddenly, resolutely) and thus be absent’. Sayan Turkic languages such as Tuvan, Tofan, and Dukhan exhibit corresponding transitive constructions ⟨B⟩ ḳaɣ- ḳaɣ- ‘to leave’, which express transformativity with a remaining result, e.g. as-ị̈p ḳaɣ- ‘to hang up something so that it remains hanging’, baɣla-p ḳaɣ- ⟨tie-B.CONVERB leave.AUXILIARY⟩ ‘to tie something firmly so that it remains tied’, biži-p ḳaɣ- ‘to write and thus have written’.

Non-transformativizing constructions

Certain constructions are non-transformativizing, highlighting the non-dynamic phase of an action. They operate on transformative and actionally ambiguous actional phrases, turning them into non-transformatives. They block dynamic, limit-oriented readings and specify a non-dynamic phase, e.g. ‘to be ill’ rather than




‘to fall ill’, ‘to look’ rather than ‘to catch sight’, ‘to eat’ rather than ‘to eat up’, or ‘to read’ rather than ‘to read and finish reading’ (Johanson : –). They are based on initiotransformative postural source verbs meaning ‘to stand up/stand’, ‘to sit down/sit’, ‘to lie down/lie’, and locomotion verbs meaning ‘to proceed’, ‘to move’, ‘to run’. Compare the developments of ‘to stand’, ‘to stay’, ‘to remain’, ‘to sit’, ‘to lie’, ‘to move’ as discussed by Heine and Kuteva (: –, –, –, –, –, –). Constructions with auxiliaries going back to postural verbs may give additional information about the physical position in which a given action is performed.

⟨B⟩ constructions

In the case of ⟨B⟩ constructions, the semantic interpretation depends on the actional value of the lexical verb. In their quantitatively simplest readings, finitransformative actional phrases do not occur in these constructions, unless they get propinquitive meanings, e.g. öl-üp dur- ⟨die-B.CONVERB stand.AUXILIARY⟩ ‘to be almost dying’. They sometimes get episodic, temporally limited readings, e.g. Khakas kir čör- ‘to drop by for a while’, Kyzyl par käl šör- ‘to go and come back for a while’. With initiotransformatives, the transformative first phase is blocked, whereas the post-transformative second phase is highlighted, e.g. Sayan Turkic udu-p ǰïʔt- ⟨sleep-B.CONVERB reach.AUXILIARY⟩ ‘to have fallen asleep’ rather than ‘to fall asleep’. The meaning is ‘to dwell in a state where the crucial initial limit has been transcended’. With non-transformatives and initiotransformatives, ⟨B⟩ constructions can acquire durative connotations, e.g. Uyghur oltur-up tur- ⟨sit-B.CONVERB stand.AUXILIARY⟩ ‘to sit for a while’, Khakas χal-ïp odïr- ‘to remain for a long time’. Actional phrases of all types are combinable with ⟨B⟩ constructions when interpreted serially, i.e.
‘to act repeatedly, continuously, frequently, several times, on several occasions, regularly, usually, habitually’, e.g. Kumyk oχu-p tur- ‘to go on reading’, hread-B.CONVERB stand.AUXILIARYi ‘to read repeatedly, habitually’, Karachay tig-ịb tur- ‘to sew repeatedly, habitually’, ïrla-p čör- ‘to sing repeatedlyʼ. Examples of serially interpreted ﬁnitransformatives: Karachay-Balkar yiber-ịp tur- ‘to send repeatedly’, Khakas kil tur- ‘to come repeatedly’, pas sal tur- ‘to write some times’, surï-p al tur- ‘to ask each time’, Uyghur bol-ụp tur- ‘to occur again and again’, Chuvash pär-Zä tị̈r- ‘to throw/shoot at intervals’. The lexical verb tur- ‘to stand up/to stand’ is an initiotransformative verb covering an initial transformative phase and a following post-transformative phase: () ‘to get into the state of standing’ and () ‘to dwell in this state’. The second, non-dynamic phase allows it to function as a non-transformativizer with connotations of repetition and duration. Constructions of the type hBi tur- ‘to stand’ block limit-oriented interpretations, e.g. Tatar awị̈r-ị̈p tụr- hache-B.CONVERB stand.AUXILIARYi ‘to be ill’ rather than ‘to get ill’. The constructions can also denote continuity, e.g. Turkish ‹içedur-› ‘to keep drinking’, Tatar yaz-a tur- ‘to keep writingʼ. Serial interpretations are also possible, e.g. Kumyk oχu-p tur- ‘to read repeatedly, habitually’, Karachay-Balkar yiber-ịb tur‘to send repeatedly’, tig-ịb tur- ‘to sew repeatedly, habitually’, ïrla-p tur- ‘to sing repeatedly’. All these meanings derive from the non-transformativizing function.




The construction hBi tur- is attested in Old Uyghur, e.g. küzät-ịp tur- hwatch-B. stand.AUXILIARYi ‘to watch (constantly)’ (Gabain : ). It is sometimes difﬁcult to distinguish this use from the lexical use of tur- ‘to stand’, e.g. ‘to stand watching’ (see more about ambiguity in section .). Modern examples: Turkish ‹çalıș-ıp dur-› ‘to work constantly’, ‹dolan-ıp dur-› ‘to stroll around’, ‹düșün-üp dur-› ‘to think constantly’, ‹yazıp dur-› ‘to write permanently’, Karachay-Balkar oltur-ụb tur- ‘to sit’ rather than ‘to sit down’, Kumyk aša-p tur- ‘to eat’ rather than ‘to begin to eat’ or ‘to eat up’, oχu-p tur- ‘to read’ rather than ‘to start reading’ or ‘to read and ﬁnish reading’, yuχla-p tur- ‘to sleep’ rather than ‘to fall asleep’, Tatar bul-ị̈p tụr- ‘to be’ rather than ‘to become’, ḳara-p tụr- ‘to look’ rather than ‘to catch sight’, kitịr-ịp tụr- ‘to bring regularly’, ḳurḳ-ị̈p tụr- ‘to fear’ rather than ‘to get frightened’, tụr-ị̈p tụr‘to stand’ rather than ‘to stand up’, tụt-ị̈p tụr- ‘to hold’ rather than ‘to grasp, to seize’, uḳị̈-p tụr- ‘to read’, uyla-p tụr- ‘to think’, Kirghiz kel-ịp tur- ‘to come regularly’, Uzbek gȧpir-ịb tur- ‘to talk’, oḳi-b tur- ‘to read’, ‘to read constantly’, Uyghur eḳ-ịp tur‘to ﬂow’, ‘to ﬂow constantly’, kel-ịp tur- ‘to come regularly’, Khakas čör tur- ‘to walk’, χorïχ tur- ‘to fear’ rather than ‘to get frightened’, Sayan Turkic olur-ụp dur- ‘to sit’ (‘to have sat down’) rather than ‘to sit down’, aːrï-p dur- ‘to be ill’ (‘to have fallen ill’) rather than ‘to fall ill’, Chuvash vula-za tị̈r- ‘to read regularly’. Constructions based on source verbs meaning ‘to sit down/to sit’ are used in similar ways, e.g. Turkmen gid-ịb otur- hgo-B.CONVERB sit.AUXILIARYi ‘to go all the time’, Bashkir uyla-p ultị̈r- ‘to consider, to ponder’, Tatar tụr-ị̈p utị̈r- ‘to stand up repeatedly’, Kirghiz ište-p otur- ‘to work continuously’, Khakas oyna-p odïr- ‘to play constantly’. 
Constructions based on source verbs meaning ‘to lie down/to lie’ include Kirghiz oylo-p ǰat- ‘to consider’, Tuvan aɣ-ị̈p čï ʔt- ‘to ﬂow’, Tofan boːla-p čïht- ‘to shoot’, Uyghur uχ la-wat- ‘to be asleep’, formed with {-(Ị)wat-} < hBi yat-. Many constructions based on source verbs meaning ‘to move’, ‘to walk’, ‘to proceed’, e.g. Turkmen yör-, Bashkir -yụ̈rụ̈-, Noghay yür-, Kirghiz ǰür(ü)-, Kazakh hBi žür-, Uzbek yu̇r-, Khakas čör-, Tuvan čoru-, have similar functions, e.g. Tatar uḳị̈p yụ̈rị- hread-B.CONVERB move.AUXILIARYi ‘to read’, kiy-ịp yụ̈rị- ‘to wear’ rather than ‘put on’, Kirghiz oḳï-p ǰür-’to study continuously’, Uyghur oyna-p yür- ‘to play’, kälịp kät-ịp yür- ‘to come and go regularly’, Khakas ïrla-p čör- ‘to sing long or repeatedly’, pas čör- ‘to write now and then’, Yakut kül-ä sïrït- ‘to laugh constantly’, oχt-o sïrït- ‘to fall constantly’. Some languages use hBi bar- to express gradual development of the action, in particular increasing strength or intensity over time, e.g. Turkmen al-ị̈p bar- htake-B. CONVERB go.AUXILIARYi ‘to take more and more’, Uzbek yȧχši-lȧ-n-ịb bår- ‘to improve more and more’. CONVERB

⟨A⟩ constructions

⟨A⟩ constructions are based on the intraterminal value of the ⟨A⟩ converb, the view of an event within its boundaries. ⟨A⟩ constructions with initiotransformative auxiliary verbs cover a transformative phase and the following post-transformative phase, i.e. ‘to get into a state’ + ‘to remain in this state’. The combination of an intraterminal converb suffix and the




initiotransformative auxiliary allows the expression of both ingressive and continuative meanings, e.g. Yakut bar-a-tur- ‘to go and continue to go’. Though hAi tur- constructions are often said to express actions in their initial phase, they also denote the continuation of an action that has already begun: ‘to continue to do’, ‘to still be doing’, ‘to go on doing’, ‘keep doing’, etc., e.g. East Old Turkic ïδ-ụ tur- hsend-A.CONVERB stand.AUXILIARYi ‘to keep sending’, yorï-yụ tur- ‘to keep walking’, ḳora-yụ tur- ‘to continue to diminish’, alda-yụ tur- ‘to keep deceiving’. This function is found in many modern languages, e.g. Turkish ‹baḳ-a dur-› ‘to continue to look’, ‹düșün-e dur-› ‘to keep thinking’, ‹çalıș-a dur-› ‘to continue working’, ‹gid-e-dur-› ‘to continue to go’, ‹iç-e-dur- ‘to keep drinking’, Kumyk oχu-y tur ‘to go on reading’, söyle-y tur- ‘to go on speaking’, Noghay ḳara-y tur- ‘to keep looking’, Yakut tüːh-ä tur- ‘to keep raining’, süːr-ä tur- ‘to keep running’, käpsi-y tur- ‘to continue telling’, kördör-ö tur- ‘to continue to show’. The type hAi tur- can obviously not combine with ﬁnitransformatives in their quantitatively simplest reading such as Turkish ‹öl-› ‘to die’, e.g. *‹öl-e-dur› ‘to keep dying’. Initiotransformatives such as ‹otur-› ‘to sit down/sit’ and ‹yat-› ‘to lie down/lie’ exclude purely limit-oriented, ingressive readings such as ‘to start doing’. Thus, ‹otura-dur-› and ‹yat-a-dur-› cannot mean ‘to start to sit’ and ‘to start to lie’. Only continuative readings are possible here, i.e. ‘to remain seated/lying’. hAi tur- can refer to an action in its relation to a second action, in the sense of ‘to do something meanwhile’. Grammarians have claimed that Turkish hAi dur- and hBi dur- denote durativity and continuous action. Both ‹söylen-e dur-› and ‹söylen-ip dur-› ‹söylen-› ‘to grumble, mutter’ are taken to mean ‘to keep grumbling’. However, the two constructions differ clearly from each other. 
Constructions with hAi ḳal- show comparable properties, highlighting the second phase, while also including the initial phase that leads to it, i.e. ‘to get into a posttransformative state and remain there’, e.g. Orkhon Turkic yat-ụ ḳal- hlie (down)-A. CONVERB get.into.a.state.and.remain.AUXILIARYi ‘to lie down’ + ‘remain lying’ and tur-ụ ḳal- ‘to come to a standstill’ + ‘remain without moving’. Examples from later languages: Chaghatay yaɣ-a ḳal- ‘to begin and continue raining’, Kumandin tur-a ġal- ‘to get up and remain upright’. Khakas čügür-ä χal- means ‘to run away’ + ‘to stay away’, not simply ‘to run away’ as hBi χal-. The decisive difference is that hAi χalhighlights the post-transformative phase. The corresponding Turkish construction is restricted to a few lexical verbs such as ‹bak-› ‘to look’, ‹don-› ‘to freeze’, ‹kal-› ‘to remain’, ‹șaș-› ‘to be surprised’. It implies continued dwelling in the post-transformative state, e.g. ‹don-a-kal-› hbe.frozen-A. CONVERB get.into.a.state.and.remain.AUXILIARYi ‘to become petriﬁed’ + ‘to remain petriﬁed’, ‹otur-a-kal-› ‘to sit down’ + ‘to remain seated’, ‹șaș-a-kal-› ‘to be bewildered’ + ‘to remain bewildered’, ‹uyu-ya-kal-› ‘to fall asleep’ + ‘to continue to sleep’. Constructions with hAi bär- express continued or uninterrupted action in some languages, e.g. Kazakh tamaḳ že-y ber- heat-A.CONVERB give.AUXILIARYi ‘to keep eating’, Uzbek yåz-ȧ ber- ‘to continue to write’, sȯzlȧ-y ber- ‘to continue to speak’, bår-ȧ ber- ‘to continue to go’, Uyghur kül-ụ̈-wär ‘to laugh incessantly’. However, in some




Siberian languages such as Tuvan, Tofan, Khakas, Altay, and Chulym, hAi bär- forms ingressives that highlight the initial phase of non-transformatives and initiotransformatives, e.g. Tuvan aŋnï-y bär- ‘to start hunting’, udï-y bär- ‘to fall asleep’, ḳorɣ-a bär- ‘to get scared’, aːri-y bär- ‘to fall ill’, Khakas oyni pir- ‘to start to play’, Tofan ïɣläyị bär- ‘to start to cry’. East Old Turkic hAi tut- tut- ‘to grasp/ hold’ has a continuative function, e.g. öyü tut- hthink-A.CONVERB hold.AUXILIARYi ‘to keep thinking’. The type hAi käl-, based on verbs meaning ‘to come’, denotes continuity of an action up to a later orientation point, e.g. Karakhanid Ḳal-ụ käl-dị hremain-A.CONVERB come.AUXILIARY-PASTi ‘It has been passed down’, Turkish ‹Böyle ol-a-gel-miș› ‘It has evidently been like this’. Some languages employ hBi käl- instead, e.g. Turkmen iːšläː-p gel- ‘to have been working (up to a given point)’, Kirghiz Bol-ụp kel-dị, Kazakh Bol-ị̈p kel-dị ‘It has existed until now’. The Orkhon Turkic and Karakhanid construction hAi bar- ‘to go’ expresses continuation of a state, e.g. yoḳaδ-u bar- hbe.destroyed-A.CONVERB go.AUXILIARYi ‘to continue being destroyed’, art-a bar- ‘to keep increasing’. Constructions with hAi yaz- are propinquitive in the sense of ‘to fail’, ‘to miss’, ‘to almost/nearly do’, e.g. Turkish ‹düș-e-yaz-› hfall-A.CONVERB miss.AUXILIARYi ‘to almost fall’, Uzbek ȯl-ȧ yåz- ‘to be on the point of dying’, čiḳ-ȧ yåz- ‘to be about to come out’. The element yaz- is mostly no longer used as a lexical verb (‘to fail’). In East Old Turkic hAi ïːδ-, the combination of the ﬁnitransformative auxiliary verb ‘to send’ with the intraterminal hAi converb yields the meaning ‘to get into an already ongoing action’, ‘to get going’, etc., e.g. sanč-a ïːδ- hrout-A.CONVERB send. AUXILIARYi ‘to get into the action of routing’, yitür-ụ̈ ïːδ- ‘to get into the state of having lost’, unït- ụ ïːδ- ‘to forget’, ‘to get into the state of having forgotten’. 
Constructions with hAi kör- ‘to see’ have attentive functions, meaning ‘to take care to do’, ‘to make sure to do’, e.g. Orkhon Turkic Yäl-ü kör! hride.fast-A.CONVERB see. AUXILIARY.IMPERATIVEi ‘See to it that you ride fast!’, Chaghatay Örgät-ä kör! ‘See to it that you learn!’, Old Ottoman ḳïl-ị̈ gör- ‘to take care to do’. In modern Turkish, only negative, prohibitive forms are used, e.g. ‹Düșmeye gör!› ‘See to it that you don’t fall’. In contrast, constructions of the type hBi kör- signal attempted action, ‘to try to do’ or ‘to dare to do’, sometimes with desiderative connotations. Kuman Baḳ-ị̈p kör-ụ̈gịz! hlook-A.CONVERB see.AUXILIARY-IMPERATIVEPLURALi is attemptive in the sense of ‘Try to look!’, though it has been understood as ‘reinforcement of the imperative’ (Gabain : ). hAi kör- is used in older languages, e.g. Chaghatay al-a kör- ‘to try to get’. Examples of hBi kör- in later languages: Turkmen yaδ-ị̈p gör- ‘to try to write’, Karachay-Balkar išli-b kör- ‘to try to work’, Uzbek yåz-ịb kȯr- ‘to try to write’, oḳib kȯr- ‘to try to read’, kiy-ịb kȯr- ‘to try on’, åt-ịb kȯr- ‘to want to shoot’, ye-b kȯr- ‘to try to eat’, ‘to taste’, Uyghur oḳu-p kör- ‘to try to read’, oyla-p kör- ‘to try to/dare to think’, tet-ip kör- ‘to taste’, Shor šaː-p kör- ‘to try to beat’, säkri-p kör- ‘to try to jump’, Khakas pas kör- ‘to try to write’, kis kör- ‘to try on (clothes)’. A similar construction is hBi baḳ- ‘to look, to watch’, e.g. Uyghur oyla-p baḳ- ‘to dare to think’, ye-p baḳ- ‘to try to eat’, ‘to have a taste’, Kashghar Uyghur izdä-p baḳ- ‘to try to search’, Turfan dialect köŕ(ụ̈p) paḳ- ‘to try to look’, äpiŕ(ịp) paχ- ‘try to bring’ (Yakup : –).




Spatial orientation

Postverbial constructions based on motion verbs meaning ‘to come’ and ‘to go away’ may express spatial orientation, specifying whether an action is directed towards a deictic centre, e.g. the speaker or the addressee, or away from it. Cislocative orientation, direction towards a deictic centre (‘to this place’), is expressed by venitive constructions based on verbs meaning ‘to come’, e.g. Old Uyghur aḳ-ị̈p käl- ⟨flow-B.CONVERB come.AUXILIARY⟩ ‘to flow in’, ün-ä käl- ‘to come forth’, Turkmen uč-ụp gel- ‘to fly here’, Tatar al-ị̈p kil- ‘to bring’, Noghay uš-ị̈p kel- ‘to fly here’, ‘to come flying’, Uzbek kir-ịb käl- ‘to enter one’s own place’, ål-ịp käl- ‘to bring’, Uyghur yügür-ụ̈p käl- ‘to come running’, ḳayt-ịp käl- ‘to come back’, uč-ụp käl- ‘to come flying’, Tuvan čäd-ịp käl- ‘to arrive’, Khakas čügür kil- ‘to come running’, al kil- ‘to bring’. Some Khalaj motion verbs such as kiːr- ‘to enter’, äːn- ‘to go down’, and hün- ‘to go up’ form imperatives with the suffix {-Vk} ~ {-Vkä}, which goes back to cislocative constructions with käl- ‘to come’, e.g. Kir-äk! ‘Come in!’.

Translocative orientation, direction from a deictic centre (‘from this place’), is expressed by andative constructions based on verbs meaning ‘to go away’, bar- and ket-, e.g. East Old Turkic öl-ụ̈p bar- ⟨die-B.CONVERB go.AUXILIARY⟩ ‘to pass away’, ün-ụ̈p bar- ‘to go up’, uč-ụp bar- ‘to fly away’, Turkmen yüδ-ụ̈p git- ‘to swim away’, ïɣla-p git- ‘to run away’, uč-ụp git- ‘to fly away’, Tatar ül-ịp kit- ‘to pass away, to die’, čị̈ɣ-ị̈p kit- ‘to go out’, ḳayt-ị̈p kit- ‘to set out for home’, kịr-ịp kit- ‘to go in’, Noghay uš-ị̈p bar- ‘to fly away’, Kazakh žür-ịp ket- ‘to move away’, Uzbek uč-ụb ket- ‘to fly away’, ḳayt-ịb ket- ‘to go back’, ål-ịb ket- ‘to take away’, Uyghur öt-ụ̈p kät- ‘to pass away’, öl-ụ̈p kät- ‘to die’, Khakas apar- < al par- ‘to take away’ (cf. Heine and Kuteva : ).
Some Khalaj motion verbs such as yat- ‘to lie down’ and yät- ‘to lead, to take (away)’ form imperatives with the suffix {-Uv(A)}, which goes back to translocative constructions with bar- ‘to go away’, e.g. Yat-ụv! ‘Lie down!’, Yät-ụ̈v! ‘Remove it!’. There are similar constructions with ⟨A⟩ converbs, e.g. Old Uyghur uč-a bar- ⟨fly-A.CONVERB go.AUXILIARY⟩ ‘to fly off’, ‘to die’, a metaphorical use of bar- ‘to go’, Tuvan čoru-y bar- ‘to run away’. Other auxiliary verbs are used in Turkmen ïɣla-p gir- ‘to run into something’ (cislocative), ïɣla-p čïḳ- ‘to run out of something’ (translocative). Turkish employs constructions with the converb in {-(y)ArAK}, e.g. ‹koș-arak gel-› ‘to come running’ (cislocative), ‹koș-arak git-› ‘to run away’ (translocative).

Version

Constructions based on verbs meaning ‘to give’ and ‘to take’ may express so-called ‘version’, which indicates whether a given action is performed to the benefit or affliction (advantage or disadvantage) of the performer or some other entity. It is mostly a question of beneficiency, i.e. to whose benefit or in whose interest the action is carried out: ‘to act for one’s own sake’ vs ‘to act for the sake of someone else’. Subjective version is expressed by constructions based on verbs meaning ‘to take’, and denotes that the action is intended for the performer, which comes close to




diathetic meanings of the middle type. The constructions have autobenefactive meanings, ‘to act for oneself, in one’s own interest’, e.g. Kirghiz ḳol-ụn-dụ ǰuː-p alhhand-POSSESSIVE-ACCUSATIVE wash-B.CONVERB take.AUXILIARYi ‘to wash one’s hand(s)’, but also maledictive meanings, e.g. Uyghur ḳol-ụ-nị käs-ịw al- hhand-POSSESSIVEACCUSATIVE cut-B.CONVERB take.AUXILIARYi ‘to get one’s hand cut’. East Old Turkic hAi al- is used for subjective version, e.g. Toḳuz Oɣuz teːr-ä ḳuβ rat-ụ al-dị̈-m hNine Oghuz gather-A.CONVERB organize-A.CONVERB take.AUXILIARY-PASTSINGULARi ‘I gathered and organized the Nine Oghuz tribes (for me)’. Modern languages prefer hBi al-, e.g. Tatar čaɣị̈r-ị̈p al- ‘to invite to oneself ’, kiy-ịp al- ‘to get dressed’, ḳul-ị̈-nị̈ yuw-ị̈p al- ‘to wash one’s hand(s)’, tab-ị̈p al- ‘to ﬁnd for oneself ’, tart-ị̈p al- ‘to draw to oneself ’, tụ̈z-ịp al- ‘to arrange for oneself ’, tụt-ị̈p al- ‘to grasp for oneself ’, tụ̈zät-ịp al- ‘to repair for oneself ’, Uzbek bil-ịb ål- ‘to acquire knowledge for oneself ’, Kazakh že-p al-, Uzbek ye-b ål- ‘to eat up’, Uyghur yez-ịw-al- ‘to write down for oneself ’, yiɣ-ịw-al- ‘to gather for oneself ’, sözli-w-al- ‘to speak to oneself ’, Tuvan biži-p al- ‘to write for oneself ’, Tofan orula-p al- ‘to collect for oneself ’. Khalaj imperatives carrying the sufﬁx {-Aːl} go back to subject version constructions, e.g. Käd-äːl! ‘Dress!’, Tut-aːl! ‘Grasp!’, Yuːt-aːl! ‘Swallow!’. Objective version is expressed by constructions based on verbs meaning ‘to give’. They denote that the action is intended for some other entity than the performer and can thus have connotations of beneﬁciency and politeness, e.g. ‘to favour by acting’, ‘to deign to act’ (cf. Heine and Kuteva : ). In older Turkic, hAi beːr- is used in this sense, e.g. Orkhon Turkic balbal ḳïl-ụ beːrhstele make-A.CONVERB give.AUXILIARYi ‘to erect a stele for somebody’, Old Uyghur ača beːr- ‘to open for somebody’, yor-a beːr- ‘to explain to somebody’. 
Khalaj hAi pirmay imply that the action is performed at somebody’s request, e.g. odïr-a pir- ‘to sit down on demand’. Modern languages prefer hBi ber-, e.g. Turkmen oḳoːp ber- ‘to read for somebody’, Tatar aŋlat-ị̈p bir- ‘to explain to somebody’, bül-ịp bir- ‘to share out’, ịšlä-p bir- ‘to work for somebody’, söylä-p bir- ‘to tell somebody’, tab-ị̈p bir- ‘to ﬁnd something for somebody’, uḳị̈-p bir- ‘to read for somebody’, tịg-ịp bir- ‘to sew for somebody’, tözät-ịp bir- ‘to repair for somebody’, yaz-ị̈p bir- ‘to write for somebody’, Kazakh ayt-ïp ber- ‘to tell’, Kirghiz oḳu-p ber- ‘to read for somebody’, Uyghur išlä-p ber- ‘to work for somebody’, eč-ịp ber- ‘to open for somebody’, hikayä eyt-ịp ber- ‘to tell somebody a story’, Tuvan biži-p ber- ‘to write for somebody’, käz-ip bär- ‘to cut for somebody’, Tofan ög-lä-p bär- ‘to build a house for somebody’, Khakas pas pir- ‘to write for somebody’. Turkish constructions based on {-(y)Ịver-} express object version, e.g. ‹alıver-› ‘to buy for somebody’. They have no connection with the constructions expressing suddenness dealt with above. Two constructions of different origins have fused here. They differ with respect to suprasegmental features. In an Anatolian dialect described by Demir (), the converb sufﬁx carries high pitch in object version constructions, e.g. ‹yap-í ver-›, whereas the lexical verb carries high pitch in constructions expressing suddenness, e.g. ‹yáp-ı ver-›. Also in Turkmen and Uyghur, hAi ber- expresses object version, i.e. Turkmen {-(Ị)-ber-} ~ {-(Ị)ver-}, Uyghur {-(Ị)wär-} (Johanson : ).




Potentiality

Potentiality, the physical or mental ability or inability to perform actions, is expressed by various constructions. East Old Turkic used the now obsolete verb uː- ‘to be able, powerful’, thus ⟨A⟩ uː- for ability and ⟨A⟩ uː-ma- for inability. Other constructions are based on auxiliary verbs of the types al- ‘to take’, bil- ‘to know’, and (b)ol- ‘to become’. Karakhanid exhibits ⟨A⟩ bil-, e.g. aδr-a bil- ‘to be able to distinguish’. Middle Kipchak and Chaghatay use ⟨A⟩ al- ‘take’ and ⟨A⟩ bil- (often without converb suffix), e.g. ayt-a bil- ‘to be able to say’, bil-ị bil- ‘to be able to know’. Ottoman originally used the ⟨A⟩ converb suffix in all its old variants, i.e. {-(y)A}, {-(y)Ị}, {-(y)Ụ}. After the th century, only {-(y)A} occurs, e.g. dön-ä bil- ‘to be able to return’. In later languages, ⟨A⟩ al- is found in languages of the Northwestern, Southeastern, and Northeastern branches and in Salar, often in contracted forms, e.g. Karachay-Balkar bar-al- < bar-a al- ⟨go-A.CONVERB take.AUXILIARY⟩ ‘to be able to go’, čab-al- čab- ‘to run’, kel-al- kel- ‘to come’, kör-ä al- ~ kör-al- kör- ‘to see’, Tatar yaz-a al-, Bashkir yaδ-a al- yaz-, yaδ- ‘to write’, uḳị̈-y al- uḳị̈- ‘to read’, Kirghiz ber-e al- ber- ‘to give’, Kazakh kör-e al- kör- ‘to see’, žaz-a al- ~ žaz-al- žaz- ‘to write’, Uzbek oḳi-y-ål- oḳi- ‘to read’, Kashghar Uyghur bar-al- bar- ‘to go’, oyna-l- oyna- ‘to play’. Trakai Karaim displays the contracted and harmonic marker {-(y)Al-}, e.g. aša-yal- aša- ‘to eat’, kˊetί-älί- kˊetί- ‘to go’. Chuvash possesses the marker {-(Ø)Ay-}, e.g. kil-äy- kil- ‘to come’. Chulym employs ⟨B⟩ al-. Verbs of the type al- ‘to take’, ‘to get’ have here developed to express physical and mental ability, mostly also permissiveness and epistemic possibility (cf. Heine and Kuteva : –, –, –). The Southwestern branch prefers constructions based on bil- ‘to know’. ⟨A⟩ bil- is realized in non-harmonic forms such as Turkish {-(y)A-bil-}, e.g.
‹gel-e-bil-› ‘to be able to come’, ‹gör-e-bil-› hsee-A.CONVERB know.AUXILIARYi ‘to be able to see’, ‹kal-abil-› ‹kal-› ‘to stay’, ‹oku-ya-bil-› ‹oku-› ‘to read’, ‹yaz-a-bil-› ‹yaz-› ‘to write’, Gagauz išlä-yä-bilišlä- ‘to work’, Azeri gör-ä bilgör- ‘to see’, oχu-ya biloχu- ‘to read’. Similar constructions involve Crimean Tatar at-a bilat- ‘to throw’, Bashkir yüδ-ä bịl- hswim-A.CONVERB know.AUXILIARYi ‘to be able to swim’, Uzbek kȯr-ȧ-bilkȯr- ‘to see’. Khalaj employs hAi bil- with a variable converb vowel, e.g. käl-i-bilkäl- ‘to come’, tut-a-bil- tut- ‘to hold’, var-i-bil- var- ‘to go’. The converb sufﬁx hAi has otherwise vanished in Khalaj. Chuvash uses hAi pịl- ‘to be able’. Turkmen employs hBi bil-, e.g. oḳoː-p bil- ‘to be able to read’. Most constructions of this type, e.g. Turkish {-(yA-bil-}, express ability, permissiveness, and epistemic possibility. Gagauz {-(y)A-bil-}, however, does not express epistemic possibility. The construction hBi (b)ol- ‘to be possible’, ‘to be able’ is found in Turkmen, Altay, Tuvan, Tofan, Khakas, and Shor, e.g. Khakas sarna-p pol- hsing-B.CONVERB be.AUXILIARYi ‘to be able to sing’, it pol- ‘to be able to do’. For negation, some languages employ the regular construction hAi + bil-mä-, e.g. Chaghatay söz ayt-a bil-mä- hword say-A.CONVERB know.AUXILIARY-NEGATIONi ‘not to be able to speak’, Ottoman gäč-ä bil-mä- gäč- ‘to pass’, Azeri al-a bil-mä- al- ‘to take’, ver-ä bil-mäver- ‘to give’, Uzbek yåz-ȧ bil-mȧyåz- ‘to write’, Khalaj

käl-i-bil-mä- käl- ‘to come’. Turkmen employs ⟨B⟩ bil-me-, e.g. oḳoː-p bil-me- ‘not to be able to read’. Old Ottoman also displays ⟨A⟩ + {-mA}, which goes back to the negated form of the old verb uː- ‘to be able’, e.g. bul-ï-ma- ⟨find-A.CONVERB-NEGATION⟩ ‘not to be able to find’, söylä-yü-mä- ‘not to be able to speak’. Khalaj has maintained this construction, e.g. var-um- ‘to be unable to go’ < *bar-u uː-ma-. Turkish displays {-(y)A-mA-}, e.g. ‹gel-e-me-› ‘not to be able to come’, ‹gid-e-me-› ‹git-› ‘to go’, ‹yaz-a-ma-› ‹yaz-› ‘to write’. This means that {-(y)A} functions as an allomorph of {-(y)A-bil-}, which is a non-agglutinative feature. Gagauz shows similar forms, e.g. ödä-yä-mä- ‘not to be able to pay’, üːrän-ä-mä- üːrän- ‘to learn’. Khalaj employs forms such as hilä:r-i-mä- hilä:r- ‘to kill’, ġal-i-ma- ġal- ‘to remain’. Many languages employ ⟨A⟩ al-ma-, e.g. Chaghatay bar-al-ma- ‘to be unable to go’, ïnan-a al-ma- ïnan- ‘to believe’, Karachay-Balkar kör-al-ma- kör- ‘to see’, Crimean Tatar at-al-ma- ~ at-a-ma- at- ‘to throw’, Bashkir uḳị̈-y al-ma- uḳị̈- ‘to read’, Tatar yaz-a al-ma-, Kazakh žaz-a al-ma-, Uzbek yåz-ȧ ål-mȧ- ~ yåz-ål-mȧ-, Tuvan biži-y al-ba- ‘not to be able to write’ yaz-, žaz-, yåz-, biži- ‘to write’.
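The Turkish forms cited in this section follow a regular pattern that can be made concrete with a small sketch. The Python below is an illustration only, not part of the original description: the function names, the hyphenated segmentation of the output, and the simplified vowel inventory are assumptions. It treats A as a two-way harmonic vowel (a after back vowels, e after front vowels), inserts y after vowel-final stems, leaves bil- non-harmonic, and builds the negative on the bare converb, so that {-(y)A} stands where {-(y)A-bil-} would be expected. Stem alternations such as git- ~ gid- are deliberately not modelled.

```python
# Illustrative sketch only: names and segmentation are assumptions.
BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")
VOWELS = BACK_VOWELS | FRONT_VOWELS


def harmonic_a(stem: str) -> str:
    """Realize the archiphoneme A as 'a' or 'e' according to the stem's last vowel."""
    last_vowel = next(ch for ch in reversed(stem) if ch in VOWELS)
    return "a" if last_vowel in BACK_VOWELS else "e"


def potential(stem: str, negative: bool = False) -> str:
    """Build the Turkish potential (stem-(y)A-bil-) or impossibility (stem-(y)A-mA-) form."""
    a = harmonic_a(stem)
    glide = "y" if stem[-1] in VOWELS else ""  # (y) appears only after vowel-final stems
    converb = f"{stem}-{glide}{a}"
    if negative:
        # The negative attaches directly to the converb; bil- is absent.
        return f"{converb}-m{a}-"
    # bil- does not harmonize with the stem.
    return f"{converb}-bil-"


print(potential("gel"))                 # gel-e-bil-
print(potential("oku"))                 # oku-ya-bil-
print(potential("yaz", negative=True))  # yaz-a-ma-
```

The sketch covers only the regular Turkish cases listed above; the contracted and harmonic markers of other varieties would need their own rules.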

..      A few postverbial constructions expressing actionality have developed into viewpoint aspect markers. Some of them, based on auxiliary verbs meaning ‘to be’, ‘to stand’, ‘to move’, ‘to sit’, ‘to lie’, have come to renew the expression of intraterminality, typical of presents and imperfectives. The type hAi tur-ụr developed in the Northwestern and Southeastern branches, e.g. Yaz-a tur-ụr ‘X stands writing’. The forms underwent phonetic erosion: fusion of the converb marker with the auxiliary and partial or total loss of tur-ụr, e.g. yaz-a-dị̈r, yaz-a-t, yaz-a. Some constructions are based on the auxiliary ‘to sit’, e.g. Kirghiz ište-p otur-mụn ‘I am working’. The Turkish sufﬁx {-(Ị)yor} has evolved from a construction with a verb meaning ‘to move’, e.g. ‹Gel-iyor› ‘X is coming/comes’ < *Gäl-ä yorị̈-r ‘X moves coming’. These items originally expressed high degrees of focality, the concentration (focus) of psychological interest on the situation obtaining at the orientation point (Johanson : -). They eventually turned into items of lower focality, simple presents and imperfects. This led to further renewals of high focality by means of the verb yat- ‘lie’ in constructions of the type hBi yat-ị̈r, e.g. Khakas Kör-čä ‘X is seeing, sees’ < *Kör-ụ̈b yat-ị̈r. The Uzbek focal present Yȧz(ȧ)-yȧp-tị ‘X is writing’ goes back to *Yaz-a yat-ị̈b tur-ụr (‘to write’ + hAi + ‘to lie’ + hBi + ‘stands’). Constructions with hBi tur-ụr developed into viewpoint aspect operators renewing the expression of post-terminality (resultatives, perfects, constatives), e.g. *Yaz-ị̈b tur-ụr ‘X stands having written’. They underwent phonetic erosion, fusion of the converb marker with the auxiliary, and partial or total loss of tur-ụr, e.g. Yaz-ị̈b-dị̈, Yaz-ị̈b ‘X has written’, Azeri Gäl-ịb-sän ‘You have come’. The constructions



originally expressed high degrees of focality in the sense of resultatives (‘X is in the state of having’), but later turned into items of lower focality, which led to further renewals of high focality by means of yat- ‘to lie’, e.g. Khakas Uzu-p-čat-χan ‘X has slept’ < *Uδ-ị̈b yat-ḳan. (On evidential meanings of post-terminals, see Johanson .)

. AMBIGUITY

Since postverbial constructions are in most cases formally not distinguished from a sequence of two lexical verbs, it may be difficult to decide whether the second verb in written texts is a lexical verb or an auxiliary verb. For instance, tur- can either mean ‘to stand’ or contribute to the actional content in the sense of ‘to keep doing’. The second verb in Yakut kör-dọ̈r-ö tur- ⟨see-CAUSATIVE-A.CONVERB stand.AUXILIARY⟩ may be interpreted as a lexical verb, ‘to stand showing’, or as a grammatical marker modifying kör-dọ̈r- ‘to show’, i.e. ‘to continue to show’. Uyghur uč-ụp käl- or Turkmen uč-ụp gel- ⟨fly-B.CONVERB come.AUXILIARY⟩ may be ambiguous in a corresponding way. In speech, the interpretations are distinguishable by prosodic means (Imart : ). It is sometimes difficult to distinguish actional and aspectual usages in written texts. In certain cases, the sequence of a lexical verb and a grammaticalized auxiliary verb can occur in paratactic constructions in which both verbs bear the same inflection. The difference between a non-subordinating serial verb construction and a converb construction in which one verb is subordinate is thus not necessarily marked in Turkic (cf. Narrog, Rhee, and Whitman, Chapter  this volume). Both the serial verb construction and the postverbial construction are systematically ambiguous. For instance, the Turkish serial verb construction ‹Al-dı git-ti› ⟨take-PAST go-PAST⟩ and the postverbial construction ‹Al-ıp git-ti› can both be interpreted ‘Taking it, X left’ or ‘X definitely took it’ (cf. Csató , ).

. SOME CONCLUSIONS

The present chapter deals with some vital cases of shared grammaticalization in Turkic. The languages involved have created analogous grammatical categories by different formal means. Postverbial constructions have been grammaticalized as actionality markers, and some of the latter have been further grammaticalized as viewpoint aspect markers. The strategies employed are genuinely Turkic, but the results of the processes only partly correspond. The paths of grammaticalization are still isomorphic across the Turkic varieties, and represent a shared heritage. Contact between the varieties has played a significant role in triggering the renewal and maintenance of the functional categories. For instance, the fact that different auxiliary verbs meaning ‘to be’, ‘to stand’, ‘to move’, ‘to sit’, ‘to lie’ have come to renew the expression of intraterminality in different varieties demonstrates that both the grammaticalized notions and the basic strategies are shared. As mentioned above, the individual languages often employ different

auxiliaries to express one and the same actional notion, and one and the same lexical verb has been grammaticalized to convey different actional meanings. The grammaticalization of indirectivity, the Turkic type of evidentiality, manifests similar tendencies, i.e. isomorphic grammaticalization strategies with different morphological markers (Csató ). Grammaticalization is a language-speciﬁc process that cannot be copied (Johanson , ). A careful comparison of shared grammaticalization strategies in the Transeurasian languages will no doubt provide further insights into the nature of the processes involved.

9 Grammaticalization in Japanese and Korean

HEIKO NARROG, SEONGHA RHEE, AND JOHN WHITMAN

. INTRODUCTION

..         In this volume that is organized along mainly areal groupings of languages, this chapter on Korean and Japanese represents Northeast Asia. It also represents part of a group of languages traditionally labelled as Altaic and more recently as Transeurasian, which share many structural characteristics, whether they are genetically related or not. These structural characteristics are further shared with a broader areal grouping of Northeast Asian language families including Amuric (Nivkh), Yukaghir, and Ainu, which are not normally included in Altaic/Transeurasian (TE). Historical/comparative research has tended to focus on the relationship between Japanese and Korean and the putative ‘core’ Altaic families: Mongolic, Tungusic, and Turkic. From a typological perspective, this may obscure the salient commonalities between Korean and Japanese and the three NEA families mentioned. Most of the typological features shared by Japanese and Korean with Turkic, Mongolic, and Tungusic are shared with these three NEA families as well. Such features include head-ﬁnal verbal and nominal syntax and a high index of agglutination (section ..), pervasive use of nominalization for clausal subordination, rich use of converbs with grammatical functions (section ..; Mattissen  for Nivkh, Maslova  for Yukaghir), pervasive use of mermaid constructions (section ..; Bugaeva ; Nedjalkov and Otaina ; Maslova ), a deep inventory of deverbal postpositions (section ..), and accusative alignment. Some of these features set TE/Altaic and the three families apart from immediately adjacent language groups. For example Chukotko-Kamchatkan, although in areal contact with Tungusic and Nivkh, shows

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Heiko Narrog, Seongha Rhee, and John Whitman . First published  by Oxford University Press

sharply divergent typological properties: ergative alignment, direct/inverse morphosyntax, and freer word order, even between NPs and their adjectival modifiers. Some notable typological features of Japanese and Korean not shared with some or all of the TE/Altaic languages are shared with members of this broader set of NEA languages. For example, Korean and earlier Japanese (Whitman ) share with Nivkh and Yukaghir and arguably Ainu (Shibatani ) the property of RTR (retracted tongue root)-dominant vowel harmony (Ko, Joseph, and Whitman ). This property is also shared with two members of core Altaic, namely Mongolic and Tungusic, but not with most Turkic varieties. In terms of other phonological commonalities, Ainu, Japanese, and Korean are the only languages in the region originally lacking a two-way contrast in laryngeal features for obstruents and displaying lexical pitch accent. From the standpoint of morphosyntax, earlier Japanese, many Ryūkyūan varieties, and Yukaghir display the type of focus concord pattern known in premodern Japanese as kakarimusubi (Maslova ). This pattern is not found in Korean or any of the Altaic languages. In terms of other grammaticalization patterns, Korean, Japanese, and Nivkh are the only language groups in Northeast Asia with inflecting adjectives, which in the case of Korean and Nivkh are essentially indistinguishable from stative verbs. Korean, Japanese, and Nivkh also share the property of having numeral classifiers (..). The point of listing these areal/typological similarities is not to deny the validity of one or the other linguistic grouping, but rather to alert the reader to the existence of areal patterns that extend beyond and in some cases cross-cut better-known groupings such as Transeurasian/Altaic. As in other linguistic areas, some typological isoglosses pick out distinct subsets. 
Japanese, Korean, and Ainu group together with respect to a number of phonological features (lack of a laryngeal distinction in consonants, lack of an r/l distinction, pitch accent), while Japanese, Korean and Nivkh group together with regard to morphosyntactic features distinctive in the region such as numeral classiﬁers and inﬂecting adjectives. Against the backdrop of typological parallels with other languages of Northeast Asia, we focus in this chapter on Japanese and Korean because of their relatively long documented histories and extensive research traditions.

..          Korean and Japanese have been a strictly head-ﬁnal SOV language with accusative alignment, and with frequent omission of argument NPs, throughout their documented history. Morphologically, Japanese is agglutinating, and given the head-ﬁnal nature of the language, clear cases of grammaticalization typically lead to the sufﬁxation of formerly independent morphemes (i.e. lexemes). Preﬁxation is much less common, conﬁned to a small number of categories, mostly honoriﬁcation and negation (cf. ...). For Japanese, relatively large amounts of texts are available with some gaps from the th century until now. However, the accessibility of texts for



non-specialists also varies period by period,¹ and no comprehensive historical corpora have been published yet.² As is the case with Japanese, sufﬁxation is much more common than preﬁxation in Korean. Preﬁxation, though less common in general, is often used to derive honoriﬁc or pejorative terms from value-neutral words (Koo ). For Korean, the historical depth of texts written in Hangeul, the Korean alphabet, goes back to the th century, and a large body of texts has been compiled through government-led projects such as the st-Century Sejong Project. Identiﬁably Korean texts written in Chinese characters date back to the th century CE (Nam ; Whitman ). Prior to the invention of Hangeul, several different writing systems were used, such as Itwu, Hyangchal, and Kwukyel, that made use of Chinese characters for their meaning (semantogram) or sound (phonogram) to represent Korean words, thus creating a problem for modern scholars in translating such texts. Recently many advances have been made in deciphering such texts found in poems, tombstone inscriptions, ledgers, administrative reports, pedigree records, and the like. Note that the structural characteristics of Japanese and Korean mentioned here are shared across Transeurasian languages as well as Nivkh, Yukaghir, and for the most part Ainu. Within the larger grouping, however, Ainu is an outlier, with substantial preﬁxation and features of polysynthesis, such as extensive noun incorporation (Shibatani ). This raises the possibility that Ainu descends from an ancestor with a substantially different typological proﬁle, but has converged with the languages discussed here through contact with Japanese.

. SOME REPRESENTATIVE PROCESSES OF GRAMMATICALIZATION IN THESE LANGUAGES

In this section we will provide a brief overview of processes of grammaticalization that should be representative for the languages discussed in this chapter. While it is difficult to quantify what is representative, we focus on changes that are (a) recurring, i.e. several morphemes or constructions have undergone the same kind of grammaticalization, at various periods of time, (b) not commonly found in the well-known European languages, and (c) recorded in historically documented times and not merely a matter of historical reconstruction. Furthermore, we are primarily interested in grammaticalizations that are found in Northeast Asian languages and so-called Transeurasian languages beyond Korean and Japanese. This is especially true of the grammaticalizations presented in sections .. and .., while those presented in

¹ The following historical period labels are used here: OJ: Old Japanese (th–th c.); LOJ: Late Old Japanese (th–th c.); MIDJ: Middle Japanese (th–th c.); EMJ: Early Modern Japanese (th–th c.); MJ: Modern Japanese (late th c. ~); OK: Old Korean (~th c.); MIDK: Middle Korean (th–th c.); EMK: Early Modern Korean (th–th c.); MK: Modern Korean (th c. ~).

² See, however, the University of Virginia’s Japanese Text Archive, http://etext.lib.virginia.edu/japanese/, and a historical corpus project at NINJAL (http://pj.ninjal.ac.jp/corpus_center/en/kotonoha.html), which includes a parsed corpus of Old Japanese released in March  (http://oncoj.ninjal.ac.jp/).

.. and .. seem to be speciﬁc to Japanese and Korean, apparently motivated by the need or desire to accommodate Chinese loan vocabulary.

..     Cross-linguistically, in sequences of two verbs, one verb may lose its semantic independence and modify the other, providing information on grammatical categories such as tense and aspect, and directionality, or changing argument structure. There are three major constructions in which this happens, and in which the modifying verb still morphologically retains lexeme status (i.e. does not become an afﬁx): (a) serial verb construction (SVC), (b) converbs, and (c) compound verbs. In (c), the two verbs form one phonological and morphological word, while they remain two words in (a) and (b). Thus, (a) and (b) can be considered sub-cases of the same phenomenon. The difference between (a) and (b) is that in serial verb constructions, there is no overt coordination/subordination within the construction, while in converb constructions, one verb is marked as subordinate and typically shows restrictions on marking for other verbal categories. That is, there is a clear formal asymmetry between the two verbs (cf. Bisang : ; Ansaldo : ). It has been suggested that SVCs tend to occur in languages with little morphology or little obligatoriness of marking of grammatical categories, and that the distribution of SVCs vs converbs is an areal phenomenon (see Bisang : –; Ansaldo : –). But this does not mean that one language cannot have both constructions, or all of them, including compound verbs. Northeast Asian languages, and beyond them so-called Altaic languages including Turkic, have been treated as languages typically having converbs (see e.g. Haspelmath and König ). Korean and Japanese fall squarely within that type. Note that we ﬁnd converbs (e.g. gerunds) and to a lesser extent serial verb constructions (Go get it!) in the well-known European languages, too, but they have not grammaticalized into paradigms expressing grammatical categories. 
The following tables and short descriptions of grammaticalizations give a glimpse of how they have grammaticalized in Japanese and Korean. Note that Chapter  in this volume, by Johanson and Csató, additionally contains descriptions of the grammaticalization of converbs in Turkic languages. Table . shows common converb constructions in Standard Modern Japanese. The dates for grammaticalization here and in Table . are taken from the NKD. Below, we give two examples. () shows mora(w)- as a lexical verb ‘receive’, and () as a benefactive. The event denoted by kuturog- ‘relax’ does not involve the actual transfer of anything. The benefactive merely indicates a vague relationship of benefit, for instance that the writer was happy that many people were able to relax in the facility.

()
Sapuraizu=de ko-inu=wo morat.ta#
surprise-ESS kid-dog-ACC receive-PST
‘I received a puppy as a surprise.’

()
Syukuhaku sisetu=de ooku=no gesuto=ni kuturoi.de morat.ta#
lodging facility-ESS many-GEN guest-DAT relax-GER receive-PST
‘I had many guests relaxing in the lodging facility.’



T .. Common converb constructions in Modern Japanese Form

Original meaning Grammatical function

Date of grammaticalization

BENEFACTIVES -Te kure-

GER

give

General benefactive

th c.

-Te yar-/age-

GER

give

Other-benefactive

th c.

-Te mora(w)-/ itadak-

GER

receive

Self-benefactive

th c.

-Te i-

GER

be

Continuative

th c.

-Te ar-

GER

be

Stative resultative

th c.

-Te sima(w)-

GER

ﬁnish

Completive

th c.

-Te ok-

GER

put

Action result

th c.

-Te k-

GER

come

Directional; continuative th c.

-Te ik-

GER

go

Directional; continuative th c.

-Te mi-

GER

see

Conative

th c.

-Te mise-

GER

show

‘Show/prove being able doing’

th c.

ASPECTUALS

DIRECTIONAL/ ASPECTUALS

CONATIVES

() and () present the case of the directional/aspectual ik- ‘go’. () shows ik- in its literal sense, and in its aspectual reading, which denotes a continuous development towards a goal, not any actual movement. ()

Arubedo=wa kasei=ni it.ta# PN-TOP Mars-DAT go-PST ‘Alvedo went to Mars.’

()

Arubedo=wa zyozyoni kyooki =ni katamui.te it.ta# PN-TOP little.by.little madness-DAT verge-GER go-PST ‘Alvedo descended little by little into madness.’

Korean has a number of converb markers. First introduced in Ramstedt ([]: ‘converbum/converbalia’) in his description of Korean verbal morphology, the notion of ‘converb’ received little attention from Korean linguists until recently (cf. pwutongsa; Ko ).³ According to Ramstedt ([]: ), converbs signal

³ Converbs as a grammatical category have not been established in Korean linguistics and thus cannot be neatly delineated from related grammatical categories of diverse linking functions. Ramstedt

T .. Common converb constructions in Modern Korean Form

Original meaning

Grammatical function

Date of grammaticalization

-e cwu-

NFIN give

General benefactive

th c.

-e tuli-

NFIN give(+HON) Honoriﬁc benefactive

th c.

-e iss-

NFIN exist

Stative resultative

th c.

-e twu-

NFIN place

Purposive perfective

th c.

-e peli-

NFIN displace

Perfective

th c.

-e ka-

NFIN go

Directional; continuative th c.

-e o-

NFIN come

Directional; continuative th c.

NFIN see

Conative

BENEFACTIVE

ASPECTUAL

DIRECTIONAL/ ASPECTUAL

CONATIVE -e po-

th c.

that ‘the sentence is not finished but a [sic] the main verb is following’ (emphasis original). Among the converb markers, -e (and its allomorph -a) is the most frequently used form and enters into diverse constructions, many of which have developed into auxiliary verb constructions. Because of extensive semantic bleaching, -e is generally labelled as a non-finite (NFIN) marker. Some of the common converb constructions formed with the linker -e are exemplified in Table .. The verb of locomotion ka- ‘go’ has a long history as a lexical verb attested in earlier Korean. Along with its lexical use exemplified in (), it developed into a marker of continuative aspect as shown in ().

()
icey etule ka-nAn-ta
now where go-PRS-Q
‘Where are you going now?’ (Penyeknokeltay, c., I: a)

()
hanAl-to hAma pAlk-a ka-nA-ta
sky=also already be.bright-NFIN go-PRS-DEC
‘The sky is already becoming bright now (the day is breaking now).’ (Penyeknokeltay, c., I: a)

([]) lists as many as  subcategories of converbalia in Korean. Similarly, Johanson () and König () include a large number of linkers under this label, in which case the category would be a large collection of heterogeneous markers. In a more restrictive sense, the linkers -a/e, -key, -ci, and -ko, traditionally known as adverbializers, constitute converbs.



The verb of giving tuli- is inherently marked with the [+HON] feature, contrasting with the neutral cwu-, and thus was used to describe a transfer in the direction from a social inferior to a social superior, e.g. from a student to the teacher or from a child to the parent. Around the end of the th century, it was grammaticalized into the benefactive marker, still retaining the upward directionality, as shown in the following examples.

()
wuli pwumo=i thayca=skuy tuli-zava-si-ni
our parents-NOM prince-to give(+HON)-HON-HON-CONJ
‘ . . . as my parents gave [me] away to prince [=Buddha] (as his wife) . . . ’ (Sekposangcel, , a)

()
etop-ketun ca-si-l cali=lul tolpow-a tuli-ko
be.dark-if sleep-HON-ADN place-ACC take.care-NFIN give-and
‘When it is dark, [a dutiful son’s job is] to prepare the place for [his parents] to sleep in, and . . . ’ (Cengsokenhay, , b)

The converb patterns in Tables . and . show a good deal of overlap, but there are some important differences. Korean lacks a counterpart to the Japanese V-te mora(w)- ‘have/get V’ pattern in (), which has the structure of a causative in that the subject of the converb is distinct from the subject of the second verb. Both languages have completive or perfective patterns involving a predicate of disposal as the second verb, but these have distinct sources: Korean peli- ‘discard’ and Japanese sima(w)- ‘put to an end’. The forms of the converb also show interesting similarities and differences. In both languages the infinitive (to adopt the term used by Martin , ) is the older converbal pattern. In earlier Japanese the infinitive in -i was the chief converbal form, e.g. kapyer-i ko-sa-mu return-INF come-HON-FUT ‘will come home’ (MY ), sukup-i tamap-a na save-INF give-IRR DESID ‘please save (us)’ (Bussokusekika, ). In modern Japanese the infinitive is largely confined to compounds (see following paragraph) and other bound usages, replaced in its converbal function by the gerund in -te shown in Table .. In Korean the infinitive in -e/a is still the dominant converb form, but it has undergone univerbation in the formation of the modern past -e/ass- from -e/a + iss- ‘be, exist’, which coexists with the non-univerbated stative resultative in Table .. Modern Korean has developed converb patterns formed with the gerundive suffix -ko, as in progressive V-ko iss- V-GER be ‘be V-ing’ and desiderative V-ko siph- V-GER want ‘want to V’. Both languages have developed a past tense from an aspectual (resultative or perfective) pattern involving converb plus ‘be’: Japanese V-te ar- V-GER be > V-tar- > V-ta V-PAST (see Table .) and Korean V-e/a iss- V-INF be > V-e/ass- V-PAST.

Compound verbs also play an important role in Japanese vocabulary and grammar. Compounding is an area which is primarily associated with lexicalization rather than grammaticalization, but some verbs such as hazime- ‘begin’ and tuduke- ‘continue’ can be productively added to a large range of other verb stems in aspectual function, as in tabe-hazime- ‘begin to eat’ (cf. Matsumoto ). Similarly, verb compounding is among the most productive means of lexicalization in Korean. Word formation involving multiple verbs may involve asyndetic connection of multiple verbs, a true V-V compound pattern, resembling serial verb formation in other languages in appearance, as in ttwinol- ‘romp about’ (< ttwi ‘jump’-nol ‘play’), khaymwut- ‘interrogate’ (< khay ‘dig’-mwut ‘ask’), etc. This type of compounding, however, is not productive. A much more productive pattern is one making use of the

converb marker -a/e to connect the participating verbs (see section ..). Incidentally, using the converb -a/e is also the most common pattern of verb serialization and auxiliary verb formation in Korean. Owing to the superficial similarity in patterns, it is often difficult to determine whether the resultant forms are compound verbs, SVCs, or auxiliary verb constructions. The distinction between SVCs on the one hand and compound verbs and auxiliary verb constructions on the other largely depends on the single/multiple event interpretation, and on whether the verb of secondary meaning (typically V) encodes a grammatical notion, e.g. pokk-a mek- ‘roast-NFIN eat’ denotes two events, whereas ttwi-e ka- ‘run-NFIN go’ (= ‘run, go running’) and cwul-e tul- ‘diminish-NFIN enter’ (= ‘shrink, become shrunk’) denote single events; ka- ‘go’ in ttwi-e ka- ‘run, go running’ denotes physical locomotion whereas tul- ‘enter’ in cwul-e tul- ‘shrink, become shrunk’ marks the grammatical notion of inchoative. However, there are ambiguous cases that allow multiple interpretations, as in tol-a po- ‘turn-NFIN see’ between ‘turn and see’ and ‘reminisce’, and kkwulh-e anc- ‘genuflect-NFIN sit’ between ‘kneel and sit’ and ‘kneel down’. As this discussion implies, while Japanese has a clear formal distinction between converbs (formed with gerundive -te) and V-V compounds (formed with infinitive -i), Korean does not. Infinitive -e/a is used on the first verb of the converbal constructions in Table ., ‘object sharing’ V-V sequences referred to in the Korean descriptive tradition as serial verb constructions such as kkakk-a mek- ‘peel-INF eat’ (Chung ), and V-V sequences denoting a single event. Although some earlier studies refer to the latter two types as V-V compounds (e.g. Sohn ), none of the Korean patterns is as tightly bound as Japanese compound verbs. 
As shown in (a) and (b), Korean V-e/a V sequences can be split by a focus or delimiter particle, while this is never possible with Japanese V-V compounds: ()

a. Mina=nun sakwa=lul kkakk-a (=man) mek-nun-ta. (Korean) Mina-TOP apple-ACC peel-NFIN (-only) eat-PRS-DEC ‘Mina eats only peeled apples.’ (lit. ‘Mina only peels apples and eats (them)’) Or: Mina=nun sakwa=lul kkakk-a (=to) mek-nun-ta. (Korean) Mina-TOP apple-ACC peel-NFIN (-even) eat-PRS-DEC ‘Mina eats peeled apples, too.’ (lit. ‘Mina also peels apples and eats (them).)’ b. Mina=wa suber-i(*=mo) oti-ta. (Japanese) Mina-TOP slip-INF(*-even) fall-PST ‘Mina slipped and fell.’

The looser juncture between Korean V-INF V sequences may give a hint as to the status of V-INF V sequences in earlier Japanese, when the infinitive could still function as a converbal ending: it would be hasty to assume that such sequences were already compounds simply because their modern Japanese counterparts are (cf. Frellesvig et al. ). The asyndetic V-V pattern in Korean noted above is a true V-V compound pattern that has no counterpart either in Japanese or in Altaic: it is a bare root compound in which the first verb is completely unaffixed. Thus together with kkulh-e olu- boil-INF rise ‘come to a boil’ we find LMK kul-talh- boil-get.reduced ‘boil down’ and others, as shown in () (the bare root compound retains the original unreinforced initial, and simplifies the stem-final cluster).


()
ttwi-e nol- jump-INF play ‘play jumping around’; ttwi-nol- ‘romp about’
tol-a po- turn-INF see ‘turn and see, reminisce’; tol-po- turn-see ‘take care of’
pwuth-e sal- attach-INF live ‘live upon, be parasitic on’; pwuth-cap- attach-hold ‘catch’

Martin () cites about  examples of this type, already present in LMK. They are more highly lexicalized than V-e/a V sequences: they are not productive, they typically denote a single event, their meaning is sometimes non-compositional, and they cannot be separated by a particle.

..         (‘ ’) A phenomenon virtually unknown in core European languages but common across languages formerly labeled as Altaic and now as Transeurasian, including Japanese and Korean, comprises nouns grammaticalizing to markers of modal, evidential, and other categories in the verbal complex of the main clause. Tsunoda () has provided a survey of this phenomenon across Asian languages. The noun in predicate-modifying position loses some of its categorical features but also retains some. Especially, it can be followed by a copula like other nouns serving as predicates. Thus, a clause with one of these grammaticalized nouns has the syntax of an ordinary clause with a verbal predicate to the left, while it ends on a copula like a copular clause with a nominal predicate to the right, without having the syntax of a copular clause. Hence the label ‘mermaid construction’. Table . contains a list of nouns commonly used in mermaid constructions in Modern Japanese. Below, we give an example. () shows the noun wake in its lexical function, and () as a predicate modiﬁer. In (), wake functions as a regular noun, and an argument of another verb. In () it indicates a logical relationship between two clauses, namely that the state of affairs depicted in the clause marked by wake is the consequence of a state of affairs depicted in the preceding clause. This logical relationship is not always as clear as in ex. (). Wake simply indicates that the

T .. Some nouns commonly used in mermaid constructions of Modern Japanese Form

Original meaning

Grammatical function

Approximate date of grammaticalization

hazu

Intention

Epistemic necessity

th c.

wake

Reason

Conclusion, or reason for previous clause

th c.

mono

Thing

Obligation, habituality etc.

~ th c.

koto

Thing

Mandative; others

~ th c.

tokoro

Place

Temporal conjunction

th c.

Grammaticalization in Japanese and Korean



state-of-affairs stands in some logical relationship to something else in the linguistic or non-linguistic context.

() Kimi=no sini-ta.i wake=o kik-ase.te kure#
   you-GEN die-BOU-NPS reason hear-CAU-GER give
   ‘Tell me the reason why you want to die.’

() Watasi-wa naN=to=ka iki-ta-kat.ta=node ikkei=o aNzi, hikooki=de
   I-TOP what-QUO-Q go-BOU-VBZ-PST-CAS plan-ACC think plane-ESS
   hi~gaeri~si.ta wake=des.u#
   day-return-do-PST wake-POL-NPS
   ‘Since I wanted to go there by any means, I thought out a plan and went there on a day trip by aeroplane.’

Korean has a large inventory of nouns that can form mermaid constructions. The nominals in mermaid constructions range from those with nearly completely bleached semantics amounting to  (Kwon ) to those that show a split phenomenon, with one lexical noun with full semantic content and one devoid of such content (Ahn ; Rhee , ; Kim ). The nouns with no semantic content, hence labeled dependent nouns or defective nouns, invariably occur in mermaid constructions, as illustrated in part in Table .. (NB: The approximate dates of grammaticalization in the table are inconclusive since the semantics of lexical nouns vs. that of mermaid constructions cannot be sharply delineated.) Examples () and () show the noun cikyeng (tikyeng) ‘domain’ before and after grammaticalization.

() cikyeng=i niz-e sahom ani ho-n nal ep-te-ni
   domain-NOM connect-CONJ war NEG do-ADN day not.exist-RETR-CONJ
   ‘Since [the Wu and Han Kingdoms] shared a boundary, not a day passed without a war, and . . .’ (Nayhwun, , :b)

() yempyeng=ey keuy cwuk-ul tikyeng-i-la
   typhoid.fever-at nearly die-ADN domain-be-END
   ‘[A devout man] was nearly dead with typhoid fever.’ (Cyunyenchyemlyeykwangik, , a)

Table . Some nouns commonly used in mermaid constructions of Modern Korean

| Form | Original meaning | Grammatical function | Approximate date of grammaticalization |
|---|---|---|---|
| seym | Calculation | Copulative (equivalence) | th c. |
| cikyeng (= tikyeng) | Domain | Be in (undesirable situation) | th c. |
| nolus | Role/play | Be in (undesirable situation) | th c. |
| cham | Point in time | Proximative aspect | th c. |
| pep | Law | Deontic obligation | th c. |




..   -  The last example differs from the previous one, in that it concerns a construction that has thrived in Japanese and Korean, but not in many geographically close Transeurasian languages. One major motivation for the formation of these postpositions seems to be contact with (mostly written) Chinese, and the need to translate prepositions or verbs with prepositional function (cf. Djamouri and Paul ) from that language. With their variegated contents, only some of them corresponded to simple case particles in Korean and Japanese, so that new constructions had to be put into service to render them. This was achieved through verbs and verbal nouns assuming postpositional function. Their basic structure is represented in (). ‘PV’ stands for ‘postpositional verb’. () Kor. N=ey/ul PV+-ko/a/e Jap. N=ni/o PVb/Vb+Te Ey and ul in Korean are locative-dative and accusative case particles, respectively, and ni and o their Japanese counterparts. There are a few cases of de-verbal postpositions with different case on the preceding noun and in a different inﬂectional form that will be listed individually in the tables on each language. Given that these semigrammaticalized verbs govern the case of the noun phrase preceding it, they correspond structurally closely to adpositions in Indo-European languages. Table . lists a selection of the most common de-verbal postpositions in Japanese according to descriptions as Suzuki () and Tanaka (). Those marked with percentage symbol ‘%’ are based on a Sino-Japanese morpheme. The Chinese character is given in the next row without brackets. It is worth noticing that the postpositional verb constructions listed here are not all inherited from proto-Japanese but are the result of historical developments from Late Old (Early Middle) Japanese on. 
As mentioned above, it is reasonable to assume that the development of the class as a whole has been motivated to a large degree by the practice of transposing Chinese into Japanese. Some of these constructions (e.g. o motte, ni oite) may be entirely calques (cf. Yamada ; Chen ).

Korean has a large number of de-verbal postpositions. Table . lists a selection of the most common ones. Those marked with a percentage symbol ‘%’ are based on a Sino-Korean morpheme. As shown in the morphological breakdown in Table ., PVs typically follow a postpositional particle and are themselves followed by the NFIN markers -ko or -e/a (the latter becomes y or ye if preceded by the light verb ha-). There are cases of PVs involving Sino-Korean morphemes fitting into the general template of [case particle V-NFIN]. However, the cases listed in Table . (and a few more) constitute a unique class, in that the Sino-Korean verbs in the V position contain a monosyllabic Chinese word that is never used by itself. Besides the PVs which are similar to Japanese in their structure, Korean has a smaller group of native verb-derived particles with adposition-like functions, as shown in Table ..




T .. De-verbal postpositions in Japanese Postpositional verb

Sino-Japanese source

Meaning

Lexical source

ni atatte

(當)

in the course of

atar- (V) ‘to hit upon’

ni oite

(於)

at/concerning

ok- (V) ‘to put’ kaNs- (V) ‘be related to’

%ni kaNsite

関

concerning

%ni saisite

際

at the occasion of sai (N) ‘occasion’ following

sitagaw- (V) ‘to follow’

towards, against

tais- (V) ‘to face’

ni tuite

about

tuk- (V) ‘to attach to’

ni tuki

concerning

tuk- (V) ‘to attach to’

ni turete

accompanying

ture- (V) ‘to accompany’

ni totte

as for

tor- (V) ‘to take’

ni sitagatte %ni taisite

對

ni tomonatte ni yotte

(由、因)

o megutte

accompany

tomonaw- (V) ‘accompany’

by, because of

yor- (V) ‘to come near, depend on’

about

megur- (V) ‘to circle around’

o motte

(以)

with

mot- (V) ‘to hold’

o toosite

(通)

through

toos- (V) ‘to pass through’

T .. De-verbal postpositions in Korean Postpositional verb

Sino-Korean source

Meaning

Lexical source

ey ttal-a

according to

ttalu- (V) ‘to follow’

ey tak-a

onto

taku- (V) ‘to draw near’

ey tay-ko

to

tay- (V) ‘to touch’

from

pwuth- (V) ‘to adhere’

%ey tayha-y

對

regarding, about

tayha- (V) ‘to encounter’

%ul wiha-y

爲

for

wiha- (V) ‘to serve, take care of ’

%ey uyha-y

依

by

uyha- (V) ‘to rely on’

%ey piha-y

比

as compared to

piha- (V) ‘to compare with’

%ey kwanha-y

關

regarding, about

kwanha- (V) ‘to relate to’

%ey hanha-y

限

restricted to

hanha- (V) ‘to restrict’

lo/eyse pwuth-e




T .. Postpositional particles from DVPs in Korean Postpositional verb

Meaning

Lexical source of the verb

mac-e

even (extreme example)

mac- (V) ‘encounter’

coch-a

even (extreme example)

coch- (V) ‘follow’

ttal-a

on, at (with adversative/mirative connotation)

ttalu- (V) ‘follow’

ha-ko

with, along with

ha- (V) ‘do/say’ (light verb)

kac-ko (kaciko)

with

kaci- (V) ‘have, take’

tel-e

to (dative)

tAli- (V) ‘lead, accompany’

po-ta

than (comparative)

po- (V) ‘see’

Unlike the PVs in Table ., these postposition-like PVs do not govern case-marked nouns but directly follow the unmarked noun as suffixes. They have thus further grammaticalized into particles, serving as markers either of case or of information structuring and scalarity. This is the most prominent difference from the PVs listed in Table ., which still depend on the presence of particles such as ey or ul on the preceding noun. Some items form minimal pairs in terms of function: the same lexical item diverges in its functions depending on whether or not a postpositional particle precedes it, e.g. ey ttala and ul ttala for ‘according to’ vs. ttala for adversative, both from ttalu- ‘follow’.

Japanese has a smaller inventory of postpositions derived from original native converbs, but the examples that exist provide instructive comparisons with Korean. Among delimiter/focus particles, =sape ‘even’ is usually held to be derived from sope ‘attach, accompany-INF’. Locative =ni and dynamic locative =de < ni-te are taken to be derived from the infinitive and gerundive forms respectively of a defective copula nV-. The clause-coordinating particle =si most likely originates from the infinitive of su ‘do’, although details are obscure. Like the Korean examples in Table ., all of these verb-derived postpositions attach directly to their host, without an intervening case marker. All but the last (which dates to LMJ) are already intact by OJ. These facts again show that Japanese converbs in infinitive -i are involved in an older layer of grammaticalization. However, as shown in Table ., many of the complex postpositions (PVs) calqued from Chinese sources allow either the infinitive ending -i/-e or gerundive -te; in these cases it is the gerundive form in -te that shows more properties consistent with grammaticalization as a postposition (Ikegami ).

..   Lastly it should be mentioned that another prominent grammatical category grammaticalized proliﬁcally in Korean and Japanese—at least partially under the inﬂuence

Grammaticalization in Japanese and Korean



of Chinese, but not necessarily in other Northeast Asian or Transeurasian languages—is numeral classiﬁcation. Korean has about three dozen numeral classiﬁers, about half of which are of Chinese origin. The most widely used classiﬁer is kay (個) for individuated nonhuman objects. Other common classiﬁers include tay (臺) for vehicles and mechanical units, myeng (名) for humans, can (盞) for liquids in a glass, kwen (卷) for books, and cang (張) for sheets. Since half or more of the numeral classiﬁers in both languages are derived from Chinese, it is tempting to think that numeral classiﬁers in Korean and Japanese are the result of Chinese inﬂuence. (Note that in Chinese itself numeral classiﬁers have increased in number and obligatoriness over time; in Old Chinese, numeral quantiﬁcation was possible with bare numerals.) Within Northeast Asia, numeral classiﬁers are only marginally attested in Altaic (Janhunen ), but they are robustly present in Nivkh (Nedjalkov and Otaina ). Across these languages, speciﬁc quantiﬁed expressions (e.g. expressions for numbers of days, or numbers of persons) tend to be highly lexicalized and have no Chinese or other external source. This may indicate that numeral classiﬁers are an archaic trait in Northeast Asia, as suggested by Janhunen, best preserved in the peripheral languages Nivkh, Korean, and Japanese.

. WHAT IS SPECIAL ABOUT GRAMMATICALIZATION IN JAPANESE AND KOREAN

The preceding sections presented a number of grammaticalization processes that are common in, and thus representative of, Japanese and Korean, or even the larger language area to which these languages belong (sections .., ..), as compared with typical European languages. This section broaches three aspects of grammaticalization in Korean and Japanese that may be of some value in critically examining various theories of grammaticalization, and which, again, are not found to the same degree in typical European languages. First, both languages offer good examples for ‘reductionist’ approaches to grammaticalization (section ..). Second, both languages abound in examples of grammaticalization in the interpersonal domain (section ..). Third, a fair amount of grammaticalization in both languages has taken place under the influence of written language (section ..).

..           In traditional approaches to grammaticalization, especially those prominently espoused by Lehmann () and Bybee (a, b), grammaticalization is taken mainly as a reductive process, leading to the loss of autonomy of some linguistic units. Setting up the three parameters of weight, cohesion, and variability in a paradigmatic and syntagmatic dimension, Lehmann (: –) posited six processes of grammaticalization: () loss of integrity (weight), i.e. attrition, desemanticization, and decategorialization; () increasing paradigmaticity (cohesion), i.e. paradigmaticization; () loss of



Heiko Narrog, Seongha Rhee, and John Whitman

paradigmatic variability (variability), i.e. obligatoriﬁcation; () shrinking of the morphological scope of a sign (weight), i.e. condensation; () increase in bondedness (cohesion), i.e. coalescence (also ‘univerbation’); and () loss of syntagmatic variability (variability), i.e. ﬁxation. All these processes may have morphological, syntactic, and semantic aspects, but in Lehmann’s description, the morphological aspects are foregrounded. Bybee (a, b) espouses a concept of grammaticalization as habituation and automatization through frequent repetition, which leads to phonetic and phonological reduction, fusion, and semantic bleaching. Morphology and phonology are the areas to which the ideas of reduction and loss of autonomy can be most clearly and unambiguously applied. Heine and Reh () proposed a catalogue of related changes in African languages. With respect to phonology, they distinguish adaptation, erosion, fusion, and loss, and with respect to morphology permutation, compounding, cliticization, afﬁxation, and fossilization. Here, Japanese and Korean ﬁt the bill very well, even better than from what we know of the typical European languages in historically documented times. Morphologically, head-ﬁnal Korean and Japanese are prevalently agglutinating, and grammaticalization typically leads to the sufﬁxation of formerly independent morphemes. Assimilation and some fusion between stem and sufﬁxes is not uncommon. To start with Japanese, one can distinguish three distributional classes of sufﬁxes: (i) inﬂections (only on verbs and adjectives), (ii) particles, and (iii) derivative sufﬁxes (cf. Rickmeyer  for details). Particles are more loosely bound to stems than the other two classes. Derivative sufﬁxes do not necessarily change word class, but simply derive enlarged stems. Particles and derivative sufﬁxes can themselves inﬂect (in this case, traditional school grammar classiﬁes them as jodōshi ‘auxiliary verbs’). 
Based on historical evidence, the following cline of grammaticalization between these morpheme classes can be posited.

() word/construction > (particle) > suffix > inflection (cf. Narrog and Ohori : )

‘Particle’ is put into parentheses because this step can be (and frequently is) skipped. Two salient accompanying tendencies are, first, loss of inflection with inflecting words and, second, loss of phonological substance (attrition). Furthermore, two or more morphemes frequently fuse into one. Table . shows some examples of morphological reduction in the course of grammaticalization. While these are not the only examples of phonological erosion, fusion, affixation, etc., the number of such examples is limited, and the two politeness markers -mas- and des- are probably the two examples in Standard Modern Japanese that exhibit the greatest extent of phonological reduction. Furthermore, it deserves to be noted that significant phonological fusion and reduction are only found in verb-attached material.

Table . Examples of reductive changes accompanying grammaticalization in documented Japanese language history

| Source | Category | Meaning | Outcome | Morphological category | Meaning/function | Processes |
|---|---|---|---|---|---|---|
| mawi-ir.as- | Compound verb with suffix verb | ‘let come’ | -mas- | Suffix verb | Politeness | Phonological erosion and morphological fusion; affixation, bleaching |
| de gozai.mas- | Humble verb of existence preceded by particle | ‘be’ (humble) | des- | Particle verb | Politeness | Phonological erosion and morphological fusion; affixation, bleaching |
| ka sir-an(.u) | Verb with negative suffix preceded by interrogative particle | ‘I don’t know whether’ | kasira | Particle | Doubt (interrogation) | Phonological erosion and morphological fusion; affixation, bleaching |
| -(a)m.u | Suffix verb | Future, intention | -(y)oo | Inflection | Hortative | Morphological and phonological fusion; paradigmaticization, bleaching |
| -tar.u | Suffix verb | Resultative | -ta | Inflection | Past | Phonological erosion and morphological loss; paradigmaticization, bleaching |

In Korean, examples of rather dramatic phonological and morphological reduction accompanying grammaticalization involving multiple morphemes seem to be even more plentiful than in Japanese. Some reductive changes are listed in Table ..

Table . Examples of reductive changes accompanying grammaticalization in documented Korean language history

| Source | Category | Meaning | Outcome | Morphological category | Meaning/function | Processes |
|---|---|---|---|---|---|---|
| (a) kyesi-e; (b) -s-kuy-Ay-sy-e | (a) exist(+HON)-NFIN; (b) GEN-place-at-exist-NFIN | (a) honoured subject exists and; (b) x exists at x’s place and | kkeyse | Case marker | NOM (+HON) | Phonological erosion and morphological fusion; affixation, bleaching |
| la-ko-ha-nun-kes-un | SFP-CONN-say-ADN-thing-TOP | a thing that (people) call x is | lan | Identificational TOP | as for . . . | Phonological erosion and morphological fusion; affixation, bleaching |
| ta-ha-ko | SFP-say-CONN | says x and | -tako | CPL | that (CPL) | Phonological erosion and morphological fusion; affixation, bleaching |
| -tako-ha-n-ta | CPL-say-PRS-SFP | x says that | -tanta | Reportative EVID SFP | It is said that . . . | Phonological erosion and morphological fusion; affixation, bleaching |
| -salv-ni-ta | say-IND-SFP | (I) say that | -supnita | Formal polite DEC SFP | It is that . . . | Phonological erosion and morphological fusion; affixation, bleaching |

It is noteworthy in Table . that the honorific nominative case marker -kkeyse developed from two different sources, i.e. verbal and nominal sources (Yi :
–; Sohn ). As the reductive processes proceeded, the formal distinction became gradually lost, resulting in an identical form for an identical function in Modern Korean (see .. for more discussion). Another feature observable in the changes listed is the loss of ha- ‘say’ and -ko-ha ‘CONN-say’, a widespread change in Korean that affected hundreds of formerly periphrastic constructions (Rhee : ). When forms become eroded, the resultant string is often morphologically ill-formed—a state of affairs that prompts the language users to reanalyse it and rename its grammatical category. The question arises why phonological reduction and morphological fusion is more common in Korean than in Japanese. One possible reason is prosodic structure (cf. Schiering ), but both languages are usually classiﬁed as moraic, and therefore do not appear to differ fundamentally. The causes are probably more complex. One factor may be the chronology of language standardization. Standard Japanese (socalled hyōjun-go ‘standard language’) was ﬁxed in the late Meiji period, with the consequence that reduced forms such as the quotative particle =tte < to it-te CPL sayGER did not make it into the written standard, although they were already present in colloquial Tokyo speech in the Meiji period. In contrast, standardization in Korean has been more ﬂuid: although Korea possessed an ‘ŏnmun ilch’i 言文一致 ‘write as you speak’ movement parallel to the one in Japan, language standardization efforts in Korea continued through the s, with the constant interruption of the Japanese colonial regime, and continue to this day, with signiﬁcant divergences between the DPRK and ROK. Despite the apparent differences, one salient similarity between Korean and Japanese is the opposition between postnominal and postverbal elements. In Japanese the former have clitic status, as observed above. 
In Korean the distinction is at first blush less clear; thus the postvocalic subject marker =ka is regularly voiced [ga], like word-internal obstruents in general. But closer examination shows that postnominal particles in Korean too have clitic, not suffix status. This is shown by the contrast in (a,b):

() a. /kaps=i/ [kapʃi], [kabi]
      price=NOM
   b. /eps-i/ [ʌpʃi], *[ʌbi]
      not.be-ADV
      ‘not existing, without’

While the consonant cluster /ps/ may undergo the reduction normally found at word boundaries (and subsequent intersonorant voicing) before a postnominal particle as in (a), verbal suffixes do not allow this possibility. These facts support the view that in both Korean and Japanese, postnominal particles remain clitics, rather than suffixes. The consequence is that phonological reduction in the postverbal domain can result in the development of full-fledged inflectional morphology, while similar reduction in the postnominal domain does not. This is a salient shared property of grammaticalization in Japanese and Korean: grammaticalization has contributed to the stock of verbal inflection but nouns have remained non-inflecting.




..      () Intersubjectiﬁcation is a concept primarily espoused by Traugott (, ; Traugott and Dasher ), and while Traugott portrays intersubjectiﬁcation as much less common than subjectiﬁcation, her primary source of examples is Japanese. This is no coincidence. She writes that ‘[i]ntersubjectiﬁcation intersects less extensively with grammaticalization [. . .] It is strongly grammaticalized, in the sense of being expressed morphologically, in only a few languages, e.g., Japanese’ (Traugott : ). Furthermore, ‘genuine cases of intersubjectiﬁcation as opposed to intersubjective uses of items are hard to identify outside of languages like Japanese’ (p. ). Examples from Japanese cited by Traugott include benefactive verbs like itadak- and kudasar-, discourse-organizing adverbs like sate (Traugott ), and the Middle Japanese politeness marker sōrō (Traugott ). Honoriﬁcs in general are a fertile ground for observations on intersubjectiﬁcation. Traugott suggests that ‘Japanese and other languages with addressee-honoriﬁc systems will inevitably evidence more overt intersubjectiﬁcation than languages that do not have such a system’ (: ). One more salient class of intersubjectiﬁed grammatical markers consists of ﬁnal particles. Onodera () presented an extensive study on ne and na. Since this feature of Japanese is very well known from the extant literature, the rest of this section will focus on the phenomenon in Korean. As brieﬂy illustrated in section .., the Modern Korean honoriﬁc nominative case marker -kkeyse originated from two sources. One is a verbal origin involving the verb kyesi- ‘(an honourable person) exists’ from which the [+honoriﬁcation] feature was inherited in its grammaticalization in the th century. 
The other source is a nominal origin involving kuy ‘place’, but the honorification feature came not from this noun but from the genitive marker -s, a MidK [+HON] counterpart of the plain -uy. In the th century, -skuy, the predecessor of the MK -kkey, emerged (Hong : ). The grammaticalization from the nominal origin seems to have been motivated by the honorific feature of the genitive marker and the metonymic association of a place and a person who occupies it. The place-for-person metonymy is systematically utilized in the development of honorific address terms which involve an edifice or architectural structure associated with the honourable. For example, there are borrowings from Chinese, such as cenha ‘below the palace’ for a monarch, kakha ‘below the pavilion’ for a head of state, phyeyha ‘below the staircase’ for an emperor, ceha ‘below the mansion’ for a crown prince, etc. The x-ha ‘below x’ combination is semantically motivated, because the speaker may be prostrated before a palace building, pavilion, etc.

The most widely used honorific title suffix in MK is -nim, which was phonogrammatically represented in OK with the Chinese character 主 ‘lord, master’. MidK data show that professional titles such as wang ‘king’, pwuthye ‘Buddha’, seycon ‘Buddha’ were not suffixed with -nim, but kinship terms such as father, mother, etc. were. This suggests that the honorific suffix was first used in close familial relationships to show respect, and later spread to other areas.




An area in which intersubjectification is often attested is sentence-final particles, because they constitute the grammatical category for marking mood and modality. In Korean many clausal connectives were innovated as sentence-final particles through main-clause ellipsis (often called ‘insubordination’: Evans ). All instances of the development in this category show the intersubjectification process; as an illustration, we look at the clausal connective -ni(kka), which originally marked cause and later became a sentence-final particle when the main clause was elided. According to Rhee (), the form developed diverse functions such as cause, reason, ground, contingency, contrast, and adversativity when it was used as a clausal connective. When the form came to occur at the end of an utterance due to main-clause ellipsis, it acquired an interpersonal, intersubjective function of marking reassertion and emphasis through pragmatic inference. This illustrates well that a form can semanticize pragmatic inferences when they are frequently associated with it.

As the preceding discussion suggests, intersubjectification is a prominent feature of both languages, but the domains where it emerges are not necessarily the same. In the addressee honorific systems of both languages, the most formal level results from grammaticalization of a deferential (humble) suffix: Japanese -mas- from ma(w)ir-as- ‘go(DEF)-CAUS(DEF)-’ and Korean -supni- from the LMK object honorific suffix -sop- fused with the addressee honorific suffix -ngi-. But Korean has a far more articulated system, with four levels of addressee politeness in everyday Seoul speech, while Japanese has only two. On the other hand, in the system of benefactive verbs and their converbal extensions (see section ..), Japanese has an in-group/out-group distinction among donatory verbs which is absent in Korean.
Perhaps the most salient difference between the two languages in the domain of intersubjectivity is in the distribution of sentence-final particles. As observed above, in Korean sentence-final markers of intersubjectivity arise from the verbal system, through devices such as insubordination. In contrast, Japanese sentence-final particles such as na and ne do not arise from the verbal system. They have clitic status (like postnominal particles), and attach to nouns and postnominal phrases as well as inflected verbs (although when they attach to NPs and PPs they induce a strong prosodic break, similar to interjections). In this respect Japanese sentence-final particles more closely resemble counterparts in Southeast Asia.

..     It is generally (and correctly) assumed that the roots of grammaticalization are found in conversation, in the interaction of speaker and hearer. For example, in Traugott’s model of grammaticalization, pragmatic inferences—and especially conversational implicatures—trigger the process of grammaticalization (cf. Hopper and Traugott : –). In Japanese, however, we ﬁnd cases where grammaticalization came through written language, especially through translation, i.e. written language contact. We ﬁnd evidence of intensive study and translation activity from Chinese as early as we have documented language history—which is a trivial observation, since Japanese started out writing their language in Chinese script. However, it is unlikely



Heiko Narrog, Seongha Rhee, and John Whitman

that large sections of Japanese society immediately participated in reading and writing. This situation presumably arose in Middle Japanese, mainly through the continuous spread and pervasive inﬂuence of Buddhism. A medium for the absorption of Chinese lexica and grammatical patterns common to both Korea and Japan was the practice of hanmun hundok/kanbun kundoku, translated by Whitman et al. () as ‘vernacular reading’. According to this practice, learners of literacy were taught to read Chinese texts (most commonly aloud) in the Korean or Japanese vernacular. The following phenomena can be taken as evidence of inﬂuence from (primarily written) Chinese: • the grammaticalization of de-verbal postpositions, many of which correspond to Chinese prepositions or preposition-like verbs, already discussed in section ..; • the grammaticalization and spread of numeral classiﬁers, as discussed in ..; • the development of mermaid constructions incorporating large numbers of Chinese borrowings, mentioned in ... All three phenomena are not exclusively associated with language contact but are also supported by indigenous structures. However, many items involved are clearly borrowings or translations, and the spread of these structures in contrast to many other Northeast Asian and Transeurasian languages that do not have them is difﬁcult to explain without inﬂuence from (written) Chinese. A second group of grammaticalizations are individual adverbs and adnominals, collocations between adverbs and speciﬁc verb forms, and some phrases with grammatical function that have come into being, or have gained their current meaning and function, as translations of Chinese function words and phrases. Yamada () is a classical study of this topic (even if he does not use the term ‘grammaticalization’). Table . is a short list of grammatical words and phrases that have gained their form and function through translation. 
Other borrowings from strictly written language occurred in the late th–early th-century Meiji era, when a new standard was created to unify written and spoken language. Some of the grammatical elements of this new style that were borrowed from Sino-Japanese and pseudo-classical writing eventually made it into the spoken language through formal registers. Examples are the sufﬁxes beki for deontic necessity and rasi- for inferential evidentiality. Both sufﬁxes were productive from Old to Early Middle Japanese (be-, rasi), but then became obsolete. Modern beki has a much narrower meaning than its Old Japanese predecessor (cf. Narrog ), while rasi- has a broader meaning, which is also different. An extended use of the passive, and more frequent subject marking, in (written) Modern Japanese are attributed to the inﬂuence of translations from European languages (cf. Kinsui ). A similar state of affairs is observed in Korean. There are a number of adverbials of Chinese origin that carry grammatical function. Table . is a short list of grammatical words that have gained their form and function through translation. Most of the Chinese source words of the adverbs listed in Table . are attested in MIDK, mostly in legal and religious texts. As indicated in Table ., some adverbs

Grammaticalization in Japanese and Korean



T .. Examples of Japanese grammatical words and phrases coined through written language contact with Chinese Word/phrase

Category

Meaning/function

Chinese source morpheme

sude=ni

Adverb

already

既、已

musiro

Adverb

rather

寧

kiwamete

Adverb

extremely

極

hatasite

Adverb

really (enforcing a question)

果

subete

Adverb (/noun)

all

総

nao . . . .gotosi

Adverb . . . particle

similarity

猶

masa=ni . . . .besi

Adverb . . . particle

obligation/advice

當

-(a)zaru=o ena-

Inﬂection+particle verb

necessity

不得不

T .. Examples of Korean grammatical words and phrases coined through written language contact with Chinese Word/phrase

Category

Meaning/function

Chinese source morpheme

yeha=thun

Adverb=particle

anyway

如何

kiphil=kho

Adverb=particle

by all means

期必

cikuk=hi

Adverb=particle

extremely

至極

tangyen=hi

Adverb=particle

naturally

當然

nayci

Particle

(from x) up to

乃至

kwayen

Adverb

indeed

果然

haphil

Adverb

of what necessity

何必

sellyeng

Adverb

even if, granting that

設令

hoksi

Adverb

if

或是

are sufﬁxed with the native morphemes for adverbialization, but many of them are used without such derivational processes. This seems to be attributable to the inﬂuence of translation from Chinese texts.

. CONCLUSION

In this chapter, we have tried to present () what are typical processes of grammaticalization in Japanese and Korean, and () what are processes that may particularly contribute to the discussion of theoretical aspects of grammaticalization. For (), we



Heiko Narrog, Seongha Rhee, and John Whitman

picked out the grammaticalization of converbs, of de-verbal postpositions, and of nouns marking categories in the verb phrase. For (), we ﬁrst discussed the morphological properties of grammaticalization in the two languages, and then the high frequency of grammaticalization into interpersonal domains. Both features support extant ideas about grammaticalization rather than contradicting them. In contrast, the third point—that grammaticalizations may enter the language through writing rather than conversation—may be a challenge for ideas about grammaticalization that seek the source of grammaticalizations solely in spoken speaker–hearer interaction. It goes without saying that many more processes could have been cited, especially for (), and that the picture might be quite different if we focused on grammaticalization from a micro perspective, rather than the macro overview that was provided here. Nevertheless, we hope that some characteristics of grammaticalization in the two languages as compared with the core European languages that are often the focus of research have emerged. Some but not all of them may even be characteristic of the Northeast Asian language area and/or Transeurasian languages in general.

ACKNOWLEDGEMENTS

Narrog’s work was supported by grant no. H of the Japan Society for the Promotion of Science. Rhee’s work was supported by the research fund of Hankuk University of Foreign Studies. Whitman’s work was supported by the Laboratory Program for Korean Studies through the Ministry of Education of the Republic of Korea and the Korean Studies Promotion Service of the Academy of Korean Studies (AKS--LAB-).

10 Grammaticalization processes in the languages of South Asia

ALEXANDER R. COUPE

. INTRODUCTION

This chapter addresses some patterns of grammaticalization in a broad selection of languages of South Asia, a region of considerable cultural and linguistic diversity inhabited by approximately . billion people living in eight countries (Afghanistan, Bangladesh, Bhutan, India, Nepal, Maldives, Pakistan, and Sri Lanka) and speaking  known languages (Simons and Fennig ). The primary purpose of the chapter is to present representative examples of grammaticalization in the languages of the region, a task that also offers the opportunity to discuss correlations between the South Asian linguistic area and evidence suggestive of contact-induced grammaticalization. With this secondary objective in mind, the chapter intentionally focuses upon processes that either target semantically equivalent lexical roots and constructions or replicate syntactic structures across genetically unrelated languages.

The theoretical concept of ‘grammaticalization’ adopted here is consistent with descriptions of the phenomenon first proposed by Meillet (), and subsequently developed by e.g. Givón (a), Lehmann (), Traugott and Heine () and papers therein, Bybee, Perkins, and Pagliuca (), Heine, Claudi, and Hünnemeyer (a), and Heine and Kuteva (, ). In accordance with this preceding work, grammaticalization is viewed as a historical process in which lexical morphemes, or constructions involving lexical morphemes, are gradually bleached of their precise semantic specificity and develop more abstract grammatical meanings that permit the conventionalization of their use in a potentially widening range of functional domains.
This process is usually complemented by some phonetic erosion of the grammaticalized morpheme(s), but a reduction in phonological bulk may not necessarily accompany the shift from a concrete lexical meaning towards a more abstract grammatical meaning; both the grammaticalized element and its lexical source(s) may coexist with an identical form for an extended period of time, thereby giving rise to ambiguous interpretations of meaning. For example, the light verbs of Indo-Aryan languages are phonologically identical in form to their lexical roots, despite their grammaticalization processes being of considerable antiquity.

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Alexander R. Coupe . First published  by Oxford University Press

The chapter proceeds as follows. Section . introduces the reader to the languages of South Asia, discusses their present geographical distributions, and briefly outlines their typological profiles. Section . considers the factors that contribute to South Asia being recognized as a linguistic area. Also addressed here are some of the problems a researcher may face in deciding if a particular grammaticalization pattern is an independent phenomenon, or one induced by contact. Section . investigates the lexical sources of some body-part nouns, the trajectories by which they have developed as grammatical morphemes encoding case relations and other functional meanings, and how some have additionally developed clause-linking functions. Sections .–. examine instances of lexical verbs that have grammaticalized various valency-modifying, aspectual, and modality meanings from compounds, and section . examines the relative–correlative construction of South Asia and considers whether its wide distribution could be due to contact-induced grammaticalization. The chapter concludes in section . with a discussion of the findings and the implications for establishing grammaticalization patterns in linguistic areas.

. THE LANGUAGES OF SOUTH ASIA

South Asia is home to representatives of at least six major linguistic stocks: Austroasiatic (Munda languages in eastern peninsular India, and Khasian languages in Northeast India), Dravidian (principally the south of peninsular India and northern Sri Lanka, plus one outlier in Baluchistan), the Indo-Iranian branch of Indo-European (namely the Indo-Aryan, Iranian, and Nuristani languages of the northern half of the subcontinent, plus Sinhala and Dhivehi, spoken in Sri Lanka and the Maldives respectively), the Tibeto-Burman languages of the Sino-Tibetan family (spoken predominantly in the Himalayas and the peripheral hill states of Northeast India), and the Tai branch of Tai-Kadai (eastern Assam and Arunachal Pradesh, Northeast India, not shown in the map in Fig. .). To this array we might add the Great Andamanese and Ongan (a.k.a. Angean) language families of the Andaman Islands. Although these languages are geographically located somewhat closer to mainland Southeast Asia, they are reported to have a head-final constituent order as well as a retroflex series of consonant phonemes that cannot be attributed to language contact (Abbi ). This suspiciously links them to many language families of the subcontinent, and distinguishes them substantially from the typologically very different languages of Southeast Asia. Masica (: ) therefore ponders whether they could be the remnant of an ancient substratum formerly located on the Indian mainland. Lastly, there is a handful of language isolates, such as the Burushaski language of the Hunza and Gilgit districts in northern Pakistan, the Kusunda language of western Nepal, and (if it is still spoken) the Nihali language of central-west India. Collectively these languages offer a veritable smorgasbord of typological features and profiles, but also some interesting commonalities, particularly with respect to shared grammaticalization patterns and structural convergence.

Grammaticalization in South Asia



[Map legend: Indo-Aryan languages; Iranian languages; Nuristani languages; Dravidian languages; Austro-Asiatic languages; Sino-Tibetan languages; Unclassified/language isolate]

Fig. .. Distribution of South Asian language families. Nihali, Kusunda, and Tai-Kadai languages are not shown (adapted from A Historical Atlas of South Asia, Oxford University Press, )

The majority of South Asian languages have a AOV/SV constituent order and demonstrate typological characteristics associated with head-ﬁnal languages, as outlined by Greenberg (), such as genitive–noun and relative–noun order, postpositions, a dominant tendency for sufﬁxal morphology, main verbs preceding auxiliary verbs, and standards of comparison preceding adjectives. The vast majority employ dependent marking at the level of the clause, and most also demonstrate the indexing of one or more arguments on matrix verbs. Word formation is typically agglutinative and synthetic, with the exception of Indo-Aryan languages, which demonstrate a moderate degree of fusion, a feature consistent with their Indo-European roots.



Narrative chaining via converb constructions (a.k.a. conjunctive participles or gerunds) characterizes clause linkage patterns, and was initially proposed as a key typological feature identifying South Asia as a linguistic area (e.g. Emeneau ), before subsequent work by Masica () established the fact that similar converbal clause chaining patterns extend well into the ‘Indo-Altaic’ area of Central and Far East Asia, the Horn of Africa, and even into parts of Europe. The only languages observed to deviate substantially from the general South Asian typological proﬁle are the head-initial Khasian languages of Meghalaya state and adjacent regions of Bangladesh. These languages also demonstrate typological features that accord with Greenbergian universals, and thus have opposite orders to those outlined above for the head-ﬁnal languages of South Asia. The relatively recently arrived Tai languages of eastern Assam and Arunachal Pradesh similarly conform to a head-initial typology, have isolating word-formation typology, and are tonal, in common with their Tai relatives in Southeast Asia. According to research by Morey (), Greenberg’s characterization of Khamti (a Tai language of Assam) as being an exceptional AOV/SV language with prepositions was inaccurate. He concludes that the basic constituent order is AVO/SV, but that verb-ﬁnal structures are possible under certain pragmatically deﬁned circumstances, and he proposes that language contact with Assamese and other verb-ﬁnal Tibeto-Burman languages of the region may have played a role in variant orders. Some Tai languages have also developed postpositional (anti-agentive) marking on O arguments (Morey : –), possibly due to the areal inﬂuence of neighbouring Indic and Tibeto-Burman languages. 
Sources of data on Indo-Aryan languages are substantial, extending back in written form to the middle of the second millennium , and Dravidian written sources in Tamil, Telugu, Kannada and Malayalam are extant from the early centuries of the Christian era (Southworth : , ). These textual sources are extremely valuable for investigating the diachrony of grammaticalization phenomena. With the exception of Tibetan (with the earliest written records dating from the eighth century ) and the Ahom buranjis, which chronicled life and governance in the Ahom kingdom (– ) and were initially written in the Ahom script, the other languages of South Asia are unwritten.¹ As Heine (Chapter  this volume) notes, an investigation of grammaticalization phenomena in unwritten languages must therefore rely upon internal reconstruction, historical reconstructions, and typological considerations. Despite the limitations, it is still possible to reveal a good deal of evidence for grammaticalization using a combination of these techniques.

. SOUTH ASIA AS A LINGUISTIC AREA

Heine and Kuteva (: ) recognize three types of linguistic area: () those established by the presence of a shared set of linguistic features; () those in which the

¹ Minor exceptions are the Tibeto-Burman languages Newar, Lepcha, Limbu, and Meiteelon, for which writing systems were developed at different times within the past millennium.

languages share a high degree of mutual intertranslatability; and () those that share the same processes of grammaticalization (and thus form a grammaticalization area). As they note, these types are not mutually exclusive, and all three criteria arguably apply to the Sprachbund of South Asia. Since Emeneau () and later work by Masica (), the subcontinent of South Asia has been recognized as a linguistic area in which particular linguistic features have diffused across the genetic boundaries of unrelated languages as a consequence of longstanding stable multilingualism. Emeneau (: ) deﬁnes a linguistic area as ‘an area which includes languages belonging to more than one family but showing traits in common which are found not to belong to the other members of (at least) one of the families’. One prominent trait is the presence of a phonemic contrast between dental and retroﬂex consonants in South Asian languages. Retroﬂex plosives can be reconstructed to Proto-Dravidian and have spread into Indo-Aryan (with the exception of Assamese),² but are extremely rare or nonexistent in other branches of Indo-European.³ Written records from Middle and New Indo-Aryan demonstrate an increasing occurrence of retroﬂex consonants over time, which Emeneau (: ) holds to be a clear demonstration of the ‘Indianization’ of the Indo-Aryan branch of Indo-European. Retroﬂex consonants are also found in Munda, Burushaski, and Tibeto-Burman languages of South Asia in contact with Indic languages (but more rarely in related languages outside of the South Asian Sprachbund),⁴ so this distribution is plausibly attributed to convergence that began with the diffusion of Dravidian loanwords containing retroﬂexes into Vedic Sanskrit (Kuiper : ). 
The languages of Kupwar village are a celebrated example of how six centuries of stable multilingualism have led to what is ostensibly a single grammatical template being employed for three distinct local varieties of Urdu, Marathi, and Kannada, as demonstrated by the data of (). Gumperz and Wilson (: ) observe that this has resulted in ‘a gradual adoption of grammatical differences to the point that only morphophonemic differences (differences of lexical shape) remain’.

² The retroflex and dental plosive series common to all other New Indo-Aryan languages appear to have settled on an articulatory compromise in Assamese, resulting in a single alveolar series of plosives. Because a dental~retroflex contrast is also represented in the Assamese orthography and in the phonological inventories of related languages, it is assumed that a phonological contrast must have been historically present at an earlier stage of the language (Mahanta : ). Bilingualism in Assamese varieties used as lingua francas in Northeast India may have contributed to a simplification of the Assamese phonological inventory. For example, it has been noted that when people converse in Nagamese, the Assamese-based lingua franca of Nagaland, they simply use their L phonology (Sridhar ; Burling ).

³ Retroflex consonants reported in North Germanic languages (e.g. Hamann : –) appear to be attributable to relatively recent mergers involving rhotics and stops, and thus have no bearing on the value of retroflexion as a defining feature of South Asia as a linguistic area.

⁴ Arsenault (: ) finds that the retroflex consonants of Sino-Tibetan languages spoken in China are typologically divergent from those of South Asia, and Matisoff (: ) proposes that retroflexes in Tibeto-Burman languages are secondarily derived from proto-clusters with medial liquids. This explanation accounts for the presence of a retroflex plosive phoneme with a rhotacized release in Sangtam, a Tibeto-Burman language of central Nagaland (Coupe, in prep.). Retroflex consonants appear to be exclusive to Austroasiatic languages spoken in South Asia (e.g. Jenny, Weber, and Weymuth ).

 ()

Alexander R. Coupe Kupwar Village (Maharashtra): (a) Urdu, (b) Marathi, and (c) Kannada a. pala jɔra kat ̣-ke le-ke a -Ø -ya b. pala jəra kap-un ghe-un a -l -o c. tapla jəra khod-i təgond-i bə -Ø -yn greens a.little cut- take- come (-) - ‘I cut some greens and brought them.’ (Gumperz and Wilson : )5

Anderson (: –) remarks on the anomaly that, despite an extended period of coexistence of speakers of Munda and Indo-Aryan languages, Sanskrit and Middle Indic texts demonstrate no evidence of borrowing from Munda, even for plant or animal names. Similarly, Thomason and Kaufman (: ) propose that Dravidian structural interference in Indic involved minimal lexical transfer. Since it is generally assumed that lexical borrowing normally precedes structural borrowing (e.g. as implied by Comrie’s (: ) suggested implicational universal for borrowability), this paradox can perhaps be explained with recourse to the following sociolinguistic considerations. It is likely that autochthonous Munda speakers typically occupied a marginalized and lowly socioeconomic position in the Hinduized society of Vedic South Asia, just as the tribal (ādivāsī) people generally still do in modern India. In support of this assumption, Southworth (: ), citing Thapar (), mentions the contempt expressed in Rigvedic hymns for non-Hindu indigenous people, their religious beliefs, and their languages. With the prevalence of such pejorative attitudes directed towards the indigenous cultures and languages of South Asia, it would be highly improbable for Munda lexical items to be borrowed by a superstrate language. Asymmetrical sociopolitical relationships between Indo-Aryan invaders and the conquered also possibly accounts for the paucity of old Dravidian loan words in Indic (Thomason and Kaufman : ). As lexicon is highly emblematic of caste or clan membership throughout South Asia, its important role in the identiﬁcation of one’s social afﬁliation potentially makes it quite resistant to borrowing anyway.⁶ Speakers instead tend to reduce the cognitive burden of needing to speak multiple languages in their daily lives by minimizing structural differences via convergence while maintaining more overt inter-group lexical differences.

⁵ The original glossing and interlinearization has been modified to more accurately represent the grammatical categories of these examples.

⁶ The Brahui language of Baluchistan clearly presents a counterexample to this statement. This Dravidian outlier was initially thought to be an Indo-Aryan language because of the preponderance of lexical items of non-Dravidian origin (Southworth : ). However, just as a speech community might eschew lexical borrowing to outwardly maintain a caste distinction, so too might it actively borrow vocabulary to hide one, particularly if there is social pressure to assimilate to the dominant culture or language of a region. Assimilation in Baluchistan was facilitated by intermarriage between tribes resulting in a high degree of bilingualism (Emeneau : ), and this must have facilitated lexical borrowing, much of which was probably unidirectional. See Bashir (: –) for further discussion of Brahui–Baluchi convergence.

OUP CORRECTED PROOF – FINAL, 22/9/2018, SPi

While it is clearly demonstrated that languages in sustained contact can converge in structure, as amply suggested by the data of (), it has also been proposed that grammaticalization patterns may be sensitive to language contact (e.g. Heine and Kuteva ). However, proving that a particular grammaticalization pattern results from language contact is an endeavour potentially beset by a number of uncertainties that may complicate the picture. These are outlined below.

First, assuming that plausible criteria can be established for identifying a linguistic area, how does one decide if a particular grammaticalization shared by two or more languages in a multilingual contact zone is unambiguously a consequence of language contact? Obviously if a certain grammaticalization process is found only in languages located within the linguistic area, and the pattern is also unattested in related languages outside of that convergence zone, then this might be taken as convincing evidence that a borrowed conceptual schema has resulted in a shared grammaticalization. The possibility that such a shared pattern could be contact-induced is further suggested by the observed grammaticalization and its outcome being crosslinguistically rare. To illustrate, one of the processes to be discussed in this chapter concerns the grammaticalization of a conative modality meaning from a compound involving a verb of perception (typically ‘look’ or ‘see’) in a number of unrelated South Asian languages (see section .). Now, this could very well constitute a case of contact-induced grammaticalization across genetically unrelated languages within the contact zone, as it is a pattern not widely attested in the languages of the world. Conative meanings are reportedly more likely to be associated with imperfective aspect, or with different semantic classes of verbs and verbal constructions (e.g. ‘try’, ‘obtain’, ‘taste’, ‘go’ + ,  + light verb ‘do’), and in some languages conativity may instead be expressed via partitive or dative case marking on an O argument (see Vincent ). It is also noteworthy that the development of conative markers from verbs of perception appears to cluster in regions with high linguistic diversity, further suggesting that a conceptual schema can diffuse through contact. For example, Foley (: ) reports that this is an almost universal grammaticalization pattern in the Papuan languages of New Guinea:

()  Asmat (Asmat Family), Papua Province, eastern Indonesia (Drabbe )
    yitim-por
    arise-see
    ‘try to awaken somebody’

()

Barai (Koiarian Family), Papua New Guinea (Olson : ) akoe ga throw see ‘try throwing it’7

⁷ Foley () glosses Barai ga as ‘see’. The source glosses ga as ‘look’ and translates the example as ‘Try throwing it and see’. The meaning of ko as a main verb in Hua is translated by Haiman (: ) as ‘see, look’.

()

Hua (Yagaria, East Central Highlands Family), Papua New Guinea (Haiman : ) ke hu ko-mana talk do see:- ‘I tried to talk (but to no avail)’

()

Yimas (Lower Sepik Family) Papua New Guinea (Foley : ) na-mpᵼ-kwalca-tay-ntut .-3.-arise-see-. ‘they both tried to wake him up’

The targeting of a verb of visual perception for the grammaticalization of conative modality in both New Guinea and South Asia may be due to a universal cognitive representation of human experience, but could additionally be the consequence of contact-induced grammaticalization resulting from such a diffusing cognitive schema. These two causal factors are not necessarily incompatible in a linguistic area (e.g. Heine and Kuteva : ). Certainly the concentration of this grammaticalization pattern in known contact zones with a high incidence of linguistic diversity, its relative rarity, and the lack of a random distribution of conativity based on a verb of perception across the languages of the world all lead to the conclusion that such a clustering is highly unlikely to be due to chance.⁸ Conversely, if a particular grammaticalization pattern is also found in languages more widely as well as in the contact zone, then that phenomenon might be justiﬁably attributed to parallel developments known as ‘drift’ (e.g. Sapir ; Robbeets and Cuyckens ). A globally attested example of this is the grammaticalization of postpositions from nouns in OV languages (e.g. Aristar ). Another is the development of indeﬁnite articles from the numeral ‘one’ (Robbeets and Cuyckens : ). Even if most of the languages of South Asia also happen to demonstrate a propensity to grammaticalize postpositions from relational nouns and indeﬁnite articles from the numeral ‘one’, the ubiquity of these patterns in hundreds of languages around the globe underscores the supposition that both processes represent universal grammaticalization pathways that could just as possibly arise independently in languages that happen to share a linguistic area. 
With these caveats now stated, the chapter will proceed to describe some grammaticalization processes observed to occur in a selection of South Asian languages, and where the evidence is sufﬁciently convincing, some of these developments will be identiﬁed as plausibly resulting from language contact.

⁸ I am grateful to Heiko Narrog for comments that helped to clarify this discussion.

. GRAMMATICALIZATION OF RELATIONAL MORPHOLOGY AND METAPHORICAL EXTENSIONS

Relational nouns denoting body-part terms and spatial locations are an especially rich source of case marking and converbal morphology in South Asian languages.

These typically develop out of compounded genitival [N₁- N₂] or appositional [N₁ N₂] constructions, in which the head noun N₂ is originally a body-part term or a noun denoting a spatial location. The compound’s head loses its lexical status as it undergoes semantic bleaching, eventually permitting it to function as a grammatical morpheme encoding a purely relational meaning in this type of construction. Compounds involving a genitival sufﬁx may initially retain the genitive in the ensuing grammaticalization of the postposition, thus revealing the diachronic origin of the grammaticalized collocation and giving rise to constructions such as the Hindi comitative N-ke sāth (from the noun sāth ‘company, society’), and the postessive N-ke pīche (from the noun pīchā ‘rear part, hindquarter’), e.g. parivār-ke sāth ‘with (the) family’, ghar-ke pīche ‘behind (the) house’. Structurally similar examples of case compounding involving genitival morphemes in Bodic languages of the Himalayan region are discussed in Noonan (). The following subsections describe some grammaticalization processes involving body-part nouns in Indo-Aryan and Tibeto-Burman languages. It will be demonstrated how, once a noun has grammaticalized as an oblique relational form, it can then be extended to even more abstract morphosyntactic functions in the grammar, such as non-ﬁnite and ﬁnite clause linkage.

.. ‘, ’ >  An oblique case-marking postposition has grammaticalized from a construction involving a relational noun with the meaning of ‘armpit, side, ﬂank’ in a number of South Asian languages. According to Beames ([]: ) and Chatterji ([]: ), the lexical source of the dative marker found in New Indo-Aryan languages—e.g. Hindi ko, Bangla ke and Oriya ku—is the Middle Indo-Aryan locative declension of the Sanskrit noun kakṣe (armpit...) ‘in the armpit’. Elaborating on the observations of these and other Indic scholars, Reinöhl (: –) proposes that the starting point of the grammaticalization would most likely have been ‘side (of the body), ﬂank’, as a metaphorical extension of the meaning of kakṣa-. The ﬁrst attested uses of ko as a dative/accusative marker are from Old Urdu/ Panjabi texts dating from the twelfth and thirteenth centuries (Butt and Ahmed : , cited in Reinöhl : –). By , ko was used in Hindi to mark both abstract and concrete goals. ()

Hindi (New Indo-Aryan, circa ) (Butt and Ahmed : –)⁹ poãco-ge a. ɪs manzɪl ko kab this destination / when reach:: ‘When will (you) reach this destination?’ b. apne haq ko poãc kar self right / reach having ‘having attained one’s right’

⁹ Glosses have been adjusted in these and the following examples for consistency.

The grammaticalization trajectory from body-part noun expressing a spatial location to a postposition encoding goals and recipients, and finally to marking a core argument in O function, follows the expected pathway to an increasingly more abstract function that characterizes the evolution of grammaticalized morphemes. The metaphorical shift to marking certain types of core arguments in O function in Indo-Aryan languages is of particular interest, because it is only obligatory when the referent of the O argument is human or highly referential. LaPolla (, b: –) suggests that the primary motivation for this type of case marking initially is the disambiguation of semantic roles. His observations are made with respect to pragmatically motivated relational marking in Tibeto-Burman, but they are just as relevant to Indo-Aryan: when there is no potential confusion as to which argument is the agent, there is consequently no requirement for overt marking, whereas the presence of two possible (especially human) agents in the sentence requires disambiguation via some morphosyntactic means. This is achieved via dative marking on the O argument in a grammaticalized extension of the older locative marking function.

Rather than extending the discriminatory marking to the patient, another option is for languages to instead mark the agent. This is a quite common pattern in Tibeto-Burman languages (see e.g. LaPolla a; Noonan ). All but one member of the Ao group languages of central Nagaland have a syncretic postpositional clitic nə~na that is used to mark the agentive and instrumental cases. The Ao dialects of this sub-grouping are unique in additionally marking the allative case with the same form, and this constitutes a previously unattested agentive/instrumental/allative syncretism in the languages of the world (Coupe ).
The same morpheme is also recognizable in the ablative forms of the Ao group languages, all of which have originated from old appositional N₁ N₂ compounds that have been subjected to cycles of grammaticalization as relational morphemes over time.¹⁰ The examples of () respectively demonstrate the agentive, instrumental, allative and ablative functions of this isomorphic form in Mongsen Ao. ()

Mongsen Ao (Tibeto-Burman), Nagaland (Coupe a: –)
a. mətʃatshə̀ŋ nə pùŋì tʃu tsə̀ŋ
   mətʃatshə̀ŋ nə pùŋì tʃu tsə̀ŋ
   wild.pig spear.
   ‘Mechatseng speared the wild pig.’
b. təɹ məzəʔ nə ɹuŋukù tʃì
   tə̀-əɹ məzəʔ nə ɹuŋ-ukù tʃì
   thus- fire burn-
   ‘And, [he] cleared [the field] with fire.’
c. . . . təpaʔ taŋ nə waɹ,
   tə-paʔ taŋ nə wa-əɹ
   -father go-
   ‘ . . . after going to the father . . . ’
d. nuksənsaŋpaʔ áhlù phinə tʃhuwaɹ
   nuksənsaŋ-pàʔ a-hlú phinə tʃhuwa-ə̀ɹ
   - -field emerge-
   ‘Noksensangba returns from the field.’

¹⁰ It is common in Tibeto-Burman languages for an ablative marker to have the form of a dimorphic agentive/instrumental+locative compound. For examples in the Ao group and beyond, see Coupe () and references therein.

In common with the New Indo-Aryan languages discussed above, the most credible diachronic source for the agentive/instrumental/allative case marker is a lexical noun reconstructed to Proto-Tibeto-Burman as *ʔ-nam ‘side/rib’ (Matisoff : , ). The root of a cognate Chungli Ao form tena is defined by Clark (: ) as ‘side at the waist where are no ribs’, a meaning that seems wholly consistent with ‘flank’. This would have initially grammaticalized as a locative marker *na in Proto-Ao, probably in much the same way as the dative/accusative marker of New Indo-Aryan initially evolved from a body-part noun with a locative meaning.¹¹ The metaphorical extension to marking a core argument in A function is similarly motivated by the need to disambiguate semantic roles when this is pragmatically motivated, analogously to the way that dative marking becomes obligatory in Indic languages when an O argument is human or highly referential and thus could be mistaken for the A argument. Agentive marking in Tibeto-Burman languages is especially likely to appear when a non-canonical ordering of arguments places the A argument in non-initial position in the sentence. What makes the grammaticalization of this syncretic form intriguing is its metaphorical extension from what must have originally been a locative marking function to marking an agent, as locations and agents seemingly have very little in common semantically. However, it is probable that at the earliest stages of grammaticalization, Proto-Ao *na was a semantically underspecified oblique postposition that could be used for marking instruments and sources in addition to goals, and it was the instrumental meaning that was targeted and extended to marking a core argument.
The semantic link shared by agents and instruments is that both are in some sense effectors of actions, so once nə̄ had come to mark instruments, it would have been only a small metaphorical step to extend the marking to agents. In this respect Mongsen Ao accords with Narrog’s () revision of Heine et al. (a), which proposes that in instances of case polysemy, instrument marking precedes agent marking in the diachrony of grammaticalization chains. The path from an oblique case to a core syntactic case is also consistent with the cross-linguistically valid observation that grammaticalization chains evolve increasingly abstract categories of grammar (Givón a; Heine and Kuteva , ; Bybee et al. ).

¹¹ Some Kuki-Chin languages of the northeastern region have an agentive/instrumental form in or na; Meiteelon (a.k.a. Meithei, Manipuri) and Tangkhul similarly have a form nə (LaPolla a). The agentive/instrumental markers of Meiteelon, Tangkhul, and a number of Kuki-Chin languages are all suspiciously similar to the reconstructed Proto-Ao form *na. Reflexes could have been genetically inherited from a common intermediate proto-language, although borrowing under an intense contact situation cannot be ruled out; consider the case of Chungli Ao, which has borrowed an agentive/instrumental marker from its Konyak neighbour Chang (Coupe b: –).




Alexander R. Coupe

It is noteworthy that a lexical noun tāŋī, also meaning ‘side’, is currently undergoing a new cycle of grammaticalization in Mongsen Ao. An example is given in (c) above. Its phonetically reduced grammaticalized form tāŋ () is obligatorily used to mark NPs representing human goals of movement and speech. It appears to have grammaticalized relatively recently as a new postposition, as it is sometimes determined by a demonstrative, just as a lexical noun would be, and its noun phrase is also additionally case-marked by the older allative form nə¯. ()

Mongsen Ao (Tibeto-Burman), Nagaland (Coupe a: ) tāŋ̊āɹ tʃū nə¯ tə¯pāʔ khə¯ tə¯jā nə¯t tāŋ tʃū nə¯ wāɹ, . . . tāŋ̊āɹ tʃū nə¯ tə¯-pāʔ khə¯ tə¯-jā nə¯t tāŋ tʃū nə¯ wā-ə¯ɹ other   -father  -mother two    go- ‘Others went to the mother and father, . . . ’ (   ,   In Mongsen Ao, a possibly unique but entirely plausible grammaticalization of Proto-Tibeto-Burman *lak ‘arm, hand’ has undergone metaphorical extension to a relational meaning initially expressing ‘terminal part in space’ in N₁–N₂ compounds, and then extending to a meaning of ‘terminal part in time’ in verb stems. There is little doubt that these two grammaticalized meanings stem from the same source, as both express an identical meaning, the only difference being that one is situated in the dimension of space, the other in the dimension of time. () Mongsen Ao (Tibeto-Burman), Nagaland (Coupe b: ) a. tə¯-mījūŋ-lāk -ﬁnger- ‘ﬁngertip’ b. sə` ntūŋ-lāk tree- ‘apex of tree’ c. tə¯-mī-lāk -tail- ‘tail tip’ d. mī-lak ﬁre- ‘ﬂame’ e. tə` ākī tʃū tʃhà . . . tʃhālākəɹ, tə` ā-kī tʃū tʃhà . . . tʃhà-lāk-əɹ thus -house  make make-- ‘Thus, having ﬁnished (narrator stutters) building [his] house . . . ’

. ‘SEND’, ‘GIVE’ > MORPHOLOGICAL CAUSATIVE The inherent lexical semantics of ‘give’ and ‘send’ makes them common targets for grammaticalization as causative morphemes. Masica (, : ) singles out morphologically marked causative verbs as one of the deﬁning features of South Asia as a linguistic area, and therefore his observation invites a closer inspection of their lexical sources in unrelated South Asian languages to establish what they may have in common. In some South Asian languages (especially Indo-Aryan, Munda, and Dravidian), causative morphemes are thought to develop out of ‘explicator’ compound constructions, in which a non-ﬁnite converbal or absolutive verb form is compounded with a clause-ﬁnal main verb (e.g. Masica ; Hook ). But this is not necessarily the


case for all languages of South Asia, as the following data will demonstrate, and collocations of verb roots present another credible pathway for their grammaticalization as valency-modifying morphemes. Lexical verbs expressing similar semantics are also frequently grammaticalized as causative morphemes in languages spoken in regions extending beyond South Asia, especially in Southeast Asia (e.g. Matisoff ). Jenny (: ) proposes that the development of equivalent grammaticalized meanings from the same lexical source over such an expansive geographical region is suggestive of either language contact situations in the past, or possibly a chain of contact situations, or language-internal developments involving a shared cognitive schema. Each of these scenarios is equally plausible, but establishing precisely which is the most likely reason for a particular grammaticalization pattern occurring in a convergence zone is a potentially challenging task, as noted earlier. It is also possible that all these factors may contribute in some way to facilitating a particular lexeme’s grammaticalization as a functional morpheme. A verb with the meaning of ‘send (on an errand, entrust with a commission)’, ‘make’, or ‘give’ is extensively found to grammaticalize as a causative morpheme in Tibeto-Burman languages. LaPolla (: –) views this development as an instance of ‘drift’ in a Sapirian sense (Sapir : ), as many of the lexical forms that grammaticalize as morphological causatives are demonstrated to be non-cognate in Tibeto-Burman languages. This is strongly suggestive of a causative cognitive schema that is associated with the semantics of such verbs, and which facilitates their metaphorical extension as grammaticalized morphemes. In the examples of () presented below, the grammaticalized use of Mongsen Ao zə` k ‘send’ as a sufﬁx -zək encoding causative-related meanings in (a,b) is contrasted with the main verb usage of a cognate form in (c). 
The causative morpheme is segmentally identical but carries a different tone, which is a common corollary to the grammaticalization of many functional morphemes in this language. () Mongsen Ao (Tibeto-Burman), Nagaland (Coupe a: , , ) a. təɹ, tshə` luŋla nə asìjukzəkpàʔ sə níʔ la phàɹajùʔ sə tə` -əɹ tshə` luŋ-la nə asìʔ-juk-zək-pàʔ thus- fox-  deceive--- . níʔ la phàɹaʔ-ì-ùʔ one.day  catch-- ‘And, the fox that deceived [him] one day will be caught.’ (lit. ‘ . . . caused him to be deceived’) b. tə` tʃhàku mitəm nə pi tshə` màzək mitəm nə pi tshə` -mà-zək-Ø tə` -tʃhà-ku thus-do-. pestle   pound--- ‘And then, this [cane] was split by the pestle.’ c. kiphuɹ nə áwkla khə ajila nət áhlu nə zə` k khə a-ji-la nət a-hlú nə zə` k-Ø. kiphuɹ nə a-úk-la owner  -pig-  -dog- two -ﬁeld  send- ‘[The] owner sent his pig and his dog to his ﬁeld.’




In Kham, an obviously non-cognate grammaticalized form of the verb pərĩː- ‘to send’ is used in a periphrastic causative construction when the causation is indirect, while an identical form continues to be used synchronically as a main verb. The periphrastic causative construction with pərĩː- must be used when the causee retains volitional control over the caused event (e.g. see a,b). This usage contrasts with a morphological causative -se, which is obligatorily used when there is direct causation (as in c). () Kham (Tibeto-Burman), W. Nepal (Watters : –) a. no-lai dõːh-wo ŋa-pərĩː-ke him- run- -- ‘I made him run.’ b. o-zaː-lai syãː-wo pərĩː-ke-o -child- sleep- -- ‘She made her child go off to sleep.’ c. baza-rə ya-sə-buhr-ke-o bird- --ﬂy-- ‘He ﬂushed the birds. (lit. ‘made them ﬂy’)14 An identical lexical source for a morphological causative is also reported in the Austroasiatic language Khasi, spoken in the West Khasi Hills of Meghalaya state, Northeast India. According to Temsen and Koshy (: –), the causative function of phaʔ- (one of two morphological causatives in the language) has grammaticalized from a lexical verb phaʔ, also expressing a core meaning of ‘send’. Furthermore, as in Kham, grammaticalized phaʔ- is restricted to use in causativized clauses in which the causee retains control over the caused event, i.e. when there is indirect causation. The grammaticalized causative morpheme is contrasted with the main verb usage in (a,b). () Khasi (Austroasiatic), Meghalaya ya-i-khɨlluŋ a. u-jɔn u-phaʔ-thyaʔ -John :--sleep -n-child ‘John made the child sleep.’ (Temsen and Koshy : ) b. 
ša ka iyeng ki-n sa phaʔ  : house -  send ‘To the house they will send.’ (Nagaraja : )¹⁵

The form phaʔ ‘send’ also occurs frequently as a lexical verb in the closely related Khasian language Pnar, but a search of an extensive corpus of narrative texts reveals that it has not grammaticalized as a morphological causative synchronically (Hiram Ring, pers. comm.). Nevertheless, its participation as V₁ in V₁V₂ quasi-compounds such as phaʔ sumar in (b) suspiciously makes these constructions syntactically identical to verb complexes involving the Khasi  causative, as demonstrated by (a), and this structure is probably the prelude to the grammaticalization of a causative meaning.

¹⁴ Glosses have been added to the example.

¹⁵ Glosses have been added to this example.


() Pnar (Austroasiatic), Meghalaya (Ring a: ) a. man m̩ je ki jaitʃaʔ u-ðiʔ kat-kam become  able  be.patient -drink as-like ka=kɔs wa da phaʔ i =course   send  ‘so they have no patience to drink (take medicine) according to the course that we (doctors) sent’ b. tɛ ka wa sumar ŋa, ka wa paʲt ̪  ::  take.care : ::  look ja ŋa, ka wa e ʤa ja ŋa,  : ::  give rice  : ki=sistar phaʔ sumar send take.care =sister(nun) ‘so (it was) she that cared for me, she that looked after me, she that gave me food and (she) was sent to care (for me) (by) the sisters’ (Ring b, Pnar_Language_Archive.FPAHM_) The causative  construction logically develops out of a compounded structure in which the head—initially a lexical verb meaning ‘send’—undergoes the process of grammaticalization and develops causative semantics as a metaphorical extension stemming from the lexical meaning of ‘send’. As the Khasian languages are headinitial and Tibeto-Burman languages are head-ﬁnal, it stands to reason that the grammaticalized causative morpheme is a preﬁx in Khasi and a sufﬁx in Mongsen Ao. The status of pərĩː- as the semi-grammaticalized head of its own predicate in Kham similarly reﬂects its historical source as a lexical verb that is in a nascent state of grammaticalizing as a periphrastic causative morpheme. Such a lexical source for causatives is not reported in other Austroasiatic languages (e.g. Anderson ; Jenny and Sidwell ), or indeed further aﬁeld (e.g. no examples are discussed in Heine and Kuteva  either), so the  causative of Khasi may well have developed via contact with Tibeto-Burman languages. It is furthermore highly probable that the lexical verb phaʔ ‘send’ of the Khasian languages Khasi and Pnar is also a borrowing, as it reportedly has no parallels in any other Austroasiatic language (Mathias Jenny, pers. comm.  May ; Paul Sidwell, pers. comm. 
 May ). This makes it all the more likely that a contact-induced transfer is responsible for its causative meaning in Khasi; the causative grammaticalization trajectory as well as its lexical source may have resulted from a language-contact scenario. Heine and Kuteva (: ) discuss evidence that might be used in making the case for replication of a grammaticalization pattern, and propose that linguistic transfer can constitute any of the following: a. b. c. d. e.

forms, that is, sounds or combinations of sounds; meanings (including grammatical meanings) or combinations of meanings; form–meaning units or combinations of form–meaning units; syntactic relations, i.e. the order of meaningful elements; any combination of (a)–(d).




The fact that neither a lexical verb with a cognate form meaning ‘send’ nor a  causative has been reported in any other Austroasiatic language strengthens the assumption that the pattern as well as the form has been borrowed from an unrelated neighbouring language, thus implying that (a–c) could all be involved, although at this stage it is not possible to identify the source that could have served as the model. From a cognitive perspective, the inherently causative semantics associated with the meaning of ‘send (on an errand)’ seems to make it an ideal target for the grammaticalization of a causative meaning;¹⁶ but the most common lexical source for a morphological causative in both South Asian and mainland Southeast Asian languages turns out to be the verb ‘give’ (see Matisoff ). This verb is known to grammaticalize a wide range of meanings in addition to encoding causative semantics, including benefactive, permissive, and purposive senses (e.g. see Heine and Kuteva : –). In Mongsen Ao varieties, a sufﬁx -(p)iʔ serves as a morphological causative.¹⁷ This morpheme is cognate with the reconstructed Proto-Tibeto-Burman form *bəy ‘give’ (Matisoff : , , , ).¹⁸ A reﬂex no longer occurs synchronically in Mongsen Ao as a lexical verb, having since been replaced by a newer form khìʔ ‘give’. The older form only survives as a grammaticalized causative morpheme, as demonstrated in (). () Mongsen Ao (Tibeto-Burman), Nagaland (Coupe a: ) maŋmətuŋ tʃakiɹ, túŋkhəla, kìnìjuŋəɹ hlaɹə` kəm túŋkhəla kìnìjuŋəɹ hlà-əɹ maŋmətuŋ tʃak-iʔ-əɹ village.name leave--  village.name descend+go- kəm become. ʽHaving left [the corpse in] Mangmetong village, they went down and founded Kiniunger village.ʼ ¹⁶ Although a causative meaning is not necessarily the only outcome for the grammaticalization of ‘send’ in every South Asian language. 
For example, Slade (: –) notes that Nepali paṭhāunu ‘to send’ is used as a light verb to express regret: mai-le ke gar-i paṭhā-em? I- what do- send:- ‘Oh what have I done?’ This is evidence for assuming that there can also be completely unrelated metaphorical extensions of grammaticalized meaning for a given lexical morpheme in different languages, and that the unique conceptualizations of a speech community may result in substantial deviations from a putative universal cognitive schema based on shared human experience. ¹⁷ The form of the morphological causative of Mongsen Ao varies signiﬁcantly from village to village in Nagaland. In the Waromung and Khar village varieties the initial consonant has been lost, and the vowel of this and other functional morphemes has centralized and rounded to /ʉ/. In the Khensa village variety the initial consonant is retained (fortuitously revealing beyond doubt this grammatical morpheme’s lexical source), whereas in the Mangmetong village variety it has been lost (for further details see Coupe : , b: ). ¹⁸ The ‘y’ in this reconstructed proto-form deviates from conventional IPA representation in Matisoff ’s () reconstructions, and actually represents the palatal approximant /j/, not the high front rounded vowel /y/.


In what appears to be the beginning of a renewed cycle of causative grammaticalization, khìʔ is in the process of developing a permissive meaning in periphrastic constructions, in addition to its basic lexical meaning of ‘give’. This is demonstrated in (), in which the meaning of khìʔ hovers ambiguously between the semantics of the original lexical sense of ‘give’ and a new permissive interpretation. The permissive meaning is abetted by the dative case marking, which encodes a volitionally acting causee argument in indirect causativized clauses formed with the morphological causative -iʔ for some semantic verb classes (see Coupe a: –). Note that both the structure and the semantics of the grammaticalized meaning of khìʔ align it with the periphrastic causative of Kham, as illustrated in (a,b).

() Mongsen Ao (Tibeto-Burman), Nagaland (Coupe a: ) nì nə niŋ tʃàɹ li áhŋáʔ phàjpàʔ khìwʔ nì nə niŋ tʃàɹ li á-hŋáʔ phàʔ-ì-pàʔ khìʔ-ùʔ   . son  -fish catch-- give.- (i) (Your son wanted to catch fish) ‘I let your son catch fish.’ (ii) ‘I gave fish to your son to catch.’

Anderson (: ) reports that an auxiliary verb construction involving the root beɽ- ‘give’ in the Munda language Gutob encodes a kind of causative or resultative meaning denoting an effect on the causee argument. Examples (a,b) contrast the function of the grammaticalized causative morpheme with the main verb usage in (c). ()

Gutob (Austroasiatic), Eastern India a. uson gol-gol-te nom bobrig-oʔ beɽ-oʔ today smoothly you make.enter- - ‘Today you put it in smoothly.’ (Hook : ) b. sobu paiʈi niŋ ɖem-oʔniŋ beɽbeʔɲiŋ work I do-.= := all ‘I do all the work.’ (Zide : , cited in Anderson : ) c. niŋ niŋ-nu onooʔn beɽ-oʔ=niŋ suŋ-tu I I- daughter give-.= -. ‘I will give my daughter.’ (Zide : , cited in Anderson : )

Nagamese is best characterized as a creole-like language with an Assamese base that is widely spoken as a lingua franca in the northeastern state of Nagaland. In common with other languages of the subcontinent, a verb with the meaning of ‘give’ is used with causative semantics when occurring periphrastically in series with another verb. That the ﬁnal verb is functioning as a grammaticalized element in () is proven by the fact that the construction predicates a single event: that of falling down. In contrast, buying a book is a separate event from the act of giving it in (). It also seems to be the case, according to native-speaker consultants, that the matrix verb dise in () is essential for deriving the causative meaning, despite the presence of the causative sufﬁx on the non-ﬁnite verb.




() Nagamese (Indo-Aryan), Nagaland (author’s ﬁeldnotes—elicited data) didi laga kutta to ami-ke gira-a-i di-se older.sister  dog  -/ fall-- - ‘Older sister’s dog made me fall down.’ () Nagamese (Indo-Aryan), Nagaland (Bhattacharjya : ) Moy tai-ke ekta kitab kin-i-kena di-se  -/ one- book buy-- give- ‘I bought her a book.’19 In Bhattacharjya’s () thesis, one can also ﬁnd examples of causativized clauses in which the causative sufﬁx -a is absent, so the entire functional load for expressing a causative meaning is then carried by grammaticalized ‘give’: () Nagamese (Indo-Aryan), Nagaland (Bhattacharjya : , ) a. bagan to ami-khan ini rakh-i di-le, . . . ek-bar garden  - fallow keep- -, . . . one-time lai.pata laga-bo are ini alchi pora rakh-i di-she greens plant- and fallow idle  leave- - ‘If we leave the garden fallow . . . if we plant greens once and then leave them uncultivated . . . ’ b. Hey, sala-ke dhur-i-bi, no-char-i-bi; , bastard-/ catch-- -release-- theng bhang-i di-bi; . . . . . . leg break- - itu kotha ki band-i-kena yate rakh-i di-bi this talk what tie-- here keep- - ‘Hey, get the bastard; don’t let him run off. Break his legs, tie him up and dump him right here.’ While the grammaticalization of the periphrastic  causative may appear at ﬁrst to be a straightforward case of layering in Nagamese, on deeper inspection it could be motivated by a functional need to express the indirect causation of intransitive verb bases. An intransitive verb stem taking the -a causative sufﬁx seems to require its causee to be a patient, whereas this semantic entailment does not necessarily apply with the periphrastic  causative in the absence of the causative sufﬁx. 
This is captured by the elicited example in (), in which a permissive meaning obtains from an inﬁnitival verb stem +  construction and the causee is interpreted to be acting volitionally as an agent (cf. the semantically and structurally equivalent Mongsen Ao example of () above). () Nagamese (Indo-Aryan), Nagaland (author’s ﬁeldnotes—elicited data) tai didi ke dʒa-bole di-se  older.sister / go- -pst ‘S/he let older sister go.’ ¹⁹ Bhattacharjya’s glossing and interlinearizations have been adjusted for consistency.


Masica (: ) discusses the parallel case of Bengali, which has no means of deriving an indirect causative from an intransitive stem via the old Indo-Aryan vowel strengthening method (e.g. Hindi intransitive uṭhnā ‘to arise’ vs transitive uṭhānā ‘to raise, lift’) and consequently resorts to periphrastic expression. Watters (: ) makes the relevant point that a periphrastic causative in Kham does not trigger a reassignment of semantic roles. The agent of a non-causativized clause remains the agent of the periphrastic causative, whereas a morphological causative may obligatorily recast the agent as a patient, potentially creating a semantic incompatibility. Speakers of languages that do not have a way of expressing indirect causation with the grammatical tools at their disposal may well take a periphrastic pathway to grammaticalizing a new causativization strategy—as suggested by the periphrastic causatives of Kham and Nagamese—but for the present this remains a topic in need of deeper exploration.

. ‘EAT’ > PASSIVE/MIDDLE/RECIPROCAL/ REFLEXIVE MARKING In Indo-Aryan and Munda languages, the verb ‘eat’ occurs in a range of idiomatic expressions, the majority of which are consistent with a general meaning of adverse experience. () Early New Indo-Aryan (Jaworski and Stroński )20 paṃkhinha dekhi sabanhi ḍara khāvā bird:::: see: all: fear:::: ::: ‘ . . . birds having seen all of that got scared / . . . birds saw all of that and got scared.’ () Sinhala (Indo-Aryan), Sri Lanka (Keenan : ) kikili lamajagen maerun kaeːva chicken child() death  ‘The chicken was killed by the child.’ () Assamese (Indo-Aryan), Assam (author’s ﬁeldnotes) naspati ɛ-khon lo-bɔ bisaɹ-is-e, lo-l-e, pear one- take- seek-- take-- kintu bhɔi kha-i as-e but fear - exist- ‘[He] is seeking to take a pear, [and] took one, but is afraid.’

²⁰ The example is from an epic poem composed c. in Old Awadhi by Malika Mohammada Jāyasī. See Mātāprasāda Gupta (ed.), Padmāvata (Ilāhābāda: Bhāratī Bhaṇḍāra, ), vol. ., p. . I am grateful to Rafał Jaworski and Krzysztof Stroński for bringing it to my attention.




() Nagamese (Indo-Aryan), Nagaland (author’s ﬁeldnotes) bhoi kha-ise? fear - [context: the interlocutors have just narrowly avoided a collision whilst driving] ‘Did you get a fright?’ Anderson (: –) speculates that a quasi-passive marker -dʒ om in the Munda language Kharia might have grammaticalized from a verb ‘eat’, and a cognate form also occurs in the related language Juang. In addition, Kharia extends the use of the morpheme -dʒ om to a kind of emphatic, reﬂexive, and ‘indirect-middle’ marking. Whereas the grammaticalized use of ‘eat’ with nouns such as ‘fear’ parallels the NV structure of light verbs in Indo-Aryan, in which the noun functions as the O complement of the verb, in Munda and Tibeto-Burman the grammaticalized morpheme occurs in the V₂ slot of what was almost certainly a compound verb construction at an earlier stage. ()

Kharia (Austroasiatic), eastern India io-dʒom-ta see-- ‘it is seen’ (Grierson : , cited in Anderson :)

() Juang (Austroasiatic), eastern India aiɲ ma’d-dʒim-sɛkɛ I beat--: ‘I am beaten’ (Pinnow : , cited in Anderson : ) The Kiranti language Yakkha of eastern Nepal has similarly grammaticalized a lexical root meaning ‘eat’ as a type of reﬂexive/reciprocal marker. Grammaticalized ‘eat’ is used in the V₂ position in compound verbs, where it can additionally express autobenefactive meanings. () Yakkha (Tibeto-Burman), eastern Nepal (Schackow : ) nda (aphai) moŋ-ca-me-ka=na  (self) beat-V.--=: ‘You beat yourself.’ Schackow (: –) notes that ca in V₂ position has a number of polysemous meanings; e.g. in kon-ca ‘walk-’, she interprets grammaticalized ca as contributing a nuance of ‘consuming’ the enjoyment of taking a walk. As in other languages of South Asia,  is used in some contexts to express an adversative passive-like meaning. () Yakkha (Tibeto-Burman), Eastern Nepal (Schackow : ) moŋ-ca-khuba babu beat-V.-S/A: boy ‘the boy who gets beaten up (regularly)’ Looking beyond the subcontinent, the metaphorical extension of a Turkish verb meaning ‘eat’ to a broad range of adversative idiomatic expressions is reported by Friedman (: –), who observes that a similar usage has been calqued in


Macedonian from the verb jade ‘eat’ (cited in Heine and Kuteva : ). This results in adversative expressions such as jade k’otek ‘get a beating’ (lit. ‘eat a blow’) and is based on the equivalent Turkish expression kötek yemek. Lastly, Hook (: ) presents some examples of grammaticalized ‘eat’ in the Munda languages Mundari and Ho that do not express adversative meanings: () Mundari (Austroasiatic), eastern India (Hoffman : vol. , ) en hoṛoko lel jom-me those people see - ‘Take a look at those people.’ ()

Ho (Austroasiatic), eastern India (Burrows : ) umbul-re dub jom-pe shade-in sit - ‘Sit (at ease) in the shade.’

Given the diverse and rather arbitrary range of meanings attributable to grammaticalized ‘eat’, there seems to be no uniform, shared grammaticalization trajectory that this verb could have taken. The polysemy of the grammaticalized meanings furthermore suggests that they are unlikely to be the outcome of contact-induced grammaticalization in the languages of South Asia.

. ‘SEE/LOOK’ > CONATIVE MODALITY ‘TRY, TEST OUT’ Representatives of Munda, Dravidian, Indo-Aryan, and Tibeto-Burman languages of South Asia all have a verb with the meaning of ‘see’ or ‘look’ that appears to be on the pathway to grammaticalizing, or to have already grammaticalized, a conative modality expressing a meaning of ‘to try, test out’. () Gorum (Austroasiatic), eastern India (Rau, in prep.: ) pans din zom-ej-juʔ sun gaʔ-t-ej gi’ɟ-t-ej ﬁve day gather--: say eat-:- see-:- ‘When it has gathered for ﬁve days, they try it (by drinking).’ () Tamil (Dravidian), Tamil Nadu and Sri Lanka (Lehmann : –) a. kumaar catṭ ̣aiˑy-aiˑp poot ̣ˑt ̣ˑp paar-tt-aaṉ Kumar shirt- put- see--: ‘Kumar put on the shirt (e.g. to see if it ﬁts).’ b. kumaar inta naaval-aiˑp patị -ttuˑp paar-tt-aaṉ Kumar this novel- study- see--: ‘Kumar tried reading the novel (to see how it was).’ () Nagamese (Indo-Aryan), Nagaland (author’s ﬁeldnotes) kha-i sa-bi na eat- look- : [in the context of utterance:] ‘Try tasting it.’




A grammaticalized form of the lexical verb atsə¯ ‘look’ occurs in Mongsen Ao verb stems, where it expresses a conative modality meaning consistent with ‘test, try out’. () Mongsen Ao (Tibeto-Burman), Nagaland (Coupe a: –) a. tə` tʃhàku, tənì nə ‘mə` kəɹatsəɹuʔ pi.’ tə` -tʃhà-ku tə-ni nə mə` -kəɹa-tsə-ə` ɹ-ùʔ pi thus-do-. -wife  -ascend+come---  ‘And so, the wife says [of her husband] “[He] isn’t attempting to come up [from the Assam plain], this one”.’ b. təɹ liŋəɹ, ‘áhlù nə watsəaŋ,’ tə` tʃhàwɹə. tə` -əɹ liŋ-əɹ a-hlú nə wa-tsə-aŋ thus- plant- -ﬁeld  go-- tə` tʃhà-ùʔ tə` ɹ thus do.-  ‘And then, after [he] had done the planting, [she] said “Go and have a look at the ﬁeld”.’ The grammaticalization of a conative modality meaning from verbs of visual perception is observed to cluster in areas of high linguistic diversity, as already demonstrated by the languages of New Guinea and discussed in section .. A similar clustering is presented by a number of languages belonging to four language families of South Asia. The ubiquity of an identical grammaticalization pattern targeting semantically equivalent verbs in unrelated languages must surely be attributable to the contact-induced transfer of a conceptual schema, as in both the New Guinea and South Asian regions it is too concentrated to be merely due to chance.

. RELATIVE–CORRELATIVE CONSTRUCTIONS Up to this point we have considered grammaticalization processes applying mostly to individual lexical items that gradually evolve as functional morphemes in syntagmatic collocations. We now turn to a consideration of South Asian developments applying to larger syntactic structures—speciﬁcally, relative–correlative constructions—which demonstrate a grammaticalized function for interrogative pronouns in some languages, as well as evidence of an areal diffusion of the pattern. While it can be potentially challenging to identify grammaticalization outcomes that result unambiguously from language contact, the case for relative–correlative constructions arising out of contact scenarios in South Asia seems beyond doubt. According to Nadkarni (), the relative–correlative construction is native to Indo-Aryan languages of South Asia, and has spread into Dravidian as a result of contact-induced convergence. Supporters of this conjectured direction of diffusion have argued that the relative–correlative construction with its two ﬁnite verbs violates a constraint on Dravidian syntax that only permits a single ﬁnite verb per sentence (with the exception of quotations), thus suggesting that the structural pattern must have been borrowed from Indo-Aryan. However, an opposing view proposes that relative–correlative constructions are native to Dravidian, and that the


spread has gone in the opposite direction (see Kolichala :  for discussion and references, also Hock ). Regardless of what may be the correct interpretation for the direction of spread, it is indisputable that the relative–correlative structure has been replicated in Tibeto-Burman languages as a result of contact; and it will be shown that these languages are very similar to Dravidian in co-opting interrogative pronouns in an innovated role for marking the relative clause constituent. The structural and morphological characteristics of the New Indo-Aryan relative– correlative construction are as follows. Two clauses are adjoined, both containing a ﬁnite verb. The dependent ‘relative clause’ is preceded by a so-called ‘j-class’ relative pronoun () that marks the relativized argument and indicates the subordinate status of its clause, and the relativized NP argument is coreferential with a potentially optional noun or a pronoun in the matrix clause that functions as the correlative NP (). The Hindi example of () illustrates this structure. () Hindi (Indo-Aryan), North India (Kachru : ) jo kitaab mez per hɛ  book.. table... on be.. vəh merī hɛ  ... be.. ‘The book which is on the table is mine.’ The relative–correlative construction is of considerable antiquity, and is attested as early as Vedic Sanskrit. As in all the daughter languages of Indo-Aryan, the position of the matrix clause vis-à-vis the dependent clause is pragmatically determined according to whether the modifying information is restrictive or non-restrictive, and this may have been a factor inﬂuencing its diffusion into the grammars of other South Asian languages. Examples () and () respectively contrast restrictive and non-restrictive interpretations of meaning. () Vedic Sanskrit (Indo-Aryan) (Hock : ) bādhasva . . . tvaṁ taṁ ... you:: that::: bind:: . . . 
yo no jighāṁ atī who::: we:. slay::: ‘You . . . tie down that (evil-doer) who . . . tries to slay us.’ (Rig Veda ..) Turning now to the Dravidian relative–correlative, we ﬁnd that it similarly involves two ﬁnite verbs; but because Dravidian languages lack a form class of relative pronouns, speakers must press into service an interrogative pronoun for marking the dependent relative clause of the construction. This is demonstrated in the Malayalam example of (). () Malayalam (Dravidian), Kerala (Asher and Kumari : ) aarə manassə aʈakkunnuvo-o avaṉṉə samaadhaanam kiʈʈunnu who mind control:- he: peace obtain: ‘He who controls the mind obtains peace.’


Grammaticalization in South Asia



In some Dravidian languages, a particle (typically an interrogative with the form -o) marks the end of the relative clause constituent. An identical pattern of using an interrogative pronoun in lieu of a relative pronoun is replicated in Tibeto-Burman languages employing the relative–correlative construction. Mongsen Ao is also somewhat similar to Dravidian languages in using a topic particle la at the end of the relative clause to indicate its dependent status.

() Mongsen Ao (Tibeto-Burman), Nagaland (author’s fieldnotes)
sə́ páʔ nə̄ ì tʃə̄lāj ā-tshə̄ phāŋā tsə̄ŋ-īʔ-ɹū lā
who  . daughter -mithun five attach--  
ājī tʃə̄lāj pā tsə̄-ì-ùʔ tè sā-Ø
. daughter  take-- thus say-
‘ “Whoever ties five mithuns (Bos frontalis) [as a bride price for] our daughter, he can take our daughter,” [he] said.’

Tibeto-Burman languages that make limited to extensive use of the relative–correlative construction are typically spoken in locations buffering communities speaking Indo-Aryan languages, and where bilingualism in an Indo-Aryan language is common. Elsewhere in the Tibeto-Burman domain the relative–correlative pattern is not attested, and speakers exclusively use the native Sino-Tibetan nominalized participle type of relative clause construction, as demonstrated by the Khiamniungan example of (). See (a) to compare an internally headed example in Mongsen Ao.

() Khiamniungan (Tibeto-Burman), Nagaland (author’s fieldnotes)
. . . ʃawʔ¹¹ nə³¹, ko³³-khɛj³³-lɛ³³ nɔj¹¹-tʃən³³ nɔ³¹, . . .
rat  earth-- stay-  
‘ . . . the rat, which lives inside the earth, . . . ’

A possible functional motivation for South Asian languages replicating the relative–correlative pattern is that, whereas access to relativization using the participle type of relative may be limited by language-specific constraints, there appear to be no such restrictions on access to relativization using the relative–correlative strategy.
This logically motivates the replication of the relative–correlative construction in Tamil, a language that prohibits relativization on an instrument and other oblique positions using the standard participle relativization strategy native to Dravidian, but permits it using the relative–correlative pattern.

() Tamil (Dravidian), Tamil Nadu and Sri Lanka (Keenan and Comrie : )
Eṉṉa(k) katti(y)-āḷ koṛi(y)-ai anta maṉitaṉ kolaippi-tt-āṉ
which knife-with chicken- that man kill--:
anta katti(y)-ai jāṉ kaṇ-ṭ-āṉ
that knife- John see--:
‘John saw the knife with which the man killed the chicken’ (lit. ‘with which knife the man killed the chicken, John saw that knife’)





Genetti (: ) additionally notes that the relativized referents in relative– correlative constructions of Dolakha Newar are typically indeﬁnite, unknown, and non-speciﬁc, a pattern that is also common to the majority of relativized referents of relative–correlative clauses in Mongsen Ao. The relative–correlative construction may therefore ﬁll a functional gap in the structural inventories of some languages by facilitating relativization on indeﬁnite and non-speciﬁc referents, or by making it possible to relativize on arguments that are not accessible using a native participle relativization strategy. () Dolakha Newar (Tibeto-Burman), Nepal (Genetti : ) gunān bāmā=e khā ŋen-ai āmun sukha sir-ai who: parent= talk listen-: : happiness know-: ‘Whoever listens to his/her parent’s advice, s/he knows happiness.’ As demonstrated above for Indo-Aryan, the position of the relative clause constituent in the relative–correlative construction can be before the matrix correlative clause, where it encodes restrictive reference, or alternatively after the matrix correlative clause, where it encodes non-restrictive reference. A number of Tibeto-Burman languages also permit pre- and post-head relative clause placement using their participle relativization pattern similarly to encode a restrictive~non-restrictive contrast (e.g. see Coupe b), but perhaps not all do. If there are rigid constraints on the position of the head that precludes encoding this contrast by means of constituent order, then this could provide another motivation for languages of South Asia replicating the relative–correlative structure and using it alongside the participle type of relative clause.²¹ The relative–correlative construction is also found in Munda languages. Kharia, for example, has borrowed the set of j-class relative pronouns from Indo-Aryan, one of which is used to mark the beginning of the relative clause in (a). 
Intriguingly, Kharia has an additional set of relativizing forms that are homophonous with the proclitic interrogative markers (e.g. b), paralleling the grammaticalized use of interrogative morphemes in Dravidian and Tibeto-Burman languages for marking relative clauses (see Peterson :  for the full paradigm). () Kharia (Austroasiatic), eastern India (Peterson : –) khajar tar=sikh=oʔ=may ho=kɑɽ=aʔ komaŋ=ko a. . . . je  deer kill==.= that=.= meat= nalage, . . . .. ‘ . . . it isn’t the meat of the deer they had killed . . . ’ ( . . . which deer they had killed, his meat it is not . . . )

²¹ See Heine and Kuteva (: ff.) for similar examples of replicating languages ﬁlling functional gaps in their grammars in other linguistic areas of the world.





b. . . . a=boʔ=te pujapaʈh karay=na aw= ki ho boʔ=te =place= sacriﬁce do= -. that place= ɖɑm=ke ho=ki ho ɖoli=te mɑɽɑy=oʔ=mɑy arrive= that= that palanquin= put.down=.= ‘ . . . having arrived at the place where the sacriﬁce was to be done, they put the palanquin down.’

. CONCLUDING COMMENTS

This chapter has considered grammaticalization phenomena from an areal perspective, and has found correlating evidence for the contact-induced transfer and replication of patterns involving unrelated languages of South Asia. Perhaps the most convincing of these is the  causative of Khasi. This satisfies all of the criteria for having spread through contact, since neither the pattern nor the lexical source can be linked to related languages outside South Asia. If the Khasi  causative prefix has indeed resulted from a contact-induced grammaticalization, then it may be added to the list of counterexamples demonstrating that there is no requirement for languages to share structural compatibility for contact-induced grammaticalization to occur (see e.g. Harris and Campbell : –).

Secondly, the development of conative modality from verbs of visual perception in four language families of South Asia presents yet another convincing case of contact-induced grammaticalization of a conceptual category, due to the fact that the verbs of these unrelated languages all follow the same grammaticalization trajectory precisely leading to a conative modality outcome, and the pattern is furthermore observed to be cross-linguistically rare.

Lastly, the relative–correlative construction similarly presents robust evidence for contact-induced grammaticalization, as relative–correlative constructions in Tibeto-Burman languages conform to the criterion of only being found in languages within the linguistic area of South Asia. The fact that Dravidian, Munda, and Tibeto-Burman languages all press their interrogative pronouns into service as relative pronoun equivalents (or even borrow the j-class relative pronouns for this function, as in the case of Kharia) is very convincingly the replication of a syntactic pattern.
Functional motivations for copying this construction can be identiﬁed, as noted in the case of Tamil, which can use the relative–correlative construction to relativize on arguments that are otherwise inaccessible to relativization using the standard Dravidian participle construction. The data presented in this chapter collectively demonstrate the transfer of seemingly identical conceptual schemas across the genetic boundaries of languages in contact; these target morphemes or constructions with identical meanings in unrelated languages, and they produce the same grammaticalization outcomes. Such replicated patterns must cater to a multilingual community’s communicative needs, while at the same time reducing the cognitive burden imposed by multilingualism in a linguistic area.





ACKNOWLEDGEMENTS

I thank Sander Adelaar, Nikolaus Himmelmann, Uta Reinöhl, and the editors for their comments and suggestions on an earlier draft. I alone bear responsibility for the conclusions reached and any misinterpretations of analysis. The chapter was written while I was affiliated to the Institute for Linguistics at the University of Cologne, and the background research was facilitated by an Alexander von Humboldt Research Fellowship for Experienced Researchers. I am grateful to both of these institutions for their generous support.

11 Grammaticalization in isolating languages and the notion of complexity

UMBERTO ANSALDO, WALTER BISANG, AND PUI YIU SZETO

. INTRODUCTION

Grammaticalization theory is concerned with the emergence and development of grammatical forms and constructions. According to Hopper and Traugott (: ), the phenomenon of grammaticalization can be defined as ‘the change whereby lexical items and constructions come in certain linguistic contexts to serve grammatical functions, and, once grammaticalized, continue to develop new grammatical functions’. Such a distinction between the two stages of grammaticalization from a semantic perspective may be problematic because it is hard to define what ‘more or less grammatical’ means without additional criteria from other domains of grammar (cf. Bisang a on primary vs secondary grammaticalization). Thus, typological studies on grammaticalization based on large numbers of languages (e.g. Bybee ; Bybee, Perkins, and Pagliuca ; Lehmann b; Heine and Kuteva ) generally combine the semantic side of the linguistic sign with its form side. This combination reveals the interesting fact that similar grammaticalization ‘clines’ or ‘pathways’ are found in a wide range of unrelated languages, suggesting that such a phenomenon may be shaped by some universal processes of grammatical change.

Classical approaches to grammaticalization very often take it for granted that there is a certain degree of interdependence between the meaning side and the form side of grammaticalization. This assumption of the coevolution of meaning and form goes back right to Meillet (: ) and his statement that ‘the weakening of the meaning and the weakening of the form of the auxiliary word go hand in hand’.¹

¹ The French original version runs as follows: ‘L’affaiblissement du sens et l’affaiblissement de la forme des mots accessoires vont de pair’ (Meillet : ).

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Umberto Ansaldo, Walter Bisang, and Pui Yiu Szeto .
First published  by Oxford University Press



Umberto Ansaldo, Walter Bisang, and Pui Yiu Szeto

This idea of ‘the development from autonomous words to grammatical agents’² (Meillet : ) is also reﬂected in Givón’s (b: ) well-known grammaticalization cline from independent words at the level of discourse and syntax to dependent grammatical morphemes and ultimately to zero. ()

Givón (b: ) Discourse > Syntax > Morphology > Morphophopnemics > Zero

Bybee et al. () explicitly posit a causal link between semantic and phonetic reduction: It therefore seems natural to look for a direct, and even causal, link between semantic and phonetic reduction in the evolution of grammatical material, beginning with the earliest stages of development from lexical sources and continuing throughout the subsequent developments grams undergo. Our hypothesis is that the development of grammatical material is characterized by the dynamic coevolution of meaning and form. (Bybee et al. : )

Lehmann’s (b) approach is also based on the assumption of a tight relation between meaning and form. In his view, ‘[t]he content and the expression of a sign are insolubly associated with each other’ (p. ). He argues that there is an isomorphism characterized by the tendency of ‘a correspondence between the size, or complexity, of the signiﬁcans and that of the signiﬁcatum’ (p. ). In spite of this view, he does not deﬁne his parameters for assessing the autonomy of the linguistic sign in terms of the interaction of meaning and form. The criteria that are crucial for his holistic concept of the sign as a combination of meaning and form are the criteria of weight, cohesion, and variability with their paradigmatic and syntagmatic aspects (pp. –). As he further points out, it is sometimes possible to observe separate effects on meaning and form in the case of the parameters of weight and cohesion. But even in these cases, meaning and form will be both affected ‘in a parallel fashion’ (p. ). In spite of the rather widespread assumption of the coevolution of meaning and form, manifestations of grammaticalization observed in East and Mainland Southeast Asian (EMSEA) languages (Bisang , , a) indicate that it is problematic. Based on the two criteria of weight and cohesion as those criteria which can show separate effects of meaning and form (Lehmann b: ), section .. will show that the coevolution of meaning and form is rather reduced in EMSEA languages with their dominant isolating or analytic properties. Under the assumption that the coevolution of meaning and form is well supported from observations on other languages, one can further conclude that the degree to which form and meaning coevolve over time is subject to cross-linguistic variation, and that there is a strong tendency of low coevolution in the area of EMSEA languages as a whole. 
This tendency even seems to extend to languages like Khmer (see section ..) that are not as morphologically isolating as other EMSEA languages. Nor does it exclude the possibility that a relatively small number of EMSEA languages do develop morphology and inflectional morphological paradigms (cf. Arcodia ,  on Sinitic, and

² In French: ‘le passage de mots autonomes au roles d’agents grammaticaux’ (Meillet : ).

Grammaticalization in isolating languages



Gerner and Bisang  on inﬂectional paradigms of numeral classiﬁers in Weining Ahmao), but the degree of morphological elaboration developed in these languages remains comparatively low. In fact, historical reconstructions of Chinese show that there was morphology in Old Chinese between the th and the th centuries BC (Sagart ) but morphology seems to have been exclusively derivational and did not express inﬂectional categories like person, tense or number. Once that morphology was lost, the development of morphology and of inﬂectional morphological paradigms in particular took place very rarely (see above). This chapter addresses the questions of the typological properties of grammaticalization in EMSEA languages and whether grammaticalization is open to different types of grammaticalization whose properties cluster areally due to a combination of contact-induced convergence and some basic typological properties of the languages involved.

. TYPOLOGICAL PROPERTIES OF EMSEA LANGUAGES

..  The EMSEA area refers to ‘the area occupied by present day Cambodia, Laos, Peninsular Malaysia, Thailand, Myanmar, and Vietnam, along with areas of China south of the Yangtze River’ (Enﬁeld and Comrie : ). This area comprises ﬁve language families: Sino-Tibetan, Mon-Khmer (or Austroasiatic), Tai (branch of TaiKadai), Hmong-Mien (or Miao-Yao), and Austronesian (Chamic in Vietnam plus languages of peninsular Malaysia) (Matisoff ; Bisang ). Due to an intricately complex history of contact with its corresponding processes of structural convergence over at least two millennia (e.g. Enﬁeld ; Enﬁeld and Comrie ; Bisang , ), these languages share many typological properties irrespective of their genetic afﬁliation. In this chapter, we shall discuss properties of tone (section ..) and syntactic formations (section ..), which may lead to the polyfunctionality of markers (section ..) and lack of obligatory grammatical marking (section ..). Such typological properties may help to account for the grammaticalization characteristics of EMSEA languages (section ..).

..  The tone systems and the discreteness of syllable boundaries in EMSEA languages contribute to the relative morphophonological stability of grammaticalized items in these languages (Bisang ). If there is phonetic erosion in these languages, it primarily operates in terms of syllable duration and vowel quality (see Ansaldo and Lim  on Southern Sinitic varieties). The phonological constraints and the areal factor of contact-induced convergence that support morphophonological stability will be brieﬂy outlined in this section. A crucial aspect that contributes to morphophonological stability in tonal languages of East and mainland Southeast Asia is the necessity to uphold tonal values




to ensure lexical contrast (Ansaldo and Lim ). This imposes phonotactic constraints on the extent to which erosion can develop. For example, in Southern Sinitic languages with complex tonal categories like Cantonese () and Hokkien (), no significant pitch reduction is observed in the grammaticalized items. As illustrated in the following examples, the Cantonese gwo ‘to pass/cross’ (a) and Hokkien ho³³>²¹laŋ²⁴ ‘to give people’ (a) do not show changes in tonal values in their grammaticalized forms ((b) and (b–d)) (Ansaldo and Lim : –).

(a) Ngo gwo malou
SG cross road
‘I cross the road.’

(b) Ngo daai gwo nei
SG big SUR SG
‘I’m older than you.’

(a) i> ho> laŋ> te
SG give people bag
‘S/he gave them a bag.’

(b) i> ho> laŋ> sien
SG CAU people be.bored
‘S/he made them bored.’

(c) i> ho> laŋ> tsiaʔ
SG PERM people eat
‘S/he let them eat.’

(d) i> ho> laŋ> me
SG PASS people scold
‘S/he was scolded.’

The necessity of keeping up pitch contrast is particularly high in languages with more than one tonal register like Cantonese and Hokkien with their three registers. In languages with only one tonal register, the neutralization effects of pitch reduction are less strong. For that reason, Mandarin Chinese with its one-register tonal system has toneless syllables and, as a further consequence, also shows vowel reduction in a few well-known examples of grammaticalization like the perfective marker -le (from liǎo ‘finish’), the durative marker -zhe (from zháo ‘touch, contact’), and the general classifier ge (from gè ‘bamboo tree’) (Ansaldo and Lim ).

The discrete syllable boundaries in EMSEA languages also support the avoidance of subsyllabic morphemes. Once syllabicity has become an areal phenomenon, even languages with subsyllabic morphology may reduce its use. A good example is Khmer, an EMSEA language with rich subsyllabic morphology for derivational word formation of limited productivity (Jenner and Pou /; Bisang : –; c). This can be illustrated by the infix -m- for marking agentive nouns: so:m ‘ask, ask a favour’ > smòːm ‘beggar’, cam ‘wait for; guard; keep’ > chmam ‘guard, n.’ (Bisang : ). While this type of morphology tends to become more productive over time in many languages, Khmer morphology is losing




importance and often gets replaced by non-morphological alternatives, such as the use of the noun nὲək ‘person’ in the head position to form agentive nouns: nὲək-daə(r) [person-walk] ‘pedestrian’, nὲək-taeŋ [person-compose/write] ‘author; composer; writer’ (Bisang : ). The replacement of morphological structures by more analytic, syntax-based structures indicates typological convergence of Khmer with its neighbouring languages.

..   Due to the limited coevolution of meaning and form in EMSEA languages, most of their grammatical markers are expressed as free words with their speciﬁc syntactic positions rather than as bound morphemes. To account for this fact, Matisoff (: ) uses the term of ‘particles’ and the term of ‘particulization’ for the corresponding diachronic process, i.e. a process through which particles gradually develop from fully lexical morphemes. In Matisoff ’s (: ) approach, particles ‘cannot constitute the head of a construction’. Given the problems with the deﬁnition of heads, and given the fact that many products of grammaticalization like adpositions or complementizers can take head positions, we will not adopt this part of Matisoff ’s () deﬁnition. What remains is the centrality of particulization to syntactic formations as an areal grammaticalization phenomenon of EMSEA languages (Matisoff ). We discuss two types of such formations in this chapter, namely those involving noun-particles and verb-particles. Noun-particles may only occur after nouns, and unlike true nouns, they cannot be quantiﬁed or classiﬁed. For example, while retaining its lexical meaning (), Lahu thàʔ ‘upper surface’, a noun which can be traced all the way back to Proto-TibetoBurman *l-tak or *g-tak ‘ascend; above’, has developed into an object marker ();³ it may also mark embedded sentential objects () or a sort of ‘accusative of time’ () (Matisoff : –). ()

šɛ̂.šī chi mì.châ ɔ̀-thàʔ bu tɛ a sand DEM ﬂoor N-surface pile put PTC ‘Pile the sand up on the ﬂoor!’

()

lìʔ chi ŋà thàʔ pî tā ve yò book DEM SG OBJ give PTC PTC PTC ‘(Someone) has given me that book.’

()

yɔ̂ qɔ̀ʔ la ve thàʔ ŋà dɔ̂.lɔ ve yò SG return PTC PTC OBJ SG hope PTC PTC ‘I hope that he comes back.’

()

ŋà-hɨ khɔ̄ dɔ̂ʔ ve qhɔ̀ʔ câ ve ha.pa thàʔ yò -PL top hit REL year eat GEN month OBJ PTC ‘We play with tops in the month that we celebrate New Year’s.’

³ As Matisoff (: ) points out, the indirect object is marked in ditransitive verbs.




In addition to the examples of noun-object constructions in () to (), noun-locative constructions are also found in Lahu, where the particle lo, reconstructed as ‘road; way’, has a locative function. But unlike the case of thàʔ, where the lexical meaning is retained, the lexical source of the locative particle lo can only be deduced from comparative data (Matisoff : –). As shown in (a–c), lo is a general locative particle which ‘does not specify direction of motion, or even motion vs. rest; the interpretation depends on the following verb, or the sentence as a whole’ (Matisoff : ):

(a) há.qō lo mɨ chɛ̀ ve
cave LOC sit PROG PTC
‘He’s sitting in the cave.’ (Essive)

(b) há.qō lo lòʔ e ò
cave LOC enter PTC PTC
‘He has already gone into the cave.’ (Adessive)

(c) há.qō lo tɔ̂ʔ e ò
cave LOC emerge PTC PTC
‘He has already come out of the cave.’ (Abessive)

Verb-particles may only occur after verbs, and unlike true verbs, they cannot be negated separately. This sort of development with verbs gradually losing their full-verb status is highly typical of EMSEA languages (Matisoff ). Common examples include the development of coverb constructions and verb–complementizer constructions (coverbs are defined as verbs in adpositional function). For example, Cantonese has a rich range of verbal particles for indicating results, directions, and comparison (Matthews and Yip ). Some common examples are presented in examples (–) from Cantonese (Matthews and Yip ). In (), the verb dou ‘arrive’ as illustrated in (a) is used in the V-position of the resultative construction (b). The verb hei ‘move up, rise’ (a) is used as a directional marker in (b) indicating upward movement. Finally, the verb gwo ‘to pass/cross, surpass, exceed’ (a) marks comparative in example (b).

()

dou ‘to arrive’

As a verb: (a) ngo dou-zo hokhaau SG arrive-PFV school ‘I have arrived at school.’ As a resultative particle: (b) ngo heoi-dou daaihok zaam SG go-arrive university station ‘I have reached the University Station.’ () hei ‘up’




As a verb: (a) hei san up body ‘Get up.’ As a directional particle: (b) ling-hei go dinwaa pick-up CLF phone ‘Pick up the phone.’ ()

gwo ‘to pass/cross’

As a verb: (a) ngo gwo malou SG cross road ‘I cross the road.’ As a comparative particle: (b) ngo daai gwo nei SG big SUR SG ‘I’m older than you.’ Another common process of grammaticalization in EMSEA languages is the development of complementizers out of verbs with the meaning of ‘say’. Sentences (–) show relevant examples from Cantonese, Thai, and Khmer, respectively (the examples from Thai and Khmer are quoted from Matisoff : –). ()

Cantonese waa ‘say’

As a verb: (a) keoi waa haa go jyut wui heoi toiwaan SG say next CLF month will go Taiwan ‘S/he said s/he will go to Taiwan next month.’ As a complementizer: (b) keoi nam-zyu waa haa go jyut heoi toiwaan SG think-DUR CPL next CLF month go Taiwan ‘S/he plans to go to Taiwan next month.’ ()

Thai wâa ‘say’

As a verb: (a) wâa phǒ m duu.thùuk nán, mâj ciŋ ləəj say SG despise DEM NEG true EMPH ‘To say that I look down on them is simply not true.’




As a complementizer: (b) phǒ m kɔ̂ jaŋ mâj nɛ̂ɛ.caj wâa, cə paj dâj ry̌ y mâj SG CONJ still NEG sure CPL PTC go able or NEG ‘I’m still not sure whether I’ll be able to go or not.’ ()

Khmer thaa ‘say’

As a verb: (a) look thaa məc SG say how ‘What did you say?’ As a complementizer: (b) kñom kɨt thaa look qayuq prəhael məphɨy.pram SG think CPL SG age about twenty-ﬁve ‘I think that you’re about  years old.’ The above examples of particulization show a number of areal grammatical phenomena typically observed in EMSEA languages. First, the lexical meaning of a grammaticalized item is often retained in the language, showing that semantic generalization is involved but at the same time the same linguistic item can still be used in its erstwhile meaning. This phenomenon is described using the term ‘layering’ by Hopper (); it is by no means limited to EMSEA languages but is very widespread in these languages. Its frequency must be related to a second phenomenon, i.e. the absence of erosion or its limitation to the suprasegemental level (cf. section ..). The combined effect of these two phenomena leads to a third phenomenon, polyfunctionality (see section ..). However, this type of polyfunctionality, in which the same linguistic sign has a lexical meaning and a grammatical meaning, is not the end of polyfunctionality. As will be seen in the next section, many lexical items have followed different pathways and can thus express more than one grammatical function. In each case, the concrete meaning of a linguistic item depends on the construction in which it occurs and on pragmatic inference (Bisang , ). Due to such a scenario, polyfunctionality is rampant in EMSEA languages.

..  In relation to the above, we note that grammatical morphemes can often be poly- or multifunctional in languages of East and Southeast Asia. A good example is the verb ‘come to have’, which is baːn in Khmer, dây in Thai, tau in Hmong, and dé in Mandarin Chinese. In their grammaticalized form, ‘come to have’ verbs can express an impressive number of functions which are described in detail by Enﬁeld (), including permissive, obligation, past, and factuality. For example, depending on the




context, the Khmer baːn in () can express a permissive, past, or factuality reading, while the obligation reading is unusual yet still marginally possible (Bisang : –). () khɲom baːn tɤ̀u phsaː(r) SG TAM go market ‘I am/was able/allowed to go to the market.’ (permissive) ‘I went to the market.’ (past) ‘I was at the market.’ (factuality) [Against the presupposition that I was not.] Ngay () showed that the verb tie⁵³ ‘get’ in the Shaowu dialect of Min (Sinitic) has even more functions. They roughly cover benefactive, allative, locative, causative, purpose, modality (potential, permission), manner, intensiﬁer (‘very’) and passive. Another example is the development of ‘surpass’ verbs into comparative markers and aspect markers in Southern Sinitic and other EMSEA languages (Ansaldo ). We already have described the verb gwo ‘surpass’ in (a) and as a comparative marker in (b). In (), the same verb takes the position immediately after the verb and marks experiential aspect: () ngo heoi gwo gwongzau SG go EXP Guangzhou ‘I’ve been to Guangzhou.’ In many Southeast Asian languages, ‘give’ verbs are used as prepositions/coverbs (benefactives), causative markers, adverbial subordinators (purpose, manner), and complementizers (Bisang a). In the domain of the noun, classiﬁers are not only used for expressing individuation in the context of counting, but also express referential status. In quite a few cases, classiﬁers in [classiﬁer noun] constructions can express both functions, indeﬁniteness as well as deﬁniteness (Li and Bisang ; Wang ). These examples show that lexical and grammatical forms coexist. We suspect that in some cases, such as the surpass comparatives, the more grammatical form may show minimal tendencies to undergo phonetic erosion, but we do not have systematic, cross-linguistic data to back this.

..    As noted in the works of Bisang (, a), obligatoriness in terms of Lehmann (b) is an essential indicator of a high degree of grammaticalization (cf. Lehmann’s b parameters discussed in section ..). Although pragmatic inference is generally assumed to be a very important factor that initiates and motivates processes of grammaticalization, it typically loses its relevance after a new meaning has developed into a conventionalized grammatical function whose marking has by now become obligatory. While this development of obligatoriﬁcation seems to be cross-linguistically widespread in grammaticalization, it is a remarkable property of the grammar of EMSEA languages that they leave more room for pragmatic inference than many




other languages that require more explicitness in the use of markers that are the result of grammaticalization. In these languages, we ﬁnd very little obligatory marking (the only well-known exception being numeral classiﬁers in the context of counting), as can be seen in the frequent omission of tense-aspect markers, (in)deﬁniteness markers, and in radical pro-drop, as shown in the Mandarin example in () that stands for most EMSEA languages as well: () Mandarin Chinese (Gao : , cited from Bisang a: –) 他父親並不贊成他成天守在屋裡看書寫字，認為男孩子就要頑皮些，出去見世面，廣交際，闖天下，對當作家不以為然。 tāi fùqinj bìng bú zànchéng tāi chéngtiān shǒ u zài he father actually NEG approve he whole.day guard/spend in wū-li kàn-shū xiě -zì, ø rènwéi nánháizik jiù yào house-in read-book write-letters maintain boy then will wánpí xiē, ø chū-qù jiàn shìmiàn, øk guǎ ng jiāojì, øk be.naughty somewhat go.out-leave see world broadly socialize chuǎ ng tiānxià, duì øi/øk dāng zuòjiā øj bù yıˇwéi rán. force.ones.way.into world as.for work.as/be writer NEG think so ‘Hisi fatherj did not approve that hei stayed at home the whole day reading and writing, [hej] maintained that a boyk should be somewhat naughty, øk go out to see the world, øk widely socialize, øk make his way in the world—as for øi/øk being a writer, [hej] disagreed.’ This example shows the prevalence of zero arguments or radical pro-drop in Mandarin, where arguments can be omitted without concomitant agreement morphology on the verb. Consequently, arguments are not overtly accessible through features like person and number, and can only be pragmatically inferred from contextual information. The presence of radical pro-drop is not simply a matter of omitting any overt marking of arguments; it also has its consequences for pragmatic inference in terms of Levinson’s () Generalized Conversational Implicatures. 
This is due to the fact that radical pro-drop generates a three-way Horn scale consisting of , while non-pro-drop only allows for a two-way scale of the type (for details, see Huang ). Thus, the second pronoun tā ‘he’ in the first line of () is preferably interpreted as disjunct (having a referent that differs from the one in the subject position of the preceding clause) because the speaker does not select the informationally weaker zero form. More generally, the use of zero arguments strongly depends on discourse and the formation of text structure (cf. Bisang a:  on ()). Thus, the coreference of the first zero argument on line  with the father (ø_j) must be inferred from the fact that the father is the topic. Subsequently, a generic noun (nánháizi_k ‘a boy’) is introduced on line  as a new subtopic. This noun is coreferent with the next three zero arguments (ø_k) on line . Finally, the first zero argument of line  may be associated with the protagonist/main topic (ø_i) or the boy/subtopic (ø_k), while the second zero argument refers to the father (ø_j). In addition, there is not a single tense-aspect marker on any verb, nor any (in)definiteness marker on any noun, in the whole passage, showing that the lack of obligatoriness also extends to such grammatical categories: while there are tense-aspect and (in)definiteness markers in Mandarin Chinese, they are optional if their corresponding grammatical categories can be pragmatically inferred.

..  While some few instances of formal reduction can be spotted in EMSEA languages, they are very rare. In general the formal aspects of canonical grammaticalization do not happen in these languages. Some bleaching occurs, but widespread polyfunctionality undermines the semantic dimension of canonical grammaticalization in EMSEA languages. An aspect of grammaticalization in this area may be the loss of autonomy, or constructionalization, but even this is undermined by polyfunctionality and lack of obligatory marking.

. REFLECTIONS

..    Whether grammaticalization is seen as an epiphenomenon (Newmeyer : ch. ) or as a universal tendency depends in part on whether grammaticalization is understood as a very comprehensive or a very narrow phenomenon. Our sense at the moment is that grammaticalization phenomena vary quite strongly based on typological and areal patterns—a view also argued for in the work of Bisang (, , , a). This by no means detracts from the theoretical signiﬁcance of grammaticalization studies, in particular if we reframe the problem in terms of crosslinguistic variation. The classical assumption that grammaticalization shows coevolution of meaning and form as it was criticized in the introduction (section .) can be defeated by the application of Lehmann’s parameters of weight and cohesion to grammaticalized markers of EMSEA languages. In the case of paradigmatic weight (integrity), grammaticalized markers are characterized by their morphophonological stability. Tenseaspect markers like the Chinese perfective marker -le (and a few others, cf. section ..) are rather exceptional. A look at the grammaticalized markers discussed in this chapter shows that they basically have the same phonological substance as the lexical word from which they are derived (cf. examples (b), (b–d), (), (), (), (a–c), (b), (b), (b), (b), (b), (b–c), (), and ()). The syntagmatic weight (structural scope) is deﬁned by the structural size of the construction to which the grammatical marker is attached. If one takes the constituent-structure level, only the resultative particle in (b), the directional particle in (b), and the experiential marker in () operate on lexical heads (also cf. the Chinese perfective marker -le). The other markers are associated with higher constituent-structure levels (e.g. 
the Lahu object marker thàʔ in examples ()–() has scope over the whole noun phrase even though it has to occur immediately after the noun).




Paradigmatic cohesion (paradigmaticity) is concerned with the size of paradigms (e.g. open vs closed word class) and their degree of formal homogeneity. Morphological paradigms as a typical outcome of paradigmaticity are very rare in EMSEA languages (cf. Bisang ). If there is paradigmaticity, it is limited to the emergence of syntactic slots associated with certain grammatical functions, a process that ends up in rigid word-order rules and belongs to the domain of syntagmatic variability (see below). Thus, the form side of paradigmaticity in EMSEA languages does not reach the degree of integration that is found elsewhere in the world. Finally, syntagmatic cohesion (bondedness) in terms of the degree of fusion between a marker and its host is again limited (see section ..).

The above discussion of weight and cohesion illustrates the limitations of the coevolution of meaning and form in EMSEA languages (Bisang , a). If one takes weight and cohesion with their paradigmatic and their syntagmatic aspects as criteria for measuring degrees of grammaticalization in EMSEA languages, they are of comparatively minor importance. In fact, rigid word order as an instantiation of reduced syntagmatic variability is definitely the most prominent parameter; all other parameters co-vary much less clearly with increasing abstractness/grammaticality. If Lehmann (b: –) is right that variability is the criterion in which meaning and form cannot be differentiated, this might be taken as additional evidence for the limited relevance of the coevolution of meaning and form. Of course, syntagmatic variability can also be the dominant parameter in subsystems of Indo-European languages, as in the case of auxiliaries (cf. ‘have’ verbs and ‘be’ verbs in German and in Romance languages), but it is far from having reached the pervasiveness with which it can be observed in EMSEA languages, in which the development of morphological markers and morphological paradigms is rather rare. This areal property requires rethinking general assumptions concerning (i) morphological elaboration or maturation; and (ii) morphological reduction or simplification.

..   A view of grammaticalization as area-speciﬁc encourages us to revise our interpretation of morphological elaboration. We are thinking here especially of Dahl (: chs  and ) and the notion of ‘maturation’.⁴ In his approach, obligatory grammatical marking is independent of whether the information contained in the marking is communicatively necessary or not and thus produces redundancy, which in its turn leads to phonological reduction and coalescence with other morphemes. This type of maturation is directly reﬂected in the increase of form-related complexity through the history of individual languages, and manifests itself in complex word structure (e.g. inﬂectional morphology, derivational morphology, incorporating constructions), lexical idiosyncrasy (e.g. grammatical gender, inﬂectional classes, ⁴ Dahl’s () deﬁnition of maturation is summarized as follows by Ansaldo (): ‘The accumulation of material in a grammar G of a language that did not exist at an earlier stage G’ of that language.’

Grammaticalization in isolating languages



idiosyncratic case marking), and the presence of morpheme- and word-level features in phonology. Maturation as described by Dahl () is an important diachronic process of universal relevance, but it will be argued in the next section that it is not the only type of maturation and that it does not operate with the same pervasiveness cross-linguistically. Bisang (, , , a), Ansaldo (, ), Enﬁeld (), and others provide good evidence for area-speciﬁc differences in the morphophonological processes of development associated with grammaticalization. Coevolution of form and meaning is likewise typologically determined and areally conditioned. Languages of EMSEA typically show that there exists a type of maturity that does not manifest (or only marginally manifests) itself in the production of morphological form as it is associated with maturation in terms of Dahl () (cf. section ..). Seen from such a perspective, maturation in Standard Average European (SAE) languages is itself an areally bound phenomenon that produces a certain type of grammatical marking that is characteristic of these languages. As we have tried to show, EMSEA languages differ considerably from this grammaticalization pattern. The status of other languages beyond SAE and EMSEA between the two types of maturation needs extensive further research.

..   Grammaticalization in EMSEA languages is characterized by the comparatively high relevance of pragmatic inference even for markers that express grammatical functions like tense-aspect-modality, number, or deﬁniteness/indeﬁniteness. The relevance of pragmatics shows itself in the lack of obligatoriness (cf. section ..) and in polyfunctionality. If a marker is not obligatory, the speaker may choose to leave its content to the inference of the speaker. If a marker is polyfunctional, the adequate interpretation needs again to be inferred, either from the constructional context or from extralinguistic context. As pointed out by Bisang (, ), non-obligatoriness and polyfunctionality both operate against the emergence of inﬂectional paradigms. Due to their non-obligatoriness, markers of grammatical categories do not occur frequently enough to be integrated into a paradigm like Latin am-o ‘I love’, am-a-s ‘you love’, am-a-t ‘s/he loves’, etc. Polyfunctional markers lack the semantic homogeneity associated with particular slots in the paradigm. In addition to these two properties, the phonological properties of grammatical markers prevent high degrees of erosion, as they are necessary for formal integration into a morphological paradigm (cf. Lehmann’s b parameter of paradigmaticization; on the properties of erosion in EMSEA languages, see section ..). The above factors of non-obligatoriness, polyfunctionality, and the absence of high degrees of erosion enhance a grammar that licenses the production of much simpler surface structures than we ﬁnd in European languages (cf. the Mandarin example in ()). At the same time, the interpretation of these same structures needs more pragmatic inference for being understood in a given situation. The use of overt marking for expressing a grammatical category is supported by explicitness, and



Umberto Ansaldo, Walter Bisang, and Pui Yiu Szeto

ultimately ends in obligatoriness and phenomena of maturation and form-related complexity as described by Dahl (: ch. ). Explicitness is in competition with economy (cf. Haiman  on the competing motivations of iconicity vs economy), which in turn supports the omission of grammatical information that can be pragmatically inferred or the use of polyfunctional markers whose concrete function in a given situation also needs pragmatic inference. Processes of grammaticalization are subject to both types of motivation. In fact, it is generally argued that they start out from pragmatic inference (e.g. Heine, Claudi, and Hünnemeyer a: ch. ; Hopper and Traugott : ch. ). At later stages, they may develop in the direction of explicitness and obligatoriness with markers that express speciﬁc grammatical categories and contribute to form-related complexity as it is generally discussed in typological work on complexity (e.g. Dahl ; Sinnemäki ). But they may also develop in the direction of economy and grammatical markers whose interpretation keeps depending on pragmatic inference either due to their lack of obligatoriness or their polyfunctionality.⁵ This type of economy-driven hidden complexiy (Bisang , b) contrasts with explicitness-driven complexity that is called ‘overt complexity’ by Bisang (). In other words, at any state x of the grammar of a language, there is a bifurcation that leads either to overt complexity (morphosyntaxbased maturity) or to hidden complexity (pragmatics-based complexity). Such a competition between economy and explicitness is ongoing in all linguistic systems. If explicitness wins, we get maturation in terms of obligatoriness and the complexity that derives from it in terms of Dahl (). If economy wins, we get another type of maturation that enhances hidden complexity. 
This type of economy-based maturation is signiﬁcantly more prominent in EMSEA languages than it is, for instance, in Standard Average European languages. Like explicitness-based maturity, the economy-based maturity we ﬁnd in EMSEA languages is also the result of a long historical development (for a more detailed account, see Bisang ). The factors that support it are the three properties of non-obligatoriness, polyfunctionality, and the phonological stability of the syllable. A fourth factor is language contact (cf. sections .. and ..).

..   There is a tension here because there are some broad generalizations regarding the inﬂuence of social factors on linguistic form that warrant serious consideration (e.g. Wray and Grace ; Lupyan and Dale ; Trudgill ). In addition, the literature on contact-induced change (e.g. Thomason ) has often and for a long time upheld a correlation between formal simpliﬁcation and ecological issues: (i) correlation between group size and structural type; (ii) correlation between network type (open/ closed) and structural type; (iii) correlation between acquisition ⁵ It is important to clarify that hidden complexity depends on the presence of a grammatical marker for a given syntactic category in a given language. Thus, there is no hidden complexity with regard to evidentiality in English because English has no grammaticalized category of evidentiality. In contrast, EMSEA languages have markers of tense-aspect but their use is not obligatory (cf. example ()).

Grammaticalization in isolating languages



(child/adult–monolingual/multilingual) and structural type. At the same time the link between typology and nature of change has long been pointed out (e.g. Givón b for creole development and Haspelmath b for SAE languages). Therefore it is fair to say that the roles of both language-internal and language-external factors in contact-induced change are acknowledged in the literature. In the case of EMSEA languages, the economy-based type of maturity was already signiﬁcant when the languages from the different families ﬁrst came into contact, and it was maintained and further enhanced in most of the languages involved (but not in all of them). An example of how contact contributed to the blocking of morphology to the advantage of syntax-based constructions was given in the Khmer examples discussed in section ... These considerations suggest that correlations between language type and population type that do not take into account the role of the typological ecology in which the changes take place have a limited explanatory power.

..   In Ansaldo’s (: ch. ) work on contact-induced change, it is shown that the same external conditions associated with reduction of form in EMSEA languages can lead to morphological elaboration due to typological and areal conditions. Thus, his example shows how an economy-based system moved in the direction of an explicitness-based system. For instance, in Sri Lanka Malay, we see the evolution of morphology in an isolating language due to typological and areal factors. This involves three stages: () migration of adpositional material to postposition as part of VO > OV change; () reanalysis of PPs into case sufﬁxes; () paradigmatization of a case system. In this way a Malay (Austronesian) variety developed case morphology—a strong areal feature in the Indo-Dravidian area—under the typological inﬂuence of Sinhala and Tamil. This seems to corroborate a type- and area-based account of morphological elaboration and reduction. It further suggests that areal structures clearly play an important role in the shaping of the morphosyntactic structure of a language. In the case of Malay, the areal factor clearly seems to be stronger than language-internal factors. In the case of EMSEA languages, contact certainly contributed to keeping the likelihood of emerging morphological paradigms low, but it interacted with language-internal factors.

. CONCLUSIONS

The properties of grammaticalization are area-dependent: while it is often taken for granted that grammaticalization involves the coevolution of form and meaning, the formal aspects of canonical grammaticalization are much less prominent in EMSEA languages, which are characterized by their isolating typology. It follows that morphological elaboration is also an area-specific phenomenon that rests on explicitness-based maturation, and thus only reflects one type of maturation that can be observed in the cross-linguistic expression of grammatical categories. For that reason, it cannot be taken as a universally valid measure of grammaticalization across the world’s languages. Finally, EMSEA languages as well as Sri Lanka Malay show that typological and areal factors of the languages under investigation play a major role in determining contact-induced change.

12 Typology and grammaticalization in the Papuan languages of Timor, Alor, and Pantar

MARIAN KLAMER

. INTRODUCTION

Similar grammaticalization patterns found across languages do not come about by chance. They may arise because they were inherited from a common ancestor language, or because there are certain universal tendencies in human language structure and evolution that constrain grammaticalization (Narrog and Heine a). Similar patterns in languages may also have diffused through a period of contact. How typology and universalistic tendencies in grammaticalization interact with sociohistorical factors is the issue addressed in this chapter.

This chapter investigates two grammaticalization patterns that are characteristic of the Timor-Alor-Pantar (TAP) family, a family of Papuan languages spoken in eastern Indonesia. The first process that is attested across the family is the grammaticalization of serial verbs into adpositions and verbal prefixes (section .); the second process is the grammaticalization of nouns into numeral classifiers (section .). These grammaticalization processes are cross-linguistically quite common, and I am not aware of any processes that are common in the TAP family and uncommon elsewhere. However, the fact that in the TAP family the grammaticalization of verbs ends in a plethora of applicative prefixes, and virtually no other type, is probably a special property of this family (see section ..). I focus on the question of how we can account for the similar patterns found in the languages of this family. More specifically: to what extent can we say that these similarities are due to the typological similarities between the members of this family? And what, if any, is the role of contact with Austronesian languages spoken in the region?

The Timor-Alor-Pantar (TAP) family comprises ~ Papuan (or non-Austronesian) languages spoken on the islands of Timor, Alor and Pantar in eastern Indonesia (Figs . and .) (Holton et al. ; Holton and Robinson a, b; Klamer a). Note that the term ‘Papuan’ is not a genealogical term, but refers to a cluster of several dozen unrelated language families that are not Austronesian and are spoken on or close to the Papuan mainland. Four of the TAP languages are spoken on Timor and one on Kisar (Fig. .); the rest are spoken on the islands of Pantar and Alor, just north of Timor (Fig. .). The Austronesian (Malayo-Polynesian) languages discussed in this chapter are Tetun, spoken in central Timor (Fig. .), Alorese, spoken on the coasts of Alor and Pantar (Fig. .), and Lamaholot, spoken on east Flores and adjacent islands (Fig. .).

Fig. ..  Languages of the Lesser Sunda islands [map; legend: Austronesian, Austronesian-based creole, Timor-Alor-Pantar]

Fig. ..  The languages of Alor and Pantar [map; legend: Austronesian, Timor-Alor-Pantar]

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Marian Klamer . First published  by Oxford University Press

Grammaticalization in Papuan languages



T .. Alphabetical list of languages discussed in this chapter, with source, island, and afﬁliation Language

Source used

Island(s)

Genealogical afﬁliation

Alorese

Klamer ()

Pantar and Alor

Malayo-Polynesian

Abui

Kratochvíl ()

Alor

Alor-Pantar, TAP

Adang

Haan (), Robinson and Haan ()

Alor

Alor-Pantar, TAP

Blagar

Steinhauer ()

Pantar and Reta

Alor-Pantar, TAP

Bunaq

Schapper ()

Timor

Timor, TAP

Fataluku

van Engelenhoven (, )

Indonesian

Timor

Timor, TAP

(everywhere)

Malayo-Polynesian

Kaera

Klamer (a)

Pantar

Alor-Pantar, TAP

Kamang

Schapper ()

Alor

Alor-Pantar, TAP

Alor

Alor-Pantar, TAP

Kiraman(g) Holton (a) Klon

Baird (, )

Alor

Alor-Pantar, TAP

Lamaholot

Nishiyama and Kelen ()

Flores, Solor, Adonara, Lembata

Malayo-Polynesian

Makalero

Huber ()

Timor

Timor, TAP

Makasai

Huber (forthcoming)

Timor

Timor, TAP

Alor

Sawila

Kratochvíl ()

Teiwa

Klamer (a, b, c, d) Pantar

Alor-Pantar, TAP

Tetun

Hajek ()

Timor

Malayo-Polynesian

Wersing

Schapper and Hendery ()

Alor

Alor-Pantar, TAP

Western Pantar

Holton (b, c)

Pantar

Alor-Pantar, TAP

Alor-Pantar, TAP

Table . is an alphabetical list of the languages discussed or mentioned in this chapter, with their location, afﬁliation, and source. The Timor-Alor-Pantar region is a contact zone where speakers of Papuan and Austronesian speakers have been in contact for , years (Pawley ; Spriggs ), and loans from Austronesian have been borrowed into proto-Alor-Pantar (Holton et al. ). The islands are located over , km from the Papuan mainland and are surrounded by islands where Austronesian¹ languages are spoken. ¹ To be more precise, these languages are part of the Malayo-Polynesian subbranch of Austronesian, and within Malayo-Polynesian, the languages of Eastern Indonesia are traditionally assumed to be part of the Central-Eastern Malayo-Polynesian (CEMP) subgroup (Blust )—though this latter subgrouping has been debated (Donohue and Grimes ).




Moreover, Indonesian, the dominant national language of Indonesia, and a local variety of Malay² are Austronesian languages that are now spoken by virtually everyone on the islands. Given these ancient as well as ongoing contacts between TAP and Austronesian languages, any study of similarities across the TAP languages must also take into account the possible effects of language contact. This chapter is structured as follows. The grammaticalization of verbs into adpositions and afﬁxes is discussed in section .. After outlining the typological features of the TAP family that are important to understand the grammaticalization of TAP verbs (..), I present three case studies of such grammaticalizations: the locational verb *mi ‘be in, at’ (..), the deictic verb *mai ‘come’ (..), and the handling verb *med ‘take’ (..), followed by a summary (..). The grammaticalization of nouns is discussed in section .. I ﬁrst sketch the evolution of numeral classiﬁers in TAP languages (..). Then I discuss the role that was played by contact with the local Austronesian languages Alorese and Tetun (..) and the national language Indonesian (..). The most important difference between the evolution of deverbal adpositions and afﬁxes and of denominal classiﬁers in TAP languages is the role of contact. In deverbal grammaticalization, an abundance of cognates is found across the TAP family, which enables us to reconstruct the evolution of adpositions and afﬁxes back to the proto-language, while contact with Austronesian appears to have played no role at all in the diachronic change. In contrast, the denominal classiﬁers that are attested across the TAP family do not involve a single cognate form, and their evolution appears to be inﬂuenced by Austronesian quite signiﬁcantly. 
It is suggested that the type and intensity of contact between TAP and Austronesian languages determined why the evolution of verbs into adpositions and preﬁxes is different from that of nouns into numeral classiﬁers.

. GRAMMATICALIZATION OF VERBS TO ADPOSITIONS AND AFFIXES

..      This section discusses the typological features of TAP languages that are relevant for the grammaticalization of verbs into adpositions and afﬁxes. The major constituent order in TAP languages is Subject Object Verb (or SV, APV³). Adverbs of time and manner and adjunct phrases also precede the verb. Overall, the TAP languages have very few adpositions; some have none at all. In TAP languages, verbs are

² The Alor Malay variety is related to Kupang Malay, which in turn derived from trade Malay that was used as a trade language in the area for many centuries. ³ Abbreviations in glosses follow the Leipzig glossing rules: IND = Indonesian loan word; NSIT = New SITuation; S = single argument of intransitive predicate (verbal or non-verbal); A = most agent-like argument of a transitive clause; P = most patient-like argument of a transitive clause; R = recipient; T = displaced theme in a transfer event.

Grammaticalization in Papuan languages



distinguished from adpositions in that verbs can take person markers and aspect and mood inﬂections, while postpositions do not take any inﬂectional afﬁxes. Locations (in the village, from the garden, etc.) are often expressed as objects of locational or deictic verbs. An illustration is Kaera (a), where the location abang ‘village’ is the object of the locational verb ming ‘to be at’. In contrast, in (b), abang is expressed as part of a postpositional phrase, with the locative postposition mi. Note that the verb and postposition have similar forms that are probably etymologically related. I will return to this below. ()

Kaera (Klamer b: , ) a. Nang ir boi ming. .SG water river be.at/in ‘I am in the river.’ b. [Abang mi]PP ga-dag. village LOC .SG-leave ‘Leave him/her in the village.’

Transitive verbs of location such as Kaera ming ‘to be at’ are commonly found in TAP languages, and it is a common strategy to express locations as objects of such verbs, as in (a). Another salient feature of TAP languages is the prevalence of serial verb constructions. Such constructions are analysed here as in Klamer (a: –): two or more verbs that occur together in a single clause under a single intonation contour which share minimally one argument that is expressed maximally once. In the TAP languages, serial verb constructions are ‘core-layer’ serializations (Foley and Olson ). They are distinguished from bi-clausal constructions by the presence of a clause boundary marker in the latter.⁴ Serial verb constructions have many different functions, e.g. to encode manner (), cause (), and aspect (). (In the examples below, the serial verb constructions are underlined.)

() Western Pantar (Holton : )
   Habbang  mau    aname   horang      sauke-yabe
   village  there  person  make.noise  dance.lego-lego
   ‘Over there in the village people are making noise dancing lego-lego.’

() Teiwa (Klamer a: )
   A    ta   min-an    ba’
   .SG  TOP  die-REAL  fall.down
   ‘He died falling down.’

() Teiwa (Klamer a: )
   A    bir-an    gi  awan      awan      tas-an      gula’   . . .
   .SG  run-REAL  go  far.away  far.away  stand-REAL  finish
   ‘She ran far away (and) stood [still] . . . ’

⁴ A clause boundary marker is a conjunction/disjunction-like element, e.g. le in Teiwa () and a in Kaera (), or an intonational break signalling the end of a clause, as in Klon ().




Serial verbs are also used to introduce participants into the clause. For example, in Wersing (), the first verb (V) on ‘use’ introduces an instrument, in Kaera (), V wang ‘be, exist’ introduces a goal and in Western Pantar (), V haggi ‘take’ introduces the displaced theme. These Vs are analysed as verbs (rather than as e.g. postpositions) in these languages, as they can still function as independent predicates as well.

() Wersing (Schapper and Hendery : )
   Imi  pok     kinai  on   ken    ba   g-pesi  burik-a.
   man  little  knife  use  cloth  DEF  -cut    snap-REAL
   ‘A young man cuts the cloth with a knife until it breaks.’

() Kaera (Klamer b: )
   Ging  kali-kali  tei   baxi    gu    wang      ekeng     . . .
   .PL   RDP-slow   tree  branch  that  be/exist  climb.up
   ‘Slowly they climbed up onto that tree branch . . . ’

() Western Pantar (Klamer and Schapper : )
   Na-iti       haggi  na-nia.
   .SG-glasses  take   .SG-give
   ‘Give me my glasses.’

Example () also illustrates another typical feature of TAP languages, namely, that transfer verbs (such as ‘give’) are mono-transitive, and their single object is the semantic recipient (R). In (), R is indexed on the verb nia ‘give’ with a preﬁx, just as P is indexed on the verb pesi ‘cut’ in (). As ditransitive verbs are generally lacking in TAP languages, transitive verbs just have two arguments.⁵ The examples presented above further show that verbs in TAP languages generally have little morphology. Verbs take a person preﬁx, but apart from that, few inﬂections are used. Tense is never marked, there is no active/passive morphology, no morphological ﬁniteness distinctions, and few languages have a causative afﬁx (Klamer a: –). In sum, the typological features of the TAP family that are important to understand the grammaticalization of TAP verbs are: (i) preverbal position of arguments and adjuncts, (ii) paucity of adpositions, (iii) locations and directions as arguments of locational and deictic verbs, (iv) abundance of serial verb constructions, (v) lack of underived ditransive verbs, and (vi) limited verbal morphology. Given the overall morphological simplicity of verbs in TAP languages, it is striking to ﬁnd at least four different applicative preﬁx forms in the family (see Table .), some languages such as Sawila and Wersing having more than one applicative. Applicative preﬁxes on verbs function to allow the coding of a thematically peripheral argument or adjunct as a core-object argument (Peterson : ). Applicative verbs in TAP languages license arguments with a wide range of semantic

⁵ Three-participant events may be expressed as (i) mono-clausal serial verb constructions, (ii) bi-clausal constructions, or (iii) particle–verb combinations; see Klamer and Schapper ().

Grammaticalization in Papuan languages



T .. Applicative preﬁxes in TAP languages Language

Appl. 

Teiwa

un-

Adang

u-

Klon

u-

Appl. 

Appl. 

Appl. 

mi-

Kamang

mi-

Makalero

mi-

Sawila

wii-

Wersing

wa-

limi-

le-

T .. Semantic role of arguments introduced by applicative preﬁxes in TAP languages Language

Applicative preﬁx

Semantic role

Sawila

wii-

instrument, displaced theme

Wersing

wa-

(displaced) theme

Sawila

li-

location, partially affected theme

Wersing

le-

goal, location, cause

Adang

u-

theme, goal, beneﬁciary

Klon

u-

patient, recipient, goal, theme

Teiwa

un-

recipient, benefactive, comitative, location, source

Klon

mi-

instrument and other roles

Wersing

mi-

location and other roles

Kamang

mi-

location, goal and other roles

Makalero

mi-

location, affected theme

roles, as illustrated in Table .. For example, a Sawila applicative verb with preﬁx wii- combines with an instrument or displaced theme, and a Sawila applicative with li- licenses a location, or partially affected theme. Moreover, while the semantic range of arguments licensed by etymologically related afﬁxes shows a common core, there are also many differences. All the applicative preﬁxes can attach to both transitive and intransitive verbal bases. With intransitives, the preﬁx increases the valency of the verb by adding an argument, and this argument is semantically neither an agent nor a patient. For instance, mi- (Table .) introduces instruments, locations, goals, and affected themes, among other roles—never an agent or a patient. The most common argument



Marian Klamer

introduced by mi- is in fact a location. When mi- attaches to a monotransitive verb, it does not make the verb more transitive (as ditransitive verbs are dispreferred in TAP languages); instead, it results in a rearrangement of argument structure (Comrie ), where a peripheral participant is coded as a core-object argument. Applicative affixes typically evolve from verbs or adpositions (Peterson ), and it is likely that the applicative prefixes in TAP languages in Table . all have a verbal source form. In the next section this is argued for the prefix mi-. I assume that similar cases can be made for the applicative prefixes wii-, wa-, li-, le-, u-, and un-, though this question will remain outside the current study. The type of grammaticalization where a serial verb ultimately becomes a prefix is cross-linguistically common. However, the fact that in the TAP family the grammaticalization ends in a plethora of applicative (and hardly any other) prefixes seems to be a unique property of this family.

..    -   * ‘ , ’ This subsection is the ﬁrst of three case studies of grammaticalizations of verbs in the TAP family. Here I discuss the evolution of the locational proto-verb *mi ‘be in, at’. The proto-verb is reconstructed on the basis of the cognate forms presented in Table .. In the ten modern languages with reﬂexes of this form, mi functions as an applicative preﬁx, a locative postposition, or a locative verb. In Makalero (), Kamang (), and Wersing (), the only trace left of the protoverb *mi is an applicative preﬁx. In (), it is illustrated how in Makalero the preﬁx licenses an (affected) theme argument (of mi-ma’en ‘to understand X’) or a location. In Kamang () and Wersing () the argument licenced by mi- is a location.

T .. Reﬂexes of the proto-TAP locational verb *mi ‘be in, at’ Language

Verb

Postposition

Preﬁx

Makalero

mi- ‘APPL’

Kamang

mi- ‘APPL’ mi- ‘APPL’

Wersing me ‘LOC’

W Pantar

=mi, mi ‘in; to; into; from’

Blagar Teiwa

me’ ‘be in’

Abui

mia ‘be in’

Kaera

ming ‘be in, at’

mi ‘in; at; to; with’

Adang

mi ‘be in, at’

mi ‘in, at’

Klon

mi ‘be at, to place’

mi ‘LOC’

mi- ‘APPL’

Makalero (Huber : , , , , , , ; Huber forthcoming: , , )⁶
() ma’en ‘know’         mi-ma’en ‘understand X’
   naser ‘stand’        mi-naser ‘stand along X’
   lolo ‘say’           mi-lolo ‘say in language X’
   puna ‘look at’       mi-puna ‘watch/look over/look through X’
   kerek ‘write’        mi-kerek ‘write along with X, copy something’

Kamang (Schapper b: )
() Leon sukuu mi-ilai.
L. hole APPL-look.at
‘Leon looked into the hole.’

Wersing (Schapper and Hendery : )
() Wai aka mira mi-g-tati.
goat fence inside APPL--stand
‘The goat is inside the fence.’

Note that in Wersing (), the applicative prefix encloses the prefix g- indexing a third person, in this case the S wai ‘goat’ (Schapper and Hendery : , ). A morphological configuration like this, where a person prefix occurs within the scope of a valence-changing applicative prefix, goes against the commonly attested affix order, in which affixes with high relevance to the content of the verb (e.g. derivational affixes changing valence) occur closer to the verb stem than affixes with low relevance, such as (inflection-like) affixes with broad scope (cf. Bybee ). The pattern in (), where a derivational prefix occurs further away from the verb stem than the person prefix, is therefore a counterexample to this generalization. In the case of Wersing, this aberrant order could arise because the applicative prefix was originally a separate V in a serial construction preceding the inflected verb (V). Over time, V became prosodically dependent on the inflected verb following it, and became a prefix of the [prefix-V] form.

In four of the TAP languages investigated here, no applicative prefix mi- is in use today. In Blagar and Western Pantar, the modern reflex of *mi functions as an adposition (). In Teiwa () and Abui (), it functions as a verb.

Western Pantar (Holton a: )
() N-iu ang me i-golang.
SG.POSS-mother market LOC PROG-return
‘My mother is returning from the market.’

Teiwa (Klamer a: )
() Lius ita’a me’? A uyan me’.
L. where be.in/at .SG mountain be.in/at
‘Where is Lius? He is in the mountains.’

⁶ The Makalero prefix mi- is glossed as ‘along’ in Huber (). However, the similarities with prefixes mi- in related TAP languages and the derivations given in () suggest that it is an applicative.



Abui (Kratochvíl : )
() Tipai Babi buku do di=ng afen-i, he-n mia . . .
T.B. land PROX .A=see stay-PFV .LOC-see be.in
‘They stayed in the Tipai Babi area, and as they were there, . . . ’

Languages where reflexes of *mi are used as both verb and adposition are Adang (,) and Kaera (a,b). The Kaera examples below illustrate the semantic variability of the complements of Kaera mi: in () it marks a goal, in () and () an instrument, in () a theme.

Adang (Haan : )⁷
() Roni ip-l-e baang mi.
R. go.down-DIR-DIST house be.in
‘Roni is down there at the house.’

Adang (Robinson and Haan : )
() Na ʔArabah mi mih.
SG.SBJ Kalabahi in sit/live
‘I live in Kalabahi.’

Kaera (Klamer b: –)
() Ui gu gang [abang mi] gi.
person that .SG village MI go
‘That person goes to the village.’

Kaera (Klamer b: –)
() Ui gu gang [ped mi] tei patak-o.
person that .SG machete MI wood cut-FIN
‘That person cut wood with a machete.’

Kaera (Klamer a: –)
() Gang [naxar mi] n-aas-o.
.SG rice MI .SG-feed-FIN
‘S/he fed me (with) rice.’

Kaera (Klamer a: –)
() Gang [foto mi] na-taring.
.SG photograph (IND) MI .SG-point.at
‘S/he showed me a picture.’

There is one TAP language that exhibits the entire grammaticalization continuum of *mi: Klon. In (), Klon mi is an independent locational verb ‘to be at’ that takes the location lale Hwak weer ‘below Hwak river’ as object:

() Klon (Baird : )
Lale Hwak weer mi. Ini gen agai taa.
below H. river be.at .PL reach go sleep
‘[They] were at below Hwak river. They eventually went to sleep.’

⁷ The spelling of Haan () has been adapted.

Klon mi can also be the first verb (V) in a serial verb construction to introduce a locational argument, as in ():

Klon (Baird : )
() Ini gen agai lale Hwak weer mi taa.
.PL reach go below H. river be.at sleep
‘They went until below Hwak river sleeping there.’

In (a),⁸ Klon mi forms a particle–verb combination with the second verb in the serialization, as it cannot be fronted along with the locational expression, (b). That is, mi taa in (a) has become a single morphosyntactic unit, a kind of complex verb. However, mi can also function as an independent verb and form its own clause when it is not adjacent to taa, as was shown in ().

Klon (Baird : )
() a. Lale Hwak weer ini gen agai mi taa.
below H. river .NSG reach go be.at sleep
‘They went until below Hwak river sleeping there.’
b. *[Lale Hwak weer mi] ini gen agai taa.
below H. river be.at .NSG reach go sleep
Not good for: ‘Until at below Hwak river they went to sleep.’

Klon mi can also function as a postposition, projecting separate PPs. This is illustrated in () with the PPs makna mi ‘from the past’ and Lahtal ta mi ‘at God above’. A more literal translation of () would be ‘ . . . from the past, fate sits at God above [while] we’re dying’, where mi encodes a temporal and a locational adjunct.

() Klon (Baird : )
. . . , makna mi Lahtal ta mi tengtang mi~mih⁹
past be.at God above be.at fate RDP-sit
t~t-ebeer.
RDP-.NSG.INCL-die
‘ . . . from the past God above decides our fate.’

Finally, Klon mi is also attested as a verbal preﬁx. In (a), mi is a free verbal or adpositional element, encoding a location (oot ‘room’) and combining with the verb uur ‘see’ in a serial or particle–verb construction. In (b), mi- is a preﬁx licensing the instrument by which something is seen (kacamata ‘glasses’).

⁸ The translation provided for (a) and () by Baird (: ) is identical, as these sentences are part of the argument that mi cannot be fronted along with the locational expression in (). It is likely that (a) has a focused location, which would render the English translation ‘Until below Hwak river they went sleeping there.’ ⁹ There is currently no etymological relation between Klon mih ‘sit’ and mi ‘be at’. Mih is a reﬂex of proto-Alor-Pantar *mis, proto TAP *mit ‘sit’ (Holton and Robinson a: ; Schapper et al. : ), while mi is a reﬂex of the proto-TAP locational verb *mi ‘be in, at’. There may be a relation between the proto-TAP posture verb and existential/locative verb (as is e.g. the case for Oceanic posture and existential/ locative verbs: Lichtenberk a), but in TAP the verbs have been different lexemes since the proto-stage of the language.



Klon (Baird : )
() a. . . . bo ga oot mi uur, . . .
SEQ .ACT room be.at see
‘ . . . and she looked into the room, . . . ’

Klon (Baird : )
b. Na kacamata mi-uur.
SG.ACT glasses (IND) APPL-see
‘I see with glasses.’

The Klon applicative prefix can also enclose a person prefix, as shown in (). This configuration is similar to what we have seen in Wersing (), and the historical trajectory is identical. In both languages, the V mi in a serial construction came to fuse with the V that already had a person prefix.

Klon (Baird : )
() nal ‘observe’        mi-g-nal ‘APPL--observe’        ‘pick it [using something]’
   uuh ‘hold on hip’    mi-g-uuh ‘APPL--hold on hip’    ‘hold her on hip using cloth’

In sum, cognates of a locational lexeme *mi are found in ten languages across the TAP family, and a proto-TAP locational verb *mi ‘be in, at’ can be reconstructed. The modern reflexes of *mi in Klon function as an independent verb alongside more grammaticalized uses as a locative adposition and an applicative prefix. A careful comparison of the cognate forms found in the other languages suggests that their reflexes of the proto-verb *mi occupy different points on the continuum verb > postposition > prefix. The head-final syntax and abundance of serial verb constructions in TAP languages played a crucial role in the grammaticalization of *mi: an original serial verb construction where mi has a preverbal argument NP and is followed by another verb ([NP mi-V V]) grammaticalized into a construction where mi became a locative adposition ([[NP mi]PP V]) and/or an applicative verb ([NP [miApplicative-V]]). If in the latter construction the second verb already had a person prefix attached to it, this prefix became enclosed inside the applicative prefix.

..    -   * ‘’ This subsection presents the second case study of verbal grammaticalization in the TAP family. Here I discuss the evolution of the deictic proto-verb *ma ‘come’, which is reconstructed on the basis of the cognate forms presented in Table .. (A protoform of this verb was reconstructed for proto-Alor Pantar in Holton et al. : , but as cognates are also attested in the Timor languages, it is reconstructed here as a proto-TAP verb.)¹⁰ ¹⁰ Austronesian proto Malayo-Polynesian (MP) has *maRi ‘come’ and is older than proto TAP. As it is possible to reconstruct *ma ‘come’ to proto TAP, if this were an MP loan then it must have been borrowed at the proto-level of TAP. Proto-Oceanic has *mai ‘come’, but proto-Oceanic is much younger than proto TAP. The formal similarity may also be a coincidence.

T .. Reﬂexes of the proto-TAP deictic verb *ma Language

Verb

W Pantar

ma ‘come’

Kaera

ma ‘come’

Adang

ma ‘come’¹¹

Kamang

ma ‘come’¹²

Sawila

me ‘come’

Wersing

mai ‘come’

Klon

ma ‘come’

Fataluku

ma’u ‘come’

Makalero

ma’u ‘come’

Bunaq

man ‘come’

Teiwa

ma ‘come’

ma ‘OBL’

Blagar

ma ‘come’¹³

=ma ‘OBL’¹⁴

Makasae

Enclitic/postposition

ma ‘OBL’

The meaning of the reflexes of *ma combines a motion and a deictic component; i.e. ‘come here, come towards deictic centre’ (Klamer b). Except for Makasae, all the TAP languages surveyed here use cognates of this verb as both independent verb (as in Teiwa ()) and V in a serial verb construction (as in Sawila (), where V me is a reflex of *mai).

Teiwa (Klamer a: –)
() Ha’an la ma le na’an la wa?
.SG FOC come or .SG FOC go
‘Are you coming or am I going?’

Sawila (Kratochvíl : )
() Ga-me tana mu likka dang gaapa=ma
.I-come same.time tree large NFIL.one shadow=be.PROX
‘He came under [the shadow of] a large tree [. . .]’

¹¹ Adang ma ‘come toward speaker from nearby (same level)’.
¹² The examples in Schapper () contain three different forms for ‘come’: me (ex. , p. ), maa (ex. , p. ), and ma (ex. , p. ). Given the vowel /a/ in the proto-verb *ma, I assume ma(a) to be the basic shape of this verb. The alternative form with the vowel /e/ is homophonous with the defective verb me ‘take’ that functions as a postposition in Kamang (see Table .).
¹³ This word is variously glossed as ‘come’ and ‘come.LEVEL’ (Steinhauer : , ).
¹⁴ This enclitic is glossed as ‘move’ in Steinhauer (: , , , ) because synchronically in Blagar the relation between ma ‘come’ and =ma is not obvious. However, the similarities in form, semantics, and distribution of verbal and postpositional ma in Blagar and closely related Teiwa do suggest an etymological relation between the lexemes.



In Teiwa, Blagar, and Makasae, *ma has also developed a function as an adposition to encode oblique arguments, such as locations (), sources (), instruments (), or displaced themes (). The latter function is also observed for Blagar =ma, which cliticizes to the noun it marks as displaced theme, as illustrated in (). The semantic role of the participant introduced by ma is largely determined by the semantics of the main verb of the serial verb construction; the reflexes of *ma just function to flag obliques.

Teiwa (Klamer a: –)
() Tami un Lius ga-siban ma tas.
tamarind.tree while Lius .SG-behind OBL stand
‘The tamarind tree is behind Lius.’
() Sangubal ma bir-an daa.
Sangubal OBL run-REAL ascend
‘The refugee(s) who ran up from Sangubal.’
() Uy nuk ped ma tei taxar.
person one machete OBL wood cut
‘Someone cuts wood with a machete.’
() Na-xala’ yir ma bif ga-mian hufa’.
s-mother water OBL younger.sibling .SG-put.at drink
‘My mum gives water to the child to drink.’

Blagar (Steinhauer : )
() Na buk=ma e panatu.
.SG.SUBJ book=OBL .SG.POSS send
‘I sent a book to you.’

In Teiwa, Kamang, and Makalero, in certain contexts, reflexes of *ma are found that show more traces of the original ‘movement’ meaning of this deictic verb, by having functions that involve a metaphorical extension of the meaning of motion. For example, Teiwa ma can encode futures and hortatives as ‘motion in time’ (, ), and Makalero ma’u ‘come’ can encode a hortative (cf. () and ()). In Kamang and Makalero, *ma only has this derived function; in Teiwa it is also used to flag obliques.

Teiwa (Klamer a: )
() Ha ma nili pat-an.
.SG [come debt pay.back]-REAL
‘You will pay back the debt.’

Teiwa (Klamer a: )
() Ma pi-maran ma gi.
come .PL.INCL-hut OBL go
‘Let’s go to our hut.’

Makalero (Huber, forthcoming)
() Kiloo aite’=ini ma’u.
.SG REC.PST=CONJ come
‘He only just arrived.’

Makalero (Huber : )
() Ma’u fi Makalero lolo!
come .PL.INCL M. say
‘Let’s speak Makalero!’

In sum, the original verb *ma ‘come (here, to deictic centre)’ combines a motion with a deictic component. In most languages investigated here it grammaticalizes into an oblique adposition where the motion component has been ‘bleached’ completely and only the deictic semantics survive. However, in Teiwa it has developed in two different directions: one direction as an oblique adposition with bleached movement semantics, and another direction as a future and hortative marker that has kept the meaning of motion. (Klamer a: – presents a full description of Teiwa ma and all its derived functions.)

The deictic proto-verb *ma ‘come’ has reflexes as main and serial verb in thirteen TAP languages, but evolved into a postposition/enclitic in only three of them. There are thus far fewer languages showing the continuum from verb to adposition for *ma ‘come’ than there are for *mi ‘be in, at’ (section ..). The reason for this difference may lie in the different semantic composition of the two verbs. It may be easier to develop an adposition from an original locational verb like *mi ‘be in, at’ because it involves less semantic bleaching than when the source verb is a deictic verb like *ma ‘come’. The latter verb contains information on both movement and location, and must bleach the movement component of its verbal semantics to become a locational adposition (cf. Klamer a: –).

..    -   * ‘’ The third case study of grammaticalization of verbs in the TAP family is the development of the handling verb *med ‘take’. Reﬂexes of this verb are found in twelve TAP languages (see Table .). In all languages the verb occurs frequently in serial constructions, and in three languages (Kaera, Blagar, Kamang) it also functions as a postposition, enclitic or sufﬁx. In serial verb constructions, the verb is formally defective: it is phonologically reduced, and has lost some (but not all) of its verbal properties, such as being able to take person or aspect/mood inﬂections. The ‘defective’ reﬂexes of *med in Teiwa, Kaera, Blagar, and Kamang are all phonologically reduced. The reduction may involve loss of voice in the ﬁnal segment (Teiwa mar vs mat, Kaera med vs met), syllable contraction (Blagar medi vs met),¹⁵ or loss of the ﬁnal segment (Kamang met vs me). They typically occur in serial verb constructions.¹⁶ This is illustrated for Kaera in (), where the ﬁrst clause is headed by the full verb med ‘take’, while in the second clause the ‘defective’ verb met ‘take’ is combined with mi and -(e)ng ‘give’. Met is formally reduced as the ﬁnal consonant

¹⁵ The particle met is a contracted form of the inﬂected verb medi-t ‘take-MANNER’ (Steinhauer : ). ¹⁶ Teiwa mat is an exception to this: unlike mar it cannot take inﬂections, but like mar it can head an independent clause.



T .. Reﬂexes of the proto-TAP handling verb *med ‘take’ Language

Verb

Light verb

Postposition

Adang

med

Abui

mi

Sawila

mi

Wersing

medi

Klon

med

Fataluku

me¹⁷

Makasae

ma¹⁸

Teiwa

mar, mat¹⁹

Kaera

med

met

me

Blagar

medi

met

met

Kamang

met

me

me

Makalero

mei

Preﬁx

=m, -m mat

m-

lost its voice and it has limited distributional properties: it can only occur as V of a serial verb, not as an independent main verb. Its semantics does not appear to be bleached (yet). The function of mi in this construction is unclear.

() Kaera (Klamer a: )
Gang ge-topi gu med a,
.SG .SG.ALIEN-hat that take CONJ
‘He takes that hat of his,
xabi mampelei utug met mi kunang masik namung gu gi-ng.
then mango three take LOC children male PL that .PL-give
then takes three mangoes to give to the boys.’

A further grammaticalization stage is when the lexeme for ‘take’ is used as a postposition to license arguments. This stage is observed in Kaera, Blagar, and Kamang. In (), Kamang me introduces an instrument, and in (), it licenses the displaced theme in a construction with a mono-valent ‘give’ verb. Similar functions have been attested for Fataluku me (see van Engelenhoven : –), as illustrated in (), where Fataluku me is reduced to =m and licenses the displaced theme of ‘give’.

Kamang (Schapper : )
() Nal isei maa kii me maung-ma.
.SG game edible palm.rib take make.hole-PFV
‘I poked the meat with a palm rib.’

¹⁷ The verb me also has two allomorphs, eme and em, containing a person prefix e-.
¹⁸ Makasae ma ‘take’ derives from *mei > *mai > ma ‘take’ (cf. Klamer and Schapper ).
¹⁹ Both forms can function as the verb ‘take’, but only mar can take person prefixes and realis suffixes.

Kamang (Schapper : )
() Maria falak me ne-n.
M. cloth take .SG.GEN-give
‘Maria gives me a cloth.’

Fataluku (Klamer and Schapper : )
() Markus akam lepuru=m an ina.
M. NEG book=take SG.OBL give
‘Marcus didn’t give me the book.’

In (), there are two Fataluku verbs ‘take’. The second ‘take’ verb has been incorporated into the VP [e-me ina], and a serial verb construction with a new free verb ‘take’ has been created. As a result, the theme mace-nu ‘food’ (lit. ‘eat-NMLZ’), is marked by a free verb ‘take’, while there is also a verb ‘take’ that is merged with the verb ‘give’ (Klamer and Schapper : –).

() Fataluku (Klamer and Schapper : )
. . . mace-nu me [e-me ina] tu una
eat-NMLZ take it-take give SEQ eat
‘ . . . give food to eat’

The stage involving two reflexes of *med, one of which is part of the VP with ‘give’, is taken one step further in Makalero. In Makalero, the second reflex of *med has been reduced to just a consonantal prefix m-. Makalero ‘give’ constructions are formed around the verb root -ini ‘give’, where a pronominal prefix encodes the recipient. This pronominal is prefixed to -ini, and together they form the host of the prefix m- that reflects the original *med lexeme (see Table .). In other words, the deverbal prefix m- captures the pronominal prefix; it creates a full paradigm of univerbated ‘give’ with an entrapped recipient object prefix. An illustration is given in ().

Table . Makalero free pronouns and inflections of -ini ‘give’ (Huber : , –)

           Free pronouns    Underlying ‘give’ (M-R-GIVE)    Surface ‘give’    Meaning
SG         ani              m-ani-ini                       manini            ‘give to me’
SG         ei               m-ei-ini                        meini             ‘give to you’
           ki-loo(ra)       Ø-ki-ini²⁰                      kini              ‘give to him/her/it/them’
PL.EXCL    ini              m-ini-ini                       minini            ‘give to us’
PL.INCL    fi               Ø-fi-ini                        fiini             ‘give to us’
PL         ii               m-ii-ini                        miini             ‘give to you’

²⁰ The absence of the initial m- on the rd person and st person inclusive reﬂects a restriction on onset clusters */mk/, */mf/.



Makalero (Klamer and Schapper : )
() ... asi-osan hai muni m-an-ini
SG.POSS-money NSIT return m-.SG-give
‘... (he) gave my money back to me’

In summary, in some languages, proto-TAP *med ‘take’ evolved into (formally reduced, defective) verbs in serial constructions and adpositions to encode additional arguments. In Fataluku and Makasae, the adposition merged with the second verb and its object prefix. Once this happened, a serial construction with a new verb me ‘take’ was created.

..     TAP   A  Three verbs were reconstructed for proto-TAP: *mi ‘be in, at’, *ma ‘come’, and *med ‘take’. These verbs show various stages of grammaticalization, and in some languages they developed into postpositions and verbal preﬁxes. On the comparative evidence we can reconstruct the following grammaticalization chain. When a verb has a preverbal argument NP and is followed by another verb in a serial verb construction ([NP V V]), it can grammaticalize into a construction where V becomes a postposition ([[NP P]PP V]) and/or an applicative preﬁx on V ([NP [preﬁxApplicative-V]]). If V has a person preﬁx attached, this preﬁx may be enclosed inside the applicative preﬁx ([NP [preﬁxAPPL-preﬁxpersonV]]). This kind of deverbal grammaticalization is possible because of the typology of the TAP languages described in section ..: objects precede the predicate, underived ditransitive verbs are absent, there are few if any postpositions,²¹ and locations and directions are typically expressed as arguments of locational and deictic verbs; locations, directions, instruments, goals, sources, and comitatives precede the major verb in a serial verb construction, and there is an overall prevalence for such serial verb constructions. Furthermore, there is often little verb morphology to ‘betray’ the categorical status of verbs, or this morphology is lost, so that in a serial verb construction the V can easily be reinterpreted as an oblique marker and grammaticalize as a preﬁx on V. We can see the role of TAP typology in deverbal grammaticalization even more clearly when we compare the processes discussed for TAP with similar processes of deverbal grammaticalization in Austronesian languages of the region.

²¹ It has been suggested that there is a relation between the size of a language’s inventory of adpositions and the occurrence of serial verbs of different types, to the extent that languages lacking adpositions will use verbs to express the arguments of the clause (Bickerton : –). However, Crowley () has shown that the correlation does not always hold, citing Austronesian (Oceanic) languages with few adpositions that lack serial verbs altogether, as well as languages that combine dozens of adpositions with numerous serial verbs. On the surface, the TAP languages appear to conﬁrm the correlation, but in the typological proﬁle of this family I sketch in this chapter, serialization is connected to many other features which all work together to create the morphosyntax of TAP languages.

I focus on three Austronesian languages that are currently spoken on Pantar, Alor, and Timor: Indonesian, the national language of Indonesia; Alorese, spoken on the western coasts of Pantar and Alor (Klamer ); and Tetun, the national language of East Timor (Hajek ). The following typological features of these languages are the opposite of those in TAP: the languages have verb–object order, and they have underived ditransitive verbs like ‘give’. The Alorese ‘give’ construction in () employs a ditransitive verb with two bare object NPs that follow the verb:

Alorese (Klamer : )
() Ama kali ning go bapa seng.
man that give.(to) .SG father money
‘That man gave my father money.’

Austronesian languages commonly have at least a few prepositions, and they express locations and directions (as well as instruments, goals, sources and comitatives) as prepositional phrases that follow the main verb. Austronesian verbs have derivational morphology (applicative, causative, passive, active) to manipulate the verb’s argument structure and valency. Like the TAP languages, Austronesian languages in eastern Indonesia show a prevalence of serial verb constructions where one verb may grammaticalize, but in the Austronesian languages of this region, the grammaticalizing verb is the second verb rather than the first.

Some illustrations are given for Indonesian in ()–(). Indonesian has prepositions projecting PPs, encoding comitatives, locations (), sources and goals (), and instruments (). Such PPs follow the verb. In the variety of Indonesian spoken in eastern Indonesia, instruments can also be expressed as the object of the instrumental verb pakai ‘use’, (), which then occurs as V in a serial construction.

Indonesian (own knowledge)
() Saya berbicara dengan dia di rumah.
.SG talk with him at home
‘I talked with him at home.’
() Saya lari ke/dari hutan.
.SG run to/from forest
‘I run to / from the forest.’
() Saya kejar babi dengan kayu.
.SG chase pig with stick
‘I chased the pig with a stick.’
() Saya kejar babi pakai kayu.
.SG chase pig use stick
‘I chased the pig using a stick.’

In Indonesian, verbs do not typically grammaticalize into adpositions. However, in the indigenous Austronesian languages Alorese and Tetun some verbs underwent exactly this kind of grammaticalization. In Alorese, instruments and comitatives are marked with a



preposition nong ‘with, and’, as illustrated in () and (). There is good evidence that the source form of nong was a comitative verb -ong ‘to be with’. (The grammaticalized form is phonologically heavier than the original verb because it contains an old consonantal prefix n- ‘SG’; see the evidence presented immediately below.)

Alorese (Klamer : )
() Ama to tari kaju nong peda.
father one cut.down wood with/and machete
‘Someone cut the wood with a machete.’
() Ama kali nei nong ni kafae.
father that SG.go.to with/and POSS wife
‘That person went (there) with his wife.’

The evidence comes from a sister language of Alorese, Lamaholot, which is spoken on eastern Flores and the islands in between Flores and Pantar. Lamaholot has a cognate (and structurally defective) verb -oʔon ‘and, be with’. This verb can be used as a comitative predicate, with a prefix indexing the subject, as in (a), although such contexts also allow the use of a (default) SG prefix n-, as in (b). The lexeme -oʔon also functions as a conjunction in (). In such cases, an obligatory default SG prefix attaches to it (Nishiyama and Kelen : –). The prefix no longer has a referential function in such contexts.²²

() Lamaholot (Nishiyama and Kelen )
a. Go səga k-oʔon mo.
.SG come .SG-be.with .SG
‘I came with you.’ (Nishiyama and Kelen : )
b. Go səga n-oʔon mo.
.SG come .SG-be.with .SG
‘I came with you.’ (Nishiyama and Kelen : )
() Mo belə n-oʔon baʔa.
.SG big .SG-be.with heavy
‘You’re big and heavy.’ (Nishiyama and Kelen : )

In the Alorese word nong ‘and, with’, the original SG prefix n- has been fossilized as initial consonant, and the word has completely lost all of its verbal properties (cf. Klamer ). So the Alorese instrumental/comitative preposition is a grammaticalization of a comitative verb.

In Tetun, some verbs also developed prepositional functions. One example is the handling verb lori ‘take’. When lori is used as a verb it is placed before the major verb in a serial construction, as in (). But when it functions as an instrumental preposition it is placed after the verb, as in (). As a preposition, it follows the TAM markers which may occur after the major verb, and

²² In Nishiyama and Kelen () this item is variously described as ‘conjunction’, ‘preposition’, or ‘comitative’, but here it is analysed as a (structurally defective) verb on the basis of its agreement patterns.

Grammaticalization in Papuan languages



always appears in the same position as oblique prepositional phrases at the end of the clause. In other words, when the original verb lori has become a preposition, it must occur at the end of the clause, to comply with the typical Austronesian pattern mentioned above that prepositional phrases follow the predicate. Tetun (Hajek ) () Abó lori tudik ko’a paun. grandparent take knife cut bread ‘Grandfather used (lit. ‘took’) the knife to cut the bread.’ (Hajek : ) () Abó ko’a paun lori tudik. grandparent cut bread take knife ‘Grandfather cut the bread with the knife.’ (Hajek : ) In sum, a serial verb construction in the Austronesian languages Alorese and Tetun ([V NP V (NP)]) may grammaticalize into a predicate plus PP construction ([V (NP) [Prep NP]]). In contrast, a serial verb construction in TAP languages ([NP V (NP) V]) may grammaticalize into a PP plus predicate construction ([[NP Postp] (NP) V]). The typological characteristics of a family such as word order and lexical inventory of verbs and adpositions thus determine the outcome of grammaticalization in both families. Verbs with preﬁxes originating from verbs are only found in TAP languages because this development requires a conﬁguration where the grammaticalizing verb is a V in a serial verb construction. In the Austronesian languages of the region, it is the V that grammaticalizes. I have not found any evidence that the verb > postposition change in TAP has been affected by Austronesian structural features. In contrast, if any structural diffusion took place, it was probably in the other direction, from TAP into Austronesian. 
For instance, the serial verb construction with lori ‘take’ in Tetun that is used to express instrumental constructions and precedes the main verb () is probably a pattern that diffused from TAP substrate language(s), because instrumental constructions in Austronesian languages that employ a verb typically employ the verb ‘use’ rather than ‘take’, as shown in (). Instrumental constructions with the verb ‘take’ are very rare in Austronesian, while they are used across the board in TAP.

. GRAMMATICALIZATION OF NOUNS: NOUNS > NUMERAL CLASSIFIERS

..        Numeral classiﬁers are attested across the TAP family, and this section presents an account on how they developed from nouns. Much of the discussion in this section is based on work published elsewhere (Klamer b, d) to which the reader is referred for further descriptive and analytical details. The sources for the descriptive data presented here are given in Table ..



Marian Klamer

Numeral classifiers are morphemes that appear next to a numeral, and categorize the referent of a noun in terms of its animacy, shape, and other inherent properties (Aikhenvald : ). The numeral classifiers in TAP discussed in this chapter are sortal classifiers. As no cognate forms of numeral classifiers have been attested in any of the TAP languages, we cannot reconstruct a classifier for proto-TAP. Apart from a classifier for humans, the various forms reported here show no similarities between the numeral classifiers in individual TAP languages. Languages also vary in the size of their classifier inventory. For instance, Adang has fifteen reported classifiers, Makalero has five, while Klon has three. There are also some TAP languages for which no classifiers have been attested, as in Bunaq and Kaera (see example ()). In addition, in every language that has them, the classifiers use different types of categorizations. For instance, fruits are classified in Teiwa according to their shape, using a dedicated fruit classifier for long fruits (kam), cylindrical fruits (yis), or round fruits (quu’) (Klamer b, d). In contrast, Adang classifies fruits together with animals and humans using just a single classifier (pir) (Robinson and Haan ), while in Western Pantar, fruits are classified together with ‘contents’ (hissa), and in Klon, Kamang, and Makalero, fruits are not classified at all. Similar observations can be made for the diverse classification of animals or objects. In addition to the high level of diversity in forms and functions of TAP numeral classifiers, they are also grammatically optional and often their source form is still in use as a noun. The lack of cognate forms, the high level of variation in classifier inventories and categorizations, and the grammatical optionality of classifiers in TAP languages together suggest that in this family, classifier systems are relatively recent developments that must have developed after the proto-language split up.
Classifiers in the Alor Pantar sub-family developed out of nouns (Klamer b, d), in particular from botanical nouns indicating the parts of plants, such as ‘fruit’, ‘leaf’, and ‘seed’. Such ‘part-of-whole’ (PoW) nouns are attested across the family, and cognates are found in many Alor-Pantar languages. An illustration is the cognate set of the proto-AP PoW noun *hera ‘stem, base (of a tree)’ in ().

() Reflexes of proto-Alor Pantar *hera ‘stem, base’ (Klamer d: ; Holton b: )

LANGUAGE     LEXEME    MEANING
W. Pantar    haila     ‘base, area’
Teiwa        heer      ‘stem, base’
Kaera        er        ‘stem, base’
Blagar       era       ‘base’
Adang        (s)el     ‘rigid, standing object’
Klon         yar       ‘trunk’
Abui         iya       ‘trunk’
Kamang       ela       ‘base’
Kiraman      yira      ‘tree’

PoW nouns like these combine with generic nouns as illustrated in (). Generic nouns have a general meaning and no referent in the real world. An example of a

generic noun is the Teiwa noun wou ‘mango-hood’. Wou is glossed as ‘mango-hood’ to indicate that it refers to anything related to mango-hood. On its own, it cannot be used as a referential expression, and it must be accompanied by a PoW noun in order to refer to certain particular parts of a mango-plant, as illustrated in (a–d). ()

Teiwa (Klamer d: ) a. wou bag mango-hood seed ‘mango seed(s)’ c. wou qaau mango-hood ﬂower ‘mango ﬂower(s)’

b. wou wa’ mango-hood leaf ‘mango leaf (leaves)’ d. wou heer mango-hood stem ‘mango tree(s)’

Together, the PoW noun and the generic noun form a complex (compound) noun. This complex noun can then be individuated and counted. In numeral expressions, the numeral phrase follows the nominal head, (). () [N N] NUM wou bag yerig mango-hood seed three ‘three mango seeds’ It is likely that classiﬁers developed out of the PoW nouns in a structurally ambiguous structure like the one in (), as shown in (). Through a simple (‘re-bracketing’) reanalysis of numeral NPs, the PoW noun bag was reanalysed to be part of the numeral phrase. ()

Structural reanalysis of Teiwa bag ‘seed’ in the NP ‘three mango seeds’ (Klamer d)

a. Bag as PoW noun:  [NP [N [N wou] [N bag ‘seed’]] [NumP [NUM yerig ‘3’]]]
b. Bag as classifier: [NP [N wou] [NumP [CLF bag] [NUM yerig ‘3’]]]

Another factor that must have played a role in allowing this reanalysis is the fact that in all the TAP languages, nouns are ‘number-neutral’. This means that bare nouns can have either a singular or a plural interpretation, and that number is not marked on nouns. This is illustrated in (). In (a) qavif ‘goat’ can be interpreted as singular or plural, depending on the context of the utterance. However, nominal



plurality can be made explicit with a separate lexeme, the plural number word non in (b), where qavif cannot be interpreted as singular.

Teiwa (Klamer, Schapper, and Corbett )
() a. Qavif ita’a ma gi?
      goat where come go²³
      ‘Where did the goat/goats go to?’
   b. Qavif non ita’a ma gi?
      goat PL where come go
      ‘Where did the (several) goats go to?’

Plural number words like Teiwa non are lexemes whose meaning and function is similar to that of plural affixes in other languages (Dryer ). Cognates of plural number words are attested across Alor and Pantar, and a proto-form *non can be reconstructed for proto-Alor Pantar (Klamer et al. ). The number-neutral TAP languages have developed their classifiers in parallel processes that took place independently of one another. It has been observed (e.g. Gil ) that number-neutral languages often have numeral classifiers. The semantic motivation for this is that number-neutral languages are likely to develop classifiers to individuate their nouns, in order to create units for quantification and counting (cf. Thompson ; Link ; Gil ). In other words, nouns could become classifiers in TAP languages because classifiers are useful things to have when you want to individuate a number-neutral nominal expression. And the structure of the noun phrase allowed the reanalysis to take place, as we saw above.

However, if it was just the family-specific syntax and semantics that determined the development of classifiers in TAP, then why are there no more cognate classifiers and similar ways of classification attested in the individual TAP languages? In the evolution of verbs into adpositions discussed in section ., cognate forms with similar meanings were found across the family members. Why do we not find more similarities in the classifiers of the family? In the next section I argue that this is because the classifiers are not inherited but rather contact-induced.

..       As mentioned in section ., the Timor-Alor-Pantar region is a contact zone where speakers of Austronesian and Papuan languages have been meeting for several millennia. This contact has played a role in the development of classiﬁers in the TAP languages. Proto-TAP lacked numeral classiﬁers, just as other Papuan families typically lack them: out of dozens of Papuan families, only a few have classiﬁers, and, crucially, these language groups are located in the Bird’s Head of Papua, Halmahera, and Timor Alor Pantar, regions that have had long-standing contacts with Austronesian languages (Klamer d: –). In other words, if we come across a Papuan ²³ Teiwa gi ‘go (from deictic centre); cf. wa ‘go (from deictic centre; not far) in example () (Klamer a: : ).

Grammaticalization in Papuan languages



language that has classiﬁers, the chances are very high that they are not inherited, but a diffused Austronesian trait. Note that classiﬁer systems are in general easily diffused (Nichols : –), and as such they are often mentioned as markers for linguistic areas. Contact with Austronesian speakers thus played a role in the development of classiﬁers in the Timor Alor Pantar languages. In addition, the intense contact with Indonesian in recent times is also a strong driving force, as I argue in the next section.

..       Indonesian, the national language of Indonesia, has been used in the TAP region as lingua franca and language of education since at least the late s, and today it is spoken by almost everyone. In Indonesian, sortal classiﬁers are obligatory in numeral contexts. Indonesian has a general classiﬁer buah, which derives from the noun buah ‘fruit’. Buah classiﬁes fruits, but when it is used as a general classiﬁer it classiﬁes three-dimensional objects such as cars (). Buah is the ‘most general classiﬁer [which] has almost lost any semantic, conceptual content’ (Hopper : ), and ‘classiﬁes things that do not have deﬁnite types and shapes’ (Chung : ). In a similar way, Teiwa uses a general classiﬁer bag, as in (). Indonesian (own knowledge) () dua buah mobil two CLF car ‘two cars’ Teiwa (Klamer d: ) () Qarbau bag ut water.buffalo CLF four ‘four water buffaloes’ General classiﬁers categorize entities that are semantically unrelated to the original meaning of their source form, and the semantics of the source form has been bleached. For example, the Teiwa classiﬁer bag derives from the PoW noun meaning ‘seed’. As a general classiﬁer it can classify everything except fruits, including all kinds of non-plant objects and animals (Klamer d: –). The original meaning ‘seed’ has disappeared completely; the classiﬁer now just has an indviduating function. General classiﬁers like this have been reported for a few other Alor Pantar languages, as shown in (). The forms do not share etymologies. ()

General classifiers in AP languages with source meaning and classification

Western Pantar bina < ‘be detached’: classifies many different types of nouns, including fish and fruit.
Teiwa bag < ‘seed’: classifies all objects except fruits, as well as animals.
Adang pa’ < ‘non-round fruit’: classifies objects of many shapes and sizes, including arrows, drums, borrowed nouns, birds, fish.
Kamang uh, with unknown etymology: its classification includes human beings and animals.



There is much inter- and intra-speaker variation in the use of general classifiers (cf. Klamer d). This suggests that they are a relatively recent development. They probably arose following the example of Indonesian buah. Note however that Indonesian buah means ‘fruit’, and as a general classifier it classifies objects and fruits, but not animals. In contrast, the general classifiers in the AP languages do not derive from a noun meaning ‘fruit’ and can be used to classify animals, as shown by Teiwa bag. That is, neither the form, nor the meaning ‘fruit’, nor the classifying function of Indonesian buah has been diffused. Note that the word order of the numeral phrase in Indonesian and TAP languages has also remained different, in accordance with the basic word order in these languages.

() Numeral NPs in Indonesian and TAP languages
   Indonesian: [[Numeral CLF] Noun]
   TAP:        [Noun [CLF Numeral]]

The only feature that TAP speakers adopted from Indonesian was the ‘idea’ of using a general classifier in numeral constructions. This is something that is typical for classification systems: the ‘idea’ of a classification system gets diffused, but not the forms or the structures. It is also typical to use native nouns as source forms for the grammaticalized classifiers (Seifart : ).²⁴

.. :   , ,      The grammaticalization of numeral classiﬁers out of nouns in TAP languages was possible because of the semantic and structural characteristics of nouns in this family. First, the number-neutral character of TAP nouns provides room to develop a strategy by which speakers can individuate and enumerate nouns. Second, the existence of generic nouns gives rise to compound nouns that combine generic nouns and part-of-whole nouns to become referential expressions. When such complex nouns are enumerated, the PoW noun occurs in an ambiguous position between generic noun and numeral, and is easily reanalysed to form a constituent with the latter rather than the former. The process was probably caused or enhanced through contact with Austronesian languages, which typically have classiﬁers. In addition, the development of general classiﬁers in some of the languages spoken today suggests that speakers adopted the general classifying ‘idea’ of Indonesian buah, using a lexeme from their own language to express that idea.

²⁴ In the TAP languages of Alor-Pantar there is virtually no borrowing of Austronesian numeral words, while the TAP languages of Timor show more Austronesian inﬂuence in this domain (Klamer et al. ; Schapper and Klamer ). However, in the market, prices are usually quoted in Indonesian (in Alor, Pantar, and West Timor) or Tetun (in East Timor).

. CONCLUSIONS AND DISCUSSION

Many similarities exist across the TAP family in the grammaticalization of verbs and nouns. In the evolution of deverbal forms, the lexical and syntactic typology of the family (constituent order, verbal inventory, verbal valency, and so on) played an important role. In the nominal domain, the existence of generic nouns, the number-neutral status of nouns, and the structure of NPs are important factors in the evolution of nouns into classifiers. Language contact played a different role in the verbal and nominal domains. In the grammaticalization of verbs we see that many cognate forms are involved, while there is no evidence that the process is influenced by contact with Austronesian languages. The grammaticalization of nouns into classifiers, on the other hand, does not involve any cognates and is influenced by Austronesian. How can the different roles of contact in the two domains be explained? An Austronesian type of grammaticalization of TAP verbs would manifest itself as the grammaticalization of the second minor predicate (V) of a serial verb and its object into a preposition plus complement (TAP serial verbs grammaticalize the first object and the minor V, see section ..). In other words, for an Austronesian type of deverbal grammaticalization to occur, some crucial elements of the TAP constituent order would have to change from head-final to head-initial structures. Such word-order changes can and do occur under contact, but they are always gradual, and the result of changing frequencies of certain patterns. In other words, emergent new word-order patterns become established patterns by slowly increasing their frequency of use across the speech community (Backus, Doğruöz, and Heine ). To become fully schematic and entrenched, a new word order must become the most frequent order in a speech community. This type of change needs intense, continued, and long-term contact, typically involving several centuries of bilingualism.
In the Alor-Pantar region there has not been such long-term bilingualism with an Austronesian language; speakers are (were) instead bi- or trilingual in one or more neighbouring AP language(s). The current inﬂuence of Indonesian has not been intense enough to change word orders in AP languages, and hence the structural context of the grammaticalization of verbs in TAP also remained non-Austronesian. In the Timor region, the situation is more complex, and suggests inﬂuence from TAP on Austronesian, and the other way round. On the one hand, serial verb constructions of (a) TAP language(s) appear to have been calqued into Austronesian Tetun (cf. the construction with lori ‘take’, section ..), while there is also evidence of Austronesian VO word order being used in the ‘give’ construction of the TAP language Bunak (Klamer and Schapper : –). Austronesian inﬂuences are quite obvious in the development of general classiﬁers in TAP languages. This change is a typical emergent contact-induced change (Backus et al. ): it involves the extension of existing patterns to wider contexts and desemanticization; but they are variable and grammatically optional, and in this sense the classiﬁers are not yet completely grammaticalized. Furthermore, no structure or form is transferred—only a classiﬁer ‘idea’. Backus et al. () argue



that emergent changes like these, which do not involve linguistic forms or patterns, only need one or two generations to happen. In sum, the typology of TAP languages determines much of the grammaticalization of both verbs and nouns, but the type and intensity of the contact with Indonesian, or lack of it, also determines why structures in the verbal and nominal domain develop in different ways. Grammaticalization is not only determined by universal tendencies, nor by typology alone. Sociohistorical circumstances play an important role in setting certain chains of grammaticalization in motion. If and how contact inﬂuences grammaticalization can vary greatly, depending on the type and intensity of contact; and contact also affects the grammaticalization of verbs and nouns in very different ways.

13 Grammaticalization and typology in Australian Aboriginal languages ILANA MUSHIN

. INTRODUCTION

The study of Australian Aboriginal languages has a strong emphasis on language description and grammatical typology (e.g. Dixon , ; Koch and Nordlinger ). There is now a well-established tradition of relatively theory-neutral grammar writing in Australia that allows for sophisticated comparative work on a large range of widespread morphosyntactic phenomena, including recent work on extended functions of nominal case marking (e.g. Blake ; Nordlinger ); word order and configurationality (e.g. Pensalfini ; Simpson and Mushin ); and complex predicate systems (e.g. Schultze-Berndt ; Bowern ). As a result Australian languages have had a significant impact on the development of linguistic theories over the past  years (Nordlinger : ). Additionally there has been a focus in Australianist linguistics on establishing historical relations between Australian languages, an enterprise which has focused especially on sound changes and lexical reconstruction (e.g. Dixon , ; Bowern and Koch ; Koch ). Recent work on grammatical change in Australian languages has mostly focused on the development of new languages or significant restructuring of old languages in the postcolonial period (e.g. Schmidt  on Young People’s Dyirbal; Meakins  on Gurindji Kriol; O’Shannessy  on Light Warlpiri). There has been less focus on the development of the widespread typological features of Australian Aboriginal grammars that emerged prior to colonization.¹ Of particular interest for this volume is the notable absence of

¹ Exceptions include Harvey, Green, and Nordlinger () on the shift from prefixing to suffixing in some Northern Australian languages, and Pensalfini (), Gaby (), and McGregor () on the development of discourse markers from case markers.

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Ilana Mushin . First published  by Oxford University Press

studies that cite grammaticalization as a process in the development of Australian-language grammar.² Approaches to grammaticalization range from narrowly including only clear cases where members of open lexical classes have developed into members of closed grammatical classes to more broadly examining processes by which forms attain a more grammatical function (Bisang : ). In this chapter I follow Heine’s (: ) broader definition of grammaticalization as ‘the way that grammatical forms arise and develop through space and time’. This approach recognizes that certain closed-class members, especially those which are free morphemes such as pronouns, demonstratives, and complementizers, may also develop new grammatical functions as agreement markers, deictic morphemes, and discourse markers (e.g. Bybee : ch. ; van Gelderen b; Onodera ; Thompson and Suzuki ). The approach also recognizes that existing grammatical constructions may increase in generality and productivity, a process of grammatical constructionalization (e.g. Traugott and Trousdale ). Given the substantial body of evidence that the emergence of many grammatical constructions across the world’s languages results from the shift from open-class semantically complex forms to more morphosyntactically constrained and semantically general forms, we can assume that at least some of the grammatical phenomena associated with Australian Aboriginal languages did indeed emerge from grammaticalization. In this chapter I first consider some of the reasons why studies of grammaticalization have not been a priority among scholars of language change in Australia (section .). I will then focus on one type of widespread construction of the North-Central region, second-position clitic constructions, to show how the development of fixed clusters of bound morphemes that attach to the initial element of a clause may be attributed to grammaticalization and constructionalization (section .).
In the map in Fig. ., the black line demarcating an area to the northwest marks the approximate boundary between the Pama-Nyungan language family, which covered most of the country, including the northeastern corner of Arnhem Land, and non-Pama-Nyungan languages. The languages which are still spoken across generations as a language of daily life are in the few regions that were not as subject to intensive colonization in the th–th centuries: some languages

² One example of the marginal importance of grammaticalization for the study of language change in Australia lies in the fact that the Oxford Handbook of Grammaticalization (Narrog and Heine b) has ten chapters on grammaticalization in different language families or areas but does not include Australian languages.

Grammaticalization in Aboriginal languages



[Map labels: Pama-Nyungan; Non-Pama-Nyungan; Jaminjung, Ngaliwurru, Binbinka, Garrwa, Bilinarra, Mudburra, Gurindji, Ganggalida, Jingulu, Wambaya, Waanyi, Warlpiri; scale bars in kilometres]

Fig. .. Map of pre-colonial Australian languages

around Arnhem, Desert, North and West Cape regions. The remainder are in various states of endangerment, from those with neither remaining speakers nor record, to those with some partial records and/or speakers with partial knowledge of the language, to those with remaining fluent older speakers but no new generation of speakers. The data presented in this chapter is drawn from both Pama-Nyungan and non-Pama-Nyungan languages spoken in a contiguous area of North-Central Australia, as indicated by the shaded area on the map in Fig. .. This area is of particular interest precisely because it straddles the boundary of a number of language families and because there is a relatively high proportion of well-described languages in this area, such as the Ngumpin-Yapa (Pama-Nyungan) language Warlpiri (e.g. Nash ; Simpson ; Laughren ); the Mirndi (non-Pama-Nyungan) languages Jaminjung/Ngaliwurru (Schultze-Berndt ), Wambaya (Nordlinger ), and Jingulu (Pensalfini ); the Tangkic (non-Pama-Nyungan) language Yukulta/Ganggalida (Keen ); and the Garrwan (non-Pama-Nyungan) language Garrwa (Mushin ). These languages will form the basis of the analysis presented in section ..



. ACCOUNTING FOR THE SHORTAGE OF GRAMMATICALIZATION STUDIES OF AUSTRALIAN ABORIGINAL LANGUAGES Of the more than  languages spoken in Australia prior to British colonization at the end of the th century, fewer than  are still spoken by children, and many ceased to be spoken before they could be recorded or described (Koch and Nordlinger : ). While there has been a strong tradition of producing descriptions of Australian languages, especially since the mid-th century, many descriptions are based on salvage work, relying on the insights of a relatively small number of speakers. There is therefore uneven distribution of language material and uneven capacity to collect new data on languages, making it very difﬁcult to undertake systematic diachronic research on the development of grammar. In addition to the effects of postcolonial contact, normal practices of multilingualism in Aboriginal Australia that predate colonization have resulted in long histories of language contact that further complicate our ability to gauge which changes come from shared inheritance and which from areal contact (Dixon ; Koch ). It has not been typical to associate grammaticalization with processes of language contact in Australia, yet the grammatical convergence that comes from extensive periods of language contact may be one reason for common paths of grammatical change across languages (cf. Heine and Kuteva ). Another reason why there are few grammaticalization studies in Australia may be the fact that Australian languages are mostly morphologically complex, and some are highly polysynthetic. Most languages are agglutinative and allow for multiple bound morphemes. Few grammatical categories are regularly marked by forms whose lexical source is still available as a free form. For example, spatial relations are mostly marked with locative sufﬁxes rather than free-standing prepositions or postpositions. 
It is therefore challenging to ﬁnd clear comparative evidence of contemporary bound afﬁxal forms that in some languages may retain features of their lexical origins. Indeed, studies of grammatical change in Australian languages have tended to focus on the extension of existing grammatical systems to new grammatical domains rather than on the origins of new material from lexical sources. For example, it is relatively common for clause-combining morphology, including switch reference marking, which is marked on verbs, to have originated from nominal case marking (e.g. Austin ; Dench and Evans ; Blake ; Nordlinger ). The examples below from Garrwa and Yukulta (neighbouring languages from different families) show how case markers may be used to signal clausal subordination of different kinds. The Garrwa examples in () show how locational case marking (locative, allative, and ablative) is also used with verbs to mark various relations between clauses that are temporally simultaneous.³ ³ This pattern was identiﬁed in Austin () for Australian languages in a study of switch reference marking.

Grammaticalization in Aboriginal languages ()



Garrwa (Gulf Country: Garrwan) (Mushin : ) a. Locative > Same subject simultaneous (i) miku-wali ngayu lunji ngalurr-ina NEG-EVID sgNOM sick chest-LOC ‘I can’t be sick with a cold.’ (Mushin ﬁeldnotes ) (lit. ‘I can’t have sick in my chest.’) (ii) bak=ili yalu waradijba wawarra ngara-jina ngamulu nayi and=HAB plNOM be.busy child drink-SS milk this barri DM

‘And those kids would be busy drinking milk here.’ (....KS)4 b. Allative > Different subject simultaneous (i) wabuda wilyurrumba ngaki-nbu-rri muwada-yurri water run.over sg-LOC-ALL boat-ALL ‘Water’s ﬂooded over my boat.’ (Mushin  ﬁeldnotes) (ii) manku ngayu wilina yanyba-kurri yalu-ngi hear sgNOM outside talk-DS pl-DAT ‘I hear (them) talking to them outside.’ (Furby and Furby : ...) c. Dative > Irrealis (i) wajba nganinji nana-nkanyi lama-nyi give sgACC/sgNOM that-DAT axe-DAT ‘Give me that axe.’ (....DG) (ii) karijba ngayu jila-kanyi feel.like sgNOM walk-IRR ‘I feel like walking.’ (Mushin  ﬁeldnotes) In the Yukulta example in () the ergative case marker -ya marks a transitive subject (ʽmanʼ). It is also used to mark the verb and object of a purpose clause. ()

Yukulta (Gulf Country: Tangkic) (Dench and Evans : , cited in Nordlinger : ) tangka-ya=karri ngit-a karna-ja makurrarra-wurlu-ya man-ERG-/.PRS wood-ACC light-IND wallaby-PROP-ERG karna-j-urlu-ya light-VB-PROP-ERG ‘The man lit a ﬁre in order to cook the wallaby’

Note that in both the Garrwa and Yukulta examples, the verbal uses of the case markers require a preceding ‘formative’ which would appear, at least historically, to cause nominalization. For example, the proprietive ‘having’ sufﬁx -wurlu that precedes the ergative marker in () is reported in Blake (: ff.) to create a predicate nominal from the verb. The j/k formative seen in the verbal uses of Garrwa

⁴ The number and initial code provided at the end of each example indexes the example to the text in my Garrwa corpus from which it is taken.



case markers in () may also be derived from an earlier proprietive, as suggested in Blake (: ), but as the current proprietive form in Garrwa is -yudi (Mushin ), it is unclear what the actual origin of this formative might have been. In both cases, however, it is clear that the case marker was at least historically applied to a derived predicate nominal rather than directly to the verb, retaining its basic function as a marker of nominal relations. In this sense the use of case markers to relate clausal information via a process of nominalization and extension of case semantics provides a means of relating events within sentential units, as subordinate clauses do in other languages. It is not clear though that these semantic extensions of case systems result in a construction that is more grammatical than case marking on underived nominal (cf. Heine : ).⁵ Some typological features of Australian languages, however, appear likely to have emerged from processes of grammaticalization, even if the diachronic evidence is patchy. One case is the incorporation of certain lexical nouns into verbs among polysynthetic languages in Northern Australia (e.g. Evans ; Nordlinger ), where there is some transparent connection between the form of the incorporated noun and the form of the corresponding free noun. Another case might be the origins of remnant morphology such as the j/k formative in Garrwa as discussed above. This consonant is not a productive nominalization morpheme synchronically, but may have emerged from a more phonologically robust marker of nominalization.⁶ Unlike noun incorporation, the only discernible starting point could be a bound derivational morpheme and not a lexical form. However, it does exhibit the kinds of semantic changes (from less to more ‘bleached’) and morphological changes (from more to less phonetic material) that we ﬁnd with grammaticalization processes more generally. 
Despite the challenges outlined above, there is no intrinsic reason to presume that the grammars of Australian languages have not been subject to processes of grammaticalization. The Indigenous languages of North and South America are similarly endangered, with similar challenges associated with a lack of historical material (although in some cases with a longer time depth due to the earlier colonization period) and morphological complexity. As Mithun (Chapter  this volume) has demonstrated, it is possible to ﬁnd evidence of grammaticalization in moribund languages if the right comparative data are available. In some senses, then, the marginal status of grammaticalization as a feature of language-change research in Australia is an artefact of the priorities of researchers of Australian languages, and not related to the languages themselves.

⁵ Some Australian studies of grammatical change have called the process ‘grammaticalization’, even when there is no evidence that a less grammatical form has become more grammatical. E.g. Blythe () calls ‘grammaticalization’ the process by which bound person/number marking on Murrinh-Patha verbs attained kinship semantics.

⁶ Scholars of Australian Aboriginal languages have long noted the existence of phonological material that appears to be remnants of morphemes. For Garrwa this additionally includes /m/, /j/, /n/, and /ba/ elements of the verbal conjugation system (Mushin : ).

Grammaticalization in Aboriginal languages



. EVIDENCE FOR GRAMMATICALIZATION IN THE DEVELOPMENT OF SECOND-POSITION CLITIC CONSTRUCTIONS

In this section I focus on possible grammaticalization pathways which have led to a particular widespread typological feature of Australian languages of the North-Central region, both Pama-Nyungan and Non-Pama-Nyungan: second-position clitic constructions that minimally mark person/number/gender of clause arguments (often analysed as bound pronominal clitics) but which often also mark tense/aspect/mood, modality, transitivity, and directionality (Mushin , ).⁷ In many of these languages the first and second positions themselves form a prosodic unit (Simard ), but the bound forms in second-position clusters are analysed as clitics because they are less sensitive to the word class of their host, as illustrated in examples ()–() from four languages of this region (second position is marked in square brackets): Warlpiri (Ngumpin-Yapa, Pama-Nyungan), Wambaya (Mirndi, Non-Pama-Nyungan), Yukulta/Ganggalida (Tangkic, Non-Pama-Nyungan), and Garrwa (Garrwan, Non-Pama-Nyungan).⁸

()

Warlpiri (Laughren : )
yurntumu-wardingki-patu [ka=lu] wangka-mi
Yuendumu-habitant-PL.NOM CENTR=()pl speak-NPS
‘The Yuendumu people are speaking.’

()

Wambaya (Nordlinger : )
mugunjana=[miji gi-n] mirra
louse.II(NOM)=INFER SG.S(PRS)-PROG sit
‘It must be a louse (because I keep scratching my head).’

()

Yukulta (Ganggalida) (Keen : , ex. )
yakukathu=[ngarpa-rati] wa:tya
sister-couple-plNOM+PRS sing
‘The sisters are singing (in unison).’

⁷ Mushin () used the term ‘clitic complex’ from Keen () for this clustering of grammatical indices. Nordlinger (: ) calls this construction an ‘auxiliary’, following use of this term in Ngumpin-Yapa (Pama-Nyungan) languages for similar phenomena. The term ‘auxiliary’ was adopted from European linguistics because these constructions carry the main source of tense/aspect marking for the clause, rather than the main verb. ⁸ These examples are taken from published linguistic descriptions of the phenomena, and retain the conventions for representing the clitic cluster provided in the grammar. E.g. Yukulta clitics are marked as cliticized to its host with the ‘=’ symbol, as is the modal clitic component of the Wambaya clitic complex. The person/number/tense features of the Garrwa, Wambaya, and Warlpiri clitic complexes are usually written as separate words although they are prosodically dependent on their hosts. The Warlpiri and Wambaya clitic complexes are conventionally called ‘auxiliaries’.

 ()

Garrwa (Mushin : )
jarrba [nurr=ili] nanda wada barri munjimunji-nyi
eat plExclNOM=HAB that food DM bush-DAT
‘We would eat that food of the bush.’ (....ER)

Optional epistemic and deontic modal clitics, when they occur, are initial in these constructions, as illustrated in () for the Wambaya epistemic modal clitic =miji ‘INFERential’, while () illustrates a deontic modal clitic =kiya ‘(past) OBLIGation’ for Garrwa.⁹ In Wambaya the modal clitic co-occurs with the tense clitic in the construction, while in Garrwa modal clitics do not co-occur with tense clitics.

()

Garrwa (Mushin : )
kaya=[kiya ninji] nanga-ngi
call=OBLIG sgNOM sg-DAT
‘You should have called out for him.’

The main difference between Warlpiri and the other three languages is that Warlpiri requires an initial morpheme to which the pronominal clitics attach. That morpheme carries tense/aspect meaning, but tense is also marked morphologically on the verb, as shown in (), where non-past is marked both as a verb suffix and as a suppletive form of the initial element of the clitic cluster (conventionally glossed CENTR). This contrasts with the other three languages, where the only location for tense/aspect morphemes is in the clitic complex.¹⁰ Mushin () argued that the similarities between Warlpiri, Wambaya, and Garrwa clitic complexes suggest not only that this is an areal phenomenon but also that there are pragmatic motivations related to information packaging in languages of this region that motivated the development of this particular grammatical architecture. That study, however, did not consider the actual historical processes by which these structures emerged.

Here I provide some evidence that both the pronominal and the tense/aspect/mood components of this clitic cluster may have emerged through two different but complementary grammaticalization processes—the development of bound pronouns from free pronouns in section .., and the development of tense/aspect clitics from complex predicate constructions in section ... In section .. I consider the status of these clitic constructions as more or less constructionalized, and in .. I consider pragmatic motivations for the development of the second-position clitic construction as a key part of the grammatical architecture of languages of this region. The analysis shows that while grammaticalization has not been a common way of explaining the development of grammar in Australian languages, it is a factor in language change in much the same way as has been analysed for other languages of the world.
⁹ While Mushin () analyses Garrwa modal clitics as part of the second-position clitic complex, Nordlinger () considers them outside of the ‘auxiliary’. This may be an artefact of the different approaches in grammatical description, and would require further investigation.

¹⁰ Wambaya and Yukulta select different verb root forms for future/non-future (Wambaya) and realis/irrealis (Yukulta). Garrwa verb roots do not indicate temporality in any way.




..         Bybee (: ) notes: Personal pronouns have various uses and often have both full forms (tonic forms) for more emphatic uses and reduced forms (atonic forms). Once they are formed, pronouns tend to continue undergoing grammaticalization. One outcome can be afﬁxation of the atonic forms to the verb . . .

The typology of pronouns in Australian Aboriginal languages with second-position clitic clusters, such as those illustrated in ()–(), is consistent with Bybee’s () characterization; but, as Mushin and Simpson () and Mushin () argued, it is not always clear in which direction the grammatical change is moving. Mushin and Simpson () built on the long-held observation that Australian languages tend to have sets of both free and bound pronouns, with bound pronouns functioning as the main referential indices of person, number, noun class (where relevant), and grammatical relation, while free pronouns occur only in contexts of pragmatic prominence such as contrast or focus. In some languages, such as Warlpiri, the free and bound pronouns appear to derive from different sources, as shown by example () (from Swartz , cited in Simpson and Mushin : ):

()

Pura-mi=nya=rna=ngku=lu nganimpa-rlu=ju?
follow-PRS=Q=plNOM.sgACC pl-ERG=DEF
‘Shall we follow you?’ (G)

Basing his analysis on languages like Warlpiri, Dixon (: ) argued that bound pronouns result from a reduction and fixing of free pronouns at an earlier stage of the language, followed by replacement of the old free pronoun paradigm with a new set. Wambaya (Nordlinger ) is one such language. As the partial paradigm in Table . shows, the clitic subject pronouns are the first one or two syllables of the free pronouns, and the bound object pronouns are further reductions of the first- and second-person singular bound forms. Wambaya bound subject pronouns obligatorily occur as the first element of the second-position clitic complex, with first- and second-person object pronouns also required. Free pronouns only occur in ‘emphatic’ contexts (Nordlinger : ). The example in () shows the co-occurrence of free and bound (clitic) subject pronouns where the free pronoun occurs in initial position as the clitic host. As the obligatory bound forms, the clitic pronouns can be argued to be more grammaticalized than the free pronouns, as they are phonologically eroded and less morphosyntactically independent than their free counterparts (Heine and Kuteva : –).

()

ngirriyani ngirri-n mirra
PLEXCLNOM PLEXCLS(NPS)-PR sit(nFut)
‘We’re sitting here.’ (Nordlinger : )




T . Partial paradigm of Wambaya free and bound pronouns (from Nordlinger : , ) Free NOM/ERG

Free ACC

Bound A/S

Bound O¹¹

sg

ngawurniji, ngawu

ngawurniji, ngawu

ngi-

-ng-

sg

nyamirniji, nyami

nyamirniji, nyami

nyi-

-ny-

duInc

mindiyani

mirnda

mirndi-

-ng-

du

gurlawani

gurla

gurlu-

-ny-

pl

girriyani

girra

girri-

-ny-

pl

irriyani

irra

irri-

–

One possible explanation for dual pronominal systems like Wambaya is that they represent a stage predicted in Dixon (: ) whereby free pronouns have become fixed in second position as obligatory markers of referential features of core arguments and have become phonologically reduced as part of this process. This would be evidence of the emergence of dual pronoun systems as resulting from grammaticalization. However, Mushin and Simpson (: –) argued that such systems might equally be evidence for the development of free pronouns from clitic pronouns where existing clitic pronouns are augmented with additional morphology, possibly markers of information status, to enable their use as markers of pragmatic prominence in sentential positions associated with such functions. On this account, the Wambaya free pronouns are built on the bound forms plus the augments -yani/wani and -rniji. Mushin and Simpson (p. ) argued that such languages raise questions for the unidirectionality of grammaticalization, as more phonologically robust forms that are discourse-contingent are formed from less phonologically robust obligatory markers of referential features. Alternatively, this could be an example of ‘degrammaticalization’. Regardless of the direction, however, the study showed that there are clear discourse-pragmatic pressures to retain both a set of pronominal forms to index reference and facilitate referential continuity, and a set of pronominal forms to be used when the reference is somehow counter to recipient expectations, such as in cases of emphasis and contrast. This accounts for languages like Warlpiri, where it does appear that free pronouns have a separate origin from bound pronouns, as well as Wambaya, where the push for a dual system has led to augmented forms that can be used for marked discourse purposes.
¹¹ While the forms for sg and sg appear to be formally related to their corresponding forms through the retention of the same initial consonant, the same cannot be said for the other person/number categories.

Further evidence of the push towards dual systems in Australian languages comes from Garrwa, whose pronoun system provides perhaps the best evidence of the bridging contexts that reinforce the recycling of dual-pronoun systems in Australian languages. As Mushin () argues, Garrwa, which was at first analysed as only having free pronouns (Furby ), has a ‘liminal’ system where second-position pronouns are not more phonologically reduced than pronouns found elsewhere. Rather, the pronouns split according to prosody, syntactic position, and discourse function. Garrwa pronouns are by far most commonly found in second position, and lack primary stress, leading to cliticization (illustrated in examples () and ()), but no phonological reduction as in Wambaya. Unlike Wambaya and Warlpiri, Garrwa pronouns cannot occur as an initial element followed by the second-position clitic complex, as illustrated in (), in which the stressed pronoun (the answer to a question) is in a left-dislocated position, separated from the rest of the clause by an intonation break. The fact that Garrwa pronouns cannot serve as clitic hosts is evidence that they have not become paradigmatically ‘free’ in the way that free pronouns in Warlpiri and Wambaya have done.¹²

()
baja=nyi yalu na-nyina wayka
play=HORT plNOM this-LOC down
‘Let them sing down there.’ (...DG)

()
dabarraba=yili yalu, badidibadi-wanyi
cook.in.ashes=HAB plNOM old.woman(RDP)-ERG
‘They would cook in ashes, the old women (would).’ (....KS)

()
DG: and wanyi kuyu nanda yiliburru
what bring that waterlily
‘And who brought that waterlily?’
KS: yálu, minj=ili yalu, jila karri-na Winmarri-nanyi
plNOM COND=HAB plNOM walk east-ABL CH.station-ABL
‘They did, when they went from the east, from Calvert Hills Station.’ (...)

In this section I have shown that the forms of dual-pronoun systems where bound pronouns representing the referential indices of core arguments are the first clitic element(s) in a second-position clitic cluster may be the result of grammaticalization from earlier free-pronoun forms (as hypothesized for languages like Warlpiri), but there is also evidence that for some languages the bound pronoun is the older form, with matching forms used when pronouns are used external to the clitic complex either in augmented form (as in Wambaya) or without further augment (as in Garrwa). As argued in Mushin and Simpson (), this evidence from Australian languages thus raises questions about the orthodoxy of unidirectionality in accounting for the development of grammatical systems. In the next section I consider the role that grammaticalization processes may have played in the development of some of the non-pronominal clitics that constitute the second-position clitic construction.

¹² The discussion here focuses on subject pronouns only. The behaviour of pronouns in other grammatical relations is consistent with this analysis, but goes beyond the scope of this chapter.




..      /     The examples in ()–() illustrate the grammatical categories which occur in secondposition clitic complexes in addition to clitic pronouns that mark core grammatical relations. In all four languages cited in those examples, the clitic construction is also the site for tense/aspect marking. In Warlpiri, tense and aspect are additionally morphologically marked on the verb with sufﬁxes, as () illustrates. Both Wambaya and Yukulta/Ganggalida have different verb roots according to tense (Wambaya) or mood (Yukulta/Ganggalida) as () and () illustrate. Garrwa, however, only marks tense, aspect, and mood in the second-position clitic construction: Garrwa verbs have no temporal or mood-based semantics, nor do they take inﬂectional morphology associated with these categories. This is shown in (). In this section I consider how tense/aspect clitics may have emerged, at least for some of these languages. Studies of the development of tense/aspect/mood systems more broadly have shown the sources from which deictic tense systems, especially morphological systems, have developed (e.g. Dahl ; Bybee et al. ). These studies have identiﬁed some regular sources for tense/aspect inﬂections. For example, present and imperfective markers frequently derive from progressive markers which themselves derive from other constructions; past and completive markers often derive from verbs like ‘ﬁnish’; stance verbs may also be sources of tense/aspect marking. These patterns have been also been identiﬁed in non-European languages (Heine : ). 
While most Australian languages have some morphological marking of tense/aspect, either as a verb inflection or as part of a second-position clitic construction, or both, there has been remarkably little research on how these forms emerged.¹³ In the rest of this section I consider the evidence for the development of tense/aspect clitics from verbal sources, in particular from classes of ‘inflecting verbs’ that participate in a complex verb construction.

Many Australian languages of the North-Central region have a ‘complex verb structure’ (Dixon : ), similar to light verb constructions in other languages (Brinton ). They typically consist of a closed class of ‘simple’ verbs which carry the basic grammatical properties of the predicate (e.g. person and TAM marking) plus a semantically general meaning (e.g. ‘go’, ‘come’, ‘do’, ‘say’, ‘make’) and an open class of uninflected ‘co-verbs’ which carry the semantic detail of the predicate (e.g. Schultze-Berndt ; Dixon ; Amberber, Harvey, and Baker ; Bowern ). In most cases the co-verb precedes the inflecting verb and is contiguous with it, although some languages do allow for discontinuous complex predication (Schultze-Berndt ). The closed class may be as large as about  forms or very small (– forms). For example, the Mirndi language Jaminjung (Schultze-Berndt ) has about  inflecting verbs, while another Mirndi language, Jingulu (Pensalfini ), has just three ‘light verbs’. Inflecting verbs are still verbs because they can occur without co-verbs, retaining their basic verbal semantics. Examples () and () illustrate simple and complex predication in Jaminjung and Jingulu. In the (a) sentences, the inflecting verb is the only predicate. In the (b) sentences the same inflecting verb occurs with a co-verb to form a dual predicate construction where the overall meaning of the predicate is derived from the meaning of the co-verb, not the inflecting verb. Both languages mostly order co-verbs before inflecting verbs, with person agreement occurring as a prefix on the inflecting verb. The co-verb-inflecting verb construction is analysed as two words in Jaminjung but as one phonological word in Jingulu.

() Jaminjung (Schultze-Berndt : )
a. Inflecting verb only
Gani-ma-n jurruny-ni
sg:sg-HIT-PRS lower.arm-ERG/INS
‘He hits him with the hand.’
b. Co-verb and inflecting verb
Miri bag burra-ma-nyi gurrubardu-ni
Leg break pl:sg-HIT-IMPV boomerang-ERG/INS
‘They used to break its legs with a boomerang (kangaroo).’

() Jingulu (Pensalfini )
a. Inflecting verb only
nga-ardu
sg-go(PRS)
‘I’m going.’ (p. )
b. Co-verb and inflecting verb
laja-nga-rdu kijurlurlu
carry-sg-go(PRS) stone
‘I’m carrying a stone.’ (p. )

The status of the Jingulu construction as a single phonological word (cf. two words in Jaminjung) and the inventory of only three inflecting verbs (compared to  in Jaminjung) might indicate that the Jingulu co-verb-inflecting construction is the more grammaticalized of the two. Indeed, Pensalfini (: ) reports that Chadwick’s () earlier description of Jingulu analysed the inflecting verbs as tense/aspect markers. This is also evidence that the Jingulu complex verb construction is more grammaticalized than the Jaminjung equivalent. This would also account for the much smaller inventory of inflecting verbs in Jingulu.¹⁴

There are clear structural parallels between systems like Jingulu, which join bound pronouns and inflected verbs into the same prosodic word, and the second-position clitic constructions we have seen in ()–(). More examples are given in ()–() for Wambaya, Garrwa, and Yukulta. In these three languages from three different Australian language families, the clitic construction retains all of the inflectional material we see in inflecting verb constructions, but without a verbal root (in (a) the clitic attaches to initial position, which happens to be a verb).

() Wambaya (Nordlinger )
a. ngaj-bi ng-a alag-ulu
see-nFut sgA-PST child-DU(ACC)
‘I saw the two children.’ (p. )
b. Alag-bulu wurlu-ngg-a nyurrunyurru.
child-DU(NOM) duA-RR-nF chase(nFut)
‘The two children are chasing each other.’ (p. )

() Garrwa
a. najba ngay=i baya-wuya
see SGNOM=PST child-DU
‘I saw the two children.’
b. mali nyul=i wilku, wananamba nayiba,
floodwater sgNOM=PST run all.around here
‘The floodwater ran all around here . . . ’

() Yukulta (Ganggalida) (Keen )
a. janija=kadi marntuwara-nhtha
look+IND=SGNOM+PRS boy-DAT
‘I’m looking for the boy.’ (p. )
b. ngijin-inyja ngamathu-nhtha=kadi marinymarija
SG-DAT mother-DAT=SGNOM+PRS think+IND
‘I’m thinking of my mother.’ (p. )

The (a) examples in ()–() are very similar in surface structure to the (b) examples in () and (): in all cases a verbal predicate is followed by bound pronouns which are themselves followed by a bound morpheme that carries the tense/aspect meaning of the clause. Unlike inflecting verbs, clitics cannot occur without a preceding clitic host and without another independent predicate in the clause. While the clitic host is very often a verb, as shown in the (a) examples, the (b) examples in ()–() show that it need not be a verb. This reinforces the analysis of clitic clusters, not as tied to verbs or predication, but rather as attracted to initial position on the basis of the information-packaging principle, as argued in Simpson and Mushin (). I will return to this point in section ...

¹³ Recent typological investigations have been more concerned with the semantics of particular tense/aspect/mood categories (e.g. papers in Stirling and Dench ), or the formal comparisons of morphemes and their interactions with verb conjugation classes (Dixon ).

¹⁴ As Bowern (: ) notes, little is known about the origins of inflecting verbs. She hypothesizes that since highly frequent verbs that have general semantics (e.g. do, go/come, say) tend to grammaticalize into light verbs in other languages (e.g. Brinton ), this may be the origin of the Australian examples. However, the work has yet to be done to see if there are languages which lack complex verb constructions, but nonetheless display a split in the grammatical behaviour of higher-frequency semantically general verbs. This would provide the strongest evidence that the closed class of inflecting verbs did indeed emerge as a subtype of open-class lexical verbs.




Wambaya is a Mirndi language and is therefore the most closely related to Jingulu, which does retain three inflecting verbs. Green and Nordlinger (: ) account for the development of the second-position construction in Wambaya as an erosion of the inflecting verb (which they call a ‘verb classifier’), leading to a loss of word status. This is followed by a movement of the eroded verb and its inflections to second position to align with other (Pama-Nyungan) languages of the area. This proposed pathway is not argued in terms of grammaticalization in Green and Nordlinger (), but it is analogous to the pathways described in the grammaticalization of tense/aspect markers from light verbs in other languages (Bybee, Perkins, and Pagliuca ). That is, a more semantically rich inflected predicate that occurs as a free word has become encliticized in second position. The bound pronouns are the old pronominal prefixes, while the verb roots have been reanalysed as tense/aspect markers.

The two language families to which Garrwa and Yukulta belong do not have complex verb constructions like Mirndi languages. Furthermore, Yukulta appears to be the only Tangkic language that has second-position clitic constructions. The tense/aspect features of the clitic constructions in these languages may not have derived from the erosion of a complex predicate construction, but may have developed under influence from the development of second-position clitics in neighbouring Mirndi languages that themselves are hypothesized to mirror the Pama-Nyungan structure seen in Warlpiri and related languages (Green and Nordlinger ). There is, however, evidence to suggest that Garrwa may have had a complex predicate construction at an earlier period, meaning that the tense/aspect features of the clitic complex are, like those of Wambaya, possibly the remnants of inflecting verbs. Key evidence for this is that Garrwa verbs remain uninflected, as co-verbs are in other languages. As the examples in () and () illustrate, both Wambaya and Yukulta verbs are marked for tense and mood respectively. For Garrwa speakers, however, tense/aspect/mood and modal meanings are only expressed in clitics in second position, with the verb remaining uninflected. However, if Garrwa tense/aspect clitics are the remnants of inflecting verbs, it is currently too difficult to reconstruct them given the lack of comparative data of the kind that is available for Mirndi languages. The evidence from Mirndi languages does at least suggest a pathway by which these structures may have developed: the further grammaticalization of a verb inflected for tense and bearing bound pronominal prefixes into a second-position clitic construction marking only the grammatical categories associated with the original verb.

..     -  ? I have so far considered how the different grammatical categories found in secondposition clitic constructions may have emerged from processes of grammaticalization. In this section I examine the degree to which the construction is ﬁxed in the languages under consideration here. If these clitic constructions did indeed develop from inﬂected verbs, then we would expect that the order of clitics within the cluster would be stable, and over time they may become more morphologically fused and phonologically reduced.




For Wambaya and Yukulta, the order of grammatical elements within the clitic construction is highly fixed: pronominal clitics are followed by tense/aspect markers which may be followed by other grammatical category markers like direction or transitivity markers. Modal clitics precede pronominal clitics when they occur. The Wambaya construction is still relatively morphologically transparent, but the Yukulta construction appears more fused, as shown in the examples in () where the clitic =kadi ‘sgPRES’ is not separable into pronoun+tense. Only Garrwa allows for variability in clitic placement within the second-position construction, as described in Mushin (, ). Mushin () showed that while the order modal+pronoun+tense/aspect is the most frequent in Garrwa discourse, mirroring the other languages considered here, three clitics (=yili ‘past habitual’, =ja ‘future’, and =yi ‘past’) can occur either before or after the pronoun in second position, as illustrated in ()–(). The example in () also shows that past tense can be doubly marked in second position and following the verb.

() Garrwa
a. dabarraba=yili yalu, bardidibadi-wanyi
cook.in.ashes=HAB plNOM old.woman(RDP)-ERG
‘They would cook in ashes, the old women (would).’ (Mushin : )
b. jarrba nurr=ili nanda wada barri munjimunji-nyi
eat plExclNOM=HAB that food DM bush-DAT
‘We would eat that food of the bush.’ (Mushin : )

()
milidimba nganinji=ja / milidimba ja=nganinji
teach sgACC/sgNOM=FUT teach FUT=sgACC/sgNOM
‘You’re going to teach me.’ (Mushin : )

()
yanyba=yi ngayu all day wulani
talk=PST sgNOM day.before
‘I talked all day yesterday.’ (Mushin : )

()
kulwa=yi ngay=i nani baba
look.back=PST sgNOM=PST like.this elder sister
‘I looked back like this, sister.’ (Mushin : )

The variability of Garrwa tense/aspect clitic placement as illustrated above may be a sign of further reanalysis as =yili and =ja become aligned with modals in the first position in the clitic construction, while the past tense clitic =yi may be being reanalysed as a verb inflection because it is found attached to verbs in about  per cent of cases.¹⁵ If this variation is a sign of reanalysis, it shows that Garrwa tense/aspect clitics, whatever their origin, never fully constructionalized because the clitics were able to be separated from the pronoun. Garrwa data recorded in the s shows far less deviation from the expected pronoun+tense/aspect ordering, although as this is mostly elicited data, it is hard to know if this is reflective of clitic placement within second position in natural language use. Regardless of whether the flexibility in clitic placement in Garrwa has increased or not in the last few decades, the fact remains that the construction was never sufficiently fused in the way it has become, for example, in Yukulta. Recall also from section .. that the formal differentiation between free and bound pronouns is far less pronounced in Garrwa than in the other languages of this sample, as free pronouns are defined as those that are not in second position and are prosodically and pragmatically prominent, but which are otherwise morphologically identical to second-position pronouns. So while Garrwa is highly consistent with the other languages in this sample in marking key grammatical categories in the second position of a clause, second position is far less constructionalized than we find in Wambaya, Yukulta, and Warlpiri.

The grammatical status of Garrwa clitic constructions raises the question of whether, assuming that second-position clitics emerged from an earlier inflected verb (pronoun+verb+tense/aspect marking) in a complex verb construction, it was possible to free the remnant verb, now the tense/aspect clitic, from its bound pronoun. The only cognate language, Waanyi (no longer spoken), has second-position pronouns and some tense/aspect morphology for future and non-past marking, but no tense/aspect marking on verbs, which would suggest a further move away from maintaining the remnant links with an earlier complex predicate construction. Since Garrwa has not been learned as a first language for at least two generations, it is unlikely that the question regarding this process will be resolved.

¹⁵ The frequency analysis in Mushin () allowed for the fact that rd person singular referents rarely get a pronominal reference, and so in most rd person singular cases, the clitic =yi is found attached directly to the verb by default.

..       -   The above sections showed how the development of two main components of second-position clitic constructions found in Australian Aboriginal languages of the North-Central region—clitic pronouns and tense/aspect marking—might be explained by processes of grammaticalization that are found in other languages of the world. The evidence shows that the Yukulta construction may well be the most grammaticalized because the morphemes within the clitic construction are more morphologically fused than those in Wambaya, while Garrwa shows signs that the second-position construction is becoming less ﬁxed as a construction, although the gravitation of grammatical categories to second position remains a robust feature of the language. In this section I examine some pragmatic motivations for the development of this construction. This process did not semantically involve the less grammatical becoming more grammatical, because the meanings themselves were already highly grammaticalized. What may instead have occurred was a restructuring of the position and ﬁxedness of essential clausal grammatical categories, leading to the constructionalization of



Ilana Mushin

second-position clitics as a key feature of the grammatical architecture of these languages. The clustering of formal grammatical indices such as person/number, tense/ aspect/mood, and/or modality marking in the second position in clauses as a clitic or series of clitics is a widespread phenomenon across the world’s languages (e.g. Halpern and Zwicky ; Anderson ). Hopper and Traugott (: ) observed that the tendency for this kind of construction to occur encliticized to the ‘ﬁrst tonic element’ may be related to its pragmatic proﬁle. Sentential enclitics have a tendency to occur in the second position in the sentence, following the ﬁrst tonic element. But other clitics may occur in that position too, for example, clitics with auxiliary verb character. The ‘second position’ tendency may be related to the topic-comment structure that spoken sentences typically have: in many utterances there is an initial phrase (the topic) that, as it were, sets the stage for what is to be said about it (the comment). Clitics that are not bound to a particular word class will tend to follow the initial topic . . . (Hopper and Traugott : )

However, the evidence from North-Central Australian languages shows that second position follows an initial element that is ‘pragmatically prominent’ (i.e. focal, rather than topical), as argued in Mushin () and Simpson and Mushin (), thus providing a morphological and prosodic buffer between prominent information and the rest of the clause. This characterization is analogous to Lehmann’s () account of the grammaticalization of cleft constructions to offset contrastive focus from the rest of the clause, which itself was based on Lambrecht’s (: ) principle of the separation of role and reference: do not introduce a referent and talk about it in the same clause. While second-position clitic constructions do not split the sentence into two clauses, as cleft constructions do, they do provide the means of separating out the different statuses of information within a larger grammatical unit, such as a clause. As noted earlier, second-position clitic constructions are the main way of marking grammatical indices in these Australian languages, and so unlike cleft constructions they are not reliant on marked pragmatics. Indeed, it is far more frequent in discourse for second-position clitic constructions to be attached to ‘obligatory’ initial position particles like interrogatives and negatives, which are themselves markers of focus (Simpson and Mushin ). It is only in pragmatically marked contexts that we ﬁnd second-position clitic constructions attached to nominal referents, as in the (b) examples in ()–(). If there is no particular focal or prominent element in the clause, the initial position is occupied by the verb by default as in the (a) sentences in ()–(). 
Since the most frequent clauses in discourse are those which do not introduce new referents, verbs are the most common hosts for second-position clitic clusters, which may result in the reanalysis of clitics as verbal suffixes, as has occurred in some Pama-Nyungan North-Central languages (McConvell ). The evidence presented so far suggests that the development of second-position clitic constructions was motivated by a pragmatic principle of separating out ‘pragmatically prominent’ information from the rest of the clause, but also providing a predictable locus for the key grammatical indices required to interpret the referential, temporal, and modal properties of the utterance.¹⁶ The result is a more fixed structuring principle in languages where word order is otherwise considered to be syntactically ‘free’ (e.g. Laughren ; Simpson and Mushin ).

CONCLUSION

Studies of grammaticalization have tended not to focus on constructions that bring together many different grammatical categories, as second-position clitics do. An account of the emergence of these clitics must address not only the origins of the morphemes within these clusters but also their ordering within the construction, and the significance of second position as a structuring principle of language. Here I have proposed some pragmatic principles from studies of grammaticalization to account for the gravitation of grammatical material to second position, and shown how bound pronouns and tense/aspect markers could have emerged in Australian languages in ways that are analogous to how they have been described for other languages. The data also present some challenges for grammaticalization as an approach to language change. The cycle of free to bound to free pronouns documented in Mushin and Simpson (), and the fluid ordering of elements within the Garrwa second-position construction, raise questions about unidirectionality as a core principle of grammaticalization. In any case, the evidence from Australian languages suggests that a grammaticalization approach needs to be broad enough to take into account changes in the grammatical status of bound morphemes that are already grammatical. The analysis of the parts and whole of the second-position clitic construction found in a significant number of Australian Aboriginal languages of the North-Central region is in many ways indicative of the challenges that Australianist linguists have faced in attempting to account for processes of grammatical change. Much of what I have proposed here is based on partial comparisons of languages which are in severe states of endangerment (there are almost no living speakers of Yukulta or Wambaya and a dozen or so older fluent Garrwa speakers remaining).
Nonetheless, the analysis I have presented here indicates that the grammatical features we ﬁnd in other languages of the world that have emerged from grammaticalization are likely also to have emerged in similar ways in Australia.

ACKNOWLEDGEMENTS

I would like to thank Heiko Narrog and Bernd Heine for comments on earlier drafts, and the participants of the symposium on Grammaticalization and Typology for their feedback.

¹⁶ Another strategy, also documented for Garrwa in Mushin (), is to string out information prosodically into small intonation units, following Chafe’s () principle of one idea per intonation unit.

14 Grammaticalization in Oceanic languages

CLAIRE MOYSE-FAURIE

INTRODUCTION

This chapter on grammaticalization in Oceanic languages is structured as follows. In section ., after a presentation of the Oceanic languages subgroup, and the list of languages cited (..), the main typological features found in these languages with regard to grammaticalization processes will be set out (..). Section . will present processes with verbs as sources, most often starting out from serial verb constructions. Verbs that have changed to grammatical morphemes may still function as main verbs. The relevant verbs mainly belong to specific semantic classes, such as verbs of posture and motion, phasal verbs, verbs of transfer and saying, and from these sources they have developed into a wide variety of functional types of morphemes. Section . will examine cases of grammaticalization from nominal sources, giving rise to classifiers, aspect and relative markers, and adpositions. Section . will be concerned with further cases of grammaticalization, starting from already grammatical items such as possessive suffixes developing into benefactive markers. Section . will examine some interesting cases of relexification, some of them also originating in serial verb constructions, others resulting from a less grammaticalized status. Finally, in section ., I will question the unidirectionality of grammaticalization processes, by examining such cases of evolution as the reanalysis of an applicative suffix into a preposition, or the formation of existential and manner verbs from demonstratives.

..    The approximately  Oceanic languages belong to a well-deﬁned and major subgroup of the Austronesian family (Figure .; Table .). About half of them are spoken in New Guinea, the Bismarck Archipelago, and the northwest Solomon Islands (including Bougainville),  or so in the other parts of Melanesia (southeast Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Claire Moyse-Faurie . First published  by Oxford University Press

JAPAN CHINA

F Taiwan S PINE ILIP PH

Su m

Borneo

at

ra

Sulawasi INDONESIA

Yap

Federated States of Micronesia

Marshall Islands Kiribati

SH/WNG

CMP

0°

Nauru NEW GUINEA

Soloman Is Tuvalu

Timor

Tokelau Rotuma Wallis Samoa & Futuna Niue Fiji

New Caledonia

Marquesas FRENCH POLYNESIA

nds sla

Vanuatu

Also Madagascar

OCEANIC I ok Co

WESTERN Java MALAYOPOLYNESIAN

Guam Palau

20°

Hawai’i

Marianas

Easter Island

Rapa

F Formosan CMP Central Malayo-Polynesian SH/WNG South Halmahera/ West New Guinea Non-Austronesian languages in areas shown as Austronesian 120°

20°

Tonga Rarotonga

AUSTRALIA

100°

Tuamotu Archipelago

Tahiti

40°

NEW ZEALAND 140°

160°

180°

160°

F. .. The Austronesian family and major Austronesian language groups (Source: John Lynch, Malcolm Ross, and Terry Crowley, The Oceanic languages (Richmond, Surrey: Curzon Press, ), )

140°

120°



Claire Moyse-Faurie

T .. Main Oceanic subgroups with languages mentioned in this chapter Admiralty Islands Western Oceanic

Eastern Oceanic

Paluai, Loniu North New Guinea Papuan Tip Cluster Meso-Melanesian Cluster

Kaliai, Manam Kilivila, Saliba Tigak

Southeast Solomonic

Kwaio, Nggela, Toqabaqita

New Caledonia

Nêlêmwa, Nyelâyu, Cèmuhî, Xârâcùù

Loyalty Islands

Drehu, Iaai

Nuclear Micronesian

Mokilese

North and Central Vanuatu

Lewo, Lolovoli, Mwotlap, Paamese

South Vanuatu

Anejom ˜ , Neve’ei, South Efate

Central Pacific: Fijian Polynesian

Wayan Fijian East Futunan, East Uvean, Hawaiian, Māori, Marquesan, Samoan, Tahitian, Tokelauan, Tongan, Tuvaluan, Vaeakau-Taumako

Solomon, Vanuatu, New Caledonia, Loyalty Islands, and Fiji),  in Micronesia, and about  belong to the Polynesian subgroup. Proto-Oceanic was almost certainly spoken in the Bismarck Archipelago by the bearers of the archaeological culture known as Lapita, and broke up about , years ago when bearers of this culture spread across the previously uninhabited island groups of the southwest Paciﬁc east of the main Solomon group, as far east as western Polynesia (Kirch ; Pawley ).

..     Oceanic languages show a great deal of variation in their basic constituent order and in their valency structure. Many of them, however, have serial verb constructions, possessive and numeral classiﬁers, and rich inventories of intensiﬁers,¹ which are typically used as reﬂexive markers. All of these features are relevant for a discussion of grammaticalization issues. I will investigate whether or not the grammaticalization paths found in Oceanic languages resemble those described by Heine and Kuteva (), Narrog and Heine (b), and others, or whether some of them are rarely ¹ Throughout this chapter, I will use the term ‘intensiﬁer’ to designate not adverbs of degree like English very, but ‘operators denoting an identity function’ like Latin ipse, German selbst, or Russian sam. Four use types can be distinguished for such expressions (also called ‘emphatics’ in grammars of English): adnominal, exclusive adverbial, inclusive adverbial, and attributive (König and Gast ). As shown by König and Siemund (), intensiﬁers play an important role in the genesis, reinforcement, and renovation of reﬂexive anaphors.

Grammaticalization in Oceanic languages



found outside of Oceanic. The following developments seem to be of the second type, either (i) with verbs as the source (e.g. ‘to go down’ > reﬂexive and reciprocal marker, or ‘to return’, ‘again’ > preposition (‘until’), discourse particle, etc., or ‘to leave’ > non-beneﬁciary preposition); or (ii) with nouns as the source (e.g. ‘thing’ > stative marker, or ‘child’ > relative marker. I will also present several cases of relexiﬁcation that are relevant to the Oceanic languages (section .), as well as unusual developments from grammaticalized morphemes to less grammaticalized ones (section .). Oceanic languages also share a constituent type, referred to here as the ‘verb complex’, with a characteristic internal structure (Pawley ): ‘(i) It has as its nucleus a verb base or compound verb around which a number of grammatical functors occur in rigid order [ . . . ]. (ii) It is normally spoken as single intonation contour [ . . . ].’ The peripheral functors (particles) are usually free forms, as in Wayan Fijian, which has VO word order, as illustrated in example (), but in some Oceanic languages they have become afﬁxes, as in Manam, an OV word order language, as in (). ()

()

Wayan Fijian (Central Paciﬁc) Quu saa maci mai noo tuu . . . SG.SBJ PFV again come stay CONT ‘I’m supposed to come and stay as well . . . ’ (Pawley and Sayaba ) Manam (North New Guinea) Tanepwa maʔ mi-an-Ø-a-ŋ-ʔo. chief chicken SG.SBJ.IRR-give-SGO-BUF-BEN-SG.OBJ ‘I will give a chicken to the chief for you.’ (Lichtenberk : , cited in Pawley : )

I will mainly follow Frank Lichtenberk’s grammaticalization perspective.² This perspective entails a focus on the following results and consequences of grammaticalization: (i) emergence of a new grammatical category; (ii) loss of an existing grammatical category; (iii) change in the membership of a grammatical category; and (iv) semantic reanalysis leading to polysemy, resulting either in multiple meanings within the same grammatical category, or in heterosemy, i.e. in multicategorial or multifunctional polysemy (Lichtenberk a, b, a, , ). Throughout this chapter, I will present grammaticalization processes which are representative of this perspective, both with respect to the syntactic evolution and with respect to the semantic reanalysis. As already mentioned, I will also present some changes more rarely described, in which the original lexical use of a word often still coexists with its grammaticalized counterpart.

² Frank Lichtenberk, who tragically died in , would have been the most qualiﬁed specialist to contribute to this volume. In fact, most of the Oceanic examples included in the World Lexicon of Grammaticalization are from Frank’s contributions. It is with gratitude and memories of inspiring discussions that I acknowledge the role played by Frank’s ideas in this survey. I am also grateful to Andrew Pawley and to Ekkehard König for comments on an earlier version of this article.




THE GRAMMATICALIZATION OF VERBS

The reanalysis of verbs as prepositions is a well-known and common phenomenon (cf. e.g. König and Kortmann ). In Oceanic languages, the process mainly took place through the evolution of coverbs and serial verb constructions. The serial verb constructions (SVCs) in Oceanic languages which are of interest in this connection are of the ‘nuclear’ serialization type,³ ‘where the verbs are bound together and have only a single set of arguments’ (Lynch, Ross, and Crowley : ). Two recurrent types of grammaticalization took place, called ‘centrifugal’ and ‘centripetal’, respectively, by Pawley (), following Durie (). The first (V1) or last verb (V2) of the series becomes grammaticalized as a ‘verb-like preposition’ (also called ‘verbal preposition’, or ‘prepositional verb’), as locative, aspect, or topic marker, first staying in its original position, then moving to the periphery (Bril and Ozanne-Rivierre ; Crowley ; Durie ; Lichtenberk , a, b; Pawley ; Ross ). It is important to point out that although the grammaticalization of V2 is more frequent even in verb-initial or verb-medial Oceanic languages, the grammaticalization of V1, mostly developing into aspectual and modal uses, is also attested, as shown in the following East Uvean example. The verb lolotoga ‘to last’ is rarely used as an autonomous predicate (a):

()  a. ’E lolotoga te misa kae au ’alu au ki te falekoloa.
       NPST last SPC mass but SG go SG OBL SPC store
       ‘I am going to the store during Mass.’ (lit. ‘Mass is going on but I go to the store’)

It tends to grammaticalize as an aspect marker expressing the progressive (b):

    b. ’E matou lolotoga lalaga te fala o tomatou falefono.
       NPST PL.EXCL last weave SPC mat POSS our meeting.house
       ‘We are weaving the mat for our meeting house.’ (Moyse-Faurie : )

On the other hand, lexical words or particles are attracted into the verb complex, giving rise to adverbs of manner-deixis, quantifiers, or to completely new lexicalized entities (cf. .). Ross () analyses the development of the directional (ad)verbs from a semantic point of view, differentiating between (i) deictic directionals, indicating direction relative to the speaker, and (ii) geographic directionals indicating direction relative to absolute points in the environment (in Oceania, these are mainly mountains, the sea, east, and west). He further distinguishes between directionals which still are, or are no longer, verbs, i.e. those which have evolved into pre-verbal clitics (less frequently from a sequential SVC), prepositions (locative, ablative, or allative), or relators (more frequent, from directional SVCs) on the one hand, and, on the other, those which have lexicalized (cf. ..). Both intransitive and transitive verbs can grammaticalize into prepositions, but directionals mostly come from intransitive motion verbs. Formal modifications linked to the grammaticalization process, either phonological reduction or, on the contrary, reduplication, will only be briefly mentioned.

I will now discuss the most frequent grammaticalization paths associated with verbs in Oceanic languages, pointing out in more detail the semantics of the verbs along with resulting functions which are not widely attested cross-linguistically. In Oceania, very precise spatial representations are important, not only for motion or localization but also for many other types of events and states, for existential constructions and even abstract notions. Besides the well-known ‘come’ and ‘go’ motion verbs, the ‘go up’ and ‘go down’ verbs are linked to the geographical environment of the Oceanic islands, where reaching the coast from the sea or going inland is ‘to go up’, while going towards the seaside, or open sea, or ‘in the wind’ sailing conditions is ‘to go down’. The constant use of directional adverbs, after any kind of verb, is arguably a reflection of the specific Oceanic environment, and the further evolution of almost all deictic and geographic directionals in various grammatical morphemes is typologically quite distinctive.

Another interesting typological point must be mentioned here. In many cases, the verbal grammaticalization leads to so-called ‘splits’ (Heine, Claudi, and Hünnemeyer b: ), since the original function and meaning of the verb is often preserved.⁴ There is then a divergence between the former and the new functions that does not affect the existence of the original verb. Therefore, in most of the cases presented in this chapter no replacement of the lexical verb occurred.

³ The other type of serial verb constructions is the ‘core’ construction, in which the verbs remain separate. ‘Core’ constructions also have undergone grammaticalization processes, mainly an adverbial specialization of the second verb, and the evolution of the first verb as a classificatory prefix. These grammaticalization types are, however, much more frequent and diversified in the nuclear serialization type.

..      >     Lichtenberk (a) offers a thorough analysis of the evolution of posture verbs in Oceanic: they exhibit lexical polysemy, manifesting, in addition to their posture meaning (‘sit’, ‘stand’, ‘lie’, and ‘squat’), locative meanings (ʽbe at a placeʼ, ʽdwellʼ, ʽreside’), and existential meanings (‘exist’, ‘be available’, ‘be present’). Posture verbs are also found in aspectual functions (progressive, frequentative, persistive, durative, continuative); they may keep their verbal status, with only a few restrictions, or become aspectual particles, as in Fijian (Lichtenberk a: –). Early (: ) discusses in detail the grammaticalization pathways of posture verbs in Lewo (Vanuatu), in accordance with what is predictable in grammaticalization theory. ⁴ Of course, there are exceptions. For example, the Toqabaqita comitative marker and coordinator bia/ bii are reﬂexes of a transitive verb which no longer exists, but verbal cognates are found in the neighbouring languages (Lichtenberk : ), with meanings such as ‘be one with’, ‘be a partner’, ‘assist, help’. Another important exception concerns the lexicalization of former serial verbs, which gave rise to lexical preﬁxes through compounding, often without any verbal correspondences maintained (cf. .).



Claire Moyse-Faurie

Combined with the durative aspect, to ‘sit’ indicates temporary duration, while mo(< mono ‘lie’) indicates a more permanent duration, as shown in example (b). ()

Lewo (North Vanuatu) a. A-kinana m̃ a-ga. PL.SBJ-eat DUR-just ‘They were eating.’ b. A-mo-m̃ a a-kinana. PL.SBJ-lie-DUR PL.SBJ eat ‘They continued eating.’ (Early : )

In New Caledonian languages, the verbs ‘stand’ and ‘sit’ are the ones used to express similar aspectual values, as in the following Nyelâyu example, in which the verb taa ‘sit’ expresses the continuative aspect when occurring as V1:

()  Nyelâyu (New Caledonia, North of the Mainland)
    Yo taa boram bwa no-n ta mwa.
    SG sit bathe ASS SG.PFV go.up back
    ‘Go on bathing, as for myself, I go back up (on the beach).’ (Ozanne-Rivierre : )

In Xârâcùù the locative verb nöö ‘be at (a place) momentarily’, as in (a), has grammaticalized into an adverb (‘lately’) and a distal demonstrative (‘there, away from the speaker and the hearer, but still visible’) as in (b), and also forms locative adverbs.

()  Xârâcùù (South of the Mainland, New Caledonia)
    a. Xù mê péci bwa è nöö binêrè-kùrè. (verb)
       give VENT book DEM SG be.at side-cooking.pot
       ‘Give me the book which is next to the cooking pot.’
    b. Fîda taiki nöö ngê chaa kwââ. (demonstrative)
       hit dog DIST with one stick
       ‘Hit that dog [over there] with a stick!’

..      >  >   According to Pawley (, ) and Ross (), the development of verbs of movement into deictic directionals was already achieved in Proto-Oceanic for *mai ‘come’ > ‘towards speaker’ and *[w]atu ‘go away’ > ‘towards addressee’, ‘away from speaker’. For Proto-Polynesian, three other directionals are reconstructed: *hake ‘upward’, *hifo ‘downward’, and *aŋe ‘along, obliquely’. Two of these stem from verbs in Proto-Oceanic: POc *sake ‘go upward’ and *sifo ‘go downward’ (Ross ). Looking at the grammaticalization paths of verbs of movement in several Oceanic languages, Lichtenberk (b, ) pointed out that meaning extension is not arbitrary, since metaphor and metonymy play an important role in this process, and

Grammaticalization in Oceanic languages



he also showed how an extension in the meaning is motivated by the relation perceived by the speakers between the new and the old items. Different paths of development starting out from motion verbs will be presented below, some well attested elsewhere, others more speciﬁc to Oceanic languages, such as the grammaticalization of the verbs ‘go down’ or ‘return’ (cf. ...). In some languages, however, both uses, as verb and as directional, are still maintained. This is the case, for example, in Xârâcùù, where mê ‘come’ or fè ‘go’ are used as a lexical verb or a directional, depending on the word order. In ﬁrst position they are verbs, as fè ‘go’ in (a), but in second position, they are directionals, as mê ‘come’ in (a) and fè ‘go’ in (b): ()

a. Nâ pè mê nèké bwa ke fè mênêî na. SG take VENT basket DEIC SG go forget PST ‘I take with me the basket you forgot.’ (lit. ‘I take towards me the basket you went away (and) forgot’) b. Pè fè mîî ku a! take CFUG DEM.PL yam DEIC ‘Take away these yams!’

In South Efate (Vanuatu), mai ‘come’, ‘in addition to acting as main and auxiliary verb, can occur following a locational object’: ()

Nam̃ er nen ru=pa raru mai. people that PL.REALIS=drive canoe VENT ‘Those people bring canoes.’ (Thieberger : )

Although deictic as well as geographic directionals pertain to spatial reference, they are also found in other functions. Indeed, it is in the evolution from verbs of motion and directionals that Oceanic languages show the largest variety of grammaticalization paths, a few of them seldom encountered elsewhere: e.g. ‘go.to’ > preposition ‘according to’ or adverbial marker; ventive or centrifugal directionals > comparative markers; downward directional > intensiﬁer, reﬂexive and reciprocal markers.

Directionals developing into benefactive or recipient markers

Deictic directionals may refer to participants not explicitly mentioned in the discourse, as is the case in East Futunan with atu ‘towards addressee’, as in (), or with the ventive one, mai ‘towards speaker’ in Māori, as in ():

()  East Futunan
    E kau kole atu ke ke ’au.
    NPST SG ask CFUG in.order.to SG come
    ‘I am asking you to come’. (Moyse-Faurie : )

()  Māori
    Nā te kurī i amu mai te rākau.
    belong the dog PST bring VENT the stick
    ‘The dog brought me a stick.’ (Bauer : )



Directionals with aspectual or modal values

The aspectual or modal values evolving from motion verbs through directionals are more diverse than those of the posture verbs. Interesting developments are described by Besnier () in Tuvaluan, a Polynesian language that possesses four directionals: mai ‘hither (ventive)’, atu ‘thither (centrifugal)’, aka ‘up, above, landward’, ifo ‘down, below, seaward’. Mai is used to express the following changes:⁵ ‘Changes from sleeping to waking, from childhood to adulthood, from nonbeing or death to life, from darkness to light, from poor to good health, as in (), and from generally less to more desirable states’ (Besnier : ):

()  Tuvaluan
    Fakafetai me teenei koo feoloolo mai.
    thank because this INC middling VENT
    ‘Thank you, I’m feeling better.’ (Besnier : )

Moreover: ‘Mai may appear when the situation denoted by the verb has reached a conclusion, in which case it often implies that the participants have returned from the venue of the situation’ (Besnier : ), as in ():

()  Laatou koo pei tili mai.
    PL INC cast fishing.net VENT
    ‘They have returned from net-fishing.’ (Besnier : )

In contrast, atu may appear when the situation denoted by the verb is continuous or recurring, as in (), but may also modify verbs denoting changes in the opposite direction from mai:

()  Au e takatokato atu fua.
    SG NPST lounge.around CFUG just
    ‘I’m just lounging around.’ (Besnier : )

Besnier (: ) states: ‘Many metaphorical uses of aka and ifo overlap with uses of mai and atu, respectively. For instance, aka, like mai, can denote changes from darkness to light, from childhood to adulthood [etc.].’ The choice of a personal directional versus a local directional is also relevant. In example (a), the metaphorical use of the personal directional mai indicates that the increasing wind will affect the people present in this location, while in example (b), the local directional aka only expresses the increasing of the wind:

()  a. Te matagi koo tuku mai.
       ART wind INC let VENT
       ‘The wind is increasing [and is going to affect us].’
    b. Te matagi koo tuku aka.
       ART wind INC let UP
       ‘The wind is increasing [and may or may not affect us].’ (Besnier : )

⁵ There are many parallels with the English verb ‘come’: come of age, come to light, come alive, come to one’s senses, come true (Ekkehard König, p.c.).

The upward directional may also express an event that has come to a complete end, as in the following Tokelauan example:

()  Fanake ai la, kua hālo ake.
    go.up ANA INT PFV wipe UP
    ‘He goes up inland, he is completely wiped out.’ (Hooper )

In Marquesan (Cablitz : –), the downward directional iho is also a relative tense marker ‘expressing that an event has happened soon/just after another event’, while the centrifugal directional atu ‘often expresses remoteness to a reference point on the time axis’. In Māori, according to Bauer (: –), ‘all directionals have uses in temporal contexts’. Mai is used to emphasize the starting point of a period of time, which may be in the past, or in the future as in (),⁶ and which may be of a specified duration or open-ended.

()  Mai a tērā tau ko Pou te heamana.
    VENT at(FUT) that year PRED Pou the chairman
    ‘The chairman from next year on will be Pou.’ (Bauer : )

Atu ‘thither’ combined with the local noun mua ‘before’ expresses prior location in time. Ake ‘upwards’ sometimes occurs with future events, especially immediate future, while iho ‘downwards’ is commonly used to mean ‘from time past towards the present’, as in ():

()  Kātahi ia ka whakataukī iho.
    then SG TAM utter.a.proverb DOWN
    ‘Then she uttered this [prophetic] saying . . . ’ (Bauer : )

In Tokelauan, in Marquesan or in Tuvaluan, ‘upwards’ can be used as a polite downtoner (Besnier : ; Cablitz : ; Hooper ). In Drehu (Lifu, Loyalty Islands), the verb tro ‘go’ (a), although not a reflex of the Proto-Oceanic verb *(w)atu ‘go’, has undergone the same sort of evolution, expressing future tense, as in (b). Tro is also used as an obligation marker when combined with the imperfective aspect a, as shown in example (c); in both cases it is still compatible with the original verb:

()  a. Eni a tro Drehu elany.
       SG IPFV go Lifu tomorrow
       ‘I am going to Lifu tomorrow.’
    b. Tro ni a tro Drehu elany.
       FUT SG IPFV go Lifu tomorrow
       ‘I will go to Lifu tomorrow.’
    c. Troa tro elany la he.
       OBLIG go tomorrow ART boat
       ‘The boat should/must go tomorrow.’ (Moyse-Faurie )

⁶ Again we have clear parallels for the future use of ‘come’ in English: Come Monday things will be all right (Ekkehard König, p.c.).



In some Oceanic languages, directionals can be part of noun phrases, like ake ‘up’ in East Uvean which also conveys an aspectual value in this nominal context:

()  I te ’aho aké ne’e matou olo o gelu.
    OBL SPC day UP PST PL.EXCL go(PL) COMP fish
    ‘On the next day, we went fishing.’ (Moyse-Faurie : )

Verbs expressing comparison of inequality

Depending on the language concerned, either deictic or geographic directionals can be used in comparative constructions. In Loniu (Hamel : ), it is the verbs la (variant le) ‘go’ and me ‘come’ that are used as comparative markers, while keeping their subject clitics, as illustrated in ().

()  Loniu (Papua New Guinea)
    Ké itiyo elewen i-le ké itiyen.
    wood this long REALIS.SG-go wood that
    ‘This stick is longer than that stick.’ (Hamel : )

The fact that the subject clitic is retained throws doubt on the grammatical status of these comparative markers. According to Durie (: ), ‘overt morphological coding of verbal status inhibits the drift to preposition’; their status, however, is no longer verbal either.⁷ In Samoan, according to Mosel and Hovdhaugen (: ), only the deictic centrifugal directional atu ‘towards the addressee’ enters comparative constructions. In most Polynesian languages, however, it is verbs meaning ‘go down’ and ‘go up’ that are used for comparatives of inequality. Reflexes of Proto-Polynesian *hake ‘go up’ are used when the comparison or action denotes an increase in quantity or height (‘More is up’, Lakoff and Johnson : , ): better, higher status, older, healthier, etc., while reflexes of *hifo ‘go down’ are used when the comparison involves a decrease or a lower height as in example (), or quantity. In Tuvaluan, both ifo ‘downwards’ and aka ‘upwards’ enter comparative constructions:

() E maalalo ifo te taipola i te sefe. NPST low DOWN the table at the larder ‘The table is lower down than the larder.’ (Besnier : )

In Tuvaluan, however, the ventive directional mai ‘towards the speaker’ may also enter comparative constructions to ‘denote the fact that the entity being compared is closer to the point of reference of the discourse than the entity forming the standard of comparison’:

⁷ The Loniu motion verb ‘go’ has also undergone another interesting development (Hamel : ): it is used to introduce purpose, result, instrument, or manner complements, still preceded by a clitic subject: Iy i-puti iy i-le cani puton. SG REALIS.SG-take SG REALIS.SG-go cut umbilical.cord ‘She took him in order to cut the umbilical cord.’

Grammaticalization in Oceanic languages



() Koo pili mai a Oolataga i loo o Niuooku. INC near VENT ABS Olataga OBL compared POSS Niuoku ‘Olataga islet is closer [to here] than Niuoku Islet.’ (Besnier : )

... Verbs developing into intensiﬁers, reﬂexive and reciprocal markers In addition to its grammaticalization into directional, aspect, and comparative markers, the verb *hifo ‘go down’ has undergone further developments in some Eastern Polynesian languages (Moyse-Faurie ), along with a morphological reduction (*hifo > iho). It is used as an intensiﬁer (see note  for the use of this term) in Hawaiian (), and in Tahitian (a), where it also occurs as reﬂexive (b) and reciprocal marker (c). Hawaiian () pa’akikī ma kāna iho (attributive use) stubborn with his DOWN ‘Stubborn with his own [things].’ (Pukui and Elbert : ) Tahitian () a. Nāna iho terā rata i pāpa’i. (adnominal use) PRED+SG DOWN DEIC letter PFV write ‘He himself wrote this letter.’ (Académie tahitienne : ) b. Tē hohoni ’ona iā-na iho. (reﬂexive) NPST pinch SG OBJ-SG DOWN ‘He pinches himself.’ (Poeura Vernaudon, p.c.) c. ’Ua taparahi rātou rātou iho. (reciprocal) PFV hit PL PL DOWN ‘They hit each other.’ (Poeura Vernaudon, p.c.) Without the directional iho ‘downwards’ used as reﬂexive and reciprocal marker, example (b) would mean ‘He pinches him’, and example (c) would mean ‘They hit them’. In Māori, it is the ‘upwards’ directional ake that has an intensifying attributive use,⁸ as in example (): ()

. . . kei reira tōna ake reo, ana ake tikanga at there its UP language its UP customs ‘ . . . that part has its own language and its own customs’ (Bauer : )

In Marquesan, there is a sort of serial verb construction that may also be used to express reciprocity. The verb is reduplicated, each duplicate part being followed by a deictic directional, as in ():

⁸ In Māori, contrasting with Tahitian, reﬂexives and reciprocals are not constructed with a directional, but with the postverbal emphatic marker anō ‘again’.



() U avei ’aua, u hopu atu hopu mai . . . PFV meet DU PFV embrace CFUG embrace VENT ‘They met (and) embrace each other . . . ’ (Cablitz : )

... Verbs developing into prepositions

In Paluai, belonging to the Eastern Admiralties subgroup, the verb la ‘go’ (< POc *lako ‘go, thither’) has grammaticalized in various ways, some of them similar to what is found in Loniu (), but others no longer involving motion at all, such as its use in manner adverbials (a) or as a preposition meaning ‘according to’ (b):

Paluai (Manus Province, Papua New Guinea)
() a. Uro rok la bian palsi. DU.HAB stay GO.TO good PAST ‘They used to live (together) well in the past.’ (Schokkin : )
b. Minak tebo ip maro ret tou pun la aronan pwên. present DEM.PROX PL NEG.HAB move give very GO.TO way.PERT NEG ‘Nowadays, they do not run (ceremonies) properly according to procedure.’ (Schokkin : )

Throughout this section, we have seen various instances of evolution involving deictic and geographic motion (ad)verbs. I will now present several grammaticalization paths undergone by two other motion verbs, namely the verbs ‘return’ and ‘follow’, whose meanings are not purely topographical.

..     ‘’  ‘’ These two other Oceanic motion verbs, ‘return’ and ‘follow’, have undergone interesting kinds of grammaticalization, some of them less known worldwide. I choose to consider them together, for the following reason: the meanings of these verbs are complementary, since ‘return’ implies a break, a rupture from the preceding event, while ‘follow’, conversely, implies a continuation.

... The verb ‘return’ The verb ‘return’ was considered by Lichtenberk (b) as a source of several kinds of grammatical markers, speciﬁcally reditive directionals (‘back’), repetition markers (‘again’), prohibitive markers, additive particles (‘also, too, as well’), and reﬂexive markers. Looking at a larger set of Oceanic languages, I found several other paths of grammaticalization for ‘return’/‘again’ (Moyse-Faurie ), i.e. paths leading to nominal modiﬁers (‘another’, ‘same’), and to additive/focus particles (‘indeed’, ‘exactly’). These are already well known as possible paths of development elsewhere. Developments into intensiﬁers (‘-self ’), reﬂexive and reciprocal markers, emphatic adverbs, into prepositions (‘until’) or conjunctions (‘then’), and into




discourse particles (exclamative markers), are (as far as I know) rare outside Oceanic languages; continuation ‘up to a point’ and tense-aspect markers seem to be only attested in a few Oceanic languages. Below are two typical examples of the grammaticalization of the verb ‘return’ in Oceanic languages. In Toqabaqita, a Solomonic language, it is the reduplicated form of oli ‘return’ which marks reciprocity: ()

Toqabaqita Roo kini kero fale olili qani keeroqa. two woman DU.SUBJ give RETURN.RED PREP DU ‘The two women give (things) to each other, back and forth.’ (Lichtenberk : )

Reflexes of PPn *foki ‘return’ are used in exclamative sentences in several Polynesian languages. In these sentences the predicate typically is a nominalized verb phrase (cf. Moyse-Faurie ). The modifier underlines the surprise effect.

Māori
() Te makariri hoki o te wai! SPEC cold RETURN POSS SPEC water ‘How cold the water is!’ (Bauer : )

I proposed the following tentative explanation for these developments: If we consider all the meanings and uses deriving from the notion of ‘return’, the common denominator could be seen not so much in the notions of returning and iteration but in the notions of continuity and ruptures or breaks in the continuity. The ruptures could be a change of direction (→ ‘return’), a change of state and return to the first one (→ ‘again’, iteration), a rupture in standard assumptions about disjoint argument structure (→ reflexivity, reciprocity), a change in perspective (→ ‘namely’), a change in argumentation (→ ‘and then, however’), an end of continuation (→ ‘up to’, ‘from now on’, ‘for the first time’), or a break in deictically given proximity (→ aspect and tense markers). In view of this common denominator, ‘the semantic changes leading to different targets are based on very general processes of metaphorical and metonymic extensions’ (Moyse-Faurie : –).

... The verb ‘follow’ Lichtenberk () also investigated the development of reason and cause markers from the POc transitive verb *suRi ‘follow, be in motion behind somebody or something’, ‘accompany’. Heine and Kuteva () list the prepositions ‘according to’, ‘behind’ and comitative as grammaticalized forms of the verb. However, cause, as in () seems to instantiate a development unknown outside Oceanic languages. Toqabaqita () Ku too qi luma suli-a ku mataqi. SG.NFUT stay LOC house RSN-.OBJ SG.NFUT be.sick ‘I stayed at home because I was sick.’ (Lichtenberk : )




In addition to the  languages listed by Lichtenberk (: –) showing reflexes of POc *suRi either introducing clause or noun phrase complements or remaining as verbs, I can mention the East Uvean complex marker (ko te) ’uhi (lit. ‘(it is) the reason’), which introduces noun phrase causal complements (a) as well as purpose complements when combined with the conjunction ke (b).

East Uvean
() a. Kua mapunu te ala i te ’akau ’uhi ko te afā. PFV stuck SPC road OBL SPC tree reason PRED SPC hurricane ‘The road is stuck by trees because of the hurricane.’
b. ’E au ako ko te ’uhi ke au poto. NPST SG study PRED SPC reason in.order.to SG intelligent ‘I am studying in order to become educated.’ (Moyse-Faurie : )

As noted by Lichtenberk, metonymy is the motivating factor in the rise of the reason/cause-marking function from the verb meaning ‘follow’.

..      ... Verbs ‘give’, ‘help’, and ‘say’ Lichtenberk () mentions several Oceanic languages which have grammaticalized a benefactive/recipient/goal marker from the verb ‘give’, reconstructed in ProtoOceanic as *pa(n,ñ)i, which was already also, according to Pawley (), a prepositional verb meaning ‘motion to an animate goal’. The Xârâcùù verb xù ‘give’ is not cognate with the Proto-Oceanic form, but has undergone a similar development. In all the other New Caledonian languages, the benefactive marker has a different origin, being identical either to a possessive marker or to a locative preposition. In some North New Guinea languages (Tigak, Kaliai), only the grammatical reﬂex is maintained, marking goals, beneﬁciaries, or locations. In Nggela (Codrington ), the lexical reﬂex of *pa(n,ñ)i ‘give’ acquired a new meaning, ‘say’, while the grammatical reﬂex serves to introduce goals, beneﬁciaries, instruments, or causes. Grammatical reﬂexes of *pa(n,ñ)i are found either as verb-like markers, indexing their complements by means of object sufﬁxes, or as noun-like markers, taking possessive sufﬁxes, as in languages from the Southeast Solomonic subgroup. In Anejom̃ (), by contrast, the marker imta- (+ possessive sufﬁxes), introducing beneﬁciaries and recipients, has another source than the verb ‘give’, namely the verb ‘help’, ‘associate’. Anejom̃ () Et yip̃ al imta-ma a tata. SG.AOR tell.story DAT-our.EXC.PL SBJ Dad ‘Dad told us a story.’ (Lynch : ) The verb ‘say’ is known to give rise to causal, conditional, evidential, and purposive meanings (Heine and Kuteva ), and may also develop into complementizer or quotative markers, as shown by Klamer () and Hsieh () for certain non-




Oceanic Austronesian languages. This is also the case in several Oceanic languages. In Drehu, for example, hape ‘say’ is now seldom used as a main verb, but introduces direct or indirect speech combined with the stative marker ka. Drehu (Lifu, Loyalty Islands) () Hnei aji hna sa ka hape eni a madrin! SM rat PST answer STAT say SG IPFV rejoice ‘The rat answered: I am glad!’ (Moyse-Faurie : )

... Verbs ‘take’, ‘take off, throw away’ The verb ‘take’ is known to develop into causative, comitative, instrument, patient, completive, and future tense markers (Heine and Kuteva : –). According to Durie (), in most languages in which the verb ‘take’ became a relational marker, the verb was in V position in SVCs. Ozanne-Rivierre () studied the evolution of ‘take’ verbs in several Caledonian languages, in which ‘take’ in V position gave rise either to lexicalization (SVC > compound verb > simple transitive verb) or to grammaticalization, through the reanalysis of V ‘take’ as an enclitic transitivizing applicative morpheme, with an associative meaning, in constructions implying simultaneous events. Below are examples in Nyelâyu showing two occurrences of the verb pha ‘take’, as an independent verb (a), as V in a serial construction (b), and under its grammaticalized form –va as an applicative associative marker (c): ()

Nyelâyu (North of New Caledonian mainland) a. Lha pha ca pwa-ru dep. PL take each CLS-two mat ‘They each took two mats.’ b. Kam ron charemwa ta pha nae-n. so SG.PFV run go.up take child-SG.POSS ‘So she ran up to get the child.’ c. Ta taa-va an Cana nae-n. SG sit-APPL ERG Rosana child-SG.POSS ‘Rosana is sitting with her child on her lap.’ (Ozanne-Rivierre  : )

In Fijian (Pawley ) or in Saliba (Margetts ), reﬂexes of the Proto-Oceanic sufﬁx *-akin[i] (analysed by Evans  as two morphemes: the applicative sufﬁx *akin plus the transitivizing sufﬁx *-i) are used for associative casemarking, and to introduce objects referring to instrument, source, result, or accessory. This point is discussed in more detail in section ... In the Polynesian Outlier Vaeakau-Taumako (Næss : ), the verb toa ‘take’ underwent a grammaticalization process in a core-layer SVC, as V, conferring a volitional or inceptive meaning on the clause. Finally, in Xârâcùù, the verb witaa ‘throw away’ gave rise to the disattributive preposition taa:




() Nâ xâdùù chaa lotoo taa Dapé. SG buy one car OFF Dapé ‘I am buying a car from Dapé.’ (Moyse-Faurie : )

..   Lichtenberk () describes in detail the functions and development of two Toqabaqita phasal verbs, sui ‘end, ﬁnish, be ﬁnished’ and thafali ‘start, begin’. The verb sui either occurs as a plain verb, or in ‘mini-clauses’ in which it only admits a third singular pronominal subject and signals ‘the end of a state of affairs expressed in another, preceding clause’ (p. ). Sui has also several grammatical functions: it is a sequential marker (‘then’), occurring clause-initially; a postverbal completive marker; a contrastive clausal coordinator (‘but’); or a noun-phrase internal particle with an ‘exhaustive-marking’ function, as in the following Toqabaqita example: () Wela nau ki sui boqo kera sukulu qi manga qeri. child SG PL end ASS PL.NFUT attend.school LOC time this ‘All of my children attend school at this time.’ (Lichtenberk : ) The different grammaticalization paths described for sui in Toqabaqita are well known cross-linguistically, but are not so frequently found in other Oceanic languages, except for the use of ‘ﬁnish’ as an adverb meaning ‘completely’, ‘deﬁnitively’, as for example in East Uvean: () ’E

au mahalo ’e nofo ’osi! SG think NPAST stay ﬁnish ‘I think he will deﬁnitively stay (here).’ (Moyse-Faurie : ) NPST

or as a completive aspect, representing an event as completed, as in Mwotlap: () Nēk may suwsuw bah ēnōk? SG PFV bathe FINISH now ‘Have you already taken your shower?’ (François : ) In Toqabaqita, the transitive verb thafali ‘start, begin’ has another function: in a monoclausal construction, it is an inceptive marker, and in this use it has to be detransitivized with the -qi sufﬁx: () Nau ku thafali-qi uqunu naqa. SG SG.NFUT INCEPTIVE-DETR narrate PRF ‘I am about to begin to tell a/the story.’ (Lichtenberk : )

..   () I have already mentioned the fact that posture verbs often broaden their meanings to serve as locative and existential verbs (Lichtenberk a). Other sources for existential or verbs of nonexistence are also worth mentioning. In East Uvean, the form




mole may occur either as a verb ‘not exist, disappear’, as in (a), or as a negative marker, able to modify the positive existential verb iai, as in (b).

() a. ’E mole he ’aliki. (verbal occurrence) NPST not.exist NSPC chief ‘There is no [such person as a] chief.’
b. ’E mole iai ni ’ao i te lagi. (negative marker) NPST NEG exist NSPC.PL cloud OBL SPC sky ‘There are no clouds in the sky (today).’ (Moyse-Faurie : )

This negative marker mole has replaced the older form he’e (< Samoic-Outlier Polynesian *se’e), which remains only for the negation of nominals as in he’e gata ‘without end’ or in the complex negative form he’eki ‘not yet’. In Xârâcùù, the active verb xwi ‘do, make’ may take any (pro)nominal subject, and there is agreement between the preposed pronominal subject and the lexical subject, which is postposed to the predicate and introduced by the subject marker ngê (a). In its use as an existential verb, only the third person singular pronominal subject is possible. Moreover, the subject marker ngê is no longer required, and there is no agreement with the lexical subject, as shown in (b). Xârâcùù () a. Ri xwi farawa va nèkè-ri ngê pa pwângara. PL make bread ASS CLS(starchy food)-PL SM COLL European ‘Europeans make bread as their starchy food.’ (lit. ‘they make bread as their starchy food, the Europeans’) b. È nää xwi (ngê) mîî pè-ngâârû rè ri. SG PST.PROG exist (SM) PL stone-seed POSS PL ‘Their stones for seed-plants used to exist.’ (lit. ‘it used to exist, their stones for seed-plants’) The existential predicate is also used to express the notion of ‘to amount to’, in reference to time, as in (c), and it is well known that existence and quantiﬁcation are often related. c. È xwi bachéé daa mè péépé wâ paii. SG amount.to three day that baby PFV sick ‘The baby has been sick for three days.’ (lit. ‘it amounts to three days that the baby got sick’) (Moyse-Faurie : ) Verbs meaning ‘do, make’ have several well-known grammaticalization paths (causative marker, continuous aspect, etc.), but the evolution into an existential verb has never been mentioned. Parallel evolutions leading to existential predications are attested in French, with il y a, corresponding to English ‘there is’, but not with the verb ‘do, make’ as a starting point. 
Moreover, in several New Caledonian languages spoken in the north of the Mainland, such as in Nyelâyu (Ozanne-Rivierre, unpublished comm.), the verb thu ‘do, make’ grammaticalized, by metonymy, into the subject/agent marker ru:




()

Nyelâyu a. Lhe pe-hari me lhe thu mwa. DU REC-say that DU make house ‘They take the decision to build a house.’ b. Ta pavara ru uru ti hada-yeek. SG break SM wind DEF branch-tree ‘The wind broke the branch.’

. GRAMMATICALIZATION OF NOUNS

With the exceptions of the grammaticalization of nouns meaning ‘thing’ (..) or ‘child’ (..), I have not found anything specifically Oceanic in the grammaticalization of nouns. As elsewhere, the nouns involved in a grammaticalization process correlate with certain semantic types (e.g. body parts, spatial notions, kinship terms, or parts of a whole), and generally are bound (relational) nouns, i.e. obligatorily possessed. Body part nouns in particular grammaticalized into locatives used for spatial deictic reference, as shown by Bowden () in more than  Oceanic languages, or more specifically by Senft () in Kilivila (Western Oceanic, Papuan Tip Cluster). I will just mention a few cases concerning bound nouns which have grammaticalized in the following ways:

• As ‘noun-like’ prepositions: In Nêlêmwa (New Caledonia), the bound noun shi- ‘hand’ is used to introduce a beneficiary or recipient complement, expressed as possessor (Bril : ), and in Cèmuhî (New Caledonia), the bound noun ndε- ‘property’ first became the possessive marker (+ animate possessor) tε-, and then was further grammaticalized into a comitative marker (Rivierre : ).

• Intensifiers and reflexive markers from body part nouns are found for example in Kwaio (Southeast Solomon) with labe- ‘body’ + possessive suffix (Keesing ) or in Lolovoli (North Central Vanuatu) with sibo- ‘self’ + possessive suffix (Hyslop ), but not in many other Oceanic languages.

• Possessive classifiers in Oceanic languages are nouns or ‘noun-like’ (being nominalizations of verbs), i.e. either independent nouns or bound nouns taking the same possessive suffixes as the ones occurring in direct possessive constructions. Lynch () argues that at least two of the POc possessive classifiers had a verbal origin: POc *kani ‘eat’ > *ka- for the food classifier; POc *inum ‘drink’ > *ma- for the drink classifier. They had derived from earlier transitive verbs; the possessum had originally been a direct object, and the possessor an indirect/benefactive object, indexed on the verb by a suffix. Among Oceanic languages, Micronesian languages, as well as some New Caledonian languages and Kilivila (Papua New Guinea), are well known for having a large number of possessive classifiers. For example Iaai (Uvea, Loyalty Islands) has a rich paradigm of  possessive classifiers (Ozanne-Rivierre : –), a number equivalent to that of the Micronesian classifiers, and their lexical origin is transparent for most of them (Dotte : –). Only three possessive classifiers are reconstructed for Proto-Oceanic: general, foods, and drinks classifiers.




• Numeral classiﬁers also have a nominal origin. Their number varies from one language to the other. According to Bril (), Nêlêmwa (New Caledonia) has  numeral classiﬁers, such as pwa- to count round objects (pwa-nem pwâ-mâgo ‘ﬁve mangos’), pu(m)- to count plants and trees (pu-nem mâgo ‘ﬁve mango trees’), or aa- for living creatures (aa-nem ak ‘ﬁve men’).

..  ‘’ >   In Tahitian (Vernaudon ), as in Marquesan, the nominal mea ‘thing’ (a) grammaticalized through a qualifying use (b) into a stative aspect marker (c): ()

Tahitian
a. Aore te ho’ē mea i toe. (nominal use as subject) NEG ART one thing PFV remain ‘There is nothing left.’
b. E mea rahi te fare. (qualifying function) NPST thing big ART house ‘The house is big’ (*E rahi te fare).
c. Mea ti’aturi Pito i teie rū’au. (aspectual function) STAT trust Pito OBJ DEM old ‘Pito trusts this old man.’ (Vernaudon : –)

In (c), mea acquired a new status, and commutes with the paradigm of aspectual markers. This is apparently also a very special path of development, which is not mentioned in Heine and Kuteva ().

..  ‘’ >   In Mwotlap (François : –), the noun *m̄ ey historically means ‘child’ (< PEOc *mweRa). Yet this etymological meaning is nowadays only found as a formative in compounds, as in leplep-m̄ey (lit. ‘take-child’) ‘to give birth’. But apart from these vestigial cases, m̄ey is now only attested as a relative marker: () na-lqōvēn m̄ ey ne-leg ART-woman REL STAT-married ‘a married woman’ This development is arguably due to semantic extension. In northern Vanuatu languages, the root *mweRa already shows semantic ﬂexibility, from its original meaning ‘child’ to any human being. Thus, *mweRa-i somu (lit. ‘child of shellmoney’) means ‘rich person’ (François : ). In Mwotlap, the form m̄ey has similarly become a dummy noun for all humans, and indeed for any referent, equivalent to Eng. ‘one’ in ‘the big one’; for example, it combines with deictics in m̄ey gōh ‘this one’, m̄ey gēn ‘that one’. This dummy-noun structure would then have been the source of a general relative marker, compatible with any referent (even non-human) and any type of predication:




() na-pnō m̄ey ne-tegha ART-country REL STAT-different ‘a country [which is] different’ (François : )

When used as a relativizer, the original form m̄ey [ŋ͡mwεj] alternates freely with mey [mεj]; this is an indication that the grammaticalization process is complete, so that the original noun has become synchronically a different morpheme with its own properties.

. SECONDARY GRAMMATICALIZATION: DEVELOPMENT OF BENEFACTIVE MARKERS FROM POSSESSIVE MARKERS Oceanic benefactive markers are of very diverse shapes. Some originated from the verb ‘give’, as mentioned in section ..., but others developed from already grammaticalized morphemes, i.e. from possessive markers. Margetts (), Song (, ), and Lichtenberk (b) challenged the unidirectionality of the grammaticalization process from dative/benefactive marker to more abstract possessive marker, as posited by Heine (b: ). In quite a few Oceanic languages, indeed, ‘an extension from possessive to benefactive markers is well attested’ (Margetts : ). Margetts details the different stages and the conditions under which the shift from possession to benefaction happened in Saliba (Western Oceanic, Papuan Tip Cluster): Stage . Attributive possession with benefactive implicature occurs with verbs of transfer, verbs of obtaining, verbs of creation expressing an activity directly affecting the possessive relation, and verbs of performance expressing an intended transfer. () a. Yo-da ku hedehedede. (attributive possession) CLS-EXCL.POSS SG tell ‘Tell us something! / Tell us a story!’ (Margetts : ) Stage . Separate constructions with distributional overlap; the benefactive reading begins to emerge as a grammatically distinct construction (bridging contexts preceding the grammaticalization). b. Yo-na tobwa ya-halusi. (either possessive or benefactive reading) CLS-SG.POSS bag SG-weave (a) ‘I wove her bag.’ (b) ‘I wove a bag for her.’ (Margetts : ) Stage . Separate constructions without distributional overlap (no more bridging contexts). c. Yo-na ya-tolo. (benefactive reading only) CLS-SG.POSS SG-stand.up ‘I stood up for her (because they falsely accused her).’ (Margetts : ) In Saliba, there is the same marker for both possessive and benefactive expressions, though constituent order may be different (Margetts : ). 
In other languages there are subsequently different base morphemes, with replacement of the possessive




marker by a new form, the old form expressing benefactives only, as in Toqabaqita, where POc *ka- possessive classiﬁer for food items developed into qa-, the grammatical marker of benefaction. ()

Toqabaqita (Southeast Solomon) Kini kai faali-a qa-kuqa teqe teeter. woman SG.NFUT weave-.OBJ BEN-SG.PERS one fan ‘The woman will weave me a fan.’ (Lichtenberk : )

Some of Margetts’s remarks (: ) relate to the role of speciﬁcity: ‘Lack of a speciﬁcity constraint for attributive possessive expressions may constitute a prerequisite for the benefactive implicature to grammaticalize towards a formally distinct construction.’ There is total agreement between the three authors (Lichtenberk, Margetts, and Song) that the benefactive-marking function in Oceanic languages developed from possessive markers through a process of reanalysis. This innovation has taken place independently in several languages belonging to different branches of Oceanic: Micronesian, Southeast Solomonic, and Papuan Tip Cluster (Western Oceanic) subgroups (Lichtenberk : ).

. RELEXIFICATION

..      Mostly attested in the Western Oceanic subgroup, verbal compounds that have given rise to verbal classiﬁcatory preﬁxes are also a lesser-known feature of New Caledonian Mainland languages. Whereas this development, probably stemming from former nuclear-layer serializations, is often linked to V-ﬁnal word order,⁹ it is worth mentioning that it also happened in V-initial languages, as described in detail by Ozanne-Rivierre and Rivierre (). The verbal preﬁxes, however, show a close semantic similarity in both types of languages: they express the manner of the action—more precisely, the type of gesture accompanying the action along with the body part involved—while the second part of the compound expresses the result. Depending on the languages, these preﬁxes have, or lack, corresponding independent verbs. Nine of the ten Manam (Madang Province, Papua New Guinea) classiﬁcatory preﬁxes (Lichtenberk : –) have corresponding transitive verbs, while in New Caledonian languages, ‘both elements forming the compound have often lost their status of independent verbs’ (Ozanne-Rivierre and Rivierre : ), although their verbal origin can generally be reconstructed. There is also large variation concerning the number of preﬁxes expressing the manner of the action: there are only a few of them in the north of the Mainland, but up to several dozen in each of the southern languages. Below are some Xârâcùù examples from Moyse-Faurie (). For a fuller semantic and typological treatment of event compounds, see Gast, König, and Moyse-Faurie (). ⁹ Languages belonging to the Papuan Tip Cluster, as well as other languages of the Western Oceanic subgroup, have changed their word order from SVO to SOV through contact with the surrounding nonAustronesian languages.




The first component is a bound form (with CV-syllable structure) derived from a verb through a reduction of all but the first syllable. Cf. the list of verbs with bi- from biri ‘turn, twist’ in ().

bi- < biri ‘turn, twist’
bica: twist and break
bicaa: pick (fruits) by twisting
bichâ: be unscrewed
bichëe: screw in the wrong way
bifagö: unclamp, unscrew
bikakörö: pull to pieces
bikörö: grind
bimwêrê: turn off (tap)
bipuru: break in two pieces by twisting
bitia: tear by twisting
bitùrù: squeeze by twisting
biwi: unscrew
bixwêê: twist to make fall, etc.

The second element may have one, two, or even three syllables, but is still rarely attested as an independent verb, as is shown by the formations with -puru ‘break in two pieces’ in (). ()

-puru ‘break in two pieces’ (variant -buru when the preceding vowel is a nasal):
bipuru: — by twisting
capuru: — with the hand
chäburu: burn the middle of a wooden stick to break it into two pieces
chapuru: — break with an axe
chèpuru: — (a rope) by pulling
fîburu: — by hitting (with a bar of metal)
gwépuru: — by throwing
jipuru: — (bread)
jöpuru: take a short cut
kêburu: break in two with the hand
kèpuru: — with the teeth
kipuru: — with a saw
kwipuru: — with a saber
söpuru: — with a circular movement of the hands
tapuru: — with a stick
tipuru: — and tear
tupuru: — and fold up, etc.
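The two formation rules described above (reduction of the first verb to its first CV syllable, and the -puru/-buru alternation after a nasal vowel) can be sketched informally in code. This is only an illustrative sketch, not a description of Xârâcùù phonology: the function names are hypothetical, the syllable cut is approximated orthographically as "everything up to and including the first vowel letter", and the set of nasal vowels is an assumption limited to the vowels that precede -buru in the forms listed above (kêburu, chäburu, fîburu).

```python
import unicodedata

# ASSUMPTION: only the vowels attested before -buru in the list above
# are treated as nasal; the full inventory is not given in the text.
ASSUMED_NASAL_VOWELS = {unicodedata.normalize("NFC", v) for v in ("ê", "ä", "î")}


def is_vowel_letter(ch: str) -> bool:
    """True if the letter, stripped of diacritics, is one of a, e, i, o, u."""
    base = unicodedata.normalize("NFD", ch)[0]
    return base.lower() in "aeiou"


def manner_prefix(verb: str) -> str:
    """Approximate the first CV syllable: keep letters up to the first vowel."""
    verb = unicodedata.normalize("NFC", verb)
    for i, ch in enumerate(verb):
        if is_vowel_letter(ch):
            return verb[: i + 1]
    raise ValueError(f"no vowel letter found in {verb!r}")


def compound(prefix: str, second: str = "puru") -> str:
    """Attach the second element; -puru becomes -buru after a nasal vowel."""
    prefix = unicodedata.normalize("NFC", prefix)
    if second == "puru" and prefix[-1] in ASSUMED_NASAL_VOWELS:
        second = "buru"
    return prefix + second


if __name__ == "__main__":
    print(compound(manner_prefix("biri")))  # prefix bi- from biri
    print(compound("kê"))                   # nasal vowel triggers -buru
```

Only biri is given in the text as a source verb, so the other prefixes are passed to `compound` directly rather than derived from an independent verb.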

The development of such lexical preﬁxes and sufﬁxes through compounding processes has contributed to a (still ongoing) expansion of the lexicon in New Caledonian as well as in some Western Oceanic languages. It does not apply, however, to borrowings.

..      As suggested by Heine (p.c.), this section and the following deals with what can be interpreted as the ﬁnal stage of grammaticalization where a form loses its function and merges with its host. According to Lynch (), Proto-Oceanic articles were rarely retained in the Southern Oceanic language subgroup, least of all as ‘free-standing articles’. Many Vanuatu languages ‘have accreted one or more of these articles onto some nouns, with greater or lesser degrees of morphological fusion with the noun’ (Lynch : ). The POc common article *na has been integrated as initial n+V in many nouns in Vanuatu languages, even if *na can be omitted in some cases, such as compounds or plural marking. For example in Neve’ei, ‘noun-initial nV- is semi-productively lost in compounds and in address terms’ (Lynch : ): niyim ‘house’ > liyim ‘at/to the house’. In his grammar of Anejom̃ , Lynch () mentions that slightly over  per cent of Anejom̃ nouns begin with n- (or in-), reanalysed as integral part of the




root. The Anejom̃ reﬂexes of POc *kutu ‘louse’ and *lima ‘hand’ are respectively necet and nijma, analysed as the fusion of the former article into the root: ()

POc *na kutu > Anejom̃ necet [ne+cet] ‘louse’, POc *na lima > Anejom̃ nijma [ni+jma-] ‘hand’, etc. (Lynch : )

Crowley (: –) argues that not only POc articles but also locative markers have been reanalysed as parts of nouns in Paamese.

..      Lynch (: –) discusses in detail the origin of the realis/irrealis distinction, which is marked in some Oceanic languages by an oral/nasal consonant alternation. This alternation could be explained ‘as a secondary development resulting from the fusion of a preverbal particle [*ma realis and *na irrealis] with the verb’.

. INSTANCES OF DEGRAMMATICALIZATION? Although grammaticalization is generally described as a unidirectional process (Haspelmath , ), I would ﬁnally like to discuss a few cases that appear to be instances of what Norde (: ) described as cases of degrammation, deﬁned as ‘a composite change whereby a function word in a speciﬁc linguistic context is reanalyzed as a member of a major word class’.

..        In section .. I discussed cases where new verbal lexemes were formed through compounding, the compounds themselves stemming from serial verb constructions. In the data presented below, the new verbal lexeme—existential verb and mannerdeixis verbs—is composed of different grammatical morphemes, which are fused in order to yield a new meaning. Several Polynesian languages have an existential verb iai, ‘there is, it exists’, resulting from a lexicalization/degrammation process. This verb derives from the locative anaphor i ai (consisting of the preposition i ‘location, at, in’ plus the deictic anaphoric ai), which refers back to a locative phrase that occurs earlier in the sentence or in a preceding sentence. In these Polynesian languages, the preposition i has morphological variants, as for example in East Futunan, with three different forms: i + toponyms, deictics and common nouns; ia + proper nouns and dual/ plural pronouns; iate + singular pronouns. Only the form i is involved in the degrammation process. Chapin ()¹⁰ offers a thorough analysis of the different uses of the anaphoric ai, mainly as locative, but also temporal, goal, attributive, or causal complements, in  different Polynesian languages. He also describes the nonanaphoric use of ai as an existential predicate (p. ), most often in combination with ¹⁰ I am indebted to Andrew Pawley for pointing out to me the importance of Chapinʼs  article.



Claire Moyse-Faurie

the preposition i, even in the languages in which this preposition does not obligatorily occur before the anaphoric ai (p. ). Below in () is an example of East Futunan, a language for which Chapin had no data. Here the existential verb iai and the locative anaphoric prepositional phrase i ai may co-occur in a sentence: ()

East Futunan O kaku atu loa ki Mamalu’a e iai le nofolaga i ai . . . and reach DIR SUCC OBL Mamalu’a NPST exist SPC camp OBL ANA ‘And arriving in Mamalu’a, there is a camp there . . . ’ (Moyse-Faurie : )

The existential verb iai ‘exist’ acquired most of the morphosyntactic properties which are typical of the verb class, i.e. the compatibility with all the tense-aspect markers and the negative marker (cf. (b)). In addition, the existential verb iai ‘exist’ can occur in a nominalized phrase, as can any other verb, preceded by the specific article (and possibly an aspect marker), and followed by a possessive noun phrase, introduced by the alienable preposition a in (a), or by the inalienable preposition o in (b), depending on the relation between the possessor and the possessum:

()  a. Ko le kua iai a motokā, e se koi ano lalo le fenua.
       PRED SPC PFV exist POSS car NPST NEG CONT go on.foot SPC people
       ‘Since there are cars, people do not walk any more.’ (lit. ‘this is the now existence of cars, people do not walk any more’)

    b. Ko le kua iai fa’i o ne’alava.
       PRED SPC PFV exist RESTR POSS clothes
       ‘It is the fact that we now wear clothes.’ (lit. ‘this is the now existence of clothes’)

The same lexicalization is attested in East Uvean, as shown in example (), in which both the existential verb iai and the prepositional phrase i ai co-occur:

()  Ne’e iai te fo’i ’utu i ai ’e higoa ko ’Utuuhu.
    PST exist SPC CLS rock PREP ANA NPST name PRED ’Utuuhu
    ‘There there was a rock called ’Utuuhu.’ (lit. ‘it existed a rock in that place called ’Utuuhu’)

A similar degrammation process has been described by Lynch (: ) for Anejom̃ (South Vanuatu), as shown in Table .. ‘The existential verb bears a strong formal resemblance to the anaphoric demonstrative pronouns. It may be that the existential verb is a verbalisation of the demonstratives, which might explain its irregularity’ (Lynch : ).

Table .. ‘Verbalization’ of the Anejom̃ anaphoric demonstrative

            Existential verb    Corresponding anaphoric demonstrative
singular    yek                 yiiki
dual        rak                 raaki
plural      sjek                jiiki, jeken

Grammaticalization in Oceanic languages



Heine and Kuteva () identify a development from demonstratives to copula verbs. In Oceanic languages, however, the changes under discussion do not have copular verbs as targets but full (existential) verbs. Another case of relexiﬁcation from grammatical morphemes concerns verbs of manner deixis. In East Futunan these verbs are made up of two different sorts of grammatical morphemes: deictics (nei ‘near speaker’, nā ‘near addressee’, and lā ‘away from speaker and addressee’) combined with the reciprocal circumﬁx fe- . . . -’aki¹¹. This combination has not turned the deictics into verbs, but forms verbal compounds. The three compound verbs fela’aki, fene’eki, fena’aki have similar meanings, ‘be so’, and mainly occur as main predicates or adverbs. The choice of a speciﬁc form depends on the location of the entity referred to, and on that of the speaker: East Futunan () E fena’aki lana ’aga o takai a Aloﬁ kātoa. NPST be.so his facing POSS surround ABS Aloﬁ entire ‘It is in this way that he starts to go around the entire island of Aloﬁ.’ The distal manner-deixis verb fela’aki also has a different function, viz. as optative marker, when it occurs at the beginning of the sentence and is followed by the distal directional ake: () Fela’aki ake la loa ke ’ua i le aﬁaﬁ! be.so DIR EMPH SUCC in.order.to rain OBL SPC evening ‘If only it could rain this evening!’

..   / :   *-() A long-standing debate among Oceanic linguists concerns the status of the morpheme(s) reconstructed as *-akin(i) in POc. As mentioned earlier, *akin(i) is best analysed as two morphemes, applicative *akin plus *-i. The ﬁnal *-i occurs when the sufﬁx precedes a direct object (usually an object pronoun), i.e. when it marks a transitive verb. The reﬂex of *akin is typically *aki in languages which lose word-ﬁnal consonants. This debate is about the following questions: (i) Was it already both an applicative sufﬁx and a preposition, as suggested by Evans (), as it still is in some languages? (ii) Was it only a sufﬁx, which was later on reanalysed as a preposition in some languages? (iii) Or was it a free-form preposition that became grammaticalized to an applicative sufﬁx independently in various daughter languages, as suggested by Harrison (), drawing heavily on evidence from Micronesian languages?

¹¹ In East Futunan, the circumﬁx fe- . . . -’aki is generally used to express reciprocity along with sociative, iterative, dispersive, etc. meanings; it always derives verbs or adverbs, from verbs (tio ‘see’, fe-tio-’aki ‘see each other’), but also from nouns (’uluga ‘pillow’, fe-’uluga-’aki ‘share the same pillow’), or deictics.



Claire Moyse-Faurie

There is no space here to go further into this debate.¹² I will only give an example from East Uvean, a language in which (as is the case in Tongan) the instrument adjunct is introduced by the preposition ’aki (a reflex of POc *-akin(i)), a situation that leaves open the hypotheses in (i) and (ii):

()  a. ’E fai te filó ’aki te kili o te faú.
       NPST make SPC string INSTR SPC skin POSS SPC bourao
       ‘Strings are made with the bark of the bourao tree.’

This instrumental preposition ’aki, however, often occurs immediately postposed to the predicate, hence separated from its complement, te toki, in the following example:

    b. ’E tu’usi ’aki e Soane te fu’u niu te toki.
       NPST cut INSTR ERG Soane SPC CLS coconut.tree SPC axe
       ‘Soane is cutting the coconut tree with an axe.’

According to Durie (: –), a similar development from suffix to preposition, in accordance with hypothesis (ii), occurred in Mokilese (Micronesia): S V-ki (Object) Instrument > S V Object ki Instrument. This development is illustrated in the following examples:

()  a. Ngoah insingeh-ki kijinlikkoano nah pehno.
       SG write-with letter his pen
       ‘I wrote the letter with his pen.’

    b. Jerimweim koalikko pokihdi jerimweim siksikko ki suhkoahpas.
       boy big hit boy little with stick
       ‘The big boy hit the little boy with a stick.’
       (Durie : –)

. CONCLUSION

I have tried to present the clearest instances of grammaticalization processes in Oceanic languages from a typological perspective. Some of these, such as prepositions coming from nouns or verbs, are common cross-linguistically. Others are common in Oceanic but possibly rare in other language families, as for example the contribution of serial verb constructions to grammaticalization on the one hand and to relexification on the other. Some kinds of change do not seem to be attested elsewhere, such as the grammaticalization of the verb ‘follow’ to express causal adjuncts, or of the noun ‘thing’ becoming a stative aspect marker. The evolution from possessive suffixes into benefactive markers, which occurred in some Oceanic languages belonging to different subgroups, is also noteworthy. Finally, the development from morphemes to existential and manner verbs in Polynesian languages seems to be a clear case of degrammation, and thus to present a problem for the unidirectionality hypothesis. Whatever the precise developmental paths or the origins of the presumed sources were, the main conclusion is that the semantic domain of space is primary, more obviously so in the Oceanic languages than elsewhere.

¹² Moyse-Faurie (: ) also discussed the case of the Xârâcùù multifunctional preposition ngê, probably cognate with an applicative co-agent clitic.

15 Shaping typology through grammaticalization: North America MARIANNE MITHUN

. INTRODUCTION

North America north of Mexico is home to around  distinct languages, comprising over  distinct genetic groups. (Numbers are necessarily approximate because of the variation in amount and quality of documentation.) There is thus considerable genealogical diversity. There is also typological diversity, but certain typological traits are pervasive. A great many North American languages show elaborate morphology, presumably the product of extensive grammaticalization. A number are polysynthetic, but none are strictly isolating. The high degrees of synthesis might be due in part to the kinds of social factors discussed by Dahl (, ) and Trudgill (, , ). In small communities with intense communicative networks among fewer interlocutors, the frequencies of individual usage patterns might be enhanced, resulting in more routinized combinations, more complex lexicalized formations. The nature of the morphological complexity is not uniform across the continent, however. It varies in sometimes subtle ways, the result of different grammaticalization pathways. And some of the patterns show areal distributions. Contact-induced grammaticalization has been discussed by a number of authors, among them Matisoff (), Heine (a, , b), Haase and Nau (), Bisang (), Kuteva (a, , ), Stolz and Stolz (), Johanson (a, b, , ), Heine and Kuteva (, , ), Aikhenvald (a, , , , ), Aikhenvald and Dixon (), Matras and Sakel (), Ramat and Roma (), Giacalone Ramat (), Narrog and Heine (b), Gast and van der Auwera (), Heine and Nomachi (), Robbeets and Cuyckens (), and Markopoulos (). It can be challenging to distinguish contact-induced grammaticalization from purely internally motivated change.
Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Marianne Mithun . First published  by Oxford University Press

In fact it is generally agreed that external forces rarely act on their own: contact tends to stimulate developments that could occur in any case, the result of basic cognitive and communicative forces. In North America, the challenge for distinguishing contact-induced grammaticalization is increased by the absence both of philological records of an age to reveal change in progress and of documentation of prehistoric contacts among groups, including details concerning the nature of their social relationships, the intensity of the contact, and language shift. The main tools for detecting grammaticalization remain internal reconstruction and the comparative method; those for detecting contact effects are comparisons across related and neighbouring languages. Beyond the social circumstances surrounding contact-induced grammaticalization, certain other factors can shape its effects. One is the relative sequencing of certain functional and formal processes. A particular marker may undergo abstraction, extension, and perhaps phonological reduction in place, as is common with the development of auxiliaries, adpositions, and adverbs, before ultimately fusing with associated elements of the construction to become affixes (univerbation). Alternatively, univerbation may occur while the elements still have their concrete, specific, lexical meanings, as in the case of compounding, before undergoing abstraction, extension, and reduction in form. Over time, the results of the two kinds of pathways may converge, but at intermediate stages of development they can produce different morphological profiles. Another variable is the point on a grammaticalization pathway at which contact enters in. Even before there is any evidence of grammaticalization, contact can set the stage for subsequent parallel grammaticalization in neighbouring languages. Bilinguals accustomed to specifying certain distinctions with great frequency and precision in one of their languages may transfer such propensities into the other, a kind of semantic frequential copying in Johanson’s terms (b, , ).
Over time, the heightened frequency of expression can result in routinization and ultimately crystallization in the grammar in one or both languages, individually or propelled by continuing contact. At later points in the process but before grammaticalization has reached any endpoint, contact can have varying effects. Heine and Kuteva (: –) describe these in terms of use patterns, specific recurrent pieces of speech: clauses, phrases, or even single forms used in a particular context. The patterns express some kind of grammatical meaning but they are not yet obligatory. ‘A given language may acquire a minor use pattern or develop an existing minor use pattern into a major one when another language offers an appropriate model’ (Heine and Kuteva : ).

()  The rise of major use patterns (Heine and Kuteva : )
    a. An existing use pattern is used more frequently.
    b. It is used in new contexts.
    c. It may become associated with a new grammatical function.

A major use pattern can evolve further into an obligatory marker of a grammatical function. Here two kinds of morphological complexity are described. First are grammatical markers that developed via auxiliary constructions, with univerbation at a late stage. Second are markers that developed via compounding, with univerbation at an initial stage. Both show areal spread, with clear contact effects in neighbouring but genealogically unrelated languages, and raise intriguing questions about the moments at which the contact may have entered in.

Grammaticalization in North America



. GRAMMATICALIZATION VIA AUXILIATION

Verbal morphology has developed via auxiliary constructions in several areas of North America, with subsequent spread via contact. Evidence of their pathways of development can still be traced in some areas. One is the Southeast, described in Mithun (a). Another is northern California.

..  Languages of the Wintuan family, Wintu, Nomlaki, and Patwin, are indigenous to the Sacramento Valley area. A grammar of Wintu is in Pitkin (), and a dictionary in Pitkin (). A grammar of Patwin is in Lawyer (). Wintu, like the other languages in the family, shows basic SOV order. Verb stems are derived from a root followed by one of several sufﬁxes. The forms are now usually listed simply in stem sets. Forms of the verb ‘do/use/be’ (nearby) are in (). ()

Wintu stem set for the verb ‘do/use/be’ ʔiye INDICATIVE ʔis GERUND/NOMINAL ʔih IMPERATIVE ʔiBOUND FORM

The use of this verb ‘do’ for basic predication can be seen below.

()  Wintu verb ‘do’ (Pitkin : )
    ʔeh ʔíh!
    this do.IMPERATIVE
    ‘Do this!’

The same verb occurs in complex constructions in which the dependent clause is marked with the subordinating suffix -r-. Pitkin describes it as follows: ‘The inflectional subordinating suffix -r indicates that the verb so suffixed is syntactically dependent and semantically anterior in regard to causality or time. It is commonly translated “because of, of, . . . ” ’ (: ), and as an ‘inflectional subordinating suffix of causal or temporal anteriority, “because of, of, while . . . ” ’ (p. ).

()  Wintu complement construction (Pitkin : )
    Ni-s holowi-kuya-r ʔiye-kir-kele-sken.
    SG-OBJ scared-CON-DEP do-PAST-HRSY-SG
    ‘You were trying to scare me.’

This verb ‘do’ has developed into an auxiliary, following a content verb not inflected for person or tense/aspect.

()  Wintu auxiliary (Pitkin : )
    Ba: ʔi-se:-da.
    eat.IND do-PFV.IND-
    ‘I already ate.’




Wintu contains a wealth of auxiliary constructions. Pitkin lists  (: ). Some of the auxiliaries are linked slightly more closely to preceding lexical verbs. Pitkin describes them as enclitic, ‘with the potential of a very brief pause’ (: ). Word stress in Wintu is basically initial (or on the second syllable if the first is light and the second heavy); but in these constructions, the auxiliary still patterns like the beginning of a word for stress. A number of the auxiliaries are transparently descended from postural verbs meaning ‘sit’, ‘stand’, and ‘lie’. (Ablaut patterns are signalled with capital letters.)

()  Wintu postural verbs
    bOh   ‘sit’
    suk   ‘stand’
    bEy   ‘lie’

The first verb bOh is still used as a predicate to mean ‘sit, be in a sitting position, live, reside, dwell, remain, keep, and stay’.

()  Wintu postural verb ‘sit’ (Pitkin : –)
    Bo:-s-ile.
    sit-NMLZ-VIS.EVID
    ‘I saw them sitting there.’

It has also developed into a durative aspect marker on verbs, adding such meanings as ‘keep on doing, always, sometimes, remain, still’.

()  Wintu durative aspect suffix (Pitkin : )
    Miya=ma:n min-ele-bo:-sken.
    SG=possibly not.exist-STATIVE-DURATIVE-
    ‘You, too, shall be dead.’ = ‘You too shall die.’

The durative aspect suffix also appears on the more independent auxiliary verbs.

()  Wintu durative aspect on auxiliary (Pitkin : )
    ʔuni ni ʔiye-ba:-k.
    do.thus SUBJECT do.this.IND-DURATIVE-COMPLETIVE
    ‘That’s the way I do it.’

As a predicate, the verb suk means ‘stand, stay, live, own’.

()  Wintu postural verb ‘stand’ (Pitkin : )
    Ni suke-da.
    SG stand-SUBJECT
    ‘I’m (standing) here.’ ‘It’s me.’

()  Wintu postural verb ‘live’ (Pitkin : )
    Be:di ʔuna: honda suk-mah-mina.
    PROHIBITIVE do.thus long live-CAU-NEG
    ‘Don’t let him live so long.’

It has developed into a perfective suffix on auxiliary verbs.

()  Wintu perfective suffix on auxiliary (Pitkin : )
    ƛ’o:ma ʔi-suk ƛ’a:q-um.
    kill.IND do-PERFECTIVE rattlesnake-ACC
    ‘She recently killed a rattlesnake.’

()  Wintu perfective suffix on auxiliary (Pitkin : )
    Pa:lel hara: ʔi-suk.
    two go do-PERFECTIVE
    ‘The two of them went.’

The third postural verb, bEy, is used as a predicate to mean ‘be in a lying position, be in bed, spend the night’.

()  Wintu postural verb ‘lie’ (Pitkin : )
    Biya-da.
    lie.IND-SUBJECT
    ‘I am lying down.’ = ‘I am in bed.’

It is also used as a more abstract existential predicate.

()  Wintu verb ‘lie’ as existential predicate (Pitkin : )
    Pi po:m be:-le-bo:-m.
    that ground lie-INEVITABLE.FUT-IND
    ‘That ground will always be (lying) there.’

It has evolved further into an imperfective aspect suffix on verbs.

()  Wintu ‘lie’ as imperfective aspect suffix on verb (Pitkin : )
    Wira-bi-re:.
    come-IMPV-INFERENTIAL
    ‘They must be coming.’

It is also attached to auxiliaries.

()  Wintu ‘lie’ as imperfective suffix on auxiliary (Pitkin : )
    Ba: ʔi-bi:-da.
    eat.IND do-IMPV-SUBJECT
    ‘I’m eating.’

Verbs of directed motion have also developed into aspect and tense markers. The verb har is used as a predicate to mean ‘go’.

()  Wintu verb har ‘go’ (Pitkin : )
    Har!        ‘Go!’   SINGULAR
    Halel!      ‘Go!’   DUAL
    Hata:rum!   ‘Go!’   PLURAL

It also occurs as an auxiliary.

()  Wintu auxiliary (Pitkin : )
    ʔelew hara:-ki-re:.
    not.be go.IND-PAST-INFERENTIAL
    ‘He recently passed away.’




And it occurs as a verb suffix indicating progressive aspect.

()  Wintu progressive aspect suffix (Pitkin : )
    Ni ma:n qati: k’iye:-hara: bi-ntʰi-da.
    SUBJECT possibly as.for get.old-PROGRESSIVE IMPV-NON.VIS.SENS.EVID-SUBJECT
    ‘As for me, I go on growing older.’

The verb wEr ‘come, bring’ is the source of what Pitkin terms a future intentional marker.

()  Wintu verb wEr ‘come’ (Pitkin : )
    War!        ‘Come!’   SINGULAR
    Walel!      ‘Come!’   DUAL
    Wata:rum!   ‘Come!’   PLURAL

()  Wintu future tense suffix (Pitkin : )
    Hari:l-wi-da.
    take.along.animate-FUTURE.INTENTIONAL-SUBJECT
    ‘I am about to take them.’

The verb kEr means ‘die, be killed, exterminate, massacre, terminate, finish, finish off’.

()  Wintu verb ker ‘die’ (Pitkin : )
    ker-it
    die-PARTICULAR.ASPECT
    ‘all dead’

It is the source of a past tense marker.

()  Wintu past tense suffix (Pitkin : )
    Wirwira-ki-ntʰe:.
    come.RDP.IND-PAST-NON.VIS.SENSORY.EVIDENTIAL
    ‘They came (I heard them).’

A verb kOy ‘hurt, be sick, ache (for), need, want, crave’ has developed into a desiderative/conative suffix on verbs. It has the form kuya in indicatives (ku before the first person subject suffix -da) and koyu in imperatives.

()  Wintu verb root ‘want’ (Pitkin : )
    koy-it
    want-NMLZ.PARTICULAR
    ‘one who likes to do something or wants to do something’

This suffix appears to be descended from a complement construction with nominalized complement. It follows nominalized (gerund) forms of the verb it attaches to.

()  Wintu desiderative/conative suffix (Pitkin : )
    ʔewet ʔi-s-kuya-m?
    SG use-NMLZ-DESIDERATIVE-DUBITATIVE
    ‘Do you want to use this?’

()  Wintu desiderative/conative suffix (Pitkin : )
    Ba:-s-ku-da.
    eat-NMLZ-DESIDERATIVE-SUBJECT
    ‘I want to eat.’

()  Wintu desiderative/conative suffix (Pitkin : )
    ʔu-s-koyu!
    do-NMLZ-DESIDERATIVE
    ‘Try to do it!’



The verb min- is a negative existential ‘not exist, die’.

()  Wintu verb min-el ‘die’ (Pitkin : )
    Ko:m luli min-el-be:.
    all flower not.exist-STATIVE-IMPV.lying
    ‘All the flowers are dying.’

Indicative stems are formed with a suffix -a. The indicative form of the verb ‘not exist’ has developed into the negative suffix.

()  Wintu negative -mina (Pitkin : )
    ʔelew-da har-mina.
    NEG-SUBJECT go-NEG
    ‘I didn’t go.’

It is not uncommon cross-linguistically for speakers to reinforce negative constructions with an additional marker (Jespersen ; van Gelderen b; Mithun b). In Wintu, the negative construction has been renewed with ʔelew, descended from another negative verb ‘not exist’, seen earlier as a predicate in () ‘He recently died’. The same form serves as the negative response: ʔelew ‘no, never’. A different verb was recruited to reinforce the negation in prohibitives. This is be:di, descended from the imperative stem of the postural verb ‘lie’ with hortative suffix -di.

()  Wintu prohibitive (Pitkin : )
    Be:di hu:m-us war-ba:-mina!
    PROHIBITIVE fat-GENERIC FUT.INTENTIONAL-eat-NEG
    ‘Don’t eat any fat!’

Just two auxiliary constructions can be reconstructed to Proto-Wintuan. Cognate auxiliaries in Patwin are the copula ʔi(h) based on the verb ‘do’, and a stative bo:/be:/boh/beh based on the verb ‘be’ (Lawyer : –). The other Wintu auxiliaries and tense/aspect/modality suffixes do not have Patwin cognates. Patwin negatives are formed with negative verbal suffixes not cognate with the Wintu forms. Lawyer (: –) analyses them as the subjunctive -mu alone or in combination with another marker, with variation across the three Patwin dialects. The Hill and River Patwin -mele is from the subjunctive -mu with negative ʔele-; the River Patwin -mur is from the subjunctive -mu with noun ʔur ‘nothing’; and the Hill Patwin -muʔu is from the subjunctive -mu with the verb ʔu ‘say’. The negative verbs are reinforced with an initial auxiliary ʔele- based on the verb ‘not exist, lack’ seen in Wintu, or, in River Patwin, with the stative auxiliary boh/beh ‘be’. Wintu thus shows the results of multiple developments along a pathway from lexical verbs via auxiliary constructions toward verbal suffixes. Auxiliaries with varying degrees of attachment have developed from the verb meaning ‘be/do’ to durative aspect; from postural verbs ‘sit’, ‘stand’, and ‘lie’ to durative, perfective, and imperfective aspects; from directed motion verbs ‘go’ and ‘come’ to progressive aspect and future intentional; from the verb ‘die/finish’ to past tense; and from verbs meaning ‘not exist’ to negatives. The negative construction appears to have stimulated similar use patterns in neighbouring languages.

..  Directly to the east of the Wintuan languages are the Pomoan languages: Northern, Northeastern, Eastern, Southeastern, Central, Southern, and Southwestern/Kashaya Pomo. These languages also have basic SOV constituent order, but here lexical sources of verbal afﬁxes are less discernible. Of special interest are some negative constructions. Central Pomo contains a verb čʰó- meaning ‘not exist’. This is cognate throughout the family. Examples here are from the speech of Frances Jack, Winifred Leal, Eileen Oropeza, and Florence Paoli. ()

Central Pomo verb čʰó- ‘not exist’ Qʰá čʰó-w. water not.exist-PFV ‘There’s no water.’

With the inchoative sufﬁx -č’ it means ‘die’. The same verb is used with oblique/possessive nominals and pronominals to predicate negative possession, or lack. ()

Central Pomo ‘lack’ Yá:ʔkʰe ʔe méṭ’ hínt ̯il ší čʰó-w. PL.OBL COP such Indian name not.exist-PFV ‘We don’t have Indian names for those.’

As is common for languages with basic SOV constituent order, complement clauses generally precede matrix clauses. The verb ‘not exist’ is now used as a perfective matrix verb to negate its complement, which is still fully inﬂected. Example () shows the complement construction with matrix verb čʰów and a complement clause, itself a complement construction with matrix verb ‘want’: ‘they want [to tell their aunt].’ ()

Central Pomo complement construction Mú:lta̯ yat’̯ šé:ki ’el t̯ét̯e-:n dá-:ʔč’i-w čʰó-w. PL.POSS mother’s.sister the tell-IMPV want-IMPV.PL-PFV not-PFV ‘They don’t want to tell their aunt.’




The same form is also used as the word ‘no’.

()  Central Pomo ‘no’
    (‘Is that the guy that used to be so big?’)
    Čʰow.
    ‘No.’

Aspect is distinguished on all Central Pomo verbs. The negative čʰó-w is perfective, marked with the perfective aspect suffix -w. It has an imperfective counterpart t̯ʰi-n for imperfective negations, marked with the imperfective suffix -n: Mú:lta̯ yal čanú ʔel šó:čan t̯ʰín ‘I haven’t heard their language’. The Central Pomo perfective negative construction apparently came into the language after the diversification of the Pomoan language family. The lexical verb čʰó-w ‘not exist’ can be reconstructed to Proto-Pomoan, but the negative constructions vary across the family. In some of the languages (Northeastern Pomo, Northern Pomo, Southern Pomo, Kashaya), cognates of the Central Pomo imperfective negative t̯ʰín are also used for perfectives. Northeastern Pomo was probably the first language to split off from Proto-Pomoan (Walker : ), and was separated geographically from the rest of the family. Here the negative t̯ʰin is reinforced with a cognate of the verb ‘not exist’ in a few examples.

()  Northeastern Pomo (Neil Alexander Walker, p.c., from Halpern : .)
    Wehšem=t̯ʰin čʰó:t̯-on.
    laugh=NEG NEG-PFV
    ‘Don’t laugh.’

()  Northeastern Pomo (Neil Alexander Walker, p.c., from Halpern : .)
    ʔa: dihté=t̯ʰin čʰó:t̯-on.
    SG.A finished=NEG NEG-PFV
    ‘I’m not through.’

Eastern Pomo, not closely related to either Central or Northeastern Pomo, shows a related but apparently independent development. Here basic negation is expressed with kʰu-y, cognate with Central Pomo čʰó-w. This form is used in both statements and prohibitives.

()  Eastern Pomo negative matrix (McLendon : .)
    Bá: ʔínčʰa xa kʰúy-:le qa:wí:-heʔ-mì:p ba:Yé:.
    that nevertheless HRSY NEG-HRSY boy-DEM-M.SG.A verbally-cease
    ‘Nevertheless the boy didn’t give it up.’

()  Eastern Pomo negative matrix (McLendon : .)
    Ka:y-Na má kʰúy-baʔ wí t̯’á-:-ʔwa:-qà:.
    ground-on SG.A NEG-JUSSIVE SG.P touch-FOC-on-CAU
    ‘Don’t let me touch the ground.’

    Bá: há: kʰúy-baʔè máy-Mak’ líl-uhù:.
    then SG.A NEG-SBJV PL-with far-go
    ‘Then I won’t be able to go any further with you.’




Here it has advanced a step further than in either Central or Northeastern Pomo, also appearing as a verbal suffix.

()  Eastern Pomo negative suffix (McLendon : :)
    Ma:xár-heʔ-baya xól-pʰi-:lì:-khuy . . .
    cry-SPEC-LOC towards-go.PL-HRSY-NEG
    ‘They never came to where she was crying . . . ’

()  Eastern Pomo negative suffix (McLendon : .)
    Ba:Yé:-kʰùy-ayex=qan xa kʰi di:qá-:le, ši:wéy-heʔè.
    verbally-cease-NEG-FUT-DIFF.SEQ HRSY SG.F give-HRSY new-SPECIFIC
    ‘He would never stop, so she gave him that new one.’

..  Also to the east of the Wintuan languages, directly to the north of the Pomoan languages, is Yuki, not related to either family. The language has not been spoken for some time, but there is a dictionary in Sawyer and Schlichter () based on work with the last speakers, and an extensive grammar in Balodis () based on all known manuscript material, including more recently discovered ﬁeldnotes from the beginning of the th century, when A. L. Kroeber worked with Ralph Moore, a young, fully ﬂuent speaker. Yuki contains a verb tąl- ‘be not, lack, lose’. ()

Yuki (Balodis : ) Míʔ hąwáy há=mil=han tąl-t ̣-il-n(i)k. SG.A food hold=FIN=but not.be-INTR-MIDDLE-NEC ‘You must not let yourself seem to withhold food.’

This verb is also used to indicate negative possession, as in Wintu and Central Pomo. ()

Yuki (Sawyer and Schlichter : ) Ṭ’ąwnom’ ṭąl-ąp meʔet šulnom’ ho:t. enemy lack-but PL.POSS friend lots ‘We have no enemy but a lot of friends.’

The same form t̯al also negates dependent clauses. Balodis notes that ‘ṭal was originally transcribed by Kroeber as part of the verb in these examples, but it is unknown whether ṭąl is encliticized to the preceding verb or an independent verb’ (: ).

()  Yuki negative (Sawyer and Schlichter : , )
    a. Hi:l ʔinay’ ʔąp la:l-le=k.
       every day SG.A ride-MIDDLE=DEC
       ‘I ride every day.’
    b. Hi:l ʔinay’ ʔąp la:l ṭal-t-el=lek.
       every day SG.A ride not.be-INTR-MIDDLE=DEC
       ‘I don’t ride every day.’

()  Yuki (Balodis : )
    Są ʔin-tą́l-aʔ=han ʔinkóʔop-is=mil.
    SAME sleep-NEG-?=but snore-CONT=FIN
    ‘And even though not asleep he snored.’

The same form with declarative enclitic =k serves as the negative response to polar questions: tąl=k ‘No’, parallel to Wintu and Central Pomo.

()  Yuki ‘no’ (Balodis : )
    [‘You are telling us lies, apparently,’ one of them said.]
    Seʔé tą́lk ʔímeymil.
    si=ʔi tąl=k ʔimi=mil
    NEW=HEARSAY NEG=DEC say=FIN
    ‘But, “No”, he said.’

..  Wedged between the southern Wintuan and Pomoan territories is Wappo, a language generally thought to be remotely related to Yuki, though it has been questioned whether similarities are the result of contact. Wappo has a verb lah- ‘not exist, lack’ (not cognate with the Yuki counterpart). ()

Wappo lah- ‘not.exist’ (Thompson, Park, and Li : ) Heta hut̯’-i lah-khiʔ. here coyote-NOM not.exist-STATIVE ‘There aren’t any coyotes around here.’

The same verb is used for negative possession with a genitive possessor. ()

Wappo lah- ‘lack’ (Thompson et al. : ) I-meʔ luč-i lah-khiʔ. SG-GEN tobacco-NOM not.exist-STATIVE ‘I don’t have any cigarettes.’

The verb root + stative sufﬁx has developed into the basic negative sufﬁx for past, present, and future realis. ()

Wappo negative lahkhiʔ (Thompson et al. : ) Ah may naw-t̯a-lahkhiʔ. SG.NOM who see-PAST-NEG ‘I didn’t see anybody.’

()

Wappo negative lahkhiʔ (Thompson et al. : ) Ah chach-še-lahkhiʔ. SG.NOM cold-DUR-NEG ‘I’m not getting cold.’

OUP CORRECTED PROOF – FINAL, 21/9/2018, SPi



()

Wappo negative lahkhiʔ (Thompson et al. : ) Ce k’ew-i tuč’-kh-lahkhiʔ. that man-NOM big-STAT-NEG ‘That man isn’t big.’

()

Wappo negative lahkhiʔ (Thompson et al. : ) Cepʰi i peh-še-lahkhiʔ SG.NOM SG look.at-DUR-NEG ‘She doesn’t look at me i me-tʰu okal’te cel’. SG CO-DAT talk COND when I talk to her.’

Thompson et al. (: ) conﬁrm that the negative marker is now a sufﬁx rather than an independent verb, because the verb it is attached to undergoes certain internal vowel changes triggered by other sufﬁxes.

..  Northern California is a recognized linguistic area. The territories of the languages discussed here, Wintuan (Wintu, Nomlaki, Patwin), Pomoan (Northeastern, Northern, Eastern, Central, Southeastern, Southern, and Kashaya), Yuki, and Wappo can be seen in Fig. .. The map shows the area between the Oregon–California border in the north, and the entrance to San Francisco Bay in the south. The Paciﬁc Ocean is to the west. Communities were generally small, and exogamy and multilingualism were the norm. Contact has been intense over a long period of time. General protocol was to speak the language of the community one was in. Speakers would thus adjust those aspects of their language that they controlled more consciously, such as vocabulary, but less conscious aspects, such as frequencies of expression of certain distinctions and phraseology, were less controlled. As a result, certain grammatical distinctions and constructions recur in languages that have never been considered related, expressed by forms with no phonological resemblance. The negative constructions in these languages appear to have resulted from the copying of constructions based on verbs meaning ‘not exist’. Not enough is known about the details of contact to provide a secure basis for mapping the origin and spread of the negative constructions. The general robustness of auxiliary constructions in Wintuan languages suggests that they could have originated there as part of a major use pattern along the lines proposed by Heine and Kuteva: ‘A given language may acquire a minor use pattern or develop an existing minor use pattern into a major one when another language offers an appropriate model’ (Heine and Kuteva : ). Motivation for the spread is easy to imagine: cross-linguistically speakers constantly seek to renew the pragmatic force of negation.



[Map omitted]

Fig. .. Northern California (adapted from Robert Heizer, Handbook of North American Indians, vol. : California (Washington, DC: Smithsonian Institution, ), p. ix)



. GRAMMATICALIZATION VIA COMPOUNDING

A second pathway of grammaticalization is particularly pervasive in North America, though it has taken slightly different shapes in different areas. There is both noun–verb compounding (noun incorporation) and verb–verb compounding. Grammaticalization pathways involving compounding differ from the auxiliary pathways described in the previous sections in that the grammaticalizing markers are still quite lexical in meaning and form at the moment of univerbation, with relatively concrete, specific meanings and little phonological reduction.

.. – :   Many North American languages contain noun incorporation constructions— compounds consisting of a noun and a verb that together form a new verb stem. Incorporation is highly productive in the Northern Iroquoian languages. Examples here are drawn from Cayuga, an Iroquoian language of Ontario, Canada. Material is from speakers Jim Skye, Reginald Henry, and Nelson Crawford. Noun incorporation is pervasive in Cayuga speech. ()

Cayuga noun incorporation
A. Konaʔta:yęteiʔǫ́ kyę́:’ khekę́htsih.
   ko-naʔtar-a-yętei-ʔ=ǫ kyȩ:’ khe-kęhtsi
   F.SG.PAT-bread-LK-know-STATIVE=INFER sure SG>SG-be.old
   she apparently bread knows sure she is old lady to me
   ‘My wife’s a pretty good baker.’
B. Oihwi:yóʔ shę kakhwákʔaǫh.
   o-rihw-iyo-ʔ shę ka-khw-ak-ʔa-ǫ
   NEUTER-matter-be.good-STATIVE how NEUTER-food-eat-INCH-STATIVE
   it is matter good how it is food delicious
   ‘This sure is good food.’

Because noun incorporation is a kind of compounding, the constituents can in principle be drawn from the entire noun and verb lexicons. Only the noun stem is incorporated, without the gender prefix or the noun suffix that occur on independent nouns. The full Cayuga noun for ‘bread’, for example, is o-náʔtar-aʔ (NEUTER-bread-NOUN.SUFFIX) and that for ‘food, meal’ is ká-khw-aʔ (NEUTER-food-NOUN.SUFFIX), but only -naʔtar- and -khw- are incorporated. Like other non-head members of compounds, the incorporated nouns are decategorialized: they are no longer referential, but simply qualify the verb. The incorporated noun is not a syntactic argument, and the construction itself does not specify a syntactic or semantic relation between the components. Often incorporated nouns are semantic patients/themes/goals, because these can often qualify events and states in effective ways, but as in noun–noun compounds, any relevant semantic role is possible. Incorporated nouns can be semantic instruments and locations, for example. ()

Cayuga incorporated instrument Hohnekanyohs. ho-hnek-a-ryo-hs M.SG.PAT-liquid-LINKER-kill-HAB ‘He is liquid-killed’ = ‘He has a hangover.’

Cross-linguistically, terms for body parts are often incorporated. ()

Cayuga incorporated body parts a. Satahǫhtóhae! s-at-ahǫht-ohare SG.AGT-MIDDLE-ear-wash ‘Ear-wash yourself ’ = ‘Wash your ears!’ b. Akatathnǫʔá: ́e:k. waʔ-k-atat-nǫʔa-ʔek FACTUAL-SG.AGT-RFL-head-hit.PFV ‘I head-hit myself ’ = ‘I hit myself on the head.’ c. Akatathwęʔnahsáik. waʔ-k-atat-węʔnahs-a-rik FACTUAL-SG.AGT-RFL-tongue-LINKER-bite.PFV ‘I tongue-bit myself ’ = ‘I bit my tongue.’

Body-part incorporation allows speakers to indicate the location of an event or state while casting the person or animal affected as a core argument. In ‘I hit myself on the head’ and ‘I bit my tongue’, I am highlighting the effect of the event on me rather than on the head or the tongue. Usually languages with such constructions provide analytic alternatives for highlighting the body part, though these may or may not be idiomatic. Incorporation is pervasive in Cayuga, but productivity is a function of each stem, just as productivity in derivation in any language is a function of each afﬁx. Some nouns are always incorporated, some often, some occasionally, and some never. Some verbs always incorporate, some often, some occasionally, some never. The resulting combinations are generally lexicalized, though to varying degrees. Speakers are less likely to be surprised by innovations involving certain highly productive components. For the most part, however, they know whether particular constructions exist or not, which possible combinations are idiomatic like that in (), and what their precise meanings and uses are. ()

Cayuga Ahǫwatiyaʔtakę:nye:ʔ. a-hǫwati-yaʔt-a-kęnye:-ʔ FACTUAL-PL>M-body-LINKER-drag-PFV ‘They bodily-dragged him around.’ = ‘They raked him over the coals.’



Many incorporated nouns still have quite speciﬁc, concrete meanings. But the beginnings of desemanticization can be seen for some. As in other Northern Iroquoian languages, the Cayuga noun roots -yaʔt- ‘body’, -ʔnikǫhr- ‘mind’, and -rihw- ‘matter, cause, news, word’ now also function as verbal classiﬁers, distinguishing situations pertaining to animate beings, mental states, and verbal matters or abstractions. ()

Cayuga -yaʔt- ‘body’ a. A:yetshę́i:ʔ. aa-ye-tshęri-ʔ OPT-F.SG.AGT-ﬁnd-PFV ‘She might ﬁnd it.’ b. A:yǫtakyaʔtatshęi:ʔ. aa-yǫtat-yaʔt-a-tshęri-ʔ IRR-F.SG>F.SG-body-LINKER-ﬁnd-PFV ‘She might ﬁnd her.’ c. Akyaʔtataihę:. ak-yaʔt-atarihę-: SG.PAT-body-be.hot-STATIVE ‘I’m hot.’

()

Cayuga -ʔnikǫhr- ‘mind’ a. Ęhsheʔnikǫhaʔe:k. ę-hshe-ʔnikǫhr-a-ʔek FUT-SG>INDF-mind-LINKER-hit.PFV ‘You will offend someone.’ b. Akʔnikǫhakęhe:yǫh. ak-ʔnikǫhr-a-kęhey-ǫ SG.PAT-mind-LINKER-die-STATIVE ‘I have mind died.’ = ‘I’m mentally exhausted.’ c. Tewakʔnikǫ ́hny’akǫh. te-wak-ʔnikǫhr-yaʔk-ǫ. DUPLICATIVE-SG.PAT-mind-break-STATIVE ‘I am grieving.’

()

Cayuga -rihw- ‘matter, reason, news, word’ a. Ęhsrihwa̱hsʔa:ʔ. ę-hs-rihw-a-hsʔa:-ʔ FUT-SG.AGT-matter-LINKER-conclude-PFV ‘You will promise.’ b. Otrihwasehtǫh. o-at-rihw-ase-ht-ǫ NEUTER.PAT-MIDDLE-matter-be.hidden-CAU-STATIVE ‘It is secret.’

c. Ętrihwahwinyǫʔt. ę-hs-rihw-a-hwinyǫ-ʔt FUT-SG.AGT-matter-LINKER-be.in-CAU.PFV ‘You will report.’

For the most part, the same noun stems occur incorporated and in independent lexical nouns, but not always, because the noun–verb compounds are generally lexicalized. Some noun stems now occur only incorporated, though presumably once they also occurred on their own. Their independent counterparts have since been replaced with new nouns.

Noun incorporation is widespread in North America. In some languages it is highly productive, in others less so, and in others it is no longer productive at all, though relics of an earlier productive process may persist in the lexicon. Noun incorporation constructions can take on new life, however, in productive affix constructions.

One kind of descendant of noun incorporation can be seen throughout the Northwest. The descendants of incorporated nouns are termed ‘lexical affixes’, because of their relatively concrete, nearly lexical meanings. Examples cited here are from Kutenai (Kootenay, Ktunaxa), an isolate of British Columbia and Montana, described by Morgan (). Wordlists can be found at the First Voices website: http://www.firstvoices.com/. Kutenai contains several hundred lexical suffixes (Lawrence Morgan, p.c.). A typical example is given in ().

()

Kutenai lexical suffix ‘head’ (Morgan : ) Cuqnic qanałam’nic. cu-q=ni=c qa-na-łam’=ni=c stick.in-STATIVE=IND=and be.thus-go-head=IND ‘She cut a hole and stuck her head through.’

The lexical suffixes share properties with incorporated nouns in other languages. Many evoke body parts, allowing the possessor to be cast as a core argument. ()

Kutenai lexical sufﬁx (Morgan : ) Picquwatiłni niłyaps. pic-quwaʔt-ił=ni niłyap-s short-fur-TR=IND sheep-OBV ‘He sheared the sheep.’

Though their meanings are usually more concrete and specific than is typical of affixes cross-linguistically, they are often somewhat more abstract and general than their independent noun counterparts, a typical effect of grammaticalization processes cross-linguistically. The noun ʔa:k-iłwiy, for example, is translated simply ‘heart’, but the suffix -łwiy is translated variously ‘heart, mind, feelings, senses’. The suffix occurs in a large number of derived verbs. The First Voices website (http://www.firstvoices.com/en/Ktunaxa/words) lists verbs containing the suffix with the meanings below.



() Suffix appears in verbs meaning (First Voices)
‘be stubborn/headstrong’ (after root cmak’- ‘hard/solid/firm’), ‘be confident’, ‘want to go somewhere’, ‘think fast/stall/hold something back’, ‘be in a hurry’, ‘observe/witness’, ‘watch over/care for’, ‘watch someone secretly/spy on someone/stalk’, ‘make sounds of anger’, ‘be wise/smart/devout’, ‘stare/gawk/leer’, ‘be persistent/insistent/determined’, ‘sigh’, ‘be the one with the idea’

Like incorporated nouns, the lexical suffix constructions may alternate with analytic constructions for discourse purposes: while the lexical suffixes are part of idioms or evoke established or peripheral information, the independent nouns can direct special attention to their referents. The contrast can be seen in examples () and (). ()

Kutenai lexical sufﬁx (Morgan : ) Qapsin kiʔin ʔin kin haqałquxa? qapsin k=hiʔ=ʔin ʔin k=hin ha-qa-ł-qu-xa what SUBO-x=be that SUBO= have-be.thus-carry-water-by.mouth ‘What is that you are chewing?’

()

Kutenai lexical noun (Morgan : ) Niʔs hu qał ci:kati na wuʔu. niʔ-s hu qa-ʔł cin-akat=i na wuʔu the-OBV  be.thus-ADV catch-sight=IND this water ‘I looked into the water [and there were many char there].’

Also like verbs with incorporated nouns in other languages, verbs containing lexical sufﬁxes can co-occur with lexical nouns that add greater speciﬁcation. ()

Kutenai co-occurring lexical sufﬁx and derived noun (Morgan : ) Łaxał wukłuʔti niʔs ʔa:kikłuʔis. łaxa-ʔł wu-kłuʔ-t=i niʔ-s ʔa:k-i-kłuʔ-ʔis get.to-ADV ﬁnd-village-TR=IND the-OBV IMPV-be-EP-village-POSS getting to they village found their their village ‘They found the village of their (own) people.’

()

Kutenai co-occurring lexical sufﬁx and derived noun (Morgan : ) Qa:qaq’makikmiʔni yickiʔmisč. qa-ha-qa-q’ma-kik-miʔ=ni yičkiʔ-miʔ-s=č be.thus-have-STATIVE-sudden-HORIZONTAL-earth(en)=IND pot-earth(en)-SG=and ‘And she left her pail behind.’

Like noun incorporation constructions, lexical sufﬁx constructions are lexicalized and often idiomatic. The meanings may still be discerned from the meanings of the parts, but speakers know which combinations are used. ()

Kutenai idiomaticity (Morgan : ) Taxas hu nałuqławutmałni. taxa-s hu n=ha-ł-u-qławu-t-mał=ni then-OBV  PRED=have-carry-ﬁshline-TR-COMITATIVE=IND ‘Then I went out ﬁshing with her.’

Grammaticalization in North America ()



Kutenai idiomaticity (Morgan : ) Kin wałat’ciʔt? k=hin wa-łat’-c-iʔ-t SUBO= raise-arm-CAU-STATIVE-TR ‘Did you rob someone?’

In general, the forms of most Kutenai lexical sufﬁxes no longer match those of independent nouns with similar meanings, as can be seen by comparing the sufﬁx and nouns for ‘water’. ()

Kutenai lexical suffix ‘water’ (Morgan : ) Siłkułkciłni. s-i-ł-ku-ł-kc-ił=ni CONT-EP-carry-water-x-BEN.APPL-PASSIVE=IND ‘They brought him water.’
Independent nouns
wuʔu ‘water’
napituk ‘water’

The lexical sufﬁxes themselves cannot stand alone as nouns. Nouns can be derived by sufﬁxing them to the imperfective form of verb ‘be/do’. In the sentence seen earlier in () ‘She cut a hole and stuck her head through’, the lexical sufﬁx for ‘head’ was łam’. The lexical nominal for ‘head’ is ʔa:kłam’. It might appear at ﬁrst that the sufﬁx is simply an eroded form of the noun. The independent nominal is actually a morphologically complex verb containing the sufﬁx. ()

Kutenai derived noun (First Voices) ʔa:-k-łam’ IMPV-be-head ‘head’

(The velar -k- disappears before another velar or uvular. Epenthetic vowels are inserted to break up other clusters.)

Nevertheless, the lexical suffix constructions appear to be descended from noun incorporation via familiar processes of grammaticalization. The grammaticalization pathway via incorporation explains a number of characteristics of these sets of suffixes. The bonding of their sources occurred early in the process. At the point when they became attached to verb stems, the noun stems still had their concrete meanings and their full phonological forms. Speakers could draw from the entire inventory of lexical nouns to create compounds. Most of the lexical nouns which served as their sources have since been replaced, and the meanings of the suffixes have begun to become more general, abstract, and diffuse as the meanings of the verbs of which they form a part have evolved semantically. This sequence of developments accounts not only for their functions, which parallel those of noun incorporation, but also for their large inventories and relatively concrete meanings.

Such lexical suffixes are found not only in Kutenai, but also in all languages of neighbouring families to the north, west, and south: Salishan, Chimakuan, and Wakashan, languages which share a number of other typological characteristics. The suffixes often number in the hundreds and, as in Kutenai, they can show relatively concrete and specific meanings. The systems are apparently quite old, reconstructible to Proto-Salishan, Proto-Chimakuan, and Proto-Wakashan. (Blackfoot, to the east, also has a similar structure, reconstructible to Proto-Algonquian, which arose elsewhere.) It is unlikely that the suffixes themselves were transferred through contact; their forms are not generally similar. It is more likely that the earlier source constructions from which the suffixes developed were transferred through compounding. Compounding can still be seen in some Salishan languages, where the stems are linked by the same phonological material seen at the beginning of some lexical suffixes. Compounding patterns, and even particular semantic patterns of compounding, are easily spread by bilinguals, though lexical suffixes are not.

Kutenai also contains another construction, involving sets of suffixes that characterize the means, manner, or type of instruments involved in events and states. ()

Kutenai instrumental suffixes (Morgan : –)
-kin            ‘by hand’
-iki/-ki/-k     ‘by foot’
-xa/-ax/-x      ‘by mouth’
-xu             ‘by body’
-k’u            ‘by point, pointed object(s), finger(s)’
-ku             ‘by heat or fire’
-qa             ‘by blade’

The sufﬁxes are not referential; they simply characterize the nature of the event or state. They combine with verb roots to derive lexicalized stems. ()

Kutenai instrumental suffix (Morgan : )
Łaxan’uc ʔat qaskqupxnic.
łaxan’-xu=c ʔat qas-kqup-x=ni=c
catch.up.to-by.body=and IMPV cut-INT-by.mouth=IND=and
‘He’d catch up to it and take a huge bite and

ʔAt pisxni niʔs qasxa.
ʔat pis-x=ni niʔ-s qas-xa
IMPV drop.by.mouth=IND the-SUBJECT cut-by.mouth
he’d drop (from his mouth) whatever he bit off.’

Now most of the Kutenai instrumental sufﬁxes differ from the lexical sufﬁxes in form. ()

Kutenai lexical and instrumental sufﬁxes (Morgan : ) a. Mac’iyni. mac’-hiy=ni dirty-hand=IND ‘He dirtied his hands.’ b. Mac’kini. mac’-kin=i dirty-by.hand=IND ‘He dirtied it with his hands.’



A few do match, however: -xu ‘(by) body/back/torso’, and -ku ‘(by) heat/ﬁre’. Semantically, the instrumental sufﬁxes contribute more general and abstract meaning than either the corresponding nouns or lexical sufﬁxes, and their semantic contributions are slightly different. The instrumental sufﬁx -k’u ‘by point, pointed object(s), ﬁnger(s)’ does not have a perfect noun or lexical sufﬁx counterpart, but Morgan notes that it is likely descended from the lexical sufﬁx -k’un ‘nose’. A sample of the range of semantic contributions of the instrumental sufﬁx can be seen by comparing the examples in (). ()

-k’u (Morgan ) a. ʔat qa tak’uʔłni. ʔat qa ta-k’u-ʔ-ł=ni. IMPV NEG able-by.point-TR-BEN.APPL=IND ‘It couldn’t gore her.’

(p. )

b. N’akumułisni cuk’utiyałs. n=ʔaku-mu-ł-is=ni cu-k’u-t-iy-ał-s PRED=stab-INS.APPL-PASSIVE-OBV=IND pierce-by.point-TR-RFL-ASSC-COPART-OBV ‘It got stabbed with a spear.’ (p. ) c. N’itk’unapni n=ʔit-k’u-ʔ-n-ap=ni PRED-become-by.point-TR-NOUN.CONN-SG.OBJ=IND ‘I got stung by a bee.’

yuwat’. yuwat’ bee

(p. )

d. Hu cu-k’umuni łuʔus. hu cu-k’us-ʔ-mu=ni łuʔu-s  pierce-by.point-TR-INS.APPL=IND awl-OBV ‘I pierced it with an awl.’ (p. ) e. Siłk’uʔni ʔa:kuq’łam’ʔisis niłsiks. s-i-ł-k’u-ʔ=ni ʔa:k-u-q’łam’-ʔis-is niłsik-s CONT-EP-carry-by.point-TR=IND IMPV-be-EP-hair-POSS-OBV bull-OBV ‘He had the Bull’s scalps on a stick.’ (p. ) f. Hu wank’umuni ku ʔi:kuł picaks. hu wan-k’u-mu=ni k=hu ʔi:kuł picaks  move-by.point-INS.APPL=IND SUBO- drink spoon-OBV ‘I stirred my drink with a spoon.’ (p. ) g. ʔat ʔat

huc hayaxa:k’ukcisni. hu=c ha-yaxa-:-k’u-kc-is=ni IMPV =FUT have-fetch-water-by.point-EP-NOUN.CONN-OBJ=IND ‘I will be one to fetch water for you.’ (p. )

These are all clearly lexicalized formations—that is, speakers know and use them as chunks. Though both the lexical and instrumental sufﬁxes of Kutenai appear to be descended from incorporated nouns, they form distinct systems and can co-occur.

 ()

Marianne Mithun Kutenai lexical, instrumental sufﬁx cooccurrence (Morgan ) a. Snutapni kłułamaxaka. s-nut-ap=ni k=łu-łam’-a-xa-kaʔ CONT-chase-SG.OBJECT=IND SUBO=remove-head-EP-by.mouth-INDF.HUM.OBJ ‘She is after me, the one who chews heads off.’ (p. ) b. Łułamaʔni cupqaʔs. łu-łam’-xu-ʔ=ni cupqaʔ-s remove-head-by.body-TR=IND deer-OBV ‘He chopped the deer’s head off.’ (p. ) c. Qa:ł cinkqupqwat’axni. qa-ha-ʔł cin-kqup-qwat’-ax=ni be.thus-have-ADV grab-INT-ear-by.mouth=IND ‘Then he grabbed and sank his teeth into her earlobe.’

(p. )

Sets of derivational verb affixes with these functions, indicating means, manner, or type of instrument involved in situations, show strong areality in North America west of the Rockies (Mithun ). Means/manner/instrumental prefixes occur in languages of the Pomoan, Yuman-Cochimí, and Palaihnihan families, in the isolates Karuk and Yana of California, and in the neighbouring Washo of Nevada. These languages were once hypothesized to be genetically related at a deeper level in a superstock termed ‘Hokan’, in part on the basis of such structural similarities. Such prefixes do not, however, occur in other languages once grouped as ‘Hokan’: Shasta, Esselen, and Salinan. They also occur in languages of the Maidun family of California, and the Sahaptian family of neighbouring Oregon, Washington, and Idaho, as well as the Klamath and Takelma languages of Oregon. These families and languages were once hypothesized to be more deeply related in a different superstock termed ‘Penutian’. But they do not occur in other families and languages hypothesized to be Penutian: those of the Wintun, Utian, and Yokutsan families, and the Coos, Siuslaw, and Alsea languages. They occur in a third group never proposed as either Hokan or Penutian: Wappo and Yuki. They also occur in languages of the Uto-Aztecan family, but mainly those currently or recently spoken in the West. They also occur in a few other languages, notably those of the Siouan family, spoken across the Plains and into the Southeast.

The distribution of these prefixes in the West is significant. At the core of the area, they are generally quite old, reconstructible to the parent languages, often quite small in substance, sometimes a single consonant, and often quite abstract in meaning. Generally no lexical source can now be discerned. At the periphery of the area, however, such as in the Numic languages of the Uto-Aztecan family, origins in noun–verb compounds are still identifiable.

.. –  Affixes can develop from other kinds of compound constructions as well. A number of languages in North America contain verb–verb compounds. Tonkawa, a language isolate of central Texas, contains some noun–verb compounds, but a much richer inventory of verb–verb compounds. The independent verbs in (a) ‘eat’ and (b) ‘finish’ combine to form the verb stem in (c) ‘eat up’. Examples cited here are from Hoijer (, , , and ). ()

Tonkawa verb-verb compounding (Hoijer : ., ., .)
a. yaxa- ‘eat’
   Yaxasʔok . . .
   yaxa-s-ʔok
   eat-SUBJECT-COND
   ‘If I eat it, . . .’
b. to:xa- ‘finish, do all, destroy’
   Wetoxanoʔo.
   we-toxa-noʔo
   PL.OBJECT-finish-QUO
   ‘He has destroyed them.’
c. yax-to:xa- ‘eat (it) all’
   Yaxto:xaklaknoʔo xatyawlak.
   yax-to:xa-k=laknoʔo xatyaw-lak
   eat-finish-PTCP=NAR.EVID sweet.potato-INDF.ACC
   ‘He ate all the sweet potatoes.’

()

Some additional Tonkawa VV compounds (Hoijer : .)
ʔeʔeyaw-to:xa     work-finish            ‘finish working’
totop-to:xa       cut.to.bits-finish     ‘cut all up, make mincemeat of’
now-to:xa         lose.gambling-finish   ‘lose everything in gambling’
yapec-to:xa       sew-finish             ‘sew all up’
yanaw-to:xa-      defeat-finish          ‘win all, defeat all comers’
yak-to:xa         shoot-finish           ‘shoot all one’s (arrows)’
yaxʔak-to:xa-     shovel-finish          ‘shovel (it) all’
ya:lo:n-to:xa-    kill-finish            ‘kill all’
hamʔam-to:xa-     burn-finish            ‘burn up’
xan-to:xa-        drink-finish           ‘drink (it) all’

Both elements of these compounds are stems, or themes in Hoijer’s descriptions. A given stem may in principle occur in either ﬁrst or second position. ()

Tonkawa ‘be stuck’ (Hoijer : .)
yace-             ‘be stuck, fastened to something, caught, captured’
yac-yatxalka-     be.caught-fastened     ‘hang fastened to something’
neskʷit-yace-     tie-be.caught          ‘fasten by tying up’

Compounds are lexicalized in their own right, stored and used as chunks by speakers. Some compounds are composed of stems both of which still occur independently; others contain just one stem that still occurs independently; and still others are composed of stems neither of which now occurs on its own. Not surprisingly, stems that occur with great frequency in compounds can evolve into affixes. The verb ʔeke- ‘give’ still occurs as an inflected verb on its own. It apparently occurred so often as the second constituent of compounds that it has grammaticalized in place and now also serves as a general benefactive applicative, a common pathway of grammaticalization cross-linguistically. ()

Tonkawa verb ʔeke- ‘give’ (Hoijer : .) ʔekeklaknoʔo. ʔeke-k=lak-noʔo give-PTCP=NAR.EVID ‘She gave it to him.’

()

Tonkawa benefactive applicative suffix (Hoijer : .)
ʔeʔeyawa-       ‘to do, make, prepare; to work’
ʔeʔeyaw-ke-     ‘to work for (someone)’

ʔelnaw-         ‘to lie, prevaricate’
ʔelnaw-ke-      ‘to lie to’

taʔane-         ‘to pick up, take hold of, grasp’
taʔan-ke-       ‘to take away, steal (from)’

ʔatanawa-       ‘to like, love (someone)’
ʔatnaw-ke-      ‘to like, love (another’s wife, sweetheart)’

Certain general semantic patterns of compounding are very frequent in Tonkawa. One of these involves an initial stem indicating means or manner. ()

Tonkawa means/manner compounds (Hoijer : .; : .; : .)
yakosa- ‘whistle’
   ʔe:ta yakosanoklaknoʔo.
   ʔe:ta yakosa-no-k=laknoʔo
   then whistle-CONT-PTCP=NAR.EVID
   ‘And then he whistled.’
yamaka- ‘call, summon’
   Keymakoʔo.
   ke-yamaka-oʔo
   OBJECT-summon-DEC
   ‘They have called me.’
yakos-yamaka- ‘whistle to, summon by whistling’

Such constructions are pervasive in the language. A few more examples are given in (). ()

Additional Tonkawa means/manner compounds (Hoijer : . , . )
yaxʷe-      ‘to club’
nacaka-     ‘to die’

yaxʷ-nacaka-          club-die                  ‘beat to death with club’
yaxʷ-amce-            club-arm/leg.be.broken    ‘break (arm, leg) by clubbing’
yaxʷ-kayce-           club-be.chopped.off       ‘chop, slash with long knife’
yaxʷ-yapalʔa-         club-knock.down           ‘knock down with a club’
nakel-nacaka-         drown-die                 ‘kill by drowning’
yak-nacaka-           shoot-die                 ‘shoot to death’
yakaw-nacaka-         kick-die                  ‘kick to death’
yako(n)-nacaka-       punch-die                 ‘kill with blow of fist’
sʔe:t-nacaka-         cut-die                   ‘cut to death’
xʔopcocow-nacaka-     eject.fetid.fluid-die     ‘kill by ejection of fetid fluid’

Such verb–verb compounds are another source of the kinds of means/manner/instrumental prefixes seen in the northern California area. For the Numic languages of the Uto-Aztecan family in the area, both noun and verb sources have been reconstructed for modern means/manner/instrumental prefixes (Dayley ). Among the processes of grammaticalization involved in their development is decategorialization of the non-head, the initial stems which give rise to the prefixes. In most of the languages in the area, it is no longer possible to tell for certain whether the sources of the prefixes were nouns or verbs.

A second recurring verb–verb compound type involves a second stem indicating direction of motion. The independent roots in (a) ‘look’ and (b) ‘ascend’ are compounded in (c) ‘look up’: ()

Tonkawa direction compounds (Hoijer : .; : ., ., .)
a. ya:ce- ‘look, see’
   Ya:cew!
   yace-w
   look-IMPERATIVE
   ‘Look!’
b. haycona- ‘one goes up, ascends’
   Hayconata.
   haycona-ta
   ascend-and
   ‘And he was climbing up.’
c. ya:c-aycona- ‘look up’
   Ya:cayconalʔok . . .
   ya:c-aycona-l-ʔok
   look-ascend--when
   ‘When they looked up (to the top of the tree, he was sitting there).’

Further examples with the initial verb ‘look’ are in (). ()

Additional Tonkawa compound verbs with ‘look’ (Hoijer : .)
ya:c-akxona      look-one.goes.in        ‘look inside’
ya:c-atxilna-    look-one.goes.out       ‘look outside’
ya:c-aklana-     look-one.comes.down     ‘look down’
ya:c-ayxena-     look-one.goes.across    ‘look across’

Compounds whose second element indicates direction are pervasive in the lexicon.

 ()

Marianne Mithun Tonkawa directional compounds (Hoijer : .) taʔane‘grasp, take hold of, pick up, hold’ taʔan-ayconagrasp-one.ascend ‘pull (someone) up’ taʔan-ecnegrasp-lie.down, go.to.bed ‘put (one) down, to bed’ taʔan-(n)osʔo:yta- grasp-stretch ‘stretch out (e.g. a rope)’ taʔan-ta:hacoxo- grasp-arise ‘pick up, help get up’ taʔan-xa:klana grasp-one.descend ‘hold (one) down’

The stems that occur as the second member of these directional compounds are themselves morphologically complex. ()

Tonkawa directional stem (Hoijer : .)
hatxilna-
ha-txil-na
one.moves-out-TRANSLOCATIVE
‘one person goes out’

They generally have three parts, though the middle directional speciﬁcation is optional. ()

Tonkawa directional stems
ROOT                        (DIRECTION)                     POINT OF REFERENCE
ha- ‘one person moves’      -kxo- ‘in’                      -ta ‘hither’
da- ‘multiple moves’        -txil- ‘out’                    -na ‘thither’
                            -kla- ‘down’
                            -yxe- ‘across’
                            -yco- ‘up’
                            -pil- ‘from another place’

The first suffix that can appear in the second slot has a source in a verb root koxo that still occurs on its own with the meaning ‘(several persons) enter’. It can be seen with cislocative ‘hither’ and translocative ‘thither’ suffixes below.

() Tonkawa directional verb ‘several enter’ (Hoijer : .)
kox-ta- ‘several come in’
kox-na- ‘several go in’
ʔe:kla koxnaklaknoʔo.
ʔe:kla kox-na-k=laknoʔo
then enter-TRANSLOCATIVE-PTCP-NAR.EVID
‘Then they went inside.’

Its descendant, the suffix -kxo ‘in, into’, has lost the number distinction and is now attached to both singular and plural verbs ‘move’. ()

Tonkawa direction
ha:-          ‘one moves’
ta:-          ‘two move’
ha-kxo-ta-    ‘one person comes in’
ha-kxo-na     ‘one person goes in’
ta-kxo-ta-    ‘two come in’
ta-kxo-na-    ‘two go in’
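
The three-slot structure of these directional stems (root, optional direction, point of reference) can be sketched computationally. The following is a minimal illustration in Python, not an analysis taken from Hoijer: the slot inventories are copied from the forms cited in this section, the function names are hypothetical, and plain concatenation without any sound changes is an assumption that fits the cited forms but is not claimed for Tonkawa in general.

```python
# Illustrative sketch only. Morphemes are copied from the examples cited above;
# the dictionaries and function names are hypothetical labels for this sketch.
ROOTS = {"ha": "one.moves", "ta": "two.move"}
DIRECTIONS = {"kxo": "in", "txil": "out", "kla": "down", "yxe": "across", "yco": "up"}
REFERENCE = {"ta": "hither", "na": "thither"}


def build_stem(root, reference, direction=None):
    """Return (segmented form, gloss) for ROOT-(DIRECTION)-POINT OF REFERENCE."""
    morphemes = [(root, ROOTS[root])]
    if direction is not None:  # the middle slot is optional
        morphemes.append((direction, DIRECTIONS[direction]))
    morphemes.append((reference, REFERENCE[reference]))
    form = "-".join(m for m, _ in morphemes)
    gloss = "-".join(g for _, g in morphemes)
    return form, gloss


def surface(form):
    """Unsegmented spelling, assuming plain concatenation (no sound changes)."""
    return form.replace("-", "")


if __name__ == "__main__":
    # Reproduce the four 'in' forms listed in the example above.
    for root in ROOTS:
        for ref in REFERENCE:
            form, gloss = build_stem(root, ref, "kxo")
            print(f"{form:12} {gloss}")
```

Stems built without a direction are generated mechanically here; none is attested in the examples cited in this section, so they should be checked against the source grammar before being relied on.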



Stems derived by compounding may in turn enter as constituents into new compounds. ()

Tonkawa means and direction together (Hoijer : .)
yaxʷ-kakayʔac-atxilna-
yaxʷ-ka-kayʔac-ha-txil-na
club-REP-be.chopped-one.person.moves-out-TRANSLOCATIVE
‘cut a swathe through (e.g. a forest) with long knife’

Such compound constructions are the likely sources of the kinds of locative/directional affixes that are pervasive in certain linguistic areas. The region centred in Northern California is again such an area. Locative/directional suffixes occur in languages of the Pomoan and Palaihnihan families, and the Karuk, Shasta, and Yana languages, once all hypothesized to be in a group called ‘Hokan’ (but not Yuman or Washo, of the same group), and in the Maidun and Sahaptian families and in Klamath, hypothesized to be in a ‘Penutian’ group (but not others in the same group). The geographical distribution overlaps that of the means/manner/instrumental prefixes, but it is not isomorphic. At the core, as in Pomoan, the locative/directional suffixes are quite reduced in form, often just a consonant. At the periphery, as in the Chinookan languages along the Columbia River between Oregon and Washington, full compound constructions can still be seen in which the second element indicates location or direction. Again, it was apparently not the suffixes themselves that were transferred by bilinguals, but a propensity to specify such semantic distinctions, and a specific compounding construction, from which the suffixes developed.

. CONCLUSION

Despite the great genealogical diversity among languages indigenous to North America, certain typological features are pervasive, in particular a relatively high degree of morphological complexity, presumably the product of extensive grammaticalization. Many of the grammaticalization processes visible here occur elsewhere in the world, the result of basic common human cognitive mechanisms and communicative practices. But some of the morphological complexity shows areal patterns, suggesting contact-induced grammaticalization. Most of the structures are ‘mature phenomena’ in Dahl’s terms; they can be reconstructed to parent languages spoken thousands of years ago, so contact would have played a role from early on. A number of factors can shape grammaticalization and in turn contribute to areal patterns. Two were explored here. The first was the relative order of some of the processes: functional semantic/pragmatic abstraction and generalization on the one hand, and formal univerbation on the other. The second factor was the point in the development of constructions at which contact enters in. Constructions arising from the occurrence of semantic/pragmatic processes in situ before univerbation can be seen in Wintu in Northern California. Internal reconstruction reveals developments from lexical verbs to auxiliary constructions and ultimately to tense/aspect/mood, desiderative, and negative suffixes. The Wintu



Marianne Mithun

negative construction, descended from a multi-clause construction based on the lexical verb ‘not exist’, apparently stimulated use patterns in a number of neighbouring languages: Central, Northeastern, and Eastern Pomo; Yuki; and Wappo. Negative constructions are frequent in speech and accordingly highly susceptible to routinization and grammaticalization. At the same time, they convey a crucial distinction, often the most important information in the sentence, so speakers regularly seek to strengthen their pragmatic force by renewal. They are among the most frequent participants in Jespersen cycles. The precise trajectory of the negative construction from Wintu outward is difﬁcult to trace with certainty. It is possible that the copying occurred independently in each of the languages. In the terms described by Heine and Kuteva, the major pattern in Wintu (where negation is part of a rich set of auxiliary constructions) may have stimulated the creation of minor patterns in each of the neighbouring languages—patterns which were then carried further in some of them by grammaticalization processes common throughout the world. Examples of formal univerbation occurring before semantic/pragmatic processes of abstraction and generalization can be seen throughout the Northwest and in Texas, where an early step toward grammaticalization was compounding. Noun– verb compounding (noun incorporation) has resulted in lexical sufﬁxes in the Northwest in Kutenai and the Salishan, Chimakuan, and Wakashan languages. These languages contain often very large inventories of sufﬁxes with surprisingly speciﬁc, concrete meanings, similar to those of lexical nouns. The large inventories, sometimes numbering in the hundreds, are not surprising, given that the source constructions could draw from the full inventories of lexical nouns. 
Noun–verb compounding also led to the development of means/manner/instrumental afﬁxes in a number of languages unrelated genealogically but spoken in neighbouring areas. Noun–verb compounding is no longer productive in many of the languages, but traces of the earlier pattern persist in some. Verb–verb compounding has also served as the source of afﬁxes in some languages with relatively concrete meanings. The development of locative/directional sufﬁxes was illustrated here in its early stages in Tonkawa of Texas. More advanced stages can be seen in languages spoken over a wide area in Northern California. Contact presumably played a role early in the development of the afﬁxes from compound constructions. Propensities to specify certain semantic distinctions are easily spread by bilinguals, as are particular compounding patterns. In most of these cases, the spread apparently occurred before the language families had diversiﬁed, thousands of years ago. Much of what drives the development of grammatical constructions is basic human cognition, facilitated by particular social circumstances. But the systems can also be shaped by the particular pathways of development involved, and by the particular point at which contact comes into play.

ACKNOWLEDGEMENTS

I am especially grateful to Heiko Narrog and Bernd Heine for helpful comments on an earlier version of this chapter. I also appreciate discussion with Alex Walker on Northeastern Pomo and Uldis Balodis on Yuki.

16 Areal diffusion and the limits of grammaticalization
An Amazonian perspective
ALEXANDRA Y. AIKHENVALD

. AREAL DIFFUSION AND GRAMMATICALIZATION: A PREAMBLE

Intensive language contact brings about diffusion of grammatical categories. Grammaticalization of lexical items is one of the mechanisms at play as languages converge and new categories develop. Grammaticalized verb roots are a source for bound verbal morphology in the situation of intensive language contact of the multilingual Vaupés River Basin area in northwest Amazonia. Many instances of grammaticalization involve forms with specific meanings, and appear to be cross-linguistically uncommon. Unusual paths, found in a number of other Lowland Amazonian languages, expand the limits of what may be considered ‘plausible’ in grammaticalization. If a number of languages are spoken in a geographically continuous area, with groups interacting with each other and having to learn each other’s languages, linguistic traits will spread from language to language. Borrowings and similarities extend over all or most of the languages in a geographical region, forming a linguistic area.¹ Languages may remain different in many of their forms, but their structures will converge towards a similar prototype. The ways in which languages develop new categories under the influence of their neighbours often involve grammaticalization: ‘the development from lexical to grammatical forms and from grammatical to even more grammatical forms’ (Heine and Kuteva : ). Tariana, the only North Arawak language spoken within the multilingual Vaupés linguistic area, is in constant contact with East Tucanoan languages, especially Tucano. East Tucanoan

¹ The notion of ‘linguistic area’ (Sprachbund) has been discussed in a wide variety of sources; see a summary and discussion in Heine and Kuteva () and Aikhenvald (), and references there.

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Alexandra Y. Aikhenvald . First published  by Oxford University Press




languages have extensive systems of grammaticalized information source, or evidentials. Such systems are absent from other North Arawak languages. Tariana has developed a system of evidentials which matches those found in East Tucanoan languages, using a variety of mechanisms. In particular, the non-visual evidential -hma/mha in Tariana arose via grammaticalization of the verb -hima ‘hear, feel, seem, perceive’ (as we will see in section .).² As Matisoff (: ) put it, ‘although many grammatizational tendencies are doubtless universal, there are certainly areal differences of nuance’: while in Indo-European languages, prepositions tend to develop from adverbs, in the languages of Southeast Asia with verb-medial constituent order, prepositions derive from verbs. This is akin to what Heine and Kuteva (: ) and Heine (b) referred to as ‘grammaticalization areas’ (further examples are in Heine and Kuteva ). A grammaticalization pattern common across Southeast Asian languages (see Enfield ) involves the lexical verb meaning ‘to acquire’ as a marker of ‘achievement’ in immediately postverbal position, and as a modal of possibility in clause-final position. Noun classes (or genders) are a feature of a number of languages of northern Australia. About half of these have an agreement noun class that covers vegetable food. In an overwhelming majority, this class marker has the form ma, -m(a), -mi, -mi, man-, m-, or mu-. A few languages also have a generic noun classifier of the shape mayi ‘vegetable food’. In all likelihood, the markers of this noun class come from grammaticalization of the classifier mayi (or its phonological variants): see Dixon (: –, –; ). In each case, specific shared grammaticalization patterns are concomitant with other features which characterize the area in its entirety. 
In a situation of intensive contact-induced change, the forms to be grammaticalized, and semantic changes involved, may turn out to be somewhat unexpected, forcing us to reconsider potential constraints on grammaticalization (see Narrog ). The Vaupés River Basin linguistic area in northwest Amazonia is a case in point.

. THE VAUPÉS RIVER BASIN LINGUISTIC AREA: A SNAPSHOT

The Vaupés River Basin linguistic area in northwest Amazonia spans adjacent areas of Brazil and Colombia. This used to be a well-established linguistic area, characterized by obligatory multilingualism based on the principle of linguistic exogamy between its core members: ‘those who speak the same language as us are our brothers, and we do not marry our sisters.’ The exogamous marriage network ensures an obligatory societal multilingualism. Languages and peoples within the multilingual marriage network traditionally included the East Tucanoan languages Tucano, Wanano, Desano, Piratapuya, Tuyuca (and a few others), and—just on the Brazilian

² See further details in Aikhenvald (b).




side—the North Arawak language Tariana.³ At present Tariana is rapidly becoming obsolescent, yielding ground to the numerically larger Tucano. The long-term interaction within the area is based on institutionalized societal multilingualism between East Tucanoan languages and Tariana. This has resulted in rampant diffusion of grammatical and semantic patterns (though not so much of forms) and calquing of categories. A striking feature of the Vaupés linguistic area is a strong cultural inhibition against language mixing, viewed in terms of borrowing forms—or inserting bits of other languages—into one’s Tariana. This inhibition operates predominantly in terms of recognizable loan forms. Speakers who use nonnative forms are subject to ridicule, which may affect their status in the community. Comparison of Tariana with closely related languages is crucial for understanding how areal diffusion has impacted the language. Similarities and differences between Tariana and its closest relatives spoken outside the Vaupés River Basin linguistic area allow us to distinguish between genetically inherited and contact-induced features in the language. The Arawak language family is the largest in Amazonia in terms of its spread, and the number of languages.⁴ The highest number of recorded Arawak languages is centred in the region north of the River Amazon, between the Rio Negro and Orinoco (spanning adjacent regions of Brazil, Colombia, and Venezuela). North Arawak languages spoken within this region divide into a number of low-ranking genetic groupings. One of the established groupings is the Wapuí subgroup, whose members share an origin myth, namely that they emerged from the Wapuí waterfall on the Aiary River, a tributary of the Içana. Tariana is one of four languages within this subgroup. Another major language is the Baniwa of Içana-Kurripako dialect continuum.⁵ Tariana has been in contact with East Tucanoan languages for at least a couple of hundred years. 
As a result of contact, the language has undergone substantial restructuring.⁶ This is what we turn to now.

. CONTACT-INDUCED CHANGE IN TARIANA UNDER EAST TUCANOAN IMPACT

Tucanoan languages and Tariana are genetically unrelated, and typologically different. The analysis of extant Tariana dialects and of the old sources on Tariana allows us

³ For a comprehensive study of the Brazilian Vaupés, see Aikhenvald (, , , and references there). A cursory study of the Colombian part of the Vaupés River Basin linguistic area, by Sorensen (), is largely irrelevant here since all the languages spoken within the area in Colombia (addressed by Sorensen) belong to the East Tucanoan subgroup of the Tucanoan family. Areal diffusion patterns there (which have not yet been investigated) concern contact-induced change between closely related languages.
⁴ An alternative and outdated name for the family is Maipuran. ‘Arawakan’ is a term used for a larger and totally unsubstantiated grouping. See Aikhenvald (: appendix ; ).
⁵ This is spoken by ,–, people along the Vaupés, Içana, and its tributaries and in the adjacent regions of Colombia and Venezuela (outside the multilingual marriage network of the Vaupés). Other languages are Guarequena and Piapoco (see Aikhenvald  for the characteristics of this subgroup).
⁶ Further instances of contact-induced grammaticalization are discussed in Aikhenvald (, b). Examples from Tariana, Tucano, and Baniwa of Içana come from my own fieldwork.




to outline a number of structural changes shared by all the varieties of the language. These changes are absent from Baniwa of Içana/Kurripako and other languages from the Wapuí subgroup. Tariana has acquired numerous grammatical categories, under the impact of long-standing contact with Tucanoan languages. Like many Arawak languages, Tariana employs preﬁxes for subject cross-referencing, while Tucanoan languages are predominantly sufﬁxing. As a result of long-term contact, Tariana has developed numerous un-Arawak features, including cases for core arguments and a complex system of evidentials. Structural changes in Tariana under the impact of East Tucanoan languages vary in terms of their stability and integration into the language. Following Tsitsipis (: ), I divide grammatical changes into completed and ongoing (or continuous). Completed changes include those aspects of the grammatical system of a language which do not show any synchronic variation and which go beyond speakers’ awareness. Ongoing or continuous changes are those in progress; here the degree of inﬂuence of the other language depends on the speaker’s competence and possibly on sociolinguistic variables, such as age or degree of participation in community life. A major completed contact-induced change in Tariana concerns the expression of grammatical relations. In this respect Tariana combines areally diffused properties with features shared with related languages and inherited from the proto-language. Grammatical relations in Tariana and in Baniwa are expressed on the verb, following the split-S, or stative-active, principle. This is an inheritance from the proto-language shared with other languages of the Wapuí subgroup of Arawak. Tariana and Baniwa employ a set of personal preﬁxes (which go back to the protolanguage) to mark A (subject of transitive verb) and Sa (subject of active intransitive verb). 
The two languages are different in that only Baniwa has a series of enclitics which mark O (object of a transitive verb) and So (subject of stative intransitive verb). Tariana has no cross-referencing enclitics, which is one part of the major change that has taken place in marking grammatical relations. Just like the overwhelming majority of Arawak languages, Baniwa does not employ cases for expressing core grammatical functions. In contrast, Tariana has acquired a topical non-subject case marker =nuku, developed out of an erstwhile locative case. In its semantics and usage, this marker mirrors the topical non-subject case -re found in Tucano and numerous East Tucanoan languages. These two points of structural differences between Tariana and Baniwa are illustrated below. Example (), from Baniwa Hohôdene (data from my own ﬁeldwork), shows person marking with a preﬁx and a cross-referencing enclitic. This example contains an aspectual multiverb serial verb construction. The information comes from the speaker’s personal experience (a story about how he grabbed a jaguar), and so the only evidential in the language, the reported marker -pida, is not used: ()

Baniwa
ne:ni [nu-rinoa=ni nu-ʈaita](Serial Verb Construction)
then sg-kill=sg.nf.OBJECT sg-finish
‘Then I had killed him’ (lit. ‘finished killing him’)

Example () comes from traditional Tariana (as spoken by the few remaining representatives of the older generation; see Aikhenvald ). As in the Baniwa




example in (), this Tariana example shows person marking with preﬁxes. Unlike in Baniwa, it has no cross-referencing enclitic, but rather a separate pronoun, marked with object case. The pronominal participant is the topic of the stretch of discourse, and is thus marked with the topical non-subject case. Since the example comes from a story about the speaker’s personal experience a long time ago, the sentence contains the remote past visual evidential. ()

nese [nu-inu=na nu-sita]Serial Verb Consruction then sg-kill=REM.P.VIS sg-ﬁnish di-na=nuku sg.nf-OBJECT=TOP.NON.A/S ‘Then I had killed him’ (lit. ‘ﬁnished killing him’)

Tariana

Tariana has developed a system of ﬁve evidentials fused with tense, mirroring the East Tucanoan system. Numerous aspect and manner-of-action markers, causative and comparative constructions, and many other features—all absent from related languages—have evolved under the impact of East Tucanoan languages. We now turn to the role of grammaticalization in these developments.

. GRAMMATICALIZATION AS A SOURCE FOR NEWLY DEVELOPED BOUND MORPHOLOGY IN TARIANA

Similarly to East Tucanoan languages and especially Tucano, Tariana has a system of five evidentials: visual, non-visual, inferred, assumed, and reported. In contrast, closely related languages have only a reported evidential specification. Examples () and () illustrate structural parallelism between Tucano and Tariana.

() Tucano
yi'i-re upî-ka pũri-sa'
I-TOP.NON.A/S tooth-CLF:ROUND hurt-PRS.NONVIS.nonthird.Ps
‘My tooth hurts’ (lit. ‘to me, tooth hurts’)

() Tariana
nuha=nuku nu-e-da kai=mha
I=TOP.NON.A/S sg-tooth-CLF:ROUND hurt=PRS.NONVIS
‘My tooth hurts’ (lit. ‘to me, my tooth hurts’)

The non-visual evidential -mha in Tariana is a recent development. The marker comes from the grammaticalized verb root -hima ‘hear, feel’, which is still actively used in the language. This root goes back to Proto-Arawak *-kima/-kema ‘hear, feel’. It is attested in all the Wapuí languages in the meaning of ‘hear, feel, taste, smell’; but it did not develop into an evidential, except in Tariana. Along similar lines, the perception verb kim in Ashéninka Kampa (from the Kampa subgroup of Arawak languages in Peru) covers a full range of non-visual sensory experiences (hear, taste, touch, smell) but has not developed into an evidential marker (Mihas : –). This grammaticalization echoes similar developments in further East Tucanoan languages (other than Tucano). Verbs of perception and feeling ‘seem, be perceived,




feel’ in verb–verb compounding structures gave rise to exponents of non-visual evidentiality in other East Tucanoan languages. In Desano, kari- ‘seem’ gave rise to the non-visual evidential (Miller : ). A similar path was described for Tuyuca -ga- from a relic auxiliary meaning ‘seem’ or ‘be perceived’ (Malone : ). Both Desano and Tuyuca take part in the multilingual marriage network of the core Vaupés. The Desano are regarded as ‘younger brothers’ of the Tariana, and do not currently intermarry with them. However, this relationship may be indicative of older links between the Tariana and the Desano. The development of the non-visual evidential—an instance of completed change—appears to be indicative of an areal grammaticalization pattern. The reﬂex of the Proto-Arawak root *-kima/-kema ‘hear’ grammaticalized in different ways in some other languages of the Arawak family outside the Wapuí group. In Piro, an Arawak language from Peru (distantly related to the Wapuí languages), this root is attested in the meaning of ‘sound’ (-gima). It has grammaticalized into the reported evidential -gima- (Matteson : ). This grammaticalization path is shared with Nanti, from the Kampa subgroup of Arawak languages (also in Peru), where the reportative evidential ke grammaticalized from the verb kem ‘hear’ (Michael : ).⁷ Grammaticalization of a perception verb as an evidential marker is just one example of the development of new bound morphology which matches the East Tucanoan categories. The most striking evidence for the ongoing impact of Tucano on Tariana verb morphology is grammaticalization of markers of aspect and manner of action, based on verb roots. As Tucano is becoming the major language of communication among speakers of Tariana, verb compounding and grammaticalization of verbal roots is drastically expanding. What can be considered loan translations from Tucano are a source of continuous enrichment of Tariana.

. VERB COMPOUNDING AND GRAMMATICALIZATION IN TARIANA

Similarly to most North Arawak languages of the Upper Rio Negro region, Tariana has productive verb serialization. Serial verb constructions consist of several verbs which constitute one predicate, each forming an independent grammatical and phonological word. Examples () and () illustrate serial verb constructions with completive meaning in Baniwa and in Tariana. In contrast, East Tucanoan languages, Tucano among them, have extensive verb compounding, or one-word verb root serialization.⁸ The verbs in the second position within a compound grammaticalize into a variety of markers of aspect and manner of action. Tariana has a few dozen verb–verb compounds in which the second verb has grammaticalized as a marker of aspect or manner or time of action. The

⁷ Whether this shared grammaticalization path is part and parcel of erstwhile diffusion or a special link between Kampa and Piro, spoken in the Pre-Andine domain, is a matter for further investigation.
⁸ See Aikhenvald (: –) for a typology of serial verb constructions, and extensive references there.




grammaticalization of the verbal root kahwi ‘wake up, be early in the morning’ in Tariana is a case in point. In (), the verb kahwi is used as an independent verb: this is a conventionalized greeting early in the morning.

() Tariana
kawhi=tha phia
be.awake=PAST.VIS.INTER you
‘Are you awake?’ (a morning greeting)

In (), this verb occurs in a verb–verb compound consisting of two verb roots; the encliticized root kahwi has a more general meaning, ‘be early in the morning’:

() Tariana
pethe du-wheta=kawhi=naka
manioc.bread sg.f-put=BE.EARLY=PRS.VIS
‘She makes (i.e. puts on the oven) manioc bread early’

This structure is parallel to one in Tucano (Ramirez , vol. ii: ), in ():

() Tucano
ãhûga peô-wã’ka-mo
manioc.bread put-BE.EARLY-PRS.VIS.sg.f
‘She makes (i.e. puts on the oven) manioc bread early’

The Tariana enclitic =kahwi ‘do early in the morning’ satisfies the criteria for enclitics in the language (such as secondary stress), and is used by all generations of speakers. A number of other enclitics which describe the manner in which an action is performed have also arisen from verbs via their grammaticalization in verb–verb compounds. The verbs are still extant as morphologically independent lexical items. A selection is presented in Table .. The verbs which have developed into well-established manner of action enclitics are specific in their meanings. As a result of ongoing influence of Tucano on Tariana and grammatical calquing (or loan translation), verb compounding in Tariana is expanding. More and more verb roots are starting to be used spontaneously as second components of verb compounds where they follow the fully inflected verb. Their meanings mirror those of their Tucano translation equivalents. This process involves both prefixless (or So) verbs and prefixed (or A/Sa) verbs. In the latter case, the loss of prefixes follows the general tendency to lose non-Tucano categories from Tariana and to conform to the suffixing tendency of the Tucanoan type.

Table .. Manner of action enclitics with corresponding verbs in Tariana

Source verb and its meaning | Manner of action enclitic and its meaning
-d(h)ala ‘come unstuck, peel, be scratched’ | =d(h)ala ‘touch surface by unsticking or scratching’
-kolo ‘roll, fall down’ | =kolo ‘turn over, knock over’
-kusu ‘be shaken’ | =kusu ‘move to and fro’
-seku ‘slip, slide’ | =seku ‘slip all of a sudden’
-yaɾe ‘be unconscious’ | =yaɾe ‘lose consciousness’




T .. Aspect enclitics with corresponding verbs in Tariana and their Tucano equivalents Verb in Tariana

Aspectual enclitic in Tariana

Corresponding bound verb in Tucano

-sita ‘ﬁnish, manage to do’

=sita, -əsta, -sta ‘already accomplished’

-toha ‘(do) already’

-yena ‘pass over, go on; surpass’

=yena/yəna/ina ‘short duration, little by little, close to, almost’

-tĩha ‘(do) little by little, for a short time’

-mayã ‘deceive, miss, forget, get wrong, almost do’

=mayã ‘almost do (action nearly averted)’

-we’so ‘almost do’

Table . contains a list of aspect markers which have evolved recently as a result of grammatical calquing—or loan translation—from Tucano into Tariana. The most recent grammaticalization paths to which we now turn are characteristic of innovative speakers of the language. The traditional speakers (with whom I had the chance of working in the s) employed multi-word serial verb constructions instead (illustrated in () above). Let’s start with the grammaticalization of the verb -sita ‘ﬁnish’ as a perfective marker. This is a prime example of calquing an East Tucanoan-type structure as an ongoing process. The completive meaning of the verb -sita in Tariana was shown in (). Its cognate in Baniwa, -ʈaita, has a similar meaning—see (). The completive serial verb construction with the verb -sita is a feature of traditional Tariana—example () comes from the most traditional speaker of the language. In retelling this same story, a younger speaker spontaneously used (). The inﬂected verb -sita was used as an enclitic, without cross-referencing preﬁxes, following the pattern which has just been shown for other verbs. ()

nese [nu-inu=na=sita]single.word di-na=nuku then sg-kill=REM.PS.VIS=FINISH sg.nf.-OBJECT=TOP.NON.A/S ‘Then I had killed him’ (lit. ‘ﬁnished killing him’)

Tariana

The enclitic =sita is a loan translation from Tucano -toha ‘do already’. Nowadays, since the older and traditional speakers are all but gone, the verb -sita ‘finish, complete’ is falling out of use. The enclitic =sita has come to be a pervasive feature of the language. Phonological variants of the enclitic =sita (=əsta/sta) show segmental depletion of this newly evolved marker, which is typical in grammaticalization. Along similar lines, the enclitic -yena ‘do little by little’ is a loan translation of Tucano -tĩha ‘do little by little’ (Ramirez : vol. i, –). Structural parallelism between Tariana and Tucano is illustrated in () and ():

() Tariana
emite di-hña=yena=naka
child sgnf.-eat=LITTLE.BY.LITTLE=PRS.VIS
‘The child is eating little by little’

An Amazonian perspective () ba'ã-tĩha-mi eat-DO.LITTLE.BY.LITTLE-PRS.VIS.sg.m ‘(The child) is eating little by little’



Tucano

Example () was produced by an innovative speaker of Tariana, and rephrased by a traditional one as a serial verb construction (): () emite di-hña di-yena=naka child sg.nf-eat sg.nf-pass.over=PRS.VIS ‘The child is eating little by little’

Tariana

Just as with =sita, phonological depletion of =yena (which can be pronounced as =yəna or as =ina in rapid speech) shows an advanced degree of its grammaticalization. The Tariana verb -mayã means ‘deceive, miss, forget, get wrong, almost do’ and is widely used in serial verb constructions. In its meaning ‘almost do; action nearly averted’,⁹ it is pervasively used as an enclitic, =mayã, mirroring the Tucano verb -we'sa used in verb–verb combination. The two parallel structures are shown in () and () (Ramirez : vol. ii, , and my own work).

() Tariana
tuki di-ñami=mayã=pidana
little sg.nf-die=ALMOST.DO=REM.P.REP
‘He almost died (but managed to survive)’

() Tucano
wẽrî-we'so-pi
die-almost.do-REM.P.REP.sg.m
‘He is almost dying’

A traditional speaker of Tariana used a serial verb construction consisting of two words, with -mayã as an inflected verb:

() traditional Tariana
tuki di-ñami di-mayã=pidana
little sgnf.-die sg.nf.-almost.do/do.wrong=REM.PAST.REP
‘He almost died (but managed to survive)’

Many other verbs show up as enclitic markers of manner or time of action in stories and conversations by innovative speakers of Tariana, who tend to use more and more Tucano in their day-to-day life. One example is wyume ‘be last’ (an innovative variant of the traditional Tariana form whyume, where the loss of aspirated glide, absent from Tucano, is a phonological feature of the innovative language). The form wyume was spontaneously used in  as the second verb in a verb-compounding structure in (), by a proficient and respected innovative speaker:

() Tariana
desu nha na-hña=wyume=mhade
tomorrow they pl-eat=be.last=FUTURE
‘Tomorrow they (teachers) will have a meal for the last time’

⁹ See Kuteva (b) on this category.




The verb wyume ‘be last, do for the last time’ as part of a verb–verb structure mirrors the way tio ‘do for the last time’ is used in Tucano. I was told by all the Tariana speakers that the compound -ka=wyume (see=be.last) is equivalent to Tucano ĩ'yâ-tio (see-be.last) ‘see for the last time’. This shows that at least some speakers are aware of the ongoing process of loan translations in Tariana. An example of another innovation was the spontaneous use of the erstwhile prefixed (Sa) verb -wasa ‘let go suddenly, jump off’ within a verbal word to mean ‘do something suddenly; sudden movement’. An example from a story recorded from an innovative speaker is in (). A man was told by a spirit of the jungle to close his eyes. When he abruptly opened them, an unknown village appeared:

() Tariana
Diha di-ka=wasa di-pe
he sg.nf-look=do.suddenly sg.nf-throw.out
diha-yakale=ka=pidana hiku
this-CLF:VILLAGE=DEC=REM.P.REP appear
‘(As) he suddenly looked up, this village appeared’

I was told that =wasa is a translation equivalent of Tucano maha ‘do all of a sudden’ (see Ramirez : vol. i, , vol. ii: ). In earlier records of Tariana (collected by me in the s), the verb -wasa was used with a similar meaning within a multiword serial verb construction, e.g. di-ka di-wasa (sg.nf-see sg.nf-jump) ‘he suddenly looked’. As soon as the clitic was used by one speaker, others picked it up. This was also the case with =yena ‘do little by little’ in the s and with =wyume in the s. The set of enclitics transparently grammaticalized from verbs is expanding under Tucano inﬂuence, and verb compounding is becoming more and more productive. When I began ﬁeldwork on Tariana in the early s, the older generation of the Tariana used that language more than three-quarters of the time, and Tucano less than a quarter. By , the older generation had passed away, and the oldest ﬂuent speakers would use Tariana no more than  per cent of the time, the remainder being Tucano. In summary, a relatively rapid grammaticalization of verb roots in Tariana replicates the patterns of Tucano. As an additional factor at play, Tariana is endangered, and is being used less and less even by ﬂuent speakers. The difference between language change in ‘healthy’ and in endangered or obsolescent languages very often lies not in the sorts of change, which tend to be the same (Campbell and Muntzel ). It tends to lie in the speed with which the obsolescent language changes (see Schmidt : ; Dixon ; : –; Aikhenvald : –). Some bound morphemes in Tariana have grammaticalized within less than one generation. But we need to remember that contact-induced grammaticalization must have been under way for as long as Tariana has been in contact with East Tucanoan languages in the Vaupés River Basin linguistic area. In many instances (some of which are listed in Table .), grammaticalization is a completed process. 
With the increasing pressure from the dominant Tucano and a strong tendency in Tariana to converge with it, ongoing grammaticalization has been enhanced. Verbs with rather specific meanings, such as ‘jump’ and ‘deceive, miss’, develop into grammatical markers of manner of action and action nearly averted (see Kuteva b). This takes us to the question of semantic predictability of grammaticalization paths in intensive language contact, especially when enhanced by pressure from a dominant language.

. THE LIMITS OF GRAMMATICALIZATION

The ongoing grammaticalization of verbs in Tariana is indicative of rapid convergence with Tucano, as the dominant language. This grammaticalization also accounts for growing diversification between Tariana and other closely related Arawak languages. Tariana is developing more and more linguistic structures atypical for its family. Grammaticalization paths in languages correlate with the frequency in use of the constructions where they appear, echoing Du Bois (: ): ‘Grammars code best what speakers do most.’ Grammaticalized verbs in Tariana verb–verb sequences are frequently used in narratives and conversations. Most verbs which have undergone the grammaticalization process discussed here are used less frequently than the corresponding enclitics. Most verbs in the first column of Table . are rarely used by current speakers (and some, like -seku ‘slip, slide’, are not even known to them). Of the Tariana verbs in Table ., -sita ‘finish, manage to do’ is no longer used by remaining speakers; the verb -yena ‘pass on, go on, surpass’ is used in the meaning ‘surpass’ only; and the verb -mayã is used in the meanings of ‘deceive, miss, forget, get wrong’, but not ‘almost do’. The rapid development of bound verbal markers with aspectual and manner meanings (within the span of one or two generations), out of a set of verbs with relatively specific meanings, raises one further issue. Grammaticalization of verbs appears to be limitless in the sense that it does not appear to be constrained by verbal semantics. Semantically rather specific verbs, such as -wasa ‘jump’ or -mayã ‘deceive, miss’, evolve broader meanings. This poses the question of the overall predictability and plausibility of the paths grammaticalization might take, especially if languages in contact are pressed into rapid development of matching morphological categories.
A comparable example comes from Hup, from a small Makú (or Nadahup) family, spoken outside the core Vaupés area. The language has undergone substantial influence from Tucano. Speakers of Hup are not included in the exogamous marriage network, and do not show the pervasive multilingual patterns, which is the reason why Hup and its closest relative, Yuhup, are not considered core members of the Vaupés River linguistic area. Following the inhibition against borrowed forms shared with the core Vaupés (Epps , a), numerous markers have been developed via grammaticalization of lexical items. The non-visual evidential =hɔ̃ is the result of grammaticalization of a compounded verb hɔ̃h ‘produce sound, make noise’.¹⁰ This is another example of how the need to develop matching structures in language contact may trigger a grammaticalization pattern which is (so far) unique in the world.

¹⁰ Also see the discussion of the grammaticalization of Hup evidentials in Epps (a: –). The same form for non-visual evidential is attested in the closely related Yuhup (Silva and Silva : ). General pathways of grammaticalization of evidentials are addressed in Aikhenvald (a).

Grammaticalization of morphemes with specific semantics appears to be a feature of many Amazonian languages.¹¹ Grammaticalization of a noun meaning ‘sound’ into a reported evidential in Piro (see section .) is also one of a kind. In Yanomama, a Yanomami language from Brazil, a number of specific terms have given rise to noun classifiers, e.g. maa ‘stone’ to =ma ‘classifier for hard object’ (Ferreira : ). In Xamatauteri, also Yanomami, the noun ko ‘heart, kidney’ appears to have given rise to a bound morpheme -ko ‘round things’ used as a verbal classifier and as a noun classifier (Ramirez : ). The Bora-Miraña classifier -ta ‘metal objects’ used in a number of classifier contexts (e.g. verbal, numeral and demonstrative classifier) appears to have come from the noun ɾa:ta ‘tin (can)’, itself a loan from Spanish lata ‘tin (can)’ (Seifart : ). The classifier -natse ‘cylindrical objects’ in Paresi, an Arawak language from the state of Mato Grosso in Brazil, is used with verbs, numerals, demonstratives, and on nouns themselves; it appears to have grammaticalized from the noun natse ‘pestle’ (Brandão : –). The verbal classifier -kig ‘pointed objects’ in Palikur, an Arawak language from French Guiana and adjacent areas of Brazil, is the result of grammaticalization of the body part ‘nose’ (Aikhenvald and Green : ). This development is intuitively plausible: it involves a generic extension of ‘pointedness’ as a property of the nose as a body part to other objects of a similar, pointed shape. Each of these exemplifies a crosslinguistically unique grammaticalization path, so far with no explanation at hand. Unusual grammaticalization paths may give rise to unusual categories.
In Warekena of Xié, an Arawak language of the Upper Rio Negro region, -nawi ‘people’ has developed into a marker of excessive plural (used with nouns from any semantic group). Excessive plural refers to ‘very many, a whole group of’, e.g. kueʃi-nawi (game-EXCESSIVE.PLURAL) ‘a lot of game (animals)’, abida-pe-nawi (pig-pl-EXCESSIVE.PLURAL) ‘very many pigs’. This development has not so far been attested anywhere else.¹²

¹¹ The Amazon basin is an area of high linguistic diversity (rivalled only by the island of New Guinea). It comprises around  languages grouped into over  language families, in addition to a number of isolates. The  major families are Arawak, Tupí, Carib, Panoan, Tucanoan, and Macro-Jê. Smaller families include Arawá, Guahibo, Yanomami, Jivaroan, Bora-Witotoan, Kawapanan, Zaparo, Peba-Yagua, Makú (also called Nadahup), Harakmbet, Nambiquara, Tacana, Katuquina, and Chapacura (see Aikhenvald ).

¹² Aikhenvald (: ). Etymologically the same form, -nawi, gave rise to a collective plural in Bahuana, a now extinct Arawak language from the Middle Rio Negro area (Ramirez : ). This is echoed by the development of the common Arawá noun deni ‘person’ into a collective plural marker in Kulina and Dení, two Arawá languages from southern Amazonia (Adams Lichlan and Marlett ; Mateus Carvalho, p.c.). Heine and Kuteva (: –) quote two examples of nouns meaning ‘people’ developing into human plurals and plurals of definite nouns. Seemingly ‘unusual’ grammaticalization paths may arise from spurious segmental similarities. The scenario of grammaticalization of the form teg meaning ‘wood, stick’ to a nominal suffix and a generic nominalizer, and then to a purpose adverbial and finally a future marker was suggested by Epps (b). However, in Hup (Marcelo Carvalho, p.c.) and in the closely related Yuhup (Silva and Silva : ; Cácio Silva, p.c.), the marker of purpose -teg has a rising tone, and the nominal suffix -teg has a falling tone. The tonal difference between the two suffixes casts doubt upon a potential connection, let alone a grammaticalization path linking the two suffixes.

These typological oddities further raise the question of the limits, and predictability, of grammaticalization paths. In each instance, one cannot exclude the impact of language contact whose traces are no longer fully recoverable—in contrast to Tariana, where we are lucky to witness grammaticalization at work. As Matisoff (: ) put it, ‘one should perhaps never say never about hypothetical semantic development’—especially in the Amazonian region with its daunting diversity, just a fraction of which is known.

ACKNOWLEDGEMENTS

I am grateful to my teachers of Tariana and other indigenous languages of Amazonia. Special thanks go to R. M. W. Dixon for his extensive and incisive comments, and to Brigitta Flick for editorial help.

17 Diachronic stories of body-part nouns in some language families of South America

ROBERTO ZARIQUIEY

. INTRODUCTION

It is well known that languages of South America exhibit typologically attractive grammatical features that have significantly contributed to linguistic typology. Among those features, and focusing on Amazonian languages, we could mention large inventories of nominal classifiers, rich systems of possession, and complex alignments of grammatical relations with different types of split ergativity (see Dixon and Aikhenvald  for a list of typological features of Amazonian languages). Although the significance of such grammatical features for synchronic linguistic typology is obvious, the position assumed in this chapter is that we will understand the nature of these features more deeply by assuming a diachronic typological approach focused on their explanation by elucidating their source constructions (Aristar ; Bybee , a, among many others). This approach to linguistic typology has been referred to as ‘diachronic typology’ (see e.g. Greenberg ) or ‘source-oriented typology’ (see e.g. Cristofaro ). Source-oriented typology has successfully been applied to the study of specific grammatical domains, such as grammatical relations, voice and valence manipulation, and adpositions, in both specific languages/families (see e.g. Gildea  or  for Carib) and larger typological samples (Cristofaro , ). In line with these studies, this chapter takes a diachronic approach to the typology of South American languages and applies it to the study of a set of grammatical constructions associated with body-part expressions. As a consequence of this diachronic typological approach, it is assumed here that there are systematic patterns in language evolution, as is usually claimed in grammaticalization studies. In other words, the general assumption behind diachronic typology is that synchronic structures often originate from the same range of source constructions cross-linguistically (Cristofaro ).
Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Roberto Zariquiey . First published  by Oxford University Press

Body-part nouns in South America



By studying the synchronic grammatical constructions associated with body-part expressions and their sources, this chapter aims to contribute to a better understanding of grammaticalization in the languages of South America. At the same time, it attempts to list the most likely associations between source and target constructions in this functional domain, in order to contribute to further typological generalizations in association with body-part expressions. Grammaticalization of nouns denoting parts of the human body (or body-part nouns) is a widespread phenomenon among the world’s languages. Body-part nouns are often the source of topological expressions (Heine a: ff.; DeLancey : ), reflexive/reciprocal devices (Heine : ), intensifiers (König ; König and Siemund ), numerals (Heine a: ff.), and classifiers (Aikhenvald b: appendix ). The present chapter deals with some well-attested diachronic developments of body-part nouns in languages belonging to a sample of language families of South America (see Table .). Body-part nouns in these languages are often implicated in the development of locative adpositions, classifiers of different sorts, and body-part prefixes (as described for Panoan languages). This chapter argues that it is possible to postulate at least four different source constructions for these developments, including

T .. List of languages and language families quoted in this chapter Language

Language family

Palikur

Arawakan

Baniwa

Arawakan

Nanti

Arawakan

Baure

Arawakan

Kalapalo

Carib

Mapudungun

Isolate

Shiwilu

Kawapanan

Hup

Nadahup

Shipibo-Konibo

Panoan

Matses

Panoan

Kakataibo

Panoan

Cavineña

Takanan

Ese Ejja

Takanan

Munduruku

Tupian

Tupi-Guarani

Tupian

Xamatauteri

Yanomaman

Yanomami

Yanomaman



incorporated nouns, derivative compounds, generic genitives, and locative compounds. There is an intrinsic relation between these constructions and body-part nouns, and this fact, in addition to the special cognitive nature of body-part expressions, may explain why these nouns undergo the grammaticalization processes described here. Due to its widespread distribution, the recruitment of body-part nouns for the development of grammatical elements such as adpositions, classifiers, and prefixes might be considered an areal feature of South American languages. Thus, the position this chapter assumes is that the different diachronic paths to be presented here can be accounted for not only by looking at the semantic properties of body-part nouns, and particularly at their tendency to polysemy, but also by elucidating the constructions that may be taken as the sources for the target constructions to be studied. It will turn out that in the case of South American languages, body-part nouns are conspicuously found in various types of noun incorporation and compounds of different sources (including derivative compounds, generic genitives, and locative compounds). Their participation in these various constructions is what ultimately explains the target grammatical constructions that we often find in association with body-part nouns. It will also turn out, as we will see, that most of the implicated constructions are directly or exclusively related to body-part nouns in the languages in which they appear. This constitutes a strong motivation for why body-part nouns (and not nouns of other semantic types) are systematically linked to the grammatical constructions to be studied here. The remainder of this chapter is organized as follows.
In section ., I present the three target constructions to be illustrated and discussed: section .. illustrates some cases of changes from body-part nouns to locative adpositions; section .. presents cases when body-part nouns have developed into classiﬁers; and section .. discusses the Panoan body-part preﬁxes, which also exhibit a diachronic relation with body-part nouns. Finally, section .. offers a summary of the section. Section ., in turn, presents the source constructions that are postulated for the paths found and listed in section .. In section .. I discuss incorporated nouns into adjectives and verbs; in section .. I present derivative compounds; in section .. I illustrate generic genitives; and in section .. I brieﬂy illustrate locative compounds. The relationship between the target constructions in section . and the source constructions in section . will be discussed in section ., which summarizes the ﬁndings of the chapter and presents its conclusions. The survey presented is based on some of the available literature on the languages included in the sample. References to this literature are included throughout this chapter. Only Kakataibo (and partially Shipibo-Konibo) data come from the author’s own databases (for a detailed description of the author’s Kakataibo corpus as it was in , see Zariquiey : –).

. TARGET CONSTRUCTIONS

This section lists and illustrates three types of constructions that systematically exhibit a diachronic relation with body-part nouns in various South American languages. Section .. illustrates cases of the change from body-part nouns to locative adpositions; section .. presents cases of the change from body-part nouns to classifiers; and section .. illustrates the change from body-part noun to body-part prefix (body-part prefixes are a salient feature of Panoan languages). Finally, section .. offers a summary of the section.

..   As DeLancey (: ) suggests, adpositions derive historically from two sources: serial verb constructions and relator noun constructions. Locative adpositions derived from body-part nouns in relator noun constructions are attested in various language families of South America. This change is also widespread among Panoan languages. Table . illustrates some locative postpositions in Kakataibo (Panoan) and includes information about their nominal source, whereas Tables . and . do the same for Shipibo-Konibo (Panoan) and Matses (Panoan), respectively. Similar examples can be found in many languages of the world and they have received scholarly attention since the s (see Heine a: – for a quick survey). Despite their nominal origin, Panoan postpositions are synchronic relational elements with an inherent locative function. Additionally, they can be combined

T .. Kakataibo locative postpositions and their source nouns (Zariquiey : ) Postposition

Noun

Form

Meaning

Form

Meaning

kaxu

‘behind’

kaxu

‘back’

kwɨbí

‘near’

kwɨbí

‘mouth’

shimú

‘under’

shimú

‘reversal’

namɨ

‘inside (e.g. a pot)’

namɨ

‘interior’

T .. Shipibo-Konibo locative postpositions and their source nouns (Valenzuela : ) Postposition

Noun

Form

Meaning

Form

Meaning

napon

‘in the middle of ’

napo

‘interior’

kexá

‘at the edge of ’

kexá

‘mouth’

pekáo

‘after, behind’

peká

‘back’

bebon

‘in front of ’

be-

‘face, forehead’



T .. Matses locative postpositions and their source nouns (Fleck : ) Postposition

Noun

Form

Meaning

Form

Meaning

cachoc

‘behind’

cacho

‘back’

anauc

‘inside’

ana

‘oral cavity’

dëbiatemi

‘upstream from’

dëbiate

‘nose’

tayun

‘at the foot of a hill’

ta-

‘foot’

with unmarked complements that function semantically as the ground of the locative construction. These two properties make them different from synchronic body-part nouns, which would take genitive modiﬁers and require an explicit locative marker in order to express a locative meaning. This can be appreciated in the examples in (), in which we ﬁnd the postposition kaxu ‘behind’ and the body-part noun kaxu ‘back’ from Kakataibo. In (a), the postposition kaxu ‘behind’ appears with an unmarked complement and no locative marker is required. In (b), the body-part noun kaxu ‘back’ is illustrated, and we ﬁnd the locative marker -nu and a genitive modiﬁer with the marker -nɨn. ()

Kakataibo (Panoan)
a. Juan kaxu
   John behind
   ‘behind John’
b. Juan-nɨn kaxu-nu
   John-GEN back-LOC
   ‘on John’s back’

Note that Panoan postpositions often do not require a complement, as illustrated in (), where the same Kakataibo postposition appears by itself. This complement-less use is also different from what is found in nominal forms, which would require a locative/directional marker in order to appear in a similar function. In (a), we ﬁnd the postposition kaxu ‘behind’ without a complement, whereas in (b) the noun kaxu ‘back’ is used in a similar position and the locative/directional marker =nu is also found. ()

Kakataibo (Panoan)
a. kaxu ka kuan-a-x-a
   behind NAR. go-PFV--NON.PROX
   ‘He went behind.’
b. kaxu=nu ka kuan-a-x-a
   back=DIR NAR. go-PFV--NON.PROX
   ‘He went towards the backpart (of something).’

Similar complement-less uses of postpositions are found in other Panoan languages like Matses (Fleck : ; see example ()) and Shipibo-Konibo (Valenzuela : ; see example ()). ()

Matses (Panoan; Fleck : )
nuntan dabiun-o-mbi
inside paint-PST-A
‘I painted inside.’

()

Shipibo-Konibo (Panoan; adapted from Valenzuela : )
Ja-ská-a-kin yoi-kan-a jo-ax-ki
that-COMP-do-S/A>A(SE) tell-PL-P>S/A come-S/A>S(PE)-RPR
yaka-t-ai kexá
sitting.position-MID-INC at.the.edge
‘Once being called, he would come and sit at the edge.’

Equivalent postpositions, also related to body-part nouns, are found in Takanan languages (see e.g. Guillaume :  for Cavineña). Table . lists three Cavineña locative postpositions with their corresponding nouns. Hup, a language from the Nadahup language family, exhibits a large set of locative postpositions, of which only a few are reminiscent of body-part nouns, as shown in Table ..

Table .. Cavineña locative postpositions and their source nouns (Guillaume : –)

Postposition                          Noun
Form      Meaning                     Form        Meaning
tsekwe    ‘outside’                   e-tsekwe    ‘outside areas’
tsuku     ‘at the corner of’          e-tsuku     ‘hip’
jiruru    ‘at the edge of’            e-jiruru    ‘banks’

T .. Hup locative postpositions and their source nouns (Epps a: ) Postposition

Noun

Form

Meaning

Form

Meaning

g’od-an

‘inside’

nɔg’od, mig’od

‘mouth’, ‘face’

g’od

‘within’

nɔg’od, mig’od

‘mouth’, ‘face’

hupáh

‘at the back of ’

hupáh

‘upper back’

tǒk-tæn

‘mid-level’

tǒk

‘stomach’



..  According to Aikhenvald (b: ), body parts are among the semantic subgroups of nouns that frequently grammaticalize as classiﬁers. This seems to be true for South American languages as well. One example of this can be seen in Munduruku (Tupi). In this language, there are about  classiﬁers, from which at least  originated in bodypart nouns or nouns referring to parts of plants and objects (Gonçalves : –). Some examples of classiﬁers descended from body-part nouns in Munduruku are listed in () with their corresponding nominal sources. ()

Munduruku (Tupi, Gonçalves )
a. -i ‘classifier: foot’
b. -kõ ‘classifier: tongue’
c. -ba ‘classifier: arm, feather’
d. -bi ‘classifier: mouth/aperture’
e. -dopa ‘classifier: face, front’
f. -bɨ ‘classifier: finger-shaped’
g. -daw ‘classifier: bone-shaped’

< < < < < <
rɨkɨn     ‘nose’                    N
rɨpan     ‘snout’                   N
rɨshi     ‘snot’                    N
rɨbun     ‘tip’                     N
rɨsun     ‘at the end of’           Post
rɨntu     ‘with a chopped tip’      Adj
rɨchin    ‘to nose at’              V

Cases like the ones in () and Table . may also be seen as lexicalization. If, as suggested by Zariquiey and Fleck (), the original lexical forms for some body parts were in fact short forms equivalent to the synchronic prefixes, then we would have the diachronic scenario in (), where the meanings of the formatives in bold are uncertain.

() Kakataibo (Panoan)
rɨ ‘nose’  >  *ri-kɨn   >  rɨkɨn    ‘nose’                  N
              *ri-pan   >  rɨpan    ‘snout’                 N
              *ri-shi   >  rɨshi    ‘snot’                  N
              *ri-bun   >  rɨbun    ‘tip’                   N
              *ri-sun   >  rɨsun    ‘at the end of’         Post
              *ri-ntu   >  rɨntu    ‘with a chopped tip’    Adj
              *ri-chin  >  rɨchin   ‘to nose at’            V

What we ﬁnd in () is a hypothetical diachronic scenario in which the synchronic words on the right-hand side of the schematic representations are the result of the reanalysis as lexemes of originally morphologically complex forms. Following this line of argument, these examples are then the result of lexicalization, and illustrate how ﬁne the line between grammaticalization and lexicalization may be.

..  The data presented so far have shown that body-part nouns are the source of at least three different types of grammatical elements in South American languages: locative adpositions, classiﬁers (of different sorts), and preﬁxes (this is exclusive to Panoan). Other types of paths recurrent in other regions of the world (particularly the development of reﬂexive markers from body-part nouns) still need to be explored in order to determine if they are also common in South America. In the next section, I explore a list of constructions in which we systematically ﬁnd body-part nouns. The examples come both from the languages listed in this section and from other languages of the region. As explained in the introduction, the position of this chapter is that there is a direct link between these source constructions and the target constructions presented in this section.

. SOURCE CONSTRUCTIONS

In the following sections, I discuss and illustrate four different types of constructions that frequently involve the presence of body-part nouns in various languages of South America. These constructions include: incorporated nouns (section ..), derivational nominal compounds (section ..), generic genitives (section ..), and locative compounds (section ..).

..   In a large number of South American languages, noun incorporation is frequently restricted to body-part nouns or other types of inalienably possessed nouns (see Sapir :  for American languages in general, or Dixon and Aikhenvald :  for Amazonia). Among the languages mentioned so far in this chapter, incorporation of body-part nouns into the verb has been described for Munduruku (Tupian, Ramirez ), Palikur (Arawakan, Aikhenvald : ), and Cavineña (Takanan, Camp, and Liccardi ). Other languages in our sample for which we ﬁnd references to body-part noun incorporation into the verb are Nanti (Arawakan, Michael ), Ese Ejja (Takanan, Vuillermet ), the Tupi-Guarani branch of the Tupian family (Rose ), and various Carib languages, like Kalapalo (Basso ). In fact, in some languages in our sample like Munduruku (Tupian), incorporation of body-part nouns is obligatory (Ramirez ). Noun incorporation usually triggers the loss of both morphological marking and syntactic independence. This easily accounts for the change from lexical to grammatical that deﬁnes grammaticalization. Let us see the examples of body-part noun incorporation in Palikur (Arawakan, Aikhenvald : ) presented in (). () Palikur (Arawak, Aikhenvald : ) a. eg barrew-kug F clean-foot ‘She is clean-footed.’ b. ig barrew-tiw M clean-head ‘He is bald.’ (lit. ‘clean-head’) The nouns kug ‘foot’ and tiw ‘head’ in (a,b) behave as bound elements attached to the verb, and they look like bound morphemes. Crucially, the position in which they appear is exactly the same as position of verbal classiﬁers in the language, namely after the verbs stem. This is illustrated in (). () Palikur (Arawak, Aikhenvald : ) in barrew-buk this.n clean-CLF[linear] ‘This (the cord) is clean.’



The examples in () and () show the clear link between noun incorporation and verbal classiﬁcation. Due to their tendency to be incorporated into the verb, bodypart nouns mirror the syntactic behaviour of the set of verbal classiﬁers of Palikur, and then the change from body-part nouns to verbal classiﬁers becomes natural (but recall from () that Palikur also has numeral classiﬁers, which require a different explanation, offered below). As previously mentioned, incorporation may also imply a morphological change/ simpliﬁcation that also contributes to the reanalysis of incorporated body-part nouns as classiﬁers. In Kalapalo (Carib, Basso ), a normally required possessive sufﬁx is omitted when the body-part noun is incorporated into the verb. Thus, again we ﬁnd that the independent body-part noun becomes a bound morpheme, which might easily end up being analysed as a classiﬁer or a similar morphological element. See the example in (), where the noun ña ‘hand’ appears incorporated in the verb ﬁ ‘blow over’. ()

Kalapalo (Carib, Basso )
ti-ña-fi-ti-nda
RFL-hands-blow.spell.on-TR-CONT
‘(s)he blew all over her/his hands’

The link between classification and incorporation is well attested in the literature (see e.g. Aikhenvald b: ), and it is clear cross-linguistically that noun incorporation and verbal classification are both formally and functionally similar. In fact, although ‘noun incorporation and verbal classification can be shown to be distinct categories’ (Aikhenvald b: ), the same constructions have been described as either numeral classification or noun incorporation for some language families of South America (see Derbyshire and Payne  for the Yanomaman languages as an illustration). Body-part prefixation of the Panoan sort has also been analysed as noun incorporation (see e.g. Loos , quoted in section ..) and in fact a noun incorporation source for Panoan prefixation (at least on verbs) seems likely. Takanan languages, allegedly genetically related to Panoan languages, do not have body-part prefixation, but exhibit productive body-part noun incorporation into the verb. There is a crucial difference between Takanan noun incorporation and Panoan prefixation: as previously mentioned, the forms of Panoan prefixes are in most cases shorter than the forms of their corresponding body-part nouns, and we do not find such a difference between free and incorporated body-part nouns in Takanan. The similarities between the two constructions, however, are significant: the function of incorporated body-part nouns in Takanan is often similar to the function of body-part prefixes in Panoan and, crucially, both appear before the verb. Let us see the example in () from Ese Ejja (Takanan, Vuillermet ), in which the body-part noun jyoxi ‘foot’ has been incorporated into the verb jeyo ‘tie’.

() Ese Ejja (Takanan, Vuillermet : )
a’a kwichi jyoxi-jeyo-naje?
Q pig.ABS foot-tie-PST
‘Did (you) tie up the foot of the pig (lit. ‘did you foot-tie the pig?’)?’

A path from an equivalent body-part noun incorporation construction to synchronic body-part prefixation on verbs seems likely. However, Panoan prefixes are also found on adjectives and nouns, and therefore are not exclusive to verbs. Prefixation on nouns and adjectives cannot be explained by arguing for prototypical noun incorporation as the source construction. We require a different explanation for body-part prefixation on nouns and adjectives. Regarding the latter, it is important to mention that Ese Ejja (Takanan, Vuillermet ) also exhibits noun incorporation into the adjective, which may be considered a possible source construction for body-part prefixation on adjectives. Panoan prefixation on nouns will be associated with compounding in the following subsection.

() Ese Ejja (Takanan, Vuillermet : )
dó=pi’ay kya-wá’o-poji.
red.howler=also ADJ.PREX-tail-bald
‘The red howler also has a bald tail.’

..    Body-part nouns are often recruited for special types of compounds or derivations. These compounds/derivations usually accomplish speciﬁc functions: teasing/insulting people; giving nicknames; referring to new objects in terms of their similarity with speciﬁc body-parts; or denoting speciﬁc sub-parts of a (larger) body part (i.e. hand > ﬁnger). As was the case with incorporated nouns, in these compounds the implicated body-part expression may lose their syntactic independence and become a type of bound element with a derivative function, reminiscent of what we often ﬁnd in nonverbal classiﬁers. The examples from Ese Ejja in () show this, in the sense that they are reminiscent of adjectival classiﬁers. Ese Ejja does not have classiﬁers, but the development of adjectival classiﬁers based on examples like those in () seems likely. As noun incorporation into the adjective, N-ADJ compounds like the ones in () might be a source of body-part preﬁxation on adjectives. () Ese Ejja (Takanan, Vuillermet ) me-wo’o ‘red handed’ ino-tawa ‘green threaded’ sapa-siyo ‘shiny at the head’ Compounding with body-part nouns is also common in other South American languages, and it is not at all unusual to ﬁnd that body parts participate in idiosyncratic types of compounding which are not available for any other kind of nouns. For instance, Baker and Fasola () report that most N-N compounds in Mapudungun (isolate) include a body-part noun or express some sort of part–whole relation. See the Mapudungun examples in () and (). From examples like these, the reanalysis of the implicated body-part nouns as classiﬁers seems likely: the body-part nouns in these examples exhibit the derivative function that is often found with noun classiﬁers in different languages of the world (Aikhenvald b: ).




() Mapudungun (Bacigalupo : )
   kutran-longko ‘head illness’
   kutran-piwke ‘heart illness’
   kutran-foro ‘tooth/bone illness’

() Mapudungun (Zuñiga )
   chüll-kewün (brooch-tongue) ‘frenulum linguae’
   chüll-küwü (brooch-hand) ‘eponychium, loose cuticle’
   chüll-mollfüñ (brooch-blood) ‘vein’
   chüll-ponon (brooch-lung) ‘bronchus’

Compounds like the ones illustrated in this section may trigger the development of so-called ‘variable noun classifiers’ (see Aikhenvald b: ). In fact, compounding of the sort illustrated here can easily develop into nominal classifiers like the ones found in Yanomami and Shiwilu. A similar scenario may hold for numeral classifiers, which are also common in South America. Panoan prefixation of nouns may have also come from source constructions like those illustrated in (–). We may find both compounding and noun incorporation in association with body-part nouns in the same language. This is the case, for instance, for Ese Ejja, as illustrated in (a,b). In (a), we find the compound akwi-jée ‘tree-skin’, which includes the noun jée ‘skin’. In (b), the same noun jée ‘skin’ is incorporated into the verb. The existence of these two different constructions in the same language may help us to understand why a language may exhibit both verbal and non-verbal classifiers, as is the case with some of the languages in the sample used for this chapter (e.g. Palikur, Arawakan). This may also help us to understand the distribution of prefixes in Panoan languages (which can equally attach to verbs, nouns, and adjectives), in the sense that in languages like Ese Ejja we find source constructions for body-part prefixation on verbs (=noun incorporation into the verb), on adjectives (=noun incorporation into the adjective/N ADJ compounds), and on nouns (N N compounds).

() Ese Ejja (Takanan, Vuillermet )
   a. Eyáya akwi-jée jajá-xoja-aña. (e-jee ‘skin’)
      SG.ERG tree-skin cut-peel-PRS./
      ‘I cut off the tree-bark.’ (lit. ‘tree-skin’)
   b. Eyáya ákwi jeé-jaja-xoja-aña. (e-jee ‘skin’)
      SG.ERG tree skin-cut-peel-PRS./
      ‘I cut off the bark of the tree.’ (lit. ‘I skin-cut-peel the tree’)

..   The term ‘generic genitives’ has been used for constructions that are used to refer to generic body-part, like ‘pig ear’, ‘chicken meat’, and so on (Dryer ). Generic genitives are inherently linked to nouns referring to body parts and parts of objects. This kind of construction usually exhibits grammatical properties that make it different from other types of genitive constructions within a single language.




In some of the South American languages in my corpus, as in English, generic genitives are produced by means of combining two bare nouns (in a part–whole relation). The examples in (a,b) illustrate two generic genitives in Kakataibo. In both examples, we just find two bare nouns in a sequence. In this language, this type of genitive construction is only attested with body-part nouns and nouns referring to parts of plants and objects. Notice that this construction may be seen as a type of compound (see section ..).

() Kakataibo (Panoan)
   a. ’ó nami
      tapir meat
      ‘tapir meat’
   b. ’unkin taë
      peccary foot
      ‘peccary foot’

In Kakataibo, as in many other languages, this generic construction is in competition with another one for specific genitives. The specific genitive construction is morphologically more marked. The examples in (a,b) are the specific counterparts of the examples in (a,b). As we can see in (a,b), in specific genitives the noun referring to the whole carries a genitive marker.

() Kakataibo (Panoan)
   a. ’ó-kan nami
      tapir-GEN meat
      ‘the/a tapir’s meat’
   b. ’unkin-nin taë
      peccary-GEN foot
      ‘the/a peccary’s foot’

Smeets (: ) describes a similar situation in Mapudungun. See the examples in ().

() Mapudungun (Smeets : )
   a. tüfa-chi kawellu ñi pilun
      DEM-ATTR horse .POSS ear
      ‘the ear of this horse’
   b. tüfa-chi pilun kawellu
      DEM-ATTR ear horse
      ‘this horse’s ear’

Generic genitives may be useful for understanding the development of locative postpositions from body-part nouns in some language families of South America (see section ..). As discussed there, Panoan (and also Takanan) postpositions differ grammatically from their corresponding body-part nouns due to the internal grammar of the construction they head: nouns require a genitive modifier, like specific genitives (), whereas postpositions require a bare complement, in a construction that is reminiscent of generic genitives (). Compare (a,b) with (c,d), respectively. Also notice that the examples in (a,b) are ambiguous, and can be interpreted as either postpositional phrases or generic genitives. The diachronic relation between body-part nouns in generic genitives and postpositions is obvious.

() Kakataibo (Panoan)
   a. ’ó kaxu
      tapir behind
      ‘behind the tapir’ or ‘tapir back’
   b. ’unkin bimana
      peccary in.front.of
      ‘in front of the peccary’ or ‘peccary face’
   c. ’ó-kan kaxu
      tapir-GEN back
      ‘the/a tapir’s back’
   d. ’unkin-nin bimana
      peccary-GEN face
      ‘the/a peccary’s face’

..   There is a ﬁnal type of construction that deserves our attention. Baure (Arawakan) exhibits a construction that Admiraal and Danielsen () call ‘locative compound’. Basically, a Baure locative compound ‘consists of an N plus a locative noun root in the N position, and it is obligatorily followed by the general locative marker -ye “LOC” ’. One example of a Baure locative compound is offered in (). () Baure (Arawak, Admiraal, and Danielsen ) kwore’ noiy resia-imir-ye exist.SGM there church-face-LOC ‘He is there in front of the church.’ The interesting fact about Baure locative compounds is that they may also contribute to the understanding of the development of locative postpositions from body-part nouns. In fact, some Panoan locative postpositions carry a lexicalized locative marker (=mi ~ =k(i) ~ =o, according to the language), and their development might be related to the existence of a locative compound construction in these languages. See Table ., which includes three locative postpositions with the locative marker =mi in Kakataibo. T .. Some Kakataibo locative postpositions with a lexicalized locative marker mi and their corresponding body-part nouns (Zariquiey : ) Postposition

Noun

Form

Meaning

Form

Meaning

rɨbumi

‘beyond’

rɨbu

‘tip’

manámi

‘above’

manan

‘upside part’

tsipúmi

‘below’

tsipun

‘end, buttocks’




The presence of the locative marker =mi absorbed in the lexical stems in the examples in Table . strongly points toward the possibility that Panoan languages had constructions similar to the ones that Admiraal and Danielsen have called “locative compounds”. If so, this construction might have been involved in the development of Panoan postpositions. The fact that different Panoan languages exhibit traces of different locative endings in some of their locative postpositions suggests that each language recruited the locative marker that was available in its morphological inventory for the construction described in this section. One point that deserves our attention regarding this construction is that, regardless of the formal differences among languages, they always recruited an indirect locative marker for the locative compound construction.

. CONCLUSIONS
The present chapter has discussed three different target constructions that are often the result of a grammaticalization process starting with body-part nouns (section .), as well as four constructions that are often associated with body parts and are good candidates to be the source constructions of these grammaticalization processes (section .). The discussion has focused on a sample of  South American languages from  different genetic units. The aim was to explore one of the most studied semantic domains in grammaticalization research (changes related to body-part nouns) and to see which systematic patterns of variation can be found and postulated for South American languages. With this aim in mind, I have attempted to postulate more specific relationships between the target and source constructions discussed throughout this chapter. Table . summarizes the findings of this chapter by listing the associations between source and target constructions that have been revealed by the data. These associations are described in terms of paths (from source construction X to target construction Y).

T .. Diachronic paths of body-part nouns in the languages of South America To locative adpositions

To classiﬁers

To body-part preﬁxes

From body-part nouns incorporated into the verb and adjectives

No

Yes

Yes

From body-part nouns in derivational nominal compounds

No

Yes

Yes

From body-part nouns in generic genitives

Yes

No

No

From body-part nouns in locative compounds

Yes

No

No




What seems to be revealed by the data is that classiﬁers and body-part preﬁxes exhibit the same likely source constructions: incorporated body-part nouns into the verb and into the adjective and body-part nouns in derivational nominal compounds. This similarity indirectly suggests that body-part nouns might develop into classiﬁers, and certainly they are reminiscent of classiﬁers in terms of their function. On the other hand, locative adpositions are likely to have developed from body parts in generic genitives and locative compounds. It is important to mention that Table . offers very basic representations of the diachronic paths described in this chapter, which are represented as based on two stages (a source construction and a target construction). This does not mean that the process from the source construction to the target construction was abrupt. On the contrary, as highlighted by Heine (), this process is expected to be continuous and to involve a multitude of stages. More work is needed to understand better the diachronic paths in Table ., and to identify the bridging and switch contexts that lead to the new functions that body-part nouns received in the languages of the sample. The identiﬁcation of such intermediate stages is particularly difﬁcult in the case of languages with little or no historical documentation, but future comparative research on some of the families mentioned in this chapter may shed some light on this matter. Following the postulates of so-called diachronic typology or source-oriented typology, the perspective assumed in this chapter is that it is the participation of body-part nouns in these constructions that may explain the widespread presence among South American languages of locative adpositions, classiﬁers, and body-part preﬁxes that are diachronically related to body-part nouns. 
All the source constructions listed here are widely attested and productive among South American languages, and all are strongly linked to body-part nouns. In fact, the data show that it is not uncommon for these constructions to be restricted to body-part nouns. This may explain why the diachronic paths described in this chapter are conspicuously found in association with body-part nouns. Therefore, what we find is a well-attested cross-linguistic tendency for body-part terms to exhibit idiosyncratic grammatical properties. It is the participation of body parts in noun incorporation or in the different types of compounds illustrated here that predicts their development into classifiers, adpositions, or prefixes. Thus, diachronic typology proves useful in understanding the paths described here. If a language has noun incorporation, it is likely that this process will be available to body-part nouns, and that these nouns will develop at some point into verbal classifiers (or something similar, like Panoan prefixes). If a language has any type of compound that combines a body-part noun with a modifying noun, then that body-part noun might develop into an adposition.

Interestingly, in the case of Panoan postpositions, sometimes it is not possible to reconstruct proto-forms for some of the categories. If we take the category ‘behind’ as an illustration, we find crucial differences in the forms attested in different Panoan languages. For instance, while we find the form kaxu ‘behind’ in Kakataibo, the same locative relation is expressed by the form pekáo ‘behind’ in Shipibo-Konibo. This lexical difference can be explained simply by the fact that both languages have taken the body-part noun ‘back’ to express the topological relation ‘behind’, and this word happened to be lexically different in the two languages. It is likely that what both languages shared was the existence of a generic genitive construction that acted as the source of the postpositional construction, which might have developed independently in each language.

The existence of some constructions that are mostly or exclusively available for body-part nouns, which give them a special syntactic nature and which favour grammaticalization paths like the ones illustrated here, seems to be a widespread feature of the languages in the sample and a good candidate to be an areal feature of South American languages (similar situations have been reported in Africa and Australia; see Aikhenvald b: ; Heine ). It is interesting to mention here that the data used in this chapter do not show any evidence of other grammaticalization paths associated with body-part nouns that are widespread in other regions, like the one that derives reflexives and similar markers from body-part nouns.

The special morphosyntactic properties of body-part expressions as described in this chapter are in accordance with their special cognitive status: it is well known that body parts are a model for conceptualization in very different semantic domains: spatial relations, numbers, and emotions, among others (Brenzinger and Kraska-Szlenk ; Heine a, ; Lakoff and Johnson ). Although I have exclusively paid attention to grammatical structures, I am aware that it is very difficult to definitively tease apart the structural and cognitive levels of analysis. These conceptualization properties may indeed help us to understand better why body-part nouns exhibit similar diachronic paths not only in South America but also in various other regions of the world.

18 Addressing questions of grammaticalization in creoles
It’s all about the methodology
HIRAM L. SMITH

. INTRODUCTION
The functional typological approach seeks to make generalizations over the patterns that systematically occur across languages and to provide explanations for their occurrence (Comrie : ch. ; Croft : –). Since languages differ formally to a great extent, however, it is not possible to make generalizations based solely on formal properties (Croft : ). Creoles, which, perhaps more than other languages, have been typologized primarily on the basis of their structural properties, may have much to contribute to this conversation. One impediment to their ability to inform is that fine-grained quantitative studies of many creole grammars are still lacking, making analyses of patterns difficult.

We ask here: to what degree does the expression of tense and aspect in Palenquero creole (an Afro-Hispanic variety spoken in northern Colombia) conform to well-established cross-linguistic tendencies in the development of aspectual expressions? And, more importantly, how can we test empirically whether language-internal change (i.e. grammaticalization) has occurred in this language? I submit that the tense-aspect system of Palenquero is an apposite testing ground for theories of grammaticalization, since recent studies have provided detailed descriptions of the distributions of preverbal morphemes (Smith , ). Also, much is known about the development of tense-aspect systems from particular lexical sources in many languages (e.g. Bybee, Perkins, and Pagliuca ; Heine and Kuteva ). Nevertheless, while tense-aspect morphemes show strong cross-linguistic developmental tendencies, examining language-specific data allows us to refine our understanding of grammaticalization processes. Since grammaticalization paths are quite broad, they may show some language-specific differences (Poplack ), be influenced by social factors (Hopper and Traugott ), or develop local anomalies along the same grammaticalization path (Tagliamonte : ). To this end, I apply rigorous tests to measure the degree of conformity of this creole to attested typical, if not universal, cross-linguistic trends.

The approach taken here is a combination of two things. First, accountable, quantitative methods were employed to determine the correlation between the distribution of variant forms and their putative functions. I argue, and will demonstrate, that without first accurately defining the functions of all competing tense-aspect forms, it is nearly impossible to make strong synchronic claims of grammaticalization. Second, grammaticalization theory is used to provide explanations for patterns observed in the data. Taking the broad domains of present and past temporal reference as variable contexts, I first used the variationist method (Labov ) to uncover what distributional patterns exist for the expression of habitual. I then interpreted these patterns in light of cross-linguistic evolutionary paths (Bybee et al. ) and universals of typological markedness (Croft ). The bifurcated nature of this approach not only allows us to test whether grammaticalization is taking place in Palenquero, but also puts broad typological findings to empirical test. Drawing on two case studies, I converted theories of grammaticalization into testable hypotheses and operationalized them as factors for the analysis of synchronic data.

This chapter is organized as follows. The next two sections will discuss grammaticalization theory and some problems that arise when it is applied to creoles. I will propose a method for addressing those problems. After briefly considering the data collected for this study, I will lay out the case for the grammaticalization of the Palenquero habitual morpheme, asé. I will then address typological markedness as it relates to creole languages and measure Palenquero’s degree of conformity by comparing habitual expression over past and present temporal reference. I argue throughout that only by using quantitative methods can we build a solid case for grammaticalization of asé habitual.

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © Hiram L. Smith . First published  by Oxford University Press

. GRAMMATICALIZATION THEORY Grammaticalization may be deﬁned as the gradual development of grammatical material out of discourse patterns (Sankoff and Brown ) from particular and speciﬁable lexical and phrasal antecedents (Pagliuca : ix) following cross-linguistic grammaticalization paths (Bybee et al. ). While some may not agree (e.g. Newmeyer : ), grammaticalization may be referred to as a theory precisely because it has strong explanatory power (Heine b; Bybee : –). Grammaticalization theory makes strong diachronic predictions and has profound consequences for synchronic description and analysis (Bybee :). The correlated processes associated with grammaticalization also enable us to understand better how grammatical morphemes come into being (Heine and Kuteva : –). Although creoles have been typologized based on formal characteristics, I support the prioritization of functional criteria when making cross-linguistic comparisons
(Croft ). Since what is universal¹ about cross-linguistic grammatical categories are not surface forms but conceptual or semantic notions, grammaticalization paths of change, and the mechanisms that underlie that change (Bybee b), grammaticalization theory provides valuable insights which can be used for cross-linguistic comparison. Because of such insights, we now know that languages do not differ to an inﬁnite degree, but that there are constraints on the extent to which they may vary. And while we acknowledge that these are empirical questions, there is no a priori reason why typological insights should not be applied to creole languages. Therefore, the appropriate focal point for analysis should be universal pathways of change for verbal categories such as habitual, past, perfective, etc., which are actually stronger cross-linguistic patterns than synchronic catch-all notions (Schwenter and Torres Cacoullos : ). The lexical source of tense-aspect morphemes constrains the outcome, not only of their grammaticalization path, but also of potential meanings that may be expressed by them (Bybee and Pagliuca ; Hopper ). This tendency for certain lexical items to produce the same or similar results across unrelated languages may exist because source concepts that enter into grammaticalization are general notions that are basic to human experience (e.g. ‘going’ and ‘coming’), and thus are largely culturally independent in that ‘they tend to be conceived of in similar ways across linguistic and ethnic boundaries’ (Heine, Claudi, and Hünnemeyer a: ; cf. Bybee et al. : ). For this reason creole languages should not be excluded from comparison with broad typological patterns, despite claims that they may be taxonomized in their own right (McWhorter , )² or that their tense-moodaspect systems are unique (Bickerton , , ). 
The domain of verbal categories becomes particularly important because of well-established crosslinguistic tendencies in the evolution of particular lexical source constructions into aspectual expressions (e.g. Bybee et al. ; Heine and Kuteva ) and because of the central role creole studies has assigned to preverbal morphology in typologizing creole languages (Bickerton , , ). Here I compare the variable use of tense-aspect expressions in present and past temporal reference; this approach allows us to directly test hypotheses regarding the asymmetries of imperfective grammatical expressions.

¹ Making claims in terms of absolute universals ‘has largely been unsuccessful in accounting for empirical data’ (Bybee : ), so I espouse the more balanced view of grammaticalization paths being ‘empirically supportable (statistically) strong tendencies’ (Traugott : ). Unattested language types may exist, however, which is why we must submit candidates for grammaticalization to rigorous empirical test, as we do here. ² McWhorter (: ) actually excludes ‘pre-verbal markers of tense, mood and aspect’ from the Creole Prototype Hypothesis, since ‘such constructions can be found in “regular” languages, often clustering in them, such as in Chinese languages, just as they do in creoles’. My focus here, though, is not on any formal criterion, preverbal or otherwise (for reasons which I explain later), but on seeking explanations for form–function asymmetries.




. GRAMMATICALIZATION IN CREOLES

..      There are inherent problems in applying grammaticalization theory to creoles. One is the fact that they are contact languages with complex linguistic lineages. This, in turn, leads to conﬂicting theories of provenance, making it difﬁcult to adjudicate between them. There is also a lack of historical documentation for many creoles. As researchers, how do we tackle such problems? I will brieﬂy discuss each of these. Applying a strictly monogenetic view of grammaticalization to creoles ignores the complex issues that surround contact languages, namely, competing hypotheses regarding provenance. Since creole grammatical structures may arise from substrate inﬂuence, contact with the lexiﬁer, and/or universal processes, which may include grammaticalization, then adopting ‘a strictly monogenetic view of grammaticalization is inappropriate’ (Hopper and Traugott : ). Creolization, while not a speciﬁc process in and of itself (Mufwene : ), involves grammatical restructuring and transfer at all levels of the grammar, and as such may involve some gradual language-internal changes. This being said, then, I contend that insights gained from studying grammaticalization in non-contact languages should not be discarded when addressing language-internal change in creoles (Heine and Kuteva ); rather, we may be able to apply such knowledge heuristically (Plag ). Another issue that arises is ascertaining how grammaticalization proceeds in creoles. Creole grammaticalization may be strictly language-internal in the classic sense and then proceed gradually, however one deﬁnes gradualness (Mufwene : –). Grammaticalization may begin in one language and continue into the creole, with possibilities including a complete break in transmission, ‘transmission with modiﬁcation’ (Mufwene : ), slowly or rapidly (Bruyn ). 
The question we address in this chapter is: how can we adjudicate between hypotheses using empirical methods? Obviously, Palenquero habitual expression differs signiﬁcantly from potential source languages. In Spanish, present temporal reference verbs expressing habitual are zero marked, while past habituals are marked with the imperfect morpheme -ba. Kikongo employs a sufﬁxed vowel for tense and aspect (Mbiavanga : ) but inﬁxation for habitual (Dereau /; Mbiavanga : ). Kikongo-Kituba, which has the same putative substrate as Palenquero, employs an overt morpheme whose lexical source is ‘be’ and not ‘do’ (Mufwene : ). So, it may appear on the surface that Palenquero asé represents a ‘smoking gun’ with respect to the grammaticalization of an overt habitual morpheme; yet, as I will demonstrate, it is only through the application of rigorous tests that we can build a strong case for grammaticalization in a language for which we only have synchronic data. What we do know—and it is a great help—is that tense-aspect morphology is rarely transferred to creoles, and so grammatical morphemes may develop through grammaticalization (Arends and Bruyn : ). Cross-linguistically, ‘for tense-aspect
expressions, the major source of synchronic variation is grammaticalization’ (Torres Cacoullos : ). We also know that grammaticalization is taking place in all languages at all times (Bybee : ). The burning question regarding Palenquero, then, is not if grammaticalization has taken place, but how and where language-internal change has taken place (Baker and Syea ), and which parts of the grammar result from these processes versus contributions from inheritance or language contact (Bruyn ). Regarding the predicament of lack of historical documentation, we know that this is unique neither to Palenquero (although it is true of Palenquero: Patiño Roselli : ) nor to creole languages, but is a problem facing all language historians (Tagliamonte : ), since the vast majority of languages do not provide direct evidence of diachronic processes. To surmount this problem, Croft (: ) proposes that the typologist should examine the full range of linguistic variation, and combine that with a knowledge of the directionality of language change in order to extrapolate language change processes. The variationist method in combination with typological insights taken from grammaticalization theory, then, becomes the one-two punch in addressing this issue. The variationist method’s ‘principle of accountability’ requires that all variants within the variable context be accounted for, whether realized or not (Labov : ). Once we figure out the apportionment of all tense-aspect morphemes, including zero-coded ones, we can explain the distributions through a diachronic lens. The configuration of the data poses no problem, since with the variationist method even the most chaotic-looking data become orderly. Yet how does a diachronic approach apply to synchronic data? We turn to this now.
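The principle of accountability described above can be made concrete with a small tally. The following is only a minimal sketch, not the procedure used in this chapter: the token list is invented for illustration, the label "other" is a placeholder for any further competing morpheme, and the function name variant_rates is hypothetical. It shows the one point the principle insists on, namely that every token in the variable context, including zero-coded ones, enters the denominator.

```python
from collections import Counter

# Invented toy data: one (temporal_reference, variant) pair per coded token.
# "zero" stands for zero-coded tokens; "other" is a placeholder label for any
# further competing morpheme. None of these counts come from the chapter.
tokens = [
    ("present", "asé"),
    ("present", "zero"),
    ("present", "zero"),
    ("present", "other"),
    ("past", "asé"),
    ("past", "other"),
    ("past", "other"),
    ("past", "zero"),
]


def variant_rates(tokens):
    """Return {context: {variant: proportion}} computed over all tokens."""
    counts = {}
    for context, variant in tokens:
        # Every token is counted, whether the variant is overt or zero.
        counts.setdefault(context, Counter())[variant] += 1
    rates = {}
    for context, counter in counts.items():
        total = sum(counter.values())
        rates[context] = {v: n / total for v, n in counter.items()}
    return rates


rates = variant_rates(tokens)
for context, distribution in rates.items():
    print(context, distribution)
```

Because the proportions within each temporal reference sum to one, the rate of any single form can be read against all of its competitors rather than in isolation.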

..          A diachronic perspective is critical when analysing developing morphemes. Some scholars have argued that since similarities among languages are more easily seen from a diachronic perspective, then diachronic explanations are preferable to purely synchronic ones (Bybee et al. : –). Secondly, we know that forms and their distributions may be synchronically arbitrary, ‘thus the only source of explaining their properties may be diachronic’ (Bybee : ). In fact, ‘the explanations of many grammatical phenomena are fundamentally diachronic, not synchronic’ (Croft : ). Synchronically, though, ‘a set of diachronically related functions [temporal-aspectual meanings] along a hypothesized grammaticalization path may be expressed asymmetrically by a variant or variants’ (Schwenter and Torres Cacoullos : ). A diachronic approach or perspective, which is employed here, need not have a real or apparent time component. It can be used purely for heuristic or diagnostic purposes. This is well motivated due to the lack of historical linguistic data for many creoles, and due to the strength of the unidirectionality hypothesis, a prominent feature of grammaticalization theory. Since the unidirectionality principle is an
important component of grammaticalization and an essential building block of the argument made here, we will discuss that now. The unidirectionality hypothesis is essentially that grammaticalization processes, once under way, proceed only in one direction. Recent research has emerged proving that claims of reverse grammaticalization are idiosyncratic and rare (Bybee : ) compared to the wealth of attested examples, which now number into the hundreds (e.g. Bybee et al. ; Heine and Kuteva ), supporting gradual, unidirectional grammaticalization. Lexicalization, abrupt reanalysis, and other one-step backwards processes have been shown to not constitute reverse grammaticalization (Brinton and Traugott ; Willis ; Bybee ; Börjars and Vincent ). True reversals must inversely correspond to the gradual set of correlated processes typically associated with grammaticalization, such as phonetic reduction, semantic bleaching, and incremental changes in syntactic distribution, only in reverse (Bybee : ). They would also involve metaphorical shifts from abstract to concrete, from temporal meanings to spatial ones, or changes from affix>clitic>independent word, phonological accretion (Willis : ) and many other changes (Brinton and Traugott : ). They must also be frequent, systematic, and represent more than just ‘one step in the backwards direction’ to qualify as degrammaticalization (Bybee : ). Since in grammaticalization it is frequency of use that produces semantic bleaching, phonetic reduction, and chunking (recategorization), it is illogical to assume that frequency would simultaneously be responsible for producing equal and opposite effects. One such effect, the loss of phonetic material, once complete, is irreversible, making phonetic accretion impossible.³ In fact, ‘even decreases in frequency do not lead to reversals’ (Bybee : ).
What is relevant to the current argument is how the unidirectionality principle may be applied to creole grammaticalization. Heine and Kuteva (: ) assert that the unidirectionality hypothesis appears to hold true in cases of contact-induced change as it does for language-internal change. These authors, however, mention that there are a few examples where phonological accretion, not reduction, takes place during the ‘transitional phase’ of contact-induced grammaticalization between model and replica languages, but then reduction proceeds unidirectionally as in canonical cases. They further state that ‘of the roughly two hundred cases of grammatical development that we have been able to identify so far in pidgins and creoles (see Heine and Kuteva ), hardly a handful are at variance with canonical pathways of grammaticalization’ (Heine and Kuteva : –). But what about the few cases where creoles are ‘at variance’ with the unidirectionality principle? As with many other languages, these cases are not uncontroversial and often need to be substantiated with data. Plag () claims that the presence of counterexamples to the unidirectionality and gradualness principles of grammaticalization theory in creoles suggests that other mechanisms are at work, i.e. structural transfer, but not reverse

³ While phonetic accretion after loss is impossible, phonetic accretion itself, even in cases of grammaticalization, may not be so rare. (See Narrog , Mushin, Ch.  this volume.)



Hiram L. Smith

grammaticalization. He proposes that by using the highly constrained monogenetic view of grammaticalization as a diagnostic tool, one can ‘unequivocally identify substrate influence in creole formation.’

The rationale is simple: under the assumption that language-internal developments must accord to the principles in grammaticalization theory [specifically, gradualness and unidirectionality], violations of these principles must be interpreted as caused by external factors. By this token, we arrive at an independent indication of substrate transfer. (Plag : )

Therefore, on the strength of the unidirectionality hypothesis, I propose to test whether language-internal grammaticalization is taking place in the domain of verbal categories in Palenquero. Since much is known about the development of tense-aspect morphemes, we have a yardstick by which we can measure the degree of Palenquero compliance. Nevertheless, we begin from an agnostic position, because if we assumed a priori that grammaticalization has (or has not) occurred in this language, whether contact-induced or not, then we would be predetermining the results of our supposed empirical study. Hence, I applied stringent tests before making any specific claims of grammaticalization. In what follows, I provide evidence that is strongly suggestive of language-internal grammaticalization of an incipient morpheme, asé, the so-called marker of habitual. I found that asé is not marking anything, including habitual, but that it is one of several competing aspectual morphemes that are doing habitual and other work.

. THE DATA AND METHODOLOGY

San Basilio de Palenque, Colombia, is the oldest surviving maroon settlement, or ‘palenque,’ formed by fugitive slaves between  and  (Morton : ; Navarette : ; Schwegler : ). These ‘palenques’ or palisade forts were ubiquitous in the Colombian hinterlands, forest swamps, mountain slopes, and plains which were all once part of the jurisdiction of Cartagena (Navarette ), South America’s most important slave port of the period (Wheat ). San Basilio de Palenque (or Palenque, as it is often called) is located some  kilometers ( miles) southeast of the regional capital, Cartagena de Indias, in the Department of Bolívar. It is home to anywhere between , and , residents (cf. Lipski : ; Schwegler : ). Figures from the turn of the century indicate that these residents make up some  families and  residences (Guerrero et al. : ). This community has recently been declared a Masterpiece of the Oral and Intangible Heritage of Humanity by UNESCO. The data were taken from sociolinguistic interviews and conversations that were audio recorded by the author during July , May , May , and November  in San Basilio de Palenque and Cartagena de Indias (Smith , ). My consultants were male and female speakers of Palenquero who ranged from  to  years of age. Recordings from  speakers in present temporal reference and  speakers in past temporal reference were exhaustively transcribed by the author using the software program ELAN (Lausberg and Sloetjes ). (All examples in this chapter were taken

Grammaticalization in creoles



from this corpus.) Quantitative methods developed in variationist sociolinguistics were used to code and analyse the data (Poplack ). The data were entered into Goldvarb LION (Sankoff, Tagliamonte, and Smith ) for analysis.

. THE HABITUAL MORPHEME ASÉ AND QUESTIONS OF PROVENANCE

Traditional descriptions of preverbal tense-aspect morphemes⁴ in Palenquero are as follows: asé (habitual), sabé (habitual), ta (progressive), a (past/perfective/completive). Palenquero has two suffixes, -ndo (gerund), viewed traditionally as progressive, and -ba, the past imperfect (Davis ; Smith : ). This study focuses primarily on asé, while taking into account the entire variable context in which all morphemes co-vary in expressing various functions, including habitual meaning. The preverbal morpheme asé is often viewed as the habitual marker in Palenquero (e.g. Bickerton and Escalante : ; Patiño Roselli ; Holm ; Schwegler ; Simarra Reyes and Triviño Doval : ), as in ().

()  Ahora nu. Majaná asé salí ku sei u siete u ocho majaná.
    Today NEG. Kid HAB go.out with six or seven or eight kids
    ‘Not these days. Kids go out with six or seven or eight kids.’
    (Female , Recording , :)

Because of surface similarity, it is often claimed that asé’s lexical source is the Spanish main verb hacer (‘do’) (e.g. Bickerton and Escalante : ; Schwegler : ). Nevertheless, there is no consensus on this point, because others claim that asé is really two morphemes, a and sé (Patiño Roselli : ; Simarra Reyes and Triviño Doval : ). Yet, as Schwegler and Green (: ) point out, such authors are making implicit assumptions about origins, but have not proffered any explanation as to what the possible lexical sources of these two morphemes might be. So, how do we determine which hypothesis is correct? While attestation of superﬁcially similar grammatical features in a creole and its genetic or areal relatives may be a ﬁrst indication of their provenance, we need to validate such claims with empirical tests. In this case, both of the claims mentioned above suggest contact-induced grammaticalization, yet they are mutually exclusive, and neither offers any insights into which cross-linguistic evolutionary paths are relevant or how such conclusions were reached. Regarding which camp is right, it is clear that in the absence of any hard evidence, we are simply at an impasse. The fact that this question has remained unresolved since  suggests that fresh approaches are needed. This study responds to the call for quantitative methods to be brought to bear in issues surrounding creole grammars (Sankoff : ; Meyerhoff ), thus contributing to the ﬁelds of both creole studies and variationist sociolinguistics.

⁴ This list is not exhaustive, as it does not include future, conditional or counterfactual morphemes, etc. For more complete lists see e.g. Schwegler and Morton () and Schwegler and Green ().




In the next section, I consider several claims relating to grammaticalization found in the typological literature. I then convert these generalizations into testable hypotheses which make speciﬁc predictions for what the synchronic Palenquero data should look like if grammaticalization is indeed taking place. As Klein-Andreu (: ) states, when doing any historical work, we ideally ‘look for evidence of different kinds that can be viewed as pointing in the same direction’ (all emphasis in the original). In what follows, then, the (dis)conﬁrmation of the grammaticalization hypothesis will be based on the sum of the evidence when considered in aggregate. In other words, by utilizing grammaticalization indices (Torres Cacoullos and Walker ), or diagnostic tests, which Palenquero can either ‘pass’ or ‘fail,’ we infer the degree of grammaticalization of asé. Let us now consider what the cumulative evidence tells us.

. BUILDING THE CASE FOR THE GRAMMATICALIZATION OF ASÉ: WHAT THE CLUES TELL US

In this section, we will review the results from quantitative variationist analyses taken from two case studies (Smith , ) which tested various factor groups. By relying on multivariate analyses, these studies weighed the contributions of multiple factors (although we will only consider those that have a direct bearing on our discussion), with a focus on the expression of tense and aspect. For present temporal reference (Total N=1,197), I coded for: preverbal aspectual form, aspectual meaning, stativity of the predicate, lexical type, co-occurring temporal adverbial, and polarity. For past temporal reference (Total N=,) I coded for: preverbal aspectual form, suffix on the main verb, aspectual meaning, stativity of the predicate, co-occurring temporal adverbial, polarity, and frequency of lexical verb. It will become clear from the cumulative evidence laid out below that preverbal asé is not the marker of habitual aspect, but is emerging gradually along a well-defined grammaticalization path, and that it is developing language-internally. In the following section I adduce, in turn, several indicators as evidence for grammaticalization.
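The coding procedure described above can be pictured with a minimal sketch. It assumes each token is stored as a record with one field per factor group. The class name `Token`, its field names, the helper `rate_by_factor`, and the toy tokens are all hypothetical illustrations, not the coding scheme or the data of the original studies.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One coded clause; field names and values are illustrative only."""
    form: str      # preverbal form: "asé", "sabé", "ta", "a", or "zero"
    meaning: str   # aspectual meaning, e.g. "habitual", "frequentative"
    stative: bool  # stativity of the predicate
    polarity: str  # "affirmative" or "negative"


def rate_by_factor(tokens, variant, factor):
    """Share of tokens realised as `variant` within each level of `factor`."""
    totals, hits = Counter(), Counter()
    for token in tokens:
        level = getattr(token, factor)
        totals[level] += 1
        if token.form == variant:
            hits[level] += 1
    return {level: hits[level] / totals[level] for level in totals}


# Toy tokens invented for illustration; they are NOT corpus data.
toy = [
    Token("asé", "habitual", False, "affirmative"),
    Token("zero", "habitual", False, "affirmative"),
    Token("asé", "frequentative", False, "affirmative"),
    Token("zero", "habitual", False, "negative"),
]
rates = rate_by_factor(toy, "asé", "polarity")
```

Tabulating the rate of one variant per factor level in this way is only the descriptive first step; the multivariate analyses reported in the chapter go further by weighing the factor groups against one another.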

..  :  This factor group examined tense-aspect asymmetries in positive versus negative polarity contexts. Givón states that ‘it is widely observed that the number of tenseaspects in the afﬁrmative paradigm is almost always larger but never smaller than in the negative’ (a: –). This suggests that in grammaticalization, tense-aspect elaboration happens ﬁrst in afﬁrmative contexts and then gradually extends into negative ones.⁵ Here we test whether this is the case for Palenquero. Notice in ⁵ It has been observed, though, that modal meanings often arise through negation (Narrog ; : –).

Grammaticalization in creoles



example () that preverbal asé may appear in negative contexts (co-occurring with the negative particle nu), suggesting expansion of the habitual morpheme over time. The question is: how far developed is asé in negative contexts? ()

Suto asé kandá nu kuando monasito .6 We HAB sing NEG when baby born ‘We don’t sing when babies are born.’ (Female +, Recording , :)

... Prediction

If asé is an incipient developing morpheme (which is a reasonable assumption since creoles are young languages), we would expect the new expression to occur more frequently in affirmative contexts than in negative ones. If, on the other hand, asé has had sufficient time to become obligatory, or if it is also favoured in negative polarity contexts, then this test may not yield much information with respect to a grammaticalization hypothesis. The early stages of tense-aspect elaboration, though, should be easy to spot.

... Results

Multivariate analysis (Smith : ) confirmed that in present temporal reference, asé is favoured (.) in affirmative contexts (%, N=) over negative ones (%, N=) where it was disfavoured (.), a fact consonant with a grammaticalization hypothesis. In past temporal reference, of all cases of asé, % (N=/) occurred in affirmative contexts and % (N=/) occurred in negative ones. These results are consonant with an early, developing habitual morpheme, and not one that is well advanced.
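For readers unfamiliar with variable-rule output, the terms ‘favoured’ and ‘disfavoured’ can be illustrated with a short sketch. It assumes the usual convention that a factor weight is the inverse-logit transform of a centred log-odds effect, so that values above .5 favour the variant and values below .5 disfavour it. The function names and the input values are hypothetical; they are not the figures reported in Smith's analysis.

```python
import math


def factor_weight(log_odds: float) -> float:
    """Map a centred log-odds effect onto the 0-1 factor-weight scale."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def reading(weight: float, neutral: float = 0.5) -> str:
    """Conventional interpretation of a factor weight."""
    if weight > neutral:
        return "favoured"
    if weight < neutral:
        return "disfavoured"
    return "neutral"


# Hypothetical effects chosen only to show the two directions.
for label, effect in [("affirmative", 0.4), ("negative", -0.4)]:
    weight = factor_weight(effect)
    print(f"{label}: weight={weight:.2f} ({reading(weight)})")
```

On this reading, a result such as the one above says that affirmative polarity raises the likelihood of asé relative to the overall tendency, while negative polarity lowers it.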

..  :     Some scholars claim that asé is really two morphemes, a and sé, since, they claim, negative polarity contexts automatically trigger the deletion of a (Patiño Roselli : ; Simarra Reyes and Triviño Doval : ). The origins of asé are directly tied to this matter of its being two morphemes or just one. If preverbal asé is a monomorphemic particle derived from the etymon hacer, then we would not be surprised if it showed synchronic behaviour consistent with the ‘do’>iterative>habitual grammaticalization path (Bybee et al. : –, –; Heine and Kuteva : –). If, on the other hand, it is really two morphemes (a and sé), one being perfective a, perhaps deriving from Spanish haber ‘have’>ha>a, and the second one unknown (Schwegler and Green : ), then we would expect a different grammaticalization path altogether (‘have’>resultative>perfect>perfective) (Bybee et al. : ), and different predictions for the synchronic data would follow. It would be quite unusual indeed for habitual functions to develop from this latter path with ‘have’ as a ⁶ Following Schwegler and Green (: ), Spanish portions of the transcriptions will be set apart by brackets. Nace is Spanish rd sg, as opposed to invariant Palenquero nasé.



Hiram L. Smith

lexical source. But that is exactly what is implied. This said, how can we ascertain the morphemic status of asé (or a + sé)?

... Predictions

In order to test the claim that negative polarity contexts trigger the deletion of a in asé, all instances of Palenquero nu and no had to be coded. If an examination of the data turns up no deletion rule, then that may suggest hacer ‘do’ as a possible lexical source for asé, because the case could be made that asé is one morpheme. Taking that as a first indication of its morphological status, further diagnostics would then need to be applied in light of predictions made for ‘do’ habituals.

... Results

The data turned up no support for the claim that asé comprises two morphemes. In fact, they show quite the opposite trend. Not only did asé appear along with the negative particle, as illustrated in example () above, it was overwhelmingly present in these contexts (%, N=/) in present temporal reference. For past asé, often taking the form ase-ba (asé HAB + ba PASTIMP), similar results were found. The data revealed that when asé occurred in negative polarity contexts, a was rarely deleted (N=/), an example of deletion being () below. Therefore, I side with Schwegler and Green (: ) in judging that the reduced past form seba is most likely the result of the deletion of pre-tonic a rather than deletion that is automatically triggered by the presence of nu. More central to the present argument, though, is that we now have evidence, which we arrived at inductively, supporting hacer ‘do’ as the lexical source for asé. And this, of course, has implications for grammaticalization.

()  Papocho aki __sé kume-ba nu pogke .
    Papocho here HAB eat-PASTIMP NEG because nobody it like PASTIMP
    ‘Nobody would eat papocho here because nobody liked it.’
    (Female +, Recording , :)

..  :     ́ Habitual refers to any event that is customary, usual, characteristic of an entire period of time, or that is repeated on several occasions over a period of time (Comrie : ; Bybee et al. : ). Cross-linguistically, habitual morphemes usually derive from verbs that are consonant with habitual meaning, such as ‘live’ and ‘know.’ From these lexical sources they commonly develop along a hypothesized grammaticalization path of ‘live’, ‘know’>frequentative>habitual (Bybee et al. : , , ). In English-based creoles (but not Spanish-based creoles), ‘do’ is an attested lexical source for habituals (e.g. Rickford ; Holm : ). From a typological perspective, however, there is nothing precluding ‘do’ verbs from entering into grammaticalization in Spanish-lexiﬁed creoles (Heine et al. a: ; Heine and Kuteva : ).




T .. Distribution of aspectual distinctions in present temporal reference by their forms (N=,) (Smith : ) States Habitual Progressive Frequentative Total N Total %

zero 373 59.7% 164 36% 23 24% 2 10% 562 47%

ta 1 0.2% 19 4.2% 68 70.8% 1 5% 89 7.4%

asé 15 2.4% 178 39% 0 0% 13 65% 206 17.2%

a 227 36.3% 65 14.3% 5 5.2% 3 15% 300 25.1%

sabé 9 1.4% 30 6.6% 0 0% 1 5% 40 3.3%

Total N 625 52.2% 456 38.1% 96 8% 20 1.7% 1197 100%
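The percentages in the table above are shares of each aspectual meaning (each column), and they can be recomputed from the raw counts. The sketch below transcribes the counts from the table; the names `COUNTS`, `column_share`, and `grand_total` are illustrative only and do not come from the original analysis.

```python
# Token counts transcribed from the table: form -> aspectual meaning -> N.
COUNTS = {
    "zero": {"states": 373, "habitual": 164, "progressive": 23, "frequentative": 2},
    "ta":   {"states": 1,   "habitual": 19,  "progressive": 68, "frequentative": 1},
    "asé":  {"states": 15,  "habitual": 178, "progressive": 0,  "frequentative": 13},
    "a":    {"states": 227, "habitual": 65,  "progressive": 5,  "frequentative": 3},
    "sabé": {"states": 9,   "habitual": 30,  "progressive": 0,  "frequentative": 1},
}


def column_share(form, meaning, counts=COUNTS):
    """Share of all tokens with `meaning` that are expressed by `form`."""
    column_total = sum(row[meaning] for row in counts.values())
    return counts[form][meaning] / column_total


def grand_total(counts=COUNTS):
    """Total number of tokens across all forms and meanings."""
    return sum(sum(row.values()) for row in counts.values())


for meaning in ("habitual", "frequentative"):
    share = 100 * column_share("asé", meaning)
    print(f"asé as share of {meaning} contexts: {share:.1f}%")
```

Run as written, this gives 39.0% for habitual and 65.0% for frequentative, which matches the asé row of the table and makes the form–function asymmetry easy to check against the totals.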

... Prediction

As demonstrated above, hacer ‘do’ is most likely asé’s lexical precursor. If this is so, i.e. if asé: (a) has developed from erstwhile hacer and (b) is developing language-internally, then, given well-established cross-linguistic tendencies observed in the development of habituals, we would expect it to be favoured in frequentative contexts over habitual ones (Bybee et al. ). Generally speaking, the synchronic data should show that the developing morpheme asé is variably, and not categorically, associated with habitual meaning (see Table .). What is of interest here is the degree of association of asé with related functions along the grammaticalization path of frequentative > habitual. If the Palenquero data were to reveal that habitual is obligatorily marked by asé, then this predictor could not yield any pertinent information regarding whether grammaticalization has taken place. However, since form–function asymmetry was found for all forms in both past and present temporal reference, we are able to speak to the grammaticalization hypothesis.

... Results

As seen in Table ., after all tense-aspect forms (including zeroes) in present temporal reference were exhaustively extracted (Total N=1,197), an accountable analysis of their distributions confirmed form–function asymmetry. The data showed that the preverbal morpheme asé has both habitual and frequentative functions (Smith : ), where frequentative means ‘often’, ‘sometimes’, or ‘frequently’, but ‘not necessarily habitually’ (Bybee et al. : ), as seen in (). Quantitative analysis uncovered that asé was more closely associated with the more specific frequentative meaning, which it expressed 65% (N=13/20) of the time, than with the more general habitual meaning, which it expressed only 39% (N=178/456) of the time. Multivariate analysis of asé vs zero⁷ also confirmed that asé is strongly favoured in the environment of frequentative (.), but shares habitual space (.) with zero, as in ().

()  A bese suto asé kumé ñame ku pekao tambié.
    At times we HAB eat ñame with fish too
    ‘Sometimes we eat yam with fish too.’
    (Male , Recording , :)

()  Kuando __ kaminá jende ta saká revolve.
    When walk people PROG take.out revolver
    ‘When people [go for a] walk they pull out pistols.’
    (Male , Recording , :)

These findings lend support to a grammaticalization hypothesis. According to Bybee et al. (: ), ‘[when] . . . a [grammatical morpheme] has two or more uses, this implies a diachronic relation between [them], since it is reasonable to assume on the basis of our knowledge of documented cases that one use developed after, and probably out of, the other’. The fact that asé expresses both frequentative and habitual meanings is consistent with what has been observed in other languages, ‘suggesting a link between these two meanings’ (p. ). The fact that asé is favoured with the more specific meaning of frequentative is consonant with grammaticalization predictions, as newer expressions are associated with specific meanings (like frequentative) before they extend to more generalized ones, like habitual.

..  :      Cross-linguistically, erstwhile progressives can generalize to general presents that encompass habitual meaning following the grammaticalization path (locative >progressive>imperfective>general present) (Bybee et al. ; Heine and Kuteva ). The progressive morpheme ta in Palenquero, however, is said to only mark progressive but never habitual (Schwegler and Green : ). On the other hand, Holm (: ) observes for creoles that ‘when there is an expressed habitual marker, it is usually the same as the progressive marker, or at least related to it historically (Palenquero ase being a notable exception)’. This last claim seems to ignore the fact that form–function asymmetry is not coterminous (Poplack : ). In other words, just as one form, such as an ‘expressed habitual marker’, may express various diachronically related functions, one function, such as habitual, may be expressed by more than one morpheme.

⁷ The dependent variables were asé vs zero in the multivariate analysis (Smith ). The independent variables under consideration were frequentative and habitual aspects and polarity (N=).




... Prediction

It stands to reason that the progressive morpheme ta, if far enough along a cline of grammaticalization, could express habitual meaning. But we should not expect to find the other habitual morphemes (sabé or asé) in progressive contexts, as that would constitute reverse grammaticalization, assuming gradual, language-internal change.

... Results

As seen in Table ., which shows the distributions of aspectual distinctions by their formal expression, all Palenquero tense-aspect morphemes may express habitual meaning. In fact, we found that ta expresses habitual meaning % of the time (N=/) (cf. Lipski : –), as seen in example (), where ta and asé express habitual in the same variable context. It is noteworthy that not only is the progressive morpheme ta found in habitual contexts, but neither habitual morpheme (asé or sabé) occurred with progressive meaning. The former finding is consonant with the grammaticalization path of progressives that can generalize to express habitual meaning; the latter pattern, had it occurred, would indicate reverse grammaticalization, or, for creoles, that language-external mechanisms could be at work (Plag ).

()  Ele asé kaí tambié kumo mango. Ele ta kaí kumo mango.
    He HAB fall also like mango. He PROG fall like mango
    ‘[It] falls too just like the mango. It falls like the mango.’
    (Male +, Recording , :)

..  :   When developing from lexical meanings to grammatical ones, aspectual morphemes must increase in frequency and generalize in meaning, which then makes them applicable to a wide variety of contexts. Since change is gradual, earlier meanings and newer aspectual ones may coexist synchronically (Bybee and Pagliuca ; Hopper ), and often in the same token (Bybee and Torres Cacoullos : ). This may produce a kind of temporary semantic dissonance between the new form and its lexical etymon, i.e. when they occur together. In fact, studies of grammaticalizing aspectual morphemes report that they tend to eschew co-occurring main verbs that have the same lexical source, i.e. until they have become sufﬁciently generalized in meaning (Poplack and Malvar : –; Poplack and Tagliamonte : –). Synchronically, developing morphemes show co-occurrence patterns which have gradient, not categorical, strengths, allowing us to apply quantitative tests.

... Predictions

If asé is an incipient habitual morpheme which derives from erstwhile hacer ‘do’, then we would not expect it to be favoured with the main verb asé ‘do’, as in example (). However, if asé is favoured in this context that would not necessarily suggest that grammaticalization is not taking place, but simply that asé is far enough along the grammaticalization path that it is out of range for this particular grammaticalization index.

()  Entonse bo a sabé ma jende asé asé bulá suto aki.
    So you PRET⁸ know PL people HAB make ridicule us here
    ‘So, you know how people make fun of us here.’
    (Female +, Recording , :)

... Results

Preverbal asé in present temporal reference is disfavoured with the main verb asé ‘do, make’ (%, N=/, overall rate of occurrence of asé was %). This is consistent with what has been found for other grammatical items such as the ‘go’ future avoiding the main verb ‘go’ (Aaron : ) or main verbs of motion (Poplack and Malvar : –; Poplack and Tagliamonte : –) until the grammatical morpheme has been sufficiently bleached of motion meaning. This is yet another bit of evidence that may point to asé being an incipient developing morpheme.

..  :     ́     ́ An increase in the frequency of lexical items which become grammatical material produces semantic generality, thus allowing them to appear in even more contexts and also in environments where their meaning is redundant (Bybee et al. : ). Synchronically, grammatical morphemes are more frequent than their lexical etyma (Bybee : ).

... Prediction

Since we now have distributions for all preverbal forms and predicates, simple token counts will shed light on whether relative frequency asymmetries in this environment hold true for Palenquero. The corpus contains tokens of the lexical verbs asé ‘do’ and sabé ‘know’ as well as their (putative) descendants, asé (habitual) and sabé (habitual), expressed as preverbal morphemes. The latter two should be considerably more frequent than the former if they are semantically general enough to have grammatical functions.

... Results

Token frequency counts show that preverbal asé is more frequent than its presumed lexical etymon. Table . shows that in present temporal reference, asé occurred far more frequently (N=206) than the main verb asé (N=53), while sabé (N=40) was slightly more frequent than the main verb sabé ‘know’ (N=36). For the sake of comparison, the progressive morpheme ta is less frequent (N=89) than the copula verb ta (N=129).⁹ In past temporal reference, asé and sabé particles were far more frequent than their supposed lexical precursors. For example, preverbal asé (N=182) outnumbered the main verb asé (N=9), and preverbal sabé (N=88) occurred more frequently than sabé as a main verb (N=11). (Interestingly, preverbal ta and ta as a copula verb occurred with virtually the same frequency, with the copula being slightly more frequent than the particle (N=91 vs N=87).)

Table . Frequency of preverbal morphemes compared to their presumed lexical sources in present and past temporal reference

        Present temporal reference     Past temporal reference
        Preverbal     Main verb        Preverbal     Main verb
asé     206           53               182           9
sabé    40            36               88            11
ta      89            129              87            91

The results are consonant with grammaticalization. First, simple token counts confirm grammaticalization predictions regarding frequency (Bybee : ). Second, we see that in past temporal reference, where overt habitual expression is expected to develop first, the gap between the preverbal habitual morphemes and their lexical sources is larger than in present temporal reference. This is consistent with typological markedness, which we will discuss in the next section. Specifically, we will address the intersection of typological markedness with grammaticalization, and how markedness patterns in Palenquero compare to typological universals.

⁸ Preverbal a, which usually functions as preterite or past, may also co-occur with present statives with non-past meaning (Schwegler and Green : ).
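The frequency asymmetry discussed above can be made concrete by taking, for each form, the ratio of preverbal tokens to main-verb (or copula) tokens. The sketch below uses the counts as read from the frequency table; the names `FREQ` and `preverbal_to_main_ratio` are illustrative and are not part of the original study.

```python
# (preverbal N, main-verb or copula N), transcribed from the frequency table.
FREQ = {
    "present": {"asé": (206, 53), "sabé": (40, 36), "ta": (89, 129)},
    "past":    {"asé": (182, 9),  "sabé": (88, 11), "ta": (87, 91)},
}


def preverbal_to_main_ratio(reference, form, freq=FREQ):
    """Preverbal tokens per main-verb token; above 1 means the particle is more frequent."""
    preverbal, main_verb = freq[reference][form]
    return preverbal / main_verb


for reference, forms in FREQ.items():
    for form in forms:
        ratio = preverbal_to_main_ratio(reference, form)
        print(f"{reference:8s}{form:6s}{ratio:6.2f}")
```

A ratio above 1 means the preverbal morpheme outnumbers its presumed lexical source. On these counts the ratios for asé and sabé are well above 1 and larger in the past than in the present, while ta stays below 1 in both.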

. TYPOLOGICAL MARKEDNESS AND ITS RELATIONSHIP TO GRAMMATICALIZATION

Typological ‘markedness’ is not some property of language-specific phenomena; rather, it has to do with universal properties of conceptual categories that have been observed to show cross-linguistic regularities in their formal asymmetries. Finding cross-linguistic patterns is important, since we know that the great extent to which languages vary structurally is the essential problem of cross-linguistic comparability (Croft : ). Yet, cross-linguistic comparisons of semantic categories have revealed that in spite of ‘the fact of asymmetrical or unequal grammatical properties of otherwise equal linguistic elements’ (Croft : ) they show remarkable cross-linguistic patterns. And, in fact, ‘the irregularities are themselves manifestations of

⁹ These ﬁndings, and others I have addressed (Smith : ), indicate that ta behaves differently from all of the other aspectual morphemes in Palenquero along several parameters.




typological universals’ (p. ). Linguistic irregularities can only be discerned by examining linguistic patterns which often are tied to distributions and frequency; hence, quantitative reporting has heuristic value. Patterns of typological markedness are revealed through a comparison of paradigmatic alternatives (such as singular and plural, perfective and imperfective) and can only be discerned inductively by looking at a broad sample of languages (Greenberg ; Comrie : ; Croft : ch. ).

..     Viewed from a typological markedness perspective, a morpheme having an ‘unmarked’ status does not necessarily entail that it will be zero coded, although it can be.¹⁰ It simply means that the unmarked value will be expressed by no more morphemes than the marked value of a conceptual category. In other words, ‘one must compare values to each other and count morphemes’ (Croft : ). Additionally, the unmarked value will also have at least as many distinctions within an inﬂectional paradigm, and it will be found in at least as many grammatical environments as the ‘marked’ value (p. ). Since the unmarked category is found in the least restrictive of environments and is also the most frequent (Dahl ; Schwenter and Torres Cacoullos : ), it is the category that is felt to be the most usual (Comrie : ). Although they are frequent, unmarked values are often zero coded, but not because they are phonetically reduced through frequency (Croft : ); rather, because they are expected members in a particular environment. Marked values, on the other hand, are not part of the default meaning within a conceptual domain, and as such must be ﬂagged with phonetic material (or greater phonetic material) to signal this relationship (Bybee : ). For example, the default meanings associated with present tense (e.g. habitual and states) are also the most frequent; their forms usually have less structural coding (including zero coding) than forms with non-default meanings (e.g. progressive). We now turn to what the default meanings are of present and past tense and the tense-aspect asymmetries predictable from them. The tenseaspect asymmetries, in turn, have consequences for the grammaticalization of habituals. These insights will help explain the skewed distributions of habitual expressions in Palenquero.

..   Observations by typologists reveal that the default reading of the present is not to directly index the moment of speaking, but rather ‘to tell how things are’ (Bybee et al. : –). The inherent meanings of states and habituals, then, are congruent with the imperfectivity of the present tense, and thus tend not to have overt coding. The progressive, on the other hand, signals a meaning that is not part of the default ¹⁰ I distinguish grammatical ‘marking’ (or to put a ‘mark’) from ‘typological markedness’. I will not use the term ‘marked’ to refer to the presence of phonetic material, but only to express a morpheme’s obligatory status (e.g. zero marked: see Bybee ). Instead, I will use the terms ‘overt coding’, ‘phonetic substance’, and the like to talk about overt structural coding.

Grammaticalization in creoles



interpretation of present tense and must therefore be ‘marked’ with phonetic material (cf. Bybee : ). When default meanings are already overtly coded, then the non-default ones will exhibit more structural coding (Croft : ). Therefore, present progressives will exhibit less zero expression than present habituals and will always have at least as much phonetic material. Something similar holds for the simple past tense, although (unlike the present) it does index a deictic or temporal relation and tends to describe ‘what happened’. Bybee et al. (: ) suggest that the path of development of an overt habitual morpheme would naturally take place ﬁrst in past contexts (given the default readings of past and present tenses) and later generalize to present contexts. Habituals in the present tend to be expressed by zero rather than by overt coding (Bybee et al. : ). Perfectives, on the other hand, tend to be zero coded in the past tense.

..        As we discussed above, habituals are congruent with the default meaning of present but not past temporal reference (Bybee et al. : –). So we can make predictions based on this tendency.

... Predictions

Since the default interpretation of past dynamic verbs is perfective, overt coding will be necessary if past dynamic situations are intended to mean habitual or progressive (Bybee et al. : ). Also, since the path of development of habitual morphemes would naturally take place in past contexts (given the default readings of past and present tense) and later generalize to present contexts (Bybee et al. : ), we should see more overtly coded habituals (asé and all other forms) in past than in present temporal reference, i.e. if asé is a developing morpheme that has not reached obligatory status. (A clear example of this phenomenon is seen in the English past habitual construction [used to + verb] versus the present tense bare verb, which is zero marked.) It follows, then, that past habituals rarely or never have zero expression, but present habituals may.

... Results

As we can see in Figure ., a side-by-side comparison of the expression of aspectual meanings across tense reveals that Palenquero’s tense-aspect system is not typologically unusual; rather, morphemes behave predictably relative to the default meanings of their linguistic contexts. For example, when we look at present temporal reference (on the right), habitual meaning is expressed to varying degrees by all possible variants: asé, sabé, a, ta, and zero. In the past, however, ta only expresses progressive and not habitual meaning. So, habitual expression is found in more environments in the present than in the past, a fact consistent with typological markedness (Croft : ).



Hiram L. Smith

[Figure: two panels, ‘Present temporal reference: Habitual’ and ‘Past temporal reference: Habitual’; legend entries asé, sabe, Ø, a, ta.]

F. .. Distribution of morphemes expressing habitual in past versus present

The data also reveal that past habitual morphemes have more phonetic bulk (aseba, sabe-ba, and even bulkier forms such as a sabe-ba and a sabe-ba a) than present tense ones (asé and (a) sabé), since they are more marked in the past than in the present. In fact, of all asé variants in past temporal reference contexts, the vast majority were expressed as ase-ba (N=/), as in (), while sabe-ba was the chosen variant % of the time (N=), as in (), making it more marked than ase-ba. Further, as example () illustrates, even past habituals with no preverbal morpheme still have more structural coding than their present tense counterparts, since they take the past imperfective suffix -ba. Finally, we observe that the smaller morpheme a plays a greater role as a habitual in the present (%) than in the past (%), while the bulkier sabé is more frequent in the past (%) than in the present (%). These structural asymmetries are consistent with typological markedness, and thus with grammaticalization predictions, based on the default meanings of past and present. ()

Y ante, uno Ø ngana-ba meno y plata ase-ba kansá.
And before, one earn-PASTIMP less and silver HAB-PASTIMP get tired
‘And before, you used to earn a lot less and the money would get spent up.’
(Female , Recording , :)

()

Jefe sabe-ba peleá primero.
Leader HAB-PASTIMP fight first
‘The leader used to fight first.’
(Male , Recording , :)

If we take another look at Fig. ., we observe that in the present, asé and zero are the main candidates vying for habitual space, with asé representing % (N=/) of all habitual tokens and zero representing % (N=/). In past temporal reference, asé and sabé—not zero—are doing most of the habitual work. The greater occurrence of zero expression for habitual meaning in the present over past (% vs %,¹¹ respectively) makes sense because cross-linguistically present habituals tend to have zero expression. Looking at the data another way, the more frequent expression of habitual meaning in past (%) as opposed to present (%) suggests the development

¹¹ Zero coding of Palenquero past habituals is not an exception to typological markedness; rather, it is consistent with the fact that in the early stages of creolization, inﬂectional morphology in the super- and substrate languages does not typically transfer to creoles (hence, creating a lot of zeroes (with open meaning) in the process), and overt forms subsequently need time to develop.

Grammaticalization in creoles



of habitual taking place ﬁrst in the past before gradually extending to the present. Additionally, habitual meaning is expressed by the preverbal morpheme asé more frequently in the past (%) than the present (%). We ﬁnd the same for sabé (compare past % vs present %). So, not only do we see that habitual is more ‘marked’ in the past than in the present, the distributions are typologically coherent and consonant with predictions made by grammaticalization theory.

. DISCUSSION

As the results show, quantitative analysis proves fruitful in disentangling questions surrounding the grammaticalization of preverbal asé. For one, there is no habitual ‘marker’ (cf. Poplack and Tagliamonte : ) in Palenquero creole, in either past or present temporal reference. The so-called habitual ‘marker’, asé, has all the hallmarks of an emerging grammatical morpheme, and not one that has obligatory status. It is early enough in its development that it passes tests for early habituals, but it is advanced enough to make itself known as the main competitor for habitual. The litmus test for grammaticalization lies in comparative analysis of past and present temporal reference, since any claims of language-internal change hinge on putting to empirical test predictions made in grammaticalization theory (Bybee et al. ).

In testing tense-aspect elaboration using the factor group Polarity, the data revealed that asé is not yet obligatory in negative polarity contexts. It was the contribution of multivariate analysis along with the overall distributions that proved most useful here. While asé was only slightly favoured in positive polarity contexts, it was strongly disfavoured in negative polarity contexts. So, not only did it seldom appear in that environment, for the time being it is also not likely to appear.

In testing the monomorphemic status of asé using the factor group Polarity, we found that by counting not only all environments where asé occurred but also where it could have occurred but did not, we turned up evidence militating against claims of an arbitrary deletion rule for a. More importantly, we discovered that not only are these so-called deletion environments rare, but within the appropriate environment deletion almost never takes place. This simple test provides strong support for the monomorphemic status of asé, and thus for the further argument that asé is most likely derived from Spanish hacer.
Regarding the progressive marker ta’s role in the domain of habitual, we found that accountable quantitative methods proved to be quite serviceable. What separates the variationist method from other types of quantitative sociolinguistics is the principle of accountability. If we had not taken all tense-aspect morphemes into the variable contexts of both past and present temporal reference, including zero coded forms, or taken into account form–function asymmetry, or if we had only used scant amounts of data, then we might never have discovered that ta can (albeit rarely) express habitual meaning. It was this ﬁnding, which we arrived at inductively, that added another dimension to our set of diagnostics, because it permitted testing of the unidirectionality principle.
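The principle of accountability described above can be made concrete with a small sketch. It assumes a list of coded habitual tokens, each tagged with its temporal reference and the variant that expressed it (including zero); the token counts below are invented purely for illustration and are not the figures reported in this chapter, and the function names `variant_rates` and `overt_rate` are hypothetical.

```python
from collections import Counter

# Hypothetical coded tokens: (temporal_reference, variant).
# "zero" stands for habitual meaning with no overt preverbal morpheme.
# These counts are invented for illustration; they are NOT the chapter's data.
tokens = (
    [("present", "asé")] * 4
    + [("present", "zero")] * 5
    + [("present", "ta")] * 1
    + [("past", "asé")] * 6
    + [("past", "sabé")] * 3
    + [("past", "zero")] * 1
)


def variant_rates(tokens, context):
    """Share of each variant among ALL habitual tokens in one context.

    Counting every token, including zero-coded ones, is what makes the
    rates 'accountable': the denominator is every place a form could occur.
    """
    counts = Counter(variant for ctx, variant in tokens if ctx == context)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {variant: n / total for variant, n in counts.items()}


def overt_rate(tokens, context):
    """Proportion of habitual tokens in a context that carry overt coding."""
    rates = variant_rates(tokens, context)
    return 1.0 - rates.get("zero", 0.0)


present = variant_rates(tokens, "present")
past = variant_rates(tokens, "past")
print(present, overt_rate(tokens, "present"))
print(past, overt_rate(tokens, "past"))
```

With these invented counts, overt coding of habitual is more frequent in the past than in the present, which is the direction the markedness prediction expects; with real data the same calculation would be run over every variable context defined by the analyst.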




The factor group Semantic Dissonance was important because it provided one more piece of evidence for asé’s being an early habitual. By taking into account the gradient nature of constituent structures (Beckner and Bybee ), I was able to perform a simple diagnostic to test whether or not the preverbal particle asé would be disfavoured by the homophonous main verb asé. This gave evidence, along with the other factors, that although a certain degree of generalization has taken place, asé is still quite early in its development. On the other hand, when we compare frequency of the expressed habitual morphemes with their lexical sources, we ﬁnd that the line between early and advanced is very thin. We found that the grammatical expressions asé and sabé are by and large more frequent than their homophonous lexical counterparts. Generality of meaning is correlated with greater frequency (both as a byproduct and as a contributing factor), and with later stages of development (Bybee et al. : ). Hence, we add preverbal asé’s greater frequency and its eschewing of main verb asé to its surface similarity and its monomorphemic status, thus establishing an irrefutable link between lexical and grammatical asé. Finally, the data showed asymmetries in the expression of Palenquero tense and aspect morphemes in past temporal reference compared to the present. In spite of this, the data were orderly, revealing structured heterogeneity, thus allowing us to see patterns which conform, not to a creole prototype, but to typological markedness patterns found in many of the world’s languages; these patterns were neatly accounted for by using principles from grammaticalization theory. The aspectual morphemes that occur in the past and present do different work, but in every case, that work is consistent with the default meanings of the conceptual domains in which they appear. 
We asked at the outset: to what degree do tense-aspect expressions in Palenquero conform to well-established typological patterns? As we have just discussed, the answer to that question is multi-faceted and requires that we think about it along all relevant parameters. Answering it also requires that we submit items to empirical tests. The domains in which a grammaticalization hypothesis is conﬁrmed all have different characteristics. Each one, in its own way, provides a nuanced view of how far grammaticalization has proceeded along a particular axis. Therefore we ﬁnd evidence for asé walking a tightrope of being both emerging yet advancing. This complements other literature that has used grammaticalization indices (Torres Cacoullos and Walker ) and quantitative methods in order to detect emergent linguistic forms ‘on the ground’ (Poplack and Torres Cacoullos ), only this time using purely synchronic data. This report, then, conﬁrms that for Palenquero, grammaticalization theory elegantly accounts for the patterns we observe in the domain of tense-aspect expression. Those theories, though, needed to be put to rigorous test, especially given our synchronic data set. Using accountable quantitative methods, we discerned patterns in the data which could then be analysed in light of strong cross-linguistic trends. We did this on the strength of the unidirectionality principle. We established that strong claims of grammaticalization could not be made for asé using only formal criteria, such as its formal similarity to Spanish hacer. While it is true that the patterns revealed through quantitative analysis can only be discerned by examining formal properties,




we could not rely on those alone. Thus, clearly mapping out the distributions of variant forms was a crucial ﬁrst step in our analysis. Once this was done, though, the incorporation of semantic criteria (form–function asymmetry), making the correct synchronic predictions, and deﬁning the appropriate variable contexts for developing habituals (i.e. comparing present with past temporal reference) were methodological imperatives. Fine-grained descriptions of the grammar allowed us to reﬁne our understanding of how grammaticalization is proceeding in this creole language.

. CONCLUSION

We find, then, that these data lend no support to creole exceptionalism, based on presumed distinctive features of their tense-aspect system. To the contrary, using a typologically informed, usage-based approach, along with the empirical rigour of the variationist method, I conclude that Palenquero is behaving no differently in the realm of tense and aspect than any other world language, despite its classification as a creole. By carefully investigating form–function asymmetry using accountable data mining and analytical procedures, and by preferring diachronic explanations to purely synchronic ones, we can accomplish much in this regard. As I have argued here, addressing the question of grammaticalization in creoles is a matter of adopting the right methodology for the task at hand. But it also requires that we theorize about grammaticalization in novel ways; in the present case, it meant trying to capture change in its dynamism while not merely viewing it as an abstract diachronic process. The gradualness of change in ‘linguistic time’ allows for a fundamental rethinking of the diachrony/synchrony interface as it pertains to the study of diachronically related contemporaneous variables. Finally, the grammaticalizing morpheme must be properly contextualized. I agree with Poplack:

The existence of layering entails that alongside the grammaticalizing form, other variant forms will be jockeying for the same linguistic work. Yet grammaticalization is usually construed, and studied, as the set of changes involved in the association of one (emphasis in original) form with a new (presumably more grammatical) meaning or function, downplaying, or even ignoring the role of other layers coexisting in that context. (Poplack : )

This study demonstrates how variationists and creolists form a natural alliance, as suggested by Sankoff () years ago. Instead of taking grammaticalization a priori, here we dug deep into the linguistic data in light of various facets of grammaticalization theory. I hope that this research contributes to countering categorical perception, and to demonstrating the systematicity of creoles in the structure of linguistic variation.

19 Is grammaticalization in creoles different? JOHN H. McWHORTER

. INTRODUCTION

It has often been thought that creole languages should be of especial interest to scholars of grammaticalization. The idea has been that their development from makeshift jargons or pidgins into new languages provided a particularly fertile ground for grammaticalization: a ‘laboratory’, as it has often been phrased. However, relatively little work has emerged from the laboratory in question. The most prominent work on grammaticalization in creoles has yielded two main conclusions, neither of which has stimulated much further investigation. One is that grammaticalization has provided fundamental syntactic and semantic distinctions lacking from the creole’s precursor, characterized either as a pidgin or as a variety in which, at least, ‘there had been large-scale loss of morphosyntactic elements initially, and a reduction of the grammatical system’ (Bruyn : ). A classic example is Sankoff and LaBerge’s () demonstration of the emergence of future marker bai in Tok Pisin from an adverb baimbai, initially preposed to the subject but moving into the VP as part of its grammaticalization (Table .). However, there has been little work in this vein since. Part of the reason is the influential idea that most creoles did not even emerge from pidgins at all (Mufwene ; DeGraff ), which fosters a focus on creoles as simply the product of mixture rather than interrupted transmission. Also, Derek Bickerton’s bioprogram hypothesis () focused attention during the s and s on the emergence of such core grammatical features via children’s spontaneous expression of structures specified by Universal Grammar. Largely in response to this, another school of thought acquired influence which, unintendedly, similarly discouraged sustained interest in grammaticalization in creoles. In response to Bickerton’s claim that the languages of creoles’ creators had

Grammaticalization from a Typological Perspective. First edition. Heiko Narrog and Bernd Heine (eds). This chapter © John H. McWhorter . First published  by Oxford University Press




T .. Grammaticalization of the future marker bai in Tok Pisin Pidgin

baimbai mi go

Pidgin/creole

bai mi go

Creole

mi bai go

no inﬂuence on the structures of the creoles themselves, many creolists focused on identifying substrate inﬂuence in creoles. Amidst these investigations, Bruyn (), for example, stressed that in creoles, what could appear to be grammar-internal processes of grammaticalization was often modelled on the substrate languages—i.e. that grammaticalization could calque a pre-existing construction as well as create a new one. An example would be a say verb grammaticalized as a complementizer, as in Saramaccan, in which táki (elided to táa) has taken on a grammatical function paralleled by the same verb in Fongbe, the Niger-Congo language that the grammar was decisively modelled upon (Saramaccan data collected by the author and assistants): ()

()

Saramaccan: Kobí biíbi táa Báyi ó kó. Kobi believe say Bayi FUT come ‘Kobi believes that Bayi will come.’ Fongbe: Kɔ̀kú ɖì ɖɔ̀ Bàyí wá Koku believe say Bayi come ‘Koku believes that Bayi will come.’ (Lefebvre and Brousseau : )

An unintended consequence of this substrate-based analysis, however, is that the grammaticalization in question ceases to seem ‘interesting’ to most linguists in itself. Grammaticalization that recapitulates features that another language already had seems less worthy of analysis than grammaticalization that creates something unknown before and possibly even unexpected. In this chapter I will argue that in fact, most grammaticalization in creoles actually is, as it were, ‘interesting’. For one,

(i) In creoles, most grammaticalization does not provide core grammatical functions as a reparative strategy.
(ii) In creoles, most grammaticalization does not parallel substrate precursors.

Rather, as was suspected decades ago, creole languages are indeed laboratories of grammaticalization: specifically, grammaticalization of assorted features out of many possible, quite independently of the grammars spoken by the people who created the language. My argument will complement and expand on that of Heine




and Kuteva (), which similarly demonstrates that grammaticalization is hardly limited to calquing of substrate features. My observations will also show that: (iii) There is no such thing as a ‘creole style‘ of grammaticalization; the process proceeds according to the same mechanisms and tendencies as it does in older languages. However, (iv) Grammaticalization has occurred to a greater degree in creole languages’ lifespans, which supports the theory that creoles emerge from pidgins, rather than as simply a mixture of grammars (Mufwene ; Aboh ). I will base my analysis on Saramaccan, a creole lexiﬁed by English at its outset when the English colonized Surinam, and then partially relexiﬁed by Portuguese when Portuguese-speaking Jews established plantations in the colony fourteen years after the English settlement. The African language that had the strongest impact on Saramaccan structure was the Niger-Congo language Fongbe, in terms of syntax, some morphology, and some core lexical and even grammatical items.

. SUBSTRATE-MODELLED GRAMMATICALIZATION

To be sure, a degree of creoles’ grammaticalization is indeed modelled on substrate languages. It would be unusual if it were not, and there are cases of such grammaticalization in Saramaccan. For example, it is reasonable to suppose that the reason Saramaccan has an opposition between overt definite and indefinite determiners is that Fongbe has the same trait. One might suppose that English (and Portuguese) alone would be enough to furnish this distinction, but many creoles with Western European lexifiers do not preserve the two-way opposition (e.g. Tok Pisin, the Portuguese creoles of the Gulf of Guinea), such that Fongbe likely had at least a strong reinforcing influence in Surinam (Table .: Fongbe data from Lefebvre and Brousseau ).

T .. Deﬁnite and indeﬁnite articles in Fongbe (Lefebvre and Brousseau : –) and Saramaccan (own data) Fongbe ví

ɔ́

Saramaccan dí

míi

child DEF

DEF child

‘the child’

‘the child’

ví

ɖé

wã́

míi

child INDEF

INDEF child

‘a child’

‘a child’




Less obvious, but just as likely, is that Saramaccan’s array of expressive pragmatic particles is modelled on those in Fongbe:

()
a. I kέ baláki ɔ́?                  BASIC INTERROGATIVE
   S want vomit Q
   ‘Do you want to throw up?’
b. I kέ baláki nɔ́?                 PROBING INTERROGATIVE
   S want vomit Q
   ‘Now, is it that you need to throw up, sweetie?’
c. A taku e!                       EXCLAMATIVE
   S ugly INJ
   ‘He’s ugly!’

This compares neatly with Fongbe’s apparatus (Lefebvre and Brousseau : –):

()
a. Kɔ̀kú xɔ̀ àsɔ́n lέ à?              BASIC INTERROGATIVE
   Koku buy crab PL Q
   ‘Has Koku bought the crabs?’
b. Kɔ̀kú xɔ̀ àsɔ́n lέ cé?             PROBING INTERROGATIVE
   Koku buy crab PL Q
   ‘Has Koku really bought the crabs?’
c. Kɔ̀kú xɔ̀ àsɔ́n lέ lá!             EXCLAMATIVE
   Koku buy crab PL INJ
   ‘Koku bought the crabs!’

Saramaccan’s pair of interrogative markers resulted from the grammaticalization of nɔ́ɔmɔ from ‘no more’ (in the modern language nɔ́ɔmɔ means ‘always’) via this pathway:

nɔ́ɔmɔ ‘no more’
> nɔ́ɔ ‘only’
> nɔ́ interrogative marker (with a minimizing, placating pragmatic implication derivable from the semantics of ‘only’ [cf. b])
> ɔ́ obligatory (i.e. fully grammaticalized) yes/no interrogative marker

The result yielded two markers paralleling the equivalent Fongbe ones in their contrasting interrogative strength. The grammaticalization could easily have yielded simply one interrogative marker. Instead, it yielded two, with their meanings assigned to two stages in the transformation of the self-same original source word. The influence of some outside factor seems almost self-evident, and makes a transfer account from Fongbe especially likely. Also, there are other grammaticalized usages of verbs in serial constructions which can reasonably be traced to substrate models, such as that described for the say verb in () (Lord ), the use of give as a dative and benefactive marker, pass as a comparative marker, and go and come as directionals, and the go usage illustrated in ():



()

Fongbe:
a. Kɔ̀kú sɔ́ àsɔ́n yì àxì mέ.
   Koku take crab go market in
   ‘Koku brought crab to the market.’ (Lefebvre and Brousseau : )
Saramaccan:
b. Mi tá tjá wáta gó butá a dí gbóto déndu.
   S IMPV carry water go put LOC the boat inside
   ‘I am carrying water into the boat.’ (Lefebvre and Brousseau : )

The parallels here are so close that their source in substrate influence is obvious (cf. Migge ); at the least, the burden of proof is upon those who would question it, and attempts such as Byrne () have not been successful. However, these substrate-modelled grammaticalizations by no means exhaust evidence of grammaticalization in Saramaccan, and in fact, grammar-internal grammaticalizations in Saramaccan have outnumbered them. These latter could therefore be treated as the norm.

. GRAMMAR-INTERNAL GRAMMATICALIZATION IN SARAMACCAN

.. --  Saramaccan has an array of markers of tense, aspect, and mood. I will show that the vast majority of these are derived from grammaticalization, and in no case can the grammaticalization be traced to substrate sources. Typically, these markers in Saramaccan are presented, as for creoles generally, according to this three-way opposition between an anterior, imperfective, and future marker (Table .). Lefebvre (: –) claims, although not arguing for grammaticalization speciﬁcally, that Saramaccan’s tense-aspect-mood marking system is calqued on that of Fongbe. However, the argumentation fails. For one, Lefebvre includes in Saramaccan a ‘subjunctive’ marker fu. However, to the extent that fu is used at all in Saramaccan as a preverbal mood marker (which is slight: cf. McWhorter and Good : ), its semantics are those of obligation, not the hypothetical. This leaves just Saramaccan’s bi, ó, and tá from Lefebvre’s comparison, and these leave no signiﬁcant T .. Anterior, imperfective, and future markers in Saramaccan with wáka ‘walk’ TMA construction

Source

Translation

bi wáka

> been

‘walked’

tá wáka

> earlier tan ‘stand’

‘is walking’

ó wáka

> earlier gó ‘go’

‘will walk’




T .. Aspect and mood markers in Saramaccan beyond the ‘big three’ Construction

Source

Translation

ló wáka

> lóbi ‘love’

‘walks (as a habit)’

náa wáka

> tá a ‘stand at’

‘used to walk’¹

sá wáka

> sábi ‘know’

‘can walk’

mú wáka

> músu ‘must’

‘must walk’

wáka kaa

> kabá ‘ﬁnish’

‘done walking’

wáka gó dóu

> ‘go through’

‘keep walking’

congruence between the Saramaccan and Fongbe apparatuses. In Fongbe, Lefebvre documents an actual subjunctive marker, markers for both definite and indefinite future, and a habitual marker: none of these features parallels Saramaccan. Also, the markers tá and ó derive from the grammaticalization of verbs, but their equivalents in Fongbe (cf. Lefebvre and Brousseau :  and passim) are not reflexes (synchronically, at least) of the equivalent verbs. There is no meaningful likeness, then, between tense-aspect-mood marking in Saramaccan and that in Fongbe. An argument for substrate influence here could not pass muster among language contact specialists in general. Bickerton’s () analysis attributes these three Saramaccan markers to universal specifications, but this proposal runs aground on the fact that this ‘classic’ presentation of Saramaccan’s tense-aspect-mood markers as somehow centred on a ‘big three’ is arbitrarily partial. In the actual language, there is no self-standing reason why these three markers should be isolated as a ‘core’. Rather, they are three in a rich array, of which the other markers include those shown in Table .. All of these markers are derived from the grammaticalization of full verbs, or such verbs used in constructions. Of them, Fongbe provides a model only for the grammaticalization of finish as a completive (Lefebvre and Brousseau : –). Crucially, then, overall, Saramaccan’s array of tense-aspect-mood markers emerged independently of contact factors.

..    Saramaccan marks new information in a more overtly pragmaticized fashion than any of its source languages. As I argue in McWhorter (), the item nɔ́ɔ marks new

¹ The phonetic pathway from tá a to náa is paralleled by the irregular form nángó from tá gó IMPV go ‘going’.




information in a sentence such as (a), from a conversation that ended with the speaker noting that he would be ready for a future phone call: ()

a. A búnu. Nɔ́ɔ mi ó tá háika i.
   S good NI S FUT IMPV listen S
   ‘Good. So I’ll be listening for you (waiting for your answer).’

Indicatively, the next day, in an exchange reiterating the arrangement, the speaker did not use nɔ́ɔ because the information was no longer new:

b. A búnu, mi ó tá háika dí kái fii tidé néti.
   S good S FUT IMPV listen DEF call POSS.S today night
   ‘Good, I’ll be listening (waiting) for your call tonight.’

This pragmatic marker punctuates Saramaccan speech, to an extent that can defy translation, often approximately rendered as ‘then’. However, that translation is infelicitous in examples such as (): ()

Dí wíki dí bì pasá dέ, mέ bì sá pεέ, ma nɔ́ɔ
DEF week REL PAST pass there S.NEG PAST can play but NI
mi kó tá pεέ báka.
S come IMPV play again
‘Last week I couldn’t play, but now I’m playing again.’

The translation ‘now’ may seem tempting as well. However, besides the fact that the ‘citation’ word for ‘now’ in Saramaccan is nɔ́unɔ́u, nɔ́ɔ is especially conventionalized in marking matrix clauses that occur after preceding adverbial complements, the matrix clause containing the new information. This is the case with temporal and causal complements, for example: ()

Té mujέε sí Kobí, nɔ́ɔ de tá kulé.
when woman see Kobi NI they IMPV run
‘When women see Kobi, they run.’

()

Nda dí wági ná u mi, nɔ́ɔ i ó páka fε̃ε̃.
since DEF car NEG POSS S NI S FUT pay for.SO
‘You’re going to pay for the car, since it is not mine.’

This new information marking is tense-sensitive: in the past tense the marker is hε̃́ (derived from then):

() Dí a bi tá duúmi, hε̃́ mi gó . . .
   when he PAST IMPV sleep NI S go
   ‘When he was sleeping, I left.’

However, this is only when the action is temporally bounded; if the aspect is imperfective, then even in the past the marker is nɔ́ɔ:

() Dí mi bi kó lúku de, nɔ́ɔ de bi duumí kaa.
   when S PAST come look them NI they PAST sleep already
   ‘When I came to see them, they were asleep.’




The source of this foregrounding reflex of nɔ́ɔ is the same nɔ́ɔmɔ from ‘no more’ that was elsewhere the source of the interrogative markers described above in (). Evidence includes that even in the modern language the full form nɔ́ɔmɔ can occur variably as a new information marker:

() Nɔ́ɔmɔ déé mbétimbéti túu kái ε̃.
   NI DEF.PL animal.RDP all call SO
   ‘So all of the various animals called him.’ (Price and Price n.d.: )

and that in documents of an early stage of Saramaccan’s source and sister language Sranan (Schumann ), ‘no more’ is used in the same function:

() No morro hulanga tem ju sa libi dea?
   no more how-long time you FUT live there
   ‘So how long are you going to live hereabouts?’ (trans. from German: Wie lange wirst du dich hier aufhalten?)

Crucially, however, there is no equivalent marker, or similar overt new information marking system at all, documented in Fongbe. Saramaccan developed this apparatus (a markedly overt system of information structure marking) via the grammaticalization of ‘no more’ independently of contact.

..  Traditionally, copulas in Saramaccan and other English-based creoles are presented as dividing the equative and the locative, as demonstrated in (a,b): () a. Equative: Mi da dí kabiténi. S be DEF captain ‘I am the captain.’ b. Locative: Mέíki ku wı̃ ́ bì dέ a táfa líba. milk with wine PAST be LOC table top ‘Milk and wine were on the table.’ There is a similar division of labour between the ‘be’-verbs in Fongbe (Lefebvre and Brousseau : –), as shown in (a,b): () a. Equative: Ùn nyí Àfíáví. S be Aﬁavi ‘I am Aﬁavi.’ b. Locative: Wémâ ɔ́ ɖɔ́ távò jí. book DEF be.at table on ‘The book is on the table.’




T .. Pairs of equative and locative copulas cross-linguistically Equative

Locative

Vietnamese

là

o

(Thompson )

Nama

‘a

hàa

(Hagman )

Hawaiian

he

aia

(Hawkins )

Chinese

shì

zài

(Li and Thompson )

CiBemba

ni

lì

(Sadler )

As demonstrated in McWhorter (), the Saramaccan copulas are products of grammaticalization: da from dati ‘that’, and dέ from ‘there’. In the case of da, the derivation is clear from the phonetic resemblance to the Surinam creole distal demonstrative dati combined with the cross-linguistic tendency for copulas to be derived from pronominal elements on the left edge of comments in topic–comment constructions (Li and Thompson ). In the case of dέ, economy alone clinches the case in that the word for ‘there’ in Saramaccan is identical. However, despite appearances (and thus, traditional accounts such as Holm  and Migge ), this grammaticalization was not modelled on substrate equivalents. For one, the resemblance in this area between Saramaccan and Fongbe is much less specific, and much less cross-linguistically idiosyncratic, than often supposed. Secondly, a division of labour between equative and locative copulas is quite common in languages outside Europe, such that the parallel between creoles and certain West African languages is not as decisive a case for transfer as is often thought. Note that for each language in Table ., the copula configuration is quite often similar in its close relatives as well. Also, the West African copula type quite often did not transfer into creoles, likely because copulas are of especially low semantic content and functional load, and therefore less likely to be reproduced in a makeshift pidgin/Basic Variety level of language. In French-based creoles also created by speakers of Fongbe (and related languages), despite various transfers from Fongbe grammar, there is no locative copula, as in Haitian (Malis anba tab-la ‘Malis is under the table’: DeGraff : ). In the Portuguese creoles of the Gulf of Guinea, while their main grammatical model was the Niger-Congo language Edo (Hagemeijer and Ogie ), which has several distinct copulas for the equative and elsewhere (cf. Baker : ), the creoles have a single copula for the equative, locative, and beyond (e.g. Principense: Maurer : –). Moreover, in Saramaccan, equation is indicated only partially by da, when the equation is a relationship of identification. Otherwise, the dέ copula is used for equation; it is not a specifically ‘locative’ copula in the Saramaccan system, as can be seen in (a,b):

Is grammaticalization in creoles different? () a. Mi da dí kabiténi. S be DEF captain ‘I am the captain.’ b. Mέíki dέ wã ́ soní dí míi tá bebé. milk be a thing REL child IMPV drink ‘Milk is something that children drink.’



IDENTIFICATIONAL EQUATIVE

CLASS EQUATIVE

Early documentation of Saramaccan reveals that this was even more the case at its emergence, when dέ was the only copula, used even for identificational equation, as in this th-century sentence: Mi de Christian Grego Aliedja ‘I am Christian Grego of Aliedja’ (Arends and Perl : , ). The Saramaccan copulas arose via grammaticalization amidst the reanalysis of topic–comment structures into subject–predicate ones, along the lines of a process documented in many languages worldwide such as Mandarin Chinese and Hebrew (Li and Thompson ). What began as a demonstrative serving as the subject after a topic became an empty placeholder between a topic and a predicate, and the topic was reanalysed as a subject, as I demonstrate via reconstruction in (a,b):

() a. [Kobí]    [da]       -copula   [mí tatá]
      Kobi      that                 my father
      TOPIC     SUBJECT              PREDICATE
      ‘Kofi, that/he is my father.’
   b. [Kobí]    [da]       [mí tatá]
      Kobi      is         my father
      SUBJECT   COPULA     PREDICATE
      ‘Kofi is my father.’

This process was unconnected to Fongbe: yet another indication is that the Fongbe copulas have no perceptible source in demonstratives or deictics.

..   Saramaccan has a negator morpheme á that negates predicates (cf. (a)), as opposed to clauses, which use ná (cf. (b,c)): () a. Kobí á tɔ́tɔ dí wómi túwε. Kobi NEG push DEF man throw ‘Kobi didn’t push the man down.’ b. Ná mí dú ε̃. NEG S do SO ‘It’s not me who did it.’ c. Ná dú ε̃! NEG do SO ‘Don’t do it!’ The conditioning of á is additionally speciﬁc in that it is inapplicable to non-verbal predicates:



John H. McWhorter

() Mi ná (*á) í máti. S NEG your friend ‘I’m not your friend.’ This morpheme á and its conditioning is, like the Saramaccan copulas, the result of a reanalysis and grammaticalization (cf. McWhorter : – for details.) Originally, the negator was ná in all of the above-mentioned contexts including verbal predicates. In the third person singular in topic–comment constructions, ná was used thus: Stage : Kobí, a ná wáka. ‘Kobi, he doesn’t walk.’ Phonetic erosion amidst the reanalysis of the topic–comment construction into a subject–predicate one yielded a portmanteau third-person singular negator ã :́ Stage : Kobí, ã ́ wáka. ‘Kobi, he doesn’t walk.’ Further phonetic erosion denasalized the morpheme, upon which within the new subject–predicate construction it was á, with no pronominal content or function remaining: Stage : Kobí á wáka. ‘Kobi doesn’t walk.’ Despite its source in a third-person singular construction, today á is further conventionalized in being grammatical in all persons and numbers, this generalization being diagnostic of grammaticalization: Stage : Mi á wáka. ‘I don’t walk.’ Fongbe has no model for this particular division of labour between negators. This grammaticalization was local to Saramaccan.

..   Saramaccan has grammaticalized the verb púu ‘pull’ to mark direction away from an object: () A gó féki dí keéti púu a dí bɔ́utu. S go dust.off DEF chalk pull LOC DEF blackboard ‘He’s going to wash the chalk off of the blackboard.’ () A pusá dí miíi púu a dí nέsi. S push DEF child pull LOC DEF nest ‘It pushed the child out of the nest.’ Intransitively, púu is used as the ﬁrst verb in a serial, modifying the main action: () Dí fuúta púu kaí. DEF fruit pull fall ‘The fruit fell (down from the tree).’

Is grammaticalization in creoles different?



However, intransitive emergence, as in movement speciﬁcally ‘out from’ a location, is indicated with kumútu derived from ‘come out’: () A wáka kumútu a dí wósu. S walk come.out LOC DEF house ‘He walked out of the house.’ While Fongbe has models for the grammaticalization of many Saramaccan verbs in serial constructions, this valence-sensitive usage of púu and kumútu is derived from grammaticalization internal to Saramaccan. Fongbe’s pull verb has undergone no similar transformation; ablativity and emergence are encoded with an adposition (Lefebvre and Brousseau : ).

. IMPLICATIONS

..            These observations show that grammaticalization is creoles occurs far beyond instances that simply reproduce grammaticalizations that had already occurred in the languages of their creators. Substrate models in no sense constrain what grammaticalization occurs in the development of a creole. An investigation of grammaticalization in creoles that focuses on substrate models addresses but a fraction of the process.

..      The previous section reveals a language in which more grammaticalization has occurred within a brief period of time than is typical of language change. This becomes especially clear given certain genealogical details about Saramaccan’s history. Saramaccan is an offshoot of a precursor creole, Sranan: the creoles’ grammars are as similar as those of the mainland Scandinavian languages. Sranan cannot have existed in any form before the ﬁrst half of the seventeenth century, when the English began colonizing the New World, and speciﬁcally the Caribbean area (McWhorter : –); most analysts trace it to the English colonization of Surinam in . Saramaccan developed via the partial relexiﬁcation of Sranan by Portuguese starting in the s; no modern specialist would date its emergence after the s, when slaves from Portuguese plantations began forming maroon communities in the interior of Surinam. Some of the structures in section . are shared by Sranan and Saramaccan, and yet are too idiosyncratic in terms of lexical source and/or grammatical behaviour to have emerged independently. This is important because it means that these structures must have emerged not just within the c. years of Saramaccan’s existence,



John H. McWhorter

but in Sranan itself—and therefore within the window between Sranan’s emergence and Saramaccan’s. This cannot have been longer than a few decades (and may have been briefer). Within this period, it can be assumed that Sranan developed many of the tense-aspect-mood markers, the two copula morphemes, and the use of the pull and come out verbs as ablative markers. The new information marking, predicate negator, and some of the tense-aspectmood markers are developments local to Saramaccan, however. Given the absence of most of these items in the earliest documents of Saramaccan of the late th century, it is likely that they emerged or became entrenched during the th century—a roughly -year window. In the known history of English, there has been neither a period of a few decades nor even a period of  years during which grammaticalization has proceeded at the pace it did in these Surinam creoles, nor has there ever been even a -year stretch during which English underwent so much grammaticalization. This suggests that in creoles, at least early in their life-spans, there is indeed more grammaticalization than under ordinary processes of grammar-internal change.

..  ? However, the evidence does not suggest that the ﬂood of grammaticalization served to make Saramaccan a full language—or, phrased more formally, to express functional categories basic to Universal Grammar. Rather, what stands out about the array of grammaticalizations that occurred in Saramaccan is how marginal most of them would seem to be to the basic needs of communication: . The tense-aspect-mood markers appear, from a European perspective, fundamental. However, many languages (such as Chinese and Indonesian) leave the past and the future to context to a degree unknown in Saramaccan. Many languages lack an overt marker of the habitual (such as English, in the present), and certainly a dedicated marker of the past habitual is hardly central to communication (as it was not in English until the grammaticalization of the used to construction). . Saramaccan’s new information-marking apparatus is unusually entrenched in the cross-linguistic sense. Such pragmatic distinctions are typically conveyed by intonation. Even Saramaccan’s sister creoles, Sranan and Ndjuka, have not pragmaticized the relevant morphemes to the extent that Saramaccan happens to have done. . Copulas are semantically light or even (especially in the case of equative ones) empty items, absent in as many languages as they are present in. The development of be verbs cannot be analysed as part of a transformation from pidgin to language. Copulas are, in broad view, accidents—erstwhile lexical items bleached of content and left behind as syntactic place-ﬁllers. . Obviously having a negator morpheme dedicated to verbal (and not nonverbal) predicates is not a necessary distinction to a language.

Is grammaticalization in creoles different?



. The development of ablative markers from verbs did not ﬁll a semantic or syntactic gap. Typically, the main verb in the construction, combined with context, indicates the direction of the action (in the examples in .., dust off, push, fall, exit). Rather, the grammaticalization of púu and kumútu lent an increase in explicitness—but one that a great many languages do without with no apparent disadvantage. To wit, more grammaticalization occurred in Saramaccan than occurs in older languages’ evolution within the few hundred years that Saramaccan has existed. However, this grammaticalization cannot be analysed as having made Saramaccan a viable language, because a language with none of the relevant features would serve all of the needs of human communication. Rather, this grammaticalization occurred in a language which had already been serving as a people’s native vehicle of communication. This demonstrates the extent to which all languages are vastly overspeciﬁed in terms of necessity. The proliﬁc grammaticalization in Saramaccan’s short history shows that this overspeciﬁcation is a natural trait of human language, and commences at a language’s very beginnings.

..    ‘’    These data do not demonstrate that grammaticalization in creoles is of a different nature, in itself, from grammaticalization in other languages. Grammaticalization has indeed occurred to an unusually vast degree in the few centuries that most creoles are known to have existed, such that it is reasonable to state that rampant grammaticalization is a deﬁning feature of languages born from pidgins and reconstituted as new languages. Nevertheless, in terms of grammaticalization as a process, creoles offer no insights that could not be gained from other languages. No development in Saramaccan (or in other creoles that I am aware of) exempliﬁes a process, trend, or directionality counter to the grammaticalization process as documented in languages around the world over the past four decades.

..      However, these ﬁndings do not support the idea that nothing distinguishes creole genesis from language contact in general, as proposed by Mufwene (). The degree of grammaticalization in Saramaccan’s short history supports the traditional idea that creoles emerge from pidgin, or pidgin-like varieties. Early Saramaccan would appear to have offered more space for grammaticalization to proceed into than language ordinarily does. We seek an explanation for why that space existed. A ready explanation is that Saramaccan emerged from a speech variety heavily affected by adult acquisition, to an extent that left it grammatically much less elaborated than any older language. That is, Saramaccan, as a creole, emerged from what some analysts will prefer to call pidginization, while others (perhaps more) will prefer to term a Basic Variety level of language (cf. Klein



John H. McWhorter

and Perdue ; Plag a, b). The circumstances were ripe for the emergence of entire new paradigmatic systems and overt markings of semantic categories many languages leave to context. These ﬁndings are incompatible with Mufwene’s ‘feature selection’ model of creole genesis, under which creoles result from the same processes of language contact as other languages undergo, with no more or less interruption of normal transmission than yielded other ‘mixed’ languages such as Romanian. If Saramaccan is simply a mixture of English and Fongbe grammar, as Aboh () argues for Saramaccan’s progenitor creole, Sranan, then there is no explanation for why grammaticalization has proceeded more rampantly in its few centuries of existence than during any three-century period in the recorded existence of English itself. This rate of grammaticalization reveals creole genesis as more than simply contact between languages. Après contact, le déluge, it would seem.

References Aaron, Jessi. . ‘Pushing the envelope: looking beyond the variable context.’ Language Variation and Change : –. Abbi, Anvita. . Endangered languages of the Andaman Islands. Munich: Lincom Europa. Aboh, Enoch O. . ‘Competition and selection: that’s all!’ In Enoch O. Aboh and Norval Smith (eds), Complex processes in new languages, –. Amsterdam: Benjamins. Aboh, Enoch O. . The emergence of hybrid grammars: language contact and change. Cambridge: Cambridge University Press. Académie tahitienne. . Grammaire de la langue tahitienne. Tahiti: Fare Vāna’a. Acosta, Diego de. . ‘Rethinking the genesis of the Romance periphrastic perfect.’ Diachronica (): –. Adams Lichlan, Patsy, and Stephen A. Marlett. . ‘Madija noun morphology.’ International Journal of American Linguistics : –. Adibifar, Shirin. . ‘Persian.’ In Geoffrey Haig and Stefan Schnell (eds), Multi-CAST (Multilingual Corpus of Annotated Spoken Texts): https://lac.uni-koeln.de/multicastpersian/. Last accessed  Oct. . Admiraal, Femmy, and Swintha Danielsen. . ‘Productive compounding in Baure (Arawakan).’ In Swintha Danielsen, Katja Hannss, and Fernando Zúñiga (eds), Word formation in South American languages, –. Amsterdam: Benjamins. Ahn, Joo Hoh. . Hankwuke myengsauy mwunpephwa yenkwu [A study of the phenomena of grammaticalization in the Korean noun]. PhD dissertation, Yonsei University, Seoul: Hankook. Aikhenvald, Alexandra Y. . ‘Warekena.’ In Desmond C. Derbyshire and Geoffrey K. Pullum (eds), Handbook of Amazonian languages, vol. , –. Berlin: Mouton de Gruyter. Aikhenvald, Alexandra Y. a. ‘Areal typology and grammaticalization: the emergence of new verbal morphology in an obsolescent language.’ In Spike Gildea (ed.), Reconstructing grammar: comparative linguistics and grammaticalization, –. Amsterdam: Benjamins. Aikhenvald, Alexandra Y. b. Classiﬁers: a typology of noun categorization devices. 
Oxford: Oxford University Press. Aikhenvald, Alexandra Y. . ‘Areal diffusion, genetic inheritance and problems of subgrouping: a North Arawak case study.’ In Alexandra Y. Aikhenvald and R. M. W. Dixon (eds), Areal diffusion and genetic inheritance: problems in comparative linguistics, –. Oxford: Oxford University Press. Aikhenvald, Alexandra Y. . Language contact in Amazonia. Oxford: Oxford University Press. Aikhenvald, Alexandra Y. . A grammar of Tariana, from northwest Amazonia. Cambridge: Cambridge University Press. Aikhenvald, Alexandra Y. . ‘Classiﬁers and noun classes: semantics.’ In Encyclopedia of languages and linguistics, –. Oxford: Elsevier.




Aikhenvald, Alexandra Y. . ‘Grammars in contact: a cross-linguistic perspective.’ In Alexandra Y. Aikhenvald and R. M. W. Dixon (eds), Grammars in contact: a cross-linguistic typology, –. Oxford: Oxford University Press. Aikhenvald, Alexandra Y. a. ‘The grammaticalization of evidentiality.’ In Bernd Heine and Heiko Narrog (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Aikhenvald, Alexandra Y. b. ‘Areal features and linguistic areas: contact-induced change and geographical typology.’ In Osamu Hieda, Christa König, and Hirosi Nakagawa (eds), Geographical typology and linguistic areas, with special focus on Africa, –. Amsterdam: Benjamins. Aikhenvald, Alexandra Y. . The languages of the Amazon. Oxford: Oxford University Press. Aikhenvald, Alexandra Y. . ‘Areal diffusion and parallelism in drift: shared grammaticalization patterns.’ In Martine Robbeets and Hubert Cuyckens (eds), Shared grammaticalization: with special focus on the Traneurasian languages, –. Amsterdam: Benjamins. Aikhenvald, Alexandra Y. . ‘Language contact and language blend: Kumandene Tariana of north-west Amazonia.’ International Journal of American Linguistics : –. Aikhenvald, Alexandra Y., and R. M. W. Dixon (eds) . Grammars in contact: a crosslinguistic typology. Oxford: Oxford University Press. Aikhenvald, Alexandra Y., and Diana Green. . ‘Palikur and the typology of classiﬁers.’ Anthropological Linguistics : –. Alekseev, Mixail E. . Voprosy sravnitel´no-istoričeskoj grammatiki lezginskix jazykov. Morfologija. Sintaksis [The problems of comparative-historical reconstruction of the grammar of Lezgic languages. Morphology. Syntax]. Moscow: Nauka. Alm-Arvius, Christina. . The English verb SEE: a study in multiple meaning. Gothenberg. Acta Universitatis Gothoburgensis. Amberber, Mengistu, Mark Harvey, and Brett Baker (eds) . 
Complex predicates: crosslinguistic perspectives on event structure. Cambridge: Cambridge University Press. Anderson, Gregory D. S. . The Munda verb: typological perspectives. Berlin: Mouton de Gruyter. Anderson, Gregory D. S. . ‘Introduction to the Munda languages.’ In Gregory D. S. Anderson (ed.), The Munda languages, –. London: Routledge. Anderson, Stephen R. . Aspects of the theory of clitics. Oxford: Oxford University Press. Ansaldo, Umberto. . Comparative constructions in Sinitic: areal typology and patterns of grammaticalization. PhD dissertation, Stockholm University. Ansaldo, Umberto. . ‘Serial verb constructions.’ In Keith Brown (ed.), Encyclopedia of language and linguistics, nd edn, vol. , –. Amsterdam: Elsevier. Ansaldo, Umberto. . Contact languages: ecology and evolution in Asia. Cambridge: Cambridge University Press. Ansaldo, Umberto. . ‘Surpass comparatives in Sinitic and beyond: typology and grammaticalization.’ Linguistics (): –. Ansaldo, Umberto, and Lisa Lim. . ‘Phonetic absence as syntactic prominence: grammaticalization in isolating tonal languages.’ In Olga Fischer, Muriel Norde, and Harry Peridon (eds). Up and down the cline: the nature of grammaticalization, –. Amsterdam: Benjamins. Arabic Corpus Tool. http://arabicorpus.byu.edu.




Arcodia, Giorgio Francesco. . ‘Grammaticalization with coevolution of form and meaning in East Asia? Evidence from Sinitic.’ Language Sciences : –. Arcodia, Giorgio Francesco. . ‘More on the morphological typology of Sinitic.’ Bulletin of Chinese Linguistics : –. Arends, Jacques, and Adrienne Bruyn. . ‘Gradualist and developmental hypotheses.’ In Jacques Arends, Pieter Muysken, and Norval Smith (eds), Pidgins and creoles: an introduction, –.Amsterdam: Benjamins. Arends, Jacques, and Matthias Perl (eds) . Early Suriname creole texts: a collection of th-century Sranan and Saramaccan documents. Frankfurt: Vervuert. Ariel, Mira. . ‘The development of person agreement markers: from pronoun to higher accessibility markers.’ In Michael Barlow and Suzanne Kemmer (eds), Usage-based models of language, –. Stanford, Calif.: Center for the Study of Language and Information. Aristar, Anthony Rodrigues. . ‘On diachronic sources and synchronic pattern: an investigation into the origin of linguistic universals.’ Language (): –. Arkadiev, Peter M. . ‘Differential argument marking in two-term case systems and its implications for the general theory of case marking.’ In Helen de Hoop and Peter de Swart (eds), Differential subject marking, –. Dordrecht: Springer. Arkadiev, Peter M., and Alexander B. Letuchiy. . ‘Preﬁxes and sufﬁxes in the Adyghe polysynthetic wordform: types of interaction.’ In Vittorio S. Tomelleri, Manana Topadze, and Anna Lukianowicz (eds), Languages and cultures in the Caucasus, –. Munich, Berlin: Otto Sagner. Aronoff, Mark. . ‘Stems in Latin verbal morphology.’ In Mark Aronoff (ed.), Morphology Now, –. Albany: State University of New York Press. Arsenault, Paul. . ‘Retroﬂexion in South Asia: typological, genetic, and areal patterns.’ Journal of South Asian Languages and Linguistics (): –. DOI ./jsall--. Asher, R. E., and T. C. Kumari. . Malayalam. 
London: Routledge. Austin, Peter. . ‘Switch reference in Australia.’ Language : –. Authier, Gilles. . Éléments de grammaire Kryz (dialecte d’Alik, langue caucasique d’Azerbaïdjan). Paris: Peeters. Awolaye, Y. . ‘Reﬂexivization in Kwa languages.’ In Gerrit Jan Dimmendaal (ed.), Current approaches to African linguistics, vol. , –. Dordrecht: Foris. Bacigalupo, Ana Mariella. . Shamans of the Foye tree: gender, power, and healing among Chilean Mapuche. Austin: University of Texas Press. Backus, Ad, A. Seza Doğruöz, and Bernd Heine. . ‘Salient stages in contact-induced grammatical change: evidence from synchronic vs. diachronic contact situations.’ Language Sciences (): –. DOI: ./j.langsci.... Baird, Louise. . A grammar of Klon: a non-Austronesian language of Alor, Indonesia. Canberra: Paciﬁc Linguistics. Baird, Louise. . ‘Grammaticalisation of asymmetrical serial verb constructions in Klon.’ In Michael C. Ewing and Marian Klamer (eds), Typological and areal analyses: contributions from East Nusantara, –. Canberra: Paciﬁc Linguistics. Baker, Mark. . Incorporation: a theory of grammatical function changing. Chicago: Chicago University Press. Baker, Mark C. . The syntax of agreement and concord. Cambridge: Cambridge University Press.




Baker, Mark, and Carlos Fasola. . ‘Araucanian: Mapudungun.’ In Rochelle Lieber and Pavol Štekauer (eds), The Oxford handbook of compounding, –. Oxford: Oxford University Press. Baker, Philip, and Anand Syea. . Changing meanings, changing functions: papers relating to grammaticalization in contact languages. London: University of Westminster Press. Baldinger, Kurt. . ‘Post- und Prädeterminierung im Französischen.’ In Kurt Baldinger (ed.), Festschrift Walther von Wartburg zum . Geburtstag, –. Tübingen: Max Niemeyer. Balodis, Uldis. . Yuki Grammar, with sketches of Huchnom and Coast Yuki. Berkeley: University of California Press. Bashir, Elena. . ‘The northwest.’ In Hans Heinrich Hock and Elena Bashir (eds), The languages and linguistics of South Asia: a comprehensive guide, –. Berlin: Mouton de Gruyter. Basso, Ellen B. . ‘Compounding in Kalapalo, a southern Cariban language.’ In Swintha Danielsen, Katja Hannss, and Fernando Zúñiga (eds), Word formation in South American languages, –. Amsterdam: Benjamins. Bauer, Winifred. . The Reed reference grammar of Māori. Chatswood, NSW: Reed. Beames, John. []. Comparative grammar of the modern Aryan languages of India, to wit, Hindi, Panjabi, Sindhi, Gujarati, Marathi, Oriya, and Bangali, vol. : The noun and the pronoun. Cambridge: Cambridge University Press. Beckner, Clay, and Joan L. Bybee. . ‘A usage-based model of constituency and reanalysis.’ Language Learning : –. Behaghel, Otto. . Die deutsche Sprache. Leipzig: G. Freytag/Prague: F. Tempsky. Besnier, Niko. . Tuvaluan: a Polynesian language of the Central Paciﬁc. London: Routledge. Bhat, D. N. S. . The prominence of tense, aspect and mood. Amsterdam: Benjamins. Bhattacharjya, Dwijen. . The genesis and development of Nagamese: its social history and linguistic structure. PhD dissertation, City University of New York. Bickerton, Derek. . Dynamics of a creole system. 
Cambridge: Cambridge University Press. Bickerton, Derek. . Roots of language. Ann Arbor, Mich.: Karoma. Bickerton, Derek. . ‘The Language Bioprogram Hypothesis.’ Behavioral and Brain Sciences : –. Bickerton, Derek, and Aquiles Escalante. . ‘Palenquero: a Spanish-based creole of northern Colombia.’ Lingua : –. Bisang, Walter. . Das Verb im Chinesischen, Hmong, Vietnamesischen, Thai und Khmer: vergleichende Grammatik im Rahmen der Verb-serialisierung, der Grammatikalisierung und der Attraktor-positionen. Tübingen: Narr. Bisang, Walter. . ‘Verb serialization and converbs: differences and similarities.’ In Martin Haspelmath and Ekkehard König (eds), Converbs in cross-linguistic perspective: structure and meaning of adverbial verb forms—adverbial participles, gerunds, –. Berlin: Mouton de Gruyter. Bisang, Walter. . ‘Areal typology and grammaticalization: processes of grammaticalization based on nouns and verbs in east and mainland South East Asian languages.’ Studies in Language (): –.




Bisang, Walter. . ‘Grammaticalization without coevolution of form and meaning as an areal phenomenon in east and mainland Southeast Asia: the case of tense-aspect-mood (TAM).’ In Walter Bisang, Nikolaus Himmelmann, and Björn Wiemer (eds), What makes grammaticalization? A look from its components and its fringes, –. Berlin : Mouton de Gruyter. Bisang, Walter. . ‘South East Asia as a linguistic area’ In Keith Brown (ed.), Encyclopedia of language and linguistics, vol. , –. Oxford: Elsevier. Bisang, Walter. . ‘Grammaticalization and the areal factor: the perspective of east and mainland Southeast Asian languages.’ In Maria José López-Couso and Elena Seoane (eds), Rethinking grammaticalization: new perspectives, –. Amsterdam: Benjamins. Bisang, Walter. . ‘On the evolution of complexity: sometimes less is more in east and mainland Southeast Asia.’ In Geoffrey Sampson, David Gil, and Peter Trudgill (eds), Language complexity as an evolving variable, –. Oxford: Oxford University Press. Bisang, Walter. . ‘Grammaticalization in Chinese: a construction-based account.’ In Elizabeth Closs Traugott and Graeme Trousdale (eds), Gradience, gradualness and grammaticalization, –. Amsterdam: Benjamins. Bisang, Walter. . ‘Grammaticalization and typology.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Bisang, Walter. . ‘On the strength of morphological paradigms: a historical account of radical pro-drop.’ In Martine Robbeets and Walter Bisang (eds), Paradigm change in historical reconstruction: the Transeurasian languages and beyond, –. Amsterdam : Benjamins. Bisang, Walter. a. ‘Problems with primary vs. secondary grammaticalization: the case of east and mainland Southeast Asian languages.’ Language Sciences : –. Bisang, Walter. b. 
‘Hidden complexity: the neglected side of complexity and its consequences.’ Linguistics Vanguard (online): ISSN: -X, DOI: ./linvan--. Bisang, Walter. c. ‘Modern Khmer.’ In Jenny Mathias and Paul Sidwell (eds), The handbook of Austroasiatic languages, –. Leiden: Brill. Blake, Barry. . ‘Nominal marking on verbs: some Australian cases.’ Word (): –. Bloomﬁeld, Leonard. . Language. New York: Henry Holt. Blust, Robert Andrew. . ‘Central and central-eastern Malayo-Polynesian.’ Oceanic Linguistics : –. Blythe, Joe. . ‘Preference organization driving structuration: evidence from Australian Aboriginal interaction for pragmatically motivated grammaticalization.’ Language (): –. Bohnacker, Ute, and Somayeh Mohammadi. . ‘Acquiring Persian object marking: Balochi learners of L Persian.’ Orientalia Suecana : –. Boley, Jacqueline. . The Hittite hark-Construction. Innsbruck: Institut für Sprachwissenschaft der Universität Innsbruck. Boley, Jacqueline. . ‘The Hittite periphrastic constructions.’ In O. Carruba (ed.), Per una grammatica ittita, –. Pavia: Gianni Iuculano. Bopp, Franz. . Uber das Conjugationssystem der Sanskritsprache. Frankfurt am Main: Andreäischen.




Borjars, Kersti, and Nigel Vincent. . ‘Grammaticalization and directionality.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Bowden, John. . Behind the preposition: grammaticalisation of locatives in Oceanic languages. Canberra: Paciﬁc Linguistics. Bowern, Claire. . ‘Complex predicates in Australian languages.’ In Harold Koch and Rachel Nordlinger (eds), The languages and linguistics of Australia, –. Berlin: Mouton. Bowern, Claire, and Harold Koch (eds) . Australian languages: classiﬁcation and the comparative method. Amsterdam: Benjamins. Brandão, Ana Paula Barros. . ‘A reference grammar of Paresi-Haliti (Arawak).’ PhD dissertation, University of Texas at Austin. Brenzinger, Matthias, and Iwona Kraska-Szlenk. . ‘The body in language: an introduction.’ In Matthias Brenzinger and Iwona Kraska-Szlenk (eds), The body in language: comparative studies of linguistic embodiment, –. Leiden: Brill. Bresnan, Joan, and Sam Mchombo. . ‘Topic, pronoun and agreement in Chicheŵa.’ Language (): –. Bril, Isabelle. . Le nêlêmwa (Nouvelle-Calédonie): analyse syntaxique et sémantique. Paris: Peeters-Selaf. Bril, Isabelle, and Françoise Ozanne-Rivierre (eds) . Complex predicates in Oceanic language: studies in the dynamics of binding and boundness. Berlin: Mouton de Gruyter. Brinton, Laurel. . ‘The grammaticalization of complex predicates.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Brinton, Laurel J., and Elizabeth Closs Traugott. . Lexicalization and language change. Cambridge: Cambridge University Press. Brockelmann, Carl. [–]. Grundriss der vergleichenden Grammatik der semitischen Sprachen. Berlin: Reuther & Reichard/ New York: Lemcke & Buechner. Brustad, Kristen E. . 
The syntax of spoken Arabic: a comprehensive study of Moroccan, Egyptian, Syrian, and Kuwaiti Dialects. Washington, DC: Georgetown University Press. Bruyn, Adrienne. . ‘On identifying instances of grammaticalization in creole languages.’ In Philip Baker and Anand Syea (eds), Changing meanings, changing functions: papers relating to grammaticalization in contact languages, –. London: Westminster University Press. Bugaeva, Anna. . ‘Mermaid constructions in Ainu.’ In Tasaku Tsunoda (ed.), Adnominal clauses and the ‘mermaid construction’: grammaticalization of nouns, –. Tachikawa, Japan: National Institute for Japanese Language and Linguistics. Burling, Robbins. . ‘The lingua franca cycle: implications for language shift, language change, and language classiﬁcation.’ Anthropological Linguistics (–): –. Burrows, L. . Ho grammar. Calcutta: Catholic Orphan Press. Butt, Miriam, and Tasfeer Ahmed. . ‘The redevelopment of Indo-Aryan case systems from a lexical semantic perspective.’ Morphology : –. Bybee, Joan L. . Morphology: a study of the relation between meaning and form. Amsterdam: Benjamins. Bybee, Joan L. . ‘The diachronic dimension in explanation.’ In John Hawkins (ed.), Explaining language universals, –. Oxford: Blackwell.

Bybee, Joan L. . ‘The grammaticization of zero.’ In William Pagliuca (ed.), Perspectives on grammaticalization, –. Amsterdam: Benjamins. Bybee, Joan L. a. ‘Mechanisms of change in grammaticization: the role of frequency.’ In Brian D. Joseph and Richard D. Janda (eds), The handbook of historical linguistics, –. Oxford: Blackwell. Bybee, Joan L. b. ‘Cognitive processes in grammaticalization.’ In Michael Tomasello (ed.), The new psychology of language: cognitive and functional approaches to language structure, vol. , –. Mahwah, NJ: Lawrence Erlbaum. Bybee, Joan L. a. ‘Language change and universals.’ In Ricardo Mairal and Juana Gil (eds), Linguistic universals, –. Cambridge: Cambridge University Press. Bybee, Joan L. b. ‘From usage to grammar: the mind’s response to repetition.’ Language (): –. Bybee, Joan L. . ‘Formal universals as emergent phenomena: the origins of structure preservation.’ In Jeff Good (ed.), Language universals and language change, –. Oxford: Oxford University Press. Bybee, Joan L. . ‘Language universals and usage-based theory.’ In Morten H. Christiansen et al. (eds), Language universals, –. Oxford: Oxford University Press. Bybee, Joan L. . Language, usage, and cognition. Cambridge: Cambridge University Press. Bybee, Joan L. . ‘Usage-based theory and grammaticalization.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Bybee, Joan L. . Language change. Cambridge: Cambridge University Press. Bybee, Joan L., and Östen Dahl. . ‘The creation of tense and aspect systems in the languages of the world.’ Studies in Language (): –. Bybee, Joan L., and Paul J. Hopper. . Morphology. Amsterdam: Benjamins. Bybee, Joan L., and Paul J. Hopper (eds) . Frequency and the emergence of linguistic structure. Amsterdam: Benjamins. Bybee, Joan L., and William Pagliuca. . 
‘The evolution of future meaning.’ In Papers from the VIIth International Conference on Historical Linguistics, –. Amsterdam: Benjamins. Bybee, Joan L., William Pagliuca, and Revere D. Perkins. . ‘On the asymmetries in the afﬁxation of grammatical material.’ In William Croft et al. (eds). Studies in typology and diachrony: papers presented to Joseph H. Greenberg on his th birthday, –. Amsterdam: Benjamins. Bybee, Joan L., William Pagliuca, and Revere D. Perkins. . ‘Back to the future.’ In Elizabeth C. Traugott and Bernd Heine (eds), Approaches to Grammaticalization, vol. , –. Amsterdam: Benjamins. Bybee, Joan L., Revere D. Perkins, and William Pagliuca. . The evolution of grammar: tense, aspect, and modality in the languages of the world. Chicago: University of Chicago Press. Bybee, Joan L., and Sandra Thompson. . ‘Three frequency effects in syntax.’ Proceedings of the Twenty-Third Annual Meeting of the Berkeley Linguistics Society, –. Bybee, Joan L., and Rena Torres Cacoullos. . ‘The role of prefabs in grammaticization: how the particular and the general interact in language change.’ In Roberta L. Corrigan, Edith A. Moravcsik, Hamid Ouali, and Kathleen Wheatley (eds), Formulaic language: distribution and historical change, vol. , –. Amsterdam: Benjamins.



Byrne, Francis X. . Grammatical relations in a radical creole. Amsterdam: John Benjmains. Cablitz, Gabriele H. . Marquesan: a grammar of space. Berlin: Mouton de Gruyter. Camp, Elizabeth L., and Millicent R. Liccardi. . Diccionario Cavineña–Castellano Castellano– Cavineña con bosquejo de la gramática cavineña. Dallas, Tex.: Summer Institute of Linguistics. Campbell, Lyle. . ‘What’s wrong with grammaticalization?’ Language Sciences : –. Campbell, Lyle, and Martha Muntzel. . ‘The structural consequences of language death.’ In Nancy C. Dorian (ed.), Investigating obsolescence: studies in language contraction and death, –. Cambridge: Cambridge University Press. Chadwick, Neil. . A descriptive study of the Djingili language. Canberra: Australian Institute of Aboriginal Studies. Chafe, Wallace (ed.) . The pear stories: cognitive, cultural, and linguistic aspects of narrative production. Norwood, NJ: Ablex. Chafe, Wallace. . Discourse, consciousness and time: the ﬂow and displacement of conscious experience in speaking and writing. Chicago: Chicago University Press. Chafe, Wallace. . ‘Florescence as a force in grammaticalization.’ In Spike Gildea (ed.), Reconstructing grammar: comparative linguistics and grammaticalization, –. Amsterdam: Benjamins. Chapin, Paul G. . ‘Proto-Polynesian *ai.’ Journal of the Polynesian Society (): –. Chatterji, Suniti Kumar. []. The origin and development of the Bengali language. London: George Allen & Unwin. Chen, Chun-hui. . ‘Bunpōka to shakuyō: Nihongo ni okeru dōshi no chūshikei o fukunda kōchishi ni tsuite’ [Grammaticalization and borrowing: postpositions in Japanese composed from verbs in ren’yō or -te forms]. Nihongo no Kenkyū (): –. Cheung, Johnny. . Etymological dictionary of the Iranian verb. Leiden: Brill. Chirikba, Viacheslav. . Common West Caucasian: the reconstruction of its phonological system and parts of its lexicon and morphology. 
Leiden: Research School CNWS. Chirikba, Viacheslav. . ‘The problem of the Caucasian Sprachbund.’ In Peter Muysken (ed.), From linguistic areas to areal linguistics, –. Amsterdam: Benjamins. Chirikba, Viacheslav. To appear. ‘From north to north west: how north-west Caucasian evolved from north Caucasian.’ Available at: academia.edu. Chung, Siaw-Fong. . ‘Numeral classiﬁer Buah in Malay: a corpus-based study.’ Language and Linguistics (): –. Chung, Taegoo. . Argument structure and serial verbs in Korean. PhD dissertation, University of Texas at Austin. Clackson, James. . Indo-European linguistics: an introduction. Cambridge: Cambridge University Press. Clark, Edward Winter. . Ao–Naga dictionary. Calcutta: Baptist Mission Press. Clark, Thomas Welbourne. . Introduction to Nepali. Cambridge: Heffer. Claudi, Ulrike, and Bernd Heine. . ‘On the metaphorical base of grammar.’ Studies in Language (): –. Clements, G. N., and Annie Rialland. . ‘Africa as a phonological area.’ In Bernd Heine and Derek Nurse (eds), A linguistic geography of Africa, –. Cambridge: Cambridge University Press.

Codrington, Robert, H. . The Melanesian languages. Oxford: Clarendon Press. Comrie, Bernard. . Aspect. Cambridge: Cambridge University Press. Comrie, Bernard. . ‘Morphology and word order reconstruction: problems and prospects.’ In Jaced Fisiak (ed.), Historical morphology, –. The Hague: Mouton. Comrie, Bernard. . ‘Causative verb formation and other verb-deriving morphology.’ In Timothy Shopen (ed.), Language typology and syntactic description, vol. : Grammatical categories and the lexicon, –. Cambridge: Cambridge University Press. Comrie, Bernard. . Language universals and linguistic typology: syntax and morphology, nd edn. Chicago: Chicago University Press. Comrie, Bernard, and Maria Polinsky. . ‘Form and function in syntax: relative clauses in Tsez.’ In Michael Darnell, Edith A. Moravcsik, Frederick J. Newmeyer, Michael Noonan, and Kathleen M. Wheatley (eds), Functionalism and formalism in linguistics, vol. : Case studies, –. Amsterdam: Benjamins. Cook, Kenneth W. . ‘The temporal use of Hawaiian directional particles.’ In Martin Pütz and René Dirven (eds), The construal of space in language and thought, –. Berlin: Mouton de Gruyter. Corbett, G. . Agreement. Cambridge: Cambridge University Press. Coupe, Alexander R. a. A grammar of Mongsen Ao. Berlin: Mouton de Gruyter. Coupe, Alexander R. b. ‘Converging patterns of clause linkage in Nagaland.’ In Matti Miestamo and Bernhard Wälchli (eds), New challenges in typology: broadening the horizons and redeﬁning the foundations, –. Berlin: Mouton de Gruyter. Coupe, Alexander R. . ‘On core case marking patterns in two Tibeto-Burman languages of Nagaland.’ Linguistics of the Tibeto-Burman Area (): –. Coupe, Alexander R. a. ‘Mongsen Ao.’ In Graham Thurgood and Randy J. LaPolla (eds), The Sino-Tibetan languages, nd edn, –. London: Routledge. Coupe, Alexander R. b. 
‘On the diachronic origins of converbs in Tibeto-Burman and beyond.’ In Picus Ding, and Jamin Pelkey (eds), Sociohistorical linguistics in Southeast Asia: new horizons for Tibeto-Burman studies in honor of David Bradley, –. Leiden: Brill. Coupe, Alexander R. In prep. ‘Sangtam phonology and word list.’ Craig, Colette G. . ‘Ways to go in Rama: a case study in polygrammaticalization.’ In Bernd Heine and Elizabeth Closs Traugott (eds), Approaches to grammaticalization, vol. : Focus on types of grammatical markers, –. Amsterdam: Benjamins. Crass, Joachim, and Ronny Meyer. . ‘Ethiopia.’ In Bernd Heine and Derek Nurse (eds), A linguistic geography of Africa, –. Cambridge: Cambridge University Press. Crass, Joachim, and Ronny Meyer (eds) . Language contact and language change in Ethiopia. Cologne: Köppe. Creissels, Denis, Gerrit J. Dimmendaal, Zygmunt Frajzyngier, and Christa König. . ‘Africa as a morphosyntactic area.’ In Bernd Heine and Derek Nurse (eds), A linguistic geography of Africa, –. Cambridge: Cambridge University Press. Cristofaro, Sonia. . ‘The referential hierarchy: reviewing the evidence in diachronic perspective.’ In Dik Bakker and Martin Haspelmath (eds), Languages across boundaries: studies in memory of Anna Siewierska, –. Berlin: Mouton de Gruyter.



Cristofaro, Sonia. . ‘Competing motivations and diachrony: what evidence for what motivations?’ In Brian MacWhinney, Andrej Malchukov, and Edith A. Moravcsik (eds), Competing motivations in grammar and usage, –. Oxford: Oxford University Press. Cristofaro, Sonia. . ‘Towards a source oriented typology.’ Talk at ‘Historical Linguistics and Typology’, University of Texas at Austin,  Sept. https://www.researchgate.net/.../ _Towards_a_source_oriented_typology. Croft, William. . Typology and universals, nd edn. Cambridge University Press. Croft, William. . ‘The origins of grammaticalization in the verbalization of experience.’ Linguistics (): –. Crowley, Terry. . The Paamese language of Vanuatu. Canberra: Paciﬁc Linguistics. Crowley, Terry. . ‘Serial verbs in Paamese.’ Studies in Language (): –. Crowley, Terry. . Serial verbs in Oceanic. Oxford: Oxford University Press. Crowley, Terry, and Claire Bowern. . An introduction to historical linguistics. Oxford: Oxford University Press. Csató, Éva Á. . ‘Turkic double verbs in a typological perspective.’ In Karen H. Ebert and Fernand Zúñiga (eds), Aktionsart and aspectotemporality in non-European Languages, –. Zürich: Universität Zürich. Csató, Éva Á. . ‘A typology of Turkish double-verb constructions.’ In A. Sumru Özsoy (ed.), Studies in Turkish linguistics, –. Istanbul: Bogaziçi University Press. Csató, Éva Á. . ‘Growing apart in shared grammaticalization.’ In Martine Robbeets and Hubert Cuyckens (eds), Shared grammaticalization: with special focus on the Transeurasian languages, –. Amsterdam: Benjamins. Csató, Éva Á., and Lars Johanson. . ‘Two degrees of grammaticalization of a Turkic postverb.’ Dilbilim Ararştırmaları Dergisi : –. Culbertson, Jenny. . ‘Convergent evidence for categorial change in French: from subject clitic to agreement marker.’ Language (): –. Cysouw, Michael. . 
‘The asymmetry of afﬁxation.’ In Hans-Martin Gärtner et al. (eds), Puzzles for Krifka, –. Cysouw, M. . ‘Quantitative explorations of the worldwide distribution of rare characteristics, or: the exceptionality of northwestern European languages.’ In H. J. Simon and H. Wiese (eds), Expecting the unexpected: exceptions in grammar, –. Berlin: de Gruyter. Dabir-Moghaddam, Mohammad. . ‘On agent clitics in Balochi in comparison with other Iranian languages.’ In Carina Jahani, Agnes Korn, and Paul Titus (eds), The Baloch and others, –. Wiesbaden: Reichert. Dahl, Östen. . Tense and aspect systems. Oxford: Blackwell. Dahl, Östen. . ‘The grammar of future time reference in European languages.’ In Östen Dahl (ed.), Tense and aspect in the languages of Europe, –. Berlin: de Gruyter. Dahl, Östen. . ‘Inﬂationary effects in language and elsewhere.’ In Joan L. Bybee and Paul J. Hopper (eds), Frequency and the emergence of linguistic structure, –. Amsterdam: Benjamins. Dahl, Östen. . The growth and maintenance of linguistic complexity. Amsterdam: Benjamins. Dahl, Östen. . ‘Polysynthesis and complexity.’ In Nicholas Evans, Michael Fortescue, and Marianne Mithun (eds), The Oxford handbook of polysynthesis. Oxford: Oxford University Press. DOI: http://dx.doi.org/./oxfordhb/..

Dahl, Östen, and Viveka Velupillai. . ‘The perfect.’ In Martin Haspelmath, Matthew S. Dryer, David Gil, and Bernard Comrie (eds), World atlas of language structures, –, –. Oxford: Oxford University Press. Dahl, Östen, and Bernard Wälchli. . ‘Perfects and iamitives: two gram types in one grammatical space.’ Letras de Hoje (): . DOI: ./-... Daniel, Michael. . ‘Plurality in independent personal pronouns.’ In Martin Haspelmath, Matthew S. Dryer, David Gil, and Bernard Comrie (eds), The world atlas of language structures, –. Oxford: Oxford University Press. Daniel, Michael, Timur Maisak, and Solmaz Merdanova. . ‘Causatives in Agul.’ In Pirkko Suihkonen, Bernard Comrie, and Valery Solovyev (eds), Argument structure and grammatical relations: a crosslinguistic typology, –. Amsterdam: Benjamins. Danièl´ [Daniel], Mixail A., and Timur Majsak [Maisak]. . ‘Grammatikalizacija veriﬁkativa: ob odnoj agul´sko-arčinskoj paralleli’ [The grammaticalization of the veriﬁcative: on an Agul-Archi parallel]. In Mixail A. Danièl´, Ekaterina A. Ljutikova, Vladimir A. Plungjan, Sergej G. Tatevosov, and Ol´ga V. Fedorova (eds), Jazyk. Konstanty. Peremennye. Pamjati Aleksandra Evgen´eviča Kibrika [Language. Constants. Variables. In memory of Aleksandr Kibrik], –. St Petersburg: Aletheia. Davari, Shadi, and Mehrdad Naghzguy Kohan. . ‘The grammaticalization of progressive aspect in Persian.’ In Kees Hengeveld, Heiko Narrog, and Hella Olbertz (eds), The grammaticalization of tense, aspect, modality and evidentiality: a functional perspective, –. Berlin: Mouton de Gruyter. Davis, Martha S. . ‘The past imperfect in Palenquero.’ Studies in Language (): –. Dayley, Jon. . Tümpisa (Panamint) Shoshone grammar. Berkeley: University of California. De Cat, Cécile. . ‘French subject clitics are not agreement markers.’ Lingua : –. DeGraff, Michel. . 
‘A riddle on negation in Haitian.’ Probus : –. DeGraff, Michel. . ‘On the origin of creoles: a Cartesian critique of neo-Darwinian linguistics.’ Linguistic Typology (/): –. Dehghan, Iraj. . ‘Dâštan as an auxiliary in contemporary Persian.’ Archiv Orientální : –. DeLancey, Scott. . ‘Grammaticalization and the gradience of categories: relator nouns and postpositions in Tibetan and Burmese.’ In Joan Bybee, John Haiman, and Sandra Thompson (eds), Essays on language function and language type: dedicated to T. Givón, –. Amsterdam: Benjamins. DeLancey, Scott. . ‘Lexical categories.’ Lecture  of his Santa Barbara Lectures on Functional Syntax. The LSA Summer Institute, UC Santa Barbara. www.uoregon.edu/~delancey/sb/functional_syntax.doc. DeLancey, Scott. . ‘Sociolinguistic typology in north east India: a tale of two branches.’ Journal of South Asian Languages and Linguistics (): –. DeLancey, Scott. . ‘The historical dynamics of morphological complexity in Trans-Himalayan.’ Linguistic Discovery (): –. Delbrück, Berthold. . Vergleichende Syntax der indogermanischen Sprachen, vol. . Strasburg: Karl Trübner. Demir, Nurettin. . Postverbien im Türkeitürkischen. Unter besonderer Berücksichtigung eines südanatolischen Dorfdialekts. Wiesbaden: Harrassowitz.



Dench, Alan, and Nicholas Evans. . ‘Multiple case-marking in Australian languages.’ Australian Journal of Linguistics (): –. Derbyshire, Desmond, and Doris Payne. . ‘Noun classification systems of Amazonian languages.’ In Doris L. Payne (ed.), Amazonian Linguistics: Studies in Lowland South American Languages, –. Austin: University of Texas Press. Dereau, Léon. /. Cours de Kikongo. Brussels: Ad. Wesmael-Charlier. Diakonoff, Igor. . Afrasian languages. Moscow: Nauka. Dixon, R. M. W. . The languages of Australia. Cambridge: Cambridge University Press. Dixon, R. M. W. . ‘A changing language situation: the decline of Dyirbal, –.’ Language in Society : –. Dixon, R. M. W. . Ergativity. Cambridge: Cambridge University Press. Dixon, R. M. W. . The rise and fall of languages. Cambridge: Cambridge University Press. Dixon, R. M. W. . Australian languages: their nature and development. Cambridge: Cambridge University Press. Dixon, R. M. W. . Edible gender, mother-in-law style, and other grammatical wonders: studies in Dyirbal, Yidiñ, and Warrgamay. Oxford: Oxford University Press. Dixon, R. M. W. . ‘The Australian linguistic area.’ In Alexandra Y. Aikhenvald and R. M. W. Dixon (eds), The Cambridge handbook of linguistic typology, –. Cambridge: Cambridge University Press. Dixon, R. M. W., and Alexandra Y. Aikhenvald. . ‘Introduction.’ In R. M. W. Dixon and Alexandra Y. Aikhenvald (eds), The Amazonian Languages, –. Cambridge: Cambridge University Press. Djamouri, Redouane, and Waltraud Paul. . ‘Verb-to-preposition reanalysis in Chinese.’ In Paola Crisma and Giuseppe Longobardi (eds), Historical syntax and linguistic theory, –. Oxford: Oxford University Press. Donegan, Patricia, and David Stampe. . ‘Rhythm and the synthetic drift of Munda.’ Yearbook of South Asian languages and linguistics : –. Donohue, Mark, and Charles E. Grimes. . 
‘Yet more on the position of the languages of eastern Indonesia and East Timor.’ Oceanic Linguistics (): –. Doron, Edit. . ‘The pronominal copula as agreement clitic.’ In Hagit Borer (ed.), The syntax of pronominal clitics, –. New York: Academic Press. Dotte, Anne-Laure. . Le iaai aujourd’hui: évolutions sociolinguistiques et linguistiques d’une langue kanak de Nouvelle-Calédonie (Ouvéa, Îles Loyauté). PhD thesis, Université Lumière-Lyon . Drabbe, Petrus. . Grammar of the Asmat language. Syracuse, Ind.: Our Lady of the Lake Press. Drinka, Bridget. . Language contact in Europe: the perfect tense through history. Cambridge: Cambridge University Press. Dryer, Matthew S. . ‘Plural words.’ Linguistics : –. Dryer, Matthew S. . ‘Preﬁxing versus sufﬁxing in inﬂectional morphology.’ In Martin Haspelmath, Matthew S. Dryer, David Gil, and Bernard Comrie (eds), The world atlas of language structures, –. Oxford: Oxford University Press. Dryer, Matthew S. . ‘NP structure.’ In Timothy Shopen (ed.), Language typology and syntactic description, vol. , nd edn, –. Cambridge: Cambridge University Press. Dryer, Matthew S. a. ‘Preﬁxing versus sufﬁxing in inﬂectional morphology.’ In Matthew S. Dryer and Martin Haspelmath (eds), The world atlas of language structures online.

Leipzig: Max Planck Institute for Evolutionary Anthropology. https://www.acsu.buffalo.edu/~dryer/DryerWalsPrefSuffNoMap.pdf. Dryer, Matthew S. b. ‘Position of case affixes.’ In Matthew S. Dryer and Martin Haspelmath (eds), The world atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. https://www.acsu.buffalo.edu/~dryer/DryerWalsCaseNoMap.pdf. Dryer, Matthew S. c. ‘Order of adposition and noun phrase.’ In Matthew S. Dryer and Martin Haspelmath (eds), The world atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. www.acsu.buffalo.edu/~dryer/. Du Bois, John. . ‘Competing motivations.’ In John Haiman (ed.), Iconicity in syntax, –. Amsterdam: Benjamins. Durie, Mark. . ‘Verb serialization and “verbal-prepositions” in Oceanic languages.’ Oceanic Linguistics : –. Eades, Domenyk (ed.) . Grammaticalization in Semitic (Journal of Semitic Studies Supplement),  Dec. Early, Robert. . ‘Sit, stand, lie: posture verbs and imperfective.’ In B. Palmer and P. Geraghty (eds), Sicol. Proceedings of the Second International Conference on Oceanic Linguistics, vol. : Historical and descriptive studies, –. Canberra: Pacific Linguistics. Eid, Mushira. . ‘The copula function of pronouns.’ Lingua : –. Eksell, Kerstin. . The analytic genitive in the modern Arabic dialects. Gothenburg: University of Gothenburg. Emeneau, Murray B. . ‘India as a linguistic area.’ Language (): –. Emeneau, Murray B. . Language and linguistic area, selected and introduced by Anwar S. Dil. Stanford, Calif.: Stanford University Press. Enfield, Nick J. . Linguistic epidemiology: semantics and grammar of language contact in mainland Southeast Asia. London: Routledge Curzon. Enfield, Nick J., and Bernard Comrie. . ‘Mainland Southeast Asian languages.’ In Nick J. 
Enfield and Bernard Comrie (eds), Languages of mainland Southeast Asia: the state of the art, –. Berlin: Mouton de Gruyter. Epps, Patience. . ‘The Vaupés melting pot: Tucanoan influence on Hup.’ In Alexandra Y. Aikhenvald and R. M. W. Dixon (eds), Grammars in contact: a cross-linguistic typology, –. Oxford: Oxford University Press. Epps, Patience. a. A grammar of Hup. Berlin: Mouton de Gruyter. Epps, Patience. b. ‘From “wood” to future tense: nominal origins of the future constructions in Hup.’ Studies in Language : –. Ershova, Ksenia. . ‘Reported speech and reportative grammaticalization in Besleney Kabardian.’ In Balázs Surányi (ed.), Proceedings of the Second Central European Conference in Linguistics for Postgraduate Students, –. Budapest: Pázmány Péter Catholic University. Eksell Harning, Kerstin. . The analytical genitive in the modern Arabic dialects. Gothenburg: Acta Universitatis Gothoburgensis . Esseesy, Mohssen. . Grammaticalization of Arabic prepositions and subordinators: a corpus-based study. Leiden: Brill. Essien, Okon E. . ‘The so-called reflexive pronouns and reflexivization in Ibibio.’ Studies in African Linguistics (): –.



Evans, Bethwyn. . A study of valency-changing devices in Proto Oceanic. Canberra: Paciﬁc Linguistics. Evans, Nicholas. . ‘Insubordination and its uses.’ In Irina Nikolaeva (ed.), Finiteness: theoretical and empirical foundations, –. Oxford: Oxford University Press. Evans, Nicholas. . ‘The syntax and semantics of body-part incorporation in Mayali.’ In Hilary Chappell and William McGregor (eds), The grammar of inalienability: a typological perspective on body-part terms and the part–whole relation, –. Berlin: Mouton de Gruyter. Evans, Nicholas, and Hans-Jürgen Sasse (eds) . Problems of polysynthesis. Berlin: Akademie. Evans, Nicholas, and David Wilkins. . ‘In the mind’s ear: the semantic extensions of perception verbs in Australian languages.’ Language : –. Faber, Alice. . ‘Genetic subgrouping of the Semitic languages.’ In Robert Hetzron (ed.), The Semitic languages, –. London: Routledge. Fehri, Fassi. . Issues in the structure of Arabic clauses and words. Dordrecht: Kluwer Academic. Ferreira, Helder Perri. . ‘Los clasiﬁcadores nominales del Yanomama de Papiu (Brasil).’ MA thesis, Centro de Investigaciones y Estudios Superiores en Antropología Social, Mexico City. First Voices website: http://www.ﬁrstvoices.com/ Fischer, Olga. . ‘Grammaticalisation: unidirectional, non-reversible?’ In Olga Fischer, Anette Rosenbach, and Dieter Stein (eds), Pathways of change: grammaticalization in English, –. Amsterdam: Benjamins. Fleck, David W. . A grammar of Matses. PhD dissertation, Rice University, Houston, Texas. Fleck, David W. . ‘Body-part preﬁxes in Matses: derivation or noun incorporation?’ International Journal of American Linguistics : –. Fleck, David W. . Panoan languages and linguistics. New York: American Museum of Natural History. Foley, William A. . The Papuan languages of New Guinea. New York: Cambridge University Press. Foley, William A. . The Yimas language of New Guinea. 
Stanford, Calif.: Stanford University Press. Foley, William A., and Mike Olson. . ‘Clausehood and verb serialization.’ In Johanna Nichols and Anthony C. Woodbury (eds), Grammar inside and outside the clause: some approaches to theory from the ﬁeld, –. Cambridge: Cambridge University Press. Forker, Diana. . ‘The bi-absolutive construction in Nakh-Daghestanian.’ Folia Linguistica : –. Frajzyngier, Zygmunt, and Erin Shay. . A grammar of Hdi. Berlin: Mouton de Gruyter. François, Alexandre. . Contraintes de structures et liberté dans l’organisation du discours: une description du Mwotlap, langue océanienne du Vanuatu. PhD thesis, Université Paris-IV. François, Alexandre. . ‘Shadows of bygone lives: the histories of spiritual words in northern Vanuatu.’ In Robert Mailhammer (ed.), Lexical and structural etymology: beyond word histories, –. Berlin: Mouton de Gruyter.

Frellesvig, Bjarke, Stephen Horn, Keri Russell, and Peter Sells. . ‘Verb semantics and argument realization in pre-modern Japanese: a preliminary study of compound verbs in Old Japanese.’ Gengo Kenkyu : –. Friedman, Victor. . Turkish in Macedonia and beyond: studies in contact, typology, and other phenomena in the Balkans and Caucasus. Wiesbaden: Otto Harrassowitz. Furby, Christine. . ‘The pronominal system of Garawa.’ Oceanic Linguistics (): –. Furby, Edward, and Christine Furby. . A preliminary analysis of Garawa phrases and clauses. Canberra: Pacific Linguistics. Fuss, Eric. . The rise of agreement: a formal approach to the syntax and grammaticalization of verbal inflection. Amsterdam: Benjamins. Gabain, Annemarie von. . Alttürkische Grammatik. Leipzig: Harrassowitz. Gabain, Annemarie von. . ‘Die Sprache des Codex Cumanicus.’ In Jean Deny et al. (eds), Philologiae turcicae fundamenta, –. Wiesbaden: Steiner. Gabelentz, Georg von der. . Die Sprachwissenschaft, ihre Aufgaben, Methoden und bisherigen Ergebnisse, nd edn. Leipzig: C. H. Tauchnitz. Gaby, Alice. . ‘From discourse to syntax and back: the lifecycle of Kuuk Thaayorre ergative morphology.’ Lingua : –. Gaby, Alice, and Ruth Singer. . ‘Semantics of Australian languages.’ In Harold Koch and Rachel Nordlinger (eds), The languages and linguistics of Australia, –. Berlin: Mouton. Gardelle, Laure, and Sandrine Sorlin. . ‘Personal pronouns: an exposition.’ In Laure Gardelle and Sandrine Sorlin (eds), The pragmatics of personal pronouns, ch. . Amsterdam: Benjamins. Gast, Volker, and Johan van der Auwera. . ‘What is “contact-induced grammaticalization”? Evidence from Mesoamerican languages.’ In Bjorn Wiemer and Bernhard Hansen (eds), Grammatical replication and grammatical borrowing in language contact, –. Berlin: Mouton de Gruyter. Gast, Volker, Ekkehard König, and Claire Moyse-Faurie. . 
‘Comparative lexicology and the typology of event descriptions: a programmatic study.’ In Doris Gerland, Christian Horn, Anja Latrouite, and Albert Ortmann (eds), Meaning and grammar of nouns and verbs, –. Düsseldorf: Düsseldorf University Press. Gelderen, Elly van. . The linguistic cycle: language change and the language faculty. Oxford: Oxford University Press. Gelderen, Elly van. . ‘The linguistic cycle and the language faculty.’ Language and Linguistics Compass (): –. Genetti, Carol. . A grammar of Dolakha Newar. Berlin: Mouton de Gruyter. Gensler, Orin D. . ‘Morphological typology of Semitic.’ In Stephan Weninger et al. (eds) The Semitic languages: an international handbook, –. Berlin: Mouton de Gruyter. Gerner, Matthias, and Walter Bisang. . ‘Inﬂectional classiﬁers in Weining Ahmao: mirror of the history of a people.’ Folia Linguistica Historica : –. Geurts, Bart. . ‘Explaining grammaticalization (the standard way).’ Linguistics (): –. Ghosh, Arun. . ‘Santali.’ In Gregory D. S. Anderson (ed.), The Munda languages, –. London: Routledge.



Giacalone Ramat, Anna. . ‘Areal convergence in grammaticalization processes.’ In María José López-Couso and Elena Seoane (eds), Rethinking grammaticalization: new perspectives, –. Amsterdam: Benjamins. Gil, David. . ‘Deﬁniteness, noun-phrase conﬁgurationality, and the count–mass distinction.’ In Erik J. Reuland and Alice G. B. ter Meulen (eds), The representation of (in) deﬁniteness, –. Cambridge, Mass.: MIT Press. Gil, David. . ‘Numeral classiﬁers.’ In Matthew Dryer and Martin Haspelmath (eds), The World Atlas of Language Structures Online. Münich: Max Plank Digital Library. Available online at http://wals.info/. Accessed on  Nov. . Gildea, Spike. . ‘Evolution of grammatical relations in Cariban: how functional motivation precedes syntactic change.’ In Talmy Givón (ed.), Grammatical relations: a functionalist perspective, –. Amsterdam: Benjamins. Gildea, Spike. . On reconstructing grammar: comparative Cariban morphosyntax. Oxford: Oxford University Press. Gippert, Jost, Wolfgang Schulze, Zaza Aleksidze, and Jean-Pierre Mahé. . The Caucasian Albanian palimpsests of Mount Sinai.  vols. Turnhout: Brépols. Givón, Talmy. . ‘Historical syntax and synchronic morphology: an archaeologist’s ﬁeld trip.’ Chicago Linguistic Society : –. Givón, Talmy. . ‘The time axes phenomenon.’ Language : –. Givón, Talmy. . ‘Serial verbs and syntactic change: Niger-Congo.’ In Charles N. Li (ed.), Word order and word order change, –. Austin: University of Texas Press. Givón, Talmy. . ‘Topic pronoun, and grammatical agreement.’ In Charles N. Li (ed.), Subject and topic, –. New York: Academic Press. Givón, Talmy. a. On understanding grammar. New York: Academic Press. Givón, Talmy. b. ‘Prolegomena to any sane creology.’ In Ian F. Hancock et al. (eds), Readings in creole studies, –. Amsterdam: Benjamins. Givón, Talmy. . Topic continuity in discourse. Amsterdam: Benjamins. Givón, Talmy. . 
‘Serial verbs and the mental reality of “event”: grammatical vs. cognitive packaging.’ In Elizabeth C. Traugott and Bernd Heine (eds), Approaches to grammaticalization, vol. , –. Amsterdam: Benjamins. Givón, Talmy. . ‘Internal reconstruction, on its own.’ In Edgar C. Polomé and Carol F. Justus (eds), Language change and typological variation: in honor of Winfred P. Lehmann on the occasion of his rd birthday, vol. : Language change and phonology, –. Washington, DC: Institute for the Study of Man. Givón, Talmy. . ‘On the relational properties of passive clauses.’ In Zarina Estrada Fernández et al. (eds), Studies in voice and transitivity, –. Munich: Lincom. Givón, Talmy. . The genesis of syntactic complexity: diachrony, ontogeny, neuro-cognition, evolution. Amsterdam: Benjamins. Gonçalves, Cristina H. R. C. . Concordância em Munduruku. Campinas Unicamp. Good, Jeff. . ‘How to become a “Kwa” noun.’ Morphology (): –. Gragg, Gene. . ‘Geʿez (Ethiopic).’ In Robert Hetzron (ed.), The Semitic languages, –. London: Routledge. Gragg, Gene, and Robert Hoberman. . ‘Semitic.’ In Zygmunt Frajzyngier and Erin Shay (eds), The Afroasiatic languages, –. Cambridge: Cambridge University Press.

References



Gray, H. Louis. []. Introduction to Semitic comparative linguistics. Amsterdam: Philo Press. Green, Ian, and Rachel Nordlinger. . ‘Revisiting Proto-Mirndi.’ In Claire Bowern and Harold Koch (eds), Australian languages: classiﬁcation and the comparative method, –. Amsterdam: Benjamins. Greenberg, Joseph H. . ‘A quantitative approach to the morphological typology of language.’ International Journal of American Linguistics (): –. Greenberg, Joseph H. . ‘Some universals of grammar with particular reference to the order of meaningful elements.’ In Joseph H. Greenberg (ed.), Universals of language, –. Cambridge, Mass.: MIT Press. Greenberg, Joseph H. (ed.) . Universals of grammar, nd edn. Cambridge, Mass.: MIT Press. Greenberg, Joseph H. . Language in the Americas. Stanford, Calif.: Stanford University Press. Greenberg, Joseph. . ‘The diachronic typological approach to language.’ In Masayoshi Shibatani and Theodora Bynon (eds), Approaches to language typology, –. Oxford: Oxford University Press. Grierson, George A. . Linguistic survey of India, vol. : Munda and Dravidian. Calcutta: Ofﬁce of the Superintendent of Government Printing. Grimes, B. . ‘Semitic languages.’ In William Frawley (ed.), International encyclopedia of linguistics, nd edn (online version). Oxford: Oxford University Press. http://www.oxfordreference.com/view/./acref/../acref--e-. Guerrero, Clara Inés, Rubén Darío Hernández Cassiani, Jesús Natividad Pérez, Juana Pabla Pérez Tejedor, and Eduardo Restrepo. . Palenque de San Basilio: obra maestra del patrimonio intangible de la humanidad. Bogotá: Ministerio de Cultura/Instituto Colombiano de Antropología e Historia. Guillaume, Antoine. . A grammar of Cavineña, an Amazonian language of northern Bolivia. Berlin: Mouton de Gruyter. Güldemann, Tom. . 
‘Complex predicates based on generic auxiliaries as an areal feature in northeast Africa.’ In F. K. Erhard Voeltz (ed.), Studies in African linguistic typology, –. Amsterdam: Benjamins. Güldemann, Tom. a. ‘The macro-Sudan belt: towards identifying a linguistic area in northern sub-Saharan Africa.’ In Bernd Heine and Derek Nurse (eds), A linguistic geography of Africa, –. Cambridge: Cambridge University Press. Güldemann, Tom. b. Quotative indexes in African languages: a synchronic and diachronic survey. Berlin: Mouton de Gruyter. Güldemann, Tom. . ‘Proto-Bantu and Proto-Niger-Congo: macro-areal typology and linguistic reconstruction.’ In Osamu Hieda, Christa König, and Hiroshi Nakagawa (eds), Geographical typology and linguistic areas: with special reference to Africa, –. Amsterdam: Benjamins. Gumperz, John J., and Robert Wilson. . ‘Convergence and creolization: a case from the Indo-Aryan/Dravidian border in India.’ In Dell Hymes (ed.), Pidginization and creolization of languages: proceedings of a conference held at the University of the West Indies, Mona, Jamaica, April , –. Cambridge: Cambridge University Press.




Guthrie, Malcolm. –. Comparative Bantu: an introduction to the comparative linguistics and prehistory of the Bantu languages.  vols. Farnborough: Gregg. Haan, Johnson Welem. . The grammar of Adang: a Papuan language spoken on the island of Alor, East Nusa Tenggara, Indonesia. PhD thesis, University of Sydney. https://ses.library.usyd.edu.au/handle//. Haase, Martin, and Nicole Nau. . Sprachkontakt und Grammatikalisierung. Special issue of Sprachtypologie und Universalienforschung (). Bremen: Akademie. Hadjidas, Harris, and Maria Vollmer. . ‘Cypriot Greek.’ In Geoffrey Haig and Stefan Schnell (eds), Multi-CAST (Multilingual Corpus of Annotated Spoken Texts), https://lac.uni-koeln.de/multicast-cypriot-greek/, accessed  Oct. . Hagège, Claude. . ‘Do the classical morphological types have clear-cut limits?’ In Wolfgang U. Dressler, Hans C. Luschützky, Oskar E. Pfeiffer, and John R. Rennison (eds), Contemporary morphology, –. Berlin: Mouton de Gruyter. Hagemeijer, Tjerk, and Otie Ogie. . ‘Edo inﬂuence on Santome: evidence from verb serialization.’ In Claire Lefebvre (ed.), Creoles, their substrates, and language typology, –. Amsterdam: Benjamins. Hagman, Roy S. . Nama Hottentot grammar. Bloomington: Indiana University. Haig, Geoffrey. . ‘Noun-plus-verb complex predicates in Kurmanji Kurdish: argument sharing, argument incorporation, or what?’ Sprachtypologie und Universalienforschung (Berlin) (): –. Haig, Geoffrey. . Alignment change in Iranian languages: a Construction Grammar approach. Berlin: Mouton. Haig, Geoffrey. . 
‘Linker, relativizer, nominalizer, tense-particle: on the Ezafe in West Iranian.’ In Foong Ha Yap, Karen Grunow-Hårsta, and Janick Wrona (eds), Nominalization in Asian languages: diachronic and typological perspectives, vol. : Sino-Tibetan and Iranian languages, –. Amsterdam: Benjamins. Haig, Geoffrey. . ‘Deconstructing Iranian ergativity.’ In Jessica Coon, Lisa Travis, and Diane Massam (eds), The Oxford handbook of ergativity, –. Oxford: Oxford University Press. Haig, Geoffrey. . ‘The grammaticalization of object pronouns: why differential object indexing is an attractor state.’ Linguistics (): –. Haig, Geoffrey. To appear, a. ‘Debonding of inﬂectional morphology in Kurdish and beyond.’ In Songül Gündoğdu, Geoffrey Haig, Ergin Öpengin, and Erik Anonby (eds), Current issues in Kurdish Linguistics. Bamberg: Bamberg University Press. Haig, Geoffrey, and Stefan Schnell. . ‘The discourse basis of ergativity revisited.’ Language (): –. Haig, Geoffrey, and Hanna Thiele. . ‘Northern Kurdish.’ In Geoffrey Haig and Stefan Schnell (eds), Multi-CAST (Multilingual Corpus of Annotated Spoken Texts), https://lac.uni-koeln.de/multicast-northern-kurdish/, accessed  Oct. . Haiman, John. . Hua, a Papuan language of the eastern highlands of New Guinea. Amsterdam: Benjamins. Haiman, John. . ‘Iconic and economic motivation.’ Language : –.




Hajek, John. . ‘Serial verbs in Tetun Dili.’ In Alexandra Y. Aikhenvald and Robert M. W. Dixon (eds), Serial verb constructions: a cross-linguistic typology, –. Oxford: Oxford University Press. Hale, Ken. . ‘Navajo verb stem position and the bipartite structure of the Navajo conjunct sector.’ Linguistic Inquiry (): –. Halpern, Aaron L., and Arnold M. Zwicky (eds) . Approaching second: second position clitics and related phenomena. Stanford, Calif.: CSLI. Halpern, Abraham. . ‘Northeastern Pomo vocabulary.’ MS, California Language Archive, University of California, Berkeley. Hamann, Silke. . The phonetics and phonology of retroﬂexes. PhD thesis. Utrecht: LOT. Hamel, Patricia. . ‘Serial verbs in Loniu and an evolving preposition.’ Oceanic Linguistics (): –. Harder, Peter, and Kasper Boye. . ‘Grammaticalization and functional linguistics.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Hardy, Humphrey Hill II. . Diachronic development in Biblical Hebrew prepositions: a case study in grammaticalization. Dissertation, University of Chicago. Harris, Alice C. . Endoclitics and the origins of Udi morphosyntax. Oxford: Oxford University Press. Harris, Alice C. a. ‘Cross-linguistic perspectives on syntactic change.’ In Brian D. Joseph and Richard D. Janda (eds), The handbook of historical linguistics, –. Oxford: Blackwell. Harris, Alice C. b. ‘Preverbs and their origins in Georgian and Udi.’ In Geert Booij and Ans van Kemenade (eds), Preverbs. Special issue of Yearbook of Morphology, –. Dordrecht: Kluwer. Harris, Alice C., and Lyle Campbell. . Historical syntax in cross-linguistic perspective. Cambridge: Cambridge University Press. Harrison, Sheldon P. . ‘Proto Oceanic *aki(ni) and the Proto Oceanic periphrastic causative.’ In Amran Halim, Lois Carrington, and S. A. 
Wurm (eds), Papers from the Third International Conference on Austronesian Linguistics, –. Canberra: Paciﬁc Linguistics. Harvey, Mark, Ian Green, and Rachel Nordlinger. . ‘From preﬁxes to sufﬁxes: typological change in northern Australia.’ Diachronica (): –. Haspelmath, Martin. . A grammar of Lezgian. Berlin: Mouton de Gruyter. Haspelmath, Martin. a. ‘Does grammaticalization need reanalysis?’ Studies in Language (): –. Haspelmath, Martin. b. ‘How young is Standard Average European?’ Language Sciences (): –. DOI: ./S-()-. Haspelmath, Martin. . ‘Why is grammaticalization irreversible?’ Linguistics (): –. Haspelmath, Martin. . ‘The relevance of extravagance: a reply to Bart Geurts.’ Linguistics (): –. Haspelmath, Martin. . ‘On directionality in language change with particular reference to grammaticalization.’ In O. Fischer, M. Norde, and H. Perridon (eds), Up and down the cline: the nature of grammaticalisation, –. Amsterdam: Benjamins.




Haspelmath, Martin. . ‘Parametric versus functional explanations of syntactic universals.’ In Theresa Biberauer (ed.), The limits of syntactic variation, –. Amsterdam: Benjamins. Haspelmath, Martin. . ‘An empirical test of the agglutination hypothesis.’ In Sergio Scalise, Elisabetta Magni, and Antonietta Bisetto (eds), Universals of language today, –. Dordrecht: Springer. Haspelmath, Martin. . ‘Comparative concepts and descriptive categories in crosslinguistic studies.’ Language (): –. Haspelmath, Martin. a. ‘The indeterminacy of word segmentation and the nature of morphology and syntax.’ Folia Linguistica (): –. Haspelmath, Martin. b. ‘The gradual coalescence into “words” in grammaticalization.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Haspelmath, Martin. . ‘Argument indexing: a conceptual framework for the syntactic status of bound person forms.’ In Dik Bakker and Martin Haspelmath (eds), Languages across boundaries: studies in memory of Anna Siewierska, –. Berlin: Mouton. Haspelmath, Martin. a. ‘Deﬁning vs. diagnosing linguistics categories: a case study of clitic phenomena.’ In Joanna Błaszczak, Dorota Klimek-Jankowska, and Krzysztof Migdalski (eds), How categorical are categories? New approaches to the old questions of noun, verb and adjective, –. Berlin: Mouton de Gruyter. Haspelmath, Martin. b. ‘A grammatical overview of Egyptian and Coptic.’ In Eitan Grossman, Martin Haspelmath, and Tonio Sebastian Richter (eds), Egyptian-Coptic linguistics in typological perspective, –. Berlin: Mouton de Gruyter. Haspelmath, Martin. . ‘Explaining alienability contrasts in adpossessive constructions: predictability vs. iconicity.’ Zeitschrift für Sprachwissenschaft : –. Haspelmath, Martin, and Ekkehard König (eds) . 
Converbs in cross-linguistic perspective: structure and meaning of adverbial verb forms—adverbial participles, gerunds. Berlin: Mouton de Gruyter. Haspelmath, Martin, and Susanne Maria Michaelis. . ‘Analytic and synthetic: typological change in varieties of European languages.’ In Isabelle Buchstaller and Beat Siebenhaar (eds), Language variation: European perspectives VI, –. Amsterdam: Benjamins. Hasselbach, Rebecca. . Case in Semitic: roles, relations, and reconstruction. Oxford: Oxford University Press. Hawkins, Emily. . A pedagogical grammar of Hawaiian: recurrent problems. Honolulu: University Press of Hawaii. Hawkins, John. . ‘Processing efﬁciency and complexity in typological patterns.’ In Jae Jung Song (ed.) The Oxford handbook of linguistic typology, –. Oxford: Oxford University Press. Heath, Jeffrey. a. A grammar of Koyra Chiini: the Songhay of Timbuktu. Berlin: Mouton de Gruyter. Heath, Jeffrey. b. A grammar of Koyraboro (Koroboro) Senni. Cologne: Köppe. Heath, Jeffrey. . ‘D-possessives and the origins of Moroccan Arabic.’ Diachronica (): –. Heine, Bernd. . ‘Adpositions in African languages.’ Linguistique africaine : –. Heine, Bernd. . ‘Grammaticalization chains.’ Studies in Language (): –.




Heine, Bernd. . Auxiliaries: cognitive forces and grammaticalization. Oxford: Oxford University Press. Heine, Bernd. a. ‘Areal inﬂuence on grammaticalization.’ In Martin Pütz (ed.), Language contact and language conﬂict, –. Amsterdam: Benjamins. Heine, Bernd. b. ‘Grammaticalization as an explanatory parameter.’ In William Pagliuca (ed.), Perspectives on grammaticalization, –. Amsterdam: Benjamins. Heine, Bernd. . ‘Conceptual grammaticalization and predication.’ In John Taylor and Robert E. MacLaury (eds), Language and the cognitive construal of the world, –. Berlin: Mouton de Gruyter. Heine, Bernd. a. Cognitive foundations of grammar. Oxford: Oxford University Press. Heine, Bernd. b. Possession: sources, forces, and grammaticalization. Cambridge: Cambridge University Press. Heine, Bernd. c. ‘Grammaticalization theory and its relevance for African linguistics.’ In Robert K. Herbert (ed.), African linguistics at the crossroads, –. Cologne: Rüdiger Köppe. Heine, Bernd. . ‘Polysemy involving reﬂexive and reciprocal markers in African languages.’ In Zygmunt Frajzyngier and Traci Curl (eds), Reciprocals: forms and functions, –. Amsterdam: Benjamins. Heine, Bernd. . ‘Polysemy involving reﬂexive and reciprocal markers in African languages.’ In Zygmunt Frajzyngier and Traci S. Curl (eds), Reﬂexives: forms and functions, –. Amsterdam: Benjamins. Heine, Bernd. . ‘On the role of context in grammaticalization.’ In Ilse Wischer and Gabriele Diewald (eds), New reﬂections on grammaticalization, –. Amsterdam: Benjamins. Heine, Bernd. . ‘Grammaticalization.’ In Brian D. Joseph and Richard D. Janda (eds), The handbook of historical linguistics, –. Oxford: Blackwell. Heine, Bernd. . ‘On genetic motivation in grammar.’ In Günter Radden and Klaus-Uwe Panther (eds), Studies in linguistic motivation, –. Berlin: Mouton de Gruyter. Heine, Bernd. . ‘On reﬂexive forms in creoles.’ Lingua : –. 
Heine, Bernd. . ‘Identifying instances of contact-induced grammatical replication.’ In Samuel Gyasi Obend (ed.), Topics in descriptive and African linguistics: essays in honor of Distinguished Professor Paul Newman, –. Munich: Lincom Europa. Heine, Bernd. a. ‘Grammaticalization in African languages.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Heine, Bernd. b. ‘Areas of grammaticalization and geographical typology.’ In Osamu Hieda, Osamu, Christa König, and Hirosi Nakagawa (eds), Geographical typology and linguistic areas, with special reference to Africa, –. Amsterdam: Benjamins. Heine, Bernd. . ‘The body in language: observations from grammaticalization.’ In Matthias Brenzinger and Iwona Kraska-Szlenk (eds), The body in language: comparative studies of linguistic embodiment, –. Leiden: Brill. Heine, Bernd, and Ulrike Claudi. . On the rise of grammatical categories: some examples from Maa. Berlin: Reimer. Heine, Bernd, Ulrike Claudi, and Friederike Hünnemeyer. a. Grammaticalization: a conceptual framework. Chicago: University of Chicago Press. Heine, Bernd, Ulrike Claudi, and Friederike Hünnemeyer. b. ‘From cognition to grammar: evidence from African languages.’ In Elizabeth C. Traugott and Bernd Heine (eds), Approaches to grammaticalization, vol. , –. Amsterdam: Benjamins.




Heine, Bernd, and Henry Honken. . ‘The Kx’a family: a new Khoisan genealogy.’ Journal of Asian and African Studies : –. Heine, Bernd, and Friederike Hünnemeyer. . ‘On the fate of Ewe ví “child”: the development of a diminutive marker.’ AAP (Afrikanistische Arbeitspapiere, Cologne) : –. Heine, Bernd, and Christa König. . ‘Grammatical hybrids: between serialization, compounding and derivation in !Xun (North Khoisan).’ In Wolfgang U. Dressler, Dieter Kastovsky, Oskar E. Pfeiffer, and Franz Rainer (eds), Morphology and its demarcations: selected papers from the th Morphology Meeting, Vienna, February , –. Amsterdam: Benjamins. Heine, Bernd, and Christa König. . The !Xun language: a dialect grammar of northern Khoisan. Cologne: Rüdiger Köppe. Heine, Bernd, and Tania Kuteva. . World lexicon of grammaticalization. Cambridge: Cambridge University Press. Heine, Bernd, and Tania Kuteva. . ‘On contact-induced grammaticalization.’ Studies in Language (): –. Heine, Bernd, and Tania Kuteva. . Language contact and grammatical change. Cambridge: Cambridge University Press. Heine, Bernd, and Tania Kuteva. . The changing languages of Europe. Oxford: Oxford University Press. Heine, Bernd, and Tania Kuteva. . The genesis of grammar: a reconstruction. Oxford: Oxford University Press. Heine, Bernd, and Tania Kuteva. . ‘The genesis of grammar: on combining nouns.’ In Rudie Botha and Henriette de Swart (eds), Language evolution: the view from restricted linguistic systems, –. Utrecht: LOT. Heine, Bernd, and Tania Kuteva. . ‘The areal dimension of grammaticalization.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Heine, Bernd, Tania Kuteva, and Heiko Narrog. . 
‘Back again to the future: How to account for directionality in grammatical change?’ In Walter Bisang and Andrej Malchukov (eds), Unity and diversity in grammaticalization scenarios: eight typological contributions, –. Berlin: Language Science Press. Heine, Bernd, and Zelealem Leyew. . ‘Is Africa a linguistic area?’ In Bernd Heine and Derek Nurse (eds), A linguistic geography of Africa, –. Cambridge: Cambridge University Press. Heine, Bernd, and Hiroyuki Miyashita. . ‘The intersection between reﬂexives and reciprocals: a grammaticalization perspective.’ In Ekkehard König and Volker Gast (eds), Reciprocals and reﬂexives: theoretical and typological explorations, –. Berlin: Mouton de Gruyter. Heine, Bernd, and Heiko Narrog. . ‘Grammaticalization and linguistic analysis.’ In Bernd Heine and Heiko Narrog (eds), The Oxford handbook of linguistic analysis, –. Oxford: Oxford University Press. Heine, Bernd, Heiko Narrog, and Haiping Long. . ‘Constructional change vs. grammaticalization: from compounding to derivation.’ Studies in Language (): –. Heine, Bernd, and Motoki Nomachi. . ‘Isomorphic processes: grammaticalization and copying of grammatical elements.’ In Martine Robbeets and Hubert Cuyckens (eds), Shared grammaticalization, –. Amsterdam: Benjamins.




Heine, Bernd, and Derek Nurse (eds) . A linguistic geography of Africa. Cambridge: Cambridge University Press. Heine, Bernd, and Mechthild Reh. . Grammaticalization and reanalysis in African languages. Hamburg: Helmut Buske. Heine, Bernd, and Kyung-An Song. . ‘On the grammaticalization of personal pronouns.’ Linguistics : –. Hengeveld, Kees. . ‘The grammaticalization of tense, mood and aspect.’ In Bernd Heine and Heiko Narrog (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Hetzron, Robert. . ‘Two principles of genetic reconstruction.’ Lingua : –. Hewitt, B. George. . ‘North West Caucasian.’ Lingua : –. Hewitt, B. George (ed.) . The indigenous languages of the Caucasus, vol. : The North West Caucasian languages. Delmar, NY: Caravan. Himmelmann, Nikolaus. . ‘Gram, construction, and word class formation.’ In Clemens Knobloch and Burkhard Schaeder (eds), Wortarten und Grammatikalisierung. Perspektiven in System und Erwerb, –. Berlin: de Gruyter. Hintze, Fritz. . ‘Die Haupttendenzen der ägyptischen Sprachentwicklung.’ Zeitschrift für Phonetik und Allgemeine Sprachwissenschaft (): –. Hock, Hans Henrich. . ‘Conjoined we stand: theoretical implications of Sanskrit relative structures.’ Studies in the Linguistic Sciences (): –. Hock, Hans Henrich. . ‘The languages, their histories, and their genetic classiﬁcation.’ In Hans Henrich Hock and Elena Bashir (eds), The languages and linguistics of South Asia: a comprehensive guide. Berlin: Mouton de Gruyter. Hock, Hans Henrich, and Elena Bashir (eds) . The languages and linguistics of South Asia: a comprehensive guide. Berlin: Mouton de Gruyter. Hock, Hans Henrich, and Brian D. Joseph. . Language history, language change, and language relationship: an introduction to historical and comparative linguistics. Berlin: Mouton de Gruyter. Hodge, Carleton T. . 
‘The linguistic cycle.’ Language Sciences (): –. Hoffman, Rev. J. . Encyclopedia Mundarica.  vols. Patna: Superintendent of Government Printing, Bihar. Hoffner, Harry A., Jr., and H. Craig Melchert. . A grammar of the Hittite Language, vol. : Reference grammar. Winona Lake, Ind.: Eisenbrauns. Hoijer, Harry. . Tonkawa: an Indian language of Texas. Reprinted from Handbook of American Indian languages III. Distributed by the University of Chicago Libraries. Hoijer, Harry. . Tonkawa. In Linguistic structures of Native America, –. New York: Johnson Reprint. Hoijer, Harry. . An analytical dictionary of the Tonkawa language. University of California Publications in Linguistics : –. Berkeley: University of California. Hoijer, Harry. . Tonkawa texts. University of California Publications in Linguistics . Berkeley: University of California. Holes, Clive. . Modern Arabic: structures, functions, and varieties, rev. edn. Washington, DC: Georgetown University Press. Holm, John A. . Pidgins and creoles: theory and structure, vol. . Cambridge: Cambridge University Press.




Holton, Gary. a. ‘Kinship in the Alor-Pantar languages.’ In Marian Klamer (ed.), The Alor-Pantar languages: history and typology, –. Berlin: Language Science Press. Holton, Gary. b. ‘Numeral classiﬁers and number in two Papuan outliers of East Nusantara.’ In Marian Klamer and František Kratochvíl (eds), Number and quantity in East Nusantara, –. Canberra: Asia-Paciﬁc Linguistics. Holton, Gary. c. ‘Western Pantar.’ In Antoinette Schapper (ed.), Papuan languages of Timor, Alor and Pantar: sketch grammars, vol. , –. Berlin: Mouton de Gruyter. Holton, Gary, Marian Klamer, František Kratochvíl, Laura C. Robinson, and Antoinette Schapper. . ‘The historical relations of the Papuan languages of Alor and Pantar.’ Oceanic Linguistics (): –. Holton, Gary, and Laura C. Robinson. a. ‘The internal history of the Alor-Pantar language family.’ In Marian Klamer (ed.), The Alor-Pantar languages: history and typology, –. Berlin: Language Science Press. http://langsci-press.org/catalog/book/. Holton, Gary, and Laura C. Robinson. b. ‘The linguistic position of the Timor-AlorPantar languages.’ In Marian Klamer (ed.), The Alor-Pantar languages: history and typology, –. Berlin: Language Science Press. Hong, Yunpyo. . Kuntay kwuke yenkwu I [A study of early modern Korean I]. Seoul: Thayhaksa. Hook, Peter E. . The compound verb in Hindi. Michigan: Centre for South and Southeast Asian Studies. Hook, Peter E. . ‘The compound verb in Munda.’ Language Sciences : –. Hooper, Robin. . ‘Deixis and aspect: the Tokelauan directional particles mai and atu.’ Studies in Language (): –. Hooper, Robin. . ‘Ups and downs in Tokelauan: semantic extensions of ake and ifo.’ Handout at the Sixth International Conference on Oceanic Linguistics, Port Vila (Vanuatu), – July . Hopper, Paul J. . ‘Some discourse functions of classiﬁers in Malay.’ In Colette G. Craig (ed.), Noun classes and categorization, –. 
Amsterdam: Benjamins. Hopper, Paul J. . ‘On some principles of grammaticization.’ In Elizabeth C. Traugott and Bernd Heine (eds), Approaches to grammaticalization, vol. , –. Amsterdam: Benjamins. Hopper, Paul J., and Elizabeth Closs Traugott. . Grammaticalization. Cambridge: Cambridge University Press. Hopper, Paul J., and Elizabeth Closs Traugott. . Grammaticalization, nd edn. Cambridge: Cambridge University Press. Horne, Kibbey M. . Language typology: th and th century views. Washington, DC: Georgetown University Press. Hsieh, Fuhui. . ‘On the grammaticalization of the Kavalan say verb zin.’ Oceanic Linguistics (): –. Huang, Yan. . The syntax and pragmatics of anaphora: a study with special reference to Chinese. Cambridge: Cambridge University Press. Huber, Juliette. . A grammar of Makalero: a Papuan language of East Timor. PhD thesis, University of Utrecht; Leiden: LOT. Huber, Juliette. Forthcoming. ‘Comparative sketch of Makalero and Makasae.’ In Antoinette Schapper (ed.), The Papuan languages of Timor, Alor, and Pantar: sketch grammars, vol. . Berlin: Mouton de Gruyter.




Huehnergard, John, and Na’ama Pat-El. . ‘Third person possessive sufﬁxes as deﬁnite articles in Semitic.’ Journal of Historical Linguistics (): –. Humboldt, Wilhelm von. . ‘Über das Entstehen der grammatischen Formen und ihren Einﬂuss auf die Ideenentwicklung.’ In Abhandlungen der Akademie der Wissenschaften zu Berlin –, –. Hünnemeyer, Friederike. . Die serielle Verbkonstruktion im Ewe: eine Bestandsaufnahme und Beschreibung der Veränderungstendenzen funktional-spezialisierter Serialisierungen. MA thesis, University of Cologne. Hyman, Larry M. . ‘The Macro-Sudan belt and Niger-Congo reconstruction.’ Language Dynamics and Change (): –. Hyslop, Catriona. . The Lolovoli dialect of the North-East Ambae language, Vanuatu. Canberra: Paciﬁc Linguistics. Ibarretxe-Antuñano, Iraide. . ‘Mind-as-body as a cross-linguistic conceptual metaphor.’ Miscelánea : –. Igartua, Iván. . ‘From cumulative to separative exponence in inﬂection: reversing the morphological cycle.’ Language (): –. Ikegami, Motoko. . ‘ “Kaku joshi + dōshi” kōzō o motu joshi sōtō ku o megutte: -te kei to ren’yō tyūshi no sai’ [On postpositional phrase equivalents with the structure ‘case particle + verb’: differences between the -te form and the inﬁnitive]. Hokkaido University Ryūgakusei Center Kiyō : –. Imart, Guy. . Le Kirghiz (Turc d’Asie centrale soviétique). Aix-en-Provence: Université de Provence. Inglese, Guglielmo, and Silvia Luraghi. To appear. ‘The Hittite periphrastic perfect.’ In R. Crellin and T. Jügel (eds), Perfects in Indo-European languages, vol. . Amsterdam: Benjamins. Jacob, Daniel. . Markierung von Aktantenfunktionen und ‘Prädetermination’ im Französischen: Ein Beitrag zur Neuinterpretation morphosyntaktischer Strukturen in der französischen Umgangssprache. Tübingen: Niemeyer. Jacob, Daniel. . 
‘Transitivität, Diathese und Perfekt: zur Entstehung der romanischen haben-Periphrase.’ In Hans Geisler and Daniel Jacob (eds), Transitivität und Diathese in romanischen Sprachen, –. Tübingen: Niemeyer. Jahani, Carina. . ‘On the deﬁnite marker in modern spoken Persian.’ Paper presented at the Sixth International Conference on Iranian Linguistics, Ilia State University, G. Tsereteli Institute of Oriental Studies, Tbilisi/Georgia, – June. Janhunen, Juha. . Manchuria: an ethnic history. Helsinki: Finno-Ugrian Society. Janhunen, Juha. . ‘Grammatical genders from east to west.’ In Barbara Unterbeck (ed.), Gender in grammar and cognition, –. Berlin: Mouton de Gruyter. Jaworski, Rafał, and Krzysztof Stroński. . ‘Recognition and multi-layered analysis of converbs in early New Indo-Aryan.’ Paper presented at the rd South Asian Languages Analysis Round Table, Adam Mickiewicz University, Poznań, – May. Jenner, Philip N., and Saveros Pou. /. A Lexicon of Khmer Morphology. Honolulu: University Press of Hawaii. Jenny, Mathias. . ‘The far west of Southeast Asia: “give” and “get” in the languages of Myanmar.’ In N. J. Enﬁeld and Bernard Comrie (eds), The languages of mainland Southeast Asia: the state of the art, –. Berlin: Mouton de Gruyter. Jenny, Mathias, and Paul Sidwell (eds). . The handbook of Austroasiatic languages. Leiden: Brill.




Jenny, Mathias, Tobias Weber, and Rachel Weymuth. . ‘The Austroasiatic languages: a typological overview.’ In Mathias Jenny and Paul Sidwell (eds), The handbook of Austroasiatic languages, –. Leiden: Brill. Jespersen, Otto. []. Progress in language: with special reference to English. Amsterdam: Benjamins. Jespersen, Otto. . Negation in English and other languages. Copenhagen: Høst. Jespersen, Otto. . Language: its nature, origin and development. London: Allen & Unwin. Job, Michael (ed.) . The indigenous languages of the Caucasus, vol. : North East Caucasian languages, pt . Ann Arbor, Mich.: Caravan. Johanson, Lars. . Aspekt im Türkischen. Uppsala: Almqvist & Wiksell. Johanson, Lars. . ‘On Turkic converb clauses.’ In Martin Haspelmath and Ekkehard König (eds), Converbs in cross-linguistic perspective: structure and meaning of adverbial verb forms—adverbial participles, gerunds, –. Berlin: Mouton de Gruyter. Johanson, Lars. . ‘Viewpoint operators in European languages.’ In Östen Dahl (ed.), Tense and aspect in the languages of Europe, –. Berlin: Mouton de Gruyter. Johanson, Lars. a. Structural factors in Turkic language contacts. London: Routledge Curzon. Johanson, Lars. b. ‘Contact-induced change in a code-copying framework.’ In Mari Jones and Edith Esch (eds), Language change: the interplay of internal, external and extralinguistic factors, –. Berlin: Mouton de Gruyter. Johanson, Lars. . ‘Evidentiality in Turkic.’ In Alexandra Y. Aikhenvald and R. M. W. Dixon (eds), Studies in evidentiality, –. Amsterdam: Benjamins. Johanson, Lars. . ‘On Turkic transformativizers and nontransformativizers.’ Turkic Languages : –. Johanson, Lars. . ‘Remodeling grammar: copying, conventionalization, grammaticalization.’ In Peter Siemund and Noemi Kintana (eds), Language contact and contact languages, –. Amsterdam: Benjamins. Johanson, Lars. . 
‘Grammaticalization in Turkic languages.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Johanson, Lars. . ‘Isomorphic processes: grammaticalization and copying of grammatical elements.’ In Martine Robbeets and Hubert Cuyckens (eds), Shared grammaticalization: with special focus on the Transeurasian languages, –. Amsterdam: Benjamins. Jügel, Thomas. . Die Entwicklung der Ergativkonstruktion im Alt- und Mitteliranischen: Eine korpusbasierte Untersuchung zu Kasus, Kongruenz und Satzbau. Wiesbaden: Harrassowitz. Kachru, Yamuna. . Hindi. Amsterdam: Benjamins. Kahnemuyipour, Arsalan. . ‘Syntactic categories and Persian stress.’ Natural Language and Linguistic Theory : –. Kalinina, Elena, and Nina Sumbatova. . ‘Clause structure and verbal forms in Nakh-Daghestanian languages.’ In Irina Nikolaeva (ed.), Finiteness: theoretical and empirical foundations, –. Oxford: Oxford University Press.

References



Kaufman, Stephen. . ‘Aramaic.’ In Robert Hetzron (ed.), The Semitic languages, –. London: Routledge. Keen, Sandra. . ‘Yukulta.’ In R. M. W. Dixon and Barry Blake (eds), Handbook of Australian languages, vol. , –. Canberra: ANU Press. Keenan, Edward L. . ‘Relative clauses.’ In Timothy Shopen (ed.), Language typology and syntactic description, vol. : Complex constructions, –. Cambridge: Cambridge University Press. Keenan, Edward L., and Bernard Comrie. . ‘Noun phrase accessibility and universal grammar.’ Linguistic Inquiry (): –. Keesing, Roger M. . Kwaio Grammar. Canberra: Paciﬁc Linguistics. Keller, Rudi. . On language change: the invisible hand in language. New York: Routledge. Kent, Roland. . Old Persian. New Haven, Conn.: American Oriental Society. Kibrik, Aleksandr E. . Opyt strukturnogo opisanija arčinskogo jazyka. Tom . Taksonomičeskaja grammatika [A structural description of the Archi language, vol. : Taxonomic grammar]. Moscow: Moscow State University. Kibrik, Aleksandr E. . Konstanty i peremennye jazyka [Constants and variables of language]. St Petersburg: Aletheia. Kibrik, Aleksandr E., and Jakov G. Testelec (eds) . Elementy caxurskogo jazyka v tipologičeskom osveščenii [Aspects of Tsakhur from a typological perspective]. Moscow: Nasledie. Kibrik, Andrej. . Reference in discourse. Oxford: Oxford University Press. Kießling, Roland, Maarten Mous, and Derek Nurse. . ‘The Tanzanian Rift Valley area.’ In Bern Heine and Derek Nurse (eds), A linguistic geography of Africa, –. Cambridge: Cambridge University Press. Kim, Joungmin. . ‘Mermaid construction in Korean.’ In Tasaku Tsunoda (ed.), Adnominal clauses and the ‘mermaid construction’: grammaticalization of nouns, –. Tokyo: National Institute for Japanese Language and Linguistics. Kimmelman, Vadim. . ‘Auxiliaries in Adyghe.’ Fieldwork report. http://www.uva.nl/over-deuva/organisatie/ medewerkers/content/k/i/v. 
kimmelman/v.kimmelman.html. Kimov, Rašad S. . Somatizmy kabardinskogo jazyka: grammatikalizacija [The grammaticalization of the Kabardian somatic terms]. Nal’čik: Kabardino-Balkarskij Universitet. Kinsui, Satoshi. .‘Ōbunyaku to judōbun: Edo jidai o chūshin ni [Western translations and the passive: focus on the Edo period]. In Bunka Gengogaku Henshū Iinkai (eds), Bunka Gengogaku: Sono Teigen to Kenchiku, –. Tokyo: Sanseidō. Kiparsky, Paul, and Cleo Condoravdi. . ‘Tracking Jespersen’s cycle.’ In M. Janse, B. D. Joseph, and A. Ralli (eds), Proceedings of the nd International Conference of Modern Greek Dialects and Linguistic Theory, –. Mytilene: Doukas. Kirch, Patrick. . The Lapita peoples: ancestors of the Oceanic World. Oxford: Blackwell. Klamer, Marian. . ‘How report verbs become quote markers and complementizers.’ Lingua : –. Klamer, Marian. a. A grammar of Teiwa. Berlin: Mouton de Gruyter. Klamer, Marian. b. ‘One item, many faces: “come” in Teiwa and Kaera.’ In Michael C. Ewing and Marian Klamer (eds), East Nusantara: typological and areal analyses, –. Canberra: Paciﬁc Linguistics. Klamer, Marian. . A short grammar of Alorese (Austronesian). Munich: Lincom.




Klamer, Marian. a. ‘Kaera.’ In Antoinette Schappter (ed.), The Papuan languages of Timor, Alor and Pantar: sketch grammars, vol. , –. Berlin: Mouton de Gruyter. Klamer, Marian. b. ‘Numeral classiﬁers in the Papuan languages of Alor and Pantar: a comparative perspective.’ In Marian Klamer and František Kratochvíl (eds), Number and quantity in East Nusantara, –. Canberra: Paciﬁc Linguistics. Klamer, Marian. c. ‘The Alor-Pantar languages: linguistic context, history and typology. In Marian Klamer (ed.), The Alor-Pantar languages: history and typology, –. Berlin: Language Science Press. Klamer, Marian. d. ‘The history of numeral classiﬁers in Teiwa (Papuan).’ In Gerrit J. Dimmendaal and Anne Storch (eds), Number constructions and semantics: case studies from Africa, Amazonia, India and Oceania, –. Amsterdam: Benjamins. Klamer, Marian, and Antoinette Schapper. . ‘ “Give” constructions in the Papuan languages of Timor-Alor-Pantar.’ Linguistic Discovery (): –. Klamer, Marian, Antoinette Schapper, and Greville G. Corbett. . ‘Plural number words in the Alor-Pantar languages.’ In Marian Klamer (ed.), The Alor-Pantar languages: history and typology, –. Berlin: Language Science Press. Klamer, Marian, Antoinette Schapper, Greville G. Corbett, Gary Holton, František Kratochvíl, and Laura C. Robinson. . ‘Numeral words and arithmetic operations in the Alor-Pantar languages.’ In Marian Klamer (ed.), The Alor-Pantar languages: history and typology, –. Berlin: Language Science Press. Klein, Wolfgang, and Clive Perdue. . ‘The basic variety.’ Second Language Research : –. Klein-Andreu, Flora. . Spanish through time: an introduction. Munich: Lincom. Ko, Seongyeon, Andrew Joseph, and John Whitman. . 
‘Comparative consequences of the tongue root harmony analysis for proto-Tungusic, proto-Mongolic, and proto-Korean.’ In Martine Robbeets and Walter Bisang (eds), Paradigm change in the Transeurasian languages and beyond, –. Amsterdam: Benjamins. Ko, Yong-Kun. . ‘A study of Korean verbal morphology and its typological implications.’ Hyengthaylon/Morphology (): –. Koch, Harold. . ‘Historical relations among the Australian languages: genetic classiﬁcation and contact-based diffusion.’ In Harold Koch and Rachel Nordlinger (eds), The languages and linguistics of Australia: a comprehensive guide, –. Berlin: Mouton. Koch, Harold, and Rachel Nordlinger (eds) . The languages and linguistics of Australia: a comprehensive guide. Berlin: Mouton. Kolichala, Suresh. . ‘Dravidian languages.’ In Hans Heinrich Hock and Elena Bashir (eds), The languages and linguistics of South Asia: a comprehensive guide, –. Berlin: Mouton de Gruyter. König, Christa, and Bernd Heine. . A concise dictionary of Northwestern !Xun. Cologne: Rüdiger Köppe. König, Ekkehard. . ‘The meaning of converb constructions.’ In Martin Haspelmath and Ekkehard König (eds), Converbs in cross-linguistic perspective: structure and meaning of adverbial verb forms—adverbial participles, gerunds, –. Berlin: Mouton de Gruyter.




König, Ekkehard. . ‘Intensiﬁers and reﬂexive pronouns.’ In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher, and Wolfgang Raible (eds), Language typology and language universals: an international handbook, vol. , –. Berlin: de Gruyter. König, Ekkehard, and Volker Gast. . ‘Focused assertion of identity: a typology of intensiﬁers.’ Linguistic Typology : –. König, Ekkehard, and Bernd Kortmann. . ‘On the reanalysis of verbs as prepositions.’ In Gisa Rauh (ed.), Approaches to prepositions, –. Tübingen: Narr. König, Ekkehard, and Claire Moyse-Faurie. . ‘Spatial reciprocity: between grammar and lexis.’ In J. Helmbrecht et al. (eds), Form and function in language research: papers in honour of Christian Lehmann, –. Berlin: Mouton de Gruyter. König, Ekkehard, and Peter Siemund. . ‘On the development of reﬂexive pronouns in English: a case study in grammaticalization.’ In Uwe Böker and Hans Sauer (eds), Anglistentag , Dresden: Proceedings, –. Trier: Wissenschaftliger Verlag. König, Ekkehard, and Peter Siemund. . ‘The development of complex reﬂexives and intensiﬁers in English.’ Diachronica (): –. Koo, Hyun Jung. . ‘A cognitive analysis of lexicalization patterns of (dis-)honoriﬁcation in Korean.’ Korean Semantics : –. Koptjevskaja-Tamm, Maria. . ‘Possessive noun phrases in Maltese: alienability, iconicity and grammaticalization.’ Rivista di linguistica (): –. Korjakov, Yurij B. . Atlas kavkazskix jazykov [Atlas of Caucasian languages]. Moscow: Institute of Linguistics of the Russian Academy of Sciences. Korn, Agnes. . ‘Western Iranian pronominal clitics.’ Orientalia Suecana : –. Korn, Agnes. . ‘Looking for the middle way: voice and transitivity in complex predicates in Iranian.’ Lingua : –. Kornﬁlt, Jaklin. . Turkish. London: Routledge. Korotkova, Natalia, and Yury Lander. . 
‘Deriving sufﬁx ordering in polysynthesis: evidence from Adyghe.’ Morphology : –. Kouwenberg, Bert. . ‘Akkadian in general.’ In Stefan Weninger et al. (eds), The Semitic languages: an international handbook, –. Berlin: Mouton de Gruyter. Kratochvíl, František. . A grammar of Abui: a Papuan language of Alor. PhD thesis, University of Utrecht: Leiden: LOT. Kratochvíl, František. . ‘Sawila.’ In Antoinette Schapper (ed.), Papuan languages of Timor, Alor and Pantar: sketch grammars, vol. , –. Berlin: Mouton de Gruyter. Krug, Manfred G. . Emerging English modals: a corpus-based study of grammaticalization. Berlin: Mouton de Gruyter. Kuiper, F. B. J. . ‘The genesis of a linguistic area.’ Indo-Iranian Journal (–): –. Kumakhov, Mukhadin A., and Karina Vamling. . Circassian clause structure. Malmö: Malmö University. Kumaxov, Muxadin A. [Kumakhov, Mukhadin] . Morfologija adygskix jazykov. Sinxronno-diaxronnaja xarakteristika. I. Vvedenie, struktura slova, slovoobrazovanie častej reči [Morphology of Circassian languages. A synchronic and diachronic characteristic. I. Introduction, word structure, derivation of parts of speech.]. Nal’čik: Kabardinobalkarskoe knižnoe izdatel’stvo.




Kumaxov, Muxadin A. [Kumakhov, Mukhadin]. Slovoizmenenie adygskix jazykov [Inﬂection in the Circassian languages]. Moscow: Nauka. Kuteva, Tania. a. ‘Large linguistic areas in grammaticalization: auxiliation in Europe.’ Language Sciences (): –. Kuteva, Tania. b. ‘On identifying an evasive gram: action narrowly averted.’ Studies in Language : –. Kuteva, Tania. . ‘Areal grammaticalization: the case of the Bantu–Nilotic borderland.’ Folia Linguistica (–): –. Kuteva, Tania. . ‘On the “frills” of grammaticalization.’ In María José López-Couso and Elena Seoane (eds), Rethinking grammaticalization: new perspectives, –. Amsterdam: Benjamins. Kwon, Jae-il. . ‘Hyentaykwukeuy uyconmyengsa yenkwu’ [A study of dependent nouns in Modern Korean]. In Sotang Chensikwen paksa hwakapkinyem kwukehak nonchong [Papers on Korean linguistics, Festschrift for Dr Sodang Chun Si-kwon], –. Seoul: Hyungseol. Labov, William. . The social stratiﬁcation of English in New York City. Washington, DC: Center for Applied Linguistics. Labov, William. . Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press. Lakoff, George, and Mark Johnson. . Metaphors we live by. Chicago: University of Chicago Press. Lambrecht, Knud. . Information structure and sentence form. Cambridge: Cambridge University Press. Lander, Yury A. . ‘Adyghe.’ In Peter O. Müller, Ingeborg Ohnheiser, Susan Olsen, and Franz Rainer (eds), Word-formation: an international handbook of the languages of Europe, vol. , –. Berlin: Mouton de Gruyter. Lander, Yury A. . ‘Nominal complexes in Adyghe: between morphology and syntax.’ Studies in Language (): –. Lander, Yury A., and Alexander B. Letuchiy. . ‘Kinds of recursion in Adyghe morphology.’ In Harry van der Hulst (ed.), Recursion and human language, –. Berlin: Mouton de Gruyter. Lander, Yury A., and Yakov Testelets. . 
‘Nouniness and speciﬁcity: Circassian and Wakashan.’ Paper presented at ‘Universals and Particulars in Parts-of-Speech Systems’, University of Amsterdam. Lander, Yury A., and Yakov Testelets. . ‘Adyghe.’ In Michael Fortescue, Marianne Mithun, and Nicholas Evans (eds), The Oxford handbook of polysynthesis, –. Oxford: Oxford University Press. LaPolla, Randy J. . ‘Anti-ergative marking in Tibeto-Burman.’ Linguistics of the Tibeto-Burman Area (): –. LaPolla, Randy J. . ‘Parallel grammaticalizations in Tibeto-Burman languages: evidence for Sapir’s drift.’ Linguistics of the Tibeto-Burman Area (): –. LaPolla, Randy J. a. ‘ “Ergative” marking in Tibeto-Burman.’ In Yoshio Nishi, James A. Matisoff, and Yasuhiko Nagano (eds), New horizons in Tibeto-Burman morphosyntax, –. Osaka: National Museum of Ethnology. LaPolla, Randy J. b. ‘On the utility of concepts of markedness and prototypes in understanding the development of morphological systems.’ Bulletin of the Institute of History and Philology (): –.




Laughren, Mary. . ‘Syntactic constraints in a “free word order” language.’ In Mengistu Amberber and Peter Collins (eds), Language universals and variation, –. Westport, Conn.: Praeger. Lausberg, Hedda, and Han Sloetjes. . ‘Coding gestural behavior with the NEUROGESELAN system.’ Behavior Research Methods, Instruments, and Computers (): –. Lawyer, Lewis. . A description of the Patwin language. PhD dissertation, University of California, Davis. Lecoq, Pierre. . Le dialecte de Sivand. Wiesbaden: Reichert. Lefebvre, Claire. . Functional cateories in three Atlantic creoles. Amsterdam: Benjamins. Lefebvre, Claire, and Anne-Marie Brousseau. . A grammar of Fongbe. Berlin: Mouton de Gruyter. Lehmann, Christian. . Thoughts on grammaticalization: a programmatic sketch, vol. . Cologne: Universität zu Köln, Institut für Sprachwissenschaft. Lehmann, Christian. . ‘Grammaticalization: synchronic variation and diachronic change.’ Lingua e stile : –. Lehmann, Christian. . ‘Grammaticalization and linguistic typology.’ General Linguistics (): –. Lehmann, Christian. a[]. Thoughts on grammaticalization. Munich: Lincom Europa. Lehmann, Christian. b. ‘Grammaticalization: synchronic variation and diachronic change.’ Lingua e stile (): –. Lehmann, Christian. . ‘Old Tamil.’ In Sanford B. Steever (ed.), The Dravidian languages, –. London: Routledge. Lehmann, Christian. . Thoughts on grammaticalization, nd edn. Erfurt: Seminar für Sprachwissenschaft der Universität. Lehmann, Christian. . ‘Information structure and grammaticalization.’ In Elena Seoane and María José López-Couso (eds), Theoretical and empirical issues in grammaticalization, –. Amsterdam: Benjamins. Lehmann, Christian. . Thoughts on grammaticalization, rd edn. Berlin: Language Science Press. Lehmann, Christian. To appear. ‘Grammaticalization in Bo.’ Veleia . Lehmann, Thomas. . A grammar of modern Tamil. 
Pondicherry: Pondicherry Institute of Linguistics and Culture. Lehmann, Thomas. . ‘Old Tamil.’ In Sanford B. Steever (ed.), The Dravidian languages, –. London: Routledge. Letuchiy, Alexander B. . ‘Ergativity in the Adyghe system of valency-changing derivations.’ In Gilles Authier and Katharina Haude (eds), Ergativity, valency and voice, –. Berlin: Mouton de Gruyter. Levinson, Stephen C. . Presumptive meanings: the theory of generalized conversational implicatures. Cambridge, Mass.: MIT Press. Leyew, Zelealem, and Bernd Heine. . ‘Comparative constructions in Africa: an areal dimension.’ APAL (Annual Publication in African Linguistics, Cologne) : –. Li, Charles N., and Sandra A. Thompson. . ‘A mechanism for the development of copula morphemes.’ In Charles N. Li (ed.), Word order and word order change, –. Austin: University of Texas Press. Li, Charles N., and Sandra Thompson. . ‘Development of copula.’ In Charles N. Li (ed.), Mechanisms of syntactic change. Austin: University of Texas Press.




Li, Charles N., and Sandra A. Thompson. . Mandarin Chinese: a functional reference grammar. Berkeley: University of California Press. Li, Xuping, and Walter Bisang. . ‘Classiﬁers in Sinitic languages: from individuation to deﬁniteness-marking.’ Lingua : –. Liberman. Stephen J. . ‘Word order in the Afro-Asiatic languages.’ Ninth World Congress of Jewish Studies: –. Lichtenberk, Frantisek. . A grammar of Manam. Honolulu: University of Hawaii Press. Lichtenberk, Frantisek. . ‘Syntactic-category change in Oceanic languages.’ Oceanic Linguistics : –. Lichtenberk, Frantisek. a. ‘On the gradualness of grammaticalization.’ In Elizabeth Closs Traugott and Bernd Heine (eds), Approaches to grammaticalization, vol. , –. Amsterdam: Benjamins. Lichtenberk, Frantisek. b. ‘Semantic change and heterosemy in grammaticalization.’ Language (): –. Lichtenberk, Frantisek. a. ‘Posture verbs in Oceanic.’ In John Newman (ed.), The linguistics of sitting, standing, and lying, –. Amsterdam: Benjamins. Lichtenberk, Frantisek. b. ‘The possessive–benefactive connection.’ Oceanic Linguistics (): –. Lichtenberk, Frantisek. . ‘Directionality and displaced directionality in Toqabaqiata.’ In Erin Shay and Uwe Seibert (eds), Motion, direction and location in languages: in honor of Zygmunt Frajzingier, –. Amsterdam: Benjamins. Lichtenberk, Frantisek. . A dictionary of Toqabaqita (Solomon Islands). Canberra: Paciﬁc Linguistics. Lichtenberk, Frantisek. . ‘Start and ﬁnish: some grammatical changes in Toqabaqita.’ In Alexander Adelaar and Andrew Pawley (eds), Austronesian historical linguistics and culture history: a festschrift for Robert Blust, –. Canberra: Paciﬁc Linguistics. Lichtenberk, Frantisek. . ‘Development of reason and cause markers in Oceanic.’ Oceanic Linguistics (): –. Lindström, Liina, and Ilona Tragel. . 
‘The possessive perfect construction in Estonian.’ Folia Linguistica : –. Link, Godehard. . ‘Quantity and number.’ In Dietmar Zaefferer (ed.), Semantic universals and universal semantics, –. Dordrecht: Foris. Lipiński, Edward. . Semitic languages: outline of a comparative grammar, nd edn. Leuven: Peeters. Lipski, John. . ‘Origin and development of “ta” in Afro-Hispanic creoles.’ In Francis Byrne and John A. Holm (eds), Atlantic meets Paciﬁc: a global view of pidginization and creolization, –. Amsterdam: Benjamins. Lipski, John. . ‘The new Palenquero: revitalization and re-creolization.’ In Richard File-Muriel and Rafael Orozco (eds), Colombian varieties of Spanish, –. Vervuert: Iberoamericana. Lomize, Grigorij. . ‘Fieldwork report on locative expressions in Besleney Kabardian’ (in Russian). Loos, Eugene. . ‘Pano.’ In R. M. W. Dixon and Alexandra Y. Aikhenvald (eds), The Amazonian languages, –. Cambridge: Cambridge University Press. Lord, Carol Diane. . ‘Serial verbs in transition.’ Studies in African Linguistics (): –.




Lord, Carol Diane. . ‘Evidence for syntactic reanalysis: from verb to complementizer in Kwa.’ In Sandford B. Steever, Carol A. Walker, and Salikoko S. Mufwene (eds), Papers from the parasession on diachronic syntax, –. Chicago: Chicago Linguistic Society. Lord, Carol Diane. . Historical change in serial verb constructions. Amsterdam: Benjamins. Lüdtke, Helmut. . ‘Auf dem Weg zu einer Theorie des Sprachwandels.’ In Helmut Lüdtke (ed.), Kommunikationstheoretische Grundlagen des Sprachwandels, –. Berlin: de Gruyter. Lüdtke, Helmut. . ‘Esquisse d’une théorie du changement langagier.’ La linguistique (): –. Lupyan, Gary, and Rick Dale. . ‘Language structure is partly determined by social structure.’ Public Library of Science (PLoS) ONE (): e. Lynch, John. . ‘Oral/nasal alternation and the realis/irrealis distinction in Oceanic languages.’ Oceanic Linguistics (): –. Lynch, John. . ‘Towards a theory of the origin of the Oceanic possessive constructions.’ In Amran Halim, Lois Carrington, and S. A. Wurm (eds), Third International Conference on Austronesian Linguistics, vol. , –. Canberra: Paciﬁc Linguistics. Lynch, John. . ‘Proto Oceanic possessive marking.’ In John Lynch and Fa’afo Pat (eds), Oceanic studies: Proceedings of the First International Conference on Oceanic Linguistics, –. Canberra: Australian National University. Lynch, John. . A grammar of Anejom̃. Canberra: Paciﬁc Linguistics. Lynch, John. . ‘Article accretion and article creation in Southern Oceanic.’ Oceanic Linguistics (): –. Lynch, John, Malcolm Ross, and Terry Crowley. . The Oceanic languages. Richmond, Surrey: Curzon Press. MacKenzie, David. . Kurdish dialect studies, vol. . Oxford: Oxford University Press. MacKenzie, David. . Kurdish dialect studies, vol. . Oxford: Oxford University Press. MacKenzie, David. . The dialect of Awroman (Hawrāmān-ī Luhōn): grammatical sketch, texts, and vocabulary. 
Copenhagen: Det Kongelige Danske Videnskabernes Selskab. Mahanta, Shakuntala. . ‘Assamese.’ Journal of the International Phonetic Association : –. Mahmoudveysi, Parvin, Denise Bailey, Ludwig Paul, and Geoffrey Haig. . The Gorani language of Gawraǰū, a village of west Iran: texts, grammar, and lexicon. Wiesbaden: Reichert. Maisak, Timur. . ‘Morphological fusion without syntactic fusion: the case of the “veriﬁcative” in Agul.’ Linguistics (): –. Majidi, Mohammed-Reza. . Strukturelle Beschreibung des iranischen Dialektes der Stadt Semnan. Hamburg: Buske. Majsak [Maisak], Timur A. . ‘Glagol´naja paradigma udinskogo jazyka (nidžskij dialekt)’ [Verbal paradigm of the Udi language (Nizh dialect)]. In Mixail E. Alekseev, Timur A. Majsak, Dmitrij S. Ganenkov, and Jurij A. Lander (eds), Udinskij sbornik: Grammatika, leksika, istorija jazyka [Studies in Udi: grammar, lexicon, history of the language], –. Moscow: Academia. Majsak [Maisak], Timur A. . ‘Pričastnye formy v vido-vremennoj sisteme agul´skogo jazyka’ [Participial forms within the tense and aspect system of Agul]. Acta Linguistica Petropolitana (): –.




Majsak [Maisak], Timur A., and Solmaz R. Merdanova. . ‘ “Proverjatel´naja forma” v agul´skom jazyke: struktura, semantika i gipoteza o proisxoždenii’ [‘Veriﬁcational form’ in Agul: its structure, semantics and a hypothesis about its origin]. In Jurij A. Lander, Vladimir A. Plungjan, and Anna Ju. Urmančieva (eds), Irrealis i irreal´nost´ [Irrealis and irreality], –. Moscow: Gnozis. Malone, Terry. . ‘The origin and development of Tuyuca evidentials.’ International Journal of American Linguistics : –. Margetts, Anna. . Valence and transitivity in Saliba, an Oceanic language of Papua New Guinea. Nijmegen: Max Planck Institute for Psycholinguistics. Margetts, Anna. . ‘From implicature to construction: emergence of a benefactive construction in Oceanic.’ Oceanic Linguistics (): –. Markopoulos, Theodore. . ‘Contact-induced grammaticalization in older texts: the Medieval Greek analytic comparatives.’ In Andrew Smith, Graeme Trousdale, and Richard Waltereit (eds), New directions in grammaticalization research, –. Amsterdam: Benjamins. Martin, Samuel. . The Japanese language through time. New Haven, Conn.: Yale University Press. Martin, Samuel. . A reference grammar of Korean. Rutland, Vt.: Charles E. Tuttle. Martin, Samuel. . ‘Unaltaic features of the Korean verb.’ Japanese/Korean Linguistics : –. Masica, Colin P. . Deﬁning a linguistic area. Chicago: University of Chicago Press. Masica, Colin P. . The Indo-Aryan languages. Cambridge: Cambridge University Press. Masica, Colin P. . ‘Overall South Asia.’ In Hans Heinrich Hock and Elena Bashir (eds), The languages and linguistics of South Asia: a comprehensive guide, –. Berlin: Mouton de Gruyter. Maslova, Elena. . A grammar of Kolyma Yukaghir. Berlin: Mouton de Gruyter. Matisoff, James A. . ‘Areal and universal dimensions of grammatization in Lahu.’ In Elizabeth Closs Traugott and Bernd Heine (eds), Approaches to grammaticalization, vol. 
, –. Amsterdam: Benjamins. Matisoff, James A. . Handbook of Proto-Tibeto-Burman: system and philosophy of Sino-Tibetan reconstruction. Berkeley: University of California Press. Matras, Yaron, and Jeanette Sakel. . Grammatical borrowing in cross-linguistic perspective. Berlin: Mouton de Gruyter. Matsumoto, Yo. . Complex predicates in Japanese: a syntactic and semantic study of the notion ‘word.’ Stanford, Calif.: CSLI. Matteson, Esther. . The Piro (Arawakan) language. Berkeley: University of California Press. Matthews, Stephen, and Virginia Yip. (). Cantonese: a comprehensive grammar, nd edn. New York: Routledge. Mattissen, Johanna. . Dependent head synthesis in Nivkh: a contribution to a typology of polysynthesis. Amsterdam: Benjamins. Maurer, Philippe. . Principense: grammar, texts, and vocabulary of the Afro-Portuguese creole of the island of Príncipe, Gulf of Guinea. London: Battlebridge. Maxmudova, Svetlana M. . Morfologija rutul´skogo jazyka [Morphology of the Rutul language]. Moscow: Sovetskij pisatel´.




Mazurova, Julia V. . Semantika lokativnyx preverbov pə- i ŝ’we- [Semantics of the locative preverbs pə- i ŝ’we-]. In Jakov G. Testelec (ed.), Aspekty polisintetizma: očerki po grammatike adygejskogo jazyka [Aspects of polysynthesis: Studies in Adyghe grammar], –. Moscow: RSUH. Mbiavanga, Fernando. . An analysis of verbal afﬁxes in Kikongo with special reference to form and function. MA thesis, University of South Africa. McConvell, Patrick. . ‘The functions of split-Wackernagel clitic systems: pronominal clitics in the Ngumpin languages (Pama-nyungan family, Northern Australia).’ In A. L. Halpern and A. M. Zwicky (eds), Approaching second: second position clitics and related phenomena, –. Stanford, Calif.: CSLI. McGregor, William. . ‘Optional ergative case marking systems in a typological-semiotic perspective.’ Lingua (): –. McLendon, Sally. . ‘Bear kills her own daughter-in-law, Deer (Eastern Pomo).’ International Journal of American Linguistics Native American Texts Series: Northern California Texts ., ed. Victor Golla and Shirley Silver, –. McMahon, April M. S. . Understanding language change. Cambridge: Cambridge University Press. McWhorter, John H. . ‘Sisters under the skin: a case for genetic relationship between the Atlantic English-based creoles.’ Journal of Pidgin and Creole Languages : –. McWhorter, John H. . ‘Identifying the creole prototype: vindicating a typological class.’ Language (): –. McWhorter, John H. . The missing Spanish creoles. Berkeley: University of California Press. McWhorter, John H. . ‘The world’s simplest grammars are creole grammars.’ Linguistic Typology : –. McWhorter, John H. . Deﬁning creole. New York: Oxford University Press. McWhorter, John H. . Language interrupted: signs of non-native acquisition in standard language grammars. Oxford: Oxford University Press. McWhorter, John H. . ‘Oh, Nɔ́ɔ! 
A bewilderingly multifunctional Saramaccan word teaches us how a creole develops complexity.’ In Geoffrey Sampson, David Gil, and Peter Trudgill (eds), Language complexity as an evolving variable, –. Oxford: Oxford University Press. McWhorter, John H., and Jeff Good. . A grammar of Saramaccan Creole. Berlin: Mouton de Gruyter. Meakins, Felicity. . Case-marking in contact: the development and function of case morphology in Gurindji Kriol. Amsterdam: Benjamins. Meillet, Antoine. . ‘L’évolution des formes grammaticales.’ Scientia : –. Merdanova, Solmaz R. . Morfologija i grammatičeskaja semantika agul´skogo jazyka: na materiale xpjukskogo govora [Morphology and grammatical semantics of Agul: on the data from the Huppuq’ dialect]. Moscow: Sovetskij pisatel´. Meyerhoff, Miriam. . ‘The emergence of creole subject–verb agreement and the licensing of null subjects.’ Language Variation and Change : –. Meyerhoff, Miriam. . ‘Replication, transfer, and calquing: using variation as a tool in the study of language contact.’ Language Variation and Change : –.




Michael, Lev D. . ‘Noun incorporation and verbal classiﬁers in Nanti (Kampa, Arawak).’ In Proceedings of the Second Conference on the Indigenous Languages of Latin America,  Feb. , –. http:// ailla.utexas.org/site/cilla/ Michael_CILLA_nanti.pdf. Michael, Lev D. . Nanti evidential practice: language, knowledge, and social action in an Amazonian society. PhD dissertation, University of Texas at Austin. Michaelis, Susanne Maria, and Martin Haspelmath. . ‘Grammaticalization in creole languages: accelerated functionalization and semantic imitation.’ Paper presented at the symposium on ‘Areal Patterns of Grammaticalization and Cross-Linguistic Variation in Grammaticalization Scenarios’, Mainz, – Mar. Michaelis, Susanne Maria, and Martin Haspelmath. To appear. ‘Grammaticalization in creole languages: accelerated functionalization and semantic imitation.’ https://www.linguistik. fb.uni-mainz.de. Migge, Bettina. . Creole formation as language contact: the case of the Surinamese creoles. Amsterdam: Benjamins. Mihas, Elena. . ‘Expression of information source meanings in Ashéninca Perené.’ In Alexandra Y. Aikhenvald and R. M. W. Dixon (eds), The grammar of knowledge: a crosslinguistic typology, –. Oxford: Oxford University Press. Miller, Philip H. . Clitics and constituents in phrase structure grammar. New York: Garland. Miller, Marion. . Desano grammar. Arlington: Summer Institute of Linguistics and the University of Texas at Arlington. Mithun, Marianne. . ‘Is basic word order universal?’ In Doris Payne (ed.), Pragmatics of word order ﬂexibility, –. Amsterdam: Benjamins. Mithun, Marianne. . ‘Why preﬁxes?’ Acta Linguistica Hungarica (–): –. Mithun, Marianne. . ‘Grammar, contact, and time.’ Journal of Language Contact. THEMA :–. www.jlc-journal.org. Mithun, Marianne. . ‘Grammaticalization and explanation.’ In Heiko Narrog and Bernd Heine (eds) The Oxford handbook of grammaticalization, –. 
Oxford: Oxford University Press. Mithun, Marianne. a. ‘Native North American languages.’ In Raymond Hickey (ed.), The Cambridge handbook of areal linguistics, –. Cambridge: Cambridge University Press. Mithun, Marianne. b. ‘What cycles when and why?’ In Elly van Gelderen (ed.), Cyclical change continued, –. Amsterdam: Benjamins. Mkhatshwa, Simon Nyana Leon. . Metaphorical extensions as a basis for grammaticalization: with special reference to Zulu auxiliary verbs. MA thesis, University of South Africa, Pretoria. Morey, Stephen. . ‘Constituent order change in the Tai languages of Assam.’ Linguistic Typology : –. Morgan, Lawrence Richard. . A description of the Kutenai language. PhD dissertation, University of California, Berkeley. Morton, Thomas. . Sociolinguistic variation and change in El Palenque de San Basilio (Colombia). PhD dissertation, University of Pennsylvania. Moscati, Sabatino. . An introduction to the comparative grammar of the Semitic languages: phonology and morphology. Wiesbaden: Otto Harrassowitz.




Mosel, Ulrike, and Even Hovdhaugen. . Samoan reference grammar. Oslo: Institute for Comparative Research in Human Culture, Scandinavian University Press. Moyse-Faurie, Claire. . Le drehu, langue de Lifou (Îles Loyauté): phonologie, morphologie, syntaxe. Paris: Peeters-Selaf. Moyse-Faurie, Claire. . Le xârâcùù, langue de Thio-Canala (Nouvelle-Calédonie): éléments de syntaxe. Paris: Peeters-Selaf. Moyse-Faurie, Claire. . Grammaire du futunien. Nouméa: Centre de Documentation Pédagogique, coll. Université. Moyse-Faurie, Claire. . ‘Complex predicate constructions in Faka’uvea (East Uvean).’ In Isabelle Bril and Françoise Ozanne-Rivierre (eds), Complex predicates in Oceanic languages: studies in the dynamics of binding and boundedness, –. Berlin: Mouton de Gruyter. Moyse-Faurie, Claire. . ‘Constructions expressing middle, reﬂexive and reciprocal situations in some Oceanic languages.’ In E. König and V. Gast (eds), Reciprocals and reﬂexives: theoretical and typological explorations, –. Berlin: Mouton de Gruyter. Moyse-Faurie, Claire. . ‘(Dé)Grammaticalisation d’expressions spatiales dans des langues océaniennes.’ In I. Choi-Jonin, M. Duval, and O. Soutet (eds), Typologie et comparatisme: hommages offerts à Alain Lemaréchal, –. Paris: Peeters. Moyse-Faurie, Claire. . ‘The concept “return” as a source of different developments in Oceanic languages.’ Oceanic Linguistics (): –. Moyse-Faurie, Claire. . ‘Valency classes in Xârâcùù (New Caledonia).’ In Andrej Malchukov and Bernard Comrie (eds), Valency classes in the world’s languages, vol. , –. Berlin: Mouton de Gruyter. Moyse-Faurie, Claire. . Te lea faka’uvea: le wallisien. Paris: Peeters. Mufwene, Salikoko S. . ‘Time reference in Kituba.’ In John V. Singler (ed.), Pidgin and creole tense-mood-aspect systems, vol. , –. Amsterdam: Benjamins. Mufwene, Salikoko S. . 
‘Creolization and grammaticization: what creolists could contribute to research on grammaticization.’ In Philip Baker and Anand Syea (eds), Changing meanings, changing functions: papers relating to grammaticalization in contact languages, –. London: University of Westminster Press. Mufwene, Salikoko S. . The ecology of language evolution. Cambridge: Cambridge University Press. Mufwene, Salikoko S. . Language: contact, competition and change. London: Continuum. Mushin, Ilana. . ‘Second position clitic phenomena in north-central Australia: some pragmatic considerations.’ In Ilana Mushin (ed.), Proceedings of the  Conference of the Australian Linguistics Society: http://dspace.library.usyd.edu.au:/handle/ /. Mushin, Ilana. . ‘Motivations for second position: evidence from north-central Australia.’ Linguistic Typology (): –. Mushin, Ilana. . ‘Diverging paths: variation in Garrwa tense/aspect clitic placement.’ In Ilana Mushin and Brett Baker (eds), Discourse and grammar in Australian languages, –. Amsterdam: Benjamins. Mushin, Ilana. . A grammar of (western) Garrwa. Berlin: Mouton de Gruyter. Mushin, Ilana. . ‘Liminal pronoun systems: evidence from Garrwa.’ In Rob Pensalﬁni, Myfany Turpin, and Diana Guillemin (eds), Grammatical description informed by theory, –. Amsterdam: Benjamins.



Mushin, Ilana, and Jane Simpson. . ‘Free to bound to free? Interactions between pragmatics and syntax in the development of Australian pronominal systems.’ Language (): –. Myhill, John. . ‘Categoriality and clustering.’ Studies in Language : –. Nadkarni, Mangesh V. . ‘Bilingualism and syntactic change in Konkani.’ Language (): –. Næss, Åshild. . ‘Serial verbs and complex constructions in Pileni.’ In Isabelle Bril and Françoise Ozanne-Rivierre (eds), Complex predicates in Oceanic languages: studies in the dynamics of binding and boundness, –. Berlin: Mouton de Gruyter. Nagaraja, K. S. . Khasi: a descriptive analysis. Poona: Deccan College Postgraduate Research Institute. Naghzguy Kohan, Mehrdad. . ‘Auxiliary verbs and representation of aspect in Persian.’ Adab Pazhuhi : . Nam, Pung-hyun. . ‘Old Korean.’ In Nicholas Tranter (ed.), The languages of Japan and Korea, –. Abingdon: Routledge. Narrog, Heiko. . ‘Polysemy and indeterminacy in modal markers: the case of Japanese beshi.’ Journal of East Asian Linguistics : –. Narrog, Heiko. . ‘Modality, mood, and change of modal meanings: a new perspective.’ Cognitive Linguistics (): –. Narrog, Heiko. . ‘Voice and non-canonical case marking in the expression of event-oriented modality: a cross-linguistic study.’ Linguistic Typology (): –. Narrog, Heiko. . Modality, subjectivity, and semantic change. Oxford: Oxford University Press. Narrog, Heiko. . ‘The grammaticalization chain of case functions: extension and reanalysis of case marking vs. universals of grammaticalization.’ In Silvia Luraghi and Heiko Narrog (eds), Perspectives on semantic roles, –. Amsterdam: Benjamins. Narrog, Heiko. . ‘Exaptation in Japanese and beyond.’ In Muriel Norde and Freek Van de Velde (eds), Exaptation and language change, –. Amsterdam: Benjamins. Narrog, Heiko. a. ‘Typology and grammaticalization.’ In Alexandra Y. 
Aikhenvald and R. M. W. Dixon (eds), The Cambridge handbook of linguistic typology, –. Cambridge: Cambridge University Press. Narrog, Heiko. b. ‘Relationship of form and function in grammaticalization: the case of modality.’ In Kees Hengeveld, Heiko Narrog, and Hella Olbertz (eds), The Grammaticalization of tense, aspect, modality and evidentiality, –. Berlin: Mouton de Gruyter. Narrog, Heiko. . ‘Linguistic typology and grammaticalization.’ In Alexandra Y. Aikhenvald and R. M. W. Dixon (eds), The Cambridge handbook of linguistic typology, –. Cambridge: Cambridge University Press. Narrog, Heiko, and Bernd Heine. a. ‘Introduction.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Narrog, Heiko, and Bernd Heine (eds) b. The Oxford handbook of grammaticalization. Oxford: Oxford University Press. Narrog, Heiko, and Bernd Heine. . ‘Grammaticalisation.’ In Adam Ledgeway and Ian Roberts (eds), The Cambridge handbook of historical syntax, –. Cambridge: Cambridge University Press.

Narrog, Heiko, and Toshio Ohori. . ‘Grammaticalization in Japanese.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Narrog, Heiko, and Seongha Rhee. . ‘Grammaticalization of space in Korean and Japanese.’ In Martine Robbeets and Hubert Cuyckens (eds), Shared grammaticalization: with special focus on the Transeurasian languages, –. Amsterdam: Benjamins. Nash, David. . Topics in Warlpiri grammar. New York: Garland. Navarrete, María Cristina. . San Basilio de Palenque: memoria y tradición. Cali: Programa Editorial, Universidad del Valle. Nedjalkov, Vladimir P. . ‘Tense-aspect-mood forms in Chukchi.’ STUF: Language Typology and Universals (): –. Nedjalkov, Vladimir P., and Galina A. Otaina. . A syntax of the Nivkh language: the Amur dialect, trans. and ed. Emma Š. Geniušienė, ed. Ekaterina Gruzdeva. Amsterdam: Benjamins. Newman, Paul. . The Hausa language: an encyclopedic reference grammar. New Haven, Conn.: Yale University Press. Newmeyer, Frederick J. . Language form and language function. Cambridge, Mass.: MIT Press. Ngay, Sing-Sing. . ‘The multifunctionality and polyfunctionality pathways of the GET verb [tie⁵³] in the Shaowu dialect.’ Paper given at the th Meeting of the Association for Linguistic Typology (ALT), Leipzig, – Aug. Nichols, Johanna. . Linguistic diversity in space and time. Chicago: University of Chicago Press. Nichols, Johanna. . ‘The comparative method as heuristic.’ In Mark Durie and Malcolm Ross (eds), The comparative method reviewed: regularity and irregularity in language change, –. Oxford: Oxford University Press. Nichols, Johanna. . ‘The Nakh-Daghestanian consonant correspondences.’ In Dee Ann Holisky and Kevin Tuite (eds), Current trends in Caucasian, East European, and Inner Asian linguistics: papers in honor of Howard I. Aronson, –. Amsterdam: Benjamins. 
Nikolayev, Sergei L., and Sergei A. Starostin. . A North Caucasian etymological dictionary. Moscow: Asterisk. Nishiyama, Kunio, and Herman Kelen. . A grammar of Lamaholot, eastern Indonesia: the morphology and syntax of the Lewoingu dialect. Munich: Lincom Europa. NKD = Nihon Kokugo Daijiten [Great dictionary of the national language of Japan]. –.  vols, nd edn, ed. Nihon Kokugo Daijiten Henshū Iinkai. Tokyo: Shōgakkan. Noonan, Michael. . ‘Complementation.’ In Timothy Shopen (ed.), Language typology and syntactic description, vol. : Complex constructions, nd edn, –. Cambridge: Cambridge University Press. Noonan, Michael. . ‘Case compounding in the Bodic languages.’ In Greville G. Corbett and Michael Noonan (eds), Case and grammatical relations: studies in honor of Bernard Comrie, –. Amsterdam: Benjamins. Norde, Muriel. . Degrammaticalization. Oxford: Oxford University Press. Nordlinger, Rachel. . A grammar of Wambaya. Canberra: Paciﬁc Linguistics. Nordlinger, Rachel. . ‘From body parts to applicatives.’ Handout of the talk at the th Biennial Meeting of the Association of Linguistic Typology, Hong Kong, July.



Nordlinger, Rachel. . ‘Constituency and grammatical relations in Australian languages.’ In Harold Koch and Rachel Nordlinger (eds), The languages and linguistics of Australia: a comprehensive guide, –. Berlin: Mouton de Gruyter. Nuti, Andrea. . ‘A few remarks on the habeo + object + passive perfect participle construction in archaic Latin, with special reference to lexical semantics and the reanalysis process.’ Journal of Latin Linguistics (): . DOI: ./joll..... O’Leary, De Lacy. . Comparative grammar of the Semitic languages. Amsterdam: Philo Press. O’Shannessey, Carmel. . ‘Light Warlpiri: a new language.’ Australian Journal of Linguistics (): –. Olson, Michael L. . ‘Barai grammar highlights.’ In T. E. Dutton (ed.), Studies in languages of central and south-east Papua, –. Canberra: Paciﬁc Linguistics. Olsson, Bruno. . Iamitives: perfects in Southeast Asia and beyond. Master’s thesis, Department of Linguistics, Stockholm University. Retrieved  Feb.  from: http://urn.kb.se/ resolve?urn=urn:nbn:se:su:diva- Onodera, Noriko O. . Japanese discourse markers. Amsterdam: Benjamins. Onodera, Noriko O. . ‘The grammaticalization of discourse markers.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Öpengin, Ergin. . The Mukri variety of Central Kurdish: grammar, texts and lexicon. Wiesbaden: Reichert. Öpengin, Ergin, and Geoffrey Haig. . ‘Regional variation in Kurmanji: a preliminary classiﬁcation of dialects.’ Kurdish Studies () (special issue on Kurdish Linguistics: Variation and Change): –. Ouhalla, Jamal. . ‘Variation and change in possessive noun phrases: the evolution of the analytic type and loss of synthetic type.’ Brill’s Journal of Afroasiatic Languages and Linguistics : –. Ozanne-Rivierre, Françoise. . Le Iaai: phonologie, morphologie, syntaxe. Paris: Peeters-Selaf. 
Ozanne-Rivierre, Françoise. . Le Nyelâyu de Balade (Nouvelle-Calédonie). Paris: Peeters-Selaf. Ozanne-Rivierre, Françoise . ‘The evolution of the verb “take” in New Caledonian languages.’ In Isabelle Bril and Françoise Ozanne-Rivierre (eds), Complex predicates in Oceanic language: studies in the dynamics of binding and boundness, –. Berlin: Mouton de Gruyter. Ozanne-Rivierre, Françoise, and Jean-Claude Rivierre. . ‘Verbal compounds and lexical preﬁxes in the languages of New Caledonia.’ In Isabelle Bril and Françoise Ozanne-Rivierre (eds), Complex predicates in Oceanic language: studies in the dynamics of binding and boundness, –. Berlin: Mouton de Gruyter. Pagliuca, William. . ‘Introduction.’ In William Pagliuca (ed.), Perspectives on grammaticalization, ix–xx. Amsterdam: Benjamins. Pandharipande, Rajeshwari V. . Marathi. London: Routledge. Panova, Anastasia. . ‘Složnye predikaty s èlementom -ʒ əš’a- v abazinskom jazyke: meždu morfologiej i sintaksisom’ [Complex predicates with the element -ʒ əš’a- in Abaza: between morphology and syntax]. Talk at the conference ‘Minority Languages in Big Linguistics’, Moscow State University, – Nov.

Paris, Catherine. . ‘Localisation en tcherkesse: forme et substance du référent.’ In A. Rousseau (ed.), Les préverbes dans les langues d’Europe: introduction à l’étude de la préverbation, –. Lille: Presses Universitaires de Septentrion. Patiño Roselli, Carlos. . ‘El habla en el Palenque de San Basilio.’ In Nina De Friedemann and Carlos Patiño Roselli (eds), Lengua y sociedad en el palenque de San Basilio, –. Bogotá: Instituto Caro y Cuervo. Patiño Roselli, Carlos. . ‘Aspectos de la estructura de la criolla palenquera.’ In Klaus Zimmermann (ed.), Lenguas criollas de base lexical española y portuguesa, –. Vervuert: Bibliotheca Ibero-Americana. Paul, Daniel. . A comparative dialectal description of Iranian Taleshi. PhD thesis, University of Manchester. Paul, Hermann. . Principien der Sprachgeschichte. Halle: Max Niemeyer. Paul, Hermann. . Principles of the history of language, trans. H. A. Strong. London: Longmans. Paul, Ludwig. . ‘Some remarks on Persian sufﬁx -râ as a general and historical issue.’ In Simin Karimi, Vida Samiian and Donald Stilo (eds), Aspects of Iranian linguistics, –. Newcastle: Cambridge Scholars. Paul, Ludwig. . ‘The case system of modern West Iranian languages in typological and historical perspective (with special reference to *radi).’ Presentation at the Seventh International Conference on Iranian Linguistics (ICIL), Lomonosov Moscow State University, – Aug. Pawley, Andrew K. . ‘Some problems in Proto-Oceanic grammar.’ Oceanic Linguistics : –. Pawley, Andrew K. . ‘A reanalysis of Fijian transitive constructions.’ Te Reo : –. Pawley, Andrew K. . ‘Grammatical categories and grammaticalisation in the Oceanic verb complex.’ In A. Riehl and T. Savella (eds), Cornell working papers in Linguistics : –. Pawley, Andrew K. . ‘The chequered career of the Trans New Guinea Hypothesis: recent research and its implications.’ In Andrew K. 
Pawley, Robert Attenborough, Jack Golson, and Robin Hide (eds), Papuan pasts: cultural, linguistic and biological histories of Papuan-speaking peoples, –. Canberra: Pacific Linguistics. Pawley, Andrew K. . ‘The origins of early Lapita culture: the testimony of historical linguistics.’ In Stuart Bedford, Christophe Sand, and Sean P. Connaughton (eds), Oceanic explorations: Lapita and western Pacific settlement, –. Canberra: Australian University, ANU ePress. Pawley, Andrew, and Timoci Sayaba. . Words of Waya: a dictionary of the Wayan dialect of the Western Fijian language. MS. Available from Dept of Linguistics, Research School of Pacific and Asian Studies, Australian National University. Payne, Doris L. . ‘Morphological characteristics of lowland South American languages.’ In Doris L. Payne (ed.), Amazonian linguistics: studies in lowland South American languages, –. Austin: University of Texas Press. Pensalfini, Rob. . ‘The rise of case suffixes as discourse markers in Jingulu: a case study of innovation in an obsolescent language.’ Australian Journal of Linguistics (): –. Pensalfini, Rob. . A grammar of Jingulu: an Aboriginal language of the Northern Territory. Canberra: Pacific Linguistics. Pensalfini, Rob. . ‘Towards a typology of nonconfigurationality.’ Natural Language and Linguistic Theory (): –. Peterson, David A. . Applicative constructions. Oxford: Oxford University Press.



Peterson, John. . A grammar of Kharia: a South Munda language. Leiden: Brill. Pinault, Georges-Jean. . ‘Le problème du préverbe en Indo-Européen.’ In André Rousseau (ed.), Les préverbes dans les langues d’Europe: introduction à l’étude de la préverbation, –. Lille: Presses Universitaires de Septentrion. Pinnow, Heinz-Jürgen. . Beiträge zur Kenntnis der Juang-Sprache. MS. Pitkin, Harvey. . Wintu grammar. Berkeley: University of California. Pitkin, Harvey. . Wintu dictionary. Berkeley: University of California. Plag, Ingo. . ‘On the role of grammaticalization in creolization: a reassessment.’ in Glenn Gilbert (ed.), Pidgin and creole linguistics in the st century: essays at millennium’s end, –. New York: Lang. Plag, Ingo. a. ‘Creoles as interlanguages: inﬂectional morphology.’ Journal of Pidgin and Creole Languages : –. Plag, Ingo. b. ‘Creoles as interlanguages: syntactic structures.’ Journal of Pidgin and Creole Languages : –. Plank, Frans. . ‘Paradigm size, morphological typology, and universal economy.’ Folia Linguistica (–): –. Poplack, Shana. . ‘Variation theory and language contact.’ In Dennis Preston (ed.), Variation theory and language contact: American dialect research, –. Amsterdam: Benjamins. Poplack, Shana. . ‘Grammaticalization and linguistic variation.’ In Bernd Heine and Heiko Narrog (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Poplack, Shana, and Elisabete Malvar. . ‘Elucidating the transition period in linguistic change.’ Language Variation: European Perspectives (): –. Poplack, Shana, and Sali Tagliamonte. . ‘Nothing in context: variation, grammaticization, and past time marking in Nigerian Pidgin English.’ In Philip Baker and Anand Syea (eds), Changing meanings, changing functions: papers relating to grammaticalization in contact languages, –. London: University of Westminster Press. 
Poplack, Shana, and Sali Tagliamonte. . African American English in the diaspora. Oxford: Blackwell. Poplack, Shana, and Rena Torres Cacoullos. . ‘Linguistic emergence on the ground: a variationist paradigm.’ In Brian MacWhinney and William O’Grady (eds), The handbook of language emergence, –. Oxford: Wiley-Blackwell. Price, Richard, and Sally Price. n.d. Transcriptions of folk tales obtained from Val Ziegler, recorded in the s. Puhvel, Jaan. . Hittite etymological dictionary: words beginning with H, vol. . Berlin: de Gruyter. Pukui, Mary K., and Samuel H. Elbert. . Hawaiian–English dictionary. Honolulu: University of Hawai’i Press. Radatz, Hans-Ingo. . ‘Non-lexical core-arguments in Basque, German and Romance: how (and why) Spanish syntax is shifting towards clausal head-marking and morphological cross-reference.’ In Ulrich Detges and Richard Waltereit (eds), The paradox of grammatical change: perspectives from Romance, –. Amsterdam: Benjamins. Ramat, Paolo. . ‘The (early) history of linguistic typology.’ In Jae Jung Song (ed.), The Oxford handbook of linguistic typology, –. Oxford: Oxford University Press.

Ramat, Paolo, and Elisa Roma (eds) . Europe and the Mediterranean as linguistic areas: convergences from a historical and typological perspective. Amsterdam: Benjamins. Ramirez, Henri. . Le Bahuana: une nouvelle langue de la famille Arawak. Paris: Association d’Ethnolinguistique Amérindienne. Ramirez, Henri. . Le parler Yanomami des Xamatauteri. Dissertation, Université Aix-Marseille . Ramirez, Henri. . A fala Tukano dos Yepâ-masa, vol. : Gramática; vol. : Dicionário; vol. : Método de aprendizagem. Manaus: Inspetoria Salesiana. Ramstedt, Gustav John. []. A Korean grammar. Helsinki: Suomalais-Ugrilainen Seura. Rasekh, Mohammad. . ‘Persian clitics: doubling and agreement.’ Journal of Modern Languages (). http://e-journal.um.edu.my/public/issue-view.php?id=andjournal_id=. Rau, Felix. In preparation. ‘A grammar of Gorum.’ MS, Institute for Linguistics, University of Cologne. Reckendorf, Hermann von. . Arabische Syntax. Heidelberg: Carl Winter. Reinöhl, Uta. . Grammaticalization and the rise of configurationality in Indo-Aryan. Oxford: Oxford University Press. Reintges, Chris. . ‘Sapirian “drift” towards analyticity and long-term morphosyntactic change in Ancient Egyptian.’ In Ritsuko Kikusawa and Lawrence A. Reid (eds), Historical linguistics , –. Amsterdam: Benjamins. Rhee, Seongha. . ‘On the rise and fall of Korean nominalizers.’ In María José López-Couso and Elena Seoane (eds), Rethinking grammaticalization: new perspectives, –. Amsterdam: Benjamins. Rhee, Seongha. . ‘Grammaticalization in Korean.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Rhee, Seongha. . ‘Context-induced reinterpretation and (inter)subjectification: the case of grammaticalization of sentence-final particles.’ Language Sciences : –. Rice, Keren. . Morpheme order and semantic scope: word formation in the Athapaskan verb. 
Cambridge: Cambridge University Press. Rickford, John. . ‘How does DOZ disappear?’ In Richard R. Day (ed.), Issues in English Creoles: Papers from the  Hawaii Conference, –. Heidelberg: Julius Groos. Rickmeyer, Jens. . Japanische Morphosyntax. Heidelberg: Julius Groos. Ring, Hiram R. a. A grammar of Pnar. PhD dissertation, Nanyang Technological University, Singapore. Ring, Hiram R. b. Pnar_Language_Archive. FPAHM_. Nanyang Technological University Dataverse. https://researchdata.ntu.edu.sg/dataset.xhtml?persistentId=doi:./ N/KVFGBZ. Rivierre, Jean-Claude. . La langue de Touho: phonologie et grammaire du cèmuhî (Nouvelle-Calédonie). Paris: Selaf. Robbeets, Martine, and Hubert Cuyckens. . ‘Towards a typology of shared grammaticalization.’ In Martine Robbeets and Hubert Cuyckens (eds), Shared grammaticalization: with special focus on the Transeurasian languages, –. Amsterdam: Benjamins. Robert, Stéphane. . ‘The challenge of polygrammaticalization for linguistic theory: fractal grammar and transcategorial functioning.’ In Zygmunt Frajzyngier, Adam Hodges, and



David S. Rood (eds), Linguistic diversity and language theories, –. Amsterdam: Benjamins. Roberts, Ian, and Anna Roussou. . ‘A formal approach to “grammaticalization”.’ Linguistics (): –. Roberts, John R. . A study of Persian discourse structure. Uppsala: Acta Universitatis Upsaliensis. Robinson, Laura C., and John Haan. . ‘Adang.’ In Antoinette Schapper (ed.), Papuan languages of Timor, Alor and Pantar: sketch grammars, vol. , –. Berlin: Mouton de Gruyter. Rogava, Giorgi, and Zejnab Keraševa. . Grammatika adygejskogo jazyka [A grammar of Adyghe]. Majkop: Adygejskoe knižnoe izdatel’stvo. Romaine, Suzanne. . ‘The grammaticalization of the proximative in Tok Pisin.’ Language (): –. Rose, Françoise. . ‘L’incorporation nominale en Émerillon: une approche lexicale et discursive.’ Amerindia : –. Ross, Malcolm. . ‘The grammaticization of directional verbs in Oceanic languages.’ In Isabelle Bril and Françoise Ozanne-Rivierre (eds), Complex predicates in Oceanic languages: studies in the dynamics of binding and boundness, –. Berlin: Mouton de Gruyter. Rubba, Jo. . ‘Grammaticalization as semantic change: a case study in preposition development.’ In William Pagliuca (ed.), Perspectives on grammaticalization, –. Amsterdam: Benjamins. Rubin, Aaron D. . Studies in Semitic grammaticalization. Winona Lake, Ind.: Eisenbrauns. Russel, Robert. . ‘Historical aspects of subject–verb agreement in Arabic.’ In Proceedings of the First Eastern States Conference on Linguistics, –. Ryžova, Darja A., and Marija V. Kjuseva. . ‘“Sidet´”, “stojat´”, “ležat´”: lokativnaja predikacija v kabardino-čerkesskom jazyke’ [‘Sit’, ‘stand’, ‘lie’: locative predication in Kabardian]. In Aleksej A. Kretov (ed.), Problemy leksiko-semantičeskoj tipologii [Problems of lexical-semantic typology], vol. , –. Voronezh: Voronezh State University. Sadler, Wesley. . 
‘Untangled CiBemba.’ Kitwe: United Church of Central Africa in Rhodesia. Sagart, Laurent. . The roots of Old Chinese. Amsterdam: Benjamins. Samvelian, Pollet. . ‘What Sorani Kurdish absolute prepositions tell us about cliticization.’ In Frederic Hoyt, Nikki Seifert, Alexandra Teodorescu, and Jessica White (eds), Texas Linguistics Society IX: The morphosyntax of understudied languages. csli-publications.stanford.edu/TLS/TLS-/TLS_Samvelian_Pollet.pdf. Samvelian, Pollet. . Grammaire des prédicats complexes: les constructions nom-verbe. Paris: Lavoisier. Samvelian, Pollet, and Pegah Faghiri. . ‘Re-thinking compositionality in Persian complex predicates.’ Annual Meeting of the Berkeley Linguistics Society [online], (): –. Last accessed  Mar. . Samvelian, Pollet, and Jesse Tseng. . ‘Persian object clitics and the syntax–morphology interface.’ In Stefan Müller (ed.), Proceedings of the th International Conference on Head-Driven Phrase Structure Grammar, –. Stanford, Calif.: CSLI. Sankoff, David, Sali Tagliamonte, and Eric Smith. . ‘Goldvarb LION: a variable rule application for Macintosh.’ Dept of Linguistics, University of Toronto.

Sankoff, Gillian. . ‘The grammaticalization of tense and aspect in Tok Pisin.’ Language Variation and Change (): –. Sankoff, Gillian, and Penelope Brown. . ‘The origins of syntax in discourse: a case study of Tok Pisin relatives.’ Language (): –. Sankoff, Gillian, and Suzanne LaBerge. . ‘On the acquisition of native speakers by a language.’ In Gillian Sankoff (ed.), The social life of language, –. Philadelphia: University of Pennsylvania Press. Sapir, Edward. . ‘The problem of noun incorporation in American languages.’ American Anthropologist (): –. Sapir, Edward. . Language: an introduction to the study of speech. New York: Harcourt, Brace. Satzinger, Helmut. . ‘The Egyptian conjugations within the Afroasiatic framework.’ In Zahi Hawass and Lyla Pinch Brook (eds), Egyptology at the dawn of the twenty-ﬁrst century: language, conservation, museology, vol. , . Cairo: American University Press. Saussure, Louis de, and Bertrand Sthioul. . ‘The surcomposé past tenses.’ In Robert I. Binnick (ed.), The Oxford handbook of tense and aspect, –. Oxford: Oxford University Press. Sawyer, Jess, and Alice Schlichter. . Yuki. Berkeley: University of California. Saxena, Anju. . ‘On syntactic convergence: the case of the verb “say” in Tibeto-Burman.’ Proceedings of the th Meeting of the Berkeley Linguistics Society, –. Schackow, Diana. . A grammar of Yakkha. Berlin: Language Science Press. Schapper, Antoinette. . Bunaq: a Papuan language of central Timor. PhD thesis, Australian National University. Schapper, Antoinette. . ‘Kamang.’ In Antoinette Schapper (ed.), Papuan languages of Timor, Alor and Pantar: sketch frammars, vol. , –. Berlin: Mouton de Gruyter. Schapper, Antoinette, and Rachel Hendery. . ‘Wersing.’ In Antoinette Schapper (ed.), Papuan languages of Timor, Alor and Pantar: sketch frammars, vol. , –. Berlin: Mouton de Gruyter. 
Schapper, Antoinette, Juliette Huber, and Aone van Engelenhoven. . ‘The relatedness of Timor-Kisar and Alor-Pantar languages: a preliminary demonstration.’ In Marian Klamer (ed.), The Alor-Pantar languages: history and typology, –. Berlin: Language Science Press. http://langsci-press.org/catalog/book/. Schapper, Antoinette, and Marian Klamer. . ‘Numeral systems in the Alor-Pantar languages.’ In Marian Klamer (ed.), The Alor-Pantar languages: history and typology, –. Berlin: Language Science Press. Schiering, René. . ‘Reconsidering erosion in grammaticalization: evidence from cliticization.’ In Katerina Stathi, Elke Gehweiler, and Ekkehard König (eds), Grammaticalization: current views and issues, –. Amsterdam: Benjamins. Schladt, Mathias. . ‘The typology and grammaticalization of reflexives.’ In Zygmunt Frajzyngier and Traci S. Curl (eds), Reflexives: forms and functions, –. Amsterdam: Benjamins. Schmidt, Annette. . Young people’s Dyirbal: an example of language death from Australia. Cambridge: Cambridge University Press. Schnell, Stefan. . ‘Explaining formal variation in subjects and objects in Vera’a: the emergence of subject-TAM markers.’ Paper presented at ‘New Ways of Analyzing Variation, Asia–Pacific ’, Tokyo, Aug.



Schnell, Stefan. To appear. ‘Whence subject–verb agreement? Investigating the role of topicality, accessibility and frequency in Vera’a texts.’ Linguistics . Schokkin, Dineke. . ‘Directionals in Paluai: semantics, use, and grammaticalization paths.’ Oceanic Linguistics (): –. Schultze-Berndt, Eva. . Simple and complex verbs in Jaminjung: a study of event categorization in an Australian language. Nijmegen: Max Planck Institute. Schulze, Wolfgang. . The Udi Gospels: annotated text, etymological index, lemmatized concordance. Munich: Lincom Europa. Schumann, Christian Ludwig. . ‘Neger–Englisches Wörterbuch.’ http://www.sil.org/americas/suriname/Schumann/National/SchumannGerDict.html. Schwegler, Armin. . Analyticity and syntheticity: a diachronic perspective with special reference to Romance languages. Berlin: Mouton de Gruyter. Schwegler, Armin. . ‘Future and conditional in Palenquero.’ Journal of Pidgin and Creole Languages (): –. Schwegler, Armin. . ‘Palenque (Colombia): multilingualism in an extraordinary social and historical context.’ In Manuel Diaz-Campos (ed.), The handbook of Hispanic sociolinguistics, –. Oxford: Oxford University Press. Schwegler, Armin, and Kate Green. . ‘Palenquero (Creole Spanish).’ In John A. Holm and Peter Patrick (eds), Comparative creole syntax: parallel outlines of  creole grammars, –. London: Battlebridge. Schwegler, Armin, and Thomas Morton. . ‘Vernacular Spanish in a microcosm: Kateyano en San Basilio de Palenque (Colombia).’ Revista Internacional de Lingüística Iberoamericana : –. Schwenter, Scott A., and Rena Torres Cacoullos. . ‘Defaults and indeterminacy in temporal grammaticalization: the “perfect” road to perfective.’ Language Variation and Change (): –. Schwenter, Scott A., and Rena Torres Cacoullos. . ‘Grammaticalization paths as variable contexts in weak complementarity.’ In James A. Walker (ed.), Aspect in grammatical variation, –. 
Amsterdam: Benjamins. Seifart, Frank. . ‘The prehistory of nominal classiﬁcation in Witotoan languages.’ International Journal of American Linguistics : –. Seifart, Frank. . ‘Nominal classiﬁcation.’ Language and Linguistics Compass (): –. Senft, Gunter. . ‘Grammaticalisation of body-part terms in Kilivila.’ Language and Linguistics in Melanesia : –. Seržant, Ilja A. . ‘The so-called possessive perfect in North Russian and the Circum-Baltic area: a diachronic and areal account.’ Lingua (): –. DOI: ./j. lingua.... Shibatani, Masayoshi. . The languages of Japan. Cambridge: Cambridge University Press. Siegel, Jeff, Benedikt Szmrecsanyi, and Bernd Kortmann. . ‘Measuring analyticity and syntheticity in creoles.’ Journal of Pidgin and Creole Languages (): –. Siewierska, Anna. . ‘From anaphoric pronoun to grammatical agreement marker: why objects don’t make it.’ Folia Linguistica (–): –. Siewierska, Anna. . Person. Cambridge: Cambridge University Press.

Silva, Cácio, and Elisângela Silva. . A língua dos Yuhupdeh: introdução ethnolinguística, dicionário Yuhup–Português e glossário semântico-gramatical. São Gabriel da Cachoeira: Pro-amazonia. Simard, Candice. . The prosodic contours of Jaminjung, a language of Northern Australia. PhD thesis, University of Manchester. Simarra Reyes, Luís, and Álvaro Enrique Triviño Doval. . Gramática de la lengua palenquera: introducción para principiantes, nd edn. Cartagena de Indias, Colombia: Pluma de Mompox. Simons, Gary F., and Charles D. Fennig (eds) . Ethnologue: languages of the world, th edn. Dallas, Tex.: SIL International. http://www.ethnologue.com, accessed  May . Simpson, Jane. . Warlpiri morpho-syntax: a lexicalist approach. Dordrecht: Kluwer Academic. Simpson, Jane, and Ilana Mushin. . ‘Clause-initial position in four Australian languages.’ In Ilana Mushin and Brett Baker (eds), Discourse and grammar in Australian languages, –. Amsterdam: Benjamins. Sims-Williams, Nicholas. . ‘Eastern Iranian languages.’ In Encyclopædia iranica VII, –. Sinnemäki, Kaius. . Language universals and linguistic complexity. Helsinki: Dept of Modern Languages, University of Helsinki. Skjærvø, Prods O. . ‘Old Iranian.’ In Gernot Windfuhr (ed.), The Iranian languages, –. London: Routledge. Slade, Benjamin. . ‘The diachrony of light and auxiliary verbs in Indo-Aryan.’ Diachronica (): –. Smeets, Ineke. . A grammar of Mapuche. Berlin: Mouton de Gruyter. Smeets, Rieks. . Studies in West Circassian Phonology and Morphology. Leiden: Hakuchi. Smeets, Rieks. . ‘On valencies, actants and actant coding in Circassian.’ In B. George Hewitt (ed.), Caucasian perspectives, –. Munich: Lincom Europa. Smeets, Rieks (ed.) . The indigenous languages of the Caucasus, vol. : North East Caucasian languages, part . Delmar, NY: Caravan. Smith, Hiram L. . 
‘Habitual aspect marking in Palenquero: variation in present temporal reference.’ In Ana M. Carvahlo and Sara Beaudrie (eds), Selected proceedings of the th Workshop on Spanish Sociolinguistics, –. Somerville, Mass.: Cascadilla Proceedings Project. Smith, Hiram L. . Patterns of variable tense and aspect marking in Palenquero creole. PhD Dissertation, Pennsylvania State University. Sohn, Ho-min. . ‘Semantics of compound verbs in Korean.’ Ene (Language) (): –. Sohn, Sung-Ock S. . ‘The grammaticalization of honoriﬁc particles in Korean.’ In Ilse Wischer and Gabriele Diewald (eds), New reﬂections on grammaticalization, –. Amsterdam: Benjamins. Song, Jae Jung. . ‘The history of Micronesian possessive classiﬁers and benefactive marking in Oceanic languages.’ Oceanic Linguistics (): –. Song, Jae Jung. . ‘Grammaticalization and structural scope increase: possessive-classiﬁerbased benefactive marking in Oceanic languages.’ Linguistics (): –.



References

Sorensen, Arthur P. Jr. . ‘Multilingualism in the northwest Amazon.’ American Anthropologist : –. Repr. in J. B. Pride and Janet Holmes (eds), Sociolinguistics, – (Harmondsworth: Penguin, ). Southworth, Franklin C. . Linguistic archaeology of South Asia. London: Routledge Curzon. Spriggs, Matthew. . ‘Archaeology and the Austronesian expansion: where are we now?’ Antiquity : –. Sridhar, Mangadan V. . Naga pidgin: a sociolinguistic study of inter-lingual communication pattern in Nagaland. Mysore: Central Institute of Indian Languages. Starostin, Sergei A. . ‘The problem of genetic relationship and classiﬁcation of Caucasian languages: basic vocabulary.’ In Helma E. van den Berg (ed.), Studies in Caucasian linguistics: selected papers of the eighth Caucasian Colloquium, –. Leiden: Research School of Asian, African and Amerindian Studies. Stassen, Leon. . Comparison and Universal Grammar. Oxford: Blackwell. Stassen, Leon. . ‘Predicative possession.’ In Matthew S. Dryer and Martin Haspelmath (eds), The world atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. http://wals.info/chapter/ Steinhauer, Hein. . ‘Blagar.’ In Antoinette Schapper (ed.), Papuan languages of Timor, Alor and Pantar: sketch grammars, vol. , –. Berlin: Mouton de Gruyter. Stilo, Donald. . ‘Case in Iranian: from reduction and loss to innovation and renewal.’ In Andrej Malchukov and Andrew Spencer (eds), The Oxford handbook of case, –. Oxford: Oxford University Press. Stilo, Don. To appear. ‘Caspian and Tatic.’ In Geoffrey Haig and Geoffrey Khan (eds), The languages and linguistics of Western Asia: an areal perspective. Berlin: de Gruyter. Stirling, Lesley, and Alan Dench (eds) . Tense, aspect, modality and evidentiality in Australian languages. Special issue of Australian Journal of Linguistics (). Stolz, Christel, and Thomas Stolz. . 
‘Mesoamerica as a linguistic area.’ In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher, and Wolfgang Raible (eds), Language typology and language universals: an international handbook, vol. , –. New York: de Gruyter. Stolz, Thomas. . ‘Agglutinationstheorie und Grammatikalisierungsforschung: einige alte und neue Gedanken zur Entstehung von gebundener Morphologie.’ Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung (): –. Suzuki, Tomomi (ed.) . Fukugō Joshi ga Kore de Wakaru [All about compound particles]. Tokyo: Hitsuji Shobō. Svorou, Soteria. . The grammar of space. Amsterdam: Benjamins. Swartz, Stephen. . Constraints on zero anaphora and word order in Warlpiri narrative text. Darwin: Summer Institute of Linguistics. Tabulova, Nurja T., and Raisa X. Temirova (eds). . Sistema preverbov i poslelogov v iberijsko-kavkazskix jazykax [Systems of preverbs and postpositions in the Ibero-Caucasian languages]. Čerkessk: Karačaevo-čerkesskij NII istorii, ﬁlologii i èkonomiki. Tagliamonte, Sali. . Roots of English: exploring the history of dialects. Cambridge: Cambridge University Press. Taleghani, Azita. . Modality, aspect and negation in Persian. Amsterdam: Benjamins. Tanaka, Hiroshi. . Fukugōji kara Mita Nihongo Bunpō no Kenkyū [Research on Japanese grammar from the viewpoint of compound morphemes]. Tokyo: Hitsuji Shobō.

References



Temsen, Gracious Mary, and Anish Koshy. . ‘Causativization in Khasi: syntactic and semantic issues.’ Interdisciplinary Journal of Linguistics : –. Testelets, Yakov G., and Peter M. Arkadiev. . ‘The challenges of differential nominal marking in Circassian.’ Talk given at the University of Leipzig,  Oct. Thapar, Romila. . Ancient Indian social history: some interpretations. New Delhi: Orient Longman. Thieberger, Nicholas. . A grammar of South Efate: an Oceanic language of Vanuatu. Honolulu: University of Hawai’i Press. Thomason, Sarah G. . Language contact: an introduction. Edinburgh: Edinburgh University Press. Thomason, Sarah G., and Terrence Kaufman. . Language contact, creolization and genetic linguistics. Berkeley: University of California Press. Thompson, Laurence C. . A Vietnamese grammar. Seattle: University of Washington Press. Thompson, Sandra, Joseph Park, and Charles Li. . A reference grammar of Wappo. Berkeley: University of California. Thompson, Sandra, and Ryoko Suzuki. . ‘Grammaticalization of ﬁnal particles.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Thornes, Tim. . ‘Causation as “functional sink” in Northern Paiute.’ In Tim Thornes et al. (eds), Functional-historical approaches to explanation: in honor of Scott DeLancey, –. Amsterdam: Benjamins. Tjurenkova, Margarita. . Fieldwork report on possibility and necessity in Besleney Kabardian [in Russian]. Torres Cacoullos, Rena. . ‘Variation and grammaticalization.’ In Manuel Díaz-Campos (ed.), The handbook of Hispanic sociolinguistics, –. Oxford: Wiley-Blackwell. Torres Cacoullos, Rena, and James A. Walker. . ‘Collocations in grammaticalization and variation.’ In Bernd Heine and Heiko Narrog (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. Traugott, Elizabeth Closs. . 
‘Legitimate counterexamples to unidirectionality.’ Paper presented at Freiburg University,  Oct. https://web.stanford.edu/~traugott/papers/Frei burg.Unidirect.pdf. Traugott, Elizabeth Closs. . ‘From subjectiﬁcation to intersubjectiﬁcation.’ In Raymond Hickey (ed.), Motives for language change, –. Cambridge: Cambridge University Press. Traugott, Elizabeth Closs. . ‘Constructions in grammaticalization.’ In Brian D. Joseph and Richard D. Janda (eds), The handbook of historical linguistics, –. Oxford: Blackwell. Traugott, Elizabeth Closs. . ‘(Inter)subjectivity and (inter)subjectiﬁcation: a reassessment.’ In Kristin Davidse, Lieven Vandelotte, and Hubert Cuyckens (eds), Subjectiﬁcation, intersubjectiﬁcation and grammaticalization, –. Berlin: Mouton de Gruyter. Traugott, Elizabeth Closs, and Richard Dasher. . Regularity in semantic change. Cambridge: Cambridge University Press. Traugott, Elizabeth Closs, and Bernd Heine (eds). . Approaches to grammaticalization, vols  and . Amsterdam: Benjamins. Traugott, Elizabeth Closs, and Graeme Trousdale. . Constructionalization and constructional changes. Oxford: Oxford University Press.



References

Trudgill, Peter. . Sociolinguistic typology: social determinants of linguistic complexity. Oxford: Oxford University Press. Trudgill, Peter. . ‘Sociolinguistic typology: social structure and linguistic complexity.’ In Alexandra Y. Aikhenvald and R. M. W. Dixon (eds), The Cambridge handbook of linguistic typology, –. Cambridge: Cambridge University Press. Trudgill, Peter. . ‘The anthropological setting of polysynthesis.’ In Nicholas Evans, Michael Fortescue, and Marianne Mithun (eds), The Oxford handbook of polysynthesis, Oxford: Oxford University Press. Tsitsipis, Lukas D. . A linguistic anthropology of praxis and language shift: Arvanítika (Albanian) and Greek in contact. Oxford: Clarendon Press. Tsunoda, Tasaku (ed.) . Adnominal clauses and the ‘mermaid construction’: grammaticalization of nouns. Tokyo: National Institute for Japanese Language and Linguistics. Urusov, Xatali Š. . ‘Preverby i napravitel’nye sufﬁksy v kabardino-čerkesskom jazyke’ [Preverbs and directional sufﬁxes in the Kabardian language]. In Nurja T. Tabulova and Raisa X. Temirova (eds), Sistema preverbov i poslelogov v iberijsko-kavkazskix jazykax [Systems of preverbs and postpositions in the Ibero-Caucasian languages], –. Čerkessk: Karačaevo-čerkesskij NII istorii, ﬁlologii i èkonomiki. Valenzuela, Pilar M. . Transitivity in Shipibo-Konibo grammar. PhD, University of Oregon, Eugene. Valenzuela, Pilar M. . ‘Nominal classiﬁcation in Shiwilu (Kapanawan).’ Presentation at the Fieldword Circle, University of Californa at Berkeley. http://linguistics.berkeley.edu/ ~fforum/handouts/valenzuela_fforum_shiwilu_slides.pdf. van den Berg, Helma. . ‘The East Caucasian language family.’ Lingua : –. van der Auwera, J. . ‘Conclusion.’ In J. van der Auwera (ed.), Adverbial constructions in the languages of Europe, –. Berlin: Mouton de Gruyter. van Engelenhoven, Aone. . 
‘On derivational processes in Fataluku, a non-Austronesian language in East Timor.’ In W. L. Wetzels (ed.), The linguistics of endangered languages: contributions to morphology and morpho-syntax, –. Utrecht: Netherlands Graduate School of Linguistics. http://lotos.library.uu.nl/publish/articles//bookpart.pdf#page= . van Engelenhoven, Aone. . ‘Verb serialisation in Fataluku: the case of Take.’ In Azeb Azeb, Sascha Völlmin, Christian Rapold, and Silvia Zaug-Coretti (eds), Converbs, medial verbs, clause chaining and related issues, –. Cologne: Rüdiger Köppe. van Gelderen, Elly. . Grammaticalization as economy. Amsterdam: Benjamins. van Gelderen, Elly. a. The linguistic cycle: language change and language faculty. Oxford: Oxford University Press. van Gelderen, Elly. b. ‘Grammaticalization of agreement.’ In Heiko Narrog and Bernd Heine (eds), The Oxford handbook of grammaticalization, –. Oxford: Oxford University Press. van Gelderen, Elly. . ‘The linguistic cycle and the language faculty.’ Language and Linguistics Compass (): –. Velupillai, Viveka. . An introduction to linguistic typology. Amsterdam: Benjamins. Vernaudon, Jacques. . ‘Grammaticalization of Tahitian mea “thing, matter” into a stative aspect.’ In Claire Moyse-Faurie and Joachim Sabel (eds), Topics in Oceanic morphosyntax, –. Berlin: Mouton de Gruyter.

References



Versteegh, Kees. . The Arabic language nd edn. Edinburgh:Edinburgh University Press. Vincent, Nigel. . ‘Conative.’ Linguistic Typology (): –. Vuillermet, Marine. . ‘Two types of incorporation in Ese Ejja (Takanan).’ In Swintha Danielsen, Katja Hannss, and Fernando Zúñiga (eds), Word formation in South American languages, –. Amsterdam: Benjamins. Walker, N. Alexander. . ‘Assessing the effects of language contact on Northeastern Pomo.’ In Andrea Berez-Kroeker, Diane Hintz, and Carmen Jany (eds), Language contact and change in the Americas: studies in honor of Marianne Mithun, –. Amsterdam: Benjamins. Wang, Jian/王健. . 类型学视野下的汉语方言 “量名” 结构研究 [Bare classiﬁer phrases in Sinitic languages: a typological perspective]. Language Sciences (): –. Watters, David. . A grammar of Kham. Cambridge: Cambridge University Press. Weinrich, Harald. . ‘Ist das Französische eine analytische oder synthetische Sprache?’ Lebende Sprachen (): –. Werner, Heinz, and Bernard Kaplan. . Symbol-formation: an organismic-developmental approach to language and the expression of thought. New York: Wiley. Wheat, David. . ‘The ﬁrst great waves: African provenance zones for the trans-Atlantic slave trade to Cartagena de Indias, –.’ Journal of African History : –. Whitman, John. . ‘Old Korean.’ In Lucien Brown and Jae Hoon Yeon (eds), The Handbook of Korean Linguistics, –. Oxford: Wiley-Blackwell. Whitman, John. . ‘Nichiryū sogo no on’in taikei to rentaikei, izenkei no kigen’ [The phonological system of proto-Japanese-Ryūkyūan and the origin of the adnominal and realis forms]. In Yukinori Takubo, John Whitman, and Tatsuya Hirako (eds), Ryūkyū shogo to Kodai Nihongo: Nichiryū sogo no saiken ni mukete (Ryūkyūan and premodern Japanese: toward the reconstruction of proto-Japanese-Ryūkyūan), –. Tokyo: Hitsuji Shobō. 
Whitman, John, Miyoung Oh, Jinho Park, Valerio Luigi Alberizzi, Masayuki Tsukimoto, Teiji Kosukegawa, and Tomokazu Takada. . ‘Toward an international vocabulary for research on vernacular reading of Chinese texts (漢文訓讀 Hanwen xundu).’ Scripta : –. Whorf, Benjamin Lee. . Language, thought, and reality: selected writings of Benjamin Lee Whorf, ed. and with an introduction by John B. Carroll. Cambridge, Mass.: MIT Press. Willis, David. . ‘Syntactic lexicalization as a new type of degrammaticalization.’ Linguistics (): –. Windfuhr, Gernot. . ‘Cases.’ Encyclopædia iranica, vol. , fasc. , –. Windfuhr, Gernot. . ‘Dialectology and topics.’ In Gernot Windfuhr (ed.), The Iranian languages, –. London: Routledge. Winford, Donald. . ‘Creole languages.’ In Robert I. Binnick (ed.), The Oxford handbook of tense and aspect, . Oxford: Oxford University Press. Wray, Alison, and George W. Grace. . ‘The consequences of talking to strangers: evolutionary corollaries of socio-cultural inﬂuences on linguistic form.’ Lingua (): –. Wright, William. . Lectures on the comparative Semitic languages, ed. William Robertson Smith. Cambridge: Cambridge University Press. Xanmagomedov, Bejdullax G.-K. . Očerki po sintaksisu tabasaranskogo jazyka [Studies in Tabasaran syntax]. Maxačkala: Dagučpedgiz. Yakup, Abdurishid. . The Turfan dialect of Uyghur. Wiesbaden: Harrassowitz.



References

Yamada, Yoshio. . Kanbun no kundoku ni yorite tutaeraetaru gohō [Words and grammar passed on through the Japanese reading of Chinese texts]. Tokyo: Hōbunkan. Yi, Tae-Yong. . Kwuke tongsauy mwunpephwa yenkwu [A study on the grammaticalization of Korean verbs]. Seoul: Hanshin. Zagirov, Zagir M., Velibek M. Zagirov, Kazi K. Kurbanov, Bejdullax G.-K. Xanmagomedov, and Kim T. Šalbuzov. . Sovremennyj tabasaranskij jazyk [Modern Tabasaran]. Maxačkala: IJaLI DNTs RAN. Zariquiey, Roberto. . A grammar of Kashibo-Kakataibo. PhD dissertation, La Trobe University. Zariquiey, Roberto, and David W. Fleck. . ‘Preﬁxation in Kashibo-Kakataibo: synchronic or diachronic derivation.’ International Journal of American Linguistics : –. Zide, Norman. . ‘Gutob pronominal clitics and related phenomena elsewhere in GutobRemo-Gtaʔ.’ In Anvita Abbi (ed.), Languages of tribal and indigenous peoples of India: the ethnic space, –. Delhi: Motilal Banarsidass. Zide, Norman H., and Gregory D. S. Anderson. . ‘The proto-Munda verb: some connections with Mon-Khmer.’ In P. Baskhara Rao and K. V. Subbarao (eds), Yearbook of South Asian Linguistics, –. Delhi: Sage Press. Zuñiga, Fernando. . ‘Nominal compounds in Mapudungun.’ In Swintha Danielsen, Katja Hannss, and Fernando Zúñiga (eds), Word formation in South American languages, –. Amsterdam: Benjamins.



Index of subjects ablative , , , , , , , – actionality –, – adpositions –, , , , , , , , –, –, , , , –, , – agglutinative , , –, –, , , , ,  agreement –, –, –, , , , , –, –, ,  agreement marker , , – ambiguity , , ,  analytic , , , , , –, , –, –, , , , , , ,  anasynthetic –, – applicative , , –, –, –, , , –,  applicative preﬁx , –, ,  areal diffusion , –, , –, see also language contact areality ,  aspect –, –, , –, , –, –, –, –, , –, –, , , , , , , –, –, –, –, –,  aspect marker , , , , ,  auxiliary , , , , , , –, –, , –, , –, –, , , , –, , –, , , – auxiliary verb , , , –, –, , , , ,  auxiliation , , 

case marking , , –, ,  causative , , , , –, –, , ,  classiﬁer , –, , , , , – clause union , ,  clitic pronoun , , – comparative , –, , –, –, –, , , –,  comparative of inequality ,  complex predicate , –, , ,  complexity , , –, , –, –, , –,  compound verb –, , , ,  compounding , , , , , , , –, , , , –, –, – conative –, –, –, , – constituent order , , –, ,  contact-induced grammaticalization –, –, –, , , –, , –, , –, , ,  converb –, , , –, –, – convergence –, , –, , , , , ,  copula –, , , –, , –, , , , , –,  creole –, , , , , , –, –, –

beneﬁciary , ,  bioprogram  body-part noun , , , , , , , , – body-part term , –, , , , ,  borrowing , , , , , see also language contact bound pronoun , 

decategorialization , , , , , , , ,  default meaning , –,  degrammaticalization (degrammation) , –, ,  deictic verb , – desemanticization , –, , , ,  diachronic approach ,  diachronic typology , ,  diffusion, see areal diffusion diglossia  directional , –,  discourse marker , 

case , –, –, , , –, , , , , –, , , –, , , , , –, , , , –, , –, –

enclitic , , , , – erosion , , –, , , , , , , , , , , –, , –, –, , , 

Index of subjects evidentials , , , , , , , –, – existential/non-existential verb/predicate , , , , , –, –, ,  expletives , – ﬂectional  free pronoun , , ,  frequency –, , , , –, , –, , , , , , , , , , , , , , , , –,  future tense –, , –, , –, , , –, , –, , , , –, , –, , , , –, , , , , , , –, –,  generic genitive , – genetic grouping  genetically inherited feature ,  grammaticalization area , ,  grammaticalization theory , , , , , , –, – habitual , , , , , –, ,  handling verb , –,  honoriﬁc , , – iamitive  imperfective –, , , , , , , , , –, , ,  incorporation , , , , –, , , , –,  indexing –, , , , ,  inﬂectionalization –, –, –, , – information source , , see also evidentials innovative speakers – intensiﬁer , , , , ,  interrogative marker ,  intersubjectiﬁcation – isolating language –, –, , , –, –, , –,  language contact –, , , , , , , –, , , –, , , , , , , , –, , , – lexical afﬁxes  linguistic area , , , –, , , , –, 



language change , , , , , , , , ,  linguistic diversity , , –,  loan translation – locational verb , –, , –,  locative , –, , , –, –, –, , , –, , , –, , –, , , , –, , –, , –, , – locative compound , , – manner deixis verb  manner of action , , –,  marriage network –, ,  meaning-ﬁrst hypothesis –, ,  mermaid construction , – metonymy , , ,  modality , , , –, –, , , ,  monogenetic view of grammaticalization ,  morphological elaboration , , – morphologization –, , , , – motion verb , , , –, ,  multilingualism , , , , – new information –,  nominal compound , – noun class , , , – numeral classiﬁer , , , , , , , –, , , , ,  object agreement , –,  object case  object marker –, , ,  object pronoun , , –, –, , ,  optional marking , –, , ,  parallel reduction –,  particles –, , –, , , –, , , , , , , –, , , –, , , , , , , –, , , , ,  perfective –, , , , , , , , , , , –, –, , ,  periphrasis/periphrastic , , , , , –, , –, , – person , –, –, –, , , –, –, , , –, , , –, –, –, , 




personal pronoun
phasal verb
phase specification
pidginization
plural
polarity
polyfunctionality/polyfunctional
polysynthesis/polysynthetic
possessive linker
possessive perfect
possessive/possessive construction
postposition
postverbial construction
prefix
preposition
  complex preposition
primacy of third person
progressive
proximative
quotative (index/marker/particle)
reciprocal (marker/construction)
reflexive (marker/construction)
relative marker
relative-correlative
relexification
reportative
root set
second position
semantic dissonance

serial verb construction
serial verbs
simplification
source-oriented typology
spatial orientation
speech verbs
Standard Average European
stative-active
structural parallelism, see also convergence
subject
suffix
suffixation
synthetic
tense
tense-aspect-mood
tone
traditional speakers
typological feature
typological markedness
typological profile
typology
unidirectional (change)
unidirectionality hypothesis
variationist method
verb compounding
verbal prefix
verificative
version
viewpoint aspect

OXFORD STUDIES IN DIACHRONIC AND HISTORICAL LINGUISTICS

GENERAL EDITORS

Adam Ledgeway and Ian Roberts, University of Cambridge

ADVISORY EDITORS

Cynthia Allen, Australian National University; Ricardo Bermúdez-Otero, University of Manchester; Theresa Biberauer, University of Cambridge; Charlotte Galves, University of Campinas; Geoff Horrocks, University of Cambridge; Paul Kiparsky, Stanford University; Anthony Kroch, University of Pennsylvania; David Lightfoot, Georgetown University; Giuseppe Longobardi, University of York; George Walkden, University of Konstanz; David Willis, University of Cambridge

PUBLISHED

From Latin to Romance: Morphosyntactic Typology and Change. Adam Ledgeway
Parameter Theory and Linguistic Change. Edited by Charlotte Galves, Sonia Cyrino, Ruth Lopes, Filomena Sandalo, and Juanito Avelar
Case in Semitic: Roles, Relations, and Reconstruction. Rebecca Hasselbach
The Boundaries of Pure Morphology: Diachronic and Synchronic Perspectives. Edited by Silvio Cruschina, Martin Maiden, and John Charles Smith
The History of Negation in the Languages of Europe and the Mediterranean, Volume I: Case Studies. Edited by David Willis, Christopher Lucas, and Anne Breitbarth
Constructionalization and Constructional Changes. Elizabeth Traugott and Graeme Trousdale
Word Order in Old Italian. Cecilia Poletto
Diachrony and Dialects: Grammatical Change in the Dialects of Italy. Edited by Paola Benincà, Adam Ledgeway, and Nigel Vincent
Discourse and Pragmatic Markers from Latin to the Romance Languages. Edited by Chiara Ghezzi and Piera Molinelli

Vowel Length from Latin to Romance. Michele Loporcaro
The Evolution of Functional Left Peripheries in Hungarian Syntax. Edited by Katalin É. Kiss
Syntactic Reconstruction and Proto-Germanic. George Walkden
The History of Low German Negation. Anne Breitbarth
Arabic Indefinites, Interrogatives, and Negators: A Linguistic History of Western Dialects. David Wilmsen
Syntax over Time: Lexical, Morphological, and Information-Structural Interactions. Edited by Theresa Biberauer and George Walkden
Syllable and Segment in Latin. Ranjan Sen
Participles in Rigvedic Sanskrit: The Syntax and Semantics of Adjectival Verb Forms. John J. Lowe
Verb Movement and Clause Structure in Old Romanian. Virginia Hill and Gabriela Alboiu
The Syntax of Old Romanian. Edited by Gabriela Pană Dindelegan
Grammaticalization and the Rise of Configurationality in Indo-Aryan. Uta Reinöhl
The Rise and Fall of Ergativity in Aramaic: Cycles of Alignment Change. Eleanor Coghill
Portuguese Relative Clauses in Synchrony and Diachrony. Adriana Cardoso

Micro-change and Macro-change in Diachronic Syntax. Edited by Eric Mathieu and Robert Truswell
The Development of Latin Clause Structure: A Study of the Extended Verb Phrase. Lieven Danckaert
Transitive Nouns and Adjectives: Evidence from Early Indo-Aryan. John J. Lowe
Quantitative Historical Linguistics: A Corpus Framework. Gard B. Jenset and Barbara McGillivray
Gender from Latin to Romance: History, Geography, Typology. Michele Loporcaro
Clause Structure and Word Order in the History of German. Edited by Agnes Jäger, Gisella Ferraresi, and Helmut Weiß
Word Order Change. Edited by Ana Maria Martins and Adriana Cardoso
Arabic Historical Dialectology: Linguistic and Sociolinguistic Approaches. Edited by Clive Holes
Grammaticalization from a Typological Perspective. Edited by Heiko Narrog and Bernd Heine

IN PREPARATION

Negation and Nonveridicality in the History of Greek. Katerina Chatzopoulou
Morphological Borrowing. Francesco Gardani
Indefinites between Latin and Romance. Chiara Gianollo
Nominal Expressions and Language Change: From Early Latin to Modern Romance. Giuliana Giusti
Syntactic Features and the Limits of Syntactic Change. Edited by Jóhannes Gísli Jónsson and Thórhallur Eythórsson

A Study in Grammatical Change: The Modern Greek Weak Subject Pronoun τος and its Implications for Language Change and Structure. Brian D. Joseph
Reconstructing Pre-Islamic Arabic Dialects. Alexander Magidow
Word Order and Parameter Change in Romanian. Alexandru Nicolae
Referential Null Subjects in Early English. Kristian A. Rusten
The History of Negation in the Languages of Europe and the Mediterranean, Volume II: Patterns and Processes. Edited by David Willis, Christopher Lucas, and Anne Breitbarth
Verb Second in Medieval Romance. Sam Wolfe
Palatal Sound Change in the Romance Languages: Diachronic and Synchronic Perspectives. André Zampaulo